I am working on a Flask app where the main function is taking an uploaded CSV file, turning it into a Pandas dataframe, and then running a series of analyses on it.
The application works, but the upload process is rigid: the file must match the expected column names and values exactly.
I want to provide a fallback that dynamically generates forms asking the user how the uploaded columns map back to the original template.
So for example if the template that worked was laid out like this:
['Name', 'Phone', 'Email']
and the uploaded version was:
['Customer Name', 'Phone Number', 'Email Address']
I would want it to ask:
Which of ['Customer Name', 'Phone Number', 'Email Address'] corresponds to Name?
Preferably I would like to use Flask-WTF to maintain consistency, unless there is a clearly better way to do this.
I imagine the way to do this would be to do something like:
import pandas as pd
from flask_wtf import FlaskForm
from wtforms import SelectField

minimum_viable_columns = ['Name', 'Phone', 'Email']
data = pd.read_csv(uploaded_file)

# Uploaded columns that do not match any template column
potential_columns = []
for x in data.columns.values.tolist():
    if x not in minimum_viable_columns:
        potential_columns.append(x)

# Template columns that are absent from the upload
missing_columns = []
for x in minimum_viable_columns:
    if x not in data.columns.values.tolist():
        missing_columns.append(x)

class MatchingForm(FlaskForm):
    field_name = SelectField('Corresponding field name', choices=potential_columns)
I would have to generate these forms for each column in missing_columns, and I am trying to figure out a good way to do that and hopefully tie them all to a single submit button.
The solution I ended up landing on uses one loop to add attributes to the form class and another loop in the Flask route to capture the submitted values. The code looks like this:
from flask import redirect, render_template, request, session, url_for
from flask_wtf import FlaskForm
from wtforms import SelectField, SubmitField

def map_csv():
    possible_columns = session['potential_columns']
    missing_columns = session['missing_columns']

    # Build the form class per request so its fields can vary
    class MappingForm(FlaskForm):
        submit = SubmitField('Submit new mapping')

    possible_columns = [(x, x) for x in possible_columns]
    if missing_columns is not None:
        # Add one SelectField per missing template column
        for name in missing_columns:
            setattr(MappingForm, name, SelectField(name, choices=possible_columns))
    form = MappingForm()
    if request.method == 'POST':
        # validate_on_submit is a method and must be called
        if form.validate_on_submit():
            mapping = {}
            for x in missing_columns:
                column = getattr(form, x)
                mapping[column.data] = x
            session['mapping'] = mapping
            return redirect(url_for('main.upload'))
    return render_template('map_csv.html', form=form)
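Once session['mapping'] holds the user's choices, the upload route still has to rename the uploaded columns back to the template names. Below is a minimal sketch of that step using only the standard library; apply_mapping and the sample data are hypothetical names chosen for illustration. With pandas, data.rename(columns=mapping) accepts the same {uploaded_name: template_name} dictionary.

```python
import csv
import io

def apply_mapping(csv_text, mapping):
    """Rename CSV header columns using {uploaded_name: template_name}."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return []
    # Columns absent from the mapping keep their original name.
    header = [mapping.get(name, name) for name in rows[0]]
    return [dict(zip(header, row)) for row in rows[1:]]

# Hypothetical mapping, shaped like the one map_csv() stores in the session
mapping = {
    'Customer Name': 'Name',
    'Phone Number': 'Phone',
    'Email Address': 'Email',
}
uploaded = "Customer Name,Phone Number,Email Address\nAda,555-0100,ada@example.com\n"
records = apply_mapping(uploaded, mapping)
```

Each record is then keyed by the template column names, so the existing analyses can run unchanged.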
Related
I'm trying to import some data from a CSV file into a Django database using django-import-export, with a foreign key (location). What I want to achieve is that the location_id is taken from the request URL.
value,datetime,location
4.46,2020-01-01,1
4.46,2020-01-02,1
My urls look like this, so I want "location_id" to be passed into the uploaded csv file:
urlpatterns = [
    ...
    ...
    path('..../<int:location_id>/upload', views.simple_upload, name='upload'),
]
My view looks like this:
def simple_upload(request, location_id):
    if request.method == 'POST':
        rainfall_resource = RainfallResource()
        dataset = Dataset()
        new_rainfall = request.FILES['myfile']
        imported_data = dataset.load(new_rainfall.read().decode("utf-8"), format="csv")
        try:
            result = rainfall_resource.import_data(dataset, dry_run=True)  # Test the data import
        except Exception as e:
            return HttpResponse(e, status=status.HTTP_400_BAD_REQUEST)
        if not result.has_errors():
            rainfall_resource.import_data(dataset, dry_run=False)  # Actually import now
    return render(request, '/import.html')
My ModelResource looks like this:
class RainfallResource(resources.ModelResource):
    location_id = fields.Field(
        column_name='location_id',
        attribute='location_id',
        widget=ForeignKeyWidget(Location, 'Location'))

    class Meta:
        model = Rainfall

    def before_import_row(self, row, **kwargs):
        row['location'] = location_id
The manipulation works when I hardcode "location_id" like:
def before_import_row(self, row, **kwargs):
    row['location'] = 123
However, I do not understand how to pass the location_id argument from the "url" to the "before_import_row" function. Help would be highly appreciated:-)
I think you will have to modify your imported_data in memory, before importing.
You can use the tablib API to update the dataset:
# import the data as per your existing code
imported_data = dataset.load(new_rainfall.read().decode("utf-8"), format="csv")
# create an array containing the location_id
location_arr = [location_id] * len(imported_data)
# use the tablib API to add a new column, and insert the location array values
imported_data.append_col(location_arr, header="location")
By using this approach, you won't need to override before_import_row().
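The same idea, setting the column in memory before the import runs, can also be sketched without tablib by rewriting the CSV text with the standard library. This is an illustrative alternative rather than the tablib API, and set_location is a hypothetical helper name.

```python
import csv
import io

def set_location(csv_text, location_id, column="location"):
    """Return CSV text in which every row's `column` equals location_id."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames or [])
    # Add the column if the upload does not already have it.
    if column not in fieldnames:
        fieldnames.append(column)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row[column] = location_id  # overwrite or fill in the value
        writer.writerow(row)
    return out.getvalue()

csv_text = "value,datetime\n4.46,2020-01-01\n4.46,2020-01-02\n"
patched = set_location(csv_text, 123)
```

The patched text can then be passed to dataset.load(..., format="csv") in the same way as the decoded upload in the view.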
I am using django-datatables-view for rendering data from Django models.
Everything worked fine before I decorated the URL; after I added login_required to the URL, it threw a weird error.
According to the docs, I can add login_required to the URL.
Below is my code:
from django.db.models import Q
from django_datatables_view.base_datatable_view import BaseDatatableView

class OrderListJson(BaseDatatableView):
    # The model we're going to show
    model = MyModel

    # define the columns that will be returned
    columns = ['number', 'user', 'state', 'created', 'modified']

    # define column names that will be used in sorting
    # order is important and should be same as order of columns
    # displayed by datatables. For non sortable columns use empty
    # value like ''
    order_columns = ['number', 'user', 'state', '', '']

    # set max limit of records returned, this is used to protect our site if someone tries to attack our site
    # and make it return huge amount of data
    max_display_length = 500

    def render_column(self, row, column):
        # We want to render user as a custom column
        if column == 'user':
            return '{0} {1}'.format(row.customer_firstname, row.customer_lastname)
        else:
            return super(OrderListJson, self).render_column(row, column)

    def filter_queryset(self, qs):
        # use parameters passed in GET request to filter queryset
        # simple example:
        search = self.request.GET.get(u'search[value]', None)
        if search:
            qs = qs.filter(name__istartswith=search)

        # more advanced example using extra parameters
        filter_customer = self.request.GET.get(u'customer', None)
        if filter_customer:
            customer_parts = filter_customer.split(' ')
            qs_params = None
            for part in customer_parts:
                q = Q(customer_firstname__istartswith=part)|Q(customer_lastname__istartswith=part)
                qs_params = qs_params | q if qs_params else q
            qs = qs.filter(qs_params)
        return qs
url
url(_(r'^users/all/?$'),
    login_required(dashboard.v1.views.OrderListJson.as_view()),
    name='all_users'),
I keep getting a 500 error; if I remove login_required, everything works well. I would be glad to get suggestions on how to decorate the class-based view, since that is what I am trying to achieve.
Here we have the basic code for getting django-datatables-view to display
from django.db.models import Q
from django_datatables_view.base_datatable_view import BaseDatatableView

class OrderListJson(BaseDatatableView):
    # The model we're going to show
    model = MyModel

    # define the columns that will be returned
    columns = ['number', 'user', 'state', 'created', 'modified']

    # define column names that will be used in sorting
    # order is important and should be same as order of columns
    # displayed by datatables. For non sortable columns use empty
    # value like ''
    order_columns = ['number', 'user', 'state', '', '']

    # set max limit of records returned, this is used to protect our site if someone tries to attack our site
    # and make it return huge amount of data
    max_display_length = 500

    def render_column(self, row, column):
        # We want to render user as a custom column
        if column == 'user':
            return '{0} {1}'.format(row.customer_firstname, row.customer_lastname)
        else:
            return super(OrderListJson, self).render_column(row, column)

    def filter_queryset(self, qs):
        # use parameters passed in GET request to filter queryset
        # simple example:
        search = self.request.GET.get(u'search[value]', None)
        if search:
            qs = qs.filter(name__istartswith=search)

        # more advanced example using extra parameters
        filter_customer = self.request.GET.get(u'customer', None)
        if filter_customer:
            customer_parts = filter_customer.split(' ')
            qs_params = None
            for part in customer_parts:
                q = Q(customer_firstname__istartswith=part)|Q(customer_lastname__istartswith=part)
                qs_params = qs_params | q if qs_params else q
            qs = qs.filter(qs_params)
        return qs
This code works fine, however, how do I get it to not display the whole model, but only filtered content from the model? I've tried setting it to model = MyModel.objects.filter(name="example") but this returns an error.
def get_initial_queryset(self):
    return MyModel.objects.filter(name="example")
Add this to your OrderListJson class.
You can override the get_initial_queryset method, which returns the queryset used to populate the datatable.
I have two dropdowns.
The first dropdown asks for the type.
The second dropdown shows values from another model.
What I need is that, if the first dropdown has the type qualif, the second dropdown only offers the entry with pk=1.
This is what I have so far:
name = models.CharField(max_length=40,verbose_name="nom")
type = models.CharField(max_length=6,choices=TYPE_CHOICES)
division = models.ForeignKey(Division,verbose_name="division")
class TournamentForm(forms.ModelForm):
    def clean(self):
        super(TournamentForm, self).clean() #if necessary
        if 'division' in self._errors:
            """
            reset the value (something like this i
            think to set the value b/c it doesnt get set
            b/c the field fails validation initially)
            """
            if self.data['type'] == 'qualif':
                division = Division.objects.get(pk=1)
                self.division = division
                # remove the error
                del self._errors['division']
        return self.cleaned_data
# Register your models here.
class TournamentAdmin(reversion.VersionAdmin):
    form = TournamentForm
    list_display = ('name', 'date', 'division', 'gender')
    ordering = ('date', 'name')
    list_filter = ['date', 'season', 'division', 'gender']

admin.site.register(Tournament, TournamentAdmin)
I read from another stack question to use clean...but sadly it's not working...
EDIT:
After looking at @Mardo's link, I tried loading up a static file.
Here is my folder setup:
myproject/static/admin/js/myfile.js
And this in my settings.py
STATIC_URL = '/static/'
But it keeps saying file not found...
Thanks,
Ara
Make two forms. The first contains only the first dropdown, with the second dropdown disabled. When a value is selected, submit the form and render the full version, with the second dropdown's choices based on the first.
I'm attempting to do something simple and well documented, except that it's not working in my web app.
Essentially I want to save some extra attributes for the uploaded files, such as the original filename, the email of the user, and the upload date.
Following the web2py documentation, I've created this submit view. It is copied almost word for word from the documentation section here.
I have a controller data.py:
def submit():
    import datetime
    form = SQLFORM(db.uploads, fields=['up_file'], deletable=True)
    form.vars.up_date = datetime.datetime.now()
    form.vars.username = auth.user.email
    if request.vars.up_file != None:
        form.vars.filename = request.vars.up_file.filename
    if form.process().accepted:
        redirect(URL('data', 'index'))
    elif form.errors:
        response.flash = "form has errors"
and my db.py excerpt:
db.define_table('uploads',
    Field('username', 'string'),
    Field('filename', represent=lambda x, row: "None" if x == None else x[:45]),
    Field('up_file', 'upload', uploadseparate=True, requires=[IS_NOT_EMPTY(), IS_UPLOAD_FILENAME(extension=ext_regex)]),
    Field('up_date', 'datetime'),
    Field('up_size', 'integer', represent=lambda x, row: quikr_utils.sizeof_fmt(x)),
    Field('notes', 'text'))
Currently the validation doesn't appear to do anything. When I submit the form, the filename isn't saved for some reason, and I get an error elsewhere because the value is None.
You need to do something like this:
DB:
db.define_table('t_filetable',
    Field('f_filename', type='string', label=T('File Name')),
    Field('f_filedescription', type='text',
          represent=lambda x, row: MARKMIN(x),
          comment='WIKI (markmin)',
          label=T('Description')),
    Field('f_filebinary', type='upload', notnull=True, uploadseparate=True,
          label=T('File Binary')),
    auth.signature,
    format='%(f_filename)s',
    migrate=settings.migrate)
Controller : (default.py)
@auth.requires_login()
def addfile():
    form = SQLFORM(db.t_filetable, upload=URL('download'))
    if form.process(onvalidation=validate_filename).accepted:
        response.flash = 'success'
    elif form.errors:
        response.flash = 'form has errors'
    return dict(form=form)

def validate_filename(form):
    if form.vars.f_filename == "":
        form.vars.f_filename = request.vars.f_filebinary.filename
The validate_filename function is called after the form has been validated, so form.vars is available to use there. It checks whether form.vars.f_filename is blank (""); if it is, it reads the filename from request.vars.f_filebinary and assigns it to form.vars.f_filename. This way the filename field can be optional: if users leave it blank and just upload the file, f_filename in the DB will be the original filename.
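The fallback that validate_filename performs can be illustrated outside web2py. Below is a minimal sketch that uses types.SimpleNamespace objects as stand-ins for form.vars and the uploaded file; the function name and sample values are hypothetical and not part of web2py.

```python
from types import SimpleNamespace

def fill_default_filename(form_vars, uploaded):
    """Use the uploaded file's original name when no name was entered."""
    # `not` treats both "" and None as blank, slightly broader than == "".
    if not form_vars.f_filename:
        form_vars.f_filename = uploaded.filename
    return form_vars

# Stand-in for request.vars.f_filebinary
upload = SimpleNamespace(filename="report.csv")

blank = fill_default_filename(SimpleNamespace(f_filename=""), upload)
named = fill_default_filename(SimpleNamespace(f_filename="Q3 numbers"), upload)
```

A name typed by the user is left untouched; only a blank field falls back to the uploaded file's name.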
I tried pasting your code into web2py to see where it goes wrong, and it actually worked for me (at least the file names were saved). Maybe the problem is elsewhere?