Python: 2.7.11
Django: 1.9
Pandas: 0.17.1
How should I go about creating a potentially large xlsx file download? I'm creating an xlsx file with pandas from a list of dictionaries and now need to let the user download it. The list is held in a variable and must not be saved locally (on the server).
Example:
df = pandas.DataFrame(self.csvdict)
writer = pandas.ExcelWriter('pandas_simple.xlsx', engine='xlsxwriter')
df.to_excel(writer, sheet_name='Sheet1')
writer.save()
This example just creates the file and saves it where the executing script is located. What I need is to write it to an HTTP response so that the user gets a download prompt.
I have found a few posts about doing this for XlsxWriter but none for pandas. I also think that I should be using 'StreamingHttpResponse' for this and not 'HttpResponse'.
I will elaborate on what @jmcnamara wrote. This is for the latest versions of Excel, Pandas and Django. The import statements would be at the top of your views.py and the remaining code could be in a view:
import pandas as pd
from django.http import HttpResponse
# xlsx data is binary, so the buffer must be a BytesIO (io.BytesIO exists on Python 2.6+ and 3)
from io import BytesIO as IO
# this is my output data a list of lists
output = some_function()
df_output = pd.DataFrame(output)
# my "Excel" file, which is an in-memory output file (buffer)
# for the new workbook
excel_file = IO()
xlwriter = pd.ExcelWriter(excel_file, engine='xlsxwriter')
df_output.to_excel(xlwriter, 'sheetname')
xlwriter.save()
xlwriter.close()
# important step, rewind the buffer or when it is read() you'll get nothing
# but an error message when you try to open your zero length file in Excel
excel_file.seek(0)
# set the mime type so that the browser knows what to do with the file
response = HttpResponse(excel_file.read(), content_type='application/vnd.openxmlformats-officedocument.spreadsheetml.sheet')
# set the file name in the Content-Disposition header
response['Content-Disposition'] = 'attachment; filename=myfile.xlsx'
return response
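To see why the rewind step matters, here is a minimal sketch using only the standard library. The byte string is a made-up stand-in for whatever the Excel writer would put into the buffer; no pandas or Django is involved.

```python
from io import BytesIO

buffer = BytesIO()
buffer.write(b"fake workbook bytes")  # stand-in for the real xlsx content

# After writing, the cursor sits at the end, so read() returns nothing.
at_end = buffer.read()

# Rewinding makes read() return everything from the start.
buffer.seek(0)
after_rewind = buffer.read()

# getvalue() ignores the cursor position and always returns the whole buffer.
whole = buffer.getvalue()

print(repr(at_end), repr(after_rewind), repr(whole))
```

So either call seek(0) before read(), or skip both and pass getvalue() to the response.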
Jmcnamara is pointing you in the right direction. Translated to your question, you are looking for the following code:
sio = StringIO()
PandasDataFrame = pandas.DataFrame(self.csvdict)
PandasWriter = pandas.ExcelWriter(sio, engine='xlsxwriter')
PandasDataFrame.to_excel(PandasWriter, sheet_name=sheetname)
PandasWriter.save()
sio.seek(0)
workbook = sio.getvalue()
response = StreamingHttpResponse(workbook, content_type='application/vnd.openxmlformats-officedocument.spreadsheetml.sheet')
response['Content-Disposition'] = 'attachment; filename=%s' % filename
Notice the fact that you are saving the data to the StringIO variable and not to a file location. This way you prevent the file being saved before you generate the response.
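One thing to keep in mind with a streaming response: it is meant to be given an iterator of byte strings, so handing it one big string does not buy much. A minimal sketch of chunking an in-memory buffer follows; iter_chunks and the 8192-byte chunk size are my own illustrative choices, not part of Django or pandas. Note that the workbook is still built fully in memory, this only splits up the transfer.

```python
from io import BytesIO


def iter_chunks(buffer, chunk_size=8192):
    """Yield the contents of a binary buffer in fixed-size chunks."""
    buffer.seek(0)  # always start from the beginning
    while True:
        chunk = buffer.read(chunk_size)
        if not chunk:
            break
        yield chunk


# Stand-in for a finished workbook held in memory.
payload = BytesIO(b"x" * 20000)
chunks = list(iter_chunks(payload))
print([len(c) for c in chunks])  # [8192, 8192, 3616]
```

A generator like this could then be passed as the content of the streaming response instead of the raw string.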
Maybe a bit off-topic, but it's worth pointing out that the to_csv method is generally faster than to_excel, since an Excel file also carries the sheets' formatting information. If you only have data and no formatting, consider to_csv. Microsoft Excel can open and edit CSV files without a problem.
Another gain is that to_csv accepts any file-like object as its first argument, not only a filename string. Since a Django response object is file-like, to_csv can write to it directly. The code in your view function will look something like:
df = <your dataframe to be downloaded>
response = HttpResponse(content_type='text/csv')
response['Content-Disposition'] = 'attachment; filename=<default filename you wanted to give to the downloaded file>'
df.to_csv(response, index=False)
return response
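The "file-like" idea can be tried out without Django. In this minimal sketch, FakeResponse is a hypothetical stand-in for a response object, and the standard csv module plays the role of to_csv; all a writer needs from its target is a write() method.

```python
import csv


class FakeResponse:
    """Hypothetical stand-in for a response object: it only needs write()."""

    def __init__(self):
        self.parts = []

    def write(self, text):
        self.parts.append(text)

    def body(self):
        return "".join(self.parts)


response = FakeResponse()
writer = csv.writer(response)  # accepts any object with a write() method
writer.writerow(["name", "qty"])
writer.writerow(["apple", 3])
print(repr(response.body()))
```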
Reference:
https://gist.github.com/jonperron/733c3ead188f72f0a8a6f39e3d89295d
https://docs.djangoproject.com/en/2.1/howto/outputting-csv/
With Pandas 0.17+ you can use a StringIO/BytesIO object as a filehandle to pd.ExcelWriter. For example:
import pandas as pd
import StringIO
output = StringIO.StringIO()
# Use the StringIO object as the filehandle.
writer = pd.ExcelWriter(output, engine='xlsxwriter')
# Write the data frame to the StringIO object.
pd.DataFrame().to_excel(writer, sheet_name='Sheet1')
writer.save()
xlsx_data = output.getvalue()
print len(xlsx_data)
After that follow the XlsxWriter Python 2/3 HTTP examples.
For older versions of Pandas you can use this workaround.
Just wanted to share a class-based view approach to this, using elements from the answers above. Just override the get method of a Django View. My model has a JSON field which contains the results of dumping a dataframe to JSON with the to_json method.
Python version is 3.6 with Django 1.11.
# models.py
from django.db import models
from django.contrib.postgres.fields import JSONField

class myModel(models.Model):
    json_field = JSONField(verbose_name="JSON data")

# views.py
import pandas as pd
from io import BytesIO as IO
from django.http import HttpResponse
from django.views import View
from .models import myModel

class ExcelFileDownloadView(View):
    """
    Allows the user to download records in an Excel file
    """

    def get(self, request, *args, **kwargs):
        obj = myModel.objects.get(pk=self.kwargs['pk'])
        excel_file = IO()
        xlwriter = pd.ExcelWriter(excel_file, engine='xlsxwriter')
        pd.read_json(obj.json_field).to_excel(xlwriter, "Summary")
        xlwriter.save()
        xlwriter.close()
        excel_file.seek(0)
        response = HttpResponse(excel_file.read(),
                                content_type='application/vnd.openxmlformats-officedocument.spreadsheetml.sheet')
        response['Content-Disposition'] = 'attachment; filename="excel_file.xlsx"'
        return response

# urls.py
from django.conf.urls import url
from .views import ExcelFileDownloadView

urlpatterns = [
    url(r'^mymodel/(?P<pk>\d+)/download/$', ExcelFileDownloadView.as_view(), name="excel-download"),
]
You're mixing two requirements that should be separate:
Creating a .xlsx file using python or pandas--it looks like you're good on this part.
Serving a downloadable file (django); see this post or maybe this one
Given a query set from my database, I am trying to output the results into an excel file. I have tried two versions of code... the code executes properly without outputting any error messages, but the excel file is not outputted, any suggestions on where I went wrong.
import pandas as pd
import csv, io, xlsxwriter
from django.http import HttpResponseRedirect, HttpResponse, JsonResponse, FileResponse
def print_excel(request):
    products = Product.objects.all()
    df = pd.DataFrame(list(products.values()))
    excel_file = io.BytesIO()
    xlwriter = pd.ExcelWriter(excel_file, engine='xlsxwriter')
    df.to_excel(xlwriter, 'sheetname')
    xlwriter.save()
    #xlwriter.close()
    excel_file.seek(0)
    response = HttpResponse(excel_file.read(), content_type='application/ms-excel vnd.openxmlformats-officedocument.spreadsheetml.sheet')
    # set the file name in the Content-Disposition header
    response['Content-Disposition'] = 'attachment; filename=myfile.xls'
    return response
I also tried the approach below, where I tried to output an Excel file containing the string "rows":
import pandas as pd
import csv, io, xlsxwriter
from django.http import HttpResponseRedirect, HttpResponse, JsonResponse, FileResponse
def print_excel(request):
    buffer = io.BytesIO()
    workbook = xlsxwriter.Workbook(buffer)
    worksheet = workbook.add_worksheet()
    worksheet.write('A1', "rows")
    workbook.close()
    buffer.seek(0)
    return FileResponse(buffer, as_attachment=True, filename='myfile.xlsx')
Both versions execute properly, but I can't locate the Excel file that was created.
The code is fine; the problem might be with the GET request. If you are making the request with fetch from JavaScript, you will run into this problem. For FileResponse and HttpResponse to work properly and generate the required file, make sure the GET request is issued directly from the HTML template, i.e. use a plain link such as "Generate Excel".
Right now I have a flask app in which part of the functionality allows me to select a date range and see data from a sql database from that selected date range. I then can click a button and it exports this to a csv file which is just saved in the flask project directory. I want the user to be able to download this csv file. I want to know what the best practice for a user to download a dynamic csv file. Should I send_file() and then delete the file after user has downloaded since this data shouldn't be saved and the user won't be using that file again. Should the file be saved to the database and then deleted out of the db? Or can I just keep it within the flask directory? Please provide insight if possible, thank you so much.
@brunns pointed in the right direction.
You don't have to save the file in your database or in your file structure or anywhere else. It gets created in memory on user request.
I've done this with Django for PDF and CSV files; it works the same way with Flask. The basics are all the same.
For Python 3 use io.StringIO; for Python 2 use the StringIO module.
from io import StringIO
import csv
from flask import make_response

@app.route('/download')
def post():
    si = StringIO()
    cw = csv.writer(si)
    cw.writerows(csvList)  # csvList: your data, a list of rows
    output = make_response(si.getvalue())
    output.headers["Content-Disposition"] = "attachment; filename=export.csv"
    output.headers["Content-type"] = "text/csv"
    return output
Courtesy: vectorfrog
Based on @xxbinxx's answer, used with pandas:
from io import StringIO
import csv
from flask import make_response

@app.route('/download')
def download_csv():
    # df is assumed to be a pandas DataFrame built elsewhere in your app
    si = StringIO()
    cw = csv.writer(si)
    cw.writerow(df.columns.tolist())  # header: a single row
    cw.writerows(df.values.tolist())  # data: a list of rows
    output = make_response(si.getvalue())
    output.headers["Content-Disposition"] = "attachment; filename=export.csv"
    output.headers["Content-type"] = "text/csv"
    return output
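One detail worth checking in code like the above: a header is a single row, so it goes through writerow, while writerows expects a list of rows. A minimal sketch with plain lists standing in for the DataFrame columns and values (the sample data is made up):

```python
import csv
from io import StringIO

columns = ["id", "name"]          # stand-in for df.columns.tolist()
rows = [[1, "ann"], [2, "bob"]]   # stand-in for df.values.tolist()

# writerows() on a flat list of strings treats each string as a row,
# so every character becomes its own field.
wrong = StringIO()
csv.writer(wrong).writerows(columns)

# writerow() for the header, writerows() for the data.
right = StringIO()
writer = csv.writer(right)
writer.writerow(columns)
writer.writerows(rows)

print(repr(wrong.getvalue()))  # 'i,d\r\nn,a,m,e\r\n'
print(repr(right.getvalue()))  # 'id,name\r\n1,ann\r\n2,bob\r\n'
```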
Is it possible to define a route in bottle which would return a file?
I have a mongo database which is accessed by pandas.
Pandas generates an xls file based on the request parameters.
The two steps above are clear and easy to implement.
The third step is the one I have a problem with.
Define a bottle route which would return a file to download by user.
I don't want to use static previously generated files.
Thanks in advance.
I'm not familiar with Pandas, but you need to get the binary contents of the xls file to send it to a user via a Bottle route. Modified example from here for Python 3:
from io import BytesIO
from bottle import route, response
from pandas import ExcelWriter
@route('/get-xlsx')
def get_xlsx():
    output = BytesIO()
    writer = ExcelWriter(output, engine='xlsxwriter')
    # Do something with your Pandas data
    # ...
    pandas_dataframe.to_excel(writer, sheet_name='Sheet1')
    writer.save()
    response.content_type = 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet'
    response.add_header('Content-Disposition', 'attachment; filename="report.xlsx"')
    return output.getvalue()
When a user clicks a link that corresponds to this route, a file download dialog for "report.xlsx" will open in their browser.
views.py
def export_to_excel(request):
    lists = MyModel.objects.all()
    # your excel html format
    template_name = "sample_excel_format.html"
    response = render_to_response(template_name, {'lists': lists})
    # this is the output file
    filename = "model.csv"
    response['Content-Disposition'] = 'attachment; filename='+filename
    response['Content-Type'] = 'application/vnd.ms-excel; charset=utf-16'
    return response
urls.py
from django.conf.urls.defaults import *
urlpatterns = patterns('app_name.views',
url(r'^export/$', 'export_to_excel', name='export_to_excel'),
)
Last, in your page create a button or link that points to the export URL.
page.html
Export
No download option appears and no error is raised, but I can see the full result in the log, so the view itself seems to be working fine.
I think that the solution for exporting Excel files is:
if 'excel' in request.POST:
    response = HttpResponse(content_type='application/vnd.ms-excel')
    response['Content-Disposition'] = 'attachment; filename=Report.xlsx'
    xlsx_data = WriteToExcel(weather_period, town)
    response.write(xlsx_data)
    return response
In this example the library used for exporting is xlsxWriter.
Here is a very complete and practical solution for this, and many others: http://assist-software.net/blog/how-export-excel-files-python-django-application .
In addition to the options shown in the other answers you can also use XlsxWriter to create Excel files.
See this example.
It seems that you are trying to generate an Excel workbook with HTML content. I don't know whether Excel (or LibreOffice) is able to open such a file, but I don't think it is the right approach.
You should first generate a real spreadsheet file: you can use csv, xlwt for xls, or openpyxl for xlsx.
The content of the file can then be passed to the HttpResponse.
For example, if you work with xlwt:
import xlwt
from django.http import HttpResponse
wb = xlwt.Workbook()
#use xlwt to fill the workbook
#
#ws = wb.add_sheet("sheet")
#ws.write(0, 0, "something")
# content_type is the current argument name; old Django versions also accepted mimetype
response = HttpResponse(content_type='application/vnd.ms-excel')
response['Content-Disposition'] = 'attachment; filename=the-file.xls'
wb.save(response)
return response
You can also look at
django-excel-response which does all the work for you. (I think it doesn't support xlsx format)
django-excel-export
I hope it helps
This seems to be based on the practice of tricking excel into opening an HTML table by changing the file name and MIME type. In order to make this work, the HTML file has to assemble an HTML table, and this is likely to trigger a warning that the real content of the file is different from the declared content.
IMHO it is a crude hack and should be avoided. Instead you can create a real excel file using the xlwt module, or you can create a real CSV file using the csv module.
[update]
After looking at the blog post you referred to, I see it recommends another bad practice: using anything but the csv module to produce CSV files is dangerous, because if the data contains the delimiter character, quotes or line breaks, you may end up with a broken CSV.
The csv module will take care of all corner cases and produce a proper formatted output.
I've seen people use a Django template naming the file "something.xls" and using HTML tables instead of the CSV format, but this has some corner cases as well.
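To make the corner cases concrete, here is a minimal sketch comparing a hand-rolled join with the csv module. The sample rows are made up for illustration; they contain a comma, double quotes and a line break.

```python
import csv
from io import StringIO

rows = [
    ["name", "note"],
    ["Smith, John", 'said "hi"'],
    ["Doe", "line one\nline two"],
]

# Naive approach: join fields with commas and rows with newlines.
naive = "\n".join(",".join(r) for r in rows)
# The comma inside "Smith, John" turns two fields into three.
naive_first_data_row = naive.split("\n")[1].split(",")

# csv module: quoting and escaping are handled for us.
buf = StringIO()
csv.writer(buf).writerows(rows)
parsed = list(csv.reader(StringIO(buf.getvalue())))

print(len(naive_first_data_row))  # 3
print(parsed == rows)             # True
```

The data written by csv.writer survives a round trip through csv.reader, while the naive output cannot be split back into the original fields.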
Export Data to XLS File
Use it if you really need to export to a .xls file. You will be able to add formatting such as bold font, font size, column widths, etc.
First of all, install the xlwt module. The easiest way is to use pip.
pip install xlwt
views.py
import xlwt
from django.http import HttpResponse
from django.contrib.auth.models import User
def export_users_xls(request):
    response = HttpResponse(content_type='application/ms-excel')
    response['Content-Disposition'] = 'attachment; filename="users.xls"'
    wb = xlwt.Workbook(encoding='utf-8')
    ws = wb.add_sheet('Users')
    # Sheet header, first row
    row_num = 0
    font_style = xlwt.XFStyle()
    font_style.font.bold = True
    columns = ['Username', 'First name', 'Last name', 'Email address', ]
    for col_num in range(len(columns)):
        ws.write(row_num, col_num, columns[col_num], font_style)
    # Sheet body, remaining rows
    font_style = xlwt.XFStyle()
    rows = User.objects.all().values_list('username', 'first_name', 'last_name', 'email')
    for row in rows:
        row_num += 1
        for col_num in range(len(row)):
            ws.write(row_num, col_num, row[col_num], font_style)
    wb.save(response)
    return response
urls.py
import views
urlpatterns = [
...
url(r'^export/xls/$', views.export_users_xls, name='export_users_xls'),
]
template.html
Export all users
Learn more about the xlwt module reading its official documentation. https://simpleisbetterthancomplex.com/tutorial/2016/07/29/how-to-export-to-excel.html
I'm trying to return a CSV from an action in my webapp, and give the user a prompt to download the file or open it from a spreadsheet app. I can get the CSV to spit out onto the screen, but how do I change the type of the file so that the browser recognizes that this isn't supposed to be displayed as HTML? Can I use the csv module for this?
import csv
def results_csv(self):
data = ['895', '898', '897']
return data
To tell the browser the type of content you're giving it, you need to set the Content-type header to 'text/csv'. In your Pylons function, the following should do the job:
response.headers['Content-type'] = 'text/csv'
PAG is correct, but furthermore if you want to suggest a name for the downloaded file you can also set response.headers['Content-disposition'] = 'attachment; filename=suggest.csv'
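Putting the two headers together, a minimal framework-neutral sketch could look like this. The download_headers helper and the MIME_TYPES table are my own illustrative names, and the sketch assumes a plain ASCII filename without quotes in it.

```python
import os

# Common MIME types for spreadsheet-style downloads.
MIME_TYPES = {
    ".csv": "text/csv",
    ".xls": "application/vnd.ms-excel",
    ".xlsx": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
}


def download_headers(filename):
    """Return the two headers that make a browser offer a download."""
    ext = os.path.splitext(filename)[1].lower()
    return {
        "Content-Type": MIME_TYPES[ext],
        "Content-Disposition": 'attachment; filename="%s"' % filename,
    }


headers = download_headers("suggest.csv")
print(headers)
```

The returned dict can then be copied onto whatever response object your framework provides.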
Yes, you can use the csv module for this:
import csv
from cStringIO import StringIO
...
def results_csv(self):
    response.headers['Content-Type'] = 'text/csv'
    s = StringIO()
    writer = csv.writer(s)
    writer.writerow(['header', 'header', 'header'])
    writer.writerow([123, 456, 789])
    return s.getvalue()