I have a calculation that creates an Excel spreadsheet using xlsxwriter to show its results. It would be useful to sort the table once the results are known.
One solution would be to create a separate data structure in Python, sort it, and write it with xlsxwriter afterwards, but that is not very elegant and requires a lot of data type handling.
I cannot find a way to sort the data inside the xlsxwriter module.
Can anybody help with the internal data structure of that module? Can it be sorted before it is written to disk?
Would another solution be to reopen the file, sort the contents, and close it again?
import xlsxwriter
workbook = xlsxwriter.Workbook("Trial.xlsx")
worksheet=workbook.add_worksheet("first")
worksheet.write_number(0,1,2)
worksheet.write_number(0,2,1)
...worksheet.sort
Can anybody help with the internal data structure of that module? Can it be sorted before it is written to disk?
I am the author of the module and the short answer is that this can't or shouldn't be done.
It is possible to sort worksheet data in Excel at runtime but that isn't part of the file specification so it can't be done with XlsxWriter.
One solution would be to create a separate data structure in Python, sort it, and write it with xlsxwriter afterwards, but that is not very elegant and requires a lot of data type handling.
That sounds like a reasonable solution to me.
You should process your data before writing it to a Workbook as it is not easily possible to manipulate the data once in the spreadsheet.
The following example would write a column of numbers unsorted:
import xlsxwriter
with xlsxwriter.Workbook("Trial.xlsx") as workbook:
    worksheet = workbook.add_worksheet("first")
    data = [5, 2, 7, 3, 8, 1]
    for rowy, value in enumerate(data):
        worksheet.write_number(rowy, 0, value)  # use column 0
But if you first sort the data as follows:
import xlsxwriter
with xlsxwriter.Workbook("Trial.xlsx") as workbook:
    worksheet = workbook.add_worksheet("first")
    data = sorted([5, 2, 7, 3, 8, 1])
    for rowy, value in enumerate(data):
        worksheet.write_number(rowy, 0, value)  # use column 0
You would get a worksheet whose column A holds the values in ascending order: 1, 2, 3, 5, 7, 8.
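When the results have several columns, the same approach extends to sorting whole rows before writing them. The following is only a sketch: the row layout, the sample values and the helper name write_sorted are made up for illustration, and the worksheet argument is assumed to be an XlsxWriter worksheet, of which only the write_row method is used.

```python
from operator import itemgetter

# Hypothetical results: (name, score, runtime) tuples collected during a calculation.
results = [
    ("run_c", 0.71, 12.5),
    ("run_a", 0.93, 30.1),
    ("run_b", 0.85, 18.0),
]


def write_sorted(worksheet, rows, sort_column, descending=False):
    """Sort rows by one column, then write them starting at cell A1."""
    ordered = sorted(rows, key=itemgetter(sort_column), reverse=descending)
    for row_index, row in enumerate(ordered):
        # write_row(row, col, data) writes a sequence of values along one row.
        worksheet.write_row(row_index, 0, row)
    return ordered
```

With a real workbook this could be called as write_sorted(workbook.add_worksheet("first"), results, sort_column=1, descending=True), so the sorting happens entirely in Python and the worksheet only ever receives the final order.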
I have a CSV file, diseases_matrix_KNN.csv, which holds an Excel-style table.
Now, I would like to store all the numbers from a row like this:
Hypothermia = [0,-1,0,0,0,0,0,0,0,0,0,0,0,0]
For some reason I am unable to find a solution to this, even though I have looked. Please let me know how I can read this type of data into the chosen form using Python.
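The most common way to work with Excel-style data is to use pandas.
Here is an example:
import pandas as pd
# Assumes the first column of the CSV holds the row labels (e.g. 'Hypothermia').
df = pd.read_csv(filename, index_col=0)
print(df.loc['Hypothermia'].tolist())  # the row's values as a list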
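If pandas is not available, the standard csv module can build the same per-row lists. This is a minimal sketch, assuming the first row is a header and the first column holds the disease name; the sample text is invented stand-in data, not the contents of diseases_matrix_KNN.csv, and rows_by_label is a hypothetical helper name.

```python
import csv
import io

# Invented sample standing in for the real file.
sample = io.StringIO(
    "disease,s1,s2,s3\n"
    "Hypothermia,0,-1,0\n"
    "Influenza,1,0,-1\n"
)


def rows_by_label(handle):
    """Map each row label to the list of integers that follows it."""
    reader = csv.reader(handle)
    next(reader)  # skip the header row
    return {row[0]: [int(cell) for cell in row[1:]] for row in reader if row}


matrix = rows_by_label(sample)
print(matrix["Hypothermia"])  # prints [0, -1, 0]
```

For a file on disk, the same function can be given a handle from open("diseases_matrix_KNN.csv", newline="").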
I'm having troubles writing something that I believe should be relatively easy.
I have a template Excel file that has some visualizations on it across a few spreadsheets. I want to write a script that loads the template, inserts rows from an existing dataframe into specific cells on each sheet, and saves the result as a new Excel file.
The template already has all the cells designed and the visualizations in place, so I want to insert only the data without changing the design.
I tried several packages and none of them seemed to work for me.
Thanks for your help! :-)
I have written a package for inserting Pandas DataFrames into Excel sheets (specific rows/cells/columns). It's called pyxcelframe:
https://pypi.org/project/pyxcelframe/
Its documentation is short and simple, and the method you need is insert_frame.
So, let's say we have a Pandas DataFrame called df that we want to insert into the sheet named "MySheet" of the Excel file "MyWorkbook", starting from cell B5. We can use the insert_frame function as follows:
from pyxcelframe import insert_frame
from openpyxl import load_workbook
workbook = load_workbook("MyWorkbook.xlsx")
worksheet = workbook["MySheet"]
insert_frame(worksheet=worksheet,
             dataframe=df,
             row_range=(5, 0),
             col_range=(2, 0))
A 0 as the second element of row_range or col_range means that no ending row or column is specified. If you need a specific ending row or column, replace the 0 with it.
Sounds like a job for xlwings. You didn't post any test data, but modifying the code below to suit your needs should be quite straightforward.
import xlwings as xw
wb = xw.Book('your_excel_template.xlsx')
wb.sheets['Sheet1'].range('A1').value = df[your_selected_rows]
wb.save('new_file.xlsx')
wb.close()
I am creating a dataframe with a bunch of calculations and adding new columns using these formulas (calculations). Then I am saving the dataframe to an Excel file.
I lose the formula after I save the file and open the file again.
For example, I am using something like:
total = 16
for s in range(total):
    df_summary['Slopes(avg)' + str(s)] = df_summary[['Slope_S' + str(s)]].mean(axis=1)*df_summary['Correction1']/df_summary['Correction2'].mean(axis=1)
How can I make sure this formula appears in the Excel file I write to, the way a formula does in an Excel worksheet?
You can write formulas to an Excel file using the XlsxWriter module. Use .write_formula() (https://xlsxwriter.readthedocs.org/worksheet.html#worksheet-write-formula). If you're not attached to using an Excel file to store your dataframe, you might want to look into using the pickle module.
import pickle
# to save
with open('saved_df.p', 'wb') as f:
    pickle.dump(df, f)
# to load
with open('saved_df.p', 'rb') as f:
    df = pickle.load(f)
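For the write_formula() route, the formula has to be supplied as a text string for each cell. The sketch below only builds those strings. The column layout (slope in B, the two corrections in C and D, data starting at Excel row 2) and the helper names col_letter and slope_formula are assumptions made for illustration, and the formula is a simplified stand-in for the calculation in the question.

```python
def col_letter(index):
    """Convert a zero-based column index to an Excel column name (0 -> 'A')."""
    name = ""
    index += 1
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        name = chr(ord("A") + remainder) + name
    return name


def slope_formula(excel_row, slope_col, corr1_col, corr2_col):
    """Build '=slope*correction1/correction2' for one 1-based Excel row."""
    s, c1, c2 = (col_letter(c) for c in (slope_col, corr1_col, corr2_col))
    return f"={s}{excel_row}*{c1}{excel_row}/{c2}{excel_row}"


# Hypothetical layout: slope in column B, corrections in C and D, rows 2 to 4.
formulas = [slope_formula(row, 1, 2, 3) for row in range(2, 5)]
print(formulas)  # prints ['=B2*C2/D2', '=B3*C3/D3', '=B4*C4/D4']
```

Each string can then be handed to worksheet.write_formula(row, col, formula), where row and col are zero-based, so Excel row 2 corresponds to row index 1.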
I think my answer here may be responsive. The short of it is you need to use openpyxl (or possibly xlrd if they've added support for it) to extract the formula, and then xlsxwriter to write the formula back in. It can definitely be done.
This assumes, of course, as @jay s pointed out, that you first write Excel formulas into the DataFrame. (This solution is an alternative to pickling.)
I'm happy to use csv.Dialect objects for reading and writing CSV files in python. My only problem with this now is the following:
it seems like I can't use them as a to_csv parameter in pandas
the to_csv and Dialect (and read_csv) parameters are different (e.g. to_csv has sep instead of delimiter), so generating a key-value parameter list doesn't seem to be a good idea
So I'm a little lost here, what to do.
What can I do if I have a dialect specified but I have a pandas.DataFrame I have to write into CSV? Should I create a parameter mapping by hand?! Should I change to something else from to_csv?
I have pandas-0.13.0.
Note: to_csv(csv.reader(..., dialect=...), ...) didn't work:
need string or buffer, _csv.writer found
If you have a CSV reader, then you don't also need a pandas.read_csv call. You can create a dataframe from a dictionary, so your code would look something like:
# Insert dialect code here to read the CSV into a dictionary of this format (placeholder values):
csv_dict = {'Header_one': [1, 2, 3], 'Header_two': [4, 5, 6]}
df = pd.DataFrame(csv_dict)
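For the writing direction the question asks about, the hand-written parameter mapping can be kept small. This is a sketch rather than an official pandas feature: to_csv_kwargs is a hypothetical helper, and which of these keywords to_csv accepts depends on the pandas version, so they should be checked against the installed release. The line terminator is left out on purpose because its keyword name has changed between pandas versions.

```python
import csv


def to_csv_kwargs(dialect):
    """Translate csv.Dialect attributes into DataFrame.to_csv keyword names."""
    return {
        "sep": dialect.delimiter,  # to_csv calls the delimiter 'sep'
        "quotechar": dialect.quotechar,
        "quoting": dialect.quoting,
        "doublequote": dialect.doublequote,
        "escapechar": dialect.escapechar,
    }


kwargs = to_csv_kwargs(csv.excel_tab)
print(kwargs["sep"] == "\t")  # prints True
# Intended use (not run here): df.to_csv("out.tsv", **kwargs)
```

Keeping the mapping in one function means a name difference such as delimiter versus sep is handled in a single place.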
I have an Excel spreadsheet of about 3 million cells. I asked the following question and liked the answer about saving the spreadsheet as CSV and then processing it with Python:
solution to perform lots of calculations on 3 million data points and make charts
Is there a library that I can use that will read the CSV into a matrix, or should I write one myself?
Does Python speak with VBA at all?
After I am done processing the data, is it simple to put it back in the form of a CSV so that I can open it in Excel for viewing?
Is there a library that I can use that will read the CSV into a matrix, or should I write one myself?
The csv module handles just about everything you could want.
Does Python speak with VBA at all?
Iron Python might.
After I am done processing the data, is it simple to put it back in the form of a CSV so that I can open it in Excel for viewing?
The csv module handles just about everything you could want.
Suggestion: Read this: http://docs.python.org/library/csv.html
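As a rough illustration of the round trip with the csv module, the sketch below reads a CSV into a list of rows, processes it, and writes it back. The in-memory sample data and the helper names read_matrix and write_matrix are made up for the example, and it assumes every cell is numeric.

```python
import csv
import io


def read_matrix(handle):
    """Read every CSV row into a list of floats."""
    return [[float(cell) for cell in row] for row in csv.reader(handle) if row]


def write_matrix(handle, matrix):
    """Write a list of rows back out as CSV."""
    csv.writer(handle).writerows(matrix)


# Invented sample standing in for the exported spreadsheet.
source = io.StringIO("1,2,3\n4,5,6\n")
matrix = read_matrix(source)

# Example processing step: double every value.
doubled = [[2 * value for value in row] for row in matrix]

output = io.StringIO()
write_matrix(output, doubled)
print(output.getvalue())
```

For real files, both helpers can be given handles opened with newline="" as the csv documentation recommends, and the written file opens directly in Excel.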
I like NumPy's loadtxt for this sort of thing. Very configurable for reading CSVs. And savetxt for putting it back after manipulation. Or you could check out the built in csv module if you'd rather not install anything new.
If we speak pythonish, why not use http://www.python-excel.org/ ?
Example of reading a file:
import xlrd
rb = xlrd.open_workbook('file.xls', formatting_info=True)
sheet = rb.sheet_by_index(0)
# Print every cell value, row by row
for rownum in range(sheet.nrows):
    row = sheet.row_values(rownum)
    for c_el in row:
        print(c_el)
Writing the new file:
import xlwt
from datetime import datetime
font0 = xlwt.Font()
font0.name = 'Times New Roman'
font0.colour_index = 2
font0.bold = True
style0 = xlwt.XFStyle()
style0.font = font0
style1 = xlwt.XFStyle()
style1.num_format_str = 'D-MMM-YY'
wb = xlwt.Workbook()
ws = wb.add_sheet('A Test Sheet')
ws.write(0, 0, 'Test', style0)
ws.write(1, 0, datetime.now(), style1)
ws.write(2, 0, 1)
ws.write(2, 1, 1)
ws.write(2, 2, xlwt.Formula("A3+B3"))
wb.save('example.xls')
There are other examples on the page.
If you don't want to deal with changing back and forth from CSV, you can use win32com, which can be downloaded here: http://python.net/crew/mhammond/win32/Downloads.html