I am exporting a pandas DataFrame to an Excel file, following a tutorial, but the resulting file does not include the highlighting and I have no idea why.
To style it:
df_styled = df.style.apply(lambda x: ['background: orange' for x in df.Margin_rate], axis=0)
and then to export it:
df_styled.to_excel('excel_python_tutorial_marked.xlsx', engine='openpyxl', index=False)
I have made sure to create a new df to export it and everything, where am I going wrong?
Because it's meant to look like this:
But instead it looks normal in excel:
Apparently you need to pass style information explicitly into the openpyxl writer. Maybe this helps.
I have had a good experience with the following, but you might need additional packages and restructure your code a little: https://xlsxwriter.readthedocs.io/example_pandas_column_formats.html
I read a CSV file with pandas.read_csv("sample.csv", sep=";") and got this output:
However I want to get my dataframe output like this:
Is this possible?
You can try 'tabulate' library. Maybe not exactly the same layout you show, but might help.
I am starting to learn the pandas module in Python. However, my issue is with the rename method. The rename works fine when I use print, which shows the column has been renamed:
print(data.rename(columns={"Rep": "Name"}))
However, when I use print(data) to show all of the data from the document, the column does not show as renamed. The renamed column also does not appear when the file is exported using data.to_csv("example.csv").
Would really appreciate if somebody could shed some light on this please.
Full Source code below:
import pandas as pd
data = pd.read_excel(r"D:\Downloads\Book1.xlsx")
del data["Region"]
del data["Item"]
print(data.rename(columns={"Rep": "Name"}))
print(data)
data.to_csv("example.csv")
Use inplace argument, to make the changes reflect in the DataFrame as well, like this:
data.rename(columns={"Rep": "Name"}, inplace = True)
Try adding 'inplace=True' to data.rename. With inplace=True the call returns None, so print the DataFrame itself afterwards:
data.rename(columns={"Rep": "Name"}, inplace=True)
print(data)
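To make the behaviour concrete, here is a small sketch with made-up data (only the "Rep" and "Name" column names come from the question). It shows that rename returns a new DataFrame and leaves the original alone unless you assign the result back or pass inplace=True.

```python
import pandas as pd

# Made-up stand-in for the spreadsheet in the question.
data = pd.DataFrame({"Rep": ["Jones", "Kivell"], "Units": [95, 50]})

# rename() returns a new DataFrame; `data` itself is not modified.
renamed = data.rename(columns={"Rep": "Name"})
print(list(data.columns))     # the original still has "Rep"
print(list(renamed.columns))  # the returned copy has "Name"

# Option 1: assign the result back to the same variable.
data = data.rename(columns={"Rep": "Name"})

# Option 2 (same effect): data.rename(columns={"Rep": "Name"}, inplace=True)
# With inplace=True the call returns None, so do not wrap it in print().
print(list(data.columns))
```

Either option works; assigning the result back tends to be easier to read when several operations are chained.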
I have to remove duplicate rows from a *.xlsx file for a project. My code is below. In the output file, however, the date values turn into the "yy-mm-dd hh:mm:ss" format after running my code. What causes this weird problem, and how can I fix it?
Running it on Pycharm 2019.2 Pro and Python 3.7.4
import pandas
mExcelFile = pandas.read_excel('Input/ogr.xlsx')
mExcelFile.drop_duplicates(subset=['FName', 'LName', 'Class', '_KDT'], inplace=True)
mExcelFile.to_excel('Output/NoDup.xlsx')
I expect the dates to stay in their original format, which is "dd.mm.yy", but the values become "yy-mm-dd hh:mm:ss".
To control date format when writing to Excel, try this:
writer = pd.ExcelWriter(fileName, engine='xlsxwriter', datetime_format='dd/mm/yy')
df.to_excel(writer)
Actually, the answer from the link below solved it. Since I am new to Python programming, I didn't realize where the problem was: pandas was converting the cell values to datetimes. Detailed answer: https://stackoverflow.com/a/49159393/11584604
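As a rough illustration of the datetime conversion, the sketch below uses made-up data with a hypothetical "Date" column (the question does not say which column holds the dates). It takes a different route from the writer option above: it turns the datetimes into "dd.mm.yy" text before writing, which means Excel will see the cells as text rather than real dates.

```python
import pandas as pd

# Made-up stand-in for the data read from the input workbook.
# "Date" is a hypothetical column name; FName comes from the question.
frame = pd.DataFrame({
    "FName": ["Ada", "Ada", "Alan"],
    "Date": pd.to_datetime(["2019-03-05", "2019-03-05", "2019-11-21"]),
})
frame = frame.drop_duplicates(subset=["FName", "Date"])

# pandas holds the column as datetime64, which is why the written
# cells show a full timestamp unless a format is specified.
print(frame["Date"].dtype)

# Convert the datetimes to "dd.mm.yy" strings before exporting.
as_text = frame.assign(Date=frame["Date"].dt.strftime("%d.%m.%y"))
print(as_text["Date"].tolist())
```

If the cells should remain real Excel dates, the datetime_format argument of ExcelWriter shown in the answer above is the better fit.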
I am trying to restructure the way my precipitation data is organized in an Excel file. To do this, I've written the following code:
import pandas as pd
df = pd.read_excel('El Jem_Souassi.xlsx', sheet_name=None, header=None)  # sheet_name replaces the older sheetname keyword
data = df["El Jem"]
T = []
for column in range(1, 56):
    liste = data[column].tolist()
    for row in range(1, len(liste)):
        liste[row] = str(liste[row])
        if liste[row] != 'nan':
            T.append(liste[row])
result=pd.DataFrame(T)
result
This code works fine and through Jupyter I can see that the result is good
However, I am facing a problem when attempting to save this dataframe to a csv file.
result.to_csv("output.csv")
The resulting file contains the vertical index column and it seems I am unable to call for a specific cell.
(Hopefully, someone can help me with this problem)
Many thanks !!
It's all in the docs.
You are interested in skipping the index column, so do:
result.to_csv("output.csv", index=False)
If you also want to skip the header add:
result.to_csv("output.csv", index=False, header=False)
I don't know what your input data looks like (it is a good idea to make it available in your question). But note that currently you can obtain the same results just by doing:
import pandas as pd
df = pd.DataFrame([0]*16)
df.to_csv('results.csv', index=False, header=False)
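To see what the two flags change, here is a small sketch with made-up values. It calls to_csv without a file name, which makes pandas return the CSV text as a string so the three variants can be compared directly. It also shows one way to read a single cell by position.

```python
import pandas as pd

# Made-up stand-in for the one-column result in the question.
result = pd.DataFrame(["12.5", "0.0", "3.2"])

# With no path argument, to_csv() returns the CSV text as a string.
with_index = result.to_csv()                      # index column and header row
no_index = result.to_csv(index=False)             # header row only
bare = result.to_csv(index=False, header=False)   # values only

print(with_index)
print(no_index)
print(bare)

# A single cell can be read by position with iloc[row, column].
first_value = result.iloc[0, 0]
print(first_value)
```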
I am trying to edit several excel files (.xls) without changing the rest of the sheet. The only thing close so far that I've found is the xlrd, xlwt, and xlutils modules. The problem with these is it seems that xlrd evaluates formulae when reading, then puts the answer as the value of the cell. Does anybody know of a way to preserve the formulae so I can then use xlwt to write to the file without losing them? I have most of my experience in Python and CLISP, but could pick up another language pretty quick if they have better support. Thanks for any help you can give!
I had the same problem... And eventually found the next module:
from openpyxl import load_workbook
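def Write_Workbook():
    # path, "Sheet_name" and Some_value are placeholders for your own values
    wb = load_workbook(path)
    ws = wb["Sheet_name"]  # replaces the deprecated wb.get_sheet_by_name()
    c = ws.cell(row=2, column=1)
    c.value = Some_value
    wb.save(path)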
==> Doing this, my file got saved preserving all formulas inserted before.
Hope this helps!
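A self-contained sketch of the same idea, for anyone who wants to check it without an existing file: it builds a tiny workbook in memory, reopens it, changes one value, saves it again and confirms the formula text is still there. The cell addresses and values are made up. Note that openpyxl works with .xlsx files, so .xls files like the ones in the question would need converting first.

```python
import io

from openpyxl import Workbook, load_workbook

# Build a small workbook in memory as a stand-in for an existing .xlsx file.
wb = Workbook()
ws = wb.active
ws.title = "Sheet_name"
ws["A1"] = 10
ws["A2"] = 20
ws["A3"] = "=SUM(A1:A2)"
original = io.BytesIO()
wb.save(original)

# Reopen it. data_only defaults to False, so formulas are loaded as
# formula strings rather than as cached results.
original.seek(0)
wb = load_workbook(original)
wb["Sheet_name"]["A1"] = 99   # edit one cell, leave the rest alone
edited = io.BytesIO()
wb.save(edited)

# Load the edited copy and check that the formula survived the round trip.
edited.seek(0)
check = load_workbook(edited)["Sheet_name"]
formula = check["A3"].value
print(formula)
```

The cached result of the formula is not recalculated by openpyxl; Excel recomputes it when the file is next opened.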
I've used the xlwt.Formula function before to be able to get hyperlinks into a cell. I imagine it will also work with other formulas.
Update: Here's a snippet I found in a project I used it in:
link = xlwt.Formula('HYPERLINK("%s";"View Details")' % url)
sheet.write(row, col, link)
As of now, xlrd doesn't read formulas. It's not that it evaluates them, it simply doesn't read them.
For now, your best bet is to programmatically control a running instance of Excel, either via pywin32 or Visual Basic or VBScript (or some other Microsoft-friendly language which has a COM interface). If you can't run Excel, then you may be able to do something analogous with OpenOffice.org instead.
We've just had this problem and the best we can do is to manually re-write the formulas as text, then convert them to proper formulas on output.
So open Excel and replace =SUM(C5:L5) with "=SUM(C5:L5)", including the quotes. If the formula contains a double quote, replace it with two double quotes to escape it, so = "a" & "b" becomes "= ""a"" & ""b"" ".
Then in your Python code, loop over every cell in the source and output sheets and do:
output_sheet.write(row, col, xlwt.ExcelFormula.Formula(source_cell[1:-1]))
We use this SO answer to make a copy of the source sheet to be the output sheet, which even preserves styles, and avoids overwriting the hand written text formulas from above.