How to keep thousand separator by writing pandas df in excel

I have a question about writing a pandas DataFrame to Excel. My numbers use . as the thousands separator, but after writing to Excel the separator changes to ,. How can I write my data without changing the separator?
This is how it looks in Jupyter notebook: [screenshot]
And here is how it looks in Excel: [screenshot]
import pandas as pd
from openpyxl import load_workbook

def pivot_to_excel(book, excelfilename, PivotTable, s_name, writer):
    # Hand the existing worksheets to the writer so pandas writes into them
    writer.sheets = {ws.title: ws for ws in book.worksheets}
    for sheetname in writer.sheets:
        if sheetname == s_name:
            # Start below the last used row of the target sheet
            PivotTable.to_excel(writer, sheet_name=sheetname, startrow=writer.sheets[sheetname].max_row, index=False)

wb = load_workbook(filename)
sheet = wb[s_name]
writer = pd.ExcelWriter(filename, engine='openpyxl')
pivot_to_excel(wb, filename, prepared_data, s_name, writer)
UPD:
If my data is mixed like this, it looks correct after writing to Excel: [screenshot]
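One possible approach, offered as a sketch rather than a confirmed fix: in Excel number format codes the comma in '#,##0' is the thousands-grouping placeholder, and the separator character actually displayed follows the regional settings of the Excel installation that opens the file. If the dot must appear literally regardless of locale, the values can be written as text instead, at the cost of no longer being numeric in Excel. The sample values and column names below are assumptions for illustration.

```python
from openpyxl import Workbook

values = [1234567, 89012, 345]  # hypothetical sample data

wb = Workbook()
ws = wb.active
ws.append(["as_number", "as_text"])
for value in values:
    # Text column: format in Python, then swap "," for "." so the dot is literal
    ws.append([value, f"{value:,}".replace(",", ".")])

# Number column: "#,##0" asks Excel to group thousands; Excel picks the
# displayed separator character from its regional settings
for cell in ws["A"][1:]:
    cell.number_format = "#,##0"

text_column = [cell.value for cell in ws["B"][1:]]
```

The number column stays numeric (it can be summed and sorted in Excel), while the text column trades that for a fixed appearance.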

Related

Openpyxl column number formatting

I'm using openpyxl to append formatted DataFrame rows to an existing Excel file, or to create a new file if none exists, with the following code:
import os
import openpyxl
import pandas as pd
from openpyxl.utils.dataframe import dataframe_to_rows

if os.path.isfile(transformed_file):  # if the file exists, load it and append
    workbook = openpyxl.load_workbook(transformed_file)
    sheet = workbook['Sheet1']
    for row in dataframe_to_rows(df, header=False, index=False):
        sheet.append(row)
    workbook.save(transformed_file)
    workbook.close()
else:  # create the Excel file if it doesn't already exist
    with pd.ExcelWriter(path=transformed_file, engine='openpyxl') as writer:
        df.to_excel(writer, index=False, sheet_name='Sheet1')
I need to format column 'G' as a plain number ('0'); at the moment, when I open the Excel file, the values are shown as '1.23E+10'.
How can this be achieved for the sample above? Thank you!

Hello, try the following code and see if it works for you:
from openpyxl import Workbook

wb = Workbook()
ws = wb.active
ws['A1'] = 123455656565464563302589013
ws['B1'] = 123455656565464563302589013
ws['A1'].number_format = '0'  # plain number formatting
ws['B1'].number_format = '0.00E+00'  # scientific formatting
wb.save("formating_test.xlsx")

I found a solution that worked for me. I realized from the documentation that the format has to be set on each cell individually.
for cell in sheet['D']:
    cell.number_format = '0'
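The per-cell loop can be wrapped in a small helper. This is a minimal sketch, assuming an in-memory workbook and a hypothetical function name set_column_number_format; skipping the first row assumes the sheet has a header there.

```python
from openpyxl import Workbook


def set_column_number_format(sheet, column_letter, number_format="0", skip_header=True):
    """Apply number_format to every cell of one column (hypothetical helper)."""
    cells = sheet[column_letter]  # tuple of the cells in that column
    for cell in (cells[1:] if skip_header else cells):
        cell.number_format = number_format


wb = Workbook()
ws = wb.active
ws.append(["id", "value"])
ws.append([1, 12345678901])
ws.append([2, 98765432109])

set_column_number_format(ws, "B", "0")
formats = [cell.number_format for cell in ws["B"]]
```

Rows appended after this call keep the default format, so the helper would need to run again after each append.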

Pandas creates new excel sheet when trying to append to existing sheet

I have code that should read the data from the current sheet into df_old, append the new data to it using df = df_old.append(df), and then replace the data in the sheet with the combined DataFrame. Instead, it creates a new sheet with the exact same name and writes the combined DataFrame there. I tried adding if_sheet_exists="replace" as an argument to ExcelWriter, but this did not change anything. How can I force it to overwrite the data in the sheet with the current name?
df_old = pd.read_excel(r'C:\Users\XXX\Downloads\Digitalisation\mat_flow\reblend_v2.xlsx', sheet_name=ft_tags_final[i][j])
df = df_old.append(df)
with pd.ExcelWriter(r'C:\Users\XXX\Downloads\Digitalisation\mat_flow\reblend_v2.xlsx', engine="openpyxl", mode="a", if_sheet_exists="replace") as writer:
    df.to_excel(writer, index=False, sheet_name=ft_tags_final[i][j])

I had the same issue and solved it by writing instead of appending. I also used openpyxl instead of xlsxwriter.
import pandas as pd
from openpyxl import load_workbook

book = load_workbook('Wallet.xlsx')
writer = pd.ExcelWriter('Wallet.xlsx', engine='openpyxl')
writer.book = book
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
# ^ These are the most important lines, because they give pandas the existing sheets
Data.to_excel(writer, sheet_name='Main', header=None, index=False, startcol=number, startrow=counter)
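With pandas 1.3 or later, the read, combine, and replace cycle described in the question can be sketched as follows. This is only an illustration: the temporary file, the sheet name Tag1, and the sample columns are hypothetical, and pd.concat is used in place of DataFrame.append, which was removed in pandas 2.0.

```python
import os
import tempfile

import pandas as pd

path = os.path.join(tempfile.mkdtemp(), "reblend_demo.xlsx")  # hypothetical file
sheet = "Tag1"  # hypothetical sheet name

# Create a workbook that already holds some rows
pd.DataFrame({"a": [1, 2], "b": [3, 4]}).to_excel(path, sheet_name=sheet, index=False)

df_new = pd.DataFrame({"a": [5], "b": [6]})
df_old = pd.read_excel(path, sheet_name=sheet)
combined = pd.concat([df_old, df_new], ignore_index=True)

# mode="a" keeps the other sheets; if_sheet_exists="replace" swaps out this one
with pd.ExcelWriter(path, engine="openpyxl", mode="a", if_sheet_exists="replace") as writer:
    combined.to_excel(writer, sheet_name=sheet, index=False)

result = pd.read_excel(path, sheet_name=None)  # dict with one DataFrame per sheet
```

Replacing a sheet this way discards any cell formatting the old sheet carried, since pandas recreates it from the DataFrame.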

append dataframe to excel with pandas

I want to append a DataFrame to an Excel file.
This code works nearly as desired, but it does not append. The first time I run it, it puts the DataFrame in the Excel file, but on later runs nothing is appended. I have also heard that openpyxl is CPU-intensive, but I have not heard of many workarounds.
import pandas
from openpyxl import load_workbook
book = load_workbook('C:\\OCC.xlsx')
writer = pandas.ExcelWriter('C:\\OCC.xlsx', engine='openpyxl')
writer.book = book
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
df1.to_excel(writer, index = False)
writer.save()
I want the data to be appended each time I run it, but this is not happening.
The output looks like the original data:
A B C
H H H
After running it a second time, I want:
A B C
H H H
H H H
Apologies if this is obvious; I am new to Python, and the examples I practised with did not work as I wanted.
The question is: how can I append data each time I run it? I tried changing to xlsxwriter but got AttributeError: 'Workbook' object has no attribute 'add_format'

First of all, this post is the first piece of the solution; it shows that you should specify startrow=:
Append existing excel sheet with new dataframe using python pandas
You might also consider header=False, so it should look like:
df1.to_excel(writer, startrow=2, index=False, header=False)
If you want it to automatically go to the end of the sheet and append your df there, use:
startrow = writer.sheets['Sheet1'].max_row
And if you want it to go over all of the sheets in the workbook:
for sheetname in writer.sheets:
    df1.to_excel(writer, sheet_name=sheetname, startrow=writer.sheets[sheetname].max_row, index=False, header=False)
By the way, for writer.sheets you could use a dictionary comprehension (I think it's cleaner, but that's up to you; it produces the same output):
writer.sheets = {ws.title: ws for ws in book.worksheets}
So the full code will be:
import pandas
from openpyxl import load_workbook

book = load_workbook('test.xlsx')
writer = pandas.ExcelWriter('test.xlsx', engine='openpyxl')
writer.book = book
writer.sheets = {ws.title: ws for ws in book.worksheets}
for sheetname in writer.sheets:
    df1.to_excel(writer, sheet_name=sheetname, startrow=writer.sheets[sheetname].max_row, index=False, header=False)
writer.save()
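On newer pandas versions (1.4 or later, where if_sheet_exists="overlay" is available), the same idea of appending below the last used row can be written without assigning writer.book or writer.sheets, and without writer.save(), which was removed in pandas 2.0. A minimal sketch, assuming a hypothetical temporary file and a single sheet named Sheet1:

```python
import os
import tempfile

import pandas as pd

path = os.path.join(tempfile.mkdtemp(), "append_demo.xlsx")  # hypothetical file
df1 = pd.DataFrame({"A": ["H"], "B": ["H"], "C": ["H"]})

# First run: create the file, including the header row
df1.to_excel(path, sheet_name="Sheet1", index=False)

# Later runs: open in append mode and write below the last used row
with pd.ExcelWriter(path, engine="openpyxl", mode="a", if_sheet_exists="overlay") as writer:
    start = writer.sheets["Sheet1"].max_row  # 1-based last row == 0-based next row
    df1.to_excel(writer, sheet_name="Sheet1", startrow=start, index=False, header=False)

result = pd.read_excel(path, sheet_name="Sheet1")
```

Leaving the with block saves and closes the file, so no explicit save call is needed.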

You can use the append_df_to_excel() helper function, which is defined in this answer:
Usage examples:
filename = r'C:\OCC.xlsx'
append_df_to_excel(filename, df)
append_df_to_excel(filename, df, header=None, index=False)
append_df_to_excel(filename, df, sheet_name='Sheet2', index=False)
append_df_to_excel(filename, df, sheet_name='Sheet2', index=False, startrow=25)

All the examples here are quite complicated.
The approach in the documentation is much simpler:
def append_to_excel(fpath, df, sheet_name):
    with pd.ExcelWriter(fpath, mode="a") as f:
        df.to_excel(f, sheet_name=sheet_name)

append_to_excel(<your_excel_path>, <new_df>, <new_sheet_name>)
When using this on Excel files created by LibreOffice/OpenOffice, I get the error:
KeyError: "There is no item named 'xl/drawings/drawing1.xml' in the archive"
which is a bug in openpyxl, as mentioned here.

I tried reading the Excel file into a DataFrame and then concatenating it with the new DataFrame. It worked for me.
def append_df_to_excel(df, excel_path):
    df_excel = pd.read_excel(excel_path)
    result = pd.concat([df_excel, df], ignore_index=True)
    result.to_excel(excel_path, index=False)

df = pd.DataFrame({"a": [11, 22, 33], "b": [55, 66, 77]})
append_df_to_excel(df, r"<path_to_dir>\<out_name>.xlsx")

If someone needs it, I found an easier way:
Convert the DataFrame to a list of rows:
rows = your_df.values.tolist()
Load your workbook:
workbook = load_workbook(filename=your_excel)
Pick your sheet:
sheet = workbook[your_sheet]
Iterate over the rows to append each one:
for row in rows:
    sheet.append(row)
Save the workbook when done:
workbook.save(filename=your_excel)
Putting it all together:
from openpyxl import load_workbook

rows = your_df.values.tolist()
workbook = load_workbook(filename=your_excel)
sheet = workbook[your_sheet]
for row in rows:
    sheet.append(row)
workbook.save(filename=your_excel)

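import os
import pandas as pd

def append_to_excel(fpath, df):
    # Read the existing rows if the file is there, otherwise start empty
    if os.path.exists(fpath):
        x = pd.read_excel(fpath)
    else:
        x = pd.DataFrame()
    # Existing rows first, new rows after them
    dfNew = pd.concat([x, df])
    dfNew.to_excel(fpath, index=False)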

How to write on existing excel files without losing previous information using python?

I need to write a program that scrapes a daily quote from a certain web page and collects the quotes into a single Excel file. I wrote something that finds the next empty row and starts writing new quotes there, but it deletes the previous rows too:
wb = openpyxl.load_workbook('gold_quote.xlsx')
sheet = wb.get_sheet_by_name('Sheet1')
.
.
.
z = 1
x = sheet['A{}'.format(z)].value
while x is not None:
    x = sheet['A{}'.format(z)].value
    z += 1
writer = pd.ExcelWriter('quote.xlsx')
df.to_excel(writer, sheet_name='Sheet1',na_rep='', float_format=None,columns=['Date', 'Time', 'Price'], header=True,index=False, index_label=None, startrow=z-1, startcol=0, engine=None,merge_cells=True, encoding=None, inf_rep='inf', verbose=True, freeze_panes=None)
writer.save()

Question: How to write on existing excel files without losing previous information
openpyxl's append() writes after the last used row:
import openpyxl

wb = openpyxl.load_workbook('gold_quote.xlsx')
sheet = wb['Sheet1']  # get_sheet_by_name() is deprecated
rowData = ['2017-08-01', '16:31', 1.23]
sheet.append(rowData)
wb.save('gold_quote.xlsx')

writer.book = wb
writer.sheets = dict((ws.title, ws) for ws in wb.worksheets)

I figured it out. First we define a reader to read the existing data from the Excel file, then we concatenate it with the data just extracted from the web and write the result with a writer. We should also drop duplicates; otherwise, every time the program is executed, duplicated rows accumulate. Then we can write the previous and new data together:
excel_reader = pd.ExcelFile('gold_quote.xlsx')
to_update = {"Sheet1": df}
excel_writer = pd.ExcelWriter('gold_quote.xlsx')
for sheet in excel_reader.sheet_names:
    sheet_df = excel_reader.parse(sheet)
    append_df = to_update.get(sheet)
    if append_df is not None:
        # Concatenate the new rows for this sheet and drop exact duplicates
        sheet_df = pd.concat([sheet_df, append_df]).drop_duplicates()
    sheet_df.to_excel(excel_writer, sheet_name=sheet, index=False)
excel_writer.save()
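The concatenate-then-deduplicate step can be tried on its own, without any Excel file. A small sketch with made-up quotes (the dates, times, and prices are hypothetical):

```python
import pandas as pd

# Rows already stored in the workbook (hypothetical values)
existing = pd.DataFrame({
    "Date": ["2017-08-01", "2017-08-02"],
    "Time": ["16:31", "16:30"],
    "Price": [1.23, 1.25],
})

# Rows from the latest scrape; the first one repeats a stored row
scraped = pd.DataFrame({
    "Date": ["2017-08-02", "2017-08-03"],
    "Time": ["16:30", "16:29"],
    "Price": [1.25, 1.27],
})

# drop_duplicates() keeps the first occurrence of each fully identical row
merged = pd.concat([existing, scraped], ignore_index=True).drop_duplicates(ignore_index=True)
```

Only rows that match in every column are dropped, so a quote with the same date and time but a different price would be kept as a separate row.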

Write a pandas df into Excel and save it into a copy

I have a pandas DataFrame and I want to open an existing Excel workbook that contains formulas, copy the DataFrame into a specific set of columns (let's say from column A to column H), and save the result as a new file with a different name.
The idea is to update an existing template, populate it with the DataFrame in a specified set of columns, and then save a copy of the Excel file under a different name.
Any ideas?
What I have is:
import pandas
from openpyxl import load_workbook
book = load_workbook('Template.xlsx')
writer = pandas.ExcelWriter('Template.xlsx', engine='openpyxl')
writer.book = book
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
df.to_excel(writer)
writer.save()

The code below should work, assuming that you are happy to copy into column A. I don't see a way to write into the sheet starting in a different column (without overwriting anything).
It incorporates @MaxU's suggestion of copying the template sheet before writing to it (having just lost a few hours' work on my own template workbook to pd.to_excel).
import pandas as pd
from openpyxl import load_workbook
from openpyxl.utils.dataframe import dataframe_to_rows
from shutil import copyfile

template_file = 'Template.xlsx'  # Has a header in row 1 already
output_file = 'Result.xlsx'  # What we are saving the template as
# Copy Template.xlsx as Result.xlsx
copyfile(template_file, output_file)
# Read in the data to be pasted into the template
df = pd.read_csv('my_data.csv')
# Load the workbook and access the sheet we'll paste into
wb = load_workbook(output_file)
ws = wb['Existing Result Sheet']
# Selecting a cell in the header row before writing makes append()
# start writing to the following line i.e. row 2
ws['A1']
# Write each row of the DataFrame
# In this case, I don't want to write the index (useless) or the header (already in the template)
for r in dataframe_to_rows(df, index=False, header=False):
    ws.append(r)
wb.save(output_file)

Try this:
df.to_excel(writer, startrow=10, startcol=1, index=False, engine='openpyxl')
Pay attention to the startrow and startcol parameters.
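For the column-range requirement in the question, another option is to write cell by cell with openpyxl's ws.cell(), which takes 1-based row and column numbers. A minimal sketch, assuming a hypothetical helper write_df_at and an in-memory workbook standing in for the copied template; it iterates with df.itertuples rather than dataframe_to_rows, which is a substitution made here for brevity:

```python
import pandas as pd
from openpyxl import Workbook


def write_df_at(ws, df, first_row, first_col):
    """Write df values into ws with the top-left corner at (first_row, first_col).

    Hypothetical helper; rows and columns are 1-based, as in openpyxl.
    """
    for r_offset, row in enumerate(df.itertuples(index=False)):
        for c_offset, value in enumerate(row):
            ws.cell(row=first_row + r_offset, column=first_col + c_offset, value=value)


wb = Workbook()  # stands in for load_workbook() on the copied template
ws = wb.active
ws["A1"] = "keep me"
ws["J2"] = "=SUM(B2:C2)"  # a template formula outside the pasted range

df = pd.DataFrame({"x": [1, 2], "y": [3, 4]})
write_df_at(ws, df, first_row=2, first_col=2)  # top-left corner at B2
```

Cells outside the written rectangle, such as the formula in J2, are left untouched; cells inside it are overwritten.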
