I am trying to import some data into a new Excel file, and then add that sheet to an existing Excel file.
I completed the first step successfully, as shown below, but I am stuck on the second step: adding the new sheet to the existing Excel file.
I want to keep the existing formatting and formulas in the existing workbook. Any help, please? Everything I have found so far uses Pandas for both files, but my existing Excel file cannot sensibly be loaded into a Pandas data structure because it contains many formulas and formatted text.
import json
import pandas as pd
from openpyxl import load_workbook
# Load the JSON data.
with open('data.txt') as f:
    data = json.load(f)
# Create a Pandas dataframe from the data.
df = pd.DataFrame({'Data': data["_owner"]["_accountId"]["averageIRR"]}, index=["averageIRR"])
# Create a Pandas Excel writer using XlsxWriter as the engine.
writer = pd.ExcelWriter('data_sheet.xlsx', engine='xlsxwriter')
# Convert the dataframe to an XlsxWriter Excel object.
df.to_excel(writer, sheet_name='Sheet1')
# Close the Pandas Excel writer and output the Excel file
# (writer.save() was removed in pandas 2.0; close() writes the file).
writer.close()
The data file is as below:
{
  "_id": "58edf90",
  "_owner": {
    "_id": "57e4611f3a",
    "_accountId": {
      "_id": "57e294611f3b",
      "companyName": "authguys",
      "averageIRR": 0
    }
  }
}
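One possible way to do the second step is to open the existing workbook in append mode with the openpyxl engine, so that only a new sheet is added. The following is a minimal sketch, not a definitive solution: the file name `existing.xlsx`, the sheet names `Report` and `Imported`, and the small stand-in workbook are all assumptions made for illustration. openpyxl keeps formulas and most cell formatting on a load-and-save round trip, but it may drop items it cannot read (for example some shapes or images), so it is worth trying on a copy first.

```python
import os
import tempfile

import pandas as pd
from openpyxl import Workbook, load_workbook

# Build a small stand-in for the existing workbook (hypothetical content).
workdir = tempfile.mkdtemp()
existing_path = os.path.join(workdir, "existing.xlsx")
wb = Workbook()
ws = wb.active
ws.title = "Report"
ws["A1"] = 10
ws["A2"] = 32
ws["A3"] = "=SUM(A1:A2)"  # a formula that should survive the append
wb.save(existing_path)

# The DataFrame from the question, using the sample value from data.txt.
df = pd.DataFrame({"Data": 0}, index=["averageIRR"])

# mode="a" loads the existing file with openpyxl and adds a sheet to it
# instead of creating a fresh workbook.
with pd.ExcelWriter(existing_path, engine="openpyxl", mode="a") as writer:
    df.to_excel(writer, sheet_name="Imported")

# Re-open the file to confirm the old sheet and its formula are still there.
check = load_workbook(existing_path)
sheet_names = check.sheetnames
formula = check["Report"]["A3"].value
print(sheet_names, formula)
```

Note that append mode needs the openpyxl engine; XlsxWriter can only create new files.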
I want to automate comparing two Excel spreadsheets: updating old data (call this spreadsheet Old_Data.xlsx) with new data from a different Excel document (New_Data.xlsx) and placing the updated data into a new sheet in Old_Data.xlsx.
I can successfully create the new sheet in Old_Data.xlsx and see the changes between the two data sets. However, the new sheet contains an extra index column labeling the rows of data from 0 to n. I have tried hiding this index so that every sheet in Old_Data.xlsx looks the same, but I cannot get rid of it. See the code below:
from openpyxl import load_workbook
# import xlwings as xl
import pandas as pd
import jinja2
# Load the workbook that is going to be updated with new information.
wb = load_workbook('OldData.xlsx')
# Define the file path for all of the old and new data.
old_path = 'OldData.xlsx'
new_path = 'NewData.xlsx'
# Load the data frames for each Spreadsheet.
df_old = pd.read_excel(old_path)
print(df_old)
df_new = pd.read_excel(new_path)
print(df_new)
# Keep all original information while showing the differences in information, and write
# to a new sheet in the workbook.
difference = pd.merge(df_old, df_new, how='right')
difference = difference.style.format.hide()
print(difference)
# Append the difference to an existing Excel File
with pd.ExcelWriter('OldData.xlsx', mode='a', engine='openpyxl', if_sheet_exists='replace') as writer:
    difference.to_excel(writer, sheet_name="1-25-2023")
This is an image of the table on the second sheet that I am creating: https://i.stack.imgur.com/7Amdf.jpg
I've tried adding the code:
difference = difference.style.format.hide
to get rid of the index column, but I have not succeeded.
Pass index=False as an argument in the last line of your code. It should look something like this:
with pd.ExcelWriter('OldData.xlsx', mode='a', engine='openpyxl', if_sheet_exists='replace') as writer:
    difference.to_excel(writer, sheet_name="1-25-2023", index=False)
I think this should solve your problem.
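To see the effect of `index=False` end to end, here is a small self-contained sketch. The two DataFrames are made-up stand-ins for OldData.xlsx and NewData.xlsx (the column names `id` and `value` are assumptions), and the file is written to a temporary folder rather than next to your real data.

```python
import os
import tempfile

import pandas as pd
from openpyxl import load_workbook

path = os.path.join(tempfile.mkdtemp(), "OldData.xlsx")

# Hypothetical stand-ins for the old and new spreadsheets.
df_old = pd.DataFrame({"id": [1, 2], "value": ["a", "b"]})
df_new = pd.DataFrame({"id": [2, 3], "value": ["b", "c"]})
df_old.to_excel(path, sheet_name="Sheet1", index=False, engine="openpyxl")

# Same merge as in the question; keep it a plain DataFrame (no .style needed).
difference = pd.merge(df_old, df_new, how="right")

with pd.ExcelWriter(path, mode="a", engine="openpyxl",
                    if_sheet_exists="replace") as writer:
    # index=False stops pandas from writing the 0..n row labels as a column.
    difference.to_excel(writer, sheet_name="1-25-2023", index=False)

# Read the new sheet back: the first column is now real data, not the index.
ws = load_workbook(path)["1-25-2023"]
header = [cell.value for cell in ws[1]]
first_row = [cell.value for cell in ws[2]]
print(header, first_row)
```

The `difference.style...` line from the question is not needed for this; the Styler only affects how the frame is rendered, whereas `index=False` controls what `to_excel` writes.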
I am new to Python and am trying to write into a merged cell in Excel. I can read the data that is already stored in this cell/row, so I know it is there. However, when I try to overwrite it, nothing happens.
I have tried changing the index and header arguments as well, but nothing seems to work.
import pandas as pd
from openpyxl import load_workbook
# Read the excel file into a pandas DataFrame
df = pd.read_excel('file here', sheet_name='Sheet1')
print(df.iloc[8, 2])
# Make the changes to the DataFrame
df.iloc[8, 2] = "Bob Smith"
# Load the workbook
book = load_workbook('file here')
writer = pd.ExcelWriter('file here', engine='openpyxl')
writer.book = book
# Write the DataFrame to the first sheet
df.to_excel(writer, index=False)
# Save the changes to the Excel file
writer.save()
import pandas as pd
from openpyxl import load_workbook
file="C:/Users/OneDrive/Bureau/draftExcel.xlsx"
df = pd.read_excel(file,sheet_name='sheet1')
df.iat[5,0]='cell is updated'
print(df) # to check first in the terminal if the content of the cell is updated
book=load_workbook(file)
writer=pd.ExcelWriter(file, engine='openpyxl')
df.to_excel(writer,sheet_name='sheet1',index=False)
writer.close()
I built an example from your description because your code was incomplete, so I hope it is helpful.
Instead of .iloc I used .iat, which reads or writes a single cell in your DataFrame by integer row and column position rather than by label.
Remember that the Excel file you are working on must be closed while you edit it with Python; if it is open in Excel you will get an error.
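As an alternative to the DataFrame round trip above, the cell can be written directly with openpyxl, which leaves the rest of the sheet untouched. This is only a sketch under assumptions: the merged range `C10:E10` and the temporary file are made up for illustration. With a header row in row 1, `df.iloc[8, 2]` corresponds to Excel cell C10. In openpyxl, the value of a merged range lives in its top-left cell, so that is the cell to assign to.

```python
import os
import tempfile

from openpyxl import Workbook, load_workbook

# Create a small workbook with a merged cell (hypothetical layout).
path = os.path.join(tempfile.mkdtemp(), "merged.xlsx")
wb = Workbook()
ws = wb.active
ws.title = "Sheet1"
ws.merge_cells("C10:E10")
ws["C10"] = "Old Name"  # the top-left cell holds the merged range's value
wb.save(path)

# Re-open the file and overwrite the merged cell through its top-left cell.
book = load_workbook(path)
sheet = book["Sheet1"]
sheet["C10"] = "Bob Smith"
book.save(path)

# Confirm the new value and that the merge itself is still in place.
check = load_workbook(path)["Sheet1"]
new_value = check["C10"].value
merged = [r.coord for r in check.merged_cells.ranges]
print(new_value, merged)
```

Writing a whole DataFrame back with `to_excel` rewrites the sheet as a plain table, which is why merges and formatting tend to disappear with that approach.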
I have an Excel workbook that contains 8 sheets, each with a different alphabetical name. I want to create a CSV file for each of these sheets and store them all in one folder using Python. Currently I can do this for a single sheet from the workbook, but I am struggling to build a workflow that converts multiple sheets and stores them as CSV files in a single folder. Here is my code for now:
import pandas as pd
my_csv=r'C:\Users\C\arcgis\New\NO.csv'
data_xls = pd.read_excel(r"C:\Users\C\Desktop\plots_data1.xlsx", "NO", index_col=0)
p=data_xls.to_csv(my_csv, encoding='utf-8')
If you want to get all of the sheets, you can pass sheet_name=None to the read_excel() call. This will then return a dictionary containing each sheet name as a key, with the value being the dataframe. With this you can iterate over each and create separate CSV files.
The following example uses a base filename with the sheet name appended, e.g. output_sheet1.csv and output_sheet2.csv:
import pandas as pd
for sheet_name, df in pd.read_excel(r"input.xlsx", index_col=0, sheet_name=None).items():
    df.to_csv(f'output_{sheet_name}.csv', index=False, encoding='utf-8')
It assumes that all of your sheetnames are suitable for being used as filenames.
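If the sheet names might contain characters that are awkward in filenames, or the CSV files should land in a dedicated folder, the same idea can be extended roughly as follows. This is a sketch with assumptions: the workbook is a two-sheet stand-in created in a temporary directory, the output folder name `csv` is arbitrary, and `safe_name` is a hypothetical helper, not part of pandas.

```python
import re
import tempfile
from pathlib import Path

import pandas as pd

workdir = Path(tempfile.mkdtemp())
xlsx_path = workdir / "plots_data1.xlsx"

# Build a small stand-in workbook with two sheets (hypothetical data).
with pd.ExcelWriter(xlsx_path, engine="openpyxl") as writer:
    pd.DataFrame({"plot": [1, 2], "NO": [0.1, 0.2]}).to_excel(
        writer, sheet_name="NO", index=False)
    pd.DataFrame({"plot": [1, 2], "SO2": [0.3, 0.4]}).to_excel(
        writer, sheet_name="SO2", index=False)

# Folder that will hold one CSV per sheet.
out_dir = workdir / "csv"
out_dir.mkdir(exist_ok=True)


def safe_name(sheet_name):
    """Hypothetical helper: replace characters that are risky in filenames."""
    return re.sub(r"[^A-Za-z0-9_-]+", "_", sheet_name)


# sheet_name=None returns a dict of {sheet name: DataFrame}.
for sheet_name, df in pd.read_excel(xlsx_path, sheet_name=None).items():
    df.to_csv(out_dir / f"{safe_name(sheet_name)}.csv",
              index=False, encoding="utf-8")

written = sorted(p.name for p in out_dir.glob("*.csv"))
print(written)
```

Here the sheets are read without `index_col`, so `index=False` only drops the automatic 0..n row labels and every data column ends up in the CSV.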
I'm trying to use python to replace the contents of a sheet in an existing Excel workbook by importing data from a CSV. Ideally refreshing any pivot tables with the new data too.
How could I go about doing this?
excel_file = r'S:\Andy\Python\Monthly Report.xlsx'
sheet_name = r'Raw Data'
csv_path = r'S:\Andy\Python\Data Export.csv'
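One possible starting point for the sheet replacement is pandas' append mode with `if_sheet_exists='replace'` (available from pandas 1.3). The sketch below is only an illustration under assumptions: it builds stand-in files in a temporary folder instead of using the S: drive paths, and the column names and the `Summary` sheet are made up. It does not refresh pivot tables; that step is outside what this sketch covers.

```python
import os
import tempfile

import pandas as pd
from openpyxl import load_workbook

# Stand-in files in a temp folder (the real paths from the question are not used).
workdir = tempfile.mkdtemp()
excel_file = os.path.join(workdir, "Monthly Report.xlsx")
csv_path = os.path.join(workdir, "Data Export.csv")
sheet_name = "Raw Data"

# Create a workbook with an old "Raw Data" sheet plus one other sheet.
with pd.ExcelWriter(excel_file, engine="openpyxl") as writer:
    pd.DataFrame({"month": ["Jan"], "sales": [1]}).to_excel(
        writer, sheet_name=sheet_name, index=False)
    pd.DataFrame({"note": ["keep me"]}).to_excel(
        writer, sheet_name="Summary", index=False)

# Create the CSV export that should replace the sheet contents.
pd.DataFrame({"month": ["Jan", "Feb"], "sales": [10, 20]}).to_csv(
    csv_path, index=False)

# Replace only the target sheet; other sheets in the workbook are kept.
new_data = pd.read_csv(csv_path)
with pd.ExcelWriter(excel_file, engine="openpyxl", mode="a",
                    if_sheet_exists="replace") as writer:
    new_data.to_excel(writer, sheet_name=sheet_name, index=False)

book = load_workbook(excel_file)
rows = [list(r) for r in book[sheet_name].iter_rows(values_only=True)]
print(book.sheetnames, rows)
```

Because the old sheet is deleted and recreated, any formatting on that particular sheet is lost, so it is safest to try this on a copy of the report first.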
I'm trying to create an Excel file with pandas for a database I have generated.
I have tried both:
import pandas as pd
# write database to excel
df = pd.DataFrame(database)
# Create a Pandas Excel writer using XlsxWriter as the engine.
writer = pd.ExcelWriter('fifa19.xlsx', engine='xlsxwriter')
# Convert the dataframe to an XlsxWriter Excel object.
df.to_excel(writer, sheet_name='Sheet1')
# Close the Pandas Excel writer and output the Excel file.
writer.save()
as well as:
import pandas as pd
df = pd.DataFrame(database).T
df.to_excel('database.xls')
However, neither option generates an Excel file. The database variable is a dictionary.
From the Notes section of the pandas documentation itself:
If passing an existing ExcelWriter object, then the sheet will be added to the existing workbook. This can be used to save different DataFrames to one workbook:
>>> writer = pd.ExcelWriter('output.xlsx')
# writer = pd.ExcelWriter('/path_to_save/output.xlsx')
>>> df1.to_excel(writer,'Sheet1')
>>> df2.to_excel(writer,'Sheet2')
>>> writer.save()
For compatibility with to_csv, to_excel serializes lists and dicts to
strings before writing.
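With current pandas the same pattern is usually written with a `with` block, because `writer.save()` was removed in pandas 2.0 and the file is only written once the writer is closed. Writing `.xls` files is also no longer supported since pandas 2.0, so `.xlsx` is the safer target. The sketch below assumes a small made-up `database` dictionary (player names as keys) and writes to a temporary folder; printing the absolute path helps check where the file actually ends up.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for the "database" dictionary from the question.
database = {
    "Player A": {"overall": 94, "club": "Club X"},
    "Player B": {"overall": 91, "club": "Club Y"},
}

# Transpose so that each player becomes a row.
df = pd.DataFrame(database).T

path = os.path.join(tempfile.mkdtemp(), "fifa19.xlsx")

# The context manager closes the writer, which is what writes the file.
with pd.ExcelWriter(path, engine="openpyxl") as writer:
    df.to_excel(writer, sheet_name="Sheet1")
    df[["overall"]].to_excel(writer, sheet_name="Sheet2")

# Show where the file was written and which sheets it contains.
print(os.path.abspath(path))
sheet_names = list(pd.read_excel(path, sheet_name=None).keys())
print(sheet_names)
```

If no file seems to appear when using a relative name such as 'fifa19.xlsx', it may simply have been written to the script's current working directory rather than the folder you are looking in.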