How can I calculate Excel data using openpyxl - Python

I have an assignment to do for my boring online class and I couldn't come up with a way to do it. I'm told to calculate a ratio from four columns with the formula ratio = weight / (height * length * width). But I'm bad at using Microsoft Excel and, ironically, we haven't learnt anything related to it. Then I remembered that there is a Python library that works with Excel sheets. So how can I easily calculate ratio = Weight / (Height * Width * Length) for every single row in this Excel sheet using openpyxl?

Although I had never used the openpyxl library, I tried to find a solution to your problem. If the spreadsheet you're working on looks like the one below, you should be able to use this script.
Sample spreadsheet image
from openpyxl import load_workbook
# Modify filename and sheet name where the data is
workbook_filename = 'workbook.xlsx'
sheet_name = 'Sheet1'
wb = load_workbook(workbook_filename)
ws = wb[sheet_name]
# If the data is stored differently in your file, you have to modify
# this loop to suit your needs
for row in ws.iter_rows(min_row=2, max_row=3, max_col=5):
    row[4].value = row[0].value / (row[1].value * row[2].value * row[3].value)
wb.save('result.xlsx')
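The script above stops at row 3. A minimal sketch of the same idea for every data row follows. It assumes the layout implied by the script (weight, height, width and length in columns A to D, ratio written to column E, headers in row 1); the sample numbers are made up, and rows with missing or zero values are skipped rather than raising an error.

```python
from openpyxl import Workbook

# Build a small in-memory sheet so the sketch runs on its own.
# The column layout and the numbers are assumptions for illustration.
wb = Workbook()
ws = wb.active
ws.append(["Weight", "Height", "Width", "Length", "Ratio"])
ws.append([12.0, 2.0, 3.0, 1.0])
ws.append([10.0, 5.0, 2.0, 4.0])
ws.append([7.0, None, 2.0, 1.0])  # incomplete row, should be skipped

# No max_row: iterate over every row below the header.
for row in ws.iter_rows(min_row=2, max_col=5):
    weight, height, width, length = (cell.value for cell in row[:4])
    dims_ok = all(
        isinstance(v, (int, float)) and v != 0
        for v in (height, width, length)
    )
    if isinstance(weight, (int, float)) and dims_ok:
        row[4].value = weight / (height * width * length)
```

With a real file you would replace the in-memory workbook with load_workbook(...) and finish with wb.save(...), as in the script above.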

Related

Copying/pasting a column of formulas using python

I have a very large excel file that I'm dealing with in python. I have a column where every cell is a different formula. I want to copy the formulas and paste them one column over from column GD to GE.
The issue is that I want the formulas to update like they do in Excel; it's just that Excel takes a very long time to copy/paste because the file I'm working with is very large.
Any ideas on possibly how to use openpyxl's translator to do this or anything else?
from openpyxl import load_workbook
import pandas as pd
#loads the excel file and is now saved under workbook#
workbook = load_workbook('file.xlsx')
#uses the individual sheets index(first sheet = 0) to work on one sheet at a time#
sheet= workbook.worksheets[8]
#inserts a column at specified index number#
sheet.insert_cols(187)
#naming the new columns#
sheet['GE2']= '20220531'
Here is my updated code:
from openpyxl import load_workbook
from openpyxl.formula.translate import Translator
#loads the excel file and is now saved under workbook#
workbook = load_workbook('file.xlsx')
#uses the individual sheets index(first sheet = 0) to work on one sheet at a time#
sheet= workbook.worksheets[8]
formula = sheet['GD3'].value
new_formula = Translator(formula, origin= 'GE3').translate_formula("GD3")
sheet['GD2'] = new_formula
for row in sheet.iter_rows(min_col=187, max_col=188):
    old, new = row
    if new.data_type != "f":
        continue
    new_formula = Translator(new.value, origin=old.coordinate).translate_formula(new.coordinate)
workbook.save('file.xlsx')
When you add or remove columns and rows, openpyxl does not manage formulae for you. The reason for this is simple: where should it stop? Managing a "dependency graph" is exactly the kind of functionality that an application like MS Excel provides.
But it is quite easy to do this in your own code using the Formula Translator:
# insert the column
formula = ws['GE1'].value
new_formula = Translator(formula, origin="GD1").translate_formula("GE1")
ws['GE1'] = new_formula
It should be fairly straightforward to create a loop for this (check the data type and use cell.coordinate to avoid potential typos or incorrect adjustments):
ws.insert_cols(187)
# after the insert, each formula that was in column 187 now sits in column 188
for row in ws.iter_rows(min_col=187, max_col=188):
    old, new = row
    if new.data_type != "f":
        continue
    new_formula = Translator(new.value, origin=old.coordinate).translate_formula(new.coordinate)
    new.value = new_formula  # write the translated formula back to the cell
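For the copy-one-column-over case from the question, a small self-contained sketch may help. It uses a tiny in-memory sheet (columns C and D stand in for GD and GE; the data is made up) and copies each formula to the neighbouring column, letting Translator shift the relative references the way an Excel copy/paste would.

```python
from openpyxl import Workbook
from openpyxl.formula.translate import Translator

# Tiny stand-in sheet: column C holds formulas, column D is the copy target.
wb = Workbook()
ws = wb.active
ws.append(["a", "b", "sum"])
ws.append([1, 2, "=A2+B2"])
ws.append([3, 4, "=A3+B3"])

# Each row yields a (source, destination) pair of cells.
for src, dst in ws.iter_rows(min_row=2, min_col=3, max_col=4):
    if src.data_type != "f":  # only formula cells need translating
        continue
    dst.value = Translator(src.value, origin=src.coordinate).translate_formula(dst.coordinate)
```

Here origin is the cell the formula was written for and translate_formula receives the cell it is being moved to, so "=A2+B2" in C2 becomes "=B2+C2" in D2.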

Python and Excel Formula

Complete beginner here but have a specific need to try and make my life easier with automating Excel.
I have a weekly report that contains a lot of useless columns and using Python I can delete these and rename them, with the code below.
from openpyxl import Workbook, load_workbook
wb = load_workbook('TestExcel.xlsx')
ws = wb.active
ws.delete_cols(1,3)
ws.delete_cols(3,8)
ws.delete_cols(4,3)
ws.insert_cols(3,1)
ws['A1'].value = "Full Name"
ws['C1'].value = "Email Address"
ws['C2'].value = '=B2&"#testdomain.com"'
wb.save('TestExcelUpdated.xlsx')
This does the job, but I would like the formula to continue from B2 downwards (since the top row contains headings).
ws['C2'].value = '=B2&"#testdomain.com"'
Obviously, in Excel it is just a case of dragging the formula down to the end of the column but I'm at a loss to get this working in Python. I've seen similar questions asked but the answers are over my head.
Would really appreciate a dummies guide.
Example of Excel report after Python code
One way to do this is by iterating over the rows in your worksheet:
for row in ws.iter_rows(min_row=2):  # min_row ensures you skip your header row
    row[2].value = '=B' + str(row[0].row) + '&"#testdomain.com"'
row[2].value selects the third column due to zero-based indexing. row[0].row gets the number of the current row.
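An equivalent way to "drag the formula down" is to loop over row numbers and build each formula with an f-string. This is only a sketch: the names and the three-column layout are made up to mirror the question, and the formula text is copied from it unchanged.

```python
from openpyxl import Workbook

# In-memory stand-in for the cleaned-up report (sample data is invented).
wb = Workbook()
ws = wb.active
ws.append(["Full Name", "Username", "Email Address"])
ws.append(["Ada Lovelace", "ada"])
ws.append(["Alan Turing", "alan"])

# Row 1 holds the headings, so start at row 2 and stop at the last used row.
for r in range(2, ws.max_row + 1):
    ws.cell(row=r, column=3).value = f'=B{r}&"#testdomain.com"'
```

ws.max_row is the last row that contains data, so the loop grows with the report without hard-coding a row count.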

Printing Python Output to Excel Sheet(s)

For my master thesis I've created a script.
Now I want that output to be printed to an excel sheet - I read that xlwt can do that, but examples I've found only give instructions to manually print one string to the file. Now I started by adding that code:
import xlwt
new_workbook = xlwt.Workbook(encoding='utf-8')
new_sheet=new_workbook.add_sheet("1")
Now I have no clue where to go from there, can you please give me a hint? I'm guessing I need to somehow start a loop where each time it writes to a new line for each iteration it takes, but am not sure where to start. I'd really appreciate a hint, thank you!
Since you are using pandas, you can use to_excel to do that.
The usage is quite simple: just create a DataFrame with the values you need in your Excel sheet and save it as an Excel file:
import pandas as pd
df = pd.DataFrame(data={
    'col1': ["output1", "output2", "output3"],
    'col2': ["output1.1", "output2.2", "output3.3"]
})
df.to_excel("excel_name.xlsx",sheet_name="sheet_name",index=False)
What you need is openpyxl: https://openpyxl.readthedocs.io/en/stable/
import openpyxl
wb = openpyxl.load_workbook('your_template.xlsx')
sheet = wb.active
sheet.cell(row=4, column=2).value = 'what you wish to write'
wb.save('save_file_name.xlsx')
wb.close()
Let's say you save every result to a list total_distances, like this:
total_distances = []
for c1, c2 in coords:
    # here your code
    total_distances.append(total_distance)
and then save it into a worksheet with xlsxwriter:
from xlsxwriter import Workbook
with Workbook('total_distances.xlsx') as workbook:
    worksheet = workbook.add_worksheet()
    data = ["Total_distance"]
    row = 0
    worksheet.write_row(row, 0, data)
    for i in total_distances:
        row += 1
        data = [round(i, 2)]
        worksheet.write_row(row, 0, data)
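The same loop can also be written with openpyxl, where ws.append adds one row per call. This is a sketch rather than a drop-in solution: the values in total_distances are placeholders for whatever your script computes, and the sheet title is arbitrary.

```python
from openpyxl import Workbook

# Placeholder results; in the real script these come from your own loop.
total_distances = [12.3456, 7.891, 101.5]

wb = Workbook()
ws = wb.active
ws.title = "Results"
ws.append(["Total_distance"])  # header row

# Each append() call writes the next free row, so no row counter is needed.
for distance in total_distances:
    ws.append([round(distance, 2)])

# Uncomment to write the file to disk:
# wb.save("total_distances.xlsx")
```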

How to read a particular cell by using "wb = load_workbook('path', True)" in openpyxl

Hi there,
I have written code for reading large Excel files, but my requirement is to read one particular cell, e.g. cell(row, column), when I pass True in wb = load_workbook('Path', True).
Can anybody please help me?
CODE:
from openpyxl import load_workbook
wb = load_workbook('Path', True)
sheet_ranges = wb.get_sheet_by_name(name = 'Global')
for row in sheet_ranges.iter_rows():
    for cell in row:
        print(cell.internal_value)
Since you are using an Optimized Reader, you cannot just access an arbitrary cell using ws.cell(row, column).value:
cell, range, rows, columns methods and properties are disabled
The optimized reader was designed specifically for reading an unlimited amount of data from an Excel file by using iterators.
Basically you should iterate over rows and cells until you get the necessary cell. Here's a simple example:
for r, row in enumerate(sheet_ranges.iter_rows()):
    if r == 10:
        for c, cell in enumerate(row):
            if c == 5:
                print(cell.internal_value)
You can find the answer here.
I recommend you consult the documentation first before asking a question on SO.
In particular, this is pretty much exactly what you want:
d = ws.cell(row = 4, column = 2)
where ws is a worksheet.
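In current openpyxl releases the optimized reader is requested with read_only=True, and iter_rows accepts row and column bounds, so the search can be narrowed to a single cell instead of counting with enumerate. The sketch below is an assumption-laden illustration: read_cell is a hypothetical helper, and the workbook is generated in memory so the example runs without a real file.

```python
from io import BytesIO
from openpyxl import Workbook, load_workbook

# Create a stand-in workbook in memory (sheet name taken from the question).
buffer = BytesIO()
wb = Workbook()
ws = wb.active
ws.title = "Global"
for r in range(1, 21):
    ws.append([r * 10 + c for c in range(1, 9)])
wb.save(buffer)
buffer.seek(0)


def read_cell(sheet, row, column):
    """Hypothetical helper: return one cell's value using 1-based indices."""
    for (cell,) in sheet.iter_rows(min_row=row, max_row=row,
                                   min_col=column, max_col=column):
        return cell.value
    return None


ro_wb = load_workbook(buffer, read_only=True)
# Row 11, column 6 is the cell the enumerate() example reaches with r == 10, c == 5.
value = read_cell(ro_wb["Global"], row=11, column=6)
ro_wb.close()
```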

How to write/update data into cells of existing XLSX workbook using xlsxwriter in python

I am able to write into new xlsx workbook using
import xlsxwriter
def write_column(csvlist):
    workbook = xlsxwriter.Workbook("filename.xlsx",{'strings_to_numbers': True})
    worksheet = workbook.add_worksheet()
    row = 0
    col = 0
    for i in csvlist:
        worksheet.write(col,row, i)
        col += 1
    workbook.close()
but couldn't find the way to write in an existing workbook.
Please help me to write/update cells in an existing workbook using xlsxwriter or any alternative.
Quote from xlsxwriter module documentation:
This module cannot be used to modify or write to an existing Excel XLSX file.
If you want to modify existing xlsx workbook, consider using openpyxl module.
See also:
Modify an existing Excel file using Openpyxl in Python
Use openpyxl to edit a Excel2007 file (.xlsx) without changing its own styles?
You can use this code to open the file test.xlsx, modify cell A1, and then save it under a new name:
import openpyxl
xfile = openpyxl.load_workbook('test.xlsx')
sheet = xfile['Sheet1']  # get_sheet_by_name() is deprecated in current openpyxl
sheet['A1'] = 'hello world'
xfile.save('text2.xlsx')
Note that openpyxl does not have a large toolbox for manipulating and editing images. Xlsxwriter has methods for images, but on the other hand cannot import existing worksheets...
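To see the load, modify, save cycle end to end, here is a runnable sketch. It first creates a stand-in test.xlsx in a temporary directory (the file contents and the name test_updated.xlsx are made up), then edits one cell and saves under a new name so the original file is left untouched.

```python
import os
import tempfile
from openpyxl import Workbook, load_workbook

workdir = tempfile.mkdtemp()
source_path = os.path.join(workdir, "test.xlsx")
target_path = os.path.join(workdir, "test_updated.xlsx")

# Stand-in for the existing workbook you want to update.
wb = Workbook()
wb.active.title = "Sheet1"
wb.active["A1"] = "old value"
wb.active["B1"] = 42
wb.save(source_path)

# Load the existing file, change one cell, and save under a new name.
xfile = load_workbook(source_path)
sheet = xfile["Sheet1"]
sheet["A1"] = "hello world"
xfile.save(target_path)

# Reload the result to confirm that untouched cells survived the round trip.
check = load_workbook(target_path)["Sheet1"]
```

Saving to the same path as the source overwrites it, so writing to a new name first is a cheap safety net while testing.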
I have found that this works for rows...
I'm sure there's a way to do it for columns...
import openpyxl
oxl = openpyxl.load_workbook('File Location Here')
xl = oxl['SheetName']
x = 0
col = "A"
row = 1  # Excel rows start at 1, so "A0" is not a valid cell
while row <= 100:
    cell = col + str(row)  # e.g. "A1", "A2", ...
    xl[cell] = x
    row = row + 1
    x = x + 1
You can do this with xlwings as well:
import xlwings as xw
for book in xw.books:
    print(book)
If you have an issue writing into an existing Excel file because it has already been created, you need to add a check like the one below:
import os
PATH = 'filename.xlsx'
if os.path.isfile(PATH):
    print("File exists and will be overwritten NOW")
else:
    print("The file is missing, a new one is created")
...
and here goes the part with the data you want to add
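Combined with openpyxl, the existence check can decide between loading the workbook and starting a fresh one. The sketch below assumes openpyxl is the library doing the writing; open_or_create is a hypothetical helper name, and a temporary directory is used so the example does not touch your own files.

```python
import os
import tempfile
from openpyxl import Workbook, load_workbook


def open_or_create(path):
    """Hypothetical helper: return (workbook, existed) for the given path."""
    if os.path.isfile(path):
        return load_workbook(path), True
    return Workbook(), False


path = os.path.join(tempfile.mkdtemp(), "filename.xlsx")

# First run: the file is missing, so a new workbook is created.
wb, existed_first = open_or_create(path)
wb.active.append(["first run"])
wb.save(path)

# Second run: the file exists, so it is loaded and extended instead.
wb, existed_second = open_or_create(path)
wb.active.append(["second run"])
wb.save(path)

rows = [row[0] for row in load_workbook(path).active.iter_rows(values_only=True)]
```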
