How to iterate in excel with python

This is probably super simple, but I am new to Python.
I wrote some code that inserts a number into a certain cell in Excel, which gives me a value in another cell. I would like to iterate: insert -1000, then -950, then -900, and so on up to +1000, and print the resulting value for every increment.
How is this possible?
This is my code so far:
import xlwings as xw
import pandas as pd
import matplotlib.pyplot as plt
#load the excel file
wb = xw.Book("Datasets/Sektion_20111.xlsm")
#Sheet
sht = wb.sheets["Beregning"]
#dataframe
# Cell with normal force
sht.range("N25").value = 500
# Print the cell with the lower bound of the ultimate moment (nedre grænse, brudmoment)
print(sht["AV24"].value)
This way it works by opening a spreadsheet where cell N25 has the value I set (for example 1000), and I can read the result from that manually. I would like Python to print all values and all results for me.
How can I do this?

As far as I understand, you're trying to re-run an Excel calculation several times from a Python script, inserting values from -1000 to +1000 in steps of 50, and then reading the result of each iteration from a different cell of the sheet.
If that is your case, openpyxl is not able to do it, because it never evaluates formulas, as stated in this post:
openpyxl how to read formula result after editing input data on the sheet? data_only=True gives me a "None" result

To anyone interested, I solved it with this code:
import xlwings as xw
# Load the Excel file
wb = xw.Book("Datasets/Sektion_20111.xlsm")
# Sheet
sht = wb.sheets["Beregning"]
# Loop over the input values
x = range(-100, 100, 50)
for i in x:
    # Cell with normal force
    sht.range("N25").value = i
    print(sht["AV24"].value)
N25 is the cell that receives the input.
AV24 is the cell whose value is printed.
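
For the full sweep described in the question (-1000 to +1000 in steps of 50, with +1000 included), a minimal sketch could look like the following. The helper name `sweep` and its parameters are made up for illustration; it only assumes a sheet object that exposes `range(address).value` the way an xlwings sheet does, and a positive step. Note that Python's `range` excludes its stop value, which is why the stop is extended by one.

```python
def sweep(sheet, input_cell, output_cell, start, stop, step):
    """Write each value to input_cell and read back output_cell.

    `sheet` is any object exposing `sheet.range(address).value`
    (for example an xlwings Sheet). Assumes a positive step; `stop`
    is included when it lies exactly on the step grid.
    """
    results = []
    for value in range(start, stop + 1, step):
        sheet.range(input_cell).value = value
        results.append((value, sheet.range(output_cell).value))
    return results


# Usage with xlwings (needs Excel; file and cell names are the ones from the question):
# import xlwings as xw
# sht = xw.Book("Datasets/Sektion_20111.xlsm").sheets["Beregning"]
# for normal_force, moment in sweep(sht, "N25", "AV24", -1000, 1000, 50):
#     print(normal_force, moment)
```

Returning the pairs instead of printing inside the loop makes it easy to pass the results on to pandas or matplotlib afterwards.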

Related

Copying/pasting a column of formulas using python

I have a very large Excel file that I'm working with in Python. I have a column where every cell contains a different formula. I want to copy the formulas and paste them one column over, from column GD to GE.
I want the formulas to update their references the way they do in Excel; the problem is that Excel itself takes a very long time to copy and paste because the file I'm working with is very large.
Any ideas on how to use openpyxl's Translator, or anything else, to do this?
from openpyxl import load_workbook
import pandas as pd
#loads the excel file and is now saved under workbook#
workbook = load_workbook('file.xlsx')
#uses the individual sheets index(first sheet = 0) to work on one sheet at a time#
sheet= workbook.worksheets[8]
#inserts a column at specified index number#
sheet.insert_cols(187)
#naming the new columns#
sheet['GE2']= '20220531'
Here is my updated code:
from openpyxl import load_workbook
from openpyxl.formula.translate import Translator
# Load the Excel file into `workbook`
workbook = load_workbook('file.xlsx')
# Select one sheet by index (first sheet = 0)
sheet = workbook.worksheets[8]
# Translate a single formula from GD3 to GE3
formula = sheet['GD3'].value
sheet['GE3'] = Translator(formula, origin='GD3').translate_formula('GE3')
# Do the same for every row: GD is column 186, GE is column 187
for row in sheet.iter_rows(min_col=186, max_col=187):
    old, new = row
    if old.data_type != "f":
        continue
    # Write the translated formula into the GE cell
    new.value = Translator(old.value, origin=old.coordinate).translate_formula(new.coordinate)
workbook.save('file.xlsx')

When you add or remove columns and rows, Openpyxl does not manage formulae for you. The reason for this is simple: where should it stop? Managing a "dependency graph" is exactly the kind of functionality that an application like MS Excel provides.
But it is quite easy to do this in your own code using the formula Translator:
from openpyxl.formula.translate import Translator
# insert the column first, then translate the formula
formula = ws['GE1'].value
new_formula = Translator(formula, origin="GD1").translate_formula("GE1")
ws['GE1'] = new_formula
It should be fairly straightforward to create a loop for this (check the data type and use cell.coordinate to avoid potential typos or incorrect adjustments):
ws.insert_cols(187)
for row in ws.iter_rows(min_col=187, max_col=188):
    old, new = row
    if new.data_type != "f":
        continue
    new_formula = Translator(new.value, origin=old.coordinate).translate_formula(new.coordinate)
    new.value = new_formula
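
Since `insert_cols` and `iter_rows` take numeric column indexes while the question talks about columns GD and GE, it may help to check which letters the numbers refer to. Here is a small pure-Python sketch (the helper names `column_index` and `column_letters` are made up for illustration; openpyxl ships equivalents as `openpyxl.utils.column_index_from_string` and `openpyxl.utils.get_column_letter`):

```python
def column_index(letters):
    """Convert an Excel column label such as 'GD' to its 1-based index."""
    index = 0
    for ch in letters.upper():
        if not "A" <= ch <= "Z":
            raise ValueError(f"invalid column label: {letters!r}")
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index


def column_letters(index):
    """Convert a 1-based column index back to its Excel label."""
    if index < 1:
        raise ValueError("column index must be >= 1")
    letters = ""
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


print(column_index("GD"), column_index("GE"))    # 186 187
print(column_letters(187), column_letters(188))  # GE GF
```

So `min_col=187, max_col=188` pairs up GE and GF, while a plain copy from GD to GE would iterate over columns 186 and 187.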

Printing Python Output to Excel Sheet(s)

For my master's thesis I've created a script.
Now I want its output to be written to an Excel sheet. I read that xlwt can do that, but the examples I've found only show how to manually write one string to the file. I started by adding this code:
import xlwt
new_workbook = xlwt.Workbook(encoding='utf-8')
new_sheet=new_workbook.add_sheet("1")
Now I have no clue where to go from there, can you please give me a hint? I'm guessing I need to somehow start a loop where each time it writes to a new line for each iteration it takes, but am not sure where to start. I'd really appreciate a hint, thank you!

Since you are using pandas, you can use to_excel to do that.
The usage is quite simple: create a DataFrame with the values you need in your Excel sheet and save it as an Excel file:
import pandas as pd
df = pd.DataFrame(data={
    'col1': ["output1", "output2", "output3"],
    'col2': ["output1.1", "output2.2", "output3.3"]
})
df.to_excel("excel_name.xlsx", sheet_name="sheet_name", index=False)

What you need is openpyxl: https://openpyxl.readthedocs.io/en/stable/
from openpyxl import load_workbook
wb = load_workbook('your_template.xlsx')
sheet = wb.active
sheet.cell(row=4, column=2).value = 'what you wish to write'
wb.save('save_file_name.xlsx')
wb.close()

Let's say you save every result to a list total_distances, like this:
total_distances = []
for c1, c2 in coords:
    # here your code
    total_distances.append(total_distance)
and then save it to a worksheet (Workbook here is assumed to be xlsxwriter's, since add_worksheet and write_row are xlsxwriter methods):
from xlsxwriter import Workbook
with Workbook('total_distances.xlsx') as workbook:
    worksheet = workbook.add_worksheet()
    data = ["Total_distance"]
    row = 0
    worksheet.write_row(row, 0, data)
    for i in total_distances:
        row += 1
        data = [round(i, 2)]
        worksheet.write_row(row, 0, data)

How can I calculate Excel data using openpyxl

I have an assignment for my boring online class and I couldn't come up with a way to do it. I'm told to calculate a ratio from four columns with the formula ratio = weight/height*length*width. But I'm bad at using Microsoft Excel and, ironically, we haven't learnt anything related to it. Then I remembered that there is a Python library that works with Excel sheets. So how could I easily calculate ratio = Weight/Height*Width*Length for every single row in this Excel sheet using openpyxl?

Though I've never used openpyxl library I tried to find a solution to your problem. If the spreadsheet you're working on looks like the one below then you should be able to work with this script.
Sample spreadsheet image
from openpyxl import load_workbook
# Modify filename and sheet name where the data is
workbook_filename = 'workbook.xlsx'
sheet_name = 'Sheet1'
wb = load_workbook(workbook_filename)
ws = wb[sheet_name]
# If the data is stored differently in your file, you have to modify
# this loop to suit your needs
for row in ws.iter_rows(min_row=2, max_row=3, max_col=5):
    row[4].value = row[0].value / (row[1].value * row[2].value * row[3].value)
wb.save('result.xlsx')
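
The loop above divides the weight by the product of the three dimensions and will raise an error on an empty cell, because openpyxl returns None for empty cells. A hedged sketch of a small helper that guards against that (the name `ratio` and the column order weight, height, width, length are assumptions taken from the script above, not from the original sheet):

```python
def ratio(weight, height, width, length):
    """Return weight / (height * width * length), or None if it cannot be computed."""
    values = (weight, height, width, length)
    # Empty cells come back as None and text cells as str; skip those rows.
    if any(isinstance(v, bool) or not isinstance(v, (int, float)) for v in values):
        return None
    volume = height * width * length
    if volume == 0:
        return None
    return weight / volume


# Possible use inside the loop from the answer (ws.max_row covers every used row):
# for row in ws.iter_rows(min_row=2, max_row=ws.max_row, max_col=5):
#     row[4].value = ratio(*(cell.value for cell in row[:4]))
print(ratio(10, 1, 2, 5))  # 1.0
```

If the assignment really means (weight / height) * length * width instead, only the return expression needs to change.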

xlrd & openpyxl fetch wrong cell values (Excel)

Need help, please! It seems like a simple task: I need to fetch values from certain spreadsheet cells and sum them up. But I failed even at the first step, fetching them. At first, I thought something was wrong with the module (openpyxl is upgraded regularly and I might have missed something), but the xlrd module produced the same wrong results! Here's the code:
import xlrd, xlwt
wb = xlrd.open_workbook(r"E:\Projects_working (11).xlsx")
sheet = wb.sheet_by_name('Language Process')
for i in range(1, 100):
    cellVal = sheet.cell(i, 14).value  # need to find "5" in column 14
    if type(cellVal) == float and cellVal == 5.0:
        # need to read the corresponding values in column 11
        print(sheet.cell(i, 11).value)
As a result, instead of an integer (say, 22), the code ends up with a float such as 42782.61458. (The other values are similar and wrong: 42782.66146, 42781.38542, 42781.42708, etc.)
Originally I used the openpyxl module and added the flag data_only=True when loading the workbook: wb = load_workbook("file.xlsx", data_only=True). That code produces the same results. Without this flag, all I get is strange formulas: =B32+((M32-B32)/2), =B41+((M41-B41)/2) etc. Here's the code that returns these formulas (with no flag):
import openpyxl
wb = openpyxl.load_workbook(r"E:\Projects_working (11).xlsx")
sheet = wb['Language Process']
for i in range(1, 100):
    cellVal = sheet.cell(row=i, column=14).value
    if type(cellVal) == float and cellVal == 5.0:
        print(sheet.cell(row=i, column=11).value)
And here's a link to the file, just in case: https://docs.google.com/spreadsheets/d/1bFhkEs8JTVWCgZoW5_9lQ1q_T0gtijBhuywr6OVpfGc/edit?usp=sharing

The data you are reading look like Excel's version of raw datetime values. You probably have miscounted the columns (that is, given the wrong column index).
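
To check whether a value like 42782.61458 really is a date, it can be converted by hand. The sketch below assumes the workbook uses Excel's default 1900 date system (the function name is made up for illustration); xlrd also offers `xlrd.xldate_as_datetime(value, wb.datemode)` for the same purpose.

```python
from datetime import datetime, timedelta

# Assumption: default 1900 date system. Using 1899-12-30 as day zero accounts
# for Excel treating 1900 as a leap year, so it is valid for serials >= 61.
EXCEL_EPOCH = datetime(1899, 12, 30)


def excel_serial_to_datetime(serial):
    """Convert an Excel date serial (days since the epoch) to a datetime."""
    return EXCEL_EPOCH + timedelta(days=serial)


# One of the "wrong" values from the question: lands on 2017-02-16, around 14:45.
print(excel_serial_to_datetime(42782.61458))
```

That the values decode to plausible timestamps supports the idea that the wrong column is being read. Also keep in mind that xlrd's `sheet.cell(row, col)` is zero-based while openpyxl's `sheet.cell(row=..., column=...)` is one-based, so index 11 points at different columns in the two snippets.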

xlwings function to find the last row with data

I am trying to find the last row in a column with data, to replace the VBA function: LastRow = sht.Cells(sht.Rows.Count, "A").End(xlUp).Row
I am trying this, but it pulls in all rows in Excel. How can I get just the last row?
from xlwings import Workbook, Range
wb = Workbook()
print(len(Range('A:A')))

Consolidating the answers above, you can do it in one line: take the very last cell of the column, jump up to the first non-blank cell (assuming the last cell itself is blank), and read its row.
Example code:
import xlwings as xw
wb = xw.Book(r'path.xlsx')
last_row = wb.sheets[0].range('A' + str(wb.sheets[0].cells.last_cell.row)).end('up').row
print(last_row)

We can use Range object to find the last row and/or the last column:
import xlwings as xw
# open raw data file
filename_read = 'data_raw.csv'
wb = xw.Book(filename_read)
sht = wb.sheets[0]
# find the numbers of columns and rows in the sheet
num_col = sht.range('A1').end('right').column
num_row = sht.range('A1').end('down').row
# collect data
content_list = sht.range((1,1),(num_row,num_col)).value
print(content_list)
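
One caveat with going down from A1: it stops at the first blank cell, so data below a gap is missed, whereas searching up from the bottom finds the true last row. The following is only a pure-Python model of that difference, not xlwings itself. It assumes a column is given as a list in which empty cells are None (which is how xlwings reports them), and the function names are made up for illustration.

```python
def end_down_from_top(column):
    """1-based row of the last filled cell before the first blank.

    Simplified model of range('A1').end('down') for a column whose
    first cells are filled.
    """
    row = 0
    for value in column:
        if value is None:
            break
        row += 1
    return row


def end_up_from_bottom(column):
    """1-based row of the last non-empty cell, scanning up from the bottom."""
    for row in range(len(column), 0, -1):
        if column[row - 1] is not None:
            return row
    return 0


column = ["id", 1, 2, None, 4, 5, None]
print(end_down_from_top(column))   # 3
print(end_up_from_bottom(column))  # 6
```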

This is very much the same as crazymachu's answer, just wrapped up in a function. Since version 0.9.0 of xlwings you can do this:
import xlwings as xw
def lastRow(idx, workbook, col=1):
    """ Find the last row in the worksheet that contains data.
    idx: Specifies the worksheet to select. Starts counting from zero.
    workbook: Specifies the workbook
    col: The column in which to look for the last cell containing data.
    """
    ws = workbook.sheets[idx]
    lwr_r_cell = ws.cells.last_cell      # lower right cell
    lwr_row = lwr_r_cell.row             # row of the lower right cell
    lwr_cell = ws.range((lwr_row, col))  # change to your specified column
    if lwr_cell.value is None:
        lwr_cell = lwr_cell.end('up')    # go up until you hit a non-empty cell
    return lwr_cell.row
Intuitively, the function starts off by finding the most extreme lower-right cell in the worksheet. It then moves across to your selected column and then up until it hits the first non-empty cell.

You could try using Direction by starting at the very bottom and then moving up:
import xlwings
from xlwings.constants import Direction
wb = xlwings.Workbook(r'data.xlsx')
print(wb.active_sheet.xl_sheet.Cells(65536, 1).End(Direction.xlUp).Row)

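Try this:
import xlwings as xw
# .vertical and .horizontal come from older xlwings releases;
# newer versions use .expand('down') and .expand('right') instead
cellsDown = xw.Range('A1').vertical.value
cellsRight = xw.Range('A1').horizontal.value
print(len(cellsDown))
print(len(cellsRight))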

One could use the VBA Find function that is exposed through the api property (use it to find anything with a star, and begin your search from the first cell).
Example (s is an xlwings sheet):
import xlwings
row_cell = s.api.Cells.Find(What="*",
                            After=s.api.Cells(1, 1),
                            LookAt=xlwings.constants.LookAt.xlPart,
                            LookIn=xlwings.constants.FindLookIn.xlFormulas,
                            SearchOrder=xlwings.constants.SearchOrder.xlByRows,
                            SearchDirection=xlwings.constants.SearchDirection.xlPrevious,
                            MatchCase=False)
column_cell = s.api.Cells.Find(What="*",
                               After=s.api.Cells(1, 1),
                               LookAt=xlwings.constants.LookAt.xlPart,
                               LookIn=xlwings.constants.FindLookIn.xlFormulas,
                               SearchOrder=xlwings.constants.SearchOrder.xlByColumns,
                               SearchDirection=xlwings.constants.SearchDirection.xlPrevious,
                               MatchCase=False)
print((row_cell.Row, column_cell.Column))
Other methods outlined here seem to require that there are no empty rows/columns between the data.
source: https://gist.github.com/Elijas/2430813d3ad71aebcc0c83dd1f130e33

python 3.6, xlwings 0.11
Solution 1
To find the last row with data, you have to work both horizontally and vertically: go through every column to determine which row is the last one.
import xlwings
workbook_all = xlwings.Book(r'path.xlsx')
objectiveSheet = workbook_all.sheets['some_sheet']
# lastCellContainData(), inspired by Stefan's answer.
def lastCellContainData(objectiveSheet, lastRow=None, lastColumn=None):
    lastRow = objectiveSheet.cells.last_cell.row if lastRow is None else lastRow
    lastColumn = objectiveSheet.cells.last_cell.column if lastColumn is None else lastColumn
    lastRows, lastColumns = [], []
    for col in range(1, lastColumn):
        lastRows.append(objectiveSheet.range((lastRow, col)).end('up').row)
    # extract last row of every column, then max(). Or you can compare the next
    # column's last row number to the last column's last row number. Here you get
    # the last row with data, you can also go further get the last column with data:
    for row in range(1, lastRow):
        lastColumns.append(objectiveSheet.range((row, lastColumn)).end('left').column)
    return max(lastRows), max(lastColumns)
lastCellContainData(objectiveSheet, lastRow=5000, lastColumn=300)
I added lastRow and lastColumn. To make the program more efficient, you can set these parameters according to the approximate shape of the data you're dealing with.
Solution 2
xlwings is known as a wrapper around pywin32. I don't know if your situation allows for keyboard or mouse input. If so, first press ctrl+tab to switch to the workbook, then ctrl+a to select the region containing data, and then call workbook_all.selection.rows.count.
Another way:
When you roughly know where the bottom-right cell of your data is, say AAA10000, just call objectiveSheet.range('A1:'+'AAA10000').current_region.rows.count

Update:
After a while none of the solutions were really intuitive to me, so I decided to compile the following:
Code:
import xlwings as Objxlwings
import xlwings.constants
def Return_RangeLastCell(ObjWS):
    return ObjWS.api.Cells.SpecialCells(xlwings.constants.CellType.xlCellTypeLastCell)
I tried to keep consistency with the way to call it from Excel to keep it simple
Then on my main code, I just call it like so:
ObjWS=Objxlwings.Book('Book1.xlsx').sheets["Sheet1"]
print(Return_RangeLastCell(ObjWS).Column)

Interesting solutions. But maybe like this:
print(sheet.used_range.last_cell.row)

@Cody's answer will help under normal circumstances, but if your sheet has hidden rows at the bottom, it will give the wrong row number.
Let's say your data has 10 rows and row[5:11] are hidden, i.e. the actual last_row is 10.
[code a] below will give you 5, [code b] below will give you 10.
code a:
ws = wb.sheets[your_sheet_name]
last_row = ws.range('A' + str(ws.cells.last_cell.row)).end('up').row # return 5
code b:
ws = wb.sheets[your_sheet_name]
last_row_1 = ws.used_range.last_cell.row # return 10
