I have a workbook that receives data from a program, and I am trying to automate some of the tasks. The data for each game occupies 4 rows and spans about 13 columns. I want code that inserts a row before every block of 4 rows, with an AVERAGE formula in each cell that averages the 4 cells below it in that column. I'm not sure whether anything like this is available.
Whichever library you decide to use, openpyxl, xlwings or another, the method is much the same: iterate over the rows that hold your data, and at every 5th row insert a new row and fill each of its cells with the average formula.
The first formula row will presumably be row 2, with row 1 as the header row.
Note that the total number of rows grows as you insert new rows: with 24 rows of data, 6 new rows are ultimately inserted. If the loop only covered a range of 24 rows, it would leave the bottom 4 rows unprocessed. The range must therefore cover the number of data rows plus one extra row for every 4 data rows. Give the range a step of 5 (one formula row plus 4 data rows) so that it jumps to the next insert position.
For each row in the loop, insert a new row. You may also want to clear the new row in case it inherited formatting from the row above. Build the range coordinates from the row variable: it takes the values 2, 7, 12 and so on, so the row number in the coordinates changes on every iteration.
You can insert a new row across all columns in the sheet or just the columns with the data as above.
After the new row is inserted, run a second loop over the columns of that row and write the formula into each cell, using the correct row and column coordinates.
Again, the row and column coordinates for both the target cell and the AVERAGE range must be built from the loop variables.
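The row arithmetic described above can be checked without opening a workbook. This is a minimal sketch, assuming a header in row 1 and data that comes in complete blocks of four rows; insert_positions is a hypothetical helper name, not part of openpyxl or xlwings.

```python
def insert_positions(data_rows, block=4, first_row=2):
    """Return the row numbers at which an average row is inserted.

    Each inserted row pushes everything below it down by one, so
    consecutive insert points end up block + 1 rows apart.
    """
    blocks = data_rows // block  # number of complete data blocks
    return [first_row + i * (block + 1) for i in range(blocks)]


print(insert_positions(24))  # [2, 7, 12, 17, 22, 27]
```

With 24 data rows this gives six insert points, which matches the six extra rows mentioned above.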
--------Additional Information ------------
Examples of how to do this:
Openpyxl
import openpyxl as op
from openpyxl.styles import PatternFill

file = '<filename>.xlsx'
wb = op.load_workbook(file)
ws = wb['Sheet1']

cell_top_left = 'A1'
row_steps = 4  # data rows per block
greenbg = PatternFill(start_color='FF00FF00', end_color='FF00FF00', fill_type='solid')

first_col = list(cell_top_left)[:1][0]
last_col = op.utils.cell.get_column_letter(ws.max_column)
num_col = ws.max_column
num_row = ws.max_row
# Extend the loop range to allow for the rows that will be inserted
num_row += int((num_row-1)/4)-3

for row in range(2, num_row, row_steps + 1):
    ws.insert_rows(row)
    # Fill every cell of the new row with the average of the 4 cells below it
    for col in range(1, num_col + 1):
        col_letter = ws.cell(row=row, column=col).column_letter
        start_cell = col_letter + str(row+1)
        end_cell = col_letter + str(row+4)
        ws.cell(row, col).value = '=AVERAGE({0}:{1})'.format(start_cell, end_cell)
        ws.cell(row, col).fill = greenbg

wb.save("modded_" + file)
xlwings
import xlwings as xw

file = '<filename>.xlsx'
wb = xw.Book(file)
ws = wb.sheets('Sheet1')

cell_top_left = 'A1'
row_steps = 4  # data rows per block

first_col = list(cell_top_left)[:1][0]
last_col = list(ws.range(cell_top_left).end('right').address.replace('$', ''))[:1][0]
num_col = ws.range(cell_top_left).end('right').column
num_row = ws.range(cell_top_left).end('down').row
# Extend the loop range to allow for the rows that will be inserted
num_row += int((num_row - 1) / 4) - 3

for row in range(2, num_row, row_steps + 1):
    # Insert a row across the data columns only, then clear inherited formatting
    str_coord = first_col + str(row) + ':' + last_col + str(row)
    ws.range(str_coord).insert('down')
    ws.range(str_coord).clear()
    for col in range(1, num_col + 1):
        start_cell = ws.range((row + 1, col)).address.replace('$', '')
        end_cell = ws.range((row + 4, col)).address.replace('$', '')
        ws.range((row, col)).value = '=AVERAGE({0}:{1})'.format(start_cell, end_cell)
        ws.range((row, col)).color = (0, 255, 0)

wb.save("modded_" + file)
Related
I have a QTableWidget populated with data from an xlsx file. To insert rows at a position of my choosing, I must first supply a "Kod_Towaru" value to look up the row index, and then the number of rows to insert below it.
The code is:
import openpyxl
import pandas as pd

# Copy the table contents into a DataFrame
columnHeaders = []
for j in range(self.ui.zestawienie_analiza_tab_2.model().columnCount()):
    columnHeaders.append(self.ui.zestawienie_analiza_tab_2.horizontalHeaderItem(j).text())
df = pd.DataFrame(columns=columnHeaders)
for row in range(self.ui.zestawienie_analiza_tab_2.rowCount()):
    for col in range(self.ui.zestawienie_analiza_tab_2.columnCount()):
        df.at[row, columnHeaders[col]] = self.ui.zestawienie_analiza_tab_2.item(row, col).text()

flag=False
wb = openpyxl.load_workbook('Zestawienie - NOWA WERSJA.xlsx')
sheet = wb['Zestawienie']
index = df.index
number_of_rows = len(index)
# find length of index
print(number_of_rows)
kod_towaru = self.ui.dodaj_normalia.text()
if index.size != 0:
    # Index of the first row whose Kod_Towaru matches the entered value
    result = df.loc[df['Kod_Towaru'] == kod_towaru].index[0]
    print(result)
    amount = self.ui.ilosc_normalia.text()
    direct_amount = int(amount)
    sheet.insert_rows(idx=result+3, amount=direct_amount)
    wb.save('Zestawienie - NOWA WERSJA.xlsx')
But as you can see from the code above, this is very cumbersome: I first have to insert blank rows into the xlsx file and then populate the same table again. And, as I said, I must supply two values: Kod_Towaru, which gives the index position of the row, and the amount.
Is there a way to do it as in Excel, with Ctrl +, a right mouse click, or something similar?
Can I add the same value to two cells in a single line of code?
I want to put "Run No." in the cells ["A1"] and ["B1"].
I could do this:
sheet["A1"] = "Run No."
sheet["B1"] = "Run No."
But I need to add this to many cells. I was wondering whether I can create a list like this:
list = ["A1","B1"]
sheet[list] = "Run No."
Just use iter_rows: you can specify the start and end rows and the start and end columns, then set the value of each cell in the range.
If it is just A1 and B1, then set the values as
first_row = 1
last_row = 1
start_col = 1
end_col = 2
from openpyxl import load_workbook
filename = r"./book1.xlsx"
wb = load_workbook(filename)
ws = wb["Sheet1"]
# Example Row 1 to Row 10 and Column A to Column T will be filled
first_row = 1
last_row = 10
start_col = 1
end_col = 20
for row in ws.iter_rows(min_row=first_row, max_row=last_row, min_col=start_col, max_col=end_col):
    for cell in row:
        cell.value = 'Run No.'
wb.save(filename)
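When the target cells do not form a rectangle, a plain loop over a list of coordinates is an alternative to iter_rows. This is a minimal sketch on an in-memory workbook; the coordinates in targets are made up for illustration and nothing is saved to disk.

```python
from openpyxl import Workbook

wb = Workbook()  # in-memory workbook, nothing is read from disk
ws = wb.active

# Hypothetical list of target cells; they do not need to be adjacent
targets = ["A1", "B1", "D3"]
for ref in targets:
    ws[ref] = "Run No."

print([ws[ref].value for ref in targets])
```

Note that the name list used in the question shadows the Python built-in, which is why the sketch uses targets instead.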
First row is the header
I then want the cells below it to be filled in. The code runs successfully and creates the file and the header, but it does not write the values such as Cars, 10, etc. from row 2 onward. What's wrong with the code? Thanks!
from openpyxl import Workbook
wb = Workbook() #object of Workbook type
print(wb.active.title)
print(wb.sheetnames)
wb['Sheet'].title="Report_Amount"
sh1 = wb.active
sh1['A1'].value = "Item" #Writing into the cell to create header
sh1['B1'].value = "Quantity"
sh1['C1'].value = "Price($)"
sh1['D1'].value = "Amount($)"
column1 = sh1.max_column
row1 = sh1.max_row
print(f'no. of columns : {column1}')
print(f'no. of rows : {row1}')
for i in range (2, row1+1): #want to write values from row #2
    for j in range (1, column1+1):
        sh1.cell(row=i,column=j).value = "Cars"
        sh1.cell(row=i, column = j+1).value = 5
        sh1.cell(row=i, column = j+2).value = 10000
        sh1.cell(row=i, column = j+3).value = 50000
print("file saved")
wb.save("C:\\Users\\Ricky\\Desktop\\FirstCreatedPythonExcel1.xlsx")
The rows are not filled in because of the for loop condition. According to the documentation, cells are not created in memory until they are accessed. Only the header row has been written, so sh1.max_row is 1 and range(2, row1+1) becomes range(2, 2), which is empty: the loop body never runs. If you change row1 to a fixed number, you will see the values written into the cells.
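One way to sidestep the loop bounds entirely is to let ws.append write each record to the next free row. This is a minimal sketch, not the original poster's code: the items list is made-up sample data, and the workbook stays in memory (call wb.save with your own path to write it out).

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active
ws.title = "Report_Amount"

# The first append lands in row 1 and serves as the header
ws.append(["Item", "Quantity", "Price($)", "Amount($)"])

# Hypothetical sample data: (item, quantity, unit price)
items = [("Cars", 5, 10000), ("Bikes", 2, 1500)]
for name, qty, price in items:
    # Each append writes one complete row below the last used row
    ws.append([name, qty, price, qty * price])

print(ws.max_row)
```

Because the data drives the loop, there is no dependence on max_row, which only reflects cells that already exist.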
How do I iterate through all the rows in an xls sheet, and get each row data in a tuple. So at the end of the iteration, I should have a list of tuples with each element in the list, being a tuple of row data.
For instance: This is the content of my spreadsheet:
testcase_ID input_request request_change
test_1A test/request_1 YES
test_2A test/request_2 NO
test_3A test/request_3 YES
test_4A test/request_4 YES
my final list should be:
[(test_1A, test/request_1, YES),
(test_2A, test/request_2, NO),
(test_3A, test/request_3, YES),
(test_4A, test/request_4, YES)]
How can I do this in openpyxl?
I think this task would be easier with xlrd. However, if you want to use openpyxl, then assuming that testcase_ID is in column A, input_request in column B, and request_change in column C, something like this might be what you are looking for:
import openpyxl as xl

# Opening xl file
wb = xl.load_workbook('PATH/TO/FILE.xlsx')
# Select your sheet (for this example I chose the active sheet)
ws = wb.active
# Start row, where data begins
row = 2
testcase = ''  # this is just so that you can enter the while loop
# Initializing list
final_list = []
# With each iteration we get the value of testcase; if the cell is empty,
# testcase will be None, and when that happens the while loop will stop
while testcase is not None:
    # Getting cell values from columns A, B and C
    # Iterating through rows 2, 3, 4 ...
    testcase = ws['A' + str(row)].value
    in_request = ws['B' + str(row)].value
    req_change = ws['C' + str(row)].value
    # Making tuple
    row_tuple = (testcase, in_request, req_change)
    # Adding tuple to list
    final_list.append(row_tuple)
    # Going to next row
    row += 1
# This is what you return; you don't want the last element
# because it is a tuple of Nones
print(final_list[:-1])
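A shorter openpyxl variant is possible with iter_rows, which can yield plain value tuples directly. This is a minimal sketch, assuming openpyxl 2.6 or later (where the values_only argument is available); it builds a small in-memory sheet mirroring the example data instead of loading a file.

```python
from openpyxl import Workbook

# Build a small in-memory sheet that mirrors the example data
wb = Workbook()
ws = wb.active
ws.append(["testcase_ID", "input_request", "request_change"])
ws.append(["test_1A", "test/request_1", "YES"])
ws.append(["test_2A", "test/request_2", "NO"])

# min_row=2 skips the header; values_only=True yields tuples of cell values
final_list = list(ws.iter_rows(min_row=2, values_only=True))
print(final_list)
```

For a real file, replace the Workbook() setup with load_workbook and your own path; the iter_rows line stays the same.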
If you want to do it with xlrd, this is how I would do it (note that xlrd 2.0 and later reads only the legacy .xls format, so an .xlsx file needs an older xlrd or openpyxl):
import xlrd

# Opening xl file (xlrd 2.0 and later reads only .xls files)
wb = xlrd.open_workbook('PATH/TO/FILE.xls')
# Select your sheet (for this example I chose the first sheet);
# you can also choose by name or something else
ws = wb.sheet_by_index(0)
# Getting number of rows and columns
num_row = ws.nrows
num_col = ws.ncols
# Initializing list
final_list = []
# Iterating over the rows, skipping the header at index 0
for i in range(1,num_row):
    # list of row values
    row_values = []
    # Iterating over number of cols
    for j in range(num_col):
        row_values.append(ws.cell_value(i,j))
    # Making tuple with row values
    row_tuple = tuple(row_values)
    # Adding tuple to list
    final_list.append(row_tuple)
print(final_list)
Notes on xlrd indexing, collected here for easy reading:
No if statement is needed: when num_row is 1, the for loop simply never runs
xlrd indexes rows beginning at 0
For row 2 we want index 1
Columns are also zero-indexed (A=0, B=1, C=2...)
I have a worksheet that is updated every week with thousands of rows, and I need to transfer rows from it after filtering. I am using the code below to find the cells that hold the value I need and then copy the entire row to another sheet, but after saving the file I get the "IndexError: list index out of range" exception.
The code I use is as follows:
import openpyxl
wb1 = openpyxl.load_workbook('file1.xlsx')
wb2 = openpyxl.load_workbook('file2.xlsx')
ws1 = wb1.active
ws2 = wb2.active
for row in ws1.iter_rows():
    for cell in row:
        if cell.value == 'TrueValue':
            n = 'A' + str(cell.row) + ':' + ('GH' + str(cell.row))
            for row2 in ws1.iter_rows(n):
                ws2.append(row2)
wb2.save("file2.xlsx")
The original code, which used to work, is below. It has to be modified because the resulting files are so large (over 40 MB) that MS Excel cannot open them.
n = 'A3' + ':' + ('GH'+ str(ws1.max_row))
for row in ws1.iter_rows(n):
    ws2.append(row)
Thanks.
I'm not entirely sure what you're trying to do but I suspect the problem is that you have nested your copy loop.
Try the following:
row_nr = 1
for row in ws1:
    for cell in row:
        if cell.value == "TrueValue":
            row_nr = cell.row
            break
    if row_nr > 1:
        break
for row in ws1.iter_rows(min_row=row_nr, max_col=190):
    ws2.append((cell.value for cell in row))
Question: I get the "IndexError: list index out of range" exception.
From ws1.iter_rows(n) I get:
UserWarning: Using a range string is deprecated. Use ws[range_string]
and from ws2.append(row2):
ValueError: Cells cannot be copied from other worksheets
The reason is that row2 holds a list of Cell objects instead of a list of values.
Question: ... need to transfer rows from this worksheet after filtering
The following does what you want, for instance:
# If you want to start appending row data at row 2,
# set the private self._current_row to 1
ws2.cell(row=1, column=1).value = ws2.cell(row=1, column=1).value
# Define min/max column range to copy
from openpyxl.utils import range_boundaries
min_col, min_row, max_col, max_row = range_boundaries('A:GH')
# Define cell index (0-based) used to check the value
check = 0 # == A
for row in ws1.iter_rows():
    if row[check].value == 'TrueValue':
        # Copy row values
        # Tuple indices are 0-based, so min_col must be reduced by 1
        ws2.append((cell.value for cell in row[min_col-1:max_col]))
Tested with Python: 3.4.2 - openpyxl: 2.4.1 - LibreOffice: 4.3.3.2
Use a list to hold the items in each column for the particular row.
Then append the list to your ws2.
...
def iter_rows(ws,n): #produce the list of items in the particular row
    for row in ws[n]: # ws[n] replaces the deprecated ws.iter_rows(n)
        yield [cell.value for cell in row]

for row in ws1.iter_rows():
    for cell in row:
        if cell.value == 'TrueValue':
            n = 'A' + str(cell.row) + ':' + ('GH' + str(cell.row))
            list_to_append = list(iter_rows(ws1,n))
            for items in list_to_append:
                ws2.append(items)
I was able to solve this with lists for my project.
import openpyxl

#load data file
wb1 = openpyxl.load_workbook('original.xlsx')
sheet1 = wb1.active
print("loaded 1st file")
#new template file
wb2 = openpyxl.load_workbook('blank.xlsx')
sheet2 = wb2.active
print("loaded 2nd file")

header = sheet1[1:1] #grab header row
listH =[]
for h in header:
    listH.append(h.value)
sheet2.append(listH)

colOfInterest= 11 # this is my col that contains the value I'm checking against
for rowNum in range(2, sheet1.max_row +1): #iterate over each row, starting at 2 to skip the header of the original file
    if sheet1.cell(row=rowNum, column=colOfInterest).value is not None: #interested in non blank values in column 11
        listA = [] # list which will hold my data
        row = sheet1[rowNum:rowNum] #creates a tuple of row's data
        #print (str(rowNum)) # for debugging to show what rows are copied
        for cell in row: # for each cell in the row
            listA.append(cell.value) # add each cell's data as an element in the list
        if listA[10] == 1: # condition1 I'm checking for by looking up the index in the list
            sheet2.append(listA) # appending the sheet2's next available row
        elif listA[10] > 1: # condition2 I'm checking for by looking up the index in the list
            # do something else and store it in bar (bar is a placeholder, not defined here)
            sheet2.append(bar) # appending the sheet2's next available row

print("saving file...")
wb2.save('result.xlsx') # save file
print("Done!")
Tested with: Python 3.7 openpyxl 2.5.4