I have a QTableWidget populated with data from an xlsx file. To insert rows at a position of my choosing, I first have to supply a "Kod_Towaru" value to look up the row index, and then the number of rows to insert below it.
The code is:
import openpyxl
import pandas as pd

# Collect the column headers from the table widget
columnHeaders = []
for j in range(self.ui.zestawienie_analiza_tab_2.model().columnCount()):
    columnHeaders.append(self.ui.zestawienie_analiza_tab_2.horizontalHeaderItem(j).text())
# Copy the table contents into a DataFrame
df = pd.DataFrame(columns=columnHeaders)
for row in range(self.ui.zestawienie_analiza_tab_2.rowCount()):
    for col in range(self.ui.zestawienie_analiza_tab_2.columnCount()):
        df.at[row, columnHeaders[col]] = self.ui.zestawienie_analiza_tab_2.item(row, col).text()
flag = False
wb = openpyxl.load_workbook('Zestawienie - NOWA WERSJA.xlsx')
sheet = wb['Zestawienie']
index = df.index
number_of_rows = len(index)
# find length of index
print(number_of_rows)
kod_towaru = self.ui.dodaj_normalia.text()
if index.size != 0:
    # Index of the first row whose Kod_Towaru matches the entered value
    result = df.loc[df['Kod_Towaru'] == kod_towaru].index[0]
    print(result)
    amount = self.ui.ilosc_normalia.text()
    direct_amount = int(amount)
    sheet.insert_rows(idx=result+3, amount=direct_amount)
    wb.save('Zestawienie - NOWA WERSJA.xlsx')
But as you can see from the code above, this is very cumbersome: I first have to use the xlsx file to insert blank rows and then populate the same table again. And as I said before, I must supply two values: Kod_Towaru, to find the index of the row, and the amount.
Is there a way to do it like in Excel, just with Ctrl +, with a right mouse click, or something similar?
Related
I have a workbook that receives data from a program, and I am trying to automate some of the tasks. Basically, there is data for a game every 4 rows, and it spans about 13 columns. I want to write code that inserts a row every 4th row with an "AVG" function in each cell that takes the average of the 4 cells below it for each column. I'm not sure if anything like this is available.
Whichever library you decide to use, openpyxl, xlwings or another, the method is much the same: iterate through a range covering the rows with your data, and at every 5th row insert a new row and set each cell to the average formula.
The first row for formulas will presumably be row 2, with row 1 as the header row.
Note that the total number of rows increases as you insert new rows, i.e. if there are 24 rows of data then ultimately 6 new rows are inserted. If the loop only covered a range of 24, it would leave the bottom 4 rows unprocessed. Calculate the total number of rows as the number of data rows plus a quarter of that number (24 + 24/4 = 30). The range should have a step of 5 [1 formula row + 4 data rows] to jump to the next row for insertion.
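The row arithmetic above can be checked on its own before touching a workbook. This is a minimal sketch, assuming a header in row 1 and data in complete groups of four; the helper name `formula_rows` is hypothetical and not part of openpyxl or xlwings.

```python
def formula_rows(n_data_rows, group=4, first_row=2):
    """Return the sheet rows where an average row is inserted.

    Each inserted row pushes everything below it down by one, so
    consecutive insert positions are group + 1 rows apart.
    """
    n_groups = n_data_rows // group          # one formula row per complete group
    return [first_row + k * (group + 1) for k in range(n_groups)]


rows = formula_rows(24)
print(rows)                 # [2, 7, 12, 17, 22, 27]
print(24 + len(rows))       # 30 rows below the header once all inserts are done
```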
For each 'row' in the loop, insert a new row. You may also want to 'clear' the new row in case it inherited any formatting from the previous row. You'll need to build the range coordinates from the row variable, which takes the values 2, 7, 12, etc., so the row number in the coordinates must be updated on each pass of the loop.
You can insert the new row across all columns in the sheet or just across the columns that hold data, as above.
After the new row is inserted, create a second loop that cycles through the columns of that new row and inserts the formula into each cell with the correct row and column coordinates.
Again, the row and column coordinates for both the cell.value and the AVERAGE formula need to be built from the loop variables.
--------Additional Information ------------
Examples on how to do this;
Openpyxl
import openpyxl as op
from openpyxl.styles import PatternFill

file = '<filename>.xlsx'
wb = op.load_workbook(file)
ws = wb['Sheet1']
cell_top_left = 'A1'
row_steps = 4
greenbg = PatternFill(start_color='FF00FF00', end_color='FF00FF00', fill_type='solid')
first_col = list(cell_top_left)[:1][0]
last_col = op.utils.cell.get_column_letter(ws.max_column)
num_col = ws.max_column
num_row = ws.max_row
# Extend the loop bound to account for the rows that will be inserted
num_row += int((num_row-1)/4)-3
for row in range(2, num_row, row_steps + 1):
    ws.insert_rows(row)
    for col in range(1, num_col + 1):
        col_letter = ws.cell(row=row, column=col).column_letter
        start_cell = col_letter + str(row+1)
        end_cell = col_letter + str(row+4)
        ws.cell(row, col).value = '=AVERAGE({0}:{1})'.format(start_cell, end_cell)
        ws.cell(row, col).fill = greenbg
wb.save("modded_" + file)
xlwings
import xlwings as xw

file = '<filename>.xlsx'
wb = xw.Book(file)
ws = wb.sheets('Sheet1')
cell_top_left = 'A1'
row_steps = 4
first_col = list(cell_top_left)[:1][0]
last_col = list(ws.range(cell_top_left).end('right').address.replace('$', ''))[:1][0]
num_col = ws.range(cell_top_left).end('right').column
num_row = ws.range(cell_top_left).end('down').row
# Extend the loop bound to account for the rows that will be inserted
num_row += int((num_row - 1) / 4) - 3
for row in range(2, num_row, row_steps + 1):
    str_coord = first_col + str(row) + ':' + last_col + str(row)
    ws.range(str_coord).insert('down')
    ws.range(str_coord).clear()
    for col in range(1, num_col + 1):
        start_cell = ws.range(row + 1, col).address.replace('$', '')
        end_cell = ws.range(row + 4, col).address.replace('$', '')
        ws.range((row, col)).value = '=AVERAGE({0}:{1})'.format(start_cell, end_cell)
        ws.range((row, col)).color = (0, 255, 0)
wb.save("modded_" + file)
First row is the header
I then want the cells below it to be filled in. The code runs successfully and creates the file and the header, but it does not write values such as Cars, 10, etc. starting from row 2. What's wrong with the code? Thanks!
from openpyxl import Workbook

wb = Workbook() #object of Workbook type
print(wb.active.title)
print(wb.sheetnames)
wb['Sheet'].title="Report_Amount"
sh1 = wb.active
sh1['A1'].value = "Item" #Writing into the cell to create header
sh1['B1'].value = "Quantity"
sh1['C1'].value = "Price($)"
sh1['D1'].value = "Amount($)"
column1 = sh1.max_column
row1 = sh1.max_row
print(f'no. of columns : {column1}')
print(f'no. of rows : {row1}')
for i in range (2, row1+1): #want to write values from row #2
    for j in range (1, column1+1):
        sh1.cell(row=i,column=j).value = "Cars"
        sh1.cell(row=i, column = j+1).value = 5
        sh1.cell(row=i, column = j+2).value = 10000
        sh1.cell(row=i, column = j+3).value = 50000
print("file saved")
wb.save("C:\\Users\\Ricky\\Desktop\\FirstCreatedPythonExcel1.xlsx")
The rows are not being filled in because of the for loop condition. According to the documentation, cells are not created in memory until they are accessed. Only the header row has been written at that point, so max_row is 1 and range(2, row1+1) is range(2, 2), which is empty: the loop body never runs. If you set row1 to a fixed number larger than 1, you will see the values being written to the cells.
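A minimal sketch of one way around this, assuming the data lives in a hypothetical list named `items` (the sample values are made up): the loop is driven by the data instead of by `max_row`, and `ws.append` writes each row to the next free row.

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active
ws.title = "Report_Amount"
ws.append(["Item", "Quantity", "Price($)", "Amount($)"])   # header in row 1

# Only the header exists so far, so range(2, ws.max_row + 1) would be empty.
rows_before = ws.max_row

# Hypothetical sample data: (item, quantity, unit price)
items = [("Cars", 5, 10000), ("Bikes", 2, 1500)]
for name, qty, price in items:
    ws.append([name, qty, price, qty * price])   # goes to the next free row

print(rows_before, ws.max_row)   # 1 3
```

Calling `wb.save(...)` afterwards writes the result to disk as in the question.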
I'm working with an xlsx file that is divided into sections by empty rows, and each section displays its information in a different layout, i.e. with different columns.
So I'm basically trying to find the section I'm looking for ('Ação') and create a range from the line after it, where the headers are, down to the next empty row, so I can build a DataFrame from this range.
When I try to print the index, it returns a tuple containing the values of the row, but I couldn't find a way to get its index (an integer).
from openpyxl import load_workbook

data = '2019/02/07'
symbol = 'EQTL3'
ano = data[0:4]
mes = data[5:7]
dia = data[8:10]
file = "Fundo_{}{}{}.xlsx".format(ano, mes, dia)
wb = load_workbook(filename=file, read_only=False)
ws = wb["Fundo_{}{}{}".format(ano, mes, dia)]
for cell in ws['A']:
    if (cell.value == 'Ação'):
        x = int(cell.coordinate[1:]) + 1
        for index in ws.iter_rows(min_row=x, max_col=ws.max_column, max_row=ws.max_row, values_only=True):
            if (index[0] == None):
                y = ws._current_row
                break
I expect to receive an integer with the index of the last non-empty row.
You can use enumerate for that, something like this:
for row_idx, row_of_cells in enumerate(ws.iter_rows(min_row=x, values_only=True), start=1):
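A sketch of how that could look in full, assuming the goal is the sheet row number of the last non-empty row in the section. Passing `start=x` rather than 1 makes the counter match the worksheet row numbers. The function name `last_filled_row` is hypothetical, and the list of tuples (made-up sample data) stands in for `ws.iter_rows(min_row=x, values_only=True)`.

```python
def last_filled_row(rows, first_row):
    """Return the sheet row of the last row before the first empty one.

    rows: iterable of value tuples, as produced by iter_rows(values_only=True)
    first_row: sheet row number of the first tuple in rows
    """
    last = None
    for row_idx, values in enumerate(rows, start=first_row):
        if values[0] is None:        # first column empty: the section has ended
            break
        last = row_idx
    return last


# Stand-in for ws.iter_rows(min_row=5, values_only=True)
section = [("Ticker", "Qty"), ("EQTL3", 100), ("ABCD4", 50), (None, None), ("Next", 1)]
print(last_filled_row(section, first_row=5))   # 7
```

The function returns None when the very first row is already empty, so the caller can tell an empty section apart from a real row number.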
I'm relatively new to Python, and I'm attempting to count the number of empty cells in an Excel sheet filled with data. To test the program, I've been deleting some values so that the cells are empty. My code is below:
import xlrd
import pandas as pd
import openpyxl

df = pd.read_excel('5train.xls')
workbook = xlrd.open_workbook('5train.xls')
worksheet = workbook.sheet_by_name('5train')
#Task starts here
empty = 0
row_data = worksheet.nrows - 1
row = 0
cell = 0
while row < row_data:
    if worksheet.cell(0, 0).value == xlrd.empty_cell.value:
        empty += 1
        cell += 1
    else:
        pass
    row += 1
print("Number of empty cells in data sheet:", empty)
However, the code will consistently print "Number of empty cells in data sheet: 0" no matter how many cells I empty. Any pointers? Thank you!
You always check the same cell in your loop:
    if worksheet.cell(0, 0).value == xlrd.empty_cell.value:
Only the cell in row 0, column 0 is checked for emptiness.
You can iterate over every row up to the last row that contains data using .get_rows(), then count the empty cells by checking the value of each cell in each row.
workbook = xlrd.open_workbook('5train.xls')
worksheet = workbook.sheet_by_name('5train')
empty_cells = 0
for row in worksheet.get_rows():
    empty_cells += sum(0 if c.value else 1 for c in row)
If you want to make it a one-liner, you can use:
empty_cells = sum(0 if c.value else 1 for row in worksheet.get_rows() for c in row)
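One caveat with the truthiness test: `0 if c.value else 1` also counts cells holding 0 as empty. A stricter sketch follows, assuming empty cells read back as `''` or `None`. The helper `count_empty` is hypothetical and takes plain values (for example `[c.value for c in row]` for each row) rather than cell objects; the sample data is made up.

```python
def count_empty(rows):
    """Count values that are None or an empty string; 0 is treated as data."""
    return sum(1 for row in rows for value in row if value is None or value == "")


sample = [
    [1.0, "", 3.0],
    [0.0, 5.0, None],
    ["", "x", 0],
]
print(count_empty(sample))   # 3
```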
I have a worksheet that is updated every week with thousands of rows, and I need to transfer rows from it to another sheet after filtering. I am using the code below to find the cells that have the value I need and then transfer the entire row to another sheet, but after saving the file I get the "IndexError: list index out of range" exception.
The code I use is as follows:
import openpyxl

wb1 = openpyxl.load_workbook('file1.xlsx')
wb2 = openpyxl.load_workbook('file2.xlsx')
ws1 = wb1.active
ws2 = wb2.active
for row in ws1.iter_rows():
    for cell in row:
        if cell.value == 'TrueValue':
            n = 'A' + str(cell.row) + ':' + ('GH' + str(cell.row))
            for row2 in ws1.iter_rows(n):
                ws2.append(row2)
wb2.save("file2.xlsx")
The original code, which used to work, is below. It has to be modified because the resulting files are so large (over 40 MB) that MS Excel cannot open them.
n = 'A3' + ':' + ('GH'+ str(ws1.max_row))
for row in ws1.iter_rows(n):
    ws2.append(row)
Thanks.
I'm not entirely sure what you're trying to do but I suspect the problem is that you have nested your copy loop.
Try the following:
row_nr = 1
# Find the first row that contains the value
for row in ws1:
    for cell in row:
        if cell.value == "TrueValue":
            row_nr = cell.row
            break
    if row_nr > 1:
        break
# Copy the values from that row onward
for row in ws1.iter_rows(min_row=row_nr, max_col=190):
    ws2.append((cell.value for cell in row))
Question: I get the "IndexError: list index out of range" exception.
From ws1.iter_rows(n) I get:
UserWarning: Using a range string is deprecated. Use ws[range_string]
and from ws2.append(row2):
ValueError: Cells cannot be copied from other worksheets
The reason is that row2 holds a list of Cell objects instead of a list of values.
Question: ... need to transfer rows from this worksheet after filtering
The following does what you want, for instance:
# If you want to Start at Row 2 to append Row Data
# Set Private self._current_row to 1
ws2.cell(row=1, column=1).value = ws2.cell(row=1, column=1).value
# Define min/max Column Range to copy
from openpyxl.utils import range_boundaries
min_col, min_row, max_col, max_row = range_boundaries('A:GH')
# Define Cell Index (0 Based) used to Check Value
check = 0 # == A
for row in ws1.iter_rows():
    if row[check].value == 'TrueValue':
        # Copy Row Values
        # We deal with Tuple Index 0 Based, so min_col must have to be -1
        ws2.append((cell.value for cell in row[min_col-1:max_col]))
Tested with Python: 3.4.2 - openpyxl: 2.4.1 - LibreOffice: 4.3.3.2
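For reference, here is a self-contained sketch of the same filter-and-copy idea using in-memory workbooks, so it runs without `file1.xlsx` or `file2.xlsx`. The sample rows and the three-column width are assumptions for illustration, and it assumes a recent openpyxl in which `iter_rows` accepts `values_only=True`.

```python
from openpyxl import Workbook

# Source workbook with made-up sample data
wb1 = Workbook()
ws1 = wb1.active
ws1.append(["TrueValue", "a", 1])
ws1.append(["Other", "b", 2])
ws1.append(["TrueValue", "c", 3])

# Target workbook
wb2 = Workbook()
ws2 = wb2.active
ws2.append(["Flag", "Name", "Number"])    # header row in the target

check = 0   # 0-based index of the column to test (column A)
for row in ws1.iter_rows(max_col=3, values_only=True):
    if row[check] == "TrueValue":
        ws2.append(row)                   # plain values, not Cell objects

copied = [list(r) for r in ws2.iter_rows(min_row=2, values_only=True)]
print(copied)
```

With real files, the `Workbook()` calls would be replaced by `openpyxl.load_workbook(...)` and the script would end with `wb2.save(...)`.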
Use a list to hold the items in each column for the particular row.
Then append the list to your ws2.
...
def iter_rows(ws,n): #produce the list of items in the particular row
    for row in ws.iter_rows(n):
        yield [cell.value for cell in row]
for row in ws1.iter_rows():
    for cell in row:
        if cell.value == 'TrueValue':
            n = 'A' + str(cell.row) + ':' + ('GH' + str(cell.row))
            list_to_append = list(iter_rows(ws1,n))
            for items in list_to_append:
                ws2.append(items)
I was able to solve this with lists for my project.
import openpyxl

#load data file
wb1 = openpyxl.load_workbook('original.xlsx')
sheet1 = wb1.active
print("loaded 1st file")
#new template file
wb2 = openpyxl.load_workbook('blank.xlsx')
sheet2 = wb2.active
print("loaded 2nd file")
header = sheet1[1:1] #grab header row
listH =[]
for h in header:
    listH.append(h.value)
sheet2.append(listH)
colOfInterest= 11 # this is my col that contains the value I'm checking against
for rowNum in range(2, sheet1.max_row +1): #iterate over each row, starting with 2 to skip the header from the original file
    if sheet1.cell(row=rowNum, column=colOfInterest).value is not None: #interested in non blank values in column 11
        listA = [] # list which will hold my data
        row = sheet1[rowNum:rowNum] #creates a tuple of row's data
        #print (str(rowNum)) # for debugging to show what rows are copied
        for cell in row: # for each cell in the row
            listA.append(cell.value) # add each cell's data as an element in the list
        if listA[10] == 1: # condition1 I'm checking for by looking up the index in the list
            sheet2.append(listA) # appending the sheet2's next available row
        elif listA[10] > 1: # condition2 I'm checking for by looking up the index in the list
            # do something else and store it in bar
            sheet2.append(bar) # appending the sheet2's next available row
print("saving file...")
wb2.save('result.xlsx') # save file
print("Done!")
Tested with: Python 3.7 openpyxl 2.5.4