Openpyxl: iterate on range of cell - python

I would like to write a sequence of values taken from a list to a range of cells.
VIEW
def create_excel(avg):
    wb = load_workbook('output/file_base.xlsx')
    ws = wb.active
    ws['D20'] = avg[0]
    ws['D21'] = avg[1]
    ws['D22'] = avg[2]
    ws['D23'] = avg[3]
    ws['D24'] = avg[4]
    ws['D25'] = avg[5]
    wb.save('out.xlsx')
    return 1
I would like to do this using a loop, and I have tried the following:
start, stop = 20, 26
for index, row in enumerate(ws.iter_rows()):
    if start < index < stop:
        ws.cell[index] = avg[index]
but it returns:
list index out of range
How can I do this? I am using openpyxl 2.3.

You can specify row and columns as follows:
import openpyxl
avg = [10, 20, 25, 5, 32, 7]
wb = openpyxl.load_workbook('output/file_base.xlsx')
ws = wb.active
for row, entry in enumerate(avg, start=20):
    ws.cell(row=row, column=4, value=entry)
wb.save('out.xlsx')
This iterates over your averages, and uses Python's enumerate function to count for you at the same time. By telling it to start with a value of 20, it can then be used as the row value for writing to a cell.
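To see why the loop in the question fails and what enumerate(avg, start=20) produces, here is a minimal sketch in plain Python. The name targets is only for illustration and is not part of openpyxl; the workbook is left out so the snippet runs without any file.

```python
avg = [10, 20, 25, 5, 32, 7]

# The question's loop used the sheet's row index (21..25) to index avg,
# but avg only has positions 0..5, hence "list index out of range".

# enumerate(..., start=20) pairs each value with its target row instead.
targets = {f"D{row}": value for row, value in enumerate(avg, start=20)}

for ref, value in targets.items():
    print(ref, value)
```

With a worksheet ws loaded as in the answer, each pair could then be written with ws[ref] = value, the same assignment form the question already uses.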

Related

how to write dynamic lines from pdf to excel

import pdfplumber
import openpyxl
pdf = pdfplumber.open("path")
page = pdf.pages[0]
text = page.extract_text()
lin = text.split("\n")
wb = openpyxl.load_workbook("path")
ws = wb.active
ws.title = "S1"
u = int(0)
i = int(1)
for u in lin:
    ws.cell(row = i, column = 1).value = lin[u]
    u = u+1
    i = i+1
wb.save("path")
pdf.close
print("Ok!")
Error:
TypeError: list indices must be integers or slices, not str
The error occurs in the for loop.
I split each line of the pdf and now I want to write each line in excel.
Example:
line in specific pdf file:
A
B
C
D
I got every line from the pdf. The variable "lin", for example, holds the value A at index 0 and the value B at index 1.
I want to take lin[0] and write it to cell A1, then take lin[1] and write it to cell A2 in the same column, and so on.
You should use an integer variable u as the index. In your loop, for u in lin assigns each line (a string) to u, so lin[u] raises the TypeError.
i = 1
for u in range(len(lin)):
    ws.cell(row=i, column=1).value = lin[u]
    i += 1
You can also replace the for loop with a for-each loop:
for value in lin:
    ws.cell(row=i, column=1).value = value
    i += 1
In this case you don't need to worry about the index variable.
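Also note that pdf.close in your script is missing its parentheses, so the method is never actually called. It should be:
pdf.close()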
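The row counter can also be handed over to enumerate. The sketch below assumes the page text has already been extracted (a literal string stands in for page.extract_text()) and only builds the (row, value) pairs; rows_for_lines is a hypothetical helper name, not part of pdfplumber or openpyxl.

```python
def rows_for_lines(text, first_row=1):
    """Pair each line of text with the worksheet row it should go to."""
    # splitlines() splits on "\n" and "\r\n" alike
    return list(enumerate(text.splitlines(), start=first_row))


text = "A\nB\nC\nD"  # stand-in for page.extract_text()
for row, value in rows_for_lines(text):
    print(row, value)
```

Inside that loop, ws.cell(row=row, column=1, value=value) would write each line to column A, so neither u nor i has to be maintained by hand.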

Making my excel formatting python script more efficient

I'm trying to create a program that checks a column in Excel and moves rows with specific values to another sheet in the same workbook. I was wondering if there is a more efficient way to do this, using pandas or something else.
import time
start_time = time.perf_counter()
import openpyxl
wb = openpyxl.load_workbook("Test.xlsx")
ws = wb.active
mr, mc = ws.max_row, ws.max_column
column_string = input("Enter Column Letter with Email (A or B or C or leave blank to skip editing):").upper()
if len(column_string) > 0:
    for cell in ws[column_string][1:]:
        if cell.value is None:
            ws_1 = wb.create_sheet('Linkedin Only')
            for i in range(1, mr + 1):
                for j in range(1, mc + 1):
                    c = ws.cell(row = i, column = j)
                    ws_1.cell(row = i, column = j).value = c.value
            break
    for cell in ws_1[column_string][1:]:
        if cell.value is not None:
            ws_1.delete_rows(cell.row)
    for cell in ws[column_string][1:]:
        if cell.value is None:
            ws.delete_rows(cell.row)
    wb.save("Test.xlsx")
else:
    wb.save("Test.xlsx")
end_time = time.perf_counter()
print(end_time - start_time, "seconds")
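One possible way to cut the work down, sketched under assumptions: read every row once and sort it into one of two lists depending on whether the e-mail cell is empty, instead of copying the whole sheet and then deleting rows from both copies. split_rows and email_index are hypothetical names, and the sample rows stand in for what ws.iter_rows(values_only=True) would yield in recent openpyxl versions.

```python
def split_rows(rows, email_index):
    """Split data rows into (with_email, without_email) in a single pass."""
    with_email, without_email = [], []
    for row in rows:
        if row[email_index] is None:
            without_email.append(row)
        else:
            with_email.append(row)
    return with_email, without_email


# Made-up sample data; the header row is assumed to be handled separately.
data = [("Ann", "ann@example.com"), ("Bob", None), ("Cy", "cy@example.com")]
with_email, without_email = split_rows(data, email_index=1)
print(with_email)
print(without_email)
```

Each list could then be written to its own freshly created sheet with ws.append(row), which avoids the cell-by-cell copy and the row-by-row deletions.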

openpyxl - Find the maximum length of a cell for each column

This is my first time using openpyxl. I want to find the length of the longest cell value in each column of an Excel sheet. The code below prints one result per row instead of per column, and even that does not come out correctly. How can I fix it to get what I want? Thank you.
import openpyxl
filepath = "test.xlsx"
wb = openpyxl.load_workbook(filepath)
ws = wb.active
max_row = ws.max_row
max_column = ws.max_column
for i in range(1, max_row + 1):
    max_length = 0
    for j in range(1, max_column + 1):
        try:
            if len(str(ws.cell(row=i, column=j).value)) > max_length:
                max_length = len(ws.cell(row=i, column=j).value)
        except:
            pass
    print(max_length)
Well, you can use ws.iter_cols(), like @CharlieClark mentioned in the comments. Here's an example for a single column:
max_len = 0
columns = ws.iter_cols(2, 2)  # (min_col, max_col), both inclusive; use (7, 7) to iterate through column 7 only
for col in columns:
    for cell in col:  # cell is the Cell object, not its value
        if cell.value is not None:
            # str() lets numbers and dates be measured as well as text
            max_len = max(max_len, len(str(cell.value)))
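To get one result per column rather than for a single column, the same idea can be applied to every column in turn. This is a minimal sketch in plain Python: max_lengths is a hypothetical helper, and the sample tuples stand in for what ws.iter_cols(values_only=True) would yield in recent openpyxl versions.

```python
def max_lengths(columns):
    """Return the longest str() length found in each column; empty cells are skipped."""
    result = []
    for column in columns:
        lengths = [len(str(value)) for value in column if value is not None]
        result.append(max(lengths, default=0))
    return result


# Made-up sample data: one tuple of cell values per column.
columns = [("id", 1, 22, None), ("name", "Alexander", "Bo", "Chi")]
print(max_lengths(columns))  # [2, 9]
```

Converting with str() is a design choice here: it measures numbers by their printed form, which is usually what is wanted when the goal is to size column widths.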

Populate different cells in a column with different values using a "for" loop

I'm trying to populate the first 9 cells of the first column in an Excel spreadsheet with different values. The code below does fill the first 9 cells, but instead of putting the letters of the string j ("a", "b", "c", "d", "e") into successive cells, it fills all 9 cells with only the last value, "e". How can I make the code iterate through the string assigned to j and write one letter to each cell?
Python version 3.6,
IDE: Pycharm
Here is the code:
import xlsxwriter
workbook = xlsxwriter.Workbook("test.xlsx")
worksheet = workbook.add_worksheet()
for h in range(0, 9):  # Cell position generator
    u = 1
    cell_position = (u + h)
    g = "A"
    f = str(cell_position)
    iterated_cell_position = [g+f]  # puts cell positions in a list
    j = "abcde"
    for p in iterated_cell_position:
        for e in j:
            worksheet.write(p, e)
workbook.close()
Please help me with this?
Thank you.
Your iterated_cell_position is a list with only one element, and the loop
for e in j:
    worksheet.write(p, e)
just writes each letter to the same cell. So you write a to the cell, then b to the cell, then c and so on. Try
import xlsxwriter
workbook = xlsxwriter.Workbook("test.xlsx")
worksheet = workbook.add_worksheet()
j = "abcde"
for h in range(0, 9):  # Cell position generator
    e = j[h % 5]  # gets the correct letter in j (wraps around when h gets too large)
    cell_position = "A{}".format(h + 1)
    worksheet.write(cell_position, e)
workbook.close()
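The wrap-around can be checked without creating a workbook. The sketch below only builds the mapping from cell reference to letter; the names letters and cells are illustrative, and len(letters) replaces the hard-coded 5 so the string can change length.

```python
letters = "abcde"

# h % len(letters) cycles 0,1,2,3,4,0,1,2,3 so the letters repeat after "e"
cells = {"A{}".format(h + 1): letters[h % len(letters)] for h in range(9)}

for ref, letter in cells.items():
    print(ref, letter)
```

Each pair could then be passed to worksheet.write(ref, letter), as in the answer above.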

Find index of duplicate rows in Openpyxl

I want to find the index of all duplicate rows in an excel file and add them to a list which will be handled later.
unwantedRows = []
Row = []
item = ""
for index, row in enumerate(ws1.iter_rows(max_col = 50), start = 1):
    for cell in row:
        if cell.value:
            item += cell.value
    if item in Row:
        unwantedRows.append(index)
    else:
        Row.append(item)
However this fails to work. It only indexes rows that are completely empty. How do I fix this?
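In your code, item is never reset between rows, so it keeps growing. Collect each row's values in a fresh list instead and compare whole rows:
unwantedRows = []
Rows = []
for index, row in enumerate(ws1.iter_rows(max_col = 50), start = 1):
    sublist = []
    for cell in row:
        sublist.append(cell.value)
    if sublist not in Rows:
        Rows.append(sublist)
    else:
        unwantedRows.append(index)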
Without a tuple:
row_numbers_to_delete = []
rows_to_keep = []
for row in ws.rows:
    working_list = []
    for cell in row:
        working_list.append(cell.value)
    if working_list not in rows_to_keep:
        rows_to_keep.append(working_list)
    else:
        row_numbers_to_delete.append(cell.row)
# Delete from the bottom up so earlier deletions don't shift the remaining row numbers
for row in reversed(row_numbers_to_delete):
    ws.delete_rows(idx=row, amount=1)
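For large sheets, checking each row against a list of all earlier rows gets slow. A possible alternative, sketched here on plain tuples, is to remember the rows already seen in a set. duplicate_row_numbers is a hypothetical helper, and the sample rows stand in for the cell values of each worksheet row.

```python
def duplicate_row_numbers(rows, first_row=1):
    """Return the row numbers of rows whose values repeat an earlier row."""
    seen = set()
    duplicates = []
    for number, row in enumerate(rows, start=first_row):
        key = tuple(row)  # tuples are hashable, lists are not
        if key in seen:
            duplicates.append(number)
        else:
            seen.add(key)
    return duplicates


# Made-up sample data: rows 3 and 5 repeat rows 1 and 2.
rows = [("a", 1), ("b", 2), ("a", 1), (None, None), ("b", 2)]
print(duplicate_row_numbers(rows))  # [3, 5]
```

If the rows are then removed with ws.delete_rows, going through reversed(duplicates) keeps the remaining row numbers valid.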
