I am trying to copy the values of a column from worksheet1 into a specific column of worksheet2, skipping all the null/None values in between. My code works when I print the values from the worksheet1 column, excluding the null values. However, when I save them to worksheet2, only the last value appears, duplicated down the whole column (from row 2 to 20).
I don't know why only the last value was written to the new column.
from openpyxl import Workbook
from openpyxl import load_workbook
source_file = (r'XXX(Source file).xlsx')
dest_file = (r'XXX(dest file).xlsx')
wb1=load_workbook(source_file, data_only=True)
wb1.active=0
ws1=wb1.active
wb2=load_workbook(dest_file)
wb2.active=0
ws2=wb2.active
for a in range(9,43):
    cell2 = ws1.cell(row = a, column = 10)
    if cell2.value is None or cell2.value == 0:
        continue
    else:
        print(cell2.value)
        for b in range(2,20):
            ws2.cell(row = b, column=4).value = cell2.value
wb2.save(dest_file)
Your second loop is nested inside the first, so on every pass it overwrites the whole column of the second sheet with the current value from the first. After the last pass, only the last value remains.
I'd do something like this:
idx = 2
# rows 9-42, the same rows as range(9, 43) in the question
for row in ws1.iter_rows(min_row=9, max_row=42, min_col=10, max_col=10):
    cell = row[0]
    if cell.value:  # skip None and 0
        ws2.cell(row=idx, column=4, value=cell.value)
        idx += 1
wb2.save(dest_file)
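A minimal, self-contained sketch of the same idea, using in-memory workbooks instead of the files from the question. The sample values and the helper name copy_non_empty are made up for illustration:

```python
from openpyxl import Workbook


def copy_non_empty(ws_src, ws_dst, src_col, dst_col, min_row, max_row, dst_start_row=2):
    """Copy truthy values from one column to another, closing the gaps.

    Returns the number of values written. (Hypothetical helper, not part of openpyxl.)
    """
    dst_row = dst_start_row
    for (cell,) in ws_src.iter_rows(min_row=min_row, max_row=max_row,
                                    min_col=src_col, max_col=src_col):
        if cell.value:  # skips None, 0 and empty strings
            ws_dst.cell(row=dst_row, column=dst_col, value=cell.value)
            dst_row += 1  # advance only after a value was written
    return dst_row - dst_start_row


# Assumed sample data: column J (10), starting at row 9, with gaps.
wb1, wb2 = Workbook(), Workbook()
ws1, ws2 = wb1.active, wb2.active
for offset, value in enumerate([5, None, 0, 7, None, 9]):
    ws1.cell(row=9 + offset, column=10, value=value)

written = copy_non_empty(ws1, ws2, src_col=10, dst_col=4, min_row=9, max_row=42)
copied = [ws2.cell(row=r, column=4).value for r in range(2, 2 + written)]
print(copied)
```

The destination row counter moves independently of the source row, which is what keeps the output column free of gaps.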
Related
If a cell in column C contains "external", copy the value "good" from column D of that row into column E of every row where column A contains 003.
The original post showed two Excel screenshots here, before and after (images not available).
I tried to write a script for this, but it did not work. The places marked "???" need the right "row" and "column" values:
import openpyxl
from openpyxl import load_workbook
wb_source = openpyxl.load_workbook("path/file.xlsx")
sheet = wb_source['Sheet1']
x=sheet.max_row
y=sheet.max_column
for r in range(1, x+1) :
    for j in range(1, y+1):
        copy(sheet.cell(row= ???, column=???)
        if str(copy.value)=="external":
            sheet.??
            break
wb_source.save("path/file2.xlsx")
How should they be added (row and column)?
Read the entire sheet.
Create a dictionary for the external products
Write back to Excel.
Try:
import openpyxl
wb = openpyxl.load_workbook("file1.xlsx")
ws = wb['Sheet1']
# Read the entire sheet as a list of rows of values
data = [[cell.value for cell in row] for row in ws.iter_rows()]
# Map column A -> column D for the rows marked "external" in column C
mapper = {l[0]: l[3] for l in data if l[2] == "external"}
# Fill column E in every row whose column A value is in the mapper
for r in range(1, ws.max_row + 1):
    key = ws.cell(r, 1).value
    if key in mapper:
        ws.cell(r, 5).value = mapper[key]
wb.save("file2.xlsx")
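Since the screenshots are missing, here is a small runnable sketch of the three steps on made-up data. The rows below are assumptions, and values_only needs openpyxl 2.6 or later:

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active
# Assumed layout: A = code, B = name, C = source, D = status
for sample_row in [
    ("001", "bolt", "internal", "bad"),
    ("003", "nut", "external", "good"),
    ("003", "nut", "internal", None),
    ("004", "gear", "internal", "ok"),
]:
    ws.append(sample_row)

# Steps 1 and 2: read the sheet and map column A -> column D for "external" rows.
mapper = {
    a: d
    for a, _, c, d in ws.iter_rows(max_col=4, values_only=True)
    if c == "external"
}

# Step 3: write the mapped value into column E of every row with a matching code.
for r in range(1, ws.max_row + 1):
    key = ws.cell(row=r, column=1).value
    if key in mapper:
        ws.cell(row=r, column=5, value=mapper[key])

column_e = [ws.cell(row=r, column=5).value for r in range(1, ws.max_row + 1)]
print(column_e)
```

Both rows with code 003 receive "good" in column E, including the one that is not itself marked "external".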
I am trying to pass a new value to the B column starting at B2 and looping to Max Row.
import openpyxl
import os
# Finds current directory
current_path = os.getcwd()
print(current_path)
# Changes directory
os.chdir('C:\\Users\\satwood\\Documents\\Example')
# prints new current directory
new_path = os.getcwd()
print(new_path)
# load workbooks
wb = openpyxl.load_workbook('example.xlsx')
type(wb)
# load worksheets
ws1 = wb.active
# append column B with cell_example
cell_example = ['success one']
max = ws1.max_row
for row, entry in enumerate(cell_example, start=1):
    ws1.cell(row=row + max, column=1, value=entry)
wb.save('example.xlsx')
The output that comes out of this code is:
None
None
None
None
None
None
...
Your code isn't doing what you expect. You are iterating over cell_example, which is a list with only one item ('success one'). The loop therefore runs once: row is 1, entry is 'success one', and the value is assigned to cell A{1 + max_row}, the first cell below the existing data in column A.
Note that column B is addressed either by its letter, as in ws1[f'B{row}'], or by the number 2 when you call ws1.cell(...) the way your code does.
If you are trying to add the same value to all cells in column B from B2 down to the last row, change your loop to:
for row in range(2, ws1.max_row + 1):
    ws1.cell(row=row, column=2, value=cell_example[0])
However, if you wish to iterate over the list in cell_example and assign its items to column B, you can use:
for row, entry in enumerate(cell_example, start=2):
    ws1.cell(row=row, column=2, value=entry)
This will add the items from the list cell_example to column B, starting at cell B2.
Also note stovfl's comment: a better way to iterate over the cells of a column is:
for col_cells in worksheet.iter_cols(min_col=2, max_col=2):
    for cell in col_cells:
        cell.value = "Something"
as stated in the answer he mentioned.
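To check the first variant without touching example.xlsx, here is a small sketch on an in-memory workbook. The column A contents are invented for the example:

```python
from openpyxl import Workbook

wb = Workbook()
ws1 = wb.active
# Assumed sheet: a header row plus three data rows in column A.
for value in ["name", "a", "b", "c"]:
    ws1.append([value])

# range() excludes its stop value, hence the + 1 to include the last row.
for row in range(2, ws1.max_row + 1):
    ws1.cell(row=row, column=2, value="success one")

column_b = [ws1.cell(row=r, column=2).value for r in range(1, ws1.max_row + 1)]
print(column_b)
```

B1 stays empty and B2 through the last row are filled.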
How do I iterate through all the rows in an xls sheet and get each row's data as a tuple? At the end of the iteration I should have a list of tuples, each tuple holding the data of one row.
For instance: This is the content of my spreadsheet:
testcase_ID input_request request_change
test_1A test/request_1 YES
test_2A test/request_2 NO
test_3A test/request_3 YES
test_4A test/request_4 YES
my final list should be:
[(test_1A, test/request_1, YES),
(test_2A, test/request_2, NO),
(test_3A, test/request_3, YES),
(test_4A, test/request_4, YES)]
How can I do this in openpyxl?
I think this task would be easier with xlrd (note that xlrd 2.0 and later reads only .xls files, not .xlsx). However, if you want to use openpyxl, then assuming that testcase_ID is in column A, input_request in column B, and request_change in column C, something like this might be what you are looking for:
import openpyxl as xl
#Opening xl file
wb = xl.load_workbook('PATH/TO/FILE.xlsx')
#Select your sheet (for this example I chose active sheet)
ws = wb.active
#Start row, where data begins
row = 2
testcase = '' #this is just so that you can enter while - loop
#Initializing list
final_list = []
#With each iteration we get the value of testcase, if the cell is empty
#testcase will be None, when that happens the while loop will stop
while testcase is not None:
    #Getting cell value, from columns A, B and C
    #Iterating through rows 2, 3, 4 ...
    testcase = ws['A' + str(row)].value
    in_request = ws['B' + str(row)].value
    req_change = ws['C' + str(row)].value
    #Making tuple
    row_tuple = (testcase, in_request, req_change)
    #Adding tuple to list
    final_list.append(row_tuple)
    #Going to next row
    row += 1
#This is what you return, you don't want the last element
#because it is tuple of None's
print(final_list[:-1])
If you want to do it with xlrd this is how I would do it:
import xlrd
#Opening xl file (xlrd 2.0+ reads only .xls; older versions also read .xlsx)
wb = xlrd.open_workbook('PATH/TO/FILE.xlsx')
#Select your sheet (for this example I chose first sheet)
#you can also choose by name or something else
ws = wb.sheet_by_index(0)
#Getting number of rows and columns
num_row = ws.nrows
num_col = ws.ncols
#Initializing list
final_list = []
#Iterating over number of rows, skipping the header at index 0
for i in range(1,num_row):
    #list of row values
    row_values = []
    #Iterating over number of cols
    for j in range(num_col):
        row_values.append(ws.cell_value(i,j))
    #Making tuple with row values
    row_tuple = tuple(row_values)
    #Adding tuple to list
    final_list.append(row_tuple)
print(final_list)
Notes on xlrd indexing, collected here for easy reading:
No if statement is needed: when num_row is 1, the for loop never runs
xlrd indexes rows beginning at 0
for row 2 we want index 1
Columns are also zero-indexed (A=0, B=1, C=2...)
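For reference, a shorter openpyxl sketch of the same result, assuming openpyxl 2.6 or later, where iter_rows accepts values_only=True and yields plain tuples. The workbook is built in memory with two of the rows from the question, so no file is needed:

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active
ws.append(["testcase_ID", "input_request", "request_change"])  # header row
ws.append(["test_1A", "test/request_1", "YES"])
ws.append(["test_2A", "test/request_2", "NO"])

# min_row=2 skips the header; each row comes back as a tuple of cell values.
final_list = list(ws.iter_rows(min_row=2, values_only=True))
print(final_list)
```

Because iter_rows stops at the sheet's last used row, there is no trailing tuple of None values to slice off.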
I'm relatively new to Python, and I'm attempting to count the number of empty cells in an Excel sheet filled with data. To test the program, I've been deleting some values so that the cells are empty. My code is below:
import xlrd
import pandas as pd
import openpyxl
df = pd.read_excel('5train.xls')
workbook = xlrd.open_workbook('5train.xls')
worksheet = workbook.sheet_by_name('5train')
#Task starts here
empty = 0
row_data = worksheet.nrows - 1
row = 0
cell = 0
while row < row_data:
    if worksheet.cell(0, 0).value == xlrd.empty_cell.value:
        empty += 1
        cell += 1
    else:
        pass
    row += 1
print("Number of empty cells in data sheet:", empty)
However, the code will consistently print "Number of empty cells in data sheet: 0" no matter how many cells I empty. Any pointers? Thank you!
You always check the same cell in your loop:
if worksheet.cell(0, 0).value == xlrd.empty_cell.value:
Only the cell in row 0, column 0 is ever checked, because the indices passed to worksheet.cell() never change.
You can iterate over each row through the last row that contains data using .get_rows(), then count the empty cells by checking the value of each cell in each row.
workbook = xlrd.open_workbook('5train.xls')
worksheet = workbook.sheet_by_name('5train')
empty_cells = 0
for row in worksheet.get_rows():
    empty_cells += sum(0 if c.value else 1 for c in row)
If you want to make it a one-liner, you can use:
empty_cells = sum(0 if c.value else 1 for row in worksheet.get_rows() for c in row)
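One caveat with `0 if c.value else 1`: it treats every falsy value as empty, so a cell holding the number 0 is counted as well. If that matters for your data, a stricter check could look like the sketch below. The helper name count_empty is made up, and it works on plain rows of values so it can be tried without a spreadsheet:

```python
def count_empty(rows):
    """Count values that are None or an empty string; 0 and False count as data."""
    return sum(1 for row in rows for value in row if value is None or value == "")


# Made-up sample rows standing in for the cell values of a sheet.
rows = [
    ("a", "", 3),
    (0, None, "b"),
    ("", "c", False),
]
empty_cells = count_empty(rows)
print(empty_cells)
```

With xlrd, the rows could be supplied as `((c.value for c in row) for row in worksheet.get_rows())`, since xlrd reports an empty cell's value as an empty string.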
I have a worksheet that is updated every week with thousands of rows, and I need to transfer rows from it to another sheet after filtering. I am using the code below to find the cells that have the value I need and then transfer the entire row to another sheet, but when it runs I get the "IndexError: list index out of range" exception.
The code I use is as follows:
import openpyxl
wb1 = openpyxl.load_workbook('file1.xlsx')
wb2 = openpyxl.load_workbook('file2.xlsx')
ws1 = wb1.active
ws2 = wb2.active
for row in ws1.iter_rows():
    for cell in row:
        if cell.value == 'TrueValue':
            n = 'A' + str(cell.row) + ':' + ('GH' + str(cell.row))
            for row2 in ws1.iter_rows(n):
                ws2.append(row2)
wb2.save("file2.xlsx")
The original code below used to work, but it has to be modified because the resulting files are so large (over 40 MB) that MS Excel cannot open them.
n = 'A3' + ':' + ('GH'+ str(ws1.max_row))
for row in ws1.iter_rows(n):
    ws2.append(row)
Thanks.
I'm not entirely sure what you're trying to do but I suspect the problem is that you have nested your copy loop.
Try the following:
row_nr = 1
for row in ws1:
    for cell in row:
        if cell.value == "TrueValue":
            row_nr = cell.row
            break
    if row_nr > 1:
        break
for row in ws1.iter_rows(min_row=row_nr, max_col=190):
    ws2.append((cell.value for cell in row))
Question: I get the "IndexError: list index out of range" exception.
I get, from ws1.iter_rows(n)
UserWarning: Using a range string is deprecated. Use ws[range_string]
and from ws2.append(row2).
ValueError: Cells cannot be copied from other worksheets
The reason is that row2 holds a list of Cell objects instead of a list of values.
Question: ... need to transfer rows from this worksheet after filtering
The following does what you want, for instance:
# If you want to Start at Row 2 to append Row Data
# Set Private self._current_row to 1
ws2.cell(row=1, column=1).value = ws2.cell(row=1, column=1).value
# Define min/max Column Range to copy
from openpyxl.utils import range_boundaries
min_col, min_row, max_col, max_row = range_boundaries('A:GH')
# Define Cell Index (0 Based) used to Check Value
check = 0 # == A
for row in ws1.iter_rows():
    if row[check].value == 'TrueValue':
        # Copy Row Values
        # We deal with Tuple Index 0 Based, so min_col must have to be -1
        ws2.append((cell.value for cell in row[min_col-1:max_col]))
Tested with Python: 3.4.2 - openpyxl: 2.4.1 - LibreOffice: 4.3.3.2
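On newer releases the same filter can be written more compactly. The sketch below assumes openpyxl 2.6 or later (newer than the 2.4.1 tested above), where iter_rows accepts values_only=True. The sample rows and the header are invented, and the workbooks live in memory:

```python
from openpyxl import Workbook

wb1, wb2 = Workbook(), Workbook()
ws1, ws2 = wb1.active, wb2.active

ws2.append(["flag", "item", "qty"])  # assumed header already in the destination
for sample_row in [("TrueValue", "a", 1), ("other", "b", 2), ("TrueValue", "c", 3)]:
    ws1.append(sample_row)

check = 0  # 0-based index of the column to test (column A)
for row in ws1.iter_rows(values_only=True):
    if row[check] == "TrueValue":
        ws2.append(row)  # row is a tuple of plain values, not Cell objects

result = list(ws2.iter_rows(values_only=True))
print(result)
```

Appending values rather than Cell objects is what avoids the "Cells cannot be copied from other worksheets" error.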
Use a list to hold the items in each column for the particular row.
Then append the list to your ws2.
...
def iter_rows(ws, n):  # produce the list of values for each row in range n
    for row in ws[n]:  # ws[n] replaces the deprecated ws.iter_rows(n)
        yield [cell.value for cell in row]
for row in ws1.iter_rows():
    for cell in row:
        if cell.value == 'TrueValue':
            n = 'A' + str(cell.row) + ':' + ('GH' + str(cell.row))
            list_to_append = list(iter_rows(ws1, n))
            for items in list_to_append:
                ws2.append(items)
I was able to solve this with lists for my project.
import openpyxl
#load data file
wb1 = openpyxl.load_workbook('original.xlsx')
sheet1 = wb1.active
print("loaded 1st file")
#new template file
wb2 = openpyxl.load_workbook('blank.xlsx')
sheet2 = wb2.active
print("loaded 2nd file")
header = sheet1[1:1] #grab header row
listH =[]
for h in header:
    listH.append(h.value)
sheet2.append(listH)
colOfInterest= 11 # this is my col that contains the value I'm checking against
for rowNum in range(2, sheet1.max_row +1): #iterate over each row, starting with 2 to skip the header from the original file
    if sheet1.cell(row=rowNum, column=colOfInterest).value is not None: #interested in non blank values in column 11
        listA = [] # list which will hold my data
        row = sheet1[rowNum:rowNum] #creates a tuple of row's data
        #print (str(rowNum)) # for debugging to show what rows are copied
        for cell in row: # for each cell in the row
            listA.append(cell.value) # add each cell's data as an element in the list
        if listA[10] == 1: # condition1 I'm checking for by looking up the index in the list
            sheet2.append(listA) # appending the sheet2's next available row
        elif listA[10] > 1: # condition2 I'm checking for by looking up the index in the list
            # do something else and store it in bar
            sheet2.append(bar) # appending the sheet2's next available row
print("saving file...")
wb2.save('result.xlsx') # save file
print("Done!")
Tested with: Python 3.7 openpyxl 2.5.4