I need to find the first empty row in an Excel file. I am currently using openpyxl with Python.
I couldn't find any method that does what I need, so I am trying to write my own. This is my code:
from openpyxl import load_workbook

book = load_workbook("myfile.xlsx")
ws = book.worksheets[0]
for row in ws['C{}:C{}'.format(ws.min_row, ws.max_row)]:
    for cell in row:
        if cell.value is None:
            print(cell.value)
            break
I am iterating through all cells in column "C" and breaking if the cell is empty. The problem is that it won't break; it just keeps printing "None" values.
Thanks
There is a built-in worksheet property "max_row" in openpyxl:
https://openpyxl.readthedocs.io/en/stable/api/openpyxl.worksheet.worksheet.html#openpyxl.worksheet.worksheet.Worksheet.max_row
max_row: an integer defining the maximum row index containing data
The loop below stops as soon as it encounters an empty cell in column C.
If you want the whole row to be completely empty, you can use all.
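from openpyxl import load_workbook

book = load_workbook("myfile.xlsx")
ws = book.worksheets[0]
for cell in ws["C"]:
    if cell.value is None:
        # first empty cell in column C
        print(cell.row)
        break
else:
    # no break happened: every cell in column C is filled
    print(cell.row + 1)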
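As a rough sketch of the all idea mentioned above, the helper below looks for the first row in which every value is empty. It works on plain tuples so that it runs without a workbook. The name first_fully_empty_row and the sample data are made up for this illustration; with openpyxl the rows would presumably come from list(ws.iter_rows(values_only=True)).

```python
def first_fully_empty_row(rows):
    """Return the 1-based number of the first row whose values are all None.

    If no such row exists, return the number of the row after the last one.
    """
    row_number = 0
    for row_number, row in enumerate(rows, start=1):
        # all() is True only when every value in the row is None
        if all(value is None for value in row):
            return row_number
    return row_number + 1


# Hypothetical sample data standing in for ws.iter_rows(values_only=True)
sample_rows = [
    ("a", 1, None),
    (None, 2, None),
    (None, None, None),
    ("d", 4, None),
]
print(first_fully_empty_row(sample_rows))
```

Rows that are only partly empty, like the first two here, are skipped; only a row with no values at all counts.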
Update to the question in the comments:
ws["C"] returns the cells C1:CX, where X is the last filled row in any column. So if column C happens to be the longest column and every entry is filled, every cell satisfies cell.value is not None and the loop never breaks. When a for loop finishes without hitting break, its else block runs; since you looped up to the last filled row, the first empty row is the next one.
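The same logic can be sketched as a small function that takes the column values as a plain list, which makes both cases (a gap in the column, or a completely filled column) easy to try out. The function name first_empty_row is hypothetical; with openpyxl the list could be built as [cell.value for cell in ws["C"]], assuming the column starts at row 1.

```python
def first_empty_row(values):
    """Return the 1-based row number of the first None in values.

    If every entry is filled, return the row right after the last one,
    which mirrors the else branch of a for/else loop.
    """
    row = 0
    for row, value in enumerate(values, start=1):
        if value is None:
            return row
    return row + 1


# Column with a gap: the first empty row is the gap itself
print(first_empty_row(["a", "b", None, "d"]))
# Fully filled column: the first empty row is the one after the data
print(first_empty_row(["a", "b"]))
```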
Related
Python version: 3.6
Python library: openpyxl
Excel version: 365
This will return the values from each cell in 255 columns of the top row of an Excel file. I only put 255 in as a temporary stopping point:
for row in ws.iter_rows(min_row=1, max_col=255, max_row=1, values_only=True):
    print(row)
I don't know how many columns with data will be in each workbook. All the top row cells that contain data will be consecutively listed starting from column 1.
When a top row cell without data is encountered, all remaining columns/rows will be empty.
I need the values of those consecutive top-row cells that contain data.
Thanks for the time.
#CharlieClark pointed me in the right direction. Something like this worked out for me. I still had to keep max_col=255 though or it would error out.
def column_get():
    # values_only=True yields one tuple of plain values for the single row requested
    for row in ws.iter_rows(min_row=1, max_col=255, max_row=1, values_only=True):
        for value in row:
            if value is None:
                # stop at the first empty cell
                break
            print(value)
If I understand you correctly, you can just remove the max_col argument. Then iter_rows stops at the sheet's last used column, which in your layout means it prints the first-row values up to the first empty cell.
Try this:
for row in ws.iter_rows(min_row=1, max_row=1, values_only=True):
    print(row)
If you still see many None values, check whether the sheet's first row contains a value that is mistakenly interpreted as None. I would suggest debugging it this way: create a new empty sheet and manually insert some test data to see if it works. If it does, manually copy and paste the data from the actual sheet into the test one.
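If trailing None values still show up in the tuple, one option is to cut the row off at the first empty cell after reading it. This is only a sketch using itertools.takewhile from the standard library; the helper name leading_values and the sample tuple are assumptions, and with openpyxl the tuple would come from ws.iter_rows(min_row=1, max_row=1, values_only=True).

```python
from itertools import takewhile


def leading_values(row):
    """Return the values of row up to (not including) the first None."""
    return list(takewhile(lambda value: value is not None, row))


# Hypothetical first row: three headers followed by empty cells
top_row = ("Name", "Date", "Amount", None, None)
print(leading_values(top_row))
```

Anything after the first empty cell is ignored, even if a later cell happens to contain data.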
I'm writing a program that searches through the first row of a sheet for a specific value ("Filenames"). Once found, it iterates through that column and returns the values underneath it (rows 2 through x).
I've figured out how to iterate through the first row in the sheet, and get the cell which contains the specific value, but now I need to iterate over that column and print out those values. How do I do so?
import os
import sys
from openpyxl import load_workbook

def main():
    column_value = 'Filenames'
    wb = load_workbook('test.xlsx')
    script = wb["Script"]
    # Find "Filenames"
    for col in script.iter_rows(min_row=1, max_row=1):
        for name in col:
            if (name.value == column_value):
                print("Found it!")
                filenameColumn = name
                print(filenameColumn)
    # Now that we have that column, iterate over the rows in that specific column to get the filenames
    for row in filenameColumn: # THIS DOES NOT WORK
        print(row.value)

main()
You're actually iterating over rows and cells, not columns and names here:
for col in script.iter_rows(min_row=1, max_row=1):
    for name in col:
If you rewrite it with accurate names, you can see that what you get is a cell:
for row in script.iter_rows(min_row=1, max_row=1):
    for cell in row:
        if (cell.value == column_value):
            print("Found it!")
            filenameCell = cell
            print(filenameCell)
So you have a cell. You need to get the column, which you can do with cell.column which returns a column index.
Rather than iterating over just the first row (which is what iter_rows with min_row and max_row set to 1 does), it would be better to use iter_cols, which is built for this. So:
for col in script.iter_cols():
    # see if the value of the first cell matches
    if col[0].value == column_value:
        # this is the column we want; col is an iterable of cells
        for cell in col:
            # do something with the cell in this column here
            # (iterate over col[1:] instead to skip the header cell)
            print(cell.value)
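To try the column search without a workbook, the same idea can be sketched on plain tuples of values. The helper values_under_header and the sample columns are hypothetical; with openpyxl the columns would presumably come from script.iter_cols(values_only=True).

```python
def values_under_header(columns, header):
    """Return the values below the first column whose top cell equals header.

    Returns None when no column starts with that header.
    """
    for col in columns:
        # col[0] is the header cell; everything after it is data
        if col and col[0] == header:
            return list(col[1:])
    return None


# Hypothetical sheet contents, one tuple per column
sample_columns = [
    ("ID", 1, 2, 3),
    ("Filenames", "a.txt", "b.txt", "c.txt"),
]
print(values_under_header(sample_columns, "Filenames"))
```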
What I want to do is find the first empty cell in row 1 using openpyxl.
What I have tried is:
last_date = ws.max_column
x3 = ws.cell(row=1, column=last_date)
print(x3.value)
However, this does not work in my situation because I am looking for the first empty cell in row 1. This gets the last column that has data in it.
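One possible approach, sketched here on a plain tuple of values, is to scan row 1 from the left and stop at the first None instead of relying on max_column. The function name first_empty_column is made up for this example; with openpyxl the tuple could be obtained with next(ws.iter_rows(min_row=1, max_row=1, values_only=True)).

```python
def first_empty_column(row_values):
    """Return the 1-based column index of the first None in row_values.

    If the row has no empty cell, return the column after the last one.
    """
    return next(
        (col for col, value in enumerate(row_values, start=1) if value is None),
        len(row_values) + 1,
    )


# Hypothetical row 1: the gap is in the third column
row_one = ("2021-01", "2021-02", None, "2021-04")
print(first_empty_column(row_one))
```

The result could then be passed to ws.cell(row=1, column=...) to reach that cell.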
I recently wrote code that finds a keyword I input by iterating over the rows of an Excel sheet. Once I find that keyword in a row, how do I move horizontally and get the value from another column in that same row?
A simple way to do this is to grab the value from a cell in a different column as you iterate over each row. Below, I'm assuming you are working from an existing workbook, which you can load by declaring the filepath variable.
import openpyxl

wb = openpyxl.load_workbook(filepath)
ws = wb.active
# Iterate each row of the spreadsheet
for row in ws.iter_rows():
    # Check if the value in column A is equal to variable "target"
    if row[0].value == target:
        # If there is a match, output is value in same row from column B
        output = row[1].value
In this example, you iterate through each row to check if the value in column A is equal to the target variable. If so, you can then retrieve any other value on that row by changing the index for the output variable.
Column index values run from 0 on, so row[0].value would be the value in the row for column A, row[1].value is the value in the row for column B, and so forth.
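The row lookup can also be wrapped in a small function so that the key column and the output column are parameters. This is a sketch on plain tuples; lookup_in_row, key_index, and value_index are hypothetical names, and with openpyxl the rows would presumably come from ws.iter_rows(values_only=True).

```python
def lookup_in_row(rows, target, key_index=0, value_index=1):
    """Return the value at value_index from the first row whose
    key_index entry equals target, or None if nothing matches."""
    for row in rows:
        if row[key_index] == target:
            return row[value_index]
    return None


# Hypothetical sheet contents, one tuple per row (columns A, B, C)
sample_rows = [
    ("apple", 3, "red"),
    ("pear", 5, "green"),
]
print(lookup_in_row(sample_rows, "pear"))
print(lookup_in_row(sample_rows, "pear", value_index=2))
```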
You have not given much information here as to what library you are using, which would be essential to give you any syntax hints. Openpyxl? Pandas?
So I can only help you with some pointers for your code:
You have a function that iterates over the rows.
You should write the function so that it keeps track of which row it's checking and, when it finds the keyword, returns the row number. You could do this with the enumerate function or with a simple counter:
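# find_keyword_row is a placeholder name; column is any iterable of cells
def find_keyword_row(column, keyword):
    counter = 1
    for cell in column:
        if keyword == cell.value:
            return counter
        else:
            counter += 1
    # keyword not found
    return None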
With the row number, all you need to do is create a reference to the cell that holds the keyword, then add 1 to the column of that reference.
For example, if the reference for the keyword is (1, 2), as (column, row), then you do a transformation like
keyword_ref = (1, 2)
value_ref = (keyword_ref[0] + 1, keyword_ref[1])
Finally, you return the value at value_ref.
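Put together with enumerate, the two steps might look like the sketch below. It works on a plain list of column values rather than a specific library; find_keyword_ref and shift_right are hypothetical helper names, and references are (column, row) tuples as in the example above.

```python
def find_keyword_ref(column_values, keyword, column=1):
    """Return (column, row) of the first value equal to keyword, else None."""
    for row, value in enumerate(column_values, start=1):
        if value == keyword:
            return (column, row)
    return None


def shift_right(ref, offset=1):
    """Return a reference offset columns to the right, same row."""
    column, row = ref
    return (column + offset, row)


# Hypothetical values from the first column of a sheet
column_a = ["header", "needle", "other"]
keyword_ref = find_keyword_ref(column_a, "needle")
value_ref = shift_right(keyword_ref)
print(keyword_ref, value_ref)
```

With openpyxl, such a reference could then be read back with something like ws.cell(row=value_ref[1], column=value_ref[0]).value.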
I have an Excel sheet and I have to iterate over a row up to a certain value (like cell.value == "cxc").
When a cell has that particular value, the iteration should stop and create an array of all the values iterated so far.
After reaching that value, it should continue iterating to the last column and create a new array of those values.
mylist = []
for row in ws.iter_rows():
    for cell in row:
        if cell.value == "CXC Type":
            break
            print(cell.coordinate ,end=" ")
        else:
            mylist.append(cell.value)
print(mylist)
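One way to get the two arrays described above is to locate the marker once and slice the row around it, rather than breaking out of the loop. This is only a sketch on a plain tuple of values: split_at_marker is a made-up name, and it assumes the marker itself should appear in neither list, which the question does not state. With openpyxl the values could come from a row of ws.iter_rows(values_only=True).

```python
def split_at_marker(values, marker):
    """Split values into the part before marker and the part after it.

    If marker is absent, everything goes into the first list.
    """
    values = list(values)
    if marker not in values:
        return values, []
    index = values.index(marker)
    return values[:index], values[index + 1:]


# Hypothetical row containing the marker cell
row = ("a", "b", "CXC Type", "c", "d")
before, after = split_at_marker(row, "CXC Type")
print(before)
print(after)
```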