detect EOF of excel file in Python

I have written code to detect the EOF of an Excel file using Python:
row_no = 1
while True:
    x = xlws.Cells(row_no,1).value
    if type(x) is None:
        break
    else:
        print(len(x))
        print(x)
        row_no = row_no + 1
I expected the while loop to stop when x becomes a blank cell, which I assumed would be None, but it doesn't work: it goes on to len(x) and raises an error saying NoneType has no len. Why?
Thanks!

This here is your problem:
if type(x) is None:
If x is None, its type is NoneType. Therefore, this is never true, so you never see the blank cell and you end up trying to get the length of None.
Instead, write:
if x is None:
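A quick check makes the difference visible. This is only a minimal sketch: the list `column_values` and the helper `read_until_blank` are hypothetical stand-ins for the worksheet cells, so it runs without Excel.

```python
# type(None) is the class NoneType, not the object None itself
x = None
print(type(x) is None)  # False: the original test can never succeed
print(x is None)        # True: this is the test to use

# Hypothetical stand-in for a column of cell values; None marks the first blank cell
column_values = ["alpha", "beta", None, "ignored"]

def read_until_blank(values):
    """Collect values until the first None, mirroring the corrected loop."""
    result = []
    for value in values:
        if value is None:
            break
        result.append(value)
    return result

print(read_until_blank(column_values))
```

With the corrected test, the loop stops at the first blank cell instead of calling len() on None.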

It looks like you are using pywin32 (win32com) ... you don't need to loop around looking for "EOF" (you mean end of sheet, not end of file).
If xlws refers to a Worksheet object, you can use this:
used = xlws.UsedRange
nrows = used.Row + used.Rows.Count - 1
to get the effective number of rows in the worksheet. used.Row is the 1-based row number of the first used row, and the meaning of used.Rows.Count should be rather obvious.
Alternative: use xlrd ... [dis]claimer: I'm the author.
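The UsedRange arithmetic can be checked without Excel. The sketch below assumes only that the object exposes `Row` and `Rows.Count` as described in this answer; `fake_used` is a hypothetical stand-in built with `SimpleNamespace`, not a real COM object.

```python
from types import SimpleNamespace

def last_used_row(used_range):
    """Return the 1-based number of the last row covered by a UsedRange-like object."""
    return used_range.Row + used_range.Rows.Count - 1

# Hypothetical stand-in for xlws.UsedRange: data starts at row 3 and spans 10 rows
fake_used = SimpleNamespace(Row=3, Rows=SimpleNamespace(Count=10))
print(last_used_row(fake_used))  # 12
```

The subtraction of 1 is needed because the first used row is itself counted in Rows.Count.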

As mentioned in other answers, you can also use xlrd to find the dimensions of the Excel file:
workbook = xlrd.open_workbook(excel_loc)
excel_sheet = workbook.sheet_by_index(0)
print("no of rows: %d" % excel_sheet.nrows)
print("no of cols: %d" % excel_sheet.ncols)
Note that xlrd 2.0 and later reads only the legacy .xls format.

Related

Include a header from Excel in a for loop with openpyxl

I am trying to include a header when printing data in a column.
Issue
But when I try it an error comes up:
TypeError: '<' not supported between instances of 'int' and 'str'
Code
def pm1():
    for cell in all_columns[1]:
        power = (cell.value)
        if x < power < y:
            print(f"{power}")
        else:
            print("Not steady")
pm1()
I know you cannot compare a string with numeric values.
How can I include the header while looping throughout the entire column?

Based on what I understand from your comments, this may work for you.
def pm1():
    for cell in all_columns[1]:
        # in openpyxl you can use .row or .column to get the location of a cell
        # you said you wanted to print the header (row 1), a string
        if cell.row == 1:
            print(cell.value)
        else:
            # you said that the values under the header will be numbers,
            # so it is now safe to set your variable and make the comparison
            power = cell.value
            if x < power < y:
                print(f"{power}")
            else:
                print("Not steady")

So you are looping through all cells of a single column, here all_columns[1].
Assume the first cell of each column might contain a header whose value is of type string (type(cell.value) == str).
Then you have two possibilities:
Given that the first cell of each column (in row 1) is a header, take advantage of that position.
If all other cells contain numerical values, you can treat only the str values differently, as presumed headers.
def power_of(value):
    # either define boundaries x,y here or global
    power = float(value)  # defensive conversion, some values might erroneously be stored as text in Excel
    if x < power < y:
        return f"{power}"
    return "Not steady"  # default return instead of else
def pm1():
    for cell in all_columns[1]:
        if cell.row == 1:  # assume the header is always in the first row
            print(cell.value)  # print header
        else:
            print(power_of(cell.value))
pm1()
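The code above takes the first possibility (checking the row position). The second one, treating a leading str as the header, might look like the sketch below. It works on a plain list of values (as you would get from `[cell.value for cell in column]`); the helper names, the sample column and the bounds 4 and 10 are all hypothetical.

```python
def split_header(values):
    """Separate a leading string header from the remaining cell values."""
    if values and isinstance(values[0], str):
        return values[0], values[1:]
    return None, values

def classify(values, low, high):
    """Return the header (if any) followed by each value or 'Not steady'."""
    header, data = split_header(values)
    lines = [] if header is None else [header]
    for value in data:
        power = float(value)  # defensive conversion, as in power_of above
        lines.append(f"{power}" if low < power < high else "Not steady")
    return lines

# Hypothetical column: a header, two numbers, and a number stored as text
column = ["Power", 5, 12, "7.5"]
print(classify(column, 4, 10))  # ['Power', '5.0', 'Not steady', '7.5']
```

Only the first value is tested for being a header, so numbers stored as text further down are still converted rather than mistaken for headers.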

Having troubles to write list in a new row every time during while loop with openpyxl

I'm having trouble inserting the list values into a new row every time the code loops.
In short, every time the code loops I need it to write the values from lst into a separate row.
I have done extensive research to understand how this works and found plenty of examples, but unfortunately I couldn't get the code to work by following them.
This is the code:
max_value = 5
lst = []
while True:
    wb = Workbook()
    ws = wb.active
    num = (max_value)
    for n in range(num):
        weight = float(input('Weight: '))
        lst.append(weight)
        ws.append(lst)
        if n+1 == max_value:
            wb.save(filename='path')
I have tried adding ws.insert_rows(idx=2, amount=1) just before the line ws.append(lst), like this:
...
        ws.insert_rows(idx=2, amount=1)
        ws.append(lst)
        if n+1 == max_value:
            wb.save(filename='path')
but it doesn't do anything, I suppose because it needs something that tells the code to write the next values in that row.
I have also tried something like this:
...
        next_avail_row = len(list(ws.rows))
        ws.append(lst)
        if n+1 == max_value:
            wb.save(filename='path')
But here as well I'm not sure how to tell the code, after it finds next_avail_row = len(list(ws.rows)), to write in that row.
Thoughts?
EDIT:
At the moment if I enter at the prompt for instance:
1,2,3,4,5
it outputs:
if I continue inputting numbers for instance:
7,6,5,4,3
it outputs:
and so forth. What I expect is:
In short, every time the function is called it should write to the same file, one row below the previous one. I hope that is a clear explanation.

There are a couple of issues with your code.
First, your while loop. Every time it loops, it calls wb = Workbook() and eventually wb.save(filename='path'). This creates a new Excel workbook every time. Assuming the filename in your wb.save() call is the same each time, every save on the new workbook overwrites the previously saved workbook with that file name.
Next, the list that holds the weight values you enter. You aren't clearing the list, so each time you add something to it in the loop the list just keeps growing.
Also, the line num = (max_value) doesn't really do anything, and you need some kind of condition to break out of your while loop. Using while True will keep the loop going forever unless you break out of it at some point.
Here is some code that should do this the way that you want it to:
from openpyxl import Workbook

max_value = 5
line_count = 0
wb = Workbook()
ws = wb.active
while True:
    lst = []  # start each row with a fresh, empty list
    for n in range(max_value):
        weight = float(input('Weight: '))
        lst.append(weight)
    ws.append(lst)  # write the completed row below the previous one
    line_count += 1
    # an example condition to break out of the loop
    if line_count >= 5:
        break
wb.save(filename='path')
Here, your Workbook object is only being created and saved once so it isn't opening a new one each time and overwriting the previous Workbook. The list lst is being emptied each time through the loop so that your lines will only be as long as the value in max_value. I also added in an example way of breaking out of your while True: loop. The way I set it is that once there have been 5 lines that you have added to the workbook, it will break out of the loop. You can create any condition you want to break out of the loop, this is just for example purposes.
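The row-building part can be tried without openpyxl or keyboard input. In this sketch the helper `build_rows` and the list `entered` are hypothetical; each resulting row is what you would hand to ws.append().

```python
def build_rows(weights, row_length):
    """Group a flat sequence of weights into rows of row_length values."""
    rows = []
    for start in range(0, len(weights), row_length):
        # A fresh list per row, so earlier rows are never extended
        row = list(weights[start:start + row_length])
        rows.append(row)
    return rows

# Hypothetical input: the two batches of numbers from the question
entered = [1, 2, 3, 4, 5, 7, 6, 5, 4, 3]
for row in build_rows(entered, 5):
    print(row)  # each row could be passed to ws.append(row)
```

Because every row is a new list, the second batch lands in its own row rather than being appended to the first.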

Python pandas UIPath - List assignment index out of range

I am currently facing an error
list assignment index out of range
within the invoke Python scope. I am just trying to check if each of the variables contains any of the string mentioned in 'a'. If yes then add it as a row to the excel sheet.
import pandas as pd
import xlsxwriter
def excel_data(mz01arg,p028arg,p006arg,s007arg,mz01desc,p028desc,p006desc,s007desc):
    listb=[]
    a=['MZ01','P028','P006','S007']
    if any (x in mz01arg for x in a) is True:
        listb[0] = [mz01arg]
    else:
        listb[0] = []
    if any (x in p028arg for x in a ) is True:
        listb[1] = [p028arg]
    else:
        listb[1] =[]
    if any (x in p006arg for x in a) is True:
        listb[2]=[p006arg]
    else:
        listb[2] = []
    if any (x in s007arg for x in a) is True:
        listb[3]=[s007arg]
    else:
        listb[3]=[]
    df1 = pd.DataFrame({'SODA COUNT': listb})
    df2 = pd.DataFrame({'SODA RISK DESCRIPTION': [mz01desc,p028desc,p006desc,s007desc]})
    writer = pd.ExcelWriter(r"D:\Single_process_python\try_python.xlsx", engine='xlsxwriter')
    df3 = pd.concat([df1,df2],axis=1)
    df3.to_excel(writer,sheet_name='Sheet1', index=False)
    writer.save()

You can't write to an element that does not yet exist. listb=[] creates an empty list, so there is no element with index 0. You may append items like this: listb.append(foo).
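As a rough sketch of the append-based version, the four if/else blocks can be collapsed into one loop. The helper name `matching_codes` and the sample argument values are hypothetical; only the list `a` of codes comes from the question.

```python
def matching_codes(args, codes):
    """Return one entry per argument: [arg] if it contains any code, else []."""
    result = []
    for arg in args:
        if any(code in arg for code in codes):
            result.append([arg])  # append creates the element; no index needed
        else:
            result.append([])
    return result

codes = ['MZ01', 'P028', 'P006', 'S007']
# Hypothetical argument values standing in for mz01arg, p028arg, p006arg, s007arg
args = ['MZ01-3', 'none', 'xP006', '']
print(matching_codes(args, codes))  # [['MZ01-3'], [], ['xP006'], []]
```

The result always has one entry per argument, so it lines up with the four descriptions when both are put into DataFrames.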
However, since you mentioned UiPath - I would recommend checking variables and their values in the workflow instead of your Python script. This way your script does one thing, and one thing exactly - and the workflow itself makes sure that all prerequisites are met. If not, you can throw and catch error messages, for example in another workflow, and ask users for input. If that logic is part of your script, this will be much harder.

Openpyxl: can't seem to get the syntax right for what I'm trying to do (read next cell down)

Here's the code I have. I'm simply trying to open a spreadsheet (which currently only has information in the first column), and read the first cell, print the information, then later along the lines loop back up to the top and read the next cell down. What am I doing wrong?
import openpyxl
wb = openpyxl.load_workbook('students2.xlsx')
ws = wb.active
PrintNext = True #Starts my while statement
while True :
    for i in range(1,300): #I think this is where I'm having an issue?
        for j in range(1,2):
            StudentID = ws.cell(row=i+1, column=j).value
            print(StudentID)
            PrintNext = False #This gets it to move on from my while
    pass
    PrintNext = True #This is to get it to go back to my while
    print(StudentID) #This is to test that it has the next cell down
I found the solution with the help of the answer here, but I found a much better solution over-all.
Set these to variables:
for i in range(RowX,RowY):
    for j in range(ColX,ColY):
        StudentID = ws.cell(row=i+1, column=j).value
So that any changes you make ("RowX = RowX + 1", for example) are reflected the next time you update the "for" statement!

You could use the cell.offset() method.

Cannot set an array element with a sequence

I'm using the NumPy python library to run large-scale edits on a .csv file. I'm using this python code:
import numpy as np
def main():
    try:
        e,a,ad,c,s,z,ca,fn,ln,p,p2,g,ssn,cn,com,dob,doh,em = np.loadtxt(r'c:\wamp\www\_quac\carryover_data\SI\Employees.csv',delimiter=',',unpack=True,dtype='str')
        x=0
        dob = dob.split('/')
        for digit in dob:
            if len(digit) == 1:
                digit = str('0'+digit)
        dob = str(dob[2]+'-'+dob[0]+'-'+dob[1])
        doh = doh.split('/')
        for digit in doh:
            if len(digit) == 1:
                digit = str('0'+digit)
        doh = str(doh[2]+'-'+doh[0]+'-'+doh[1])
        for eID in e:
            saveLine=eID+','+a[x]+','+ad[x]+','+c[x]+','+s[x]+','+z[x]+','+ca[x]+','+fn[x]+','+ln[x]+','+p[x]+','+p2[x]+','+g[x]+','+ssn[x]+','+cn[x]+','+com[x]+','+dob[x]+','+doh[x]+','+em[x]+'\n'
            saveFile = open('fixedEmployees.csv','a')
            saveFile.write(saveLine)
            saveFile.close()
            x+=1
    except Exception as e:
        print(str(e))
main()
dob and doh contain a string, e.g. 4/26/2012 and I'm trying to convert these to mysql friendly DATE forms, e.g. 2012-04-26. The error that is printed when I run this script is
cannot set an array element with a sequence
It does not specify a line and so I don't know what this really means. I'm pretty new to python; I've checked other questions with this same error but I can't make sense of their code. Any help is very appreciated.

Try using zfill to reformat the date string so you can have a '0' before your '4'. (zfill pads a string on the left with zeros to fill the width.)
doh = '4/26/2012'
doh = doh.split('/')
for i, s in enumerate(doh):
    doh[i] = s.zfill(2)
doh = doh[2]+'-'+doh[0]+'-'+doh[1]
# result: '2012-04-26'
As for the "cannot set an array element with a sequence" error, it would be helpful to know where it occurs. I'm guessing there is something wrong with the structure of the array.
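Wrapped in a function, the zfill approach can be applied to each date string in turn. This is a minimal sketch that assumes every input is in M/D/YYYY form; the function names are hypothetical, and the second variant swaps in the standard library's datetime parsing as an alternative that also rejects malformed dates.

```python
from datetime import datetime

def to_mysql_date(us_date):
    """Convert 'M/D/YYYY' to 'YYYY-MM-DD' by zero-padding month and day."""
    month, day, year = us_date.split('/')
    return f"{year}-{month.zfill(2)}-{day.zfill(2)}"

def to_mysql_date_strict(us_date):
    """Same conversion via datetime, which raises ValueError on invalid dates."""
    return datetime.strptime(us_date, '%m/%d/%Y').strftime('%Y-%m-%d')

print(to_mysql_date('4/26/2012'))         # 2012-04-26
print(to_mysql_date_strict('4/26/2012'))  # 2012-04-26
```

Either function can be called once per element when looping over a column of date strings, instead of calling split() on the whole array.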

Ok, to solve it I had to do a couple things. After removing the try-except commands, I found out that the error was on line 5, the line with e,a,ad,c,s etc. I couldn't eliminate the problem until I simply copied the 2 columns I wanted to focus on only and made a new program for dealing with those.
Then I had to create a .txt instead of a .csv because Excel auto-formats the dates and literally changes the values before I can even touch them. There is no way around that, I've learned. You can't turn the date-auto-format off. A serious problem with excel. So here's my solution for this NumPy script (it changes the first column and keeps the second the same):
import numpy as np
def main():
    dob,doh=np.loadtxt('temp.csv',
                       delimiter=',',
                       unpack=True,
                       dtype='str')
    x=0
    for eachDate in dob:
        if any(c.isalpha() for c in eachDate):
            newDate=eachDate
        elif (eachDate == ''):
            newDate=''
        else:
            sp = eachDate.split('/')
            y=0
            ndArray = ['','','']
            for eachDig in sp:
                if len(eachDig) == 1:
                    eachDig = str('0'+eachDig)
                if y == 0:
                    ndArray[0] = eachDig
                elif y == 1:
                    ndArray[1] = eachDig
                elif y == 2:
                    ndArray[2] = eachDig
                    newDate=str(ndArray[2]+'-'+ndArray[0]+'-'+ndArray[1])
                    y=0
                y+=1
        print(eachDate+'--->'+newDate)
        """creates a .txt file with the edited dates"""
        saveLine=str(newDate+','+doh[x]+'\n')
        saveFile=open('__newTemp.txt','a')
        saveFile.write(saveLine)
        saveFile.close()
        x+=1
main()
I then used Data->Import from text with "TEXT" format option in Excel to get the column into my .csv. I realize this is probably bulky and noobish but it got the job done :3
