Python nested for loop for creating excel sheets - python

I'm creating Excel sheets with Python. I'm trying to use a nested for loop to fill in some cells on a spreadsheet and it's not going well. For each row in a given list of rows, I want it to enter the next even number into a cell. So basically it should look something like this:
2
4
6
8
etc. (One value per cell)
but instead it comes out like:
24
24
24
24
All the cells have the same value.
Aside from the obvious formatting issues (I'm not finished with the formatting part), it prints the last number in the nested loop for every single cell. From my testing it appears to be fully executing the inner loop for every cell. I'm using XlsxWriter if that helps. I'm not sure how to get it to stop. I'm guessing it's pretty obvious, but I haven't done any actual "programming" in years so the solution is eluding me. Here's the loop in question:
for items in purple_rows:
    riser_cable.write(items,0,'Company Name',format8)
    riser_cable.write(items,1,None,format8)
    riser_cable.write(items,2,None,format8)
    riser_cable.write(items,3,'Riser',format8)
    for i in range(2,26,2):
        riser_cable.write(items,4,i,format8)
        print(i)
The last 3 lines are ones causing problems.
Thanks for the help!
Edit: The sheet should look like this http://imgur.com/odSaT2D but the code presently turns the entire "Port" column to 24.

Your lines:
    for i in range(2,26,2):
        riser_cable.write(items,4,i,format8)
write all the numbers into the same cell, so you only see the last one, 24. This is because items does not change while the inner loop runs. Try
    for i,items in enumerate(purple_rows):
        riser_cable.write(items,4,(i % 12 + 1) * 2,format8)
This moves to a new row on every iteration and puts a different value in each row: 2, 4, ..., 24, then starting again at 2.

Python's range excludes its end value, so range(2,26,2) runs from 2 up to 24 and never reaches 26.
Otherwise, your loop is fine. With
    riser_cable.write(items,4,i,format8)
it seems you are just overwriting column index 4 with 2,4,...,24 in turn. If the values are meant to go across the row, you would have to increment the column index as well.
Try this:
    for i in range(1,13):
        riser_cable.write(items,3+i,i*2,format8)
        print(i)

I figured out the issue. I ended up changing the code to include a count so it's:
count=0
for items in purple_rows:
    count+=2
    riser_cable.write(items,4,count,format8)
    riser_cable.write(items,10,count,format8)
    if count==24:
        count=0
Thanks everyone!
That fixed the issue
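The counter-and-reset pattern above can also be written with itertools.cycle. This is only a sketch: port_numbers and the sample row indices are made up for illustration, and the actual XlsxWriter call is left as a comment so the snippet runs without a workbook.

```python
from itertools import cycle


def port_numbers(rows, start=2, stop=24, step=2):
    """Pair each row index with the next even number, wrapping around after `stop`."""
    values = cycle(range(start, stop + step, step))
    return list(zip(rows, values))


purple_rows = [3, 4, 5, 8, 9]  # hypothetical row indices, not from the question

for row, port in port_numbers(purple_rows):
    # With XlsxWriter this would be: riser_cable.write(row, 4, port, format8)
    print(row, port)
```

Each row gets exactly one value, which is the key difference from the nested loop that rewrote the same cell twelve times.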

Related

How to delete specific cells in a pandas dataframe giving some conditions?

I want to clear the contents of the location cell for the first two occurrences of every duplicated last name.
For example, I want to clear out the first 2 location occurrences for Balchuinas and only keep the 3rd one. The same goes for London and Fleck. I ONLY want to clear out the location cells, not complete rows.
Any help?
I tried the .drop_duplicates(keep='last') method, but that removes the whole row. I only want to clear the contents of the cells (or change them to NaN if that's possible).
PS: This is my first time asking a question, so I'm not sure how to paste the image without a link. Please help!
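Rather than removing the duplicate rows, I would suggest finding the duplicated values and replacing them with NaN while keeping the last one. Restrict the duplicate check to the last-name column and assign only to the location column, otherwise whole rows are overwritten.
Something like this (assuming the columns are called 'last name' and 'location'):
df.loc[df.duplicated(subset='last name', keep='last'), 'location'] = float('nan')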
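A small worked example may make the effect easier to see. The column names `last name` and `location` and all location values are assumptions, since the original data was only posted as an image; only the three last names come from the question.

```python
import pandas as pd

# Made-up sample data; the real frame was only shown as an image.
df = pd.DataFrame({
    "last name": ["Balchuinas", "Balchuinas", "Balchuinas", "London", "London", "Fleck"],
    "location": ["Vilnius", "Kaunas", "Riga", "Leeds", "York", "Bonn"],
})

# True for every row whose last name appears again further down the frame.
earlier_duplicate = df.duplicated(subset="last name", keep="last")

# Blank out only the location cell on those rows; the rows themselves stay.
df.loc[earlier_duplicate, "location"] = float("nan")

print(df)
```

The frame keeps all six rows, and only the last occurrence of each last name still has a location.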

Is there a Python pandas function for retrieving a specific value of a dataframe based on its content?

I've got multiple Excel files and I need a specific value from each, but the cell holding the value changes position slightly from file to file. However, this value is always preceded by a generic description which remains constant in all files.
I was wondering if there was a way to ask Python to grab the value to the right of the element containing the string "xxx".
Try iterating over the Excel files (I guess you loaded each as a separate pandas object?),
something like for df in [dataframe1, dataframe2...dataframeN].
Then you could pick the column you need (if the column stays constant), e.g. df['columnX'], and find which index it has:
df.index[df['columnX']=="xxx"]. It may make sense to add .tolist() at the end, so that if "xxx" is a value that repeats more than once, you get all occurrences in a list.
The last step would be to take the index+1 to get the value you want.
Hope it was helpful.
In general I would highly suggest being more specific in your questions and providing code / examples.
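If the label can move between columns as well as rows, one option is to scan every cell. The sketch below works on plain lists of rows; value_right_of and the sample sheets are hypothetical, and "xxx" stands in for the constant description from the question.

```python
def value_right_of(rows, label):
    """Return the cell just right of the first cell whose text contains `label`."""
    for row in rows:
        # Skip the last cell of each row: nothing sits to its right.
        for col, cell in enumerate(row[:-1]):
            if isinstance(cell, str) and label in cell:
                return row[col + 1]
    return None  # label not found


# Two made-up sheets where the label sits in different places.
sheet_a = [
    ["Report", None, None],
    [None, "Total xxx", 41.5],
]
sheet_b = [
    ["Total xxx", 17.0, None],
]

print(value_right_of(sheet_a, "xxx"))
print(value_right_of(sheet_b, "xxx"))
```

With pandas, rows in this shape can be obtained by reading a sheet with pd.read_excel(path, header=None) and calling .values.tolist() on the result.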

Python. Deleting Excel rows while iterating. Alternative for OpenPyXl or solution for ws.max_rows wrong output

I'm working with Python on Excel files. Until now I was using OpenPyXl. I need to iterate over the rows and delete some of them if they do not meet specific criteria. Let's say I was using something like:
current_row = 1
while current_row <= ws.max_row:
    if 'something' in ws[f'L{current_row}'].value:
        ws.delete_rows(current_row)
        continue
    current_row += 1
Everything was fine until I ran into a problem with ws.max_row. In a new Excel file that I received for processing, ws.max_row returned more rows than the file really contains. After some googling I found out why this happens.
Here's a great explanation of the problem which I've found in the comment section on the Stack:
However, ws.max_row will not check if last rows are empty or not. If cell's content at the end of the worksheet is deleted using Del key or by removing duplicates, remaining empty rows at the end of your data will still count as a used row. If you do not want to keep these empty rows, you will have to delete those entire rows by selecting rows number on the left of your spreadsheet and deleting them (right click on selected row number(s) -> Delete) –
V. Brunelle
Thanks V. Brunelle for very good explanation of the cause of the problem.
In my case it is because some of the rows were deleted by removing duplicates. E.g. there are 400 rows in my file listed one by one (without any gaps), but ws.max_row returns 500.
For now I'm using a quick fix:
while current_row <= len([row for row in data_ws.iter_rows(min_row=min_row) if not all([cell.value is None for cell in row])]):
But I know that it is very inefficient. That's the reason why I'm asking this question. I'm looking for a possible solution.
From what I've found here on Stack Overflow I can:
Create a copy of the worksheet, iterate over that copy and call ws.delete_rows on the original worksheet, so I will need to run my fix only once.
Iterate backwards with a for loop, so I won't have to deal with ws.max_row, since for loops work fine in that case (they read the proper file dimensions). This method seems promising to me, but I always have 4 rows at the top of the workbook which I'm not touching at all, and any debugging would need to be done backwards as well, which might not be very enjoyable :D.
Use another Python library to process Excel files, but I don't know which one would be better, because keeping workbook styles (and changing them if needed) is very important to me. I've read some promising things about the pywin32 library (win32com.client), but its documentation seems lacking, it might be hard to work with, and I don't know how it performs. I was also considering pandas, but, to put it kindly, it messes up the styles (in reality it deletes all styles in the worksheet).
I'm stuck now, because I really don't know which route I should choose.
I would appreciate any advice or opinion on the topic, and if possible I would like to have a small discussion here.
Best regards!
If max_row doesn't report what you expect, you'll need to work around it as best you can. That might mean manually deleting the rows ("delete those entire rows by selecting rows number on the left of your spreadsheet and deleting them (right click on selected row number(s) -> Delete)"), or determining in your code what the last row really is and then perhaps programmatically deleting all the rows from there to max_row, so that it at least reports correctly on the next run.
You could also incorporate your fix into your example code for deleting rows that meet specific criteria.
For example, a test sheet has 9 rows of data but cell B15 holds an empty string, so max_row returns 15 rather than 9.
The example code checks every used cell in each row for a value of None and only processes the 9 rows with data.
from openpyxl import load_workbook

filename = "foo.xlsx"
wb = load_workbook(filename)
data_ws = wb['Sheet1']
print(f"Max Rows Reports {data_ws.max_row}")
for row in data_ws:
    # A row counts as data only if none of its used cells is empty
    if all(cell.value is not None for cell in row):
        print(f"Checking row {row[0].row}")
        if 'something' in data_ws[f'L{row[0].row}'].value:
            data_ws.delete_rows(row[0].row)
    else:
        # First row with an empty cell: the previous row was the last one with data
        print(f"Actual Max Rows is {row[0].row - 1}")
        break
wb.save('out_' + filename)
Output
Max Rows Reports 15
Checking row 1
Checking row 2
Checking row 3
Checking row 4
Checking row 5
Checking row 6
Checking row 7
Checking row 8
Checking row 9
Actual Max Rows is 9
Of course this is not perfect, if any of the 9 rows with data had one cell value of None the loop would stop at that point. However if you know that's not going to be the case it may be all you need.
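The "iterate backwards" idea from the question can be sketched without touching a workbook by working on plain row tuples first. The helper names and sample rows below are hypothetical; the rows are in the shape that openpyxl's ws.iter_rows(values_only=True) yields.

```python
def last_data_row(rows):
    """1-based number of the last row that holds any non-empty value (0 if none)."""
    last = 0
    for number, row in enumerate(rows, start=1):
        if any(value is not None and value != "" for value in row):
            last = number
    return last


def rows_to_delete(rows, column, needle, first_row=1):
    """Row numbers whose `column` contains `needle`, highest first.

    Deleting from the bottom up means earlier deletions never shift
    the row numbers that are still waiting to be deleted.
    """
    rows = list(rows)
    hits = []
    for number in range(last_data_row(rows), first_row - 1, -1):
        value = rows[number - 1][column]
        if isinstance(value, str) and needle in value:
            hits.append(number)
    return hits


# Made-up sheet: 5 data rows followed by 2 "used but empty" rows.
rows = [
    ("header", "x"),
    ("keep", "fine"),
    ("drop", "something here"),
    ("drop", "something else"),
    ("keep", "ok"),
    (None, None),
    (None, ""),
]

print(last_data_row(rows))
print(rows_to_delete(rows, column=1, needle="something", first_row=2))
```

With openpyxl, the returned numbers could then be passed one at a time to ws.delete_rows(number). Column L from the question would be column=11 here, and first_row=5 would skip the four untouched rows at the top.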

Compare row 1 and row 2 ( and so on) in same dataframe in python

I am working on a journal entry database. I want to check whether rows 1 and 2, then rows 2 and 3 (and so on), have the same amount for the same account number. This is to check debit/credit postings; please note that I have already sorted the data by the absolute value of the amounts.
Please find attached the example data that I am working on.
Can I do this without using a for loop?
If yes, then how? And if not, what would be the ideal for loop for this?
Included below is the for loop I tried, which didn't work:
for i, j in range(sales_acc):
    if sales_acc["Amount LC"][i]==sales_acc["Amount LC"][j] & sales_acc["ACCOUNTNUMBER"][i]==sales_acc["ACCOUNTNUMBER"][j]:
        Elimination.append(i)
        Elimination.append(j)
    else:
        pass
You cannot do
    for i, j in range(sales_acc):
First, range() yields a single integer at each iteration, so you cannot unpack it into two variables.
Second, you probably wanted to write range(len(sales_acc)).
Third, are you indexing correctly? If sales_acc were a list of records it would have to be sales_acc[i]["Amount LC"]; for a pandas DataFrame, sales_acc["Amount LC"][i] selects the column first and works as long as the index labels are 0, 1, 2, ...
Fourth, & binds more tightly than ==, so each comparison needs its own parentheses (or use and when comparing plain values).
Fifth, if it worked, look at what would happen if you had 3 consecutive identical rows: you would tag the second one for elimination twice.
Anyhow, you don't need two variables: just use i and i+1 to check the current entry and the next one.
Re: not using a for loop, there may be ways to avoid writing one explicitly, but a loop will always be executed somewhere, since this is exactly what you are doing: looping over your data :)
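The "use i and i+1" suggestion could look roughly like this. It is a sketch on plain dictionaries rather than on the asker's DataFrame; matching_neighbours and the sample entries are made up, while the column names come from the question.

```python
def matching_neighbours(records, keys=("ACCOUNTNUMBER", "Amount LC")):
    """Indices of rows equal to the row directly before or after them on all `keys`."""
    matched = set()  # a set, so a row in the middle of a run is only recorded once
    for i in range(len(records) - 1):
        if all(records[i][k] == records[i + 1][k] for k in keys):
            matched.update((i, i + 1))
    return sorted(matched)


# Made-up journal entries for illustration.
entries = [
    {"ACCOUNTNUMBER": 4000, "Amount LC": 100.0},
    {"ACCOUNTNUMBER": 4000, "Amount LC": 100.0},
    {"ACCOUNTNUMBER": 4000, "Amount LC": 100.0},
    {"ACCOUNTNUMBER": 4100, "Amount LC": 100.0},
    {"ACCOUNTNUMBER": 4100, "Amount LC": 250.0},
]

print(matching_neighbours(entries))
```

A DataFrame can be turned into this shape with sales_acc.to_dict("records"). Whether debit and credit amounts should be compared by absolute value is left open here, since the data was not shown.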

How to run for loop repeatedly for a set number of cells in Excel using Python?

I am new to using Python to do Excel stuff, so please bear with me. I created an Excel file and I want to start at 0 in a cell and add 10 repeatedly until I reach 350. Then I want to do this again, and again, and again. I start at 0 and add 10 until I reach 350. I figured out this takes up 37 rows. I figured out how to do this once, but I was wondering if there's a way to do this for every 37 rows. I am stuck because I can't change the range after it's been set. Otherwise, I would've added a line at the end of the loop to add 37 to the range.
i=0
while i<37:
    for row in range (1,37):
        worksheet.write_number(row, 3, i)
        i=i+10
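One possible approach, offered only as a sketch, is to derive the value from the row position with the modulo operator instead of changing the range. Note that 0 to 350 in steps of 10 is 36 values, so this version repeats every 36 rows; repeating_rows is a hypothetical helper, and the XlsxWriter call is left as a comment.

```python
def repeating_rows(total_rows, stop=350, step=10, first_row=1):
    """Yield (row, value) pairs that count 0, 10, ..., stop and then start over."""
    values = range(0, stop + step, step)  # 0, 10, ..., 350 -> 36 values
    for offset in range(total_rows):
        yield first_row + offset, values[offset % len(values)]


pairs = list(repeating_rows(74))

for row, value in pairs[:3]:
    # With XlsxWriter this would be: worksheet.write_number(row, 3, value)
    print(row, value)
```

If every block really should span 37 rows (for example with a blank row between blocks), the block length would need to be adjusted accordingly.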
