Python Using DataFrame to save state of for loop - python

What's the best way to save the state of a for-loop counter so that the same script can resume on a different day?
I'm looping over a large dataframe across several days, and I want to keep track of the row so that each run starts from the row where the previous run stopped instead of from the beginning.
I've seen people recommend pickle for saving state. I tried it but couldn't make it work, because on the first run I would have to load the count variable when there is nothing to load yet. Is there a way to do this that doesn't involve pickle?
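One possible approach, sketched under assumptions: keep the position in a small JSON file and treat a missing file as "start at row 0", which sidesteps the first-run problem. The file name `loop_state.json`, the helper names, and the `handle_row` callback are all hypothetical; `df` is assumed to be a pandas DataFrame, of which only `len(df)` and `df.iloc[pos]` are used.

```python
import json
import os

CHECKPOINT_PATH = "loop_state.json"  # hypothetical file name


def load_start_row(path=CHECKPOINT_PATH):
    """Return the saved row position, or 0 when no checkpoint exists yet."""
    if not os.path.exists(path):
        return 0  # first run: nothing to load, so start at the top
    with open(path) as f:
        return json.load(f)["next_row"]


def save_next_row(next_row, path=CHECKPOINT_PATH):
    """Write the position to a temp file, then swap it into place."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump({"next_row": next_row}, f)
    os.replace(tmp_path, path)  # avoids leaving a half-written checkpoint


def process_rows(df, handle_row, max_rows, path=CHECKPOINT_PATH):
    """Process up to max_rows rows, resuming from the saved position."""
    start = load_start_row(path)
    stop = min(start + max_rows, len(df))
    for pos in range(start, stop):
        handle_row(df.iloc[pos])     # stand-in for the real per-row work
        save_next_row(pos + 1, path)  # record progress after each row
```

Calling `process_rows(df, handle_row, max_rows=500)` on each day would then pick up where the previous call stopped. Saving after every row is the simplest choice; if the per-row work is cheap, saving every N rows would reduce file writes at the cost of redoing up to N rows after a crash.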

Related

How can I make a chart that automatically adds info with openpyxl?

I'm trying to create a tracker of things I do every day, such as the amount of water I drink and how much I exercise or read, and store that info in two sheets: one with the last 365 days and one with all the days.
I also want an Excel graph that I can glance at quickly (meaning I don't want to have to set it up every time I want to see it). Doing that for the 365-day sheet should be easy by setting a locked range of 365 cells, but in any case it would only look pretty once I've been using it for some months.
This webpage shows exactly how to do it when you input the info manually, but I'm using openpyxl to input the info for me via a telegram bot. I tried doing that then adding a value right below the table with code and it doesn't work.
My theory is that once you call that cell it gets emptied of any kind of attributes (format) it might have before setting a value.
Example: If you give the cells (B2:D4) all borders and then execute ws['C3'] = 5, it will remove the borders of that cell because you didn't explicitly tell Excel to "keep them" (or, more accurately, set them again).
The example can be fixed using openpyxl.styles, but how do I make Excel "check" the new values, or add them to the existing table? I assume that if I get them to join the table, they will be added to the graph too.
This is a personal project, not an assignment or anything, so if there's another way to do the same, I'm okay with that.
Edit: I've been playing with dir(ws.tables['Table1']) and found an (attribute?) called insertRow, but I can't find anything at all that explains what it does, only how it works (Boolean that allows None). I tried manually changing it from None to True and then trying again, but when I opened the file it said it needed to be repaired, and when repaired it added the value but didn't make it join the table...

Why is google colab not allowing me to add data to a csv file?

I am trying to load a file into a pandas dataframe in Google Colab and then, in a for loop, add data to the CSV file using
for i in range(100):
    res=some_processing()
    df.loc[len(df)] = [i,res]
    df.to_csv(df)
It was working fine yesterday, and I was able to store all the values in the CSV file. Today, though, it won't move beyond index 75. some_processing() takes around 10 minutes per call, so I let it run for a long time, but I noticed that it didn't store the last 25 or so values. It stored only up to 75 values, and no, that's not because Google Colab shut down.
I check which values are already done and skip them. So now when I start the program, it runs from 75. After 10 minutes I checked the CSV file in my Drive and it had the 76th entry written, so I stopped the program and ran it again. This time it skipped the 76th entry, but after another 10 minutes, when I checked the CSV file, the 77th value had overwritten the 76th value. When I ran it again, the 76th value overwrote the 77th. And so the cycle continues.
I don't understand why this is happening. It was working fine yesterday, and I was able to store up to 130 values when I wanted. It's as if len(df) got stuck. One more thing: when I manually added values to the dataframe and then stored it, that worked fine too. I don't know what's happening.

Wondering how I can store an address into a temporary block that will be wiped on a loop iteration

I am trying to come up with a way to pull 4 files from a directory and to store the address into a temporary file. I am using the random module in Python with the random.choice(directory) to pull the pictures. I need to assign these chosen files to temp variables, and on the next iteration wipe the variables so that 4 new files can be picked. I tried using lambda to assign it but it seems a bit too complex. Is there any way of doing this?
Thanks!
You could define the variables before the loop that executes 4 times.
At the beginning of the loop, before reading from the files, set the variables to None (Python's equivalent of NULL).
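As a rough sketch of a variation on that idea: if the four paths are kept in one list that is rebound at the top of each iteration, there is nothing to wipe by hand. Note that this swaps the question's `random.choice` for `random.sample`, which picks several distinct items in one call. The function names and the `directory` argument are hypothetical.

```python
import os
import random


def pick_files(directory, k=4):
    """Return k distinct file paths chosen at random from directory."""
    names = [
        name for name in os.listdir(directory)
        if os.path.isfile(os.path.join(directory, name))  # skip subfolders
    ]
    # random.sample raises ValueError if the directory holds fewer than k files
    return [os.path.join(directory, name) for name in random.sample(names, k)]


def run_rounds(directory, rounds):
    """Pick a fresh set of files on every iteration and collect the picks."""
    history = []
    for _ in range(rounds):
        chosen = pick_files(directory)  # rebinding replaces the previous picks
        history.append(chosen)
    return history
```

Different rounds may still pick some of the same files; if that matters, the already-used names would have to be tracked and excluded.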

Will interrupting the script delete the progress in Jupyter Notebook?

I'm currently running a script in Jupyter Notebook which loops over a Dataframe and manipulates the data of the current row. As my Dataframe has thousands of rows and each loop takes a while to run, I am wondering whether it's safe to interrupt the script without losing all of my progress?
I am keeping track of rows that have already been processed so I could just start where I left off in the case that the manipulations on the Dataframe don't get lost. I don't want to take the risk of trying it out right now so advice would be appreciated.
Unless you are storing progress in external files, interrupting Jupyter risks losing your data. I strongly recommend against counting on the variables inside Jupyter being in any particular state if you interrupt midway through a calculation. Instead, save intermediate steps to files to track progress, chunking as you go.
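A minimal sketch of what saving in chunks could look like, assuming the per-row results can be written as CSV rows: each finished chunk is appended to an output file, and the number of rows already in that file tells the next run where to resume. The names `rows_done`, `process_in_chunks`, and `transform` are hypothetical, and the standard-library `csv` module is used here rather than pandas to keep the example self-contained.

```python
import csv
import os


def rows_done(out_path):
    """Count result rows already saved; 0 if the file does not exist yet."""
    if not os.path.exists(out_path):
        return 0
    with open(out_path, newline="") as f:
        return sum(1 for _ in csv.reader(f))


def process_in_chunks(items, transform, out_path, chunk_size=100):
    """Apply transform to items chunk by chunk, appending each finished chunk."""
    start = rows_done(out_path)  # resume after the last saved row
    for lo in range(start, len(items), chunk_size):
        chunk = items[lo:lo + chunk_size]
        results = [(lo + i, transform(x)) for i, x in enumerate(chunk)]
        # Only a fully computed chunk is written, so an interruption during
        # transform costs at most the work of the current chunk.
        with open(out_path, "a", newline="") as f:
            csv.writer(f).writerows(results)
```

With a DataFrame, `items` could be something like a column converted with `.tolist()`. The output file deliberately has no header row, so that the row count equals the number of processed items.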

How to set (global) variables under scheduled repeating task?

I am testing a simple Python script that collects images. It is scheduled to run at a fixed time every day, and I want to keep a count that keeps increasing across runs.
schedule.every().day.at(time1).do(job)
I realized that if I don't do that, new images will overwrite the old ones. I want a way to properly count and name the images downloaded on the following days. Can anyone help?
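One way this might be handled without a global variable, as a sketch only: derive the next number from the files already on disk, so the count survives restarts of the script. The naming scheme (`img_00001.jpg` and so on) and the function names are assumptions, not something taken from the question.

```python
import os
import re


def next_image_index(directory, prefix="img_", ext=".jpg"):
    """Return one more than the highest index among existing image files."""
    pattern = re.compile(re.escape(prefix) + r"(\d+)" + re.escape(ext))
    highest = 0
    for name in os.listdir(directory):
        match = pattern.fullmatch(name)  # ignore files that don't fit the scheme
        if match:
            highest = max(highest, int(match.group(1)))
    return highest + 1


def next_image_path(directory, prefix="img_", ext=".jpg"):
    """Build a zero-padded path for the next image to be saved."""
    index = next_image_index(directory, prefix, ext)
    return os.path.join(directory, f"{prefix}{index:05d}{ext}")
```

Inside the scheduled `job`, each downloaded image could be written to `next_image_path(directory)`, so that a later run continues the numbering instead of overwriting earlier files.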
