I am testing a simple Python script to collect images. I set a fixed time to run the script every day, but I want to keep a count that increases continuously across runs.
schedule.every().day.at(time1).do(job)
I realized that if I don't do that, each day's new images will overwrite the old ones. I want a way to properly count/name the images downloaded on the next day. Can anyone help?
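One way to do that, sketched below with the same `schedule` call from the snippet: persist the running count in a small text file that the job reads before each download and rewrites afterwards. This is a minimal sketch; the counter file name, the "09:00" time string, and the image-naming pattern are placeholder assumptions, not anything from the question.

```python
# Minimal sketch: persist a running counter across daily runs.
# "counter.txt", the 09:00 time, and the file-name pattern are placeholders.
import os
import time
import schedule

COUNTER_FILE = "counter.txt"

def load_count():
    # First run: the counter file does not exist yet, so start at 0
    if os.path.exists(COUNTER_FILE):
        with open(COUNTER_FILE) as f:
            return int(f.read().strip())
    return 0

def job():
    count = load_count()
    filename = f"image_{count:06d}.jpg"  # new name each day, never overwrites
    # ... download the image and save it under `filename` here ...
    with open(COUNTER_FILE, "w") as f:
        f.write(str(count + 1))

schedule.every().day.at("09:00").do(job)

while True:
    schedule.run_pending()
    time.sleep(60)
```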
Related
What's the best way to save a for-loop count variable's state so I can run the same script on different days?
I'm trying to loop over a large DataFrame on different days, and I want to keep track of the row so I don't start the loop from the beginning every time but from the row where I stopped the previous time.
I've seen people recommend using pickle to save state, but when I tried it I couldn't make it work, because on the very first run there is no saved variable to load. Is there a way to do it that doesn't involve pickle?
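One pickle-free sketch: write the last processed row index to a plain text file, and fall back to 0 when the file is missing, which handles the first-run case that trips up the pickle approach. The file names here are placeholder assumptions.

```python
# Minimal sketch: resume a DataFrame loop across days without pickle.
# "progress.txt" and "large_data.csv" are placeholder names.
import pandas as pd

PROGRESS_FILE = "progress.txt"

def load_start_row():
    try:
        with open(PROGRESS_FILE) as f:
            return int(f.read().strip())
    except FileNotFoundError:
        return 0  # first run: no saved state yet, start at the top

df = pd.read_csv("large_data.csv")

for i in range(load_start_row(), len(df)):
    row = df.iloc[i]
    # ... process `row` here ...
    with open(PROGRESS_FILE, "w") as f:
        f.write(str(i + 1))  # next run resumes at the following row
```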
I'm trying to use Python to compute and experiment on some data from some files.
Parsing and computing those files can take up to 20 minutes and always leads to exactly the same result. I want to experiment on that result.
Is there a way (programmatically or with Spyder) to compute this data only once per Python console and keep it in memory, so the script doesn't have to recompute it each time I run my code?
Am I clear? ^^'
You can use pickle: store the modified and parsed data on disk, then load it back when you need it.
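A minimal sketch of that caching pattern, with `parse_files()` and the cache file name standing in for the real 20-minute computation:

```python
# Minimal sketch: compute once, cache with pickle, reload on later runs.
# parse_files() and "parsed.pkl" are placeholders.
import os
import pickle

CACHE_FILE = "parsed.pkl"

def parse_files():
    # stands in for the ~20-minute parsing/computation step
    return {"example": 42}

if os.path.exists(CACHE_FILE):
    with open(CACHE_FILE, "rb") as f:
        data = pickle.load(f)   # fast path: reuse the cached result
else:
    data = parse_files()        # slow path: compute once
    with open(CACHE_FILE, "wb") as f:
        pickle.dump(data, f)

# experiment on `data` from here on
```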
I'm currently running a script in a Jupyter Notebook that loops over a DataFrame and manipulates the data of the current row. As my DataFrame has thousands of rows and each iteration takes a while, I am wondering whether it's safe to interrupt the script without losing all of my progress.
I am keeping track of the rows that have already been processed, so I could start where I left off, provided the manipulations on the DataFrame don't get lost. I don't want to risk just trying it out, so advice would be appreciated.
Unless you are storing progress in external files, interrupting Jupyter will lose your data. I strongly recommend against counting on Jupyter's in-memory variables being in any particular state if you interrupt mid-calculation; save intermediate steps to files to track progress, chunking as you go.
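A minimal sketch of that chunking idea, assuming pandas; the chunk size and the file names are arbitrary illustrative choices:

```python
# Minimal sketch: process a DataFrame in chunks and checkpoint each chunk
# to disk, so an interrupt only loses the chunk currently in flight.
import glob
import os
import pandas as pd

CHUNK_SIZE = 500
df = pd.read_csv("input.csv")  # placeholder input

for start in range(0, len(df), CHUNK_SIZE):
    out_path = f"chunk_{start:08d}.csv"
    if os.path.exists(out_path):
        continue  # this chunk survived an earlier, interrupted run
    chunk = df.iloc[start:start + CHUNK_SIZE].copy()
    # ... manipulate the rows of `chunk` here ...
    chunk.to_csv(out_path, index=False)

# afterwards, stitch the finished chunks back together
result = pd.concat(pd.read_csv(p) for p in sorted(glob.glob("chunk_*.csv")))
```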
As I have written a few times before here, I am writing a programme to archive user-specified files at a certain time interval. The user also specifies when these files shall be deleted, so each file has a different archive and delete interval associated with it.
I have written pretty much everything, including extracting the timings for each file in the list and working out when the next archive/delete time would be (relative to the current time).
I am struggling to put it all together, i.e. to actually schedule these two processes (archive and delete archive) for each file with its individual time intervals. I guess the two functions have to run in the background, but only execute when the clock strikes the required time.
I have looked into scheduler, timeloop, and threading.Timer, but I don't see how to set a different time interval for each file in the list and make it work for both the archive and delete processes without them interfering. I also came across the concept of 'cron jobs': can anyone tell me whether that is on the right track? I'm just looking for ideas from more experienced programmers about what I might be missing and what I should look into.
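For what it's worth, threading.Timer (one of the options listed above) can carry a different interval per file if each callback reschedules itself. A minimal sketch, with the file names, interval values, and function bodies all being placeholders:

```python
# Minimal sketch: per-file archive/delete timers that reschedule themselves.
# The file names, intervals, and function bodies are placeholders.
import threading

INTERVALS = {  # per-file intervals in seconds (illustrative values)
    "notes.txt": {"archive": 3600, "delete": 86400},
    "report.pdf": {"archive": 7200, "delete": 172800},
}

def start_timer(func, path, seconds):
    t = threading.Timer(seconds, func, args=(path,))
    t.daemon = True  # do not block interpreter exit
    t.start()

def archive(path):
    # ... archive `path` here ...
    start_timer(archive, path, INTERVALS[path]["archive"])  # reschedule

def delete_archive(path):
    # ... delete the archive of `path` here ...
    start_timer(delete_archive, path, INTERVALS[path]["delete"])

for path, iv in INTERVALS.items():
    start_timer(archive, path, iv["archive"])
    start_timer(delete_archive, path, iv["delete"])

threading.Event().wait()  # keep the main thread alive for the daemon timers
```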
I have a list of websites from which I need to extract certain values in order to keep a local .txt file up to date. Since the websites need to be checked at different time intervals, I would prefer not to use the Windows Task Scheduler but instead have a single script running continuously in the background, extracting the information from each website at its specified frequency (so the frequency for each website would be an input parameter) and keeping the file updated.
I know how to extract the information from the websites, but I don't know how to schedule the checks in an automated fashion and keep the script running continuously in the background. Knowing how to stop it would be useful too. (I have Anaconda Python installed on Windows 7.)
What is an efficient way of coding that?
Thanks.
PS clarification: the script just needs to run as a background job once started and harvest some text from a number of predefined URLs. So my questions are: a) How do I set it to run as a background job? A while loop? Something else? b) How do I make it return to a URL to harvest the text at pre-specified intervals?
Given that it doesn't need to be a hidden process, and that the Windows Task Scheduler is unsuitable (since you need different recurrences per site), it sounds like you just want a simple Python process that calls your extraction function on an irregular but predetermined schedule.
This sounds a lot like a job for APScheduler (https://pypi.python.org/pypi/APScheduler/) to me. I've used it a lot on Linux and it has worked like a charm for cron-like features. The package docs say it is cross-platform, so it should fit the bill.
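A minimal sketch of what that could look like with APScheduler 3.x's BlockingScheduler; the URLs, the minute values, and the check_site() body are placeholder assumptions:

```python
# Minimal sketch: one APScheduler job per site, each on its own interval.
# The URLs, minute values, and check_site() body are placeholders.
from apscheduler.schedulers.blocking import BlockingScheduler

SITES = {
    "https://example.com/a": 5,   # check every 5 minutes
    "https://example.com/b": 30,  # check every 30 minutes
}

def check_site(url):
    # ... fetch `url`, extract the values, update the local .txt file ...
    print(f"checking {url}")

scheduler = BlockingScheduler()
for url, minutes in SITES.items():
    scheduler.add_job(check_site, "interval", minutes=minutes, args=[url])

try:
    scheduler.start()  # blocks; stop with Ctrl+C
except (KeyboardInterrupt, SystemExit):
    scheduler.shutdown()
```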