Loop method call is not executing fully? - python

I have a loop with a delay that calls a method inside my class. However, it only runs fully once; after that, only the first line gets called, which I can see from my logs in the terminal.
import datetime
import logging
import time
# Scraper, color, wait, exceptions, and the supplier classes (BullionByPostUK, Chards) come from elsewhere in the project

class Processor:
    def __init__(self) -> None:
        self.suppliers = ["BullionByPostUK", "Chards"]
        self.scraper_delay = 60  # in seconds (30mins)
        self.oc_delay = 30  # in seconds (5mins)
        self.scraper = Scraper()

    def run_one(self):
        self.scraper.scrape()
        self.data = self.scraper.get_data()
        t_s = time.perf_counter()
        for supplier in self.suppliers:
            eval(supplier)(self.data).run()  # danger? eval()
        t_e = time.perf_counter()
        dur = f"{t_e - t_s:0.4f}"
        dur_ps = float(dur) / len(self.suppliers)
        print(f"Processed/Updated {len(self.suppliers)} suppliers in {dur} seconds \n Average {dur_ps} seconds per supplier\n")
        logging.info(f"Processed/Updated {len(self.suppliers)} suppliers in {dur} seconds \n Average {dur_ps} seconds per supplier")

    def run_queue(self):  # IN PROGRESS
        print(color.BOLD + color.YELLOW + f"[{datetime.datetime.utcnow()}] " + color.END + "Queue Started")
        while True:
            recorded_silver = self.scraper.silver()
            try:
                wait(lambda: self.bullion_diff(recorded_silver), timeout_seconds=self.scraper_delay, sleep_seconds=self.oc_delay)
                self.run_one()
            except exceptions.TimeoutExpired:  # delay passed
                self.run_one()  # update algorithm (x1)
            finally:
                continue
...
As seen in the image provided, each run-through after the first does not make it all the way to the processing step.
Why does the code before the eval() run only once and then stop? I'm basically trying to loop the run_one() method so that it runs on a cycle.
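For reference, if the goal is just to run run_one() on a fixed cycle, a minimal sketch would look like the following (it uses only time.sleep and the scraper_delay attribute from the class above; the wait()/bullion_diff() trigger logic is left out, and run_cycle is a hypothetical helper, not part of the original code):
import time

def run_cycle(processor):
    """Call processor.run_one() repeatedly on a fixed schedule."""
    while True:
        started = time.perf_counter()
        processor.run_one()
        elapsed = time.perf_counter() - started
        # Sleep only for what remains of the delay window (never negative)
        time.sleep(max(0.0, processor.scraper_delay - elapsed))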

Add delay between workers in concurrent.futures and shutdown

I'm looking to add a delay between ThreadPoolExecutor workers.
import concurrent.futures
import time
from os import system
# call_master, numbers, and config_info come from elsewhere in the script

checked = 0
session_number = 0.00

def parseCombo(file_name, config_info, threads2):
    try:
        global checked
        global session_number
        ### some non related code here
        if config_info["Parsing"] == "First":
            system("title " + config_info["name"] + " [+] Start Parsing...")
            with concurrent.futures.ThreadPoolExecutor(max_workers=threads2) as executor:
                for number in numbers:
                    number = number.strip()
                    executor.submit(call_master, number, config_info)
                    time.sleep(20)
So basically, with those threads I'm making requests to an API that is limited to 2 requests per second.
The maximum number of workers I can set is 2; if I raise it, the API blocks all requests.
Each task takes 60 seconds to complete. What I want to do is set max_workers to 10, have the executor start with 2 tasks ("workers"), wait 3 seconds, start the next 2, and so on, without waiting for the first 2 to finish.
The next thing is to shut down the executor when one of the tasks returns a value.
import re

def beta(url, number, config_info):
    try:
        ## some non related code here
        global checked
        global session_number
        session_number = session_number
        checked += 1
        if '999' in text:
            track = re.findall(r'999[0-9]+\.?[0-9]*', text)
            num = float(track[0].replace("999", ""))
            session_number += num
            log(number, track[0], config_info, url, text)
Is there a way to shut down the executor from inside beta?
If possible, I need it to shut down when this line is triggered:
log(number, track[0], config_info, url, text)
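A minimal sketch of both ideas, staggering worker start-up and stopping everything once a task finds a result, might look like this (the threading.Event, the placeholder numbers list, and the stripped-down call_master are illustrative assumptions, not the asker's actual code):
import concurrent.futures
import threading
import time

stop_event = threading.Event()      # set by a task when log(...) would fire

numbers = ["0001", "0002", "0003"]  # placeholder input
config_info = {}                    # placeholder config

def call_master(number, config_info):
    """Stand-in for the real worker in the question."""
    if stop_event.is_set():
        return  # a result was already found; skip the work
    # ... real work here; when the sought value is found:
    # stop_event.set()

with concurrent.futures.ThreadPoolExecutor(max_workers=10) as executor:
    for number in numbers:
        if stop_event.is_set():
            break  # stop submitting new tasks
        executor.submit(call_master, number.strip(), config_info)
        time.sleep(3)  # stagger start-up without waiting for earlier tasks
On Python 3.9+ one could also call executor.shutdown(cancel_futures=True) to cancel already-queued tasks once the event is set.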

Is there an easy way to add a progress bar/counter where I can add a line to increment it every so often? - Not timed

I have a script that is basically complete. I'd like to add some sort of progress bar to it, instead of printing out each step as it passes by. Is there anything that will let me do this? I want to:
set up a progress widget/counter/loop
give it a command function to increment
do some script
add the code to advance/increment the progress bar
do some more script
add the code to advance/increment the progress bar
do some more script
add the code to advance/increment the progress bar
do some more script
add the code to advance/increment the progress bar
Also, can you please give me an example of some sort? I've looked at 3 or 4 different "progress bar" libraries, and none gives an example of doing it this way; all the examples I seem to find want to do it by time or by byte size for downloading files.
There are a number of progress bars on PyPI; I recommend ezprogress if you run Python 3.
from ezprogress.progressbar import ProgressBar
import time
# Number of steps in your total script
steps_needed = 100
current_step = 0
# setup progress bar
pb = ProgressBar(steps_needed, bar_length=100)
pb.start()
# Do what your script wants
...
# Increment counter
current_step += 1
pb.update(current_step)
# Do what your script wants
...
# When you are done you can force the progress bar to finish
pb.finished()
The progress bar did not support turning off time estimation; however, it is now possible in the newest version, so just upgrade from PyPI.
To turn off time estimation, the progress bar just needs to be started with the parameter no_time=True, as in the code below:
pb = ProgressBar(steps_needed, bar_length=100, no_time=True)
Create your own progressbar.py module:
import sys
import copy

currentProgressCnt = 0
progressCntMax = 0
progressBarWidth = 50  # in chars
scaleFctr = 0.0
tasksToDo = []

class ProgressIndicator:
    def showProgress(self):
        global progressCntMax
        global currentProgressCnt
        cr = "\r"
        progressChar = '#'
        fillChar = '.'
        progressBarDone = int(currentProgressCnt * scaleFctr) * progressChar
        progressBarRemain = int((progressCntMax - currentProgressCnt) * scaleFctr) * fillChar
        percent = str(int((float(currentProgressCnt) / float(progressCntMax)) * 100)) + " % completed "
        taskId = '(' + tasksToDo[currentProgressCnt - 1] + ') '
        quote = str(currentProgressCnt) + '/' + str(progressCntMax) + ' '
        sys.stdout.write(cr + progressBarDone + progressBarRemain + ' ' + percent + taskId + quote)
        sys.stdout.flush()
        if currentProgressCnt == progressCntMax:
            print()

    def incProgress(self):
        global currentProgressCnt
        currentProgressCnt += 1

    def setLastStep(self, size):
        global progressCntMax, scaleFctr
        progressCntMax = size
        scaleFctr = progressBarWidth / progressCntMax

    def setTaskList(self, taskList):
        global tasksToDo
        tasksToDo = copy.copy(taskList)
        self.setLastStep(len(tasksToDo))
In main, use the ProgressIndicator class like this:
from progressbar import ProgressIndicator
import time
import datetime

#########################################
###              MAIN                 ###
###           SIMULATION              ###
#########################################

# your procedure list you have to run
toDoList = ['proc1', 'proc2', 'proc3', 'proc1', 'proc4', 'proc5',
            'proc6', 'proc7', 'proc21', 'proc32', 'proc43', 'proc51',
            'proc4', 'proc65', 'proc76', 'proc87']

progressLine = ProgressIndicator()  # create your indicator
progressLine.setTaskList(toDoList)  # set params

# your main work
i = 0; lastTask = len(toDoList)

# log the start
startTime = str(datetime.datetime.now())
print(startTime + " main started")

while i < lastTask:
    # run your task list here
    time.sleep(1)  # simulating your toDoList[i]() run
    i += 1
    progressLine.incProgress()   # use when task done, increase progress
    progressLine.showProgress()  # use for update display

# work is done, log the end
endTime = str(datetime.datetime.now())
print(endTime + " main finished")

Multiprocessing function not writing to file or printing

I'm working on a Raspberry Pi (3 B+) making a data collection device, and I'm trying to spawn a process to record the data coming in and write it to a file. I have a function for the writing that works fine when I call it directly.
When I call it using the multiprocessing approach, however, nothing seems to happen. I can see in task monitors on Linux that the process does in fact get spawned, but no file gets written, and when I try to pass a flag to shut it down it doesn't work, meaning I end up terminating the process and nothing seems to have happened.
I've been over this every which way and can't see what I'm doing wrong; does anyone else? In case it's relevant, these are functions inside a parent class, and one of the functions is meant to spawn another as a thread.
Code I'm using:
from datetime import datetime, timedelta
import csv
from drivers.IMU_SEN0 import IMU_SEN0
import multiprocessing, os

class IMU_data_logger:
    _output_filename = ''
    _csv_headers = []
    _accelerometer_headers = ['Accelerometer X', 'Accelerometer Y', 'Accelerometer Z']
    _gyroscope_headers = ['Gyroscope X', 'Gyroscope Y', 'Gyroscope Z']
    _magnetometer_headers = ['Bearing']
    _log_accelerometer = False
    _log_gyroscope = False
    _log_magnetometer = False
    IMU = None
    _writer = []
    _run_underway = False
    _process = []
    _stop_value = 0

    def __init__(self, output_filename='/home/pi/blah.csv', log_accelerometer=True, log_gyroscope=True, log_magnetometer=True):
        """data logging device
        NOTE! Multiple instances of this class should not use the same IMU devices simultaneously!"""
        self._output_filename = output_filename
        self._log_accelerometer = log_accelerometer
        self._log_gyroscope = log_gyroscope
        self._log_magnetometer = log_magnetometer

    def __del__(self):
        # TODO Update this
        if self._run_underway:  # If there's still a run underway, end it first
            self.end_recording()

    def _set_up(self):
        self.IMU = IMU_SEN0(self._log_accelerometer, self._log_gyroscope, self._log_magnetometer)
        self._set_up_headers()

    def _set_up_headers(self):
        """Set up the headers of the CSV file based on the header substrings at top and the input flags on what will be measured"""
        self._csv_headers = []
        if self._log_accelerometer is not None:
            self._csv_headers += self._accelerometer_headers
        if self._log_gyroscope is not None:
            self._csv_headers += self._gyroscope_headers
        if self._log_magnetometer is not None:
            self._csv_headers += self._magnetometer_headers

    def _record_data(self, frequency, stop_value):
        """Record data function, which takes a recording frequency, in hertz, as an input"""
        self._set_up()  # Run setup in the spawned process
        previous_read_time = datetime.now() - timedelta(1, 0, 0)
        self._run_underway = True  # Note that a run is now going
        Period = 1 / frequency  # Period, in seconds, of a recording based on the input frequency
        print("Writing output data to", self._output_filename)
        with open(self._output_filename, 'w', newline='') as outcsv:
            self._writer = csv.writer(outcsv)
            self._writer.writerow(self._csv_headers)  # Write headers to file
            while stop_value.value == 0:  # While a run continues
                if datetime.now() - previous_read_time >= timedelta(0, 1, 0):
                    print("run underway value", self._run_underway)
                if datetime.now() - previous_read_time >= timedelta(0, Period, 0):  # If we've waited a period, collect the data; otherwise keep looping
                    previous_read_time = datetime.now()  # Update previous read time
                    next_row = []
                    if self._log_accelerometer:
                        # Get values in m/s^2
                        axes = self.IMU.read_accelerometer_values()
                        next_row += [axes['x'], axes['y'], axes['z']]
                    if self._log_gyroscope:
                        # Read gyro values
                        gyro = self.IMU.read_gyroscope_values()
                        next_row += [gyro['x'], gyro['y'], gyro['z']]
                    if self._log_magnetometer:
                        # Read magnetometer value
                        b = self.IMU.read_magnetometer_bearing()
                        next_row += b
                    self._writer.writerow(next_row)
            # Close the csv when done
            outcsv.close()

    def start_recording(self, frequency_in_hz):
        # Create recording process
        self._stop_value = multiprocessing.Value('i', 0)
        self._process = multiprocessing.Process(target=self._record_data, args=(frequency_in_hz, self._stop_value))
        # Start recording process
        self._process.start()
        print(datetime.now().strftime("%H:%M:%S.%f"), "Data logging process spawned")
        print("Logging Accelerometer:", self._log_accelerometer)
        print("Logging Gyroscope:", self._log_gyroscope)
        print("Logging Magnetometer:", self._log_magnetometer)
        print("ID of data logging process: {}".format(self._process.pid))

    def end_recording(self, terminate_wait=2):
        """Function to end the recording process that's been spawned.
        Args: terminate_wait: This is the time, in seconds, to wait after attempting to shut down the process before terminating it."""
        # Get process id
        id = self._process.pid
        # Set stop event for process
        self._stop_value.value = 1
        self._process.join(terminate_wait)  # Wait for the process to terminate
        if self._process.is_alive():  # If it's still alive after waiting
            self._process.terminate()
            print(datetime.now().strftime("%H:%M:%S.%f"), "Process", id, "needed to be terminated.")
        else:
            print(datetime.now().strftime("%H:%M:%S.%f"), "Process", id, "successfully ended itself.")
====================================================================
ANSWER: For anyone following up here, it turns out the problem was my use of the VS Code debugger which apparently doesn't work with multiprocessing and was somehow preventing the success of the spawned process. Many thanks to Tomasz Swider below for helping me work through issues and, eventually, find my idiocy. The help was very deeply appreciated!!
I can see a few things wrong in your code:
First, stop_value == 0 will not work, as multiprocessing.Value('i', 0) != 0; change that line to:
while stop_value.value == 0
Second, you never update previous_read_time, so it will write the readings as fast as it can and you will run out of disk quickly.
Third, try using time.sleep(): what you are doing is called busy looping, and it is bad because it wastes CPU cycles needlessly (a sketch follows below).
Fourth, terminating with self._stop_value = 1 probably will not work; there must be another way to set that value, maybe self._stop_value.value = 1.
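For the third point, a minimal sketch of a polling loop that sleeps between samples instead of busy looping (Period and stop_value as in the question's code; poll and collect are hypothetical names):
import time

def poll(stop_value, collect, Period):
    """Collect one sample per Period without busy looping."""
    while stop_value.value == 0:
        collect()
        time.sleep(Period)  # yield the CPU between samples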
Well, here is a piece of example code, based on the code you provided, that is working just fine:
import csv
import multiprocessing
import time
from datetime import datetime, timedelta
from random import randint

class IMU(object):
    @staticmethod
    def read_accelerometer_values():
        return dict(x=randint(0, 100), y=randint(0, 100), z=randint(0, 10))

class Foo(object):
    def __init__(self, output_filename):
        self._output_filename = output_filename
        self._csv_headers = ['xxxx', 'y', 'z']
        self._log_accelerometer = True
        self.IMU = IMU()

    def _record_data(self, frequency, stop_value):
        """Record data function, which takes a recording frequency, in hertz, as an input"""
        # self._set_up()  # Run setup functions for the data collection device and store it in the self.IMU variable
        previous_read_time = datetime.now() - timedelta(1, 0, 0)
        self._run_underway = True  # Note that a run is now going
        Period = 1 / frequency  # Period, in seconds, of a recording based on the input frequency
        print("Writing output data to", self._output_filename)
        with open(self._output_filename, 'w', newline='') as outcsv:
            self._writer = csv.writer(outcsv)
            self._writer.writerow(self._csv_headers)  # Write headers to file
            while stop_value.value == 0:  # While a run continues
                if datetime.now() - previous_read_time >= timedelta(0, 1, 0):  # If we've waited a period, collect the data; otherwise keep looping
                    print("run underway value", self._run_underway)
                if datetime.now() - previous_read_time >= timedelta(0, Period, 0):  # If we've waited a period, collect the data; otherwise keep looping
                    next_row = []
                    if self._log_accelerometer:
                        # Get values in m/s^2
                        axes = self.IMU.read_accelerometer_values()
                        next_row += [axes['x'], axes['y'], axes['z']]
                    previous_read_time = datetime.now()
                    self._writer.writerow(next_row)
            # Close the csv when done
            outcsv.close()

    def start_recording(self, frequency_in_hz):
        # Create recording process
        self._stop_value = multiprocessing.Value('i', 0)
        self._process = multiprocessing.Process(target=self._record_data, args=(frequency_in_hz, self._stop_value))
        # Start recording process
        self._process.start()
        print(datetime.now().strftime("%H:%M:%S.%f"), "Data logging process spawned")
        print("ID of data logging process: {}".format(self._process.pid))

    def end_recording(self, terminate_wait=2):
        """Function to end the recording process that's been spawned.
        Args: terminate_wait: This is the time, in seconds, to wait after attempting to shut down the process before terminating it."""
        # Get process id
        id = self._process.pid
        # Set stop event for process
        self._stop_value.value = 1
        self._process.join(terminate_wait)  # Wait for the process to terminate
        if self._process.is_alive():  # If it's still alive after waiting
            self._process.terminate()
            print(datetime.now().strftime("%H:%M:%S.%f"), "Process", id, "needed to be terminated.")
        else:
            print(datetime.now().strftime("%H:%M:%S.%f"), "Process", id, "successfully ended itself.")

if __name__ == '__main__':
    foo = Foo('/tmp/foometer.csv')
    foo.start_recording(20)
    time.sleep(5)
    print('Ending recording')
    foo.end_recording()

twython search api rate limit: Header information will not be updated

I want to handle the Search-API rate limit of 180 requests / 15 minutes. The first solution I came up with was to check the remaining requests in the header and wait 900 seconds. See the following snippet:
results = search_interface.cursor(search_interface.search, q=k, lang=lang, result_type=result_mode)
while True:
    try:
        tweet = next(results)
        if limit_reached(search_interface):
            sleep(900)
        self.writer(tweet)
    except StopIteration:
        break

def limit_reached(search_interface):
    remaining_rate = int(search_interface.get_lastfunction_header('X-Rate-Limit-Remaining'))
    return remaining_rate <= 2
But it seems that the header information is not reset to 180 after it reaches the two remaining requests.
The second solution I came up with was to handle the twython exception for rate limiting and wait the remaining amount of time:
results = search_interface.cursor(search_interface.search, q=k, lang=lang, result_type=result_mode)
while True:
    try:
        tweet = next(results)
        self.writer(tweet)
    except TwythonError as inst:
        logger.error(inst.msg)
        wait_for_reset(search_interface)
        continue
    except StopIteration:
        break

def wait_for_reset(search_interface):
    reset_timestamp = int(search_interface.get_lastfunction_header('X-Rate-Limit-Reset'))
    now_timestamp = datetime.now().timestamp()
    seconds_offset = 10
    t = reset_timestamp - now_timestamp + seconds_offset
    logger.info('Waiting {0} seconds for Twitter rate limit reset.'.format(t))
    sleep(t)
But with this solution I receive the message INFO: Resetting dropped connection: api.twitter.com, and the loop does not continue with the last element of the generator. Has anybody faced the same problems?
Regards.
Just rate-limit yourself is my suggestion (assuming you are constantly hitting the limit ...).
import time

QUERY_PER_SEC = 15 * 60 / 180.0  # 180 per 15 minutes, i.e. ~5 seconds per query

class TwitterBot:
    last_update = 0

    def doQuery(self, *args, **kwargs):
        tdiff = time.time() - self.last_update
        if tdiff < QUERY_PER_SEC:
            time.sleep(QUERY_PER_SEC - tdiff)
        self.last_update = time.time()
        return search_interface.cursor(*args, **kwargs)
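A hedged usage sketch (search_interface and the query arguments are the asker's names from the question, not defined here):
bot = TwitterBot()
for keyword in ['python', 'twython', 'streaming']:
    # Each call sleeps just long enough to stay at roughly one query per 5 seconds.
    results = bot.doQuery(search_interface.search, q=keyword, lang='en')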

benchmarking python script reveals mysterious time delays

I have two modules: moduleParent and moduleChild.
I'm doing something like this in moduleParent:
import moduleChild
#a bunch of code
start = time.time()
moduleChild.childFunction()
finish = time.time()
print "calling child function takes:", finish-start, "total seconds"
#a bunch of code
I'm doing something like this in moduleChild:
def childFunction():
    start = time.time()
    #a bunch of code
    finish = time.time()
    print "child function says it takes:", finish-start, "total seconds"
The output looks like this:
calling child function takes: .24 total seconds
child function says it takes: 0.0 total seconds
So my question is: where are these .24 extra seconds coming from?
Thank you for your expertise.
Here is the actual code for "childFunction". It really shouldn't take .24 seconds.
def getResources(show, resourceName='', resourceType=''):
    '''
    get a list of resources with the given name
    #show: show name
    #resourceName: name of resource
    #resourceType: type of resource
    #return: list of resource dictionaries
    '''
    t1 = time.time()
    cmd = r'C:\tester.exe -cmdFile "C:\%s\info.txt" -user root -pwd root' % show
    cmd += " -cmd findResources -machineFormatted "
    if resourceName:
        cmd += '-name %s' % resourceName
    if resourceType:
        cmd += '_' + resourceType.replace(".", "_") + "_"
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    output = proc.stdout.read()
    output = output.strip()
    resourceData = output.split("\r\n")
    resourceData = resourceData[1:]
    resourceList = []
    for data in resourceData:
        resourceId, resourceName, resourceType = data.split("|")
        rTyp = "_" + resourceType.replace(".", "_") + "_"
        shot, assetName = resourceName.split(rTyp)
        resourceName = assetName
        path = '//projects/%s/scenes/%s/%s/%s' % (show, shot, resourceType.replace(".", "/"), assetName)
        resourceDict = {'id': resourceId, 'name': resourceName, 'type': resourceType, 'path': path}
        resourceList.append(resourceDict)
    t2 = time.time()
    print(" ", t2 - t2, "seconds")
    return resourceList
Edit 2: I just noticed a typo in the child function: you have t2 - t2 in the print statement.
Ignore below:
Calling the function itself has overhead (setting up stack space, saving local variables, returning, etc.). The result suggests that your function is so trivial that setting up for the function call took longer than running the code itself.
Edit: also, calling the timers as well as print adds overhead. Now that I think about it, calling print could account for a lot of that .24 seconds. IO is slow.
You can't measure the time of a function by running it once, especially one which runs so short. There are a myriad of factors which could affect the timing, not the least of which is what other processes you have running on the system.
Something like this I would probably run at least a few hundred times. Check out the timeit module.
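For example, a minimal timeit sketch (the import path and the call arguments are stand-ins for the real getResources):
import timeit

# Time 100 calls, repeated 5 times; the fastest repeat is the least
# disturbed by other processes running on the system.
timings = timeit.repeat(
    stmt="getResources('myShow')",                 # hypothetical call
    setup="from moduleChild import getResources",  # hypothetical import
    repeat=5,
    number=100,
)
print("best of 5:", min(timings) / 100, "seconds per call")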
