Getting "maximum recursion depth exceeded while calling a Python object" with Celery - python

I am a beginner. I'm running a task in Celery and getting this strange error:
maximum recursion depth exceeded while calling a Python object
You can check the full error in this pastebin.
I don't quite understand it, because I haven't changed anything and yesterday it was working fine. I ran the task without Celery in the Python interpreter and it runs fine. You can check the function here. Finally, for what it is worth, this task is getting created 12 times by another task.
Do you see anything that could create such an error?
EDIT:
This is the task from which I call this function/task:
@celery.task(ignore_result=True)
def get_classicdata(leagueid):
    print "getting team data for %s" % leagueid
    returned_data = {}
    for team in r.smembers('league:%s' % leagueid):
        data = scrapteam.delay(team, r.get('currentgw'))
        returned_data[team] = data.get()

Everything looks fine. The traceback implies that the returned object somewhere cannot be pickled, but your returned 'team' data structure is a dictionary containing a non-recursive data structure of basic types, so that shouldn't cause a problem. For better remote debugging, put a print statement before the "return team" so that it shows the content of the team. You might also try just having it return a {} and see if that changes things.
Then also add a debugging print statement in get_classicdata showing the content of data.get(), as well as something just before the return there, in order to verify if that function reaches completion.
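For instance, the instrumented versions might look something like this (a sketch only; build_team_dict is a hypothetical stand-in for the real scraping code, which isn't shown here, and the same r Redis client is assumed):

@celery.task
def scrapteam(team, gameweek):
    team = build_team_dict(team, gameweek)  # hypothetical stand-in for the real scraping code
    print "scrapteam about to return: %r" % (team,)  # shows exactly what gets pickled
    return team

@celery.task(ignore_result=True)
def get_classicdata(leagueid):
    returned_data = {}
    for team in r.smembers('league:%s' % leagueid):
        data = scrapteam.delay(team, r.get('currentgw'))
        result = data.get()
        print "data.get() returned: %r" % (result,)  # verify the subtask result arrives
        returned_data[team] = result
    print "get_classicdata finished"  # verify the task reaches completion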

Related

multiprocessing.pool.MaybeEncodingError: Error sending result occurs at last object

I keep having an issue when executing a function multiple times at once using the multiprocessing.Pool class.
I am using Python 3.8.3 on Windows 10 with PyCharm 2017.3.
The function I am executing opens and serialises Excel files from my hard disk into custom objects, which I want to iterate over later on.
The error always occurs after the last execution of the function.
Here is what it says:
multiprocessing.pool.MaybeEncodingError: Error sending result: '[<IntegListe.IntegrityList object at 0x037481F0>, <IntegListe.IntegrityList object at 0x03D86CE8>, <IntegListe.IntegrityList object at 0x03F50F88>]'. Reason: 'TypeError("cannot pickle '_thread.RLock' object")'
Here is what my code looks like:
from multiprocessing import Pool
p = Pool()
ilList = p.starmap(extract_excel, [(f, spaltennamen) for f in files])
p.join()
p.close()
And this is the function I am trying to execute in parallel:
def extract_excel(t: tuple) -> IntegrityList:
    file_path = t[0]
    spaltennamen = t[1]
    il = IntegrityList(file_path)
    print(il)
    spaltennamen = list(map(lambda x: Excel.HereOrFind(il.ws, x, x.value), spaltennamen))  # Update position of column headers
    il.parse_columns(spaltennamen, il.ws)
    il.close()
    return il
Since I am quite new to Python, I am having trouble figuring out the magic behind this multiprocessing error. Executing the function serially works perfectly fine and I get the desired output. This proves that the function and all the sub-functions work as expected. I would be glad for any information that could help solve this problem. Thanks!
Okay, so for future viewers of this issue, I solved the error with the help of this website: https://www.synopsys.com/blogs/software-security/python-pickling/
It states that every customized object that goes through a parallel process needs to have the __reduce__ method implemented in order to be reconstructed.
I simply added this code to my custom object:
def __reduce__(self):
    return IntegrityList, (self.file_path,)
After that the execution of the parallel processing works great.
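To illustrate the pattern in isolation (a simplified sketch, not the original IntegrityList, whose full definition isn't shown): an object holding an unpicklable RLock survives the round trip through the Pool once __reduce__ tells pickle how to rebuild it from its constructor arguments.

import threading
from multiprocessing import Pool

class IntegrityList:
    def __init__(self, file_path):
        self.file_path = file_path
        self.lock = threading.RLock()  # RLock objects cannot be pickled

    def __reduce__(self):
        # Tell pickle to rebuild the object as IntegrityList(self.file_path)
        # when unpickling in the parent process.
        return IntegrityList, (self.file_path,)

def extract_excel(file_path, spaltennamen):
    return IntegrityList(file_path)  # the real version would parse the file here

if __name__ == "__main__":
    files = ["a.xlsx", "b.xlsx"]
    spaltennamen = []
    with Pool() as p:
        ilList = p.starmap(extract_excel, [(f, spaltennamen) for f in files])
    print([il.file_path for il in ilList])

One caveat of this approach: the reconstructed object only contains whatever the constructor restores from file_path, so any state parsed after construction is lost in transit between processes.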

Segmentation fault when initializing array

I am getting a segmentation fault when initializing an array.
I have a callback function that runs when an RFID tag gets read:
IDS = []

def readTag(e):
    epc = str(e.epc, 'utf-8')
    if epc not in IDS:
        now = datetime.datetime.now().strftime('%m/%d/%Y %H:%M:%S')
        IDS.append([epc, now, "name.instrument"])
and a main function from which it's called
def main():
    for x in vals:
        IDS.append([vals[0], vals[1], vals[2]])
    for x in IDS:
        print(x[0])
    r = mercury.Reader("tmr:///dev/ttyUSB0", baudrate=9600)
    r.set_region("NA")
    r.start_reading(readTag, on_time=1500)
    input("press any key to stop reading: ")
    r.stop_reading()
The error occurs because of the line IDS.append([epc, now, "name.instrument"]). I know this because when I replace it with a print call instead, the program runs just fine. I've tried using different types for the array objects (integers), creating an array of the same objects outside of the append function, etc. For some reason, just creating an array inside the readTag function causes the segmentation fault, e.g. row = [1,2,3].
Does anyone know what causes this error and how I can fix it? To be a little more specific: the readTag function works fine for the first two calls (only ever two), but then it crashes. The Reader object that provides start_reading() is from the mercury-api.
This looks like a scoping issue to me; the mercury library doesn't have permission to access your list's memory address, so when it invokes your callback function readTag(e), a segfault occurs. I don't think the behavior that you want is supported by that library.
To extend Michael's answer, this appears to be an issue with scoping and the API you're using. In general pure-Python doesn't seg-fault. Or at least, it shouldn't seg-fault unless there's a bug in the interpreter, or some extension that you're using. That's not to say pure-Python won't break, it's just that a genuine seg-fault indicates the problem is probably the result of something messy outside of your code.
I'm assuming you're using this Python API.
In that case, the README.md mentions that the Reader.start_reading() method you're using is "asynchronous", meaning it invokes a new thread or process and returns immediately, and the background thread then continues to call your callback each time something is scanned.
I don't really know enough about the nitty-gritty of CPython to say exactly what's going on, but you've declared IDS = [] as a global variable, and it seems like the background thread is running the callback with a different context to the main program. So when it attempts to access IDS it's reading memory it doesn't own, hence the seg-fault.
Because of how restrictive the callback is and the apparent lack of a buffer, this might be an oversight on the part of the developer. If you really need asynchronous reads, it's worth sending them an issue report.
Otherwise, considering you're just waiting for input you probably don't need the asynchronous reads, and you could use the synchronous Reader.read() method inside your own busy loop instead with something like:
try:
    while True:
        readTags(r.read(timeout=10))
except KeyboardInterrupt:  # break loop on SIGINT (Ctrl-C)
    pass
Note that r.read() returns a list of tags rather than just one, so you'd need to modify your callback slightly (as in the sketch below), and if you're writing more than just a quick script you probably want to use threads to interrupt the loop properly, as SIGINT is pretty hacky.
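For example, the list-based callback might look like this (a sketch; it assumes each tag object carries the same .epc attribute the asynchronous callback received):

def readTags(tags):
    # r.read() hands back a list of tags, so loop over all of them
    for tag in tags:
        epc = str(tag.epc, 'utf-8')
        if epc not in IDS:
            now = datetime.datetime.now().strftime('%m/%d/%Y %H:%M:%S')
            IDS.append([epc, now, "name.instrument"])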

SET working every other time Python MySQL

I'm trying to set a value in my table with a given ID. When I run this code I get a return code of 1 from cursor.execute(updateStr) the first time I run it, but it executes without issue and returns 0 when I run it a second time. No exception is raised, and I am not sure how to retrieve the actual error message. What could cause this problem, and how do I dig deeper into the actual error? Thanks for looking.
updateStr = "UPDATE db.table SET OverrideVal = '{0}' WHERE table.OverrideID ={1};".format(overrideVal, overrideID)
returnCode = cursor.execute(updateStr)
if returnCode == 0:
cursor.execute("COMMIT")
else:
cursor.execute("ROLLBACK")
I don't think the execute command returns anything meaningful. Python's DB-API 2.0 specification says this about the execute method:
Return values are not defined.
Did you check to see if your updates are actually working? (Be sure to commit them).
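If the goal is to verify the UPDATE worked, the usual pattern is to check cursor.rowcount and commit on the connection object, letting the driver escape the values; a sketch, assuming a MySQL DB-API driver (e.g. MySQLdb or mysql-connector-python) and a connection object named conn:

updateStr = "UPDATE db.table SET OverrideVal = %s WHERE OverrideID = %s"
cursor.execute(updateStr, (overrideVal, overrideID))  # parameters escaped by the driver
if cursor.rowcount == 1:  # rows actually affected by the UPDATE
    conn.commit()         # make the change permanent
else:
    conn.rollback()

Note that with MySQL, rowcount reports rows actually changed by default rather than rows matched, so running the identical UPDATE twice can legitimately report a different count the second time, which may explain the every-other-time behavior you're seeing.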

How to transfer a value from a function in one script to another script without re-running the function (python)?

I'm really new to programming in general and very inexperienced, and I'm learning Python as I think it's simpler than other languages. Anyway, I'm trying to use Flask-Ask with ngrok to program an Alexa skill to check data online (which changes a couple of times per hour). The script takes four different numbers (from a different URL) and organizes them into a dictionary, and uses Selenium and PhantomJS to access the data.
Obviously, this exceeds the 8-10 second maximum runtime for an intent before Alexa decides that it's taken too long and returns an error message (I know it's timing out, not erroring, because ngrok and the Python log would show an actual error if one occurred, and it invariably fails after 8-10 seconds even though at that point the script should only be partway through). I've read that I could just reprompt it, but I don't know how, and that would only give me 8-10 more seconds; the script usually takes about 25 seconds just to get the data from the internet (and then maybe a second to turn it into a dictionary).
I tried putting the getData function right after the intent that runs when the Alexa skill is first invoked, but it only runs when I initialize my local server, and it then holds that data for every new Alexa session. Because the data changes frequently, I want it to perform the function every time I start a new session for the skill.
So, I decided just to outsource the function that actually gets the data to another script, and make that other script run constantly in a loop. Here's the code I used.
import time
from selenium import webdriver
from selenium.webdriver.common.keys import Keys

def getData():
    username = ''  # username hidden for anonymity
    password = ''  # password hidden for anonymity
    browser = webdriver.PhantomJS(executable_path='/usr/local/bin/phantomjs')
    browser.get("https://gradebook.com")  # actual website name changed
    browser.find_element_by_name("username").clear()
    browser.find_element_by_name("username").send_keys(username)
    browser.find_element_by_name("password").clear()
    browser.find_element_by_name("password").send_keys(password)
    browser.find_element_by_name("password").send_keys(Keys.RETURN)
    global currentgrades
    currentgrades = []
    gradeids = ['2018202', '2018185', '2018223', '2018626', '2018473', '2018871', '2018886']
    for x in range(0, len(gradeids)):
        try:
            gradeurl = "https://www.gradebook.com/grades/"
            browser.get(gradeurl)
            grade = browser.find_element_by_id("currentStudentGrade[]").get_attribute('innerHTML').encode('utf8')[0:3]
            if grade[2] != "%":
                grade = browser.find_element_by_id("currentStudentGrade[]").get_attribute('innerHTML').encode('utf8')[0:4]
                if grade[1] == "%":
                    grade = browser.find_element_by_id("currentStudentGrade[]").get_attribute('innerHTML').encode('utf8')[0:1]
            currentgrades.append(grade)
        except Exception:
            currentgrades.append('No assignments found')
            continue
    dictionary = {"class1": currentgrades[0], "class2": currentgrades[1], "class3": currentgrades[2], "class4": currentgrades[3], "class5": currentgrades[4], "class6": currentgrades[5], "class7": currentgrades[6]}
    return dictionary

def run():
    while True:  # run constantly, refreshing the data every minute
        dictionary = getData()
        time.sleep(60)
That script runs constantly and does what I want, but in my other script I don't know how to just access the dictionary variable. When I use
from getdata.py import dictionary
in the Flask-Ask script, it just runs the loop and constantly gets the data. I just want the Flask-Ask script to take the variable defined in the run function and use it, without running any of the code in the getdata script, which has already run and gotten the correct data. If it matters, both scripts are running in Terminal on a MacBook.
Is there any way to do what I'm asking about, or are there any easier workarounds? Any and all help is appreciated!
It sounds like you want to import the function so you can run it, rather than importing the dictionary.
Try deleting the run function, and then in your other script use:
from getdata import getData
Then each time you write getData() it will run your code and get a new up-to-date dictionary.
Is this what you were asking about?
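On the Flask-Ask side, that might look something like this (a sketch; the intent name and response text are made up, and the usual Flask/Flask-Ask boilerplate, including the ask object, is assumed to already exist):

from flask_ask import statement
from getdata import getData

@ask.intent("get_grades")  # hypothetical intent name
def get_grades():
    grades = getData()  # runs the scraper and returns a fresh dictionary
    return statement("Your grade in class one is {}".format(grades["class1"]))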
This issue has been resolved.
As for the original question, I never figured out how to import just the dictionary without first running the function that generates it. In any case, I realized there had to be a more practical solution than constantly running a script like that, which even then wouldn't always have brand-new data.
My solution was to make the script that gets the data start running at the same time as the launch function. Here is the final script for the first intent (the rest of it remained the same):
#ask.intent("start_skill")
def start_skill():
welcome_message = 'What is the password?'
thread = threading.Thread(target=getData, args=())
thread.daemon = True
thread.start()
return question(welcome_message)
def getData():
#script to get data here
#other intents and rest of script here
By design, the skill requested a numeric passcode to make sure I was the one using it before it was willing to read the data. (That was probably pointless, but this skill is at least as much for my own education as for practical use, so I wanted it to have as many features as I could justify.) So, by the time you can actually ask for the data, the script fetching it will have finished running (I have tested this and it seems to work without fail).
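For the later intent to actually see the result, getData has to publish it somewhere the intent handlers can read it, such as a module-level variable; a sketch of that part (names are illustrative, and scrape_grades is a hypothetical stand-in for the real scraping code):

latest_grades = None  # filled in by the background thread started in start_skill

def getData():
    global latest_grades
    latest_grades = scrape_grades()  # hypothetical stand-in for the real scraping code

@ask.intent("read_grades")  # hypothetical intent name
def read_grades():
    if latest_grades is None:
        return question("Still fetching your data, ask me again in a moment.")
    return statement("Your grade in class one is {}".format(latest_grades["class1"]))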

Python - Looping through functions is throwing errors in the background, until maximum recursion depth exceeded

So I think I am just fundamentally doing something wrong, but here is a basic example of what I am doing:
# some variables here
# some code here to run once

def runfirst():
    # do some stuff
    runsecond()

def runsecond():
    # do some different stuff
    runthird()

def runthird():
    # do some more stuff
    runfirst()

runfirst()
It basically pulls some info I need at the beginning and then runs through three different functions. What I am doing is pulling info from a db, then watching some counts on the db, and if any of those counts goes over a certain number in a time period, it sends an email. This is for monitoring purposes, and I need it to run all the time. The problem is that the whole time it is running, in the background it is throwing errors like "File "asdf.py", line blah, in runfirst".
I think it is complaining because it sees that I am looping through functions, but for what I need this to do it works perfectly, except for the errors, and eventually it kills my script due to maximum recursion depth exceeded. Any help?
You have infinite recursion here. Because you call runfirst from runthird, it keeps going deeper and deeper and none of the functions ever returns. You might want to consider putting the functions in a while True loop instead of calling them from each other:
def runfirst():
    # do some stuff

def runsecond():
    # do some different stuff

def runthird():
    # do some more stuff

while True:
    runfirst()
    runsecond()
    runthird()
You're not looping.
You're calling a function that calls another function that calls a third function that calls the first function, which calls the second function, which calls the third function, which again calls the first function... and so on until your stack overflows.
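You can reproduce the failure in miniature: every call pushes a new stack frame and none of them ever returns, so Python gives up once the stack reaches the recursion limit (1000 frames by default):

import sys

def runfirst():
    runsecond()

def runsecond():
    runthird()

def runthird():
    runfirst()  # back to the start: the stack only ever grows

try:
    runfirst()
except RecursionError:  # RuntimeError on Python versions before 3.5
    print("maximum recursion depth exceeded; limit is", sys.getrecursionlimit())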
