I am encountering a strange error that occurs once every few days. I have several virtual machines on Google Cloud, each running the same Python script. The Python file is very large, but the part that gets stuck is the following:
from urllib.request import urlopen

import pandas as pd

try:
    f = urlopen('https://resources.lendingclub.com/SecondaryMarketAllNotes.csv')
    df = pd.read_csv(f)
except:
    print('error')
The first line of code always works, but the second line will occasionally hang the program. By that I mean the program will not continue execution, yet it does not throw any kind of error. I have a logger running in debug mode and it records nothing.
Again, this happens very rarely, but when it does my virtual machines stall. When I look at the processes in top, I see Python running at 0% CPU, and there is still plenty of system memory available. It will sit there for hours without moving on to the next line of code or raising an error.
My application is very time sensitive, and using urlopen is faster than letting pd.read_csv open the URL directly.
I notice that when this rare error occurs, it happens at the same time on all of my virtual machines, which suggests that something about the file being downloaded is triggering the issue. Why it doesn't raise an error is beyond me.
I would greatly appreciate any ideas on what might be causing this and what workarounds might be available.
I am using Python 3.5.3 and pandas 0.19.2.
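One workaround, offered as a sketch rather than a confirmed fix: the symptoms above (0% CPU, no exception, stuck for hours) are consistent with a socket read that has stalled mid-download, and urlopen will wait on that forever unless a timeout is set. The 60-second value below is an assumption to tune:

import socket
from urllib.request import urlopen

import pandas as pd

# A default socket timeout also bounds the reads that pd.read_csv
# performs later on the response object, not just the initial connect.
socket.setdefaulttimeout(60)

try:
    f = urlopen('https://resources.lendingclub.com/SecondaryMarketAllNotes.csv',
                timeout=60)
    df = pd.read_csv(f)
except socket.timeout:
    print('download timed out; retry on the next cycle')
except Exception as exc:
    print('download failed:', exc)

With this in place, a stalled download surfaces as a catchable socket.timeout after a bounded wait instead of a silent hang.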
Related
I am new to Python programming and am having a problem with a multithreaded program (using the "threading" module) that runs fine at first but, after a while, starts repeatedly printing "IOStream.flush timed out" errors.
I am not even sure how to debug such an error, because I don't know which line is causing it. I read a bit about this error and saw that it might be related to memory consumption, so I tried profiling my program with a memory profiler in the Spyder IDE. Nothing jumped out at me, however (although I admit I am not sure what to look for when it comes to Python memory leaks).
A few more observations:
I have an outer loop that runs my function over a large number of files. The files are just numeric data with the same formatting (they are quite large, though, and there is download latency, which is why I made my application multithreaded so that each thread works on different files). If I have a long list of files, the problem occurs; if I shorten the list, the program concludes without problems. I am not sure why, although if it is some kind of memory leak, I would assume that the longer the program runs, the more the problem grows until it hits some kind of memory limit.
Normally, I use 128 threads in the program. If I reduce the number of threads to 48 or fewer, the program works fine and completes correctly. So the problem is clearly caused by multithreading (I'm using the "threading" module), which makes it trickier to debug and pin down. Somewhere around 64 threads, the problems start.
The program never explicitly crashes out. Once it gets to the point where it has this error, it just keeps repeatedly printing "IOStream.flush timed out". I have to close the Spyder IDE to stop it (Restart kernel doesn't work).
Right before this error happens, the program appears to stall. At least, no more prints reach the console (the various threads all print debug information to the screen). The last lines printed are standard debugging/status print statements that work normally when the number of threads or the number of files to process is reduced.
I have no idea how to debug this. Any suggestions on how to get to the bottom of the problem would be much appreciated. Thanks in advance!
Specs:
Python 3.8.8
Spyder 4.2.5
Windows 10
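A hedged suggestion for the thread-count problem above: rather than spawning one thread per file, cap concurrency with a pool. This is a minimal sketch, assuming the stall comes from resource exhaustion at high thread counts; process_file and the file list are placeholders for the real work:

from concurrent.futures import ThreadPoolExecutor, as_completed

def process_file(path):
    # stand-in for the real download-and-parse work done per file
    return len(path)

paths = ['file_%03d.csv' % i for i in range(300)]  # stand-in file list

# max_workers=32 is an assumed cap; tune it to stay below the level
# where the stalls begin.
with ThreadPoolExecutor(max_workers=32) as pool:
    futures = {pool.submit(process_file, p): p for p in paths}
    for fut in as_completed(futures):
        try:
            fut.result()  # re-raises any exception from the worker thread
        except Exception as exc:
            print(futures[fut], 'failed:', exc)

A bounded pool also makes the failure threshold easy to probe: raise max_workers step by step until the symptom reappears.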
I have created a Python program that picks up input files from a Windows folder and updates an Excel sheet every 15 minutes. The program is always open, running in the background.
The program ran properly for two weeks, then suddenly closed with the error message "A problem caused the program to stop working correctly and it was closed". I checked my log files and didn't see any error message.
I checked the Windows Event Viewer and the error below was present, which I could not interpret. Can anyone please tell me the possible causes of this error?
Program.exe
0.0.0.0
5a2e9e81
python36.dll
3.6.5150.1013
5abd3161
c00000fd
0000000000041476
1ba8
01d45e9fe43cba57
C:\Python code\program.exe
C:\Users\aisteam\AppData\Local\Temp\2_MEI51602\python36.dll
a9da018c-e2e3-4821-9387-cce82ff29186
Make sure that your Python code robustly handles errors such as the file it wants to update being locked, which is what Excel does, by design, while the file is open in Excel. You could easily make your code create a new Excel file each time, or wait until the file is no longer locked before updating it. Either way, you need to make your code better at telling you what it is doing: log its activity (implement this now, because the logging needs to be in place before your code stops unexpectedly for, err, an unexpected reason) and manage exceptions carefully (i.e. don't simply write try/except: pass!).
BUT don’t do this sort of code with an unconditional except and nothing but a pass in the except: statement) because it will make errors HARDER to figure out:
try:
    something
except:
    pass
Always be specific about the exception you expect, and even if you are not going to re-raise, always, always, always log the exception.
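As an illustration of both points, here is a minimal sketch assuming the Windows locked-file case; update_workbook, the log file name, and the retry parameters are placeholders, not part of the original program:

import logging
import time

logging.basicConfig(filename='updater.log', level=logging.DEBUG)
log = logging.getLogger(__name__)

def update_workbook(path):
    # stand-in for the real Excel update (openpyxl, xlwings, etc.)
    with open(path, 'r+b'):
        pass

def update_with_retry(path, attempts=5, delay=30):
    for attempt in range(1, attempts + 1):
        try:
            update_workbook(path)
            log.debug('updated %s on attempt %d', path, attempt)
            return True
        except PermissionError:
            # raised on Windows while Excel is holding the file lock
            log.warning('%s locked (attempt %d); retrying', path, attempt)
            time.sleep(delay)
    log.error('gave up on %s after %d attempts', path, attempts)
    return False

The specific except clause lets a locked file trigger a retry while any other failure still propagates, and the log records exactly what happened before any eventual crash.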
EDIT: I rolled back PyCharm versions and it's working again. CLEARLY an IDE issue, not a script issue. I rolled back to PyCharm version 2017.2.4.
So I have a script that has been working great for me, until today. For some reason, the script runs fine with no errors at all, as long as I don't use PyCharm (Community Edition 2017.3.3) in debugging mode. I need the debugger, and when it throws errors for no reason and stops the script, it becomes a pointless IDE.
The reason I know this is a PyCharm problem is that I copied the entire script into a different IDE (Wing), set it to the same Python interpreter, stepped through it in debug mode there, and it worked fine with no errors.
I have done extensive testing to make sure the errors aren't actually in my script; they're not. The script should work as written. It keeps saying that datasets don't exist, or that input features for arcpy tools (ArcGIS is a spatial program that hooks into Python via a library called "arcpy") don't have values, when they do. It's not a script problem; it's an IDE problem.
Has anybody encountered this and know how to fix it?
I do not have any specific environment settings; I just pointed the project at an ArcGIS Python interpreter so I could have access to the arcpy library, and that's it. It should be noted that this interpreter is Python 2.7, because ArcGIS is not yet compatible with Python 3+. I doubt that has anything to do with it, but you never know...
This is the chunk of the script causing the issues (if you don't have/know how to use ArcGIS, don't bother trying to run it; it won't work for you). What I want to point out is that if I put a breakpoint at the qh_buffer line, it will break after trying to run that line, with an arcpy error stating invalid input/parameters (they are not invalid; the line is written exactly as it should be, and I have checked that qhPolys is created and exists). THEN, if I move the breakpoint to the crop_intersect line and run in debug mode, it runs through the entire code, INCLUDING the buffer statement, but then errors out with error 000732, "Input Features: Dataset #1; #2 does not exist or is not supported" (they both do exist, because I have hardcoded them to an output directory before and they are created just fine).
import arcpy
arcpy.env.overwriteOutput = True
svyPtFC = r"C:\Users\xxx\GIS_Testing\Crop_Test.shp"
select_query = '"FID" = 9'
qhPolys = arcpy.Select_analysis(svyPtFC, 'in_memory/qhPolys', select_query)
qh_buffer = arcpy.Buffer_analysis(qhPolys, 'in_memory/qh_buffer', '50 Meters')
cropFID = '"FID" = 1'
cropPoly = arcpy.Select_analysis(svyPtFC, 'in_memory/cropPoly', cropFID)
crop_intersect = arcpy.Intersect_analysis([[cropPoly, 1], [qh_buffer, 2]],
                                          r'C:\Users\xxx\GIS_Testing\crp_int.shp')
feature_count = arcpy.GetCount_management(crop_intersect)
print feature_count  # Python 2 print statement (the ArcGIS interpreter is 2.7)
It does not make sense that it errors at the buffer line when I put a breakpoint near there, yet if I move the breakpoint further down, that line runs fine and it just breaks at the next breakpoint... it does explain why everything works when I hit "Run" instead of using debug mode, though. No breakpoints!
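Not a confirmed fix, but one way to take the debugger's handling of live arcpy Result objects out of the equation is to collapse each result to a plain path string with the documented Result.getOutput(0) and write intermediates to disk. A sketch under that assumption (the scratch folder and output names are hypothetical):

import arcpy
arcpy.env.overwriteOutput = True

svyPtFC = r"C:\Users\xxx\GIS_Testing\Crop_Test.shp"
scratch = r"C:\Users\xxx\GIS_Testing"  # hypothetical on-disk scratch folder

# getOutput(0) returns the output's path, so later tools (and the
# debugger's variable evaluation) only ever see plain strings.
qhPolys = arcpy.Select_analysis(svyPtFC, scratch + r"\qhPolys.shp",
                                '"FID" = 9').getOutput(0)
qh_buffer = arcpy.Buffer_analysis(qhPolys, scratch + r"\qh_buffer.shp",
                                  '50 Meters').getOutput(0)
cropPoly = arcpy.Select_analysis(svyPtFC, scratch + r"\cropPoly.shp",
                                 '"FID" = 1').getOutput(0)
crop_intersect = arcpy.Intersect_analysis(
    [[cropPoly, 1], [qh_buffer, 2]],
    scratch + r"\crp_int.shp").getOutput(0)
print(arcpy.GetCount_management(crop_intersect).getOutput(0))

If the debugger still misbehaves with nothing but string paths in play, that strengthens the case that the IDE, not the script, is at fault.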
In PyCharm (JetBrains), I have been having trouble typing full statements without interruption. I first thought it was because I had not updated the software, so I updated it, but the problem remains.
So if I type any statement or word, PyCharm seems to delay before I can proceed. An example:
import csv
Even before I finish typing "import" - if I delay a keystroke - PyCharm begins to "think", and the window is inaccessible for one to two seconds (quite literally). I assume it is about to give me suggestions or show a tip/error about the code.
Any thoughts to prevent this from happening?
Edit:
Windows 8.1; PyCharm 2016.2
Code Complete turned off via Settings->Editor->General->Code Completion, but did not solve problem.
Key PC Spec:
Intel Core i5-337U
4GB Ram
64-bit
Edit2:
I receive this error when I run anything now, including simply print("test"):
Process finished with exit code -1073741511 (0xC0000139)
Will separate the question somewhere else, since this may be a separate problem altogether.
Try disabling code completion. I believe your computer can't search through all of Python's libraries fast enough, so it freezes for a bit.
This is similar to a few questions on the internet, but this code seems to work for a while instead of returning an error instantly, which suggests to me that it is maybe not just a host-file error?
I am running code that spawns multiple MPI processes; within a loop, each iteration sends data to the spawned processes with bcast and scatter, gathers data back from them, runs the algorithm, and saves the results. It then disconnects from the spawned comm and creates another set of spawns on the next loop iteration. This works for a few minutes, then after around 300 files it spits out:
[T7810:10898] [[50329,0],0] ORTE_ERROR_LOG: Not found in file ../../../../../orte/mca/plm/base/plm_base_launch_support.c at line 758
--------------------------------------------------------------------------
mpirun was unable to start the specified application as it encountered an error.
More information may be available above.
I am testing this on a local machine (single node); the end deployment will have multiple nodes, each spawning its own MPI processes within that node. I am trying to figure out whether this is an issue with testing the multiple spawns on my local machine that will work fine on the HPC, or a more serious error.
How can I debug this? Is there a way to print out what MPI is trying to do, or to monitor MPI, such as a verbose mode?
Since MPI4PY is so close to MPI (logically, if not in terms of lines of code), one way to debug this is to write the C version of your program and see if the problem persists. When you report this bug to Open MPI, they are going to want a small C test case anyway.
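As a complementary step before rewriting in C, a minimal Python reproduction of the spawn/disconnect loop can isolate the runtime from the algorithm. This is a sketch only; worker.py is a hypothetical child script that performs the matching bcast/gather from the parent and exits:

import sys
from mpi4py import MPI

for i in range(1000):
    # spawn a fresh set of children, as the original loop does
    inter = MPI.COMM_SELF.Spawn(sys.executable,
                                args=['worker.py'], maxprocs=4)
    inter.bcast({'iteration': i}, root=MPI.ROOT)  # send work to children
    results = inter.gather(None, root=MPI.ROOT)   # collect their results
    inter.Disconnect()                            # release the spawned comm
    print('iteration', i, 'returned', len(results), 'results')

On the verbose-mode question: since the failing file in the log lives under orte/mca/plm, Open MPI's launcher framework, you can raise that framework's verbosity with the MCA parameter plm_base_verbose on the mpirun command line. And if this standalone loop also dies after a few hundred spawns, the bug report practically writes itself.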