Is it possible to see what a Python process is doing? - python

I created a Python script that grabs some info from various websites. Is it possible to measure how long it takes to download the data and how long it takes to write it to a file?
I am interested in knowing how much running it on a better PC could improve things (it is currently running on a crappy old laptop).

If you just want to know how long a process takes to run, the time command is pretty handy. Just run time <command> and it will report how long the command took, broken down into a few categories: wall-clock time, system/kernel time, and user-space time. This won't tell you which parts of the script are taking up that time. You can always use a profiler if you want or need that kind of information.
That said, as Barmar said, if you aren't doing much processing of the sites you are grabbing, the laptop is probably not going to be a limiting factor.
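As a rough illustration of the profiler suggestion, here is a minimal sketch using the standard library's cProfile and pstats modules. The function fetch_and_save is a hypothetical stand-in for the real download-and-write work, and profile_call is a helper name invented for this example.

```python
import cProfile
import io
import pstats


def fetch_and_save():
    """Hypothetical stand-in for the real download-and-write work."""
    data = "".join(str(i) for i in range(100_000))
    return len(data)


def profile_call(func):
    """Run func under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        result = func()
    finally:
        profiler.disable()
    # Collect the report in a string instead of printing it directly.
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(10)
    return result, buffer.getvalue()


if __name__ == "__main__":
    _, report = profile_call(fetch_and_save)
    print(report)
```

For a whole script, running python -m cProfile -s cumulative yourscript.py gives a similar report without changing the code.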

You can always store the system time in a variable before a block of code that you want to test, do it again afterwards, and then compare the two values.
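A minimal sketch of that approach, assuming Python 3 and using time.perf_counter as the clock. The timed helper and the download placeholder are names made up for this example; the real script's download and write steps would go in their place.

```python
import os
import tempfile
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Record the wall-clock duration of the enclosed block under label."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


def download():
    """Hypothetical placeholder for the real network fetch."""
    time.sleep(0.05)
    return "payload"


if __name__ == "__main__":
    timings = {}
    with timed("download", timings):
        data = download()
    with tempfile.TemporaryDirectory() as folder:
        with timed("write", timings):
            with open(os.path.join(folder, "output.txt"), "w") as handle:
                handle.write(data)
    for label, seconds in timings.items():
        print(f"{label}: {seconds:.3f} s")
```

If the download time dominates the write time, a faster PC is unlikely to change much, since the network is the bottleneck.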

Related

Is it possible to force a 2 second looping callback in Python?

I'm trying to get a looping call to run every 2 seconds. Sometimes I get the desired functionality, but other times I have to wait up to ~30 seconds, which is unacceptable for my application's purposes.
I reviewed this SO post and found that looping call might not be reliable for this by default. Is there a way to fix this?
My usage/reason for needing a consistent ~2 seconds:
The function I am calling scans an image (using CV2) for a dollar value and if it finds that amount it sends a websocket message to my point of sale client. I can't have customers waiting 30 seconds for the POS terminal to ask them to pay.
My source code is very long and not well commented as of yet, so here is a short example of what I'm doing:
from twisted.internet import reactor
from twisted.internet.task import LoopingCall

# scan the image for sales every 2 seconds
def scanForSale():
    print("Now Scanning for sale requests")

# retrieve a new image every 2 seconds
def getImagePreview():
    print("Loading Image From Capture Card")

lc = LoopingCall(scanForSale)
lc.start(2)
lc2 = LoopingCall(getImagePreview)
lc2.start(2)
reactor.run()
I'm using a Raspberry Pi 3 for this application, which is why I suspect it hangs for so long. Can I utilize multithreading to fix this issue?
Raspberry Pi is not a real time computing platform. Python is not a real time computing language. Twisted is not a real time computing library.
Any one of these by itself is enough to eliminate the possibility of a guarantee that you can run anything once every two seconds. You can probably get close but just how close depends on many things.
The program you included in your question doesn't actually do much. If this program can't reliably print each of the two messages once every two seconds then presumably you've overloaded your Raspberry Pi - a Linux-based system with multitasking capabilities. You need to scale back your usage of its resources until there are enough available to satisfy the needs of this (or whatever) program.
It's not clear whether multithreading will help - however, I doubt it. It's not clear because you've only included an over-simplified version of your program. I would have to make a lot of wild guesses about what your real program does in order to think about making any suggestions of how to improve it.

Different time taken by Python script every time it is run?

I am working on an OpenCV-based Python project, and I am trying to develop a program that takes less time to execute. To measure this, I timed a small program that just prints hello world. I ran it many times, and every run gave me a different run time.
Can you explain why such a simple program takes a different amount of time to execute each time?
I need my program to be independent of system processes.
Python gets different amounts of system resources depending upon what else the CPU is doing at the time. If you're playing Skyrim with the highest graphics levels at the time, then your script will run slower than if no other programs were open. But even if your task bar is empty, there may be invisible background processes confounding things.
If you're not already using it, consider using timeit. It performs multiple runs of your program in order to smooth out bad runs caused by a busy OS.
If you absolutely insist on requiring your program to run in the same amount of time every time, you'll need to use an OS that doesn't support multitasking. For example, DOS.
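The timeit suggestion can be sketched as follows. The work function is a hypothetical snippet standing in for the code under test, and measure is a helper invented for this example; timeit.repeat runs the callable several rounds so that a single slow round caused by a busy OS stands out.

```python
import statistics
import timeit


def work():
    """Hypothetical snippet under test; replace with the code you care about."""
    return sum(i * i for i in range(1000))


def measure(func, repeat=5, number=1000):
    """Return per-call times in seconds, one value per repeat round."""
    totals = timeit.repeat(func, repeat=repeat, number=number)
    return [total / number for total in totals]


if __name__ == "__main__":
    per_call = measure(work)
    print(f"best:   {min(per_call) * 1e6:.1f} us")
    print(f"median: {statistics.median(per_call) * 1e6:.1f} us")
```

The minimum is usually the most stable figure to compare between runs, because the slower rounds mostly reflect interference from other processes rather than the code itself.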

How do I load up python in the shell from Vim, but not use it right away?

I'm doing TDD, but the system I'm working with takes 6 seconds to get through boilerplate code. This code is not part of my work, nor of my tests (it's Autodesk Maya's headless/batch/CLI Python mode). I've talked to support, and there's no way around the load time, so I thought maybe I could load and initialize Python first in the background, as I code, and then my mapping would simply run the nosetests inside of that when I'm ready. My tests take something like 0.01 seconds, so this should feel instantaneous, which would really help the red/green/refactor cycle.
In short, instead of firing off /path/to/mayapy /path/to/runtests.py /current/buffer/path, Vim would just fire up /path/to/mayapy with the boilerplate stuff from runtests.py, then somehow hold onto that running instance. When I hit my mapping, it would send into that running instance the call to nosetest with the current buffer's path (and then fire up another instance to hold onto while waiting for the next run). How do I hold onto the running instance and call into it later? I'm even considering having a chain of 2 or 3, for the times when I make minor mistakes and rerun 2 seconds later.
Vim-ipython, the excellent work of Paul Ivanov, is an interface between vim and ipython sessions (demo video). This may relieve you of some of the boilerplate of sending buffers to python and waiting on results.
I'm not entirely sure this is exactly what you want, but with a bit of Python and Vim glue code it may be a good step in the right direction. I'm guessing you'd need to experiment a bit to get a workflow you're happy with.

Python watch-dog script : load url asynchronously

I have a simple Python script that checks a few URLs:
f = urllib2.urlopen(urllib2.Request(url))
Since I have the socket timeout set to 5 seconds, it is sometimes annoying to wait 5 seconds * number of URLs for the results.
Is there an easy, standardized way to run those URL checks asynchronously without much overhead? The script must use standard Python components on a vanilla Ubuntu distribution (no additional installations).
Any ideas?
I wrote something called multibench a long time ago. I used it for almost the same thing you want to do here, which was to call multiple concurrent instances of wget and see how long it takes to complete. It is a crude load testing and performance monitoring tool. You will need to adapt this somewhat, because this runs the same command n times.
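The multibench code itself is not shown here. As a rough stand-in that uses only the standard library, the sketch below runs the checks concurrently in a thread pool. It assumes Python 3 (urllib.request and concurrent.futures rather than the urllib2 call in the question), and check_url and check_all are names invented for this example.

```python
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def check_url(url, timeout=5):
    """Return the HTTP status code, or the exception if the request failed."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except Exception as exc:  # URLError, HTTPError, timeouts, ...
        return exc


def check_all(urls, checker=check_url, max_workers=10):
    """Run checker on every URL concurrently and return {url: result}."""
    urls = list(urls)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(checker, urls)
        return dict(zip(urls, results))
```

With this shape, the total wait is roughly one timeout rather than 5 seconds times the number of URLs, as long as max_workers is at least the number of URLs. A call would look like check_all(list_of_urls), where list_of_urls is whatever list the script already iterates over.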
Install additional software. It's a waste of time to re-invent something just because of some packaging decisions made by someone else.

RW-locking a Windows file in Python, so that at most one test instance runs per night

I have written a custom test harness in Python (existing stuff was not a good fit due to lots of custom logic). Windows task scheduler kicks it off once per hour every day. As my tests now take more than 2 hours to run and are growing, I am running into problems. Right now I just check the system time and do nothing unless hour % 3 == 0, but I do not like that. I have a text file that contains:
# This is a comment
LatestTestedBuild = 25100
# Blank lines are skipped too
LatestTestRunStartedDate = 2011_03_26_00:01:21
# This indicates that it has not finished yet.
LatestTestRunFinishDate =
Sometimes I kick off a test manually, and that can happen at any time, including 12:59:59.99.
I want to remove race conditions as much as possible. I would rather put in some extra effort once and not worry about the practical probability of something happening. So, I think locking this text file atomically is the best approach.
I am using Python 2.7, Windows Server 2008R2 Pro and Windows 7 Pro. I prefer not to install extra libraries (Python has not been "sold" to my co-workers yet, but I could copy over a file locally that implements it all, granted that the license permits it).
So, please suggest a good, bullet-proof way to solve this.
When you start running a test make a file called __LOCK__ or something. Delete it when you finish, using a try...finally block to ensure that it always gets cleared up. Don't run the test if the file exists. If the computer crashes or similar, delete the file by hand. I doubt you need more cleverness than that.
Are you sure you need 2 hours of tests?! I think 2 minutes is a more reasonable amount of time to spend, though I guess if you are running some complicated numerics you might need more.
Example code:
import os

if os.path.exists("__LOCK__"):
    raise RuntimeError("Already running.")  # or whatever
try:
    with open("__LOCK__", "w") as lock_file:
        lock_file.write("Put some info here if you want.")
    # ... run the tests here, while the lock file exists ...
finally:
    if os.path.exists("__LOCK__"):
        os.unlink("__LOCK__")
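The check-then-create sequence above still leaves a small window in which two instances can both see no lock file. One possible variation, sketched here under the assumption of Python 3 (FileExistsError does not exist in the asker's Python 2.7), is to create the file with os.O_CREAT | os.O_EXCL so that the existence check and the creation are a single operation. The lock_file helper is a name invented for this example.

```python
import os
from contextlib import contextmanager


@contextmanager
def lock_file(path="__LOCK__"):
    """Hold an exclusive lock file for the duration of the with-block."""
    # O_CREAT | O_EXCL makes the call fail if the file already exists, so
    # checking for the file and creating it happen in one step.
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise RuntimeError("Already running.") from None
    try:
        # Store the process id so a stale lock can be identified by hand.
        with os.fdopen(fd, "w") as handle:
            handle.write(str(os.getpid()))
        yield
    finally:
        os.unlink(path)
```

The test run would then go inside a with lock_file(): block. As with the simpler version, a crash that kills the process outright leaves the file behind, and it has to be deleted by hand.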
