I recently downloaded fogleman's excellent "Minecraft in 500 lines" demo from https://github.com/fogleman/Craft. I used the 2to3 tool and corrected some details by hand to make it runnable under python3. I am now wondering about a thing with the call of self.clear() in the render method. This is my modified rendering method that is called every frame by pyglet:
def on_draw(self):
""" Called by pyglet to draw the canvas.
"""
frameStart = time.time()
self.clear()
clearTime = time.time()
self.set_3d()
glColor3d(1, 1, 1)
self.model.batch.draw()
self.draw_focused_block()
self.set_2d()
self.draw_label()
self.draw_reticle()
renderTime = time.time()
self.clearBuffer.append(str(clearTime - frameStart))
self.renderBuffer.append(str(renderTime - clearTime))
As you can see, I took the execution times of self.clear() and the rest of the rendering method. The call of self.clear() calls this method of pyglet, that can be found at .../pyglet/window/__init__.py:
def clear(self):
'''Clear the window.
This is a convenience method for clearing the color and depth
buffer. The window must be the active context (see `switch_to`).
'''
gl.glClear(gl.GL_COLOR_BUFFER_BIT | gl.GL_DEPTH_BUFFER_BIT)
So I basically do make a call to glClear().
I noticed some frame drops while testing the game (at 60 FPS), so I added the above code to measure the execution time of the commands, and especially that one of glClear(). I found out that the rendering itself never takes longer than 10 ms. But the duration of glClear() is a bit of a different story, here is the distribution for 3 measurements under different conditions:
Duration of glClear() under different conditions.
The magenta lines show the time limit of a frame. So everything behind the first line means there was a frame drop.
The execution time of glClear() seems to have some kind of "echo" after the first frame expires. Can you explain me why? And how can I make the call faster?
Unfortunately I am not an OpenGL expert, so I am thankful for every advice guys. ;)
Your graph is wrong. Well, at least it's not a suitable graph for the purpose of measuring performance. Don't ever trust a gl* function to execute when you tell it to, and don't ever trust it to execute as fast as you'd expect it to.
Most gl* functions aren't executed right away, they couldn't be. Remember, we're dealing with the GPU, telling it to do stuff directly is slow. So, instead, we write a to-do list (a command queue) for the GPU, and dump it into VRAM when we really need the GPU to output something. This "dump" is part of a process called synchronisation, and we can trigger one with glFlush. Though, OpenGL is user friendly (compared to, say, Vulkan, at least), and as such it doesn't rely on us to explicitly flush the command queue. Many gl* functions, which exactly depending on your graphics driver, will implicitly synchronise the CPU and GPU state, which includes a flush.
Since glClear usually initiates a frame, it is possible that your driver thinks it'd be good to perform such an implicit synchronisation. As you might imagine, synchronisation is a very slow process and blocks CPU execution until it's finished.
And this is probably what's going on here. Part of the synchronisation process is to perform memory transactions (like glBufferData, glTexImage*), which are probably queued up until they're flushed with your glClear call. Which makes sense in your example; the spikes that we can observe are probably the frames after you've uploaded a lot of block data.
But bear in mind, this is just pure speculation, and I'm not a total expert on this sort of stuff either, so don't trust me on the exact details. This page on the OpenGL wiki is a good resource on this sort of thing.
Though one thing is certain, your glClear call does not take as long as your profiler says it does. You should be using a profiler dedicated to profiling graphics.
Related
I'm trying to put in scheduled refreshes for a pong game, so the screen updates after a time interval so the paddles (which I have segmented) move in sync on the screen. What is the functional difference between these two blocks of code? I was trying to get the first one to work but wasn't able to get it to do what I wanted but after some experimenting I landed at the second one which does:
pongscreen.tracer(0)
def balltiming():
pongscreen.update()
time.sleep(0.5)
balltiming()
balltiming()
pongscreen.listen()
pongscreen.mainloop()
This second one works, but functionally they seem pretty much the same in what they're doing. Why doesn't the first one work but this does?
pongscreen.tracer(0)
def balltiming():
pongscreen.update()
pongscreen.ontimer(balltiming, 300)
balltiming()
pongscreen.listen()
pongscreen.mainloop()
The first block is genuinely recursive, so it'll blow the call stack after ~1000 frames on a typical CPython implementation.
The second block isn't recursive; the initial balltiming function is completely cleared from the call stack after it sets an ontimer event. Turtle internally calls the next iteration of balltiming that has been registered as a callback to ontimer without unbounded stack growth.
If you want to use sleep manually, use a while loop rather than recursion. However, sleep blocks the thread and is less amenable to precise framerate control, so generally ontimer is used.
A consequence of the blocking recursive calls and sleep is that in the first code block, pongscreen.listen() and pongscreen.mainloop() won't be reached, while in the second version, both will be reached and the main thread will block at pongscreen.mainloop(); the typical flow for a turtle program.
I would like to measure the coverage of my Python code which gets executed in the production system.
I want an answer to this question:
Which lines get executed often (hot spots) and which lines are never used (dead code)?
Of course this must not slow down my production site.
I am not talking about measuring the coverage of tests.
I assume you are not talking about test suite code coverage which the other answer is referring to. That is a job for CI indeed.
If you want to know which code paths are hit often in your production system, then you're going to have to do some instrumentation / profiling. This will have a cost. You cannot add measurements for free. You can do it cheaply though and typically you would only run it for short amounts of time, long enough until you have your data.
Python has cProfile to do full profiling, measuring call counts per function etc. This will give you the most accurate data but will likely have relatively high impact on performance.
Alternatively, you can do statistical profiling which basically means you sample the stack on a timer instead of instrumenting everything. This can be much cheaper, even with high sampling rate! The downside of course is a loss of precision.
Even though it is surprisingly easy to do in Python, this stuff is still a bit much to put into an answer here. There is an excellent blog post by the Nylas team on this exact topic though.
The sampler below was lifted from the Nylas blog with some tweaks. After you start it, it fires an interrupt every millisecond and records the current call stack:
import collections
import signal
class Sampler(object):
def __init__(self, interval=0.001):
self.stack_counts = collections.defaultdict(int)
self.interval = interval
def start(self):
signal.signal(signal.VTALRM, self._sample)
signal.setitimer(signal.ITIMER_VIRTUAL, self.interval, 0)
def _sample(self, signum, frame):
stack = []
while frame is not None:
formatted_frame = '{}({})'.format(
frame.f_code.co_name,
frame.f_globals.get('__name__'))
stack.append(formatted_frame)
frame = frame.f_back
formatted_stack = ';'.join(reversed(stack))
self.stack_counts[formatted_stack] += 1
signal.setitimer(signal.ITIMER_VIRTUAL, self.interval, 0)
You inspect stack_counts to see what your program has been up to. This data can be plotted in a flame-graph which makes it really obvious to see in which code paths your program is spending the most time.
If i understand it right you want to learn which parts of your application is used most often by users.
TL;DR;
Use one of the metrics frameworks for python if you do not want to do it by hand. Some of them are above:
DataDog
Prometheus
Prometheus Python Client
Splunk
It is usually done by function level and it actually depends on application;
If it is a desktop app with internet access:
You can create a simple db and collect how many times your functions are called. For accomplish it you can write a simple function and call it inside every function that you want to track. After that you can define an asynchronous task to upload your data to internet.
If it is a web application:
You can track which functions are called from js (mostly preferred for user behaviour tracking) or from web api. It is a good practice to start from outer to go inner. First detect which end points are frequently called (If you are using a proxy like nginx you can analyze server logs to gather information. It is the easiest and cleanest way). After that insert a logger to every other function that you want to track and simply analyze your logs for every week or month.
But you want to analyze your production code line by line (it is a very bad idea) you can start your application with python profilers. Python has one already: cProfile.
Maybe make a text file and through your every program method just append some text referenced to it like "Method one executed". Run the web application like 10 times thoroughly as a viewer would and after this make a python program that reads the file and counts a specific parts of it or maybe even a pattern and adds it to a variable and outputs the variables.
After I hit n to evaluate a line, I want to go back and then hit s to step into that function if it failed. Is this possible?
The docs say:
j(ump) lineno
Set the next line that will be executed. Only available in the bottom-most frame. This lets you jump back and execute code again, or jump forward to skip code that you don’t want to run.
The GNU debugger, gdb: It is extremely slow, as it undoes single machine instruction at a time.
The Python debugger, pdb: The jump command takes you backwards in the code, but does not reverse the state of the program.
For Python, the extended python debugger prototype, epdb, was created for this reason. Here is the thesis and here is the program and the code.
I used epdb as a starting point to create a live reverse debugger as part of my MSc degree. The thesis is available online: Combining reverse debugging and live programming towards visual thinking in computer programming. In chapter 1 and 2 I also cover most of the historical approaches to reverse debugging.
PyPy has started to implement RevDB, which supports reverse debugging.
It is (as of Feb 2017) still at an alpha stage, only supports Python 2.7, only works on Linux or OS X, and requires you to build Python yourself from a special revision. It's also very slow and uses a lot of RAM. To quote the Bitbucket page:
Note that the log file typically grows at a rate of 1-2 MB per second. Assuming size is not a problem, the limiting factor are:
Replaying time. If your recorded execution took more than a few minutes, replaying will be painfully slow. It sometimes needs to go over the whole log several times in a single session. If the bug occurs randomly but rarely, you should run recording for a few minutes, then kill the process and try again, repeatedly until you get the crash.
RAM usage for replaying. The RAM requirements are 10 or 15 times larger for replaying than for recording. If that is too much, you can try with a lower value for MAX_SUBPROCESSES in _revdb/process.py, but it will always be several times larger.
Details are on the PyPy blog and installation and usage instructions are on the RevDB bitbucket page.
Reverse debugging (returning to previously recorded application state or backwards single-stepping debugging) is generally an assembly or C level debugger feature. E.g. gdb can do it:
https://sourceware.org/gdb/wiki/ReverseDebug
Bidirectional (or reverse) debugging
Reverse debugging is utterly complex, and may have performance penalty of 50.000x. It also requires extensive support from the debugging tools. Python virtual machine does not provide the reverse debugging support.
If you are interactively evaluation Python code I suggest trying IPython Notebook which provide HTML-based interactive Python shells. You can easily write your code and mix and match the order. There is no pdb debugging support, though. There is ipdb which provides better history and search facilities for entered debugging commands, but it doesn't do direct backwards jumps either as far as I know.
Though you may not be able to reverse the code execution in time, the next best thing pdb has are the stack frame jumps.
Use w to see where you're in the stack frame (bottom is the newest), and u(p) or d(own) to traverse up the stackframe to access the frame where the function call stepped you into the current frame.
I do not want to lose my sets if windows is about to shutdown/restart/log off/sleep, Is it possible to save it before shutdown? Or is there an alternative to save information without worring it will get lost on windows shutdown? JSON, CSV, DB? Anything?
s = {1,2,3,4}
with open("s.pick","wb") as f: # pickle it to file when PC about to shutdown to save information
pickle.dump(s,f)
I do not want to lose my sets if windows is about to shutdown/restart/log off/sleep, Is it possible to save it before shutdown?
Yes, if you've built an app with a message loop, you can receive the WM_QUERYENDSESSION message. If you want to have a GUI, most GUI libraries will probably wrap this up in their own way. If you don't need a GUI, your simplest solution is probably to use PyWin32. Somewhere in the docs there's a tutorial on creating a hidden window and writing a simple message loop. Just do that on the main thread, and do your real work on a background thread, and signal your background thread when a WM_QUERYENDSESSION message comes in.
Or, much more simply, as Evgeny Prokurat suggests, just use SetConsoleCtrlHandler (again through PyWin32). This can also catch ^C, ^BREAK, and the user closing your console, as well as the logoff and shutdown messages that WM_QUERYENDSESSION catches. More importantly, it doesn't require a message loop, so if you don't have any other need for one, it's a lot simpler.
Or is there an alternative to save information without worring it will get lost on windows shutdown? JSON, CSV, DB? Anything?
The file format isn't going to magically solve anything. However, a database could have two advantages.
First, you can reduce the problem by writing as often as possible. But with most file formats, that means rewriting the whole file as often as possible, which will be very slow. The solution is to streaming to a simpler "journal" file, packing that into the real file less often, and looking for a leftover journal at every launch. You can do that manually, but a database will usually do that for you automatically.
Second, if you get killed in the middle of a write, you end up with half a file. You can solve that by the atomic writing trick—write a temporary file, then replace the old file with the temporary—but this is hard to get right on Windows (especially with Python 2.x) (see Getting atomic writes right), and again, a database will usually do it for you.
The "right" way to do this is to create a new window class with a msgproc that dispatches to your handler on WM_QUERYENDSESSION. Just as MFC makes this easier than raw Win32 API code, win32ui (which wraps MFC) makes this easier than win32api/win32gui (which wraps raw Win32 API). And you can find lots of samples for that (e.g., a quick search for "pywin32 msgproc example" turned up examples like this, and searches for "python win32ui" and similar terms worked just as well).
However, in this case, you don't have a window that you want to act like a normal window, so it may be easier to go right to the low level and write a quick&dirty message loop. Unfortunately, that's a lot harder to find sample code for—you basically have to search the native APIs for C sample code (like Creating a Message Loop at MSDN), then figure out how to translate that to Python with the pywin32 documentation. Less than ideal, especially if you don't know C, but not that hard. Here's an example to get you started:
def msgloop():
while True:
msg = win32gui.GetMessage(None, 0, 0)
if msg and msg.message == win32con.WM_QUERYENDSESSION:
handle_shutdown()
win32api.TranslateMessage(msg)
win32api.DispatchMessage(msg)
if msg and msg.message == win32con.WM_QUIT:
return msg.wparam
worker = threading.Thread(real_program)
worker.start()
exitcode = msgloop()
worker.join()
sys.exit(exitcode)
I haven't shown the "how to create a minimal hidden window" part, or how to signal the worker to stop with, e.g., a threading.Condition, because there are a lot more (and easier-to-find) good samples for those parts; this is the tricky part to find.
you can detect windows shutdown/log off with win32api.setConsoleCtrlHandler
there is a good example How To Catch “Kill” Events with Python
I would like to perform a measurement and plot a graph while the measurement is
running. This measurements takes quite some time in python (it has to retrieve data over a slow connection). The problem is that the graph freezes when measuring. The measurement
consists of setting a center wavelength, and then measuring some signal.
My program looks something like this:
# this is just some arbitrary library that has the functions set_wavelength and
# perform_measurement
from measurement_module import set_wavelength, perform_measurement
from pylab import *
xdata = np.linspace(600,1000,30) # this will be the x axis
ydata = np.zeros(len(xdata)) # this will be the y data. It will
for i in range(len(xdata)):
# this call takes approx 1 s
set_wavelength(xdata[i])
# this takes approx 10 s
ydata[i] = perform_measurement(xdata)
# now I would like to plot the measured data
plot(xdata,ydata)
draw()
This will work when it is run in IPython with the -pylab module switched on,
but while the measurement is running the figure will freeze. How can modify
the behaviour to have an interactive plot while measuring?
You cannot simply use pylab.ion(), because python is busy while performing the measurements.
regards,
Dirk
You can, though maybe a bit awkward, run the data-gathering as a serparate process. I find Popen in the subprocess module quite handy. Then let that data-gathering script save what it does to disk somewhere and you use
Popen.poll()
To check if it has completed.
It ought to work.
I recommend buffering the data in large chunks and render/re-render when the buffer fills up. If you want it to be nonblocking look at greenlets.
from gevent.greenlet import Greenlet
import copy
def render(buffer):
'''
do rendering stuff
'''
pass
buff = ''
while not_finished:
buff = connection.read()
g = Greenlet(render, copy.deepcopy(buff))
g.start()
Slow input and output is the perfect time to use threads and queues in Python. Threads have there limitations, but this is the case where they work easily and effectively.
Outline of how to do this:
Generally the GUI (e.g., the matplotlib window) needs to be in the main thread, so do the data collection in a second thread. In the data thread, check for new data coming in (and if you do this in some type of infinite polling loop, put in a short time.sleep to release the thread occasionally). Then, whenever needed, let the main thread know that there's some new data to be processed/displayed. Exactly how to do this depends on details of your program and your GUI, etc. You could just use a flag in the data thread that you check for from the main thread, or a theading.Event, or, e.g., if you have a wx backend for matplotlib wx.CallAfter is easy. I recommend looking through one of the many Python threading tutorials to get a sense of it, and also threading with a GUI usually has a few issues too so just do a quick google on threading with your particular backend. This sounds cumbersome as I explain it so briefly, but it's really pretty easy and powerful, and will be smoother than, e.g., reading and writing to the same file from different processes.
Take a look at Traits and Chaco, Enthought's type system and plotting library. They provide a nice abstraction to solve the problem you're running into. A Chaco plot will update itself whenever any of its dependencies change.