With OpenCV I've written a script that:
captures a running application in real time and displays it in a window
searches for a specific image within the window
if it finds the image, it draws a box around it; otherwise it does nothing
I want to extend this functionality to be able to identify multiple images in the same window at once.
As I understand it I can either:
store the images in a list and iterate over it, running the detection for each one (a sketch follows this list)
use the same functionality noted above, in a different thread.
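A minimal sketch of the first option, assuming the detection step is cv2.matchTemplate() (the question doesn't say, so that is an assumption); grab_frame() and the filenames are placeholders for the existing capture code:

```python
import cv2

# Hypothetical template files, loaded once up front.
templates = [cv2.imread(name) for name in ("icon_a.png", "icon_b.png")]

frame = grab_frame()  # hypothetical: returns the current window as a BGR array
for tpl in templates:
    h, w = tpl.shape[:2]
    result = cv2.matchTemplate(frame, tpl, cv2.TM_CCOEFF_NORMED)
    _, max_val, _, max_loc = cv2.minMaxLoc(result)
    if max_val >= 0.9:  # confidence threshold is illustrative
        x, y = max_loc
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
```

For a handful of small templates this loop is usually cheap enough per frame that threading buys little, which matches the intuition below that performance won't be the deciding factor.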
What do I need to take into consideration when deciding on the approach? Is there a best practice to follow, or anything I need to be aware of, before deciding on a course?
I've found a number of topics on the subject, but they're all quite old and mostly deal with sensors or raw performance, or are in other languages. It's a 2D image I'm detecting, so I don't think performance is going to be an issue.
When is it appropriate to multi-thread?
When to thread and when to WAR?
Python: Multi-threading or Loops?
When multi-threading is a bad idea?
To multi-thread or not to multi-thread!
Goal: I have one large, interactive image depicting several different buttons/symbols. Depending on which symbol it is, I want my program to click on the right one.
My attempt so far: Using cv2.matchTemplate() in Python, I could successfully identify everything that I needed to, as well as create copies that framed the right template with rectangles. No problems with the detection itself; the symbols are quite simple.
So far so good, but how do I proceed from there to interact with the original image on which I want to perform the clicks? I know there are several modules to control mouse events, but the problem is that all the detection is based on the copy, not on the interactive original.
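For context, the detection described boils down to something like the sketch below; the key point for clicking is that if the capture is a full-screen grab, a match location is already a screen coordinate (the filenames and the 0.9 threshold are made up):

```python
import cv2
import numpy as np

scene = cv2.imread("board.png")      # the captured copy (hypothetical file)
template = cv2.imread("symbol.png")  # one symbol to detect
h, w = template.shape[:2]

result = cv2.matchTemplate(scene, template, cv2.TM_CCOEFF_NORMED)
for y, x in zip(*np.where(result >= 0.9)):
    cv2.rectangle(scene, (x, y), (x + w, y + h), (0, 255, 0), 2)
    # (x + w // 2, y + h // 2) is the symbol's center; if the capture maps
    # 1:1 to the screen, a mouse-control module can click that same point.
```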
I'm using Python 2.7, PyGTK 2.24, and PyGst (GStreamer).
To ensure smooth playback from one clip to another (without a blink), I combined all the clips I needed into one larger video. This lets me seek to the exact place I need in code. One of the clips is like a "fill-in", which should loop whenever one of the other clips is not playing.
However, to make my code easier and more streamlined, I want to use segments to define the various clips within the larger video. Then, at the end of each segment (I know there is a segment end event), I seek to the fill-in clip. When I need another clip, I just seek to that segment.
My question is, how exactly do I create these segments? I'm guessing that would be event_new_new_segment(), but I'm not sure. Can I create multiple clips to seek between using this function? Is there another I should use? Are there any gotchas to this method of seeking in my video that I should be aware of?
Second, how do I seek to that segment?
Thank you!
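For reference, the standard route in GStreamer 0.10 is a segment seek rather than hand-built segment events: seek with the SEGMENT flag, then listen for SEGMENT_DONE on the bus and seek again. A rough, untested PyGst sketch; the clip boundaries are illustrative:

```python
import gst

FILL_IN_START, FILL_IN_STOP = 0, 10  # placeholder clip boundaries, in seconds

def play_segment(pipeline, start_s, stop_s, flush=True):
    # With SEEK_FLAG_SEGMENT, playback stops at stop_s and the pipeline
    # posts SEGMENT_DONE instead of going EOS.
    flags = gst.SEEK_FLAG_SEGMENT
    if flush:
        flags |= gst.SEEK_FLAG_FLUSH
    pipeline.seek(1.0, gst.FORMAT_TIME, flags,
                  gst.SEEK_TYPE_SET, int(start_s * gst.SECOND),
                  gst.SEEK_TYPE_SET, int(stop_s * gst.SECOND))

def on_message(bus, message, pipeline):
    if message.type == gst.MESSAGE_SEGMENT_DONE:
        # Chain straight into the fill-in clip; skipping the flush here
        # is what avoids a visible blink at the join.
        play_segment(pipeline, FILL_IN_START, FILL_IN_STOP, flush=False)
```

The usual wiring is pipeline.get_bus().add_signal_watch() plus connecting the bus's "message" signal to on_message.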
It looks like only GstElements can generate NEWSEGMENT events; you can't simply attach one to an existing element. The closest thing you could do, if not using Python, would be to create a single-shot or periodic GstClockID and use gst_clock_id_wait_async() until the clock time hits. The problem is that GstClockID is not wrapped in PyGst.
I think I'm actually working on a similar problem. The solution I'm using now is gluing video streams together in real time with gnonlin. The good side: it seems to work, though I haven't had time to test it thoroughly yet. The bad side: it's poorly documented and buggy. These sources from the Flumotion project (and the comments inside!) were very, very helpful to me in understanding how to make the whole thing work.
I'm working on a Minecraft-style game, and I need a way to reduce the amount of the world rendered. Currently, I'm using a naive, render-everything approach, which is having obvious scaling problems. I need a way to take an array of blocks, and in some way find out which ones are touching air, water, or any other translucent block.
I'm open to using external modules like NumPy or SciPy, though some of their documentation is a bit over my head. Alternatively, it would also be acceptable to iterate through each block and get a list of neighbors, though the performance cost of doing these calculations in Python instead of C would be pretty hefty.
For the record, I've already tried looking at NetworkX, but it seems to be more for scientific analysis or pathfinding than visibility checking.
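Since NumPy is on the table, here is one hedged sketch of the array route: represent transparency as a 3D boolean array, shift it one cell along each axis, and OR the shifts together, so no per-block Python loop is needed. Treating the world edge as exposed (the padding below) is an assumption:

```python
import numpy as np

def boundary_mask(transparent):
    """Mask of opaque blocks that touch at least one transparent neighbor.

    `transparent` is a 3D bool array: True for air/water/translucent blocks.
    """
    # Pad with True so blocks on the world edge count as exposed (assumption).
    t = np.pad(transparent, 1, mode="constant", constant_values=True)
    core = (slice(1, -1),) * 3
    exposed = np.zeros(transparent.shape, dtype=bool)
    for axis in range(3):
        for shift in (1, -1):
            exposed |= np.roll(t, shift, axis=axis)[core]
    return exposed & ~transparent
```

Only blocks where the mask is True need to go to the renderer, and on a world edit you only need to recompute the immediate neighborhood of the changed cell, which is the incremental update the answer below suggests.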
If you only need to do this once, performance should not be an issue. If you also incrementally update the .isBoundary property of blocks whenever the world is changed, you will never have to do it again.
However, you'll still run into issues if your world is too large, or full of holes, caverns, and transparent blocks interleaved with non-transparent ones. If you need to dynamically determine what is visible, you can keep an octree (http://en.wikipedia.org/wiki/Octree) in which giant expanses of air/water/etc. are stored as single nodes (giant blocks) labeled "transparent". Then use the "paint bucket" algorithm, modified to perform Dijkstra's algorithm (so it is easy to detect when you've "gone around a corner" by checking whether blocks exist between the current block and the origin), to quickly figure out which blocks are in sight. Updates for things far in the distance can be delayed significantly if the player is moving slowly.
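A plain flood-fill version of that idea, without the octree or the Dijkstra corner test described above, might look like this:

```python
from collections import deque

def visible_blocks(transparent, start):
    """Flood-fill ('paint bucket') from the viewer's cell through transparent
    cells; opaque cells the fill touches are candidates for rendering."""
    seen = {start}
    visible = set()
    queue = deque([start])
    dims = transparent.shape
    while queue:
        x, y, z = queue.popleft()
        for step in ((1, 0, 0), (-1, 0, 0), (0, 1, 0),
                     (0, -1, 0), (0, 0, 1), (0, 0, -1)):
            n = (x + step[0], y + step[1], z + step[2])
            if all(0 <= c < d for c, d in zip(n, dims)) and n not in seen:
                seen.add(n)
                if transparent[n]:
                    queue.append(n)   # keep flooding through air/water
                else:
                    visible.add(n)    # opaque block with an exposed face
    return visible
```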
You could use a Z-buffer solution. Concerning speed, I'd write as much as possible in Python and use PyPy. EVE Online (a successful 3D MMO) was written in Stackless Python, I believe.
Background
I have been asked by a client to create a picture of the world with animated arrows/rays that travel from one part of the world to another.
The rays will be randomized, will represent a transaction, will fade out after they happen, and will increase in frequency as time goes on. The rays will start in one country's boundary and end in another's. As each animated transaction happens, a continuously updating sum of the amounts of all the transactions will be shown at the bottom of the image. The amounts of the individual transactions will be randomized. There will also be a year showing on the image that will increment every n seconds.
The randomization, summation and incrementing are not a problem for me, but I am at a loss as to how to approach the animation of the arrows/rays.
My question is: what is the best way to do this? Which frameworks/libraries are best suited to the job?
I am most fluent in Python, so Python suggestions are easiest for me, but I am open to any elegant way to do this.
The client will present this as a slide in a presentation on a Windows machine.
"The client will present this as a slide in a presentation on a Windows machine."
I think this is the key to your answer. Before going to a 3D implementation and writing all the code in the world to create this feature, you need to look at the presentation software. Chances are, your options will boil down to two things:
Animated GIF
Custom Presentation Scripts
Obviously, an animated GIF is not ideal, since it loops as soon as it finishes rendering, and making it last a long time would make for a large file.
A custom presentation script would probably be the other way to let him bring it up in a presentation without running any side programs or doing anything strange. I'm not sure which presentation application is the target, but that could be valuable information.
He sounds fairly non-technical, and like he's requesting something he doesn't realize will be difficult. I think you should come up with some options, explain the difficulty of implementing them, and suggest an alternative that falls into the 'bang for your buck' range.
If you are adventurous, use OpenGL :)
You can draw Bezier curves in 3D space on top of a textured plane (an earth map), give them a thickness, and draw a point (a small cone) at the end. It's easy and it looks nice. The catch is learning the basics of OpenGL if you haven't used it before, but that would be fun, and probably useful if you're into graphics programming.
You can use OpenGL from Python with either PyOpenGL or pyglet.
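For the curve itself, a minimal PyOpenGL sketch in legacy immediate mode; window/context setup (pyglet or GLUT) is omitted:

```python
from OpenGL.GL import GL_LINE_STRIP, glBegin, glEnd, glVertex3f

def draw_quadratic_bezier(p0, p1, p2, steps=32):
    """Evaluate a quadratic Bezier through control points p0, p1, p2
    and draw it as a line strip."""
    glBegin(GL_LINE_STRIP)
    for i in range(steps + 1):
        t = i / float(steps)
        x, y, z = ((1 - t) ** 2 * a + 2 * (1 - t) * t * b + t ** 2 * c
                   for a, b, c in zip(p0, p1, p2))
        glVertex3f(x, y, z)
    glEnd()
```

Lifting the middle control point above the textured plane is what gives the arcing ray between two points on the map.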
If you make the animation this way, you can capture it to an AVI file (using Camtasia or something similar) that can be put onto a presentation slide.
It depends largely on the effort you want to expend on this, but the basic outline of an easy way would be to load an image of an arrow and use a drawing library to color and rotate it in the direction you want it to point (or draw it using shapes/curves).
Finally, to actually animate it, interpolate between the coordinates based on time.
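That interpolation is just a lerp on the arrow's position; a tiny sketch:

```python
def lerp(a, b, t):
    return a + (b - a) * t

def arrow_position(start, end, elapsed, duration):
    """Where the arrow is `elapsed` seconds into a `duration`-second flight."""
    t = min(elapsed / float(duration), 1.0)
    return (lerp(start[0], end[0], t), lerp(start[1], end[1], t))
```

Feeding t through an easing function instead of using it raw makes the motion look less mechanical.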
If it's just for a presentation, though, I would use Macromedia Flash or a similar animation program (it would do the same as above, but you don't need to program anything).
I want to automate playing a video game with Python. I want to write a script that can grab the screen image, diff it with the next frame, and track an object to click on. What libraries would be useful for this other than PIL?
There are a few options here. The brute-force diffing approach will lead to a lot of frustration unless what you're tracking is very consistent. For that, you could use any number of genetic approaches to train your program on what to follow; after enough generations it would do the right thing reliably. If the thing you want to track is visually obvious (like a red ball on a white screen), you could detect it yourself through simple brute-force scanning of the bitmap.
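A bare-bones version of that frame diff with PIL (ImageGrab works only on Windows and macOS; the loop body is illustrative):

```python
from PIL import ImageGrab, ImageChops

prev = ImageGrab.grab()
while True:
    frame = ImageGrab.grab()
    diff = ImageChops.difference(frame, prev)
    bbox = diff.getbbox()  # bounding box of every changed pixel, or None
    if bbox:
        print("motion in", bbox)  # decide whether/where to click here
    prev = frame
```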
Another approach would be looking at the memory of the running app and figuring out what area is controlling the position of your object. For some more info and ideas on this, see how Mumble got 3D positional audio working in various games:
http://mumble.sourceforge.net/HackPositionalAudio
The answer would depend on the platform and the game, too.
For example, I once did something similar for a helicopter Flash game, since it was a very simple 2D game with a well-defined colored maze.
It ran on Windows, grabbing the screen via copy-to-clipboard and sending Win32 key events using the win32api bindings for Python.
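Roughly how those Windows pieces fit together with the win32api bindings; here PIL's ImageGrab stands in for the clipboard trick, and the key code and timing are illustrative:

```python
import time
import win32api
import win32con
from PIL import ImageGrab

def press(vk):
    win32api.keybd_event(vk, 0, 0, 0)                         # key down
    win32api.keybd_event(vk, 0, win32con.KEYEVENTF_KEYUP, 0)  # key up

while True:
    frame = ImageGrab.grab()        # instead of PrintScreen + clipboard
    # ... inspect frame's pixels and decide what the helicopter should do ...
    press(win32con.VK_SPACE)        # illustrative: tap to climb
    time.sleep(0.05)
```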