So I have basically built a program that detects and maps the positions of NBA players on the court.
Here is an example of it working. My algorithm works well when the camera movement is not too fast, but player movement during transitions is fast and the camera follows it, so the picture quality becomes bad/blurry and SIFT has problems detecting and matching keypoints. I am looking for an algorithm or approach that would fix the outlier points produced by those sudden transitions, but not change the points recorded when the players are on either side of the floor, since those are correct. This can be done either during processing or in post-processing, but I am struggling to find a good solution that matches the above criteria.
The points can be fixed in any way, be it by correcting the recorded coordinates or by improving the detection itself.
I don't have enough reputation to comment, so I'll have to answer.
If I understand correctly, you're using SIFT to detect players in images, and also to match the players between two consecutive images to track them. Is that correct?
And due to the blurriness (linked to movement), sometimes SIFT will fail and you "lose" some players for a few frames, is that correct?
For example, if you have a player's coordinates at frames 53 and 56, but fail to track him for frames 54 and 55, you want an algorithm to find the positions for those missing frames?
If that is so, why can't you use simple linear interpolation on the 2D position in court space? Your camera basically works at 20-30 FPS, so a player won't move very far between two consecutive frames.
Considering positions in court space, if you're looking for the position at timestamp i and have the positions at i-1 and i+1, then:
x(i) = (x(i-1) + x(i+1))/2
y(i) = (y(i-1) + y(i+1))/2
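For gaps longer than one frame, the same idea extends to linear interpolation across the whole gap. Here is a minimal sketch, assuming the court-space track is stored per frame with None for the frames where tracking failed (the function and variable names are illustrative):

import numpy as np

def fill_gaps(track):
    """Linearly interpolate missing (None) court positions in a track.

    track: list of (x, y) tuples, or None for frames where tracking failed.
    """
    frames = np.arange(len(track))
    known = np.array([i for i, p in enumerate(track) if p is not None])
    xs = np.array([track[i][0] for i in known], dtype=float)
    ys = np.array([track[i][1] for i in known], dtype=float)
    # np.interp fills every missing frame from its nearest known neighbours
    x_full = np.interp(frames, known, xs)
    y_full = np.interp(frames, known, ys)
    return list(zip(x_full, y_full))

# e.g. positions known at frames 53 and 56, missing at 54 and 55:
track = [(10.0, 4.0), None, None, (13.0, 7.0)]
print(fill_gaps(track))  # [(10.0, 4.0), (11.0, 5.0), (12.0, 6.0), (13.0, 7.0)]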
I have an image that I've logically divided into a 100-by-100 grid. In each cell of the grid, I want to find the most commonly occurring color. My current algorithm uses a hash map to calculate the mode in linear time, but processing the whole image still takes a ridiculous amount of time!
I'm working on a little grid-based game, and I wanted to allow the players to import an image to color the grid with afterwards.
Is there a better algorithm out there that I should know about?
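For reference, a minimal sketch of the per-cell hash-map mode described above, assuming the image is a NumPy RGB array whose dimensions divide evenly by the grid (all names here are illustrative):

import numpy as np
from collections import Counter

def grid_modes(img, rows=100, cols=100):
    """Most common color in each cell of a rows x cols grid over img."""
    h, w = img.shape[:2]
    ch, cw = h // rows, w // cols
    modes = {}
    for r in range(rows):
        for c in range(cols):
            cell = img[r*ch:(r+1)*ch, c*cw:(c+1)*cw]
            # hash map (Counter) over the cell's pixels: mode in linear time
            counts = Counter(map(tuple, cell.reshape(-1, 3)))
            modes[(r, c)] = counts.most_common(1)[0][0]
    return modes

Note that with this structure the asymptotic cost is already linear in the number of pixels; in Python, the per-pixel interpreter overhead is usually the real bottleneck rather than the algorithm itself.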
I have a device which I am reading from. Currently it's just a test device used to implement a GUI (PyQt/PySide2). I am using PyQtGraph to display plots.
This is the update function (simplified for better readability):
def update(self, line):
    self.data_segment[self.ptr] = line[1]                 # gets a new line from a Plot-Manager which updates all plots
    self.ptr += 1                                         # counts the number of samples
    self.line_plot.setData(self.data_segment[:self.ptr])  # displays all read samples
    self.line_plot.setPos(-self.ptr, 0)                   # shifts the plot to the left so it scrolls
I have an algorithm that deletes the first x values of the array and saves them into a temp file. Currently the maximum amount of available data is 100k. If the user is zoomed in and only sees a part of the plot, there is no problem: no lagging plot.
But the more points are displayed (a bigger x-range), the more it lags: lagging plot.
Especially when I set the width of the scrolling plot to < 1, it starts lagging much faster. Note that this is just a test plot; the actual plot will be more complex, but the peaks will be important as well, so losing data is not acceptable.
I need an algorithm that resamples the data with little or no loss of information and displays only the visible points, rather than calculating 100k points which aren't visible anyway, wasting performance for no gain.
This seems like a basic problem to me, but I can't seem to find a solution for it somehow... My knowledge of signal processing is very limited, which is why I might not be able to find anything on the web. I may even have taken the wrong approach to this problem.
EDIT
This is what I mean by "invisible points"
invisible points
As a simple modification of what you are doing, you could try something like this:
def update(self, line):
    # Get new data and update the counter
    self.data_segment[self.ptr] = line[1]
    self.ptr += 1

    # Update the graph to show the last 256 samples
    n = min(256, len(self.data_segment))
    self.line_plot.setData(self.data_segment[-n:])
For an explicit downsampling of the data, you can try this:

import scipy.signal
resampled_data = scipy.signal.resample(data, NumberOfPixels)

or, to downsample the most recent set of N points:

n = min(N, len(self.data_segment))
newdata = scipy.signal.resample(self.data_segment[-n:], NumberOfPixels)
self.line_plot.setData(newdata)
However, a good graphics engine should do this for you automatically.
A caveat with resampling or downsampling is that the original data should not contain information or features on a scale that is too fast for the new scale after you resample or downsample. If it does, the features will run together and you will get something that looks like your second graph.
Some general comments on coding signal acquisition, processing and display
It seems perhaps useful at this point to offer some general comments on working with and displaying signals.
In any signal acquisition, processing and display coding task, the architect or coder (sometimes by default) should understand (a) something of the physical phenomenon represented by the data, (b) how the information will be used, and (c) the physical characteristics of the measurement, signal processing, and display systems (cf. bandwidths, sampling rates, dynamic range, noise characteristics, aliasing, effects of pixelation, and so forth).
This is a large subject, and not often completely described in any one textbook. It seems to take some experience to pull it all together. Moreover, it seems to me that if you don't understand a measurement well enough to code it yourself, then you also don't know enough to use or rely on a canned routine. In other words, there is no substitute for understanding, and the canned routine should be only a convenience and not a crutch. Even for the resampling algorithm suggested above, I would encourage its user to understand how it works and how it affects their signal.
In this particular example, we learn that the application is cardiography, type unspecified, and that a great deal of latitude is left to the coder. As the coder, then, we should try to learn about these kinds of measurements (cf. the heart in general, and electro-, acoustic-, and echocardiography), how they are performed and used, and try to find some examples.
P.S. For anyone working with digital filters, if you have not formally studied the subject, it might be useful to read the book "Digital Filters" by Hamming. It's available as an affordable Dover edition.
Pyqtgraph has downsampling implemented:
self.line_plot.setDownsampling(auto=True, method='peak')
Depending on how you created the line, you might instead have to use
self.line_plot.setDownsampling(auto=True, mode='peak')
There are other methods/modes available.
What can also slow down the drawing (and the reactiveness of the UI) is continuously moving the shown XRange. Simply updating the position only every x ms or every x samples can help in that case (a small sketch follows). The same goes for updating the plots.
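A minimal sketch of that kind of throttling, assuming the update() method from the question (the interval of 10 samples is illustrative):

def update(self, line):
    self.data_segment[self.ptr] = line[1]
    self.ptr += 1
    # only redraw and shift the view every 10th sample instead of every sample
    if self.ptr % 10 == 0:
        self.line_plot.setData(self.data_segment[:self.ptr])
        self.line_plot.setPos(-self.ptr, 0)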
I use pyqtgraph to plot the live data coming in from three vibration sensors with a sampling rate of 12800 samples/second. For the plot I viewed a time window of 10 seconds per sensor (so a total of 384000 samples). The time shown includes reading the data, plotting it, regularly calculating and plotting FFTs, writing to a database, etc. For the "no downsampling" part, I turned off the downsampling for one of the three plots.
It is more than fast enough that I haven't bothered with multithreading or anything like that.
I recently worked on code that displays a simulation of particles' motion in a periodic space. Concretely, it produces a 2D plot of N points (N ~ 10^4), initially gathered at the center, which then spread out according to their assigned velocities. As it is a periodic space, any point that goes beyond the upper limit is brought back to the lower limit, and vice versa. To illustrate, here are two images:
Initial positions
After a certain time
Each point is supposed to travel horizontally, either to the right or to the left (positive or negative velocity, respectively).
I programmed it in Python, but now, in the scope of my project, I'd like to simulate the same thing on a torus. To give you a good idea of what this looks like, please take a look at the following picture:
Transformation from a rectangle to a torus
(Imagine my initial 2D plane is the initial rectangle, which I'd like to transform into the final torus.)
In that case we would see every particle moving on the surface of the torus. The first picture above would correspond to particles gathered on a "single" circle of the torus, and the second picture would correspond to "filling up" the entire surface of the torus.
Since my code for the previous simulations was written in Python, I am wondering if I can still use it for this task. If so, I'd like some clues about how to do it; otherwise, what would be the best language to use for this?
I hope I have been clear. I apologize in advance for any mistakes I may have made in English.
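For what it's worth, the dynamics don't change at all: a rectangle with periodic boundary conditions already is a (flat) torus, so the existing Python code can be kept and the torus is only a different visualization of the same periodic unit cell. A minimal sketch with NumPy and matplotlib, assuming the periodic cell is [0, 1) x [0, 1) and using the standard torus parametrization (the radii R, r and the point distribution here are illustrative):

import numpy as np
import matplotlib.pyplot as plt

R, r = 3.0, 1.0                     # major and minor radii of the torus
N = 10_000
rng = np.random.default_rng(0)

# periodic 2D coordinates in [0, 1); wrap-around is just a modulo
x = (0.5 + rng.normal(0.0, 0.05, N)) % 1.0
y = rng.random(N)

# map the unit cell onto the torus surface
theta = 2 * np.pi * x               # angle around the tube
phi = 2 * np.pi * y                 # angle around the central axis
X = (R + r * np.cos(theta)) * np.cos(phi)
Y = (R + r * np.cos(theta)) * np.sin(phi)
Z = r * np.sin(theta)

ax = plt.figure().add_subplot(projection='3d')
ax.scatter(X, Y, Z, s=1)
plt.show()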
How can I find the actual real-world velocity of an object using the optical flow information obtained from two images? Can anyone help me out?
As the commenters have already said, we need some more information on your problem.
Basically: yes, it is possible to calculate real-world velocity from images.
But all of this depends on the following things:
Is your camera fixed, or is it maybe even moving?
Are you trying to calculate the velocity of any object moving anywhere in the scene, or do you have a fixed lane, like a street filmed with a mounted camera, where objects (cars) always move along one lane?
If the latter, can you take measurements on the street in the real world? Like marking points on the boardwalk (permanently, or simply to find out how long a distance of x meters in the real world appears in pixels on your camera image).
If you cannot take those measurements in the real-world scene, you will need to provide the angle of the camera to the scene/ground level, the distance of the camera to the scene, and the parameters of your camera.
For calculating the velocity of an arbitrary tracked object in the scene, you'd probably need all of the latter information to really calculate distances in the scene. But this is much more difficult.
If you have the case of a fixed lane where you want to, e.g., measure a car's velocity, I would prefer the method of measuring or marking points in the real world.
Because if you have that information:
x m = y px
and an object has moved p px in time t (you get that time from the refresh rate of your calculation), you can calculate how many pixels it moves in one second, and since you know how many pixels correspond to one meter, you know its speed in meters per second (or any other unit you prefer).
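As a small worked example (all numbers illustrative): if 5 m on the street span 200 px, the scale is 0.025 m/px, and the speed follows directly:

# known from the real-world measurement: 5 m on the street span 200 px
meters_per_pixel = 5.0 / 200.0   # 0.025 m/px

# the object moved 40 px between two frames of a 25 FPS video
displacement_px = 40
dt = 1.0 / 25.0                  # time between frames in seconds

speed = displacement_px * meters_per_pixel / dt
print(speed)                     # 40 * 0.025 / 0.04 = 25 m/s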
You could also just set your two marks in the scene and simply measure how many frames (and therefore how much time) the object needed to move from one marking to the other. This would give you a more averaged velocity, since if you do the calculation in small time steps you might get a noisy result, due to segmentation problems or simply because the changes between frames get smaller the shorter the measured timespan is.
Well, and for segmentation you could simply try a subtraction method: subtract two or three consecutive frames from each other. Moving objects (and therefore image parts that have changed) will produce non-zero values, whereas the color values of a steady image part should subtract to roughly zero.
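A minimal sketch of that subtraction method with OpenCV, assuming two consecutive frames from a video file (the file name and threshold value are illustrative):

import cv2

cap = cv2.VideoCapture("video.mp4")
ok1, frame1 = cap.read()
ok2, frame2 = cap.read()

# steady background subtracts to roughly zero, moving parts stand out
g1 = cv2.cvtColor(frame1, cv2.COLOR_BGR2GRAY)
g2 = cv2.cvtColor(frame2, cv2.COLOR_BGR2GRAY)
diff = cv2.absdiff(g1, g2)
_, mask = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)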
Maybe that helps you with your problem... but of course this depends on your setting and your desired goal... You'll need to provide more information then...
This method is quite long, but in short:
What you can do is set a value that specifies the distance of the object from the camera.
Then capture the first frame and save it somewhere.
Capture the last frame and save it somewhere.
Apply a threshold to both frames.
Trim all the pixels from the left of the first frame, and then do the same for the second frame.
For a detailed tutorial, I think this article may help you a bit:
http://morefunscience.blogspot.in/2012/05/calculating-speed-using-webcam.html