Easy monocular camera self-calibration algorithm - python

I have a video of a road/building and I want to create a 3D model out of it. The scene I am looking at is rigid and the drone is moving. I assume I have no extra info available, such as camera pose, acceleration, or GPS position. I would love to find a Python implementation that I can adapt to my liking.
So far, I have decided to use OpenCV's calcOpticalFlowFarneback() for optical flow, which seems reasonably fast and accurate. From the resulting correspondences I can get the fundamental matrix F with findFundamentalMat(). So far so good.
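For reference, this is roughly what I have so far; the Farneback parameters are guesses I still need to tune, and 'road.mp4' is a placeholder:

    import cv2
    import numpy as np

    cap = cv2.VideoCapture('road.mp4')   # placeholder path
    ok, f1 = cap.read()
    ok, f2 = cap.read()
    prev = cv2.cvtColor(f1, cv2.COLOR_BGR2GRAY)
    curr = cv2.cvtColor(f2, cv2.COLOR_BGR2GRAY)

    # Dense optical flow between two consecutive frames.
    flow = cv2.calcOpticalFlowFarneback(prev, curr, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)

    # Sample a sparse grid of correspondences from the dense flow field.
    h, w = prev.shape
    ys, xs = np.mgrid[0:h:10, 0:w:10].reshape(2, -1)
    pts1 = np.stack([xs, ys], axis=-1).astype(np.float32)
    pts2 = pts1 + flow[ys, xs]

    # Estimate F robustly with RANSAC to reject bad flow vectors.
    F, inliers = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC, 1.0, 0.99)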
Now, according to the tutorial I am following here, I am supposed to magically have the calibration matrix of the camera, which I obviously don't have, nor plan to have, in the app I am developing.
After a lot of research, I found a paper (Self-calibration of a moving camera from point correspondences and fundamental matrices) from 1997 that defines what I am looking for (with a nice summary here). I am looking for the simplest/easiest implementation possible, and I am stuck on these problems:
If the camera I am going to use changes exposure and focus automatically (no zoom), are the intrinsic parameters of the camera going to change?
I am not familiar with the Homotopy Continuation Method for solving equations numerically, plus they seem to be slow.
I intend to use the Extended Kalman Filter, but do not know where to start, knowing that a bad initialization leads to non-convergence.
Digging some more, I found an open-source Multi-Camera Self-Calibration toolbox written for Octave with a Python wrapper. My last resort will be to break down that code and rewrite it directly in Python. Are there any other options?
Note: I do not want to use a chessboard nor the planarity constraint.
Is there any other way to self-calibrate my camera very accurately? After 20 years of research since 1997, has anyone come up with a more straightforward method?

Is this a one-shot thing, or are you developing an app to process lots of videos like these automatically?
If the former, I'd rather use an integrated tool like Blender. Look up one of the motion tracking (or "matchmoving") tutorials on YouTube to get an idea of it, for example this one.

Related

Deep learning person detection with opencv

So, I'm really new here. I'm currently working on a public art project and need a little help with the programming, because I'm somewhat lost in the code.
First I'll give you a short description of the goal of the work and then state my problem.
I'm putting a webcam in the shop window of a gallery, facing out onto a public street. The webcam is connected to a TV screen that also faces the street, so people see themselves being filmed (like CCTV). Then, if people stand still long enough in front of the camera, the webcam takes an automatic screenshot, which is emailed to an address set up with a script for automatic attachment printing, so the people from the street instantly come into my gallery, on paper.
(and yes I have permission from the gallery to do this since it is slightly in the grey area of legality)
I come from an art background with an interest in programming, so this was all very new for me, but I think I've already made it quite far. I have a Raspberry Pi running OpenCV and put a script on it for deep-learning object detection (https://www.pyimagesearch.com/2017/09/18/real-time-object-detection-with-deep-learning-and-opencv/ is the link I used for that).
I have also come across loads of pedestrian-tracking material, but haven't yet found code suitable for a real-time video stream.
So what I need from you guys is a little help with how to build a timer into the script, so that when people stand still long enough in front of the camera, it will take the screenshot. It is a bit like a reversed security-cam script: those react to movement, and I want mine to react to no movement at all.
The automatic attachment printing part I got covered I think because there are a lot of scripts already on the internet.
If you have any tips or tricks.. please let me know.
Help a girl out!
Marije
There are a number of things you can try.
Is the camera facing a shopping street? In that case you could go for simple background subtraction. For each frame, apply some preprocessing (e.g. blurring, morphology operations), call findContours, and compute the center of the bounding rectangle (boundingRect) of each contour.
Another option is to use the built-in (and pretrained) HOG PeopleDetector. This is based on SVMs (Support Vector Machines), another machine learning technique. For this to work efficiently you'd have to tune the parameters adequately, and since you're using a Pi you'd also need to consider the trade-off between speed and accuracy. This technique also leaves you with rectangles, so you can again compute the centers.
For both techniques, you'd want to make sure that the center point doesn't fluctuate too much from frame to frame (that would mean the person is moving). For this you'd also want to take into account the framerate and understand that you can't guarantee person detection for every frame.
The caveat of the first technique, whilst more transparent, is that it will detect ANYTHING that changes from frame to frame: pets, bikes, cars (if on a public street) and so on. You could then consider filtering the detections (e.g. by area or color).
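To give you a starting point, here's a rough sketch that combines both ideas: background subtraction, the center of the largest blob, and a timer that takes the screenshot once that center has barely moved for a few seconds. All the thresholds are arbitrary and would need tuning on site (this assumes a webcam at index 0 and the OpenCV 4.x findContours signature):

    import time
    import cv2

    cap = cv2.VideoCapture(0)
    bg = cv2.createBackgroundSubtractorMOG2()
    last_center, still_since = None, None

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        mask = bg.apply(cv2.GaussianBlur(frame, (11, 11), 0))
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        big = [c for c in contours if cv2.contourArea(c) > 2000]  # person-sized
        if big:
            x, y, w, h = cv2.boundingRect(max(big, key=cv2.contourArea))
            center = (x + w // 2, y + h // 2)
            moved = (last_center is None or
                     abs(center[0] - last_center[0]) > 20 or
                     abs(center[1] - last_center[1]) > 20)
            if moved:
                still_since = None              # person moved: reset the timer
            elif still_since is None:
                still_since = time.time()       # person just became still
            elif time.time() - still_since > 3.0:
                cv2.imwrite('screenshot.jpg', frame)   # still for 3 seconds
                still_since = None              # avoid re-shooting immediately
            last_center = center
        else:
            last_center, still_since = None, None

In practice you'd also have to deal with the background subtractor slowly absorbing a person who stands still for a long time, but this should be enough to experiment with.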

Can Python + Qt combination produce a real time spectral analysis tool?

I want to develop a tool that does the following things.
take in a live voice recording
produce a real time spectrogram
show the time-domain signal
output a few values extracted from the spectral analysis
All of these have to be kept updated in a window as the voice is recorded.
I have worked with NumPy, but I'm completely new to Qt and other GUI toolkits. What's the best way to proceed given this situation? My peers recommended Qt after I explained the task to them; if someone knows of a better tool to use with Python for this, please let me know. Also, please help me with the technical details of how to capture the live stream and process it in Python for display in a GUI window. One link that gave me some hope is http://www.swharden.com/blog/2010-03-05-realtime-fft-graph-of-audio-wav-file-or-microphone-input-with-python-scipy-and-wckgraph/ , but it is a bit difficult to comprehend. Maybe a less intensive solution would help me get started.
In Qt 4.6, the QAudioInput API was added. This provides a cross-platform abstraction for getting an audio input signal, and therefore would be of use in achieving point (1).
As for (2) and (3), the Spectrum Analyzer demo which ships with Qt may be of interest.
[Screenshot: Spectrum Analyzer demo running on Symbian - http://labs.trolltech.com/blogs/wp-content/uploads/2010/05/spectrum.png]
The implementation is in C++ rather than in Python, but it may be of use as a reference. Basically what you need for (2) is to calculate the Fast Fourier Transform of the input signal. You'll probably want to use a library which provides an FFT implementation rather than writing your own - that's the approach I took when writing the demo :)
As for (3), this is conceptually pretty simple, but requires a bit of thought in order to get a smoothly scrolling waveform. Take a look at the tiling approach used in the Waveform class in the demo for some tips.
I think by (4) you mean: reduce the large number of points in the FFT output to a small number of values. This is what the demo does in order to plot a bar chart for the spectrum. Again, refer to the demo code to see how the binning of frequency amplitudes is implemented.
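To make (2) and (4) concrete in Python terms, here is a minimal NumPy sketch; the function name and the bar count are illustrative, not taken from the demo:

    import numpy as np

    def spectrum_bars(samples, num_bars=32):
        """Reduce one chunk of mono audio samples to a few bar heights."""
        windowed = samples * np.hanning(len(samples))  # reduce spectral leakage
        magnitudes = np.abs(np.fft.rfft(windowed))     # step (2): FFT magnitudes
        # Step (4): bin the many FFT points down to num_bars bar heights.
        return [chunk.mean() for chunk in np.array_split(magnitudes, num_bars)]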
Another example of a real-time audio spectrum analyzer using PyAudio, SciPy and Chaco in a single script can be found in the list of examples for Chaco. (It worked out of the box on my Ubuntu Precise install.)
On Linux, this is definitely feasible. Other platforms too, but I can really only answer for Linux. Python isn't necessarily your sharpest tool for real-time DSP, but on a suitably modern machine and with suitably modest goals, you will be fine.
First, you need an interface to the Linux audio drivers. ALSA is pretty universal. There are several different Python wrappers for the ALSA libraries, see Python In Music for a list of libs and applications using them.
Then you do your spectral analysis. SciPy and NumPy have all that.
Then you draw into your Qt window. My expertise is in GTK but you probably want to create a QtCanvas (tutorial), which is an object-oriented drawing area that's designed for this kind of use.
Or you could just use SciPy, which can probably be convinced to do all of this! AudioLab in particular looks like it might be a big help.

How can I detect and track people using OpenCV?

I have a camera that will be stationary, pointed at an indoors area. People will walk past the camera, within about 5 meters of it. Using OpenCV, I want to detect individuals walking past - my ideal return is an array of detected individuals, with bounding rectangles.
I've looked at several of the built-in samples:
None of the Python samples really apply
The C blob tracking sample looks promising, but doesn't accept live video, which makes testing difficult. It's also the most complicated of the samples, which makes it problematic to extract the relevant knowledge and convert it to the Python API.
The C 'motempl' sample also looks promising, in that it calculates a silhouette from subsequent video frames. Presumably I could then use that to find strongly connected components and extract individual blobs and their bounding boxes - but I'm still left trying to figure out a way to identify blobs found in subsequent frames as the same blob.
Is anyone able to provide guidance or samples for doing this - preferably in Python?
The latest SVN version of OpenCV contains an (undocumented) implementation of HOG-based pedestrian detection. It even comes with a pre-trained detector and a python wrapper. The basic usage is as follows:
    from cv import *

    storage = CreateMemStorage(0)
    img = LoadImage(file)  # or read a frame from the camera
    # Returns a list of detection rectangles, one per person found.
    found = list(HOGDetectMultiScale(img, storage, win_stride=(8, 8),
                                     padding=(32, 32), scale=1.05,
                                     group_threshold=2))
So instead of tracking, you might just run the detector in each frame and use its output directly.
See src/cvaux/cvhog.cpp for the implementation and samples/python/peopledetect.py for a more complete python example (both in the OpenCV sources).
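If you end up on a newer OpenCV build that exposes the cv2 module instead, the equivalent would look roughly like this (a sketch, not tested against your setup):

    import cv2

    hog = cv2.HOGDescriptor()
    hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

    cap = cv2.VideoCapture(0)  # or a video file path
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # One rectangle per detected person, plus a confidence weight each.
        rects, weights = hog.detectMultiScale(frame, winStride=(8, 8),
                                              padding=(32, 32), scale=1.05)
        for x, y, w, h in rects:
            cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
        cv2.imshow('people', frame)
        if cv2.waitKey(1) == 27:  # Esc to quit
            break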
Nick,
What you are looking for is not people detection, but motion detection. If you tell us a lot more about what you are trying to solve/do, we can answer better.
Anyway, there are many ways to do motion detection, depending on what you are going to do with the results. The simplest is frame differencing followed by thresholding, while a more complex one would be proper background modeling -> foreground subtraction -> morphological ops -> connected component analysis, followed by blob analysis if required. Download the OpenCV code and look in the samples directory; you might see what you are looking for. Also, there is an O'Reilly book on OpenCV.
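For example, the simple differencing variant might look roughly like this in the newer cv2 API (the threshold of 25 is an arbitrary starting point):

    import cv2

    cap = cv2.VideoCapture('video.avi')   # or 0 for a live camera
    ok, first = cap.read()
    prev = cv2.cvtColor(first, cv2.COLOR_BGR2GRAY)

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        diff = cv2.absdiff(prev, gray)    # pixel-wise difference between frames
        _, motion_mask = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)
        prev = gray
        # motion_mask is nonzero wherever something moved; feed it into
        # morphological ops / connected component analysis as needed.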
Hope this helps,
Nand
This is clearly a non-trivial task. You'll have to look into scientific publications for inspiration (Google Scholar is your friend here). Here's a paper about human detection and tracking: Human tracking by fast mean shift mode seeking
This is similar to a project we did as part of a Computer Vision course, and I can tell you right now that it is a hard problem to get right.
You could use foreground/background segmentation, find all blobs, and then decide whether each one is a person. The problem is that this will not work very well, since people tend to walk together and pass each other, so a blob might very well consist of two persons, and then you will see that blob splitting and merging as they walk along.
You will need some method of discriminating between multiple persons in one blob, and that is not a problem I expect anyone to be able to solve in a single SO post.
My advice is to dive into the available research and see if you can find anything there. The problem is not unsolvable, considering that there exist products which do this: Autoliv has a product that detects pedestrians using an IR camera on a car, and I have seen other products which count customers entering and exiting stores.

Can 3D OpenGL game written in Python look good and run fast?

I am planning to write a simple 3D (isometric view) game in Java using jMonkeyEngine - nothing too fancy, I just want to learn something about OpenGL and about writing efficient algorithms (random map generation, for example).
While planning, I started wondering about switching to Python. I know that Python didn't come into existence as a tool for writing 3D games, but is it possible to write good-looking games with this language?
I have in mind 3D graphics, nice effects, and enough free CPU time left over to power the rest of the game engine. I have seen good-looking Java games - and to be honest, I was rather shocked when I saw the level of detail achieved in RuneScape HD.
On the other hand, pygame.org has only 2D games, with a few 3D projects just starting out. Are there any efficient 3D game engines for Python? Is PyOpenGL the only alternative? Are good-looking games in Python unpopular, or simply not achievable?
I would be grateful for any information / feedback.
If you are worried about 3D performance: Most of the performance-critical parts will be handled by OpenGL (in a C library or even in hardware), so the language you use to drive it should not matter too much.
To really find out if performance is a problem, you'd have to try it. But there is no reason why it cannot work in principle.
At any rate, you could still optimize the critical parts, either in Python or by dropping down to C. You still get Python's benefits for most of the game engine, which is less performance-critical.
Yes. Eve Online does it.
http://support.eve-online.com/Pages/KB/Article.aspx?id=128
I did a EuroPython talk about my amateur attempts to drive OpenGL from Python:
http://pyvideo.org/video/381/pycon-2011--algorithmic-generation-of-opengl-geom
The latest version of the code I'm talking about is here:
https://github.com/tartley/gloopy
It's billed as a 'library', but that was naive of me: It's a bunch of personal experimental code.
Nevertheless, it demonstrates that you can move around hundreds of bits of geometry at 60fps from Python.
Although the demo above is fairly bare-bones, in that it uses simple geometry and untextured faces, one thing I found is that more detailed geometry, texture mapping and other modern graphics effects don't substantially affect the framerate - or at least no worse than the same effects would in a C program. These run on the GPU, so it makes no difference at all that your program is written in Python.
One thing that is performance-sensitive from Python is if you are creating dynamic geometry on the CPU side, e.g. moving individual vertices within a shape, by bending or melting the shape. Doing this sort of per-vertex calculation in Python, then constructing a new ctypes array from the result, then shunting this geometry to the GPU, every frame, will be slow. Instead you should probably be doing this in a vertex shader.
On the other hand, if you just want affine transformations (moving objects around, rotating them, opening chests of drawers, rotating car wheels, bending a jointed robot arm) then all of this can be done by the GPU and the fact your program is written in Python makes little difference to the performance.
You might want to check out Python-Ogre. I just messed with it myself, nothing serious, but seems pretty good.
I would recommend pyglet, which is a similar system to pygame but with full bindings to OpenGL. You can start with simple 2D games to get the hang of the system and work up to 3D later. It is a more modern system than PyGame, which is built around SDL, which itself is a bit long in the tooth these days.
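A minimal pyglet program, just to show the shape of the API (a window, an event handler, and the app loop):

    import pyglet

    window = pyglet.window.Window(800, 600, caption='hello pyglet')
    label = pyglet.text.Label('Hello, pyglet', x=400, y=300,
                              anchor_x='center', anchor_y='center')

    @window.event
    def on_draw():
        window.clear()   # clear the frame, then draw everything for it
        label.draw()

    pyglet.app.run()     # hands control to pyglet's event loop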
Perhaps a wee bit off topic but, if your goal is to learn Python, how about creating a game using IronPython and XNA? XNA is not OpenGL though, yet I find it an extremely simple 2D/3D engine which is fast and supports Shader Model 3.0.
Check out the Frets on Fire project -- an open source Guitar Hero alternative. It's written in Python and has decent 3D graphics in OpenGL. I would suggest checking out its sources for hints on libraries etc.
There was a Vampires game out a few years ago where most, if not all, of the code was in Python. I'm not sure whether the 3D routines were in Python, but it worked fine.

How to create a picture with animated aspects programmatically

Background
I have been asked by a client to create a picture of the world which has animated arrows/rays that come from one part of the world to another.
The rays will be randomized, will represent a transaction, will fade out after they happen and will increase in frequency as time goes on. The rays will start in one country's boundary and end in another's. As each animated transaction happens a continuously updating sum of the amounts of all the transactions will be shown at the bottom of the image. The amounts of the individual transactions will be randomized. There will also be a year showing on the image that will increment every n seconds.
The randomization, summation and incrementing are not a problem for me, but I am at a loss as to how to approach the animation of the arrows/rays.
My question is what is the best way to do this? What frameworks/libraries are best suited for this job?
I am most fluent in Python, so Python suggestions are easiest for me, but I am open to any elegant way to do this.
The client will present this as a slide in a presentation on a Windows machine.
"The client will present this as a slide in a presentation on a Windows machine"
I think this is the key to your answer. Before going to a 3d implementation and writing all the code in the world to create this feature, you need to look at the presentation software. Chances are, your options will boil down to two things:
Animated Gif
Custom Presentation Scripts
Obviously, an animated GIF is not ideal, because it loops as soon as it finishes playing, and making it last a long time would produce a large file.
Custom presentation scripts would probably be the other way to let him bring it up in a presentation without running any side programs or doing anything strange. I'm not sure which presentation application is the target, but that could be valuable information.
He sounds fairly non-technical and like he's requesting something he doesn't realize will be difficult. I think you should come up with some options, explain the difficulty of implementing them, and suggest another solution that falls into the 'bang for your buck' range.
If you are adventurous, use OpenGL :)
You can draw Bezier curves in 3D space on top of a textured plane (the earth map), you can specify a thickness for them, and you can draw a point (a small cone) at the end. It's easy and it looks nice; the problem is learning the basics of OpenGL if you haven't used it before, but that would be fun and probably useful if you're into graphics programming.
You can use OpenGL from python either with pyopengl or pyglet.
If you make the animation this way, you can capture it to an AVI file (using Camtasia or something similar) that can be put onto a presentation slide.
It depends largely on the effort you want to expend on this, but the basic outline of an easy way would be to load an image of an arrow and use a drawing library to color and rotate it in the direction you want it to point (or draw it using shapes/curves).
Finally, to actually animate it, interpolate between the start and end coordinates based on elapsed time - see the sketch below.
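As an illustration, a bare-bones pygame sketch of that interpolation idea; the arrow image, coordinates and timing are all placeholders:

    import pygame

    pygame.init()
    screen = pygame.display.set_mode((800, 600))
    arrow = pygame.image.load('arrow.png').convert_alpha()  # placeholder image
    clock = pygame.time.Clock()

    start, end = (100, 400), (700, 150)   # ray start/end in pixels
    duration_ms, elapsed = 2000, 0

    while elapsed < duration_ms:
        pygame.event.pump()                      # keep the window responsive
        elapsed += clock.tick(60)                # cap at 60 fps
        t = min(elapsed / duration_ms, 1.0)      # progress from 0 to 1
        x = start[0] + (end[0] - start[0]) * t   # linear interpolation
        y = start[1] + (end[1] - start[1]) * t
        screen.fill((0, 0, 0))
        screen.blit(arrow, (int(x), int(y)))
        pygame.display.flip()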
If it's just for a presentation, though, I would use Macromedia Flash or a similar animation program (it would do the same as above, but you don't need to program anything).
