I have an air drone with four motors and wanted to make it fly between two straight lines.
The first problem:
its initial position will be in the middle at certain height but because of the air factors it may deviate (up or down) or (left or right). I have calculated the error when it deviates left or right using the camera, but still don't know how to calculate the error of the height (using the camera too without pressure sensor).
The second problem:
after calculating these errors how to convert them from an integer to a real move.
Sorry, I couldn't provide my code. it is too large and complicated.
1) Using a single camera to calculate distance is not enough.
However, if you're using a stereo camera, you can get a distance data pretty easily. If you want to avoid using a pressure sensor, you may want to consider using a distance sensor(LIDAR or ultrasonic: check the maximum range on these) to measure the height at which your drone will fly. In addition to this, you'll require a error control algorithm eg. PID algorithm to make your drone fly at a constant height.
This is a fantastic source for understanding the fundamentals of PID.
2)For implementation:
In my opinion, this video is awesome for understanding how your sensor data will get converted to an actual movement and will help you can create an analogy. You'll also get a headstart on the code provided.
Related
I have a picture of human eye taken roughly 10cm away using a mobile phone(no specifications regarding the camera). After some detection and contouring, I got 113px as the Euclidean distance between the center of the detected iris and the outermost edge of iris on the taken image. Dimensions of the image: 483x578px.
I tried converting the pixels into mm by simply multiplying the number of pixels with the size of a pixel in mm since 1px is roughly equal to 0.264mm which gives the proper length only if the image is in 1:1 ratio wrt to the real-time eye which is not the case here.
Edit:
Device used: One Plus 7T
View of range = 117 degrees
Aperture = f/2.2
Distance photo was taken = 10 cm (approx)
Question:
Is there an optimal way to find the real time radius of this particular eye with the amount of information I have gathered through processing thus far and by not including a reference object within the image?
P.S. The actual HVID of the volunteer's iris is 12.40mm taken using Sirus(A hi-end device to calculate iris radius and I'm trying to simulate the same actions using Python and OpenCV)
After months I was able to come up with the result after ton of research and lots of trials and errors. This is not the most ideal answer but it gave me expected results with decent precision.
Simply, In order to measure object size/distance from the image we need multiple parameters. In my case, I was trying to measure the diameter of iris from a smart phone camera.
To make that possible we need to know the following details prior to the calculation
1. The Size of the physical sensor (height and width) (usually in mm)
(camera inside the smart phone whose details can be obtained from websites on the internet but you need to know the exact brand and version of the smart phone used)
Note: You cannot use random values for these, otherwise you will get inaccurate results. Every step/constraint must be considered carefully.
2. The Size of the image taken (pixels).
Note: Size of the image can be easily obtained used img.shape but make sure the image is not cropped. This method relies on the total width/height of the original smartphone image so any modifications/inconsistencies would result in inaccurate results.
3. Focal Length of the Physical Sensor (mm)
Note: Info regarding focal length of the sensor used can be acquired from the internet and random values should not be given. Make sure you are taking images with auto focus feature disabled so the focal length is preserved. Incase if you have auto focus on then the focal length will be constantly changing and the results will be all over the place.
4. Distance at which the image is taken (Very Important)
Note: As "Christoph Rackwitz" told in the comment section. The distance from which the image is taken must be known and should not be arbitrary. Head cannoning a number as input will always result in inaccuracy for sure. Make sure you properly measure the distance from sensor to the object using some sort of measuring tool. There are some depth detection algorithms out there in the internet but they are not accurate in most cases and need to calibrated after every single try. That is indeed an option if you dont have any setup to take consistent photos but inaccuracies are inevitable especially in objects like iris which requires medical precision.
Once you have gathered all these "proper" information the rest is to dump these into a very simple equation which is a derivative of the "Similar Traingles".
Object height/width on sensor (mm) = Sensor height/width (mm) × Object height/width (pixels) / Sensor height/width (pixels)
Real Object height (in units) = Distance to Object (in units) × Object height on sensor (mm) / Focal Length (mm)
In the first equation, you must decide from which axis you want to measure. For instance, if the image is taken in portrait and you are measuring the width of the object on the image, then input the width of the image in pixels and width of the sensor in mm
Sensor height/width in pixels is nothing but the size of the "image"
Also you must acquire the object size in pixels by any means.
If you are taking image in landscape, make sure you are passing the correct width and height.
Equation 2 is pretty simple as well.
Things to consider:
No magnification (Digital magnification can destroy any depth info)
No Autofocus (Already Explained)
No cropping/editing image size/resizing (Already Explained)
No image skewing.(Rotating the image can make the image unfit)
Do not substitute random values for any of these inputs (Golden Advice)
Do not tilt the camera while taking images (Tilting the camera can distort the image so the object height/width will be altered)
Make sure the object and the camera is exactly in the same line
Don't use EXIF data of the image (EXIF data contains depth information which is absolute garbage since they are not accurate at all. DO NOT CONSIDER THEM)
Things I'm unsure of till now:
Lens distortion / Manufacturing defects
Effects of field of view
Perspective Foreshortening due to camera tilt
Depth field cameras
DISCLAIMER: There are multiple ways to solve this issue but I chose to use this method and I highly recommend you guys to explore more and see what you can come up with. You can basically extend this idea to measure pretty much any object using a smartphone (given the images that a normal smart phone can take)
(Please don't try to measure the size of an amoeba with this. Simply won't work but you can indeed take some of the advice I have gave for your advantage)
If you have cool ideas and issues with my answers. Please feel free to let me know I would love to have discussions. Feel free to correct me if I have made any mistakes and misunderstood any of these concepts.
Final Note:
No matter how hard you try, you cannot make something like a smartphone to work and behave like a camera sensor which is specifically designed to take images for measuring purposes. Smart phone can never beat those but sure we can manipulate the smart phone camera to achieve similar results upto a certain degree. So you guys must keep this in mind and I learnt it the hard way
I'm training a neural network on stimuli which are being developed to mimic a sensory neuroscience task to compare performance to human results.
The task is based on spatial localization of audio. I need to generate white noise audio in python to present to the neural network, but also need to alter the audio as if it were presented at different locations. I understand how I'd generate the audio, but I'm not sure on how to generate the white noise from different theoretical locations.
You can add a delay to the right or left track, to account for the arrival time at the two ears. If I recall correctly, it amounts to up to about 25 or 30 milliseconds, depending on the angle. The travel distance disparity from source to the two ears can be calculated with basic trigonometry, and then multiplied by speed of sound in air to get the delay length. (IDK what python has for controlling delays or to what granularity delay lengths can be specified.)
Most of the other cues we have for spacial location are a lot harder to quantify. Most commonly we use volume, of course. Especially for higher-pitched content (wavelengths smaller than the width of the head) the head itself can block and cause some volume differences, based on the angle.
But a lot comes from reverberation for environmental cues, from timbrel roll-off as a function of distance (a quiet sound with lots of highs in the mix can really sound like they are right next to your ear), from moving the head to capture the sound from different angles, and from the filtering effects of the pinna of the ear. Because everyone's ear shape is different, I don't know that there is a universal thumbnail algorithm for what causes a sound to be sensed as originating from a particular altitude for a given angle. I think to some extent we just all learn by experiencing the sounds with our own particular ears while observing the sound source visually.
I am planning to acquire position in 3D cartesian coordinates from an IMU (Inertial Sensor) containing Accelerometer and Gyroscope. I'm using this to track the objects position and trajectory in 3D.
1- From my limited knowledge I was under the assumption that Accelerometer alone would be enough, resulting in acceleration in xyz axis A(Ax,Ay,Az) and would need to be integrated twice to get velocity and then position, but integrating would add an unknown constant value, this error called drift increases with time. How to remove this error?
2- Furthermore, why is there a need for gyroscope in the first place, cant we just translate the x-y-z axis acceleration to displacement, if accelerometer tells the axis of motion then why check orientation from Gyroscopes. Sorry this is a very basic question, everywhere I checked both Gyro+Accel were used but don't know why.
3- Even when stationary and not in any motion there is earth's gravitation force acting on the sensor which will always give values more than that attributed by the motion of sensor. How do you remove the gravity?
Once this has been done ill apply Kalman Filters to them to fuse them and to smooth the values. How accurate is this method for trajectory estimation of an object for environments where GPS is not an option. I'm getting the Accelerometer and Gyroscope values from arduino and then importing to Python where it will be plotted on a 3D graph updating in real time. Any help would be highly appreciated, especially links to similar codes.
1 - An accelerometer can be calibrated to account for some of this drift but in the end no sensor is perfect and inaccuracy will inevitably cause drift. To fix this you would need some filter such as the Kalman filter to use the accelerometer for short high frequency data, and a secondary sensor such as a camera to periodically get the absolute position and update the internal position. This is the fundamental idea behind the Kalman filter.
2 - Accelerometers aren't very good for high frequency rotational data. Just using the accelerometers data would mean the system could not differentiate between a horizontal linear acceleration and rotational position. The gyroscope is used for the high frequency data while the accelerometer is used for low frequency data to adjust and counteract the rotational drift. A Kalman filter is one possible solution to this problem and there are many great online resources explaining this.
3 - You would have to use the methods including gyro / accel sensor fusion to get the 3d orientation of the sensor and then use vector math to subtract 1g from that orientation.
You would most likely be better off looking at some online resources to get the gist of it and then using a pre-built sensor fusion system whether it be a library or an fusion system on the accelerometer (on most accelerometers today including the mpu6050). These onboard systems typically do a better job then a simple Kalman filter and can combine other sensors such as magnetometers to gain even more accuracy.
I recently worked on a code that allowed to display a simulation of particles' motions in a periodical space. In concrete terms, it resulted in a 2D plot provided with N points (N ~ 10^4) initially gathered at the center, then spread out according to a matching velocity. As it is a periodical space, any points that would go beyond the upper limit is actually brought back to the lower limit, and vice versa. To illustrate, here are two images :
Initial positions
After a certain time
Each points are supposed to travel horizontally, either to the right or to the left (respectively positive or negative velocity).
I programmed it using Python, but now, in the scope of my project, I'd like to simulate the same thing but on a torus. To give you a good glimpse of how it looked like, please take a look at the following pic :
Transformation from a rectangle to a torus
(Imagine my initial 2D plan is the initial rectangle, which I'd like to transform into the final torus).
Therefore, in that case we would see every particle moving on the surface of the torus. The previous 1st picture would correspond to particles gathered on a "single" circus of the torus, and the previous 2nd picture would correspond to the "filling up" the entire surface of the torus.
Since my code for previous simulations was written in Python, I am wondering if I can still use it for this task. If so, I'd like to have some clues about how to do it, and otherwise, what would be the best language to use for this ?
I hope I have been clear. I apologize in advance for some mistakes I could have done with English.
How can I find the actual real world velocity of an object using the optical flow information obtained from two images? Can anyone help me out?
as the commentators have already said we need some more information on your problem.
Basically: Yes, it is possible to calculate real world velocity from an image
But all of this depends on the following things:
Is your camera fixed or is it maybe even moving
Do you try to calculate velocity of any object moving anywhere on the scene or do you have a fixed lane, like a street filmed with a mounted camera and objects (cars) will always move along one lane?
If the latter, can you do measurements on the street in real world? Like marking points on the boardwalk (permanently or simply to find out to how long a distance of x meters in real world will appear on your camera image in pixels)
if you cannot do those measurements in the real world scene you will need to provide information on angle of the camera to the scene/ground level, distance of the camera to the scene, and parameters of your camera.
For calculating the velocity of any tracked object on the scene you'd probably need all the latter stuff to really calculate the distances in the scene. But this is much more difficult.
If you have the case of a fixed lane where you i.e. want to measure a car's velocity I would prefer the method with measuring or marking points in real world.
Because if have that information:
x m = y px
and an object has moved y px in t time (you get that time by the refreshment rate of your calculation) you can calculate how many pixels it will have moved in 1 second and since you know how many pixels are one meter you'd know its speed in meters per second (or any other unit you prefer.
You could also just set your two marks in the scene and simply measure, how many frames (and therefore how much time) the object needed to move from one marking to the other. This would give you a more averaged velocity since if you do calculations in small time steps you might get a noisy result due to segmentation problems or simply because changes are fairly small between the shorter the measured timespan is.
Well and for segmentation you could simply try a substraction method. Substract two or three following frames from each other. Moving objects (and therefore image parts that have changed) will result in non-zero values whereas color values of a steady image part should substract to something about 0.
Maybe that helps you with your problem... but of couse this depends on your setting and your desired goal... You'll need to provide more information then...
This method is quite long but in short:
What you can do is set a value that specifies the distance of object from camera.
Then capture first frame and save it somewhere.
Capture last frame and save it somewhere.
Apply threshold on both the frames.
Trim all the pixels from left of first frame and then do the same for second frame.
For detail tutorial I think this article may help you a bit.
http://morefunscience.blogspot.in/2012/05/calculating-speed-using-webcam.html