I know the first thing that may come to mind is to use OpenCV for this, but unfortunately I can't use it in this case (I'm using a Raspberry Pi with the camera module). I have a motion detection script, but throughout the day shadows are cast, or the camera sometimes adjusts its sensitivity, and the algorithm detects either of these as movement. I'm trying to figure out a way around this problem. Currently I take image A, calculate the difference between it and image B, and compute the entropy of those images. I do this for ten images and average the results to create a moving average, then compare the current image against this average entropy. If it is more than 1 standard deviation away, I take this to be indicative of motion.
I should add, since I don't really understand OpenCV: is there potentially a way to have it compare two images and detect motion from that? I'm already generating images as the camera takes photos, so could I find a way to feed them to OpenCV using Python and maybe get around my sunlight exposure issue?
My current plan of attack is to average, say, the last ten frames into one and then compute the difference against it. Is there a good way to do this in Python? Also, say a dog appears in image 10. Does that mean motion will still be detected for the 10 images that follow, even if the dog is gone in image 11, because the average will contain the dog until that frame is removed from the queue?
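A rolling average of the last ten frames can be kept with a fixed-length queue. The sketch below is only an illustration, assuming frames arrive as greyscale NumPy arrays of equal shape; the class name RollingBackground, the window size, and the mean-absolute-difference score are made up for the example and are not the entropy measure described above. It also shows the effect asked about: a frame containing the dog stays in the average at 1/10 strength until it drops out of the queue, so it keeps contributing to the difference for the following frames, only more weakly.

```python
from collections import deque

import numpy as np


class RollingBackground:
    """Keeps the last `window` frames and scores new frames against their mean."""

    def __init__(self, window=10):
        self.frames = deque(maxlen=window)  # oldest frame is dropped automatically

    def motion_score(self, frame):
        """Return the mean absolute difference between `frame` and the rolling mean."""
        frame = frame.astype(np.float32)
        if not self.frames:
            # Nothing to compare against yet.
            self.frames.append(frame)
            return 0.0
        background = np.mean(np.stack(list(self.frames)), axis=0)
        score = float(np.mean(np.abs(frame - background)))
        self.frames.append(frame)
        return score


# Toy data: an empty scene, then one frame with a "dog", then empty again.
detector = RollingBackground(window=10)
empty = np.zeros((4, 4), dtype=np.uint8)
dog = np.full((4, 4), 100, dtype=np.uint8)

for _ in range(10):
    detector.motion_score(empty)

score_dog = detector.motion_score(dog)      # dog frame vs. clean background
score_after = detector.motion_score(empty)  # empty frame vs. background that contains the dog
```

In this toy run the residual score after the dog leaves is one tenth of the score for the dog frame itself, so a threshold set well above that level would ignore it. Taking the per-pixel median of the queue instead of the mean is another option that would suppress a single outlier frame entirely.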
I'm training a neural network on stimuli which are being developed to mimic a sensory neuroscience task to compare performance to human results.
The task is based on spatial localization of audio. I need to generate white noise audio in Python to present to the neural network, but I also need to alter the audio as if it were presented at different locations. I understand how I'd generate the audio, but I'm not sure how to make the white noise appear to come from different theoretical locations.
You can add a delay to the right or left track to account for the difference in arrival time at the two ears. For a human head it amounts to at most roughly 0.6 to 0.7 milliseconds, depending on the angle. The difference in travel distance from the source to the two ears can be calculated with basic trigonometry, and then divided by the speed of sound in air (about 343 m/s) to get the delay length. (IDK what Python has for controlling delays or to what granularity delay lengths can be specified.)
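The delay idea above can be sketched with NumPy by shifting one channel by a whole number of samples. This is a rough illustration under stated assumptions: the ear spacing of 0.18 m, the straight-line path model (distance difference = ear spacing times the sine of the azimuth, ignoring the path around the head), the 44.1 kHz sample rate, and the function names are all choices made for the example.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s in air at about 20 degrees C
EAR_DISTANCE = 0.18     # m, assumed spacing between the ears


def interaural_delay_seconds(azimuth_deg):
    """Arrival-time difference for a distant source; 0 = straight ahead, +90 = hard right."""
    return EAR_DISTANCE * np.sin(np.radians(azimuth_deg)) / SPEED_OF_SOUND


def spatialize_noise(azimuth_deg, duration_s=0.5, sample_rate=44100, seed=0):
    """Return stereo white noise (columns: left, right) with the far ear delayed."""
    rng = np.random.default_rng(seed)
    n = int(duration_s * sample_rate)
    noise = rng.standard_normal(n)

    # Delay rounded to whole samples; granularity is 1 / sample_rate seconds.
    delay = int(round(abs(interaural_delay_seconds(azimuth_deg)) * sample_rate))
    delayed = np.concatenate([np.zeros(delay), noise])[:n]

    if azimuth_deg >= 0:
        left, right = delayed, noise   # source on the right: left ear hears it later
    else:
        left, right = noise, delayed   # source on the left: right ear hears it later
    return np.stack([left, right], axis=1)


stereo = spatialize_noise(azimuth_deg=90)
delay_samples = int(round(interaural_delay_seconds(90) * 44100))
```

With these assumed numbers the largest delay is a little over half a millisecond, which is only a couple of dozen samples at 44.1 kHz, so a higher sample rate or fractional-delay interpolation would be needed for fine angular resolution.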
Most of the other cues we have for spatial location are a lot harder to quantify. Most commonly we use volume, of course. Especially for higher-pitched content (wavelengths smaller than the width of the head), the head itself can block the sound and cause some volume differences, depending on the angle.
But a lot comes from reverberation for environmental cues, from timbral roll-off as a function of distance (a quiet sound with lots of highs in the mix can really sound like it is right next to your ear), from moving the head to capture the sound from different angles, and from the filtering effects of the pinna of the ear. Because everyone's ear shape is different, I don't know that there is a universal rule-of-thumb algorithm for what causes a sound to be sensed as originating from a particular elevation for a given angle. I think to some extent we all just learn by experiencing sounds with our own particular ears while observing the sound source visually.
I am just starting with computer vision and do not have much experience in this area, so apologies for a slightly generic question; I am not sure how to start and go in the right direction.
As in the title: I am building a system that captures an image from a camera, and I would like to detect whether the 2 lines of stitches / seams are parallel to each other and whether the gap between the lines is within specified limits / a threshold. See the sample picture below:
Can those lines be detected by some functions in OpenCV, or do I need to use a machine learning approach and build a model that recognizes a single stitch in the picture, and then draw new lines based on the detections and perform the calculations?
I have a drone with four motors and want to make it fly between two straight lines.
The first problem:
Its initial position will be in the middle at a certain height, but because of air conditions it may deviate (up or down) or (left or right). I have calculated the error when it deviates left or right using the camera, but I still don't know how to calculate the height error (also using the camera, without a pressure sensor).
The second problem:
After calculating these errors, how do I convert them from an integer into an actual movement?
Sorry, I couldn't provide my code. It is too large and complicated.
1) A single camera is not enough to calculate distance.
However, if you're using a stereo camera, you can get distance data pretty easily. If you want to avoid using a pressure sensor, you may want to consider a distance sensor (LIDAR or ultrasonic; check the maximum range on these) to measure the height at which your drone is flying. In addition to this, you'll need an error control algorithm, e.g. a PID controller, to make your drone fly at a constant height.
This is a fantastic source for understanding the fundamentals of PID.
2) For implementation:
In my opinion, this video is awesome for understanding how your sensor data gets converted into an actual movement, and it will help you build an analogy. You'll also get a head start from the code provided.
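To make the PID idea mentioned above concrete, here is a minimal sketch of how a measured error can be turned into a command. It is an illustration, not flight-ready code: the gains, the time step, and the toy model in which the controller output is applied directly as a vertical velocity are all assumptions, and a real drone would map the output to motor thrust and need tuning.

```python
class PID:
    """Textbook PID controller: output = kp*error + ki*integral + kd*derivative."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def update(self, error, dt):
        self.integral += error * dt
        # No derivative term on the first call because there is no previous error yet.
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


# Toy simulation: hold a target height, treating the output as a climb rate in m/s.
target = 1.0   # desired height in metres (assumed)
height = 0.0   # measured height, e.g. from a distance sensor
dt = 0.05      # control loop period in seconds (assumed)
controller = PID(kp=2.0, ki=1.0, kd=0.1)

for _ in range(200):
    error = target - height          # the "integer" error becomes a signed number
    command = controller.update(error, dt)
    height += command * dt           # toy model: command acts as vertical velocity
```

The same structure would apply to the left/right error from the camera, with a second controller whose output drives roll instead of climb rate.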
I have a video file recorded in the evening (6pm-9pm), and I want to detect the movement of people on the road.
While trying to find the difference between a handful of images from 10-minute video clips (10 equally spaced images within any 10-minute clip), I'm facing these challenges:
All the images come out as different (raised as an alert) because there is a plant moving in the wind all the time.
All 10 images also come out as different because the sun is setting, so due to natural light variation the 10 images from a 10-minute clip differ even though there is no human movement.
How do I restrict my algorithm to focus only on movement in a certain area of the video rather than all of it? (I couldn't find anything on Google and don't know if there's any algorithm in OpenCV for this.)
The moving plant is rather difficult to deal with. I recommend blurring the frames a little to reduce the noise from moving plants. Also, if the range of the movement is not too large, try changing the difference threshold and the area threshold (if your algorithm uses contour detection as the following step). Hope this helps a little.
For detecting the "movement" of people, a rate of 10 frames per 10 minutes is too low. The people in consecutive frames can be totally different, which means you cannot track the movement of a single person; you can only find the differences between two frames. When you are using such low frame rates, I recommend trying background subtraction to find people in the frames, rather than people's movement between frames. For background subtraction, to address
"All 10 images also come out as different because the sun is setting, so due to natural light variation the 10 images from a 10-minute clip differ even though there is no human movement."
you can try using the average image of all frames as the background_img in
difference = current_img - background_img
If the time span is longer, you can use the average of the images closest in time to current_img as background_img, and keep updating background_img as the video runs.
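One way to keep background_img updated is an exponentially weighted running average, which favours recent frames and so follows slow lighting changes such as a sunset. The sketch below uses only NumPy on synthetic frames; the function names, the update rate alpha, and the threshold are assumptions for illustration, and it swaps the plain average described above for a running one. Note that the subtraction is done in floating point, since subtracting uint8 images directly would wrap around.

```python
import numpy as np


def update_background(background, frame, alpha=0.05):
    """Blend the new frame into the background; larger alpha adapts faster."""
    return (1.0 - alpha) * background + alpha * frame.astype(np.float32)


def foreground_mask(background, frame, threshold=25.0):
    """Pixels whose absolute difference from the background exceeds the threshold."""
    return np.abs(frame.astype(np.float32) - background) > threshold


# Synthetic scene whose brightness drifts slowly, standing in for changing daylight.
background = np.full((8, 8), 100, dtype=np.float32)
false_alarm = False
for i in range(50):
    frame = np.full((8, 8), 100 + i // 2, dtype=np.uint8)
    false_alarm = false_alarm or bool(foreground_mask(background, frame).any())
    background = update_background(background, frame)

# A frame at the current brightness with a dark 3x3 "person" in it.
person_frame = np.full((8, 8), 125, dtype=np.uint8)
person_frame[2:5, 2:5] = 30
person_mask = foreground_mask(background, person_frame)
```

In this toy run the gradual brightness change never crosses the threshold, while the 3x3 block does. With real video the balance between alpha and the threshold would need to be tuned to how fast the light changes.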
If your ROI is a rectangle in the frame, use
cv::Rect my_ROI(x, y, width, height);  // top-left corner plus size, in pixels
cv::Mat ROI_img = frame(my_ROI);       // view into frame, no pixel copy
If not, try using a mask.
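For anyone working in Python rather than C++, the same two ideas can be sketched with NumPy arrays, which is how images are typically represented there. The helper names and the toy difference image are assumptions for illustration: a rectangular ROI is just an array slice, and a non-rectangular region can be handled with a boolean mask that zeroes everything outside the area of interest.

```python
import numpy as np


def crop_roi(frame, x, y, width, height):
    """Rectangular ROI as a slice (rows are y, columns are x)."""
    return frame[y:y + height, x:x + width]


def apply_mask(diff, mask):
    """Keep difference values only where mask is True; zero elsewhere."""
    return np.where(mask, diff, 0)


# Toy difference image: one change in the "plant" area, one on the "road".
diff = np.zeros((6, 6), dtype=np.uint8)
diff[1, 1] = 255   # moving plant, top-left
diff[4, 4] = 255   # person on the road, bottom-right

road_mask = np.zeros((6, 6), dtype=bool)
road_mask[3:6, :] = True   # assume the road occupies the bottom half

road_only = apply_mask(diff, road_mask)
road_crop = crop_roi(diff, x=0, y=3, width=6, height=3)
```

Only the change inside the masked area survives, so the moving plant no longer triggers an alert.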
I think what you are looking for is pedestrian detection. You can do this easily in Python with the OpenCV package.
import cv2

# Initialize a HOG descriptor
hog = cv2.HOGDescriptor()
# Configure it with OpenCV's default people detector
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())
# Then run the detector on a frame ("frame.jpg" is a placeholder path)
frame = cv2.imread("frame.jpg")
boxes, weights = hog.detectMultiScale(frame, winStride=(8, 8))
Example: Pedestrian Detection OpenCV
I am building a system that detects coins being picked up from a tray. This tray will be kept in a public place. People will pick up one or more coins, but are expected to put them back after some time.
I will have a live stream from a webcam placed above the tray. I will have a calibration step, say at the beginning of the day, that captures the initial state of the tray to be used for comparison with the live feed. A few slots might be empty to begin with, as you can see in the sample image.
I need to detect slots that had a coin initially but are missing it at any given point during the day.
I am trying out a few approaches using OpenCV:
SSIM difference: I can use SSIM to find diff between my live image frame and initial state. However, a number of slots are larger than the corresponding coin sizes (e.g. top two rows). This could mean that if the coin was originally placed at the center, but was later put back to touch one of the edges, we may get a false positive.
Blob detection: Alternatively, I can pre-feed (or detect) slot co-ordinates. Then do a blob detection within every slot. If a blob was present in the original state, but is missing in a camera frame, this would mean a coin has been picked up. However, accurate blob detection could be a challenge if the contrast between the coin and the tray is low.
I might also need to watch out for slight variations in lighting due to shadows of people moving around.
Any thoughts on these or any pointers on alternate approaches that can be tried out? Is there any analogous implementation that I can learn from?
Many thanks in advance.
Edit: Thanks to @I.Newton for the suggestion. For those who stumble upon this question and would benefit from a sample implementation, look here: https://github.com/kewats/computer-vision-samples/tree/master/image-processing/missing-coins-detection
If you have complete control over the lighting conditions, you can use simple color thresholding to solve the problem.
First make a mask for the boxes. You can do it in multiple ways: by color threshold, adaptive threshold, Canny edges, etc. I did it by color threshold.
Then make a mask for the coins by the same method.
Now flood fill your box mask from the center of each of these coins. It will retain only those boxes which do not have coins.
Now you can compare this with your initial mask to figure out whether all the coins are present.
This does not involve frame subtraction, so you need not worry about the coin sitting at a different position in the box. The only thing you need to ensure is consistent lighting conditions when making the masks. If you want to make sure the coins are returned to the same box, you should go for template matching etc., which again needs effort.
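The flood-fill step above can be sketched without OpenCV on a toy boolean mask. This is an assumption-laden illustration rather than the code behind the answer: it assumes box_mask is True for every pixel inside a slot, that coin centres are already known as (row, col) pairs from the coin mask, and it uses a small breadth-first fill written for the example. Filling from each coin centre clears the slot that contains it, so whatever remains set belongs to empty slots.

```python
from collections import deque

import numpy as np


def clear_region(mask, seed):
    """Flood fill: clear the 4-connected True region of `mask` that contains `seed`."""
    rows, cols = mask.shape
    if not mask[seed]:
        return
    mask[seed] = False
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and mask[nr, nc]:
                mask[nr, nc] = False
                queue.append((nr, nc))


def empty_slots_mask(box_mask, coin_centers):
    """Return a copy of box_mask with every slot that holds a coin removed."""
    remaining = box_mask.copy()
    for center in coin_centers:
        clear_region(remaining, center)
    return remaining


# Toy tray: two 3x3 slots separated by a wall, with a coin in the left slot only.
box_mask = np.zeros((5, 9), dtype=bool)
box_mask[1:4, 1:4] = True   # left slot
box_mask[1:4, 5:8] = True   # right slot
coin_centers = [(2, 2)]     # (row, col) of the coin found in the left slot

remaining = empty_slots_mask(box_mask, coin_centers)
```

Comparing the remaining mask with the one computed at calibration time would then show which slots have become empty since the start of the day.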