Generating Spatial White Noise audio in Python

I'm training a neural network on stimuli which are being developed to mimic a sensory neuroscience task to compare performance to human results.
The task is based on spatial localization of audio. I need to generate white noise audio in Python to present to the neural network, but I also need to alter the audio as if it were presented at different locations. I understand how I'd generate the audio, but I'm not sure how to generate the white noise as if it came from different theoretical locations.

You can add a delay to the right or left track to account for the difference in arrival time at the two ears. For a human head this interaural time difference is small: it peaks at roughly 0.6 to 0.7 milliseconds for a source directly to one side, and shrinks to zero for a source straight ahead. The difference in travel distance from the source to the two ears can be calculated with basic trigonometry, and then divided by the speed of sound in air (about 343 m/s) to get the delay length. (IDK what Python has for controlling delays or to what granularity delay lengths can be specified.)
Most of the other cues we have for spatial location are a lot harder to quantify. Most commonly we use volume, of course. Especially for higher-pitched content (wavelengths smaller than the width of the head), the head itself can block the sound and cause some volume differences between the ears, based on the angle.
But a lot comes from reverberation for environmental cues, from timbral roll-off as a function of distance (a quiet sound with lots of highs in the mix can really sound like it is right next to your ear), from moving the head to capture the sound from different angles, and from the filtering effects of the pinna of the ear. Because everyone's ear shape is different, I don't know that there is a universal rule-of-thumb algorithm for what causes a sound to be sensed as originating from a particular elevation for a given angle. I think to some extent we all just learn by experiencing sounds with our own particular ears while observing the sound source visually.
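The two simplest cues above, arrival-time delay and volume difference, can be sketched in a few lines. This is only a rough illustration, assuming NumPy, a spherical-head (Woodworth) approximation for the delay, an assumed head radius of 8.75 cm, and a crude broadband level difference of up to 6 dB. The function names and all default values are made up for the example, and it ignores reverberation and pinna filtering entirely.

```python
import numpy as np

SPEED_OF_SOUND = 343.0   # m/s in air at roughly 20 degrees C
HEAD_RADIUS = 0.0875     # m, an assumed typical value

def itd_seconds(azimuth_deg):
    """Woodworth spherical-head estimate of the interaural time difference.

    Intended for azimuths between -90 (hard left) and +90 (hard right).
    """
    theta = np.radians(azimuth_deg)
    return HEAD_RADIUS / SPEED_OF_SOUND * (theta + np.sin(theta))

def spatial_white_noise(azimuth_deg, duration=0.5, fs=44100,
                        max_ild_db=6.0, seed=None):
    """Return an (n_samples, 2) array [left, right] of lateralised white noise."""
    rng = np.random.default_rng(seed)
    n = int(duration * fs)
    noise = rng.standard_normal(n)

    # Delay for the far ear, rounded to whole samples.
    delay = int(round(abs(itd_seconds(azimuth_deg)) * fs))
    delayed = np.concatenate([np.zeros(delay), noise])[:n]

    # Crude level difference: largest at +/-90 degrees, zero straight ahead.
    ild_db = max_ild_db * abs(np.sin(np.radians(azimuth_deg)))
    far = delayed * 10 ** (-ild_db / 20)
    near = noise

    if azimuth_deg >= 0:      # source on the right: left ear is the far ear
        left, right = far, near
    else:                     # source on the left: right ear is the far ear
        left, right = near, far

    stereo = np.stack([left, right], axis=1)
    return stereo / np.max(np.abs(stereo))   # normalise to [-1, 1]
```

At 44.1 kHz one sample is about 23 microseconds, so the delay can only be set in steps of that size unless you resample or use a fractional-delay filter. For more realistic cues (including elevation) the usual route is to convolve the noise with measured head-related impulse responses instead.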

Related

Is it possible to calculate real-time distance of an object in a image w/o reference objects?

I have a picture of human eye taken roughly 10cm away using a mobile phone(no specifications regarding the camera). After some detection and contouring, I got 113px as the Euclidean distance between the center of the detected iris and the outermost edge of iris on the taken image. Dimensions of the image: 483x578px.
I tried converting the pixels into mm by simply multiplying the number of pixels by the size of a pixel in mm (1px is roughly equal to 0.264mm), but this gives the proper length only if the image is at a 1:1 scale with respect to the real eye, which is not the case here.
Edit:
Device used: One Plus 7T
Field of view = 117 degrees
Aperture = f/2.2
Distance photo was taken = 10 cm (approx)
Question:
Is there an optimal way to find the real-world radius of this particular eye with the information I have gathered through processing thus far, without including a reference object in the image?
P.S. The actual HVID of the volunteer's iris is 12.40mm taken using Sirus(A hi-end device to calculate iris radius and I'm trying to simulate the same actions using Python and OpenCV)

After months, I was able to come up with a result after a ton of research and lots of trial and error. This is not the ideal answer, but it gave me the expected results with decent precision.
Simply put, in order to measure object size/distance from an image we need multiple parameters. In my case, I was trying to measure the diameter of an iris with a smartphone camera.
To make that possible we need to know the following details prior to the calculation:
1. The Size of the physical sensor (height and width) (usually in mm)
(camera inside the smart phone whose details can be obtained from websites on the internet but you need to know the exact brand and version of the smart phone used)
Note: You cannot use random values for these, otherwise you will get inaccurate results. Every step/constraint must be considered carefully.
2. The Size of the image taken (pixels).
Note: The size of the image can be easily obtained using img.shape, but make sure the image is not cropped. This method relies on the total width/height of the original smartphone image, so any modifications/inconsistencies would result in inaccurate results.
3. Focal Length of the Physical Sensor (mm)
Note: Info regarding the focal length of the sensor used can be found on the internet; random values should not be used. Make sure you are taking images with the autofocus feature disabled so the focal length is preserved. If autofocus is on, the focal length will be constantly changing and the results will be all over the place.
4. Distance at which the image is taken (Very Important)
Note: As Christoph Rackwitz pointed out in the comments, the distance from which the image is taken must be known and should not be arbitrary. Guessing a number will always result in inaccuracy. Make sure you properly measure the distance from the sensor to the object using some sort of measuring tool. There are depth-estimation algorithms available online, but they are not accurate in most cases and need to be recalibrated after every single try. That is an option if you don't have a setup for taking consistent photos, but inaccuracies are inevitable, especially for objects like the iris, which require medical-grade precision.
Once you have gathered all of this information properly, the rest is to plug it into a very simple pair of equations derived from similar triangles.
Object height/width on sensor (mm) = Sensor height/width (mm) × Object height/width (pixels) / Sensor height/width (pixels)
Real Object height (in units) = Distance to Object (in units) × Object height on sensor (mm) / Focal Length (mm)
In the first equation, you must decide from which axis you want to measure. For instance, if the image is taken in portrait and you are measuring the width of the object on the image, then input the width of the image in pixels and width of the sensor in mm
Sensor height/width in pixels is nothing but the size of the "image"
Also you must acquire the object size in pixels by any means.
If you are taking image in landscape, make sure you are passing the correct width and height.
Equation 2 is pretty simple as well.
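As a rough illustration, the two equations can be written as a couple of small functions. The function names are made up for this sketch, and the sensor width, focal length, image width, and pixel measurement below are hypothetical placeholders, not the specifications of any real phone; substitute the measured values for your own device.

```python
def size_on_sensor_mm(sensor_mm, object_px, image_px):
    """Equation 1: size of the object's projection on the sensor.

    sensor_mm and image_px must refer to the same axis (both width or both height).
    """
    return sensor_mm * object_px / image_px

def real_object_size(distance, on_sensor_mm, focal_length_mm):
    """Equation 2: real size of the object, in the same units as `distance`."""
    return distance * on_sensor_mm / focal_length_mm

# Hypothetical example values (placeholders only).
sensor_width_mm = 6.4      # physical sensor width
image_width_px = 4000      # width of the uncropped image
focal_length_mm = 4.8      # focal length of the lens
object_width_px = 360      # measured width of the object in the image
distance_mm = 100.0        # measured distance from sensor to object

on_sensor = size_on_sensor_mm(sensor_width_mm, object_width_px, image_width_px)
diameter_mm = real_object_size(distance_mm, on_sensor, focal_length_mm)
print(f"object width on sensor: {on_sensor:.3f} mm")
print(f"estimated real width:   {diameter_mm:.2f} mm")
```

Because the result scales linearly with the distance, a 10% error in the measured distance becomes a 10% error in the estimated size, which is why the distance measurement matters so much.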
Things to consider:
No magnification (Digital magnification can destroy any depth info)
No Autofocus (Already Explained)
No cropping/editing image size/resizing (Already Explained)
No image skewing.(Rotating the image can make the image unfit)
Do not substitute random values for any of these inputs (Golden Advice)
Do not tilt the camera while taking images (Tilting the camera can distort the image so the object height/width will be altered)
Make sure the object and the camera is exactly in the same line
Don't use EXIF data of the image (EXIF data contains depth information which is absolute garbage since they are not accurate at all. DO NOT CONSIDER THEM)
Things I'm unsure of till now:
Lens distortion / Manufacturing defects
Effects of field of view
Perspective Foreshortening due to camera tilt
Depth field cameras
DISCLAIMER: There are multiple ways to solve this issue but I chose to use this method and I highly recommend you guys to explore more and see what you can come up with. You can basically extend this idea to measure pretty much any object using a smartphone (given the images that a normal smart phone can take)
(Please don't try to measure the size of an amoeba with this. It simply won't work, but you can still use some of the advice I have given to your advantage.)
If you have cool ideas or issues with my answer, please feel free to let me know; I would love to discuss them. Feel free to correct me if I have made any mistakes or misunderstood any of these concepts.
Final Note:
No matter how hard you try, you cannot make a smartphone camera behave like a camera sensor that is specifically designed to take images for measurement purposes. A smartphone can never beat those, but we can coax the smartphone camera into achieving similar results up to a certain degree. Keep this in mind; I learnt it the hard way.

lane lines keeping project using opencv-python(Raspberrypi) and Arduino

I have an air drone with four motors and wanted to make it fly between two straight lines.
The first problem:
its initial position will be in the middle at certain height but because of the air factors it may deviate (up or down) or (left or right). I have calculated the error when it deviates left or right using the camera, but still don't know how to calculate the error of the height (using the camera too without pressure sensor).
The second problem:
after calculating these errors how to convert them from an integer to a real move.
Sorry, I couldn't provide my code. it is too large and complicated.

1) Using a single camera to calculate distance is not enough.
However, if you're using a stereo camera, you can get distance data pretty easily. If you want to avoid using a pressure sensor, you may want to consider using a distance sensor (LIDAR or ultrasonic; check the maximum range on these) to measure the height at which your drone will fly. In addition to this, you'll require an error control algorithm, e.g. a PID algorithm, to make your drone fly at a constant height.
This is a fantastic source for understanding the fundamentals of PID.
2)For implementation:
In my opinion, this video is awesome for understanding how your sensor data will get converted to an actual movement, and it will help you create an analogy. You'll also get a head start from the code provided.
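To make the "error to movement" step concrete, here is a minimal PID sketch in plain Python. It is only an illustration: the class, the gains, and the toy simulation (where the controller output is treated directly as a climb rate) are assumptions for the example, not a model of a real quadcopter. On real hardware the output would be mapped to motor throttle and the gains tuned by experiment.

```python
class PID:
    """Textbook PID controller: output = Kp*e + Ki*integral(e) + Kd*de/dt."""

    def __init__(self, kp, ki, kd, setpoint=0.0):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.setpoint = setpoint
        self.integral = 0.0
        self.prev_error = None

    def update(self, measurement, dt):
        error = self.setpoint - measurement
        self.integral += error * dt
        # No derivative on the very first call (there is no previous error yet).
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate_height_hold(pid, steps=800, dt=0.05):
    """Toy plant: the controller output is used directly as a climb rate (m/s)."""
    height = 0.0
    for _ in range(steps):
        climb_rate = pid.update(height, dt)
        height += climb_rate * dt
    return height


controller = PID(kp=2.0, ki=0.5, kd=0.1, setpoint=1.0)  # hold 1.0 m (assumed gains)
print(f"final height: {simulate_height_hold(controller):.3f} m")
```

The same structure works for the left/right error from the camera: run a second controller whose measurement is the lateral offset in pixels and whose output drives roll instead of throttle.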

Measure the rate of growth of a crack from Video

My experiment involves subjecting a substance to pressure that makes the substance eventually crack. The crack grows with time and pressure applied. I have a set-up to take a picture of the substance at fixed intervals of time.
I need to measure how fast the crack grows. How do I go about this? (I can code in Python.)
Is there a way to measure live speed or speed of growth of crack from one frame to another?
Google drive link to series of pictures taken - https://drive.google.com/open?id=189cv8B4rm3lhSgT6OYfI_aN0Xmqi-tYi
Kindly advise.
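I tried floodFill from OpenCV as per the suggestions on this question, but the returned mask is as shown:
import cv2
import numpy as np

h, w = resized.shape[:2]
# floodFill requires a mask 2 pixels wider and taller than the image
mask = np.zeros((h + 2, w + 2), np.uint8)
seed = (w // 2, h // 2)  # image centre; defined but not used in the call below
# Flood fill from point (0, 0); floodflags is set elsewhere in the script (not shown)
num, im, mask, rect = cv2.floodFill(resized, mask, (0, 0), (255, 0, 0), (10,) * 3, (10,) * 3, floodflags)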
I thought if I can get the co-ordinates of the rectangle bounding box that encloses the crack, I can track its co-ordinates across frames and measure the size of the crack and eventually the speed.
I tried thresholding as below:
th, im_th = cv2.threshold(im, 100, 255, cv2.THRESH_BINARY)
This gives:
I'm unsure if this will let me filter out the background and draw a bounding box over the crack alone. Please advise.
Thanks in advance.

Depending on how slowly the crack forms, you probably don't need a video; you'll likely wind up sampling every X frames anyway, and throwing all of the extra frames away. What you want is enough frames to get "incremental" changes in the crack without getting too many frames that it becomes too computationally expensive.
If you can carefully control the lighting conditions in your setup, then you're in luck! This becomes a very simple problem. You can take a histogram of the pixels (OpenCV has functions for this, as do PIL and NumPy); you should get two families of color: one that is the color of the outside of the substance, and another that is biased by the shadow in the crack.
You can also try dramatically increasing the contrast in each image/frame in order to get a binary mask of the crack, or running an edge detector over the image. These techniques will lead to frames that are substantially easier to process than the raw footage. You can even feed these into a skeletonization process in order to generate a vector-based representation of the line, in XY image coordinates.
If you can't control the lighting, or the sample is a similar color to the crack, you'll probably need to use object detection techniques, but it's unlikely there's an existing "crack detector," so you may either need to build your own, or look for what other detectors serve as a good proxy for the color and shape of the forming crack.
I'd highly recommend trying the first option if at all possible; pixel and histogram math is far easier than other techniques.
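A minimal sketch of the histogram approach, using only NumPy so the idea is visible without any OpenCV calls. It assumes 8-bit grayscale frames in which the crack is darker than the material and grows roughly horizontally; the function names are made up for the example. The threshold is chosen with Otsu's method, which picks the split between the two "families" of the histogram.

```python
import numpy as np

def otsu_threshold(gray):
    """Threshold t (0..255) maximising between-class variance; dark class is <= t."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                     # weight of the dark class
    mu = np.cumsum(prob * np.arange(256))       # cumulative mean
    denom = omega * (1.0 - omega)
    numer = (mu[-1] * omega - mu) ** 2
    between = np.where(denom > 0, numer / np.where(denom > 0, denom, 1.0), 0.0)
    return int(np.argmax(between))

def crack_length_px(gray):
    """Horizontal extent of the dark (crack) pixels, in pixels."""
    mask = gray <= otsu_threshold(gray)
    cols = np.flatnonzero(mask.any(axis=0))
    return 0 if cols.size == 0 else int(cols[-1] - cols[0] + 1)

def growth_rate(frames, interval_s):
    """Crack growth between consecutive frames, in pixels per second."""
    lengths = np.array([crack_length_px(f) for f in frames], dtype=float)
    return np.diff(lengths) / interval_s

# Synthetic frames: bright material (200) with a dark crack (30) that lengthens.
frames = []
for length in (20, 50, 90):
    frame = np.full((100, 200), 200, dtype=np.uint8)
    frame[50, :length] = 30
    frames.append(frame)

print(growth_rate(frames, interval_s=2.0))
```

To convert pixels per second into real units you still need a scale reference in the image, and on real photos a small morphological clean-up of the mask is usually needed before measuring.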

I appreciate you are only just getting started, but you have some issues with your video. Firstly, the lighting is not ideal and it is not consistent, because people are moving around in front of it and casting shadows. It also doesn't illuminate the background behind the crack well; it would be better if the light were at the height of the crack and shining more into it. Secondly, you could do without the camera moving part way through the experiment!
Thirdly, if you want to measure things you need to calibrate, which at the very least means putting a ruler in the image, or scale lines on your background at fixed intervals. If you are doing all that, you may as well make life easy for yourself and put markers of a specific colour/pattern, each different, on the top and bottom of the frame plates that are applying the load.
Finally, you want to do something like a floodfill, or a fill just within the confines of your material (probably by masking), to fill the crack with a different colour. It is then pretty simple to measure the length of the crack and the left-most extent of the crack.

With a proper segmentation approach you are going to have a detailed geometry of the object extracted from a single frame. For example:
If you process multiple frames you will be able to see geometry evolution in time. Having that it should be easy to compare polygons to find form changes, cracks, etc:
I used to work with 4K video to get all required details and good accuracy. You might not need all that data, but video is still way more flexible.
Here is a complete example: https://youtu.be/g2KyfrBtTA4
Provide some examples if you want to get more detailed recommendations.
Update
Real examples are always helpful. So you can segment a crack:
or a substance:
or both:
Basically, you need to enhance overall quality of the input (focus, background under the substance, etc).
As Mark Setchell showed, you might get unwanted background as part of the result shape (the right side of the crack), so it is better to make sure that will not happen or just try to analyze only the substance.
Anyway, your task doesn't seem to be complex. It might be trivial if you can improve image quality and do some simplifications to the environment (some specific background, etc).

Python audio analysis: find real time values of the strongest beat in each meter

I have a song and I'd like to use Python to analyze it.
I need to find the "major sounds" in the song.
I use this term because I don't know the technical term for it, but here is what I mean:
https://www.youtube.com/watch?v=TYYyMu3pzL4
If you play the only first second of the song, I count about 4 major sounds.
In general, these are the same sounds that a person would hum if they were humming the song.
What are these called? And is there a function in librosa (or any other library/programming language) that can help me pinpoint their occurrence in a song?
I can provide more info/examples as needed.
UPDATE: After doing more research, I believe I am looking for what is called the "strongest beats". Librosa already has a beat_track function, but I think this gives you every single thing that can be called a beat in the song. I don't really want every beat, just the ones that stand out the most. The over-arching goal here is to create a music video where the major action happening on the screen lines up perfectly with the strongest beats. This creates a synergistic effect within the video - everything feels connected.

The process of parsing audio to identify its sonic archetypes is usually called acoustic fingerprinting.
Audio has a time dimension so to witness your "major sounds" requires listening to the audio for a period of time ... across a succession of instantaneous audio samples. Audio can be thought of as a time series curve where for each instant in time you record the height of the audio curve digitized into PCM format. It takes wall clock time to hear a given "major sound". Here your audio is in its natural state in the time domain. However the information load of a stretch of audio can be transformed into its frequency domain counterpart by feeding a window of audio samples into a fft api call ( to take its Fourier Transform ).
A powerfully subtle aspect of taking the FFT is it removes the dimension of time from the input data and replaces it with a distillation while retaining the input information load. As an aside, if the audio is periodic once transformed from the time domain into its frequency domain representation by applying a Fourier Transform, it can be reconstituted back into the same identical time domain audio curve by applying an inverse Fourier Transform. The data which began life as a curve which wobbles up and down over time is now cast as a spread of frequencies each with an intensity and phase offset yet critically without any notion of time. Now you have the luxury to pluck from this static array of frequencies a set of attributes which can be represented by a mundane struct data structure and yet imbued by its underlying temporal origins.
Here is where you can find your "major sounds". To a first approximation you simply stow the top X frequencies along with their intensity values and this is a measure of a given stretch of time of your input audio captured as its "major sound". Once you have a collection of "major sounds" you can use this to identify when any subsequent audio contains an occurrence of a "major sound" by performing a difference match test between your pre stored set of "major sounds" and the FFT of the current window of audio samples. You have found a match when there is little or no difference between the frequency intensity values of each of those top X frequencies of the current FFT result compared against each pre stored "major sound"
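A rough sketch of the "top X frequencies" idea with NumPy. The function names, the Hann window, the peak-picking rule (a bin counts only if it is larger than both neighbours), and the tolerances are all choices made for this example rather than part of any standard fingerprinting scheme.

```python
import numpy as np

def top_frequencies(window, fs, top_x=3):
    """Return [(frequency_hz, magnitude), ...] for the strongest spectral peaks."""
    windowed = window * np.hanning(len(window))      # reduce spectral leakage
    mags = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(window), d=1.0 / fs)
    # Keep only local maxima so one loud tone does not fill every slot.
    is_peak = (mags[1:-1] > mags[:-2]) & (mags[1:-1] > mags[2:])
    peak_bins = np.flatnonzero(is_peak) + 1
    strongest = peak_bins[np.argsort(mags[peak_bins])[::-1][:top_x]]
    return [(float(freqs[i]), float(mags[i])) for i in strongest]

def fingerprints_match(fp_a, fp_b, freq_tol_hz=10.0, mag_tol=0.2):
    """True if both fingerprints have the same peaks with similar intensities."""
    if len(fp_a) != len(fp_b):
        return False
    for (fa, ma), (fb, mb) in zip(sorted(fp_a), sorted(fp_b)):
        if abs(fa - fb) > freq_tol_hz:
            return False
        if abs(ma - mb) > mag_tol * max(ma, mb):
            return False
    return True

# Synthetic "major sound": a 440 Hz tone plus a quieter 1000 Hz tone.
fs = 8000
t = np.arange(1024) / fs
sound = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 1000 * t)
stored = top_frequencies(sound, fs, top_x=2)
print(stored)
```

To locate occurrences in a song you would slide this over successive windows and record the times at which `fingerprints_match` returns True. The frequency resolution is fs divided by the window length, so longer windows separate nearby tones better at the cost of coarser timing.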
I could digress by explaining how by sitting down and playing the piano you are performing the inverse Fourier Transform of those little white and black frequency keys, or by saying the muddied wagon tracks across a spring rain swollen pasture is the Fourier Transform of all those untold numbers of heavily laden market wagons as they trundle forward leaving behind an ever deepening track imprinted with each wagon's axle width, but I won't.
Here are some links on audio fingerprinting:
Audio fingerprinting and recognition in Python https://github.com/worldveil/dejavu
Audio Fingerprinting with Python and Numpy http://willdrevo.com/fingerprinting-and-audio-recognition-with-python/
Shazam-like acoustic fingerprinting of continuous audio streams (github.com) https://news.ycombinator.com/item?id=15809291
https://github.com/dest4/stream-audio-fingerprint
Audio landmark fingerprinting as a Node Stream module - nodejs converts a PCM audio signal into a series of audio fingerprints. https://github.com/adblockradio/stream-audio-fingerprint
https://stackoverflow.com/questions/26357841/audio-matching-audio-fingerprinting

Requesting Guidance: What approach to follow to do Quality Control on small and thin metal ring shafts using Computer Vision?

I am new to the Computer Vision field and looking for your guidance to identify approach to tackle the following scenario:
What approach to follow to do Quality Control on small and thin metal rings using Computer Vision
Putting below the detailed requirement(this is the best I can share):
To begin with, I have attached a picture of the ring we need to do QC of.
Ring_for_QC
Ring diameter = 3 inch
Following checks we need to do:
1.Surface coating of the ring peeled off
2.Portion of ring chipped off
3.Scratch on the ring's Surface
4.Width of the ring is uneven
5.Dent on the ring
6.Entire surface of the ring is not completely horizontal to the plane;
may be due to some dent a part of the ring is resting on the plane surface creating some 1 or 2 degree angle
(I have marked no.6 as 'uneven surface' in the attached picture)
I have also attached another picture marking the quality issues found on a random ring (elevated view with marked QC issues).
Scenario:
One single ring can have one or more than one of the above mentioned 6 defects
Issue 1 & 3 can occur at either surface of the ring and we need to check both the surfaces
We need to QC on one single ring at a time
Challenge:
- Need to set up a work station to capture image or video of each ring under check
How many cameras will be there in that work station and what would be the angle for the camera
As we need to check both the sides of the ring we need to decide whether:
we will place the ring on a transparent surface and take image
or
we need to flip the ring after image is taken on one side
Next challenge is what computer vision technique we should employ to identify all these issues
For the time being we are doing some research around opencv's background subtraction methods
It will be helpful to get some insight from you on
what should be a better/feasible approach

Since this is for a student project I'll emphasize image processing more than other aspects of an application. See the bottom section for considerations for real-world applications.
That aside, a general comment: implementing vision for quality control (QC) is hard to get right. If the product to be inspected is cheap (e.g. a ring, a small plastic thing), and if the result of the vision inspection is a borderline pass/fail, or uncertain, you can reject the part. If the part to be inspected is expensive (e.g. a large assembly for a tractor, individual CPUs, medical devices near the end of the production line), then you have to have very well defined specifications, and the system needs to be made as robust as possible.
In general, you want to optimize imaging for each type of defect. For example, the camera location, lens, and lighting to detect scratches may be quite different than what is needed for dimensional gauging (a.k.a. dimensional measurement).
Machine Vision vs. Computer Vision
When you search online for algorithms, equipment, and techniques specific to vision for industrial automation, including the quality control of parts on production lines, then for English-language websites favor the term "machine vision" instead of "computer vision."
https://en.wikipedia.org/wiki/Machine_vision
Machine vision is the common industry term for image processing (+ cameras + lighting + ...) for industrial use. Although different people may use different terminology, and the terminology isn't as important as learning techniques, you'll find a lot of material by searching for "machine vision." The term "computer vision" tends to be used for non-industrial applications, and for academic research, though in languages other than English the terms "machine vision" and "computer vision" may be the same. By comparison, "medical imaging" is similar to machine vision, but involves application of image processing to medical applications.
Lighting
Most importantly, you must control the lighting. Ambient lighting, such as desk lamps, overhead lights, etc., are not only useless for a vision system inspecting parts in production, but will typically interfere with image processing. You might find some defects sometimes with poorly controlled light, but to generate the most consistent results, you'll need to set up lights in specific locations, run the lights at specific, verifiable intensities, and have your vision system detect when something has gone wrong with the lighting.
There are "machine vision lights" designed especially for specific applications such as finding scratches in shiny surfaces, making shiny surfaces look less shiny, to backlight parts (which is useful for dimensional gauging), to illuminate parts from low angles, and so on. Read about different types of lighting.
https://smartvisionlights.com/
https://www.vision-systems.com/content/dam/VSD/solutionsinvision/Resources/lighting_tips_white_paper.pdf
Rather than spend a lot of money on special lights, you can mock them up:
LED flashlight or single LED (as a "point" light source)
Bright light + translucent sheet of plastic (for backlighting)
White tissue paper or some other diffusing material in front of a bright light
...
The importance of lighting cannot be overstated. Controlling lighting conditions improves the chance of success, and is typically necessary to achieve the accuracy of measurement or pass/fail assessment required in real-world environments.
Accuracy, Correctness, Usefulness
At some point you'll probably wonder whether machine learning is useful or necessary for the application. The question to ask yourself (or the customer) is this: what percentage of defects would need to be detected?
For example, if a chip is missing from the ring that could be a fatal defect. Is the ring used in some safety-critical application? If so, vision inspection for QC would have to be extraordinarily robust.
Even if you're familiar with the terms "accuracy" and "precision," make sure they have very clear meanings as you consider image processing problems:
https://en.wikipedia.org/wiki/Accuracy_and_precision
So, what percentage of chip defects needs to be found? 90%? 95%? 98%?
Using the term "accurate" more loosely to mean "the vision system gets the measurement correct and/or finds the defects we know are there," what is the accuracy of the most accurate machine learning algorithm you've read about? Or at least, what would qualify as reasonably impressive accuracy for machine learning? 95%? 98%?
If you're making measurements of machine parts on a production line, then you would typically want the accuracy of dimensional measurements and defect detection to be 99% or better. For high-value products, and products such as electronic components that are highly sensitive to defects, accuracy may need to be 99.999% or better. Think of it this way: if a manufacturer is making thousands or tens of thousands of parts, they don't want garbage parts to make it past your vision system several times a day.
Machine learning for image processing has been around a long time. Processing speeds, memory, and training set sizes have improved, and there have been improvements in algorithms as well, but it's important to note that machine learning is suitable only for some applications, and will fail miserably at other applications.
Techniques
To begin with, I have attached a picture of the ring we need to do QC of.
Ring_for_QC
Ring diameter = 3 inch
Get the exact diameter, including tolerances. If the nominal diameter is 3.000 inches, then the tolerance might be expressed in terms of thousandths of an inch. You may not need to know that for a student project, but if you were proposing a solution for a factory owner you wouldn't want to even suggest a price or timeline for delivery without having complete specs for the part, and numerous samples of the part.
From the one image it's not possible to be too specific about what a defect might look like--the same part can have different defects in different factories, or even on different production lines of the same factory--but we can make some guesses.
1.Surface coating of the ring peeled off
From the one image it's not clear what the surface coating is supposed to look like, or what's underneath. You must provide at least one image of a good part, and at least one image for each type of defect.
What is the surface coating? Anodization? Paint? Enamel? Plastic? Cheese? Whatever the case, knowing what material it is, and how that material degrades, will give some clues about what sort of vision setup may help detect problems with the coating. Changes in coating quality can affect apparent texture (e.g. edge content), brightness/darkness (intensity), color, shininess, and so on.
For the moment, let's assume the coating peeling off changes the brightness or texture of the uncoated surface vs. the remaining coated surface. Then your image processing might look something like the following:
Determine whether a ring is in the image
Segment the ring from the background. That is, use an algorithm such as connected components (OpenCV's findContours()), SIFT, or some other technique to identify the presence and location of a rigid object of known size and shape from the background.
Isolate further processing to just those pixels corresponding to the surface of the part.
Use some technique to find clusters of different texture differences, brightness differences, etc. This is where a better description of the coating is required. If lighting and lens parameters are "fixed," you can consider generating a histogram of brightness values in the image (0 = black, 255 = white) and then comparing the histogram of good parts and bad parts--is there some statistical difference? Or you might use connected components (findContours() again) to cluster pixels of different colors, assuming the lack of coating changes the apparent color of the part: maybe the coating is brown and the part is silvery.
It's hard to guess what technique would be relevant here without photos and/or a much more specific description of the coating. Hopefully this makes it clear why specs are important.
Coatings can be absent in different ways: peeling, small absences (voids), partially scraped away, etc. It can be difficult to predict in advance what the shape and size of missing coating may be.
When the size and shape of a defect is hard to predict, but when the defect is associated with a difference in image intensity (pixel brightness) or color, then explore these ideas:
Generate an "edge image" in which you find brightness/color transitions. You start with the grayscale or color image, then use Sobel or Canny or some other algorithm to generate an image of edge intensities.
Apply statistical methods to determine how "edgy" an image is. Are there more than N pixels (or more than 5% of all pixels) with an edge strength greater than S?
Once you have some basic algorithm that identifies the difference between good parts and parts with some missing coating, then you could consider using machine learning to review lots (lots!) of samples to help determine the best parameterization. For example, how do you know what number of edge pixels or edge pixel strength should be considered "bad"?
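As a sketch of the "how edgy is this image" statistic, the following computes a Sobel edge-strength image with plain NumPy and reports the fraction of pixels above a strength S. The function names and the default thresholds (S = 100, 5% of pixels) are placeholders for the example; real values would have to come from comparing known-good and known-bad parts under fixed lighting. In practice you would apply this only to the pixels already segmented as the part's surface.

```python
import numpy as np

def sobel_magnitude(gray):
    """Sobel edge strength for a 2-D grayscale image (edges padded by replication)."""
    p = np.pad(gray.astype(float), 1, mode="edge")
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]) \
       - (p[:-2, :-2] + 2 * p[1:-1, :-2] + p[2:, :-2])
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]) \
       - (p[:-2, :-2] + 2 * p[:-2, 1:-1] + p[:-2, 2:])
    return np.hypot(gx, gy)

def edge_fraction(gray, strength):
    """Fraction of pixels whose edge strength exceeds `strength`."""
    return float(np.mean(sobel_magnitude(gray) > strength))

def looks_defective(gray, strength=100.0, max_fraction=0.05):
    """Flag the image if more than `max_fraction` of its pixels are strong edges."""
    return edge_fraction(gray, strength) > max_fraction

# Synthetic check: a smooth surface versus one with a rough, striped patch.
good = np.full((50, 50), 120, dtype=np.uint8)
bad = good.copy()
stripes = np.tile(np.array([0, 0, 255, 255], dtype=np.uint8), 5)  # 20 columns
bad[15:35, 15:35] = stripes
print(edge_fraction(good, 100.0), edge_fraction(bad, 100.0))
```

With OpenCV available, `cv2.Sobel` or `cv2.Canny` would replace the hand-written filter; the statistic on top stays the same.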
2.Portion of ring chipped off
It depends on whether the chip is visible just from the part's outline. For example, if you placed the part on a light table (a.k.a. "backlight"), would you always see a defect considered to be a "chip"? Or could the chip just be on the top surface facing the camera?
To find chips on edges, having the part on a backlight simplifies matters greatly.
Identify the location and orientation of the part (e.g. using connected components, normalized correlation, SIFT, or whatever algorithm is suitable for the part and accuracy of location required).
Find edges corresponding to the outer and inner rings of the part.
Fit a circle or nearly circular ellipse to the edge points using Hough circle fit, RANSAC circle fit, or (meh) least squares circle fit parameterized to the known dimensions (in pixels) of the outer ring and inner ring diameters.
For the points used for the circle fits, find the point-to-circle (or point-to-ellipse) shortest distance. The larger this distance, the more likely you have a chip or missing chunk.
To ensure you're finding indentations, chips, or whatever, and not just individual "noise" edge points, examine points in order going clockwise or anticlockwise, and only consider a series of perimeter points as defects if N successive points have a median or possibly mean point-to-circle distance greater than some threshold D.
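The fit-then-measure steps above might look roughly like this in NumPy. For brevity the sketch uses a plain algebraic least-squares circle fit, which is the least robust of the options listed; a RANSAC or Hough fit would be less disturbed by the chip itself. The function names, the window of 5 successive points, and the 4-pixel distance threshold are assumptions for the example, and the edge points are assumed to be already extracted and ordered around the perimeter.

```python
import numpy as np

def fit_circle(points):
    """Algebraic least-squares circle fit. points: (N, 2) array of x, y."""
    x, y = points[:, 0], points[:, 1]
    A = np.column_stack([2 * x, 2 * y, np.ones_like(x)])
    b = x ** 2 + y ** 2
    (cx, cy, c), *_ = np.linalg.lstsq(A, b, rcond=None)
    return cx, cy, np.sqrt(c + cx ** 2 + cy ** 2)

def find_chips(points, n_successive=5, max_dist=4.0):
    """Indices where a run of n_successive perimeter points starts whose
    median point-to-circle distance exceeds max_dist (pixels)."""
    cx, cy, r = fit_circle(points)
    dist = np.abs(np.hypot(points[:, 0] - cx, points[:, 1] - cy) - r)
    n = len(points)
    flagged = []
    for start in range(n):
        idx = [(start + k) % n for k in range(n_successive)]  # wrap around the ring
        if np.median(dist[idx]) > max_dist:
            flagged.append(start)
    return flagged

# Synthetic outer edge: radius 100 px, with a 20-degree arc chipped in by 10 px.
angles = np.radians(np.arange(360))
radius = np.full(360, 100.0)
radius[40:60] = 90.0
edge = np.column_stack([150 + radius * np.cos(angles),
                        150 + radius * np.sin(angles)])
print(find_chips(edge))
```

Using the median over a run of points means a single noisy edge point cannot trigger a defect on its own; at least half of the window has to be off the fitted circle.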
A simpler approach could be to fit a black-and-white mask--a template--representing a good part to the current location and rotation of the part to be inspected. If the template and sample part are aligned very precisely, and if you perform image subtraction, then you may be fortunate enough to get clusters of pixels where there are defects. But this method is fairly crude, and harder to make robust.
There are machine learning techniques to identify chips on edges, but you'd need lots of part samples to train the techniques. Optionally, if you don't have enough samples, you can use the same samples with slightly modified lighting, at different locations in the image, with manually added defects, etc., to help train the algorithm. But that's another discussion altogether.
3. Scratch on the ring's surface
See the link above about different types of lighting. You'll need to experiment with a few different lighting configurations to figure out what works for your part.
Generally, though, scratches are likely to differ in brightness and "edginess" (image edge content) from the rest of the part. If you're lucky, a scratch can reveal a different color.
Scratches can vary so much in appearance, area, and shape that it would be hard to parameterize an algorithm to catch them all. Once again, statistical analysis of edge content, brightness, and color tends to be useful.
In general: to achieve the best results for a particular QC inspection, you'll need to engineer a system specifically for the part. Your vision system may be configurable, and there can be different combinations of lights and cameras for different types of QC inspection, but for any particular defect detection you want to control the appearance of the part as much as possible. Relying on software to do all the work yields a less robust system that customers will typically yank out and throw away.
4. Width of the ring is uneven
This is almost an example of dimensional gauging or optical gauging. If you're just looking for unevenness, you don't necessarily need to measure diameter in engineering units such as millimeters: you can just measure pixels. BUT the effort required to ensure your measurement in pixels is accurate will typically lead you to measuring in millimeters anyway.
Assuming the optical setup is correct and (more or less) calibrated, which I'll describe below, here's a basic process:
Identify the location and orientation of the part.
From the algorithm that finds the part, or from a follow-on algorithm that identifies edge pixels (e.g. Sobel, Canny, ...), find the edge pixels just for the outer diameter of the ring.
Perform a circle/ellipse fit to the edge pixels, and eliminate outlier pixels that don't actually belong to the circle/ellipse.
Have your algorithm start with the 1st pixel in the list of edge pixels corresponding to the outer diameter.
From that 1st pixel, find the edge pixel farthest away. Ideally, this would be the point diametrically opposite.
Cycle through all pixels, finding the distance to the farthest pixel. (This is not optimal in terms of speed, but simpler to code.)
Generate a histogram of all distances.
Make a determination of good/bad based on the histogram of point-to-point distances.
You might call a part "bad" for one or more of the following conditions:
At least N point-to-point distances exceed a distance of P pixels
The standard deviation of point-to-point distances exceeds some threshold T
...
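A minimal sketch of the farthest-point measurement and the two example "bad" conditions, assuming you already have the cleaned-up outer edge pixels as (x, y) tuples. The parameter names are made up for illustration, and the brute-force loop is the slow-but-simple version described above.

```python
import math
from statistics import pstdev


def farthest_distances(points):
    """For each edge point, the distance to the farthest other edge point (O(n^2))."""
    return [max(math.hypot(px - qx, py - qy) for qx, qy in points)
            for px, py in points]


def is_bad(points, max_length, max_count, max_std):
    """Apply the two example conditions to the point-to-point distances.

    max_length: distance P in pixels that should not be exceeded
    max_count:  number N of over-length distances that makes the part bad
    max_std:    threshold T on the standard deviation of all distances
    """
    distances = farthest_distances(points)
    too_long = sum(1 for d in distances if d > max_length)
    return too_long >= max_count or pstdev(distances) > max_std
```

For a clean circle every farthest distance is close to the diameter, so the spread is near zero; an oval or lopsided ring widens the histogram and pushes the standard deviation up.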
Measurement of distance depends on the consistency of point-to-point distances at different locations within the image. If you perform accurate, precise measurements of distance, you'll notice that an object of fixed length appears to vary in length depending on its location in the image: if the object is located in the center of the image it may appear to be 57.5 pixels long, but in one corner of the image it may appear to be 56.2 pixels long.
To correct for these irregularities, you can...
Perform a nonlinear flatness correction. This will also correct for non-normal alignment of the camera to part, though you want to start with the optical axis of the camera as normal (perpendicular) to the surface of the part as possible.
Make a few quick measurements to estimate how much measurements vary.
5. Dent on the ring
6. Entire surface of the ring is not completely horizontal to the plane; may be due to some dent a part of the ring is resting on the plane surface creating some 1 or 2 degree angle (I have marked no.6 as 'uneven surface' in the attached picture)
Use cameras imaging from the sides. Make sure the background is simple.
A 1- to 2-degree difference could be hard to detect using a camera placed directly overhead. If you're lucky you could detect that the outer edge of the part is more elliptical than circular, but the ability to detect this would depend on the color and thickness of the part. Also, you wouldn't necessarily be able to distinguish between a misshapen part and one resting at an angle--but for some inspections that's okay since both are defects.
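To put rough numbers on why the overhead view struggles, here is a back-of-the-envelope sketch. It assumes an idealized view with no perspective (so a tilted circle projects to an ellipse whose minor axis is the diameter times the cosine of the tilt), and the 500-pixel and 50 mm diameters are invented example values, not taken from your part.

```python
import math


def apparent_minor_axis(diameter_px, tilt_deg):
    """Minor axis (pixels) of a circle of given diameter seen from overhead when tilted."""
    return diameter_px * math.cos(math.radians(tilt_deg))


def edge_rise(diameter_mm, tilt_deg):
    """How far (mm) the high side of the ring lifts off the surface, as seen from the side."""
    return diameter_mm * math.sin(math.radians(tilt_deg))


# Overhead: a 500-pixel ring tilted 2 degrees shrinks by only about 0.3 pixel.
shrink_px = 500 - apparent_minor_axis(500, 2)

# Side view: the same 2 degrees lifts one edge of a 50 mm ring by about 1.7 mm.
rise_mm = edge_rise(50, 2)
```

A sub-pixel change in ellipticity is lost in edge noise, whereas a lift of more than a millimeter is easy to see from the side against a simple background.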
HOWEVER, in a real-world application the customer might not be happy if you reject parts that are otherwise good, but happen to be sitting at a slight angle. A mechanical fixture might fix the problem by ensuring parts are lying flat.
I have also attached another picture marking the quality issues found
on a random ring. (Image: elevated view with marked QC issues)
The image isn't clear enough. Put the part on a simpler background and tinker with lighting to make it more obvious what the differences are between good and bad.
One single ring can have one or more than one of the above mentioned 6 defects
Run one algorithm after the other. You may also have to turn different lights on and off before running each algorithm (or rather, each chain of algorithms).
Issue 1 & 3 can occur at either surface of the ring and we need to check both the surfaces
We need to QC on one single ring at a time
You may have to write an algorithm to detect whether multiple rings happen to be present. Even if you weren't asked to do this specifically, this happens in production, and your professor may surprise you with it. At least have an idea how you would detect the presence of multiple rings.
That's another aspect of vision: you may start thinking of what algorithms and lighting are necessary to solve "the problem," but you'll also spend a lot of time figuring out everything that could go wrong, and writing software to detect those conditions to ensure you don't yield a false result. For example, what happens if the lights turn off? What if two rings are present? What if the ring isn't fully within the field of view? What if dirt gets on the surface the part is resting on? What if the lens gets dirty (which it will)?
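A few of those "what could go wrong" checks can be sketched in a handful of lines. This is a hedged example only: it assumes a backlit grayscale image (part dark, background bright) stored as a list of lists, and the thresholds and problem labels are placeholders I made up.

```python
from collections import deque


def count_blobs(mask):
    """Count 4-connected foreground regions in a 2-D list of 0/1 values."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    blobs = 0
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not seen[y][x]:
                blobs += 1
                seen[y][x] = True
                queue = deque([(y, x)])
                while queue:  # flood-fill this region
                    cy, cx = queue.popleft()
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    return blobs


def touches_border(mask):
    """True if any foreground pixel lies on the image border."""
    return (any(mask[0]) or any(mask[-1])
            or any(row[0] or row[-1] for row in mask))


def precheck(gray, dark_level=50, part_level=100):
    """Return a list of problems; an empty list means it is safe to inspect."""
    pixels = [v for row in gray for v in row]
    if sum(pixels) / len(pixels) < dark_level:
        return ["too_dark"]  # lights off, or something covering the lens
    # Backlit assumption: pixels darker than part_level belong to a part.
    mask = [[1 if v < part_level else 0 for v in row] for row in gray]
    problems = []
    blobs = count_blobs(mask)
    if blobs == 0:
        problems.append("no_part")
    elif blobs > 1:
        problems.append("multiple_parts")  # could also be dirt on the surface
    if touches_border(mask):
        problems.append("part_at_border")  # ring not fully in the field of view
    return problems
```

Run something like this before any defect algorithm, and report these conditions separately from "bad part" so nobody mistakes a dead light for a run of rejects.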
A few principles:
Provide the best image for image processing before you consider what algorithm would work best.
Understand what accuracy/success rate is necessary, and measure it.
Get as many samples as you possibly can: hundreds, thousands if possible. Having a chance to measure "online" (in real production) is helpful.
Real-world applications
If it were a real-world application--that is, if you went into the field of vision professionally--there are many more steps that may seem less difficult, but that turn out to be critical:
How rings come into view (or into "station"): on a moving conveyor? placed by a robot? in some container?
What triggers vision inspection of the ring: a programmable logic controller, a "light curtain" the ring passes through, or the vision system itself determining when a ring is ready for inspection.
How results are communicated to other equipment. (This can be a huge hassle, and an otherwise good vision system can be rejected by a customer if communications aren't designed and implemented properly.)
Whether you are guaranteed to see only one ring at a time
This isn't to say university isn't the real world: just that you probably won't lose tens or hundreds of thousands of Euros/pounds/dollars if you happen to overlook something.

You can look at how face recognition is done:
1. Face detection.
2. Face alignment and normalization.
3. Feature extraction.
4. Comparing features with a reference pattern.
But in your case, you can skip step 3 and compare the output of step 2 with the reference image. Depending on the conditions, additional filtering may be necessary.
