I have pictures of networks, and my goal is to process the images in order to extract the skeleton of the network.
My approach has two steps:
1) Going from a grayscale image to a binary image, using local thresholding or Otsu's method followed by a median filter (the Python function medfilt).
2) Using a thinning algorithm to extract the skeleton of the network.
Considering the quality of the first image, I'm pretty sure I can do a lot better than that. I thus have two questions:
1) Taking the last image, how would you remove the artefacts, namely all the small lines perpendicular to the ridges and the small gaps?
2) Which algorithms would you actually advise me to use for these two steps?
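For reference, step 1 as described above could look roughly like the sketch below. It is only an illustration, not the code actually used for the images: the function names and the kernel size are made up, Otsu's threshold is computed by hand with NumPy, and scipy.signal.medfilt2d stands in for medfilt. The thinning of step 2 is not shown.

```python
import numpy as np
from scipy.signal import medfilt2d


def otsu_threshold(gray):
    """Return the Otsu threshold of a uint8 grayscale image."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    levels = np.arange(256)
    w0 = np.cumsum(prob)            # weight of the class "value <= t"
    mu = np.cumsum(prob * levels)   # cumulative mean up to t
    mu_total = mu[-1]
    denom = w0 * (1.0 - w0)
    # Between-class variance, left at zero where one class is empty.
    sigma_b = np.zeros(256)
    valid = denom > 1e-12
    sigma_b[valid] = (mu_total * w0[valid] - mu[valid]) ** 2 / denom[valid]
    return int(np.argmax(sigma_b))


def binarize(gray, kernel_size=5):
    """Otsu threshold followed by a median filter (kernel_size is a guess)."""
    t = otsu_threshold(gray)
    binary = (gray > t).astype(np.uint8)
    # The median filter removes isolated pixels smaller than the kernel.
    return medfilt2d(binary, kernel_size)
```

A larger kernel removes bigger specks but also starts eroding thin ridges, so the kernel size has to be tuned against the ridge width.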
I need to delete the noise from this image. My problem is that I need a neat contour without all the lines like in this image.
Do you have any suggestions on how to do that using Python?
Looking at your example images, I suppose you are looking for an image processing algorithm that finds the edges of your image (in your case, the border lines of the ground plan).
Have a look at the Canny edge detection algorithm, which might be well suited for this task. A tutorial with an example implementation in Python can be found here.
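To give an idea of what such an edge detector does, here is a simplified sketch using only SciPy. Note that this is not full Canny: it covers the Gaussian smoothing and Sobel gradient stages and then applies a single relative threshold, whereas Canny additionally performs non-maximum suppression and hysteresis thresholding (OpenCV provides the complete algorithm as cv2.Canny). The function name and default parameters are made up for the example.

```python
import numpy as np
from scipy import ndimage


def gradient_edges(gray, sigma=1.0, rel_threshold=0.3):
    """Boolean edge map from the smoothed Sobel gradient magnitude."""
    smoothed = ndimage.gaussian_filter(gray.astype(np.float64), sigma)
    gx = ndimage.sobel(smoothed, axis=1)  # horizontal derivative
    gy = ndimage.sobel(smoothed, axis=0)  # vertical derivative
    magnitude = np.hypot(gx, gy)
    if magnitude.max() == 0:
        # Completely flat image: no edges at all.
        return np.zeros(gray.shape, dtype=bool)
    # Keep pixels whose gradient is a given fraction of the strongest one.
    return magnitude >= rel_threshold * magnitude.max()
```

The resulting edges are a few pixels thick; thinning them to one pixel is exactly what the non-maximum suppression step of Canny adds.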
I am working on a project where I have to find the background of a given gray-scale image.
I searched online and found some algorithms that use the OpenCV library (such as the following: https://docs.opencv.org/3.0-beta/doc/py_tutorials/py_video/py_bg_subtraction/py_bg_subtraction.html#py-background-subtraction).
This kind of approach doesn't work for me.
The image I want to process is:
As you can see, it is in gray-scale and shows the "gray static" background. I would love to see only the nucleus of the cell (the resolution and quality of the images will improve over time; this is a pretty raw one).
I tried to subtract the 2D magnitude FFT of the background from the main image, but the result is not good:
What I am asking is: What kind of process do you suggest to use to eliminate background?
Have you already tried the watershed algorithm? I saw in a paper that it has been used, and improved on, for cell image segmentation.
Background subtraction won't work for your images because your background is not consistent and the image's SNR is too low.
So you have 2 options:
1) Use a deep learning method (such as U-Net) if you have enough data.
2) Apply a bilateral filter, then a method such as active contours, GLCM texture features, or k-means clustering.
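As an illustration of the last of those methods, here is a minimal k-means clustering on pixel intensities, written with NumPy only. It assumes the image has already been denoised (for example with a bilateral filter, which is not shown here), and the function name, the number of clusters and the initialisation are choices made for this sketch.

```python
import numpy as np


def kmeans_intensity(gray, k=2, iters=20):
    """Cluster pixel intensities into k groups; returns (labels, centers)."""
    pixels = gray.astype(np.float64).ravel()
    # Deterministic initialisation: centers spread over the intensity range.
    centers = np.linspace(pixels.min(), pixels.max(), k)
    for _ in range(iters):
        # Assign every pixel to its nearest center.
        labels = np.argmin(np.abs(pixels[:, None] - centers[None, :]), axis=1)
        new_centers = centers.copy()
        for j in range(k):
            members = pixels[labels == j]
            if members.size:
                new_centers[j] = members.mean()
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels.reshape(gray.shape), centers
```

With k=2 the label image is a foreground/background mask. Clustering on intensity alone ignores spatial structure, so it only works once the noise has been smoothed enough for the two intensity populations to separate.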
Goal:
For the past two weeks I've been trying to figure out how to convert the following image:
To one that looks like this (may not match exactly, as this image was taken at a different time):
Lens Correction (necessary?):
The first thing I noticed is that simply slicing the image and overlaying the four parts wouldn't work perfectly, as the curvature of certain lines does not match. For instance, the mid-court line bends left in the second slice and bends right in the third slice. This bending looks like a barrel distortion so I tried using both a parameterized lens correction function (passing k1, k2, and k3 to OpenCV) and using lensfun. Since the lensfun database does not include my camera make or model (it's an AXIS camera) and I do not know the make or model of the lens (it's manufactured as part of the camera), I wrote a small script to dump test images using various lenses with various parameters, then skimmed through the thousands of output images until I found one that looked like it had relatively straight lines:
This correction was done using the "Samyang 12mm f/2.8 Fish-Eye ED AS NCS" lens with a "Canon EOS 10D" camera in lensfun. It's probably not perfect, but I figured it was close enough to move on to step two.
Once the lens distortion was corrected, the second issue is that the same line in two slices was pointing in different directions, which should be corrected with a simple perspective transform. So I began a long quest to figure out the proper parameters for this perspective transform.
Failed Attempts:
1. Using SciPy
I started by writing a cost function to judge the "quality" of a given set of parameters (overlapped pixels should match) and applying SciPy's solver to figure it out. I made several tweaks to my cost function (applying a Gaussian blur, scaling down the image, gray scaling the image, using the Sobel operator to get a gradient, looking only at the pixels on either side of a "seam" after overlapping instead of the whole overlap region, etc) but it always failed to find a good solution. The results looked worse than the original camera image most of the time:
2. Using math
When that failed I tried applying math to compute the proper perspective transform. I know the FOV of the camera (from the spec sheet), I know the image width and height, I know the sensor size (from the spec sheet), and using a protractor I measured the angles between the lenses. Using the pinhole model I then calculated the expected (x,y) values of points on the image plane and what transform would be necessary to correct them. The results looked better than SciPy, but were still dismal.
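For anyone wanting to reproduce that kind of calculation, here is a sketch of the pinhole-model math under simplifying assumptions that may well not hold for this camera: the lenses share a single center of projection, differ only by a rotation about the vertical axis, have square pixels, and have their principal point at the image center. The numbers in the usage note are placeholders, not values from the spec sheet, and the sign of the angle depends on the chosen convention.

```python
import numpy as np


def focal_length_px(image_width, hfov_deg):
    """Focal length in pixels from the horizontal field of view."""
    return (image_width / 2.0) / np.tan(np.radians(hfov_deg) / 2.0)


def rotation_homography(image_width, image_height, hfov_deg, yaw_deg):
    """Homography H = K R K^-1 induced by a pure rotation about the y axis."""
    f = focal_length_px(image_width, hfov_deg)
    cx, cy = image_width / 2.0, image_height / 2.0
    K = np.array([[f, 0.0, cx], [0.0, f, cy], [0.0, 0.0, 1.0]])
    c, s = np.cos(np.radians(yaw_deg)), np.sin(np.radians(yaw_deg))
    R = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
    return K @ R @ np.linalg.inv(K)


def apply_homography(H, x, y):
    """Map one pixel through H and return the dehomogenized (x, y)."""
    p = H @ np.array([x, y, 1.0])
    return p[0] / p[2], p[1] / p[2]
```

For example, rotation_homography(1920, 1080, 90, 30) moves the image center horizontally by f * tan(30 degrees) and leaves its vertical position unchanged. Any parallax between the lenses is not captured by this model.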
3. Using OpenCV's Stitcher
After this I tried using OpenCV's built-in Stitcher class. However it failed to stitch together slices 2 and 3 due to insufficient overlap between the images (and about 10% of the time it even failed to stitch together slices 1 and 2, presumably because of the non-deterministic nature of RANSAC). Even when it did succeed, the stitch wasn't that great:
4. Using ORB and OpenCV's findHomography
Most recently I tried using ORB with a mask (only looking for features in the overlap region) and OpenCV's findHomography function to create a custom version of the Stitcher. While the matches seemed promising, the resulting stitch was still sub-optimal:
I'm beginning to suspect that my methodology (slice -> lens correct -> perspective transform -> overlay) is flawed and there's a better way to do this.
5. Updated ORB / findHomography
I updated my feature detection to eliminate any matches where the Y coordinates differed drastically (e.g. matching the white of the table to the white of the lights). After doing this my number of matched features fell from ~110 to ~55, but the homography was improved significantly. Here's the stitch that results for slices 1/2 and 2/3 with the update:
Until someone can tell me that I'm going about this all wrong, I'm going to keep pursuing this strategy with the following added step:
Slice image
Lens correct each slice
Perspective transform slice 2 or 3 so that the side line is horizontal and the mid-court line is vertical
Use ORB + match filtering + findHomography to iteratively align and then stitch adjacent slices
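The match filtering mentioned in the update can be sketched as below. This is a simplified stand-in rather than my actual code: it assumes the matched keypoint coordinates have already been extracted into two (N, 2) arrays of (x, y) points, and max_dy is a made-up threshold.

```python
import numpy as np


def filter_matches_by_y(pts_a, pts_b, max_dy=20.0):
    """Drop matches whose Y coordinates differ by more than max_dy pixels.

    pts_a, pts_b: (N, 2) arrays of matched (x, y) points, one row per match.
    Returns the surviving points from both sets plus the boolean keep mask.
    """
    pts_a = np.asarray(pts_a, dtype=np.float64)
    pts_b = np.asarray(pts_b, dtype=np.float64)
    keep = np.abs(pts_a[:, 1] - pts_b[:, 1]) <= max_dy
    return pts_a[keep], pts_b[keep], keep
```

The surviving point pairs are what then gets passed to findHomography.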
Ultimately when it's all said and done I want to try and compute a mapping from input pixels to output pixels so that we're not doing all of this complex work (lens correction, ORB, findHomography, etc) per-frame. We'll do it once per camera, save the mapping to a file somewhere, then we can in real-time map the input video to an output video frame-by-frame using cv2.remap
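A rough sketch of that precomputed mapping idea follows. It assumes the whole chain has been collapsed into a single 3x3 homography that maps output pixels back to input pixels (lens correction would need to be folded into the tables separately), and it uses scipy.ndimage.map_coordinates as a stand-in for cv2.remap so that the example needs only NumPy and SciPy. The function names are made up.

```python
import numpy as np
from scipy import ndimage


def build_remap_tables(h_out_to_in, out_shape):
    """For every output pixel, precompute the input (x, y) it samples from."""
    height, width = out_shape
    xs, ys = np.meshgrid(np.arange(width), np.arange(height))
    pts = np.stack([xs, ys, np.ones_like(xs)], axis=0)
    pts = pts.reshape(3, -1).astype(np.float64)
    src = h_out_to_in @ pts
    map_x = (src[0] / src[2]).reshape(height, width).astype(np.float32)
    map_y = (src[1] / src[2]).reshape(height, width).astype(np.float32)
    return map_x, map_y


def apply_remap(frame, map_x, map_y):
    """Sample a single-channel frame at the precomputed coordinates."""
    # order=1 is bilinear interpolation; pixels outside the frame become 0.
    return ndimage.map_coordinates(frame, [map_y, map_x], order=1,
                                   mode="constant", cval=0)
```

The two tables are computed once per camera and can be stored with np.savez. cv2.remap accepts float32 map_x / map_y arrays of this kind, so the per-frame work reduces to a single lookup-and-interpolate call.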
Note:
The second image I posted showing the "expected output" comes directly from the camera in question. It can be configured to return the first image at 30 fps, or the second image at 10 fps. We wish to perform the stitching off-camera on a more powerful computer so we can get 30 fps but still have the single image.
AXIS provides an SDK for doing the stitching off-camera, but this SDK is Windows-only and most of our tech stack is Linux and most of our development machines are Mac OS. I have used a Windows computer to try and look into the stitching SDK they provide, however I had no luck getting it to compile and run. Their sample code kept throwing errors and I've never had any luck getting Visual Studio or C++ to play nicely for me.
My suggestion is to train an autoencoder. Use the first image as input and the second one as an output, as in a denoising autoencoder:
Note that you may lose resolution if you make the bottleneck in the middle layer too small.
Variational autoencoders also have a latent vector, but they work on the same principle.
You can adapt this code:
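# Written for the Keras 1.x API (border_mode, init, nb_epoch, BatchNormalization(mode=...)).
# In Keras 2+ these became padding, kernel_initializer, epochs and BatchNormalization().
# input_shape, learning_rate, momentum, decay_rate, x_train_noisy and x_train
# must be defined beforehand; the final Reshape assumes 28x28x1 images.
from keras.models import Sequential
from keras.layers import (Convolution2D, BatchNormalization, Activation,
                          UpSampling2D, MaxPooling2D, Reshape)
from keras.optimizers import SGD

denoise = Sequential()
# First convolution block: 28x28 -> 26x26 spatially, 20 feature maps
denoise.add(Convolution2D(20, 3, 3,
                          border_mode='valid',
                          input_shape=input_shape))
denoise.add(BatchNormalization(mode=2))
denoise.add(Activation('relu'))
# Upsample by 2 (26x26 -> 52x52), then two more convolutions (-> 50x50 -> 48x48)
denoise.add(UpSampling2D(size=(2, 2)))
denoise.add(Convolution2D(20, 3, 3,
                          init='glorot_uniform'))
denoise.add(BatchNormalization(mode=2))
denoise.add(Activation('relu'))
denoise.add(Convolution2D(20, 3, 3, init='glorot_uniform'))
denoise.add(BatchNormalization(mode=2))
denoise.add(Activation('relu'))
# Pool by 3 (48x48 -> 16x16), then a last convolution down to 14x14 with 4 maps
denoise.add(MaxPooling2D(pool_size=(3, 3)))
denoise.add(Convolution2D(4, 3, 3, init='glorot_uniform'))
denoise.add(BatchNormalization(mode=2))
denoise.add(Activation('relu'))
# 14 * 14 * 4 = 784 values, reshaped back into a 28x28x1 image
denoise.add(Reshape((28, 28, 1)))

sgd = SGD(lr=learning_rate, momentum=momentum, decay=decay_rate, nesterov=False)
denoise.compile(loss='mean_squared_error', optimizer=sgd, metrics=['accuracy'])
denoise.summary()

# Train to map the noisy inputs onto the clean targets
denoise.fit(x_train_noisy, x_train,
            nb_epoch=50,
            batch_size=30, verbose=1)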
When humans see markers suggesting the form of a shape, they immediately perceive the shape itself, as in https://en.wikipedia.org/wiki/Illusory_contours. I'm trying to accomplish something similar in OpenCV in order to detect the shape of a hand in a depth image with very heavy noise. For this question, assume that skin-color-based detection is not an option. It is actually the best I've achieved so far, but it is not robust under changing light conditions, shadows or skin colors. Also, various paper shapes (flat and colorful) are on the table, confusing color-based approaches. This is why I'm attempting to use the depth camera instead.
Here's a sample image of the live footage that is already pre-processed for better contrast and with background gradient removed:
I want to isolate the exact shape of the hand from the rest of the picture. For a human eye this is a trivial thing to do. So here are a few attempts I did:
Here's the result with canny edge detection applied. The problem here is that the black shape inside the hand is larger than the actual hand, causing the detected hand to overshoot in size. Also, the lines are not connected and I fail at detecting contours.
Update: Combining Canny and a morphological closing (4x4 px ellipse) makes contour detection possible with the following result. It is still waaay too noisy.
Update 2: The result can be slightly enhanced by drawing that contour to an empty mask, saving that in a buffer and re-detecting yet another contour on a merge of three buffered images. The line that combines the buffered images is hand_img = np.array(np.minimum(255, np.multiply.reduce(self.buf)), np.uint8), which is then morphed once again (closing) and finally contour detected. The results are slightly less horrible than in the picture above, but laggy instead.
Alternatively, I tried to use an existing CNN (https://github.com/victordibia/handtracking) to detect the approximate position of the hand's center (this step works) and then flood fill from there. To detect contours, the result is put through an Otsu threshold and then the largest contour is taken, resulting in the following picture (ignore the black rectangles on the left). The problem is that some of the noise is flooded as well and the results are mediocre:
Finally, I tried background removers such as MOG2 or GMG. They are confused by the enormous amount of fast-moving noise. They also cut off the fingertips (which are crucial for this project). And they don't see enough detail in the hand (8 bit plus further color reduction via equalizeHist yields a very poor grayscale resolution) to reliably detect small movements.
It's ridiculous how simple it is for a human to see the exact precise shape of the hand in the first picture and how incredibly hard it is for the computer to draw a shape.
What would be your recommended method to achieve an exact hand segmentation?
After two days of desperate testing, the solution was to VERY carefully apply thresholding to a well-preprocessed image.
Here are the steps:
Remove as much noise as you possibly can. In my case, denoising was done using Intel's pyrealsense2 (I'm using an Intel RealSense depth camera and the algorithms were written for that camera family, thus they work very well). I used rs.temporal_filter() and directly after rs.hole_filling_filter() on every frame.
Capture the very first frame. Besides capturing the exact distance to the table (for later thresholding), this step also saves a still picture that is blurred by a 100x100 px kernel. Since the camera is never mounted perfectly but slightly tilted, there's an ugly grayscale gradient going over the picture and making operations impossible. This still picture is then subtracted from every single later frame, eliminating the gradient. BTW: this gradient removal step is already incorporated in the screenshots shown in the question above
Now the picture is almost noise-free. Do not use equalizeHist. It does not simply increase the overall contrast uniformly but instead emphasizes the remaining noise far too much. This was the main mistake I made in almost all experiments. Instead, apply a threshold (binary, with a fixed border) directly. The margin is extremely thin: setting the border at 104 instead of 205 makes a huge difference.
Invert colors (unless you have taken BINARY_INV in the previous step), apply contours, take the largest one and write it to a mask
Voilà!
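The gradient removal, thresholding and largest-region steps above can be sketched roughly as follows. This is a simplified illustration, not my production code: it uses SciPy instead of OpenCV, a box blur (scipy.ndimage.uniform_filter) for the 100x100 px blur, connected-component labelling as a stand-in for contour detection, and it assumes the hand shows up brighter than the empty table after subtraction (flip the comparison otherwise). The RealSense filtering of the first step is not included.

```python
import numpy as np
from scipy import ndimage


def make_background(first_frame, blur_size=100):
    """Heavily blurred copy of the first frame, used to model the gradient."""
    return ndimage.uniform_filter(first_frame.astype(np.float64),
                                  size=blur_size)


def segment_largest(frame, background, threshold):
    """Subtract the gradient, apply a fixed threshold, keep the largest blob."""
    flattened = frame.astype(np.float64) - background
    binary = flattened > threshold  # assumes the object is brighter
    labels, count = ndimage.label(binary)
    if count == 0:
        return np.zeros(frame.shape, dtype=bool)
    sizes = np.bincount(labels.ravel())
    sizes[0] = 0  # label 0 is the background, never pick it
    return labels == np.argmax(sizes)
```

The threshold argument plays the role of the fixed border from the steps above and has to be tuned just as carefully.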
I have software that generates several images like the following four images:
Does an algorithm exist that detects the (horizontal & vertical) edges and creates a binary output like this?
If possible I'd like to implement this with numpy and scipy. I already tried to implement an algorithm, but I failed because I couldn't find a place to start. I also tried to use a neural network for this, but that seems to be overkill and does not work perfectly.
The simplest thing to try is to:
Convert your images to binary images (with a simple threshold)
Apply the Hough transform (OpenCV and MATLAB both implement it)
In the Hough transform results, detect the peaks at angles of 0 degrees and +/- 90 degrees (vertical and horizontal lines)
In OpenCV and MATLAB, the Hough transform has extra options that let you fill the gaps between two disconnected segments belonging to the same straight line. You may need a few extra post-processing operations, but these should be the main steps.
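Since you asked for numpy, here is a sketch of those steps restricted to the two angles of interest. In the Hough parametrisation rho = x*cos(theta) + y*sin(theta), theta = 0 gives rho = x and theta = 90 degrees gives rho = y, so the accumulator reduces to counting set pixels per column and per row. The function name and the min_fraction parameter are made up, and the sketch assumes each line spans most of the image; shorter segments would need a proper segment-based variant.

```python
import numpy as np


def hough_axis_aligned(binary, min_fraction=0.8):
    """Detect near-full-length horizontal and vertical lines in a binary image.

    Returns (rows, cols, out): indices of the detected horizontal and
    vertical lines, and a boolean image with those lines drawn in full.
    """
    binary = np.asarray(binary, dtype=bool)
    height, width = binary.shape
    ys, xs = np.nonzero(binary)
    # theta = 0: rho = x, so each set pixel votes for its column.
    votes_vertical = np.bincount(xs, minlength=width)
    # theta = 90 degrees: rho = y, so each set pixel votes for its row.
    votes_horizontal = np.bincount(ys, minlength=height)
    # Peaks: lines supported by at least min_fraction of their length.
    cols = np.nonzero(votes_vertical >= min_fraction * height)[0]
    rows = np.nonzero(votes_horizontal >= min_fraction * width)[0]
    out = np.zeros_like(binary)
    out[rows, :] = True
    out[:, cols] = True
    return rows, cols, out
```

Because the detected lines are redrawn at full length, small gaps in a line are filled as a side effect, and isolated noise pixels never collect enough votes to appear in the output.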