I have a 32x32 image. I resized it to 512x512 in Python, but the quality and appearance of the image are not the same as when I resize it in Paint.
[original image] [resized with Paint] [resized with Python]
What do I need to add to get the same result as Paint?
from PIL import Image
im=Image.open('1.png')
im=im.resize((512,512))
im.save('resized.png')
Use:
im = im.resize((512,512), resample=Image.NEAREST)
for that effect.
It is not really a "loss of quality" that you are seeing - it is actually a difference in the interpolation method. When upsizing an image, the algorithm effectively has to "invent" new pixels to fill the bitmap raster. Some algorithms interpolate between known surrounding values, while others just take the closest value - also known as "Nearest Neighbour". There are both advantages and disadvantages: "Nearest Neighbour" will be faster and will not introduce new "in-between" colours into your resized image. On the downside, it will look more "blocky" and less smooth.
It requires some thought and experience to choose the appropriate method.
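To see the difference side by side, here is a minimal sketch (the tiny test image and variable names are mine, not from the question; it assumes Pillow is installed):

```python
from PIL import Image

# Tiny stand-in for the asker's 32x32 '1.png'
im = Image.new("RGB", (32, 32), (255, 0, 0))
im.putpixel((0, 0), (0, 0, 255))  # one distinct pixel to track

# Nearest neighbour: each source pixel becomes a crisp 16x16 block (Paint-like)
blocky = im.resize((512, 512), resample=Image.NEAREST)

# Bicubic: interpolates between pixels, giving a smoother but softer result
smooth = im.resize((512, 512), resample=Image.BICUBIC)

print(blocky.getpixel((5, 5)))  # (0, 0, 255) - the tracked pixel's block is exact
```

With NEAREST, the blue pixel expands to an unmixed 16x16 block; with BICUBIC, its edges blend into the surrounding red.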
Related
I'm using OpenCV with Python to process images for AI training. I need to scale the images down to 32×32 pixels, but with cv2.resize() the images come out too noisy. It looks like this function takes the value of a single pixel from each region of the image, but I need an average value of each region so that the images are less noisy. Is there an alternative to cv2.resize()? I could just write my own function but I don't think it would be very fast.
As you can see in the cv2.resize documentation, the last parameter interpolation determines the way the image is resampled.
See also the possible values for it.
The default is cv2.INTER_LINEAR meaning a linear interpolation. It can create a blurred/noisy effect when the image is downsampled.
You can try to use other interpolation methods to see if the result is better suited for your needs.
Specifically, I recommend you try the cv2.INTER_NEAREST option. It determines the destination pixel value based on the colour of the nearest pixel in the source. The downsampled image will be pixelated, but not blurred.
Another option is cv2.INTER_AREA, as mentioned in @fmw42's comment.
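For integer shrink factors, cv2.INTER_AREA amounts to averaging each source region, which is exactly the "average value of each region" the question asks for. A rough NumPy sketch of that averaging (the function name and toy noise pattern are my own, not part of OpenCV):

```python
import numpy as np

def area_downscale(img, factor):
    """Average non-overlapping factor x factor blocks - what
    cv2.INTER_AREA effectively does for integer shrink factors."""
    h, w = img.shape[:2]
    assert h % factor == 0 and w % factor == 0
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

noisy = np.zeros((64, 64))
noisy[::2, ::2] = 4.0                 # salt-like noise pattern
small = area_downscale(noisy, 2)      # 32x32; every 2x2 block averages to 1.0
print(small.shape, small[0, 0])       # (32, 32) 1.0
```

Because every output pixel is a block mean, isolated noise spikes are smoothed out instead of being sampled verbatim, which is why INTER_AREA tends to look cleaner than INTER_NEAREST when downscaling.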
In this project, you will implement the image super-resolution problem. Specifically,
you will start from a digital image of size M*N pixels, and then you will enlarge the
image to (3M) * (3N) pixels. While the pixels in the original image should keep their
original intensities, the intensities of new pixels are interpolated by using a local radial
basis function in a user-chosen neighborhood of each new pixel.
This is the image I want to enlarge.
The image is 256 x 256. I want to use Colab and I found a function pysteps.utils.interpolate.rbfinterp2d and here is the documentation for this function:
https://pysteps.readthedocs.io/en/latest/generated/pysteps.utils.interpolate.rbfinterp2d.html.
I am very new to computer programming and I am wondering how I actually do this. I can do individual steps, so I am more or less looking for a (detailed, if possible) outline of the steps to accomplish the task. At the end of the project I want to display the original image and then the resulting image after up-scaling it.
Any help would be much appreciated. Thanks in advance!
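While pysteps.utils.interpolate.rbfinterp2d is one option (its exact call signature is in the linked documentation), the same idea can be sketched with SciPy's RBFInterpolator, which also supports a local, k-nearest-neighbour fit. Below is a toy example on a 4x4 image; the kernel, neighbourhood size, and all names are illustrative choices of mine, not from the project description, and the same recipe scales to 256x256 → 768x768:

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

# Toy 4x4 "image" standing in for the 256x256 one
img = np.arange(16, dtype=float).reshape(4, 4)
M, N = img.shape

# Known pixel centres and their intensities
yy, xx = np.mgrid[0:M, 0:N]
points = np.column_stack([yy.ravel(), xx.ravel()])
values = img.ravel()

# Local RBF fit: 'neighbors' restricts each evaluation to nearby pixels,
# matching the "user-chosen neighborhood" in the project statement
rbf = RBFInterpolator(points, values, neighbors=9, kernel="thin_plate_spline")

# A 3x denser grid of target coordinates covering the same extent
ty, tx = np.mgrid[0:M - 1:(3 * M) * 1j, 0:N - 1:(3 * N) * 1j]
big = rbf(np.column_stack([ty.ravel(), tx.ravel()])).reshape(3 * M, 3 * N)
print(big.shape)  # (12, 12)
```

Because the smoothing is zero, the interpolant passes through the original samples exactly, so the original pixels keep their intensities as the project requires. For display, matplotlib's plt.imshow on img and big side by side works fine.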
I know the basic flow of the image registration/alignment process, but what happens at the pixel level when two images are registered? The pixels of the moving image that match the fixed image after transformation are kept intact, but what happens to the pixels that are not matched - are they averaged, or something else?
And how is the correct transformation estimated, i.e. how will I know whether to apply translation, scaling, rotation, etc., and by how much (what angle of rotation, what translation values, etc.)?
Also, in the initial step, how are similar pixel values identified and matched?
I've implemented the python code given in https://simpleitk.readthedocs.io/en/master/Examples/ImageRegistrationMethod1/Documentation.html
Input images are of prostate MRI scans:
[fixed image] [moving image] [output image] [console output]
The difference can be seen in the output image at the top right and top left. But I can't interpret the console output or how things actually work internally.
It'll be very helpful if I get a deep explanation of this thing. Thank you.
A transformation is applied to all pixels. You might be confusing rigid transformations, which will only translate, rotate and scale your moving image to match the fixed image, with elastic transformations, which will also allow some morphing of the moving image.
Any pixel that a transformation cannot place in the fixed image is interpolated from the pixels that it is able to place, though a registration is not really intelligent.
What it attempts to do is simply minimise a cost function, where a high cost is associated with a large difference between the images and a low cost with a small difference. Cost functions can be intensity based (pixel values) or feature based (shapes). The optimiser (semi-)randomly shifts the image around until a preset stopping criterion is met, generally a maximum number of iterations.
What that might look like can be seen in the following gif:
http://insightsoftwareconsortium.github.io/SimpleITK-Notebooks/registration_visualization.gif
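The cost-minimisation idea can be sketched without any registration library. Below is a toy intensity-based registration that brute-forces integer translations and keeps the one with the lowest sum-of-squared-differences cost; real optimisers (like the gradient descent in the SimpleITK example) search much more cleverly, and the images and names here are made up for illustration:

```python
import numpy as np

# Toy fixed image with a bright square, and a moving image shifted by (2, 3)
fixed = np.zeros((20, 20))
fixed[5:10, 5:10] = 1.0
moving = np.zeros((20, 20))
moving[7:12, 8:13] = 1.0

def ssd(a, b):
    """Intensity-based cost: sum of squared differences."""
    return float(((a - b) ** 2).sum())

# Exhaustive search over integer translations; the best transform is simply
# the one that minimises the cost, exactly as described above
best = min(
    ((dy, dx) for dy in range(-5, 6) for dx in range(-5, 6)),
    key=lambda t: ssd(fixed, np.roll(moving, shift=t, axis=(0, 1))),
)
print(best)  # (-2, -3): undoes the shift, driving the cost to zero
```

The console output of the SimpleITK example is the same story: each line reports the iteration number and the current metric (cost) value as the optimiser walks downhill.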
When humans see markers suggesting the form of a shape, they immediately perceive the shape itself, as in https://en.wikipedia.org/wiki/Illusory_contours. I'm trying to accomplish something similar in OpenCV in order to detect the shape of a hand in a depth image with very heavy noise. In this question, assume that skin color based detection is not working (it is actually the best result I've achieved so far, but it is not robust under changing light conditions, shadows or skin colors; also, various flat, colorful paper shapes are on the table, confusing color-based approaches - this is why I'm attempting to use the depth cam instead).
Here's a sample image of the live footage that is already pre-processed for better contrast and with background gradient removed:
I want to isolate the exact shape of the hand from the rest of the picture. For a human eye this is a trivial thing to do. So here are a few attempts I did:
Here's the result with canny edge detection applied. The problem here is that the black shape inside the hand is larger than the actual hand, causing the detected hand to overshoot in size. Also, the lines are not connected and I fail at detecting contours.
Update: Combining Canny and a morphological closing (4x4 px ellipse) makes contour detection possible with the following result. It is still waaay too noisy.
Update 2: The result can be slightly enhanced by drawing that contour onto an empty mask, saving that in a buffer and re-detecting yet another contour on a merge of three buffered images. The line that combines the buffered images is hand_img = np.array(np.minimum(255, np.multiply.reduce(self.buf)), np.uint8), which is then morphed once again (closing) and finally contour-detected. The results are slightly less horrible than in the picture above, but laggy instead.
Alternatively I tried to use an existing CNN (https://github.com/victordibia/handtracking) for detecting the approximate position of the hand's center (this step works) and then flood from there. In order to detect contours the result is put into an OTSU filter and then the largest contour is taken, resulting in the following picture (ignore black rectangles in the left). The problem is that some of the noise is flooded as well and the results are mediocre:
Finally, I tried background removers such as MOG2 or GMG. They are confused by the enormous amount of fast-moving noise. Also they cut off the fingertips (which are crucial for this project). Finally, they don't see enough details in the hand (8 bit plus further color reduction via equalizeHist yield a very poor grayscale resolution) to reliably detect small movements.
It's ridiculous how simple it is for a human to see the exact precise shape of the hand in the first picture and how incredibly hard it is for the computer to draw a shape.
What would be your recommended method to achieve an exact hand segmentation?
After two days of desperate testing, the solution was to VERY carefully apply thresholding to a well-preprocessed image.
Here are the steps:
Remove as much noise as you possibly can. In my case, denoising was done using Intel's pyrealsense2 (I'm using an Intel RealSense depth camera and the algorithms were written for that camera family, thus they work very well). I used rs.temporal_filter() and directly after rs.hole_filling_filter() on every frame.
Capture the very first frame. Besides capturing the exact distance to the table (for later thresholding), this step also saves a still picture that is blurred by a 100x100 px kernel. Since the camera is never mounted perfectly but always slightly tilted, there is an ugly grayscale gradient across the picture that makes many operations impossible. This still picture is then subtracted from every later frame, eliminating the gradient. (BTW: this gradient-removal step is already incorporated in the screenshots shown in the question above.)
Now the picture is almost noise-free. Do not use equalizeHist. It does not simply increase the overall contrast evenly but instead emphasizes the remaining noise far too much. This was the main error I made in almost all my experiments. Instead, apply a threshold (binary with a fixed cut-off) directly. The margin is extremely thin; setting the cut-off at 104 instead of 105 makes a huge difference.
Invert the colors (unless you used BINARY_INV in the previous step), find contours, take the largest one and write it to a mask.
Voilà!
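The core of the steps above can be sketched with synthetic frames. All numbers and names below are illustrative stand-ins of mine; the real pipeline used pyrealsense2 filters for denoising and cv2.threshold / cv2.findContours for the last steps:

```python
import numpy as np

# Toy frames: a tilted-camera gradient plus a darker "hand" blob in later frames
gradient = np.tile(np.linspace(0, 40, 64), (64, 1))   # background gradient
first_frame = gradient.copy()                         # captured once at startup
frame = gradient.copy()
frame[20:40, 20:40] -= 30                             # the object is darker

# Step 2: subtract the still first frame to kill the gradient
flat = frame - first_frame

# Step 3: fixed binary threshold directly (no equalizeHist!), in the spirit of
# cv2.threshold(img, thresh, 255, cv2.THRESH_BINARY_INV)
mask = (flat < -10).astype(np.uint8) * 255

print(int(mask.sum() // 255))  # 400: exactly the 20x20 object pixels, no noise
```

On the synthetic frame the fixed threshold recovers the object region exactly because the gradient has already been subtracted away; on real footage this is where the thin 104-vs-105 margin comes in.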
I am trying to paste an object, with a completely tight known mask, onto an image, so it should be easy - but without some post-processing I get artefacts at the border. I want to use the Poisson Blending technique to reduce the artefacts. It is implemented in OpenCV as seamlessClone.
import cv2
import matplotlib.pyplot as plt
#user provided tight mask array tight_mask of dtype uint8 with only white pixel the ones on the object the others are black (50x50x3)
tight_mask
#object obj to paste a 50x50x3 uint8 in color
obj
#User provided image im which is large 512x512 of a mostly uniform background in colors
im
#position at which to paste the object into im (not defined in the original snippet)
center=(256,256)
#two different modes of poisson blending, which give approximately the same result
normal_clone=cv2.seamlessClone(obj, im, tight_mask, center, cv2.NORMAL_CLONE)
mixed_clone=cv2.seamlessClone(obj, im, tight_mask, center, cv2.MIXED_CLONE)
plt.imshow(normal_clone,interpolation="none")
plt.imshow(mixed_clone, interpolation="none")
However, with the code above, I only get images where the pasted objects are very, very transparent. They are obviously well blended, but they are so blended that they fade away like ghosts of objects.
I was wondering if I am the only one to have such issues and, if not, what the alternatives are in terms of Poisson blending.
Do I have to reimplement it from scratch to modify the blending factor (is that even possible?), or is there another way? Do I have to use dilation on the mask to lessen the blending? Can I enhance the contrast somehow afterwards?
Poisson blending uses the gradient information of the pasted image to blend it into the target image.
It turns out that if the mask is completely tight, the gradient at the border is artificially interpreted as null.
That is why it ignores the border completely and produces ghosts.
The solution is therefore to use a larger mask, obtained by dilating the original mask with morphological operations, so that it includes some background.
Care must be taken when choosing the color of the included background: if the contrast is too big, the gradient will be too strong and the image will not be well blended.
Using a color like gray is a good starting point.
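A sketch of that mask dilation (using scipy.ndimage here to keep the example self-contained; cv2.dilate with a small elliptical kernel is the usual OpenCV route, and all sizes below are illustrative):

```python
import numpy as np
from scipy.ndimage import binary_dilation

# Toy tight mask: white exactly on a 10x10 object (the real one is 50x50)
tight_mask = np.zeros((50, 50), dtype=np.uint8)
tight_mask[20:30, 20:30] = 255

# Grow the mask by a few pixels so it takes in a ring of background;
# the ring then carries real border gradients for seamlessClone to use
dilated = binary_dilation(tight_mask > 0, iterations=3)
loose_mask = dilated.astype(np.uint8) * 255
```

Passing loose_mask instead of tight_mask to cv2.seamlessClone gives the solver a non-null gradient at the border, which is what stops the pasted object from fading into a ghost.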