Keras: How does preprocessing scale images? - python

For loading image data from disk, Keras provides the tf.keras.preprocessing.image_dataset_from_directory() method, which is documented at https://keras.io/api/preprocessing/image/.
I would like to use this method to load data for building an image classifier.
https://machinelearningmastery.com/how-to-load-large-datasets-from-directories-for-deep-learning-with-keras/ contains some information on how to best organize the data.
image_dataset_from_directory() takes a mandatory argument image_size=(..., ...) to resize all images to the same size (which is required for the further steps).
Where can I find details about how the pictures are scaled?
Would pictures with extreme aspect ratios become distorted and negatively impact the classifier?

It can use any of the tf.image.ResizeMethod options, selected via the interpolation argument (bilinear by default). You can find the details and the source code in the TensorFlow documentation.
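For illustration, here is a minimal sketch of how the target size and interpolation method can be passed explicitly; the directory path and image size below are made up for the example:

import tensorflow as tf

# Assumes a folder layout with one sub-directory per class,
# e.g. data/train/class_a/*.jpg, data/train/class_b/*.jpg (hypothetical path).
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    "data/train",
    image_size=(224, 224),     # every image is resized to 224x224
    interpolation="bilinear",  # the default; other tf.image.ResizeMethod names are accepted too
    batch_size=32,
)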

I was not able to find the specifics of how the resize is done. However, it probably uses one of the interpolation methods available in cv2, which are listed below, but I do not know which one.
[optional] flag that takes one of the following methods:
INTER_NEAREST – a nearest-neighbor interpolation
INTER_LINEAR – a bilinear interpolation (used by default)
INTER_AREA – resampling using pixel area relation. It may be a preferred method for image decimation, as it gives moiré-free results. But when the image is zoomed, it is similar to the INTER_NEAREST method.
INTER_CUBIC – a bicubic interpolation over a 4×4 pixel neighborhood
INTER_LANCZOS4 – a Lanczos interpolation over an 8×8 pixel neighborhood
Of course, when resized to a different aspect ratio, your images will become distorted. I do a lot of image classification and have not found this to be a problem. Besides that, there appears to be no other choice but to have all the images be the same size.

Related

Is there an alternative to cv2.resize() in OpenCV for downscaling images?

I'm using OpenCV with Python to process images for AI training. I need to scale the images down to 32×32 pixels, but with cv2.resize() the images come out too noisy. It looks like this function takes the value of a single pixel from each region of the image, but I need an average value of each region so that the images are less noisy. Is there an alternative to cv2.resize()? I could just write my own function but I don't think it would be very fast.
As you can see in the cv2.resize documentation, the last parameter interpolation determines the way the image is resampled.
See also the possible values for it.
The default is cv2.INTER_LINEAR, meaning bilinear interpolation. It can create a blurred/noisy effect when the image is downsampled.
You can try to use other interpolation methods to see if the result is better suited for your needs.
Specifically, I recommend you try the cv2.INTER_NEAREST option. It determines the destination pixel value from the colour of the nearest pixel in the source. The downsampled image should be pixelated, but not blurred.
Another option is cv2.INTER_AREA, as mentioned in fmw42's comment.
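For illustration, a short sketch comparing the interpolation flags when shrinking to 32×32; "photo.png" is a placeholder for an actual training image:

import cv2

img = cv2.imread("photo.png")

small_linear  = cv2.resize(img, (32, 32))                                    # INTER_LINEAR (default)
small_nearest = cv2.resize(img, (32, 32), interpolation=cv2.INTER_NEAREST)   # pixelated, not blurred
small_area    = cv2.resize(img, (32, 32), interpolation=cv2.INTER_AREA)      # pixel-area averaging, good for shrinking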

Is it possible to turn a low quality image into a high quality one with Python?

I made a tif image based on a 3D model of a wood sheet. (x, y, z) represents a point in 3D space. I simply map (x, y) to a pixel position in the image and z to the greyscale value of that pixel. It worked as I had imagined. Then I ran into a low-resolution problem when I tried to print it: the tif image gets badly pixelated as soon as it is zoomed out. My research suggests that I need to increase the resolution of the image, so I tried a few super-resolution algorithms found in online sources, including this one: https://learnopencv.com/super-resolution-in-opencv/
The final image did get a lot bigger in resolution (10+ times larger in either dimension), but the same problem persists: it gets pixelated as soon as it is zoomed out, just about the same as the original image.
It looks like the quality of an image depends not only on its resolution but also on something else. By quality I mean how clear the wood texture is in the image, and how sharp/clear the texture remains when I enlarge it. Can anyone shed some light on this? Thank you.
original tif
The algo-generated tif is too large to be included here (32 MB)
Gigapixel enhanced tif
Update: here is a recently achieved result with a GAN-based solution
It has restored/invented some of the wood grain details. But the models need to be retrained.
In short, it is possible to do this via deep learning reconstruction like the Super Resolution package you referred to, but you should understand what something like this is trying to do and whether it is fit for purpose.
Generic algorithms like the Super Resolution package are trained on a variety of images to "guess" at details that are not present in the original image, typically using generative training methods, such as using low- vs. high-resolution versions of the same image as training data.
Using a contrived example, let's say you are trying to up-res a picture of someone's face (CSI Zoom-and-Enhance style!). From the algorithm's perspective, if a black circle is always present inside a white blob of a certain shape (i.e. a pupil in an eye), then the next time the algorithm sees the same shape it will guess that there should be a black circle and fill in a black pupil. However, this does not mean that there are details in the original photo that suggest a black pupil.
In your case, you are trying to do a very specific type of up-resing, and algorithms trained on generic data will probably not be good for this type of work. They will be trying to "guess" what detail should be filled in, but based on a very generic and diverse set of source data.
If this is a long-term project, you should look to train your algorithm on your specific use case, which will definitely yield much better results. Otherwise, simple algorithms like smoothing will help make your image less "blocky", but they will not be able to "guess" details that aren't present.
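For reference, a minimal sketch of how the linked OpenCV super-resolution approach is typically invoked; it assumes opencv-contrib-python is installed and that a pre-trained EDSR model file has been downloaded separately (the file names below are placeholders):

import cv2

sr = cv2.dnn_superres.DnnSuperResImpl_create()
sr.readModel("EDSR_x4.pb")   # pre-trained model downloaded separately (hypothetical path)
sr.setModel("edsr", 4)       # algorithm name and upscaling factor must match the model file

img = cv2.imread("wood_sheet.tif")   # placeholder for the low-resolution source image
upscaled = sr.upsample(img)
cv2.imwrite("wood_sheet_x4.tif", upscaled)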

Getting started with denoising elements of a 200x200 numpy array

I have a 200x200 numpy array that contains a shape, which I can see when I graph it using matplotlib's imshow() function. However, there is also a lot of noise added to that picture. I am trying to use OpenCV to emphasize the shape and denoise the image, but it keeps throwing error messages that I don't understand. What should I do to get started on the denoising problem? The shape is visible to me, but extra noise was added on top of the image using np.random.randint(), and I want to reduce that noise.
Here are some tutorials about the image denoising techniques available in OpenCV.
Blurring out the noise
The most basic is applying a blur to average out the random noise. This has the negative effect that the edges in the image will not be as sharp as originally. Depending on your application, this might be fine. Depending on the amount of noise, you can change the size of the filter k. A larger value will produce a blurrier image with less noise.
import cv2 as cv  # the snippets below assume OpenCV imported as cv
k = 5  # filter size: larger k gives a blurrier image with less noise
filtered_image = cv.blur(img, (k, k))
Advanced denoising
Alternatively, you can use more advanced techniques such as Non-local Means Denoising. This applies averaging across similar patches in the image. This technique has a few more parameters to tune to your specific application which you can read about here. (There are different versions of this function for greyscale and colour images, as well as for image sequences).
luminosity_filter_strength = 10
colour_filter_strength = 10
template_window_size = 7
search_window_size = 21
# Note: the second argument is the (optional) output array; pass None here.
filtered_image = cv.fastNlMeansDenoisingColored(img, None,
                                                luminosity_filter_strength,
                                                colour_filter_strength,
                                                template_window_size,
                                                search_window_size)
I solved the problem using scikit-image. They have a very accessible documentation page for newcomers, and the error messages are a lot easier to understand. For my problem I had to use scikit-image's restoration module, which has a lot of denoising functions, much like OpenCV, but the examples and the easy-to-understand error messages really helped. Playing around with bilateral filters and Non-local Means Denoising solved the problem for me.
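For completeness, a small sketch of the scikit-image calls mentioned above, assuming a noisy greyscale array like the one described in the question (the array here is synthetic):

import numpy as np
from skimage.restoration import denoise_bilateral, denoise_nl_means, estimate_sigma

noisy = np.random.rand(200, 200)   # placeholder for the actual noisy 200x200 array

# Bilateral filter: smooths noise while preserving edges
bilateral = denoise_bilateral(noisy, sigma_color=0.1, sigma_spatial=3)

# Non-local means: averages similar patches across the whole image
sigma = np.mean(estimate_sigma(noisy))
nl_means = denoise_nl_means(noisy, h=1.15 * sigma, patch_size=7, patch_distance=11)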

How can I properly resize my images for further processing?

I want to resize my images (to a smaller size):
How can I resize my images properly, without the bad pixel artefacts, for further CNN processing afterwards?
Your problems are due to interpolation artifacts. As you can check in the documentation for cv2.resize, INTER_LINEAR (bilinear) is used by default. You should probably go with the documentation's suggestion and try the INTER_AREA version. You may also want to check the other options and see which one suits you best.
You need to look at vector images (see the Wikipedia article) if you really want a clear picture at a small size. The OpenCV library doesn't provide a function for converting bitmap images to vector images.

How to align multiple camera images using opencv

Imagine someone taking a burst shot with a camera: they will have multiple images, but since no tripod or stand was used, the images taken will be slightly different from each other.
How can I align them so that they overlay neatly, and crop out the edges?
I have searched a lot, but most of the solutions were either doing a 3D reconstruction or using MATLAB.
e.g. https://github.com/royshil/SfM-Toy-Library
Since I'm very new to OpenCV, I would prefer an easy-to-implement solution.
I have generated many datasets by manually rotating and cropping images in MS Paint, but any link to corresponding datasets (slightly rotated and translated images) would also be helpful.
EDIT: I found a solution here
http://www.codeproject.com/Articles/24809/Image-Alignment-Algorithms
which gives close approximations to rotation and translation vectors.
How can I do better than this?
It depends on what you mean by "better" (accuracy, speed, low memory requirements, etc.). One classic approach is to align each frame #i (with i > 1) with the first frame, as follows (a minimal sketch of these steps follows the list):
Local feature detection, for instance via SIFT or SURF (link)
Descriptor extraction (link)
Descriptor matching (link)
Alignment estimation via perspective transformation (link)
Transform image #i to match image 1 using the estimated transformation (link)
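As a minimal sketch of that pipeline, here is one way to do it with ORB features (used instead of SIFT/SURF because ORB ships with the base OpenCV package); the file names are placeholders for two frames of the burst:

import cv2
import numpy as np

img1 = cv2.imread("frame1.jpg", cv2.IMREAD_GRAYSCALE)   # reference frame
img2 = cv2.imread("frame2.jpg", cv2.IMREAD_GRAYSCALE)   # frame to align

# Steps 1-2: detect local features and extract descriptors
orb = cv2.ORB_create(5000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# Step 3: match descriptors and keep the best matches
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des2, des1), key=lambda m: m.distance)[:200]

# Step 4: estimate a perspective transform (homography) with RANSAC
src = np.float32([kp2[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
dst = np.float32([kp1[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
H, mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)

# Step 5: warp frame 2 into frame 1's coordinate system, then crop as needed
aligned = cv2.warpPerspective(img2, H, (img1.shape[1], img1.shape[0]))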
