I have a list of numpy arrays which are input images to my CNN. However, the size of my images is not consistent, and my CNN accepts only images of dimension 224x224. How do I reshape each of my images to that dimension?
print(train_images[key].reshape(224, 224, 3))
gives me the error
ValueError: total size of new array must be unchanged
I would be very grateful if anybody could help me with this.
When you reshape, the new array must have the same number of values as the original. What you actually need is to crop the picture (if it is bigger than 224x224), pad it (if it is smaller than 224x224), or resize it in both cases.
Cropping is simply slicing with the correct indices:
def crop(np_img, size):
    # Center-crop: compute the top-left corner of the crop window.
    v_start = round((np_img.shape[0] - size[0]) / 2)
    h_start = round((np_img.shape[1] - size[1]) / 2)
    return np_img[v_start:v_start + size[0], h_start:h_start + size[1], :]
Padding is slightly more complex: this creates an array of zeros in the desired shape and plugs the values of the image into it:
def pad_image(np_img, size):
    # Center the image inside a zero-filled canvas of the target size.
    v_start = round((size[0] - np_img.shape[0]) / 2)
    h_start = round((size[1] - np_img.shape[1]) / 2)
    result = np.zeros((size[0], size[1], np_img.shape[2]), dtype=np_img.dtype)
    result[v_start:v_start + np_img.shape[0], h_start:h_start + np_img.shape[1], :] = np_img
    return result
You can also use the np.pad function for it:
def pad_image(np_img, size):
    v_dif = size[0] - np_img.shape[0]
    h_dif = size[1] - np_img.shape[1]
    # Pad only the top and left edges with zeros.
    return np.pad(np_img, ((v_dif, 0), (h_dif, 0), (0, 0)), 'constant', constant_values=0)
You may notice the padding differs between the two functions: to keep things simple, the second one pads only the top and left, while the first one pads both sides, which was easier to calculate there.
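If you want the even-on-both-sides behaviour with np.pad as well, a sketch might look like this (splitting each difference between the two edges, assuming a 3-channel image):

import numpy as np

def pad_image_centered(np_img, size):
    v_dif = size[0] - np_img.shape[0]
    h_dif = size[1] - np_img.shape[1]
    return np.pad(np_img,
                  ((v_dif // 2, v_dif - v_dif // 2),   # top, bottom
                   (h_dif // 2, h_dif - h_dif // 2),   # left, right
                   (0, 0)),                            # channels untouched
                  'constant', constant_values=0)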
And finally, for resizing, you are better off using another library. You can use scipy.misc.imresize; it's pretty straightforward. This should do it:
imresize(np_img, size)
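Note that scipy.misc.imresize has since been deprecated and removed from recent SciPy releases, so as an alternative, a sketch of the same operation with Pillow (assuming np_img is a uint8 array) could be:

from PIL import Image
import numpy as np

def resize_np(np_img, size):
    # Pillow expects (width, height), i.e. (size[1], size[0])
    return np.array(Image.fromarray(np_img).resize((size[1], size[0])))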
Here are a few ways I know to achieve this:
Since you're using Python, you can use cv2.resize() to resize the image to 224x224 (a minimal sketch follows this list). The problem here is going to be distortion.
Scale the image to fit one of the required sides (W=224 or H=224) and trim off whatever is extra. There is a loss of information here.
If you have the larger image and a bounding box, expand the bounding box by some delta to maintain the aspect ratio, and then resize down to the required size.
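A minimal sketch of the first option, assuming img is a BGR array loaded with cv2.imread (the file name is hypothetical):

import cv2

img = cv2.imread('input.jpg')
# Direct resize to 224x224; simple, but squashes or stretches non-square images.
img_224 = cv2.resize(img, (224, 224), interpolation=cv2.INTER_AREA)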
When you reshape a numpy array, the product of the dimensions must match the original element count. If not, it throws a ValueError like the one you got. There's no way to solve your problem with reshape alone, AFAIK.
The standard way is to resize the image such that the smaller side is equal to 224 and then crop the image to 224x224. Resizing the image to 224x224 may distort the image and can lead to erroneous training. For example, a circle might become an ellipse if the image is not a square. It is important to maintain the original aspect ratio.
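A sketch of that standard recipe, using cv2 for the resize (any resize function would do the same job):

import cv2

def resize_shorter_side_and_crop(img, out=224):
    h, w = img.shape[:2]
    # Scale so the shorter side becomes `out`; the longer side stays >= `out`.
    scale = out / min(h, w)
    img = cv2.resize(img, (round(w * scale), round(h * scale)))
    h, w = img.shape[:2]
    # Center-crop the longer dimension down to `out`.
    top, left = (h - out) // 2, (w - out) // 2
    return img[top:top + out, left:left + out]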
I was doing some face verification (I'm a newbie). First I vectorize the two pictures I want to compare:
import matplotlib.pyplot as plt

filename1 = '/1_52381.jpg'
filename2 = '/1_443339.jpg'

img1 = plt.imread(filename1)
rows, cols, colors = img1.shape
img1_size = rows * cols * colors
img1_1D_vector = img1.reshape(img1_size).reshape(-1, 1)

img2 = plt.imread(filename2)
rows, cols, colors = img2.shape
img2_size = rows * cols * colors
img2_1D_vector = img2.reshape(img2_size).reshape(-1, 1)

(img2_1D_vector.shape, img1_1D_vector.shape)
and here I get the dimensions of the two vectors, which are ((30960, 1), (55932, 1)).
My question is how to make them both the same length. Do I need to resize the pictures to the same size first, or can I do it after vectorizing them? Thanks for reading.
Yes, to compute a cosine similarity you need your vectors to have the same dimension, and resizing one of the pictures before reshaping it into a vector is a good solution.
To resize, you can use one of the image-processing frameworks available in Python.
Note: there are different algorithms/parameters that can be used for resizing.
# with skimage (you can also use PIL or cv2)
from skimage.transform import resize

shape = img1.shape                  # target shape
img2_resized = resize(img2, shape)  # resize img2 to match img1

img1_vector = img1.ravel()
img2_vector = img2_resized.ravel()

# now you can perform your cosine similarity between img1_vector and img2_vector,
# which are of the same dimension
You may want to downscale the bigger picture instead of upscaling the smaller one, as upscaling may introduce more artefacts.
You may also want to work with a fixed size across a whole dataset.
I have an image, and using steganography I want to save data in its border pixels only.
In other words, I want to save data only in the least significant bits (LSB) of the border pixels of the image.
Is there any way to get the border pixels to store data (max 15 characters of text)?
Please help me out.
OBTAINING BORDER PIXELS:
Masking operations are one of many ways to obtain the border pixels of an image. The code would be as follows:
import cv2
import numpy as np

a = cv2.imread('cal1.jpg')
bw = 20  # width of border required
mask = np.ones(a.shape[:2], dtype="uint8")
cv2.rectangle(mask, (bw, bw), (a.shape[1] - bw, a.shape[0] - bw), 0, -1)
output = cv2.bitwise_and(a, a, mask=mask)
cv2.imshow('out', output)
cv2.waitKey(5000)
After I get an array of ones with the same dimensions as the input image, I use the cv2.rectangle function to draw a rectangle of zeros. The first argument is the image you want to draw on, the second argument is the start (x,y) point, and the third argument is the end (x,y) point. The fourth argument is the color, and the fifth, -1, is the thickness of the rectangle drawn (-1 fills the rectangle). You can find the documentation for the function here.
Now that we have our mask, you can use the cv2.bitwise_and (documentation) function to perform an AND operation on the pixels. Basically, pixels that are ANDed with '1' pixels in the mask retain their values, while pixels ANDed with '0' pixels in the mask are set to 0. This way only the border pixels survive in the output.
You have the border pixels now!
Using LSB planes to store your info is not a good idea, and it makes sense when you think about it: even a simple lossy compression would affect most of your hidden data. Saving your image as JPEG would result in lost or severely corrupted info. If you still want to try LSB, look into bit-plane slicing, through which you obtain the individual bit planes (from MSB to LSB) of the image.
I have done it in Matlab and am not quite sure about doing it in Python. In Matlab,
the function bitget(image, 1) returns the LSB of the image. I found a question on bit-plane slicing using Python here. Though unanswered, you might want to look into the posted code.
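For what it's worth, the Matlab bitget call maps to plain bitwise operations on a numpy array, so a Python sketch of bit-plane slicing (reusing the test image from above, loaded as grayscale) is just:

import cv2

img = cv2.imread('cal1.jpg', cv2.IMREAD_GRAYSCALE)
lsb_plane = img & 1          # equivalent of bitget(image, 1): the LSB plane
bit3_plane = (img >> 2) & 1  # third-lowest bit plane; shift by 7 for the MSB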
To access the border pixels and enter data into them: the shape of an image is accessed via t = img.shape, which returns a tuple of the number of rows, columns, and channels. component selects the colour channel (0, 1, or 2). r is a sequence holding the values to store; int(r[0]) is the value written into the first corner.
import cv2

img = cv2.imread('xyz.png')
t = img.shape
print(t)

component = 2
r = [10, 20, 30, 40]  # example values to embed (r was not defined in the original)

# write one value into each corner pixel of the chosen channel
img.itemset((0, 0, component), int(r[0]))
img.itemset((0, t[1] - 1, component), int(r[1]))
img.itemset((t[0] - 1, 0, component), int(r[2]))
img.itemset((t[0] - 1, t[1] - 1, component), int(r[3]))

# read the values back
print(img.item(0, 0, component))
print(img.item(0, t[1] - 1, component))
print(img.item(t[0] - 1, 0, component))
print(img.item(t[0] - 1, t[1] - 1, component))

cv2.imwrite('output.png', img)
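To store an actual message rather than single values in the corners, one possible sketch is to spread the bits of the text over the LSBs of the border pixels. The helpers below are hypothetical and, for brevity, use only the top row, which already holds width/8 characters; save the result as PNG, since JPEG compression would destroy the LSBs:

import cv2
import numpy as np

def embed_text_top_row(img, text):
    # 8 bits per character, written into the LSB of channel 0 along the top row.
    bits = np.unpackbits(np.frombuffer(text.encode('ascii'), dtype=np.uint8))
    assert bits.size <= img.shape[1], "message too long for the top row"
    img[0, :bits.size, 0] = (img[0, :bits.size, 0] & 0xFE) | bits
    return img

def extract_text_top_row(img, n_chars):
    bits = img[0, :n_chars * 8, 0] & 1
    return np.packbits(bits).tobytes().decode('ascii')

img = embed_text_top_row(cv2.imread('xyz.png'), 'hello border!')
cv2.imwrite('output.png', img)  # PNG is lossless
print(extract_text_top_row(cv2.imread('output.png'), 13))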
I have an image, i.e. an array of pixel values, let's say 5000x5000 (this is the typical size). Now I want to expand it by 2 times to 10k x 10k: the value of pixel (0,0) goes to (0,0), (0,1), (1,0) and (1,1) in the expanded image.
After that I rotate the expanded image using scipy.ndimage.rotate (I believe there is no faster way given the size of my array).
Next I have to resize this 10k x 10k array back to the original 5k x 5k. To do this I take the average of the pixel values at (0,0), (0,1), (1,0) and (1,1) in the expanded image and put it in (0,0) of the new image.
However, it turns out that this whole thing is an expensive procedure and takes a lot of time given the size of my array. Is there a faster way to do it?
I am using the following code to expand the original image
import numpy as np

# Assume the original image is already given
largeImg = np.zeros((10000, 10000), dtype=np.float32)

for j in range(5000):
    for k in range(5000):
        pixel_value = original_img[j][k]
        for x in range(2 * k, 2 * (k + 1)):
            for y in range(2 * j, 2 * (j + 1)):
                largeImg[y][x] = pixel_value
A similar method is used to reduce the image back to the original size after rotation.
In numpy you can use repeat:
large_img = original_img.repeat(2, axis=1).repeat(2, axis=0)
and to shrink back to 5000x5000 afterwards, average each 2x2 block:
final_img = 0.25 * rotated_img.reshape(5000,2,5000,2).sum(axis=(3,1))
Or use scipy.ndimage.zoom; this can give you smoother results than the numpy methods.
There is a nice library that probably has all the functions you need for handling images, including rotate:
http://scikit-image.org/docs/dev/api/skimage.transform.html#skimage.transform.rotate
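Putting the pieces together, the whole upscale-rotate-downscale round trip could look like this sketch (assuming a float 5000x5000 array and scipy.ndimage.rotate; skimage.transform.rotate would work too):

import numpy as np
from scipy import ndimage

def rotate_at_double_resolution(original_img, angle_deg):
    h, w = original_img.shape
    # Upscale 2x by pixel replication.
    large = original_img.repeat(2, axis=1).repeat(2, axis=0)
    # Rotate; reshape=False keeps the 10k x 10k shape.
    rotated = ndimage.rotate(large, angle_deg, reshape=False)
    # Downscale by averaging each 2x2 block.
    return 0.25 * rotated.reshape(h, 2, w, 2).sum(axis=(3, 1))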
I need to search for outliers in more or less homogeneous images representing some physical array. The images have a resolution much higher than the screen resolution, so every pixel on screen originates from a block of image pixels. Is there a possibility to customize the algorithm that calculates the displayed value for such a block? In particular, the possibility to use either the lowest or the highest value would be helpful.
Thanks in advance
Scipy provides several such filters. To get a new image (new) whose pixels are the maximum/minimum over a w*w block of an original image (img), you can use:
new = scipy.ndimage.filters.maximum_filter(img, w)
new = scipy.ndimage.filters.minimum_filter(img, w)
scipy.ndimage.filters has several other filters available.
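Note that these filters return an image of the same size as the input; to end up with one value per w*w block for display, you can follow the filter with strided slicing, for example (a sketch with w = 8 and a stand-in image):

import numpy as np
from scipy import ndimage

w = 8
img = np.random.rand(1024, 1024)  # stand-in for your high-resolution image
# Max over each w*w neighbourhood, then keep one sample per block.
new = ndimage.maximum_filter(img, size=w)[w // 2::w, w // 2::w]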
If the standard filters don't fit your requirements, you can roll your own. To get you started, here is an example that shows how to get the minimum of each block in the image. This function reduces the size of the full image (img) by a factor of w in each direction and returns a smaller image (new) in which each pixel is the minimum of a w*w block of pixels from the original image. The function assumes the image is a numpy array:
import numpy as np

def condense(img, w):
    # One output pixel per w*w block; integer division keeps the shape integral.
    new = np.zeros((img.shape[0] // w, img.shape[1] // w))
    for i in range(img.shape[1] // w):
        col1 = i * w
        # Take a w-wide column strip, regroup it into w*w blocks, reduce each.
        new[:, i] = img[:, col1:col1 + w].reshape(-1, w * w).min(1)
    return new
If you wanted the maximum, replace min with max.
For the condense function to work well, the size of the full image must be a multiple of w in each direction. The handling of non-square blocks or images that don't divide exactly is left as an exercise for the reader.
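If the image dimensions are exact multiples of w, the loop can also be collapsed into a single reshape (a sketch):

import numpy as np

def condense_fast(img, w):
    h, wd = img.shape
    # Group pixels into (h//w, w, wd//w, w) blocks and reduce each block.
    return img.reshape(h // w, w, wd // w, w).min(axis=(1, 3))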
I want to create a thumbnail, and the width has to be fixed, or at least no bigger than 200 pixels (the height can be anything).
The images are either .jpg or .png or .gif
I am using python.
The reason it has to be fixed is so that it fits inside a html table cell.
To keep proportions the same, you need to multiply both the width and the height by the same scaling factor. Calculate each independently to fit inside your space, then choose the smallest of the two. You say you don't care about the height, but you might want to set a bound on it anyway in case someone feeds you a really skinny image.
In the code below, I've added two additional constraints: the resulting thumbnail width and height will always be >= 1, and the scaling factor will always be <= 1 (so that the thumbnail isn't larger than the original).
scale_x = max_width / image_width
scale_y = max_height / image_height
scale = min(scale_x, scale_y, 1)
thumb_width = max(round(image_width * scale), 1)
thumb_height = max(round(image_height * scale), 1)
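Applied with Pillow, the whole thing might look like the sketch below (max_width and max_height are your bounds; note Pillow's built-in Image.thumbnail method does essentially the same thing in place):

from PIL import Image

def make_thumbnail(path, max_width=200, max_height=1000):
    img = Image.open(path)
    # scale <= 1 so the thumbnail is never larger than the original
    scale = min(max_width / img.width, max_height / img.height, 1)
    thumb_w = max(round(img.width * scale), 1)
    thumb_h = max(round(img.height * scale), 1)
    return img.resize((thumb_w, thumb_h))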
Look at PyMagick, the python interface for the ImageMagick libraries. It's fairly simple to resize an image, retaining proportion, while limiting the longest side.
edit: when I say fairly simple, I mean you can describe your resize in terms of the longest acceptable values for each side, and ImageMagick will preserve the proportions automatically.
I support the suggestion of using PIL. However, the calculation is actually much simpler:
from PIL import Image as PILImage

imageObj = PILImage.open(image_filename)
iwidth, iheight = imageObj.size  # pixels
size_proportion = iheight / iwidth  # make sure your "limiter" is the denominator
newheight = int(round(size_proportion * 200))

# resize the image to (200, newheight) and save it
thumbnail = imageObj.resize((200, newheight))
thumbnail.save(thumbnail_filename)
Alternatively, just call out to subprocess and use ImageMagick or GraphicsMagick (I use the latter). These libs give you very good scaling algorithms, are written in a lower-level language, and are heavily optimized. One extra nice thing IM and GM do is mass processing of images. Another nice thing is that in some modes you don't need to give GraphicsMagick the exact size: just give maximums, and it will scale the picture down based on whichever constraint exceeds your given maximums. Check it out.