I have a numpy array batch initialized as follows:
batch = np.zeros((50, 60, 1920, 1080, 3))
It's supposed to be an array of 50 different 60 FPS videos of dimension 1920x1080, where the 3 represents the three channels: red, green, and blue. Each video is exactly 1 second long.
I iterate through all videos in my video folder and perform image processing on each frame of every video. Then I write the transformed video into the batch array. How do I properly index the batch array so that each video is saved in a way that conforms to the dimensions of the batch array?
So far I have tried the following:
batch[:batches_produced, :idx, :] = frame[:]
where batches_produced is the current batch item index, idx is the current frame's index and frame is the actual frame of dimension (1920x1080x3).
When I
print(batch_data[1,2,:,:,:].shape), it throws
IndexError: index 1 is out of bounds for axis 0 with size 1.
Needless to say, this isn't working at all. I have spent most of my day trying to figure this out.
Any help would be greatly appreciated!
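A slice such as `:batches_produced` selects a range of videos (an empty range when the index is 0), not a single one, so the assignment in the question never targets exactly one frame. Plain integer indices do. Here is a minimal sketch with small stand-in sizes (the sizes and the names `video_idx` and `frame_idx` are assumptions for illustration). Also note that the printed array is `batch_data`, not `batch`, so the IndexError may simply come from a different array. As a side note, the full-size float64 array would need roughly 149 GB, while `dtype=np.uint8` brings that down to about 18.7 GB.

```python
import numpy as np

# Small stand-in sizes (assumed for illustration); the real array would be
# (50, 60, 1920, 1080, 3).
n_videos, n_frames, height, width = 4, 6, 8, 5
batch = np.zeros((n_videos, n_frames, height, width, 3), dtype=np.uint8)

for video_idx in range(n_videos):
    for frame_idx in range(n_frames):
        # Stand-in for a processed frame of shape (height, width, 3).
        value = video_idx * 10 + frame_idx
        frame = np.full((height, width, 3), value, dtype=np.uint8)
        # Integer indices pick exactly one video and one frame.
        batch[video_idx, frame_idx] = frame

print(batch[1, 2].shape)  # (8, 5, 3)
```

The trailing axes can be left out of the index: `batch[video_idx, frame_idx]` is the same as `batch[video_idx, frame_idx, :, :, :]`.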
I have made myself a numpy array from a picture using
from PIL import Image
import numpy as np
image = Image.open(file)
np.array(image)
Its shape is (6000, 6000, 4), and in that array I would like to replace pixel values with a single number; let's say this green pixel [99,214,104,255] will be 1.
I have only 4 such pixels I want to replace with a number, and all other pixels will be 0. Is there a fast and efficient way to do so, and what is the best way to minimize the size of the data? Is it better to save it as a dict(), where the keys are x,y and the values are integers? Or is it better to save the whole array as it is, with the shape it has? I only need the color values; the rest is not important to me.
I need to process such a picture as fast as possible, because there is one picture every 5 minutes and, let's say, I would like to store 1 year of data. That is why I'd like to make it as efficient as possible, both time- and space-wise.
If I understand the question correctly, you can use np.where for this:
>>> arr = np.array(image)
>>> COLOR = [99,214,104,255]
>>> np.where(np.all(arr == COLOR, axis=-1), 1, 0)
This will produce a 6000*6000 array with 1 if the pixel is the selected colour, or 0 if not.
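If several colours each need their own number, the same mask idea can be repeated per colour. This is only a sketch: the tiny array stands in for the real image, and the `color_labels` table is hypothetical apart from the green value taken from the question.

```python
import numpy as np

# Tiny stand-in for the (6000, 6000, 4) RGBA image.
arr = np.zeros((3, 3, 4), dtype=np.uint8)
arr[0, 1] = [99, 214, 104, 255]
arr[2, 2] = [255, 0, 0, 255]

# Hypothetical colour-to-label table; only the green entry is from the question.
color_labels = {
    (99, 214, 104, 255): 1,
    (255, 0, 0, 255): 2,
}

# Start with all zeros and write one label per matching colour.
labels = np.zeros(arr.shape[:2], dtype=np.uint8)
for color, label in color_labels.items():
    mask = np.all(arr == np.array(color, dtype=arr.dtype), axis=-1)
    labels[mask] = label

print(labels)
```

Using `uint8` for the result keeps the label array at one byte per pixel.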
How about just storing in a database: the position and value of the pixels you want to modify, the shape of the image, the dtype of the array and the extension (jpg, etc...). You can use that information to build a new image from an array filled with 0.
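A rough sketch of that idea, assuming the image has already been reduced to a mostly-zero label array: keep only the coordinates and values of the non-zero pixels, plus the shape and dtype, and rebuild the array when needed. The `record` dict is a hypothetical layout, not a specific database schema.

```python
import numpy as np

# Mostly-zero label array standing in for the processed image.
labels = np.zeros((6, 6), dtype=np.uint8)
labels[1, 4] = 1
labels[3, 2] = 3

# Keep only what is needed to rebuild the array.
rows, cols = np.nonzero(labels)
record = {
    "shape": labels.shape,
    "dtype": str(labels.dtype),
    "rows": rows.tolist(),
    "cols": cols.tolist(),
    "values": labels[rows, cols].tolist(),
}

# Rebuild from an array filled with 0.
restored = np.zeros(record["shape"], dtype=record["dtype"])
restored[record["rows"], record["cols"]] = record["values"]

print(np.array_equal(restored, labels))  # True
```

The stored record grows with the number of marked pixels rather than with the image size, which is what makes it attractive when only a few pixels matter.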
I am building an input dataset of a couple of thousand images, which do not all have the same size but do have the same number of channels. I need to combine these different images into one stack.
orders = (channels, size, size)
Image sizes = (3,240,270), (3,100,170), etc
I have tried appending along axis 0 and axis 1, and inserting too.
Images = np.append(Images, image, axis=0)
File "d:/Python/advanced3DFacePointDetection/train.py", line 25, in <module>
Images = np.append(Images, item, axis=0)
File "C:\Users\NIK\AppData\Roaming\Python\Python37\site-packages\numpy\lib\function_base.py", line 4694, in append
return concatenate((arr, values), axis=axis)
ValueError: all the input array dimensions except for the concatenation axis must match exactly
The ideal output shape is something like (number of images, 3, ...), where 3 is the number of channels and the remaining dimensions differ from image to image.
If you don't want to resize the images, pick the biggest one and pad all the other pictures to the same shape. I explained how to pad in my answer to this question: Can we resize an image from 64x64 to 256x256 without increasing the size.
When you run that padding in a loop over all your images, also create a list that saves each original shape. When you want an original image back, take the image at index x in your array and the shape at index x in your list, then crop the padded image to its original size.
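A minimal sketch of the pad-and-remember-shapes approach, assuming channel-first images and zero padding at the bottom and right (the padding position is my assumption; the answer above does not specify it):

```python
import numpy as np

# Two stand-in images in (channels, height, width) order.
images = [np.ones((3, 4, 5)), 2 * np.ones((3, 2, 7))]

# Remember every original shape so the padding can be cropped away later.
shapes = [img.shape for img in images]
max_h = max(s[1] for s in shapes)
max_w = max(s[2] for s in shapes)

# Zero-filled stack big enough for the largest height and width.
stack = np.zeros((len(images), 3, max_h, max_w), dtype=images[0].dtype)
for i, img in enumerate(images):
    _, h, w = img.shape
    stack[i, :, :h, :w] = img

# Recover image 1 by cropping with its stored shape.
_, h, w = shapes[1]
original = stack[1, :, :h, :w]

print(stack.shape)     # (2, 3, 4, 7)
print(original.shape)  # (3, 2, 7)
```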
I have some problems with my numpy array code.
I am writing a program that grabs frames while capturing with a camera.
I want to save the image values fetched for each frame in a list or numpy array, and then save all the pictures at once at the end.
However, when I try to store them in an array, it fails.
The image size is 640*480 and the frame rate is 30 fps.
My setup is Windows 10, pyrealsense, numpy, Python 3.6, cv2 and a RealSense D415 camera.
Example code
color = []
color_image = frame.get_color_frame()
color_array = np.asanyarray(color_image.get_data())
color.append(color_array)
With this code I get a few frames, but after 15 frames the camera keeps returning the same frame.
Example code
test = np.zeros((480,640,3))
color_image = frame.get_color_frame()
color_array = np.asanyarray(color_image.get_data())
np.insert(test, frame_count, color_array)
ValueError: could not broadcast input array from shape (480,640,3) into shape (480)... why..? :(
In case 1, I need different frames to keep being stored after the 15th frame.
In case 2, I need the frame to be inserted into the array correctly. I could not find an example showing how this should be done.
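For the second case, note that `np.insert` returns a new array instead of modifying `test`, and without an `axis` argument it works on the flattened array, so it is not the right tool for storing whole frames. A sketch of one alternative is to preallocate one slot per frame and assign by index. The camera is replaced here by a hypothetical `fake_camera_frame` helper so the snippet runs without a device. Because the assignment copies the pixel values, it may also help with the first case if the repeated frames come from the list holding on to the camera's own buffers, which I have not verified on a D415.

```python
import numpy as np

n_frames, height, width = 5, 480, 640

# One slot per frame, allocated once up front.
frames = np.zeros((n_frames, height, width, 3), dtype=np.uint8)


def fake_camera_frame(i):
    """Hypothetical stand-in for np.asanyarray(color_image.get_data())."""
    return np.full((height, width, 3), i, dtype=np.uint8)


for frame_count in range(n_frames):
    color_array = fake_camera_frame(frame_count)
    # Indexed assignment copies the pixel values into the preallocated slot.
    frames[frame_count] = color_array

print(frames.shape)  # (5, 480, 640, 3)
```

If the number of frames is not known in advance, appending `color_array.copy()` to a list and calling `np.stack` at the end is a similar option.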
I am working with face crops of different sizes, which are generated at inference time from a video. The number of face crops per frame can vary. Let's say I have n = 3 face crops of sizes (127,321,3), (119,258,3), (135,127,3). I need to resize them in a single go to a new size (new_w,new_h,3). The resizing will generally be down-sampling. What if some of the crops require up-sampling?
I am currently using OpenCV's resize function in a loop, but I need to do this in real time (< 10 ms). I tried converting the images to numpy arrays and resizing them as numpy arrays, but that distorts the images. I have also tried creating a process pool, but it takes too many resources, which I can't afford since I also have a deep learning network loaded in memory.
import cv2
import numpy as np

new_w, new_h = 112, 112 # new shape we need to resize images to
crops_batch = [[...],[...],[...]] # contains 3 images
# preallocate an array (not a list) so the indexed assignment below works
resized_crops = np.zeros((len(crops_batch), 3, new_w, new_h))
for i in range(len(crops_batch)):
    temp_img = crops_batch[i]
    resized_img = cv2.resize(temp_img, (new_h, new_w))
    resized_img = np.transpose(resized_img, (2, 0, 1)) # channels, width, height
    resized_crops[i, :, :, :] = resized_img
My target is to get resized_crops array of shape (n,3,new_w,new_h) using batching, since I will have to process hundreds of crops per second.
**Edit:** What if all the face_crops are of same size already? How can we do the resizing in batch then?
**Edit 2:** I am open to a solution in C++ which can help batch resize.
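For the same-size case in the first edit, one possible sketch is to stack the crops into a single array and resample all of them with one indexing operation. This is plain nearest-neighbour sampling in NumPy, not OpenCV's interpolation, so quality will be lower than `cv2.resize`; the function name `resize_batch_nearest` and the (n, h, w, c) input layout are assumptions for illustration.

```python
import numpy as np


def resize_batch_nearest(batch, new_h, new_w):
    """Nearest-neighbour resize of an (n, h, w, c) batch to (n, c, new_h, new_w)."""
    n, h, w, c = batch.shape
    # Source row and column picked for each output row and column.
    rows = np.arange(new_h) * h // new_h
    cols = np.arange(new_w) * w // new_w
    # One fancy-indexing call resamples every image in the batch.
    resized = batch[:, rows[:, None], cols]  # (n, new_h, new_w, c)
    return resized.transpose(0, 3, 1, 2)     # channels first


# Three random same-size crops as stand-in data.
crops = np.random.randint(0, 256, size=(3, 224, 224, 3), dtype=np.uint8)
resized_crops = resize_batch_nearest(crops, 112, 112)
print(resized_crops.shape)  # (3, 3, 112, 112)
```

The same index arrays work for up-sampling, since `rows` and `cols` simply repeat source indices when the target is larger than the source.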
Recently I followed a few tutorials on machine learning, and now I want to test if I can make some image recognition program by myself. For this I want to use the CIFAR 10 dataset, but I think I have a small problem in the conversion of the dataset.
For those not familiar with this set: the dataset comes as lists of n rows and 3072 columns, in which the first 1024 columns hold the red values, the next 1024 the green values and the last 1024 the blue values. Each row is a single image (size 32x32), and the pixel rows are stored one after another (the first 32 values are the red values for the top-most row of pixels, etc.).
What I wanted to do with this dataset is transform it into a 4D tensor (with numpy), so I can view the images with matplotlib's .imshow(). The tensor I made has the shape (n, 32, 32, 3): the first dimension indexes the images, the second the rows of pixels, the third the individual pixels in a row, and the last the RGB values of each pixel. Here is the function I made that should do this:
def rawToRgb(data):
    length = data.shape[0]
    # convert to flat img array with rgb pixels
    newAr = np.zeros([length, 1024, 3])
    for img in range(length):
        for pixel in range(1024):
            newAr[img, pixel, 0] = data[img, pixel]
            newAr[img, pixel, 1] = data[img, pixel+1024]
            newAr[img, pixel, 2] = data[img, pixel+2048]
    # convert to 2D img array
    newAr2D = newAr.reshape([length, 32, 32, 3])
    # plt.imshow(newAr2D[5998])
    # plt.show()
    return newAr2D
The function takes a single parameter (a tensor of shape (n, 3072)). I have commented out the pyplot code, as it is only for testing. When testing, everything seemed to be OK: I can recognise the shapes of the objects in the images, but I am not sure whether the colours are right, as I get some oddly-coloured images as well as some pretty normal ones. Here are a few examples: purple plane, blue cat, normal horse, blue frog.
Can anyone tell me whether I am making a mistake or not?
The images that appear oddly-coloured are the negative of the actual image, so you need to subtract each pixel value from 255 to get the true value. If you simply want to see what the original images look like, use:
from imageio import imread  # scipy.misc.imread was removed in SciPy 1.2
import matplotlib.pyplot as plt
img = imread(file_path)
plt.imshow(255 - img)
plt.show()
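The original cause of the problem is that the CIFAR-10 data stores the pixel values on a scale of 0-255, but matplotlib's imshow() method (which I assume you are using) expects floating-point RGB inputs between 0 and 1; only integer arrays are read on the 0-255 scale. Because np.zeros creates a float array by default, the array returned by rawToRgb holds floats in the range 0-255. Given a float input that is not scaled between 0 and 1, imshow() does some normalization internally, which causes some images to become negatives.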
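As a sketch of how the conversion could avoid the float issue altogether, the nested loops can be replaced by a reshape and a transpose, returning `uint8` so that imshow() reads the values on the 0-255 scale. The name `raw_to_rgb` and the random stand-in data are my own; dividing a float array by 255 would be an equivalent alternative.

```python
import numpy as np


def raw_to_rgb(data):
    """Convert (n, 3072) CIFAR-10 rows to (n, 32, 32, 3) uint8 images."""
    n = data.shape[0]
    # Each row is laid out as [red plane | green plane | blue plane].
    planes = data.reshape(n, 3, 32, 32)
    # Move the channel axis last: (n, rows, cols, channels).
    return planes.transpose(0, 2, 3, 1).astype(np.uint8)


# Random stand-in for real CIFAR-10 rows.
data = np.random.randint(0, 256, size=(4, 3072))
images = raw_to_rgb(data)
print(images.shape, images.dtype)  # (4, 32, 32, 3) uint8
```

The result can be passed straight to `plt.imshow(images[0])`.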