I am currently working on a project where I am trying to create a machine learning model that can classify actions in a video. I have already created a script that detects a person in a video and generates data based on the movements of the body parts. This produces a 4D array with the following input shape:
(nframes, nperson, nbodyparts, 3 coordinates per body part)
The input shape for a single video (2 persons) with a duration of 3 seconds, filmed at 60 fps, will look like this:
(180, 2, 25, 3)
The 4D array for every video is saved as a numpy file, so if I process 400 videos, I end up with 400 numpy files.
The next step is to create a Keras or TensorFlow RNN-LSTM model that can train on the 400 numpy files and work with the 4D array of every video, but I really don't know how to get this to work. I have already searched for solutions, but the only thing I could find is that Keras can only work with 3D arrays.
I would really appreciate your help and your view on how I could solve this, ideally with some example code.
Kind regards,
I assume you are using numpy.array. Reshaping the 4D array into 3D can be done with np.reshape(). Documentation can be found here.
Example:
import numpy as np
# create a sample 4d array data of shape (10, 2, 25, 3)
data = np.arange(10*2*25*3).reshape((10, 2, 25, 3))
# condense the 4D array to a 3D array by explicitly stating the shape
data_reshaped = data.reshape((10, 2, 75))
# or you can use -1 to let numpy infer the dimension
data_reshaped2 = data.reshape((10, 2, -1))
# you can also reshape your data into 2d of shape (10, 150)
data_reshaped3 = data.reshape((10, -1))
Then, you can follow tutorials online to build your model. An example tutorial could be this.
Note: you mentioned that "Keras is only able to work with 3D arrays." I think one of those dimensions is reserved for the batch size, so each individual sample has to be 2D. I therefore suggest converting each per-video 4D array into a 2D array of shape (timesteps, features), so that a batch of videos becomes a 3D array.
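A minimal sketch of what such a model could look like in Keras, assuming each per-video array is flattened to (180, 150) and the videos are stacked into one 3D training array. The layer sizes, number of classes, and training settings below are placeholders, not taken from the question:
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

n_videos, n_frames, n_features, n_classes = 400, 180, 2 * 25 * 3, 5  # placeholder sizes

# stand-in for the 400 numpy files: each (180, 2, 25, 3) video flattened to (180, 150)
X = np.random.rand(n_videos, n_frames, 2, 25, 3).reshape(n_videos, n_frames, n_features)
y = np.random.randint(0, n_classes, size=n_videos)  # dummy action labels

model = Sequential([
    LSTM(64, input_shape=(n_frames, n_features)),  # 3D input: (batch, timesteps, features)
    Dense(n_classes, activation='softmax'),
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(X, y, epochs=2, batch_size=16)
In real use, X would be built by loading each saved .npy file, reshaping it to (180, 150), and stacking the results.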
I am in the process of converting some MATLAB code to Python. I am working with a 3D volume h x w x d represented as a numpy array, and I am extracting smaller 3D patches from this volume using the function from SO here. So if I have a 32x32x32 array and extract 16x16x16 patches, I end up with shape (2, 2, 2, 16, 16, 16). After processing each patch I would like to put it back into shape h x w x d, basically reversing window_nd. What would be the idiomatic numpy way to do this without looping over each dimension? Since I also need to work with 2D and 4D data, I would like to avoid creating a separate function for each dimension.
Normally, writing back to as_strided views is not advised because it can cause race conditions, but since you only made blocks, this should work:
original_shaped_array = windowed_array.transpose(0,3,1,4,2,5).reshape(32,32,32)
Additionally, if you never copied the windowed array and do your calculations in-place, the data will be changed in the original array as well, since a windowed view is simply a new view into the same data. Don't do this if there is any overlap between windows.
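A minimal self-contained sketch of that round trip, assuming non-overlapping 16x16x16 blocks built with a plain reshape/transpose (the window_nd helper from the linked answer is not reproduced here):
import numpy as np

# build a 32x32x32 volume and cut it into non-overlapping 16x16x16 blocks
volume = np.arange(32**3).reshape(32, 32, 32)
blocks = volume.reshape(2, 16, 2, 16, 2, 16).transpose(0, 2, 4, 1, 3, 5)
print(blocks.shape)  # (2, 2, 2, 16, 16, 16)

# reverse the windowing: interleave block indices with within-block indices
restored = blocks.transpose(0, 3, 1, 4, 2, 5).reshape(32, 32, 32)
print(np.array_equal(restored, volume))  # True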
I'm working in Python. I have a list of length 784 (which I extracted from a 28x28 image) and now I want to arrange it so that I can feed it to a trained TensorFlow model, but the problem is that the network needs a (28, 28)-shaped array, while my current array has shape (784,).
I tried to use some for loops for this but had no luck building something that works, so please help.
I figured out that I need to use this structure:
for i in range(res):
    for a in range(0, res):
        mnistFormat.append(grascalePix[a])  # mnistFormat is an initially empty list
        # and grascalePix has my 784 grayscale pixels
but I can't figure out what should go in the range functions of the for loops to make this work.
For example, let's say I have a sample 4x4 image's pixel list:
grayscalePix = [255,255,255,255,255,100,83,200,255,50,60,255,30,1,46,255]
This is a row-by-row representation, which means the first 4 elements are the first row.
I want to arrange them into a list of shape (4, 4):
mnistFormat = [[255,255,255,255],[255,100,83,200],[255,50,60,255],[30,1,46,255]]
Just keep in mind this is a sample; the real data is 784 elements long and I don't have much experience with numpy.
numpy might help you there very easily:
mnistFormat = np.array(grascalePix).reshape((int(np.sqrt(len(grascalePix))), -1))
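A quick sanity check with the 4x4 sample from the question; the same line works for the real 784-pixel list and gives a (28, 28) array:
import numpy as np

grayscalePix = [255,255,255,255,255,100,83,200,255,50,60,255,30,1,46,255]
side = int(np.sqrt(len(grayscalePix)))  # 4 here, 28 for the real data
mnistFormat = np.array(grayscalePix).reshape((side, -1))
print(mnistFormat.shape)  # (4, 4)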
I have a numpy array X which contains 2D images. The numpy array dimensions are (1000, 60, 40) (1000 = number of images).
I want to feed this array to my model, but it requires the dimensions to be
(1000, 60, 40, 1) (the appended 1 is the number of channels).
So I reshape the array with
Y=X.reshape(1000,60,40,1)
As I was getting wrong predictions, I checked by reshaping the reshaped array back, to see whether it was the same as my original images.
I did that by doing
Z = Y.reshape(1000, 60, 40)
And I saved the images as PNGs by doing
for i in range(1000):
    misc.imsave('img_rereshaped' + str(i) + '.png', Z[i])
It produces PNG files as output, but they are not the same as the corresponding originals from the X numpy array.
Am I reshaping in the wrong way, or does reshaping change the input data, so that reshaping the reshaped data gives a different result than the original data?
To test whether the reshaping is causing a problem, it's better to test it without involving other potential errors coming from, say, misc.imsave() etc.
Running something like:
import numpy as np
a = np.random.rand(10,3)
b = np.reshape(a, [10, 3, 1])
c = np.reshape(b, [10, 3])
print(np.sum(c - a))
you'll see that going back and forth using reshape doesn't cause a problem.
It could be that you're not using the PNG save correctly; perhaps the function expects 3 channels, for example. Try plotting the images locally using matplotlib.
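A quick sketch of that check with random stand-in data (shapes taken from the question; matplotlib is just one way to eyeball the images):
import numpy as np
import matplotlib.pyplot as plt

X = np.random.rand(1000, 60, 40)  # stand-in for the original images
Y = X.reshape(1000, 60, 40, 1)    # add the channel dimension
Z = Y.reshape(1000, 60, 40)       # drop it again

print(np.array_equal(X, Z))       # True: reshape does not reorder the data

# compare the first original and round-tripped image side by side
fig, axes = plt.subplots(1, 2)
axes[0].imshow(X[0], cmap='gray')
axes[0].set_title('original')
axes[1].imshow(Z[0], cmap='gray')
axes[1].set_title('reshaped back')
plt.show()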
I have a 1D numpy array Data of size 36*64. Basically, I have 36 8x8 images stored in a 1D array. Each image is stored in Height(8) x Width(8) format.
For example, the i-th image is stored in Data[i*8*8 : (i*8*8 + 8*8)].
Now I want to make a tile of images from the given 36 images, i.e. 6 images stacked on top of each other. Example.
Basically, I want to transform my 1D numpy array into a 2D array of images in the above-mentioned format.
I would prefer answers that use just numpy methods.
To convert your 1D array into an array of 2D images, use reshape as shown in this example:
# Creating 36 images each of shape 8x8
initial_1D = np.random.randn(2304).reshape(36, 8, 8)
A collage can then be formed using PIL. For a clear explanation, refer to Making a collage in PIL.
If I understand you correctly, you can do it like this
# make example data
a = np.linspace(0, 36*64-1, 36*64)
print(a[:64])
print(a.shape)
# reshape 1D to 3D array
b = a.reshape(-1, 8, 8)
# look at first "image"
print(b[0])
If I did not understand you correctly, you need to put the -1, 8, 8 in a different order.
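If the goal is a single 2D collage using only numpy, here is one sketch; it assumes the 36 images should be arranged in a 6x6 grid (the layout in the question's linked example may differ):
import numpy as np

data = np.arange(36 * 64)          # stand-in for the 1D array of 36 8x8 images
images = data.reshape(36, 8, 8)    # one 8x8 image per entry along the first axis

# arrange the 36 images into a 6x6 grid and merge them into one 48x48 array
grid = images.reshape(6, 6, 8, 8)  # (grid_row, grid_col, height, width)
collage = grid.transpose(0, 2, 1, 3).reshape(6 * 8, 6 * 8)
print(collage.shape)               # (48, 48)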
I'm new to Python and I apologize in advance if my question seems trivial.
I have a .h5 file containing pairs of greyscale images organized into match and non-match groups. For my final purpose, I need to consider each pair of images as a single 2-channel image (where each channel is in fact one image).
To use my data I proceed like this:
I read the .h5 file, putting my data into numpy arrays (I read both the match and non-match groups, both with shape (50000, 4096)):
with h5py.File('supervised_64x64.h5', 'r') as hf:
    match = hf.get('/match')
    non_match = hf.get('/non-match')
    np_m = np.asarray(match)
    np_nm = np.asarray(non_match)
hf.close()
Then I try to reshape the arrays:
reshaped_data_m = np_m.reshape(250000,2,4096)
Now, if I reshape the arrays as (250000, 2, 4096) and then try to show the corresponding images, what I get is actually right. But I need to reshape the arrays as (250000, 64, 64, 2), and when I try to do this I get all black images.
Can you help me?
Thank you in advance!
I bet you need to first transpose your input matrix from 250000x2x4096 to 250000x4096x2, after which you can do the reshape.
Luckily, numpy offers the transpose function, which should do the trick. See this question for a bigger discussion around transposing.
In your particular case, the invocation would be:
transposed_data_m = numpy.transpose(reshaped_data_m, [0, 2, 1])
reshaped_data_m = transposed_data_m.reshape(250000, 64, 64, 2)
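A small sanity check of the same idea on toy data (4x4 "images" instead of 64x64, three pairs instead of 250000; all names are made up):
import numpy as np

# three pairs, each stored as two flattened 4x4 images along the second axis
pairs = np.arange(3 * 2 * 16).reshape(3, 2, 16)  # (pairs, channel, pixels)

transposed = np.transpose(pairs, [0, 2, 1])      # (pairs, pixels, channel)
stacked = transposed.reshape(3, 4, 4, 2)         # (pairs, height, width, channel)

# channel 0 of the first pair is the first flattened image, restored to 4x4
print(np.array_equal(stacked[0, :, :, 0], pairs[0, 0].reshape(4, 4)))  # True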