Different decimal formats within same numpy array - python

I reshaped a 3D NumPy array to 2D with X1 = np.reshape(input, (500, 3*40)). The rows of the new 2D array now print in different formats.
Some rows look like this:
X1[8,:] gives:
array([ 5557., 2001., 1434., 1348., 991., 1240., 1668., 1093.,
1680., 1476., 2521., 1841., 2443., 2295., 1911., 2491., and so on .... ])
whereas other rows look like this:
X1[9,:] gives:
array([3.69900e+04, 1.19090e+04, 1.12300e+04, 1.25170e+04, 6.91000e+03,
7.24700e+03, 8.31800e+03, 6.31000e+03, 8.96700e+03, 7.18100e+03,
1.03010e+04, 9.69800e+03, 1.29270e+04, 1.33140e+04, 1.00420e+04, and so on ... ])
Since the formats are not consistent, I am not sure whether this will cause a problem during neural network training, and I don't know how to maintain the same decimal format throughout the array.

That isn't a problem for you, because 5557. and 1.03010e+04 are both floats. The scientific notation is only used for displaying (printing) the numbers.
Remember that a NumPy array has a single data type for all of its items, which you can get from the array.dtype attribute.
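For example, a quick check (a sketch using the asker's X1): both rows share one dtype, and printing can be forced to plain decimals for the whole array.
import numpy as np

print(X1.dtype)                      # e.g. float64 -- one dtype for every row
np.set_printoptions(suppress=True)   # suppress scientific notation when printing
print(X1[9, :])                      # now prints plain decimals, like X1[8, :]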

Python numpy comparing two 3D Arrays for similarity

I am trying to compare two 3D numpy arrays to calculate similarity. I have found these two posts, which I am trying to stitch together into something useful.
Comparing NumPy Arrays for Similarity
Subtracting numpy arrays of different shape efficiently
To make a long story short, I have two arrays created from 3D point clouds so they are filled with 3D coordinates, but because the 3D objects are different, the arrays have different lengths.
If requested, I can post some sample arrays, but they are +1000 points, so that would be a lot of text to post.
Here is what I am trying to do now. You can get array1 and array2 data here: https://pastebin.com/WbNvRUwG (array2 starts at line 1858).
array1 = [long np array with 3D coordinates]
array2 = [long np array with 3D coordinates]
array1_original = array1.copy()

if len(array1) < len(array2):
    array1, array2 = array2, array1

# The [:,None] is from the second link: it broadcasts the arrays to
# compatible shapes so the subtraction works despite different lengths
array_difference = np.subtract(array1, array2[:, None])
array_abs_difference = np.absolute(array_difference)
array_total_difference = np.sum(array_abs_difference)
similarity = 1 - (array_total_difference / np.sum(array1_original))
The array differences are fine and represent what I want (the most similar arrays have small differences), but the sum of array1_original comes out much smaller than my summed differences, so the similarity score becomes negative.
I also tried to calculate the difference from an array filled with zeros to array1_original, but it comes out about the same.
Can anyone tell me why np.sum(array1_original) would not be bigger than np.sum(array_abs_difference)?
The NumPy comparison ended up being too slow, so I just used Open3D instead. It works for me.
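For reference, the negative score has a likely cause: array2[:, None] broadcasts the subtraction over every pair of points, so array_abs_difference has shape (len(array2), len(array1), 3) and its sum grows with the product of both lengths, easily dwarfing np.sum(array1_original). Below is a sketch of a nearest-neighbour alternative, assuming SciPy is available (this is not the Open3D solution the asker settled on):
import numpy as np
from scipy.spatial import cKDTree  # assumption: SciPy is installed

def cloud_similarity(array1, array2):
    # Match each point to its nearest neighbour instead of every pair
    if len(array1) < len(array2):
        array1, array2 = array2, array1
    tree = cKDTree(array1)
    distances, _ = tree.query(array2)   # one distance per point in array2
    return 1 - distances.sum() / np.abs(array1).sum()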

How to do a 2D FFT on a 2D NumPy array of complex numbers and plot the result

I have a 48x64 NumPy array of complex numbers.
How can I do a 2D FFT with this array? The result is another complex array of the same size; how can I plot it?
Here is what I have done.
a = np.fft.fft2(data_total_dataframe)
data_total_dataframe is my original 48x64 complex number array.
So after this code, I will have the numpy array a.
Is there any method to plot this array so that I can see any type of graph?
I'm new to python and FFT processing.
Thank you in advance for the help
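A complex array can't be drawn directly, so one common choice is to plot its magnitude. A minimal sketch, assuming matplotlib is available:
import numpy as np
import matplotlib.pyplot as plt

a = np.fft.fft2(data_total_dataframe)   # 48x64 complex result
shifted = np.fft.fftshift(a)            # move the zero frequency to the centre
magnitude = np.log1p(np.abs(shifted))   # log scale makes small peaks visible

plt.imshow(magnitude, cmap='gray')
plt.colorbar(label='log magnitude')
plt.title('2D FFT magnitude spectrum')
plt.show()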

Transpose a 1-dimensional array in Numpy without casting to matrix

My goal is to to turn a row vector into a column vector and vice versa. The documentation for numpy.ndarray.transpose says:
For a 1-D array, this has no effect. (To change between column and row vectors, first cast the 1-D array into a matrix object.)
However, when I try this:
my_array = np.array([1,2,3])
my_array_T = np.transpose(np.matrix(my_array))
I do get the wanted result, albeit in matrix form (matrix([[1],[2],[3]])), but I also get this warning:
PendingDeprecationWarning: the matrix subclass is not the recommended way to represent matrices or deal with linear algebra (see https://docs.scipy.org/doc/numpy/user/numpy-for-matlab-users.html). Please adjust your code to use regular ndarray.
my_array_T = np.transpose(np.matrix(my_array))
How can I properly transpose an ndarray then?
Transposing a 1D array returns the same array, unlike MATLAB, where 1D arrays don't exist and everything is at least 2D.
What you want is to reshape it:
my_array.reshape(-1, 1)
Or:
my_array.reshape(1, -1)
Depending on what kind of vector you want (column or row vector).
The -1 tells NumPy to infer that dimension from the total number of elements, and the 1 creates the second required dimension.
If your array is my_array and you want to convert it to a column vector you can do:
my_array.reshape(-1, 1)
For a row vector you can use
my_array.reshape(1, -1)
Both of these can also be transposed and that would work as expected.
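For example, a quick demonstration of both reshapes:
import numpy as np

my_array = np.array([1, 2, 3])

col = my_array.reshape(-1, 1)   # column vector, shape (3, 1)
row = my_array.reshape(1, -1)   # row vector, shape (1, 3)

print(col.T)   # transposing now works as expected: [[1 2 3]]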
IIUC, use reshape
my_array.reshape(my_array.size, -1)

Pandas Series.as_matrix() doesn't properly convert a series of nd arrays into a single nd array

I have a pandas dataframe where one column is labeled "feature_vector" and each cell contains a 1d numpy array of numbers. I need to use this data in a scikit-learn model, so I need it as a single numpy array. Naturally, I call DataFrame["feature_vector"].as_matrix() to get the numpy array from the series. The only problem is that as_matrix() returns a 1d numpy array where each element is itself a 1d numpy array containing each vector. When this is passed to an sklearn model's .fit() function, it throws an error. What I need instead is a 2d numpy array rather than a 1d array of 1d arrays. I wrote this workaround, which presumably uses unnecessary memory and computation time:
x = dataframe["feature_vector"].as_matrix()
# x is a 1d array of 1d arrays
l = []
for e in x:
    l.append(e)
x = np.array(l)
# x is now a single 2d array
Is this a bug in pandas .as_matrix()? Is there a better workaround that doesn't require me to change the structure of the original dataframe?
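One shorter route, as a sketch (note that as_matrix() was later deprecated in favour of .to_numpy()): np.stack builds the 2D array directly from the column of 1D arrays.
import numpy as np

x = np.stack(dataframe["feature_vector"].tolist())
# x now has shape (n_rows, vector_length) -- a proper 2d array for .fit()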

Python: how to keep a big numpy array of arrays of floats relatively small?

I have a numpy array created by reading many images with the cv2 package. I read the images in grayscale, so the pixel values are from 0 to 255 and the data type is uint8, meaning each element takes 1 byte. I build a list from the images and then want to transform that list of arrays into an array of arrays. Afterwards, I need to feed this data into a model, but the model needs the pixel values to be floats between 0 and 1, and a float64 element takes 8 bytes. So I tried to transform each array using this cv2 function:
unlabeled_img_array = cv2.normalize(unlabeled_img_array.astype('float'), None, 0.0, 1.0, cv2.NORM_MINMAX)
It works and I can create the list of arrays. The problem arises when I try to turn the list of arrays into an array of arrays with this:
unlabeled_img_array_arrays = np.array(unlabeled_img_list_arrays)
I then get a MemoryError, evidently because the matrix is too big. If I do it with dtype uint8, there's no error.
My question is: is there a way to get around this problem, or do I have to stick with uint8 instead of float values?
Edit:
I also tried using this
cv2.normalize(unlabeled_img_array.astype(np.float16), None, 0.0, 1.0, cv2.NORM_MINMAX)
but it gives me this error
TypeError: src data type = 23 is not supported
Is there a way to convert the array to float16? Maybe that would cut the size enough, although I'm not sure the model will accept it.
You could try 'float16' instead of 'float'; this would nominally save 3/4 of the memory.
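Since cv2.normalize rejects float16 (hence the TypeError above), one possible sketch is to normalize with plain NumPy and cast afterwards. Note that dividing by 255 is a fixed rescale for uint8 images, not the per-image min-max that cv2.NORM_MINMAX performs:
import numpy as np

# Map uint8 pixels [0, 255] into [0.0, 1.0], then store as float16
unlabeled_img_array = (unlabeled_img_array / 255.0).astype(np.float16)
# 2 bytes per pixel instead of float64's 8 -- the nominal 3/4 saving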
