I have a list of arrays similar to lstB and want to pick a random collection of 2D arrays from it. The problem is that numpy does not treat the objects in these two lists the same way:
lstA = [numpy.array(0), numpy.array(1)]
lstB = [numpy.array([0,1]), numpy.array([1,0])]
print(numpy.random.choice(lstA)) # returns 0 or 1
print(numpy.random.choice(lstB)) # raises ValueError: must be 1-dimensional
Is there an elegant fix for this?
Let's call it semi-elegant...
# force 1d object array
swap = lstB[0]
lstB[0] = None
arrB = numpy.array(lstB)
# reinsert value
arrB[0] = swap
# and clean up
lstB[0] = swap
# draw
numpy.random.choice(arrB)
# array([1, 0])
Explanation: The problem you encountered appears to be that numpy, when converting the input list to an array, makes as deep an array as it can. Since all your list elements are sequences of the same length, the result is 2D. The hack shown here forces numpy to make a 1D array of object dtype instead by temporarily inserting an incompatible element.
However, I personally would not use this, because if you draw multiple subarrays with this method you'll get a 1D array of arrays, which is probably not what you want and is tedious to convert.
So I'd actually second what one of the comments recommends, i.e. draw ints and then use advanced indexing into np.array(lstB).
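A minimal sketch of that index-drawing approach, assuming the list elements all have the same shape so they stack cleanly; the sample size of 3 is an arbitrary choice for illustration:

```python
import numpy as np

lstB = [np.array([0, 1]), np.array([1, 0])]

# Stack the equally sized arrays into one 2D array of shape (2, 2).
arrB = np.array(lstB)

# Draw row indices instead of rows; size=3 is only an illustrative value.
idx = np.random.choice(len(lstB), size=3)

# Advanced (integer-array) indexing selects all chosen rows at once.
picked = arrB[idx]
print(picked.shape)  # (3, 2)
```

The result is a regular 2D array rather than a 1D array of arrays, so no conversion step is needed afterwards.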
I have a high-dimensional numpy array whose number of dimensions is not fixed. I need to retrieve a value using an index list whose length equals the number of dimensions of the array.
In other words, I need a function:
def get_value_by_list_index(target_array, index_list):
    # len(index_list) = target_array.ndim
    # target array can be in any dimension
    # return the element at index specified on index list
For example, for a 3-dimensional array data and a list [i1, i2, i3], the function should return data[i1][i2][i3].
Is there a good way to achieve this task?
If you know the ndarray actually contains a type that is just a number well representable by Python types:
source_array.item(*index_iterable)
will do the same.
If you need to work with ndarrays of more complex types that might not have a python built-in type representation, things are harder.
You could implement exactly what you sketch in your comment:
data[i1][i2][i3]
# note that I didn't like the name of your function
def get_value_by_index_iterable(source_array, index_iterable):
    subarray = source_array
    for index in index_iterable:
        subarray = subarray[index]
    return subarray
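As an alternative sketch (not from the answer above), numpy also accepts a tuple as a multi-dimensional index, so the loop can be replaced by one indexing step. The helper name below is hypothetical:

```python
import numpy as np

def get_value_by_index_tuple(source_array, index_iterable):
    # Hypothetical helper: a tuple index addresses one position per axis,
    # so data[(i1, i2, i3)] is the same element as data[i1][i2][i3].
    return source_array[tuple(index_iterable)]

data = np.arange(24).reshape(2, 3, 4)
value = get_value_by_index_tuple(data, [1, 2, 3])
print(value == data[1][2][3])  # True
```

Converting to a tuple matters here: indexing with a plain list would be treated as advanced indexing along the first axis instead.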
Last week our teacher asked us: when storing the integers from one to one hundred, what are the differences between using a list and using an ndarray? I have never used numpy before, so I searched for this question online.
But all my search results told me the only difference is in dimensions: an ndarray can store N-dimensional data, while a list stores one dimension. That doesn't satisfy me. Is it really that simple and I'm just overthinking it, or did I not find the right keywords to search for?
I need help.
There are several differences:
-You can append elements to a list, but you can't change the size of a `numpy.ndarray` without making a full copy.
-Lists can contain just about anything; in numpy arrays all the elements must have the same type.
-In practice, numpy arrays are faster for vectorized functions than mapping functions over lists.
-I think that modification time is not an issue, but iteration over the elements is.
-Numpy arrays have many array-related methods (`argmin`, `min`, `sort`, etc).
I prefer to use numpy arrays when I need to do mathematical operations (sum, average, array multiplication, etc) and lists when I need to iterate over items (strings, files, etc).
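A small sketch of these differences for the integers from one to one hundred; the variable names are made up for illustration:

```python
import numpy as np

numbers_list = list(range(1, 101))
numbers_array = np.arange(1, 101)

# A list can hold mixed types and grows by simple concatenation or append.
mixed = numbers_list + ["text"]

# An ndarray has a single dtype, and arithmetic is applied elementwise.
doubled_array = numbers_array * 2
# A list needs an explicit loop (or comprehension) for the same result.
doubled_list = [x * 2 for x in numbers_list]

# Note: `numbers_list * 2` repeats the list instead of doubling the values.
repeated = numbers_list * 2
print(len(repeated), doubled_array[:3], numbers_array.sum())
```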
A one-dimensional array is like one row of graph paper.
You can store one thing inside of each box
The following picture is an example of a 2-dimensional array
Two-dimensional arrays have rows and columns
I should have changed the numbers.
When I was drawing the picture I just copied the first row many times.
The numbers can be completely different on each row.
import numpy as np
lol = [[1, 2, 3], [4, 5, 6]]
# `lol` is a list of lists
arr_har = np.array(lol, np.int32)
print(type(arr_har)) # <class 'numpy.ndarray'>
print("BEFORE:")
print(arr_har)
# change the value in row 0 and column 2.
arr_har[0][2] = 999
print("\n\nAFTER arr_har[0][2] = 999:")
print(arr_har)
The following picture is an example of a 3-dimensional array
Summary/Conclusion:
A list in Python acts like a one-dimensional array.
ndarray is an abbreviation of "n-dimensional array" or "multi-dimensional array"
One difference between a Python list and an ndarray is that an ndarray can have 2 or more dimensions
I want to create a 2D array from multiple 1D arrays of shape (1,7680), placing them under each other to get a 2D array of shape (n,7680)
Any help would be appreciated
code
y=[]
t=0
movement=int(S*256)
if(S==0):
    movement=_SIZE_WINDOW
while data.shape[1]-(t*movement+_SIZE_WINDOW) > 0:
    for i in range(0, 22):
        start = t*movement
        stop = start+_SIZE_WINDOW
        signals[i,:]=data[i,start:stop]
        y=np.append(signals[i,:],y)
    t=t+1
If the shape of the array you want to create is well defined, the easiest and most efficient way is to create an empty array like this:
array_NxM = np.empty((N,M))
This will create an empty array with the desired shape, then you can fill the array by iterating through its elements.
Creating an array by appending 1D arrays is definitely not optimal, but an acceptable way to do so is to create a list, append the 1D arrays to it, and then convert the list to a numpy array like this:
array_NxM = []
for i in range(N):
    array_NxM.append(array_1xM)
array_NxM = np.array(array_NxM)
The worst way to do this is definitely to use np.append. If possible, always avoid appending to a numpy array, as each such operation makes a full copy of the array in memory.
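The preallocation idea could look roughly like this for sliding windows. This is a sketch under assumptions: `n_channels`, `window` and `step` are hypothetical stand-ins for the question's 22 channels, _SIZE_WINDOW and movement, and the window count includes every full window, which may differ by one from the loop condition in the question:

```python
import numpy as np

# Hypothetical small values so the result is easy to inspect.
n_channels, window, step = 3, 4, 2
data = np.arange(n_channels * 10).reshape(n_channels, 10)

# Count the full windows first, then allocate one row per (window, channel).
n_windows = (data.shape[1] - window) // step + 1
rows = np.empty((n_windows * n_channels, window), dtype=data.dtype)

for t in range(n_windows):
    start = t * step
    # Each block of n_channels rows holds one window across all channels.
    rows[t * n_channels:(t + 1) * n_channels] = data[:, start:start + window]

print(rows.shape)  # (12, 4)
```

Because the output is allocated once, no copies are made while filling it.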
I want to print the index of the row containing the minimum element of the matrix.
My matrix is matrix = [[22,33,44,55],[22,3,4,12],[34,6,4,5,8,2]]
and the code is
matrix = [[22,33,44,55],[22,3,4,12],[34,6,4,5,8,2]]
a = np.array(matrix)
buff_min = matrix.argmin(axis = 0)
print(buff_min) #index of the row containing the minimum element
min = np.array(matrix[buff_min])
print(str(min.min(axis=0))) #print the minimum of that row
print(min.argmin(axis = 0)) #index of the minimum
print(matrix[buff_min]) # print all row containing the minimum
after running, my result is
1
3
1
[22, 3, 4, 12]
The first number should be 2, because the minimum is 2 in the third list ([34,6,4,5,8,2]), but it returns 1. It returns 3 as the minimum of the matrix.
What's the error?
I am not sure which version of Python you are using; I tested it with Python 2.7 and 3.2. As mentioned, your syntax for argmin is not correct; it should be in the format
import numpy as np
np.argmin(array_name,axis)
Next, although Numpy knows about arrays of arbitrary objects, it's optimized for homogeneous arrays of numbers with fixed dimensions. If you really need arrays of arrays, better use a nested list. But depending on the intended use of your data, different data structures might be even better, e.g. a masked array if you have some invalid data points.
If you really want flexible Numpy arrays, use something like this:
np.array([[22,33,44,55],[22,3,4,12],[34,6,4,5,8,2]], dtype=object)
However this will create a one-dimensional array that stores references to lists, which means that you will lose most of the benefits of Numpy (vector processing, locality, slicing, etc.).
Also worth mentioning: if you can resize your numpy array, things might work. I haven't tested it, but conceptually that should be an easy solution. However, I would prefer to use a nested list for this kind of input matrix.
Does this work?
np.where(a == a.min())[0][0]
Note that all rows of the matrix need to contain the same number of elements.
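For the ragged matrix from the question, where the rows have different lengths, a plain-Python sketch avoids the numpy conversion entirely. The variable names are illustrative:

```python
matrix = [[22, 33, 44, 55], [22, 3, 4, 12], [34, 6, 4, 5, 8, 2]]

# Index of the row whose smallest element is the overall minimum.
row_index = min(range(len(matrix)), key=lambda r: min(matrix[r]))
row = matrix[row_index]

# Position of that minimum inside the row.
col_index = row.index(min(row))

print(row_index, col_index, row[col_index])  # 2 5 2
```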
This is basically what I am trying to do:
array = np.array() #initialize the array. This is where the error code described below is thrown
for i in xrange(?): #in the full version of this code, this loop goes through the length of a file. I won't know the length until I go through it. The point of the question is to see if you can build the array without knowing its exact size beforehand
    A = random.randint(0,10)
    B = random.randint(0,10)
    C = random.randint(0,10)
    D = random.randint(0,10)
    row = [A,B,C,D]
    array[i:]= row # this is supposed to add a row to the array with A,C,B,D as column values
This code doesn't work. First of all it complains: TypeError: Required argument 'object' (pos 1) not found. But I don't know the final size of the array.
Second, I know that last line is incorrect but I am not sure how to call this in python/numpy. So how can I do this?
A numpy array must be created with a fixed size. You can create a small one (e.g., one row) and then append rows one at a time, but that will be inefficient. There is no way to efficiently grow a numpy array gradually to an undetermined size. You need to decide ahead of time what size you want it to be, or accept that your code will be inefficient. Depending on the format of your data, you can possibly use something like numpy.loadtxt or various functions in pandas to read it in.
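As a sketch of the numpy.loadtxt route, assuming the file holds whitespace-separated numbers with the same number of columns on every line; io.StringIO stands in for a real file here and the contents are made up:

```python
import io
import numpy as np

# Hypothetical file contents; in practice you would pass a file path instead.
text = io.StringIO("1 2 3 4\n5 6 7 8\n9 10 0 1\n")

# loadtxt reads all rows and sizes the resulting array itself.
table = np.loadtxt(text, dtype=int)
print(table.shape)  # (3, 4)
```

This way the number of rows never has to be known before reading.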
Use a list of 1D numpy arrays, or a list of lists, and then convert it to a numpy 2D array (or use more nesting and get more dimensions if you need to).
import numpy as np
a = []
for i in range(5):
    a.append(np.array([1,2,3])) # or a.append([1,2,3])
a = np.asarray(a) # a list of 1D arrays (or lists) becomes a 2D array
print(a.shape)
print(a)