In C/C++ we would declare a three-dimensional array using the following syntax:
long long dp[20][180][2];
memset(dp, -1, sizeof(dp));
My code:
import numpy as np
x = np.zeros((20,180,2))
How can we declare and initialize a three-dimensional array in Python?
If you want all the values initialized to -1 like in your memset example, then you'd want np.full instead of np.zeros:
import numpy as np
x = np.full((20,180,2), -1)
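If you also want to match the C element type (long long is typically 64-bit), you can pass an explicit dtype; a small sketch:
import numpy as np
# np.int64 mirrors C's long long on most platforms
x = np.full((20, 180, 2), -1, dtype=np.int64)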
How would I efficiently initialise a fixed-size pyarrow.ListArray from a suitably prepared numpy array?
The documentation of pyarrow.array indicates that a nested iterable input structure works, but in practice that does not work if the outer iterable is a numpy array:
import numpy as np
import pyarrow as pa
n = 1000
w = 3
data = np.arange(n*w,dtype="i2").reshape(-1,w)
# this works:
pa.array(list(data),pa.list_(pa.int16(),w))
# this fails:
pa.array(data,pa.list_(pa.int16(),w))
# -> ArrowInvalid: only handle 1-dimensional arrays
It seems ridiculous to split an input array that directly matches the Arrow specification into n separate arrays and then re-assemble it from there.
pyarrow.ListArray.from_arrays seems to require an offsets argument, which only has a meaning for variable-size lists.
I believe you are looking for pyarrow.FixedSizeListArray.from_arrays which, regrettably, appears undocumented (I went ahead and filed a JIRA ticket).
You'll want to flatten your numpy array to a contiguous 1-D array first.
import numpy as np
import pyarrow as pa
n = 10      # number of fixed-size lists
width = 3
# You could skip the initial reshape, but it is kept here to simulate real 2-D data
arr = np.arange(n * width, dtype="i2").reshape(-1, width)
arr.shape = -1  # flatten in place; requires a contiguous array
pa.FixedSizeListArray.from_arrays(arr, width)
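If your pyarrow version lacks FixedSizeListArray.from_arrays, a hedged fallback is the variable-size ListArray.from_arrays with explicitly constructed offsets; a sketch in which every list happens to have the same width:
import numpy as np
import pyarrow as pa
n, width = 10, 3
values = np.arange(n * width, dtype="i2")
# One offset per list boundary: 0, width, 2*width, ..., n*width
offsets = np.arange(0, (n + 1) * width, width, dtype="i4")
lists = pa.ListArray.from_arrays(offsets, pa.array(values))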
Let's say I have an existing array that I don't want to change, but want converted to a ctypes array and shared across all the worker processes later on.
The actual array I want shared is of shape 120,000 x 4, which is too large to type out here, so let's pretend it is much smaller and looks like this:
import numpy as np
import multiprocessing as mp
import ctypes
array_from_data = np.array([[275,174,190],
[494, 2292, 9103],
[10389,284,28],
[193,746,293]])
I have read other posts that discuss ctypes arrays and multiprocessing, like this one. However, the answers are not quite what I am looking for, because so far none of them are about converting an existing NumPy array.
My questions are the following:
1) How do I do a simple conversion from an existing NumPy array to a ctypes array?
2) How do I make the array shared among all the worker processes in a simple fashion?
Thank you in advance.
EDIT: spelling and some clarifications on the actual array
EDIT2: Apparently the OS itself affects how the multiprocessing behaves, so I need to specify it: my OS is Windows 10 64-bit.
The workaround I found months ago requires flattening the array into a 1-dimensional array first, even though I only understand half of what is going on under the hood.
The gist of the solution is to:
1) make a RawArray of the same size and dtype as the array we are trying to share
2) create a numpy array that uses the same memory location as the RawArray
3) fill the newly created numpy array with the elements
Workaround:
import ctypes
import multiprocessing as mp
import numpy as np
array_from_data = np.array([[275,174,190],
[494, 2292, 9103],
[10389,284,28],
[193,746,293]])
# Flatten so the data maps onto a flat RawArray
flattened_array1 = array_from_data.flatten(order='C')
flattened_array2 = np.array([1, 0, 1, 0, 1]).astype(bool)
flattened_array3 = np.array([1, 0, 1, 0, -10]).astype(np.float32)
# 1) a shared RawArray with matching C type and length
array_shared_in_multiprocessing1 = mp.RawArray(ctypes.c_int32, len(flattened_array1))
# 2) a numpy view over the same memory (no copy)
temp1 = np.frombuffer(array_shared_in_multiprocessing1, dtype=np.int32)
# 3) fill the shared buffer through the view
temp1[:] = flattened_array1
array_shared_in_multiprocessing2 = mp.RawArray(ctypes.c_bool, len(flattened_array2))
temp2 = np.frombuffer(array_shared_in_multiprocessing2, dtype=bool)
temp2[:] = flattened_array2
array_shared_in_multiprocessing3 = mp.RawArray(ctypes.c_float, len(flattened_array3))
temp3 = np.frombuffer(array_shared_in_multiprocessing3, dtype=np.float32)
temp3[:] = flattened_array3
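To actually use the shared buffer from worker processes (important on Windows, where children are spawned rather than forked), a common pattern is to hand the RawArray to a Pool initializer and re-wrap it in each worker. A minimal sketch; the worker functions and the 4x3 shape here are assumptions matching the example data:
def _init_worker(shared_arr):
    # Re-wrap the shared buffer in each worker process (no copy is made)
    global shared_np
    shared_np = np.frombuffer(shared_arr, dtype=np.int32).reshape(4, 3)
def _row_sum(i):
    return int(shared_np[i].sum())
if __name__ == '__main__':  # required on Windows so children can re-import safely
    with mp.Pool(processes=2, initializer=_init_worker,
                 initargs=(array_shared_in_multiprocessing1,)) as pool:
        print(pool.map(_row_sum, range(4)))  # row sums computed in the workers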
MATLAB code:
AP(queryIdx) = diff([0;recall]')*prec
My Python code:
AP[queryIdx] = np.dot(np.diff(np.concatenate(([[0]], recall), axis=0).transpose()),prec)
Variables (checked, and I am quite sure they are equivalent in Python and in MATLAB):
recall: 1000x1 np array (prints out as [[.], ..., [.]])
prec: 1000x1 np array
Results:
MATLAB: 0.1011
Python: 0.05263158
The only cause I can think of outside of the code is that Python uses more precision, but I doubt that would make such a large difference.
*Edit: There was a problem with my prec variable. The above code worked.
That code looks a bit messy. Try replacing it with this:
AP[queryIdx] = np.dot(np.diff(np.hstack([0, recall.ravel()])), prec.ravel())
In your post, you mentioned that you have a 1000 x 1 array for both recall and prec. I interpret this as a 2D array with a singleton second dimension. As such, you'd need to convert each back to a 1D array using ravel.
Now, np.hstack horizontally stacks 1D arrays together, so this appends a 0 at the front, then applies the diff operator, and then performs the dot product with prec.
One common gotcha that MATLAB coders have with numpy is the representation of 1D arrays. There is no such thing as the transpose of a 1D array: transposing it is a no-op, and 1D arrays behave like row vectors in most operations. If you explicitly want a column vector, you need to include an additional dimension of size 1 and then transpose it. Something like this:
r = v[None].T  # (n,) -> (1, n) -> (n, 1)
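A quick illustration of the no-op transpose versus the explicit column vector:
import numpy as np
v = np.arange(3)
print(v.T.shape)        # (3,)  -- transposing a 1D array does nothing
print(v[None].T.shape)  # (3, 1) -- an explicit column vector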
In any case, let's verify the results:
MATLAB
>> recall = (1:1000).';
>> prec = (1000:-1:1).';
>> diff([0; recall].')*prec
ans =
500500
Python (IPython)
In [1]: import numpy as np
In [2]: recall = np.arange(1,1001)
In [3]: prec = np.arange(1000,0,-1)
In [4]: np.dot(np.diff(np.hstack([0, recall.ravel()])), prec.ravel())
Out[4]: 500500
In my program I currently create a numpy array full of zeros and then loop through each element, replacing it with the desired value. Is there a more efficient way of doing this?
Below is an example of what I am doing. However, instead of an int I have a list for each row which needs to be put into the numpy array. Is there a way to replace whole rows, and is that more efficient?
import numpy as np
from tifffile import imsave
image = np.zeros((5, 2160, 2560), 'uint16')
num = 0
for pixel in np.nditer(image, op_flags=['readwrite']):
    pixel[...] = num  # assign through the iterator; a plain `pixel = num` would only rebind the name
    num += 1
imsave('multipage.tif', image)
Just assign to the whole row using slicing:
import numpy as np
from tifffile import imsave
list_of_rows = ...  # all items in the list should have the same length
image = np.zeros((len(list_of_rows), len(list_of_rows[0])), 'uint16')
for row_idx, row in enumerate(list_of_rows):
    image[row_idx, :] = row
imsave('multipage.tif', image)
Numpy slicing is extremely powerful and nice. I recommend reading through this documentation to get a feeling of what is possible.
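If the rows already live in a Python list, you can also skip the preallocation and loop entirely and let numpy build the array in one call; a sketch with made-up data:
import numpy as np
list_of_rows = [[1, 2, 3], [4, 5, 6]]  # hypothetical data; rows must share a length
image = np.asarray(list_of_rows, dtype='uint16')  # builds the whole array at once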
You could simply generate a vector of length 5*2160*2560 and reshape it to the image's shape. (Note that np.arange defaults to a platform integer dtype; pass dtype='uint16' to match the original image, though values above 65535 will wrap.)
image = np.arange(5 * 2160 * 2560)
image.shape = 5, 2160, -1
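The same construction as a single expression, letting reshape infer the last dimension:
image = np.arange(5 * 2160 * 2560).reshape(5, 2160, -1)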
For example, I have a variable which points to a vector containing many elements in memory. I want to copy the elements of the vector into a numpy array; what should I do other than copying them one by one? Thanks
I am assuming that your vector can be represented like this:
import array
x = array.array('l', [1, 3, 10, 5, 6])  # an array using Python's built-in array module
Casting it to a numpy array is then:
import numpy as np
y = np.array(x)
If the data is packed in a buffer in native float format:
a = np.frombuffer(buf, dtype=float, count=N)  # np.fromstring is deprecated in favour of frombuffer
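A quick usage sketch; the buffer here is just example bytes built with struct, which stands in for whatever produces your data:
import struct
import numpy as np
N = 4
buf = struct.pack('4d', 1.0, 2.0, 3.0, 4.0)  # four native doubles
a = np.frombuffer(buf, dtype=float, count=N)
print(a)  # [1. 2. 3. 4.]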