Say I have an array
np.zeros((4,2))
I have a list of values [4,3,2,1], which I want to assign to the following positions:
[(0,0),(1,1),(2,1),(3,0)]
How can I do that without using the for loop or flattening the array?
I can use fancy indexing to retrieve the values, but not to assign them.
======Update=========
Thanks to @hpaulj, I found the bug in my original code.
I used zeros_like to initialize the array, so it took the dtype of the array it was given (int in my case) and truncated the values I assigned. That made it look as if nothing had been assigned!
You can use tuple indexing:
>>> import numpy as np
>>> a = np.zeros((4,2))
>>> vals = [4,3,2,1]
>>> pos = [(0,0),(1,1),(2,0),(3,1)]
>>> rows, cols = zip(*pos)
>>> a[rows, cols] = vals
>>> a
array([[ 4., 0.],
[ 0., 3.],
[ 2., 0.],
[ 0., 1.]])
Here is a streamlined version of @wim's answer based on @hpaulj's comment. np.transpose automatically converts the Python list of tuples into a NumPy array and transposes it. tuple then converts the result into a tuple of index arrays, which works because a[rows, cols] is equivalent to a[(rows, cols)] in NumPy.
import numpy as np
a = np.zeros((4, 2))
vals = range(4)
indices = [(0, 0), (1, 1), (2, 0), (3, 1)]
a[tuple(np.transpose(indices))] = vals
print(a)
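The dtype pitfall described in the question's update can be reproduced in a few lines. This is a minimal sketch; the integer template array and the fractional values are made up for illustration and are not from the original post.

```python
import numpy as np

# Hypothetical integer template, standing in for whatever was passed to zeros_like.
template = np.arange(8).reshape(4, 2)
rows, cols = zip(*[(0, 0), (1, 1), (2, 0), (3, 1)])
vals = [0.4, 0.3, 0.2, 0.1]

a_int = np.zeros_like(template)  # inherits the integer dtype of the template
a_int[rows, cols] = vals         # fractions are truncated to 0 on assignment

a_float = np.zeros_like(template, dtype=float)  # request a float dtype explicitly
a_float[rows, cols] = vals

print(a_int.sum())    # 0: looks as if nothing was assigned
print(a_float.sum())  # approximately 1.0
```

Passing dtype=float (or starting from np.zeros, which defaults to float) avoids the silent truncation.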
I have the following numpy array arr_split:
import numpy as np
arr1 = np.array([[1.,2,3], [4,5,6], [7,8,9]])
arr_split = np.array_split(arr1,
indices_or_sections = 4,
axis = 0)
arr_split
Output:
[array([[1., 2., 3.]]),
array([[4., 5., 6.]]),
array([[7., 8., 9.]]),
array([], shape=(0, 3), dtype=float64)]
How do I remove the rows which are "empty" (i.e., the last row in the example above)? The array arr_split can have any number of "empty" rows; the example above just happens to have only one.
I have tried using list comprehension, as per below:
arr_split[[(arr_split[i].shape[0] != 0) for i in range(len(arr_split))]]
but this doesn't work because the list comprehension [(arr_split[i].shape[0] != 0) for i in range(len(arr_split))] part returns a list, when I actually just need the elements in the list to feed into arr_split[] as indices.
Does anyone know how I could fix this, or is there another way of doing this? If possible, I am looking for the easiest way, without too many loops or if statements.
You can change the indices_or_sections value to the length of the first axis; this prevents any empty arrays from being produced:
import numpy as np
arr1 = np.array([[1.,2,3], [4,5,6], [7,8,9]])
arr_split = np.array_split(arr1,
indices_or_sections = arr1.shape[0],
axis = 0)
arr_split
>>> [
array([[1., 2., 3.]]),
array([[4., 5., 6.]]),
array([[7., 8., 9.]])
]
Just loop through and check the size. Only add them to the new list if they have a size greater than 0.
arr_split_new = [arr for arr in arr_split if arr.size > 0]
You can use enumerate to get the indexes of the non-empty arrays by checking each array's size:
indexes = [idx for idx, v in enumerate(arr_split) if v.size != 0]
[0, 1, 2]
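As a quick check of the answers above, here is a small sketch that rebuilds the question's arr_split and filters it. np.array_split returns a plain Python list, which is why it cannot be indexed with a list of booleans and a comprehension is used instead. The name non_empty is only illustrative.

```python
import numpy as np

arr1 = np.array([[1., 2, 3], [4, 5, 6], [7, 8, 9]])
arr_split = np.array_split(arr1, 4, axis=0)  # a Python list of 4 arrays, the last one empty

# Keep only the arrays that contain at least one element.
non_empty = [arr for arr in arr_split if arr.size > 0]

print(type(arr_split).__name__)        # list
print(len(arr_split), len(non_empty))  # 4 3
```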
Here's the problem. Let's say I have a matrix A =
array([[ 1., 0., 2.],
[ 0., 0., 2.],
[ 0., -1., 3.]])
and a vector of indices p = array([0, 2, 1]). I want to turn the 3x3 matrix A into an array of length 3 (call it v) where v[j] = A[j, p[j]] for j = 0, 1, 2. I can do it the following way:
v = list(map(lambda pair: pair[0][pair[1]], zip(A, p)))
So for the above matrix A and vector of indices p I expect to get array([1, 2, -1]) (i.e., the 0th element of row 0, the 2nd element of row 1, and the 1st element of row 2).
But can I achieve the same result using native numpy (i.e., without explicitly zipping and then mapping)? Thanks.
I don't think that such a functionality exists. To achieve what you want, I can think of two easy ways. You could do:
np.diag(A[:, p])
Here the array p is applied as a column index for every row, so the elements you are looking for end up on the diagonal.
As an alternative, you can avoid producing a lot of unnecessary entries by using:
A[np.arange(A.shape[0]), p]
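The two suggestions above can be compared on the question's data. The sketch below also uses np.take_along_axis (available in NumPy 1.15 and later) as a third option; that function is not mentioned in the answer and is included only for comparison.

```python
import numpy as np

A = np.array([[1., 0., 2.],
              [0., 0., 2.],
              [0., -1., 3.]])
p = np.array([0, 2, 1])

v_diag = np.diag(A[:, p])              # builds a full 3x3 intermediate first
v_fancy = A[np.arange(A.shape[0]), p]  # pairs row j with column p[j] directly
v_take = np.take_along_axis(A, p[:, None], axis=1).ravel()

print(v_fancy.tolist())  # [1.0, 2.0, -1.0]
```

All three give the same values; the np.diag variant is the only one whose intermediate grows quadratically with the number of rows.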
I have a list which consists of several numpy arrays with different shapes.
I want to reshape this list of arrays into a numpy vector and then change each element in the vector and then reshape it back to the original list of arrays.
For example:
input
[numpy.zeros((2,2)), numpy.ones((3,3))]
First
To vector
[0,0,0,0,1,1,1,1,1,1,1,1,1]
Second
Change only one element each time, for example change the element at index 1 from 0 to 2
[0,2,0,0,1,1,1,1,1,1,1,1,1]
Last
convert it back to
[array([[0,2],[0,0]]),array([[1,1,1],[1,1,1],[1,1,1]])]
Is there any fast implementation? Thanks very much.
It seems like converting to a list and back will be inefficient. Instead, why not figure out which array to index (and where) and then just update that index? e.g.
def change_element(arr1, arr2, ix, value):
    which = ix >= arr1.size
    arr = [arr1, arr2][which]
    ix = ix - arr1.size if which else ix
    arr.ravel()[ix] = value
And here's some example usage:
>>> arr1 = np.zeros((2, 2))
>>> arr2 = np.ones((3, 3))
>>> change_element(arr1, arr2, 1, 2)
>>> change_element(arr1, arr2, 6, 3.14)
>>> arr1
array([[ 0., 2.],
[ 0., 0.]])
>>> arr2
array([[ 1. , 1. , 3.14],
[ 1. , 1. , 1. ],
[ 1. , 1. , 1. ]])
>>> change_element(arr1, arr2, 7, 3.14)
>>> arr1
array([[ 0., 2.],
[ 0., 0.]])
>>> arr2
array([[ 1. , 1. , 3.14],
[ 3.14, 1. , 1. ],
[ 1. , 1. , 1. ]])
A few notes -- This updates the arrays in place. It doesn't create new arrays. If you really need to create new arrays, I suppose you could np.copy them and return. Also, this relies on the arrays sharing memory before and after the ravel. I don't remember the exact circumstances where ravel would return a new array rather than a view into the original array . . .
Generalizing to more arrays is actually quite easy. We just need to walk down the list of arrays and see if ix is less than the array size. If it is, we've found our array. If it isn't, we need to subtract the array's size from ix to represent the number of elements we've traversed thus far:
def change_element(arrays, ix, value):
    for arr in arrays:
        if ix < arr.size:
            arr.ravel()[ix] = value
            return
        ix -= arr.size
And you can call this similar to before:
change_element([arr1, arr2], 6, 3.14159)
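On the question of when ravel returns a view: for a C-contiguous array it is a view, but for a non-contiguous one (a strided slice, for example) it is a copy, and assigning through it would leave the original untouched. Below is a sketch of a variant that sidesteps this by assigning through the .flat iterator, which writes into the array itself. The name change_element_flat and the IndexError at the end are additions made here, not part of the answer.

```python
import numpy as np

def change_element_flat(arrays, ix, value):
    """Set the ix-th element of the conceptual concatenation of `arrays`, in place."""
    for arr in arrays:
        if ix < arr.size:
            arr.flat[ix] = value  # writes into arr even when arr is not contiguous
            return
        ix -= arr.size
    raise IndexError("flat index out of range for the given arrays")

arr1 = np.zeros((2, 2))                       # contiguous: ravel() is a view
arr2 = np.arange(12.0).reshape(3, 4)[:, ::2]  # strided slice: ravel() is a copy

print(np.shares_memory(arr1, arr1.ravel()))  # True
print(np.shares_memory(arr2, arr2.ravel()))  # False

change_element_flat([arr1, arr2], 1, 2)     # lands in arr1[0, 1]
change_element_flat([arr1, arr2], 5, 3.14)  # lands in arr2[0, 1]
```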
@mgilson probably has the best answer for you, but if you absolutely have to convert to a flat list first and then go back again (perhaps because you need to do something else with the flat list as well), then you can do this with list comprehensions:
import numpy as np
lst = [np.zeros((2,2)), np.ones((3,3))]
tlist = [e for a in lst for e in a.ravel()]  # flatten everything into one Python list
tlist[1] = 2
i = 0
lst2 = []
dims = [a.shape for a in lst]  # remember the original shapes
for n, m in dims:
    lst2.append(np.array(tlist[i:i+n*m]).reshape(n,m))
    i += n*m
lst2
[array([[ 0., 2.],
[ 0., 0.]]), array([[ 1., 1., 1.],
[ 1., 1., 1.],
[ 1., 1., 1.]])]
Of course, you lose the information about your array sizes when you flatten, so you need to store them somewhere (here, in dims).
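If a flat vector is genuinely needed, the round trip can also stay inside NumPy. The following is a sketch under the assumption that all arrays share a dtype; it uses np.concatenate to flatten and np.split at the cumulative sizes to rebuild, and it works for arrays of any dimensionality. The names vec, sizes, shapes and pieces are illustrative.

```python
import numpy as np

lst = [np.zeros((2, 2)), np.ones((3, 3))]

shapes = [a.shape for a in lst]                 # needed to undo the flattening
sizes = [a.size for a in lst]
vec = np.concatenate([a.ravel() for a in lst])  # one flat copy of all the data

vec[1] = 2  # change a single element

# Cut the vector back into pieces at the cumulative sizes, then restore each shape.
pieces = np.split(vec, np.cumsum(sizes)[:-1])
lst2 = [piece.reshape(shape) for piece, shape in zip(pieces, shapes)]

print(lst2[0].tolist())  # [[0.0, 2.0], [0.0, 0.0]]
```

Note that vec is a copy, so the arrays in lst are left unchanged; only lst2 reflects the edit.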
Is there a filter similar to ndimage's generic_filter that supports vector output? I did not manage to make scipy.ndimage.filters.generic_filter return more than a scalar. Uncomment the line in the code below to get the error: TypeError: only length-1 arrays can be converted to Python scalars.
I'm looking for a generic filter that processes 2D or 3D arrays and returns a vector at each point. Thus the output would have one added dimension. For the example below I'd expect something like this:
a.shape # (10,10)
res.shape # (10,10,2)
Example Code
import numpy as np
from scipy import ndimage
a = np.ones((10, 10)) * np.arange(10)
footprint = np.array([[1,1,1],
[1,0,1],
[1,1,1]])
def myfunc(x):
    r = sum(x)
    #r = np.array([1,1]) # uncomment this
    return r
res = ndimage.generic_filter(a, myfunc, footprint=footprint)
The generic_filter expects myfunc to return a scalar, never a vector.
However, there is nothing that precludes myfunc from also adding information
to, say, a list which is passed to myfunc as an extra argument.
Instead of using the array returned by generic_filter, we can generate our vector-valued array by reshaping this list.
For example,
import numpy as np
from scipy import ndimage
a = np.ones((10, 10)) * np.arange(10)
footprint = np.array([[1,1,1],
[1,0,1],
[1,1,1]])
ndim = 2
def myfunc(x, out):
    r = np.arange(ndim, dtype='float64')
    out.extend(r)
    return 0
result = []
ndimage.generic_filter(
a, myfunc, footprint=footprint, extra_arguments=(result,))
result = np.array(result).reshape(a.shape+(ndim,))
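For comparison, here is a sketch that avoids generic_filter altogether and builds the neighbourhoods with NumPy's sliding_window_view (NumPy 1.20 or later). This is a different technique from the one in the answer, not a description of how generic_filter works. It assumes a 3x3 footprint and pads with mode="symmetric", which as far as I can tell matches scipy's default 'reflect' boundary handling; the choice of sum and max as the two output components is arbitrary.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

a = np.ones((10, 10)) * np.arange(10)
footprint = np.array([[1, 1, 1],
                      [1, 0, 1],
                      [1, 1, 1]], dtype=bool)

# Pad by one pixel on each side so every point has a full 3x3 neighbourhood.
padded = np.pad(a, 1, mode="symmetric")
windows = sliding_window_view(padded, footprint.shape)  # shape (10, 10, 3, 3)

# Select the 8 neighbours marked in the footprint: shape (10, 10, 8).
neigh = windows[..., footprint]

# Any vector-valued function of the neighbourhood can go here.
res = np.stack([neigh.sum(axis=-1), neigh.max(axis=-1)], axis=-1)

print(res.shape)  # (10, 10, 2)
```

The first component reproduces the neighbourhood sum from the question's myfunc, so it can be checked against the scalar generic_filter result.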
I think I get what you're asking, but I'm not completely sure how ndimage.generic_filter works (the source is rather abstruse!).
Here's a simple wrapper function. It takes an array plus all the parameters ndimage.generic_filter needs, and returns an array in which each element of the original array is represented by an array of shape (2,): the original value is stored as the first element and the result of the filter function as the second.
def generic_expand_filter(inarr, func, **kwargs):
    shape = inarr.shape
    res = np.empty(( shape+(2,) ))
    temp = ndimage.generic_filter(inarr, func, **kwargs)
    for row in range(shape[0]):
        for val in range(shape[1]):
            res[row][val][0] = inarr[row][val]
            res[row][val][1] = temp[row][val]
    return res
The output of this function is shown below, where res denotes the plain generic_filter result and res2 the generic_expand_filter result:
>>> a.shape #same as res.shape
(10, 10)
>>> res2.shape
(10, 10, 2)
>>> a[0]
array([ 0., 1., 2., 3., 4., 5., 6., 7., 8., 9.])
>>> res[0]
array([ 3., 8., 16., 24., 32., 40., 48., 56., 64., 69.])
>>> print(*res2[0], sep=", ") #this is just to avoid the vertical default output
[ 0. 3.], [ 1. 8.], [ 2. 16.], [ 3. 24.], [ 4. 32.], [ 5. 40.], [ 6. 48.], [ 7. 56.], [ 8. 64.], [ 9. 69.]
>>> a[0][0]
0.0
>>> res[0][0]
3.0
>>> res2[0][0]
array([ 0., 3.])
Of course, you probably don't want to keep the old array, but rather have both fields hold new results. I don't know exactly what you had in mind, but if the two values you want to store are unrelated, just add a temp2 and func2, call generic_filter a second time with the same **kwargs, and store that result as the first value.
However, if you want an actual vector quantity that is calculated from multiple inarr elements, meaning that the two new fields aren't independent, you will have to write that kind of function yourself: one that takes an array and the indices idx and idy, and returns a tuple/list/array value that you can then unpack and assign to the result.
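The double loop in generic_expand_filter only pairs each input value with its filtered value, so it can be replaced by a single np.stack call. A sketch, using a stand-in array in place of the generic_filter output so that it runs without scipy:

```python
import numpy as np

inarr = np.ones((10, 10)) * np.arange(10)
temp = inarr * 2  # stand-in for ndimage.generic_filter(inarr, func, **kwargs)

# Loop version, as in the wrapper.
res_loop = np.empty(inarr.shape + (2,))
for row in range(inarr.shape[0]):
    for col in range(inarr.shape[1]):
        res_loop[row, col, 0] = inarr[row, col]
        res_loop[row, col, 1] = temp[row, col]

# Equivalent vectorized version: stack along a new last axis.
res_stack = np.stack([inarr, temp], axis=-1)

print(res_stack.shape)                      # (10, 10, 2)
print(np.array_equal(res_loop, res_stack))  # True
```

The same call extends to two independent filter results: stack temp and a second filtered array instead of inarr and temp.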
I'm trying to move a few Matlab libraries that I've built to the python environment. So far, the biggest issue I faced is the dynamic allocation of arrays based on index specification. For example, using Matlab, typing the following:
x = [1 2];
x(5) = 3;
would result in:
x = [ 1 2 0 0 3]
In other words, I don't know beforehand the size of x, nor its contents. The array must be defined on the fly, based on the indices that I provide.
In python, trying the following:
from numpy import *
x = array([1,2])
x[4] = 3
would result in the following error: IndexError: index out of bounds. One workaround is to grow the array in a loop and then assign the desired value:
from numpy import *
x = array([1,2])
idx = 4
for i in range(size(x),idx+1):
    x = append(x,0)
x[idx] = 3
print(x)
It works, but it's not very convenient, and it might become very cumbersome for n-dimensional arrays. I thought about subclassing ndarray to achieve my goal, but I'm not sure it would work. Does anybody know of a better approach?
Thanks for the quick reply. I didn't know about the __setitem__ method (I'm fairly new to Python). I simply overrode it in a subclass of ndarray as follows:
import numpy as np
class marray(np.ndarray):
    def __setitem__(self, key, value):
        # Array properties
        nDim = np.ndim(self)
        dims = list(np.shape(self))
        # Requested Index
        if type(key)==int: key=key,
        nDim_rq = len(key)
        dims_rq = list(key)
        for i in range(nDim_rq): dims_rq[i]+=1
        # Provided indices match current array number of dimensions
        if nDim_rq==nDim:
            # Define new dimensions
            newdims = []
            for iDim in range(nDim):
                v = max([dims[iDim],dims_rq[iDim]])
                newdims.append(v)
            # Resize if necessary
            if newdims != dims:
                self.resize(newdims,refcheck=False)
        return super(marray, self).__setitem__(key, value)
And it works like a charm! However, I need to modify the above code so that __setitem__ allows changing the number of dimensions, following this request:
a = marray([0,0])
a[3,1,0] = 0
Unfortunately, when I try to use numpy functions such as
self = np.expand_dims(self,2)
the returned type is numpy.ndarray instead of __main__.marray. Any idea how I could make numpy functions return an marray when an marray is provided as input? I think it should be doable using __array_wrap__, but I could never find out exactly how. Any help would be appreciated.
Took the liberty of updating my old answer from Dynamic list that automatically expands. Think this should do most of what you need/want
class matlab_list(list):
    def __init__(self):
        def zero():
            while 1:
                yield 0
        self._num_gen = zero()
    def __setitem__(self,index,value):
        if isinstance(index, int):
            self.expandfor(index)
            return super(matlab_list,self).__setitem__(index,value)
        elif isinstance(index, slice):
            if index.stop<index.start:
                return super(matlab_list,self).__setitem__(index,value)
            else:
                self.expandfor(index.stop if abs(index.stop)>abs(index.start) else index.start)
                return super(matlab_list,self).__setitem__(index,value)
    def expandfor(self,index):
        rng = []
        if abs(index)>len(self)-1:
            if index<0:
                rng = range(abs(index)-len(self))
                for i in rng:
                    self.insert(0,next(self._num_gen))
            else:
                rng = range(abs(index)-len(self)+1)
                for i in rng:
                    self.append(next(self._num_gen))
# Usage
spec_list = matlab_list()
spec_list[5] = 14
This isn't quite what you want, but...
x = np.array([1, 2])
try:
    x[index] = value
except IndexError:
    oldsize = len(x) # will be trickier for multidimensional arrays; you'll need to use x.shape or something and take advantage of numpy's advanced slicing ability
    x = np.resize(x, index+1) # Python uses C-style 0-based indices
    x[oldsize:index] = 0 # You could also do x[oldsize:] = 0, but that would mean you'd be assigning to the final position twice.
    x[index] = value
>>> x = np.array([1, 2])
>>> x = np.resize(x, 5)
>>> x[2:5] = 0
>>> x[4] = 3
>>> x
array([1, 2, 0, 0, 3])
Due to how numpy stores the data linearly under the hood (though whether it stores as row-major or column-major can be specified when creating arrays), multidimensional arrays are pretty tricky here.
>>> x = np.array([[1, 2, 3], [4, 5, 6]])
>>> np.resize(x, (6, 4))
array([[1, 2, 3, 4],
[5, 6, 1, 2],
[3, 4, 5, 6],
[1, 2, 3, 4],
[5, 6, 1, 2],
[3, 4, 5, 6]])
You'd need to do this or something similar:
>>> y = np.zeros((6, 4))
>>> y[:x.shape[0], :x.shape[1]] = x
>>> y
array([[ 1., 2., 3., 0.],
[ 4., 5., 6., 0.],
[ 0., 0., 0., 0.],
[ 0., 0., 0., 0.],
[ 0., 0., 0., 0.],
[ 0., 0., 0., 0.]])
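Putting the zero-padding idea into a helper gives something close to the Matlab behaviour for any number of dimensions. This is only a sketch: the function name set_with_growth is made up, it handles non-negative integer indices only, and it does not add dimensions. When growth is needed it returns a new array; otherwise it modifies x in place and returns it.

```python
import numpy as np

def set_with_growth(x, index, value):
    """Return an array where x[index] = value, zero-padding x first if needed."""
    index = (index,) if isinstance(index, int) else tuple(index)
    if len(index) != x.ndim:
        raise ValueError("index must have one entry per dimension")
    new_shape = tuple(max(n, i + 1) for n, i in zip(x.shape, index))
    if new_shape != x.shape:
        grown = np.zeros(new_shape, dtype=x.dtype)
        grown[tuple(slice(0, n) for n in x.shape)] = x  # copy old data into the corner
        x = grown
    x[index] = value
    return x

x = set_with_growth(np.array([1, 2]), 4, 3)
print(x.tolist())  # [1, 2, 0, 0, 3]

y = set_with_growth(np.array([[1, 2, 3], [4, 5, 6]]), (3, 1), 9)
print(y.shape)  # (4, 3)
```

Because each growth step allocates and copies, calling this in a tight loop is quadratic; preallocating when an upper bound on the size is known is cheaper.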
A Python dict will work well as a sparse array. The main issue is that the syntax for initializing the sparse array is not as pretty:
listarray = [100,200,300]
dictarray = {0:100, 1:200, 2:300}
but after that the syntax for inserting or retrieving elements is the same
dictarray[5] = 2345
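A sketch of how the dict approach can mimic Matlab's implicit zeros: dict.get supplies a default for missing positions, and a dense array can be built when one is needed. The variable names follow the answer; the default of 0 and the dense conversion are assumptions carried over from the Matlab example in the question.

```python
import numpy as np

dictarray = {0: 100, 1: 200, 2: 300}
dictarray[5] = 2345  # no resizing needed; the key simply appears

print(dictarray.get(4, 0))  # 0: unset positions read as the default

# Convert to a dense NumPy array when one is required.
dense = np.zeros(max(dictarray) + 1, dtype=int)
for idx, val in dictarray.items():
    dense[idx] = val

print(dense.tolist())  # [100, 200, 300, 0, 0, 2345]
```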