I would like to generate a 2-dimensional array with NumPy, iterate over some variable a number of times, and fill the array with two float values that are calculated inside the for loop. Then I want to export it to a .csv file.
Technically I would like to do it like this:
max_array=8000
ARRAY=numpy.zeros( [max_array*2] , dtype=float)
ARRAY=numpy.arange(max_array*2).reshape((max_array,2))
for i in range(1,max_array):
    ######calculations here#######
    array[i,i]=[data1,data2]
numpy.savetxt("output.csv", numpy.asarray(ARRAY), delimiter=",")
Unfortunately it doesn't work. I am very clumsy with syntax, and the [,] brackets are probably the problem. I would be very thankful if somebody could fix my snippet.
import numpy as np
max_array = 4
ARRAY = np.arange(max_array*2).reshape((max_array,2))
You have created a 2-d array
>>> ARRAY
array([[0, 1],
[2, 3],
[4, 5],
[6, 7]])
>>>
ARRAY[i,i] indexes a single element in the array
>>> i = 0
>>> ARRAY[i,i]
0
>>> ARRAY[i,i] = 222
>>> ARRAY
array([[222, 1],
[ 2, 3],
[ 4, 5],
[ 6, 7]])
If you want to assign values to a row:
>>> ARRAY[0] = 99, 99
>>> ARRAY
array([[99, 99],
[ 2, 3],
[ 4, 5],
[ 6, 7]])
Or
>>> ARRAY[2,:] = 66, 66
>>> ARRAY
array([[99, 99],
[ 2, 3],
[66, 66],
[ 6, 7]])
The second value in the subscript indexes a column
>>> ARRAY[:, 1]
array([99, 3, 66, 7])
>>> ARRAY[:, 1] = 0
>>> ARRAY
array([[99, 0],
[ 2, 0],
[66, 0],
[ 6, 0]])
>>>
ARRAY[i,i] = data1, data2 tries to assign two values to a single element of the array, which is why you get an error.
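Putting this together for the loop in the question, a minimal sketch might look like the following. The calculations for data1 and data2 are placeholders invented for illustration, max_array is reduced to 5, and the file is written to the system temp directory rather than the working directory; adjust these to your case.

```python
import os
import tempfile

import numpy as np

max_array = 5  # small for illustration; the question uses 8000

# Allocate the (max_array, 2) float array once, up front.
ARRAY = np.zeros((max_array, 2), dtype=float)

for i in range(max_array):
    # Placeholder calculations (hypothetical); substitute the real ones.
    data1 = i * 0.5
    data2 = float(i ** 2)
    # Index the row only, so both values land in row i.
    ARRAY[i] = data1, data2

# Assumed output location: the temp directory, so the sketch does not
# clutter the working directory.
out_path = os.path.join(tempfile.gettempdir(), "output.csv")
np.savetxt(out_path, ARRAY, delimiter=",")
```

Note that range(max_array) starts at 0; the range(1, max_array) in the question would leave row 0 untouched.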
Numpy docs
Indexing, Slicing, Iterating
Indexing
I have an array of arrays:
x = array([array([[1, 2],
[3, 4]]),
array([[22, 4],
[ 9, 10],
[ 3, 2]])], dtype=object)
And I have a list of arrays with the same length, like:
xa = [array([11, 22]), array([33, 44])]
I would like to add, in pure numpy, each element of xa to the end or beginning of x, as follows:
In the end:
result = array([array([[ 1, 2],
[ 3, 4],
[11, 22]]),
array([[22, 4],
[ 9, 10],
[ 3, 2],
[33, 44]])], dtype=object)
In the beginning:
result = array([array([[11, 22],
[ 1, 2],
[ 3, 4]]),
array([[33, 44],
[22, 4],
[ 9, 10],
[ 3, 2]])], dtype=object)
*Numpy version = 1.9.3
Assuming you imported numpy like this:
import numpy as np
from numpy import array
(1) Not beautiful, but this would work:
result = array([np.vstack((x[0], xa[0])), np.vstack((x[1], xa[1]))])
or respectively:
result = array([np.vstack((xa[0], x[0])), np.vstack((xa[1], x[1]))])
(2) better with flexible lengths of the two arrays:
result = array([np.vstack((x[i], xa[i])) for i in range(len(x))])
result = array([np.vstack((xa[i], x[i])) for i in range(len(x))])
(3) a bit more pythonic way of handling this:
result = array([np.vstack((x_i, xa_i)) for (x_i, xa_i) in zip(x, xa)])
result = array([np.vstack((xa_i, x_i)) for (x_i, xa_i) in zip(x, xa)])
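A caveat for NumPy releases newer than the 1.9.3 mentioned in the question: as far as I know, building an array from sub-arrays of different shapes without dtype=object now raises an error instead of returning an object array. A sketch that sidesteps this by preallocating the object array (the helper name stack_each is made up for illustration):

```python
import numpy as np

# Rebuild the question's inputs; filling an empty object array avoids
# any attempt by NumPy to merge the sub-arrays into one block.
x = np.empty(2, dtype=object)
x[0] = np.array([[1, 2], [3, 4]])
x[1] = np.array([[22, 4], [9, 10], [3, 2]])
xa = [np.array([11, 22]), np.array([33, 44])]


def stack_each(blocks, rows, at_end=True):
    """Stack rows[k] below (at_end=True) or above (at_end=False) blocks[k]."""
    out = np.empty(len(blocks), dtype=object)
    for k, (block, row) in enumerate(zip(blocks, rows)):
        pair = (block, row) if at_end else (row, block)
        out[k] = np.vstack(pair)
    return out


appended = stack_each(x, xa)
prepended = stack_each(x, xa, at_end=False)
```

The loop is still a Python-level loop, just like the comprehensions above; with sub-arrays of different shapes there is no single vectorized call that I am aware of.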
I have a multi-dimensional array in Python where there may be a repeated integer within a vector in the array. For example:
array = [[1,2,3,4],
[2,9,12,4],
[5,6,7,8],
[6,8,12,13]]
I would like to completely remove the vectors that contain any element that has appeared previously. In this case, vector [2,9,12,4] and vector [6,8,12,13] should be removed because they have an element (2 and 6 respectively) that has appeared in a previous vector within that array. Note that [6,8,12,13] contains two elements that have appeared previously, so the code should be able to work with these scenarios as well.
The resulting array should end up being:
array = [[1,2,3,4],
[5,6,7,8]]
I thought I could achieve this with np.unique(array, axis=0), but I couldn't find another function that would take care of this particular uniqueness.
Any thoughts are appreciated.
You can work with an array that pairs each number with the index of the row it came from, sorted by number. It looks like this:
number_info = array([[ 0, 1],
[ 0, 2],
[ 1, 2],
[ 0, 3],
[ 0, 4],
[ 1, 4],
[ 2, 5],
[ 2, 6],
[ 3, 6],
[ 2, 7],
[ 2, 8],
[ 3, 8],
[ 1, 9],
[ 1, 12],
[ 3, 12],
[ 3, 13]])
It indicates that rows remove_idx = [2, 5, 8, 11, 14] of this array need to be removed, and they point to rows remove_rows = [1, 1, 3, 3, 3] of the original array. Now, the code:
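import numpy as np
array = np.asarray(array)  # the question defines a plain list
# Row index of every element, in the same order as array.ravel()
flat_idx = np.repeat(np.arange(array.shape[0]), array.shape[1])
number_info = np.transpose([flat_idx, array.ravel()])
# A stable sort keeps equal numbers ordered by their original row
number_info = number_info[np.argsort(number_info[:,1], kind="stable")]
# A number equal to its predecessor but from a later row is a repeat
remove_idx = np.where((np.diff(number_info[:,1])==0) &
                      (np.diff(number_info[:,0])>0))[0] + 1
remove_rows = number_info[remove_idx, 0]
output = np.delete(array, remove_rows, axis=0)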
Output:
array([[1, 2, 3, 4],
[5, 6, 7, 8]])
Here's a quick way to do it with a list comprehension and set intersections:
>>> array = [[1,2,3,4],
... [2,9,12,4],
... [5,6,7,8],
... [6,8,12,13]]
>>> [v for i, v in enumerate(array) if not any(set(a) & set(v) for a in array[:i])]
[[1, 2, 3, 4], [5, 6, 7, 8]]
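The comprehension above rebuilds sets for every pair of rows. For longer inputs, a single pass with a running set of seen values may be cheaper. This is a sketch with a hypothetical function name; like the comprehension, it also counts values from dropped rows as already seen.

```python
def drop_rows_with_seen_values(rows):
    """Keep only rows that share no value with any earlier row."""
    seen = set()
    kept = []
    for row in rows:
        if seen.isdisjoint(row):
            kept.append(row)
        # Record the values even when the row is dropped, so later rows
        # are compared against every earlier row, not only the kept ones.
        seen.update(row)
    return kept


array = [[1, 2, 3, 4],
         [2, 9, 12, 4],
         [5, 6, 7, 8],
         [6, 8, 12, 13]]
filtered = drop_rows_with_seen_values(array)
```

If dropped rows should not count as "seen", move the seen.update(row) call inside the if branch.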
Let's say I have a matrix like this:
import numpy as np
a = np.array([[1, 2, 3], [89, 43, 2], [12, -3, 4], [-2, 4, 7]])
array([[ 1, 2, 3],
[89, 43, 2],
[12, -3, 4],
[-2, 4, 7]])
and a vector that looks like this:
b = np.array([1, 2, 3])
If I now want to do an elementwise multiplication, I can simply do
c = a * b
and obtain
array([[ 1, 4, 9],
[89, 86, 6],
[12, -6, 12],
[-2, 8, 21]])
My question is: How can I do this kind of multiplication only for certain rows in my matrix? I currently do it like this:
E = a.copy()
# ignore these rows
ignInd = [1, 3]
for ind in range(a.shape[0]):
    if ind not in ignInd:
        E[ind, :] = a[ind, :] * b
The matrix E looks as desired (the rows 1 and 3 are the same as in a):
array([[ 1, 4, 9],
[89, 43, 2],
[12, -6, 12],
[-2, 4, 7]])
Can someone come up with a smarter solution than this?
It seems like you could just do the multiplication and then put back the original data where you want to ignore ...
>>> import numpy as np
>>> a = np.array([[1,2,3],[89,43,2],[12, -3, 4], [-2, 4, 7]])
>>> b = np.array([1,2,3])
>>> c = a * b
>>> ignInd = [1,3]
>>> c[ignInd, :]
array([[89, 86, 6],
[-2, 8, 21]])
>>> c[ignInd, :] = a[ignInd, :]
>>> c
array([[ 1, 4, 9],
[89, 43, 2],
[12, -6, 12],
[-2, 4, 7]])
You can index a NumPy array directly with another NumPy array. In your case, you have the indexes of rows you want to ignore, so you can build an array of indexes to include from this:
In [21]: ignInd = [1,3] #ignore these rows
In [22]: ind = np.array([i for i in range(a.shape[0]) if i not in ignInd])
In [23]: E2 = a.copy()
In [24]: E2[ind,:] = a[ind,:]*b
In [25]: E2
Out[25]:
array([[ 1, 4, 9],
[89, 43, 2],
[12, -6, 12],
[-2, 4, 7]])
EDIT: as @DSM comments, for large arrays it would be more efficient to build the index array using NumPy's vectorized methods, viz. ind = np.setdiff1d(np.arange(len(a)), ignInd), instead of the list comprehension used above.
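A sketch of that vectorized variant, reusing a, b and ignInd from the question:

```python
import numpy as np

a = np.array([[1, 2, 3], [89, 43, 2], [12, -3, 4], [-2, 4, 7]])
b = np.array([1, 2, 3])
ignInd = [1, 3]  # rows to leave untouched

# Row indices that are NOT in ignInd (setdiff1d returns them sorted and unique).
ind = np.setdiff1d(np.arange(len(a)), ignInd)

E2 = a.copy()
E2[ind] = a[ind] * b  # multiply only the selected rows
```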
You can use boolean indexing with np.in1d to select the rows that are not in the given list of indices. The implementation would look something like this:
E = a.copy()
mask = ~np.in1d(np.arange(a.shape[0]), ignInd)
E[mask] = a[mask]*b
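In newer NumPy versions, np.isin is the recommended replacement for np.in1d (which, to my knowledge, is deprecated as of NumPy 2.0). The same idea as a self-contained sketch, with an np.where variant that builds the result without an explicit copy:

```python
import numpy as np

a = np.array([[1, 2, 3], [89, 43, 2], [12, -3, 4], [-2, 4, 7]])
b = np.array([1, 2, 3])
ignInd = [1, 3]  # rows to leave untouched

# True for every row that should be multiplied.
mask = ~np.isin(np.arange(a.shape[0]), ignInd)

E = a.copy()
E[mask] = a[mask] * b

# Alternative: choose row by row between a * b and a.
# mask[:, np.newaxis] has shape (4, 1) and broadcasts across the columns.
E_alt = np.where(mask[:, np.newaxis], a * b, a)
```

The np.where variant computes a * b for every row and then discards the ignored ones, so it trades some extra arithmetic for brevity.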
For the 1st row in a
a[0] = a[0] * b
..and so on for the other rows.
This is an indirect indexing problem.
It can be solved with a list comprehension.
The question is whether, and if so how, it can be solved within numpy.
When
data.shape is (T,N)
and
c.shape is (T,K)
and each element of c is an int between 0 and N-1 inclusive, that is,
each element of c is intended to refer to a column number from data.
The goal is to obtain out where
out.shape = (T,K)
And for each i in 0..(T-1)
the row out[i] = [ data[i, c[i,0]] , ... , data[i, c[i,K-1]] ]
Concrete example:
data = np.array([\
[ 0, 1, 2],\
[ 3, 4, 5],\
[ 6, 7, 8],\
[ 9, 10, 11],\
[12, 13, 14]])
c = np.array([
[0, 2],\
[1, 2],\
[0, 0],\
[1, 1],\
[2, 2]])
out should be out = [[0, 2], [4, 5], [6, 6], [10, 10], [14, 14]]
The first row of out is [0,2] because the columns chosen are given by c's row 0, they are 0 and 2, and data[0] at columns 0 and 2 are 0 and 2.
The second row of out is [4,5] because the columns chosen are given by c's row 1, they are 1 and 2, and data[1] at columns 1 and 2 is 4 and 5.
Numpy fancy indexing doesn't seem to solve this in an obvious way because indexing data with c (e.g. data[c], np.take(data,c,axis=1) ) always produces a 3 dimensional array.
A list comprehension can solve it:
out = [ [data[rowidx,i1],data[rowidx,i2]] for (rowidx, (i1,i2)) in enumerate(c) ]
If K is 2, I suppose this is marginally OK. If K is variable, this is not so good.
The list comprehension has to be rewritten for each value K, because it unrolls the columns picked out of data by each row of c. It also violates DRY.
Is there a solution based entirely in numpy?
You can avoid loops with np.choose:
In [1]: %cpaste
Pasting code; enter '--' alone on the line to stop or use Ctrl-D.
data = np.array([\
[ 0, 1, 2],\
[ 3, 4, 5],\
[ 6, 7, 8],\
[ 9, 10, 11],\
[12, 13, 14]])
c = np.array([
[0, 2],\
[1, 2],\
[0, 0],\
[1, 1],\
[2, 2]])
--
In [2]: np.choose(c, data.T[:,:,np.newaxis])
Out[2]:
array([[ 0, 2],
[ 4, 5],
[ 6, 6],
[10, 10],
[14, 14]])
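As far as I know, np.choose is limited in how many choice arrays it accepts (here, the number of columns N), so for wide data a different route may be needed. Two alternatives that should give the same result are integer-array indexing with a broadcast row index, and np.take_along_axis (available in NumPy 1.15 and later):

```python
import numpy as np

data = np.array([[0, 1, 2],
                 [3, 4, 5],
                 [6, 7, 8],
                 [9, 10, 11],
                 [12, 13, 14]])
c = np.array([[0, 2],
              [1, 2],
              [0, 0],
              [1, 1],
              [2, 2]])

T = data.shape[0]

# Pair every column index in c with the index of the row it sits in.
# rows has shape (T, 1) and broadcasts against c's shape (T, K).
rows = np.arange(T)[:, np.newaxis]
out_fancy = data[rows, c]

# take_along_axis performs the same per-row lookup along axis 1.
out_take = np.take_along_axis(data, c, axis=1)
```

Both forms work for any K, because the row index is broadcast rather than written out per column.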
Here's one possible route to a general solution...
Create masks for data to select the values for each column of out. For example, the first mask could be achieved by writing:
>>> np.arange(3) == np.vstack(c[:,0])
array([[ True, False, False],
[False, True, False],
[ True, False, False],
[False, True, False],
[False, False, True]], dtype=bool)
>>> data[_]
array([ 0, 4, 6, 10, 14])
The mask to get the values for the second column of out: np.arange(3) == np.vstack(c[:,1]).
So, to get the out array...
>>> mask0 = np.arange(3) == np.vstack(c[:,0])
>>> mask1 = np.arange(3) == np.vstack(c[:,1])
>>> np.vstack((data[mask0], data[mask1])).T
array([[ 0, 2],
[ 4, 5],
[ 6, 6],
[10, 10],
[14, 14]])
Edit: Given arbitrary array widths K and N you could use a loop to create the masks, so the general construction of the out array might simply look like this:
np.vstack([data[np.arange(N) == np.vstack(c[:,i])] for i in range(K)]).T
Edit 2: A slightly neater solution (though still relying on a loop) is:
np.vstack([data[i][c[i]] for i in range(T)])
I'm trying to transform each element of a numpy array into an array itself (say, to interpret a greyscale image as a color image). In other words:
>>> my_ar = numpy.array((0,5,10))
[0, 5, 10]
>>> transformed = my_fun(my_ar) # In reality, my_fun() would do something more useful
array([
[ 0, 0, 0],
[ 5, 10, 15],
[10, 20, 30]])
>>> transformed.shape
(3, 3)
I've tried:
def my_fun_e(val):
return numpy.array((val, val*2, val*3))
my_fun = numpy.frompyfunc(my_fun_e, 1, 3)
but get:
my_fun(my_ar)
(array([[0 0 0], [ 5 10 15], [10 20 30]], dtype=object), array([None, None, None], dtype=object), array([None, None, None], dtype=object))
and I've tried:
my_fun = numpy.frompyfunc(my_fun_e, 1, 1)
but get:
>>> my_fun(my_ar)
array([[0 0 0], [ 5 10 15], [10 20 30]], dtype=object)
This is close, but not quite right -- I get an array of objects, not an array of ints.
Update 3! OK. I've realized that my example was too simple beforehand -- I don't just want to replicate my data in a third dimension, I'd like to transform it at the same time. Maybe this is clearer?
Does numpy.dstack do what you want? The first two indexes are the same as the original array, and the new third index is "depth".
>>> import numpy as N
>>> a = N.array([[1,2,3],[4,5,6],[7,8,9]])
>>> a
array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
>>> b = N.dstack((a,a,a))
>>> b
array([[[1, 1, 1],
[2, 2, 2],
[3, 3, 3]],
[[4, 4, 4],
[5, 5, 5],
[6, 6, 6]],
[[7, 7, 7],
[8, 8, 8],
[9, 9, 9]]])
>>> b[1,1]
array([5, 5, 5])
Use map to apply your transformation function to each element in my_ar:
import numpy
my_ar = numpy.array((0,5,10))
print(my_ar)
# list() is needed on Python 3, where map returns an iterator
transformed = numpy.array(list(map(lambda x: numpy.array((x, x*2, x*3)), my_ar)))
print(transformed)
print(transformed.shape)
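If the per-element transformation can be expressed as array arithmetic, as in the (val, val*2, val*3) example from the question, broadcasting avoids calling a Python function for every element. A sketch under that assumption (the name factors is made up here):

```python
import numpy as np

my_ar = np.array((0, 5, 10))

# One multiplier per output channel, mirroring (val, val*2, val*3).
factors = np.array([1, 2, 3])

# Append a trailing axis of length 1, then let broadcasting expand it
# to len(factors); the result has shape my_ar.shape + (3,).
transformed = my_ar[..., np.newaxis] * factors
```

Because the new axis is appended with an ellipsis, the same line would turn a 2-D greyscale image of shape (H, W) into an array of shape (H, W, 3).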
I propose:
numpy.resize(my_ar, (3,3)).transpose()
You can of course adapt the shape, e.g. to (my_ar.shape[0],)*2 or whatever you need.
Does this do what you want:
tile(my_ar, (1,1,3))