I am trying to attach or concatenate two numpy arrays with different dimensions. So far it is not going well.
So, as an example,
a = np.arange(0,4).reshape(1,4)
b = np.arange(0,3).reshape(1,3)
And I am trying
G = np.concatenate(a,b,axis=0)
I get an error because a and b do not have the same dimensions. The reason I need to concatenate a and b is that I am trying to solve a model recursively and the state space changes over time, so I need to feed the last value function in as an input to get the value function for the next time period, etc.:
for t in range(T-1,0,-1):
    VG,CG = findv(VT[-1])
    VT = np.append(VT,VG,axis=0)
    CT = np.append(CT,CG,axis=0)
But VT has a different shape from one time period to the next.
Does anyone know how to deal with VT and CT numpy arrays that keep changing dimension?
OK - thanks for the input ... I need the output to be of the following form:
G = [[0, 1, 2, 3],
     [0, 1, 2]]
So, if I write G[-1] I will get the last element,
[0, 1, 2].
I do not know whether that is a numpy array?
Thanks, Jesper.
In [71]: a,b,c = np.arange(0,4), np.arange(0,3), np.arange(0,7)
It's easy to put those arrays in a list, either all at once, or incrementally:
In [72]: [a,b,c]
Out[72]: [array([0, 1, 2, 3]), array([0, 1, 2]), array([0, 1, 2, 3, 4, 5, 6])]
In [73]: G =[a,b]
In [74]: G.append(c)
In [75]: G
Out[75]: [array([0, 1, 2, 3]), array([0, 1, 2]), array([0, 1, 2, 3, 4, 5, 6])]
We can make an object dtype array from that list.
In [76]: np.array(G)
Out[76]:
array([array([0, 1, 2, 3]), array([0, 1, 2]),
array([0, 1, 2, 3, 4, 5, 6])], dtype=object)
Be aware that sometimes this could produce a 2d array (if all subarrays were the same size) or an error; in newer numpy versions, building a ragged array without an explicit dtype=object raises an error. Usually it's better to stick with the list.
Repeated append or concatenate to an array is usually not recommended. It's trickier to do right, and slower when it does work.
But let's demonstrate:
In [80]: G = np.array([a,b])
In [81]: G
Out[81]: array([array([0, 1, 2, 3]), array([0, 1, 2])], dtype=object)
c gets 'expanded' with a simple concatenate:
In [82]: np.concatenate((G,c))
Out[82]:
array([array([0, 1, 2, 3]), array([0, 1, 2]), 0, 1, 2, 3, 4, 5, 6],
dtype=object)
Instead we need to wrap c in an object dtype array of its own:
In [83]: cc = np.array([None])
In [84]: cc[0] = c
In [85]: cc
Out[85]: array([array([0, 1, 2, 3, 4, 5, 6])], dtype=object)
In [86]: np.concatenate((G,cc))
Out[86]:
array([array([0, 1, 2, 3]), array([0, 1, 2]),
array([0, 1, 2, 3, 4, 5, 6])], dtype=object)
In general when we concatenate, the dtypes have to match, or at least be compatible. Here, all inputs need to be object dtype. The same would apply when joining compound dtypes (structured arrays). It's only when joining simple numeric dtypes (and strings) that we can ignore dtypes (provided we don't care about integers becoming floats, etc.).
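To make those dtype rules concrete, here is a minimal sketch (an editor's addition, not part of the original answer):
import numpy as np

# Numeric dtypes are promoted when concatenated:
ints = np.array([1, 2, 3])
floats = np.array([0.5, 1.5])
print(np.concatenate((ints, floats)).dtype)   # float64 - the ints were promoted

# Object dtype inputs must all be object dtype, prepared explicitly:
obj = np.empty(2, dtype=object)
obj[0], obj[1] = np.arange(4), np.arange(3)
cc = np.empty(1, dtype=object)                # wrap the new subarray the same way
cc[0] = np.arange(7)
print(np.concatenate((obj, cc)))              # three subarrays, dtype=object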
You can't really stack arrays with different dimensions or differently sized dimensions.
This is a list (kind of your desired output, if I understand correctly):
G = [[0, 1, 2, 3],
     [0, 1, 2]]
Transformed to numpy array:
G_np = np.array(G)
>>> G_np.shape
(2,)
>>> G_np
array([list([0, 1, 2, 3]), list([0, 1, 2])], dtype=object)
Solution in your case (based on your requirements):
a = np.arange(0,4)
b = np.arange(0,3)
G_npy = np.array([a,b])
>>> G_npy.shape
(2,)
>>> G_npy
array([array([0, 1, 2, 3]), array([0, 1, 2])], dtype=object)
>>> G_npy[-1]
array([0, 1, 2])
Edit: in relation to your question in the comment
I must admit I have no idea how to do it in the correct way.
But if a hacky way is OK (maybe it is the correct way), then:
G_npy = np.array([a,b])
G_npy = np.append(G_npy,None) # Allocate space for your new array
G_npy[-1] = np.arange(5) # populate the new space with new array
>>> G_npy
array([array([0, 1, 2, 3]), array([0, 1, 2]), array([0, 1, 2, 3, 4])],
dtype=object)
Or this way - but then there is no point in using numpy:
temp = [i for i in G_npy]
temp.append(np.arange(5))
G_npy = np.array(temp)
NOTE:
To be honest, I don't think numpy is good for collecting objects (lists like this).
If I were you, I would just keep appending to a real list and transform it to numpy at the end, as sketched below. But after all, I don't know your application, so I don't know what the best approach is.
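For instance, a minimal sketch of that pattern applied to the loop from the question (the findv here is just a stand-in that returns arrays of growing length; the real one is the asker's):
import numpy as np

def findv(v_prev):                 # stand-in for the asker's solver
    n = len(v_prev) + 1
    return np.arange(n), np.arange(n) * 0.5

T = 4
VT_list = [np.arange(2)]           # stand-in terminal value function
CT_list = []
for t in range(T-1, 0, -1):
    VG, CG = findv(VT_list[-1])    # previous result feeds the next step
    VT_list.append(VG)             # plain list append handles changing shapes
    CT_list.append(CG)
VT = np.array(VT_list, dtype=object)   # one conversion at the end, if needed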
Try this way:
import numpy as np
a = np.arange(4).reshape(2,2)
b = np.arange(6).reshape(2,3)
c = np.arange(8).reshape(2,4)
a
# array([[0, 1],
# [2, 3]])
b
# array([[0, 1, 2],
# [3, 4, 5]])
c
# array([[0, 1, 2, 3],
# [4, 5, 6, 7]])
np.hstack((a,b,c))
#array([[0, 1, 0, 1, 2, 0, 1, 2, 3],
# [2, 3, 3, 4, 5, 4, 5, 6, 7]])
Hope it helps.
Thanks
You are missing a pair of parentheses there: the arrays to join must be passed to np.concatenate as a tuple.
Please refer to the concatenate documentation below.
https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.concatenate.html
import numpy as np
a = np.arange(0,4).reshape(1,4)
b = np.arange(0,3).reshape(1,3)
c = np.concatenate((a,b), axis=1)  # axis=1 works because both reshaped arrays have a single row
The above will give you the concatenated output c as:
array([[0, 1, 2, 3, 0, 1, 2]])
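For contrast, a quick sketch (editor's addition) of why the original axis=0 attempt fails even with the tuple: concatenating on axis 0 requires the arrays to agree on axis 1, and shapes (1,4) and (1,3) do not:
import numpy as np
a = np.arange(0,4).reshape(1,4)
b = np.arange(0,3).reshape(1,3)
try:
    np.concatenate((a, b), axis=0)
except ValueError as e:
    print(e)   # reports the mismatch on axis 1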
Related
I am quite familiar with python programming but I found some strange cases where the following two lines of code provided different results (assuming that the two arrays are 2-dimensional):
A[:][:] = B[:][:]
and
A[:,:] = B[:,:]
I am wondering whether there is an explanation for this.
Any hint?
Example:
>>> x = numpy.array([[1, 2], [3, 4], [5, 6]])
>>> x
array([[1, 2],
[3, 4],
[5, 6]])
>>> x[1][1]
4 # expected behavior
>>> x[1,1]
4 # expected behavior
>>> x[:][1]
array([3, 4]) # huh?
>>> x[:,1]
array([2, 4, 6]) # expected behavior
Let's take a step back. Try this:
>>> x = np.arange(6)
>>> x
array([0, 1, 2, 3, 4, 5])
>>> x[:]
array([0, 1, 2, 3, 4, 5])
>>> x[:][:]
array([0, 1, 2, 3, 4, 5])
>>> x[:][:][:][:][:][:][:][:][:][:]
array([0, 1, 2, 3, 4, 5])
It looks like x[:] is equal to x. (In fact, for numpy arrays, x[:] returns a view of x rather than a copy, but it has the same contents.)
Therefore, x[:][1] == x[1].
Is this consistent with what we should expect? Why should x[:] behave like x itself? If you're familiar with slicing, these examples should clarify:
>>> x[0:4]
array([0, 1, 2, 3])
>>> x[0:6]
array([0, 1, 2, 3, 4, 5])
>>> x[0:]
array([0, 1, 2, 3, 4, 5])
>>> x[:]
array([0, 1, 2, 3, 4, 5])
We can omit the 0 and 6 and numpy will figure out what the maximum dimensions are for us.
Regarding the first part of your question: note that A = B[:, :] and A = B[...] only bind A to a view of B, so changes made through A also affect B. To create an actual copy of B, use either of:
A = B.copy()
A = np.copy(B)
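A quick check of the view-versus-copy distinction (editor's sketch):
import numpy as np

x = np.arange(6)
v = x[:]          # basic slicing returns a view sharing x's data
v[0] = 99
print(x[0])       # 99 - writing through the view changed x

c = x.copy()      # an explicit copy owns its own data
c[0] = -1
print(x[0])       # still 99 - x is unaffected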
I'm trying to extract columns of a scipy sparse column matrix, but the result is not stored as I'd expect. Here's what I mean:
In [77]: a = scipy.sparse.csc_matrix(np.ones([4, 5]))
In [78]: ind = np.array([True, True, False, False, False])
In [79]: b = a[:, ind]
In [80]: b.indices
Out[80]: array([3, 2, 1, 0, 3, 2, 1, 0], dtype=int32)
In [81]: a.indices
Out[81]: array([0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3], dtype=int32)
How come b.indices is not [0, 1, 2, 3, 0, 1, 2, 3]?
And since this behaviour is not the one I expect, is a[:, ind] not the correct way to extract columns from a csc matrix?
The indices are simply not sorted. You can either force the loop order by reversing a's rows, which is not that intuitive, or enforce sorted indices with sorted_indices() (you can also sort in place with sort_indices(); see the sketch after the output below, though I prefer casting). What I find funny is that the has_sorted_indices attribute does not always return a boolean, but mixes booleans with their integer representation.
a = scipy.sparse.csc_matrix(np.ones([4, 5]))
ind = np.array([True, True, False, False, False])
b = a[::-1, ind]
b2 = a[:, ind]
b3 = b2.sorted_indices()
b.indices
>>array([0, 1, 2, 3, 0, 1, 2, 3], dtype=int32)
b.has_sorted_indices
>>1
b2.indices
>>array([3, 2, 1, 0, 3, 2, 1, 0], dtype=int32)
b2.has_sorted_indices
>>0
b3.indices
>>array([0, 1, 2, 3, 0, 1, 2, 3], dtype=int32)
b3.has_sorted_indices
>>True
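As mentioned above, here is a small sketch of the in-place alternative (editor's addition):
import numpy as np
from scipy import sparse

a = sparse.csc_matrix(np.ones([4, 5]))
ind = np.array([True, True, False, False, False])
b = a[:, ind]
b.sort_indices()              # sorts the indices in place, no new matrix
print(b.indices)              # [0 1 2 3 0 1 2 3]
print(b.has_sorted_indices)   # truthy (bool or int, depending on version)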
csc and csr indices are not guaranteed to be sorted. I can't offhand find documentation to that effect, but the has_sorted_indices attribute and the sort_indices/sorted_indices methods suggest it (the docs quoted at the end confirm it).
In your case the order is the result of how the indexing is done. I found in previous SO questions that multicolumn indexing is performed with a matrix multiplication:
In [165]: a = sparse.csc_matrix(np.ones([4,5]))
In [166]: b = a[:,[0,1]]
In [167]: b.indices
Out[167]: array([3, 2, 1, 0, 3, 2, 1, 0], dtype=int32)
This indexing is equivalent to constructing a 'selection' matrix:
In [169]: I = sparse.csr_matrix(np.array([[1,0,0,0,0],[0,1,0,0,0]]).T)
In [171]: I.A
Out[171]:
array([[1, 0],
[0, 1],
[0, 0],
[0, 0],
[0, 0]], dtype=int32)
and doing this matrix multiplication:
In [172]: b1 = a * I
In [173]: b1.indices
Out[173]: array([3, 2, 1, 0, 3, 2, 1, 0], dtype=int32)
The order is the result of how the matrix multiplication was done. In fact a * a.T does the same reversal. We'd have to examine the multiplication code to know exactly why. Evidently the csc and csr calculation code doesn't require sorted indices, and doesn't bother to ensure the results are sorted.
https://docs.scipy.org/doc/scipy-0.19.1/reference/sparse.html#further-details
Further Details
CSR column indices are not necessarily sorted. Likewise for CSC row indices. Use the .sorted_indices() and .sort_indices() methods when sorted indices are required (e.g. when passing data to other libraries).
I have a huge training dataset with 4 classes. These classes are labeled non-consecutively. To be able to apply a sequential neural network the classes have to be relabeled so that the unique values in the classes are consecutive. In addition, at the end of the script I have to relabel them back to their old values.
I know how to relabel them with loops:
def relabel(old_classes, new_classes):
    indexes = [np.where(old_classes == np.unique(old_classes)[i]) for i in range(len(new_classes))]
    for i in range(len(new_classes)):
        old_classes[indexes[i]] = new_classes[i]
    return old_classes
>>> old_classes = np.array([0,1,2,6,6,2,6,1,1,0])
>>> new_classes = np.arange(len(np.unique(old_classes)))
>>> relabel(old_classes,new_classes)
array([0, 1, 2, 3, 3, 2, 3, 1, 1, 0])
But this isn't nice code and it takes quite a lot of time.
Any idea how to vectorize this relabeling?
To be clear, I also want to be able to relabel them back to their old values:
>>> relabeled_classes=np.array([0, 1, 2, 3, 3, 2, 3, 1, 1, 0])
>>> old_classes = np.array([0,1,2,6])
>>> relabel(relabeled_classes,old_classes )
array([0,1,2,6,6,2,6,1,1,0])
We can use the optional argument return_inverse with np.unique to get those unique sequential IDs/tags, like so -
unq_arr, unq_tags = np.unique(old_classes,return_inverse=1)
Index into unq_arr with unq_tags to retrieve back -
old_classes_retrieved = unq_arr[unq_tags]
Sample run -
In [69]: old_classes = np.array([0,1,2,6,6,2,6,1,1,0])
In [70]: unq_arr, unq_tags = np.unique(old_classes,return_inverse=1)
In [71]: unq_arr
Out[71]: array([0, 1, 2, 6])
In [72]: unq_tags
Out[72]: array([0, 1, 2, 3, 3, 2, 3, 1, 1, 0])
In [73]: old_classes_retrieved = unq_arr[unq_tags]
In [74]: old_classes_retrieved
Out[74]: array([0, 1, 2, 6, 6, 2, 6, 1, 1, 0])
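Wrapped up as the forward/backward relabelling the question asks for (an editor's sketch of the same technique):
import numpy as np

old_classes = np.array([0, 1, 2, 6, 6, 2, 6, 1, 1, 0])
unq_arr, new_classes = np.unique(old_classes, return_inverse=True)
print(new_classes)            # [0 1 2 3 3 2 3 1 1 0] - consecutive labels
print(unq_arr[new_classes])   # [0 1 2 6 6 2 6 1 1 0] - originals restored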
How can I append the last row of an array to itself?
something like:
x= np.array([(1,2,3,4,5)])
x= np.append(x, x[0], 1)
Also, could you explain why this way of working with vectors yields an error?
for i in range(3):
    x.append(0)
x
[0, 0, 0]
x= np.append(x, x[0],0)
Which way of working with vectors would be best? I am trying to work with 2D vectors as a matrix, keeping in mind that I would like to do some future matrix calculations like multiplication, etc.
In [3]: x=np.array([(1,2,3,4,5)])
In [4]: x
Out[4]: array([[1, 2, 3, 4, 5]])
In [5]: x=np.append(x,x[0],1)
...
ValueError: all the input arrays must have same number of dimensions
x is (1,5), x[0] is (5,) - one is 2d, the other 1d.
In [11]: x=np.vstack([x,x[0]])
In [12]: x
Out[12]:
array([[1, 2, 3, 4, 5],
[1, 2, 3, 4, 5]])
This works because vstack promotes x[0] to 2d, i.e. (1,5), so it can concatenate it with x.
In [16]: x=np.concatenate([x, np.atleast_2d(x[-1,:])])
In [17]: x.shape
Out[17]: (3, 5)
We can use concatenate (or append) by first expanding x[-1,:] to 2d.
But in general repeated concatenation is a slow way of building an array.
For a list, repeated append like this works. But it does not work for arrays: for one thing, an array does not have an append method, and the np.append function returns a new array rather than changing x in place.
In [19]: z=[]
In [20]: for i in range(3):
...: z.append(0)
...:
In [21]: z
Out[21]: [0, 0, 0]
Repeated append to a list is fine; repeated append to an array is slow. (A rough timing sketch follows the example below.)
In [25]: z=[]
In [26]: for i in range(3):
...: z.append(list(range(i,i+4)))
In [27]: z
Out[27]: [[0, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5]]
In [28]: np.array(z)
Out[28]:
array([[0, 1, 2, 3],
[1, 2, 3, 4],
[2, 3, 4, 5]])
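A rough timing sketch of the difference (editor's addition; exact numbers will vary by machine):
import numpy as np
import timeit

def grow_list(n):
    z = []
    for i in range(n):
        z.append(i)            # amortized O(1) per append
    return np.array(z)

def grow_array(n):
    z = np.array([], dtype=int)
    for i in range(n):
        z = np.append(z, i)    # allocates and copies the whole array each time
    return z

print(timeit.timeit(lambda: grow_list(1000), number=100))
print(timeit.timeit(lambda: grow_array(1000), number=100))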
>>> np.append(x,x[-1:],0)
array([[1, 2, 3, 4, 5],
[1, 2, 3, 4, 5]])
How about this:
np.append(arr=x, values=x[-1,None], axis=0)
#array([[1, 2, 3, 4, 5],
# [1, 2, 3, 4, 5]])
I want to find the differences between all values in a numpy array and collect them in a new list.
Example: a = [1,4,2,6]
result: newlist = [3,1,5,3,2,2,1,2,4,5,2,4]
i.e. for each value i of a, determine the differences between i and the values in the rest of the list.
At this point I have been unable to find a solution.
You can do this:
a = [1,4,2,6]
newlist = [abs(i-j) for i in a for j in a if i != j]
Output:
print(newlist)
[3, 1, 5, 3, 2, 2, 1, 2, 4, 5, 2, 4]
Note that i != j compares values, not positions, so if a contains duplicate values, the (zero) differences between those duplicates are skipped as well.
I believe what you are trying to do is to calculate the absolute differences between elements of the input list, excluding the self-differences. So, with that idea, this could be one vectorized approach (also known as array programming) -
# Input list
a = [1,4,2,6]
# Convert input list to a numpy array
arr = np.array(a)
# Calculate absolute differences between each element
# against all elements to give us a 2D array
sub_arr = np.abs(arr[:,None] - arr)
# Get diagonal indices for the 2D array
N = arr.size
rem_idx = np.arange(N)*(N+1)
# Remove the diagonal elements for the final output
out = np.delete(sub_arr,rem_idx)
Sample run to show the outputs at each step -
In [60]: a
Out[60]: [1, 4, 2, 6]
In [61]: arr
Out[61]: array([1, 4, 2, 6])
In [62]: sub_arr
Out[62]:
array([[0, 3, 1, 5],
[3, 0, 2, 2],
[1, 2, 0, 4],
[5, 2, 4, 0]])
In [63]: rem_idx
Out[63]: array([ 0, 5, 10, 15])
In [64]: out
Out[64]: array([3, 1, 5, 3, 2, 2, 1, 2, 4, 5, 2, 4])
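An equivalent masking variant (an editor's sketch, not from the original answer) that skips the index arithmetic:
import numpy as np

a = np.array([1, 4, 2, 6])
sub = np.abs(a[:, None] - a)              # all pairwise absolute differences
out = sub[~np.eye(a.size, dtype=bool)]    # boolean mask drops the zero diagonal
print(out)   # [3 1 5 3 2 2 1 2 4 5 2 4]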