What is the difference in NumPy between [:][:] and [:,:]? - python

I am quite familiar with python programming but I found some strange cases where the following two lines of code provided different results (assuming that the two arrays are 2-dimensional):
A[:][:] = B[:][:]
and
A[:,:] = B[:,:]
I am wondering whether there is an explanation for this.
Any hint?
Example :
>>> x = numpy.array([[1, 2], [3, 4], [5, 6]])
>>> x
array([[1, 2],
[3, 4],
[5, 6]])
>>> x[1][1]
4 # expected behavior
>>> x[1,1]
4 # expected behavior
>>> x[:][1]
array([3, 4]) # huh?
>>> x[:,1]
array([2, 4, 6]) # expected behavior

Let's take a step back. Try this:
>>> x = np.arange(6)
>>> x
array([0, 1, 2, 3, 4, 5])
>>> x[:]
array([0, 1, 2, 3, 4, 5])
>>> x[:][:]
array([0, 1, 2, 3, 4, 5])
>>> x[:][:][:][:][:][:][:][:][:][:]
array([0, 1, 2, 3, 4, 5])
It looks like x[:] is equal to x. (Indeed, x[:] creates a view of x: a new array object with the same elements, sharing the same data.)
Therefore, x[:][1] == x[1].
Is this consistent with what we should expect? Why should x[:] contain the same elements as x? If you're familiar with slicing, these examples should clarify:
>>> x[0:4]
array([0, 1, 2, 3])
>>> x[0:6]
array([0, 1, 2, 3, 4, 5])
>>> x[0:]
array([0, 1, 2, 3, 4, 5])
>>> x[:]
array([0, 1, 2, 3, 4, 5])
We can omit the 0 and 6 and numpy will figure out what the maximum dimensions are for us.
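Regarding the first part of your question: basic slicing returns a view, so only the last of the following creates an independent copy of B:
A = B[:, :]     # view: A shares its data with B
A = B[...]      # view: A shares its data with B
A = np.copy(B)  # copy: A has its own data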
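To make the difference concrete, here is a small sketch using the 2d array from the question. It only uses standard NumPy calls (np.shares_memory, np.copy); the variable names row, col, view and copy are just illustrative.

```python
import numpy as np

x = np.array([[1, 2], [3, 4], [5, 6]])

# Chained indexing: x[:] is the whole array again, so [1] picks row 1.
row = x[:][1]
# Tuple indexing: the slice spans axis 0, the 1 picks column 1 on axis 1.
col = x[:, 1]
print(row)  # [3 4]
print(col)  # [2 4 6]

# Basic slicing returns a view; np.copy returns independent data.
view = x[:, :]
copy = np.copy(x)
print(np.shares_memory(x, view))  # True
print(np.shares_memory(x, copy))  # False

view[0, 0] = 99    # writes through to x
print(x[0, 0])     # 99
print(copy[0, 0])  # 1
```

So each extra [:] in a chain is a no-op on the result, while a comma inside one pair of brackets moves on to the next axis.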

Related

remove array() from return

I am creating a function that takes a tuple of two lists as data and returns the data sorted in increasing order with respect to the first list (this isn't very important to my question, but it gives context). Here is what I have:
def sort_data(data):
    """ (tuple) -> tuple
    data is a tuple of two lists.
    Returns a copy of the input tuple sorted in
    non-decreasing order with respect to the
    data[0]
    >>> sort_data(([5, 1, 7], [1, 2, 3]))
    ([1, 5, 7], [2, 1, 3])
    >>> sort_data(([2, 4, 8], [1, 2, 3]))
    ([2, 4, 8], [1, 2, 3])
    >>> sort_data( ([11, 4, -5], [1, 2, 3]))
    ([-5, 4, 11], [3, 2, 1])
    """
    ([a,b,c],[d,e,f]) = data
    x = [a,b,c]
    y = [d,e,f]
    xarray = np.array(x)
    yarray = np.array(y)
    x1 = np.argsort(xarray)
    xsort = (xarray[x1])
    ysort = (yarray[x1])
    #remove array()
    return ([xsort],[ysort])
This is working great, but the return value is slightly wrong. For example, I want this, as seen in my docstring:
>>> sort_data(([5, 1, 7], [1, 2, 3]))
([1, 5, 7], [2, 1, 3])
but instead I got this:
([array([1, 5, 7])], [array([2, 1, 3])])
How could I remove the array() so that I just have the two lists in a tuple as my return value? I tried to convert it to a tuple, but then it is two tuples, when I only want one.
In [78]: data = ([5, 1, 7], [1, 2, 3])
Since you are using argsort, you can sort both rows together:
Make an array from the list:
In [79]: arr = np.array(data)
In [80]: arr
Out[80]:
array([[5, 1, 7],
[1, 2, 3]])
sorting index:
In [81]: idx = np.argsort(arr[0])
In [82]: idx
Out[82]: array([1, 0, 2])
apply it to the columns:
In [83]: arr[:,idx]
Out[83]:
array([[1, 5, 7],
[2, 1, 3]])
make that array a list:
In [84]: arr[:,idx].tolist()
Out[84]: [[1, 5, 7], [2, 1, 3]]
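Putting those steps together, a minimal sketch of the whole function could look like this. It assumes both lists have the same length and compatible element types (np.array would upcast mixed ints and floats), and it wraps the result in tuple() to match the docstring in the question.

```python
import numpy as np

def sort_data(data):
    """Return both lists reordered by the sort order of data[0]."""
    arr = np.array(data)       # shape (2, n); assumes equal-length lists
    idx = np.argsort(arr[0])   # indices that sort the first row
    # Reorder the columns, then convert back to plain Python lists.
    return tuple(arr[:, idx].tolist())

print(sort_data(([5, 1, 7], [1, 2, 3])))    # ([1, 5, 7], [2, 1, 3])
print(sort_data(([11, 4, -5], [1, 2, 3])))  # ([-5, 4, 11], [3, 2, 1])
```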
Since you are given a tuple of lists, there should be a way of doing this sorting using Python's sorted and its key argument. But I haven't used that nearly as much as numpy.
I don't know if this is best or not:
In [11]: data = ([5, 1, 7], [1, 2, 3])
sort first list, recording the index as well:
In [12]: x=sorted([(v,i) for i,v in enumerate(data[0])], key=lambda x:x[0])
In [13]: x
Out[13]: [(1, 1), (5, 0), (7, 2)]
extract that index:
In [14]: idx = [i[1] for i in x]
In [15]: idx
Out[15]: [1, 0, 2]
use that to return both sublists:
In [16]: [[d[i] for i in idx] for d in data]
Out[16]: [[1, 5, 7], [2, 1, 3]]
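The same idea can be written a little more compactly by sorting the index positions directly. This is only a sketch; sort_data_py is a hypothetical name, and it assumes every sublist is at least as long as the first one.

```python
def sort_data_py(data):
    """Pure-Python variant: reorder every sublist by the order of data[0]."""
    keys = data[0]
    # Sort the positions 0..n-1 by the value found at each position.
    idx = sorted(range(len(keys)), key=keys.__getitem__)
    return tuple([d[i] for i in idx] for d in data)

print(sort_data_py(([5, 1, 7], [1, 2, 3])))  # ([1, 5, 7], [2, 1, 3])
```

Because sorted is stable, equal values in the first list keep their original relative order.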

Append numpy arrays with different dimensions

I am trying to attach or concatenate two numpy arrays with different dimensions. It does not look good so far.
So, as an example,
a = np.arange(0,4).reshape(1,4)
b = np.arange(0,3).reshape(1,3)
And I am trying
G = np.concatenate(a,b,axis=0)
I get an error as a and b are not the same dimension. The reason I need to concatenate a and b is that I am trying to solve a model recursively and the state space is changing over time. So I need to call the last value function as an input to get a value function for the next time period, etc.:
for t in range(T-1,0,-1):
    VG,CG = findv(VT[-1])
    VT = np.append(VT,VG,axis=0)
    CT = np.append(CT,CG,axis=0)
But, VT has a different dimension from the time period to the next.
Does anyone know how to deal with VT and CT numpy arrays that keep changing dimension?
OK - thanks for the input ... I need the output to be of the following form:
G = [[0, 1, 2, 3],
[0, 1, 2]]
So, if I write G[-1] I will get the last element,
[0,1,2].
I do not know if that is a numpy array?
Thanks, Jesper.
In [71]: a,b,c = np.arange(0,4), np.arange(0,3), np.arange(0,7)
It's easy to put those arrays in a list, either all at once, or incrementally:
In [72]: [a,b,c]
Out[72]: [array([0, 1, 2, 3]), array([0, 1, 2]), array([0, 1, 2, 3, 4, 5, 6])]
In [73]: G =[a,b]
In [74]: G.append(c)
In [75]: G
Out[75]: [array([0, 1, 2, 3]), array([0, 1, 2]), array([0, 1, 2, 3, 4, 5, 6])]
We can make an object dtype array from that list.
In [76]: np.array(G)
Out[76]:
array([array([0, 1, 2, 3]), array([0, 1, 2]),
array([0, 1, 2, 3, 4, 5, 6])], dtype=object)
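Be aware that sometimes this could produce a 2d array (if all subarrays were the same size), or an error; recent NumPy versions (1.24 and later) raise a ValueError for ragged input like this unless dtype=object is passed explicitly. Usually it's better to stick with the list.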
Repeated append or concatenate to an array is usually not recommended. It's trickier to do right, and slower when it does work.
But let's demonstrate:
In [80]: G = np.array([a,b])
In [81]: G
Out[81]: array([array([0, 1, 2, 3]), array([0, 1, 2])], dtype=object)
c gets 'expanded' with a simple concatenate:
In [82]: np.concatenate((G,c))
Out[82]:
array([array([0, 1, 2, 3]), array([0, 1, 2]), 0, 1, 2, 3, 4, 5, 6],
dtype=object)
Instead we need to wrap c in an object dtype array of its own:
In [83]: cc = np.array([None])
In [84]: cc[0]= c
In [85]: cc
Out[85]: array([array([0, 1, 2, 3, 4, 5, 6])], dtype=object)
In [86]: np.concatenate((G,cc))
Out[86]:
array([array([0, 1, 2, 3]), array([0, 1, 2]),
array([0, 1, 2, 3, 4, 5, 6])], dtype=object)
In general when we concatenate, the dtypes have to match, or at least be compatible. Here, all inputs need to be object dtype. The same would apply when joining compound dtypes (structured arrays). It's only when joining simple numeric dtypes (and strings) that we can ignore dtypes (provided we don't care about integers becoming floats, etc).
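As a rough sketch of the list-first approach for the recursive setting in the question: collect the per-period arrays in a plain list and, only if an object array is really needed, preallocate one and fill it element by element. The sizes (4, 3, 7) and the names rows and G are illustrative stand-ins for whatever the model produces at each step.

```python
import numpy as np

rows = []                      # plain list: entries may differ in length
for n in (4, 3, 7):            # stand-in for the per-period results
    rows.append(np.arange(n))

print(rows[-1])                # [0 1 2 3 4 5 6]

# Optional: build a 1d object array without relying on np.array's guessing.
G = np.empty(len(rows), dtype=object)
for i, r in enumerate(rows):
    G[i] = r                   # each slot holds one whole array
print(G.shape)                 # (3,)
```

Filling a preallocated object array avoids the case where equal-length rows would silently turn into a 2d numeric array.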
You can't really stack arrays with different numbers of dimensions or different dimension sizes.
This is a list (roughly your desired output, if I understand correctly):
G = [[0, 1, 2, 3],
[0, 1, 2]]
Transformed to numpy array:
G_np = np.array(G)
>>> G_np.shape
(2,)
>>> G_np
array([list([0, 1, 2, 3]), list([0, 1, 2])], dtype=object)
>>>
Solution in your case (based on your requirements):
a = np.arange(0,4)
b = np.arange(0,3)
G_npy = np.array([a,b])
>>> G_npy.shape
(2,)
>>> G_npy
array([array([0, 1, 2, 3]), array([0, 1, 2])], dtype=object)
>>> G_npy[-1]
array([0, 1, 2])
Edit: In relation to your question in the comment:
I must admit I have no idea how to do it the correct way.
But if a hacky way is OK (maybe it is the correct way), then:
G_npy = np.array([a,b])
G_npy = np.append(G_npy,None) # Allocate space for your new array
G_npy[-1] = np.arange(5) # populate the new space with new array
>>> G_npy
array([array([0, 1, 2, 3]), array([0, 1, 2]), array([0, 1, 2, 3, 4])],
dtype=object)
>>>
Or this way - but then, there is no point in using numpy
temp = [i for i in G_npy]
temp.append(np.arange(5))
G_npy = np.array(temp)
NOTE:
To be honest, I don't think numpy is good for collecting objects (lists like this).
If I were you, I would just keep appending to a real list and convert it to numpy at the end. But after all, I don't know your application, so I don't know what the best approach is.
Try this way:
import numpy as np
a = np.arange(4).reshape(2,2)
b = np.arange(6).reshape(2,3)
c = np.arange(8).reshape(2,4)
a
# array([[0, 1],
# [2, 3]])
b
# array([[0, 1, 2],
# [3, 4, 5]])
c
# array([[0, 1, 2, 3],
# [4, 5, 6, 7]])
np.hstack((a,b,c))
#array([[0, 1, 0, 1, 2, 0, 1, 2, 3],
# [2, 3, 3, 4, 5, 4, 5, 6, 7]])
Hope it helps.
Thanks
You are missing a pair of parentheses there: np.concatenate expects the arrays as a single sequence argument.
Please refer to the concatenate documentation below.
https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.concatenate.html
import numpy as np
a = np.arange(0,4).reshape(1,4)
b = np.arange(0,3).reshape(1,3)
c = np.concatenate((a,b), axis=1) #axis 1 as you have reshaped the numpy array
The above will give you the concatenated output c as:
array([[0, 1, 2, 3, 0, 1, 2]])
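A short sketch of why the axis matters here: with shapes (1, 4) and (1, 3), the arrays agree on axis 0 but not on axis 1, so they can only be joined along axis 1. The flag axis0_failed is just an illustrative name.

```python
import numpy as np

a = np.arange(0, 4).reshape(1, 4)
b = np.arange(0, 3).reshape(1, 3)

# Joining along axis 1 works: all other axes (here axis 0) match.
c = np.concatenate((a, b), axis=1)
print(c.shape)  # (1, 7)

# Joining along axis 0 fails: the row lengths 4 and 3 differ.
axis0_failed = False
try:
    np.concatenate((a, b), axis=0)
except ValueError as exc:
    axis0_failed = True
    print("axis=0 fails:", exc)
```

Note that this gives one flat row of seven values, not the two ragged rows asked for in the question.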

How to append last row from 2D array in Python

How can I append the last row of an array to itself ?
something like:
x= np.array([(1,2,3,4,5)])
x= np.append(x, x[0], 1)
Also, could you explain why this way of working with vectors yields an error?
for i in range(3):
    x.append(0)
x
[0, 0, 0]
x= np.append(x, x[0],0)
Which way of working with vectors would be best? I am trying to work with 2D vectors as a matrix, keeping in mind that I would like to do some matrix calculations later, like multiplication.
In [3]: x=np.array([(1,2,3,4,5)])
In [4]: x
Out[4]: array([[1, 2, 3, 4, 5]])
In [5]: x=np.append(x,x[0],1)
...
ValueError: all the input arrays must have same number of dimensions
x is (1,5), x[0] is (5,) - one is 2d, the other 1d.
In [11]: x=np.vstack([x,x[0]])
In [12]: x
Out[12]:
array([[1, 2, 3, 4, 5],
[1, 2, 3, 4, 5]])
this works because vstack changes the x[0] to 2d, e.g. (1,5), so it can concatenate it with x.
In [16]: x=np.concatenate([x, np.atleast_2d(x[-1,:])])
In [17]: x.shape
Out[17]: (3, 5)
We can use concatenate (or append) by first expanding x[-1,:] to 2d.
But in general repeated concatenation is a slow way of building an array.
For a list, repeated append like this works. But it does not work for arrays. For one thing, an array does not have an append method. And np.append function returns a new array. It does not change x in place.
In [19]: z=[]
In [20]: for i in range(3):
    ...:     z.append(0)
    ...:
In [21]: z
Out[21]: [0, 0, 0]
Repeated append to a list is fine. Repeated append to an array is slow.
In [25]: z=[]
In [26]: for i in range(3):
    ...:     z.append(list(range(i,i+4)))
In [27]: z
Out[27]: [[0, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5]]
In [28]: np.array(z)
Out[28]:
array([[0, 1, 2, 3],
[1, 2, 3, 4],
[2, 3, 4, 5]])
>>> np.append(x,x[-1:],0)
array([[1, 2, 3, 4, 5],
[1, 2, 3, 4, 5]])
How about this:
np.append(arr=x, values=x[-1,None], axis=0)
#array([[1, 2, 3, 4, 5],
# [1, 2, 3, 4, 5]])
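The reason the slice-based answers work can be seen from the shapes alone. A minimal sketch, using the x from the question (y is just an illustrative name for the result):

```python
import numpy as np

x = np.array([(1, 2, 3, 4, 5)])

print(x[-1].shape)   # (5,)   an integer index drops the row axis
print(x[-1:].shape)  # (1, 5) a slice keeps the array 2d

# Both inputs are 2d now, so they can be joined along axis 0.
y = np.concatenate([x, x[-1:]], axis=0)
print(y.shape)       # (2, 5)
print(x.shape)       # (1, 5) x itself is unchanged; a new array is returned
```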

How can I pad and/or truncate a vector to a specified length using numpy?

I have couple of lists:
a = [1,2,3]
b = [1,2,3,4,5,6]
which are of variable length.
I want to return a vector of length five, such that if the input list length is < 5 then it will be padded with zeros on the right, and if it is > 5, then it will be truncated at the 5th element.
For example, input a would return np.array([1,2,3,0,0]), and input b would return np.array([1,2,3,4,5]).
I feel like I ought to be able to use np.pad, but I can't seem to follow the documentation.
This might be slow or fast, I am not sure, however it works for your purpose.
In [22]: pad = lambda a,i : a[0:i] if len(a) > i else a + [0] * (i-len(a))
In [23]: pad([1,2,3], 5)
Out[23]: [1, 2, 3, 0, 0]
In [24]: pad([1,2,3,4,5,6,7], 5)
Out[24]: [1, 2, 3, 4, 5]
np.pad is overkill, better for adding a border all around a 2d image than adding some zeros to a list.
I like the zip_longest, especially if the inputs are lists, and don't need to be arrays. It's probably the closest you'll find to code that operates on all lists at once in compiled code.
a, b = zip(*list(itertools.zip_longest(a, b, fillvalue=0)))
is a version that does not use np.array at all (saving some array overhead). In Python 2 the function is called itertools.izip_longest.
But by itself it does not truncate. It still needs something like [x[:5] for x in (a,b)].
Here's my variation on all_ms function, working with a simple list or 1d array:
def foo_1d(x, n=5):
    x = np.asarray(x)
    assert x.ndim==1
    s = np.min([x.shape[0], n])
    ret = np.zeros((n,), dtype=x.dtype)
    ret[:s] = x[:s]
    return ret
In [772]: [foo_1d(x) for x in [[1,2,3], [1,2,3,4,5], np.arange(10)[::-1]]]
Out[772]: [array([1, 2, 3, 0, 0]), array([1, 2, 3, 4, 5]), array([9, 8, 7, 6, 5])]
One way or other the numpy solutions do the same thing - construct a blank array of the desired shape, and then fill it with the relevant values from the original.
One other detail - when truncating the solution could, in theory, return a view instead of a copy. But that requires handling that case separately from a pad case.
If the desired output is a list of equal-length arrays, it may be worthwhile collecting them in a 2d array.
In [792]: def foo1(x, out):
     ...:     x = np.asarray(x)
     ...:     s = np.min((x.shape[0], out.shape[0]))
     ...:     out[:s] = x[:s]
In [794]: lists = [[1,2,3], [1,2,3,4,5], np.arange(10)[::-1], []]
In [795]: ret=np.zeros((len(lists),5),int)
In [796]: for i,xx in enumerate(lists):
     ...:     foo1(xx, ret[i,:])
In [797]: ret
Out[797]:
array([[1, 2, 3, 0, 0],
[1, 2, 3, 4, 5],
[9, 8, 7, 6, 5],
[0, 0, 0, 0, 0]])
Pure python version, where a is a python list (not a numpy array): a[:n] + [0,]*(n-len(a)).
For example:
In [42]: n = 5
In [43]: a = [1, 2, 3]
In [44]: a[:n] + [0,]*(n - len(a))
Out[44]: [1, 2, 3, 0, 0]
In [45]: a = [1, 2, 3, 4]
In [46]: a[:n] + [0,]*(n - len(a))
Out[46]: [1, 2, 3, 4, 0]
In [47]: a = [1, 2, 3, 4, 5]
In [48]: a[:n] + [0,]*(n - len(a))
Out[48]: [1, 2, 3, 4, 5]
In [49]: a = [1, 2, 3, 4, 5, 6]
In [50]: a[:n] + [0,]*(n - len(a))
Out[50]: [1, 2, 3, 4, 5]
Function using numpy:
In [121]: def tosize(a, n):
   .....:     a = np.asarray(a)
   .....:     x = np.zeros(n, dtype=a.dtype)
   .....:     m = min(n, len(a))
   .....:     x[:m] = a[:m]
   .....:     return x
   .....:
In [122]: tosize([1, 2, 3], 5)
Out[122]: array([1, 2, 3, 0, 0])
In [123]: tosize([1, 2, 3, 4], 5)
Out[123]: array([1, 2, 3, 4, 0])
In [124]: tosize([1, 2, 3, 4, 5], 5)
Out[124]: array([1, 2, 3, 4, 5])
In [125]: tosize([1, 2, 3, 4, 5, 6], 5)
Out[125]: array([1, 2, 3, 4, 5])
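For completeness, since the question asked about np.pad: it can handle the 1d case if the input is truncated first, so that the pad width is never negative. This is a sketch only; pad_or_truncate is a hypothetical name.

```python
import numpy as np

def pad_or_truncate(a, n):
    """Return a 1d array of length n: truncated, or zero-padded on the right."""
    a = np.asarray(a)[:n]            # truncate first, so len(a) <= n
    # pad_width (0, k) means: nothing on the left, k zeros on the right.
    return np.pad(a, (0, n - len(a)), mode="constant")

print(pad_or_truncate([1, 2, 3], 5))           # [1 2 3 0 0]
print(pad_or_truncate([1, 2, 3, 4, 5, 6], 5))  # [1 2 3 4 5]
```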

Finding differences between all values in a List

I want to find the differences between all values in a numpy array and append it to a new list.
Example: a = [1,4,2,6]
result : newlist= [3,1,5,3,2,2,1,2,4,5,2,4]
i.e., for each value i of a, determine the difference between i and each of the other values in the list.
At this point I have been unable to find a solution
You can do this:
a = [1,4,2,6]
newlist = [abs(i-j) for i in a for j in a if i != j]
Output:
print(newlist)
[3, 1, 5, 3, 2, 2, 1, 2, 4, 5, 2, 4]
I believe what you are trying to do is to calculate absolute differences between elements of the input list, but excluding the self-differences. So, with that idea, this could be one vectorized approach also known as array programming -
# Input list
a = [1,4,2,6]
# Convert input list to a numpy array
arr = np.array(a)
# Calculate absolute differences between each element
# against all elements to give us a 2D array
sub_arr = np.abs(arr[:,None] - arr)
# Get diagonal indices for the 2D array
N = arr.size
rem_idx = np.arange(N)*(N+1)
# Remove the diagonal elements for the final output
out = np.delete(sub_arr,rem_idx)
Sample run to show the outputs at each step -
In [60]: a
Out[60]: [1, 4, 2, 6]
In [61]: arr
Out[61]: array([1, 4, 2, 6])
In [62]: sub_arr
Out[62]:
array([[0, 3, 1, 5],
[3, 0, 2, 2],
[1, 2, 0, 4],
[5, 2, 4, 0]])
In [63]: rem_idx
Out[63]: array([ 0, 5, 10, 15])
In [64]: out
Out[64]: array([3, 1, 5, 3, 2, 2, 1, 2, 4, 5, 2, 4])
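An alternative way to drop the diagonal, sketched here with a boolean mask instead of np.delete (the names diffs and mask are illustrative):

```python
import numpy as np

a = [1, 4, 2, 6]
arr = np.array(a)

# Pairwise absolute differences via broadcasting: shape (N, N).
diffs = np.abs(arr[:, None] - arr)

# True everywhere except on the diagonal (the self-differences).
mask = ~np.eye(arr.size, dtype=bool)

out = diffs[mask]    # boolean indexing flattens in row-major order
print(out.tolist())  # [3, 1, 5, 3, 2, 2, 1, 2, 4, 5, 2, 4]
```

One difference from the list comprehension with `if i != j`: that version filters by value, so it would also drop pairs of distinct positions holding equal values, whereas the mask removes only each element's difference with itself.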
