I have two paired arrays: the elements at the same index in both arrays belong together and must keep that index. I want to permute these elements between the two arrays. I tried np.random.permutation, but it does not give the result I want.
For example, if the two arrays are [1,2,3] and [4,5,6], one possible permutation would be [4,2,3] and [1,5,6].
You can stack your arrays and choose a random column for each row using choice.
Setup
a = np.array([1,2,3])
b = np.array([4,5,6])
v = np.column_stack((a,b))
# array([[1, 4],
# [2, 5],
# [3, 6]])
np.random.seed(1)
choices = np.random.choice(v.shape[1], v.shape[0])
# array([1, 1, 0])
Finally, to index:
v[np.arange(v.shape[0]), choices]
array([4, 5, 3])
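The indexing above only produces one of the two output arrays. A minimal sketch of how the second one could be recovered from the same stacked array, assuming exactly two columns so that `1 - choices` selects the other column (the names `rng`, `rows`, `a_new` and `b_new` are made up for this example, and the seed is arbitrary):

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
v = np.column_stack((a, b))          # shape (3, 2), one pair per row

rng = np.random.default_rng(1)       # arbitrary seed, only for repeatability
choices = rng.integers(0, 2, size=v.shape[0])  # 0 or 1 for each row
rows = np.arange(v.shape[0])

a_new = v[rows, choices]             # the randomly chosen column per row
b_new = v[rows, 1 - choices]         # the remaining column per row
```

Each index still holds the same pair of values; only which array gets which value is random.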
a=np.array([1, 2, 3])
b=np.array([4, 5, 6])
random_arr=np.random.choice([0, 1], size=(len(a),)) # Generate a random array of 0s and 1s, let's say arr([0,0,1])
a1=random_arr*a + (1-random_arr)*b # arr([0,0,1])*arr([1,2,3]) + arr([1,1,0])*arr([4,5,6]) = arr([4, 5, 3])
b1=random_arr*b + (1-random_arr)*a # arr([0,0,1])*arr([4,5,6]) + arr([1,1,0])*arr([1,2,3]) = arr([1, 2, 6])
a=a1
b=b1
Run 1 of the code above:
a
Out[188]: array([4, 2, 6])
b
Out[189]: array([1, 5, 3])
Run 2:
a
Out[191]: array([4, 5, 3])
b
Out[192]: array([1, 2, 6])
You can use np.choose :
x = np.array([1, 2, 3])
y = np.array([4, 5, 6])
toss = np.random.randint(0, 2, len(x))  # 0 or 1 for each position
print(np.choose(toss, [x, y]))
print(np.choose(toss, [y, x]))
#[1 5 6]
#[4 2 3]
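The same idea can probably also be written with a boolean mask and np.where, which some people find easier to read than np.choose. This is only a sketch; `swap`, `x_new` and `y_new` are names invented here, and the 0.5 threshold assumes each pair should be swapped with equal probability:

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])

rng = np.random.default_rng()
swap = rng.random(len(x)) < 0.5      # True where the pair gets swapped

x_new = np.where(swap, y, x)         # take y where swapped, else keep x
y_new = np.where(swap, x, y)         # take x where swapped, else keep y
```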
I have split a numpy array like so:
x = np.random.randn(10,3)
x_split = np.split(x,5)
which splits x equally into five numpy arrays each with shape (2,3) and puts them in a list. What is the best way to combine a subset of these back together (e.g. x_split[:k] and x_split[k+1:]) so that the resulting shape is similar to the original x i.e. (something,3)?
I found that for k > 0 this is possible with:
np.vstack((np.vstack(x_split[:k]),np.vstack(x_split[k+1:])))
but this does not work when k = 0 as x_split[:0] = [] so there must be a better and cleaner way. The error message I get when k = 0 is:
ValueError: need at least one array to concatenate
The comment by Paul Panzer is right on target, but since NumPy now gently discourages vstack, here is the concatenate version:
x = np.random.randn(10, 3)
x_split = np.split(x, 5, axis=0)
k = 0
np.concatenate(x_split[:k] + x_split[k+1:], axis=0)
Note the explicit axis argument passed both times (it has to be the same); this makes it easy to adapt the code to work for other axes if needed. E.g.,
x_split = np.split(x, 3, axis=1)
k = 0
np.concatenate(x_split[:k] + x_split[k+1:], axis=1)
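If this is needed in several places, the list concatenation can be wrapped in a small helper. A sketch, where `drop_chunk` is a made-up name and the input is assumed to be a list with at least two chunks (with a single chunk, np.concatenate would still raise because nothing is left):

```python
import numpy as np

def drop_chunk(chunks, k, axis=0):
    """Concatenate every array in the list `chunks` except chunks[k]."""
    kept = chunks[:k] + chunks[k + 1:]   # list concatenation, fine for k = 0
    return np.concatenate(kept, axis=axis)

x = np.arange(30).reshape(10, 3)
x_split = np.split(x, 5, axis=0)         # list of five (2, 3) arrays

first_dropped = drop_chunk(x_split, 0)   # shape (8, 3)
last_dropped = drop_chunk(x_split, 4)    # shape (8, 3)
```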
np.r_ can turn several slices into a list of indices.
In [20]: np.r_[0:3, 4:5]
Out[20]: array([0, 1, 2, 4])
In [21]: np.vstack([xsp[i] for i in _])
Out[21]:
array([[9, 7, 5],
[6, 4, 3],
[9, 8, 0],
[1, 2, 2],
[3, 3, 0],
[8, 1, 4],
[2, 2, 5],
[4, 4, 5]])
In [22]: np.r_[0:0, 1:5]
Out[22]: array([1, 2, 3, 4])
In [23]: np.vstack([xsp[i] for i in _])
Out[23]:
array([[9, 8, 0],
[1, 2, 2],
[3, 3, 0],
[8, 1, 4],
[3, 2, 0],
[0, 3, 8],
[2, 2, 5],
[4, 4, 5]])
Internally np.r_ has a lot of ifs and loops to handle the slices and their boundaries, but it hides it all from us.
If xsp (your x_split) were an array, we could do xsp[np.r_[...]], but since it is a list we have to iterate. We could also hide that iteration with an operator.itemgetter object.
In [26]: operator.itemgetter(*Out[22])
Out[26]: operator.itemgetter(1, 2, 3, 4)
In [27]: np.vstack(operator.itemgetter(*Out[22])(xsp))
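As a sketch of the "if it was an array" case mentioned above: when all chunks have the same shape, they can be stacked into one 3d array and indexed directly with the np.r_ result. The variable names `keep` and `out` are invented here, and the final reshape assumes the split was along axis 0:

```python
import numpy as np

x = np.arange(30).reshape(10, 3)
xsp = np.stack(np.split(x, 5))        # shape (5, 2, 3) instead of a list

k = 0
keep = np.r_[0:k, k + 1:5]            # indices of the chunks to keep
out = xsp[keep].reshape(-1, x.shape[1])  # back to shape (something, 3)
```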
With an ndarray.view, one can do:
import numpy as np
a = np.arange(6)
b = a.view()
b[...] = [5, 5, 5, 5, 5, 5]
a and b are now both [5, 5, 5, 5, 5, 5].
Now, can I do the same with slicing/indexing? So that the view does not show the full array, but just a slice? Something like:
import numpy as np
a = np.arange(6)
idx = [0, 2, 4]
b = a[idx] # please just return a view into `a` here
b[...] = [5, 5, 5]
Now a is of course still [0, 1, 2, 3, 4, 5], but I'd like it to be [5, 1, 5, 3, 5, 5].
This would be very useful when mapping between different arrays.
As mentioned in the comments, we can get views when dealing with patterned strides for indexing.
Let's take a look at few cases.
1) Case #1: Starting index = 0 and with a stride of 2:
In [129]: a = np.arange(6) # Input array
In [130]: idx = [0,2,4] # Simulating these indices for indexing
In [131]: b = a[::2] # Get view
In [132]: b[...] = [5, 5, 5] # Assign values
In [133]: a
Out[133]: array([5, 1, 5, 3, 5, 5]) # Verify
2) Case #2: Starting index = 1 and with a stride of 2:
In [134]: a = np.arange(6) # Input array
In [135]: idx = [1,3,5] # Simulating these indices for indexing
In [136]: b = a[1::2] # Get view
In [137]: b[...] = [5, 5, 5] # Assign values
In [138]: a
Out[138]: array([0, 5, 2, 5, 4, 5]) # Verify
This method is extensible to multi-dimensional arrays.
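For index lists that do not follow a regular stride, a view is not available, as far as I know. A short sketch that checks this with np.shares_memory and shows the usual workaround of indexing on the left-hand side of the assignment (the names `strided` and `picked` are just for illustration):

```python
import numpy as np

a = np.arange(6)
idx = [0, 2, 4]

strided = a[::2]     # basic slicing: a view that shares memory with a
picked = a[idx]      # integer-list (fancy) indexing: an independent copy

print(np.shares_memory(a, strided))
print(np.shares_memory(a, picked))

# Writing through arbitrary indices works without a view
# when the indexing happens on the left-hand side.
a[idx] = [5, 5, 5]
print(a)
```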
How can I append the last row of an array to itself?
Something like:
x= np.array([(1,2,3,4,5)])
x= np.append(x, x[0], 1)
Also, could you explain why this way of working with vectors yields an error?
for i in range(3):
    x.append(0)
x
[0, 0, 0]
x= np.append(x, x[0],0)
Which way of working with vectors would be best? I am trying to treat 2D vectors as a matrix, keeping in mind that I would like to do matrix calculations such as multiplication later on.
In [3]: x=np.array([(1,2,3,4,5)])
In [4]: x
Out[4]: array([[1, 2, 3, 4, 5]])
In [5]: x=np.append(x,x[0],1)
...
ValueError: all the input arrays must have same number of dimensions
x is (1,5), x[0] is (5,) - one is 2d, the other 1d.
In [11]: x=np.vstack([x,x[0]])
In [12]: x
Out[12]:
array([[1, 2, 3, 4, 5],
[1, 2, 3, 4, 5]])
This works because vstack changes x[0] to 2d, e.g. (1,5), so it can concatenate it with x.
In [16]: x=np.concatenate([x, np.atleast_2d(x[-1,:])])
In [17]: x.shape
Out[17]: (3, 5)
We can use concatenate (or append) by first expanding x[-1,:] to 2d.
But in general repeated concatenation is a slow way of building an array.
For a list, repeated append like this works. But it does not work for arrays. For one thing, an array does not have an append method. And the np.append function returns a new array; it does not change x in place.
In [19]: z=[]
In [20]: for i in range(3):
...:     z.append(0)
...:
In [21]: z
Out[21]: [0, 0, 0]
Repeated append to a list is fine. Repeated append to an array is slow.
In [25]: z=[]
In [26]: for i in range(3):
...:     z.append(list(range(i,i+4)))
In [27]: z
Out[27]: [[0, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5]]
In [28]: np.array(z)
Out[28]:
array([[0, 1, 2, 3],
[1, 2, 3, 4],
[2, 3, 4, 5]])
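To make the earlier point about np.append concrete, here is a small sketch showing that it returns a new array and leaves the original untouched (the variable `y` is introduced only for this example):

```python
import numpy as np

x = np.array([[1, 2, 3, 4, 5]])

# np.atleast_2d turns the 1d last row into shape (1, 5) so the dimensions match.
y = np.append(x, np.atleast_2d(x[-1]), axis=0)

print(x.shape)   # x itself is unchanged
print(y.shape)   # the result is a separate, larger array
```

So the result has to be assigned back (x = np.append(...)) if the name x should refer to the longer array.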
>>> np.append(x,x[-1:],0)
array([[1, 2, 3, 4, 5],
[1, 2, 3, 4, 5]])
How about this:
np.append(arr=x, values=x[-1,None], axis=0)
#array([[1, 2, 3, 4, 5],
# [1, 2, 3, 4, 5]])
I have a couple of lists:
a = [1,2,3]
b = [1,2,3,4,5,6]
which are of variable length.
I want to return a vector of length five, such that if the input list length is < 5 then it will be padded with zeros on the right, and if it is > 5, then it will be truncated at the 5th element.
For example, input a would return np.array([1,2,3,0,0]), and input b would return np.array([1,2,3,4,5]).
I feel like I ought to be able to use np.pad, but I can't seem to follow the documentation.
This might be slow or fast, I am not sure, but it works for your purpose.
In [22]: pad = lambda a,i : a[0:i] if len(a) > i else a + [0] * (i-len(a))
In [23]: pad([1,2,3], 5)
Out[23]: [1, 2, 3, 0, 0]
In [24]: pad([1,2,3,4,5,6,7], 5)
Out[24]: [1, 2, 3, 4, 5]
np.pad is overkill, better for adding a border all around a 2d image than adding some zeros to a list.
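For completeness, since the question asks about np.pad, here is a minimal sketch of how it could be used for this case: truncate first, then pad only on the right. The helper name `pad_to` is made up, and the input is assumed to be 1d:

```python
import numpy as np

def pad_to(x, n=5):
    """Return a length-n array: x truncated to n, then right-padded with zeros."""
    x = np.asarray(x)[:n]                  # truncate (no-op if already short)
    missing = n - x.shape[0]               # how many zeros are still needed
    # pad_width (0, missing): nothing on the left, `missing` zeros on the right
    return np.pad(x, (0, missing), mode="constant", constant_values=0)

short = pad_to([1, 2, 3])
long_ = pad_to([1, 2, 3, 4, 5, 6])
```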
I like zip_longest, especially if the inputs are lists and don't need to be arrays. It's probably the closest you'll find to code that operates on all the lists at once in compiled code.
a, b = zip(*itertools.zip_longest(a, b, fillvalue=0))
is a version that does not use np.array at all (saving some array overhead). In Python 2 the function is called itertools.izip_longest.
But by itself it does not truncate. It still needs something like [x[:5] for x in (a,b)].
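One way the padding and truncation might be combined without NumPy is sketched below. The function name `pad_truncate_all` is invented, and the extra `range(n)` column is a trick assumed here to guarantee at least n rows even when every input is shorter than n; it also assumes n >= 1:

```python
from itertools import islice, zip_longest

def pad_truncate_all(lists, n, fill=0):
    """Pad every list with `fill` and cut it to exactly n items."""
    # range(n) is a helper column: it forces at least n rows of output.
    rows = islice(zip_longest(range(n), *lists, fillvalue=fill), n)
    cols = list(zip(*rows))               # transpose back to one tuple per input
    return [list(c) for c in cols[1:]]    # drop the helper column

a = [1, 2, 3]
b = [1, 2, 3, 4, 5, 6]
a5, b5 = pad_truncate_all([a, b], 5)
```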
Here's my variation on all_ms function, working with a simple list or 1d array:
def foo_1d(x, n=5):
    x = np.asarray(x)
    assert x.ndim==1
    s = np.min([x.shape[0], n])
    ret = np.zeros((n,), dtype=x.dtype)
    ret[:s] = x[:s]
    return ret
In [772]: [foo_1d(x) for x in [[1,2,3], [1,2,3,4,5], np.arange(10)[::-1]]]
Out[772]: [array([1, 2, 3, 0, 0]), array([1, 2, 3, 4, 5]), array([9, 8, 7, 6, 5])]
One way or other the numpy solutions do the same thing - construct a blank array of the desired shape, and then fill it with the relevant values from the original.
One other detail - when truncating the solution could, in theory, return a view instead of a copy. But that requires handling that case separately from a pad case.
If the desired output is a list of equal length arrays, it may be worthwhile collecting them in a 2d array.
In [792]: def foo1(x, out):
     ...:     x = np.asarray(x)
     ...:     s = np.min((x.shape[0], out.shape[0]))
     ...:     out[:s] = x[:s]
In [794]: lists = [[1,2,3], [1,2,3,4,5], np.arange(10)[::-1], []]
In [795]: ret=np.zeros((len(lists),5),int)
In [796]: for i,xx in enumerate(lists):
     ...:     foo1(xx, ret[i,:])
In [797]: ret
Out[797]:
array([[1, 2, 3, 0, 0],
[1, 2, 3, 4, 5],
[9, 8, 7, 6, 5],
[0, 0, 0, 0, 0]])
Pure python version, where a is a python list (not a numpy array): a[:n] + [0,]*(n-len(a)).
For example:
In [42]: n = 5
In [43]: a = [1, 2, 3]
In [44]: a[:n] + [0,]*(n - len(a))
Out[44]: [1, 2, 3, 0, 0]
In [45]: a = [1, 2, 3, 4]
In [46]: a[:n] + [0,]*(n - len(a))
Out[46]: [1, 2, 3, 4, 0]
In [47]: a = [1, 2, 3, 4, 5]
In [48]: a[:n] + [0,]*(n - len(a))
Out[48]: [1, 2, 3, 4, 5]
In [49]: a = [1, 2, 3, 4, 5, 6]
In [50]: a[:n] + [0,]*(n - len(a))
Out[50]: [1, 2, 3, 4, 5]
Function using numpy:
In [121]: def tosize(a, n):
.....:     a = np.asarray(a)
.....:     x = np.zeros(n, dtype=a.dtype)
.....:     m = min(n, len(a))
.....:     x[:m] = a[:m]
.....:     return x
.....:
In [122]: tosize([1, 2, 3], 5)
Out[122]: array([1, 2, 3, 0, 0])
In [123]: tosize([1, 2, 3, 4], 5)
Out[123]: array([1, 2, 3, 4, 0])
In [124]: tosize([1, 2, 3, 4, 5], 5)
Out[124]: array([1, 2, 3, 4, 5])
In [125]: tosize([1, 2, 3, 4, 5, 6], 5)
Out[125]: array([1, 2, 3, 4, 5])
I want to find the differences between all values in a numpy array and append it to a new list.
Example: a = [1,4,2,6]
result : newlist= [3,1,5,3,2,2,1,2,4,5,2,4]
i.e. for each value i of a, determine the difference between it and each of the other values in the list.
At this point I have been unable to find a solution.
You can do this:
a = [1,4,2,6]
newlist = [abs(i-j) for i in a for j in a if i != j]
Output:
print(newlist)
[3, 1, 5, 3, 2, 2, 1, 2, 4, 5, 2, 4]
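One caveat: the condition i != j compares values, not positions, so if the list contains duplicates their zero differences would be dropped as well. If the intent is to skip only the self-comparison, a sketch that compares positions instead (the names `dup` and `dup_diffs` are only for illustration):

```python
a = [1, 4, 2, 6]

# Compare positions (i, j) rather than values, so duplicates are kept.
newlist = [abs(x - y)
           for i, x in enumerate(a)
           for j, y in enumerate(a)
           if i != j]

dup = [1, 1, 2]
dup_diffs = [abs(x - y)
             for i, x in enumerate(dup)
             for j, y in enumerate(dup)
             if i != j]
```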
I believe what you are trying to do is to calculate absolute differences between elements of the input list, but excluding the self-differences. So, with that idea, here is one vectorized (array programming) approach -
# Input list
a = [1,4,2,6]
# Convert input list to a numpy array
arr = np.array(a)
# Calculate absolute differences between each element
# against all elements to give us a 2D array
sub_arr = np.abs(arr[:,None] - arr)
# Get diagonal indices for the 2D array
N = arr.size
rem_idx = np.arange(N)*(N+1)
# Remove the diagonal elements for the final output
out = np.delete(sub_arr,rem_idx)
Sample run to show the outputs at each step -
In [60]: a
Out[60]: [1, 4, 2, 6]
In [61]: arr
Out[61]: array([1, 4, 2, 6])
In [62]: sub_arr
Out[62]:
array([[0, 3, 1, 5],
[3, 0, 2, 2],
[1, 2, 0, 4],
[5, 2, 4, 0]])
In [63]: rem_idx
Out[63]: array([ 0, 5, 10, 15])
In [64]: out
Out[64]: array([3, 1, 5, 3, 2, 2, 1, 2, 4, 5, 2, 4])
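An alternative to computing flat diagonal indices for np.delete would be a boolean mask that is False on the diagonal. This is a sketch of that variant, not the approach used above; `off_diag` is a name made up here:

```python
import numpy as np

arr = np.array([1, 4, 2, 6])

# Pairwise absolute differences via broadcasting: shape (4, 4)
sub_arr = np.abs(arr[:, None] - arr)

# True everywhere except on the diagonal (the self-differences)
off_diag = ~np.eye(arr.size, dtype=bool)

# Boolean indexing flattens the selected entries in row-major order
out = sub_arr[off_diag]
```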