Zip uneven numpy arrays - python

Consider the following numpy.arrays:
a = np.array([1., 2., 3.])
b = np.array([4., 5.])
c = np.array([6., 7.])
I need to combine these so I end up with the following:
[(1., 4., 6.), (1., 5., 7.), (2., 4., 6.), (2., 5., 7.), (3., 4., 6.), (3., 5., 7.)]
Note that in this case, the array a happens to be the largest array. This is not guaranteed however. Nor is the length guaranteed. In other words, any array could be the longest and each array is of arbitrary length.
I tried itertools.izip_longest, but it only lets me pad the shorter arrays with a fillvalue (e.g. 3.), which will not work here. I also tried itertools.product, but the result is not a true Cartesian product.

You can stack b and c, transpose the result, and then take the product of a with the transposed array using itertools.product:
>>> from itertools import product
>>> [np.insert(j,0,i) for i,j in product(a,np.array((b,c)).T)]
[array([ 1., 4., 6.]), array([ 1., 5., 7.]), array([ 2., 4., 6.]), array([ 2., 5., 7.]), array([ 3., 4., 6.]), array([ 3., 5., 7.])]
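If you need the exact list of tuples from the question rather than a list of arrays, one small variation (my own addition) is to wrap each result in tuple():

[tuple(np.insert(j, 0, i)) for i, j in product(a, np.array((b, c)).T)]

The exact repr of the tuple elements depends on your NumPy version, but the values match the desired output.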

Let's say you have:
a = np.array([4., 5.])
b = np.array([1., 2., 3.])
c = np.array([6., 7.])
d = np.array([5., 1])
e = np.array([3., 2.])
Now, if you know beforehand which one is the longest array, which is b in this case, you can use an approach based upon np.meshgrid -
# Stack elements from identical positions of the equal-length arrays
others = np.vstack((a,c,d,e)).T # If you have more arrays, edit this line
# Get a gridded version of the longest array and
# gridded indices for indexing into the others array
X,Y = np.meshgrid(np.arange(others.shape[0]),b)
# Concatenate the gridded longest array and the gridded-indexed others for the final output
out = np.hstack((Y.ravel()[:,None],others[X.ravel()]))
Sample run -
In [47]: b
Out[47]: array([ 1., 2., 3.])
In [48]: a
Out[48]: array([ 4., 5.])
In [49]: c
Out[49]: array([ 6., 7.])
In [50]: d
Out[50]: array([ 5., 1.])
In [51]: e
Out[51]: array([ 3., 2.])
In [52]: out
Out[52]:
array([[ 1., 4., 6., 5., 3.],
       [ 1., 5., 7., 1., 2.],
       [ 2., 4., 6., 5., 3.],
       [ 2., 5., 7., 1., 2.],
       [ 3., 4., 6., 5., 3.],
       [ 3., 5., 7., 1., 2.]])

If the length differences are not extreme (check the inputs first), I'd be tempted to pad the shorter lists out to the length of the longest with None and generate the full Cartesian product (27 tuples for 3 lists of 3 elements). Then
results = []
for candidate in possibles:
    if None not in candidate:
        results.append(candidate)
Reasons not to do this: if the cube of the length of the longest list is large, the memory needed to store N**3 candidates (and the CPU time to generate them) becomes significant.
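A rough sketch of that pad-and-filter idea (my own code, with made-up variable names; it assumes the inputs fit comfortably in memory):

from itertools import product
import numpy as np

a = np.array([1., 2., 3.])
b = np.array([4., 5.])
c = np.array([6., 7.])

arrays = [a, b, c]
longest = max(len(arr) for arr in arrays)
# Pad each array to the length of the longest with None placeholders
padded = [list(arr) + [None] * (longest - len(arr)) for arr in arrays]
# Full product of the padded lists, keeping only tuples without None.
# Note this yields the product of all three lists (12 tuples here),
# not the paired b/c columns shown in the question.
possibles = product(*padded)
results = [cand for cand in possibles if None not in cand]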

Related

how to modify a column of numpy arrays stored in a list

I have a list of numpy arrays and want to modify some of the numbers in the arrays. This is my simplified list:
first_list=[np.array([[1.,2.,0.], [2.,1.,0.], [6.,8.,3.], [8.,9.,7.]]),
np.array([[1.,0.,2.], [0.,0.,2.], [5.,5.,1.], [0.,6.,2.]])]
I have a factor which defines how many splits there are in each array:
spl_array=2.
It means each array in the list can be split into 2 pieces. I want to add a fixed value (3.) to the last column of each split of each array, and also copy the last split and subtract this value (3.) from the third column of the copy. Finally I want to end up with the following:
final_list=[np.array([[1.,2.,3.], [2.,1.,3.], [6.,8.,6.], [8.,9.,10.], \
[6.,8.,0.], [8.,9.,4.]]), # copied and subtracted
np.array([[1.,0.,5.], [0.,0.,5.], [5.,5.,4.], [0.,6.,5.], \
[5.,5.,-2.], [0.,6.,-1.]])] # copied and subtracted
I tried some for loops but I am totally lost. I appreciate any help in advance.
final_list=[]
for i in first_list:
    each_lay = np.split(i, spl_array)
    for j in range(len(each_lay)):
        final_list.append([each_lay[j][:,0], each_lay[j][:,1], each_lay[j][:,2]+3])
Is this what you expect?
m = np.asarray(first_list)
m = np.concatenate((m, m[:, 2:]), axis=1)
m[:, :4, 2] += 3
m[:, 4:, 2] -= 3
final_list = m.tolist()
>>> m
array([[[ 1., 2., 3.],
        [ 2., 1., 3.],
        [ 6., 8., 6.],
        [ 8., 9., 10.],
        [ 6., 8., 0.],
        [ 8., 9., 4.]],

       [[ 1., 0., 5.],
        [ 0., 0., 5.],
        [ 5., 5., 4.],
        [ 0., 6., 5.],
        [ 5., 5., -2.],
        [ 0., 6., -1.]]])

Create 3D array from a 2D array by replicating/repeating along the first axis

Suppose I have an n × m array, e.g.:
array([[ 1., 2., 3.],
       [ 4., 5., 6.],
       [ 7., 8., 9.]])
And I want to generate a 3D array k × n × m, where every slice along the new axis equals the original array, i.e. the same array but now 3 × 3 × 3:
array([[[ 1., 2., 3.],
        [ 4., 5., 6.],
        [ 7., 8., 9.]],

       [[ 1., 2., 3.],
        [ 4., 5., 6.],
        [ 7., 8., 9.]],

       [[ 1., 2., 3.],
        [ 4., 5., 6.],
        [ 7., 8., 9.]]])
How can I get it?
Introduce a new axis at the start with None/np.newaxis and replicate along it with np.repeat. This should work for extending any n-dim array to an (n+1)-dim array. The implementation would be -
np.repeat(arr[None,...],k,axis=0)
Sample run -
In [143]: arr
Out[143]:
array([[ 1., 2., 3.],
       [ 4., 5., 6.],
       [ 7., 8., 9.]])

In [144]: np.repeat(arr[None,...],3,axis=0)
Out[144]:
array([[[ 1., 2., 3.],
        [ 4., 5., 6.],
        [ 7., 8., 9.]],

       [[ 1., 2., 3.],
        [ 4., 5., 6.],
        [ 7., 8., 9.]],

       [[ 1., 2., 3.],
        [ 4., 5., 6.],
        [ 7., 8., 9.]]])
View-output for memory-efficiency
We can also generate a 3D view and achieve virtually free runtime with np.broadcast_to. Hence, simply do -
np.broadcast_to(arr,(3,)+arr.shape) # repeat 3 times
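One caveat not mentioned above: the result of np.broadcast_to is a read-only view, so if you need to modify it you have to copy it first. A small illustration (my own):

view = np.broadcast_to(arr, (3,) + arr.shape)
# view[0, 0, 0] = 99.    # would raise ValueError: assignment destination is read-only
writable = view.copy()   # materializes a regular, writable (3, n, m) array
writable[0, 0, 0] = 99.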
If you have:
a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
You can use a list comprehension to generate the duplicate array:
b = [a for x in range(3)]
Then (for numpy):
c = np.array(b)
One possibility would be to use default broadcasting to replicate your array:
a = np.arange(1, 10).reshape(3,3)
n = 3
b = np.ones((n, 3, 3)) * a
Which results in the array you wanted:
array([[[ 1., 2., 3.],
        [ 4., 5., 6.],
        [ 7., 8., 9.]],

       [[ 1., 2., 3.],
        [ 4., 5., 6.],
        [ 7., 8., 9.]],

       [[ 1., 2., 3.],
        [ 4., 5., 6.],
        [ 7., 8., 9.]]])
This won't work by default if you want to replicate it along another axis. In that case you would need to be explicit with the dimensions to ensure correct broadcasting.
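For example (my own illustration of that point), to replicate along a new trailing axis you would add the new axis to a explicitly so broadcasting lines up:

a = np.arange(1, 10).reshape(3, 3)
n = 3
b = np.ones((3, 3, n)) * a[:, :, None]   # shape (3, 3, n): a copied along the last axis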
I think this is essentially the same as Divakar's answer, but the syntax might be a bit easier for a beginner to understand (at least it is in my case):
a = np.array([[1,2,3],[4,5,6]])
a[np.newaxis,:,:].repeat(3,axis=0)
results in:
array([[[1, 2, 3],
        [4, 5, 6]],

       [[1, 2, 3],
        [4, 5, 6]],

       [[1, 2, 3],
        [4, 5, 6]]])
I learned about np.newaxis here: What is numpy.newaxis and when to use it.
And about numpy.repeat here: numpy.repeat
Here's an example usage I needed this for:
k = np.array([[[111,121,131,141,151],[211,221,231,241,251]],\
[[112,122,132,142,152],[212,222,232,242,252]],\
[[113,123,133,143,153],[213,223,233,243,253]]])
filter = np.array([[True,True,True,True,False],
[True,False,False,True,False]])
k[filter[None,...].repeat(3,axis=0)] = 0
print(k)
results in:
[[[  0   0   0   0 151]
  [  0 221 231   0 251]]

 [[  0   0   0   0 152]
  [  0 222 232   0 252]]

 [[  0   0   0   0 153]
  [  0 223 233   0 253]]]

numpy array indexing with negative index

First I have a scalar time series stored in a numpy array:
ts = np.arange(10)
which is
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
Suppose I want to extract from ts a series of vectors (2,1,0), (3,2,1), (4,3,2), etc. I can think of the following code to do it:
for i in range(len(ts)-2):
    print(ts[2+i:i-1:-1])
However, when i=0, the above code returns an empty array rather than [2,1,0] because the loop body will become
print(ts[2:-1:-1])
where the -1 in the middle creates trouble.
My question is: is there a way to make the indexing work for [2,1,0]?
You need to use None:
ts = np.arange(10)
for i in range(len(ts)-2):
    print(ts[2+i:None if i == 0 else i - 1:-1])
This should work too:
print(ts[i:i+3][::-1])
Another way is to do the following:
slices = np.arange(3)
result = np.array([])
while slices[2] < len(ts):
    # print(ts[slices])
    result = np.r_[result, ts[slices]]
    slices += 1
result.reshape((-1, 3))
Out[165]:
array([[ 0., 1., 2.],
       [ 1., 2., 3.],
       [ 2., 3., 4.],
       [ 3., 4., 5.],
       [ 4., 5., 6.],
       [ 5., 6., 7.],
       [ 6., 7., 8.],
       [ 7., 8., 9.]])
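As a further alternative (my own addition, assuming NumPy 1.20+), the same windows can be built without a Python loop using sliding_window_view, reversing each window to get the (2,1,0), (3,2,1), ... ordering asked for in the question:

from numpy.lib.stride_tricks import sliding_window_view

windows = sliding_window_view(ts, 3)[:, ::-1]
# windows[0] -> array([2, 1, 0]), windows[1] -> array([3, 2, 1]), ...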

Removing Corresponding Entries from Two Numpy Arrays

I have what I'm quite sure is a simple question, but I'm not having much luck finding an explanation online.
I have an array of flux values and a corresponding array of time values. Obviously those two arrays are one-to-one (one flux value for each time value). However, some of my flux values are NaNs.
My question is this: How do I remove the corresponding values from the time array when I remove the NaNs from the flux array?
These arrays are large enough (several thousand entries) that it would be exceedingly cumbersome to do it by hand.
You could try boolean indexing:
In [13]: time
Out[13]: array([ 0., 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.])
In [15]: flux
Out[15]: array([ 1., 1., 1., 1., 1., nan, nan, nan, 1., 1., 1.])
In [16]: time2 = time[~np.isnan(flux)]
In [17]: flux2 = flux[~np.isnan(flux)]
In [18]: time2
Out[18]: array([ 0., 1., 2., 3., 4., 8., 9., 10.])
In [19]: flux2
Out[19]: array([ 1., 1., 1., 1., 1., 1., 1., 1.])
Just write time = time[~np.isnan(flux)] etc. if you don't need the original arrays any more.
A more complicated way is to use masked arrays:
In [20]: m = np.ma.masked_invalid(flux)
In [21]: time2 = time[~m.mask]
In [22]: flux2 = flux[~m.mask]
In [23]: time2
Out[23]: array([ 0., 1., 2., 3., 4., 8., 9., 10.])
In [24]: flux2
Out[24]: array([ 1., 1., 1., 1., 1., 1., 1., 1.])
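If NaNs could also appear in the time array (my own extension of the boolean-indexing idea above), a single combined mask keeps both arrays aligned:

good = ~np.isnan(time) & ~np.isnan(flux)
time2 = time[good]
flux2 = flux[good]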

How do I concatenate an array into a 3D matrix?

In my Python application I have a 3D matrix (array) such as this:
array([[[ 1., 2., 3.]], [[ 4., 5., 6.]], [[ 7., 8., 9.]]])
and I would like to insert zero arrays at a particular "line", for example in the middle. At the end I would like to have the following matrix:
array([[[ 1., 2., 3.]],
       [[ 4., 5., 6.]],
       [[ 0., 0., 0.]],
       [[ 0., 0., 0.]],
       [[ 7., 8., 9.]]])
Does anybody know how to solve this? I tried to use numpy.concatenate, but it only allows me to add more "lines".
Thanks in advance!
Possible duplicate of
Inserting a row at a specific location in a 2d array in numpy?
For example:
a = array([[[ 1., 2., 3.]], [[ 4., 5., 6.]], [[ 7., 8., 9.]]])
output = np.insert(a, 2, np.array([0,0,0]), 0)
output:
array([[[ 1., 2., 3.]],
       [[ 4., 5., 6.]],
       [[ 0., 0., 0.]],
       [[ 7., 8., 9.]]])
Why does this work on a 3D array?
See the numpy.insert documentation.
It says:
numpy.insert(arr, obj, values, axis=None)
...
Parameters:
values : array_like
    Values to insert into arr. If the type of values is different from that of arr,
    values is converted to the type of arr. values should be shaped so that
    arr[...,obj,...] = values is legal.
...
So it's a very versatile function!
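To get the two zero rows shown in the question's desired output (my own follow-up, relying on np.insert accepting multi-row values at a scalar index), something like this should work:

output = np.insert(a, 2, np.zeros((2, 1, 3)), 0)
# output now has shape (5, 1, 3), with zero rows at indices 2 and 3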
Is this what you want?
result = np.r_[a[:2], np.zeros((2, 1, 3)), a[2][None]]
I'd do it this way:
>>> a = np.array([[[ 1., 2., 3.]], [[ 4., 5., 6.]], [[ 7., 8., 9.]]])
>>> np.concatenate((a[:2], np.tile(np.zeros_like(a[0]), (2,1,1)), a[2:]))
array([[[ 1., 2., 3.]],
       [[ 4., 5., 6.]],
       [[ 0., 0., 0.]],
       [[ 0., 0., 0.]],
       [[ 7., 8., 9.]]])
The 2 in (2,1,1) given to tile() is how many zero "rows" to insert. The 2 in the slice indexes is of course where to insert.
If you're going to insert a large amount of zeros, it may be more efficient to just create a big array of zeros first and then copy in the parts you need from the original array.
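A rough sketch of that preallocation idea (my own, with k zero rows inserted at position pos):

k, pos = 2, 2   # number of zero rows and where to insert them
out = np.zeros((a.shape[0] + k,) + a.shape[1:], dtype=a.dtype)
out[:pos] = a[:pos]        # copy the part before the insertion point
out[pos + k:] = a[pos:]    # copy the rest; rows pos..pos+k-1 stay zero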
