How to join/connect two lists in Python? [duplicate]

I want to concatenate two arrays vertically in Python using the NumPy package:
a = array([1,2,3,4])
b = array([5,6,7,8])
I want something like this:
c = array([[1,2,3,4],[5,6,7,8]])
How can we do that using the concatenate function? I tried these two calls, but the results are the same:
c = concatenate((a,b),axis=0)
# or
c = concatenate((a,b),axis=1)
Both calls give:
c = array([1,2,3,4,5,6,7,8])

The problem is that both a and b are 1D arrays and so there's only one axis to join them on.
Instead, you can use vstack (v for vertical):
>>> np.vstack((a,b))
array([[1, 2, 3, 4],
       [5, 6, 7, 8]])
Also, row_stack is an alias of the vstack function (it is deprecated as of NumPy 2.0, so prefer vstack in new code):
>>> np.row_stack((a,b))
array([[1, 2, 3, 4],
       [5, 6, 7, 8]])
It's also worth noting that multiple arrays of the same length can be stacked at once. For instance, np.vstack((a,b,x,y)) would have four rows.
Under the hood, vstack works by making sure that each array has at least two dimensions (using atleast_2d) and then calling concatenate to join these arrays on the first axis (axis=0).
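As a minimal sketch of the description above (not NumPy's actual source), the two steps can be reproduced by hand and compared against vstack. The names a2, b2, manual and stacked are introduced here only for illustration.

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([5, 6, 7, 8])

# Step 1: promote each 1D array of shape (4,) to a 2D array of shape (1, 4).
a2 = np.atleast_2d(a)
b2 = np.atleast_2d(b)

# Step 2: join the 2D arrays along the first axis.
manual = np.concatenate((a2, b2), axis=0)

# For these inputs the result matches vstack.
stacked = np.vstack((a, b))

print(a2.shape)                         # (1, 4)
print(manual.shape)                     # (2, 4)
print(np.array_equal(manual, stacked))  # True
```

Because the 1D inputs become rows of length 4, there are now two axes, and joining on axis 0 puts one array under the other.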

Use np.vstack:
In [4]:
import numpy as np
a = np.array([1,2,3,4])
b = np.array([5,6,7,8])
c = np.vstack((a,b))
c
Out[4]:
array([[1, 2, 3, 4],
       [5, 6, 7, 8]])
In [5]:
d = np.array([[1,2,3,4],[5,6,7,8]])
d
Out[5]:
array([[1, 2, 3, 4],
       [5, 6, 7, 8]])
In [6]:
np.equal(c,d)
Out[6]:
array([[ True,  True,  True,  True],
       [ True,  True,  True,  True]], dtype=bool)

It may not be the most elegant solution, but a simple way to make your code work is to add a reshape:
a = array([1,2,3,4])
b = array([5,6,7,8])
c = concatenate((a,b),axis=0).reshape((2,4))
print(c)
Output:
[[1 2 3 4]
 [5 6 7 8]]
In general, if you have more than two arrays of the same length:
reshape((number_of_arrays, length_of_array))
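A small sketch of that general form, assuming three example arrays of equal length (the third array and the list name arrays are made up for illustration):

```python
import numpy as np

# Example inputs; every array must have the same length for the reshape to work.
arrays = [
    np.array([1, 2, 3, 4]),
    np.array([5, 6, 7, 8]),
    np.array([9, 10, 11, 12]),
]

number_of_arrays = len(arrays)
length_of_array = len(arrays[0])

# Concatenate end to end, then fold the flat result into one row per input array.
flat = np.concatenate(arrays, axis=0)
c = flat.reshape((number_of_arrays, length_of_array))

print(c)
```

This relies on the arrays being laid out one after another in the flat result, so it gives the same rows as np.vstack(arrays) only when all lengths match.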

To use concatenate, you need to make a and b 2D arrays instead of 1D, as in
c = concatenate((atleast_2d(a), atleast_2d(b)))
Alternatively, you can simply do
c = array((a,b))

Related

iterating via np.nditer function for numpy arrays

I'm a bit new to python and wanted to check for some values in my arrays to see if they go above or below a certain value and adjust them afterwards.
For the case of a 2d array with numpy I found this in some part of its manual.
import numpy as np
arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
for x in np.nditer(arr[:, ::2]):
    print(x)
What is the syntax in Python to change the starting point, so that the loop doesn't visit every value starting from the first but only the ones I choose, such as every 2nd or 3rd value? I need to check every 1st, 2nd, 3rd (and so on) value in my arrays against a different value. Or is there a better way to do this?
I suspect you need to read some more basic numpy docs.
You created a 2d array:
In [5]: arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
In [6]: arr
Out[6]:
array([[1, 2, 3, 4],
       [5, 6, 7, 8]])
You can view it as a 1d array (nditer iterates as flat)
In [7]: arr.ravel()
Out[7]: array([1, 2, 3, 4, 5, 6, 7, 8])
You can use standard slice notation to select every other element of the flattened array:
In [8]: arr.ravel()[::2]
Out[8]: array([1, 3, 5, 7])
or every other column of the original:
In [9]: arr[:,::2]
Out[9]:
array([[1, 3],
       [5, 7]])
You can test every value, such as for being odd:
In [10]: arr % 2 == 1
Out[10]:
array([[ True, False,  True, False],
       [ True, False,  True, False]])
and use that array to select those values:
In [11]: arr[arr % 2 == 1]
Out[11]: array([1, 3, 5, 7])
or modify them:
In [12]: arr[arr % 2 == 1] += 10
In [13]: arr
Out[13]:
array([[11,  2, 13,  4],
       [15,  6, 17,  8]])
The documentation for nditer tends to overhype it. It's useful in compiled code, but rarely useful in python. If the above whole-array methods don't work, you can iterate directly, with more control and understanding:
In [14]: for row in arr:
    ...:     print(row)
    ...:     for x in row:
    ...:         print(x)
[11  2 13  4]
11
2
13
4
[15  6 17  8]
15
6
17
8
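To connect the slicing and masking ideas above back to the original question, here is a rough sketch that checks different column groups against different limits. The two limit values are hypothetical and chosen only for illustration; the question does not say which thresholds apply.

```python
import numpy as np

arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

# Hypothetical limits, not taken from the question.
upper_limit_even_cols = 4  # applied to columns 0, 2, ...
lower_limit_odd_cols = 5   # applied to columns 1, 3, ...

result = arr.copy()

# Basic slices are views, so assigning through them changes `result` itself.
even = result[:, ::2]
even[even > upper_limit_even_cols] = upper_limit_even_cols

odd = result[:, 1::2]
odd[odd < lower_limit_odd_cols] = lower_limit_odd_cols

print(result)
# [[1 5 3 5]
#  [4 6 4 8]]
```

Changing the start and step of the slice (for example 2::3) selects a different group of columns without writing an explicit loop.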

how to loop over one axis of numpy array, returning inner arrays instead of values

I have several arrays of data, collected into a single array. I want to loop over it and do operations on each of the inner arrays. What would be the correct way to do this in NumPy?
import numpy as np
a = np.arange(9)
a = a.reshape(3,3)
for val in np.nditer(a):
    print(val)
and this gives:
0
1
2
3
4
5
6
7
8
But what I want is (something like):
array([0 1 2])
array([3 4 5])
array([6 7 8])
I have been looking at this page: https://docs.scipy.org/doc/numpy-1.15.0/reference/arrays.nditer.html but so far have not found the answer. I also know I can do it with a plain for loop but I am assuming there is a more correct way. Any help would be appreciated, thank you.
You can use apply_along_axis but it depends on what your ultimate goal/output is. Here is a simple example showing this.
a
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
np.apply_along_axis(lambda x: x + 1, 0, a)
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])
Actually, reshape returns a 2D array, not an array of lists, and iterating over a 2D array yields each row as a 1D array.
If you would like to get each individual row you could just use
a = np.arange(9)
a = a.reshape(3,3)
for val in a:
    print(val)
Why not simply loop over the array? The for loop gives you the individual rows:
import numpy as np
a = np.arange(9)
a = a.reshape(3,3)
for val in a:
    print(val)
# [0 1 2]
# [3 4 5]
# [6 7 8]
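A short sketch extending the loop above: iterating a 2D array walks its first axis, so looping over the transpose gives the columns instead. The names rows and cols are used here only for illustration.

```python
import numpy as np

a = np.arange(9).reshape(3, 3)

# Iterating a 2D array yields one 1D array per row (axis 0).
rows = [row for row in a]

# Iterating the transpose yields one 1D array per column of `a`.
cols = [col for col in a.T]

for row in rows:
    print(repr(row))
# array([0, 1, 2])
# array([3, 4, 5])
# array([6, 7, 8])
```

Each item is a NumPy array rather than a list, so array operations such as row.sum() or row * 2 work directly inside the loop.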

Numpy Vectorization While Indexing Two Arrays

I'm trying to vectorize the following function using numpy and am completely lost.
A = ndarray: Z x 3
B = ndarray: Z x 3
C = integer
D = ndarray: C x 3
Pseudocode:
entries = []
means = []
for i in range(C):
    for p in range(len(B)):
        if B[p] == D[i]:
            entries.append(A[p])
    means.append(columnwise_means(entries))
return means
An example would be:
A = [[1,2,3],[1,2,3],[4,5,6],[4,5,6]]
B = [[9,8,7],[7,6,5],[1,2,3],[3,4,5]]
C = 2
D = [[1,2,3],[4,5,6]]
Returns:
[average([9,8,7],[7,6,5]), average([1,2,3],[3,4,5])] = [[8,7,6],[2,3,4]]
I've tried using np.where, np.argwhere, np.mean, etc but can't seem to get the desired effect. Any help would be greatly appreciated.
Thanks!
Going by the expected output of the question, I am assuming that in the actual code you would have:
the conditional statement as if A[p] == D[i], and
entries appended from B: entries.append(B[p]).
So, here's one vectorized approach with NumPy broadcasting and dot-product:
mask = (D[:,None,:] == A).all(-1)
out = mask.dot(B)/(mask.sum(1)[:,None])
If the input arrays hold non-negative integers, you can save memory and improve performance by treating each row as a multi-index into an n-dimensional array, which creates the 2D mask without going through a 3D intermediate:
dims = np.maximum(A.max(0),D.max(0))+1
mask = np.ravel_multi_index(D.T,dims)[:,None] == np.ravel_multi_index(A.T,dims)
Sample run -
In [107]: A
Out[107]:
array([[1, 2, 3],
       [1, 2, 3],
       [4, 5, 6],
       [4, 5, 6]])
In [108]: B
Out[108]:
array([[9, 8, 7],
       [7, 6, 5],
       [1, 2, 3],
       [3, 4, 5]])
In [109]: mask = (D[:,None,:] == A).all(-1)
...: out = mask.dot(B)/(mask.sum(1)[:,None])
...:
In [110]: out
Out[110]:
array([[8, 7, 6],
       [2, 3, 4]])
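For reference, a self-contained version of the broadcasting approach, using the example data from the question and checked against a plain per-row computation. The variable expected is added here only as a cross-check. Under Python 3 the division is true division, so the result is a float array.

```python
import numpy as np

A = np.array([[1, 2, 3], [1, 2, 3], [4, 5, 6], [4, 5, 6]])
B = np.array([[9, 8, 7], [7, 6, 5], [1, 2, 3], [3, 4, 5]])
D = np.array([[1, 2, 3], [4, 5, 6]])

# D[:, None, :] has shape (C, 1, 3) and A has shape (Z, 3), so the comparison
# broadcasts to (C, Z, 3); all(-1) reduces it to mask[i, p] == (A[p] equals D[i]).
mask = (D[:, None, :] == A).all(-1)

# mask.dot(B) sums the rows of B selected by each mask row;
# dividing by the number of matches per row turns the sums into means.
out = mask.dot(B) / mask.sum(1)[:, None]

# Cross-check with an explicit per-row selection.
expected = np.array([B[(A == d).all(axis=1)].mean(axis=0) for d in D])

print(out)
# [[8. 7. 6.]
#  [2. 3. 4.]]
```

If some row of D matches no row of A, its count is zero and the division yields nan for that row, so such cases need separate handling.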
I see two hints:
First, comparing arrays by rows. A way to do that is to simplify your index system to 1D:
def indexer(M,base=256):
    return (M*base**np.arange(3)).sum(axis=1)
base is an integer > A.max(). Then the selection can be done like this:
indices=np.equal.outer(indexer(D),indexer(A))
which gives:
array([[ True,  True, False, False],
       [False, False,  True,  True]], dtype=bool)
Second, each group can have a different length, so vectorisation is difficult for the last step. Here is a way to do the job:
B=np.array(B)
means=[B[i].mean(axis=0) for i in indices]
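Put together as a runnable sketch with the example data from the question. It assumes every entry of A and D is a non-negative integer smaller than base, otherwise two different rows could map to the same code; using M.shape[1] instead of a fixed 3 is a small generalisation added here.

```python
import numpy as np

A = np.array([[1, 2, 3], [1, 2, 3], [4, 5, 6], [4, 5, 6]])
B = np.array([[9, 8, 7], [7, 6, 5], [1, 2, 3], [3, 4, 5]])
D = np.array([[1, 2, 3], [4, 5, 6]])

def indexer(M, base=256):
    # Encode each row as one integer: sum of M[:, k] * base**k.
    # Assumes 0 <= entry < base, so distinct rows get distinct codes.
    return (M * base ** np.arange(M.shape[1])).sum(axis=1)

# indices[i, p] is True when row p of A has the same code as row i of D.
indices = np.equal.outer(indexer(D), indexer(A))

# Groups may differ in size, so the means are taken one boolean row at a time.
means = [B[i].mean(axis=0) for i in indices]

print(indices)
print(means)
```

With this data each group has two members, giving the means [8, 7, 6] and [2, 3, 4].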

Numpy repeating a row or column

Suppose we have the matrix A:
A = [1,2,3
4,5,6
7,8,9]
I want to know if there is a way to obtain:
B = [1,2,3
4,5,6
7,8,9
7,8,9]
As well as:
B = [1,2,3,3
4,5,6,6
7,8,9,9]
This is because the function I want to implement is the following:
U(i,j) = min(A(i+1,j)^2, A(i,j)^2)
V(i,j) = min(A(i,j+1)^2, A(i,j)^2)
And numpy.minimum seems to need two arrays with equal shapes.
My idea is the following:
np.minimum(np.square(A[1:]), np.square(A[:]))
but it will fail.
For your particular example you could use numpy.hstack and numpy.vstack:
In [11]: np.vstack((A, A[-1]))
Out[11]:
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9],
       [7, 8, 9]])
In [12]: np.hstack((A, A[:, [-1]]))
Out[12]:
array([[1, 2, 3, 3],
       [4, 5, 6, 6],
       [7, 8, 9, 9]])
An alternative to the last one is np.hstack((A, np.atleast_2d(A[:,-1]).T)) or np.vstack((A.T, A.T[-1])).T: you can't hstack a (3,) array onto a (3,3) one without first putting its elements in the rows of a (3,1) array.
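Another option, not mentioned in the answer above, is np.pad with mode="edge", which repeats the border values. The sketch below uses it to build both padded arrays and then evaluates U and V as defined in the question; the names B_rows and B_cols are introduced here for illustration.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Pad one extra row at the bottom, repeating the last row.
B_rows = np.pad(A, ((0, 1), (0, 0)), mode="edge")

# Pad one extra column on the right, repeating the last column.
B_cols = np.pad(A, ((0, 0), (0, 1)), mode="edge")

# U(i,j) = min(A(i+1,j)^2, A(i,j)^2), with the last row compared to itself.
U = np.minimum(np.square(B_rows[1:]), np.square(A))

# V(i,j) = min(A(i,j+1)^2, A(i,j)^2), with the last column compared to itself.
V = np.minimum(np.square(B_cols[:, 1:]), np.square(A))

print(B_rows)
print(B_cols)
print(U.shape, V.shape)  # (3, 3) (3, 3)
```

Both U and V keep the shape of A, at the cost of the last row or column being compared with a copy of itself.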
A good answer to your literal question is provided by @xnx, but I wonder whether what you actually need is something else.
This type of question comes up a lot in comparisons, and the usual solution is to take only the valid entries, rather than using a misaligned comparison. That is, something like this is common:
import numpy as np
A = np.arange(9).reshape((3,3))
U = np.minimum(A[1:,:]**2, A[:-1,:]**2)
V = np.minimum(A[:,1:]**2, A[:,:-1]**2)
print(U)
# [[ 0  1  4]
#  [ 9 16 25]]
print(V)
# [[ 0  1]
#  [ 9 16]
#  [36 49]]
I suspect that you're probably thinking, "that's a hassle, now U and V have different shapes than A, which is not what I want". But, to this I'd say, "yes, it is a hassle, but it's better to deal with the problem up front and directly than hide it within an invalid row of an array."
A standard example and typical use case of this approach would be numpy.diff, where, "the shape of the output is the same as a except along axis where the dimension is smaller by n."
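A quick sketch of the numpy.diff behaviour quoted above, using the same 3x3 array as the previous example. The comparison with explicit slices is added here to show that diff follows the same "valid entries only" pattern.

```python
import numpy as np

A = np.arange(9).reshape((3, 3))

# Differences between consecutive rows: one fewer row than A.
d0 = np.diff(A, axis=0)

# Differences between consecutive columns: one fewer column than A.
d1 = np.diff(A, axis=1)

print(d0.shape)  # (2, 3)
print(d1.shape)  # (3, 2)

# The same results written with explicit slices.
print(np.array_equal(d0, A[1:, :] - A[:-1, :]))  # True
print(np.array_equal(d1, A[:, 1:] - A[:, :-1]))  # True
```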

How do you unroll a Numpy array of (mxn) dimensions into a single vector

I just want to know if there is a short cut to unrolling numpy arrays into a single vector. For instance (convert the following Matlab code to python):
Matlab way:
A = zeros(10,10) %
A_unroll = A(:) % <- How can I do this in python
Thanks in advance.
Is this what you have in mind?
Edit: As Patrick points out, one has to be careful with translating A(:) to Python.
Of course if you just want to flatten out a matrix or 2-D array of zeros it does not matter.
So here is a way to get behavior like matlab's.
>>> a = np.array([[1,2,3], [4,5,6]])
>>> a
array([[1, 2, 3],
       [4, 5, 6]])
>>> # one way to get Matlab behavior
... (a.T).ravel()
array([1, 4, 2, 5, 3, 6])
numpy.ravel does flatten a 2D array, but not in the same way Matlab's (:) does.
>>> import numpy as np
>>> a = np.array([[1,2,3], [4,5,6]])
>>> a
array([[1, 2, 3],
       [4, 5, 6]])
>>> a.ravel()
array([1, 2, 3, 4, 5, 6])
You have to be careful here, since ravel doesn't unravel the elements in the same way that Matlab does with A(:). In NumPy:
>>> a = np.array([[1,2,3], [4,5,6]])
>>> a.shape
(2, 3)
>>> a.ravel()
array([1, 2, 3, 4, 5, 6])
While in Matlab:
>> A = [1:3;4:6];
>> size(A)
ans =
2 3
>> A(:)
ans =
1
4
2
5
3
6
In Matlab, the elements are unraveled first down the columns, then by the rows. In Python it's the opposite. This has to do with the order that elements are stored in (C order by default in NumPy vs. Fortran order in Matlab).
Knowing that A(:) is equivalent to reshape(A,[numel(A),1]), you can get the same behaviour in Python with:
>>> a.reshape(a.size,order='F')
array([1, 4, 2, 5, 3, 6])
Note order='F' which refers to Fortran order (columns first unravelling).
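As a brief comparison, the sketch below collects the column-first variants discussed in these answers and checks that they agree. The last line, which builds an explicit (6, 1) column to mirror the shape of Matlab's A(:), is an addition for illustration.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

row_major = a.ravel()                         # default C order: rows first
col_major = a.ravel(order="F")                # Fortran order: columns first
via_reshape = a.reshape(a.size, order="F")    # the reshape form shown above
via_transpose = a.T.ravel()                   # transpose, then flatten row-wise

# A 2D column vector, closer in shape to what Matlab's A(:) returns.
column_vector = a.reshape((-1, 1), order="F")

print(row_major)            # [1 2 3 4 5 6]
print(col_major)            # [1 4 2 5 3 6]
print(column_vector.shape)  # (6, 1)
```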
