Combining multiple numpy arrays into one

Combining multiple numpy arrays into one - python

I have this O/P as shown in the below pic
My O/P consists of around 100+ numpy arrays like the one shown above. I'm trying to combine all these 100+ numpy arrays into an single numpy array for further data processing. Any ideas on how to get this done???? I'm using python V3.4

You can use hstack and vstack:
In [29]: hstack((arange(10) for _ in range(10)))
Out[29]:
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2,
3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5,
6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8,
9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1,
2, 3, 4, 5, 6, 7, 8, 9])
In [30]: vstack((arange(10) for _ in range(10)))
Out[30]:
array([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]])

Related

How to create an array with arrays in one function

I am trying to create an output that will be an array that contains 5 "sub-arrays". Every array should include 10 random numbers between 0 and 10.
I have this code:
def count_tweets():
big_array = []
for i in range(5):
array = []
for p in range(10):
array.append(random.randint(0,10))
big_array.append(array)
print(big_array)
I get a result like:
[[4, 2, 7, 1, 3, 2, 6, 9, 3, 10]]
[[4, 2, 7, 1, 3, 2, 6, 9, 3, 10], [5, 10, 7, 10, 7, 2, 1, 4, 8, 3]]
[[4, 2, 7, 1, 3, 2, 6, 9, 3, 10], [5, 10, 7, 10, 7, 2, 1, 4, 8, 3], [2, 7, 1, 3, 8, 5, 7, 6, 0, 0]]
[[4, 2, 7, 1, 3, 2, 6, 9, 3, 10], [5, 10, 7, 10, 7, 2, 1, 4, 8, 3], [2, 7, 1, 3, 8, 5, 7, 6, 0, 0], [0, 1, 9, 9, 4, 2, 10, 4, 3, 8]]
[[4, 2, 7, 1, 3, 2, 6, 9, 3, 10], [5, 10, 7, 10, 7, 2, 1, 4, 8, 3], [2, 7, 1, 3, 8, 5, 7, 6, 0, 0], [0, 1, 9, 9, 4, 2, 10, 4, 3, 8], [3, 7, 3, 5, 4, 0, 2, 8, 6, 2]]
But instead it should be like:
[[0,2,6,7,9,4,6,1,10,5],[1,3,5,9,8,7,6,9,0,10],[3,5,1,7,9,4,7,2,7,9],[10,2,8,5,6,9,2,3,5,9],[4,5,2,9,8,7,5,1,3,5]]
I cannot seem to get the indentation correct. How do I fix the code?

So what you did was put the print() statement inside a loop, which will print each time it runs.
import random
def count_tweets():
big_array = []
for i in range(5):
array = []
for p in range(10):
array.append(random.randint(0,10))
big_array.append(array)
print(big_array)
count_tweets()
Hope this helps :)

You got it right, just slide the print out of the for loop.(delete four spaces before print())

Append arrays of different dimensions to get a single array

l have three vectors (numpy arrays), vector_1, vector_2, vector_3
as follow :
Dimension(vector1)=(200,2048)
Dimension(vector2)=(200,8192)
Dimension(vector3)=(200,32768)
l would like to append these vectors to get vector_4 :
Dimension(vector4)= (200,2048+8192+32768)= (200, 43008)
Add respectively vector1 then vector2 then vector3
l tries the following :
vector4=numpy.concatenate((vector1,vector2,vector3),axis=0)
ValueError: all the input array dimensions except for the concatenation axis must match exactly
and
vector4=numpy.append(vector4,[vector1,vector2,vectors3],axis=0)
TypeError: append() missing 1 required positional argument: 'values'

I believe you are looking for numpy.hstack.
>>> import numpy as np
>>> a = np.arange(4).reshape(2,2)
>>> b = np.arange(6).reshape(2,3)
>>> c = np.arange(8).reshape(2,4)
>>> a
array([[0, 1],
[2, 3]])
>>> b
array([[0, 1, 2],
[3, 4, 5]])
>>> c
array([[0, 1, 2, 3],
[4, 5, 6, 7]])
>>> np.hstack((a,b,c))
array([[0, 1, 0, 1, 2, 0, 1, 2, 3],
[2, 3, 3, 4, 5, 4, 5, 6, 7]])

The error message is pretty much telling you exactly what is the problem:
ValueError: all the input array dimensions except for the concatenation axis must match exactly
But you are doing the opposite, the concatenation axis dimensions match exactly but the others don't. Consider:
In [3]: arr1 = np.random.randint(0,10,(20, 5))
In [4]: arr2 = np.random.randint(0,10,(20, 3))
In [5]: arr3 = np.random.randint(0,10,(20, 11))
Note the dimensions. Just give it the correct axis. So use the second rather than the first:
In [8]: arr1.shape, arr2.shape, arr3.shape
Out[8]: ((20, 5), (20, 3), (20, 11))
In [9]: np.concatenate((arr1, arr2, arr3), axis=1)
Out[9]:
array([[3, 1, 4, 7, 3, 6, 1, 1, 6, 7, 4, 6, 8, 6, 2, 8, 2, 5, 0],
[4, 2, 2, 1, 7, 8, 0, 7, 2, 2, 3, 9, 8, 0, 7, 3, 5, 9, 6],
[2, 8, 9, 8, 5, 3, 5, 8, 5, 2, 4, 1, 2, 0, 3, 2, 9, 1, 0],
[6, 7, 3, 5, 6, 8, 3, 8, 4, 8, 1, 5, 4, 4, 6, 4, 0, 3, 4],
[3, 5, 8, 8, 7, 7, 4, 8, 7, 3, 8, 7, 0, 2, 8, 9, 1, 9, 0],
[5, 4, 8, 3, 7, 8, 3, 2, 7, 8, 2, 4, 8, 0, 6, 9, 2, 0, 3],
[0, 0, 1, 8, 6, 4, 4, 4, 2, 8, 4, 1, 4, 1, 3, 1, 5, 5, 1],
[1, 6, 3, 3, 9, 2, 3, 4, 9, 2, 6, 1, 4, 1, 5, 6, 0, 1, 9],
[4, 5, 4, 7, 1, 4, 0, 8, 8, 1, 6, 0, 4, 6, 3, 1, 2, 5, 2],
[6, 4, 3, 2, 9, 4, 1, 7, 7, 0, 0, 5, 9, 3, 7, 4, 5, 6, 1],
[7, 7, 0, 4, 1, 9, 9, 1, 0, 1, 8, 3, 6, 0, 5, 1, 4, 0, 7],
[7, 9, 0, 4, 0, 5, 5, 9, 8, 9, 9, 7, 8, 8, 2, 6, 2, 3, 1],
[4, 1, 6, 5, 4, 5, 6, 7, 9, 2, 5, 8, 6, 6, 6, 8, 2, 3, 1],
[7, 7, 8, 5, 0, 8, 5, 6, 4, 4, 3, 5, 9, 8, 7, 9, 8, 8, 1],
[3, 9, 3, 6, 3, 2, 2, 4, 0, 1, 0, 4, 3, 0, 1, 3, 4, 1, 3],
[5, 1, 9, 7, 1, 8, 3, 9, 4, 7, 6, 7, 4, 7, 0, 1, 2, 8, 7],
[6, 3, 8, 0, 6, 2, 1, 8, 1, 0, 0, 3, 7, 2, 1, 5, 7, 0, 7],
[5, 4, 7, 5, 5, 8, 3, 2, 6, 1, 0, 4, 6, 9, 7, 3, 9, 2, 5],
[1, 4, 8, 5, 7, 2, 0, 2, 6, 2, 6, 5, 5, 4, 6, 1, 8, 8, 1],
[4, 4, 5, 6, 2, 6, 0, 5, 1, 8, 4, 5, 8, 9, 2, 1, 0, 4, 2]])
In [10]: np.concatenate((arr1, arr2, arr3), axis=1).shape
Out[10]: (20, 19)

Numpy create index/slicing programmatically from array

I can use numpy.mgrid as follows:
a = numpy.mgrid[x0:x1, y0:y1] # 2 dimensional
b = numpy.mgrid[x0:x1, y0:y1, z0:z1] # 3 dimensional
Now, I'd like to create the expression in brackets programmatically, because I do not know whether I have 1, 2, 3 or more dimensions. I'm looking for something like:
shape = np.array([[x0, x1], [y0, y1], ... maybe more dimensions ...])
idx = (s[0]:s[1] for s in shape)
a = numpy.mgrid[idx]
That gives at least a syntax error in the second line. How can I properly generate those indices/slices programmatically? (The mgrid here is rather an example/use case, the question is really about indexing in general.)

Use the slice object. For example:
shape = np.array([[0, 10], [0, 10]])
idx = tuple(slice(s[0],s[1], 1) for s in shape)
#yields the following
#(slice(0, 10, 1), slice(0, 10, 1))
np.mgrid[idx]
yields
array([[[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
[2, 2, 2, 2, 2, 2, 2, 2, 2, 2],
[3, 3, 3, 3, 3, 3, 3, 3, 3, 3],
[4, 4, 4, 4, 4, 4, 4, 4, 4, 4],
[5, 5, 5, 5, 5, 5, 5, 5, 5, 5],
[6, 6, 6, 6, 6, 6, 6, 6, 6, 6],
[7, 7, 7, 7, 7, 7, 7, 7, 7, 7],
[8, 8, 8, 8, 8, 8, 8, 8, 8, 8],
[9, 9, 9, 9, 9, 9, 9, 9, 9, 9]],
[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]]])
Alternatively, you could use the Numpy shorthand np.s_, e.g. np.s_[0:10:1], instead of slice(1, 10, 1), but they are equivalent objects.

Adding the product of an element of a list to an existing list in Python

I am trying to take a list of lists (like shown below)
list = [[7, 3, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3],
[3, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0],
[1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6],
[6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6, 2],
[7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6, 2, 4]]
compute the product of all the elements of each list, and append the result back onto the original list.
So, for example, if I were to take the list I posted above, what I would want it to look like is this:
list_2 = [[5000940,[7, 3, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3]],
[0,[3, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0]],
[0,[1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6]],
[0,[6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6, 2]],
[0,[7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6, 2, 4]]]
The code that I have written so far takes in the list, outputs the products, but unfortunately I can't seem to get it properly appended to the exiting list and I was hoping someone would be able to show me how to do this.
for i in range(len(list)):
global products
products = []
list_prod = reduce(mul, list[i], 1)
#products.append(list_prod)
print products

Here's one way to do it:
from operator import mul
from pprint import pprint
lst = [[7, 3, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3],
[3, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0],
[1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6],
[6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6, 2],
[7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6, 2, 4]]
lst[:] = map(lambda e: [reduce(mul, e, 1), e], lst)
pprint(lst)
Online Demo

list = [[7, 3, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3],
[3, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0],
[1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6],
[6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6, 2],
[7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6, 2, 4]]
[[reduce(lambda x, y: x * y, line)] + line for line in list]
Gives me
[[5000940, 7, 3, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3],
[0, 3, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0],
[0, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6],
[0, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6, 2],
[0, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6, 2, 4]]

If you are like me, and find list comprehensions difficult to read(Esp in the future). You may find the below code useful. Additionally, avoid using "list" as a name for the variable. As its a library function name.
num_list = [[7, 3, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3],
[3, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0],
[1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6],
[6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6, 2],
[7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6, 2, 4]]
num_list_fin = []
for num_item in num_list:
num_item_u = [reduce(lambda x,y: x*y, num_item)]
num_item_u.append(num_item)
num_list_fin.append(num_item_u)
print num_list_fin
This would give the output:
[[5000940, [7, 3, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3]], [0, [3, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0]], [0, [1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6]], [0, [6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6, 2]], [0, [7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6, 2, 4]]]

It may make things clearer if you use a helper function.
def listprod(lst): return reduce(mul, lst, 1)
print( zip(map(listprod, mylist),mylist) )
Change the tuples to lists if you really need that.

Numpy Routine(s) to create a regular grid inside a 2d array

I am trying to write a function that would create a regular grid of 5 pixels by 5 pixels inside a 2d array. I was hoping some combination of numpy.arange and numpy.repeat might do it, but so far I haven't had any luck because numpy.repeat will just repeat along the same row.
Here is an example:
Let's say I want a 5x5 grid inside a 2d array of shape (20, 15). It should look like:
array([[ 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
[ 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
[ 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
[ 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
[ 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
[ 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5],
[ 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5],
[ 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5],
[ 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5],
[ 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5],
[ 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8],
[ 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8],
[ 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8],
[ 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8],
[ 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8],
[ 9, 9, 9, 9, 9,10,10,10,10,10,11,11,11,11,11],
[ 9, 9, 9, 9, 9,10,10,10,10,10,11,11,11,11,11],
[ 9, 9, 9, 9, 9,10,10,10,10,10,11,11,11,11,11],
[ 9, 9, 9, 9, 9,10,10,10,10,10,11,11,11,11,11],
[ 9, 9, 9, 9, 9,10,10,10,10,10,11,11,11,11,11]])
I realize I could simply use a loop and slicing to accomplish this, but I could be applying this to very large arrays and I worry that the performance of that would be too slow or impractical.
Can anyone recommend a method to accomplish this?
Thanks in advance.
UPDATE:
All the answers provided seem to work well. Can anyone tell me which will be the most efficient to use for large arrays? By large array I mean it could be 100000 x 100000 or more with 15 x 15 grid cell sizes.

Broadcasting is the answer here:
m, n, d = 20, 15, 5
arr = np.empty((m, n), dtype=np.int)
arr_view = arr.reshape(m // d, d, n // d, d)
vals = np.arange(m // d * n // d).reshape(m // d, 1, n // d, 1)
arr_view[:] = vals
>>> arr
array([[ 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
[ 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
[ 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
[ 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
[ 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
[ 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5],
[ 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5],
[ 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5],
[ 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5],
[ 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5],
[ 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8],
[ 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8],
[ 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8],
[ 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8],
[ 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8],
[ 9, 9, 9, 9, 9, 10, 10, 10, 10, 10, 11, 11, 11, 11, 11],
[ 9, 9, 9, 9, 9, 10, 10, 10, 10, 10, 11, 11, 11, 11, 11],
[ 9, 9, 9, 9, 9, 10, 10, 10, 10, 10, 11, 11, 11, 11, 11],
[ 9, 9, 9, 9, 9, 10, 10, 10, 10, 10, 11, 11, 11, 11, 11],
[ 9, 9, 9, 9, 9, 10, 10, 10, 10, 10, 11, 11, 11, 11, 11]])

Similar to Jaime's answer:
np.repeat(np.arange(0, 10, 3), 4)[..., None] + np.repeat(np.arange(3), 5)[None, ...]

kron will do this expansion (as Brionius also suggested in the comments):
xi, xj, ni, nj = 5, 5, 4, 3
r = np.kron(np.arange(ni*nj).reshape((ni,nj)), np.ones((xi, xj)))
Although I haven't tested it, I assume it's less efficient than the broadcasting approach, but a bit more concise and easier to understand (I hope). It's likely less efficient because: 1) it requires the array of ones, 2) it does xi*xj multiplications by 1, and 3) it does a bunch of concats.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

Combining multiple numpy arrays into one - python

I have this O/P as shown in the below pic My O/P consists of around 100+ numpy arrays like the one shown above. I'm trying to combine all these 100+ numpy arrays into an single numpy array for further data processing. Any ideas on how to get this done???? I'm using python V3.4

Related

How to create an array with arrays in one function

Append arrays of different dimensions to get a single array

Numpy create index/slicing programmatically from array

Adding the product of an element of a list to an existing list in Python

Numpy Routine(s) to create a regular grid inside a 2d array

Categories

Resources