Combining multiple numpy arrays into one - python

My output consists of 100+ NumPy arrays. I'm trying to combine all of them into a single NumPy array for further data processing. Any ideas on how to get this done? I'm using Python 3.4.

You can use hstack and vstack:
In [29]: hstack([arange(10) for _ in range(10)])
Out[29]:
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2,
3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5,
6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8,
9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1,
2, 3, 4, 5, 6, 7, 8, 9])
In [30]: vstack([arange(10) for _ in range(10)])
Out[30]:
array([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]])
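If the 100+ arrays are already collected in a Python list, the same idea can be sketched with np.concatenate and np.stack. The list name `arrays` and its contents are stand-ins, since the question's actual data is not shown; this assumes all arrays are 1-D and of equal length.

```python
import numpy as np

# Hypothetical stand-in for the 100+ arrays from the question.
arrays = [np.arange(10) for _ in range(100)]

# Join end to end into one long 1-D array (what hstack does for 1-D input).
flat = np.concatenate(arrays)

# Put each input array on its own row (what vstack does for 1-D input).
table = np.stack(arrays)

print(flat.shape)   # (1000,)
print(table.shape)  # (100, 10)
```

Which one is appropriate depends on whether the later processing needs the arrays kept apart as rows or merged into one sequence.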

Related

How to create an array with arrays in one function

I am trying to create an output that will be an array that contains 5 "sub-arrays". Every array should include 10 random numbers between 0 and 10.
I have this code:
def count_tweets():
    big_array = []
    for i in range(5):
        array = []
        for p in range(10):
            array.append(random.randint(0,10))
        big_array.append(array)
        print(big_array)
I get a result like:
[[4, 2, 7, 1, 3, 2, 6, 9, 3, 10]]
[[4, 2, 7, 1, 3, 2, 6, 9, 3, 10], [5, 10, 7, 10, 7, 2, 1, 4, 8, 3]]
[[4, 2, 7, 1, 3, 2, 6, 9, 3, 10], [5, 10, 7, 10, 7, 2, 1, 4, 8, 3], [2, 7, 1, 3, 8, 5, 7, 6, 0, 0]]
[[4, 2, 7, 1, 3, 2, 6, 9, 3, 10], [5, 10, 7, 10, 7, 2, 1, 4, 8, 3], [2, 7, 1, 3, 8, 5, 7, 6, 0, 0], [0, 1, 9, 9, 4, 2, 10, 4, 3, 8]]
[[4, 2, 7, 1, 3, 2, 6, 9, 3, 10], [5, 10, 7, 10, 7, 2, 1, 4, 8, 3], [2, 7, 1, 3, 8, 5, 7, 6, 0, 0], [0, 1, 9, 9, 4, 2, 10, 4, 3, 8], [3, 7, 3, 5, 4, 0, 2, 8, 6, 2]]
But instead it should be like:
[[0,2,6,7,9,4,6,1,10,5],[1,3,5,9,8,7,6,9,0,10],[3,5,1,7,9,4,7,2,7,9],[10,2,8,5,6,9,2,3,5,9],[4,5,2,9,8,7,5,1,3,5]]
I cannot seem to get the indentation correct. How do I fix the code?
You put the print() statement inside the loop, so it prints on every iteration. Move it out of the loop:
import random
def count_tweets():
    big_array = []
    for i in range(5):
        array = []
        for p in range(10):
            array.append(random.randint(0,10))
        big_array.append(array)
    print(big_array)
count_tweets()
Hope this helps :)
You got it right; just move the print out of the for loop (delete the four spaces before print()).
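The same result can also be built with a nested list comprehension and returned instead of printed inside the function. This is only a sketch; the helper name `make_rows` and its parameters are made up for illustration and are not part of the original code.

```python
import random

def make_rows(n_rows=5, n_cols=10, low=0, high=10):
    """Return n_rows lists, each holding n_cols random integers in [low, high]."""
    return [[random.randint(low, high) for _ in range(n_cols)]
            for _ in range(n_rows)]

big_array = make_rows()
print(big_array)  # printed once, after all rows have been built
```

Returning the list keeps the printing decision with the caller, which avoids the indentation problem altogether.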

Append arrays of different dimensions to get a single array

I have three vectors (NumPy arrays), vector_1, vector_2 and vector_3,
as follows:
Dimension(vector1)=(200,2048)
Dimension(vector2)=(200,8192)
Dimension(vector3)=(200,32768)
I would like to append these vectors to get vector_4:
Dimension(vector4)= (200,2048+8192+32768)= (200, 43008)
That is, vector1 first, then vector2, then vector3.
I tried the following:
vector4=numpy.concatenate((vector1,vector2,vector3),axis=0)
ValueError: all the input array dimensions except for the concatenation axis must match exactly
and
vector4=numpy.append(vector4,[vector1,vector2,vectors3],axis=0)
TypeError: append() missing 1 required positional argument: 'values'
I believe you are looking for numpy.hstack.
>>> import numpy as np
>>> a = np.arange(4).reshape(2,2)
>>> b = np.arange(6).reshape(2,3)
>>> c = np.arange(8).reshape(2,4)
>>> a
array([[0, 1],
[2, 3]])
>>> b
array([[0, 1, 2],
[3, 4, 5]])
>>> c
array([[0, 1, 2, 3],
[4, 5, 6, 7]])
>>> np.hstack((a,b,c))
array([[0, 1, 0, 1, 2, 0, 1, 2, 3],
[2, 3, 3, 4, 5, 4, 5, 6, 7]])
The error message tells you exactly what the problem is:
ValueError: all the input array dimensions except for the concatenation axis must match exactly
You are doing the opposite: the sizes along the concatenation axis (axis 0) match exactly, but the sizes along the other axis don't. Consider:
In [3]: arr1 = np.random.randint(0,10,(20, 5))
In [4]: arr2 = np.random.randint(0,10,(20, 3))
In [5]: arr3 = np.random.randint(0,10,(20, 11))
Note the shapes: the arrays agree along the first axis and differ along the second. So concatenate along the second axis (axis=1) rather than the first:
In [8]: arr1.shape, arr2.shape, arr3.shape
Out[8]: ((20, 5), (20, 3), (20, 11))
In [9]: np.concatenate((arr1, arr2, arr3), axis=1)
Out[9]:
array([[3, 1, 4, 7, 3, 6, 1, 1, 6, 7, 4, 6, 8, 6, 2, 8, 2, 5, 0],
[4, 2, 2, 1, 7, 8, 0, 7, 2, 2, 3, 9, 8, 0, 7, 3, 5, 9, 6],
[2, 8, 9, 8, 5, 3, 5, 8, 5, 2, 4, 1, 2, 0, 3, 2, 9, 1, 0],
[6, 7, 3, 5, 6, 8, 3, 8, 4, 8, 1, 5, 4, 4, 6, 4, 0, 3, 4],
[3, 5, 8, 8, 7, 7, 4, 8, 7, 3, 8, 7, 0, 2, 8, 9, 1, 9, 0],
[5, 4, 8, 3, 7, 8, 3, 2, 7, 8, 2, 4, 8, 0, 6, 9, 2, 0, 3],
[0, 0, 1, 8, 6, 4, 4, 4, 2, 8, 4, 1, 4, 1, 3, 1, 5, 5, 1],
[1, 6, 3, 3, 9, 2, 3, 4, 9, 2, 6, 1, 4, 1, 5, 6, 0, 1, 9],
[4, 5, 4, 7, 1, 4, 0, 8, 8, 1, 6, 0, 4, 6, 3, 1, 2, 5, 2],
[6, 4, 3, 2, 9, 4, 1, 7, 7, 0, 0, 5, 9, 3, 7, 4, 5, 6, 1],
[7, 7, 0, 4, 1, 9, 9, 1, 0, 1, 8, 3, 6, 0, 5, 1, 4, 0, 7],
[7, 9, 0, 4, 0, 5, 5, 9, 8, 9, 9, 7, 8, 8, 2, 6, 2, 3, 1],
[4, 1, 6, 5, 4, 5, 6, 7, 9, 2, 5, 8, 6, 6, 6, 8, 2, 3, 1],
[7, 7, 8, 5, 0, 8, 5, 6, 4, 4, 3, 5, 9, 8, 7, 9, 8, 8, 1],
[3, 9, 3, 6, 3, 2, 2, 4, 0, 1, 0, 4, 3, 0, 1, 3, 4, 1, 3],
[5, 1, 9, 7, 1, 8, 3, 9, 4, 7, 6, 7, 4, 7, 0, 1, 2, 8, 7],
[6, 3, 8, 0, 6, 2, 1, 8, 1, 0, 0, 3, 7, 2, 1, 5, 7, 0, 7],
[5, 4, 7, 5, 5, 8, 3, 2, 6, 1, 0, 4, 6, 9, 7, 3, 9, 2, 5],
[1, 4, 8, 5, 7, 2, 0, 2, 6, 2, 6, 5, 5, 4, 6, 1, 8, 8, 1],
[4, 4, 5, 6, 2, 6, 0, 5, 1, 8, 4, 5, 8, 9, 2, 1, 0, 4, 2]])
In [10]: np.concatenate((arr1, arr2, arr3), axis=1).shape
Out[10]: (20, 19)
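Applied to the shapes from the question, this can be checked with placeholder arrays. The zero-filled arrays and the float32 dtype below are assumptions chosen only to keep memory use low; the real vectors would be used instead.

```python
import numpy as np

# Placeholder data with the shapes given in the question.
vector1 = np.zeros((200, 2048), dtype=np.float32)
vector2 = np.zeros((200, 8192), dtype=np.float32)
vector3 = np.zeros((200, 32768), dtype=np.float32)

# All three share 200 rows, so join them along the column axis.
vector4 = np.concatenate((vector1, vector2, vector3), axis=1)

print(vector4.shape)  # (200, 43008)
```

For 2-D inputs, np.hstack((vector1, vector2, vector3)) gives the same result.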

Numpy create index/slicing programmatically from array

I can use numpy.mgrid as follows:
a = numpy.mgrid[x0:x1, y0:y1] # 2 dimensional
b = numpy.mgrid[x0:x1, y0:y1, z0:z1] # 3 dimensional
Now, I'd like to create the expression in brackets programmatically, because I do not know whether I have 1, 2, 3 or more dimensions. I'm looking for something like:
shape = np.array([[x0, x1], [y0, y1], ... maybe more dimensions ...])
idx = (s[0]:s[1] for s in shape)
a = numpy.mgrid[idx]
That gives at least a syntax error in the second line. How can I properly generate those indices/slices programmatically? (The mgrid here is rather an example/use case, the question is really about indexing in general.)
Use the slice object. For example:
shape = np.array([[0, 10], [0, 10]])
idx = tuple(slice(s[0],s[1], 1) for s in shape)
#yields the following
#(slice(0, 10, 1), slice(0, 10, 1))
np.mgrid[idx]
yields
array([[[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
[2, 2, 2, 2, 2, 2, 2, 2, 2, 2],
[3, 3, 3, 3, 3, 3, 3, 3, 3, 3],
[4, 4, 4, 4, 4, 4, 4, 4, 4, 4],
[5, 5, 5, 5, 5, 5, 5, 5, 5, 5],
[6, 6, 6, 6, 6, 6, 6, 6, 6, 6],
[7, 7, 7, 7, 7, 7, 7, 7, 7, 7],
[8, 8, 8, 8, 8, 8, 8, 8, 8, 8],
[9, 9, 9, 9, 9, 9, 9, 9, 9, 9]],
[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]]])
Alternatively, you could use the NumPy shorthand np.s_, e.g. np.s_[0:10:1], instead of slice(0, 10, 1); the two produce equivalent objects.
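Since the question is about indexing in general, here is a small sketch showing that a tuple of slice objects works both with np.mgrid and with ordinary array indexing. The helper name `grid_from_bounds` and the sample bounds are made up for illustration.

```python
import numpy as np

def grid_from_bounds(bounds):
    """Build an np.mgrid result from (start, stop) pairs, one pair per dimension."""
    idx = tuple(slice(start, stop) for start, stop in bounds)
    return np.mgrid[idx]

g2 = grid_from_bounds([(0, 4), (2, 5)])          # 2 dimensions
g3 = grid_from_bounds([(0, 4), (2, 5), (0, 2)])  # 3 dimensions
print(g2.shape)  # (2, 4, 3)
print(g3.shape)  # (3, 4, 3, 2)

# The same kind of tuple indexes a regular array, like a[1:3, 2:5].
a = np.arange(24).reshape(4, 6)
idx = tuple(slice(lo, hi) for lo, hi in [(1, 3), (2, 5)])
sub = a[idx]
```

The number of dimensions is then decided by the length of the bounds list rather than by the literal text between the brackets.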

Adding the product of an element of a list to an existing list in Python

I am trying to take a list of lists (like shown below)
list = [[7, 3, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3],
[3, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0],
[1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6],
[6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6, 2],
[7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6, 2, 4]]
compute the product of all the elements of each list, and append the result back onto the original list.
So, for example, if I were to take the list I posted above, what I would want it to look like is this:
list_2 = [[5000940,[7, 3, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3]],
[0,[3, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0]],
[0,[1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6]],
[0,[6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6, 2]],
[0,[7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6, 2, 4]]]
The code that I have written so far takes in the list and computes the products, but unfortunately I can't seem to get them properly appended to the existing list. I was hoping someone would be able to show me how to do this.
for i in range(len(list)):
global products
products = []
list_prod = reduce(mul, list[i], 1)
#products.append(list_prod)
print products
Here's one way to do it:
from functools import reduce  # a built-in on Python 2; must be imported on Python 3
from operator import mul
from pprint import pprint
lst = [[7, 3, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3],
[3, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0],
[1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6],
[6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6, 2],
[7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6, 2, 4]]
lst[:] = map(lambda e: [reduce(mul, e, 1), e], lst)
pprint(lst)
from functools import reduce  # a built-in on Python 2; must be imported on Python 3
list = [[7, 3, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3],
[3, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0],
[1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6],
[6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6, 2],
[7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6, 2, 4]]
[[reduce(lambda x, y: x * y, line)] + line for line in list]
Gives me
[[5000940, 7, 3, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3],
[0, 3, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0],
[0, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6],
[0, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6, 2],
[0, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6, 2, 4]]
If you are like me and find list comprehensions difficult to read (especially when coming back to them later), you may find the code below useful. Additionally, avoid using "list" as a variable name, since it shadows the built-in list type.
num_list = [[7, 3, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3],
[3, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0],
[1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6],
[6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6, 2],
[7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6, 2, 4]]
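from functools import reduce  # a built-in on Python 2; must be imported on Python 3
num_list_fin = []
for num_item in num_list:
    # Start a new entry with the product, then attach the original list.
    num_item_u = [reduce(lambda x, y: x * y, num_item)]
    num_item_u.append(num_item)
    num_list_fin.append(num_item_u)
print(num_list_fin)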
This would give the output:
[[5000940, [7, 3, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3]], [0, [3, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0]], [0, [1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6]], [0, [6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6, 2]], [0, [7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6, 2, 4]]]
It may make things clearer if you use a helper function.
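from functools import reduce  # a built-in on Python 2; must be imported on Python 3
from operator import mul
def listprod(lst): return reduce(mul, lst, 1)
print(list(zip(map(listprod, mylist), mylist)))  # mylist is the list of lists from the question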
Change the tuples to lists if you really need that.
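On Python 3.8 or later, math.prod can replace the reduce call entirely. This is a sketch of that variant, not what any of the answers above used; the variable names `rows` and `paired` are illustrative.

```python
from math import prod  # available from Python 3.8

rows = [[7, 3, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3],
        [3, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0],
        [1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6],
        [6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6, 2],
        [7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6, 2, 4]]

# Pair each row with its product, in the [product, row] layout the question asks for.
paired = [[prod(row), row] for row in rows]

for product, row in paired:
    print(product, row)
```

Every row containing a 0 has product 0, so only the first row yields a non-zero value (5000940).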

Numpy Routine(s) to create a regular grid inside a 2d array

I am trying to write a function that would create a regular grid of 5 pixels by 5 pixels inside a 2d array. I was hoping some combination of numpy.arange and numpy.repeat might do it, but so far I haven't had any luck because numpy.repeat will just repeat along the same row.
Here is an example:
Let's say I want a 5x5 grid inside a 2d array of shape (20, 15). It should look like:
array([[ 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
[ 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
[ 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
[ 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
[ 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
[ 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5],
[ 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5],
[ 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5],
[ 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5],
[ 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5],
[ 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8],
[ 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8],
[ 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8],
[ 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8],
[ 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8],
[ 9, 9, 9, 9, 9,10,10,10,10,10,11,11,11,11,11],
[ 9, 9, 9, 9, 9,10,10,10,10,10,11,11,11,11,11],
[ 9, 9, 9, 9, 9,10,10,10,10,10,11,11,11,11,11],
[ 9, 9, 9, 9, 9,10,10,10,10,10,11,11,11,11,11],
[ 9, 9, 9, 9, 9,10,10,10,10,10,11,11,11,11,11]])
I realize I could simply use a loop and slicing to accomplish this, but I could be applying this to very large arrays and I worry that the performance of that would be too slow or impractical.
Can anyone recommend a method to accomplish this?
Thanks in advance.
UPDATE:
All the answers provided seem to work well. Can anyone tell me which will be the most efficient to use for large arrays? By large array I mean it could be 100000 x 100000 or more with 15 x 15 grid cell sizes.
Broadcasting is the answer here:
m, n, d = 20, 15, 5
arr = np.empty((m, n), dtype=int)  # np.int was removed in NumPy 1.24; plain int is equivalent
arr_view = arr.reshape(m // d, d, n // d, d)
vals = np.arange((m // d) * (n // d)).reshape(m // d, 1, n // d, 1)
arr_view[:] = vals
>>> arr
array([[ 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
[ 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
[ 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
[ 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
[ 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
[ 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5],
[ 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5],
[ 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5],
[ 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5],
[ 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5],
[ 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8],
[ 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8],
[ 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8],
[ 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8],
[ 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8],
[ 9, 9, 9, 9, 9, 10, 10, 10, 10, 10, 11, 11, 11, 11, 11],
[ 9, 9, 9, 9, 9, 10, 10, 10, 10, 10, 11, 11, 11, 11, 11],
[ 9, 9, 9, 9, 9, 10, 10, 10, 10, 10, 11, 11, 11, 11, 11],
[ 9, 9, 9, 9, 9, 10, 10, 10, 10, 10, 11, 11, 11, 11, 11],
[ 9, 9, 9, 9, 9, 10, 10, 10, 10, 10, 11, 11, 11, 11, 11]])
Similar to Jaime's answer:
np.repeat(np.arange(0, 10, 3), 5)[..., None] + np.repeat(np.arange(3), 5)[None, ...]
kron will do this expansion (as Brionius also suggested in the comments):
xi, xj, ni, nj = 5, 5, 4, 3
r = np.kron(np.arange(ni*nj).reshape((ni,nj)), np.ones((xi, xj)))
Although I haven't tested it, I assume it's less efficient than the broadcasting approach, but a bit more concise and easier to understand (I hope). It's likely less efficient because: 1) it requires the array of ones, 2) it does xi*xj multiplications by 1, and 3) it does a bunch of concats.
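To compare the approaches on your own data, the sketch below builds the (20, 15) example four ways and checks that they agree. The variable names are illustrative, and the double np.repeat variant at the end is an extra option not taken from the answers above. No timings are claimed here; timeit on realistic sizes would be the way to decide which is fastest.

```python
import numpy as np

m, n, d = 20, 15, 5            # array shape and cell size from the question
rows, cols = m // d, n // d    # number of grid cells along each axis

# 1) Broadcasting into a reshaped view of the output array.
a = np.empty((m, n), dtype=int)
a.reshape(rows, d, cols, d)[:] = np.arange(rows * cols).reshape(rows, 1, cols, 1)

# 2) Outer sum of a repeated row offset and a repeated column index.
b = (np.repeat(np.arange(rows) * cols, d)[:, None]
     + np.repeat(np.arange(cols), d)[None, :])

# 3) Kronecker product with an integer block of ones.
c = np.kron(np.arange(rows * cols).reshape(rows, cols),
            np.ones((d, d), dtype=int))

# 4) Repeat the small label array along each axis in turn.
e = np.arange(rows * cols).reshape(rows, cols).repeat(d, axis=0).repeat(d, axis=1)

print(np.array_equal(a, b), np.array_equal(a, c), np.array_equal(a, e))
```

Note that a 100000 x 100000 array of 8-byte integers needs about 80 GB, so for sizes like the one in the update a smaller dtype or a block-wise approach may matter more than the choice between these methods.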
