How to initialize a numpy array with lists - python

I want to create a np.array filled with lists. The array I want to create is a 2D one. Is there a way to create such an array, similar to np.zeros((10, 10)) but with lists instead of zeros?

If you wish to use a list of lists:
import numpy as np
l = [[1,2,3],[2,3,4],[3,4,5]]
np.array(l)
# array([[1, 2, 3],
# [2, 3, 4],
# [3, 4, 5]])
If you have multiple lists of the same length:
import numpy as np
l1 = [1,2,3]
l2 = [2,3,4]
l3 = [3,4,5]
np.array([l1, l2, l3])
# array([[1, 2, 3],
# [2, 3, 4],
# [3, 4, 5]])

You can pass the lists to np.array(). Fill each list with numbers, and make sure that every list (list 1, list 2, ...) contains the same number of elements.
arr = np.array([[list 1],
                [list 2],
                [list 3],
                ...
                [list n]])
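If the goal is literally an array whose elements are Python lists (the np.zeros((10, 10)) analogue from the question), one option is an object-dtype array. The following is a minimal sketch, assuming every cell should start as its own empty list; the names grid, rows and cols are only illustrative.

```python
import numpy as np

rows, cols = 3, 4  # illustrative shape; the question uses (10, 10)

# dtype=object lets each cell hold an arbitrary Python object.
grid = np.empty((rows, cols), dtype=object)

# Assign a fresh list per cell so that the cells do not share one list object.
for idx in np.ndindex(grid.shape):
    grid[idx] = []

grid[0, 1].append(42)
print(grid[0, 1], grid[0, 0])
```

Object arrays give up most of NumPy's vectorized speed, so a plain nested list may be just as suitable, depending on what the array is used for.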

If you want to generate a 2D array of random values, you can use np.random.random:
>>> np.random.random((5,5))
array([[0.72158455, 0.09803052, 0.1160546 , 0.55904644, 0.79821847],
[0.36929337, 0.15455486, 0.25862476, 0.44324732, 0.06120428],
[0.95063129, 0.38533428, 0.96552669, 0.07803165, 0.46604093],
[0.04999251, 0.8845952 , 0.8090841 , 0.64154241, 0.95548603],
[0.83991298, 0.85053047, 0.36522791, 0.89616194, 0.10960277]])
Or, to generate random values within a range, you can use np.random.uniform:
>>> np.random.uniform(low=0,high=10,size=(5,5))
array([[4.9572961 , 5.44408409, 6.74143596, 6.57745607, 5.90485241],
[7.37032096, 0.70533052, 2.93912528, 8.54091449, 7.6188883 ],
[8.27882354, 0.02749772, 6.45388547, 4.94197824, 9.29715119],
[6.72579011, 4.65019332, 4.67693981, 2.52006744, 8.3876697 ],
[8.99122563, 3.70552959, 2.50082311, 8.68846022, 6.34887673]])
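The same two calls can also be written with NumPy's newer Generator interface. This is only a sketch of an alternative, not what the answer above used; the seed value is an arbitrary assumption, added so the run is repeatable.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seed chosen arbitrarily for repeatability

unit = rng.random((5, 5))                          # floats in [0, 1)
scaled = rng.uniform(low=0, high=10, size=(5, 5))  # floats in [0, 10)
print(unit.shape, scaled.shape)
```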

Related

ordering an array based on values of another array

This question is probably basic for some of you, but I am new to Python. I have an initial array:
initial_array = np.array ([1, 6, 3, 4])
I have another array.
value_array= np.array ([10, 2, 3, 15])
I want an array called output_array that reorders initial_array according to the values in value_array, from largest to smallest.
My result should look like this:
output_array = np.array ([4, 1, 3, 6])
Does anyone know if this is possible to do in Python?
So far I have tried:
for i in range(4):
    pass  # find position of element
You can use numpy.argsort to get the indices that sort value_array, then index initial_array with those indices in reverse order ([::-1]) to get the descending arrangement.
>>> idx_sort = value_array.argsort()
>>> initial_array[idx_sort[::-1]]
array([4, 1, 3, 6])
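As a self-contained sketch of the argsort approach, using the arrays from the question (the names order, ascending and descending are only illustrative):

```python
import numpy as np

initial_array = np.array([1, 6, 3, 4])
value_array = np.array([10, 2, 3, 15])

order = np.argsort(value_array)          # indices that sort value_array ascending
ascending = initial_array[order]         # array([6, 3, 1, 4])
descending = initial_array[order[::-1]]  # array([4, 1, 3, 6])
print(ascending, descending)
```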
You could use np.stack to put the arrays together, which effectively adds value_array as a second column next to initial_array, and then sort the rows by that column.
import numpy as np
initial_array = np.array ([1, 6, 3, 4])
value_array = np.array ([10, 2, 3, 15])
output_array = np.stack((initial_array, value_array), axis=1)
output_array=output_array[output_array[:, 1].argsort()][::-1]
print (output_array)
The [::-1] part gives descending order; remove it to get ascending order.
I am assuming initial_array and value_array have the same length.
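Note that the stacked result has two columns (the initial values and the values they were sorted by). A short sketch of how the one-dimensional result from the question could be recovered, assuming only the first column is wanted; the name pairs is illustrative:

```python
import numpy as np

initial_array = np.array([1, 6, 3, 4])
value_array = np.array([10, 2, 3, 15])

pairs = np.stack((initial_array, value_array), axis=1)  # shape (4, 2)
pairs = pairs[pairs[:, 1].argsort()][::-1]              # rows sorted by value, descending
output_array = pairs[:, 0]                              # array([4, 1, 3, 6])
print(output_array)
```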

String data in list to numeric values in python | unhashable type: 'list'

I have a list as follows, it's a huge list, this is just a chunk of it.
my_list= [['I. R. Palmer','U. Kersten'],
['H. Breitwieser', 'U. Kersten'],
['Halvard Skogsrud', 'Boualem Benatallah', 'Fabio Casati', 'Manh Q. Dinh'],
['Stefano Ceri', 'Piero Fraternali', 'Stefano Paraboschi']]
I want to assign each string in the list a unique numeric value. If a string appears again somewhere else, it should get the same value it was given before:
new_list= [[0,1],
[2,1],
[3,4,5,6],
[7,8,9]]
I have tried
pd.factorize(my_list)
but I am getting
unhashable type: 'list'
You can flatten the list, call factorize on the resulting 1D sequence, build a dict with zip, and map the values back in a nested list comprehension:
a = [y for x in my_list for y in x]
f1, f2 = pd.factorize(a)
d = dict(zip(f2[f1], f1))
new_list = [[d[y] for y in x] for x in my_list]
print (new_list)
[[0, 1], [2, 1], [3, 4, 5, 6], [7, 8, 9]]
pandas.factorize operates on a one-dimensional sequence, but you have a 2D sequence. And since your 2D sequence isn't a regular shape (each internal list has a different length) you can't work around that by reshaping. The error you are seeing is because pandas is trying to treat the internal lists as the categories rather than the strings inside the internal lists.
You could build the result yourself:
authors_map = {} # I'm just guessing that they're authors
next_id = 0
new_list = []
for authors in my_list:
    new_authors = []
    for author in authors:
        if author not in authors_map:
            authors_map[author] = next_id
            next_id += 1
        new_authors.append(authors_map[author])
    new_list.append(new_authors)
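The same bookkeeping can be written more compactly with dict.setdefault. This is a sketch of a variant, not the answerer's code; the dict name ids is hypothetical.

```python
my_list = [['I. R. Palmer', 'U. Kersten'],
           ['H. Breitwieser', 'U. Kersten'],
           ['Halvard Skogsrud', 'Boualem Benatallah', 'Fabio Casati', 'Manh Q. Dinh'],
           ['Stefano Ceri', 'Piero Fraternali', 'Stefano Paraboschi']]

ids = {}
# setdefault stores len(ids) only when the name is new, so ids grow 0, 1, 2, ...
new_list = [[ids.setdefault(name, len(ids)) for name in names] for names in my_list]
print(new_list)
```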
pd.factorize only accepts a 1-D sequence (see the documentation).
You can use np.concatenate to flatten the list into a 1D array:
import numpy as np
print(np.concatenate(my_list))
# array(['I. R. Palmer', 'U. Kersten', 'H. Breitwieser', 'U. Kersten',
# 'Halvard Skogsrud', 'Boualem Benatallah', 'Fabio Casati',
# 'Manh Q. Dinh', 'Stefano Ceri', 'Piero Fraternali',
# 'Stefano Paraboschi'], dtype='<U18')
print(pd.factorize(np.concatenate(my_list)))
Output:
(array([0, 1, 2, 1, 3, 4, 5, 6, 7, 8, 9], dtype=int64),
array(['I. R. Palmer', 'U. Kersten', 'H. Breitwieser', 'Halvard Skogsrud',
'Boualem Benatallah', 'Fabio Casati', 'Manh Q. Dinh',
'Stefano Ceri', 'Piero Fraternali', 'Stefano Paraboschi'],
dtype=object))
factorize + concatenate + cumsum + array_split
pd.factorize works by hashing. But the elements of your outer list are themselves lists, which aren't hashable. In any case, you want to factorize the individual strings, not the lists that contain them.
Instead, you can factorize a flattened list and use an array of indices for splitting:
import pandas as pd
import numpy as np
flattened = np.concatenate(my_list)
idx_split = np.array(list(map(len, my_list))).cumsum()[:-1]
res = [i.tolist() for i in np.array_split(pd.factorize(flattened)[0], idx_split)]
print(res)
[[0, 1], [2, 1], [3, 4, 5, 6], [7, 8, 9]]

Make member vectors out of tuples in python

I have a list of tuples/lists.
Example:
a = [[1,2], [2,4], [3,6]]
Given that all sub-lists are the same length, I want to split them and get one list/vector per member.
Or, as a single list: [[1,2,3],[2,4,6]]
Any solution using numpy or plain lists would be appreciated.
I have not found a way to do this pythonically or efficiently with anything other than loops:
def vectorise_pairs(pairs):
    return [[p[0] for p in pairs],
            [p[1] for p in pairs]
           ]
Is there a better way to do this?
first, second = zip(*a)
print(first, second)
outputs
(1, 2, 3) (2, 4, 6)
If you need lists or numpy arrays you can convert them:
first, second = list(first), list(second)
first, second = np.array(first), np.array(second)
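When the sub-lists have more than two members, unpacking into named variables becomes awkward. A small sketch that keeps all members in one list instead (the name columns is illustrative):

```python
a = [[1, 2], [2, 4], [3, 6]]

# zip(*a) yields one tuple per member position; convert each to a list.
columns = [list(col) for col in zip(*a)]
print(columns)  # [[1, 2, 3], [2, 4, 6]]
```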
Since you tagged numpy, my_array.T transposes my_array.
>>> import numpy as np
>>> a = [[1,2], [2,4], [3,6]]
>>> np.array(a).T
array([[1, 2, 3],
[2, 4, 6]])
Alternatively, you can use np.transpose (which even accepts lists).
>>> np.transpose(a)
array([[1, 2, 3],
[2, 4, 6]])
Alex's solution works well as a general transposition of any Python iterable. If you have some reason to specifically want to use Numpy, you could also use the following:
import numpy as np
a = np.array([[1,2], [2,4], [3,6]])
first, second = a.T
# OR,
first = a[:, 0]
second = a[:, 1] # etc.
Directly from the official documentation (https://docs.python.org/2/tutorial/datastructures.html#nested-list-comprehensions):
a = [[1,2], [2,4], [3,6]]
[[row[i] for row in a] for i in range(len(a[0]))]
#=> [[1, 2, 3], [2, 4, 6]]

Python: using list to index [from:to] arbitrary numpy arrays

I want to copy a chunk from a matrix into a piece of another matrix.
To use this with any kind of n-dimensional array, I need to apply a list with offsets via the [] operator. Is there a way to do this?
mat_bigger[0:5, 0:5, ..] = mat_smaller[2:7, 2:7, ..]
like:
off_min = [0,0,0]
off_max = [2,2,2]
for i in range(len(off_min)):
    mat_bigger[off_min[i] : off_max[i], ..] = ..
You can do this by creating a tuple of slice objects. For example:
import numpy as np
mat_big = np.zeros((4, 5, 6))
mat_small = np.random.rand(2, 2, 2)
off_min = [2, 3, 4]
off_max = [4, 5, 6]
slices = tuple(slice(start, end) for start, end in zip(off_min, off_max))
mat_big[slices] = mat_small
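The same idea can be applied to both sides of the assignment, which matches the mat_bigger[0:5, 0:5, ..] = mat_smaller[2:7, 2:7, ..] pattern from the question. A sketch under the assumption that the source and destination regions have equal shapes; make_slices is a hypothetical helper and the array shapes are chosen only for illustration.

```python
import numpy as np

def make_slices(off_min, off_max):
    """Build one slice per axis from lists of start/stop offsets."""
    return tuple(slice(lo, hi) for lo, hi in zip(off_min, off_max))

mat_bigger = np.zeros((6, 6, 6))
mat_smaller = np.arange(8 ** 3).reshape(8, 8, 8)

dst = make_slices([0, 0, 0], [5, 5, 5])  # region [0:5, 0:5, 0:5]
src = make_slices([2, 2, 2], [7, 7, 7])  # region [2:7, 2:7, 2:7]
mat_bigger[dst] = mat_smaller[src]
```

Because the tuple length follows the length of the offset lists, the helper works for any number of dimensions.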

Insert a list, element by element, into the elements of an array of arrays

I have an array of arrays:
parameters = [np.array([ 2.1e-04, -8.3e-03, 9.8e-01]), np.array([ 5.5e-04, 1.2e-01, 9.9e-01]), ...]
whose length is:
print(len(parameters))
100
If we label the elements of parameters as parameters[i][j], each number can be accessed directly; e.g. print(parameters[1][2]) gives 0.99.
I also have an array:
temperatures = [110.51, 1618.079, ...]
whose length is also 100:
print(len(temperatures))
100
Let k index the elements of temperatures.
I would like to append the kth element of temperatures to the corresponding (kth) array in parameters, in order to obtain final:
final = [np.array([ 2.1e-04, -8.3e-03, 9.8e-01, 110.51]), np.array([ 5.5e-04, 1.2e-01, 9.9e-01, 1618.079]), ...]
I have tried to make something like a zip loop:
for i,j in zip(parameters, valid_temperatures):
    final = parameters[2][i].append(valid_temperatures[j])
but this does not work. I would appreciate it if you could help me.
EDIT: Based on @hpaulj's answer:
If you run Solution 1:
parameters = [np.array([ 2.1e-04, -8.3e-03, 9.8e-01]), np.array([ 5.5e-04, 1.2e-01, 9.9e-01])]
temperatures = [110.51, 1618.079]
for i,(arr,t) in enumerate(zip(parameters,temperatures)):
    parameters[i] = np.append(arr,t)
print(parameters)
It gives:
[array([ 2.10000000e-04, -8.30000000e-03, 9.80000000e-01,
1.10510000e+02]), array([ 5.50000000e-04, 1.20000000e-01, 9.90000000e-01,
1.61807900e+03])]
which is the desired output.
In addition, Solution 2:
parameters = [np.array([ 2.1e-04, -8.3e-03, 9.8e-01]), np.array([ 5.5e-04, 1.2e-01, 9.9e-01])]
temperatures = [110.51, 1618.079]
parameters = [np.append(arr,t) for arr, t in zip(parameters,temperatures)]
print(parameters)
also gives the desired output.
As opposed to Solution 1, Solution 2 doesn't use the ith enumerate index. Therefore, if I just split Solution 2's [np.append ... for arr ] syntax the following way:
parameters = [np.array([ 2.1e-04, -8.3e-03, 9.8e-01]), np.array([ 5.5e-04, 1.2e-01, 9.9e-01])]
temperatures = [110.51, 1618.079]
for arr, t in zip(parameters,temperatures):
    parameters = np.append(arr,t)
print(parameters)
The output contains only the last iteration, and not in an "array-format":
[ 5.50000000e-04 1.20000000e-01 9.90000000e-01 1.61807900e+03]
How can I make this work so that the result contains all the iterations?
Thanks
You have a list of arrays, plus another list or array:
In [656]: parameters = [np.array([1,2,3]) for _ in range(5)]
In [657]: temps=np.arange(5)
To combine them, just iterate through the pairs (a list comprehension works fine for that) and perform a concatenate (array append) for each pair.
In [659]: [np.concatenate((arr,[t])) for arr, t in zip(parameters, temps)]
Out[659]:
[array([1, 2, 3, 0]),
array([1, 2, 3, 1]),
array([1, 2, 3, 2]),
array([1, 2, 3, 3]),
array([1, 2, 3, 4])]
Using np.append saves us two pairs of [] brackets; otherwise it is the same:
[np.append(arr,t) for arr, t in zip(parameters,temps)]
A clean 'in-place' version:
for i,(arr,t) in enumerate(zip(parameters,temps)):
    parameters[i] = np.append(arr,t)
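The plain for loop from the question's edit loses earlier iterations because it rebinds parameters on every pass. One way to keep all of them, sketched here with the question's two-element data, is to append each result to a separate list (the name final is taken from the question's desired output):

```python
import numpy as np

parameters = [np.array([2.1e-04, -8.3e-03, 9.8e-01]),
              np.array([5.5e-04, 1.2e-01, 9.9e-01])]
temperatures = [110.51, 1618.079]

final = []  # accumulates one extended array per iteration
for arr, t in zip(parameters, temperatures):
    final.append(np.append(arr, t))  # np.append returns a new array
print(final)
```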
================
If the subarrays are all the same length, you could turn parameters into a 2d array, and concatenate the temps:
In [663]: np.hstack((np.vstack(parameters),temps[:,None]))
Out[663]:
array([[1, 2, 3, 0],
[1, 2, 3, 1],
[1, 2, 3, 2],
[1, 2, 3, 3],
[1, 2, 3, 4]])
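An equivalent way to write the 2d version is np.column_stack, which turns a 1-D input into a column automatically. A sketch using the question's data, again assuming all subarrays have the same length:

```python
import numpy as np

parameters = [np.array([2.1e-04, -8.3e-03, 9.8e-01]),
              np.array([5.5e-04, 1.2e-01, 9.9e-01])]
temperatures = [110.51, 1618.079]

# vstack builds a (2, 3) array; column_stack appends temperatures as a 4th column.
final = np.column_stack((np.vstack(parameters), temperatures))
print(final.shape)
```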
