Convert numpy.ndarray to a list - Python

I'm trying to convert this numpy.ndarray to a list:
[[105.53518731]
[106.45317529]
[107.37373843]
[108.00632646]
[108.56373502]
[109.28813113]
[109.75593207]
[110.57458371]
[111.47960639]]
I'm using tolist() to convert it:
conver = conver.tolist()
The output is below. I'm not sure whether it's actually a list, and if so, whether I can access its elements with conver[0], etc.
[[105.5351873125], [106.45317529411764], [107.37373843478261], [108.00632645652173], [108.56373502040816], [109.28813113157895], [109.75593206666666], [110.57458370833334], [111.47960639393939]]
Finally, after converting it to a list, I try to multiply the list members by 1.05 and get this error:
TypeError: can't multiply sequence by non-int of type 'float'
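For reference, a minimal sketch (with made-up values) that reproduces the same error: a plain Python list only supports repetition by an integer, not element-wise multiplication by a float.
values = [[105.5], [106.4]]   # shape of what tolist() returns for an (n, 1) array
values * 2                    # list repetition, not arithmetic: [[105.5], [106.4], [105.5], [106.4]]
# values * 1.05               # TypeError: can't multiply sequence by non-int of type 'float'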

You start with a 2d array, with shape (n,1), like this:
In [342]: arr = np.random.rand(5,1)*100
In [343]: arr
Out[343]:
array([[95.39049043],
       [19.09502087],
       [85.45215423],
       [94.77657561],
       [32.7869103 ]])
tolist produces a list - but it contains lists; each [] layer denotes a list. Notice that the [] nesting matches the array's:
In [344]: arr.tolist()
Out[344]:
[[95.39049043424225],
[19.095020872584335],
[85.4521542296349],
[94.77657561477125],
[32.786910295446425]]
To get a number you have to index through each list layer:
In [345]: arr.tolist()[0]
Out[345]: [95.39049043424225]
In [346]: arr.tolist()[0][0]
Out[346]: 95.39049043424225
In [347]: arr.tolist()[0][0]*1.05
Out[347]: 100.16001495595437
If you first turn the array into a 1d one, the list indexing is simpler:
In [348]: arr.ravel()
Out[348]: array([95.39049043, 19.09502087, 85.45215423, 94.77657561, 32.7869103 ])
In [349]: arr.ravel().tolist()
Out[349]:
[95.39049043424225,
19.095020872584335,
85.4521542296349,
94.77657561477125,
32.786910295446425]
In [350]: arr.ravel().tolist()[0]
Out[350]: 95.39049043424225
But if your primary goal is to multiply the elements, doing it with the array is simpler:
In [351]: arr * 1.05
Out[351]:
array([[100.16001496],
       [ 20.04977192],
       [ 89.72476194],
       [ 99.5154044 ],
       [ 34.42625581]])
You can access elements of the array with:
In [352]: arr[0,0]
Out[352]: 95.39049043424225
But if you do need to iterate, the tolist() option is good to know. Iterating on lists is usually faster than iterating on an array. With an array you should try to use the fast whole-array methods.
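To illustrate that last point, here is a rough timing sketch (the setup is assumed; exact numbers depend on your machine and array size): element-by-element loops over a list beat loops over an array, but a whole-array operation beats both.
import numpy as np
from timeit import timeit

arr = np.random.rand(10_000, 1)
lst = arr.ravel().tolist()

t_list  = timeit(lambda: [v * 1.05 for v in lst], number=100)          # loop over a list
t_array = timeit(lambda: [v * 1.05 for v in arr.ravel()], number=100)  # loop over an array
t_whole = timeit(lambda: arr * 1.05, number=100)                       # whole-array method

print(t_list, t_array, t_whole)   # typically t_whole < t_list < t_array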

You converted it to a list of lists, so you can't broadcast. Stack it back into a flat array first:
import numpy as np
x = [[105.53518731],
[106.45317529],
[107.37373843],
[108.00632646],
[108.56373502],
[109.28813113],
[109.75593207],
[110.57458371],
[111.47960639],]
x = np.hstack(x)
x * 1.05
array([110.81194668, 111.77583405, 112.74242535, 113.40664278,
       113.99192177, 114.75253769, 115.24372867, 116.1033129 ,
       117.05358671])
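A small variation on the same idea (not part of the answer above): build the array first and flatten it with ravel, which gives the same 1d result.
import numpy as np

x = [[105.53518731], [106.45317529], [107.37373843]]   # truncated sample of the data

arr = np.array(x).ravel()   # shape (3,) instead of (3, 1)
print(arr * 1.05)           # broadcasting now works element-wise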

Yes, it's a list. You can check the type of a variable with:
type(a)
To multiply each element by 1.05, run the code below:
x = [float(i[0]) * 1.05 for i in a]
print(x)
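Put together as a runnable sketch with a few of the values from the question (a stands in for the converted list):
a = [[105.53518731], [106.45317529], [107.37373843]]   # as returned by tolist()
x = [float(i[0]) * 1.05 for i in a]
print(x)   # [110.811..., 111.775..., 112.742...]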

Try this:
import numpy as np
a = [[105.53518731],
[106.45317529],
[107.37373843],
[108.00632646],
[108.56373502],
[109.28813113],
[109.75593207],
[110.57458371],
[111.47960639]]
b = [elem[0] for elem in a]
b = np.array(b)
print(b*1.05)

Related

numpy sum of each array in a list of arrays of different size

Given a list of numpy arrays, each of a different length, like the one obtained from lst = np.array_split(arr, indices), how do I get the sum of every array in the list? (I know how to do it using a list comprehension, but I was hoping there was a pure-numpy way to do it.)
I thought that this would work:
np.apply_along_axis(lambda arr: arr.sum(), axis=0, arr=lst)
But it doesn't, instead it gives me this error which I don't understand:
ValueError: operands could not be broadcast together with shapes (0,) (12,)
NB: It's an array of sympy objects.
There's a faster way which avoids np.split and uses np.add.reduceat. We create an ascending array of the indices at which each sum should start with np.append([0], np.cumsum(indices)[:-1]). For proper indexing we need to put a zero in front (and discard the last element if it covers the full range of the original array; otherwise just delete the [:-1] indexing). Then we use the np.add ufunc with its reduceat method:
import numpy as np
arr = np.arange(1, 11)
indices = np.array([2, 4, 4])
# this should split like this
# [1 2 | 3 4 5 6 | 7 8 9 10]
np.add.reduceat(arr, np.append([0], np.cumsum(indices)[:-1]))
# array([ 3, 18, 34])
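For comparison, a sketch of the list-comprehension route the question mentions, using np.split on the same cumulative indices; it gives the same sums but loops in Python rather than in one ufunc call.
import numpy as np

arr = np.arange(1, 11)
indices = np.array([2, 4, 4])

splits = np.split(arr, np.cumsum(indices)[:-1])   # [1 2], [3 4 5 6], [7 8 9 10]
print([s.sum() for s in splits])                  # [3, 18, 34]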

How to broadcast sum of list of list?

How can I broadcast the sum of a list of lists in an efficient way?
Below is working code, but it's not very efficient when list1 has many more elements, say 30.
Any improvement on this?
from numpy import sum
import numpy as np
list1 = [[4,8],[8,16]]
list2 = [2]
elemSum=[sum(list1[0]),sum(list1[1])]
print((np.array(elemSum)/np.array(list2)))
prints:
[ 6. 12.] # expected output
I want a single line like the one below, eliminating the elemSum variable, but it yields incorrect output since it sums everything down to one value:
print(sum(np.array(list1)/np.array(list2)))
prints:
18.0 # not expected; it sums everything down to one value
numpy.sum takes an optional axis argument, which can be used for partial sums along a single axis:
>>> list1 = np.array([[4,8],[8,16]])
>>> list2 = np.array([2])
>>> np.sum(list1)
36
>>> np.sum(list1, axis=1)
array([12, 24])
>>> np.sum(list1, axis=1) / list2
array([ 6., 12.])
Just use numpy the entire time, don't mess with lists if you want arrays:
list1 = [[4,8],[8,16]]
list2 = [2]
import numpy as np
arr1 = np.array(list1)
arr2 = np.array(list2)
Then simply:
result = arr1.sum(axis=1) / arr2
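A runnable sketch combining both steps, with an extra row added to show that it scales beyond the 2x2 example:
import numpy as np

list1 = [[4, 8], [8, 16], [3, 9]]   # any number of inner lists works
list2 = [2]

result = np.array(list1).sum(axis=1) / np.array(list2)
print(result)   # [ 6. 12.  6.]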

Python - Easy and Fast Way to Divide Values in a 2D List by a Single Value

For example I have a 2D list of 2x2 dimension stored in a variable r.
12 24
36 48
I want to divide every value in the list by 2. An easy slow way to do this is to iterate through each row and column and divide it by 2.
rows = 2
columns = 2
for x in range(rows):
    for y in range(columns):
        r[x][y] = r[x][y] / 2
Is there an easy and fast way to divide each value in a 2D list by a specific value, other than the manual approach? I tried the code below but it throws an error:
s = r /2
Error:
TypeError: unsupported operand type(s) for /: 'list' and 'int'
You can use the numpy library to achieve the result you want. It uses vectorization, so it's the fastest way to do the operation.
Try this:
import numpy as np
arr = np.array(r) # --> initialize with 2d array
arr = arr / 2 # --> Now each element in the 2d array gets divided by 2
For Example:
arr = np.array([[1, 2], [2, 3]]) # --> initialize with 2d array
arr = arr / 2 # --> divide by the scalar
print(arr)
Output:
[[0.5 1. ]
[1. 1.5]]
Looks like there is even a section in official documentation about nested list comprehensions:
https://docs.python.org/3/tutorial/datastructures.html#nested-list-comprehensions
i.e.
>>> r = [[12,24],[36,48]]
>>> [[i/2 for i in a] for a in r]
[[6.0, 12.0], [18.0, 24.0]]
import numpy as np
x = np.array(list_2d)   # list_2d is your 2D list (a Python name can't start with a digit)
x = x / 2

Scale data error: "ValueError: setting an array element with a sequence." [duplicate]

Why do the following code samples:
np.array([[1, 2], [2, 3, 4]])
np.array([1.2, "abc"], dtype=float)
...all give the following error?
ValueError: setting an array element with a sequence.
Possible reason 1: trying to create a jagged array
You may be creating an array from a list that isn't shaped like a multi-dimensional array:
numpy.array([[1, 2], [2, 3, 4]]) # wrong!
numpy.array([[1, 2], [2, [3, 4]]]) # wrong!
In these examples, the argument to numpy.array contains sequences of different lengths. Those will yield this error message because the input list is not shaped like a "box" that can be turned into a multidimensional array.
Possible reason 2: providing elements of incompatible types
For example, providing a string as an element in an array of type float:
numpy.array([1.2, "abc"], dtype=float) # wrong!
If you really want to have a NumPy array containing both strings and floats, you could use the dtype object, which allows the array to hold arbitrary Python objects:
numpy.array([1.2, "abc"], dtype=object)
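A quick sketch of what the object-dtype array looks like and the trade-off it carries: arithmetic falls back to Python-level operations on each element instead of fast numeric vectorization.
import numpy as np

mixed = np.array([1.2, "abc"], dtype=object)
print(mixed.dtype)   # object
print(mixed * 2)     # [2.4 'abcabc'] -- each element handled as a Python object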
The Python ValueError:
ValueError: setting an array element with a sequence.
means exactly what it says: you're trying to cram a sequence of numbers into a single number slot. It can be thrown under various circumstances.
1. When you pass a python tuple or list to be interpreted as a numpy array element:
import numpy
numpy.array([1, 2, 3])              # good
numpy.array([1, (2, 3)])            # fail, can't convert a tuple into a numpy array element

numpy.mean([5, (6 + 7)])            # good
numpy.mean([5, tuple(range(2))])    # fail, can't convert a tuple into a numpy array element

def foo():
    return 3
numpy.array([2, foo()])             # good

def foo():
    return [3, 4]
numpy.array([2, foo()])             # fail, can't convert a list into a numpy array element
2. When you try to cram a numpy array of length > 1 into a numpy array element:
x = np.array([1, 2, 3])
x[0] = np.array([4])      # good
x = np.array([1, 2, 3])
x[0] = np.array([4, 5])   # fail, can't fit a multi-element numpy array into a single array element
A numpy array is being created, and numpy doesn't know how to cram multivalued tuples or arrays into single element slots. It expects whatever you give it to evaluate to a single number; if it doesn't, NumPy responds that it doesn't know how to set an array element with a sequence.
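If you genuinely need each slot to hold a sequence, one workaround (a sketch, not part of the answer above) is to create an object-dtype array and assign into it, since object slots can hold arbitrary Python objects:
import numpy as np

x = np.empty(3, dtype=object)   # each slot can hold an arbitrary Python object
x[0] = np.array([4, 5])
x[1] = [1, 2, 3]
x[2] = 7
print(x)   # [array([4, 5]) list([1, 2, 3]) 7]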
In my case, I got this error in TensorFlow. The reason was that I was trying to feed an array of sequences with different lengths. Example:
import tensorflow as tf

input_x = tf.placeholder(tf.int32, [None, None])
word_embedding = tf.get_variable('embeddin', shape=[len(vocab_), 110], dtype=tf.float32,
                                 initializer=tf.random_uniform_initializer(-0.01, 0.01))
embedding_look = tf.nn.embedding_lookup(word_embedding, input_x)

with tf.Session() as tt:
    tt.run(tf.global_variables_initializer())
    a, b = tt.run([word_embedding, embedding_look], feed_dict={input_x: example_array})
    print(b)
If my array is:
example_array = [[1,2,3],[1,2]]
then I get the error:
ValueError: setting an array element with a sequence.
But if I pad it:
example_array = [[1,2,3],[1,2,0]]
then it works.
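A plain-Python padding sketch (independent of TensorFlow; the helper name is illustrative) for ragged lists like the one above:
def pad_ragged(rows, pad_value=0):
    # pad each inner list with pad_value so all rows have the same length
    width = max(len(r) for r in rows)
    return [r + [pad_value] * (width - len(r)) for r in rows]

example_array = [[1, 2, 3], [1, 2]]
print(pad_ragged(example_array))   # [[1, 2, 3], [1, 2, 0]]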
For those who are having trouble with similar problems in NumPy, a very simple solution is to specify dtype=object when creating the array you will assign values into. For instance:
out = np.empty_like(lil_img, dtype=object)
In my case the problem was different: I was trying to convert a list of lists of ints to an array, and one list had a different length than the others. To find the offending rows in your list of lists, you can do:
print([i for i, x in enumerate(list_of_lists) if len(x) != 560])
In my case the reference length was 560.
In my case, the problem was with a scatterplot of a dataframe X[]:
ax.scatter(X[:,0], X[:,1], c=colors,
           cmap=CMAP, edgecolor='k', s=40)   # c=y[:,0],
# ValueError: setting an array element with a sequence.

# Fix with .toarray():
colors = 'br'
y = label_binarize(y, classes=['Irrelevant', 'Relevant'])
ax.scatter(X[:,0].toarray(), X[:,1].toarray(), c=colors,
           cmap=CMAP, edgecolor='k', s=40)
When the shape is not regular or the elements have different data types, the dtype argument passed to np.array can only be object.
import numpy as np
# arr1 = np.array([[10, 20.], [30], [40]], dtype=np.float32) # error
arr2 = np.array([[10, 20.], [30], [40]]) # OK, and the dtype is object
arr3 = np.array([[10, 20.], 'hello']) # OK, and the dtype is also object
In my case, I had a nested list as the series that I wanted to use as an input.
First check: If
df['nestedList'][0]
outputs a list like [1,2,3], you have a nested list.
Then check whether you still get the error when you pass df['nestedList'][0] as the input instead.
Then your next step is probably to concatenate all nested lists into one unnested list, using
[item for sublist in df['nestedList'] for item in sublist]
This flattening of the nested list is borrowed from How to make a flat list out of list of lists?.
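An equivalent flattening with the standard library, shown on a plain nested list as a stand-in for df['nestedList'] (an alternative, not from the answer above):
from itertools import chain

nested = [[1, 2, 3], [4, 5]]             # stand-in for df['nestedList']
flat = list(chain.from_iterable(nested))
print(flat)                              # [1, 2, 3, 4, 5]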
The error is because the dtype argument of the np.array function specifies the data type of the elements in the array, and it can only be set to a single data type that is compatible with all the elements. The value "abc" is not a valid float, so trying to convert it to a float results in a ValueError. To avoid this error, you can either remove the string element from the list, or choose a different data type that can handle both float values and string values, such as object.
numpy.array([1.2, "abc"], dtype=object)

Create matrix in a loop with numpy

I would like to build up a numpy matrix using rows I get in a loop. But how do I initialize the matrix? If I write
A = []
A = numpy.vstack((A, [1, 2]))
I get
ValueError: all the input array dimensions except for the concatenation axis must match exactly
What's the best practice for this?
NOTE: I do not know the number of rows in advance. The number of columns is known.
Unknown number of rows
One way is to form a list of lists, and then convert to a numpy array in one operation:
final = []
# x is some generator
for item in x:
    final.append(item)
A = np.array(final)
Or, more elegantly, given a generator x:
A = np.array(list(x))
This solution is time-efficient but memory-inefficient.
Known number of rows
Append operations on numpy arrays are expensive and not recommended. If you know the size of the final array in advance, you can instantiate an empty (or zero) array of your desired size, and then fill it with values. For example:
A = np.zeros((10, 2))
A[0] = [1, 2]
Or in a loop, with a trivial assignment to demonstrate syntax:
A = np.zeros((2, 2))
# in reality, x will be some generator whose length you know in advance
x = [[1, 2], [3, 4]]
for idx, item in enumerate(x):
    A[idx] = item
print(A)
array([[ 1.,  2.],
       [ 3.,  4.]])
