Given a list of numpy arrays, each of different length, such as the one obtained from lst = np.array_split(arr, indices), how do I get the sum of every array in the list? (I know how to do it with a list comprehension, but I was hoping there was a pure-NumPy way to do it.)
I thought that this would work:
np.apply_along_axis(lambda arr: arr.sum(), axis=0, arr=lst)
But it doesn't; instead it gives me this error, which I don't understand:
ValueError: operands could not be broadcast together with shapes (0,) (12,)
NB: It's an array of sympy objects.
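For concreteness, here is a minimal sketch of the setup and the list-comprehension baseline I'd like to avoid (made-up arr and indices; my real arr holds sympy objects):
import numpy as np
arr = np.arange(12)                 # stand-in for my array of sympy objects
indices = [3, 7]                    # split points passed to np.array_split
lst = np.array_split(arr, indices)  # three arrays of lengths 3, 4 and 5
sums = [a.sum() for a in lst]       # -> [3, 18, 45]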
There's a faster way that avoids splitting the array altogether and uses the reduceat method of the np.add ufunc. We create an ascending array of the indices at which each summed segment starts, with np.append([0], np.cumsum(indices)[:-1]). For proper indexing we need to put a zero in front (and discard the last cumulative index if it covers the full range of the original array; otherwise just drop the [:-1]). Then we use np.add.reduceat:
import numpy as np
arr = np.arange(1, 11)
indices = np.array([2, 4, 4])
# this should split like this
# [1 2 | 3 4 5 6 | 7 8 9 10]
np.add.reduceat(arr, np.append([0], np.cumsum(indices)[:-1]))
# array([ 3, 18, 34])
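For comparison, here's a quick sanity check against the list-comprehension approach from the question (the split points for np.array_split are the cumulative sums of the segment lengths):
split_points = np.cumsum(indices)[:-1]  # array([2, 6])
print([chunk.sum() for chunk in np.array_split(arr, split_points)])
# [3, 18, 34]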
How do I broadcast the sum of a list of lists in an efficient way?
Below is working code, but it isn't practical when list1 has many sublists, say 30.
Any improvement on this?
from numpy import sum
import numpy as np
list1 = [[4,8],[8,16]]
list2 = [2]
elemSum=[sum(list1[0]),sum(list1[1])]
print((np.array(elemSum)/np.array(list2)))
prints:
[ 6. 12.] # expected output
I want a single line like the one below, eliminating the declaration of the variable elemSum, but it yields incorrect output since it sums everything into a single value:
print(sum(np.array(list1)/np.array(list2)))
prints:
18.0 # not expected: everything was summed into a single value
numpy.sum takes an optional axis argument, which can be used for partial sums along a single axis:
>>> list1 = np.array([[4,8],[8,16]])
>>> list2 = np.array([2])
>>> np.sum(list1)
36
>>> np.sum(list1, axis=1)
array([12, 24])
>>> np.sum(list1, axis=1) / list2
array([ 6., 12.])
Just use numpy the entire time, don't mess with lists if you want arrays:
list1 = [[4,8],[8,16]]
list2 = [2]
import numpy as np
arr1 = np.array(list1)
arr2 = np.array(list2)
Then simply:
result = arr1.sum(axis=1) / arr2
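With the sample lists above, this matches the expected output from the question:
print(result)  # [ 6. 12.]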
For example, I have a 2D list of dimensions 2x2 stored in a variable r:
12 24
36 48
I want to divide every value in the list by 2. An easy but slow way to do this is to iterate through each row and column and divide each element by 2.
rows = 2
columns = 2
for x in range(rows):
    for y in range(columns):
        r[x][y] = r[x][y] / 2
Is there an easy and fast way to divide each value in a 2D list by a specific value, other than this manual approach? I tried the code below but it throws an error:
s = r /2
Error:
TypeError: unsupported operand type(s) for /: 'list' and 'int'
You can use the numpy library to achieve the result you want. It uses vectorization, so it's the fastest way to do this operation.
Try this:
import numpy as np
arr = np.array(r) # --> initialize with 2d array
arr = arr / 2 # --> Now each element in the 2d array gets divided by 2
For Example:
arr = np.array([[1, 2], [2, 3]]) # --> initialize with 2d array
arr = arr / 2 # --> divide by the scalar
print(arr)
Output:
[[0.5 1. ]
[1. 1.5]]
Looks like there is even a section in official documentation about nested list comprehensions:
https://docs.python.org/3/tutorial/datastructures.html#nested-list-comprehensions
i.e.
>>> r = [[12,24],[36,48]]
>>> [[i/2 for i in a] for a in r]
[[6.0, 12.0], [18.0, 24.0]]
import numpy as np
x = np.array(r)  # r is the 2D list from the question
x = x / 2
Why do the following code samples:
np.array([[1, 2], [2, 3, 4]])
np.array([1.2, "abc"], dtype=float)
...all give the following error?
ValueError: setting an array element with a sequence.
Possible reason 1: trying to create a jagged array
You may be creating an array from a list that isn't shaped like a multi-dimensional array:
numpy.array([[1, 2], [2, 3, 4]]) # wrong!
numpy.array([[1, 2], [2, [3, 4]]]) # wrong!
In these examples, the argument to numpy.array contains sequences of different lengths. Those will yield this error message because the input list is not shaped like a "box" that can be turned into a multidimensional array.
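If you really do want to keep the ragged structure, one option is an object array whose elements are the original lists (a sketch; on recent NumPy versions you must request dtype=object explicitly for ragged input):
numpy.array([[1, 2], [2, 3, 4]], dtype=object)  # array of two list objects, shape (2,)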
Possible reason 2: providing elements of incompatible types
For example, providing a string as an element in an array of type float:
numpy.array([1.2, "abc"], dtype=float) # wrong!
If you really want to have a NumPy array containing both strings and floats, you could use the dtype object, which allows the array to hold arbitrary Python objects:
numpy.array([1.2, "abc"], dtype=object)
The Python ValueError:
ValueError: setting an array element with a sequence.
Means exactly what it says: you're trying to cram a sequence of numbers into a single number slot. It can be thrown under various circumstances.
1. When you pass a python tuple or list to be interpreted as a numpy array element:
import numpy
numpy.array([1,2,3]) #good
numpy.array([1, (2,3)]) #Fail, can't convert a tuple into a numpy
#array element
numpy.mean([5,(6+7)]) #good
numpy.mean([5,tuple(range(2))]) #Fail, can't convert a tuple into a numpy
#array element
def foo():
    return 3
numpy.array([2, foo()]) #good

def foo():
    return [3,4]
numpy.array([2, foo()]) #Fail, can't convert a list into a numpy
                        #array element
2. By trying to cram a numpy array of length > 1 into a numpy array element:
x = np.array([1,2,3])
x[0] = np.array([4]) #good
x = np.array([1,2,3])
x[0] = np.array([4,5]) #Fail, can't convert the numpy array to fit
#into a numpy array element
A numpy array is being created, and numpy doesn't know how to cram multivalued tuples or arrays into single-element slots. It expects whatever you give it to evaluate to a single number; if it doesn't, numpy responds that it doesn't know how to set an array element with a sequence.
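If each slot really does need to hold a whole sequence, the workaround (in line with the dtype=object suggestion above) is to make the outer array an object array. A minimal sketch:
x = np.empty(3, dtype=object)  # three generic object slots, initially None
x[0] = np.array([4, 5])        # a whole array can now live in a single slot
x[1] = "anything else"         # each slot holds an arbitrary Python object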
In my case, I got this error in TensorFlow. The reason was that I was trying to feed an array whose rows had different lengths:
example :
import tensorflow as tf
input_x = tf.placeholder(tf.int32,[None,None])
word_embedding = tf.get_variable('embeddin',shape=[len(vocab_),110],dtype=tf.float32,initializer=tf.random_uniform_initializer(-0.01,0.01))
embedding_look=tf.nn.embedding_lookup(word_embedding,input_x)
with tf.Session() as tt:
    tt.run(tf.global_variables_initializer())
    a, b = tt.run([word_embedding, embedding_look], feed_dict={input_x: example_array})
    print(b)
If my array is:
example_array = [[1,2,3],[1,2]]
then I get the error:
ValueError: setting an array element with a sequence.
But if I pad it:
example_array = [[1,2,3],[1,2,0]]
Now it's working.
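A minimal sketch of such padding, assuming 0 is a safe padding id for the embedding lookup (pad_to_max is a made-up helper, not a TensorFlow function):
def pad_to_max(seqs, pad_value=0):
    # pad every sequence with pad_value up to the length of the longest one
    max_len = max(len(s) for s in seqs)
    return [s + [pad_value] * (max_len - len(s)) for s in seqs]

example_array = pad_to_max([[1, 2, 3], [1, 2]])
# example_array -> [[1, 2, 3], [1, 2, 0]]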
For those who are having trouble with similar problems in NumPy, a very simple solution is to define dtype=object when creating the array you will assign values to. For instance:
out = np.empty_like(lil_img, dtype=object)
In my case, the problem was different: I was trying to convert a list of lists of ints to an array, and one of the inner lists had a different length from the others. To find it, you can run:
print([i for i,x in enumerate(list) if len(x) != 560])
In my case, the reference length was 560.
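If you don't know the expected length up front, a quick way to spot the odd rows is to count the lengths first (list_of_lists below is a stand-in for your data):
from collections import Counter
length_counts = Counter(len(x) for x in list_of_lists)  # list_of_lists = your list of lists
print(length_counts)  # the rare lengths point to the offending rows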
In my case, the problem was with a scatter plot of a dataframe X:
ax.scatter(X[:,0],X[:,1],c=colors,
cmap=CMAP, edgecolor='k', s=40) #c=y[:,0],
#ValueError: setting an array element with a sequence.
#Fix with .toarray():
colors = 'br'
y = label_binarize(y, classes=['Irrelevant','Relevant'])
ax.scatter(X[:,0].toarray(),X[:,1].toarray(),c=colors,
cmap=CMAP, edgecolor='k', s=40)
When the shape is not regular or the elements have different data types, the dtype argument passed to np.array can only be object.
import numpy as np
# arr1 = np.array([[10, 20.], [30], [40]], dtype=np.float32) # error
arr2 = np.array([[10, 20.], [30], [40]]) # OK, and the dtype is object
arr3 = np.array([[10, 20.], 'hello']) # OK, and the dtype is also object
In my case, I had a nested list as the series that I wanted to use as an input.
First check: if
df['nestedList'][0]
outputs a list like [1,2,3], you have a nested list.
Then check whether the error goes away when you pass df['nestedList'][0] as the input instead.
Then your next step is probably to concatenate all nested lists into one unnested list, using
[item for sublist in df['nestedList'] for item in sublist]
This flattening of the nested list is borrowed from How to make a flat list out of list of lists?.
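A tiny self-contained illustration of the check and the flattening (the dataframe here is made up):
import pandas as pd
df = pd.DataFrame({'nestedList': [[1, 2, 3], [4, 5]]})  # hypothetical data
print(df['nestedList'][0])   # [1, 2, 3]  -> it is a nested list
flat = [item for sublist in df['nestedList'] for item in sublist]
print(flat)                  # [1, 2, 3, 4, 5]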
The error occurs because the dtype argument of np.array specifies the data type of the array's elements, and it can only be set to a single data type that is compatible with all of them. The value "abc" is not a valid float, so trying to convert it to a float results in a ValueError. To avoid this error, you can either remove the string element from the list, or choose a data type that can handle both float values and strings, such as object.
numpy.array([1.2, "abc"], dtype=object)
I would like to build up a numpy matrix using rows I get in a loop. But how do I initialize the matrix? If I write
A = []
A = numpy.vstack((A, [1, 2]))
I get
ValueError: all the input array dimensions except for the concatenation axis must match exactly
What's the best practice for this?
NOTE: I do not know the number of rows in advance. The number of columns is known.
Unknown number of rows
One way is to form a list of lists, and then convert to a numpy array in one operation:
final = []
# x is some generator
for item in x:
    final.append(item)
A = np.array(final)
Or, more elegantly, given a generator x:
A = np.array(list(x))
This solution is time-efficient but memory-inefficient.
Known number of rows
Append operations on numpy arrays are expensive and not recommended. If you know the size of the final array in advance, you can instantiate an empty (or zero) array of your desired size, and then fill it with values. For example:
A = np.zeros((10, 2))
A[0] = [1, 2]
Or in a loop, with a trivial assignment to demonstrate syntax:
A = np.zeros((2, 2))
# in reality, x will be some generator whose length you know in advance
x = [[1, 2], [3, 4]]
for idx, item in enumerate(x):
    A[idx] = item
print(A)
[[1. 2.]
 [3. 4.]]