I'm on Python 2.7 and I want to retrieve an object from a specific coordinate in my matrix after initializing all the coordinates to 0:
import numpy as np

class test:
    "it's a test"
    def __init__(self):
        self.x=4
        self.y=5

mat=np.full(shape=(4,4),fill_value=0)

mat[2,2]=test()
print(mat[2,2].x)
print(mat[2,2].y)
But I have this error:
Traceback (most recent call last):
File "/root/Documents/matrix.py", line 11, in <module>
mat[2,2]=test()
AttributeError: test instance has no attribute '__trunc__'
And if I change line 9 to:
mat=np.zeros(shape=(4,4))
I get this error:
Traceback (most recent call last):
File "/root/Documents/matrix.py", line 11, in <module>
mat[2]=test()
AttributeError: test instance has no attribute '__float__'
It works fine for an element of a plain list, so I hope this is not because I'm using a NumPy matrix...
I hope someone can help me, thanks!
Pay attention to what your statements create.
In [164]: mat=np.full(shape=(4,4),fill_value=0)
In [165]:
In [165]: mat
Out[165]:
array([[0, 0, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 0, 0]])
In [166]: mat.dtype
Out[166]: dtype('int64')
This array can only hold integers. The error means NumPy tried to call the __trunc__ method on your object to convert it to an integer. That works for a number, e.g. 12.23.__trunc__(), but your class doesn't define such a method.
In [167]: mat=np.zeros(shape=(4,4))
In [168]: mat
Out[168]:
array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])
In [169]: mat.dtype
Out[169]: dtype('float64')
Here the dtype is float. Again, you haven't defined a __float__ method.
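Both errors come from the same restriction: a numeric array converts whatever you assign into its own dtype. The following is a minimal sketch of that behaviour, written for Python 3 (where the failure surfaces as a TypeError rather than the AttributeError shown above); the class name Plain is only an illustration.

```python
import numpy as np

class Plain:
    """Hypothetical stand-in for a class with no numeric conversion methods."""
    def __init__(self):
        self.x = 4

mat = np.full(shape=(4, 4), fill_value=0)  # integer dtype, inferred from fill_value

mat[2, 2] = 3.7          # a float is converted to the integer dtype (truncated)
stored = mat[2, 2]
print(stored)

try:
    mat[1, 1] = Plain()  # a Plain instance cannot be turned into an integer
    failed = False
except (TypeError, ValueError, AttributeError) as exc:
    failed = True
    print(type(exc).__name__, exc)
```

The number is stored after conversion, while the arbitrary object is rejected because nothing tells NumPy how to convert it.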
A list holds pointers to Python objects.
In [171]: class test:
     ...:     "it's a test"
     ...:     def __init__(self):
     ...:         self.x=4
     ...:         self.y=5
     ...:     def __repr__(self):
     ...:         return 'test x={},y={}'.format(self.x, self.y)
...:
In [172]: alist = [test(), test()]
In [173]: alist
Out[173]: [test x=4,y=5, test x=4,y=5]
We can make an array that holds your objects:
In [174]: arr = np.array(alist)
In [175]: arr
Out[175]: array([test x=4,y=5, test x=4,y=5], dtype=object)
In [176]: arr[0].x
Out[176]: 4
But note the dtype.
Object dtype arrays are list-like, with some array properties. They can be reshaped, but most operations have to use some sort of list iteration. Math is hit-and-miss, depending on which methods your class defines.
Don't use object dtype arrays unless you really need them. Lists are easier to use.
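To illustrate what "hit-and-miss" means in practice, here is a small sketch with a hypothetical Vec class that defines addition but not multiplication. Arithmetic on an object array is delegated to the elements' own methods, so one operation works and the other raises.

```python
import numpy as np

class Vec:
    """Hypothetical 2-D vector that defines addition but not multiplication."""
    def __init__(self, x, y):
        self.x, self.y = x, y
    def __add__(self, other):
        return Vec(self.x + other.x, self.y + other.y)

arr = np.empty(2, dtype=object)
arr[0], arr[1] = Vec(1, 2), Vec(3, 4)

summed = arr + arr            # works: Vec.__add__ is called for each pair
print(summed[1].x, summed[1].y)

try:
    arr * 2                   # fails: Vec defines no __mul__
    mul_ok = True
except TypeError:
    mul_ok = False
print(mul_ok)
```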
You should state explicitly that the array's data type is object:
mat=np.full(shape=(4,4),fill_value=0, dtype=object)
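For completeness, a sketch of the question's script with only that one change applied (shown with Python 3 syntax; nothing else is altered):

```python
import numpy as np

class test:
    "it's a test"
    def __init__(self):
        self.x = 4
        self.y = 5

# dtype=object makes every cell a reference to an arbitrary Python object
mat = np.full(shape=(4, 4), fill_value=0, dtype=object)
mat[2, 2] = test()
print(mat[2, 2].x)
print(mat[2, 2].y)
```

The untouched cells still hold the integer 0, so code that reads attributes from a cell needs to know which cells were actually filled.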
I want to create a NumPy array of np.ndarray from an iterable. This is because I have a function that will return np.ndarray of some constant shape, and I need to create an array of results from this function, something like this:
OUTPUT_SHAPE = some_constant
def foo(input) -> np.ndarray:
    # processing
    # generated np.ndarray of shape OUTPUT_SHAPE
    return output
inputs = [i for i in range(100000)]
iterable = (foo(input) for input in inputs)
arr = np.fromiter(iterable, np.ndarray)
This obviously gives an error:
cannot create object arrays from iterator
I cannot first create a list and then convert it to an array, because the conversion copies every output array, so for a while almost double the memory is occupied, and my memory is very limited.
Can anyone help me?
You probably shouldn't make an object array. You should probably make an ordinary 2D array of non-object dtype. As long as you know the number of results the iterator will give in advance, you can avoid most of the copying you're worried about by doing it like this:
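# If OUTPUT_SHAPE is a tuple rather than a single int, use (num_iterator_outputs,) + OUTPUT_SHAPE as the shape.
arr = numpy.empty((num_iterator_outputs, OUTPUT_SHAPE), dtype=whatever_appropriate_dtype)
for i, output in enumerate(iterable):
    arr[i] = output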
This only needs to hold arr and a single output in memory at once, instead of arr and every output.
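A runnable sketch of this preallocation pattern follows. The function body, OUTPUT_SHAPE, N_INPUTS and the float64 dtype are placeholders chosen for illustration, not taken from the question.

```python
import numpy as np

OUTPUT_SHAPE = (3,)            # assumed shape of each result
N_INPUTS = 1000                # assumed to be known in advance

def foo(i):
    """Hypothetical stand-in for the real processing function."""
    return np.full(OUTPUT_SHAPE, i, dtype=np.float64)

# A generator: each result is produced only when the loop asks for it.
iterable = (foo(i) for i in range(N_INPUTS))

arr = np.empty((N_INPUTS,) + OUTPUT_SHAPE, dtype=np.float64)
for i, output in enumerate(iterable):
    arr[i] = output            # copies the values into row i; output can then be freed

print(arr.shape)
```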
If you really want an object array, you can get one. The simplest way would be to go through a list, which will not perform the copying you're worried about as long as you do it right:
outputs = list(iterable)
arr = numpy.empty(len(outputs), dtype=object)
arr[:] = outputs
Note that if you just try to call numpy.array on outputs, it will try to build a 2D array, which will cause the copying you're worried about. This is true even if you specify dtype=object - it'll try to build a 2D array of object dtype, and that'll be even worse, for both usability and memory.
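The difference between the two constructions can be checked on a tiny example. This is a sketch with two made-up equal-length arrays, not the asker's data:

```python
import numpy as np

outputs = [np.arange(3), np.ones(3)]        # two equal-length arrays

packed = np.array(outputs, dtype=object)    # NumPy still builds a 2-D array
print(packed.shape)

arr = np.empty(len(outputs), dtype=object)  # 1-D container of references
arr[:] = outputs
print(arr.shape)
print(arr[0] is outputs[0])                 # the slot holds the original array itself
```

The first form unpacks every element into its own object cell, whereas the second keeps one reference per original array.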
An object dtype array contains references, just like a list.
Define 3 arrays:
In [589]: a,b,c = np.arange(3), np.ones(3), np.zeros(3)
put them in a list:
In [590]: alist = [a,b,c]
and in an object dtype array:
In [591]: arr = np.empty(3,object)
In [592]: arr[:] = alist
In [593]: arr
Out[593]:
array([array([0, 1, 2]), array([1., 1., 1.]), array([0., 0., 0.])],
      dtype=object)
In [594]: alist
Out[594]: [array([0, 1, 2]), array([1., 1., 1.]), array([0., 0., 0.])]
Modify one, and see the change in the list and array:
In [595]: b[:] = [1,2,3]
In [596]: b
Out[596]: array([1., 2., 3.])
In [597]: alist
Out[597]: [array([0, 1, 2]), array([1., 2., 3.]), array([0., 0., 0.])]
In [598]: arr
Out[598]:
array([array([0, 1, 2]), array([1., 2., 3.]), array([0., 0., 0.])],
      dtype=object)
A numeric dtype array created from these copies all values:
In [599]: arr1 = np.stack(arr)
In [600]: arr1
Out[600]:
array([[0., 1., 2.],
       [1., 2., 3.],
       [0., 0., 0.]])
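The reference-versus-copy distinction can also be checked directly with np.shares_memory. This sketch repeats the setup from the session and wraps the object array in list() before stacking, which is a cautious choice on my part rather than something the session requires:

```python
import numpy as np

a, b, c = np.arange(3), np.ones(3), np.zeros(3)

arr = np.empty(3, dtype=object)
arr[:] = [a, b, c]                 # stores references to a, b and c
stacked = np.stack(list(arr))      # builds a new float array, copying the values

ref_shared = np.shares_memory(arr[1], b)     # same buffer as b
copy_shared = np.shares_memory(stacked, b)   # independent buffer

b[:] = [1, 2, 3]                   # visible through arr, not through stacked
print(arr[1], stacked[1])
```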
So even if your use of fromiter worked, it wouldn't be any different, memory-wise, from a list accumulation:
alist = []
for i in range(n):
    alist.append(constant_array)
For a special application dealing with NumPy arrays of different lengths, I need a NumPy array (preferably, not just a list) of the form np.ndarray[np.ndarray[ ], np.ndarray[ ], ..., dtype=object]. Whatever sequence, list, etc. of NumPy arrays I am given, I want the result to always have this form. However, for a list of NumPy arrays of the same length, e.g.,
np.array([np.arange(4), np.arange(4)], dtype=object)
gives me np.ndarray[np.ndarray[[]], dtype=object], so I came up with the workaround below.
Is there any other magic option that could be passed to np.array(), or another method, that gives the desired result more directly?
Workaround:
inp_arr_a = np.asarray([np.arange(4), np.arange(3)], dtype=object)
inp_arr_b = np.array([np.arange(4), np.arange(4)])
def split_to_obj_arr(arr):
    return np.delete(np.array([*arr, 'dummy'], dtype=object), -1, 0)
gives for split_to_obj_arr(inp_arr_a)
array([array([0, 1, 2, 3]), array([0, 1, 2])], dtype=object)
and for split_to_obj_arr(inp_arr_b)
array([array([0, 1, 2, 3]), array([0, 1, 2, 3])], dtype=object)
np.array(...) by design tries to return as high-dimensional a numeric array as possible. If the inputs are ragged, it will raise a deprecation warning (unless you specify object dtype) and return an object array containing the arrays; in NumPy 1.24 and later this case raises a ValueError instead. With some combinations of shapes it raises an error either way.
Forcing an object dtype with a dummy element and then deleting it is one way around this. I prefer creating the None-filled array first, and then assigning the elements:
In [80]: def foo(alist):
    ...:     res = np.empty(len(alist), object)
    ...:     res[:] = alist
    ...:     return res
...:
In [81]: foo([[],[]])
Out[81]: array([list([]), list([])], dtype=object)
In [82]: foo([np.array([]),np.array([])])
Out[82]: array([array([], dtype=float64), array([], dtype=float64)], dtype=object)
In [83]: foo([np.ones((2,3)),np.zeros((2,3))])
Out[83]:
array([array([[1., 1., 1.],
              [1., 1., 1.]]),
       array([[0., 0., 0.],
              [0., 0., 0.]])], dtype=object)
In [84]: foo([np.array([2,3]),np.array([1,2])])
Out[84]: array([array([2, 3]), array([1, 2])], dtype=object)
Creating a 2d object array like this is also possible, but trickier. It may be simpler to reshape a 1d as needed after.
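A sketch of the "fill a 1d array, then reshape" route, using four made-up arrays of different lengths:

```python
import numpy as np

pieces = [np.arange(n) for n in (1, 2, 3, 4)]   # arrays of different lengths

flat = np.empty(len(pieces), dtype=object)
flat[:] = pieces               # 1-D object array, one reference per slot

grid = flat.reshape(2, 2)      # same references, now addressed as a 2x2 grid
print(grid.shape)
print(grid[1, 0])
```

Because the reshape only rearranges references, the inner arrays themselves are not copied.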
I want to create a NumPy array b where each component is a 2D matrix whose dimensions are determined by the coordinates of vector a.
What I get doing the following satisfies me:
>>> a = [3,4,1]
>>> b = [np.zeros((a[i], a[i - 1] + 1)) for i in range(1, len(a))]
>>> np.array(b)
array([array([[ 0., 0., 0., 0.],
       [ 0., 0., 0., 0.],
       [ 0., 0., 0., 0.],
       [ 0., 0., 0., 0.]]),
       array([[ 0., 0., 0., 0., 0.]])], dtype=object)
but I have found this pathological case where it does not work:
>>> a = [2,1,1]
>>> b = [np.zeros((a[i], a[i - 1] + 1)) for i in range(1, len(a))]
>>> b
[array([[ 0., 0., 0.]]), array([[ 0., 0.]])]
>>> np.array(b)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: could not broadcast input array from shape (3) into shape (1)
I will present a solution to the problem, but do take into account what was said in the comments. Having Numpy arrays that are not aligned prevents most of the useful operations from working their magic. Consider using lists instead.
That being said, curious error indeed. I got the thing to work by assigning in a basic for-loop instead of using the np.array call.
a = [2,1,1]
b = np.zeros(len(a)-1, dtype=object)
for i in range(1, len(a)):
    b[i-1] = np.zeros((a[i], a[i - 1] + 1))
And the result:
>>> b
array([array([[0., 0., 0.]]), array([[0., 0.]])], dtype=object)
This is a bit peculiar. Typically, NumPy will try to create one array with a common data type from the input of np.array. A list of arrays is interpreted with the list as the new leading dimension. For instance, np.array([np.zeros((3, 1)), np.zeros((3, 1))]) would produce a 2 x 3 x 1 array. This can only happen if the arrays in your list match in shape. Otherwise, you end up with an array of arrays (with dtype=object), which, as commented, is not really an ideal scenario.
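The matching-shape case is easy to confirm. A minimal check, using two zero arrays of shape (3, 1):

```python
import numpy as np

same_shape = [np.zeros((3, 1)), np.zeros((3, 1))]
stacked = np.array(same_shape)   # the list becomes the new leading dimension
print(stacked.shape, stacked.dtype)
```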
However, your error seems to occur when the first dimension matches. Numpy for some reason tries to broadcast the arrays somehow and fails. I can reproduce your error even if the arrays are of higher dimension, as long as the first dimension between arrays matches.
I know this isn't a solution, but it wouldn't fit in a comment. As noted by @roganjosh, making this kind of array really gives you no benefit. You're better off sticking to a list of arrays for readability and to avoid the cost of creating these arrays.
I would like to set all nan entries in my numpy array a to zero.
Regardless of how I use np.nan_to_num(), the array is not processed at all (np.nan still remains in the array):
import numpy as np
a = np.empty((0, 3), dtype='object')
for runner in range(10):
    a = np.insert(a, a.shape[0], [[1, np.nan, 1]], axis=0)
These are my unsuccessful attempts:
np.nan_to_num(a)
np.nan_to_num(a,copy=True)
np.nan_to_num(a,copy=False)
a=np.nan_to_num(a)
a=np.nan_to_num(a,copy=False)
a=np.nan_to_num(a,copy=True)
As the nan_to_num docstring states:
If x is not inexact, then no replacements are made.
And dtype object does not count as inexact.
If for some reason one needs to use dtype object (perhaps one wants to have nans and exact ints, for example), then here is a work-around:
a[a!=a] = 0
Note that in theory there could be objects other than nan for which x!=x evaluates to True (one can of course create one's own class and fiddle with __eq__ and __ne__), but in practice I can't think of any.
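A compact sketch of the behaviour described here, on a small made-up object array. It also shows a second option, converting to float before calling nan_to_num, which only applies when float values are acceptable:

```python
import numpy as np

a = np.array([[1, np.nan, 1]] * 3, dtype=object)

unchanged = np.nan_to_num(a)            # object dtype is not inexact: nothing is replaced
still_nan = unchanged[0, 1] != unchanged[0, 1]

masked = a.copy()
masked[masked != masked] = 0            # nan is the only value here with x != x

as_float = np.nan_to_num(a.astype(float))   # alternative: convert to float first
print(still_nan, masked[0, 1], as_float[0, 1])
```

The mask keeps the object dtype (and therefore exact Python ints), while the astype route gives an ordinary float64 array.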
Only mildly contrived example:
>>> import numpy as np
>>> import math
>>>
>>> a = np.random.randint(0, 1000, (6,)).astype(object)
>>> a[a%2==0] = np.nan
>>>
>>> fact_exact = np.vectorize(math.factorial, 'O', 'O')
>>>
>>> fact_exact(a)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/paul/.local/lib/python3.6/site-packages/numpy/lib/function_base.py", line 1972, in __call__
return self._vectorize_call(func=func, args=vargs)
File "/home/paul/.local/lib/python3.6/site-packages/numpy/lib/function_base.py", line 2048, in _vectorize_call
outputs = ufunc(*inputs)
ValueError: factorial() only accepts integral values
>>>
>>> a[a!=a] = 0
>>> fact_exact(a)
array([9819935662418089743352075922310862095706065486822583658822975979153852871637910339598847876493575760863201233608970580391009961465728060140206398380369810186460532083760537973722230477712617437079362600099095591538946730193485520929914465963675497331037894791629662134417383906616748712477435411911352595846133057242505006764835196420336585309344206359125847804414531691517822911373600118902137858177047463867389635205323328678714656377591230065986360526515442653777496908763065282294664208227077490200850296013058820462199153017425546879776071769432946284989651969735166129654123362278827485074178681546981559466233191972688158356430976918192398846419304865350500808417927115875428971873067092978672051108353026958311731456630717915806992149025378731927814021805881859364498816522297657223802150320368577537638698692463078070519911729996949263069045872688620575874758242248117345983373644762881336075203583068807371386560008413979828440302163961903567206206098114957943899603695885783671168564745354608640000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000,
22328783881661914958481873975346502495151470121092663127656427617172486869336444341196216861471796204456103981797935323465763492125980526669772652700063306391000092324747490987759008282321662774044560021923711172537165034028116470777032463317525690139861312277154265627409161865934581816407380706408159413469087649804140238680046340298380454769197056000000000000000000000000000000000000000000000000000,
1,

1, 1], dtype=object)
This is an issue of how you are declaring your array.
a = np.array([[1,np.nan,3],[np.nan, 0, np.nan]])
a=np.insert(a, a.shape[0],[[1, np.nan, 1]], axis=0)
a
array([[ 1., nan,  3.],
       [nan,  0., nan],
       [ 1., nan,  1.]])
np.nan_to_num(a)
array([[1., 0., 3.],
       [0., 0., 0.],
       [1., 0., 1.]])
I'm using a video processing tool that needs to input the processing data from each frame into an array.
for p in det.read(frame, fac):
    point_values = np.array([])
    for j, (x, y) in enumerate(p): # iteration through points
        point_values = np.append(point_values,y)
        point_values = np.append(point_values,x)
This code runs again for each frame. I expect "point_values = np.array([])" to reset the array so that it can be filled again.
I'm not sure whether my logic is wrong or it is a syntax issue.
Your code does:
In [77]: p = [(0,0),(0,2),(1,0),(1,2)]
In [78]: arr = np.array([])
In [79]: for j,(x,y) in enumerate(p):
    ...:     arr = np.append(arr,y)
    ...:     arr = np.append(arr,x)
...:
In [80]: arr
Out[80]: array([0., 0., 2., 0., 0., 1., 2., 1.])
No syntax error. The list equivalent is faster and cleaner:
In [85]: alist =[]
In [86]: for x,y in p: alist.extend((y,x))
In [87]: alist
Out[87]: [0, 0, 2, 0, 0, 1, 2, 1]
But you don't give any indication of how this action is supposed to fit within a larger context. You create a new point_values for each p, but then don't do anything with it.
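If an array is what the later steps need, one possible approach is to build the list first and convert once, or to skip the loop and swap the columns directly. The sketch below assumes each p is a sequence of (x, y) pairs, as in the sample session above:

```python
import numpy as np

p = [(0, 0), (0, 2), (1, 0), (1, 2)]   # sample (x, y) points

# List version: collect everything first, convert to an array once.
alist = []
for x, y in p:
    alist.extend((y, x))
point_values = np.array(alist, dtype=float)

# Array version: swap the two columns, then flatten row by row.
swapped = np.array(p, dtype=float)[:, ::-1].ravel()

print(point_values)
print(swapped)
```

Both give the same y, x, y, x, ... ordering as the np.append loop, without reallocating the array on every append.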