Concatenate numpy arrays that are class instance attributes in Python

I am attempting to use a class that strings together several instances of another class as a numpy array of objects. I want to be able to concatenate attributes of the instances that are contained in the numpy array. I figured out a sloppy way to do it with a bunch of for loops, but I think there must be a more elegant, pythonic way of doing this. The following code does what I want, but I want to know if there is a cleaner way to do it:
import numpy as np

class MyClass(object):
    def __init__(self):
        self.a = 37.
        self.arr = np.arange(5)

class MyClasses(object):
    def __init__(self):
        # number of MyClass instances to become attributes of this class
        self.N = 5

    def make_subclass_arrays(self):
        self.my_class_inst = np.empty(shape=self.N, dtype="object")
        for i in range(self.N):
            self.my_class_inst[i] = MyClass()

    def concatenate_attributes(self):
        self.a = np.zeros(self.N)
        self.arr = np.zeros(self.N * self.my_class_inst[0].arr.size)
        for i in range(self.N):
            self.a[i] = self.my_class_inst[i].a
            # slice bounds must be integers, so use i + 1 rather than i + 1.
            slice_start = i * self.my_class_inst[i].arr.size
            slice_end = (i + 1) * self.my_class_inst[i].arr.size
            self.arr[slice_start:slice_end] = self.my_class_inst[i].arr

my_inst = MyClasses()
my_inst.make_subclass_arrays()
my_inst.concatenate_attributes()
Edit: Based on the response from HYRY, here is what the methods look like now:
def make_subclass_arrays(self):
    self.my_class_inst = np.array([MyClass() for i in range(self.N)])
def concatenate_attributes(self):
    self.a = np.hstack([i.a for i in self.my_class_inst])
    self.arr = np.hstack([i.arr for i in self.my_class_inst])

You can use numpy.hstack() to concatenate arrays:
def concatenate_attributes(self):
    self.a = np.hstack([o.a for o in self.my_class_inst])
    self.arr = np.hstack([o.arr for o in self.my_class_inst])
See Also
vstack : Stack arrays in sequence vertically (row wise).
dstack : Stack arrays in sequence depth wise (along third axis).
concatenate : Join a sequence of arrays together.
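As a self-contained sketch of the hstack approach (the stand-in MyClass mirrors the question; the names objs, a, arr and arr_h are introduced here only for illustration), np.concatenate gives the same result as np.hstack for 1D arrays, and np.array is enough for the scalar attribute:

```python
import numpy as np

class MyClass:
    """Stand-in for the class in the question."""
    def __init__(self):
        self.a = 37.0
        self.arr = np.arange(5)

# A plain list of instances is enough; an object array is not required.
objs = [MyClass() for _ in range(5)]

# Scalars: build a 1D array directly from the attribute values.
a = np.array([o.a for o in objs])

# 1D arrays: concatenate and hstack are interchangeable here.
arr = np.concatenate([o.arr for o in objs])
arr_h = np.hstack([o.arr for o in objs])

print(a.shape, arr.shape)
```

For 1D inputs hstack simply concatenates along the first axis, so the choice between the two is mostly a matter of taste.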

For the latter function I would recommend this:
init = []
ContainerClass.arr = np.array([init + Array(myclass.arr) for myclass in self.my_class_inst])
Convert each numpy array to a normal list, concatenate, and convert the result back. This assumes you have simple 1D arrays. I don't remember offhand whether numpy has a concatenation function; if it does, you can use that instead of the '+' sign and save the trouble of converting.
For the first you have the simplest form I can think of, although I usually use normal lists instead of numpy arrays for objects.
If you want to be really clever you can create an __add__ function for both of the classes. Then you can use the '+' sign to add classes: a + b calls a.__add__(b). You would have to create functions with the following properties:
MyClass + MyClass returns a new MyClasses instance with a and b inside
MyClasses + MyClass adds the MyClass to the MyClasses in the way you want
Now if a, b, c, d are MyClass instances, a+b+c+d should return a MyClasses instance which contains the MyClass instances a, b, c and d and their combined arrays. This would be the pythonic way, although it's a bit too complicated for my taste.
edit:
Ok, sorry my bad. I did not have python when I wrote the code. This is the correct version:
init = []
my_inst.arr = np.array([init + list(myclass.arr.flat) for myclass in my_inst.my_class_inst]).flatten()
This is what I meant with the __add__ (and the pythonic way... regardless of its complexity):
import numpy as np

class MyClass(object):
    def __init__(self):
        self.a = 37.
        self.arr = np.arange(5)

    def __add__(self, classToAdd):
        # MyClass + MyClass starts a new container and adds both operands
        a = MyClasses() + self + classToAdd
        return a

class MyClasses(object):
    def __init__(self):
        self.N = 0
        self.my_class_inst = np.array([])
        self.a = np.array([])
        self.arr = np.array([])

    def __add__(self, singleClass):
        # note: this modifies the container in place and returns it
        self.my_class_inst = np.hstack([self.my_class_inst, singleClass])
        self.a = np.hstack([self.a, singleClass.a])
        self.arr = np.hstack([self.arr, singleClass.arr])
        self.N = self.my_class_inst.shape[0]
        return self

#add_test = MyClass() + MyClass()
add_test = np.sum([MyClass() for i in range(5)])
print(add_test.a, add_test.arr, add_test.N)
print(add_test.__class__, add_test.my_class_inst[0].__class__)
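One thing to watch in the version above is that MyClasses.__add__ modifies its left operand. A possible variation, sketched here under the assumption that the container may hold a plain list and compute the combined arrays on demand (the constructor argument instances and the properties are hypothetical additions, not part of the answer), returns a new container instead:

```python
import numpy as np

class MyClass:
    def __init__(self, a=37.0):
        self.a = a
        self.arr = np.arange(5)

    def __add__(self, other):
        # Delegate to the container so both cases share one code path.
        return MyClasses([self]) + other

class MyClasses:
    def __init__(self, instances=()):
        self.my_class_inst = list(instances)

    @property
    def N(self):
        return len(self.my_class_inst)

    @property
    def a(self):
        return np.array([o.a for o in self.my_class_inst])

    @property
    def arr(self):
        if not self.my_class_inst:
            return np.array([])
        return np.concatenate([o.arr for o in self.my_class_inst])

    def __add__(self, other):
        # Build a new container; the operands are left untouched.
        if isinstance(other, MyClasses):
            return MyClasses(self.my_class_inst + other.my_class_inst)
        if isinstance(other, MyClass):
            return MyClasses(self.my_class_inst + [other])
        return NotImplemented

pair = MyClass() + MyClass()
triple = pair + MyClass()
print(pair.N, triple.N, triple.arr.size)
```

Because the combined arrays are derived from the stored instances, they cannot drift out of sync with the container's contents.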

Related

How to subclass ndarray to cumulatively record the number of addition operations?

I am using Python and trying to construct a class, say Numbers, by subclassing ndarray. I want my class to satisfy the following two properties:
All the methods of numpy are applicable to my class Numbers. (This is why I chose to subclass ndarray.)
Every time I perform an addition between two instances of class Numbers, I can cumulatively record the number of additions I have used.
Here is what I tried:
import numpy as np

class Numbers(np.ndarray):
    nb_add = 0

    def __new__(cls, values):
        self = np.asarray(values).view(cls)
        return self

    def __add__(self, new_numbers):
        Numbers.nb_add += len(new_numbers)
        return self + new_numbers

a = Numbers([1,2,3,4])
b = Numbers([5,6,7,8])
c = a+b
print(a.reshape(2, 2)) # expect [[1,2], [3,4]]
print(Numbers.nb_add)  # expect 4 = number of additions
But the method __add__ leads to an error.
I found a similar post here, but it is not the case I am looking for.
Could anyone help me? Thanks!
The problem is that the expression self + new_numbers is referencing the __add__ method of Numbers and hence the method calls itself.
If you explicitly call the __add__ method of the base class, then you get the result you want:
import numpy as np

class Numbers(np.ndarray):
    nb_add = 0

    def __new__(cls, values):
        self = np.asarray(values).view(cls)
        return self

    def __add__(self, new_numbers):
        Numbers.nb_add += len(new_numbers)
        return np.ndarray.__add__(self, new_numbers)

a = Numbers([1,2,3,4])
b = Numbers([5,6,7,8])
c = a+b
print(a.reshape(2, 2)) # expect [[1,2], [3,4]]
print(Numbers.nb_add)  # expect 4 = number of additions
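The same fix can be written with super(), which avoids naming the base class explicitly. This is only a sketch; using np.size instead of len is a choice made here so that adding a scalar does not raise:

```python
import numpy as np

class Numbers(np.ndarray):
    nb_add = 0  # running count of element-wise additions

    def __new__(cls, values):
        return np.asarray(values).view(cls)

    def __add__(self, other):
        # np.size works for arrays, lists and scalars alike.
        Numbers.nb_add += np.size(other)
        # Delegate to ndarray.__add__ instead of calling self + other,
        # which would re-enter this method and recurse forever.
        return super().__add__(other)

a = Numbers([1, 2, 3, 4])
b = Numbers([5, 6, 7, 8])
c = a + b
print(type(c).__name__, c, Numbers.nb_add)
```

The result keeps the Numbers type, so further additions on c are counted as well.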

__radd__() method with Numpy Array as other, the Numpy Array was passed one by one

I have a class (MyClass) with an attribute data which is a NumPy array. I would like to allow operations such as:
myclass3 = myclass1 + myclass2
myclass3 = myclass1 + Numpy.ndarray
myclass3 = Numpy.ndarray + myclass1
where all these operations add the data together and return a new MyClass. The first two are easy to support by defining __add__(). The last case, however, did not behave as I expected: the ndarray passes its elements one by one to be summed with myclass1.data.
This is what I mean.
import numpy as np

class MyClass:
    def __init__(self, data):
        self.data = data

    def __add__(self, other):
        print("In __add__:", other)
        if isinstance(other, MyClass):
            data = self.data + other.data
        else:
            data = self.data + other
        return MyClass(data)

    def __radd__(self, other):
        print("In __radd__:", other)
        data = self.data + other
        return MyClass(data)

myclass1 = MyClass(np.arange(5))
myclass2 = MyClass(np.ones(5))
nparray = np.arange(5) + 10
alist = [1, 1, 1, 1, 1]
All the addition combinations work fine, even alist + myclass1, but nparray + myclass1 prints:
In __radd__: 10
In __radd__: 11
In __radd__: 12
In __radd__: 13
In __radd__: 14
What happened is that each element of the NumPy array was passed into __radd__ one by one rather than the array as a whole. __radd__ was called five times, and the result was a <class 'numpy.ndarray'> rather than a MyClass object.
So how can I allow the Numpy.ndarray + MyClass operation so that the entire ndarray is passed in as other in __radd__()?
Best regards,
J
Unfortunately, this is how NumPy behaves by default. alist + myclass1 fails inside list.__add__, so Python calls your __radd__ function, which works as intended. But in nparray + myclass1, numpy tries to avoid failure by broadcasting: it treats myclass1 as a scalar object and does the equivalent of
for value in nparray:
    value + myclass1
which fails each time, and only then is your __radd__ called, once per element.
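For what it's worth, NumPy 1.13 and later document an opt-out: if a class sets __array_ufunc__ = None, ndarray's binary operators return NotImplemented for it, and Python then falls back to the class's reflected method with the whole array. A minimal sketch, assuming such a NumPy version (the simplified MyClass below is a reduced version of the one in the question):

```python
import numpy as np

class MyClass:
    # Tells NumPy not to handle this object in ufuncs; ndarray.__add__
    # then returns NotImplemented and Python tries MyClass.__radd__.
    __array_ufunc__ = None

    def __init__(self, data):
        self.data = np.asarray(data)

    def __add__(self, other):
        if isinstance(other, MyClass):
            other = other.data
        return MyClass(self.data + other)

    def __radd__(self, other):
        # 'other' arrives as the complete left-hand operand.
        return MyClass(other + self.data)

myclass1 = MyClass(np.arange(5))
nparray = np.arange(5) + 10

result = nparray + myclass1
print(type(result).__name__, result.data)
```

With this attribute set, __radd__ is called once with the entire array, and the result is a MyClass rather than an object-dtype ndarray.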

Python method calls in constructor and variable naming conventions inside a class

I try to process some data in Python and I defined a class for a sub-type of data. You can find a very simplified version of the class definition below.
import numpy as np

class MyDataClass(object):
    def __init__(self, input1, input2, input3):
        """
        input1 and input2 are a 1D-array
        input3 is a 2D-array
        """
        self._x_value = None  # int
        self._y_value = None  # int
        self.data_array_1 = None  # 2D array
        self.data_array_2 = None  # 1D array
        self.set_data(input1, input2, input3)

    def set_data(self, input1, input2, input3):
        self._x_value, self._y_value = self.get_x_and_y_value(input1, input2)
        self.data_array_1 = self.get_data_array_1(input1)
        self.data_array_2 = self.get_data_array_2(input3)

    @staticmethod
    def get_x_and_y_value(input1, input2):
        # do some stuff
        return x_value, y_value

    def get_data_array_1(self, input1):
        # do some stuff
        return input1[self._x_value:self._y_value + 1]

    def get_data_array_2(self, input3):
        q = self.data_array_1 - input3[self._x_value:self._y_value + 1, :]
        return np.linalg.norm(q, axis=1)
I'm trying to follow the 'Zen of Python' and thereby write beautiful code. I'm quite sceptical whether the class definition above is good practice. While thinking about alternatives I came up with the following questions, on which I would kindly like your opinions and suggestions.
Does it make sense to define "get" and "set" methods?
IMHO, as the resulting data will be used several times (in several plots and computation routines), it is more convenient to create and store them once. Hence, I calculate the data arrays once in the constructor.
I do not deal with huge amounts of data, so processing takes no more than a second; however, I cannot estimate the potential implications on RAM if someone used the same procedure on huge data.
Should I move the function get_x_and_y_value() out of the class scope and convert the static method to a function?
As the method is only called inside the class definition, it seems better to keep it as a static method. If I should define it as a function, should I put all the lines relevant to this class inside a script and create a module from it?
The argument names of the function get_x_and_y_value() are the same as those of the __init__ method. Should I change them?
It would ease refactoring but could confuse others who read it.
In Python, you do not need getter and setter functions. Use properties instead. This is why you can access attributes directly in Python, unlike languages such as Java, where getters and setters are the conventional way to protect your attributes.
Consider the following example of a Circle class. Because we can use the @property decorator, we don't need getter and setter functions like other languages do. This is the Pythonic answer.
This should address all of your questions.
class Circle(object):
    def __init__(self, radius):
        self.radius = radius
        self.x = 0
        self.y = 0

    @property
    def diameter(self):
        return self.radius * 2

    @diameter.setter
    def diameter(self, value):
        self.radius = value / 2

    @property
    def xy(self):
        return (self.x, self.y)

    @xy.setter
    def xy(self, xy_pair):
        self.x, self.y = xy_pair
>>> c = Circle(radius=10)
>>> c.radius
10
>>> c.diameter
20
>>> c.diameter = 10
>>> c.radius
5.0
>>> c.xy
(0, 0)
>>> c.xy = (10, 20)
>>> c.x
10
>>> c.y
20
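Applied to the data class in the question, one way to keep "compute once, reuse many times" without explicit get/set methods is a cached property. The sketch below assumes Python 3.8+ for functools.cached_property, takes the slice bounds start and stop directly (hypothetical parameters standing in for the elided get_x_and_y_value logic), and assumes both inputs are 2D so the subtraction is well defined:

```python
import numpy as np
from functools import cached_property  # Python 3.8+

class MyDataClass:
    def __init__(self, input1, input3, start, stop):
        # start/stop stand in for the x and y values the question derives.
        self._input1 = np.asarray(input1, dtype=float)
        self._input3 = np.asarray(input3, dtype=float)
        self._start = start
        self._stop = stop

    @cached_property
    def data_array_1(self):
        # Computed on first access, then stored on the instance.
        return self._input1[self._start:self._stop + 1]

    @cached_property
    def data_array_2(self):
        q = self.data_array_1 - self._input3[self._start:self._stop + 1, :]
        return np.linalg.norm(q, axis=1)

data = MyDataClass(np.arange(12).reshape(4, 3), np.zeros((4, 3)), 1, 2)
print(data.data_array_2)
```

Nothing is calculated until an attribute is first read, which also sidesteps the memory concern for arrays that are never used.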

Attributes of class in numpy array

I have a class like:
class MyClass:
    def __init__(self, params):
        self.A = params[0]
        self.B = params[1]
        self.C = params[2]
and a numpy array built from instances of this class:
import numpy as np
ArrayA = np.empty((3,4), dtype=object)
for ii in range(3):
    for jj in range(4):
        ArrayA[ii,jj] = MyClass(np.random.rand(3))
I want to retrieve "MyClass.B" for ArrayA where "MyClass.A" is minimum, so I did:
WhereMin = np.where(ArrayA[:,:].A)
MinB = ArrayA[WhereMin].B
but that does not work. Any ideas?
EDIT:
When I run the above code I get the following error:
----> WhereMin = np.nanmin(ArrayA[:,:].A)
AttributeError: 'numpy.ndarray' object has no attribute 'A'
When I would expect to get an array of indices to use in "MinB".
Possible Solution
I found a possible solution to my problem:
Min = np.nanmin([[x.A for x in XX] for XX in ArrayA])
XXX = [[x for x in XX if x.A == Min] for XX in ArrayA]
MinB = [XX for XX in XXX if XX != [] ][0][0].B
Might not be too elegant, but does the job. Thank you all!
The .A attribute belongs to each individual element of ArrayA, not to the array as a whole. So ArrayA[0,0].A is valid, because ArrayA[0,0] points to an instance of MyClass, but ArrayA[:,:] returns another ndarray (a view of the original), which has no .A attribute.
I would consider reorganizing your data so that you keep everything you want in the .A attribute in a single numpy array, and everything in .B in a single numpy array, etc. That would have two advantages, 1) you would be able to use where, and 2) your numpy arrays would be of dtype=float (you lose the advantage of numpy if you have to use dtype=object).
You can create a structured numpy array. Pass dtype a list of tuples of field name and data type. You can then access the complete array of a given field by indexing the array by the field name. To rework your example:
import numpy as np
ArrayA = np.zeros((3,4), dtype=[('A','<f4'),('B','<f4'),('C','<f4')])
for ii in range(3):
    for jj in range(4):
        # a structured element is assigned from a tuple, one value per field
        ArrayA[ii,jj] = tuple(np.random.rand(3))
minA = ArrayA['A'].min()
WhereMin = np.where(ArrayA['A'] == minA)
MinB = ArrayA[WhereMin]['B']
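If the object array has to stay, a middle ground is to pull the .A values into a float array once and locate the minimum there. A small sketch (the seeded generator rng and the names A_values and idx are introduced here only for illustration):

```python
import numpy as np

class MyClass:
    def __init__(self, params):
        self.A = params[0]
        self.B = params[1]
        self.C = params[2]

rng = np.random.default_rng(0)  # seeded only to make the run repeatable
ArrayA = np.empty((3, 4), dtype=object)
for ii in range(3):
    for jj in range(4):
        ArrayA[ii, jj] = MyClass(rng.random(3))

# Extract attribute A into an ordinary float array of the same shape.
A_values = np.array([[obj.A for obj in row] for row in ArrayA])

# argmin works on the flattened array; unravel_index maps it back to 2D.
idx = np.unravel_index(np.argmin(A_values), A_values.shape)
MinB = ArrayA[idx].B
print(idx, MinB)
```

This still loops in Python to read the attributes, so it is a convenience rather than a speedup over the structured-array layout.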

Applying method to objects in a numpy array with vectorize results in empty array

I want to apply a method to each object in a numpy array. I thought of using numpy.vectorize to speed things up, but I get an empty array instead. I can't figure out what I am doing wrong. Please help!
Here's the code:
import numpy

class Foo(object):
    def __init__(self):
        self.x = None

    def SetX(self, x):
        self.x = x

# Initialize an array of Foo objects
y = numpy.empty( 3, dtype=object )
vFoo = numpy.vectorize(lambda x: Foo() )
yfoo = vFoo(y)
# Apply method SetX to each object
xsetter = numpy.vectorize( lambda foo: foo.SetX(3.45) )
print(xsetter(yfoo))  # [None None None]
Thanks in advance!
The problem is that the lambda function's return value is None (the result of Foo.SetX). You can do this instead:
def f(foo):
    foo.SetX(3.45)
    return foo
xsetter = numpy.vectorize( f )
It's because your SetX method does not return a value. One way to fix this would be by rewriting SetX as
def SetX(self, x):
    self.x = x
    return self
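As a side note, the NumPy documentation describes vectorize as a convenience that is essentially a for loop, not a performance tool, so a plain loop is a reasonable alternative when the method is called only for its side effect. A minimal sketch (set_x and the names foos and xs are chosen here for illustration):

```python
import numpy as np

class Foo:
    def __init__(self):
        self.x = None

    def set_x(self, x):
        self.x = x

# Fill an object array with distinct Foo instances.
foos = np.empty(3, dtype=object)
for i in range(foos.size):
    foos[i] = Foo()

# Call the method on every element; .flat also covers multi-dimensional arrays.
for foo in foos.flat:
    foo.set_x(3.45)

# Collect the attribute back into a regular float array if needed.
xs = np.array([foo.x for foo in foos.flat])
print(xs)
```

The loop leaves the original array of objects in place, so there is no need for the method to return self.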
