Say I have a class:
class Data():
    def __init__(self):
        self.scores = []
        self.encoding = {1: 'first', 2: 'second', 3: 'third'}

    def build(self):
        self.scores = [1, 2, 3]

    def translate(self):
        return [self.encoding[score] for score in self.scores]
Now I want to be able to translate the columns for a given data object...
# I want to be able to do
d = Data()
d.scores.translate()
# AttributeError: 'list' object has no attribute 'translate'
# Instead of
d = Data()
d.translate()
Now I am fully aware that I am trying to access a method that does NOT exist for that list (translate()). I want to be able to make method calls as is mentioned above (d.scores.translate()) because I may have some specific subslice of d.scores I want to translate.
For example, if d.scores were a 2-D numpy array (I only want to translate a subset of the columns but keep all rows):
#this is what I would like to be able to do
d.scores[:, 1:5].translate()
# And I don't want to build a kwarg in the translate method to handle it like
d.scores.translate(indices=[1])
I know this is more of an implementation question, and I'm wondering what the best practice should be.
Am I trying to force a square peg into a round hole at this point? Should I just give up and define a module function or consider the kwargs? Is that more 'pythonic'?
UPDATE
I should have said this sooner, but I did try the kwarg and staticmethod routes. I just want to know whether there are other ways to accomplish this. Maybe through subclassing? Or Python's equivalent of interfaces in Java/C# (if it exists)?
Yes, you are trying to "force a square peg into a round hole".
Your translate method works on the whole scores list, full stop. This cannot be changed with some trickery; that is simply not supported in Python.
When you want to do subslices, I would recommend doing it explicitly.
Examples:
# Using args/kwargs:
scores = john.translate(10, 15) # translate subslice 10:15
# Using a new static method:
scores = Person.translate_scores(john.scores[10:15])
It does not look that elegant, but it works.
(BTW: Since you changed your question, my classes might be a little off, but I will not change my answer with every edit you make.)
Your trickery simply does not work, because "scores" is not some part of your main class, but simply an attribute of it, which has its own type. So, when you do "d.scores.translate()" translate is not called on d, but on a list or whatever type scores is. You can not change that, because it is core Python.
You could do it by using a second class and use _scores for the list and a sub-object scores which manipulates _scores:
class DataHelper(object):
    def __init__(self, data_obj):
        self.data_obj = data_obj

    def translate(self, *args):
        ...  # work on self.data_obj._scores

class Data(object):
    def __init__(self):
        self.scores = DataHelper(self)
        self._scores = []
With such a class structure, you might be able to do this:
scores = d.scores.translate(1, 5)
And with more trickery, you might be able to even do:
scores = d.scores[1:5].translate()
But for that, you will need a third class (its objects are created temporarily when a scores object is indexed, so that d.scores[1:5] does not create a list slice but a new object with a translate method).
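As a rough sketch of that idea, here is one way it could look. For brevity this version folds the helper and the "third class" into a single hypothetical ScoresView class (that name and its attributes are mine, not from the question), which returns a new view whenever it is sliced:

```python
class ScoresView(object):
    """Hypothetical wrapper: a list of scores plus the encoding used to translate them."""

    def __init__(self, values, encoding):
        self._values = list(values)
        self._encoding = encoding

    def __getitem__(self, index):
        if isinstance(index, slice):
            # Slicing returns another view, so .translate() stays available.
            return ScoresView(self._values[index], self._encoding)
        return self._values[index]

    def __len__(self):
        return len(self._values)

    def translate(self):
        return [self._encoding[score] for score in self._values]


class Data(object):
    def __init__(self):
        self.encoding = {1: 'first', 2: 'second', 3: 'third'}
        self.scores = ScoresView([], self.encoding)

    def build(self):
        self.scores = ScoresView([1, 2, 3], self.encoding)


d = Data()
d.build()
print(d.scores.translate())       # ['first', 'second', 'third']
print(d.scores[1:3].translate())  # ['second', 'third']
```

The trade-off is that d.scores is no longer a real list, so anything that expects list methods such as append would need to be forwarded by hand.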
I am creating a Python CLI where the user can provide an operation they want to perform, e.g.:
sum 10 15
In my code, I have defined my classes as follows:
class Operation:
    # common stuff
    pass

class Sum(Operation):
    identifier = "sum"

    @staticmethod
    def perform(a, b):
        return a + b

class Difference(Operation):
    identifier = "diff"

    @staticmethod
    def perform(a, b):
        return a - b
Now, in my CLI, if I type sum 10 15 I want to return the result of Sum.perform(10, 15) and similarly if I type diff 10 15, I return the result of Difference.perform(10, 15), as sum is the identifier of class Sum and diff is the identifier of class Difference.
How do I dynamically access the class and its perform method, when I get the input directly from user input?
Classes in Python are first-class citizens, meaning they can be used as standard objects. In particular we can simply store them in a dictionary:
my_dict = {
    'sum': Sum,
    'diff': Difference,
}
and so on. Then, when you get the operation name as a string from the command line, you simply do
my_dict[op_name].perform(a, b)
Note that this is a very basic approach to what is known as parsing and abstract syntax trees (and, as you will soon see, a problematic one: e.g. not all operators accept two arguments). This is a huge topic, a bit hard but also very interesting. I encourage you to read about it.
// EDIT: If you want to keep identifier on the class, then you can apply a simple class decorator:
my_dict = {}

def autoregister(cls):
    # It would be a good idea to check whether we
    # overwrite an entry here, to avoid errors.
    my_dict[cls.identifier] = cls
    return cls

@autoregister
class Sum(Operation):
    identifier = "sum"

    @staticmethod
    def perform(a, b):
        return a + b

print(my_dict)
You have to remember though to import all classes before you use my_dict. In my opinion an explicit dict is easier to maintain.
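For what it's worth, here is a small sketch of the decorator approach with the overwrite check filled in and a dispatch function for input such as "sum 10 15". The names registry and run are made up for this example, and converting every argument with int() is an assumption about the input:

```python
registry = {}

def autoregister(cls):
    """Class decorator: store cls under its identifier, refusing duplicates."""
    key = cls.identifier
    if key in registry:
        raise ValueError("identifier %r is already registered" % key)
    registry[key] = cls
    return cls

class Operation(object):
    pass

@autoregister
class Sum(Operation):
    identifier = "sum"

    @staticmethod
    def perform(a, b):
        return a + b

@autoregister
class Difference(Operation):
    identifier = "diff"

    @staticmethod
    def perform(a, b):
        return a - b

def run(command):
    """Parse a line such as 'sum 10 15' and dispatch to the matching class."""
    name, *args = command.split()
    try:
        operation = registry[name]
    except KeyError:
        raise ValueError("unknown operation: %r" % name)
    # Assumption: all arguments are integers.
    return operation.perform(*[int(arg) for arg in args])

print(run("sum 10 15"))   # 25
print(run("diff 10 15"))  # -5
```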
Reading your comment, I think you need to interpret the input. The way I would go about this is splitting the input by spaces (based on your example), and then checking that list. For example:
# This is the place you called the input:
input_unsplit = input("Enter your command and args: ")
input_list = input_unsplit.split()
# Check the first word to see what function we're calling
if input_list[0].lower() == "sum":
    # input() returns strings, so convert the arguments to numbers first
    result = Sum.perform(int(input_list[1]), int(input_list[2]))
    print(result)
# this logic can be applied to other functions as well.
This is a simple solution that could be hard to scale.
=== EDITED ===
I have more to add.
If used correctly, dir() can give you a list of the names defined up to a certain point in the code. I wrote a calculator for my precalculus class, and in it I chose to call dir() after defining all the math classes; if a name met certain conditions (i.e. it was not main), it was appended to a list of valid args to pass. You can modify your classes to include some kind of operator name attribute:
class Addition:
    op_name = "sum"
and then change perform to take in an array:
def perform(numbers):
    return numbers[0] + numbers[1]
This solves many of your scalability issues. Then, after declaring your classes, use dir() in a for loop to append to a list of valid names, like so:
valid_names = []
defined_names = dir()
for name in defined_names:
    if '_' not in name:
        if name not in ("sys","argparse","<any other imported module/defined var>"):
            valid_names.append(name)
Note that making this step work for you is all in the placement in the script. It's a bit tedious, but works flawlessly if handled correctly (in my experience).
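Then, you can use eval to look up each class by name (reasonably safe in this context, since the names come from dir() rather than from user input) and call the method you want:
# get input here
numbers = [float(x) for x in input_list[1:]]
for name in valid_names:
    cls = eval(name)
    if getattr(cls, "op_name", None) == input_list[0].lower():
        print(cls.perform(numbers))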
This should be a fairly easy-to-scale solution. Just watch that you keep the dir check up to date, and everything else just... works.
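If the dir()/eval bookkeeping feels fragile, a different technique (not the one described above) is to give the operation classes a common base class and ask it for its subclasses. This is only a sketch: the Operation base class, the Subtraction class and the find_operation helper are assumptions made for the example.

```python
class Operation(object):
    op_name = None

class Addition(Operation):
    op_name = "sum"

    @staticmethod
    def perform(numbers):
        return numbers[0] + numbers[1]

class Subtraction(Operation):
    op_name = "diff"

    @staticmethod
    def perform(numbers):
        return numbers[0] - numbers[1]

def find_operation(op_name):
    # __subclasses__() only lists direct subclasses of Operation.
    for cls in Operation.__subclasses__():
        if cls.op_name == op_name:
            return cls
    raise ValueError("unknown operation: %r" % op_name)

input_list = "sum 10 15".split()
numbers = [float(x) for x in input_list[1:]]
result = find_operation(input_list[0].lower()).perform(numbers)
print(result)  # 25.0
```

There is no list of excluded names to keep up to date here, but the classes still have to be defined (or imported) before the lookup runs.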
I have a dictionary, and a print statement as follows:
d = {'ar':4, 'ma':4, 'family':pf.Normal()}
print(d)
Which gives me
{'ar': 4, 'ma': 4, 'family': <pyflux.families.normal.Normal object at 0x11b6bc198>}
Is there any way to clean up the value of the 'family' key? It's important that the call remains simply 'print(d)' because it is used to print other dictionaries without this issue. Is this possible? Thanks for your time.
EDIT:
Thanks for the answer guys, I would mark one as right but I haven't tried them and can't confirm. I ended up creating another dictionary with the cleaned up string as a key, and the object as the value. It was a bit more work but I did it before reading/getting the responses so I just stuck to it. Thanks nonetheless!
You are mistaken. You don't want to change what print(dict) outputs. That would require changing the way builtin dictionaries are printed. You want to add a custom __repr__() to your pf.Normal() object.
I believe pf.Normal() comes from the pyflux package, so I suggest seeing what data the class is supposed to hold, and pretty-printing it by inheriting from the class:
class CustomNormalObject(pf.Normal):
    def __repr__(self):
        # Add pretty printed data here
        pass
Or if you need to pass your own arguments into the custom class, you can use super():
class CustomNormalObject(pf.Normal):
    def __init__(self, myparm, *args, **kwargs):
        # If using Python 3, you can call super without
        # passing in any arguments. This is simply for Python 2
        # compatibility.
        super(CustomNormalObject, self).__init__(*args, **kwargs)
        self.myparm = myparm

    def __repr__(self):
        # Add pretty printed data here
        pass
As pyflux is, I think, an external library, you should not edit it directly. An easy way to do what you want is to overwrite the __repr__ method of the Normal class (print(d) shows each value using its __repr__, so overwriting __str__ alone would not change the dictionary output):
pyflux.families.normal.Normal.__repr__ = lambda self: "Clean up value here."
Here I used a lambda function for illustration, but of course you can use a defined function.
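To see the effect without installing pyflux, here is a minimal sketch with a stand-in Normal class. The mu and sigma attributes are invented for illustration and are not taken from the pyflux API:

```python
class Normal(object):
    """Stand-in for a third-party class whose source we do not want to edit."""

    def __init__(self, mu=0.0, sigma=1.0):
        self.mu = mu
        self.sigma = sigma


class PrettyNormal(Normal):
    def __repr__(self):
        return "Normal(mu=%s, sigma=%s)" % (self.mu, self.sigma)


class StrOnlyNormal(Normal):
    def __str__(self):
        return "Normal"


d = {'ar': 4, 'ma': 4, 'family': PrettyNormal()}
print(d)  # {'ar': 4, 'ma': 4, 'family': Normal(mu=0.0, sigma=1.0)}

# A dict prints its values with repr(), so defining only __str__ changes nothing:
print({'family': StrOnlyNormal()})  # still shows <... StrOnlyNormal object at 0x...>
```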
I never got formal OOP instruction, and just kinda stumbled my way through the basics in python, but am at a crossroads.
When dealing with an instantiated class object, is it better to assign attributes via methods, or have the methods just return the values? I've read a lot about not letting the object's state get out of whack, but can't figure out the best way. Here's a simple example:
import magic

class Histogram():
    def __init__(self, directory):
        self.directory = directory

    # Data Option 1
    def read_data(self):
        data = []
        file_ref = open(self.directory, 'r')
        line = file_ref.readline()
        while line:
            data.append(line)
            line = file_ref.readline()
        return data

    # Data Option 2
    def set_data(self):
        data = []
        file_ref = open(self.directory, 'r')
        line = file_ref.readline()
        while line:
            data.append(line)
            line = file_ref.readline()
        self.data = data

    # Hist Option 1
    def build_histogram(self):
        data = self.read_data()
        # It's not important what magic.histogram does.
        self.histogram = magic.histogram(data)

    # Hist Option 2
    def get_histogram(self, data):
        return magic.histogram(data)

    # Hist Option 3 - this requires self.set_data() to have already run.
    def build_histogram_2(self):
        self.histogram = magic.histogram(self.data)
So Data Option 1 forces the user to either call that and store it somewhere to use in conjunction with Hist Option 2 or store it in self.data to use with Hist Option 3.
Data Option 2 lets you use Hist Option 3, but you still must have run set_data first.
So my real question is, for a class with methods to do different things, that often CAN but don't HAVE to be chained together, how should I write it? Implicitly setting attributes and risk getting the state messed up? Return variables and let the "User" set them? Have getters for the attributes that my methods use, and if the attributes don't exist handle that somehow?
Please let me know if you need better explanation, or another example or anything.
Ask what the object represents. Does that data reasonably belong to the object itself?
In this case, I would say yes. Your data option 2 is loading the data which reasonably "belongs to" the histogram object -- although it would be reasonable to argue that the constructor ought to just load it rather than requiring a separate method call to accomplish that.
Also, if you go the other way, this is not really an object; you're simply using the object framework to collect some related subroutines.
I think if the value might be used by many methods, you had better set it on instantiation (__init__). If you need to recalculate it every time you use it, then it does not need to be an attribute (just a local variable), and it should be calculated at the point of use.
In the end, what I would try to do is avoid making my object behave like a state machine, where you don't know which methods you can or can't call at a given moment because you don't know which values have already been calculated.
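One possible way to put both answers together is sketched below: load the data in the constructor and compute the histogram lazily, so no method depends on another having been called first. collections.Counter is only a stand-in for magic.histogram from the question, and the path parameter name is my own choice.

```python
import os
import tempfile
from collections import Counter


class Histogram(object):
    def __init__(self, path):
        self.path = path
        # Load eagerly so every other method can rely on self.data existing.
        with open(path, 'r') as file_ref:
            self.data = [line.rstrip('\n') for line in file_ref]
        self._histogram = None  # computed on first access

    @property
    def histogram(self):
        if self._histogram is None:
            # Counter is only a stand-in for magic.histogram.
            self._histogram = Counter(self.data)
        return self._histogram


# Usage with a throwaway file
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write("a\nb\na\n")

hist = Histogram(tmp.name)
print(hist.histogram)  # Counter({'a': 2, 'b': 1})
os.remove(tmp.name)
```

Because hist.histogram is a property, the caller never has to remember whether a build step has already run.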
I apologize for the newbie question, but this is my first time working with classes. The class I'm trying to create is intended to perform a regex find and replace on all keys and values within a dictionary. The specific find and replace is defined upon instantiation.
There are two issues that I have. The first issue is that each instance of the class needs to accept a new dictionary. I'm not clear on how to create a class that accepts a general dictionary which I can specify upon creating an instance.
The second issue is that the class I have simply isn't working. I'm receiving the error message TypeError: expected string or buffer in the class line v = re.sub(self.find,self.replace,v).
There are three instances I want to create, one for each input dictionary: input_iter1, input_iter2, and input_iter3.
The following is the class:
class findreplace:
    values = []
    keys = []

    def __init__(self, find, replace):
        self.find = find
        self.replace = replace

    def value(self):
        for k,v in input_iter1.items():
            v = re.sub(self.find,self.replace,v)
            findreplace.values.append(v)

    def key(self):
        for k,v in input_iter1.items():
            k = re.sub(self.find,self.replace,k)
            findreplace.keys.append(k)
The following are the instances:
values1 = findreplace('[)?:(]','')
values1.value()
values2 = findreplace(r'(,\s)(,\s)(\d{5})({e<=1})',r'\2\3')
values2.value()
keys1 = findreplace(r'(?<=^)(.+)(?=$)',r'(?:\1)')
keys1.key()
keys2 = findreplace(r'(?=$)',r'{e}')
keys2.key()
print values
print keys
If anyone has any insight on how I can work around these two issues, I'd be grateful to hear it. Thanks!
First, Python 2 classes should start off this way:
class Foo(object):
Otherwise, you get an "old-style class", which is some ancient crusty thing no one uses.
Also, class names in Python are typically written in CamelCase.
Second, do not use mutable values (like lists!) as class attributes, as you're doing here with keys and values. They'll be shared across all instances of your class! It looks like you're even aware of this, since you refer to findreplace.keys directly, but it doesn't make sense to store instance-specific values in a class attribute like that.
But, most importantly: why is this a class at all? What does a findreplace represent? It looks like this would be much clearer if it were just a single function.
To answer your actual questions:
You pass in a dictionary just like you're passing in find and replace. Add another argument to __init__, and pass another argument when you construct your class.
Presumably, you're getting the TypeError because one of the values in your dictionary isn't a string, and you can only perform regexes on strings.
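A minimal sketch of the single-function version follows. The dictionary is passed in as an argument, and str() is applied before the substitution, which is just one way to avoid the TypeError (you may prefer to skip or reject non-string values instead). The find_replace name and the sample data are made up for this example.

```python
import re


def find_replace(mapping, find, replace):
    """Return a new dict with the regex substitution applied to every key and value."""
    result = {}
    for key, value in mapping.items():
        # str() guards against the TypeError raised when a value is not a string.
        new_key = re.sub(find, replace, str(key))
        new_value = re.sub(find, replace, str(value))
        result[new_key] = new_value
    return result


input_iter1 = {'(a)': 'x:y', 'b?': 42}  # made-up sample data
cleaned = find_replace(input_iter1, r'[)?:(]', '')
print(cleaned)  # {'a': 'xy', 'b': '42'}
```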
Where is your definition of the input_iter dicts? What do they look like? Your error indicates that the values of your dicts are not strings.
First, if you guys think the way I'm trying to do things is not Pythonic, feel free to offer alternative suggestions.
I have an object whose functionality needs to change based on outside events. What I've been doing originally is create a new object that inherits from original (let's call it OrigObject()) and overwrites the methods that change (let's call the new object NewObject()). Then I modified both constructors such that they can take in a complete object of the other type to fill in its own values based on the passed in object. Then when I'd need to change functionality, I'd just execute myObject = NewObject(myObject).
I'm starting to see several problems with that approach now. First of all, other places that reference the object need to be updated to reference the new type as well (the above statement, for example, would only update the local myObject variable). But that's not hard to update; the only annoying part is remembering to update it in other places each time I change the object, in order to prevent weird program behavior.
Second, I'm noticing scenarios where I need a single method from NewObject(), but the other methods from OrigObject(), and I need to be able to switch the functionality on the fly. It doesn't seem like the best solution anymore to be using inheritance, where I'd need to make M*N different classes (where M is the number of methods the class has that can change, and N is the number of variations for each method) that inherit from OrigObject().
I was thinking of using attribute remapping instead, but I seem to be running into issues with it. For example, say I have something like this:
def hybrid_type2(someobj, a):
    #do something else
    ...

class OrigObject(object):
    ...
    def hybrid_fun(self, a):
        #do something
        ...
    def switch(self, type):
        if type == 1:
            self.hybrid_fun = OrigObject.hybrid_fun
        else:
            self.hybrid_fun = hybrid_type2
The problem is, after doing this and trying to call the new hybrid_fun after switching it, I get an error saying that hybrid_type2() takes exactly 2 arguments, but I'm passing it one. The object doesn't seem to be passing itself as an argument to the new function anymore like it does with its own methods. Is there anything I can do to remedy that?
I tried including hybrid_type2 inside the class as well and then using self.hybrid_fun = self.hybrid_type2 works, but using self.hybrid_fun = OrigObject.hybrid_fun causes a similar error (complaining that the first argument should be of type OrigObject). I know I can instead define OrigObject.hybrid_fun() logic inside OrigObject.hybrid_type1() so I can revert it back the same way I'm setting it (relative to the instance, rather than relative to the class to avoid having object not be the first argument). But I wanted to ask here if there is a cleaner approach I'm not seeing here? Thanks
EDIT:
Thanks guys, I've given points for several of the solutions that worked well. I essentially ended up using a Strategy pattern using types.MethodType(), I've accepted the answer that explained how to do the Strategy pattern in python (the Wikipedia article was more general, and the use of interfaces is not needed in Python).
Use the types module to create an instance method for a particular instance.
For example:
import types

def strategyA(possible_self):
    pass

instance = OrigObject()
instance.strategy = types.MethodType(strategyA, instance)
instance.strategy()
Note that this only affects this specific instance; no other instances will be affected.
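Applied to the names from the question, a runnable sketch might look like this. The method bodies are placeholders that return strings so the effect is visible:

```python
import types


class OrigObject(object):
    def hybrid_fun(self, a):
        return "original: %s" % a


def hybrid_type2(someobj, a):
    return "type2: %s" % a


first = OrigObject()
second = OrigObject()

# Bind the plain function to one instance so it receives that instance
# as its first argument, just like a normal method would.
first.hybrid_fun = types.MethodType(hybrid_type2, first)
switched = first.hybrid_fun(1)
untouched = second.hybrid_fun(1)
print(switched)   # type2: 1
print(untouched)  # original: 1

# Deleting the instance attribute falls back to the class's own method.
del first.hybrid_fun
restored = first.hybrid_fun(1)
print(restored)   # original: 1
```

Switching back with del avoids the "first argument should be of type OrigObject" error mentioned in the question, because the class method is never copied onto the instance.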
You want the Strategy Pattern.
Read about descriptors in Python. The following code should work:
else:
    self.hybrid_fun = hybrid_type2.__get__(self, OrigObject)
What about defining it like so:
def hybrid_type2(someobj, a):
    #do something else
    ...

def hybrid_type1(someobj, a):
    #do something
    ...

class OrigObject(object):
    def __init__(self):
        ...
        self.run_the_fun = hybrid_type1
        ...
    def hybrid_fun(self, a):
        self.run_the_fun(self, a)
    def type_switch(self, type):
        if type == 1:
            self.run_the_fun = hybrid_type1
        else:
            self.run_the_fun = hybrid_type2
You can change an object's class at runtime:
class OrigObject(object):
    ...
    def hybrid_fun(self, a):
        #do something
        ...
    def switch(self):
        self.__class__ = DerivedObject

class DerivedObject(OrigObject):
    def hybrid_fun(self, a):
        #do the other thing
        ...
    def switch(self):
        self.__class__ = OrigObject
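Here is a small runnable sketch of the class-swapping idea, with placeholder method bodies and a made-up name attribute. It also shows that every existing reference to the object sees the change, which was one of the problems with rebuilding the object via myObject = NewObject(myObject):

```python
class OrigObject(object):
    def __init__(self, name):
        self.name = name

    def hybrid_fun(self, a):
        return "%s does the original thing with %s" % (self.name, a)

    def switch(self):
        self.__class__ = DerivedObject


class DerivedObject(OrigObject):
    def hybrid_fun(self, a):
        return "%s does the other thing with %s" % (self.name, a)

    def switch(self):
        self.__class__ = OrigObject


obj = OrigObject("obj")
alias = obj  # a second reference to the same object

obj.switch()
print(alias.hybrid_fun(1))   # obj does the other thing with 1
print(type(alias).__name__)  # DerivedObject

obj.switch()
print(alias.hybrid_fun(1))   # obj does the original thing with 1
```

Note that this swaps all overridden methods at once, so on its own it does not address the case where only one method out of several should change.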