I have a bunch of variables that are equal to values pulled from a database. Sometimes the database doesn't have a value and returns None (whose type is NoneType). I'm using these variables to build an XML file. When a variable is None, the XML value reads "None" rather than blank, as I'd prefer.
My question is: Is there an efficient way to go through all the variables at once and search for a NoneType and, if found, turn it to a blank string?
ex.
from types import *
[Connection to database omitted]
color = database.color
size = database.size
shape = database.shape
name = database.name
... etc
I could obviously do something like this:
if type(color) is NoneType:
    color = ""
but that would become tedious for the 15+ variables I have. Is there a more efficient way to go through and check each variable for its type and then correct it, if necessary? Something like creating a function to do the check/correction and having an automated way of passing each variable through that function?
All the solutions given here will make your code shorter and less tedious, but if you really have a lot of variables I think you will appreciate this, since it won't make you add even a single extra character of code for each variable:
class NoneWrapper(object):
    def __init__(self, wrapped):
        self.wrapped = wrapped
    def __getattr__(self, name):
        value = getattr(self.wrapped, name)
        if value is None:
            return ''
        else:
            return value
mydb = NoneWrapper(database)
color = mydb.color
size = mydb.size
shape = mydb.shape
name = mydb.name
# All of these will be set to an empty string if their
# original value in the database is None
Edit
I thought it was obvious, but I keep forgetting it takes time until all the fun Python magickery becomes second nature. :) So how does NoneWrapper do its magic? It's very simple, really. Each Python class can define some "special" method names that are easy to identify, because they are always surrounded by two underscores on each side. The most common and well-known of these methods is __init__(), which initializes each instance of the class, but there are many other useful special methods, and one of them is __getattr__(). This method is called whenever someone tries to access an attribute of an instance of your class that cannot be found through the normal lookup, and you can define it to customize attribute access.
What NoneWrapper does is define __getattr__(), so whenever someone tries to read an attribute of mydb (which is a NoneWrapper instance), it reads the attribute with the specified name from the wrapped object (in this case, database) and returns it, unless its value is None, in which case it returns an empty string.
I should add here that both object variables and methods are attributes, and, in fact, for Python they are essentially the same thing: all attributes are variables that could be changed, and methods just happen to be variables that have their value set to a function of a special type (bound method). So you can also use __getattr__() to control access to functions, which could lead to many interesting uses.
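To make the mechanics concrete, here is a small self-contained sketch. FakeRow is a hypothetical stand-in for the database object (it is not part of the original question). It shows that the wrapper's own attribute is found directly, while everything else, including methods, is forwarded through __getattr__():

```python
class NoneWrapper(object):
    def __init__(self, wrapped):
        self.wrapped = wrapped

    def __getattr__(self, name):
        # Only reached when normal lookup on the wrapper itself fails,
        # so reading self.wrapped never triggers this method.
        value = getattr(self.wrapped, name)
        return '' if value is None else value


class FakeRow(object):
    """Hypothetical stand-in for the real database object."""
    color = None
    size = 'large'

    def describe(self):
        return 'a row'


mydb = NoneWrapper(FakeRow())
print(repr(mydb.color))   # None is replaced by an empty string
print(mydb.size)          # other values pass through unchanged
print(mydb.describe())    # methods are attributes too, so they are forwarded
```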
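The way I would do it, although I don't know if it is the best, would be to put the variables you want to check in a list and then rebuild that list, replacing each None with an empty string. Assigning to the loop variable inside a for statement only rebinds that name and leaves the original variables untouched, so the cleaned values have to be unpacked back into the variables.
check_vars = [color, size, shape, name]
# Replace each None with "" and unpack the results back into the variables
color, size, shape, name = ["" if var is None else var for var in check_vars]
To add variables, add them to the list and to the unpacking target.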
If you're already getting them one at a time, it's not that much longer to write:
def none_to_blank(value):
    if value is None:
        return ""
    return value
color = none_to_blank(database.color)
size = none_to_blank(database.size)
shape = none_to_blank(database.shape)
name = none_to_blank(database.name)
Incidentally, use of "import *" is generally discouraged. Import only what you're using.
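If the attribute names are known up front, the same helper can be applied in a loop with getattr(). This is only a minimal sketch: FakeRow is a hypothetical stand-in for the real database object, and the list of field names is assumed to be written by hand.

```python
def none_to_blank(value):
    return "" if value is None else value


class FakeRow(object):
    """Hypothetical stand-in for the database object."""
    color = None
    size = "large"
    shape = None
    name = "widget"


database = FakeRow()
field_names = ["color", "size", "shape", "name"]

# Build one dict instead of 15+ separate variables.
fields = {name: none_to_blank(getattr(database, name)) for name in field_names}
print(fields)
```

Keeping the values in a dict also makes it easy to loop over them again when writing the XML.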
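You can simply use the following (note that this also replaces other falsy values, such as 0 or False, with ""):
color = database.color or ""
Another way is to use a function:
def filter_None(var):
    return "" if var is None else var
color = filter_None(database.color)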
I don't know how the database object is structured but another solution is to modify the database object like:
def myget(self, varname):
    value = self.__dict__[varname]
    return "" if (value is None) else value
DataBase.myget = myget
database = DataBase(...)
[...]
color = database.myget("color")
You can do better using descriptors or properties.
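As a rough illustration of the descriptor idea, the sketch below defines a hypothetical BlankIfNone descriptor and a stand-in DataBase class. Neither comes from the original code, and the real database class may not allow this kind of modification. It relies on __set_name__, which needs Python 3.6 or later.

```python
class BlankIfNone(object):
    """Hypothetical descriptor that returns "" when the stored value is None."""

    def __set_name__(self, owner, name):
        # Keep the real value under a private name on the instance.
        self.storage_name = "_" + name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        value = getattr(instance, self.storage_name, None)
        return "" if value is None else value

    def __set__(self, instance, value):
        setattr(instance, self.storage_name, value)


class DataBase(object):
    """Hypothetical stand-in for the real database class."""
    color = BlankIfNone()
    size = BlankIfNone()

    def __init__(self, color=None, size=None):
        self.color = color
        self.size = size


database = DataBase(size="large")
print(repr(database.color))
print(database.size)
```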
I am creating a Python CLI where the user can provide an operation they want to perform, e.g.:
sum 10 15
In my code, I have defined my classes as follows:
class Operation:
    # common stuff
    pass
class Sum(Operation):
    identifier = "sum"
    def perform(a, b):
        return a + b
class Difference(Operation):
    identifier = "diff"
    def perform(a, b):
        return a - b
Now, in my CLI, if I type sum 10 15 I want to return the result of Sum.perform(10, 15) and similarly if I type diff 10 15, I return the result of Difference.perform(10, 15), as sum is the identifier of class Sum and diff is the identifier of class Difference.
How do I dynamically access the class and its perform method, when I get the input directly from user input?
Classes in Python are first-class citizens, meaning they can be used as standard objects. In particular we can simply store them in a dictionary:
my_dict = {
    'sum': Sum,
    'diff': Difference,
}
and so on. Then when you get the operation name as string from command line you simply do
my_dict[op_name].perform(a, b)
Note that this is a very basic approach (and, as you will soon see, a problematic one: not all operators accept two arguments, for example) to what is known as parsing and abstract syntax trees. This is a huge topic, a bit hard but also very interesting. I encourage you to read about it.
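Putting the dictionary lookup together with input handling might look like the sketch below. The run_command helper and the OPERATIONS name are made up for illustration, perform is marked as a static method so it can be called on the class, and arguments are assumed to be exactly two numbers.

```python
class Operation:
    pass


class Sum(Operation):
    identifier = "sum"

    @staticmethod
    def perform(a, b):
        return a + b


class Difference(Operation):
    identifier = "diff"

    @staticmethod
    def perform(a, b):
        return a - b


# Map each identifier string to its class.
OPERATIONS = {cls.identifier: cls for cls in (Sum, Difference)}


def run_command(line):
    """Parse a line such as 'sum 10 15' and run the matching operation."""
    op_name, *raw_args = line.split()
    try:
        operation = OPERATIONS[op_name]
    except KeyError:
        raise ValueError("unknown operation: " + op_name)
    # Command-line input arrives as strings, so convert before computing.
    a, b = (float(arg) for arg in raw_args)
    return operation.perform(a, b)


print(run_command("sum 10 15"))
print(run_command("diff 10 15"))
```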
// EDIT: If you want to keep identifier on the class, then you can apply a simple class decorator:
my_dict = {}
def autoregister(cls):
    # It would be a good idea to check whether we
    # overwrite an entry here, to avoid errors.
    my_dict[cls.identifier] = cls
    return cls
@autoregister
class Sum(Operation):
    identifier = "sum"
    def perform(a, b):
        return a + b
print(my_dict)
You have to remember though to import all classes before you use my_dict. In my opinion an explicit dict is easier to maintain.
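The overwrite check mentioned in the decorator's comment could be sketched as follows. The registry name and the choice to raise ValueError are assumptions made for this example, not part of the original answer.

```python
registry = {}


def autoregister(cls):
    """Class decorator that records cls under its identifier."""
    identifier = cls.identifier
    if identifier in registry:
        # Refuse to silently replace an operation that is already registered.
        raise ValueError("duplicate identifier: %r" % identifier)
    registry[identifier] = cls
    return cls


class Operation:
    pass


@autoregister
class Sum(Operation):
    identifier = "sum"

    @staticmethod
    def perform(a, b):
        return a + b


@autoregister
class Difference(Operation):
    identifier = "diff"

    @staticmethod
    def perform(a, b):
        return a - b


print(sorted(registry))
print(registry["diff"].perform(10, 15))
```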
Reading your comment, I think you need to interpret the input. The way I would go about this is splitting the input by spaces (based on your example), and then checking that list. For example:
# This is the place you called the input:
input_unsplit = input("Enter your command and args: ")
input_list = input_unsplit.split()
# Check the first word to see what function we're calling
if input_list[0].lower() == "sum":
    # input() returns strings, so convert the arguments to numbers first
    result = Sum.perform(float(input_list[1]), float(input_list[2]))
    print(result)
# this logic can be applied to other functions as well.
This is a simple solution that could be hard to scale.
=== EDITED ===
I have more to add.
If used correctly, dir() can make a list of defined classes up to a certain point in the code. I wrote a calculator for my precalculus class, and in it I chose to use dir() after defining all the math classes, and then if the name met certain conditions (i.e. not main), it would be appended to a list of valid args to pass. You can modify your classes to include some kind of operator name attribute:
class Addition:
    op_name = "sum"
and then change perform to take in an array:
def perform(numbers):
    return numbers[0] + numbers[1]
This solves many of your scalability issues. Then, after declaring your classes, use dir() in a for loop to append to that valid array, like so:
valid_names = []
defined_names = dir()
for name in defined_names:
    if '_' not in name:
        if name not in ("sys","argparse","<any other imported module/defined var>"):
            valid_names.append(name)
Note that making this step work for you is all in the placement in the script. It's a bit tedious, but works flawlessly if handled correctly (in my experience).
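Then, you can use eval (tolerable in this context only because the names come from dir(), never from user input) to call the method you want:
# get input here
for name in valid_names:
    candidate = eval(name)
    if getattr(candidate, "op_name", None) == input_list[0].lower():
        # Skip the command word and convert the remaining arguments to numbers
        result = candidate.perform([float(arg) for arg in input_list[1:]])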
This should be a fairly easy-to-scale solution. Just watch that you keep the dir check up to date, and everything else just... works.
I am trying to set the attribute values of a certain class, AuxiliaryClass, that is instantiated in a method of the MainClass class, in the most efficient way possible.
AuxiliaryClass is instantiated within a method of MainClass - see below. However, AuxiliaryClass has many different attributes and I need to set the value of those attributes once the class has been instantiated - see the last 3 lines of my code.
Note: due to design constraints I cannot explain here, my classes only contain methods, meaning that I need to declare attributes as methods (see below).
class AuxiliaryClass(object):
    def FirstMethod(self):
        return None
    ...
    def NthMethod(self):
        return None
class MainClass(object):
    def Auxiliary(self):
        return AuxiliaryClass()
def main():
    obj = MainClass()
    obj.Auxiliary().FirstMethod = #some_value
    ...
    obj.Auxiliary().NthMethod = #some_other_value
    # ~~> further code
Basically I want to replace these last 3 lines of code with something neater, more elegant and more efficient. I know I could use a dictionary if I was instantiating AuxiliaryClass directly:
d = {'FirstMethod' : some_value,
     ...
     'NthMethod' : some_other_value}
obj = AuxiliaryClass(**d)
But this does not seem to work for the structure of my problem. Finally, I need to set the values of AuxiliaryClass's attributes once MainClass has been instantiated (so I can't set the attribute's values within method Auxiliary).
Is there a better way to do this than obj.Auxiliary().IthMethod = some_value?
EDIT
A couple of people have said that the following lines:
obj.Auxiliary().FirstMethod = #some_value
...
obj.Auxiliary().NthMethod = #some_other_value
will have no effect because they will immediately get garbage collected. I do not really understand what this means, but if I execute the following lines (after the lines above):
print(obj.Auxiliary().FirstMethod())
...
print(obj.Auxiliary().NthMethod())
I am getting the values I entered previously.
To speed things up, and make the customization somewhat cleaner, you can cache the result of the AuxiliaryClass constructor/singleton/accessor, and loop over a dict calling setattr().
Try something like this:
init_values = {
    'FirstMethod' : some_value,
    # ...
    'NthMethod' : some_other_value,
}
def main():
    obj = MainClass()
    aux = obj.Auxiliary()  # cache the call, only make it once
    for attr, value in init_values.items():  # python3 here, iteritems() in P2
        setattr(aux, attr, value)
    # other stuff below this point
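The difference caching makes can be seen in a small runnable sketch. The classes mirror the ones in the question, without the undisclosed memoizing decorators, and the lambda values are hypothetical placeholders for some_value and some_other_value.

```python
class AuxiliaryClass(object):
    def FirstMethod(self):
        return None

    def NthMethod(self):
        return None


class MainClass(object):
    def Auxiliary(self):
        # Returns a brand-new object on every call.
        return AuxiliaryClass()


obj = MainClass()

# Without caching: the value is set on a temporary object that is discarded.
obj.Auxiliary().FirstMethod = lambda: "lost"
print(obj.Auxiliary().FirstMethod())  # a fresh instance, so this is still None

# With caching: keep one instance and set everything on it.
init_values = {
    "FirstMethod": lambda: "first",   # hypothetical placeholder values
    "NthMethod": lambda: "nth",
}
aux = obj.Auxiliary()
for attr, value in init_values.items():
    setattr(aux, attr, value)

print(aux.FirstMethod(), aux.NthMethod())
```

Here aux has to be passed along to any later code that needs the customized values, because calling obj.Auxiliary() again would create another untouched instance.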
I understand what is happening here: my code has a series of decorators before all methods which allow memoization. I do not know exactly how they work but when used the problem described above - namely, that lines of type obj.Auxiliary().IthMethod = some_value get immediately garbage collected - does not occur.
Unfortunately I cannot give further details regarding these decorators as 1) I do not understand them very well and 2) I cannot transmit this information outside my company. I think under these circumstances it is difficult to answer my question because I cannot fully disclose all the necessary details.
I want to do something like "calling a method of a property":
class my_property(property):
    def __init__(self, ...):
        ...
    def some_method(self):
        # do something fancy like a plausibility check, publishing
        # etc. in context of the property.
This improved property would now be used like a normal property except I can call an extra method:
setattr(self.__class__, 'some_value', property(getter, setter))
self.some_value = 5
self.some_value.some_method()
Of course this won't work because some_method() would be called on what the property returned - which is 5 in my case.
Same for this approach:
some_function(self.some_value)
What I could do now is something like this:
some_function(self, "some_value")
But this is bad because this way tools like pylint cannot help me any more which is important for me (e.g. check whether self has an attribute some_value).
Is there a way to use the expression self.some_value without writing it as a quoted string and without just evaluating the property?
My approaches:
Don't use properties. Use getters and setters instead.
self.some_value.set(5)
x = self.some_value.get()
self.some_value.publish()
Use unintuitive black magic
self.some_value = 5 # will assign a value using `__set__()`
x = self.some_value # will read a value using `__get__()`
self.some_value = publish # calls `__set__()` with a well known magic value
publish would be some object known to __set__ which would make __set__ accomplish some task.
Use even darker magic
self.some_value = 5 # will assign a value using `__set__()`
x = self.some_value # will read a value using `__get__()`
some_method(self.some_value) # checks whether `__set__()` has been called from some_method()
This can be done by examining the call stack inside __get__() and checking whether some_method() is the next element. Instead of returning a value, it would produce the desired behavior.
Background:
I have implemented some sort of property tree which makes use of properties to implement complex get/set semantics. It can be used like this:
root.some.subtree.element = 5
element is a property attribute of root.some.subtree. This assignment would call its __set__ method, which might, for example, store the value on disk or send it through TCP.
In the same way
print(root.some.subtree.element)
might read and return some value from any provider.
Properties have set and get semantics, but now I need a new one, e.g. publish. So what I would like best is a property implementation with a new __publish__() method.
Of course I can easily read the property value and call a function like this:
publish_value('root.some.subtree.element', root.some.subtree.element)
But in this case pylint can't warn me if I misspelled the string expression.
Edit:
Originally I wrote
self.some_value = property(...)
to apply a property to self. This doesn't work this way so I corrected it. My intention is to let any object (in my case self) have a property which I can define extra operations on.
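One possible direction, offered only as a sketch: a property object is returned unchanged when it is looked up on the class rather than on an instance, so an extra method defined on a property subclass can be reached that way. PublishingProperty, publish and Node are hypothetical names, and whether pylint checks this form as thoroughly as desired has not been verified here.

```python
class PublishingProperty(property):
    """Hypothetical property subclass with one extra operation."""

    def publish(self, instance):
        # Read the current value through the normal getter, then "publish" it.
        value = self.fget(instance)
        return "published %r" % (value,)


class Node(object):
    def __init__(self):
        self._element = None

    def _get_element(self):
        return self._element

    def _set_element(self, value):
        self._element = value

    element = PublishingProperty(_get_element, _set_element)


node = Node()
node.element = 5                          # goes through the setter
print(node.element)                       # goes through the getter
print(type(node).element.publish(node))   # class lookup yields the property itself
```

The attribute name is still written as a plain expression, not as a string, at the cost of having to pass the instance explicitly.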
I never got formal OOP instruction, and just kinda stumbled my way through the basics in python, but am at a crossroads.
When dealing with an instantiated class object, is it better to assign attributes via methods, or have the methods just return the values? I've read a lot about not letting the object's state get out of whack, but can't figure out the best way. Here's a simple example:
import magic
class Histogram():
    def __init__(self,directory):
        self.directory = directory
    # Data Option 1
    def read_data(self):
        data = []
        file_ref = open(self.directory,'r')
        line = file_ref.readline()
        while line:
            data.append(line)
            line = file_ref.readline()
        return data
    # Data Option 2
    def set_data(self):
        data = []
        file_ref = open(self.directory,'r')
        line = file_ref.readline()
        while line:
            data.append(line)
            line = file_ref.readline()
        self.data = data
    # Hist Option 1
    def build_histogram(self):
        data = self.read_data()
        # It's not important what magic.histogram does.
        self.histogram = magic.histogram(data)
    # Hist Option 2
    def get_histogram(self,data):
        return magic.histogram(data)
    # Hist Option 3 - this requires self.set_data() to have already run.
    def build_histogram_2(self):
        self.histogram = magic.histogram(self.data)
So Data Option 1 forces the user to either call that and store it somewhere to use in conjunction with Hist Option 2 or store it in self.data to use with Hist Option 3.
Data Option 2 lets you use Hist Option 3, but you still have had to already run set_data.
So my real question is, for a class with methods to do different things, that often CAN but don't HAVE to be chained together, how should I write it? Implicitly setting attributes and risk getting the state messed up? Return variables and let the "User" set them? Have getters for the attributes that my methods use, and if the attributes don't exist handle that somehow?
Please let me know if you need better explanation, or another example or anything.
Ask what the object represents. Does that data reasonably belong to the object itself?
In this case, I would say yes. Your data option 2 is loading the data which reasonably "belongs to" the histogram object -- although it would be reasonable to argue that the constructor ought to just load it rather than requiring a separate method call to accomplish that.
Also, if you go the other way, this is not really an object; you're simply using the object framework to collect some related subroutines.
I think if the value might be used by many methods, you had better set it on instantiation (__init__). If you need to calculate it every time you use it, then it doesn't need to be an attribute (just a variable), and it should be calculated when you are going to use it.
In the end, what I would try to do is avoid making my object behave like a state machine, where you don't know which methods you can or can't call at a given moment because you don't know which values have already been calculated.
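A sketch of what these suggestions could look like together: the data is loaded in the constructor, and the histogram is computed on first access and then kept. collections.Counter stands in for magic.histogram from the question, the temporary file exists only to make the example runnable, and functools.cached_property needs Python 3.8 or later.

```python
import os
import tempfile
from collections import Counter
from functools import cached_property


class Histogram:
    def __init__(self, path):
        self.path = path
        # Load once, up front, so every method can rely on self.data.
        with open(path, "r") as file_ref:
            self.data = [line.rstrip("\n") for line in file_ref]

    @cached_property
    def histogram(self):
        # Computed on first access, then stored on the instance.
        # Counter is a stand-in for magic.histogram.
        return Counter(self.data)


# Write a small sample file so the example can run on its own.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
    handle.write("a\nb\na\n")

hist = Histogram(handle.name)
print(hist.histogram)
os.remove(handle.name)
```

With this layout there is no call order to remember: any instance that was constructed successfully already has its data, and the histogram is available whenever it is first asked for.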
I apologize for the newbie question, but this is my first time working with classes. The class I'm trying to create is intended to perform a regex find and replace on all keys and values within a dictionary. The specific find and replace is defined upon instantiation.
There are two issues that I have. The first issue is that each instance of the class needs to accept a new dictionary. I'm not clear on how to create a class that accepts a general dictionary which I can specify upon creating an instance.
The second issue is that the class I have simply isn't working. I'm receiving the error message TypeError: expected string or buffer in the class line v = re.sub(self.find,self.replace,v).
There are three instances I want to create, one for each input dictionary: input_iter1, input_iter2, and input_iter3.
The following is the class:
class findreplace:
    values = []
    keys = []
    def __init__(self, find, replace):
        self.find = find
        self.replace = replace
    def value(self):
        for k,v in input_iter1.items():
            v = re.sub(self.find,self.replace,v)
            findreplace.values.append(v)
    def key(self):
        for k,v in input_iter1.items():
            k = re.sub(self.find,self.replace,k)
            findreplace.keys.append(k)
The following are the instances:
values1 = findreplace('[)?:(]','')
values1.value()
values2 = findreplace(r'(,\s)(,\s)(\d{5})({e<=1})',r'\2\3')
values2.value()
keys1 = findreplace(r'(?<=^)(.+)(?=$)',r'(?:\1)')
keys1.key()
keys2 = findreplace(r'(?=$)',r'{e}')
keys2.key()
print values
print keys
If anyone has any insight on how I can work around these two issues, I'd be grateful to hear it. Thanks!
First, Python 2 classes should start off this way:
class Foo(object):
Otherwise, you get an "old-style class", which is some ancient crusty thing no one uses.
Also, class names in Python are typically written in CamelCase.
Second, do not use mutable values (like lists!) as class attributes, as you're doing here with keys and values. They'll be shared across all instances of your class! It looks like you're even aware of this, since you refer to findreplace.keys directly, but it doesn't make sense to store instance-specific values in a class attribute like that.
But, most importantly: why is this a class at all? What does a findreplace represent? It looks like this would be much clearer if it were just a single function.
To answer your actual questions:
You pass in a dictionary just like you're passing in find and replace. Add another argument to __init__, and pass another argument when you construct your class.
Presumably, you're getting the TypeError because one of the values in your dictionary isn't a string, and you can only perform regexes on strings.
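The single-function version suggested above might look like the sketch below. It is written for Python 3, the find_replace name and the sample dictionary are made up for illustration, and leaving non-string items untouched is just one way to avoid the TypeError.

```python
import re


def find_replace(mapping, find, replace):
    """Return a new dict with the substitution applied to string keys and values."""
    pattern = re.compile(find)

    def substitute(item):
        # re.sub only accepts strings (or bytes), so leave anything else untouched.
        return pattern.sub(replace, item) if isinstance(item, str) else item

    return {substitute(k): substitute(v) for k, v in mapping.items()}


input_iter1 = {"(a):": "x?y", "b": 42}   # hypothetical sample data
cleaned = find_replace(input_iter1, r"[)?:(]", "")
print(cleaned)
```

Because the dictionary is an ordinary argument, the same function can be called once each for input_iter1, input_iter2 and input_iter3.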
Where is your definition of the input_iter dicts? What do they look like? Your error indicates that the values of your dicts are not strings.