I have a config.cfg which I parse using the Python module ConfigParser. In one section I want to configure assignments of the form fileextension : ClassName. Parsing results in the following dictionary:
types = {
    "extension1": "ClassName1",
    "extension2": "ClassName2"
}
EDIT: I know I can now do:
class_ = eval(types[extension])
foo = class_()
But I was given to understand that eval is evil and should not be used.
Do you know a nicer way to dynamically configure which file-extension results in which class?
You could use eval, if the class name in the config file exactly matches the class names in your Python code (and if the classes are in scope!), but... eval is evil (a coincidence that there's only one letter difference? I think not!)
A safer way to do it would be to add an extra dictionary that maps the class names used in the configuration to the actual classes in your Python code. I'd do this because:
configuration files don't have to know about your code's internal names
you can change config files without changing code, and vice versa
it avoids eval
So it'd look something like:
mappingDict = {"ClassName1" : MyPythonClass1,
"ClassName2" : MyPythonClass2, ... }
# keys are strings, values are classes
Then you perform a lookup using the value from the config file:
myClassName = types['extension1']
myClass = mappingDict[myClassName]
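Putting it together, a minimal sketch (this assumes the assignments live in a [types] section of config.cfg; the classes here are illustrative):

import ConfigParser  # configparser in Python 3

class MyPythonClass1(object): pass
class MyPythonClass2(object): pass

mappingDict = {"ClassName1": MyPythonClass1,
               "ClassName2": MyPythonClass2}

parser = ConfigParser.ConfigParser()
parser.read("config.cfg")
types = dict(parser.items("types"))

class_ = mappingDict[types["extension1"]]  # look up the class object
foo = class_()                             # instantiate it -- no eval involved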
If module is the module the class named classname lives in, you can get the class object using
class_ = getattr(module, classname)
(If the class lives in the main module, use import __main__ to get a module object for this module.)
To look up the class in the current module's global scope, use
class_ = globals()[classname]
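For example, a quick sketch (the plugins module is hypothetical; substitute wherever your classes actually live):

import plugins  # hypothetical module that defines ClassName1, ClassName2

classname = types["extension1"]        # e.g. "ClassName1"
class_ = getattr(plugins, classname)   # or: class_ = globals()[classname]
foo = class_()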
I think a static dictionary as in Matt's answer is the better solution.
Question
Is there a "pythonic" (i.e. canonical, official, PEP8-approved, etc) way to re-use string literals in python internal (and external) APIs?
Background
For example, I'm working with some (inconsistent) JSON-handling code (thousands of lines) where there are various JSON "structs" we assemble, parse, etc. One of the recurring problems that comes up during code reviews is different JSON structs that use the same internal parameter names, causing confusion and eventually causing bugs to arise, e.g.:
pathPacket['src'] = "/tmp"
pathPacket['dst'] = "/home/user/out"
urlPacket['src'] = "localhost"
urlPacket['dst'] = "contoso"
These two (example) packets have dozens of identically named fields, but they represent very different types of data. There was no code-reuse justification for this implementation. People typically use code-completion engines to get the members of the JSON struct, and this eventually leads to hard-to-debug problems down the road: mis-typed string literals cause functional issues without triggering an error earlier on. When we have to change these APIs, it takes a lot of time to hunt down the string literals to find out which JSON structs use which fields.
Question - Redux
Is there a better approach to this that is common amongst members of the Python community? If I were doing this in C++, the earlier example would be something like:
const char *JSON_PATH_SRC = "src";
const char *JSON_PATH_DST = "dst";
const char *JSON_URL_SRC = "src";
const char *JSON_URL_DST = "dst";
// Define/allocate JSON structs
pathPacket[JSON_PATH_SRC] = "/tmp";
pathPacket[JSON_PATH_DST] = "/home/user/out";
urlPacket[JSON_URL_SRC] = "localhost";
urlPacket[JSON_URL_SRC] = "contoso";
My initial approach would be to:
Use abc to make an abstract base class that can't be instantiated, and populate it with read-only constants.
Use that class as a common module throughout my project.
By using these constants, I can reduce the chance of a monkey-patching error as the symbols won't exist if mis-spelled, whereas a string literal typo can slip through code reviews.
My Proposed Solution (open to advice/criticism)
from abc import ABCMeta
class Custom_Structure:
    __metaclass__ = ABCMeta

    @property
    def JSON_PATH_SRC(self):
        return self._JSON_PATH_SRC

    @property
    def JSON_PATH_DST(self):
        return self._JSON_PATH_DST

    @property
    def JSON_URL_SRC(self):
        return self._JSON_URL_SRC

    @property
    def JSON_URL_DST(self):
        return self._JSON_URL_DST
The way this is normally done is:
JSON_PATH_SRC = "src"
JSON_PATH_DST = "dst"
JSON_URL_SRC = "src"
JSON_URL_DST = "dst"
pathPacket[JSON_PATH_SRC] = "/tmp"
pathPacket[JSON_PATH_DST] = "/home/user/out"
urlPacket[JSON_URL_SRC] = "localhost"
urlPacket[JSON_URL_SRC] = "contoso"
Upper-case to denote "constants" is the way it goes. You'll see this in the standard library, and it's even recommended in PEP8:
Constants are usually defined on a module level and written in all
capital letters with underscores separating words. Examples include
MAX_OVERFLOW and TOTAL.
Python doesn't have true constants, and it seems to have survived without them. If it makes you feel more comfortable wrapping this in a class that uses ABCMeta with properties, go ahead. In fact, I'm pretty sure abc.ABCMeta does not prevent object initialization on its own. Indeed, if it did, your use of property would not work! property objects belong to the class, but are meant to be accessed from an instance. To me, it just looks like a lot of rigamarole for very little gain.
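For what it's worth, a quick sketch showing that ABCMeta by itself does not block instantiation (Python 2 spelling, to match the question):

from abc import ABCMeta

class Custom_Structure(object):
    __metaclass__ = ABCMeta   # no abstract methods declared

c = Custom_Structure()        # succeeds: ABCMeta only blocks instantiation
                              # when abstract methods are left unimplemented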
The easiest way in my opinion to make constants is just to set them as variables in your module (and not modify them).
JSON_PATH_SRC = "src"
JSON_PATH_DST = "dst"
JSON_URL_SRC = "src"
JSON_URL_DST = "dst"
Then if you need to reference them from another module they're already namespaced for you.
>>> that_module.JSON_PATH_SRC
'src'
>>> that_module.JSON_PATH_DST
'dst'
>>> that_module.JSON_URL_SRC
'src'
>>> that_module.JSON_URL_DST
'dst'
The simplest way to create a bunch of constants is to place them into a module, and import them as necessary. For example, you could have a constants.py module with
JSON_PATH_SRC = "src"
JSON_PATH_DST = "dst"
JSON_URL_SRC = "src"
JSON_URL_DST = "dst"
Your code would then do something like
from constants import JSON_URL_SRC
...
urlPacket[JSON_URL_SRC] = "localhost"
If you would like a better defined grouping of the constants, you can either stick them into separate modules in a dedicated package, allowing you to access them like constants.json.url.DST for example, or you could use Enums. The Enum class allows you to group related sets of constants into a single namespace. You could write a module constants.py like this:
from enum import Enum

class JSONPath(Enum):
    SRC = 'src'
    DST = 'dst'

class JSONUrl(Enum):
    SRC = 'src'
    DST = 'dst'
OR
from enum import Enum

class JSON(Enum):
    PATH_SRC = 'src'
    PATH_DST = 'dst'
    URL_SRC = 'src'
    URL_DST = 'dst'
How exactly you separate your constants is up to you. You can have a single giant enum, one per category, or something in between. You would access them in your code like this:
from constants import JSONUrl
...
urlPacket[JSONUrl.SRC.value] = "localhost"
OR
from constants import JSON
...
urlPacket[JSON.URL_SRC.value] = "localhost"
I want to use Python for creating JSON.
Since I found no library which can help me, I want to know: is it possible to inspect the order of the classes in a Python file?
Example
# example.py
class Foo:
    pass

class Bar:
    pass
If I import example, I want to know the order of the classes. In this case it is [Foo, Bar] and not [Bar, Foo].
Is this possible? If "yes", how?
Background
I am not happy with yaml/json. I have a vague idea of creating config via Python classes (only classes, no instantiation to objects).
Answers which help me get to my goal (creating JSON with a tool that is easy and fun to use) are welcome.
The inspect module can tell you the line numbers of the class declarations:
import inspect

def get_classes(module):
    for name, value in inspect.getmembers(module):
        if inspect.isclass(value):
            _, line = inspect.getsourcelines(value)
            yield line, name
So the following code:
import example

for line, name in sorted(get_classes(example)):
    print line, name
Prints:
2 Foo
5 Bar
First up, as I see it, there are 2 things you can do...
Continue trying to use Python source files as configuration files. (I won't recommend this. It's analogous to using a bulldozer to strike a nail, or converting a shotgun into a wheel.)
Switch to something like TOML, JSON or YAML for configuration files, which are designed for the job.
Nothing in JSON or YAML prevents them from holding "ordered" key-value pairs. Python's dict type is unordered (at least up to 3.5) and its list type is ordered; these map directly to object and array in JSON respectively, when using the default loaders. Just use something like Python's OrderedDict when deserializing them and voila, you preserve order!
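For example, with the standard json module:

import json
from collections import OrderedDict

text = '{"first": 1, "second": 2, "third": 3}'
data = json.loads(text, object_pairs_hook=OrderedDict)
print(list(data.keys()))  # ['first', 'second', 'third'] -- order preserved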
With that out of the way, if you really want to use Python source files for the configuration, I suggest trying to process the file using the ast module. Abstract Syntax Trees are a powerful tool for syntax level analysis.
I whipped up a quick script that extracts class line numbers and names from a file.
You (or anyone, really) can use it or extend it with more checks, for whatever you need.
import sys
import ast
import json


class ClassNodeVisitor(ast.NodeVisitor):
    def __init__(self):
        super(ClassNodeVisitor, self).__init__()
        self.class_defs = []

    def visit(self, node):
        super(ClassNodeVisitor, self).visit(node)
        return self.class_defs

    def visit_ClassDef(self, node):
        self.class_defs.append(node)


def read_file(fpath):
    with open(fpath) as f:
        return f.read()


def get_classes_from_text(text):
    tree = ast.parse(text)
    class_extractor = ClassNodeVisitor()
    li = []
    for definition in class_extractor.visit(tree):
        li.append([definition.lineno, definition.name])
    return li


def main():
    fpath = "/tmp/input_file.py"
    try:
        text = read_file(fpath)
    except Exception as e:
        print("Could not load file due to " + repr(e))
        return 1
    print(json.dumps(get_classes_from_text(text), indent=4))


if __name__ == '__main__':
    sys.exit(main())
Here's a sample run on the following file:
input_file.py:
class Foo:
    pass


class Bar:
    pass
Output:
$ py_to_json.py input_file.py
[
[
1,
"Foo"
],
[
5,
"Bar"
]
]
If I import example,
If you're going to import the module, the example module has to be on the import path. Importing means executing any Python code in the example module. This is a pretty big security hole: you're loading a user-editable file in the same context as the rest of the application.
I'm assuming that since you care about preserving class-definition order, you also care about preserving the order of definitions within each class.
It is worth pointing out that this is now the default behavior in Python, since Python 3.6.
Also see PEP 520: Preserving Class Attribute Definition Order.
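A quick illustration on Python 3.6+ (the class is illustrative):

class Config:
    first = 1
    second = 2
    third = 3

# the class namespace preserves definition order
print([name for name in Config.__dict__ if not name.startswith('__')])
# ['first', 'second', 'third']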
(Moving my comments to an answer)
That's a great vague idea. You should give Figura a shot! It does exactly that.
(Full disclosure: I'm the author of Figura.)
I should point out that the order of declarations is not preserved in Figura, and also not in JSON.
I'm not sure about order-preservation in YAML, but I did find this on wikipedia:
... according to the specification, mapping keys do not have an order
It might be the case that specific YAML parsers maintain the order, though they aren't required to.
You can use a metaclass to record each class's creation time, and later, sort the classes by it.
This works in Python 2:
class CreationTimeMetaClass(type):
    creation_index = 0

    def __new__(cls, clsname, bases, dct):
        dct['__creation_index__'] = cls.creation_index
        cls.creation_index += 1
        return type.__new__(cls, clsname, bases, dct)

__metaclass__ = CreationTimeMetaClass

class Foo: pass
class Bar: pass

classes = [cls for cls in globals().values() if hasattr(cls, '__creation_index__')]
print(sorted(classes, key=lambda cls: cls.__creation_index__))
The standard json module is easy to use and works well for reading and writing JSON config files.
Objects are not ordered within JSON structures but lists/arrays are, so put order dependent information into a list.
I have used classes as a configuration tool; the thing I did was to derive them from a base class which was customised by the particular class variables. Used this way, I did not need a factory class. For example:
from .artifact import Application
class TempLogger(Application): partno='03459'; path='c:/apps/templog.exe'; flag=True
class GUIDisplay(Application): partno='03821'; path='c:/apps/displayer.exe'; flag=False
in the installation script
from .install import Installer
import app_configs

installer = Installer(apps=(app_configs.TempLogger(), app_configs.GUIDisplay()))
installer.baseline('1.4.3.3475')
print installer.versions()
print installer.bill_of_materials()
One should use the right tools for the job, so perhaps python classes are not the right tool if you need ordering.
Another Python tool I have used to create JSON files is the Mako templating system. It is very powerful. We used it to populate variables like IP addresses etc. into static JSON files that were then read by C++ programs.
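For example, a tiny sketch (the template and values here are made up; Mako is a third-party package):

from mako.template import Template  # pip install Mako

template = Template('{"host": "${ip}", "port": ${port}}')
print(template.render(ip="10.0.0.1", port=8080))
# {"host": "10.0.0.1", "port": 8080}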
I'm not sure if this answers your question, but it might be relevant. Take a look at the excellent attrs module. It's great for creating classes to use as data types.
Here's an example from glyph's blog (creator of Twisted Python):
import attr

@attr.s
class Point3D(object):
    x = attr.ib()
    y = attr.ib()
    z = attr.ib()
It saves you writing a lot of boilerplate code - you get things like str representation and comparison for free, and the module has a convenient asdict function which you can pass to the json library:
>>> p = Point3D(1, 2, 3)
>>> str(p)
'Point3D(x=1, y=2, z=3)'
>>> p == Point3D(1, 2, 3)
True
>>> json.dumps(attr.asdict(p))
'{"y": 2, "x": 1, "z": 3}'
The module uses a strange naming convention, but read attr.s as "attrs" and attr.ib as "attrib" and you'll be okay.
Just touching on the point about creating JSON from Python: there is an excellent library called jsonpickle which lets you dump Python objects to JSON. (Using this alone, or together with the other methods mentioned here, you can probably get what you want.)
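For instance, a minimal sketch (the Point class is illustrative; jsonpickle is a third-party package):

import jsonpickle  # pip install jsonpickle

class Point(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y

frozen = jsonpickle.encode(Point(1, 2))  # a JSON string, including type info
thawed = jsonpickle.decode(frozen)       # back to a Point instance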
I'm creating a backend application with SQLAlchemy using the declarative base. The ORM requires about 15 tables each of which maps to a class object in SQLAlchemy. Because these class objects are all defined identically I thought a factory pattern could produce the classes more concisely. However, these classes not only have to be defined, they have to be assigned to unique variable names so they can be imported and used through the project.
(Sorry if this question is a bit long, I updated it as I better understood the problem.)
Because we have so many columns (~1000) we define their names and types in external text files to keep things readable. Having done that, one way to go about declaring our models is like this:
class Foo1(Base):
    __tablename__ = 'foo1'

class Foo2(Base):
    __tablename__ = 'foo2'

... etc
and then I can add the columns by looping over the contents of the external text file and calling setattr() on each class definition, as sketched below.
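For illustration, a hedged sketch of what that column loop might look like (the file format, the type names, and TYPE_MAP are assumptions, and a real declarative model also needs a primary key column):

from sqlalchemy import Column, Integer, String

TYPE_MAP = {'int': Integer, 'str': String}  # hypothetical type-name mapping

def define_columns(cls, header_keyword_list):
    # header_keyword_list: e.g. [('depth', 'int'), ('label', 'str')]
    for name, type_name in header_keyword_list:
        setattr(cls, name, Column(TYPE_MAP[type_name]))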
This is OK but it feels too repetitive as we have about 15 tables. So instead I took a stab at writing a factory function that could define the classes dynamically.
def orm_factory(class_name):
    class NewClass(Base):
        __tablename__ = class_name.lower()
    NewClass.__name__ = class_name.upper()
    return NewClass
Again I can just loop over the columns and use setattr(). When I put it together it looks like this:
for class_name in class_name_list:
    ORMClass = orm_factory(class_name)
    header_keyword_list = get_header_keyword_list(class_name)
    define_columns(ORMClass, header_keyword_list)
Where get_header_keyword_list gets the column information and define_columns performs the setattr() assignment. When I use this and run Base.metadata.create_all() the SQL schema gets generated just fine.
But when I then try to import these class definitions into another module I get an error like this:
SAWarning: The classname 'NewClass' is already in the registry of this declarative base, mapped to <class 'ql_database_interface.IR_FLT_0'>
This, I now realize, makes total sense based on what I learned yesterday: Python class variable name vs __name__.
You can address this by using type as a class generator in your factory function (as two of the answers below do). However, this does not solve the issue of being able to import the class: while the classes are dynamically constructed in the factory function, the variable the output of that function is assigned to is static. Even if it were dynamic, such as a dictionary key, it would have to be in the module namespace in order to be imported from another module. See my answer for more details.
This sounds like a sketchy idea. But it's fun to solve, so here is how you make it work.
As I understand it, your problem is that you want to add dynamically created classes to a module. I created a hack using a module and its __init__.py file.
dynamicModule/__init__.py:
import dynamic

class_names = ["One", "Two", "Three"]

for new_name in class_names:
    dynamic.__dict__['Class%s' % new_name] = type(
        "Class%s" % new_name, (object,), {'attribute_one': 'blah'})
dynamicModule/dynamic.py:
"""Empty file"""
test.py:
import dynamicModule
from dynamicModule import dynamic
from dynamicModule.dynamic import ClassOne
dynamic.ClassOne
"""This all seems evil but it works for me on python 2.6.5"""
__init__.py:
"""Empty file"""
[Note, this is the original poster]
So after some thinking and talking to people, I've decided that the ability to dynamically create and assign variables to class objects in the global namespace in this way just isn't something Python supports (and likely with good reason). Even though I think my use case isn't too crazy (pumping out a predefined list of identically constructed classes), it's just not supported.
There are lots of questions that point towards using a dictionary in a case like this, such as this one: https://stackoverflow.com/a/10963883/1216837. I thought of something like that, but the issue is that I need those classes in the module namespace so I can import them into other modules. However, adding them with globals(), like globals()['MyClass'] = class_dict['MyClass'], seems like it's getting pretty out there, and my impression is that people on SO frown on using globals() like this.
There are hacks such as the one suggested by patjenk, but at a certain point the obfuscation and complexity outweigh the benefits of the clarity of declaring each class object statically. So while it seems repetitive, I'm just going to write out all the class definitions. Really, this ends up being pretty concise/maintainable:
Class1 = class_factory('class1')
Class2 = class_factory('class2')
...
I have a script as follows
from mapper import Mapper

class A(object):
    def foo(self):
        print "world"

a = A()
a.foo()
Mapper['test']()
with Mapper defined in the file mapper.py:
Mapper = {'test': a.foo}
where I want to define a function call referencing an object that is not defined in mapper.py, but in the original code. However, the code above gives the error
NameError: name 'a' is not defined
which makes kind of sense, as a is not defined in mapper.py itself. However, is it possible to change the code so that the name resolution happens in the main code, perhaps by the use of globals or something?
To solve this problem I could specify the implementation in mapper.py as text and use eval in the main code, but I would like to avoid the usage of eval.
Additional information:
The full definition of the function has to be made in mapper.py
It is not known beforehand what the instance a is, or from what class it is instantiated.
Barring security holes like eval, it's not possible to use a name a in mapper.py unless the name is either defined somewhere in mapper.py or imported from another module. There is no way to just let mapper.py automatically and silently access a value a from a different module.
In addition, if you're using it just in a dict as in your example, a.foo is going to be evaluated as soon as the dict is created. It's not going to wait until you actually call the function; as soon as it evaluates a.foo to create the dict, it will fail because it doesn't know what a is.
You could get around this second problem by wrapping the element in a function (using a lambda for brevity):
Mapper = {'test': lambda: a.foo}
... but this still won't help unless you can somehow get a to be available inside mapper.py.
One possibility is to parameterize your Mapper by the "mystery" object and then pass that object in from outside:
# mapper.py
Mapper = {'test': lambda a: a.foo}
# other module
from mapper import Mapper
Mapper['test'](a)()
Or, similar to what mgilson suggested, you could "register" the object a with Mapper somehow. This lets you pass the object a only once to register it, and then you don't have to pass it for every call:
# mapper.py
Mapper = {'test': lambda: Mapper['a'].foo}
# other module
from mapper import Mapper
Mapper['a'] = a
Mapper['test']()()
Note the two sets of parentheses at the end there: one set to evaluate the lambda and extract the function you want to call, and the second set to actually call that function. You could do a similar deal by, instead of using Mapper['a'] as the reference, using a module-level variable:
# mapper.py
Mapper = {'test': lambda: a.foo}
# other module
import mapper
Mapper = mapper.Mapper
mapper.a = a
Mapper['test']()()
Note that this requires you to do import mapper in order to set the module variable in that other module.
You could streamline this somewhat by using a custom class for Mapper instead of a regular dict, and having that class do some work in its __getitem__ to look in a "known location" (e.g., read some module variable) to use as a base for evaluating a. That would be a heavier-weight solution though.
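For illustration, a minimal sketch of that heavier-weight idea (the class and key names here are made up):

# mapper.py
class ContextMapper(object):
    def __init__(self):
        self.context = None                        # the "known location"
        self._entries = {'test': lambda ctx: ctx.foo}

    def __getitem__(self, key):
        # resolve the entry against whatever object was registered
        return self._entries[key](self.context)

Mapper = ContextMapper()

# other module
# import mapper
# mapper.Mapper.context = a
# mapper.Mapper['test']()    # fetches a.foo and calls it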
The bottom line is that you simply cannot (again, without the use of eval or other such holes) write code in mapper.py that uses an undefined variable a, and then define a variable a in another module and have mapper.py automatically know about that. There has to be some line of code somewhere that "tells" mapper.py what value of a you want it to use.
I'm not sure I completely follow, but a could "register" its method with Mapper from anywhere which has a reference to Mapper:
#mapping.py
Mapper = {}
and then:
#main.py
from mapping import Mapper
#snip
a = A()
Mapper['test'] = a.foo #put your instance method into the Mapper dict.
#snip
Mapper['test']()
I'd like to init a class from data stored in a simple Python file that is specified when calling the script. The config file, named myconfig.py, is:
str='home'
val=2
flt=7.0
I'd like to use it during class initialization, like so. One of the objectives is to define variable types in the file as well. I know of configparser, but this method is less verbose if it can be made to work.
import imp
from sys import argv

class ClassInit(object):
    def __init__(self, configName, srcDir):
        fp, path, des = imp.find_module(configName, [srcDir])
        config_mod = imp.load_module(configName, fp, path, des)
        self.__dict__ = config_mod.__dict__
        fp.close()

    def printVal(self):
        print '%s %0.2f' % (self.str, self.val)

if __name__ == '__main__':
    srcDir = 'src/'
    configName = argv[1]  # module name of the config for the current run, e.g. 'myconfig'
    ci = ClassInit(configName, srcDir)
    ci.printVal()
Is anything like this possible?
Well, there are several ways to do this. The easiest way would be to use eval() or exec to evaluate this code within the class scope. But that's also the most dangerous way, especially if these files can be created by someone other than you. In that case, the creator can write malicious code that can pretty much do anything. You can override the __builtins__ key of the globals dictionary, but I'm not sure if this makes eval/exec entirely safe. For example:
class ClassInit(object):
    def __init__(self, configFile):
        f = open(configFile)
        config = f.read()
        f.close()
        config_dic = {'__builtins__': None}
        exec config in config_dic  # run the config file's code in a stripped-down namespace
        for key, value in config_dic.iteritems():
            if key != '__builtins__':
                setattr(self, key, value)
This method kills the unsafe 'builtins' object, but it's still not quite safe. For instance, the file may be able to define a function which would override one of your class's functions with malicious code. So I really don't recommend it, unless you absolutely control those .py files.
A safer but more complex way would be to create a custom interpreter that interprets this file but doesn't allow running any custom code.
You can read the following thread, to see some suggestions for parsing libraries or other safer alternatives to eval():
Python: make eval safe
Besides, if all you ever need your config.py file for is to initialize some variables in a nice way, and you don't need to be able to call fancy Python functions from inside it, you should consider using JSON instead. Python 2.6 and up includes the json module (a version of simplejson), which you can use to initialize an object from a file. The syntax is JavaScript rather than Python, but for initializing variables there's little difference.
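A minimal sketch of that idea (assuming a myconfig.json that mirrors the question's myconfig.py):

# src/myconfig.json contains: {"str": "home", "val": 2, "flt": 7.0}
import json

class ClassInit(object):
    def __init__(self, configFile):
        with open(configFile) as f:
            self.__dict__.update(json.load(f))  # each key becomes an attribute

ci = ClassInit('src/myconfig.json')
print '%s %0.2f' % (ci.str, ci.val)  # home 2.00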
Can you try self.__dict__.update(config_mod.__dict__)? I don't see why that wouldn't work.