How to initialize a class with data from a Python file

I'd like to initialize a class from data stored in a simple Python file that is specified when calling the script. The config file, named myconfig.py, is:
str='home'
val=2
flt=7.0
I'd like to load it during class initialization, like so. One of the objectives is to define the variable types in the file as well. I know of configparser, but this method is less verbose if it can be made to work.
import imp
import os.path as osp
from sys import argv

class ClassInit(object):
    def __init__(self, configFile):
        fp, path, des = imp.find_module('', configFile)
        imp.load_module(configFile, fp, path, des)
        self.__dict__ = configFile.__dict__
        fp.close()

    def printVal(self):
        print '%s %0.2f' % (self.str, self.val)

if __name__ == '__main__':
    srcDir = 'src/'
    config = osp.join(srcDir, argv[0])  # config for current run
    ci = ClassInit(config)
    ci.printVal()
Is anything like this possible?

Well, there are several ways to do this. The easiest way would be to use eval() or exec to evaluate this code within the class scope. But that's also the most dangerous way, especially if these files can be created by someone other than you. In that case, the creator can write malicious code that can pretty much do anything. You can override the __builtins__ key of the globals dictionary, but I'm not sure if this makes eval/exec entirely safe. For example:
class ClassInit(object):
    def __init__(self, configFile):
        with open(configFile) as f:
            config = f.read()
        config_dic = {'__builtins__': None}
        exec config in config_dic
        for key, value in config_dic.iteritems():
            if key != '__builtins__':
                setattr(self, key, value)
This method kills the unsafe 'builtins' object, but it's still not quite safe. For instance, the file may define a function that overrides one of your class's functions with malicious code. So I really don't recommend it, unless you absolutely control those .py files.
A safer but more complex way would be to create a custom interpreter that interprets this file but doesn't allow running any custom code.
You can read the following thread, to see some suggestions for parsing libraries or other safer alternatives to eval():
Python: make eval safe
Besides, if all you ever need your config.py file for is to initialize some variables in a nice way, and you don't need to be able to call fancy Python functions from inside it, you should consider using JSON instead. Python 2.6 and up includes the json module (a bundled version of simplejson), which you can use to initialize an object from a file. The syntax is JavaScript rather than Python, but for initializing variables there's little difference.
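For example, a minimal sketch of the JSON approach (the file name myconfig.json and its contents are made up to mirror the myconfig.py above):

import json

class ClassInit(object):
    def __init__(self, config_file):
        # json.load returns native Python types (str, int, float),
        # so the file still controls the variable types
        with open(config_file) as f:
            for key, value in json.load(f).items():
                setattr(self, key, value)

With myconfig.json containing {"str": "home", "val": 2, "flt": 7.0}, this gives the same attributes as the .py version.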

Can you try self.__dict__.update(configFile.__dict__)? I don't see why that wouldn't work.
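For that to work, configFile has to be the loaded module object rather than the path string. A sketch using imp.load_source (assuming the path points at the .py file):

import imp

class ClassInit(object):
    def __init__(self, config_path):
        # load the module object from the path, then copy its namespace
        module = imp.load_source('myconfig', config_path)
        self.__dict__.update(module.__dict__)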


Use Python for Creating JSON

I want to use Python for creating JSON.
Since I found no library that can help me, I want to know whether it's possible to inspect the order of the classes in a Python file.
Example
# example.py
class Foo:
    pass

class Bar:
    pass
If I import example, I want to know the order of the classes. In this case it is [Foo, Bar] and not [Bar, Foo].
Is this possible? If "yes", how?
Background
I am not happy with YAML/JSON. I have the vague idea of creating config via Python classes (only classes, no instantiation to objects).
Answers that help me get to my goal (creating JSON with a tool that is easy and fun to use) are welcome.
The inspect module can tell the line numbers of the class declarations:
import inspect

def get_classes(module):
    for name, value in inspect.getmembers(module):
        if inspect.isclass(value):
            _, line = inspect.getsourcelines(value)
            yield line, name
So the following code:
import example

for line, name in sorted(get_classes(example)):
    print line, name
Prints:
2 Foo
5 Bar
First up, as I see it, there are two things you can do...
Keep pursuing the use of Python source files as configuration files. (I don't recommend this; it's analogous to using a bulldozer to drive a nail, or converting a shotgun into a wheel.)
Switch to something like TOML, JSON or YAML for configuration files, which are designed for the job.
Nothing in JSON or YAML prevents them from holding "ordered" key-value pairs. Python's dict data type is unordered by default (at least until 3.5) and its list data type is ordered. These map directly to object and array in JSON, respectively, when using the default loaders. Just use something like Python's OrderedDict when deserializing and, voilà, you preserve order!
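For instance, a minimal order-preserving load might look like this (the file name is a placeholder):

import json
from collections import OrderedDict

with open('config.json') as f:
    # object_pairs_hook turns every JSON object into an OrderedDict,
    # so the key order in the file is preserved
    config = json.load(f, object_pairs_hook=OrderedDict)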
With that out of the way, if you really want to use Python source files for the configuration, I suggest trying to process the file using the ast module. Abstract Syntax Trees are a powerful tool for syntax level analysis.
I whipped up a quick script for extracting class line numbers and names from a file.
You (or anyone, really) can use it or extend it to be more extensive and to add whatever checks you want.
import sys
import ast
import json


class ClassNodeVisitor(ast.NodeVisitor):
    def __init__(self):
        super(ClassNodeVisitor, self).__init__()
        self.class_defs = []

    def visit(self, node):
        super(ClassNodeVisitor, self).visit(node)
        return self.class_defs

    def visit_ClassDef(self, node):
        self.class_defs.append(node)


def read_file(fpath):
    with open(fpath) as f:
        return f.read()


def get_classes_from_text(text):
    tree = ast.parse(text)
    class_extractor = ClassNodeVisitor()
    li = []
    for definition in class_extractor.visit(tree):
        li.append([definition.lineno, definition.name])
    return li


def main():
    fpath = sys.argv[1]  # file to inspect, e.g. input_file.py
    try:
        text = read_file(fpath)
    except Exception as e:
        print("Could not load file due to " + repr(e))
        return 1
    print(json.dumps(get_classes_from_text(text), indent=4))


if __name__ == '__main__':
    sys.exit(main())
Here's a sample run on the following file:
input_file.py:
class Foo:
    pass


class Bar:
    pass
Output:
$ py_to_json.py input_file.py
[
    [
        1,
        "Foo"
    ],
    [
        5,
        "Bar"
    ]
]
If I import example,
If you're going to import the module, the example module needs to be on the import path. Importing means executing any Python code in the example module. This is a pretty big security hole: you're loading a user-editable file in the same context as the rest of the application.
I'm assuming that since you care about preserving class-definition order, you also care about preserving the order of definitions within each class.
It is worth pointing out that this is now the default behavior in Python, since Python 3.6.
Also see PEP 520: Preserving Class Attribute Definition Order.
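A quick way to see that in action (Python 3.6+; the class and attribute names are made up):

class Config:
    first = 1
    second = 2

# class (and module) namespaces preserve definition order in 3.6+
print([name for name in vars(Config) if not name.startswith('__')])
# ['first', 'second']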
(Moving my comments to an answer)
That's a great vague idea. You should give Figura a shot! It does exactly that.
(Full disclosure: I'm the author of Figura.)
I should point out that the order of declarations is not preserved in Figura, nor in JSON.
I'm not sure about order-preservation in YAML, but I did find this on wikipedia:
... according to the specification, mapping keys do not have an order
It might be the case that specific YAML parsers maintain the order, though they aren't required to.
You can use a metaclass to record each class's creation time, and later, sort the classes by it.
This works in Python 2:
class CreationTimeMetaClass(type):
    creation_index = 0

    def __new__(cls, clsname, bases, dct):
        dct['__creation_index__'] = cls.creation_index
        cls.creation_index += 1
        return type.__new__(cls, clsname, bases, dct)

__metaclass__ = CreationTimeMetaClass

class Foo: pass
class Bar: pass

classes = [cls for cls in globals().values() if hasattr(cls, '__creation_index__')]
print(sorted(classes, key=lambda cls: cls.__creation_index__))
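In Python 3 the module-level __metaclass__ hook is gone, so a sketch of the same idea uses the metaclass keyword instead:

class CreationTimeMetaClass(type):
    creation_index = 0

    def __new__(cls, clsname, bases, dct):
        # stamp each new class with a monotonically increasing index
        dct['__creation_index__'] = cls.creation_index
        cls.creation_index += 1
        return type.__new__(cls, clsname, bases, dct)

class Foo(metaclass=CreationTimeMetaClass): pass
class Bar(metaclass=CreationTimeMetaClass): pass

classes = [c for c in globals().values() if hasattr(c, '__creation_index__')]
print(sorted(classes, key=lambda c: c.__creation_index__))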
The standard json module is easy to use and works well for reading and writing JSON config files.
Objects are not ordered within JSON structures but lists/arrays are, so put order-dependent information into a list.
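For instance, order-dependent pairs survive a round-trip when stored as a list:

import json

data = [["Foo", 1], ["Bar", 2]]      # order-dependent info as a list
print(json.loads(json.dumps(data)))  # [['Foo', 1], ['Bar', 2]]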
I have used classes as a configuration tool: the thing I did was to derive them from a base class which was customised by the particular class variables. By using the class like this I did not need a factory class. For example:
from .artifact import Application
class TempLogger(Application): partno='03459'; path='c:/apps/templog.exe'; flag=True
class GUIDisplay(Application): partno='03821'; path='c:/apps/displayer.exe'; flag=False
in the installation script
from .install import Installer
from app_configs import TempLogger, GUIDisplay

installer = Installer(apps=(TempLogger(), GUIDisplay()))
installer.baseline('1.4.3.3475')
print installer.versions()
print installer.bill_of_materials()
One should use the right tools for the job, so perhaps Python classes are not the right tool if you need ordering.
Another Python tool I have used to create JSON files is the Mako templating system. It is very powerful. We used it to populate variables like IP addresses into static JSON files that were then read by C++ programs.
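A minimal sketch of that Mako approach (the template text, variable names and values are all made up):

from mako.template import Template

# render a JSON document from a template
template = Template('{"server_ip": "${ip}", "port": ${port}}')
print(template.render(ip="10.0.0.1", port=8080))
# {"server_ip": "10.0.0.1", "port": 8080}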
I'm not sure if this answers your question, but it might be relevant. Take a look at the excellent attrs module. It's great for creating classes to use as data types.
Here's an example from glyph's blog (creator of Twisted Python):
import attr

@attr.s
class Point3D(object):
    x = attr.ib()
    y = attr.ib()
    z = attr.ib()
It saves you writing a lot of boilerplate code - you get things like str representation and comparison for free, and the module has a convenient asdict function which you can pass to the json library:
>>> p = Point3D(1, 2, 3)
>>> str(p)
'Point3D(x=1, y=2, z=3)'
>>> p == Point3D(1, 2, 3)
True
>>> import json
>>> json.dumps(attr.asdict(p))
'{"y": 2, "x": 1, "z": 3}'
The module uses a strange naming convention, but read attr.s as "attrs" and attr.ib as "attrib" and you'll be okay.
Touching on the point about creating JSON from Python: there is an excellent library called jsonpickle which lets you dump Python objects to JSON. (Alone, or combined with the other methods mentioned here, it can probably get you what you want.)
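A quick sketch of what that looks like (the class and values are made up):

import jsonpickle

class Point(object):
    def __init__(self, x, y):
        self.x, self.y = x, y

frozen = jsonpickle.encode(Point(1, 2))  # arbitrary object -> JSON string
restored = jsonpickle.decode(frozen)     # JSON string -> Point instance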

Dynamically instantiating objects

I'm attempting to instantiate an object from a string. Specifically, I'm trying to change this:
from node.mapper import Mapper

mapper = Mapper(file)
mapper.map(src, dst)
into something like this:
with open('C:.../node/mapper.py', 'r') as f:
    mapping_script = f.read()

eval(mapping_script)
mapper = Mapper(file)
mapper.map(src, dst)
The motivation for this seemingly bizarre task is to be able to store different versions of mapping scripts in a database and then retrieve/use them as needed (with emphasis on the polymorphism of the map() method).
The above does not work. For some reason, eval() throws SyntaxError: invalid syntax. I don't understand this since it's the same file that's being imported in the first case. Is there some reason why eval() cannot be used to define classes?
I should note that I am aware of the security concerns around eval(). I would love to hear of alternative approaches if there are any. The only other thing I can think of is to fetch the script, physically save it into the node package directory, and then import it, but that seems even crazier.
You need to use exec:
exec(mapping_script)
eval() works only for expressions. exec() works for statements. A typical Python script contains statements.
For example:
code = """class Mapper: pass"""
exec(code)
mapper = Mapper()
print(mapper)
Output:
<__main__.Mapper object at 0x10ae326a0>
Note that exec() is a function in Python 3 but a statement in Python 2, and that the snippet above runs at module level. When you call exec() inside a function, you need to pass globals(), for example exec(code, globals()), to make the defined objects available in the global scope and to the rest of the function.
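A short sketch of the difference (Python 3):

def load_code(code):
    # without the globals() argument, names defined by exec() land in a
    # temporary local namespace and are gone after the call
    exec(code, globals())

load_code("class Mapper: pass")
mapper = Mapper()  # Mapper is now visible at module level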

Calling a function from a dictionary, dictionary in imported settings file

So I have a dictionary with a bunch of names that I use to call functions. It works fine, but I'd prefer to put it in my settings file. If I do so, though, I get errors from the settings file saying that there are no functions by that name (even though I'm not calling them at the time). Any workarounds?
def callfunct(id, time):
    pass  # stuff here

def callotherfunct(id, time):
    pass  # stuff here

dict = {"blah blah": callfunct, "blah blah blah": callfunct, "otherblah": callotherfunct}
dict[str(nameid)](id, time)
Hope this makes sense. Also open to other ideas, but basically I have about 50 iterations of these definitions and unique names that are passed by nameid that need to call specific functions, so that's why I do it the way I do, so that I can add new names quickly. It would obviously be even quicker if I could get the dictionary into the settings file seamlessly as well.
If you try
def f_one(id, time):
    pass

def f_two(id, time):
    pass

d = {"blah blah": "f_one", "blah blah blah": "f_one", "otherblah": "f_two"}

locals()[d[str(nameid)]](id, time)
(replacing the dictionary initialization with just loading the config file with the string name of the functions you want to call), does that work?
If not, there needs to be a little more info: What does the config file look like, and how are you loading it?
I'm guessing the reason the config-file part isn't working is that you're trying to reference the functions directly from the config file, which shouldn't work. This approach stores only the function names in the config file and looks them up in the locals() dictionary (if you're inside a function, you'll have to use globals() instead).
You could initialise the dictionary with the looked-up function only when you attempt to access it:
d = {}
d.setdefault('func1', globals()['func1'])()

lazy load dictionary

I have a dictionary called fsdata at module level (like a global variable).
The content gets read from the file system. It should load its data once, on first access. Up to now it loads the data while the module is being imported; this should be optimized.
If no code accesses fsdata, the content should not be read from the file system (save CPU/IO).
Loading should also happen if you check the boolean value:
if mymodule.fsdata:
    do_something()
Update: Some code already uses mymodule.fsdata, and I don't want to change the other places. It should be a variable, not a function, and mymodule needs to stay a module, since it is already used in a lot of code.
I think you should use a Future/Promise, like this: https://gist.github.com/2935416
The main point: you create not the object itself, but a 'promise' of the object that behaves like the object.
You can replace your module with an object that has descriptor semantics:
class FooModule(object):
    @property
    def bar(self):
        print "get"

import sys
sys.modules[__name__] = FooModule()
Take a look at http://pypi.python.org/pypi/apipkg for a packaged approach.
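Putting that together for the fsdata case might look like this (a sketch; the class name and the data source are placeholders for your existing loader):

import sys

class LazyDataModule(object):
    _cache = None

    @property
    def fsdata(self):
        # the file system is only touched on first access; later
        # accesses (including truth tests) reuse the cached value
        if LazyDataModule._cache is None:
            with open('/path/to/data') as f:  # hypothetical data source
                LazyDataModule._cache = f.read()
        return LazyDataModule._cache

sys.modules[__name__] = LazyDataModule()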
You could just create a simple function that memoizes the data:
fsdata = []

def get_fsdata():
    if not fsdata:
        fsdata.append(load_fsdata_from_file())
    return fsdata[0]
(I'm using a list as that's an easy way to make a variable global without mucking around with the global keyword).
Now instead of referring to module.fsdata you can just call module.get_fsdata().

load python code at runtime

I would like to load a .py file at runtime. This .py file is basically a config file with the following format:
var1 = value
var2 = value
predicate_function = lambda line: <return True or False>
Once this file is loaded, I would like to be able to access var1, var2 and predicate_function. For each line, I'll pass it to the predicate function, and if it returns false, I'll ignore it.
In any case, I'm not sure how to load a python file at runtime and access its variables.
Clarification: there may be any number of these config files that I need to pass to the main program, and I won't know their names until runtime. Google tells me I should use __import__. I'm not sure how to correctly use that method and then access the variables of the imported file.
As described in the official Python documentation, if you just want to import a module by name, you can look it up in the sys.modules dictionary after using __import__.
Supposing your configuration is in myproject.mymodule, you would do like that :
import sys

module_name = 'myproject.mymodule'
__import__(module_name)
mymodule = sys.modules[module_name]

# Then you can just access your variables and functions
print mymodule.var1
print mymodule.var2
# etc...
You can also use the return value of the __import__ call, but then you will have to understand fully how Python works with namespaces and scopes.
You just need to be able to dynamically specify the imports and then dynamically get at the variables.
Let's say your config file is bar.py and looks like this:
x = 3
y = 4
def f(x): return (x < 4)
Then your code should look like this:
import sys

# somehow modnames should be a list of strings that are the names of config files;
# you can do this more dynamically depending on what you're doing
modnames = ['bar']

for modname in modnames:
    exec('import %s' % modname)

for modname in modnames:
    mod = sys.modules[modname]
    for k in mod.__dict__:
        if k[:2] != '__':
            print modname, k, mod.__dict__[k]
I get this output:
bar f <function f at 0x7f2354eb4cf8>
bar x 3
bar y 4
Then you at least have all the variables and functions. I didn't quite get what you wanted from the predicate functions, but maybe you can get that on your own now.
To access another Python module, you import it. execfile has been mentioned by a couple of people, but it is messy and dangerous: execfile clutters your namespace, possibly even clobbering the code you are running. When you want to access another Python source file, use the import statement.
Even better would be not to use a Python file for configuration at all, but rather the builtin module ConfigParser or a serialization format like JSON. This way your configuration files don't allow execution of arbitrary (possibly malicious) code, don't require people to know Python to configure your program, and can easily be altered programmatically.
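For example, a minimal ConfigParser sketch (the file name, section and keys are invented):

import ConfigParser  # spelled configparser in Python 3

parser = ConfigParser.ConfigParser()
parser.read('settings.ini')           # hypothetical file with a [main] section
var1 = parser.get('main', 'var1')     # always a string
var2 = parser.getint('main', 'var2')  # typed accessors do the conversion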
If the imported module is on the regular search path, you can use __import__.
If you need to load the module from an arbitrary path in the filesystem, use imp.load_module.
Be sure to consider the security implications of loading arbitrary user-specified code.
In Python 2.*, execfile works (I recommend passing a specific dictionary and accessing the variables from there -- as the note in the docs says, execfile can't affect the calling function's locals() dictionary).
In Python 3.*, execfile has been removed, so do, instead:
with open('thefile.py') as f:
    exec(f.read(), somedict)
Since the Python version hasn't been clearly mentioned, it is worth pointing out that the imp module has been deprecated in newer Python versions in favor of the importlib module.
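For instance, loading a module from an arbitrary file path with importlib (Python 3.5+; the path and names are placeholders):

import importlib.util

spec = importlib.util.spec_from_file_location('myconfig', '/path/to/myconfig.py')
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)  # executes the file's code in the new module
print(module.var1)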
I'm kinda late to the party, but I want to present an alternative answer nonetheless.
If you want to import code without affecting the global module namespace, you can create an anonymous module (using types.ModuleType) and load arbitrary code in it (using compile and exec). For instance, like this:
import types

filename = "/path/to/your/file.py"
with open(filename) as fp:
    code = compile(fp.read(), filename, "exec")

config_module = types.ModuleType("<config>")
exec(code, config_module.__dict__)
You can then access the variables as config_module.var1, &c.
If you want to have a configuration file that will only be edited by the user when the program isn't running, just import it as a normal Python file, i.e.:
main.py:
import config
print config.var1
config.py:
var1 = "var12"
var2 = 100.5
Try the imp module: http://docs.python.org/library/imp.html
