Dynamic Python Class Definition in SQLAlchemy - python

I'm creating a backend application with SQLAlchemy using the declarative base. The ORM requires about 15 tables, each of which maps to a class object in SQLAlchemy. Because these class objects are all defined identically, I thought a factory pattern could produce the classes more concisely. However, these classes not only have to be defined, they have to be assigned to unique variable names so they can be imported and used throughout the project.
(Sorry if this question is a bit long, I updated it as I better understood the problem.)
Because we have so many columns (~1000) we define their names and types in external text files to keep things readable. Having done that, one way to go about declaring our models is like this:
class Foo1(Base):
    __tablename__ = 'foo1'
class Foo2(Base):
    __tablename__ = 'foo2'
... etc
and then I can add the columns by looping over the contents of the external text file and using the setattr() on each class definition.
This is OK but it feels too repetitive as we have about 15 tables. So instead I took a stab at writing a factory function that could define the classes dynamically.
def orm_factory(class_name):
    class NewClass(Base):
        __tablename__ = class_name.lower()
    NewClass.__name__ = class_name.upper()
    return NewClass
Again I can just loop over the columns and use setattr(). When I put it together it looks like this:
for class_name in class_name_list:
    ORMClass = orm_factory(class_name)
    header_keyword_list = get_header_keyword_list(class_name)
    define_columns(ORMClass, header_keyword_list)
Here get_header_keyword_list gets the column information and define_columns performs the setattr() assignment. When I use this and run Base.metadata.create_all(), the SQL schema gets generated just fine.
But, when I then try to import these class definitions into another model I get an error like this:
SAWarning: The classname 'NewClass' is already in the registry of this declarative base, mapped to <class 'ql_database_interface.IR_FLT_0'>
This, I now realize makes total sense based on what I learned yesterday: Python class variable name vs __name__.
You can address this by using type as a class generator in your factory function (as two of the answers below do). However, this does not solve the issue of being able to import the classes: while the classes are dynamically constructed in the factory function, the variable that the function's output is assigned to is static. Even if it were dynamic, such as a dictionary key, it has to be in the module namespace in order to be imported from another module. See my answer for more details.
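A minimal sketch of the type-based factory mentioned above. It uses a plain stand-in Base class rather than SQLAlchemy's declarative base so that it runs on its own; Base, the table names, and the classes dictionary are illustrative assumptions, and a real declarative model would also need at least a primary-key column in the namespace.

```python
class Base:
    """Stand-in for the declarative base (an assumption, not SQLAlchemy)."""


def orm_factory(class_name):
    # type(name, bases, namespace) creates the class with the right __name__
    # from the start, so nothing is ever created under the name 'NewClass'.
    return type(class_name, (Base,), {"__tablename__": class_name.lower()})


# Hypothetical table names, for illustration only.
classes = {name: orm_factory(name) for name in ["IR_FLT_0", "IR_FLT_1"]}

for name, cls in classes.items():
    print(name, cls.__name__, cls.__tablename__)
```

This fixes the duplicate class name, but the classes still live in a dictionary rather than as importable module-level names.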

This sounds like a sketchy idea. But it's fun to solve so here is how you make it work.
As I understand it, your problem is you want to add dynamically created classes to a module. I created a hack using a module and the __init__.py file.
dynamicModule/__init__.py:
import dynamic
class_names = ["One", "Two", "Three"]
for new_name in class_names:
    dynamic.__dict__['Class%s' % new_name] = type("Class%s" % (new_name), (object,), {'attribute_one': 'blah'})
dynamicModule/dynamic.py:
"""Empty file"""
test.py:
import dynamicModule
from dynamicModule import dynamic
from dynamicModule.dynamic import ClassOne
dynamic.ClassOne
"""This all seems evil but it works for me on python 2.6.5"""
__init__.py:
"""Empty file"""

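The same idea can be sketched in a single file on Python 3 by attaching the generated classes to a module object with setattr(). The module name dynamic_models is hypothetical, and the module is built in memory with types.ModuleType only so that the example is self-contained; in a project it would be a real file.

```python
import sys
import types

# Hypothetical module, created in memory and registered so it can be imported.
module = types.ModuleType("dynamic_models")
sys.modules[module.__name__] = module

for suffix in ["One", "Two", "Three"]:
    name = "Class%s" % suffix
    cls = type(name, (object,), {"attribute_one": "blah"})
    cls.__module__ = module.__name__  # make the class report its new home
    setattr(module, name, cls)

# The generated names can now be imported like any other module attribute.
from dynamic_models import ClassOne

print(ClassOne.attribute_one)
```

As with the original hack, static tools and readers cannot see these names, which is the trade-off discussed in the other answer.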
[Note, this is the original poster]
So after some thinking and talking to people I've decided that the ability to dynamically create class objects and assign them to variables in the global namespace in this way just isn't something Python supports (and likely with good reason). Even though I think my use case isn't too crazy (pumping out a predefined list of identically constructed classes) it's just not supported.
There are lots of questions that point towards using a dictionary in a case like this, such as this one: https://stackoverflow.com/a/10963883/1216837. I thought of something like that but the issue is that I need those classes in the module name space so I can import them into other modules. However, adding them with globals() like globals()['MyClass'] = class_dict['MyClass'] seems like it's getting pretty out there and my impression is people on SO frown on using globals() like this.
There are hacks such as the one suggested by patjenk, but at a certain point the obfuscation and complexity outweigh the benefits, and declaring each class object statically is clearer. So while it seems repetitive I'm just going to write out all the class definitions. Really, this ends up being pretty concise/maintainable:
Class1 = class_factory('class1')
Class2 = class_factory('class2')
...

Related

Is there a way to have Python code completion work when inheriting from a variable array of classes?

I am working on a templatized and extensible python library and I want to do something like this:
registered_features = [
feature1.feature.Support,
feature2.feature.Support,
feature3.feature.Support,
]
where each of those references a class in a directory structure like this:
feature1
--feature.py
and inside of feature.py is
class Support:
    def someExtendedFunctionality(self):
        ....
Then I inherit everything like this:
class App(*registered_features):
    def __init__(self):
        ....
Everything works at runtime, but the issue I hit is for code completion, I get no suggestions for the "registered_features"
Is there any way I can make this work? I'm guessing code completion is a static analysis and expanding that list is a runtime operation. Is there a different way to do something like this?
I'm assuming the classes in registered_features don't change and you are using it as a static object to hold a list of classes. If not and it changes during running time, the static analysis wouldn't be able to help with code completion.
So, assuming it's a static class, you can replace it with this instead:
class RegisteredFeatures(SupportA, SupportB, SupportC):
    pass
class App(RegisteredFeatures):
    pass
Instead of using a list object to hold all the classes, you use a class that is doing just about the same thing. I tested this and the code completion works with Pycharm 2021.2.3
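A runnable version of that idea, as a sketch only. The SupportA and SupportB classes and their methods are made-up stand-ins for the feature classes in the question; the point is that the bases are written out literally, which is what a static analyser can follow.

```python
class SupportA:
    def feature_a(self):
        return "a"


class SupportB:
    def feature_b(self):
        return "b"


class RegisteredFeatures(SupportA, SupportB):
    """Bases are listed explicitly, so they are visible without running code."""


class App(RegisteredFeatures):
    pass


app = App()
print(app.feature_a(), app.feature_b())
print([cls.__name__ for cls in App.__mro__])
```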

Is it bad practice to modify attributes of one module from another module?

I want to define a bunch of config variables that can be imported in all the modules in my project. The values of those variables will be constant during runtime but are not known before runtime; they depend on the input. Usually I'd define a dict in my top module which would be passed to all functions and classes from other modules; however, I was thinking it may be cleaner to simply create a blank config.py module which would be dynamically filled with config variables by the top module:
# top.py
import config
config.x = x
# config.py
x = None
# other.py
import config
print(config.x)
I like this approach because I don't have to save the parameters as attributes of classes in my other modules; which makes sense to me because parameters do not describe classes themselves.
This works but is it considered bad practice?
The question as such may be disputed. But I would generally say yes, it's "bad practice" because the scope and impact of a change really get blurred. Note the use case you're describing really is not about sharing configuration, but about different parts of the program (functions, objects, modules) exchanging data, and as such it's a bit of a variation on a (meta)global variable.
Reading common configuration values could be fine, but changing them along the way... you may lose track of what happened where and also in which order as modules get imported / values get modified. For instance assume the config.py and two modules m1.py:
import config
print(config.x)
config.x=1
and m2.py:
import config
print(config.x)
config.x=2
and a main.py that just does:
import m1
import m2
import config
print(config.x)
or:
import m2
import m1
import config
print(config.x)
The state in which you find config in each module and really any other (incl. main.py here) depends on order in which imports have occurred and who assigned what value when. Even for a program entirely under your control, this may get confusing (and source of mistakes) rather quickly.
For runtime data and passing information between objects and modules (and your example is really that and not configuration that is predefined and shared between modules) I would suggest you look into describing the information perhaps in a custom state (config) object and pass it around through appropriate interface. But really just a function / method argument may be all that is needed. The exact form depends on what exactly you're trying to achieve and what your overall design is.
In your example, other.py behaves differently when called or imported before top.py which may still seem obvious and manageable in a minimal example, but really is not a very sound design. Anyone reading the code (incl. future you) should be able to follow its logic and this IMO breaks its flow.
The most trivial (and procedural) example of what you've described, which I hopefully now have a better grasp of, would be other.py recreating your current behavior:
def do_stuff(value):
    print(value)  # We did something useful here
if __name__ == "__main__":
    do_stuff(None)  # Could also use config with defaults
And your top.py presumably being the entry point and orchestrating importing and execution doing:
import other
x = get_the_value()
other.do_stuff(x)
You can of course introduce an interface to configure do_stuff perhaps a dict or a custom class even with default implementation in config.py:
class Params:
    def __init__(self, x=None):
        self.x = x
and your other.py:
import config

def do_stuff(params=config.Params()):
    print(params.x)  # We did something useful here
And on your top.py you can use:
params = config.Params(get_the_value())
other.do_stuff(params)
But you could also have any use case specific source of value(s):
class TopParams:
    def __init__(self, url):
        self.x = get_value_from_url(url)
params = TopParams("https://example.com/value-source")
other.do_stuff(params)
x could even be a property which you retrieve every time you access it... or lazily when needed and then cached... Again, it really then is a matter of what you need to do.
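The "lazily when needed and then cached" variant could look roughly like this, using functools.cached_property (available since Python 3.8). LazyParams and the get_the_value stand-in are hypothetical; the call counter exists only to show that the value is fetched once.

```python
from functools import cached_property


def get_the_value():
    """Hypothetical stand-in for reading the real runtime input."""
    get_the_value.calls += 1
    return 42


get_the_value.calls = 0


class LazyParams:
    @cached_property
    def x(self):
        # Runs on first access only; the result is then stored on the instance.
        return get_the_value()


params = LazyParams()
print(get_the_value.calls)   # nothing has been fetched yet
print(params.x, params.x)    # the first access fetches, the second reuses
print(get_the_value.calls)
```

A plain @property would instead call get_the_value() on every access, which suits values that can change while the program runs.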
"Is it bad practice to modify attributes of one module from another module?"
Yes, it is considered bad practice: a violation of the Law of Demeter, which in effect means "talk to friends, not to strangers".
Objects should expose behaviour and functions, but should HIDE the data.
DataStructures should EXPOSE data, but should not have any methods (which are exposed). The law of demeter does not apply to such DataStructures. OOP Purists might cover such DataStructures with setters and getters, but it really adds no value in Python.
there is a lot of literature about that like : https://en.wikipedia.org/wiki/Law_of_Demeter
and of course, a must to read: "Clean Code", by Robert C. Martin (Uncle Bob), check it out on Youtube also.
For procedural programming it is perfectly normal to keep data in a DataStructure which does not have any (exposed) methods.
The procedures in the program work with that data. Consider to use the module attrs, see : https://www.attrs.org/en/stable/ for easy creation of such classes.
My preferred method for keeping config is (here without using attrs):
# conf_xy.py
"""
config is code - so why use damned parsers, textfiles, xml, yaml, toml and all that
if You just can use testable code as config that can deliver the correct types, etc.
as well as hinting in Your favorite IDE ?
Here, for demonstration without using attrs package - usually I use attrs (read the docs)
"""
class ConfXY(object):
    def __init__(self) -> None:
        self.x: int = 1
        self.z: float = get_z_from_input()
...
conf_xy = ConfXY()
# other.py
from conf_xy import conf_xy
...
y = conf_xy.x * 2
...
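For readers without attrs installed, a similar read-only config object can be sketched with the standard library's dataclasses module, which is swapped in here for attrs. The load_conf helper and the "2.5" input are hypothetical; the idea is to build the object once at the entry point and then only read from it.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class ConfXY:
    x: int = 1
    z: float = 0.0


def load_conf(z_from_input):
    """Hypothetical helper: build the config once from runtime input."""
    return ConfXY(z=float(z_from_input))


conf_xy = load_conf("2.5")
y = conf_xy.x * 2
print(y, conf_xy.z)

try:
    conf_xy.x = 99  # frozen: other modules cannot reassign fields
except FrozenInstanceError:
    print("config is read-only")
```

Freezing the instance removes the "who changed what, and when" problem described in the other answer, at the cost of having to know all values when the object is created.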

How to log the name of the test class, if the test method resides in a class common for all tests?

I have the following project structure:
/root
    /tests
        common_test_case.py
        test_case_1.py
        test_case_2.py
        ...
    project_file.py
    ...
Every test test_case_... is inherited from both unittest.TestCase and common_test_case.CommonTestCase. Class CommonTestCase contains test methods that should be executed by all the tests (though using data unique to each test, stored and accessed in self.something of the test). If some specific tests are needed for an exact test case, they are added directly to that particular class.
Currently I am working on adding logging to my tests. Among other things I would like to log the class the method was run from (since the approach above implies the same test method name for different classes). I would like to stick with the built-in logging module to achieve this.
I have tried the following LogRecord attributes: %(filename)s, %(module)s, %(pathname)s. However, for methods defined in common_test_case.py they all return the path/name of common_test_case.py and not the test module they were actually run from.
My questions are:
Is there a way to achieve what I am trying to, using only built-in logging module?
Using some third-party/other module (I was thinking maybe some "hacky" solution with inspect)?
Is it possible to achieve (in Python) at all?
Your question appears similar to this one, and solved by:
self.id()
See the function definition here, which calls self.__class__ for the instance of the TestCase class that is instantiated. Given that you are using multiple inheritance the multiple inheritance rules from Python apply:
For most purposes, in the simplest cases, you can think of the search for attributes inherited from a parent class as depth-first, left-to-right, not searching twice in the same class where there is an overlap in the hierarchy.
Which means that common_test_case.CommonTestCase will be searched then unittest.TestCase. If there is no id function in common_test_case.CommonTestCase things should work as if it is only derived from unittest.TestCase. If you feel the need to add an id function to the CommonTestCase, something like this (if really necessary):
def id(self):
    if isinstance(self, unittest.TestCase):
        return super(CommonTestCase, self).id()
The solution I've found (that does the trick, so far):
import inspect
class_called_from = inspect.stack()[1][0].f_locals['self'].__class__.__name__
I'm still wondering, though, if there is a "clearer" method, or if this is possible to achieve using logging module.
Recipes, based on West's answer (tested on Python 3.6.1):
test_name = self.id().split('.')[-1]
class_called_from = self.id().split('.')[-2]
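Putting this together with the built-in logging module, a sketch might look like the following. The class names mirror the question but are otherwise made up, and the StringIO handler is only there so the log lines can be inspected; type(self) gives the concrete test class even though the method is defined in the shared mixin.

```python
import io
import logging
import unittest

logger = logging.getLogger("tests")


class CommonTestCase:
    """Mixin holding test methods shared by every test class."""

    def test_shared(self):
        # self.id() is "<module>.<class>.<method>" for the class that was
        # actually instantiated, not the class where the method is defined.
        method_name = self.id().split(".")[-1]
        logger.info("running %s.%s", type(self).__name__, method_name)
        self.assertTrue(self.value)


class TestCase1(CommonTestCase, unittest.TestCase):
    value = 1


class TestCase2(CommonTestCase, unittest.TestCase):
    value = 2


log_stream = io.StringIO()
logger.addHandler(logging.StreamHandler(log_stream))
logger.setLevel(logging.INFO)

loader = unittest.TestLoader()
suite = unittest.TestSuite(
    [loader.loadTestsFromTestCase(case) for case in (TestCase1, TestCase2)]
)
result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
print(log_stream.getvalue())
```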

Dynamically create Ctypes in Python

I have a file that I read from which has definitions of ctypes that are used in a separate project. I can read the file and obtain all the necessary information to create a ctype that I want in Python like the name, fields, bitfields, ctype base class (Structure, Union, Enum, etc), and pack.
I want to be able to create a ctype class from the information above. I also want these ctypes to be pickleable.
I currently have two solutions, both of which I feel like are hacks.
Solution 1
Generate a Python code object in an appropriate ctype format by hand or with the use of something like Jinja2 and then evaluate the python code object.
This solution has the downside of using eval. I always try to stay away from eval and I don't feel like this is a good place to use it.
Solution 2
Create a ctype dynamically in a function like so:
import ctypes
def create_ctype_class(name, base, fields, pack):
    class CtypesStruct(base):
        _fields_ = fields
        _pack_ = pack
    CtypesStruct.__name__ = name
    return CtypesStruct
ctype = create_ctype_class('ctype_struct_name', ctypes.Structure,
                           [('field1', ctypes.c_uint8)], 4)
This solution isn't so bad, but setting the name of the class is ugly and the type cannot be pickled.
Is there a better way of creating a dynamic ctype class?
Note: I am using Python 2.7
Solution 2 is probably your better option, though if you're also writing such classes statically, you may want to use a metaclass to deduplicate some of that code. If you need your objects to be pickleable, then you'll need a way to reconstruct them from pickleable objects. Once you've implemented such a mechanism, you can make the pickle module aware of it with a __reduce__() method.
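One way to avoid renaming the class after the fact is to build it with the three-argument form of type(), which takes the name directly. This is only a sketch: it assumes base is ctypes.Structure or ctypes.Union, and it does not by itself make the result pickleable, since pickle still has to find the class by name in an importable module.

```python
import ctypes


def create_ctype_class(name, base, fields, pack=None):
    namespace = {}
    if pack is not None:
        namespace["_pack_"] = pack
    namespace["_fields_"] = fields
    # type() picks up the ctypes metaclass from `base` and names the class
    # correctly from the start.
    return type(name, (base,), namespace)


ctype = create_ctype_class("ctype_struct_name", ctypes.Structure,
                           [("field1", ctypes.c_uint8)], 4)

instance = ctype(field1=7)
print(ctype.__name__, ctypes.sizeof(ctype), instance.field1)
```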
I would go with a variant of Solution 1. Instead of evaling code, create a directory with an __init__.py (i.e. a package), add it to your sys.path and write out an entire python module containing all of the classes. Then you can import them from a stable namespace which will make pickle happier.
You can either take the output and add it to your app's source code or dynamically recreate it and cache it on a target machine at runtime.
pywin32 uses an approach like this for caching classes generated from ActiveX interfaces.

How to find out what methods, properties, etc a python module possesses

Let's say I import a module. In order for me to make the best use of it, I would like to know what properties, methods, etc. I can use. Is there a way to find that out?
As an example: Determining running programs in Python
In this line:
os.system('WMIC /OUTPUT:C:\ProcessList.txt PROCESS get Caption,Commandline,Processid')
Let's say I wanted to also print out the memory consumed by the processes. How do I find out if that's possible? And what would be the correct 'label' for it? (just as the author uses 'Commandline', 'ProcessId')
Similarly, in this:
import win32com.client
def find_process(name):
    objWMIService = win32com.client.Dispatch("WbemScripting.SWbemLocator")
    objSWbemServices = objWMIService.ConnectServer(".", "root\cimv2")
    colItems = objSWbemServices.ExecQuery(
        "Select * from Win32_Process where Caption = '{0}'".format(name))
    return len(colItems)
print find_process("SciTE.exe")
How would I make the function also print out the memory consumed, the executable path, etc.?
As for Python modules, you can do
>>> import module
>>> help(module)
and you'll get a list of supported methods (more exactly, you get the docstring, which might not contain every single method). If you want that, you can use
>>> dir(module)
although now you'd just get a long list of all properties, methods, classes etc. in that module.
In your first example, you're calling an external program, though. Of course Python has no idea which features wmic.exe has. How should it?
dir(module) returns the names of the attributes of the module
module.__dict__ is the mapping between the keys and the attributes objects themselves
module.__dict__.keys() and dir(module) contain the same names, though they do not compare equal because the elements are not in the same order
it seems that help(module) is what you really need
Python has a built-in function called dir(). I'm not sure if this is what you are referring to, but fire up an interactive Python console and type:
import datetime
dir(datetime)
This should give you a list of methods, properties and submodules
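When the raw dir() output is too long to scan, the standard inspect module can filter it. A small sketch, using datetime as in the example above; which names it prints depends on the Python version.

```python
import datetime
import inspect

# Classes exposed by the module, taken from (name, object) pairs.
class_names = [name for name, _ in inspect.getmembers(datetime, inspect.isclass)]
print(class_names)

# Public callables on one of those classes, skipping _private names.
public_methods = [
    name
    for name, member in inspect.getmembers(datetime.date, callable)
    if not name.startswith("_")
]
print(public_methods)

# The first line of a docstring is often enough to tell what something does.
print(inspect.getdoc(datetime.date.isoformat).splitlines()[0])
```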
@ldmvcd
OK, excuse me, I think you are a beginner and you don't see which fundamental notions I am referring to.
Objects are Python’s abstraction for
data. All data in a Python program is
represented by objects or by relations
between objects.
http://docs.python.org/reference/datamodel.html#the-standard-type-hierarchy
I don't understand why it is called "abstraction": for me an object is something real in the machine, a series of bits organized according to certain rules to represent conceptual data or functioning.
Names refer to objects. Names are
introduced by name binding operations.
Each occurrence of a name in the
program text refers to the binding of
that name established in the innermost
function block containing the use.
http://docs.python.org/reference/executionmodel.html#naming-and-binding
.
A namespace is a mapping from names to
objects. Most namespaces are currently
implemented as Python dictionaries,
but that’s normally not noticeable in
any way (except for performance), and
it may change in the future. Examples
of namespaces are: the set of built-in
names (containing functions such as
abs(), and built-in exception names);
the global names in a module; and the
local names in a function invocation.
In a sense the set of attributes of an
object also form a namespace.
http://docs.python.org/tutorial/classes.html#a-word-about-names-and-objects
.
By the way, I use the word attribute
for any name following a dot — for
example, in the expression z.real,
real is an attribute of the object z.
Strictly speaking, references to names
in modules are attribute references:
in the expression modname.funcname,
modname is a module object and
funcname is an attribute of it. In
this case there happens to be a
straightforward mapping between the
module’s attributes and the global
names defined in the module: they
share the same namespace!
http://docs.python.org/tutorial/classes.html#a-word-about-names-and-objects
.
Namespaces are created at different
moments and have different lifetimes.
http://docs.python.org/tutorial/classes.html#a-word-about-names-and-objects
.
The namespace for a module is
automatically created the first time a
module is imported. The main module
for a script is always called
__main__.
http://docs.python.org/reference/executionmodel.html#naming-and-binding
.
Well, a Python program is a big machine that plays with objects, references to these objects, names of these objects, and namespaces in which the names are bound to the objects, namespaces being implemented as dictionaries.
So, you're right: when I refer to keys, I refer to names being the keys in the various namespaces. Names are arbitrary or not, depending on whether the objects they were created to name are user objects or built-in objects.
I advise you to read thoroughly the parts
3.1. Objects , values and types
http://docs.python.org/reference/datamodel.html#the-standard-type-hierarchy
and
4.1. Naming and binding
http://docs.python.org/reference/executionmodel.html#naming-and-binding
