Pythonic approach to internal API for strings - python

Question
Is there a "pythonic" (i.e. canonical, official, PEP8-approved, etc) way to re-use string literals in python internal (and external) APIs?
Background
For example, I'm working with some (inconsistent) JSON-handling code (thousands of lines) where there are various JSON "structs" we assemble, parse, etc. One of the recurring problems that comes up during code reviews is different JSON structs that use the same internal parameter names, causing confusion and eventually causing bugs to arise, e.g.:
pathPacket['src'] = "/tmp"
pathPacket['dst'] = "/home/user/out"
urlPacket['src'] = "localhost"
urlPacket['dst'] = "contoso"
These two (example) packets have dozens of identically named fields, but they represent very different types of data. There was no code-reuse justification for this implementation. People typically use code-completion engines to get the members of the JSON struct, and mis-typed string literals then cause functional issues that are hard to debug because they do not trigger an error earlier on. When we have to change these APIs, it takes a lot of time to hunt down the string literals to find out which JSON structs use which fields.
Question - Redux
Is there a better approach to this that is common amongst members of the python community? If I was doing this in C++, the earlier example would be something like:
const char *JSON_PATH_SRC = "src";
const char *JSON_PATH_DST = "dst";
const char *JSON_URL_SRC = "src";
const char *JSON_URL_DST = "dst";
// Define/allocate JSON structs
pathPacket[JSON_PATH_SRC] = "/tmp";
pathPacket[JSON_PATH_DST] = "/home/user/out";
urlPacket[JSON_URL_SRC] = "localhost";
urlPacket[JSON_URL_DST] = "contoso";
My initial approach would be to:
Use abc to make an abstract base class that can't be initialized as an object, and populate it with read-only constants.
Use that class as a common module throughout my project.
By using these constants, I can reduce the chance of a monkey-patching error as the symbols won't exist if mis-spelled, whereas a string literal typo can slip through code reviews.
My Proposed Solution (open to advice/criticism)
from abc import ABCMeta

class Custom_Structure:
    __metaclass__ = ABCMeta

    @property
    def JSON_PATH_SRC(self):
        return self._JSON_PATH_SRC

    @property
    def JSON_PATH_DST(self):
        return self._JSON_PATH_DST

    @property
    def JSON_URL_SRC(self):
        return self._JSON_URL_SRC

    @property
    def JSON_URL_DST(self):
        return self._JSON_URL_DST

The way this is normally done is:
JSON_PATH_SRC = "src"
JSON_PATH_DST = "dst"
JSON_URL_SRC = "src"
JSON_URL_DST = "dst"
pathPacket[JSON_PATH_SRC] = "/tmp"
pathPacket[JSON_PATH_DST] = "/home/user/out"
urlPacket[JSON_URL_SRC] = "localhost"
urlPacket[JSON_URL_DST] = "contoso"
Upper-case to denote "constants" is the way it goes. You'll see this in the standard library, and it's even recommended in PEP8:
Constants are usually defined on a module level and written in all
capital letters with underscores separating words. Examples include
MAX_OVERFLOW and TOTAL.
Python doesn't have true constants, and it seems to have survived without them. If it makes you feel more comfortable wrapping this in a class that uses ABCMeta with properties, go ahead. However, I'm pretty sure abc.ABCMeta does not prevent object initialization. Indeed, if it did, your use of property would not work! property objects belong to the class, but are meant to be accessed from an instance. To me, it just looks like a lot of rigamarole for very little gain.
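A minimal sketch of the point about ABCMeta, written in Python 3 syntax (the question's `__metaclass__` attribute is the Python 2 spelling). The class names here are hypothetical. As far as I know, instantiation is only refused when the class still has at least one unimplemented abstract method:

```python
from abc import ABCMeta, abstractmethod


class NoAbstractMethods(metaclass=ABCMeta):
    """Uses ABCMeta but declares no abstract methods."""


class WithAbstractMethod(metaclass=ABCMeta):
    """Declares one abstract method, so it cannot be instantiated."""

    @abstractmethod
    def required(self):
        ...


# ABCMeta alone does not block instantiation.
instance = NoAbstractMethods()

# Instantiation fails only because of the unimplemented abstract method.
try:
    WithAbstractMethod()
    blocked = False
except TypeError:
    blocked = True
```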

The easiest way in my opinion to make constants is just to set them as variables in your module (and not modify them).
JSON_PATH_SRC = "src"
JSON_PATH_DST = "dst"
JSON_URL_SRC = "src"
JSON_URL_DST = "dst"
Then if you need to reference them from another module they're already namespaced for you.
>>> that_module.JSON_PATH_SRC
'src'
>>> that_module.JSON_PATH_DST
'dst'
>>> that_module.JSON_URL_SRC
'src'
>>> that_module.JSON_URL_DST
'dst'

The simplest way to create a bunch of constants is to place them into a module, and import them as necessary. For example, you could have a constants.py module with
JSON_PATH_SRC = "src"
JSON_PATH_DST = "dst"
JSON_URL_SRC = "src"
JSON_URL_DST = "dst"
Your code would then do something like
from constants import JSON_URL_SRC
...
urlPacket[JSON_URL_SRC] = "localhost"
If you would like a better defined grouping of the constants, you can either stick them into separate modules in a dedicated package, allowing you to access them like constants.json.url.DST for example, or you could use Enums. The Enum class allows you to group related sets of constants into a single namespace. You could write a module constants.py like this:
from enum import Enum

class JSONPath(Enum):
    SRC = 'src'
    DST = 'dst'

class JSONUrl(Enum):
    SRC = 'src'
    DST = 'dst'
OR
from enum import Enum

class JSON(Enum):
    PATH_SRC = 'src'
    PATH_DST = 'dst'
    URL_SRC = 'src'
    URL_DST = 'dst'
How exactly you separate your constants is up to you. You can have a single giant enum, one per category or something in between. You would access them in your code like this:
from constants import JSONUrl
...
urlPacket[JSONUrl.SRC.value] = "localhost"
OR
from constants import JSON
...
urlPacket[JSON.URL_SRC.value] = "localhost"
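To tie this back to the original concern about mis-typed keys, here is a small sketch of how an enum member typo fails loudly while a string-literal typo does not. The `url_packet` dictionary and the misspelled names are made up for illustration:

```python
from enum import Enum


class JSONUrl(Enum):
    SRC = "src"
    DST = "dst"


url_packet = {}
url_packet[JSONUrl.SRC.value] = "localhost"
url_packet[JSONUrl.DST.value] = "contoso"

# A typo in a string literal silently creates a new, unintended key.
url_packet["scr"] = "oops"

# The same typo on an enum member raises AttributeError immediately.
try:
    url_packet[JSONUrl.SCR.value] = "oops"
    typo_caught = False
except AttributeError:
    typo_caught = True
```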

Related

Use Python for Creating JSON

I want to use Python for creating JSON.
Since I found no library which can help me, I want to know if it's possible to inspect the order of the classes in a Python file?
Example
# example.py
class Foo:
    pass

class Bar:
    pass
If I import example, I want to know the order of the classes. In this case it is [Foo, Bar] and not [Bar, Foo].
Is this possible? If "yes", how?
Background
I am not happy with yaml/json. I have the vague idea to create config via Python classes (only classes, not instantiation to objects).
Answers which help me to get to my goal (Create JSON with a tool which is easy and fun to use) are welcome.
The inspect module can tell the line numbers of the class declarations:
import inspect
def get_classes(module):
    for name, value in inspect.getmembers(module):
        if inspect.isclass(value):
            _, line = inspect.getsourcelines(value)
            yield line, name
So the following code:
import example
for line, name in sorted(get_classes(example)):
    print(line, name)
Prints:
2 Foo
5 Bar
First up, as I see it, there are 2 things you can do...
Continue using Python source files as configuration files. (I wouldn't recommend this. It's analogous to using a bulldozer to strike a nail or converting a shotgun to a wheel.)
Switch to something like TOML, JSON or YAML for configuration files, which are designed for the job.
Nothing in JSON or YAML prevents them from holding "ordered" key-value pairs. Python's dict data type is unordered by default (at least till 3.5) and list data type is ordered. These map directly to object and array in JSON respectively, when using the default loaders. Just use something like Python's OrderedDict when deserializing them and voila, you preserve order!
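As a rough illustration of the OrderedDict suggestion, the standard json module accepts an `object_pairs_hook` argument that receives the key-value pairs in document order. The sample text is made up; on Python 3.7+ a plain dict would keep this order as well:

```python
import json
from collections import OrderedDict

# Keys are deliberately not in alphabetical order.
text = '{"zeta": 1, "alpha": 2, "mid": 3}'

# object_pairs_hook is called with the pairs in the order they appear.
config = json.loads(text, object_pairs_hook=OrderedDict)

keys = list(config.keys())
```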
With that out of the way, if you really want to use Python source files for the configuration, I suggest trying to process the file using the ast module. Abstract Syntax Trees are a powerful tool for syntax level analysis.
I whipped up a quick script for extracting class line numbers and names from a file.
You (or anyone really) can use it or extend it to be more extensive and have more checks if you want for whatever you want.
import sys
import ast
import json


class ClassNodeVisitor(ast.NodeVisitor):

    def __init__(self):
        super(ClassNodeVisitor, self).__init__()
        self.class_defs = []

    def visit(self, node):
        super(ClassNodeVisitor, self).visit(node)
        return self.class_defs

    def visit_ClassDef(self, node):
        self.class_defs.append(node)


def read_file(fpath):
    with open(fpath) as f:
        return f.read()


def get_classes_from_text(text):
    try:
        tree = ast.parse(text)
    except Exception as e:
        raise e

    class_extractor = ClassNodeVisitor()

    li = []
    for definition in class_extractor.visit(tree):
        li.append([definition.lineno, definition.name])

    return li


def main():
    fpath = "/tmp/input_file.py"

    try:
        text = read_file(fpath)
    except Exception as e:
        print("Could not load file due to " + repr(e))
        return 1

    print(json.dumps(get_classes_from_text(text), indent=4))


if __name__ == '__main__':
    sys.exit(main())
Here's a sample run on the following file:
input_file.py:
class Foo:
    pass


class Bar:
    pass
Output:
$ py_to_json.py input_file.py
[
    [
        1,
        "Foo"
    ],
    [
        5,
        "Bar"
    ]
]
Regarding "If I import example": if you're going to import the module, the example module has to be on the import path. Importing means executing any Python code in the example module. This is a pretty big security hole - you're loading a user-editable file in the same context as the rest of the application.
I'm assuming that since you care about preserving class-definition order, you also care about preserving the order of definitions within each class.
It is worth pointing out that this is now the default behavior in Python, as of Python 3.6.
Also see PEP 520: Preserving Class Attribute Definition Order.
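A small sketch of what that ordering looks like in practice on Python 3.6+. The `Config` class and its attributes are made up for illustration; the dunder filter just hides entries that Python adds to every class namespace:

```python
class Config:
    host = "localhost"
    port = 8080
    debug = False


# The class namespace keeps definition order, so iterating over
# vars(Config) yields the attributes in the order they were written.
names = [name for name in vars(Config) if not name.startswith("__")]
```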
(Moving my comments to an answer)
That's a great vague idea. You should give Figura a shot! It does exactly that.
(Full disclosure: I'm the author of Figura.)
I should point out the order of declarations is not preserved in Figura, and also not in json.
I'm not sure about order-preservation in YAML, but I did find this on wikipedia:
... according to the specification, mapping keys do not have an order
It might be the case that specific YAML parsers maintain the order, though they aren't required to.
You can use a metaclass to record each class's creation time, and later, sort the classes by it.
This works in python2:
class CreationTimeMetaClass(type):
    creation_index = 0

    def __new__(cls, clsname, bases, dct):
        dct['__creation_index__'] = cls.creation_index
        cls.creation_index += 1
        return type.__new__(cls, clsname, bases, dct)

__metaclass__ = CreationTimeMetaClass

class Foo: pass
class Bar: pass

classes = [ cls for cls in globals().values() if hasattr(cls, '__creation_index__') ]
print(sorted(classes, key = lambda cls: cls.__creation_index__))
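In Python 3 a module-level `__metaclass__` has no effect, so the same idea needs the `metaclass=` keyword. The following is only a sketch of one way to adapt it; the `CreationOrderMeta` and `Ordered` names are hypothetical, and each config class has to inherit from the shared base:

```python
class CreationOrderMeta(type):
    """Stamps every class it creates with an increasing index."""

    _next_index = 0

    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        cls.__creation_index__ = CreationOrderMeta._next_index
        CreationOrderMeta._next_index += 1
        return cls


class Ordered(metaclass=CreationOrderMeta):
    """Common base so subclasses pick up the metaclass."""


class Foo(Ordered):
    pass


class Bar(Ordered):
    pass


# Sorting by the stamped index recovers definition order.
classes = sorted([Bar, Foo], key=lambda cls: cls.__creation_index__)
names = [cls.__name__ for cls in classes]
```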
The standard json module is easy to use and works well for reading and writing JSON config files.
Objects are not ordered within JSON structures but lists/arrays are, so put order dependent information into a list.
I have used classes as a configuration tool, the thing I did was to derive them from a base class which was customised by the particular class variables. By using the class like this I did not need a factory class. For example:
from .artifact import Application
class TempLogger(Application): partno='03459'; path='c:/apps/templog.exe'; flag=True
class GUIDisplay(Application): partno='03821'; path='c:/apps/displayer.exe'; flag=False
and in the installation script:
from .install import Installer
from app_configs import TempLogger, GUIDisplay
installer = Installer(apps=(TempLogger(), GUIDisplay()))
installer.baseline('1.4.3.3475')
print(installer.versions())
print(installer.bill_of_materials())
One should use the right tools for the job, so perhaps python classes are not the right tool if you need ordering.
Another python tool I have used to create JSON files is Mako templating system. This is very powerful. We used it to populate variables like IP addresses etc into static JSON files that were then read by C++ programs.
I'm not sure if this answers your question, but it might be relevant. Take a look at the excellent attrs module. It's great for creating classes to use as data types.
Here's an example from glyph's blog (creator of Twisted Python):
import attr

@attr.s
class Point3D(object):
    x = attr.ib()
    y = attr.ib()
    z = attr.ib()
It saves you writing a lot of boilerplate code - you get things like str representation and comparison for free, and the module has a convenient asdict function which you can pass to the json library:
>>> p = Point3D(1, 2, 3)
>>> str(p)
'Point3D(x=1, y=2, z=3)'
>>> p == Point3D(1, 2, 3)
True
>>> json.dumps(attr.asdict(p))
'{"y": 2, "x": 1, "z": 3}'
The module uses a strange naming convention, but read attr.s as "attrs" and attr.ib as "attrib" and you'll be okay.
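For readers who cannot install attrs, a comparable sketch is possible with the standard library's dataclasses module (Python 3.7+). This is a different tool from attrs, shown here only because it offers a similar `asdict` helper:

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Point3D:
    x: int
    y: int
    z: int


p = Point3D(1, 2, 3)

# asdict() returns a plain dict in field order, which json can serialize.
encoded = json.dumps(asdict(p))  # '{"x": 1, "y": 2, "z": 3}'
```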
Just touching on the point about creating JSON from Python: there is an excellent library called jsonpickle which lets you dump Python objects to JSON (and using this alone or with other methods mentioned here you can probably get what you wanted).

How does one get an Enum's members into the global namespace?

Python now has an Enum type (new in 3.4 with PEP 435, and also backported), and while namespaces are a good thing, sometimes Enums are used more like constants, and the enum members should live in the global (er, module) namespace.
So instead of:
class Constant(Enum):
    PI = 3.14
    ...

area = Constant.PI * r * r
I can just say:
area = PI * r * r
Is there an easy way to get from Constant.PI to just PI?
The officially supported method is something like this:
globals().update(Constant.__members__)
This works because __members__ is the dict-like object that holds the names and members of the Enum class.
I personally find that ugly enough that I usually add the following method to my Enum classes:
@classmethod
def export_to(cls, namespace):
    namespace.update(cls.__members__)
and then in my top level code I can say:
Constant.export_to(globals())
Note: exporting an Enum to the global namespace only works well when the module only has one such exported Enum. If you have several it is better to have a shorter alias for the Enum itself, and use that instead of polluting the global namespace:
class Constant(Enum):
    PI = ....

C = Constant
area = C.PI * r * r
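A self-contained sketch of the export approach, using a plain dictionary in place of `globals()` so the effect is easy to inspect. The `E` member and the radius are made up for illustration. Note that members of a plain Enum do not support arithmetic, so this sketch multiplies by `.value`; mixing in `float` would be one way around that:

```python
from enum import Enum


class Constant(Enum):
    PI = 3.14
    E = 2.718

    @classmethod
    def export_to(cls, namespace):
        # __members__ maps member names to the members themselves.
        namespace.update(cls.__members__)


# A plain dict stands in for globals() here.
namespace = {}
Constant.export_to(namespace)

r = 2
area = namespace["PI"].value * r * r
```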
FWIW (this is more of a comment than an answer), below is the beginning of a function from some old code I wrote which implements its own named int-like enumerated-value objects, which were added to the global namespace by default (there's no named container Enum class involved). However, it's doing something similar to what's shown in your own answer, so I think it's a good overall approach because it's worked well for me.
def Enum(names, values=None, namespace=None):
    """Function to assign values to names given and add them to a
    namespace. Default values are a sequence of integers starting at
    zero. Default namespace is the caller's globals."""
    if namespace is None:
        namespace = sys._getframe(1).f_globals  # caller's globals
    pairs = _paired(names, values)
    namespace.update(pairs)  # bind names to corresponding named numbers
    . . .
The point being, as far as implementing something for the current Enum class module goes, I'd suggest adding something like it or the def export_to() method shown in your own answer to the Enum base class in the next Python release so it's available automatically.

Dynamic Python Class Definition in SQLAlchemy

I'm creating a backend application with SQLAlchemy using the declarative base. The ORM requires about 15 tables each of which maps to a class object in SQLAlchemy. Because these class objects are all defined identically I thought a factory pattern could produce the classes more concisely. However, these classes not only have to be defined, they have to be assigned to unique variable names so they can be imported and used through the project.
(Sorry if this question is a bit long, I updated it as I better understood the problem.)
Because we have so many columns (~1000) we define their names and types in external text files to keep things readable. Having done that one way to go about declaring our models is like this:
class Foo1(Base):
    __tablename__ = 'foo1'

class Foo2(Base):
    __tablename__ = 'foo2'

... etc
and then I can add the columns by looping over the contents of the external text file and using the setattr() on each class definition.
This is OK but it feels too repetitive as we have about 15 tables. So instead I took a stab at writing a factory function that could define the classes dynamically.
def orm_factory(class_name):
    class NewClass(Base):
        __tablename__ = class_name.lower()
    NewClass.__name__ = class_name.upper()
    return NewClass
Again I can just loop over the columns and use setattr(). When I put it together it looks like this:
for class_name in class_name_list:
    ORMClass = orm_factory(class_name)
    header_keyword_list = get_header_keyword_list(class_name)
    define_columns(ORMClass, header_keyword_list)
Where get_header_keyword_list gets the column information and define_columns performs the setattr() assignment. When I use this and run Base.metadata.create_all(), the SQL schema gets generated just fine.
But, when I then try to import these class definitions into another model I get an error like this:
SAWarning: The classname 'NewClass' is already in the registry of this declarative base, mapped to <class 'ql_database_interface.IR_FLT_0'>
This, I now realize makes total sense based on what I learned yesterday: Python class variable name vs __name__.
You can address this by using type as a class generator in your factory function (as two of the answers below do). However, this does not solve the issue of being able to import the class, because while the classes are dynamically constructed in the factory function, the variable that the function's output is assigned to is static. Even if it were dynamic, such as a dictionary key, it has to be in the module namespace in order to be imported from another module. See my answer for more details.
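A minimal sketch of the type-based factory mentioned above. The `Base` class here is a plain stand-in for the SQLAlchemy declarative base (an assumption made so the example runs without SQLAlchemy), so it shows only how the class name is set at creation time, not the ORM registration itself:

```python
class Base:
    """Stand-in for the declarative base; not the real SQLAlchemy class."""


def orm_factory(class_name):
    # type(name, bases, namespace) creates the class with the right
    # __name__ from the start, instead of renaming NewClass afterwards.
    return type(
        class_name.upper(),
        (Base,),
        {"__tablename__": class_name.lower()},
    )


# The result still has to be bound to a module-level name by hand.
Foo1 = orm_factory("foo1")
Foo2 = orm_factory("foo2")
```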
This sounds like a sketchy idea. But it's fun to solve so here is how you make it work.
As I understand it, your problem is you want to add dynamically created classes to a module. I created a hack using a module and the __init__.py file.
dynamicModule/__init__.py:
import dynamic
class_names = ["One", "Two", "Three"]
for new_name in class_names:
    dynamic.__dict__['Class%s' % new_name] = type("Class%s" % (new_name), (object,), {'attribute_one': 'blah'})
dynamicModule/dynamic.py:
"""Empty file"""
test.py:
import dynamicModule
from dynamicModule import dynamic
from dynamicModule.dynamic import ClassOne
dynamic.ClassOne
"""This all seems evil but it works for me on python 2.6.5"""
__init__.py:
"""Empty file"""
[Note, this is the original poster]
So after some thinking and talking to people I've decided that the ability to dynamically create class objects and assign them to variables in the global namespace in this way just isn't something Python supports (and likely with good reason). Even though I think my use case isn't too crazy (pumping out a predefined list of identically constructed classes) it's just not supported.
There are lots of questions that point towards using a dictionary in a case like this, such as this one: https://stackoverflow.com/a/10963883/1216837. I thought of something like that but the issue is that I need those classes in the module name space so I can import them into other modules. However, adding them with globals() like globals()['MyClass'] = class_dict['MyClass'] seems like it's getting pretty out there and my impression is people on SO frown on using globals() like this.
There are hacks such as the one suggested by patjenk, but at a certain point the obfuscation and complexity outweigh the benefits of the clarity of declaring each class object statically. So while it seems repetitive, I'm just going to write out all the class definitions. Really, this ends up being pretty concise/maintainable:
Class1 = class_factory('class1')
Class2 = class_factory('class2')
...

COM objects (arcobjects) in Python

I'm new to OOP and trying to use COM objects (arcobjects) in Python. The program is GIS related, but I did not get any answers on GIS.SE, so I am asking here. Below is a piece of my code. I am stuck at the end, where I receive iFrameElement. ESRI describes it as a member/interface of an abstract class, which cannot create objects itself. I need to pass the information contained in it to an object in its CoClass (MapFrame).
Any suggestions how to do this?
Also where can I find name conventions for objects in Python? There are p, i as prefix and I am not sure where they come from.
from comtypes.client import CreateObject, GetModule
import arcpy
def CType(obj, interface):
    """Casts obj to interface and returns comtypes POINTER or None"""
    try:
        newobj = obj.QueryInterface(interface)
        return newobj
    except:
        return None

def NewObj(MyClass, MyInterface):
    """Creates a new comtypes POINTER object where\n\
    MyClass is the class to be instantiated,\n\
    MyInterface is the interface to be assigned"""
    from comtypes.client import CreateObject
    try:
        ptr = CreateObject(MyClass, interface=MyInterface)
        return ptr
    except:
        return None
esriCarto = GetModule(r"C:\Program Files (x86)\ArcGIS\Desktop10.0\com\esriCarto.olb")
esriCartoUI = GetModule(r"C:\Program Files (x86)\ArcGIS\Desktop10.0\com\esriCartoUI.olb")
esriMapUI = GetModule(r"C:\Program Files (x86)\ArcGIS\Desktop10.0\com\esriArcMapUI.olb")
esriFrame = GetModule(r"C:\Program Files (x86)\ArcGIS\Desktop10.0\com\esriFramework.olb")
arcpy.SetProduct('Arcinfo')
pApp = NewObj(esriFrame.AppROT, esriFrame.IAppROT).Item(0)
pDoc = pApp.Document
pMxDoc = CType(pDoc, esriMapUI.IMxDocument)
pLayout = pMxDoc.PageLayout
pGraphContLayout = CType(pLayout, esriCarto.IGraphicsContainer)
iFrameElement = pGraphContLayout.FindFrame(pMxDoc.ActiveView.FocusMap)
As far as I understand, iFrameElement is an interface of an abstract class from which I need to inherit attributes (pointer) to a MapFrame object. How do I do that? How do I get to the object with the IMapGrids interface? Any suggestions?
IFrameElement is an interface, so you can't create an instance of it per se. This interface is implemented by various classes, including MapFrame, which means (in basic terms) that an instance of any of those objects 'behaves' like an IFrameElement. So if you get an IFrameElement from IGraphicsContainer.FindFrame(), you can pass it to something else that expects an IFrameElement without having to find out what the actual type of the object is.
I would suggest reading up on what Interfaces mean in OOP, because ESRI's code uses them a lot.
On naming conventions - there is no hard & fast rule on what to name your variables.
By the looks of your code, the p refers to an object with a distinct type, and i refers to an object defined only by an interface. But on that note, calling a variable by the same name as the interface it's referencing (except with a lower-case 'i') is a bad way to do things, and will lead to confusion. (IMO)
Edit:
To answer your final question (sorry, I missed it originally):
If pGraphContLayout.FindFrame() returns an object of type MapFrame (and there is no guarantee that it does) then you should be able to simply cast it across to IMapGrids:
pGraphContLayout = CType(pLayout, esriCarto.IGraphicsContainer)
pFrame = pGraphContLayout.FindFrame(pMxDoc.ActiveView.FocusMap)
pGrids = CType(pFrame, esriCarto.IMapGrids)
It sounds like you may be getting confused by Python's abstract base classes, which seem to serve the purpose of interfaces...? This thread is useful: Difference between abstract class and interface in Python

Dynamically call python-class defined in config-file

I have a config.cfg which I parse using the python-module ConfigParser. In one section I want to configure assignments of the form fileextension : ClassName. Parsing results in the following dictionary:
types = {
"extension1" : "ClassName1",
"extension2" : "ClassName2"
}
EDIT: I know I can now do:
class_ = eval(types[extension])
foo = class_()
But I was given to understand that eval is evil and should not be used.
Do you know a nicer way to dynamically configure which file-extension results in which class?
You could use eval, if the class name in the config file exactly matches the class names in your python code (and if the classes are in scope!), but ..... eval is evil (a coincidence that there's only one letter difference? I think not!)
A safer way to do it would be to add an extra dictionary that maps from configuration class name to python class name. I'd do this because:
configuration files don't have to know about your code's names
can change config files without changing code and vice versa
it avoids eval
So it'd look something like:
mappingDict = {"ClassName1" : MyPythonClass1,
"ClassName2" : MyPythonClass2, ... }
# keys are strings, values are classes
Then you perform a lookup using the value from the config file:
myClassName = types['extension1']
myClass = mappingDict[myClassName]
If module is the module the class named classname lives in, you can get the class object using
class_ = getattr(module, classname)
(If the class lives in the main module, use import __main__ to get a module object for this module.)
To look up the class in the current module's global scope, use
class_ = globals()[classname]
I think a static dictionary as in Matt's answer is the better solution.
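Putting the static-dictionary suggestion together, a sketch could look like the following. The handler classes, the registry name, and the `make_handler` helper are all hypothetical; the `types` dictionary mirrors the shape produced by parsing the config file:

```python
class CsvHandler:
    """Hypothetical handler for .csv files."""


class JsonHandler:
    """Hypothetical handler for .json files."""


# Explicit whitelist: config names on the left, real classes on the right.
CLASS_REGISTRY = {
    "CsvHandler": CsvHandler,
    "JsonHandler": JsonHandler,
}

# Shape of the dictionary obtained from the config file.
types = {
    "csv": "CsvHandler",
    "json": "JsonHandler",
}


def make_handler(extension):
    """Instantiate the class configured for the given file extension."""
    class_name = types[extension]
    try:
        handler_class = CLASS_REGISTRY[class_name]
    except KeyError:
        raise ValueError("Unknown class in config: %r" % class_name)
    return handler_class()


handler = make_handler("csv")
```

Unlike eval, a name that is not in the registry can never be resolved to arbitrary code; it simply raises an error.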
