Generate variable definition code from runtime data structure - python

Say I have a dict at hand at runtime. Is there an easy way to create code that defines the dict? For example, it should output the string
"d = {'string_attr1' : 'value1', 'bool_attr1': True}"
Of course it would be possible to write a converter function by hand that iterates over the key-value pairs and puts the string together, but it would still have to handle special cases, such as deciding whether values have to be quoted.
More generally: Is there a built in way or a library to generate variable declarations from runtime data structures?
Context: I would like to use a list of dicts as input for a code generator. The content of the dicts would be queried from an SQL database. I don't want to tightly couple code generation to the querying of the SQL database, so I think it would be convenient to go with generating a python source file defining a list of dictionaries, which can be used as an input to the code generator.

>>> help(repr)
Help on built-in function repr in module __builtin__:
repr(...)
repr(object) -> string
Return the canonical string representation of the object.
For most object types, eval(repr(object)) == object.
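Applied to the question, this can be sketched as follows. The names d, rows and generated_source are illustrative only, and pprint.pformat is used here just to get wrapped, readable output. The approach assumes every value has a repr that is a valid Python literal (str, numbers, bool, None, lists, dicts, tuples); values such as datetime or Decimal objects coming from an SQL query would need matching imports in the generated file.

```python
import ast
import pprint

d = {"string_attr1": "value1", "bool_attr1": True}

# repr() yields source text for plain literals, so prefixing a name
# gives a complete assignment statement.
single_line = "d = " + repr(d)

rows = [
    {"string_attr1": "value1", "bool_attr1": True},
    {"string_attr1": "value2", "bool_attr1": False},
]

# pprint.pformat produces the same kind of text, wrapped over several lines.
generated_source = "rows = " + pprint.pformat(rows) + "\n"

# Sanity check: the literal part can be parsed back without exec/eval.
literal_text = generated_source.split("=", 1)[1].strip()
round_tripped = ast.literal_eval(literal_text)
```

Writing generated_source to a .py file gives a module that the code generator can import without knowing anything about the database.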

Related

python mapping with syntax-checkable keys

I need to create a mapping of keys (strings, all suitable Python identifiers) to values in Python (3.9).
All keys and values are constant and known at creation time, and I want to make sure that every single key has an associated value.
1. dict
The first idea that comes to mind for this would be using a dictionary, which comes with the big problem that keys (in my case) would be strings.
That means I have to retype the key in a string literal each time a value is accessed manually, so IDEs and type checkers can't spot typos or suggest key names in autocomplete, and I can't use their utility functions to rename a key or find its usages.
1.5 dict with constant variable keys
The naive solution for this would be to create a constant for each key, or an enum, which I don't think is a good solution. Not only is at least one name lookup added to each access, it also means that the key definition and the value assignment are separated, which can lead to keys that don't have a value assigned to them.
2. enum
This leads to the idea of skipping the dict and using an enum to associate the keys directly with the values. Enums are conveniently supported by syntax checkers, auto-completion and the like, as they support both attribute reference via "dot notation" and subscription via "[]".
However, an enum has the big disadvantage that it requires all keys (enum members) to have unique values; keys violating this rule are automatically converted to aliases, which makes outputs very confusing.
I already thought about copying the Enum-Code and removing the unwanted bits, but this seems to be a lot of effort for such a basic problem.
question:
So basically, what I'm looking for is a pythonic, neat and concise way to define a (potentially immutable) mapping from string keys to arbitrary values which supports the following:
iterable (over keys)
keys with identical values don't interfere with each other
keys are required to have an associated value
keys are considered by syntax-checkers, auto-completion, refactorings, etc.
The preferred way of using it would be to define it in a python source file but it would be a nice bonus, if the solution supported easy means to write the data to a text file (json format, or ini or similar) and to create a new instance from such a file.
How would you do that and why would you choose a specific solution?
For the first part, I would use aenum [1], which has a noalias setting (so duplicate values can exist with distinct names):
from aenum import NoAliasEnum

class Unique(NoAliasEnum):
    first = 1
    one = 1
and in use:
>>> Unique.first
<Unique.first: 1>
>>> Unique.one
<Unique.one: 1>
>>> # name lookup still works
>>> Unique['one']
<Unique.one: 1>
>>> # but value lookups do not
>>> Unique(1)
Traceback (most recent call last):
...
TypeError: NoAlias enumerations cannot be looked up by value
For the second part, decide which you want:
read and create enums from a file
create enum in Python and write to a file
Doing both doesn't seem to make a lot of sense.
To create from a file you can use my JSONEnumMeta answer.
To write to a file you can use my share enums with arduino answer (after adapting the __init_subclass__ code).
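As a rough illustration of what the two directions involve, here is a minimal sketch that uses only the standard library. It is not the JSONEnumMeta or __init_subclass__ code from the linked answers; the helper names enum_to_json and enum_from_json and the Color enum are hypothetical. Note that the plain stdlib Enum used here still turns duplicate values into aliases, unlike NoAliasEnum.

```python
import json
from enum import Enum


class Color(Enum):
    red = 1
    green = 2


def enum_to_json(enum_cls):
    # __members__ maps every name (aliases included) to its member.
    return json.dumps(
        {name: member.value for name, member in enum_cls.__members__.items()}
    )


def enum_from_json(name, text):
    # The functional Enum API accepts a mapping of member names to values.
    return Enum(name, json.loads(text))


text = enum_to_json(Color)
Restored = enum_from_json("Color", text)
```

An enum built this way exists only at runtime, so static tools cannot see its members; that is one reason to pick a single direction, as suggested above.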
The only thing I'm not certain of is the last point of syntax-checker and auto-completion support.
1 Disclosure: I am the author of the Python stdlib Enum, the enum34 backport, and the Advanced Enumeration (aenum) library.

How to dump all the variables in a file?

Is there an easy and more or less standard way to dump all the variables into a file, something like stacktrace but with the variables names and values? The ones that are in locals(), globals() and maybe dir().
I can't find an easy way, here's my code for "locals()" which doesn't work because the keys can be of different types:
vars1 = list(filter(lambda x: len(x) > 2 and locals()[x][:2] != "__", locals()))
And without filtering, when trying to dump the variables I get an error:
f.write(json.dumps(locals()))
# =>
TypeError: <filter object at 0x7f9bfd02b710> is not JSON serializable
I think there must be something better than doing it manually.
To start, your non-working example doesn't actually filter on the keys (which are normally strings, even if that is not technically required); locals()[x] is the value, not the key.
But even if you did filter the keys in some way, you don't generally know that all of the remaining values are JSON serialisable. Therefore, you either need to filter the values to keep only types that can be mapped to JSON, or you need a default serialiser implementation that applies some sensible serialisation to any value. The simplest thing would be to just use the built-in string representation as a fall-back:
json.dumps(locals(), default=repr)
By the way, there's also a more direct and efficient way of dumping JSON to a file (note the difference between dump and dumps):
json.dump(locals(), f, default=repr)
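Putting both points together, a minimal sketch could look like this. The helper name dump_namespace and the sample dict are made up for illustration, and an in-memory io.StringIO stands in for the real file; the same call works with locals() or globals() as the namespace.

```python
import io
import json


def dump_namespace(namespace, fp):
    """Write a name -> value mapping as JSON, skipping dunder names."""
    visible = {
        name: value
        for name, value in namespace.items()
        if not name.startswith("__")
    }
    # default=repr is only called for values json cannot encode itself.
    json.dump(visible, fp, default=repr, sort_keys=True, indent=2)


sample = {
    "__name__": "__main__",
    "count": 3,
    "names": ["a", "b"],
    "evens": filter(None, [0, 2]),  # not JSON serialisable on its own
}

buffer = io.StringIO()
dump_namespace(sample, buffer)
loaded = json.loads(buffer.getvalue())
```

Values that fall back to repr come out as plain strings, so the dump is useful for inspection but cannot be loaded back into the original objects.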

Python + JSON serialization for MD5 hash - how can I guarantee that two equivalent objects will serialize to exactly the same string?

I need to take an md5 hash of the contents of a dict or list and I want to ensure that two equivalent structures will give me the same hash result.
My approach thus far has been to carefully define the order of the structures and to sort the various lists and dictionaries that they contain prior to running them through json.dumps().
As my structures get more complex, however, this is becoming laborious and error prone, and in any case I was never sure it was working 100% of the time or just 98% of the time.
Just curious if anyone has a quick solution for this? Is there an option I can set in the json module to sort objects completely? Or some other trick I can use to do a complete comparison of the information in two structures and return a hash guaranteed to be unique to it?
I only need the strings (and then the md5) to come out the same when I serialize the objects -- I'm not concerned about deserializing for this use case.
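On older Python versions (before 3.7), the key order of JSON output is not deterministic by default, because dicts are unordered there and the results of __hash__ are salted for str (the usual key type of JSON objects) to prevent a DoS vector (see the notes in the documentation). From Python 3.7 on, dicts keep insertion order, so two equal dicts that were built in a different order can still serialize to different strings. In both cases you need to call json.dumps with sort_keys set to True.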
>>> import json
>>> d = {'this': 'This word', 'that': 'That other word', 'other': 'foo'}
>>> json.dumps(d)
'{"this": "This word", "other": "foo", "that": "That other word"}'
>>> json.dumps(d, sort_keys=True)
'{"other": "foo", "that": "That other word", "this": "This word"}'
For objects that end up serialized into a JSON array (i.e. list, tuple), you will need to make sure the ordering is what you expect: these sequences keep their elements in whatever positions the program placed them, and json.dumps does not sort them. If two sequences should count as equivalent regardless of element order, sort them yourself before serializing.
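A minimal sketch of the whole step, assuming the data contains only JSON-serialisable values. The function name stable_md5 is made up for illustration, and the compact separators are an optional choice to keep whitespace out of the hashed string.

```python
import hashlib
import json


def stable_md5(obj):
    # sort_keys=True also sorts the keys of nested dicts.
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()


a = {"this": 1, "that": [1, 2], "other": {"x": 1, "y": 2}}
b = {"other": {"y": 2, "x": 1}, "that": [1, 2], "this": 1}

# Same content as a, but the list elements are in a different order.
c = {"this": 1, "that": [2, 1], "other": {"x": 1, "y": 2}}

same_hash = stable_md5(a) == stable_md5(b)
list_order_matters = stable_md5(a) != stable_md5(c)
```

Here a and b hash identically although their keys were inserted in different orders, while c gets a different hash because list order is preserved.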

Convert string representation of list of objects back to list in python

I have a list of objects, that has been stringified:
u'[<object: objstuff1, objstuff2>, <object: objstuff1, objstuff2>]'
I want to convert this back into a list:
[<object: objstuff1, objstuff2>, <object: objstuff1, objstuff2>]
I've tried using ast.literal_eval(), but unfortunately, it doesn't seem to work if the elements are objects, and I get a SyntaxError.
Is there any way I can reconvert my string representation of the list of objects back into a list?
You need to have a look at the pickle module to do this.
Basically, dump your objects using pickle.dumps, and load them back using pickle.loads.
ast.literal_eval doesn't work here because a lot of information about the objects (such as their attributes and values) is simply not captured in that string. Also note that you will only be able to restore data that was pickled. If all you have right now are those string representations, you won't be able to recreate the objects from them, because of the information loss.
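A minimal sketch of the difference, using types.SimpleNamespace as a stand-in for the asker's objects (the real class is not shown in the question, so the attribute name is hypothetical):

```python
import ast
import pickle
from types import SimpleNamespace

items = [SimpleNamespace(name="objstuff1"), SimpleNamespace(name="objstuff2")]

# The string form is not a Python literal, so literal_eval rejects it.
text = repr(items)
try:
    ast.literal_eval(text)
    literal_eval_worked = True
except (ValueError, SyntaxError):
    literal_eval_worked = False

# pickle stores the full object state and can rebuild equal objects.
blob = pickle.dumps(items)
restored = pickle.loads(blob)
```

Only unpickle data from a source you trust, since loading a pickle can execute arbitrary code.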

Python: Linking to a dictionary through a text string

I'm trying to create a program module that contains data structures (dictionaries) and text strings that describe those data structures. I want to import these (dictionaries and descriptions) into a module that is feeding a GUI interface. One of the displayed lines is the contents contained in the first dictionary with one field that contains all possible values contained in another dictionary. I'm trying to avoid 'hard-coding' this relationship and would like to pass a link to the second dictionary (containing all possible values) to the string describing the first dictionary. An abstracted example would be:
dict1 = {
    "1": ["dog", "cat", "fish"],
    "2": ["alpha", "beta", "gamma", "epsilon"]
}
string = "parameter1,parameter2,dict1"
# Silly example starts here
#
string = string.split(",")
print(string[2]["2"])
(I'd like to get: ["alpha","beta","gamma","epsilon"])
But of course this doesn't work.
Does anyone have a clever solution to this problem?
Generally, this kind of dynamic code execution is a bad idea. It leads to code that is very difficult to read and maintain. However, if you must, you can use globals for this:
globals()[string[2]]["2"]
A better solution would be to put dict1 into a dictionary in the first place:
dict1 = ...
namespace = {'dict1': dict1}
string = ...
namespace[string[2]]["2"]
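Filled in with the data from the question, the second approach could look like the sketch below. The names description, fields and choices are illustrative and replace the reused variable string.

```python
dict1 = {
    "1": ["dog", "cat", "fish"],
    "2": ["alpha", "beta", "gamma", "epsilon"],
}

# Explicit registry of the dictionaries that a description may refer to.
namespace = {"dict1": dict1}

description = "parameter1,parameter2,dict1"
fields = description.split(",")

# fields[2] is the text "dict1"; the registry maps it to the real object.
choices = namespace[fields[2]]["2"]
```

Compared with globals(), an unknown name raises a KeyError against a small, known set of entries, and nothing outside the registry can be reached through the description string.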
