I created a nested dictionary based on the AttrDict found here:
Object-like attribute access for nested dictionary
I modified it so that the "leaves" contain str commands that get executed when the value is read from or written to:
commands = {'root': {'com': {'read': 'READ_CMD', 'write': 'WRITE_CMD'} } }
class AttrTest:
    def __init__(self):
        self.__dict__['attr'] = AttrDict(commands)
test = AttrTest()
data = test.attr.root.com.read # data = value read with the command
test.attr.root.com.write = data # data = value written on the com port
While it works beautifully, I'd like to:
Avoid people getting access to attr/root/com, as these return a sub-level dictionary
Allow people to access attr.root.com directly (through __getattribute__/__setattr__)
Currently, I'm facing the following problems:
As said, when accessing the 'trunk' of the nested dict, I get a partial dict of the 'leaves'
When accessing attr.root.com, it returns {'read': 'READ_CMD', 'write': 'WRITE_CMD'}
If I detect a read, I can do a forward lookup and return the value, but then attr.root.com.read fails
Is it possible to know the final level Python will request in the "path"?
To block access to attr/root
To read/write the value accessing attr.root.com directly (using forward lookup)
To return the needed partial dict only if attr.root.com.read or attr.root.com.write are requested
Currently I've found nothing that allows me to control how deep the lookup is expected to go.
Thanks for your consideration.
For a given attribute lookup you cannot determine how many others will follow; this is how Python works. In order to resolve x.y.z, first the object x.y needs to be retrieved before the subsequent attribute lookup (x.y).z can be performed.
What you can do, however, is return a proxy object that represents the (partial) path instead of the actual underlying object stored in the dict. For example, test.attr.root would return a proxy object representing the path attr.root to be looked up on the test object. Only when you encounter a read or write leaf in the path would you resolve the path and read/write the data.
The following is a sample implementation which uses an AttrDict based on __getattr__ to provide the Proxy objects (so you don't have to intercept __getattribute__):
from functools import reduce

class AttrDict(dict):
    def __getattr__(self, name):
        return Proxy(self, (name,))

    def _resolve(self, path):
        return reduce(lambda d, k: d[k], path, self)

class Proxy:
    def __init__(self, obj, path):
        object.__setattr__(self, '_obj', obj)
        object.__setattr__(self, '_path', path)

    def __str__(self):
        return f"Path<{'.'.join(self._path)}>"

    def __getattr__(self, name):
        if name == 'read':
            return self._obj._resolve(self._path)[name]
        else:
            return type(self)(self._obj, (*self._path, name))

    def __setattr__(self, name, value):
        if name != 'write' or name not in (_dict := self._obj._resolve(self._path)):
            raise AttributeError(f'Cannot set attribute {name!r} for {self}')
        _dict[name] = value
commands = {'root': {'com': {'read': 'READ_CMD', 'write': 'WRITE_CMD'} } }
test = AttrDict({'attr': commands})
print(f'{test.attr = !s}') # Path<attr>
print(f'{test.attr.root = !s}') # Path<attr.root>
print(f'{test.attr.root.com = !s}') # Path<attr.root.com>
print(f'{test.attr.root.com.read = !s}') # READ_CMD
test.attr.root.com.write = 'test'
test.attr.root.write = 'illegal' # raises AttributeError
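One small extension worth noting (my addition, not part of the original answer): as written, reading test.attr.root.com.write returns another Proxy rather than the stored command. If reads of both leaves should resolve, Proxy.__getattr__ can treat them symmetrically:

    # Hypothetical replacement for Proxy.__getattr__ above:
    def __getattr__(self, name):
        # resolve both leaf names on read; anything else extends the path
        if name in ('read', 'write'):
            return self._obj._resolve(self._path)[name]
        return type(self)(self._obj, (*self._path, name))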
Related
I have a problem with my class Config, which works as a proxy between the user and an ini file. It can load parameters from ini files and set them on their name equivalents in a dataclass. I've realized that if I access an attribute with a dot, like Config()._BASE_DIR, it returns a str value, because ConfigParser stores values as str. My idea is to create some method that patches all my attributes with property and property.setter, so that the dataclass attributes remain accessible with a dot but are wrapped with their annotation classes; for example, Config()._minAR would then return 4.0 as a float instead of as a string.
Is my idea acceptable, or do I need to do it differently?
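To make the problem concrete, here is a minimal reproduction of why everything comes back as a str (my sketch; the section and key are made up):

import configparser

parser = configparser.ConfigParser()
parser.read_string("[settings]\n_minar = 4.0\n")

value = parser['settings']['_minar']
print(type(value), value)   # <class 'str'> 4.0 -- ConfigParser stores everything as str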
Config code parts:
import configparser
import pathlib
from dataclasses import dataclass
from itertools import zip_longest

@dataclass
class Config:
    _IGNORE_FIELDS = {'_IGNORE_FIELDS', '_CONF_PARSER'}
    _CONF_PARSER: configparser.ConfigParser = configparser.ConfigParser()
    _BASE_TABLE_FILE_SUFFIX: str = '.csv'
    _BASE_DIR: pathlib.Path = pathlib.Path().absolute()
    _CONF_PATH: pathlib.Path = _BASE_DIR / 'conf'
    _CONF_FILE_PATH: pathlib.Path = _CONF_PATH / 'settings.ini'
    _DATA_TABLE_PATH: pathlib.Path = _CONF_PATH / ('_data_table' + _BASE_TABLE_FILE_SUFFIX)
    _minAR: float = 4.0
    _maxAR: float = 5.0
    CATCH_TIME: int = 6

    def __init__(self) -> None:
        self.prepare()

    def check_synchronized(self) -> tuple[bool, str]:
        if not self._CONF_PARSER.has_section('settings'):
            return False, 'ini'
        parser_config = self._CONF_PARSER['settings'].items()
        python_config = {
            k: v
            for k, v in self.__dataclass_fields__.items()
            if k not in self._IGNORE_FIELDS
        }.items()
        for pair_1, pair_2 in zip_longest(python_config, parser_config, fillvalue=(None, None)):
            key_1, val_1 = pair_1
            if key_1 is None:
                return False, 'script'
            key_2, val_2 = pair_2
            if key_2 is None:
                return False, 'ini'
            if key_2 in self._IGNORE_FIELDS:
                continue
            if key_1.lower() != key_2.lower() or (default := str(val_1.default)) != val_2:
                mode = 'ini' if default != str(getattr(self, key_1)) else 'script'
                return False, mode
        return True, 'both'

    def updateFromIni(self):
        for key, value in self._CONF_PARSER['settings'].items():
            upper_key = key.upper()
            if str(getattr(self, upper_key)) == value:
                continue
            setattr(self, upper_key, value)

    def prepare(self):
        self._createConfDir()
        is_sync, mode = self.check_synchronized()
        if is_sync:
            return
        if mode == 'ini' or mode == 'both':
            self._writeAll()
        elif mode == 'script':
            self.updateFromIni()

    def _writeAll(self):
        if not self._CONF_PARSER.has_section('settings'):
            self._CONF_PARSER.add_section('settings')
        for key, field in self.__dataclass_fields__.items():
            if key in self._IGNORE_FIELDS:
                continue
            self._CONF_PARSER.set('settings', key, str(field.default))
        self._writeInFile()

    def _writeInFile(self):
        with open(self._CONF_FILE_PATH, 'w') as file:
            self._CONF_PARSER.write(file)

    def _createConfDir(self) -> None:
        if not self._CONF_PATH.exists():
            self._CONF_PATH.mkdir(parents=True, exist_ok=True)

    def setValue(self, field, value):
        if not hasattr(self, field) or field in self._IGNORE_FIELDS:
            return
        setattr(self, field, value)
        if not isinstance(value, str):
            value = str(value)
        self._CONF_PARSER.set('settings', field, value)
        self._writeInFile()
More context: I use a dataclass with ConfigParser to make my Config class able to do the following things:
Sync attributes with the ini file (if there is no ini file, create it from the Config structure with default values; if Config is not synchronized with the ini file, load from the ini, and write to the ini if the ini file has the wrong structure or some values are incorrect) to avoid the situation where the user accidentally deletes the ini file;
Set and get all existing values in the config from any part of my program (it is a PyQt6 application);
Save its state from one session (application run) to another.
So I had no idea what other structure I should have used for the config class, except for this. If you have a better idea for a synchronizable config, tell me.
I've discovered that the only change I need to give my Config class custom dot access to attributes is to write a custom magic method __getattribute__ in my class.
Result:
import configparser
import pathlib
from dataclasses import dataclass
from itertools import zip_longest
from typing import Any

ACCESS_FIELDS = {
    'BASE_TABLE_FILE_SUFFIX', 'BASE_DIR', 'CONF_PATH', 'CONF_FILE_PATH',
    'DATA_TABLE_PATH', 'minAR', 'CATCH_TIME'
}

class Config:
    # some code ...

    def __getattribute__(self, __name: str) -> Any:
        if __name == 'ACCESS_FIELDS':
            return ACCESS_FIELDS
        attr = super().__getattribute__(__name)
        if __name in ACCESS_FIELDS:
            _type = self.__annotations__[__name]
            return _type(attr)
        return attr

    # other code ...
I created the variable with the accessed fields outside the class body because otherwise, getting ACCESS_FIELDS via Config.ACCESS_FIELDS or self.ACCESS_FIELDS would call the __getattribute__ method again and cause a recursion error.
Basically, I got everything I need with this solution, but I still have a problem with the setValue method. I've discovered that an overridden __setattr__ method does not play well with the overridden __getattribute__ method in my class (it causes a recursion error too). Probably I'll restructure my Config class, but not now.
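A common way around that recursion, shown here on a stripped-down Config (my sketch, not from the original post), is to route every internal read and write through object's own implementations so the overridden hooks never re-enter themselves:

from typing import Any

class Config:
    _minAR: float = 4.0

    def __getattribute__(self, name: str) -> Any:
        # object.__getattribute__ bypasses this hook, so it cannot recurse
        attr = object.__getattribute__(self, name)
        annotations = object.__getattribute__(self, '__annotations__')
        if name in annotations:
            return annotations[name](attr)   # cast using the class annotation
        return attr

    def __setattr__(self, name: str, value: Any) -> None:
        # object.__setattr__ stores the value without re-entering this hook
        object.__setattr__(self, name, value)

config = Config()
config._minAR = '4.0'   # stored as str, e.g. fresh from ConfigParser
print(config._minAR)    # 4.0, converted to float on access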
I have a somewhat different need/requirement in how I use mongoengine and wanted to share the approach below to get ideas on how to deal with it.
My requirement is that for some fields in a document, I want to be able to keep two values: a system-default value and a user-specified value. When accessing such a field, it should return a value in the following manner (a quick sketch follows the list):
if a user-specified value exists, return that OR
if a user-specified value doesn’t exist, return the default
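In plain Python, independent of mongoengine, that lookup rule is just (a minimal sketch; the key names match the field implementation below):

def effective_value(field_dict):
    # the user-specified value wins; otherwise fall back to the system default
    if '_active' in field_dict:
        return field_dict['_active']
    return field_dict.get('_default')

print(effective_value({'_active': 'one', '_default': 'two'}))  # one
print(effective_value({'_default': 'two'}))                    # two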
I cannot use the default attribute that mongoengine already provides because:
It is applied to the mongoengine class model rather than the document
The default value would need to be updated based on system changes in newer versions of the software
There would be no memory of what the default is at the document level, so it would be difficult to determine if the default needs to be changed for certain documents only
I thought of creating something called an AlternateValueField, which is based on a MapField. Below is the code for this new field type:
import mongoengine

class AlternateValueField(mongoengine.MapField):
    def __init__(self, field=None, *args, **kwargs):
        self.allowed_keys = (
            '_active',   # Active Value - changeable by user
            '_default',  # Default Value - provided by system
        )
        # Remove the field's default since it now gets handled
        # as _default
        self.kwargs_default = kwargs.pop('default', None)
        super().__init__(field=field, *args, **kwargs)

    def __get__(self, instance, owner):
        result = super().__get__(instance, owner)
        if isinstance(result, dict):
            if '_active' in result.keys():
                return result['_active']
            if '_default' in result.keys():
                return result['_default']
        return result

    def __set__(self, instance, value):
        instance_dict = instance._data.get(self.name)
        if isinstance(value, dict):
            if isinstance(instance_dict, dict):
                # To keep the order (_active followed by _default)
                new_value = {}
                for key in self.allowed_keys:
                    if (key in instance_dict.keys()
                            and key not in value.keys()):
                        new_value[key] = instance_dict[key]
                    elif key in value.keys():
                        new_value[key] = value[key]
                value = new_value
        elif value is not None:
            new_value = {
                '_active': value,
            }
            if isinstance(instance_dict, dict):
                if '_default' in instance_dict.keys():
                    new_value['_default'] = instance_dict['_default']
            value = new_value
        else:
            if self.required and self.default is not None:
                new_value = {
                    '_default': self.kwargs_default,
                }
                value = new_value
        self._check_for_allowed_keys(value)
        super().__set__(instance, value)

    def _check_for_allowed_keys(self, value):
        if value is None:
            return
        for key in value.keys():
            if key in self.allowed_keys:
                continue
            err_msg = (
                f"Key '{key}' is not allowed."
                f" Allowed keys: {self.allowed_keys}"
            )
            raise KeyError(err_msg)

    def validate(self, value, **kwargs):
        super().validate(value, **kwargs)
        self._check_for_allowed_keys(value)
A document example that uses it:
class MySchedule(mongoengine.Document):
    my_name = AlternateValueField(mongoengine.StringField())
    sched1 = mongoengine.StringField()
Using this model to create a document:
>>> s2 = MySchedule(my_name={'_active': 'one', '_default': 'two'}, sched1='yes')
>>> s2.validate()
>>> s2.my_name # returns just the value (of active or default) instead of dict
'one'
>>> s2.to_json()
'{"my_name": {"_active": "one", "_default": "two"}, "sched1": "yes"}'
>>> s2.to_mongo()
SON([('my_name', {'_active': 'one', '_default': 'two'}), ('sched1', 'yes')])
>>> s2._fields
{'my_name': <__main__.AlternateValueField object at 0x7fd37fbaf1c0>, 'sched1': <mongoengine.fields.StringField object at 0x7fd37f968250>, 'id': <mongoengine.base.fields.ObjectIdField object at 0x7fd378225640>}
>>> s2._data
{'id': None, 'my_name': {'_active': 'one', '_default': 'two'}, 'sched1': 'yes'}
>>> s2.my_name = 'anotherone' # works due to the __set__() implementation above
>>> s2.my_name
'anotherone'
>>> s2._data
{'id': None, 'my_name': {'_active': 'anotherone', '_default': 'two'}, 'sched1': 'yes'}
>>> s2.my_name['_active'] = 'blah' # won't work now
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'str' object does not support item assignment
>>> s2._data['my_name']['_active'] = 'blah' # will work
>>> s2.my_name
'blah'
>>> s2._data
{'id': None, 'my_name': {'_active': 'blah', '_default': 'two'}, 'sched1': 'yes'}
My issue is, I would like to be able to use these cases:
s2.my_name # works - returns the _active value or _default value, not dict of them
s2.my_name = 'something' # works - via the __set__() implementation
s2.my_name['_active'] = 'blah' # causes error as shown above
The 3rd case above is what I'd like to be able to do without causing an error.
I tried the following in an attempt to get around it:
overriding __get__(), which doesn't work because its job is done before there is any way to recognize that a getitem call follows
thought about overriding BaseDict.__getitem__() to recognize this, but that won't work when there is no getitem call
I would need something after the AlternateValueField's __get__() call and before BaseDict's __getitem__() call to recognize whether a getitem follows or not.
Any thoughts on the best way to get around this? Obviously, the above may just be a bad idea. Any alternatives (pardon the pun :)) for creating fields with alternative values that can be changed on a per-document basis and remembered across restarts of the software? Fields can have all the types mongoengine supports, but will mostly be string, int, and potentially list.
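One direction that might work (my sketch, untested against mongoengine's descriptor internals): have __get__ return a small str subclass that remembers the backing dict, so plain reads still behave like the value while item assignment writes through. This only covers the str case; int or list values would need their own wrappers:

class ActiveValueProxy(str):
    """Behaves like the active value, but writes through to the backing dict."""

    def __new__(cls, value, backing_dict):
        obj = super().__new__(cls, value)
        obj._backing = backing_dict
        return obj

    def __setitem__(self, key, value):
        self._backing[key] = value

# Hypothetical change inside AlternateValueField.__get__:
#     if '_active' in result:
#         return ActiveValueProxy(result['_active'], result)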
I am using redis to try to save a request's session object. Based on how to store a complex object in redis (using redis-py), I have:
import pickle
from redis import Redis

def get_object_redis(key, r):
    saved = r.get(key)
    obj = pickle.loads(saved)
    return obj

redis = Redis()
s = get_object_redis('saved', redis)
I have situations where there is no saved session and 'saved' evaluates to None. In this case I get:
TypeError: must be string or buffer, not None
What's the best way to deal with this?
There are several ways to deal with it. This is what they would have in common:
def get_object_redis(key, r):
    saved = r.get(key)
    if saved is None:
        # maybe add code here
        return ...  # return something you expect
    obj = pickle.loads(saved)
    return obj
You need to make it clear what you expect if a key is not found.
Version 1
An example would be you just return None:
def get_object_redis(key, r):
    saved = r.get(key)
    if saved is None:
        return None
    obj = pickle.loads(saved)
    return obj

redis = Redis()
s = get_object_redis('saved', redis)
s is then None. This may be bad because you need to handle that somewhere, and you cannot tell whether the key was not found or was found with a value that really is None.
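If that ambiguity matters, one common workaround (my sketch, not part of the original answer) is a module-private sentinel that no stored value can ever equal:

_MISSING = object()  # unique sentinel; no stored value can ever be it

def get_object_redis(key, r, default=_MISSING):
    saved = r.get(key)
    if saved is None:
        if default is _MISSING:
            raise KeyError(key)   # caller gave no fallback
        return default            # caller-supplied fallback, may be None
    return pickle.loads(saved)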
Version 2
You create an object, maybe based on the key, that you can construct because you know what lies behind a key.
class KeyWasNotFound(object):
    # just an example class
    # maybe you have something useful in mind
    def __init__(self, key):
        self.key = key

def get_object_redis(key, r):
    saved = r.get(key)
    if saved is None:
        return KeyWasNotFound(key)
    obj = pickle.loads(saved)
    return obj
Usually, if identity is important, you would store the object after you created it, to return the same object for the key.
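That could look like the following sketch (the cache name is mine):

_not_found_cache = {}

def get_object_redis(key, r):
    saved = r.get(key)
    if saved is None:
        # repeated lookups of the same missing key get the same object
        return _not_found_cache.setdefault(key, KeyWasNotFound(key))
    return pickle.loads(saved)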
Version 3
TypeError is a very generic error. You can create your own error class. This would be my preferred way, because I do not like version 1 and do not know which object would be useful to return.
class NoRedisObjectFoundForKey(KeyError):
    pass

def get_object_redis(key, r):
    saved = r.get(key)
    if saved is None:
        raise NoRedisObjectFoundForKey(key)
    obj = pickle.loads(saved)
    return obj
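Call sites then make the fallback explicit (a usage sketch; create_new_session is a hypothetical placeholder):

redis = Redis()
try:
    s = get_object_redis('saved', redis)
except NoRedisObjectFoundForKey:
    s = create_new_session()  # hypothetical: build a fresh session object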
I would like to ask for some guidelines on a small task that I am trying to solve.
I am experimenting with a small app that uses JSON data to save entities.
I know that you can easily convert a dict to an entity by just creating the model, but I am trying to build a more generic approach that would convert any dict to an entity.
My steps are:
Get the dict.
Validate that the dict keys correspond to the entity model's definitions by reading the model's class __dict__.
Try to unpack the validated properties into the model class constructor (create the model instance).
Return it.
So far I am OK, but my lack of Python knowledge is either constraining me or confusing me. Maybe I am also forgetting or unaware of a simpler way to do it.
So here it is:
@classmethod
def entity_from_dict(cls, parent_key, dict):
    valid_properties = {}
    logging.info(cls.__dict__)
    for property, value in dict.iteritems():
        if property in cls.__dict__:  # should not iterate over functions, classmethods, and @property
            logging.info(cls.__dict__[property])        # this outputs e.g.: StringProperty('title', required=True)
            logging.info(type(cls.__dict__[property]))  # this is more interesting: <class 'google.appengine.ext.ndb.model.StringProperty'>
            valid_properties.update({property: value})
    # Update the id from the dict
    if 'id' in dict:  # if not creating a new entity
        valid_properties['id'] = dict['id']
    # Add the parent
    valid_properties['parent'] = parent_key
    #logging.info(valid_properties)
    try:
        entity = cls(**valid_properties)
    except Exception as e:
        logging.exception('Could not create entity \n' + repr(e))
        return False
    return entity
My problem is that I want to validate only ndb Properties, and not @classmethod or @property attributes as well, because this causes a conflict.
I am also using Expando classes, so any extra property in the dict gets stored.
How can I check against these specific types?
Solved it as @Tim Hoffman proposed, using the ._properties of the ndb model.
A thing I didn't know is that via ._properties I could get the model definition's properties; I thought it would only return the instance's properties :-).
Also, I did not use populate, because I find that it does the same as passing the valid dict unpacked into the model's constructor ;-)
So here it is:
@classmethod
def entity_from_dict(cls, parent_key, data_dict):
    valid_properties = {}
    for cls_property in cls._properties:
        if cls_property in data_dict:
            valid_properties.update({cls_property: data_dict[cls_property]})
    #logging.info(valid_properties)
    # Update the id from the data_dict
    if 'id' in data_dict:  # if not creating a new entity
        valid_properties['id'] = data_dict['id']
    # Add the parent
    valid_properties['parent'] = parent_key
    try:
        entity = cls(**valid_properties)
    except Exception as e:
        logging.exception('Could not create entity \n' + repr(e))
        return False
    return entity
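For illustration, a hypothetical Expando model using this method might look as follows (model, key, and data are mine, not from the original post; note that keys absent from cls._properties are simply skipped by this version):

from google.appengine.ext import ndb

class Article(ndb.Expando):
    title = ndb.StringProperty(required=True)

parent_key = ndb.Key('Board', 'default')       # hypothetical parent
data = {'title': 'Hello', 'id': 'article-1'}   # 'id' routes to the entity key
article = Article.entity_from_dict(parent_key, data)
if article:
    article.put()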
Python's JSON dump method, which we use when converting models to JSON for export, converts non-strings into strings. Therefore Jimmy Kane's method throws an error due to model incompatibility. To avoid this problem, I updated his method and added a function named prop_literal just for converting non-string values encapsulated in strings back into their literal types.
I also added entity.put() to save the entity to the datastore, because that was the aim :)
def prop_literal(prop_type, prop_val):
    """
    Convert non-string encapsulated in the string into literal type
    """
    if "Integer" in prop_type:
        return int(prop_val)
    elif "Float" in prop_type:
        return float(prop_val)
    elif "DateTime" in prop_type:
        # skip it; it is locale-dependent anyway
        return None
    elif ("String" in prop_type) or ("Text" in prop_type):
        return prop_val
    elif "Bool" in prop_type:
        return True if prop_val == True else False
    else:
        return prop_val

@classmethod
def entity_from_dict(cls, parent_key, data_dict):
    valid_properties = {}
    for cls_property in cls._properties:
        if cls_property in data_dict:
            prop_type = str(cls._properties[cls_property])
            # logging.info(prop_type)
            real_val = prop_literal(prop_type, data_dict[cls_property])
            try:
                valid_properties.update({cls_property: real_val})
            except Exception as ex:
                # logging.info("Error while transferring data: " + str(ex))
                pass
        else:
            # logging.info("prop skipped")
            pass
    #logging.info(valid_properties)
    # Update the id from the data_dict
    if 'id' in data_dict:  # if not creating a new entity
        valid_properties['id'] = data_dict['id']
    # Add the parent
    valid_properties['parent'] = parent_key
    try:
        entity = cls(**valid_properties)
        logging.info(entity)
        entity.put()
    except Exception as e:
        logging.exception('Could not create entity \n' + repr(e))
        return False
    return entity
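A quick sanity check of prop_literal (the stringified property reprs are assumptions; only the substring matching matters):

print(prop_literal("IntegerProperty('count')", "42"))     # 42 as int
print(prop_literal("FloatProperty('ratio')", "0.5"))      # 0.5 as float
print(prop_literal("StringProperty('title')", "hello"))   # 'hello' unchanged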
I have what I think is a small misconception about loading some YAML objects. I defined the class below.
What I want to do is load some objects with the overridden loadConfig function for YAMLObjects. Some of these come from my .yaml file, but others should be built out of objects loaded from the YAML file.
For instance, in the class below, I load a member object named "keep", which is a string naming some items to keep in the region. But I also want to parse this into a list and have the list stored as a member object too. And I don't want the user to have to give both the string and list versions of this parameter in the YAML.
My current workaround has been to override the __getattr__ function inside Region and make it create the defaults when a lookup fails to find them. But this is clunky and more complicated than needed just for initializing objects.
What convention am I misunderstanding here? Why doesn't the loadConfig method create the additional attributes not found in the YAML?
import yaml, pdb

class Region(yaml.YAMLObject):
    yaml_tag = u'!Region'

    def __init__(self, name, keep, drop):
        self.name = name
        self.keep = keep
        self.drop = drop
        self.keep_list = self.keep.split("+")
        self.drop_list = self.drop.split("+")
        self.pattern = "+".join(self.keep_list) + "-" + "-".join(self.drop_list)
    ###

    def loadConfig(self, yamlConfig):
        yml = yaml.load_all(file(yamlConfig))
        for data in yml:
            # These get created fine
            self.name = data["name"]
            self.keep = data["keep"]
            self.drop = data["drop"]
            # These do not get created.
            self.keep_list = self.keep.split("+")
            self.drop_list = self.drop.split("+")
            self.pattern = "+".join(self.keep_list) + "-" + "-".join(self.drop_list)
    ###
### End Region

if __name__ == "__main__":
    my_yaml = "/home/path/to/test.yaml"
    region_iterator = yaml.load_all(file(my_yaml))
    # Set a debug breakpoint to play with region_iterator and
    # confirm the extra stuff isn't created.
    pdb.set_trace()
And here is test.yaml so you can run all of this and see what I mean:
Regions:
    # Note: the string conventions below are for an
    # existing system. This is a shortened, representative
    # example.
    Market1:
        !Region
        name: USAndGB
        keep: US+GB
        drop: !!null
    Market2:
        !Region
        name: CanadaAndAustralia
        keep: CA+AU
        drop: !!null
And here, for example, is what it looks like for me when I run this in an IPython shell and explore the loaded object:
In [57]: %run "/home/espears/testWorkspace/testRegions.py"
--Return--
> /home/espears/testWorkspace/testRegions.py(38)<module>()->None
-> pdb.set_trace()
(Pdb) region_iterator
<generator object load_all at 0x1139d820>
(Pdb) tmp = region_iterator.next()
(Pdb) tmp
{'Regions': {'Market2': <__main__.Region object at 0x1f858550>, 'Market1': <__main__.Region object at 0x11a91e50>}}
(Pdb) us = tmp['Regions']['Market1']
(Pdb) us
<__main__.Region object at 0x11a91e50>
(Pdb) us.name
'USAndGB'
(Pdb) us.keep
'US+GB'
(Pdb) us.keep_list
*** AttributeError: 'Region' object has no attribute 'keep_list'
A pattern I have found useful when working with YAML for classes that are basically storage is to have the loader use the constructor, so that objects are created in the same way as when you make them normally. If I understand correctly what you are attempting to do, this kind of structure might be useful:
import inspect
import yaml
import numpy as np
from collections import OrderedDict

class Serializable(yaml.YAMLObject):
    __metaclass__ = yaml.YAMLObjectMetaclass

    @property
    def _dict(self):
        dump_dict = OrderedDict()
        for var in inspect.getargspec(self.__init__).args[1:]:
            if getattr(self, var, None) is not None:
                item = getattr(self, var)
                if isinstance(item, np.ndarray) and item.ndim == 1:
                    item = list(item)
                dump_dict[var] = item
        return dump_dict

    @classmethod
    def to_yaml(cls, dumper, data):
        return ordered_dump(dumper, '!{0}'.format(data.__class__.__name__),
                            data._dict)

    @classmethod
    def from_yaml(cls, loader, node):
        fields = loader.construct_mapping(node, deep=True)
        return cls(**fields)

def ordered_dump(dumper, tag, data):
    value = []
    node = yaml.nodes.MappingNode(tag, value)
    for key, item in data.iteritems():
        node_key = dumper.represent_data(key)
        node_value = dumper.represent_data(item)
        value.append((node_key, node_value))
    return node
You would then want to have your Region class inherit from Serializable and remove the loadConfig stuff. The code I posted inspects the constructor to see what data to save to the yaml file, and then, when loading a yaml file, calls the constructor with that same set of data. That way you just have to get the logic right in your constructor, and the yaml loading should get it for free.
That code was ripped from one of my projects; apologies in advance if it doesn't quite work. It is also slightly more complicated than it needs to be, because I wanted to control the order of the output by using OrderedDict. You could replace my ordered_dump function with a call to dumper.represent_dict.
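For example, a Region built on Serializable might look like this (my sketch, following the same Python 2 conventions as the answer; derived members are rebuilt in the constructor, so the YAML only needs the constructor arguments):

class Region(Serializable):
    yaml_tag = u'!Region'

    def __init__(self, name, keep, drop):
        self.name = name
        self.keep = keep
        self.drop = drop
        # derived members: recomputed on every load, never stored in YAML
        # (guards handle the "drop: !!null" entries from test.yaml)
        self.keep_list = keep.split("+") if keep else []
        self.drop_list = drop.split("+") if drop else []

regions = list(yaml.load_all(open("test.yaml")))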