Using a function call as a kwargs.get() default - Python

I have a factory function for a model with several foreign keys in my unit tests. I would like for that factory function to be variadic, allowing the user to specify the objects to use as foreign keys as keyword arguments, but calling the relevant factory function to spawn a new one for any that are left out.
I originally wrote something like:
def model_factory(i, **kwargs):
    """Create a new Model for testing"""
    test_model_data = {
        'fk1': kwargs.get('fk1', fk1_factory(i)),
        'fk2': kwargs.get('fk2', fk2_factory(i)),
        'fk3': kwargs.get('fk3', fk3_factory(i)),
    }
    return Model.objects.create(**test_model_data)
but this calls the fkN_factory() functions even when the keyword is present, because dict.get() evaluates its default argument eagerly; the side effects are interfering with my tests. My question is whether there is a simpler way to do what I intended here without lots of needless function calls, rather than what I have now, which is more like:
def model_factory(i, **kwargs):
    """Create a new Model for testing"""
    test_model_data = {
        'fk1': kwargs.get('fk1', None),
        'fk2': kwargs.get('fk2', None),
    }
    if test_model_data['fk1'] is None:
        test_model_data['fk1'] = fk1_factory(i)
    if test_model_data['fk2'] is None:
        test_model_data['fk2'] = fk2_factory(i)
    return Model.objects.create(**test_model_data)

You want to factor out that repeated code in some way. The simplest is:
def get_value(mapping, key, default_func, *args):
    try:
        return mapping[key]
    except KeyError:
        return default_func(*args)

# ...
test_model_data = {
    'fk1': get_value(kwargs, 'fk1', fk1_factory, i),
    'fk2': get_value(kwargs, 'fk2', fk2_factory, i),
    # etc.
}
Almost as simple as your original non-working version.
You could take this even farther:
def map_data(mapping, key_factory_map, *args):
    return {key: get_value(mapping, key, factory, *args)
            for key, factory in key_factory_map.items()}

# ...
test_model_data = map_data(kwargs, {
    'fk1': fk1_factory,
    'fk2': fk2_factory,
    # etc.
}, i)
But I'm not sure that's actually better. (If you have an obvious place to define that key-to-factory mapping out-of-line, it probably is; if not, probably not.)
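One more possibility, not from the original answer: if the set of foreign keys is fixed, a minimal sketch using explicit keyword arguments with a None sentinel keeps the laziness without any helper (fk1_factory and friends are the names from the question):

def model_factory(i, fk1=None, fk2=None, fk3=None):
    """Create a new Model for testing, spawning any FKs not supplied."""
    test_model_data = {
        'fk1': fk1 if fk1 is not None else fk1_factory(i),
        'fk2': fk2 if fk2 is not None else fk2_factory(i),
        'fk3': fk3 if fk3 is not None else fk3_factory(i),
    }
    return Model.objects.create(**test_model_data)

The conditional expressions only call the factories when the argument was omitted (or passed as None), so no needless objects are created.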


Pymongo Dynamically generate find query

I am trying to use the find_one operator to fetch results from MongoDB.
My document structure is as below:
{"_id":{"$oid":"600e6f592944ccc5790f1a9e"},
"user_id":"user_1",
"device_access":[
{"device_id":"DT002","access_type":"r"},
{"device_id":"DT007","access_type":"rm"},
{"device_id":"DT009","access_type":"rt"},
]
}
I have created my filter query as below:
filter={'user_id': 'user_1','device_access.device_id': 'DT002'},
{'device_access': {'$elemMatch': {'device_id': 'DT002'}}}
But Pymongo returns None, when used in a function as below:
# Model.py
# this function in turn calls the pymongo find_one function
def test(self):
    doc = self.__find(filter)
    print(doc)

def __find(self, key):
    device_document = self._db.get_single_data(COLLECTION_NAME, key)
    return device_document

# Database.py
def get_single_data(self, collection, key):
    db_collection = self._db[collection]
    document = db_collection.find_one(key)
    return document
Could you let me know what might be wrong here?
Your brackets are incorrect. Try:
filter = {'user_id': 'user_1','device_access.device_id': 'DT002', 'device_access': {'$elemMatch': {'device_id': 'DT002'}}}
Also, filter is a built-in function in Python, so you're better off using a different variable name.
Finally figured out why the above query was returning None.
The filter variable, as you can see, is comma-separated. Python internally treats this as a tuple of dictionary values, and when the filter variable is passed on to another function the second dictionary is treated as a separate argument, though no exception is ever thrown in this case.
To overcome this, I added *args to all the function calls it is passed through.
# Model.py
# this function in turn calls the pymongo find_one function
def test(self):
    doc = self.__find(filter)
    print(doc)

def __find(self, key, *args):
    device_document = self._db.get_single_data(COLLECTION_NAME, key, *args)
    return device_document

# Database.py
def get_single_data(self, collection, key, *args):
    db_collection = self._db[collection]
    document = db_collection.find_one(key, *args)
    return document
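For what it's worth, the second dictionary in the original tuple looks like a projection, and PyMongo's find_one() accepts a projection as its second argument, which is presumably why passing the tuple through *args works. A minimal sketch, assuming db_collection is a pymongo collection handle:

query = {'user_id': 'user_1', 'device_access.device_id': 'DT002'}
projection = {'device_access': {'$elemMatch': {'device_id': 'DT002'}}}
# Returns the matching document with only the matched device_access
# element included in the result.
doc = db_collection.find_one(query, projection)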

Django queryset interpolation for a previous / next instance function

Writing a DRY function that returns either the previous or the next instance of a given instance.
This function returns the previous instance:
def previous(instance):
    try:
        return Picture.objects.filter(id__lt=instance.id).first()
    except Picture.DoesNotExist:
        return instance
I want to create an abstracted function which returns either the previous or the next instance using an additional gt_or_lt argument. The problem lies in interpolating that argument into the filter(id__gt_or_lt).
def seek_instance(gt_or_lt, instance):
    try:
        return Picture.objects.filter(id__gt_or_lt=instance.id).first()
    except Picture.DoesNotExist:
        return instance
I've tried:
return Picture.objects.filter(id__gt_or_lt = instance.id).first()
seek_instance("gt", instance)
return Picture.objects.filter(id__f"{gt_or_lt}" = instance.id).first()
seek_instance("gt", instance)
return Picture.objects.filter(f"{gt_or_lt}" = instance.id).first()
return Picture.objects.filter(gt_or_lt = instance.id).first()
seek("id__gt", instance)
All fail with their respective errors.
Use a dictionary with kwargs expansion.
return Picture.objects.filter(**{f"id__{gt_or_lt}": instance.id})
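A quick sketch of how that could slot into the function from the question; the lookup key is built at runtime and unpacked into filter() as a keyword argument:

def seek_instance(gt_or_lt, instance):
    lookup = {f"id__{gt_or_lt}": instance.id}  # e.g. {'id__gt': 42}
    return Picture.objects.filter(**lookup).first()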
You can use dictionary expansion, as #DanielRoseman suggests. But that will still not necessarily yield the previous or next item. If, for example, the model has an ordering option [Django-doc], then it is possible that the ordering is different from the ordering on id. Furthermore, for the previous item, you need to reverse the ordering.
Depending on the situation, you might also want to prevent seek_instance from being given a different lookup, like 'in' for example.
We can thus use an if … elif … else here to branch on the item we wish to retrieve, and raise a ValueError for any other lookup:
def seek_instance(lt_or_gt, instance):
    try:
        if lt_or_gt == 'lt':
            return Picture.objects.filter(pk__lt=instance.pk).order_by('-pk').first()
        elif lt_or_gt == 'gt':
            return Picture.objects.filter(pk__gt=instance.pk).order_by('pk').first()
        else:
            raise ValueError("Should be 'lt' or 'gt'")
    except Picture.DoesNotExist:
        return instance
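Hypothetical usage, assuming an existing Picture instance pic; note that .first() returns None rather than raising Picture.DoesNotExist when there is no neighbour:

prev_pic = seek_instance('lt', pic)  # nearest lower pk, or None
next_pic = seek_instance('gt', pic)  # nearest higher pk, or None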

How do I create a dictionary that indexes the location of certain keys?

I have a class that inherits from dict.
my_subclassed_dict = SubclassedDictionary({
    "id": {
        "value1": 144,
        "value2": "steve",
        "more": {"id": 114}
    },
    "attributes": "random"
})
On initialization of SubclassedDictionary, I would like paths to be generated for keys whose values match a certain condition.
Hypothetically, if the condition were 'index all numbers above 100', my_subclassed_dict.get_paths() could then return some kind of structure resembling this:
[
    ['id', 'value1'],
    ['id', 'more', 'id']
]
In short, how can I subclass dict so that it generates paths for keys matching a certain condition on instantiation?
EDIT
Since someone asked, here is an example implementation. The problem with it is that it doesn't handle nested dictionaries.
class SubclassedDictionary(dict):
    paths = []

    def __init__(self, *args, **kwargs):
        self.update(*args, **kwargs)  # use the free update to set keys

    def update(self, *args, **kwargs):
        temp = args[0]
        for key, value in temp.items():
            if isinstance(value, int):
                if value > 100:
                    self.paths.append(key)
        super(SubclassedDictionary, self).update(*args, **kwargs)
dictionary = {
    "value1": 333,
    "v2": 99,
    "v2": 129,
    "v3": 30,
    "nested": {
        "nested_value": 1000
    }
}
new_dict = SubclassedDictionary(dictionary)
print(new_dict.paths)  # outputs: ['v2', 'value1']
If it worked as intended,
print(new_dict.paths)
would output:
[
    ['v2'],
    ['value1'],
    ['nested', 'nested_value']
]
From what I understand, you want a dictionary that is capable of returning the keys of dictionaries within dictionaries if the values the keys are associated with match a certain condition.
class SubclassedDictionary(dict):
    def __init__(self, new_dict, condition=None, *args, **kwargs):
        super(SubclassedDictionary, self).__init__(new_dict, *args, **kwargs)
        self.paths = []
        self.get_paths(condition)

    def _get_paths_recursive(self, condition, iterable, parent_path=[]):
        for key, value in iterable.iteritems():
            # If we find a nested dictionary, recursively obtain new paths,
            # remembering where we have been (parent_path + [key]).
            if isinstance(value, dict):
                self._get_paths_recursive(condition, value, parent_path + [key])
            elif condition(value) is True:
                self.paths.append(parent_path + [key])

    def get_paths(self, condition=None):
        # condition must be a function that takes a value and returns a bool
        self.paths = []
        if condition is not None:
            return self._get_paths_recursive(condition, self)
def my_condition(value):
    try:
        return int(value) > 100
    except ValueError:
        return False

my_dict = SubclassedDictionary({"id": {"value1": 144,
                                       "value2": "steve",
                                       "more": {"id": 114}},
                                "attributes": "random"},
                               condition=my_condition)

print my_dict.paths  # Returns [['id', 'value1'], ['id', 'more', 'id']]
There are a few benefits to this implementation. One is that you can change your condition whenever you want; in your question it sounded like that may be a feature you were interested in. If you want a different condition, you can easily write a new function and pass it into the constructor of the class, or simply call get_paths() with your new condition, as shown below.
When developing a recursive algorithm there are 3 things you should take into account.
1) What is my stopping condition? In this case your literal condition is not actually your stopping condition: the recursion stops when there are no more elements to iterate through.
2) Create a non-recursive wrapper function. This is important for two reasons (I'll get to the second later). The first is that it is a safe way to encapsulate functionality that you don't want consumers to use. In this case _get_paths_recursive() has extra parameters that could ruin your paths attribute if a consumer got hold of them.
3) Do as much error handling as possible before recursing (the second reason for two functions). More often than not, when writing a recursive algorithm, you have to do something before you start recursing. In this case I make sure the condition parameter is valid (I could add more checking to make sure it's a function that accepts one parameter and returns a bool). I also reset the paths attribute so you don't end up with a crazy number of paths if get_paths() is called more than once.
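For example, a hypothetical second condition reusing the same instance (the function name and output here are illustrative):

def is_long_string(value):
    # Hypothetical replacement condition: match strings longer than 5 chars.
    return isinstance(value, str) and len(value) > 5

my_dict.get_paths(is_long_string)
print my_dict.paths  # e.g. [['attributes']]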
The minimal change is something like:
class SubclassedDictionary(dict):
    def __init__(self, *args, **kwargs):
        self.paths = []  # note instance, not class, attribute
        self.update(*args, **kwargs)  # use the free update to set keys

    def update(self, *args, **kwargs):
        temp = args[0]
        for key, value in temp.items():
            if isinstance(value, int):
                if value > 100:
                    self.paths.append([key])  # note adding a list to the list
            # recursively handle nested dictionaries
            elif isinstance(value, dict):
                for path in SubclassedDictionary(value).paths:
                    self.paths.append([key] + path)
        super(SubclassedDictionary, self).update(*args, **kwargs)
Which gives the output you're looking for:
>>> SubclassedDictionary(dictionary).paths
[['v2'], ['value1'], ['nested', 'nested_value']]
However, a neater method might be to make paths a method, and create nested SubclassedDictionary instances instead of dictionaries, which also allows you to specify the rule when calling rather than hard-coding it. For example:
class SubclassedDictionary(dict):
    def __init__(self, *args, **kwargs):
        self.update(*args, **kwargs)  # use the free update to set keys

    def update(self, *args, **kwargs):
        temp = args[0]
        for key, value in temp.items():
            if isinstance(value, dict):
                temp[key] = SubclassedDictionary(value)
        super(SubclassedDictionary, self).update(*args, **kwargs)

    def paths(self, rule):
        matching_paths = []
        for key, value in self.items():
            if isinstance(value, SubclassedDictionary):
                for path in value.paths(rule):
                    matching_paths.append([key] + path)
            elif rule(value):
                matching_paths.append([key])
        return matching_paths
In use, to get the paths of all integers larger than 100:
>>> SubclassedDictionary(dictionary).paths(lambda val: isinstance(val, int) and val > 100)
[['v2'], ['value1'], ['nested', 'nested_value']]
One downside is that this recreates the path list every time it's called.
It's worth noting that you don't currently handle the kwargs correctly (so neither does my code!); have a look at e.g. Overriding dict.update() method in subclass to prevent overwriting dict keys where I've provided an answer that shows how to implement an interface that matches the basic dict's. Another issue that your current code has is that it doesn't deal with keys subsequently being removed from the dictionary; my first snippet doesn't either, but as the second rebuilds the path list each time it's not a problem there.

Formencode OneOf validator with dynamic list to test against

I'm using formencode 1.3.0a1 (and TurboGears 2.3.4) and have run into a problem with the OneOf validator.
I want to validate some input according to a list in the database.
Here is my validation schema and the method for getting the list:
from formencode import Schema, validators

def getActiveCodes():
    codes = DBSession.query(SomeObject.code).all()
    codes = [str(x[0]) for x in codes]
    return codes

class itemsEditSchema(Schema):
    code = validators.OneOf(getActiveCodes())
    allow_extra_fields = True
The method "getActiveCodes" is executed just once, when the class body is evaluated at import time, not on every validation.
I need it to run every time I check my user input for "code". How can I do that?
Thanks for the help
I don't know any way to make formencode do what you ask. However, since this is Python, there are few limits to what we can do.
You can solve this by wrapping the call to getActiveCodes in a purpose-built class. The wrapper class, RefreshBeforeContainsCheck, will implement the special methods __iter__ and __contains__ to provide the necessary interface to be used as an iterable object:
from formencode import Schema, validators, Invalid

class RefreshBeforeContainsCheck(object):
    def __init__(self, func):
        self._func = func
        self._current_list = None

    def __iter__(self):
        print '__iter__ was called.'
        # return iter(self._func())  # Could have refreshed here too, but ...
        return iter(self._current_list)

    def __contains__(self, item):
        print '__contains__ was called.'
        self._current_list = self._func()  # Refresh list.
        return item in self._current_list
I've added print statements to make it clearer how it behaves during run-time. The RefreshBeforeContainsCheck class can then be used like
class ItemsEditSchema(Schema):
    code = validators.OneOf(RefreshBeforeContainsCheck(getActiveCodes))
    allow_extra_fields = True
in the validator schema.
The way it is implemented above, the getActiveCodes function will be called each time the OneOf validator performs an `item in list` test (where our class acts as the list), because that causes RefreshBeforeContainsCheck.__contains__ to be called. Now, if the validation fails, the OneOf validator generates an error message listing all the elements of the list; that case is handled by our __iter__ implementation. To avoid calling the database twice in case of validation errors, I've chosen to cache the "database" result list as self._current_list, but whether or not that's appropriate depends on your needs.
I've created a gist for this: https://gist.github.com/mtr/9719d08f1bbace9ebdf6, basically creating an example of using the above code with the following code.
def getActiveCodes():
    # This function could have performed a database lookup.
    print 'getActivityCodes() was called.'
    codes = map(str, [1, 2, 3, 4])
    return codes

def validate_input(schema, value):
    print 'Validating: value: {!r}'.format(value)
    try:
        schema.to_python(value)
    except Invalid, error:
        print 'error: {!r}'.format(error)
    print

def main():
    schema = ItemsEditSchema()
    validate_input(schema, {'code': '3'})
    validate_input(schema, {'code': '4'})
    validate_input(schema, {'code': '5'})
The output of the gist is:
Validating: value: {'code': '3'}
__contains__ was called.
getActivityCodes() was called.
Validating: value: {'code': '4'}
__contains__ was called.
getActivityCodes() was called.
Validating: value: {'code': '5'}
__contains__ was called.
getActivityCodes() was called.
__iter__ was called.
error: Invalid("code: Value must be one of: 1; 2; 3; 4 (not '5')",
{'code': '5'}, None, None,
{'code': Invalid(u"Value must be one of: 1; 2; 3; 4 (not '5')",
'5', None, None, None)})
In the end I wrote a FancyValidator instead of using OneOf; here is my code:
class codeCheck(FancyValidator):
    def to_python(self, value, state=None):
        if value is None:
            raise Invalid('missing a value', value, state)
        return super(codeCheck, self).to_python(value, state)

    def _validate_python(self, value, state):
        # query just the code column so the membership test compares strings
        codes = [str(c[0]) for c in DBSession.query(Code.code).all()]
        if value not in codes:
            raise Invalid('wrong code', value, state)
        return value

class itemsEditSchema(Schema):
    code = codeCheck()
    allow_extra_fields = True

Django: What is a clean way to populate a list of optional object attributes that need to be wrapped in try/except? Is this a good place for eval()?

I repeatedly find myself in a position where I'm writing specific Django model instance fields into a list for various reasons (exporting to CSV, logging), and I'd imagine the same is true for many other people.
Generating a report could require traversing through foreign keys IF they exist. The larger the report, the more unreadable the code gets as I wrap attribute-getter attempts in try/except blocks.
Optional foreign keys are also a problem: item.optional_fk.optional_date.method()
for item in django_model_instances:
    try:
        date_created = item.order.date_created.strftime('%Y/%m/%d')
    except AttributeError:
        date_created = ''
    try:
        date_complete = item.order.date_complete.strftime('%Y/%m/%d')
    except AttributeError:
        date_complete = ''
    # perhaps more try/except...
    writer.writerow([
        item.optional_fk.optional_field.strftime('%Y'),
        item.optional_fk.method(),
        item.bar,
        date_created,
        # other attributes...
        date_complete,
        # other attributes...
    ])
As you add more columns, the code starts to look like a monster.
I like the readability of using eval() wrapped in try/except, but I've read that I should avoid eval like the plague:
Is using eval in Python a bad practice?
There is almost always a better way to do it - trying to find a better way without writing too much code :)
Very dangerous and insecure - the strings are hard-coded
Makes debugging difficult - true
Slow - this code is for generating reports, it can be slow.
def no_exceptions_getter(item, statement):
    try:
        return eval(statement)
    except AttributeError, e:
        log.debug(e)
        return ''

for item in django_model_instances:
    writer.writerow([no_exceptions_getter(item, x) for x in (
        'item.foo',
        'item.bar',
        'item.date_created.strftime("%Y/%m/%d")',
        'item.date_complete.strftime("%Y/%m/%d")',
        'item.optional_foreign_key.foo',
        # more items in a readable list format
    )])
I don't see the security vulnerability, speed or debugging problems being an issue.
So my question for you experts out there is: is this an OK use of eval?
Why aren't you just using getattr?
for item in django_model_instances:
    date_created = getattr(item.order, 'date_created', '')
    if date_created:
        date_created = date_created.strftime('%Y/%m/%d')
or a simple wrapper, if this particular pattern is used a lot:
def get_strftime(object, attr):
    value = getattr(object, attr, None)
    if value is None:
        return ''
    return value.strftime('%Y/%m/%d')

writer.writerow([
    item.foo,
    item.bar,
    get_strftime(item.order, 'date_created'),
    get_strftime(item.order, 'date_complete'),
])
I assume in your example that date_created might not exist because you're not always passing the same model class to your loop. The solution may then be to put the logic in the object via the model class definition, and write a 'getrow' method for any classes you might want to do this to. Classes that have a 'date_created' field return it; others return '' or None. A sketch follows.
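A hedged sketch of that idea; the Order model and field here are illustrative, not from the original post:

from django.db import models

class Order(models.Model):
    date_created = models.DateTimeField(null=True, blank=True)

    def getrow(self):
        # Each model renders its own CSV row; missing values become
        # empty strings instead of raising AttributeError in the caller.
        created = self.date_created.strftime('%Y/%m/%d') if self.date_created else ''
        return [created]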
How about
def getattrchain(obj, attrStr, fnArgs=None, defaultResult=''):
    try:
        for attr in attrStr.split('.'):
            obj = getattr(obj, attr)
    except AttributeError:
        return defaultResult
    if callable(obj) and fnArgs is not None:
        return obj(*fnArgs)
    else:
        return obj

for item in django_model_instances:
    writer.writerow([getattrchain(item, x, args) for x, args in (
        ('foo', None),
        ('bar', None),
        ('date_created.strftime', [r'%Y/%m/%d']),
        ('date_complete.strftime', [r'%Y/%m/%d']),
        ('optional_foreign_key.foo', None),
    )])
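For plain attribute chains (no method calls), operator.attrgetter from the standard library does the same dotted traversal in one step; a small sketch of wrapping it with a default (the helper name safe_attrgetter is made up for illustration):

from operator import attrgetter

def safe_attrgetter(obj, dotted_path, default=''):
    # attrgetter('a.b.c')(obj) walks obj.a.b.c in one call; wrap it so a
    # missing attribute anywhere in the chain yields `default` instead.
    try:
        return attrgetter(dotted_path)(obj)
    except AttributeError:
        return default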
