Creating a custom varbinds dictionary using pysnmp - python

I am using sendVarBinds from pysnmp to send traps with custom MIBs, e.g.:
def sendTrap(self, trapName, trapObjects):
    mibViewController = view.MibViewController(self.snmpEngine.getMibBuilder())
    ntfOrg = ntforg.NotificationOriginator()
    ntfOrg.snmpContext = self.snmpContext
    ntfOrg.sendVarBinds(
        self.snmpEngine,
        'my-notification',  # notification targets
        None, '',           # contextEngineId, contextName
        rfc1902.NotificationType(
            trapName,
            objects=trapObjects  # <-- how to construct this?
        ).resolveWithMib(mibViewController)
    )
The NotificationType requires a dictionary of varbinds, with object key/value pairs.
I would like to populate the dictionary's keys and values from a JSON object. Right now I am doing it like this for each trap:
def function1():
    trapName = rfc1902.ObjectIdentity('MY-MIB', 'ServerPart')
    trapObjects = {
        ('MY-MIB', 'ServerIP'): event.get_ServerIP(),
        ('MY-MIB', 'ServerPartName'): event.get_ServerPartName(),
        ('MY-MIB', 'ServerState'): event.get_ServerState()
    }

def function2():
    trapName = rfc1902.ObjectIdentity('MY-MIB', 'ServerMemory')
    trapObjects = {
        ('MY-MIB', 'ServerIP'): event.get_ServerIP(),
        ('MY-MIB', 'ServerMemoryAv'): event.get_ServerMemoryAv()
    }
But I want a single function that creates the traps without hard-coding the varbinds for each trap type, filling them from the JSON object instead; say, by running a loop over the dictionary's items().
My flow is like this.
Client --- event JSON object ----> Trap daemon ---- traps -----> Manager
Here are examples of the events that come from the client:
my_event('ServerPart', '10.22.1.1', '44', '1', host, port)
my_event('ServerMemory', '10.22.1.1', '100MB', host, port)
Is there any example which can be followed?

Each notification type requires its own set of managed objects (trapObjects) to be included with the notification. I suppose the mapping of event name -> notification type is 1-to-1, and you can reconstruct the event field getter from the field name.
Then what's remaining is to map event and its parameters to notification type and managed objects:
eventParamsToManagedObjectsMap = {
    'ServerPart': {                  # <- notification type
        'ServerIP': 'get_ServerIP',  # <- trap objects
        # e.g. to populate ServerIP,
        # call get_ServerIP()
        # ...
    }
    ...
}
Finally, to build such map you could try something along these lines:
from collections import defaultdict

notificationType = 'ServerMemory'  # on input you should have all
                                   # event or notification types

trapObjectIdentity = rfc1902.ObjectIdentity('MY-MIB', notificationType)
trapObjectIdentity.resolveWithMib(mibViewController)

trapObjectNames = trapObjectIdentity.getMibNode().getObjects()

eventParamsToManagedObjectsMap = defaultdict(dict)

for mibName, mibObjectName in trapObjectNames:
    eventParamsToManagedObjectsMap[notificationType][mibObjectName] = 'get_' + mibObjectName
Once you have this map, you should be able to build notifications out of events uniformly within a single function.
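A minimal sketch of such a function, assuming the map above has been built and the event object exposes the get_* accessors from the question (names here are illustrative, not a definitive implementation):
def send_event_trap(self, event, notification_type):
    # Look up which managed objects this notification type carries
    getters = eventParamsToManagedObjectsMap[notification_type]
    trapObjects = {
        ('MY-MIB', object_name): getattr(event, getter_name)()
        for object_name, getter_name in getters.items()
    }
    trapName = rfc1902.ObjectIdentity('MY-MIB', notification_type)
    self.sendTrap(trapName, trapObjects)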

I guess you can try something like this (untested):
def get_notification(trapName, **kwargs):
    return rfc1902.NotificationType(
        trapName,
        objects={('MY-MIB', k): v for k, v in kwargs.items()}
    )
Assuming json_data is already a Python dict (if it is a string, call json.loads(json_data) to parse it), you can call it like this:
ntfOrg.sendVarBinds(
    self.snmpEngine,
    'my-notification',  # notification targets
    None, '',           # contextEngineId, contextName
    get_notification(**json_data).resolveWithMib(mibViewController)
)
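For illustration, a hypothetical json_data that would satisfy this call (the trapName key binds to the first positional parameter, and the rest become varbind objects):
json_data = {
    'trapName': rfc1902.ObjectIdentity('MY-MIB', 'ServerMemory'),
    'ServerIP': '10.22.1.1',
    'ServerMemoryAv': '100MB',
}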

Clone Kubernetes objects programmatically using the Python API

The Python API is available to read objects from a cluster. By cloning, we mean:
1. Get a copy of an existing Kubernetes object using kubectl get
2. Change the properties of the object
3. Apply the new object
The --export option of kubectl get, which helped with step 1, was deprecated in 1.14. How can we use the Python Kubernetes API to do steps 1-3 described above?
There are multiple questions about how to extract the code from Python API to YAML, but it's unclear how to transform the Kubernetes API object.
Just use to_dict(), which is now offered by Kubernetes client objects. Note that it creates only a partly deep copy. So to be safe:
copied_obj = copy.deepcopy(obj.to_dict())
Dicts can be passed to create* and patch* methods.
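Putting steps 1-3 together, a minimal sketch (object names here are hypothetical, and per the statement above the dict copy is handed straight to a create* method):
import copy
from kubernetes import client, config

config.load_kube_config()
apps = client.AppsV1Api()

# 1. get a copy of an existing object
obj = apps.read_namespaced_deployment("source-deploy", "default")
copied_obj = copy.deepcopy(obj.to_dict())

# 2. change its properties
copied_obj["metadata"] = {"name": "cloned-deploy", "namespace": "default"}
copied_obj.pop("status", None)

# 3. apply the new object
apps.create_namespaced_deployment("default", copied_obj)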
For convenience, you can also wrap the dict in Prodict.
copied_obj = Prodict.from_dict(copy.deepcopy(obj.to_dict()))
The final issue is getting rid of superfluous fields. (Unfortunately, Kubernetes sprinkles them throughout the object.) I use kopf's internal facility for getting the "essence" of an object. (It takes care of the deep copy.)
copied_obj = kopf.AnnotationsDiffBaseStorage().build(body=kopf.Body(obj.to_dict()))
copied_obj = Prodict.from_dict(copied_obj)
After looking at the requirement, I spent a couple of hours researching the Kubernetes Python API. Issue 340 and others ask how to transform a Kubernetes API object into a dict, and the only workaround I found was to retrieve the raw data and then convert it to JSON.
The following code uses the Kubernetes API to get a deployment and its related HPA from namespaced objects, retrieving their raw values as JSON.
Then, after transforming the data into a dict, you can optionally clean it up by removing null references.
Once you are done, you can dump the dict as a YAML payload and save it to the file system.
Finally, you can apply it either with kubectl or with the Kubernetes Python API.
Note:
Make sure to set KUBECONFIG=config so that you can point to a cluster
Make sure to adjust the values of origin_obj_name = "istio-ingressgateway" and origin_obj_namespace = "istio-system" with the name of the corresponding objects to be cloned in the given namespace.
import os
import logging
import json

import yaml
import crayons

from kubernetes import client, config
from kubernetes.client.rest import ApiException

logging.basicConfig(level=logging.INFO)
LOGGER = logging.getLogger(" IngressGatewayCreator ")


class IngressGatewayCreator:

    @staticmethod
    def clone_default_ingress(clone_context):
        # Clone the deployment
        IngressGatewayCreator.clone_deployment_object(clone_context)

        # Clone the deployment's HPA
        IngressGatewayCreator.clone_hpa_object(clone_context)

    @staticmethod
    def clone_deployment_object(clone_context):
        kubeconfig = os.getenv('KUBECONFIG')
        config.load_kube_config(kubeconfig)
        v1apps = client.AppsV1beta1Api()

        deployment_name = clone_context.origin_obj_name
        namespace = clone_context.origin_obj_namespace

        try:
            # gets an instance of the api without deserialization to model
            # https://github.com/kubernetes-client/python/issues/574#issuecomment-405400414
            deployment = v1apps.read_namespaced_deployment(deployment_name, namespace,
                                                           _preload_content=False)
        except ApiException as error:
            if error.status == 404:
                LOGGER.info("Deployment %s not found in namespace %s", deployment_name, namespace)
                return
            raise

        # Clone the deployment object as a dict
        cloned_dict = IngressGatewayCreator.clone_k8s_object(deployment, clone_context)

        # Change additional objects
        cloned_dict["spec"]["selector"]["matchLabels"]["istio"] = clone_context.name
        cloned_dict["spec"]["template"]["metadata"]["labels"]["istio"] = clone_context.name

        # Save the deployment template in the output dir
        clone_context.save_clone_as_yaml(cloned_dict, "deployment")

    @staticmethod
    def clone_hpa_object(clone_context):
        kubeconfig = os.getenv('KUBECONFIG')
        config.load_kube_config(kubeconfig)
        hpas = client.AutoscalingV1Api()

        hpa_name = clone_context.origin_obj_name
        namespace = clone_context.origin_obj_namespace

        try:
            # gets an instance of the api without deserialization to model
            # https://github.com/kubernetes-client/python/issues/574#issuecomment-405400414
            hpa = hpas.read_namespaced_horizontal_pod_autoscaler(hpa_name, namespace,
                                                                 _preload_content=False)
        except ApiException as error:
            if error.status == 404:
                LOGGER.info("HPA %s not found in namespace %s", hpa_name, namespace)
                return
            raise

        # Clone the HPA object as a dict
        cloned_dict = IngressGatewayCreator.clone_k8s_object(hpa, clone_context)

        # Change additional objects
        cloned_dict["spec"]["scaleTargetRef"]["name"] = clone_context.name

        # Save the HPA template in the output dir
        clone_context.save_clone_as_yaml(cloned_dict, "hpa")

    @staticmethod
    def clone_k8s_object(k8s_object, clone_context):
        # Manipulate at the dict level, not via the k8s api, from the fetched raw object
        # https://github.com/kubernetes-client/python/issues/574#issuecomment-405400414
        cloned_obj = json.loads(k8s_object.data)

        labels = cloned_obj['metadata']['labels']
        labels['istio'] = clone_context.name

        cloned_obj['status'] = None

        # Scrub by removing the "null" and "None" values
        cloned_obj = IngressGatewayCreator.scrub_dict(cloned_obj)

        # Patch the metadata with the name and labels adjusted
        cloned_obj['metadata'] = {
            "name": clone_context.name,
            "namespace": clone_context.origin_obj_namespace,
            "labels": labels
        }

        return cloned_obj

    # https://stackoverflow.com/questions/12118695/efficient-way-to-remove-keys-with-empty-strings-from-a-dict/59959570#59959570
    @staticmethod
    def scrub_dict(d):
        new_dict = {}
        for k, v in d.items():
            if isinstance(v, dict):
                v = IngressGatewayCreator.scrub_dict(v)
            if isinstance(v, list):
                v = IngressGatewayCreator.scrub_list(v)
            if v not in (u'', None, {}):
                new_dict[k] = v
        return new_dict

    # https://stackoverflow.com/questions/12118695/efficient-way-to-remove-keys-with-empty-strings-from-a-dict/59959570#59959570
    @staticmethod
    def scrub_list(d):
        scrubbed_list = []
        for i in d:
            if isinstance(i, dict):
                i = IngressGatewayCreator.scrub_dict(i)
            scrubbed_list.append(i)
        return scrubbed_list


class IngressGatewayContext:
    def __init__(self, manifest_dir, name, hostname, nats, type):
        self.manifest_dir = manifest_dir
        self.name = name
        self.hostname = hostname
        self.nats = nats
        self.ingress_type = type
        self.origin_obj_name = "istio-ingressgateway"
        self.origin_obj_namespace = "istio-system"

    def save_clone_as_yaml(self, k8s_object, kind):
        try:
            # Just try to create the dir; it may already exist
            os.makedirs(self.manifest_dir)
        except FileExistsError:
            LOGGER.debug("Dir already exists %s", self.manifest_dir)

        full_file_path = os.path.join(self.manifest_dir, self.name + '-' + kind + '.yaml')

        # Store in the file system with the name provided
        # https://stackoverflow.com/questions/12470665/how-can-i-write-data-in-yaml-format-in-a-file/18210750#18210750
        with open(full_file_path, 'w') as yaml_file:
            yaml.dump(k8s_object, yaml_file, default_flow_style=False)

        LOGGER.info(crayons.yellow("Saved %s '%s' at %s: \n%s"), kind, self.name, full_file_path, k8s_object)


try:
    k8s_clone_name = "http2-ingressgateway"
    hostname = "my-nlb-awesome.a.company.com"
    nats = ["123.345.678.11", "333.444.222.111", "33.221.444.23"]
    manifest_dir = "out/clones"

    context = IngressGatewayContext(manifest_dir, k8s_clone_name, hostname, nats, "nlb")
    IngressGatewayCreator.clone_default_ingress(context)
except Exception as err:
    print("ERROR: {}".format(err))
Not Python, but I've used jq in the past to quickly clone something with the small customisations required for each use case (usually cloning secrets into a new namespace).
kc get pod whatever-85pmk -o json \
| jq 'del(.status, .metadata ) | .metadata.name="newname"' \
| kc apply -f - -o yaml --dry-run
This is really easy to do with Hikaru.
Here is an example from my own open source project:
def duplicate_without_fields(obj: HikaruBase, omitted_fields: List[str]):
    """
    Duplicate a hikaru object, omitting the specified fields

    This is useful when you want to compare two versions of an object and first
    "clean up" fields that shouldn't be compared.

    :param HikaruBase obj: A kubernetes object
    :param List[str] omitted_fields: List of fields to be omitted. Field name format
        should be '.' separated. For example: ["status", "metadata.generation"]
    """
    if obj is None:
        return None

    duplication = obj.dup()
    for field_name in omitted_fields:
        field_parts = field_name.split(".")
        try:
            if len(field_parts) > 1:
                parent_obj = duplication.object_at_path(field_parts[:-1])
            else:
                parent_obj = duplication
            setattr(parent_obj, field_parts[-1], None)
        except Exception:
            pass  # in case the field doesn't exist on this object
    return duplication
Dumping the object to YAML afterwards or re-applying it to the cluster is trivial with Hikaru.
We're using this to clean up objects so that we can show users a GitHub-style diff when objects change, without spammy fields like generation that change often.
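A minimal sketch of that last step, assuming a hikaru-loaded deployment object and hikaru's get_yaml helper:
from hikaru import get_yaml

cleaned = duplicate_without_fields(deployment, ["status", "metadata.generation"])
print(get_yaml(cleaned))  # dump the cleaned clone as YAML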

Python search-replace with multiple JSON objects

I wasn't sure how to search for this, but I am trying to make a script that dynamically launches programs. I will have a couple of JSON files, and I want to be able to do a search-and-replace sort of thing.
So I'll set up an example:
config.json
{
    "global_vars": {
        "BASEDIR": "/app",
        "CONFIG_DIR": "{BASEDIR}/config",
        "LOG_DIR": "{BASEDIR}/log",
        "CONFIG_ARCHIVE_DIR": "{CONFIG_DIR}/archive"
    }
}
Then process.json
{
    "name": "Dummy_Process",
    "binary": "java",
    "executable": "DummyProcess-0.1.0.jar",
    "launch_args": "-Dspring.config.location={CONFIG_DIR}/application.yml -Dlogging.config={CONFIG_DIR}/logback-spring.xml -jar {executable}",
    "startup_log": "{LOG_DIR}/startup_{name}.out"
}
Now I want to be able to load both of these JSON objects and use the values there for substitution, so that "CONFIG_ARCHIVE_DIR": "{CONFIG_DIR}/archive" becomes "CONFIG_ARCHIVE_DIR": "/app/config/archive".
Does anyone know a good way to do this recursively? I'm running into issues with something like CONFIG_DIR, which requires BASEDIR to be resolved first.
I have this function that loads all the data:
# Recursive function; loops and loads all values into data
def _load_data(data, obj):
    for i in obj.keys():
        if isinstance(obj[i], str):
            data[i] = obj[i]
        if isinstance(obj[i], dict):
            data = _load_data(data, obj[i])
    return data
Then I have this function:
def _update_data(data, data_str=""):
    if not data_str:
        data_str = json.dumps(data)
    for i in data.keys():
        if isinstance(data[i], str):
            data_str = data_str.replace("{" + i + "}", data[i])
        if isinstance(data[i], dict):
            data = _update_data(data, data_str)
    return json.loads(data_str)
So this works for one level, but I don't know if this is the best way to do it. It stops working when I hit a case like CONFIG_DIR, because the data would need to be looped over multiple times: first to update BASEDIR, then once more to update CONFIG_DIR. Suggestions welcome.
The end goal of this script is to create a start/stop/status script to manage all of our binaries. They all use different binaries to start, and I want one processes file for multiple servers. Each process will have a servers array telling the start/stop script what to run on a given server. Maybe something like this already exists; if so, please point me in its direction.
I will be running on Linux and prefer to use Python. I want something smart and easy for someone else to pickup and use/modify.
I made something that works with the example files you provided. Note that I didn't handle multiple keys or non-dictionaries in the data. This function accepts a list of the dictionaries obtained after JSON parsing your input files. It uses the fact that re.sub can accept a function for the replacement value and calls that function with each match. I am sure there are plenty of improvements that could be made to this, but it should get you started at least.
import re

def make_config(configs):
    replacements = {}

    def find_defs(config):
        # Find leaf nodes of the dictionary.
        defs = {}
        for k, v in config.items():
            if isinstance(v, dict):
                # Nested dictionary, so recurse.
                defs.update(find_defs(v))
            else:
                defs[k] = v
        return defs

    for config in configs:
        replacements.update(find_defs(config))

    def make_replacement(m):
        # Construct the replacement string.
        name = m.group(0).strip('{}')
        if name in replacements:
            # Replace replacement strings in the replacement string.
            new = re.sub(r'\{[^}]+\}', make_replacement, replacements[name])
            # Cache the result.
            replacements[name] = new
            return new
        raise Exception('Replacement string for {} not found'.format(name))

    finalconfig = {}
    for name, value in replacements.items():
        finalconfig[name] = re.sub(r'\{[^}]+\}', make_replacement, value)
    return finalconfig
With this input:
[
    {
        "global_vars": {
            "BASEDIR": "/app",
            "CONFIG_DIR": "{BASEDIR}/config",
            "LOG_DIR": "{BASEDIR}/log",
            "CONFIG_ARCHIVE_DIR": "{CONFIG_DIR}/archive"
        }
    },
    {
        "name": "Dummy_Process",
        "binary": "java",
        "executable": "DummyProcess-0.1.0.jar",
        "launch_args": "-Dspring.config.location={CONFIG_DIR}/application.yml -Dlogging.config={CONFIG_DIR}/logback-spring.xml -jar {executable}",
        "startup_log": "{LOG_DIR}/startup_{name}.out"
    }
]
It gives this output:
{
    'BASEDIR': '/app',
    'CONFIG_ARCHIVE_DIR': '/app/config/archive',
    'CONFIG_DIR': '/app/config',
    'LOG_DIR': '/app/log',
    'binary': 'java',
    'executable': 'DummyProcess-0.1.0.jar',
    'launch_args': '-Dspring.config.location=/app/config/application.yml -Dlogging.config=/app/config/logback-spring.xml -jar DummyProcess-0.1.0.jar',
    'name': 'Dummy_Process',
    'startup_log': '/app/log/startup_Dummy_Process.out'
}
As an alternative to the answer by @FamousJameous, and if you don't mind changing to ini format, you can also use the Python built-in configparser, which already has support for expanding variables.
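A minimal sketch of that approach, assuming the global_vars move into an ini section (note that configparser's ExtendedInterpolation uses ${...} rather than {...} syntax):
import configparser

ini = """
[global_vars]
BASEDIR = /app
CONFIG_DIR = ${BASEDIR}/config
LOG_DIR = ${BASEDIR}/log
CONFIG_ARCHIVE_DIR = ${CONFIG_DIR}/archive
"""

parser = configparser.ConfigParser(interpolation=configparser.ExtendedInterpolation())
parser.read_string(ini)
print(parser['global_vars']['CONFIG_ARCHIVE_DIR'])  # -> /app/config/archive
Interpolation resolves references transitively, so the BASEDIR-before-CONFIG_DIR ordering problem disappears.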
I implemented a solution with a class (Config) with a couple of functions:
_load: simply convert from JSON to a Python object;
_extract_params: loop over the document (output of _load) and add the parameters to a class attribute (self.params);
_loop: loop over the object returned from _extract_params and, if a value contains any {param}, call the _transform method;
_transform: replace the {param} in the values with the correct values; if there is any '{' in the value linked to the param being replaced, call the function again.
I hope I was clear enough; here is the code:
import json
import re

config = """{
    "global_vars": {
        "BASEDIR": "/app",
        "CONFIG_DIR": "{BASEDIR}/config",
        "LOG_DIR": "{BASEDIR}/log",
        "CONFIG_ARCHIVE_DIR": "{CONFIG_DIR}/archive"
    }
}"""

process = """{
    "name": "Dummy_Process",
    "binary": "java",
    "executable": "DummyProcess-0.1.0.jar",
    "launch_args": "-Dspring.config.location={CONFIG_DIR}/application.yml -Dlogging.config={CONFIG_DIR}/logback-spring.xml -jar {executable}",
    "startup_log": "{LOG_DIR}/startup_{name}.out"
}
"""

class Config(object):
    def __init__(self, documents):
        self.documents = documents
        self.params = {}
        self.output = {}

    # Loads JSON into a dictionary
    def _load(self, document):
        obj = json.loads(document)
        return obj

    # Extracts the config parameters into a dictionary
    def _extract_params(self, document):
        for k, v in document.items():
            if isinstance(v, dict):
                # Recursion for inner dictionaries
                self._extract_params(v)
            else:
                # if not a dict, set params[k] to v
                self.params[k] = v
        return self.params

    # Loop over the params dictionary
    def _loop(self, params):
        for key, value in params.items():
            # if there is any parameter inside the value
            if len(re.findall(r'{([^}]*)\}', value)) > 0:
                findings = re.findall(r'{([^}]*)\}', value)
                # call the transform function
                self._transform(params, key, findings)
        return self.output

    # Replace all the findings with the correct value
    def _transform(self, object, key, findings):
        # Iterate over the found params
        for finding in findings:
            # if { -> recurse to resolve all the needed values first
            if '{' in object[finding]:
                self._transform(object, finding, re.findall(r'{([^}]*)\}', object[finding]))
            # Do the actual replace
            object[key] = object[key].replace('{' + finding + '}', object[finding])
        self.output = object
        return self.output

    # Entry point
    def process_document(self):
        params = {}
        # _load the documents and extract the params
        for document in self.documents:
            params.update(self._extract_params(self._load(document)))
        # _loop over the params
        return self._loop(params)

if __name__ == '__main__':
    config = Config([config, process])
    print(config.process_document())
I am sure there are many other, better ways to reach your goal, but I still hope this can be useful to you.

Python organizing default values as objects

I have a Django application that uses a third-party API and needs to receive several arguments, such as client_id, user_id, etc. I currently have these values defined at the top of my file as variables, but I'd like to store them in an object instead.
My current setup looks something like this:
user_id = 'ID HERE'
client_id = 'ID HERE'
api_key = 'ID HERE'

class Social(LayoutView, TemplateView):
    def grab_data(self):
        authenticate_user = AuthenticateService(client_id, user_id)
I want the default values set up as an object:
SERVICE_CONFIG = {
    'user_id': 'ID HERE',
    'client_id': 'ID HERE'
}
So that I can access them in my classes like so:
authenticate_user = AuthenticateService(SERVICE_CONFIG.client_id, SERVICE_CONFIG.user_id)
I've tried SERVICE_CONFIG.client_id and SERVICE_CONFIG['client_id'], as well as setting up the values as a mixin, but I can't figure out how to access them any other way.
Python is not JavaScript. That's a dictionary, not an object. You access dictionaries using the item syntax, not the attribute syntax:
AuthenticateService(SERVICE_CONFIG['client_id'], SERVICE_CONFIG['user_id'])
You can use a class, an instance, or a function object to store data as properties:
class ServiceConfig:
    user_id = 1
    client_id = 2

ServiceConfig.user_id  # => 1

service_config = ServiceConfig()
service_config.user_id  # => 1

service_config = lambda: 0
service_config.user_id = 1
service_config.client_id = 2
service_config.user_id  # => 1
Normally, using a dict is the simplest way to store data, but in some cases the higher readability of attribute access is preferred; then you can use the examples above. Using a lambda is the easiest way but the most confusing for someone reading your code, so the first two approaches are preferable.
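As another option (not mentioned above), the standard library's types.SimpleNamespace gives attribute access without defining a class; a minimal sketch:
import types

service_config = types.SimpleNamespace(user_id='ID HERE', client_id='ID HERE')
service_config.user_id  # => 'ID HERE'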

RestKit POST object to Python

I am using RestKit for iOS to perform a POST to my Python (Flask) server. The POST argument is a set of nested dictionaries. When I create my hierarchical argument object on the client side and perform the POST, there are no errors. But on the server side, the form data is flattened into a single set of keys which are themselves indexed strings:
@interface TestArgs : NSObject
@property (strong, nonatomic) NSDictionary *a;
@property (strong, nonatomic) NSDictionary *b;
@end

RKObjectMapping *requestMapping = [RKObjectMapping requestMapping]; // objectClass == NSMutableDictionary
[requestMapping addAttributeMappingsFromArray:@[
    @"a.name",
    @"a.address",
    @"a.gender",
    @"b.name",
    @"b.address",
    @"b.gender",
]];

RKRequestDescriptor *requestDescriptor = [RKRequestDescriptor requestDescriptorWithMapping:requestMapping objectClass:[TestArgs class] rootKeyPath:nil];

RKObjectManager *manager = [RKObjectManager managerWithBaseURL:[NSURL URLWithString:@"http://localhost:5000"]];
[manager addRequestDescriptor:requestDescriptor];

NSDictionary *a = [NSDictionary dictionaryWithObjectsAndKeys:
                   @"Alexis", @"name",
                   @"Boston", @"address",
                   @"female", @"gender",
                   nil];

NSDictionary *b = [NSDictionary dictionaryWithObjectsAndKeys:
                   @"Chris", @"name",
                   @"Boston", @"address",
                   @"male", @"gender",
                   nil];

TestArgs *tArgs = [[TestArgs alloc] init];
tArgs.a = a;
tArgs.b = b;

[manager postObject:tArgs path:@"/login" parameters:nil success:nil failure:nil];
On the server side, the POST body is this:
{'b[gender]': u'male', 'a[gender]': u'female', 'b[name]': u'Chris', 'a[name]': u'Alexis', 'b[address]': u'Boston', 'a[address]': u'Boston'}
When what I really want is this:
{'b': {'gender' : u'male', 'name': u'Chris', 'address': u'Boston'}, 'a': {'gender': u'female', 'name': u'Alexis', 'address': u'Boston'}}
Why is the POST body not maintaining its hierarchy on the server side? Is this an error in my client-side encoding logic, or in Flask decoding the JSON on the server side? Any ideas?
Thanks
The error is with the client-side mapping. Your mapping needs to represent the structure of the data you want and the relationships it contains. Currently the mapping uses keypaths, which effectively hide the structural relationships.
You need 2 mappings:
The dictionary of parameters
The container of the dictionaries
The mappings are defined as:
RKObjectMapping *paramMapping = [RKObjectMapping requestMapping];
RKObjectMapping *containerMapping = [RKObjectMapping requestMapping];

[paramMapping addAttributeMappingsFromArray:@[
    @"name",
    @"address",
    @"gender",
]];

RKRelationshipMapping *aRelationship = [RKRelationshipMapping
                                        relationshipMappingFromKeyPath:@"a"
                                        toKeyPath:@"a"
                                        withMapping:paramMapping];
RKRelationshipMapping *bRelationship = [RKRelationshipMapping
                                        relationshipMappingFromKeyPath:@"b"
                                        toKeyPath:@"b"
                                        withMapping:paramMapping];
[containerMapping addPropertyMapping:aRelationship];
[containerMapping addPropertyMapping:bRelationship];
Then your request descriptor is defined using the container mapping.
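Once the nested structure survives the request, the Flask side can read it directly; a minimal sketch, assuming the /login route from the question:
from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route('/login', methods=['POST'])
def login():
    data = request.get_json()  # nested dicts survive JSON decoding
    # a form-encoded POST would instead appear flattened in request.form
    return jsonify(received=data)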

ndb.Key filter for MapReduce input_reader

Playing with the new Google App Engine MapReduce library filters for input_reader, I would like to know how I can filter by ndb.Key.
I read this post and I've played with datetime, string, int, and float in filter tuples, but how can I filter by ndb.Key?
When I try to filter by an ndb.Key, I get this error:
BadReaderParamsError: Expected Key, got u"Key('Clients', 406)"
Or this error:
TypeError: Key('Clients', 406) is not JSON serializable
I tried to pass an ndb.Key object and the string representation of the ndb.Key.
Here are my two filter tuples:
Sample 1:
'input_reader': {
    'input_reader': 'mapreduce.input_readers.DatastoreInputReader',
    'entity_kind': 'model.Sales',
    'filters': [("client", "=", ndb.Key('Clients', 406))]
}
Sample 2:
'input_reader': {
    'input_reader': 'mapreduce.input_readers.DatastoreInputReader',
    'entity_kind': 'model.Sales',
    'filters': [("client", "=", "%s" % ndb.Key('Clients', 406))]
}
This is a bit tricky.
If you look at the code on Google Code, you can see that mapreduce.model defines a JSON_DEFAULTS dict which determines the classes that get special-case handling in JSON serialization/deserialization (by default, just datetime). So you can monkey-patch the ndb.Key class into there and provide it with functions to do that serialization/deserialization, something like:
from mapreduce import model

def _JsonEncodeKey(o):
    """Json encode an ndb.Key object."""
    return {'key_string': o.urlsafe()}

def _JsonDecodeKey(d):
    """Json decode an ndb.Key object."""
    return ndb.Key(urlsafe=d['key_string'])

model.JSON_DEFAULTS[ndb.Key] = (_JsonEncodeKey, _JsonDecodeKey)
model._TYPE_IDS['Key'] = ndb.Key
You may also need to repeat those last two lines to patch mapreduce.lib.pipeline.util as well.
Also note if you do this, you'll need to ensure that this gets run on any instance that runs any part of a mapreduce: the easiest way to do this is to write a wrapper script that imports the above registration code, as well as mapreduce.main.APP, and override the mapreduce URL in your app.yaml to point to your wrapper.
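A minimal sketch of such a wrapper, assuming the registration code above lives in a module with the hypothetical name mapreduce_json_patch:
import mapreduce_json_patch  # hypothetical module; runs the JSON_DEFAULTS registration on import
from mapreduce.main import APP  # the WSGI app your app.yaml mapreduce URL should point to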
Make your own input reader based on DatastoreInputReader, which knows how to decode key-based filters:
class DatastoreKeyInputReader(input_readers.DatastoreKeyInputReader):
    """Augment the base input reader to accommodate ReferenceProperty filters"""

    def __init__(self, *args, **kwargs):
        try:
            filters = kwargs['filters']
            decoded = []
            for f in filters:
                value = f[2]
                if isinstance(value, list):
                    value = db.Key.from_path(*value)
                decoded.append((f[0], f[1], value))
            kwargs['filters'] = decoded
        except KeyError:
            pass
        super(DatastoreKeyInputReader, self).__init__(*args, **kwargs)
Run this function on your filters before passing them in as options:
def encode_filters(filters):
    if filters is not None:
        encoded = []
        for f in filters:
            value = f[2]
            if isinstance(value, db.Model):
                value = value.key()
            if isinstance(value, db.Key):
                value = value.to_path()
            entry = (f[0], f[1], value)
            encoded.append(entry)
        filters = encoded
    return filters
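Hypothetical usage, encoding a filter on a reference before constructing the reader:
filters = encode_filters([("client", "=", db.Key.from_path('Clients', 406))])
# pass `filters` in the input_reader options used with the custom reader above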
Are you aware of the to_old_key() and from_old_key() methods?
I had the same problem and came up with a workaround using computed properties.
You can add to your Sales model a new ndb.ComputedProperty with the Key id. Ids are just strings, so you won't have any JSON problems.
client_id = ndb.ComputedProperty(lambda self: self.client.id())
Then add that condition to your mapreduce query filters:
'input_reader': {
    'input_reader': 'mapreduce.input_readers.DatastoreInputReader',
    'entity_kind': 'model.Sales',
    'filters': [("client_id", "=", '406')]
}
The only drawback is that computed properties are not indexed and stored until you call put(), so you will have to traverse all the Sales entities and save them:
for sale in Sales.query().fetch():
    sale.put()
