Storing functions in a sparse array with Python

I have a relatively large enum wherein each member represents a message type. A client will receive a message containing the integer value associated with the msg type in the enum. For each msg type there will be an individual function callback to handle the msg.
I'd like to make the lookup and dispatch of the callback as quick as possible by using a sparse array (or vector) in which the enum value maps to the index of the callback. Is this possible in Python, given that arrays can't hold function types?
# pseudocode for 'enum'
class MsgType(object):
    LOGIN, LOGOUT, HEARTBEAT, ... = range(n)

# handler class
class Handler(object):
    def handleMsg(self, msg):
        # dispatch msg to specific handler

    def __onLogin(self, msg):
        # handle login

    def __onLogout(self, msg):
        # handle logout
Update:
I wasn't clear in my terminology. I now understand that Python dictionary lookups are O(1) on average, which makes a dict the perfect candidate. Thanks.

class MsgID(int):
    pass

LOGIN = MsgID(0)
LOGOUT = MsgID(1)
HEARTBEAT = MsgID(2)
... # add all other message identifier numbers

class MsgType(object):
    def __init__(self, id, data):
        self.id = id
        self.data = data

def login_handler(msg):
    ... # do something here

def logout_handler(msg):
    ... # do something here

def heartbeat_handler(msg):
    ... # do something here

msg_func = {
    LOGIN: login_handler,
    LOGOUT: logout_handler,
    HEARTBEAT: heartbeat_handler,
    ...
}

class Handler(object):
    def handleMsg(self, msg):
        try:
            msg_func[msg.id](msg)  # look up the function reference in the dict and call it
        except KeyError:
            log_error_mesg('message without a handler function: %d' % msg.id)
It's not strictly needed, but I added a subclass of int for message IDs. That way you can check whether an ID value really is a message ID rather than just some random integer.
I assume that each message will have an ID value in it, identifying what sort of message it is, plus some data. The msg_func dictionary uses MsgID values as keys, which map to function references.
You could put all the functions inside a class, but I didn't do that here; they are just functions.
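If you did want them on a class, the same dict-based dispatch works with bound methods; a minimal sketch (the handler names here are illustrative):

import functools  # not required; bound methods alone suffice

class Handler(object):
    def __init__(self):
        # Bound methods carry their instance with them, so the dispatch
        # table can map message IDs straight to methods.
        self._msg_func = {
            LOGIN: self._on_login,
            LOGOUT: self._on_logout,
            HEARTBEAT: self._on_heartbeat,
        }

    def handleMsg(self, msg):
        try:
            self._msg_func[msg.id](msg)
        except KeyError:
            log_error_mesg('message without a handler function: %d' % msg.id)

    def _on_login(self, msg):
        ...  # handle login

    def _on_logout(self, msg):
        ...  # handle logout

    def _on_heartbeat(self, msg):
        ...  # handle heartbeat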


Pythonic Classes and the Zen of Python

The Zen of Python says:
“There should be one—and preferably only one—obvious way to do it.”
Let’s say I want to create a class that builds a financial transaction. The class should allow the user to build a transaction and then call a sign() method to sign the transaction in preparation for it to be broadcast via an API call.
The class will have the following parameters:
sender
recipient
amount
signer (private key for signing)
metadata
signed_data
All of these are strings, except amount, which is an int. All are required except the last two: metadata is optional, and signed_data is created when the sign() method is called.
We would like all of the parameters to undergo some kind of validation before the signing happens so we can reject badly formatted transactions by raising an appropriate error for the user.
This seems straightforward using a classic Python class and constructor:
class Transaction:
    def __init__(self, sender, recipient, amount, signer, metadata=None):
        self.sender = sender
        self.recipient = recipient
        self.amount = amount
        self.signer = signer
        if metadata:
            self.metadata = metadata

    def is_valid(self):
        # check that all required parameters are valid and exist and return True,
        # otherwise return False
        ...

    def sign(self):
        if self.is_valid():
            # sign transaction
            self.signed_data = "pretend signature"
        else:
            ...  # raise InvalidTransactionError
Or with properties:
class Transaction:
    def __init__(self, sender, recipient, amount, signer, metadata=None):
        self._sender = sender
        self._recipient = recipient
        self._amount = amount
        self._signer = signer
        self._signed_data = None
        if metadata:
            self._metadata = metadata

    @property
    def sender(self):
        return self._sender

    @sender.setter
    def sender(self, sender):
        # validate value, raise InvalidParamError if invalid
        self._sender = sender

    @property
    def recipient(self):
        return self._recipient

    @recipient.setter
    def recipient(self, recipient):
        # validate value, raise InvalidParamError if invalid
        self._recipient = recipient

    @property
    def amount(self):
        return self._amount

    @amount.setter
    def amount(self, amount):
        # validate value, raise InvalidParamError if invalid
        self._amount = amount

    @property
    def signer(self):
        return self._signer

    @signer.setter
    def signer(self, signer):
        # validate value, raise InvalidParamError if invalid
        self._signer = signer

    @property
    def metadata(self):
        return self._metadata

    @metadata.setter
    def metadata(self, metadata):
        # validate value, raise InvalidParamError if invalid
        self._metadata = metadata

    @property
    def signed_data(self):
        return self._signed_data

    @signed_data.setter
    def signed_data(self, signed_data):
        # validate value, raise InvalidParamError if invalid
        self._signed_data = signed_data

    def is_valid(self):
        return (self.sender and self.recipient and self.amount and self.signer)

    def sign(self):
        if self.is_valid():
            # sign transaction
            self.signed_data = "pretend signature"
        else:
            # raise InvalidTransactionError
            print("Invalid Transaction!")
We can now validate each value when it's set, so by the time we go to sign we know we have valid parameters, and is_valid() only has to check that all required parameters have been set. This feels a little more Pythonic to me than doing all the validation in a single is_valid() method, but I am unsure whether all the extra boilerplate code is really worth it.
With dataclasses:
from dataclasses import dataclass

@dataclass
class Transaction:
    sender: str
    recipient: str
    amount: int
    signer: str
    metadata: str = None
    signed_data: str = None

    def is_valid(self):
        # check that all parameters are valid and exist and return True,
        # otherwise return False
        ...

    def sign(self):
        if self.is_valid():
            # sign transaction
            self.signed_data = "pretend signature"
        else:
            # raise InvalidTransactionError
            print("Invalid Transaction!")
Comparing this to Approach 1, this is pretty nice. It's concise, clean, and readable, and it already has __init__(), __repr__() and __eq__() methods built in. On the other hand, compared to Approach 2 we're back to validating all the inputs via one massive is_valid() method.
We could try to use properties with dataclasses but that's actually harder than it sounds. According to this blog post it can be done something like this:
@dataclass
class Transaction:
    sender: str
    _sender: str = field(init=False, repr=False)
    recipient: str
    _recipient: str = field(init=False, repr=False)
    ...
    # properties for all parameters

    def is_valid(self):
        # if all parameters exist, return True,
        # otherwise return False
        ...

    def sign(self):
        if self.is_valid():
            # sign transaction
            self.signed_data = "pretend signature"
        else:
            # raise InvalidTransactionError
            print("Invalid Transaction!")
Is there one and only one obvious way to do this? Are dataclasses recommended for this kind of application?
As a general rule, and not limited to Python, it is a good idea to write code which "fails fast": that is, if something goes wrong at runtime, you want it to be detected and signalled (e.g. by throwing an exception) as early as possible.
Especially in the context of debugging, if the bug is that an invalid value is being set, you want the exception to be thrown at the time the value is set, so that the stack trace includes the method setting the invalid value. If the exception is only thrown when the value is used, the stack trace can no longer tell you which part of the code produced the invalid value.
Of your three examples, only the second one allows you to follow this principle. It may require more boilerplate code, but writing boilerplate code is easy and doesn't take much time, compared to debugging without a meaningful stack trace.
By the way, if you have setters which do validation, then you should call these setters from your constructor too, otherwise it's possible to create an object with an invalid initial state.
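For example, a constructor that assigns through the properties rather than the underscored attributes gets that validation for free; a sketch using the property-based class above:

class Transaction:
    def __init__(self, sender, recipient, amount, signer, metadata=None):
        # Assigning via the properties invokes the setters, so invalid
        # arguments are rejected at construction time too.
        self.sender = sender
        self.recipient = recipient
        self.amount = amount
        self.signer = signer
        self.metadata = metadata
        self._signed_data = None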
Given your constraints, I think your dataclass approach can be improved to produce an expressive and idiomatic solution with very strong runtime assertions about the resulting Transaction instances, mostly by leveraging the __post_init__ mechanism:
from dataclasses import dataclass, asdict, field
from typing import Optional

@dataclass(frozen=True)
class Transaction:
    sender: str
    recipient: str
    amount: int
    signer: str
    metadata: Optional[str] = None
    signed_data: str = field(init=False)

    def is_valid(self) -> bool:
        ...  # implement your validity assertion logic

    def __post_init__(self):
        if self.is_valid():
            object.__setattr__(self, "signed_data", "pretend signature")
        else:
            raise ValueError(f"Invalid transaction with parameter list "
                             f"{asdict(self)}.")
This reduces the amount of code you have to maintain and understand to a degree where every written line relates to a meaningful part of your requirements, which is the essence of pythonic code.
Put into words, instances of this Transaction class may specify metadata but don't need to and may not supply their own signed_data, something which was possible in your variant #3. Attributes can't be mutated any more after initialization (enforced by frozen=True), so that an instance that is valid cannot be altered into an invalid state. And most importantly, since the validation is now part of the constructor, it is impossible for an invalid instance to exist. Whenever you are able to refer to a Transaction in runtime, you can be 100% sure that it passed the validity check and would do so again.
Since you based your question on conformity with the Zen of Python (referring to "Beautiful is better than ugly" and "Simple is better than complex" in particular), I'd say this solution is preferable to the property-based one.
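For illustration, a quick usage sketch (assuming is_valid() checks that the required fields are non-empty):

import dataclasses

tx = Transaction("alice", "bob", 10, "alice-key")
print(tx.signed_data)    # "pretend signature", set by __post_init__

try:
    tx.amount = 0        # frozen=True: no mutation after construction
except dataclasses.FrozenInstanceError:
    print("signed transactions cannot be altered")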

Python: Using API Event Handlers with OOP

I am trying to build some UI panels for an Eclipse based tool. The API for the tool has a mechanism for event handling based on decorators, so for example, the following ties callbackOpen to the opening of a_panel_object:
@panelOpenHandler(a_panel_object)
def callbackOpen(event):
    print "opening HERE!!"
This works fine, but I wanted to wrap all of my event handlers and actual data processing for the panel behind a class. Ideally I would like to do something like:
class Test(object):
    def __init__(self):
        # initialise some data here

    @panelOpenHandler(a_panel_object)
    def callbackOpen(self, event):
        print "opening HERE!!"
But this doesn't work, probably because I am giving it a callback that takes both self and event, while the decorator supplies only event when it calls the function internally (note: I have no access to the source code of panelOpenHandler, and it is not very well documented; also, any error messages are getting swallowed by Eclipse / jython somewhere).
Is there any way that I can use a library decorator that provides one argument to the function being decorated on a function that takes more than one argument? Can I use lambdas in some way to bind the self argument and make it implicit?
I've tried to incorporate some variation of the approaches here and here, but I don't think that it's quite the same problem.
Your decorator apparently registers a function to be called later. As such, it's completely inappropriate for use on a class method, since it will have no idea which instance of the class to invoke the method on.
The only way you'd be able to do this would be to manually register a bound method from a particular class instance - this cannot be done using the decorator syntax. For example, put this somewhere after the definition of your class:
panelOpenHandler(a_panel_object)(Test().callbackOpen)
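The question also asks about lambdas: any construct that bakes self into a one-argument callable works the same way; a sketch, assuming panelOpenHandler simply stores the callable it is given:

import functools

test = Test()   # keep a reference so the instance outlives the registration

# Both produce a callable taking just `event`, with `self` already bound:
panelOpenHandler(a_panel_object)(functools.partial(Test.callbackOpen, test))
panelOpenHandler(a_panel_object)(lambda event: test.callbackOpen(event))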
I found a work around for this problem. I'm not sure if there is a more elegant solution, but basically the problem boiled down to having to expose a callback function to global() scope, and then decorate it with the API decorator using f()(g) syntax.
Therefore, I wrote a base class (CallbackRegisterer), which offers the bindHandler() method to any derived classes - this method wraps a function and gives it a unique id per instance of CallbackRegisterer (I am opening a number of UI Panels at the same time):
class CallbackRegisterer(object):
    __count = 0

    @classmethod
    def _instanceCounter(cls):
        CallbackRegisterer.__count += 1
        return CallbackRegisterer.__count

    def __init__(self):
        """
        Constructor
        @param eq_instance 0=playback 1=record 2=sidetone.
        """
        self._id = self._instanceCounter()
        print "instantiating #%d instance of %s" % (self._id, self._getClassName())

    def bindHandler(self, ui_element, callback, callback_args = [], handler_type = None,
                    initialize = False, forward_event_args = False, handler_id = None):
        proxy = lambda *args: self._handlerProxy(callback, args, callback_args, forward_event_args)
        handler_name = callback.__name__ + "_" + str(self._id)
        if handler_id is not None:
            handler_name += "_" + str(handler_id)
        globals()[handler_name] = proxy
        # print "handler_name: %s" % handler_name
        handler_type(ui_element)(proxy)
        if initialize:
            proxy()

    def _handlerProxy(self, callback, event_args, callback_args, forward_event_args):
        try:
            if forward_event_args:
                new_args = [x for x in event_args]
                new_args.extend(callback_args)
                callback(*new_args)
            else:
                callback(*callback_args)
        except:
            print "exception in callback???"
            self.log.exception('In event callback')
            raise

    def _getClassName(self):
        return self.__class__.__name__
I can then derive a class from this and pass in my callback, which will be correctly decorated using the API decorator:
class Panel(CallbackRegisterer):
    def __init__(self):
        super(Panel, self).__init__()
        # can bind from sub classes of Panel as well - different class name in handler_name
        self.bindHandler(self.controls.test_button, self._testButtonCB, handler_type = valueChangeHandler)
        # can bind multiple versions of same function for repeated ui elements, etc.
        for idx in range(0, 10):
            self.bindHandler(self.controls["check_box_" + str(idx)], self._testCheckBoxCB,
                             callback_args = [idx], handler_type = valueChangeHandler, handler_id = idx)

    def _testCheckBoxCB(self, *args):
        check_box_id = args[0]
        print "in _testCheckBoxCB #%d" % check_box_id

    def _testButtonCB(self):
        """
        Handler for test button
        """
        print "in _testButtonCB"

panel = Panel()
Note, that I can also derive further sub-classes from Panel, and any callbacks bound there will get their own unique handler_name, based on class name string.
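For example, a further subclass just works (a sketch; the controls and valueChangeHandler come from the tool's API, as above):

class AdvancedPanel(Panel):
    def __init__(self):
        super(AdvancedPanel, self).__init__()
        # registers under "_resetCB_<instance id>", so it cannot collide
        # with handlers bound by Panel or by other instances
        self.bindHandler(self.controls.reset_button, self._resetCB,
                         handler_type = valueChangeHandler)

    def _resetCB(self):
        print "in _resetCB"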

Exceptions that reflect error codes of a remote service

I'm working with an external service which reports errors by code.
I have the list of error codes and the associated messages. Say, the following categories exist: authentication error, server error.
What is the smartest way to implement these errors in Python so I can always lookup an error by code and get the corresponding exception object?
Here's my straightforward approach:
class AuthError(Exception):
    pass

class ServerError(Exception):
    pass

map = {
    1: AuthError,
    2: ServerError,
}

def raise_code(code, message):
    """ Raise an exception by code """
    raise map[code](message)
Would like to see better solutions :)
Your method is correct, except that map should be renamed something else (e.g. ERROR_MAP) so it does not shadow the builtin of the same name.
You might also consider making the function return the exception rather than raising it:
def error(code, message):
    """ Return an exception by code """
    return ERROR_MAP[code](message)

def foo():
    raise error(code, message)
By placing the raise statement inside foo, you raise the error closer to where it occurred, and there are one or two fewer lines to trace through if the stack trace is printed.
Another approach is to create a polymorphic base class which, when instantiated, actually produces an instance of the subclass with the matching code.
This is implemented by traversing __subclasses__() of the parent class and comparing the error code to the one defined in the class. If found, use that class instead.
Example:
class CodeError(Exception):
    """ Base class """
    code = None  # Error code

    def __new__(cls, code, *args):
        # Pick the appropriate subclass
        for E in cls.__subclasses__():
            if E.code == code:
                C = E
                break
        else:
            C = cls  # fall back
        return super(CodeError, cls).__new__(C, code, *args)

    def __init__(self, code, message):
        super(CodeError, self).__init__(message)

# Subclasses with error codes
class AuthError(CodeError):
    code = 1

class ServerError(CodeError):
    code = 2

CodeError(1, 'Wrong password')  # -> AuthError
CodeError(2, 'Failed')  # -> ServerError
With this approach, it's trivial to associate error message presets, and even map one class to multiple codes with a dict.
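A sketch of both ideas on top of the classes above (DEFAULT_MESSAGES, CODE_TO_CLASS and error_from_code are illustrative names):

# Preset messages per code
DEFAULT_MESSAGES = {
    1: 'Authentication failed',
    2: 'Internal server error',
    3: 'Server timed out',
}

# One class may serve several codes via an explicit dict
CODE_TO_CLASS = {1: AuthError, 2: ServerError, 3: ServerError}

def error_from_code(code, message=None):
    """Return the matching exception, falling back to a preset message."""
    cls = CODE_TO_CLASS.get(code, CodeError)
    return cls(code, message or DEFAULT_MESSAGES.get(code, 'Unknown error'))

raise error_from_code(1)  # -> AuthError('Authentication failed')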

Methods on descriptors

I'm trying to implement a wrapper around a redis database that does some bookkeeping, and I thought about using descriptors. I have an object with a bunch of fields: frames, failures, etc., and I need to be able to get, set, and increment the field as needed. I've tried to implement an Int-Like descriptor:
class IntType(object):
    def __get__(self, instance, owner):
        # issue a GET database command
        return db.get(my_val)

    def __set__(self, instance, val):
        # issue a SET database command
        db.set(instance.name, val)

    def increment(self, instance, count):
        # issue an INCRBY database command
        db.hincrby(instance.name, count)

class Stream:
    _prefix = 'stream'
    frames = IntType()
    failures = IntType()
    uuid = StringType()

s = Stream()
s.frames.increment(1)  # 'float' object has no attribute 'increment'
It seems like I can't access the increment() method in my descriptor. I can't have increment defined on the object that __get__ returns, as that would require an additional db query when all I want to do is increment! I also don't want increment() on the Stream class, as later on, when I want to have additional fields like strings or sets in Stream, I'd need to type check the heck out of everything.
Does this work?
class Stream:
    _prefix = 'stream'

    def __init__(self):
        self.frames = IntType()
        self.failures = IntType()
        self.uuid = StringType()
Why not define the magic method __iadd__ as well as __get__ and __set__? This will allow you to use normal addition with assignment (+=) on the attribute. It also means you can treat the increment separately from the get and thereby minimise the database accesses.
So change:
def increment(self, instance, count):
    # issue an INCRBY database command
    db.hincrby(instance.name, count)
to:
def __iadd__(self, other):
    # your code goes here
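One caveat: for s.frames += 1, Python calls __get__, applies __iadd__ to the value it returned, then passes the result to __set__, so __iadd__ has to live on the value that __get__ returns, not on the descriptor itself. A minimal sketch of that idea (assuming a redis-py style client named db and that instances expose a name key, as in the question):

class IntProxy(int):
    """int subclass returned by __get__ so that `+=` issues one INCRBY."""
    def __new__(cls, key, value):
        obj = super(IntProxy, cls).__new__(cls, int(value))
        obj.key = key
        return obj

    def __iadd__(self, count):
        new_value = db.incrby(self.key, count)  # single round trip
        return IntProxy(self.key, new_value)

class IntType(object):
    def __get__(self, instance, owner):
        return IntProxy(instance.name, db.get(instance.name))

    def __set__(self, instance, val):
        if isinstance(val, IntProxy):
            return  # value came back from __iadd__; INCRBY already wrote it
        db.set(instance.name, val)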
Try this:
class IntType(object):
    def __get__(self, instance, owner):
        class IntValue():
            def increment(self, count):
                # issue an INCRBY database command
                db.hincrby(self.name, count)

            def getValue(self):
                # issue a GET database command
                return db.get(my_val)
        return IntValue()

    def __set__(self, instance, val):
        # issue a SET database command
        db.set(instance.name, val)

Python - Better to have multiple methods or lots of optional parameters?

I have a class which makes requests to a remote API. I'd like to be able to reduce the number of calls I'm making. Some of the methods in my class make the same API calls (but for different reasons), so I'd like them to be able to 'share' a cached API response.
I'm not entirely sure if it's more Pythonic to use optional parameters or to use multiple methods, as the methods have some required parameters if they are making an API call.
Here are the approaches as I see them; which do you think is best?
class A:
    def a_method(self, item_id, cached_item_api_response=None):
        """ Seems awkward having to supply item_id even
        if cached_item_api_response is given
        """
        api_response = None
        if cached_item_api_response:
            api_response = cached_item_api_response
        else:
            api_response = ...  # make api call using item_id
        ...  # do stuff
Or this:
class B:
    def a_method(self, item_id=None, cached_api_response=None):
        """ Seems awkward as it makes no sense NOT to supply EITHER
        item_id or cached_api_response
        """
        api_response = None
        if cached_api_response:
            api_response = cached_api_response
        elif item_id:
            api_response = ...  # make api call using item_id
        else:
            ...  # ERROR
        ...  # do stuff
Or is this more appropriate?
class C:
    """Seems even more awkward to have different method calls"""

    def a_method(self, item_id):
        api_response = ...  # make api call using item_id
        self.api_response_logic(api_response)

    def b_method(self, cached_api_response):
        self.api_response_logic(cached_api_response)

    def api_response_logic(self, api_response):
        ...  # do stuff
Normally, when writing a method, one could argue that a method or object should do one thing and do it well. If your method gets more and more parameters which require more and more ifs in your code, that probably means your code is doing more than one thing, especially if those parameters trigger totally different behavior. Instead, maybe the same behavior could be produced by having different classes and overriding methods in them.
Maybe you could use something like:
class BaseClass(object):
    def a_method(self, item_id):
        response = lookup_response(item_id)
        return response

class CachingClass(BaseClass):
    def a_method(self, item_id):
        if item_id in cache:
            return item_from_cache
        return super(CachingClass, self).a_method(item_id)

    def uncached_method(self, item_id):
        return super(CachingClass, self).a_method(item_id)
That way you can split the logic of how to lookup the response and the caching while also making it flexible for the user of the API to decide if they want the caching capabilities or not.
There is nothing wrong with the method used in your class B. To make it more obvious at a glance that you actually need to include either item_id or cached_api_response, I would put the error checking first:
class B:
    def a_method(self, item_id=None, cached_api_response=None):
        """Requires either item_id or cached_api_response"""
        if not ((item_id == None) ^ (cached_api_response == None)):
            ...  # error
        # or, if you want to allow both:
        # if (item_id == None) and (cached_api_response == None):
        #     ...  # error

        # you don't actually have to do this on one line
        # also don't use it if cached_api_response can evaluate to 'False'
        api_response = cached_api_response or ...  # make api call using item_id
        ...  # do stuff
Ultimately this is a judgement that must be made for each situation. I would ask myself, which of these two more closely fits:
Two completely different algorithms or actions, with completely different semantics, even though they may be passed similar information
A single conceptual idea, with consistent semantics, but with nuance based on input
If the first is closest, go with separate methods. If the second is closest, go with optional arguments. You might even implement a single method by testing the type of the argument(s) to avoid passing additional arguments.
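A sketch of that single-method variant (assuming item ids are ints and cached responses are dicts; make_api_call is an illustrative stand-in):

class D:
    def a_method(self, item_or_response):
        # dispatch on the argument's type instead of two optional parameters
        if isinstance(item_or_response, dict):
            api_response = item_or_response                 # cached response
        elif isinstance(item_or_response, int):
            api_response = make_api_call(item_or_response)  # treat as item_id
        else:
            raise TypeError("expected an item id or a cached api response")
        ...  # do stuff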
This is an OO anti-pattern.
class API_Connection(object):
    def do_something_with_api_response(self, response):
        ...

    def do_something_else_with_api_response(self, response):
        ...
You have two methods on an instance and you're passing state between them explicitly? Why are these methods and not bare functions in a module?
Instead, think about using encapsulation to help you by having the instance of the class own the api response.
For example:
class API_Connection(object):
    def __init__(self, api_url):
        self._url = api_url
        self._cached_response = None

    @property
    def response(self):
        """Actually use the _url and get the response when needed."""
        if self._cached_response is None:
            # actually calculate self._cached_response by making our
            # remote call, etc.
            self._cached_response = self._get_api_response(self._url)
        return self._cached_response

    def _get_api_response(self, api_param1, ...):
        """Make the request and return the api's response"""

    def do_something_with_api_response(self):
        # just use self.response
        do_something(self.response)

    def do_something_else_with_api_response(self):
        # just use self.response
        do_something_else(self.response)
You have caching and any method which needs this response can run in any order without making multiple api requests because the first method that needs self.response will calculate it and every other will use the cached value. Hopefully it's easy to imagine extending this with multiple URLs or RPC calls. If you have a need for a lot of methods that cache their return values like response above then you should look into a memoization decorator for your methods.
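A minimal memoizing decorator in that spirit (a sketch; on Python 3, functools.lru_cache or functools.cached_property cover most such cases):

import functools

def memoize(method):
    """Cache a method's return value per instance and per argument tuple."""
    @functools.wraps(method)
    def wrapper(self, *args):
        cache = self.__dict__.setdefault('_memo_cache', {})
        key = (method.__name__, args)   # args must be hashable
        if key not in cache:
            cache[key] = method(self, *args)
        return cache[key]
    return wrapper

class API_Connection(object):
    @memoize
    def get_response(self, url):
        ...  # expensive remote call runs once per url per instance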
The cached response should be saved in the instance, not passed around like a bag of Skittles -- what if you dropped it?
Is item_id unique per instance, or can an instance make queries for more than one? If it can have more than one, I'd go with something like this:
class A(object):
    def __init__(self):
        self._cache = dict()

    def a_method(self, item_id):
        """Gets api_response from cache (the cache may have to fetch a current response)."""
        api_response = self._get_cached_response(item_id)
        ...  # do stuff

    def b_method(self, item_id):
        """'nother method (just for show)"""
        api_response = self._get_cached_response(item_id)
        ...  # do other stuff

    def _get_cached_response(self, item_id):
        if item_id in self._cache:
            return self._cache[item_id]
        response = self._cache[item_id] = api_call(item_id, ...)
        return response

    def refresh_response(self, item_id):
        if item_id in self._cache:
            del self._cache[item_id]
        self._get_cached_response(item_id)
And if you may have to get the most current info about item_id, you can have a refresh_response method.
