how to wrap file object read and write operation (which are readonly)? - python

I am trying to wrap the read and write operations of a file object instance (specifically the readline() and write() methods).
Normally, I would simply replace those methods with a wrapper, a bit like this:
def log(stream):
    def logwrite(write):
        def inner(data):
            print('LOG: > ' + data.replace('\r', '<cr>').replace('\n', '<lf>'))
            return write(data)
        return inner
    stream.write = logwrite(stream.write)
But the attributes of a file object are read-only! How can I wrap them properly?
(Note: I am too lazy to wrap the whole file object. Really, I don't want to miss a feature that I did not wrap properly, or one that may be added in a future version of Python.)
More context:
I am trying to automate communication with a modem whose AT command set is made available on the network through a telnet session. Once logged in, I have to "grab" the module I want to communicate with. After some time without activity, a timeout occurs which releases the module (so that it is available to other users on the network, which I don't care about, since I am the sole user of this equipment). The automatic release writes a specific line on the session.
I want to wrap readline() on a file built from a socket (cf. socket.makefile()) so that when the timeout occurs, a specific exception is raised. That way I can detect the timeout anywhere in the script and react appropriately without complicating the AT command parser.
(Of course, I want to do this because the timeout is quite spurious; otherwise I would simply feed the modem side-effect-free commands just to keep the module alive.)
(Feel free to propose any other method or strategy to achieve this effect.)

Use __getattr__ to wrap your file object, and provide modified methods for the ones you are concerned with.
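class Wrapped(object):
    def __init__(self, file_):
        self._file = file_
    def write(self, data):
        print('LOG: > ' + data.replace('\r', '<cr>').replace('\n', '<lf>'))
        return self._file.write(data)
    def __getattr__(self, attr):
        # Only called when normal lookup fails, so anything not defined
        # on Wrapped is forwarded to the wrapped file object.
        return getattr(self._file, attr)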
This way, requests for attributes which you don't explicitly provide are routed to the attribute on the wrapped object, and you only implement the ones that you want to change:
logged = Wrapped(open(filename))
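The same delegation idea can be applied to the readline() case from the question. The following is only a sketch under assumptions: the marker text RELEASE_LINE, the exception name ModuleReleased and the class name TimeoutAwareFile are made up for illustration, and an io.StringIO stands in for the file returned by socket.makefile().

```python
import io


class ModuleReleased(Exception):
    """Raised when the (hypothetical) release notice appears on the session."""


# Assumed marker text; the real line written by the equipment is not known here.
RELEASE_LINE = "MODULE RELEASED"


class TimeoutAwareFile(object):
    def __init__(self, file_):
        self._file = file_

    def readline(self, *args):
        line = self._file.readline(*args)
        # Turn the release notice into an exception instead of returning it.
        if line.strip() == RELEASE_LINE:
            raise ModuleReleased(line.strip())
        return line

    def __getattr__(self, attr):
        # Everything else (write, close, closed, ...) goes to the real file.
        return getattr(self._file, attr)


# io.StringIO stands in for the socket-backed file object.
session = TimeoutAwareFile(io.StringIO("OK\r\nMODULE RELEASED\r\n"))
first_line = session.readline()
try:
    session.readline()
    released = False
except ModuleReleased:
    released = True
```

The AT command parser can then keep calling readline() as usual, while one except ModuleReleased block higher up in the script handles re-grabbing the module.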

Related

Transparently passing through a function with a variable argument list

I am using Python RPyC to communicate between two machines. Since the link may be prone to errors I would like to have a generic wrapper function which takes a remote function name plus that function's parameters as its input, does some status checking, calls the function with the parameters, does a little more status checking and then returns the result of the function call. The wrapper should have no knowledge of the function, its parameters/parameter types or the number of them, or the return value for that matter, the user has to get that right; it should just pass them transparently through.
I get the getattr(conn.root, function)() pattern to call the function but my Python expertise runs out at populating the parameters. I have read various posts on the use of *arg and **kwarg, in particular this one, which suggests that it is either difficult or impossible to do what I want to do. Is that correct and, if so, might there be a scheme which would work if I, say, ensured that all the function parameters were keyword parameters?
I do own both ends of this interface (the caller and the called) so I could arrange to dictionary-ise all the function parameters but I'd rather not make my API too peculiar if I could possibly avoid it.
Edit: the thing being called, at the remote end of the link, is a class with very ordinary methods, e.g.;
def exposed_a(self)
def exposed_b(self, thing1)
def exposed_c(self, thing1=None)
def exposed_d(self, thing1=DEFAULT_VALUE1, thing2=None)
def exposed_e(self, thing1, thing2, thing3=DEFAULT_VALUE1, thing4=None)
def exposed_f(self, thing1=None, thing2=None)
...where the types of each argument (and the return values) could be string, dict, number or list.
And it is indeed trivial; my Google-fu had simply failed me in finding the answer. In the hope of helping anyone else who is inexperienced in Python and is having a bad Google day:
One simply takes *arg and **kwarg as parameters and passes them directly on, with the asterisks attached. So in my case, to do my RPyC pass-through, where conn is the RPyC connection:
def my_passthru(conn, function_name, *function_args, **function_kwargs):
    # Do a check of something or other here
    return_value = getattr(conn.root, function_name)(*function_args, **function_kwargs)
    # Do another check here
    return return_value
Then, for example, a call to my exposed_e() method above might be:
return_value = my_passthru(conn, 'e', thing1, thing2, thing3)
(the exposed_ prefix being added automagically by RPyC in this case).
And of course one could put a try: / except ConnectionRefusedError: around the getattr() call in my_passthru() to generically catch the case where the connection has dropped underneath RPyC, which was my main purpose.
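To see the pass-through working without a live RPyC link, here is a minimal sketch in which FakeConn and FakeRoot are hypothetical stand-ins for the connection and its root object. Translating ConnectionRefusedError into a RuntimeError is just one possible way of reporting a dropped link, not something RPyC prescribes.

```python
class FakeRoot(object):
    """Stand-in for conn.root; a real RPyC root resolves names remotely."""

    def e(self, thing1, thing2, thing3="default3", thing4=None):
        return (thing1, thing2, thing3, thing4)


class FakeConn(object):
    """Stand-in for the RPyC connection object."""
    root = FakeRoot()


def my_passthru(conn, function_name, *function_args, **function_kwargs):
    try:
        remote_function = getattr(conn.root, function_name)
        # Positional and keyword arguments are forwarded untouched.
        return remote_function(*function_args, **function_kwargs)
    except ConnectionRefusedError as exc:
        # Assumed error policy: surface a single, generic failure to the caller.
        raise RuntimeError("link to the remote end is down") from exc


result = my_passthru(FakeConn(), "e", 1, 2, thing4="x")
```

Because the wrapper never inspects the arguments, defaults such as thing3 are still applied by the called method itself.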

Caching a non changing frequently read file in python

Okay folks lemme illustrate, I've this
def get_config_file(file='scrapers_conf.json'):
    """
    Load the default .json config file
    """
    return json.load(open(file))
This function is called a lot. It will run on a server, and every request will trigger it at least 5 times. I have multiple scrapers running, each with the following shape.
I removed the helper methods for brevity. The problem is that each scraper should either have its own request headers, payload, ... or use the default ones that live in scrapers_conf.json.
class Scraper(threading.Thread):  # __init__ is overridden and sets .conf
    def run(self):
        self.get()
    def get(self):
        # logic
        pass
The problem is that I'm getting the headers like
class Scraper(threading.Thread):
    def run(self):
        self.get()
    def get(self):
        headers = self.conf.get('headers') or get_config_file().get('headers')
So, as you can see, every instance calls get_config_file() on every request, which I don't think is optimal in my case. I know about lru_cache, but I'm not sure it's the best solution (correct me please!).
The config files are small, os.sys.getsizeof reports under 1 KB.
I'm thinking of just leaving it as is considering that reading a 1 KB ain't a problem.
Thanks in advance.
lru_cache(maxsize=None) sounds like the right way to do this; the maxsize=None makes it faster by turning off the LRU machinery.
The other way would be to call get_config_file() at the beginning of the program (in __init__, get, or in the place that instantiates the class), assign it to an attribute on each Scraper class and then always refer to self.config (or whatever). That has the advantage that you can skip reading the config file in unit tests — you can pass a test config directly into the class.
In this case, since the class already has a self.conf, it might be best to update that dictionary with the values from the file, rather than referring to two places in each of the methods.
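A minimal sketch of that second approach: the defaults are loaded once and merged into each scraper's own conf at construction time. The helper merge_conf, the DEFAULTS dictionary and the seen_headers attribute are hypothetical; in the real program DEFAULTS would come from a single get_config_file() call at startup.

```python
import threading


def merge_conf(defaults, overrides):
    """Return a new dict: the defaults, with truthy per-scraper values on top."""
    merged = dict(defaults)
    merged.update({key: value for key, value in overrides.items() if value})
    return merged


class Scraper(threading.Thread):
    def __init__(self, conf, defaults):
        super().__init__()
        # After this, every method only ever looks at self.conf.
        self.conf = merge_conf(defaults, conf)
        self.seen_headers = None

    def run(self):
        self.get()

    def get(self):
        self.seen_headers = self.conf.get("headers")


# Stand-in for the result of one get_config_file() call at startup.
DEFAULTS = {"headers": {"User-Agent": "default-agent"}, "timeout": 10}

scraper = Scraper({"headers": None, "timeout": 3}, DEFAULTS)
scraper.start()
scraper.join()
```

Skipping falsy overrides mirrors the `self.conf.get('headers') or ...` fallback in the question, and a unit test can pass any dictionary as defaults without touching the file system.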
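I'd totally forgotten about @functools.cached_property. It only works on methods of a class, though (it caches the value on the instance), so for a module-level function like this one the decorator to use is @functools.lru_cache:
@functools.lru_cache(maxsize=None)
def get_config_file(file='scrapers_conf.json'):
    """
    Load the default .json config file
    """
    with open(file) as f:
        return json.load(f)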

How to call a function in a different process using Connections

I want to build some sort of client-server connection using Python and am rather new to multiprocessing. Basically, I have a class 'Manager' that inherits from multiprocessing.Process and manages the connection from a client to different data sources. This process has some functions like 'get_value(key)' that should return the value of the key-data source. Since I want this to run asynchronously, I cannot simply call this function from my client process.
My idea so far is to connect the Client and Manager processes using a Pipe and then send a message from the Client to the Manager to execute this function. I would do this by sending a list through the pipe where the first element is the name of the function and the remaining elements are the arguments of the actual function, e.g. ['get_value', 'datasource1']. The process would then receive this and send the return value through the pipe to the client. This would look something like this:
from multiprocessing import Process, Pipe
import time
class Manager(Process):
    def __init__(self, connection):
        super(Manager, self).__init__()
        self.connection = connection
    def run(self):
        while True:
            if self.connection.poll():
                msg = self.connection.recv()
                # msg is [function_name, arg1, arg2, ...]
                self.call_function(msg[0], *msg[1:])
    def call_function(self, name, *args):
        print('Function Called with %s' % name)
        return_val = getattr(self, name)(*args)
        self.connection.send(return_val)
    def get_value(self, key):
        return 1.0
While I guess that this would work, I am not very happy with this solution. Especially the call-function-by-string-method seems very error-prone. Is there a more elegant way of requesting to execute a function in Python?
I think that your approach, all in all, is a good one (there are other ways to do the same thing, of course, but there is nothing wrong with your general approach).
That said, I would change the design slightly to add a "routing" component: think of some logic that somehow limits what "commands" can be sent by clients, and hooks between commands and "handlers" - that is functions that handle them. Basically think Web Framework routing (if you are familiar with the concept).
This is a good idea in terms of design flexibility, error detection and security (you don't want clients to call ['__del__'], for example, on your Manager).
In its most basic form, a router can be a dictionary mapping commands to class methods:
class Manager(Process):
    def __init__(self, connection):
        super(Manager, self).__init__()
        self.connection = connection
        self.routes = {'do_action': self._do_action,
                       'do_other_action': some_callable,
                       'ping': lambda args: args}  # <- as long as it's callable and has the right signature...
    def call_function(self, name, *args):
        try:
            handler = self.routes[name]
        except KeyError:
            return self._error_reply('{} is not a valid command'.format(name))
        try:
            return_val = handler(*args)  # handler functions will need to throw something if arguments are wrong...
        except ValueError as e:
            return self._error_reply('Invalid command arguments: {}'.format(str(e)))
        except Exception as e:
            # This is your catch-all "internal server error" handler
            return self._error_reply(str(e))
        self.connection.send(return_val)
This is of course just an example of an approach. You will need to implement _error_reply() in whatever way works for you.
You can expand on it by creating a Router class and passing it as a dependency to Manager, making it even more flexible. You might also want to think about making your Manager a separate thing and not a subclass of Process (because you might want to run it regardless of whether it is in a subprocess - for example in testing).
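One way the Router-as-dependency idea might look is sketched below. The class and method names (Router, register, dispatch, handle_one) and the ("ok", value) / ("error", message) reply format are assumptions made for this example. The Manager here is a plain object rather than a Process subclass, so it can be exercised in a single process with both ends of a Pipe.

```python
from multiprocessing import Pipe


class Router(object):
    """Maps command names to handler callables (hypothetical helper)."""

    def __init__(self):
        self._routes = {}

    def register(self, name, handler):
        self._routes[name] = handler

    def dispatch(self, name, *args):
        try:
            handler = self._routes[name]
        except KeyError:
            return ("error", "{} is not a valid command".format(name))
        try:
            return ("ok", handler(*args))
        except (TypeError, ValueError) as exc:
            return ("error", "Invalid command arguments: {}".format(exc))


class Manager(object):
    """Not a Process subclass, so it can also run in-process for tests."""

    def __init__(self, connection, router):
        self.connection = connection
        self.router = router

    def handle_one(self):
        # A message is [command_name, arg1, arg2, ...].
        msg = self.connection.recv()
        self.connection.send(self.router.dispatch(msg[0], *msg[1:]))


router = Router()
router.register("get_value", lambda key: 1.0)

client_end, manager_end = Pipe()
manager = Manager(manager_end, router)

client_end.send(["get_value", "datasource1"])
manager.handle_one()
good_reply = client_end.recv()

client_end.send(["__del__"])
manager.handle_one()
bad_reply = client_end.recv()
```

Only registered names can be reached, so a request such as ['__del__'] gets an error reply instead of touching the Manager's attributes.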
BTW, there are frameworks for implementing such things with various degrees of complexity and flexibility (Thrift, ZeroMQ, ...), but if you want to do something simple and learn, doing it yourself is in my opinion a great choice.

How do I know when I can/should use `with` keyword?

In C#, when an object implements IDisposable, using should be used to guarantee that resources will be cleaned if an exception is thrown. For instance, instead of:
var connection = new SqlConnection(...);
...
connection.Close();
one needs to write:
using (var connection = new SqlConnection(...))
{
    ...
}
Therefore, just by looking at the signature of the class, I know exactly whether or not I should initialize the object inside a using.
In Python 3, a similar construct is with. As in C#, it ensures that resources are cleaned up automatically when exiting the with context, even if an error is raised.
However, I'm not sure how should I determine whether with should be used or not for a specific class. For instance, an example from psycopg doesn't use with, which may mean that:
I shouldn't either, or:
The example is written for Python 2, or:
The authors of the documentation were unaware of with syntax, or:
The authors decided not to handle exceptional cases for the sake of simplicity.
In general, how should I determine whether with should be used when initializing an instance of a specific class (assuming that documentation says nothing on the subject, and that I have access to source code)?
Regarding when you should use it:
No one forces you to use the with statement; it's just syntactic sugar that's there to make your life easier. Whether you use it or not is totally up to you, but it is generally recommended to do so. (We're forgetful, and with ... looks way better than explicit initialize resource/finalize resource calls.)
When you can use it:
When you can use it boils down to examining if it defines the context manager protocol. This could be as simple as trying to use with and seeing that it fails :-)
If you dynamically need to check if an object is a context manager, you have two options.
First, on Python 3.6 and later there is an ABC for context managers, ContextManager (typing.ContextManager, backed by contextlib.AbstractContextManager), which can be used in issubclass/isinstance checks:
>>> from typing import ContextManager
>>> class foo:
...     def __enter__(self): pass
...     def __exit__(self, *exc): pass
...
>>> isinstance(foo(), ContextManager)
True
>>> class foo2: pass
...
>>> isinstance(foo2(), ContextManager)
False
Or, create your own little function to check for it:
def iscontext(inst):
    cls = type(inst)
    return (any("__enter__" in vars(a) for a in cls.__mro__) and
            any("__exit__" in vars(a) for a in cls.__mro__))
As a final note, the with statement is present in both Python 2 and Python 3; the authors of the example you saw probably just weren't aware of it :-).
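For reference, a small sketch of the ABC-based check using contextlib.AbstractContextManager (available since Python 3.6). The class names Managed and Plain and the helper supports_with are made up for the example.

```python
import io
from contextlib import AbstractContextManager


class Managed(object):
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        return False


class Plain(object):
    pass


def supports_with(obj):
    # The ABC's subclass hook looks for __enter__ and __exit__ on the type,
    # so classes do not need to inherit from it explicitly.
    return isinstance(obj, AbstractContextManager)


checks = (supports_with(Managed()), supports_with(Plain()), supports_with(io.StringIO()))
```

This only says the two methods exist; it cannot tell whether a particular class actually releases anything in __exit__.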
with is for use with context managers.
At the code level, a context manager must define two methods:
__enter__(self)
__exit__(self, type, value, traceback).
Be aware that there are class decorators which can turn otherwise simple classes/functions into context managers - see contextlib for some examples
You should use with whenever you need to perform paired actions before and after executing a block of code. For example:
Want to execute an SQL query? You need to open and close the connection safely. Use with.
Want to perform some action on a file? You have to open and close the file safely. Use with.
Want to store some data in a temporary file to perform some task? You need to create the directory, and clean it up once you are done. Use with, and so on.
Everything you want to perform before the block runs goes in the __enter__() method, and the action to be performed afterwards goes in the __exit__() method.
One of the nice things about with is that __exit__ is executed even if the code within the with block raises an exception.
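A small sketch that makes the ordering visible, including the case where the body raises. The Tracked class and the events list exist only for this illustration.

```python
class Tracked(object):
    """Records when __enter__ and __exit__ run (illustrative only)."""

    def __init__(self, log):
        self.log = log

    def __enter__(self):
        self.log.append("enter")
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        label = exc_type.__name__ if exc_type else "no error"
        self.log.append("exit ({})".format(label))
        return False  # do not swallow the exception


events = []
try:
    with Tracked(events):
        events.append("body")
        raise ValueError("boom")
except ValueError:
    events.append("caught")
```

Here __exit__ runs before the except clause sees the exception, which is why cleanup code belongs there.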

Implementing use of 'with object() as f' in custom class in python

I have to open a file-like object in python (it's a serial connection through /dev/) and then close it. This is done several times in several methods of my class. How I WAS doing it was opening the file in the constructor, and then closing it in the destructor. I'm getting weird errors though and I think it has to do with the garbage collector and such, I'm still not used to not knowing exactly when my objects are being deleted =\
The reason I was doing this is because I have to use tcsetattr with a bunch of parameters each time I open it and it gets annoying doing all that all over the place. So I want to implement an inner class to handle all that so I can use it doing
with Meter('/dev/ttyS2') as m:
I was looking online and I couldn't find a really good answer on how the with syntax is implemented. I saw that it uses the __enter__(self) and __exit__(self) methods. But is implementing those methods all I have to do to be able to use the with syntax? Or is there more to it?
Is there either an example on how to do this or some documentation on how it's implemented on file objects already that I can look at?
Those methods are pretty much all you need to make the object work with the with statement.
In __enter__ you have to return the file object after opening it and setting it up.
In __exit__ you have to close the file object. The code for writing to it will be in the with statement body.
class Meter():
    def __init__(self, dev):
        self.dev = dev
    def __enter__(self):
        # tcsetattr etc. goes here, together with opening the file object
        self.fd = open(self.dev, MODE)  # MODE: whatever open mode the device needs
        return self
    def __exit__(self, type, value, traceback):
        # Exception handling here
        self.fd.close()
meter = Meter('/dev/tty0')
with meter as m:
    # here you work with the file object.
    m.fd.read()
Easiest may be to use standard Python library module contextlib:
import contextlib
@contextlib.contextmanager
def themeter(name):
    theobj = Meter(name)
    try:
        yield theobj
    finally:
        theobj.close()  # or whatever you need to do at exit
# usage
with themeter('/dev/ttyS2') as m:
    # do what you need with m
    m.read()
This doesn't make Meter itself a context manager (and therefore is non-invasive to that class), but rather "decorates" it (not in the sense of Python's "decorator syntax", but rather almost, but not quite, in the sense of the decorator design pattern;-) with a factory function themeter which is a context manager (which the contextlib.contextmanager decorator builds from the "single-yield" generator function you write) -- this makes it so much easier to separate the entering and exiting condition, avoids nesting, &c.
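A runnable version of the factory-function approach might look like the sketch below. The Meter class here is a toy stand-in that just opens a path for reading, and a temporary file plays the role of the serial device; neither reflects the real tcsetattr setup.

```python
import contextlib
import os
import tempfile


class Meter(object):
    """Toy stand-in for the serial-port class; just opens a path for reading."""

    def __init__(self, dev):
        self.fd = open(dev, "rb")

    def read(self):
        return self.fd.read()

    def close(self):
        self.fd.close()


@contextlib.contextmanager
def themeter(name):
    theobj = Meter(name)
    try:
        yield theobj
    finally:
        theobj.close()  # runs whether or not the with body raised


# A temporary file plays the role of the device node.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"42.0\n")
    path = tmp.name

try:
    with themeter(path) as m:
        reading = m.read()
    closed_after = m.fd.closed
finally:
    os.remove(path)
```

When the exit step is nothing more than calling close(), contextlib.closing(Meter(path)) gives the same effect without writing the generator.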
The first Google hit (for me) explains it simply enough:
http://effbot.org/zone/python-with-statement.htm
and the PEP explains it more precisely (but also more verbosely):
http://www.python.org/dev/peps/pep-0343/
