Pyramid: sharing a global object across threads - python

I have a small pyramid web service.
I also have a Python class that builds an index of items and provides methods to search them quickly. Something like:
class MyCorpus(object):
    def __init__(self):
        self.table = AwesomeDataStructure()
    def insert(self):
        self.table.push_back(1)
    def find(self, needle):
        return self.table.find(needle)
I would like to expose the above class to my api.
I can create only one instance of that class (memory limit).
So I need to be able to instantiate this class before the server starts.
And my threads should be able to access it.
I also need some locking mechanism (concurrent inserts are not supported).
What is the best way to achieve that?

Add an instance of your class to the global application registry during your Pyramid application's configuration:
config.registry.mycorpus = MyCorpus()
and later, for example in your view code, access it through a request:
request.registry.mycorpus
You could also register it as a utility with Zope Component Architecture using registry.registerUtility, but you'd need to define what interface MyCorpus provides etc., which is a good thing in the long run. Either way having a singleton instance as part of the registry makes testing your application easier; just create a configuration with a mock corpus.
Any locking should be handled by the instance itself:
from threading import Lock
class MyCorpus(object):
    def __init__(self, Lock=Lock):
        self.table = AwesomeDataStructure()
        self.lock = Lock()
    ...
    def insert(self):
        with self.lock:
            self.table.push_back(1)
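To see the lock doing its job, here is a minimal sketch in Python 3. It assumes a plain list as a stand-in for AwesomeDataStructure, and the worker function and thread counts are made up for illustration. In a Pyramid app the single instance would be attached to the registry as described above instead of living in a module-level name.

```python
import threading


class MyCorpus(object):
    def __init__(self):
        self.table = []  # stand-in for AwesomeDataStructure (assumption)
        self.lock = threading.Lock()

    def insert(self, item):
        # Only one thread at a time may modify the table.
        with self.lock:
            self.table.append(item)

    def find(self, needle):
        with self.lock:
            return needle in self.table


corpus = MyCorpus()  # the single shared instance


def worker(start):
    # Hypothetical workload: each thread inserts 1000 distinct items.
    for i in range(start, start + 1000):
        corpus.insert(i)


threads = [threading.Thread(target=worker, args=(n * 1000,)) for n in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

total = len(corpus.table)
```

Whether find also needs to take the lock depends on the real data structure; the sketch locks it to be on the safe side.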

Any global variable is shared between threads in Python, so this part is really easy: "... create only one instance of that class ... before the server starts ... threads should be able to access it":
corpus = MyCorpus() # in global scope in any module
Done! Then import the instance from anywhere and call your class' methods:
from mydata import corpus
corpus.do_stuff()
No need for ZCA, plain pythonic Python :)
(the general approach of keeping something large and very database-like within the webserver process feels quite suspicious though, I hope you know what you're doing. I mean - persistence? locking? sharing data between multiple processes? Redis, MongoDB and 1001 other database products have those problems solved)

Related

Cleanest way to reset a class attribute

I have a class that looks like the following
class A:
    communicate = set()
    def __init__(self):
        pass
    ...
    def _some_func(self):
        # ...some logic...
        self.communicate.add(some_var)
The communicate variable is shared among the instances of the class. I use it to provide a convenient way for the instances of this class to communicate with one another (they have some mild orchestration needed and I don't want to force the calling object to serve as an intermediary of this communication). However, I realized this causes problems when I run my tests. If I try to test multiple aspects of my code, since the python interpreter is the same throughout all the tests, I won't get a "fresh" A class for the tests, and as such the communicate set will be the aggregate of all objects I add to that set (in normal usage this is exactly what I want, but for testing I don't want interactions between my tests). Furthermore, down the line this will also cause problems in my code execution if I want to loop over my whole process multiple times (because I won't have a way of resetting this class variable).
I know I can fix this issue where it occurs by having the creator of the A objects do something like
A.communicate = set()
before it creates and uses any instances of A. However, I don't really love this because it forces my caller to know some details about the communication pathways of the A objects, and I don't want that coupling. Is there a better way for me to reset the communicate class variable of A? Perhaps some method I could call on the class rather than on an instance (like A.new_batch()) that would perform this resetting? Or is there a better way I'm not familiar with?
Edit:
I added a class method like
class A:
    communicate = set()
    def __init__(self):
        pass
    ...
    @classmethod
    def new_batch(cls):
        cls.communicate = set()
    def _some_func(self):
        # ...some logic...
        self.communicate.add(some_var)
and this works with the caller running A.new_batch(). Is this the way it should be constructed and called, or is there a better practice here?
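For the testing side of this, one possible arrangement (a sketch only, not necessarily the best practice the question asks about) is to call the classmethod from a unittest setUp so that each test starts with an empty set. The record method and the test names below are hypothetical stand-ins for _some_func and the real tests.

```python
import unittest


class A:
    communicate = set()

    @classmethod
    def new_batch(cls):
        # Rebind the class attribute to a fresh, empty set.
        cls.communicate = set()

    def record(self, value):
        # Hypothetical stand-in for _some_func: looks up the set on the class.
        self.communicate.add(value)


class ATest(unittest.TestCase):
    def setUp(self):
        # Runs before every test method, so tests do not see each other's data.
        A.new_batch()

    def test_first(self):
        A().record("x")
        self.assertEqual(A.communicate, {"x"})

    def test_second(self):
        A().record("y")
        self.assertEqual(A.communicate, {"y"})
```

Because instances reach the set through attribute lookup on the class, rebinding cls.communicate is seen by instances created before the reset as well, unless an instance has assigned its own communicate attribute.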

Copy member functions as a way of providing an interface

Is this good Python practice?
import threading
import Queue  # Python 2 module name; it is "queue" in Python 3
class Poppable(threading.Thread):
    def __init__(self):
        super(Poppable, self).__init__()
        self._q = Queue.Queue()
        # provide a limited subset of the Queue interface to clients
        self.qsize = self._q.qsize
        self.get = self._q.get
    def run(self):
        # <snip> -- do stuff that puts new items onto self._q
        # this is why clients don't need access to put functionality
        pass
Does this approach of "promoting" member's functions up to the containing class's interface violate the style, or Zen, of Python?
Mainly I'm trying to contrast this approach with the more standard one that would involve declaring wrapper functions normally:
def qsize(self):
    return self._q.qsize()
def get(self, *args):
    return self._q.get(*args)
I don't think that is Python specific. In general, this is good OOP practice. You expose just the functions the client needs to know about and hide the internals of the contained queue. This is a typical approach when wrapping an object, and it is fully in line with the principle of least knowledge.
If the client had to call self._q.qsize instead of self.qsize, you could not easily replace _q later with a different data type that has no qsize method. So your approach leaves the object more open to future changes.
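One practical difference between the two styles is worth checking before choosing. The sketch below uses Python 3's queue module, and the class names Promoted and Wrapped are made up for illustration. A promoted bound method stays tied to the queue object that existed when __init__ ran, while a wrapper method looks up self._q on every call.

```python
import queue


class Promoted:
    def __init__(self):
        self._q = queue.Queue()
        # Bound methods of this particular Queue object.
        self.qsize = self._q.qsize
        self.get = self._q.get


class Wrapped:
    def __init__(self):
        self._q = queue.Queue()

    def qsize(self):
        # Looks up self._q at call time.
        return self._q.qsize()

    def get(self, *args, **kwargs):
        return self._q.get(*args, **kwargs)


promoted = Promoted()
wrapped = Wrapped()
promoted._q.put("a")
wrapped._q.put("a")
sizes_before = (promoted.qsize(), wrapped.qsize())

# Replace the underlying queue in both objects.
promoted._q = queue.Queue()
wrapped._q = queue.Queue()
sizes_after = (promoted.qsize(), wrapped.qsize())
```

After the swap, promoted.qsize() still reports on the old queue, whereas wrapped.qsize() follows the new one. If _q is never reassigned, the two behave the same from the client's point of view.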

How to avoid circular reference in python - a class member creating an object of another class and passing self as parameter?

I need some help in terms of 'pythonic' way of handling a specific scenario.
I'm writing an Ssh class (wraps paramiko) that provides the capability to connect to a device under test (DUT) and execute commands on it over ssh.
class Ssh:
    def connect(self, some_params):
        # establishes connection
        pass
    def execute_command(self, command):
        # executes command and returns response
        pass
    def disconnect(self, some_params):
        # closes connection
        pass
Next, I'd like to create a Dut class that represents my device under test. It has other things, besides capability to execute commands on the device over ssh. It exposes a wrapper for command execution that internally invokes the Ssh's execute_command. The Ssh may change to something else in future - hence the wrapper.
class Dut:
    def __init__(self, some_params):
        self.ssh = Ssh(...)  # arguments elided in the original
    def execute_command(self, command):
        return self.ssh.execute_command(command)
Next, the device under test supports a custom command line interface. So I have a class that accepts a Dut object as input and exposes a method to execute the customised command.
class CustomCli:
    def __init__(self, dut_object):
        self.dut = dut_object
    def _customize(self, command):
        # return customised command
        pass
    def execute_custom_command(self, command):
        return self.dut.execute_command(self._customize(command))
Each of the classes can be used independently (CustomCli would need a Dut object though).
Now, to simplify things for the user, I'd like to expose a wrapper for CustomCli in the Dut class. This will allow the creator of the Dut object to execute either a simple or a custom command.
So, I modify the Dut class as below:
class Dut:
    def __init__(self, some_params):
        self.ssh = Ssh(...)  # arguments elided in the original
        self.custom_cli = CustomCli(self)  # how to avoid this circular reference in a pythonic way?
    def execute_command(self, command):
        return self.ssh.execute_command(command)
    def execute_custom_command(self, command):
        return self.custom_cli.execute_custom_command(command)
This will work, I suppose. But in the process I've created a circular reference: Dut points to CustomCli, and CustomCli has a reference to its creator, the Dut instance. This doesn't seem to be the correct design.
What's the best/pythonic way to deal with this?
Any help would be appreciated!
Regards
Sharad
In general, circular references aren't a bad thing. Many programs will have them, and people just don't notice because there's another instance in-between like A->B->C->A. Python's garbage collector will properly take care of such constructs.
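The garbage collection claim can be checked with a small sketch (Python 3, CPython assumed; the Node class is made up for illustration). A weak reference is used here only as a probe to observe whether the object is still alive.

```python
import gc
import weakref


class Node:
    pass


a = Node()
b = Node()
a.other = b  # a -> b
b.other = a  # b -> a, completing the cycle

probe = weakref.ref(a)  # does not keep "a" alive

del a, b      # the cycle is now unreachable from the program
gc.collect()  # ask the cycle collector to run now

collected = probe() is None
```

Here collected ends up True: the two objects referenced each other, yet the collector reclaimed them once nothing else pointed at them.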
You can make circular references a bit easier on your conscience by using weak references. See the weakref module. This won't work in your case, however.
If you want to get rid of the circular reference, there are two ways:
Have CustomCLI inherit from Dut, so you end up with just one instance. You might want to read up on Mixins.
class CLIMerger(Dut):
    def execute_custom_command(self, command):
        # uses self instead of self.dut; assumes _customize is moved onto this class too
        return self.execute_command(self._customize(command))
class CLIMixin(object):
    # inherits from object, won't work on its own
    def execute_custom_command(self, command):
        # uses self instead of self.dut
        return self.execute_command(self._customize(command))
class CLIDut(Dut, CLIMixin):
    # now the mixin "works", but still could enhance other Duts the same way
    pass
The Mixin is advantageous if you need several cases of merging a CLI and Dut.
Have an explicit interface class that combines CustomCli and Dut.
class DutCLI(object):
    def __init__(self, *bla, **blah):
        self.dut = Dut(*bla, **blah)
        self.cli = CustomCLI(self.dut)
This requires you to write boilerplate or magic to forward every call from DutCLI to either dut or cli.
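The "magic" variant of that forwarding could look roughly like the sketch below (Python 3). The Dut and CustomCli classes here are simplified stand-ins that return strings instead of talking to a device, and the __getattr__ forwarding is just one possible approach, not the only way to do it.

```python
class Dut:
    """Stand-in for the question's Dut; the real one would wrap an Ssh object."""

    def execute_command(self, command):
        return "ran: " + command


class CustomCli:
    def __init__(self, dut_object):
        self.dut = dut_object

    def _customize(self, command):
        return "custom " + command  # hypothetical customisation

    def execute_custom_command(self, command):
        return self.dut.execute_command(self._customize(command))


class DutCLI:
    def __init__(self):
        self.dut = Dut()
        self.cli = CustomCli(self.dut)

    def __getattr__(self, name):
        # Called only when normal attribute lookup on DutCLI fails.
        if name in ("dut", "cli"):
            # Guard against recursion if these are not set yet.
            raise AttributeError(name)
        for target in (self.cli, self.dut):
            if hasattr(target, name):
                return getattr(target, name)
        raise AttributeError(name)


facade = DutCLI()
plain = facade.execute_command("ls")
custom = facade.execute_custom_command("ls")
```

In this arrangement Dut holds no reference back to CustomCli or DutCLI, so there is no cycle. The price is that the forwarded methods are invisible to tools that inspect the class, which is why explicit boilerplate wrappers are sometimes preferred.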

What is the idiomatic Twisted way to implement observers?

I have a class that encapsulates database access by my application, and I want to allow other parts of the application to be notified when rows change in the database. Right now, I'm maintaining a list of callback functions, out of which I just-in-time create a DeferredList when I need to send a notification. This seems super kludgy -- is there a more idiomatic way?
Sample Code:
class Db(object):
    def __init__(self):
        self.observers = []
    def _on_notify(self, notify):
        # called by the db connection
        DeferredList(*[Deferred().addCallback(observer) for observer in self.observers]).callback(dict(notify=notify, db=self))
    def observe(self, callback):
        self.observers.append(callback)

OOP: Call method of creating object from class handler

I'm working on creating a simple start/stop HTTP Server python app, and I currently have the following class setup:
import Tkinter
import BaseHTTPServer
# The Tkinter interface for the application
class Application():
    def __init__(self, win):
        self.serverThread = ServerThread()
        self.output = Tkinter.Text(win)
        self.output.pack()
# The Server Thread
class ServerThread():
    class ServerHandler(BaseHTTPServer.BaseHTTPRequestHandler):
        def log_message(self, format, *args):
            # here's where I'm confused
            pass
    def __init__(self):
        self.server = BaseHTTPServer.HTTPServer(('', 8000), self.ServerHandler)
What would be the correct way to add text to the output field in my application from the ServerHandler class without using a global output variable? Does ServerHandler even have a reference to the ServerThread object that created it?
Edit: I guess what I'm really looking for is this: How can I let ServerHandler, which is passed as a class to BaseHTTPServer.HTTPServer - know about the ServerThread and Application objects created without using global variables?
If you are looking for logging facilities, I would recommend that you check out the standard logging module. In effect, it gives you as many "global" logging outputs as you want. It also obviously has the advantage of being standard: this makes logging code more legible, as many people use this method.
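As a rough illustration of the logging route (Python 3, so http.server instead of BaseHTTPServer): the request handler writes to a named logger, and the application attaches a handler that passes each formatted line to a callback. The CallbackHandler class, the logger name "server_demo", and the list used as a sink are assumptions made for this sketch; in the real application the callback would be whatever appends text to the Tkinter widget.

```python
import logging
from http.server import BaseHTTPRequestHandler

log = logging.getLogger("server_demo")  # hypothetical logger name
log.setLevel(logging.INFO)
log.propagate = False  # keep the demo output out of the root logger


class CallbackHandler(logging.Handler):
    """Logging handler that hands each formatted record to a callable."""

    def __init__(self, callback):
        super().__init__()
        self.callback = callback

    def emit(self, record):
        self.callback(self.format(record))


class ServerHandler(BaseHTTPRequestHandler):
    def log_message(self, format, *args):
        # Send request log lines to the logger instead of stderr.
        log.info(format % args)


# The application side: a list stands in for the Tkinter Text widget.
lines = []
log.addHandler(CallbackHandler(lines.append))

log.info("GET %s", "/index.html")
```

With this, ServerHandler needs no reference to ServerThread or Application. If the server runs in its own thread, the callback probably should not touch Tkinter widgets directly but pass the text to the GUI thread, for example through a queue.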
