block read of instance variable when trying to set it - python

class A(object):
    def __init__(self, cookie):
        self.__cookie = cookie

    def refresh_cookie(self):
        """This method refreshes the cookie every 10 min."""
        self.__cookie = <newcookie>

    @property
    def cookie(self):
        return self.__cookie
The problem is that the cookie value changes every 10 minutes. If some method is still holding the older cookie, its request fails. This happens when multiple threads use the same A object.
I am looking for a solution where, whenever we refresh (i.e. modify) the cookie value, no one should be able to read it; in other words, there should be a lock around the cookie value.

This is a job for a condition variable.
from threading import Condition

class A(object):
    def __init__(self, cookie):
        self.__cookie = cookie
        self.refreshing = Condition()

    def refresh_cookie(self):
        """This method refreshes the cookie every 10 min."""
        with self.refreshing:
            self.__cookie = <newcookie>
            self.refreshing.notify_all()

    @property
    def cookie(self):
        with self.refreshing:
            return self.__cookie
Only one thread can enter a with block governed by self.refreshing at a time. The first thread to try will succeed; the others will block until the first leaves its with block.
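If readers only ever need the latest value, a plain Lock around both accessors is enough to get the mutual exclusion described above. Here is a minimal runnable sketch of that idea; the `<newcookie>` placeholder is replaced by an explicit argument, and the names are illustrative:

```python
from threading import Lock, Thread

class A:
    def __init__(self, cookie):
        self._cookie = cookie
        self._lock = Lock()

    def refresh_cookie(self, new_cookie):
        with self._lock:          # the writer excludes all readers
            self._cookie = new_cookie

    @property
    def cookie(self):
        with self._lock:          # readers exclude the writer
            return self._cookie

a = A("old")
threads = [Thread(target=a.refresh_cookie, args=("new",)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(a.cookie)  # "new"
```

A Condition adds the ability to wait for a state change; for a simple "never read a half-updated value" requirement, the Lock alone is sufficient.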

Related

How to restart session after n amount of requests have been made

I have a script that scrapes data from a website. The website blocks incoming requests after ~75 requests have been made to it. I found that resetting the session after 50 requests and sleeping for 30s gets around the block. Now I would like to subclass requests.Session and modify its behaviour so that it automatically resets the session when it needs to. Here is my code so far:
class Session(requests.Session):
    request_count_limit = 50

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.request_count = 0

    def get(self, url, **kwargs):
        if self.request_count == self.request_count_limit:
            self = Session.restart_session()
        response = super().get(url, **kwargs)
        self.request_count += 1
        return response

    @classmethod
    def restart_session(cls):
        print('Restarting Session, Sleeping For 20 seconds...')
        time.sleep(20)
        return cls()
However, the code above doesn't work. The reason is that although I am reassigning self, the object itself doesn't change, and with that the request_count doesn't change either. Any help would be appreciated.
Assigning to self just rebinds a local variable; it has no effect outside of the method. You could try implementing `__new__()` instead.
Look here: Python Class: overwrite `self`
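To see why the rebinding has no effect, here is a stripped-down sketch (a hypothetical class, no network involved): rebinding `self` inside a method only changes that method's local name.

```python
class Counter:
    def __init__(self):
        self.count = 0

    def bump_and_replace(self):
        # Rebinding `self` only rebinds this method's local
        # variable; the caller's object is untouched.
        self = Counter()
        self.count += 1

c = Counter()
c.bump_and_replace()
print(c.count)  # still 0: the new Counter was discarded
```

A simpler fix than replacing the object is usually to reset the relevant state in place (e.g. set `self.request_count = 0` and rebuild whatever session state needs refreshing).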

Non-lazy instance creation with Pyro4 and instance_mode='single'

My aim is to provide to a web framework access to a Pyro daemon that has time-consuming tasks at the first loading. So far, I have managed to keep in memory (outside of the web app) a single instance of a class that takes care of the time-consuming loading at its initialization. I can also query it with my web app. The code for the daemon is:
@Pyro4.expose
@Pyro4.behavior(instance_mode='single')
class Store(object):
    def __init__(self):
        self._store = ...  # the expensive loading

    def query_store(self, query):
        return ...  # Useful query tool to expose to the web framework.
                    # Not time consuming, provided self._store is loaded.

with Pyro4.Daemon() as daemon:
    uri = daemon.register(Store)
    with Pyro4.locateNS() as ns:
        ns.register('thing', uri)
    daemon.requestLoop()
The issue I am having is that although a single instance is created, it is only created at the first proxy query from the web app. This is normal behavior according to the doc, but not what I want, as the first query is still slow because of the initialization of Store.
How can I make sure the instance is already created as soon as the daemon is started?
I was thinking of creating a proxy instance of Store in the code of the daemon, but this is tricky because the event loop must be running.
EDIT
It turns out that daemon.register() can accept either a class or an object, which could be a solution. This is however not recommended in the doc (link above) and that feature apparently only exists for backwards compatibility.
Do whatever initialization you need outside of your Pyro code and cache it somewhere. Use the instance_creator parameter of the @behavior decorator for maximum control over how and when an instance is created. You could even pre-create server instances yourself and retrieve one from a pool if you so desire. Anyway, one possible way to do this is like so:
import Pyro4

def slow_initialization():
    print("initializing stuff...")
    import time
    time.sleep(4)
    print("stuff is initialized!")
    return {"initialized stuff": 42}

cached_initialized_stuff = slow_initialization()

def instance_creator(cls):
    print("(Pyro is asking for a server instance! Creating one!)")
    return cls(cached_initialized_stuff)

@Pyro4.behavior(instance_mode="percall", instance_creator=instance_creator)
class Server:
    def __init__(self, init_stuff):
        self.init_stuff = init_stuff

    @Pyro4.expose
    def work(self):
        print("server: init stuff is:", self.init_stuff)
        return self.init_stuff

Pyro4.Daemon.serveSimple({
    Server: "test.server"
})
But this complexity is not needed for your scenario: just initialize the thing that takes a long time and cache it somewhere. Instead of re-initializing it every time a new server object is created, refer to the cached pre-initialized result. Something like this:
import Pyro4

def slow_initialization():
    print("initializing stuff...")
    import time
    time.sleep(4)
    print("stuff is initialized!")
    return {"initialized stuff": 42}

cached_initialized_stuff = slow_initialization()

@Pyro4.behavior(instance_mode="percall")
class Server:
    def __init__(self):
        self.init_stuff = cached_initialized_stuff

    @Pyro4.expose
    def work(self):
        print("server: init stuff is:", self.init_stuff)
        return self.init_stuff

Pyro4.Daemon.serveSimple({
    Server: "test.server"
})

Does Python HttpServer instantiate a new request handler for every request

I'm creating a server like this:
server = HTTPServer(('', PORT_NUMBER), MyHandler)
...and then the handler:
class SomeClass:
    def __init__(self):
        print "Initialising SomeClass"

class MyHandler(BaseHTTPRequestHandler):
    x = 0
    some_object = SomeClass()

    def do_GET(self):
        print self.x
        self.x += 1
        # etc. but x is not used further
Now, every time I make a GET request, the value printed for self.x is always 0. However, the SomeClass constructor is only called once, when the server is first fired up (I'm assuming this is the case because the print message in the constructor only appears once).
The fact that self.x keeps resetting for every request suggests that the handler class is recreated for each request, but the fact that the SomeClass message only prints once contradicts this.
Can someone tell me what's going on here?
It doesn't contradict anything. Because you're calling SomeClass() in the class body (rather than in __init__), it runs once, when the class is defined, not each time a handler is instantiated.
When self.x += 1 runs, the value of self.x is read from the class level, but the assignment is made at the instance level, so a new x is created that is specific to the instance.
You could try changing it from self.x to MyHandler.x and see what happens.
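The shadowing described above can be reproduced without a server at all. In this sketch (illustrative names), the augmented assignment reads the class attribute but writes an instance attribute:

```python
class Handler:
    x = 0  # class attribute, shared by all instances

    def do_get(self):
        # Reads Handler.x (0), then binds a NEW instance
        # attribute x that shadows the class attribute.
        self.x += 1

h1 = Handler()
h1.do_get()
h2 = Handler()
print(h1.x, h2.x, Handler.x)  # 1 0 0
```

Writing `Handler.x += 1` inside do_get instead would increment the shared class attribute, so every new handler instance would see the running total.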

Having persistent runtime objects with Tornado

I'm working on a project in Tornado that relies heavily on the asynchronous features of the library. By following the chat demo, I've managed to get long-polling working with my application, however I seem to have run into a problem with the way it all works.
Basically what I want to do is be able to call a function on the UpdateManager class and have it finish the asynchronous request for any callbacks in the waiting list. Here's some code to explain what I mean:
update.py:
class UpdateManager(object):
    waiters = []
    attrs = []
    other_attrs = []

    def set_attr(self, attr):
        self.attrs.append(attr)

    def set_other_attr(self, attr):
        self.other_attrs.append(attr)

    def add_callback(self, cb):
        self.waiters.append(cb)

    def send(self):
        for cb in self.waiters:
            cb(self.attrs, self.other_attrs)

class LongPoll(tornado.web.RequestHandler, UpdateManager):
    @tornado.web.asynchronous
    def get(self):
        self.add_callback(self.finish_request)

    def finish_request(self, attrs, other_attrs):
        # Render some JSON to give the client, etc...
        pass

class SetSomething(tornado.web.RequestHandler):
    def post(self):
        # Handle the stuff...
        self.add_attr(some_attr)
(There's more code implementing the URL handlers/server and such, however I don't believe that's necessary for this question)
So what I want to do is make it so I can call UpdateManager.send from another place in my application and still have it send the data to the waiting clients. The problem is that when you try to do this:
from update import UpdateManager
UpdateManager.send()
it only gets the UpdateManager class, not the instance of it that is holding user callbacks. So my question is: is there any way to create a persistent object with Tornado that will allow me to share a single instance of UpdateManager throughout my application?
Don't use instance methods; use class methods (after all, you're already using class attributes, you just might not realize it). That way you don't have to instantiate the object at all, and can instead call methods on the class itself, which acts as a singleton:
class UpdateManager(object):
    waiters = []
    attrs = []
    other_attrs = []

    @classmethod
    def set_attr(cls, attr):
        cls.attrs.append(attr)

    @classmethod
    def set_other_attr(cls, attr):
        cls.other_attrs.append(attr)

    @classmethod
    def add_callback(cls, cb):
        cls.waiters.append(cb)

    @classmethod
    def send(cls):
        for cb in cls.waiters:
            cb(cls.attrs, cls.other_attrs)
This will make...
from update import UpdateManager
UpdateManager.send()
work as you desire it to.
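The effect can be demonstrated without Tornado. In this sketch (illustrative names), two separate references to the class share the same waiters list, because classmethods operate on class attributes:

```python
class Registry:
    waiters = []  # class attribute: one list, shared everywhere

    @classmethod
    def add_callback(cls, cb):
        cls.waiters.append(cb)

    @classmethod
    def send(cls, value):
        # Invoke every registered callback with the value
        return [cb(value) for cb in cls.waiters]

Registry.add_callback(lambda v: v * 2)
alias = Registry  # e.g. the name imported in another module
print(alias.send(21))  # [42] -- same class object, same waiters
```

Because a module is imported only once per process, every `from update import UpdateManager` yields the same class object, which is what makes this act as a singleton.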

Waiting on events in other requests in Twisted

I have a simple Twisted server that handles requests like this (obviously, asynchronously)
global SomeSharedMemory
if SomeSharedMemory is None:
    SomeSharedMemory = LoadSharedMemory()
return PickSomething(SomeSharedMemory)
Where SomeSharedMemory is loaded from a database.
I want to avoid loading SomeSharedMemory from the database multiple times. Specifically, when the server first starts, and we get two concurrent incoming requests, we might see something like this:
Request 1: Check for SomeSharedMemory, don't find it
Request 1: Issue database query to load SSM
Request 2: Check for SSM, don't find it
Request 2: Issue database query to load SSM
Request 1: Query returns, store SSM
Request 1: Return result
Request 2: Query returns, store SSM
Request 2: Return result
With more concurrent requests, the database gets hammered. I'd like to do something like this (see http://docs.python.org/library/threading.html#event-objects):
global SomeSharedMemory, SSMEvent
if SomeSharedMemory is None:
    if not SSMEvent.isSet():
        SSMEvent.wait()
    else:
        # assumes that the event is initialized "set"
        SSMEvent.clear()
        SomeSharedMemory = LoadSharedMemory()
        SSMEvent.set()
return PickSomething(SomeSharedMemory)
Such that if one request is loading the shared memory, other requests will wait politely until the query is complete rather than issue their own duplicate database queries.
Is this possible in Twisted?
The way your example is set up, it's hard to see how you could actually have the problem you're describing. If a second request comes in to your Twisted server before the call to LoadSharedMemory issued by the first has returned, then the second request will just wait before being processed. When it is finally handled, SomeSharedMemory will be initialized and there will be no duplication.
However, I suppose maybe it is the case that LoadSharedMemory is asynchronous and returns a Deferred, so that your code really looks more like this:
def handleRequest(request):
    if SomeSharedMemory is None:
        d = initSharedMemory()
        d.addCallback(lambda ignored: handleRequest(request))
    else:
        d = PickSomething(SomeSharedMemory)
    return d
In this case, it's entirely possible that a second request might arrive while initSharedMemory is off doing its thing. Then you would indeed end up with two tasks trying to initialize that state.
The thing to do, of course, is to notice the third state that you have. There is not only un-initialized and initialized, but also initializing. So represent that state as well. I'll hide it inside the initSharedMemory function to keep the request handler as simple as it already is:
initInProgress = None

def initSharedMemory():
    global initInProgress
    if initInProgress is None:
        initInProgress = _reallyInit()
        def initialized(result):
            global initInProgress, SomeSharedMemory
            initInProgress = None
            SomeSharedMemory = result
            return result
        initInProgress.addCallback(initialized)
    d = Deferred()
    initInProgress.chainDeferred(d)
    return d
This is a little gross because of the globals everywhere. Here's a slightly cleaner version:
from twisted.internet.defer import Deferred, succeed

class SharedResource(object):
    def __init__(self, initializer):
        self._initializer = initializer
        self._value = None
        self._state = "UNINITIALIZED"
        self._waiting = []

    def get(self):
        if self._state == "INITIALIZED":
            # Return the already computed value
            return succeed(self._value)
        # Create a Deferred for the caller to wait on
        d = Deferred()
        self._waiting.append(d)
        if self._state == "UNINITIALIZED":
            # Once, run the setup
            self._state = "INITIALIZING"
            self._initializer().addCallback(self._initialized)
        # Initialized or initializing state here
        return d

    def _initialized(self, value):
        # Save the value, transition to the new state, and tell
        # all the previous callers of get what the result is.
        self._value = value
        self._state = "INITIALIZED"
        waiting, self._waiting = self._waiting, None
        for d in waiting:
            d.callback(value)

SomeSharedMemory = SharedResource(initializeSharedMemory)

def handleRequest(request):
    return SomeSharedMemory.get().addCallback(PickSomething)
Three states, nice explicit transitions between them, no global state to update (at least if you give SomeSharedMemory some non-global scope), and handleRequest doesn't know about any of this, it just asks for a value and then uses it.
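The same idea can be sketched without Twisted using concurrent.futures. This single-threaded illustration (the names are mine, not from the answer above) collapses the three states into "no Future yet" versus "Future exists", and shows that the expensive initializer runs only once no matter how many callers ask:

```python
from concurrent.futures import Future

class SharedResource:
    def __init__(self, initializer):
        self._initializer = initializer
        self._future = None  # None means UNINITIALIZED

    def get(self):
        if self._future is None:
            # First caller creates the Future and runs the setup;
            # later callers reuse the same Future.
            self._future = Future()
            self._future.set_result(self._initializer())
        return self._future

calls = []
def expensive():
    calls.append(1)  # count how many times we really initialize
    return 42

resource = SharedResource(expensive)
a = resource.get().result()
b = resource.get().result()
print(a, b, len(calls))  # 42 42 1
```

In a truly concurrent setting the explicit INITIALIZING state (or a lock around the first `get`) matters, because the initializer may still be running when the second caller arrives; the Deferred version above handles exactly that window.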
