How to defer a Django DB operation from within Twisted? - python

I have a normal Django site running. In addition, there is another twisted process, which listens for Jabber presence notifications and updates the Django DB using Django's ORM.
So far it works as I just call the corresponding Django models (after having set up the settings environment correctly). This, however, blocks the Twisted app, which is not what I want.
As I'm new to twisted I don't know, what the best way would be to access the Django DB (via its ORM) in a non-blocking way using deferreds.
deferredGenerator ?
twisted.enterprise.adbapi ? (circumvent the ORM?)
???
If the presence message is parsed I want to save in the Django DB that the user with jid_str is online/offline (using the Django model UserProfile). I do it with that function:
def django_useravailable(jid_str, user_available):
try:
userhost = jid.JID(jid_str).userhost()
user = UserProfile.objects.get(im_jabber_name=userhost)
user.im_jabber_online = user_available
user.save()
return jid_str, user_available
except Exception, e:
print e
raise jid_str, user_available,e
Currently, I invoke it with:
d = threads.deferToThread(django_useravailable, from_attr, user_available)
d.addCallback(self.success)
d.addErrback(self.failure)

"I have a normal Django site running."
Presumably under Apache using mod_wsgi or similar.
If you're using mod_wsgi embedded in Apache, note that Apache is multi-threaded and your Python threads are mashed into Apache's threading. Analysis of what's blocking could get icky.
If you're using mod_wsgi in daemon mode (which you should be) then your Django is a separate process.
Why not continue this design pattern and make your "jabber listener" a separate process.
If you'd like this process to be run any any of a number of servers, then have it be started from init.rc or cron.
Because it's a separate process it will not compete for attention. Your Django process runs quickly and your Jabber listener runs independently.

I have been successful using the method you described as your current method. You'll find by reading the docs that the twisted DB api uses threads under the hood because most SQL libraries have a blocking API.
I have a twisted server that saves data from power monitors in the field, and it does it by starting up a subthread every now and again and calling my Django save code. You can read more about my live data collection pipeline (that's a blog link).
Are you saying that you are starting up a sub thread and that is still blocking?

I have a running Twisted app where I use Django ORM. I'm not deferring it. I know it's wrong, but hadd no problems yet.

Related

Concurrent requests Flask python

I have some Flask application. It works with some database, I'm using SQLAlchemy for this. So I have one question:
Flask handle requests one-by-one. So, for example, I have two users, which are modifying the same record in the table of database, for example A and B (they are concurrent).
How can I say to user B that user A has changed this record? It must be some message to user B.
In the development server version, when you do app.run(), you get a single synchronous process, which means at most 1 requests being processed at a time. So you cannot accept multiple users at the same time.
However, gunicorn is a solid, easy-to-use WSGI server that will let you spawn multiple workers (separate processes), and even comes with asynchronous workers when you need to deploy your application.
However, to answer your question, since, they run on separate threads, the data that exists in the database at the specific time when the query is run in that thread will be used/returned.
I hope this answers your query.

Dynamic pages with Django & Celery

I have a Celery task registered in my tasks.py file. When someone POST to /run/pk I run the task with the given parameters. This task also executes other tasks (normal Python functions), and I'd like to update my page (the HttpResponse returned at /run/pk) whenever a subtask finishes its work.
Here is my task:
from celery.decorators import task
#task
def run(project, branch=None):
if branch is None:
branch = project.branch
print 'Creating the virtualenv'
create_virtualenv(project, branch)
print 'Virtualenv created' ##### Here I want to send a signal or something to update my page
runner = runner(project, branch)
print 'Using {0}'.format(runner)
try:
result, output = runner.run()
except Exception as e:
print 'Error: {0}'.format(e)
return False
print 'Finished'
run = Run(project=project, branch=branch,
output=output, **result._asdict())
run.save()
return True
Sending push notifications to the client's browser using Django isn't easy, unfortunately. The simplest implementation is to have the client continuously poll the server for updates, but that increases the amount of work your server has to do by a lot. Here's a better explanation of your different options:
Django Push HTTP Response to users
If you weren't using Django, you'd use websockets for these notifications. However Django isn't built for using websockets. Here is a good explanation of why this is, and some suggestions for how to go about using websockets:
Making moves w/ websockets and python / django ( / twisted? )
With many years past since this question was asked, Channels is a way you could now achieve this using Django.
Then Channels website describes itself as a "project to make Django able to handle more than just plain HTTP requests, including WebSockets and HTTP2, as well as the ability to run code after a response has been sent for things like thumbnailing or background calculation."
There is a service called Pusher that will take care of all the messy parts of Push Notifications in HTML5. They supply a client-side and server-library to handle all the messaging and notifications, while taking care of all the HTML5 Websocket nuances.

Websocket Server with twisted and Python doing complex jobs in the background

I want to code a Server which handles Websocket Clients while doing mysql selects via sqlalchemy and scraping several Websites on the same time (scrapy). The received data has to be calculated, saved to the db and then send to the websocket Clients.
My question ist how can this be done in Python from the logical point of view. How do I need to set up the code structure and what modules are the best solution for this job? At the moment I'm convinced of using twisted with threads in which the scrape and select stuff is running. But can this be done an easier way? I only find simple twisted examples but obviously this seems to be a more complex job. Are there similar examples? How do I start?
Cyclone, a Twisted-based 'network toolkit', based on/similar to facebook/friendfeed's Tornado server, contains support for WebSockets: https://github.com/fiorix/cyclone/blob/master/cyclone/web.py#L908
Here's example code:
https://github.com/fiorix/cyclone/blob/master/demos/websocket/websocket.tac
Here's an example of using txwebsocket:
http://www.saltycrane.com/blog/2010/05/quick-notes-trying-twisted-websocket-branch-example/
You may have a problem using SQLAlchemy with Twisted; from what I have read, they do not work well together (source). Are you married to SQLA, or would another, more compatible OR/M suffice?
Some twisted-friendly OR/Ms include Storm (a fork) and Twistar, and you can always fall back on Twisted's core db abstraction library twisted.enterprise.adbapi.
There are also async-friendly db libraries for other products, such as txMySQL, txMongo, and txRedis, and paisley (couchdb).
You could conceivably use both Cyclone (or txwebsockets) and Scrapy as child services of the same MultiService, running on different ports, but packaged within the same Application instance. The services may communicate, either through the parent service or some RPC mechanism (like JSONRPC, Perspective Broker, AMP, XML-RPC (2) etc), or you can just write to the db from the scrapy service and read from it using websockets. Redis would be great for this IMO.
Ideally you'll want to avoid writing your own WebSockets server, but since you're running Twisted, you might not be able to do that: there are several WebSockets implementations (see this search on PyPI). Unfortunately none of them are Twisted-based [Edit see #JP-Calderone's comment below.]
Twisted should drive the master server, so you probably want to begin with writing something that can be run via twistd (see here if your'e new to this). The WebSocket implementation mentioned by #JP-Calderone and Scrapy are both Twisted -based so they should be reasonable trivial to drive from your master Twisted-based server. SQLAlchemy will be more difficult, I've commented on this before in this question.

Keeping concurrency in web.py applications on mod_wsgi

Sorry if this makes no sense. Please comment if clarification is needed.
I'm writing a small file upload app in web.py which I am deploying using mod_wsgi + apache. I have been having a problem with my session management and would like clarification on how the threading works in web.py.
Essentially I embed a code in a hidden field of the html page I render when someone accesses my page. The file upload is then done via a standard POST request containing both the file and the code. Then I retrieve the progress of the file by updating it in the file upload POST method and grabbing it with a GET request to a different class. The 'session' (apologies for it being fairly naive) is stored in a session object like this:
class session:
def __init__(self):
self.progress = 0
self.title = ""
self.finished = False
def advance(self):
self.progress = self.progress + 1
The sessions are all kept in a global dictionary within my app script and then accessed with my code (from earlier) as the key.
For some reason my progress seems to stay at 0 and never increments. I've been debugging for a couple hours now and I've found that the two session objects referenced from the upload class and the progress class are not the same. The two codes, however, are (as far as I can tell) equal. This is driving me mad as it worked without any problems on the web.py test server on my local machine.
EDIT: After some research it seems that the dictionary may get copied for every request. I've tried putting the dictionary in another and importing but this doesn't work. Is there some other way short of using a database to 'seperate' the sessions dictionary?
Apache/mod_wsgi can run in multiprocess configurations and possible your requests aren't even being serviced by the same process and never will if for that multiprocess configuration each process is single thread because while the upload is occuring no other requests can be handled by that same process. Read:
http://code.google.com/p/modwsgi/wiki/ProcessesAndThreading
Possibly you should use mod_wsgi daemon mode with single multiple thread daemon process.
From PEP 333, defining WSGI:
Servers that can run multiple requests in parallel, should also provide the option of running an application in a single-threaded fashion, so that applications or frameworks that are not thread-safe may still be used with that server
Check the documentation of your WSGI server.

Python, Twisted, Django, reactor.run() causing problem

I have a Django web application. I also have a spell server written using twisted running on the same machine having django (running on localhost:8090). The idea being when user does some action, request comes to Django which in turn connects to this twisted server & server sends data back to Django. Finally Django puts this data in some html template & serves it back to the user.
Here's where I am having a problem. In my Django app, when the request comes in I create a simple twisted client to connect to the locally run twisted server.
...
factory = Spell_Factory(query)
reactor.connectTCP(AS_SERVER_HOST, AS_SERVER_PORT, factory)
reactor.run(installSignalHandlers=0)
print factory.results
...
The reactor.run() is causing a problem. Since it's an event loop. The next time this same code is executed by Django, I am unable to connect to the server. How does one handle this?
The above two answers are correct. However, considering that you've already implemented a spelling server then run it as one. You can start by running it on the same machine as a separate process - at localhost:PORT. Right now it seems you have a very simple binary protocol interface already - you can implement an equally simple Python client using the standard lib's socket interface in blocking mode.
However, I suggest playing around with twisted.web and expose a simple web interface. You can use JSON to serialize and deserialize data - which is well supported by Django. Here's a very quick example:
import json
from twisted.web import server, resource
from twisted.python import log
class Root(resource.Resource):
def getChild(self, path, request):
# represents / on your web interface
return self
class WebInterface(resource.Resource):
isLeaf = True
def render_GET(self, request):
log.msg('GOT a GET request.')
# read request.args if you need to process query args
# ... call some internal service and get output ...
return json.dumps(output)
class SpellingSite(server.Site):
def __init__(self, *args, **kwargs):
self.root = Root()
server.Site.__init__(self, self.root, **kwargs)
self.root.putChild('spell', WebInterface())
And to run it you can use the following skeleton .tac file:
from twisted.application import service, internet
site = SpellingSite()
application = service.Application('WebSpell')
# attach the service to its parent application
service_collection = service.IServiceCollection(application)
internet.TCPServer(PORT, site).setServiceParent(service_collection)
Running your service as another first class service allows you to run it on another machine one day if you find the need - exposing a web interface makes it easy to horizontally scale it behind a reverse proxying load balancer too.
reactor.run() should be called only once in your whole program. Don't think of it as "start this one request I have", think of it as "start all of Twisted".
Running the reactor in a background thread is one way to get around this; then your django application can use blockingCallFromThread in your Django application and use a Twisted API as you would any blocking API. You will need a little bit of cooperation from your WSGI container, though, because you will need to make sure that this background Twisted thread is started and stopped at appropriate times (when your interpreter is initialized and torn down, respectively).
You could also use Twisted as your WSGI container, and then you don't need to start or stop anything special; blockingCallFromThread will just work immediately. See the command-line help for twistd web --wsgi.
You should stop reactor after you got results from Twisted server or some error/timeout happening. So on each Django request that requires query your Twisted server you should run reactor and then stop it. But, it's not supported by Twisted library — reactor is not restartable. Possible solutions:
Use separate thread for Twisted reactor, but you will need to deploy your django app with server, which has support for long running threads (I don't now any of these, but you can write your own easily :-)).
Don't use Twisted for implementing client protocol, just use plain stdlib's socket module.

Categories