I have some Flask application. It works with some database, I'm using SQLAlchemy for this. So I have one question:
Flask handle requests one-by-one. So, for example, I have two users, which are modifying the same record in the table of database, for example A and B (they are concurrent).
How can I say to user B that user A has changed this record? It must be some message to user B.
In the development server version, when you do app.run(), you get a single synchronous process, which means at most 1 requests being processed at a time. So you cannot accept multiple users at the same time.
However, gunicorn is a solid, easy-to-use WSGI server that will let you spawn multiple workers (separate processes), and even comes with asynchronous workers when you need to deploy your application.
However, to answer your question, since, they run on separate threads, the data that exists in the database at the specific time when the query is run in that thread will be used/returned.
I hope this answers your query.
Related
I have a processing engine built in python and a driver program with several components that uses this engine to process some files.
See the pictorial representation here.
The Engine is used for math calculations.
The Driver program has several components.
Scanner keeps scanning a folder to check for new files, if found makes entry into DB by calling a API.
Scheduler picks new entries made by scanner and schedules them for processing (makes entry into 'jobs' table in DB)
Executer picks entries from job table and executes them using the engine and outputs new files.
All the components run as separate python process continuously. This is very in efficient, how can I improve this? The use of Django is to provide a DB (so the multiple processes could communicate) and keep a record of how many files are processed.
Next came a new requirement to manually check the processed files for errors so a UI was developed for this. Also the assess to the engine was to be made API based. See the new block diagram here
Now the entire thing is a huge mess in my opinion. For start, the Django now has to serve 2 different sets of API - one for the UI and other for the driver program. If the server stops, the UI stops working and also the Driver program stops working.
Since the engine is API based there is a huge amount of data passed to it in the request. The Engine takes several minutes (3 to 4) to process the files and most of the time the request to engine get timeout. The Driver program is started separately from terminal and it fails if Django server is not running as the DB APIs are required to schedule jobs and execute the jobs.
I want to ask what is the beast way to structure such projects.
should I keep the Engine and driver program logic inside Django? In this case how do I start the driver program?
Should I keep both of them outside Django, in which case how do I communicate with Django such that even if the Django server is down I can still keep processing the files.
I would really appreciate any sort of improvement ideas in any of the areas.
I'm trying to build a recommendation system with python using lightfm library and an api created with Flask framework.
My question is more design related than coding.
The webservice which will be called when a user logs in the website, recieves a json with userid and return a json with userid and 5 product sku to be recommended.
My desire is to save those recommendations in a DB. I want to do that because in this way I can see and comparing this table with other tables in DB and find out if a user has purchased the product that I recommended.
My concern (maybe it's stupid) is that everything will slow down if I open a connection to DB and write data in it.
Potentially the service can be called between 5k to 7k times per day.
Thanks
What I've understood from your explanation is that you will be comparing the actual selected data by the user and the ones you recommended. So, considering you are comparing every week once, it won't affect much of your processing.
Your concern is, would everything slow down if a DB connection is opened?
It won't slow down the service. Considering the usage of service of 5k times per day, other major factors are there which will slow the service down or will cause it to stop. Like when the number of users is too high, one python process will fail.
What you need to do here is, use a web application server like Gunicorn or uwsgi Using Gunicorn with Flask
This way, what gunicorn does is it starts multiple python processes running flask so it will support a high number of concurrent users.
I am creating a Python application that uses embedded SQLite databases. The programme creates the db files and they are on a shared network drive. At this point there will be no more than 5 computers on the network running the programme.
My initial thought was to ask the user on startup if they are the server or client. If they are the server then they create the database. If they are the client they must find a server instance on the network. The one way I suppose is to send all db commands from client to server and server implements in the database. Will that solve the shared db issue?
Alternatively, is there some way to create a SQLite "server". I presume this would be the quicker option if available?
Note: I can't use a server engine such as MySQL or PostgreSQL at this point but I am implementing using ORM and so when this becomes viable, it should be easy to change over.
Here's a "SQLite Server", http://sqliteserver.xhost.ro/, but it looks like not in maintain for years.
SQLite supports concurrency itself, multiple processes can read data at one time and only one can write data into it. Also, When some process is writing, it'll lock the whole database file for a few seconds and others have to wait in the mean time according official document.
I guess this is sufficient for 5 processes as yor scenario. Just you need to write codes to handle the waiting.
I am currently trying to develop something using Google AppEngine, I am using Python as my runtime and require some advise on setting up the following.
I am running a webserver that provides JSON data to clients, The data comes from an external service in which I have to pull the data from.
What I need to be able to do is run a background system that will check the memcache to see if there are any required ID's, if there is an ID I need to fetch some data for that ID from the external source and place the data in the memecache.
If there are multiple id's, > 30 I need to be able to pull all 30 request as quickly and efficiently as possible.
I am new to Python Development and AppEngine so any advise you guys could give would be great.
Thanks.
You can use "backends" or "task queues" to run processes in the background. Tasks have a 10-minute run time limit, and backends have no run time limit. There's also a cronjob mechanism which can trigger requests at regular intervals.
You can fetch the data from external servers with the "URLFetch" service.
Note that using memcache as the communication mechanism between front-end and back-end is unreliable -- the contents of memcache may be partially or fully erased at any time (and it does happen from time to time).
Also note that you can't query memcache of you don't know the exact keys ahead of time. It's probably better to use the task queue to queue up requests instead of using memcache, or using the datastore as a storage mechanism.
I have a normal Django site running. In addition, there is another twisted process, which listens for Jabber presence notifications and updates the Django DB using Django's ORM.
So far it works as I just call the corresponding Django models (after having set up the settings environment correctly). This, however, blocks the Twisted app, which is not what I want.
As I'm new to twisted I don't know, what the best way would be to access the Django DB (via its ORM) in a non-blocking way using deferreds.
deferredGenerator ?
twisted.enterprise.adbapi ? (circumvent the ORM?)
???
If the presence message is parsed I want to save in the Django DB that the user with jid_str is online/offline (using the Django model UserProfile). I do it with that function:
def django_useravailable(jid_str, user_available):
try:
userhost = jid.JID(jid_str).userhost()
user = UserProfile.objects.get(im_jabber_name=userhost)
user.im_jabber_online = user_available
user.save()
return jid_str, user_available
except Exception, e:
print e
raise jid_str, user_available,e
Currently, I invoke it with:
d = threads.deferToThread(django_useravailable, from_attr, user_available)
d.addCallback(self.success)
d.addErrback(self.failure)
"I have a normal Django site running."
Presumably under Apache using mod_wsgi or similar.
If you're using mod_wsgi embedded in Apache, note that Apache is multi-threaded and your Python threads are mashed into Apache's threading. Analysis of what's blocking could get icky.
If you're using mod_wsgi in daemon mode (which you should be) then your Django is a separate process.
Why not continue this design pattern and make your "jabber listener" a separate process.
If you'd like this process to be run any any of a number of servers, then have it be started from init.rc or cron.
Because it's a separate process it will not compete for attention. Your Django process runs quickly and your Jabber listener runs independently.
I have been successful using the method you described as your current method. You'll find by reading the docs that the twisted DB api uses threads under the hood because most SQL libraries have a blocking API.
I have a twisted server that saves data from power monitors in the field, and it does it by starting up a subthread every now and again and calling my Django save code. You can read more about my live data collection pipeline (that's a blog link).
Are you saying that you are starting up a sub thread and that is still blocking?
I have a running Twisted app where I use Django ORM. I'm not deferring it. I know it's wrong, but hadd no problems yet.