Why does Openstack Swift requests are blocked by eventlet.green.httplib?

Why does Openstack Swift requests are blocked by eventlet.green.httplib? - python

Openstack-Swift is using evenlet.green.httplib for BufferedHttpconnections.
When I do performance benchmark of it for write operations, I could observer that write throughput drops even only one replica node is overloaded.
As I know write quorum is 2 out of 3 replicas, therefore overloading only one replica cannot affect for the throughput.
When I dig deeper what I observed was, the consequent requests are blocked until the responses are reached for the previous requests. Its mainly because of the BufferedHttpConnection which stops issuing new request until the previous response is read.
Why Openstack-swift use such a method?
Is this the usual behaviour of evenlet.green.httplib.HttpConnection?
This does not make sense in write quorum point of view, because its like waiting for all the responses not a quorum.
Any ideas, any workaround to stop this behaviour using the same library?

Its not a problem of the library but a limitation due to the Openstack Swift configuration where the "Workers" configuration in all Account/Container/Object config of Openstack Swift was set to 1
Regarding the library
When new connections are made using evenlet.green.httplib.HttpConnection
it does not block.
But if requests are using the same connection, subsequent requests are blocked until the response is fully read.

Related

API Design Questions using Django for OS tasks (REST vs RPC)

Background:
I have an application that is supposed to automate some infrastructure & OS-heavy tasks that happen on a network file system (for example: mounting volumes, shutting down / bringing up servers, creating directories, moving data around, ssh-ing etc). Ultimately there are a lot of OS-level commands that need to be run in a sequence for each action. Our consumer/client likely does not know this sequence, but knows "I want to do X task".
Tech stack: Python/Django
I have been tasked with setting this application up but am perplexed on the best way to approach for modularity from the API standpoint & just overall design. Currently, we have a similar application that is a SOAP-style (rpc) but the way it is written is not very modular. Like for example, one function will have a ton of random hardcoded subprocess commands - not the approach I want to emulate here.
Initially I was leaning more towards REST API since Django has a nice django rest framework plug-in, but am having trouble modelling these very action-oriented tasks. The more I read other things online, the more I come to believe I really need to think of every little action as a resource with the client having to GET/POST/PUT to each of these to keep things very modular but when I boil that down further it looks like I may need to set up 15+ endpoints for each situation needed and the client likely isn't going to want to call all 15 endpoints to get their singular behavior they want. That being said - moving to rpc so users can have one endpoint that 'moves the moon on a single call' might not be the best approach either.
I think one of the issues I see is our application is doing a lot of work on a file system, not all contained within our application's database. I reckon that's kind of a central point of this application, but I have trouble modelling things that require file system actions outside our application's database.
Question 1:
One example action that our client might want to call would be responsible for ssh-ing to a remote server and running a command. How might you model this in REST?
Question 2:
How do you all model file system actions in your applications?
Question 3:
After reviewing the above, does RPC seem like the better option?
Other:
Any other help or feedback (even in generally is much appreciated).

REST is similar to SOAP in a sense that you call operations in SOAP and REST just maps those operations to web resources and HTTP methods.
For example
z DoSSHStuffOnARemoteServer(x=1,y=2)
vs
POST /RemoteServer/SSHStuff {x:1,y:2}
If it timeouts, because it takes a lot of time, then you can do
202 accepted
{type: "transaction": href: "/RemoteServer/SSHStuff/123", status: "pending"}
and poll it in every 5-10 mins or use websockets to update the status. After it is done:
200 ok
{type: "transaction": href: "/RemoteServer/SSHStuff/123", status: "done", result: {z:3}}
So there is no magic. Just keep in mind that REST is in the presentation layer of your application, it returns view models and the entire structure is connected to the application services if you do DDD. It should not reflect the database structure unless you have an anemic domain model, or others call it thick client. Normally I would not say anything about RemoteServer/SSHStuff, just tell the client what will be done and stay silent about how it will be done. They don't need to know anything about how you store data or how many servers you have with what protocols and applications. It should not be their concern. The only thing they need to know what will be done, how long it takes to respond and what will be the response. The other part is irrelevant to them and it is a security risk if you share too much of it. When we design an interface like an interface for a REST service or just an OOP interface we always do it to hide implementation details. I hope that helps, have a nice day!

how to add rate limiting on tornado python app

would it be possible to implement a rate limiting feature on my tornado app? like limit the number of HTTP request from a specific client if they are identified to send too many requests per second (which red flags them as bots).
I think I could it manually by storing the requests on a database and analyzing the requests per IP address but I was just checking if there is already an existing solution for this feature.
I tried checking the github page of tornado, I have the same questions as this post but no explicit answer was provided. checked tornado's wiki links as well but I think rate limiting is not handled yet.

Instead of storing them in the DB, would be better to have them in a dictionary stored in memory for easy usability.
Also can you share the details whether the api has a load-balancer and which web-server is used.

The enterprise grade solution to your problem is ambassador.
You can use ambassador's solutions like envoy proxy and edge stack and have it set up that can do the needful.
additional to tore the data, you can use any popular cached db, or d that store as key:value pairs, for example redis.
if you doing this for a very small project, can use some npm/pip packages.
Read the docs: https://www.getambassador.io/products/edge-stack/api-gateway/

You should probably do this before your requests reach Tornado.
But if it's an application level feature (limiting requests depending on level of subscription), then you can do it in Tornado in lots of ways, depending on how complex you want the rate limiting to be.
Probably the simplest way is to have a dict on your tornado.web.Application that uses the ip as the key and the timestamp of the last request as the value and check every request against it in prepare- if not enough time passed since last request, raise a tornado.web.HTTPError(429) (ideally with a Retry-After header). If you do this, you will still need to clean up this dict now and then to remove entries that have not made a request recently or it will grow (you could do it finish on every request).
If you have another fast/in-memory storage attached (memcache, redis, sqlite), you should use that, but you definitely should not use an RDBMS as all those writes will not be great for its performance.

Google AppEngine and Threaded Workers

I am currently trying to develop something using Google AppEngine, I am using Python as my runtime and require some advise on setting up the following.
I am running a webserver that provides JSON data to clients, The data comes from an external service in which I have to pull the data from.
What I need to be able to do is run a background system that will check the memcache to see if there are any required ID's, if there is an ID I need to fetch some data for that ID from the external source and place the data in the memecache.
If there are multiple id's, > 30 I need to be able to pull all 30 request as quickly and efficiently as possible.
I am new to Python Development and AppEngine so any advise you guys could give would be great.
Thanks.

You can use "backends" or "task queues" to run processes in the background. Tasks have a 10-minute run time limit, and backends have no run time limit. There's also a cronjob mechanism which can trigger requests at regular intervals.
You can fetch the data from external servers with the "URLFetch" service.

Note that using memcache as the communication mechanism between front-end and back-end is unreliable -- the contents of memcache may be partially or fully erased at any time (and it does happen from time to time).
Also note that you can't query memcache of you don't know the exact keys ahead of time. It's probably better to use the task queue to queue up requests instead of using memcache, or using the datastore as a storage mechanism.

HEAD request vs. GET request

I always had the idea that doing a HEAD request instead of a GET request was faster (no matter the size of the resource) and therefore had it advantages in certain solutions.
However, while making a HEAD request in Python (to a 5+ MB dynamic generated resource) I realized that it took the same time as making a GET request (almost 27 seconds instead of the 'less than 2 seconds' I was hoping for).
Used some urllib2 solutions to make a HEAD request found here and even used pycurl (setting headers and nobody to True). Both of them took the same time.
Am I missing something conceptually? is it possible, using Python, to do a 'quick' HEAD request?

The server is taking the bulk of the time, not your requester or the network. If it's a dynamic resource, it's likely that the server doesn't know all the header information - in particular, Content-Length - until it's built it. So it has to build the whole thing whether you're doing HEAD or GET.

The response time is dominated by the server, not by your request. The HEAD request returns less data (just the headers) so conceptually it should be faster, but in practice, many static resources are cached so there is almost no measureable difference (just the time for the additional packets to come down the wire).

Chances are, the bulk of that request time is actually whatever process generates the 5+MB response on the server rather than the time to transfer it to you.
In many cases, a web application will still execute the full script when responding to a HEAD request--it just won't send the full body back to the requester.
If you have access to the code that is processing that request, you may be able to add a condition in there to make it handle the request differently depending on the the method, which could speed it up dramatically.

Is it more efficient to parse external XML or to hit the database?

I was wondering when dealing with a web service API that returns XML, whether it's better (faster) to just call the external service each time and parse the XML (using ElementTree) for display on your site or to save the records into the database (after parsing it once or however many times you need to each day) and make database calls instead for that same information.

First off -- measure. Don't just assume that one is better or worse than the other.
Second, if you really don't want to measure, I'd guess the database is a bit faster (assuming the database is relatively local compared to the web service). Network latency usually is more than parse time unless we're talking a really complex database or really complex XML.

Everyone is being very polite in answering this question: "it depends"... "you should test"... and so forth.
True, the question does not go into great detail about the application and network topographies involved, but if the question is even being asked, then it's likely a) the DB is "local" to the application (on the same subnet, or the same machine, or in memory), and b) the webservice is not. After all, the OP uses the phrases "external service" and "display on your own site." The phrase "parsing it once or however many times you need to each day" also suggests a set of data that doesn't exactly change every second.
The classic SOA myth is that the network is always available; going a step further, I'd say it's a myth that the network is always available with low latency. Unless your own internal systems are crap, sending an HTTP query across the Internet will always be slower than a query to a local DB or DB cluster. There are any number of reasons for this: number of hops to the remote server, outage or degradation issues that you can't control on the remote end, and the internal processing time for the remote web service application to analyze your request, hit its own persistence backend (aka DB), and return a result.
Fire up your app. Do some latency and response times to your DB. Now do the same to a remote web service. Unless your DB is also across the Internet, you'll notice a huge difference.
It's not at all hard for a competent technologist to scale a DB, or for you to completely remove the DB from caching using memcached and other paradigms; the latency between servers sitting near each other in the datacentre is monumentally less than between machines over the Internet (and more secure, to boot). Even if achieving this scale requires some thought, it's under your control, unlike a remote web service whose scaling and latency are totally opaque to you. I, for one, would not be too happy with the idea that the availability and responsiveness of my site are based on someone else entirely.
Finally, what happens if the remote web service is unavailable? Imagine a world where every request to your site involves a request over the Internet to some other site. What happens if that other site is unavailable? Do your users watch a spinning cursor of death for several hours? Do they enjoy an Error 500 while your site borks on this unexpected external dependency?
If you find yourself adopting an architecture whose fundamental features depend on a remote Internet call for every request, think very carefully about your application before deciding if you can live with the consequences.

Consuming the webservices is more efficient because there are a lot more things you can do to scale your webservices and webserver (via caching, etc.). By consuming the middle layer, you also have the options to change the returned data format (e.g. you can decide to use JSON rather than XML). Scaling database is much harder (involving replication, etc.) so in general, reduce hits on DB if you can.

There is not enough information to be able to say for sure in the general case. Why don't you do some tests and find out? Since it sounds like you are using python you will probably want to use the timeit module.
Some things that could effect the result:
Performance of the web service you are using
Reliability of the web service you are using
Distance between servers
Amount of data being returned
I would guess that if it is cacheable, that a cached version of the data will be faster, but that does not necessarily mean using a local RDBMS, it might mean something like memcached or an in memory cache in your application.

It depends - who is calling the web service? Is the web service called every time the user hits the page? If that's the case I'd recommend introducing a caching layer of some sort - many web service API's throttle the amount of hits you can make per hour.
Whether you choose to parse the cached XML on the fly or call the data from a database probably won't matter (unless we are talking enterprise scaling here). Personally, I'd much rather make a simple SQL call than write a DOM Parser (which is much more prone to exceptional scenarios).

It depends from case to case, you'll have to measure (or at least make an educated guess).
You'll have to consider several things.
Web service
it might hit database itself
it can be cached
it will introduce network latency and might be unreliable
or it could be in local network and faster than accessing even local disk
DB
might be slow since it needs to access disk (although databases have internal caches, but those are usually not targeted)
should be reliable
Technology itself doesn't mean much in terms of speed - in one case database parses SQL, in other XML parser parses XML, and database is usually acessed via socket as well, so you have both parsing and network in either case.
Caching data in your application if applicable is probably a good idea.

As a few people have said, it depends, and you should test it.
Often external services are slow, and caching them locally (in a database in memory, e.g., with memcached) is faster. But perhaps not.
Fortunately, it's cheap and easy to test.

Test definitely. As a rule of thumb, XML is good for communicating between apps, but once you have the data inside of your app, everything should go into a database table. This may not apply in all cases, but 95% of the time it has for me. Anytime I ever tried to store data any other way (ex. XML in a content management system) I ended up wishing I would have just used good old sprocs and sql server.

It sounds like you essentially want to cache results, and are wondering if it's worth it. But if so, I would NOT use a database (I assume you are thinking of a relational DB): RDBMSs are not good for caching; even though many use them. You don't need persistence nor ACID.
If choice was between Oracle/MySQL and external web service, I would start with just using service.
Instead, consider real caching systems; local or not (memcache, simple in-memory caches etc).
Or if you must use a DB, use key/value store, BDB works well. Store response message in its serialized form (XML), try to fetch from cache, if not, from service, parse. Or if there's a convenient and more compact serialization, store and fetch that.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.