Azure, Amazon, and other instance-based cloud providers can be used to carry out website load tests (by spinning up numerous instances running programs that send requests to a set of URLs), and I was wondering whether I could do the same with Google App Engine.
So far, however, it seems this is not possible. The only approach I can think of at the moment is setting up the maximum number of cron jobs, each executing at the highest frequency, with each task requesting a bunch of URLs while also pushing further tasks onto the task queue.
According to my calculations this is only enough to fire off a maximum of 25 concurrent requests (an application can have at most 20 cron tasks, each executing no more frequently than once a minute, and the default queue has a throughput rate of 5 task invocations per second).
Any ideas if there is a way I could have more concurrent requests fetching URLs in an automated way?
The taskqueue API allows 100 task invocations per second per queue with the following max active queues quota:
Free: 10 active queues (not including the default queue)
Billing: 100 active queues (not including the default queue)
With a single UrlFetch per task, multiplying [max number of active queues] * [max number of task invocations per second] * [60 seconds] gives these nominal UrlFetch call rates:
Free: 11 * 100 * 60 = 66,000 UrlFetch calls/minute
Billing: 101 * 100 * 60 = 606,000 UrlFetch calls/minute
These rates are capped by the UrlFetch calls-per-minute quota:
Free: 3,000 calls/minute
Billing: 32,000 calls/minute
As you can see, the Taskqueue and UrlFetch APIs can be used effectively to suit your load-testing needs.
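As a minimal sketch of that fan-out (assuming a Python webapp2 app; the queue name, handler paths, and target URLs here are illustrative, not from the question):
import webapp2
from google.appengine.api import taskqueue, urlfetch

TARGET_URLS = ["http://example.com/page1", "http://example.com/page2"]  # illustrative

class EnqueueHandler(webapp2.RequestHandler):
    # Fan out one task per target URL onto a dedicated queue.
    def get(self):
        for url in TARGET_URLS:
            taskqueue.add(url="/worker", params={"target": url},
                          queue_name="load-test")

class WorkerHandler(webapp2.RequestHandler):
    # Each task performs a single UrlFetch against its target.
    def post(self):
        target = self.request.get("target")
        result = urlfetch.fetch(target, deadline=10)
        self.response.write(str(result.status_code))

app = webapp2.WSGIApplication([("/enqueue", EnqueueHandler),
                               ("/worker", WorkerHandler)])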
Load testing against a public url may not be as accurate as getting boxes attached directly to the same switch as your target server. There are so many uncontrollable network effects.
Depending on your exact circumstances, I would recommend borrowing a few desktop boxes for the purpose and using them. Any half-decent machine should be able to generate 2-3 thousand calls a minute.
That said, it really depends on the target scale you wish to achieve.
Related
I have a Python script that will be used for automation. It runs system commands and stores the result in a database. It contains the following line:
ThreadPoolExecutor(max_workers = 3)
I don't know what value to set max_workers to. I've seen people ask what the optimal value of max_workers is. I've heard people say it depends on the machine, but they haven't elaborated further. I've also read that the default value in Python 3 is the number of processors * 5. If there is no universally optimal value, what's a good way of approaching a locally optimal solution to this problem?
Try
import os
max_workers = os.cpu_count()
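For instance, plugged straight into your line (a minimal sketch; run_command and the command list are placeholders for your actual automation logic):
import os
from concurrent.futures import ThreadPoolExecutor

def run_command(cmd):
    # Placeholder: run the system command and store the result in the database.
    pass

commands = ["cmd1", "cmd2", "cmd3"]  # illustrative

with ThreadPoolExecutor(max_workers=os.cpu_count()) as executor:
    # list() forces the map to complete and surfaces any exceptions.
    results = list(executor.map(run_command, commands))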
Brian Goetz in his famous book "Java Concurrency in Practice" recommends the following formula:
Number of threads = Number of Available Cores * (1 + Wait time / Service time)
Wait time - the time spent waiting for IO-bound tasks to complete, say waiting for an HTTP response from a remote service
(not only IO-bound tasks; it could also be time spent waiting to acquire a monitor lock, or time the thread is in the WAITING/TIMED_WAITING state)
Service time - the time spent being busy, say processing the HTTP response, marshaling/unmarshaling, any other transformations, etc.
Wait time / Service time - this ratio is often called blocking coefficient.
A computation-intensive task has a blocking coefficient close to 0, in this case, the number of threads is equal to the number of available cores. If all tasks are computation intensive, then this is all we need. Having more threads will not help.
For example:
A worker thread makes a call to a microservice, serializes the response into JSON, and executes some set of rules. The microservice's response time is 50 ms, and the processing time is 5 ms. We deploy our application to a server with a dual-core CPU:
2 * (1 + 50 / 5) = 22 // optimal thread pool size
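The same calculation as a small Python helper (a minimal sketch, using the numbers from the example above):
def pool_size(cores, wait_time_ms, service_time_ms):
    # Goetz's sizing formula: cores * (1 + wait / service).
    return int(cores * (1 + wait_time_ms / float(service_time_ms)))

print(pool_size(cores=2, wait_time_ms=50, service_time_ms=5))  # -> 22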
But this example is oversimplified. Besides an HTTP connection pool, your application may have requests from JMS and probably a JDBC connection pool.
If you have different classes of tasks it is best practice to use multiple thread pools, so each can be tuned according to its workload.
You can find the full article here.
We have a Luigi Task that requests a piece of information from a 3rd-party service. We are limited in the number of requests per minute we can make to that API.
Is there a way to specify, on a per-Task basis, how many tasks of this kind the scheduler may run per unit of time?
We implemented our own rate limiting in the task. Our API limit was low enough that we could saturate it with a single thread. When we received a rate-limit response, we just backed off and retried.
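Roughly what that looks like (a minimal sketch using the requests library; the URL, the 429 check, and the delays are illustrative assumptions, not the poster's actual code):
import time
import requests

def fetch_with_backoff(url, max_retries=5, base_delay=1.0):
    # Call the API, backing off and retrying on rate-limit responses.
    for attempt in range(max_retries):
        response = requests.get(url)
        if response.status_code != 429:  # not rate limited
            return response
        time.sleep(base_delay * (2 ** attempt))  # exponential backoff
    raise RuntimeError("still rate limited after %d retries" % max_retries)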
One thing you can do is declare the API call as a resource. You set how much of the resource is available in the config, and then how much of the resource the task consumes as a property on the task. This will then limit you to running n of that task at a time.
in config:
[resources]
api=1
in code for Task:
resources = {"api": 1}
I have a Google App Engine HTTP resource that takes 20 seconds to respond. The resource does a calculation requiring very little bandwidth and no storage access. Billing is not enabled. My desktop application spawns 100 threads to POST 500 times (each thread will, on average, POST 5 times). I believe that 500 POSTs use up just a little more than the freebie time for non-billing accounts, which is 6.5 CPU hours per 24-hour period. I might be about 10 POSTs over the limit, because towards the end about 10 of the 500 will fail even if I allow each request to retry twice.
In any event, the fact that I'm a little over the limit probably does not affect the problem which prompted my question. My question is: the dashboard measurement "CPU seconds used per second" is about 17. I would like this to be 100, because after all, I have 100 threads.
I'm not really good with Firebug or other monitoring tools, so I have not proven that there is a peak of 100 outstanding requests on the wire side of the Python standard library web methods, but I do print "hey" to the desktop console when there are 100 outstanding threads. It says "hey" fairly early, so I think the number of CPU seconds per second should be a lot closer to 100 than 17. Is my problem on the desktop, or is GAE throttling me, and how can I get 100 CPU seconds per second? How can I get somebody at Google to help with this question? I think their "support" link just goes to "community-style" support.
Search the groups for 1000ms. Your app will not be given as many resources if your user requests do not return in less than 1000 ms. You might also face additional issues with requests that take 20 seconds; I believe that if your requests sit in the pending queue, it counts against the run time, increasing the likelihood that you will get deadline/timeout errors.
You should look into breaking your code up and doing the processing in the task queue, or submitting more requests with less work per request.
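As a rough sketch of the first suggestion (assuming the Python deferred library; calculate_part and the chunk count are placeholders for your actual calculation split into smaller pieces):
from google.appengine.ext import deferred

def calculate_part(part_id):
    # Placeholder: one chunk of the 20-second calculation, small enough
    # to finish well under the request deadline.
    pass

def handle_request():
    # Instead of doing all the work inline, enqueue the chunks and return
    # immediately; the task queue runs them in the background.
    for part_id in range(10):
        deferred.defer(calculate_part, part_id)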
I'm confused about Task execution using queues. I've read the documentation and I thought I understood bucket_size and rate, but when I send 20 Tasks to a queue set to 5/h, size 5, all 20 Tasks execute one after the other as quickly as possible, finishing in less than 1 minute.
deferred.defer(spam.cookEggs,
               egg_keys,
               _queue="tortoise")
- name: tortoise
  rate: 5/h
  bucket_size: 5
What I want is that, whether I create 10 or 100 Tasks, only 5 of them run per hour, so 20 Tasks would take approximately 4 hours to complete. I want their execution spread out.
UPDATE
The problem was that I assumed Task execution rate rules were followed when running locally, but that is not the case. You cannot test execution rates locally. When I deployed to production, the rate and bucket size I had set behaved as I expected.
Execution rates are not honored by the app_devserver. This issue should not occur in production.
[Answer discovered by Nick Johnson and/or question author; posting here as community wiki so we have something that can get marked accepted]
You want to set bucket_size to 1, or else you'll have "bursts" of queued activity like you saw there.
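For example, the queue definition from the question with the burst size reduced:
- name: tortoise
  rate: 5/h
  bucket_size: 1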
From the documentation:
bucket_size
Limits the burstiness of the queue's processing, i.e. a higher bucket size allows bigger spikes in the queue's execution rate. For example, consider a queue with a rate of 5/s and a bucket size of 10. If that queue has been inactive for some time (allowing its "token bucket" to fill up), and 20 tasks are suddenly enqueued, it will be allowed to execute 10 tasks immediately. But in the following second, only 5 more tasks will be able to be executed because the token bucket has been depleted and is refilling at the specified rate of 5/s.
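To see why the burst happens, here is a minimal token-bucket illustration (the rate and bucket size match the documentation example; this is a sketch, not App Engine's actual implementation):
def simulate(rate_per_sec=5, bucket_size=10, tasks_enqueued=20):
    tokens = bucket_size  # a full bucket after the queue has been idle
    remaining = tasks_enqueued
    second = 0
    while remaining > 0:
        executed = min(remaining, tokens)
        tokens -= executed
        remaining -= executed
        print("second %d: executed %d tasks" % (second, executed))
        tokens = min(bucket_size, tokens + rate_per_sec)  # refill
        second += 1

simulate()
# second 0: executed 10 tasks   <- the burst from the full bucket
# second 1: executed 5 tasks
# second 2: executed 5 tasks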
Say I had over 10,000 feeds that I wanted to periodically fetch/parse.
If the period were, say, 1 h, that would be 24 x 10,000 = 240,000 fetches a day.
The current 10k limit of the labs Task Queue API would preclude one from setting up one task per fetch. How then would one do this?
Update: Re: fetching n urls per task - given the 30-second timeout per request, at some point this would hit a ceiling. Is there any way to parallelize it, so that each task initiates a bunch of async parallel fetches, each of which takes less than 30 sec to finish, but the lot together may take more than that?
Here's the asynchronous urlfetch API:
http://code.google.com/appengine/docs/python/urlfetch/asynchronousrequests.html
Set off a bunch of requests with a reasonable deadline (give yourself some headroom under your timeout, so that if one request times out you still have time to process the others), then wait on each one in turn and process them as they complete.
I haven't used this technique myself in GAE, so you're on your own finding any non-obvious gotchas. Sadly there doesn't seem to be a select() style call in the API to wait for the first of several requests to complete.
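Roughly, the pattern looks like this (a minimal sketch; the feed URLs, deadline, and parse_feed are illustrative placeholders):
from google.appengine.api import urlfetch

feed_urls = ["http://example.com/feed1.xml", "http://example.com/feed2.xml"]

# Kick off all fetches asynchronously, with a deadline well under the
# 30-second request limit.
rpcs = []
for url in feed_urls:
    rpc = urlfetch.create_rpc(deadline=10)
    urlfetch.make_fetch_call(rpc, url)
    rpcs.append((url, rpc))

# Wait on each one in turn and process as they complete.
for url, rpc in rpcs:
    try:
        result = rpc.get_result()
        parse_feed(url, result.content)  # placeholder for your parser
    except urlfetch.DownloadError:
        pass  # this fetch timed out; the others are unaffected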
2 fetches per task? 3?
Group up the fetches, so instead of queuing 1 fetch you queue up, say, a work unit that does 10 fetches.
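A minimal sketch of that batching (assuming the deferred library; fetch_batch and the batch size of 10 are illustrative):
from google.appengine.ext import deferred

def fetch_batch(urls):
    # Placeholder: fetch and parse each feed in this batch, e.g. with the
    # asynchronous urlfetch pattern shown above.
    pass

def enqueue_all(feed_urls, batch_size=10):
    # One task per batch of 10 feeds instead of one task per feed,
    # cutting 10,000 tasks down to 1,000.
    for i in range(0, len(feed_urls), batch_size):
        deferred.defer(fetch_batch, feed_urls[i:i + batch_size])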