Delaying 1 second per request, not enough for 3600 per hour - python

The Amazon API limit is apparently 1 req per second or 3600 per hour. So I implemented it like so:
while True:
    # sql stuff
    time.sleep(1)
    result = api.item_lookup(row[0], ResponseGroup='Images,ItemAttributes,Offers,OfferSummary', IdType='EAN', SearchIndex='All')
    # sql stuff
Error:
amazonproduct.errors.TooManyRequests: RequestThrottled: AWS Access Key ID: ACCESS_KEY_REDACTED. You are submitting requests too quickly. Please retry your requests at a slower rate.
Any ideas why?

This code looks correct, and it looks like the 1 request/second limit is still in effect:
http://docs.aws.amazon.com/AWSECommerceService/latest/DG/TroubleshootingApplications.html#efficiency-guidelines
First, make sure that no other process is using the same associate account. Depending on where and how you run the code, there may be an old version of the VM still running, another instance of your application, a copy in the cloud plus one on your laptop, or, if you are using a threaded web server, multiple threads all running the same code.
If you still hit the query limit, just retry, possibly with a TCP-like "additive increase / multiplicative decrease" back-off. Start by setting extra_delay = 0. When a request fails, set extra_delay += 1 and sleep(1 + extra_delay), then retry. When it finally succeeds, set extra_delay = extra_delay * 0.9.
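A rough sketch of that back-off, reusing the item_lookup call from the question and the TooManyRequests exception shown in the traceback (everything else here is illustrative, not a definitive implementation):

import time
from amazonproduct.errors import TooManyRequests

extra_delay = 0.0

def throttled_lookup(api, ean):
    """Retry item_lookup with additive-increase/multiplicative-decrease back-off."""
    global extra_delay
    while True:
        time.sleep(1 + extra_delay)
        try:
            result = api.item_lookup(ean,
                                     ResponseGroup='Images,ItemAttributes,Offers,OfferSummary',
                                     IdType='EAN', SearchIndex='All')
        except TooManyRequests:
            extra_delay += 1          # additive increase on failure
            continue
        extra_delay *= 0.9            # multiplicative decrease on success
        return result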

Computer time is funny
This post is correct in saying "it varies in a non-deterministic manner" (https://stackoverflow.com/a/1133888/5044893). Depending on a whole host of factors, the time measured by a processor can be quite unreliable.
This is compounded by the fact that Amazon's API runs on a different clock than your program does. They are certainly not in sync, and there is likely some drift between their "1 second" measurement and your program's. Amazon probably tries to average out this inconsistency, and they likely allow a small margin of error, maybe +/- 5%. Even so, the discrepancy between your clock and theirs is probably what triggers the RequestThrottled error.
Give yourself some buffer
Here are some thoughts to consider.
Do you really need to hit the Amazon API every single second? Would your program work with a 5-second interval? Even a 2-second interval halves your request rate and gives you much more headroom before a lockout. Also, Amazon may be charging you for every service call, so spacing them out could save you money.
This is really a question of optimization now. If you use a constant to control your API call rate (say, SLEEP = 2), then you can adjust that rate easily. Fiddle with it, increase and decrease it, and see how your program performs.
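A minimal sketch of that idea, applied to the loop from the question (SLEEP is just an illustrative name for the tunable constant):

import time

SLEEP = 2  # seconds between Amazon API calls; tune this one constant as needed

while True:
    # sql stuff
    time.sleep(SLEEP)
    result = api.item_lookup(row[0],
                             ResponseGroup='Images,ItemAttributes,Offers,OfferSummary',
                             IdType='EAN', SearchIndex='All')
    # sql stuff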
Push, not pull
Sometimes, hitting an API every second means that you're polling for new data. Polling is notoriously wasteful, which is why the Amazon API has a rate limit.
Instead, could you switch to a queue-based approach? Amazon SQS can fire off events to your programs. This is especially easy if you host them with Amazon Lambda.
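For illustration only, a hedged sketch of what the consuming side could look like if the work were pushed through SQS into a Lambda function (the handler name and message fields below are assumptions, not anything from the question):

import json

def handler(event, context):
    # A Lambda invoked by an SQS trigger receives a batch of records
    for record in event['Records']:
        payload = json.loads(record['body'])   # hypothetical message body
        ean = payload['ean']                   # hypothetical field
        # ... do the item lookup / SQL work for this one item ...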

Related

dynamodb simple query execution time

I have a Python AWS Lambda function that queries an AWS DynamoDB table.
As my API now takes about 1 second to respond to a very simple query/table setup, I wanted to understand where I can optimize.
The table has only 3 items (users) at the moment and the following structure:
user_id (Primary Key, String)
details: [{
    "_nested_atrb1_str": "abc",
    "_nested_atrb2_str": "def",
    "_nested_map": [nested_item1, nested_item2]
}, {...}]
The query is super simple:
response = table.query(
    KeyConditionExpression=Key('userid').eq("xyz")
)
The query takes 0.8-0.9 seconds.
Is this a normal query time for a table with only 3 items, where each user only has at most 5 attributes (including nested ones)?
If yes, can I expect similar times if the structure stays the same but the number of items (users) increases a hundred-fold?
There are a few things to investigate. First off, is your timing of 0.8-0.9 seconds based on timing the query directly, by wrapping the query in a time- or timeit-style timer? If the query is truly taking that long, then there is definitely something not quite right with the interaction between Lambda and DynamoDB.
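For example, a minimal way to time just the query call (a sketch assuming the table object and key from the question):

import time
from boto3.dynamodb.conditions import Key

start = time.time()
response = table.query(
    KeyConditionExpression=Key('userid').eq("xyz")
)
elapsed = time.time() - start
print("query took %.3f s" % elapsed)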
If the time you're seeing is actually from the invocation of your Lambda (I assume this goes through API Gateway as a REST API, since you mentioned "api"), then it could be due to many factors. Can you profile the API call? I would check in Postman or browser dev tools whether you can break down the time spent on DNS lookup, SSL setup, etc. Additionally, CloudWatch will give you metrics for the call times of your Lambda once the request has reached Lambda. You could also enable X-Ray, which gives you more detail about the execution of your Lambda. If your Lambda is running in a VPC, you could also be encountering cold starts that lead to the latency you're seeing.
X-Ray:
https://aws.amazon.com/xray/
Cold Starts: just Google "AWS Lambda cold starts" and you'll find all kinds of info
For anyone with similar experiences, I received the AWS developer support response below, with some useful references. It didn't solve my problem, but I now understand that this is mainly related to the low (test) volume and Lambda startup time.
1) Is this a normal query time for a table with only 3 items where each user only has max 5 attributes(incl nested)?
The time is slow but could be due to a number of factors based on your setup. Since you are using Lambda you need to keep in mind that every time you trigger your lambda function it sets up your environment and then executes the code. An AWS Lambda function runs within a container—an execution environment that is isolated from other functions. When you run a function for the first time, AWS Lambda creates a new container and begins executing the function's code. A Lambda function has a handler that is executed once per invocation. After the function executes, AWS Lambda may opt to reuse the container for subsequent invocations of the function. In this case, your function handler might be able to reuse the resources that you defined in your initialization code. (Note that you cannot control how long AWS Lambda will retain the container, or whether the container will be reused at all.) Your table is really small, I had a look at it. [1]
2) Can I expect similar times if the structure stays the same but the number of items (users) increases hundred-fold?
If the code takes longer to execute and you have more data in DynamoDB, it could eventually slow down, again depending on your setup.
Here are some of my recommendations for optimizing your setup.
1) Have Lambda and DynamoDB within the same VPC. You can query your DynamoDB via a VPC endpoint. This will cut out any network latencies. [2][3]
2) Increase memory on lambda for faster startup and execution times.
3) As your application scales, make sure to enable auto-scaling on your DynamoDB table and also increase your RCU and WCU to improve DynamoDB's performance when handling requests. [4]
Additionally, have a look at DynamoDB best practices. [5]
Please feel free to contact me with any additional questions and for further guidance. Thank you and have a great day.
References
[1] https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Streams.Lambda.BestPracticesWithDynamoDB.html
[2] https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/vpc-endpoints-dynamodb.html
[3] https://docs.aws.amazon.com/lambda/latest/dg/best-practices.html
[4] https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/AutoScaling.html
[5] https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/best-practices.html
Profiling my small lambda code (outside of lambda) I got these results that you may find interesting.
Times in milliseconds
# Initially
3 calls to DB,
1350 ms 1st call (read)
1074 ms 2nd call (write)
1051 ms 3rd call (read)
# After doing this outside the DB calls and providing it to each one
dynamodb = boto3.resource('dynamodb',region_name=REGION_NAME)
12 ms executing the line above
1324 ms 1st call (read)
285 ms 2nd call (write)
270 ms 3rd call (read)
# Seeing that reuse was producing savings, I did the same with
tableusers = dynamodb.Table(TABLE_USERS)
12 ms   create dynamodb handler
3 ms    create table handler
1078 ms read, reusing dynamodb and table
280 ms  write, reusing dynamodb and table
270 ms  read, reusing dynamodb (not table)
So initially it took ~3.5 seconds, and now ~1.6 seconds, just for adding 2 lines of code.
I got these results using %lprun on jupyter / Colab
# The -u 0.001 sets the time unit at 1ms (default is 1 microsecond)
%lprun -u 0.001 -f lambdaquick lambdaquick()
If you only do 1 DB request and nothing else with the DB, try to put the 2 DB handlers outside the lambda handler as amittn recommends.
Disclaimer: I just learned all this, including deep profiling. So all this may be nonsense.
Note: "We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. -- Donald Knuth" from https://jakevdp.github.io/PythonDataScienceHandbook/01.07-timing-and-profiling.html
https://docs.aws.amazon.com/lambda/latest/dg/best-practices.html
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/best-practices.html
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GettingStarted.Python.03.html
If you are seeing this issue only on the first invocation, then it's definitely due to Lambda cold starts. Otherwise, subsequent requests should show an improvement, which might help you diagnose the actual pain point. CloudWatch logs will also help in tracking the requests.
I am assuming that you are reusing your connections, as that cuts several milliseconds off your execution time. If not, the following will help you achieve that.
Any variable outside the lambda_handler function will be frozen between Lambda invocations and possibly reused. The documentation says to "not assume that AWS Lambda always reuses the container because AWS Lambda may choose not to reuse the container", but in practice, depending on the volume of executions, the container is almost always reused.
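A minimal sketch of that pattern, with the resource and table handles created at module level so warm containers reuse them (the region, table name, and event field below are assumptions for illustration):

import boto3
from boto3.dynamodb.conditions import Key

REGION_NAME = 'eu-west-1'   # example region, adjust to yours
TABLE_USERS = 'users'       # hypothetical table name

# Created once per container, outside the handler, and reused across warm invocations
dynamodb = boto3.resource('dynamodb', region_name=REGION_NAME)
tableusers = dynamodb.Table(TABLE_USERS)

def lambda_handler(event, context):
    # only the query itself runs on every invocation
    response = tableusers.query(
        KeyConditionExpression=Key('userid').eq(event['userid'])  # hypothetical event field
    )
    return response.get('Items', [])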

Is it possible to force a 2 second looping callback in Python?

I'm trying to get a looping call to run every 2 seconds. Sometimes I get the desired functionality, but other times I have to wait up to ~30 seconds, which is unacceptable for my application's purposes.
I reviewed this SO post and found that LoopingCall might not be reliable for this by default. Is there a way to fix this?
My usage/reason for needing a consistent ~2 seconds:
The function I am calling scans an image (using CV2) for a dollar value, and if it finds that amount it sends a websocket message to my point-of-sale client. I can't have customers waiting 30 seconds for the POS terminal to ask them to pay.
My source code is very long and not well commented as of yet, so here is a short example of what I'm doing:
from twisted.internet import reactor
from twisted.internet.task import LoopingCall

# scan the image for sales every 2 seconds
def scanForSale():
    print("Now Scanning for sale requests")

# retrieve a new image every 2 seconds
def getImagePreview():
    print("Loading Image From Capture Card")

lc = LoopingCall(scanForSale)
lc.start(2)
lc2 = LoopingCall(getImagePreview)
lc2.start(2)
reactor.run()
I'm using a Raspberry Pi 3 for this application, which is why I suspect it hangs for so long. Can I utilize multithreading to fix this issue?
Raspberry Pi is not a real time computing platform. Python is not a real time computing language. Twisted is not a real time computing library.
Any one of these by itself is enough to eliminate the possibility of a guarantee that you can run anything once every two seconds. You can probably get close but just how close depends on many things.
The program you included in your question doesn't actually do much. If this program can't reliably print each of the two messages once every two seconds then presumably you've overloaded your Raspberry Pi - a Linux-based system with multitasking capabilities. You need to scale back your usage of its resources until there are enough available to satisfy the needs of this (or whatever) program.
It's not clear whether multithreading will help, though I doubt it. It's not clear because you've only included an over-simplified version of your program, and I would have to make a lot of wild guesses about what your real program does in order to suggest how to improve it.
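If the real scan is a long, blocking CV2 call, one thing worth trying (purely a hedged sketch, not something the answer above endorses) is pushing the blocking work onto Twisted's thread pool so the reactor, and therefore LoopingCall's 2-second schedule, is never blocked:

from twisted.internet import reactor
from twisted.internet.task import LoopingCall
from twisted.internet.threads import deferToThread

def blockingScan():
    # placeholder for the real CV2 scan, which may block for a long time
    return None

def onScanDone(result):
    print("scan finished: %r" % (result,))

def scanForSale():
    # run the blocking work in a worker thread; the reactor stays responsive
    d = deferToThread(blockingScan)
    d.addCallback(onScanDone)

lc = LoopingCall(scanForSale)
lc.start(2)
reactor.run()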

Interact with long running python process

I have a long-running Python process running headless on a Raspberry Pi (controlling a garden), like so:
from time import sleep

def run_garden():
    while 1:
        # do work
        sleep(60)

if __name__ == "__main__":
    run_garden()
The 60-second sleep period is plenty of time for any changes happening in my garden (humidity, air temp, turning the pump on, turning the fan off, etc.), BUT what if I want to manually override these things?
Currently, in my "do work" loop, I first call out to another server where I keep config variables, and I can update those config variables via a web console, but it lacks any sort of real-time feel because it relies on the 60-second loop (e.g. you might update the web console and then wait 45 seconds for the desired change to take effect).
The Raspberry Pi running run_garden() is dedicated to the garden and is basically the only thing taking up resources, so I know I have room to do something, I just don't know what.
Once the loop picks up that a config var has been updated, it could do exponential backoff to keep checking for further interaction rather than waiting 60 seconds, but that doesn't feel like a whole lot better.
Is there a better way to basically jump into this long-running process?
Listen on a socket in your main loop. Use a timeout (e.g. 60 seconds, the time until the next garden update should be performed) on your socket read calls, so you get back to your normal functionality at least once a minute when no commands are coming in.
If garden-tending updates need to happen no more often than once a minute, you also need to check the time since the last update, since read calls will complete significantly faster when commands are coming in.
Python's select module sounds like it might be helpful.
If you've ever used the unix analog (for example in socket programming maybe?), then it'll be familiar.
If not, here is the select section of a C sockets reference I often recommend. And here is what looks like a nice writeup of the module.
Warning: the first reference is specifically about C, not Python, but the concept of the select system call is the same, so the discussion might be helpful.
Basically, it allows you to tell it what events you're interested in (for example, socket data arrival, keyboard event), and it'll block either forever, or until a timeout you specify elapses.
If you're using sockets, then adding the socket and stdin to the list of events you're interested in is easy. If you're just looking for a way to "conditionally sleep" for 60 seconds unless/until a keypress is detected, this would work just as well.
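A hedged sketch of that "conditionally sleep until a command arrives" loop, using select with a timeout (the control port and message format are assumptions):

import select
import socket
import time

def run_garden():
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind(('', 9999))               # hypothetical control port
    server.listen(1)

    last_update = 0
    while True:
        # wait for a command, but no longer than the time left until the next garden pass
        timeout = max(0, 60 - (time.time() - last_update))
        readable, _, _ = select.select([server], [], [], timeout)
        if readable:
            conn, _ = server.accept()
            command = conn.recv(1024)     # e.g. "pump on"
            conn.close()
            # handle the manual override immediately here
        if time.time() - last_update >= 60:
            # regular garden work: read sensors, toggle pump/fan, etc.
            last_update = time.time()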
EDIT:
Another way to solve this would be to have your raspberry-pi "register" with the server running the web console. This could involve a little bit extra work, but it would give you the realtime effect you're looking for.
Basically, the raspberry-pi "registers" itself, by alerting the server about itself, and the server stores the address of the device. If using TCP, you could keep a connection open (which might be important if you have firewalls to deal with). If using UDP you could bind the port on the device before registering, allowing the server to respond to the source address of the "announcement".
Once announced, when config. options change on the server, one of two things usually happen:
A) You send a tiny "ping" (in the general sense, not the ICMP host detection protocol) to the device alerting it that config options have changed. At this point the host would immediately request the full config. set, acquiring the update with it.
B) You send the updated config. option (or maybe the entire config. set) back to the device. This decreases the number of messages between the device and server, but would probably take more work as it seems like more a deviation from your current setup.
Why not use an event-based loop instead of sleeping for a fixed amount of time?
That way your loop will only run when a change is detected, and it will always run when a change is detected (which is the point of your question?).
You can do such a thing by using:
python event objects
Just wait for one or all of your event objects to be triggered and run the loop. You can also wait for X events to be done, etc., depending on whether you expect one variable to be updated a lot.
Or even a system like:
broadcasting events
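A minimal sketch of the event-object idea with threading.Event (whatever sets the event, e.g. a small socket or web-console listener thread, is an assumption):

import threading

config_changed = threading.Event()

def run_garden():
    while True:
        # do the regular garden work here (read sensors, toggle pump/fan, ...)

        # sleep up to 60 seconds, but wake immediately if the event is set
        if config_changed.wait(timeout=60):
            config_changed.clear()
            # re-read config variables right away and apply the override

# elsewhere, e.g. in a thread handling web-console updates:
# config_changed.set()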

GAE Backend fails to respond to start request

This is probably a truly basic thing that I'm simply having an odd time figuring out in a Python 2.5 app.
I have a process that will take roughly an hour to complete, so I made a backend. To that end, I have a backends.yaml with something like the following:
backends:
- name: mybackend
  options: dynamic
  start: /path/to/script.py
(The script is just raw computation. There's no notion of an active web session anywhere.)
On toy data, this works just fine.
This used to be public, so I would navigate to the page, the script would start, and it would time out after about a minute (HTTP timeout plus the 30s shutdown grace period, I assume). I figured this was a browser issue, so I repeated the same thing with a cron job. No dice. I then switched to using a push queue and adding a targeted task, since on paper it looks like that would wait for 10 minutes. Same thing.
All 3 time out after about a minute, which means I'm not decoupling the request from the backend like I believe I am.
I'm assuming that I need to write a proper handler for the backend to do the work, but I don't exactly know how to write the handler/webapp2 route. Do I handle _ah/start/ or make a new endpoint for the backend? How do I handle the subdomain? It still seems like the wrong thing to do (I'm sticking a long process directly into a request of sorts), but I'm at a loss otherwise.
So the root cause ended up being the following in the script itself:
models = MyModel.all()
for model in models:
    pass  # Magic happens
I was basically taking for granted that the query would automatically batch my Query.all() over many entities, but it was dying at around the 1000th entity. I originally wrote that it was computational only because I completely ignored the fact that the reads can fail.
The actual solution for solving the problem we wanted ended up being "Use the map-reduce library", since we were trying to look at each model for analysis.
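For reference, a hedged sketch of what manual batching with query cursors can look like in the legacy google.appengine.ext.db API (the batch size and per-entity processing are assumptions; the map-reduce library mentioned above is still the more robust option):

from google.appengine.ext import db

def process_all(batch_size=500):
    query = MyModel.all()
    cursor = None
    while True:
        if cursor:
            query.with_cursor(cursor)     # resume where the last batch ended
        batch = query.fetch(batch_size)
        if not batch:
            break
        for model in batch:
            pass  # magic happens, one entity at a time
        cursor = query.cursor()           # remember where this batch ended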

How can I make this code run smoothly on Google App Engine?

I'm new to web apps, so I'm not used to worrying about CPU limits, but it looks like I am going to have a problem with this code. I read on Google's quotas page that I can use 6.5 CPU-hours per day and 15 CPU-minutes per minute.
Google Said:
CPU time is reported in "seconds," which is equivalent to the number of CPU cycles that can be performed by a 1.2 GHz Intel x86 processor in that amount of time. The actual number of CPU cycles spent varies greatly depending on conditions internal to App Engine, so this number is adjusted for reporting purposes using this processor as a reference measurement.
And
            Per Day         Max Rate
CPU Time    6.5 CPU-hours   15 CPU-minutes/minute
What I want to know:
Is this script going over the limit?
(If yes) How can I make it not go over the limit?
I use the urllib library; should I use Google's URL Fetch API? Why?
Absolutely any other helpful comment.
What it does:
It scrapes (crawls) Project Free TV. I will only run it completely once, then replace it with a shorter, faster script.
from urllib import urlopen
import re

alphaUrl = 'http://www.free-tv-video-online.me/movies/'
alphaPage = urlopen(alphaUrl).read()
patFinderAlpha = re.compile('<td width="97%" nowrap="true" class="mnlcategorylist"><a href="(.*)">')
findPatAlpha = re.findall(patFinderAlpha, alphaPage)
listIteratorAlpha = []
listIteratorAlpha[:] = range(len(findPatAlpha))
for ai in listIteratorAlpha:
    betaUrl = 'http://www.free-tv-video-online.me/movies/' + findPatAlpha[ai] + '/'
    betaPage = urlopen(betaUrl).read()
    patFinderBeta = re.compile('<td width="97%" class="mnlcategorylist"><a href="(.*)">')
    findPatBeta = re.findall(patFinderBeta, betaPage)
    listIteratorBeta = []
    listIteratorBeta[:] = range(len(findPatBeta))
    for bi in listIteratorBeta:
        gammaUrl = betaUrl + findPatBeta[bi]
        gammaPage = urlopen(gammaUrl).read()
        patFinderGamma = re.compile('<a href="(.*)" target="_blank" class="mnllinklist">')
        findPatGamma = re.findall(patFinderGamma, gammaPage)
        patFinderGamma2 = re.compile('<meta name="keywords"content="(.*)">')
        findPatGamma2 = re.findall(patFinderGamma2, gammaPage)
        listIteratorGamma = []
        listIteratorGamma[:] = range(len(findPatGamma))
        for gi in listIteratorGamma:
            deltaUrl = findPatGamma[gi]
            deltaPage = urlopen(deltaUrl).read()
            patFinderDelta = re.compile("<iframe id='hmovie' .* src='(.*)' .*></iframe>")
            findPatDelta = re.findall(patFinderDelta, deltaPage)
            PutData(findPatGamma2[gi], findPatAlpha[ai], findPatDelta)
If I forgot anything please let me know.
Update:
This is roughly how many times each level will run, in case that is helpful in answering the question.
        per cycle   total
Alpha:  1           1
Beta:   16          16
Gamma:  ~250        ~4000
Delta:  ~6          ~24000
I don't like to optimize until I need to. First, just try it. It might just work. If you go over quota, shrug, come back tomorrow.
To split jobs into smaller parts, look at the Task Queue API. Maybe you can divide the workload into two queues, one that scrapes pages and one that processes them. You can put limits on the queues to control how aggressively they are run.
P.S. On Regex for HTML: Do what works. The academics will call you out on semantic correctness, but if it works for you, don't let that stop you.
I use the urllib library, should i use Google's URL Fetch API? Why?
On App Engine production servers, urllib is implemented on top of the URLFetch API, so you are effectively already using it.
It's unlikely that this will go over the free limit, but it's impossible to say without seeing how big the list of URLs it needs to fetch is, and how big the resulting pages are. The only way to know for sure is to run it - and there's really no harm in doing that.
You're more likely to run into the limits on individual request execution - 30 seconds for frontend requests, 10 minutes for backend requests like cron jobs - than to run out of quota. To alleviate those issues, use the Task Queue API to split your job into many parts. As an additional benefit, the tasks can run in parallel! You might also want to look into asynchronous URLFetch, though it's probably not worth it if this is just a one-off script.
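As a hedged illustration of splitting the crawl with the Task Queue API (the handler URLs, queue name, and helper function below are made up for the example, and a custom queue would also need a queue.yaml entry):

from google.appengine.api import taskqueue
import webapp2

def find_category_urls():
    # hypothetical helper: would return the category links scraped from the alpha page
    return []

class StartCrawl(webapp2.RequestHandler):
    def get(self):
        # enqueue one task per category instead of crawling everything in a single request
        for category_url in find_category_urls():
            taskqueue.add(url='/tasks/scrape',
                          params={'url': category_url},
                          queue_name='scrape')

class ScrapeOne(webapp2.RequestHandler):
    def post(self):
        page_url = self.request.get('url')
        # fetch and parse just this one page here, then PutData(...) as before

app = webapp2.WSGIApplication([
    ('/tasks/start', StartCrawl),
    ('/tasks/scrape', ScrapeOne),
])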
