I want to measure, in milliseconds, how long this line took:
before=datetime.datetime.now()
response = urllib2.urlopen("https://www.google.com")
after=datetime.datetime.now()
It is meant as a kind of workaround for a server which doesn't ping back, so I have to measure the delay from the server's response.
I can get the string 0:00:00.034225 back if I subtract the two times, and I am able to grab the milliseconds as a substring, but I would like to get the milliseconds in some cleaner way (the whole difference in ms, including the time converted from seconds, if the server responds with a really big delay).
after - before is a datetime.timedelta object whose total_seconds method will give you what you are looking for. You can find additional information in the Python docs.
You will just have to multiply by 1000 to get milliseconds. Don't worry, although the method is called total_seconds, it includes milliseconds as decimal places. Sample output:
>>> d = t1 - t0
>>> d.total_seconds()
2.429001
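Putting it together with the request from the question, a minimal sketch (assuming Python 2 with urllib2, as in the original code) might look like this:

import datetime
import urllib2

before = datetime.datetime.now()
response = urllib2.urlopen("https://www.google.com")
after = datetime.datetime.now()

# total_seconds() returns a float, so multiplying by 1000 gives milliseconds
elapsed_ms = (after - before).total_seconds() * 1000
print("%.0f ms" % elapsed_ms)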
This won't give you a timeout though, only a measurement of the duration.
urlopen allows you to pass a timeout parameter, and will automatically abort after that much time has elapsed. From the docs:
urllib2.urlopen(url[, data][, timeout])
The optional timeout parameter specifies a timeout in seconds for blocking operations like the connection attempt (if not specified, the global default timeout setting will be used). This actually only works for HTTP, HTTPS and FTP connections.
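For example, a minimal sketch (the 2-second deadline is just an illustrative value):

import urllib2

try:
    # Abort the request if the server does not respond within 2 seconds
    response = urllib2.urlopen("https://www.google.com", timeout=2)
except urllib2.URLError as e:
    print("Request timed out or failed: %s" % e)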
Python actually has a mechanism for timing small pieces of code -- timeit.Timer -- but that's for performance profiling and testing, not for implementing your own timeouts.
I'm a novice programmer learning Python 3.
Recently I've been trying to build a cryptocurrency trading system using Binance's API.
Here's the API document.
The logic and explanation about the timestamp in the document are as follows:
Timestamp, to be sent which should be the millisecond timestamp of when the request was created and sent.
if (timestamp < serverTime && (serverTime - timestamp) <= recvWindow) {
    // process request
} else {
    // reject request
}
According to this logic, the time at which I send the request should be less than the time on the server. The problem is that my requests do not pass this check.
When I compare time.time() with the server time using this code,
import requests
import simplejson as json
import time
base_url = "https://api.binance.com"
servertime_endpoint="/api/v1/time"
url = base_url + servertime_endpoint
t = time.time()*1000
r = requests.get(url)
result = json.loads(r.content)
print(int(t)-result["serverTime"])
time.time() is bigger than the server time, so the last line prints a positive value. What should I do?
This is most likely because the operating system you are running on uses a clock with a lower resolution than the one the server is running on. When running on Linux or Mac OS, Python uses a system call for time.time() that returns time down to microsecond resolution (or better). When running on a Windows machine, it only returns time down to millisecond resolution.
You can check the resolution of the time.time() function by programming a busy loop and waiting until the time changes: use the code in this incredibly useful answer to see what your resolution is.
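A minimal version of that busy-loop check might look like this:

import time

def clock_resolution():
    # Spin until time.time() reports a new value; the gap between two
    # consecutive distinct readings approximates the clock's resolution.
    t0 = time.time()
    while True:
        t1 = time.time()
        if t1 != t0:
            return t1 - t0

print(clock_resolution())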
If you are running on an OS with a resolution of ~0.001 second (1 millisecond) while the server is reporting times at a resolution of ~0.000001 second (1 microsecond), then even if your clocks were exactly in sync and there is zero network latency, you would still expect your time to be ahead of the server time on 50% of the calls simply due to quantization noise. For instance, if the server reports a time of 12345.678501 (microsecond resolution), your machine would report a time of 12345.679 (millisecond resolution) and appear to be 499 microseconds ahead.
Some quick solutions are to:
check if the server time rounds to your machine time and call that acceptable even if it appears your time is ahead of the server time;
subtract 500 microseconds from your time to guarantee that quantization noise can't put you ahead of the server;
increase the timing threshold by 500 microseconds and check that the absolute value of the difference between your time and the server time is within the bounds (see the sketch after this list);
run your code on an operating system with a higher resolution system clock.
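A rough sketch of the second and third options combined (the 500-microsecond buffer comes from the reasoning above; the 5000 ms window is only an assumed tolerance for illustration):

import time
import requests

BUFFER_MS = 0.5          # 500 microseconds, expressed in milliseconds
RECV_WINDOW_MS = 5000    # assumed tolerance, not an official Binance value

server_time = requests.get("https://api.binance.com/api/v1/time").json()["serverTime"]
local_ms = time.time() * 1000 - BUFFER_MS   # nudge the local timestamp back

# Accept the timestamp only if the absolute difference stays within the window
if abs(local_ms - server_time) <= RECV_WINDOW_MS:
    print("timestamp ok")
else:
    print("clocks too far apart")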
The Amazon API limit is apparently 1 req per second or 3600 per hour. So I implemented it like so:
while True:
    #sql stuff
    time.sleep(1)
    result = api.item_lookup(row[0], ResponseGroup='Images,ItemAttributes,Offers,OfferSummary', IdType='EAN', SearchIndex='All')
    #sql stuff
Error:
amazonproduct.errors.TooManyRequests: RequestThrottled: AWS Access Key ID: ACCESS_KEY_REDACTED. You are submitting requests too quickly. Please retry your requests at a slower rate.
Any ideas why?
This code looks correct, and it looks like the 1 request/second limit is still in effect:
http://docs.aws.amazon.com/AWSECommerceService/latest/DG/TroubleshootingApplications.html#efficiency-guidelines
You want to make sure that no other process is using the same associate account. Depending on where and how you run the code, there may be an old version of the VM, or another instance of your application running, or maybe there is a version on the cloud and another one on your laptop, or, if you are using a threaded web server, there may be multiple threads all running the same code.
If you still hit the query limit, you just want to retry, possibly with a TCP-like "additive increase/multiplicative decrease" back-off. You start by setting extra_delay = 0. When a request fails, you set extra_delay += 1 and sleep(1 + extra_delay), then retry. When it finally succeeds, set extra_delay = extra_delay * 0.9.
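A minimal sketch of that back-off, reusing the api object, row loop, and item_lookup call from the question (rows and api are assumed to come from the surrounding sql code; the exception class is taken from the traceback above):

import time
from amazonproduct.errors import TooManyRequests

extra_delay = 0
for row in rows:                          # whatever drives the loop in the original code
    while True:
        try:
            result = api.item_lookup(row[0], ResponseGroup='Images,ItemAttributes,Offers,OfferSummary',
                                     IdType='EAN', SearchIndex='All')
            extra_delay *= 0.9            # multiplicative decrease on success
            break
        except TooManyRequests:
            extra_delay += 1              # additive increase on failure
            time.sleep(1 + extra_delay)
    time.sleep(1)                         # base 1 request/second pacing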
Computer time is funny
This post is correct in saying "it varies in a non-deterministic manner" (https://stackoverflow.com/a/1133888/5044893). Depending on a whole host of factors, the time measured by a processor can be quite unreliable.
This is compounded by the fact that Amazon's API has a different clock than your program does. They are certainly not in sync, and there's likely some overlap between their "1 second" measurement and your program's. It's likely that Amazon tries to average out this inconsistency, and they probably also allow a small bit of error, maybe +/- 5%. Even so, the discrepancy between your clock and theirs is probably what triggers the RequestThrottled error.
Give yourself some buffer
Here are some thoughts to consider.
Do you really need to hit the Amazon API every single second? Would your program work with a 5-second interval? Even a 2-second interval halves your request rate, making a lockout far less likely. Also, Amazon may be charging you for every service call, so spacing them out could save you money.
This is really a question of "optimization" now. If you use a constant variable to control your API call rate (say, SLEEP = 2), then you can adjust that rate easily. Fiddle with it, increase and decrease it, and see how your program performs.
Push, not pull
Sometimes, hitting an API every second means that you're polling for new data. Polling is notoriously wasteful, which is why Amazon API has a rate-limit.
Instead, could you switch to a queue-based approach? Amazon SQS can fire off events to your programs. This is especially easy if you host them with Amazon Lambda.
I have an HTTP request in my code that takes ~5-10 s to run. Through searching this site, I've found the code to increase the limit before timeout:
from google.appengine.api import urlfetch
urlfetch.set_default_fetch_deadline(60)
My question: What is that number '60'? Seconds or tenths of a second? Most responses seem to imply it's seconds, but that can't be right. When I use 60, I get a time out in less than 10 s while testing on localhost. I have to set the number to at least 100 to avoid the issue - which I worry will invoke the ire of the Google gods.
It's seconds, and you can also pass it directly in the fetch function. Have you tried fetching another website? Are you sure it's a timeout and not another error?
https://developers.google.com/appengine/docs/python/urlfetch/fetchfunction
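A minimal sketch of passing the deadline per call (the URL is a placeholder; deadline is the per-request counterpart of set_default_fetch_deadline, in seconds):

from google.appengine.api import urlfetch

# deadline is in seconds; 60 allows up to a minute before the fetch aborts
result = urlfetch.fetch("https://example.com/slow-endpoint", deadline=60)
print(result.status_code)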
I am using nginx, web.py, fastcgi, and redis as my stack.
Upon a POST request I have 120 ms to return a response, so I need to always measure the elapsed time, and if I am about to approach the threshold I need to abort and return False. I don't get punished for returning True or False, only if I exceed the 120 ms threshold, which raises an exception. I expect 10-50K qps.
I could simply use if conditions, but I am concerned about a long-running operation where I would have to wait for it to finish before finding out that it took too long, e.g. a Redis call where I am using pipelining. For example:
start = time.time()
value = r.get(key)                  # the Redis call being measured
elapsed = time.time() - start
if elapsed > 0.115:                 # 115 ms; time.time() measures seconds
    return False
If r.get() takes too long, e.g. 130 ms, then I get punished.
What is the Python best practice for monitoring the time and sending an abort signal, without the monitoring itself taking up too many resources? Do I fire up a thread?
Do I use a timeout? If so, how, in the millisecond range?
Thanks
This is a good use for a decorator in Python. Somebody has already written a @timeout decorator that you can use this way:
@timeout(timeout=2)
def return_later():
    time.sleep(3)
    return 'later'
Since the sleep time > timeout, instead of returning 'later' it will generate an exception:
Traceback (most recent call last):
...
TimeoutException
However, because the implementation uses signal.alarm(), which only takes an int, you can't specify a fraction of a second, only whole seconds. Since it looks like you want milliseconds, you could adapt this decorator to use signal.setitimer(). Or, if you're intrepid enough, an even better solution is to submit a patch, the way someone did to implement setitimer functionality in the signal module, to also support ualarm(), which provides microsecond resolution and is simpler to use than setitimer().
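A rough sketch of what a setitimer-based variant might look like (signal-based, so Unix-only and main-thread-only; the timeout_ms name and the redis_call example are illustrative, not part of the original decorator):

import signal
import functools
import time

class TimeoutException(Exception):
    pass

def timeout_ms(milliseconds):
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            def _handler(signum, frame):
                raise TimeoutException()
            old_handler = signal.signal(signal.SIGALRM, _handler)
            # setitimer accepts a float number of seconds, unlike alarm()
            signal.setitimer(signal.ITIMER_REAL, milliseconds / 1000.0)
            try:
                return func(*args, **kwargs)
            finally:
                signal.setitimer(signal.ITIMER_REAL, 0)     # cancel the timer
                signal.signal(signal.SIGALRM, old_handler)
        return wrapper
    return decorator

@timeout_ms(115)
def redis_call():
    time.sleep(0.2)      # stand-in for a slow r.get()
    return True

Calling redis_call() would then raise TimeoutException after roughly 115 ms instead of returning.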
cache.set(key, value, 9999999)
But this is not infinite time...
def _get_memcache_timeout(self, timeout):
    """
    Memcached deals with long (> 30 days) timeouts in a special
    way. Call this function to obtain a safe value for your timeout.
    """
    timeout = timeout or self.default_timeout
    if timeout > 2592000: # 60*60*24*30, 30 days
        # See http://code.google.com/p/memcached/wiki/FAQ
        # "You can set expire times up to 30 days in the future. After that
        # memcached interprets it as a date, and will expire the item after
        # said date. This is a simple (but obscure) mechanic."
        #
        # This means that we have to switch to absolute timestamps.
        timeout += int(time.time())
    return timeout
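To see what this does with the 9999999 value from the question (a quick illustration, not Django code):

import time

timeout = 9999999                   # ~115 days, from the question
thirty_days = 60 * 60 * 24 * 30     # 2592000

if timeout > thirty_days:
    # Switch to an absolute expiry timestamp so memcached does not interpret
    # the raw value as a date in 1970 and expire the item immediately.
    timeout += int(time.time())

print(timeout)   # an absolute unix timestamp roughly 115 days in the future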
And from the FAQ:
What are the limits on setting expire time? (why is there a 30 day limit?)
You can set expire times up to 30 days in the future. After that memcached interprets it as a date, and will expire the item after said date. This is a simple (but obscure) mechanic.
From the docs:
If the value of this settings is None, cache entries will not expire.
Notably, this is different from how the expiration time works in the Memcache standard protocol:
Expiration times can be set from 0, meaning "never expire", to 30 days. Any time higher than 30 days is interpreted as a unix timestamp date.
So, to set a key to never expire, set the timeout to None if you're using Django's cache abstraction, or 0 if you're using Memcache more directly.
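For example (a minimal sketch; the raw-client lines assume the python-memcached client's set(key, value, time) signature):

# Django cache API: timeout=None means "never expire" (Django 1.6+)
from django.core.cache import cache
cache.set("my_key", "my_value", timeout=None)

# Raw memcached client: an expiry of 0 means "never expire"
import memcache
mc = memcache.Client(["127.0.0.1:11211"])
mc.set("my_key", "my_value", 0)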
Support for non-expiring cache has been added in Django 1.6 by setting timeout=None
Another simple technique is to write the generated HTML out to a file on the disk, and to use that as your cache. It's not hard to implement, and it works quite well as a file-based cache that NEVER expires, is quite transparent, etc.
It's not the django way, but it works well.
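A rough sketch of that file-based approach (the directory and function names are placeholders; the cache directory is assumed to exist):

import os

CACHE_DIR = "/var/tmp/html_cache"    # any writable directory

def get_or_render(key, render):
    # Return cached HTML for key, rendering and storing it on a miss.
    path = os.path.join(CACHE_DIR, key + ".html")
    if os.path.exists(path):
        with open(path) as f:
            return f.read()          # cache hit: serve the stored HTML
    html = render()                  # cache miss: generate the page
    with open(path, "w") as f:
        f.write(html)
    return html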