I'm working with an API whose documentation doesn't state the exact limits on the requests I can make. This causes my app to suddenly stop working because of long waiting periods and, eventually, timeouts.
Is there a way to find out what the API's limits are and build a workaround, such as "if the limit is 5 requests per minute, then wait a minute before sending the 6th request"?
The API I'm talking about here is the TD Ameritrade API, documentation:
https://developer.tdameritrade.com/home
I'm coding with Python.
Thanks to anybody who helps.
Edit: the problem was solved; the API allows 120 calls per minute.
Yes, there is a per-minute limit. It says so at the bottom of this page: https://developer.tdameritrade.com/content/authentication-faq
All non-order based requests by personal use non-commercial applications are throttled to 120 per minute. Exceeding this throttle limit will provide a response with a 429 error code to inform you that the throttle limit has been exceeded.
API calls, especially from personal accounts, are restricted in order to preserve processing power for customers who pay for the service, such as companies.
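Since the docs say exceeding the throttle returns a 429 error code, one defensive pattern is to check for that status and back off before retrying. Below is a minimal sketch using the requests library; the URL is a placeholder, and the fallback wait of 60 seconds (used when no Retry-After header is present) is an assumption, not something from the TD Ameritrade docs.

import time
import requests

def call_with_backoff(url, headers=None, max_retries=5):
    """Call the API, backing off and retrying whenever a 429 (throttled) response comes back."""
    for attempt in range(max_retries):
        response = requests.get(url, headers=headers)
        if response.status_code != 429:
            return response
        # Honour Retry-After if the server sends it; otherwise wait 60 s (assumption).
        wait = int(response.headers.get("Retry-After", 60))
        time.sleep(wait)
    raise RuntimeError("Still rate limited after %d retries" % max_retries)

# Hypothetical usage -- substitute a real TD Ameritrade endpoint and your auth headers:
# r = call_with_backoff("https://api.tdameritrade.com/v1/marketdata/quotes")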
After about 2 minutes of searching the documentation, I managed to find this line:
All private, non-commercial use apps are currently limited to 120 requests per minute on all APIs except for Accounts & Trading
Please read the docs carefully before posting here!
By the way, you can calculate that 120 calls / 60 seconds works out to 1 call every 0.5 seconds.
You can simply sleep for that long between calls, or delay the start of a new thread, if your app is designed that way.
Since you did not provide any code, I will show you a basic example using sleep.
import time

while True:  # main loop
    apicall()  # your API call goes here
    time.sleep(1)  # sleep 1 second after each call (more conservative than the 0.5 s minimum)
But I strongly suggest adding your code to the question, so people can provide you better solutions.
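If you would rather use the full 120-calls-per-minute budget instead of a fixed one-second sleep, a sliding-window limiter is one option. The sketch below is only an illustration under that assumption; apicall stands in for whatever request function you use.

import time
from collections import deque

CALLS_PER_MINUTE = 120
WINDOW = 60.0  # seconds

call_times = deque()  # timestamps of recent calls

def rate_limited_call(apicall, *args, **kwargs):
    """Block until a slot is free within the last 60 seconds, then make the call."""
    now = time.monotonic()
    # Drop timestamps that have fallen out of the window.
    while call_times and now - call_times[0] > WINDOW:
        call_times.popleft()
    if len(call_times) >= CALLS_PER_MINUTE:
        # Wait until the oldest call leaves the window, then discard it.
        time.sleep(max(0.0, WINDOW - (now - call_times[0])))
        call_times.popleft()
    call_times.append(time.monotonic())
    return apicall(*args, **kwargs)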
I use an API that has ~30 endpoints, and I have settings for how often I need to send a request to each endpoint. For some endpoints it's seconds and for some it's hours. I want to implement a Python app that will call each API endpoint (and execute some code) every N seconds, where N can be different for each endpoint. If one call is still in progress when the second one kicks in, the second one should be added to a queue (or something similar) and executed after the first one finishes.
What would be the correct way to implement this using Python?
I have some experience with RabbitMQ but I think that might be overkill for this problem.
You said "executed after the first one finishes", so it's a single-threaded program.
Just use def to create some functions and then execute them one by one.
For example
import time

def task1(n):
    print("Task1 start")
    time.sleep(n)
    print("Task1 end")

def task2(n):
    print("Task2 start")
    time.sleep(n)
    print("Task2 end")

task1(5)  # after 5 seconds, task1 ends and task2 executes
task2(3)  # task2 needs 3 seconds to execute
You could build your code in this way (a rough sketch follows the list below):
store somewhere the URL, method and parameters for each type of query. A dictionary would be nice: {"query1": {"url":"/a","method":"GET","parameters":None} , "query2": {"url":"/b", "method":"GET","parameters":"c"}} but you can do this any way you want, including a database if needed.
store somewhere a relationship between query type and interval. Again, you could do this with a case statement, or with a dict (maybe the same you previously used), or an interval column in a database.
Every N seconds, push the corresponding query entry to a queue (queue.put)
a worker loop built around an HTTP client library such as requests runs continuously: it removes an element from the queue, performs the HTTP request, and once it gets the result it moves on to the next element.
Of course if your code is going to be distributed across multiple nodes for scalability or high availability, you will need a distributed queue such as RabbitMQ, Ray or similar.
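Here is a minimal single-node sketch of that design. The endpoint URLs, intervals, and field names are made-up placeholders, and the worker simply prints the status code; adapt it to your real endpoints and processing.

import queue
import threading
import time
import requests

# Hypothetical query definitions with their per-query intervals (seconds).
QUERIES = {
    "query1": {"url": "https://api.example.com/a", "method": "GET", "params": None, "interval": 5},
    "query2": {"url": "https://api.example.com/b", "method": "GET", "params": {"c": 1}, "interval": 3600},
}

work_queue = queue.Queue()

def scheduler(name, spec):
    """Push one work item for this query every `interval` seconds."""
    while True:
        work_queue.put((name, spec))
        time.sleep(spec["interval"])

def worker():
    """Consume queued queries one at a time, so overlapping calls simply wait in the queue."""
    while True:
        name, spec = work_queue.get()
        response = requests.request(spec["method"], spec["url"], params=spec["params"])
        print(name, response.status_code)
        work_queue.task_done()

threading.Thread(target=worker, daemon=True).start()
for name, spec in QUERIES.items():
    threading.Thread(target=scheduler, args=(name, spec), daemon=True).start()

while True:  # keep the main thread alive
    time.sleep(60)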
Is there any way to improve the performance of a Python script by getting all threads ready and then sending them all at once?
For example, "get ready" 100 different threads with HTTP requests, and when they are all ready, release them at the same time with the smallest delay possible.
Is it possible to make all threads ready (for example 500 threads) and send them all without waiting?
Yes.
What you need is a synchronization object. Basically, you start all the threads, but each one first tries to acquire a resource that is initially unavailable. When all 500 threads are waiting, you release that resource and all 500 threads run.
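A sketch of that idea using threading.Event as the synchronization object (a threading.Barrier would also work); the URL and thread count are placeholders:

import threading
import requests

start_signal = threading.Event()

def worker(url):
    start_signal.wait()           # block until the main thread releases everyone
    response = requests.get(url)  # all threads fire their request at (nearly) the same moment
    print(response.status_code)

threads = [threading.Thread(target=worker, args=("http://example.org/",)) for _ in range(100)]
for t in threads:
    t.start()

start_signal.set()  # release all waiting threads together
for t in threads:
    t.join()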
Please note that
on a typical computer only as many threads as there are CPU cores (say, 8) can run truly in parallel. So starting 500 threads with 1 HTTP request each will likely end up about the same as running 8 threads that each do roughly 62 HTTP requests in a loop.
specifically, Python has the GIL (global interpreter lock), so threads will not execute Python code in parallel; for CPU-bound work you would need multiprocessing (for I/O-bound requests, threads can still overlap the waiting).
this looks like load testing. There is software specifically built for that purpose, reliable and well tested. Don't reinvent the wheel; that is error-prone.
Thread scheduling on Windows happens at roughly 15-16 ms intervals by default, AFAIK. That's because a hardware timer causes an interrupt, which gives the kernel control of the CPU. So a 10 ms timing requirement may not be achievable.
I have been working with the Coinbase Websocket API recently for data analysis purposes. I am trying to track the order book at second-level frequency or better.
As far as I am aware, it is possible to use the REST API for that, but it does not include a timestamp. The other options are the websocket level2 updates and the full channel.
The problem is that when I am processing the level2 updates I am constantly falling behind in time (I did not focus on processing speed while programming, since that was not my goal and I have neither the hardware nor the connection speed for it), so for example after 30 min I have only been able to process 10 min of data.
The problem comes when, for whatever reason, I am disconnected from the exchange: I have to reconnect, and I am left with a big empty window of data in the middle.
Is there any aggregated feed or other way to do it (receive all updates batched per second or something like that) that I am not aware of, or should I just resign myself to improving my code and buying better equipment?
P.S.: I am relatively new here, so sorry if this type of question does not fit!
Just in case anyone is interested: I ended up opening multiple websockets over staggered time windows and reconnecting them periodically, in order to miss as few price updates as possible.
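A rough sketch of that approach using asyncio and the third-party websockets package is below. The feed URL, channel name, and subscribe message format are assumptions based on the Coinbase Exchange feed and may have changed, so treat them as placeholders to verify against the current docs.

import asyncio
import json
import time
import websockets  # pip install websockets

FEED_URL = "wss://ws-feed.exchange.coinbase.com"  # assumed endpoint
SUBSCRIBE = {"type": "subscribe", "product_ids": ["BTC-USD"], "channels": ["level2"]}  # assumed format

async def feed_worker(worker_id, reconnect_every=300, stagger=150):
    """Hold one websocket connection and reconnect every `reconnect_every` seconds.

    Workers start `stagger` seconds apart so one connection is always mid-session
    while another is reconnecting, which narrows the gaps in the data."""
    await asyncio.sleep(worker_id * stagger)
    while True:
        try:
            async with websockets.connect(FEED_URL) as ws:
                await ws.send(json.dumps(SUBSCRIBE))
                deadline = time.monotonic() + reconnect_every
                while time.monotonic() < deadline:
                    message = json.loads(await ws.recv())
                    handle_update(message)  # your own processing / deduplication
        except Exception as exc:
            print("worker", worker_id, "reconnecting after error:", exc)

def handle_update(message):
    pass  # placeholder: store or process the level2 update here

async def main():
    await asyncio.gather(feed_worker(0), feed_worker(1))

asyncio.run(main())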
I'm relatively new to Python and requests, so I'm not sure of the best way to go about this.
I need to send a large number of POST requests to a URL. Right now I'm simply using a loop and sending the requests one by one, which yields roughly 100 posts every 10-30 seconds, depending on the connection. I'm looking for a way to do this faster and with more posts. Multiprocessing was recommended to me, but my knowledge here is very lacking (I've already frozen my computer by trying to spawn too many processes).
How can I effectively implement multiprocessing to increase my results?
Here is a code sample taken from http://skipperkongen.dk/2016/09/09/easy-parallel-http-requests-with-python-and-asyncio/ which may solve your problem. It uses the requests library for the HTTP calls and asyncio to dispatch them to a thread pool concurrently. The only change you'd have to make is from a GET call to a POST call.
This was written for Python 3.5 (as stated in the article).
# Example 2: asynchronous requests
import asyncio
import requests

async def main():
    loop = asyncio.get_event_loop()
    # Dispatch 20 blocking requests.get calls to the default thread pool executor.
    futures = [
        loop.run_in_executor(
            None,
            requests.get,  # for POSTs, swap in requests.post (e.g. wrapped with functools.partial to pass data)
            'http://example.org/'
        )
        for i in range(20)
    ]
    for response in await asyncio.gather(*futures):
        pass  # process each response here

loop = asyncio.get_event_loop()
loop.run_until_complete(main())
I would also recommend reading the entire article as it shows time comparisons when using lots of threads.
There's no reason to use multiprocessing here. Making requests of HTTP servers is almost entirely I/O-bound, not CPU-bound, so threads work just fine.
And the very first example of using ThreadPoolExecutor in the stdlib's concurrent.futures documentation does exactly what you're asking for, except with urllib instead of requests.
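For reference, a minimal adaptation of that pattern to POST requests might look like the sketch below; the URL, payloads, and pool size are placeholders, not values from the question.

import concurrent.futures
import requests

URL = "https://example.org/endpoint"         # placeholder
PAYLOADS = [{"id": i} for i in range(1000)]  # placeholder request bodies

def send(payload):
    return requests.post(URL, json=payload, timeout=10)

# 20 worker threads; tune this to what the server and your connection can handle.
with concurrent.futures.ThreadPoolExecutor(max_workers=20) as pool:
    for response in pool.map(send, PAYLOADS):
        print(response.status_code)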
If you're doing anything complicated, look at requests-futures.
If you really do need to use multiprocessing for some reason (e.g., you're doing a whole lot of text processing on each result, and you want to parallelize that along with the requesting), you can just switch the ThreadPoolExecutor to a ProcessPoolExecutor and change nothing else in your code.
I have a Google App Engine HTTP resource that takes 20 seconds to respond. The resource does a calculation requiring very little bandwidth and no storage access. Billing is not enabled. My desktop application spawns 100 threads to POST 500 times (each thread will, on average, POST 5 times). I believe that 500 POSTs use up just a little more than the freebie time for non-billing accounts, which is 6.5 CPU hours per 24-hour period. I might be about 10 POSTs over the limit, because towards the end about 10 of the 500 will fail even if I allow each request to retry twice.
In any event, the fact that I'm a little over the limit probably does not affect the problem which prompted my question. My question is: the dashboard measurement "CPU seconds used per second" is about 17. I would like this to be 100, because after all, I have 100 threads.
I'm not really good with Firebug or other monitoring tools, so I have not proven that there is a peak of 100 outstanding requests on the wire side of the Python standard library web methods, but I do print "hey" to the desktop console when there are 100 outstanding threads. It says "hey" fairly early, so I think the number of CPU seconds per second should be a lot closer to 100 than 17. Is my problem on the desktop, or is GAE throttling me, and how can I get 100 CPU seconds per second? How can I get somebody at Google to help with this question? I think their "support" link just goes to "community-style" support.
Search the groups for 1000ms. Your app will not be given as many resources if your user requests do not return in less than 1000 ms. You might also face additional issues with requests that take 20 seconds: I believe that if your requests sit in the pending queue, that time counts against the run time, increasing the likelihood that you will get deadline/timeout errors.
You should look into breaking your code up and doing the processing in the task queue, or submitting more requests with less work per request.
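As a rough illustration of the task-queue approach on the legacy Python App Engine runtime (the routes and parameter names are made up, and the API shown is the old google.appengine.api.taskqueue interface, so check the docs for whatever runtime you are actually on):

import webapp2
from google.appengine.api import taskqueue

class Enqueue(webapp2.RequestHandler):
    def post(self):
        # Instead of doing the 20-second calculation in the user-facing request,
        # enqueue it and return immediately.
        taskqueue.add(url='/worker', params={'job_id': self.request.get('job_id')})
        self.response.write('queued')

class Worker(webapp2.RequestHandler):
    def post(self):
        job_id = self.request.get('job_id')
        # ...do the long calculation here, in a task queue request...

app = webapp2.WSGIApplication([('/enqueue', Enqueue), ('/worker', Worker)])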