Tweepy api.list_direct_messages() updates slowly - python

Hello and thanks for taking the time to try to answer my question. I'll be as blunt and specific as possible.
Using tweepy I'm trying to get the ID from the last message in my DMs by using this method:
auth = tweepy.OAuthHandler(token[0], token[1])
auth.set_access_token(token[2], token[3])
api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True)
last_dm = api.list_direct_messages(1)
for messages in last_dm:
    print(messages.message_create['sender_id'])
    if not (messages.message_create['sender_id'] == my_id):
        send_message()
This works as expected, but something weird happens right after. If I run the program once it works, but if I run it again within 3 or so minutes it doesn't register that the sender ID has changed. Any time after that it works, so I think there's some sort of lag coming from tweepy.
My question is: is there a way around this? If not with tweepy, what about another library or a language like JavaScript?

Twitter enforces a rate limit on all API calls. With Tweepy you can let the library handle it by setting wait_on_rate_limit=True.
In that case Tweepy slows down (holds back requests) to stay within the limits, with the advantage that your application does not get an error or exception.
In a similar implementation I call api.list_direct_messages once every 60 seconds and don't hit the rate limit (obviously I cannot respond in real time, but that is a decision that depends on the context and business logic).
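A minimal sketch of that once-a-minute polling, assuming a hypothetical fetch_latest callable that wraps api.list_direct_messages and returns (message_id, sender_id) pairs. The names poll_direct_messages, fetch_latest and on_message are illustrative and not part of Tweepy. Remembering the IDs already seen means a repeated listing is simply ignored until a new message shows up:

```python
import time


def poll_direct_messages(fetch_latest, on_message, interval=60,
                         max_polls=None, sleep=time.sleep):
    """Call fetch_latest() every `interval` seconds and report unseen messages.

    fetch_latest: hypothetical callable returning (message_id, sender_id) pairs.
    on_message:   callback invoked once per message ID not seen before.
    max_polls:    stop after this many polls (None means run forever).
    """
    seen_ids = set()
    polls = 0
    while max_polls is None or polls < max_polls:
        for message_id, sender_id in fetch_latest():
            if message_id not in seen_ids:
                seen_ids.add(message_id)
                on_message(message_id, sender_id)
        polls += 1
        # Skip the final sleep so a bounded run returns promptly.
        if max_polls is None or polls < max_polls:
            sleep(interval)


# Possible wiring with the objects from the question (an untested assumption):
# fetch_latest = lambda: [(m.id, m.message_create['sender_id'])
#                         for m in api.list_direct_messages(1)]
# poll_direct_messages(fetch_latest, lambda mid, sender: print(mid, sender))
```

Passing sleep as a parameter is only there so the loop can be exercised without actually waiting.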

Related

Delaying 1 second per request, not enough for 3600 per hour

The Amazon API limit is apparently 1 req per second or 3600 per hour. So I implemented it like so:
while True:
    # sql stuff
    time.sleep(1)
    result = api.item_lookup(row[0], ResponseGroup='Images,ItemAttributes,Offers,OfferSummary', IdType='EAN', SearchIndex='All')
    # sql stuff
Error:
amazonproduct.errors.TooManyRequests: RequestThrottled: AWS Access Key ID: ACCESS_KEY_REDACTED. You are submitting requests too quickly. Please retry your requests at a slower rate.
Any ideas why?
This code looks correct, and it looks like the limit of 1 request/second is still in effect:
http://docs.aws.amazon.com/AWSECommerceService/latest/DG/TroubleshootingApplications.html#efficiency-guidelines
You want to make sure that no other process is using the same associate account. Depending on where and how you run the code, there may be an old version of the VM, or another instance of your application running, or maybe there is one version in the cloud and another on your laptop, or, if you are using a threaded web server, there may be multiple threads all running the same code.
If you still hit the query limit, you just want to retry, possibly with the TCP-like "additive increase/multiplicative decrease" back-off. You start by setting extra_delay = 0. When request fails, you set extra_delay += 1 and sleep(1 + extra_delay), then retry. When it finally succeeds, set extra_delay = extra_delay * 0.9.
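The back-off described above could be sketched roughly as follows. This is an illustration, not code from the amazonproduct library: AimdThrottle and Throttled are hypothetical names, and in real use retry_on would be whatever throttling exception your client raises.

```python
import time


class Throttled(Exception):
    """Hypothetical stand-in for the client's throttling error."""


class AimdThrottle:
    """Additive increase on failure, multiplicative decrease on success."""

    def __init__(self, base_delay=1.0, step=1.0, decay=0.9, sleep=time.sleep):
        self.base_delay = base_delay
        self.step = step
        self.decay = decay
        self.extra_delay = 0.0
        self._sleep = sleep

    def call(self, request, retry_on=Exception, max_retries=10):
        """Run request(), sleeping base_delay + extra_delay before each try."""
        for attempt in range(max_retries + 1):
            self._sleep(self.base_delay + self.extra_delay)
            try:
                result = request()
            except retry_on:
                if attempt == max_retries:
                    raise  # give up after the last allowed retry
                self.extra_delay += self.step  # additive increase
            else:
                self.extra_delay *= self.decay  # multiplicative decrease
                return result
```

One throttle object would be shared by all requests in the loop, so the extra delay learned from one failure carries over to the next call and decays slowly as calls succeed.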
Computer time is funny
This post is correct in saying "it varies in a non-deterministic manner" (https://stackoverflow.com/a/1133888/5044893). Depending on a whole host of factors, the time measured by a processor can be quite unreliable.
This is compounded by the fact that Amazon's API has a different clock than your program does. They are certainly not in sync, and there's likely some overlap between their "1 second" time measurement and your program's. It's likely that Amazon tries to average out this inconsistency, and they probably also allow a small bit of error, maybe +/- 5%. Even so, the discrepancy between your clock and theirs is probably what triggers the throttling error.
Give yourself some buffer
Here are some thoughts to consider.
Do you really need to hit the Amazon API every single second? Would your program work with a 5-second interval? Even a 2-second interval halves your request rate, which makes a lockout much less likely. Also, Amazon may be charging you for every service call, so spacing them out could save you money.
This is really a question of "optimization" now. If you use a constant variable to control your API call rate (say, SLEEP = 2), then you can adjust that rate easily. Fiddle with it, increase and decrease it, and see how your program performs.
Push, not pull
Sometimes, hitting an API every second means that you're polling for new data. Polling is notoriously wasteful, which is why the Amazon API has a rate limit.
Instead, could you switch to a queue-based approach? Amazon SQS can fire off events to your programs. This is especially easy if you host them on AWS Lambda.

ec2 wait for instance to come up with timeout [Python]

I'm using the AWS Python API (boto3). My script starts a few instances and then waits for them to come online before proceeding. I want the wait to time out after a predefined period, but I can't find any API for that in Python. Any ideas? A snippet of my current code:
def waitForInstance(id):
    runningWaiter = self.ec2c.get_waiter("instance_status_ok")
    runningWaiter.wait(InstanceIds=[id])
    instance = ec2resource.Instance(id)
    return instance.state
I can certainly do something like running this piece of code in a separate thread and terminate it if needed, but I was wondering whether there is already a built in API in boto3 for that and I'm just missing it.
A waiter has a configuration associated with it which can be accessed (using your example above) as:
runningWaiter.config
One of the settings in this config is max_attempts which controls how many attempts will be tried before giving up. The default value is 40. You can change that value like this:
runningWaiter.config.max_attempts = 10
This isn't directly controlling a timeout as your question asked but will cause the waiter to give up earlier.
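As a rough sketch of turning a desired timeout into a waiter setting: boto3 waiters also accept a WaiterConfig argument with Delay and MaxAttempts on each wait() call, so the attempt count can be derived from the timeout. The helper name wait_with_timeout and the instance ID in the comment are hypothetical, and the resulting timeout is only approximate because each status request takes time as well.

```python
import math


def wait_with_timeout(waiter, timeout_seconds, delay_seconds=15, **wait_kwargs):
    """Run a boto3-style waiter for roughly `timeout_seconds`.

    The waiter polls every `delay_seconds`, so the timeout is converted into
    a maximum number of attempts. Returns the attempt limit that was used.
    """
    max_attempts = max(1, math.ceil(timeout_seconds / delay_seconds))
    waiter.wait(
        WaiterConfig={"Delay": delay_seconds, "MaxAttempts": max_attempts},
        **wait_kwargs
    )
    return max_attempts


# Possible usage (requires boto3 and credentials; the instance ID is made up):
# import boto3
# waiter = boto3.client("ec2").get_waiter("instance_status_ok")
# wait_with_timeout(waiter, 300, InstanceIds=["i-0123456789abcdef0"])
```

When the attempts run out, the waiter raises botocore.exceptions.WaiterError, which the caller can catch to treat as the timeout.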
Why not check the instances status from time to time?
# code copied from the boto3 docs
for status in ec2.meta.client.describe_instance_status()['InstanceStatuses']:
    print(status)
Reference: http://boto3.readthedocs.org/en/latest/guide/migrationec2.html
BTW, it is better to tag all instances using a standard naming convention. Querying AWS resources by their original IDs is a maintenance nightmare.
You could put a sleep timer in your code. Sleep for x minutes, check whether the instance is ready, and go back to sleep if not. After y attempts, take some sort of action.
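The sleep-and-check idea might look like the sketch below. It is generic on purpose: poll_until and the check callable are hypothetical names, and check would wrap whatever status query you use (for example a describe_instance_status call).

```python
import time


def poll_until(check, timeout, interval=5.0,
               clock=time.monotonic, sleep=time.sleep):
    """Call check() every `interval` seconds until it returns True.

    Returns True as soon as check() succeeds, or False once `timeout`
    seconds have passed without success.
    """
    deadline = clock() + timeout
    while True:
        if check():
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            return False
        # Never sleep past the deadline.
        sleep(min(interval, remaining))


# Possible usage, assuming a function instance_is_ok(instance_id) exists:
# if not poll_until(lambda: instance_is_ok(instance_id), timeout=300):
#     raise RuntimeError("instance did not come up in time")
```

Using a monotonic clock keeps the deadline unaffected by changes to the system time.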

GAE Backend fails to respond to start request

This is probably a truly basic thing that I'm simply having an odd time figuring out in a Python 2.5 app.
I have a process that will take roughly an hour to complete, so I made a backend. To that end, I have a backend.yaml that has something like the following:
- name: mybackend
  options: dynamic
  start: /path/to/script.py
(The script is just raw computation. There's no notion of an active web session anywhere.)
On toy data, this works just fine.
This used to be public, so I would navigate to the page, the script would start, and it would time out after about a minute (HTTP + 30s shutdown grace period, I assume). I figured this was a browser issue, so I repeated the same thing with a cron job. No dice. I switched to using a push queue and adding a targeted task, since on paper it looks like it would wait for 10 minutes. Same thing.
All 3 time out after that minute, which means I'm not decoupling the request from the backend like I believe I am.
I'm assuming that I need to write a proper Handler for the backend to do work, but I don't exactly know how to write the Handler/webapp2Route. Do I handle _ah/start/ or make a new endpoint for the backend? How do I handle the subdomain? It still seems like the wrong thing to do (I'm sticking a long-process directly into a request of sorts), but I'm at a loss otherwise.
So the root cause ended up being doing the following in the script itself:
models = MyModel.all()
for model in models:
    # Magic happens
I was basically taking for granted that the datastore would automatically batch my Query.all() over many entities, but it was dying at the 1000th entry or so. I originally described the script as computation only because I completely ignored the fact that the reads can fail.
The actual solution for solving the problem we wanted ended up being "Use the map-reduce library", since we were trying to look at each model for analysis.
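For readers who only need to walk a large result set without the map-reduce library, an alternative is to read it in explicit pages. The sketch below is a different technique from the one the answer settled on, and it is deliberately generic: iter_in_batches and fetch_page are hypothetical names, and fetch_page stands in for whatever paged read your datastore client offers.

```python
def iter_in_batches(fetch_page, batch_size=500):
    """Yield items one by one while reading them page by page.

    fetch_page(cursor, batch_size) is a hypothetical callable returning
    (items, next_cursor). The first call receives cursor=None, and an
    empty page signals the end of the data.
    """
    cursor = None
    while True:
        items, cursor = fetch_page(cursor, batch_size)
        if not items:
            return
        for item in items:
            yield item


# With the old google.appengine.ext.db API, fetch_page could be built from
# query.with_cursor(cursor), query.fetch(batch_size) and query.cursor().
# That wiring is an untested assumption here.
```

Each page is a separate read, so a failure can be retried from the last cursor instead of restarting the whole scan.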

sending instant messages through python (msn)

OK, I am well aware there are many other questions about this, but I have been searching and have yet to find a solid, proper answer that doesn't revolve around jabber or something worse. (No offense to jabber users, I just don't want all the extras that come with it.)
I currently have msnp and twisted.words, I simply want to send and receive messages, have read many examples that have failed to work, and msnp is poorly documented.
My preference is msnp as it requires much less code, I'm not looking for something complicated.
Using this code I can login, and view my friends that are online (can't send them messages though.):
import msnp
import time, threading
msn = msnp.Session()
msn.login('XXXXXXX#hotmail.com', 'XXXXXX')
msn.sync_friend_list()
class MSN_Thread(threading.Thread):
    def run(self):
        msn.start_chat("XXXXXXX#hotmail.com")  # this does not work
        while True:
            msn.process()
            time.sleep(1)
start_msn = MSN_Thread()
start_msn.start()
I hope I have been clear enough; it's pretty late and my head is not in a clear state after all this msn frustration.
edit: since it seems msnp is extremely outdated, could anyone recommend an alternative, with simple examples of how I could achieve this?
I don't need anything fancy that requires other accounts.
There is also xmpp which is used for gmail.
You are using a library abandoned in 2004, so I'm not sure msnp can still be used to talk on MSN.
Anyway, I would try with:
while True:
    msn.process(chats = True)
    time.sleep(1)
using the contact id and not the email address.
contacts = msn.friend_list.get_friends()
contact_id = contacts.get_passport_id()
Your code just starts the chat without sending anything; you need to add the code to send a message.
Have a look at the send_message method in this tutorial.
It looks like papyon is a maintained fork of the pymsn library, and is currently used by telepathy-butterfly and amsn2.
papyon is an MSN client library, that tries to abstract the MSN protocol gory details. It is a fork of the unmaintained pymsn MSN library. papyon uses the GLib main event loop to process the network events in an asynchronous manner.

chatbot using twisted and wokkel

I am writing a chatbot using Twisted and wokkel and everything seems to be working, except that the bot periodically logs off. To temporarily fix that I set presence to available every time a connection is initialized. Does anyone know how to prevent it from going offline? (I assume if I keep sending available presence every minute or so the bot won't go offline, but that just seems too wasteful.) Any suggestions? Here is the presence code:
class BotPresenceClientProtocol(PresenceClientProtocol):
    def connectionInitialized(self):
        PresenceClientProtocol.connectionInitialized(self)
        self.available(statuses={None: 'Here'})
    def subscribeReceived(self, entity):
        self.subscribed(entity)
        self.available(statuses={None: 'Here'})
    def unsubscribeReceived(self, entity):
        self.unsubscribed(entity)
Thanks in advance.
If you're using XMPP, as I assume is the case given your mention of wokkel, then, per RFC 3921, the applicable standard, you do need periodic exchanges of presence information (indeed, that's a substantial overhead of XMPP, and solutions to it are being researched, but that's the state of the art as of now). Essentially, given the high likelihood that total silence from a client may be due to that client just going away, periodic "reassurance" of the kind "I'm still here" appears to be a must (I'm not sure what direction those research efforts are taking to ameliorate this situation -- maybe the client could commit to "being there for at least the next 15 minutes", but given that most clients are about a fickle human user who can't be stopped from changing their mind at any time and going away, I'm not sure that would be solid enough to be useful).
