I have included the code below that I'm using with Tweepy, a Twitter API library for Python. I've tried most of the approaches I found online, but they all fail to close the connection or stop the stream. Is there any way to do so?
Inside my function:
setTerms = s.split(',')
streaming_api = tweepy.Stream(auth=auth, listener=StreamListener(), timeout=60 )
if (s == '0'):
    streaming_api.disconnect()
    raise web.seeother('/dc')
    print "Failed to see this"
try:
    twt = streaming_api.filter(track=setTerms)
except:
    streaming_api.disconnect()
    #also cannot see this
    raise web.seeother('/stream')
Here is the stream listener class
class StreamListener(tweepy.StreamListener):
    def on_status(self, status):
        try:
            printer(status.text, status.created_at)
        except Exception, e:
            pass

    def on_error(self, status_code):
        print >> sys.stderr, 'Encountered error with status code:', status_code
        return True

    def on_timeout(self):
        print >> sys.stderr, 'Timeout...'
        return True
The first time you call stream.disconnect() (inside if (s == '0'):), you haven't called filter yet, so the stream has never been connected and there is nothing to disconnect. The rest of your code should be correct, assuming you're using the latest version of tweepy. Note that the except block will almost never run, because any errors that occur while the stream is running are passed to the on_error callback.
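One way to stop a running stream from inside the listener is to return False from a callback; Tweepy's StreamListener.on_data docstring says that returning False stops the stream and closes the connection. The sketch below only illustrates that contract. It does not use Tweepy at all: StopAfterN and run_stream are hypothetical stand-ins, and run_stream is a simplified imitation of a read loop, not the library's implementation.

```python
class StopAfterN:
    """Hypothetical listener that asks the stream to stop after `limit` statuses."""

    def __init__(self, limit):
        self.limit = limit
        self.seen = []

    def on_status(self, status):
        self.seen.append(status)
        # Returning False is the "please disconnect" signal;
        # returning True keeps the stream going.
        return len(self.seen) < self.limit


def run_stream(listener, statuses):
    """Stand-in for the library's read loop (an assumption, not Tweepy code)."""
    for status in statuses:
        if listener.on_status(status) is False:
            return "disconnected"
    return "exhausted"


listener = StopAfterN(limit=2)
result = run_stream(listener, ["first", "second", "third"])
print(result, listener.seen)
```

In a real listener the condition would be whatever tells you the user asked to stop (for example a flag set by the '/dc' request), rather than a counter.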
Related
I am trying to make a Twitter bot in Python using tweepy. When I run the code below I get this error:
tweepy.errors.NotFound: 404 Not Found
50 - User not found.
My code:
import tweepy
import logging
from config import create_api
import json

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger()

class FavRetweetListener(tweepy.Stream):
    def __init__(self, api):
        self.api = api
        self.me = api.get_user()

    def on_status(self, tweet):
        logger.info(f"Processing tweet id {tweet.id}")
        if tweet.in_reply_to_status_id is not None or \
                tweet.user.id == self.me.id:
            # This tweet is a reply or I'm its author so, ignore it
            return
        if not tweet.favorited:
            # Mark it as Liked, since we have not done it yet
            try:
                tweet.favorite()
            except Exception as e:
                logger.error("Error on fav", exc_info=True)
        if not tweet.retweeted:
            # Retweet, since we have not retweeted it yet
            try:
                tweet.retweet()
            except Exception as e:
                logger.error("Error on fav and retweet", exc_info=True)

    def on_error(self, status):
        logger.error(status)

def main(keywords):
    api = create_api()
    tweets_listener = FavRetweetListener(api)
    stream = tweepy.Stream(api.auth, tweets_listener)
    stream.filter(track=keywords, languages=["en"])

if __name__ == "__main__":
    main(["Python", "Tweepy"])
Is this something with the `tweet.user.id == self.me.id:` ?
The user no longer exists on Twitter.
You need to catch the error with a try/except statement in Python.
You did not say which line raises the error, but you could catch it with something like this:
try:
    api.get_user()
except tweepy.errors.NotFound:
    print("user not found")
I have a simple function (in Python 3) that takes a URL and attempts to resolve it: it prints an error code if there is one (e.g. 404), or resolves a shortened URL to its full URL. My URLs are in one column of a CSV file and the output is saved in the next column. The problem arises when the program encounters a URL whose server takes too long to respond: the program just crashes. Is there a simple way to force urllib to print an error code if the server is taking too long? I looked into "Timeout on a function call", but that looks a little too complicated as I am just starting out. Any suggestions?
i.e. (COL A) shorturl (COL B) http://deals.ebay.com/500276625
def urlparse(urlColumnElem):
    try:
        conn = urllib.request.urlopen(urlColumnElem)
    except urllib.error.HTTPError as e:
        return (e.code)
    except urllib.error.URLError as e:
        return ('URL_Error')
    else:
        redirect=conn.geturl()
        #check redirect
        if(redirect == urlColumnElem):
            #print ("same: ")
            #print(redirect)
            return (redirect)
        else:
            #print("Not the same url ")
            return(redirect)
EDIT: if anyone gets the http.client.RemoteDisconnected error (like me), see this question/answer: "http.client.RemoteDisconnected error while reading/parsing a list of URL's"
Have a look at the docs:
urllib.request.urlopen(url, data=None[, timeout])
The optional timeout parameter specifies a timeout in seconds for blocking operations like the connection attempt (if not specified, the global default timeout setting will be used).
You can set a realistic timeout (in seconds) for your process:
conn = urllib.request.urlopen(urlColumnElem, timeout=realistic_timeout_in_seconds)
and, to stop your code from crashing, move everything inside the try/except block:
import socket
import urllib.error
import urllib.request

def urlparse(urlColumnElem):
    try:
        conn = urllib.request.urlopen(
            urlColumnElem,
            timeout=realistic_timeout_in_seconds
        )
        redirect=conn.geturl()
        #check redirect
        if(redirect == urlColumnElem):
            #print ("same: ")
            #print(redirect)
            return (redirect)
        else:
            #print("Not the same url ")
            return(redirect)
    except urllib.error.HTTPError as e:
        return (e.code)
    except urllib.error.URLError as e:
        return ('URL_Error')
    except socket.timeout as e:
        return ('Connection timeout')
Now if a timeout occurs, you will catch the exception and the program will not crash.
Good luck :)
First, there is a timeout parameter that can be used to control the time allowed for urlopen. Next, a timeout in urlopen should just throw an exception, more precisely a socket.timeout. If you do not want it to abort the program, you just have to catch it:
def urlparse(urlColumnElem, timeout=5): # allow 5 seconds by default
    try:
        conn = urllib.request.urlopen(urlColumnElem, timeout = timeout)
    except urllib.error.HTTPError as e:
        return (e.code)
    except urllib.error.URLError as e:
        return ('URL_Error')
    except socket.timeout:
        return ('Timeout')
    else:
        ...
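One detail worth adding to both answers: a timeout can surface in two ways. If it happens while connecting, urlopen wraps it in a URLError whose reason is the socket.timeout; if it happens while waiting for the response, the socket.timeout is raised directly. The following is a minimal sketch that handles both cases. The function name resolve_url and the returned labels are illustrative choices, not part of the original code.

```python
import socket
import urllib.error
import urllib.request


def resolve_url(url, timeout=5):
    """Return the final URL, an HTTP status code, or a short error label."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as conn:
            return conn.geturl()
    except urllib.error.HTTPError as e:
        # The server answered, but with an error status such as 404.
        return e.code
    except urllib.error.URLError as e:
        # A timeout during the connection attempt arrives wrapped in URLError.
        if isinstance(e.reason, socket.timeout):
            return "Timeout"
        return "URL_Error"
    except socket.timeout:
        # A timeout while waiting for the response is raised directly.
        return "Timeout"
```

HTTPError has to be caught before URLError because it is a subclass of it; the order of the last two handlers does not matter.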
I currently am making use of the tweepy package in python for a DM listener. I wish to send a reply to the sender on reception of their message. I have the following:
class StdOutListener( StreamListener ):
    def __init__( self ):
        self.tweetCount = 0

    def on_connect( self ):
        print("Connection established!!")

    def on_disconnect( self, notice ):
        print("Connection lost!! : ", notice)

    def on_data( self, status ):
        status = str(status)
        try:
            json_acceptable_string = status.replace('\\','')
            #string to dict
            status=json.loads(json_acceptable_string)
            if 'direct_message' in status.keys():
                print '\n'
                print status[u'direct_message'][u'sender_screen_name'] +' sent: '+ status[u'direct_message'][u'text']
                message=str(status[u'direct_message'][u'text'])
                api.send_direct_message(screen_name=str(status[u'direct_message'][u'sender_screen_name']),text='Out of office now - will respond to you asap')
                print 'auto response submitted'
            else:
                #not direct message flow
                pass
        except:
            #not important flows - couldn't convert to json/not correct flow in stream
            pass
        return True

def main():
    global api
    try:
        auth = OAuthHandler(consumer_key, consumer_secret)
        auth.secure = True
        auth.set_access_token(access_token, access_token_secret)
        api = API(auth)
        print(api.me().name)
        stream = Stream(auth, StdOutListener())
        stream.userstream()
    except BaseException as e:
        print("Error in main()", e)

if __name__ == '__main__':
    main()
For some reason, I can see the print statement of the user and what they sent but when it gets to the send_direct_message method it hangs.
Oddly enough, if I message myself, I receive a barrage of messages as it loops. Is this because it's on_data()? How can I make this work for other senders?
UPDATE: Resolved - regenerated the tokens and added a conditional to check the sender, essentially blacklisting myself.
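The sender check could look roughly like the sketch below. The likely reason for the loop is that the bot's own auto-reply comes back on the stream as another direct_message event, which triggers another reply. The helper name should_auto_reply and the own_screen_name argument are hypothetical; the dictionary keys are the ones used in the listener above.

```python
def should_auto_reply(status, own_screen_name):
    """Return True only for direct messages sent by someone other than the bot."""
    message = status.get("direct_message")
    if not message:
        # Not a direct-message event (friends list, tweet, keep-alive, ...).
        return False
    sender = message.get("sender_screen_name")
    if not sender:
        return False
    # Screen names are compared case-insensitively.
    return sender.lower() != own_screen_name.lower()


incoming = {"direct_message": {"sender_screen_name": "alice", "text": "hi"}}
echo = {"direct_message": {"sender_screen_name": "my_bot", "text": "Out of office"}}
print(should_auto_reply(incoming, "my_bot"), should_auto_reply(echo, "my_bot"))
```

In on_data this check would wrap the send_direct_message call, so the bot's own outgoing messages are ignored.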
I am implementing a Twitter bot for fun purposes using Tweepy.
What I am trying to code is a bot that tracks a certain keyword and, based on it, replies to the user who tweeted it with a given string.
I considered storing Twitter's stream in a .json file and looping over the Tweet object for every user, but that seems impractical because receiving the stream locks the program in a loop.
So, how can I track tweets with Twitter's Stream API based on a certain keyword and reply to the users who tweeted it?
Current code:
from tweepy import OAuthHandler
from tweepy import Stream
from tweepy.streaming import StreamListener

class MyListener(StreamListener):
    def on_data(self, data):
        try:
            with open("caguei.json", 'a+') as f:
                f.write(data)
                data = f.readline()
                tweet = json.loads(data)
                text = str("#%s acabou de. %s " % (tweet['user']['screen_name'], random.choice(exp)))
                tweepy.API.update_status(status=text, in_reply_to_status_id=tweet['user']['id'])
                #time.sleep(300)
                return True
        except BaseException as e:
            print("Error on_data: %s" % str(e))
        return True

    def on_error(self, status):
        print(status)
        return True

api = tweepy.API(auth)
twitter_stream = Stream(auth, MyListener())
twitter_stream.filter(track=['dengue']) #Executing it the program locks on a loop
Tweepy's StreamListener class allows you to override its on_data method. That's where your logic should go.
As per the source code:
class StreamListener(object):
    ...
    def on_data(self, raw_data):
        """Called when raw data is received from connection.

        Override this method if you wish to manually handle
        the stream data. Return False to stop stream and close connection.
        """
        ...
So in your listener, you can override this method and do your custom logic.
class MyListener(StreamListener):
    def on_data(self, data):
        do_whatever_with_data(data)
You can also override several other methods (on_direct_message, etc) and I encourage you to take a look at the code of StreamListener.
Update
Okay, you can do what you intend to do with the following:
class MyListener(StreamListener):
    def __init__(self, *args, **kwargs):
        super(MyListener, self).__init__(*args, **kwargs)
        self.file = open("whatever.json", "a+")

    def _persist_to_file(self, data):
        try:
            self.file.write(data)
        except BaseException:
            pass

    def on_data(self, data):
        try:
            tweet = json.loads(data)
            text = str("#%s acabou de. %s " % (tweet['user']['screen_name'], random.choice(exp)))
            # Call update_status on an API instance (the `api` object created in the
            # question) and reply to the tweet's own id, not the author's user id.
            api.update_status(status=text, in_reply_to_status_id=tweet['id'])
            self._persist_to_file(data)
            return True
        except BaseException as e:
            print("Error on_data: %s" % str(e))
        return True

    def on_error(self, status):
        print(status)
        return True
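To keep on_data short and easy to test, the parsing and reply-building step can be pulled out into a plain function. This is only a sketch: build_reply is a hypothetical helper, the "@" mention prefix is an assumption about how the reply should address the author, and the function does not call the Twitter API itself. It returns the text and the id of the tweet to reply to, which the listener would then pass to update_status.

```python
import json
import random


def build_reply(raw_data, phrases, rng=random):
    """Turn one raw stream payload into (reply_text, tweet_id), or None.

    None is returned for anything that is not a tweet, such as
    keep-alive newlines or notices without a user.
    """
    try:
        tweet = json.loads(raw_data)
    except (TypeError, ValueError):
        return None  # empty keep-alive line or malformed payload
    if not isinstance(tweet, dict) or "user" not in tweet or "id" not in tweet:
        return None  # some other kind of stream message
    text = "@%s %s" % (tweet["user"]["screen_name"], rng.choice(phrases))
    return text, tweet["id"]


sample = json.dumps({"id": 42, "user": {"screen_name": "alice"}})
print(build_reply(sample, ["hello"]))
```

Because the stream delivers one payload per call, there is no need to store the stream in a file and re-read it; each tweet can be answered as it arrives.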
I am having a lot of trouble encoding the character '/' in order to use it in the streaming Twitter API with twython. Without the encoding, 'EUR/USD' gives error code 401. Note that other search queries work normally and do not produce this error.
I have tried doing this in a couple of ways.
First:
'EUR/USD'.replace('/','%2F')
but the search is not returning anything.
I also tried:
urllib.quote('EUR/USD', '')
and while the output with print is the same (EUR%2FUSD) the search is still not returning any results.
Finally I tried double encoding:
urllib.quote(urllib.quote('EUR/USD', ''),'')
where I get EUR%252FUSD but still no results.
Furthermore, when searching for just EURUSD, search does work properly but only when it is preceded by the symbol $ (e.g. $EURUSD) in the tweet itself.
In case the dollar sign is missing search also won't detect the tweet. (e.g. just EURUSD)
This is how it works:
querystring = 'EURUSD'
auth = tweepy.OAuthHandler('key','secret')
auth.set_access_token('key','secret')
api = tweepy.API(auth)

class CustomStreamListener(tweepy.StreamListener):
    def on_status(self, status):
        pprint.pprint([status.user.name,removeNonAscii(status.text),status.lang])

    def on_error(self, status_code):
        print >> sys.stderr, 'Encountered error with status code:', status_code
        return True # Don't kill the stream

    def on_timeout(self):
        print >> sys.stderr, 'Timeout...'
        return True # Don't kill the stream

sapi = tweepy.streaming.Stream(auth, CustomStreamListener())
sapi.filter(track=querystring, languages=['en'])
Does anyone have an idea of what might be going on here?