Tweepy CustomStreamListener Class - python

I have the following code where I have made some amendments to the class 'CustomStreamListener':
import sys
import tweepy
consumer_key=""
consumer_secret=""
access_key = ""
access_secret = ""
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_key, access_secret)
api = tweepy.API(auth)
class CustomStreamListener(tweepy.StreamListener):
    def on_status(self, status):
        for hashtag in status.entities['hashtags']:
            if hashtag == 'turndownforwhat':
                print(hashtag['text'])
                print status.text
    def on_error(self, status_code):
        print >> sys.stderr, 'Encountered error with status code:', status_code
        return True # Don't kill the stream
    def on_timeout(self):
        print >> sys.stderr, 'Timeout...'
        return True # Don't kill the stream
sapi = tweepy.streaming.Stream(auth, CustomStreamListener())
sapi.filter(locations=[-122.75,36.8,-121.75,37.8])
The bit I have added is everything within the class from the 'for' statement onwards. What I am trying to do is filter by the text values of the hashtags within text messages and then use some of the standard tweepy filters further down to filter by geolocation.
This has been built in Python 2.7. With my amendments the code raises no errors, but it just hangs with no tweets coming through. Have I introduced a logic error that I have missed?
Thanks

The error is in the "if hashtag" condition. Each item in status.entities['hashtags'] is a dictionary, so comparing it directly to a string is never true. Compare its 'text' field instead:
if hashtag['text'] == 'turndownforwhat':
You may need to wait a while before a matching tweet shows up, but with a bigger bounding box and a trending hashtag you will see results with this modification.
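To see why the original comparison never matches, it helps to look at the shape of the hashtag entities. The following is a minimal sketch that uses a hand-written dictionary in place of a live tweet; the helper name has_hashtag and the case-insensitive comparison are assumptions added for illustration, not part of the original answer.

```python
def has_hashtag(entities, wanted):
    """Return True if any hashtag entity's text equals `wanted`, ignoring case."""
    wanted = wanted.lower()
    return any(tag.get('text', '').lower() == wanted
               for tag in entities.get('hashtags', []))


# Hand-written stand-in for status.entities on a real tweet.
sample_entities = {
    'hashtags': [
        {'text': 'TurnDownForWhat', 'indices': [10, 26]},
        {'text': 'python', 'indices': [27, 34]},
    ]
}

# A dict never equals a string, so the original test is always False.
print(sample_entities['hashtags'][0] == 'turndownforwhat')
# Comparing the 'text' field is what actually matches.
print(has_hashtag(sample_entities, 'turndownforwhat'))
```

Inside on_status, the same check could be applied to status.entities before printing status.text.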

Related

Twitter Scanner w tweepy - Python

I was just wondering if it's possible to make a scanner with tweepy - for instance, a while loop that is constantly searching for certain words. I'm a trader and would find it very useful in case there is any breaking news.
Example:
I want to set my scanner to constantly return tweets that have '$DB' in them. Furthermore, I only want to return tweets of users that have > 5k followers.
Any advice or pointers would be helpful! Thanks.
Edit/Update: As discussed by asongtoruin and qorka, the question asks for new tweets, not existing tweets. Previous edit used api.search method which finds only existing messages. The StreamListener reads new messages.
import tweepy
from tweepy import OAuthHandler, Stream, StreamListener
access_token='your_api_token'
access_secret='your_api_access_secret'
consumer_key = 'your_consumer_key'
consumer_secret = 'your_consumer_secret'
auth = OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_secret)
api = tweepy.API(auth)
class MyListener(StreamListener):
    def on_status(self, status):
        try:
            # Only print tweets from accounts with more than 5k followers
            if status.user.followers_count > 5000:
                print('%s (%s at %s, followers: %d)' % (status.text, status.user.screen_name, status.created_at, status.user.followers_count))
            return True
        except BaseException as e:
            print("Error on_status: %s" % str(e))
        return True
    def on_error(self, status):
        print(status)
        return True
twitter_stream = Stream(auth, MyListener())
twitter_stream.filter(track=['$DB','$MS','$C'])
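The follower check in on_status can be pulled out into a small predicate, which makes the rule easy to try without a live stream. This is only a sketch: FakeUser, FakeStatus, and is_interesting are hypothetical names, and the stand-in objects mimic just the text and user.followers_count attributes used above.

```python
from collections import namedtuple

# Hypothetical stand-ins for the parts of a tweepy status used here.
FakeUser = namedtuple('FakeUser', ['screen_name', 'followers_count'])
FakeStatus = namedtuple('FakeStatus', ['text', 'user'])


def is_interesting(status, symbols, min_followers=5000):
    """True if the author has more than `min_followers` followers and the
    text mentions at least one of `symbols` (case-insensitive)."""
    if status.user.followers_count <= min_followers:
        return False
    text = status.text.upper()
    return any(symbol.upper() in text for symbol in symbols)


statuses = [
    FakeStatus('$DB rallies after earnings', FakeUser('big_desk', 12000)),
    FakeStatus('$DB to the moon', FakeUser('small_fry', 40)),
    FakeStatus('Nice weather today', FakeUser('big_desk', 12000)),
]

for status in statuses:
    if is_interesting(status, ['$DB', '$MS', '$C']):
        print('%s (%s)' % (status.text, status.user.screen_name))
```

Only the first sample status passes both checks; the second fails the follower threshold and the third mentions none of the symbols.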

Retrieve tweets tweepy

How can I retrieve only my own tweets with a stream? I tried the following, but I don't see my tweets.
My first attempt:
streamingAPI = tweepy.streaming.Stream(auth, CustomStreamListener())
streamingAPI.userstream(_with='followings')
streamingAPI.filter()
My second attempt:
streamingAPI = tweepy.streaming.Stream(auth, CustomStreamListener())
streamingAPI.filter(follow= ['2466458114'])
Thanks a lot.
If you want to stream only your own user's tweets, you can use the following lines:
from tweepy import StreamListener
from tweepy import Stream
import tweepy
consumer_key = ''
consumer_secret = ''
access_token = ''
access_token_secret = ''
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)
class CustomStreamListener(StreamListener):
    def on_data(self, data):
        print(data)
    def on_error(self, status):
        print(status)
if __name__ == '__main__':
    listener = CustomStreamListener()
    twitterStream = Stream(auth, listener)
    twitterStream.filter(follow=['2466458114'])
In your question, you said that you can't see your tweets. Just to be clear: with streaming you only see tweets in real time. So even with my code, if you don't tweet anything while it is running, you won't see anything.
UPDATE AFTER CHAT IN COMMENTS
The official Twitter API has an annoying time limit: you can't get tweets older than a week.
For this task I suggest this great Python library (getOldTweets, as imported in the example further down).
It lets you retrieve as many tweets as you want, from whatever period you want.
As the documentation says, you can simply use it like this:
tweetCriteria = got.manager.TweetCriteria().setUsername('<user_without_#>').setSince("2015-05-01").setUntil("2015-09-30")
If you are using Python 2.x you can use got; if you are using Python 3.x, use got3 instead.
Here is an example in Python 3:
from getOldTweets import got3
tweetCriteria = got3.manager.TweetCriteria().setUsername('barackobama').setSince("2015-09-01").setUntil("2015-09-30")
tweets_list = got3.manager.TweetManager.getTweets(tweetCriteria)
for tweet in tweets_list:
    print(tweet.text)
Let me know.

Tweepy. Make stream run forever

I am relatively new to tweepy python library.
I want to be sure that my stream python script always runs on a remote server. So it would be great if someone will share the best practices on how to make it happen.
Right now I am doing it this way:
if __name__ == '__main__':
    while True:
        try:
            # create instance of the tweepy tweet stream listener
            listener = TweetStreamListener()
            # set twitter keys/tokens
            auth = OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
            auth.set_access_token(ACCESS_TOKEN, ACCESS_TOKEN_SECRET)
            # create instance of the tweepy stream
            stream = Stream(auth, listener)
            stream.userstream()
        except Exception as e:
            print "Error. Restarting Stream.... Error: "
            print e.__doc__
            print e.message
            time.sleep(5)
And I return False on each of the methods: on_error(), on_disconnect(), on_timeout().
So, by returning False the stream stops and then reconnects in the infinite loop.
Here's how I do mine and it's been running for almost a year, on two computers to handle the errors that stop the stream here and there.
# The auth objects don't need to be in the loop.
auth = OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_TOKEN, ACCESS_TOKEN_SECRET)
while True:
    listener = TweetStreamListener()
    stream = Stream(auth, listener, timeout=60)
    try:
        stream.userstream()
    except Exception as e:
        print "Error. Restarting Stream.... Error: "
        print e.__doc__
        print e.message
To make sure that it runs forever, you should also override the on_error method to control the time between reconnection attempts. Sleeping for only 5 seconds will hurt your chances of a successful reconnect, because Twitter will see that you are retrying too frequently. But that's another question.
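One common way to space out reconnection attempts is exponential backoff: wait longer after each consecutive failure, up to a cap. The sketch below is an illustration under assumptions rather than Tweepy's own behaviour; backoff_delays is a hypothetical helper, and the starting delay and cap are placeholder values that should be checked against Twitter's current reconnection guidance.

```python
import itertools


def backoff_delays(initial=5, factor=2, maximum=320):
    """Yield an endless sequence of wait times in seconds, growing by
    `factor` after each one and never exceeding `maximum`."""
    delay = initial
    while True:
        yield min(delay, maximum)
        delay = min(delay * factor, maximum)


# First eight waits for the placeholder settings above.
print(list(itertools.islice(backoff_delays(), 8)))
```

In a reconnect loop, one would take the next value from the generator and pass it to time.sleep after each failure, then start a fresh generator once the stream has stayed connected for a while.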
Just my two cents.
I received lots of 420 errors, which was odd because I wasn't tracking many keywords with the streaming API.
So I figured out that the on_data() method of the stream listener class must always return True.
Mine sometimes returned False, so tweepy cut the connection and, because it was in a loop, recreated it immediately. Twitter didn't like that much...
I've also resolved the problem by creating a new stream recursively on exceptions.
Here is my complete code. Just change the mytrack variable, put in your keys, and run it using pm2 or python.
from tweepy import OAuthHandler, Stream, StreamListener
import json
mytrack = ['netmine', 'bitkhar', 'bitcoin']
consumer_key = ""
consumer_secret = ""
access_token = ""
access_token_secret = ""
auth = OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
class StdOutListener(StreamListener):
    def __init__(self, listener, track_list, repeat_times):
        # Let the base class set up its own state (the listener argument is unused)
        super(StdOutListener, self).__init__()
        self.repeat_times = repeat_times
        self.track_list = track_list
        print('************** initialized : #', self.repeat_times)
    def on_data(self, data):
        print(self.repeat_times, 'tweet id : ', json.loads(data)['id'])
    def on_exception(self, exception):
        print('exception', exception)
        # Start a fresh stream when the current one dies with an exception
        new_stream(auth, self.track_list, self.repeat_times+1)
    def on_error(self, status):
        print('err', status)
        if status == 420:
            # returning False in on_error disconnects the stream
            return False
def new_stream(auth, track_list, repeat_times):
    listener = StdOutListener(StreamListener, track_list, repeat_times)
    stream = Stream(auth, listener).filter(track=track_list, is_async=True)
new_stream(auth, mytrack, repeat_times=0)

GetSearch or StreamListener? python

I'm very new to twitter api, please help me understand the difference between two things.
As far as I understand I can get real-time tweets by using tweepy for example :
hashtag = ['justinbieber']
class CustomStreamListener(tweepy.StreamListener):
    def on_status(self, status):
        try:
            data = status.__getstate__()
            print data
            output.write("%s\n "% data)
        except Exception, e:
            print >> sys.stderr, 'Encountered Exception:', e
            pass
    def on_error(self, status_code):
        print >> sys.stderr, 'Encountered error with status code:', status_code
        return True # Don't kill the stream
    def on_timeout(self):
        print >> sys.stderr, 'Timeout...'
        return True # Don't kill the stream
class Twitter():
    def __init__(self):
        consumer_key=
        consumer_secret=
        access_key =
        access_secret =
        self.auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
        self.auth.set_access_token(access_key, access_secret)
        self.api = tweepy.API(self.auth)
    def start(self):
        l = CustomStreamListener()
        stream = tweepy.streaming.Stream(self.auth,l, secure=True)
        stream.filter(follow=None, track=hashtag)
if __name__ == "__main__":
    Twitter().start()
But what exactly I'm getting if I use python-twitter's api.GetSearch()? For example:
def t_auth(self):
    consumer_key=
    consumer_secret=
    access_key =
    access_secret =
    self.api = twitter.Api(consumer_key, consumer_secret ,access_key, access_secret)
    self.api.VerifyCredentials()
    return self.api
self.tweets = []
self.tweets.extend(self.api.GetSearch(self.hashtag, per_page=10))
Imagine that I put last line in an infinite while loop, will I get the same result as in the first example? What's the difference between those two?
Here's my insight.
The first example with tweepy stream is a use case of twitter streaming API.
The second example using python-twitter is a use case of twitter search API.
So, I understand this question as: Should I use twitter regular search API or Streaming API?
It depends, but, long story short, if you want a truly real-time picture, you should use streaming.
I don't have enough experience to explain the pros and cons of both approaches, so I'll just refer you to:
Aggregating tweets: Search API vs. Streaming API
Search API vs Streaming API
Streaming API vs Rest API?
Hope that helps.
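To illustrate why looping over a search call is not the same as streaming: each search call returns a snapshot of tweets that already exist, so consecutive calls overlap and the loop has to discard what it has already seen. Here is a rough sketch of that bookkeeping, where new_tweets_only is a hypothetical helper and the hand-written pages stand in for the results of repeated search calls:

```python
def new_tweets_only(pages):
    """Yield tweets from successive search results, skipping IDs already seen."""
    seen_ids = set()
    for page in pages:
        for tweet in page:
            if tweet['id'] not in seen_ids:
                seen_ids.add(tweet['id'])
                yield tweet


# Three consecutive "search results" that overlap, as repeated polling would.
pages = [
    [{'id': 1, 'text': 'first'}, {'id': 2, 'text': 'second'}],
    [{'id': 2, 'text': 'second'}, {'id': 3, 'text': 'third'}],
    [{'id': 3, 'text': 'third'}],
]

for tweet in new_tweets_only(pages):
    print('%d %s' % (tweet['id'], tweet['text']))
```

A stream, by contrast, delivers matching tweets over a long-lived connection as they are posted, so this kind of de-duplication between polls should not be needed.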

Twitter Streaming API with Tweepy rejects oauth

I'm trying to access the Twitter stream which I had working previously while improperly using Tweepy. Now that I understand how Tweepy is intended to be used I wrote the following Stream.py module. When I run it, I get error code 401 which tells me my auth has been rejected. But I had it working earlier with the same consumer token and secret. Any ideas?
from tweepy.streaming import StreamListener
from tweepy import OAuthHandler
from tweepy import Stream
from tweepy import TweepError
from tweepy import error
#Removed. I have real keys and tokens
consumer_key = "***"
consumer_secret = "***"
access_token="***"
access_token_secret="***"
class CustomListener(StreamListener):
    """ A listener handles tweets that are received from the stream.
    This is a basic listener that just prints received tweets to stdout."""
    def on_status(self, status):
        # Do things with the post received. Post is the status object.
        print status.text
        return True
    def on_error(self, status_code):
        # If error thrown during streaming.
        # Check here for meaning:
        # https://dev.twitter.com/docs/error-codes-responses
        print "ERROR: ",; print status_code
        return True
    def on_timeout(self):
        # If no post received for too long
        return True
    def on_limit(self, track):
        # If too many posts match our filter criteria and only a subset is
        # sent to us
        return True
    def filter(self, track_list):
        while True:
            try:
                self.stream.filter(track=track_list)
            except error.TweepError as e:
                raise TweepError(e)
def go(self):
    listener = CustomListener()
    auth = OAuthHandler(consumer_key, consumer_secret)
    self.stream = Stream(auth,listener,timeout=3600)
    listener.filter(['LOL'])
if __name__ == '__main__':
    go(CustomListener)
For anyone who happens to have the same issue, I should have added this line after auth was initialized:
auth.set_access_token(access_token, access_token_secret)
