I was just wondering if it's possible to make a scanner with tweepy - for instance, a while loop that is constantly searching for certain words. I'm a trader and would find it very useful in case there is any breaking news.
Example:
I want to set my scanner to constantly return tweets that have '$DB' in them. Furthermore, I only want to return tweets of users that have > 5k followers.
Any advice or pointers would be helpful! Thanks.
Edit/Update: As discussed by asongtoruin and qorka, the question asks for new tweets, not existing tweets. Previous edit used api.search method which finds only existing messages. The StreamListener reads new messages.
import tweepy
from tweepy import OAuthHandler, Stream
from tweepy.streaming import StreamListener  # Tweepy 3.x streaming API

access_token='your_api_token'
access_secret='your_api_access_secret'
consumer_key = 'your_api_key'
consumer_secret = 'your_api_secret'

auth = OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_secret)
api = tweepy.API(auth)

class MyListener(StreamListener):
    def on_status(self, status):
        try:
            # Only report tweets from accounts with more than 5k followers
            if status.user.followers_count > 5000:
                print('%s (%s at %s, followers: %d)' % (status.text, status.user.screen_name, status.created_at, status.user.followers_count))
            return True
        except BaseException as e:
            print("Error on_status: %s" % str(e))
        return True

    def on_error(self, status):
        print(status)
        return True

twitter_stream = Stream(auth, MyListener())
twitter_stream.filter(track=['$DB','$MS','$C'])
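The filtering step inside on_status can be pulled out into a plain function, which makes it easy to try without a Twitter connection. This is only a sketch: `should_report` is a hypothetical helper, not part of Tweepy, and it assumes you pass in the tweet text and follower count yourself.

```python
def should_report(text, followers, symbols, min_followers=5000):
    """Return True if the author has more than min_followers followers
    and the text mentions at least one of the tracked symbols."""
    if followers <= min_followers:
        return False
    # Whitespace-separated tokens only; punctuation glued to a symbol
    # (for example "$DB,") is not stripped in this sketch.
    words = text.upper().split()
    return any(symbol.upper() in words for symbol in symbols)


print(should_report('Breaking: $DB shares fall', 12000, ['$DB', '$MS']))  # True
print(should_report('Breaking: $DB shares fall', 300, ['$DB', '$MS']))    # False
```

Inside a listener you would call it with status.text and status.user.followers_count.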
Related
I am a beginner at Python. I'm trying to get the follower count of a given user handle from Twitter. The issue is that tweepy is not connecting to Twitter and is not even returning an error. The terminal just stays blank. Please help.
import tweepy
import pymysql
import time
#insert your Twitter keys here
consumer_key =''
consumer_secret=''
access_token=''
access_secret=''
auth = tweepy.auth.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_secret)
api = tweepy.API(auth)
global conn
conn=pymysql.connect(db='twitter', user='root' , host= 'localhost' , port=3307)
global cursor
cursor=conn.cursor()
print("entering loop")
while True:
    query=cursor.execute("select twitter_name from timj_users where found_followers is null and twitter_name is not null order by id asc limit 1")
    if query>0:
        results=cursor.fetchone()
        timj_handle=results[0]
        user = tweepy.Cursor(api.followers, screen_name=timj_handle).items()
        try:
            followers=user.follower_count
            location=user.location
            cursor.execute("update timj_users set followers=%s,location=%s,found_followers=1 where twitter_name=%s" , (followers, location ,handle))
            conn.commit()
            print("user followers received")
            if followers>100:
                user.follow()
                cursor.execute("update users set followed=1 where twitter_name=%s" , (handle))
                conn.commit()
                print("User followed")
        except:
            time.sleep(15*60)
            print 'We got a timeout ... Sleeping for 15 minutes'
    else:
        print("All users processed")
        break
If you're not getting an error from Python and the console is just "hanging", you did actually connect to Twitter, but since nothing in your code displays the messages you get from Twitter, you won't see anything.
You need to include this in the code:
def on_error(self, status_code):
    print(status_code)
That code will give you the number that corresponds to one of Twitter's Error Codes & Responses.
To be more clear:
except:
    time.sleep(15*60)
    print 'We got a timeout ... Sleeping for 15 minutes'
That except clause is not doing what you think it is. A bare except catches every exception raised inside the try block, including errors in the code you're writing, not only errors you get from Twitter.
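To illustrate the point about the except clause, here is a minimal sketch that catches only one specific exception type and prints its message before sleeping, so the console never looks frozen. `RateLimited` and `fetch_with_retry` are hypothetical names made up for this example, not part of Tweepy; substitute whatever exception your client library actually raises.

```python
import time


class RateLimited(Exception):
    """Hypothetical stand-in for a client library's rate-limit error."""


def fetch_with_retry(fetch, wait_seconds=15 * 60, max_attempts=3, sleep=time.sleep):
    """Call fetch(); retry only on RateLimited and let every other error propagate."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except RateLimited:
            # Report first, then sleep, so the wait is visible on the console.
            print('Rate limited (attempt %d), sleeping %d s' % (attempt, wait_seconds))
            sleep(wait_seconds)
    raise RuntimeError('gave up after %d attempts' % max_attempts)
```

With this shape, a bug such as reading a missing attribute raises immediately instead of being mistaken for a timeout.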
It looks like this line
auth = tweepy.auth.OAuthHandler(consumer_key, consumer_secret)
should be
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
However, you have written a really complicated program for a beginner. Can I suggest that you run a more basic Tweepy program and see what you get?
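import tweepy
# consumer_key, consumer_secret, access_token and access_token_secret
# must be set to your own credentials before this point
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)
public_tweets = api.home_timeline()
for tweet in public_tweets:
    print(tweet.text)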
(From the Tweepy Documentation)
I am relatively new to tweepy python library.
I want to be sure that my Python stream script keeps running on a remote server, so it would be great if someone could share best practices for making that happen.
Right now I am doing it this way:
if __name__ == '__main__':
    while True:
        try:
            # create instance of the tweepy tweet stream listener
            listener = TweetStreamListener()
            # set twitter keys/tokens
            auth = OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
            auth.set_access_token(ACCESS_TOKEN, ACCESS_TOKEN_SECRET)
            # create instance of the tweepy stream
            stream = Stream(auth, listener)
            stream.userstream()
        except Exception as e:
            print "Error. Restarting Stream.... Error: "
            print e.__doc__
            print e.message
            time.sleep(5)
And I return False on each of the methods: on_error(), on_disconnect(), on_timeout().
So, by returning False the stream stops and then reconnects in the infinite loop.
Here's how I do mine and it's been running for almost a year, on two computers to handle the errors that stop the stream here and there.
# They don't need to be in the loop.
auth = OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_TOKEN, ACCESS_TOKEN_SECRET)

while True:
    listener = TweetStreamListener()
    stream = Stream(auth, listener, timeout=60)
    try:
        stream.userstream()
    except Exception as e:
        print("Error. Restarting Stream.... Error: ")
        print(e.__doc__)
        print(e)
To make sure that it runs forever, you should redefine the on_error method to control the time between reconnection attempts. Your 5-second sleep will hurt your chances of a successful reconnect, because Twitter will see that you tried too frequently. But that's another question.
Just my two cents.
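The point about spacing out reconnection attempts can be sketched as an exponential backoff schedule. This is only an illustration: `backoff_delays` is a hypothetical helper, and the starting delay, factor and cap are assumed values, not figures taken from Twitter's documentation.

```python
def backoff_delays(initial=5.0, factor=2.0, maximum=320.0):
    """Yield an endless series of wait times that grow until they hit the cap."""
    delay = initial
    while True:
        yield min(delay, maximum)
        delay = min(delay * factor, maximum)


delays = backoff_delays()
print([next(delays) for _ in range(4)])  # [5.0, 10.0, 20.0, 40.0]
```

In a reconnect loop you would sleep for next(delays) after each failed attempt and create a fresh generator once a connection succeeds.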
I received lots of Error 420 responses, which was weird because I wasn't asking the streaming API for too many keywords.
I figured out that the on_data() method of the stream listener class must always return True.
Mine sometimes returned False, so tweepy cut the connection and, because it was in a loop, recreated it immediately. Twitter didn't like that much...
I've also resolved the problem by creating a new stream recursively on exceptions.
Here is my complete code. Just change the mytrack variable, put in your keys and run it using pm2 or python.
from tweepy import OAuthHandler, Stream, StreamListener
import json

mytrack = ['netmine', 'bitkhar', 'bitcoin']
consumer_key = ""
consumer_secret = ""
access_token = ""
access_token_secret = ""

auth = OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)

class StdOutListener(StreamListener):
    def __init__(self, listener, track_list, repeat_times):
        super(StdOutListener, self).__init__()
        self.repeat_times = repeat_times
        self.track_list = track_list
        print('************** initialized : #', self.repeat_times)

    def on_data(self, data):
        print(self.repeat_times, 'tweet id : ', json.loads(data)['id'])
        # always return True so tweepy keeps the connection open
        return True

    def on_exception(self, exception):
        print('exception', exception)
        new_stream(auth, self.track_list, self.repeat_times+1)

    def on_error(self, status):
        print('err', status)
        if status == 420:
            # returning False in on_error disconnects the stream
            return False

def new_stream(auth, track_list, repeat_times):
    listener = StdOutListener(StreamListener, track_list, repeat_times)
    # filter() with is_async=True returns immediately, so there is nothing to keep
    Stream(auth, listener).filter(track=track_list, is_async=True)

new_stream(auth, mytrack, repeat_times=0)
I have the following code where I have made some amendments to the class 'CustomStreamListener':
import sys
import tweepy
consumer_key=""
consumer_secret=""
access_key = ""
access_secret = ""
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_key, access_secret)
api = tweepy.API(auth)
class CustomStreamListener(tweepy.StreamListener):
    def on_status(self, status):
        for hashtag in status.entities['hashtags']:
            if hashtag == 'turndownforwhat':
                print(hashtag['text'])
                print status.text
    def on_error(self, status_code):
        print >> sys.stderr, 'Encountered error with status code:', status_code
        return True # Don't kill the stream
    def on_timeout(self):
        print >> sys.stderr, 'Timeout...'
        return True # Don't kill the stream
sapi = tweepy.streaming.Stream(auth, CustomStreamListener())
sapi.filter(locations=[-122.75,36.8,-121.75,37.8])
The bit I have added is everything within the class from the 'for' statement onwards. What I am trying to do is filter by the text values of the hashtags within text messages and then use some of the standard tweepy filters further down to filter by geolocation.
This has been built in Python 2.7. With my amendments the code does not error however it just hangs with no tweets coming through. Have I put a logical error in somewhere that I have missed?
Thanks
The code has an error in the "if hashtag" condition.
It should be:
if hashtag['text'] == 'turndownforwhat':
You may need to wait a while to find a tweet that shows up, but if you use a bigger bounding box and a trending hashtag you will see results with this modification.
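As a small illustration of why the original comparison never matched: each entry in entities['hashtags'] is a dictionary, so it has to be indexed by 'text' before comparing. The sketch below uses a hand-written sample payload and a hypothetical helper `has_hashtag`; the case-insensitive comparison is an added assumption, not something the original code did.

```python
def has_hashtag(entities, wanted):
    """Return True if any hashtag entity's text equals `wanted`, ignoring case."""
    return any(tag['text'].lower() == wanted.lower()
               for tag in entities.get('hashtags', []))


# Hand-written sample shaped like the 'entities' field of a tweet.
sample = {'hashtags': [{'text': 'TurnDownForWhat', 'indices': [0, 16]}]}

print(has_hashtag(sample, 'turndownforwhat'))      # True
print(sample['hashtags'][0] == 'turndownforwhat')  # False: a dict never equals a str
```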
I'm very new to twitter api, please help me understand the difference between two things.
As far as I understand I can get real-time tweets by using tweepy for example :
hashtag = ['justinbieber']
class CustomStreamListener(tweepy.StreamListener):
    def on_status(self, status):
        try:
            data = status.__getstate__()
            print data
            output.write("%s\n "% data)
        except Exception, e:
            print >> sys.stderr, 'Encountered Exception:', e
            pass
    def on_error(self, status_code):
        print >> sys.stderr, 'Encountered error with status code:', status_code
        return True # Don't kill the stream
    def on_timeout(self):
        print >> sys.stderr, 'Timeout...'
        return True # Don't kill the stream
class Twitter():
    def __init__(self):
        consumer_key=
        consumer_secret=
        access_key =
        access_secret =
        self.auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
        self.auth.set_access_token(access_key, access_secret)
        self.api = tweepy.API(self.auth)
    def start(self):
        l = CustomStreamListener()
        stream = tweepy.streaming.Stream(self.auth,l, secure=True)
        stream.filter(follow=None, track=hashtag)
if __name__ == "__main__":
    Twitter().start()
But what exactly am I getting if I use python-twitter's api.GetSearch()? For example:
def t_auth(self):
    consumer_key=
    consumer_secret=
    access_key =
    access_secret =
    self.api = twitter.Api(consumer_key, consumer_secret ,access_key, access_secret)
    self.api.VerifyCredentials()
    return self.api
self.tweets = []
self.tweets.extend(self.api.GetSearch(self.hashtag, per_page=10))
Imagine that I put the last line in an infinite while loop. Will I get the same result as in the first example? What's the difference between the two?
Here's my insight.
The first example with tweepy stream is a use case of twitter streaming API.
The second example using python-twitter is a use case of twitter search API.
So, I understand this question as: Should I use twitter regular search API or Streaming API?
It depends, but, long story short, if you want to see the real real-time picture - you should use streaming.
I don't have enough experience to explain the pros and cons of both approaches, so I'll just refer you to:
Aggregating tweets: Search API vs. Streaming API
Search API vs Streaming API
Streaming API vs Rest API?
Hope that helps.
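To make the difference concrete: putting a search call in an infinite loop is polling, so each pass can return tweets you have already seen and you have to deduplicate them yourself, whereas a stream pushes each new tweet to you as it arrives. Below is a minimal sketch of the polling side. `poll_new` and the `search` callable are hypothetical stand-ins written for this example (no real Twitter client is called), and tweets are modeled as plain dicts with an 'id' key.

```python
def poll_new(search, since_id=0):
    """Run one polling pass; return (tweets newer than since_id, updated since_id)."""
    fresh = [tweet for tweet in search() if tweet['id'] > since_id]
    fresh.sort(key=lambda tweet: tweet['id'])
    if fresh:
        since_id = fresh[-1]['id']
    return fresh, since_id


# Fake search results standing in for two consecutive calls to a search API.
pages = [
    [{'id': 2, 'text': 'b'}, {'id': 1, 'text': 'a'}],
    [{'id': 3, 'text': 'c'}, {'id': 2, 'text': 'b'}],  # id 2 comes back again
]
calls = iter(pages)

first, last_seen = poll_new(lambda: next(calls))
second, last_seen = poll_new(lambda: next(calls), last_seen)
print([t['id'] for t in first])   # [1, 2]
print([t['id'] for t in second])  # [3]
```

Anything posted and pushed out of the search results between two passes would be missed by this loop, which is one reason streaming is preferred for a real-time picture.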
I'm trying to access the Twitter stream which I had working previously while improperly using Tweepy. Now that I understand how Tweepy is intended to be used I wrote the following Stream.py module. When I run it, I get error code 401 which tells me my auth has been rejected. But I had it working earlier with the same consumer token and secret. Any ideas?
from tweepy.streaming import StreamListener
from tweepy import OAuthHandler
from tweepy import Stream
from tweepy import TweepError
from tweepy import error
#Removed. I have real keys and tokens
consumer_key = "***"
consumer_secret = "***"
access_token="***"
access_token_secret="***"
class CustomListener(StreamListener):
    """ A listener handles tweets that are received from the stream.
    This is a basic listener that just prints received tweets to stdout."""
    def on_status(self, status):
        # Do things with the post received. Post is the status object.
        print status.text
        return True
    def on_error(self, status_code):
        # If error thrown during streaming.
        # Check here for meaning:
        # https://dev.twitter.com/docs/error-codes-responses
        print "ERROR: ",; print status_code
        return True
    def on_timeout(self):
        # If no post received for too long
        return True
    def on_limit(self, track):
        # If too many posts match our filter criteria and only a subset is
        # sent to us
        return True
    def filter(self, track_list):
        while True:
            try:
                self.stream.filter(track=track_list)
            except error.TweepError as e:
                raise TweepError(e)
def go(self):
    listener = CustomListener()
    auth = OAuthHandler(consumer_key, consumer_secret)
    self.stream = Stream(auth,listener,timeout=3600)
    listener.filter(['LOL'])
if __name__ == '__main__':
    go(CustomListener)
For anyone who happens to have the same issue, I should have added this line after auth was initialized:
auth.set_access_token(access_token, access_token_secret)