I am trying to search Twitter using Tython, but it seems that the library has a 140-character limitation. With Twitter's new feature, i.e. 280-character tweets, what can one do?
This is not a limitation of Twython. The Twitter API by default returns the old 140-character limited tweet. In order to see the newer extended tweet you just need to supply this parameter to your search query:
tweet_mode=extended
Then, you will find the 280-character extended tweet in the full_text field of the returned tweet.
I use another library (TwitterAPI), but I think you would do something like this using Twython:
results = api.search(q='pizza', tweet_mode='extended')
for result in results['statuses']:
    print(result['full_text'])
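For completeness, here is a minimal self-contained sketch of the same idea with Twython; the credential strings are placeholders and the query is just an example:
from twython import Twython

# Placeholder credentials; substitute your own app keys and tokens
twitter = Twython('APP_KEY', 'APP_SECRET', 'OAUTH_TOKEN', 'OAUTH_TOKEN_SECRET')

# tweet_mode='extended' asks the v1.1 search API for the full-length tweet text
results = twitter.search(q='pizza', tweet_mode='extended')
for result in results['statuses']:
    # Extended tweets carry their text in 'full_text' instead of 'text'
    print(result['full_text'])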
Unfortunately, I am unable to find anything related to "Tython". However, if searching Twitter data (in this case posts) and/or gathering metadata is your goal, I would recommend having a look at the TwitterSearch library.
Here is a quick example from the provided link that searches for tweets containing the words Guttenberg and Doktorarbeit.
from TwitterSearch import *
try:
    tso = TwitterSearchOrder()  # create a TwitterSearchOrder object
    tso.set_keywords(['Guttenberg', 'Doktorarbeit'])  # let's define all words we would like to have a look for
    tso.set_language('de')  # we want to see German tweets only
    tso.set_include_entities(False)  # and don't give us all those entity information
    # it's about time to create a TwitterSearch object with our secret tokens (API auth credentials)
    ts = TwitterSearch(
        consumer_key='aaabbb',
        consumer_secret='cccddd',
        access_token='111222',
        access_token_secret='333444'
    )
    # this is where the fun actually starts :)
    for tweet in ts.search_tweets_iterable(tso):
        print('#%s tweeted: %s' % (tweet['user']['screen_name'], tweet['text']))
except TwitterSearchException as e:  # take care of all those ugly errors if there are some
    print(e)
import tweepy as tw
client = tw.Client(bearer_token='')
I want to filter the tweet data using a location search query that I found in Twitter's API docs.
I tried using bounding_box as a location filter, but it outputs an error every time I use it: There were errors processing your request: no viable alternative at input '119.19,4.94,127.31,19.38'
query = 'crypto -is:retweet bounding_box:[119.19,4.94,127.31,19.38]'
for tweet in tw.Paginator(client.search_recent_tweets, query=query,
                          tweet_fields=['created_at'], max_results=100).flatten(limit=1000):
    print(tweet.text, tweet.created_at)
Is there a way to filter the tweet data by location?
Replace commas with spaces:
query = 'crypto -is:retweet bounding_box:[119.19 4.94 127.31 19.38]'
...as stated in the Twitter API docs:
bounding_box:[west_long south_lat east_long north_lat]
Example: bounding_box:[-105.301758 39.964069 -105.178505 40.09455]
Rule arguments are contained within brackets, space delimited.
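Putting the corrected operator back into the Paginator loop from the question would look roughly like this (the bearer token is a placeholder):
import tweepy as tw

client = tw.Client(bearer_token='YOUR_BEARER_TOKEN')  # placeholder token

# bounding_box values are space delimited: west_long south_lat east_long north_lat
query = 'crypto -is:retweet bounding_box:[119.19 4.94 127.31 19.38]'

for tweet in tw.Paginator(client.search_recent_tweets, query=query,
                          tweet_fields=['created_at'], max_results=100).flatten(limit=1000):
    print(tweet.text, tweet.created_at)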
I'm trying to identify interesting people to follow on Twitter. To do this, I want to find users who post a tweet containing various keywords and then filter out users whose bios don't contain certain keywords.
I'm using the following code to identify the tweets, and then automatically follow the users who tweeted them:
from twython import Twython, TwythonError

naughty_words = ["example"]
good_words = ["example", "example"]
filter = " OR ".join(good_words)
blacklist = " -".join(naughty_words)
keywords = filter + blacklist

twitter = Twython(consumer_key, consumer_secret, access_token,
                  access_token_secret)
search_results = twitter.search(q=keywords, count=10)

try:
    for tweet in search_results["statuses"]:
        try:
            st = tweet["entities"]["user_mentions"]
            if st != []:
                twitter.create_friendship(screen_name=st[0]["screen_name"])
        except TwythonError as e:
            print(e)
except TwythonError as e:
    print(e)
This code is working great, but I want to filter my results more, as this method returns a lot of users that I don't want to follow! Does anyone know how I could amend this to include a second filter that looks at users' bios?
According to the Twitter docs, you can search for users based on a query string. However, if I check the Twython API documentation, it seems that this call is not directly supported. Tweepy, on the other hand, provides a corresponding method, API.search_users; see here.
I don't think you can search for users and tweets in one request. So you might have to stick with your current tweet search and check each tweet to see whether you have already encountered its user. If not, get the user's profile and check whether it satisfies your conditions (probably in batches of users to limit the number of API calls).
Edit: You can probably use Twython to search for users as well. While it does not provide a dedicated method, it provides a generic get method that lets you call any endpoint. So it might look something like:
twitter.get('users/search', params={'q': 'soccer music -sex -porn'})
I haven't tried it myself, but that's what I can get from the Twython Docs.
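To illustrate the per-tweet approach described above, here is a rough sketch that checks the tweet author's bio before following them. It relies on the v1.1 search results embedding each author's user object; the credentials and keyword lists are placeholders, and it follows the tweet's author rather than the first mentioned user, so adapt as needed:
from twython import Twython, TwythonError

twitter = Twython('CONSUMER_KEY', 'CONSUMER_SECRET', 'ACCESS_TOKEN', 'ACCESS_TOKEN_SECRET')  # placeholders

keywords = 'example OR example -example'   # query string built as in the question
bio_keywords = ['python', 'data']           # hypothetical words you want to see in a bio

followed = set()
try:
    search_results = twitter.search(q=keywords, count=10)
    for tweet in search_results['statuses']:
        user = tweet['user']                # author of the tweet, embedded in the search result
        if user['screen_name'] in followed:
            continue
        bio = (user.get('description') or '').lower()
        # Second filter: only follow authors whose bio contains one of the keywords
        if any(word in bio for word in bio_keywords):
            twitter.create_friendship(screen_name=user['screen_name'])
            followed.add(user['screen_name'])
except TwythonError as e:
    print(e)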
I am using Twython to get a stream of tweets. I followed this tutorial, except that I am not using GPIO.
My code is the following:
import time
from twython import TwythonStreamer
TERMS='#stackoverflow'
APP_KEY='MY APP KEY'
APP_SECRET='MY APP SECRET'
OAUTH_TOKEN='MY OAUTH TOKEN'
OAUTH_TOKEN_SECRET='MY OAUTH TOKEN SECRET'
class BlinkyStreamer(TwythonStreamer):
    def on_success(self, data):
        if 'text' in data:
            print(data['text'].encode('utf-8'))

try:
    stream = BlinkyStreamer(APP_KEY, APP_SECRET, OAUTH_TOKEN, OAUTH_TOKEN_SECRET)
    stream.statuses.filter(track=TERMS)
except KeyboardInterrupt:
    pass
This outputs a stream of all tweets containing #stackoverflow. But I want to output a tweet only if it is from a certain user, e.g. @StackStatus.
I am running this on a Raspberry Pi.
How would I do that? Any help is appreciated!
Edit: if there is another (or easier) way to execute some script when a new tweet is posted by a certain user, please let me know; that would solve my question as well!
The 'follow' parameter does not work as stated above by teknoboy. Correct usage is with the user's ID, not their screen name. You can get user IDs using http://gettwitterid.com/.
The third parameter available is location; you can use 1, 2, or 3 of them as desired. They are combined with OR, not AND.
Example Usage:
SearchTerm = 'abracadabra' # If spaces are included, they act as 'OR', i.e. it finds tweets with any one of the words, not the whole string.
Tweeter = '25073877' # This is Donald Trump, finds tweets from him or mentioning him
Place = '"47.405,-177.296,1mi"' # Sent from within 1 mile of Lat, Long
stream.statuses.filter(track=SearchTerm, follow=Tweeter, location=Place)
You should supply the filter with the follow parameter to stream a specific user's tweets.
If you wish to follow only one user, you can define
FOLLOW='StackStatus'
and change the appropriate line to
stream.statuses.filter(track=TERMS, follow=FOLLOW)
If you wish to see all of the user's tweets, regardless of keyword, you can omit the track parameter:
stream.statuses.filter(follow=FOLLOW)
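Tying the two answers together, a sketch that first resolves the screen name to a numeric user ID with the REST API and then streams that user's tweets could look like this; the credentials are placeholders:
from twython import Twython, TwythonStreamer

APP_KEY = 'MY APP KEY'                 # placeholders
APP_SECRET = 'MY APP SECRET'
OAUTH_TOKEN = 'MY OAUTH TOKEN'
OAUTH_TOKEN_SECRET = 'MY OAUTH TOKEN SECRET'

# Resolve the screen name to a numeric user ID via the REST API
rest = Twython(APP_KEY, APP_SECRET, OAUTH_TOKEN, OAUTH_TOKEN_SECRET)
user_id = rest.show_user(screen_name='StackStatus')['id_str']

class UserStreamer(TwythonStreamer):
    def on_success(self, data):
        # The follow stream also delivers retweets of and replies to the user,
        # so keep only tweets actually authored by them
        if 'text' in data and data.get('user', {}).get('id_str') == user_id:
            print(data['text'])

stream = UserStreamer(APP_KEY, APP_SECRET, OAUTH_TOKEN, OAUTH_TOKEN_SECRET)
stream.statuses.filter(follow=user_id)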
I'm new to the tweepy library. I am able to capture a twitter stream if I use a filter like the one shown below, looking for tweets containing the word snow in the text field.
import tweepy
from tweepy import OAuthHandler, Stream

ckey = ''
csecret = ''
atoken = ''
asecret = ''

auth = OAuthHandler(ckey, csecret)
auth.set_access_token(atoken, asecret)
# listener is my stream listener class (definition not shown here)
twitterStream = Stream(auth, listener())
twitterStream.filter(track=["snow"])
However, I don't know how to capture all tweets without doing any filtering. If I leave off the last line of the above code, the program runs, but I don't get any tweets. If I change the track parameter to track=[] or track=[""], I receive an error code of 406 from the Twitter API.
I am using Python 3.4.2.
You can use twitterStream.sample() as the last line for that. It will fetch a random sample of public tweets for you.
Don't do a search by location: it would exclude tweets that don't have geolocation enabled. What I suggest is to use stream.sample() instead. It doesn't have any required parameters. See the tweepy docs for more info.
The streaming API can only be used with at least one predicate parameter, as specified in the Twitter documentation.
Luckily, there's also a locations parameter, and you can pass the value [-180,-90,180,90] to get tweets from every point on earth.
So, in your snippet above, the last line should be:
twitterStream.filter(locations=[-180,-90,180,90])
The only filtering you'll get is that the user must not have turned geotagging off.
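For reference, here is a minimal self-contained sketch using the tweepy 3.x streaming interface the question appears to be based on; the listener class name is illustrative:
import tweepy

ckey, csecret, atoken, asecret = '', '', '', ''   # your credentials

class PrintListener(tweepy.StreamListener):       # tweepy 3.x API
    def on_status(self, status):
        print(status.text)

auth = tweepy.OAuthHandler(ckey, csecret)
auth.set_access_token(atoken, asecret)
twitterStream = tweepy.Stream(auth, PrintListener())

# Either take a random sample of the public stream...
# twitterStream.sample()
# ...or use a world-spanning bounding box (geotagged tweets only)
twitterStream.filter(locations=[-180, -90, 180, 90])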
Twitter only returns 100 tweets per "page" when returning search results on the API. They provide the max_id and since_id in the returned search_metadata that can be used as parameters to get earlier/later tweets.
Twython 3.1.2 documentation suggests that this pattern is the "old way" to search:
results = twitter.search(q="xbox", count=423, max_id=421482533256044543)
for tweet in results['statuses']:
    ... do something
and that this is the "new way":
results = twitter.cursor(t.search, q='xbox', count=375)
for tweet in results:
    ... do something
When I do the latter, it appears to endlessly iterate over the same search results. I'm trying to push them to a CSV file, but it pushes a ton of duplicates.
What is the proper way to search for a large number of tweets, with Twython, and iterate through the set of unique results?
Edit: Another issue here is that when I try to iterate with the generator (for tweet in results:), it loops repeatedly, without stopping. Ah -- this is a bug... https://github.com/ryanmcgrath/twython/issues/300
I had the same problem, but it seems that you should just loop through a user's timeline in batches using the max_id parameter. The batches should be 100 as per Terence's answer (although for user_timeline the max count is actually 200); just set max_id to the ID of the last tweet in the previous batch minus one (because max_id is inclusive). Here's the code:
'''
Get all tweets from a given user.
Batch size of 200 is the max for user_timeline.
'''
from twython import Twython, TwythonError

tweets = []
# Requires Authentication as of Twitter API v1.1
twitter = Twython(PUT YOUR TWITTER KEYS HERE!)

try:
    user_timeline = twitter.get_user_timeline(screen_name='eugenebann', count=200)
except TwythonError as e:
    print(e)

print(len(user_timeline))
for tweet in user_timeline:
    # Add whatever you want from the tweet, here we just add the text
    tweets.append(tweet['text'])

# Count could be less than 200, see:
# https://dev.twitter.com/discussions/7513
while len(user_timeline) != 0:
    try:
        user_timeline = twitter.get_user_timeline(screen_name='eugenebann', count=200, max_id=user_timeline[len(user_timeline)-1]['id']-1)
    except TwythonError as e:
        print(e)
    print(len(user_timeline))
    for tweet in user_timeline:
        # Add whatever you want from the tweet, here we just add the text
        tweets.append(tweet['text'])

# Number of tweets the user has made
print(len(tweets))
As per the official Twitter API documentation:
count (optional): The number of tweets to return per page, up to a maximum of 100.
You need to make repeated calls to the Python method. However, there is no guarantee that these will be the next N tweets, and if tweets are coming in quickly it might miss some.
If you want all the tweets in a time frame you can use the streaming API: https://dev.twitter.com/docs/streaming-apis and combine this with the oauth2 module.
How can I consume tweets from Twitter's streaming api and store them in mongodb
python-twitter streaming api support/example
Disclaimer: I have not actually tried this.
As a solution to the problem of returning 100 tweets for a search query using Twython, here is the link showing how it can be done using the "old way":
Twython search API with next_results
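For reference, a sketch of that "old way" (max_id-based paging) applied to search might look like this; the credentials and query are placeholders:
from twython import Twython, TwythonError

twitter = Twython('APP_KEY', 'APP_SECRET', 'OAUTH_TOKEN', 'OAUTH_TOKEN_SECRET')  # placeholders

all_tweets = []
max_id = None
while True:
    try:
        params = {'q': 'xbox', 'count': 100}
        if max_id is not None:
            params['max_id'] = max_id
        results = twitter.search(**params)
    except TwythonError as e:
        print(e)
        break
    statuses = results['statuses']
    if not statuses:
        break
    all_tweets.extend(statuses)
    # Page backwards: ask for tweets strictly older than the oldest one seen so far
    max_id = statuses[-1]['id'] - 1

print(len(all_tweets))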