TwitterAPI: how to stream multiple users by ID - python

I'm streaming all tweets that mention one of the usernames (screen_name) that I have in a list (TRACK_TERM).
from TwitterAPI import TwitterAPI
api = TwitterAPI(CONSUMER_KEY, CONSUMER_SECRET, ACCESS_KEY, ACCESS_SECRET)
TRACK_TERM = ['#CNN', '#FoxNews', '#FOXTV', '#BBC'... + 500]
r = api.request('statuses/filter', {'track': TRACK_TERM})
My problem is that users sometimes change their screen_name, and this script will run continuously for a month. So I was wondering if there's a way to track users' mentions by their user ID instead of their screen_name.
I'm using TwitterAPI; I have also tried twython.

Instead of the track parameter try using the follow parameter.
USER_IDS = '%d,%d,%d' % (ID1,ID2,ID3)
r = api.request('statuses/filter', {'follow': USER_IDS})
The docs are here.
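When the IDs live in a list rather than in three separate variables, the comma-separated value can be built with a join. This is only a sketch: build_follow_param and the example IDs are hypothetical names introduced for illustration, not part of TwitterAPI.

```python
def build_follow_param(user_ids):
    """Join user IDs into the comma-separated string used for 'follow'."""
    # int() rejects anything that is not a numeric ID before it is sent.
    return ",".join(str(int(uid)) for uid in user_ids)


# Hypothetical example IDs, not real accounts.
EXAMPLE_IDS = [111, 222, 333]
follow_value = build_follow_param(EXAMPLE_IDS)
print(follow_value)
```

The result would then be passed exactly as in the answer above: api.request('statuses/filter', {'follow': follow_value}).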

Related

Tweepy with stream hangs for follow tag

I am trying to run the Tweepy StreamListener to follow users' Tweets on Twitter.
When I use the track keyword it works, but when I add follow='userid' it hangs. Am I doing anything wrong?
stream_listener = StreamListener()
auth = OAuthHandler("", "")
auth.set_access_token("", "")
stream = Stream(auth=auth, listener=stream_listener)
#stream.filter(follow="")
api = tweepy.API(auth)
screen_name = "ThetaWarrior"
user = api.get_user(screen_name)
api = tweepy.API(auth, wait_on_rate_limit=True)
print("User details:")
print(user.name)
print(user.description)
print(user.location)
print(user.id_str)
stream.filter(follow="98**************")
The follow parameter needs to be a list of user IDs, not a string, e.g.:
stream.filter(follow=["98**************"])
assuming "98**************" is an actual user ID.
See the Streaming with Tweepy section of the documentation for Tweepy v3.10 or the documentation for Stream.filter for the latest development version of Tweepy on the master branch, set to be released as v4.0. Also see the documentation for the POST statuses/filter endpoint.
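To guard against passing a bare string by accident, the IDs can be normalized before calling stream.filter. A minimal sketch, assuming the IDs arrive either as a single value or as an iterable; as_follow_list is a hypothetical helper, not a Tweepy function.

```python
def as_follow_list(ids):
    """Return user IDs as a list of strings, accepting one ID or many."""
    # A lone string or int is wrapped so it is not iterated character by character.
    if isinstance(ids, (str, int)):
        ids = [ids]
    return [str(uid) for uid in ids]


print(as_follow_list("123"))
print(as_follow_list([123, "456"]))
```

With this in place the call would look like stream.filter(follow=as_follow_list(user.id_str)).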

Applying filter to extract tweets on some hashtags from a particular country using python

I'm using the Python script below to collect tweets on #lockdownindia, #lockdownextension and #covid19 from my country (India) and perform sentiment analysis. I've used Cursor() from the tweepy library to do so. Using geocode: helps, but because the radius is imprecise I also get tweets from neighboring countries such as Pakistan, which I don't want. While reading the Twitter documentation, I came across place_country:, but it is not working: it returns an empty dataframe. Any help on how to use place_country: will be appreciated.
Also, is it possible to get all the attributes of a tweet in a single go, as happens when using the streaming API?
consumer_key = ''
consumer_secret = ''
access_token = ''
access_token_secret = ''
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth, wait_on_rate_limit=True)
query = '#lockdownindia OR #lockdownextension OR #covid19 -filter:retweets AND place_country:IN' #['-filter:retweets place_country:IN'] #geocode:20.5937,78.9629,910mi
max_tweets = 100
tweets = tweepy.Cursor(api.search, q=query, since = "2020-06-05", until = "2020-06-21",lang="en").items(max_tweets)
Hey there, I have checked the documentation and it seems you are not supposed to put place_country in your query. It's one of the attributes you will find in the JSON reply from Twitter. I suggest checking the country under the place attribute of each tweet's JSON and using a loop to check whether it's set to India.
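The loop suggested above could look roughly like this. It is a sketch that assumes each tweet is a dictionary parsed from Twitter's JSON, where place is either null or an object carrying country and country_code; tweets_from_country and the sample data are made up for illustration.

```python
def tweets_from_country(tweets, country_code="IN"):
    """Keep only tweets whose attached place has the given country code."""
    matching = []
    for tweet in tweets:
        place = tweet.get("place")  # None when no place is attached
        if place and place.get("country_code") == country_code:
            matching.append(tweet)
    return matching


# Made-up sample data shaped like parsed tweet JSON.
sample = [
    {"id": 1, "place": {"country": "India", "country_code": "IN"}},
    {"id": 2, "place": None},
    {"id": 3, "place": {"country": "Pakistan", "country_code": "PK"}},
]
print([t["id"] for t in tweets_from_country(sample)])
```

Tweets without an attached place are dropped by this filter, so the result may be much smaller than the search output.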
Hey, sorry I couldn't respond to this earlier. First off, you'd want to import the json, requests and time libraries using the following line.
import json, requests, time
These should help you make HTTP requests to the Twitter API and manipulate the JSON response. Next you'd want to set up your HTTP request. I would suggest storing the URL in a variable, like this.
myrequest = 'https://api.twitter.com/1.1/search/tweets.json?q=from%3Atwitterdev&result_type=mixed&count=2'
Lastly you'd make the GET request to the Twitter API. I am going to save the result in the variable response.
response = requests.request(method = "GET", url = myrequest)
Now you will have your response from the server. To access the JSON content of that response, use this command. I am going to save the JSON content in the variable result.
result = json.loads(response.content)
Now you should have the JSON content, and you can look inside it just as you would with a dictionary. I haven't used the Twitter API before, but I hope this helps. This is what I would use with other APIs.
Sidenote: Here is a link to how your http request should be from twitter. Best of Luck :)
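Rather than hand-encoding the query string in myrequest, it can be assembled from a dictionary with the standard library. This sketch only builds the URL and does not deal with authentication; build_search_url is a hypothetical helper name.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.twitter.com/1.1/search/tweets.json"


def build_search_url(query, result_type="mixed", count=2):
    """Return the search URL with the parameters percent-encoded."""
    params = {"q": query, "result_type": result_type, "count": count}
    return BASE_URL + "?" + urlencode(params)


myrequest = build_search_url("from:twitterdev")
print(myrequest)
```

For the same inputs this produces the URL used in the answer above, with the colon in from:twitterdev encoded as %3A.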

Python run script for every entry in a dictionary

I'm trying to write a simple python programme that uses the tweepy API for twitter and wget to retrieve the image link from a twitter post ID (Example: twitter.com/ExampleUsername/12345678), then download the image from the link. The actual programme works fine, but there is a problem. While it runs FOR every ID in the dictionary (if there are 2 IDs, it runs 2 times), it doesn't use every ID, so the script ends up looking at the last ID on the dictionary, then downloading the image from that same id however many times there is an ID in the dictionary. Does anyone know how to make the script run again for every ID?
tl;dr I want the programme to look at the first ID, grab its image link, download it, then do the same thing with the next ID until its done all of the IDs.
#!/usr/bin/env python
# encoding: utf-8
import tweepy #https://github.com/tweepy/tweepy
import wget
#Twitter API credentials
consumer_key = "nice try :)"
consumer_secret = "nice try :)"
access_key = "nice try :)"
access_secret = "my, this joke is getting really redundant"
def get_all_tweets():
    #authorize twitter, initialize tweepy
    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_key, access_secret)
    api = tweepy.API(auth)
    id_list = [1234567890, 0987654321]
    # Hey StackOverflow, these are example ID's. They won't work as they're not real twitter ID's, so if you're gonna run this yourself, you'll want to find some twitter IDs on your own
    # tweets = api.statuses_lookup(id_list)
    for i in id_list:
        tweets = []
        tweets.extend(api.statuses_lookup(id_=id_list, include_entities=True))
        for tweet in tweets:
            spacefiller = (1+1)
            # this is here so the loop runs, if it doesn't the app breaks
        a = len(tweets)
        print(tweet.entities['media'][0]['media_url'])
        url = tweet.entities['media'][0]['media_url']
        wget.download(url)
get_all_tweets()
Thanks,
~CS
I figured it out!
I knew that loop was being used for something...
I moved everything from a = len(tweets) to wget.download(url) into the for tweet in tweets: loop, and removed the for i in id_list: loop.
Thanks to tdelany this programme works now! Thanks everyone!
Here's the new code if anyone wants it:
#!/usr/bin/env python
# encoding: utf-8
import tweepy #https://github.com/tweepy/tweepy
import wget
#Twitter API credentials
consumer_key = "nice try :)"
consumer_secret = "nice try :)"
access_key = "nice try :)"
access_secret = "my, this joke is getting really redundant"
def get_all_tweets():
    #authorize twitter, initialize tweepy
    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_key, access_secret)
    api = tweepy.API(auth)
    id_list = [1234567890, 0987654321]
    # Hey StackOverflow, these are example ID's. They won't work as they're not real twitter ID's, so if you're gonna run this yourself, you'll want to find some twitter IDs on your own
    tweets = []
    tweets.extend(api.statuses_lookup(id_=id_list, include_entities=True))
    for tweet in tweets:
        a = len(tweets)
        print(tweet.entities['media'][0]['media_url'])
        url = tweet.entities['media'][0]['media_url']
        wget.download(url)
get_all_tweets()
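The per-tweet step in that loop can be isolated so it is easy to check without calling Twitter. A minimal sketch, assuming each element is an entities dictionary shaped like tweet.entities in the code above; first_media_urls is a hypothetical helper, and skipping tweets that have no media is an addition not present in the original script.

```python
def first_media_urls(entities_list):
    """Collect the first media_url of every tweet that has media attached."""
    urls = []
    for entities in entities_list:
        media = entities.get("media")  # key is absent for tweets without media
        if media:
            urls.append(media[0]["media_url"])
    return urls


# Made-up entities dictionaries standing in for tweet.entities.
sample_entities = [
    {"media": [{"media_url": "http://example.com/a.jpg"}]},
    {"hashtags": []},
    {"media": [{"media_url": "http://example.com/b.jpg"}]},
]
for url in first_media_urls(sample_entities):
    print(url)
```

Each URL is handled inside the loop, which is the change that made the script download a different image per tweet.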
One strange thing I see is that the variable i declared in the outer loop is never used afterwards. Shouldn't your code be
tweets.extend(api.statuses_lookup(id_=i, include_entities=True))
and not id_=id_list as you wrote?

how to get whole user timeline of a specific twitter user

So I came up with this script to get all of the tweets from one Twitter user:
import tweepy
from tweepy import OAuthHandler
import json
def load_api():
    consumer_key = ''
    consumer_secret = ''
    access_token = ''
    access_secret = ''
    auth = OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_token, access_secret)
    return tweepy.API(auth)
api = load_api()
user = 'allkpop'
tweets = api.user_timeline(id=user, count=2000)
print('Found %d tweets' % len(tweets))
tweets_text = [t.text for t in tweets]
filename = 'tweets-'+user+'.json'
json.dump(tweets_text, open(filename, 'w'))
print('Saved to file:', filename)
But when I run it I can only get 200 tweets per request. Is there a way to get 2000 tweets, or even more than 2000?
Please help me, thank you.
The Twitter API has request limits. The call you're using corresponds to the Twitter statuses/user_timeline endpoint. The maximum number of tweets you can get from this endpoint is documented as 3,200. Also note that there's a maximum number of requests in a 15-minute window, which can limit how quickly you reach that total. Here are a couple of observations that might be interesting for you:
The documentation says that the maximum count per request is 200, which matches the 200 you are getting.
There's an include_rts (include retweets) parameter that might return more values. While it's part of the Twitter API, I can't see where Tweepy documents it.
You might try Tweepy Cursors to see if that brings back more items.
Because of the 15-minute limits, you might be able to pause until the next 15-minute window and then continue. That said, I don't know enough about your use case to know if this is practical or not.
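Getting past 200 means asking for page after page, each time requesting only tweets older than the ones already seen, which is what the endpoint's max_id parameter is for. The sketch below shows that loop against a fake data source so it runs without credentials; collect_timeline, fetch_page and the fake timeline are hypothetical names, and with Tweepy fetch_page would be a thin wrapper around api.user_timeline.

```python
def collect_timeline(fetch_page, page_size=200, limit=3200):
    """Page backwards through a timeline until it is exhausted or limit is hit."""
    collected = []
    max_id = None
    while len(collected) < limit:
        count = min(page_size, limit - len(collected))
        page = fetch_page(max_id=max_id, count=count)
        if not page:
            break  # nothing older is available
        collected.extend(page)
        # Next request: only tweets strictly older than the oldest one seen.
        max_id = min(t["id"] for t in page) - 1
    return collected


# Fake timeline of 450 tweets, newest first, standing in for the real API.
FAKE_TIMELINE = [{"id": i} for i in range(450, 0, -1)]


def fake_fetch(max_id=None, count=200):
    eligible = [t for t in FAKE_TIMELINE if max_id is None or t["id"] <= max_id]
    return eligible[:count]


print(len(collect_timeline(fake_fetch)))
```

The limit default mirrors the 3,200 figure mentioned above; rate-limit pauses are left out of the sketch.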

Query regarding pagination in tweepy (get_followers) of a particular twitter user

I am fairly new to tweepy and pagination using the cursor class. I have been trying to use the cursor class to get all the followers of a particular twitter user but I keep getting the error "tweepy.error.TweepError: This method does not perform pagination"
Hence I would really appreciate any help if someone could please help me achieve this task of obtaining all the followers of a particular twitter user with pagination, using tweepy. The code I have so far is as follows:
import tweepy
consumer_key='xyz'
consumer_secret='xyz'
access_token='abc'
access_token_secret='def'
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)
user = api.get_user('somehandle')
print user.name
followers = tweepy.Cursor(user.followers)
temp=[]
for user in followers.items():
    temp.append(user)
print temp
#the following part works fine but that is without pagination so I will be able to retrieve at
#most 100 followers
aDict = user.followers()
for friend in aDict:
    friendDict = friend.__getstate__()
    print friendDict['screen_name']
There is a handy method called followers_ids. It returns up to 5000 follower IDs (Twitter API limit) for the given screen_name (or id, user_id or cursor).
Then you can paginate these results manually in Python and call lookup_users for every chunk. Since lookup_users can handle only 100 user IDs at a time (Twitter API limit), it makes sense to set the chunk size to 100.
Here's the code (pagination part was taken from here):
import itertools
import tweepy
def paginate(iterable, page_size):
    while True:
        i1, i2 = itertools.tee(iterable)
        iterable, page = (itertools.islice(i1, page_size, None),
                list(itertools.islice(i2, page_size)))
        if len(page) == 0:
            break
        yield page
auth = tweepy.OAuthHandler(<consumer_key>, <consumer_secret>)
auth.set_access_token(<key>, <secret>)
api = tweepy.API(auth)
followers = api.followers_ids(screen_name='gvanrossum')
for page in paginate(followers, 100):
    results = api.lookup_users(user_ids=page)
    for result in results:
        print result.screen_name
Hope that helps.
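Because followers_ids already returns a plain list, the chunking step can also be written with slicing instead of itertools. A minimal sketch in Python 3; chunks is a hypothetical helper and the ID list is made up.

```python
def chunks(items, size):
    """Yield consecutive slices of items, each holding at most size elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Made-up follower IDs standing in for the result of followers_ids.
follower_ids = list(range(1, 251))
for page in chunks(follower_ids, 100):
    print(len(page))
```

Each page would then be passed to api.lookup_users(user_ids=page) as in the answer above.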
