I'm trying to download my Twitter followers and the followers of my followers.
The code seems to work, but it doesn't download all my followers. It downloads only a portion, and within that portion I think it works well. Why doesn't it fetch all of them?
# -*- coding: utf-8 -*-
"""
#author: Mik
"""
import csv
import time
import tweepy
# Copy the api key, the api secret, the access token and the access token secret from the relevant page on your Twitter app
api_key = ''
api_secret = ''
access_token = '-'
access_token_secret = ''
# You don't need to make any changes below here
# This bit authorises you to ask for information from Twitter
auth = tweepy.OAuthHandler(api_key, api_secret)
auth.set_access_token(access_token, access_token_secret)
# The api object gives you access to all of the http calls that Twitter accepts
api = tweepy.API(auth)
#User we want to use as initial node
user=''
#This creates a csv file and defines that each new entry will be in a new line
csvfile=open(user+'network2.csv', 'w')
spamwriter = csv.writer(csvfile, delimiter=' ',quotechar='|', quoting=csv.QUOTE_MINIMAL)
#This is the function that takes a node (user), looks for all its followers
#and prints them into a CSV file... and looks for the followers of each follower...
def fib(n,user,spamwriter):
    if n>0:
        #There is a limit to the traffic you can have with the API, so you need to wait
        #a few seconds per call or after a few calls it will restrict your traffic
        #for 15 minutes. This parameter can be tweaked
        time.sleep(40)
        #This is for private users whose followers we won't be able to see
        try:
            users=tweepy.Cursor(api.followers, screen_name = user, wait_on_rate_limit = True).items()
            for follower in users:
                spamwriter.writerow([user+';'+follower.screen_name])
                fib(n-1,follower.screen_name,spamwriter)
                #n defines the level of recursion
        except tweepy.TweepError:
            print("Failed to run the command on that user, Skipping...")
n=2
fib(n,user,spamwriter)
If I understood correctly, you want to get the IDs of all followers of each of your followers.
Use logic like the following; it fetches details for up to 3000 of your followers per 15 minutes (15 requests of 200 followers each):
import tweepy
# Twitter credentials here (placeholders, replace with your own keys)
auth = tweepy.OAuthHandler('consumer_key', 'consumer_secret')
auth.set_access_token('access_token', 'access_token_secret')
api = tweepy.API(auth)
# count=200 asks for 200 followers per page
iter1 = tweepy.Cursor(api.followers, screen_name = 'your_screen_name',count = 200).pages()
# 15 pages per 15-minute window
for request in range(15):
    your_200_followers = next(iter1)
    for each_follower in your_200_followers:
        # followers_ids is a method, so it has to be called
        variable = each_follower.followers_ids()
        #store the <list> variable somewhere
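To get a feel for how long a full download takes at that pace, here is a minimal sketch of the arithmetic. The function name `minutes_to_fetch` is made up for illustration, and the defaults (200 followers per page, 15 requests per 15-minute window) are simply the figures used in the answer above, not values checked against the current API.

```python
import math

def minutes_to_fetch(n_followers, per_page=200, requests_per_window=15, window_minutes=15):
    """Estimate the minutes spent waiting on rate limits to fetch n_followers."""
    pages = math.ceil(n_followers / per_page)
    windows = math.ceil(pages / requests_per_window)
    # The first window needs no waiting; every further window costs one full wait.
    return max(windows - 1, 0) * window_minutes

print(minutes_to_fetch(3000))    # fits in a single window
print(minutes_to_fetch(10000))   # needs several windows
```

Under these assumptions, an account with more than 3000 followers cannot be downloaded completely without waiting out at least one rate-limit window, which is one plausible reason for seeing only a portion of the followers.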
Related
I am able to reply to a specific tweet by getting tweet IDs, but cannot get my configuration to do what I want it to do, which is to reply to every tweet from a specific user. I have that user's username and ID. Currently it appears to only be pulling one tweet, which I suspect has something to do with line 23's tweet.id. I guess what I'm looking for is a way to ensure that my bot replies every single time this user tweets. Here is my current code (sensitive info redacted)
from ast import For
import tweepy
api_key = "###############################################"
api_secret = "###############################################"
bearer_token = r"###############################################"
access_token = "###############################################"
access_token_secret = "###############################################"
client = tweepy.Client(bearer_token, api_key, api_secret, access_token, access_token_secret)
auth = tweepy.OAuth1UserHandler(api_key, api_secret, access_token, access_token_secret)
api = tweepy.API(auth)
toReply = "TwitterUsernameHere"
api = tweepy.API(auth)
tweets = api.user_timeline(screen_name = toReply, count=1)
for tweet in tweets:
    api.update_status("#" + toReply + " Why? ", in_reply_to_status_id = tweet.id)
Assuming that you are following the Twitter automation rules (i.e. that you're only replying to Tweets that the user has opted-in for your app to reply to - otherwise your user account or app will be restricted)...
... your code currently checks the user's Timeline, and then replies to the most recent single Tweet (count=1 on the user_timeline call). You would need this to check for new Tweets in order to reply to different ones. You could store tweet.id somewhere and only reply to it when it changes.
Note that there are a few other things to tidy up:
from ast import For is not required
client = tweepy.Client targets the Twitter API v2 but the rest of the code uses Twitter API v1.1 (via tweepy.API)
bearer_token is unused in this code and will only work for a read operation in v1.1 of the API so you could remove it.
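The idea of storing tweet.id and only replying when it changes could look roughly like the sketch below. It uses a stand-in `Tweet` tuple instead of real tweepy objects so that it runs without credentials; `unseen_tweets` is a hypothetical helper name, and it relies on tweet IDs growing over time.

```python
from collections import namedtuple

# Stand-in for the status objects returned by api.user_timeline (assumption).
Tweet = namedtuple("Tweet", ["id", "text"])

def unseen_tweets(timeline, last_seen_id):
    """Return the tweets newer than last_seen_id, oldest first."""
    fresh = [t for t in timeline if last_seen_id is None or t.id > last_seen_id]
    return sorted(fresh, key=lambda t: t.id)

timeline = [Tweet(105, "c"), Tweet(103, "b"), Tweet(101, "a")]
last_seen_id = 101  # would be loaded from a file or database between runs

for tweet in unseen_tweets(timeline, last_seen_id):
    # In the real bot, the api.update_status(..., in_reply_to_status_id=tweet.id)
    # call would go here.
    last_seen_id = tweet.id

print(last_seen_id)
```

Running this check on a schedule, with last_seen_id saved between runs, means each tweet is answered once. Passing the stored ID as since_id to user_timeline should achieve a similar effect on the API side.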
The code below was provided to another user who was scraping the "friends" (not followers) list of a specific Twitter user. For some reason, I get an error when using "api.lookup_users". The error states "Too many terms specified in query". Ideally, I would like to scrape the followers and output a csv with the screen names (not ids). I would like their descriptions as well, but can do this in a separate step unless there is a suggestion for pulling both pieces of information. Below is the code that I am using:
import time
import tweepy
import csv
#Twitter API credentials
consumer_key = ""
consumer_secret = ""
access_key = ""
access_secret = ""
auth = tweepy.auth.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_key, access_secret)
api = tweepy.API(auth)
ids = []
for page in tweepy.Cursor(api.friends, screen_name="").pages():
    ids.extend(page)
    time.sleep(60)
print(len(ids))
users = api.lookup_users(user_ids=ids)
#iterates through the list of users and prints them
for u in users:
    print(u.screen_name)
From the error you get, it seems that you are passing too many IDs at once to the api.lookup_users request. Try splitting your list of IDs into smaller parts and making a request for each part.
import time
import tweepy
import csv
#Twitter API credentials
consumer_key = ""
consumer_secret = ""
access_key = ""
access_secret = ""
auth = tweepy.auth.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_key, access_secret)
api = tweepy.API(auth)
ids = []
for page in tweepy.Cursor(api.friends, screen_name="").pages():
    ids.extend(page)
    time.sleep(60)
print(len(ids))
# lookup_users accepts at most 100 IDs per request
idChunks = [ids[i:i + 100] for i in range(0, len(ids), 100)]
users = []
for idChunk in idChunks:
    try:
        # extend (not append) so that users stays a flat list of user objects
        users.extend(api.lookup_users(user_ids=idChunk))
    except tweepy.error.RateLimitError:
        print("RATE EXCEEDED. SLEEPING FOR 16 MINUTES")
        time.sleep(16*60)
        users.extend(api.lookup_users(user_ids=idChunk))
for u in users:
    print(u.screen_name)
    print(u.description)
This code has not been tested and does not write the CSV, but it should help you get past the error you were having. The chunk size is 100 because that is the most IDs lookup_users accepts in one request; a smaller size also works but needs more requests.
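For the CSV part that the answer leaves out, a minimal sketch could look like this. `write_users_csv` is a hypothetical helper, and the `SimpleNamespace` objects only stand in for the user objects that api.lookup_users returns, so the example runs without Twitter credentials.

```python
import csv
import io
from types import SimpleNamespace

def write_users_csv(users, fileobj):
    """Write a header row, then one row per user: screen name and description."""
    writer = csv.writer(fileobj)
    writer.writerow(["screen_name", "description"])
    for u in users:
        # A missing description is written as an empty cell.
        writer.writerow([u.screen_name, u.description or ""])

# Stand-ins for the objects returned by api.lookup_users (assumption).
users = [
    SimpleNamespace(screen_name="alice", description="Likes, commas"),
    SimpleNamespace(screen_name="bob", description=None),
]

buf = io.StringIO()
write_users_csv(users, buf)
print(buf.getvalue())
```

With a real file in Python 3, open it with newline="" and pass the file object in place of buf; the csv module takes care of quoting descriptions that contain commas or quotes.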
I have a list of tweet IDs for which I would like to download the text content. Is there an easy way to do this, preferably through a Python script? I had a look at libraries like Tweepy, but things don't appear to work that simply, and downloading the tweets manually is out of the question since my list is very long.
You can access specific tweets by their id with the statuses/show/:id API route. Most Python Twitter libraries follow the exact same patterns, or offer 'friendly' names for the methods.
For example, Twython offers several show_* methods, including Twython.show_status() that lets you load specific tweets:
from twython import Twython
CONSUMER_KEY = "<consumer key>"
CONSUMER_SECRET = "<consumer secret>"
OAUTH_TOKEN = "<application key>"
OAUTH_TOKEN_SECRET = "<application secret>"
twitter = Twython(
    CONSUMER_KEY, CONSUMER_SECRET,
    OAUTH_TOKEN, OAUTH_TOKEN_SECRET)
tweet = twitter.show_status(id=id_of_tweet)
print(tweet['text'])
and the returned dictionary follows the Tweet object definition given by the API.
The tweepy library uses the get_status() method of its API object:
auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(OAUTH_TOKEN, OAUTH_TOKEN_SECRET)
api = tweepy.API(auth)
tweet = api.get_status(id_of_tweet)
print(tweet.text)
where it returns a slightly richer object, but the attributes on it again reflect the published API.
Sharing my work, which was vastly accelerated by the previous answers (thank you). This Python 2.7 script fetches the text for tweet IDs stored in a file. Adjust get_tweet_id() for your input data format; it was originally configured for the data at https://github.com/mdredze/twitter_sandy
Update April 2018: responding late to #someone bug report (thank you). This script no longer discards every 100th tweet ID (that was my bug). Please note that if a tweet is unavailable for whatever reason, the bulk fetch silently skips it. The script now warns if the response size is different from the request size.
'''
Gets text content for tweet IDs
'''
# standard
from __future__ import print_function
import getopt
import logging
import os
import sys
# import traceback
# third-party: `pip install tweepy`
import tweepy
# global logger level is configured in main()
Logger = None
# Generate your own at https://apps.twitter.com/app
CONSUMER_KEY = 'Consumer Key (API key)'
CONSUMER_SECRET = 'Consumer Secret (API Secret)'
OAUTH_TOKEN = 'Access Token'
OAUTH_TOKEN_SECRET = 'Access Token Secret'
# batch size depends on Twitter limit, 100 at this time
batch_size=100
def get_tweet_id(line):
    '''
    Extracts and returns tweet ID from a line in the input.
    '''
    (tagid,_timestamp,_sandyflag) = line.split('\t')
    (_tag, _search, tweet_id) = tagid.split(':')
    return tweet_id
def get_tweets_single(twapi, idfilepath):
    '''
    Fetches content for tweet IDs in a file one at a time,
    which means a ton of HTTPS requests, so NOT recommended.
    `twapi`: Initialized, authorized API object from Tweepy
    `idfilepath`: Path to file containing IDs
    '''
    # process IDs from the file
    with open(idfilepath, 'rb') as idfile:
        for line in idfile:
            tweet_id = get_tweet_id(line)
            Logger.debug('get_tweets_single: fetching tweet for ID %s', tweet_id)
            try:
                tweet = twapi.get_status(tweet_id)
                print('%s,%s' % (tweet_id, tweet.text.encode('UTF-8')))
            except tweepy.TweepError as te:
                Logger.warn('get_tweets_single: failed to get tweet ID %s: %s', tweet_id, te.message)
                # traceback.print_exc(file=sys.stderr)
        # for
    # with
def get_tweet_list(twapi, idlist):
    '''
    Invokes bulk lookup method.
    Raises an exception if rate limit is exceeded.
    '''
    # fetch as little metadata as possible
    tweets = twapi.statuses_lookup(id_=idlist, include_entities=False, trim_user=True)
    if len(idlist) != len(tweets):
        Logger.warn('get_tweet_list: unexpected response size %d, expected %d', len(tweets), len(idlist))
    for tweet in tweets:
        print('%s,%s' % (tweet.id, tweet.text.encode('UTF-8')))
def get_tweets_bulk(twapi, idfilepath):
    '''
    Fetches content for tweet IDs in a file using bulk request method,
    which vastly reduces number of HTTPS requests compared to above;
    however, it does not warn about IDs that yield no tweet.
    `twapi`: Initialized, authorized API object from Tweepy
    `idfilepath`: Path to file containing IDs
    '''
    # process IDs from the file
    tweet_ids = list()
    with open(idfilepath, 'rb') as idfile:
        for line in idfile:
            tweet_id = get_tweet_id(line)
            Logger.debug('Enqueuing tweet ID %s', tweet_id)
            tweet_ids.append(tweet_id)
            # API limits batch size
            if len(tweet_ids) == batch_size:
                Logger.debug('get_tweets_bulk: fetching batch of size %d', batch_size)
                get_tweet_list(twapi, tweet_ids)
                tweet_ids = list()
    # process remainder
    if len(tweet_ids) > 0:
        Logger.debug('get_tweets_bulk: fetching last batch of size %d', len(tweet_ids))
        get_tweet_list(twapi, tweet_ids)
def usage():
    print('Usage: get_tweets_by_id.py [options] file')
    print(' -s (single) makes one HTTPS request per tweet ID')
    print(' -v (verbose) enables detailed logging')
    sys.exit()
def main(args):
    logging.basicConfig(level=logging.WARN)
    global Logger
    Logger = logging.getLogger('get_tweets_by_id')
    bulk = True
    try:
        opts, args = getopt.getopt(args, 'sv')
    except getopt.GetoptError:
        usage()
    for opt, _optarg in opts:
        if opt in ('-s'):
            bulk = False
        elif opt in ('-v'):
            Logger.setLevel(logging.DEBUG)
            Logger.debug("main: verbose mode on")
        else:
            usage()
    if len(args) != 1:
        usage()
    idfile = args[0]
    if not os.path.isfile(idfile):
        print('Not found or not a file: %s' % idfile, file=sys.stderr)
        usage()
    # connect to twitter
    auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
    auth.set_access_token(OAUTH_TOKEN, OAUTH_TOKEN_SECRET)
    api = tweepy.API(auth)
    # hydrate tweet IDs
    if bulk:
        get_tweets_bulk(api, idfile)
    else:
        get_tweets_single(api, idfile)
if __name__ == '__main__':
    main(sys.argv[1:])
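The script above warns when the bulk response is smaller than the request but does not say which IDs are missing. A small sketch of how that could be reported follows; `missing_ids` is a hypothetical helper, not part of the script, and it compares IDs as strings because the file yields strings while tweet.id is a number.

```python
def missing_ids(requested_ids, returned_ids):
    """Return the requested IDs that are absent from the response, in request order."""
    returned = {str(i) for i in returned_ids}
    return [i for i in requested_ids if str(i) not in returned]

requested = ["10", "11", "12", "13"]   # IDs read from the input file
returned = [10, 12]                    # e.g. [tweet.id for tweet in tweets]
print(missing_ids(requested, returned))
```

Inside get_tweet_list this could be called with idlist and the IDs of the returned tweets, logging the result instead of only the two sizes.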
You can access tweets in bulk (up to 100 at a time) with the statuses/lookup endpoint: https://dev.twitter.com/rest/reference/get/statuses/lookup
I don't have enough reputation to add an actual comment so sadly this is the way to go:
I found a bug and a strange thing in chrisinmtown's answer:
Every 100th tweet will be skipped due to the bug. Here is a simple solution:
if len(tweet_ids) < 100:
    tweet_ids.append(tweet_id)
else:
    tweet_ids.append(tweet_id)
    get_tweet_list(twapi, tweet_ids)
    tweet_ids = list()
Creating the API object with wait_on_rate_limit=True is also better, since the script then keeps working past the rate limit:
api = tweepy.API(auth_handler=auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True)
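The "every 100th tweet is skipped" bug described above is easy to reintroduce when batching by hand, so it can help to isolate the batching in one small generator and check it on plain numbers. This is only a sketch; `batches` is a hypothetical helper and is not taken from either answer.

```python
def batches(items, size):
    """Yield consecutive lists of at most `size` items without dropping any."""
    batch = []
    for item in items:
        # Append first, then check the size, so the item that fills a batch is kept.
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    # Emit whatever is left over as a final, shorter batch.
    if batch:
        yield batch

ids = list(range(250))
sizes = [len(b) for b in batches(ids, 100)]
print(sizes)
```

With such a helper, the bulk loop reduces to calling get_tweet_list once per batch, and the remainder handling comes for free.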
Using tweepy I am able to return all of my friends using a cursor. Is it possible to specify another user and get all of their friends?
user = api.get_user('myTwitter')
print "Retrieving friends for", user.screen_name
for friend in tweepy.Cursor(api.friends).items():
    print "\n", friend.screen_name
Which prints a list of all my friends, however if I change the first line
to another twitter user it still returns my friends. How can I get friends for any given user using tweepy?
#first line is changed to
user = api.get_user('otherUsername') #still returns my friends
Additionally user.screen_name when printed WILL return otherUsername
The question "Get All Follower IDs in Twitter by Tweepy" does essentially what I am looking for; however, it returns only a count of IDs. If I remove the len() call, I can iterate through a list of user IDs, but is it possible to get screen names (#twitter, #stackoverflow, #etc.)?
You can use the ids variable from the answer you referenced to get the IDs of the followers of a given person, and extend it to get the screen names of all of the followers using Tweepy's api.lookup_users method:
import time
import tweepy
auth = tweepy.OAuthHandler(..., ...)
auth.set_access_token(..., ...)
api = tweepy.API(auth)
ids = []
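for page in tweepy.Cursor(api.followers_ids, screen_name="McDonalds").pages():
    ids.extend(page)
    time.sleep(60)
# lookup_users accepts at most 100 IDs per request, so query in chunks
screen_names = []
for i in range(0, len(ids), 100):
    users = api.lookup_users(user_ids=ids[i:i + 100])
    screen_names.extend(user.screen_name for user in users)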
You can use this:
# import the module
import tweepy
# assign the values accordingly
consumer_key = ""
consumer_secret = ""
access_token = ""
access_token_secret = ""
# authorization of consumer key and consumer secret
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
# set access to user's access key and access secret
auth.set_access_token(access_token, access_token_secret)
# calling the api
api = tweepy.API(auth)
# the screen_name of the targeted user
screen_name = "TwitterIndia"
# printing the latest 20 friends of the user
for friend in api.friends(screen_name):
    print(friend.screen_name)
For more details, see https://www.geeksforgeeks.org/python-api-friends-in-tweepy/
I am fairly new to tweepy and to pagination using the Cursor class. I have been trying to use the Cursor class to get all the followers of a particular Twitter user, but I keep getting the error "tweepy.error.TweepError: This method does not perform pagination".
I would really appreciate any help with obtaining all the followers of a particular Twitter user, with pagination, using tweepy. The code I have so far is as follows:
import tweepy
consumer_key='xyz'
consumer_secret='xyz'
access_token='abc'
access_token_secret='def'
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)
user = api.get_user('somehandle')
print user.name
followers = tweepy.Cursor(user.followers)
temp=[]
for user in followers.items():
    temp.append(user)
print temp
#the following part works fine but that is without pagination so I will be able to retrieve at
#most 100 followers
aDict = user.followers()
for friend in aDict:
    friendDict = friend.__getstate__()
    print friendDict['screen_name']
There is a handy method called followers_ids. It returns up to 5000 follower IDs per call (Twitter API limit) for the given screen_name (or id, user_id or cursor).
Then you can paginate these results manually in Python and call lookup_users for every chunk. Since lookup_users can handle only 100 user IDs at a time (Twitter API limit), it makes sense to set the chunk size to 100.
Here's the code (the pagination part was taken from here):
import itertools
import tweepy
def paginate(iterable, page_size):
    # yields successive lists of at most page_size items
    while True:
        i1, i2 = itertools.tee(iterable)
        iterable, page = (itertools.islice(i1, page_size, None),
                          list(itertools.islice(i2, page_size)))
        if len(page) == 0:
            break
        yield page
# placeholders, replace with your own credentials
auth = tweepy.OAuthHandler('<consumer_key>', '<consumer_secret>')
auth.set_access_token('<key>', '<secret>')
api = tweepy.API(auth)
followers = api.followers_ids(screen_name='gvanrossum')
for page in paginate(followers, 100):
    results = api.lookup_users(user_ids=page)
    for result in results:
        print(result.screen_name)
Hope that helps.