I have written code that pulls tweets matching certain keywords and writes them to an Excel document. I am trying to get it to work with the premium sandbox but cannot figure out how. Any insight?
I have:
-a developer account
-a registered application
-a developer environment set up
What else do I need to do to get this to work? The core code is as follows:
## import libraries
import os
import tweepy as tw
### variables ###
consumer_key = ""
consumer_secret = ""
access_token = ""
access_token_secret = ""
## set values for keys
auth = tw.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tw.API(auth, wait_on_rate_limit=True)
## set search words and search date limit
search_list = ["apples oranges -filter:retweets"]
sc = 0  # index of the query in search_list to run
search_words = search_list[sc]
date_since = "2020-07-01"
def gettweet():
    # finds tweets; retweets can be filtered out if wanted
    tweets = tw.Cursor(api.search,
                       q=search_words,
                       lang="en", since=date_since, until="2020-07-16",
                       tweet_mode="extended").items(50)
    #,until="2020-07-08"
    # generates a list containing username, location, creation time and text
    all_tweets = [[tweet.user.screen_name, tweet.user.location, tweet.created_at, tweet.full_text] for tweet in tweets]
    print(all_tweets)
gettweet()
From this I hope to get a dataframe of tweets containing the keywords 'apples' or 'oranges'. I want the tweets to go back 30 days (hence my using the premium sandbox for this).
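One possible direction, offered as an untested sketch rather than a confirmed fix: Tweepy 3.9 and later expose the premium 30-day endpoint as api.search_30_day, which takes the label of your dev environment, a query, and dates in YYYYMMDDhhmm form. The helper names below (premium_timestamp, build_30_day_params, fetch_30_day) and the environment label "dev" are hypothetical. Premium search uses its own operator syntax, so the standard -filter:retweets operator may not carry over and is left out here.

```python
from datetime import datetime, timedelta, timezone


def premium_timestamp(moment):
    """Format a datetime as YYYYMMDDhhmm, the form the premium endpoints expect."""
    return moment.strftime("%Y%m%d%H%M")


def build_30_day_params(query, days_back=29, max_results=100, now=None):
    """Build keyword arguments for one 30-day search request.

    Assumptions: days_back=29 keeps the start date inside the 30-day window,
    and max_results=100 is taken to be the per-request ceiling of the sandbox tier.
    """
    now = now or datetime.now(timezone.utc)
    return {
        "query": query,
        "fromDate": premium_timestamp(now - timedelta(days=days_back)),
        "maxResults": max_results,
    }


def fetch_30_day(api, environment_name, query):
    """Run one 30-day search and return [screen_name, location, created_at, text] rows."""
    params = build_30_day_params(query)
    # environment_name is the label of the dev environment from the developer portal.
    tweets = api.search_30_day(environment_name, params.pop("query"), **params)
    return [[t.user.screen_name, t.user.location, t.created_at, t.text] for t in tweets]
```

With api authenticated as in the question, rows = fetch_30_day(api, "dev", "apples OR oranges") would return a list of rows that can be handed to pandas.DataFrame. Long tweets may come back truncated in the text attribute, so this is a starting point only.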
Hello, I am looking for a way to create a Twitter bot in Python that replies to all the tweets of a specific user.
I have already created a developer account and I am a beginner in Python.
First, visit https://dev.twitter.com, and create a new application.
Then, in your venv or Anaconda environment, run:
pip install tweepy
Now, in your development directory, create a file, keys.py, and add the following code:
#!/usr/bin/env python
#keys.py
#visit https://dev.twitter.com to create an application and get your keys
keys = dict(
    consumer_key = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx',
    consumer_secret = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx',
    access_token = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx',
    access_token_secret = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx',
)
Replace the 'x' fields with the keys and tokens from your newly created Twitter application.
Create a file, replybot.py, in the same directory as keys.py, and add the following code:
#!/usr/bin/env python
import tweepy
#from our keys module (keys.py), import the keys dictionary
from keys import keys
CONSUMER_KEY = keys['consumer_key']
CONSUMER_SECRET = keys['consumer_secret']
ACCESS_TOKEN = keys['access_token']
ACCESS_TOKEN_SECRET = keys['access_token_secret']
auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_TOKEN, ACCESS_TOKEN_SECRET)
api = tweepy.API(auth)
twts = api.search(q="Hello World!")
#list of specific strings we want to check for in Tweets
t = ['Hello world!',
     'Hello World!',
     'Hello World!!!',
     'Hello world!!!',
     'Hello, world!',
     'Hello, World!']
# reply to every search result whose text exactly matches one of the strings
for s in twts:
    for i in t:
        if i == s.text:
            sn = s.user.screen_name
            m = "@%s Hello!" % (sn)
            api.update_status(m, in_reply_to_status_id=s.id)
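To check that your API credentials work, you can use the script below, which tweets each line of a text file. The sleep call spaces the tweets out so that you are not asked to validate that you are not a bot. Run it as python <program.py> followed by the path to a .txt file.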
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import tweepy, time, sys
argfile = str(sys.argv[1])
#enter the corresponding information from your Twitter application:
CONSUMER_KEY = '1234abcd...'#keep the quotes, replace this with your consumer key
CONSUMER_SECRET = '1234abcd...'#keep the quotes, replace this with your consumer secret key
ACCESS_KEY = '1234abcd...'#keep the quotes, replace this with your access token
ACCESS_SECRET = '1234abcd...'#keep the quotes, replace this with your access token secret
auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_KEY, ACCESS_SECRET)
api = tweepy.API(auth)
filename=open(argfile,'r')
f=filename.readlines()
filename.close()
for line in f:
    api.update_status(line)
    time.sleep(900)  # Tweet every 15 minutes
To reply to a specific Twitter user:
toReply = "someonesTwitterName"  # user to get most recent tweet
api = tweepy.API(auth)
# get the most recent tweet from the user
tweets = api.user_timeline(screen_name = toReply, count=1)
for tweet in tweets:
    api.update_status("@" + toReply + " This is what I'm replying with", in_reply_to_status_id = tweet.id)
From there you can add whatever logic you want.
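To reply to every new tweet rather than only the most recent one, the snippet above could be wrapped in a function that remembers the newest tweet ID it has handled. This is a rough sketch, not a tested bot: reply_to_new_tweets is a hypothetical name, and it relies only on user_timeline and update_status as used above.

```python
def reply_to_new_tweets(api, screen_name, message, since_id=None):
    """Reply once to each tweet newer than since_id and return the newest ID seen."""
    kwargs = {"screen_name": screen_name, "count": 20}
    if since_id is not None:
        # Only ask for tweets posted after the last one already handled.
        kwargs["since_id"] = since_id
    tweets = api.user_timeline(**kwargs)

    newest = since_id
    # The timeline comes back newest first, so walk it in reverse to reply in order.
    for tweet in reversed(list(tweets)):
        api.update_status(
            status="@%s %s" % (screen_name, message),
            in_reply_to_status_id=tweet.id,
        )
        if newest is None or tweet.id > newest:
            newest = tweet.id
    return newest
```

Calling it in a loop with a time.sleep between rounds, and passing the returned ID back in as since_id, avoids replying to the same tweet twice. Storing that ID in a file would let the bot survive a restart.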
I have a list of more than 100 tweet IDs and I want to get the retweets for each one. The code I have works for a single tweet ID. How can I pass it the whole list, check whether each tweet has retweets, and print the user IDs of the retweeters?
# import the module
import tweepy
# assign the values accordingly
consumer_key = ""
consumer_secret = ""
access_token = ""
access_token_secret = ""
# authorization of consumer key and consumer secret
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
# set access to user's access key and access secret
auth.set_access_token(access_token, access_token_secret)
# calling the api
api = tweepy.API(auth)
# the ID of the tweet
ID = 1265889240300257280
# getting the retweeters
retweets_list = api.retweets(ID)
# printing the screen names of the retweeters
for retweet in retweets_list:
    print(retweet.user.screen_name)
Can anyone help me?
For getting Retweets from a list of Tweets, you'll need to iterate over your list of Tweet IDs and call the api.retweets function for each one in turn.
If your Tweets themselves have more than 100 Retweets, you'll hit a limitation in the API.
Per the Tweepy documentation:
API.retweets(id[, count])
Returns up to 100 of the first retweets of the given tweet.
The Twitter API itself only supports retrieving up to 100 Retweets, see the API documentation (this is the same API that Tweepy is calling):
GET statuses/retweets/:id
Returns a collection of the 100 most recent retweets of the Tweet specified by the id parameter.
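The iteration described above might look like the sketch below. collect_retweeters is a hypothetical helper name, and the call follows the Tweepy 3.x api.retweets(id) signature quoted above, so each tweet still yields at most 100 retweeters.

```python
def collect_retweeters(api, tweet_ids):
    """Map each tweet ID to the user IDs of its (up to 100) most recent retweeters."""
    retweeters = {}
    for tweet_id in tweet_ids:
        # One API call per tweet; an empty list means no retweets were returned.
        retweets = api.retweets(tweet_id)
        retweeters[tweet_id] = [retweet.user.id for retweet in retweets]
    return retweeters
```

With more than 100 tweet IDs this makes more than 100 requests, so creating the client with tweepy.API(auth, wait_on_rate_limit=True) is probably worthwhile.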
This works for me:
for retweet in retweets_list:
    print(retweet.retweet_count)
I'm trying to retrieve the tweets that particular accounts have posted. I use the
user_timeline method from the tweepy library, but it also includes replies from that Twitter user. Does anyone have a clue how to omit them?
Code:
import tweepy
consumer_key = key
consumer_secret = key
access_key = key
access_secret = key
def get_tweets(username):
    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_key, access_secret)
    api = tweepy.API(auth)
    # set count to however many tweets you want; twitter only allows 200 at once
    number_of_tweets = 20
    # get tweets
    tweets = api.user_timeline(screen_name = username, count = number_of_tweets)
    # create array of tweet information: username, tweet id, date/time, text
    tweets_for_csv = [[username, tweet.id_str, tweet.created_at, tweet.text.encode("utf-8")] for tweet in tweets]
    print(str(tweets_for_csv))
Pass exclude_replies as a kwarg.
tweets = api.user_timeline(screen_name=username, count=number_of_tweets, exclude_replies=True)
See Twitter's API documentation for a full list of kwargs you can pass.
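One thing to be aware of: Twitter applies count before it filters out replies, so exclude_replies=True can return fewer tweets than requested. If a fixed number of non-reply tweets is needed, one alternative is to page through the timeline and filter on the client. The sketch below swaps in that technique; original_tweets is a hypothetical name, and it assumes that a reply is any status whose in_reply_to_status_id is set.

```python
def original_tweets(api, username, wanted, page_size=200):
    """Collect up to `wanted` tweets by `username` that are not replies."""
    collected = []
    max_id = None
    while len(collected) < wanted:
        kwargs = {"screen_name": username, "count": page_size}
        if max_id is not None:
            kwargs["max_id"] = max_id
        page = api.user_timeline(**kwargs)
        if not page:
            break  # reached the end of the available timeline
        # Keep only statuses that do not answer another status.
        collected.extend(t for t in page if t.in_reply_to_status_id is None)
        # Continue with tweets strictly older than the last one on this page.
        max_id = page[-1].id - 1
    return collected[:wanted]
```

For example, original_tweets(api, username, 20) would stand in for the user_timeline call in get_tweets.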
I am trying to gather the tweets of a user navalny, from 01.11.2017 to 31.01.2018 using tweepy. I have ids of the first and last tweets that I need, so I tried the following code:
import tweepy
consumer_key = ''
consumer_secret = ''
access_token = ''
access_token_secret = ''
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)
t = api.user_timeline(screen_name='navalny', since_id = 933000445307518976, max_id = 936533580481814529)
However, the returned value is an empty list.
What is the problem here?
Are there any restrictions on the history of tweets that I can get?
What are possible solutions?
Quick answer:
Using Tweepy you can only retrieve the last 3200 tweets from the Twitter REST API for a given user.
Unfortunately the tweets you are trying to access are older than this.
Detailed answer:
I did a check using the code below:
import tweepy
from tweepy import OAuthHandler
def tweet_check(user):
    """
    Scrapes a user's most recent tweets
    """
    # API keys and initial configuration
    consumer_key = ""
    consumer_secret = ""
    access_token = ""
    access_secret = ""
    # Configure authentication
    authorisation = OAuthHandler(consumer_key, consumer_secret)
    authorisation.set_access_token(access_token, access_secret)
    api = tweepy.API(authorisation)
    # Requests most recent tweets from a user's timeline
    tweets = api.user_timeline(screen_name=user, count=2,
                               max_id=936533580481814529)
    for tweet in tweets:
        tid = tweet.id
        print(tid)
twitter_users = ["@navalny"]
for twitter_user in twitter_users:
    tweet_check(twitter_user)
This test returns nothing before 936533580481814529.
Using a separate script I scraped all 3200 tweets, the maximum Twitter will let you scrape, and the oldest tweet ID I could find is 943856915536326662.
It seems you have run into Twitter's tweet scraping limit for user timelines here.
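To see which dates these IDs correspond to without calling the API, the creation time can be read out of the ID itself. This is a small sketch that assumes Twitter's publicly documented Snowflake layout (millisecond timestamp in the bits above the lowest 22, counted from an epoch of 1288834974657 ms); tweet_id_to_datetime is a hypothetical helper name, and it only applies to IDs issued since Snowflake was introduced in late 2010.

```python
from datetime import datetime, timezone

# Start of the Snowflake epoch in milliseconds since the Unix epoch.
TWITTER_EPOCH_MS = 1288834974657


def tweet_id_to_datetime(tweet_id):
    """Return the UTC creation time encoded in a Snowflake tweet ID."""
    timestamp_ms = (tweet_id >> 22) + TWITTER_EPOCH_MS
    return datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)


for tweet_id in (933000445307518976, 936533580481814529, 943856915536326662):
    print(tweet_id, tweet_id_to_datetime(tweet_id).isoformat())
```

Comparing the dates printed for the requested range with the date of the oldest tweet reachable through user_timeline shows whether the range falls outside the 3200-tweet window.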
The code below was provided to another user who was scraping the "friends" (not followers) list of a specific Twitter user. For some reason, I get an error when using "api.lookup_users". The error states "Too many terms specified in query". Ideally, I would like to scrape the followers and output a csv with the screen names (not ids). I would like their descriptions as well, but can do this in a separate step unless there is a suggestion for pulling both pieces of information. Below is the code that I am using:
import time
import tweepy
import csv
#Twitter API credentials
consumer_key = ""
consumer_secret = ""
access_key = ""
access_secret = ""
auth = tweepy.auth.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_key, access_secret)
api = tweepy.API(auth)
ids = []
for page in tweepy.Cursor(api.friends, screen_name="").pages():
    ids.extend(page)
    time.sleep(60)
print(len(ids))
users = api.lookup_users(user_ids=ids)  # iterates through the list of users and prints them
for u in users:
    print(u.screen_name)
From the error you get, it seems that you are putting too many ids at once in the api.lookup_users request. Try splitting your list of ids into smaller parts and making a request for each part.
import time
import tweepy
import csv
#Twitter API credentials
consumer_key = ""
consumer_secret = ""
access_key = ""
access_secret = ""
auth = tweepy.auth.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_key, access_secret)
api = tweepy.API(auth)
ids = []
# friends_ids yields plain user ids, which is what lookup_users expects
for page in tweepy.Cursor(api.friends_ids, screen_name="").pages():
    ids.extend(page)
    time.sleep(60)
print(len(ids))
# the users/lookup endpoint accepts at most 100 ids per request
idChunks = [ids[i:i + 100] for i in range(0, len(ids), 100)]
users = []
for idChunk in idChunks:
    try:
        users.extend(api.lookup_users(user_ids=idChunk))
    except tweepy.error.RateLimitError:
        print("RATE EXCEEDED. SLEEPING FOR 16 MINUTES")
        time.sleep(16*60)
        users.extend(api.lookup_users(user_ids=idChunk))
for u in users:
    print(u.screen_name)
    print(u.description)
This code has not been tested and does not write the CSV, but it should help you get past the error you were having. A chunk size of 100 is the most that the users/lookup endpoint accepts per request; smaller chunks also work.
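For the CSV part of the question, a sketch along the following lines could work. chunked and write_users_csv are hypothetical helper names, and the call uses the Tweepy 3.x lookup_users(user_ids=...) form from the code above, with at most 100 ids per request. Since lookup_users returns full user objects, the screen name and the description can be written in one pass.

```python
import csv


def chunked(items, size=100):
    """Yield consecutive slices of `items` holding at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def write_users_csv(api, ids, out_file):
    """Look up users in chunks and write screen_name and description rows as CSV."""
    writer = csv.writer(out_file)
    writer.writerow(["screen_name", "description"])
    written = 0
    for chunk in chunked(ids):
        for user in api.lookup_users(user_ids=chunk):
            writer.writerow([user.screen_name, user.description])
            written += 1
    return written
```

It could be called as write_users_csv(api, ids, f) inside with open("friends.csv", "w", newline="", encoding="utf-8") as f, where the file name is only an example.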