I am trying to get the top 20 trending topics through the Twitter API using the Tweepy library.
Here is my Python code:
import tweepy
import json
import time
today = time.strftime("%Y-%m-%d")
CONSUMER_KEY = ""
CONSUMER_SECRET = ""
ACCESS_KEY = ""
ACCESS_SECRET = ""
auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_KEY, ACCESS_SECRET)
api = tweepy.API(auth)
trends = api.trends_daily(today)
print trends
I am using the trends_daily function to get the top 20 trending topics for each day.
The variable "today" is a date string: today = time.strftime("%Y-%m-%d"). I also tried passing a literal string. However, it keeps reporting this error:
File "/Users/Ivy/PycharmProjects/TwitterTrend/trends.py", line 17, in <module>
trends = api.trends_daily("2014-06-03")
File "build/bdist.macosx-10.9-intel/egg/tweepy/binder.py", line 230, in _call
File "build/bdist.macosx-10.9-intel/egg/tweepy/binder.py", line 203, in execute
tweepy.error.TweepError: [{u'message': u'Sorry, that page does not exist', u'code': 34}]
I believe the problem is that trends_daily calls version 1 of the Twitter API, which is no longer supported: https://api.twitter.com/1/trends/daily.json
Try upgrading Tweepy to a release that targets API version 1.1 and use one of the 1.1 trends endpoints, for example:
https://api.twitter.com/1.1/trends/available.json
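As a rough sketch of how the top 20 names could be pulled out of a version 1.1 trends response: the function below assumes the payload shape returned by the trends/place endpoint (a list holding one object with a "trends" list). The helper name top_trend_names and the sample payload are made up for illustration. With Tweepy the payload would come from something like api.trends_place(1) in 3.x releases (I believe the method was renamed get_place_trends in 4.x), where 1 is the WOEID for "Worldwide".

```python
def top_trend_names(places_response, limit=20):
    """Return up to `limit` trend names from a trends/place style payload."""
    if not places_response:
        return []
    # The payload is a list with one entry per requested place.
    trends = places_response[0].get("trends", [])
    return [trend["name"] for trend in trends[:limit]]


# Hand-written sample in the assumed shape, not real API output.
sample_response = [{
    "trends": [{"name": "#example"}, {"name": "Python"}],
    "locations": [{"name": "Worldwide", "woeid": 1}],
}]

print(top_trend_names(sample_response))
```

Keeping the parsing separate from the API call makes it easy to check the logic without credentials.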
I am trying to gather the tweets of the user navalny from 01.11.2017 to 31.01.2018 using tweepy. I have the ids of the first and last tweets that I need, so I tried the following code:
import tweepy
consumer_key = ''
consumer_secret = ''
access_token = ''
access_token_secret = ''
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)
t = api.user_timeline(screen_name='navalny', since_id = 933000445307518976, max_id = 936533580481814529)
However, the returned value is an empty list.
What is the problem here?
Are there any restrictions on the history of tweets that I can get?
What are possible solutions?
Quick answer:
Using Tweepy you can only retrieve the last 3200 tweets from the Twitter REST API for a given user.
Unfortunately the tweets you are trying to access are older than this.
Detailed answer:
I did a check using the code below:
import tweepy
from tweepy import OAuthHandler
def tweet_check(user):
    """
    Fetches a user's most recent tweets at or below a fixed max_id
    """
    # API keys and initial configuration
    consumer_key = ""
    consumer_secret = ""
    access_token = ""
    access_secret = ""
    # Configure authentication
    authorisation = OAuthHandler(consumer_key, consumer_secret)
    authorisation.set_access_token(access_token, access_secret)
    api = tweepy.API(authorisation)
    # Requests the most recent tweets from a user's timeline
    tweets = api.user_timeline(screen_name=user, count=2,
                               max_id=936533580481814529)
    for tweet in tweets:
        tid = tweet.id
        print(tid)
# Screen names are passed without a leading "#" or "@"
twitter_users = ["navalny"]
for twitter_user in twitter_users:
    tweet_check(twitter_user)
This test returns nothing at or before 936533580481814529.
Using a separate script I scraped all 3200 tweets, the maximum Twitter will let you retrieve, and the oldest tweet ID I could find is 943856915536326662.
It seems you have run into Twitter's limit on user timelines here.
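A check like the one described above can be sketched as a loop that keeps requesting older pages until none are left. This is only an illustration: oldest_reachable_id, fake_fetch, and FakeTweet are hypothetical names, and the fake fetcher stands in for a real call such as api.user_timeline(screen_name=..., count=200, max_id=...) so the logic can run without credentials.

```python
from collections import namedtuple


def oldest_reachable_id(fetch_page):
    """Walk backwards through a timeline and return the oldest tweet ID seen.

    fetch_page(max_id) must return a list of objects with an integer .id;
    max_id is None for the first page. Returns None for an empty timeline.
    """
    oldest = None
    max_id = None
    while True:
        page = fetch_page(max_id)
        if not page:
            return oldest
        oldest = min(tweet.id for tweet in page)
        # max_id is inclusive, so step just past the oldest tweet seen so far.
        max_id = oldest - 1


# Stand-in data and fetcher (hypothetical), newest first like a real timeline.
FakeTweet = namedtuple("FakeTweet", "id")
FAKE_IDS = [50, 40, 30, 20, 10]


def fake_fetch(max_id, page_size=2):
    ids = [i for i in FAKE_IDS if max_id is None or i <= max_id]
    return [FakeTweet(i) for i in ids[:page_size]]


print(oldest_reachable_id(fake_fetch))
```

If the ID returned for a real account is newer than the max_id you are asking for, the requested range is outside what the timeline endpoint will return.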
I am trying to query a specified user's tweets that include a specified keyword in the tweet text. Here is my code:
# Import Tweepy, sleep, credentials.py
import tweepy
from time import sleep
from credentials import *
# Access and authorize our Twitter credentials from credentials.py
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)
SCREEN_NAME = "BachelorABC"
KEYWORD = "TheBachelor"
def twtr2():
    raw_tweets = tweepy.Cursor(api.search, q=KEYWORD, lang="en").items(50)
    for tweet in raw_tweets:
        if tweet['user']['screen_name'] == SCREEN_NAME:
            print tweet
twtr2()
I get the error message as below:
Traceback (most recent call last):
File "test2.py", line 19, in <module>
twtr2()
File "test2.py", line 17, in twtr2
if tweet['user']['screen_name'] == SCREEN_NAME:
TypeError: 'Status' object has no attribute '__getitem__'
I googled a lot and thought that maybe I needed to save Twitter's JSON in python first, so I tried the following:
import tweepy, json
from time import sleep
from credentials import *
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)
SCREEN_NAME = "BachelorABC"
KEYWORD = "TheBachelor"
raw_tweets = tweepy.Cursor(api.search, q=KEYWORD, lang="en").items(50)
for tweet in raw_tweets:
    load_tweet = json.loads(tweet)
    if load_tweet['user']['screen_name'] == SCREEN_NAME:
        print tweet
However, this fails as well:
Traceback (most recent call last):
File "test2.py", line 35, in <module>
load_tweet = json.loads(tweet)
File "C:\Python27\lib\json\__init__.py", line 339, in loads
return _default_decoder.decode(s)
File "C:\Python27\lib\json\decoder.py", line 364, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
TypeError: expected string or buffer
Does anyone know what's wrong with my code? And can you help me to fix it?
Thanks in advance!
I figured it out. Here is the solution:
# Import Tweepy, sleep, credentials.py
import tweepy
from time import sleep
from credentials import *
# Access and authorize our Twitter credentials from credentials.py
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)
SCREEN_NAME = "BachelorABC"
KEYWORD = "TheBachelor"
for tweet in tweepy.Cursor(api.search, q=KEYWORD, lang="en").items(200):
    if tweet.user.screen_name == SCREEN_NAME:
        print tweet.text
        print tweet.user.screen_name
Please note that this is not an efficient way to find tweets that satisfy both conditions (screen_name and keyword), because we search by keyword first and only then filter by screen_name. If the keyword is very popular, like "TheBachelor" here, and we fetch a limited number of tweets (200), it may turn out that none of the 200 were sent by the specified screen_name. Querying by screen_name first and then filtering by keyword would probably give better results, but that is outside the scope of this question.
I will leave you here.
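For what it is worth, here is a small sketch of the screen-name-first idea mentioned above. It shows two options: building a search query with Twitter's from: operator, so the search itself is restricted to one account, and filtering an already fetched timeline by keyword. The function names and the FakeTweet stand-in are hypothetical; with Tweepy the query string would be passed as q, and the tweets would come from something like api.user_timeline(screen_name=SCREEN_NAME).

```python
from collections import namedtuple


def build_search_query(keyword, screen_name):
    """Combine a keyword with the from: search operator."""
    return "from:{} {}".format(screen_name, keyword)


def filter_by_keyword(tweets, keyword):
    """Keep tweets whose text contains the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [tweet for tweet in tweets if needle in tweet.text.lower()]


# Stand-in for Tweepy Status objects, which expose the tweet body as .text.
FakeTweet = namedtuple("FakeTweet", "text")
timeline = [
    FakeTweet("Tonight on #TheBachelor"),
    FakeTweet("Unrelated announcement"),
]

print(build_search_query("TheBachelor", "BachelorABC"))
for tweet in filter_by_keyword(timeline, "TheBachelor"):
    print(tweet.text)
```

Either way, the 200-item budget is spent only on tweets from the account of interest.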
The issue is with this line:
load_tweet = json.loads(tweet)
The "tweet" object is a Tweepy Status object, not a JSON string, so json.loads cannot parse it. If you want to use JSON objects, follow this stackoverflow post on how to use JSON objects with tweepy.
To achieve what you are trying to do (print each tweet of a feed of 50), I would follow what was stated in the getting started docs:
import tweepy
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)
public_tweets = api.home_timeline()
for tweet in public_tweets:
    print(tweet.text)
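To illustrate the difference between a parsed object and JSON text, here is a minimal sketch that needs no credentials. FakeStatus is a made-up stand-in that mimics how Tweepy's Status exposes parsed attributes plus the original dictionary in its _json attribute; it is not Tweepy's actual class.

```python
import json


class FakeStatus(object):
    """Stand-in for a Status: parsed attributes plus the raw dict in _json."""

    def __init__(self, raw):
        self._json = raw
        self.text = raw["text"]


status = FakeStatus({"text": "hello", "user": {"screen_name": "BachelorABC"}})

# json.loads expects a string, so handing it an object raises TypeError.
try:
    json.loads(status)
except TypeError as exc:
    print("json.loads rejected the object: {}".format(type(exc).__name__))

# The raw dictionary supports the ['user']['screen_name'] style of access...
raw = status._json
print(raw["user"]["screen_name"])

# ...and can be turned into a JSON string when one is needed.
serialized = json.dumps(raw)
print(serialized)
```

In short: use attribute access (tweet.user.screen_name) on the object, or dictionary access on its raw data, but do not pass the object itself to json.loads.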
I am trying to connect to the Twitter REST API. I installed the twitter package with pip from the command line. In another program I wrote, I was able to connect to the Twitter streaming API. This is my code:
# Import the necessary package to process data in JSON format
try:
    import json
except ImportError:
    import simplejson as json
# Import the necessary methods from "twitter" library
from twitter import Twitter, OAuth, TwitterHTTPError, TwitterStream
# Variables that contains the user credentials to access Twitter API
ACCESS_TOKEN = ''
ACCESS_SECRET = ''
CONSUMER_KEY = ''
CONSUMER_SECRET = ''
oauth = OAuth(ACCESS_TOKEN, ACCESS_SECRET, CONSUMER_KEY, CONSUMER_SECRET)
twitter = Twitter(auth=OAuth)
twitter.statuses.home_timeline()
I am getting this error:
line 263, in __call__
headers.update(self.auth.generate_headers())
TypeError: generate_headers() missing 1 required positional argument: 'self'
How can I fix it?
I believe what you are trying to do is pass the oauth instance you created rather than the OAuth class itself:
twitter = Twitter(auth=oauth)
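The error message makes sense once you see what happens when a method is called on a class instead of an instance. The sketch below reproduces it with a made-up Auth class (not the twitter package's OAuth) under Python 3:

```python
class Auth(object):
    """Hypothetical stand-in for an auth class with an instance method."""

    def __init__(self, token):
        self.token = token

    def generate_headers(self):
        return {"Authorization": self.token}


auth = Auth("abc")

# Calling the method on an instance works: self is filled in automatically.
print(auth.generate_headers())

# Calling it on the class leaves self unfilled, hence the TypeError.
error = None
try:
    Auth.generate_headers()
except TypeError as exc:
    error = exc
    print("TypeError: {}".format(exc))
```

Passing auth=OAuth hands the library the class, so its later call to self.auth.generate_headers() fails in the same way.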
I am very new to Python, having taught it to myself just a few weeks ago. I have tried to cobble together some simple scripts using Tweepy to do various things with the Twitter API. I have been trying to get the Search API working, but to no avail. I have the following code to simply search the last 7 days of Twitter for keywords.
# 1.Import required libs and used objects/libs
import tweepy
from tweepy import OAuthHandler
from tweepy import API
from tweepy import Cursor
#2. GET or input App keys and tokens. Here keys/tokens are pasted from Twitter.
ckey = 'key'
csecret = 'secret'
atoken = 'token'
asecret = 'secret'
# 3. Set authorization tokens.
auth = tweepy.OAuthHandler(ckey, csecret)
auth.set_access_token(atoken, asecret)
#4. Define API.
api = tweepy.API(auth)
#5. Define list or library.
for tweets in tweepy.Cursor(api.search, q = '#IS', count = 100,
                            result_type ='recent').items():
    print tweet.text
Every time I get the following error:
Traceback (most recent call last):
File "C:/Users/jbutk_001/Desktop/Tweety Test/tweepy streaming.py", line 25, in <module>
result_type ='recent')).items():
File "build\bdist.win-amd64\egg\tweepy\cursor.py", line 22, in __init__
raise TweepError('This method does not perform pagination')
TweepError: This method does not perform pagination
I also tried
for tweets in tweepy.Cursor(api.search(q = '#IS', count = 100,
                            result_type ='recent')).items():
    print tweet.text
But then I get:
Traceback (most recent call last):
File "C:/Users/jbutk_001/Desktop/Tweety Test/tweepy streaming.py", line 25, in <module>
result_type ='recent').items():
File "build\bdist.win-amd64\egg\tweepy\cursor.py", line 181, in next
self.current_page = self.page_iterator.next()
File "build\bdist.win-amd64\egg\tweepy\cursor.py", line 101, in next
old_parser = self.method.__self__.parser
AttributeError: 'function' object has no attribute '__self__'
Can anyone please point me in the right direction? This has been driving me nuts for the past few days.
Thanks.
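First of all, some of your imports are redundant: you import OAuthHandler, API, and Cursor by name but then access them through the tweepy module anyway. You might want to read more about how imports work: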
What are good rules of thumb for Python imports?
A working example of how to make this kind of search work:
import tweepy
CONSUMER_KEY = 'key'
CONSUMER_SECRET = 'secret'
ACCESS_KEY = 'accesskey'
ACCESS_SECRET = 'accesssecret'
auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_KEY, ACCESS_SECRET)
api = tweepy.API(auth)
for tweet in tweepy.Cursor(api.search,
                           q="#IS",
                           count=100,
                           result_type="recent",
                           include_entities=True,
                           lang="en").items():
    # The tweet body is in the .text attribute; the loop variable must match
    print tweet.text
Also, I would recommend avoiding spaces in filenames: instead of "tweepy streaming.py", use something like "tweepy_streaming.py".
I am trying to output the number of followers a user has on Twitter using tweepy. I have searched high and low for answers and managed to put together some code:
import oauth, tweepy, sys, locale, threading
from time import localtime, strftime, sleep
def init():
    global api
    consumer_key = "..."
    consumer_secret = "..."
    access_key = "..."
    access_secret = "..."
    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_key, access_secret)
    api = tweepy.API(auth)
user = tweepy.api.get_user('...')
print user.screen_name
print user.followers_count
When I run this in Python, I get bad authentication errors.
Could someone please explain why this is?
Thanks
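You create an authenticated api object but then never use it. Instead you call tweepy.api, which is the module's default, unauthenticated API instance, and that is why the request fails with a bad authentication error.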
This line:
user = tweepy.api.get_user('...')
Should be:
user = api.get_user('...')