How do I scrape historical tweets from Twitter, including their location?
I was able to get tweets with their location using the Twitter API, but it doesn't go back beyond a certain period. How do I get historical tweets for a hashtag, including their location?
I found this code on GitHub, which lets me get historical tweets, but I have been unable to modify it to get the location of the tweets. Also, how do I work out the arguments passed to the xpath and css methods? How do I modify the code to get the location of the historical tweets?
The tweet crawler I'm using is from: https://github.com/jonbakerfish/TweetScraper/blob/master/TweetScraper/spiders/TweetCrawler.py
Advice on any other way I could get this data would be appreciated.
Related
noofitem = 1000
tweets = tweepy.Cursor(api.search, q=['#iphone11, -filter:retweets'], since='2019-11-14', lang='en', tweet_mode='extended', retweeted=False).items(noofitem)
i = [tweet.full_text for tweet in tweets]  # Tweet text
I am trying to get about 1000 tweets using tweepy. But the max tweets I get are around 600. Changing the date does not work. Any modification or other workarounds will be helpful. Thanks.
Please note that Twitter’s search service and, by extension, the
Search API is not meant to be an exhaustive source of Tweets. Not all
Tweets will be indexed or made available via the search interface.
Please refer to this link for more information: http://docs.tweepy.org/en/latest/api.html#help-methods
Probably you will need to set up a Stream to get the amount of data you need.
# Initiate the connection to Twitter
twitter = Twitter(auth=oauth)
# Search for latest tweets about "pakistan"
results = twitter.search.tweets(q='pakistan', until='2008-08-19')
print(results)
I am trying to retrieve tweets that are earlier than this date by one week. It does not return anything. However, I have searched manually on twitter and found that tweets exist.
When you use the Twitter API to download tweets you will have access to tweets back to roughly one week old. This is despite the fact that you can see tweets older than one week on Twitter's website. This is a built-in limitation of the API.
To get access to a longer time span, you can do one of the following:
Download the data every day and build up an archive gradually.
Search the web for an existing dataset.
The best way is to ask Twitter to give you the data for a specific time span, which requires an API developer account. You can request a quote at this address:
https://www.trackmyhashtag.com/twitter-dataset#request-data-form
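The "download every day and build up gradually" option can be sketched as follows. This is only a minimal illustration, assuming each daily batch is a list of tweet dictionaries that carry an `id_str` field; the function name `merge_daily_batch` and the JSON archive file are hypothetical choices, not part of any Twitter library.

```python
import json
from pathlib import Path


def merge_daily_batch(archive_path, new_tweets):
    """Add tweets that are not yet in the archive; return how many were added."""
    path = Path(archive_path)
    # Load the existing archive, or start with an empty one on the first run.
    archive = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    added = 0
    for tweet in new_tweets:
        key = tweet["id_str"]  # assumed field: the tweet ID as a string
        if key not in archive:
            archive[key] = tweet
            added += 1
    path.write_text(json.dumps(archive), encoding="utf-8")
    return added
```

Running this once a day on that day's search results keeps overlapping results from being stored twice, because each tweet is keyed by its ID.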
I'm trying to get all of a user's tweets for a 2-month time period. In search I see results, but this code returns an empty array. Why?
results = api.GetSearch(raw_query="q=&from=yikyakapp&since=2014-09-24&until=2014-11-24")
print(results)
This is because the Twitter Search API has a limit of 7 days. Check the API documentation:
The Twitter Search API searches against a sampling of recent Tweets published in the past 7 days.
There is a detailed explanation here https://dev.twitter.com/rest/reference/get/search/tweets
Keep in mind that the search index has a 7-day limit. In other words,
no tweets will be found for a date older than one week.
In summary, you can't use the Twitter API to search for tweets older than 7 days. Of course, on the website they can show you whatever they want. They hold all the data.
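A quick way to see why a query like the one above comes back empty is to compare the requested dates with the 7-day window before calling the API. The sketch below is an approximation under that assumption; `clamp_to_search_window` and `SEARCH_WINDOW_DAYS` are hypothetical names introduced for illustration.

```python
from datetime import date, timedelta

SEARCH_WINDOW_DAYS = 7  # the limit quoted from the Search API documentation


def clamp_to_search_window(since, until, today=None):
    """Return the part of [since, until] that search can still reach, or None."""
    today = today or date.today()
    oldest = today - timedelta(days=SEARCH_WINDOW_DAYS)
    start = max(since, oldest)  # nothing older than the window is searchable
    end = min(until, today)
    return (start, end) if start <= end else None
```

For example, a range of 2014-09-24 to 2014-11-24 checked in December 2014 yields `None`, which matches the empty result the question describes.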
I was wondering if there was a way to retrieve tweets in a specific trend? For example, a trend right now is #AirAsia and therefore, I would like to get, for example, 50 tweets from that trend. I tried looking at tweepy but couldn't find anything except for trends in a certain location.
EDIT
I would like to clarify that I am not looking for a specific hashtag but a trend. For example, as mentioned in the comments, "Greece" is a trend but it is not a hashtag. Therefore, a trend is not necessarily a hashtag and vice versa, a hashtag is not necessarily a trend.
Use Twitter Search:
https://twitter.com/search-home
Or:
Google "twitter hashtag airAsia" or go to the address https://twitter.com/hashtag/airasia
You can retrieve as many tweets as you want in any given trend.
Note that method #1 of using Twitter's search engine allows you to search any keyword or trend, which isn't necessarily a hashtag.
Similar to the Twitter Search function, the following Search API can be used to find tweets 'containing an exact phrase': Link to Search API Documentation
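Since a trend is not necessarily a hashtag, the query sent to the Search API may need to quote the trend as an exact phrase. The following is a rough sketch of building such a query string; `build_search_params` is a hypothetical helper, and it assumes the `q` and `count` parameters of the search endpoint and the `-filter:retweets` operator used elsewhere on this page.

```python
from urllib.parse import urlencode


def build_search_params(trend, count=50, exclude_retweets=True):
    """Build a URL-encoded query string for a trend that may or may not be a hashtag."""
    # Hashtags are passed through; other trends are quoted for an exact-phrase match.
    term = trend if trend.startswith("#") else f'"{trend}"'
    if exclude_retweets:
        term += " -filter:retweets"
    return urlencode({"q": term, "count": count})


print(build_search_params("Greece"))
print(build_search_params("#AirAsia", exclude_retweets=False))
```

The resulting string can be appended to the search endpoint URL, or the same `q` value can be passed to a client library's search call.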
I have the following questions about tweepy python module
1. I am trying to retrieve all tweets for a specific location. I am able to do this with the tweepy Python module (streaming API), but I only get tweets that have geolocation enabled, which means I lose the tweets of users who have not enabled geolocation. Is there a better way to retrieve all the tweets for a given location?
2. I use the Stream.sample method to retrieve all the tweets. Can someone tell me about the parameters of the sample method? I see count and async as parameters. What should we specify here?
3. What does the firehose method in tweepy.Stream do?
Any help is much appreciated
If tweepy doesn't have a feature you need, you can always access Twitter directly with an HTTP request. The full Twitter REST API is described here: https://dev.twitter.com/docs/api
The ones that seem relevant to your interest are:
GET trends/:woeid, which looks up trending topics by WOEID, a Yahoo identifier for a given place, landmark, etc.
GET geo/id/:place_id, which returns information about a known place and is therefore only useful for geotagged tweets.
There is documentation of all the information available for a GET request but the IP address is not among the available fields: https://dev.twitter.com/docs/api/1/get/search .
Lastly, Twitter has a location search FAQ that may be of interest.
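Because only some tweets are geotagged, code that needs a location usually has to fall back from exact coordinates to the tagged place and then to the free-text profile location. The sketch below assumes tweets are plain dictionaries with the `coordinates`, `place` and `user.location` fields of the v1.1 tweet JSON; `extract_location` is a hypothetical helper, not a tweepy function.

```python
def extract_location(tweet):
    """Return (kind, value) for the most precise location a tweet carries."""
    coords = tweet.get("coordinates")
    if coords:
        lon, lat = coords["coordinates"]  # GeoJSON order is longitude, latitude
        return ("point", (lat, lon))
    place = tweet.get("place")
    if place:
        return ("place", place.get("full_name"))
    # Profile location is free text typed by the user, so treat it as a hint only.
    profile = (tweet.get("user") or {}).get("location")
    if profile:
        return ("profile", profile)
    return ("none", None)
```

Tweets that return `("none", None)` are the ones a location-filtered stream would miss entirely.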