Find location from tweet ID - python

I have a list of tweets. For each tweet I have different attributes (user, date, text and tweet IDs).
To scrape that data, I’m using the project of Jefferson Henrique (https://github.com/Jefferson-Henrique/GetOldTweets-python).
In addition, I would like to know two geographical details for each tweet:
where the tweet was generated (a location name, or longitude and latitude), and
where the user resides.
Do you have any idea how to get these two pieces of information, either from the tweet IDs or some other way?

You might want to post the JSON of one of your tweets so that I can point out where the information is.
From my experience, the location of the tweet may not always be available, depending on whether the user allows location sharing when tweeting.
For the user location, it is usually not in the tweet. Scrape the user profile and you should easily find it.

You could query the Twitter API directly using the tweet id. This would allow you to retrieve more data about the tweet, including the location if available.
According to the Twitter API documentation:
If the Tweet is geo-tagged, there will be a "place" object included.
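As a rough sketch of where to look once you have the tweet JSON: the helper below assumes a dict decoded from a v1.1-style Tweet object, where "coordinates" is a GeoJSON point (longitude first), "place" has a "full_name", and the profile location sits under "user". The function name extract_locations is made up for illustration, and any of these fields may be null.

```python
def extract_locations(tweet):
    """Return the tweet's geo data and the author's profile location.

    `tweet` is assumed to be a dict decoded from a v1.1-style Tweet JSON
    object. Any of the fields may be missing or None, so every lookup
    falls back to an empty container.
    """
    coords = tweet.get("coordinates") or {}
    place = tweet.get("place") or {}
    user = tweet.get("user") or {}

    # GeoJSON order is [longitude, latitude].
    lon, lat = (coords.get("coordinates") or [None, None])[:2]
    return {
        "longitude": lon,
        "latitude": lat,
        "place": place.get("full_name"),
        # An empty profile location is reported as None.
        "user_location": user.get("location") or None,
    }
```

The "user_location" value is free text typed by the user, so it may not be a real place at all.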

Related

Stream and save tweepy data

I am looking to stream and save Twitter data based on a hashtag during an event. I don't pay Twitter, so my account may be subject to rate limits. Assuming I have a twitter_credentials.py with acc_secret, acc_token, con_key, and con_secret, and the hashtag #hashtag, could someone please help me build this? I'd like it to end up as a JSON object that I can then convert to pandas DataFrames.
The search method lets you get the tweets that match a query. I think you will need this function to retrieve the tweets referring to a specific hashtag.
See https://docs.tweepy.org/en/v3.10.0/api.html?highlight=search#API.search for more details.
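For the saving part, here is a minimal sketch that assumes you already have each tweet as a plain dict. It writes one JSON object per line, so the file can be appended to while the event is running. The names save_tweets and load_tweets are hypothetical helpers, not part of tweepy.

```python
import json


def save_tweets(tweets, path):
    """Append an iterable of tweet dicts to `path`, one JSON object per line.

    Returns the number of tweets written.
    """
    count = 0
    with open(path, "a", encoding="utf-8") as fh:
        for tweet in tweets:
            fh.write(json.dumps(tweet) + "\n")
            count += 1
    return count


def load_tweets(path):
    """Read the file back into a list of dicts, skipping blank lines."""
    with open(path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

With tweepy v3.x, something like [status._json for status in api.search(q="#hashtag")] should give you the dicts to pass in, and pandas.read_json(path, lines=True) should load the resulting file into a DataFrame. I have not run either against your credentials.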

Get live tweets from a specific user list... getting duplicates tweepy python

I have seen here, here and here.
I have a list of twitter users I want to stream live tweets for. But I am getting duplicate tweets. And the tweets are not live per se.
Here is the code:
users_to_follow = ['twitterid_1', 'twitterid_2', 'twitterid_3']
mystream = tweepy.Stream(self.auth, self.listener)
try:
    mystream.filter(follow=users_to_follow)
except:
    print("error!")
    mystream.disconnect()
It is bringing back the tweets but the same tweets are being duplicated. What am I doing wrong?
Cheers
Per the Twitter documentation on the follow parameter:
follow
A comma-separated list of user IDs, indicating the users whose Tweets should be delivered on the stream. Following protected users is not supported. For each user specified, the stream will contain:
Tweets created by the user.
Tweets which are retweeted by the user.
Replies to any Tweet created by the user.
Retweets of any Tweet created by the user.
Manual replies, created without pressing a reply button (e.g. “@twitterapi I agree”).
The stream will not contain:
Tweets mentioning the user (e.g. “Hello @twitterapi!”).
Manual Retweets created without pressing a Retweet button (e.g. “RT @twitterapi The API is great”).
Tweets by protected users.
When you say that "the same Tweets are being duplicated", do you mean that you're seeing the same Tweet IDs multiple times?
You also mentioned that the "Tweets are not live" but it is not clear what you mean by this.
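One way to check whether the same Tweet IDs really arrive more than once is to record every ID as it comes in. This is only a sketch: DuplicateTracker is a hypothetical helper, and it assumes each incoming tweet is a dict with an "id_str" field.

```python
class DuplicateTracker:
    """Remember which tweet IDs have been seen and record repeats."""

    def __init__(self):
        self.seen = set()
        self.duplicates = []

    def is_new(self, tweet):
        """Return True the first time a tweet ID is seen, False afterwards."""
        tweet_id = tweet["id_str"]
        if tweet_id in self.seen:
            self.duplicates.append(tweet_id)
            return False
        self.seen.add(tweet_id)
        return True
```

Note that a retweet has its own ID, so it counts as new here even though its text looks the same as the original. If the duplicates list stays empty, what you are seeing is probably retweets rather than true duplicates.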

How to get tweets in real time from a user's timeline using Tweepy

I'm trying to pull tweets from a user's timeline in real-time. I then want to do some analysis on those tweets. Having read the docs it looks like I will need to use tweepy.Stream for this use case. I've done the following:
stream.filter(follow=['25073877'])
But Twitter's filter API states the following:
Tweets created by the user.
Tweets which are retweeted by the user.
Replies to any Tweet created by the user.
Retweets of any Tweet created by the user.
Manual replies, created without pressing a reply button (e.g. “@twitterapi I agree”).
It seems that this will return a huge volume of tweets that aren't relevant to my use case. Do I have to use this approach and then filter by screen name to get only tweets by the real user? This doesn't seem right at all.
The alternative seems to be the api.user_timeline method, but that isn't a streaming API. Do I therefore use this API and hit it every second? I can't seem to find suitable examples of how best to accomplish my use case.
Yes, you'll need to filter either by screen_name, or maybe you can check whether each tweet is a retweet.
I wouldn't recommend the second approach (polling user_timeline). You would receive an even bigger amount of tweets, because every request returns tweets you already received in previous requests and you have to filter those out. You may also hit the API rate limits if you don't time it properly.
This is the signature of the filter function:
def filter(self, follow=None, track=None, is_async=False, locations=None,
           stall_warnings=False, languages=None, encoding='utf8', filter_level=None)
It maps to this Twitter API request.
And here is the explanation of the parameters.
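The filtering step could look roughly like the sketch below. It assumes each streamed tweet is available as a dict in the v1.1 format, where the author is under "user" and a retweet carries a "retweeted_status" key. The name is_own_original_tweet is just for illustration.

```python
def is_own_original_tweet(tweet, user_id):
    """Return True only for tweets authored by `user_id` that are not retweets.

    Replies and retweets by other people arrive on the same stream, so
    the author ID is checked first.
    """
    author_id = (tweet.get("user") or {}).get("id_str")
    if author_id != str(user_id):
        return False
    # Retweets made by the user embed the original under this key.
    return "retweeted_status" not in tweet
```

Comparing on the user ID rather than the screen name is a design choice: the ID is what you pass to follow, and it does not change if the user renames the account.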

Twitter API - Retrieve all replies to a certain tweet

I am trying to use the Twitter API with the Python wrapper Twython, and I want to retrieve all replies (the comments below a tweet) to a certain tweet that I found by searching for some patterns.
At the moment, I search for a string and read the screen_name field of the user object in the response for the original tweet. I then call the API again to search for the latest tweets directed at that user, by putting the substring to:screen_name in the query.
Is there a better solution? The only questions on this topic that I found were written in 2014, and I hope there have been some improvements in the meantime.

How can I get only responses tweets?

I need to retrieve specific data from twitter.
I'd like to get all the reply tweets received by a specific user (who is not the authenticating user of the program). Is there a way to achieve this? Right now I'm thinking about using the search function and checking whether 'in_reply_to_user_id_str' matches the ID of the user I want.
But this means that I need to filter a lot of data to find the tweets I want.
Edit: I'm using Python-Twitter-Tools.
If it is the authenticating user, you can directly get the response tweets using the 'mentions timeline'. As the user is not the authenticating user, you have two options here.
Streaming API
Use the filter endpoint with the 'follow' parameter. Pass the required 'user_id' to the follow parameter. You will then have to check 'in_reply_to_user_id_str' to isolate the replies (responses), because the stream returns all of the following:
Tweets created by the user.
Tweets which are retweeted by the user.
Replies to any Tweet created by the user.
Retweets of any Tweet created by the user.
Manual replies, created without pressing a reply button.
Python-Twitter-Tools supports the Streaming API, which is real-time and more complete than the Search API.
Search API
Every response tweet contains the "@username" mention. You can search for "@username" and then filter the tweets using 'in_reply_to_user_id_str', as you have mentioned.
Considering the two options, Streaming API will help you to get what you need easily and reliably.
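Whichever option you pick, the isolation step is the same check on 'in_reply_to_user_id_str'. A minimal sketch, assuming the tweets are dicts in the v1.1 format (replies_to_user is a made-up helper name):

```python
def replies_to_user(tweets, user_id):
    """Keep only the tweets that are replies addressed to `user_id`.

    Non-reply tweets have a null (None) 'in_reply_to_user_id_str',
    so they never match.
    """
    target = str(user_id)
    return [t for t in tweets if t.get("in_reply_to_user_id_str") == target]
```

This keeps replies from anyone, including the user replying to their own tweets. If you want to drop those, you would also have to compare the author ID.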
