Tweepy (API V2) - Convert Response into dictionary - python

I want to get the information about the people followed by the Twitter account "POTUS" in a dictionary. My code:
import tweepy, json
client = tweepy.Client(bearer_token=x)
id = client.get_user(username="POTUS").data.id
users = client.get_users_following(id=id, user_fields=['created_at','description','entities','id', 'location', 'name', 'pinned_tweet_id', 'profile_image_url','protected','public_metrics','url','username','verified','withheld'], expansions=['pinned_tweet_id'], max_results=13)
This query returns the type "Response", which in turn stores the type "User":
Response(data=[<User id=7563792 name=U.S. Men's National Soccer Team username=USMNT>, <User id=1352064843432472578 name=White House COVID-19 Response Team username=WHCOVIDResponse>, <User id=1351302423273472012 name=Kate Bedingfield username=WHCommsDir>, <User id=1351293685493878786 name=Susan Rice username=AmbRice46>, ..., <User id=1323730225067339784 name=The White House username=WhiteHouse>], includes={}, errors=[], meta={'result_count': 13})
I've tried ._json and .json(), but neither worked.
Does anyone have any idea how I can convert this response into a dictionary object to work with?
Thanks in advance

Found the solution! Adding return_type=dict to the client will return everything as a dictionary!
client = tweepy.Client(bearer_token=x, return_type=dict)
However, you then have to change the line to get the User ID a bit:
id = client.get_user(username="POTUS")['data']['id']
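With return_type=dict, the result follows the JSON layout of the Twitter API v2 response, so the users should sit in a list under the 'data' key. A minimal sketch of working with that shape, assuming a hand-written sample payload (two of the accounts from the output above); index_by_username is a hypothetical helper, not part of Tweepy:

```python
def index_by_username(response):
    """Map each username to its user dictionary.

    `response` is assumed to look like the dict returned by a
    tweepy.Client that was created with return_type=dict.
    """
    # "data" can be missing when the query matches nothing.
    return {user["username"]: user for user in response.get("data", [])}


# Hand-written sample in the shape of a v2 users response (illustrative only).
sample_response = {
    "data": [
        {"id": "7563792", "name": "U.S. Men's National Soccer Team", "username": "USMNT"},
        {"id": "1323730225067339784", "name": "The White House", "username": "WhiteHouse"},
    ],
    "meta": {"result_count": 2},
}

users_by_name = index_by_username(sample_response)
print(users_by_name["WhiteHouse"]["id"])
```

In real use you would pass the value returned by client.get_users_following(...) instead of sample_response.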

You can do
previous_cursor, next_cursor = None, 0
while previous_cursor != next_cursor:
    followed_data = api.get_friend_ids(username="POTUS", cursor=next_cursor)
    previous_cursor, next_cursor = next_cursor, followed_data["next_cursor"]
    followed_ids = followed_data["id"]  # this is a list
    # do something with followed_ids like writing them to a file
to get the user ids of the followed accounts.
If you want the usernames and not the ids, you can do something very similar with api.get_friends(). It returns fewer items at a time, though, so if you plan to follow those accounts, using the ids will probably be quicker.

Related

Only get Tweets that mention a country

Is it possible to exclusively gather Tweets which mention countries by name? I am only gathering Tweets from the US.
I know that Twitter allows us to access context_annotations from the payload, and that context_annotations identifies whether a tweet mentions a country. The annotations overview at https://developer.twitter.com/en/docs/twitter-api/annotations/overview mentions that countries are domain number 160 in context annotations.
I'm wondering if it is possible to exclusively gather Tweets that mention country names. I am not familiar with Tweepy, so I've finally managed to obtain Tweets from the US, but am still unable to specify the code to obtain only tweets which mention countries.
This is my current code:
client = tweepy.Client(bearer_token=bearer_token)
# Specify Query
query = ' "favorite country" place_country:US'
start_time = '2022-03-05T00:00:00Z'
end_time = '2022-03-11T00:00:00Z'
tweets = client.search_all_tweets(query=query, tweet_fields=['context_annotations', 'created_at', 'geo'],
                                  place_fields=['place_type', 'geo'], expansions='geo.place_id',
                                  start_time=start_time,
                                  end_time=end_time, max_results=10000)
# Prepare to write to csv file
f = open('tweetSheet.csv','w')
writer = csv.writer(f)
# Write to csv file
for tweet in tweets.data:
    print(tweet.text)
    print(tweet.created_at)
    writer.writerow(['0', tweet.id, tweet.created_at, tweet.text])
# Close csv file
f.close()
has:geo
One way of doing this would be to keep only tweets that have country attributes.
You can use the has:geo operator in your query instead of the place_country: operator seen in the Twitter Docs. This matches every geo-tagged tweet, and every geo-tagged tweet has a country attribute.
includes
Another way would be to check whether the response has a non-empty includes attribute; it is empty when the tweet has no geo attributes: response.includes != {}. If you need the country name, response.includes['places'][0].country works just fine, and response.includes['places'][0].country_code gives the country code. It is not very well documented in the Tweepy Docs, so here are all the geo attributes found in the Twitter Docs for a tweet:
twt_geo = 1602695447298162689
twt_no_geo = 1602719044645408768
response = client.get_tweet(
    twt_geo, place_fields=['country', 'country_code', 'place_type', 'name'], expansions=['geo.place_id'])
if(response.includes != {}):
    print(response.includes)
    print(response.includes['places'][0].country)
    print(response.includes['places'][0].country_code)
    print(response.includes['places'][0].place_type)
    print(response.includes['places'][0].name)
    print(response.includes['places'][0].full_name)
    print(response.includes['places'][0])
    print(response.data.geo)
    print(response.data.geo['place_id'])
else:
    print(response.data.id)
Hashtags
If by country mentions you mean tweets that contain country names as hashtags, you can extract the tweet text with response.data.text and check it against the country names you would like to keep.
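The text comparison could look roughly like the sketch below. It is only one possible approach: COUNTRY_NAMES is a hypothetical, deliberately short list, mentioned_countries is an illustrative helper rather than anything provided by Tweepy, and the sample strings stand in for response.data.text.

```python
import re

# Hypothetical, deliberately short list; extend it with the names you care about.
COUNTRY_NAMES = ["France", "Japan", "Brazil", "United Kingdom"]

# One case-insensitive pattern with word boundaries, so "Japan" matches
# "#Japan" and "japan" but not "Japanese".
_COUNTRY_RE = re.compile(
    r"\b(" + "|".join(re.escape(name) for name in COUNTRY_NAMES) + r")\b",
    re.IGNORECASE,
)


def mentioned_countries(text):
    """Return the listed country names found in `text`, lowercased and sorted."""
    return sorted({match.lower() for match in _COUNTRY_RE.findall(text)})


print(mentioned_countries("My favorite country is #Japan, then france."))
print(mentioned_countries("I love Japanese food"))
```

A tweet would then be kept only when mentioned_countries(tweet.text) returns a non-empty list. Plain name matching misses abbreviations and alternative spellings, so treat it as a first filter.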

TypeError in Python Program

Hello so I am writing a program that will prompt for a location, contact a web service and retrieve JSON for the web service and parse that data, and retrieve the first place_id from the JSON.
I am trying to find the place_id for: Shanghai Jiao Tong University
I have my code written, but I just can't get it to work. It has to be a small error because when I run it, I get a message that says
place_id = process_json['results'][0]['place_id']
TypeError: list indices must be integers or slices, not str
Here is my code
import urllib.request, urllib.parse, urllib.error
import json
serviceurl = 'http://py4e-data.dr-chuck.net/geojson??'
while True:
    location = input('Enter location: ')
    if len(location) < 1: break
    url = serviceurl + urllib.parse.urlencode(
        {'address': location})
    print ('Retrieving', url)
    data = urllib.request.urlopen(url)
    read_data = data.read().decode()
    print ('Retrieved',len(read_data),'characters')
    try:
        process_json = json.loads(read_data)
    except:
        process_json = None
    place_id = process_json['results'][0]['place_id']
    print ('Place id:', place_id)
The problem here is that you're treating a list like a dictionary. A list is, as the name implies, a sequence of items indexed by position: 0, 1, 2... A dictionary is similar, except that each item is looked up by a named key instead of a position.
Your code fails because the JSON returned from the URL is a list, so indexing it with the string 'results' raises the TypeError. It looks like this:
[
"AGH University of Science and Technology",
"Academy of Fine Arts Warsaw Poland",
"American University in Cairo",
"Arizona State University",
"Athens Information Technology",
"BITS Pilani",
]
It seems you're trying to find the place_id of a university, but there is no place_id in the data you're searching. If there were, your approach would be correct, although it does not account for the user not typing the exact name of the university.
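One way to avoid the TypeError is to check the shape of the parsed JSON before indexing into it. The sketch below assumes the expected payload is a dictionary with a 'results' list, as the question's code implies; first_place_id and both sample payloads are made up for illustration.

```python
import json


def first_place_id(raw_text):
    """Return the first place_id in a JSON payload, or None if there is none."""
    try:
        payload = json.loads(raw_text)
    except json.JSONDecodeError:
        return None
    # A list (or anything else that is not a dict) cannot be indexed by 'results'.
    if not isinstance(payload, dict):
        return None
    results = payload.get("results")
    if not isinstance(results, list) or not results:
        return None
    first = results[0]
    return first.get("place_id") if isinstance(first, dict) else None


list_payload = '["Arizona State University", "BITS Pilani"]'
dict_payload = '{"results": [{"place_id": "example-id"}]}'
print(first_place_id(list_payload))
print(first_place_id(dict_payload))
```

With the list payload the function returns None instead of raising, which makes it easy to print a "not found" message and prompt for another location.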

Tweepy: get all friends of a sample of twitter accounts: how to handle protected users

I want to look up all the friends (meaning the twitter users one is following) of a sample of friends of one twitter account, to see what other friends they have in common. The problem is that I don't know how to handle protected accounts, and I keep running into this error:
tweepy.error.TweepError: Not authorized.
This is the code I have:
...
screen_name = ----
file_name = "followers_data/follower_ids-" + screen_name + ".txt"
with open(file_name) as file:
    ids = file.readlines()
num_samples = 30
ids = [x.strip() for x in ids]
friends = [[] for i in range(num_samples)]
for i in range(0, num_samples):
    id = random.choice(ids)
    for friend in tweepy.Cursor(api.friends_ids, id).items():
        print(friend)
        friends[i].append(friend)
I have a list of all friends from one account screen_name, from which I load the friend ids. I then want to sample a few of those and look up their friends.
I have also tried something like this:
def limit_handled(cursor, name):
    try:
        yield cursor.next()
    except tweepy.TweepError:
        print("Something went wrong... ", name)
        pass
for i in range(0, num_samples):
    id = random.choice(ids)
    items = tweepy.Cursor(api.friends_ids, id).items()
    for friend in limit_handled(items, id):
        print(friend)
        friends[i].append(friend)
But then it seems like only one friend per sample friend is stored before moving on to the next sample. I'm pretty new to Python and Tweepy so if anything looks weird, please let me know.
First of all, a couple of comments on naming. The names file and id shadow Python built-ins (file in Python 2, id in every version), so you should avoid using them as variable names. I have changed these.
Secondly, when you initialise your tweepy API, it's clever enough to deal with rate limits if you use wait_on_rate_limit=True and will inform you when it's delayed due to rate limits if you use wait_on_rate_limit_notify=True.
You also lose some information when you set friends = [[] for i in range(num_samples)], as you then won't be able to associate the friends you find with the account they relate to. You can instead use a dictionary, which will associate each ID used with the friends found, allowing for better processing.
My corrected code is as follows:
import tweepy
import random
consumer_key = '...'
consumer_secret = '...'
access_token = '...'
access_token_secret = '...'
# OAuth process, using the keys and tokens
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
# Creation of the actual interface, using authentication. Use rate limits.
api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True)
screen_name = '----'
file_name = "followers_data/follower_ids-" + screen_name + ".txt"
with open(file_name) as f:
    ids = [x.strip() for x in f.readlines()]
num_samples = 30
friends = dict()
# Initialise i
i = 0
# We want to check that i is less than our number of samples, but we also need to make
# sure there are IDs left to choose from.
while i < num_samples and ids:
    current_id = random.choice(ids)
    # remove the ID we're testing from the list, so we don't pick it again.
    ids.remove(current_id)
    try:
        # try to get friends, and add them to our dictionary value if we can
        # use .get() to cope with the first loop.
        for page in tweepy.Cursor(api.friends_ids, current_id).pages():
            friends[current_id] = friends.get(current_id, []) + page
        i += 1
    except tweepy.TweepError:
        # we get a tweep error when we can't view a user - skip them and move onto the next.
        # don't increment i as we want to replace this user with someone else.
        print('Could not view user {}, skipping...'.format(current_id))
The output is a dictionary, friends, whose keys are user IDs and whose values are the lists of friend IDs for each user.
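Since the original goal was to see which friends the sampled accounts have in common, a dictionary of that shape can be summarised with a Counter. This is a rough sketch: common_friends and the sample IDs are made up for illustration, and the real friends dictionary would come from the loop above.

```python
from collections import Counter


def common_friends(friends, min_accounts=2):
    """Return {friend_id: count} for friends followed by at least `min_accounts` sampled users."""
    counts = Counter()
    for friend_ids in friends.values():
        # set() so a duplicated ID within one account is counted once.
        counts.update(set(friend_ids))
    return {fid: n for fid, n in counts.items() if n >= min_accounts}


# Made-up IDs purely for illustration.
sample_friends = {
    "111": [1, 2, 3],
    "222": [2, 3, 4],
    "333": [3, 5],
}
print(common_friends(sample_friends))
```

Raising min_accounts narrows the result to friends shared by more of the sampled accounts.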

Tweepy Look up ID with username

I am trying to get a list of IDs from a list of usernames I have. Does Tweepy provide a method that lets me look up user IDs by username?
Twitter API has the resource https://dev.twitter.com/rest/reference/get/users/lookup for such requirements. It can return user objects for at most 100 users at a time.
You can use this in Tweepy like:
user_objects = api.lookup_users(screen_names=list_of_at_most_100_screen_names)
user_ids = [user.id_str for user in user_objects]
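Because the endpoint accepts at most 100 users per request, a longer list has to be split into batches first. A minimal sketch of that batching step, with chunked as a hypothetical helper and placeholder names standing in for real screen names:

```python
def chunked(items, size=100):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# 250 placeholder names stand in for a real list of screen names.
screen_names = ["user{}".format(n) for n in range(250)]
batch_sizes = [len(batch) for batch in chunked(screen_names)]
print(batch_sizes)
```

Each batch can then be passed to the lookup call from the snippet above, collecting the IDs from every batch into one list.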
screen_name = unames['username']
#my username df
#0 briankrebs
#1 Dejan_Kosutic
#2 msftsecresponse
#3 PrivacyProf
#4 runasand
data = []
def return_twitterid(screen_name):
    print("The screen name is: " + screen_name)
    twitterid = client.get_user(username=screen_name)
    id = twitterid.data.id
    return id
for s in range(len(screen_name)):
    u_id = return_twitterid(screen_name[s])
    data.append(u_id)
print(data)
For anyone who landed here from Google, this is a code snippet that is basically a username to id converter. It supports twitter api V2.
# the usernames variable is a list containing all the usernames you want to convert
usernames = ["POTUS", "VP"]
users = client.get_users(usernames=usernames)
for user in users.data:
    print(user.id)

Get the highest value of a specific field from an API response in Python

I make a GET request to an API and get this back:
{"status":200,"message":"Success","data":[{"email_address":"admin#nyunets.com","password":"admin","account_id":1000,"account_type":"admin","name_prefix":null,"first_name":null,"middle_names":null,"last_name":"Admin","name_suffix":null,"non_person_name":false,"dba":"","display_name":"Admin","address1":"111 Park Ave","address2":"Floor 4","address3":"Suite 4011","city":"New York","state":"NY","postal_code":"10022","nation_code":"USA","phone1":"212-555-1212","phone2":"","phone3":"","time_zone_offset_from_utc":-5,"customer_type":2,"last_updated_utc_in_secs":1446127072},{"email_address":"mhn#nyu.com","password":"nyu123","account_id":1002,"account_type":"customer","name_prefix":"","first_name":"MHN","middle_names":"","last_name":"User","name_suffix":"","non_person_name":false,"dba":"","display_name":"MHNUser","address1":"3101 Knox St","address2":"","address3":"","city":"Dallas","state":"TX","postal_code":"75205","nation_code":"USA","phone1":"8623875097","phone2":"","phone3":"","time_zone_offset_from_utc":-5,"customer_type":2,"last_updated_utc_in_secs":1461166172},{"email_address":"mhn1#nyu.com","password":"nyu123","account_id":1004,"account_type":"customer","name_prefix":"","first_name":"MHN1","middle_names":"","last_name":"User","name_suffix":"","non_person_name":false,"dba":"","display_name":"MHN1User","address1":"1010 Rosedale Shopping Center","address2":"","address3":"","city":"Roseville","state":"MN","postal_code":"55113","nation_code":"USA","phone1":"8279856982","phone2":"","phone3":"","time_zone_offset_from_utc":-5,"customer_type":2,"last_updated_utc_in_secs":1461166417},{"email_address":"location#nyu.com","password":"nyu123","account_id":1005,"account_type":"customer","name_prefix":"","first_name":"BB","middle_names":"","last_name":"HH","name_suffix":"","non_person_name":false,"dba":"","display_name":"BBHH","address1":"9906 Beverly Dr","address2":"9906 Beverly Dr","address3":"","city":"Beverly 
Hills","state":"CA","postal_code":"90210","nation_code":"90210","phone1":"3105559906","phone2":"","phone3":"","time_zone_offset_from_utc":-5,"customer_type":1,"last_updated_utc_in_secs":1461167224},{"email_address":"mbn1#nyu.com","password":"nyu123","account_id":1003,"account_type":"customer","name_prefix":"","first_name":"MBN1","middle_names":"","last_name":"User","name_suffix":"","non_person_name":false,"dba":"","display_name":"MBN1User","address1":"3200 S Las Vegas Blvd","address2":"","address3":"","city":"Las Vegas","state":"NV","postal_code":"89109","nation_code":"USA","phone1":"9273597497","phone2":"","phone3":"","time_zone_offset_from_utc":-5,"customer_type":1,"last_updated_utc_in_secs":1461593233},{"email_address":"mbn#nyu.com","password":"nyu123","account_id":1001,"account_type":"customer","name_prefix":"","first_name":"MBN","middle_names":"","last_name":"User","name_suffix":"","non_person_name":false,"dba":"","display_name":"MBNUser","address1":"300 Concord Road","address2":"","address3":"","city":"Billerica","state":"MA","postal_code":"01821","nation_code":"USA","phone1":"8127085695","phone2":"","phone3":"","time_zone_offset_from_utc":-5,"customer_type":1,"last_updated_utc_in_secs":1461784499},{"email_address":"usermbn#nyu.com","password":"nyu123","account_id":1006,"account_type":"customer","name_prefix":"","first_name":"User","middle_names":"","last_name":"MBN","name_suffix":"","non_person_name":false,"dba":"","display_name":"UserMBN","address1":"75 Saint Alphonsus 
Street","address2":"","address3":"","city":"Boston","state":"MA","postal_code":"01821","nation_code":"USA","phone1":"8127085695","phone2":"","phone3":"","time_zone_offset_from_utc":-5,"customer_type":1,"last_updated_utc_in_secs":1462285561},{"email_address":"emile.barnaby#example.com","password":"nyu123","account_id":2000,"account_type":"customer","name_prefix":"","first_name":"emile","middle_names":"","last_name":"barnaby","name_suffix":"","non_person_name":false,"dba":"","display_name":"emilebarnaby","address1":"300 Concord Rd","address2":"","address3":"","city":"8239grandmaraisave","state":"manitoba","postal_code":"56798","nation_code":"USA","phone1":"414-140-1435","phone2":"414-140-1435","phone3":"414-140-1435","time_zone_offset_from_utc":-5,"customer_type":1,"last_updated_utc_in_secs":1462211572}]}
I have
import requests
import json
url = "http://api/users"
accounts = requests.get(url).json()
data = json.loads(accounts)
object_with_max_account_id = max(accounts['data'], key=lambda x: x['account_id'])
print(object_with_max_account_id['account_id'])
The goal is to get the highest account ID out of it.
Usually we like to see what OPs have tried themselves, but this is pretty straightforward.
import requests
url = "http://api/users"
accounts = requests.get(url).json()
object_with_max_account_id = max(accounts['data'], key=lambda x: x['account_id'])
print(object_with_max_account_id['account_id'])
>> 2000
Edit: Apparently, you first need to parse your input as JSON.
Check out simplejson.
import simplejson as json
data_obj = json.loads(data)
The s in loads means load from string.
Then, if you want to be looping through, how about something like:
maxID = -1
# The accounts live in the list under the 'data' key of the parsed response.
for account in data_obj['data']:
    if account['account_id'] > maxID:
        maxID = account['account_id']
print("Max ID is %d" % maxID)
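Both approaches can be combined into one small function that also copes with an empty account list. This is a sketch under stated assumptions: max_account_id is a hypothetical helper, and the sample string is a trimmed-down stand-in for the response above that keeps only the account_id field.

```python
import json


def max_account_id(payload_text):
    """Return the largest account_id in the payload, or None if there are no accounts."""
    payload = json.loads(payload_text)
    account_ids = [account["account_id"] for account in payload.get("data", [])]
    # default=None avoids the ValueError that max() raises on an empty sequence.
    return max(account_ids, default=None)


# Trimmed-down stand-in for the real response: only the field that matters here.
sample = (
    '{"status": 200, "message": "Success", '
    '"data": [{"account_id": 1000}, {"account_id": 2000}, {"account_id": 1006}]}'
)
print(max_account_id(sample))
```

If the response was already parsed with requests' .json(), skip json.loads and apply the same list comprehension to the resulting dictionary.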
