Accessing LinkedIn data via API using python (and authorisation in general) - python

I'm trying to access LinkedIn data via API (I don't have an app, I just want to access company data - or see what can be accessed). There are other questions here on this topic, but most are out of date (using packages which predate LinkedIn's current authorisation process).
I followed the LinkedIn documentation on authorisation: https://developer.linkedin.com/docs/oauth2
I created an application (using a nonsense website url as I do not have a website). This gave me a Client ID and Client Secret.
Using (out of date) stuff from LinkedIn (https://github.com/linkedin/api-get-started/blob/master/python/tutorial.py) I wrote:
import oauth2 as oauth
import urllib.parse as urlparse
consumer_key = 'my client id e.g. sjd6ffdf6262d'
consumer_secret = 'my customer secret e.g. d77373hhfh'
request_token_url = 'https://api.linkedin.com/uas/oauth/requestToken'
access_token_url = 'https://api.linkedin.com/uas/oauth/accessToken'
authorize_url = 'https://api.linkedin.com/uas/oauth/authorize'
consumer = oauth.Consumer(consumer_key, consumer_secret)
client = oauth.Client(consumer)
resp,content = client.request(request_token_url, "POST")
request_token = dict(urlparse.parse_qsl(content))
clean_request_token = {}
for key in request_token.keys():
    # parse_qsl returns bytes here, so decode keys and values to str
    clean_request_token[key.decode('ascii')] = request_token[key].decode('ascii')
request_token = clean_request_token
print("Go to the following link in your browser:")
print("%s?oauth_token=%s" % (authorize_url, request_token['oauth_token']))
This link takes me to a website where I 'give permission', and am then shown a pin code. Using this pin (called oauth_verifier here):
oauth_verifier = 12345
token = oauth.Token(request_token['oauth_token'],
                    request_token['oauth_token_secret'])
token.set_verifier(oauth_verifier)
client = oauth.Client(consumer, token)
content = client.request(access_token_url,"POST")
access_token = dict(urlparse.parse_qsl(content[1]))
clean_access_token = {}
for key in access_token.keys():
    clean_access_token[key.decode('ascii')] = access_token[key].decode('ascii')
access_token = clean_access_token
token = oauth.Token(key=access_token['oauth_token'],secret=access_token['oauth_token_secret'])
client = oauth.Client(consumer, token)
response = client.request("http://api.linkedin.com/v1/companies/barclays")
This response has a 401 code, due to "The token used in the OAuth request has been revoked."
The underlying problems are:
I don't really get how APIs work, how they work with python, how authorisation works or how to know the api url I need.
In case relevant, I have experience web scraping (using requests plus beautiful soup to parse) but not with APIs.

I eventually worked it out, posting here in case anyone comes this way. Before you invest time, I also found out that the freely available API now only allows you to access your own profile or company page. So you can write an app that allows a user to post to their own page, but you can't write something to grab data. See here:
LinkedIn API unable to view _any_ company profile
Anyway, to get the limited API working, you need to:
Create a LinkedIn account, create an application and add a redirect URL to your application page (I used http://localhost:8000). This doc says how to set up the app: https://developer.linkedin.com/docs/oauth2
Following the steps in the above link, but in python, you make a request to gain an "access code".
html = requests.get("https://www.linkedin.com/oauth/v2/authorization",
                    params = {'response_type':'code','client_id':client_id,
                              'redirect_uri':'http://localhost:8000',
                              'state':'somestring'})
Print html.url to get a long link and open it in a browser. You'll be asked to log in and allow access, and then you'll be redirected to your redirect URL. There'll be nothing there, but the URL will have a long "access code" (the code query parameter) on the end of it. Pull this out and send it to LinkedIn with a POST request:
token = requests.post('https://www.linkedin.com/oauth/v2/accessToken',
                      data = {'grant_type':'authorization_code','code':access_code,
                              'redirect_uri':'http://localhost:8000',
                              'client_id':client_id,'client_secret':client_secret})
token.content will contain an "access_token". This is what is needed to access the API. e.g. to access your own profile:
headers = {'x-li-format': 'json', 'Content-Type': 'application/json'}
params = {'oauth2_access_token': access_token}
html = requests.get("https://api.linkedin.com/v1/people/~",headers=headers,params = params)
Hopefully that's useful to someone starting from scratch. The info is mostly out there, but there are lots of assumed steps (like how to use the access token with requests).
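The manual steps above (building the link, then copying the code out of the redirect URL) can be sketched with the standard library alone. This is a minimal sketch rather than LinkedIn's documented client: build_authorization_url and extract_authorization_code are hypothetical helper names, the client id and redirect URL are placeholders, and the state comparison is an assumption about how you might guard against a response that does not belong to your request.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

AUTHORIZATION_URL = "https://www.linkedin.com/oauth/v2/authorization"


def build_authorization_url(client_id, redirect_uri, state):
    """Build the link the user opens in a browser (no HTTP request needed)."""
    query = urlencode({
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "state": state,
    })
    return AUTHORIZATION_URL + "?" + query


def extract_authorization_code(redirected_url, expected_state):
    """Pull the `code` parameter out of the URL the browser was redirected to."""
    params = parse_qs(urlsplit(redirected_url).query)
    # Reject responses whose state does not match the one we sent.
    if params.get("state", [None])[0] != expected_state:
        raise ValueError("state mismatch: response does not belong to this request")
    if "code" not in params:
        raise ValueError("no authorization code in redirect URL")
    return params["code"][0]


if __name__ == "__main__":
    # Placeholder values, not real credentials.
    print(build_authorization_url("my-client-id", "http://localhost:8000", "somestring"))
    pasted = "http://localhost:8000/?code=AQT-example&state=somestring"
    print(extract_authorization_code(pasted, "somestring"))
```

The returned code is what goes into the 'code' field of the POST to the accessToken URL shown earlier.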

Related

Access LinkedIn Profile with Python

I am trying to computationally access my own LinkedIn profile via API to download my own posts. There are three recent Python wrappers to access my profile, e.g. linkedin-sdk, pawl, LinkedIn V2. However, I have been unable to make them work. The problem is the authentication. I have seen the famous LinkedIn-API wrapper, but its authentication process is complex and difficult probably due to LinkedIn changing its authentication process and access scope.
Based on this tutorial from last year I have been able to access my own profile to view my name, country, language and id.
import requests
#get access_token by post with user & password
#Step 1 - GET to request for authentication
def get_auth_link():
    URL = "https://www.linkedin.com/oauth/v2/authorization"
    client_id= 'XXXX'
    redirect_uri = 'http://localhost:8080/login'
    scope='r_liteprofile'
    PARAMS = {'response_type':'code', 'client_id':client_id, 'redirect_uri':redirect_uri, 'scope':scope}
    r = requests.get(url = URL, params = PARAMS)
    return_url = r.url
    print('Please copy the URL and paste it in browser for getting authentication code')
    print('')
    print(return_url)
get_auth_link()
# Make a POST request to exchange the Authorization Code for an Access Token
import json
def get_access_token():
    headers = {'Content-Type': 'application/x-www-form-urlencoded', 'User-Agent': 'OAuth gem v0.4.4'}
    AUTH_CODE = 'XXXX'
    ACCESS_TOKEN_URL = 'https://www.linkedin.com/oauth/v2/accessToken'
    client_id= 'XXXX'
    client_secret= 'XXXX'
    redirect_uri = 'http://localhost:8080/login'
    PARAM = {'grant_type': 'authorization_code',
             'code': AUTH_CODE,
             'redirect_uri': redirect_uri,
             'client_id': client_id,
             'client_secret': client_secret}
    response = requests.post(ACCESS_TOKEN_URL, data=PARAM, headers=headers, timeout=600)
    data = response.json()
    print(data)
    access_token = data['access_token']
    return access_token
get_access_token()
access_token = 'XXXX'
def get_profile(access_token):
    URL = "https://api.linkedin.com/v2/me"
    headers = {'Content-Type': 'application/x-www-form-urlencoded','Authorization':'Bearer {}'.format(access_token),'X-Restli-Protocol-Version':'2.0.0'}
    response = requests.get(url=URL, headers=headers)
    print(response.json())
get_profile(access_token)
As soon as I change the scope from r_liteprofile to r_basicprofile, I get an unauthorized_scope_error: r_basicprofile is not authorised for your application. On my developer page I have the scopes r_emailaddress, r_liteprofile and w_member_social authorised, but only r_liteprofile works. From what I understand from the LinkedIn documentation, comments cannot be downloaded?
So the big question really is, can comments be downloaded via API?
Bots or scrapers are not an option as they require explicit permission from LinkedIn, which I do not have.
Update: no illegal solutions please. I knew they existed before I wrote this post.
Thanks for your help!
I found that logging in with the linkedin-api by tomquirk was really easy. However, a KeyError was raised when a post did not have any comments. I fixed it in a fork and just submitted a pull request. If you install the fork with python setup.py install, the following code will get all your posts with comments:
from linkedin_api import Linkedin
import getpass
print("Please enter your LinkedIn credentials first (2FA must be disabled)")
username = input("user: ")
password = getpass.getpass('password: ')
api = Linkedin(username, password)
my_public_id = api.get_user_profile()['miniProfile']['publicIdentifier']
my_posts = api.get_profile_posts(public_id=my_public_id)
for post in my_posts:
    post_urn = post['socialDetail']['urn'].rsplit(':', 1)[1]
    print('POST:' + post_urn + '\n')
    comments = api.get_post_comments(post_urn, comment_count=100)
    for comment in comments:
        commenter = comment['commenter']['com.linkedin.voyager.feed.MemberActor']['miniProfile']
        print(f"\t{commenter['firstName']} {commenter['lastName']}: {comment['comment']['values'][0]['value']}\n")
Note: this does not use the official API, and according to the README.md:
This project violates Linkedin's User Agreement Section 8.2, and because of this, Linkedin may (and will) temporarily or permanently ban your account.
However, as long as you scrape comments only from your own account you should be fine.
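The KeyError mentioned above is the usual failure mode when indexing deep into nested response dictionaries. A tolerant lookup can be sketched as follows. Everything here is an assumption for illustration: dig and comment_text are hypothetical helpers, and the sample dictionaries only mirror the field names used in the snippet above, not any documented schema.

```python
def dig(data, *keys, default=None):
    """Walk nested dicts/lists, returning `default` when any step is missing."""
    current = data
    for key in keys:
        try:
            current = current[key]
        except (KeyError, IndexError, TypeError):
            return default
    return current


def comment_text(comment):
    """Return the first text value of a comment, or an empty string."""
    return dig(comment, "comment", "values", 0, "value", default="")


if __name__ == "__main__":
    # Hypothetical sample data shaped like the loop above expects.
    with_text = {"comment": {"values": [{"value": "Nice post"}]}}
    without_text = {"comment": {}}
    print(repr(comment_text(with_text)))
    print(repr(comment_text(without_text)))
```

Returning a default instead of raising keeps the loop over posts running when one entry lacks a field.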
There are two legal options to download comments that do not breach LinkedIn's terms and conditions. Both require LinkedIn's permission.
Option A: Comment API
The Comment API is part of the Page Management APIs which in turn is part of the Marketing Developer Program (MDP). LinkedIn describes the application process for its marketing developer program here. It requires filling out a form specifying the use case. Then LinkedIn decides whether to grant access. These use cases will be restricted or not approved.
Option B: Web crawling and scraping LinkedIn with an exemption (whitelist)
The exemption process is described here.
I am going for option A. Let's see if they give me access. I'll update the post accordingly.
Update 19/05/2022
LinkedIn has granted the permissions for the MDP. It took about 2 weeks.
Update 27/05/2022
Here is a great tutorial to get individual posts. Getting company page posts is another story entirely, so I opened a new question.

Can't register webhook for Twitter Account Activity API [python3]

I'm trying to set up a Twitter app using the Account Activity API, to replace my old set up which used the user streaming endpoint. I want to be able to get DM messages to one user sent to a particular URL in real time.
Following these migration instructions, I've set up a webhook endpoint on my site, as described here. I've checked that process is working, by making sure that when I open https://example.com/webhook_endpoint?crc_token=foo in my browser, I get a token in response.
Now I'm trying and failing to register my webhook. I'm using the following code, and getting a 403 response.
from requests_oauthlib import OAuth1Session
import urllib.parse
CONSUMER_KEY = 'my consumer key'
CONSUMER_SECRET = 'my consumer secret'
ACCESS_TOKEN = 'my access token'
ACCESS_SECRET = 'my access secret'
twitter = OAuth1Session(CONSUMER_KEY,
                        client_secret=CONSUMER_SECRET,
                        resource_owner_key=ACCESS_TOKEN,
                        resource_owner_secret=ACCESS_SECRET)
webhook_endpoint = urllib.parse.quote_plus('https://example.com/webhook/')
# Parentheses let the URL literal span two lines
url = ('https://api.twitter.com/1.1/account_activity/all/env-beta/'
       'webhooks.json?url={}'.format(webhook_endpoint))
r = twitter.post(url)
403 response content: {"errors":[{"code":200,"message":"Forbidden."}]}
I can successfully post a status using the same session object and
r = twitter.post('https://api.twitter.com/1.1/statuses/update.json?status=Test')
What am I doing wrong here?
This turned out to be due to a combination of:
Not having created an environment here: https://developer.twitter.com/en/account/environments as described here: https://developer.twitter.com/en/docs/accounts-and-users/subscribe-account-activity/guides/getting-started-with-webhooks
Using the wrong consumer secret in the function that created the token returned at the /webhook endpoint
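Since the second cause was a wrong secret in the function that builds the CRC reply, here is a minimal sketch of that calculation. It assumes, as I understand Twitter's CRC documentation, that the reply is an HMAC-SHA256 of the crc_token keyed with the app's consumer secret, base64-encoded and prefixed with sha256=. The function name crc_response_token is hypothetical.

```python
import base64
import hashlib
import hmac


def crc_response_token(crc_token, consumer_secret):
    """Return the value for the `response_token` field of a CRC reply."""
    digest = hmac.new(
        consumer_secret.encode("utf-8"),
        msg=crc_token.encode("utf-8"),
        digestmod=hashlib.sha256,
    ).digest()
    return "sha256=" + base64.b64encode(digest).decode("ascii")


if __name__ == "__main__":
    # Placeholder secret; the webhook would return this dict as JSON.
    print({"response_token": crc_response_token("foo", "my consumer secret")})
```

A different secret yields a different token, which is why a mismatched consumer secret makes the registration check fail even though the endpoint itself responds.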

How To Get LinkedIn Return URL after Authentication

I am trying to use https://github.com/ozgur/python-linkedin
if __name__ == '__main__':
    API_KEY = 'XXXXXXXXXXX'
    API_SECRET = 'XXXXXXXXXX'
    RETURN_URL = 'http://localhost:8080'
    authentication = LinkedInAuthentication(API_KEY, API_SECRET, RETURN_URL, PERMISSIONS.enums.values())
    print(authentication.authorization_url)
This code prints:
https://www.linkedin.com/uas/oauth2/authorization?scope=r_basicprofile%20r_emailaddress%20w_share&state=1e7f21566bdd75cee5d71d0615f338ad&redirect_uri=http%3A//localhost%3A8080&response_type=code&client_id=XXXXXXXXXX
When I click this link, the LinkedIn authentication page opens. After authentication, it redirects to this URL:
http://localhost:8080/?code=CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC&state=f40c8d9ec67df057e72eeb235f4d8e7c
If I copy that CCCCCCCC... code quickly and use it in
application = LinkedInApplication(authentication)
authentication.authorization_code = "CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC"
application = LinkedInApplication(token=authentication.get_access_token())
print(application.get_profile())
it prints the profile information that I need. But I don't want to click the URL and copy that CCCCC... code into my Python code. I want to do this programmatically, so I need to get that authorization_code. How can I do that?
I run into an invalid redirect_uri error when I follow the link that I get, so I can only help you get started.
If the code was on the page you are linked to, you could simply HTTP GET the page and extract the code. However, since you say you must load the page in your browser, authenticate yourself, then get the code from the resulting page, the process is a bit more complicated. Can you try a GET on the page with your credentials already included?
import requests
if __name__ == '__main__':
    API_KEY = 'XXXXXXXXXXX'
    API_SECRET = 'XXXXXXXXXX'
    RETURN_URL = 'http://localhost:8080'
    authentication = LinkedInAuthentication(API_KEY, API_SECRET, RETURN_URL, PERMISSIONS.enums.values())
    auth_url = authentication.authorization_url
    #Create an HTTP request with your credentials included
    response = requests.get(auth_url, auth=requests.auth.HTTPBasicAuth('username', 'password'))
    print(response)
    print(response.text)
This is just to get you started. I believe that authenticating yourself programmatically is possible, but it may be slightly more complicated than this basic approach. Let me know what this returns for you.
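A different way to avoid copying the code by hand is to let a small local server catch the redirect to http://localhost:8080. This is only a sketch under assumptions: the user still has to log in and approve in the browser, code_from_path, CallbackHandler and wait_for_code are hypothetical names, and it serves exactly one request, so it assumes the redirect is the first thing to hit the port.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlsplit, parse_qs


def code_from_path(path):
    """Return the `code` query parameter from a request path, or None."""
    values = parse_qs(urlsplit(path).query).get("code")
    return values[0] if values else None


class CallbackHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Store the code on the server object so the caller can read it.
        self.server.authorization_code = code_from_path(self.path)
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(b"Authorization received. You can close this tab.")

    def log_message(self, format, *args):
        pass  # keep the console quiet


def wait_for_code(port=8080):
    """Block until a single redirect arrives, then return its code."""
    server = HTTPServer(("localhost", port), CallbackHandler)
    server.authorization_code = None
    try:
        server.handle_request()  # serves exactly one request
    finally:
        server.server_close()
    return server.authorization_code
```

With this running before the link is opened, the value returned by wait_for_code() could be assigned to authentication.authorization_code in place of the pasted string.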

How to retrieve linkedin information of other people using python and oauth 2?

1) Followed the steps for keys creation on developer site of linkedin.
2) Works well to get my information using Python and oauth 2.0:
import oauth2 as oauth
import time
url = "http://api.linkedin.com/v1/people/~"
consumer_key = 'my_app_key'
consumer_secret = 'my_app_secret_key'
oath_key = 'oath_key'
oath_secret = 'oath_secret_key'
consumer = oauth.Consumer(
    key=consumer_key,
    secret=consumer_secret)
token = oauth.Token(
    key=oath_key,
    secret=oath_secret)
client = oauth.Client(consumer, token)
resp, content = client.request(url)
print(resp)
print(content)
But, I want to know the information of other people, e.g. to get the info based on first_name, last_name, and company.
There seems to be good information at https://developer.linkedin.com/documents/profile-api
but I cannot make sense of it.
What exactly is "id" value?
You can't do that. You can get public profile data of somebody if you know their public profile url or their unique member id, but you can't query on anything else.

Oauth client initialization in python for tumblr API using Python-oauth2

I'm new to OAuth. In the past, for Twitter applications written in Python, I used the python-oauth2 library to initialize the client like this:
consumer = oauth.Consumer(key = CONSUMER_KEY, secret = CONSUMER_SECRET)
token = oauth.Token(key = ACCESS_KEY, secret = ACCESS_SECRET)
client = oauth.Client(consumer, token)
That was easy because Twitter provides both CONSUMER and ACCESS keys and secrets. But now I need to do the same for Tumblr. The problem is that Tumblr provides only CONSUMER_KEY, CONSUMER_SECRET and these URLs:
Request-token URL http://www.tumblr.com/oauth/request_token
Authorize URL http://www.tumblr.com/oauth/authorize
Access-token URL http://www.tumblr.com/oauth/access_token
Using this data, how can I initialize the client to access the Tumblr API?
UPD
jterrace suggested code that I had tried before. The problem with it is oauth_callback. If I don't specify one, the API returns the error "No oauth_callback specified". If I do specify a URL like "http://example.com/oauthcb/", follow the link http://www.tumblr.com/oauth/authorize?oauth_token=9ygTF..., and press the Allow button, Tumblr doesn't show any PIN code page; it immediately redirects to that callback URL, which is useless since this is a desktop application. Why isn't the PIN code shown?
UPD 2
Tumblr API doesn't support PIN code authorization. Use xAuth instead - https://groups.google.com/group/tumblr-api/browse_thread/thread/857285e6a2b4268/15060607dc306c1d?lnk=gst&q=pin#15060607dc306c1d
First, import the oauth2 module and set up the service's URL and consumer information:
import urlparse
import oauth2
REQUEST_TOKEN_URL = 'http://www.tumblr.com/oauth/request_token'
AUTHORIZATION_URL = 'http://www.tumblr.com/oauth/authorize'
ACCESS_TOKEN_URL = 'http://www.tumblr.com/oauth/access_token'
CONSUMER_KEY = 'your_consumer_key'
CONSUMER_SECRET = 'your_consumer_secret'
consumer = oauth2.Consumer(CONSUMER_KEY, CONSUMER_SECRET)
client = oauth2.Client(consumer)
Step 1: Get a request token. This is a temporary token that is used for
having the user authorize an access token and to sign the request to obtain
said access token.
resp, content = client.request(REQUEST_TOKEN_URL, "GET")
request_token = dict(urlparse.parse_qsl(content))
print "Request Token:"
print " - oauth_token = %s" % request_token['oauth_token']
print " - oauth_token_secret = %s" % request_token['oauth_token_secret']
Step 2: Redirect to the provider. Since this is a CLI script we do not
redirect. In a web application you would redirect the user to the URL
below.
print "Go to the following link in your browser:"
print "%s?oauth_token=%s" % (AUTHORIZATION_URL, request_token['oauth_token'])
# After the user has granted access to you, the consumer, the provider will
# redirect you to whatever URL you have told them to redirect to. You can
# usually define this in the oauth_callback argument as well.
oauth_verifier = raw_input('What is the PIN? ')
Step 3: Once the consumer has redirected the user back to the oauth_callback
URL you can request the access token the user has approved. You use the
request token to sign this request. After this is done you throw away the
request token and use the access token returned. You should store this
access token somewhere safe, like a database, for future use.
token = oauth2.Token(request_token['oauth_token'], request_token['oauth_token_secret'])
token.set_verifier(oauth_verifier)
client = oauth2.Client(consumer, token)
resp, content = client.request(ACCESS_TOKEN_URL, "POST")
access_token = dict(urlparse.parse_qsl(content))
print "Access Token:"
print " - oauth_token = %s" % access_token['oauth_token']
print " - oauth_token_secret = %s" % access_token['oauth_token_secret']
print
Now that you have an access token, you can call protected methods with it.
EDIT: Turns out that tumblr does not support the PIN authorization method. Relevant post here.
If you just want to gain an access-token/secret to sign, you could just setup your callback URL as: http://localhost/blah
Fire up the CLI app (after modifying the callback URL, secret and token, of course)
Follow the link in your browser
Allow app
View addressbar of the page you've been redirected to in the browser after allowing your app. It should look something like:
http://localhost/blah?oauth_token=xxxxxxxxxxxxxxxxxxxxxxxxxx0123456789ABCDEFGHIJKLMN&oauth_verifier=XXXXXXXXXXXXXXXXXXXXXXXXX0123456789abcdefghijklmn
Use the value of the query-parameter 'oauth_verifier' as your PIN:
XXXXXXXXXXXXXXXXXXXXXXXXX0123456789abcdefghijklmn
The CLI should print out your oauth-token and oauth-token-secret.
HTH! Got this working for tumblr in this way :)
Have a look at https://github.com/ToQoz/Pyblr
It uses oauth2 and urllib to provide a nice wrapper for exactly what you're trying to do.
It seems that what you're trying to do is access an OAuth 1 API with an OAuth 2 client.
See https://github.com/simplegeo/python-oauth2 and look for “three-legged OAuth example”.
I had this problem with oauth2 and Facebook.
@deepvanbinnen's answer led me in the right direction.
Facebook actually redirected to a page similar to this:
'http://localhost/blah?code=AQAXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX#_=_'
Using 'AQAXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX#_=_' as the PIN then got me access to the requested Facebook account.
@jterrace's answer is good. However, realize that it is a one-time manual procedure to get the access token. The access token is the key that you use for all subsequent API calls. (That's why he recommends saving the access token in a database.) The string referred to as 'PIN' (aka the verification key) is not necessarily a number. It can be a printable string in any form. That verification key is displayed on the authorization page at the URL printed in step 2, then pasted into the prompt for the 'PIN'.
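Because the token is obtained once and reused, it needs to be stored somewhere between runs. The following is a minimal sketch that uses a JSON file instead of a database. The file layout and the helper names save_token and load_token are assumptions for illustration, and a real application would want better protection for the secret than file permissions.

```python
import json
import os
import tempfile


def save_token(path, oauth_token, oauth_token_secret):
    """Write the access token pair to a JSON file."""
    with open(path, "w") as handle:
        json.dump({"oauth_token": oauth_token,
                   "oauth_token_secret": oauth_token_secret}, handle)
    os.chmod(path, 0o600)  # best effort: restrict to the current user


def load_token(path):
    """Return the stored token dict, or None if nothing has been saved yet."""
    if not os.path.exists(path):
        return None
    with open(path) as handle:
        return json.load(handle)


if __name__ == "__main__":
    # Demo in a temporary directory with placeholder values.
    demo_path = os.path.join(tempfile.mkdtemp(), "token.json")
    print(load_token(demo_path))
    save_token(demo_path, "example-token", "example-secret")
    print(load_token(demo_path))
```

On later runs, a stored token would let the script skip the authorization steps and go straight to building the client.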
