Facebook Workplace API in Python

This is the function I use for making a call to the Workplace API, but I can't find the page_id in Workplace (as opposed to other Facebook pages).
Has anyone come across the same issue? Any suggestions would be appreciated!
import json
import urllib2

def testFacebookPageData(page_id, access_token):
    base = "https://graph.facebook.com/v2.4"
    node = "/" + page_id
    parameters = "/?access_token=%s" % access_token
    url = base + node + parameters
    req = urllib2.Request(url)
    response = urllib2.urlopen(req)
    data = json.loads(response.read())
    print json.dumps(data, indent=4, sort_keys=True)

testFacebookPageData(page_id, access_token)  # Function call

There are several ways to get a Workplace custom integration page ID (all badly documented by Facebook). The easiest one is to access your bot page: just search for your custom integration name in the Workplace search bar (you will find it as if it were a regular user).
Your bot page URL should look like:
https://<your_community>.facebook.com/<custom_integration_name>-<page_id>/
The page ID is the last number in the URL (a 15-digit number).
Alternatively, you can open Facebook's Access Token Debugger:
https://developers.facebook.com/tools/debug/accesstoken/
and paste your custom integration access token to see relevant information about the token, such as its permissions, but also the page ID. The tool is designed for Facebook, but it works for Workplace as well, since both share essentially the same API.
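If you prefer to stay in Python, the same details are exposed by the Graph API's debug_token endpoint. A minimal sketch, assuming the page ID is reported under a key like profile_id (verify against the actual payload, since the field set varies by token type):

import requests

access_token = "your_custom_integration_token"  # placeholder

# The Graph API's /debug_token endpoint returns the same data as the web debugger.
resp = requests.get(
    "https://graph.facebook.com/debug_token",
    params={"input_token": access_token, "access_token": access_token},
)
data = resp.json().get("data", {})
print(data)
# "profile_id" is an assumption; inspect the printed payload to confirm
# which key carries the page ID for your token.
print(data.get("profile_id"))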

Related

Getting few results when authenticating and searching repositories with the GitHub API using Python

I'm trying to search for JavaScript repositories using Python and the GitHub API, and put the links to the repositories in a file.
import requests
from pprint import pprint

username = "my-username"  # my username here!
token = "my-token"        # my token here!

user_data = requests.get(
    "https://api.github.com/search/repositories?q=language:js&sort=stars&order=desc",
    auth=(username, token),
).json()

headers = {'Authorization': 'token ' + token}
login = requests.get('https://api.github.com/user', headers=headers)
print(login.json())

f = open("snapshotJS.txt", "w")
for userKeys in user_data.keys():
    if userKeys == "items":
        for item in user_data[userKeys]:
            for lines in item:
                if lines == "html_url":
                    print(item.get(lines))
                    f.write(item.get(lines) + "\n")
f.close()
When I run the code, I only get 30 links in my text file each time (granted, they're different links every run). How can I get more than 30 at a time? Since I have a personal token, shouldn't I be able to make up to 5,000 requests?
Sorry if it's something small I'm missing, I'm new to APIs!
The GitHub API returns 30 entries per page if the page size is not specified:
Requests that return multiple items will be paginated to 30 items by default. You can specify further pages with the page parameter. For some resources, you can also set a custom page size up to 100 with the per_page parameter.
To get more records per page, use the per_page query parameter.
To get all the records, use a while loop to keep fetching pages until no page is left, as in the sketch below.
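A minimal sketch of that loop, assuming the same search query and output file as in the question (per_page and page are standard GitHub API parameters; note that the search endpoint also caps how many results it will page through in total):

import requests

token = "my-token"  # personal access token
headers = {"Authorization": "token " + token}

url = "https://api.github.com/search/repositories"
params = {"q": "language:js", "sort": "stars", "order": "desc",
          "per_page": 100, "page": 1}

with open("snapshotJS.txt", "w") as f:
    while True:
        items = requests.get(url, params=params, headers=headers).json().get("items", [])
        if not items:
            break  # no pages left
        for item in items:
            f.write(item["html_url"] + "\n")
        params["page"] += 1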

Query Firebase dynamic link information

When trying to query Google Firebase Dynamic Links stats, I am getting an empty object.
I have 5 dynamic links in the Firebase console, created via the console. Using the following code I am able to get a token. I used GCP -> IAM -> Service Accounts to create a new account and pull down the JSON file, and I've ensured the project_id matches the one in Firebase.
link = "my_dynamic_link_short_name"
scopes = ["https://www.googleapis.com/auth/firebase"]
credentials = service_account.Credentials.from_service_account_file("key.json", scopes=scopes)
url_base = "https://firebasedynamiclinks.googleapis.com/v1/SHORT_DYNAMIC_LINK/linkStats?durationDays=1"
encoded_link = urllib.parse.quote(link, safe='')
url = url_base.replace('SHORT_DYNAMIC_LINK', encoded_link)
request = Request()
credentials.refresh(request)
access_token = credentials.token
HEADER = {"Authorization": "Bearer " + access_token}
response = requests.get(url, headers=HEADER)
print(response.json())
Both of the above requests return a 200, but no data is returned.
The GCP service account I am using has the following roles:
Firebase Admin
Firebase Admin SDK Administrator Service Agent
Service Account Token Creator
I've given it full Owner to test, and it didn't resolve the issue.
The FDL Analytics REST API returns an empty object {} if the short link doesn't have analytics data in the specified date range. If you have existing short links in the FDL dashboard that already have click data, you can use one of them to validate that the response from the API matches the data displayed on the dashboard.
If you're still having issues, I suggest filing a ticket at https://firebase.google.com/support
Edit: To add, Firebase Dynamic Links click data is aggregated daily and should be updated the next day. For newly created links, give it a day or two for the click data to show up. This applies both to click data from the API and to what is displayed on the dashboard.
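For reference, a short sketch of how the "no data yet" case can be told apart from real stats; the linkEventStats field name is taken from the Dynamic Links analytics REST API response and should be confirmed against your own payload:

# "response" is the requests.Response from the snippet in the question.
stats = response.json()

# An empty object ({}) means no analytics data for the requested window,
# not an authorization problem: the call itself already returned 200.
if not stats:
    print("No click data for this link in the requested durationDays window.")
else:
    # The analytics payload groups counts by event and platform; the exact
    # key name ("linkEventStats") is assumed from the REST API documentation.
    for event in stats.get("linkEventStats", []):
        print(event.get("platform"), event.get("event"), event.get("count"))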

Python code to authenticate to website, navigate through links and download files

I'm looking into something which could be interesting to you as well.
I'm developing a feature in Python which should be able to authenticate (using userid/password and/or other preferred authentication methods), connect to a specified website, navigate through the website, and download the file under a specific option.
Later I have to schedule the developed code and automate it.
Has anyone come across such a scenario and developed the code in Python?
Please suggest any Python libraries that could help.
What I have achieved so far:
I can download a file from a specific URL.
I know how to authenticate and download the file.
I'm able to pull the links from a specific website.
This is something we could achieve using Selenium, but I want to write it in plain Python.
After 5 days of research, I found what I wanted. Your urlLogin and urlAuth could be the same; it totally depends on what action is taken by the Login button or form action. I used Chrome's inspect option to find out the actual GET or POST request used by the portal.
Here is the answer to my own question:
import requests

urlLogin = 'https://example.com/jsp/login.jsp'
urlAuth = 'https://example.com/CheckLoginServlet'
urlBd = 'https://example.com/jsp/batchdownload.jsp'
payload = {
    "username": "username",
    "password": "password"
}

# Session will be closed at the end of the with block
with requests.Session() as s:
    s.get(urlLogin)
    headers = s.cookies.get_dict()
    print(f"Session cookies {headers}")
    r1 = s.post(urlAuth, data=payload, headers=headers)
    print(f'MainFrame text:::: {r1.status_code}')  # 200
    r2 = s.post(urlBd, data=payload)
    print(f'MainFrame text:::: {r2.status_code}')  # 200
    print(f'MainFrame text:::: {r2.text}')  # page source
    # 3. Again cookies will be used through the session to access the batch download page
    # (config is assumed to be defined elsewhere, e.g. loaded from a settings file)
    r2 = s.post(config['access-url'])
    print(f'Batch Download status:::: {r2.status_code}')  # 200
    source_code = r2.text
    # print(f'Batch Download source:::: {source_code}')
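Once the session is authenticated, saving an actual file is just one more request on the same session. A minimal, self-contained sketch; the login payload mirrors the snippet above, while the file URL and output name are placeholders, not part of the original portal:

import requests

with requests.Session() as s:
    # Re-authenticate exactly as in the answer above (placeholder credentials).
    s.post('https://example.com/CheckLoginServlet',
           data={"username": "username", "password": "password"})
    file_url = "https://example.com/files/report.zip"  # placeholder file link
    with s.get(file_url, stream=True) as resp:
        resp.raise_for_status()
        with open("report.zip", "wb") as fh:
            for chunk in resp.iter_content(chunk_size=8192):
                fh.write(chunk)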

How to obtain the full HTML content of a Google search result page

I am new to web crawling, thanks for helping out. The task I need to perform is to obtain the full returned HTTP response from a Google search. When searching on Google with a keyword in a browser, the returned page contains a section:
Searches related to XXXX (where XXXX is the searched words)
I need to extract this section of the web page. From my research, most of the current packages for crawling Google are not able to extract this section. I tried to use urllib2 with the following code:
import urllib2
url = "https://www.google.com.sg/search? q=test&ie=&oe=#q=international+business+machine&spf=187"
req = urllib2.Request(url, headers={'User-Agent' : 'Mozilla/5.0'})
con = urllib2.urlopen( req )
strs = con.read()
print strs
I am getting a large chunk of text which looks like a legitimate HTTP response, but within it there isn't any content related to my search key "international business machine". I know Google probably detects that this is not a request from an actual browser and hence hides this info. Is there any way to bypass this and obtain the "related searches" section of the Google result? Thanks.
As pointed out by @anonyXmous, the useful post to refer to is here:
Google Search Web Scraping with Python
With
from requests import get
keyword = "internation business machine"
url = "https://google.com/search?q=" + keyword
raw = get(url).text
print raw
I am able to get the needed text in raw.
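If the goal is specifically the related-searches block, a hedged sketch of one way to dig it out of that raw HTML with requests and BeautifulSoup follows; Google's markup is undocumented and changes often, so the anchor filter below is an assumption you will likely need to adapt:

import requests
from bs4 import BeautifulSoup

keyword = "international business machine"
headers = {"User-Agent": "Mozilla/5.0"}
raw = requests.get("https://google.com/search",
                   params={"q": keyword}, headers=headers).text

soup = BeautifulSoup(raw, "html.parser")
# Heuristic: related searches are rendered as links to further /search URLs
# near the bottom of the page; collect anchors whose href starts with "/search".
related = [a.get_text(" ", strip=True)
           for a in soup.find_all("a", href=True)
           if a["href"].startswith("/search")]
print(related)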

Python 3: how to retrieve the Transifex dashboard page

I'm a Transifex user and I need to retrieve my dashboard page with the list of all the projects of my organization,
that is, the page I see when I log in: https://www.transifex.com/organization/(my_organization_name)/dashboard
I can access the Transifex API with this code:
import urllib.request as url

usr = 'myusername'
pwd = 'mypassword'

def GetUrl(Tx_url):
    auth_handler = url.HTTPBasicAuthHandler()
    auth_handler.add_password(realm='Transifex API',
                              uri=Tx_url,
                              user=usr,
                              passwd=pwd)
    opener = url.build_opener(auth_handler)
    url.install_opener(opener)
    f = url.urlopen(Tx_url)
    return f.read().decode("utf-8")
Everything is OK, but there's no API call to get all the projects of my organization.
The only way is to get that page's HTML and parse it, but if I use this code, I get the login page instead.
It works fine with google.com, but I get an error with www.transifex.com or www.transifex.com/organization/(my_organization_name)/dashboard
Python, HTTPS GET with basic authentication
I'm new to Python; I need some code for Python 3 using only the standard library.
Thanks for any help.
The call to
/projects/
returns your projects along with all the public projects that you can access (as you said). You can narrow down to the ones you need by modifying the call to something like:
https://www.transifex.com/api/2/projects/?start=1&end=6
Doing so, the number of projects returned will be restricted.
For now, if you don't have many projects, it might be more convenient to use this call:
/project/project_slug
and fetch each one separately.
Transifex comes with an API, and you can use it to fetch all the projects you have.
I think what you need is this GET request on projects. It returns a list of (slug, name, description, source_language_code) tuples for all projects you have access to, in JSON format.
Since you are familiar with Python, you could use the requests library to perform the same actions in a much easier and more readable way.
You will just need to do something like this:
import requests
AUTH = ('yourusername', 'yourpassword')
url = 'https://www.transifex.com/api/2/projects/'
headers = {'Content-type': 'application/json'}
response = requests.get(url, headers=headers, auth=AUTH)
print(response.json())
I hope this helps.
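Since the question asks for Python 3 with only the standard library, roughly the same call can also be made with urllib.request and a basic-auth header. A minimal sketch, reusing the start/end paging parameters mentioned in the first answer; credentials are placeholders:

import base64
import json
import urllib.request

usr, pwd = 'myusername', 'mypassword'
token = base64.b64encode(f'{usr}:{pwd}'.encode()).decode()

req = urllib.request.Request(
    'https://www.transifex.com/api/2/projects/?start=1&end=6',
    headers={'Authorization': 'Basic ' + token,
             'Content-type': 'application/json'},
)
with urllib.request.urlopen(req) as resp:
    projects = json.loads(resp.read().decode('utf-8'))

# Each entry should carry at least slug and name, per the answer above.
for project in projects:
    print(project.get('slug'), '-', project.get('name'))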
