I have a post on my Facebook page that I need to update several times a day with data produced by a Python script. I tried using Selenium, but it often gets stuck when saving the post, which hangs the script too, so I'm trying to find a way to do the job within Python itself, without a web browser.
I wonder: is there a way to edit a Facebook post using a Python library such as Facepy or similar?
I'm reading the Graph API reference, but there are no examples to learn from. I guess the first thing is to set up the login. The Facepy GitHub page says that
note that Facepy does not do authentication with Facebook; it only consumes its API. To get an access token to consume the API on behalf of a user, use a suitable OAuth library for your platform
I tried logging in with requests and BeautifulSoup:
from bs4 import BeautifulSoup
import requests
import re

def facebook_login(mail, pwd):
    session = requests.Session()
    r = session.get('https://www.facebook.com/', allow_redirects=False)
    soup = BeautifulSoup(r.text, 'html.parser')
    action_url = soup.find('form', id='login_form')['action']
    inputs = soup.find('form', id='login_form').findAll('input', {'type': ['hidden', 'submit']})
    post_data = {inp.get('name'): inp.get('value') for inp in inputs}
    post_data['email'] = mail
    post_data['pass'] = pwd
    scripts = soup.findAll('script')
    scripts_string = '\n'.join([script.text for script in scripts])
    datr_search = re.search(r'\["_js_datr","([^"]*)"', scripts_string, re.DOTALL)
    if datr_search:
        datr = datr_search.group(1)
        cookies = {'_js_datr': datr}
    else:
        return False
    return session.post(action_url, data=post_data, cookies=cookies, allow_redirects=False)

facebook_login('email', 'psw')
but it gives this error
action_url = soup.find('form', id='login_form')['action']
TypeError: 'NoneType' object is not subscriptable
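That traceback means `soup.find('form', id='login_form')` returned `None`: without JavaScript and the expected cookies, Facebook serves markup that simply doesn't contain that form, and subscripting `None` raises the `TypeError`. A defensive check makes the failure explicit instead of crashing; a minimal sketch (the sample HTML is illustrative):

```python
from bs4 import BeautifulSoup

# Facebook serves different markup to clients without JavaScript,
# so the form lookup can come back empty.
html = '<html><body><div id="content">no login form here</div></body></html>'
soup = BeautifulSoup(html, 'html.parser')

form = soup.find('form', id='login_form')
if form is None:
    action_url = None  # bail out instead of raising TypeError
else:
    action_url = form['action']
```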
I also tried with Mechanize
import mechanize
username = 'email'
password = 'psw'
url = 'http://facebook.com/login'
print("opening browser")
br = mechanize.Browser()
print("opening url...please wait")
br.open(url)
print(br.title())
print("selecting form")
br.select_form(name='Login')
br['UserID'] = username
br['PassPhrase'] = password
print("submitting form")
response = br.submit()
pageSource = response.read()
but it gives an error too
mechanize._response.httperror_seek_wrapper: HTTP Error 403: b'request disallowed by robots.txt'
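That 403 happens because mechanize honors robots.txt by default, and Facebook's robots.txt disallows unknown automated clients; mechanize exposes `br.set_handle_robots(False)` (called before `br.open(url)`) to skip that check, though the site's terms of service still apply. The check mechanize performs is the same one the standard library's `urllib.robotparser` implements; a sketch with an inlined robots.txt that disallows everything:

```python
from urllib.robotparser import RobotFileParser

# A robots.txt that disallows everything, similar in effect to
# Facebook's rules for unknown user agents.
rules = """User-agent: *
Disallow: /
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

# This is the check that makes mechanize raise HTTP 403.
allowed = rp.can_fetch("*", "https://www.facebook.com/login")
print(allowed)  # False
```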
Install the facebook-sdk package:
pip install facebook-sdk
then, to update/edit a post on your page, just run:
import facebook
page_token = '...'
page_id = '...'
post_id = '...'
fb = facebook.GraphAPI(access_token = page_token, version="2.12")
fb.put_object(parent_object=page_id+'_'+post_id, connection_name='', message='new text')
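Since that call is just an HTTP POST to the Graph API, the same edit can also be done with plain requests, with no SDK at all. A minimal sketch, assuming you already have a valid page access token (`page_id`, `post_id`, and `page_token` are placeholders):

```python
import requests

GRAPH_URL = "https://graph.facebook.com/v2.12"

def build_edit_url(page_id, post_id):
    """Endpoint for an existing page post: /{page-id}_{post-id}."""
    return f"{GRAPH_URL}/{page_id}_{post_id}"

def edit_post(page_id, post_id, page_token, message):
    # POSTing a new 'message' to an existing post updates its text.
    url = build_edit_url(page_id, post_id)
    r = requests.post(url, data={"message": message, "access_token": page_token})
    r.raise_for_status()
    return r.json()

if __name__ == "__main__":
    print(build_edit_url("123", "456"))
```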
I am trying to use BeautifulSoup to scrape post data with the code below, but I found that BeautifulSoup fails to log in, which causes the scraper to return the text of all the posts, including the header message (the text asking you to log in).
How can I modify the code so it returns info only for the specific post with that ID, not all the posts? Thanks!
import requests
from bs4 import BeautifulSoup

class faceBookBot():
    login_basic_url = "https://mbasic.facebook.com/login"
    login_mobile_url = 'https://m.facebook.com/login'
    payload = {
        'email': 'XXXX@gmail.com',
        'pass': "XXXX"
    }
    post_ID = ""

    # log in to Facebook and redirect to the link with the specific post
    # I guess something goes wrong in the function below
    def parse_html(self, request_url):
        with requests.Session() as session:
            session.post(self.login_basic_url, data=self.payload)
            parsed_html = session.get(request_url)
        return parsed_html

    # scrape all the post's <p> tags, which hold the paragraph/content part
    def post_content(self):
        REQUEST_URL = f'https://m.facebook.com/story.php?story_fbid={self.post_ID}&id=7724542745'
        soup = BeautifulSoup(self.parse_html(REQUEST_URL).content, "html.parser")
        content = soup.find_all('p')
        post_content = []
        for lines in content:
            post_content.append(lines.text)
        post_content = ' '.join(post_content)
        return post_content

bot = faceBookBot()
bot.post_ID = "10158200911252746"
print(bot.post_content())
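One likely reason the login fails: posting only `email` and `pass` omits the hidden fields the mbasic login form carries (fields such as `lsd` and `jazoest`), which the server expects to be echoed back. Those can be collected from the form before posting; a stdlib-only sketch (the sample markup and field values are illustrative, not Facebook's actual form):

```python
from html.parser import HTMLParser

class HiddenInputCollector(HTMLParser):
    """Collects name/value pairs of hidden <input> fields from a form."""
    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "input" and a.get("type") == "hidden" and a.get("name"):
            self.fields[a["name"]] = a.get("value", "")

# Illustrative markup, shaped like a stripped-down mbasic login form.
sample = (
    '<form action="/login" method="post">'
    '<input type="hidden" name="lsd" value="AVq1...">'
    '<input type="hidden" name="jazoest" value="2678">'
    '<input type="text" name="email">'
    '<input type="password" name="pass">'
    '</form>'
)

collector = HiddenInputCollector()
collector.feed(sample)
payload = dict(collector.fields)  # merge email/pass into this before posting
print(payload)  # {'lsd': 'AVq1...', 'jazoest': '2678'}
```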
You can't: Facebook encrypts the password, and you don't have the encryption they use, so the server will never accept it. Save your time and find another way.
@AnsonChan yes, you could open the page with Selenium, log in, and then copy its cookies to requests:
from selenium import webdriver
import requests
driver = webdriver.Chrome()
driver.get('http://facebook.com')
# login manually, or automate it.
# when logged in:
session = requests.session()
for cookie in driver.get_cookies():
    session.cookies.set(cookie['name'], cookie['value'])
driver.quit()
# get the page you want with requests
response = session.get('https://m.facebook.com/story.php?story_fbid=123456789')
I'm trying to scrape the site's data, but I'm facing an issue while logging in: when I log in with a username and password, it does not work.
I think there is an issue with the token; every time I try to log in, a new token is generated (check the headers in the console).
import requests
from bs4 import BeautifulSoup

url = "http://indiatechnoborate.tymra.com"
with requests.Session() as s:
    first = s.get(url)
    start_soup = BeautifulSoup(first.content, 'lxml')
    print(start_soup)
    retVal = start_soup.find("input", {"name": "return"}).get('value')
    print(retVal)
    formdata = start_soup.find("form", {"id": "form-login"})
    dynval = formdata.find_all('input', {"type": "hidden"})[1].get('name')
    print(dynval)
    dictdata = {"username": "username", "password": "password", "return": retVal, dynval: "1"}
    print(dictdata)
    pr = {"task": "user.login"}
    print(pr)
    sec = s.post("http://indiatechnoborate.tymra.com/component/users/", data=dictdata, params=pr)
    print("------------------------------------------")
    print(sec.status_code, sec.url)
    print(sec.text)
I want to log in to the site and want to get the data after login is done
Try replacing this line:
dictdata = {"username": "username", "password": "password", "return": retVal, dynval: "1"}
with this one:
dictdata = {"username": "username", "password": "password", "return": retVal + "==", dynval: "1"}
Hope this helps.
Try using an authentication method instead of passing the credentials in the payload:
import requests
from requests.auth import HTTPBasicAuth
USERNAME = "<USERNAME>"
PASSWORD = "<PASSWORD>"
BASIC_AUTH = HTTPBasicAuth(USERNAME, PASSWORD)
LOGIN_URL = "http://indiatechnoborate.tymra.com"
response = requests.get(LOGIN_URL, auth=BASIC_AUTH)
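One caveat: `HTTPBasicAuth` only helps if the server actually uses HTTP Basic authentication. All it does is add an `Authorization: Basic <base64(user:pass)>` header to the request, so it won't satisfy a form-based login like this site's. A stdlib sketch of the header value it produces:

```python
import base64

def basic_auth_header(username, password):
    """Builds the Authorization header value that HTTP Basic auth sends."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return f"Basic {token}"

print(basic_auth_header("user", "pass"))  # Basic dXNlcjpwYXNz
```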
I'm very new to Python, and I'm trying to scrape a webpage that requires a login, using BeautifulSoup.
So far I have:
import mechanize
import cookielib
import requests
from bs4 import BeautifulSoup
# Browser
br = mechanize.Browser()
# Cookie Jar
cj = cookielib.LWPCookieJar()
br.set_cookiejar(cj)
br.open('URL')
#login form
br.select_form(nr=2)
br['email'] = 'EMAIL'
br['pass'] = 'PASS'
br.submit()
soup = BeautifulSoup(br.response().read(), "lxml")
with open("output1.html", "w") as file:
    file.write(str(soup))
(with "URL", "EMAIL", and "PASS" being the website, my email, and my password).
Still, the page I get in output1.html is the logged-out page, rather than what you would see after logging in.
How can I make it log in with those details and return what's on the page after login?
Cheers for any help!
Let me suggest another way to obtain the desired page; it may be a little easier to troubleshoot.
First, log in manually with any browser while its developer tools are open on the Network tab. After sending your login credentials, you will see a line with a POST request. Open the request, and on the right side you will find the "form data" information.
Use this code to send the login data and get the response:
from bs4 import BeautifulSoup
import requests

session = requests.Session()
url = "your url"
req = session.get(url)
soup = BeautifulSoup(req.text, "lxml")
# You can collect some useful data here (like a csrf code or some token)

# fill in the form data here
params = {'login': 'your login',
          'password': 'your password'}
req = session.post(url, data=params)
I hope this code will be helpful.
I'm trying to configure an access point (AP) in my office through the HTTP POST method via Python.
I was able to log in to the AP with Python HTTP authentication code, but I get stuck when I open the AP's wireless page to supply inputs such as the SSID, channel, and passphrase; there is an Apply button at the end of the wireless page.
When I try to do that with the code below, I don't see any changes reflected on the AP side. Maybe my code is wrong, or I'm not following the correct procedure to post the data to the AP. How can I resolve this issue?
import urllib2
import requests

def login():
    link = "http://192.168.1.11/start_apply2.htm"
    username = 'admin'
    password = 'admin'
    p = urllib2.HTTPPasswordMgrWithDefaultRealm()
    p.add_password(None, link, username, password)
    handler = urllib2.HTTPBasicAuthHandler(p)
    opener = urllib2.build_opener(handler)
    urllib2.install_opener(opener)
    page = urllib2.urlopen(link).read()
    payload = {'wl_ssid_org': 'nick', 'wl_wpa_psk_org': 12345678}
    r = requests.get(link)
    r = requests.get(link, params=payload)
    r = requests.post(link, params=payload)

login()
Note: when I run this code, it throws a "401 Unauthorized" error. I can log in to the AP using the same auth code, so why am I unable to pass authentication here? I don't get it.
In your case, you should change
r = requests.post(link, params=payload)
to
r = requests.post(link, data=payload)
Then the POST request should succeed.
Refer to the requests documentation to find more tutorials. You can even replace the urllib2 code with code using requests.
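A sketch of that replacement (assuming the AP really uses HTTP Basic auth, as the urllib2 handler suggests): note that the original requests calls never pass the credentials at all, which would explain the 401. `requests.Request(...).prepare()` builds the request without sending it, so you can inspect what goes on the wire:

```python
import requests

link = "http://192.168.1.11/start_apply2.htm"   # AP URL from the question
payload = {"wl_ssid_org": "nick", "wl_wpa_psk_org": "12345678"}

# Prepare the request without sending it, to inspect headers and body.
req = requests.Request("POST", link, data=payload, auth=("admin", "admin"))
prepared = req.prepare()

print(prepared.headers["Authorization"])  # Basic YWRtaW46YWRtaW4=
print(prepared.body)                      # wl_ssid_org=nick&wl_wpa_psk_org=12345678

# To actually apply the settings:
# r = requests.post(link, data=payload, auth=("admin", "admin"))
```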
This question has been addressed in various shapes and flavors, but I have not been able to apply any of the solutions I read online.
I would like to use Python to log in to the site https://app.ninchanese.com/login
and then reach the page https://app.ninchanese.com/leaderboard/global/1
I have tried various things, but without success.
Using POST method:
import urllib
import requests
oURL = 'https://app.ninchanese.com/login'
oCredentials = dict(email='myemail@hotmail.com', password='mypassword')
oSession = requests.session()
oResponse = oSession.post(oURL, data=oCredentials)
oResponse2 = oSession.get('https://app.ninchanese.com/leaderboard/global/1')
Using the authentication function from the requests package:
import requests
oSession = requests.session()
oResponse = oSession.get('https://app.ninchanese.com/login', auth=('myemail@hotmail.com', 'mypassword'))
oResponse2 = oSession.get('https://app.ninchanese.com/leaderboard/global/1')
Whenever I print oResponse2, I can see that I'm still on the login page, so I am guessing the authentication did not work.
Could you please advise how to achieve this?
You have to send the csrf_token along with your login request:
import urllib
import requests
import bs4
URL = 'https://app.ninchanese.com/login'
credentials = dict(email='myemail@hotmail.com', password='mypassword')
session = requests.session()
response = session.get(URL)
html = bs4.BeautifulSoup(response.text, 'html.parser')
credentials['csrf_token'] = html.find('input', {'name':'csrf_token'})['value']
response = session.post(URL, data=credentials)
response2 = session.get('https://app.ninchanese.com/leaderboard/global/1')