How do you create Blogger posts using the Google Blogger API with Python?

I have successfully used the Google API sandbox and was able to create posts on my Blogger website by sending HTTP POST requests:
POST https://blogger.googleapis.com/v3/blogs/206150456/posts?key=[YOUR_API_KEY] HTTP/1.1
Authorization: Bearer [YOUR_ACCESS_TOKEN]
Accept: application/json
Content-Type: application/json
{
  "title": "Test",
  "content": "Hello World Test"
}
Basically I want to convert the above code into Python code.
Attempts:
My code so far is:
payload = '{"title": "A new post", "content": "With <b>exciting</b> content..."}'
r = requests.post("https://www.googleapis.com/blogger/v3/blogs/" + blogid + "/posts/" + auth + "/application/json/" + payload)
And I get the response (Google's standard 404 page, trimmed to the relevant part):
404. That's an error.
The requested URL /blogger/v3/blogs/2061504564313173903/posts/48738461363-jcoqroi84fk9q81nfsdcsvup6ve7hj75.apps.googleusercontent.com/application/json/%7B%22title%22:%20%22A%20new%20post%22,%20%22content%22:%20%22With%20%3Cb%3Eexciting%3C/b%3E%20content...%22%7D was not found on this server. That's all we know.
It's not an authorisation error but more like a formatting error? (Error 404.)
I was expecting to see a new post appear on my blog.
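For reference, here is a minimal sketch of the same call in Python with requests, keeping the blog ID from the sandbox example. The payload has to travel in the request body as JSON rather than in the URL path, and the token goes in an Authorization header. Note that the value embedded in the failing URL above ends in apps.googleusercontent.com, which looks like an OAuth client ID; this endpoint needs a valid OAuth 2.0 access token, and the access_token value below is a placeholder.

import requests

blog_id = "206150456"
access_token = "YOUR_ACCESS_TOKEN"  # a real OAuth 2.0 access token, not a client ID

url = "https://blogger.googleapis.com/v3/blogs/" + blog_id + "/posts"

# The token travels in a header; the post itself is the JSON body.
headers = {"Authorization": "Bearer " + access_token}
payload = {
    "title": "A new post",
    "content": "With <b>exciting</b> content...",
}

r = requests.post(url, headers=headers, json=payload)  # json= sets Content-Type
print(r.status_code, r.text)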

Related

Flask Iframe Cookie SameSite=Lax Issue

I have a Flask application with a few custom-built tools. I'm trying to bring some other tools into that Flask application so there is a single place for everything. One of those tools is MicroStrategy. I'm rendering a template and the MicroStrategy login page is working, but when I log in, it just kicks me back to the login page. When I look at the request, there are two Set-Cookie entries in the headers, each with an error.
Is it possible to do what I'm trying to do? Is there a way to read the headers from the MicroStrategy page in the iframe and change them to SameSite=None?
Here is my Flask app:
@dash_app.server.route("/mstr")
def mstr():
    resp = make_response(render_template("mstr.html"))
    return resp
mstr.html:
<div style="position:fixed; width:100%; top:50px; left:0px; right:0px; bottom:0px; z-index:1;">
<iframe src="https://webserver.com/MicroStrategy/asp/Main.aspx" title="MicroStrategy" style="width:100%; height:100%; border:none; margin:0; padding:0; overflow:hidden;"></iframe>
</div>
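Note that Flask can only influence cookies it sets itself; the Set-Cookie headers coming from the MicroStrategy server inside the iframe are exchanged between that server and the browser, so they can't be rewritten from this app. For the app's own cookies, a minimal sketch of forcing SameSite=None is below (the cookie name demo_session is purely illustrative, and the samesite keyword needs a reasonably recent Werkzeug):

@dash_app.server.route("/mstr")
def mstr():
    resp = make_response(render_template("mstr.html"))
    # SameSite=None is only honoured by browsers when Secure is also set,
    # and this only affects cookies this Flask app sets -- not the iframe's.
    resp.set_cookie("demo_session", "value", samesite="None", secure=True)
    return resp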

What is the correct 3-legged OAuth workflow in Python (example: ImmoScout API)? (How to get the request_token?)

I am trying to access the ImmoScout24 web API for a data science project in Python, and I am kind of stuck in the 3-legged authentication process. I googled the problem, but it's kind of special, so maybe someone can help me.
I want to implement the workflow described at: https://api.immobilienscout24.de/api-docs/authentication/three-legged/#callback-url
To obtain the request_token (the first step within the authentication process), I tried the following approach.
The API credentials are stored in these two variables:
client_key
client_secret
The Python code looks as follows:
from requests_oauthlib import OAuth1Session

immoscout_api = OAuth1Session(client_key,
                              client_secret=client_secret)
request_token_url = 'http://rest.immobilienscout24.de/restapi/security/oauth/request_token'
fetch_response = immoscout_api.fetch_request_token(request_token_url)
I am getting an error in my Jupyter notebook that looks like the following:
TokenRequestDenied: Token request failed with code 403. The response was a CloudFront error page; the relevant part is:
403 ERROR
The request could not be satisfied.
Bad request.
We can't connect to the server for this app or website at this time. There might be too much traffic or a configuration error. Try again later, or contact the app or website owner.
If you provide content to customers through CloudFront, you can find steps to troubleshoot and help prevent this error by reviewing the CloudFront documentation.
Generated by cloudfront (CloudFront)
Request ID: M_HHRf9VaNN9xFRqWlHWt2txfuIsBE5fe6siJACFUFjVWw20p91jLg==
Can somebody help me obtain the request token?
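For what it's worth, a sketch of the complete three-legged flow with requests_oauthlib is below. The confirm_access and access_token paths are assumptions extrapolated from the request_token path above (check them against the ImmoScout docs), and the callback URL is a placeholder. One thing worth trying for the CloudFront 403 is hitting the endpoints over https:// rather than http://.

from requests_oauthlib import OAuth1Session

base = 'https://rest.immobilienscout24.de/restapi/security/oauth'

# Step 1: obtain the request token.
immoscout_api = OAuth1Session(client_key,
                              client_secret=client_secret,
                              callback_uri='https://example.org/callback')  # placeholder
fetch_response = immoscout_api.fetch_request_token(base + '/request_token')

# Step 2: send the user to the confirmation page and collect the verifier.
print('Authorize here:', immoscout_api.authorization_url(base + '/confirm_access'))
verifier = input('Paste the oauth_verifier: ')

# Step 3: exchange the authorized request token for an access token.
tokens = immoscout_api.fetch_access_token(base + '/access_token', verifier=verifier)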

Unable to connect using selenium driver.get()

I am new to Python (Selenium, Scrapy, etc.) and web scraping in general, but I am pretty familiar with other languages such as Java, so please forgive me if I am missing something very simple!
My end goal is to visit a page, sit there for around 10 seconds, then close the browser and repeat. However, I am trying to practice rotating my IP address via a proxy with each request. I have been able to accomplish visiting the page, but when I throw the rotating proxy into the mix, I get a long connection error that I can't figure out, which seems to consist mostly of CSS.
Complete Code Snippet
The issue seems to be caused by the second line in the try block, where the driver tries to access the website.
import time
from itertools import cycle

import requests
import scrapy
from lxml.html import fromstring
from scrapy.http import Request
from selenium import webdriver
from selenium.webdriver.common.proxy import Proxy, ProxyType

class VisitPageSpider(scrapy.Spider):
    name = 'visitpage'
    allowed_domains = ['books.toscrape.com']

    def start_requests(self):
        test_url = 'http://books.toscrape.com'
        proxies = self.get_proxies()
        proxy_pool = cycle(proxies)
        prox = Proxy()
        prox.proxy_type = ProxyType.MANUAL
        view_count = 0
        url = 'https://httpbin.org/ip'  # left over from proxy testing, currently unused
        for i in range(1, 11):
            proxy = next(proxy_pool)
            prox.http_proxy = proxy
            prox.socks_proxy = proxy
            prox.ssl_proxy = proxy
            capabilities = webdriver.DesiredCapabilities.INTERNETEXPLORER
            prox.add_to_capabilities(capabilities)
            print("Request #%d" % i)
            try:
                self.driver = webdriver.Ie(desired_capabilities=capabilities)
                self.driver.get(test_url)
                view_count += 1
                time.sleep(10)
                self.driver.quit()
            except Exception:
                print("Skipping. Connection error")
        print('Total New Views ' + str(view_count))  # str() needed for concatenation
        yield Request(test_url, callback=self.visit_page)

    def visit_page(self, response):
        pass

    def get_proxies(self):
        url = 'https://free-proxy-list.net/'
        response = requests.get(url)
        parser = fromstring(response.text)
        proxies = set()
        for i in parser.xpath('//tbody/tr')[:10]:
            if i.xpath('.//td[7][contains(text(),"yes")]'):
                proxy = ":".join([i.xpath('.//td[1]/text()')[0],
                                  i.xpath('.//td[2]/text()')[0]])
                proxies.add(proxy)
        print(proxies)
        return proxies
CMD Output
For the first two lines in the try block, respectively:
2018-07-26 18:19:21 [selenium.webdriver.remote.remote_connection] DEBUG: POST http://127.0.0.1:52898/session {"capabilities": {"firstMatch": [{}], "alwaysMatch": {"browserName": "internet explorer", "platformName": "windows", "proxy": {"proxyType": "manual", "httpProxy": "46.227.162.167:8080", "sslProxy": "46.227.162.167:8080", "socksProxy": "46.227.162.167:8080"}}}, "desiredCapabilities": {"browserName": "internet explorer", "version": "", "platform": "WINDOWS", "proxy": {"proxyType": "MANUAL", "httpProxy": "46.227.162.167:8080", "sslProxy": "46.227.162.167:8080", "socksProxy": "46.227.162.167:8080"}}}
2018-07-26 18:19:21 [selenium.webdriver.remote.remote_connection] DEBUG: (a Squid error page; trimmed to the relevant part)
ERROR: The requested URL could not be retrieved
The following error was encountered while trying to retrieve the URL: http://127.0.0.1:52898/session
Connection to 127.0.0.1 failed.
The system returned: (111) Connection refused
The remote host or network may be down. Please try the request again.
Generated Fri, 27 Jul 2018 04:19:20 GMT by vps188962 (squid/3.5.23)
My guess is that this is a problem with the proxies themselves. Free proxies are often unreliable (in my experience, very often), and you must be prepared for them to yield anything realistically possible: errors, timeouts, or even mangled responses. The second line of your log looks like a generic response from the Squid proxy software, indicating a proxy error in this case.
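As a mitigation, it can help to weed out dead proxies before handing them to Selenium. A minimal sketch, assuming https://httpbin.org/ip (already present in the question) as the health-check URL:

import requests

def working_proxies(proxies, test_url='https://httpbin.org/ip', timeout=5):
    """Return only the proxies that can fetch test_url within the timeout."""
    alive = set()
    for proxy in proxies:
        proxy_url = 'http://' + proxy  # requests wants a scheme on proxy URLs
        try:
            r = requests.get(test_url,
                             proxies={'http': proxy_url, 'https': proxy_url},
                             timeout=timeout)
            r.raise_for_status()
            alive.add(proxy)
        except requests.RequestException:
            pass  # dead, slow or mangled proxy -- skip it
    return alive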

AWS API Gateway and Python Lambda returning HTML

I'm trying to return a webpage from my Python Lambda function, using API Gateway. Instead, I'm getting my page embedded in a <pre> tag within the body, rather than the return value being the full page (header, body, etc.) without the <pre>.
Any suggestions on what I might be doing wrong?
Thanks
The <pre> tag you are seeing is the browser trying to show you the text returned from the server. It is not part of what the Lambda function returned.
To get it working, you need to have the Lambda function set the response HTTP header Content-Type: text/html.
for example:
response = {
    "statusCode": 200,
    "body": content,
    "headers": {
        "Content-Type": "text/html",
    },
}
You also have to configure API Gateway to return the proper Content-Type.
From API Gateway, click on the API you created.
Click on "Method Response".
Expand the row for method response status 200. Click "Add Header" and add a "Content-Type" entry.
Go back to the method by clicking "<- Method Execution".
Click on "Integration Response".
Expand the row for method response status 200.
Click "Add mapping template".
Type text/html (without quotes) for the Content-Type and click the check-mark button.
In the template area, type the JSONPath expression that maps the part of the JSON returned from your Lambda function to what is returned to the client. For example, type $input.path('body') if your JSON is:
{
    "statusCode": 200,
    "body": "<html><body><h1>Test</h1></body></html>"
}
Be sure to deploy the API before testing.
Here's a more detailed article on how to return html from AWS Lambda
def lambda_handler(event, context):
    try:
        response_body = "<HTML><Title>Title</Title></HTML>"
    finally:
        # note: returning from finally swallows any exception raised above
        return {
            "statusCode": 200,
            "body": response_body,
            "headers": {
                "Content-Type": "text/html",
            },
        }
This is just a code illustration of David Lin's answer.

Scrapy has response status 400, but the browser response is OK?

I have this strange situation:
I have a link that works in all the browsers I currently have (Chrome, IE, Firefox).
I tried to crawl the page using Scrapy in Python; however, I get response.status == 400.
I am using Tor + Polipo to crawl anonymously.
response.body is:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html><head>
<title>Proxy error: 400 Couldn't parse URL.</title>
</head><body>
<h1>400 Couldn't parse URL</h1>
<p>The following error occurred while trying to access <strong>https://exmpale.com/blah</strong>:<br><br>
<strong>400 Couldn't parse URL</strong></p>
<hr>Generated Thu, 11 Dec 2014 13:55:38 UTC by Polipo on <em>localhost:8123</em>.
</body></html>
I'm just wondering why that should be. Could it be that the browser can get results but not Scrapy?
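Hard to say without the real URL, but Polipo's "Couldn't parse URL" suggests the request reaching the proxy is malformed rather than blocked, so it is worth checking how the proxy is wired into Scrapy. A minimal sketch of routing a single request through the local Polipo instance (localhost:8123, as the error page shows), with https://exmpale.com/blah standing in for the real link:

import scrapy

class BlahSpider(scrapy.Spider):
    name = 'blah'

    def start_requests(self):
        # HttpProxyMiddleware picks the proxy up from request.meta;
        # the value must be a full URL including the scheme.
        yield scrapy.Request(
            'https://exmpale.com/blah',
            meta={'proxy': 'http://localhost:8123'},
            callback=self.parse,
        )

    def parse(self, response):
        self.logger.info('Got status %s', response.status)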
