I'm using the Python Requests module to browse a URL through a proxy connection.
I added the proxy IP and port to the HTTPS_PROXY and HTTP_PROXY environment variables to route the traffic through the proxy:
import os
import requests
os.environ["HTTPS_PROXY"] = "proxy1.xx.local:8081"
os.environ["HTTP_PROXY"] = "proxy1xx.local:8081"
H = {"X-Authenticated-User": "bharani#dummy.com"}
url = r"https://google.co.in"
r = requests.get(url,headers=H,verify=False)
print r.status_code
print r.text
This is the response I get from the code above:
Traceback (most recent call last):
File "pilot.py", line 356, in <module>
r = requests.get(url,headers=H,verify=False)
File "C:\Python27\lib\site-packages\requests\api.py", line 70, in get
return request('get', url, params=params, **kwargs)
File "C:\Python27\lib\site-packages\requests\api.py", line 56, in request
return session.request(method=method, url=url, **kwargs)
File "C:\Python27\lib\site-packages\requests\sessions.py", line 488, in request
resp = self.send(prep, **send_kwargs)
File "C:\Python27\lib\site-packages\requests\sessions.py", line 609, in send
r = adapter.send(request, **kwargs)
File "C:\Python27\lib\site-packages\requests\adapters.py", line 497, in send
raise SSLError(e, request=request)
requests.exceptions.SSLError: ("bad handshake: SysCallError(-1, 'Unexpected EOF')",)
pyopenssl, ndg-httpsclient, and pyasn1 are also installed.
I'm not sure what I'm missing.
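One thing worth checking in the snippet above: the proxy values carry no URL scheme, while the Requests documentation says proxy URLs must include one. The following is a minimal Python 3 sketch that adds a scheme and builds a mapping for the `proxies` argument. It assumes the proxy accepts plain HTTP connections on port 8081, which the question does not confirm, and `with_scheme` and `build_proxies` are hypothetical helper names.

```python
def with_scheme(proxy, default_scheme="http"):
    """Return the proxy address with an explicit scheme, adding one if missing."""
    if "://" in proxy:
        return proxy
    return f"{default_scheme}://{proxy}"


def build_proxies(proxy):
    """Build a mapping in the shape accepted by the `proxies` argument of requests."""
    url = with_scheme(proxy)
    # Assumption: the same proxy handles both http:// and https:// targets.
    return {"http": url, "https": url}


proxies = build_proxies("proxy1.xx.local:8081")
print(proxies)

# Usage with requests (not executed here, because it needs the real proxy):
#   import requests
#   r = requests.get("https://google.co.in", proxies=proxies, timeout=10)
#   print(r.status_code)
```

Passing `proxies` explicitly also makes it easier to see which proxy a given call uses than setting environment variables at runtime.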
ua = UserAgent()
headers = {
    'user-agent': str(ua.random),
    'Connection': 'close'
}
r = requests.get(url, headers=headers, timeout=5)
I want to scrape some information from a website, but requests.get() raises an exception occasionally (sometimes it succeeds, sometimes it doesn't). I've tried many things: a random user agent, a timeout, time.sleep, and a maximum number of tries, but none of them helped.
Is there something wrong with my code, or is it a fault or some anti-scraper system on the website?
Here is the full exception:
Traceback (most recent call last):
File "d:\AAA临时文档\抢课app\爬虫\run2.py", line 7, in <module>
r=requests.get(url=url,headers=headers,timeout=20)
File "C:\Users\86153\AppData\Local\Programs\Python\Python38-32\lib\site-packages\requests\api.py", line 76, in get
return request('get', url, params=params, **kwargs)
File "C:\Users\86153\AppData\Local\Programs\Python\Python38-32\lib\site-packages\requests\api.py", line 61, in request
return session.request(method=method, url=url, **kwargs)
File "C:\Users\86153\AppData\Local\Programs\Python\Python38-32\lib\site-packages\requests\sessions.py", line 542, in request
resp = self.send(prep, **send_kwargs)
File "C:\Users\86153\AppData\Local\Programs\Python\Python38-32\lib\site-packages\requests\sessions.py", line 655, in send
r = adapter.send(request, **kwargs)
File "C:\Users\86153\AppData\Local\Programs\Python\Python38-32\lib\site-packages\requests\adapters.py", line 504, in send
raise ConnectTimeout(e, request=request)
requests.exceptions.ConnectTimeout: HTTPSConnectionPool(host='www.dy2018.com', port=443): Max retries exceeded with url: /i/103887.html (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x04046A18>, 'Connection to www.dy2018.com timed out. (connect timeout=20)'))
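A ConnectTimeout is raised before any HTTP request is sent, so headers such as the user agent cannot influence it. To separate a network-level problem from a Requests problem, a plain TCP connection test can help. This is a minimal sketch using only the standard library; `can_connect` is a hypothetical helper name, and the host and port in the usage comment are taken from the traceback above.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and DNS failures alike.
        return False


# Usage (not executed here, because it needs network access):
#   print(can_connect("www.dy2018.com", 443, timeout=20))
```

If this check also fails intermittently, the cause is more likely on the network or server side than in the scraping code.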
When I run my Python script to get data from the REST API of an application, I get the following error. I installed pip and the requests package for Python. Here is my invocation and its output:
./simpleRunQuery.py <args> <args>
Traceback (most recent call last):
File "./simpleRunQuery.py", line 25, in <module>
res = requests.post(url, auth=(args.username, args.password), data=jsonRequest, headers=headers)
File "/Library/Python/2.7/site-packages/requests/api.py", line 112, in post
return request('post', url, data=data, json=json, **kwargs)
File "/Library/Python/2.7/site-packages/requests/api.py", line 58, in request
return session.request(method=method, url=url, **kwargs)
File "/Library/Python/2.7/site-packages/requests/sessions.py", line 508, in request
resp = self.send(prep, **send_kwargs)
File "/Library/Python/2.7/site-packages/requests/sessions.py", line 618, in send
r = adapter.send(request, **kwargs)
File "/Library/Python/2.7/site-packages/requests/adapters.py", line 490, in send
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', error(54, 'Connection reset by peer'))
I can attach the script, but my script doesn't have lines 112 or 58 or any of the others; it's a simple REST API script that queries and posts results here. Any pointers?
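On the line numbers: in the traceback above, only the first frame (line 25) is in simpleRunQuery.py. Lines 112, 58, 508, 618, and 490 belong to files inside the requests package (api.py, sessions.py, adapters.py). As an illustration, the Python 3 sketch below filters a traceback down to frames outside installed libraries. `frames_outside_libraries` and `failing_call` are hypothetical names, and a built-in ConnectionError stands in for the one raised by requests.

```python
import traceback


def frames_outside_libraries(exc, library_marker="site-packages"):
    """Return (filename, lineno, function) for frames not under `library_marker`."""
    return [
        (frame.filename, frame.lineno, frame.name)
        for frame in traceback.extract_tb(exc.__traceback__)
        if library_marker not in frame.filename
    ]


def failing_call():
    # Stand-in for the requests.post(...) call; a built-in error is raised
    # here purely for illustration.
    raise ConnectionError("Connection reset by peer")


try:
    failing_call()
except ConnectionError as exc:
    for filename, lineno, function in frames_outside_libraries(exc):
        print(f"{filename}:{lineno} in {function}")
```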
I'm trying to access the BambooHR API (documentation here) with the following code, but I receive the error shown below.
params = {
    'user': username,
    'password': password,
    'api_token': api_key}
url = 'https://api.bamboohr.com/api/gateway.php/company/v1/login'
r = requests.get(url, params=params)
Error:
Traceback (most recent call last):
File "/Applications/PyCharm.app/Contents/helpers/pydev/pydevd.py", line 1580, in <module>
globals = debugger.run(setup['file'], None, None, is_module)
File "/Applications/PyCharm.app/Contents/helpers/pydev/pydevd.py", line 964, in run
pydev_imports.execfile(file, globals, locals) # execute the script
File "/Users/chriscruz/Dropbox/PycharmProjects/082716_r2/Shippy/API/bamboo_api2.py", line 31, in <module>
BambooFunctions().login()
File "/Users/chriscruz/Dropbox/PycharmProjects/082716_r2/Shippy/API/bamboo_api2.py", line 26, in login
r = requests.get(url, params=params, auth=HTTPBasicAuth(api_key, 'api_token'))
File "/Library/Python/2.7/site-packages/requests/api.py", line 70, in get
return request('get', url, params=params, **kwargs)
File "/Library/Python/2.7/site-packages/requests/api.py", line 56, in request
return session.request(method=method, url=url, **kwargs)
File "/Library/Python/2.7/site-packages/requests/sessions.py", line 475, in request
resp = self.send(prep, **send_kwargs)
File "/Library/Python/2.7/site-packages/requests/sessions.py", line 596, in send
r = adapter.send(request, **kwargs)
File "/Library/Python/2.7/site-packages/requests/adapters.py", line 497, in send
raise SSLError(e, request=request)
requests.exceptions.SSLError: ("bad handshake: Error([('SSL routines', 'SSL23_GET_SERVER_HELLO', 'unknown protocol')],)",)
I'm unsure what causes this. I've reinstalled OpenSSL and Requests, and I'm not sure how to fix the issue.
Try setting verify=False. Use this option only if you are using self-signed certificates, because it disables certificate verification.
r = requests.get(url, params=params, verify=False)
More info: http://docs.python-requests.org/en/master/user/advanced/#ssl-cert-verification
I'm writing a script to connect to a server and verify its certificate. If I write the script using sockets, it works fine:
import socket, ssl, pprint
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
ssl_sock = ssl.wrap_socket(s,
                           ca_certs="mule.crt",
                           cert_reqs=ssl.CERT_REQUIRED)
ssl_sock.connect(('localhost', 8081))
message = "GET /api HTTP/1.1\r\nHost:localhost\r\n\r\n"
ssl_sock.write(bytes(message, 'UTF-8'))
data = ssl_sock.read().decode("utf-8")
print(data)
ssl_sock.close()
If I try to do a similar thing using requests:
import requests
r = requests.get('https://localhost:8081/api', cert=r'C:\path\to\cert\mule.crt')
print(r.status_code)
print(r.text)
This throws:
Traceback (most recent call last):
File "C:\path\to\py\ssltest.py", line 4, in <module>
r = requests.get('https://localhost:8081/api', cert=r'C:\path\to\cert\mule.crt')
File "C:\Python34\lib\site-packages\requests\api.py", line 69, in get
return request('get', url, params=params, **kwargs)
File "C:\Python34\lib\site-packages\requests\api.py", line 50, in request
response = session.request(method=method, url=url, **kwargs)
File "C:\Python34\lib\site-packages\requests\sessions.py", line 465, in request
resp = self.send(prep, **send_kwargs)
File "C:\Python34\lib\site-packages\requests\sessions.py", line 573, in send
r = adapter.send(request, **kwargs)
File "C:\Python34\lib\site-packages\requests\adapters.py", line 431, in send
raise SSLError(e, request=request)
requests.exceptions.SSLError: [SSL] PEM lib (_ssl.c:2536)
Any ideas what I might be doing wrong?
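A possible cause, offered as a hedged reading rather than a confirmed diagnosis: in Requests, `cert` names a client-side certificate (which needs a private key), while `verify` takes the path to a CA bundle used to check the server. The socket version uses `ca_certs`, which corresponds to `verify`. The sketch below builds the keyword arguments for each case; `tls_kwargs` is a hypothetical helper, and the path is the one from the question.

```python
def tls_kwargs(ca_bundle=None, client_cert=None, client_key=None):
    """Build the TLS-related keyword arguments for a requests call.

    ca_bundle   -> `verify`: certificate(s) used to check the server.
    client_cert -> `cert`: certificate presented by the client, optionally
                   paired with a separate private key file.
    """
    kwargs = {}
    if ca_bundle is not None:
        kwargs["verify"] = ca_bundle
    if client_cert is not None:
        kwargs["cert"] = (client_cert, client_key) if client_key else client_cert
    return kwargs


kwargs = tls_kwargs(ca_bundle=r"C:\path\to\cert\mule.crt")
print(kwargs)

# Usage with requests (not executed here, because it needs the local server):
#   import requests
#   r = requests.get("https://localhost:8081/api", **kwargs)
#   print(r.status_code)
```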
I want to sign in to Twitter with the code below and then scrape Twitter's data:
import requests
import urllib2
with requests.Session() as c:
    url = "https://twitter.com/login"
    USER = "hadishamgholi74#gmail.com"
    PASS = "52518685251868"
    c.get(url)
    login_data = {"session[username_or_email]": USER, "session[password]": PASS, "authenticity_token": "4d1c2137136cb297b3e83e382b0026d9213fe731", "scribe_log": "", "redirect_after_login": "", "authenticity_token": "4d1c2137136cb297b3e83e382b0026d9213fe731"}
    c.post(url, data=login_data, headers={"Referer": "https://twitter.com"})
    page = c.get("https://twitter.com")
    print page.content
But it raises this error:
Traceback (most recent call last):
File "C:/Users/Mehdi/PycharmProjects/scrap/login1.py", line 9, in <module>
c.get(url)
File "C:\Python27\lib\site-packages\requests\sessions.py", line 473, in get
return self.request('GET', url, **kwargs)
File "C:\Python27\lib\site-packages\requests\sessions.py", line 461, in request
resp = self.send(prep, **send_kwargs)
File "C:\Python27\lib\site-packages\requests\sessions.py", line 573, in send
r = adapter.send(request, **kwargs)
File "C:\Python27\lib\site-packages\requests\adapters.py", line 431, in send
raise SSLError(e, request=request)
requests.exceptions.SSLError: EOF occurred in violation of protocol (_ssl.c:581)
What should I do?
According to this post, there is no support for the HTTPS protocol yet: "requests.exceptions.SSLError: [Errno 8] _ssl.c:504: EOF occurred in violation of protocol"
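An "EOF occurred in violation of protocol" error usually means the connection was closed during the TLS handshake. One way to narrow it down is to attempt the handshake with the standard library alone and see which protocol version, if any, gets negotiated. This is a diagnostic sketch for Python 3.7 or later, not a fix; `make_context` and `negotiated_tls_version` are hypothetical helper names, and the TLS 1.2 minimum is an assumption.

```python
import socket
import ssl


def make_context(minimum=ssl.TLSVersion.TLSv1_2):
    """Create a client context with default verification and a minimum TLS version."""
    context = ssl.create_default_context()
    context.minimum_version = minimum
    return context


def negotiated_tls_version(host, port=443, timeout=10.0, context=None):
    """Open a TLS connection and return the negotiated protocol name."""
    context = context or make_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            return tls.version()


# Usage (not executed here, because it needs network access):
#   print(negotiated_tls_version("twitter.com"))
```

If the handshake succeeds here but fails under Requests, the Python or OpenSSL build used by the failing interpreter is a reasonable next thing to compare.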