Using python-requests with a cntlm corporate proxy - Python

I have some Python applications I'd like to run behind a corporate proxy. I'm using cntlm, and I think it's configured properly so far, because wget, curl, pip and so on all work fine.
For example:
sudo wget http://apple.com
Console-Output:
andre@VirtualBox:~$ wget apple.com
--2018-04-04 10:38:55-- http://apple.com/
Connecting to 127.0.0.1:3128... connected.
Proxy request sent, awaiting response... 301 Moved Permanently
Location: https://www.apple.com/ [following]
--2018-04-04 10:38:56-- https://www.apple.com/
Connecting to 127.0.0.1:3128... connected.
Proxy request sent, awaiting response... 200 OK
Length: 45704 (45K) [text/html]
Saving to: ‘index.html.3’
index.html.3 100%[===================>] 44,63K --.-KB/s in 0s
2018-04-04 10:38:56 (312 MB/s) - ‘index.html.3’ saved [45704/45704]
works, whereas this:
import requests
url = 'https://apple.com'
session = requests.session()
r = session.get(url)
print(r.text)
Console-Output:
Traceback (most recent call last):
File "/home/andre/.local/lib/python3.5/site-packages/urllib3/connectionpool.py", line 595, in urlopen
self._prepare_proxy(conn)
File "/home/andre/.local/lib/python3.5/site-packages/urllib3/connectionpool.py", line 816, in _prepare_proxy
conn.connect()
File "/home/andre/.local/lib/python3.5/site-packages/urllib3/connection.py", line 294, in connect
self._tunnel()
File "/usr/lib/python3.5/http/client.py", line 832, in _tunnel
message.strip()))
OSError: Tunnel connection failed: 407 Proxy Authentication Required
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/andre/.local/lib/python3.5/site-packages/requests/adapters.py", line 440, in send
timeout=timeout
File "/home/andre/.local/lib/python3.5/site-packages/urllib3/connectionpool.py", line 639, in urlopen
_stacktrace=sys.exc_info()[2])
File "/home/andre/.local/lib/python3.5/site-packages/urllib3/util/retry.py", line 388, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='apple.com', port=443): Max retries exceeded with url: / (Caused by ProxyError('Cannot connect to proxy.', OSError('Tunnel connection failed: 407 Proxy Authentication Required',)))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/andre/PycharmProjects/Notifications/de/seandre/rest/requesttest.py", line 5, in <module>
r = session.get(url)
File "/home/andre/.local/lib/python3.5/site-packages/requests/sessions.py", line 521, in get
return self.request('GET', url, **kwargs)
File "/home/andre/.local/lib/python3.5/site-packages/requests/sessions.py", line 508, in request
resp = self.send(prep, **send_kwargs)
File "/home/andre/.local/lib/python3.5/site-packages/requests/sessions.py", line 618, in send
r = adapter.send(request, **kwargs)
File "/home/andre/.local/lib/python3.5/site-packages/requests/adapters.py", line 502, in send
raise ProxyError(e, request=request)
requests.exceptions.ProxyError: HTTPSConnectionPool(host='apple.com', port=443): Max retries exceeded with url: / (Caused by ProxyError('Cannot connect to proxy.', OSError('Tunnel connection failed: 407 Proxy Authentication Required',)))
does not work.
Many thanks in advance!

You should set the HTTP_PROXY environment variable (note the @ between the credentials and the proxy host):
export HTTP_PROXY='http://mydomain\username:12345@servername.org:3128'
And try this example code (imports added for completeness):
import os
import urllib.request

url = 'https://example.com/example'
req = urllib.request.Request(
    url,
    data=None,
    headers={'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36'})
proxy = urllib.request.ProxyHandler({'https': os.environ['HTTP_PROXY']})
auth = urllib.request.HTTPBasicAuthHandler()
opener = urllib.request.build_opener(proxy, auth, urllib.request.HTTPHandler)
urllib.request.install_opener(opener)
data = urllib.request.urlopen(req)
But for me, the cntlm proxy is very useful.
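In the same spirit, you can also point requests at the local cntlm listener directly instead of relying on environment variables. A minimal sketch, assuming cntlm listens on 127.0.0.1:3128 as in the wget output above:

```python
import requests

# Assumption: cntlm is listening locally on 127.0.0.1:3128.
CNTLM = 'http://127.0.0.1:3128'

session = requests.Session()
# HTTPS requests are tunneled through the HTTP proxy via CONNECT,
# so both schemes point at the same cntlm endpoint.
session.proxies.update({'http': CNTLM, 'https': CNTLM})

print(session.proxies['https'])  # http://127.0.0.1:3128
# session.get('https://apple.com') would then go through cntlm,
# which handles the NTLM authentication against the corporate proxy.
```

With this in place, requests never needs the NTLM credentials itself; cntlm answers the 407 challenge on its behalf.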

Related

Python requests module timeouts with every https proxy and uses my real ip with http proxies

Basically, I get this error with every single HTTPS proxy I've tried, on every website.
Code:
import requests
endpoint = 'https://ipinfo.io/json'
proxies = {'http':'http://45.167.95.184:8085','https':'https://45.167.95.184:8085'}
r = requests.get(endpoint,proxies=proxies,timeout=10)
Error:
Traceback (most recent call last):
File "<pyshell#5>", line 1, in <module>
r = requests.get(endpoint,proxies=proxies,timeout=10)
File "C:\Users\Utente\AppData\Local\Programs\Python\Python39\lib\site-packages\requests\api.py", line 75, in get
return request('get', url, params=params, **kwargs)
File "C:\Users\Utente\AppData\Local\Programs\Python\Python39\lib\site-packages\requests\api.py", line 61, in request
return session.request(method=method, url=url, **kwargs)
File "C:\Users\Utente\AppData\Local\Programs\Python\Python39\lib\site-packages\requests\sessions.py", line 529, in request
resp = self.send(prep, **send_kwargs)
File "C:\Users\Utente\AppData\Local\Programs\Python\Python39\lib\site-packages\requests\sessions.py", line 645, in send
r = adapter.send(request, **kwargs)
File "C:\Users\Utente\AppData\Local\Programs\Python\Python39\lib\site-packages\requests\adapters.py", line 507, in send
raise ConnectTimeout(e, request=request)
requests.exceptions.ConnectTimeout: HTTPSConnectionPool(host='ipinfo.io', port=443): Max retries exceeded with url: /json (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x000001FDCFB9E6A0>, 'Connection to 45.167.95.184 timed out. (connect timeout=10)'))
And when I only use HTTP:
import requests
endpoint = 'https://ipinfo.io/json'
proxies = {'http':'http://45.167.95.184:8085'}
r = requests.get(endpoint,proxies=proxies,timeout=10)
The request is sent, but websites that return public IPs show my real IP. Both requests (2.27.1) and urllib3 (1.26.8) are updated to their latest versions. What could the issue be?
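The real-IP behaviour follows from how requests selects a proxy: the keys of the proxies dict are matched against the scheme of the URL being fetched, not the scheme of the proxy. A small illustration, no network access needed:

```python
# With only an 'http' key, an https:// request finds no matching proxy
# and connects directly - so the site sees your real IP.
proxies = {'http': 'http://45.167.95.184:8085'}

url = 'https://ipinfo.io/json'
scheme = url.split(':', 1)[0]
print(proxies.get(scheme))  # None -> requests would connect directly
```

Separately, a value of 'https://45.167.95.184:8085' tells requests to speak TLS to the proxy itself, which most plain HTTP proxies do not support and which can show up as a connect timeout; the usual form for the 'https' key is still an 'http://host:port' proxy URL.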

Popup data scraping on a given page range

I have code to parse popup info (emails), but I am unable to visit every page; it only scrapes one page. How could I add a range of pages to this code?
import re
import requests
from bs4 import BeautifulSoup
base_url = 'https://tenders.procurement.gov.ge/public/?lang=en'
url = 'https://tenders.procurement.gov.ge/public/library/controller.php?action=org_list'
profile_url = 'https://tenders.procurement.gov.ge/public/library/controller.php?action=profile&org_id='
num = re.compile(r'(\d+)')
with requests.session() as s:
    # load cookies:
    s.get(base_url)
    soup = BeautifulSoup(s.get(url).content, 'html.parser')
    for tr in soup.select('tr[onclick]'):
        n = num.search(tr['onclick']).group(1)
        soup2 = BeautifulSoup(s.get(profile_url + n).content, 'html.parser')
        email = soup2.select_one('td:contains("E-Mail") + td')
        print(email.text)
This code was offered by Andrej Keseley, and many thanks to him.
So what addition to this code would scrape pages in a given range?
You can try this code to load the next pages:
import re
import requests
from bs4 import BeautifulSoup
base_url = 'https://tenders.procurement.gov.ge/public/?lang=en'
url = 'https://tenders.procurement.gov.ge/public/library/controller.php?action=org_list'
profile_url = 'https://tenders.procurement.gov.ge/public/library/controller.php?action=profile&org_id='
next_url = 'https://tenders.procurement.gov.ge/public/library/controller.php?action=org_list&search_org_type=0&page=next&blacklisted=0'
num = re.compile(r'(\d+)')
with requests.session() as s:
    # load cookies:
    s.get(base_url)
    soup = BeautifulSoup(s.get(url).content, 'html.parser')
    while True:
        if not soup.select('tr[onclick]'):
            break
        for tr in soup.select('tr[onclick]'):
            n = num.search(tr['onclick']).group(1)
            soup2 = BeautifulSoup(s.get(profile_url + n).content, 'html.parser')
            email = soup2.select_one('td:contains("E-Mail") + td')
            print(email.text)
        soup = BeautifulSoup(s.get(next_url).content, 'html.parser')
Thanks Andrej, your code works as always! ;) But after about 9000 emails I get an error, so how could I add some kind of page-range variable? For example, I need emails 9000 to 12000 (or the same thing with page ranges).
Traceback (most recent call last):
File "C:\Users\Dell\PycharmProjects\Scrap\venv\lib\site-packages\urllib3\connection.py", line 141, in _new_conn
(self.host, self.port), self.timeout, **extra_kw)
File "C:\Users\Dell\PycharmProjects\Scrap\venv\lib\site-packages\urllib3\util\connection.py", line 83, in create_connection
raise err
File "C:\Users\Dell\PycharmProjects\Scrap\venv\lib\site-packages\urllib3\util\connection.py", line 73, in create_connection
sock.connect(sa)
TimeoutError: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\Dell\PycharmProjects\Scrap\venv\lib\site-packages\urllib3\connectionpool.py", line 600, in urlopen
chunked=chunked)
File "C:\Users\Dell\PycharmProjects\Scrap\venv\lib\site-packages\urllib3\connectionpool.py", line 345, in _make_request
self._validate_conn(conn)
File "C:\Users\Dell\PycharmProjects\Scrap\venv\lib\site-packages\urllib3\connectionpool.py", line 844, in _validate_conn
conn.connect()
File "C:\Users\Dell\PycharmProjects\Scrap\venv\lib\site-packages\urllib3\connection.py", line 284, in connect
conn = self._new_conn()
File "C:\Users\Dell\PycharmProjects\Scrap\venv\lib\site-packages\urllib3\connection.py", line 150, in _new_conn
self, "Failed to establish a new connection: %s" % e)
urllib3.exceptions.NewConnectionError: <urllib3.connection.VerifiedHTTPSConnection object at 0x00000074893451D0>: Failed to establish a new connection: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\Dell\PycharmProjects\Scrap\venv\lib\site-packages\requests\adapters.py", line 440, in send
timeout=timeout
File "C:\Users\Dell\PycharmProjects\Scrap\venv\lib\site-packages\urllib3\connectionpool.py", line 649, in urlopen
_stacktrace=sys.exc_info()[2])
File "C:\Users\Dell\PycharmProjects\Scrap\venv\lib\site-packages\urllib3\util\retry.py", line 388, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='tenders.procurement.gov.ge', port=443): Max retries exceeded with url: /public/library/controller.php?action=profile&org_id=67358 (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x00000074893451D0>: Failed to establish a new connection: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:/Users/Dell/PycharmProjects/Scrap/main.py", line 25, in
soup2 = BeautifulSoup(s.get(profile_url + n).content, 'html.parser')
File "C:\Users\Dell\PycharmProjects\Scrap\venv\lib\site-packages\requests\sessions.py", line 536, in get
return self.request('GET', url, **kwargs)
File "C:\Users\Dell\PycharmProjects\Scrap\venv\lib\site-packages\requests\sessions.py", line 523, in request
resp = self.send(prep, **send_kwargs)
File "C:\Users\Dell\PycharmProjects\Scrap\venv\lib\site-packages\requests\sessions.py", line 643, in send
r = adapter.send(request, **kwargs)
File "C:\Users\Dell\PycharmProjects\Scrap\venv\lib\site-packages\requests\adapters.py", line 504, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='tenders.procurement.gov.ge', port=443): Max retries exceeded with url: /public/library/controller.php?action=profile&org_id=67358 (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x00000074893451D0>: Failed to establish a new connection: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond'))
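One way to add the requested page range is to keep a page counter in the while-loop and only scrape profiles while the counter is inside the range. A minimal sketch of the counting logic (the range bounds and the ~100 rows per page figure are assumptions, not from the site):

```python
def pages_in_range(start_page, end_page):
    """Yield the page indices that should actually be scraped,
    skipping everything outside [start_page, end_page)."""
    page = 0
    while page < end_page:
        if page >= start_page:
            yield page
        page += 1

# If each listing page held ~100 rows (a guess), emails 9000-12000
# would correspond roughly to pages 90-119:
print(list(pages_in_range(90, 93)))  # [90, 91, 92]
```

In the loop above, that means: keep requesting next_url without parsing profiles while the counter is below the start of the range, and break out once it reaches the end, so a crash around email 9000 can be resumed without redoing the first 90 pages.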

SSL handshake failure on Google Translate

I have successfully been using the gTTS module in order to get audio from Google Translate for a while. I use it quite sparsely (I must have made 25 requests in total), and don't believe I could have hit any kind of limit that would cause my address to be blocked from using the service.
However, today, after trying to use it again (I hadn't used it in 1-2 months), I found that the following program:
from gtts import gTTS
tts = gTTS('hallo', 'de')
tts.save('hallo.mp3')
causes an error. I tracked down the problem and saw that even this simple program:
import requests
response = requests.get("https://translate.google.com/")
causes the following error:
Traceback (most recent call last):
File "C:\...\lib\site-packages\urllib3\connectionpool.py", line 601, in urlopen
chunked=chunked)
File "C:\...\lib\site-packages\urllib3\connectionpool.py", line 346, in _make_request
self._validate_conn(conn)
File "C:\...\lib\site-packages\urllib3\connectionpool.py", line 850, in _validate_conn
conn.connect()
File "C:\...\lib\site-packages\urllib3\connection.py", line 326, in connect
ssl_context=context)
File "C:\...\lib\site-packages\urllib3\util\ssl_.py", line 329, in ssl_wrap_socket
return context.wrap_socket(sock, server_hostname=server_hostname)
File "C:\...\lib\ssl.py", line 407, in wrap_socket
_context=self, _session=session)
File "C:\...\lib\ssl.py", line 814, in __init__
self.do_handshake()
File "C:\...\lib\ssl.py", line 1068, in do_handshake
self._sslobj.do_handshake()
File "C:\...\lib\ssl.py", line 689, in do_handshake
self._sslobj.do_handshake()
ssl.SSLEOFError: EOF occurred in violation of protocol (_ssl.c:777)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\...\lib\site-packages\requests\adapters.py", line 440, in send
timeout=timeout
File "C:\...\lib\site-packages\urllib3\connectionpool.py", line 639, in urlopen
_stacktrace=sys.exc_info()[2])
File "C:\...\lib\site-packages\urllib3\util\retry.py", line 388, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='translate.google.com', port=443): Max retries exceeded with url: / (Caused by SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:777)'),))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "main2.py", line 2, in <module>
response = requests.get("https://translate.google.com/")
File "C:\...\lib\site-packages\requests\api.py", line 72, in get
return request('get', url, params=params, **kwargs)
File "C:\...\lib\site-packages\requests\api.py", line 58, in request
return session.request(method=method, url=url, **kwargs)
File "C:\...\lib\site-packages\requests\sessions.py", line 508, in request
resp = self.send(prep, **send_kwargs)
File "C:\...\lib\site-packages\requests\sessions.py", line 618, in send
r = adapter.send(request, **kwargs)
File "C:\...\lib\site-packages\requests\adapters.py", line 506, in send
raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='translate.google.com', port=443): Max retries exceeded with url: / (Caused by SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:777)'),))
I would like to know if anyone has an idea what the issue could be. I can get on the Google Translate website without any problems from my browser, and have no issues using the audio either.
The accepted answer did not work for me since the code has changed; the way I got it to work was to add verify=False in gtts_token.py instead:
response = requests.get("https://translate.google.com/", verify=False)
This looks like an error related to your proxy settings, especially if you are using your work PC. I got the same issue, but with a different error message, for example:
gTTSError: Connection error during token calculation:
HTTPSConnectionPool(host='translate.google.com', port=443): Max
retries exceeded with url: / (Caused by SSLError(SSLError("bad
handshake: Error([('SSL routines', 'ssl3_get_server_certificate',
'certificate verify failed')],)",),))
To further investigate the issue, you can debug it from the command line:
(base) c:\gtts-cli "sample text to debug" --debug --output test.mp3
You should see results like the below:
ProxyError('Cannot connect to proxy.', OSError('Tunnel connection failed: 407 Proxy Authentication Required',)))
Solution:
I checked the gTTS documentation; there is no way to pass your proxy settings to the API. The workaround is to skip SSL verification, which is also not exposed by gTTS, so the only way is to change the following gtts files:
In tts.py, around line 208, change the request call to add verify=False:
r = requests.get(self.GOOGLE_TTS_URL,
                 params=payload,
                 headers=self.GOOGLE_TTS_HEADERS,
                 proxies=urllib.request.getproxies(),
                 verify=False)
In lang.py, line 56:
page = requests.get(URL_BASE, verify=False)
Then try the debug command line again; you should be able to get the file recorded now:
(base) c:\gtts-cli "sample text to debug" --debug --output test.mp3
gtts.tts - DEBUG - status-0: 200
gtts.tts - DEBUG - part-0 written to <_io.BufferedWriter name=test.mp3'>

python requests requests.exceptions.ConnectionError: Failed to establish a new connection: [Errno 10060]

I'm trying to crawl a website.
import requests

url = "http://www.hellotrade.com/business/"
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36",
    'Connection': 'close'
}
res = requests.get(url, headers=headers, timeout=30)
It runs perfectly at the beginning, but after running for a while, it produces this error message:
Traceback (most recent call last):
File "C:\Users\millshih\Desktop\hellotrade.py", line 32, in <module> res = s.get(url, headers = headers, timeout = 30)
File "C:\Python27\lib\site-packages\requests\sessions.py", line 521, in get
return self.request('GET', url, **kwargs)
File "C:\Python27\lib\site-packages\requests\sessions.py", line 508, in request resp = self.send(prep, **send_kwargs)
File "C:\Python27\lib\site-packages\requests\sessions.py", line 618, in send
r = adapter.send(request, **kwargs)
File "C:\Python27\lib\site-packages\requests\adapters.py", line 508, in send
raise ConnectionError(e, request=request)requests.exceptions.ConnectionError: HTTPConnectionPool(host='www.hellotrade.com', port=80): Max retries exceeded with url: /business/ (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x0294F3D0>: Failed to establish a new connection: [Errno 10060] \xb3s\xbdu\xb9\xc1\xb8\xd5\xa5\xa2\xb1\xd1\xa1A\xa6]\xac\xb0\xb3s\xbdu\xb9\xef\xb6H\xa6\xb3\xa4#\xacq\xae\xc9\xb6\xa1\xa8\xc3\xa5\xbc\xa5\xbf\xbdT\xa6^\xc0\xb3\xa1A\xa9\xce\xacO\xb3s\xbdu\xab\xd8\xa5\xdf\xa5\xa2\xb1\xd1\xa1A\xa6]\xac\xb0\xb3s\xbdu\xaa\xba\xa5D\xbe\xf7\xb5L\xaak\xa6^\xc0\xb3\xa1C',))
After this error, I have to wait until the next day before it can run again, and even then it hits the same problem after a couple of minutes.
But my browser has no trouble reaching this website, so my IP is not banned by the site.
I've tried this and some other suggestions from the internet, but nothing works.
I'm wondering whether it could be a problem with my internet connection?
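Given that the browser still works, the pattern (fine at first, then refused until the next day) looks more like server-side throttling than a connection problem. One common mitigation, sketched here with urllib3's Retry (the retry counts and backoff are illustrative choices, not a guaranteed fix), is to slow down and retry with backoff:

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Mount a retry policy with exponential backoff so transient refusals
# don't kill the crawl outright.
retry = Retry(total=5, backoff_factor=1.0,
              status_forcelist=[429, 500, 502, 503, 504])
session = requests.Session()
session.mount('http://', HTTPAdapter(max_retries=retry))
session.mount('https://', HTTPAdapter(max_retries=retry))

print(session.get_adapter('http://www.hellotrade.com/').max_retries.total)  # 5
```

Combining this with a time.sleep() between pages keeps the request rate low enough that many sites stop refusing connections.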

Python requests: ignore exceptions and errors while connecting to proxy server

I wrote my first program in Python.
#This program casts votes in online poll using different proxy servers for each request.
#It works, but some proxy servers cause errors crashing the whole thing.
#To avoid that, I would like it to skip those servers and ignore the errors.
import requests
import time

#Votes to be cast
votes = 5

#Makes proxy list
f = open('proxy2.txt')
lines = f.read().splitlines()
f.close()

#Vote counter
i = 1
#Proxy list counter
j = 0

while i <= votes:
    #Tests and moves to next proxy if there was a problem.
    try:
        r = requests.get('http://www.google.com')
    except requests.exceptions.RequestException:
        j = j + 1
    #Headers copied from my browser. Some of them cause errors. Could you tell me why?
    headers = {
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
        #'Accept-Encoding': 'gzip, deflate',
        #'Accept-Language': 'pl-PL,pl;q=0.8,en-US;q=0.6,en;q=0.4',
        #'Cache-Control': 'max-age=0',
        #'Connection': 'keep-alive',
        #'Content-Length': '101',
        'Content-Type': 'application/x-www-form-urlencoded',
        #'Host': 'www.mylomza.pl',
        #'Origin': 'http://www.mylomza.pl',
        #'Referer': 'http://www.mylomza.pl/home/lomza/item/11780-wybierz-miss-%C5%82ks-i-portalu-mylomzapl-video-i-foto.html',
        #'Upgrade-Insecure-Requests': '1',
        #'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36'
    }
    proxies = {
        'http': 'http://' + lines[j]  # e.g. 31.207.0.99:3128
    }
    r = requests.get('http://www.mylomza.pl/home/lomza/item/11780-wybierz-miss-%C5%82ks-i-portalu-mylomzapl-video-i-foto.html', headers=headers, proxies=proxies, timeout=10)
    #The funny part - form, that I have to post, requires some kind of ID and this is my way of getting it :P Feel free to suggest an alternative way.
    userid = r.text[(22222 - 32):22222]
    print('Voter', userid, 'registered.')
    data = {
        'voteid': '141',
        'task_button': 'Głosuj',
        'option': 'com_poll',
        'task': 'vote',
        'id': '25',
        userid: '1'
    }
    r = requests.post('http://www.mylomza.pl/home/lomza/item/index.php', headers=headers, cookies=r.cookies, data=data, proxies=proxies, timeout=10)
    print('Vote nr', i, 'cast from', lines[i])
    i = i + 1
    j = j + 1
    time.sleep(1)
What I need is to make it handle exceptions and errors.
#Tests and moves to next proxy if there was a problem.
try:
    r = requests.get('http://www.google.com')
except requests.exceptions.RequestException:
    j = j + 1
Besides that, I could use an alternative way of achieving this:
#The funny part - form, that I have to post, requires some kind of ID and this is my way of getting it :P Feel free to suggest an alternative way.
userid = r.text[(22222-32):22222]
Sometimes my method doesn't work (example below). The first vote went through, the second didn't, and then everything crashed.
Voter 53bf55490ebd07d9c190787c5c6ca44c registered.
Vote nr 1 cast from 111.23.6.161:80
Voter registered.
Vote nr 2 cast from 94.141.102.203:8080
Traceback (most recent call last):
File "C:\Users\Adrian\AppData\Local\Programs\Python\Python35-32\lib\site-packages\requests\packages\urllib3\connection.py", line 142, in _new_conn
(self.host, self.port), self.timeout, **extra_kw)
File "C:\Users\Adrian\AppData\Local\Programs\Python\Python35-32\lib\site-packages\requests\packages\urllib3\util\connection.py", line 91, in create_connection
raise err
File "C:\Users\Adrian\AppData\Local\Programs\Python\Python35-32\lib\site-packages\requests\packages\urllib3\util\connection.py", line 81, in create_connection
sock.connect(sa)
socket.timeout: timed out
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\Adrian\AppData\Local\Programs\Python\Python35-32\lib\site-packages\requests\packages\urllib3\connectionpool.py", line 578, in urlopen
chunked=chunked)
File "C:\Users\Adrian\AppData\Local\Programs\Python\Python35-32\lib\site-packages\requests\packages\urllib3\connectionpool.py", line 362, in _make_request
conn.request(method, url, **httplib_request_kw)
File "C:\Users\Adrian\AppData\Local\Programs\Python\Python35-32\lib\http\client.py", line 1083, in request
self._send_request(method, url, body, headers)
File "C:\Users\Adrian\AppData\Local\Programs\Python\Python35-32\lib\http\client.py", line 1128, in _send_request
self.endheaders(body)
File "C:\Users\Adrian\AppData\Local\Programs\Python\Python35-32\lib\http\client.py", line 1079, in endheaders
self._send_output(message_body)
File "C:\Users\Adrian\AppData\Local\Programs\Python\Python35-32\lib\http\client.py", line 911, in _send_output
self.send(msg)
File "C:\Users\Adrian\AppData\Local\Programs\Python\Python35-32\lib\http\client.py", line 854, in send
self.connect()
File "C:\Users\Adrian\AppData\Local\Programs\Python\Python35-32\lib\site-packages\requests\packages\urllib3\connection.py", line 167, in connect
conn = self._new_conn()
File "C:\Users\Adrian\AppData\Local\Programs\Python\Python35-32\lib\site-packages\requests\packages\urllib3\connection.py", line 147, in _new_conn
(self.host, self.timeout))
requests.packages.urllib3.exceptions.ConnectTimeoutError: (<requests.packages.urllib3.connection.HTTPConnection object at 0x03612730>, 'Connection to 94.141.102.203 timed out. (connect timeout=10)')
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\Adrian\AppData\Local\Programs\Python\Python35-32\lib\site-packages\requests\adapters.py", line 403, in send
timeout=timeout
File "C:\Users\Adrian\AppData\Local\Programs\Python\Python35-32\lib\site-packages\requests\packages\urllib3\connectionpool.py", line 623, in urlopen
_stacktrace=sys.exc_info()[2])
File "C:\Users\Adrian\AppData\Local\Programs\Python\Python35-32\lib\site-packages\requests\packages\urllib3\util\retry.py", line 281, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
requests.packages.urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='94.141.102.203', port=8080): Max retries exceeded with url: http://www.mylomza.pl/home/lomza/item/11780-wybierz-miss-%C5%82ks-i-portalu-mylomzapl-video-i-foto.html (Caused by ConnectTimeoutError(<requests.packages.urllib3.connection.HTTPConnection object at 0x03612730>, 'Connection to 94.141.102.203 timed out. (connect timeout=10)'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\Adrian\AppData\Local\Programs\Python\Python35-32\PollVoter.py", line 50, in <module>
r = requests.get('http://www.mylomza.pl/home/lomza/item/11780-wybierz-miss-%C5%82ks-i-portalu-mylomzapl-video-i-foto.html', headers=headers, proxies=proxies, timeout=10)
File "C:\Users\Adrian\AppData\Local\Programs\Python\Python35-32\lib\site-packages\requests\api.py", line 71, in get
return request('get', url, params=params, **kwargs)
File "C:\Users\Adrian\AppData\Local\Programs\Python\Python35-32\lib\site-packages\requests\api.py", line 57, in request
return session.request(method=method, url=url, **kwargs)
File "C:\Users\Adrian\AppData\Local\Programs\Python\Python35-32\lib\site-packages\requests\sessions.py", line 475, in request
resp = self.send(prep, **send_kwargs)
File "C:\Users\Adrian\AppData\Local\Programs\Python\Python35-32\lib\site-packages\requests\sessions.py", line 585, in send
r = adapter.send(request, **kwargs)
File "C:\Users\Adrian\AppData\Local\Programs\Python\Python35-32\lib\site-packages\requests\adapters.py", line 459, in send
raise ConnectTimeout(e, request=request)
requests.exceptions.ConnectTimeout: HTTPConnectionPool(host='94.141.102.203', port=8080): Max retries exceeded with url: http://www.mylomza.pl/home/lomza/item/11780-wybierz-miss-%C5%82ks-i-portalu-mylomzapl-video-i-foto.html (Caused by ConnectTimeoutError(<requests.packages.urllib3.connection.HTTPConnection object at 0x03612730>, 'Connection to 94.141.102.203 timed out. (connect timeout=10)'))
It looks like you're flooding the server with too many requests; that's why you're getting the other errors like requests.packages.urllib3.exceptions.MaxRetryError, since the server likely throttles the number of connections you can make in a given amount of time. You can try handling all the exceptions listed in your output, and you can also try making fewer requests to the URL.
[Edit] Or if you want to brute-force it and handle all errors and exceptions, try the following instead:
except:
    j = j + 1
[Edit:] You could try https: as well as http:
[Edit] Found this:
If the remote server is very slow, you can tell Requests to wait forever for a response, by passing None as a timeout value and then retrieving a cup of coffee.
r = requests.get('https://github.com', timeout=None)
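As for the ID extraction the question asks an alternative for: slicing at a fixed offset (r.text[(22222-32):22222]) breaks as soon as the page length shifts. Searching for the token's shape is more robust. A sketch, where the 32-hex-character assumption comes only from the voter ID printed in the question's output:

```python
import re

# The voter ID printed above ('53bf55490ebd07d9c190787c5c6ca44c') looks
# like a 32-character lowercase hex token; search for that pattern
# instead of relying on fixed offsets into the page.
ID_RE = re.compile(r'\b[0-9a-f]{32}\b')

sample_html = '<input type="hidden" name="53bf55490ebd07d9c190787c5c6ca44c" value="1">'
m = ID_RE.search(sample_html)
print(m.group(0) if m else None)  # 53bf55490ebd07d9c190787c5c6ca44c
```

If the page contains several such tokens, tighten the pattern to the surrounding markup (for example the hidden input field) rather than taking the first match.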
PROBLEM SOLVED
It turns out that I shouldn't open more than one connection per proxy server, but I have to make two requests. The solution was to send the first one from my own IP, then switch to the proxy for the second one:
r = requests.get(url, headers=headers, timeout=timeout)
try:
    r = requests.post(url, headers=headers, cookies=r.cookies, data=data, timeout=timeout, proxies=proxies)
except:
    j = j + 1
Works perfectly so far. :)
I had a similar thing and used
except:
    continue
which means the loop continues over again in case of exceptions and keeps 'trying'.
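A bare except: works, but it also swallows KeyboardInterrupt and genuine bugs; catching requests.exceptions.RequestException and moving on to the next proxy is the narrower pattern. A self-contained sketch of that loop shape (the fetch stub and proxy names are made up for illustration):

```python
import requests

def first_working_proxy(proxies_list, fetch):
    """Return the first proxy for which fetch(proxy) does not raise a
    requests exception, or None if all of them fail.
    `fetch` is the callable that performs the real GET."""
    for candidate in proxies_list:
        try:
            fetch(candidate)
        except requests.exceptions.RequestException:
            continue  # skip the bad proxy, try the next one
        return candidate
    return None

# The real fetch would be something like:
#   lambda p: requests.get(url, proxies={'http': 'http://' + p}, timeout=10)
# Here a stub demonstrates the skipping behaviour without network access:
def fake_fetch(proxy):
    if proxy != 'good:3128':
        raise requests.exceptions.ConnectTimeout()

print(first_working_proxy(['bad:80', 'good:3128'], fake_fetch))  # good:3128
```

This keeps Ctrl-C working and still "continues trying" across dead proxies, which is all the bare except: was buying.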
