Requests library not properly directing HTTP requests through proxies - python

I know how to use requests very well, yet for some reason I am not succeeding in getting the proxies working. I am making the following request:
r = requests.get('http://whatismyip.com', proxies={'http': 'http://148.236.5.92:8080'})
I get the following:
requests.exceptions.ConnectionError: [Errno 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond
Yet, I know the proxy works, because using node:
request.get({uri: 'http://www.whatismyip.com', proxy: 'http://148.236.5.92:8080'},
function (err, response, body) {var $ = cheerio.load(body); console.log($('#greenip').text());});
I get the following (correct) response:
148.236.5.92
Furthermore, if I change the requests call at all (say, by omitting the http:// in front of the proxy), the request simply goes through normally without using the proxy and without returning an error.
What am I doing wrong in Python?

It's a known issue: https://github.com/kennethreitz/requests/issues/1074
I'm not sure exactly why it's taking so long to fix. To answer your question, though: you're doing nothing wrong.

As sigmavirus24 says, this is a known issue, which has been fixed, but hasn't yet been packaged up into a new version and pushed to PyPI.
So, if you need this in a hurry, you can upgrade from the git repo's master.
If you're using pip, this is simple. Instead of this:
pip install -U requests
Do this:
pip install -U git+https://github.com/kennethreitz/requests
If you're not using pip, you'll probably have to explicitly git clone the repo, then run easy_install . or python setup.py install (or whatever you normally use) from your local copy.
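Either way, once you've upgraded, a quick sanity check is to print the installed version and retry the call from the question. This is just a sketch; the proxy IP is the one from the question and may well be dead by now:
import requests

print(requests.__version__)  # should show the version installed from master

proxies = {'http': 'http://148.236.5.92:8080'}
r = requests.get('http://whatismyip.com', proxies=proxies)
print(r.status_code)
print(r.text[:200])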

Related

Pytrends: The request failed: Google returned a response with code 429

I'm using Pytrends to extract Google trends data, like:
from pytrends.request import TrendReq
pytrend = TrendReq()
pytrend.build_payload(kw_list=['bitcoin'], cat=0, timeframe=from_date+' '+today_date)
And it returns an error:
ResponseError: The request failed: Google returned a response with code 429.
This worked yesterday, but for some reason it doesn't work now! The sample code from the GitHub repo failed too:
pytrends = TrendReq(hl='en-US', tz=360, proxies = {'https': 'https://34.203.233.13:80'})
How can I fix this? Thanks a lot!
TLDR; I solved the problem with a custom patch
Explanation
The problem comes from Google's bot recognition system. Like other similar systems, it stops serving requests that arrive too frequently from suspicious clients. One of the features used to recognize trustworthy clients is the presence of specific headers generated by the JavaScript code running on Google's pages. Unfortunately, the python requests library cannot provide that level of camouflage, since it does not execute JavaScript at all.
So the idea behind my patch is to reuse the headers my browser generates when it talks to Google Trends. Those headers are produced while I am logged in with my Google account, in other words they are tied to my account, so as far as Google is concerned, I am trustworthy.
Solution
I solved it in the following way:
First of all you must use google trends from your web browser while you are logged in with your Google Account;
To track the actual HTTP GET being made (I am using Chromium), go into "More Tools" -> "Developer Tools" -> "Network" tab.
Visit the Google Trend page and perform a search for a trend; it will trigger a lot of HTTP requests on the left sidebar of the "Network" tab;
Identify the GET request (in my case it was /trends/explore?q=topic&geo=US) and right-click on it and select Copy -> Copy as cURL;
Then go to this page and paste the cURL script on the left side and copy the "headers" dictionary you can find inside the python script generated on the right side of the page;
Then go to your code and subclass the TrendReq class, so you can pass the custom header just copied:
import requests
from pytrends.request import TrendReq as UTrendReq

GET_METHOD = 'get'

headers = {
    # paste here the "headers" dictionary copied from the generated python script
    ...
}

class TrendReq(UTrendReq):
    def _get_data(self, url, method=GET_METHOD, trim_chars=0, **kwargs):
        return super()._get_data(url, method=GET_METHOD, trim_chars=trim_chars, headers=headers, **kwargs)
Remove any "import TrendReq" from your code since now it will use this you just created;
Retry again;
If the error message comes back at some point in the future, repeat the procedure: you need to refresh the header dictionary with new values, and the request may trigger the captcha mechanism.
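For completeness, a hypothetical usage of the patched class, reusing the payload from the question (the timeframe here is just an example, not from the original post):
pytrend = TrendReq(hl='en-US', tz=360)
pytrend.build_payload(kw_list=['bitcoin'], cat=0, timeframe='2018-01-01 2018-12-31')
print(pytrend.interest_over_time().head())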
This one took a while, but it turned out the library just needed an update. You can check out the two approaches I posted here, both of which resulted in Status 429 responses:
https://github.com/GeneralMills/pytrends/issues/243
Ultimately, I was able to get it working again by running the following command from my bash prompt:
Run the following to get the latest version:
pip install --upgrade --user git+https://github.com/GeneralMills/pytrends
Hope that works for you too.
EDIT:
If you can't upgrade from source you may have some luck with:
pip install pytrends --upgrade
Also, make sure you're running git as an administrator if on Windows.
I had the same problem even after updating the module with pip install --upgrade --user git+https://github.com/GeneralMills/pytrends and restarting Python.
But, the issue was solved via the below method:
Instead of
pytrends = TrendReq(hl='en-US', tz=360, timeout=(10,25), proxies=['https://34.203.233.13:80',], retries=2, backoff_factor=0.1, requests_args={'verify':False})
I just ran:
pytrend = TrendReq()
Hope this can be helpful!
After running the upgrade command via pip install, you should restart the python kernel and reload the pytrend library.
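To confirm that the restarted interpreter actually picked up the upgraded package, something like this should work (a sketch using pkg_resources, which ships with setuptools):
import pkg_resources
print(pkg_resources.get_distribution('pytrends').version)

from pytrends.request import TrendReq
pytrend = TrendReq()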

Can't route requests in a python script through tor without sudo

I'm trying to route requests in a python script through tor.
Here's the code:
#!/usr/bin/env python3
import socket
import socks
from urllib import request

# route every new socket through the local Tor SOCKS proxy
socks.set_default_proxy(socks.SOCKS5, '127.0.0.1', 9050)
socket.socket = socks.socksocket

ip = request.urlopen('https://api.ipify.org/').read()
print(ip)
When I try to run it as a user (just "./script.py"), it crashes with the following error:
urllib.error.URLError: <urlopen error Socket error: 0x01: General SOCKS server failure>
But if I run the script with sudo ("sudo ./script.py"), it works as expected and prints a tor IP. How can I get it to work without sudo?
Edit 1: I think the tor installation is ok, because it works fine with other languages (for example, I can perform requests from a Go script). Also, I can get my python script to work with tor by passing a proxies dict to requests.get() (as suggested in a comment below; see the sketch after Edit 2). This solution is acceptable, but I am still wondering what's wrong with my script.
Edit 2: I'm running Linux Mint 18.3 64-bit. Python and Python3 are pre-installed. Tor was installed via repository (sudo apt-get install tor). I tried installing PySocks globally (sudo pip3 install PySocks) and only for current user (pip3 install --user PySocks).
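For reference, a minimal sketch of the workaround mentioned in Edit 1 (it assumes PySocks / requests[socks] is installed; the socks5h scheme makes DNS resolution go through Tor as well):
import requests

proxies = {
    'http': 'socks5h://127.0.0.1:9050',
    'https': 'socks5h://127.0.0.1:9050',
}
print(requests.get('https://api.ipify.org/', proxies=proxies).text)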
Did you try looking at https://tor.stackexchange.com/questions/7101/general-socks-server-failure-while-using-tor-proxy? This seems to be what you are looking for, and they indicate the issue can be resolved by using an intermediate connection on 127.0.0.2. Take a look!
I have tried your code on my machine (after installing the right packages) and it works as it should. Something should be wrong with the way Tor is installed/run in your system. Feel free to add details about your installation to your question and I'll try to take a look at it.

How can I solve CondaHTTPError in anaconda?

I'm using Windows 10 Pro (x64) and I have just installed Anaconda 4.3.1.
But whenever I try to install a package or update conda, it shows an error like the one below.
(d:\Miniconda3) C:\Windows\system32>conda update conda
Fetching package metadata .....
CondaHTTPError: HTTP None None for url <None>
Elapsed: None
An HTTP error occurred when trying to retrieve this URL.
SSLError(SSLError(SSLError("bad handshake: Error([('SSL routines', 'ssl3_read_bytes', 'sslv3 alert bad record mac')],)",),),)
conda config --set ssl_verify False doesn't make any difference either.
I have no problem installing packages using pip.
Errors like this, coming from tools with a high level of abstraction, are usually really hard to debug from within the tool itself (it takes quite a lot of digging around in the tool's code to pinpoint the issue); to the extent that in the vast majority of cases, once you've debugged it, you know enough about the tool to write a patch for the problem yourself.
What I would recommend is to first trace how conda gets the metadata shown in the first line of your output. On UNIX I would recommend tcpdump, but on Windows I would use Wireshark (although, according to the Wikipedia page for tcpdump, it works on Windows too).
Once you know which host the package metadata should be coming from, you can try to understand why the error occurs. The bad record mac error should not happen under normal conditions; either you have a network problem (try another network), a server problem (more likely if conda used to work), or a client problem.
To try to debug the SSL issue once you know the host, run:
openssl s_client -connect $host:443 -msg -debug
Where $host is the host you found using tcpdump/wireshark.
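As a complementary check from Python, you can try to reproduce the handshake outside conda with requests. The URL below is only an assumption about the default Anaconda channel of that era; substitute whatever host your capture actually shows:
import requests

try:
    r = requests.get('https://repo.continuum.io/pkgs/free/win-64/repodata.json', timeout=10)
    print(r.status_code)
except requests.exceptions.SSLError as exc:
    print('SSL failure:', exc)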
Best of luck!
Note: I have not linked wireshark.org in this answer, but instead the Wikipedia page for Wireshark, to avoid supporting bogus security practices. Please do not edit that link.

Problems With Installing Python Nmap on Ubuntu

I am completely new to Ubuntu and Linux, so I really have no clue what I am doing wrong here. I'm copying this command out of a book I am using to learn penetration testing with Python for a college course, but it is not working. Below is the command as I enter it into the terminal, followed by the response it elicits. What is wrong with the command I am entering?
programmer:~# wget http://zael.org/norman/python/python-nmap/python-nmap-0.3.2.tar.gz-On map.tar.gz
Resolving xael.org (xael.org)... 213.152.29.60
Connecting to xael.org (xael.org)|213.152.29.60|:80... connected.
HTTP request sent, awaiting response... 404 Not Found
2014-01-21 16:10:51 ERROR 404: Not Found.
--2014-01-21 16:10:51-- http://nmap.tar.gz/
Resolving nmap.tar.gz (nmap.tar.gz)... failed: Name or service not known.
wget: unable to resolve host address `nmap.tar.gz'
It's because you missed a white space between the link and the -O option. Try:
wget http://xael.org/norman/python/python-nmap/python-nmap-0.3.2.tar.gz -O nmap.tar.gz
You could also use pip to install the package:
pip install python-nmap
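Once it is installed, a minimal smoke test of python-nmap looks roughly like this (a sketch; it assumes the nmap binary itself is already on your PATH):
import nmap

scanner = nmap.PortScanner()
scanner.scan('127.0.0.1', '22-443')  # scan localhost on a small port range
for host in scanner.all_hosts():
    print(host, scanner[host].state())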

Python urllib2 timeout; can ping server fine, and wget works fine;

I'm trying to use Python's urllib2 to access the Dreamhost API documented here: http://wiki.dreamhost.com/API
Here is my code:
import urllib2

request = urllib2.Request('https://api.dreamhost.com/?key=<key>')
response = urllib2.urlopen(request)
page = response.read()
print(page)
This invariably fails with the error:
urllib2.URLError: <urlopen error [Errno 104] Connection reset by peer>
I'm absolutely stumped, because I can ping api.dreamhost.com just fine, and wget https://api.dreamhost.com/?key= works fine, too.
Any ideas?
I know it's an old question, but I faced the same problem, and found the solution through two other questions.
This one, which shows that the problem is with the handshake using SSLv3:
OpenSSL issues in Debian Wheezy
And this one, which gives some possible solutions:
Python HTTPS requests (urllib2) to some sites fail on Ubuntu 12.04 without proxy
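One workaround that comes out of those questions is forcing a TLSv1 handshake instead of SSLv3. Here is a rough Python 2 sketch of that approach; the class names are mine, not from the linked answers:
import httplib
import socket
import ssl
import urllib2

class TLSv1Connection(httplib.HTTPSConnection):
    # like HTTPSConnection, but forces TLSv1 instead of letting OpenSSL pick SSLv3
    def connect(self):
        sock = socket.create_connection((self.host, self.port), self.timeout)
        self.sock = ssl.wrap_socket(sock, self.key_file, self.cert_file,
                                    ssl_version=ssl.PROTOCOL_TLSv1)

class TLSv1Handler(urllib2.HTTPSHandler):
    def https_open(self, req):
        return self.do_open(TLSv1Connection, req)

opener = urllib2.build_opener(TLSv1Handler())
print(opener.open('https://api.dreamhost.com/?key=<key>').read())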
