I am using Python's urllib2 library and have run into a strange and nasty problem.
Windows 7.
My code:
import urllib2 as url_request
opener = url_request.build_opener(url_request.ProxyHandler({'http': 'http://login:password@server:8080'}))
request = url_request.Request("http://localhost")
response = opener.open(request)
print response.read()
It works perfectly well, but when I change localhost to 127.0.0.1 this error happens:
HTTPError: HTTP Error 502: Proxy Error ( Forefront TMG denied the specified Uniform Resource Locator (URL). )
Other addresses, such as google.com, open successfully.
The only problem is 127.0.0.1.
Any ideas?
Set a no_proxy or NO_PROXY environment variable containing 127.0.0.1, optionally with localhost too:
import os
os.environ['no_proxy'] = '127.0.0.1,localhost'
On Windows, the ProxyOverride key in the HKCU\Software\Microsoft\Windows\CurrentVersion\Internet Settings registry is consulted as well. You probably have localhost registered as an exception there. Check your proxy settings to verify this.
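A small sketch of the same idea for Python 3 (where urllib2 became urllib.request), assuming you want to keep any exclusions that are already set in the environment. The helper name add_no_proxy_hosts is made up for this example; proxy_bypass_environment is the urllib.request function that evaluates no_proxy.

```python
import os
import urllib.request


def add_no_proxy_hosts(hosts):
    """Append hosts to no_proxy without discarding existing entries.

    Hypothetical helper for illustration, not part of urllib.
    """
    raw = os.environ.get("no_proxy", "")
    current = [h.strip() for h in raw.split(",") if h.strip()]
    for host in hosts:
        if host not in current:
            current.append(host)
    os.environ["no_proxy"] = ",".join(current)
    return current


add_no_proxy_hosts(["127.0.0.1", "localhost"])

# Truthy when the environment exempts the host from proxying.
bypassed = bool(urllib.request.proxy_bypass_environment("127.0.0.1"))
```

The variable has to be set before the opener makes its first request, since the proxy settings are read from the environment at that point.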
I am using Python 3.7 socket to get the FQDN, fully qualified domain name.
It works for some, e.g.
socket.getfqdn('indiana.edu')
'www.indiana.edu'
and doesn't work for others, e.g.
socket.getfqdn('google.com')
'lga34s18-in-f14.1e100.net'
Using lga34s18-in-f14.1e100.net in the browser gives the 404 error, url not found.
Ok, google.com is just one example. Here is another one:
socket.getfqdn('www.finastra.com')
'ec2-52-51-237-24.eu-west-1.compute.amazonaws.com'
And using the URL 'ec2-52-51-237-24.eu-west-1.compute.amazonaws.com' doesn't work, obviously. So they host their website on AWS, but why does socket return that name as the FQDN? Isn't 'finastra.com' the FQDN?
socket.getfqdn() calls socket.gethostbyaddr() (as stated in the docs) to resolve the name. When given a hostname, socket.gethostbyaddr() first resolves it to an IP address and then does a reverse lookup on that address (unless the name is in your hosts file), so the result is whatever name the reverse DNS record for the server behind google.com or www.finastra.com is configured to return.
Using lga34s18-in-f14.1e100.net in the browser gives the 404 error, url not found.
This is because your browser sends a Host header which is filled with the hostname from the URL. A single webserver can serve content from different hosts. For example, the following request won't return a 404 Not Found:
curl ec2-52-51-237-24.eu-west-1.compute.amazonaws.com -H 'Host: www.finastra.com'
Removing -H 'Host: www.finastra.com' will cause curl to send a Host: ec2-52-51-237-24.eu-west-1.compute.amazonaws.com header instead, and the server will return 404 Not Found.
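The same experiment can be set up from Python. This is only a sketch using Python 3's urllib.request: it builds the request object so the header can be inspected, and actually sending it is left as a commented-out line because it needs network access. The function name build_vhost_request is invented for this example.

```python
import urllib.request


def build_vhost_request(server, virtual_host):
    """Build a request to `server` that asks for `virtual_host`'s site.

    `server` is the machine to connect to; `virtual_host` is the value
    placed in the Host header. Hypothetical helper for illustration.
    """
    return urllib.request.Request(
        "http://%s/" % server,
        headers={"Host": virtual_host},
    )


req = build_vhost_request(
    "ec2-52-51-237-24.eu-west-1.compute.amazonaws.com",
    "www.finastra.com",
)
# urllib keeps a caller-supplied Host header instead of deriving one from the URL.
host_header = req.get_header("Host")
# urllib.request.urlopen(req) would send the request.
```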
I'm using pycURL to make a few requests to an HTTPS site through an HTTP proxy.
Here's my code:
import cStringIO
import pycurl
buf = cStringIO.StringIO() # collects the response body
c = pycurl.Curl()
c.setopt(c.URL, url) # 'url' is the base url of the form https://www.target.com
c.setopt(c.PROXY, proxy) # 'proxy' has the form 1.2.3.4:8080
c.setopt(c.WRITEFUNCTION, buf.write)
c.perform()
I've tried this code with different proxies. I get either Proxy CONNECT aborted or Received HTTP code 400 from proxy after CONNECT.
Is there something I'm missing? Should I be using https proxies instead? I've looked around and can't seem to find any help or documentation on pycURL's usage.
Any help appreciated. Thanks!
I have a problem similar to yours, and my error log is:
fatal: unable to access 'https://github.com/nhn/raphael.git/': Received HTTP code 400 from proxy after CONNECT
I used the following commands to resolve it.
First, open your global git configuration:
git config --global --edit
Then delete this config entry:
[remote "origin"]
proxy = https://github.com/facette/facette.git
I'm trying to scrape my own site from my local server, but when I use Python requests on it, I get a 503 response. Other ordinary sites on the web work. Is there a reason or solution for this?
import requests
url = 'http://127.0.0.1:8080/full_report/a1uE0000002vu2jIAA/'
r = requests.get(url)
print r
prints out
<Response [503]>
After further investigation, I've found a similar problem to mine.
Python requests 503 erros when trying to access localhost:8000
However, I don't think he's solved it yet. I can access the local website via the web browser but can't access using the requests.get function. I'm also using Django to host the server.
python manage.py runserver 8080
When I use:
curl -vvv http://127.0.0.1:8080
* Rebuilt URL to: http://127.0.0.1:8080/
* Trying 10.37.135.39...
* Connected to proxy.kdc.[company-name].com (10.37.135.39) port 8099 (#0)
* Proxy auth using Basic with user '[company-id]'
> GET http://127.0.0.1:8080/ HTTP/1.1
> Host: 127.0.0.1:8080
> Proxy-Authorization: Basic Y2FhNTc2OnJ2YTkxQ29kZQ==
> User-Agent: curl/7.49.0
> Accept: */*
>
< HTTP/1.1 301 Moved Permanently
< Server: BlueCoat-Security-Appliance
< Location:http://10.118.216.201
< Connection: Close
<
<HTML>
<HEAD><TITLE>Redirection</TITLE></HEAD>
<BODY><H1>Redirect</H1></BODY>
* Closing connection 0
I cannot request a local url using python requests because the company's network software won't allow it. This is a dead end and other avenues must be pursued.
EDIT: Working Solution
>>> import requests
>>> session = requests.Session()
>>> session.trust_env = False
>>> r = session.get("http://127.0.0.1:8080")
>>> r
<Response [200]>
Maybe you should disable your proxies in your requests.
import requests
proxies = {
    "http": None,
    "https": None,
}
requests.get("http://127.0.0.1:8080/myfunction", proxies=proxies)
ref:
https://stackoverflow.com/a/35470245/8011839
https://2.python-requests.org//en/master/user/advanced/#proxies
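For code that uses only the standard library, a comparable effect can be had with urllib. This is a sketch for Python 3, assuming you want an opener that ignores every proxy setting; the function name build_direct_opener is made up for this example.

```python
import urllib.request


def build_direct_opener():
    """Return an opener that never uses a proxy.

    An empty dict tells ProxyHandler to skip proxy auto-detection
    (environment variables, and system settings on Windows/macOS).
    """
    return urllib.request.build_opener(urllib.request.ProxyHandler({}))


opener = build_direct_opener()
# opener.open("http://127.0.0.1:8080") would now connect directly.
```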
HTTP Error 503 means:
The Web server (running the Web site) is currently unable to handle the HTTP request due to a temporary overloading or maintenance of the server. The implication is that this is a temporary condition which will be alleviated after some delay. Some servers in this state may also simply refuse the socket connection, in which case a different error may be generated because the socket creation timed out.
You can try the following:
Check whether you can open the URL in the browser.
If the URL opens, check the domain in your code; it might be incorrect.
If it does not open in the browser either, your site may be overloaded, or the server may lack the resources to handle the request.
The most common cause of a 503 error is that a proxy host of some form is unable to communicate with the back end. For example, if you have Varnish trying to handle a request but Apache is down.
In your case, you have Django running on port 8080. (That's what the 8080 means). When you try to get content from 127.0.0.1, though, you're going to the default HTTP port (80). This means that your default server (Apache maybe? NginX?) is trying to find a host to serve 127.0.0.1 and can't find one.
You have two choices. Either you can update your server's configuration, or you can include the port in the URL.
url = 'http://127.0.0.1:8080/full_report/a1uE0000002vu2jIAA/'
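To check which port a given URL actually targets, here is a quick sketch using Python 3's urllib.parse. The helper effective_port and the DEFAULT_PORTS table are assumptions made for this example, not part of requests or Django.

```python
from urllib.parse import urlsplit

# Ports a client falls back to when the URL does not name one.
DEFAULT_PORTS = {"http": 80, "https": 443}


def effective_port(url):
    """Return the port a client would connect to for `url`."""
    parts = urlsplit(url)
    # parts.port is None when the URL has no explicit port.
    return parts.port or DEFAULT_PORTS.get(parts.scheme)


with_port = effective_port("http://127.0.0.1:8080/full_report/a1uE0000002vu2jIAA/")
without_port = effective_port("http://127.0.0.1/full_report/a1uE0000002vu2jIAA/")
```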
I'm trying to use SSH tunnels inside of Python's urllib2.
Creating the tunnel:
ssh -N user@machine.place.edu -L 1337:localhost:80
The above line should forward port 1337 on the local machine to port 80 on the remote machine.
I used -N, so the bash prompt (intentionally) hangs for as long as this tunnel is running.
Using the tunnel in urllib2:
import urllib2
url = "http://ifconfig.me/ip"
headers={'User-agent' : 'Mozilla/5.0'}
proxy_support = urllib2.ProxyHandler({'http': 'http://127.0.0.1:1337'})
opener = urllib2.build_opener(proxy_support, urllib2.HTTPHandler(debuglevel=1))
urllib2.install_opener(opener)
req = urllib2.Request(url, None, headers)
html = urllib2.urlopen(req).read()
print html
When I run the above code, html = urllib2.urlopen(req).read() throws the error urllib2.HTTPError: HTTP Error 404: Not Found.
What might be going wrong, and how can we fix it?
Troubleshooting:
If I turn off the SSH tunnel, the error changes to urllib2.URLError: <urlopen error [Errno 61] Connection refused>. So, Python is clearly "seeing" the SSH tunnel.
If I comment out the proxy stuff by replacing opener = urllib2.build_opener(proxy_support, urllib2.HTTPHandler(debuglevel=1)) with opener = urllib2.build_opener(), then the ifconfig.me page downloads properly. (Of course, the project that I'm working on requires me to access documents from a few different networks, so I still need proxies to work.)
Some StackOverflow posts suggest using Requests instead of urllib2. I wouldn't mind using Requests instead -- I just used urllib2 here because I wasn't sure how to do custom headers (e.g. user-agent, referer) in Requests.
Unfortunately, since you're the only one with access to machine.place.edu, it's going to be impossible for anyone else to reproduce the problem.
First of all, try something like...
$ telnet localhost 1337
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
GET http://ifconfig.me/ip HTTP/1.0
...and hit enter a couple of times after the 'GET' line, and see what you get back.
If you get a 404, there's probably something wrong with the proxy.
If you get a 200, then you should be able to recreate that fairly easily with httplib.
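A rough Python 3 equivalent of the telnet test, where httplib is now called http.client. It is a sketch that assumes whatever listens on the tunnel port speaks HTTP proxy protocol; the function name fetch_via_proxy is invented here. Note that http.client sends HTTP/1.1 rather than the HTTP/1.0 typed by hand above.

```python
import http.client


def fetch_via_proxy(proxy_host, proxy_port, url, timeout=10):
    """Send `GET <absolute url>` to an HTTP proxy and return (status, body)."""
    conn = http.client.HTTPConnection(proxy_host, proxy_port, timeout=timeout)
    try:
        # A proxy expects the full URL on the request line, as in the telnet test.
        conn.request("GET", url, headers={"User-Agent": "Mozilla/5.0"})
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        conn.close()


# Example call, assuming the tunnel from the question listens on port 1337:
# status, body = fetch_via_proxy("127.0.0.1", 1337, "http://ifconfig.me/ip")
```

If this returns 404 as well, the problem is on the far end of the tunnel rather than in urllib2.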
I am trying to do a simple GET request through a proxy server:
import requests
test=requests.get("http://google.com", proxies={"http": "112.5.254.30:80"})
print test.text
The address of the proxy server in the code is just one from a freely available proxy list on the internet. The point is that this same proxy server works when I use it from a browser, but it doesn't work from this program. I tried many different proxy servers, and none of them works with the code above.
Here is what I get for this proxy server:
The requested URL could not be retrieved While trying to retrieve the URL: http:/// The following error was encountered:
Unable to determine IP address from host name for
The dnsserver returned: Invalid hostname
This means that: The cache was not able to resolve the
hostname presented in the URL. Check if the address is correct.
I know it's an old question, but it should be:
import requests
test = requests.get("http://google.com", proxies={"http": "http://112.5.254.30:80", "https": "http://112.5.254.30:80"})
print(test.text)
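The difference from the question's code is the explicit http:// scheme on the proxy address. Here is a minimal sketch of a helper that adds it when missing; build_proxies is a hypothetical name, and the requests call is left commented out because it needs a working proxy.

```python
def build_proxies(address):
    """Return a proxies mapping with an explicit scheme on the proxy URL.

    Hypothetical helper: `address` may be "host:port" or already a full URL.
    """
    if "://" not in address:
        address = "http://" + address
    # Route both plain and TLS traffic through the same proxy.
    return {"http": address, "https": address}


proxies = build_proxies("112.5.254.30:80")
# requests.get("http://google.com", proxies=proxies)
```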