I'm trying to scrape my own site, which runs on my local server, but when I request it with Python requests I get a 503 response. Ordinary sites on the web work fine. What could cause this, and how can I fix it?
import requests
url = 'http://127.0.0.1:8080/full_report/a1uE0000002vu2jIAA/'
r = requests.get(url)
print r
prints out
<Response [503]>
After further investigation, I found a question describing a similar problem:
Python requests 503 errors when trying to access localhost:8000
However, I don't think the asker has solved it yet. I can access the local website from a web browser but not with requests.get. I'm using Django to host the server.
python manage.py runserver 8080
When I use:
curl -vvv http://127.0.0.1:8080
* Rebuilt URL to: http://127.0.0.1:8080/
* Trying 10.37.135.39...
* Connected to proxy.kdc.[company-name].com (10.37.135.39) port 8099 (#0)
* Proxy auth using Basic with user '[company-id]'
> GET http://127.0.0.1:8080/ HTTP/1.1
> Host: 127.0.0.1:8080
> Proxy-Authorization: Basic Y2FhNTc2OnJ2YTkxQ29kZQ==
> User-Agent: curl/7.49.0
> Accept: */*
>
< HTTP/1.1 301 Moved Permanently
< Server: BlueCoat-Security-Appliance
< Location:http://10.118.216.201
< Connection: Close
<
<HTML>
<HEAD><TITLE>Redirection</TITLE></HEAD>
<BODY><H1>Redirect</H1></BODY>
* Closing connection 0
I cannot request a local URL using Python requests because the company's network software won't allow it. This is a dead end and other avenues must be pursued.
EDIT: Working Solution
>>> import requests
>>> session = requests.Session()
>>> session.trust_env = False
>>> r = session.get("http://127.0.0.1:8080")
>>> r
<Response [200]>
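The curl trace above shows the request for 127.0.0.1 being handed to the corporate proxy, which is consistent with proxy settings being picked up from the environment. As a quick check, here is a minimal Python 3 sketch that lists the proxy-related environment variables visible to the process. The helper name proxy_env_vars and the set of variable names it looks for are my own choices for illustration, not something from the original post.

```python
import os
import urllib.request


def proxy_env_vars(environ=None):
    """Return the proxy-related environment variables (names matched case-insensitively)."""
    environ = os.environ if environ is None else environ
    names = {"http_proxy", "https_proxy", "all_proxy", "no_proxy"}
    return {key: value for key, value in environ.items() if key.lower() in names}


if __name__ == "__main__":
    # Variables set in the shell that launched Python.
    print(proxy_env_vars())
    # Proxy mapping as the standard library resolves it for this process.
    print(urllib.request.getproxies())
```

If this prints a proxy for "http", that would explain why setting trust_env = False on the session makes the local request go through directly.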
Try disabling proxies for the request:
import requests
proxies = {
"http": None,
"https": None,
}
requests.get("http://127.0.0.1:8080/myfunction", proxies=proxies)
ref:
https://stackoverflow.com/a/35470245/8011839
https://2.python-requests.org//en/master/user/advanced/#proxies
HTTP Error 503 means:
The Web server (running the Web site) is currently unable to handle the HTTP request due to a temporary overloading or maintenance of the server. The implication is that this is a temporary condition which will be alleviated after some delay. Some servers in this state may also simply refuse the socket connection, in which case a different error may be generated because the socket creation timed out.
You can try the following:
Check whether you can open the URL in a browser.
If the URL opens, check the domain in your code; it might be incorrect.
If it does not open in the browser either, your site may be overloaded, or the server may lack the resources to handle the request.
The most common cause of a 503 error is that a proxy host of some form is unable to communicate with the back end. For example, if you have Varnish trying to handle a request but Apache is down.
In your case, you have Django running on port 8080 (that's what the 8080 means). When you try to get content from 127.0.0.1 without a port, though, you're going to the default HTTP port (80). This means that your default server (Apache, maybe? Nginx?) is trying to find a host to serve 127.0.0.1 and can't find one.
You have two choices. Either you can update your server's configuration, or you can include the port in the URL.
url = 'http://127.0.0.1:8080/full_report/a1uE0000002vu2jIAA/'
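To see which port a given URL actually targets, the standard library can parse it. This is a small Python 3 sketch; effective_port is a hypothetical helper name, and the fallback table only covers http and https.

```python
from urllib.parse import urlsplit


def effective_port(url):
    """Return the explicit port in the URL, or the scheme's default port."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    # No explicit port: fall back to the well-known default for the scheme.
    return {"http": 80, "https": 443}.get(parts.scheme)


print(effective_port("http://127.0.0.1/full_report/"))       # 80
print(effective_port("http://127.0.0.1:8080/full_report/"))  # 8080
```

If the second form is what your code already uses, the port is not the issue and the cause lies elsewhere.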
Related
When I execute the line from inside a request:
page = requests.get("http://localhost:5000/some/page/")
with DEBUG logging turned on, the output is:
DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): localhost:5000
send: 'GET /some/page/ HTTP/1.1\r\nHost: localhost:5000\r\nConnection: keep-alive\r\nAccept-Encoding: gzip, deflate\r\nAccept: */*\r\nUser-Agent: python-requests/2.22.0\r\n\r\n'
and execution will not progress past this point. Am I missing a step somewhere?
Update:
This is happening for requests that themselves contain a requests.get(). So, I think flask is trying to do two different things at once and the dev server is unable to handle that. That also explains why this seems to work on my staging server.
I've tried running Flask with threaded=True, but that didn't make a difference. Any ideas on a fix for the purposes of local dev and testing?
Update2:
Wrapping the call in with app.test_client() as c: and using c.get() works on localhost but fails in staging.
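The general pattern described here (a request handler that calls back into its own server) can be reproduced without Flask. The sketch below is a Python 3 standard-library stand-in, not the poster's app: the /outer and /inner paths, the Handler class, and fetch_outer are all made up for illustration. It uses a server that handles each request in its own thread, so the nested call can be answered while the outer request is still open; a server that handles one request at a time would have nothing free to answer the inner call. It does not explain why threaded=True made no difference in the setup above.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Opener that ignores proxy environment variables, so local calls stay local.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/outer":
            # Nested call back into the same server while this request is open.
            port = self.server.server_address[1]
            body = _opener.open(f"http://127.0.0.1:{port}/inner", timeout=5).read()
        else:
            body = b"inner"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_outer():
    """Start a threaded server on a free port, request /outer, and return the body."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        return _opener.open(f"http://127.0.0.1:{port}/outer", timeout=5).read()
    finally:
        server.shutdown()
        server.server_close()
```

Calling fetch_outer() returns the body produced by the inner request, which shows the nested call completing.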
I'm using Tor, Privoxy, and Python to anonymously crawl sources on the web. Tor is configured with ControlPort 9051, while Privoxy is configured with forward-socks5 / localhost:9050 .
My scripts are working flawlessly, except when I request an API resource that I have running on 8000 on the same machine. If I hit the API via urllib2 setup with the proxy, I get an empty string response. If I hit the API using a new, non-proxy instance of urllib2, I get a HTTP Error 503: Forwarding failure.
I'm sure that if I open 8000 to the world I'll be able to access the port through the proxy. However, there must be a better way to access the resource on localhost. Curious how people deal with this.
I was able to switch off the proxy and hit the internal API by installing the following opener:
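import ssl
import urllib2
# Skip certificate and hostname verification (insecure; only for a trusted local endpoint).
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE
# Build an HTTPS-capable opener with that context and make it the global default.
opener = urllib2.build_opener(urllib2.HTTPSHandler(context=ctx))
urllib2.install_opener(opener)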
I'm not sure if there is a better way, but it worked.
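On the question of a better way: one option that may be worth trying is an opener with an explicitly empty proxy mapping, which tells the library not to use any proxy regardless of environment variables. This is a minimal sketch using Python 3's urllib.request (urllib2.ProxyHandler in Python 2 accepts the same empty dictionary); build_direct_opener is a hypothetical helper name.

```python
import urllib.request


def build_direct_opener():
    """Return an opener that never routes requests through a proxy."""
    # An empty mapping overrides any proxies detected from the environment.
    no_proxy_handler = urllib.request.ProxyHandler({})
    return urllib.request.build_opener(no_proxy_handler)


if __name__ == "__main__":
    opener = build_direct_opener()
    # Use opener.open(url) for the local API and keep the proxied opener for crawling.
    print(opener)
```

Keeping this opener separate, rather than installing it globally, lets the crawler continue to use the Privoxy opener for everything else.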
I have my web app API running.
If I go to http://127.0.0.1:5000/ via any browser I get the right response.
If I use the Advanced REST Client Chrome app and send a GET request to my app at that address I get the right response.
However this gives me a 503:
import requests
response = requests.get('http://127.0.0.1:5000/')
I read to try this for some reason:
s = requests.Session()
response = s.get('http://127.0.0.1:5000/')
But I still get a 503 response.
Other things I've tried: not prefixing with http://, not using a port in the URL, running on a different port, trying a different API call such as POST, etc.
Thanks.
Is http://127.0.0.1:5000/ your localhost? If so, try 'http://localhost:5000' instead
Just in case someone is struggling with this as well, what finally worked was running the application on my local network ip.
I.e., I just opened up the web app and changed the app.run(debug=True) line to app.run(host="my.ip.address", debug = True).
I'm guessing the requests library was perhaps trying to protect me from a localhost attack? Or perhaps our corporate proxy or firewall was blocking traffic from unknown apps to the 127 address. I had set NO_PROXY to include the 127.0.0.1 address, so I don't think that was the problem. In the end I'm not really sure why it is working now, but I'm glad that it is.
I'm trying to use SSH tunnels inside of Python's urllib2.
Creating the tunnel:
ssh -N user@machine.place.edu -L 1337:localhost:80
The above line should forward port 1337 on the local machine to port 80 on the remote machine.
I used -N, so the bash prompt (intentionally) hangs as long as this tunnel is running.
Using the tunnel in urllib2:
import urllib2
url = "http://ifconfig.me/ip"
headers={'User-agent' : 'Mozilla/5.0'}
proxy_support = urllib2.ProxyHandler({'http': 'http://127.0.0.1:1337'})
opener = urllib2.build_opener(proxy_support, urllib2.HTTPHandler(debuglevel=1))
urllib2.install_opener(opener)
req = urllib2.Request(url, None, headers)
html = urllib2.urlopen(req).read()
print html
When I run the above code, html = urllib2.urlopen(req).read() throws the error urllib2.HTTPError: HTTP Error 404: Not Found.
What might be going wrong, and how can we fix it?
Troubleshooting:
If I turn off the SSH tunnel, the error changes to urllib2.URLError: <urlopen error [Errno 61] Connection refused>. So, Python is clearly "seeing" the SSH tunnel.
If I comment out the proxy stuff by replacing opener = urllib2.build_opener(proxy_support, urllib2.HTTPHandler(debuglevel=1)) with opener = urllib2.build_opener(), then the ifconfig.me page downloads properly. (Of course, the project that I'm working on requires me to access documents from a few different networks, so I still need proxies to work.)
Some StackOverflow posts suggest using Requests instead of urllib2. I wouldn't mind using Requests instead -- I just used urllib2 here because I wasn't sure how to do custom headers (e.g. user-agent, referer) in Requests.
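On the last point, custom headers in Requests are passed as a plain dictionary through the headers argument, alongside proxies. Here is a minimal sketch; custom_headers and fetch are hypothetical helper names, and the 10-second timeout is an arbitrary choice.

```python
import requests


def custom_headers(user_agent="Mozilla/5.0", referer=None):
    """Build a header dictionary with a User-Agent and an optional Referer."""
    headers = {"User-Agent": user_agent}
    if referer is not None:
        headers["Referer"] = referer
    return headers


def fetch(url, proxies=None, **header_options):
    """GET the URL with custom headers, optionally through the given proxies."""
    return requests.get(
        url,
        headers=custom_headers(**header_options),
        proxies=proxies,
        timeout=10,
    )
```

For the setup above, the call would look like fetch("http://ifconfig.me/ip", proxies={"http": "http://127.0.0.1:1337"}). This only covers the headers question; it does not by itself change how the tunnel behaves.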
Unfortunately, since you're the only one with access to machine.place.edu, it's going to be impossible for anyone else to reproduce the problem.
First of all, try something like...
$ telnet localhost 1337
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
GET http://ifconfig.me/ip HTTP/1.0
...and hit enter a couple of times after the 'GET' line, and see what you get back.
If you get a 404, there's probably something wrong with the proxy.
If you get a 200, then you should be able to recreate that fairly easily with httplib.
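For reference, the telnet exchange above can be recreated roughly like this. The sketch uses http.client, which is the Python 3 name for httplib, and get_via_proxy is a hypothetical helper name. It assumes that whatever is listening on the given port accepts an absolute URL in the request line, as in the telnet test.

```python
import http.client


def get_via_proxy(proxy_host, proxy_port, url, timeout=10):
    """Send 'GET <absolute url>' to the proxy and return (status, body)."""
    conn = http.client.HTTPConnection(proxy_host, proxy_port, timeout=timeout)
    try:
        # Passing the full URL as the request target mirrors the telnet test.
        conn.request("GET", url, headers={"User-Agent": "Mozilla/5.0"})
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        conn.close()
```

With the tunnel running, get_via_proxy("localhost", 1337, "http://ifconfig.me/ip") should return the same status code that the telnet session shows.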
I am trying to do a simple get request through a proxy server:
import requests
test=requests.get("http://google.com", proxies={"http": "112.5.254.30:80"})
print test.text
The proxy address in the code comes from a freely available proxy list on the internet. The same proxy server works when I use it from a browser, but not from this program. I tried many different proxy servers, and none of them work with the code above.
Here is what I get for this proxy server:
The requested URL could not be retrieved While trying to retrieve the URL: http:/// The following error was encountered:
Unable to determine IP address from host name for
The dnsserver returned: Invalid hostname
This means that: The cache was not able to resolve the
hostname presented in the URL. Check if the address is correct.
I know it's an old question, but it should be:
import requests
test=requests.get("http://google.com", proxies={"http":"http://112.5.254.30:80","https": "http://112.5.254.30:80"})
print(test.text)