I have a question about Python mechanize's proxy support. I'm writing a web client script, and I would like to add proxy support to it.
For example, if I have:
params = urllib.urlencode({'id':id, 'passwd':pw})
rq = mechanize.Request('http://www.example.com', params)
rs = mechanize.urlopen(rq)
How can I add proxy support into my mechanize script?
Whenever I open www.example.com, I would like the request to go through the proxy.
I'm not sure whether this helps or not, but you can set proxy settings on a mechanize Browser instance.
from mechanize import Browser

br = Browser()
# Explicitly configure proxies (Browser will attempt to set good defaults).
# Note the userinfo ("joe:password@") and port number (":3128") are optional.
br.set_proxies({"http": "joe:password@myproxy.example.com:3128",
                "ftp": "proxy.example.com",
                })
# Add HTTP Basic/Digest auth username and password for HTTP proxy access.
# (equivalent to using "joe:password@..." form above)
br.add_proxy_password("joe", "password")
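With the proxies configured, the original POST example can then go through the Browser; a minimal sketch (the id/passwd values and the proxy address are placeholders):
import urllib
from mechanize import Browser

br = Browser()
br.set_proxies({"http": "joe:password@myproxy.example.com:3128"})

# POST the form data through the proxied Browser.
params = urllib.urlencode({'id': 'myid', 'passwd': 'mypw'})
response = br.open('http://www.example.com', params)
print(response.read())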
You can use mechanize.Request.set_proxy(host, type) (at least as of 0.1.11). Assuming an HTTP proxy running at localhost:8888:
req = mechanize.Request("http://www.google.com")
req.set_proxy("localhost:8888", "http")
mechanize.urlopen(req)
Should work.
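Applied to your original snippet, that would look roughly like this (assuming the same localhost:8888 proxy):
import urllib
import mechanize

params = urllib.urlencode({'id': 'myid', 'passwd': 'mypw'})
rq = mechanize.Request('http://www.example.com', params)
rq.set_proxy('localhost:8888', 'http')  # route just this request through the proxy
rs = mechanize.urlopen(rq)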
I am trying to access a server over my internal network under https://prodserver.de/info.
I have the code structure as below:
import requests
from requests.auth import HTTPBasicAuth

username = 'User'
password = 'Hello#123'
resp = requests.get('https://prodserver.de/info/', auth=HTTPBasicAuth(username, password))
print(resp.status_code)
While trying to access this server via browser, it works perfectly fine.
What am I doing wrong?
By default, the requests library verifies the SSL certificate for HTTPS requests; if verification fails, it raises an SSLError. You can check whether this is the issue by temporarily disabling certificate verification, passing verify=False as an argument to the get method.
import requests
from requests.auth import HTTPBasicAuth

username = 'User'
password = 'Hello#123'
# verify=False disables certificate verification -- insecure, use only to diagnose
resp = requests.get('https://prodserver.de/info/', auth=HTTPBasicAuth(username, password), verify=False)
print(resp.status_code)
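If disabling verification makes the request succeed, the underlying problem is that your internal CA isn't trusted. Rather than shipping verify=False, point verify at the CA bundle (the path below is hypothetical); in the meantime you can silence the InsecureRequestWarning that requests emits:
import urllib3
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)  # only while testing

# Preferred fix: trust the internal CA explicitly (path is a placeholder).
resp = requests.get('https://prodserver.de/info/',
                    auth=HTTPBasicAuth(username, password),
                    verify='/path/to/internal-ca.pem')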
Try using requests' generic auth, like this:
resp = requests.get('https://prodserver.de/info/', auth=(username, password))
What am I doing wrong?
I cannot be sure without investigating your server, but I suggest checking your assumption that the server uses Basic authorization; there exist various authentication schemes, and it is also possible that your server uses a cookie-based solution rather than a header-based one.
While trying to access this server via browser, it works perfectly fine.
You might then use your browser's developer tools to see what is actually sent with the request that succeeds.
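For example, if the developer tools show the server answering with a Digest challenge (WWW-Authenticate: Digest ...) rather than Basic, requests can handle that too; a sketch using requests' built-in HTTPDigestAuth:
import requests
from requests.auth import HTTPDigestAuth

resp = requests.get('https://prodserver.de/info/',
                    auth=HTTPDigestAuth('User', 'Hello#123'))
print(resp.status_code)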
I'm using Tor, Privoxy, and Python to anonymously crawl sources on the web. Tor is configured with ControlPort 9051, while Privoxy is configured with forward-socks5 / localhost:9050 .
My scripts are working flawlessly, except when I request an API resource that I have running on port 8000 on the same machine. If I hit the API via a urllib2 opener set up with the proxy, I get an empty string response. If I hit the API using a new, non-proxy instance of urllib2, I get an HTTP Error 503: Forwarding failure.
I'm sure that if I open 8000 to the world I'll be able to access the port through the proxy. However, there must be a better way to access the resource on localhost. Curious how people deal with this.
I was able to switch off the proxy and hit the internal API by building and installing the following opener:
import ssl
import urllib2

# Build and install a fresh opener that does not carry the earlier proxy
# configuration; the unverified SSL context also skips certificate checks.
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE
opener = urllib2.build_opener(urllib2.HTTPSHandler(context=ctx))
urllib2.install_opener(opener)
I'm not sure if there is a better way, but it worked.
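Another option, instead of installing one opener globally, is to keep two openers and pick per request, so only external traffic is proxied; a sketch (the Privoxy port 8118 and the API path are assumptions):
import urllib2

# Opener that routes through Privoxy -> Tor for external, anonymous requests.
proxied = urllib2.build_opener(urllib2.ProxyHandler({
    'http': 'http://127.0.0.1:8118',
    'https': 'http://127.0.0.1:8118',
}))

# Opener with an empty proxy map, so localhost requests go direct.
direct = urllib2.build_opener(urllib2.ProxyHandler({}))

page = proxied.open('http://example.com').read()             # leaves via Tor
api = direct.open('http://127.0.0.1:8000/resource').read()   # stays on the box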
I am learning Django and trying to complete my first web app.
I am using the Shopify API and boilerplate (starter code), and am having an issue with the final step of auth.
Specifically, the redirect URL: it's using http:// when it should not, and I don't know how to change it.
# in my view
def authenticate(request):
    shop = request.GET.get('shop')
    print('shop:', shop)
    if shop:
        scope = settings.SHOPIFY_API_SCOPE
        redirect_uri = request.build_absolute_uri(reverse('shopify_app_finalize'))  # try this with new store url?
        print('redirect url', redirect_uri)  # this equals http://myherokuapp.com/login/finalize/
        permission_url = shopify.Session(shop.strip()).create_permission_url(scope, redirect_uri)
        return redirect(permission_url)
    return redirect(_return_address(request))
Which is a problem because my app uses the Embedded Shopify SDK which causes this error to occur at the point of this request
Refused to frame 'http://my.herokuapp.com/' because it violates the following Content Security Policy directive: "child-src 'self' https://* shopify-pos://*". Note that 'frame-src' was not explicitly set, so 'child-src' is used as a fallback.
How do I change the URL to use HTTPS?
Thank you so much in advance. Please let me know if I can share any other details, but my code is practically identical to that starter code.
This is what the Django doc says about build_absolute_uri:
Mixing HTTP and HTTPS on the same site is discouraged, therefore build_absolute_uri() will always generate an absolute URI with the same scheme the current request has. If you need to redirect users to HTTPS, it's best to let your Web server redirect all HTTP traffic to HTTPS.
So you can do two things:
Make sure your site runs entirely on HTTPS (preferred option): set up your web server to use HTTPS; see the Heroku documentation on how to do this. Django will automatically use HTTPS for request.build_absolute_uri if the incoming request is on HTTPS.
I'm not sure what gets passed in the shop parameter, but if it contains personal data I'd suggest using HTTPS anyway.
Create the URL yourself:
url = "https://{host}{path}".format(
    host=request.get_host(),
    path=reverse('shopify_app_finalize'))
But you will still need to configure your server to accept incoming HTTPS requests.
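On Heroku in particular, TLS terminates at the router and your dyno sees plain HTTP, so even with HTTPS configured Django may keep generating http:// URLs. Django's SECURE_PROXY_SSL_HEADER setting tells it to trust the X-Forwarded-Proto header Heroku sets (only safe when the app sits exclusively behind such a proxy):
# settings.py
# Treat requests forwarded with X-Forwarded-Proto: https as secure, so
# request.is_secure() and build_absolute_uri() report https.
SECURE_PROXY_SSL_HEADER = ('HTTP_X_FORWARDED_PROTO', 'https')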
A few days back I wanted to build a proxy that could allow me to securely and anonymously connect to websites and servers. At first it seemed like a pretty easy idea: I would create an HTTP proxy that uses SSL between the client and the proxy; it would then create an SSL connection with whatever website/server the client requested, and forward that information to and from the client and server. I spent about a day researching and writing code that would do just that. But I then realized that someone could compromise the proxy and use the session key that the proxy had to decrypt and read the data being sent to and from the server.
After a little more research it seems that a SOCKS proxy is what I need. However, there is not much documentation on a Python version of a SOCKS proxy (mostly just how to connect to one). I was able to find the PySocks module and read the socks.py file. It looks great for creating a client, but I don't see how I could use it to make a proxy.
I was wondering if anyone had a simple example of a SOCKS5 proxy, or if someone could point me to some material that could help me begin learning and building one?
You create a Python server that listens on a port, bound to IP 127.0.0.1. When the client connects to your server, it sends: "www.facebook.com:80". No URL path nor HTTP scheme. If the connect fails, you send a failure message, which may look something like "number Unable to connect to host.", where number is a specific code that signifies a failed connection attempt. Upon success of a connection you send "200 Connection established". Then data is sent and received as normal. You do not want to use an HTTP proxy because it accepts only website traffic.
You may want to use a framework for the proxy server because it should handle multiple connections.
I've read the O'Reilly ebook "Using Asyncio in Python" (2020) multiple times, and I re-read it every now and again to try to grasp multiple connections. I have also just started to search for solutions using Flask, because I want my proxy server to run alongside a webserver.
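To make the scheme described above concrete, here is a minimal asyncio sketch of such a tunneling proxy (not full SOCKS5; the port 1080 and the one-line "host:port" handshake format are assumptions):
import asyncio

async def relay(reader, writer):
    # Copy bytes one way until EOF, then close the destination.
    try:
        while True:
            data = await reader.read(4096)
            if not data:
                break
            writer.write(data)
            await writer.drain()
    finally:
        writer.close()

async def handle_client(client_reader, client_writer):
    # First line from the client names the target, e.g. b"www.facebook.com:80\n".
    line = await client_reader.readline()
    host, _, port = line.decode().strip().partition(':')
    try:
        remote_reader, remote_writer = await asyncio.open_connection(host, int(port))
    except (OSError, ValueError):
        client_writer.write(b'502 Unable to connect to host\n')
        client_writer.close()
        return
    client_writer.write(b'200 Connection established\n')
    await client_writer.drain()
    # Pump data in both directions until either side hangs up.
    await asyncio.gather(relay(client_reader, remote_writer),
                         relay(remote_reader, client_writer))

async def main():
    server = await asyncio.start_server(handle_client, '127.0.0.1', 1080)
    async with server:
        await server.serve_forever()

if __name__ == '__main__':
    asyncio.run(main())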
I recommend using requesocks along with stem (assumes Tor). The official stem library is provided by Tor. Here's a simplified example based on a scraper that I wrote which also uses fake_useragent so you look like a browser:
import requesocks
from fake_useragent import UserAgent
from stem import Signal
from stem.control import Controller


class Proxy(object):
    def __init__(self,
                 socks_port=9050,
                 tor_control_port=9051,
                 tor_connection_password='password'):
        self._socks_port = int(socks_port)
        self._tor_control_port = int(tor_control_port)
        self._tor_connection_password = tor_connection_password
        self._user_agent = UserAgent()
        self._session = None
        self._update_session()

    def _update_session(self):
        self._session = requesocks.session()
        # port 9050 is the default SOCKS port
        self._session.proxies = {
            'http': 'socks5://127.0.0.1:{}'.format(self._socks_port),
            'https': 'socks5://127.0.0.1:{}'.format(self._socks_port),
        }

    def _renew_tor_connection(self):
        with Controller.from_port(port=self._tor_control_port) as controller:
            controller.authenticate(password=self._tor_connection_password)
            controller.signal(Signal.NEWNYM)

    def _sample_get_response(self, url):
        if not self._session:
            self._update_session()
        # generate a random User-Agent string for every request
        headers = {
            'User-Agent': self._user_agent.random,
            'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
            'Accept-Language': 'en-us,en;q=0.5',
        }  # adjust as desired
        response = self._session.get(url, verify=False, headers=headers)
        return response
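Usage might then look like this (a sketch; check.torproject.org is just a convenient page that reports whether you are exiting through Tor):
proxy = Proxy()
resp = proxy._sample_get_response('https://check.torproject.org/')
print(resp.status_code)
proxy._renew_tor_connection()  # signal NEWNYM to get a fresh circuit/exit IP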
You must have the Tor service running before executing this script and you must also modify your torrc file to enable the control port (9051).
Tor puts the torrc file in /usr/local/etc/tor/torrc if you compiled Tor from source, and /etc/tor/torrc or /etc/torrc if you installed a pre-built package. If you installed Tor Browser, look for Browser/TorBrowser/Data/Tor/torrc inside your Tor Browser directory (on Mac OS X, you must right-click or command-click on the Tor Browser icon and select "Show Package Contents" before the Tor Browser directories become visible).
Once you've found your torrc file, you need to uncomment the corresponding lines:
ControlPort 9051
## If you enable the controlport, be sure to enable one of these
## authentication methods, to prevent attackers from accessing it.
HashedControlPassword 16:05834BCEDD478D1060F1D7E2CE98E9C13075E8D3061D702F63BCD674DE
Please note that the HashedControlPassword above is for the password "password". If you want to set a different password (recommended), replace the HashedControlPassword in the torrc file with the output of tor --hash-password "<new_password>", where <new_password> is the password that you want to set.
Once you've changed your torrc file, you will need to restart tor for the changes to take effect (note that you actually only need to send Tor a HUP signal, not actually restart it). To restart it:
sudo service tor restart
I hope this helps and at least gets you started for what you were looking for.
Does urllib2 in Python 2.6.1 support proxy via https?
I've found the following at http://www.voidspace.org.uk/python/articles/urllib2.shtml:
NOTE: Currently urllib2 does not support fetching of https locations through a proxy. This can be a problem.
I'm trying to automate logging in to a web site and downloading a document; I have a valid username/password.
import urllib2

proxy_info = {
    'host': "axxx",  # commented out the real data
    'port': "1234",  # commented out the real data
}
proxy_handler = urllib2.ProxyHandler(
    {"http": "http://%(host)s:%(port)s" % proxy_info})
opener = urllib2.build_opener(proxy_handler,
                              urllib2.HTTPHandler(debuglevel=1),
                              urllib2.HTTPCookieProcessor())
urllib2.install_opener(opener)
fullurl = 'https://correct.url.to.login.page.com/user=a&pswd=b'  # example
req1 = urllib2.Request(url=fullurl, headers=headers)  # headers defined elsewhere
response = urllib2.urlopen(req1)
I've had this working for similar pages, but not using HTTPS, and I suspect it does not get through the proxy - it just gets stuck in the same way as when I did not specify a proxy. I need to go out through the proxy.
I need to authenticate, but not using basic authentication; will urllib2 figure out the authentication when going to the https site (I supply the username/password to the site via the URL)?
EDIT:
Nope, I tested with
proxies = {
    "http": "http://%(host)s:%(port)s" % proxy_info,
    "https": "https://%(host)s:%(port)s" % proxy_info,
}
proxy_handler = urllib2.ProxyHandler(proxies)
And I get the error:
urllib2.URLError: urlopen error [Errno 8] _ssl.c:480: EOF occurred in violation of protocol
Fixed in Python 2.6.3 and several other branches:
http://bugs.python.org/issue1424152
http://www.python.org/download/releases/2.6.3/NEWS.txt
Issue #1424152: Fix for httplib, urllib2 to support SSL while working through proxy. Original patch by Christopher Li, changes made by Senthil Kumaran.
I'm not sure Michael Foord's article, which you quote, is updated to Python 2.6.1; why not give it a try? Instead of telling ProxyHandler that the proxy is only good for http, as you're doing now, register it for https too (of course you should format it into a variable just once before you call ProxyHandler and then repeatedly use that variable in the dict): that may or may not work, but if you're not even trying, it's sure not to work!
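Concretely, that suggestion might look like the following; note that the proxy's own URL usually keeps the http:// scheme even in the "https" entry, because the client speaks plain-HTTP CONNECT to the proxy and only the tunneled traffic is TLS (a detail worth checking against your proxy's setup):
import urllib2

proxy_url = "http://%(host)s:%(port)s" % proxy_info  # one variable, reused
opener = urllib2.build_opener(urllib2.ProxyHandler({
    "http": proxy_url,
    "https": proxy_url,  # proxy itself is reached over http; HTTPS is tunneled
}))
urllib2.install_opener(opener)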
In case anyone else has this issue in the future: I'd like to point out that urllib2 does support HTTPS proxying now. Make sure the proxy supports it too, or you risk running into a bug that puts the Python library into an infinite loop (this happened to me).
See the unit test in the Python source that tests HTTPS proxying support for further information:
http://svn.python.org/view/python/branches/release26-maint/Lib/test/test_urllib2.py?r1=74203&r2=74202&pathrev=74203