I have a Python script that connects to Parse.com (a remote server) and uploads a file. The script runs on a server that sits behind a corporate firewall.
import env
import json
import requests
from requests.auth import HTTPProxyAuth

def uploadFile(fileFullPath):
    print "Attempting to upload file: " + fileFullPath
    proxies = {
        "http": "http://10.128.198.14",
        "https": "http://10.128.198.14"
    }
    auth = HTTPProxyAuth('MyDomain\\MyUsername', 'MyPassword')
    headers = {
        "X-Parse-Application-Id": env.X_Parse_APP_ID,
        "X-Parse-REST-API-Key": env.X_Parse_REST_API_Key,
        "Content-Type": "application/pdf"
    }
    f = open(fileFullPath, 'rb')  # binary mode for a PDF
    files = {'file': f}
    r = requests.post(env.PARSE_HOSTNAME + env.PARSE_FILES_ENDPOINT + "/" + env.PARSE_FILE_NAME, files=files, headers=headers, timeout=10, verify=False, proxies=proxies)
    print r.text
When I used this module from the command prompt, I got the following message:
ConnectionError thrown. Details: Cannot connect to proxy. Socket error: Tunnel connection failed: 407 Proxy Authentication Required.
I am pretty sure the username and password are both correct.
Any solution? Thanks!
The 407 means the proxy itself requires authentication, and requests is not sending any credentials to it. So for your proxies dict, do the following:
proxies = {
    "http": "http://user:pass@10.128.198.14",
    "https": "http://user:pass@10.128.198.14"
}
Fill in user and pass with your proxy credentials in the proxy URLs. The requests documentation on proxies describes how to build proxy settings and have them authenticated.
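If the username or password contains characters that are special inside a URL (the backslash in a domain login, '@', ':', '#'), percent-encode them first. A minimal sketch using placeholder credentials and the proxy address from your question:

from urllib.parse import quote  # Python 3; on Python 2 use urllib.quote
import requests

# placeholder credentials -- replace with your own
user = quote('MyDomain\\MyUsername', safe='')   # percent-encodes the backslash
password = quote('MyPassword', safe='')
proxy_url = "http://%s:%s@10.128.198.14" % (user, password)

proxies = {"http": proxy_url, "https": proxy_url}
r = requests.get("http://example.com", proxies=proxies, timeout=10)
print(r.status_code)

requests decodes the credentials from the proxy URL and sends them to the proxy as a Proxy-Authorization header.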
Is it possible to send an HTTP request using two (or more) proxies at the same time in Python? The order of proxy servers matters! (Additional info: the 1st proxy is SOCKS5 and requires authentication; the 2nd is HTTP, no auth.)
client -> Socks5 Proxy Server -> HTTP Proxy Server -> resource
The requests library allows only one proxy at a time:
import requests
from requests.auth import HTTPProxyAuth

url = 'http://example.com'
proxy_1 = {
    'http': 'socks5://host:port',
    'https': 'socks5://host:port'
}
auth = HTTPProxyAuth('user', 'password')

# second proxy is not accepted by the requests API
# proxy_2 = {
#     'http': 'http://host:port',
#     'https': 'http://host:port'
# }

requests.get(url, proxies=proxy_1, auth=auth)
I need all this to check whether proxy_2 is working while sitting behind proxy_1. Maybe there is a better way to do it?
Two basic ways to do proxy chaining in Python:
1. Raw sockets, adapted for the case where the FIRST proxy requires auth:
import socket

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
    # create the connection to the 1st proxy
    sock.connect((proxy_host, proxy_port))

    # tunnel to the 2nd proxy; the auth creds are for the FIRST proxy
    # even though you CONNECT to the 2nd one
    request = b"CONNECT second_proxy_host:second_proxy_port HTTP/1.0\r\n" \
              b"Proxy-Authorization: Basic b64encoded_auth\r\n" \
              b"Connection: Keep-Alive\r\n" \
              b"Proxy-Connection: Keep-Alive\r\n\r\n"
    sock.send(request)
    print('Response 1:\n' + sock.recv(40).decode())

    # this request is sent through the chain of two proxies;
    # the auth creds are still for the FIRST proxy
    request2 = b"GET http://www.example.com/ HTTP/1.0\r\n" \
               b"Proxy-Authorization: Basic b64encoded_auth\r\n" \
               b"Connection: Keep-Alive\r\n" \
               b"Proxy-Connection: Keep-Alive\r\n\r\n"
    sock.send(request2)
    print('Response 2:\n' + sock.recv(4096).decode())
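For completeness, b64encoded_auth above is just the Base64 of "user:password" for the FIRST proxy; a small sketch of building it (credentials are placeholders):

import base64

first_proxy_creds = b"user:password"  # placeholder credentials for the FIRST proxy
b64encoded_auth = base64.b64encode(first_proxy_creds).decode()
print("Proxy-Authorization: Basic %s" % b64encoded_auth)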
2. Using PySocks:

# pip install pysocks
import socket
import socks

with socks.socksocket(socket.AF_INET, socket.SOCK_STREAM) as s:
    # the SOCKS5 proxy is the FIRST hop and needs authentication
    s.set_proxy(proxy_type=socks.PROXY_TYPE_SOCKS5,
                addr="proxy1_host",
                port=8080,
                username='user',
                password='password')
    # connect to the SECOND (HTTP) proxy through the first one
    s.connect(("proxy2_host", 8080))
    message = b'GET http://www.example.com/ HTTP/1.0\r\n\r\n'
    s.sendall(message)
    response = s.recv(4096)
    print(response.decode())
You could try using proxychains.
Something like this:
proxychains4 python test.py
#test.py
import requests
r = requests.get("https://ipinfo.io/ip")
print(r.content)
Or look at other questions about chaining proxies.
Also, you could try using selenium instead of requests and configure the proxy through the web driver settings.
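For the selenium route, a minimal sketch assuming Chrome and a single plain HTTP proxy (host and port are placeholders); chaining two proxies, or a SOCKS proxy with authentication, would still need something like proxychains in front of the browser:

from selenium import webdriver

options = webdriver.ChromeOptions()
# route all browser traffic through one proxy
options.add_argument('--proxy-server=http://proxy_host:8080')

driver = webdriver.Chrome(options=options)
driver.get('https://ipinfo.io/ip')
print(driver.page_source)
driver.quit()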
My code for reference:
import requests

header = {"Content-Type": "application/json"}
proxyDict = {
    "all_proxy": "http://proxy.com:8080",
    "http_proxy": "http://proxy.com:8080",
    "https_proxy": "http://proxy.com:8080",
    "ftp_proxy": "http://proxy.com:8080",
    "ALL_PROXY": "http://proxy.com:8080",
    "HTTP_PROXY": "http://proxy.com:8080",
    "HTTPS_PROXY": "http://proxy.com:8080",
    "FTP_PROXY": "http://proxy.com:8080"
}

try:
    res = requests.post('slack_url', json={"text": "text"}, headers=header, proxies=proxyDict, verify=False)
    print('Success!')
except:
    print("unable to send a slack message")
When I run this code it simply runs as if the proxies were never read and times out. However, when I manually set my environment variables it works perfectly fine.
The issue is that I need this part to run as an airflow service and therefore need the proxy to be set when it is run.
The only thing I can think of is that the requests library requires an actual IP address and can't use proxy.com (I'm just substituting that for the company URL, not the actual URL I am using). In that case I would need a workaround that doesn't use the IP.
Any ideas?
You're building proxyDict the wrong way. The keys of the proxies dict must be URL schemes such as 'http' and 'https', not environment-variable names, so requests silently ignores every entry in your dict. One entry per scheme is enough:
proxy = "https://proxy.com:8080"
proxies = {'https': proxy}
Then set this proxy in the request:
res = requests.post('slack_url', json={"text": "text"}, headers=header, proxies=proxies, verify=False) #you should never use verify=False as it's not secure
If you want to select a random proxy from a proxy list, just put all of your proxies in an array and randomly select one.
import random
myProxies = ["http://proxy.com:8080","http://proxy.com:8080","http://proxy.com:8080"]
proxies = {'https': random.choice(myProxies)}
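Since you say the environment-variable approach works when set manually, another option (a sketch on my part, not something specific to your Airflow setup) is to set those variables from inside the task before the request is made; requests reads http_proxy/https_proxy from the environment by default:

import os
import requests

# placeholder proxy address -- requests picks these up automatically (trust_env)
os.environ["http_proxy"] = "http://proxy.com:8080"
os.environ["https_proxy"] = "http://proxy.com:8080"

res = requests.post('slack_url', json={"text": "text"})
print(res.status_code)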
I am trying to connect to Splunk via its API using Python. I can connect and get a 200 status code, but when I read the content, it isn't the content of the page I expect.
Here is my code:
import json
import requests
import re
baseurl = 'https://my_splunk_url:8888'
username = 'my_username'
password = 'my_password'
headers={"Content-Type": "application/json"}
s = requests.Session()
s.proxies = {"http": "my_proxy"}
r = s.get(baseurl, auth=(username, password), verify=False, headers=None, data=None)
print(r.status_code)
print(r.text)
I am new to Splunk and Python, so any ideas or suggestions as to why this is happening would help.
You need to authenticate first to get a token; then you'll be able to hit the rest of the REST endpoints. The auth endpoint is at /servicesNS/admin/search/auth/login, which gives you the session_key that you then provide to subsequent requests.
Here is some code that uses requests to authenticate to a Splunk instance and then start a search. It checks whether the search is complete; if not, it waits a second and checks again, repeating until the search is done, and then prints the results.
import time  # need for sleep
from xml.dom import minidom
import json, pprint

import requests
from requests.packages.urllib3.exceptions import InsecureRequestWarning
requests.packages.urllib3.disable_warnings(InsecureRequestWarning)

base_url = 'https://localhost:8089'
username = 'admin'
password = 'changeme'
search_query = "search=search index=*"

r = requests.get(base_url + "/servicesNS/admin/search/auth/login",
                 data={'username': username, 'password': password}, verify=False)
session_key = minidom.parseString(r.text).getElementsByTagName('sessionKey')[0].firstChild.nodeValue
print("Session Key:", session_key)

r = requests.post(base_url + '/services/search/jobs/', data=search_query,
                  headers={'Authorization': ('Splunk %s' % session_key)},
                  verify=False)
sid = minidom.parseString(r.text).getElementsByTagName('sid')[0].firstChild.nodeValue
print("Search ID", sid)

done = False
while not done:
    r = requests.get(base_url + '/services/search/jobs/' + sid,
                     headers={'Authorization': ('Splunk %s' % session_key)},
                     verify=False)
    response = minidom.parseString(r.text)
    for node in response.getElementsByTagName("s:key"):
        if node.hasAttribute("name") and node.getAttribute("name") == "dispatchState":
            dispatchState = node.firstChild.nodeValue
            print("Search Status: ", dispatchState)
            if dispatchState == "DONE":
                done = True
            else:
                time.sleep(1)

r = requests.get(base_url + '/services/search/jobs/' + sid + '/results/',
                 headers={'Authorization': ('Splunk %s' % session_key)},
                 data={'output_mode': 'json'},
                 verify=False)
pprint.pprint(json.loads(r.text))
Many of the request calls used here include the flag verify=False to avoid issues with Splunk's default self-signed SSL certs, but you can drop it if you have legitimate certificates.
Published a while ago at https://gist.github.com/sduff/aca550a8df636fdc07326225de380a91
Nice piece of coding. One of the wonderful aspects of Python is the ability to use other people's well-written packages. In this case, why not use Splunk's own Python package to do all of that work, with a lot less code around it?
pip install splunklib.
Then add the following to your import block
import splunklib.client as client
import splunklib.results as results
pypi.org has documentation on some of the usage, and Splunk has an excellent set of how-to documents. Remember: be lazy, use someone else's work to make your work look better.
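For illustration, a minimal sketch of the SDK route (host, port, and credentials are placeholders; a one-shot search keeps it short, and the SDK also supports the create-then-poll job flow used above):

import splunklib.client as client
import splunklib.results as results

# connect to the Splunk management port (8089 by default)
service = client.connect(host='localhost', port=8089,
                         username='admin', password='changeme')

# run a blocking one-shot search and stream the results back
rr = results.ResultsReader(service.jobs.oneshot("search index=_internal | head 5"))
for result in rr:
    if isinstance(result, dict):
        print(result)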
I am able to create a simple API interface using the requests module that authenticates correctly and receives a response from an API. However, when I attempt to use bravado to create the client from a swagger file and manually add an authorization token to the header, it fails with:
bravado.exception.HTTPUnauthorized: 401 Unauthorized: Error(code=u'invalid_credentials', message=u'Missing authorization header',
I believe I am adding the authorization headers correctly.
The code I'm using to create the client is below. As shown, I've tried to add an Authorization token in two ways:
1. in the http_client setup via set_api_key
2. in the SwaggerClient.from_url(...) step by adding request_headers
However, both options fail.
from bravado.requests_client import RequestsClient
from bravado.client import SwaggerClient

http_client = RequestsClient()
http_client.set_api_key(
    'https://api.optimizely.com/v2', 'Bearer <TOKEN>',
    param_name='Authorization', param_in='header'
)

headers = {
    'Authorization': 'Bearer <TOKEN>',
}

client = SwaggerClient.from_url(
    'https://api.optimizely.com/v2/swagger.json',
    http_client=http_client,
    request_headers=headers
)
My question is, how do I properly add authorization headers to a bravado SwaggerClient?
For reference, a possible solution is to add the _request_options with each request:
from bravado.client import SwaggerClient

headers = {
    'Authorization': 'Bearer <YOUR_TOKEN>'
}

requestOptions = {
    # === bravado config ===
    'headers': headers,
}

client = SwaggerClient.from_url("<SWAGGER_JSON_URL>")
result = client.<ENTITY>.<ACTION>(_request_options=requestOptions).response().result
print(result)
However, a better solution, which I still am unable to get to work, is to have it automatically authenticate with each request.
Try again, fixing the host in the set_api_key call (and keeping param_name='Authorization' so the token is sent in the Authorization header):
from bravado.requests_client import RequestsClient
from bravado.client import SwaggerClient

http_client = RequestsClient()
http_client.set_api_key(
    'api.optimizely.com', 'Bearer <TOKEN>',
    param_name='Authorization', param_in='header'
)

client = SwaggerClient.from_url(
    'https://api.optimizely.com/v2/swagger.json',
    http_client=http_client,
)
Documentation for this method is here: https://github.com/Yelp/bravado/blob/master/README.rst#example-with-header-authentication
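Once the client is built this way, calls should authenticate automatically without passing _request_options each time. For example (the resource and operation names below are hypothetical; inspect the generated client for the names your swagger spec actually defines):

# hypothetical operation -- check dir(client) for the real resources/operations
projects = client.Projects.list_projects().response().result
print(projects)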
I have this program that checks a website, and I want to know how I can check it via a proxy in Python.
This is the code, just as an example:
import time
import urllib

while True:
    try:
        h = urllib.urlopen(website)
        break
    except:
        print '['+time.strftime('%Y/%m/%d %H:%M:%S')+'] '+'ERROR. Trying again in a few seconds...'
        time.sleep(5)
By default, urlopen uses the environment variable http_proxy to determine which HTTP proxy to use:
$ export http_proxy='http://myproxy.example.com:1234'
$ python myscript.py # Using http://myproxy.example.com:1234 as a proxy
If you instead want to specify a proxy inside your application, you can give a proxies argument to urlopen:
proxies = {'http': 'http://myproxy.example.com:1234'}
print("Using HTTP proxy %s" % proxies['http'])
urllib.urlopen("http://www.google.com", proxies=proxies)
Edit: If I understand your comments correctly, you want to try several proxies and print each proxy as you try it. How about something like this?
candidate_proxies = ['http://proxy1.example.com:1234',
                     'http://proxy2.example.com:1234',
                     'http://proxy3.example.com:1234']
for proxy in candidate_proxies:
    print("Trying HTTP proxy %s" % proxy)
    try:
        result = urllib.urlopen("http://www.google.com", proxies={'http': proxy})
        print("Got URL using proxy %s" % proxy)
        break
    except:
        print("Trying next proxy in 5 seconds")
        time.sleep(5)
Python 3 is slightly different here. It will try to auto-detect proxy settings, but if you need specific or manual proxy settings, consider this kind of code:
#!/usr/bin/env python3
import urllib.request

proxy_support = urllib.request.ProxyHandler({'http' : 'http://user:pass@server:port',
                                             'https': 'https://...'})
opener = urllib.request.build_opener(proxy_support)
urllib.request.install_opener(opener)

with urllib.request.urlopen(url) as response:
    # ... implement things such as 'html = response.read()'
    pass
Refer also to the relevant section in the Python 3 docs
Here is example code showing how to use urllib to connect via a proxy:
import urllib.request

authinfo = urllib.request.HTTPBasicAuthHandler()
proxy_support = urllib.request.ProxyHandler({"http": "http://ahad-haam:3128"})

# build a new opener that adds authentication and caching FTP handlers
opener = urllib.request.build_opener(proxy_support, authinfo,
                                     urllib.request.CacheFTPHandler)
# install it
urllib.request.install_opener(opener)

f = urllib.request.urlopen('http://www.google.com/')
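If the proxy itself requires credentials, urllib.request also has a dedicated handler for that; a minimal sketch, reusing the proxy address above with placeholder credentials:

import urllib.request

# placeholder proxy credentials
password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
password_mgr.add_password(None, "http://ahad-haam:3128", "username", "password")

proxy_support = urllib.request.ProxyHandler({"http": "http://ahad-haam:3128"})
proxy_auth = urllib.request.ProxyBasicAuthHandler(password_mgr)

opener = urllib.request.build_opener(proxy_support, proxy_auth)
urllib.request.install_opener(opener)
f = urllib.request.urlopen('http://www.google.com/')
print(f.status)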
"""
For http and https use:
proxies = {'http': 'http://proxy-source-ip:proxy-port',
           'https': 'https://proxy-source-ip:proxy-port'}
More schemes can be mapped in the same way, one entry per scheme:
proxies = {'http': 'http://proxy1-source-ip:proxy-port',
           'https': 'https://proxy1-source-ip:proxy-port',
           'ftp': 'http://proxy2-source-ip:proxy-port'}
Usage:
filehandle = urllib.urlopen(external_url, proxies=proxies)
To use no proxies at all (for example, for URLs inside your own network):
filehandle = urllib.urlopen(external_url, proxies={})
To use proxy authentication with a username and password:
proxies = {'http': 'http://username:password@proxy-source-ip:proxy-port',
           'https': 'https://username:password@proxy-source-ip:proxy-port'}
Note: avoid special characters such as ':' and '@' in usernames and passwords, or percent-encode them first.