I am trying to make a request to the Zillow API. However, I want to use a proxy.
import zillow
api = zillow.ValuationApi()
data = api.GetSearchResults(key, address, postal_code)
Is there any method to have my requests use my predefined proxy?
Thanks
Hi, you can set the proxy using environment variables, as below:
HTTP_PROXY="http://<proxy server>:<port>"
HTTPS_PROXY="http://<proxy server>:<port>"
You can read more on the python-requests proxy settings documentation page.
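For example, you can set these from within Python before the first request is made; requests (which the zillow wrapper appears to use under the hood) picks them up automatically. The proxy address below is a placeholder:
import os
import zillow
# Placeholder proxy address; requests reads these environment
# variables automatically on each request.
os.environ['HTTP_PROXY'] = 'http://proxy.example.com:8080'
os.environ['HTTPS_PROXY'] = 'http://proxy.example.com:8080'
api = zillow.ValuationApi()
data = api.GetSearchResults(key, address, postal_code)  # as in your snippet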
I would like to download the state of industry CSV from OpenTable with a Python bot,
but the URL of the CSV is of the form "blob:https://..." and I could not use the requests library.
What can I do to get the correct URL? Or to download it using Python and the blob URL? I can do it with Selenium, but I would rather use the requests library.
To get the correct URL, try setting up a proxy like mitmproxy (https://mitmproxy.org),
or software like Burp Suite, to intercept the traffic between you (the client) and the website's server. You will then see all the HTTP requests, each with its method, URL,
and body data.
That should help you build your bot.
Intercepting HTTP Traffic with mitmproxy
Intercepting HTTP Traffic with Burp Suite
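Once you have spotted the real request behind the blob: URL in the intercepted traffic, the download itself is straightforward; a sketch with a placeholder URL:
import requests
# Placeholder - use the actual CSV endpoint you saw in mitmproxy/Burp Suite.
csv_url = 'https://example.opentable.com/state-of-industry.csv'
r = requests.get(csv_url)
r.raise_for_status()
with open('state_of_industry.csv', 'wb') as f:
    f.write(r.content)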
I am trying to test the latency of a proxy by pinging a site while using a proxy with a login. I know requests easily supports proxies, and I was wondering if there is a way to ping/test latency to a site through this. I am open to other methods as well, as long as they support a proxy with a login. Here is an example of my proxy integration with requests:
import requests
proxy = {'https': 'https://USER:PASS@IP:PORT'}
requests.get('https://www.google.com/', proxies=proxy)
How can I make a program to test the latency of a proxy with a login to a site?
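One way to approximate this with requests alone: each response carries an elapsed attribute, the time between sending the request and finishing parsing the response headers. A sketch with placeholder proxy credentials:
import requests
# USER, PASS, IP and PORT are placeholders for your proxy login.
proxies = {'https': 'https://USER:PASS@IP:PORT'}
r = requests.get('https://www.google.com/', proxies=proxies, timeout=10)
# elapsed covers the time until the response headers arrived.
print('latency:', r.elapsed.total_seconds(), 'seconds')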
I have to scrape an internal web page of my organization. If I use Beautiful Soup I get
"Unauthorized access"
I don't want to put my username/password in the source code because it will be shared across colleagues.
If I open the same URL using Firefox, it doesn't ask me to log in; the only problem is when I make the same request from a Python script.
Is there a way to share the same session used by Firefox with a Python script?
I think my authentication is tied to my PC, because if I log off and delete all cookies, when I re-enter I become logged in automatically. Do you know why this doesn't happen with my Python script?
When you use the browser to log in to your organization, you provide your credentials and the server returns a cookie tied to your organization's domain. This cookie has an expiration and allows you to navigate your organization's site without having to log in again, as long as the cookie is valid.
You can read about cookies here:
https://en.wikipedia.org/wiki/HTTP_cookie
Your website scraper does not need to store your credentials. First delete the cookies; then, using your browser's developer tools (look at the network tab), you can:
Figure out if your organization uses a separate auth endpoint
If it's not evident, then you might ask the IT department
Use the auth endpoint to get a cookie using credentials passed in (see the sketch after this list)
See how this cookie is used by the system (look at the HTTP request/response headers)
Use this cookie to scrape the website
Share your code freely - if someone needs to scrape the website then they can either pass in their credentials, or use a curl command to get/set a valid cookie header
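A minimal sketch of that flow, assuming a hypothetical auth endpoint and form field names (use the developer tools to find the real ones):
import requests
s = requests.Session()
# Hypothetical login endpoint and field names - replace with what you
# observe in the network tab.
s.post('https://intranet.example.org/auth/login',
       data={'username': 'your_user', 'password': 'your_pass'})
# The session keeps the auth cookie and sends it on subsequent requests.
r = s.get('https://intranet.example.org/page-to-scrape')
print(r.status_code)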
1) After authenticating in your Firefox browser, make sure to get the cookie key/value.
2) Use that data in the code below:
from bs4 import BeautifulSoup
import requests
# Cookie key/value copied from the browser after logging in manually.
browser_cookies = {'your_cookie_key': 'your_cookie_value'}
s = requests.Session()
r = s.get(your_url, cookies=browser_cookies)
bsoup = BeautifulSoup(r.text, 'lxml')
The requests.Session() is for persistence: it keeps cookies across requests.
One more tip: you could also call your script like this:
python3 /path/to/script/script.py cookies_key cookies_value
Then get the two values with the sys module. The code will be:
import sys
browser_cookies = {sys.argv[1]:sys.argv[2]}
I am using the Python requests library to make SOAP requests, and it was working fine. After a change in our domain structure, I can no longer access the URL; it always prompts me to enter my credentials.
I was using the code below to access the URL with requests:
program_list_response = requests.get(program_list_path,
                                     data=self.body,
                                     headers=self.headers)
How can I pass the authentication in the background using requests?
You can use requests' authentication support to provide the credentials for the URL that you want to access.
For example, you can pass the username and password using the format below:
requests.get('https://website.com/user', auth=('user', 'pass'))
For more details I would recommend the official docs.
For handling Windows authentication, I would recommend Requests-NTLM.
For example:
import requests
from requests_ntlm import HttpNtlmAuth
requests.get("http://ntlm_protected_site.com",auth=HttpNtlmAuth('domain\\username','password'))
I am working on scraping databases that I have access to through the Duke library web proxy. I encountered the issue that, since the database is accessed through a proxy server, I can't scrape it directly as I would if the database did not require proxy authentication.
I tried several things:
I wrote one script that logs into the Duke network (https://shib.oit.duke.edu/idp/AuthnEngine).
I then hardcode my login data:
login_data = urllib.urlencode({'j_username' : 'userxx',
'j_password' : 'passwordxx',
'Submit' : 'Enter'
})
Then I log in:
resp = opener.open('https://shib.oit.duke.edu/idp/AuthnEngine', login_data)
and then I create a cookie jar object to hold the cookies from the proxy website.
Then I try to access the database with my script, and it still tells me authentication is required. I would like to know how I can get past the proxy server's authentication requirement.
If you have any suggestions please let me know.
Thank you,
Jan
A proxy login does not store cookies but instead uses the Proxy-Authorization header. This header needs to be sent with every request, much like cookies. It uses the same format as regular Basic authentication, although other schemes are possible (Digest, NTLM). I suggest you inspect the headers of a normal login and copy the Proxy-Authorization header that was sent.
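With requests, one way to get that header sent, as a minimal sketch (the proxy host, port, and credentials are placeholders; this covers the Basic scheme only):
import requests
# Placeholder proxy host/port - requests builds the Proxy-Authorization
# header (Basic) from the credentials embedded in the proxy URL.
proxies = {
    'http': 'http://userxx:passwordxx@proxy.example.duke.edu:3128',
    'https': 'http://userxx:passwordxx@proxy.example.duke.edu:3128',
}
r = requests.get('https://database.example.com/search', proxies=proxies)
print(r.status_code)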