I was trying to extract data from a website with the Python requests library.
When I sent the request from my browser, I got the data.
When I sent the same request with the requests library, I got a captcha.
So I guess the website doesn't want me to extract its data, and that's fine, but it made me curious.
How can the same request give different results? The headers were the same, and so were the URLs.
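One way to check whether the headers really match is to compare the browser's headers with what requests would actually send. The following is only a sketch: `diff_headers` is a hypothetical helper, `https://example.com/` is a placeholder URL, and the browser headers are assumed to be copied by hand from the developer tools. It prepares a request without sending it. Sites may also look at signals other than headers, which this check does not cover.

```python
import requests


def diff_headers(browser_headers, session=None):
    """Compare browser headers with what a requests session would send.

    Returns {header name: (browser value, requests value)} for every
    header that is missing on one side or has a different value.
    """
    if session is None:
        session = requests.Session()
    # Prepare (but do not send) a request so the session defaults are merged in.
    # The URL is a placeholder; nothing is sent over the network.
    prepared = session.prepare_request(
        requests.Request("GET", "https://example.com/")
    )
    sent = {name.lower(): value for name, value in prepared.headers.items()}
    wanted = {name.lower(): value for name, value in browser_headers.items()}

    differences = {}
    for name in sorted(set(sent) | set(wanted)):
        if sent.get(name) != wanted.get(name):
            differences[name] = (wanted.get(name), sent.get(name))
    return differences


if __name__ == "__main__":
    # Assumed example values, as they might be copied from a browser.
    browser = {"User-Agent": "Mozilla/5.0", "Accept-Language": "en-US"}
    for name, (in_browser, in_requests) in diff_headers(browser).items():
        print(f"{name}: browser={in_browser!r} requests={in_requests!r}")
```

Headers set on the session (for example with `session.headers.update(...)`) are included in the comparison, so the helper can be re-run after each change.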
Related
I have an issue related to technical support. I am trying to send a GET request using Python; the code is given below:
import requests
res = requests.get('https://nclt.gov.in/')
but this request hangs for a long time on my droplet server. On my local system it works fine and I get a response within a second. I don't know what is going on with this site.
I tested other websites from the droplet and got a response from all of them except this one, and I have no idea why.
I also tried the following:
set a User-Agent in the headers
used cookies
but I still don't get a response. I have been trying for the last 24 hours and cannot find the exact reason.
Is there an issue with the droplet, and do I have to configure anything? I don't think there is any validation on 'http://nclt.gov.in', because I am sending just a GET request and it works fine on my local machine without any problem.
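requests does not apply a timeout unless one is passed, so a call like the one above can wait indefinitely. A minimal sketch for narrowing the problem down is to set separate connect and read timeouts and report which stage fails. The helper name `probe` and the timeout values are assumptions chosen for illustration, and the result only hints at where the request stalls; it does not explain why.

```python
import requests


def probe(url, connect_timeout=5, read_timeout=15):
    """Fetch url and report which stage failed, if any."""
    try:
        # A (connect, read) tuple sets the two timeouts independently.
        response = requests.get(url, timeout=(connect_timeout, read_timeout))
    except requests.exceptions.ConnectTimeout:
        return "connect timeout: no connection was established in time"
    except requests.exceptions.ReadTimeout:
        return "read timeout: connected, but no response arrived in time"
    except requests.exceptions.ConnectionError as exc:
        return f"connection error: {exc}"
    return f"HTTP {response.status_code}"


if __name__ == "__main__":
    print(probe("https://nclt.gov.in/"))
```

Running the same script on the local machine and on the droplet makes the two environments directly comparable.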
I am logging in to a website with Python requests by sending a POST with the required data.
I am trying to capture the other HTTP requests that follow that POST.
Is there a way to do it?
If I log in manually in browser I can see all other requests that are being sent after logging in (which is the first POST in screenshot), I want to grab them all (the ones marked with green marker):
I assume that when you log in, a new HTML page is returned to your web browser.
While rendering this page, the browser requests further files such as images or JavaScript from the server. With Selenium you can automate user interactions with a web browser and log the traffic, as described in this example.
I don't understand why the Python requests library isn't pulling in all cookies. For example, I am running this code:
import requests
a_session = requests.Session()
a_session.get('https://google.com/')
session_cookies = a_session.cookies
cookies_dictionary = session_cookies.get_dict()
print(cookies_dictionary)
But I only get the cookie "1P_JAR" even though there should be several cookies.
List of cookies shown in the inspector panel
Ultimately I'm trying to figure out why it's choosing only that one cookie and not the others. I'm building my own application that generates a cookie, but when I run this script against my application I get back an empty list, even though the inspector shows that a cookie has been generated.
A cookie is set by a server response to a specific request.
Your basic google.com request only sets that cookie, which you can observe in the Set-Cookie response header.
The other cookies are probably set by other requests or even by JS code. Requests doesn't evaluate or run JS and thus doesn't make any of those other requests.
If you don't want to completely reverse engineer every single cookie, the way to go would be to simulate a browser by using Selenium + ChromeDriver or a similar solution.
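The difference between a cookie set by a response header and one set by JS can be reproduced locally. The sketch below is an illustration under stated assumptions: it starts a throwaway server on 127.0.0.1 whose cookie names `from_header` and `from_js` are made up for the demo, then fetches the page with a requests session. Only the cookie from the Set-Cookie header should end up in the session, because the script in the page body is never executed.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class DemoHandler(BaseHTTPRequestHandler):
    """Serve one page that sets a cookie by header and another by script."""

    def do_GET(self):
        body = b"<html><script>document.cookie = 'from_js=1';</script></html>"
        self.send_response(200)
        self.send_header("Set-Cookie", "from_header=1; Path=/")
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def collect_cookies():
    """Return (raw Set-Cookie header, cookies stored in the session)."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0 = any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        session = requests.Session()
        session.trust_env = False  # ignore proxy settings from the environment
        url = f"http://127.0.0.1:{server.server_port}/"
        response = session.get(url, timeout=5)
        return response.headers.get("Set-Cookie"), session.cookies.get_dict()
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    header, cookies = collect_cookies()
    print("Set-Cookie header:", header)
    print("Session cookies:  ", cookies)
```

Printing `response.headers.get("Set-Cookie")` for a real site is the same check the answer describes: whatever is missing from that header was not set by that response.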
I just started experimenting with Requests in Python to interact with different sites. However, sometimes I want to see whether the POST requests I'm sending are actually working. Is there any way to open a browser to see what is actually happening when I send POST requests?
I am scraping data from peoplefinders.com, a website which is not accessible from my home country, so I am using a VPN client.
I log in to this website with a session POST, and through the same session I get items from different pages of the same website. The problem is that I do the scraping in a for loop with GET requests, but for some reason I receive a 400 error response after several iterations. The error occurs after scraping 4-5 pages on average.
Is it due to the fact that I am using a VPN connection?
Don't all requests from the same session contain the same cookies, and hence keep me logged in while scraping different pages of the same website?
Thank You
HTTP 400 is returned if the request is malformed.
You should inspect the request being made when you get the error. Perhaps it is not properly encoded.
A VPN should not cause an HTTP 400.
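To inspect the request, one option is to prepare it through the session without sending it and look at the final URL, headers, and body. This is a sketch, not the asker's code: `describe_request` is a hypothetical helper, and the URL and the `name` parameter are placeholders. After a real request, the same information is available on `response.request`.

```python
import requests


def describe_request(session, method, url, **kwargs):
    """Prepare a request through the session and return what would be sent."""
    # prepare_request merges the session's headers and cookies into the
    # request, so the result reflects what the session would put on the wire.
    prepared = session.prepare_request(requests.Request(method, url, **kwargs))
    return {
        "method": prepared.method,
        "url": prepared.url,  # query string already encoded
        "headers": dict(prepared.headers),
        "body": prepared.body,
    }


if __name__ == "__main__":
    session = requests.Session()
    # Placeholder URL and parameter, chosen only to show the encoding.
    info = describe_request(
        session, "GET", "https://example.com/search", params={"name": "John Smith"}
    )
    for key, value in info.items():
        print(f"{key}: {value}")

    # For a response that came back with an error, the request that was
    # actually sent can be inspected the same way:
    #   response.request.url, response.request.headers, response.request.body
```

Logging this output for each iteration of the loop shows whether the failing request differs from the ones that succeeded.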