Python request returning 403

I am trying to use requests to make an API call that this page makes: https://www.betonline.ag/sportsbook/martial-arts/mma.
import requests

requests.post(
    url='https://api.betonline.ag/offering/api/offering/sports/offering-by-league',
    headers={'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.0 Safari/605.1.15'},
    json={"Sport": "martial-arts", "League": "mma", "ScheduleText": None, "Period": 0},
)
I have also tried including all the headers I see in the browser's request, but I am still unable to get a 200 response.
What am I missing?
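For comparison, a fuller version of the request that mirrors more of the browser's headers looks roughly like this; the extra headers (Accept, Origin, Referer) are assumptions about what the server checks, and they may not be enough if the site also uses bot protection:
import requests

# Sketch only: the extra headers below are assumed from what a browser typically
# sends with this kind of XHR, not confirmed requirements of the API.
headers = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.0 Safari/605.1.15',
    'Accept': 'application/json, text/plain, */*',
    'Origin': 'https://www.betonline.ag',
    'Referer': 'https://www.betonline.ag/sportsbook/martial-arts/mma',
}
response = requests.post(
    'https://api.betonline.ag/offering/api/offering/sports/offering-by-league',
    headers=headers,
    json={"Sport": "martial-arts", "League": "mma", "ScheduleText": None, "Period": 0},
    timeout=10,
)
print(response.status_code)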

Related

Python web scraping, requests object hangs

I am trying to scrape this website in Python: https://www.nseindia.com/.
However, when I try to load it using requests, the call simply hangs. Below is the code I am using.
import requests
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36'}
r = requests.get('https://www.nseindia.com/',headers=headers)
The requests.get call simply hangs; I am not sure what I am doing wrong here. The same URL works perfectly in Chrome or any other browser.
Appreciate any help.
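One thing worth ruling out first: requests has no default timeout, so a server that accepts the connection but never responds will block forever. A minimal sketch with an explicit timeout and fuller browser-style headers (the extra headers are an assumption about what the site expects, not something confirmed):
import requests

headers = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',  # assumed
    'Accept-Language': 'en-US,en;q=0.9',  # assumed
}

try:
    # The timeout makes the call fail fast instead of hanging indefinitely.
    r = requests.get('https://www.nseindia.com/', headers=headers, timeout=10)
    print(r.status_code)
except requests.exceptions.Timeout:
    print('The server did not respond within 10 seconds')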

Unable to read website intermittently with requests

I tried to read a website using Python requests.
However, it sometimes succeeds but sometimes fails.
Here is my code:
import requests
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36'}
url = 'https://hn.house.ifeng.com/homedetail/25762.shtml'
res = requests.get(url, headers = headers, timeout = 10, verify=False)
What could be causing this, and how can I solve it?
Thank you very much.
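When the same request sometimes succeeds and sometimes fails, transient network or server errors are a common cause. Here is a sketch of retrying with backoff using requests' adapter support; the retry count, backoff factor, and status codes are illustrative choices, not values required by this site:
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36'}
url = 'https://hn.house.ifeng.com/homedetail/25762.shtml'

# Retry transient failures (connection errors and 5xx responses) with backoff.
retry = Retry(total=3, backoff_factor=1, status_forcelist=[500, 502, 503, 504])
session = requests.Session()
session.mount('https://', HTTPAdapter(max_retries=retry))
session.mount('http://', HTTPAdapter(max_retries=retry))

res = session.get(url, headers=headers, timeout=10)
print(res.status_code)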

403 Forbidden error, can't access this site

I want to scrape https://health.usnews.com/doctors/specialists-index, but when I send a request to this site through a scrapy spider it returns status code 403. I added a user_agent to my request, but that is not working either.
I referred to these two answers, Python Doesn't Have Permission To Access On This Server / Return City/State from ZIP and 403: You don't have permission to access /index.php on this server, but they are not working for me.
My user_agent is Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.80 Safari/537.36. Can someone help me scrape the above-mentioned site?
Try adding 'authority' to the headers as well. The below works for me in scrapy shell:
from scrapy import Request
headers = {
    'authority': 'health.usnews.com',
    'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.132 Safari/537.36',
}
url = "https://health.usnews.com/doctors/specialists-index"
req = Request(url, headers=headers)
fetch(req)
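For reference, the same header set can also be sent outside scrapy with plain requests; this is only a sketch, and there is no guarantee the site treats the request the same way:
import requests

# Same 'authority' and 'user-agent' values as the scrapy shell example above.
headers = {
    'authority': 'health.usnews.com',
    'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.132 Safari/537.36',
}
resp = requests.get('https://health.usnews.com/doctors/specialists-index', headers=headers, timeout=10)
print(resp.status_code)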

Requests posts with Python

This is the first time I am trying requests.post(), because I have always used requests.get(). I'm trying to navigate to a website and search. I am using yellowpages.com, and before I get negative feedback about scraping the site or about using an API instead, I just want to try it out. The problem I am running into is that it returns some HTML that isn't remotely what I am looking for. My code is below to show you what I am talking about.
import requests

headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}
url = "https://www.yellowpages.com"
search_terms = "Car Dealership"
location = "Jackson, MS"
q = {'search_terms': search_terms, 'geo_locations_terms': location}
page = requests.post(url, headers=headers, params=q)
print(page.text)
Your request boils down to
$ curl -X POST \
-H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36' \
'https://www.yellowpages.com/?search_terms=Car Dealership&geo_locations_terms=Jackson, MS'
For this, the server returns a 502 Bad Gateway status code.
The reason is that you use POST together with query parameters (params). The two don't go well together. Use data instead:
requests.post(url, headers=headers, data=q)
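Putting that together, the full script would look roughly like this (a sketch of the suggestion above; the site's form handling may have changed since):
import requests

headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}
url = "https://www.yellowpages.com"
q = {'search_terms': 'Car Dealership', 'geo_locations_terms': 'Jackson, MS'}

# Send the search fields in the request body instead of the query string.
page = requests.post(url, headers=headers, data=q)
print(page.status_code)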

Python Requests Get XML

If I go to http://boxinsider.cratejoy.com/feed/ I can see the XML just fine. But when I try to access it using Python requests, I get a 403 error.
import requests

blog_url = 'http://boxinsider.cratejoy.com/feed/'
headers = {'Accepts': 'text/html,application/xml'}
blog_request = requests.get(blog_url, timeout=10, headers=headers)
Any ideas on why?
Because it's hosted by WPEngine and they filter user agents.
Try this:
USER_AGENT = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.152 Safari/537.36"
requests.get('http://boxinsider.cratejoy.com/feed/', headers={'User-agent': USER_AGENT})
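A slightly fuller usage example, combining that User-Agent with the Accept header from the question and checking the result (a sketch only):
import requests

USER_AGENT = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.152 Safari/537.36"
headers = {'User-Agent': USER_AGENT, 'Accept': 'text/html,application/xml'}

blog_request = requests.get('http://boxinsider.cratejoy.com/feed/', timeout=10, headers=headers)
print(blog_request.status_code)
print(blog_request.text[:200])  # first part of the XML feed, if the request succeeded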
