I am trying to read the titles of the latest posts from the r/chrome subreddit using Python.
But when I execute the Python file, I get a KeyError: 'data' error.
Here's my code:
import json, requests

def getReddit():
    redditLatest = requests.get('https://www.reddit.com/r/chrome/new/.json').json()
    print(redditLatest['data']['children'][0]['data']['title'])

getReddit()
Terminal: the traceback ends with KeyError: 'data'.
Please help with a solution.
As Mikhail Beliansky already mentioned, debug your response.
import requests
redditLatest = requests.get('https://www.reddit.com/r/chrome/new/.json').json()
print(redditLatest)
# {'message': 'Too Many Requests', 'error': 429}
You can see that Reddit recognizes you are not a "normal" client/browser, especially because requests sends a User-Agent like "python-requests/2.25.1".
You can add a common browser User-Agent to your request. If you don't make too many requests, this may work for you:
redditLatest = requests.get(
    'https://www.reddit.com/r/chrome/new/.json',
    headers={'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.77 Safari/537.36'}
).json()
print(redditLatest)
# {'kind': 'Listing', 'data': {...}}
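Putting it together with your original function (a minimal sketch; the exact User-Agent string is just an example of a common browser one):

import requests

# Example browser User-Agent; any common, current one should work the same way.
HEADERS = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.77 Safari/537.36'}

def getReddit():
    # With a browser-like User-Agent, the listing JSON comes back instead of
    # the 429 error, so the 'data' key exists again.
    redditLatest = requests.get('https://www.reddit.com/r/chrome/new/.json', headers=HEADERS).json()
    print(redditLatest['data']['children'][0]['data']['title'])

getReddit()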
As the title says, I'm trying to send a request to a URL using requests with headers, but when I try to print the status code, nothing is printed in the terminal. I checked my internet connection and switched connections to test, but nothing changed.
Here's my code:
import requests
from bs4 import BeautifulSoup
from requests.exceptions import ReadTimeout
link="https://www.exampleurl.com"
header={
"accept-language": "tr,en;q=0.9,en-GB;q=0.8,en-US;q=0.7",
'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4844.51 Safari/537.36 Edg/99.0.1150.36'
}
r=requests.get(link)
print(r.status_code)
When I execute this, nothing appears, and I don't know why. If someone can help me, I will be very glad.
You can use requests.head(link), as below:
r = requests.head(link)
print(r.status_code)
I got the same problem; the get() never returned.
Since you had already created a header variable, I tried using it:
r = requests.get(link, headers=header)
Now I get status 200 returned.
I'm getting a 404 with Python Requests, but I can access the page with no problem through my browser. I can also access other pages that are formatted exactly the same as this page, and they load fine.
I have already tried changing headers, with no luck.
My Code:
import requests

string_page = str(page)  # page is defined earlier in the script
with requests.Session() as s:
    resp = s.get('https://bscscan.com/token/generic-tokentxns2?m=normal&contractAddress=0x470862af0cf8d27ebfe0ff77b0649779c29186db&a=&sid=f58c1cdefacc680b799412c7645ed7f7&p=' + string_page)
    page_info = str(resp.text)
    print(page_info)
I have also tried with urllib, and the same thing happens.
I'm not sure whether this will fix it, but try adding this to your headers; it might work:
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.93 Safari/537.36'
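Applied to your session code, that would look roughly like this (a sketch, not a guaranteed fix; the User-Agent value is just one common browser string):

import requests

headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.93 Safari/537.36'}

with requests.Session() as s:
    s.headers.update(headers)  # the session now sends this User-Agent with every request
    resp = s.get('https://bscscan.com/token/generic-tokentxns2?m=normal&contractAddress=0x470862af0cf8d27ebfe0ff77b0649779c29186db&a=&sid=f58c1cdefacc680b799412c7645ed7f7&p=' + str(page))
    print(resp.status_code)  # check whether the 404 is gone before parsing resp.text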
Good day, everyone.
This code checks the availability of a website, but it loads the whole page, so if I have a list of 100 websites it will be slow.
My question is: is there any way to do it faster?
import requests

user_agent = {'accept': '*/*', 'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.100 Safari/537.36'}
session = requests.Session()
response = session.get("http://google.com", headers=user_agent, timeout=5)
if response.status_code == 200:
    print("Checked & available")
else:
    print("Not available")
Thanks! Every bit of help will be appreciated.
You can use this:
import urllib.request
print(urllib.request.urlopen("http://www.google.com").getcode())
# output: 200
"This code checks the availability of a website, but it loads the whole page."
To avoid loading the whole page, you can issue HEAD requests instead of GET requests, so you only check the status. See "Getting HEAD content with Python Requests".
Another way to make it faster is to issue multiple requests using multiple threads or asyncio (https://pawelmhm.github.io/asyncio/python/aiohttp/2016/04/22/asyncio-aiohttp.html).
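Combining the two ideas, here is a minimal sketch (the URL list, timeout, and worker count are placeholders; note that a few servers answer HEAD requests differently than GET):

import requests
from concurrent.futures import ThreadPoolExecutor

urls = ["http://google.com", "http://example.com"]  # placeholder list of sites to check

def check(url):
    try:
        # HEAD fetches only the status line and headers, not the page body.
        response = requests.head(url, timeout=5, allow_redirects=True)
        return url, response.status_code == 200
    except requests.RequestException:
        return url, False

# Run the checks concurrently so one slow host does not block the rest.
with ThreadPoolExecutor(max_workers=10) as executor:
    for url, ok in executor.map(check, urls):
        print(url, "available" if ok else "not available")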
I have tried the code below; it doesn't work.
The PDF is readable in the browser.
I want to GET the PDF file from the GET URL and POST it to another server.
import requests
response = requests.get(url='some url')
requests.post(url='my_url', files={'file': response.content})
Link: (Expired)
It is caused by a missing header, specifically the User-Agent; it looks like the site checks it.
The call returns HTTP 406 (response.status_code). With the header, HTTP 200 is returned.
Try this:
import requests
header = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36"}
response = requests.get(url='some url', headers=header)
requests.post(url='my_url', files={'file': response.content})
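Optionally (my addition, not something the target site requires), you can fail fast if the download still errors, and give the uploaded file an explicit name and content type:

import requests

header = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36"}
response = requests.get(url='some url', headers=header)
response.raise_for_status()  # raises immediately if the server still returns 406 or any other error

# files= also accepts a (filename, data, content_type) tuple instead of raw bytes.
requests.post(url='my_url', files={'file': ('document.pdf', response.content, 'application/pdf')})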
I have tried to log in to a Twitter account using the requests library, but I get a 400 response; it is not working. I used all the required payload parameters and headers, but I still cannot figure out how to log in.
import requests
from bs4 import BeautifulSoup
payload={
"session[username_or_email]":"***************",
"session[password]":"*************",
"authenticity_token":"*************",
"ui_metrics":'{"rf":{"a4f2d7a8e3d9736f0815ae7b34692191bca9f114a7d5602c7758a3e6087b6b30":0,"ad92fc8b83fb5dec3f720f89a7f0fb415a26130516362f230b02251edd96a54a":0,"a011babb5c5df598f93bcc4a38dfad0276f69df36faff48eea95bac67cefeffe":0,"a75214752b7e90fd50725fce21cc26761ef3613173b0f8764d52c8b53f136bbf":0},"s":"mTArUSdNtTOm6WaGwNeRjMAU3EhNA3VGbFeCIZeEkjjLTAbccFDTJjcTEB2tQ9iuNJUzniFKyvhZNOGdH1LIwmi1YSMcFTOHu2Wi49yKvONv0obfg1dW27znR_C2n-ev2zMvN5166j1ccsxWKIheiWw-eHM7oXA54U40cWHvdCrunJJKj2INkTrcVph-y2fccu1m3hp31vngqBiL-XmeLWYiyZ-NYOmV8f5iXW9WWMvISTcSwzz9vd_n9-tLSKociT-1ap5ZVFWNUWIycSflj8WcOmmRFzq4kwa-NsS0FRp-DQ2FOkozhhhQi9HDvSODUlGsdQWBPkGKKtDWbtnj9gAAAWEty4Xv"}',
"scribe_log":"",
"redirect_after_login":"",
"authenticity_token":"******************",
"return_to_ssl":"",
"remember_me":"1",
"lang":""
}
headers={
"accept":"text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8",
"accept-encoding":"gzip, deflate, br",
"accept-language":"en-US,en;q=0.9",
"cache-control":"max-age=0",
"cookie":'moments_profile_moments_nav_tooltip_self=true; syndication_guest_id=v1%3A150345116906281638; eu_cn=1; kdt=QErLcBT9OjM5gjEznmsRcHlMTK6biDyAw4gfI5ro; remember_checked_on=1; _ga=GA1.2.1923324433.1496571570; tfw_exp=0; _gid=GA1.2.106381927.1516638134; __utma=43838368.1923324433.1496571570.1516764481.1516764481.1; __utmz=43838368.1516764481.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); lang=en; ct0=7ceea26f7fd3d186152512d26365cddf; _twitter_sess=BAh7CiIKZmxhc2hJQzonQWN0aW9uQ29udHJvbGxlcjo6Rmxhc2g6OkZsYXNo%250ASGFzaHsABjoKQHVzZWR7ADoPY3JlYXRlZF9hdGwrCL8wyy1hAToMY3NyZl9p%250AZCIlNjJjODQ1MjZiZWQzOGUyODZlOWUxNmNkMWJhZTZjYjc6B2lkIiU4MmZm%250AYWQ3Mzc1OGFhNmJjOTIxZjlmOGEyMzk3MjE1NToJdXNlcmwrCQAAVbhKiEIN--32d967262e1de8852d20ace15ec93d87b9a902a8; personalization_id="v1_snKt6bqCONQsnFuE8EOZDA=="; guest_id=v1%3A151689245583269291; _gat=1; ads_prefs="HBERAAA="; twid="u=955475925457502208"; auth_token=50decb38f16f3c264f480b0cd1cc30a9bcce9f08',
"referer":"https://twitter.com/login",
"upgrade-insecure-requests":"1",
"user-agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36"
}
res = requests.get("https://twitter.com/login", data=payload, headers=headers)
soup = BeautifulSoup(res.text, "html.parser")
print(res.status_code)
print(res.url)
for item in soup.find_all(class_="title"):
    print(item.text)
How do I log in to Twitter? Which parameters did I miss? Please help me out with this.
Note: I am not using the API or a Selenium driver; I want to do it using the requests library. Thanks in advance.
You're using the GET method to access an auth endpoint; usually the POST method is used for such purposes. Try using requests.post instead of requests.get.
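A minimal sketch of that change, reusing the payload and headers from the question (note: whether this exact URL accepts the form POST is an assumption; check the <form action> in the login page source, as the form may submit to a different endpoint):

import requests

# Assumption: POST the same form data to the login URL instead of GET-ing it.
res = requests.post("https://twitter.com/login", data=payload, headers=headers)
print(res.status_code)
print(res.url)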