I have a JSON document with information I intend to use in my addon. I found some code on this forum and tried to modify it, without success. What I want is for the function below to call this link (https://tugarepo.000webhostapp.com/lib/lib.json) so that I can see its content.
CODE:
return json.loads(openfile('lib.json',path.join('https://tugarepo.000webhostapp.com/lib/lib.json')))
Python Answer
You can use
import urllib2
urllib2.urlopen('https://tugarepo.000webhostapp.com/lib/lib.json').read()
in Python 2.7 to perform a simple GET request on your file. I think you're confusing openfile, which is for local files only, with an HTTP GET request, which is for hosted content. The result of read() can then be passed to any JSON library available in your project.
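For example, a minimal sketch of that in Python 2.7 (note that the answer below points out the JSON at this URL is currently invalid, so json.loads may raise ValueError):
import json
import urllib2

url = 'https://tugarepo.000webhostapp.com/lib/lib.json'
response = urllib2.urlopen(url)     # plain HTTP GET request
data = json.loads(response.read())  # parse the body with the json module
print(data)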
Original Answer for Javascript tag
In plain JavaScript, you can use a function like the one explained here: HTTP GET request in JavaScript?
If you're using jQuery (which Bootstrap's scripts also rely on), you can use the following: http://api.jquery.com/jquery.getjson/
If you want to see the content on the HTML page (associated with your JavaScript), you'll simply have to grab an element from the page (document.getElementById, document.getElementsByClassName, and such). Once you have a DOM element, you can insert HTML containing your JSON data into it yourself.
Example code: https://codepen.io/MrKickkiller/pen/prgVLe
The above code assumes jQuery is linked in your HTML. There is, however, an error, since your link doesn't send Access-Control headers: currently only requests coming from the tugarepo.000webhostapp.com domain have access to the JSON file. Consider adding CORS headers: https://enable-cors.org/
Simply do:
fetch('https://tugarepo.000webhostapp.com/lib/lib.json')
    .then(function (response) { return response.json(); })
    .then(function (body) { console.log(body); });
But this throws an error as your JSON is invalid.
Related
I need to write a script to confirm that part of a website is vulnerable to reflected XSS, but the request response doesn't contain the complete HTML, so I can't check it for the payload. For example, in Burp the response contains the whole page HTML, where I can see the alert('xss'), but in Python it does not. I've tried response.text, response.content, etc., but they're all the same. Is there a separate module for this stuff, or am I just doing something wrong with the request?
for p in payloads:
    response = requests.get(url + p)
    if p in response.text:  # use .text (str); .content is bytes and won't match a str payload
        print(f'Vulnerable: payload - {p}')
Burp response does contain the following
<pre>Hello <script>alert("XSS")</script></pre>
I need to have the same thing in the Python response
One possibility is that that part of the page is only generated a few seconds after the page loads. The requests module doesn't execute JavaScript; it returns the HTML exactly as the server first sends it (i.e., before the script has run).
To get around this, you may want to use a browser-automation module like Selenium, which lets you wait for the page to render before grabbing the HTML, as in the sketch below.
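A rough sketch of that (reusing url and payloads from the question's snippet; Chrome and the 10-second timeout are arbitrary choices):
from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.Chrome()  # any driver works; Chrome is an assumption
for p in payloads:
    driver.get(url + p)
    try:
        # wait up to 10 s for the reflected payload to appear in the rendered DOM
        WebDriverWait(driver, 10).until(lambda d: p in d.page_source)
        print(f'Vulnerable: payload - {p}')
    except TimeoutException:
        pass
driver.quit()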
This is my first question posted here, so let me know if I need to add more information. I have set up Python code which uses requests.post to send an HTTP request to a website (the code is shown below). I am trying to post the data sent from Python to the Weebly website I have created. I believe the easiest option for this would be to embed HTML code into the website; however, I have never used HTML before and cannot find a good source to learn it.
Python code:
import requests
DataSent = {"somekey":"somevalue"}
url = "http://www.greeniethegenie123.weebly.com"
r = requests.post(url, data = DataSent)
print(r.text)
Edit: The question is how I can set up HTML code to receive the request and post it on the website. If there is any other way to send the data, that would work too. I just have a sensor recording numbers that I would like to post to the Weebly website.
Edit: It looks like HTML alone cannot do this. Does anyone have other advice for how to send data from a Raspberry Pi to a website? The main problem is that the website needs to update the data every minute to be useful for what I am trying to do.
You would have to use JavaScript instead of HTML to accomplish this.
HTML is used for the structure of a webpage, while JavaScript can be used for requests, updating content, and lots of other things.
Here are some links to help you out with HTML and JavaScript:
HTML Intro
JavaScript Intro
For requests with JavaScript, I would recommend using Axios:
Axios NPM
Here's a link explaining how to use Axios as well:
Axios Tutorial
When I access a specific webpage, it sends a specific POST request whose response I want to parse. How do I make my script, given only the webpage's URL, capture and parse that specific request's response?
(Ideally, in Python.)
So, I've found out that the 'seleniumwire' library for Python is one way to access the requests a browser makes while loading a page.
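A minimal sketch with selenium-wire (Chrome is an assumption, and the URL is a placeholder for the page that fires the POST):
from seleniumwire import webdriver  # pip install selenium-wire

driver = webdriver.Chrome()
driver.get('https://example.com/the-page')  # the URL your script receives

# driver.requests holds every request the page made while loading
for request in driver.requests:
    if request.method == 'POST' and request.response:
        print(request.url)
        print(request.response.body)  # raw response body (bytes) to parse
driver.quit()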
I've scraped websites before, but this time I am stumped. I am attempting to search for a person on Biography.com and retrieve his/her biography. But whenever I search the site using urllib2 and query the URL http://www.biography.com/search/, I get a blank page with no data in it.
When I look at the source generated in the browser by clicking View Source, I still do not see any data. When I use Chrome's developer tools, I find some data, but still no links leading to the biography.
I have tried changing the User Agent, adding referrers, using cookies in Python but to no avail. If someone could help me out with this task it would be really helpful.
I am planning to use this text for my NLP project and worst case, I'll have to manually copy-paste the text. But I hope it doesn't come to that.
Chrome/Chromium's Developer Tools (or Firebug) is definitely your friend here. I can see that the initial search on Biography's site is made via a call to a Google API, e.g.
https://www.googleapis.com/customsearch/v1?q=Barack%20Obama&key=AIzaSyCMGfdDaSfjqv5zYoS0mTJnOT3e9MURWkU&cx=011223861749738482324%3Aijiqp2ioyxw&num=8&callback=angular.callbacks._0
The search term I used is in the q= part of the query string: q=Barack%20Obama.
This returns JSON inside of which there is a key link with the value of the article of interest's URL.
"link": "http://www.biography.com/people/barack-obama-12782369"
Visiting that page shows me that this is generated by a request to:
http://api.saymedia-content.com/:apiproxy-anon/content-sites/cs01a33b78d5c5860e/content-customs/#published/#by-custom-type/ContentPerson/#by-slug/barack-obama-12782369
which returns JSON containing HTML.
So, replacing the last part of the link barack-obama-12782369 with the relevant info for the person of interest in the saymedia-content link may well pull out what you want.
To implement:
1. Use urllib2 (or requests) to perform the search via their Google API call, using urllib2.urlopen(url) or requests.get(url). Replace Barack%20Obama with a URL-escaped search string, e.g. Bill%20Clinton.
2. Parse the JSON using Python's json module to extract the string that gives you the http://www.biography.com/people link. From this, extract the part of the link of interest (barack-obama-12782369 above).
3. Use urllib2 or requests to make the saymedia-content API request, replacing barack-obama-12782369 after #by-slug/ with whatever you extracted in step 2; i.e., do another urllib2.urlopen on this URL.
4. Parse the JSON from the response of this second request to extract the content you want (see the sketch below the caveat).
(Caveat: This is provided that there are no session-based strings in those two API calls that might expire.)
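Putting the four steps together, a rough sketch using requests (the key/cx values are copied from the observed call and may expire; the response shapes and the %23 escaping are assumptions to verify against the dev tools):
import requests

# Step 1: the Google Custom Search call observed in the developer tools.
search = requests.get(
    'https://www.googleapis.com/customsearch/v1',
    params={'q': 'Bill Clinton',
            'key': 'AIzaSyCMGfdDaSfjqv5zYoS0mTJnOT3e9MURWkU',
            'cx': '011223861749738482324:ijiqp2ioyxw',
            'num': 8})  # no callback param, so the response is plain JSON

# Step 2: pull out the /people/ link and its trailing slug.
link = search.json()['items'][0]['link']  # e.g. .../people/barack-obama-12782369
slug = link.rsplit('/', 1)[-1]

# Steps 3-4: the saymedia-content call. Note: requests treats '#' as the
# start of a URL fragment, so the '#' segments are sent percent-encoded
# (%23) here -- check the exact request in the dev tools.
content = requests.get(
    'http://api.saymedia-content.com/:apiproxy-anon/content-sites/'
    'cs01a33b78d5c5860e/content-customs/%23published/%23by-custom-type/'
    'ContentPerson/%23by-slug/' + slug)
print(content.json())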
Alternatively, you can use Selenium to visit the website, do the search and then extract the content.
You will most likely need to manually copy and paste, as biography.com is a completely JavaScript-based site, so it can't be scraped with traditional methods.
You can discover an API URL with HttpFox (a Firefox addon), e.g. http://www.biography.com/.api/item/search?config=published&query=marx
This returns JSON you can process, searching for /people/ to retrieve biography links.
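For instance, a minimal sketch of that (the response shape isn't documented, so this just greps the raw JSON text for /people/ links rather than assuming its structure):
import re
import requests

r = requests.get('http://www.biography.com/.api/item/search',
                 params={'config': 'published', 'query': 'marx'})
# scan the raw JSON for biography links of the form .../people/...
links = re.findall(r'"(https?://www\.biography\.com/people/[^"]+)"', r.text)
print(links)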
Or you can use a screen scraper like Selenium.
I have an HTML page containing JavaScript which generates some content. I have to parse this content from a Python script. I have saved a copy of the file on my computer. Are there any ways to work with the 'already generated' HTML, like what I can see in the browser after opening the page file? As I understand it, I have to work with the DOM (maybe the xml2dom lib).
Have you saved "the file" (the web page, I imagine) before or after JavaScript has altered it?
If "after", then it doesn't matter any more that some of the HTML was produced by JavaScript -- you can just use popular parsers like lxml or BeautifulSoup to handle the HTML you have.
If "before", then first you need to let JavaScript do its work by automating a real browser; for that task, I would recommend Selenium RC -- which brings you back to the "after" case ;-).
I think you may have a fundamental misunderstanding in regards to what runs where: At the time JavaScript generates the content (on client side), the server side processing of the document has already taken place. There is no direct way for a server side Python script to access HTML created by JavaScript. Basically, that HTML lives only "virtually" in the browser's DOM.
You would have to find a way to transmit that HTML to your Python script, most likely using Ajax: take the HTML and add it as a parameter to your Ajax call. (Remember to use POST as the request method so you don't run into size limitations.)
An example using jQuery's AJAX functions:
$.ajax({
    url: "myscript.py",
    type: "POST",
    data: { html: your_html_content_here },
    success: function() {
        alert("sent HTML to python script!");
    }
});
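On the Python side, the receiving end could look something like this minimal Flask sketch (Flask is just one choice among many; the route and field names mirror the Ajax call above):
from flask import Flask, request

app = Flask(__name__)

@app.route('/myscript.py', methods=['POST'])
def receive_html():
    html = request.form['html']  # the field posted by the $.ajax call above
    # ...parse `html` here with lxml or BeautifulSoup...
    return 'OK'

if __name__ == '__main__':
    app.run()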