What I've done:
I've created a small HTTP server in Python, using no modules other than socket. It starts, loads an HTML file, and sends it out,
so you can see the HTML file in your browser.
What I'm using:
I'm using Python 3.5.1 and the socket module.
The problem:
Images do not show in the web browser.
Here is the code I have put together:
import socket
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
HOST = ''
PORT = 8888
asked_url = []
s.bind((HOST, PORT))
s.listen(1)
while True:
    conn, addr = s.accept()
    request = conn.recv(1024)
    html_file = open("page.html", 'rb')
    try:
        http_response = html_file.read()
        conn.sendall(http_response)
        conn.sendall(b'<p><small>PyWeb</small></p>')
    except:
        print("Couldn't send any data! Passing...")
        pass
    conn.close()
Here is the HTML code for the page as well, although it doesn't seem to be the problem:
<script src="https://mcapi.us/scripts/minecraft.js"></script>
<div class="server-status">
My awesome server is currently <span class="server-online"></span>!
</div>
<script>
MinecraftAPI.getServerStatus('SERVER HERE', {
port: 25565 // optional, only if you need a custom port
}, function (err, status) {
if (err) {
return document.querySelector('.server-status').innerHTML = 'Error loading status';
}
// you can change these to your own message!
document.querySelector('.server-online').innerHTML = status.online ? 'up' : 'down';
});
</script>
<img src="test.png" alt="Test Image" style="width:304px;height:228px;">
What the HTML code does:
Shows text and an image.
If you need more detail just ask.
Where is test.png located? It looks like your HTML has trouble finding the file. I ran your code locally and was able to see an image when I changed the image tag to load an image from a URL.
I checked Chrome dev tools first and noticed the image tag rendered fine.
I changed the image tag to this (a random image from a Google image search):
`<img src="http://emojipedia-us.s3.amazonaws.com/cache/f8/69/f869f6512b0d7187f4e475fc9aa7f250.png" alt="Test Image">`
And it showed up fine.
The issue is that, no matter what request reaches your HTTP server, you send the content of page.html as the response. This is because you hard-coded this line in your server:
html_file = open("page.html", 'rb')
Your page.html contains an image, which means the browser (or any other client) will send a second request to your server asking for the image (the content of test.png). Instead of reading the image file, you serve page.html's content again. That is why the image appears broken.
To fix this, look at which resource is requested, open that file, and send its content as the response. Take a look at what the request variable in your server code contains to get an idea of how to extract the name of the requested resource.
You should also send the proper Content-Type header to let the client know how to handle the response.
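The fix described above can be sketched as follows. This is a minimal illustration, not a complete HTTP server: the helper names `parse_request_path`, `build_response` and `serve` are made up for this example, it assumes page.html and test.png sit in the directory passed as `root`, and it ignores details such as URL-encoded paths and requests larger than 1024 bytes.

```python
import mimetypes
import os
import socket


def parse_request_path(request):
    """Extract the path from a request line such as b'GET /test.png HTTP/1.1'."""
    try:
        request_line = request.split(b"\r\n", 1)[0].decode("ascii")
        _method, path, _version = request_line.split(" ", 2)
    except (UnicodeDecodeError, ValueError):
        path = "/"  # malformed request: treat it like a request for the root
    path = path.split("?", 1)[0]  # drop any query string
    return "/page.html" if path == "/" else path


def build_response(path, root="."):
    """Build status line, headers and body for the file that `path` names."""
    root = os.path.abspath(root)
    full_path = os.path.abspath(os.path.join(root, path.lstrip("/")))
    # Refuse anything outside `root` and anything that is not a regular file.
    inside_root = full_path.startswith(root + os.sep)
    if inside_root and os.path.isfile(full_path):
        with open(full_path, "rb") as f:
            body = f.read()
        status = "200 OK"
        content_type = mimetypes.guess_type(full_path)[0] or "application/octet-stream"
    else:
        body = b"<h1>404 Not Found</h1>"
        status = "404 Not Found"
        content_type = "text/html"
    head = (
        "HTTP/1.1 {}\r\n"
        "Content-Type: {}\r\n"
        "Content-Length: {}\r\n"
        "Connection: close\r\n\r\n"
    ).format(status, content_type, len(body))
    return head.encode("ascii") + body


def serve(host="", port=8888, root="."):
    """Accept connections forever and answer each request with the named file."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        s.bind((host, port))
        s.listen(1)
        while True:
            conn, _addr = s.accept()
            with conn:
                request = conn.recv(1024)
                conn.sendall(build_response(parse_request_path(request), root))
```

Calling `serve()` starts the loop. With this structure, the browser's follow-up request for /test.png gets the image bytes with an image/png content type instead of a second copy of page.html.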
I have downloaded a .html file and want to send a request to this file to grab its content.
However, if I do the following:
import requests
html_file = "/user/some_html.html"
r = requests.get(html_file)
I get the following error:
Invalid URL 'some_html.html': No schema supplied.
If I add a schema I get the following error:
HTTPConnectionPool(host='some_html.html', port=80): Max retries exceeded with url:
I want to know how to send a request to an HTML file once it has been downloaded.
You are accessing an HTML file in a local directory. The get() method opens an HTTP connection (port 80 by default) to fetch data from a web server, not from a local directory. To access a local file with get(), you would have to serve it through a local web server such as XAMPP or WampServer.
To read a file from a local directory you can simply use open(); requests.get() is for fetching resources over an HTTP connection, in simple words from the internet, not from a local directory.
html_file = "/user/some_html.html"
# Read the local file directly; no HTTP request is involved.
with open(html_file, "r") as t:
    for v in t:
        print(v)
Output:
You don't "send a request to an HTML file". Instead, you send a request to an HTTP server on the internet, which returns a response with the contents of an HTML file.
The file itself knows nothing about "requests". If you have the file stored locally and want to do something with it, then you can open it just like any other file.
If you are interested in learning more about the request and response model, I suggest you try something like
response = requests.get("http://stackoverflow.com")
You should also read about HTTP and requests and responses to better understand how this works.
You can do it by setting up a local server for your HTML file.
If you use Visual Studio Code, you can install Live Server by Ritwick Dey.
Then proceed as follows:
1 - Make the first request and save the HTML content to a .html file:
my_req.py
import requests
file_path = './'
file_name = 'my_file'
url = "https://www.qwant.com/"
response = requests.request("GET", url)
# Use a context manager so the file is flushed and closed after writing.
with open(file_path + file_name + '.html', 'w') as w:
    w.write(response.text)
2 - With Live Server installed in Visual Studio Code, open my_file.html and then click Go Live.
3 - Now you can make a request to your local HTTP server:
second request
import requests
url = "http://127.0.0.1:5500/my_file.html"
response = requests.request("GET", url)
print(response.text)
And, ta-da! Do what you need to do.
On a crawler project, I had a situation where the content displayed on the website differed from the content retrieved with response.text, so the XPaths were not the same as on the website. I needed to download the content into a local HTML file and work out the new XPaths to get the information I needed.
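The same local-server idea can also be sketched without an editor extension, using only the Python standard library (3.7 or later). Note that this swaps Live Server for `http.server` and `requests` for `urllib.request`; the function name `fetch_local_html` and the demo file content are made up for illustration.

```python
import functools
import http.server
import os
import tempfile
import threading
import urllib.request


def fetch_local_html(directory, file_name):
    """Serve `directory` on a free local port and GET one file from it."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    # Port 0 asks the operating system for any free port.
    server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        url = "http://127.0.0.1:{}/{}".format(server.server_address[1], file_name)
        # An empty ProxyHandler keeps the request local even if a proxy is configured.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(url) as response:
            return response.read().decode("utf-8")
    finally:
        server.shutdown()
        server.server_close()


# Demo with a throwaway file in a temporary directory.
with tempfile.TemporaryDirectory() as demo_dir:
    with open(os.path.join(demo_dir, "my_file.html"), "w", encoding="utf-8") as f:
        f.write("<p>hello</p>")
    print(fetch_local_html(demo_dir, "my_file.html"))
```

If the goal is only to read or parse the saved file, opening it directly is simpler; a local server is mainly useful when the code under test must go through a real HTTP request.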
You can try this:
from requests_html import HTML
with open("htmlfile.html") as htmlfile:
    sourcecode = htmlfile.read()
    parsedHtml = HTML(html=sourcecode)
print(parsedHtml)
I'm trying to intercept and modify HTTPS content using mitmproxy.
It works really well using the GUI, but I'd like to use a Python script.
I tried this script:
from mitmproxy import http
def request(flow: http.HTTPFlow) -> None:
    if flow.request.pretty_url == "https://www.facebook.com/?l":
        flow.response = http.HTTPResponse.make(
            200,  # (optional) status code
            b"Hello World, this is a test",  # (optional) content
            {"Content-Type": "text/html"}  # (optional) headers
        )
def response(flow: http.HTTPFlow):
    if flow.response.raw_content and flow.request.pretty_url == "https://www.facebook.com/?lol":
        file = open(b"C:\Users\myuser\PycharmProjects\mitmProx\data.txt", "a")
        file.write(str(flow.response.) + "\n")
However, the content I intercept looks encrypted rather than readable, even though the same content is shown in clear text in the web GUI!
Does anyone have an idea why my script intercepts encrypted content while the web GUI shows the clear data?
It looks like you want to use response.text or response.content, not response.raw_content. raw_content contains the raw compressed HTTP message body, whereas .content contains the uncompressed version.
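The difference between the two forms of the body can be illustrated without mitmproxy. The sketch below assumes the server compressed the response with gzip (a real response might use another encoding, such as deflate or br); the variable names are made up for this example.

```python
import gzip

body = b"Hello World, this is a test"

# What travels on the wire when the server sends "Content-Encoding: gzip".
# This is the kind of data a raw, still-compressed body holds.
raw_body = gzip.compress(body)

# What a client sees once the content encoding has been undone.
decoded_body = gzip.decompress(raw_body)

print(raw_body[:2])                  # the two gzip magic bytes, not readable text
print(decoded_body.decode("utf-8"))  # the readable message
```

Compressed bytes written to a text file look like gibberish, which is easy to mistake for encryption.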
I'm building a little server that sends a page with a JavaScript script. It works when I open the page directly in my browser, but if I request the page from the server, the page is received but not the script, so I get these errors:
Script.js is missing from the list of files:
This looks strange, because in the network tab I can see a request for the script with status 200:
The JS file I'm trying to add is Chart.js, so I can't inline it; it would become impossible to work with. For now the server is just a few lines of Python that use the SimpleHTTPRequestHandler, and I'm probably going to replace it. Could it be because the SimpleHTTPRequestHandler can't handle multiple file requests?
Here's the code. I tried to make a snippet, but it doesn't work there either (probably because I have never written a snippet before):
HTML:
<!doctype html>
<html>
<head>
<script src="script.js"></script>
</head>
<body>
<p id = "paragraph"></p>
<script>
document.getElementById('paragraph').innerHTML = sayHello();
</script>
</body>
</html>
JS:
function sayHello(){
    return "HelloWorld!"
}
Here is the python server script:
from http.server import HTTPServer, BaseHTTPRequestHandler
class SimpleHTTPRequestHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.end_headers()
        with open("index.html", "r") as page:
            self.wfile.write(page.read())
httpd = HTTPServer(("192.168.1.100", 8000), SimpleHTTPRequestHandler)
httpd.serve_forever()
I think you can get an element by id, tag name or class name and append the script file to it:
var head = document.getElementsByTagName('head')[0];
var script = document.createElement('script');
script.type = 'text/javascript';
script.onload = function() {
    callFunctionFromScript();
}
script.src = 'path/to/your-script.js';
head.appendChild(script);
also check this link
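Another thing worth checking on the server side: the do_GET in the question answers every GET with the contents of index.html, so the browser's request for script.js also returns HTML with status 200, which would match the symptoms described. A minimal sketch of a handler that picks the file from the request path is shown below. The names `ROUTES`, `resolve`, `PageHandler` and `run` are made up for this example, and it assumes index.html and script.js sit next to the server script.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Assumption: both files live in the directory the server is started from.
ROUTES = {
    "/": ("index.html", "text/html; charset=utf-8"),
    "/index.html": ("index.html", "text/html; charset=utf-8"),
    "/script.js": ("script.js", "application/javascript"),
}


def resolve(path):
    """Return (file name, content type) for a request path, or None."""
    return ROUTES.get(path.split("?", 1)[0])


class PageHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        route = resolve(self.path)
        if route is None:
            self.send_error(404, "File not found")
            return
        file_name, content_type = route
        try:
            # Read bytes, because wfile.write() expects bytes rather than str.
            with open(file_name, "rb") as f:
                body = f.read()
        except OSError:
            self.send_error(404, "File not found")
            return
        self.send_response(200)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def run(host="127.0.0.1", port=8000):
    """Start the server; blocks until interrupted."""
    HTTPServer((host, port), PageHandler).serve_forever()
```

The explicit route table keeps the example small and avoids serving arbitrary files. The standard library's own `http.server.SimpleHTTPRequestHandler` (which the class in the question shadows by reusing its name) already maps request paths to files in a directory, so it may be enough on its own.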
What I am trying to ask is this: I have a website which takes the user profile and generates user-specific reports as PDFs. The PDF is generated on the back end. When the user clicks the "generate" button, the page should show a blocking message; once the file is generated, the blocking message is removed and the file is displayed with tag.
I am able to do this with the help of socket.io, and to be frank I copied this solution from a blog without understanding much. My stack is Python, and socket.io runs on Node. It worked perfectly on my dev machine, but on the server it keeps throwing a connection error message in JavaScript, and the POST of the message fails on the back end. So I currently have issues with both the front end and the back end, and because I blindly copied and pasted, I am rather clueless about the fix.
Is there any alternative to this approach? I can think of using a Python-based socket.io (this time I will read before I implement), but can something similar be achieved with another approach? I am using this only to send the generated PDF back to the client; there is no other communication between client and server.
Adding code
It works like this: the client opens the download/generate page and gets an ID assigned. Once the generate button is clicked, the ID is sent to the back end. Celery then generates the PDF (converted to base64) and posts it to nodejs/notify with the ID. socket.io is already listening on the /notify URL, so once the data is posted it is displayed in the browser.
Browser -> sock (register) -> AJAX -> Celery (task) -> on nodejs/notify -> browser sockio client displays the data
Here is the code,
Client / Browser code
// func called on generate btn press
$client.on('notify', function(result) {
    $.unblockUI();
    // alert('In Receiving function...');
    var $obj = $('#dynamic-pdf');
    $obj.hide();
    $obj.attr('data', result);
    $obj.show();
    $('button#view-again').removeClass('disabled');
    $('#previe').modal('show');
});
blockWithMsg("Generating PDF, Please Wait...");
$.ajax({
    url: $url,
    data: {clientId: $clientId},
    method: 'POST',
    cache: false,
    headers: {'Content-Type': 'application/x-www-form-urlencoded; charset=utf-8'},
});
// func finished
<!-- register client on page load -->
<script src="http://localhost:3000/socket.io/socket.io.js"></script>
<script>
    var $client = io('http://localhost:3000'),
        $clientId = null;
    $client.on('register', function(id) {
        $clientId = id;
    });
</script>
Node js code
var app = require('express')(),
    server = require('http').Server(app),
    io = require('socket.io')(server),
    bodyParser = require('body-parser');
// Accept URL-encoded body in POST request.
app.use(bodyParser.json({limit: '50mb'}));
app.use(bodyParser.urlencoded({ limit: '50mb', extended: true, parameterLimit: 50000}));
app.use(require('morgan')('dev'));
// Echo the client's ID back to them when they connect.
io.on('connection', function(client) {
    client.emit('register', client.id);
});
// Forward task results to the clients who initiated them.
app.post('/notify', function(request, response) {
    var client = io.sockets.connected[request.body.clientId];
    client.emit('notify', request.body.result);
    response.type('text/plain');
    response.send('Result broadcast to client.');
});
server.listen(3000);
I am getting an "emit on undefined value" error.
(Posted on behalf of the OP).
The error was that I started the site on a different domain from the one given in the socket.io connection in the browser JavaScript (which was still localhost).
I use the NASA GSFC server to retrieve data from their archives.
I send a request and receive the response as simple text.
I discovered that they changed their page so that login is now required.
However, even after logging in I'm receiving an error.
I read the information in the thread "how do python capture 302 redirect url"
and tried to use the urllib2 and requests libraries, but I am still receiving an error.
Currently the part of my code responsible for downloading data looks as follows:
def getSampleData():
    import urllib
    # I approved application according to:
    # http://disc.sci.gsfc.nasa.gov/registration/authorizing-gesdisc-data-access-in-earthdata_login
    # Query: http://hydro1.sci.gsfc.nasa.gov/dods/_expr_{GLDAS_NOAH025SUBP_3H}{ave(rainf,time=00Z23Oct2016,time=00Z24Oct2016)}{17.00:25.25,48.75:54.50,1:1,00Z23Oct2016:00Z23Oct2016}.ascii?result
    sample_query = 'http://hydro1.sci.gsfc.nasa.gov/dods/_expr_%7BGLDAS_NOAH025SUBP_3H%7D%7Bave(rainf,time=00Z23Oct2016,time=00Z24Oct2016)%7D%7B17.00:25.25,48.75:54.50,1:1,00Z23Oct2016:00Z23Oct2016%7D.ascii?result'
    # I've tried also:
    # sock=urllib.urlopen(sample_query, urllib.urlencode({'username':'MyUserName','password':'MyPassword'}))
    # but I was still asked to provide credentials, so I simplified mentioned line to just:
    sock=urllib.urlopen(sample_query)
    print('\n\nCurrent url:\n')
    print(sock.geturl())
    print('\nIs it the same as sample query?')
    print(sock.geturl() == sample_query)
    returnedData=sock.read()
    # returnedData always stores simple page with 302. Why? StackOverflow suggests that urllib and urllib2 handle redirection automatically
    sock.close()
    with open("Output.html", "w") as text_file:
        text_file.write(returnedData)
Output.html content is as follows:
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>302 Found</title>
</head><body>
<h1>Found</h1>
<p>The document has moved here.</p>
</body></html>
If I copy-paste sample_query (from the function defined above) into a browser, I have no problem receiving the data.
Thus, if there's no hope for a solution, I'm thinking about rewriting my code to use Selenium.
It seems that I figured out how to download the data:
How to authenticate on NASA gsfc server
However, I don't know how to process the dataset.
I would like to display the output (or write it to a text file) as raw data, exactly as I see it in the browser.