How to print '<!DOCTYPE html>'? - python

My client requests a page from a server written in python 3.
The server return an html page that is presented by client.
Therefore, I did a dummy.html page and when client asks for it, my python reads it and returns it to the client:
filename = "dummy.html"
fh = open(filename, 'rt')
line = fh.readline()
while line:
print(line)
line = fh.readline()
fh.close()
However, this code does not read the <!DOCTYPE html> that is placed in the top of my dummy.html file (and thus, things like bootstrap don't work for me...).
I also tried printing it manually print('<!DOCTYPE html>') but that also does not work.
print('<!DOCTYPE html>') <---- IT IS PRINTED TO SDOUT BUT WHEN PRINTED TO CLIENT, THE PAGE DOES NOT HAVE THIS LINE ....
filename = CURRENTPATH+"\\..\\su.html"
fh = open(filename, 'rt')
line = fh.readline()
print('hello')
print('<'+'!'+'DOCTYPE html>')
while line:
print(line)
line = fh.readline()
fh.close()
How can I fix it?

It looks like you're trying to reimplement a web server in Python. Please consider using an existing web framework, such as Django (https://www.djangoproject.com/), Flask (http://flask.pocoo.org/) or Pyramid (http://www.pylonsproject.org/), which will do most of the work for you (including built-in support for a wide variety of HTML templating libraries, and actual performance).
As for your actual answer, a bare print statement prints to stdout, as expected. You need, instead, to write to the file-like object whose contents will be sent to the client (is it a socket? a file? who knows? stop reinventing the wheel).

Related

Python doesn't release file after it is closed

What I need to do is to write some messages on a .txt file, close it and send it to a server. This happens in a infinite loop, so the code should look more or less like this:
from requests_toolbelt.multipart.encoder import MultipartEncoder
num = 0
while True:
num += 1
filename = f"example{num}.txt"
with open(filename, "w") as f:
f.write("Hello")
f.close()
mp_encoder = MultipartEncoder(
fields={
'file': ("file", open(filename, 'rb'), 'text/plain')
}
)
r = requests.post("my_url/save_file", data=mp_encoder, headers=my_headers)
time.sleep(10)
The post works if the file is created manually inside my working directory, but if I try to create it and write on it through code, I receive this response message:
500 - Internal Server Error
System.IO.IOException: Unexpected end of Stream, the content may have already been read by another component.
I don't see the file appearing in the project window of PyCharm...I even used time.sleep(10) because at first, I thought it could be a time-related problem, but I didn't solve the problem. In fact, the file appears in my working directory only when I stop the code, so it seems the file is held by the program even after I explicitly called f.close(): I know the with function should take care of closing files, but it didn't look like that so I tried to add a close() to understand if that was the problem (spoiler: it was not)
I solved the problem by using another file
with open(filename, "r") as firstfile, open("new.txt", "a+") as secondfile:
secondfile.write(firstfile.read())
with open(filename, 'w'):
pass
r = requests.post("my_url/save_file", data=mp_encoder, headers=my_headers)
if r.status_code == requests.codes.ok:
os.remove("new.txt")
else:
print("File not saved")
I make a copy of the file, empty the original file to save space and send the copy to the server (and then delete the copy). Looks like the problem was that the original file was held open by the Python logging module
Firstly, can you change open(f, 'rb') to open("example.txt", 'rb'). In open, you should be passing file name not a closed file pointer.
Also, you can use os.path.abspath to show the location to know where file is written.
import os
os.path.abspath('.')
Third point, when you are using with context manager to open a file, you don't close the file. The context manger supposed to do it.
with open("example.txt", "w") as f:
f.write("Hello")

Passing Binary file over HTTP POST

I have a local python file that decodes binary files. This python file first reads from the file, opens it as binary and then saves it in a buffer and interprets it. Reading it is simply:
with open(filepath, 'rb') as f:
buff = f.read()
read_all(buff)
This works fine locally. Now I'd like to setup a Azure Python job where I can send the file, approx. 100kb, over a HTTP POST and then read the interpreted meta data which my original python script does well.
I've first removed the read function so that I'll now work with the buffer only.
In my Azure Python Job I have the following, triggered by a HttpRequest
my_data = reader.read_file(req.get_body())
To test my sending I've tried the following in python
import requests
url = 'http://localhost:7071/api/HttpTrigger'
files = {'file': open('test', 'rb')}
with open('test', 'rb') as f:
buff = f.read()
r = requests.post(url, files=files) #Try using files
r = requests.post(url, data=buff) #Try using data
I've also tried in Postman adding the file to the body as a binary and setting the headers to application/octet-stream
All this doesn't send the binary file the same way as the original f.read() did. So I'm getting a wrong interpretation of the binary file.
What is file.read doing differently to how I'm sending it over as a HTTP Body message?
Printing out the first line from the local python read file gives.
b'\n\n\xfe\xfe\x00\x00\x00\x00\\\x18,A\x18\x00\x00\x00(\x00\x00\x00\x1f\x00\x00\
Whereas printing it out at the req.get_body() shows me
b'\n\n\xef\xbf\xbd\xef\xbf\xbd\x00\x00\x00\x00\\\x18,A\x18\x00\x00\x00(\x00\x00\x00\x1f\x00\
So something is clearly wrong. Any help why this could be different?
Thanks
EDIT:
I've implemented a similar function in Flask and it works well.
The code in flask is simply grabbing the file from a POST. No encoding/decoding.
if request.method == 'POST':
f = request.files['file']
#f.save(secure_filename(f.filename))
my_data = reader.read_file(f.read())
Why is the Azure Function different?
You can try UTF-16 to decode and do the further action in your code.
Here is the code for that:
with open(path_to_file,'rb') as f:
contents = f.read()
contents = contents.rstrip("\n").decode("utf-16")
Basically after doing re.get_body, perform the below operation:
contents = contents.rstrip("\n").decode("utf-16")
See if it gives you the same output as your receive in local python file.
Hope it helps.

Read file using urllib and write adding extra characters

I have a script that regularly reads a text file on a server and over writes a copy of the text to a local copy of the text file. I have an issue of the process adding extra carriage returns and an extra invisible character after the last character. How do I make an identical copy of the server file?
I use the following to read the file
for link in links:
try:
f = urllib.urlopen(link)
myfile = f.read()
except IOError:
pass
and to write it to the local file
f = open("C:\\localfile.txt", "w")
try:
f.write(myfile)
except NameError:
pass
finally:
f.close()
This is how the file looks on the server
!http://i.imgur.com/rAnUqmJ.jpg
and this is how the file looks locally. Besides, an additional invisible character after the last 75
!http://i.imgur.com/xfs3E8D.jpg
I have seen quite a few similar questions, but not sure how to handle the urllib to read in binary
Any solution please?
If you want to copy a remote file denoted by a URL to a local file i would use urllib.urlretrieve:
import urllib
urllib.urlretrieve("http://anysite.co/foo.gz", "foo.gz")
I think urllib is reading binary.
Try changing
f = open("C:\\localfile.txt", "w")
to
f = open("C:\\localfile.txt", "wb")

Dynamically Export URL Document To Server Using Python

I'm writing a script in python and I'm trying to wrap my head around a problem. I've a URL that when opened, downloads a document. I'm trying to write a python script that opens the https URL that downloads this document, and automatically send that document to a server I have opened using python's pysftp module.
I can't wrap my head around how to do this... Do you think I'd be able to just do:
server.put(urllib.open('https://......./document'))
EDIT:
This is the code I've tried before the above doesn't work...
download_file = urllib2.urlopen('https://somewebsite.com/file.csv')
file_contents = download_file.read().replace('"', '')
columns = [x.strip() for x in file_contents.split(',')]
# Write Downloaded File Contents To New CSV File
with open('file.csv', 'wb') as f:
writer = csv.writer(f)
writer.writerow(columns)
# Upload New File To Server
srv.put('./file.csv', './SERVERFOLDER/file.csv')
ALSO:
How would I go about getting a FILE that is ONE DAY old from the server? (Examining age of each file)... using paramiko

Serving binary file from web server to client

Usually, when I want to transfer a web server text file to client, here is what I did
import cgi
print "Content-Type: text/plain"
print "Content-Disposition: attachment; filename=TEST.txt"
print
filename = "C:\\TEST.TXT"
f = open(filename, 'r')
for line in f:
print line
Works very fine for ANSI file. However, say, I have a binary file a.exe (This file is in web server secret path, and user shall not have direct access to that directory path). I wish to use the similar method to transfer. How I can do so?
What content-type I should use?
Using print seems to have corrupted content received at client side. What is the correct method?
I use the following code.
#!c:/Python27/python.exe -u
import cgi
print "Content-Type: application/octet-stream"
print "Content-Disposition: attachment; filename=jstock.exe"
print
filename = "C:\\jstock.exe"
f = open(filename, 'rb')
for line in f:
print line
However, when I compare the downloaded file with original file, it seems there is an extra whitespace (or more) for after every single line.
Agree with the above posters about 'rb' and Content-Type headers.
Additionally:
for line in f:
print line
This might be a problem when encountering \n or \r\n bytes in the binary file. It might be better to do something like this:
import sys
while True:
data = f.read(4096)
sys.stdout.write(data)
if not data:
break
Assuming this is running on windows in a CGI environment, you will want to start the python process with the -u argument, this will ensure stdout isn't in text-mode
When opening a file, you can use open(filename, 'rb') - the 'b' flag marks it as binary. For a general handler, you could use some form of mime magic (I'm not familiar with using it from Python, I've only ever used it from PHP a couple of years ago). For the specific case, .exe is application/octet-stream.
Content-type of .exe is tipically application/octet-stream.
You might want to read your file using open(filename, 'rb') where b means binary.
To avoid the whitespace problem, you could try with:
sys.stdout.write(open(filename,"rb").read())
sys.stdout.flush()
or even better, depending on the size of your file, use the Knio approach:
fo = open(filename, "rb")
while True:
buffer = fo.read(4096)
if buffer:
sys.stdout.write(buffer)
else:
break
fo.close()
For anyone using Windows Server 2008 or 2012 and Python 3, here's an update...
After many hours of experimentation I have found the following to work reliably:
import io
with io.open(sys.stdout.fileno(),"wb") as fout:
with open(filename,"rb") as fin:
while True:
data = fin.read(4096)
fout.write(data)
if not data:
break

Categories