Sending back a file stream from a gRPC Python server

I have a service that needs to return a file stream to the calling client, so I have created this proto file:
service Sample {
  rpc getSomething(Request) returns (stream Response) {}
}

message Request {
}

message Response {
  bytes data = 1;
}
When the server receives this, it needs to read some source.txt file and then write it back to the client
as a byte stream. Is this the proper way to do it in a Python gRPC server?
file_name = "source.txt"
with open(file_name, 'r') as content_file:
    content = content_file.read()
response.data = content.encode()
yield response
I cannot find any examples related to this.

That looks mostly correct, though it's hard to be sure without seeing the rest of your service-side code. A few tweaks I'd suggest: (1) read the file as binary content in the first place, (2) exit the with statement as early as possible, (3) construct the response message only after you've constructed the value of its data field, and (4) make the file name a module-scope, module-private constant. Something like:
with open(_CONTENT_FILE_NAME, 'rb') as content_file:
    content = content_file.read()
yield my_generated_module_pb2.Response(data=content)
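Put together, and assuming generated modules named Sample_pb2 and Sample_pb2_grpc (those names are a guess based on your proto), the whole servicer method might look like this untested sketch:

import Sample_pb2
import Sample_pb2_grpc

_CONTENT_FILE_NAME = 'source.txt'

class SampleServicer(Sample_pb2_grpc.SampleServicer):
    def getSomething(self, request, context):
        # Read the file as binary and close it before yielding the response.
        with open(_CONTENT_FILE_NAME, 'rb') as content_file:
            content = content_file.read()
        yield Sample_pb2.Response(data=content)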
What do you think?

One option would be to lazily read in the binary and yield each chunk. Note, this is untested code:
def read_bytes(file_, num_bytes):
    # Lazily yield the file in num_bytes-sized chunks; the final,
    # possibly shorter chunk is yielded too.
    while True:
        chunk = file_.read(num_bytes)
        if not chunk:
            break
        yield chunk

class ResponseStreamer(Sample_pb2_grpc.SampleServicer):
    def getSomething(self, request, context):
        with open('test.bin', 'rb') as f:
            # A realistic chunk size would be much larger, e.g. 64 KiB.
            for rec in read_bytes(f, 4):
                yield Sample_pb2.Response(data=rec)
Downside is that you'll have the file opened while the stream is open.
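For what it's worth, the client side of a response-streaming RPC just iterates over the returned messages, so reassembling the file could look like this (untested sketch; the channel address and output path are placeholders):

import grpc
import Sample_pb2
import Sample_pb2_grpc

channel = grpc.insecure_channel('localhost:50051')  # address is a placeholder
stub = Sample_pb2_grpc.SampleStub(channel)

with open('copy.bin', 'wb') as out:
    # The stub call returns an iterator of Response messages.
    for response in stub.getSomething(Sample_pb2.Request()):
        out.write(response.data)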

Related

How to save an image sent via Flask send_file

I have this code for the server:
@app.route('/get', methods=['GET'])
def get():
    return send_file("token.jpg", attachment_filename="token.jpg", mimetype='image/jpg')
and this code for getting the response:
r = requests.get(url + '/get')
I need to save the file from the response to the hard drive, but I can't use r.files. What do I need to do in this situation?
Assuming the GET request is valid, you can use Python's built-in function open to open a file in binary mode and write the returned content to disk. Example below.
import requests

file_content = requests.get('http://yoururl/get')
# Open in binary ("wb") mode so the bytes reach the disk unchanged.
with open("sample_image.png", "wb") as save_file:
    save_file.write(file_content.content)
As you can see, to write the image to disk, we use open, and write the returned content to 'sample_image.png'. Since your server-side code seems to be returning only one file, the example above should work for you.
You can set the stream parameter and extract the filename from the HTTP headers. Then the raw data from the undecoded body can be read and saved chunk by chunk.
import os
import re
import requests

resp = requests.get('http://127.0.0.1:5000/get', stream=True)
# Recover the original filename from the Content-Disposition header.
name = re.findall('filename=(.+)', resp.headers['Content-Disposition'])[0]
dest = os.path.join(os.path.expanduser('~'), name)
with open(dest, 'wb') as fp:
    while True:
        chunk = resp.raw.read(1024)
        if not chunk:
            break
        fp.write(chunk)
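As an aside, requests can do the chunking for you with iter_content, which reads the body in pieces without buffering it all at once; a sketch under the same assumptions about the URL and header:

import os
import re
import requests

resp = requests.get('http://127.0.0.1:5000/get', stream=True)
name = re.findall('filename=(.+)', resp.headers['Content-Disposition'])[0]
dest = os.path.join(os.path.expanduser('~'), name)
with open(dest, 'wb') as fp:
    # iter_content yields successive chunks of the body as they arrive.
    for chunk in resp.iter_content(chunk_size=1024):
        fp.write(chunk)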

Python sockets: send file function explanation

I am trying to learn Python socket programming (networks) and I have a function that sends a file, but I am not fully understanding each line of it. Could someone please explain, line by line, what this function is doing? I am also unsure why it needs the length of the data to send the file. If you can see any improvements to this function as well, thanks.
def send_file(socket, filename):
    with open(filename, "rb") as x:
        data = x.read()
    length_data = len(data)
    try:
        socket.sendall(length_data.to_bytes(20, 'big'))
        socket.sendall(data)
    except Exception as error:
        print("Error", error)
This protocol is written to first send the size of the file and then its data. The size is 20 octets (bytes) serialized as "big endian", meaning the most significant byte is sent first. Suppose you had a 1M file,
>>> val = 1000000
>>> val.to_bytes(20, 'big')
b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x0fB@'
The program sends those 20 bytes and then the file payload itself. The receiver would read the 20 octets, convert that back into a file size and would know exactly how many more octets to receive for the file. TCP is a streaming protocol. It has no idea of message boundaries. Protocols need some way of telling the other side how much data makes up a message and this is a common way to do it.
As an aside, this code has the serious problem that it reads the entire file in one go. If the file were huge, this code would crash.
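A chunked variant avoids that; the only extra requirement is getting the file size up front, which os.path.getsize provides. An untested sketch:

import os

def send_file_chunked(sock, filename, chunk_size=64 * 1024):
    # Send the 20-byte big-endian length header first...
    file_size = os.path.getsize(filename)
    sock.sendall(file_size.to_bytes(20, 'big'))
    # ...then stream the payload in chunks so the whole file never
    # has to sit in memory at once.
    with open(filename, 'rb') as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            sock.sendall(chunk)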
A receiver would look something like the following. This is a rudimentary example.
import io

def recv_exact(skt, count):
    # Read exactly count bytes, or return b"" if the socket closes early.
    buf = io.BytesIO()
    while count:
        data = skt.recv(count)
        if not data:
            return b""  # socket closed
        buf.write(data)
        count -= len(data)
    return buf.getvalue()

def recv_file(skt):
    # First the 20-byte big-endian length header...
    data = recv_exact(skt, 20)
    if not data:
        return b""
    file_size = int.from_bytes(data, "big")
    # ...then exactly that many payload bytes.
    file_bytes = recv_exact(skt, file_size)
    return file_bytes
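To see the framing work end to end, you can exercise send_file and recv_file over socket.socketpair, with no real network involved (a sketch; 'test.bin' is a placeholder file name):

import socket
import threading

a, b = socket.socketpair()
# Run the sender in a thread so the receiver can consume concurrently.
t = threading.Thread(target=send_file, args=(a, 'test.bin'))
t.start()
received = recv_file(b)
t.join()
print(len(received), 'bytes received')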

Passing a binary file over HTTP POST

I have a local Python file that decodes binary files. It first opens the file as binary, reads it into a buffer, and interprets it. Reading it is simply:
with open(filepath, 'rb') as f:
    buff = f.read()
    read_all(buff)
This works fine locally. Now I'd like to set up an Azure Python job where I can send the file, approx. 100 KB, over an HTTP POST and then read the interpreted metadata, which my original Python script handles well.
I've removed the read function so that I now work with the buffer only.
In my Azure Python job I have the following, triggered by an HttpRequest:
my_data = reader.read_file(req.get_body())
To test the sending I've tried the following in Python:
import requests

url = 'http://localhost:7071/api/HttpTrigger'
files = {'file': open('test', 'rb')}
with open('test', 'rb') as f:
    buff = f.read()

r = requests.post(url, files=files)  # try using files
r = requests.post(url, data=buff)    # try using data
I've also tried, in Postman, adding the file to the body as binary and setting the headers to application/octet-stream.
None of this sends the binary file the same way as the original f.read() did, so I'm getting a wrong interpretation of the binary file.
What is file.read doing differently from how I'm sending the file over as an HTTP body?
Printing out the first line from the local Python read gives:
b'\n\n\xfe\xfe\x00\x00\x00\x00\\\x18,A\x18\x00\x00\x00(\x00\x00\x00\x1f\x00\x00\
Whereas printing it out at req.get_body() shows me:
b'\n\n\xef\xbf\xbd\xef\xbf\xbd\x00\x00\x00\x00\\\x18,A\x18\x00\x00\x00(\x00\x00\x00\x1f\x00\
So something is clearly wrong. Any idea why this could be different? Thanks.
EDIT:
I've implemented a similar function in Flask and it works well.
The code in Flask simply grabs the file from a POST. No encoding/decoding:
if request.method == 'POST':
    f = request.files['file']
    # f.save(secure_filename(f.filename))
    my_data = reader.read_file(f.read())
Why is the Azure Function different?
You can try decoding as UTF-16 and doing the further processing in your code.
Here is the code for that (note rstrip needs a bytes argument, since contents is bytes):
with open(path_to_file, 'rb') as f:
    contents = f.read()
contents = contents.rstrip(b"\n").decode("utf-16")
Basically, after doing req.get_body(), perform the operation below:
contents = contents.rstrip(b"\n").decode("utf-16")
See if it gives you the same output as you receive in the local Python file.
Hope it helps.
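Another thing worth trying, for completeness: post the raw bytes with an explicit binary content type, so nothing along the way tries to treat the payload as text (a sketch reusing the URL from the question):

import requests

url = 'http://localhost:7071/api/HttpTrigger'
with open('test', 'rb') as f:
    buff = f.read()

# Mark the body explicitly as raw binary.
r = requests.post(url, data=buff,
                  headers={'Content-Type': 'application/octet-stream'})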

Downloading a large file in chunks with gzip encoding (Python 3.4)

If I make a request for a file and specify encoding of gzip, how do I handle that?
Normally when I have a large file I do the following:
while True:
    chunk = resp.read(CHUNK)
    if not chunk:
        break
    writer.write(chunk)
    writer.flush()
where CHUNK is some size in bytes, writer is a file object returned by open(), and resp is the response generated by a urllib request.
So it's pretty simple most of the time: when the response header reports 'gzip' as the content encoding, I would do the following:
decomp = zlib.decompressobj(16 + zlib.MAX_WBITS)
data = decomp.decompress(resp.read())
writer.write(data)
writer.flush()
or this:
f = gzip.GzipFile(fileobj=buf)
writer.write(f.read())
where buf is a BytesIO().
If I try to decompress the gzip response as the chunks arrive, though, I am getting issues:
while True:
    chunk = resp.read(CHUNK)
    if not chunk:
        break
    decomp = zlib.decompressobj(16 + zlib.MAX_WBITS)
    data = decomp.decompress(chunk)
    writer.write(data)
    writer.flush()
Is there a way I can decompress the gzip data as it comes down in little chunks, or do I need to write the whole file to disk, decompress it, and then move it to the final file name? Part of the issue, using 32-bit Python, is that I can get out-of-memory errors.
Thank you
I think I found a solution that I wish to share.
def _chunk(response, size=4096):
    """ downloads a web response in pieces """
    method = response.headers.get("content-encoding")
    if method == "gzip":
        # Create the decompressor once, outside the read loop, so its
        # state carries over from one chunk to the next.
        d = zlib.decompressobj(16 + zlib.MAX_WBITS)
        b = response.read(size)
        while b:
            yield d.decompress(b)
            b = response.read(size)
    else:
        while True:
            chunk = response.read(size)
            if not chunk:
                break
            yield chunk
If anyone has a better solution, please add to it. Basically my error was the creation of the zlib.decompressobj(): I was creating it in the wrong place (inside the read loop, so its state was reset on every chunk).
This seems to work in both Python 2 and 3 as well, so that's a plus.
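For reference, the generator would be consumed like this (a sketch assuming Python 3's urllib and a placeholder URL):

import urllib.request

resp = urllib.request.urlopen('http://example.com/large.gz')
with open('large_file', 'wb') as writer:
    # _chunk yields decompressed (or raw) pieces as they arrive.
    for piece in _chunk(resp):
        writer.write(piece)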

Stream a file to the HTTP response in Pylons

I have a Pylons controller action that needs to return a file to the client. (The file is outside the web root, so I can't just link directly to it.) The simplest way is, of course, this:
with open(filepath, 'rb') as f:
    response.write(f.read())
That works, but it's obviously inefficient for large files. What's the best way to do this? I haven't been able to find any convenient methods in Pylons to stream the contents of the file. Do I really have to write the code to read a chunk at a time myself from scratch?
The correct tool to use is shutil.copyfileobj, which copies from one file object to another a chunk at a time.
Example usage:
import shutil

with open(filepath, 'rb') as f:
    # Note the binary mode; copies a buffer-sized chunk at a time, and an
    # explicit buffer size can be passed as a third argument.
    shutil.copyfileobj(f, response)
This will not result in very large memory usage, and does not require implementing the code yourself.
The usual care with exceptions should be taken: if you handle signals (such as SIGCHLD), you have to handle EINTR because the writes to response could be interrupted, and IOError/OSError can occur for various reasons when doing I/O.
I finally got it to work using the FileApp class, thanks to Chris AtLee and THC4k (from this answer). This method also allowed me to set the Content-Length header, something Pylons has a lot of trouble with, which enables the browser to show an estimate of the time remaining.
Here's the complete code:
def _send_file_response(self, filepath):
    user_filename = '_'.join(filepath.split('/')[-2:])
    file_size = os.path.getsize(filepath)
    headers = [('Content-Disposition', 'attachment; filename="' + user_filename + '"'),
               ('Content-Type', 'text/plain'),
               ('Content-Length', str(file_size))]
    from paste.fileapp import FileApp
    fapp = FileApp(filepath, headers=headers)
    return fapp(request.environ, self.start_response)
The key here is that WSGI, and Pylons by extension, works with iterable responses. So you should be able to write some code like (warning, untested code below!):
def file_streamer():
    with open(filepath, 'rb') as f:
        while True:
            block = f.read(4096)
            if not block:
                break
            yield block

response.app_iter = file_streamer()
Also, paste.fileapp.FileApp is designed to be able to return file data for you, so you can also try:
return FileApp(filepath)
in your controller method.
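One more option worth knowing about: PEP 333 defines an optional wsgi.file_wrapper hook that lets the server stream a file using its most efficient mechanism (e.g. sendfile) when it offers one. An untested sketch that falls back to the generator approach otherwise:

def stream_file(filepath):
    f = open(filepath, 'rb')
    file_wrapper = request.environ.get('wsgi.file_wrapper')
    if file_wrapper is not None:
        # Let the WSGI server stream the file itself.
        return file_wrapper(f, 4096)
    # Fallback: yield 4096-byte blocks until read() returns b''.
    return iter(lambda: f.read(4096), b'')

response.app_iter = stream_file(filepath)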
