I have a numpy array that I want to serve using Tornado, but when I try to write it using self.write(my_np_array) I just get an AssertionError.
What am I doing wrong?
File "server.py", line 28, in get
self.write(values)
File "/usr/lib/python2.7/site-packages/tornado/web.py", line 468, in write
chunk = utf8(chunk)
File "/usr/lib/python2.7/site-packages/tornado/escape.py", line 160, in utf8
assert isinstance(value, unicode)
Not exactly sure what your goal is, but if you want to get a string representation of the object you can do
self.write(str(your_object))
If you want to serve the numpy array as a python object in order to use it on a different client you need to pickle the object first
import pickle
self.write(pickle.dumps(your_object))
the object can then be retrieved with
your_object = pickle.loads(sent_object)
Keep in mind that it is dangerous to unpickle objects from an untrusted source as it can lead to malicious code execution.
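As a rough sketch of how this could look inside a handler (the handler name, port, and URL below are made up for illustration, not from the question):
import pickle
import numpy as np
import tornado.web

class PickleHandler(tornado.web.RequestHandler):
    def get(self):
        values = np.arange(5, dtype=np.float32)  # placeholder array
        self.set_header("Content-Type", "application/octet-stream")
        # pickle.dumps returns a byte string, which Tornado can write without utf8-encoding
        self.write(pickle.dumps(values))

# A Python client could then retrieve it with, e.g.:
# import urllib2, pickle
# your_object = pickle.loads(urllib2.urlopen("http://localhost:8888/pickled").read())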
Edit:
If you want to transfer a numpy array and use it within javascript you don't need a binary representation.
Just convert the numpy array to a list
your_numpy_list = your_numpy_object.tolist()
and convert it to json
import json
self.write(json.dumps(your_numpy_list))
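For context, a complete handler for this could look roughly like the following sketch (the handler name and the placeholder data are assumptions, not from the question):
import json
import numpy as np
import tornado.web

class ArrayHandler(tornado.web.RequestHandler):
    def get(self):
        values = np.linspace(0.0, 1.0, 5)  # placeholder numpy array
        self.set_header("Content-Type", "application/json")
        # tolist() converts the array to plain Python floats that json.dumps can serialize
        self.write(json.dumps(values.tolist()))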
On the JavaScript side you just parse the result string
var result = JSON.parse(resultString)
and create the typed array from it
var typedResult = new Float32Array(result)
voila!
Related
Hello, I was given an h5 file with an xdmf file associated with it. The visualization looks like this. Here the color is just the feature ID. I wanted to add some data to the h5 file to be able to visualize it in ParaView. The newly added data does not appear in ParaView, although it is clearly there when the file is inspected with HDFView. The data I'm trying to add are the ones titled engineering stress and true stress. The only difference I noticed is that the number of attributes for these is zero while it's 5 for the rest, but I don't know what to do with that information.
Here's the code I currently have set up:
nf_product = h5py.File(filename,"a")
e_princ = np.empty((280,150,280,3))
t_princ = e_princ
for i in tqdm(range(grain_count)):
    a = np.where(feature_ID == i+1)
    e_princ[a,0] = eng_stress[i,0]
    e_princ[a,1] = eng_stress[i,1]
    e_princ[a,2] = eng_stress[i,2]
    t_princ[a,0] = true_stress[i,0]
    t_princ[a,1] = true_stress[i,1]
    t_princ[a,2] = true_stress[i,2]
EngineeringStress = nf_product.create_dataset('DataContainers/nfHEDM/CellData/EngineeringStressPrinciple',data=np.float32(e_princ))
TrueStress = nf_product.create_dataset('DataContainers/nfHEDM/CellData/TrueStressPrinciple',data=np.float32(t_princ))
I am new to using h5 and xdmf files, so I may be going about this entirely wrong, but the way I understand it, an xdmf file acts as a pointer to the data in the h5 file, so I can't understand why the new data doesn't appear in ParaView.
First, did you close the file with nf_product.close()? If not, new datasets may not have been flushed from memory. You may also need to flush the buffers with nf_product.flush(). Better yet, use Python's with/as file context manager and this is done automatically.
Next, you can simply use data=e_princ (and data=t_princ); there is no need to cast a numpy array to a numpy array.
Finally, verify the values in e_princ and t_princ. I suspect they will be identical, because t_princ = e_princ makes both names reference the same numpy object. You need to create t_princ as its own empty array, the same way as e_princ. Also, the arrays have 4 dimensions, but you only use 2 indices when you populate them with [a,0]. Be sure that works as expected.
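Putting those suggestions together, a rough sketch (the dataset paths and array shapes come from the question; the file name and the fill step are placeholders):
import numpy as np
import h5py

filename = "nf_product.h5"  # placeholder path

e_princ = np.empty((280, 150, 280, 3), dtype=np.float32)
t_princ = np.empty((280, 150, 280, 3), dtype=np.float32)  # its own array, not an alias of e_princ

# ... fill e_princ and t_princ here ...

# The with-block closes (and therefore flushes) the file automatically
with h5py.File(filename, "a") as nf_product:
    nf_product.create_dataset('DataContainers/nfHEDM/CellData/EngineeringStressPrinciple', data=e_princ)
    nf_product.create_dataset('DataContainers/nfHEDM/CellData/TrueStressPrinciple', data=t_princ)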
I am doing some experiments in TensorFlow and I collect some of the intermediate measurements in numpy arrays and Python dicts, which are pickled at the end. However, when I try to load them, it takes a very long time even when the pickled file is only about 20 MB. When I click on Properties and look at its type, it shows as Binary (application/octet-stream).
Even a simple command like type(filename) takes a very long time, and running it shows the loaded object's correct type as dict. But checking the file properties, it still shows up as Binary (application/octet-stream).
How do I make them get saved as original numpy arrays instead of getting converted to this format, or how do I get back the original numpy arrays from this format?
Edit 1: all data is numeric, i.e int or float.
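Since everything is numeric, one common alternative (a sketch, not something from the question) is to keep the arrays in numpy's own .npz format instead of pickling a dict of them:
import numpy as np

measurements = {"loss": np.random.rand(1000), "accuracy": np.random.rand(1000)}  # illustrative data

# Each array is stored under its key in a single .npz archive
np.savez("measurements.npz", **measurements)

# Reloading gives an object that can be indexed like a dict of arrays
loaded = np.load("measurements.npz")
loss = loaded["loss"]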
I'm trying to convert a numpy array that was created using pylearn2 into leveldb/lmdb so that I can use it in Caffe.
This is the script that I used to created the dataset.
After running this script, couple of files are generated, among which there are test.pkl, test.npy, train.pkl, train.npy
I don't know if there is a direct way of converting to leveldb/lmdb, so assuming there is not, I need to be able to read each image and its corresponding label, so that I can then save them into a leveldb/lmdb database.
I was told I need to use the pickle file for reading since it provides a dictionary like structure. However, trying to do
import cPickle as pickle
pickle.load( open( "N:/pylearn2-master/datasets/cifar10/pylearn2_gcn_whitened/test.pkl", "rb" ) )
outputs
<pylearn2.datasets.cifar10.CIFAR10 at 0xde605f8>
and I don't know what the correct way of accessing the items in a pickle file is, or whether I need to read from the numpy array directly.
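One way to find out what the unpickled object actually contains, using only standard Python introspection (no assumptions about pylearn2's attribute names), is something like:
import cPickle as pickle

with open("N:/pylearn2-master/datasets/cifar10/pylearn2_gcn_whitened/test.pkl", "rb") as f:
    dataset = pickle.load(f)

# Print the instance attributes and their types; the array-valued ones
# are usually the image data and labels
for name, value in vars(dataset).items():
    print name, type(value)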
I have several huge arrays, and I am using np.save and np.load to save each array or dictionary in its own file and then reload it, so that I don't have to compute it again, as follows.
save(join(dir, "ListTitles.npy"), self.ListTitles)
self.ListTitles = load(join(dir,"ListTitles.npy"))
The problem is that when I try to use them afterwards, I have errors like (field name not found) or (len() of unsized object).
For example:
len(self.ListTitles), or accessing a field of a dictionary, returns an error.
I don't know how to resolve this. Because when I simply use this code, it works perfectly:
M = array([[1,2,0], [3,4,0], [3,0,1]])
vector = zeros(3529)
save("M.npy", M)
save("vector.npy", vector)
vector = load("vector.npy")
B = load("M.npy")
print len(B)
print len(vector)
numpy's save and load functions are for numpy arrays, not for general Python data like dicts. Use the pickle module to save to file, and reload from file, most kinds of Python data structures (there are alternatives like dill which are however not in the standard library -- I'd recommend sticking with standard pickle unless it gives you specific problems).
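A minimal sketch of that pickle round trip for a dict (the file name and contents are illustrative):
import pickle

data = {"titles": ["a", "b", "c"], "counts": [1, 2, 3]}  # any picklable structure

# Save to disk
with open("ListTitles.pkl", "wb") as f:
    pickle.dump(data, f)

# Reload later
with open("ListTitles.pkl", "rb") as f:
    data = pickle.load(f)

print len(data["titles"])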
I have some single-precision little-endian unformatted data files written by Fortran77. I am reading these files using Python using the following commands:
import numpy as np
original_data = np.dtype('float32')
f = open(file_name,'rb')
original_data = np.fromfile(f,dtype='float32',count=-1)
f.close()
After some data manipulation in Python, I (am trying to) write them back in the original format using Python using the following commands:
import struct
out_file = open(output_file,"wb")
s = struct.pack('f'*len(manipulated_data), *manipulated_data)
out_file.write(s)
out_file.close()
But it doesn't seem to be working. Any ideas on the right way to write the data back to the original Fortran unformatted format using Python?
Details of the problem:
I am able to read the final file with manipulated data from Fortran. However, I want to visualize these data using a software package (ParaView). For this I convert the unformatted data files into the *.h5 format. I am able to convert both the original and manipulated data to h5 format using h5 utilities. But while ParaView is able to read the *.h5 files created from the original data, it is not able to read the *.h5 files created from the manipulated data. I am guessing something is being lost in translation.
This is how I am opening the file written by Python in Fortran (single precision data):
open (in_file_id,FILE=in_file,form='unformatted',access='direct',recl=4*n*n*n)
And this is how I am writing the original unformatted data from Fortran:
open(out_file_id,FILE=out_file,form="unformatted")
Is this information sufficient?
Have you tried using the .tofile method of the manipulated data array? It will write the array in C order but is capable of writing plain binary.
The documentation for .tofile also suggests this is the same as:
with open(outfile, 'wb') as fout:
    fout.write(manipulated_data.tostring())
this is creating an unformatted sequential access file:
open(out_file_id,FILE=out_file,form="unformatted")
Assuming you are writing a single array real a(n,n,n) using simply write(out_file_id)a, you should see a file size of 4*n^3+8 bytes. The extra 8 bytes are a 4-byte integer (=4*n^3) repeated at the start and end of the record.
the second form:
open (in_file_id,FILE=in_file,form='unformatted',access='direct',recl=4*n*n*n)
opens direct access, which does not have those headers. For writing you'd now have write(unit,rec=1)a. If you read your sequential-access file using direct access it will read without error, but you'll get that integer header read as a float (garbage) as the (1,1,1) array value, and then everything else is shifted. You say you can read it with Fortran, but are you checking that you are really reading what you expect?
The best fix is to change your original Fortran code to use unformatted, direct access for both reading and writing. This gives you an 'ordinary' raw binary file with no headers.
Alternately, in your Python you need to first read that 4-byte integer, then your data. On output you can put the integer headers back or not, depending on what your ParaView filter is expecting.
---------- Here is Python to read/modify/write an unformatted sequential Fortran file containing a single record:
import struct
import numpy as np
f=open('infile','rb')
recl=struct.unpack('i',f.read(4))[0]       # leading record marker: record length in bytes
numval=recl/np.dtype('float32').itemsize   # number of float32 values in the record
data=np.fromfile(f,dtype='float32',count=numval)
endrec=struct.unpack('i',f.read(4))[0]     # trailing record marker
if endrec != recl: print "error unexpected end rec"
f.close()
f=open('outfile','wb')
f.write(struct.pack('i',recl))             # leading record marker
for i in range(0,len(data)): data[i] = data[i]**2  # example data modification
data.tofile(f)
f.write(struct.pack('i',recl))             # trailing record marker
f.close()
Just loop for multiple records. Note that the data here is read as a vector and assumed to be all floats. Of course, you need to know the actual data type to make use of it.
Also be aware you may need to deal with byte order issues depending on platform.