numpy.fromfile seems to be unable to read large files

I wanted to write a very simple Python helper tool for my project that reads binary data from an ECG record. I read that numpy.fromfile is the most appropriate tool for this, so I wrote:
#!/usr/bin/env python3
import sys
import numpy as np
arrayOfNums = np.fromfile(sys.argv[1], 'short')
print("Converting " + sys.argv[1] + "...")
conversionOutput = open("output", "x")
conversionOutput.write(np.array2string(arrayOfNums, separator=' '))
conversionOutput.close()
print("Conversion done.")
I did that to write out the data, which consists of unseparated 2-byte records. The input file is somewhat large for a simple text file (over 7 MB), but not large enough, I think, to cause NumPy trouble.
The output I got in the file: [-32243 -32141 -32666 ... -32580 -32635 -32690]
Why the dots in between? It seems to convert fine, but it omits almost everything it is supposed to save. Any help would be appreciated.

NumPy reads your file correctly. To avoid a very long display, NumPy abbreviates large arrays with dots when it converts them to text:
import numpy as np
a = np.random.random(10000)
Output:
>>> a
array([0.20902653, 0.80097215, 0.06909818, ..., 0.5963183 , 0.94024005,
0.31870234])
>>> a.shape
(10000,)
a contains 10000 values, not just the 6 that are displayed.
Update
To display the full output (np.array2string uses the same print options by default):
import sys
np.set_printoptions(threshold=sys.maxsize)
print(a)
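Since the goal in the question is to write every sample to a text file, np.savetxt may be a better fit than converting the array to a display string. The following is a minimal sketch under assumptions: the file names ecg.bin and output.txt and the synthetic samples are made up for illustration and stand in for the real ECG record.

```python
import numpy as np

# Hypothetical stand-in for the ECG record: 5000 two-byte samples.
samples = np.arange(-2500, 2500, dtype=np.int16)
samples.tofile("ecg.bin")

# Read the raw 16-bit values back, as in the question.
array_of_nums = np.fromfile("ecg.bin", dtype=np.int16)

# savetxt writes every element; fmt="%d" keeps them as plain integers.
np.savetxt("output.txt", array_of_nums, fmt="%d")

# array2string also keeps everything once the threshold exceeds the size.
full_text = np.array2string(array_of_nums, separator=" ",
                            threshold=array_of_nums.size + 1)
```

np.savetxt writes one value per line here and is not affected by the print options, so nothing is abbreviated.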

Related

How to create a .png file using python?

I am trying to create a .png file from a uint8 array, but I am not getting the expected result. The file is 908 bytes but should be 905 bytes. When I try to open the image in MS Paint, it says "This is not a valid bitmap file". The same array works for me when I use Node.js. Here is the code:
import io
import numpy as np
arr =[137,80,78,71,13,10,26,10,0,0,0,13,73,72,68,82,0,0,0,200,0,0,0,200,8,6,0,0,0,173,88,174,158,0,0,0,4,115,66,73,84,8,8,8,8,124,8,100,136,0,0,0,9,112,72,89,115,0,0,11,19,0,0,11,19,1,0,154,156,24,0,0,3,43,73,68,65,84,120,156,237,221,193,110,163,48,20,64,209,206,168,255,255,203,51,251,44,174,21,192,177,33,231,236,41,52,237,149,37,30,14,63,63,0,0,0,0,0,0,0,0,0,0,19,253,57,120,220,191,75,175,98,236,232,117,114,15,219,254,63,253,157,121,21,112,119,2,129,32,16,8,2,129,32,16,8,2,129,32,16,8,191,147,126,174,185,5,239,184,250,255,229,178,185,138,21,4,130,64,32,8,4,130,64,32,8,4,130,64,32,8,4,130,64,32,8,4,130,64,32,8,4,194,172,103,177,222,245,233,61,201,239,58,251,172,208,236,223,111,247,235,59,107,217,179,125,86,16,8,2,129,32,16,8,2,129,32,16,8,2,129,32,16,8,71,231,32,159,190,47,61,251,124,103,231,0,163,227,103,207,41,102,159,127,247,207,255,213,101,215,107,5,129,32,16,8,2,129,32,16,8,2,129,32,16,8,2,129,176,203,126,144,179,102,207,1,70,86,239,199,88,253,93,200,171,63,255,105,172,32,16,4,2,65,32,16,4,2,65,32,16,4,2,65,32,16,158,50,7,89,125,159,253,238,251,49,206,218,253,250,14,179,130,64,16,8,4,129,64,16,8,4,129,64,16,8,4,129,64,184,203,28,228,238,239,175,56,59,39,89,189,223,98,247,207,127,26,43,8,4,129,64,16,8,4,129,64,16,8,4,129,64,16,8,132,199,62,199,255,97,171,223,15,226,239,56,137,21,4,130,64,32,8,4,130,64,32,8,4,130,64,32,8,4,194,183,236,7,121,250,126,139,213,251,77,86,127,126,211,88,65,32,8,4,130,64,32,8,4,130,64,32,8,4,130,64,32,236,50,7,217,125,63,197,234,227,207,90,125,254,145,109,175,207,10,2,65,32,16,4,2,65,32,16,4,2,65,32,16,4,2,97,245,253,239,171,124,251,156,98,245,241,35,179,247,243,76,99,5,129,32,16,8,2,129,32,16,8,2,129,32,16,8,2,129,176,203,28,100,246,125,252,145,217,63,223,249,207,157,127,25,43,8,4,129,64,16,8,4,129,64,16,8,4,129,64,16,8,132,109,239,63,191,184,251,126,135,167,255,252,213,199,79,99,5,129,32,16,8,2,129,32,16,8,2,129,32,16,8,2,129,112,151,57,200,200,234,253,32,171,143,31,121,202,223,249,227,172,32,16,4,2,65,32,16,4,2,65,32,16,4,2,65,32,16,
238,242,158,244,145,213,239,175,152,109,245,251,73,86,179,31,4,118,36,16,8,2,129,32,16,8,2,129,32,16,8,2,129,176,203,28,100,228,233,251,37,118,127,255,198,215,206,97,172,32,16,4,2,65,32,16,4,2,65,32,16,4,2,65,32,16,118,153,131,220,253,62,253,232,252,187,127,111,214,89,219,190,223,227,44,43,8,4,129,64,16,8,4,129,64,16,8,4,129,64,16,8,132,93,230,32,171,239,227,143,172,126,15,249,200,234,57,195,234,243,79,99,5,129,32,16,8,2,129,32,16,8,2,129,32,16,8,2,129,112,116,14,242,233,231,255,87,239,167,120,250,123,216,71,118,159,83,189,186,236,243,176,130,64,16,8,4,129,64,16,8,4,129,64,16,8,4,129,64,56,122,127,252,177,223,131,196,35,152,131,192,39,8,4,130,64,32,8,4,130,64,32,8,4,130,64,32,8,4,130,64,32,8,4,130,64,32,204,250,110,222,171,247,48,123,182,235,217,182,221,243,110,5,129,32,16,8,2,129,32,16,8,2,129,32,16,8,2,1,0,0,0,0,0,0,0,0,56,239,63,169,44,80,125,75,214,2,231,0,0,0,0,73,69,78,68,174,66,96,130]
arr = np.array(arr, dtype=np.uint8)
np.set_printoptions(formatter={'int':hex})
arr=np.array(arr)
f=open("QR icon.png","w")
f.write(arr)
f.close()
Also, when I open the created image in Notepad, there is an extra space that is not there in the file I created using Node. I think I am creating the file the wrong way. Please help.

Okay, first, your code didn't work for me; I ran into a few small errors. Fixing them solved your initial problem:
import io
import numpy as np
arr =[137,80,78,71,13,10,26,10,0,0,0,13,73,72,68,82,0,0,0,200,0,0,0,200,8,6,0,0,0,173,88,174,158,0,0,0,4,115,66,73,84,8,8,8,8,124,8,100,136,0,0,0,9,112,72,89,115,0,0,11,19,0,0,11,19,1,0,154,156,24,0,0,3,43,73,68,65,84,120,156,237,221,193,110,163,48,20,64,209,206,168,255,255,203,51,251,44,174,21,192,177,33,231,236,41,52,237,149,37,30,14,63,63,0,0,0,0,0,0,0,0,0,0,19,253,57,120,220,191,75,175,98,236,232,117,114,15,219,254,63,253,157,121,21,112,119,2,129,32,16,8,2,129,32,16,8,2,129,32,16,8,191,147,126,174,185,5,239,184,250,255,229,178,185,138,21,4,130,64,32,8,4,130,64,32,8,4,130,64,32,8,4,130,64,32,8,4,130,64,32,8,4,194,172,103,177,222,245,233,61,201,239,58,251,172,208,236,223,111,247,235,59,107,217,179,125,86,16,8,2,129,32,16,8,2,129,32,16,8,2,129,32,16,8,71,231,32,159,190,47,61,251,124,103,231,0,163,227,103,207,41,102,159,127,247,207,255,213,101,215,107,5,129,32,16,8,2,129,32,16,8,2,129,32,16,8,2,129,176,203,126,144,179,102,207,1,70,86,239,199,88,253,93,200,171,63,255,105,172,32,16,4,2,65,32,16,4,2,65,32,16,4,2,65,32,16,158,50,7,89,125,159,253,238,251,49,206,218,253,250,14,179,130,64,16,8,4,129,64,16,8,4,129,64,16,8,4,129,64,184,203,28,228,238,239,175,56,59,39,89,189,223,98,247,207,127,26,43,8,4,129,64,16,8,4,129,64,16,8,4,129,64,16,8,132,199,62,199,255,97,171,223,15,226,239,56,137,21,4,130,64,32,8,4,130,64,32,8,4,130,64,32,8,4,194,183,236,7,121,250,126,139,213,251,77,86,127,126,211,88,65,32,8,4,130,64,32,8,4,130,64,32,8,4,130,64,32,236,50,7,217,125,63,197,234,227,207,90,125,254,145,109,175,207,10,2,65,32,16,4,2,65,32,16,4,2,65,32,16,4,2,97,245,253,239,171,124,251,156,98,245,241,35,179,247,243,76,99,5,129,32,16,8,2,129,32,16,8,2,129,32,16,8,2,129,176,203,28,100,246,125,252,145,217,63,223,249,207,157,127,25,43,8,4,129,64,16,8,4,129,64,16,8,4,129,64,16,8,132,109,239,63,191,184,251,126,135,167,255,252,213,199,79,99,5,129,32,16,8,2,129,32,16,8,2,129,32,16,8,2,129,112,151,57,200,200,234,253,32,171,143,31,121,202,223,249,227,172,32,16,4,2,65,32,16,4,2,65,32,16,4,2,65,32,16,
238,242,158,244,145,213,239,175,152,109,245,251,73,86,179,31,4,118,36,16,8,2,129,32,16,8,2,129,32,16,8,2,129,176,203,28,100,228,233,251,37,118,127,255,198,215,206,97,172,32,16,4,2,65,32,16,4,2,65,32,16,4,2,65,32,16,118,153,131,220,253,62,253,232,252,187,127,111,214,89,219,190,223,227,44,43,8,4,129,64,16,8,4,129,64,16,8,4,129,64,16,8,132,93,230,32,171,239,227,143,172,126,15,249,200,234,57,195,234,243,79,99,5,129,32,16,8,2,129,32,16,8,2,129,32,16,8,2,129,112,116,14,242,233,231,255,87,239,167,120,250,123,216,71,118,159,83,189,186,236,243,176,130,64,16,8,4,129,64,16,8,4,129,64,16,8,4,129,64,56,122,127,252,177,223,131,196,35,152,131,192,39,8,4,130,64,32,8,4,130,64,32,8,4,130,64,32,8,4,130,64,32,8,4,130,64,32,204,250,110,222,171,247,48,123,182,235,217,182,221,243,110,5,129,32,16,8,2,129,32,16,8,2,129,32,16,8,2,1,0,0,0,0,0,0,0,0,56,239,63,169,44,80,125,75,214,2,231,0,0,0,0,73,69,78,68,174,66,96,130]
arr = np.array(arr, dtype=np.uint8)
np.set_printoptions(formatter={'int':hex})
arr=np.array(arr)
f=open("QR icon.png","wb")
f.write(arr.tobytes())
f.close()
The differences are the "wb" mode, so that the file is written as binary, and arr.tobytes() (the current name for the deprecated arr.tostring()). When I checked the file properties, the size was 905 bytes.
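As a cross-check that binary mode writes exactly one byte per value, here is a small sketch that does not need NumPy at all. It assumes Python 3, uses only the first eight values of the array from the question (the PNG signature), and the file name signature_demo.bin is made up for illustration.

```python
import os

# First eight values of the array from the question: the PNG signature.
png_signature = [137, 80, 78, 71, 13, 10, 26, 10]

# bytes() turns a list of integers in 0..255 into raw bytes, one byte each.
payload = bytes(png_signature)

# "wb" writes the bytes unchanged; text mode may translate newline bytes.
with open("signature_demo.bin", "wb") as f:
    f.write(payload)

size_on_disk = os.path.getsize("signature_demo.bin")
```

If the size on disk differs from the number of values, the file was most likely opened in text mode somewhere along the way.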

h5py open file with unknown datasets

I am trying to use h5py to open a file that was created by another program. Unfortunately, I don't know the inner structure of the file. All I know is that it should contain a 20x20 matrix, which I would like to process with numpy.
Here is what I have done so far:
import numpy
import h5py
f = h5py.File('example.hdf5')
print(f.keys())
The result is as follows:
KeysViewWithLock(<HDF5 file "example.hdf5" (mode r+)>)
How do I go from here? I want to access the matrix as a single numpy.ndarray. The h5py documentation always talks about creating hdf5 files, not reading unknown files.
Thanks a lot.
SOLUTION (thanks to akash karothiya)
use print(list(f.keys())) instead. That gives the names of groups/datasets which can then be accessed as a=f['dataset'].

Ok, as mentioned before akash karothiya helped me find the solution.
Instead of print(f.keys()) use print(list(f.keys())). This returns ['dataset'].
Using this information I can get an h5py dataset object which I then converted into a numpy array as follows:
a = f['dataset']
b = numpy.zeros(numpy.shape(a), dtype=complex)
for i in range(numpy.size(a, 0)):
    b[i, :] = numpy.asarray(a[i]['real'] + 1j*a[i]['imag'], dtype=complex)
UPDATE:
New version without a for loop, potentially faster and more versatile (works for both complex and real data, and for cubes with dimensions NxMxO as well):
a = f['dataset']
if len(a.dtype) == 0:
    b = numpy.squeeze(a[()])
elif len(a.dtype) == 2:
    b = numpy.squeeze(a[()]['real'] + 1.0j*a[()]['imag'])
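The field-combining step can be tried without an HDF5 file. The sketch below is an illustration under assumptions: a NumPy structured array with 'real' and 'imag' fields stands in for what f['dataset'][()] returns for a compound dataset, and to_ndarray is a hypothetical helper name, not part of h5py or NumPy.

```python
import numpy as np

# Hypothetical stand-in for f['dataset'][()]: a compound dtype with
# 'real' and 'imag' fields, as in the answer above.
compound = np.dtype([("real", "f8"), ("imag", "f8")])
raw = np.zeros((2, 3), dtype=compound)
raw["real"] = [[1, 2, 3], [4, 5, 6]]
raw["imag"] = [[0, 1, 0], [-1, 0, 2]]

def to_ndarray(values):
    """Return a plain ndarray, combining real/imag fields if present."""
    names = values.dtype.names  # None for non-compound dtypes
    if names is None:
        return np.squeeze(values)
    if set(names) == {"real", "imag"}:
        return np.squeeze(values["real"] + 1j * values["imag"])
    raise ValueError(f"unexpected fields: {names}")

b = to_ndarray(raw)
```

Checking the field names explicitly, rather than only counting them, makes the helper fail loudly on a compound type it does not understand.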

Python error : index out of bounds

I was curious about image processing with Python, so I found this great library, imageio.
I tried to manipulate the pixels of a picture and save them in a new file,
but I had some problems with the loops.
This is what the code looks like (originally posted as an image),
and this is the error that I got:
IndexError: index 3507 is out of bounds for axis 0 with size 3507
The code:
# -*- coding: iso-8859-1 -*-
import imageio
import numpy as np
im = imageio.imread("JAFFRE009a.png")
taille=im.shape #taille is a tuple (Width,Height)
print taille # (4961,3507)
matrice_pixels=open("matrice.txt",'w')
for i in range(taille[1]):
    line=""
    for j in range(taille[0]):
        line+=repr(im[i][j])
    matrice_pixels.write(line+'\n')
matrice_pixels.close()

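Because your image is not square, the loop bounds have to match the axes. im.shape is (rows, columns), so in im[i][j] the index i must run over taille[0] and j over taille[1]; in your code the two ranges are swapped.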
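To illustrate the axis order on something small, here is a sketch with a made-up 4x3 array standing in for the image (an assumption for illustration; the real picture is 4961x3507). It shows the matching loop bounds and reproduces the IndexError when the column index runs past the row length.

```python
import numpy as np

# Hypothetical grayscale image: 4 rows (height) by 3 columns (width).
im = np.arange(12, dtype=np.uint8).reshape(4, 3)
n_rows, n_cols = im.shape  # shape is (rows, columns), not (width, height)

lines = []
for i in range(n_rows):          # first index walks the rows
    row = "".join(repr(int(im[i, j])) + " " for j in range(n_cols))
    lines.append(row.rstrip())

# Using the row count as a column bound reproduces the IndexError.
try:
    im[0][n_rows - 1]            # column index 3 on a row of length 3
    swapped_ok = True
except IndexError:
    swapped_ok = False
```

Note that im[i] returns a one-dimensional row, which is why the error message in the question mentions axis 0 even though the column index is the one out of range.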

EDIT
We can iterate through each row/column position and save to a file as below. This can take a very long time, depending on the file size.
Instead of writing your own function, you may want to take advantage of the built-in binary save (which is more efficient):
np.save('matrix.npy', np_array)
You can load this file back as a NumPy array and manipulate it.
Or save it as a text file using np.savetxt (this will take longer):
np.savetxt('matrix.txt', np_array)
Working Code:
import imageio
import numpy as np
im = imageio.imread("9v9zU.png")
matrice_pixels=open("matric.txt","w")
nx,ny = im.shape
for i in range(nx):
    line=""
    for j in range(ny):
        line+=repr(im[i][j])
    matrice_pixels.write(line+'\n')
matrice_pixels.close()
#Save as Binary data
np.save('matrix1.npy', im)
#Save as Human readable data
np.savetxt('matrix1.txt', im)
Alternatively, you may want to look into off-the-shelf libraries that do what you intend to do.
For example, this SO link discusses how to remove a section of a picture based on its color using the PIL library.
Also, in future, please DO NOT post a picture of your code. Copy/paste it into the SO window so that we can copy and modify it. In this case I had to write everything down line by line to test it (thankfully the code was not that long).

I need to find the location of a value in one numpy array and use it to refer to a value in the same location of another numpy array

I am writing code to analyze accelerometer data on a Raspberry Pi. The sensor data is output to a single txt file with columns separated by \t. I imported the text file using numpy.loadtxt and unpacked it into separate arrays. I can perform operations like trapz and cumtrapz on the arrays.
This data will be used in combination with another sensor that will output the specific time of an event. I want to take that time, find the closest logged time from my sensor, and look up the corresponding values in the other arrays.
I tried using numpy.where with a specific time value that I knew was in the list and got an output of "(array([], dtype=int32),)".
Here is the code I ran. I'm sure I misused at least one thing. I am still very much a beginner in Python and coding in general...
import logging
import sys
import numpy as np
from scipy import integrate
x,y,z,t=np.loadtxt('a.txt', dtype={'names':['x','y','z','t'],
'formats':['f4','f4','f4','f4']},unpack='true')
p = integrate.trapz(integrate.cumtrapz(x, t, initial=0), t)
ti = np.where(x==1.5670002)
print ti
print p
The full output from that is
(array([], dtype=int32),)
0.0114166
It turned out I was searching x for a value from t. After searching t instead, it now outputs
(array([101]),)
How would I print that corresponding number from another array?

Here is my solution:
import logging
import sys
import numpy as np
from scipy import integrate
x,y,z,t=np.loadtxt('a.txt', dtype={'names':['x','y','z','t'],
                   'formats':['f4','f4','f4','f4']},unpack=True)
p = integrate.cumtrapz(integrate.cumtrapz(x, t, initial=0), t)
# float() makes the comparison numeric on Python 3, where input() returns a string
t0=float(input("What is reference time?"))
ti = np.where(t>=t0)[0][0]
if t[ti]-t0 <= t0-t[ti-1]:
    t1 = ti
else:
    t1 = ti-1
print ('closest time was {0:0.4f}\ndisplacement at that time was {1:0.4f}' .format(t[t1],p[t1]))
The output is
closest time was 1.1991
displacement at that time was 0.0100
Seems to be working. I will have to add error messages for when the reference time is outside of a usable range. Would love some constructive criticism though. Any commands that you think would work better/faster than what I have used?
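One possible alternative for the lookup, offered as a sketch rather than a definitive answer: np.searchsorted finds the insertion point in a sorted array directly, and a range check covers the case of a reference time outside the logged data. The helper name nearest_index and the sample values are made up for illustration, and the sketch assumes t is sorted in ascending order.

```python
import numpy as np

def nearest_index(t, t0):
    """Index of the entry in sorted 1-D array t closest to t0."""
    if t0 < t[0] or t0 > t[-1]:
        raise ValueError("reference time is outside the logged range")
    hi = int(np.searchsorted(t, t0))    # first index with t[hi] >= t0
    if hi == 0:
        return 0
    lo = hi - 1
    return hi if t[hi] - t0 < t0 - t[lo] else lo

# Made-up time stamps and displacements for illustration.
t = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
p = np.array([0.00, 0.01, 0.04, 0.09, 0.16])

idx = nearest_index(t, 1.2)
closest_time, displacement = t[idx], p[idx]
```

The same index can then be used on any other array of the same length, which is the "same location in another array" lookup the question asks about.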

Trying to save vtk files with tvtk made from NumPy arrays

I'm trying to use tvtk (the package included with Enthought's Canopy) to turn some arrays into .vtk data that I can toss over to VisIt (mayavi complains on my OS, Mac OS X). I found what looked like the solution here (Exporting a 3D numpy to a VTK file for viewing in Paraview/Mayavi), but I'm not getting the output that the author of the answer does, and I was wondering if anyone could tell me what I'm doing wrong. I entered these commands in the Canopy notebook:
import numpy as np
from enthought.tvtk.api import tvtk, write_data
data = np.random.random((10,10,10))
grid = tvtk.ImageData(spacing=(10, 5, -10), origin=(100, 350, 200),
dimensions=data.shape)
grid.point_data.scalars = np.ravel([], order='F')
grid.point_data.scalars.name = 'Test Data'
# Writes legacy ".vtk" format if filename ends with "vtk", otherwise
# this will write data using the newer xml-based format.
write_data(grid, '/Users/Epictetus/Documents/Dropbox/Work/vtktest.vtk')
which does create a vtk file, but unlike the output shown by the author of the previous answer, I just get a file with a header and no data,
# vtk DataFile Version 3.0
vtk output
ASCII
DATASET STRUCTURED_POINTS
DIMENSIONS 10 10 10
SPACING 10 5 -10
ORIGIN 100 350 200
Is it obvious what I'm doing wrong? File I/O has never been my forte...
Cheers!
-user2275987

Change the line
grid.point_data.scalars = np.ravel([], order='F')
to
grid.point_data.scalars = data.ravel(order='F')
Your grid doesn't have any data, and hence nothing is saved to the vtk file! :-)
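The difference between the two lines can be seen with plain NumPy, without tvtk installed. This is a small sketch for illustration only; the variable names empty, scalars and first_axis_fastest are made up.

```python
import numpy as np

data = np.random.random((10, 10, 10))

empty = np.ravel([], order="F")      # what the question assigned: no values
scalars = data.ravel(order="F")      # one value per grid point

# Fortran order makes the first axis vary fastest, so element 1 of the
# flattened array is data[1, 0, 0] rather than data[0, 0, 1].
first_axis_fastest = scalars[1] == data[1, 0, 0]
```

The flattened array has 1000 entries, matching the 10 x 10 x 10 dimensions given to the grid, whereas the original line produced an array of size zero.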
