numpy.fromfile seems to be unable to read large files - python
I wanted to write a very simple Python helper tool for my project that reads binary data from an ECG record. I read somewhere that numpy.fromfile is the most appropriate tool for this, so I wrote:
#!/usr/bin/env python3
import sys
import numpy as np
arrayOfNums = np.fromfile(sys.argv[1], 'short')
print("Converting " + sys.argv[1] + "...")
conversionOutput = open("output", "x")
conversionOutput.write(np.array2string(arrayOfNums, separator=' '))
conversionOutput.close()
print("Conversion done.")
The data consists of unseparated 2-byte records, and I want to write them out as text. The input file is fairly large (over 7 MB), but I would not expect that to be large enough to cause NumPy any trouble.
The output I got in the file: [-32243 -32141 -32666 ... -32580 -32635 -32690]
Why are there dots in the middle? The conversion seems to work, but the output omits almost everything it is supposed to save. Any help would be appreciated.
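NumPy reads your file correctly. To keep the printed representation short, NumPy summarizes large arrays (more than 1000 elements by default) and replaces the omitted middle part with dots: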
import numpy as np
a = np.random.random(10000)
Output:
>>> a
array([0.20902653, 0.80097215, 0.06909818, ..., 0.5963183 , 0.94024005,
0.31870234])
>>> a.shape
(10000,)
a contains 10000 values and not only the 6 displayed values.
Update
To display the full output:
import sys
np.set_printoptions(threshold=sys.maxsize)
print(a)
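For the original goal of writing every sample to a text file, the global print option does not have to be changed. The following is a minimal sketch, assuming a made-up array named samples in place of the ECG data and a temporary output path: np.array2string accepts its own threshold argument, and np.savetxt never abbreviates.

```python
import os
import tempfile

import numpy as np

# Hypothetical stand-in for the ECG samples: 5000 signed 16-bit values.
samples = np.arange(-2500, 2500, dtype=np.int16)

# Default formatting summarizes arrays with more than 1000 elements.
short_text = np.array2string(samples, separator=' ')

# Raising the threshold for this one call keeps every element.
full_text = np.array2string(samples, separator=' ',
                            threshold=samples.size + 1)

# np.savetxt writes all values (one per line for a 1-D array).
out_path = os.path.join(tempfile.mkdtemp(), "output.txt")
np.savetxt(out_path, samples, fmt='%d')

# Reading the file back recovers the same values.
restored = np.loadtxt(out_path, dtype=np.int16)
```

np.savetxt is usually the more convenient choice when the file is meant to be read back by another program, because it produces plain columns without brackets.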
Related
How to create a .png file using python?
I am trying to create a .png file from uint8array but am not getting the expected result. The file is 908 bytes but it is supposed to be of 905 bytes. When I try to open the image in the MS paint, it says This is not a valid bitmap file. The same array works for me when I use node.js. Here is the code : import io import numpy as np arr =[137,80,78,71,13,10,26,10,0,0,0,13,73,72,68,82,0,0,0,200,0,0,0,200,8,6,0,0,0,173,88,174,158,0,0,0,4,115,66,73,84,8,8,8,8,124,8,100,136,0,0,0,9,112,72,89,115,0,0,11,19,0,0,11,19,1,0,154,156,24,0,0,3,43,73,68,65,84,120,156,237,221,193,110,163,48,20,64,209,206,168,255,255,203,51,251,44,174,21,192,177,33,231,236,41,52,237,149,37,30,14,63,63,0,0,0,0,0,0,0,0,0,0,19,253,57,120,220,191,75,175,98,236,232,117,114,15,219,254,63,253,157,121,21,112,119,2,129,32,16,8,2,129,32,16,8,2,129,32,16,8,191,147,126,174,185,5,239,184,250,255,229,178,185,138,21,4,130,64,32,8,4,130,64,32,8,4,130,64,32,8,4,130,64,32,8,4,130,64,32,8,4,194,172,103,177,222,245,233,61,201,239,58,251,172,208,236,223,111,247,235,59,107,217,179,125,86,16,8,2,129,32,16,8,2,129,32,16,8,2,129,32,16,8,71,231,32,159,190,47,61,251,124,103,231,0,163,227,103,207,41,102,159,127,247,207,255,213,101,215,107,5,129,32,16,8,2,129,32,16,8,2,129,32,16,8,2,129,176,203,126,144,179,102,207,1,70,86,239,199,88,253,93,200,171,63,255,105,172,32,16,4,2,65,32,16,4,2,65,32,16,4,2,65,32,16,158,50,7,89,125,159,253,238,251,49,206,218,253,250,14,179,130,64,16,8,4,129,64,16,8,4,129,64,16,8,4,129,64,184,203,28,228,238,239,175,56,59,39,89,189,223,98,247,207,127,26,43,8,4,129,64,16,8,4,129,64,16,8,4,129,64,16,8,132,199,62,199,255,97,171,223,15,226,239,56,137,21,4,130,64,32,8,4,130,64,32,8,4,130,64,32,8,4,194,183,236,7,121,250,126,139,213,251,77,86,127,126,211,88,65,32,8,4,130,64,32,8,4,130,64,32,8,4,130,64,32,236,50,7,217,125,63,197,234,227,207,90,125,254,145,109,175,207,10,2,65,32,16,4,2,65,32,16,4,2,65,32,16,4,2,97,245,253,239,171,124,251,156,98,245,241,35,179,247,243,76,99,5,129,32,16,8,2,129,32,16,8,2,129,32,16,8
,2,129,176,203,28,100,246,125,252,145,217,63,223,249,207,157,127,25,43,8,4,129,64,16,8,4,129,64,16,8,4,129,64,16,8,132,109,239,63,191,184,251,126,135,167,255,252,213,199,79,99,5,129,32,16,8,2,129,32,16,8,2,129,32,16,8,2,129,112,151,57,200,200,234,253,32,171,143,31,121,202,223,249,227,172,32,16,4,2,65,32,16,4,2,65,32,16,4,2,65,32,16,238,242,158,244,145,213,239,175,152,109,245,251,73,86,179,31,4,118,36,16,8,2,129,32,16,8,2,129,32,16,8,2,129,176,203,28,100,228,233,251,37,118,127,255,198,215,206,97,172,32,16,4,2,65,32,16,4,2,65,32,16,4,2,65,32,16,118,153,131,220,253,62,253,232,252,187,127,111,214,89,219,190,223,227,44,43,8,4,129,64,16,8,4,129,64,16,8,4,129,64,16,8,132,93,230,32,171,239,227,143,172,126,15,249,200,234,57,195,234,243,79,99,5,129,32,16,8,2,129,32,16,8,2,129,32,16,8,2,129,112,116,14,242,233,231,255,87,239,167,120,250,123,216,71,118,159,83,189,186,236,243,176,130,64,16,8,4,129,64,16,8,4,129,64,16,8,4,129,64,56,122,127,252,177,223,131,196,35,152,131,192,39,8,4,130,64,32,8,4,130,64,32,8,4,130,64,32,8,4,130,64,32,8,4,130,64,32,204,250,110,222,171,247,48,123,182,235,217,182,221,243,110,5,129,32,16,8,2,129,32,16,8,2,129,32,16,8,2,1,0,0,0,0,0,0,0,0,56,239,63,169,44,80,125,75,214,2,231,0,0,0,0,73,69,78,68,174,66,96,130] arr = np.array(arr, dtype=np.uint8) np.set_printoptions(formatter={'int':hex}) arr=np.array(arr) f=open("QR icon.png","w") f.write(arr) f.close() Also when I open the created image in notepad, there is an extra space which is not there in the file I created using node. I think I am creating the file in a wrong way. Please help me .....
Okay, first your code didn't work for me i run into a few small errors. When i fixed them, it solved you initial problem: import io import numpy as np arr =[137,80,78,71,13,10,26,10,0,0,0,13,73,72,68,82,0,0,0,200,0,0,0,200,8,6,0,0,0,173,88,174,158,0,0,0,4,115,66,73,84,8,8,8,8,124,8,100,136,0,0,0,9,112,72,89,115,0,0,11,19,0,0,11,19,1,0,154,156,24,0,0,3,43,73,68,65,84,120,156,237,221,193,110,163,48,20,64,209,206,168,255,255,203,51,251,44,174,21,192,177,33,231,236,41,52,237,149,37,30,14,63,63,0,0,0,0,0,0,0,0,0,0,19,253,57,120,220,191,75,175,98,236,232,117,114,15,219,254,63,253,157,121,21,112,119,2,129,32,16,8,2,129,32,16,8,2,129,32,16,8,191,147,126,174,185,5,239,184,250,255,229,178,185,138,21,4,130,64,32,8,4,130,64,32,8,4,130,64,32,8,4,130,64,32,8,4,130,64,32,8,4,194,172,103,177,222,245,233,61,201,239,58,251,172,208,236,223,111,247,235,59,107,217,179,125,86,16,8,2,129,32,16,8,2,129,32,16,8,2,129,32,16,8,71,231,32,159,190,47,61,251,124,103,231,0,163,227,103,207,41,102,159,127,247,207,255,213,101,215,107,5,129,32,16,8,2,129,32,16,8,2,129,32,16,8,2,129,176,203,126,144,179,102,207,1,70,86,239,199,88,253,93,200,171,63,255,105,172,32,16,4,2,65,32,16,4,2,65,32,16,4,2,65,32,16,158,50,7,89,125,159,253,238,251,49,206,218,253,250,14,179,130,64,16,8,4,129,64,16,8,4,129,64,16,8,4,129,64,184,203,28,228,238,239,175,56,59,39,89,189,223,98,247,207,127,26,43,8,4,129,64,16,8,4,129,64,16,8,4,129,64,16,8,132,199,62,199,255,97,171,223,15,226,239,56,137,21,4,130,64,32,8,4,130,64,32,8,4,130,64,32,8,4,194,183,236,7,121,250,126,139,213,251,77,86,127,126,211,88,65,32,8,4,130,64,32,8,4,130,64,32,8,4,130,64,32,236,50,7,217,125,63,197,234,227,207,90,125,254,145,109,175,207,10,2,65,32,16,4,2,65,32,16,4,2,65,32,16,4,2,97,245,253,239,171,124,251,156,98,245,241,35,179,247,243,76,99,5,129,32,16,8,2,129,32,16,8,2,129,32,16,8,2,129,176,203,28,100,246,125,252,145,217,63,223,249,207,157,127,25,43,8,4,129,64,16,8,4,129,64,16,8,4,129,64,16,8,132,109,239,63,191,184,251,126,135,167,255,252,213,199,79,99,5,129,3
2,16,8,2,129,32,16,8,2,129,32,16,8,2,129,112,151,57,200,200,234,253,32,171,143,31,121,202,223,249,227,172,32,16,4,2,65,32,16,4,2,65,32,16,4,2,65,32,16,238,242,158,244,145,213,239,175,152,109,245,251,73,86,179,31,4,118,36,16,8,2,129,32,16,8,2,129,32,16,8,2,129,176,203,28,100,228,233,251,37,118,127,255,198,215,206,97,172,32,16,4,2,65,32,16,4,2,65,32,16,4,2,65,32,16,118,153,131,220,253,62,253,232,252,187,127,111,214,89,219,190,223,227,44,43,8,4,129,64,16,8,4,129,64,16,8,4,129,64,16,8,132,93,230,32,171,239,227,143,172,126,15,249,200,234,57,195,234,243,79,99,5,129,32,16,8,2,129,32,16,8,2,129,32,16,8,2,129,112,116,14,242,233,231,255,87,239,167,120,250,123,216,71,118,159,83,189,186,236,243,176,130,64,16,8,4,129,64,16,8,4,129,64,16,8,4,129,64,56,122,127,252,177,223,131,196,35,152,131,192,39,8,4,130,64,32,8,4,130,64,32,8,4,130,64,32,8,4,130,64,32,8,4,130,64,32,204,250,110,222,171,247,48,123,182,235,217,182,221,243,110,5,129,32,16,8,2,129,32,16,8,2,129,32,16,8,2,1,0,0,0,0,0,0,0,0,56,239,63,169,44,80,125,75,214,2,231,0,0,0,0,73,69,78,68,174,66,96,130] arr = np.array(arr, dtype=np.uint8) np.set_printoptions(formatter={'int':hex}) arr=np.array(arr) f=open("QR icon.png","wb") f.write(arr.tostring()) f.close() The difference is the wb so that i can write binary and the arr.tostring(). When i checked the properties it had the 905 pixels.
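The two changes in the answer (binary mode and converting the array to raw bytes) can be checked on a small scale. This is a minimal sketch, assuming only the first 8 values of the list (the PNG signature) and a temporary file name that is not part of the original code. It uses tobytes(), which is the current name for tostring(); tostring() is deprecated in newer NumPy releases.

```python
import os
import tempfile

import numpy as np

# First 8 values of the list above: the signature that starts every PNG file.
byte_values = [137, 80, 78, 71, 13, 10, 26, 10]
arr = np.array(byte_values, dtype=np.uint8)

# Hypothetical output path used only for this illustration.
out_path = os.path.join(tempfile.mkdtemp(), "signature.bin")

# "wb" opens the file in binary mode, so the bytes are written unchanged.
with open(out_path, "wb") as f:
    f.write(arr.tobytes())

# One byte on disk per array element.
size_on_disk = os.path.getsize(out_path)

with open(out_path, "rb") as f:
    written = f.read()
```

With the full list from the question, the same pattern should give a file whose size in bytes equals the length of the list.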
h5py open file with unknown datasets
I am trying to use h5py to open a file that was created by another program. Unfortunately, I don't know the inner structure of the file. All I know is that it should contain a 20x20 matrix, which I would like to process with numpy. Here is what I have done so far:
import numpy
import h5py
f = h5py.File('example.hdf5')
print(f.keys())
The result is as follows:
KeysViewWithLock(<HDF5 file "example.hdf5" (mode r+)>)
How do I go from here? I want to access the matrix as a single numpy.ndarray. The h5py documentation always talks about creating HDF5 files, not reading unknown files. Thanks a lot.
SOLUTION (thanks to akash karothiya): use print(list(f.keys())) instead. That gives the names of the groups/datasets, which can then be accessed as a = f['dataset'].
OK, as mentioned before, akash karothiya helped me find the solution. Instead of print(f.keys()), use print(list(f.keys())). This returns ['dataset']. Using this information, I can get an h5py dataset object, which I then converted into a numpy array as follows:
a = f['dataset']
b = numpy.zeros(numpy.shape(a), dtype=complex)
for i in range(numpy.size(a, 0)):
    # combine the 'real' and 'imag' fields of row i into complex values
    b[i, :] = numpy.asarray(a[i]['real'] + 1j*a[i]['imag'], dtype=complex)
UPDATE: New version without a for loop, potentially faster and more versatile (it works for both complex and real data, and also for cubes with dimensions NxMxO):
a = f['dataset']
if len(a.dtype) == 0:
    # plain (non-compound) dtype: real data
    b = numpy.squeeze(a[()])
elif len(a.dtype) == 2:
    # compound dtype with two fields: 'real' and 'imag'
    b = numpy.squeeze(a[()]['real'] + 1.0j*a[()]['imag'])
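The dtype test in the updated version can be tried without an HDF5 file. This is a minimal sketch, assuming that a[()] returns a NumPy structured array with the fields 'real' and 'imag'; the array raw and the helper to_ndarray are made up for illustration and are not part of h5py.

```python
import numpy as np

# Hypothetical stand-in for a[()] on a compound dataset with two fields.
raw = np.zeros((2, 3), dtype=[('real', 'f8'), ('imag', 'f8')])
raw['real'] = [[1, 2, 3], [4, 5, 6]]
raw['imag'] = [[0, -1, 0.5], [2, 0, -3]]


def to_ndarray(values):
    """Return a plain or complex ndarray, depending on the dtype."""
    if len(values.dtype) == 0:
        # len() of a non-structured dtype is 0: the data is already plain.
        return np.squeeze(values)
    if values.dtype.names == ('real', 'imag'):
        # Two named fields: combine them into complex numbers.
        return np.squeeze(values['real'] + 1j * values['imag'])
    raise ValueError("unexpected dtype: %r" % (values.dtype,))


z = to_ndarray(raw)
plain = to_ndarray(np.ones((1, 4)))
```

Checking the field names rather than only the number of fields is a small extra safeguard against compound datasets that happen to have two unrelated fields.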
Python error : index out of bounds
I was curious about image processing with Python, so I found this great library, imageio. I tried to manipulate the pixels of a picture and save them in a new file, but I had some problems with the loops. This is what the code looks like [image: screenshot of the code], and this is the error that I got:
IndexError: index 3507 is out of bounds for axis 0 with size 3507
The code:
# -*- coding: iso-8859-1 -*-
import imageio
import numpy as np
im = imageio.imread("JAFFRE009a.png")
taille=im.shape #taille is a tuple (Width,Height)
print taille # (4961,3507)
matrice_pixels=open("matrice.txt",'w')
for i in range(taille[1]):
    line=""
    for j in range(taille[0]):
        line+=repr(im[i][j])
    matrice_pixels.write(line+'\n')
matrice_pixels.close()
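Your image is not square, so its two axes have different lengths, and each loop bound has to match the axis it indexes. The first index of im runs over im.shape[0] and the second over im.shape[1].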
EDIT: We can iterate over each row/column position and write the values to a file as shown below. This can take a very long time, depending on the image size. Instead of writing your own loop, you may want to take advantage of the built-in binary save, which is more efficient:
np.save('matrix.npy', np_array)
You can later load this file as a NumPy array and manipulate it. Alternatively, save it as a text file using np.savetxt (this will take longer):
np.savetxt('matrix.txt', np_array)
Working code:
import imageio
import numpy as np
im = imageio.imread("9v9zU.png")
matrice_pixels = open("matric.txt", "w")  # text mode, because strings are written
nx, ny = im.shape  # (rows, columns); assumes a 2-D (grayscale) image
for i in range(nx):
    line = ""
    for j in range(ny):
        line += repr(im[i][j])
    matrice_pixels.write(line + '\n')
matrice_pixels.close()
# Save as binary data
np.save('matrix1.npy', im)
# Save as human-readable data
np.savetxt('matrix1.txt', im)
Alternatively, you may want to look into off-the-shelf libraries that already do what you intend to do. For example, this SO link discusses how to remove a section of a picture based on its color using the PIL library.
Also, in future, please DO NOT post a picture of your code. Copy/paste it into the SO window so that we can copy and modify it. In this case I had to write everything down line by line to test it (thankfully the code was not that long).
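The relationship between the array shape and the loop bounds can be seen on a tiny non-square array. This is a minimal sketch, assuming a made-up 2x3 array in place of the image and a space as separator; it also shows that np.savetxt produces the same text as the explicit loops.

```python
import io

import numpy as np

# Hypothetical stand-in for the image: 2 rows (height) by 3 columns (width).
im = np.array([[0, 10, 20],
               [30, 40, 50]], dtype=np.uint8)

# shape is (rows, columns): the first index runs over rows.
n_rows, n_cols = im.shape

lines = []
for i in range(n_rows):          # i indexes axis 0
    row = " ".join(str(int(im[i, j])) for j in range(n_cols))  # j indexes axis 1
    lines.append(row)
text_from_loop = "\n".join(lines) + "\n"

# The same output without explicit loops.
buffer = io.StringIO()
np.savetxt(buffer, im, fmt='%d')
text_from_savetxt = buffer.getvalue()
```

Swapping the two range() bounds on this array fails in the same way as in the question, with an IndexError on axis 0.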
I need to find the location of a value in one numpy array and use it to refer to a value in the same location of another numpy array
I am writing code to analyze accelerometer data on a Raspberry Pi. The sensor data is output to a single txt file with columns separated by \t. I imported the text file using numpy.loadtxt and unpacked it into separate arrays. I can perform things like trapz and cumtrapz on the arrays. This data will be used in combination with another sensor that will output the specific time of an event. I want to take that time, find the closest logged time from my sensor, and look up the corresponding values in the other arrays. I tried using numpy.where with a specific time value that I knew was in the list and got an output of "(array([], dtype=int32),)". Here is the code I ran. I'm sure I misused at least one thing. I am still very much a beginner in Python and coding in general...
import logging
import sys
import numpy as np
from scipy import integrate
x,y,z,t=np.loadtxt('a.txt', dtype={'names':['x','y','z','t'], 'formats':['f4','f4','f4','f4']},unpack='true')
p = integrate.trapz(integrate.cumtrapz(x, t, initial=0), t)
ti = np.where(x==1.5670002)
print ti
print p
The full output from that is
(array([], dtype=int32),)
0.0114166
So I was searching x for a value from t. It is now outputting (array([101]),). How would I print the corresponding number from another array?
Here is my solution:
import logging
import sys
import numpy as np
from scipy import integrate
x,y,z,t=np.loadtxt('a.txt', dtype={'names':['x','y','z','t'], 'formats':['f4','f4','f4','f4']},unpack='true')
p = integrate.cumtrapz(integrate.cumtrapz(x, t, initial=0), t)
# convert the typed text to a number so it can be compared with t
t0 = float(input("What is reference time?"))
# index of the first logged time that is not earlier than t0
ti = np.where(t>=t0)[0][0]
# pick whichever neighbour (ti or ti-1) is closer to t0
if t[ti]-t0 <= t0-t[ti-1]:
    t1 = ti
else:
    t1 = ti-1
print ('closest time was {0:0.4f}\ndisplacement at that time was {1:0.4f}' .format(t[t1],p[t1]))
The output is
closest time was 1.1991
displacement at that time was 0.0100
It seems to be working. I will have to add error messages for when the reference time is outside the usable range. I would love some constructive criticism, though. Are there any commands that you think would work better or faster than what I have used?
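One possible alternative to the where/if combination is to take the index of the smallest absolute time difference. This is a minimal sketch, assuming made-up t and p arrays of equal length and a hypothetical helper named nearest_index; it is not the method used in the post above.

```python
import numpy as np

# Made-up logged times and displacement values of the same length.
t = np.array([0.0, 0.4, 0.8, 1.2, 1.6])
p = np.array([0.000, 0.002, 0.006, 0.010, 0.015])


def nearest_index(times, t0):
    """Index of the entry in times that is closest to t0."""
    return int(np.argmin(np.abs(times - t0)))


idx = nearest_index(t, 1.1)
closest_time = t[idx]
displacement = p[idx]

# Reference times outside the logged range map to the first or last sample.
before_start = nearest_index(t, -5.0)
after_end = nearest_index(t, 99.0)
```

Because np.argmin always returns a valid index, a reference time outside the logged range does not raise an error here, so a separate range check is still needed if such inputs should be rejected.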
Trying to save vtk files with tvtk made from NumPy arrays
I'm trying to use tvtk (the package included with Enthought's Canopy) to turn some arrays into .vtk data that I can toss over to VisIt (mayavi complains on my OS, Mac OS X). I found what looked like the solution here (Exporting a 3D numpy to a VTK file for viewing in Paraview/Mayavi), but I'm not getting the output that the author of the answer does, and I was wondering if anyone could tell me what I'm doing wrong. I enter these commands in the Canopy notebook:
import numpy as np
from enthought.tvtk.api import tvtk, write_data
data = np.random.random((10,10,10))
grid = tvtk.ImageData(spacing=(10, 5, -10), origin=(100, 350, 200), dimensions=data.shape)
grid.point_data.scalars = np.ravel([], order='F')
grid.point_data.scalars.name = 'Test Data'
# Writes legacy ".vtk" format if filename ends with "vtk", otherwise
# this will write data using the newer xml-based format.
write_data(grid, '/Users/Epictetus/Documents/Dropbox/Work/vtktest.vtk')
This does create a vtk file, but unlike the output that the author of the previous answer shows, I just get a file without any data:
# vtk DataFile Version 3.0
vtk output
ASCII
DATASET STRUCTURED_POINTS
DIMENSIONS 10 10 10
SPACING 10 5 -10
ORIGIN 100 350 200
Is it obvious what I'm doing wrong? File I/O has never been my forte... Cheers! -user2275987
Change the line
grid.point_data.scalars = np.ravel([], order='F')
to
grid.point_data.scalars = data.ravel(order='F')
Your grid doesn't have any data, and hence nothing is saved to the vtk file! :-)
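The difference between the two lines can be checked with NumPy alone, without tvtk. This is a minimal sketch; the names empty_scalars and full_scalars are made up for illustration.

```python
import numpy as np

data = np.random.random((10, 10, 10))

# Raveling an empty list yields an empty array: there is nothing to write.
empty_scalars = np.ravel([], order='F')

# Raveling the data itself yields all 1000 values in Fortran order,
# where the first index varies fastest.
full_scalars = data.ravel(order='F')
```

In Fortran order, full_scalars[1] corresponds to data[1, 0, 0].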