I have the following code written in Matlab:
>>> fid = fopen('filename.bin', 'r', 'b')
>>> % 'r' and 'b' are separate arguments
>>> dim = fread(fid, 2, 'uint32');
If I use the "equivalent" code in Python:
>>> fid = open('filename.bin', 'rb')
>>> dim = np.fromfile(fid, dtype=np.uint32)
I get a different value of dim when I use Python.
Does anyone know how to open this file in Python with permissions like Matlab's (separate 'r' and 'b')?
Thanks in advance,
Rhenan
From the Matlab docs I learned that the third parameter 'b' stands for big-endian byte ordering; it is not a permission.
Most probably NumPy uses your machine's native little-endian order. To fix the problem, specify the byte order explicitly in NumPy (as you do in Matlab):
>>> import numpy as np
>>> fid = open('filename.bin', 'rb')
>>> dim = np.fromfile(fid, dtype='>u4')
The dtype string specifies big-endian ('>'), unsigned integer ('u'), 4-byte numbers.
See also Data type objects (dtype) in the NumPy reference.
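For completeness, a minimal end-to-end sketch of the equivalent of the Matlab snippet, assuming the file really does start with two big-endian uint32 dimension values (mirroring fread(fid, 2, 'uint32') with the 'b' ordering):
import numpy as np

# read the first two big-endian uint32 values, like fread(fid, 2, 'uint32') with 'b'
with open('filename.bin', 'rb') as fid:
    dim = np.fromfile(fid, dtype='>u4', count=2)
print(dim)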
So I need to unpack an extremely long byte stream (from USB) into 4-byte values.
Currently I have it working, but I feel there's a better way to do this.
Currently I have:
import struct

l = []
for i in range(len(mybytes) // 4):
    l.append(struct.unpack_from('>i', mybytes, i * 4))
This feels very resource-expensive, and I'm doing this for 16k bytes A LOT.
I also feel like this has probably been asked before; I just don't really know how to word it for searching.
You could also try the array module which has the ability to load directly from binary data:
import array
arr = array.array("I", mybytes)  # "I" stands for unsigned int (4 bytes on most platforms)
arr.byteswap()  # only needed if the data's byte order differs from your platform's
l = list(arr)
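For example, a small self-contained run of that approach; mybytes here is a made-up 8-byte sample assumed to hold two big-endian unsigned 32-bit values:
import array

mybytes = bytes([1, 2, 3, 4, 5, 6, 7, 8])  # hypothetical sample data
arr = array.array("I", mybytes)            # interpreted in native byte order at this point
arr.byteswap()                             # swap bytes: the sample is big-endian, most machines are little-endian
print(list(arr))                           # [16909060, 84281096] on a little-endian machine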
You can specify a repeat count for the integers to unpack in the struct format string (the f-strings used to build it below need Python 3.6+):
>>> import struct
>>> mybytes = bytes([1,2,3,4,5,6,7,8])
>>> struct.unpack('>2i', mybytes)
(16909060, 84281096)
>>> n = len(mybytes) // 4
>>> struct.unpack(f'>{n}i',mybytes) # Python 3.6+ f-strings
(16909060, 84281096)
>>> struct.unpack('>{}i'.format(n),mybytes) # Older Pythons
(16909060, 84281096)
>>> [hex(i) for i in _]
['0x1020304', '0x5060708']
Wrap it in a BytesIO object, then use iter to call its read method until it returns an empty bytes value.
>>> import io, struct
>>> bio = io.BytesIO(b'abcdefgh')
>>> int_fmt = struct.Struct(">i")
>>> list(map(int_fmt.unpack, iter(lambda: bio.read(4), b'')))
[(1633837924,), (1701209960,)]
You can tweak this to extract the single int value from each tuple, or switch to the from_bytes class method.
>>> bio = io.BytesIO(b'abcdefgh')
>>> list(map(lambda i: int.from_bytes(i, 'big'), iter(lambda: bio.read(4), b'')))
[1633837924, 1701209960]
I once read the following function in a program, but I am not very clear on what it is used for. According to SciPy.org,
dtype.newbyteorder(new_order='S')
Return a new dtype with a different byte order.
I do not quite understand what it means.
def _read32(bytestream):
    dt = numpy.dtype(numpy.uint32).newbyteorder('>')
    return numpy.frombuffer(bytestream.read(4), dtype=dt)[0]
Here's a simple example:
>>> import numpy as np
>>> dt = np.dtype(np.uint32)
>>> val = np.frombuffer(b'\x01\x02\x03\x04', dtype=dt)
>>> hex(val[0])
'0x4030201'
>>> val2 = np.frombuffer(b'\x01\x02\x03\x04', dtype=dt.newbyteorder('>'))
>>> hex(val2[0])
'0x1020304'
Byte order describes the order in which a series of bytes (in this case, a bytes object from a b"..." literal) is packed into the data type you asked for.
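To see _read32 in action, here is a minimal sketch; the stream contents are made up and assumed to begin with the big-endian value 0x01020304:
import io
import numpy

def _read32(bytestream):
    # read the next 4 bytes of the stream as one big-endian uint32
    dt = numpy.dtype(numpy.uint32).newbyteorder('>')
    return numpy.frombuffer(bytestream.read(4), dtype=dt)[0]

stream = io.BytesIO(b'\x01\x02\x03\x04' + b'rest of the data')
print(hex(_read32(stream)))   # 0x1020304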
I have various problems with the assigned data types after reading from a binary file with np.fromfile and np.memmap.
I am reading the following:
openfile = open(mypath,'rb')
openfile.seek(start_byte)
myvalue = np.fromfile(openfile, dtype = np.uint64, count=1)
print myvalue
which returns:
myvalue = [1234]
myvalue holds 8 bytes but is returned as an ndarray; I want just a uint64 value that I can use as an index.
1) How do I prevent np.fromfile from returning an ndarray?
If I try myvalue = myvalue[0], myvalue loses its data type completely.
2) Why does myvalue lose its data type when I access the first element?
I have to do something like this with my arrays:
data.extend([myvalue for l in range(myvalue)])
If I try to assign a data type again with myvalue = myvalue[0].astype(np.uint64), I now get:
self.data_array[count:count+myvalue,0] = data[count:count+myvalue]
TypeError: slice indices must be integers or None or have an __index__ method
3) What is going wrong here?
If I assign myvalue as myvalue = myvalue[0].astype(np.int32), the data is interpreted wrongly and I get -35566848567, etc.
4) Why can myvalue still be wrongly interpreted by the program after it is read in, given that
myvalue = myvalue[0].astype(np.int32)
is not the same as
myvalue = myvalue[0].astype(np.uint64)?
Forgive the possible non-answer, but it's easier to post code this way...
Why do you think that myvalue loses its type information? It doesn't do that when I try it here (Python 2.7):
>>> myvalue = np.array([1234], np.uint64)
>>> myvalue = myvalue[0]
>>> type(myvalue)
<type 'numpy.uint64'>
There is a warning about using fromfile in its docs:
Notes
-----
Do not rely on the combination of tofile and fromfile for data storage, as the binary files generated are not platform independent. In particular, no byte-order or data-type information is saved. Data can be stored in the platform-independent .npy format using save and load instead.
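Tying the questions together, here is a small self-contained sketch (the file name demo.bin and the stored value 5 are made up for illustration) showing that the value you index out of the fromfile result keeps its uint64 dtype and can be used directly as a slice index:
import numpy as np

# create a tiny demo file holding a single uint64 value (5)
np.array([5], dtype=np.uint64).tofile('demo.bin')

with open('demo.bin', 'rb') as f:
    f.seek(0)                                      # this is where start_byte would go
    myvalue = np.fromfile(f, dtype=np.uint64, count=1)[0]

print(type(myvalue))                               # <class 'numpy.uint64'> on Python 3
data = list(range(10))
print(data[:myvalue])                              # numpy integer scalars implement __index__, so slicing works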
Actually, I'm trying to convert ctypes arrays to python lists and back.
I found this thread, but it assumes that we know the type at compile time.
Is it possible to retrieve a ctypes type for an element?
I have a Python list that contains at least one element. I want to do something like this:
import ctypes
arr = (type(pyarr[0]) * len(pyarr))(*pyarr)
This obviously doesn't work, because type() doesn't return a ctypes-compatible class. But even if the list contains objects created directly from ctypes, the above code doesn't work, because each element is an object instance of the type.
Is there any way to perform this task?
[EDIT]
OK, here is the code that works for me. I'm using it to convert input parameters from a comtypes server method to Python lists, and return values to array pointers:
from ctypes import cast, POINTER

def list(count, p_items):
    """Returns a Python list for the given items, represented by a pointer and the number of items"""
    items = []
    for i in range(count):
        items.append(p_items[i])
    return items

def p_list(items):
    """Returns a pointer to a list of items"""
    c_items = (type(items[0]) * len(items))(*items)
    p_items = cast(c_items, POINTER(type(items[0])))
    return p_items
As explained before, p_list(items) requires at least one element.
I don't think that's directly possible, because multiple ctypes types map to a single Python type. For example, c_int/c_long/c_ulong/c_ulonglong all map to Python int. Which one would you choose? You could create a map of your preferences:
>>> D = {int:c_int,float:c_double}
>>> pyarr = [1.2,2.4,3.6]
>>> arr = (D[type(pyarr[0])] * len(pyarr))(*pyarr)
>>> arr
<__main__.c_double_Array_3 object at 0x023540D0>
>>> arr[0]
1.2
>>> arr[1]
2.4
>>> arr[2]
3.6
Also, the undocumented _type_ attribute can tell you the element type of a ctypes array.
>>> arr._type_
<class 'ctypes.c_double'>
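Putting the two ideas together, a minimal round-trip sketch; the map contents and helper names (CTYPE_FOR, to_ctypes_array, to_python_list) are illustrative, not from the original post:
from ctypes import c_int, c_double

# illustrative preference map: which ctypes type to use for each Python type
CTYPE_FOR = {int: c_int, float: c_double}

def to_ctypes_array(pyarr):
    # build a ctypes array from a non-empty Python list of ints or floats
    ctype = CTYPE_FOR[type(pyarr[0])]
    return (ctype * len(pyarr))(*pyarr)

def to_python_list(arr):
    # convert a ctypes array back into a plain Python list
    return list(arr)

pyarr = [1.2, 2.4, 3.6]
arr = to_ctypes_array(pyarr)
print(arr._type_)            # <class 'ctypes.c_double'>
print(to_python_list(arr))   # [1.2, 2.4, 3.6]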
Using a Python array, I can initialize a 32,487,834-element integer array (found in a file HR.DAT) using the following (not perfectly Pythonic, of course) commands:
from array import array

F = open('HR.DAT', 'rb')
HR = array('I', F.read())
F.close()
I need to do the same in ctypes. So far the best I have is:
HR = c_int * 32487834
I'm not sure how to initialize each element of the array using HR.DAT. Any thoughts?
Thanks,
Mike
File objects have a 'readinto(..)' method that can be used to fill objects that support the buffer interface.
So, something like this should work:
from ctypes import c_int

f = open('hr.dat', 'rb')
array = (c_int * 32487834)()
f.readinto(array)
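A small self-contained variant of the same readinto idea; hr_demo.dat and the five values are made up so the snippet can run without the real HR.DAT:
from ctypes import c_int

# write a tiny demo file of raw native-endian ints (stand-in for HR.DAT)
demo = (c_int * 5)(10, 20, 30, 40, 50)
with open('hr_demo.dat', 'wb') as f:
    f.write(bytes(demo))

# read it back with readinto, which fills the ctypes array buffer in place
arr = (c_int * 5)()
with open('hr_demo.dat', 'rb') as f:
    f.readinto(arr)

print(list(arr))   # [10, 20, 30, 40, 50]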
Try something like this to convert a Python array to a ctypes array:
>>> from array import array
>>> a = array("I")
>>> a.extend([1,2,3])
>>> from ctypes import c_int
>>> ca = (c_int*len(a))(*a)
>>> print ca[0], ca[1], ca[2]
1 2 3