Not able to convert binary data from dicom file with python

I have Vector Grid Data from a Deformable Registration Grid Sequence whose type is binary.
I'm trying to convert this data to a list of (I think) signed floating-point values, but I can't find a function that performs this conversion. Here is a piece of the data:
b' dZ=\x00\x90\xb3=\x00\x18\x89\xbd \xe9}=\x00\xc0\xd6=\x00\xa0\xa5\xbd\xe0]\x93=\x00\x10\xfd=\x00\xa8\xc4\xbd\xc0\x8e\xa9=
...
\x95\xf9\xbb\xbc\x00\x80\x06=\xc6\x88(=\xa9\xcb\x82\xbc\x00#\xa6<A\xce\xc6<\xc5\xd5\x19\xbc\x00\x00\x0e<k\xba\x17<\x02\x07i\xbb'
I'd appreciate your help.

Vector Grid Data consists of triplets of 4-byte floating-point values. Try:
from struct import unpack
data = b"..."
# Use integer division: the repeat count in the format string must be an int
values = unpack(f"<{len(data) // 4}f", data)
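The flat tuple can then be grouped into (x, y, z) triplets. A minimal sketch, assuming the data really is little-endian 32-bit floats as described above; the sample bytes are made up with struct.pack for illustration and are not taken from a real DICOM file:

```python
from struct import iter_unpack, pack

# Hypothetical sample: two displacement vectors packed as little-endian float32.
data = pack("<6f", 0.5, -1.0, 2.0, 0.25, 0.0, -4.0)

# "<3f" reads one (x, y, z) triplet of 4-byte floats per iteration.
# iter_unpack requires len(data) to be a multiple of 12 bytes here.
vectors = list(iter_unpack("<3f", data))
print(vectors)  # [(0.5, -1.0, 2.0), (0.25, 0.0, -4.0)]
```

If the length of the real data is not a multiple of 12, iter_unpack raises struct.error, which is a useful early sign that the assumed layout is wrong.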

Related

How to decode and visualize DICOM curve data in Python 3?

I am trying to visualize a DICOM file with Python 3 and pyDicom which should contain a black 100x100 image with some curves drawn in it. The pixel data is extracted from header (7fe0,0010) and when printed shows b'\x00\x00\x00...'. This I can easily convert to a 100x100 numpy array.
However, the curve data in (5000,3000) shows me b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xc0H@\x00\x00\x00\x00\x00\xc0X@\x00\x00\x00\x00\x00\xc0H@' which I am not able to convert to x,y coordinates in my 100x100 pixel image. In the DICOM file it says
curve dimensions: 2
number of points: 2
type of data: poly
data value representation: 3
curve label: horizontal axis
curve data: 32 elements
The main question is: How do I decode the coordinates required for retracing the curve within my 100x100 image? My main concern is the fact that there should be 32 elements, but only 26 hex values in the output. Also I have no clue how to deal with the \xc0H@ and \xc0X@. When I print those, it yields 192 72 64 and 192 88 64. How does Python decode these 2 hex codes to 6 numbers? And what do these numbers represent?
EDIT:
Apparently data value representation 3 means the data is represented as a floating point double. On the other hand, there should be two points in the data, so each point is represented by 16 elements? I don't see how these two statements are compatible. What is interesting is that the first \xc0H@ translates to 3 numbers as mentioned before, and by doing so completes the first 16 elements of the curve data. How can I convert this into a point in my 2D image?

Curve data has been retired in DICOM since 2004, so you will find the relevant information in the DICOM standard from 2004 (thanks to @kritzel_sw for the link).
As you already found out, Data Value Representation 3 means that the data entries are in double format, and with a Type of Data of polygon, you have x/y tuples in your data. As a double value is saved in 8 bytes, there are 16 bytes per point -- in your case (32 bytes of data) 2 points overall.
Pydicom does not (and probably will not) directly support the retired Curve module (though support for the Waveform module, the current equivalent, has been added in pydicom 2.1), so you have to decode the data yourself. You can do something like this (given double numbers):
from struct import unpack
from pydicom import dcmread
ds = dcmread(filename)
data = ds[0x50003000].value
# unpack('d') unpacks 8 bytes into a double
numbers = [unpack('d', data[i:i+8])[0] for i in range(0, len(data), 8)]
# I'm sure there is a nicer way for this...
coords = [(numbers[i], numbers[i+1]) for i in range(0, len(numbers), 2)]
In your example data, this will return:
[(0.0, 49.5), (99.0, 49.5)]
i.e. the x/y coordinates (0.0, 49.5) and (99.0, 49.5), which correspond to a horizontal line in the middle of your image.
As to the mismatch of 26 hex elements vs 32 bytes: a byte string representation shows only the bytes that cannot be displayed as ASCII in hex notation; the rest are shown as the corresponding ASCII characters. So, for example, this part of your byte string: \x00\xc0H@ is 4 bytes long and could also be written as \x00\xc0\x48\x40 in hex notation.
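Both points can be checked without a DICOM file. The sketch below rebuilds the 32 bytes from the question in full hex notation (assuming little-endian doubles, and that the last byte of each group is 0x40, which matches the printed value 64) and decodes them:

```python
from struct import unpack

# The 32 bytes of curve data from the question, written out in full hex notation.
data = bytes.fromhex(
    "0000000000000000" "0000000000c04840"
    "0000000000c05840" "0000000000c04840"
)

# ASCII-printable bytes are displayed as characters: 0x48 is 'H', 0x40 is '@'.
print(data[8:16] == b"\x00\x00\x00\x00\x00\xc0H@")  # True
print(len(data))                                    # 32
print(list(data[13:16]))                            # [192, 72, 64]

# Four little-endian doubles, paired into (x, y) points.
numbers = unpack("<4d", data)
coords = list(zip(numbers[0::2], numbers[1::2]))
print(coords)  # [(0.0, 49.5), (99.0, 49.5)]
```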

How to remove the decimal point from the result of an array

from array import array
scores = array('d')
scores.append(90)
scores.append(91)
print(scores)
print(scores[1])
How do we remove the decimal point from the result of an array?

scores = array('d') initializes an array of doubles (as denoted by 'd') rather than integers. If you want an array that holds values without decimals, construct it with an integer type code instead, for example array('i').
You might also want to take a look at other data types that may fit your needs.
If you just want to print the values without the decimal you can cast them into integers when printing as such: print(int(scores[1]))

You can string format to 0 decimal points print("{0:.0f}".format(scores[1])).

The value 90 that you appended to the array is stored as a float.
You can drop the decimal part by converting the value to an integer when printing, as follows:
print(int(scores[1]))

You can also use round(), e.g. round(arr[i], 1) to keep one decimal place. In Python 3, round(arr[i]) with no second argument returns an integer.
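The options from the answers above can be compared side by side. This is a small sketch for Python 3; the names float_scores and int_scores are only illustrative:

```python
from array import array

float_scores = array('d', [90, 91])  # 'd' stores C doubles
int_scores = array('i', [90, 91])    # 'i' stores C signed ints

print(float_scores[1])                    # 91.0
print(int_scores[1])                      # 91
print(int(float_scores[1]))               # 91
print("{0:.0f}".format(float_scores[1]))  # 91
print(round(float_scores[1]))             # 91
```

Changing the type code changes what is stored, while int(), format() and round() only change how a stored double is displayed or converted afterwards.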

Python: Converting two sequential 2-byte registers (4 bytes) into IEEE floating-point big endian

I am hooking up an instrument to a laptop over TCP/IP. I have been using a python package to talk to it, and have it return numbers. There are two probes hooked up to this instrument, and I believe these are the bytes corresponding to the temperature readings of these two probes.
The instrument, by default, is set to Big Endian and these data should be of a 32-bit floating point variety - meaning that the variable (b) in the code chunk represents two numbers. b is representative of the output that I would get from the TCP functions.
>>> b = [16746, 42536, 16777, 65230]
My goal is to convert these into their float values and to automate the process. Currently, I am running b through the hex function to retrieve the hexadecimal equivalent of each value:
>>> c = [hex(value) for value in b]
>>> c
['0x416a', '0xa628', '0x4189', '0xfece']
... then I have manually created data_1 and data_2 below to match these hex values, then unpacked them using struct.unpack as I found in this other answer:
>>> import struct
>>> data_1 = b'\x41\x6a\xa6\x28'
>>> struct.unpack('>f', data_1)
(14.665565490722656,)
>>> data_2 = b'\x41\x89\xfe\xce'
>>> struct.unpack('>f', data_2)
(17.24941635131836,)
Some questions:
Am I fundamentally missing something? I am a biologist by trade, and usually an R programmer, so Python is relatively new to me.
I am primarily looking for a streamlined way to get from the TCP output (b) to the number outputs of struct.unpack. The eventual goal of this project is to constantly be polling the sensors for data, which will be graphed/displayed on screen as well as being saved to a .csv.
Thank you!

The function below produces the same numbers you found:
import struct
def bigIntToFloat(bigIntlist):
    pair = []
    for bigInt in bigIntlist:
        # format(..., '04x') gives four hex digits, i.e. two bytes per register
        pair.append(bytes.fromhex(format(bigInt, '04x')))
        if len(pair) == 2:
            # two registers make one big-endian 32-bit float
            yield struct.unpack('>f', b''.join(pair))[0]
            pair = []
The key parts are format(bigInt, '04x'), which turns an integer into a hex string without the (in this case) unneeded '0x' while ensuring it is zero-padded to four characters, and bytes.fromhex, which turns that string into a bytes object suitable for struct.unpack.
As for whether you're missing something, that's hard for me to say, but I will say that the numbers you give look "reasonable" - that is, if you had the ordering wrong, I'd expect the numbers to be vastly different from each other, rather than slightly.

The simplest way is to use struct.pack to turn those numbers back into a byte string, then unpack as you were doing. pack and unpack can also work with multiple values at a time; the only snag is that pack expects individual arguments instead of a list, so you must put a * in front to expand the list.
>>> struct.unpack('>2f', struct.pack('>4H', *b))
(14.665565490722656, 17.24941635131836)
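For continuous polling, the same pack/unpack round trip can be wrapped in a helper that accepts any even number of registers. This is a sketch only; registers_to_floats is a hypothetical name, and it assumes every register is an unsigned 16-bit value in big-endian word order, as in the question:

```python
import struct

def registers_to_floats(registers):
    """Convert pairs of 16-bit big-endian registers into 32-bit floats."""
    if len(registers) % 2:
        raise ValueError("expected an even number of 16-bit registers")
    # ">H" packs each register as a big-endian unsigned 16-bit integer.
    raw = struct.pack(f">{len(registers)}H", *registers)
    # Every 4 bytes (two registers) form one big-endian float.
    return struct.unpack(f">{len(registers) // 2}f", raw)

b = [16746, 42536, 16777, 65230]
print(registers_to_floats(b))  # (14.665565490722656, 17.24941635131836)
```

If the instrument is ever switched to a different word order, the registers within each pair would have to be swapped before packing.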

How to take an integer array and convert it into other types?

I'm currently trying to take integer arrays that actually represent other data types and convert them into the correct datatype.
So for example, if I had the integer array [1196773188, 542327116], I discover from some other function that this integer array represents a string, convert it, and realize it represents the string "DOUGLAS". The first number translates to the hexadecimal number 0x47554F44 and the second to the hexadecimal number 0x2053414C. Using a hex to string converter, these correspond to the strings 'GUOD' and ' SAL' respectively, spelling DOUGLAS in a little-endian manner. The letters are backwards within individual elements of the array, which likely stems from the bytes being stored in little-endian order, although I might be mistaken on that.
These integer arrays could represent a number of datatypes, including strings, booleans, and floats.
I need to use Python 2.7, so I unfortunately can't use the bytes function.
Is there a simple way to convert an integer array to its corresponding datatype?

It seems that the struct module is the best way to go when converting between different types like this:
import struct
bufferstr = ""
dougarray = [1196773188, 542327116]
for num in dougarray:
    # "i" packs a 4-byte int in native byte order (little-endian on x86)
    bufferstr += struct.pack("i", num)
print bufferstr # Result is 'DOUGLAS ' (with a trailing space)
From this point on we can easily reinterpret the bytes of 'DOUGLAS ' as any datatype we want using struct.unpack():
print struct.unpack("f", bufferstr[0:4]) # Result is (54607.265625,)
We can only unpack a certain number of bytes at a time however. Thank you all for the suggestions!
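For readers on Python 3 (the question itself targets 2.7), a roughly equivalent sketch follows. It assumes the words are little-endian, as the byte order of 'DOUG' suggests, and spells that out with "<" instead of relying on the native byte order:

```python
import struct

dougarray = [1196773188, 542327116]

# "<2I": two unsigned 32-bit integers, least significant byte first.
raw = struct.pack("<2I", *dougarray)
print(raw)                  # b'DOUGLAS '
print(raw.decode("ascii"))  # DOUGLAS, followed by a trailing space

# The same bytes reinterpreted as other types.
print(struct.unpack("<f", raw[:4]))  # (54607.265625,)
print(struct.unpack("<?", raw[:1]))  # (True,)
```

Which interpretation is the right one cannot be derived from the integers alone; it has to come from whatever describes the data.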

How would I implement a bit map?

I wish to implement a 2d bit map class in Python. The class would have the following requirements:
Allow the creating of arbitrarily sized 2d bitmaps. i.e. to create an 8 x 8 bitmap (8 bytes), something like:
bitmap = Bitmap(8,8)
provide an API to access the bits in this 2d map as boolean or even integer values, i.e.:
if bitmap[1, 2] or bitmap.get(0, 1)
Able to retrieve the data as packed Binary data. Essentially it would be each row of the bit map concatenated and returned as Binary data. It may be padded to the nearest byte or something similar.
bitmap.data()
Be able to create new maps from the binary data retrieved:
new_bitmap = Bitmap(8, 8, bitmap.data())
I know Python is able to perform binary operations, but I'd like some suggestions as how best to use them to implement this class.

NumPy's bit-packing functions (packbits and unpackbits) do what you are looking for.
The example shows a 2x2x3 array of bits packed along its last axis into four 8-bit bytes. unpackbits unpacks uint8 arrays back into an array of 0/1 values that you can use in computations.
>>> import numpy as np
>>> a = np.array([[[1,0,1],
... [0,1,0]],
... [[1,1,0],
... [0,0,1]]])
>>> b = np.packbits(a,axis=-1)
>>> b
array([[[160],[64]],[[192],[32]]], dtype=uint8)
If you need 1-bit pixel images, PIL is the place to look.
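If neither library is an option, the interface from the question can also be sketched in pure Python on top of a bytearray. This is a minimal illustration, not a tuned implementation; the row padding and the most-significant-bit-first order are assumptions, chosen to match the default bit order of np.packbits:

```python
class Bitmap:
    """2D bitmap stored as packed bytes, each row padded to a whole byte."""

    def __init__(self, width, height, data=None):
        self.width, self.height = width, height
        self._stride = (width + 7) // 8  # bytes per row
        size = self._stride * height
        if data is None:
            self._bits = bytearray(size)
        else:
            if len(data) != size:
                raise ValueError("data length does not match dimensions")
            self._bits = bytearray(data)

    def _locate(self, key):
        x, y = key
        if not (0 <= x < self.width and 0 <= y < self.height):
            raise IndexError("bit index out of range")
        # Most significant bit first within each byte.
        return y * self._stride + x // 8, 0x80 >> (x % 8)

    def __getitem__(self, key):
        index, mask = self._locate(key)
        return bool(self._bits[index] & mask)

    def __setitem__(self, key, value):
        index, mask = self._locate(key)
        if value:
            self._bits[index] |= mask
        else:
            self._bits[index] &= ~mask & 0xFF

    def data(self):
        """Return the rows concatenated as an immutable bytes object."""
        return bytes(self._bits)


bitmap = Bitmap(8, 8)
bitmap[1, 2] = True
new_bitmap = Bitmap(8, 8, bitmap.data())
print(new_bitmap[1, 2], new_bitmap[0, 1])  # True False
print(len(bitmap.data()))                  # 8
```

Here bitmap[x, y] takes the column first and the row second; swap the unpacking in _locate if row-major indexing reads more naturally.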

No need to create this yourself.
Use the very good Python Imaging Library (PIL)
