I have a 10-bit .raw file that is packed into 8-bit bytes: after every 4 bytes, a 5th byte contains the 2 remaining bits of each of those 4 bytes (so after decoding I should get 10 bits per pixel).
raw file : <_io.TextIOWrapper name='img_10bit.raw' mode='r' encoding='cp1252'>
size of 1d array after reading using np.uint8 = (2611200,)
See this image for the bit placement.
How do I get the 10-bit data per pixel to reconstruct the image? I need to do this in Python.
Hint:
Read four bytes and shift them left by two bits.
Read a fifth byte and slice it in four parts:
"bitwise and" with 3;
shift right by two bits.
Add each 2-bit result of the "bitwise and" to its respective shifted byte and you get four 10-bit numbers.
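The hint above can be sketched in plain Python as follows. The function name `unpack_10bit` is made up for illustration, and the bit order inside the fifth byte is an assumption (pixel 0 takes the two lowest bits); check it against the bit-placement image and reverse the shifts if your format puts pixel 0 in the highest bits.

```python
def unpack_10bit(packed):
    """Unpack groups of 5 bytes into four 10-bit values per group."""
    packed = bytes(packed)  # accepts bytes, bytearray, or a uint8 buffer
    if len(packed) % 5 != 0:
        raise ValueError("length must be a multiple of 5 bytes")
    pixels = []
    for i in range(0, len(packed), 5):
        b0, b1, b2, b3, low = packed[i:i + 5]
        for n, high in enumerate((b0, b1, b2, b3)):
            # ASSUMPTION: pixel n uses bits 2n and 2n+1 of the fifth byte.
            two_bits = (low >> (2 * n)) & 3
            # Shift the 8 high bits left by two and add the 2 low bits.
            pixels.append((high << 2) | two_bits)
    return pixels
```

The file should be read in binary mode, for example `open('img_10bit.raw', 'rb').read()`; the text mode shown in the question (`mode='r'`, `encoding='cp1252'`) may alter or reject some byte values. The resulting flat list still has to be reshaped to the image's width and height, which are not given here.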
I am trying to visualize a DICOM file with Python 3 and pyDicom which should contain a black 100x100 image with some curves drawn in it. The pixel data is extracted from header (7fe0,0010) and when printed shows b'\x00\x00\x00...'. This I can easily convert to a 100x100 numpy array.
However, the curve data in (5000,3000) shows me b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xc0H@\x00\x00\x00\x00\x00\xc0X@\x00\x00\x00\x00\x00\xc0H@' which I am not able to convert to x,y coordinates in my 100x100 pixel image. In the DICOM file it says
curve dimensions: 2
number of points: 2
type of data: poly
data value representation: 3
curve label: horizontal axis
curve data: 32 elements
The main question is: How do I decode the coordinates required for retracing the curve within my 100x100 image? My main concern is the fact that there should be 32 elements, but only 26 hex values in the output. Also I have no clue how to deal with the \xc0H@ and \xc0X@. When I print those, it yields 192 72 64 and 192 88 64. How does Python decode these 2 hex codes to 6 numbers? And what do these numbers represent?
EDIT:
Apparently data value representation 3 means the data is represented as a floating point double. On the other hand, there should be two points in the data, so each point is represented by 16 elements? I don't see how these two statements are compatible. What is interesting is that the first \xc0H@ translates to 3 numbers as mentioned before, and by doing so completes the first 16 elements of the curve data. How can I convert this into a point in my 2D image?
Curve data has been retired in DICOM since 2004, so you will find the relevant information in the DICOM standard from 2004 (thanks to @kritzel_sw for the link).
As you already found out, Data Value Representation 3 means that the data entries are in double format, and with a Type of Data of polygon, you have x/y tuples in your data. As a double value is saved in 8 bytes, there are 16 bytes per point -- in your case (32 bytes of data) 2 points overall.
Pydicom does not (and probably will not) directly support the retired Curve module (though support for the Waveform module, the current equivalent, has been added in pydicom 2.1), so you have to decode the data yourself. You can do something like this (given double numbers):
from struct import unpack
from pydicom import dcmread
ds = dcmread(filename)
data = ds[0x50003000].value
# unpack('d') unpacks 8 bytes into a double
numbers = [unpack('d', data[i:i+8])[0] for i in range(0, len(data), 8)]
# I'm sure there is a nicer way for this...
coords = [(numbers[i], numbers[i+1]) for i in range(0, len(numbers), 2)]
In your example data, this will return:
[(0.0, 49.5), (99.0, 49.5)]
i.e. the x/y coordinates (0.0, 49.5) and (99.0, 49.5), which correspond to a horizontal line in the middle of your image.
As to the mismatch of 26 hex elements vs 32 bytes: a byte string representation shows only the bytes that cannot be converted to ASCII in hex string notation; the rest are shown as the corresponding ASCII characters. So, for example, this part of your byte string: \x00\xc0H@ is 4 bytes long and could also be represented as \x00\xc0\x48\x40 in hex string notation.
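To check both points without a DICOM file, here is a small sketch that rebuilds the 32 bytes from the question by hand and decodes them. It assumes little-endian doubles (`'<4d'`), which matches the values shown above; the variable names are only illustrative.

```python
import struct

# The byte string from the question, written with the printable bytes
# 0x48 ('H'), 0x58 ('X') and 0x40 ('@') as ASCII, the way Python prints them.
data = (b'\x00' * 13 + b'\xc0H@'
        + b'\x00' * 5 + b'\xc0X@'
        + b'\x00' * 5 + b'\xc0H@')

# 26 escaped or printable "elements" on screen, but 32 actual bytes.
print(len(data))

# ASSUMPTION: the doubles are stored little-endian.
numbers = struct.unpack('<4d', data)

# Pair consecutive values into (x, y) tuples.
coords = list(zip(numbers[0::2], numbers[1::2]))
print(coords)
```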
I'm a beginner with Python and want to make a program that converts a hex RGB value to a 15-bit RGB one (5 bits for every color). I heard that it can be done with bit shifts, but I don't get how, and I didn't find anything helpful on the internet. Can someone please help me?
If you're doing this manually (say, for homework), think of the problem like this:
If you have a six character hex representation of a color ("7FD87F" for instance), it's made up of the RGB components R: 7F, G: D8, B: 7F.
Two hexadecimal digits can encode 256 different states (16^2 = 256), so each of these components is 8 bits in size (256 = 2^8).
You want to transform your value into a color space where each component is represented in 5 bits. The way to do this is to throw away the three least significant bits from each of the components. For example:
0b10101010 => 0b10101
As you correctly mentioned, you would do this via bit-shifting. You'll then need to recombine the components. As a hint, here's how I would recombine the original 8-bit components into a single 24-bit representation:
(R << 16) + (G << 8) + (B << 0)
# or just B since a shift by zero is equivalent to
# multiplying by 2^0 is
# multiplying by 1 is
# the multiplicative identity
So the sketch of your algorithm, assuming you are starting with a hex string and not an integer:
1. Split the hex string into individual color components
2. Convert the string representations to numeric representations
3. Bit shift the components
4. Recombine the 5-bit components into a 15-bit representation
Additionally, steps 1 and 2 are interchangeable: you can convert the whole string to a number first and then split it into components with some bit-masking and a little more bit-shifting.
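Put together, the steps could look like the sketch below. The function name `hex_to_rgb555` is invented for this example, and the output layout (red in the top 5 bits, then green, then blue) is an assumption, since the question does not say which 15-bit layout is wanted.

```python
def hex_to_rgb555(hex_color):
    """Convert a 6-digit hex color such as '7FD87F' to a 15-bit value."""
    hex_color = hex_color.lstrip('#')
    if len(hex_color) != 6:
        raise ValueError("expected six hex digits")
    # Convert first, then split with masks (steps 1 and 2 swapped).
    value = int(hex_color, 16)
    r = (value >> 16) & 0xFF
    g = (value >> 8) & 0xFF
    b = value & 0xFF
    # Drop the three least significant bits of each 8-bit component.
    r5, g5, b5 = r >> 3, g >> 3, b >> 3
    # ASSUMPTION: red occupies the highest 5 bits, blue the lowest.
    return (r5 << 10) | (g5 << 5) | b5
```

For "7FD87F" the components 0x7F, 0xD8, 0x7F become 15, 27, 15, and `bin(hex_to_rgb555("7FD87F"))` shows the three 5-bit fields side by side.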
I've got a raw binary file that is a serial data dump of a GPS stream (along with some associated metadata). I'm specifically trying to pull a value out of the binary file that represents the GPS time; I know its offset and width in the file (10 and 8 bytes respectively, with a total frame width of 28 bytes) but it's encoded in a very weird way as described in the quote below.
What's the most Pythonic way to read this data (into a list or array)?
GPS TIME - GPS Sensor time (time of week in seconds, starting at
Saturday 2400 hours/ Sunday 0000 hours) if GPS Time Valid Message 3500
is set to 1, otherwise SDN500 system time since power up is reported.
Data words are in the order 2, 1 (MSW), 4 (LSW), 3.
A message word length is 16 bits on the SDN500–HV interface. However,
the SDN500–HV protocol, which uses a standard Universal Asynchronous
Receiver Transmitter (UART), transmits data in 8-bit groups (bytes).
This means that two bytes are required in order to make up one message
word.
A byte of information is transmitted as a sequence of 11 bits: one
start bit, 8 bits of data (least significant bit (LSB) first), one
parity bit (odd), and one stop bit. For each 16-bit data word, the
least significant byte is transmitted first, followed by the most
significant byte. Integer and floating point data types consisting of
more than one word are transmitted from the lowest numbered word to
the highest numbered word. The one exception to this rule is the time
tag, which is output in words 6-9 of each HV output message. The four
16-bit data words are in the following order: 2,1,4,3, where 1
represents the most significant word and 4 the least significant word.
Each word is separately byte-reversed.
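Start by opening the file in binary mode:
import struct
fin = open("20160128t184727_pps", "rb")
Then read in a frame:
def read_frame(f_handle):
    frame = f_handle.read(28)  # 28-byte frame size
    start_byte = 10
    end_byte = 18  # 4 words, each word is 2 bytes
    timestamp_raw = frame[start_byte:end_byte]
    # Four unsigned 16-bit words. ">" reads each word big-endian; use "<"
    # if the least significant byte comes first, as the quoted text suggests.
    timestamp_words = struct.unpack(">HHHH", timestamp_raw)
    return timestamp_words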
I could probably help more, but I don't understand where the timestamp start byte and end byte come from, as they do not seem to match the description you quoted. I also do not know what the expected output value is. If you provide those details I can probably help more.
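Taking the quoted description at face value, the word reordering could be sketched like this. Several things are assumptions here: the function name `decode_time_tag`, that each 16-bit word arrives least significant byte first (hence the default `"<"`), and that the 8 bytes form one 64-bit integer. If the field is really a floating point value, the reordered bytes would need to be reinterpreted instead.

```python
import struct

def decode_time_tag(raw, byte_order="<"):
    """Reassemble four 16-bit words sent in the order 2, 1, 4, 3."""
    if len(raw) != 8:
        raise ValueError("the time tag is 4 words of 2 bytes each")
    # Words arrive as 2, 1, 4, 3, where 1 is the most significant word.
    w2, w1, w4, w3 = struct.unpack(byte_order + "4H", raw)
    # ASSUMPTION: the result is a plain 64-bit unsigned integer.
    return (w1 << 48) | (w2 << 32) | (w3 << 16) | w4
```

With the frame layout from the question, this would be called as `decode_time_tag(frame[10:18])`. Without a known expected value it is not possible to confirm which byte order and data type are right, so compare the result against a timestamp you can verify.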
I am trying to decode the run-length-encoding described in this specification here.
it says:
There may be 1, 2, 3, or 4 bytes per count. The first two bits of the first count byte contains 0,1,2,3 indicating that the count is contained in 1, 2,3, or 4 bytes. Then the rest of the byte (6 bits) represent the six most significant bytes of the count. The next byte, if present, represents decreasing significance
I have successfully read the first 2 bits for the length, but am unable to figure out how to get the value encoded in the next 14 bits.
Here's how I got the length:
number_of_bytes = (firstbyte >> 6) + 1
It seems that the data is big endian. I have tried bit shifting and unpacking and repacking with different endiannesses, but I can't get the numbers I expect.
To get the 6 least significant bits, use
firstbyte & 0b111111
so to get a 14 bit value
((firstbyte & 0b111111) << 8) + secondbyte
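The same idea extends to counts of 1 to 4 bytes. The sketch below assumes the big-endian layout described in the question (the first byte carries the length in its top 2 bits and the most significant 6 bits of the count); `decode_count` is a name made up for this example.

```python
def decode_count(data, offset=0):
    """Return (count, number_of_bytes) for the count starting at offset."""
    first = data[offset]
    # Top two bits: 0..3 means the count occupies 1..4 bytes.
    number_of_bytes = (first >> 6) + 1
    chunk = data[offset:offset + number_of_bytes]
    if len(chunk) < number_of_bytes:
        raise ValueError("truncated count")
    # Remaining six bits are the most significant bits of the count.
    value = first & 0b111111
    # Each following byte has decreasing significance (big endian).
    for byte in chunk[1:]:
        value = (value << 8) | byte
    return value, number_of_bytes
```

Returning the number of bytes consumed lets the caller advance through the stream, for example `offset += number_of_bytes`.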
Given this example in Python
import base64
sample = '5PB37L2CH5DUDWN2SUOYE6LJPYCJBFM5N2FGVEHF7HD224UR52KB===='
a = base64.b32decode(sample)
b = base64.b32encode(a)
why is it that
sample != b ?
BUT where
sample = '5PB37L2CH5DUDWN2SUOYE6LJPYCJBFM5N2FGVEHF7HD224UR52KBAAAA'
then
sample == b
The first sample you've got there is invalid base64.
Taken from Wikipedia:
When the number of bytes to encode is not divisible by 3 (that is, if there are only one or two bytes of input for the last block), then the following action is performed: Add extra bytes with value zero so there are three bytes, and perform the conversion to base64. If there was only one significant input byte, only the first two base64 digits are picked, and if there were two significant input bytes, the first three base64 digits are picked. '=' characters might be added to make the last block contain four base64 characters.
http://en.wikipedia.org/wiki/Base64#Examples
edit:
taken from RFC 4648:
Special processing is performed if fewer than 24 bits are available
at the end of the data being encoded. A full encoding quantum is
always completed at the end of a quantity. When fewer than 24 input
bits are available in an input group, bits with value zero are added
(on the right) to form an integral number of 6-bit groups. Padding
at the end of the data is performed using the '=' character.
4 times 8 bits (the ='s at the end of your sample) is more than 24 bits, so they are at the least unnecessary. (I'm not sure what datatype sample is, but find out and take its size times the number of characters, divided by 24.)
About your particular sample:
Base-encoding reads in 24-bit chunks and only needs '=' padding characters at the end of the encoded string to make whatever was left after splitting it into 24-bit chunks be "of size 24", so it can be parsed by the decoder.
Since the ===='s at the end of your string amount to more than 24 bits, they are useless, hence invalid.
First, let's be clear: your question is about base32, not base64.
Your original sample is a bit too long. There are four = padding characters at the end, meaning at least 20 bits of padding. The number of data bits must be a multiple of 8, so it's really 24 bits of padding. The encoding for B in base32 is 1, which means one of the padding bits is set. This is a violation of the spec, which says all the padding bits must be clear. The decode drops the bit completely, and the encode produces the proper value A instead of B.
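A short round trip makes the dropped bit visible. This is only a sketch using the standard `base64` module; note that in Python 3 `b32encode` returns `bytes`, so the result is decoded to `str` before comparing (comparing a `str` with `bytes` is always unequal).

```python
import base64

sample = '5PB37L2CH5DUDWN2SUOYE6LJPYCJBFM5N2FGVEHF7HD224UR52KB===='

decoded = base64.b32decode(sample)
reencoded = base64.b32encode(decoded).decode('ascii')

# Show where the two strings differ, if anywhere.
diff = [(i, x, y) for i, (x, y) in enumerate(zip(sample, reencoded)) if x != y]
print(len(decoded), diff)

# The variant without padding carries whole bytes only and round-trips.
full = '5PB37L2CH5DUDWN2SUOYE6LJPYCJBFM5N2FGVEHF7HD224UR52KBAAAA'
full_roundtrip = base64.b32encode(base64.b32decode(full)).decode('ascii')
```

With CPython's decoder, which does not reject the stray bit, the only difference should be the last data character before the padding: B in the input and A in the re-encoded output.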