What is `readUInt32BE` of nodeJS in Python?

What is `readUInt32BE` of nodeJS in Python? - python

I am translating a source of NodeJS to Python. However there is a function readUInt32BE that I do not quite understand how it works
Original Code
const buf = Buffer.from("vgEAAwAAAA1kZXYubG9yaW90LmlvzXTUl6ESlOrvJST-gsL_xQ==", 'base64');
const appId = parseInt(buf.slice(0, 4).toString('hex'), 16);
const serverIdLength = buf.slice(4, 8).readUInt32BE(0);
Here is what I have tried so far in Python
encodeToken = base64.b64decode("vgEAAwAAAA1kZXYubG9yaW90LmlvzXTUl6ESlOrvJST-gsL_xQ==")
appId = encodeToken[:4]
appId = appId.hex()
serverIdLength = ......
If possible, can you write a function that works the same as readUInt32BE(0) and explain it for me ? Thanks

I'm assuming from the name that the function interpreters an arbitrary sequence of 4 bytes as an unsigned 32-bit (big endian) integer.
The corresponding Python function would be struct.unpack with an appropriate format string.
import struct
appId = encodeToken[:4]
serverIdLength = struct.unpack(">I", appId)[0]
# ">" means "big-endian"
# "I" means 4-byte unsigned integer
No need to to get the a hex representation of the bytes first. unpack always returns a tuple, even if only one value is created by the format string, so you need to take the first element of that tuple as the final value.

Related

How to pack a character and a number correctly?

I'm learning about client-server communication in python, and I want to send some packed structures.I want to pack a mathematical sign and a number. I tried like this:
idx = 50
value1 = "<"
value2 = idx
packer = struct.Struct('1s I')
packed_data = packer.pack(*value1, *value2)
But I got the error:
packed_data = packer.pack(*value1, *value2)
TypeError: 'int' object is not iterable
or this error:
packed_data = packer.pack(*value1, *value2)
struct.error: argument for 's' must be a bytes object
If I try like this:
value2 = [idx]
I don't know how to do this correctly.

The first problem is that you are unnecessarily trying to (sequence-)unpack your arguments. The Struct format expects a bytes and an int, and you (almost) already have them.
The second problem is that "<" is a Unicode string, and pack expects bytes instead. You need to properly encode the string first.
packed_data = packer.pack(value1.encode('utf-8'), value2)
The particular encoding you use doesn't matter, as long as you use the same one to unpack the data.
Note that if you did have a Unicode character that couldn't be encoded in one byte, your string format would be wrong. The struct module doesn't handle variable-length strings by itself, so it would probably be simpler to just encode the int by itself and concatenated that with your encoded string.
value =
packed_data = value1.encode('utf-8') + struct.pack("I", value2)

Why does the size of this Python String change on a failed int conversion

From the tweet here:
import sys
x = 'ñ'
print(sys.getsizeof(x))
int(x) #throws an error
print(sys.getsizeof(x))
We get 74, then 77 bytes for the two getsizeof calls.
It looks like we are adding 3 bytes to the object, from the failed int call.
Some more examples from twitter (you may need to restart python to reset the size back to 74):
x = 'ñ'
y = 'ñ'
int(x)
print(sys.getsizeof(y))
77!
print(sys.getsizeof('ñ'))
int('ñ')
print(sys.getsizeof('ñ'))
74, then 77.

The code that converts strings to ints in CPython 3.6 requests a UTF-8 form of the string to work with:
buffer = PyUnicode_AsUTF8AndSize(asciidig, &buflen);
and the string creates the UTF-8 representation the first time it's requested and caches it on the string object:
if (PyUnicode_UTF8(unicode) == NULL) {
assert(!PyUnicode_IS_COMPACT_ASCII(unicode));
bytes = _PyUnicode_AsUTF8String(unicode, NULL);
if (bytes == NULL)
return NULL;
_PyUnicode_UTF8(unicode) = PyObject_MALLOC(PyBytes_GET_SIZE(bytes) + 1);
if (_PyUnicode_UTF8(unicode) == NULL) {
PyErr_NoMemory();
Py_DECREF(bytes);
return NULL;
}
_PyUnicode_UTF8_LENGTH(unicode) = PyBytes_GET_SIZE(bytes);
memcpy(_PyUnicode_UTF8(unicode),
PyBytes_AS_STRING(bytes),
_PyUnicode_UTF8_LENGTH(unicode) + 1);
Py_DECREF(bytes);
}
The extra 3 bytes are for the UTF-8 representation.
You might be wondering why the size doesn't change when the string is something like '40' or 'plain ascii text'. That's because if the string is in "compact ascii" representation, Python doesn't create a separate UTF-8 representation. It returns the ASCII representation directly, which is already valid UTF-8:
#define PyUnicode_UTF8(op) \
(assert(_PyUnicode_CHECK(op)), \
assert(PyUnicode_IS_READY(op)), \
PyUnicode_IS_COMPACT_ASCII(op) ? \
((char*)((PyASCIIObject*)(op) + 1)) : \
_PyUnicode_UTF8(op))
You also might wonder why the size doesn't change for something like '１'. That's U+FF11 FULLWIDTH DIGIT ONE, which int treats as equivalent to '1'. That's because one of the earlier steps in the string-to-int process is
asciidig = _PyUnicode_TransformDecimalAndSpaceToASCII(u);
which converts all whitespace characters to ' ' and converts all Unicode decimal digits to the corresponding ASCII digits. This conversion returns the original string if it doesn't end up changing anything, but when it does make changes, it creates a new string, and the new string is the one that gets a UTF-8 representation created.
As for the cases where calling int on one string looks like it affects another, those are actually the same string object. There are many conditions under which Python will reuse strings, all just as firmly in Weird Implementation Detail Land as everything we've discussed so far. For 'ñ', the reuse happens because this is a single-character string in the Latin-1 range ('\x00'-'\xff'), and the implementation stores and reuses those.

Python (rospy) to C++ (roscpp) struct.unpack

I am currently translating a rospy IMU-driver to roscpp and have difficulites figuring out what this piece of code does and how I can translate it.
def ReqConfiguration(self):
"""Ask for the current configuration of the MT device.
Assume the device is in Config state."""
try:
masterID, period, skipfactor, _, _, _, date, time, num, deviceID,\
length, mode, settings =\
struct.unpack('!IHHHHI8s8s32x32xHIHHI8x', config)
except struct.error:
raise MTException("could not parse configuration.")
conf = {'output-mode': mode,
'output-settings': settings,
'length': length,
'period': period,
'skipfactor': skipfactor,
'Master device ID': masterID,
'date': date,
'time': time,
'number of devices': num,
'device ID': deviceID}
return conf
I have to admit that I never ever worked with neither ros nor python before.
This is no 1:1 code from the source, I removed the lines I think I know what they do, but especially the try-block is what I don't understand. I would really appreciate help, because I am under great preasure of time.
If someone is curious(context reasons): The files I have to translate are mtdevice.py , mtnode.py and mtdef.py and can be found googleing for the filesnames + the keyword ROS IMU Driver
Thanks a lot in advance.

This piece of code unpacks the fields of a C structure, namely masterID, period, skipfactor, _, _, _, date, time, num, deviceID, length, mode, settings, stores those in a Python dictionary and returns that dictionary as call result. The underscores are placeholders for the parts of the struct that aren't used.
See also: https://docs.python.org/2/library/struct.html, e.g. for a description of the format string ('!IHHHHI8s8s32x32xHIHHI8x') that tells the unpack function what the struct looks like.
The syntax a, b, c, d = f () means that the function returns a thing called a tuple in Python. By assigning a tuple to multiple variables, it's split into its fields.
Example:
t = (1, 2, 3, 4)
a, b, c, d = t
# At this point a == 1, b == 2, c == 3, d == 4
To replace this piece of code by C++ should not be too hard, since C++ has structs much like C. So the simplest C++ implementation of requestConfiguration would be to just return that struct. If you want to stay closer to the Python functionality, your function could put the fields of the struct into a C++ STL map and return that. The format string + the docs that the link points to, tell you what data types are in your struct and where.
Note that it's the second parameter of unpack that holds your data, the first parameter just contains information on the layout (format) of the second parameter, as explained in the link. The second parameter looks to Python as if it's a string, but it's actually a C struct. The first parameter tells Python where to find what in that struct.
So if you read the docs on format strings, you can find out the layout of your second parameter (C struct). But maybe you don't need to. It depends on the caller of your function. It may just expect the plain C struct.
From your added comments I understand that there's more code in your function than you show. The fields of the structs are assigned to attributes of a class.
If you know the field names of your C struct (config) then you can assign them directly to the attributes of your C++ class.
// Pointer 'this' isn't needed but inserted for clarity
this->mode = config.mode;
this->settings = config.settings;
this->length = config.length;
I've assumed that the field names of the config struct are indeed mode, settings, length etc. but you'd have to verify that. Probably the layout of this struct is declared in some C header file (or in the docs).

To do the same thing with C++, you'd declare a struct with the various parameters:
struct DeviceRecord {
uint32_t masterId;
uint16_t period, skipfactor, _a, _b;
uint32_t _c;
char date[8];
char time[8];
char padding[64];
uint16_t num;
uint32_t deviceID;
uint16_t length, mode;
uint32_t settings;
char padding[8];
};
(It's possible this struct is already declared somewhere; it might also use "unsigned int" instead of "uint32_t" and "unsigned short" instead of "uint16_t", and _a, _b, _c would probably have real names.)
Once you have your struct, the question is how to get the data. That depends on where the data is. If it's in a file, you'd do something like this:
DeviceRecord rec; // An instance of the struct, whatever it's called
std::ifstream fin("yourfile.txt", std::ios::binary);
fin.read(reinterpret_cast<char*>(&rec), sizeof(rec));
// Now you can access rec.masterID etc
On the other hand, if it's somewhere in memory (ie, you have a char* or void* to it), then you just need to cast it:
void* data_source = get_data(...); // You'd get this from somewhere
DeviceRecord* rec_ptr = reinterpret_cast<DeviceRecord*>(stat_source);
// Now you can access rec_ptr->masterID etc
If you have a std::vector, you can easily get such a pointer:
std::vector<uint8_t> data_source = get_data(...); // As above
DeviceRecord* rec_ptr = reinterpret_cast<DeviceRecord*>(data_source.data());
// Now you can access rec_ptr->masterID etc, provided data_source remains in scope. You should probably also avoid modifying data_source.
There's one more issue here. The data you've received is in big-endian, but unless you have a PowerPC or other unusual processor, you're probably on a little-endian machine. So you need to do a little byte-swapping before you access the data. You can use the following function to do this.
template<typename Int>
Int swap_int(Int n) {
if(sizeof(Int) == 2) {
union {char c[2]; Int i;} swapper;
swapper.i = n;
std::swap(swapper.c[0], swapper.c[1]);
n = swapper.i;
} else if(sizeof(Int) == 4) {
union {char c[4]; Int i;} swapper;
swapper.i = n;
std::swap(swapper.c[0], swapper.c[3]);
std::swap(swapper.c[1], swapper.c[2]);
n = swapper.i;
}
return n;
}
These return the swapped value rather than changing it in-place, so now you'd access your data with something like swap_int(rec->num). NB: The above byte-swapping code is untested; I'll try compiling it a bit later and fix it if necessary.
Without more information, I can't give you a definitive way of doing this, but perhaps this will be enough to help you work it out on your own.

Python 3 compatibility issue

Description of problem
I have to migrate some code to Python 3. The compilation terminated with success. But I have a problem on the runtime:
static PyObject* Parser_read(PyObject * const self, PyObject * unused0, PyObject * unused1) {
//Retrieve bytes from the underlying data stream.
//In this case, an iterator
PyObject * const i = PyIter_Next(self->readIterator);
//If the iterator returns NULL, then no more data is available.
if(i == NULL)
{
Py_RETURN_NONE;
}
//Treat the returned object as just bytes
PyObject * const bytes = PyObject_Bytes(i);
Py_DECREF(i);
if( not bytes )
{
//fprintf(stderr, "try to read %s\n", PyObject_Str(bytes));
PyErr_SetString(PyExc_ValueError, "iterable must return bytes like objects");
return NULL;
}
....
}
In my python code, I have something like that:
for data in Parser(open("file.txt")):
...
The code works well on Python 2. But on Python 3, I got:
ValueError: iterable must return bytes like objects
Update
The solution of #casevh works well in all test cases except one: when I wrap the stream:
def wrapper(stream):
for data in stream:
for i in data:
yield i
for data in Parser(wrapper(open("file.txt", "rb"))):
...
and I got:
ValueError: iterable must return bytes like objects

One option is to open the file in binary mode:
open("file.txt", "rb")
That should create an iterator that returns a sequence of bytes.
Python 3 strings are assumed to be Unicode and without proper encoding/decoding, they shouldn't be interpreted as a sequence of bytes. If you are reading plain ASCII text, and not a binary data stream, you could also convert from Unicode to ASCII. See PyUnicode_AsASCIIString() and related functions.

As noted by #casevh, in Python you need to decide whether your data is binary or text. The fact that you are iterating lines makes me think that the latter is the case.
def wrapper(stream):
for data in stream:
for i in data:
yield i
works in Python 2, because iterating a str will yield 1-character strings; in Python 3, iterating over a bytes object will yield individual bytes that are integers in range 0 - 255. You can get the the code work identically in Python 2 and 3 (and identically to the Python 2 behaviour of the code above) by using range and slicing 1 byte/character at a time:
def wrapper(stream):
for data in stream:
for i in range(len(data)):
yield data[i:i + 1]
P.S. You also have a mistake in your C extension code: Parser_read takes 3 arguments, 2 of which are named unused_x. Only a method annotated with METH_KEYWORDS takes 3 arguments (PyCFunctionWithKeywords); all others, including METH_NOARGS must be functions taking 2 arguments (PyCFunction).

Python - Convert a signed float to Unsigned Long( DWORD for win32 )

I know python doesn't have unsigned variables but I need to convert one from a program that runs python( Blender ) to a win32 application written in C++. I know I can convert an integer like so:
>>> int i = -1
>>> _ + 2**32
How can I take a float like: 0.2345f and convert it to a long type? I will need to convert to long in python and then back to float in win32( c++ )...
typically in C++ it is down by
>>>float f = 0.2345f;
>>>DWORD dw = *reinterpret_cast< DWORD* >( &f );
this produces an unsigned long ... and to convert it back is simply the reverse:
>>>FLOAT f = *reinterpret_cast< FLOAT* >( &dw );

You can use struct.pack and struct.unpack for this. Note though that it is not a cast (i.e. a reinterpretation of the same memory), but a converter (copy to a new piece of memory).
import struct
def to_float(int_):
return struct.unpack('d', struct.pack('q', int_))[0]
def to_long(float_):
return struct.unpack('q', struct.pack('d', float_))[0]
data = 0.2345
long_data= to_long(data) #4597616773191482474
new_data = to_float(long_data) #0.2345

i = 0.2345
converted = long(i)

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

What is `readUInt32BE` of nodeJS in Python? - python

Related

How to pack a character and a number correctly?

Why does the size of this Python String change on a failed int conversion

Python (rospy) to C++ (roscpp) struct.unpack

Python 3 compatibility issue

Python - Convert a signed float to Unsigned Long( DWORD for win32 )

Categories

Resources