Defining type of binary file string - python

Is it possible to check what type of data a binary string represents?
For example:
We have binary data read from an image file (e.g. example.jpg). Can we guess that this binary data represents an image file? If so, how can we do that?

You could look at the python-magic library. It seems to do what you want (although I've never used it myself).
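python-magic wraps libmagic and can identify a buffer directly with magic.from_buffer(data). If you only need to recognize a handful of common formats, you can also check the well-known file signatures ("magic numbers") yourself with nothing but the standard library. The table below is a minimal sketch covering a few formats, not an exhaustive detector:

```python
# Minimal file-type sniffing by magic numbers (stdlib only).
# Only a few common signatures are listed here; a real detector
# such as python-magic/libmagic knows hundreds more.
SIGNATURES = [
    (b"\xff\xd8\xff", "jpeg"),        # JPEG/JFIF
    (b"\x89PNG\r\n\x1a\n", "png"),    # PNG
    (b"GIF87a", "gif"),               # GIF (1987 variant)
    (b"GIF89a", "gif"),               # GIF (1989 variant)
    (b"%PDF-", "pdf"),                # PDF
]

def sniff_type(data: bytes):
    """Return a type name if the buffer starts with a known signature, else None."""
    for sig, name in SIGNATURES:
        if data.startswith(sig):
            return name
    return None
```

With python-magic installed, magic.from_buffer(data) (or magic.from_buffer(data, mime=True)) gives a much richer answer than a hand-rolled table like this.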

Related

Python unpack equivalent in vba

Does anyone know the equivalent of the Python unpack function in VBA? Ultimately, I'm trying to read a file as binary, load the data into a Byte array, then convert certain portions of the byte array into floating point numbers using little-endian ordering. I can do this in Python, but would prefer to use VBA for other reasons.
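For reference when porting, the Python side of this is typically struct.unpack with the "<" (little-endian) byte-order prefix. The buffer and offsets below are made up for illustration:

```python
import struct

# A synthetic byte buffer: two little-endian 32-bit floats back to back.
# In the real use case this would be the Byte array read from the file.
buf = struct.pack("<ff", 1.5, -2.25)

# Pull individual floats out of arbitrary offsets in the buffer.
(first,) = struct.unpack_from("<f", buf, offset=0)
(second,) = struct.unpack_from("<f", buf, offset=4)
```

unpack_from avoids slicing the buffer by hand; "<d" would do the same for 64-bit doubles.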

Reading binary big endian files in python

I'd like to use Python to read a large binary file of IEEE big-endian 64-bit floating-point values, but I'm having trouble getting the correct numbers. I have a working method in MATLAB, as below:
fid=fopen(filename,'r','ieee-be');
data=fread(fid,inf,'float64',0,'ieee-be');
fclose(fid)
I've tried the following in Python:
data = np.fromfile(filename, dtype='>f', count=-1)
This method doesn't throw any errors, but the values it reads are extremely large and incorrect. Can anyone help with a way to read these files? Thanks in advance.
Using >f will give you a single-precision (32-bit) floating point value. Instead, try
data = np.fromfile(filename, dtype='>f8', count=-1)
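A quick round-trip sketch showing the difference (assuming NumPy is available; the temporary file stands in for the real data file):

```python
import os
import tempfile
import numpy as np

values = np.array([1.5, -2.25, 1e10])

# Write the values as big-endian 64-bit floats, the layout the
# MATLAB 'ieee-be'/'float64' code would have produced.
with tempfile.NamedTemporaryFile(delete=False) as f:
    values.astype(">f8").tofile(f)
    path = f.name

wrong = np.fromfile(path, dtype=">f")    # 32-bit: twice as many, garbage values
right = np.fromfile(path, dtype=">f8")   # 64-bit: the original values
os.remove(path)
```

Reading 8-byte values with a 4-byte dtype splits each number in half, which is exactly the "extremely large and incorrect" symptom described.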

storing matrices in golang in compressed binary format

I am exploring a comparison between Go and Python, particularly for mathematical computation. I noticed that Go has a matrix package mat64.
1) I wanted to ask someone who uses both Go and Python whether there are comparable tools for Go's matrices that are equivalent to NumPy's savez_compressed, which stores data in the npz format (i.e. "compressed" binary, multiple matrices per file)?
2) Also, can Go's matrices handle string types like NumPy does?
1) .npz is a numpy-specific format. It is unlikely that Go itself would ever support this format in the standard library. I also don't know of any third-party library that exists today, and a (10-second) search didn't turn one up. If you need npz specifically, go with Python + numpy.
If you just want something similar from Go, you can use any format. Binary options include Go's encoding/binary and encoding/gob packages. Depending on what you're trying to do, you could even use a non-binary format like JSON and just compress it on your own.
2) Go doesn't have built-in matrices. That library you found is third party, and it only handles float64s.
However, if you just need to store strings in matrix (n-dimensional) form, you would use an n-dimensional slice. For two dimensions it looks like this: var myStringMatrix [][]string.
npz files are zip archives. Archiving and (optional) compression are handled by Python's zipfile module. The npz contains one npy file for each variable that you save. Any OS-level archiving tool can decompress and extract the component .npy files.
So the remaining question is: can you simulate the npy format? It isn't trivial, but it isn't difficult either. It consists of a header block that contains shape, strides, dtype, and order information, followed by a data block, which is, effectively, a byte image of the array's data buffer.
So the buffer information and data are closely tied to the numpy array's contents. And if the variable isn't a normal array, save falls back to the Python pickle mechanism.
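You can verify the zip-of-npy structure directly. This sketch assumes NumPy; the array names a and b are arbitrary:

```python
import io
import zipfile
import numpy as np

a = np.arange(6).reshape(2, 3)
b = np.eye(2)

# savez_compressed writes a standard zip archive...
buf = io.BytesIO()
np.savez_compressed(buf, a=a, b=b)

# ...containing one .npy member per saved array, named after the keyword.
buf.seek(0)
names = sorted(zipfile.ZipFile(buf).namelist())

# np.load reads the same archive back by variable name.
buf.seek(0)
restored = np.load(buf)
```

Since the container is plain zip, a Go program could open the archive with archive/zip and would only need its own reader for the .npy members.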
For a start I'd suggest using the csv format. It's not binary, and not fast, but everyone and his brother can generate and read it. We constantly get SO questions about reading such files using np.loadtxt or np.genfromtxt. Look at the code for np.savetxt to see how numpy produces such files. It's pretty simple.
Another general-purpose choice would be JSON, using the tolist form of an array. That comes to mind because Go is Google's home-grown alternative to Python for web applications. JSON is a cross-language format based on simplified JavaScript syntax.
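The JSON route is a one-liner on the Python side, and anything that can parse JSON (including Go's encoding/json) can read the result back:

```python
import json
import numpy as np

m = np.array([[1.0, 2.0], [3.0, 4.0]])

# tolist() converts the array into nested Python lists,
# which JSON serializes natively.
text = json.dumps(m.tolist())

# Round-trip back to an array.
m2 = np.array(json.loads(text))
```

This loses the dtype (everything comes back as whatever json infers), so it suits interchange better than exact archival.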

pysnmp Command Responder - handling managed objects value classes

I'm developing a command responder with pysnmp, based on
http://pysnmp.sourceforge.net/examples/current/v3arch/agent/cmdrsp/v2c-custom-scalar-mib-objects.html
My intention is to answer to the get message of my managed objects by reading the snmp data from a text file (updated over time).
I'm polling the responder using snmpB, drawing a graph of the polled object value evolution.
I've successfully modified the example, exporting my first managed object by adding it with mibBuilder.exportSymbols() and retrieving the values from the txt file in the modified getValue method. I'm able to poll this object with success. It's a Counter32-type object.
The next step is to handle other objects with a value type different from the "supported" classes like Integer32, Counter32, and OctetString.
I need to handle floating-point values or other specific data formats defined within MIB files, because snmpB expects these specific formats to plot the graph correctly.
Unfortunately, I can't figure out a way to do this.
Hope someone can help,
Mark
EDIT 1
The textual-convention I need to implement is the Float32TC defined in FLOAT-TC-MIB from RFC6340:
Float32TC ::= TEXTUAL-CONVENTION
STATUS current
DESCRIPTION "This type represents a 32-bit (4-octet) IEEE
floating-point number in binary interchange format."
REFERENCE "IEEE Standard for Floating-Point Arithmetic,
Standard 754-2008"
SYNTAX OCTET STRING (SIZE(4))
There is no native floating point type in SNMP, and you can't add radically new types to the protocol. But you can put additional constraints on existing types or modify value representation via TEXTUAL-CONVENTION.
To represent floating point numbers you have two options:
encode the floating point number into a 4-octet stream and pass it as an OCTET STRING type (RFC 6340)
use the INTEGER type along with some TEXTUAL-CONVENTION to represent the integer as a float
Whatever values are defined in a MIB, they are always based on some built-in SNMP type.
You could automatically generate pysnmp MibScalar classes from your ASN.1 MIB with the pysmi tool, then manually add MibScalarInstance classes with some system-specific code, thus linking pysnmp to your data sources (like text files).
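The RFC 6340 encoding itself is just the raw big-endian IEEE-754 bytes of the value packed into a 4-octet string, which struct handles directly. This is only the value encoding; wiring it into a pysnmp TEXTUAL-CONVENTION class is a separate step:

```python
import struct

def float_to_tc(value: float) -> bytes:
    """Encode a float as the 4-octet big-endian IEEE-754 string of Float32TC."""
    return struct.pack(">f", value)

def tc_to_float(octets: bytes) -> float:
    """Decode a 4-octet Float32TC value back into a Python float."""
    (value,) = struct.unpack(">f", octets)
    return value
```

The resulting bytes can be carried in an OctetString object, satisfying the SYNTAX OCTET STRING (SIZE(4)) constraint of the textual convention.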

Read and Write to McIDAS Area files in Python

I'm working with GOES data from NOAA, and need to save the result off in McIDAS Area format. Currently, I'm using the struct module to pack/unpack the binary data according to the byte format described in the McIDAS Programmer's Guide.
Is there something better suited for dealing with these binary formats (hopefully something that can be used with Python)?
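One way to make struct-based code more maintainable is to name the header words and unpack them in a single call. The field names and layout below are purely illustrative, not the actual McIDAS area directory; substitute the real word layout from the McIDAS Programmer's Guide:

```python
import struct
from collections import namedtuple

# Hypothetical fixed-size header: three 32-bit ints and one 32-bit float,
# big-endian. The real McIDAS directory layout should be taken from the
# Programmer's Guide; this only demonstrates the pattern.
Header = namedtuple("Header", "lines elements bands scale")
HEADER_FMT = ">iiif"

def read_header(raw: bytes) -> Header:
    """Unpack the fixed-size header block from the start of a buffer."""
    return Header._make(struct.unpack_from(HEADER_FMT, raw, 0))

def write_header(h: Header) -> bytes:
    """Pack a header back into its binary form."""
    return struct.pack(HEADER_FMT, *h)
```

A numpy structured dtype is the other common choice for this kind of record, and makes the data block itself (e.g. the image raster) trivial to read in one call.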
