Python: parse VARIANT (?)


I have to read a file in python that uses Microsoft VARIANT (I think - I really don't know much about Microsoft code :S). Basically I want to know if there are python packages that can do this for me.
To explain - the file I'm trying to read is just a whole bunch of { 2-byte integer, <data> } repeated over and over, where the 2-byte integer specifies what the <data> is.
The 2-byte integer corresponds to the Microsoft data types in VARIANT: VT_I2, VT_I4, etc, and based on the type I can write code to read in and coerce <data> to an appropriate Python object.
My current attempt is along the following lines:
values = []
while True:
    dtype = file.read(2)
    if not dtype:
        break  # end of file
    value = None
    # translate dtype (I've put in VT_XX myself to match up with Microsoft)
    if dtype == VT_I2:
        value = file.read(2)
    elif dtype == VT_I4:
        value = file.read(4)
    # ... and so on for other types
    # append value to the list of values
    values.append(value)
# return the values we read
return values
The thing is, I'm having trouble working out how to convert some of the bytes to the appropriate Python object (for example VT_BSTR, VT_DECIMAL, VT_DATE). However before I try further, I'd like to know if there are any existing python packages that do this logic for me (i.e. take in a file object/bytes and parse it into a set of python objects, be they float, int, dates, strings, ...).
It just seems like this is a fairly common thing to do.
However, I've been having difficulty finding packages that do it: since I don't know anything about Microsoft code, I don't have the terminology to do the appropriate googling. (If it is relevant, I am running Linux.)

The win32com package in pywin32 will do just that for you. The documentation is quite underwhelming, but the included variant.html explains the basic use, and there are plenty of tutorials and references online.
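Where pywin32 is not an option (it is a Windows-only package), the fixed-size types can be decoded with the standard struct module. The following is only a sketch under stated assumptions: the file is little-endian, each value follows its 2-byte type code with no padding, and VT_DATE values are non-negative day counts from 30 December 1899. The function name parse_variants and the FORMATS table are hypothetical; variable-length types such as VT_BSTR are left out because their on-disk layout depends on the program that wrote the file.

```python
import io
import struct
from datetime import datetime, timedelta

# VARENUM type codes as defined in the Windows headers.
VT_I2, VT_I4, VT_R4, VT_R8, VT_DATE = 2, 3, 4, 5, 7

# struct format per type code; the little-endian prefix '<' is an assumption.
FORMATS = {VT_I2: '<h', VT_I4: '<i', VT_R4: '<f', VT_R8: '<d', VT_DATE: '<d'}

OLE_EPOCH = datetime(1899, 12, 30)  # day zero of an OLE automation date


def parse_variants(stream):
    """Read {type code, data} pairs from a binary stream until it ends."""
    values = []
    while True:
        header = stream.read(2)
        if len(header) < 2:
            break  # end of stream
        (vt,) = struct.unpack('<H', header)
        fmt = FORMATS.get(vt)
        if fmt is None:
            raise ValueError('unhandled VARIANT type code %d' % vt)
        (value,) = struct.unpack(fmt, stream.read(struct.calcsize(fmt)))
        if vt == VT_DATE:
            # Assumes a non-negative date: whole days plus a fraction of a day.
            value = OLE_EPOCH + timedelta(days=value)
        values.append(value)
    return values


# Build a small in-memory sample instead of reading a real file.
sample = io.BytesIO(
    struct.pack('<Hh', VT_I2, -5)
    + struct.pack('<Hi', VT_I4, 100000)
    + struct.pack('<Hd', VT_DATE, 2.5)
)
parsed = parse_variants(sample)
```

With a real file, the stream would be opened in binary mode, for example open(path, 'rb').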

Related

Key 'boot_num' is not recognized when being interpreted from a .JSON file

Currently, I am working on a boot sequence in Python for a larger project. For this specific part of the sequence, I need to open a JSON file (specs.json) and load it as a dictionary in the main program. I then need to take a value from the file, looked up by its key, and add 1 to it. Once that's done, I need to write the change back to the JSON file. Yet every time I run the code below, I get the error:
bootNum = spcInfDat['boot_num']
KeyError: 'boot_num'
Here's the code I currently have:
(Note: I'm using the Python json library, and have imported dumps, dump, and load.)
# Opening of the JSON files
spcInf = open('mki/data/json/specs.json',) # .JSON file that contains the current system's specifications. Not quite needed, but it may make a nice reference?
spcInfDat = load(spcInf)
This code is later followed by this, where I attempt to assign the value to a variable by using its dictionary key (the for statement was a debug statement, so I could see the key):
for i in spcInfDat['spec']:
    print(CBL + str(i) + CEN)
# Locating and increasing the value of bootNum.
bootNum = spcInfDat['boot_num']
print(str(bootNum))
bootNum = bootNum + 1
(Another Note: CBL and CEN are just variables I use to colour text I send to the terminal.)
This is the interior of specs.json:
{
    "spec": [
        {
            "os":"name",
            "os_type":"getwindowsversion",
            "lang":"en",
            "cpu_amt":"cpu_count",
            "storage_amt":"unk",
            "boot_num":1
        }
    ]
}
I'm relatively new to JSON files and to the Python json library; my only experience with them comes from some GeeksforGeeks tutorials I found. There is a rather good chance that I just don't know how JSON files work in conjunction with the library, but I figured it would still be worth a shot to ask here. The GeeksforGeeks tutorial did not cover this, and I know very little about how it works, so I'm lost. I've tried searching here and have found nothing.
Issue Number 2
Now, the prior part works. But, when I attempt to run the code on the following lines:
# Changing the values of specDict.
print(CBL + "Changing values of specDict... 50%" + CEN)
specDict = {
    "os":name,
    "os_type":ost,
    "lang":"en",
    "cpu_amt":cr,
    "storage_amt":"unk",
    "boot_num":bootNum
}
# Writing the product of makeSpec to `specs.json`.
print(CBL + "Writing makeSpec() result to `specs.json`... 75%" + CEN)
jsonobj = dumps(specDict, indent = 4)
with open('mki/data/json/specs.json', "w") as outfile:
    dump(jsonobj, outfile)
I get the error:
TypeError: Object of type builtin_function_or_method is not JSON serializable.
Is there a chance that I set up my dictionary incorrectly, or am I using the dump function incorrectly?

You can show the data using:
print(spcInfDat)
This shows it to be a dictionary whose single entry 'spec' holds an array, whose zeroth element is a sub-dictionary, whose 'boot_num' entry is an integer.
{'spec': [{'os': 'name', 'os_type': 'getwindowsversion', 'lang': 'en', 'cpu_amt': 'cpu_count', 'storage_amt': 'unk', 'boot_num': 1}]}
So what you are looking for is
boot_num = spcInfDat['spec'][0]['boot_num']
and note that the value obtained this way is already an integer. str() is not necessary.
It's also good practice to guard against file format errors so the program handles them gracefully.
try:
    boot_num = spcInfDat['spec'][0]['boot_num']
except (KeyError, IndexError):
    print('Database is corrupt')
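Putting the lookup together with the write-back the question asks for, the whole read, increment and save cycle might look like the sketch below. The helper name bump_boot_num is hypothetical, and the temporary file only stands in for mki/data/json/specs.json so that the example runs anywhere.

```python
import json
import os
import tempfile


def bump_boot_num(path):
    """Load the specs file, add 1 to boot_num, save it, return the new value."""
    with open(path) as f:
        data = json.load(f)
    # boot_num lives inside the first element of the 'spec' list.
    data['spec'][0]['boot_num'] += 1
    with open(path, 'w') as f:
        json.dump(data, f, indent=4)
    return data['spec'][0]['boot_num']


# Demonstration on a throwaway copy of the file layout shown in the question.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'specs.json')
    with open(path, 'w') as f:
        json.dump({'spec': [{'lang': 'en', 'boot_num': 1}]}, f)
    new_value = bump_boot_num(path)
    with open(path) as f:
        stored = json.load(f)['spec'][0]['boot_num']
```

Using with blocks closes the file after each step, which the bare open() call in the question does not do.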
Issue Number 2
"Not serializable" means there is something somewhere in your data structure that is not an accepted type and can't be converted to a JSON string. The type named in the message, builtin_function_or_method, indicates that one of the values (presumably name, ost or cr) is a function object rather than the result of calling it.
json.dump() only processes certain types such as strings, dictionaries, lists, numbers, booleans and None. The same restriction applies to every object nested within sub-dictionaries, sub-arrays, etc. See the documentation for json.JSONEncoder for a complete list of allowable types.
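A small sketch of both pitfalls in the question's second snippet. Which of name, ost or cr actually holds a function is not known from the post, so os.cpu_count is used here purely as an assumed stand-in. The sketch also shows that passing the output of dumps() to dump() encodes the data twice, so the file ends up containing one JSON string instead of an object.

```python
import io
import json
from os import cpu_count

bad = {"cpu_amt": cpu_count}      # the function object itself: not serializable
good = {"cpu_amt": cpu_count()}   # the value returned by calling it

try:
    json.dumps(bad)
    error = None
except TypeError as exc:
    error = str(exc)              # "Object of type ... is not JSON serializable"

# Double encoding, as in the question: dumps() first, then dump() of the result.
buf = io.StringIO()
json.dump(json.dumps(good), buf)
reloaded_twice = json.loads(buf.getvalue())   # comes back as a str, not a dict

# Encoding once: hand the dictionary straight to dump().
buf = io.StringIO()
json.dump(good, buf, indent=4)
reloaded_once = json.loads(buf.getvalue())    # comes back as a dict
```

In the original code this would mean calling the functions when building specDict and replacing the dumps/dump pair with a single dump(specDict, outfile, indent=4).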

How to load json files with rasdaman

I'm studying array database management systems, in particular rasdaman. I understand the architecture superficially, and how the system works with sets and multidimensional arrays instead of the tables usual in a relational DBMS. I'm trying to store my own type of data to check whether this kind of database can give me better performance for my specific problem (geospatial data in a particular format: DGGS). To do so I created my own base type from a structure as indicated by the documentation, then my array type, my set type and finally a collection for testing. I'm trying to insert data into this collection with the following idea:
query_executor.execute_update_from_file("insert into test_json_dict values decode($1, 'json', '{\"formatParameters\": {\"domain\": \"[0:1000]\",\"basetype\": struct { char k, long v } } })'", "...path.../rasdapy-demo/dggs_sample.json")
I'm using the rasdapy library to work from Python instead of using rasql only (I still use rasql to validate small things), but I have been fighting with error messages that give little to no information:
Internal error: RasnetClientComm::executeQuery(): illegal status value 5
My source file has this type of data into it:
{
"N1": 6
}
It is a simple dict with one key and one value, and I want to save both. I also tried a bigger dict with multiple keys and values, but since the rasdaman decode function expects a base type definition (if I understand correctly), I reduced my data source to this simple dict. Evidently either my definition for decoding is wrong or my source file has the wrong format, but I haven't been able to find any examples on the web. Any ideas on how to proceed? Maybe I am approaching this whole thing from the wrong perspective and should use the OGC Web Coverage Service (WCS) standard instead? I don't understand it yet, so I have been avoiding it. Any advice or direction is greatly appreciated. Thanks in advance.
Edit:
I have been trying to load CSV data with the following format:
1 930
2 461
..
and the following query
query_executor.execute_update_from_file("insert into test_json_dict values decode($1, 'csv', '{\"formatParameters\": {\"domain\": \"[1:255]\",\"basetype\": struct { char key, long value } } })'", "...path.../rasdapy-demo/dggs_sample_4.csv")
but still no results, even though it looks quite similar to the CSV/JSON examples in the documentation. What could be the issue?

It seems that my problem was trying to use the rasdapy library. The library works fine, but when working with data formats like CSV and JSON it is best to use the rasql command-line tool. The documentation states:
filePaths - An array of absolute paths to input files to be decoded, e.g. ["/path/to/rgb.tif"]. This improves ingestion performance if the data is on the same machine as the rasdaman server, as the network transport is bypassed and the data is read directly from disk. Supported only for GDAL, NetCDF, and GRIB data formats.
and also it says:
As a first parameter the data to be decoded must be specified. Technically this data must be in the form of a 1D char array. Usually it is specified as a query input parameter with $1, while the binary data is attached with the --file option of the rasql command-line client tool, or with the corresponding methods in the client API.
It would be interesting to know whether rasdapy takes this into account. In any case, rasql gives much better error messages, so I recommend it to anyone having a similar problem.
An example command could be:
rasql -q 'insert into test_basic values decode($1, "csv", "{ \"formatParameters\": {\"domain\": \"[0:1,0:2]\",\"basetype\": \"long\" } }")' --out string --file "/home/rasdaman/Documents/TFM/include/DGGS-Comparison/rasdapy-demo/dggs_sample_6.csv" --user rasadmin --passwd rasadmin
using this data:
1,2,3,2,1,3
After that you just have to make it more and more complex as you need.
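If the command still has to be issued from Python, one option is to build the rasql argument list there and leave the decoding to the command-line tool. This is only a sketch: the helper name build_rasql_insert and the path /path/to/dggs_sample_6.csv are hypothetical, and the only flags used are the ones that appear in the command above. Generating the format parameters with json.dumps avoids escaping the inner quotes by hand.

```python
import json


def build_rasql_insert(collection, data_file, domain, basetype,
                       user='rasadmin', passwd='rasadmin'):
    """Return the argument list for a rasql call that decodes a CSV file."""
    params = json.dumps({'formatParameters': {'domain': domain,
                                              'basetype': basetype}})
    # The parameters sit inside a double-quoted rasql string literal,
    # so every inner quote is escaped with a backslash.
    escaped = params.replace('"', '\\"')
    query = 'insert into %s values decode($1, "csv", "%s")' % (collection, escaped)
    return ['rasql', '-q', query, '--out', 'string', '--file', data_file,
            '--user', user, '--passwd', passwd]


args = build_rasql_insert('test_basic', '/path/to/dggs_sample_6.csv',
                          '[0:1,0:2]', 'long')
# On a machine where rasql is installed, the list could be passed to
# subprocess.run(args, check=True); it is not executed here.
```

Passing the arguments as a list rather than one shell string also sidesteps a second layer of shell quoting.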

Passing a record over a socket

I have basic socket communication set up between Python and Delphi code (text only). Now I would like to send and receive a record of data on both sides. I have a "C compatible" record and would like to pass records back and forth and have them in a usable format in Python.
I use conn.send("text") in Python to send the text, but how do I send and receive a buffer with Python and access the items of a received record?
Record
TPacketData = record
pID : Integer;
dataType : Integer;
size : Integer;
value : Double;
end;

I don't know much about Python, but I have done a lot of this between Delphi, C++, C# and Java, and even with COBOL. Anyway, to send a record from Delphi to C you first need to pack the record at both ends,
in Delphi
MyRecord = packed record
in C++
#pragma pack(1)
I don't know the Python equivalent, but I guess there must be one. Make sure that sizeof(MyRecord) is the same at both ends. Also, before sending the records, take care of byte ordering (little-endian vs big-endian): use socket.htonl() and socket.ntohl() in Python and the equivalents in Delphi, which are in the WinSock unit. Also, a "double" in Delphi might not be the same as in Python. In Delphi it is 8 bytes; check this as well, and change it to Single (4 bytes) or Extended (10 bytes), whichever matches.
If all of that matches, you can send and receive binary records in one shot; otherwise, I'm afraid, you have to send the individual fields one by one.

I know this answer is a bit late to the game, but may at least prove useful to other people finding this question in their search-results. Because you say the Delphi code sends and receives "C compatible data" it seems that for the sake of the answer about Python's handling it is irrelevant whether it is Delphi (or any other language) on the other end...
The python struct and socket modules have all the functionality for the basic usage you describe. To send the example record you would do something like the below. For simplicity and sanity I have presumed signed integers and doubles, and packed the data in "network order" (bigendian). This can easily be a one-liner but I have split it up for verbosity and reusability's sake:
import struct
t_packet_struc = '>iiid'
t_packet_data = struct.pack(t_packet_struc, pid, data_type, size, value)
mysocket.sendall(t_packet_data)
Of course the mentioned "presumptions" don't need to be made, given tweaks to the format string, data preparation, etc. See the struct inline help for a description of the possible format strings - which can even process things like Pascal-strings... By the way, the socket module allows packing and unpacking a couple of network-specific things which struct doesn't, like IP-address strings (to their bigendian int-blob form), and allows explicit functions for converting data bigendian-to-native and vice-versa. For completeness, here is how to unpack the data packed above, on the Python end:
t_packet_size = struct.calcsize(t_packet_struc)
t_packet_data = mysocket.recv(t_packet_size)
(pid, data_type, size, value) = struct.unpack(t_packet_struc,
t_packet_data)
I know this works in Python version 2.x, and suspect it should work without changes in Python version 3.x too. Beware of one big gotcha (because it is easy to not think about, and hard to troubleshoot after the fact): Aside from different endianness, you can also distinguish between packing things using "standard size and alignment" (portably) or using "native size and alignment" (much faster) depending on how you prefix - or don't prefix - your format string. These can often yield wildly different results than you intended, without giving you a clue as to why... (there be dragons).
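Two details of the receiving side are worth a short sketch. First, recv() may return fewer bytes than requested, so a loop is needed to collect a complete record. Second, the prefix of the format string decides whether padding is inserted between the three integers and the double. The helper names recv_exact and recv_packet are hypothetical, and socket.socketpair() only stands in for the real Python/Delphi connection.

```python
import socket
import struct

PACKET_FORMAT = '>iiid'                       # big-endian, standard sizes
PACKET_SIZE = struct.calcsize(PACKET_FORMAT)  # 4 + 4 + 4 + 8 bytes, no padding


def recv_exact(sock, size):
    """Keep calling recv() until exactly `size` bytes have arrived."""
    chunks = []
    remaining = size
    while remaining:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError('socket closed before a full record arrived')
        chunks.append(chunk)
        remaining -= len(chunk)
    return b''.join(chunks)


def recv_packet(sock):
    """Receive one record and return (pid, data_type, size, value)."""
    return struct.unpack(PACKET_FORMAT, recv_exact(sock, PACKET_SIZE))


# A prefixed format never pads; the native '@' format follows the platform's
# C alignment rules, so its size can differ from one machine to another.
standard_size = struct.calcsize('<iiid')
native_size = struct.calcsize('@iiid')

# Round trip over a local socket pair standing in for the real connection.
sender, receiver = socket.socketpair()
sender.sendall(struct.pack(PACKET_FORMAT, 7, 1, 8, 3.5))
record = recv_packet(receiver)
sender.close()
receiver.close()
```

Whatever format is chosen has to match what the Delphi side writes: a packed record sent as-is from an x86 machine would correspond to '<iiid' rather than the network-order '>iiid' used here.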

Reading a Delphi binary file in Python

I have a file that was written with the following Delphi declaration ...
Type
Tfulldata = Record
dpoints, dloops : integer;
dtime, bT, sT, hI, LI : real;
tm : real;
data : array[1..armax] Of Real;
End;
...
Var:
fh: File Of Tfulldata;
I want to analyse the data in the files (many MB in size) using Python if possible - is there an easy way to read in the data and cast the data into Python objects similar in form to the Delphi records? Does anyone know of a library perhaps that does this?
This is compiled on Delphi 7 with the following options which may (or may not) be pertinent,
Record Field Alignment: 8
Pentium Safe FDIV: False
Stack Frames: False
Optimization: True

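Here is the full solution, thanks to hints from KillianDS and Ritsaert Hornstra:
import struct
# one Tfulldata record: 2 integers, 6 reals, then the 5025-element data array
record_format = 'iidddddd5025d'
with open('my_file.dat', 'rb') as fh:
    s = fh.read(struct.calcsize(record_format))  # 40256 bytes
vals = struct.unpack(record_format, s)
dpoints, dloops, dtime, bT, sT, hI, LI, tm = vals[:8]
data = vals[8:]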
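Since the files hold many records, the same unpacking can be wrapped in a generator that walks the file one record at a time. This is a sketch under assumptions: Real is an 8-byte IEEE double, the data is little-endian, and the array length is the 5025 implied by the 40256-byte record above. The names ARMAX, RECORD, FIELDS and read_records are hypothetical, and the in-memory stream stands in for a real file.

```python
import io
import struct

ARMAX = 5025  # array length implied by the 40256-byte record size
RECORD = struct.Struct('<ii6d%dd' % ARMAX)  # 2 integers, 6 reals, data array
FIELDS = ('dpoints', 'dloops', 'dtime', 'bT', 'sT', 'hI', 'LI', 'tm')


def read_records(fh):
    """Yield one dictionary per Tfulldata record from a binary file object."""
    while True:
        raw = fh.read(RECORD.size)
        if len(raw) < RECORD.size:
            break  # end of file, or a truncated trailing record
        vals = RECORD.unpack(raw)
        record = dict(zip(FIELDS, vals[:8]))
        record['data'] = vals[8:]
        yield record


# Two synthetic records in memory instead of a real .dat file.
blob = (RECORD.pack(3, 1, *([0.5] * 6), *([0.0] * ARMAX))
        + RECORD.pack(4, 2, *([1.5] * 6), *([1.0] * ARMAX)))
records = list(read_records(io.BytesIO(blob)))
```

For a file on disk, the generator would be fed an object opened with open('my_file.dat', 'rb'), so only one record is held in memory at a time.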

I do not know how Delphi internally stores data, but if it is simple byte-wise data (so not serialized and mangled), use struct. This way you can treat the bytes read from a file in Python as binary data. Also, open files in binary mode: open(path, 'rb').

Please note that when you define a record in Delphi (like a struct in C), the fields are laid out in order and in binary according to the current alignment (e.g. Bytes are aligned on 1-byte boundaries, Words on 2-byte boundaries, Integers on 4-byte boundaries, etc.), but this may vary with the compiler settings.
When you say the record is serialized to a file, you probably mean that it is written in binary to the file, with the next record written after the first one starting at position sizeof(structure), and so on. Delphi does not specify how things should be serialized to or from a file, so the information you give leaves us guessing.
If you want to make sure the layout is always the same regardless of any compiler settings, use a packed record.
Real can have multiple meanings (it is a 48-bit float type in older Delphi versions and a 64-bit float (IEEE double) later on).
If you cannot access the Delphi code or compile it yourself, just try to check the data with a hex editor. You should see the boundaries of the records clearly, since they start with Integers and only floats follow.

Trying to write to binary plist format from Python (w/PyObjC) to be fetch and read in by Cocoa Touch

I'm trying to serve a property list of search results to my iPhone app. The server is a prototype, written in Python.
First I found Python's built-in plistlib, which is awesome. I want to give search-as-you-type a shot, so I need it to be as small as possible, and xml was too big. The binary plist format seems like a good choice. Unfortunately plistlib doesn't do binary files, so step right up PyObjC.
(Segue: I'm very open to any other thoughts on how to accomplish live search. I already pared down the data as much as possible, including only displaying enough results to fill the window with the iPhone keyboard up, which is 5.)
Unfortunately, although I know Python and am getting pretty decent with Cocoa, I still don't get PyObjC.
This is the Cocoa equivalent of what I want to do:
NSArray *plist = [NSArray arrayWithContentsOfFile:read_path];
NSError *err;
NSData *data = [NSPropertyListSerialization dataWithPropertyList:plist
format:NSPropertyListBinaryFormat_v1_0
options:0 // docs say this must be 0, go figure
error:&err];
[data writeToFile:write_path atomically:YES];
I thought I should be able to do something like this, but dataWithPropertyList isn't in the dir() listing of the NSPropertyListSerialization object. I should probably also convert the list to an NSArray. I tried the PyObjC docs, but it's so tangential to my real work that I thought I'd try an SO SOS, too.
from Cocoa import NSArray, NSData, NSPropertyListSerialization, NSPropertyListBinaryFormat_v1_0
plist = [dict(key1='val', key2='val2'), dict(key1='val', key2='val2')]
NSPropertyListSerialization.dataWithPropertyList_format_options_error(plist,
NSPropertyListBinaryFormat_v1_0,
?,
?)
This is how I'm reading in the plist on the iPhone side.
NSData *data = [NSData dataWithContentsOfURL:url];
NSPropertyListFormat format;
NSString *err;
id it = [NSPropertyListSerialization
propertyListFromData:data
mutabilityOption:0
format:&format
errorDescription:&err];
Happy to clarify if any of this doesn't make sense.

I believe the correct function name is
NSPropertyListSerialization.dataWithPropertyList_format_options_error_
because of the ending :.
(BTW, if the object is always an array or dictionary, -writeToFile:atomically: will write the plist (as XML format) already.)

As KennyTM said, you're missing the trailing underscore in the method name. In PyObjC you need to take the Objective-C selector name (dataWithPropertyList:format:options:error:) and replace all of the colons with underscores (don't forget the last colon, too!). That gives you dataWithPropertyList_format_options_error_ (note the trailing underscore). Also, for the error parameter, you can just use None. That makes your code look like this:
bplist = NSPropertyListSerialization.dataWithPropertyList_format_options_error_(
plist,
NSPropertyListBinaryFormat_v1_0,
0,
None)
# bplist is an NSData object that you can operate on directly or
# write to a file...
bplist.writeToFile_atomically_(pathToFile, True)
If you test the resulting file, you'll see that it's a Binary PList file, as desired:
Jagaroth:~/Desktop $ file test.plist
test.plist: Apple binary property list
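On Python 3.4 and later, the standard plistlib module can write the binary format itself through its fmt argument, so a server without PyObjC could produce the same kind of payload. A minimal sketch, reusing the sample list from the question (the variable names are illustrative only):

```python
import plistlib

# Same sample structure as in the question: a list of dictionaries.
results = [dict(key1='val', key2='val2'), dict(key1='val', key2='val2')]

# Serialize to the binary property list format.
binary = plistlib.dumps(results, fmt=plistlib.FMT_BINARY)

# Reading it back detects the format automatically.
roundtrip = plistlib.loads(binary)
```

The resulting bytes start with the bplist00 signature, and plistlib.dump() with the same fmt argument would write them to a file opened in binary mode.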
