I have a mat-file that I accessed using
from scipy import io
mat = io.loadmat('example.mat')
From matlab, example.mat contains the following struct
>> load example.mat
>> data1
data1 =
LAT: [53x1 double]
LON: [53x1 double]
TIME: [53x1 double]
units: {3x1 cell}
>> data2
data2 =
LAT: [100x1 double]
LON: [100x1 double]
TIME: [100x1 double]
units: {3x1 cell}
In MATLAB, I can access the data as easily as data2.LON, etc. It's not as trivial in Python. Tab completion does give me several options, though, like
mat.clear mat.get mat.iteritems mat.keys mat.setdefault mat.viewitems
mat.copy mat.has_key mat.iterkeys mat.pop mat.update mat.viewkeys
mat.fromkeys mat.items mat.itervalues mat.popitem mat.values mat.viewvalues
Is it possible to preserve the same structure in Python? If not, how can I best access the data? The Python code that I am currently using is very difficult to work with.
Thanks
I found this tutorial about MATLAB structs and Python:
http://docs.scipy.org/doc/scipy/reference/tutorial/io.html
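As a sketch of what that tutorial describes: passing struct_as_record=False and squeeze_me=True to loadmat gives objects whose fields can be read with MATLAB-style dot notation. The example.mat written below is a made-up stand-in for the real file, just to keep the snippet self-contained:

```python
import numpy as np
from scipy import io

# Write a small stand-in for example.mat: a struct named data2 with
# LAT/LON/TIME fields (the values here are invented for illustration).
io.savemat('example.mat', {'data2': {'LAT': np.arange(5.0),
                                     'LON': 2 * np.arange(5.0),
                                     'TIME': 3 * np.arange(5.0)}})

# struct_as_record=False returns each struct as a mat_struct object whose
# fields are attributes; squeeze_me=True drops singleton dimensions.
mat = io.loadmat('example.mat', struct_as_record=False, squeeze_me=True)
data2 = mat['data2']
print(data2.LON)  # dot access, much like data2.LON in MATLAB
```

With the default struct_as_record=True you instead get a structured numpy array and have to index fields as mat['data2']['LON'].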
When I need to load data into Python from MATLAB that is stored in an array of structs {struct_1, struct_2}, I extract a list of keys and values from the object that I load with scipy.io.loadmat. I can then assemble these into their own variables or, if needed, repackage them into a dictionary. The use of the exec command may not be appropriate in all cases, but if you are just trying to process data it works well.
import numpy as np
import scipy.io as sio

# Load the data into Python
D = sio.loadmat('data.mat')
# Build a list of keys and values for each entry in the structure
vals = D['results'][0, 0]  # <-- set the array you want to access
keys = D['results'][0, 0].dtype.descr
# Assemble the keys and values into variables with the same names as those used in MATLAB
for i in range(len(keys)):
    key = keys[i][0]
    # squeeze converts MATLAB (1, n) arrays into numpy (n,) arrays
    val = np.squeeze(vals[key][0][0])
    exec(key + '=val')
This will return the mat structure as a dictionary:
import scipy.io as sio

def _check_keys(d):
    """
    Checks whether entries in the dictionary are mat-objects. If yes,
    _todict is called to change them to nested dictionaries.
    """
    for key in d:
        if isinstance(d[key], sio.matlab.mio5_params.mat_struct):
            d[key] = _todict(d[key])
    return d

def _todict(matobj):
    """
    A recursive function which constructs nested dictionaries from mat-objects.
    """
    d = {}
    for strg in matobj._fieldnames:
        elem = matobj.__dict__[strg]
        if isinstance(elem, sio.matlab.mio5_params.mat_struct):
            d[strg] = _todict(elem)
        else:
            d[strg] = elem
    return d

def loadmat(filename):
    """
    This function should be called instead of scipy.io.loadmat directly,
    as it cures the problem of not properly recovering Python dictionaries
    from mat files. It calls _check_keys to convert all entries
    which are still mat-objects.
    """
    data = sio.loadmat(filename, struct_as_record=False, squeeze_me=True)
    return _check_keys(data)
(!) In the case of nested structures saved in *.mat files, it is necessary to check whether the items in the dictionary that io.loadmat outputs are MATLAB structs. For example, if in MATLAB
>> thisStruct
ans =
var1: [1x1 struct]
var2: 3.5
>> thisStruct.var1
ans =
subvar1: [1x100 double]
subvar2: [32x233 double]
Then use the code by mergen in scipy.io.loadmat nested structures (i.e. dictionaries)
I am working on code which pulls data from a database and, based on the different types of tables, stores the data in dictionaries for further use.
This code handles around 20-30 different tables, so there are 20-30 dictionaries and a few lists which I have defined as class variables for further use in the code.
For example:
class ImplVars(object):
    # dictionary capturing data from the Asset-Feed table
    general_feed_dict = {}
    ports_feed_dict = {}
    vulns_feed_dict = {}
    app_list = []
    ...
I want to clear these dictionaries before I add data to them.
The easiest or most common way is to use the clear() function, but that is repetitive here, as I would have to call it for each dict.
Another option I am exploring is the dir() function, but it returns the variable names as strings.
Is there an elegant method which will allow me to fetch all these class variables and clear them?
You can use introspection as you suggest:
for d in filter(dict.__instancecheck__, ImplVars.__dict__.values()):
    d.clear()
Or less cryptic, covering lists and dicts:
for obj in ImplVars.__dict__.values():
    if isinstance(obj, (list, dict)):
        obj.clear()
But I would recommend you choose a bit of a different data structure so you can be more explicit:
class ImplVars(object):
    data_dicts = {
        "general_feed_dict": {},
        "ports_feed_dict": {},
        "vulns_feed_dict": {},
    }
Now you can explicitly loop over ImplVars.data_dicts.values() and still have other class variables that you may not want to clear.
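For instance, here is a small self-contained sketch of that pattern (the names mirror the question; the sample entries are made up):

```python
class ImplVars(object):
    # dictionaries grouped in one registry so they can be cleared together
    data_dicts = {
        "general_feed_dict": {},
        "ports_feed_dict": {},
        "vulns_feed_dict": {},
    }
    app_list = []  # deliberately outside the registry, so it is left alone

# simulate one processing pass filling the structures
ImplVars.data_dicts["general_feed_dict"]["asset"] = "feed-row"
ImplVars.app_list.append("keep me")

# reset every registered dict before the next pass
for d in ImplVars.data_dicts.values():
    d.clear()

print(ImplVars.data_dicts)  # every dict is empty again
print(ImplVars.app_list)    # the list was not touched
```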
code:
a_dict = {1:2}
b_dict = {2:4}
c_list = [3,6]
vars_copy = vars().copy()
for variable, value in vars_copy.items():
    if variable.endswith("_dict"):
        vars()[variable] = {}
    elif variable.endswith("_list"):
        vars()[variable] = []
print(a_dict)
print(b_dict)
print(c_list)
result:
{}
{}
[]
Maybe one of the simplest implementations would be to create a list of the dictionaries and lists you want to clear, then loop over it to clear them all.
d = [general_feed_dict, ports_feed_dict, vulns_feed_dict, app_list]
for element in d:
    element.clear()
You could also use a list comprehension for that.
I'm new to Python, and I'm trying to encode a data dict as JSON.
My dict is:
data = { ('analogInput', 18) : [('objectName','AI8-Voltage'),
('presentValue',238.3),
('units','Volts')],
('analogInput', 3) : [('objectName','AI3-Pulse'),
('presentValue',100),
('units','Amp')]
}
When I try foo = json.dumps(data), I get this message: keys must be str, int, float, bool or None, not tuple.
I've searched for answers, but I don't understand how to proceed in my case.
Thanks for any answers.
First of all, not all types can be used as JSON keys.
Keys must be strings, and values must be a valid JSON data type (string, number, object, array, Boolean or null).
For more information, take a look at this.
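To see the restriction in action, here is a small sketch: json.dumps coerces scalar non-string keys to strings, but a tuple key raises the TypeError from the question:

```python
import json

# int, float and None keys are silently coerced to strings
print(json.dumps({1: "int", 2.5: "float", None: "none"}))

# a tuple key, however, raises TypeError (the error from the question)
try:
    json.dumps({('analogInput', 18): []})
except TypeError as exc:
    print('TypeError:', exc)
```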
Now, as a feasible solution, I recommend you implement two functions: one that converts your tuples to strings, and one that converts those strings back to tuples. A quite simple example is provided below:
import json
data = { ('analogInput', 18) : [('objectName','AI8-Voltage'),
('presentValue',238.3),
('units','Volts')],
('analogInput', 3) : [('objectName','AI3-Pulse'),
('presentValue',100),
('units','Amp')]
}
def tuple_to_str(t):
    # It could be implemented with more options
    return str(t[0]) + '_' + str(t[1])

def str_to_tuple(s):
    l = s.split('_')
    # The second item is an int
    l[1] = int(l[1])
    return tuple(l)

if __name__ == "__main__":
    # build a dict of the data with string keys
    s_data = dict()
    for key in data:
        s_data[tuple_to_str(key)] = data[key]
    x = json.dumps(s_data)
    # load the json back and restore the tuple keys
    raw_data = json.loads(x)
    final_data = dict()
    for key in raw_data:
        final_data[str_to_tuple(key)] = raw_data[key]
    # the keys are tuples again
    print(final_data)
The error is explicit. In a Python dict, the key can be any hashable type: a tuple, a frozenset, even a frozen dict (but not a list, a set or a dict).
But in a JSON object, keys can only be strings; Python's json module also accepts int, float, bool and None keys, which it coerces to strings.
Long story short, your input dictionary cannot be directly converted to JSON.
Possible workarounds:
use a different serialization tool. For example, pickle can accept any Python type, but is not portable to non-Python applications. You could also use a custom serialization format, if you write both the serialization and deserialization parts
convert the key to a string. At deserialization time, you would just have to convert the string back to a tuple with ast.literal_eval:
js = json.dumps({str(k): v for k,v in data.items()})
giving: {"('analogInput', 18)": [["objectName", "AI8-Voltage"], ["presentValue", 238.3], ["units", "Volts"]], "('analogInput', 3)": [["objectName", "AI3-Pulse"], ["presentValue", 100], ["units", "Amp"]]}
You can load it back with:
data2 = {ast.literal_eval(k): v for k,v in json.loads(js).items()}
giving {('analogInput', 18): [['objectName', 'AI8-Voltage'], ['presentValue', 238.3], ['units', 'Volts']], ('analogInput', 3): [['objectName', 'AI3-Pulse'], ['presentValue', 100], ['units', 'Amp']]}
You can see that the JSON transformation has changed the tuples into lists.
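If the lists-for-tuples substitution matters for your application, a small recursive helper (hypothetical, not part of the json module) can restore tuples after loading:

```python
import json

def lists_to_tuples(obj):
    """Recursively turn JSON arrays back into tuples (hypothetical helper)."""
    if isinstance(obj, list):
        return tuple(lists_to_tuples(x) for x in obj)
    if isinstance(obj, dict):
        return {k: lists_to_tuples(v) for k, v in obj.items()}
    return obj

round_tripped = json.loads(json.dumps({'pairs': [('a', 1), ('b', 2)]}))
print(round_tripped)                   # tuples came back as lists
print(lists_to_tuples(round_tripped))  # tuples restored
```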
I have a data structure that I would like to add comments to, then convert into YAML.
I'd like to avoid outputting the data structure as YAML and loading it back in using RoundTripLoader.
Is there a way to convert my data structure into one that supports the ruamel.yaml comments interface?
There is a way, although the interface for that is not guaranteed to be stable.
Because of that, and the lack of documentation, it often helps to look at the result of round_trip_loading() your expected output, or a small sample thereof.
You'll have to realise that comments are attached to special versions of the representations of the structured nodes (mapping and sequence). A mapping that would safe_load() as a Python dict is represented by a CommentedMap(), and a sequence that would load as a Python list by a CommentedSeq().
Both these classes can have a .ca attribute holding the comments, which may occur before the structural node, as end-of-line comments after a key/value pair resp. item, on their own lines between key-value pairs or items, or at the end of a node.
That means you have to convert any dict or list that needs commenting (which can be done automatically/recursively, e.g. by the routine comment_prep() below), and then find the correct point and way to attach the comment. Because the comment-manipulation routines have not stabilised, make sure you wrap your comment-adding routines to get a single place to update in case they do change.
import sys
from ruamel.yaml import round_trip_dump as rtd
from ruamel.yaml.comments import CommentedMap, CommentedSeq
# please note that because of the dict the order of the keys is undetermined
data = dict(a=1, b=2, c=['x', 'y', dict(k='i', l=42, m='∞')])
rtd(data, sys.stdout)
print('-' * 30)
def comment_prep(base):
    """replace all dicts with CommentedMap and lists with CommentedSeq"""
    if isinstance(base, dict):
        ret_val = CommentedMap()
        for key in sorted(base):  # here we force sorted order
            ret_val[key] = comment_prep(base[key])
        return ret_val
    if isinstance(base, list):
        ret_val = CommentedSeq()
        for item in base:
            ret_val.append(comment_prep(item))
        return ret_val
    return base
data = comment_prep(data)
data['c'][2].yaml_add_eol_comment('# this is the answer', key='l', column=15)
rtd(data, sys.stdout)
gives:
c:
- x
- y
- k: i
m: ∞
l: 42
b: 2
a: 1
------------------------------
a: 1
b: 2
c:
- x
- y
- k: i
l: 42 # this is the answer
m: ∞
The file test_comment_manipulation.py has some more examples and is a good place to keep an eye on (as the interface changes, so will the tests in that file).
How can I make a set of dictionaries from one list of dictionaries?
Example:
import copy
v1 = {'k01': 'v01', 'k02': {'k03': 'v03', 'k04': {'k05': 'v05'}}}
v2 = {'k11': 'v11', 'k12': {'k13': 'v13', 'k14': {'k15': 'v15'}}}
data = []
N = 5
for i in range(N):
    data.append(copy.deepcopy(v1))
    data.append(copy.deepcopy(v2))
print(data)
How would you create a set of dictionaries from the list data?
NB: one dictionary is equal to another when they are structurally the same. That means they have exactly the same keys and the same values (recursively).
A cheap workaround would be to serialize your dicts, for example:
import json
dset = set()
d1 = {'a':1, 'b':{'c':2}}
d2 = {'b':{'c':2}, 'a':1} # the same according to your definition
d3 = {'x': 42}
dset.add(json.dumps(d1, sort_keys=True))
dset.add(json.dumps(d2, sort_keys=True))
dset.add(json.dumps(d3, sort_keys=True))
for p in dset:
    print(json.loads(p))
In the long run it would make sense to wrap the whole thing in a class like SetOfDicts.
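Such a wrapper might look like the sketch below (the class name SetOfDicts comes from the suggestion above; the interface is an assumption):

```python
import json

class SetOfDicts:
    """Set-like container keying members on their canonical JSON form."""

    def __init__(self, items=()):
        self._items = {}
        for d in items:
            self.add(d)

    def add(self, d):
        # sort_keys makes the serialisation canonical, so structurally
        # equal dicts collapse onto the same key
        self._items[json.dumps(d, sort_keys=True)] = d

    def __contains__(self, d):
        return json.dumps(d, sort_keys=True) in self._items

    def __len__(self):
        return len(self._items)

    def __iter__(self):
        return iter(self._items.values())

dset = SetOfDicts([{'a': 1, 'b': {'c': 2}}, {'b': {'c': 2}, 'a': 1}, {'x': 42}])
print(len(dset))  # the first two dicts are structural duplicates
```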
Dictionaries are mutable and therefore not hashable in Python.
You could create a dict subclass with a __hash__ method. Make sure that the hash of a dictionary does not change while it is in the set (which probably means that you cannot allow its members to be modified).
See http://code.activestate.com/recipes/414283-frozen-dictionaries/ for an example implementation of frozendicts.
If you can define a sort order on your (frozen) dictionaries, you could alternatively use a data structure based on a binary tree instead of a set. This boils down to the bisect solution provided in the link below.
See also https://stackoverflow.com/a/18824158/5069869 for an explanation why sets without hash do not make sense.
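A minimal sketch of such a frozen-dict subclass (assuming all values are themselves hashable):

```python
class FrozenDict(dict):
    """A dict with a hash; safe only if it is never mutated after creation."""

    def __hash__(self):
        # frozenset of items gives an order-independent hash
        return hash(frozenset(self.items()))

    def _immutable(self, *args, **kwargs):
        raise TypeError('FrozenDict is immutable')

    # block the usual mutators so the hash stays stable inside a set
    __setitem__ = __delitem__ = _immutable
    clear = pop = popitem = setdefault = update = _immutable

s = {FrozenDict(a=1), FrozenDict(a=1), FrozenDict(b=2)}
print(len(s))  # the two identical dicts collapse into one entry
```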
Not exactly what you're looking for, as this accounts for lists too, but:
def hashable_structure(structure):
    # recursively replace dicts and lists with hashable frozensets and tuples
    if isinstance(structure, dict):
        return frozenset((k, hashable_structure(v)) for k, v in structure.items())
    elif isinstance(structure, list):
        return tuple(hashable_structure(elem) for elem in structure)
    else:
        return structure
I'm having trouble getting my data in the form that I'd like in Python.
Basically I have a program that reads in binary data and provides functions for plotting and analysis on said data.
My data has main headings and then subheadings that could be any number of varied datatypes.
I'd like to be able to access my data like for example:
>>> a = myDatafile.readit()
>>> a.elements.hydrogen.distributionfunction
(a big array)
>>> a.elements.hydrogen.mass
1
>>> a.elements.carbon.mass
12
but I don't know the names of the atoms until runtime.
I've tried using namedtuple, for example after I've read in all the atom names:
self.elements = namedtuple('elements',elementlist)
Where elementlist is a list of strings, for example ('hydrogen', 'carbon'). But the problem is that I can't nest these using, for example:
for i in range(0, self.nelements):
    self.elements[i] = namedtuple('details', ['ux', 'uy', 'uz', 'mass', 'distributionfunction'])
and then be able to access the values through for example
self.elements.electron.distributionfunction.
Maybe I'm doing this completely wrong. I'm fairly inexperienced with python. I know this would be easy to do if I wasn't bothered about naming the variables dynamically.
I hope I've made myself clear with what I'm trying to achieve!
Without knowing your data, we can only give a generic solution.
Assuming the first two lines contain the headings and sub-headings, by reading them you have somehow determined the hierarchy. All you have to do then is create a hierarchical dictionary.
For example, extending your example
data.elements.hydrogen.distributionfunction
data.elements.nitrogen.xyzfunction
data.elements.nitrogen.distributionfunction
data.compound.water.distributionfunction
data.compound.hcl.xyzfunction
So we have to create a dictionary like this:
{'data': {'elements': {'hydrogen': {'distributionfunction': <something>},
                       'nitrogen': {'xyzfunction': <something>,
                                    'distributionfunction': <something>}
                       },
          'compound': {'water': {'distributionfunction': <something>},
                       'hcl': {'xyzfunction': <something>}
                       }
          }
}
How you populate the dictionary depends on the data, which is difficult to say at this point.
But the keys of the dictionary should be populated from the headers, and somehow you have to map the data to the respective values in the empty slots of the dictionary.
Once the map is populated, you can access it as:
yourDict['data']['compound']['hcl']['xyzfunction']
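One common way to populate such a hierarchy without writing the nesting by hand is an autovivifying defaultdict tree; the field names below just follow the example above:

```python
from collections import defaultdict

def tree():
    """An autovivifying dict: looking up a missing key creates a sub-dict."""
    return defaultdict(tree)

root = tree()
# intermediate levels spring into existence on assignment
root['data']['compound']['hcl']['xyzfunction'] = '<something>'
root['data']['elements']['hydrogen']['distributionfunction'] = '<something>'

print(root['data']['compound']['hcl']['xyzfunction'])
```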
If your element names are dynamic and obtained from the data at runtime, you can assign them to a dict and access them like this:
elements['hydrogen'].mass
but if you want dotted notation, you can create attributes at run time, e.g.:
from collections import namedtuple
class Elements(object):
    def add_element(self, elementname, element):
        setattr(self, elementname, element)

Element = namedtuple('Element', ['ux', 'uy', 'uz', 'mass', 'distributionfunction'])
elements = Elements()
for data in [('hydrogen', 1, 1, 1, 1, 1), ('helium', 2, 2, 2, 2, 2), ('carbon', 3, 3, 3, 3, 3)]:
    elementname = data[0]
    element = Element._make(data[1:])
    elements.add_element(elementname, element)

print(elements.hydrogen.mass)
print(elements.carbon.distributionfunction)
Here I am guessing at the data you have, but with data in any other format you can do similar tricks.
Here's a method for recursively creating namedtuples from nested data.
from collections import namedtuple
from collections.abc import Mapping  # Mapping lives in collections.abc on Python 3.3+

def namedtuplify(mapping, name='NT'):  # thank you https://gist.github.com/hangtwenty/5960435
    """Convert mappings to namedtuples recursively."""
    if isinstance(mapping, Mapping):
        for key, value in list(mapping.items()):
            mapping[key] = namedtuplify(value)
        return namedtuple_wrapper(name, **mapping)
    elif isinstance(mapping, list):
        return [namedtuplify(item) for item in mapping]
    return mapping

def namedtuple_wrapper(name, **kwargs):
    wrap = namedtuple(name, kwargs)
    return wrap(**kwargs)
stuff = {'data': {'elements': {'hydrogen': {'distributionfunction': 'foo'},
'nitrogen': {'xyzfunction': 'bar',
'distributionfunction': 'baz'}
},
'compound': {'water': {'distributionfunction': 'lorem'},
'hcl': {'xyzfunction': 'ipsum'}}}
}
example = namedtuplify(stuff)
example.data.elements.hydrogen.distributionfunction # 'foo'
I had the same issue with nested JSON, but I needed to be able to serialise the output with pickle, which doesn't like you creating objects on the fly.
I've taken @bren's answer and enhanced it so that the resulting structure is serialisable with pickle. You have to save a reference to each of the namedtuple classes you create in globals so that pickle can find them.
from collections import namedtuple
from collections.abc import Mapping

##############################################
class Json2Struct:
    '''
    Convert mappings to nested namedtuples
    Usage:
        jStruct = Json2Struct('JS').json2Struct(json)
    '''
    ##############################################
    def __init__(self, name):
        self.namePrefix = name
        self.nameSuffix = 0

    def json2Struct(self, jsonObj):  # thank you https://gist.github.com/hangtwenty/5960435
        """
        Convert mappings to namedtuples recursively.
        """
        if isinstance(jsonObj, Mapping):
            for key, value in list(jsonObj.items()):
                jsonObj[key] = self.json2Struct(value)
            return self.namedtuple_wrapper(**jsonObj)
        elif isinstance(jsonObj, list):
            return [self.json2Struct(item) for item in jsonObj]
        return jsonObj

    def namedtuple_wrapper(self, **kwargs):
        self.nameSuffix += 1
        name = self.namePrefix + str(self.nameSuffix)
        Jstruct = namedtuple(name, kwargs)
        # keep a global reference so pickle can find the generated class
        globals()[name] = Jstruct
        return Jstruct(**kwargs)
The example below works as follows and is also serialisable:
stuff = {'data': {'elements': {'hydrogen': {'distributionfunction': 'foo'},
'nitrogen': {'xyzfunction': 'bar',
'distributionfunction': 'baz'}
},
'compound': {'water': {'distributionfunction': 'lorem'},
'hcl': {'xyzfunction': 'ipsum'}}}
}
example = Json2Struct('JS').json2Struct(stuff)
example.data.elements.hydrogen.distributionfunction # 'foo'