Python3: Sort a 3-level nested dictionary by value

I've been trying to sort a dictionary of dictionaries of dictionaries (3 levels) but have had no luck. I need to sort it by the "common_RMSD" field value. This is an input example (sorry, I don't know how to format the dictionary properly here):
my_dict = {
    'E': {
        'E': {
            'common_sup': <Bio.PDB.Superimposer.Superimposer object at 0x08451030>,
            'common_sample_chain_ID': 'E',
            'common_ref_chain_ID': 'E',
            'common_RMSD': 2.013799285922141e-14},
        'F': {
            'common_sup': <Bio.PDB.Superimposer.Superimposer object at 0x08786FB0>,
            'common_sample_chain_ID': 'F',
            'common_ref_chain_ID': 'E',
            'common_RMSD': 0.1207801497077146}},
    'F': {
        'E': {
            'common_sup': <Bio.PDB.Superimposer.Superimposer object at 0x08AB6410>,
            'common_sample_chain_ID': 'E',
            'common_ref_chain_ID': 'F',
            'common_RMSD': 0.12078014970771417},
        'F': {
            'common_sup': <Bio.PDB.Superimposer.Superimposer object at 0x08AB63F0>,
            'common_sample_chain_ID': 'F',
            'common_ref_chain_ID': 'F',
            'common_RMSD': 7.559143985024071e-15}}}
I have read in previous posts about using lambda and items(). With items() I can reach the first key-value level (whose values are the second-level dictionaries), but I don't know how to go further inside the dictionary.
sorted(dict.items(), key = lambda x: x[1]... )
Is it possible to reorder the whole dictionary, sorting the nested dictionaries by the 'common_RMSD' value, so that when I loop through them the first dictionary has the lowest 'common_RMSD' value? If not, I would like to output tuples of ('common_sup','common_sample_chain_ID','common_ref_chain_ID','common_RMSD'), again ordered by 'common_RMSD'.
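One possible approach, as a minimal sketch: flatten the two outer levels into a list of the innermost dictionaries and sort that list by 'common_RMSD'. The placeholder strings standing in for the Superimposer objects and the names records, fields and as_tuples are assumptions made for illustration only. A sorted list is used instead of a reordered dictionary because the sort key lives at the innermost level, so a single global ordering cuts across the outer keys.

```python
# Placeholder strings stand in for the Bio.PDB Superimposer objects (assumption).
my_dict = {
    'E': {
        'E': {'common_sup': 'sup_EE', 'common_sample_chain_ID': 'E',
              'common_ref_chain_ID': 'E', 'common_RMSD': 2.013799285922141e-14},
        'F': {'common_sup': 'sup_EF', 'common_sample_chain_ID': 'F',
              'common_ref_chain_ID': 'E', 'common_RMSD': 0.1207801497077146},
    },
    'F': {
        'E': {'common_sup': 'sup_FE', 'common_sample_chain_ID': 'E',
              'common_ref_chain_ID': 'F', 'common_RMSD': 0.12078014970771417},
        'F': {'common_sup': 'sup_FF', 'common_sample_chain_ID': 'F',
              'common_ref_chain_ID': 'F', 'common_RMSD': 7.559143985024071e-15},
    },
}

# Flatten levels 1 and 2, then sort the innermost dicts by their RMSD.
records = sorted(
    (inner for middle in my_dict.values() for inner in middle.values()),
    key=lambda rec: rec['common_RMSD'],
)

# The tuple form asked for in the question, in the same sorted order.
fields = ('common_sup', 'common_sample_chain_ID',
          'common_ref_chain_ID', 'common_RMSD')
as_tuples = [tuple(rec[f] for f in fields) for rec in records]

for row in as_tuples:
    print(row)
```

Looping over records (or as_tuples) then visits the entry with the lowest 'common_RMSD' first.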

Related

How to access values from a dictionary inside a list based on another key-value pair?

Given the list below,
graph = [{'Node':'A', 'children':['T','Z','S'], 'gWt':[118,75,140], 'h':366},
{'Node':'Z', 'children':['O','A'], 'gWt':[71,75], 'h':374},
{'Node':'T', 'children':['A'], 'gWt':[118], 'h':329},
{'Node':'S', 'children':['A','O','R','F'], 'gWt':[140,151,80,99], 'h':253}]
For a given node value, how would I get all of its 'children'?
For example,
graph.index('Node'=='S')['children'] -> ['A','O','R','F']
Comments on using index() or next():
Building a list of the node names and then finding a node's position with index() works, but note that it can cause significant performance degradation as well as logic errors.
Performance issues: list.index(target) is O(n): it scans the list from the start until it finds the first element equal to the target, so a single lookup may touch every node.
Logic errors: If the graph is malformed and contains the same node name twice or more, e.g. graph = [{"node": "A", ...props1}, {"node": "A", ...props2}], you will only ever see the first node, because index() returns the first match. To find every node "A" you would need to call index() repeatedly over the rest of the list until no further match is found, and then decide on a merge strategy for the node properties.
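The duplicate-name pitfall can be illustrated with a small sketch. The shortened graph below is made up for the demonstration and is not the data from the question.

```python
# A deliberately malformed graph: node "A" appears twice (illustrative data).
graph = [
    {'Node': 'A', 'children': ['T'], 'h': 366},
    {'Node': 'Z', 'children': ['O'], 'h': 374},
    {'Node': 'A', 'children': ['S'], 'h': 366},
]

names = [node['Node'] for node in graph]

# index() stops at the first match, so the second "A" is never seen.
first_only = graph[names.index('A')]['children']

# Collecting every match requires scanning the whole list.
all_matches = [node['children'] for node in graph if node['Node'] == 'A']

print(first_only)   # ['T']
print(all_matches)  # [['T'], ['S']]
```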
Solution (recommendation):
Store the graph's nodes in a dictionary. Dictionary lookups take constant time on average (O(1)) because dictionaries are hash tables. The representation you are aiming for is:
graph = {
    "A": {'children':['T','Z','S'], 'gWt':[118,75,140], 'h':366},
    "Z": {'children':['O','A'], 'gWt':[71,75], 'h':374},
    ...
}
This also makes traversing your graph much easier: whenever you are looking for node "X", graph["X"] returns it and graph["X"]["children"] returns its children.
How to transform the given input into the optimum data structure?
If you cannot build the data structure above from the start and you plan to query node properties often, you'll definitely want to transform the input into it. You can achieve that with:
def transform_graph(graph):
    new_graph = {}
    for node in graph:
        node_name = node['Node']
        # Reuse the entry if this node name was already seen, else start empty.
        new_graph[node_name] = new_graph.get(
            node_name,
            {'children': set(), 'gWt': set(), 'h': 0}
        )
        # Merge children and weights; duplicates collapse because these are sets.
        new_graph[node_name]["children"] |= set(node['children'])
        new_graph[node_name]["gWt"] |= set(node["gWt"])
        new_graph[node_name]["h"] = node["h"]
    return new_graph
The output of calling the function transform_graph(old_graph) is
old_graph = [
{'Node':'A', 'children':['T','Z','S'], 'gWt':[118,75,140], 'h':366},
{'Node':'Z', 'children':['O','A'], 'gWt':[71,75], 'h':374},
{'Node':'T', 'children':['A'], 'gWt':[118], 'h':329},
{'Node':'S', 'children':['A','O','R','F'], 'gWt':[140,151,80,99], 'h':253}
]
new_graph = transform_graph(old_graph)
print(new_graph)
> {
'A': {'children': {'S', 'T', 'Z'}, 'gWt': {75, 118, 140}, 'h': 366},
'Z': {'children': {'A', 'O'}, 'gWt': {71, 75}, 'h': 374},
'T': {'children': {'A'}, 'gWt': {118}, 'h': 329},
'S': {'children': {'A', 'F', 'O', 'R'}, 'gWt': {80, 99, 140, 151}, 'h': 253}
}
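Note that storing 'children' and 'gWt' as sets drops the pairing between each child and its weight. If gWt[i] is the weight of the edge to children[i] (an assumption about the data; the question does not say so), a variant that keeps the pairing could look like the sketch below. The function name transform_graph_weighted and the 'edges' key are hypothetical.

```python
def transform_graph_weighted(graph):
    """Index nodes by name and keep each child paired with its weight."""
    new_graph = {}
    for node in graph:
        entry = new_graph.setdefault(node['Node'], {'edges': {}, 'h': 0})
        # zip pairs children[i] with gWt[i]; assumes both lists are aligned.
        entry['edges'].update(zip(node['children'], node['gWt']))
        entry['h'] = node['h']
    return new_graph


old_graph = [
    {'Node': 'A', 'children': ['T', 'Z', 'S'], 'gWt': [118, 75, 140], 'h': 366},
    {'Node': 'Z', 'children': ['O', 'A'], 'gWt': [71, 75], 'h': 374},
    {'Node': 'T', 'children': ['A'], 'gWt': [118], 'h': 329},
    {'Node': 'S', 'children': ['A', 'O', 'R', 'F'], 'gWt': [140, 151, 80, 99], 'h': 253},
]

weighted = transform_graph_weighted(old_graph)
print(weighted['S']['edges'])        # {'A': 140, 'O': 151, 'R': 80, 'F': 99}
print(list(weighted['S']['edges']))  # ['A', 'O', 'R', 'F']
```

With this layout, weighted["S"]["edges"]["R"] gives the weight of the edge from S to R directly.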
Assuming there is exactly one node whose 'Node' value equals 'S', you could use next:
graph = [{'Node':'A', 'children':['T','Z','S'], 'gWt':[118,75,140], 'h':366},
{'Node':'Z', 'children':['O','A'], 'gWt':[71,75], 'h':374},
{'Node':'T', 'children':['A'], 'gWt':[118], 'h':329},
{'Node':'S', 'children':['A','O','R','F'], 'gWt':[140,151,80,99], 'h':253}]
result = next(e for e in graph if e['Node'] == 'S')['children']
print(result)
Output
['A', 'O', 'R', 'F']
I feel like your data isn't in the best format it could be, but this does the job:
graph[[x['Node'] for x in graph].index('S')]['children']
will print
['A', 'O', 'R', 'F']
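Both one-liners above fail when no node matches: next(...) raises StopIteration and index() raises ValueError. A small sketch of a more forgiving lookup follows; the helper name children_of is hypothetical. It relies on the second argument of next, which is returned when the iterator is exhausted.

```python
graph = [
    {'Node': 'A', 'children': ['T', 'Z', 'S'], 'gWt': [118, 75, 140], 'h': 366},
    {'Node': 'Z', 'children': ['O', 'A'], 'gWt': [71, 75], 'h': 374},
    {'Node': 'T', 'children': ['A'], 'gWt': [118], 'h': 329},
    {'Node': 'S', 'children': ['A', 'O', 'R', 'F'], 'gWt': [140, 151, 80, 99], 'h': 253},
]


def children_of(graph, name):
    """Return the children of the first node called `name`, or None if absent."""
    match = next((node for node in graph if node['Node'] == name), None)
    return None if match is None else match['children']


print(children_of(graph, 'S'))  # ['A', 'O', 'R', 'F']
print(children_of(graph, 'Q'))  # None
```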

What does "char_to_ix = { ch:i for i,ch in enumerate(sorted(chars)) }" do?

What does this line of code do?
char_to_ix = { ch:i for i,ch in enumerate(sorted(chars)) }
What is the meaning of ch:i?
This is a dict comprehension, as mentioned by @han solo.
The final product is a dict.
It sorts your chars, numbers them in ascending order starting from 0, and then uses each character as the key for its number.
Here's an example:
chars = ['d', 'a', 'b']
sorted(chars) => ['a', 'b', 'd']
enumerate(sorted(chars)) => an enumerate object (an iterator) that yields (0, 'a'), (1, 'b'), (2, 'd')
char_to_ix = {'a': 0, 'b': 1, 'd': 2}
It is a dict comprehension.
ch is the key in the dictionary,
i is the value for that key.
Dictionary syntax is
d = {
    key1: value1,
    key2: value2
}
Your code generates key: value pairs from the enumerated chars.
Each key is an element of the sorted list of chars.
Each value is the index of that element in the sorted list.
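To make the equivalence concrete, here is a short sketch that builds the same mapping with the comprehension and with an explicit loop. The names explicit and ix_to_char are introduced for illustration and are not part of the original line.

```python
chars = ['d', 'a', 'b']

# The comprehension from the question.
char_to_ix = {ch: i for i, ch in enumerate(sorted(chars))}

# The same thing written as an explicit loop.
explicit = {}
for i, ch in enumerate(sorted(chars)):
    explicit[ch] = i

# The inverse mapping, from index back to character (illustrative extra).
ix_to_char = {i: ch for ch, i in char_to_ix.items()}

print(char_to_ix)  # {'a': 0, 'b': 1, 'd': 2}
print(ix_to_char)  # {0: 'a', 1: 'b', 2: 'd'}
```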

Merge dictionaries with old keys when old value and new key is same

I want to merge two dictionaries like this:
old={'a':'1a','b':'1b','c':'1c'}
new={'1a':'c','1b':'d','1c':'e'}
I want output like this:
new_dict={'a':'c','b':'d','c':'e'}
Note: the two dictionaries have different lengths.
How can I do this in Python?
With a dict-comprehension:
old = {'a': '1a','b': '1b','c': '1c'}
new = {'1a': 'c','1b': 'd','1c': 'e'}
res = {k: new[v] for k, v in old.items()} # if all values in `old` exist in `new` as keys.
res = {k: new.get(v, None) for k, v in old.items()} # if you cannot guarantee the above.
print(res) # {'a': 'c', 'b': 'd', 'c': 'e'} (key order may differ before Python 3.7)
*Note that the None parameter of the .get() method is the default one and as such, it can be omitted. I will leave it there though to remind you that you can specify anything you want depending on the specifics of your problem (e.g., '' (blank string) might be better in your case)
You can get the new dictionary using a dictionary comprehension where you get the values from the new dictionary based on the keys in the old dictionary. Be sure to use get which returns None by default if a value from the old dictionary is not present as a key in the new dictionary.
old = {'a': '1a', 'b': '1b' ,'c': '1c'}
new = {'1a': 'c', '1b': 'd', '1c': 'e'}
new_dict = {k: new.get(old[k]) for k in old}
>>> new_dict
{'a': 'c', 'b': 'd', 'c': 'e'}
You could use a dictionary comprehension that interprets the value of the first dictionary as the key of the second dictionary:
>>> {item: new[value] for item, value in old.items() if value in new}
{'a': 'c', 'b': 'd', 'c': 'e'}
If you can guarantee that all values of old are keys in new, you can omit the if value in new part.
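The three variants above differ only in how they treat a value of old that is not a key in new. The sketch below adds a hypothetical extra entry 'd': '1d' to old to show the difference; the names with_none, filtered, strict and missing are illustrative.

```python
old = {'a': '1a', 'b': '1b', 'c': '1c', 'd': '1d'}  # '1d' has no match in new
new = {'1a': 'c', '1b': 'd', '1c': 'e'}

# .get() keeps the key and maps it to None (or any default you pass).
with_none = {k: new.get(v) for k, v in old.items()}

# The filtered comprehension drops the unmatched key entirely.
filtered = {k: new[v] for k, v in old.items() if v in new}

# Plain indexing raises KeyError on the first unmatched value.
try:
    strict = {k: new[v] for k, v in old.items()}
except KeyError as exc:
    strict = None
    missing = exc.args[0]

print(with_none)  # {'a': 'c', 'b': 'd', 'c': 'e', 'd': None}
print(filtered)   # {'a': 'c', 'b': 'd', 'c': 'e'}
print(missing)    # 1d
```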

find key value based on value field python dictionary

I have a dictionary where the values are lists. I would like to look for a specific value in the lists (value field of dictionary) and return the key value:
dict={'a':['ab','cd','ef'], 'b':['gh', 'ij'], 'c':['kl', 'mn']}
So for 'ef' I would get 'a', for 'mn' I would get 'c'...
I have tried
value_key=[a for a,b in dict if value in b]
Any ideas?
Assuming you want to do indexing this way more than once, you should build the reverse mapping, from values (sub-values really) to keys:
{vsub: k for k, v in d.items() for vsub in v}  # on Python 2, d.iteritems() avoids building a list
This takes your original dict (called d here because dict is a builtin name in Python) and "inverts" it, with the tweak that each sub-value (each element of the lists) is mapped to the key of the list it came from.
The new dict looks like this, so you can simply index into it with keys like 'ab' to get 'a':
{'ab': 'a', 'ef': 'a', 'mn': 'c', 'kl': 'c', 'ij': 'b', 'cd': 'a', 'gh': 'b'}
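A runnable sketch of the reverse mapping, using the data from the question; the name reverse is an assumption. One caveat worth knowing: if the same sub-value appears in more than one list, the key processed last overwrites the earlier one.

```python
d = {'a': ['ab', 'cd', 'ef'], 'b': ['gh', 'ij'], 'c': ['kl', 'mn']}

# Build the sub-value -> key mapping once; each later lookup is a dict access.
reverse = {vsub: k for k, v in d.items() for vsub in v}

print(reverse['ef'])      # a
print(reverse['mn'])      # c
print(reverse.get('zz'))  # None, because 'zz' is not in any list
```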
Iterate through the dictionary with for key in dict_object, and use the in operator to check whether the value being searched for is in the list stored under that key. If it is, keep the key for the output.
my_dict = {"a": ["ab", "cd", "ef"], "b": ["gh", "ij"], "c": ["kl", "mn"]}
val = "ef"
print([key for key in my_dict if val in my_dict[key]])
# ['a']
The advantage of this method is that it works on both Python 2 and Python 3, as we don't have to worry about the items and iteritems methods.

Recalling information from dictionaries within dictionaries

So say I have a number of dictionaries within a dictionary
d = {'a': {'name': 'bob', 'class': '2a'}, 'b': {'name': 'mike', 'class': '2b'}, 'c': {'name': 'ben', 'class': '2b'}}
How would I go about identifying items within each of these internal dictionaries? Say I wanted to identify the keys of the internal dictionaries that are in 'class' '2b'. How would I code this so that it gives me the keys 'b' and 'c'?
Thanks in advance.
You need to loop over the keys of your dictionary and check each sub-dict.
[k for k in d if d[k]['class'] == '2b']
Out[16]: ['c', 'b']
Alternatively,
[k for k,v in d.items() if v['class'] == '2b']
Out[17]: ['c', 'b']
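Both comprehensions raise KeyError if any inner dictionary lacks a 'class' key. A slightly more defensive sketch uses .get() and sorts the result so the output order is predictable. The helper name keys_in_class and the extra entry 'x' are hypothetical.

```python
d = {
    'a': {'name': 'bob', 'class': '2a'},
    'b': {'name': 'mike', 'class': '2b'},
    'c': {'name': 'ben', 'class': '2b'},
    'x': {'name': 'ann'},  # hypothetical entry with no 'class' key
}


def keys_in_class(records, wanted):
    """Return the sorted outer keys whose inner dict has class == wanted."""
    return sorted(k for k, v in records.items() if v.get('class') == wanted)


print(keys_in_class(d, '2b'))  # ['b', 'c']
print(keys_in_class(d, '9z'))  # []
```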
