Reduce dictionary dimension to 1 - python

I have several dictionaries that I would like to merge into a pd.DataFrame. In order to be able to do that, all the dictionaries need to have a dimension of 1. How can I reduce the dimension of the following dictionary to 1 but so that it remains a dictionary? In cases where there are multiple entries for a dictionary item, it should simply be joined. So myFunds should become '2.0 0.0'
h = {
'gameStage': 'PreFlop',
'losses': 0,
'myFunds': [2.0, '0.0'],
'myLastBet': 0,
'peviousRoundCards': ['2S', 'QS'],
'previousCards': ['2S', 'QS'],
'previousPot': 0.03,
'wins': 0
}
To be more precise what my problem is: I want to save all properties of an object in a DataFrame. For that I do vars(Object) which gives me a dictionary, but not all of the entries have the same dimension, so I can't do DataFrame(vars(Object)) because I get an error. That's why I need to first convert vars(Object) into a one dimensional dictionary. But it needs to remain a Dictionary, otherwise I can't convert it into a DataFrame.

You can iterate over the items in the dictionary and flatten each value with str.join().
For each key/value pair, we convert every element of the value to a string (since you can't join an int and a string) and then put the joined result back into the dictionary.
The helper function func makes sure we always have an iterable item to join, by wrapping non-iterable values (and plain strings) in a list.
from collections.abc import Iterable  # on Python 2: from collections import Iterable

def func(x):
    # wrap scalars (and strings) in a list so the result is always iterable
    # use basestring instead of str on Python 2
    if isinstance(x, Iterable) and not isinstance(x, str):
        return x
    return [x]

for key, val in h.items():
    h[key] = " ".join([str(ele) for ele in func(val)])
Output of h becomes:
{
'gameStage': 'PreFlop',
'losses': '0',
'myFunds': '2.0 0.0',
'myLastBet': '0',
'peviousRoundCards': '2S QS',
'previousCards': '2S QS',
'previousPot': '0.03',
'wins': '0'
}
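A variation on the same idea, as a minimal sketch: build a new dictionary instead of modifying h in place, and leave scalar values untouched so numbers stay numbers. The names flatten_value, flatten_dict and the Player class are hypothetical, introduced only for illustration; Player stands in for whatever object vars() is called on.

```python
from collections.abc import Iterable

def flatten_value(value):
    # Join list-like values into one string; leave scalars and strings as they are.
    if isinstance(value, Iterable) and not isinstance(value, str):
        return " ".join(str(item) for item in value)
    return value

def flatten_dict(d):
    # Return a new dict; the original is not modified.
    return {key: flatten_value(val) for key, val in d.items()}

class Player:  # hypothetical stand-in for the object in the question
    def __init__(self):
        self.gameStage = 'PreFlop'
        self.myFunds = [2.0, '0.0']
        self.wins = 0

flat = flatten_dict(vars(Player()))
print(flat)  # {'gameStage': 'PreFlop', 'myFunds': '2.0 0.0', 'wins': 0}
```

Every value in flat is now a scalar, which is the one-dimensional shape the question asks for.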

Related

update dictionary in nested loop

I am having trouble updating a dictionary. I am just extracting certain fields within each key, but the output is not as expected. The data, expected output, and code are below. Thanks for looking; I appreciate any comments.
categories = {'categories_a1' : [{'group': '13GH9', 'number': '1'},{'group': '17KPO', 'number': '73'}, {'group': '26BN11', 'number': '2'}, {'group': '813W', 'number': '99'}],
'categories_a2' : [{'group': '99ITY', 'number': '12'},{'group': 'JH871', 'number': '15'}, {'group': 'OLH83', 'number': '99'}, {'group': '44RTQ', 'number': '1'}]}
expected = {'categories_a1' : [{'13GH9': '1'},{'17KPO':'73'}, {'26BN11':'2'}, {'813W': '99'}],
'categories_a2' : [{'99ITY':'12'},{'JH871': '15'}, {'OLH83': '99'}, {'44RTQ':'1'}]}
out={}
for k in categories.keys():
    for i in categories[k]:
        x = {k: v for k, v in zip([i['group']], [i['number']])}
    out[k] = x
out.update(out)
Let's first clean up some general weirdness:
out.update(out)
This line does effectively nothing and should be omitted.
x = {k: v for k, v in zip([i['group']], [i['number']])}
This makes little sense; we create lists with one element each and iterate over them in parallel. We could just as easily just use those values directly: x = {i['group']: i['number']}.
After swapping that in, let's consider the part that causes the actual problem:
for i in categories[k]:
    x = {i['group']: i['number']}
out[k] = x
The problem here is that you want out[k] to constitute a list of all of the modified dictionaries, but x is repeatedly being assigned one of those dictionaries, and the result then becomes out[k]. What you presumably intended to do is repeatedly append those dictionaries to a new empty list:
x = []
for i in categories[k]:
    x.append({i['group']: i['number']})
out[k] = x
However, it's clear that you're already familiar and comfortable with comprehensions, and this is an ideal place to use one:
out[k] = [{i['group']: i['number']} for i in categories[k]]
And, of course, we can extend this technique to the overall loop:
out = {
    k: [{i['group']: i['number']} for i in v]
    for k, v in categories.items()
}
Please carefully study the structure of this code and make sure you understand the technique. We have a source dictionary that we want to transform to create our output, and the rule is: the key remains unchanged, the value (which is a list) undergoes its own transformation. So we start by writing the skeleton for a dict comprehension, using .items() to give us key-value pairs:
out = {
    k: # we need to fill in something to do with `v` here
    for k, v in categories.items()
}
Then we figure out what we're doing with the value: each element of the list is a dictionary; the way that we process the list is iterative (each element of the input list tells us an element to use in the output list), but the processing of those elements is not (we look at exactly two hard-coded values from that dict, and make a dict from them). Given an element i of the list, the corresponding dict that we want has exactly one key-value pair, which we can compute as {i['group']: i['number']}. So we wrap a list comprehension around that: [{i['group']: i['number']} for i in v]; and we insert that into the dict comprehension skeleton, giving us the final result.
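As a quick check, the finished comprehension can be run against the sample data and compared with the expected output from the question. This is only a sketch; the data is shortened to two entries per key to keep it brief.

```python
# Shortened sample data from the question (two entries per key).
categories = {
    'categories_a1': [{'group': '13GH9', 'number': '1'}, {'group': '17KPO', 'number': '73'}],
    'categories_a2': [{'group': '99ITY', 'number': '12'}, {'group': 'JH871', 'number': '15'}],
}
expected = {
    'categories_a1': [{'13GH9': '1'}, {'17KPO': '73'}],
    'categories_a2': [{'99ITY': '12'}, {'JH871': '15'}],
}

# Outer dict comprehension keeps the key; inner list comprehension rebuilds each element.
out = {
    k: [{i['group']: i['number']} for i in v]
    for k, v in categories.items()
}

print(out == expected)  # True
```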
One approach:
for key, value in categories.items():
    categories[key] = [{ d["group"] : d["number"] } for d in value]
print(categories)
Output
{'categories_a1': [{'13GH9': '1'}, {'17KPO': '73'}, {'26BN11': '2'}, {'813W': '99'}], 'categories_a2': [{'99ITY': '12'}, {'JH871': '15'}, {'OLH83': '99'}, {'44RTQ': '1'}]}

customize dictionary key and values

I have a question about changing the format of a dictionary.
The dictionary is:
{'index': 'cfs_nucleus_bespoke_88260', 'host': 'iaasn00018224.svr.us.jpmchase.net', 'source': '/logs/tomcat7inst0/localhost_tomcat7inst0_access_log2018-11-02.txt', '_time': '2018-11-02 19:46:50.000 EDT', 'count': '1'}
Is there a way to change it into the format below?
{"column1":{'index': 'cfs_nucleus_', 'host': 'iaasn00018224.net'}, "column2":{'source': '/logs/tomcat7inst0/localhost_tomcat7inst0_access_log2018-11-02.txt'}, "column3":{'_time': '2018-11-02', 'count': '1'}}
You can do the following:
dict1 = {'index': 'cfs_nucleus_bespoke_88260', 'host': 'iaasn00018224.svr.us.jpmchase.net', 'source': '/logs/tomcat7inst0/localhost_tomcat7inst0_access_log2018-11-02.txt', '_time': '2018-11-02 19:46:50.000 EDT', 'count': '1'}
d1_items = list(dict1.items())
col_width = 2
dict2 = {
    f'column{col_num // col_width + 1}': {k: v for k, v in d1_items[col_num:col_num + col_width]}
    for col_num in range(0, len(dict1), col_width)
}
There are a few moving parts that interact to create this solution:
Dict comprehensions
Python lets you embed for ... in loops inside list, set, and dict literals (comprehensions) to build a new collection from the elements of an iterable. Here, the outer iterator is range(0, len(dict1), col_width): it goes through a sequence of integers starting from 0 and increasing by col_width, stopping before it reaches the number of items. Each integer is the start index of one col_width-sized dict segment.
Tuple unpacking
dict1.items() is convenient because it returns a dict view of 2-tuples, one for each dictionary key and its value. Later, we use tuple unpacking in k: v for k, v in d1_items[ ... ], where each 2-tuple is unpacked into two variables that then form a key-value pair of the current dict comprehension. (Dict comprehensions require Python 2.7 or later.)
List slicing
d1_items[col_num:col_num + col_width] is a way of getting a sublist. The syntax is relatively straightforward: starting from position col_num, get a sublist up to and excluding the element at col_num + col_width (i.e. a sublist of at most col_width elements; the last one may be shorter).
Formatted string literals
Preceding a string with f makes it a formatted string literal. Anything within { } is evaluated as a Python expression (quotes and literal braces inside the replacement field need special care). Here, in f'column{col_num // col_width + 1}', it lets us label each column using integer division and a +1 offset, so that counting starts from 1 instead of 0. (This is new in Python 3.6.)
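To see how these four pieces fit together, here is the same logic written as an explicit loop. It is a sketch rather than a replacement for the one-liner; the values 'a' to 'e' are placeholders for the long strings in the question, and the names items, start, and label are chosen here for illustration.

```python
# Placeholder values stand in for the long strings from the question.
dict1 = {'index': 'a', 'host': 'b', 'source': 'c', '_time': 'd', 'count': 'e'}
items = list(dict1.items())   # list of (key, value) 2-tuples
col_width = 2

dict2 = {}
for start in range(0, len(items), col_width):    # start = 0, 2, 4
    label = f'column{start // col_width + 1}'    # 'column1', 'column2', 'column3'
    chunk = items[start:start + col_width]       # slice of at most col_width pairs
    dict2[label] = dict(chunk)                   # rebuild a dict from the pairs

print(dict2)
# {'column1': {'index': 'a', 'host': 'b'}, 'column2': {'source': 'c', '_time': 'd'}, 'column3': {'count': 'e'}}
```

Note that this groups the items strictly two at a time in insertion order, so the last column holds only 'count'. A grouping with uneven column sizes, like the one in the question, would need explicit lists of keys per column.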

Replace elements in a nested dict with the appropriate in a list

I have the following dict and list.
mylist= ['1H1.PyModule.md',
'1H2.Class.md',
'1H3.MetaObject.md',
'2B1D0.Data.md',
'2B1D1.Primitive.md',
'2B1D2.Operator.md',
'2B2D3.Container.md',
'2B2S0.Function.md',
'2B2S0.Statemment.md',
'2B2S1.Controlled_Loop.md',
'2B2S2.Conditions.md',
'2B2S3.Except.md',
...
]
mydict = {'Body': {'Data': ['1.primitive', '2.operator', '3.container'],
'Statement': ['0.function', '1.controlled_loop', '2.condition', '3.except']},
'Header': ['1.Modle', '2.Class', '3.Object'],
...}
I am trying to replace the strings in mydict with the matching entries in mylist.
The keyword in '2B1D0.Data.md' has the shortest length,
so I slice out the keyword 'Data':
In [82]: '2B1D0.Data.md'[-7:-3]
Out[82]: 'Data'
The dict contains both nested lists and nested dicts,
so I wrote a recursive function with type checking:
if an item's value is a list (isinstance(value, list)), update that value;
if an item's value is a dict (isinstance(value, dict)), call replace_ele() again to go one level deeper.
I name a string from mylist str_of_list
and a string from mydict str_of_dict for readability.
#replace one string in mydict
def replace_ele(mydict, str_of_list):
    for key, value in mydict.items():
        if isinstance(value, list): #type checking
            for str_of_dict in value:
                #replace str_of_dict with str_of_list
                if str_of_list[-7:-3].lower() == str_of_dict[-4:]: #[-7:-3] the shortest length
                    value.remove(str_of_dict)
                    value.append(str_of_list)
                    value.sort()
                    mydict[key] = value
        #iteration if a dict
        if isinstance(value, dict):
            replace_ele(value,str_of_list)
for str_of_list in mylist:
    replace_ele(mydict, str_of_list)
Running it outputs:
Out[117]:
{'Body': {'Data': ['2B1D1.Primitive.md',
'2B1D2.Operator.md',
'2B2D3.Container.md'],
'Statement': ['2B2S0.Function.md',
'2B2S0.Function.md',
'2B2S1.Controlled_Loop.md',
'2B2S3.Except.md']},
'Header': ['1.Modle', '1H2.Class.md', '1H3.MetaObject.md']
....}
I assume that this problem can be solved with less code.
However, I cannot find such a solution with my limited knowledge.
How can I accomplish this elegantly?
My suggestion is to create a function that reduces the elements of mylist and the elements in the lists of mydict to the same format.
For example, you can split on the '.' character, take the second field, and convert it to lower case:
def f(s):
    return s.split('.')[1].lower()
E.g.:
>>> f('2B2S1.Controlled_Loop.md')
'controlled_loop'
>>> f('1.controlled_loop')
'controlled_loop'
Now, from mylist create a dict to hold replacements:
mylist_repl={f(x): x for x in mylist}
i.e. mylist_repl contains key: value items such as 'metaobject': '1H3.MetaObject.md'.
With dict mylist_repl and function f, it is easy to transform a list from mydict to the desired value, example:
>>> [mylist_repl[f(x)] for x in ['1.primitive', '2.operator', '3.container']]
['2B1D1.Primitive.md', '2B1D2.Operator.md', '2B2D3.Container.md']
Also note that this dictionary lookup is more efficient (i.e.: faster) than a nested for loop!
If you have different replacement logic, you probably only need to change how f maps items from two different sets to a common key.
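Putting the pieces together for the nested case, a minimal sketch might look like the following. The function name replace_all is hypothetical, and the data is a shortened version of the question's. The sketch assumes that strings without a match in mylist (such as '1.Modle') should simply be left unchanged, which is why it uses dict.get with a fallback.

```python
def f(s):
    # Reduce both naming schemes to a common key: the second dot-separated field, lowercased.
    return s.split('.')[1].lower()

def replace_all(node, repl):
    # Recurse through nested dicts; replace strings inside lists via the lookup table.
    if isinstance(node, dict):
        return {key: replace_all(value, repl) for key, value in node.items()}
    if isinstance(node, list):
        return [repl.get(f(item), item) for item in node]  # keep item if no match
    return node

# Shortened sample data from the question.
mylist = ['1H2.Class.md', '2B1D1.Primitive.md', '2B1D2.Operator.md', '2B2D3.Container.md']
mydict = {'Body': {'Data': ['1.primitive', '2.operator', '3.container']},
          'Header': ['1.Modle', '2.Class']}

mylist_repl = {f(x): x for x in mylist}
result = replace_all(mydict, mylist_repl)
print(result)
# {'Body': {'Data': ['2B1D1.Primitive.md', '2B1D2.Operator.md', '2B2D3.Container.md']}, 'Header': ['1.Modle', '1H2.Class.md']}
```

Because it builds a new structure instead of removing and appending while iterating, it avoids modifying a list during a loop over it.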

Sort a list of dictionaries by value

I have a list of dictionaries, of the form:
neighbour_list = [{1:4}, {3:5}, {4:9}, {5:2}]
I need to sort the list in order of the dictionary with the largest value. So, for the above code the sorted list would look like:
sorted_list = [{4:9}, {3:5}, {1:4}, {5:2}]
Each dictionary within the list only has one mapping.
Is there an efficient way to do this? Currently I am looping through the list to get the biggest value, then remembering where it was found to return the largest value, but I'm not sure how to extend this to be able to sort the entire list.
Would it just be easier to implement my own dict class?
EDIT: here is my code for returning the dictionary which should come 'first' in an ideally sorted list.
temp = 0
element = 0
for d in list_of_similarities:
    for k in d:
        if (d[k] > temp):
            temp = d[k]
            element = k
            dictionary = d
first = dictionary[element]
You can use an anonymous function as your sorting key to pull out the dict value (though I'm not sure this is the most efficient way):
sorted(neighbour_list, key = lambda x: tuple(x.values()), reverse=True)
[{4: 9}, {3: 5}, {1: 4}, {5: 2}]
Note we need to coerce x.values() to a tuple, since in Python 3, x.values() is of type "dict_values" which is unorderable. I guess the idea is that a dict is more like a set than a list (hence the curly braces), and there's no (well-) ordering on sets; you can't use the usual lexicographic ordering since there's no notion of "first element", "second element", etc.
You could also sort in place with list.sort, using the dict values as the key. In Python 2, where dict.values() returns a list, this works directly:
neighbour_list.sort(key=lambda x: x.values(), reverse=True)
Since each dict has only one value, in Python 2 you can just call next on itervalues to get the first and only value:
neighbour_list.sort(key=lambda x: next(x.itervalues()), reverse=True)
print(neighbour_list)
In Python 3, you cannot call next directly on dict.values(), because a view is not an iterator; wrap it in iter first:
neighbour_list.sort(key=lambda x: next(iter(x.values())), reverse=True)
Alternatively, call list on dict.values() so that the keys can be compared:
neighbour_list.sort(key=lambda x: list(x.values()), reverse=True)
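For reference, here is a self-contained Python 3 sketch of the single-value approach, using the data from the question. The helper name only_value is hypothetical; it relies on the question's statement that each dictionary has exactly one mapping.

```python
neighbour_list = [{1: 4}, {3: 5}, {4: 9}, {5: 2}]

def only_value(d):
    # Each dict holds exactly one mapping, so take its first (and only) value.
    return next(iter(d.values()))

# sorted() returns a new list; reverse=True puts the largest value first.
sorted_list = sorted(neighbour_list, key=only_value, reverse=True)
print(sorted_list)  # [{4: 9}, {3: 5}, {1: 4}, {5: 2}]
```

Using sorted leaves neighbour_list untouched; call neighbour_list.sort(key=only_value, reverse=True) instead to sort in place.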

Iterating over "function" object

I'm looking for a way to be able to define dictionary keys by function parameters. In the code below I make divisions of the first and second letters of the dictionary keys but Python's function parameters are not strings.
def x(a, b, c):
    dict = {'ab': 0, 'ac': 0, 'bc': 0}
    for d, e in dict.keys:
        dict[de] = d/e
x(10, 20, 30)
Here's some code that will handle any number of arguments. It first sorts the argument names into alphabetical order. Then it creates all pairs of arguments and performs the divisions (with the value of the earlier argument in alphabetical order being divided by the value of the later one), storing the results in a dict whose keys are constructed by concatenating the argument names. The function returns the constructed dict, but of course in your code you may wish to perform further actions on it.
from itertools import combinations

def ratios(**kwargs):
    pairs = combinations(sorted(kwargs.keys()), 2)
    return dict((p + q, kwargs[p] / kwargs[q]) for p, q in pairs)

print(ratios(a=600, b=3, c=2))
output
{'ac': 300.0, 'ab': 200.0, 'bc': 1.5}
