Split a dictionary into different dictionaries based on the key in Python?

Suppose I have a dictionary:
d = {'a_c':1,'b_c':2,'a_d':3,'b_d':4}
How do I split it into two dictionaries based on the last word/letter of the key ('c', 'd'), like this?
d1 = {'a_c':1,'b_c':2}
d2 = {'a_d':3,'b_d':4}

This is one way:
from collections import defaultdict
d = {'a_c':1,'b_c':2,'a_d':3,'b_d':4}
key = lambda s: s.split('_')[1]
res = defaultdict(dict)
for k, v in d.items():
    res[key(k)][k] = v
print(list(res.values()))
Output:
[{'a_c': 1, 'b_c': 2}, {'a_d': 3, 'b_d': 4}]
The result is a list of dictionaries, grouped by the part of each key that follows the underscore.

You could try something like this:
func = lambda ending_str: {x: d[x] for x in d.keys() if x.endswith(ending_str)}
d1 = func('_c')
d2 = func('_d')
Also, like Marc mentioned in the comments, you shouldn't have two keys with the same name in the dictionary. It will only keep the last key/value pair in that case.
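Both answers above hard-code something: the first assumes exactly one underscore per key, and the second needs every suffix listed by hand. A minimal sketch of a more general helper follows, assuming the group label is whatever comes after the last separator in the key. The name `split_by_suffix` and its `sep` parameter are made up for illustration.

```python
from collections import defaultdict

def split_by_suffix(d, sep='_'):
    """Group the items of d by the text after the last sep in each key."""
    groups = defaultdict(dict)
    for key, value in d.items():
        # rpartition returns (head, sep, tail); tail is the whole key when sep is absent
        suffix = key.rpartition(sep)[2]
        groups[suffix][key] = value
    return dict(groups)

d = {'a_c': 1, 'b_c': 2, 'a_d': 3, 'b_d': 4}
groups = split_by_suffix(d)
d1, d2 = groups['c'], groups['d']
print(d1)  # {'a_c': 1, 'b_c': 2}
print(d2)  # {'a_d': 3, 'b_d': 4}
```

Returning a dict keyed by suffix, rather than a bare list, lets the caller pick a group by name without relying on insertion order.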

Related

Merge two dictionaries conditionally using dict comprehension

I'd like to join two dictionaries based on the value of d1 and a substring of the key of d2. The resulting dictionary should have the key of d1 with the corresponding value of d2.
d1 = {'web02': '23', 'web01': '50'}
d2 = {'server/dc-50': 's01.local', 'server/dc-23': 's02.local'}
Would result in {'web01': 's01.local', 'web02': 's02.local'}
I guess this is what you need:
result = {k1: v2 for k1, v1 in d1.items() for k2, v2 in d2.items() if v1 in k2}
Output:
{'web02': 's02.local', 'web01': 's01.local'}
This can be done without a nested loop by building the d2 key with string concatenation and looking it up directly:
data = {k: d2['server/dc-' + v] for k, v in d1.items()}
Prints:
{'web02': 's02.local', 'web01': 's01.local'}
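Each approach above has an edge case: the substring test `v1 in k2` would also match '23' against a key such as 'server/dc-123', and the direct lookup raises KeyError when d2 has no matching entry. Here is a minimal sketch that avoids both, assuming the identifier is everything after the last hyphen in each d2 key. The name `by_id` is only illustrative.

```python
d1 = {'web02': '23', 'web01': '50'}
d2 = {'server/dc-50': 's01.local', 'server/dc-23': 's02.local'}

# Index d2 by the text after the last '-' in each key, e.g. '50' -> 's01.local'
by_id = {key.rsplit('-', 1)[-1]: host for key, host in d2.items()}

# Look up each d1 value; entries without a match are skipped instead of raising KeyError
result = {name: by_id[dc] for name, dc in d1.items() if dc in by_id}
print(result)  # {'web02': 's02.local', 'web01': 's01.local'}
```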

Adding a string to all keys in dictionary (Python)

I'm new to Python and PySpark and I'm practicing TF-IDF.
I split the sentences in the txt file into words, removed punctuation, removed the words that are in the stop-words list, and saved the result as a dictionary with the code below.
x = text_file.flatMap(lambda line: str_clean(line).split())
x = x.filter(lambda word: word not in stopwords)
x = x.map(lambda word: (word, 1))  # assumed step: reduceByKey needs (key, value) pairs
x = x.reduceByKey(lambda a, b: a + b)
x = x.collectAsMap()
I have 10 different txt files that go through this same process, and I'd like to add a string like "#d1" to the keys in the dictionary to indicate that a key comes from document 1.
How can I add "#d1" to all keys in the dictionary?
Essentially my dictionary is in the form:
{'word1': 1, 'word2': 1, 'word3': 2, ....}
And I would like it to be:
{'word1#d1': 1, 'word2#d1': 1, 'word3#d1': 2, ...}
Try a dictionary comprehension:
{k+'#d1': v for k, v in d.items()}
In Python 3.6+, you can use f-strings:
{f'{k}#d1': v for k, v in d.items()}
You can use the dict constructor to rebuild the dict, appending the file number to the end of each key:
>>> d = {'a': 1, 'b': 2}
>>> file_number = 1
>>> dict(("{}#{}".format(k,file_number),v) for k,v in d.items())
{'a#1': 1, 'b#1': 2}
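Since the question mentions ten files, the same comprehension can be wrapped in a function and applied per document. This is only a sketch: `tag_keys` and `per_file_counts` are hypothetical names, and the two small dicts stand in for the results of `collectAsMap()` for each file.

```python
def tag_keys(counts, doc_id):
    """Return a new dict whose keys carry a '#d<doc_id>' suffix."""
    return {f'{word}#d{doc_id}': n for word, n in counts.items()}

# Stand-ins for the dictionaries returned by collectAsMap() for each file
per_file_counts = [
    {'word1': 1, 'word2': 1},
    {'word1': 3, 'word3': 2},
]

merged = {}
for doc_id, counts in enumerate(per_file_counts, start=1):
    # Tagged keys cannot collide across documents, so update() is safe here
    merged.update(tag_keys(counts, doc_id))

print(merged)
# {'word1#d1': 1, 'word2#d1': 1, 'word1#d2': 3, 'word3#d2': 2}
```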
I have a list of dict that looks like below
def prefix_key_dict(prefix, test_dict):
    # Build a new dict with the prefix prepended to each lower-cased key
    res = {prefix + str(key).lower(): val for key, val in test_dict.items()}
    return res
temp_prefix = 'column_'
transformed_dict = [prefix_key_dict(temp_prefix, each) for each in table_col_list]
and the transformed json looks like below

Sort the keys of a dictionary in python

dict:
d1 = {'b,a':12,'b,c,a':13}
Code:
import collections
x = collections.OrderedDict(sorted(d1.items()))
print(x)
Not getting the expected output.
Expected Output:
d1 = {'a,b': 12, 'a,b,c':13}
It looks like you don't actually want to sort the keys, you want to re-arrange the non-comma substrings of your keys such that these substrings are ordered.
>>> d1 = {'b,a':12,'b,c,a':13}
>>> {','.join(sorted(key.split(','))):val for key, val in d1.items()}
{'a,b': 12, 'a,b,c': 13}
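d1.items() returns the (key, value) pairs of the dictionary (a list in Python 2, a view in Python 3).
sorted(d1.items()) simply sorts those pairs by key; it does not change the keys themselves.
If you want to sort the items inside your keys, then you need to split each key, sort its parts, and join them back together.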
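If both kinds of ordering are wanted, the two steps can be chained. The sketch below assumes Python 3.7+ (where a plain dict keeps insertion order) and that no two keys normalize to the same string, since a later one would overwrite the earlier one. The names `normalized` and `ordered` are illustrative.

```python
d1 = {'b,c,a': 13, 'b,a': 12}

# Step 1: order the comma-separated parts inside each key
normalized = {','.join(sorted(key.split(','))): val for key, val in d1.items()}

# Step 2: order the entries by their (now normalized) keys
ordered = dict(sorted(normalized.items()))
print(ordered)  # {'a,b': 12, 'a,b,c': 13}
```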

converting list to dict and averaging the values of duplicates in python

I have a list:
l = [('a',1),('b',2),('a',3)]
I want to convert it to a dict so that when a key is duplicated (e.g. ('a',1) and ('a',3)), the values are averaged and the dict has just one key:value pair for that key, which in this case would be a:2.
from collections import defaultdict
l = [('a',1),('b',2),('a',3)]
d = defaultdict(list)
for pair in l:
    d[pair[0]].append(pair[1])  # add each number to the list under the correct key
for (k,v) in d.items():
    d[k] = sum(d[k])/len(d[k])  # re-assign the value for key k as the sum of its list divided by the list's length
So
print(d)
>>> defaultdict(<type 'list'>, {'a': 2, 'b': 2})
Or even nicer and producing a plain dictionary in the end:
from collections import defaultdict
l = [('a',1),('b',2),('a',3)]
temp_d = defaultdict(list)
for pair in l:
    temp_d[pair[0]].append(pair[1])
#CHANGES HERE
final = dict((k,sum(v)/len(v)) for k,v in temp_d.items())
print(final)
>>>
{'a': 2, 'b': 2}
Note that if you are using Python 2.x (as you are), you will need to adjust the division as follows to force float division:
(k,sum(v)/float(len(v)))
OR
sum(d[k])/float(len(d[k]))
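On Python 3 the float cast is unnecessary because `/` is already true division. As a hedged alternative, the standard library's `statistics.mean` can replace the manual `sum`/`len` arithmetic. The variable names below are illustrative, not taken from the answers above.

```python
from collections import defaultdict
from statistics import mean

pairs = [('a', 1), ('b', 2), ('a', 3)]

# Collect every value seen for each key
grouped = defaultdict(list)
for key, value in pairs:
    grouped[key].append(value)

# Reduce each list of values to its arithmetic mean
averages = {key: mean(values) for key, values in grouped.items()}
print(averages)  # 'a' averages to 2 and 'b' to 2
```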

Modify all values in a dictionary

Code goes below:
d = {'a':0, 'b':0, 'c':0, 'd':0} #at the beginning, all the values are 0.
s = 'cbad' #a string
indices = map(s.index, d.keys()) #get every key's index in s, i.e., a-2, b-1, c-0, d-3
#then set the values to keys' index
d = dict(zip(d.keys(), indices)) #this is how I do it, any better way?
print d #{'a':2, 'c':0, 'b':1, 'd':3}
Any other way to do that?
PS. the code above is just a simple one to demonstrate my question.
Something like this might make your code more readable:
dict([(x,y) for y,x in enumerate('cbad')])
But you should give more details about what you really want to do. Your code will probably fail if the characters in s do not match the keys of d. So d is just a container for the keys and the values are not important. Why not start with a list in that case?
Use the update() method of dict:
d.update((k,s.index(k)) for k in d.iterkeys())
What about
d = {'a':0, 'b':0, 'c':0, 'd':0}
s = 'cbad'
for k in d.iterkeys():
    d[k] = s.index(k)
? It's not functional programming anymore, but it should be more performant and more pythonic, perhaps :-).
EDIT: A functional variant using Python dict comprehensions (needs Python 2.7+; on Python 3, iterate over d directly, since iterkeys() no longer exists there):
d.update({k : s.index(k) for k in d.iterkeys()})
or even
{k : s.index(k) for k in d.iterkeys()}
if a new dict is okay!
for k in d.iterkeys():
    d[k] = s.index(k)
Or, if you don't already know the letters in the string:
d = {}
for i in range(len(s)):
    d[s[i]] = i
Another one-liner:
dict([(k,s.index(k)) for (k,v) in d.items()])
You don't need to pass a list of tuples to dict. Instead, you can use a dictionary comprehension with enumerate:
s = 'cbad'
d = {v: k for k, v in enumerate(s)}
If you need to process the intermediary steps, including initial setting of values, you can use:
d = dict.fromkeys('abcd', 0)
s = 'cbad'
indices = {v: k for k, v in enumerate(s)}
d = {k: indices[k] for k in d} # dictionary comprehension
d = dict(zip(d, map(indices.get, d))) # dict + zip alternative
print(d)
# {'a': 2, 'b': 1, 'c': 0, 'd': 3}
You chose the right approach, but there is no need to create the dict and then modify it when you can build it in one step:
keys = ['a','b','c','d']
strK = 'bcad'
res = dict(zip(keys, (strK.index(i) for i in keys)))
Dict comprehension for python 2.7 and above
{key : indice for key, indice in zip(d.keys(), map(s.index, d.keys()))}
>>> d = {'a':0, 'b':0, 'c':0, 'd':0}
>>> s = 'cbad'
>>> for x in d:
...     d[x] = s.find(x)
...
>>> d
{'a': 2, 'c': 0, 'b': 1, 'd': 3}
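Most of the answers above call `s.index(k)` once per key, which rescans the string each time and raises ValueError when a key does not occur in `s` (the `s.find` variant returns -1 instead). A minimal Python 3 sketch that scans `s` once and leaves unmatched keys untouched might look like this. The extra key 'z' and the name `position` are added purely for illustration.

```python
d = {'a': 0, 'b': 0, 'c': 0, 'd': 0, 'z': 0}
s = 'cbad'

# One pass over s; setdefault keeps the first position of a repeated character
position = {}
for i, ch in enumerate(s):
    position.setdefault(ch, i)

# Keys missing from s keep their old value instead of raising ValueError
d = {k: position.get(k, old) for k, old in d.items()}
print(d)  # {'a': 2, 'b': 1, 'c': 0, 'd': 3, 'z': 0}
```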
