Suppose I have a dictionary:
d = {'a_c':1,'b_c':2,'a_d':3,'b_d':4}
How do I split it into two dictionaries based on the last part of each key ('c' or 'd'), like this?
d1 = {'a_c':1,'b_c':2}
d2 = {'a_d':3,'b_d':4}
This is one way:
from collections import defaultdict
d = {'a_c':1,'b_c':2,'a_d':3,'b_d':4}
key = lambda s: s.split('_')[1]
res = defaultdict(dict)
for k, v in d.items():
    res[key(k)][k] = v
print(list(res.values()))
Output:
[{'a_c': 1, 'b_c': 2}, {'a_d': 3, 'b_d': 4}]
The result is a list of dictionaries divided on the last letter of the key.
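The same grouping can be wrapped in a small function. The sketch below is only an illustration: the name split_by_suffix and the extra key 'x_y_c' are made up here, and it assumes the part to group on is whatever follows the last underscore, which is why it uses rsplit instead of split.

```python
from collections import defaultdict

def split_by_suffix(d, sep='_'):
    """Group the items of d by the text after the last sep in each key."""
    groups = defaultdict(dict)
    for k, v in d.items():
        # rsplit with maxsplit=1 keeps only the final piece, so 'x_y_c' -> 'c'
        suffix = k.rsplit(sep, 1)[-1]
        groups[suffix][k] = v
    return dict(groups)

d = {'a_c': 1, 'b_c': 2, 'a_d': 3, 'b_d': 4, 'x_y_c': 5}
parts = split_by_suffix(d)
print(parts['c'])  # {'a_c': 1, 'b_c': 2, 'x_y_c': 5}
print(parts['d'])  # {'a_d': 3, 'b_d': 4}
```

Returning a dict keyed by suffix makes it possible to pick a group by name rather than by position in a list.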
You could try something like this:
func = lambda ending_str: {x: d[x] for x in d.keys() if x.endswith(ending_str)}
d1 = func('_c')
d2 = func('_d')
Also, as Marc mentioned in the comments, you shouldn't have two keys with the same name in a dictionary. Only the last key/value pair is kept in that case.
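The point about duplicate keys is easy to check directly. A minimal sketch with made-up values:

```python
# A dict literal with a repeated key keeps only the last value for that key.
dup = {'a_c': 1, 'a_c': 99, 'b_c': 2}
print(dup)       # {'a_c': 99, 'b_c': 2}
print(len(dup))  # 2
```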
I'd like to join two dictionaries based on the value of d1 and a substring of the key of d2. The resulting dictionary should have the key of d1 with the corresponding value of d2.
d1 = {'web02': '23', 'web01': '50'}
d2 = {'server/dc-50': 's01.local', 'server/dc-23': 's02.local'}
This would result in {'web01': 's01.local', 'web02': 's02.local'}
I guess this is what you need :
result = {k1:v2 for k1,v1 in d1.items() for k2,v2 in d2.items() if v1 in k2}
Output :
{'web02': 's02.local', 'web01': 's01.local'}
This can be done without a nested loop by building each d2 key with string concatenation and looking it up directly:
data = {k: d2['server/dc-' + v] for k, v in d1.items()}
Prints:
{'web02': 's02.local', 'web01': 's01.local'}
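The two approaches above behave differently on less tidy data. The sketch below adds a hypothetical entry 'web03': '5' (not in the original question) to show that the substring test can produce a false match, while the direct lookup can be guarded against missing keys. The fixed prefix 'server/dc-' is an assumption taken from the sample data.

```python
d1 = {'web02': '23', 'web01': '50', 'web03': '5'}
d2 = {'server/dc-50': 's01.local', 'server/dc-23': 's02.local'}

# Substring test: '5' is contained in 'server/dc-50', so web03 gets a false match.
loose = {k1: v2 for k1, v1 in d1.items() for k2, v2 in d2.items() if v1 in k2}

# Exact lookup: build the full key and skip entries that d2 does not contain.
prefix = 'server/dc-'  # assumed fixed prefix, taken from the sample data
strict = {k: d2[prefix + v] for k, v in d1.items() if prefix + v in d2}

print(loose)   # {'web02': 's02.local', 'web01': 's01.local', 'web03': 's01.local'}
print(strict)  # {'web02': 's02.local', 'web01': 's01.local'}
```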
I'm new to Python and Pyspark and I'm practicing TF-IDF.
I split all words from sentences in the txt file, removed punctuations, removed the words that are in the stop-words list, and saved them as a dictionary with the codes below.
x = text_file.flatMap(lambda line: str_clean(line).split())
x = x.filter(lambda word: word not in stopwords)
x = x.map(lambda word: (word, 1))  # assumed step: reduceByKey needs (key, value) pairs
x = x.reduceByKey(lambda a, b: a + b)
x = x.collectAsMap()
I have 10 different txt files that go through this same process, and I'd like to add a string like "#d1" to the keys in the dictionary so that I can indicate that a key is from document 1.
How can I add "#d1" to all keys in the dictionary?
Essentially my dictionary is in the form:
{'word1': 1, 'word2': 1, 'word3': 2, ....}
And I would like it to be:
{'word1#d1': 1, 'word2#d1': 1, 'word3#d1': 2, ...}
Try a dictionary comprehension:
{k+'#d1': v for k, v in d.items()}
In Python 3.6+, you can use f-strings:
{f'{k}#d1': v for k, v in d.items()}
You can use the dict constructor to rebuild the dict, appending the file number to the end of each key:
>>> d = {'a': 1, 'b': 2}
>>> file_number = 1
>>> dict(("{}#{}".format(k,file_number),v) for k,v in d.items())
{'a#1': 1, 'b#1': 2}
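Since the question mentions several documents, the renaming can be put in a helper and applied once per document. This is a rough sketch: tag_keys and the per_doc list are hypothetical stand-ins for the real collectAsMap() results, and the '#d<n>' suffix follows the format asked for in the question.

```python
def tag_keys(counts, doc_id):
    """Return a new dict whose keys carry a '#d<doc_id>' suffix."""
    return {f'{word}#d{doc_id}': n for word, n in counts.items()}

# Hypothetical per-document word counts standing in for collectAsMap() results.
per_doc = [
    {'word1': 1, 'word2': 1},
    {'word1': 3, 'word3': 2},
]

merged = {}
for doc_id, counts in enumerate(per_doc, start=1):
    merged.update(tag_keys(counts, doc_id))

print(merged)
# {'word1#d1': 1, 'word2#d1': 1, 'word1#d2': 3, 'word3#d2': 2}
```

Because every key carries its document suffix, the same word from two documents no longer collides when the dicts are merged.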
I have a list of dict that looks like below
def prefix_key_dict(prefix, test_dict):
    res = {prefix + str(key).lower(): val for key, val in test_dict.items()}
    return res
temp_prefix = 'column_'
transformed_dict = [prefix_key_dict(temp_prefix, each) for each in table_col_list]
and the transformed json looks like below
dict:
d1 = {'b,a':12,'b,c,a':13}
Code:
import collections
x = collections.OrderedDict(sorted(d1.items()))
print(x)
Not getting the expected output.
Expected Output:
d1 = {'a,b': 12, 'a,b,c':13}
It looks like you don't actually want to sort the keys, you want to re-arrange the non-comma substrings of your keys such that these substrings are ordered.
>>> d1 = {'b,a':12,'b,c,a':13}
>>> {','.join(sorted(key.split(','))):val for key, val in d1.items()}
{'a,b': 12, 'a,b,c': 13}
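d1.items() returns the (key, value) pairs of the dict (a list in Python 2, a view in Python 3).
sorted(d1.items()) simply sorts those pairs; it never changes the keys themselves.
If you want to sort the items inside each key, then you need to run the sort on the parts of each key.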
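The difference between the two kinds of sorting shows up when both are run on the sample dict. A minimal sketch (the variable name normalized is made up here):

```python
d1 = {'b,a': 12, 'b,c,a': 13}

# Sorting the items only reorders the pairs; each key string is left as it was.
print(sorted(d1.items()))  # [('b,a', 12), ('b,c,a', 13)]

# Sorting inside each key: split on commas, sort the pieces, join them again.
normalized = {','.join(sorted(k.split(','))): v for k, v in d1.items()}
print(normalized)  # {'a,b': 12, 'a,b,c': 13}
```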
I have a list:
list = [(a,1),(b,2),(a,3)]
I want to convert it to a dict where, when there is a duplicate key (e.g. (a,1) and (a,3)), the values are averaged, so the dict has just one key:value pair for that key, which in this case would be a:2.
from collections import defaultdict
l = [('a',1),('b',2),('a',3)]
d = defaultdict(list)
for pair in l:
    d[pair[0]].append(pair[1]) #add each number to the list under the correct key
for (k,v) in d.items():
    d[k] = sum(d[k])/len(d[k]) #re-assign the value associated with key k as the sum of the elements in the list divided by its length
So
print(d)
>>> defaultdict(<type 'list'>, {'a': 2, 'b': 2})
Or even nicer and producing a plain dictionary in the end:
from collections import defaultdict
l = [('a',1),('b',2),('a',3)]
temp_d = defaultdict(list)
for pair in l:
    temp_d[pair[0]].append(pair[1])
#CHANGES HERE
final = dict((k,sum(v)/len(v)) for k,v in temp_d.items())
print(final)
>>>
{'a': 2, 'b': 2}
Note that if you are using Python 2.x (as you are), you will need to adjust the division as follows to force float division:
(k,sum(v)/float(len(v)))
OR
sum(d[k])/float(len(d[k]))
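In Python 3 the / operator already performs true division, so the float() call is not needed there. The sketch below is one possible Python 3 variant; it keeps a running total and count per key instead of storing lists, which is a design choice made here and not part of the answer above. The names pairs, totals, counts and averages are made up for the example.

```python
from collections import defaultdict

pairs = [('a', 1), ('b', 2), ('a', 3)]

totals = defaultdict(float)  # running sum per key
counts = defaultdict(int)    # number of values seen per key
for key, value in pairs:
    totals[key] += value
    counts[key] += 1

averages = {key: totals[key] / counts[key] for key in totals}
print(averages)  # {'a': 2.0, 'b': 2.0}
```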
Code goes below:
d = {'a':0, 'b':0, 'c':0, 'd':0} #at the beginning, all the values are 0.
s = 'cbad' #a string
indices = map(s.index, d.keys()) #get every key's index in s, i.e., a-2, b-1, c-0, d-3
#then set the values to keys' index
d = dict(zip(d.keys(), indices)) #this is how I do it, any better way?
print d #{'a':2, 'c':0, 'b':1, 'd':3}
Any other way to do that?
PS. the code above is just a simple one to demonstrate my question.
Something like this might make your code more readable:
dict([(x,y) for y,x in enumerate('cbad')])
But you should give more details about what you really want to do. Your code will probably fail if the characters in s do not match the keys of d. So d is just a container for the keys, and the values are not important. Why not start with a list in that case?
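The failure mentioned above can be reproduced with a string that lacks one of the keys. A small sketch, where the shortened string 'cba' is made up for the demonstration; it also shows str.find as a non-raising alternative:

```python
d = {'a': 0, 'b': 0, 'c': 0, 'd': 0}
s = 'cba'  # 'd' is deliberately missing from the string

try:
    by_index = {k: s.index(k) for k in d}
except ValueError as exc:
    by_index = None
    print('index failed:', exc)  # str.index raises when the substring is absent

# str.find returns -1 instead of raising, so missing keys are visible in the result.
by_find = {k: s.find(k) for k in d}
print(by_find)  # {'a': 2, 'b': 1, 'c': 0, 'd': -1}
```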
Use the update() method of dict:
d.update((k,s.index(k)) for k in d.iterkeys())
What about
d = {'a':0, 'b':0, 'c':0, 'd':0}
s = 'cbad'
for k in d.iterkeys():
    d[k] = s.index(k)
? It's no functional programming anymore but should be more performant and more pythonic, perhaps :-).
EDIT: A functional variant using dict comprehensions (needs Python 2.7+ or 3+; iterating over d directly works in both, whereas iterkeys() exists only in Python 2):
d.update({k : s.index(k) for k in d})
or even
{k : s.index(k) for k in d}
if a new dict is okay!
for k in d.iterkeys():
    d[k] = s.index(k)
Or, if you don't already know the letters in the string:
d = {}
for i in range(len(s)):
    d[s[i]]=i
another one liner:
dict([(k,s.index(k)) for (k,v) in d.items()])
You don't need to pass a list of tuples to dict. Instead, you can use a dictionary comprehension with enumerate:
s = 'cbad'
d = {v: k for k, v in enumerate(s)}
If you need to process the intermediary steps, including initial setting of values, you can use:
d = dict.fromkeys('abcd', 0)
s = 'cbad'
indices = {v: k for k, v in enumerate(s)}
d = {k: indices[k] for k in d} # dictionary comprehension
d = dict(zip(d, map(indices.get, d))) # dict + zip alternative
print(d)
# {'a': 2, 'b': 1, 'c': 0, 'd': 3}
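One difference between the enumerate-based mapping and s.index is worth knowing when the string contains repeated characters. The sketch below uses a made-up string 'abca' to show it; the names last_seen and first_seen are illustrative only.

```python
s = 'abca'  # 'a' appears at positions 0 and 3

last_seen = {ch: i for i, ch in enumerate(s)}  # later positions overwrite earlier ones
first_seen = {ch: s.index(ch) for ch in s}     # str.index always finds the first match

print(last_seen)   # {'a': 3, 'b': 1, 'c': 2}
print(first_seen)  # {'a': 0, 'b': 1, 'c': 2}
```

For strings without repeats, such as 'cbad', both give the same result; the enumerate version walks the string once, while s.index scans it again for every key.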
Your approach is fine, but there is no need to create the dict first and then modify it when you can build it in a single step:
keys = ['a','b','c','d']
strK = 'bcad'
res = dict(zip(keys, (strK.index(i) for i in keys)))
Dict comprehension for python 2.7 and above
{key : indice for key, indice in zip(d.keys(), map(s.index, d.keys()))}
>>> d = {'a':0, 'b':0, 'c':0, 'd':0}
>>> s = 'cbad'
>>> for x in d:
...     d[x]=s.find(x)
...
>>> d
{'a': 2, 'c': 0, 'b': 1, 'd': 3}