How to split a dictionary into new dictionaries using common strings in its keys - python

Refer to my previous question: How to extract the common words before particular symbol and find particular word
mydict = {"g18_84pp_2A_MVP1_GoodiesT0-HKJ-DFG_MIX-CMVP1_Y1000-MIX.txt" : 0,
"g18_84pp_2A_MVP2_GoodiesT0-HKJ-DFG_MIX-CMVP2_Y1000-MIX.txt" : 1,
"g18_84pp_2A_MVP3_GoodiesT0-HKJ-DFG_MIX-CMVP3_Y1000-MIX.txt" : 2,
"g18_84pp_2A_MVP4_GoodiesT0-HKJ-DFG_MIX-CMVP4_Y1000-MIX.txt" : 3,
"g18_84pp_2A_MVP5_GoodiesT0-HKJ-DFG_MIX-CMVP5_Y1000-MIX.txt" : 4,
"g18_84pp_2A_MVP6_GoodiesT0-HKJ-DFG_MIX-CMVP6_Y1000-MIX.txt" : 5,
"h18_84pp_3A_MVP1_GoodiesT1-HKJ-DFG-CMVP1_Y1000-FIX.txt" : 6,
"g18_84pp_2A_MVP7_GoodiesT0-HKJ-DFG_MIX-CMVP7_Y1000-MIX.txt" : 7,
"h18_84pp_3A_MVP2_GoodiesT1-HKJ-DFG-CMVP2_Y1000-FIX.txt" : 8,
"h18_84pp_3A_MVP3_GoodiesT1-HKJ-DFG-CMVP3_Y1000-FIX.txt" : 9,
"p18_84pp_2B_MVP1_GoodiesT2-HKJ-DFG-CMVP3_Y1000-FIX.txt" : 10}
I already have my OutputNameDict:
OutputNameDict = {'h18_84pp_3A_MVP_FIX': 1, 'p18_84pp_2B_MVP_FIX': 2, 'g18_84pp_2A_MVP_MIX': 0}
Now I want to split mydict into three new dictionaries, using my common strings in CaseNameString (see the previous question) and the values from OutputNameDict.
The ideal result would look like this:
Group1. mydict0, using value 0 in OutputNameDict and the string g18_84pp_2A_MVP_GoodiesT0 in CaseNameString.
mydict0 = {"g18_84pp_2A_MVP1_GoodiesT0-HKJ-DFG_MIX-CMVP1_Y1000-MIX.txt" : 0,
"g18_84pp_2A_MVP2_GoodiesT0-HKJ-DFG_MIX-CMVP2_Y1000-MIX.txt" : 1,
"g18_84pp_2A_MVP3_GoodiesT0-HKJ-DFG_MIX-CMVP3_Y1000-MIX.txt" : 2,
"g18_84pp_2A_MVP4_GoodiesT0-HKJ-DFG_MIX-CMVP4_Y1000-MIX.txt" : 3,
"g18_84pp_2A_MVP5_GoodiesT0-HKJ-DFG_MIX-CMVP5_Y1000-MIX.txt" : 4,
"g18_84pp_2A_MVP6_GoodiesT0-HKJ-DFG_MIX-CMVP6_Y1000-MIX.txt" : 5,
"g18_84pp_2A_MVP7_GoodiesT0-HKJ-DFG_MIX-CMVP7_Y1000-MIX.txt" : 6}
Group2. mydict1, using value 1 in OutputNameDict and the string h18_84pp_3A_MVP_GoodiesT1 in CaseNameString.
mydict1 ={"h18_84pp_3A_MVP1_GoodiesT1-HKJ-DFG-CMVP1_Y1000-FIX.txt" : 0,
"h18_84pp_3A_MVP2_GoodiesT1-HKJ-DFG-CMVP2_Y1000-FIX.txt" : 1,
"h18_84pp_3A_MVP3_GoodiesT1-HKJ-DFG-CMVP3_Y1000-FIX.txt" : 2}
Group3. mydict2, using value 2 in OutputNameDict and the string p18_84pp_2B_MVP_GoodiesT2 in CaseNameString.
mydict2 ={"p18_84pp_2B_MVP1_GoodiesT2-HKJ-DFG-CMVP3_Y1000-FIX.txt" : 0}
Any suggestions? Is there a function I can call?

I'd change your OutputNameDict keys to be regular expression patterns, as follows:
OutputNameDict = {'h18_84pp_3A_MVP.*FIX': 1, 'p18_84pp_2B_MVP.*FIX': 2, 'g18_84pp_2A_MVP.*MIX': 0}
Then, using the re regular expression module, match each pattern against the keys in mydict, and place each matching item under the appropriate key of an output_dicts dictionary, as follows:
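import collections
import re

# Maps a group name such as 'mydict0' to the items of mydict that belong to it.
output_dicts = collections.defaultdict(dict)
for k, v in mydict.items():
    for pattern, suffix in OutputNameDict.items():
        # re.match only anchors at the start of the key.
        if re.match(pattern, k):
            output_dicts['mydict' + str(suffix)][k] = v
            break
    else:
        # The inner loop ended without a break, so no pattern matched this key.
        output_dicts['not matched'][k] = v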
This results in the output_dicts dictionary populated as follows
for k, v in output_dicts.items():
    print(k)
    print(v)
    print()
Which outputs (the order of the groups, and of the keys within them, may vary with the Python version):
mydict1
{'h18_84pp_3A_MVP2_GoodiesT1-HKJ-DFG-CMVP2_Y1000-FIX.txt': 8,
'h18_84pp_3A_MVP3_GoodiesT1-HKJ-DFG-CMVP3_Y1000-FIX.txt': 9,
'h18_84pp_3A_MVP1_GoodiesT1-HKJ-DFG-CMVP1_Y1000-FIX.txt': 6}
mydict0
{'g18_84pp_2A_MVP1_GoodiesT0-HKJ-DFG_MIX-CMVP1_Y1000-MIX.txt': 0,
'g18_84pp_2A_MVP2_GoodiesT0-HKJ-DFG_MIX-CMVP2_Y1000-MIX.txt': 1,
'g18_84pp_2A_MVP4_GoodiesT0-HKJ-DFG_MIX-CMVP4_Y1000-MIX.txt': 3,
'g18_84pp_2A_MVP5_GoodiesT0-HKJ-DFG_MIX-CMVP5_Y1000-MIX.txt': 4,
'g18_84pp_2A_MVP3_GoodiesT0-HKJ-DFG_MIX-CMVP3_Y1000-MIX.txt': 2,
'g18_84pp_2A_MVP6_GoodiesT0-HKJ-DFG_MIX-CMVP6_Y1000-MIX.txt': 5,
'g18_84pp_2A_MVP7_GoodiesT0-HKJ-DFG_MIX-CMVP7_Y1000-MIX.txt': 7}
mydict2
{'p18_84pp_2B_MVP1_GoodiesT2-HKJ-DFG-CMVP3_Y1000-FIX.txt': 10}
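The question asks for values renumbered from 0 within each group, whereas the output above keeps the original values. Below is a minimal sketch of one way to renumber, assuming the original values define the order inside a group. The helper name `renumber` is hypothetical, and the sample data is the 'mydict1' group from the output above.

```python
def renumber(group):
    """Return a copy of group whose values are 0, 1, 2, ... in order of the old values."""
    ordered_keys = sorted(group, key=group.get)
    return {key: index for index, key in enumerate(ordered_keys)}


# The 'mydict1' group as produced above (original values 6, 8 and 9).
mydict1 = {
    'h18_84pp_3A_MVP2_GoodiesT1-HKJ-DFG-CMVP2_Y1000-FIX.txt': 8,
    'h18_84pp_3A_MVP3_GoodiesT1-HKJ-DFG-CMVP3_Y1000-FIX.txt': 9,
    'h18_84pp_3A_MVP1_GoodiesT1-HKJ-DFG-CMVP1_Y1000-FIX.txt': 6,
}

renumbered = renumber(mydict1)
print(renumbered)
```

Applying the same helper to every value of output_dicts would give all groups in the renumbered form.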

Related

Store the value of a key-value by its order?

Reading about data structures and have a question.
This is a dictionary:
example = {'the': 8,
'metal': 8,
'is': 23,
'worth': 3,
'many': 3,
'dollars': 2,
'right': 2}
How can I store the value of a key/value pair in a variable, selecting the pair by its position?
For example, how do I store the value of the third pair, which is 23?
I tried this, which is not correct:
for k in example:
    if k == 3:
        a_var = example(k)
If you know the key/value pairs were inserted in the order you want (dictionaries preserve insertion order as of Python 3.7), you can use islice() to get the third key/value pair. This avoids building a whole list of values and is a bit simpler than writing the loop explicitly:
from itertools import islice

example = {
    'the': 8,
    'metal': 8,
    'is': 23,
    'worth': 3,
    'many': 3,
    'dollars': 2,
    'right': 2
}
# islice skips the first two pairs; next() takes the third.
key, val = next(islice(example.items(), 2, None))
# 'is', 23
If you only want the value instead of the key/value pair, then of course you can pass values() instead of items() to islice():
val = next(islice(example.values(), 2, None))
# 23
This does what you need:
example = {'the': 8,
'metal': 8,
'is': 23,
'worth': 3,
'many': 3,
'dollars': 2,
'right': 2}
k = 0
for key in example:
    k += 1
    if k == 3:
        a_var = example[key]
print(a_var)
Output:
23
If you really think you need this:
def find(example, ordinal):
    # ordinal is zero-based: find(example, 2) returns the third value.
    for k, value in enumerate(example.values()):
        if k == ordinal:
            return value
or
def find(example, ordinal):
    return list(example.values())[ordinal]
a_var = example[list(example.keys())[3 - 1]]
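None of the variants above say what happens when the requested position does not exist. Here is a small sketch that combines the islice() approach with a default value. The helper name `nth_value` and its `default` parameter are assumptions made for illustration, not part of any answer above.

```python
from itertools import islice


def nth_value(mapping, n, default=None):
    """Return the value of the n-th inserted pair (zero-based), or default if out of range."""
    # next() falls back to default when islice yields nothing.
    return next(islice(mapping.values(), n, None), default)


example = {'the': 8, 'metal': 8, 'is': 23, 'worth': 3,
           'many': 3, 'dollars': 2, 'right': 2}

third = nth_value(example, 2)       # 23
missing = nth_value(example, 10)    # None, the dictionary has only 7 pairs
print(third, missing)
```

Note that islice() rejects negative positions, so counting from the end would still need the list-based version.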

Adding the values of same keys in one dictionary

I have a dictionary:
Dict1 = {"AAT": 2, "CCG": 1, "ATA": 5, "GCG": 7, "CGC": 2, "TAG": 1, "GAT": 0, "AAT": 3, "CCG": 2, "ATG": 5, "GCG": 3, "CGC": 7, "TAG": 0, "GAT": 0}
I have to sum the values of all identical triplet codes into a new dictionary.
The output should look like this:
Dict2 = {"AAT": 5, "CCG": 3, "ATA": 5, "GCG": 10, "CGC": 9, "TAG": 1, "GAT": 0}
How do I proceed with the code?
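Dict1 cannot hold this data as written, because dictionary keys have to be unique: when a key is repeated in a dictionary literal, the later value silently replaces the earlier one. In general, if you have some (non-unique) strings with values assigned to them, you can start from an empty Dict2 and, for each key and val, write
if key in Dict2:
    Dict2[key] += val
else:
    Dict2[key] = val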
You are trying to sum the values of identical keys, which is not possible because Python does not allow duplicate keys in a dictionary. You can check this for reference:
https://www.w3schools.com/python/python_dictionaries.asp
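As a sketch of how the summing could work in practice, the snippet below assumes the codes are available as a list of (key, value) pairs, since a dictionary cannot hold the repeats. The name `pairs` is an assumption; the values are copied from Dict1 in the question.

```python
from collections import defaultdict

# Assumption: the data arrives as (key, value) pairs rather than as a dict.
pairs = [
    ("AAT", 2), ("CCG", 1), ("ATA", 5), ("GCG", 7), ("CGC", 2), ("TAG", 1), ("GAT", 0),
    ("AAT", 3), ("CCG", 2), ("ATG", 5), ("GCG", 3), ("CGC", 7), ("TAG", 0), ("GAT", 0),
]

# defaultdict(int) starts every missing key at 0, so no "if key in ..." test is needed.
totals = defaultdict(int)
for key, val in pairs:
    totals[key] += val

Dict2 = dict(totals)
print(Dict2)
```

With the data exactly as written in the question, "ATG" occurs once and is distinct from "ATA", so the result also contains "ATG": 5 alongside the sums listed in the expected output.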

Updating dictionary values using keys and values from lists

I want to replace the values of particular keys in a dictionary, taking the keys from a specific list and the new values from a matching list of results.
Code
dicty = {"NDS" : 1, "TCT": 2, "ET" : 3, "ACC" : 4,"Ydist" : 5, "Diam" : 6}
tem = ["NDS", "TCT"]
circ = ["ET", "ACC"]
jit = ["Ydist", "Diam"]
def cal_loop(cal_vers):
    if cal_vers == temp_calibration:
        print("DO TEMP CALIBRATION")
        tem_results = [19,30]
        dict_keys = tem
        dicty[[dict_keys][0]] = tem_results[0]
        print(dicty["NDS"])
temp_calibration = 6
cal_loop(temp_calibration)
print(dicty)
Traceback
Desired output
{'NDS': 19, 'TCT': 2, 'ET': 3, 'ACC': 4, 'Ydist': 5, 'Diam': 6}
#I also want to know how to do both keys in the list given e.g.
{'NDS': 19, 'TCT': 30, 'ET': 3, 'ACC': 4, 'Ydist': 5, 'Diam': 6}
tem = ["NDS", "TCT"]
tem_results = [19,30]
for k, v in zip(tem, tem_results):
    dicty[k] = v
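The issue is with dicty[[dict_keys][0]] = tem_results[0]: the expression [dict_keys][0] evaluates to the list dict_keys itself, and a list is unhashable, so it cannot be used as a dictionary key. To update a single key, index the list directly with dict_keys[0]. To update every key, loop through the two lists and assign each value, or build a new dictionary and update the existing one with it: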
dicty.update({k: v for k, v in zip(tem, tem_results)})
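To reuse the same update for the other key lists (circ, jit), the logic can be wrapped in a small function. This is only a sketch: the helper name `apply_results` and the length check are assumptions added for illustration.

```python
dicty = {"NDS": 1, "TCT": 2, "ET": 3, "ACC": 4, "Ydist": 5, "Diam": 6}
tem = ["NDS", "TCT"]


def apply_results(target, keys, results):
    """Write results[i] into target[keys[i]] for every position i (hypothetical helper)."""
    if len(keys) != len(results):
        # zip() would silently drop the extras, so fail loudly instead.
        raise ValueError("keys and results must have the same length")
    # dict.update accepts an iterable of (key, value) pairs.
    target.update(zip(keys, results))
    return target


apply_results(dicty, tem, [19, 30])
print(dicty)
# {'NDS': 19, 'TCT': 30, 'ET': 3, 'ACC': 4, 'Ydist': 5, 'Diam': 6}
```

Passing only the first key and result, for example apply_results(dicty, tem[:1], [19]), gives the first desired output from the question.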

remove () in keys created by FreqDist and ngrams

I used this code to generate frequencies for n-grams:
all_counts = FreqDist(ngrams(token, 1))
However, the keys are wrapped in () because each one is a 1-tuple.
FreqDist({('a',): 6,
('b',): 1,
('c',): 1})
I would like it to be as below:
FreqDist({'a': 6,
'b': 1,
'c': 1})
Thank you.
all_counts = {key[0] : val for key, val in all_counts.items()}
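The comprehension above produces a plain dict, so counter methods such as most_common() are no longer available on the result. The sketch below shows the same unwrapping while keeping a counter object. It uses collections.Counter as a stand-in so that it runs without NLTK; the token list is a made-up sample, and the assumption that FreqDist behaves like a Counter here should be checked against your NLTK version.

```python
from collections import Counter

# Hypothetical sample tokens, chosen to reproduce the counts in the question.
tokens = ['a', 'b', 'a', 'c', 'a', 'a', 'a', 'a']

# Stand-in for FreqDist(ngrams(tokens, 1)): every unigram is a 1-tuple key.
all_counts = Counter((tok,) for tok in tokens)

# Unwrap each 1-tuple and wrap the result in a Counter again.
flat_counts = Counter({key[0]: val for key, val in all_counts.items()})
print(flat_counts.most_common(1))
```

For unigrams only, counting the tokens directly avoids creating the tuples in the first place.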

How to remove part of string from list

I have a file with data and I want to count how many times each MAC address occurs:
file.txt:
D8:6C:E9:3C:77:FF;2016/01/10 14:02:47
D8:6C:E9:3C:77:FF;2016/01/10 14:02:47
D8:6C:E9:43:52:BF;2016/01/10 13:41:29
F0:82:61:31:6B:88;2016/01/10 13:43:41
8C:10:D4:D4:83:E5;2016/01/10 13:44:35
54:64:D9:E8:64:36;2016/01/10 13:46:13
18:1E:78:5A:CD:25;2016/01/10 13:46:27
18:1E:78:5A:D7:A5;2016/01/10 13:46:35
54:64:D9:75:1B:4B;2016/01/10 13:30:28
54:64:D9:75:1B:4B;2016/01/10 13:30:28
etc....
I read it into a list:
with open('file.txt') as f:
    mac = f.read().splitlines()
my_dic = {i: mac.count(i) for i in mac}
print(my_dic)
Output:
{'18:1E:78:5A:D7:A5;2016/01/10 13:46:35': 1, 'D8:6C:E9:3C:77:FF;2016/01/10 14:02:47': 2, '54:64:D9:E8:64:36;2016/01/10 13:46:13': 1, 'D8:6C:E9:43:52:BF;2016/01/10 13:41:29': 1, 'F0:82:61:31:6B:88;2016/01/10 13:43:41': 1, '54:64:D9:75:1B:4B;2016/01/10 13:30:28': 2, '18:1E:78:5A:CD:25;2016/01/10 13:46:27': 1, '8C:10:D4:D4:83:E5;2016/01/10 13:44:35': 1}
How do I get rid of the dates? I expected:
{'18:1E:78:5A:D7:A5': 1, 'D8:6C:E9:3C:77:FF': 2, '54:64:D9:E8:64:36': 1, 'D8:6C:E9:43:52:BF': 1, 'F0:82:61:31:6B:88': 1, '54:64:D9:75:1B:4B': 2, '18:1E:78:5A:CD:25': 1, '8C:10:D4:D4:83:E5': 1}
Write a regexp that matches this date format, and use re.sub() to remove the matching part.
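A minimal sketch of that suggestion follows. The pattern assumes every line ends with ';' followed by a 'YYYY/MM/DD HH:MM:SS' timestamp, as in the sample; the names `DATE_SUFFIX` and `lines` are illustrative, and the lines are hard-coded here instead of being read from file.txt.

```python
import re
from collections import Counter

# Sample lines copied from file.txt in the question.
lines = [
    "D8:6C:E9:3C:77:FF;2016/01/10 14:02:47",
    "D8:6C:E9:3C:77:FF;2016/01/10 14:02:47",
    "D8:6C:E9:43:52:BF;2016/01/10 13:41:29",
    "F0:82:61:31:6B:88;2016/01/10 13:43:41",
    "8C:10:D4:D4:83:E5;2016/01/10 13:44:35",
    "54:64:D9:E8:64:36;2016/01/10 13:46:13",
    "18:1E:78:5A:CD:25;2016/01/10 13:46:27",
    "18:1E:78:5A:D7:A5;2016/01/10 13:46:35",
    "54:64:D9:75:1B:4B;2016/01/10 13:30:28",
    "54:64:D9:75:1B:4B;2016/01/10 13:30:28",
]

# Assumed format: ';' then date and time at the end of the line.
DATE_SUFFIX = re.compile(r";\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}$")

# Strip the timestamp, then count each remaining MAC address once.
macs = [DATE_SUFFIX.sub("", line) for line in lines]
my_dic = dict(Counter(macs))
print(my_dic)
```

Because the fields are separated by ';', line.split(';', 1)[0] would give the same result without a regexp. Counter also avoids calling mac.count() once per line, which rescans the whole list each time.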
