I'm writing in Python 3.3.
I have a set of nested dictionaries (shown below) and am trying to search using a key at the lowest level and return each of the values that correspond to the second level.
Patients = {}
Patients['PatA'] = {'c101':'AT', 'c367':'CA', 'c542':'GA'}
Patients['PatB'] = {'c101':'AC', 'c367':'CA', 'c573':'GA'}
Patients['PatC'] = {'c101':'AT', 'c367':'CA', 'c581':'GA'}
I'm trying to use a set of 'for loops' to pull out the value attached to the c101 key in each Pat* dictionary nested under the main Patients dictionary.
This is what I have so far:
pat = 'PatA'
mutations = Patients[pat]
for Pat in Patients.keys(): #iterate over the Pat* dictionaries
    for mut in Pat.keys(): #iterate over the keys in the Pat* dictionaries
        if mut == 'c101': #when the key in a Pat* dictionary matches 'c101'
            print(Pat[mut].values()) #print the value attached to the 'c101' key
I get the following error, suggesting that my for loop returns each value as a string and that this can't then be used as a dictionary key to pull out the value.
Traceback (most recent call last):
  File "filename", line 13, in <module>
    for mut in Pat.keys():
AttributeError: 'str' object has no attribute 'keys'
I think I'm missing something obvious to do with the dictionary class, but I can't quite tell what it is. I've had a look through this question, but I don't think it's quite what I'm asking.
Any advice would be greatly appreciated.
Patients.keys() gives you the keys of the Patients dictionary ('PatA', 'PatB', 'PatC'), not its values, so each Pat is a string rather than a dictionary, hence the error. You can use dict.items to iterate over key: value pairs like this:
for patient, mutations in Patients.items():
    if 'c101' in mutations:
        print(mutations['c101'])
To make your code work:
# Replace keys by values
for Pat in Patients.values():
    # Iterate over keys from Pat dictionary
    for mut in Pat.keys():
        if mut == 'c101':
            # Take value of Pat dictionary using
            # 'c101' as a key
            print(Pat['c101'])
If you want you can create list of mutations in simple one-liner:
[mutations['c101'] for p, mutations in Patients.items() if mutations.get('c101')]
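If you also want to know which patient each value came from, the same idea can be written as a dict comprehension. This is only a sketch: the helper name values_for_key is made up for illustration, and the data is copied from the question.

```python
# Sample data copied from the question.
patients = {
    'PatA': {'c101': 'AT', 'c367': 'CA', 'c542': 'GA'},
    'PatB': {'c101': 'AC', 'c367': 'CA', 'c573': 'GA'},
    'PatC': {'c101': 'AT', 'c367': 'CA', 'c581': 'GA'},
}

def values_for_key(nested, inner_key):
    """Map each outer key to the value stored under inner_key.

    Inner dictionaries that do not contain inner_key are skipped.
    (Hypothetical helper, not part of any library.)
    """
    return {outer: inner[inner_key]
            for outer, inner in nested.items()
            if inner_key in inner}

c101_by_patient = values_for_key(patients, 'c101')
print(c101_by_patient)  # {'PatA': 'AT', 'PatB': 'AC', 'PatC': 'AT'}
```

Testing with `inner_key in inner` rather than `inner.get(inner_key)` also keeps entries whose value happens to be empty or otherwise falsy.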
Patients = {}
Patients['PatA'] = {'c101':'AT', 'c367':'CA', 'c542':'GA'}
Patients['PatB'] = {'c101':'AC', 'c367':'CA', 'c573':'GA'}
Patients['PatC'] = {'c101':'AT', 'c367':'CA', 'c581':'GA'}
for keys, values in Patients.items():
    # print(keys, values)
    for keys1, values1 in values.items():
        if keys1 == 'c101':
            print(keys1, values1)
            # print(values1)
I'm using an API call in Python 3.7 which returns JSON data.
result = (someapicall)
The data returned appears to be in the form of two dictionaries nested within a list, i.e.
[{name:foo, firmware:boo}, {name:foo, firmware:bar}]
I would like to retrieve the value of the key "name" from the first dictionary and the value of the key "firmware" from both dictionaries, and store them in a new dictionary in the following format.
{foo:(boo,bar)}
So far I've managed to retrieve the value of the first "name" and the first "firmware" and store them in a dictionary using the following.
dict1={}
for i in result:
    dict1[(i["networkId"])] = (i['firmware'])
I've also tried:
d7[(a["networkId"])] = (a['firmware'],(a['firmware']))
but, as expected, this just returns the same firmware twice.
Can anyone help me achieve the desired result above?
You can use defaultdict to accumulate values in a list, like this:
from collections import defaultdict
result = [{'name':'foo', 'firmware':'boo'},{'name':'foo', 'firmware':'bar'}]
# create a dict with a default of empty list for non existing keys
dict1=defaultdict(list)
# iterate and add firmwares of same name to list
for i in result:
    dict1[i['name']].append(i['firmware'])
# reformat to regular dict with tuples
final = {k:tuple(v) for k,v in dict1.items()}
print(final)
Output:
{'foo': ('boo', 'bar')}
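The same pattern can be wrapped in a small function so that it works for any pair of fields. This is a sketch under assumptions: the function name group_field and the third record ('baz'/'qux') are invented for illustration and do not come from the question.

```python
from collections import defaultdict

def group_field(records, key_field, value_field):
    """Group value_field by key_field across a list of dictionaries.

    Returns a plain dict mapping each key to a tuple of its values,
    in the order the records were seen. (Hypothetical helper.)
    """
    grouped = defaultdict(list)
    for record in records:
        grouped[record[key_field]].append(record[value_field])
    return {key: tuple(values) for key, values in grouped.items()}

result = [
    {'name': 'foo', 'firmware': 'boo'},
    {'name': 'foo', 'firmware': 'bar'},
    {'name': 'baz', 'firmware': 'qux'},  # extra made-up record
]
firmware_by_name = group_field(result, 'name', 'firmware')
print(firmware_by_name)  # {'foo': ('boo', 'bar'), 'baz': ('qux',)}
```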
My question is about finding the highest value in a dictionary using the max function.
I have created a dictionary that looks like this:
cc_GDP = {'af': 1243738953, 'as': 343435646, etc}
I would like to be able to simply find and print the highest GDP value in the dictionary.
My best attempt, having read through similar questions, is as follows. (I'm currently working through the Python Crash Course book, from which the base of this code is taken. Note that the get_country_code function simply provides two-letter abbreviations for the countries in the gdp_data.json file.)
import json
#Load the data into a list
filename = 'gdp_data.json'
with open(filename) as f:
    gdp_data = json.load(f)
cc_GDP = {}
for gdp_dict in gdp_data:
    if gdp_dict['Year'] == 2016:
        country_name = gdp_dict['Country Name']
        GDP_total = int(gdp_dict['Value'])
        code = get_country_code(country_name)
        if code:
            cc_GDP[code] = int(GDP_total)
print(max(cc_GDP, key=lambda key: cc_GDP[key][1]))
This raises the following error: TypeError: 'int' object is not subscriptable
Note that if I leave out the [1] in the print call, this does return the key with the highest value, but not the highest value itself, which is what I wish to achieve.
Any help would be appreciated.
So you currently extract the key of the country that has the highest value with this line:
country_w_highest_val = max(cc_GDP, key=lambda key: cc_GDP[key])
You can of course just look that up in the dictionary again:
highest_val = cc_GDP[country_w_highest_val]
But it is simpler to disregard the keys completely and just find the highest of all the values in the dictionary:
highest_val = max(cc_GDP.values())
How about something like this:
print(max(cc_GDP.values()))
That will give you the highest value but not the key.
The error is caused by the [1]: cc_GDP[key] is already the GDP value (an int), so it cannot be indexed. Remove the [1] and then use the following line:
print(cc_GDP[max(cc_GDP, key=lambda key: cc_GDP[key])])
Your code currently just returns the dictionary key. You need to plug this key back into the dictionary to get the GDP.
You could use the .items() method of dict to get key-value pairs (tuples) and process them in the following way:
cc_GDP = {'af': 1243738953, 'as': 343435646}
m = max(cc_GDP.items(), key=lambda x: x[1])
print(m) #prints ('af', 1243738953)
The output m in this case is a 2-tuple; you can access the key 'af' via m[0] and the value 1243738953 via m[1].
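To compare the approaches from the answers side by side, here is a short sketch using the two sample entries from the question. The variable names are chosen for illustration only.

```python
# Two sample entries taken from the question.
cc_GDP = {'af': 1243738953, 'as': 343435646}

# 1. Highest value only: ignore the keys entirely.
highest_value = max(cc_GDP.values())

# 2. Key of the highest value: compare keys by their values.
#    cc_GDP.get does the same job as lambda key: cc_GDP[key].
top_code = max(cc_GDP, key=cc_GDP.get)

# 3. Both at once: take the (key, value) pair with the largest value.
top_pair_code, top_pair_value = max(cc_GDP.items(), key=lambda item: item[1])

print(highest_value)                   # 1243738953
print(top_code)                        # af
print(top_pair_code, top_pair_value)   # af 1243738953
```

If only the number is needed, the first form is the shortest; the third avoids a second dictionary lookup when both the code and the value are wanted.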
I'm working on an exercise that requires me to build two dictionaries, one whose keys are country names, and the values are the GDP. This part works fine.
The second dictionary is where I'm lost, as the keys are supposed to be the letters A-Z and the values are sets of country names. I tried using a for loop, and I've marked below where the issue lies.
If the user enters a string with only one letter (like A), the program should print all the countries that begin with that letter. When you run the program, however, it only prints out one country for each letter.
The text file contains 228 lines, e.g.:
1:Qatar:98900
2:Liechtenstein:89400
3:Luxembourg:80600
4:Bermuda:69900
5:Singapore:59700
6:Jersey:57000
etc.
And here's my code.
initials = []
countries=[]
incomes=[]
dictionary={}
dictionary_2={}
keywordFile = open("raw.txt", "r")
for line in keywordFile:
    line = line.upper()
    line = line.strip("\n")
    line = line.split(":")
    initials.append(line[1][0]) # first letter of second element
    countries.append(line[1])
    incomes.append(line[2])
for i in range(0,len(countries)):
    dictionary[countries[i]] = incomes[i]
This for loop should spit out 248 values (one for each country), where the key is the initial and the value is the country name. However, it only spits out 26 values (one country for each letter in the alphabet):
for i in range(0,len(countries)):
    dictionary_2[initials[i]] = countries[i]
print(dictionary_2)
while True:
    inputS = str(input('Enter an initial or a country name.'))
    if inputS in dictionary:
        value = dictionary.get(inputS, "")
        print("The per capita income of {} is {}.".format((inputS.title()), value ))
    elif inputS in dictionary_2:
        value = dictionary_2.get(inputS)
        print("The countries that begin with the letter {} are: {}.".format(inputS, (value.title())))
    elif inputS.lower() in "quit":
        break
    else:
        print("Does not exist.")
print("End of session.")
I'd appreciate any input leading me in the right direction.
Use defaultdict to make sure each value of your initials dict is a set, and then use the add method. If you just use =, you'll overwrite the value stored for that initial each time. defaultdict is an easier way of writing an expression like:
if initial in dict:
    dict[initial].add(country)
else:
    dict[initial] = {country}
See the full working example below. Note that I'm using enumerate instead of range(0,len(countries)), which I'd also recommend:
#!/usr/bin/env python3
from collections import defaultdict
initials, countries, incomes = [],[],[]
dict1 = {}
dict2 = defaultdict(set)
# Inline sample standing in for the file; one record per line
keywordFile = """
1:Qatar:98900
2:Liechtenstein:89400
3:Luxembourg:80600
4:Bermuda:69900
5:Singapore:59700
6:Jersey:57000
""".strip().split("\n")
for line in keywordFile:
    line = line.upper().strip("\n").split(":")
    initials.append(line[1][0])
    countries.append(line[1])
    incomes.append(line[2])
for i,country in enumerate(countries):
    dict1[country] = incomes[i]
    dict2[initials[i]].add(country)
print(dict2["L"])
Result:
{'LUXEMBOURG', 'LIECHTENSTEIN'}
see: https://docs.python.org/3/library/collections.html#collections.defaultdict
The values for dictionary2 should be such that they can contain a list of countries. One option is to use a list as the values in your dictionary. In your code, you are overwriting the values for each key whenever a new country with the same initial is to be added as the value.
Moreover, you can use the setdefault method of the dictionary type. This code:
dictionary2 = {}
for country in countries:
    dictionary2.setdefault(country[0], []).append(country)
should be enough to create the second dictionary elegantly.
setdefault either returns the value for the key (in this case the first letter of the country name) if it already exists, or inserts the key into the dictionary with the given default, here an empty list [], and returns that.
Edit
If you want your values to be sets (for faster lookup/membership tests), you can use the following lines:
dictionary2 = {}
for country in countries:
    dictionary2.setdefault(country[0], set()).add(country)
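To see what setdefault returns in each case, here is a minimal sketch. The names groups, first and second are illustrative, and the two country names are taken from the sample file.

```python
groups = {}

# Key 'L' is missing: setdefault stores the new empty list and returns it.
first = groups.setdefault('L', [])
first.append('Liechtenstein')

# Key 'L' now exists: setdefault returns the stored list and the fresh []
# passed as the default is simply discarded.
second = groups.setdefault('L', [])
second.append('Luxembourg')

print(first is second)  # True: both names refer to the list inside groups
print(groups)           # {'L': ['Liechtenstein', 'Luxembourg']}
```

Because the returned object is the one held in the dictionary, appending to it updates the dictionary without any further assignment.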
Here's a link to a live functioning version of the OP's code online.
The keys in Python dict objects are unique. There can only ever be one 'L' key in a single dict. What happens in your code is that first the key/value pair 'L':'Liechtenstein' is inserted into dictionary_2. However, in a subsequent iteration of the for loop, 'L':'Liechtenstein' is overwritten by 'L':'Luxembourg'. This kind of overwriting is sometimes referred to as "clobbering".
Fix
One way to get the result that you seem to be after would be to rewrite that for loop:
for i in range(0,len(countries)):
    dictionary_2[initials[i]] = dictionary_2.get(initials[i], set()) | {countries[i]}
print(dictionary_2)
Also, you have to rewrite the related elif statement beneath that:
elif inputS in dictionary_2:
    titles = ', '.join([v.title() for v in dictionary_2[inputS]])
    print("The countries that begin with the letter {} are: {}.".format(inputS, titles))
Explanation
Here's a complete explanation of the dictionary_2[initials[i]] = dictionary_2.get(initials[i], set()) | {countries[i]} line above:
dictionary_2.get(initials[i], set())
If initials[i] is a key in dictionary_2, this will return the associated value. If initials[i] is not in the dictionary, it will return the empty set set() instead.
{countries[i]}
This creates a new set with a single member in it, countries[i].
dictionary_2.get(initials[i], set()) | {countries[i]}
The | operator adds all of the members of two sets together and returns the result.
dictionary_2[initials[i]] = ...
The right hand side of the line either creates a new set, or adds to an existing one. This bit of code assigns that newly created/expanded set back to dictionary_2.
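The steps above can be traced in a short sketch. It assumes a small hand-written list of (initial, country) pairs instead of the file, and it adds an in-place variant using setdefault for comparison; the names pairs and in_place are illustrative.

```python
# Hand-written (initial, country) pairs standing in for the parsed file.
pairs = [('L', 'LIECHTENSTEIN'), ('L', 'LUXEMBOURG'), ('Q', 'QATAR')]

# Union-based version, as explained above.
dictionary_2 = {}
for initial, country in pairs:
    existing = dictionary_2.get(initial, set())  # stored set, or a new empty one
    combined = existing | {country}              # union builds a brand-new set
    assert combined is not existing              # the old set is left untouched
    dictionary_2[initial] = combined             # so it must be assigned back

# In-place alternative: mutate the stored set instead of rebuilding it.
in_place = {}
for initial, country in pairs:
    in_place.setdefault(initial, set()).add(country)

print(dictionary_2 == in_place)  # True
```

Both loops end with the same dictionary. The union version creates a new set on every iteration, which is harmless at this size.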
Notes
The above code sets the values of dictionary_2 as sets. If you want to use list values, use this version of the for loop instead:
for i in range(0,len(countries)):
    dictionary_2[initials[i]] = dictionary_2.get(initials[i], []) + [countries[i]]
print(dictionary_2)
You're very close to what you're looking for. You could populate your dictionaries while looping over the contents of the file raw.txt, or you could read the contents of the file first and then perform the operations needed to populate the dictionaries. The latter can be done with compact one-liners using dict comprehensions and groupby. Here's an example:
country_per_capita_dict = {}
letter_countries_dict = {}
keywordFile = [line.strip() for line in open('raw.txt' ,'r').readlines()]
You now have a list of all lines in the keywordFile as follows:
['1:Qatar:98900', '2:Liechtenstein:89400', '3:Luxembourg:80600', '4:Bermuda:69900', '5:Singapore:59700', '6:Jersey:57000', '7:Libya:1000', '8:Sri Lanka:5000']
As you loop over the items, you can split(':') and use the [1] and [2] index values as required.
You could use dictionary comprehension as follows:
country_per_capita_dict = {entry.split(':')[1] : entry.split(':')[2] for entry in keywordFile}
Which results in:
{'Qatar': '98900', 'Liechtenstein': '89400', 'Luxembourg': '80600', 'Bermuda': '69900', 'Singapore': '59700', 'Jersey': '57000', 'Libya': '1000', 'Sri Lanka': '5000'}
Similarly using groupby from itertools you can obtain:
from itertools import groupby
country_list = sorted(country_per_capita_dict.keys())
letter_countries_dict = {k: list(g) for k,g in groupby(country_list, key=lambda x:x[0]) }
Which results in the required dictionary of initial : [list of countries]
{'B': ['Bermuda'], 'J': ['Jersey'], 'L': ['Libya', 'Liechtenstein', 'Luxembourg'], 'Q': ['Qatar'], 'S': ['Singapore', 'Sri Lanka']}
A complete example is as follows:
from itertools import groupby
country_per_capita_dict = {}
letter_countries_dict = {}
keywordFile = [line.strip() for line in open('raw.txt' ,'r').readlines()]
country_per_capita_dict = {entry.split(':')[1] : entry.split(':')[2] for entry in keywordFile}
country_list = sorted(country_per_capita_dict.keys())
letter_countries_dict = {k: list(g) for k,g in groupby(country_list, key=lambda x:x[0]) }
print (country_per_capita_dict)
print (letter_countries_dict)
Explanation:
The line:
country_per_capita_dict = {entry.split(':')[1] : entry.split(':')[2] for entry in keywordFile}
loops over the following list
['1:Qatar:98900', '2:Liechtenstein:89400', '3:Luxembourg:80600', '4:Bermuda:69900', '5:Singapore:59700', '6:Jersey:57000', '7:Libya:1000', '8:Sri Lanka:5000'] and splits each entry in the list by :
It then takes the value at index [1] and [2] which are the country names and the per capita value and makes them into a dictionary.
country_list = sorted(country_per_capita_dict.keys())
This line extracts the names of all the countries from the dictionary created before into a list and sorts them alphabetically, which groupby needs in order to work correctly.
letter_countries_dict = {k: list(g) for k,g in groupby(country_list, key=lambda x:x[0]) }
Here groupby collects consecutive country names that share the same first letter x[0], and the dict comprehension stores each group as list(g) under that letter k.
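The reason the sort matters is that groupby only merges adjacent items with equal keys. A small sketch with a hand-picked, deliberately unsorted subset of the country names shows the difference; the names first_letter, unsorted_groups and so on are illustrative.

```python
from itertools import groupby

# Deliberately unsorted: the 'L' countries are not all adjacent.
countries = ['Qatar', 'Liechtenstein', 'Luxembourg', 'Bermuda', 'Libya']

def first_letter(name):
    return name[0]

# Without sorting, 'L' shows up as two separate groups.
unsorted_groups = [(k, list(g)) for k, g in groupby(countries, key=first_letter)]
print(unsorted_groups)
# [('Q', ['Qatar']), ('L', ['Liechtenstein', 'Luxembourg']), ('B', ['Bermuda']), ('L', ['Libya'])]

# In a dict comprehension the second 'L' group silently replaces the first.
clobbered = {k: list(g) for k, g in groupby(countries, key=first_letter)}
print(clobbered['L'])  # ['Libya']

# Sorting first puts equal initials next to each other.
grouped = {k: list(g) for k, g in groupby(sorted(countries), key=first_letter)}
print(grouped)
# {'B': ['Bermuda'], 'L': ['Libya', 'Liechtenstein', 'Luxembourg'], 'Q': ['Qatar']}
```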
I have a list of dictionaries that map different IDs to a central ID. I have a document in which these different IDs are associated with terms. I have written a function that looks up the central ID for each of the different IDs in the document and uses it as a key. The goFile is the document: its first column holds an ID and its second column a GO term. The mappingList is a list of dictionaries in which the IDs in the goFile are mapped to a main ID.
My expected output is a dictionary with a main ID as a key and a set with the go terms associated with it as value.
def parseGO(mappingList, goFile):
    # open the file
    file = open(goFile)
    # this will be the dictionary that this function returns
    # entries will have as a key an Ensembl ID
    # and the value will be a set of GO terms
    GOdict = {}
    GOset = set()
    for line in file:
        splitline = line.split(' ')
        GO_term = splitline[1]
        value_ID = splitline[0]
        for dict in mappingList:
            if value_ID in dict:
                ENSB_term = dict[value_ID]
                #my best try
                for dict in mappingList:
                    for key in GOdict.keys():
                        if value_ID in dict and key == dict[value_ID]:
                            GOdict[ENSB_term].add(GO_term)
                GOdict[ENSB_term] = GOset
    return GOdict
My problem is that I now have to add, under the central ID in my GOdict, the terms that the document associates with the different IDs. To avoid duplicates I use a set (GOset). How do I do it? All my attempts end up with all the terms mapped to all the main IDs.
Some sample:
mappingList = [{'1234': 'mainID1', '456': 'mainID2'}, {'789': 'mainID2'}]
goFile:
1234 GOTERM1
1234 GOTERM2
456 GOTERM1
456 GOTERM3
789 GOTERM1
expected output:
GOdict = {'mainID1': set([GOTERM1, GOTERM2]), 'mainID2': set([GOTERM1, GOTERM3])}
First off, you shouldn't use the variable name 'dict', as it shadows the built-in dict class, and will cause you problems at some point.
The following should work for you:
from collections import defaultdict
def parse_go(mapping_list, go_file):
    go_dict = defaultdict(set)
    with open(go_file) as f:  # 'with' makes sure the file is closed
        for line in f:
            # Feel free to change the split behaviour to
            # work better for you.
            (value_id, go_term) = line.split()
            for map_dict in mapping_list:
                if value_id in map_dict:
                    go_dict[map_dict[value_id]].add(go_term)
    return go_dict
The code is fairly straightforward, but here's a breakdown anyway.
We use a default dictionary instead of a normal dictionary so we can eliminate all the "if key in dict" or setdefault() boilerplate.
For each line in the file, we check whether the first item (value_id) is a key in any of the mapping dictionaries, and if so, add the line's second item (go_term) to the set stored under the main ID that value_id maps to.
EDIT: Request for doing this without defaultdict(). Assume that go_dict is just a normal dictionary (go_dict = {}), your for loop would look like:
for map_dict in mapping_list:
    if value_id in map_dict:
        ensb_entry = go_dict.setdefault(map_dict[value_id], set())
        ensb_entry.add(go_term)
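To check the setdefault variant against the sample in the question, here is a self-contained sketch. It assumes the file contents are passed in as a list of lines rather than read from disk, and the function name parse_go_lines is made up for this example.

```python
def parse_go_lines(mapping_list, lines):
    """Build {main_id: set_of_go_terms} from 'ID TERM' lines.

    Hypothetical variant of parse_go that takes an iterable of lines
    instead of a file name, so it can be run without a file.
    """
    go_dict = {}
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        value_id, go_term = parts
        for map_dict in mapping_list:
            if value_id in map_dict:
                go_dict.setdefault(map_dict[value_id], set()).add(go_term)
    return go_dict

# Sample data copied from the question.
mapping_list = [{'1234': 'mainID1', '456': 'mainID2'}, {'789': 'mainID2'}]
go_lines = [
    '1234 GOTERM1',
    '1234 GOTERM2',
    '456 GOTERM1',
    '456 GOTERM3',
    '789 GOTERM1',
]

go_dict = parse_go_lines(mapping_list, go_lines)
print(go_dict == {'mainID1': {'GOTERM1', 'GOTERM2'},
                  'mainID2': {'GOTERM1', 'GOTERM3'}})  # True
```

Each main ID gets its own set, created on first use, which is what avoids every term ending up under every main ID.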
Here is my code:
for response in responses["result"]:
    ids = {}
    key = response['_id'].encode('ascii')
    print key
    for value in response['docs']:
        ids[key].append(value)
Traceback:
  File "people.py", line 47, in <module>
    ids[key].append(value)
KeyError: 'deanna'
I am trying to add multiple values to a key, but it throws the error above.
Check out setdefault:
ids.setdefault(key, []).append(value)
It looks to see if key is in ids, and if not, sets that to be an empty list. Then it returns that list for you to inline call append on.
Docs:
http://docs.python.org/2/library/stdtypes.html#dict.setdefault
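As a rough illustration, here is the setdefault version run on made-up data. The shape of responses is only a guess based on the question's code, the document names are invented, and the sketch is written for Python 3 (so the .encode('ascii') call is dropped) with ids created once before the loop.

```python
# Hypothetical data shaped like the question's `responses`.
responses = {
    "result": [
        {"_id": "deanna", "docs": ["doc1", "doc2"]},
        {"_id": "will", "docs": ["doc3"]},
        {"_id": "deanna", "docs": ["doc4"]},
    ]
}

ids = {}  # created once, outside the loop, so earlier keys are kept
for response in responses["result"]:
    key = response["_id"]
    for value in response["docs"]:
        # Creates the list on first use, then appends to it.
        ids.setdefault(key, []).append(value)

print(ids)  # {'deanna': ['doc1', 'doc2', 'doc4'], 'will': ['doc3']}
```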
If I'm reading this correctly your intention is to map the _id of a response to its docs. In that case you can bring down everything you have above to a dict comprehension:
ids = {response['_id'].encode('ascii'): response['docs']
       for response in responses['result']}
This also assumes you meant to have ids = {} outside of the outermost loop, but I can't see any other reasonable interpretation.
If the above is not correct,
You can use collections.defaultdict
import collections # at top level
#then in your loop:
ids = collections.defaultdict(list) #instead of ids = {}
This is a dictionary whose default values are created by calling the argument passed at construction; here, calling list() produces an empty list which can then be appended to.
To traverse the dictionary you can iterate over its items():
for key, val in ids.items():
    print(key, val)
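One detail worth knowing about defaultdict is that merely reading a missing key creates it. A minimal sketch, with made-up keys and values:

```python
from collections import defaultdict

ids = defaultdict(list)
ids['deanna'].append('doc1')      # missing key: list() is called, then append

had_will_before = 'will' in ids   # membership tests do not create keys
_ = ids['will']                   # but indexing a missing key inserts []
had_will_after = 'will' in ids

maybe = ids.get('data')           # .get() never calls the default factory

print(had_will_before, had_will_after)  # False True
print(maybe)                            # None
print(dict(ids))                        # {'deanna': ['doc1'], 'will': []}
```

If stray empty entries would be a problem, test with `in` or use .get() before indexing.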
The reason you're getting a KeyError is this: In the first iteration of your for loop, you look up the key in an empty dictionary. There is no such key, hence the KeyError.
The code you gave will work if you first insert an empty list into the dictionary under the appropriate key, then append the values to the list. Like so:
for response in responses["result"]:
    ids = {}
    key = response['_id'].encode('ascii')
    print key
    if key not in ids: ## <-- if we haven't seen key yet
        ids[key] = [] ## <-- insert an empty list into the dictionary
    for value in response['docs']:
        ids[key].append(value)
The previous answers are correct. Both defaultdict and dictionary.setdefault are automatic ways of inserting the empty list.