Function that makes dict from string but swaps keys and values? - python

I'm trying to make a function that takes in a list of strings as input, called like this:
swap_values_dict(['Summons: Bahamut, Shiva, Chocomog',
                  'Enemies: Bahamut, Shiva, Cactaur'])
and creates a dictionary from them using the words after the colons as keys and the words before the colons as values. I need to clarify that, at this point, there are only two strings in the list. I plan to split the strings into sublists and, from there, try and assign them to a dictionary.
The output should look like
{'Bahamut': ['Summons','Enemies'],'Shiva':['Summons','Enemies'],'Chocomog':['Summons'],'Cactaur':['Enemies']}
As you can see, the words after the colon in the original list have become keys, while the words before the colon (categories) have become the values. If a word appears after the colon in both strings, it is assigned both categories in the final dictionary. I would like to be able to make similar dictionaries out of many lists of different sizes, not just ones that contain two strings. Could this be done without list comprehensions, using only for loops and if statements?
What I've Tried So Far
title_list = []
for i in range(len(mobs)):  # counts amount of strings in list
    titles = (mobs[i].split(":"))[0]  # gets titles from list using split
    title_list.append(titles)
title_list
This code returns ['Summons', 'Enemies'], which isn't the result I wanted, but I think it could help me write the function. I had planned on separating the keys and values into separate lists and then zipping them together afterwards as a dictionary.

Try:
def swap_values_dict(lst):
    tmp = {}
    for s in lst:
        k, v = map(str.strip, s.split(":"))
        tmp[k] = list(map(str.strip, v.split(",")))
    out = {}
    for k, v in tmp.items():
        for i in v:
            out.setdefault(i, []).append(k)
    return out

print(
    swap_values_dict(
        [
            "Summons: Bahamut, Shiva, Chocomog",
            "Enemies: Bahamut, Shiva, Cactaur",
        ]
    )
)
Prints:
{
"Bahamut": ["Summons", "Enemies"],
"Shiva": ["Summons", "Enemies"],
"Chocomog": ["Summons"],
"Cactaur": ["Enemies"],
}

I'd use a defaultdict. It saves you the trouble of manually checking if a key exists in your dictionary and constructing a new empty list, making for a rather concise function:
from collections import defaultdict
def swap_values_dict(mobs):
    result = defaultdict(list)
    for elem in mobs:
        role, members = elem.split(': ')
        for m in members.split(', '):
            result[m].append(role)
    return result
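Since the question asks for a version that uses only for loops and if statements, here is a minimal sketch of the same inversion without setdefault, defaultdict, or comprehensions. The function name swap_values_dict_plain and the extra "Bosses" string are made up for illustration; the sketch also assumes every string contains a colon.

```python
def swap_values_dict_plain(lines):
    """Invert 'Category: a, b, c' strings into {item: [categories]}."""
    result = {}
    for line in lines:
        # Split on the first colon only: text before it is the category.
        category, _, members = line.partition(":")
        category = category.strip()
        for member in members.split(","):
            member = member.strip()
            if member == "":
                continue  # skip empty entries such as a trailing comma
            if member not in result:
                result[member] = []  # first time we see this item
            result[member].append(category)
    return result


# Works for any number of strings, not just two.
mobs = [
    "Summons: Bahamut, Shiva, Chocomog",
    "Enemies: Bahamut, Shiva, Cactaur",
    "Bosses: Shiva",  # hypothetical third category
]
inverted = swap_values_dict_plain(mobs)
print(inverted)
```

The explicit "if member not in result" check is exactly what setdefault and defaultdict do for you in the answers above.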

Related

How to read and print a list in a specific order/format based on the content in the list for python?

New to python and for this example list
lst = ['<name>bob</name>', '<job>doctor</job>', '<gender>male</gender>', '<name>susan</name>', '<job>teacher</job>', '<gender>female</gender>', '<name>john</name>', '<gender>male</gender>']
There are 3 categories of name, job, and gender. I would want those 3 categories to be on the same line which would look like
<name>bob</name>, <job>doctor</job>, <gender>male</gender>
My actual list is really big, with 10 categories that I want on the same line. I am also trying to figure out a way to print something like N/A when one of the categories is missing from the list.
For example, I would want it to look like
<name>bob</name>, <job>doctor</job>, <gender>male</gender>
<name>susan</name>, <job>teacher</job>, <gender>female</gender>
<name>john</name>, N/A, <gender>male</gender>
What would be the best way to do this?
This is one way to do it. This would handle any length list, and guarantee grouping no matter how long the lists are as long as they are in the correct order.
Updated to convert to dict, so you can test for key existence.
lst = ['<name>bob</name>', '<job>doctor</job>', '<gender>male</gender>', '<name>susan</name>', '<job>teacher</job>', '<gender>female</gender>', '<name>john</name>', '<gender>male</gender>']
newlst = []
tmplist = {}
for item in lst:
    value = item.split('>')[1].split('<')[0]
    key = item.split('<')[1].split('>')[0]
    if '<name>' in item:
        # a <name> tag starts a new record, so store the previous one
        if tmplist:
            newlst.append(tmplist)
        tmplist = {}
    tmplist[key] = value
# handle the remaining items left over in the list
if tmplist:
    newlst.append(tmplist)
print(newlst)
# test for existence
for each in newlst:
    print(each.get('job', 'N/A'))
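To get the exact line format asked for in the question, the grouped dictionaries can be turned back into tagged strings. This is only a sketch: the records list is written out by hand to match what the loop above produces, and format_record is a hypothetical helper name.

```python
# Grouped records, as produced by the loop above for the example list.
records = [
    {"name": "bob", "job": "doctor", "gender": "male"},
    {"name": "susan", "job": "teacher", "gender": "female"},
    {"name": "john", "gender": "male"},
]

# The order in which categories should appear on each line.
categories = ["name", "job", "gender"]


def format_record(record, categories):
    """Return one output line; missing categories become 'N/A'."""
    parts = []
    for cat in categories:
        if cat in record:
            parts.append("<{0}>{1}</{0}>".format(cat, record[cat]))
        else:
            parts.append("N/A")
    return ", ".join(parts)


lines = [format_record(record, categories) for record in records]
for line in lines:
    print(line)
```

Extending this to 10 categories only requires adding their names to the categories list.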

Change list according to dictionary items

Hello I have a list that looks like:
>>> ids
['70723295',
'75198124',
'140',
'199200',
'583561',
'71496270',
'69838760',
'70545907',
...]
I also have a dictionary that gives those numbers a 'name'. Now I want to create a new list that contains only the names, in the same order as the numbers, i.e. replace each number with its name from the dictionary.
I tried:
with open('/home/anja/Schreibtisch/Master/ABA/alltogether/filelist.txt') as f:
    ids = [line.strip() for line in f.read().split('\n')]
rev_subs = { v:k for v,k in dictionary.items()}
new_list=[rev_subs.get(item,item) for item in ids]
#dict looks like:
{'16411': 'Itgax',
'241041': 'Gm4956',
'22419': 'Wnt5b',
'20174': 'Ruvbl2',
'71833': 'Dcaf7',
...}
But new_list is still the same as ids.
What am I doing wrong?
Maybe the dictionary keys are not in the format you think. For example, the dictionary might contain integers while the ids are strings. I would investigate that: it looks more like a type mismatch than an empty (or non-matching) dictionary.
Your dictionary keys are bs4.element.NavigableString objects rather than strings, so you cannot use strings as keys to look up its values.
You can fix this by converting the keys to strings when you build rev_subs:
rev_subs = {str(k): v for k, v in dictionary.items()}
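A quick way to check for this kind of mismatch is to look at the types of the keys before doing the lookup. The sketch below uses made-up data with integer keys (an assumption for illustration, not the asker's real dictionary) to show the symptom and the str() fix.

```python
ids = ["16411", "22419", "999"]

# Hypothetical dictionary whose keys are ints rather than strings.
dictionary = {16411: "Itgax", 22419: "Wnt5b"}

# Inspect which key types are actually present.
key_types = {type(k).__name__ for k in dictionary}
print(key_types)  # {'int'}

# Looking up string ids in int keys never matches, so ids come back unchanged.
unchanged = [dictionary.get(item, item) for item in ids]
print(unchanged)  # ['16411', '22419', '999']

# Normalising the keys to str makes the lookup succeed.
rev_subs = {str(k): v for k, v in dictionary.items()}
new_list = [rev_subs.get(item, item) for item in ids]
print(new_list)  # ['Itgax', 'Wnt5b', '999']
```

Ids without an entry (here '999') are kept as they are, because get falls back to the id itself.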

Dealing with lists in a dictionary

I am iterating through some folders to read all the files in them, so that I can later move the ones that are not rejected. As the number of folders and files may vary, I managed to create a dictionary where each folder is a key and its files are the value. In a dummy situation I have:
Iterating through the number of source of folders (known but may vary)
sourcefolder = (r"C:\User\Desktop\Test")
subfolders = 3
for i in range(subfolders):
    Lst_All["allfiles" + str(i)] = os.listdir(sourcefolder[i])
This results in the dictionary below:
Lst_All = {
allfiles0: ('A.1.txt', 'A.txt', 'rejected.txt')
allfiles1: ('B.txt')
allfiles2: ('C.txt')}
My issue is to remove the rejected files so I can do a shutil.move() with only valid files.
So far I got:
for k, v in lst_All.items():
    for i in v:
        if i == "rejected.txt":
            del lst_All[i]
but it returns an error KeyError: 'rejected.txt'. Any thoughts? Perhaps another way to create the list of items to be moved?
Thanks!
For a start, the members of your dictionary are tuples, not lists. Tuples are immutable, so we can't remove items as easily as we can with lists. To replicate the functionality I think you're after, we can do the following:
Lst_All = {'allfiles0': ('A.1.txt', 'A.txt', 'rejected.txt'),
'allfiles1': ('B.txt',),
'allfiles2': ('C.txt',)}
Lst_All = {k: tuple(x for x in v if x!="rejected.txt") for k, v in Lst_All.items()}
Which gives us:
>>> Lst_All
{'allfiles0': ('A.1.txt', 'A.txt'),
'allfiles1': ('B.txt',),
'allfiles2': ('C.txt',)}
You should not remove elements from a dictionary while iterating over it. It is better to make a list of the keys first and iterate over that. You also do not need a separate loop to check whether rejected.txt is present in a folder's file list. Note that this version deletes the whole entry for any folder that contains rejected.txt:
keys = list(lst_All.keys())
for k in keys:
    if "rejected.txt" in lst_All[k]:
        del lst_All[k]
If you only want to remove rejected.txt itself, you have to create a new tuple without that element and store it back under the same key, like this:
keys = list(lst_All.keys())
for k in keys:
    lst_All[k] = tuple(e for e in lst_All[k] if e != 'rejected.txt')
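The KeyError in the question comes from using a file name ('rejected.txt') as a dictionary key, while the keys are actually 'allfiles0', 'allfiles1', and so on. An alternative is to skip the rejected file while building the dictionary, so nothing has to be deleted afterwards. The sketch below is one way to do that; list_valid_files and the temporary folder layout are made up for the demonstration.

```python
import os
import tempfile


def list_valid_files(folders, rejected_name="rejected.txt"):
    """Map 'allfilesN' to the files in each folder, minus the rejected one."""
    valid = {}
    for index, folder in enumerate(folders):
        names = []
        for name in sorted(os.listdir(folder)):  # sorted for a stable order
            if name != rejected_name:
                names.append(name)
        valid["allfiles" + str(index)] = names
    return valid


# Demo on throwaway folders that mimic the layout from the question.
layout = [["A.1.txt", "A.txt", "rejected.txt"], ["B.txt"], ["C.txt"]]
with tempfile.TemporaryDirectory() as root:
    folders = []
    for i, file_names in enumerate(layout):
        folder = os.path.join(root, "sub" + str(i))
        os.mkdir(folder)
        for file_name in file_names:
            open(os.path.join(folder, file_name), "w").close()
        folders.append(folder)
    valid_files = list_valid_files(folders)

print(valid_files)
```

Storing lists instead of tuples also means later changes can be made in place.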

How to create a dictionary whose values are sets?

I'm working on an exercise that requires me to build two dictionaries, one whose keys are country names, and the values are the GDP. This part works fine.
The second dictionary is where I'm lost, as the keys are supposed to be the letters A‐Z and the values are sets of country names. I tried using a for loop, which I've commented on below, where the issue lies.
If the user enters a string with only one letter (like A), the program should print all the countries that begin with that letter. When you run the program, however, it only prints out one country for each letter.
The text file contains 228 lines, e.g.:
1:Qatar:98900
2:Liechtenstein:89400
3:Luxembourg:80600
4:Bermuda:69900
5:Singapore:59700
6:Jersey:57000
etc.
And here's my code.
initials = []
countries=[]
incomes=[]
dictionary={}
dictionary_2={}
keywordFile = open("raw.txt", "r")
for line in keywordFile:
    line = line.upper()
    line = line.strip("\n")
    line = line.split(":")
    initials.append(line[1][0]) # first letter of second element
    countries.append(line[1])
    incomes.append(line[2])
for i in range(0,len(countries)):
    dictionary[countries[i]] = incomes[i]
This for loop should spit out 248 values (one for each country), where the key is the initial and the value is the country name. However, it only spits out 26 values (one country for each letter in the alphabet):
for i in range(0,len(countries)):
    dictionary_2[initials[i]] = countries[i]
print(dictionary_2)
while True:
    inputS = str(input('Enter an initial or a country name.'))
    if inputS in dictionary:
        value = dictionary.get(inputS, "")
        print("The per capita income of {} is {}.".format((inputS.title()), value ))
    elif inputS in dictionary_2:
        value = dictionary_2.get(inputS)
        print("The countries that begin with the letter {} are: {}.".format(inputS, (value.title())))
    elif inputS.lower() in "quit":
        break
    else:
        print("Does not exit.")
print("End of session.")
I'd appreciate any input leading me in the right direction.
Use defaultdict to make sure each value of your initials dict is a set, and then use the add method. If you just use =, you'll overwrite the value stored under that initial each time. defaultdict is an easier way of writing an expression like:
if initial in dict:
    dict[initial].add(country)
else:
    dict[initial] = {country}
See the full working example below, and also note that I'm using enumerate instead of range(0,len(countries)), which I'd also recommend:
#!/usr/bin/env python3
from collections import defaultdict
initials, countries, incomes = [],[],[]
dict1 = {}
dict2 = defaultdict(set)
keywordFile = """
1:Qatar:98900
2:Liechtenstein:89400
3:Luxembourg:80600
4:Bermuda:69900
5:Singapore:59700
6:Jersey:57000
""".strip().split("\n")
for line in keywordFile:
    line = line.upper().strip("\n").split(":")
    initials.append(line[1][0])
    countries.append(line[1])
    incomes.append(line[2])
for i,country in enumerate(countries):
    dict1[country] = incomes[i]
    dict2[initials[i]].add(country)
print(dict2["L"])
Result:
{'LUXEMBOURG', 'LIECHTENSTEIN'}
see: https://docs.python.org/3/library/collections.html#collections.defaultdict
The values for dictionary2 should be such that they can contain a list of countries. One option is to use a list as the values in your dictionary. In your code, you are overwriting the values for each key whenever a new country with the same initial is to be added as the value.
Moreover, you can use the setdefault method of the dictionary type. This code:
dictionary2 = {}
for country in countries:
    dictionary2.setdefault(country[0], []).append(country)
should be enough to create the second dictionary elegantly.
setdefault either returns the value for the key (in this case the key is the first letter of the country name) if it already exists, or inserts a new key (again, the first letter of the country) into the dictionary with a value that is an empty list [] and returns that list.
Edit:
If you want your values to be sets (for faster lookup/membership tests), you can use the following lines:
dictionary2 = {}
for country in countries:
    dictionary2.setdefault(country[0], set()).add(country)
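To tie this back to the question's lookup by initial, here is a small self-contained sketch. The country list is the sample from the question; by_initial and countries_for are hypothetical names, and the upper-casing of the letter is an added assumption so that both 'l' and 'L' work.

```python
countries = ["Qatar", "Liechtenstein", "Luxembourg",
             "Bermuda", "Singapore", "Jersey"]

# Build {initial: set of countries} with setdefault.
by_initial = {}
for country in countries:
    by_initial.setdefault(country[0].upper(), set()).add(country)


def countries_for(letter):
    """Return the countries starting with `letter`, sorted alphabetically."""
    return sorted(by_initial.get(letter.upper(), set()))


print(countries_for("l"))  # ['Liechtenstein', 'Luxembourg']
print(countries_for("x"))  # []
```

Sorting on output is needed because sets have no defined order.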
The keys in Python dict objects are unique. There can only ever be one 'L' key in a single dict. What happens in your code is that first the key/value pair 'L':'Liechtenstein' is inserted into dictionary_2. However, in a subsequent iteration of the for loop, 'L':'Liechtenstein' is overwritten by 'L':'Luxembourg'. This kind of overwriting is sometimes referred to as "clobbering".
Fix
One way to get the result that you seem to be after would be to rewrite that for loop:
for i in range(0,len(countries)):
    dictionary_2[initials[i]] = dictionary_2.get(initials[i], set()) | {countries[i]}
print(dictionary_2)
Also, you have to rewrite the related elif statement beneath that:
elif inputS in dictionary_2:
    titles = ', '.join([v.title() for v in dictionary_2[inputS]])
    print("The countries that begin with the letter {} are: {}.".format(inputS, titles))
Explanation
Here's a complete explanation of the dictionary_2[initials[i]] = dictionary_2.get(initials[i], set()) | {countries[i]} line above:
dictionary_2.get(initials[i], set())
If initials[i] is a key in dictionary_2, this will return the associated value. If initials[i] is not in the dictionary, it will return the empty set set() instead.
{countries[i]}
This creates a new set with a single member in it, countries[i].
dictionary_2.get(initials[i], set()) | {countries[i]}
The | operator computes the union of two sets: it returns a new set containing every member of either set.
dictionary_2[initials[i]] = ...
The right hand side of the line either creates a new set, or adds to an existing one. This bit of code assigns that newly created/expanded set back to dictionary_2.
Notes
The above code sets the values of dictionary_2 as sets. If you want to use list values, use this version of the for loop instead:
for i in range(0,len(countries)):
    dictionary_2[initials[i]] = dictionary_2.get(initials[i], []) + [countries[i]]
print(dictionary_2)
You're very close to what you're looking for. You could populate your dictionaries while looping over the contents of the file raw.txt. You could also read the contents of the file first and then perform the necessary operations to populate the dictionaries. Both can be done with concise one-liners in Python using dict comprehensions and groupby. Here's an example:
country_per_capita_dict = {}
letter_countries_dict = {}
keywordFile = [line.strip() for line in open('raw.txt' ,'r').readlines()]
You now have a list of all lines in the keywordFile as follows:
['1:Qatar:98900', '2:Liechtenstein:89400', '3:Luxembourg:80600', '4:Bermuda:69900', '5:Singapore:59700', '6:Jersey:57000', '7:Libya:1000', '8:Sri Lanka:5000']
As you loop over the items, you can split(':') and use the [1] and [2] index values as required.
You could use dictionary comprehension as follows:
country_per_capita_dict = {entry.split(':')[1] : entry.split(':')[2] for entry in keywordFile}
Which results in:
{'Qatar': '98900', 'Liechtenstein': '89400', 'Luxembourg': '80600', 'Bermuda': '69900', 'Singapore': '59700', 'Jersey': '57000', 'Libya': '1000', 'Sri Lanka': '5000'}
Similarly using groupby from itertools you can obtain:
from itertools import groupby
country_list = sorted(country_per_capita_dict)  # in Python 3, keys() has no .sort()
letter_countries_dict = {k: list(g) for k,g in groupby(country_list, key=lambda x:x[0]) }
Which results in the required dictionary of initial : [list of countries]
{'B': ['Bermuda'], 'J': ['Jersey'], 'L': ['Libya', 'Liechtenstein', 'Luxembourg'], 'Q': ['Qatar'], 'S': ['Singapore', 'Sri Lanka']}
A complete example is as follows:
from itertools import groupby
country_per_capita_dict = {}
letter_countries_dict = {}
keywordFile = [line.strip() for line in open('raw.txt' ,'r').readlines()]
country_per_capita_dict = {entry.split(':')[1] : entry.split(':')[2] for entry in keywordFile}
country_list = sorted(country_per_capita_dict)  # in Python 3, keys() has no .sort()
letter_countries_dict = {k: list(g) for k,g in groupby(country_list, key=lambda x:x[0]) }
print (country_per_capita_dict)
print (letter_countries_dict)
Explanation:
The line:
country_per_capita_dict = {entry.split(':')[1] : entry.split(':')[2] for entry in keywordFile}
loops over the following list
['1:Qatar:98900', '2:Liechtenstein:89400', '3:Luxembourg:80600', '4:Bermuda:69900', '5:Singapore:59700', '6:Jersey:57000', '7:Libya:1000', '8:Sri Lanka:5000'] and splits each entry in the list by :
It then takes the value at index [1] and [2] which are the country names and the per capita value and makes them into a dictionary.
country_list = sorted(country_per_capita_dict)
This line extracts the names of all the countries from the dictionary created before into a list and sorts them alphabetically, which groupby needs in order to work correctly.
letter_countries_dict = {k: list(g) for k,g in groupby(country_list, key=lambda x:x[0]) }
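This dict comprehension passes the sorted list of countries to groupby, which uses the lambda x: x[0] as its key to group consecutive names that share the same first letter. Each first letter k becomes a dictionary key, and list(g) is the list of countries in that group.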
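The sorting step matters because itertools.groupby only groups consecutive items with the same key. The short sketch below, using a hand-picked subset of the country names, shows what happens with and without sorting; first_letter is just a named stand-in for the lambda.

```python
from itertools import groupby

names = ["Qatar", "Liechtenstein", "Bermuda", "Luxembourg"]


def first_letter(name):
    return name[0]


# Without sorting, 'L' shows up twice because the two L-countries
# are not next to each other in the input.
unsorted_groups = [(k, list(g)) for k, g in groupby(names, key=first_letter)]
print(unsorted_groups)
# [('Q', ['Qatar']), ('L', ['Liechtenstein']), ('B', ['Bermuda']), ('L', ['Luxembourg'])]

# After sorting, each initial forms exactly one group.
sorted_groups = {k: list(g) for k, g in groupby(sorted(names), key=first_letter)}
print(sorted_groups)
# {'B': ['Bermuda'], 'L': ['Liechtenstein', 'Luxembourg'], 'Q': ['Qatar']}
```

In a dict comprehension the unsorted case would silently keep only the last 'L' group, which is the same overwriting problem as in the original question.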

populate dictionary with for loop

I have several large dictionaries where all the values are the same except for the last several characters.
For example: http://www.example.com/abc
Right now I'm using a dictionary like so:
categories = {1:'http://www.example.com/abc',
              2:'http://www.example.com/def'}
with an additional 30 k,v pairs.
How can I use a for loop to add the static and end variables together as the value, and generate integers as the keys of a dictionary?
static = 'http://www.example.com'
end = ['abc','def']
You can do what you are trying to do with a dictionary comprehension.
static = 'http://www.example.com/'
end = ['abc','def']
{ k:'{}{}'.format(static, v) for k,v in enumerate(end) }
But it does raise the question, as @mkrieger pointed out, of why not just use a list.
Use a dictionary comprehension.
template = 'http://www.example.com/{path}'
categories = {i+1: template.format(path=e) for i, e in enumerate(end)}
Since the keys are a range of integers, you could as well use a list. The only difference is that the indices start at 0 instead of 1.
categories_list = [template.format(path=e) for e in end]
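Since the question asks specifically for a for loop, the same dictionary can be built without a comprehension. This is a minimal sketch that assumes the trailing slash belongs to static; enumerate's start argument makes the keys begin at 1, as in the question's original dictionary.

```python
static = "http://www.example.com/"
end = ["abc", "def"]

categories = {}
# enumerate(..., start=1) yields (1, 'abc'), (2, 'def'), ...
for key, suffix in enumerate(end, start=1):
    categories[key] = static + suffix

print(categories)
# {1: 'http://www.example.com/abc', 2: 'http://www.example.com/def'}
```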
