Say I have a list:
['1,2,3', '1,2,3', '1,3,2', '2,1,3', '2,3,1']
How can I categorise the items into specific keys of a dictionary according to where a specific number is in an item? If 1 is the first number, the item should be added to the value of the first key, if it's the second, it should be added to the value of the second etc.
So if I have a dictionary with keys A, B, C:
{'A': [], 'B': [], 'C': []}
The resulting dictionary should look like:
{'A': ['1,2,3', '1,2,3', '1,3,2'], 'B': ['2,1,3'], 'C': ['2,3,1']}
At the moment I have the following code:
lst = ['1,2,3', '1,2,3', '1,3,2', '2,1,3', '2,3,1']
dict = {'A': [], 'B': [], 'C': []}
for item in lst:
    item.strip(',')
    if item[0] == 1:
        dict['A'].append(item)
    elif item[1] == 1:
        dict['B'].append(item)
    elif item[2] == 1:
        dict['C'].append(item)
print(dict)
However, this just returns the original dictionary.
Try this:
lst = ['1,2,3', '1,2,3', '1,3,2', '2,1,3', '2,3,1']
dict = {'A': [], 'B': [], 'C': []}
for item in lst:
    # Drop the commas so each character is one number (single digits only)
    l = item.replace(',', '')
    # Compare against the string '1', not the integer 1
    if l[0] == '1':
        dict['A'].append(item)
    elif l[1] == '1':
        dict['B'].append(item)
    elif l[2] == '1':
        dict['C'].append(item)
print(dict)
I hope this helps!
I think you meant to use item.split(',') instead of item.strip(','). strip only removes commas at the start and end of the string, whereas item.split(',') splits the string into a list using a comma as the delimiter. Also, you need to save the result of the method call, because neither method modifies the string in place. Finally, the pieces are strings, so they must be compared with '1' rather than the integer 1.
What you probably want to do is something like:
lst = ['1,2,3', '1,2,3', '1,3,2', '2,1,3', '2,3,1']
dict = {'A': [], 'B': [], 'C': []}
for item in lst:
    item_arr = item.split(',')
    # The position of '1' selects the matching letter from 'ABC'
    key = 'ABC'[item_arr.index('1')]
    dict[key].append(item)
print(dict)
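To make the strip/split difference described in this answer concrete, here is a small sketch. The variable names are only for illustration.

```python
item = '2,1,3'

# str.strip(',') only trims commas at the two ends; there are none here,
# so the string comes back unchanged.
stripped = item.strip(',')

# str.split(',') returns a new list of the pieces between the commas.
parts = item.split(',')

# Strings are immutable: neither call changed `item` itself.
print(stripped)  # 2,1,3
print(parts)     # ['2', '1', '3']
print(item)      # 2,1,3

# The pieces are strings, so compare against '1', not the integer 1.
print(parts[1] == '1')  # True
print(parts[1] == 1)    # False
```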
It's less efficient and not very readable but here is a one-liner using dictionary and list comprehensions:
lst = ['1,2,3', '1,2,3', '1,3,2', '2,1,3', '2,3,1']
keys = ['A', 'B', 'C']
dic = {key: [x for x in lst if x.split(',')[j] == '1'] for j, key in enumerate(keys)}
# {'A': ['1,2,3', '1,2,3', '1,3,2'], 'B': ['2,1,3'], 'C': ['2,3,1']}
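If the one-liner feels too dense, the same grouping can be sketched as a single pass wrapped in a function. The name group_by_position and the choice to silently skip items that do not contain the target are my own assumptions, not something the question specifies.

```python
def group_by_position(items, keys, target='1'):
    """Group comma-separated strings by the position of `target`."""
    groups = {key: [] for key in keys}
    for item in items:
        parts = item.split(',')
        # Assumption: skip items that lack the target, or where the
        # target sits at a position that has no corresponding key.
        if target in parts and parts.index(target) < len(keys):
            groups[keys[parts.index(target)]].append(item)
    return groups

lst = ['1,2,3', '1,2,3', '1,3,2', '2,1,3', '2,3,1', '2,3,4']
result = group_by_position(lst, ['A', 'B', 'C'])
print(result)
# {'A': ['1,2,3', '1,2,3', '1,3,2'], 'B': ['2,1,3'], 'C': ['2,3,1']}
```

Each item is split once here, whereas the comprehension splits every item once per key.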
I have a dictionary structured as below:
tmp_dict = {
'a': ?,
'b': ?,
'c': ?,
}
and my data is a list of such dictionaries:
tmp_list = [tmp_dict1, tmp_dict2, tmp_dict3....]
I found that some of the dictionaries do not have all of the keys 'a', 'b', 'c'.
How do I check for missing keys and fill them in?
You could try something like this:
# List of keys to look for in each dictionary
dict_keys = ['a','b','c']
# Generate the dictionaries for demonstration purposes only
tmp_dict1 = {'a':[1,2,3], 'b':[4,5,6]}
tmp_dict2 = {'a':[7,8,9], 'b':[10,11,12], 'c':[13,14,15]}
tmp_dict3 = {'a':[16,17,18], 'c':[19,20,21]}
# Add the dictionaries to a list as per OP instructions
tmp_list = [tmp_dict1, tmp_dict2, tmp_dict3]
#--------------------------------------------------------
# Check for missing keys in each dict.
# Print the dict name and keys missing.
# -------------------------------------------------------
for i, dct in enumerate(tmp_list, start=1):
    for k in dict_keys:
        # Membership test, so a key whose value is None is not reported as missing
        if k not in dct:
            print(f"tmp_dict{i} is missing key:", k)
OUTPUT:
tmp_dict1 is missing key: c
tmp_dict3 is missing key: b
I think you want this.
tmp_dict = {'a':1, 'b': 2, 'c':3}
default_keys = tmp_dict.keys()
tmp_list = [{'a': 1}, {'b': 2,}, {'c': 3}]
for t in tmp_list:
    current_dict = t.keys()
    # Key views support set difference: the keys this dict is missing
    if default_keys - current_dict:
        t.update({diff: None for diff in list(default_keys-current_dict)})
print(tmp_list)
Output:
[{'a': 1, 'c': None, 'b': None}, {'b': 2, 'a': None, 'c': None}, {'c': 3, 'a': None, 'b': None}]
You can compare the keys in the dictionary with a set containing all the expected keys.
for d in tmp_list:
    if set(d) != {'a', 'b', 'c'}:
        print(d)
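The set comparison above only detects incomplete dictionaries. To also fill them in, something along these lines could work. This is a minimal sketch: the sample data is made up, and using None as the filler value is an assumption.

```python
expected_keys = {'a', 'b', 'c'}
tmp_list = [{'a': 1}, {'a': 2, 'b': 3, 'c': 4}, {'c': 5}]

for d in tmp_list:
    # Set difference gives the keys this dictionary lacks.
    missing = expected_keys - set(d)
    # sorted() only makes the insertion order predictable.
    for key in sorted(missing):
        # setdefault writes the filler only when the key is absent.
        d.setdefault(key, None)

print(tmp_list)
# [{'a': 1, 'b': None, 'c': None}, {'a': 2, 'b': 3, 'c': 4}, {'c': 5, 'a': None, 'b': None}]
```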
I have a dictionary with four keys a,b,c,d with values 100,200,300,400
list1 = {'a':'100','b':'200','c':'300','d':'400'}
And a variable inputs:
inputs = 'c'
The list1 dictionary has to be reordered so that the key named by inputs comes first. For example:
inputs = 'c'
list1 = {'c':'300','a':'100','b':'200','d':'400'}
inputs = 'b'
list1 = {'b':'200','a':'100','c':'300','d':'400'}
In Python 3.7+, dicts preserve insertion order, so you can build a new dict that starts with the chosen key:
k = 'c'
d = {k: list1[k]}
for key in list1:
    if key != k:
        d[key] = list1[key]
Output:
{'c': '300', 'a': '100', 'b': '200', 'd': '400'}
Seems like you just want to rearrange your dict to have the chosen key at the front, followed by the remaining keys:
dict1 = {'a':'100','b':'200','c':'300','d':'400'}
key = 'c'
result = {key: dict1[key], **{k: v for k, v in dict1.items() if k != key}}
print(result)
# {'c': '300', 'a': '100', 'b': '200', 'd': '400'}
The ** unpacks the filtered leftover items into the new dict literal, after key: dict1[key].
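The same idea can be wrapped in a small helper. This is only a sketch: the name move_to_front is hypothetical, and returning an unchanged copy for an unknown key is an assumption, since the question does not say what should happen in that case.

```python
def move_to_front(d, key):
    """Return a new dict with `key` first; other keys keep their order."""
    if key not in d:
        # Assumption: an unknown key leaves the order unchanged.
        return dict(d)
    return {key: d[key], **{k: v for k, v in d.items() if k != key}}

dict1 = {'a': '100', 'b': '200', 'c': '300', 'd': '400'}
print(move_to_front(dict1, 'b'))
# {'b': '200', 'a': '100', 'c': '300', 'd': '400'}
print(move_to_front(dict1, 'z'))
# {'a': '100', 'b': '200', 'c': '300', 'd': '400'}
```

This relies on dicts preserving insertion order, which is guaranteed from Python 3.7 onwards.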
If you just want to move a given key to the first position when it exists, it could be done in the following way:
list1 = {'a':'100','b':'200','c':'300','d':'400'}
inputs = 'c'
output = {}
if inputs in list1:
    output[inputs] = list1[inputs]
# Re-assigning a key that is already present keeps its position,
# so inputs stays first while the other keys follow in order
for i in list1:
    output[i] = list1[i]
Output:
{'c': '300', 'a': '100', 'b': '200', 'd': '400'}
Here's a one-liner:
d = {'a':'100','b':'200','c':'300','d':'400'}
i = input()
d = {i:d[i],**{k:d[k] for k in d if k!=i}}
print(d)
Input:
c
Output:
{'c': '300', 'a': '100', 'b': '200', 'd': '400'}
I have a Dictionary here:
dic = {'A':1, 'B':6, 'C':42, 'D':1, 'E':12}
and a list here:
lis = ['C', 'D', 'C', 'C', 'F']
What I'm trying to do (it is also a homework requirement) is to check whether each value in lis matches a key in dic. If it does, that key's value is incremented by 1 (for example, there are 3 'C's in lis, so 'C' in the output dic should be 45). If not, a new item is created in dic with its value set to 1.
So the example output should look like this:
dic = {'A':1, 'B':6, 'C':45, 'D':2, 'E':12, 'F':1}
Here's my code:
def addToInventory(dic, lis):
    for k,v in dic.items():
        for i in lis:
            if i == k:
                dic[k] += 1
            else:
                dic[i] = 1
        return dic
and I run it with:
dic = addToInventory(dic,lis)
It runs without error, but the output is strange: it added the missing F to dic but didn't update the values correctly.
dic = {'A':1, 'B':6, 'C':1, 'D':1, 'E':12, 'F':1}
What am I missing here?
There's no need to iterate over a dictionary when it supports direct key lookup. You can use if x in dict to do this. Furthermore, your return statement needs to be outside the loop.
Try, instead:
def addToInventory(dic, lis):
    for i in lis:
        if i in dic:
            dic[i] += 1
        else:
            dic[i] = 1
    return dic
out = addToInventory(dic, lis)
print(out)
{'A': 1, 'B': 6, 'C': 45, 'D': 2, 'E': 12, 'F': 1}
As Harvey suggested, you can shorten the function a little by making use of dict.get.
def addToInventory(dic, lis):
    for i in lis:
        dic[i] = dic.get(i, 0) + 1
    return dic
The dic.get function takes two parameters: the key, and a default value that is returned if the key is not in the dictionary.
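A quick illustration of how get behaves with and without the key present. The inventory dict here is just made-up sample data.

```python
inventory = {'C': 42}

# Key present: the default is ignored and the stored value is returned.
print(inventory.get('C', 0))  # 42

# Key absent: the default is returned, and the dict is not modified.
print(inventory.get('F', 0))  # 0
print('F' in inventory)       # False

# Combining the lookup with an assignment creates or increments the count.
inventory['F'] = inventory.get('F', 0) + 1
print(inventory)  # {'C': 42, 'F': 1}
```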
If your professor allows the use of libraries, you can use the collections.Counter data structure, which is meant precisely for keeping counts.
from collections import Counter
c = Counter(dic)
for i in lis:
    c[i] += 1
print(dict(c))
{'A': 1, 'B': 6, 'C': 45, 'D': 2, 'E': 12, 'F': 1}
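As a possible shortening of the Counter approach, Counter.update accepts an iterable and adds one per occurrence, so the explicit loop can be dropped. A sketch using the data from the question:

```python
from collections import Counter

dic = {'A': 1, 'B': 6, 'C': 42, 'D': 1, 'E': 12}
lis = ['C', 'D', 'C', 'C', 'F']

c = Counter(dic)
# Unlike dict.update, Counter.update adds to existing counts
# instead of replacing them.
c.update(lis)

print(dict(c))
# {'A': 1, 'B': 6, 'C': 45, 'D': 2, 'E': 12, 'F': 1}
```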
I have a list which is structured as follows:
arr = [ ['a'],
['a','b'],
['a','x','y'],
['a','c'],
['a','c','a'],
['a','c','b'],
['a','c','b','a'],
['a','c','b','b'],
['a','d'],
['b'],
['b','c'],
['b','c','a'],
['b','c','b'],
['c','d'],
['c','d','e'],
['c','d','f'],
['c','d','f','a'],
['c','d','f','b'],
['c','d','f','b','a'],
]
As you can see, the list has some unique elements, and the elements that follow each one build upon it until a new unique element appears. These are supposed to be categories and subcategories. So ['a'], ['b'], ['c','d'] are the broad main categories, and then there are further subcategories within subcategories, based on the same principle. Ideally I want the categories and subcategories as a dictionary. The end result should look something like:
{'a': ['a-b',
'a-x-y',
{'a-c':
['a-c-a',
{'a-c-b':
['a-c-b-a',
'a-c-b-b']
}]
}
],
'b' : ................
'c-d': ...............}
I may also be able to work with just the first level of sub-classification and discarding the rest altogether. In that case, the output would be:
{'a': ['a-b', 'a-x-y', 'a-c', 'a-d'], 'b': ['b-c'], 'c-d': ['c-d-e', 'c-d-f']}
I have written code for the second scenario, but I am not sure if this is a robust way to solve it:
def arrange(arr):
    cat = {"-".join(arr[0]): ["-".join(arr[1])]}
    main = 0
    for i in range(2,len(arr)):
        l = len(arr[main])
        if arr[main] == arr[i][0:l]:
            cat["-".join(arr[main])].append("-".join(arr[i]))
        else:
            cat["-".join(arr[i])] = []
            main = i
    for k,v in cat.items():
        found = True
        i = 0
        while i < len(v)-1:
            f_idx = i + 1
            # Bounds check so popping the last element cannot raise IndexError
            while f_idx < len(v) and v[i] in v[f_idx]:
                v.pop(f_idx)
            i += 1
    return cat
Output:
{'a': ['a-b', 'a-x-y', 'a-c', 'a-d'], 'b': ['b-c'], 'c-d': ['c-d-e', 'c-d-f']}
Please help me make this code better and/or help me build a dictionary with the complete structure, including all the sub-classifications. Thanks.
I believe I have what you describe as the first level of sub-classification, discarding the rest altogether.
The trick was to start a new key whenever the current key is not a sublist of the next item in the list; otherwise the item is collected as a value of the current key.
The same logic was used for removing duplicates.
from collections import defaultdict
#Function that compares two lists even with duplicate items
def contains_sublist(lst, sublst):
    n = len(sublst)
    # range instead of xrange so this runs on both Python 2 and 3
    return any((sublst == lst[i:i+n]) for i in range(len(lst)-n+1))
#Define default dict of list
aDict = defaultdict(list)
it = iter(arr)  # arr is the list from the question
#Format key (keep the first item as a list for the sublist test)
s = next(it)
key = '-'.join(s)
# Loop that collects keys if key is not sublist else values
for l in it:
    if contains_sublist(l, s):
        aDict[key].append(l)
    else:
        key = '-'.join(l)
        s = l
#Loop to remove duplicate items based upon recurrence of sublist
for k in list(aDict):
    dellist = []
    for s in aDict[k]:
        for l in aDict[k]:
            if l != s:
                if contains_sublist(l, s):
                    if not l in dellist:
                        dellist.append(l)
    for l in dellist:
        try:
            aDict[k].remove(l)
        except ValueError:
            pass
#Create final dict by concatenating list of list with '-'
# items() instead of iteritems() so this runs on both Python 2 and 3
finaldict = {k:[ '-'.join(i) for i in v ] for k,v in aDict.items()}
Result:
Python 2.7.9 (default, Dec 10 2014, 12:24:55) [MSC v.1500 32 bit (Intel)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> ================================ RESTART ================================
>>>
>>> finaldict
{'a': ['a-b', 'a-x-y', 'a-c', 'a-d'], 'b': ['b-c'], 'c-d': ['c-d-e', 'c-d-f']}
>>>
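For comparison, the first-level grouping can also be sketched as a single pass that compares list prefixes. This is not the code from the answer above. It assumes the list is ordered as in the question, with every category appearing before its descendants, and the name first_level and the shortened sample list are made up for illustration.

```python
def first_level(arr):
    """Map each main category to its direct subcategories only."""
    result = {}
    main = None      # current main category, as a list
    last_sub = None  # most recently kept subcategory, as a list
    for item in arr:
        if main is None or item[:len(main)] != main:
            # The item does not extend the current main category:
            # it starts a new main category.
            main, last_sub = item, None
            result['-'.join(main)] = []
        elif last_sub is not None and item[:len(last_sub)] == last_sub:
            # The item extends the last kept subcategory: deeper level, skip.
            continue
        else:
            last_sub = item
            result['-'.join(main)].append('-'.join(item))
    return result

sample = [['a'], ['a', 'b'], ['a', 'x', 'y'], ['a', 'c'], ['a', 'c', 'a'],
          ['a', 'd'], ['b'], ['b', 'c'], ['b', 'c', 'a'],
          ['c', 'd'], ['c', 'd', 'e'], ['c', 'd', 'f'], ['c', 'd', 'f', 'a']]
print(first_level(sample))
# {'a': ['a-b', 'a-x-y', 'a-c', 'a-d'], 'b': ['b-c'], 'c-d': ['c-d-e', 'c-d-f']}
```

Comparing prefixes of the lists avoids false matches that substring tests on the joined strings could produce with multi-character names.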
You're describing a Trie.
Here's a very basic implementation:
def make_trie(words):
    root = dict()
    for word in words:
        current_dict = root
        for letter in word:
            # Descend into the child dict, creating it if needed
            current_dict = current_dict.setdefault(letter, {})
        # The key 1 marks the end of a complete word
        current_dict[1] = 1
    return root
trie = make_trie(arr)
print(trie)
# {'a': {1: 1, 'c': {'a': {1: 1}, 1: 1, 'b': {'a': {1: 1}, 1: 1, 'b': {1: 1}}}, 'b': {1: 1}, 'd': {1: 1}, 'x': {'y': {1: 1}}}, 'c': {'d': {1: 1, 'e': {1: 1}, 'f': {'a': {1: 1}, 1: 1, 'b': {'a': {1: 1}, 1: 1}}}}, 'b': {1: 1, 'c': {'a': {1: 1}, 1: 1, 'b': {1: 1}}}}
print(trie.get('a',{}).get('x',{}))
# {'y': {1: 1}}
This trie is just nested dicts, so it's easy to iterate over all the children of ['a', 'x'] or select all the dicts that have a max-depth of 2 for example.
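The key 1 marks the end of a complete word. It is needed to tell entries that are actually in the list apart from mere prefixes: for example, ['a', 'x', 'y'] is in arr but ['a', 'x'] is not, so trie['a']['x'] has no 1 key.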
There are more complete Trie libraries for Python, such as pygtrie.
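To show how such a nested-dict trie might be queried, here is a rough sketch. The helpers children and words_under are hypothetical names, the sample list is a shortened version of the one in the question, and the code assumes the integer 1 never appears as an actual element.

```python
END = 1  # same end-of-word marker as in the trie above

def make_trie(words):
    root = {}
    for word in words:
        node = root
        for letter in word:
            node = node.setdefault(letter, {})
        node[END] = 1
    return root

def children(trie, path):
    """Return the elements that directly extend `path` in the trie."""
    node = trie
    for letter in path:
        node = node.get(letter, {})
    return [key for key in node if key != END]

def words_under(node, prefix=()):
    """Yield every complete word stored at or below `node`."""
    for key, child in node.items():
        if key == END:
            yield list(prefix)
        else:
            yield from words_under(child, prefix + (key,))

arr = [['a'], ['a', 'b'], ['a', 'x', 'y'], ['a', 'c'], ['a', 'c', 'a'], ['b']]
trie = make_trie(arr)
print(children(trie, ['a']))
# ['b', 'x', 'c']
print(list(words_under(trie['a']['c'], ('a', 'c'))))
# [['a', 'c'], ['a', 'c', 'a']]
```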
I have a dictionary of lists, and it should be initialized with default keys. I guess, the code below is not good (I mean, it works, but I don't feel that it is written in the pythonic way):
d = {'a' : [], 'b' : [], 'c' : []}
So I want to use something more pythonic like defaultdict:
d = defaultdict(list)
However, every tutorial that I've seen sets new keys dynamically. But in my case all the keys should be defined from the start. I'm parsing other data structures, and I add values to my dictionary only if a specific key in the structure is also present in my dictionary.
How can I set the default keys?
From the comments, I'm assuming you want a dictionary that fits the following conditions:
Is initialized with set of keys with an empty list value for each
Has defaultdict behavior that can initialize an empty list for non-existing keys
@Aaron_lab has the right method, but there's a slightly cleaner way:
d = defaultdict(list,{ k:[] for k in ('a','b','c') })
That's already reasonable but you can shorten that up a bit with a dict comprehension that uses a standard list of keys.
>>> standard_keys = ['a', 'b', 'c']
>>> d1 = {key:[] for key in standard_keys}
>>> d2 = {key:[] for key in standard_keys}
>>> ...
If you're going to pre-initialize to empty lists, there is no need for a defaultdict. Simple dict-comprehension gets the job done clearly and cleanly:
>>> {k : [] for k in ['a', 'b', 'c']}
{'a': [], 'b': [], 'c': []}
If you have a closed set of keys (['a', 'b', 'c'] in your example) that you know you'll use, you can definitely use the answers above.
BUT...
dd = defaultdict(list) gives you much more than d = {'a':[], 'b':[], 'c':[]}.
You can append to "not existing" keys in defaultdict:
>>> dd['d'].append(5)
>>> dd
defaultdict(<class 'list'>, {'d': [5]})
whereas if you do:
>>> d['d'].append(5) # you'll face KeyError
KeyError: 'd'
I recommend doing something like:
>>> d = {'a' : [], 'b' : [], 'c' : []}
>>> default_d = defaultdict(list, **d)
Now you have a dict holding your 3 keys, ['a', 'b', 'c'], with empty lists as values, and you can also append to other keys without explicitly writing d['new_key'] = [] before appending.
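To compare the two behaviours side by side, here is a small sketch. The (key, value) pairs are made-up stand-ins for whatever is being parsed. It passes the preset keys as a mapping rather than with **, since keyword unpacking only works when every key is a string.

```python
from collections import defaultdict

keys = ['a', 'b', 'c']

# Plain dict: only the preset keys exist, so membership can act as a filter.
d = {k: [] for k in keys}
for key, value in [('a', 1), ('z', 2), ('b', 3)]:
    if key in d:
        d[key].append(value)
print(d)  # {'a': [1], 'b': [3], 'c': []}

# defaultdict seeded with the same keys: unknown keys are created on access.
dd = defaultdict(list, {k: [] for k in keys})
dd['z'].append(2)
print(dict(dd))  # {'a': [], 'b': [], 'c': [], 'z': [2]}
```

If values should only be added for keys that already exist, as the question describes, the plain dict with a membership test may be the closer fit.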
You can have a function defined which will return you a dict with preset keys.
def get_preset_dict(keys=['a','b','c'],values=None):
    d = {}
    if not values:
        # One separate empty list per key; [[]]*len(keys) would make
        # every key share the same list object
        values = [[] for _ in keys]
    if len(keys)!=len(values):
        raise ValueError('unequal lengths')
    for index,key in enumerate(keys):
        d[key] = values[index]
    return d
In [8]: get_preset_dict()
Out[8]: {'a': [], 'b': [], 'c': []}
In [18]: get_preset_dict(keys=['a','e','i','o','u'])
Out[18]: {'a': [], 'e': [], 'i': [], 'o': [], 'u': []}
In [19]: get_preset_dict(keys=['a','e','i','o','u'],values=[[1],[2,2,2],[3],[4,2],[5]])
Out[19]: {'a': [1], 'e': [2, 2, 2], 'i': [3], 'o': [4, 2], 'u': [5]}
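One caveat worth checking in helpers like this: building default values with list multiplication, as in [[]] * n, repeats a reference to one list object rather than creating n separate lists. A short sketch of the difference:

```python
# List multiplication copies references, so every slot is the same list object.
shared = [[]] * 3
shared[0].append('x')
print(shared)  # [['x'], ['x'], ['x']]

# A comprehension evaluates [] once per iteration, giving independent lists.
independent = [[] for _ in range(3)]
independent[0].append('x')
print(independent)  # [['x'], [], []]

# The same applies when the lists become dictionary values.
keys = ['a', 'b', 'c']
d = dict(zip(keys, [[] for _ in keys]))
d['a'].append(1)
print(d)  # {'a': [1], 'b': [], 'c': []}
```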
from collections import defaultdict
list(map((data := defaultdict(list)).__getitem__, 'abcde'))
data
Out[3]: defaultdict(list, {'a': [], 'b': [], 'c': [], 'd': [], 'e': []})