Python sort list to set - python

I have
word=['I','love','hello','world','love','I']
When I convert it to a set, the order changes:
print(set(word))
output: {'world', 'I', 'hello', 'love'}
How can I sort the set again so that it is
{'I', 'love', 'hello', 'world'}

Sets are unordered. If you want order, convert back to a list.
E.g.
print(sorted(set(word)))
sorted will sort your items and return a list.
However, if you want to retain the order of your elements rather than sort them, you can use a set for deduplication and a list for ordering, something like this:
def unique(items):
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result
and use it as:
>>> word = ['I','love','hello','world','love','I']
>>> print(unique(word))
['I', 'love', 'hello', 'world']

If you just want an ordered collection of unique values, you can create a dict from the list, either with a dict comprehension or with dict.fromkeys. In Python 3.7 and later, dictionaries retain insertion order (in CPython 3.6 this was an implementation detail); for older versions, use collections.OrderedDict. The dict will have values besides the keys, but you can just ignore those.
>>> word = ['a','b','c','c','b','e']
>>> {k: None for k in word}
{'a': None, 'b': None, 'c': None, 'e': None}
>>> dict.fromkeys(word)
{'a': None, 'b': None, 'c': None, 'e': None}
Unlike sorted, this also works if the original order differs from the sorted order.
>>> word = ['f','a','b','c','c','b','e']
>>> dict.fromkeys(word)
{'f': None, 'a': None, 'b': None, 'c': None, 'e': None}
You can then either convert the result to a list or keep it as a dict and add more values, but if you make it a set, the order will be lost again. Like a set, the dict also allows fast O(1) lookup, but the dict itself offers no set operations like intersection or union (its keys() view does support them).
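If a list is what you want in the end, the dict.fromkeys approach can be wrapped in list(). A minimal sketch, assuming Python 3.7 or later, where plain dicts preserve insertion order; the name unique_in_order is made up for illustration:

```python
def unique_in_order(items):
    # dict.fromkeys keeps one key per distinct item, in first-seen order
    # (guaranteed for plain dicts from Python 3.7 on).
    return list(dict.fromkeys(items))

word = ['I', 'love', 'hello', 'world', 'love', 'I']
print(unique_in_order(word))  # ['I', 'love', 'hello', 'world']
```

As with a set, this only works for hashable items.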

Related

Create two dictionaries by iterating through a function that returns a tuple of two elements in Python

I want to create two dictionaries in python by dictionary comprehension at the same time. The two dictionaries share the same key set, but have different values for each key. Therefore, I use a function to return a tuple of two values, and hoping a dictionary comprehension can create these two dictionaries at the same time.
Say, I have a function
def my_func(foo):
    blablabla...
    return a, b
And I will create two dictionaries by
dict_of_a, dict_of_b = ({key:my_func(key)[0]}, {key:my_func(key)[1]} for key in list_of_keys)
Is there any better code to improve it? In my opinion, my_func(key) will be called twice in each iteration, slowing down the code. What is the correct way to do it?
With ordered slicing:
def myfunc(k):
    return k + '0', k + '1'
list_of_keys = ['a', 'b', 'c']
groups = [(k,v) for k in list_of_keys for v in myfunc(k)]
dict_of_a, dict_of_b = dict(groups[::2]), dict(groups[1::2])
print(dict_of_a) # {'a': 'a0', 'b': 'b0', 'c': 'c0'}
print(dict_of_b) # {'a': 'a1', 'b': 'b1', 'c': 'c1'}
dict_of_a, dict_of_b = {}, {}
for key in list_of_keys:
    dict_of_a[key], dict_of_b[key] = my_func(key)
The regular loop is probably the best way to go. If you want to play with functools, you can write:
>>> import functools
>>> def func(foo): return foo[0], foo[1:]
...
>>> L = ['a', 'ab', 'abc']
>>> functools.reduce(lambda acc, x: tuple({**d, x: v} for d, v in zip(acc, func(x))), L, ({}, {}))
({'a': 'a', 'ab': 'a', 'abc': 'a'}, {'a': '', 'ab': 'b', 'abc': 'bc'})
The function reduce is a fold: it takes the current accumulator (here the tuple of dicts being built) and the next value x from L:
d, v in zip(acc, func(x)) pairs each dict with the matching element of the return value of func;
{**d, x: v} builds a new dict containing the entries of d plus the current key and value.
I don't recommend this kind of code since it's hard to maintain.
my_func(key) will be called twice in each iteration, slowing down the code
Don't worry about it. Unless you need to do thousands or millions of iterations and the script takes an unreasonably long time to complete, you shouldn't concern yourself with negligible optimization gains.
That said, I'd use something like this:
if __name__ == '__main__':
    def my_func(k):
        return f'a{k}', f'b{k}'
    keys = ['x', 'y', 'z']
    results = (my_func(k) for k in keys)
    grouped_values = zip(*results)
    da, db = [dict(zip(keys, v)) for v in grouped_values]
    print(da)
    print(db)
    # Output:
    # {'x': 'ax', 'y': 'ay', 'z': 'az'}
    # {'x': 'bx', 'y': 'by', 'z': 'bz'}
You cannot create two dicts in one dict comprehension.
If your primary goal is to just call my_func once to create both dicts, use a function for that:
def mkdicts(keys):
    dict_of_a = {}
    dict_of_b = {}
    for key in keys:
        dict_of_a[key], dict_of_b[key] = my_func(key)
    return dict_of_a, dict_of_b
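To see that a plain loop really calls the function only once per key, here is a small sketch with a call log. The body of my_func and the calls list are placeholders invented for this illustration; the asker's real function is not shown in the question:

```python
calls = []  # records every key my_func is called with (illustration only)

def my_func(key):
    # Placeholder body, standing in for the asker's real function.
    calls.append(key)
    return key.upper(), len(key)

def mkdicts(keys):
    dict_of_a, dict_of_b = {}, {}
    for key in keys:
        # One call per key; the returned tuple is unpacked into both dicts.
        dict_of_a[key], dict_of_b[key] = my_func(key)
    return dict_of_a, dict_of_b

dict_of_a, dict_of_b = mkdicts(['x', 'yy', 'zzz'])
print(dict_of_a)  # {'x': 'X', 'yy': 'YY', 'zzz': 'ZZZ'}
print(dict_of_b)  # {'x': 1, 'yy': 2, 'zzz': 3}
print(calls)      # ['x', 'yy', 'zzz']
```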

In Python, given a dictionary with lists in the values, how do I sort the dictionary based on the amount of items in that list?

I have a dictionary
d={'a': ['apple'], 'd': ['dog', 'dance', 'dragon'], 'r': ['robot'], 'c': ['cow', 'cotton']}
and I want to define a function that will order them by the size of the set. That is, since "d" has 3 items in the value, "c" has 2 items, and "a" and "r" each have one item, I want a dictionary in that order. So
d={'d': ['dog', 'dance', 'dragon'], 'c': ['cow', 'cotton'], 'a': ['apple'], 'r': ['robot']}
What I have so far is
def order_by_set_size(d):
    return sorted(d, key=lambda k: len(d[k]), reverse=True)
This gives me a list, but I can't figure out how to have it give me a dictionary. I've looked at a lot of other questions and tried different variations of code and this is as close as I can get.
(I'm using Python 3)
You can use an OrderedDict (on Python 3.7 and later a plain dict also keeps insertion order).
See https://docs.python.org/3/library/collections.html#collections.OrderedDict
Based on their example:
from collections import OrderedDict
d={'a': ['apple'], 'd': ['dog', 'dance', 'dragon'], 'r': ['robot'], 'c': ['cow', 'cotton']}
ordered = OrderedDict(sorted(d.items(),key=lambda t: len(t[1]),reverse=True))
Before Python 3.7, dictionaries did not guarantee any order, so with a plain dict a sorted list of keys was the most you could get, as in the code below. From Python 3.7 on, dicts preserve insertion order, so a new dict built from the sorted items keeps that order.
def order_by_set_size(d):
    return sorted(d, key=lambda k: len(d[k]), reverse=True)
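For what it's worth, on Python 3.7 and later the sorted items can be passed straight to dict(). This is a minimal sketch under that version assumption; the function name order_by_list_size is made up here:

```python
def order_by_list_size(d):
    # sorted() is stable, even with reverse=True, so keys whose lists have
    # the same length keep their original relative order.
    return dict(sorted(d.items(), key=lambda kv: len(kv[1]), reverse=True))

d = {'a': ['apple'], 'd': ['dog', 'dance', 'dragon'], 'r': ['robot'], 'c': ['cow', 'cotton']}
ordered = order_by_list_size(d)
print(ordered)
# {'d': ['dog', 'dance', 'dragon'], 'c': ['cow', 'cotton'], 'a': ['apple'], 'r': ['robot']}
```

The original dict is left untouched; a new dict is returned.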

list comprehension using dictionary entries

trying to figure out how I might be able to use list comprehension for the following:
I have a dictionary:
dict = {}
dict ['one'] = {"tag":"A"}
dict ['two'] = {"tag":"B"}
dict ['three'] = {"tag":"C"}
and I would like to create a list (let's call it "list") which is populated by each of the "tag" values of each key, i.e.
['A', 'B', 'C']
is there an efficient way to do this using list comprehension? i was thinking something like:
list = [x for x in dict[x]["tag"]]
but obviously this doesn't quite work. any help appreciated!
This is an extra step but gets the desired output and avoids shadowing built-in names such as dict and list:
d = {}
d['one'] = {"tag":"A"}
d['two'] = {"tag":"B"}
d['three'] = {"tag":"C"}
new_list = []
for k in ('one', 'two', 'three'):
    new_list += [x for x in d[k]["tag"]]
print(new_list)
Try this:
d = {'one': {'tag': 'A'},
     'two': {'tag': 'B'},
     'three': {'tag': 'C'}}
tag_values = [d[i][j] for i in d for j in d[i]]
>>> print(tag_values)
['A', 'B', 'C']
On Python versions before 3.7 the order is not guaranteed; you can sort the list afterwards if it matters.
If you have other key/value pairs in the inner dicts, apart from 'tag', you may want to specify the 'tag' keys, like this:
tag_value = [d[i]['tag'] for i in d if 'tag' in d[i]]
for the same result. If 'tag' is definitely always there, remove the if 'tag' in d[i] part.
As a side note, it is never a good idea to call a list 'list' (or a dict 'dict'): these are built-in names in Python, and reusing them shadows the built-in.
You can try this:
[i['tag'] for i in dict.values()]
I would do something like this:
untransformed = {
    'one': {'tag': 'A'},
    'two': {'tag': 'B'},
    'three': {'tag': 'C'},
    'four': 'bad'
}
transformed = [value.get('tag') for key, value in untransformed.items() if isinstance(value, dict) and 'tag' in value]
It also sounds like you're trying to get some info out of JSON. If so, you might want to look into a tool like jq: https://stedolan.github.io/jq/manual/

Sorting dictionary for printing

I want to print a sorted dictionary, which contains a lot of key value pairs (~2000).
Each pair consists of a number as the key and a string as the value.
It is just about printing; I don't actually want to sort the dictionary.
If I use the sorted() function, Python sorts my dictionary, but in an awkward way:
{'0':'foo', '1':'bar', '10': 'foofoo', '100': 'foobar', '1000': 'barbar',
'1001': 'barfoo', '1002': 'raboof', ...}
But I want to sort it the 'conventional' way like this:
{'0':'foo', '1':'bar', '2': 'foofoo', '3': 'foobar', '4': 'barbar',
'5': 'barfoo', ... , '1001': 'raboof'}
Can I force the method to behave how I want to, or is there another better solution?
Your keys are strings representing integers; if you want a numeric sort, use int() to turn the keys to integers:
sorted(yourdict, key=int)
gives you a numerically sorted list of keys and
sorted(yourdict.items(), key=lambda i: int(i[0]))
gives you items sorted by the numeric value of the key.
However, if you have sequential keys starting at 0, you should be using a list object instead. Index references are faster than dictionary lookups as there is no hashing step required.
Even if your keys do not start at 0 but are still sequential, for a small start index you'd just pad the list with None values:
[None, 'foo', 'bar', 'foofoo', ...]
and index into that starting at 1.
Before Python 3.7 you could not rely on a dictionary's order (dicts use hashing internally), but you can always print the key-value pairs in sorted order:
print(sorted(d.items(), key=lambda x: int(x[0])))
Output
[('0', 'foo'),
('1', 'bar'),
('10', 'foofoo'),
('100', 'foobar'),
('1000', 'barbar'),
('1001', 'barfoo'),
('1002', 'raboof')]
If you want to iterate through the dictionary in the sorted manner, by default, then you can use the custom SortedDict class from this answer
Also, you can print the dictionary in sorted order like this:
print("{{{}}}".format(", ".join(["{!r}: {!r}".format(key, d[key]) for key in sorted(d, key=int)])))
# {'0': 'foo', '1': 'bar', '10': 'foofoo', '100': 'foobar', '1000': 'barbar', '1001': 'barfoo', '1002': 'raboof'}
You need not use string as keys. You can use integers:
dct = {0:'foo', 1:'bar', 10: 'foofoo', 2: 'foo2'}
If you need strings, then add a key argument to sorted(). In your case it will be int, to convert each string to an integer:
sorted(dct, key=int)
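Putting the key=int idea together for printing: a short sketch, assuming Python 3.7+ (so the rebuilt dict keeps the sorted order) and that every key is a string that int() can parse. The sample data is made up for the illustration:

```python
d = {'0': 'foo', '10': 'foofoo', '2': 'baz', '1': 'bar', '100': 'foobar'}

# Sort the (key, value) pairs by the integer value of the key, then rebuild a dict.
by_number = dict(sorted(d.items(), key=lambda kv: int(kv[0])))
print(by_number)
# {'0': 'foo', '1': 'bar', '2': 'baz', '10': 'foofoo', '100': 'foobar'}

# Or print one pair per line without building a new dict.
for key in sorted(d, key=int):
    print(key, d[key])
```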

Python remove duplicate value in a combined dictionary's list

I need a little bit of homework help. I have to write a function that combines several dictionaries into a new dictionary. If a key appears more than once, the values corresponding to that key in the new dictionary should be a unique list. As an example, this is what I have so far:
f = {'a': 'apple', 'c': 'cat', 'b': 'bat', 'd': 'dog'}
g = {'c': 'car', 'b': 'bat', 'e': 'elephant'}
h = {'b': 'boy', 'd': 'deer'}
r = {'a': 'adam'}
def merge(*d):
    newdicts={}
    for dict in d:
        for k in dict.items():
            if k[0] in newdicts:
                newdicts[k[0]].append(k[1])
            else:
                newdicts[k[0]]=[k[1]]
    return newdicts
combined = merge(f, g, h, r)
print(combined)
The output looks like:
{'a': ['apple', 'adam'], 'c': ['cat', 'car'], 'b': ['bat', 'bat', 'boy'], 'e': ['elephant'], 'd': ['dog', 'deer']}
Under the 'b' key, 'bat' appears twice. How do I remove the duplicates?
I've looked under filter, lambda but I couldn't figure out how to use with (maybe b/c it's a list in a dictionary?)
Any help would be appreciated. And thank you in advance for all your help!
Just test whether the element is already in the list before adding it:
for k in dict.items():
    if k[0] in newdicts:
        if k[1] not in newdicts[k[0]]:  # Do this test before adding.
            newdicts[k[0]].append(k[1])
    else:
        newdicts[k[0]]=[k[1]]
And since you want only unique elements in the value list, you can use a set as the value instead. You can also use a defaultdict here, so that you don't have to test for key existence before adding.
Also, don't use built-in names for your variables. Instead of dict, use some other name.
So, you can modify your merge method as:
from collections import defaultdict
def merge(*d):
    newdicts = defaultdict(set)  # Define a defaultdict
    for each_dict in d:
        # dict.items() yields (k, v) tuples,
        # so you can unpack each one directly into two loop variables.
        for k, v in each_dict.items():
            newdicts[k].add(v)
    # And if you want the exact representation that you have shown,
    # you can build a normal dict out of your newly built dict.
    unique = {key: list(value) for key, value in newdicts.items()}
    return unique
>>> import collections
>>> import itertools
>>> uniques = collections.defaultdict(set)
>>> for k, v in itertools.chain(f.items(), g.items(), h.items(), r.items()):
...     uniques[k].add(v)
...
>>> uniques
defaultdict(<type 'set'>, {'a': set(['apple', 'adam']), 'c': set(['car', 'cat']), 'b': set(['boy', 'bat']), 'e': set(['elephant']), 'd': set(['deer', 'dog'])})
Note the results are in a set, not a list -- far more computationally efficient this way. If you would like the final form to be lists then you can do the following:
>>> {x: list(y) for x, y in uniques.items()}
{'a': ['apple', 'adam'], 'c': ['car', 'cat'], 'b': ['boy', 'bat'], 'e': ['elephant'], 'd': ['deer', 'dog']}
In your for loop add this:
for dict in d:
    for k in dict.items():
        if k[0] in newdicts:
            # This line below
            if k[1] not in newdicts[k[0]]:
                newdicts[k[0]].append(k[1])
        else:
            newdicts[k[0]]=[k[1]]
This makes sure duplicates aren't added.
Use set when you want unique elements:
def merge_dicts(*d):
    result = {}
    for each_dict in d:
        for key, value in each_dict.items():
            result.setdefault(key, set()).add(value)
    return result
Try to avoid using indices; unpack tuples instead.
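One thing the set-based versions give up is the order in which values were first seen. If that order matters, a possible variation (a sketch, assuming Python 3.7+ and hashable values; merge_ordered is a name invented here) uses an inner dict as an ordered set:

```python
def merge_ordered(*dicts):
    merged = {}
    for each_dict in dicts:
        for key, value in each_dict.items():
            # The inner dict acts as an ordered set: its keys are the unique
            # values, kept in first-seen order.
            merged.setdefault(key, {})[value] = None
    return {key: list(values) for key, values in merged.items()}

f = {'a': 'apple', 'c': 'cat', 'b': 'bat', 'd': 'dog'}
g = {'c': 'car', 'b': 'bat', 'e': 'elephant'}
h = {'b': 'boy', 'd': 'deer'}
r = {'a': 'adam'}

combined = merge_ordered(f, g, h, r)
print(combined)
# {'a': ['apple', 'adam'], 'c': ['cat', 'car'], 'b': ['bat', 'boy'], 'd': ['dog', 'deer'], 'e': ['elephant']}
```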
