I want to combine these:
keys = ['name', 'age', 'food']
values = ['Monty', 42, 'spam']
Into a single dictionary:
{'name': 'Monty', 'age': 42, 'food': 'spam'}
Like this:
keys = ['a', 'b', 'c']
values = [1, 2, 3]
dictionary = dict(zip(keys, values))
print(dictionary) # {'a': 1, 'b': 2, 'c': 3}
Voila :-) The pairwise dict constructor and zip function are awesomely useful.
Imagine that you have:
keys = ('name', 'age', 'food')
values = ('Monty', 42, 'spam')
What is the simplest way to produce the following dictionary?
dict = {'name' : 'Monty', 'age' : 42, 'food' : 'spam'}
Most performant, dict constructor with zip
new_dict = dict(zip(keys, values))
In Python 3, zip now returns a lazy iterator, and this is now the most performant approach.
dict(zip(keys, values)) does require the one-time global lookup each for dict and zip, but it doesn't form any unnecessary intermediate data-structures or have to deal with local lookups in function application.
Runner-up, dict comprehension:
A close runner-up to using the dict constructor is to use the native syntax of a dict comprehension (not a list comprehension, as others have mistakenly put it):
new_dict = {k: v for k, v in zip(keys, values)}
Choose this when you need to map or filter based on the keys or values.
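As a small illustration of that point, the sketch below filters and transforms pairs while building the dictionary. The sample data and the names `raw_keys`, `raw_values`, and `cleaned` are made up for this example.

```python
raw_keys = ['name', 'age', 'food', 'nickname']
raw_values = ['Monty', 42, 'spam', None]

# Keep only pairs whose value is not None, and upper-case the keys.
cleaned = {k.upper(): v for k, v in zip(raw_keys, raw_values) if v is not None}
print(cleaned)  # {'NAME': 'Monty', 'AGE': 42, 'FOOD': 'spam'}
```

With plain `dict(zip(...))` there is no place to put the condition or the key transformation, which is why the comprehension is the better fit here.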
In Python 2, zip returns a list. To avoid creating an unnecessary list, use izip instead (aliasing it to zip can reduce code changes when you move to Python 3).
from itertools import izip as zip
So that is still (2.7):
new_dict = {k: v for k, v in zip(keys, values)}
Python 2, ideal for <= 2.6
izip from itertools becomes zip in Python 3. izip is better than zip for Python 2 (because it avoids the unnecessary list creation), and ideal for 2.6 or below:
from itertools import izip
new_dict = dict(izip(keys, values))
Result for all cases:
In all cases:
>>> new_dict
{'age': 42, 'name': 'Monty', 'food': 'spam'}
Explanation:
If we look at the help on dict we see that it takes a variety of forms of arguments:
>>> help(dict)
class dict(object)
| dict() -> new empty dictionary
| dict(mapping) -> new dictionary initialized from a mapping object's
| (key, value) pairs
| dict(iterable) -> new dictionary initialized as if via:
| d = {}
| for k, v in iterable:
| d[k] = v
| dict(**kwargs) -> new dictionary initialized with the name=value pairs
| in the keyword argument list. For example: dict(one=1, two=2)
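The constructor forms listed in that help text can be compared side by side. This is only a sketch with made-up variable names (`pairs`, `from_mapping`, and so on); it shows that `zip(keys, values)` is simply one more iterable of (key, value) pairs.

```python
pairs = [('name', 'Monty'), ('age', 42)]

from_mapping = dict({'name': 'Monty', 'age': 42})  # dict(mapping)
from_iterable = dict(pairs)                        # dict(iterable) of (key, value) pairs
from_kwargs = dict(name='Monty', age=42)           # dict(**kwargs)

# zip(keys, values) yields (key, value) tuples, so it uses the iterable form.
from_zip = dict(zip(['name', 'age'], ['Monty', 42]))

print(from_mapping == from_iterable == from_kwargs == from_zip)  # True
```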
The optimal approach is to use an iterable while avoiding creating unnecessary data structures. In Python 2, zip creates an unnecessary list:
>>> zip(keys, values)
[('name', 'Monty'), ('age', 42), ('food', 'spam')]
In Python 3, the equivalent would be:
>>> list(zip(keys, values))
[('name', 'Monty'), ('age', 42), ('food', 'spam')]
and Python 3's zip merely creates an iterable object:
>>> zip(keys, values)
<zip object at 0x7f0e2ad029c8>
Since we want to avoid creating unnecessary data structures, we usually want to avoid Python 2's zip (since it creates an unnecessary list).
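One consequence of Python 3's lazy zip is worth keeping in mind: the zip object can be consumed only once. The sketch below (Python 3 only; the names `pairs`, `first`, and `second` are just for illustration) shows what happens when the same zip object is passed to dict twice.

```python
keys = ('name', 'age', 'food')
values = ('Monty', 42, 'spam')

pairs = zip(keys, values)  # lazy: nothing has been paired yet
first = dict(pairs)        # consumes the iterator
second = dict(pairs)       # the iterator is already exhausted

print(first)   # {'name': 'Monty', 'age': 42, 'food': 'spam'}
print(second)  # {}
```

Calling `zip(keys, values)` again is cheap, so it is usually simplest to create a fresh zip object each time one is needed.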
Less performant alternatives:
This is a generator expression being passed to the dict constructor:
generator_expression = ((k, v) for k, v in zip(keys, values))
dict(generator_expression)
or equivalently:
dict((k, v) for k, v in zip(keys, values))
And this is a list comprehension being passed to the dict constructor:
dict([(k, v) for k, v in zip(keys, values)])
In the first two cases, an extra layer of non-operative (thus unnecessary) computation is placed over the zip iterable, and in the case of the list comprehension, an extra list is unnecessarily created. I would expect all of them to be less performant, and certainly not more-so.
Performance review:
In 64 bit Python 3.8.2 provided by Nix, on Ubuntu 16.04, ordered from fastest to slowest:
>>> min(timeit.repeat(lambda: dict(zip(keys, values))))
0.6695233230129816
>>> min(timeit.repeat(lambda: {k: v for k, v in zip(keys, values)}))
0.6941362579818815
>>> min(timeit.repeat(lambda: {keys[i]: values[i] for i in range(len(keys))}))
0.8782548159942962
>>>
>>> min(timeit.repeat(lambda: dict([(k, v) for k, v in zip(keys, values)])))
1.077607496001292
>>> min(timeit.repeat(lambda: dict((k, v) for k, v in zip(keys, values))))
1.1840861019445583
dict(zip(keys, values)) wins even with small sets of keys and values, but for larger sets, the differences in performance will become greater.
A commenter said:
min seems like a bad way to compare performance. Surely mean and/or max would be much more useful indicators for real usage.
We use min because these algorithms are deterministic. We want to know the performance of the algorithms under the best conditions possible.
If the operating system hangs for any reason, it has nothing to do with what we're trying to compare, so we need to exclude those kinds of results from our analysis.
If we used mean, those kinds of events would skew our results greatly, and if we used max we will only get the most extreme result - the one most likely affected by such an event.
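To make the min-versus-mean argument concrete, here is a minimal sketch that takes several measurements and reports both statistics. The repeat and number values are arbitrary choices for a quick run, not the settings used for the figures above.

```python
import statistics
import timeit

keys = ('name', 'age', 'food')
values = ('Monty', 42, 'spam')

# Five independent measurements, each timing 100,000 calls.
times = timeit.repeat(lambda: dict(zip(keys, values)), repeat=5, number=100_000)

best = min(times)                 # the run least disturbed by unrelated system activity
average = statistics.mean(times)  # pulled upward by any slow outlier run
print(f"min:  {best:.4f} s")
print(f"mean: {average:.4f} s")
```

The minimum can never exceed the mean, and the gap between the two gives a rough idea of how noisy the machine was during the measurement.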
A commenter also says:
In python 3.6.8, using mean values, the dict comprehension is indeed still faster, by about 30% for these small lists. For larger lists (10k random numbers), the dict call is about 10% faster.
I presume we mean dict(zip(... with 10k random numbers. That does sound like a fairly unusual use case. It does make sense that the most direct calls would dominate in large datasets, and I wouldn't be surprised if OS hangs are dominating given how long it would take to run that test, further skewing your numbers. And if you use mean or max I would consider your results meaningless.
Let's use a more realistic size on our top examples:
import numpy
import timeit
l1 = list(numpy.random.random(100))
l2 = list(numpy.random.random(100))
And we see here that dict(zip(... does indeed run faster for larger datasets by about 20%.
>>> min(timeit.repeat(lambda: {k: v for k, v in zip(l1, l2)}))
9.698965263989521
>>> min(timeit.repeat(lambda: dict(zip(l1, l2))))
7.9965161079890095
Try this:
>>> import itertools
>>> keys = ('name', 'age', 'food')
>>> values = ('Monty', 42, 'spam')
>>> adict = dict(itertools.izip(keys,values))
>>> adict
{'food': 'spam', 'age': 42, 'name': 'Monty'}
In Python 2, it's also more economical in memory consumption compared to zip.
keys = ('name', 'age', 'food')
values = ('Monty', 42, 'spam')
out = dict(zip(keys, values))
Output:
{'food': 'spam', 'age': 42, 'name': 'Monty'}
You can also use dictionary comprehensions in Python ≥ 2.7:
>>> keys = ('name', 'age', 'food')
>>> values = ('Monty', 42, 'spam')
>>> {k: v for k, v in zip(keys, values)}
{'food': 'spam', 'age': 42, 'name': 'Monty'}
A more natural way is to use dictionary comprehension
keys = ('name', 'age', 'food')
values = ('Monty', 42, 'spam')
dict = {keys[i]: values[i] for i in range(len(keys))}
If you need to transform keys or values before creating a dictionary then a generator expression could be used. Example:
>>> adict = dict((str(k), v) for k, v in zip(['a', 1, 'b'], [2, 'c', 3]))
Take a look at Code Like a Pythonista: Idiomatic Python.
With Python 3.x, go for dict comprehensions:
keys = ('name', 'age', 'food')
values = ('Monty', 42, 'spam')
dic = {k:v for k,v in zip(keys, values)}
print(dic)
More on dict comprehensions here, an example is there:
>>> print {i : chr(65+i) for i in range(4)}
{0 : 'A', 1 : 'B', 2 : 'C', 3 : 'D'}
For those who need simple code and aren’t familiar with zip:
List1 = ['This', 'is', 'a', 'list']
List2 = ['Put', 'this', 'into', 'dictionary']
This can be done by one line of code:
d = {List1[n]: List2[n] for n in range(len(List1))}
You can use the code below:
dict(zip(['name', 'age', 'food'], ['Monty', 42, 'spam']))
But make sure that the lists have the same length. If they do not, the zip function truncates the longer one.
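The truncation is silent, which makes it easy to miss. The sketch below shows the behavior and one possible guard; `strict_dict` is a hypothetical helper written for this example, not a standard function.

```python
keys = ['name', 'age', 'food']
values = ['Monty', 42]  # one value short

# zip stops at the shorter input, so 'food' is silently dropped.
truncated = dict(zip(keys, values))
print(truncated)  # {'name': 'Monty', 'age': 42}

def strict_dict(keys, values):
    """Build a dict from two sequences, refusing mismatched lengths."""
    if len(keys) != len(values):
        raise ValueError(f"length mismatch: {len(keys)} keys, {len(values)} values")
    return dict(zip(keys, values))

try:
    strict_dict(keys, values)
except ValueError as exc:
    print(exc)  # length mismatch: 3 keys, 2 values
```

On Python 3.10 and later, `zip(keys, values, strict=True)` raises a ValueError for mismatched lengths, so the explicit check is only needed on older versions.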
2018-04-18
The best solution is still:
In [92]: keys = ('name', 'age', 'food')
...: values = ('Monty', 42, 'spam')
...:
In [93]: dt = dict(zip(keys, values))
In [94]: dt
Out[94]: {'age': 42, 'food': 'spam', 'name': 'Monty'}
Transpose it:
lst = [('name', 'Monty'), ('age', 42), ('food', 'spam')]
keys, values = zip(*lst)
In [101]: keys
Out[101]: ('name', 'age', 'food')
In [102]: values
Out[102]: ('Monty', 42, 'spam')
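The two directions are inverses of each other, which can be checked with a short round trip. This is only an illustrative sketch; `rebuilt` is a name chosen for the example.

```python
lst = [('name', 'Monty'), ('age', 42), ('food', 'spam')]

# zip(*lst) unpacks the pairs and regroups them column by column.
keys, values = zip(*lst)

# Zipping the columns back together recovers the original pairs.
rebuilt = dict(zip(keys, values))
print(rebuilt == dict(lst))  # True
```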
Here is also an example of adding list values to your dictionary:
list1 = ["Name", "Surname", "Age"]
list2 = [["Cyd", "JEDD", "JESS"], ["DEY", "AUDIJE", "PONGARON"], [21, 32, 47]]
dic = dict(zip(list1, list2))
print(dic)
Always make sure that your keys (list1) are passed as the first argument. The output is:
{'Name': ['Cyd', 'JEDD', 'JESS'], 'Surname': ['DEY', 'AUDIJE', 'PONGARON'], 'Age': [21, 32, 47]}
I had this doubt while I was trying to solve a graph-related problem. I needed to define an empty adjacency list and wanted to initialize every node with an empty list. That made me wonder whether a zip operation would be fast enough, that is, whether it would be worth using instead of simple key-value assignment. After all, execution time is often the deciding factor. So I ran timeit on both approaches.
import timeit
n_nodes = 10_000_000  # size used for the results below
def dictionary_creation(n_nodes):
    # Plain loop: assign an empty list to each node
    dummy_dict = dict()
    for node in range(n_nodes):
        dummy_dict[node] = []
    return dummy_dict
def dictionary_creation_1(n_nodes):
    # Shorthand: build keys and values first, then zip them
    keys = list(range(n_nodes))
    values = [[] for i in range(n_nodes)]
    graph = dict(zip(keys, values))
    return graph
def wrapper(func, *args, **kwargs):
    # Bind the arguments so timeit can call the function with none
    def wrapped():
        return func(*args, **kwargs)
    return wrapped
iteration = wrapper(dictionary_creation, n_nodes)
shorthand = wrapper(dictionary_creation_1, n_nodes)
for trials in range(1, 8):
    print(f'Iteration: {timeit.timeit(iteration, number=trials)}\nShorthand: {timeit.timeit(shorthand, number=trials)}')
For n_nodes = 10,000,000
I get,
Iteration: 2.825081646999024
Shorthand: 3.535717916001886
Iteration: 5.051560923002398
Shorthand: 6.255070794999483
Iteration: 6.52859034499852
Shorthand: 8.221581164998497
Iteration: 8.683652416999394
Shorthand: 12.599181543999293
Iteration: 11.587241565001023
Shorthand: 15.27298851100204
Iteration: 14.816342867001367
Shorthand: 17.162912737003353
Iteration: 16.645022411001264
Shorthand: 19.976680120998935
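You can clearly see that the iteration approach is faster at every step. After a certain point, the iteration approach at the n-th step even takes less time than the shorthand approach at the (n-1)-th step (for example, 11.59 s for five runs versus 12.60 s for four).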
It can be done in the following way.
keys = ['name', 'age', 'food']
values = ['Monty', 42, 'spam']
result = {}  # avoid naming this dict, which would shadow the built-in
for i in range(len(keys)):
    result[keys[i]] = values[i]
print(result)
{'name': 'Monty', 'age': 42, 'food': 'spam'}
All answers sum up:
l = [1, 5, 8, 9]
ll = [3, 7, 10, 11]
zip:
dict(zip(l,ll)) # {1: 3, 5: 7, 8: 10, 9: 11}
#if you want to play with key or value #recommended
{k:v*10 for k, v in zip(l, ll)} #{1: 30, 5: 70, 8: 100, 9: 110}
counter:
d = {}
c = 0
for k in l:
    d[k] = ll[c]  # key from the first list, value from the second list
    c += 1
print(d)
{1: 3, 5: 7, 8: 10, 9: 11}
enumerate:
d = {}
for i, k in enumerate(l):
    d[k] = ll[i]
print(d)
{1: 3, 5: 7, 8: 10, 9: 11}
Solution as dictionary comprehension with enumerate:
result = {item: values[index] for index, item in enumerate(keys)}
Solution as for loop with enumerate:
result = {}
for index, item in enumerate(keys):
    result[item] = values[index]
If you are working with more than 1 set of values and wish to have a list of dicts you can use this:
def as_dict_list(data: list, columns: list):
    return [dict(zip(columns, row)) for row in data]
A real-life example would be a list of tuples from a db query paired with a tuple of columns from the same query. Other answers only cover the one-to-one case.
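A quick usage sketch of that helper follows. The column names and rows are invented stand-ins for a query result; no real database is involved.

```python
def as_dict_list(data: list, columns: list):
    return [dict(zip(columns, row)) for row in data]

# Hypothetical query result: column names plus one tuple per row.
columns = ['id', 'name', 'food']
rows = [(1, 'Monty', 'spam'), (2, 'Brian', 'eggs')]

records = as_dict_list(rows, columns)
print(records)
# [{'id': 1, 'name': 'Monty', 'food': 'spam'}, {'id': 2, 'name': 'Brian', 'food': 'eggs'}]
```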
keys = ['name', 'age', 'food']
values = ['Monty', 42, 'spam']
dic = {}
c = 0
for i in keys:
    dic[i] = values[c]
    c += 1
print(dic)
{'name': 'Monty', 'age': 42, 'food': 'spam'}
import pprint
p = ['A', 'B', 'C']
q = [5, 2, 7]
r = ["M", "F", "M"]
s = ['Sovabazaar','Shyambazaar','Bagbazaar','Hatkhola']
def makeDictUsingAlternateLists1(**rest):
    print("*rest.keys() : ",*rest.keys())
    print("rest.keys() : ",rest.keys())
    print("*rest.values() : ",*rest.values())
    print("**rest.keys() : ",rest.keys())
    print("**rest.values() : ",rest.values())
    [print(a) for a in zip(*rest.values())]
    [ print(dict(zip(rest.keys(),a))) for a in zip(*rest.values())]
    print("...")
    finalRes= [ dict( zip( rest.keys(),a)) for a in zip(*rest.values())]
    return finalRes
l = makeDictUsingAlternateLists1(p=p,q=q,r=r,s=s)
pprint.pprint(l)
"""
*rest.keys() : p q r s
rest.keys() : dict_keys(['p', 'q', 'r', 's'])
*rest.values() : ['A', 'B', 'C'] [5, 2, 7] ['M', 'F', 'M'] ['Sovabazaar', 'Shyambazaar', 'Bagbazaar', 'Hatkhola']
**rest.keys() : dict_keys(['p', 'q', 'r', 's'])
**rest.values() : dict_values([['A', 'B', 'C'], [5, 2, 7], ['M', 'F', 'M'], ['Sovabazaar', 'Shyambazaar', 'Bagbazaar', 'Hatkhola']])
('A', 5, 'M', 'Sovabazaar')
('B', 2, 'F', 'Shyambazaar')
('C', 7, 'M', 'Bagbazaar')
{'p': 'A', 'q': 5, 'r': 'M', 's': 'Sovabazaar'}
{'p': 'B', 'q': 2, 'r': 'F', 's': 'Shyambazaar'}
{'p': 'C', 'q': 7, 'r': 'M', 's': 'Bagbazaar'}
...
[{'p': 'A', 'q': 5, 'r': 'M', 's': 'Sovabazaar'},
{'p': 'B', 'q': 2, 'r': 'F', 's': 'Shyambazaar'},
{'p': 'C', 'q': 7, 'r': 'M', 's': 'Bagbazaar'}]
"""
Method without the zip function:
l1 = [1,2,3,4,5]
l2 = ['a','b','c','d','e']
d1 = {}
for l1_ in l1:
    for l2_ in l2:
        d1[l1_] = l2_
        l2.remove(l2_)  # consume the value; note that this empties l2
        break
print (d1)
{1: 'a', 2: 'b', 3: 'c', 4: 'd', 5: 'e'}
Although there are multiple ways of doing this, I think the most fundamental approach is to create a loop and a dictionary and store the values in that dictionary. The recursive approach uses the same idea, but instead of a loop, the function calls itself until it reaches the end. Of course there are other approaches, such as dict(zip(key, value)). These aren't the most effective solutions.
y = [1,2,3,4]
x = ["a","b","c","d"]
# This below is a brute force method
obj = {}
for i in range(len(y)):
    obj[y[i]] = x[i]
print(obj)
# Recursive approach
obj = {}
def map_two_lists(a,b,j=0):
    if j < len(a):
        obj[b[j]] = a[j]
        j +=1
        map_two_lists(a, b, j)
    return obj
res = map_two_lists(x,y)
print(res)
Both the results should print
{1: 'a', 2: 'b', 3: 'c', 4: 'd'}
If you have a dictionary in Python, how would you find which element occurs in it the most times? For example, in the following dictionary the name Bob occurs the most (3 times: once as a key and twice as a value). How would you find the name that occurs the most?
Also, I would prefer not to import anything (as I am a beginner)
dict = {'Mark': ['Paul', 'Bob', 'Carol', 'Leanne', 'Will'], 'Paul': ['Will', 'Zach'], 'Bob': ['Sarah', 'Don'], 'Tim': ['Bob', 'Carol']}
You can count the keys using a Counter, and update it with the count of the values.
You can then use the most_common method of the Counter to get the most common name:
from collections import Counter
from itertools import chain
d = {'Mark': ['Paul', 'Bob', 'Carol', 'Leanne', 'Will'], 'Paul': ['Will', 'Zach'], 'Bob': ['Sarah', 'Don'], 'Tim': ['Bob', 'Carol']}
count = Counter(d.keys())
count.update(chain.from_iterable(d.values()))
print(count.most_common(1))
# [('Bob', 3)]
print(count.most_common(1)[0][0])
# Bob
I guess what you mean is how to find which element is the most common among all the lists that appear in your dictionary as values. If that's the case, the following should do the trick:
from collections import Counter
from itertools import chain
dict = {
'Mark': ['Paul', 'Bob', 'Carol', 'Leanne', 'Will'],
'Paul': ['Will', 'Zach'],
'Bob': ['Sarah', 'Don'],
'Tim': ['Bob', 'Carol']
}
counter = Counter(chain.from_iterable(list(dict.values())))
counter.most_common()
[('Bob', 2), ('Carol', 2), ('Will', 2), ('Paul', 1), ('Leanne', 1), ('Zach', 1), ('Sarah', 1), ('Don', 1)]
If you also need to take keys into account, then:
counter = Counter(chain.from_iterable(list(dict.values()) + [dict.keys()]))
counter.most_common()
[('Bob', 3), ('Paul', 2), ('Carol', 2), ('Will', 2), ('Leanne', 1), ('Zach', 1), ('Sarah', 1), ('Don', 1), ('Mark', 1), ('Tim', 1)]
If you don't want to import anything:
l = list(dict.keys()) + sum(list(dict.values()), []) # flatten list of lists
max(l, key=l.count)
>>> 'Bob'
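Note that `max(l, key=l.count)` rescans the whole list once per element, which is fine for a handful of names but grows quadratically. A single-pass alternative that still needs no imports might look like the sketch below; `d`, `counts`, and `most_common_name` are names chosen for this example.

```python
d = {'Mark': ['Paul', 'Bob', 'Carol', 'Leanne', 'Will'], 'Paul': ['Will', 'Zach'],
     'Bob': ['Sarah', 'Don'], 'Tim': ['Bob', 'Carol']}

counts = {}
for key, names in d.items():
    # Count the key itself, then every name in its list.
    counts[key] = counts.get(key, 0) + 1
    for name in names:
        counts[name] = counts.get(name, 0) + 1

# max over the dict iterates its keys; counts.get ranks them by count.
most_common_name = max(counts, key=counts.get)
print(most_common_name, counts[most_common_name])  # Bob 3
```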
Here is a way to do this without imports. It defines a check function, iterates through dic once to generate a dic_count, then uses another for-loop to get the max_count and the most_common_name.
Sidenote: Never name variables or functions after built-in Python functions or objects. This is why I renamed dict to dic.
dic = {'Mark': ['Paul', 'Bob', 'Carol', 'Leanne', 'Will'], 'Paul': ['Will', 'Zach'], 'Bob': ['Sarah', 'Don'], 'Tim': ['Bob', 'Carol']}
dic_count = {}
# Adds string to dic_count if it's not in,
# otherwise increments its count
def check(string):
    if string in dic_count:
        dic_count[string] += 1
    else:
        dic_count[string] = 1
for key, value in dic.items():
    # Calls the check function for both keys and values
    check(key)
    for name in value:
        check(name)
max_num = 0
most_common_name = ""
for key, value in dic_count.items():
    # If the count is greater than max_num,
    # updates both max_num and most_common_name
    if value > max_num:
        max_num = value
        most_common_name = key
print(most_common_name)
# Prints Bob
If you would like to get multiple names, change the last part to
max_num = 0
most_common_names = ""
for key, value in dic_count.items():
    # If the count is greater than max_num,
    # updates both max_num and most_common_names;
    # if it ties max_num, appends the name instead
    if value > max_num:
        max_num = value
        most_common_names = key
    elif value == max_num:
        most_common_names += " " + key
print(most_common_names)
# Prints Bob Will after adding an extra
# 'Will' to the dictionary
Alternatively, if you would like to avoid defining a function, simply replace the top part with:
for key, value in dic.items():
    # Adds string to dic_count if it's not in,
    # otherwise increments its count
    if key in dic_count:
        dic_count[key] += 1
    else:
        dic_count[key] = 1
    for name in value:
        if name in dic_count:
            dic_count[name] += 1
        else:
            dic_count[name] = 1
You can create a list with all items (keys + values) of the dict and use collections.Counter. Here d is your dictionary (dict, the name you used, is not a good variable name in Python because it is already used for a built-in type).
from collections import Counter
l=[i for i in d.keys()]+[i for k in d.values() for i in k]
res=Counter(l)
>>> print(res)
Counter({'Bob': 3, 'Paul': 2, 'Carol': 2, 'Will': 2, 'Mark': 1, 'Tim': 1, 'Leanne': 1, 'Zach': 1, 'Sarah': 1, 'Don': 1})
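Counter also has a most_common method, so the most frequent name can be read off directly. The sketch below assumes the same dictionary as in the question and uses itertools.chain in place of the two list comprehensions; the name tied is only illustrative.

```python
from collections import Counter
from itertools import chain

d = {'Mark': ['Paul', 'Bob', 'Carol', 'Leanne', 'Will'],
     'Paul': ['Will', 'Zach'],
     'Bob': ['Sarah', 'Don'],
     'Tim': ['Bob', 'Carol']}

# Iterating a dict yields its keys; chain then appends every value list.
res = Counter(chain(d, *d.values()))

# most_common(1) returns a one-element list of (name, count) pairs.
name, count = res.most_common(1)[0]

# most_common(1) reports a single entry, so collect ties explicitly.
tied = sorted(n for n, c in res.items() if c == count)
print(name, count, tied)
```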
By any chance, are you looking for something like this?
dic = {'Mark': ['Paul', 'Bob', 'Carol', 'Leanne', 'Will'], 'Paul': ['Will', 'Zach'], 'Bob': ['Sarah', 'Don'], 'Tim': ['Bob', 'Carol']}
#getting all the keys
keyslist=dic.keys()
#getting all the values of dic as a list
valuelist=list(dic.values())
#valuelist.append(keyslist)
test_list=[]
test_list.extend(list(keyslist))
for x in valuelist:
    test_list.extend(x)
#list with all elements from dict
print(test_list)
# get most frequent element
# (max_freq rather than max, so the built-in max is not shadowed)
max_freq = 0
res = test_list[0]
for i in test_list:
    freq = test_list.count(i)
    if freq > max_freq:
        max_freq = freq
        res = i
# printing result
print("Most frequent element is : " + str(res) + " Frequency :" + str(max_freq))
Output:
Most frequent element is : Bob Frequency :3
I know this is not the best way. If anybody has suggestions, please leave them in the comments and I will edit my answer accordingly.
Please check the comments in the code.
Use chain to combine keys and values.
Use defaultdict, a dict subclass that
creates a missing key with a default value on first access.
Code:
from itertools import chain
from collections import defaultdict
# do not use dict - no shadowing built-in dict
my_dict = {'Mark': ['Paul', 'Bob', 'Carol', 'Leanne', 'Will'], 'Paul': ['Will', 'Zach'], 'Bob': ['Sarah', 'Don'], 'Tim': ['Bob', 'Carol']}
#searching for one specific name occurrence
name_to_search = 'Bob'
name_ctr = sum([1 for ele in chain(my_dict.keys(), *(my_dict.values())) if ele == name_to_search])
print(f'{name_to_search} occurs {name_ctr} times')
#searching for max occurring name in a dictionary
my_dict_name_ctr = defaultdict(int)
for name in chain(my_dict.keys(), *(my_dict.values())):
    my_dict_name_ctr[name] += 1
max_occuring_val = max(my_dict_name_ctr.values())
most_occuring_names = [name for name,val in my_dict_name_ctr.items() if val == max_occuring_val]
print(most_occuring_names, 'occurs', max_occuring_val, 'times')
Output:
Bob occurs 3 times
['Bob'] occurs 3 times
I have a list of names alphabetically, like:
list = ['ABC', 'ACE', 'BED', 'BRT', 'CCD', ..]
How can I get an element for each starting letter? Do I have to iterate over the list once, or does Python have a function for this? I'm new to Python, so this may be a really naive question.
Suppose I want the second element among the names that start with 'A'; in this case I should get 'ACE'.
If you're going to do multiple searches, you should take the one-time hit of iterating through everything and build a dictionary (or, to make it simpler, collections.defaultdict):
from collections import defaultdict
d = defaultdict(list)
words = ['ABC', 'ACE', 'BED', 'BRT', 'CCD', ...]
for word in words:
    d[word[0]].append(word)
(Note that you shouldn't name your own variable list, as it shadows the built-in.)
Now you can easily query for the second word starting with "A":
d["A"][1] == "ACE"
or the first two words for each letter:
first_two = {c: w[:2] for c, w in d.items()}
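A lookup such as d["A"][1] raises IndexError when fewer than two words start with that letter, and indexing a defaultdict with a missing letter silently inserts an empty list. A small helper can guard against both. This is only a sketch; nth_for_letter and index are hypothetical names, not part of any library.

```python
from collections import defaultdict

words = ['ABC', 'ACE', 'BED', 'BRT', 'CCD']

# One pass over the list: group the words by their first letter.
index = defaultdict(list)
for word in words:
    index[word[0]].append(word)

def nth_for_letter(index, letter, n, default=None):
    """Return the n-th (0-based) word for letter, or default if absent."""
    group = index.get(letter, [])  # .get does not create a missing key
    return group[n] if 0 <= n < len(group) else default

print(nth_for_letter(index, 'A', 1))
print(nth_for_letter(index, 'C', 1, 'no-such-name'))
```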
Using generator expression and itertools.islice:
>>> import itertools
>>> names = ['ABC', 'ACE', 'BED', 'BRT', 'CCD']
>>> next(itertools.islice((name for name in names if name.startswith('A')), 1, 2), 'no-such-name')
'ACE'
>>> names = ['ABC', 'BBD', 'BED', 'BRT', 'CCD']
>>> next(itertools.islice((name for name in names if name.startswith('A')), 1, 2), 'no-such-name')
'no-such-name'
Simply group all the elements by their first char
from itertools import groupby
from operator import itemgetter
example = ['ABC', 'ACE', 'BED', 'BRT', 'CCD']
d = {g:list(values) for g, values in groupby(example, itemgetter(0))}
Now to get the values starting with 'A':
print(d.get('A', []))
This is most useful when you have a static list and will run multiple queries, since, as you can see, getting the 3rd item starting with 'A' is then done in O(1).
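One caveat: itertools.groupby only merges consecutive items that share a key, so the approach relies on the list already being in alphabetical order. A minimal sketch of what happens otherwise, using a made-up unsorted list:

```python
from itertools import groupby
from operator import itemgetter

unsorted = ['ABC', 'BED', 'ACE', 'BRT']

# Without sorting, 'A' and 'B' each show up as two separate groups,
# and the dict comprehension keeps only the last group per letter.
broken = {g: list(values) for g, values in groupby(unsorted, itemgetter(0))}

# Sorting first puts equal first letters next to each other.
fixed = {g: list(values)
         for g, values in groupby(sorted(unsorted), itemgetter(0))}

print(broken)
print(fixed)
```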
You might want to use list comprehensions
mylist = ['ABC', 'ACE', 'BED', 'BRT', 'CCD']
elements_starting_with_A = [i for i in mylist if i[0] == 'A']
>>> ['ABC', 'ACE']
second = elements_starting_with_A[1]
>>> 'ACE'
In addition to list comprehension as others have mentioned, lists also have a sort() method.
mylist = ['AA', 'BB', 'AB', 'CA', 'AC']
newlist = [i for i in mylist if i[0] == 'A']
newlist.sort()
newlist
>>> ['AA', 'AB', 'AC']
The simple solution is to iterate over the whole list in O(n) :
(name for name in names if name.startswith('A'))
However, you could sort the names once and then binary-search in O(log(n)) for the index where a given first letter begins and the index where the next letter begins (using lexicographic comparison). The bisect module will help you find these bounds:
from bisect import bisect_left
names = ['ABC', 'ACE', 'BED', 'BRT', 'CCD']
names.sort()
lower = bisect_left(names, 'B')
upper = bisect_left(names, chr(1+ord('B')))
print([names[i] for i in range(lower, upper)])
# ['BED', 'BRT']
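The same idea extends from a single letter to any non-empty prefix. The helper below is a sketch that assumes the list is already sorted and that the prefix does not end in the highest possible code point; names_with_prefix is a hypothetical name.

```python
from bisect import bisect_left

def names_with_prefix(sorted_names, prefix):
    """Return the items of sorted_names that start with prefix."""
    # Smallest string that sorts after every string starting with prefix:
    # bump the last character of the prefix by one code point.
    upper_key = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    lower = bisect_left(sorted_names, prefix)
    upper = bisect_left(sorted_names, upper_key)
    return sorted_names[lower:upper]

names = ['ABC', 'ACE', 'BED', 'BRT', 'CCD']
print(names_with_prefix(names, 'B'))
print(names_with_prefix(names, 'AC'))
```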
I am writing a python program where I will be appending numbers into a list, but I don't want the numbers in the list to repeat. So how do I check if a number is already in the list before I do list.append()?
You could do
if item not in mylist:
    mylist.append(item)
But you should really use a set, like this :
myset = set()
myset.add(item)
EDIT: If order is important but your list is very big, you should probably use both a list and a set, like so:
mylist = []
myset = set()
for item in ...:
    if item not in myset:
        mylist.append(item)
        myset.add(item)
This way, you get fast lookup for element existence, but you keep your ordering. If you use the naive solution, you will get O(n) performance for the lookup, and that can be bad if your list is big
Or, as @larsman pointed out, you can use OrderedDict to the same effect:
from collections import OrderedDict
mydict = OrderedDict()
for item in ...:
    mydict[item] = True
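On Python 3.7 and later, plain dicts also preserve insertion order, so dict.fromkeys gives the same ordered de-duplication without OrderedDict. Below is a short sketch with made-up numbers, next to the list-plus-set loop wrapped in a hypothetical unique_in_order function. Both assume the items are hashable.

```python
def unique_in_order(items):
    """Return items without duplicates, keeping first-seen order."""
    seen = set()      # fast membership test
    result = []       # preserves the order of first appearance
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result

numbers = [3, 1, 3, 2, 1]

via_loop = unique_in_order(numbers)
# dict keys are unique and keep insertion order (Python 3.7+).
via_dict = list(dict.fromkeys(numbers))

print(via_loop, via_dict)
```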
If you want unique elements and order does not matter to you, why not use a set?
>>> s = set()
>>> s.add(2)
>>> s.add(4)
>>> s.add(5)
>>> s.add(2)
>>> s
set([2, 4, 5])
If order matters, then you can use:
>>> def addUnique(l, num):
...     if num not in l:
...         l.append(num)
...     return l
You can also find an OrderedSet recipe, which is referred to in Python Documentation
If you want your numbers in ascending order you can add them into a set and then sort the set into an ascending list.
s = set()
if number1 not in s:
    s.add(number1)
if number2 not in s:
    s.add(number2)
...
s = sorted(s) #Now a list in ascending order
You could probably use a set object instead. Just add numbers to the set; a set never stores duplicates.
To check if a number is in a list one can use the in keyword.
Let's create a list
exampleList = [1, 2, 3, 4, 5]
Now let's see if it contains the number 4:
contains = 4 in exampleList
print(contains)
>>> True
Since you want to append only when an element is not in the list, the not in operator can also help:
exampleList2 = ["a", "b", "c", "d", "e"]
notcontain = "e" not in exampleList2
print(notcontain)
>>> False
But, as others have mentioned, you may want to consider using a different data structure, more specifically, set. See examples below (Source):
basket = {'apple', 'orange', 'apple', 'pear', 'orange', 'banana'}
>>> print(basket) # show that duplicates have been removed
{'orange', 'banana', 'pear', 'apple'}
'orange' in basket # fast membership testing
True
'crabgrass' in basket
False
# Demonstrate set operations on unique letters from two words
...
a = set('abracadabra')
b = set('alacazam')
a # unique letters in a
>>> {'a', 'r', 'b', 'c', 'd'}
a - b # letters in a but not in b
>>> {'r', 'd', 'b'}
a | b # letters in a or b or both
>>> {'a', 'c', 'r', 'd', 'b', 'm', 'z', 'l'}
a & b # letters in both a and b
>>> {'a', 'c'}
a ^ b # letters in a or b but not both
>>> {'r', 'd', 'b', 'm', 'z', 'l'}