I know to write something simple and slow with loop, but I need it to run super fast in big scale.
input:
lst = [[1, 1, 2], ["txt1", "txt2", "txt3"]]
desired out put:
d = {1 : ["txt1", "txt2"], 2 : "txt3"]
There is something built-in at python which make dict() extend key instead replacing it?
dict(list(zip(lst[0], lst[1])))
One option is to use dict.setdefault:
out = {}
for k, v in zip(*lst):
out.setdefault(k, []).append(v)
Output:
{1: ['txt1', 'txt2'], 2: ['txt3']}
If you want the element itself for singleton lists, one way is adding a condition that checks for it while you build an output dictionary:
out = {}
for k,v in zip(*lst):
if k in out:
if isinstance(out[k], list):
out[k].append(v)
else:
out[k] = [out[k], v]
else:
out[k] = v
or if lst[0] is sorted (like it is in your sample), you could use itertools.groupby:
from itertools import groupby
out = {}
pos = 0
for k, v in groupby(lst[0]):
length = len([*v])
if length > 1:
out[k] = lst[1][pos:pos+length]
else:
out[k] = lst[1][pos]
pos += length
Output:
{1: ['txt1', 'txt2'], 2: 'txt3'}
But as #timgeb notes, it's probably not something you want because afterwards, you'll have to check for data type each time you access this dictionary (if value is a list or not), which is an unnecessary problem that you could avoid by having all values as lists.
If you're dealing with large datasets it may be useful to add a pandas solution.
>>> import pandas as pd
>>> lst = [[1, 1, 2], ["txt1", "txt2", "txt3"]]
>>> s = pd.Series(lst[1], index=lst[0])
>>> s
1 txt1
1 txt2
2 txt3
>>> s.groupby(level=0).apply(list).to_dict()
{1: ['txt1', 'txt2'], 2: ['txt3']}
Note that this also produces lists for single elements (e.g. ['txt3']) which I highly recommend. Having both lists and strings as possible values will result in bugs because both of those types are iterable. You'd need to remember to check the type each time you process a dict-value.
You can use a defaultdict to group the strings by their corresponding key, then make a second pass through the list to extract the strings from singleton lists. Regardless of what you do, you'll need to access every element in both lists at least once, so some iteration structure is necessary (and even if you don't explicitly use iteration, whatever you use will almost definitely use iteration under the hood):
from collections import defaultdict
lst = [[1, 1, 2], ["txt1", "txt2", "txt3"]]
result = defaultdict(list)
for key, value in zip(lst[0], lst[1]):
result[key].append(value)
for key in result:
if len(result[key]) == 1:
result[key] = result[key][0]
print(dict(result)) # Prints {1: ['txt1', 'txt2'], 2: 'txt3'}
I would like to merge or update a dictionary in Python with new entries, but replace the values of entries whose key exists with the smaller of the values associated with the key in the existing entry and the new entry. For example:
Input:
dict_A = {1:14, 2:15, 3:16, 4:17}, dict_B= {2:19, 3:9, 4:11, 5:13}
Expected output:
{1:14, 2:15, 3:9, 4:11, 5:13}
I know it can be achieved with a loop iterating through the dictionaries while performing comparisons, but is there any simpler and faster ways or any helpful libraries to achieve this?
in this case you could easily use pandas to avoid writing the loop, though I dunno if there would be any speedup - didn't test that
import pandas as pd
df = pd.DataFrame([dict_A, dict_B])
out = df.min().to_dict()
output: {1: 14.0, 2: 15.0, 3: 9.0, 4: 11.0, 5: 13.0}
there's probably some edge cases you'd have to account for
Quick One-liner
c = {**a, **b, **{key:min(a[key], b[key]) for key in set(a).intersection(set(b))} }
Explanation
This should be quick enough because it uses the set.
You can merge dictionaries by using the **dictionary syntax like so: {**a, **b}. The ** simply just "expands" out the dictionary into each individual item, with the last expanded dictionary overwriting any previous ones (so in {**a, **b}, any matching keys in b overwrite the value from a).
The first thing I do is load in all the values in a and b into the new dictionary:
c = {**a, **b, ...
Then I use dictionary comprehension to generate a new dictionary, which only has the smallest value for every set of keys which are in both a and b.
... {key:min(a[key], b[key]) for key in set(a).intersection(set(b))} ...
To get the set of keys which only exist in both a and b, I convert both dictionaries to sets (which converts them to sets of their keys) and use intersection to quickly find all keys which are in both sets.
... set(a).intersection(set(b)) ...
Then I loop through each of the keys in the matching-keys set, and use the dictionary comprehension to generate a new dictionary with the current key and the min of both dictionaries' values for that key.
... {key:min(a[key], b[key]) ...
Then I use the ** syntax to "expand" this new generated dictionary with the expanded a and b, putting it last to make sure it overwrites any values from the two.
Works on the example given (ctrl-cv'd straight from my terminal):
>>> a = {1:14, 2:15, 3:16, 4:17}
>>> b = {2:19, 3:9, 4:11, 5:13}
>>> c = {**a, **b, **{key:min(a[key], b[key]) for key in set(a).intersection(set(b))} }
>>> c
{1: 14, 2: 15, 3: 9, 4: 11, 5: 13}
Here's something short to do it. Not inherently the fastest or best way to do it, but figured I'd share nonetheless.
max_val = max(max(dict_A.values()), max(dict_B.values())) + 1
keys = set(list(dict_A.keys()) + list(dict_B.keys()))
dict_C = { key : min(dict_A.get(key, max_val), dict_B.get(key, max_val)) for key in keys }
hello this is the code I have made Hope it's what you needed :
dict_A = {1:14, 2:15, 3:16, 4:17}
dict_B= {2:19, 3:9, 4:11, 5:13}
dict_res =dict_B
dict_A_keys = dict_A.keys()
dict_B_keys = dict_B.keys()
for e in dict_A_keys :
if e in dict_B_keys :
if dict_A[e]>dict_B[e]:
dict_res[e]=dict_B[e]
else:
dict_res[e]=dict_A[e]
else:
dict_res[e]=dict_A[e]
I have tested it and the output is :
{1: 14, 2: 15, 3: 9, 4: 11, 5: 13}
I have my program's output as a python dictionary and i want a list of keys from the dictn:
s = "cool_ice_wifi"
r = ["water_is_cool", "cold_ice_drink", "cool_wifi_speed"]
good_list=s.split("_")
dictn={}
for i in range(len(r)):
split_review=r[i].split("_")
counter=0
for good_word in good_list:
if good_word in split_review:
counter=counter+1
d1={i:counter}
dictn.update(d1)
print(dictn)
The conditions on which we should get the keys:
The keys with the same values will have the index copied as it is in a dummy list.
The keys with highest values will come first and then the lowest in the dummy list
Dictn={0: 1, 1: 1, 2: 2}
Expected output = [2,0,1]
You can use a list comp:
[key for key in sorted(dictn, key=dictn.get, reverse=True)]
In Python3 it is now possible to use the sorted method, as described here, to sort the dictionary in any way you choose.
Check out the documentation, but in the simplest case you can .get the dictionary's values, while for more complex operations, you'd define a key function yourself.
Dictionaries in Python3 are now insertion-ordered, so one other way to do things is to sort at the moment of dictionary creation, or you could use an OrderedDict.
Here's an example of the first option in action, which I think is the easiest
>>> a = {}
>>> a[0] = 1
>>> a[1] = 1
>>> a[2] = 2
>>> print(a)
{0: 1, 1: 1, 2: 2}
>>>
>>> [(k) for k in sorted(a, key=a.get, reverse=True)]
[2, 0, 1]
How do I add a key to an existing dictionary? It doesn't have an .add() method.
You create a new key/value pair on a dictionary by assigning a value to that key
d = {'key': 'value'}
print(d) # {'key': 'value'}
d['mynewkey'] = 'mynewvalue'
print(d) # {'key': 'value', 'mynewkey': 'mynewvalue'}
If the key doesn't exist, it's added and points to that value. If it exists, the current value it points to is overwritten.
I feel like consolidating info about Python dictionaries:
Creating an empty dictionary
data = {}
# OR
data = dict()
Creating a dictionary with initial values
data = {'a': 1, 'b': 2, 'c': 3}
# OR
data = dict(a=1, b=2, c=3)
# OR
data = {k: v for k, v in (('a', 1), ('b',2), ('c',3))}
Inserting/Updating a single value
data['a'] = 1 # Updates if 'a' exists, else adds 'a'
# OR
data.update({'a': 1})
# OR
data.update(dict(a=1))
# OR
data.update(a=1)
Inserting/Updating multiple values
data.update({'c':3,'d':4}) # Updates 'c' and adds 'd'
Python 3.9+:
The update operator |= now works for dictionaries:
data |= {'c':3,'d':4}
Creating a merged dictionary without modifying originals
data3 = {}
data3.update(data) # Modifies data3, not data
data3.update(data2) # Modifies data3, not data2
Python 3.5+:
This uses a new feature called dictionary unpacking.
data = {**data1, **data2, **data3}
Python 3.9+:
The merge operator | now works for dictionaries:
data = data1 | {'c':3,'d':4}
Deleting items in dictionary
del data[key] # Removes specific element in a dictionary
data.pop(key) # Removes the key & returns the value
data.clear() # Clears entire dictionary
Check if a key is already in dictionary
key in data
Iterate through pairs in a dictionary
for key in data: # Iterates just through the keys, ignoring the values
for key, value in d.items(): # Iterates through the pairs
for key in d.keys(): # Iterates just through key, ignoring the values
for value in d.values(): # Iterates just through value, ignoring the keys
Create a dictionary from two lists
data = dict(zip(list_with_keys, list_with_values))
To add multiple keys simultaneously, use dict.update():
>>> x = {1:2}
>>> print(x)
{1: 2}
>>> d = {3:4, 5:6, 7:8}
>>> x.update(d)
>>> print(x)
{1: 2, 3: 4, 5: 6, 7: 8}
For adding a single key, the accepted answer has less computational overhead.
"Is it possible to add a key to a Python dictionary after it has been created? It doesn't seem to have an .add() method."
Yes it is possible, and it does have a method that implements this, but you don't want to use it directly.
To demonstrate how and how not to use it, let's create an empty dict with the dict literal, {}:
my_dict = {}
Best Practice 1: Subscript notation
To update this dict with a single new key and value, you can use the subscript notation (see Mappings here) that provides for item assignment:
my_dict['new key'] = 'new value'
my_dict is now:
{'new key': 'new value'}
Best Practice 2: The update method - 2 ways
We can also update the dict with multiple values efficiently as well using the update method. We may be unnecessarily creating an extra dict here, so we hope our dict has already been created and came from or was used for another purpose:
my_dict.update({'key 2': 'value 2', 'key 3': 'value 3'})
my_dict is now:
{'key 2': 'value 2', 'key 3': 'value 3', 'new key': 'new value'}
Another efficient way of doing this with the update method is with keyword arguments, but since they have to be legitimate python words, you can't have spaces or special symbols or start the name with a number, but many consider this a more readable way to create keys for a dict, and here we certainly avoid creating an extra unnecessary dict:
my_dict.update(foo='bar', foo2='baz')
and my_dict is now:
{'key 2': 'value 2', 'key 3': 'value 3', 'new key': 'new value',
'foo': 'bar', 'foo2': 'baz'}
So now we have covered three Pythonic ways of updating a dict.
Magic method, __setitem__, and why it should be avoided
There's another way of updating a dict that you shouldn't use, which uses the __setitem__ method. Here's an example of how one might use the __setitem__ method to add a key-value pair to a dict, and a demonstration of the poor performance of using it:
>>> d = {}
>>> d.__setitem__('foo', 'bar')
>>> d
{'foo': 'bar'}
>>> def f():
... d = {}
... for i in xrange(100):
... d['foo'] = i
...
>>> def g():
... d = {}
... for i in xrange(100):
... d.__setitem__('foo', i)
...
>>> import timeit
>>> number = 100
>>> min(timeit.repeat(f, number=number))
0.0020880699157714844
>>> min(timeit.repeat(g, number=number))
0.005071878433227539
So we see that using the subscript notation is actually much faster than using __setitem__. Doing the Pythonic thing, that is, using the language in the way it was intended to be used, usually is both more readable and computationally efficient.
dictionary[key] = value
If you want to add a dictionary within a dictionary you can do it this way.
Example: Add a new entry to your dictionary & sub dictionary
dictionary = {}
dictionary["new key"] = "some new entry" # add new dictionary entry
dictionary["dictionary_within_a_dictionary"] = {} # this is required by python
dictionary["dictionary_within_a_dictionary"]["sub_dict"] = {"other" : "dictionary"}
print (dictionary)
Output:
{'new key': 'some new entry', 'dictionary_within_a_dictionary': {'sub_dict': {'other': 'dictionarly'}}}
NOTE: Python requires that you first add a sub
dictionary["dictionary_within_a_dictionary"] = {}
before adding entries.
The conventional syntax is d[key] = value, but if your keyboard is missing the square bracket keys you could also do:
d.__setitem__(key, value)
In fact, defining __getitem__ and __setitem__ methods is how you can make your own class support the square bracket syntax. See Dive Into Python, Classes That Act Like Dictionaries.
You can create one:
class myDict(dict):
def __init__(self):
self = dict()
def add(self, key, value):
self[key] = value
## example
myd = myDict()
myd.add('apples',6)
myd.add('bananas',3)
print(myd)
Gives:
>>>
{'apples': 6, 'bananas': 3}
This popular question addresses functional methods of merging dictionaries a and b.
Here are some of the more straightforward methods (tested in Python 3)...
c = dict( a, **b ) ## see also https://stackoverflow.com/q/2255878
c = dict( list(a.items()) + list(b.items()) )
c = dict( i for d in [a,b] for i in d.items() )
Note: The first method above only works if the keys in b are strings.
To add or modify a single element, the b dictionary would contain only that one element...
c = dict( a, **{'d':'dog'} ) ## returns a dictionary based on 'a'
This is equivalent to...
def functional_dict_add( dictionary, key, value ):
temp = dictionary.copy()
temp[key] = value
return temp
c = functional_dict_add( a, 'd', 'dog' )
Let's pretend you want to live in the immutable world and do not want to modify the original but want to create a new dict that is the result of adding a new key to the original.
In Python 3.5+ you can do:
params = {'a': 1, 'b': 2}
new_params = {**params, **{'c': 3}}
The Python 2 equivalent is:
params = {'a': 1, 'b': 2}
new_params = dict(params, **{'c': 3})
After either of these:
params is still equal to {'a': 1, 'b': 2}
and
new_params is equal to {'a': 1, 'b': 2, 'c': 3}
There will be times when you don't want to modify the original (you only want the result of adding to the original). I find this a refreshing alternative to the following:
params = {'a': 1, 'b': 2}
new_params = params.copy()
new_params['c'] = 3
or
params = {'a': 1, 'b': 2}
new_params = params.copy()
new_params.update({'c': 3})
Reference: What does `**` mean in the expression `dict(d1, **d2)`?
There is also the strangely named, oddly behaved, and yet still handy dict.setdefault().
This
value = my_dict.setdefault(key, default)
basically just does this:
try:
value = my_dict[key]
except KeyError: # key not found
value = my_dict[key] = default
E.g.,
>>> mydict = {'a':1, 'b':2, 'c':3}
>>> mydict.setdefault('d', 4)
4 # returns new value at mydict['d']
>>> print(mydict)
{'a':1, 'b':2, 'c':3, 'd':4} # a new key/value pair was indeed added
# but see what happens when trying it on an existing key...
>>> mydict.setdefault('a', 111)
1 # old value was returned
>>> print(mydict)
{'a':1, 'b':2, 'c':3, 'd':4} # existing key was ignored
This question has already been answered ad nauseam, but since my
comment
gained a lot of traction, here it is as an answer:
Adding new keys without updating the existing dict
If you are here trying to figure out how to add a key and return a new dictionary (without modifying the existing one), you can do this using the techniques below
Python >= 3.5
new_dict = {**mydict, 'new_key': new_val}
Python < 3.5
new_dict = dict(mydict, new_key=new_val)
Note that with this approach, your key will need to follow the rules of valid identifier names in Python.
If you're not joining two dictionaries, but adding new key-value pairs to a dictionary, then using the subscript notation seems like the best way.
import timeit
timeit.timeit('dictionary = {"karga": 1, "darga": 2}; dictionary.update({"aaa": 123123, "asd": 233})')
>> 0.49582505226135254
timeit.timeit('dictionary = {"karga": 1, "darga": 2}; dictionary["aaa"] = 123123; dictionary["asd"] = 233;')
>> 0.20782899856567383
However, if you'd like to add, for example, thousands of new key-value pairs, you should consider using the update() method.
Here's another way that I didn't see here:
>>> foo = dict(a=1,b=2)
>>> foo
{'a': 1, 'b': 2}
>>> goo = dict(c=3,**foo)
>>> goo
{'c': 3, 'a': 1, 'b': 2}
You can use the dictionary constructor and implicit expansion to reconstruct a dictionary. Moreover, interestingly, this method can be used to control the positional order during dictionary construction (post Python 3.6). In fact, insertion order is guaranteed for Python 3.7 and above!
>>> foo = dict(a=1,b=2,c=3,d=4)
>>> new_dict = {k: v for k, v in list(foo.items())[:2]}
>>> new_dict
{'a': 1, 'b': 2}
>>> new_dict.update(newvalue=99)
>>> new_dict
{'a': 1, 'b': 2, 'newvalue': 99}
>>> new_dict.update({k: v for k, v in list(foo.items())[2:]})
>>> new_dict
{'a': 1, 'b': 2, 'newvalue': 99, 'c': 3, 'd': 4}
>>>
The above is using dictionary comprehension.
First to check whether the key already exists:
a={1:2,3:4}
a.get(1)
2
a.get(5)
None
Then you can add the new key and value.
Add a dictionary (key,value) class.
class myDict(dict):
def __init__(self):
self = dict()
def add(self, key, value):
#self[key] = value # add new key and value overwriting any exiting same key
if self.get(key)!=None:
print('key', key, 'already used') # report if key already used
self.setdefault(key, value) # if key exit do nothing
## example
myd = myDict()
name = "fred"
myd.add('apples',6)
print('\n', myd)
myd.add('bananas',3)
print('\n', myd)
myd.add('jack', 7)
print('\n', myd)
myd.add(name, myd)
print('\n', myd)
myd.add('apples', 23)
print('\n', myd)
myd.add(name, 2)
print(myd)
I think it would also be useful to point out Python's collections module that consists of many useful dictionary subclasses and wrappers that simplify the addition and modification of data types in a dictionary, specifically defaultdict:
dict subclass that calls a factory function to supply missing values
This is particularly useful if you are working with dictionaries that always consist of the same data types or structures, for example a dictionary of lists.
>>> from collections import defaultdict
>>> example = defaultdict(int)
>>> example['key'] += 1
>>> example['key']
defaultdict(<class 'int'>, {'key': 1})
If the key does not yet exist, defaultdict assigns the value given (in our case 10) as the initial value to the dictionary (often used inside loops). This operation therefore does two things: it adds a new key to a dictionary (as per question), and assigns the value if the key doesn't yet exist. With the standard dictionary, this would have raised an error as the += operation is trying to access a value that doesn't yet exist:
>>> example = dict()
>>> example['key'] += 1
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
KeyError: 'key'
Without the use of defaultdict, the amount of code to add a new element would be much greater and perhaps looks something like:
# This type of code would often be inside a loop
if 'key' not in example:
example['key'] = 0 # add key and initial value to dict; could also be a list
example['key'] += 1 # this is implementing a counter
defaultdict can also be used with complex data types such as list and set:
>>> example = defaultdict(list)
>>> example['key'].append(1)
>>> example
defaultdict(<class 'list'>, {'key': [1]})
Adding an element automatically initialises the list.
Adding keys to dictionary without using add
# Inserting/Updating single value
# subscript notation method
d['mynewkey'] = 'mynewvalue' # Updates if 'a' exists, else adds 'a'
# OR
d.update({'mynewkey': 'mynewvalue'})
# OR
d.update(dict('mynewkey'='mynewvalue'))
# OR
d.update('mynewkey'='mynewvalue')
print(d) # {'key': 'value', 'mynewkey': 'mynewvalue'}
# To add/update multiple keys simultaneously, use d.update():
x = {3:4, 5:6, 7:8}
d.update(x)
print(d) # {'key': 'value', 'mynewkey': 'mynewvalue', 3: 4, 5: 6, 7: 8}
# update operator |= now works for dictionaries:
d |= {'c':3,'d':4}
# Assigning new key value pair using dictionary unpacking.
data1 = {4:6, 9:10, 17:20}
data2 = {20:30, 32:48, 90:100}
data3 = { 38:"value", 99:"notvalid"}
d = {**data1, **data2, **data3}
# The merge operator | now works for dictionaries:
data = data1 | {'c':3,'d':4}
# Create a dictionary from two lists
data = dict(zip(list_with_keys, list_with_values))
dico["new key"] = "value"
I have a dictionary of a list of dictionaries. something like below:
x = {'a':[{'p':1, 'q':2}, {'p':4, 'q':5}], 'b':[{'p':6, 'q':1}, {'p':10, 'q':12}]}
The length of the lists (values) is the same for all keys of dict x.
I want to get the length of any one value i.e. a list without having to go through the obvious method -> get the keys, use len(x[keys[0]]) to get the length.
my code for this as of now:
val = None
for key in x.keys():
val = x[key]
break
#break after the first iteration as the length of the lists is the same for any key
try:
what_i_Want = len(val)
except TypeError:
print 'val wasn't set'
i am not happy with this, can be made more 'pythonic' i believe.
This is most efficient way, since we don't create any intermediate lists.
print len(x[next(iter(x))]) # 2
Note: For this method to work, the dictionary should have atleast one key in it.
What about this:
val = x[x.keys()[0]]
or alternatively:
val = x.values()[0]
and then your answer is
len(val)
Some of the other solutions (posted by thefourtheye and gnibbler) are better because they are not creating an intermediate list. I added this response merely as an easy to remember and obvious option, not a solution for time-efficient usage.
Works ok in Python2 or Python3
>>> x = {'a':[{'p':1, 'q':2}, {'p':4, 'q':5}], 'b':[{'p':6, 'q':1}, {'p':10, 'q':12}]}
>>> next(len(i) for i in x.values())
2
This is better for Python2 as it avoids making a list of the values. Works well in Python3 too
>>> next(len(x[k]) for k in x)
2
Using next and iter:
>>> x = {'a':[{'p':1, 'q':2}, {'p':4, 'q':5}], 'b':[{'p':6, 'q':1}, {'p':10, 'q':12}]}
>>> val = next(iter(x.values()), None) # Use `itervalues` in Python 2.x
>>> val
[{'q': 2, 'p': 1}, {'q': 5, 'p': 4}]
>>> len(val)
2
>>> x = {}
>>> val = next(iter(x.values()), None) # `None`: default value
>>> val is None
True
>>> x = {'a':[{'p':1, 'q':2}, {'p':4, 'q':5}], 'b':[{'p':6, 'q':1}, {'p':10, 'q':12}]}
>>> len(x.values()[0])
2
Here, x.values gives you a list of all values then you can get length of any one value from it.