Adding dictionaries together [duplicate] - python

This question already has answers here:
How to concatenate two dictionaries to create a new one? [duplicate]
(5 answers)
Closed 7 years ago.
I have two dictionaries and I'd like to be able to make them one:
Something like this pseudo-Python would be nice:
dic0 = {'dic0': 0}
dic1 = {'dic1': 1}
ndic = dic0 + dic1
# ndic would equal {'dic0': 0, 'dic1': 1}

If you want to create a new dict without using intermediary storage (this is faster and, in my opinion, cleaner than going through dict.items()):
dic2 = dict(dic0, **dic1)
Or if you're happy to use one of the existing dicts:
dic0.update(dic1)
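For example, with the dicts from the question, a quick sketch of the difference between the two:
dic0 = {'dic0': 0}
dic1 = {'dic1': 1}

dic2 = dict(dic0, **dic1)  # new dict: {'dic0': 0, 'dic1': 1}
dic0.update(dic1)          # modifies dic0 in place to the same result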

Here are quite a few ways to add dictionaries.
You can use Python 3.5+'s dictionary unpacking feature:
ndic = {**dic0, **dic1}
Note that in the case of duplicates, values from later arguments are used. This is also the case for the other examples listed here.
Or create a new dict by adding the items of both:
ndic = dict(tuple(dic0.items()) + tuple(dic1.items()))
If modifying dic0 is OK:
dic0.update(dic1)
If modifying dic0 is NOT OK:
ndic = dic0.copy()
ndic.update(dic1)
If all the keys in one dict are guaranteed to be strings (dic1 in this case; the arguments can of course be swapped):
ndic = dict(dic0, **dic1)
In some cases it may be handy to use a dict comprehension (Python 2.7 or newer), especially if you want to filter out or transform some keys/values at the same time.
ndic = {k: v for d in (dic0, dic1) for k, v in d.items()}
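For example, a small sketch that merges both dicts while upper-casing the keys and dropping None values (the transform and the filter are made up for illustration):
ndic = {k.upper(): v
        for d in (dic0, dic1)
        for k, v in d.items()
        if v is not None}
# {'DIC0': 0, 'DIC1': 1}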

>>> dic0 = {'dic0':0}
>>> dic1 = {'dic1':1}
>>> ndic = dict(list(dic0.items()) + list(dic1.items()))
>>> ndic
{'dic0': 0, 'dic1': 1}
>>>

You are looking for the update method:
dic0.update(dic1)
print(dic0)
gives
{'dic0': 0, 'dic1': 1}

dic0.update(dic1)
Note that this doesn't actually return the combined dictionary; it just mutates dic0 in place.
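A small sketch of the pitfall:
merged = dic0.update(dic1)
print(merged)  # None
print(dic0)    # {'dic0': 0, 'dic1': 1}; dic0 itself now holds the merged result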

The easiest way to do it is to simply use your example code, but with the items() method of each dictionary. So the code would be:
dic0 = {'dic0': 0}
dic1 = {'dic1': 1}
dic2 = dict(dic0.items() + dic1.items())
I tested this in IDLE and it works fine. (Note that this is Python 2 only: in Python 3, items() returns a view, so you have to wrap each call in list() before concatenating, as shown in the answer above.)
However, the previous question on this topic states that this method is slow and chews up memory. Several other ways are recommended there, so please see that question if memory usage is important.

Related

Unpacking list into strings, splitting them and creating a dictionary from there

I'm using Python 3.x and I want to create a dictionary from a list. That is, I have each key and value concatenated in a string, and each entry as an element in a list.
my_list = ['val1 key1', 'val2 key2', 'val2 key2']
I can split it into two lists using
values,keys = zip(*(s.split() for s in my_list))
Creating a dictionary from there is easy. Since I still need to do stuff to the keys and values, I do:
my_dict = {k[:-1]:float(v) for k,v in zip(keys,values)}
Out of mere curiosity, I was wondering if there is a way to avoid the intermediate lists. In short, I need to access each list element, split the string, do something to each split part, and input it as a key:value pair into a dictionary. Following this question I tried
my_dict = {k[:-1]:float(v) for v,k in zip(*(s.split() for s in my_list))}
But I get ValueError: too many values to unpack (expected 2). Then I tried simply using a generator (I think it's a generator) inside the dictionary comprehension syntax and it works. But I don't like it, since the second for is used only to extract an element from the list:
my_dict = {s[1][:-1]:float(s[0]) for s in (s.split(', ') for s in my_list)}
This is what I'm currently using and it works perfectly, but I'd like to know why the second solution doesn't work. To me, it seems it should, and my solution uses one too many for clauses. I'm aware it's not a super relevant question, but I'd like to learn. Also, I'm open to title suggestions.
EDIT1: Fixed a few syntax errors I had.
EDIT2: A full working and explicit example with the expected result, as suggested. I'm still working on making good MCVEs:
my_list = ['1.123, name1\n', '2.3234, name2\n', '3.983, name3\n', '4.23, name4\n']
The output I want is what I would get if I manually did
my_dict = {'name1':1.123, 'name2':2.3234, 'name3':3.983, 'name4':4.23}
Method that creates intermediate lists:
values,keys = zip(*(s.split(', ') for s in my_list))
print(values)
>>> ('1.123', '2.3234', '3.983', '4.23')
print(keys)
>>> ('name1\n', 'name2\n', 'name3\n', 'name4\n')
my_dict = {k[:-1]:float(v) for k,v in zip(keys,values)}
print(my_dict)
>>> {'name4': 4.23, 'name2': 2.3234, 'name1': 1.123, 'name3': 3.983}
Example that I don't know why it does not work:
my_dict = {k[:-1]:float(v) for v,k in zip(*(s.split(', ') for s in my_list))}
>>> ValueError: too many values to unpack (expected 2)
Working example that, to me, seems to use one too many for clauses inside the list/dict comprehension/generator expression:
my_dict = {s[1][:-1]:float(s[0]) for s in (s.split(', ') for s in my_list)}
print(my_dict)
>>> {'name4': 4.23, 'name2': 2.3234, 'name1': 1.123, 'name3': 3.983}
My actual strings look something like '0.9493432915614861, zf_AB012_bn_BOS\n'; that's why I use a more readable example here.
EDIT3: I just learned of the str.strip() method. This makes the line creating the dictionary a bit nicer:
my_dict = {s[1].strip():float(s[0]) for s in (s.split(', ') for s in my_list)}

def dict_unzip(lst):
    for x in lst:
        yield reversed(x.split(' ', 1))

my_dict = dict(dict_unzip(my_list))
But since it's Python 3, things are actually simpler:
my_dict = dict(map(lambda s: reversed(s.split(' ', 1)), my_list))
Or even:
my_dict = dict(reversed(s.split(' ', 1)) for s in my_list)
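As for why the zip-based attempt fails: zip(*(s.split(', ') for s in my_list)) yields two tuples of length four (all the values, then all the keys), so for v, k in ... tries to unpack a four-element tuple into two names, hence the ValueError. A one-pass sketch over the data from EDIT2 that avoids the intermediate tuples entirely:
my_list = ['1.123, name1\n', '2.3234, name2\n', '3.983, name3\n', '4.23, name4\n']

# Split each entry once, unpack it directly, strip the key and convert the value.
my_dict = {key.strip(): float(value)
           for value, key in (s.split(', ') for s in my_list)}

print(my_dict)  # {'name1': 1.123, 'name2': 2.3234, 'name3': 3.983, 'name4': 4.23}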

Is there something simple like a set for un-hashable objects?

For hashable objects inside a dict, I could easily pare down duplicate values stored in the dict using a set. For example:
a = {'test': 1, 'key': 1, 'other': 2}
b = set(a.values())
print(b)
Would display {1, 2}
The problem I have is that I am using a dict to store the mapping between variable keys in __dict__ and the corresponding processing functions that will be passed to an engine to order and process those functions; some of these functions may be fast and some may be slower due to accessing an API. The problem is that each function may use multiple variables, and therefore needs multiple mappings in the dict. I'm wondering if there is a way to do this, or if I am stuck writing my own solution?
Ended up building a callable class, since caching could speed things up for me:
from collections.abc import Callable

class RemoveDuplicates(Callable):
    input_cache = []
    output_cache = []

    def __call__(self, in_list):
        # Return the cached result if this exact list has been seen before.
        if in_list in self.input_cache:
            idx = self.input_cache.index(in_list)
            return self.output_cache[idx]
        else:
            self.input_cache.append(in_list)
            out_list = self._remove_duplicates(in_list)
            self.output_cache.append(out_list)
            return out_list

    def _remove_duplicates(self, src_list):
        # Preserve order while dropping repeated (possibly un-hashable) items.
        result = []
        for item in src_list:
            if item not in result:
                result.append(item)
        return result
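A minimal usage sketch of the class above (note that input_cache and output_cache are class attributes, so the cache is shared across instances):
remove_duplicates = RemoveDuplicates()

data = [{'a': 1}, {'b': 2}, {'a': 1}]   # un-hashable (dict) items
print(remove_duplicates(data))          # [{'a': 1}, {'b': 2}]
print(remove_duplicates(data))          # second call is answered from the cache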
If the objects can be ordered, you can use itertools.groupby to eliminate the duplicates:
>>> import itertools
>>> a = {'test': 1, 'key': 1, 'other': 2}
>>> b = [k for k, it in itertools.groupby(sorted(a.values()))]
>>> print(b)
[1, 2]
Is there something simple like a set for un-hashable objects
Not in the standard library, but you need to look beyond it and search for a BTree implementation of a dictionary. I googled and found a few hits, where the first one (BTree) seems promising and interesting.
Quoting from the wiki:
The BTree-based data structures differ from Python dicts in several fundamental ways. One of the most important is that while dicts require that keys support hash codes and equality comparison, the BTree-based structures don't use hash codes and require a total ordering on keys.
Of course, it's a trivial fact that a set can be implemented as a dictionary where the value is unused.
You could (indirectly) use the bisect module to create a sorted collection of your values, which would greatly speed up the insertion of new values and value membership testing in general; together these can be used to ensure that only unique values get put into it.
In the code below, I've used un-hashable set values for the sake of illustration.
# see http://code.activestate.com/recipes/577197-sortedcollection
from sortedcollection import SortedCollection

a = {'test': {1}, 'key': {1}, 'other': {2}}

sc = SortedCollection()
for value in a.values():
    if value not in sc:
        sc.insert(value)

print(list(sc))  # --> [{1}, {2}]

Declaring a dictionary using another value within the same dictionary?

I'm using Python, trying to basically do this:
myDict = {"key1" : 1, "key2" : myDict["key1"]+1}
...if you catch my drift. Possible without using multiple statements?
EDIT: Also, if anyone could tell me a better way to state this question more clearly that would be cool. I don't really know how to word what I'm asking.
EDIT2: Seems to be some confusion - yes, it's more complex than just "key2":1+1, and what I'm doing is mostly for code readability as it will get messy if I have to 2-line it.
Here's a bit more accurate code sample of what I'm trying to do...though it's still not nearly as complex as it gets :P
lvls={easy: {mapsize:(10,10), winPos:(mapsize[0]-1,mapsize[1]-1)},
medium:{mapsize:(15,15), winPos:(mapsize[0]-RANDOMINT,mapsize[1]-1)},
hard: {mapsize:(20,20), winPos:(mapsize[0]-RANDOMINT,mapsize[1]-RANDOMINT)}
}
No, this isn't possible in general without using multiple statements.
In this particular case, you could get around it in a hacky way. For example:
myDict = dict(zip(("key1", "key2"), itertools.count(1)))
However, that will only work when you want to specify a single start value and everything else will be sequential, and presumably that's not general enough for what you want.
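For reference, a quick sketch of what that trick produces (itertools must be imported first):
import itertools

myDict = dict(zip(("key1", "key2"), itertools.count(1)))
print(myDict)  # {'key1': 1, 'key2': 2}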
If you're doing this kind of thing a lot, you could wrap those multiple statements up in some suitably-general function, so that each particular instance is just a single expression. For example:
def make_funky_dict(*args):
    myDict = {}
    # Pair up the flat argument list two at a time: (key, value), (key, value), ...
    for key, value in zip(*[iter(args)] * 2):
        if value in myDict:
            value = myDict[value] + 1
        myDict[key] = value
    return myDict

myDict = make_funky_dict("key1", 1, "key2", "key1")
But really, there's no good reason not to use multiple statements here, and it will probably be a lot clearer, so… I'd just do it that way.
It's not possible without using multiple statements, at least not using some of the methods from your problem statement. But here's something, using dict comprehension:
>>> myDict = {"key" + str(key): value for (key, value) in enumerate(range(7))}
>>> myDict
{'key0': 0,
'key1': 1,
'key2': 2,
'key3': 3,
'key4': 4,
'key5': 5,
'key6': 6}
Of course those aren't in order, but they're all there.
The only variable you are trying to use is an integer. How about a nice function:
def makelevel(size, jitterfunc=lambda: 0):
    return {'mapsize': (size, size), 'winPos': (size - 1 + jitterfunc(), size - 1 + jitterfunc())}

lvls = {hardness: makelevel(size) for hardness, size in [('easy', 10), ('medium', 15), ('hard', 20)]}
Of course, this function looks a bit like a constructor. Maybe you should be using objects?
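For illustration, here is one way the jitterfunc hook might stand in for the question's RANDOMINT (the random offset is an assumption; the question never defines it):
import random

def jitter():
    # Hypothetical stand-in for the RANDOMINT placeholder in the question.
    return -random.randint(0, 3)

lvls = {
    'easy':   makelevel(10),
    'medium': makelevel(15, jitter),
    'hard':   makelevel(20, jitter),
}
print(lvls['easy'])  # {'mapsize': (10, 10), 'winPos': (9, 9)}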
If you want a dict that will allow you to have values that are evaluated on demand, you can do something like this:
class myDictType(dict):
    def __getitem__(self, key):
        retval = dict.__getitem__(self, key)
        # If the stored value is a lambda, call it and return the result.
        if type(retval) == type(lambda: 1):
            return retval()
        return retval

myDict = myDictType()
myDict['bar'] = lambda: myDict['foo'] + 1
myDict['foo'] = 1
print(myDict['bar'])  # This'll print a 2
myDict['foo'] = 2
print(myDict['bar'])  # This'll print a 3
This overrides __getitem__ in the dictionary to return whatever is stored in it (like a normal dictionary), unless what is stored there is a lambda. If the value is a lambda, it instead evaluates it and returns the result.

How can I get Python to automatically create missing key/value pairs in a dictionary? [duplicate]

This question already has answers here:
Is there a standard class for an infinitely nested defaultdict?
(6 answers)
Closed 9 years ago.
I'm creating a dictionary structure that is several levels deep. I'm trying to do something like the following:
dict = {}
dict['a']['b'] = True
At the moment the above fails because key 'a' does not exist, so I have to check at every level of nesting and manually insert an empty dictionary. Is there some type of syntactic sugar so that something like the above can produce:
{'a': {'b': True}}
Without having to create an empty dictionary at each level of nesting?
As others have said, use defaultdict. This is the idiom I prefer for arbitrarily-deep nesting of dictionaries:
import collections

def nested_dict():
    return collections.defaultdict(nested_dict)

d = nested_dict()
d[1][2][3] = 'Hello, dictionary!'
print(d[1][2][3])  # Prints Hello, dictionary!
This also makes checking whether an element exists a little nicer, since you may no longer need to use get:
if not d[2][3][4][5]:
    print('That element is empty!')
This has been edited to use a def rather than a lambda for PEP 8 compliance. The original lambda form looked like this, which has the drawback of being called <lambda> everywhere instead of getting a proper function name:
>>> nested_dict = lambda: collections.defaultdict(nested_dict)
>>> d = nested_dict()
>>> d[1][2][3]
defaultdict(<function <lambda> at 0x037E7540>, {})
Use defaultdict.
Python: defaultdict of defaultdict?
Or you can do this, since the dict() function can handle **kwargs:
http://docs.python.org/2/library/functions.html#func-dict
print dict(a=dict(b=True))
# {'a': {'b' : True}}
If the depth of your data structure is fixed (that is, you know in advance that you need mydict[a][b][c] but not mydict[a][b][c][d]), you can build a nested defaultdict structure using lambda expressions to create the inner structures:
from collections import defaultdict

two_level = defaultdict(dict)
three_level = defaultdict(lambda: defaultdict(dict))
four_level = defaultdict(lambda: defaultdict(lambda: defaultdict(dict)))
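A quick usage sketch of the fixed-depth variant:
three_level['a']['b']['c'] = True
print(three_level['a']['b'])  # {'c': True}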

check dictionary for doubled keys [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
How to raise error if duplicates keys in dictionary
I was recently generating huge dictionaries with hundreds of thousands of keys (such that noticing a bug by looking at them wasn't feasible). They were syntactically correct, yet there was a bug somewhere. It boiled down to "duplicate keys":
{'a':1, ..., 'a':2}
This code compiles fine, and I could not figure out why a key had a value of 2 when I expected 1. The problem is obvious now.
The question is how I can prevent this in the future. I think it is impossible within Python. I used
grep "'.*'[ ]*:" myfile.py | sort | uniq -c | grep -v 1
which is not bulletproof. Any other ideas (within Python; this grep is just to illustrate what I'd tried)?
EDIT: I don't want duplicate keys; I just need to spot when this occurs and edit the data manually.
A dict cannot contain duplicate keys, so all you need to do is execute the code and then dump the repr() of the dict.
Another option is creating the dict items as (key, value) tuples. By storing them in a list you can easily create a dict from them and then check whether the len()s of the dict and the list differ.
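A minimal sketch of that length check (the pairs list here is made up for illustration):
pairs = [('a', 1), ('b', 2), ('a', 2)]   # generated (key, value) tuples

d = dict(pairs)
if len(d) != len(pairs):
    raise ValueError("duplicate keys: %d pairs but only %d unique keys"
                     % (len(pairs), len(d)))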
If you need to have multiple values per key you can store the values in a list using defaultdict.
>>> from collections import defaultdict
>>> data_dict = defaultdict(list)
>>> data_dict['key'].append('value')
>>> data_dict
defaultdict(<type 'list'>, {'key': ['value']})
>>> data_dict['key'].append('second_value')
>>> data_dict
defaultdict(<type 'list'>, {'key': ['value', 'second_value']})
Are you generating a Python file containing a giant dictionary? Something like:
print "{"
for lines in file:
    key, _, value = lines.partition(" ")
    print " '%s': '%s'," % (key, value)
print "}"
If so, there's not much you can do to prevent this, as you cannot easily override the construction of the builtin dict.
Instead I'd suggest you validate the data while constructing the dictionary string. You could also generate different syntax:
dict(a='1', a='2')
...which will generate a SyntaxError if the key is duplicated. However, these are not exactly equivalent, as dictionary keys are a lot more flexible than keyword args (e.g. {123: '...'} is valid, but dict(123='...') is an error).
You could generate a function call like:
uniq_dict([('a', '...'), ('a', '...')])
Then include the function definition:
def uniq_dict(values):
    thedict = {}
    for k, v in values:
        if k in thedict:
            raise ValueError("Duplicate key %s" % k)
        thedict[k] = v
    return thedict
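Calling it with a repeated key then fails loudly, which is the point; a quick sketch:
print(uniq_dict([('a', 1), ('b', 2)]))  # {'a': 1, 'b': 2}
uniq_dict([('a', 1), ('a', 2)])         # raises ValueError: Duplicate key a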
You don't say or show exactly how you're generating the dictionary display you have where the duplicate keys are appearing. But that is where the problem lies.
Instead of using something like {'a':1, ..., 'a':2} to construct the dictionary, I suggest that you use this form: dict([['a', 1], ..., ['a', 2]]) which will create one from a supplied list of [key, value] pairs. This approach will allow you to check the list of pairs for duplicates before passing it to dict() to do the actual construction of the dictionary.
Here's an example of one way to check the list of pairs for duplicates:
sample = [['a', 1], ['b', 2], ['c', 3], ['a', 2]]

def validate(pairs):
    # check for duplicate key names and raise an exception if any are found
    dups = []
    seen = set()
    for key_name, val in pairs:
        if key_name in seen:
            dups.append(key_name)
        else:
            seen.add(key_name)
    if dups:
        raise ValueError('Duplicate key names encountered: %r' % sorted(dups))
    else:
        return pairs

my_dict = dict(validate(sample))
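Note that with the sample list above, which repeats 'a', that last call raises; wrapped in a try/except for illustration:
try:
    my_dict = dict(validate(sample))
except ValueError as exc:
    print(exc)  # Duplicate key names encountered: ['a']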
