Why do changes to a nested dict inside dict2 affect dict1? [duplicate]

I don't understand these cases:
content = {'a': {'v': 1}, 'b': {'v': 2}}
d1 = {'k1': {}}
d2 = {'k2': {}}
d1['k1'].update(content)
print(d1)
content['a']['v'] = 3
content['b']['v'] = 4
d2['k2'].update(content)
print(d2)
print(d1)
>>> {'k1': {'a': {'v': 1}, 'b': {'v': 2}}}
>>> {'k2': {'a': {'v': 3}, 'b': {'v': 4}}}
>>> {'k1': {'a': {'v': 3}, 'b': {'v': 4}}}
In the case above the content of d1 is changed after the variable content is updated.
content = {'a': 1, 'b': 2}
d1 = {'k1': {}}
d2 = {'k2': {}}
d1['k1'].update(content)
print(d1)
content['a'] = 3
content['b'] = 4
d2['k2'].update(content)
print(d2)
print(d1)
>>> {'k1': {'a': 1, 'b': 2}}
>>> {'k2': {'a': 3, 'b': 4}}
>>> {'k1': {'a': 1, 'b': 2}}
However in this case d1 is not altered even if the variable content was changed. I don't understand why... any idea?

See shallow vs. deep copy.
update() makes a shallow copy: the top-level key/value bindings are copied into the target dict, but the values themselves, including any nested dicts, are still references to the same objects.
A shallow copy constructs a new compound object and then (to the extent possible) inserts references into it to the objects
found in the original.
A deep copy constructs a new compound object and then, recursively, inserts copies into it of the objects found in the
original.
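To see the sharing directly, here is a minimal sketch that reuses the `content` and `d1` names from the question. The `is` checks and the two `*_is_shared` variables are my additions for illustration only.

```python
content = {'a': {'v': 1}, 'b': {'v': 2}}
d1 = {'k1': {}}

# update() copies the key -> value bindings, not the values themselves
d1['k1'].update(content)

# the dict stored under 'k1' is a different object from content ...
outer_is_shared = d1['k1'] is content
# ... but the nested dict is the very same object in both places
inner_is_shared = d1['k1']['a'] is content['a']

content['a']['v'] = 3
print(outer_is_shared, inner_is_shared, d1['k1']['a']['v'])  # False True 3
```

Because the inner dict is shared, the mutation made through `content` shows up in `d1` as well.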

The key difference between your two snippets is that content['a']['v'] = 3 is a completely different operation than content['a'] = 3. In the first case, you're modifying the inner dictionary by changing its v key. In the latter case, you're replacing the value in the dictionary without modifying it.
It's confusing when everything's a dictionary, so let's replace the dictionaries with variables and instances of a class:
class Person:
    def __init__(self, name):
        self.name = name

# these two variables represent your `content` dict
a = Person('Andy')   # this variable represents `{'v': 1}`
b = Person('Belle')  # this variable represents `{'v': 2}`
# the equivalent of `d1['k1'].update(content)` is a simple assignment
k1_a = a
# and the equivalent of `content['a']['v'] = 3` is changing a's name
a.name = 'Aaron'
# because k1_a and a are the same Person instance, this is reflected in k1_a:
print(k1_a.name)  # output: Aaron
The key points to note here are that
k1_a = a doesn't make a copy of the Person; similar to how d1['k1'].update(content) doesn't make a copy of the {'v': 1} dict.
a.name = 'Aaron' modifies the Person; similar to how content['a']['v'] = 3 modifies the inner dict.
The equivalent of your 2nd snippet looks like this:
a = 'Andy'
b = 'Belle'
k1_a = a
a = 'Aaron'
print(k1_a) # output: Andy
This time, no object is ever modified. All we're doing is overwriting the value of the a variable, exactly how content['a'] = 3 overwrites the value of the a key in your dict.
If you don't want the changes in the inner dicts to be reflected in other dicts, you have to copy them with copy.deepcopy:
import copy
content = {'a': {'v': 1}, 'b': {'v': 2}}
d1 = {'k1': {}}
d2 = {'k2': {}}
d1['k1'].update(copy.deepcopy(content))
print(d1)
content['a']['v'] = 3
content['b']['v'] = 4
d2['k2'].update(copy.deepcopy(content))
print(d2)
print(d1)
# output:
# {'k1': {'a': {'v': 1}, 'b': {'v': 2}}}
# {'k2': {'a': {'v': 3}, 'b': {'v': 4}}}
# {'k1': {'a': {'v': 1}, 'b': {'v': 2}}}

If we replace the update() with a simple assignment:
# d1['k1'].update(content)
d1['k1'] = content
We get:
{'k1': {'a': 1, 'b': 2}}
{'k2': {'a': 3, 'b': 4}}
{'k1': {'a': 3, 'b': 4}}
(Which is different from what update does in your example.) This is because update accepts a mapping or an iterable of key/value pairs and copies those pairs into the existing dict. Since d1['k1'] starts out empty, the effect here is the same as:
d1['k1'] = {k: v for k, v in content.items()}
And of course, the int values are immutable, so rebinding content['a'] to a new int does not affect the copy.
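A small sketch of the difference between rebinding and merging, assuming the flat `content` dict from the second snippet. The names `assigned`, `updated` and `inner` are made up for this example.

```python
content = {'a': 1, 'b': 2}

assigned = {'k1': {}}
assigned['k1'] = content        # rebinding: 'k1' now refers to content itself

updated = {'k1': {}}
inner = updated['k1']
updated['k1'].update(content)   # merging: the existing inner dict is filled in place

content['a'] = 3

print(assigned['k1'] is content)                 # True
print(updated['k1'] is inner)                    # True, still the original inner dict
print(assigned['k1']['a'], updated['k1']['a'])   # 3 1
```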

Related

python deepcopy not removing internal references in list extended using * operator?

I have the following code. I extended a list of dicts (l1) with l1*3, deep-copied the result, and assigned it to l2. Now when I modify the first element of l2, the corresponding elements of l2 are modified as well. So does deepcopy not remove the references created by the * operator on a list of dicts?
import copy
l1 = [{'a':1},{'b':2}]
l2 = copy.deepcopy(l1*3)
print(l2)
l2[0]['a'] = 7 # Why this changed ['a'] to 7 in all dicts in l2, even after deepcopy?
print(l2)
Output:
[{'a': 1}, {'b': 2}, {'a': 1}, {'b': 2}, {'a': 1}, {'b': 2}]
[{'a': 7}, {'b': 2}, {'a': 7}, {'b': 2}, {'a': 7}, {'b': 2}]
Expected: (Only first element should modify)
[{'a': 1}, {'b': 2}, {'a': 1}, {'b': 2}, {'a': 1}, {'b': 2}]
[{'a': 7}, {'b': 2}, {'a': 1}, {'b': 2}, {'a': 1}, {'b': 2}]
I have already found following solution to get expected output:
l2 = [d.copy() for d in l1*3]
Can someone explain why deepcopy did not work in the first snippet?

deepcopy intentionally replicates any aliased references within the structure being copied. It maintains a memo dictionary of objects already copied during the current copy operation, and when the same object is seen again, it inserts an alias to the already copied object. Among other things, this makes it safe with recursive data structures (where a non-memoized deepcopy would recurse forever until it ran out of memory and died).
If you want the individual elements to be unaliased, deepcopy them individually, e.g.:
l2 = [copy.deepcopy(x) for x in l1*3]
where the separated deepcopy operations maintain separate memoization dictionaries.
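The aliasing can be checked with `is`. This is only an illustrative sketch using the `l1` list from the question; the names `tripled`, `whole` and `per_item` are mine.

```python
import copy

l1 = [{'a': 1}, {'b': 2}]
tripled = l1 * 3                      # six slots, but only two distinct dict objects

whole = copy.deepcopy(tripled)        # one memo shared across the whole list
per_item = [copy.deepcopy(x) for x in tripled]  # a fresh memo for each element

print(tripled[0] is tripled[2])       # True: aliasing created by *
print(whole[0] is tripled[0])         # False: the elements really were copied ...
print(whole[0] is whole[2])           # True: ... but the aliasing was preserved
print(per_item[0] is per_item[2])     # False: each element was copied on its own
```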

Get a dictionary with saved structure of all keys in nested dictionaries

I want to get the structure containing all possible keys of a single database field that is stored in JSON (dict) format. The structure is very important in my task, so storing the keys in a flat list is not suitable. There can be many levels of nesting, far more than two.
This example:
dicts = {'a':1, 'b':2, 'c':{'in_c1': 2}}, \
{'a':1, 'd':2, 'c':{'dict_in_c2': {'v': 2}}},\
{'e':57}
Should return:
{'a': 1, 'b':2, 'c': {'in_c1': 2, 'dict_in_c2': {'v': 2}}, 'd': 2, 'e': 57}
The values are not important to me; it would be better to replace them with something like None or an empty string.
How can I do this?

This might not be able to handle all cases, but it covers some basic cases.
I am assuming that:
keys that have an integer value (like a and b) will NOT have any dictionary values;
keys that have a dictionary as value (like c) will NOT have any non-dictionary values.
Also, the merging has only been performed on the first level of nested keys, not on deeper levels.
Here is the code:
dicts = {'a':1, 'b':2, 'c':{'in_c1': 2}}, \
{'a':1, 'd':2, 'c':{'dict_in_c2': {'v': 2}}},\
{'e':57}
result = dict()
for dictionary in dicts:
    for key, value in dictionary.items():
        if not isinstance(value, dict):
            result[key] = value
            continue
        if not result.get(key, None):
            result[key] = dict()
        for k, v in value.items():
            result[key][k] = v
print(result)
Result -
{'b': 2, 'a': 1, 'c': {'dict_in_c2': {'v': 2}, 'in_c1': 2}, 'e': 57, 'd': 2}
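For arbitrary nesting depth, a recursive variant is one option. The following is only a sketch under my own assumptions: the function name `merge_structure` is hypothetical, leaf values are replaced with None as the question suggests, and if a key is a dict in one input and a scalar in another, the dict structure wins.

```python
def merge_structure(dicts):
    """Merge the key structure of several dicts; leaf values become None."""
    result = {}
    for d in dicts:
        for key, value in d.items():
            if isinstance(value, dict):
                existing = result.get(key)
                # recurse, merging with any structure already collected for this key
                nested = [existing, value] if isinstance(existing, dict) else [value]
                result[key] = merge_structure(nested)
            elif not isinstance(result.get(key), dict):
                # a scalar never overwrites an already collected nested structure
                result[key] = None
    return result


dicts = ({'a': 1, 'b': 2, 'c': {'in_c1': 2}},
         {'a': 1, 'd': 2, 'c': {'dict_in_c2': {'v': 2}}},
         {'e': 57})
merged = merge_structure(dicts)
print(merged)
```

The result is built from fresh dicts only, so it shares no nested objects with the inputs.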

If I understand your question correctly, you have a list of dicts that you want to merge into a single dict.
If the list of dicts is:
dicts = [{'a':1, 'b':2, 'c':{'dict_in_c': 2}},
{'a':1, 'd':2, 'c':{'dict_in_c2': 2}},
{'e':57}]
Then you can do:
newdict = {}
for eachdict in dicts:
    for eachkey in eachdict.keys():
        newdict[eachkey] = eachdict[eachkey]
newdict will be:
{'a': 1, 'b': 2, 'c': {'dict_in_c2': 2}, 'd': 2, 'e': 57}

what's wrong with my python list output [duplicate]

I set dict2 = dict1. When I edit dict2, the original dict1 also changes. Why?
>>> dict1 = {"key1": "value1", "key2": "value2"}
>>> dict2 = dict1
>>> dict2["key2"] = "WHY?!"
>>> dict1
{'key2': 'WHY?!', 'key1': 'value1'}

Python never implicitly copies objects. When you set dict2 = dict1, you are making them refer to the same exact dict object, so when you mutate it, all references to it keep referring to the object in its current state.
If you want to copy the dict (which is rare), you have to do so explicitly with
dict2 = dict(dict1)
or
dict2 = dict1.copy()

When you assign dict2 = dict1, you are not making a copy of dict1, it results in dict2 being just another name for dict1.
To copy the mutable types like dictionaries, use copy / deepcopy of the copy module.
import copy
dict2 = copy.deepcopy(dict1)

While dict.copy() and dict(dict1) generate a copy, they are only shallow copies. If you want a deep copy, copy.deepcopy(dict1) is required. An example:
>>> source = {'a': 1, 'b': {'m': 4, 'n': 5, 'o': 6}, 'c': 3}
>>> copy1 = source.copy()
>>> copy2 = dict(source)
>>> import copy
>>> copy3 = copy.deepcopy(source)
>>> source['a'] = 10 # a change to first-level properties won't affect copies
>>> source
{'a': 10, 'c': 3, 'b': {'m': 4, 'o': 6, 'n': 5}}
>>> copy1
{'a': 1, 'c': 3, 'b': {'m': 4, 'o': 6, 'n': 5}}
>>> copy2
{'a': 1, 'c': 3, 'b': {'m': 4, 'o': 6, 'n': 5}}
>>> copy3
{'a': 1, 'c': 3, 'b': {'m': 4, 'o': 6, 'n': 5}}
>>> source['b']['m'] = 40 # a change to deep properties WILL affect shallow copies 'b.m' property
>>> source
{'a': 10, 'c': 3, 'b': {'m': 40, 'o': 6, 'n': 5}}
>>> copy1
{'a': 1, 'c': 3, 'b': {'m': 40, 'o': 6, 'n': 5}}
>>> copy2
{'a': 1, 'c': 3, 'b': {'m': 40, 'o': 6, 'n': 5}}
>>> copy3 # Deep copy's 'b.m' property is unaffected
{'a': 1, 'c': 3, 'b': {'m': 4, 'o': 6, 'n': 5}}
Regarding shallow vs deep copies, from the Python copy module docs:
The difference between shallow and deep copying is only relevant for compound objects (objects that contain other objects, like lists or class instances):
A shallow copy constructs a new compound object and then (to the extent possible) inserts references into it to the objects found in the original.
A deep copy constructs a new compound object and then, recursively, inserts copies into it of the objects found in the original.

In Depth and an easy way to remember:
Whenever you do dict2 = dict1, dict2 refers to the same object as dict1: both names point to the same location in memory. This is normal behaviour in Python, but it only becomes visible with mutable objects, and the resulting bugs can be hard to track down, so be careful when working with them.
Instead of using dict2 = dict1, use the copy (shallow copy) or deepcopy function from Python's copy module to separate dict2 from dict1.
The correct way is:
>>> dict1 = {"key1": "value1", "key2": "value2"}
>>> dict2 = dict1.copy()
>>> dict2
{'key1': 'value1', 'key2': 'value2'}
>>> dict2["key2"] = "WHY?"
>>> dict2
{'key1': 'value1', 'key2': 'WHY?'}
>>> dict1
{'key1': 'value1', 'key2': 'value2'}
>>> id(dict1)
140641178056312
>>> id(dict2)
140641176198960
>>>
As you can see, the ids of dict1 and dict2 are different, which means they reference different locations in memory.
This solution works for dictionaries with immutable values; it is not the correct solution for those with mutable values.
E.g.:
>>> import copy
>>> dict1 = {"key1" : "value1", "key2": {"mutable": True}}
>>> dict2 = dict1.copy()
>>> dict2
{'key1': 'value1', 'key2': {'mutable': True}}
>>> dict2["key2"]["mutable"] = False
>>> dict2
{'key1': 'value1', 'key2': {'mutable': False}}
>>> dict1
{'key1': 'value1', 'key2': {'mutable': False}}
>>> id(dict1)
140641197660704
>>> id(dict2)
140641196407832
>>> id(dict1["key2"])
140641176198960
>>> id(dict2["key2"])
140641176198960
You can see that even though we applied copy to dict1, the value of mutable changed to False in both dict2 and dict1, although we only changed it through dict2. This is because we mutated a dict that is nested inside dict1. copy() only makes a shallow copy: it creates a new dict whose values are references to the same objects as in the original. That is harmless for immutable values, which can only be replaced, but a mutable value is shared, so changes to it are visible through both dicts.
The ultimate solution is to make a deepcopy of dict1, which creates a completely new dict with all the values copied, including the mutable ones.
>>> import copy
>>> dict1 = {"key1" : "value1", "key2": {"mutable": True}}
>>> dict2 = copy.deepcopy(dict1)
>>> dict2
{'key1': 'value1', 'key2': {'mutable': True}}
>>> id(dict1)
140641196228824
>>> id(dict2)
140641197662072
>>> id(dict1["key2"])
140641178056312
>>> id(dict2["key2"])
140641197662000
>>> dict2["key2"]["mutable"] = False
>>> dict2
{'key1': 'value1', 'key2': {'mutable': False}}
>>> dict1
{'key1': 'value1', 'key2': {'mutable': True}}
As you can see, the ids are different, which means that dict2 is a completely new dict containing copies of all the values in dict1.
Use deepcopy whenever you want to change any of the mutable values without affecting the original dict. If not, a shallow copy is enough. deepcopy is slower, since it recursively copies every nested value in the original dict, and it also takes extra memory.

On Python 3.5+ there is an easier way to make a shallow copy: the ** unpacking operator, defined by PEP 448.
>>> dict1 = {"key1": "value1", "key2": "value2"}
>>> dict2 = {**dict1}
>>> print(dict2)
{'key1': 'value1', 'key2': 'value2'}
>>> dict2["key2"] = "WHY?!"
>>> print(dict1)
{'key1': 'value1', 'key2': 'value2'}
>>> print(dict2)
{'key1': 'value1', 'key2': 'WHY?!'}
** unpacks the dictionary into a new dictionary that is then assigned to dict2.
We can also confirm that each dictionary has a distinct id.
>>> id(dict1)
178192816
>>> id(dict2)
178192600
If a deep copy is needed then copy.deepcopy() is still the way to go.
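To make the "shallow" part concrete, here is a short sketch with a nested value. The `shallow` and `deep` names are mine, and the dict is borrowed from the nested example elsewhere in this thread.

```python
import copy

dict1 = {"key1": "value1", "key2": {"mutable": True}}
shallow = {**dict1}            # new outer dict, same nested dict
deep = copy.deepcopy(dict1)    # new outer dict and new nested dict

shallow["key2"]["mutable"] = False

print(dict1["key2"]["mutable"])   # False: the nested dict is shared with shallow
print(deep["key2"]["mutable"])    # True: the deep copy is unaffected
```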

The best and easiest ways to create a copy of a dict in both Python 2.7 and 3 are...
To create a copy of a simple (single-level) dictionary:
1. Using the dict() constructor, instead of creating a second reference to the existing dict.
my_dict1 = dict()
my_dict1["message"] = "Hello Python"
print(my_dict1) # {'message':'Hello Python'}
my_dict2 = dict(my_dict1)
print(my_dict2) # {'message':'Hello Python'}
# Made changes in my_dict1
my_dict1["name"] = "Emrit"
print(my_dict1) # {'message':'Hello Python', 'name' : 'Emrit'}
print(my_dict2) # {'message':'Hello Python'}
2. Using the built-in update() method of python dictionary.
my_dict2 = dict()
my_dict2.update(my_dict1)
print(my_dict2) # {'message':'Hello Python'}
# Made changes in my_dict1
my_dict1["name"] = "Emrit"
print(my_dict1) # {'message':'Hello Python', 'name' : 'Emrit'}
print(my_dict2) # {'message':'Hello Python'}
To create a copy of nested or complex dictionary:
Use the built-in copy module, which provides generic shallow and deep copy operations. This module is present in both Python 2.7 and 3.*
import copy
my_dict2 = copy.deepcopy(my_dict1)

You can also just make a new dictionary with a dictionary comprehension. This avoids importing copy.
dout = dict((k,v) for k,v in mydict.items())
Of course in python >= 2.7 you can do:
dout = {k:v for k,v in mydict.items()}
But for backwards compat., the top method is better.

In addition to the other provided solutions, you can use ** to unpack the dictionary into an empty dictionary, e.g.,
shallow_copy_of_other_dict = {**other_dict}
Now you will have a "shallow" copy of other_dict.
Applied to your example:
>>> dict1 = {"key1": "value1", "key2": "value2"}
>>> dict2 = {**dict1}
>>> dict2
{'key1': 'value1', 'key2': 'value2'}
>>> dict2["key2"] = "WHY?!"
>>> dict1
{'key1': 'value1', 'key2': 'value2'}
>>>
Pointer: Difference between shallow and deep copies

Assignment statements in Python do not copy objects; they create bindings between a target and an object.
So dict2 = dict1 creates another binding between dict2 and the object that dict1 refers to.
If you want to copy a dict, you can use the copy module.
The copy module has two functions:
copy.copy(x)
Return a shallow copy of x.
copy.deepcopy(x)
Return a deep copy of x.
The difference between shallow and deep copying is only relevant for compound objects (objects that contain other objects, like lists or class instances):
A shallow copy constructs a new compound object and then (to the extent possible) inserts references into it to the objects found in the original.
A deep copy constructs a new compound object and then, recursively, inserts copies into it of the objects found in the original.
For example, in python 2.7.9:
>>> import copy
>>> a = [1,2,3,4,['a', 'b']]
>>> b = a
>>> c = copy.copy(a)
>>> d = copy.deepcopy(a)
>>> a.append(5)
>>> a[4].append('c')
and the result is:
>>> a
[1, 2, 3, 4, ['a', 'b', 'c'], 5]
>>> b
[1, 2, 3, 4, ['a', 'b', 'c'], 5]
>>> c
[1, 2, 3, 4, ['a', 'b', 'c']]
>>> d
[1, 2, 3, 4, ['a', 'b']]

You can copy and edit the newly constructed copy in one go by calling the dict constructor with additional keyword arguments:
>>> dict1 = {"key1": "value1", "key2": "value2"}
>>> dict2 = dict(dict1, key2="WHY?!")
>>> dict1
{'key2': 'value2', 'key1': 'value1'}
>>> dict2
{'key2': 'WHY?!', 'key1': 'value1'}

This confused me too, initially, because I was coming from a C background.
In C, a variable is a location in memory with a defined type. Assigning to a variable copies the data into the variable's memory location.
But in Python, variables act more like pointers to objects. So assigning one variable to another doesn't make a copy, it just makes that variable name point to the same object.

dict1 is a symbol that references an underlying dictionary object. Assigning dict1 to dict2 merely assigns the same reference. Changing a key's value via the dict2 symbol changes the underlying object, which also affects dict1. This is confusing.
It is far easier to reason about immutable values than references, so make copies whenever possible:
person = {'name': 'Mary', 'age': 25}
one_year_later = {**person, 'age': 26} # does not mutate person dict
As long as the keys are strings, this is equivalent to:
one_year_later = dict(person, age=26)
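The two spellings give the same result for string keys, but the keyword form only accepts string keys. A quick sketch (the `scores` dict and the variable names are made up for illustration):

```python
person = {'name': 'Mary', 'age': 25}

with_unpacking = {**person, 'age': 26}
with_keywords = dict(person, age=26)
print(with_unpacking == with_keywords)   # True
print(person['age'])                     # 25, the original is untouched

scores = {1: 'low', 2: 'high'}
updated_scores = {**scores, 2: 'top'}    # non-string keys are fine here

keyword_error = None
try:
    dict(scores, **{2: 'top'})           # keyword arguments must be strings
except TypeError as exc:
    keyword_error = exc
print(type(keyword_error).__name__)      # TypeError
```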

Every variable in Python (stuff like dict1 or str or __builtins__) is a pointer to some hidden platonic "object" inside the machine.
If you set dict1 = dict2, you just point dict1 to the same object (or memory location, or whatever analogy you like) as dict2. Now, the object referenced by dict1 is the same object referenced by dict2.
You can check: dict1 is dict2 should be True. Also, id(dict1) should be the same as id(dict2).
You want dict1 = copy(dict2), or dict1 = deepcopy(dict2).
The difference between copy and deepcopy? deepcopy will make sure that the elements of dict2 (did you point it at a list?) are also copies.
I don't use deepcopy much - it's usually poor practice to write code that needs it (in my opinion).

dict2 = dict1 does not copy the dictionary. It simply gives you the programmer a second way (dict2) to refer to the same dictionary.

>>> dict2 = dict1
# dict2 is bound to the same dict object as dict1, so if you modify dict2, you also modify dict1
There are many ways to copy a dict object; I simply use
dict_1 = {
    'a': 1,
    'b': 2
}
dict_2 = {}
dict_2.update(dict_1)

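The following code is, for dicts that follow JSON syntax, more than 3 times faster than deepcopy:
import json
import logging
from copy import deepcopy

Logger = logging.getLogger(__name__)  # assumed stand-in; the original Logger was not shown

def CopyDict(dSrc):
    try:
        return json.loads(json.dumps(dSrc))
    except Exception as e:
        Logger.warning("Can't copy dict the preferred way:" + str(dSrc))
        return deepcopy(dSrc)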

For a nested dictionary, do not use dict(srcData), srcData.copy() or {**srcData}: if you change anything at the second level or deeper, the source dictionary is modified as well.
srcData = {
    'first': {
        'second': 'second Value'
    }
}
newData = dict(srcData) # srcData.copy() or {**srcData}
newData['first']['second'] = 'new Second Value'
print(srcData)
print(newData)
# it will print
# srcData: {'first': {'second': 'new Second Value'}}
# newData:{'first': {'second': 'new Second Value'}}
# but it should be
# srcData: {'first': {'second': 'second Value'}}
# newData:{'first': {'second': 'new Second Value'}}
Another option for a deep copy is the JSON trick, like JavaScript's JSON.parse(JSON.stringify(obj)):
import json
srcData = {'first': {'second': 'second Value'}}
newData = json.loads(json.dumps(srcData))
newData['first']['second'] = 'new Second Value'
print(srcData)
print(newData)
# srcData: {'first': {'second': 'second Value'}}
# newData: {'first': {'second': 'new Second Value'}}
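One caveat with the JSON round trip: it only preserves what JSON can represent. A small sketch of two silent conversions (the `src` dict is made up for illustration):

```python
import copy
import json

src = {'point': (1, 2), 1: 'one'}

via_json = json.loads(json.dumps(src))
via_deepcopy = copy.deepcopy(src)

print(via_json)       # {'point': [1, 2], '1': 'one'}
print(via_deepcopy)   # {'point': (1, 2), 1: 'one'}
```

The tuple comes back as a list and the int key comes back as a string, so the JSON copy is only faithful for data that is already JSON-shaped; values JSON cannot serialize at all (sets, for example) raise a TypeError instead.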

As others have explained, the built-in dict does not do what you want. But in Python2 (and probably 3 too) you can easily create a ValueDict class that copies with = so you can be sure that the original will not change.
class ValueDict(dict):
    def __ilshift__(self, args):
        result = ValueDict(self)
        if isinstance(args, dict):
            dict.update(result, args)
        else:
            dict.__setitem__(result, *args)
        return result # Pythonic LVALUE modification
    def __irshift__(self, args):
        result = ValueDict(self)
        dict.__delitem__(result, args)
        return result # Pythonic LVALUE modification
    def __setitem__(self, k, v):
        raise AttributeError, \
            "Use \"value_dict<<='%s', ...\" instead of \"d[%s] = ...\"" % (k,k)
    def __delitem__(self, k):
        raise AttributeError, \
            "Use \"value_dict>>='%s'\" instead of \"del d[%s]" % (k,k)
    def update(self, d2):
        raise AttributeError, \
            "Use \"value_dict<<=dict2\" instead of \"value_dict.update(dict2)\""
# test
d = ValueDict()
d <<='apples', 5
d <<='pears', 8
print "d =", d
e = d
e <<='bananas', 1
print "e =", e
print "d =", d
d >>='pears'
print "d =", d
d <<={'blueberries': 2, 'watermelons': 315}
print "d =", d
print "e =", e
print "e['bananas'] =", e['bananas']
# result
d = {'apples': 5, 'pears': 8}
e = {'apples': 5, 'pears': 8, 'bananas': 1}
d = {'apples': 5, 'pears': 8}
d = {'apples': 5}
d = {'watermelons': 315, 'blueberries': 2, 'apples': 5}
e = {'apples': 5, 'pears': 8, 'bananas': 1}
e['bananas'] = 1
# e[0]=3
# would give:
# AttributeError: Use "value_dict<<='0', ..." instead of "d[0] = ..."
Please refer to the lvalue modification pattern discussed here: Python 2.7 - clean syntax for lvalue modification. The key observation is that str and int behave as values in Python (even though they're actually immutable objects under the hood). While you're observing that, please also observe that nothing is magically special about str or int. dict can be used in much the same ways, and I can think of many cases where ValueDict makes sense.

I ran into a peculiar behavior when trying to deep copy a dictionary attribute of a class without first assigning it to a variable:
new = copy.deepcopy(my_class.a) doesn't work, i.e. modifying new modifies my_class.a,
but if you do old = my_class.a and then new = copy.deepcopy(old), it works perfectly, i.e. modifying new does not affect my_class.a.
I am not sure why this happens, but I hope it saves someone a few hours! :)

If your dict is typed as a Mapping, you can't .copy() it, but (on Python 3.9+, where dicts support the | operator) you can
dict2 = dict1 | {}
It's slightly cryptic, and I can't speak for performance compared to copy.copy(dict1), but it's very terse.

Copying by using a for loop:
orig = {"X2": 674.5, "X3": 245.0}
copy = {}
for key in orig:
    copy[key] = orig[key]
print(orig) # {'X2': 674.5, 'X3': 245.0}
print(copy) # {'X2': 674.5, 'X3': 245.0}
copy["X2"] = 808
print(orig) # {'X2': 674.5, 'X3': 245.0}
print(copy) # {'X2': 808, 'X3': 245.0}

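You can directly use:
dict2 = eval(repr(dict1))
where dict2 is an independent copy of dict1, so you can modify dict2 without affecting dict1.
This only works when every key and value has a repr that is a valid Python expression (numbers, strings, lists, dicts and the like), and eval should never be applied to untrusted data.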
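A sketch of where the repr round trip works and where it breaks. Using `ast.literal_eval` in place of `eval` is my substitution, not part of the answer above, and the `Thing` class is a made-up placeholder.

```python
import ast

dict1 = {'a': [1, 2], 'b': {'c': None}}

# literal_eval only parses Python literals and never executes arbitrary code
dict2 = ast.literal_eval(repr(dict1))
dict2['a'].append(3)
print(dict1['a'])   # [1, 2], the original is unaffected


class Thing:
    pass


round_trip_failed = False
try:
    # the default repr looks like <...Thing object at 0x...>, which is not valid Python
    eval(repr({'t': Thing()}))
except SyntaxError:
    round_trip_failed = True
print(round_trip_failed)   # True
```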

Another, cleaner way would be to use json. See the code below:
>>> import json
>>> a = [{"name":"Onkar","Address": {"state":"MH","country":"India","innerAddress":{"city":"Pune"}}}]
>>> b = json.dumps(a)
>>> b = json.loads(b)
>>> id(a)
2334461105416
>>> id(b)
2334461105224
>>> a[0]["Address"]["innerAddress"]["city"]="Nagpur"
>>> a
[{'name': 'Onkar', 'Address': {'state': 'MH', 'country': 'India', 'innerAddress': {'city': 'Nagpur'}}}]
>>> b
[{'name': 'Onkar', 'Address': {'state': 'MH', 'country': 'India', 'innerAddress': {'city': 'Pune'}}}]
>>> id(a[0]["Address"]["innerAddress"])
2334460618376
>>> id(b[0]["Address"]["innerAddress"])
2334424569880
To create an independent copy, call json.dumps() and then json.loads() on the dictionary object. You will get a separate dict object.

Editing values in dictionary of dictionaries? [duplicate]

I set dict2 = dict1. When I edit dict2, the original dict1 also changes. Why?
>>> dict1 = {"key1": "value1", "key2": "value2"}
>>> dict2 = dict1
>>> dict2["key2"] = "WHY?!"
>>> dict1
{'key2': 'WHY?!', 'key1': 'value1'}

Python never implicitly copies objects. When you set dict2 = dict1, you are making them refer to the same exact dict object, so when you mutate it, all references to it keep referring to the object in its current state.
If you want to copy the dict (which is rare), you have to do so explicitly with
dict2 = dict(dict1)
or
dict2 = dict1.copy()

When you assign dict2 = dict1, you are not making a copy of dict1, it results in dict2 being just another name for dict1.
To copy the mutable types like dictionaries, use copy / deepcopy of the copy module.
import copy
dict2 = copy.deepcopy(dict1)

While dict.copy() and dict(dict1) generates a copy, they are only shallow copies. If you want a deep copy, copy.deepcopy(dict1) is required. An example:
>>> source = {'a': 1, 'b': {'m': 4, 'n': 5, 'o': 6}, 'c': 3}
>>> copy1 = source.copy()
>>> copy2 = dict(source)
>>> import copy
>>> copy3 = copy.deepcopy(source)
>>> source['a'] = 10 # a change to first-level properties won't affect copies
>>> source
{'a': 10, 'c': 3, 'b': {'m': 4, 'o': 6, 'n': 5}}
>>> copy1
{'a': 1, 'c': 3, 'b': {'m': 4, 'o': 6, 'n': 5}}
>>> copy2
{'a': 1, 'c': 3, 'b': {'m': 4, 'o': 6, 'n': 5}}
>>> copy3
{'a': 1, 'c': 3, 'b': {'m': 4, 'o': 6, 'n': 5}}
>>> source['b']['m'] = 40 # a change to deep properties WILL affect shallow copies 'b.m' property
>>> source
{'a': 10, 'c': 3, 'b': {'m': 40, 'o': 6, 'n': 5}}
>>> copy1
{'a': 1, 'c': 3, 'b': {'m': 40, 'o': 6, 'n': 5}}
>>> copy2
{'a': 1, 'c': 3, 'b': {'m': 40, 'o': 6, 'n': 5}}
>>> copy3 # Deep copy's 'b.m' property is unaffected
{'a': 1, 'c': 3, 'b': {'m': 4, 'o': 6, 'n': 5}}
Regarding shallow vs deep copies, from the Python copy module docs:
The difference between shallow and deep copying is only relevant for compound objects (objects that contain other objects, like lists or class instances):
A shallow copy constructs a new compound object and then (to the extent possible) inserts references into it to the objects found in the original.
A deep copy constructs a new compound object and then, recursively, inserts copies into it of the objects found in the original.

In Depth and an easy way to remember:
Whenever you do dict2 = dict1, dict2 refers to dict1. Both dict1 and dict2 points to the same location in the memory. This is just a normal case while working with mutable objects in python. When you are working with mutable objects in python you must be careful as it is hard to debug.
Instead of using dict2 = dict1, you should be using copy(shallow copy) and deepcopy method from python's copy module to separate dict2 from dict1.
The correct way is:
>>> dict1 = {"key1": "value1", "key2": "value2"}
>>> dict2 = dict1.copy()
>>> dict2
{'key1': 'value1', 'key2': 'value2'}
>>> dict2["key2"] = "WHY?"
>>> dict2
{'key1': 'value1', 'key2': 'WHY?'}
>>> dict1
{'key1': 'value1', 'key2': 'value2'}
>>> id(dict1)
140641178056312
>>> id(dict2)
140641176198960
>>>
As you can see the id of both dict1 and dict2 are different, which means both are pointing/referencing to different locations in the memory.
This solution works for dictionaries with immutable values, this is not the correct solution for those with mutable values.
Eg:
>>> import copy
>>> dict1 = {"key1" : "value1", "key2": {"mutable": True}}
>>> dict2 = dict1.copy()
>>> dict2
{'key1': 'value1', 'key2': {'mutable': True}}
>>> dict2["key2"]["mutable"] = False
>>> dict2
{'key1': 'value1', 'key2': {'mutable': False}}
>>> dict1
{'key1': 'value1', 'key2': {'mutable': False}}
>>> id(dict1)
140641197660704
>>> id(dict2)
140641196407832
>>> id(dict1["key2"])
140641176198960
>>> id(dict2["key2"])
140641176198960
You can see that even though we applied copy for dict1, the value of mutable is changed to false on both dict2 and dict1 even though we only change it on dict2. This is because we changed the value of a mutable dict part of the dict1. When we apply a copy on dict, it will only do a shallow copy which means it copies all the immutable values into a new dict and does not copy the mutable values but it will reference them.
The ultimate solution is to do a deepycopy of dict1 to completely create a new dict with all the values copied, including mutable values.
>>>import copy
>>> dict1 = {"key1" : "value1", "key2": {"mutable": True}}
>>> dict2 = copy.deepcopy(dict1)
>>> dict2
{'key1': 'value1', 'key2': {'mutable': True}}
>>> id(dict1)
140641196228824
>>> id(dict2)
140641197662072
>>> id(dict1["key2"])
140641178056312
>>> id(dict2["key2"])
140641197662000
>>> dict2["key2"]["mutable"] = False
>>> dict2
{'key1': 'value1', 'key2': {'mutable': False}}
>>> dict1
{'key1': 'value1', 'key2': {'mutable': True}}
As you can see, id's are different, it means that dict2 is completely a new dict with all the values in dict1.
Deepcopy needs to be used if whenever you want to change any of the mutable values without affecting the original dict. If not you can use shallow copy. Deepcopy is slow as it works recursively to copy any nested values in the original dict and also takes extra memory.

On python 3.5+ there is an easier way to achieve a shallow copy by using the ** unpackaging operator. Defined by Pep 448.
>>>dict1 = {"key1": "value1", "key2": "value2"}
>>>dict2 = {**dict1}
>>>print(dict2)
{'key1': 'value1', 'key2': 'value2'}
>>>dict2["key2"] = "WHY?!"
>>>print(dict1)
{'key1': 'value1', 'key2': 'value2'}
>>>print(dict2)
{'key1': 'value1', 'key2': 'WHY?!'}
** unpackages the dictionary into a new dictionary that is then assigned to dict2.
We can also confirm that each dictionary has a distinct id.
>>>id(dict1)
178192816
>>>id(dict2)
178192600
If a deep copy is needed then copy.deepcopy() is still the way to go.

The best and the easiest ways to create a copy of a dict in both Python 2.7 and 3 are...
To create a copy of simple(single-level) dictionary:
1. Using dict() method, instead of generating a reference that points to the existing dict.
my_dict1 = dict()
my_dict1["message"] = "Hello Python"
print(my_dict1) # {'message':'Hello Python'}
my_dict2 = dict(my_dict1)
print(my_dict2) # {'message':'Hello Python'}
# Made changes in my_dict1
my_dict1["name"] = "Emrit"
print(my_dict1) # {'message':'Hello Python', 'name' : 'Emrit'}
print(my_dict2) # {'message':'Hello Python'}
2. Using the built-in update() method of python dictionary.
my_dict2 = dict()
my_dict2.update(my_dict1)
print(my_dict2) # {'message':'Hello Python'}
# Made changes in my_dict1
my_dict1["name"] = "Emrit"
print(my_dict1) # {'message':'Hello Python', 'name' : 'Emrit'}
print(my_dict2) # {'message':'Hello Python'}
To create a copy of nested or complex dictionary:
Use the built-in copy module, which provides a generic shallow and deep copy operations. This module is present in both Python 2.7 and 3.*
import copy
my_dict2 = copy.deepcopy(my_dict1)

You can also just make a new dictionary with a dictionary comprehension. This avoids importing copy.
dout = dict((k,v) for k,v in mydict.items())
Of course in python >= 2.7 you can do:
dout = {k:v for k,v in mydict.items()}
But for backwards compat., the top method is better.

In addition to the other provided solutions, you can use ** to integrate the dictionary into an empty dictionary, e.g.,
shallow_copy_of_other_dict = {**other_dict}.
Now you will have a "shallow" copy of other_dict.
Applied to your example:
>>> dict1 = {"key1": "value1", "key2": "value2"}
>>> dict2 = {**dict1}
>>> dict2
{'key1': 'value1', 'key2': 'value2'}
>>> dict2["key2"] = "WHY?!"
>>> dict1
{'key1': 'value1', 'key2': 'value2'}
>>>
Pointer: Difference between shallow and deep copys

Assignment statements in Python do not copy objects; they create bindings between a target and an object.
So dict2 = dict1 creates another binding between dict2 and the object that dict1 refers to.
If you want to copy a dict, you can use the copy module.
The copy module has two functions:
copy.copy(x)
Return a shallow copy of x.
copy.deepcopy(x)
Return a deep copy of x.
The difference between shallow and deep copying is only relevant for compound objects (objects that contain other objects, like lists or class instances):
A shallow copy constructs a new compound object and then (to the extent possible) inserts references into it to the objects found in the original.
A deep copy constructs a new compound object and then, recursively, inserts copies into it of the objects found in the original.
For example, in Python 2.7.9:
>>> import copy
>>> a = [1,2,3,4,['a', 'b']]
>>> b = a
>>> c = copy.copy(a)
>>> d = copy.deepcopy(a)
>>> a.append(5)
>>> a[4].append('c')
and the result is:
>>> a
[1, 2, 3, 4, ['a', 'b', 'c'], 5]
>>> b
[1, 2, 3, 4, ['a', 'b', 'c'], 5]
>>> c
[1, 2, 3, 4, ['a', 'b', 'c']]
>>> d
[1, 2, 3, 4, ['a', 'b']]

You can copy and edit the newly constructed copy in one go by calling the dict constructor with additional keyword arguments:
>>> dict1 = {"key1": "value1", "key2": "value2"}
>>> dict2 = dict(dict1, key2="WHY?!")
>>> dict1
{'key2': 'value2', 'key1': 'value1'}
>>> dict2
{'key2': 'WHY?!', 'key1': 'value1'}

This confused me too, initially, because I was coming from a C background.
In C, a variable is a location in memory with a defined type. Assigning to a variable copies the data into the variable's memory location.
But in Python, variables act more like pointers to objects. So assigning one variable to another doesn't make a copy, it just makes that variable name point to the same object.
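The difference between binding a second name and making a copy can be checked directly with the is operator. This is a minimal sketch; the names dict1, dict2 and dict3 are chosen only for illustration.

```python
dict1 = {"key1": "value1"}
dict2 = dict1          # second name for the same object, no copy is made
dict3 = dict1.copy()   # a new (shallow) dictionary object

print(dict2 is dict1)  # True
print(dict3 is dict1)  # False
print(dict3 == dict1)  # True (equal contents, different objects)

dict2["key1"] = "changed"
print(dict1["key1"])   # changed
print(dict3["key1"])   # value1
```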

dict1 is a symbol that references an underlying dictionary object. Assigning dict1 to dict2 merely assigns the same reference. Changing a key's value via the dict2 symbol changes the underlying object, which also affects dict1. This is confusing.
It is far easier to reason about immutable values than references, so make copies whenever possible:
person = {'name': 'Mary', 'age': 25}
one_year_later = {**person, 'age': 26} # does not mutate person dict
This produces the same result as the following (the keyword form only works for keys that are valid keyword names):
one_year_later = dict(person, age=26)
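As a quick check of the two spellings, the sketch below compares them and shows one case where only the unpacking form applies. The extra keys are invented for illustration.

```python
person = {"name": "Mary", "age": 25}

with_unpacking = {**person, "age": 26}
with_constructor = dict(person, age=26)

print(with_unpacking == with_constructor)  # True
print(person)                              # {'name': 'Mary', 'age': 25}

# Keys that are not valid identifiers cannot be passed as keyword
# arguments, so only the unpacking form can add them in one expression.
extended = {**person, "favourite colour": "green", 1: "one"}
print(extended)
```

Both forms produce shallow copies, so nested values are still shared with person.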

Every variable in Python (stuff like dict1 or str or __builtins__) is a pointer to some hidden platonic "object" inside the machine.
If you set dict1 = dict2, you just point dict1 to the same object (or memory location, or whatever analogy you like) as dict2. Now, the object referenced by dict1 is the same object referenced by dict2.
You can check: dict1 is dict2 should be True. Also, id(dict1) should be the same as id(dict2).
You want dict1 = copy.copy(dict2), or dict1 = copy.deepcopy(dict2) (after import copy).
The difference between copy and deepcopy? deepcopy will make sure that the elements of dict2 (did you point it at a list?) are also copies.
I don't use deepcopy much - it's usually poor practice to write code that needs it (in my opinion).

dict2 = dict1 does not copy the dictionary. It simply gives you, the programmer, a second way (dict2) to refer to the same dictionary.

>>> dict2 = dict1
# dict2 is bound to the same dict object as dict1, so modifying dict2 also modifies dict1
There are many ways to copy a dict object; I simply use
dict_1 = {
    'a': 1,
    'b': 2
}
dict_2 = {}
dict_2.update(dict_1)

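The following code, which works on dicts that follow JSON syntax, is more than 3 times faster than deepcopy:
import json
import logging
from copy import deepcopy

Logger = logging.getLogger(__name__)  # assumption: any logger with .warning() works here

def CopyDict(dSrc):
    try:
        return json.loads(json.dumps(dSrc))
    except Exception as e:
        Logger.warning("Can't copy dict the preferred way:" + str(dSrc))
        return deepcopy(dSrc)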

For a nested dictionary, do not use dict(srcData), srcData.copy(), or {**srcData}: if you change values at the second level or deeper, the source dictionary is modified too.
srcData = {
    'first': {
        'second': 'second Value'
    }
}
newData = dict(srcData) # srcData.copy() or {**srcData}
newData['first']['second'] = 'new Second Value'
print(srcData)
print(newData)
# it will print
# srcData: {'first': {'second': 'new Second Value'}}
# newData:{'first': {'second': 'new Second Value'}}
# but it should be
# srcData: {'first': {'second': 'second Value'}}
# newData:{'first': {'second': 'new Second Value'}}
Another option for a deep copy is the JSON trick, like JavaScript's JSON.parse(JSON.stringify(obj)):
import json
srcData = {'first': {'second': 'second Value'}}
newData = json.loads(json.dumps(srcData))
newData['first']['second'] = 'new Second Value'
print(srcData)
print(newData)
# srcData: {'first': {'second': 'second Value'}}
# newData: {'first': {'second': 'new Second Value'}}
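The JSON round trip is only a faithful copy when the data already fits JSON's types. The sketch below, with made-up data, shows two cases where it differs from copy.deepcopy() and one where it fails outright.

```python
import copy
import json

# Hypothetical data containing values that JSON cannot represent exactly.
src = {1: "int key", "point": (1, 2)}

via_json = json.loads(json.dumps(src))
via_deepcopy = copy.deepcopy(src)

print(via_json)             # {'1': 'int key', 'point': [1, 2]}
print(via_deepcopy)         # {1: 'int key', 'point': (1, 2)}
print(via_json == src)      # False: keys became strings, the tuple became a list
print(via_deepcopy == src)  # True

# Values such as sets are rejected by json.dumps.
try:
    json.dumps({"tags": {"a", "b"}})
except TypeError as exc:
    print("TypeError:", exc)
```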

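As others have explained, the built-in dict does not do what you want. But in Python 2 you can easily create a ValueDict class whose <<= and >>= operators return a modified copy, so you can be sure that the original will not change. (The code below uses Python 2 raise and print statements, which would need converting for Python 3.)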
class ValueDict(dict):
    def __ilshift__(self, args):
        result = ValueDict(self)
        if isinstance(args, dict):
            dict.update(result, args)
        else:
            dict.__setitem__(result, *args)
        return result # Pythonic LVALUE modification
    def __irshift__(self, args):
        result = ValueDict(self)
        dict.__delitem__(result, args)
        return result # Pythonic LVALUE modification
    def __setitem__(self, k, v):
        raise AttributeError, \
            "Use \"value_dict<<='%s', ...\" instead of \"d[%s] = ...\"" % (k,k)
    def __delitem__(self, k):
        raise AttributeError, \
            "Use \"value_dict>>='%s'\" instead of \"del d[%s]\"" % (k,k)
    def update(self, d2):
        raise AttributeError, \
            "Use \"value_dict<<=dict2\" instead of \"value_dict.update(dict2)\""
# test
d = ValueDict()
d <<='apples', 5
d <<='pears', 8
print "d =", d
e = d
e <<='bananas', 1
print "e =", e
print "d =", d
d >>='pears'
print "d =", d
d <<={'blueberries': 2, 'watermelons': 315}
print "d =", d
print "e =", e
print "e['bananas'] =", e['bananas']
# result
d = {'apples': 5, 'pears': 8}
e = {'apples': 5, 'pears': 8, 'bananas': 1}
d = {'apples': 5, 'pears': 8}
d = {'apples': 5}
d = {'watermelons': 315, 'blueberries': 2, 'apples': 5}
e = {'apples': 5, 'pears': 8, 'bananas': 1}
e['bananas'] = 1
# e[0]=3
# would give:
# AttributeError: Use "value_dict<<='0', ..." instead of "d[0] = ..."
Please refer to the lvalue modification pattern discussed here: Python 2.7 - clean syntax for lvalue modification. The key observation is that str and int behave as values in Python (even though they're actually immutable objects under the hood). While you're observing that, please also observe that nothing is magically special about str or int. dict can be used in much the same ways, and I can think of many cases where ValueDict makes sense.
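For readers on Python 3, the same idea might be rendered as in the sketch below. It is an illustrative port rather than the author's code: it only converts the raise and print statements, rewords the error messages, and keeps the <<= / >>= design unchanged.

```python
class ValueDict(dict):
    """Dict whose <<= and >>= build a modified copy (illustrative Python 3 sketch)."""

    def __ilshift__(self, args):
        result = ValueDict(self)
        if isinstance(args, dict):
            dict.update(result, args)        # merge a whole dict
        else:
            dict.__setitem__(result, *args)  # args is a (key, value) tuple
        return result

    def __irshift__(self, key):
        result = ValueDict(self)
        dict.__delitem__(result, key)
        return result

    def __setitem__(self, k, v):
        raise AttributeError(f'Use "value_dict <<= {k!r}, ..." instead of "d[{k!r}] = ..."')

    def __delitem__(self, k):
        raise AttributeError(f'Use "value_dict >>= {k!r}" instead of "del d[{k!r}]"')

    def update(self, d2):
        raise AttributeError('Use "value_dict <<= dict2" instead of "value_dict.update(dict2)"')


d = ValueDict()
d <<= "apples", 5
d <<= "pears", 8
e = d                # e and d name the same object for now
e <<= "bananas", 1   # rebinds e to a new ValueDict; d is untouched
print("d =", d)      # d = {'apples': 5, 'pears': 8}
print("e =", e)      # e = {'apples': 5, 'pears': 8, 'bananas': 1}
d >>= "pears"
print("d =", d)      # d = {'apples': 5}
```

Because the augmented assignment rebinds the name on its left, every "modification" produces a new object, which is what keeps other names pointing at the old contents.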

I ran into a peculiar behavior when trying to deep copy a dictionary attribute of a class without first assigning it to a variable.
new = copy.deepcopy(my_class.a) doesn't work, i.e. modifying new modifies my_class.a,
but if you do old = my_class.a and then new = copy.deepcopy(old), it works perfectly, i.e. modifying new does not affect my_class.a.
I am not sure why this happens, but I hope it helps save some hours! :)

If your dict is typed as a Mapping, you can't call .copy() on it, but (on Python 3.9+, where dict supports the | operator) you can write:
dict2 = dict1 | {}
It's slightly cryptic, and I can't speak for performance compared to copy.copy(dict1), but it's very terse.

Copying by using a for loop:
orig = {"X2": 674.5, "X3": 245.0}
copy = {}
for key in orig:
    copy[key] = orig[key]
print(orig) # {'X2': 674.5, 'X3': 245.0}
print(copy) # {'X2': 674.5, 'X3': 245.0}
copy["X2"] = 808
print(orig) # {'X2': 674.5, 'X3': 245.0}
print(copy) # {'X2': 808, 'X3': 245.0}

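You can also use:
dict2 = eval(repr(dict1))
where dict2 is an independent copy of dict1, so you can modify dict2 without affecting dict1.
This works only for objects whose repr() is valid Python source that rebuilds them (numbers, strings, and lists or dicts of these), and eval should never be called on untrusted input.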
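The round trip relies on repr() producing valid Python source, which holds for plain literals but not for arbitrary objects. The sketch below uses made-up data and swaps in ast.literal_eval, a standard-library function that accepts only literals, as a safer stand-in for eval.

```python
import ast

dict1 = {"a": [1, 2], "b": {"c": 3.5}}

# For plain literals, ast.literal_eval rebuilds an independent copy.
dict2 = ast.literal_eval(repr(dict1))
dict2["a"].append(99)
print(dict1)  # {'a': [1, 2], 'b': {'c': 3.5}}
print(dict2)  # {'a': [1, 2, 99], 'b': {'c': 3.5}}

# An object whose repr is not valid Python source cannot be rebuilt this way.
dict3 = {"handle": object()}
try:
    eval(repr(dict3))
except SyntaxError as exc:
    print("SyntaxError:", exc.msg)
```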

Another clean way is to use json; see the code below.
>>> import json
>>> a = [{"name":"Onkar","Address": {"state":"MH","country":"India","innerAddress":{"city":"Pune"}}}]
>>> b = json.dumps(a)
>>> b = json.loads(b)
>>> id(a)
2334461105416
>>> id(b)
2334461105224
>>> a[0]["Address"]["innerAddress"]["city"]="Nagpur"
>>> a
[{'name': 'Onkar', 'Address': {'state': 'MH', 'country': 'India', 'innerAddress': {'city': 'Nagpur'}}}]
>>> b
[{'name': 'Onkar', 'Address': {'state': 'MH', 'country': 'India', 'innerAddress': {'city': 'Pune'}}}]
>>> id(a[0]["Address"]["innerAddress"])
2334460618376
>>> id(b[0]["Address"]["innerAddress"])
2334424569880
To create another dictionary, call json.dumps() and then json.loads() on the dictionary object. You will get a separate dict object.

Python adding dictionary values with same key within a list

I picked up Python not too long ago.
An example is below.
I have a list of dictionaries:
myword = [{'a': 2},{'b':3},{'c':4},{'a':1}]
I need to change it to the output below:
[{'a':3} , {'b':3} , {'c':4}]
Is there a way to add the values together? I tried using Counter, but it prints out each dict separately.
What I did using Counter:
from collections import Counter
for i in range(0, 4, 1):
    text = myword[i]
    print(Counter(text))
The output
Counter({'a': 2})
Counter({'b': 3})
Counter({'c': 4})
Counter({'a': 1})
I have read the link below, but it only compares two dicts.
Is there a better way to compare dictionary values
Thanks!

Merge the dictionaries into one Counter, then split it back into single-key dictionaries.
>>> from collections import Counter
>>> myword = [{'a': 2}, {'b':3}, {'c':4}, {'a':1}]
>>> c = Counter()
>>> for d in myword:
...     c.update(d)
...
>>> [{key: value} for key, value in c.items()]
[{'a': 3}, {'c': 4}, {'b': 3}]
>>> [{key: value} for key, value in sorted(c.items())]
[{'a': 3}, {'b': 3}, {'c': 4}]
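The same merge can also be written without an explicit loop by adding the Counters together. This is a sketch of an alternative spelling, not part of the answer above; note that Counter addition drops keys whose total is zero or negative, which does not matter for the positive values used here.

```python
from collections import Counter

myword = [{'a': 2}, {'b': 3}, {'c': 4}, {'a': 1}]

# Start from an empty Counter and add one Counter per dictionary.
total = sum((Counter(d) for d in myword), Counter())

result = [{key: value} for key, value in sorted(total.items())]
print(result)  # [{'a': 3}, {'b': 3}, {'c': 4}]
```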
