Python - Updating value in one dictionary is updating value in all dictionaries - python

I have a list of dictionaries called lod. All dictionaries have the same keys but different values. I am trying to update one specific value in the list of values for the same key in all the dictionaries.
I am attempting to do it with the following for loop:
for i in range(len(lod)):
    a=lod[i][key][:]
    a[p]=a[p]+lov[i]
    lod[i][key]=a
What's happening is that each dictionary is getting updated len(lod) times, so lod[0][key][p] is supposed to have lov[0] added to it, but instead it is getting lov[0]+lov[1]+... added to it.
What am I doing wrong?
Here is how I declared the list of dicts:
lod = [{} for _ in range(len(dataul))]
for j in range(len(dataul)):
    for i in datakl:
        rrdict[str.split(i,',')[0]]=list(str.split(i,',')[1:len(str.split(i,','))])
    lod[j]=rrdict

The problem is in how you created the list of dictionaries. You probably did something like this:
list_of_dicts = [{}] * 20
That's actually the same dict 20 times. Try doing something like this:
list_of_dicts = [{} for _ in range(20)]
Without seeing how you actually created it, this is only an example solution to an example problem.
To know for sure, print this:
[id(x) for x in list_of_dicts]
If you defined it in the * 20 method, the id is the same for each dict. In the list comprehension method, the id is unique.
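A minimal sketch of the difference, using 3 dicts instead of 20 to keep the output short. The names `shared` and `distinct` are only for illustration and do not come from the question.

```python
# Multiplying a list copies references, not the dict itself.
shared = [{}] * 3
shared[0]["key"] = "value"          # mutates the one dict every slot points to

# A list comprehension evaluates {} once per iteration.
distinct = [{} for _ in range(3)]
distinct[0]["key"] = "value"        # only the first dict changes

print(shared)                           # every slot shows the new key
print(distinct)                         # only the first slot shows it
print(len({id(d) for d in shared}))     # 1 distinct object
print(len({id(d) for d in distinct}))   # 3 distinct objects
```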

This is where the trouble starts: lod[j] = rrdict. lod itself is created properly with different dictionaries. Unfortunately, afterwards every reference to the original dictionaries in the list gets overwritten with a reference to rrdict. So in the end, the list contains only references to one single dictionary. Here is a more Pythonic and readable way to solve your problem:
lod = [{} for _ in range(len(dataul))]
for rrdict in lod:
    for line in datakl:
        splt = line.split(',')
        rrdict[splt[0]] = splt[1:]

You created the list of dictionaries correctly, as per other answer.
However, when you are updating individual dictionaries, you completely overwrite the list.
Removing noise from your code snippet:
lod = [{} for _ in range(whatever)]
for j in range(whatever):
    # rrdict = lod[j] # Uncomment this as a possible fix.
    for i in range(whatever):
        rrdict[somekey] = somevalue
    lod[j] = rrdict
Assignment on the last line throws away the empty dict that was in lod[j] and inserts a reference to the object represented by rrdict.
Not sure what your code does, but see a commented-out line - it might be the fix you are looking for.
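To make the aliasing concrete, here is a small runnable sketch. The contents of `datakl` and the value of `n` are made-up stand-ins for the question's data, and `aliased`, `independent` and `fresh` are illustrative names. It builds the list once with the buggy pattern and once with a fresh dict per outer iteration.

```python
# Hypothetical stand-ins for the question's data.
datakl = ["a,1,2", "b,3,4"]
n = 3  # stands in for len(dataul)

# Buggy pattern: one rrdict object is assigned to every slot.
rrdict = {}
aliased = [{} for _ in range(n)]
for j in range(n):
    for line in datakl:
        parts = line.split(",")
        rrdict[parts[0]] = parts[1:]
    aliased[j] = rrdict             # every slot now references the same dict

# Fixed pattern: build a fresh dict on each pass of the outer loop.
independent = []
for j in range(n):
    fresh = {}
    for line in datakl:
        parts = line.split(",")
        fresh[parts[0]] = parts[1:]
    independent.append(fresh)

aliased[0]["a"][0] = "changed"
independent[0]["a"][0] = "changed"

print(aliased[1]["a"])       # ['changed', '2']: the edit leaked
print(independent[1]["a"])   # ['1', '2']: unaffected
```

With the aliased list, the loop from the question adds lov[i] to the same underlying values on every iteration, which matches the accumulated sums described there.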

Related

Python - How to create sublists from list of strings based on part of the string?

I saw similar questions but unfortunately I didn't find an answer to my problem.
I have a list:
list = ['a_abc', 'a_xyz', 'a_foo', 'b_abc', 'b_xyz', 'b_foo']
I want to split this list into 3 based on character after underscore _.
Desired output would be:
list_1 = ['a_abc', 'b_abc']
list_2 = ['a_xyz', 'b_xyz']
list_3 = ['a_foo', 'b_foo']
I would like to avoid something like:
for element in list:
    if 'abc' in element...
    if 'xyz' in element...
because I have over 200 strings to group in this way in my use case. So code "should recognize" the same part of the string (after underscore) and group this in sublists.
Since I didn't find a similar issue, any advice is highly appreciated.
You shouldn't want to do this with one or more lists, because you don't know at runtime how many there are (or, even if you know, it will be repeated code).
Instead, you can use defaultdict; it's like a regular dictionary, but it handles a missing key by creating a new value with the factory you specify.
In this case, defaultdict(list) means to create a dictionary with a list factory; when a key is missing, the object will create an empty list for that key.
from collections import defaultdict
l = ['a_abc', 'a_xyz', 'a_foo', 'b_abc', 'b_xyz', 'b_foo']
d = defaultdict(list)
for el in l:
    key = el.split("_")[1]
    # key = el[2:] # use this if the format of elements is <letter>_<other_chars>
    d[key].append(el)
print(d)
# defaultdict(<class 'list'>, {'abc': ['a_abc', 'b_abc'], 'xyz': ['a_xyz', 'b_xyz'], 'foo': ['a_foo', 'b_foo']})
print(d["abc"])
# ['a_abc', 'b_abc']
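If plain sublists are what you need in the end, the grouping can be wrapped in a small helper. This is only a sketch: `group_by_suffix` is a hypothetical name, and it assumes the group key is everything after the first underscore, so 'a_b_c' would be grouped under 'b_c' (the `split("_")[1]` version above would use 'b').

```python
from collections import defaultdict

def group_by_suffix(items, sep="_"):
    """Group strings by the text after the first separator."""
    groups = defaultdict(list)
    for item in items:
        # partition returns (head, sep, tail); tail is '' when sep is absent
        _, _, suffix = item.partition(sep)
        groups[suffix].append(item)
    return dict(groups)

names = ['a_abc', 'a_xyz', 'a_foo', 'b_abc', 'b_xyz', 'b_foo']
groups = group_by_suffix(names)
sublists = list(groups.values())   # insertion order is kept on Python 3.7+
print(sublists)
# [['a_abc', 'b_abc'], ['a_xyz', 'b_xyz'], ['a_foo', 'b_foo']]
```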

Dealing with lists in a dictionary

I am iterating through some folders to read all the files in them, so that I can later move the ones that are not rejected. As the number of folders and files may vary, I created a dictionary where each folder is a key and its files are the value. In a dummy situation I have:
Iterating through the source folders (their number is known but may vary)
sourcefolder = (r"C:\User\Desktop\Test")
subfolders = 3
for i in range(subfolders):
    Lst_All["allfiles" + str(i)] = os.listdir(sourcefolder[i])
This results in the dictionary below:
Lst_All = {
allfiles0: ('A.1.txt', 'A.txt', 'rejected.txt')
allfiles1: ('B.txt')
allfiles2: ('C.txt')}
My issue is to remove the rejected files so I can do a shutil.move() with only valid files.
So far I got:
for k, v in lst_All.items():
    for i in v:
        if i == "rejected.txt":
            del lst_All[i]
but it returns an error KeyError: 'rejected.txt'. Any thoughts? Perhaps another way to create the list of items to be moved?
Thanks!
For a start, the members of your dictionary are tuples, not lists. Tuples are immutable, so we can't remove items as easily as we can with lists. To replicate the functionality I think you're after, we can do the following:
Lst_All = {'allfiles0': ('A.1.txt', 'A.txt', 'rejected.txt'),
'allfiles1': ('B.txt',),
'allfiles2': ('C.txt',)}
Lst_All = {k: tuple(x for x in v if x!="rejected.txt") for k, v in Lst_All.items()}
Which gives us:
>>> Lst_All
{'allfiles0': ('A.1.txt', 'A.txt'),
'allfiles1': ('B.txt',),
'allfiles2': ('C.txt',)}
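As for the KeyError itself: in `del lst_All[i]`, `i` is a file name taken from one of the tuples, while the dictionary's keys are the folder labels, so the lookup fails. A small sketch that reproduces this, using a dictionary modeled on the dummy data (the lower-case name `lst_all` and the variable `caught` are illustrative):

```python
lst_all = {
    'allfiles0': ('A.1.txt', 'A.txt', 'rejected.txt'),
    'allfiles1': ('B.txt',),
}

caught = None
try:
    # 'rejected.txt' is a member of a tuple value, not a key of lst_all
    del lst_all['rejected.txt']
except KeyError as exc:
    caught = exc

print(type(caught).__name__)                   # KeyError
print('rejected.txt' in lst_all)               # False: not a key
print('rejected.txt' in lst_all['allfiles0'])  # True: it is a tuple member
```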
You should not iterate over a dictionary while removing elements from it inside the loop. It is better to make a list of the keys and iterate over that. Also, you do not need a separate loop to check whether rejected.txt is present in a directory's tuple. Note that the following removes the whole entry for any folder that contains rejected.txt:
keys = list(lst_All.keys())
for k in keys:
    if "rejected.txt" in lst_All[k]:
        del lst_All[k]
If you want to remove only rejected.txt, you have to create another tuple without that element and store it in the dictionary under the same key. You can do that like this:
keys = list(lst_All.keys())
for k in keys:
    lst_All[k] = tuple(e for e in lst_All[k] if e != 'rejected.txt')
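Another option is to leave the rejected file out while the dictionary is being built, so nothing needs to be deleted afterwards. The sketch below is one possible way to do that; `list_valid_files` and its `rejected` parameter are hypothetical names, and the demo creates a throwaway directory tree so it can run anywhere instead of relying on the paths from the question.

```python
import os
import tempfile

def list_valid_files(folders, rejected=("rejected.txt",)):
    """Map each folder path to a sorted list of its non-rejected entries."""
    return {
        folder: sorted(name for name in os.listdir(folder) if name not in rejected)
        for folder in folders
    }

# Demo in a temporary directory (made-up layout, not the question's folders).
with tempfile.TemporaryDirectory() as root:
    layout = {"sub0": ["A.1.txt", "A.txt", "rejected.txt"], "sub1": ["B.txt"]}
    for sub, names in layout.items():
        os.mkdir(os.path.join(root, sub))
        for name in names:
            open(os.path.join(root, sub, name), "w").close()

    folders = [os.path.join(root, sub) for sub in sorted(layout)]
    valid = list_valid_files(folders)
    # Key by folder name only, to keep the printed result short.
    result = {os.path.basename(folder): names for folder, names in valid.items()}

print(result)
# {'sub0': ['A.1.txt', 'A.txt'], 'sub1': ['B.txt']}
```

Each value is a list of file names that can be joined with its folder path and passed to shutil.move().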

Create many empty dictionary in Python

I'm trying to create many dictionaries in a for loop in Python 2.7. I have a list as follows:
sections = ['main', 'errdict', 'excdict']
I want to access these variables, and create new dictionaries with the variable names. I could only access the list sections and store an empty dictionary in the list but not in the respective variables.
for i in enumerate(sections):
    sections[i] = dict()
The point of this question is that I'm going to obtain the list sections from a .ini file, so its contents will vary. I can create an array of dictionaries, but that doesn't work well with the further function requirements. Hence my doubt.
Robin Spiess answered your question beautifully.
I just want to add the one-liner way:
section_dict = {sec : {} for sec in sections}
For maintaining the order of insertion, you'll need an OrderedDict:
from collections import OrderedDict
section_dict = OrderedDict((sec, {}) for sec in sections)
To clear dictionaries
If the variables in your list are already dictionaries use:
for var in sections:
    var.clear()
Note that here var = {} does not work, see Difference between dict.clear() and assigning {} in Python.
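A short sketch of that difference, with made-up contents for the dictionaries (`after_rebind` is an illustrative name):

```python
main = {"a": 1}
errdict = {"b": 2}
sections = [main, errdict]

# Rebinding the loop variable only changes what `var` points to.
for var in sections:
    var = {}
after_rebind = [dict(d) for d in sections]   # snapshot: contents untouched

# clear() empties the existing objects in place.
for var in sections:
    var.clear()

print(after_rebind)          # [{'a': 1}, {'b': 2}]
print(sections)              # [{}, {}]
print(main is sections[0])   # True: still the same object, now empty
```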
To create new dictionaries
As long as you only have a handful of dicts, the best way is probably the easiest one:
main = {} #same meaning as main = dict() but slightly faster
errdict = {}
excdict = {}
sections = [main,errdict,excdict]
The variables need to be declared first before you can put them in a list.
For more dicts I support @dslack's answer in the comments (all credit to him):
sections = [dict() for _ in range(numberOfDictsYouWant)]
If you want to be able to access the dictionaries by name, the easiest way is to make a dictionary of dictionaries:
sectionsdict = {}
for var in sections:
    sectionsdict[var] = {}
You might also be interested in: Using a string variable as a variable name
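Since the section names are meant to come from a .ini file, the dictionary of dictionaries can be built straight from the parsed section names. The sketch below assumes Python 3's configparser (the question targets Python 2.7, where the module is named ConfigParser and read_string is not available), and the .ini text is a made-up example.

```python
import configparser

# Hypothetical .ini content; the section names match the question's list.
ini_text = """
[main]
[errdict]
[excdict]
"""

parser = configparser.ConfigParser()
parser.read_string(ini_text)

# One empty dict per section name, looked up by name instead of by variable.
sections_dict = {name: {} for name in parser.sections()}
sections_dict["errdict"]["code"] = 404

print(sections_dict)
# {'main': {}, 'errdict': {'code': 404}, 'excdict': {}}
```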

Understanding specific python dictionary creation using for i loop

I just started reading through Dave Peticolas tutorial on Twisted (http://krondo.com/blog/?p=1247), and quickly going through his early examples using Python sockets, I came across a line of code that I can't wrap my head around. The code is on his github, specifically https://github.com/jdavisp3/twisted-intro/blob/master/async-client/get-poetry.py, but the context doesn't really matter.
Here is the line:
sock2task = dict([(s, i + 1) for i, s in enumerate(sockets)])
Where sockets is a list of sockets.
This line will create a dictionary in the form of
{<sock3 object>: 3, <sock3 object>: 2, <sock3 object>: 1}
however, I just don't understand how.
Trying to get an equivalent statement, I came up with
sock2task = dict(enumerate(sockets, start=1))
however this results in
{1: <sock3 object>, 2: <sock3 object>, 3: <sock3 object>}
which has the keys and values swapped, and is in reverse.
So how does it work? In the full code, neither s nor i is defined.
Thanks, Matt
Your line in question is the initialization of a dictionary using a list comprehension. To break it down:
A dict can be initialized like this
d = dict([(key0, value0), ...]) # make a dictionary out of a list of tuples
The list comprehension in the book is made up of following components:
1.
# "for every index i and corresponding entry s in sockets"
for i, s in enumerate(sockets)
2.
# a tuple of the socket s and its index + 1: `i + 1`
(s, i + 1)
3.
# "Make a list in which for every index i
# and corresponding entry s in sockets there is a tuple (s, i + 1)"
[(s, i + 1) for i, s in enumerate(sockets)]
And so:
# "Convert this whole thing into a dictionary!"
dict([(s, i + 1) for i, s in enumerate(sockets)])
An equivalent code would be:
sock2task = {}
for index, socket in enumerate(sockets):
    sock2task[socket] = index + 1
The printed dictionary starts at 3 by coincidence, because dictionaries are not ordered (since Python 3.7 they preserve insertion order, so the same code would list 1 first).
I hope it is clearer now.
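The same breakdown as something you can run. Plain strings stand in for the socket objects here, which is an assumption made only for the demo; any hashable object works as a dictionary key.

```python
# Strings stand in for socket objects.
sockets = ["sock_a", "sock_b", "sock_c"]

pairs = list(enumerate(sockets))
print(pairs)        # [(0, 'sock_a'), (1, 'sock_b'), (2, 'sock_c')]

swapped = [(s, i + 1) for i, s in pairs]
print(swapped)      # [('sock_a', 1), ('sock_b', 2), ('sock_c', 3)]

sock2task = dict(swapped)
print(sock2task)    # {'sock_a': 1, 'sock_b': 2, 'sock_c': 3}
```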
You have to dive into list comprehensions to get some necessary background.
Despite that, you can create a dictionary from a list of pairs, where the first value will be assigned as key and the second as value. The following code has the same idea but it will help you to understand how it works.
result = {}
for index, s in enumerate(sockets):
    result[s] = index + 1
Your code gets the inverse result because you did not swap the values returned by enumerate (as is done in my example and in the original code).
sock2task = dict([(s, i + 1) for i, s in enumerate(sockets)])
                                 ^
This is a list comprehension + tuple unpacking, cast to a dict.
A list comprehension takes the form of [f(i) for i in iterable]. In this case, the iterable is enumerate(sockets), which yields tuples of two elements: (index, item).
Tuples can be unpacked. ie i,j = (0,1) would assign 0 to i and 1 to j.
So, basically, i and s are created where I have the caret sign pointing.
(Incidentally, in Python 2.7+ you can also use a dictionary comprehension: sock2task = {s: i + 1 for i, s in enumerate(sockets)})
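For comparison, here is a sketch of the attempted `dict(enumerate(sockets, start=1))` next to two ways of getting the socket-to-index mapping. Strings again stand in for sockets, and the `zip` with `itertools.count` variant is an alternative that does not appear in the original code.

```python
from itertools import count

sockets = ["sock_a", "sock_b", "sock_c"]   # stand-ins for socket objects

by_index = dict(enumerate(sockets, start=1))                  # index -> socket
by_socket = {s: i for i, s in enumerate(sockets, start=1)}    # socket -> index
via_zip = dict(zip(sockets, count(1)))                        # same mapping, no unpacking

print(by_index)               # {1: 'sock_a', 2: 'sock_b', 3: 'sock_c'}
print(by_socket)              # {'sock_a': 1, 'sock_b': 2, 'sock_c': 3}
print(by_socket == via_zip)   # True
```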

Python 2.7.9: list of dictionaries, calculating mean and std

Using Python 2.7.9: I have a list of dictionaries that hold a 'data' item, how do I access each item into a list so I may get the mean and standard deviation? Here's an example:
values = [{'colour': 'b.-', 'data': 12.3}, {'colour': 'b.-', 'data': 11.2}, {'colour': 'b.-', 'data': 9.21}]
So far I have:
val = []
for each in values:
    val.append(each.items()[1][1])
print np.mean(val) # gives 10.903
print np.std(val) # gives 1.278
Crude and not very Pythonic(?)
Using list comprehension is probably easiest. You can extract the numbers like this:
numbers = [x['data'] for x in values]
Then you just call NumPy's mean/std/etc. functions on that, just like you're doing.
Apologies for (perhaps) an unnecessary question, I've seen this:
average list of dictionaries in python
vals = [i['data'] for i in values]
np.mean(vals) # gives 10.903
np.std(vals) # gives 1.278
(Pythonic solution?)
It is an exceptionally bad idea to index into a dictionary since it has no guarantee of order. Sometimes the 'data' element could be first, sometimes it could be second. There is no way to know without checking.
When using a dictionary, you should almost always access elements by key. In dictionary notation, this is { key: value, ... } where each key is unique. Two keys count as the same key when they have equal hashes and compare equal with ==, so, for example, 1 and 1.0 refer to the same entry.
Keeping this in mind, we have the more pythonic:
val = []
for data_dict in values:
    val.append(data_dict['data'])
If you want to be fancy, you can use a list comprehension, which is a compact way of generating a list from a loop.
val = [data_dict['data'] for data_dict in values]
To be even fancier, you can add a condition to skip entries that would cause errors.
val = [data_dict['data'] for data_dict in values if (data_dict and 'data' in data_dict)]
What this fanciest version does is filter the results of the for data_dict in values iteration with if (data_dict and 'data' in data_dict), so that the only data_dict instances used in data_dict['data'] are the ones that pass the check.
You want a pythonic one Liner?
data = [k['data'] for k in values]
print("Mean:\t"+ str(np.mean(data)) + "\nstd :\t" + str(np.std(data)))
you could use the one liner
print("Mean:\t"+ str(np.mean([k['data'] for k in values])) + "\nstd :\t" + str(np.std([k['data'] for k in values])))
but there really is no point, as both print
Mean: 10.9033333333
std : 1.27881021092
and the former is more readable.
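If NumPy is not available, the standard library's statistics module (Python 3.4+, so not the 2.7.9 from the question) gives the same figures. This is a sketch under that assumption. Note that np.std defaults to the population standard deviation (ddof=0), which corresponds to pstdev here; stdev is the sample version.

```python
from statistics import mean, pstdev, stdev

values = [{'colour': 'b.-', 'data': 12.3},
          {'colour': 'b.-', 'data': 11.2},
          {'colour': 'b.-', 'data': 9.21}]

data = [d['data'] for d in values]

print(round(mean(data), 3))     # 10.903
print(round(pstdev(data), 3))   # 1.279, population std like np.std(data)
print(stdev(data))              # sample std, like np.std(data, ddof=1)
```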
