Initialization of a list of dictionaries - python

I want to take input from the user and initialize a list of dictionaries.
I have the following block which works fine.
people = []
for p in range(3):
    cell = {"name": "","age" : 0, "education" : "","height" : 0}
    cell["name"] = input("name:")
    cell["age"] = int(input("age:"))
    cell["education"] = input("education:")
    cell["height"] = float(input("height:"))
    people.append(cell)
The problem I have is why the following block does not work for me.
people = []
cell = {"name": "","age" : 0, "education" : "","height" : 0}
for p in range(0,3):
    cell["name"] = input("name:")
    cell["age"] = int(input("age:"))
    cell["education"] = input("education:")
    cell["height"] = float(input("height:"))
    people.append(cell)
I do not understand why, at the end of the loop, the list contains the latest input all three times. When I run this line:
cell["name"] = input("name:")
shouldn't the previous value be replaced with the new one?

The important part is this line:
cell = {"name": "","age" : 0, "education" : "","height" : 0}
Here you create a dictionary and store a reference to it under the name cell.
If you do this before the loop (as in your second example), you create only one dictionary and append it to the list three times. Any change you make through one list entry therefore shows up at every index of the list, since all three entries point to the same dictionary.
However, if you run that line inside your loop, you create three different dictionaries, which is what you want here. This is why your first example works and your second one doesn't.
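A quick way to see this for yourself is to compare object identities. The sketch below replaces the input() calls with a hypothetical list of sample names (sample_names is an assumption, not part of the question) so that it runs without typing anything:

```python
# Hypothetical sample data standing in for the user's input() calls.
sample_names = ["Ann", "Bob", "Cid"]

# Version 1: one dictionary created before the loop, appended three times.
shared = []
cell = {"name": ""}
for n in sample_names:
    cell["name"] = n
    shared.append(cell)

# Version 2: a fresh dictionary created on every iteration.
separate = []
for n in sample_names:
    cell = {"name": ""}
    cell["name"] = n
    separate.append(cell)

shared_names = [c["name"] for c in shared]      # every entry shows the last name
separate_names = [c["name"] for c in separate]  # each entry keeps its own name
same_object = shared[0] is shared[1] is shared[2]
print(shared_names, separate_names, same_object)
```

The `is` operator compares identity rather than contents, so it tells you whether two list entries are literally the same dictionary.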

This comes down to the difference between copying a value and sharing a reference. A reference (a pointer, in other languages' terms) means the variable points to a spot in memory. In Python, names refer to objects this way, and appending a dictionary to a list stores a reference to it rather than a copy. In the second version you change what is in the dictionary and then append it, so what you are actually doing is appending the SAME dictionary to the list a second (and third) time. Between each of those times its values happen to change, but when you look at the list at the end, the SAME updated dictionary shows up three times.
In the first example, you create a new dictionary and bind it to cell on each iteration, which results in a different object in memory. It is fundamentally a DIFFERENT dictionary each time, which in this case gives you the result you are looking for.

Shouldn't the previous value be erased and the new one take its place?
It is, but only inside the one dictionary that every list entry refers to. Every iteration modifies the same object, since the reference never changes. This is why the values from the last iteration show up three times.
Try:
people = []
for p in range(0,3):
    cell = {"name": "","age" : 0, "education" : "","height" : 0}
    cell["name"] = input("name:")
    cell["age"] = int(input("age:"))
    cell["education"] = input("education:")
    cell["height"] = float(input("height:"))
    people.append(cell)
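If you would rather keep a single template dictionary defined before the loop, another option is to append a copy of it. This is a minimal sketch, assuming the values are plain strings and numbers (so a shallow copy is enough) and using a hypothetical answers list in place of the input() calls:

```python
template = {"name": "", "age": 0, "education": "", "height": 0}

# Hypothetical canned answers standing in for the input() calls.
answers = [("Ann", 31, "MSc", 1.70), ("Bob", 45, "BSc", 1.82)]

people = []
for name, age, education, height in answers:
    cell = template.copy()  # shallow copy: a new dict with the same keys
    cell["name"] = name
    cell["age"] = age
    cell["education"] = education
    cell["height"] = height
    people.append(cell)

print(people)
```

If the template ever held nested lists or dictionaries, copy.deepcopy would be the safer choice, because a shallow copy would still share those inner objects.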

Related

Seeing if a certain element is in a 2nd list without using any

This is my first time submitting here, so please bear with me on formatting.
So, I have a normal list of strings like:
name = ["Michael", "Helen", "Mike", "Joe", "Michael"]
I need to append these to another list if they don't repeat, and then later print out that list. So the next list would be like
norepeat_list = ["Michael", "Helen", "Mike", "Joe"]
Which I've already done and didn't have issues with. The problem is I'm not just appending those names, each name has another value associated with it in another list of times (also strings), and I need that too. So I've been appending both the name and the times at the same time in a list like
norepeat_list.append([name[i], time[i]])
Which then makes the result a 2d list. So I don't know how to make it so that the program checks each list in the list for the original name value, and the only solution I've found is 'any' or some import stuff, and I can't do that.
So far I have:
name = ["Michael", "Helen", "Mike", "Joe", "Michael"]
time = ["08:42:39", "08:39:56", "08:58:43", "09:04:03", "05:32:54"]
norepeat_list = []
for i in range(len(name)):
    if name[i] not in norepeat_list:
        norepeat_list.append(name[i])
for i in norepeat_list:
    print(i)
I've tried appending the timestamp at the same time and adding another layer of for j in ..., but I can't get anything to work.
The closest working code I can think of to your example is:
name = ["Michael", "Helen", "Mike", "Joe", "Michael"]
time = ["08:42:39", "08:39:56", "08:58:43", "09:04:03", "05:32:54"]
seen_names = []
norepeat_list = []
for i in range(len(name)):
    if name[i] not in seen_names:
        seen_names.append(name[i])
        norepeat_list.append((name[i], time[i]))
for name_time in norepeat_list:
    print(name_time)
but note that I've introduced an extra seen_names list to keep track of the names you've already seen. Having this extra list is bad for maintenance, as you need to make sure both lists stay in sync with each other. It's also bad for performance, as checking whether an item is in a list takes time proportional to the length of the list, i.e. it gets slower for longer lists. It would be better to use a set to track the items you've seen, as a membership check on a set takes roughly constant time on average, however many items it holds.
A more significant improvement would be to use a dictionary/dict which allows you to associate arbitrary data (i.e. your times) with a set of items (i.e. your names). A naive translation of the above code would be:
names = ["Michael", "Helen", "Mike", "Joe", "Michael"]
times = ["08:42:39", "08:39:56", "08:58:43", "09:04:03", "05:32:54"]
names_and_times = {}
for i in range(len(names)):
    if names[i] not in names_and_times:
        names_and_times[names[i]] = times[i]
for name_time in names_and_times.items():
    print(name_time)
Note that I've switched to a naming convention where plurals indicate a container of multiple values.
This could be improved by noticing that it repeats names[i] a lot. One way to reduce this would be to use enumerate:
names_and_times = {}
for i, name in enumerate(names):
    if name not in names_and_times:
        names_and_times[name] = times[i]
Alternatively, you could use zip, as many other answers have suggested:
names_and_times = {}
for name, time in zip(names, times):
    if name not in names_and_times:
        names_and_times[name] = time
Another variant would be to exploit the fact that dictionaries can't have duplicate keys, so assigning to the same key multiple times just replaces the value rather than adding a new entry:
names_and_times = {}
for name, time in zip(names, times):
    names_and_times[name] = time
Note that this leaves the last time set for each name rather than the first. Your question doesn't seem to express a preference, but this could be changed by iterating in reverse order:
names_and_times = {}
for name, time in zip(reversed(names), reversed(times)):
    names_and_times[name] = time
Next we could use a dictionary comprehension, which cleans the above up to:
names_and_times = {
    name: time
    for name, time in zip(names, times)
}
Finally we get to my original comment about things being magical, which exploits this usage of zip and the fact that passing an Iterable of pairs causes the constructor of a dict to build a dictionary where the first item of each pair is the key and the second item is the value:
names_and_times = dict(zip(names, times))
or if you want the first time for each name you could do:
names_and_times = dict(zip(reversed(names), reversed(times)))
All of my examples leave names_and_times as a dictionary, but if you wanted to convert back to a list you can just do:
names_and_times = list(names_and_times.items())
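As a side note on the first-versus-last question, dict.setdefault only stores a value when the key is not present yet, so it keeps the first time for each name without reversing anything. A small sketch using the same names and times lists (first_times and last_times are names I made up for the comparison):

```python
names = ["Michael", "Helen", "Mike", "Joe", "Michael"]
times = ["08:42:39", "08:39:56", "08:58:43", "09:04:03", "05:32:54"]

first_times = {}
for name, time in zip(names, times):
    # setdefault leaves an existing entry untouched, so the first time wins
    first_times.setdefault(name, time)

last_times = dict(zip(names, times))  # later duplicates overwrite earlier ones

print(first_times["Michael"], last_times["Michael"])
```

Unlike the reversed() variant, this also keeps the names in their original order of first appearance.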
In this case I advise you to use the zip function.
for n, t in zip(name, time):
    print(n, t)
This zips the two lists together and gives you access to the values of both lists.
As you wrote "each name has another value associated with it in another list of times", I assume name and time have the same length, in which case something like this would work.
zip produces a tuple of the elements at the same position in both lists, so it maintains the order, and the loop skips elements whose names have already been added to seen:
seen = set()
result = []
for n, t in zip(name, time):
    if n not in seen:
        seen.add(n)
        result.append((n, t))
print(result)

List of empty variables in Python

Is there any possibility of creating a list of variables/names* that have not been defined yet, and then loop through the list at a later stage to define them?
Like this:
varList = [varA, varB, varC]
for var in varList:
    var = 0
print(varList)
>>>[0, 0, 0]
The reason I'm asking is because I have a project where I could hypothetically batch fill 40+ variables/names* this way by looping through a Pandas series*. Unfortunately Python doesn't seem to allow undefined variables in a list.
Does anyone have a creative workaround?
EDIT: Since you asked for the specific problem, here goes:
I have a Pandas series that looks like this (excuse the Swedish):
print(Elanv)
>>>
Förb. KVV PTP                 5653,021978
Förb. KVV Skogsflis           0
Förb. KVV Återvinningsflis    337,1416119
Förb. KVV Eo1                 6,1
Förb. HVC Återvinningsflis    1848
Name: Elanv, dtype: object
I want to store each value in this array to a set of new variables/names*, the names of which I want to control. For example, I want the new variable/name* containing the first value to be called "förbKVVptp", the second one "förbKVVsflis", and so forth.
The "normal" option is to assign each variable manually, like this:
förbKVVptp, förbKVVsflis, förbKVVåflis = Elanv.iloc[0], Elanv.iloc[1], Elanv.iloc[2] ....
But that creates a not so nice looking long bunch of code just to name variables/names*. Instead I thought I could do something like this (obviously with all the variables/names*, not just the first three) which looks and feels cleaner:
varList = [förbKVVptp, förbKVVsflis, förbKVVåflis]
for i, var in enumerate(varList): var = Elanv.iloc[i]
print(varList)
>>>[5653,021978, 0, 337,1416119]
Obviously this becomes pointless if I have to write the name of my new variables/names* twice (first to define them, then to put them inside the varList) so that was why I asked.
You cannot create uninitialized variables in Python. Python doesn't really have variables; it has names referring to values. An uninitialized variable would be a name that doesn't refer to a value, so basically just a string:
varList = ['förbKVVptp', 'förbKVVsflis', 'förbKVVåflis']
You can turn these strings into variables by associating them with a value. One of the ways to do that is via globals:
for i, varname in enumerate(varList):
    globals()[varname] = Elanv.iloc[i]
However, dynamically creating variables like this is often a code smell. Consider storing the values in a dictionary or list instead:
my_vars_dict = {
    'förbKVVptp': Elanv.iloc[0],
    'förbKVVsflis': Elanv.iloc[1],
    'förbKVVåflis': Elanv.iloc[2]
}
my_vars_list = [Elanv.iloc[0], Elanv.iloc[1], Elanv.iloc[2]]
See also How do I create a variable number of variables?.
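To avoid typing each name twice, the dictionary can also be built from a list of names and the values in one step. This is only a sketch: the plain list values is a hypothetical stand-in for the Series, and var_names is a name I introduced. With the real Series, something like dict(zip(var_names, Elanv)) should behave the same way, since iterating over a Series yields its values.

```python
# Hypothetical stand-in for the values in the Elanv series.
values = ["5653,021978", "0", "337,1416119"]

# Names written once, in the same order as the values.
var_names = ["förbKVVptp", "förbKVVsflis", "förbKVVåflis"]

my_vars = dict(zip(var_names, values))

print(my_vars["förbKVVsflis"])
```

Each value is then looked up by its name, for example my_vars["förbKVVptp"], instead of living in its own variable.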
The answer to your question is that you cannot have undefined variables in a list.
My solution is specific to this part of your problem: "The reason I'm asking is that I have a project where I could hypothetically batch fill over 100 arrays this way by looping through a Pandas array."
The solution below prefills the list with None, and then you can change the values in the list.
Code:
varList = [None]*3
for i in range(len(varList)):
    varList[i] = 0
print(varList)
Output:
[0, 0, 0]
One thing in your example that won't do what you expect is the way you try to modify the list:
for var in varList:
    var = 0
When you do var = 0, it only rebinds the name var. It changes neither the list nor the values of varA, varB, varC (if they were defined).
Similarly, the following won't change the list. It will just change the value of var.
var = mylist[0]
var = 1
To change the list, you need to assign to an indexed item of the list:
mylist = [None, None, None]
for i in range(len(mylist)):
    mylist[i] = 0
print(mylist)
Note that creating a list with empty slots before assigning the values is inefficient and not Pythonic. A better way would be to just iterate through the source values and append them to a list, or even better, use a list comprehension.
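For example, the append-based version and the list comprehension might look like this (source_values is a hypothetical input, and doubling each value is just a placeholder computation):

```python
source_values = [1, 2, 3]

# Append as you go instead of pre-filling the list with None.
doubled = []
for value in source_values:
    doubled.append(value * 2)

# The same result written as a list comprehension.
doubled_again = [value * 2 for value in source_values]

print(doubled, doubled_again)
```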

Using variables of equal value as dictionary keys results in overwritten values (Python)

I have a Python dictionary as below:
Mail_Dict = {
    MailList0 : CodeList0,
    MailList1 : CodeList1,
    MailList2 : CodeList2,
    MailList3 : CodeList3,
    MailList4 : CodeList4
}
The issue is that when one of the MailLists has the same value as another MailList (i.e. MailList0 = 'someone@email.com' and also MailList1 = 'someone@email.com'), the keys are treated as equal and CodeList0 gets overwritten by CodeList1, which also makes my dictionary shorter in the process.
Is there any way to keep these separate? I would think that the same logic as below:
a=1
b=1
saving to separate memory addresses and being different from:
a=b=1
would apply here, but I guess that isn't the case =(
Thanks in advance.
If you want to keep both values under the same key, one solution is to turn the old value into a list and append the new value to that list.
Mail_Dict = {
    MailList0 : CodeList0,
    MailList2 : CodeList2,
    MailList3 : CodeList3,
    MailList4 : CodeList4
}
Now, if you want to add MailList1 (which has the same value as MailList0) to the dictionary, check whether the value already stored under that key is a list. If it is not a list, wrap it in one, and then append the new value.
if MailList1 in Mail_Dict:
    if not isinstance(Mail_Dict[MailList1], list):
        Mail_Dict[MailList1] = [Mail_Dict[MailList1]]
    Mail_Dict[MailList1].append(CodeList1)
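An alternative that avoids the isinstance check is to make every value a list from the start, for example with collections.defaultdict. This is a minimal sketch with hypothetical addresses and codes, since the real MailList and CodeList values are not shown in the question:

```python
from collections import defaultdict

# Hypothetical (address, code) pairs; two of them share an address.
pairs = [
    ("someone@email.com", "code0"),
    ("someone@email.com", "code1"),
    ("other@email.com", "code2"),
]

mail_dict = defaultdict(list)
for address, code in pairs:
    # A missing key starts out as an empty list, so nothing is overwritten.
    mail_dict[address].append(code)

print(dict(mail_dict))
```

This also sidesteps the case where the stored value is itself a list, which the isinstance test cannot tell apart from a list of collected values.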

Python references to references in python

I have a function that takes given initial conditions for a set of variables and puts the result into another global variable. For example, let's say two of these variables are x and y. Note that x and y must be global variables (because it is too messy/inconvenient to be passing large amounts of references between many functions).
x = 1
y = 2
def myFunction():
    global x,y,solution
    print(x)
    < some code that evaluates using a while loop >
    solution = <the result from many iterations of the while loop>
I want to see how the result changes given a change in the initial condition of x and y (and other variables). For flexibility and scalability, I want to do something like this:
varSet = {'genericName0':x, 'genericName1':y} # Dict contains all variables that I wish to alter initial conditions for
R = list(range(10))
for r in R:
    varSet['genericName0'] = r #This doesn't work the way I want...
    myFunction()
Such that the 'print' line in 'myFunction' outputs the values 0,1,2,...,9 on successive calls.
So basically I'm asking how do you map a key to a value, where the value isn't a standard data type (like an int) but is instead a reference to another value? And having done that, how do you reference that value?
If it's not possible to do it the way I intend: What is the best way to change the value of any given variable by changing the name (of the variable that you wish to set) only?
I'm using Python 3.4, so would prefer a solution that works for Python 3.
EDIT: Fixed up minor syntax problems.
EDIT2: I think maybe a clearer way to ask my question is this:
Consider that you have two dictionaries, one which contains round objects and the other contains fruit. Members of one dictionary can also belong to the other (apples are fruit and round). Now consider that you have the key 'apple' in both dictionaries, and the value refers to the number of apples. When updating the number of apples in one set, you want this number to also transfer to the round objects dictionary, under the key 'apple' without manually updating the dictionary yourself. What's the most pythonic way to handle this?
Instead of making x and y global variables with a separate dictionary to refer to them, make the dictionary directly contain "x" and "y" as keys.
varSet = {'x': 1, 'y': 2}
Then, in your code, whenever you want to refer to these parameters, use varSet['x'] and varSet['y']. When you want to update them use varSet['x'] = newValue and so on. This way the dictionary will always be "up to date" and you don't need to store references to anything.
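A sketch of how that could look for the loop in the question. The body of my_function is a placeholder (the real while loop is not shown), and returning the result instead of writing to a global is my own simplification:

```python
var_set = {'x': 1, 'y': 2}

def my_function(params):
    # Placeholder for the real computation, which the question does not show.
    return params['x'] + params['y']

solutions = []
for r in range(10):
    var_set['x'] = r  # update the initial condition by name
    solutions.append(my_function(var_set))

print(solutions)
```

Because the function reads the parameters from the dictionary on every call, changing var_set['x'] is enough; there is no second copy of x to keep in sync.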
Let's take the fruit example from your second edit:
def set_round_val(fruit_dict,round_dict):
    fruit_set = set(fruit_dict)
    round_set = set(round_dict)
    common_set = fruit_set.intersection(round_set) # get common keys
    for key in common_set:
        round_dict[key] = fruit_dict[key] # set modified value in round_dict
    return round_dict
fruit_dict = {'apple':34,'orange':30,'mango':20}
round_dict = {'bamboo':10,'apple':34,'orange':20} # values can even be same as fruit_dict
for r in range(1,10):
    fruit_dict['apple'] = r
    round_dict = set_round_val(fruit_dict,round_dict)
    print(round_dict)
Hope this helps.
From what I've gathered from the responses from @BrenBarn and @ebarr, this is the best way to go about the problem (and directly answer EDIT2).
Create a class which encapsulates the common variable:
class Count:
    def __init__(self,value):
        self.value = value
Create instances of that class:
import Count # assumes the class above is saved in a module named Count.py
no_of_apples = Count.Count(1)
no_of_tennis_balls = Count.Count(5)
no_of_bananas = Count.Count(7)
Create dictionaries with the common variable in both of them:
round = {'tennis_ball':no_of_tennis_balls,'apple':no_of_apples}
fruit = {'banana':no_of_bananas,'apple':no_of_apples}
print(round['apple'].value) #prints 1
fruit['apple'].value = 2
print(round['apple'].value) #prints 2

Updating dictionary with randint performing unexpectedly

I'm trying to run a simple program in which I'm trying to run random.randint() in a loop to update a dictionary value but it seems to be working incorrectly. It always seems to be generating the same value.
The program so far is given below. I'm trying to create a uniformly distributed population, but I'm unsure why this isn't working.
import random
__author__ = 'navin'
namelist={
    "person1":{"age":23,"region":1},
    "person2":{"age":24,"region":2},
    "person3":{"age":25,"region":0}
}
def testfunction():
    default_val={"age":23,"region":1}
    for i in xrange(100):
        namelist[i]=default_val
    for index in namelist:
        x = random.randint(0, 2)
        namelist[index]['region']=x
    print namelist
if __name__ == "__main__" :
    testfunction()
I'm expecting the 103 people to be roughly uniformly distributed across region 0-2, but I'm getting everyone in region 0.
Any idea why this is happening? Have I incorrectly used randint?
This is because all 100 dictionary entries created in the for loop refer not just to equal values but to the same object. There are therefore only 4 distinct dictionaries among the values: the 3 created initially and a fourth one that you add 100 times under the keys 0-99.
This can be demonstrated with the id() function, which returns a distinct integer for each distinct object that exists at the same time:
from collections import Counter
...
ids = [ id(i) for i in namelist.values() ]
print Counter(ids)
results in:
Counter({139830514626640: 100, 139830514505160: 1,
139830514504880: 1, 139830514505440: 1})
To get distinct dictionaries, you need to copy the default value:
namelist[i] = default_val.copy()
Or create a new dictionary on each iteration of the loop:
namelist[i] = {"age": 23, "region": 1}
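For reference, here is a Python 3 sketch of the same check after applying the copy() fix. It uses range and print() in place of xrange and the print statement, and the population dictionary is a hypothetical stand-in for namelist:

```python
import random
from collections import Counter

default_val = {"age": 23, "region": 1}

# Each key gets its own copy, so the entries can differ from one another.
population = {i: default_val.copy() for i in range(100)}

for person in population.values():
    person["region"] = random.randint(0, 2)  # randint is inclusive on both ends

distinct_objects = len({id(p) for p in population.values()})
region_counts = Counter(p["region"] for p in population.values())
print(distinct_objects, region_counts)
```

With separate dictionaries, the count of distinct ids matches the number of entries, and the regions come out spread over 0, 1 and 2 instead of all being equal.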
default_val={"age":23,"region":1}
for i in xrange(100):
    namelist[i]=default_val
This doesn't mean "set every entry to a dictionary with these particular age and region values". This means "set every entry to this particular dictionary object".
for index in namelist:
    x = random.randint(0, 2)
    namelist[index]['region']=x
Since every entry added by the first loop is really the same dictionary, all modifications to those entries happen to that one dictionary, and the last value of x wipes the others.
Evaluating a dict literal creates a new dict; assignment does not. If you want to make a new dictionary each time, put the dict literal in the loop:
for i in xrange(100):
    namelist[i]={"age":23,"region":1}
I wanted to add this as a comment, but the link is too long. As others have said, you have only shared a reference to the same dictionary. If you want to see a visualisation, you can check it out on Python Tutor; it should help you grok what's happening.
