Updating dictionary with randint performing unexpectedly - python

I'm trying to run a simple program in which I'm trying to run random.randint() in a loop to update a dictionary value but it seems to be working incorrectly. It always seems to be generating the same value.
The program so far is given below. I'm trying to create a uniformly distributed population, but I'm unsure why this isn't working.
import random
__author__ = 'navin'
namelist={
    "person1":{"age":23,"region":1},
    "person2":{"age":24,"region":2},
    "person3":{"age":25,"region":0}
}
def testfunction():
    default_val={"age":23,"region":1}
    for i in xrange(100):
        namelist[i]=default_val
    for index in namelist:
        x = random.randint(0, 2)
        namelist[index]['region']=x
    print namelist
if __name__ == "__main__" :
    testfunction()
I'm expecting the 103 people to be roughly uniformly distributed across region 0-2, but I'm getting everyone in region 0.
Any idea why this is happening? Have I incorrectly used randint?

It is because all 100 dictionary entries created in the for loop refer not just to equal values, but to the same object. So there are only 4 distinct dictionaries among the values: the 3 created initially and a fourth one that you add 100 times under keys 0-99.
This can be demonstrated with the id() function, which returns a distinct integer for each distinct object:
from collections import Counter
...
ids = [ id(i) for i in namelist.values() ]
print Counter(ids)
results in:
Counter({139830514626640: 100, 139830514505160: 1,
139830514504880: 1, 139830514505440: 1})
To get distinct dictionaries, you need to copy the default value:
namelist[i] = default_val.copy()
Or create a new dictionary on each iteration of the loop:
namelist[i] = {"age": 23, "region": 1}
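A minimal sketch of the difference, written for Python 3 (the question's code is Python 2). The names `shared`, `template` and `copied` are only illustrative and do not come from the question:

```python
default_val = {"age": 23, "region": 1}

# Assigning the same object under many keys: one dictionary, many references.
shared = {i: default_val for i in range(3)}
shared[0]["region"] = 2

# Every entry is the very same object, so every entry sees the change.
all_same_object = all(entry is default_val for entry in shared.values())

# Copying gives each key its own dictionary.
template = {"age": 23, "region": 1}
copied = {i: template.copy() for i in range(3)}
copied[0]["region"] = 2

# Only the entry that was modified has changed.
regions = [copied[i]["region"] for i in range(3)]
```

Note that dict.copy() makes a shallow copy, which is enough here because the values are plain integers.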

default_val={"age":23,"region":1}
for i in xrange(100):
    namelist[i]=default_val
This doesn't mean "set every entry to a dictionary with these particular age and region values". This means "set every entry to this particular dictionary object".
for index in namelist:
    x = random.randint(0, 2)
    namelist[index]['region']=x
Since the 100 entries added by the first loop are really the same dictionary, every modification made through them in this loop happens to that one dictionary, and the last value of x written to it wipes the others.
Evaluating a dict literal creates a new dict; assignment does not. If you want to make a new dictionary each time, put the dict literal in the loop:
for i in xrange(100):
    namelist[i]={"age":23,"region":1}
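Putting the fix together, here is a sketch of the whole program in Python 3 form. The function name `populate` and the use of Counter to summarise the regions are additions for illustration, not part of the original program:

```python
import random
from collections import Counter

namelist = {
    "person1": {"age": 23, "region": 1},
    "person2": {"age": 24, "region": 2},
    "person3": {"age": 25, "region": 0},
}


def populate(people, count=100):
    """Add `count` distinct default entries, then randomise every region."""
    for i in range(count):
        # The dict literal is evaluated on every iteration: a new object each time.
        people[i] = {"age": 23, "region": 1}
    for person in people.values():
        person["region"] = random.randint(0, 2)
    return people


populate(namelist)

# How many people ended up in each region, and how many distinct objects exist.
region_counts = Counter(person["region"] for person in namelist.values())
distinct_dicts = len({id(person) for person in namelist.values()})
```

With distinct dictionaries, region_counts should show the 103 people spread roughly evenly over regions 0 to 2.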

I wanted to add this as a comment, but the link is too long. As others have said, you have just shared a reference to one dictionary. If you want to see a visualisation, you can check it out on Python Tutor; it should help you grok what's happening.

Related

How can I rename a dictionary within a program?

I ask the user of my program to input the number of datasets he/she wants to investigate, e.g. three datasets. Accordingly, I should then create three dictionaries (dataset_1, dataset_2, and dataset_3) to hold the values for the various parameters. Since I do not know beforehand the number of datasets the user wants to investigate, I have to create and name the dictionaries within the program.
Apparently, Python does not let me do that. I could not rename the dictionary once it has been created.
I have tried using os.rename("oldname", "newname"), but that only works if I have a file stored on my computer hard disk. I could not get it to work with an object that lives only within my program.
number_sets = input('Input the number of datasets to investigate:')
for dataset in range(number_sets):
    init_dict = {}
    # create dictionary name for the particular dataset
    dict_name = ''.join(['dataset_', str(dataset+1)])
    # change the dictionary's name
    # HOW CAN I CHANGE THE DICTIONARY'S NAME FROM "INIT_DICT"
    # TO "DATASET_1", WHICH IS THE STRING RESULT FOR DICT_NAME?
I would like to have in the end
dataset_1 = {}
dataset_2 = {}
and so on.
You don't (need to). Keep a list of data sets.
datasets = []
for i in range(number_sets):
    init_dict = {}
    ...
    datasets.append(init_dict)
Then you have datasets[0], datasets[1], etc., rather than dataset_1, dataset_2, etc.
Inside the loop, init_dict is set to a brand new empty dictionary at the top of each iteration, without affecting the dicts added to datasets on previous iterations.
If you want to create variables like that, you could use globals():
number_sets = 2
for dataset in range(number_sets):
    dict_name = ''.join(['dataset_', str(dataset+1)])
    globals()[dict_name] = {}
print(dataset_1)
print(dataset_2)
However, this is not good practice and should be avoided. If you need to keep several similar variables, the best thing to do is to create a list.
You can use a single dict and then add all the data sets into it as a dictionary:
all_datasets = {}
for i in range(number_sets):
    all_datasets['dataset_'+str(i+1)] = {}
And then you can access the data by using:
all_datasets['dataset_1']
This question gets asked many times in many different variants (this is one of the more prominent ones, for example). The answer is always the same:
It is not easily possible and most of the time not a good idea to create python variable names from strings.
The easier, more approachable, safer and more usable way is to just use another dictionary. One of the cool things about dictionaries: any object can be a value, and any hashable object can be a key. So the possibilities are nearly endless. In your code, this can be done easily with a dict comprehension:
number_sets = int(input('Input the number of datasets to investigate:')) # also notice that you have to add int() here
data = {''.join(['dataset_', str(dataset + 1)]): {} for dataset in range(number_sets)}
print(data)
>>> 5
{'dataset_1': {}, 'dataset_2': {}, 'dataset_3': {}, 'dataset_4': {}, 'dataset_5': {}}
Afterwards, these dictionaries can be easily accessed via data[name_of_dataset]. That's how it should be done.
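As a small usage sketch of the dictionary-of-dictionaries approach: the helper name `make_datasets` and the parameter name "temperature" are hypothetical, chosen only to show how the named datasets are filled and looped over.

```python
def make_datasets(number_sets):
    """Return a dict mapping 'dataset_1', 'dataset_2', ... to empty dicts."""
    return {"dataset_{}".format(n): {} for n in range(1, number_sets + 1)}


data = make_datasets(3)

# Store a value for one dataset by name ("temperature" is a made-up parameter).
data["dataset_2"]["temperature"] = 21.5

# Loop over all datasets without knowing in advance how many there are.
for name, params in data.items():
    print(name, params)
```

Each dataset gets its own dictionary, so a value written to one does not show up in the others.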

Initialization of a list of dictionaries

I want to take input from the user and initialize a list of dictionaries.
I have the following block which works fine.
people = []
for p in range(3):
    cell = {"name": "","age" : 0, "education" : "","height" : 0}
    cell["name"] = input("name:")
    cell["age"] = int(input("age:"))
    cell["education"] = input("education:")
    cell["height"] = float(input("height:"))
    people.append(cell)
The problem I have is why the following block does not work for me.
people = []
cell = {"name": "","age" : 0, "education" : "","height" : 0}
for p in range(0,3):
    cell["name"] = input("name:")
    cell["age"] = int(input("age:"))
    cell["education"] = input("education:")
    cell["height"] = float(input("height:"))
    people.append(cell)
I do not understand why at the end of the iteration I have the list initialized with the latest input all the 3 times, I mean when I use this line:
cell["name"] = input("name:")
shouldn't the previous value been replaced with the new one?
The important part is this line:
cell = {"name": "","age" : 0, "education" : "","height" : 0}
Here you create a dictionary and save a reference to it under the name cell.
If you do this before the loop (like in your second example), you have only created one dictionary and appended it to the list three times. So if you change anything inside any of the entries, your changes will show up at all indices of the list (since they all point to the same dictionary).
However, if you run the line inside of your loop you actually create three different dictionaries, which is essentially what you want here. This is why your first example works and your second one doesn't.
This is the difference between copying a value and sharing a reference. A reference (a pointer, in other languages' terms) means the variable points to a spot in memory. In Python, names refer to objects this way, and dictionaries are mutable, so in the second version you are changing the contents of one dictionary and then appending it again: what you are actually doing is appending the SAME dictionary to the list a second (and third) time. Between those appends its values happen to change, but when you look at the list at the end, the SAME updated dictionary shows up three times.
In the first example, you create a new dictionary and bind cell to it on each iteration, which results in a different memory location. It is fundamentally a DIFFERENT dictionary each time, which, in this case, gives you the result you are looking for.
shouldn't the previous value be erased and the new one take its place?
No! This is because every loop modifies the same object since the reference is not changing. This is why you have the last iteration values showing up three times.
Try:
for p in range(0,3):
    cell = {"name": "","age" : 0, "education" : "","height" : 0}
    cell["name"] = input("name:")
    cell["age"] = int(input("age:"))
    cell["education"] = input("education:")
    cell["height"] = float(input("height:"))
    people.append(cell)
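To see both behaviours side by side without typing input, here is a sketch that feeds in a few made-up rows. The `rows` data and the names `reused` and `fresh` are assumptions for illustration only:

```python
# Hypothetical sample data standing in for the interactive input() calls.
rows = [("Ann", 31, "BSc", 1.70), ("Bob", 45, "MSc", 1.82), ("Cy", 27, "PhD", 1.65)]

# One dictionary created before the loop: the list holds three references to it.
cell = {"name": "", "age": 0, "education": "", "height": 0}
reused = []
for name, age, education, height in rows:
    cell["name"] = name
    cell["age"] = age
    cell["education"] = education
    cell["height"] = height
    reused.append(cell)

# A new dictionary created inside the loop: three separate objects.
fresh = []
for name, age, education, height in rows:
    fresh.append({"name": name, "age": age, "education": education, "height": height})

reused_names = [person["name"] for person in reused]
fresh_names = [person["name"] for person in fresh]
```

Here reused_names contains the last name three times, while fresh_names keeps all three names.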

Python references to references in python

I have a function that takes given initial conditions for a set of variables and puts the result into another global variable. For example, let's say two of these variables are x and y. Note that x and y must be global variables (because it is too messy/inconvenient to be passing large amounts of references between many functions).
x = 1
y = 2
def myFunction():
    global x,y,solution
    print(x)
    < some code that evaluates using a while loop >
    solution = <the result from many iterations of the while loop>
I want to see how the result changes given a change in the initial condition of x and y (and other variables). For flexibility and scalability, I want to do something like this:
varSet = {'genericName0':x, 'genericName1':y} # Dict contains all variables that I wish to alter initial conditions for
R = list(range(10))
for r in R:
    varSet['genericName0'] = r #This doesn't work the way I want...
    myFunction()
Such that the 'print' line in 'myFunction' outputs the values 0,1,2,...,9 on successive calls.
So basically I'm asking how do you map a key to a value, where the value isn't a standard data type (like an int) but is instead a reference to another value? And having done that, how do you reference that value?
If it's not possible to do it the way I intend: What is the best way to change the value of any given variable by changing the name (of the variable that you wish to set) only?
I'm using Python 3.4, so would prefer a solution that works for Python 3.
EDIT: Fixed up minor syntax problems.
EDIT2: I think maybe a clearer way to ask my question is this:
Consider that you have two dictionaries, one which contains round objects and the other contains fruit. Members of one dictionary can also belong to the other (apples are fruit and round). Now consider that you have the key 'apple' in both dictionaries, and the value refers to the number of apples. When updating the number of apples in one set, you want this number to also transfer to the round objects dictionary, under the key 'apple' without manually updating the dictionary yourself. What's the most pythonic way to handle this?
Instead of making x and y global variables with a separate dictionary to refer to them, make the dictionary directly contain "x" and "y" as keys.
varSet = {'x': 1, 'y': 2}
Then, in your code, whenever you want to refer to these parameters, use varSet['x'] and varSet['y']. When you want to update them use varSet['x'] = newValue and so on. This way the dictionary will always be "up to date" and you don't need to store references to anything.
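A minimal sketch of that idea follows. The body of `my_function` is a stand-in (the question leaves the real computation out), and passing the dictionary as an argument is one possible choice; it could equally stay a module-level name:

```python
var_set = {"x": 1, "y": 2}


def my_function(params):
    # Stand-in for the real computation, which the question does not show.
    return params["x"] + params["y"]


solutions = []
for r in range(10):
    # Updating the dictionary is all that is needed; there is no separate
    # global x that has to be kept in sync.
    var_set["x"] = r
    solutions.append(my_function(var_set))
```

Because the function reads its inputs from the dictionary on every call, each call sees the value set just before it.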
We are going to take the example of fruits given in your 2nd edit:
def set_round_val(fruit_dict,round_dict):
    fruit_set = set(fruit_dict)
    round_set = set(round_dict)
    common_set = fruit_set.intersection(round_set) # get common keys
    for key in common_set:
        round_dict[key] = fruit_dict[key] # set modified value in round_dict
    return round_dict
fruit_dict = {'apple':34,'orange':30,'mango':20}
round_dict = {'bamboo':10,'apple':34,'orange':20} # values can even be same as fruit_dict
for r in range(1,10):
    fruit_dict['apple'] = r
    round_dict = set_round_val(fruit_dict,round_dict)
    print(round_dict)
Hope this helps.
From what I've gathered from the responses from @BrenBarn and @ebarr, this is the best way to go about the problem (and directly answer EDIT2).
Create a class which encapsulates the common variable:
class Count:
    def __init__(self,value):
        self.value = value
Create the instance of that class:
import Count
no_of_apples = Count.Count(1)
no_of_tennis_balls = Count.Count(5)
no_of_bananas = Count.Count(7)
Create dictionaries with the common variable in both of them:
round = {'tennis_ball':no_of_tennis_balls,'apple':no_of_apples}
fruit = {'banana':no_of_bananas,'apple':no_of_apples}
print(round['apple'].value) #prints 1
fruit['apple'].value = 2
print(round['apple'].value) #prints 2
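The same idea as a single self-contained sketch, with the class defined in the same file instead of imported from a separate Count module. The dictionary is named `round_things` here (an assumption, not the original name) to avoid shadowing the built-in round():

```python
class Count:
    """A small mutable holder so two dictionaries can share one number."""

    def __init__(self, value):
        self.value = value


no_of_apples = Count(1)

# Both dictionaries hold a reference to the same Count object under 'apple'.
round_things = {"tennis_ball": Count(5), "apple": no_of_apples}
fruit = {"banana": Count(7), "apple": no_of_apples}

# Updating through one dictionary is visible through the other.
fruit["apple"].value = 2
apples_seen_from_round = round_things["apple"].value
```

The update must go through .value; assigning fruit["apple"] = 2 would rebind only that one key and break the sharing.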

How to create a dictionary based on variable value in Python

I am trying to create a dictionary where the name comes from a variable.
Here is the situation since maybe there is a better way:
I'm using an API to get attributes of "objects" (Name, Description, X, Y, Z, etc.). I want to store this information in a way that keeps the data by "object".
In order to get this info, the API iterates through all the "objects".
So my proposal is that if the object name is one of the ones I want to "capture", I create a dictionary with that name, like so:
ObjectName = {'Description': VarDescrption, 'X': VarX.. etc}
(Where I say "Varetc...", that would be the value of that attribute passed by the API.)
Now, since I know the list of names ahead of time, I CAN use a really long if tree, but I am looking for something easier to code to accomplish this (and extensible without adding too much code).
Here is code I have:
def py_cell_object():
    #object counter - unrelated to question
    addtototal()
    #is this an object I want?
    if aw.aw_string (239)[:5] == "TDT3_":
        #If yes, make a dictionary with the object description as the name of the dictionary.
        vars()[aw.aw_string (239)]={'X': aw.aw_int (232), 'Y': aw.aw_int (233), 'Z': aw.aw_int (234), 'No': aw.aw_int (231)}
        #print back result to test
        for key in aw.aw_string (239):
            print 'key=%s, value=%s' % (key, aw.aw_string (239)[key])
Here are the first two lines of code, to show what "aw" is:
from ctypes import *
aw = CDLL("aw")
To explain what the numbers in the API calls are:
231 AW_OBJECT_NUMBER,
232 AW_OBJECT_X,
233 AW_OBJECT_Y,
234 AW_OBJECT_Z,
239 AW_OBJECT_DESCRIPTION,
231-234 are integers and 239 is a string
I deduce that you are using the Active Worlds SDK. It would save time to mention that in the first place in future questions.
I guess your goal is to create a top-level dictionary, where each key is the object description. Each value is another dictionary, storing many of the attributes of that object.
I took a quick look at the AW SDK documentation on the wiki and I don't see a way to ask the SDK for a list of attribute names, IDs, and types. So you will have to hard-code that information in your program somehow. Unless you need it elsewhere, it's simplest to just hard-code it where you create the dictionary, which is what you are already doing. To print it back out, just print the attribute dictionary's repr. I would probably format your method more like this:
def py_cell_object():
    #object counter - unrelated to question
    addtototal()
    description = aw.aw_string(239)
    if description.startswith("TDT3_"):
        vars()[description] = {
            'DESCRIPTION': description,
            'X': aw.aw_int(232),
            'Y': aw.aw_int(233),
            'Z': aw.aw_int(234),
            'NUMBER': aw.aw_int(231),
            # ... etc for remaining attributes
        }
        print repr(vars()[description])
Some would argue that you should make named constants for the numbers 232, 233, 234, etc., but I see little reason to do that unless you need them in multiple places, or unless it's easy to generate them automatically from the SDK (for example, by parsing a .h file).
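The top-level dictionary described above (object description as key, attribute dictionary as value) can be sketched without the SDK. `FakeAw`, `record_object`, `objects` and the sample attribute values are all hypothetical stand-ins. Only the attribute numbers and their names come from the question, and the sketch is in Python 3:

```python
# Attribute IDs as listed in the question.
AW_OBJECT_NUMBER = 231
AW_OBJECT_X = 232
AW_OBJECT_Y = 233
AW_OBJECT_Z = 234
AW_OBJECT_DESCRIPTION = 239


class FakeAw:
    """Hypothetical stand-in for the SDK handle; returns canned attribute values."""

    def __init__(self, attributes):
        self._attributes = attributes

    def aw_int(self, attribute_id):
        return self._attributes[attribute_id]

    def aw_string(self, attribute_id):
        return self._attributes[attribute_id]


def record_object(aw, store):
    """Store the current object's attributes in `store`, keyed by its description."""
    description = aw.aw_string(AW_OBJECT_DESCRIPTION)
    if description.startswith("TDT3_"):
        store[description] = {
            "NUMBER": aw.aw_int(AW_OBJECT_NUMBER),
            "X": aw.aw_int(AW_OBJECT_X),
            "Y": aw.aw_int(AW_OBJECT_Y),
            "Z": aw.aw_int(AW_OBJECT_Z),
        }


objects = {}
sample_objects = [  # made-up data for the sketch
    {AW_OBJECT_DESCRIPTION: "TDT3_door", AW_OBJECT_NUMBER: 7,
     AW_OBJECT_X: 10, AW_OBJECT_Y: 20, AW_OBJECT_Z: 30},
    {AW_OBJECT_DESCRIPTION: "tree", AW_OBJECT_NUMBER: 8,
     AW_OBJECT_X: 1, AW_OBJECT_Y: 2, AW_OBJECT_Z: 3},
]
for attributes in sample_objects:
    record_object(FakeAw(attributes), objects)
```

Using an ordinary dictionary as the store avoids writing into vars(), and each captured object can be looked up later as objects[description].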
If the variables are defined in the local scope, it's as simple as:
obj_names = {}
while True:
    varname = read_name()
    if not varname: break
    obj_names[varname] = locals()[varname]
This is actual code I am using in my production environment;
I hope it helps.
cveDict = {}
# strVul holds the vulnerabilities belonging to a report, converted to strings
report = Report.objects.get(pk=report_id)
vul = Vulnerability.objects.filter(report_id=report_id)
strVul = map(str, vul)
# fill up the python dict, += 1 if cvetype already exists
for cve in strVul:
    i = Cve.objects.get(id=cve)
    if i.vul_cvetype in cveDict.keys():
        cveDict[i.vul_cvetype] += 1
    else:
        cveDict[i.vul_cvetype] = 1
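The counting pattern in that loop (add 1 if the key exists, otherwise start at 1) is what collections.Counter does. A sketch with made-up type names in place of the database lookups, which are not reproduced here:

```python
from collections import Counter

# Hypothetical CVE type values standing in for i.vul_cvetype from each lookup.
cve_types = ["xss", "sqli", "xss", "csrf", "xss"]

# Counter handles the "first time seen" case itself.
cve_counts = Counter(cve_types)

# A Counter is a dict subclass, so it can be used wherever cveDict was used.
as_plain_dict = dict(cve_counts)
```

Looking up a type that never occurred returns 0 instead of raising KeyError.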

Python large list manipulation

I have python list like below:
DEMO_LIST = [
[{'unweighted_criket_data': [-46.14554728131345, 2.997789122813151, -23.66171024766996]},
{'weighted_criket_index_input': [-6.275794430258629, 0.4076993207025885, -3.2179925936831144]},
{'manual_weighted_cricket_data': [-11.536386820328362, 0.7494472807032877, -5.91542756191749]},
{'average_weighted_cricket_data': [-8.906090625293496, 0.5785733007029381, -4.566710077800302]}],
[{'unweighted_football_data': [-7.586729834820534, 3.9521665714843675, 5.702038461085529]},
{'weighted_football_data': [-3.512655913521907, 1.8298531225972623, 2.6400438074826]},
{'manual_weighted_football_data': [-1.8966824587051334, 0.9880416428710919, 1.4255096152713822]},
{'average_weighted_football_data': [-2.70466918611352, 1.4089473827341772, 2.0327767113769912]}],
[{'unweighted_rugby_data': [199.99999999999915, 53.91020408163265, -199.9999999999995]},
{'weighted_rugby_data': [3.3999999999999857, 0.9164734693877551, -3.3999999999999915]},
{'manual_rugby_data': [49.99999999999979, 13.477551020408162, -49.99999999999987]},
{'average_weighted_rugby_data': [26.699999999999886, 7.197012244897959, -26.699999999999932]}],
[{'unweighted_swimming_data': [2.1979283454982053, 14.079951031527246, -2.7585499298828777]},
{'weighted_swimming_data': [0.8462024130168091, 5.42078114713799, -1.062041723004908]},
{'manual_weighted_swimming_data': [0.5494820863745513, 3.5199877578818115, -0.6896374824707194]},
{'average_weighted_swimming_data': [0.6978422496956802, 4.470384452509901, -0.8758396027378137]}]]
I want to manipulate the list items and do some basic math operations, like taking each data type's list (for example, taking the first element of every unweighted data list and summing them).
Currently I am doing it like this.
The current solution is a very basic one. I want to do it in such a way that if the list grows, the results are still calculated automatically. Right now there are four lists, but there could be 5 or 8. The final result should be the sum of all the first elements of the unweighted values, for example:
now I am doing result_u1/4,result_u2/4,result_u3/4
I want it like result_u0/4,result_u1/4.......result_n4/4 # n is the number of list inside demo list
Any idea how I can do that?
(sorry for the beginner question)
You can implement a list subclass of your own that adds each new item's value to a running summary in append, and subtracts it again in remove:
class MyList(list):
    def __init__(self):
        self.summary = 0
        list.__init__(self)
    def append(self, item):
        # items are dicts, so the value is read by key
        self.summary += item['sample_value']
        list.append(self, item)
    def remove(self, item):
        self.summary -= item['sample_value']
        list.remove(self, item)
And a simple usage:
my_list = MyList()
print my_list.summary # Outputs 0
my_list.append({'sample_value': 10})
print my_list.summary # Outputs 10
In Python, whenever you start counting how many there are of something inside an iterable (a string, a list, a set, a collection of any of these) in order to loop over it, it's a sign that your code can be revised.
Code that works for 3 of something should work for 300, 3000 and 3 million of the same thing without being changed.
In your case, your logic is - "For every X inside DEMO_LIST, do something"
This translated into Python is:
for i in DEMO_LIST:
    # do something with i
This snippet will run through any size of DEMO_LIST, and each time i is one of the items inside DEMO_LIST. In your case, it is a list that contains your dictionaries.
Further expanding on that, you can say:
for i in DEMO_LIST:
    for k in i:
        # now you are in each list that is inside the outer DEMO_LIST
Expanding this to do a practical example; a sum of all unweighted_criket_data:
all_unweighted_cricket_data = []
for i in DEMO_LIST:
    for k in i:
        # the key is spelled 'criket' in DEMO_LIST, so the same spelling is used here
        if 'unweighted_criket_data' in k:
            for data in k['unweighted_criket_data']:
                all_unweighted_cricket_data.append(data)
sum_of_data = sum(all_unweighted_cricket_data)
There are various "shortcuts" to do the same, but you can appreciate those once you understand the "expanded" version of what the shortcut is trying to do.
Remember there is nothing wrong with writing it out the 'long way' especially when you are not sure of the best way to do something. Once you are comfortable with the logic, then you can use shortcuts like list comprehensions.
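For comparison, here is one such shortcut as a sketch in Python 3. It computes a sum and a mean per position across all the unweighted lists, which is what the question asks for. The small `demo` list is made-up data with the same shape as DEMO_LIST, and the sketch assumes every unweighted key starts with "unweighted_", as it does in the question's data:

```python
# Made-up data with the same nesting as DEMO_LIST, kept small for readability.
demo = [
    [{"unweighted_a_data": [1.0, 2.0, 3.0]}, {"weighted_a_data": [0.1, 0.2, 0.3]}],
    [{"unweighted_b_data": [4.0, 5.0, 6.0]}, {"weighted_b_data": [0.4, 0.5, 0.6]}],
    [{"unweighted_c_data": [7.0, 8.0, 9.0]}, {"weighted_c_data": [0.7, 0.8, 0.9]}],
]

# Collect every list stored under a key that starts with "unweighted_".
unweighted_rows = [
    values
    for sport in demo
    for entry in sport
    for key, values in entry.items()
    if key.startswith("unweighted_")
]

# zip(*rows) groups the first elements together, then the second, and so on.
column_sums = [sum(column) for column in zip(*unweighted_rows)]
column_means = [total / len(unweighted_rows) for total in column_sums]
```

Nothing in this code mentions the number of sports, so it keeps working if the outer list grows from four entries to five or eight.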
Start by replacing this:
for i in range(0,len(data_list)-1):
    result_u1+=data_list[i][0].values()[0][0]
    result_u2+=data_list[i][0].values()[0][1]
    result_u3+=data_list[i][0].values()[0][2]
print "UNWEIGHTED",result_u1/4,result_u2/4,result_u3/4
With this:
sz = len(data_list[0][0].values()[0])
result_u = [0] * sz
for i in range(0,len(data_list)-1):
    for j in range(0,sz):
        result_u[j] += data_list[i][0].values()[0][j]
print "UNWEIGHTED", [x/len(data_list) for x in result_u]
Apply similar changes elsewhere. This assumes that your data really is "rectangular", that is to say every corresponding inner list has the same number of values.
A slightly more "Pythonic"[*] version of:
for j in range(0,sz):
    result_u[j] += data_list[i][0].values()[0][j]
is:
for j, dataval in enumerate(data_list[i][0].values()[0]):
    result_u[j] += dataval
There are some problems with your code, though:
values()[0] might give you any of the values in the dictionary, since dictionaries are unordered. Maybe it happens to give you the unweighted data, maybe not.
I'm confused why you're looping on the range 0 to len(data_list)-1: if you want to include all the sports you need 0 to len(data_list), because the second parameter to range, the upper limit, is excluded.
You could perhaps consider reformatting your data more like this:
DEMO_LIST = {
    'cricket' : {
        'unweighted' : [1,2,3],
        'weighted' : [4,5,6],
        'manual' : [7,8,9],
        'average' : [10,11,12],
    },
    'rugby' : ...
}
Once you have the same keys in each sport's dictionary, you can replace values()[0] with ['unweighted'], so you'll always get the right dictionary entry. And once you have a whole lot of dictionaries all with the same keys, you can replace them with a class or a named tuple, to define/enforce that those are the values that must always be present:
import collections
Sport = collections.namedtuple('Sport', 'unweighted weighted manual average')
DEMO_LIST = {
    'cricket' : Sport(
        unweighted = [1,2,3],
        weighted = [4,5,6],
        manual = [7,8,9],
        average = [10,11,12],
    ),
    'rugby' : ...
}
Now you can replace ['unweighted'] with .unweighted.
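With the data in that shape, the averaging can be written once for any field. A sketch in Python 3: the helper name `column_means` and the rugby numbers are made up for illustration, since the answer leaves that entry elided:

```python
import collections

Sport = collections.namedtuple("Sport", "unweighted weighted manual average")

# The rugby values are invented; only the cricket values appear in the answer.
sports = {
    "cricket": Sport(unweighted=[1, 2, 3], weighted=[4, 5, 6],
                     manual=[7, 8, 9], average=[10, 11, 12]),
    "rugby": Sport(unweighted=[3, 4, 5], weighted=[6, 7, 8],
                   manual=[9, 10, 11], average=[12, 13, 14]),
}


def column_means(all_sports, field):
    """Average the named field position by position across every sport."""
    rows = [getattr(sport, field) for sport in all_sports.values()]
    return [sum(column) / len(rows) for column in zip(*rows)]


unweighted_means = column_means(sports, "unweighted")
weighted_means = column_means(sports, "weighted")
```

Because the field is looked up by name, the result no longer depends on dictionary ordering, which was the problem with values()[0].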
[*] The word "Pythonic" officially means something like, "done in the style of a Python programmer, taking advantage of any useful Python features to produce the best idiomatic Python code". In practice it usually means "I prefer this, and I'm a Python programmer, therefore this is the correct way to write Python". It's an argument by authority if you're Guido van Rossum, or by appeal to nebulous authority if you're not. In almost all circumstances it can be replaced with "good IMO" without changing the sense of the sentence ;-)
