I'm reading David Beazley and Brian K. Jones's book "Python Cookbook" (3rd edition). On page 35 there is an example of ChainMap. I don't quite understand how the values change from step to step, so please see my understanding and questions below:
>>> values = ChainMap()
>>> values['x'] = 1
I suppose now values is ChainMap({'x': 1})
>>> # Add a new mapping
>>> values = values.new_child()
>>> values['x'] = 2
At the end of these 3 lines, what is values now? The ChainMap, or the dictionary inside it?
I'm a bit lost. My reading of this code is that the ChainMap adds a new child, which should be a new dictionary, so values is rebound to that new dictionary linked from the ChainMap.
>>> # Add a new mapping
>>> values = values.new_child()
>>> values['x'] = 3
But now values calls new_child() again! Isn't it the case that new_child() can only be called by a ChainMap, not by a particular dictionary it links?
That's a great book, I suppose every Python developer should read at least chapters 1 and 4
Concerning your question:
At the end of these 3 lines, what is values now?
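values is a ChainMap object with 2 dictionaries mapped, the newest one first:
values = ChainMap({'x': 2}, {'x': 1})
The confusing thing is that new_child() does not return the new dictionary. It returns a new ChainMap whose first mapping is that new, empty dictionary, followed by all the mappings of the original. The original ChainMap is not mutated; assigning the result back to values just rebinds the name.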
But now values calls new_child() again!
Of course you can call new_child() as many times as you want; values will always be a ChainMap object, which defines the new_child() method.
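To make the sequence from the question concrete, here is a minimal sketch that replays it and inspects the result after each step. The name original is only an illustrative extra reference and is not part of the book's example.

```python
from collections import ChainMap

values = ChainMap()
values['x'] = 1
original = values              # extra reference to the first ChainMap (illustrative)

values = values.new_child()    # new ChainMap: fresh dict in front, old maps behind
values['x'] = 2                # writes always go to the first (newest) mapping

values = values.new_child()
values['x'] = 3

print(values)                  # ChainMap({'x': 3}, {'x': 2}, {'x': 1})
print(values['x'])             # 3, lookups search the mappings front to back
print(values.parents['x'])     # 2, parents skips the first mapping
print(original)                # ChainMap({'x': 1}), the first object is unchanged
```

Every call to new_child() returns a ChainMap, so the name values never refers to a plain dictionary at any point.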
I'm working on a Premier League dataset and I need to create a dictionary where the keys are the teams and the values are their respective points. I have a list of the teams and a function that takes the match results and transforms them into points for each team. Everything works, but instead of creating one dictionary with all the teams and their scores, it prints 20 dictionaries, one for each team. What is wrong?
You are creating a new dictionary at each iteration. Instead you should make the dictionary before the loop and then add a new entry at each iteration:
def get_team_points(df, teams):
    team_points = {}
    for team_name in teams:
        num_points = ... # as you have it but since you posted an image I'm not rewriting it
        team_points[team_name] = num_points
    return team_points
A neater solution is to use a dictionary comprehension
def get_team_points(df, teams):
    team_points = {team: get_num_points(team, df) for team in teams}
    return team_points
where get_num_points is a function of your num_points = ... line, which again I would type out if you had posted the code as text :)
Also - please start using better variable names ;) your life will improve if you do. Names like List and Dict are really bad since:
they're not descriptive
they shadow the List and Dict classes from the typing module (which you should use)
they violate pep8 naming conventions
and speaking of the typing module, here it is in action:
from typing import Dict, List
import pandas as pd

def get_team_points(df: pd.DataFrame, teams: List[str]) -> Dict[str, int]:
    team_points = {team: get_num_points(team, df) for team in teams}
    return team_points
now you can use a tool like mypy to catch errors before they occur. If you use an IDE instead of jupyter, it will highlight errors as you go. And also your code becomes much clearer for other developers (including future you) to understand and use.
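Since the asker's scoring code was only posted as an image, the following is a sketch of how the typed comprehension could fit together. It assumes a hypothetical get_num_points, a plain list of match tuples instead of a DataFrame, the usual 3/1/0 points rule, and made-up sample matches.

```python
from typing import Dict, List, Tuple

# Hypothetical match record: (home_team, away_team, home_goals, away_goals).
Match = Tuple[str, str, int, int]


def get_num_points(team: str, matches: List[Match]) -> int:
    """Return the points for one team, assuming 3 per win and 1 per draw."""
    points = 0
    for home, away, home_goals, away_goals in matches:
        if team not in (home, away):
            continue
        # Look at the score from the point of view of `team`.
        own, other = (home_goals, away_goals) if team == home else (away_goals, home_goals)
        if own > other:
            points += 3
        elif own == other:
            points += 1
    return points


def get_team_points(matches: List[Match], teams: List[str]) -> Dict[str, int]:
    # One dictionary for all teams, built in a single expression.
    return {team: get_num_points(team, matches) for team in teams}


# Made-up sample data, for illustration only.
matches = [("A", "B", 2, 1), ("B", "C", 0, 0)]
team_points = get_team_points(matches, ["A", "B", "C"])
print(team_points)  # {'A': 3, 'B': 1, 'C': 1}
```

With annotations like these, a checker such as mypy can flag a call that passes the arguments in the wrong order or treats the result as a list.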
I think perhaps you want this:
def get_team_points(df, teams):
    Dict = {}
    for team_name in teams:
        num_points = TeamPoints(...)
        Dict[team_name] = num_points
    print(Dict)
In the TeamsPointDict() method, you are creating a separate dictionary for each team in the list.
To insert all of them in one dictionary, declare the dictionary outside the for loop.
You want to take the sum of HP for Home teams, and AP for Away teams and add them together by team. Instead of manually separating, you can use two groupby operations and sum the results.
The return of each groupby will be a Series that we can then add together as pandas aligns on the index (teams in this case). Then with Series.to_dict() we get the entire dictionary at once.
import pandas as pd
df = pd.DataFrame({'HomeTeam': list('AABCDA'), 'AwayTeam': list('CBAAAB'),
                   'HP': [4,5,6,7,8,10], 'AP': [0,0,10,11,4,7]})
  HomeTeam AwayTeam  HP  AP
0        A        C   4   0
1        A        B   5   0
2        B        A   6  10
3        C        A   7  11
4        D        A   8   4
5        A        B  10   7
# Fill value so addition works if a team has exclusively home/away games.
s = df.groupby('HomeTeam')['HP'].sum().add(df.groupby('AwayTeam')['AP'].sum(),
                                           fill_value=0).astype(int)
s.to_dict()
{'A': 44, 'B': 13, 'C': 7, 'D': 8}
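For comparison, the same totals can be computed without pandas. This is a plain-Python sketch using collections.defaultdict rather than the groupby approach above; the rows mirror the sample DataFrame, and the name rows and the tuple layout are assumptions made for the illustration.

```python
from collections import defaultdict

# Same sample rows as the DataFrame above: (HomeTeam, AwayTeam, HP, AP).
rows = [
    ("A", "C", 4, 0),
    ("A", "B", 5, 0),
    ("B", "A", 6, 10),
    ("C", "A", 7, 11),
    ("D", "A", 8, 4),
    ("A", "B", 10, 7),
]

totals = defaultdict(int)   # missing teams start at 0, like fill_value=0
for home, away, home_points, away_points in rows:
    totals[home] += home_points
    totals[away] += away_points

team_points = dict(sorted(totals.items()))
print(team_points)  # {'A': 44, 'B': 13, 'C': 7, 'D': 8}
```

The pandas version is likely the better fit once the data already lives in a DataFrame; the loop just makes the per-team addition explicit.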
You should define your dictionary before the loop, then add your values.
dic = {}
for team_name in List:
    dic[team_name] = num_points
I'm working on rewriting a lengthy Rexx script into a Python program and I am trying to figure out the best way to emulate the functionality of a Rexx compound variable. Would a dictionary be the best bet? Obviously, a dictionary will behave differently and won't be exactly the same as a compound variable.
Python dictionaries and Rexx stems are both associative arrays. They differ a bit in how they behave. Rexx's rules are very simple:
An array reference is split into the "stem" and the "tail", separated by a single dot.
The stem is a variable name, case-independently. This is the dictionary.
The tail is processed to identify an element of the array. It is split into one or more dot-separated substrings. Each substring is treated as a variable: if there is a variable with that case-independent name, its value is used instead of its name. Otherwise the name is uppercased and used. The string is put back together, dots and all. This is the key.
The array can have a default value, set by stem. = value, which applies to all unset elements.
So, the result of an array reference stem.tailpart1.tailpart2.tailpart3 in Python is:
def evaluate_tail(tail, outer_locals):
    result = []
    for element in tail.split('.'):
        if element in outer_locals:
            result.append(str(outer_locals[element]))
        else:
            result.append(str(element).upper())
    return '.'.join(result)

array_default_value = 4
stem = {'A.B.C': 1, 'A.9.C': 2, 'A..q': 3}
b = 9
d = 'q'
tail1 = 'a.b.c'
tail2 = 'a..b'
tail3 = 'a..d'
stem.get(evaluate_tail(tail1,locals()), array_default_value) # 'stem.a.b.c' >>> stem['A.9.C'] >>> 2
stem.get(evaluate_tail(tail2,locals()), array_default_value) # 'stem.a..b' >>> stem['A..9'] (not found) >>> (default value) >>> 4
stem.get(evaluate_tail(tail3,locals()), array_default_value) # 'stem.a..d' >>> stem['A..q'] >>> 3
Rexx stem variables and Python dictionaries are similar, but there are differences.
Consider creating a RexxStem class based on a dictionary.
Simple Stem expressions
a.b
can be translated to python as
a[b]
Compound Stem expressions
From my experience
a.b.c.d
would be translated to python as
a[b + '.' + c + '.' + d]
Try running the following Rexx with your current interpreter and see what you get:
a.2.3 = 'qwerty'
zz = 2'.'3
say a.zz
in some rexx interpreters you would get 'qwerty'. Not sure if that is all
Initializing Stem Variables
In Rexx you can initialize a stem variable like this:
a. = 'abc'
Some common uses are
no = 0
yes = 1
found. = no
if ... then do
    found.v = yes
end
....
if found.y = yes then do
    ..
end
or
counts. = 0
do while ...
    if ... then do
        counts.v = counts.v + 1;
    end
end
Initial Value of a stem variable
Like all Rexx variables, the default/initial value of a stem variable is its own name in uppercase, so the default value of a.2.3 is A.2.3. If you are coming from another language this may seem strange, but it can be quite handy in debugging: if a variable name pops up unexpectedly, you have not initialized it. It also means numeric expressions crash if you do not initialize a variable.
This is not something you need to implement, just something to be aware of.
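As a rough sketch of the RexxStem idea mentioned in this answer, a small dict subclass can cover the two behaviours described: a stem-wide default (like counts. = 0) and compound tails joined with dots. The class name, the default parameter, and the tuple-indexing convention are all assumptions made for illustration. It does not uppercase names or resolve tail parts as variables the way a real Rexx interpreter does.

```python
class RexxStem(dict):
    """Dictionary with a Rexx-like stem default (hypothetical helper)."""

    def __init__(self, default=None):
        super().__init__()
        self.default = default          # plays the role of `stem. = value`

    @staticmethod
    def _key(tail):
        # A tuple tail such as (2, 3) becomes the compound key '2.3'.
        if isinstance(tail, tuple):
            return '.'.join(str(part) for part in tail)
        return str(tail)

    def __getitem__(self, tail):
        # Unset elements fall back to the stem default instead of raising.
        return super().get(self._key(tail), self.default)

    def __setitem__(self, tail, value):
        super().__setitem__(self._key(tail), value)


a = RexxStem()
a[2, 3] = 'qwerty'       # like a.2.3 = 'qwerty'
zz = '2.3'
print(a[zz])             # qwerty

counts = RexxStem(default=0)   # like counts. = 0
counts['v'] = counts['v'] + 1
counts['v'] += 1
print(counts['v'])       # 2
print(counts['other'])   # 0
```

Because the keys are stored as plain strings, a[2, 3] and a['2.3'] address the same element, which mirrors the a.zz experiment above for interpreters that print 'qwerty'.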
I am not a Python person but I know what a Dictionary is.
Depending on how complex the Rexx compound variable is, yes.
a.b
...is easily translatable to a dictionary.
a.b.c.d.e.f.g.h
...is less easily translatable to a dictionary. Perhaps a dictionary within a dictionary within a dictionary within a dictionary within a dictionary within a dictionary within a dictionary.
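If nesting is the route taken, the intermediate levels do not have to be created by hand. A minimal sketch, assuming string tail parts, with a flat tuple-keyed dictionary shown as an alternative; the names nested_dict and flat are illustrative.

```python
from collections import defaultdict


def nested_dict():
    """Return a dictionary that creates deeper levels on demand."""
    return defaultdict(nested_dict)


a = nested_dict()
a["b"]["c"]["d"] = "value"      # intermediate levels appear automatically
print(a["b"]["c"]["d"])         # value

# Flat alternative: one dictionary keyed by the whole tail as a tuple.
flat = {}
flat["b", "c", "d"] = "value"
print(flat["b", "c", "d"])      # value
```

The flat form is usually simpler when the number of tail parts varies, since no level has to know how deep the structure goes.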
I am currently writing a script that extracts data from an xml and writes it into an html file for easy viewing on a webpage.
Each piece of data has 2 pieces of "sub data": Owner and Type.
In order for the html to work properly I need the "owner" string and the "type" string to be written in the correct place. If it was just a single piece of data then I would use dictionaries and just use the data name as the key and then write the value to html, however there are 2 pieces of data.
My question is, can a dictionary have 2 values (in my case owner and type) assigned to a single key?
Any object can be a value in a dictionary, so you can use any collection to hold more than one value against the same key. To expand my comments into some code samples, in order of increasing complexity (and, in my opinion, readability):
Tuple
The simplest option is a two-tuple of strings, which you can access by index:
>>> d1 = {'key': ('owner', 'type')}
>>> d1['key'][0]
'owner'
>>> d1['key'][1]
'type'
Dictionary
Next up is a sub-dictionary, which allows you to access the values by key name:
>>> d2 = {'key': {'owner': 'owner', 'type': 'type'}}
>>> d2['key']['owner']
'owner'
>>> d2['key']['type']
'type'
Named tuple
Finally the collections module provides namedtuple, which requires a little setup but then allows you to access the values by attribute name:
>>> from collections import namedtuple
>>> MyTuple = namedtuple('MyTuple', ('owner', 'type'))
>>> d3 = {'key': MyTuple('owner', 'type')}
>>> d3['key'].owner
'owner'
>>> d3['key'].type
'type'
Using named keys/attributes makes your subsequent access to the values clearer (d3['key'].owner and d2['key']['owner'] are less ambiguous than d1['key'][0]).
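Applied to the question's use case, a named tuple per key keeps the owner and type together until the HTML is written. This is only a sketch: the item names, the Entry type, and the table layout are made up for illustration, and the XML parsing step is left out.

```python
from collections import namedtuple
from html import escape

Entry = namedtuple("Entry", ("owner", "type"))

# Hypothetical data standing in for what would be extracted from the XML.
data = {
    "report.xml": Entry(owner="alice", type="document"),
    "logo.png": Entry(owner="bob", type="image"),
}

# One table row per key, with each value written to its own cell.
rows = [
    "<tr><td>{}</td><td>{}</td><td>{}</td></tr>".format(
        escape(name), escape(entry.owner), escape(entry.type)
    )
    for name, entry in data.items()
]
html_table = "<table>\n" + "\n".join(rows) + "\n</table>"
print(html_table)
```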
As long as keys are hashable, you can have keys of any format. Note that tuples are hashable (provided their elements are), so that would be a possible solution to your problem.
Make a tuple of case-owner and type and use it as a key to your dictionary.
Note that, in general, all hashable objects should also be immutable, but not vice versa.
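A short sketch of the tuple-as-key idea, with made-up owner and type strings, also shows what happens when an unhashable list is used as a key instead.

```python
# Hypothetical entries keyed by an (owner, type) tuple.
lookup = {("alice", "document"): "report.xml"}
print(lookup["alice", "document"])   # report.xml

list_key_failed = False
try:
    lookup[["alice", "document"]] = "other.xml"   # a list is mutable, hence unhashable
except TypeError as exc:
    list_key_failed = True
    print("lists cannot be dictionary keys:", exc)
```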
I'm trying to run a simple program in which I'm trying to run random.randint() in a loop to update a dictionary value but it seems to be working incorrectly. It always seems to be generating the same value.
The program so far is given below. I'm trying to create a uniformly distributed population, but I'm unsure why this isn't working.
import random
__author__ = 'navin'
namelist={
    "person1":{"age":23,"region":1},
    "person2":{"age":24,"region":2},
    "person3":{"age":25,"region":0}
}
def testfunction():
    default_val={"age":23,"region":1}
    for i in xrange(100):
        namelist[i]=default_val
    for index in namelist:
        x = random.randint(0, 2)
        namelist[index]['region']=x
    print namelist
if __name__ == "__main__" :
    testfunction()
I'm expecting the 103 people to be roughly uniformly distributed across region 0-2, but I'm getting everyone in region 0.
Any idea why this is happening? Have I incorrectly used randint?
It is because all your 100 dictionary entries created in the for loop refer to not only the same value, but the same object. Thus there are only 4 distinct dictionaries at all as the values - the 3 created initially and the fourth one that you add 100 times with keys 0-99.
This can be demonstrated with the id() function, which returns a distinct integer for each distinct object:
from collections import Counter
...
ids = [ id(i) for i in namelist.values() ]
print Counter(ids)
results in:
Counter({139830514626640: 100, 139830514505160: 1,
         139830514504880: 1, 139830514505440: 1})
To get distinct dictionaries, you need to copy the default value:
namelist[i] = default_val.copy()
Or create a new dictionary on each loop
namelist[i] = {"age": 23, "region": 1}
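The difference between sharing and copying can be seen in a few lines. This sketch is written for Python 3 and uses three entries instead of 100; the names shared and copied are illustrative.

```python
default_val = {"age": 23, "region": 1}

shared = {i: default_val for i in range(3)}          # the same object three times
copied = {i: default_val.copy() for i in range(3)}   # three independent dicts

shared[0]["region"] = 2   # changes the one dict that every key refers to
copied[0]["region"] = 2   # changes only the dict stored under key 0

print(shared[0] is shared[1])   # True
print(copied[0] is copied[1])   # False
print(shared[1]["region"])      # 2
print(copied[1]["region"])      # 1
```

Note that dict.copy() is a shallow copy. That is enough here because the values are plain integers, but nested mutable values would still be shared.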
default_val={"age":23,"region":1}
for i in xrange(100):
    namelist[i]=default_val
This doesn't mean "set every entry to a dictionary with these particular age and region values". This means "set every entry to this particular dictionary object".
for index in namelist:
    x = random.randint(0, 2)
    namelist[index]['region']=x
Since every object in namelist is really the same dictionary, all modifications in this loop happen to the same dictionary, and the last value of x wipes the others.
Evaluating a dict literal creates a new dict; assignment does not. If you want to make a new dictionary each time, put the dict literal in the loop:
for i in xrange(100):
    namelist[i]={"age":23,"region":1}
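Putting the fix into the whole program, a Python 3 sketch might look like the following. The function name assign_regions and the Counter summary are additions for illustration; the exact counts differ from run to run.

```python
import random
from collections import Counter

namelist = {
    "person1": {"age": 23, "region": 1},
    "person2": {"age": 24, "region": 2},
    "person3": {"age": 25, "region": 0},
}


def assign_regions(people, extra=100):
    """Add `extra` default people, then give everyone a random region 0-2."""
    for i in range(extra):
        people[i] = {"age": 23, "region": 1}   # dict literal: a new object each time
    for person in people.values():
        person["region"] = random.randint(0, 2)
    return people


assign_regions(namelist)
region_counts = Counter(person["region"] for person in namelist.values())
print(region_counts)   # roughly a third in each region; exact numbers vary
```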
I wanted to add this as a comment, but the link is too long. As others have said, you have just shared the reference to the dictionary. If you want to see a visualisation, you can check it out on Python Tutor; it should help you grok what's happening.
For a test program I'm making a simple model of the NFL. I'd like to assign a record (wins and losses) to a team as a value in a dictionary? Is that possible?
For example:
afcNorth = ["Baltimore Ravens", "Pittsburgh Steelers", "Cleveland Browns", "Cincinatti Bengals"]
If the Ravens had 13 wins and 3 losses, can the dictionary account for both of those values? If so, how?
sure, just make the value a list or tuple:
afc = {'Baltimore Ravens': (10,3), 'Pb Steelers': (3,4)}
If it gets more complicated, you might want to make a more complicated structure than a tuple - for example if you like dictionaries, you can put a dictionary in your dictionary so you can dictionary while you dictionary.
afc = {'Baltimore Ravens': {'wins':10,'losses': 3}, 'Pb Steelers': {'wins': 3,'losses': 4}}
But eventually you might want to move up to classes...
The values in the dictionary can be tuples or, maybe better in this case, lists:
d = {"Baltimore Ravens": [13, 3]}
d["Baltimore Ravens"][0] += 1
print d
# {'Baltimore Ravens': [14, 3]}
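The practical difference between the two choices shows up when a record has to change. A small sketch in Python 3, using the 13-3 record from the question; the dictionary names are illustrative.

```python
tuple_records = {"Baltimore Ravens": (13, 3)}
list_records = {"Baltimore Ravens": [13, 3]}

# A list can be updated in place.
list_records["Baltimore Ravens"][0] += 1

# A tuple is immutable, so item assignment raises TypeError ...
try:
    tuple_records["Baltimore Ravens"][0] += 1
except TypeError:
    # ... and the whole value has to be replaced instead.
    wins, losses = tuple_records["Baltimore Ravens"]
    tuple_records["Baltimore Ravens"] = (wins + 1, losses)

print(list_records)    # {'Baltimore Ravens': [14, 3]}
print(tuple_records)   # {'Baltimore Ravens': (14, 3)}
```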
Well, you can use a tuple (or a list):
records = {}
records["Baltimore Ravens"] = (13, 3)
Or you could be fancy and make a Record class with record.wins and record.losses attributes, but that's probably overkill.
(As another answer points out, using a list means that you can do arithmetic on the values, which might be useful.)
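For completeness, here is what the class-based option could look like. This is a sketch using dataclasses (Python 3.7+); the Record name comes from the answer above, while the add_win, add_loss, and games members are assumptions added for illustration.

```python
from dataclasses import dataclass


@dataclass
class Record:
    wins: int = 0
    losses: int = 0

    def add_win(self) -> None:
        self.wins += 1

    def add_loss(self) -> None:
        self.losses += 1

    @property
    def games(self) -> int:
        return self.wins + self.losses


records = {"Baltimore Ravens": Record(wins=13, losses=3)}
records["Baltimore Ravens"].add_win()
print(records["Baltimore Ravens"])         # Record(wins=14, losses=3)
print(records["Baltimore Ravens"].games)   # 17
```

Compared with a tuple or list, the attribute names make it harder to mix up which position holds wins and which holds losses.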