I'm new to Python, and am sort of surprised I cannot do this.
dictionary = {
    'a' : '123',
    'b' : dictionary['a'] + '456'
}
I'm wondering what the Pythonic way to do this correctly in my script is, because I feel like I'm not the only one who has tried to do this.
EDIT: Enough people were wondering what I'm doing with this, so here are more details about my use case. Let's say I want to keep dictionary objects that hold file system paths. The paths are relative to other values in the dictionary. For example, this is what one of my dictionaries may look like.
dictionary = {
    'user': 'sholsapp',
    'home': '/home/' + dictionary['user']
}
It is important that at any point in time I may change dictionary['user'] and have all of the dictionary's values reflect the change. Again, this is an example of what I'm using it for, so I hope that it conveys my goal.
From my own research I think I will need to implement a class to do this.
Don't be afraid of creating new classes. You can take advantage of Python's string formatting capabilities and simply do:
class MyDict(dict):
    def __getitem__(self, item):
        # Interpolate %(key)s references against the dict itself on every lookup
        return dict.__getitem__(self, item) % self
dictionary = MyDict({
    'user' : 'gnucom',
    'home' : '/home/%(user)s',
    'bin' : '%(home)s/bin'
})
print(dictionary["home"])
print(dictionary["bin"])
The nearest I came up with, without using an object:
dictionary = {
    'user' : 'gnucom',
    'home' : lambda: '/home/' + dictionary['user']
}
print(dictionary['home']())
dictionary['user'] = 'tony'
print(dictionary['home']())
>>> dictionary = {
... 'a':'123'
... }
>>> dictionary['b'] = dictionary['a'] + '456'
>>> dictionary
{'a': '123', 'b': '123456'}
This works fine. In your version, dictionary hasn't been defined yet at the point where you try to use it (because Python has to evaluate the dictionary literal first, before the name is bound).
But be careful because this assigns to the key of 'b' the value referenced by the key of 'a' at the time of assignment and is not going to do the lookup every time. If that is what you are looking for, it's possible but with more work.
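A minimal sketch of that caveat, using the same keys as above (the replacement value 'xyz' is only an illustration): the value stored under 'b' is computed once and does not follow later changes to 'a'.

```python
dictionary = {'a': '123'}
dictionary['b'] = dictionary['a'] + '456'  # evaluated once, right here

dictionary['a'] = 'xyz'                    # later change to 'a'
print(dictionary['b'])                     # still the old, already-built string
```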
What you're describing in your edit is how an INI config file works. Python has a built-in library called ConfigParser (renamed configparser in Python 3), which should work for what you're describing.
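As a rough sketch of how that could look in Python 3, assuming a hypothetical section named "paths" and the default %(name)s interpolation of configparser (the section name and values are made up for illustration):

```python
import configparser

parser = configparser.ConfigParser()
# read_string parses INI text directly; a real script would likely use parser.read(filename)
parser.read_string("""
[paths]
user = sholsapp
home = /home/%(user)s
bin = %(home)s/bin
""")

paths = parser["paths"]
bin_before = paths["bin"]    # interpolation happens when the value is read

paths["user"] = "tony"       # change one value...
home_after = paths["home"]   # ...and dependent values follow on the next read
bin_after = paths["bin"]
print(bin_before, home_after, bin_after)
```

Because interpolation is applied on each read rather than when the file is parsed, changing user is reflected in home and bin without any extra code.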
This is an interesting problem. It seems like Greg has a good solution. But that's no fun ;)
jsbueno has a very elegant solution, but it only applies to strings (as you requested).
The trick to a 'general' self referential dictionary is to use a surrogate object. It takes a few (understatement) lines of code to pull off, but the usage is about what you want:
S = SurrogateDict(AdditionSurrogateDictEntry)
d = S.resolve({'user': 'gnucom',
               'home': '/home/' + S['user'],
               'config': [S['home'] + '/.emacs', S['home'] + '/.bashrc']})
The code to make that happen is not nearly so short. It lives in three classes:
import abc
class SurrogateDictEntry(object):
    __metaclass__ = abc.ABCMeta
    def __init__(self, key):
        """record the key on the real dictionary that this will resolve to a
        value for
        """
        self.key = key
    def resolve(self, d):
        """return the actual value"""
        if hasattr(self, 'op'):
            # any operation done on self will store its name in self.op.
            # if this is set, resolve it by calling the appropriate method
            # now that we can get self.value out of d
            self.value = d[self.key]
            return getattr(self, self.op + 'resolve__')()
        else:
            return d[self.key]
    @staticmethod
    def make_op(opname):
        """A convenience function. This will be the form of all op hooks for subclasses.
        The actual logic for the op is in __op__resolve__ (e.g. __add__resolve__)
        """
        def op(self, other):
            self.stored_value = other
            self.op = opname
            return self
        op.__name__ = opname
        return op
Next comes the concrete class. Simple enough.
class AdditionSurrogateDictEntry(SurrogateDictEntry):
    __add__ = SurrogateDictEntry.make_op('__add__')
    __radd__ = SurrogateDictEntry.make_op('__radd__')
    def __add__resolve__(self):
        return self.value + self.stored_value
    def __radd__resolve__(self):
        return self.stored_value + self.value
Here's the final class:
class SurrogateDict(object):
    def __init__(self, EntryClass):
        self.EntryClass = EntryClass
    def __getitem__(self, key):
        """record the key and return"""
        return self.EntryClass(key)
    @staticmethod
    def resolve(d):
        """I eat generators. Resolve self references."""
        stack = [d]
        while stack:
            cur = stack.pop()
            # This just tries to set it to an appropriate iterable
            it = xrange(len(cur)) if not hasattr(cur, 'keys') else cur.keys()
            for key in it:
                # Sorry for being strict here. Just register your class with
                # SurrogateDictEntry and you can pass whatever.
                while isinstance(cur[key], SurrogateDictEntry):
                    cur[key] = cur[key].resolve(d)
                # I'm just going to check for iter but you can add other
                # checks here for items that we should loop over.
                if hasattr(cur[key], '__iter__'):
                    stack.append(cur[key])
        return d
In response to gnucom's question about why I named the classes the way I did:
The word surrogate is generally associated with standing in for something else, so it seemed appropriate because that's what the SurrogateDict class does: an instance replaces the 'self' references in a dictionary literal. That being said (other than just being straight up stupid sometimes), naming is probably one of the hardest things for me about coding. If you (or anyone else) can suggest a better name, I'm all ears.
I'll provide a brief explanation. Throughout, S refers to an instance of SurrogateDict and d is the real dictionary.
1. A reference S[key] triggers S.__getitem__, which causes SurrogateDictEntry(key) to be placed in d.
2. When S[key] = SurrogateDictEntry(key) is constructed, it stores key. This will be the key into d for the value that this instance of SurrogateDictEntry is acting as a surrogate for.
3. After S[key] is returned, it is either entered into d or has some operation(s) performed on it. If an operation is performed on it, that triggers the relevant __op__ method, which simply stores the value that the operation is performed with and the name of the operation, and then returns itself. We can't actually resolve the operation yet because d hasn't been constructed.
4. After d is constructed, it is passed to S.resolve. This method loops through d, finding any instances of SurrogateDictEntry and replacing them with the result of calling the resolve method on the instance.
5. The SurrogateDictEntry.resolve method receives the now constructed d as an argument and can use the value of key that it stored at construction time to get the value that it is acting as a surrogate for. If an operation was performed on it after creation, the op attribute will have been set to the name of the operation that was performed. If the class has an __op__ method, then it has an __op__resolve__ method with the actual logic that would normally be in the __op__ method. So now we have the logic (self.__op__resolve__) and all necessary values (self.value, self.stored_value) to finally get the real value of d[key]. So we return that, and step 4 places it in the dictionary.
6. Finally, the SurrogateDict.resolve method returns d with all references resolved.
That's a rough sketch. If you have any more questions, feel free to ask.
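The steps above can be condensed into a much smaller Python 3 sketch. This is not the code from this answer: Ref and resolve_all are hypothetical names, only + is supported, and nested lists are not walked. It only illustrates the idea of recording an operation on a placeholder and replaying it once the real dictionary exists.

```python
class Ref:
    """Placeholder for d[key]; records '+' operations to replay later."""
    def __init__(self, key, ops=()):
        self.key = key
        self.ops = ops

    def __add__(self, other):
        # Ref + other: remember to append `other` once the value is known
        return Ref(self.key, self.ops + ((lambda v, o=other: v + o),))

    def __radd__(self, other):
        # other + Ref: remember to prepend `other` once the value is known
        return Ref(self.key, self.ops + ((lambda v, o=other: o + v),))

    def resolve(self, d):
        value = d[self.key]
        if isinstance(value, Ref):      # the target may itself be a placeholder
            value = value.resolve(d)
        for op in self.ops:             # replay the recorded operations in order
            value = op(value)
        return value


def resolve_all(d):
    """Return a new dict with every top-level Ref replaced by its value."""
    return {k: v.resolve(d) if isinstance(v, Ref) else v for k, v in d.items()}


d = resolve_all({
    'user': 'gnucom',
    'home': '/home/' + Ref('user'),
    'bin': Ref('home') + '/bin',
})
print(d)
```

Creating a new Ref for every operation, instead of mutating the placeholder, is a deliberate difference from the classes above; it keeps chained operations from overwriting each other.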
If you, just like me, are wondering how to make @jsbueno's snippet work with {} style substitutions, below is the example code (which is probably not very efficient, though):
import string
class MyDict(dict):
    def __init__(self, *args, **kw):
        super(MyDict,self).__init__(*args, **kw)
        self.itemlist = super(MyDict,self).keys()
        self.fmt = string.Formatter()
    def __getitem__(self, item):
        # vformat looks names up in self, so nested references are resolved recursively
        return self.fmt.vformat(dict.__getitem__(self, item), {}, self)
xs = MyDict({
    'user' : 'gnucom',
    'home' : '/home/{user}',
    'bin' : '{home}/bin'
})
>>> xs["home"]
'/home/gnucom'
>>> xs["bin"]
'/home/gnucom/bin'
I tried to make it work by simply replacing % self with .format(**self), but it turns out that doesn't work for nested expressions (like 'bin' in the listing above, which references 'home', which has its own reference to 'user') because of the evaluation order: the ** expansion is done before the actual format call, and it is not delayed as in the original % version.
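In Python 3, one possible alternative is str.format_map, which takes the mapping itself instead of expanding it, so lookups go through the overridden __getitem__. A minimal sketch, assuming all values are strings (FormatDict is a hypothetical name):

```python
class FormatDict(dict):
    def __getitem__(self, item):
        # format_map looks fields up in `self` lazily, one lookup per {name},
        # so each referenced value is itself formatted before being substituted
        return dict.__getitem__(self, item).format_map(self)


xs = FormatDict({
    'user': 'gnucom',
    'home': '/home/{user}',
    'bin': '{home}/bin',
})
print(xs['bin'])

xs['user'] = 'tony'   # later changes are picked up on the next lookup
print(xs['bin'])
```

Literal braces in a value would need to be doubled ({{ and }}), as with any str.format template.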
Write a class, maybe something with properties:
class PathInfo(object):
    def __init__(self, user):
        self.user = user
    @property
    def home(self):
        # Recomputed on every access, so it always reflects the current user
        return '/home/' + self.user
p = PathInfo('thc')
print(p.home) # /home/thc
As sort of an extended version of @Tony's answer, you could build a dictionary subclass that calls its values if they are callables:
class CallingDict(dict):
    """Returns the result rather than the value of referenced callables.
    >>> cd = CallingDict({1: "One", 2: "Two", 'fsh': "Fish",
    ...                   "rhyme": lambda d: ' '.join((d[1], d['fsh'],
    ...                                                d[2], d['fsh']))})
    >>> cd["rhyme"]
    'One Fish Two Fish'
    >>> cd[1] = 'Red'
    >>> cd[2] = 'Blue'
    >>> cd["rhyme"]
    'Red Fish Blue Fish'
    """
    def __getitem__(self, item):
        it = super(CallingDict, self).__getitem__(item)
        if callable(it):
            return it(self)
        else:
            return it
Of course this would only be usable if you're not actually going to store callables as values. If you need to be able to do that, you could wrap the lambda declaration in a function that adds some attribute to the resulting lambda, and check for it in CallingDict.__getitem__, but at that point it's getting complex, and long-winded, enough that it might just be easier to use a class for your data in the first place.
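One way to sketch that marker idea, using a small wrapper class instead of an attribute on the lambda (Computed and MarkedCallingDict are hypothetical names, and this is Python 3 syntax):

```python
class Computed:
    """Marks a function whose result should be returned on lookup."""
    def __init__(self, func):
        self.func = func


class MarkedCallingDict(dict):
    def __getitem__(self, item):
        value = super().__getitem__(item)
        if isinstance(value, Computed):
            return value.func(self)   # only marked values are called
        return value                  # plain callables are returned untouched


cd = MarkedCallingDict({
    'user': 'gnucom',
    'home': Computed(lambda d: '/home/' + d['user']),
    'shout': str.upper,               # an ordinary callable stored as data
})
print(cd['home'])
print(cd['shout']('hi'))
```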
This is very easy in a lazily evaluated language (Haskell).
Since Python is strictly evaluated, we can do a little trick to turn things lazy:
Y = lambda f: (lambda x: x(x))(lambda y: f(lambda *args: y(y)(*args)))
d1 = lambda self: lambda: {
    'a': lambda: 3,
    'b': lambda: self()['a']()
}
# fix the d1, and evaluate it
d2 = Y(d1)()
# to get a
d2['a']() # 3
# to get b
d2['b']() # 3
Syntax-wise this is not very nice. That's because we need to explicitly construct lazy expressions with lambda: ... and explicitly evaluate lazy expressions with ...(). It's the opposite of the problem in lazy languages, which need strictness annotations; here in Python we end up needing laziness annotations.
I think with some more metaprogramming and some more tricks, the above could be made easier to use.
Note that this is basically how let-rec works in some functional languages.
The jsbueno answer in Python 3:
class MyDict(dict):
    def __getitem__(self, item):
        return dict.__getitem__(self, item).format(self)
dictionary = MyDict({
    'user' : 'gnucom',
    'home' : '/home/{0[user]}',
    'bin' : '{0[home]}/bin'
})
print(dictionary["home"])
print(dictionary["bin"])
Here we use the Python 3 string formatting with curly braces {} and the .format() method.
Documentation : https://docs.python.org/3/library/string.html
I have a __contains__ method in a class called "Graph", which has an attribute "dict" that is a dictionary.
def __contains__(self, i):
    return i in self.dict
Now I would like to use "in" to loop through the keys of "dict". I use:
g = Graph()
for i in g:
    print(i)
The question is: why do I get the values of dict instead of the keys in i?
__contains__ is not used when iterating. The in operator is not the same thing as the for ... in ... statement (even though the statement contains the word in).
Iteration either uses object.__getitem__() (passing in integers starting at 0 until an IndexError is raised), or uses the iterator protocol via the object.__iter__ method.
You probably have a __getitem__ or __iter__ method that produces values rather than keys.
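A small sketch of the difference, with a made-up Graph and a hypothetical Legacy class to show the older __getitem__ fallback:

```python
class Graph:
    def __init__(self):
        self.dict = {'a': 1, 'b': 2}

    def __contains__(self, key):
        # used by the membership test:  key in graph
        return key in self.dict

    def __iter__(self):
        # used by iteration:  for key in graph
        return iter(self.dict)   # iterating a dict yields its keys


class Legacy:
    """No __iter__: iteration falls back to __getitem__ with 0, 1, 2, ..."""
    def __init__(self, items):
        self._items = list(items)

    def __getitem__(self, index):
        return self._items[index]   # IndexError past the end stops the loop


g = Graph()
print('a' in g)           # membership test, calls __contains__
print(list(g))            # iteration, calls __iter__
print(list(Legacy('xyz')))
```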
You need to override the Python __iter__ method:
def __iter__(self):
    for x in self.dict.keys():
        yield x
How to override Python list(iterator) behaviour?
I've got a list of objects, that have a function which does some calculations on several member variables and returns a value. I want to sort the list according to this return value.
I've found how to sort a list of objects by member variables but I want to keep them private (_variable_name).
Is there an easy way to accomplish this similar to the method of sorting the list directly by the members of the objects?
Just use the key argument to the sorted() function or list.sort() method:
sorted_list = sorted(list_of_objects, key=function_that_calculates)
The function_that_calculates is called for each entry in list_of_objects and its result informs the sort.
If you meant that each object has a method, you can use a lambda or the operator.methodcaller() object to call the method on each element:
sorted_list = sorted(list_of_objects, key=lambda obj: obj.method_name())
or
from operator import methodcaller
sorted_list = sorted(list_of_objects, key=methodcaller('method_name'))
Note that in Python, there is no such thing as a private attribute; your sorting function can still just access it. The leading underscore is just a convention. As such, sorting by a specific attribute can be done with either a lambda again, or using the operator.attrgetter() object:
sorted_list = sorted(list_of_objects, key=lambda obj: obj._variable_name)
or
from operator import attrgetter
sorted_list = sorted(list_of_objects, key=attrgetter('_variable_name'))
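To make this concrete, here is a small self-contained sketch with a made-up Box class whose area() method plays the role of the calculation (the class and its attribute names are purely illustrative):

```python
from operator import methodcaller


class Box:
    def __init__(self, width, height):
        self._width = width      # "private" by convention only
        self._height = height

    def area(self):
        return self._width * self._height


boxes = [Box(3, 4), Box(1, 2), Box(2, 2)]

# Sort by the value each object's method returns; the attributes stay untouched
by_area = sorted(boxes, key=methodcaller('area'))
areas = [box.area() for box in by_area]
print(areas)

# list.sort() accepts the same key argument and sorts in place
boxes.sort(key=lambda box: box.area(), reverse=True)
```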
In Python, can one define a function (that can have statements in it, thus not a lambda) in a way similar to the following JavaScript example?
var func = function(param1, param2) {
    return param1*param2;
};
I ask, since I'd like to have a dictionary of functions, and I wouldn't want to first define all the functions, and then put them in a dictionary.
The reason I want a dictionary of functions is because I will have another function that takes another dictionary as parameter, loops through its keys, and if it finds a matching key in the first dictionary, calls the associated function with the value of the second dictionary as parameter. Like this:
def process_dict(d):
    for k, v in d.items():
        if k in function_dict:
            function_dict[k](v)
Maybe there is a more pythonic way to accomplish such a thing?
Use a class (with static methods) instead of a dictionary to contain your functions.
class MyFuncs:
    @staticmethod
    def func(a, b):
        return a * b
    # ... define other functions
In Python 3, you don't strictly need the @staticmethod as long as you only call the functions through the class itself (MyFuncs.func), since a function looked up on a class is a plain function anyway.
Now to iterate:
def process_dict(d):
    for k, v in d.items():
        # fall back to a do-nothing function when MyFuncs has no attribute named k
        getattr(MyFuncs, k, lambda *x: None)(*v)
N.B. You could also use a module for your functions and use import.
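An end-to-end sketch of this approach, with a second made-up function (greet) and a variant of process_dict that collects the results so they can be inspected. It assumes each value is a tuple of arguments and that the keys come from a trusted source:

```python
class MyFuncs:
    @staticmethod
    def func(a, b):
        return a * b

    @staticmethod
    def greet(name):
        return "hello " + name


def process_dict(d):
    results = {}
    for k, v in d.items():
        handler = getattr(MyFuncs, k, None)   # None when no such function exists
        if handler is not None:
            results[k] = handler(*v)          # v is unpacked as the arguments
    return results


out = process_dict({'func': (3, 4), 'greet': ('bob',), 'unknown': (1,)})
print(out)
```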
I find myself in a lot of situations where I have a dictionary value that I want to update with a new value, but only if the new value fulfils some criteria relative to the current value (such as being larger).
Currently I write expressions similar to:
dictionary[key] = max(newvalue, dictionary[key])
which works fine but I keep thinking that there's probably a neater way to do it that doesn't involve repeating myself.
Thanks for any suggestions.
You could make the values objects with update methods that encapsulate that logic. Or subclass dictionary and modify the behavior of __setitem__. Just keep in mind anything you do like this is going to make it less clear to someone not familiar with your code what is going on. What you are doing now is most explicit and clear.
Just write yourself a helper function:
def update(dictionary, key, newvalue, func=max):
    dictionary[key] = func(dictionary[key], newvalue)
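A slightly extended sketch of the same helper that also copes with a key that is not in the dictionary yet. Storing the new value as-is in that case is an assumption about the desired behaviour, not something the original helper does:

```python
def update(dictionary, key, newvalue, func=max):
    if key in dictionary:
        dictionary[key] = func(dictionary[key], newvalue)
    else:
        dictionary[key] = newvalue   # assumed behaviour for a missing key


scores = {'a': 43}
update(scores, 'a', 17)             # keeps 43, the larger value
update(scores, 'b', 5)              # key was missing, stores 5
update(scores, 'b', 2, func=min)    # any two-argument function works
print(scores)
```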
Not sure if it's "neater", but one way to avoid repeating yourself is to use an object-oriented approach and subclass the built-in dict class to make something able to do what you want. This also has the advantage that instances of your custom class can be used in place of dict instances without changing the rest of your code.
class CmpValDict(dict):
    """ dict subclass that stores values associated with each key based
    on the return value of a function which allows the value passed to be
    first compared to any already there (if there is no pre-existing
    value, the second argument passed to the function will be None)
    """
    def __init__(self, cmp=None, *args, **kwargs):
        self.cmp = cmp if cmp else lambda nv,cv: nv # default returns new value
        super(CmpValDict, self).__init__(*args, **kwargs)
    def __setitem__(self, key, value):
        super(CmpValDict, self).__setitem__(key, self.cmp(value, self.get(key)))
# Note: cmp=max relies on Python 2 ordering, where None compares less than
# any other value; in Python 3, max(value, None) raises a TypeError.
cvdict = CmpValDict(cmp=max)
cvdict['a'] = 43
cvdict['a'] = 17
print(cvdict['a']) # 43
cvdict[43] = 'George Bush'
cvdict[43] = 'Al Gore'
print(cvdict[43]) # George Bush
What about using the Python version of a ternary operator:
d[key]=newval if newval>d[key] else d[key]
or a one line if:
if newval>d[key]: d[key]=newval