Good Style in Python Objects - python

Most of my programming prior to Python was in C++ or Matlab. I don't have a degree in CS (almost completed a PhD in physics), but have done some courses and a good amount of actual programming. Now, I'm taking an algorithms course on Coursera (excellent course, by the way, with a professor from Stanford). I decided to implement the homeworks in Python. However, sometimes I find myself wanting things the language does not so easily support. I'm very used to creating classes and objects for things in C++ just to group together data (i.e. when there are no methods). In Python however, where you can add fields on the fly, what I basically end up wanting all the time are Matlab structs. I think this is possibly a sign I am not using good style and doing things the "Pythonic" way.
Below is my implementation of a union-find data structure (for Kruskal's algorithm). Although the implementation is relatively short and works well (there isn't much error checking), there are a few odd points. For instance, my code assumes that the data originally passed in to the union-find is a list of objects. However, if a list of plain values is passed in instead (e.g. a list of ints), the code fails. Is there some much clearer, more Pythonic way to implement this? I have tried to google this, but most examples are very simple and relate more to procedural code (e.g. the "proper" way to do a for loop in Python).
class UnionFind:
    def __init__(self,data):
        self.data = data
        for d in self.data:
            d.size = 1
            d.leader = d
            d.next = None
            d.last = d
    def find(self,element):
        return element.leader
    def union(self,leader1,leader2):
        if leader1.size >= leader2.size:
            newleader = leader1
            oldleader = leader2
        else:
            newleader = leader2
            oldleader = leader1
        newleader.size = leader1.size + leader2.size
        d = oldleader
        while d != None:
            d.leader = newleader
            d = d.next
        newleader.last.next = oldleader
        newleader.last = oldleader.last
        del(oldleader.size)
        del(oldleader.last)

Generally speaking, doing this sort of thing Pythonically means that you try to make your code not care what is given to it, at least not any more than it really needs to.
Let's take your particular example of the union-find algorithm. The only thing that the union-find algorithm actually does with the values you pass to it is compare them for equality. So to make a generally useful UnionFind class, your code shouldn't rely on the values it receives having any behavior other than equality testing. In particular, you shouldn't rely on being able to assign arbitrary attributes to the values.
The way I would suggest getting around this is to have UnionFind use wrapper objects which hold the given values and any attributes you need to make the algorithm work. You can use namedtuple as suggested by another answer, or make a small wrapper class. When an element is added to the UnionFind, you first wrap it in one of these objects, and use the wrapper object to store the attributes leader, size, etc. The only time you access the thing being wrapped is to check whether it is equal to another value.
In practice, at least in this case, it should be safe to assume that your values are hashable, so that you can use them as keys in a Python dictionary to find the wrapper object corresponding to a given value. Of course, not all objects in Python are necessarily hashable, but those that are not are relatively rare and it's going to be a lot more work to make a data structure that is able to handle those.
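As a rough sketch of the wrapper idea described above (the names _Node and WrappedUnionFind are made up for illustration, and the values are assumed to be hashable), it could look something like this:

```python
class _Node(object):
    """Hypothetical wrapper: holds one value plus the bookkeeping for it."""
    def __init__(self, value):
        self.value = value
        self.leader = self
        self.size = 1
        self.members = [self]   # only kept up to date on a leader


class WrappedUnionFind(object):
    def __init__(self, values):
        # assumption: values are hashable, so they can be dict keys
        self._nodes = {v: _Node(v) for v in values}

    def find(self, value):
        """Return the value that leads the set containing `value`."""
        return self._nodes[value].leader.value

    def union(self, a, b):
        la = self._nodes[a].leader
        lb = self._nodes[b].leader
        if la is lb:
            return
        if la.size < lb.size:       # keep the larger set's leader
            la, lb = lb, la
        for node in lb.members:     # relabel the smaller set
            node.leader = la
        la.members.extend(lb.members)
        la.size += lb.size
        lb.members = []


uf = WrappedUnionFind([1, 2, 3, "x"])
uf.union(1, 2)
uf.union("x", 2)
```

The caller only ever passes in and gets back its own values (ints, strings, whatever); the attributes live on the wrappers, so nothing is assigned onto the caller's objects.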

The more pythonic way is to avoid tedious objects if you don't have to.
class UnionFind(object):
    def __init__(self, members=10, data=None):
        """union-find data structure for Kruskal's algorithm
        members are ignored if data is provided
        """
        if not data:
            self.data = [self.default_data() for i in range(members)]
            for d in self.data:
                d.size = 1
                d.leader = d
                d.next = None
                d.last = d
        else:
            self.data = data
    def default_data(self):
        """create a starting point for data"""
        return Data(**{'last': None, 'leader':None, 'next': None, 'size': 1})
    def find(self, element):
        return element.leader
    def union(self, leader1, leader2):
        if leader2.leader is leader1:
            return
        if leader1.size >= leader2.size:
            newleader = leader1
            oldleader = leader2
        else:
            newleader = leader2
            oldleader = leader1
        newleader.size = leader1.size + leader2.size
        d = oldleader
        while d is not None:
            d.leader = newleader
            d = d.next
        newleader.last.next = oldleader
        newleader.last = oldleader.last
        oldleader.size = 0
        oldleader.last = None
class Data(object):
    def __init__(self, **data_dict):
        """convert a data member dict into an object"""
        self.__dict__.update(**data_dict)

One option is to use dictionaries to store the information you need about a data item, rather than attributes on the item directly. For instance, rather than referring to d.size you could refer to size[d] (where size is a dict instance). This requires that your data items be hashable, but they don't need to allow attributes to be assigned on them.
Here's a straightforward translation of your current code to use this style:
class UnionFind:
    def __init__(self,data):
        self.data = data
        self.size = {d:1 for d in data}
        self.leader = {d:d for d in data}
        self.next = {d:None for d in data}
        self.last = {d:d for d in data}
    def find(self,element):
        return self.leader[element]
    def union(self,leader1,leader2):
        if self.size[leader1] >= self.size[leader2]:
            newleader = leader1
            oldleader = leader2
        else:
            newleader = leader2
            oldleader = leader1
        self.size[newleader] = self.size[leader1] + self.size[leader2]
        d = oldleader
        while d != None:
            self.leader[d] = newleader
            d = self.next[d]
        self.next[self.last[newleader]] = oldleader
        self.last[newleader] = self.last[oldleader]
A minimal test case:
>>> uf = UnionFind(list(range(100)))
>>> uf.find(10)
10
>>> uf.find(20)
20
>>> uf.union(10,20)
>>> uf.find(10)
10
>>> uf.find(20)
10
Beyond this, you could also consider changing your implementation a bit to require less initialization. Here's a version that doesn't do any initialization (it doesn't even need to know the set of data it's going to work on). It uses path compression and union-by-rank rather than always maintaining an up-to-date leader value for all members of a set. It should be asymptotically faster than your current code, especially if you're doing a lot of unions:
class UnionFind:
    def __init__(self):
        self.rank = {}
        self.parent = {}
    def find(self, element):
        if element not in self.parent: # leader elements are not in `parent` dict
            return element
        leader = self.find(self.parent[element]) # search recursively
        self.parent[element] = leader # compress path by saving leader as parent
        return leader
    def union(self, leader1, leader2):
        rank1 = self.rank.get(leader1,1)
        rank2 = self.rank.get(leader2,1)
        if rank1 > rank2: # union by rank
            self.parent[leader2] = leader1
        elif rank2 > rank1:
            self.parent[leader1] = leader2
        else: # ranks are equal
            self.parent[leader2] = leader1 # favor leader1 arbitrarily
            self.rank[leader1] = rank1+1 # increment rank
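Since the question mentions Kruskal's algorithm, here is a minimal sketch of how the lazy version might be used for it. The kruskal function and the (weight, u, v) edge format are my own assumptions for the example, and the class is repeated only so the snippet runs on its own. Note that union expects leaders, so find is called on both endpoints first.

```python
class UnionFind:
    def __init__(self):
        self.rank = {}
        self.parent = {}

    def find(self, element):
        if element not in self.parent:      # leaders are not in `parent`
            return element
        leader = self.find(self.parent[element])
        self.parent[element] = leader       # path compression
        return leader

    def union(self, leader1, leader2):
        rank1 = self.rank.get(leader1, 1)
        rank2 = self.rank.get(leader2, 1)
        if rank1 > rank2:
            self.parent[leader2] = leader1
        elif rank2 > rank1:
            self.parent[leader1] = leader2
        else:
            self.parent[leader2] = leader1
            self.rank[leader1] = rank1 + 1


def kruskal(edges):
    """edges: iterable of (weight, u, v) tuples (assumed format).

    Returns the edges chosen for a minimum spanning forest.
    """
    uf = UnionFind()
    tree = []
    for weight, u, v in sorted(edges):
        lu, lv = uf.find(u), uf.find(v)
        if lu != lv:             # endpoints are in different components
            uf.union(lu, lv)     # pass the leaders, not u and v
            tree.append((weight, u, v))
    return tree


edges = [(1, "a", "b"), (4, "a", "c"), (2, "b", "c"), (5, "c", "d")]
mst = kruskal(edges)
```

Because nothing is initialized up front, the vertices never have to be registered; they simply appear the first time an edge mentions them.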

For checking if an argument is of the expected type, use the built-in isinstance() function:
if not isinstance(leader1, UnionFind):
    raise ValueError('leader1 must be a UnionFind instance')
Additionally, it is a good habit to add docstrings to functions, classes and member functions. Such a docstring for a function or method should describe what it does, what arguments are to be passed to it and if applicable what is returned and which exceptions can be raised.

I'm guessing that the indentation issues here are just simple errors with inputting the code into SO. Could you possibly create a subclass of a simple, built-in data type? For instance, you can create a subclass of the list data type by putting the data type in parentheses:
class UnionFind(list):
    '''extends list object'''

Related

How to access a dictionary value from within the same dictionary in Python? [duplicate]

I'm new to Python, and am sort of surprised I cannot do this.
dictionary = {
    'a' : '123',
    'b' : dictionary['a'] + '456'
}
I'm wondering what the Pythonic way to do this correctly in my script is, because I feel like I'm not the only one who has tried to do this.
EDIT: Enough people were wondering what I'm doing with this, so here are more details on my use case. Let's say I want to keep dictionary objects that hold file system paths. The paths are relative to other values in the dictionary. For example, this is what one of my dictionaries may look like.
dictionary = {
    'user': 'sholsapp',
    'home': '/home/' + dictionary['user']
}
It is important that at any point in time I may change dictionary['user'] and have all of the dictionary's values reflect the change. Again, this is an example of what I'm using it for, so I hope that it conveys my goal.
From my own research I think I will need to implement a class to do this.
Don't be afraid of creating new classes. You can take advantage of Python's string formatting capabilities and simply do:
class MyDict(dict):
    def __getitem__(self, item):
        return dict.__getitem__(self, item) % self
dictionary = MyDict({
    'user' : 'gnucom',
    'home' : '/home/%(user)s',
    'bin' : '%(home)s/bin'
})
print dictionary["home"]
print dictionary["bin"]
The nearest I came up with, without using an object:
dictionary = {
    'user' : 'gnucom',
    'home' : lambda:'/home/'+dictionary['user']
}
print dictionary['home']()
dictionary['user']='tony'
print dictionary['home']()
>>> dictionary = {
... 'a':'123'
... }
>>> dictionary['b'] = dictionary['a'] + '456'
>>> dictionary
{'a': '123', 'b': '123456'}
This works fine, but in your version dictionary hasn't been defined yet at the point where you try to use it (because the dictionary literal has to be evaluated first).
But be careful because this assigns to the key of 'b' the value referenced by the key of 'a' at the time of assignment and is not going to do the lookup every time. If that is what you are looking for, it's possible but with more work.
What you're describing in your edit is how an INI config file works. Python does have a built in library called ConfigParser which should work for what you're describing.
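A small sketch of what that could look like, assuming Python 3, where the module is named configparser and the default interpolation understands %(name)s references. The section name "paths" is just something I picked for the example.

```python
import configparser

parser = configparser.ConfigParser()
parser.read_string("""
[paths]
user = sholsapp
home = /home/%(user)s
bin = %(home)s/bin
""")

# Values are interpolated when they are read, not when they are stored,
# so changing `user` is reflected in everything that refers to it.
home_before = parser.get("paths", "home")
parser.set("paths", "user", "tony")
home_after = parser.get("paths", "home")
bin_after = parser.get("paths", "bin")
```

Here home_before is based on the original user, while home_after and bin_after pick up the changed one.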
This is an interesting problem. It seems like Greg has a good solution. But that's no fun ;)
jsbueno has a very elegant solution, but it only applies to strings (as you requested).
The trick to a 'general' self referential dictionary is to use a surrogate object. It takes a few (understatement) lines of code to pull off, but the usage is about what you want:
S = SurrogateDict(AdditionSurrogateDictEntry)
d = S.resolve({'user': 'gnucom',
               'home': '/home/' + S['user'],
               'config': [S['home'] + '/.emacs', S['home'] + '/.bashrc']})
The code to make that happen is not nearly so short. It lives in three classes:
import abc
class SurrogateDictEntry(object):
    __metaclass__ = abc.ABCMeta
    def __init__(self, key):
        """record the key on the real dictionary that this will resolve to a
        value for
        """
        self.key = key
    def resolve(self, d):
        """ return the actual value"""
        if hasattr(self, 'op'):
            # any operation done on self will store its name in self.op.
            # if this is set, resolve it by calling the appropriate method
            # now that we can get self.value out of d
            self.value = d[self.key]
            return getattr(self, self.op + 'resolve__')()
        else:
            return d[self.key]
    @staticmethod
    def make_op(opname):
        """A convenience function. This will be the form of all op hooks for subclasses
        The actual logic for the op is in __op__resolve__ (e.g. __add__resolve__)
        """
        def op(self, other):
            self.stored_value = other
            self.op = opname
            return self
        op.__name__ = opname
        return op
Next comes the concrete class. Simple enough.
class AdditionSurrogateDictEntry(SurrogateDictEntry):
    __add__ = SurrogateDictEntry.make_op('__add__')
    __radd__ = SurrogateDictEntry.make_op('__radd__')
    def __add__resolve__(self):
        return self.value + self.stored_value
    def __radd__resolve__(self):
        return self.stored_value + self.value
Here's the final class:
class SurrogateDict(object):
    def __init__(self, EntryClass):
        self.EntryClass = EntryClass
    def __getitem__(self, key):
        """record the key and return"""
        return self.EntryClass(key)
    @staticmethod
    def resolve(d):
        """I eat generators resolve self references"""
        stack = [d]
        while stack:
            cur = stack.pop()
            # This just tries to set it to an appropriate iterable
            it = xrange(len(cur)) if not hasattr(cur, 'keys') else cur.keys()
            for key in it:
                # sorry for being a duche. Just register your class with
                # SurrogateDictEntry and you can pass whatever.
                while isinstance(cur[key], SurrogateDictEntry):
                    cur[key] = cur[key].resolve(d)
                # I'm just going to check for iter but you can add other
                # checks here for items that we should loop over.
                if hasattr(cur[key], '__iter__'):
                    stack.append(cur[key])
        return d
In response to gnucom's question about why I named the classes the way that I did:
The word surrogate is generally associated with standing in for something else, so it seemed appropriate because that's what the SurrogateDict class does: an instance replaces the 'self' references in a dictionary literal. That being said, (other than just being straight up stupid sometimes) naming is probably one of the hardest things for me about coding. If you (or anyone else) can suggest a better name, I'm all ears.
I'll provide a brief explanation. Throughout, S refers to an instance of SurrogateDict and d is the real dictionary.
1. A reference S[key] triggers S.__getitem__, which causes SurrogateDictEntry(key) to be placed in d.
2. When S[key] = SurrogateDictEntry(key) is constructed, it stores key. This will be the key into d for the value that this instance of SurrogateDictEntry is acting as a surrogate for.
3. After S[key] is returned, it is either entered into d, or has some operation(s) performed on it. If an operation is performed on it, it triggers the relevant __op__ method, which simply stores the value that the operation is performed on and the name of the operation and then returns itself. We can't actually resolve the operation because d hasn't been constructed yet.
4. After d is constructed, it is passed to S.resolve. This method loops through d finding any instances of SurrogateDictEntry and replacing them with the result of calling the resolve method on the instance.
5. The SurrogateDictEntry.resolve method receives the now constructed d as an argument and can use the value of key that it stored at construction time to get the value that it is acting as a surrogate for. If an operation was performed on it after creation, the op attribute will have been set with the name of the operation that was performed. If the class has a __op__ method, then it has a __op__resolve__ method with the actual logic that would normally be in the __op__ method. So now we have the logic (the __op__resolve__ method) and all necessary values (self.value, self.stored_value) to finally get the real value of d[key]. So we return that, which step 4 places in the dictionary.
6. Finally, the SurrogateDict.resolve method returns d with all references resolved.
That's a rough sketch. If you have any more questions, feel free to ask.
If, like me, you are wondering how to make @jsbueno's snippet work with {}-style substitutions, below is some example code (which is probably not very efficient, though):
import string
class MyDict(dict):
    def __init__(self, *args, **kw):
        super(MyDict,self).__init__(*args, **kw)
        self.itemlist = super(MyDict,self).keys()
        self.fmt = string.Formatter()
    def __getitem__(self, item):
        return self.fmt.vformat(dict.__getitem__(self, item), {}, self)
xs = MyDict({
    'user' : 'gnucom',
    'home' : '/home/{user}',
    'bin' : '{home}/bin'
})
>>> xs["home"]
'/home/gnucom'
>>> xs["bin"]
'/home/gnucom/bin'
I tried to make it work by simply replacing % self with .format(**self), but it turns out that doesn't work for nested expressions (like 'bin' in the listing above, which references 'home', which has its own reference to 'user') because of the evaluation order (the ** expansion is done before the actual format call, and it isn't delayed like in the original % version).
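For what it's worth, on Python 3.2 or later str.format_map might be a shorter way around the same evaluation-order problem, since it uses the mapping directly instead of expanding it first. This is only a sketch (FormatDict is a name I made up), and like the versions above it assumes every value is a string.

```python
class FormatDict(dict):
    def __getitem__(self, item):
        # format_map looks keys up in `self` lazily, so nested
        # references go back through this __getitem__ and get resolved too
        return dict.__getitem__(self, item).format_map(self)


xs = FormatDict({
    'user': 'gnucom',
    'home': '/home/{user}',
    'bin': '{home}/bin',
})
home = xs['home']
nested = xs['bin']

xs['user'] = 'tony'
nested_after_change = xs['bin']
```

The lookups happen at access time, so a later change to 'user' shows up in 'home' and 'bin' as well.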
Write a class, maybe something with properties:
class PathInfo(object):
    def __init__(self, user):
        self.user = user
    @property
    def home(self):
        return '/home/' + self.user
p = PathInfo('thc')
print p.home # /home/thc
As sort of an extended version of @Tony's answer, you could build a dictionary subclass that calls its values if they are callables:
class CallingDict(dict):
    """Returns the result rather than the value of referenced callables.
    >>> cd = CallingDict({1: "One", 2: "Two", 'fsh': "Fish",
    ...                   "rhyme": lambda d: ' '.join((d[1], d['fsh'],
    ...                                                d[2], d['fsh']))})
    >>> cd["rhyme"]
    'One Fish Two Fish'
    >>> cd[1] = 'Red'
    >>> cd[2] = 'Blue'
    >>> cd["rhyme"]
    'Red Fish Blue Fish'
    """
    def __getitem__(self, item):
        it = super(CallingDict, self).__getitem__(item)
        if callable(it):
            return it(self)
        else:
            return it
Of course this would only be usable if you're not actually going to store callables as values. If you need to be able to do that, you could wrap the lambda declaration in a function that adds some attribute to the resulting lambda, and check for it in CallingDict.__getitem__, but at that point it's getting complex, and long-winded, enough that it might just be easier to use a class for your data in the first place.
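To give an idea of what that more complex variant might look like, here is a sketch that uses a small wrapper class as the marker rather than setting an attribute on the lambda itself (a slightly different mechanism from the one described above, but the same idea). Computed and MarkedCallingDict are hypothetical names.

```python
class Computed(object):
    """Hypothetical marker: only values wrapped in this get called."""
    def __init__(self, func):
        self.func = func


class MarkedCallingDict(dict):
    def __getitem__(self, item):
        value = super(MarkedCallingDict, self).__getitem__(item)
        if isinstance(value, Computed):
            return value.func(self)   # computed entry: call it with the dict
        return value                  # anything else, callables included


d = MarkedCallingDict(user='gnucom', callback=len)
d['home'] = Computed(lambda d: '/home/' + d['user'])
```

With this, d['callback'] hands back the stored function untouched, while d['home'] is recomputed on every access.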
This is very easy in a lazily evaluated language (Haskell).
Since Python is strictly evaluated, we can do a little trick to turn things lazy:
Y = lambda f: (lambda x: x(x))(lambda y: f(lambda *args: y(y)(*args)))
d1 = lambda self: lambda: {
    'a': lambda: 3,
    'b': lambda: self()['a']()
}
# fix the d1, and evaluate it
d2 = Y(d1)()
# to get a
d2['a']() # 3
# to get b
d2['b']() # 3
Syntax-wise this is not very nice, because we need to explicitly construct lazy expressions with lambda: ... and explicitly evaluate them with ...(). It's the opposite of the problem in lazy languages, which need strictness annotations; here in Python we end up needing laziness annotations.
I think with some more metaprogramming and a few more tricks, the above could be made easier to use.
Note that this is basically how let-rec works in some functional languages.
The jsbueno answer in Python 3:
class MyDict(dict):
    def __getitem__(self, item):
        return dict.__getitem__(self, item).format(self)
dictionary = MyDict({
    'user' : 'gnucom',
    'home' : '/home/{0[user]}',
    'bin' : '{0[home]}/bin'
})
print(dictionary["home"])
print(dictionary["bin"])
Here we use the Python 3 string formatting with curly braces {} and the .format() method.
Documentation : https://docs.python.org/3/library/string.html

Organize functions that create or expand a text file using a class?

I'm brand new to classes and I don't really know when to use them. I want to write a program for simulation of EPR/NMR spectra which requires information about the simulated system. The relevant thing is this: I have a function called rel_inty(I_n,N) that calculates this relevant information from two values. The problem is that it becomes very slow when either of these values becomes large (I_n,N >= 10). That's why I opted for calculating rel_inty(I_n,N) beforehand for the most relevant combinations of (I_n,N) and save them in a dictionary. I write that dictionary to a file and can import it using eval(), since calculating rel_inty(I_n,N) dynamically on each execution would be way too slow.
Now I had the following idea: What if I create a class manage_Dict(): whose methods can either recreate a basic dictionary with a def basic(): method, in case the old file somehow gets deleted, or expand the existing one with a def expand(): method, if the basic one doesn't contain a user-specified combination of (I_n,N)?
This would be the outline of that class:
class manage_Dict(args):
    def rel_inty(I_n,N):
        '''calculates relative intensities for a combination (I_n,N)'''
    def basic():
        '''creates a dict for preset combinations of I_n,N'''
        with open('SpinSys.txt','w') as outf:
            Dict = {}
            I_n_List = [somevalues]
            N_List = [somevalues]
            for I_n in I_n_List:
                Dict[I_n] = {}
                for N in N_List:
                    Dict[I_n][N] = rel_inty(I_n,N)
            outf.write(str(Dict))
    def expand(*args):
        '''expands the existing dict for all tuples (I_n,N) in *args'''
        with open('SpinSys.txt','r') as outf:
            Dict = eval(outf.read())
            for tup in args:
                I_n = tup[0]
                N = tup[1]
                Dict[I_n][N] = rel_inty(I_n,N)
        os.remove('SpinSys.txt')
        with open('SpinSys.txt','w') as outf:
            outf.write(str(Dict))
Usage:
'''Recreate SpinSys.txt if lost'''
manage_Dict.basic()
'''Expand SpinSys.txt in case of missing (I_n,N)'''
manage_Dict.expand((10,5),(11,3),(2,30))
Would this be a sensible solution? I was wondering that because I usually see classes with self and __init__ creating an object instance instead of just managing function calls.
If we are going to make use of an object, let's make sure it's doing some useful work for us and that the interface is nicer than just using functions. I'm going to suggest a few big tweaks that will make life easier:
We can subclass dict itself, so that our object is a dict as well as having all our fancy custom stuff.
Use JSON instead of text files, so we can quickly, naturally and safely serialise and deserialise.
import json
class SpectraDict(dict):
    PRE_CALC_I_N = ["...somevalues..."]
    PRE_CALC_N = ["...somevalues..."]
    def rel_inty(self, i_n, n):
        # Calculate and store results from the main function
        if i_n not in self:
            self[i_n] = {}
        if n not in self[i_n]:
            self[i_n][n] = self._calculate_rel_inty(i_n, n)
        return self[i_n][n]
    def _calculate_rel_inty(self, i_n, n):
        # Some exciting calculation here instead...
        return 0
    def pre_calculate(self):
        s_dict = SpectraDict()
        for i_n in self.PRE_CALC_I_N:
            for n in self.PRE_CALC_N:
                # Force the dict to calculate and store the values
                s_dict.rel_inty(i_n, n)
        return s_dict
    @classmethod
    def load(cls, json_file):
        with open(json_file) as fh:
            return cls(json.load(fh))
    def save(self, json_file):
        with open(json_file, 'w') as fh:
            json.dump(self, fh)
        return self
Now when we ask for values using the rel_inty() function, we immediately store the answer in ourselves before giving it back. This is called memoization / caching. Therefore, to pre-fill our object with the pre-calculated values, we just need to ask it for lots of answers and it will store them.
After that we can either load or save quite naturally using JSON:
# Bootstrapping from scratch:
s_dict = SpectraDict().pre_calculate().save('spin_sys.json')
# Loading and updating with new values
s_dict = SpectraDict.load('spin_sys.json')
s_dict.rel_inty(10, 45) # All your new calculations here...
s_dict.save('spin_sys.json')
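One caveat worth knowing about with JSON: object keys are always strings, so integer keys such as 10 come back as '10' after a save/load round trip, and a lookup with the integer would miss the cached value. A possible workaround, sketched here under the assumption that both I_n and N are plain integers, is to convert the keys back after loading. The helper name int_keys is made up for the example.

```python
import json


def int_keys(nested):
    """Convert the string keys of a two-level dict back to ints.

    Assumes every key on both levels was an int before serialising.
    """
    return {int(i_n): {int(n): value for n, value in inner.items()}
            for i_n, inner in nested.items()}


original = {10: {5: 0.25}, 2: {30: 0.5}}
text = json.dumps(original)      # keys are written out as strings
loaded = json.loads(text)        # ...and stay strings when read back
restored = int_keys(loaded)
```

If the keys can be half-integer spins rather than ints, the same idea should work with float() in place of int(), but that is something to check against your own data.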

Should a dict that I use only once during initialization be put inside my class?

I only need to use operatorDict one time to determine which operation I am using for the class. What is the most pythonic way to store the correct operator inside of self._f ?
class m:
    def __init__(self,n,operator,digit1,digit2):
        self._operatorDict = {'+':add, '-':sub, 'x':mul, '/':truediv}
        self._f = self._operatorDict[operator]
        self._score = 0
        self._range = n
        self._aMin, self._aMax = getMaxMinDigits(digit1)
        self._bMin, self._bMax = getMaxMinDigits(digit2)
Much of this depends on how much of the operatorDict you want to expose. In this case, I'd probably recommend one of two things.
Option 1: Put the dict on the class:
class m:
    _operatorDict = {'+':add, '-':sub, 'x':mul, '/':truediv}
    def __init__(self,n,operator,digit1,digit2):
        self._f = self._operatorDict[operator]
        self._score = 0
        self._range = n
        self._aMin, self._aMax = getMaxMinDigits(digit1)
        self._bMin, self._bMax = getMaxMinDigits(digit2)
Since the _operatorDict isn't going to mutate inside the class, it doesn't really seem necessary to have it as a instance attribute. However, it still belongs to the class in some sense. This approach also allows you to change the operatorDict as necessary (e.g. in a subclass).
Option 2: Put the dict in the global namespace:
_operatorDict = {'+':add, '-':sub, 'x':mul, '/':truediv}
class m:
    def __init__(self,n,operator,digit1,digit2):
        self._f = _operatorDict[operator]
        self._score = 0
        self._range = n
        self._aMin, self._aMax = getMaxMinDigits(digit1)
        self._bMin, self._bMax = getMaxMinDigits(digit2)
The advantages here are similar to before -- mainly that you only create one operatorDict and not one per instance of m. It's also a little more rigid in that this form doesn't really allow for easy changing of the operatorDict via subclassing. In some cases, this rigidity can be desirable. Also, as some noted in the comments, if you use this second option, naming _operatorDict to indicate that it is a constant in your naming system (e.g. _OPERATOR_DICT in pep8) is probably a good idea.
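To illustrate the subclassing point for the class-level option, here is a minimal sketch. The class names Quiz and PowerQuiz and the apply method are invented for the example; the operator functions are assumed to come from the standard operator module.

```python
from operator import add, sub, mul, truediv


class Quiz(object):
    # one table shared by all instances
    _OPERATORS = {'+': add, '-': sub, 'x': mul, '/': truediv}

    def __init__(self, operator):
        # looked up through self, so a subclass table takes precedence
        self._f = self._OPERATORS[operator]

    def apply(self, a, b):
        return self._f(a, b)


class PowerQuiz(Quiz):
    # copy and extend, so the base class table is left untouched
    _OPERATORS = dict(Quiz._OPERATORS)
    _OPERATORS['^'] = pow


product = Quiz('x').apply(3, 4)
power = PowerQuiz('^').apply(2, 5)
```

With the module-level option, the subclass would have no comparable hook, which is the rigidity mentioned above.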

Python multiprocessing pool with shared data

I'm attempting to speed up a multivariate fixed-point iteration algorithm using multiprocessing; however, I'm running into issues dealing with shared data. My solution vector is actually a named dictionary rather than a vector of numbers. Each element of the vector is actually computed using a different formula. At a high level, I have an algorithm like this:
current_estimate = previous_estimate
while True:
    for state in all_states:
        current_estimate[state] = state.getValue(previous_estimate)
    if norm(current_estimate, previous_estimate) < tolerance:
        break
    else:
        previous_estimate, current_estimate = current_estimate, previous_estimate
I'm trying to parallelize the for-loop part with multiprocessing. The previous_estimate variable is read-only and each process only needs to write to one element of current_estimate. My current attempt at rewriting the for-loop is as follows:
import itertools
from multiprocessing import Manager, Pool
# Class and function definitions
class A(object):
    def __init__(self,val):
        self.val = val
    # representative getValue function
    def getValue(self, est):
        return est[self] + self.val
def worker(state, in_est, out_est):
    out_est[state] = state.getValue(in_est)
def worker_star(a_b_c):
    """ Allow multiple arguments for a pool
    Taken from http://stackoverflow.com/a/5443941/3865495
    """
    return worker(*a_b_c)
# Initialize test environment
manager = Manager()
estimates = manager.dict()
all_states = []
for i in range(5):
    a = A(i)
    all_states.append(a)
    estimates[a] = 0
pool = Pool(processes = 2)
prev_est = estimates
curr_est = estimates
pool.map(worker_star, itertools.izip(all_states, itertools.repeat(prev_est), itertools.repeat(curr_est)))
The issue I'm currently running into is that the elements added to the all_states array are not the same as those added to the manager.dict(). I keep getting key errors when trying to access elements of the dictionary using elements of the array. While debugging, I found that none of the elements are the same.
print map(id, estimates.keys())
>>> [19558864, 19558928, 19558992, 19559056, 19559120]
print map(id, all_states)
>>> [19416144, 19416208, 19416272, 19416336, 19416400]
This is happening because the objects you're putting into the estimates DictProxy aren't actually the same objects as those that live in the regular dict. The manager.dict() call returns a DictProxy, which is proxying access to a dict that actually lives in a completely separate manager process. When you insert things into it, they're really being copied and sent to a remote process, which means they're going to have a different identity.
To work around this, you can define your own __eq__ and __hash__ functions on A, as described in this question:
class A(object):
    def __init__(self,val):
        self.val = val
    # representative getValue function
    def getValue(self, est):
        return est[self] + self.val
    def __hash__(self):
        return hash(self.__key())
    def __key(self):
        return (self.val,)
    def __eq__(x, y):
        return x.__key() == y.__key()
This means the key look ups for items in the estimates will just use the value of the val attribute to establish identity and equality, rather than the id assigned by Python.
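The effect can be seen without any processes at all. In the sketch below, copy.deepcopy merely stands in for the copy that gets made when an object is sent to the manager process (that substitution is my assumption for the demo; the manager really pickles the object). Plain and Keyed are made-up names.

```python
import copy


class Plain(object):
    """No __eq__/__hash__: identity decides equality and hashing."""
    def __init__(self, val):
        self.val = val


class Keyed(object):
    """Equality and hash are derived from val, as suggested above."""
    def __init__(self, val):
        self.val = val

    def __eq__(self, other):
        return isinstance(other, Keyed) and self.val == other.val

    def __hash__(self):
        return hash((self.val,))


plain = Plain(1)
keyed = Keyed(1)

# Store *copies* as keys, the way a DictProxy ends up holding copies.
plain_table = {copy.deepcopy(plain): 0}
keyed_table = {copy.deepcopy(keyed): 0}

plain_found = plain in plain_table   # the copy is a different object
keyed_found = keyed in keyed_table   # the copy compares and hashes equal
```

One thing to keep in mind with this approach: two states with the same val become the same key, so val has to be unique per state.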

Python - Dijkstra's Algorithm

I need to implement Dijkstra's Algorithm in Python. However, I have to use a 2D array to hold three pieces of information - predecessor, length and unvisited/visited.
I know that in C a struct can be used, but I am stuck on how I can do a similar thing in Python. I am told it's possible, but to be honest I have no idea how.
Create a class for it.
class XXX(object):
    def __init__(self, predecessor, length, visited):
        self.predecessor = predecessor
        self.length = length
        self.visited = visited
Or use collections.namedtuple, which is particularly cool for holding struct-like compound types that have named members but no behaviour of their own: XXX = collections.namedtuple('XXX', 'predecessor length visited').
Create one with XXX(predecessor, length, visited).
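A short sketch of how that might be used (the name Entry and the sample values are only for illustration). One thing to be aware of is that namedtuples are immutable, so "changing" a field means building a new tuple with _replace and storing it back.

```python
import collections

Entry = collections.namedtuple('Entry', 'predecessor length visited')

# one entry per vertex, all starting out unvisited and unreachable
table = [Entry(predecessor=None, length=float('inf'), visited=False)
         for _ in range(3)]

# fields cannot be assigned to; _replace returns an updated copy
table[0] = table[0]._replace(length=0)
table[1] = table[1]._replace(predecessor=0, length=7, visited=True)
```

If the fields are updated very often, the small class shown above may be more convenient, since its attributes can be assigned directly.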
As mentioned above, you can use an instance of an object.
This author has a pretty convincing Python implementation of Dijkstra's algorithm.
#
# This file contains the Python code from Program 16.16 of
# "Data Structures and Algorithms
# with Object-Oriented Design Patterns in Python"
# by Bruno R. Preiss.
#
# Copyright (c) 2003 by Bruno R. Preiss, P.Eng. All rights reserved.
#
# http://www.brpreiss.com/books/opus7/programs/pgm16_16.txt
#
class Algorithms(object):
    def DijkstrasAlgorithm(g, s):
        n = g.numberOfVertices
        table = Array(n)
        for v in xrange(n):
            table[v] = Algorithms.Entry()
        table[s].distance = 0
        queue = BinaryHeap(g.numberOfEdges)
        queue.enqueue(Association(0, g[s]))
        while not queue.isEmpty:
            assoc = queue.dequeueMin()
            v0 = assoc.value
            if not table[v0.number].known:
                table[v0.number].known = True
                for e in v0.emanatingEdges:
                    v1 = e.mateOf(v0)
                    d = table[v0.number].distance + e.weight
                    if table[v1.number].distance > d:
                        table[v1.number].distance = d
                        table[v1.number].predecessor = v0.number
                        queue.enqueue(Association(d, v1))
        result = DigraphAsLists(n)
        for v in xrange(n):
            result.addVertex(v, table[v].distance)
        for v in xrange(n):
            if v != s:
                result.addEdge(v, table[v].predecessor)
        return result
    DijkstrasAlgorithm = staticmethod(DijkstrasAlgorithm)
Notice those pieces of information are 'held' in the object he is constructing by calling Algorithms.Entry(). Entry is a class and is defined like this:
class Entry(object):
    """
    Data structure used in Dijkstra's and Prim's algorithms.
    """
    def __init__(self):
        """
        (Algorithms.Entry) -> None
        Constructor.
        """
        self.known = False
        self.distance = sys.maxint
        self.predecessor = sys.maxint
self.known, self.distance and self.predecessor are those pieces of information. The constructor (__init__) only gives them default values; the algorithm sets the real values later. In Python you can access attributes with dot notation. For example, after myObject = Entry() you can use myObject.known, myObject.distance and so on; they are all public.
Encapsulate that information in a Python object and you should be fine.
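If the book's helper classes (Array, BinaryHeap, Association and so on) aren't available, the same idea can be sketched with only the standard library. This is not the book's code, just a minimal version under my own assumptions: the graph is a dict mapping each vertex to a list of (neighbour, weight) pairs, every vertex appears as a key, and heapq serves as the priority queue.

```python
import heapq


class Entry(object):
    """Per-vertex record: the three pieces of information from the question."""
    def __init__(self):
        self.known = False                # visited / unvisited
        self.distance = float('inf')      # length of the best path so far
        self.predecessor = None           # previous vertex on that path


def dijkstra(graph, source):
    """graph: dict of vertex -> list of (neighbour, weight) (assumed format)."""
    table = {v: Entry() for v in graph}
    table[source].distance = 0
    queue = [(0, source)]
    while queue:
        dist, v = heapq.heappop(queue)
        if table[v].known:
            continue                      # stale queue entry, already settled
        table[v].known = True
        for w, weight in graph[v]:
            d = dist + weight
            if d < table[w].distance:
                table[w].distance = d
                table[w].predecessor = v
                heapq.heappush(queue, (d, w))
    return table


graph = {'a': [('b', 7), ('c', 2)], 'b': [('d', 1)], 'c': [('b', 3)], 'd': []}
table = dijkstra(graph, 'a')
```

Afterwards table['d'].distance and table['d'].predecessor hold the shortest distance to d and the vertex it was reached from.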
Or you can simply use tuples or dictionaries inside your 2d array:
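width=10
height=10
# build the rows up front; assigning to my2darray[x] on an empty list raises IndexError
my2darray = [[None] * height for x in range(width)]
for x in range(width):
    for y in range(height):
        #here you set the tuple (predecessor, length, visited)
        my2darray[x][y] = (None, 12, False)
        #or you can use a dict..
        my2darray[x][y] = dict(node=None,length=12,visited=False)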
Python is an object-oriented language. So think of it like moving from structs in C to classes in C++. You can use the same class structure in Python as well.
