How to create a Tree-based Map with keys - python

For my intro to computer science class we have a tree-based map problem. I'm confused about how to build the tree in the way the assignment asks.
What I have so far:
class EmptyMap():
    __slots__ = ()
class NonEmptyMap():
    __slots__ = ('left','key','value','right')
def mkEmptyMap():
    m = EmptyMap()
    return m
def mkNonEmptyMap(map1, key, value, map2):
    m = NonEmptyMap()
    m.left = map1
    m.key = key
    m.value = value
    m.right = map2
    return m
def mapInsert(key, value, map1):
    if isinstance(map1, EmptyMap):
    else:
I'm stuck on the mapInsert function, which is supposed to be recursive. Our tutoring lab doesn't have any tutors in it right now, so any help is appreciated.
Link to homework file http://www.cs.rit.edu/~vcss241/Homeworks/08/TreeMap-stu.pdf
Thanks!

I have never written or seen Python, but try this:
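def mapInsert(key, value, map1):
    if isinstance(map1, EmptyMap):
        # Base case: an empty map becomes a single node with two empty children.
        return mkNonEmptyMap(mkEmptyMap(), key, value, mkEmptyMap())
    if map1.key == key:
        # Key already present: overwrite its value.
        map1.value = value
    elif map1.key > key:
        # Smaller keys go into the left subtree.
        map1.left = mapInsert(key, value, map1.left)
    else:
        # Larger keys go into the right subtree.
        map1.right = mapInsert(key, value, map1.right)
    # Always return the root so the caller can reattach this subtree.
    return map1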
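To check a recursive insert like the one above, it helps to build a small tree and read the keys back in order. The sketch below is self-contained: it repeats the class definitions from the question, uses an insert that always returns the root of the (possibly new) subtree, and adds two helpers, mapGet and mapKeys. Those two names are hypothetical and are not taken from the homework handout.

```python
class EmptyMap:
    __slots__ = ()

class NonEmptyMap:
    __slots__ = ('left', 'key', 'value', 'right')

def mkEmptyMap():
    return EmptyMap()

def mkNonEmptyMap(map1, key, value, map2):
    m = NonEmptyMap()
    m.left, m.key, m.value, m.right = map1, key, value, map2
    return m

def mapInsert(key, value, map1):
    """Insert key/value and return the root of the updated subtree."""
    if isinstance(map1, EmptyMap):
        return mkNonEmptyMap(mkEmptyMap(), key, value, mkEmptyMap())
    if key == map1.key:
        map1.value = value
    elif key < map1.key:
        map1.left = mapInsert(key, value, map1.left)
    else:
        map1.right = mapInsert(key, value, map1.right)
    return map1

def mapGet(key, map1):
    """Hypothetical helper: return the value stored for key, or raise KeyError."""
    if isinstance(map1, EmptyMap):
        raise KeyError(key)
    if key == map1.key:
        return map1.value
    return mapGet(key, map1.left if key < map1.key else map1.right)

def mapKeys(map1):
    """Hypothetical helper: in-order traversal, so keys come out sorted."""
    if isinstance(map1, EmptyMap):
        return []
    return mapKeys(map1.left) + [map1.key] + mapKeys(map1.right)

# The caller must keep the returned root, because inserting into an
# EmptyMap produces a brand-new node.
tree = mkEmptyMap()
for k, v in [("m", 1), ("c", 2), ("x", 3), ("c", 20)]:
    tree = mapInsert(k, v, tree)
```

The important design point is that every branch returns a map, so the recursive call's result can be assigned back to left or right.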

Related

How to find Dictionary Key(s) from Value in a large nested dictionary of variable depth?

Say that I have a large dictionary full of nested values such as this:
large_dic = {
    ...
    "key": {"sub-key1": {"sub-key2": "Test"}},
    "0key": {"0sub-key1": "0Test"},
    "1key": {"1sub-key1": {"1sub-key2": {"1sub-key3": "1Test"}}}
    ...
}
What I would like to do is to be able to get for example from the final value:
"1Test"
the key(s) to access it, such as in this case:
large_dic["1key"]["1sub-key1"]["1sub-key2"]["1sub-key3"]
Thanks for the support.
Edit to add more info: the dictionary trees I'm talking about are linear (YAML files converted into a Python dictionary structure). There is never more than one key at each nested level, and the leaf values may not be unique.
Since the OP is looking for the hierarchical keys instead, I made this class:
class PointingSlice:
    def __init__(self, obj, *slices) -> None:
        self.obj = obj
        self.slices = slices
    def __str__(self):
        return f"{str(self.obj)}{''.join(map(self._repr_slice, self.slices))}"
    def _repr_slice(self, sliced: slice):
        sqbrackets = "[{}]"
        if not isinstance(sliced, slice):
            return sqbrackets.format(repr(sliced))
        items = [sliced.start, sliced.stop, sliced.step]
        fn = lambda x: str() if x is None else str(x)
        return sqbrackets.format(":".join(map(fn, items)))
    def resolve(self):
        obj = self.obj
        for sliced in self.slices:
            obj = obj.__getitem__(sliced)
        return obj
and this function for instantiation:
def find_longest(mapping, key):
    keys = [key]
    value = mapping[key]
    while isinstance(value, dict):
        ((k, value),) = value.items()
        keys.append(k)
    return PointingSlice(mapping, *keys)
Example use:
print(find_longest(large_dic, "1key"))
# output:
# {'key': {'sub-key1': {'sub-key2': 'Test'}}, '0key': {'0sub-key1': '0Test'}, '1key': {'1sub-key1': {'1sub-key2': {'1sub-key3': '1Test'}}}}['1key']['1sub-key1']['1sub-key2']['1sub-key3']
# do note that it is the same thing as large_dic['1key']['1sub-key1']['1sub-key2']['1sub-key3']
print(find_longest(large_dic, "1key").resolve()) # 1Test
So I made some changes, and now it supports additional repr options matching your exact use case:
class PointingSlice:
    def __init__(self, obj, *slices, object_name=None) -> None:
        self.obj = obj
        self.slices = slices
        self.object_name = object_name
    def __str__(self):
        return f"{self.object_name or str(self.obj)}{''.join(map(self._repr_slice, self.slices))}"
    def _repr_slice(self, sliced: slice):
        sqbrackets = "[{}]"
        if not isinstance(sliced, slice):
            return sqbrackets.format(repr(sliced))
        items = [sliced.start, sliced.stop, sliced.step]
        fn = lambda x: str() if x is None else str(x)
        return sqbrackets.format(":".join(map(fn, items)))
    def resolve(self):
        obj = self.obj
        for sliced in self.slices:
            obj = obj.__getitem__(sliced)
        return obj
large_dic = {
    "key": {"sub-key1": {"sub-key2": "Test"}},
    "0key": {"0sub-key1": "0Test"},
    "1key": {"1sub-key1": {"1sub-key2": {"1sub-key3": "1Test"}}},
}
def find_longest(mapping, key):
    keys = [key]
    value = mapping[key]
    while isinstance(value, dict):
        ((k, value),) = value.items()
        keys.append(k)
    return PointingSlice(mapping, *keys)
f = find_longest(large_dic, "1key")
f.object_name = "large_dic"  # for representational purposes, it works without this
print(f)  # large_dic['1key']['1sub-key1']['1sub-key2']['1sub-key3']
print(f.resolve())  # 1Test
There are numerous ways to achieve this. You might want to look up "prefix tree traversal" (or "trie traversal").
A simple recursive solution with poor memory efficiency could look like this:
def find_trie_leaf_path(trie: dict, leaf_value, trie_path: list[str] = []):
    for key, value in trie.items():
        if isinstance(value, dict):
            yield from find_trie_leaf_path(value, leaf_value, trie_path + [key])
        elif value == leaf_value:
            yield trie_path + [key]
large_dic = {
    "key": {"sub-key1": {"sub-key2": "Test"}},
    "0key": {"0sub-key1": "0Test"},
    "1key": {"1sub-key1": {"1sub-key2": {"1sub-key3": "Test"}}},
}
first_match = next(find_trie_leaf_path(large_dic, "Test"))
all_matches = list(find_trie_leaf_path(large_dic, "Test"))
This should work even if your trie is very wide. If it is very deep, recursion can hit Python's recursion limit, so I'd rather use an iterative algorithm.
I want to point out, though, that prefix trees are usually used the other way round. If you find yourself needing this search a lot, you should consider a different data structure.
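For deeply nested dictionaries, an iterative variant with an explicit stack avoids Python's recursion limit. The following is only a sketch of that idea; the function name find_leaf_paths_iter is made up for illustration, and the order in which matches are produced may differ from the recursive version.

```python
def find_leaf_paths_iter(trie, leaf_value):
    """Yield every key path (as a list) whose leaf equals leaf_value."""
    # Each stack entry is a (sub-dictionary, path taken to reach it) pair.
    stack = [(trie, [])]
    while stack:
        node, path = stack.pop()
        for key, value in node.items():
            if isinstance(value, dict):
                # Descend later; path + [key] builds a new list per branch.
                stack.append((value, path + [key]))
            elif value == leaf_value:
                yield path + [key]

large_dic = {
    "key": {"sub-key1": {"sub-key2": "Test"}},
    "0key": {"0sub-key1": "0Test"},
    "1key": {"1sub-key1": {"1sub-key2": {"1sub-key3": "Test"}}},
}
paths = list(find_leaf_paths_iter(large_dic, "Test"))
```

Because the leaf values are not unique, the generator yields all matching paths; wrap it in next(...) if only one is needed.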
Yes, it's totally possible. Here's the function to get the deeply nested value:
def get_final_value(mapping, key):
    value = mapping[key]
    while isinstance(value, dict):
        (value,) = value.values()
    return value
Example use:
>>> get_final_value(large_dic, "key")
'Test'
>>> get_final_value(large_dic, "0key")
'0Test'
>>> get_final_value(large_dic, "1key")
'1Test'
>>>
Can the parent keys be deduced from your final value in any way, or is the tree structure essentially arbitrary? If the latter is the case, then you'll probably just end up searching your tree until you find your value; which search algorithm you choose for that depends on the tree structure you have. As already asked in the comments: does each node have only one child, is the tree binary, or can a node have many children?

finds key but not KEY.lower() in dictionary Python

I need to write a program where I create a class whose objects hold a dictionary with categories as keys and sets of words belonging to those categories as values (example: {'name' : {'patrick', 'jane'}, 'discipline' : {'geography',...}, ...}).
At some point in the program (in that class) I have to create a method that takes the name of a category as an argument and picks a random word from that category. In the dictionary all keys (categories) need to be lowercase, but the case of the category I pass in shouldn't matter.
Here is my code first (part of it):
import random
class MadLibs:
    def __init__(self, woordenschat = {}):
        self.woordenschat = woordenschat
    def suggereren(self, categorie):
        assert categorie.lower() in self.woordenschat, 'onbekende categorie'
        randwoord = random.choice(list(self.woordenschat[categorie.lower()]))
        if categorie.isupper():
            return randwoord.upper()
        elif categorie.islower():
            return randwoord
        else:
            return randwoord.capitalize()
So say I have a category 'name' as a key in my dictionary with a set of words. When I call the method suggereren with the argument 'name' it works, but when I pass 'NAME', self.woordenschat[categorie.lower()] returns an empty list (see the line where I initialize randwoord).
Would somebody be able to tell me why this happens?
UPDATE:
This is how you add words to the dictionary: categorie is the category, and woorden holds the new words that belong to that category.
def leren(self, categorie, woorden):
    if isinstance(woorden, (tuple, list, set)):
        woorden = set(woorden)
    else:
        woorden = {woorden}
    if categorie in self.woordenschat:
        self.woordenschat[categorie.lower()].add(woord.lower() for woord in woorden)
    else:
        self.woordenschat[categorie.lower()] = (woord.lower() for woord in woorden)
    return None
UPDATE:
It seems the way I added the words in leren was the problem; I got an error along the lines of: object 'generator' does not have ... 'add'.
Here's my new code:
def leren(self, categorie, woorden):
    if isinstance(woorden, (tuple, list, set)):
        woorden = set(woorden)
    else:
        woorden = {woorden}
    set_to_add = {woord.lower() for woord in woorden}
    if categorie in self.woordenschat:
        self.woordenschat[categorie.lower()].union(set_to_add)
    else:
        self.woordenschat[categorie.lower()] = (set_to_add)
    return None
Now the only problem left is that my object doesn't actually get updated when I add new words to an existing category. I'll try to find the cause first, and if I can't, I'll ask a new question.
Update: never mind, I found it. It was a stupid mistake.
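The asker does not say what the mistake was. One likely candidate, offered here only as a guess, is that set.union() returns a new set and leaves the stored one untouched, whereas set.update() (or |=) changes it in place. The sketch below rewrites leren as a standalone function taking the dictionary as a parameter (an adaptation for illustration, not the asker's final code) and lowercases the category consistently for both the lookup and the store.

```python
def leren(woordenschat, categorie, woorden):
    """Add one word or a collection of words to a category (keys lowercased)."""
    if not isinstance(woorden, (tuple, list, set)):
        woorden = [woorden]
    to_add = {woord.lower() for woord in woorden}
    # setdefault returns the stored set (creating an empty one if needed);
    # update() then mutates that stored set in place.
    woordenschat.setdefault(categorie.lower(), set()).update(to_add)

woordenschat = {}
leren(woordenschat, "Name", ["Patrick", "Jane"])
leren(woordenschat, "NAME", "Bob")

# For contrast: union() builds a new set and leaves the original alone.
original = {"a"}
combined = original.union({"b"})
```

Using setdefault also removes the separate "is the category already known" branch, which in the code above tested categorie without lowercasing it.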
Actually, the requests source code contains a case-insensitive dict class that may fit your needs:
# The ABCs live in collections.abc (the collections.* aliases were removed in Python 3.10).
from collections.abc import Mapping, MutableMapping
class CaseInsensitiveDict(MutableMapping):
    def __init__(self, data=None, **kwargs):
        self._store = dict()
        if data is None:
            data = {}
        self.update(data, **kwargs)
    def __setitem__(self, key, value):
        # Use the lowercased key for lookups, but store the actual
        # key alongside the value.
        self._store[key.lower()] = (key, value)
    def __getitem__(self, key):
        return self._store[key.lower()][1]
    def __delitem__(self, key):
        del self._store[key.lower()]
    def __iter__(self):
        return (casedkey for casedkey, mappedvalue in self._store.values())
    def __len__(self):
        return len(self._store)
    def lower_items(self):
        """Like iteritems(), but with all lowercase keys."""
        return (
            (lowerkey, keyval[1])
            for (lowerkey, keyval)
            in self._store.items()
        )
    def __eq__(self, other):
        if isinstance(other, Mapping):
            other = CaseInsensitiveDict(other)
        else:
            return NotImplemented
        # Compare insensitively
        return dict(self.lower_items()) == dict(other.lower_items())
    # Copy is required
    def copy(self):
        return CaseInsensitiveDict(self._store.values())
    def __repr__(self):
        return str(dict(self.items()))
    def keys(self):
        return [i for i in self]
    def values(self):
        return [self[i] for i in self]

Generic arguments in recursive functions: terrible habit?

I catch myself doing this a lot. The example is simple, but, in practice, there are a lot of complex assignments to update data structures and conditions under which the second recursion is not called.
I'm working with mesh data. Points, Edges, and Faces are stored in separate dictionaries and "pointers" (dict keys) are heavily used.
import itertools
class Demo(object):
    def __init__(self):
        self.a = {}
        self.b = {}
        self.keygen = itertools.count()
    def add_to_b(self, val):
        new_key = next(self.keygen)
        self.b[new_key] = val
        return new_key
    def recur_method(self, arg, argisval=True):
        a_key = next(self.keygen)
        if argisval is True:
            # arg is a value
            b_key = self.add_to_b(arg)
            self.a[a_key] = b_key
            self.recur_method(b_key, argisval=False)
        else:
            # arg is a key
            self.a[a_key] = arg
demo = Demo()
demo.recur_method(2.2)
Is there a better way, short of cutting all of my assignment code up into seven different methods? Should I be worried about this at all?
Try
def recur_method(self, key=None, val=None):
    if key is None and val is None:
        raise ValueError("You fail it")
If None is a valid input, then use a guard value:
sentinel = object()
def recur_method(self, key=sentinel, val=sentinel):
    if key is sentinel and val is sentinel:
        raise ValueError("You fail it")
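To see how the keyword-argument approach fits the Demo class from the question, here is a minimal sketch. The sentinel name _MISSING and the "exactly one of key or val" check are assumptions added for illustration; the answer above only checks that both are not missing at once.

```python
import itertools

_MISSING = object()  # hypothetical sentinel, so that None stays a legal value

class Demo:
    def __init__(self):
        self.a = {}
        self.b = {}
        self.keygen = itertools.count()

    def add_to_b(self, val):
        new_key = next(self.keygen)
        self.b[new_key] = val
        return new_key

    def recur_method(self, key=_MISSING, val=_MISSING):
        # Exactly one of key/val must be supplied.
        if (key is _MISSING) == (val is _MISSING):
            raise ValueError("pass exactly one of key= or val=")
        a_key = next(self.keygen)
        if val is not _MISSING:
            # Called with a value: store it in b, then recurse with its key.
            b_key = self.add_to_b(val)
            self.a[a_key] = b_key
            self.recur_method(key=b_key)
        else:
            # Called with a key: just record it.
            self.a[a_key] = key

demo = Demo()
demo.recur_method(val=2.2)
```

Compared with a boolean flag such as argisval, the call site now states what the argument means, and passing neither or both fails immediately.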

Elegant way to avoid .put() on unchanged entities

A recurring pattern in my Python programming on GAE is getting some entity from the data store, then possibly changing that entity based on various conditions. In the end I need to .put() the entity back to the data store to ensure that any changes that might have been made to it get saved.
However, often no changes were actually made and the final .put() is just a waste of money. How can I easily make sure that I only put an entity if it has really changed?
The code might look something like
def handle_get_request():
    entity = Entity.get_by_key_name("foobar")
    if phase_of_moon() == "full":
        entity.werewolf = True
    if random.choice([True, False]):
        entity.lucky = True
    if some_complicated_condition:
        entity.answer = 42
    entity.put()
I could maintain a "changed" flag which I set if any condition changed the entity, but that seems very brittle. If I forget to set it somewhere, then changes would be lost.
What I ended up using
def handle_get_request():
    entity = Entity.get_by_key_name("foobar")
    original_xml = entity.to_xml()
    if phase_of_moon() == "full":
        entity.werewolf = True
    if random.choice([True, False]):
        entity.lucky = True
    if some_complicated_condition:
        entity.answer = 42
    if entity.to_xml() != original_xml:
        entity.put()
I would not call this "elegant". Elegant would be if the object just saved itself automatically in the end, but I felt this was simple and readable enough to do for now.
Why not check whether the result equals (==) the original and use that to decide whether to save it? This depends on a correctly implemented __eq__; a field-by-field comparison based on the __dict__ should do it.
def __eq__(self, other):
    return self.__dict__ == other.__dict__
(Be sure that the other rich comparison and hash operators work correctly if you do this. See here.)
One possible solution is using a wrapper that tracks any attribute change:
class Wrapper(object):
    def __init__(self, x):
        self._x = x
        self._changed = False
    def __setattr__(self, name, value):
        if name[:1] == "_":
            # Private names belong to the wrapper itself.
            object.__setattr__(self, name, value)
        else:
            # Forward to the wrapped object, flagging real changes only.
            if getattr(self._x, name) != value:
                setattr(self._x, name, value)
                self._changed = True
    def __getattribute__(self, name):
        if name[:1] == "_":
            return object.__getattribute__(self, name)
        return getattr(self._x, name)
class Contact:
    def __init__(self, name, address):
        self.name = name
        self.address = address
c = Contact("Me", "Here")
w = Wrapper(c)
print(w.name)  # --> Me
w.name = w.name
print(w.name, w._changed)  # --> Me False
w.name = "6502"
print(w.name, w._changed)  # --> 6502 True
This answer is part of a question I posted about a Python checksum of a dict. With the answers to that question I developed a method to generate a checksum from a db.Model.
This is an example:
>>> class Actor(db.Model):
... name = db.StringProperty()
... age = db.IntegerProperty()
...
>>> u = Actor(name="John Doe", age=26)
>>> util.checksum_from_model(u, Actor)
'-42156217'
>>> u.age = 47
>>> checksum_from_model(u, Actor)
'-63393076'
I defined these methods:
from functools import reduce  # a builtin on Python 2, must be imported on Python 3
def checksum_from_model(ref, model, exclude_keys=[], exclude_properties=[]):
    """Returns the checksum of a db.Model.
    Attributes:
      ref: The reference of the db.Model
      model: The model type instance of db.Model.
      exclude_keys: To exclude a list of properties name like 'updated'
      exclude_properties: To exclude list of properties type like 'db.DateTimeProperty'
    Returns:
      A checksum in signed integer.
    """
    l = []
    for key, prop in model.properties().items():
        if not (key in exclude_keys) and \
                not any([True for x in exclude_properties if isinstance(prop, x)]):
            l.append(getattr(ref, key))
    return checksum_from_list(l)
def checksum_from_list(l):
    """Returns a checksum from a list of data into an int."""
    return reduce(lambda x,y : x^y, [hash(repr(x)) for x in l])
Note: for the base36 implementation, see http://en.wikipedia.org/wiki/Base_36#Python_implementation
Edit: I removed the return in base36, so these functions now run without dependencies (advice from @Skirmantas).
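One caveat with building a checksum from hash(): on Python 3, string hashes are randomized per process by default, so the result cannot be compared across runs, and XOR-ing hashes lets two equal values cancel each other out. A minimal sketch of an alternative using hashlib is shown below. It works on a plain dict of field values as a stand-in for a model's properties, so it makes no assumption about the GAE API; the function name checksum_from_fields is made up for illustration.

```python
import hashlib

def checksum_from_fields(fields, exclude_keys=()):
    """Return a hex digest over the (name, value) pairs of a plain dict."""
    digest = hashlib.sha256()
    # Sort the names so the digest does not depend on insertion order.
    for name in sorted(fields):
        if name in exclude_keys:
            continue
        digest.update(repr((name, fields[name])).encode("utf-8"))
    return digest.hexdigest()

actor = {"name": "John Doe", "age": 26, "updated": "2011-01-01"}
before = checksum_from_fields(actor, exclude_keys=("updated",))
actor["age"] = 47
after = checksum_from_fields(actor, exclude_keys=("updated",))
```

This relies on repr() being stable for the stored values, which holds for strings and numbers but not necessarily for arbitrary objects.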
I haven't worked with GAE, but in the same situation I'd use something like:
from copy import deepcopy
entity = Entity.get_by_key_name("foobar")
prev_entity_state = deepcopy(entity.__dict__)
if phase_of_moon() == "full":
    entity.werewolf = True
if random.choice([True, False]):
    entity.lucky = True
if some_complicated_condition:
    entity.answer = 42
# Save only when something actually changed.
if entity.__dict__ != prev_entity_state:
    entity.put()
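The snapshot-and-compare idea can be wrapped in a context manager so the final check cannot be forgotten. This is only a sketch: put_if_changed is a hypothetical helper, FakeEntity is a stand-in that counts put() calls instead of touching a datastore, and whether a real GAE entity keeps its property values in __dict__ is an assumption that would need checking.

```python
from contextlib import contextmanager
from copy import deepcopy

class FakeEntity:
    """Stand-in for a datastore entity; counts put() calls instead of saving."""
    def __init__(self, **fields):
        self.__dict__.update(fields)
        self.put_calls = 0

    def put(self):
        self.put_calls += 1

@contextmanager
def put_if_changed(entity):
    """Hypothetical helper: call entity.put() on exit only if attributes changed."""
    before = deepcopy(entity.__dict__)
    yield entity
    # Reached only if the with-block finished without raising.
    if entity.__dict__ != before:
        entity.put()

entity = FakeEntity(werewolf=False, answer=None)
with put_if_changed(entity):
    entity.werewolf = False   # same value as before, so no write
with put_if_changed(entity):
    entity.answer = 42        # real change, so one write
```

If the body of the with-block raises, the comparison is skipped and nothing is written, which is usually the desired behavior for a half-finished update.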

Subclass/Child class

I had this class and subclass :
class Range:
    def __init__(self, start, end):
        self.setStart(start)
        self.setEnd(end)
    def getStart(self):
        return self.start
    def setStart(self, s):
        self.start = s
    def getEnd(self):
        return self.end
    def setEnd(self, e):
        self.end = e
    def getLength(self):
        return len(range(self.start, self.end))
    def overlaps(self, r):
        if (r.getStart() < self.getEnd() and r.getEnd() >= self.getEnd()) or \
           (self.getStart() < r.getEnd() and self.getEnd() >= r.getEnd()) or \
           (self.getStart() >= r.getStart() and self.getEnd() <= r.getEnd()) or \
           (r.getStart() >= self.getStart() and r.getEnd() <= self.getEnd()):
            return True
        else:
            return False
class DNAFeature(Range):
    def __init__(self, start, end):
        self.setStart(start)
        self.setEnd(end)
        self.strand = none
        self.sequencename = none
    def getSeqName(self, s):
        return self.SeqName
    def setSeqName(self, s):
        self.sequencename = s
    def getStrand(self):
        if self.SeqName == 'plus':
            return 1
        elif self.SeqName == 'minus':
            return -1
        else:
            return 0
    def setStrand(self, s):
        self.strand = s
And here is what I have to do:
Create a new class, GeneModel, that contains a group of DNAFeature objects representing exons and is a child class of DNAFeature. It should implement the following methods:
getFeats() - returns a list of DNAFeature objects, sorted by start position
addFeat(feat) - accepts a DNAFeature feat and adds it to its internal group of DNAFeature objects
setTranslStart(i) - accepts a non-negative int, sets the start position of the initiating ATG codon
getTranslStart() - returns an int, the start position of the initiating ATG codon
setTranslStop(i) - accepts a positive int, sets the end position for the stop codon
getTranslStop() - returns an int, the end position for the stop codon
setDisplayId(s) - sets the name of the gene model; s is a string
getDisplayId() - return the name of the gene model, returns a string, e.g., AT1G10555.1
GeneModel should raise appropriate ValueError and TypeError exceptions when users pass incorrect types and values to constructors and "set" methods.
I have tried writing whatever came to mind, reading the books, and searching for how to put the code together, but I am new to programming and can hardly tell how to write the code correctly. To be honest, this is the first programming class I have ever taken, so please forgive any silly mistakes. My code isn't finished yet and I am still reading to see what I am doing right and wrong, but I really need your help to get on the right path. Thank you very much. Below is my code:
class GeneModel(DNAFeature):
    def __init__(self, translstart, translend, displayid):
        self.setTranslStart(translstart)
        self.setTranslStop(translend)
        setDisplayId(displayid)
    def getFeats():
        result = []
        sort.self.getStart()
        return result
    def addFeat(feat):
        self.addFeat = feat
        return self.getStart+self.getEnd
    def setTranslStart(i):
        self.translstart = self.setStart
        self.translstart = non-negative int
    def getTranslStart():
        return self.translstart
    def setTranslStop(i):
        self.translend = self.setEnd
        self.translend = "+" int
    def getTranslStop():
        return self.translend
    def setDisplayId(s):
        self.displayid = re.compile('r'\AT1G[0-9]{5,5}\.[0-9]{,1}, IGNORECASE')
    def getDisplayId():
        return self.displayid
I don't understand what the name of the gene model is. I think it's subject specific, but I think this will work for you:
class GeneModel(DNAFeature):
    def __init__(self, start, end):
        self.setStart(start)
        self.setEnd(end)
        self.strand = None
        self.sequencename = None
        self.exons = []
        self.translStart = None
        self.translStop = None
        self.displayId = None
    def getFeats(self):
        # Sort the exons in place by their start position.
        self.exons.sort(key=lambda feat: feat.getStart())
        return self.exons
    def addFeat(self, f):
        if isinstance(f, DNAFeature):
            self.exons.append(f)
        else:
            raise TypeError("Cannot add feature as it is not of type DNAFeature")
    def setTranslStart(self, i):
        if not isinstance(i, int):
            raise TypeError("Cannot set translStart as it is not of type int")
        elif i < 0:
            raise ValueError("Cannot set translStart to a negative int")
        else:
            self.translStart = i
    def getTranslStart(self):
        return self.translStart
    def setTranslStop(self, i):
        if not isinstance(i, int):
            raise TypeError("Cannot set translStop as it is not of type int")
        elif i <= 0:
            raise ValueError("Cannot set translStop to anything less than 1")
        else:
            self.translStop = i
    def getTranslStop(self):
        return self.translStop
    def setDisplayId(self, s):
        if not isinstance(s, str):
            raise TypeError("Cannot set displayId as it is not of type string")
        else:
            self.displayId = s
    def getDisplayId(self):
        return self.displayId
Hope this helps.
First, a little bit of cleanup. I'm not completely convinced that your original class, DNAFeature, is actually correct. DNAFeature seems to inherit from some other class, named Range, that we're missing here, so if you have that code please post it as well. In that original class, you need to define the variable SeqName (also, it's preferable to keep variable names lower-cased), since otherwise self.SeqName will be meaningless. Additionally, unless they're inherited from the Range class, you should also define the methods "setStart" and "setEnd". Your getter should not take any additional arguments, so feel free to change it to "def getSeqName(self)" instead of adding "s". I'm not sure what else your code is really supposed to do, so I'll hold any further comment.
Additionally, though you stated otherwise in your comment, I have to believe from the naming conventions (and what little I remember from bio) that you actually want GeneModel to be a container for a set of DNAFeature instances. That's different from GeneModel subclassing DNAFeature. If I'm right, then you can try:
class GeneModel(object):
    def __init__(self, dnafeatures):
        self.dnafeatures = dnafeatures
    def get_features(self):
        return self.dnafeatures
    def add_feature(self, feature):
        self.dnafeatures.append(feature)
Here dnafeatures would just be a list of DNAFeature instances. This would then allow you to write methods to access these features and do whatever fun stuff you need to do.
My advice would be to make sure your DNAFeature class is correct and that your model of how you want the problem solved (in terms of what your classes do) is clear, then try asking again when it's a little clearer. Hope this helps!
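As a rough illustration of the container idea combined with the validation the assignment asks for, here is a self-contained sketch. The DNAFeature class below is a minimal stand-in with only the parts the example needs, not the asker's full class, and only one setter is shown; the others would follow the same pattern.

```python
class DNAFeature:
    """Minimal stand-in with only what this sketch needs."""
    def __init__(self, start, end):
        self.start = start
        self.end = end

    def getStart(self):
        return self.start

class GeneModel:
    def __init__(self, dnafeatures=None):
        self._feats = []
        self._transl_start = None
        # Route initial features through addFeat so they are type-checked too.
        for feat in (dnafeatures or []):
            self.addFeat(feat)

    def addFeat(self, feat):
        if not isinstance(feat, DNAFeature):
            raise TypeError("feat must be a DNAFeature")
        self._feats.append(feat)

    def getFeats(self):
        # sorted() returns a new list, leaving the internal order untouched.
        return sorted(self._feats, key=lambda f: f.getStart())

    def setTranslStart(self, i):
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(i, bool) or not isinstance(i, int):
            raise TypeError("translation start must be an int")
        if i < 0:
            raise ValueError("translation start must be non-negative")
        self._transl_start = i

    def getTranslStart(self):
        return self._transl_start

model = GeneModel([DNAFeature(300, 400), DNAFeature(100, 200)])
model.addFeat(DNAFeature(200, 250))
starts = [f.getStart() for f in model.getFeats()]
```

The split between TypeError (wrong kind of argument) and ValueError (right kind, unacceptable value) mirrors the wording of the assignment.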
