I'm currently discovering all the possibilities of the Owlready library.
Right now I'm trying to process some SWRL rules, and so far it's been going very well, but I'm stuck at one point.
I've defined some rules in my ontology and now I want to see all the results (so, everything inferred from a rule).
For example, if I had a rule
has_brother(David, ?b) ^ has_child(?b, ?s) -> has_uncle(?s, David)
and David has two brothers, John and Pete, and John's kid is Anna and Pete's kid is Simon, I would like to see something like:
has_brother(David, John) ^ has_child(John, Anna) -> has_uncle(Anna, David)
has_brother(David, Pete) ^ has_child(Pete, Simon) -> has_uncle(Simon, David)
Is this possible in any way?
I thought that maybe if I run the reasoner, I could see it in its output, but I can't find this anywhere.
I appreciate any help possible!
This is my solution:
import owlready2 as owl
onto = owl.get_ontology("http://test.org/onto.owl")
with onto:
    class Person(owl.Thing):
        pass
    class has_brother(owl.ObjectProperty, owl.SymmetricProperty, owl.IrreflexiveProperty):
        domain = [Person]
        range = [Person]
    class has_child(Person >> Person):
        pass
    class has_uncle(Person >> Person):
        pass
    rule1 = owl.Imp()
    rule1.set_as_rule(
        "has_brother(?p, ?b), has_child(?p, ?c) -> has_uncle(?c, ?b)"
    )
    # This rule gives "irreflexive transitivity",
    # i.e. transitivity, as long as it does not lead to has_brother(?a, ?a)
    rule2 = owl.Imp()
    rule2.set_as_rule(
        "has_brother(?a, ?b), has_brother(?b, ?c), differentFrom(?a, ?c) -> has_brother(?a, ?c)"
    )
david = Person("David")
john = Person("John")
pete = Person("Pete")
anna = Person("Anna")
simon = Person("Simon")
owl.AllDifferent([david, john, pete, anna, simon])
david.has_brother.extend([john, pete])
john.has_child.append(anna)
pete.has_child.append(simon)
print("Uncles of Anna:", anna.has_uncle) # -> []
print("Uncles of Simon:", simon.has_uncle) # -> []
owl.sync_reasoner(infer_property_values=True)
print("Uncles of Anna:", anna.has_uncle) # -> [onto.Pete, onto.David]
print("Uncles of Simon:", simon.has_uncle) # -> [onto.John, onto.David]
Notes:
One might think has_brother is
symmetric, i.e. has_brother(A, B) ⇒ has_brother(B, A)
transitive, i.e. has_brother(A, B) + has_brother(B, C) ⇒ has_brother(A, C)
irreflexive, i.e. no one is his own brother.
However, transitivity only holds if the unique name assumption holds. Otherwise A could be the same individual as C, which conflicts with irreflexivity. Thus I used a rule for this kind of "weak transitivity".
Once has_brother works as expected, the uncle rule does too. Of course, the reasoner must run first.
Update: I published the solution in this Jupyter notebook (which also contains the output of the execution).
I have to write a function that takes a string and returns it with asterisks ("*") added to signal multiplication.
As we know, 4(3) is another way to show multiplication, as are 4*3, (4)(3), 4*(3), etc. My code needs to add an asterisk between the 4 and the 3 when multiplication is shown with parentheses but without the multiplication operator "*".
Some examples:
"4(3)" -> "4*(3)"
"(4)(3)" -> "(4)*(3)"
"4*2 + 9 -4(-3)" -> "4*2 + 9 -4*(-3)"
"(-9)(-2) (4)" -> "(-9)*(-2) *(4)"
"4^(3)" -> "4^(3)"
"(4-3)(4+2)" -> "(4-3)*(4+2)"
"(Aflkdsjalkb)(g)" -> "(Aflkdsjalkb)*(g)"
"g(d)(f)" -> "g*(d)*(f)"
"(4) (3)" -> "(4)*(3)"
I'm not exactly sure how to do this. I am thinking about finding each left parenthesis and simply adding a "*" at that location, but that wouldn't work: the start of my fourth example would output "*(-9)", which I don't want, and my fifth example would output "4^*(3)". Any ideas on how to solve this problem? Thank you.
Here's something I've tried, and obviously it doesn't work:
while index < len(stringtobeconverted)
    parenthesis = stringtobeconverted[index]
    if parenthesis == "(":
        stringtobeconverted[index-1] = "*"
In [15]: import re
    ...: def add_multiplies(input_string):
    ...:     # Insert '*' before '(' unless it directly follows an operator (including '^')
    ...:     return re.sub(r'([^-+*/^])\(', r'\1*(', input_string)
    ...:
In [16]: for example in examples:
    ...:     print(f"{example} -> {add_multiplies(example)}")
    ...:
4(3) -> 4*(3)
(4)(3) -> (4)*(3)
4*2 + 9 -4(-3) -> 4*2 + 9 -4*(-3)
(-9)(-2) (4) -> (-9)*(-2) *(4)
4^(3) -> 4^(3)
(4-3)(4+2) -> (4-3)*(4+2)
(Aflkdsjalkb)(g) -> (Aflkdsjalkb)*(g)
g(d)(f) -> g*(d)*(f)
(g)-(d) -> (g)-(d)
tl;dr: Rather than thinking of this as string transformation, you might:
Parse an input string into an abstract representation.
Generate a new output string from the abstract representation.
Parse input to create an abstract syntax tree, then emit the new string.
Generally you should:
Create a logical representation for the mathematical expressions. You'll want to build an abstract syntax tree (AST) to represent each expression. For example,
2(3(4)+5)
could form a tree like:
  *
 / \
2   +
   / \
  *   5
 / \
3   4
where each node in that tree (2, 3, 4, 5, both *'s, and the +) is an object that has references to its child objects.
Write the logic for parsing the input. Write logic that can parse "2(3(4)+5)" into an abstract syntax tree that represents what it means.
Write logic to serialize the data. Now that you've got the data in conceptual form, you can write methods that convert it into a new, desired format.
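To make the three steps concrete, here is a minimal sketch of what such a parser and serializer could look like. It is only an illustration, not a complete expression parser: the names `tokenize`, `Parser`, `parse`, and `emit` are made up for this example, nodes are plain tuples rather than objects, the grammar covers only numbers, names, parentheses, unary minus, and the operators + - * / ^, and whitespace from the input is not preserved.

```python
import re

# One token: a number, a name, or a single operator/parenthesis.
TOKEN = re.compile(r"\s*(\d+\.?\d*|[A-Za-z_]\w*|[-+*/^()])")

def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

class Parser:
    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):                      # expr := term (('+' | '-') term)*
        node = self.term()
        while self.peek() in ("+", "-"):
            node = ("bin", self.take(), node, self.term())
        return node

    def term(self):                      # term := factor (('*' | '/')? factor)*
        node = self.factor()
        while True:
            token = self.peek()
            if token in ("*", "/"):
                op = self.take()
            elif token is not None and (token == "(" or token[0].isalnum() or token[0] == "_"):
                op = "*"                 # implicit multiplication: no operator present
            else:
                return node
            node = ("bin", op, node, self.factor())

    def factor(self):                    # factor := unary ('^' factor)?
        node = self.unary()
        if self.peek() == "^":
            self.take()
            node = ("bin", "^", node, self.factor())
        return node

    def unary(self):                     # unary := '-' unary | atom
        if self.peek() == "-":
            self.take()
            return ("neg", self.unary())
        return self.atom()

    def atom(self):                      # atom := NUMBER | NAME | '(' expr ')'
        token = self.take()
        if token == "(":
            inner = self.expr()
            if self.take() != ")":
                raise ValueError("missing ')'")
            return ("group", inner)
        if token is None or token in "+-*/^)":
            raise ValueError(f"unexpected token {token!r}")
        return ("atom", token)

def parse(text):
    parser = Parser(tokenize(text))
    tree = parser.expr()
    if parser.peek() is not None:
        raise ValueError(f"unexpected token {parser.peek()!r}")
    return tree

def emit(node):
    """Serialize the tree, writing every multiplication explicitly."""
    kind = node[0]
    if kind == "atom":
        return node[1]
    if kind == "group":
        return "(" + emit(node[1]) + ")"
    if kind == "neg":
        return "-" + emit(node[1])
    _, op, left, right = node
    return emit(left) + op + emit(right)

print(emit(parse("2(3(4)+5)")))   # 2*(3*(4)+5)
```

Because parentheses are kept as their own "group" nodes, the serializer can reproduce them without reasoning about operator precedence.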
Note: String transformations might be easier for quick scripting.
As other answers have shown, direct string transformations can be easier if all you need is a quick script, e.g. you have some text you just want to reformat real quick. For example, as @PaulWhipp's answer demonstrates, regular expressions can make such scripting really quick-and-easy.
That said, for professional projects, you'll generally want to parse data into an abstract representation before emitting a new representation. String-transform tricks don't generally scale well with complexity, and they can be both functionally limited and pretty error-prone outside of simple cases.
I'll share mine.
def insertAsteriks(string):
    result = []
    for i, char in enumerate(string):
        if char == '(' and i > 0:
            prev = string[i - 1]
            # Multiply after ')', a digit, a letter, or a space that does not follow an operator
            if (prev == ')' or prev.isdigit() or prev.isalpha()
                    or (prev == ' ' and i > 1 and string[i - 2] not in "*^-+/")):
                result.append('*')
        result.append(char)
    return ''.join(result)
Let's check against your inputs.
print(insertAsteriks("4(3)"))
print(insertAsteriks("(4)(3)"))
print(insertAsteriks("4*2 + 9 -4(-3)"))
print(insertAsteriks("(-9)(-2) (4)"))
print(insertAsteriks("(4)^(-3)"))
print(insertAsteriks("ABC(DEF)"))
print(insertAsteriks("g(d)(f)"))
print(insertAsteriks("(g)-(d)"))
The output is:
4*(3)
(4)*(3)
4*2 + 9 -4*(-3)
(-9)*(-2) *(4)
(4)^(-3)
ABC*(DEF)
g*(d)*(f)
(g)-(d)
One way would be to use a simple replacement. The cases to be replaced are:
)( -> )*(
N( -> N*(
)N -> )*N
Assuming you want to preserve whitespace as well, you need to find all patterns on the left side with an arbitrary number of spaces in between and replace that with the same number of spaces less one plus the asterisk at the end. You can use a regex for that.
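A rough sketch of the regex idea, under a couple of assumptions of mine: the helper name `add_asterisks` is made up, "N" is widened to any word character so that letters work too (which would also rewrite function calls like sin(x)), and for simplicity it keeps every space and just puts the asterisk directly before the right-hand operand rather than removing one space.

```python
import re

def add_asterisks(text):
    # ")(" and "N(": a word character or ")", optional spaces, then "(".
    text = re.sub(r"(?<=[\w)])(\s*)(?=\()", r"\1*", text)
    # ")N": a ")", optional spaces, then a word character.
    text = re.sub(r"(?<=\))(\s*)(?=\w)", r"\1*", text)
    return text

for sample in ["4(3)", "(4) (3)", "(2)3", "4^(3)", "(g)-(d)"]:
    print(sample, "->", add_asterisks(sample))
```

The lookbehind and lookahead only test the neighbours without consuming them, so adjacent cases such as ")(" followed by another "(" are each handled.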
A more fun way would be to use a kind of recursion with fake linked lists :) You have entities and operators. An entity can be a number by itself or anything enclosed in parentheses. Anything else is an operator. How about something like this:
For each string, find all entities and operators (keep them in a list, for example).
Then for each entity, see if there are more entities inside.
Keep doing that until there are no more entities left inside any entity.
Then, starting from the very bottom (that is, the smallest entities), see if there is an operator between two adjacent entities; if there is not, insert an asterisk there. Do that all the way up to the top level. Then start from the bottom again and reassemble all the pieces.
Here is some code, tested on your examples:
i = 0
input_string = "(4-3)(4+2)"
output_string = ""
while i < len(input_string):
    if input_string[i] == "(" and i != 0:
        if input_string[i-1] in list(")1234567890"):
            output_string += "*("
        else:
            output_string += input_string[i]
    else:
        output_string += input_string[i]
    i += 1
print(output_string)
The key here is to understand the logic you want to achieve, which is in fact quite simple: you just want to add a "*" before an opening parenthesis based on a few conditions.
Hope that helps!
I'm trying to learn how to automatically compile all members of a class into a list. This segment of code is not part of a real project, but just an example to help me explain my objective. I can't seem to find any reading material on this, and I don't even know if it is possible or not. Thanks in advance for your answers! =)
class question:
    def __init__(self,question,answer,list_of_answers):
        self.question=question
        self.answer=answer
        self.list_of_answers=list_of_answers
question_01=question('''
Which of these fruits is red?
A). Banana
B). Orange
C). Apple
D). Peach
''',"C",("A","B","C","D"))
question_02=question('''
Which of these is both a fruit and a vegetable?
A). Cauliflower
B). Tomato
C). Brocolli
D). Okrah
''',"B",("A","B","C","D"))
'''My objective is to write code that can automatically compile my questions (the
members of my question class) into a list,even if I have hundreds of them, without
having to manually write them into a list.'''
#If there are only two questions, final output should automatically become:
all_questions=[question_01,question_02]
#If there are one hundred questions, final output should automatically become:
all_questions=[question_01,question_02, ... ,question_99,question_100]
#Without having to manually type all of the one hundred questions (or members
#of the question class) to the list.
You shouldn't have 100 question_01 through question_100 variables in the first place. You're going to have a bad time when you want to reorder the questions, or delete one, or add one in the middle. Do you really want to have to rename 98 variables when you want to put a new question between question_02 and question_03?
At this point, you should strongly consider putting your questions into a data file separate from your source code and reading questions from the file. Even if you don't do that, though, you should eliminate the numbered variables. Put the questions in the list to start with. (Also, classes should be named in CamelCase):
questions = [
Question('''
Which of these fruits is red?
A). Banana
B). Orange
C). Apple
D). Peach
''', "C", ("A","B","C","D")),
Question('''
Which of these is both a fruit and a vegetable?
A). Cauliflower
B). Tomato
C). Brocolli
D). Okrah
''', "B", ("A","B","C","D")),
...
]
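If you go the data-file route, one possible shape is sketched below. The JSON layout, the key names ("question", "answer", "choices"), and the helper `load_questions` are just assumptions for illustration; here the JSON is an inline string so the snippet runs on its own, but the same text could live in a separate file and be read with `open(...)`.

```python
import json

class Question:
    def __init__(self, question, answer, list_of_answers):
        self.question = question
        self.answer = answer
        self.list_of_answers = tuple(list_of_answers)

def load_questions(json_text):
    """Build one Question per entry in a JSON array (hypothetical layout)."""
    return [
        Question(entry["question"], entry["answer"], entry["choices"])
        for entry in json.loads(json_text)
    ]

# Stand-in for the contents of a separate data file.
SAMPLE = """
[
  {"question": "Which of these fruits is red?",
   "answer": "C", "choices": ["A", "B", "C", "D"]},
  {"question": "Which of these is both a fruit and a vegetable?",
   "answer": "B", "choices": ["A", "B", "C", "D"]}
]
"""

questions = load_questions(SAMPLE)
print(len(questions), questions[0].answer)
```

Adding, removing, or reordering questions then only means editing the data, not renaming variables.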
There is a way of doing what you wanted: obtain a list of all the objects of a given type from a module (or file). I present two solutions:
Option one, from a different module (file):
Say you have the following file:
QuestionModule.py
class question:
    def __init__(self,question,answer,list_of_answers):
        self.question=question
        self.answer=answer
        self.list_of_answers=list_of_answers
question_01=question('''
Which of these fruits is red?
A). Banana
B). Orange
C). Apple
D). Peach
''',"C",("A","B","C","D"))
question_02=question('''
Which of these is both a fruit and a vegetable?
A). Cauliflower
B). Tomato
C). Brocolli
D). Okrah
''',"B",("A","B","C","D"))
Then you can get all questions by:
GetQuestions.py
import QuestionModule
def get():
    r = []
    for attribute in dir(QuestionModule):
        #print(attribute," ",type(getattr(QuestionModule,attribute)))
        if type(getattr(QuestionModule,attribute)) == QuestionModule.question:
            r.append(getattr(QuestionModule,attribute))
    return r
l_questions = get()
Or:
import GetQuestions
l_questions = GetQuestions.get()
Option two, from the same module (file):
If you want to do the same from the same file, you can do:
class question:
    def __init__(self,question,answer,list_of_answers):
        self.question=question
        self.answer=answer
        self.list_of_answers=list_of_answers
question_01=question('''
Which of these fruits is red?
A). Banana
B). Orange
C). Apple
D). Peach
''',"C",("A","B","C","D"))
question_02=question('''
Which of these is both a fruit and a vegetable?
A). Cauliflower
B). Tomato
C). Brocolli
D). Okrah
''',"B",("A","B","C","D"))
def getQuestions():
    import sys
    l = dir(sys.modules[__name__])
    r = []
    for e in l:
        if sys.modules[__name__].question==type(getattr(sys.modules[__name__],e)):
            r.append(getattr(sys.modules[__name__], e))
    return r
L = getQuestions()
for i in L :
    print(i)
    print(i.question)
You can take out the import sys from the method and put it at the top if you are to call getQuestions multiple times.
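For comparison, a different technique from the module scan above is to let the class keep track of its own instances. This is only a sketch under my own naming (the class attribute `registry` is made up), and note that it keeps every instance alive for as long as the class exists.

```python
class Question:
    registry = []  # every instance created so far, in creation order

    def __init__(self, question, answer, list_of_answers):
        self.question = question
        self.answer = answer
        self.list_of_answers = list_of_answers
        Question.registry.append(self)  # record the new instance

question_01 = Question("Which of these fruits is red?", "C", ("A", "B", "C", "D"))
question_02 = Question("Which of these is both a fruit and a vegetable?", "B", ("A", "B", "C", "D"))

all_questions = list(Question.registry)
print(len(all_questions))
```

This does not depend on variable names or on which module the objects were created in.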
Both answers were very good. I wish that I could select both of them as the main answer. I gave it to mm_ because his answer most closely fit my objectives, but I really like user2357112's answer too.
Thanks for the answers everyone!
I would like to use OGM of py2neo to represent a relationship from one node type to two node types.
I have a solution (below) that works only to store nodes/relationships in the DB, and I could not find one that works properly when retrieving relationships.
This is my example. Consider the relationship OWNS from a Person to a Car:
from py2neo.ogm import GraphObject, Property, RelatedTo
from py2neo import Graph
class Person(GraphObject):
    name = Property()
    Owns = RelatedTo("Car")
class Car(GraphObject):
    model = Property()
g = Graph(host="localhost", user="neo4j", password="neo4j")
# Create Pete
p = Person()
p.name = "Pete"
# Create Ferrari
c = Car()
c.model = "Ferrari"
# Pete OWNS Ferrari
p.Owns.add(c)
# Store
g.push(p)
This works well and fine. Now, let's assume that a Person OWNS a House as well (this code continues from the one above):
class House(GraphObject):
    city = Property()
# Create House
h = House()
h.city = "New York"
# Pete OWNS House in New York
p.Owns.add(h)
# Update
g.push(p)
The "to" end of the relationship OWNS is supposed to point to a Car, not a House. But apparently py2neo does not care that much and stores everything in the DB as expected: a Person, a Car and a House connected via OWNS relationships.
Now the problem is to use the above classes to retrieve nodes and relationships. While node properties are loaded correctly, relationships are not:
p = Person.select(g).where(name="Pete").first()
for n in list(p.Owns):
    print(type(n).__name__)
This results in:
Car
Car
This behavior is consistent with the class objects.
How can I model "Person OWNS Car" and "Person OWNS House" with the same class in py2neo.ogm? Is there any known solution or workaround that I can use here?
The issue is that "Owns" is set up as a relationship to the "Car" node. You need to set up another relationship to own a house. If you want the relationship to have the label of "OWNS" in Neo4j, you need to populate the second variable of the RelatedTo function. This is covered in the Py2Neo documentation (http://py2neo.org/v3/) in chapter 3.
class Person(GraphObject):
    name = Property()
    OwnsCar = RelatedTo("Car", "OWNS")
    OwnsHouse = RelatedTo("House", "OWNS")
class Car(GraphObject):
    model = Property()
class House(GraphObject):
    city = Property()
I do want to say that Rick's answer addressed something I was trying to figure out with labeling with the Py2Neo OGM. So thanks Rick!
I had essentially the same question. I was unable to find an answer and tried to come up with a solution to this using both py2neo and neomodel.
Just a Beginner
It is important to note that I am definitely not answering this as an expert in either one of these libraries but rather as someone trying to evaluate what might be the best one to start a simple project with.
End Result
The end result is that I found a workaround in py2neo that seems to work. I also got a result in neomodel that I was even happier with. I ended up a little frustrated by both libraries but found neomodel the more intuitive to a newcomer.
An Asset Label is the Answer Right?
I thought that the answer would be to create an "Asset" label and add this label to House and Car and create the [:OWNS] relationship between Person and Asset. Easy right? Nope, apparently not. There might be a straightforward answer but I was unable to find it. The only solution that I got to work in py2neo was to drop down to the lower-level (not OGM) part of the library.
Here's what I did in py2neo:
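from py2neo import Relationship
from py2neo.ogm import GraphObject, Property, Label
class Person(GraphObject):
    name = Property()
class Car(GraphObject):
    name = Property()
    model = Property()
    asset = Label("Asset")
class House(GraphObject):
    name = Property()
    city = Property()
    asset = Label("Asset")
g = graph  # graph: an already connected py2neo Graph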
# Create Pete
p = Person()
p.name = "Pete"
g.push(p)
# Create Ferrari
c = Car()
c.name = "Ferrari"
c.asset = True
g.push(c)
# Create House
h = House()
h.name = "White House"
h.city = "New York"
h.asset = True
g.push(h)
# Drop down a level and grab the actual nodes
pn = p.__ogm__.node
cn = c.__ogm__.node
# Pete OWNS Ferrari (lower level py2neo)
ap = Relationship(pn, "OWNS", cn)
g.create(ap)
# Pete OWNS House (lower level py2neo)
hn = h.__ogm__.node
ah = Relationship(pn, "OWNS", hn)
g.create(ah)
# Grab & Print
query = """MATCH (a:Person {name:'Pete'})-[:OWNS]->(n)
RETURN labels(n) as labels, n.name as name"""
data = g.data(query)
for asset in data:
    print(asset)
This results in:
{'name': 'White House', 'labels': ['House', 'Asset']}
{'name': 'Ferrari', 'labels': ['Car', 'Asset']}
Neomodel Version
py2neo seems to do some clever tricks with the class names to do its magic and the library seems to exclude Labels from this magic. (I hope I am wrong about this but as I said, I could not solve it). I decided to try neomodel.
from neomodel import StructuredNode, StringProperty, RelationshipTo, db
class Person(StructuredNode):
    name = StringProperty(unique_index=True)
    owns = RelationshipTo('Asset', 'OWNS')
    likes = RelationshipTo('Car', "LIKES")
class Asset(StructuredNode):
    __abstract_node__ = True
    __label__ = "Asset"
    name = StringProperty(unique_index=True)
class Car(Asset):
    pass
class House(Asset):
    city = StringProperty()
# Create Person, Car & House
pete = Person(name='Pete').save()
car = Car(name="Ferrari").save()
house = House(name="White House", city="Washington DC").save()
#Pete Likes Car
pete.likes.connect(car)
# Pete owns a House and Car
pete.owns.connect(house)
pete.owns.connect(car)
After these objects are created they are relatively simple to work with:
for l in pete.likes.all():
    print(l)
Result:
{'name': 'Ferrari', 'id': 385}
With the "abstract" relationship the result is an object of that type, in this case Asset.
for n in pete.owns.all():
    print(n)
    print(type(n))
Result:
{'id': 389}
<class '__main__.Asset'>
There seems to be a way to "inflate" these objects to the desired type but I gave up trying to figure that out in favor of just using Cypher. (Would appreciate some help understanding this...)
Dropping down to the Cypher level, we get exactly what we want:
query = "MATCH (a:Person {name:'Pete'})-[:OWNS]->(n) RETURN n"
results, meta = db.cypher_query(query)
for n in results:
    print(n)
Result:
[<Node id=388 labels={'Asset', 'Car'} properties={'name': 'Ferrari'}>]
[<Node id=389 labels={'Asset', 'House'} properties={'city': 'Washington DC', 'name': 'White House'}>]
Conclusion
The concept of Labels is very intuitive for a lot of the problems I would like to solve. I found py2neo's treatment of Labels confusing. Your workaround might be to drop down to the "lower-level" of py2neo. I personally thought the neomodel syntax was more friendly and suggest checking it out. HTH.
I am implementing an acoustic feature extraction system in python, and I need to implement a makefile-style algorithm to ensure that all blocks in the feature extraction system are run in the correct order, and without repeating any feature extractions stages.
The input to this feature extraction system will be a graph detailing the links between the feature extraction blocks, and I'd like to work out which functions to run when based upon the graph.
An example of such a system might be the following:
              ,-> [B] -> [D] ----+
input --> [A]             ^      v
              `-> [C] ----+---> [E] --> output
and the function calls (assuming each block X is a function of the form output = X(inputs)) might be something like:
a = A(input)
b = B(a)
c = C(a)
d = D(b,c) # can't call D until both b and c are ready
output = E(d,c) # can't call E until both c and d are ready
I already have the function graph loaded in the form of a dictionary with each dictionary entry of the form (inputs, function) like so:
blocks = {'A'      : (('input',), A),
          'B'      : (('A',), B),
          'C'      : (('A',), C),
          'D'      : (('B','C'), D),
          'output' : (('D','C'), E)}
I'm just currently drawing a blank on what the makefile algorithm does exactly, and how to go about implementing it. My google-fu appears to be not-very-helpful here too. If someone at least can give me a pointer to a good discussion of the makefile algorithm that would probably be a good start.
Topological sorting
blocks is basically an adjacency list representation of the (acyclic) dependency graph. Hence, to get the correct order to process the blocks, you can perform a topological sort.
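As a rough sketch of how that could look for a dictionary shaped like blocks, here is Kahn's algorithm plus a small runner. The helper names `topological_order` and `run` and the toy lambdas are mine, and it assumes that inputs are always given as tuples (one-element tuples written with a trailing comma) and that any input name that is not itself a block, such as 'input', is supplied from outside.

```python
from collections import deque

def topological_order(blocks):
    """Return block names so that every block comes after all of its inputs."""
    # For each block, count the inputs that are themselves blocks.
    pending = {name: sum(1 for i in inputs if i in blocks)
               for name, (inputs, _fn) in blocks.items()}
    # Reverse map: which blocks consume the output of a given block?
    consumers = {name: [] for name in blocks}
    for name, (inputs, _fn) in blocks.items():
        for i in inputs:
            if i in blocks:
                consumers[i].append(name)
    ready = deque(name for name, count in pending.items() if count == 0)
    order = []
    while ready:
        name = ready.popleft()
        order.append(name)
        for consumer in consumers[name]:
            pending[consumer] -= 1
            if pending[consumer] == 0:
                ready.append(consumer)
    if len(order) != len(blocks):
        raise ValueError("the block graph contains a cycle")
    return order

def run(blocks, value):
    """Evaluate every block exactly once, in dependency order."""
    signals = {"input": value}
    for name in topological_order(blocks):
        inputs, fn = blocks[name]
        signals[name] = fn(*(signals[i] for i in inputs))
    return signals["output"]

# Toy stand-ins for the real feature extraction functions.
blocks = {
    "A": (("input",), lambda x: x + 1),
    "B": (("A",), lambda a: a * 2),
    "C": (("A",), lambda a: a * 3),
    "D": (("B", "C"), lambda b, c: b + c),
    "output": (("D", "C"), lambda d, c: d - c),
}

print(topological_order(blocks))
print(run(blocks, 1))
```

Blocks that are left over when the queue runs dry can only be part of a cycle, which is why the length check doubles as loop detection.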
As the other helpful answerers have already pointed out, what I'm after is a topological sort, but I think my particular case is a little simpler because the function graph must always start at input and end at output.
So, here is what I ended up doing (I've edited it slightly to remove some context-dependent stuff, so it may not be completely correct):
def extract(func_graph):
    def _getsignal(block, signals):
        if block in signals:
            return signals[block]
        else:
            inblocks, fn = func_graph[block]
            inputs = [_getsignal(i, signals) for i in inblocks]
            signals[block] = fn(*inputs)  # unpack, since blocks are called like D(b, c)
            return signals[block]
    def extract_func(input):
        signals = dict(input=input)
        return _getsignal('output', signals)
    return extract_func
So now I can set up the function with
fn = extract(blocks)
And use it as many times as I like:
list_of_outputs = [fn(i) for i in list_of_inputs]
I should probably also put in some checks for loops, but that is a problem for another day.
For code in many languages, including Python, try these Rosetta Code links: Topological sort, and Topological sort/Extracted top item.
I'm rewriting a data-driven legacy application in Python. One of the primary tables is referred to as a "graph table", and does appear to be a directed graph, so I was exploring the NetworkX package to see whether it would make sense to use it for the graph table manipulations, and really implement it as a graph rather than a complicated set of arrays.
However I'm starting to wonder whether the way we use this table is poorly suited for an actual graph manipulation library. Most of the NetworkX functionality seems to be oriented towards characterizing the graph itself in some way, determining shortest distance between two nodes, and things like that. None of that is relevant to my application.
I'm hoping if I can describe the actual usage here, someone can advise me whether I'm just missing something -- I've never really worked with graphs before so this is quite possible -- or if I should be exploring some other data structure. (And if so, what would you suggest?)
We use the table primarily to transform a user-supplied string of keywords into an ordered list of components. This constitutes 95% of the use cases; the other 5% are "given a partial keyword string, supply all possible completions" and "generate all possible legal keyword strings". Oh, and validate the graph against malformation.
Here's an edited excerpt of the table. Columns are:
keyword   innode   outnode   component
acs       1        20        clear
default   1        100       clear
noota     20       30        clear
default   20       30        hst_ota
ota       20       30        hst_ota
acs       30       10000     clear
cos       30       11000     clear
sbc       10000    10199     clear
hrc       10000    10150     clear
wfc1      10000    10100     clear
default   10100    10101     clear
default   10101    10130     acs_wfc_im123
f606w     10130    10140     acs_f606w
f550m     10130    10140     acs_f550m
f555w     10130    10140     acs_f555w
default   10140    10300     clear
wfc1      10300    10310     acs_wfc_ebe_win12f
default   10310    10320     acs_wfc_ccd1
Given the keyword string "acs,wfc1,f555w" and this table, the traversal logic is:
Start at node 1; "acs" is in the string, so go to node 20.
None of the presented keywords for node 20 are in the string, so choose the default, pick up hst_ota, and go to node 30.
"acs" is in the string, so go to node 10000.
"wfc1" is in the string, so go to node 10100.
Only one choice; go to node 10101.
Only one choice, so pick up acs_wfc_im123 and go to node 10130.
"f555w" is in the string, so pick up acs_f555w and go to node 10140.
Only one choice, so go to node 10300.
"wfc1" is in the string, so pick up acs_wfc_ebe_win12f and go to node 10310.
Only one choice, so pick up acs_wfc_ccd1 and go to node 10320 -- which doesn't exist, so we're done.
Thus the final list of components is
hst_ota
acs_wfc_im123
acs_f555w
acs_wfc_ebe_win12f
acs_wfc_ccd1
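To pin down the traversal described above, here is a small sketch of it in plain Python. The names `ROWS` and `components_for` are mine; it assumes that "clear" means no component is picked up, that the first matching keyword wins if several match, and that a node with neither a matching keyword nor a default is an error.

```python
# (keyword, innode, outnode, component) rows from the table excerpt.
ROWS = [
    ("acs", 1, 20, "clear"), ("default", 1, 100, "clear"),
    ("noota", 20, 30, "clear"), ("default", 20, 30, "hst_ota"),
    ("ota", 20, 30, "hst_ota"),
    ("acs", 30, 10000, "clear"), ("cos", 30, 11000, "clear"),
    ("sbc", 10000, 10199, "clear"), ("hrc", 10000, 10150, "clear"),
    ("wfc1", 10000, 10100, "clear"),
    ("default", 10100, 10101, "clear"),
    ("default", 10101, 10130, "acs_wfc_im123"),
    ("f606w", 10130, 10140, "acs_f606w"), ("f550m", 10130, 10140, "acs_f550m"),
    ("f555w", 10130, 10140, "acs_f555w"),
    ("default", 10140, 10300, "clear"),
    ("wfc1", 10300, 10310, "acs_wfc_ebe_win12f"),
    ("default", 10310, 10320, "acs_wfc_ccd1"),
]

def components_for(keyword_string, rows=ROWS, start=1):
    keywords = {k.strip() for k in keyword_string.split(",")}
    # Group the outgoing edges by the node they leave.
    edges = {}
    for keyword, innode, outnode, component in rows:
        edges.setdefault(innode, {})[keyword] = (outnode, component)
    node, picked = start, []
    while node in edges:                      # stop at a node with no rows
        choices = edges[node]
        matches = [k for k in choices if k in keywords]
        if matches:
            key = matches[0]                  # assumption: first match wins
        elif "default" in choices:
            key = "default"
        else:
            raise ValueError(f"no keyword or default applies at node {node}")
        node, component = choices[key]
        if component != "clear":              # assumption: 'clear' adds nothing
            picked.append(component)
    return picked

print(components_for("acs,wfc1,f555w"))
```

For the example string this walks the same nodes as the steps listed above.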
I can make a graph from just the innodes and outnodes of this table, but I couldn't for the life of me figure out how to build in the keyword information that determines which choice to make when faced with multiple possibilities.
Updated to add examples of the other use cases:
Given a string "acs", return ("hrc","wfc1") as possible legal next choices
Given a string "acs, wfc1, foo", raise an exception due to an unused keyword
Return all possible legal strings:
cos
acs, hrc
acs, wfc1, f606w
acs, wfc1, f550m
acs, wfc1, f555w
Validate that all nodes can be reached and that there are no loops.
I can tweak Alex's solution for the first two of these, but I don't see how to do it for the last two.
Definitely not suitable for general purpose graph libraries (whatever you're supposed to do if more than one of the words meaningful in a node is in the input string -- is that an error? -- or if none does and there is no default for the node, as for node 30 in the example you supply). Just write the table as a dict from node to tuple (default stuff, dict from word to specific stuff) where each stuff is a tuple (destination, word-to-add) (and use None for the special "word-to-add" clear). So e.g.:
tab = {1: ((100, None), {'acs': (20, None)}),
       20: ((30, 'hst_ota'), {'ota': (30, 'hst_ota'), 'noota': (30, None)}),
       30: ((None, None), {'acs': (10000,None), 'cos':(11000,None)}),
       # etc etc
      }
Now handling this table and an input comma-separated string is easy, thanks to set operations -- e.g.:
def f(icss):
    kws = set(icss.split(','))
    N = 1
    while N in tab:
        stuff, others = tab[N]
        found = kws & set(others)
        if found:
            # maybe error if len(found) > 1 ?
            stuff = others[found.pop()]
        N, word_to_add = stuff
        if word_to_add is not None:
            print(word_to_add)
Adding an answer to respond to the further requirements newly edited in...: I still wouldn't go for a general-purpose library. For "all nodes can be reached and there are no loops", simply reasoning in terms of sets (ignoring the triggering keywords) should do: (again untested code, but the general outline should help even if there's some typo &c):
def add_descendants(someset, node):
    "auxiliary function: add all descendants of node to someset"
    if node not in tab:
        return  # terminal node: it has no entry in tab
    stuff, others = tab[node]
    othernode, _ = stuff
    if othernode is not None:
        someset.add(othernode)
    for othernode, _ in others.values():
        if othernode is not None:
            someset.add(othernode)
def islegal():
    "Return bool, message (bool is True for OK tab, False if not OK)"
    # make set of all nodes ever mentioned in the table
    all_nodes = set()
    for node in tab:
        all_nodes.add(node)
        add_descendants(all_nodes, node)
    # check for loops and connectivity
    previously_seen = set()
    currently_seen = set([1])
    while currently_seen:
        node = currently_seen.pop()
        if node in previously_seen:
            return False, "loop involving node %s" % node
        previously_seen.add(node)
        add_descendants(currently_seen, node)
    unreachable = all_nodes - previously_seen
    if unreachable:
        return False, "%d unreachable nodes: %s" % (len(unreachable), unreachable)
    else:
        terminal = previously_seen - set(tab)
        if terminal:
            return True, "%d terminal nodes: %s" % (len(terminal), terminal)
        return True, "Everything hunky-dory"
For the "legal strings" you'll need some other code, but I can't write it for you because I have not yet understood what makes a string legal or otherwise...!
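One caveat on the loop check: treating any node that is seen twice as a loop can also flag two different paths that merely converge on the same node. A depth-first variant that only reports a loop when a node is reached again while it is still being explored might look like the sketch below. The names `successors` and `validate` are mine, the table is assumed to have the same node -> (default stuff, {word: stuff}) shape as tab, and the recursion could hit Python's recursion limit on very long chains.

```python
def successors(table, node):
    """All nodes directly reachable from node (empty for terminal nodes)."""
    if node not in table:
        return set()
    default, others = table[node]
    targets = {default[0]} | {dest for dest, _word in others.values()}
    targets.discard(None)
    return targets

def validate(table, start=1):
    """Return (has_loop, unreachable_nodes) for a table shaped like tab."""
    all_nodes = set(table)
    for node in table:
        all_nodes |= successors(table, node)
    done, in_progress = set(), set()

    def visit(node):
        if node in done:
            return False          # already fully explored: converging path, not a loop
        if node in in_progress:
            return True           # reached again while still exploring it: a loop
        in_progress.add(node)
        found = False
        for nxt in successors(table, node):
            if visit(nxt):
                found = True
        in_progress.discard(node)
        done.add(node)
        return found

    has_loop = visit(start)
    return has_loop, all_nodes - done

sample = {1: ((100, None), {'acs': (20, None)}),
          20: ((30, 'hst_ota'), {'ota': (30, 'hst_ota'), 'noota': (30, None)}),
          30: ((None, None), {'acs': (10000, None), 'cos': (11000, None)})}
print(validate(sample))
```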