How can I combine a switch-case and regex in Python?

I want to process a string by matching it against a sequence of regular expressions. As I'm trying to avoid nested if-then, I'm thinking of switch-case. How can I write the following structure in Python? Thank you
switch str:
    case match(regex1):
        # do something
    case match(regex2):
        # do sth else
I know Perl allows one to do that. Does Python?

First consider why there is no case statement in Python. So reset your brain and forget them.
You can use an object class, function decorators or use function dictionaries to achieve the same or better results.
Here is a quick trivial example:
#!/usr/bin/env python
import re
def hat(found):
    if found: print "found a hat"
    else: print "no hat"
def cat(found):
    if found: print "found a cat"
    else: print "no cat"
def dog(found):
    if found: print "found a dog"
    else: print "no dog"
st="""
Here is the target string
with a hat and a cat
no d o g
end
"""
patterns=['hat', 'cat', 'dog']
functions=[hat,cat,dog]
for pattern,case in zip(patterns,functions):
    print "pattern=",pattern
    case(re.search(pattern,st))
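For the structure in the question, where only the first matching pattern should fire, an ordered list of (compiled pattern, handler) pairs is one way to go. This is a minimal sketch written for Python 3, not a definitive implementation; the names `CASES`, `dispatch`, `on_hat`, `on_cat` and `on_default` are made up for illustration.

```python
import re

# Hypothetical handlers: each receives the match object of its own pattern.
def on_hat(match):
    return "found a hat at %d" % match.start()

def on_cat(match):
    return "found a cat at %d" % match.start()

def on_default(text):
    return "no pattern matched"

# Order matters: like a switch, the first pattern that matches wins.
CASES = [
    (re.compile(r"hat"), on_hat),
    (re.compile(r"cat"), on_cat),
]

def dispatch(text):
    for pattern, handler in CASES:
        match = pattern.search(text)
        if match:
            return handler(match)  # stop at the first hit, no fall-through
    return on_default(text)

print(dispatch("with a hat and a cat"))
print(dispatch("no d o g"))
```

Because the handlers get the match object, they can use groups and positions without running the regex a second time.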
C-style switch/case statements also "fall through", such as:
switch(c) {
    case 'a':
    case 'b':
    case 'c': do_abc();
              break;
    ... other cases...
}
Using tuples and lists of callables, you can get similar behavior:
st="rat kitten snake puppy bug child"
def proc1(st): print "cuddle the %s" % st
def proc2(st): print "kill the %s" % st
def proc3(st): print "pick-up the %s" % st
def proc4(st): print "wear the %s" % st
def proc5(st): print "dispose of the %s" %st
def default(st): print "%s not found" % st
dproc={ ('puppy','kitten','child'):
            [proc3, proc1],
        ('hat','gloves'):
            [proc3, proc4],
        ('rat','snake','bug'):
            [proc2, proc3, proc5]}
for patterns,cases in dproc.iteritems():
    for pattern in patterns:
        if re.search(pattern,st):
            for case in cases: case(pattern)
        else: default(pattern)
    print
This gets the order for the found item correct: 1) pick up child, cuddle the child; 2) kill the rat, pick up the rat... It would be difficult to do the same with a C switch statement in an understandable syntax.
There are many other ways to imitate a C switch statement. Here is one (for integers) using function decorators:
case = {}
def switch_on(*values):
    def case_func(f):
        case.update((v, f) for v in values)
        return f
    return case_func
@switch_on(0, 3, 5)
def case_a(): print "case A"
@switch_on(1,2,4)
def case_b(): print "case B"
def default(): print "default"
for i in (0,2,3,5,22):
    print "Case: %i" % i
    try:
        case[i]()
    except KeyError:
        default()
To paraphrase Larry Wall, Tom Christiansen, Jon Orwant in Programming Perl regarding understanding context in Perl:
You will be miserable programming Python until you use the idioms that are native to the language...

A quick search shows a similar question asked earlier with multiple workarounds. My favorite solution from that one is by Mizard:
import re
class Re(object):
    def __init__(self):
        self.last_match = None
    def match(self,pattern,text):
        self.last_match = re.match(pattern,text)
        return self.last_match
    def search(self,pattern,text):
        self.last_match = re.search(pattern,text)
        return self.last_match
gre = Re()
if gre.match(r'foo',text):
    # do something with gre.last_match
elif gre.match(r'bar',text):
    # do something with gre.last_match
else:
    # do something else
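On Python 3.8 and later, an assignment expression (`:=`) lets you keep the match object without a helper class. A minimal sketch, assuming Python 3.8+; the function name `classify` and the patterns are illustrative only.

```python
import re

def classify(text):
    # Each branch binds the match object to m and tests it in one step.
    if (m := re.match(r"foo(\d+)", text)):
        return "foo with number " + m.group(1)
    elif (m := re.match(r"bar(\w*)", text)):
        return "bar with suffix " + repr(m.group(1))
    else:
        return "no match"

print(classify("foo42"))
print(classify("barbell"))
print(classify("spam"))
```

This keeps the flat if/elif shape of the class-based version while making the match object a plain local variable.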

You are looking for pyswitch (disclaimer: I am the author). With it, you can do the following, which is pretty close to the example you gave in your question:
from pyswitch import Switch
mySwitch = Switch()
@mySwitch.caseRegEx(regex1)
def doSomething(matchObj, *args, **kwargs):
    # Do Something
    return 1
@mySwitch.caseRegEx(regex2)
def doSomethingElse(matchObj, *args, **kwargs):
    # Do Something Else
    return 2
rval = mySwitch(stringYouWantToSwitchOn)
There's a much more comprehensive example given at the URL I linked. pyswitch is not restricted to just switching on regular expressions. Internally, pyswitch uses a dispatch system similar to the examples others have given above. I just got tired of having to re-write the same code framework over and over every time I needed that kind of dispatch system, so I wrote pyswitch.

Your question regarding Perl style switch statements is ambiguous. You reference Perl but you are using a C style switch statement in your example. (There is a deprecated module that provides C style switch statements in Perl, but this is not recommended...)
If you mean Perl given / when type switch statements, this would not be trivial to implement in Python. You would need to implement smart matching and other non-trivial Perl idioms. You might as well just write it in Perl.
If you mean C style switch statements, these are relatively trivial in comparison. Most recommend using a dictionary dispatch method, such as:
import re
def case_1():
    print "case 1"
    return 1
def case_2():
    print "case 2"
    return 2
def case_3():
    print "case 3"
    return 3
def default():
    print "None"
    return 0
dispatch= {
    'a': case_1,
    'g': case_2,
    'some_other': case_3,
    'default': default
    }
str="abcdefg"
r=[dispatch[x]() if re.search(x,str) else dispatch['default']()
   for x in ['a','g','z'] ]
print "r=",r

If you're avoiding if-then, you can build on something like this:
import re
# The patterns
r1 = "spam"
r2 = "eggs"
r3 = "fish"
def doSomething1():
    return "Matched spam."
def doSomething2():
    return "Matched eggs."
def doSomething3():
    return "Matched fish."
def default():
    return "No match."
def match(r, s):
    mo = re.match(r, s)
    try:
        return mo.group()
    except AttributeError:
        return None
def delegate(s):
    try:
        action = {
            match(r1, s): doSomething1,
            match(r2, s): doSomething2,
            match(r3, s): doSomething3,
        }[s]()
        return action
    except KeyError:
        return default()
Results
>>> delegate("CantBeFound")
0: 'No match.'
>>> delegate("spam")
1: 'Matched spam.'
>>> delegate("eggs")
2: 'Matched eggs.'
>>> delegate("fish")
3: 'Matched fish.'

Related

How to efficiently evaluate a series of methods and call the ones that are not False?

The class below gets a string as input and produces another string via the answer() method.
class Question:
    a = ["hello", "hi"]
    b = ["how are you", "how do you do"]
    c = ["how is the weather", "is it cold today"]
    def __init__(self, question):
        self.question = question
    def get_greeting(self):
        if self.question in Question.a:
            return "Hi There!"
    def get_health(self):
        if self.question in Question.b:
            return "I am fine"
    def get_weather(self):
        if self.question in Question.c:
            return "It is warm"
    def answer(self):
        if self.get_greeting():
            return self.get_greeting()
        elif self.get_health():
            return self.get_health()
        elif self.get_weather():
            return self.get_weather()
        else:
            return "I don't understand"
question = Question("how is the weather")
print(question.answer()) # getting the output
To me the above is bad because the code inside answer() is long and it calls each method twice.
Therefore, I came up with a "better" answer() method that calls the methods only once, but it's still a lot of if conditionals.
def answer(self):
    result = self.get_greeting()
    if result:
        return result
    result = self.get_health()
    if result:
        return result
    result = self.get_weather()
    if result:
        return result
    return "I don't understand"
I think there might be some other technique that I am missing here. Can anyone suggest something?

The result of or is the left-hand operand if it's not "false-y", otherwise the right-hand operand.
It also evaluates the right-hand operand only when it's needed.
def answer(self):
    return self.get_greeting() \
        or self.get_health() \
        or self.get_weather() \
        or "I don't understand"
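To see both properties of `or` at work (it returns an operand, and it stops evaluating as soon as one is truthy), here is a small standalone sketch in Python 3. The functions `first`, `second` and `third` and the `calls` list are hypothetical and exist only to record which calls actually run.

```python
calls = []

def first():
    calls.append("first")
    return None            # falsy, so `or` moves on to the next operand

def second():
    calls.append("second")
    return "second wins"   # truthy, so `or` stops here and returns this value

def third():
    calls.append("third")
    return "never reached"

result = first() or second() or third() or "fallback"
print(result)  # the value returned by second()
print(calls)   # third() does not appear: it was never called
```

One caveat: any falsy return value, such as an empty string or 0, is treated the same as None, so this idiom only fits when a falsy result really means "no answer".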

You can make a tuple of all methods and then call them until one returns something:
def answer(self):
    methods = (self.get_greeting, self.get_health, self.get_weather)
    for m in methods:
        res = m()
        if res:
            return res
    return "I don't understand"
Edit
If you really want to create a lot of methods and have your answer() function try them all without explicitly telling it, you can use this code:
def answer(self):
    getters = (v for k, v in self.__class__.__dict__.items() if k.startswith("get_"))
    for method in getters:
        res = method(self)
        if res:
            return res
    return "I don't understand"
Edit 2
If your system just gets a string as an input and generates predefined outputs from it, you may be able to simplify it quite a bit:
knowledge = [
    (["hello", "hi"], "Hi There"),
    (["how are you", "how do you do"], "I am fine"),
    (["how is the weather", "is it cold today"], "It is warm")
]
def answer(question):
    for inputs, answer in knowledge:
        if question in inputs:
            return answer
    return "I don't understand"
print(answer("hello"))
Using this approach, adding a new phrase to the chatbot is as easy as adding a line to the knowledge data structure.

Is there a way to add a conditional string in Python's advanced string formatting "foo {}".format(bar)?

For example I have a line of code like this
if checked:
    checked_string = "check"
else:
    checked_string = "uncheck"
print "You can {} that step!".format(checked_string)
Is there a shortcut to this? I was just curious.

print "You can {} that step!".format('check' if checked else 'uncheck')

checkmap = {True: 'check', False: 'uncheck'}
print "You can {} that step!".format(checkmap[bool(checked)])

With Python 3.6+, this can be handled using f-strings:
print(f"You can {'check' if checked else 'uncheck'} that step!")

I know, I'm very late. But people do search.
I'm using this in a situation where the format strings must be as simple as possible, because they are part of the user-supplied configuration, i.e. written by people who know nothing about Python.
In this basic form, usage is limited to a single condition.
class FormatMap:
    def __init__(self, value):
        self._value = bool(value)
    def __getitem__(self, key):
        skey = str(key)
        if '/' not in skey:
            raise KeyError(key)
        return skey.split('/', 1)[self._value]
def format2(fmt, value):
    return fmt.format_map(FormatMap(value))
STR1="A valve is {open/closed}."
STR2="Light is {off/on}."
STR3="A motor {is not/is} running."
print(format2(STR1, True))
print(format2(STR2, True))
print(format2(STR3, True))
print(format2(STR3, False))
# A valve is closed.
# Light is on.
# A motor is running.
# A motor is not running.

Real-world examples of nested functions

I asked previously how nested functions work, but unfortunately I still don't quite get it. To understand it better, can someone please show some real-world, practical usage examples of nested functions?
Many thanks

Your question made me curious, so I looked in some real-world code: the Python standard library. I found 67 examples of nested functions. Here are a few, with explanations.
One very simple reason to use a nested function is simply that the function you're defining doesn't need to be global, because only the enclosing function uses it. A typical example from Python's quopri.py standard library module:
def encode(input, output, quotetabs, header = 0):
    ...
    def write(s, output=output, lineEnd='\n'):
        # RFC 1521 requires that the line ending in a space or tab must have
        # that trailing character encoded.
        if s and s[-1:] in ' \t':
            output.write(s[:-1] + quote(s[-1]) + lineEnd)
        elif s == '.':
            output.write(quote(s) + lineEnd)
        else:
            output.write(s + lineEnd)
    ... # 35 more lines of code that call write in several places
Here there was some common code within the encode function, so the author simply factored it out into a write function.
Another common use for nested functions is re.sub. Here's some code from the json/encoder.py standard library module:
def encode_basestring(s):
    """Return a JSON representation of a Python string
    """
    def replace(match):
        return ESCAPE_DCT[match.group(0)]
    return '"' + ESCAPE.sub(replace, s) + '"'
Here ESCAPE is a regular expression, and ESCAPE.sub(replace, s) finds all matches of ESCAPE in s and replaces each one with replace(match).
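The same pattern can be tried in isolation. The sketch below is a simplified stand-in written for Python 3, not the json module's code: the function `escape_controls`, its `table` and its `pattern` are hypothetical and cover only three characters.

```python
import re

def escape_controls(s):
    # Hypothetical table, much smaller than the one used by the json module.
    table = {"\n": "\\n", "\t": "\\t", '"': '\\"'}
    pattern = re.compile(r'[\n\t"]')

    def replace(match):
        # Called once per match; its return value is substituted into the result.
        return table[match.group(0)]

    return '"' + pattern.sub(replace, s) + '"'

print(escape_controls('say "hi"\n'))
```

The nested `replace` can read `table` from the enclosing scope, which is what makes it convenient to define right where it is used.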
In fact, any API, like re.sub, that accepts a function as a parameter can lead to situations where nested functions are convenient. For example, in turtle.py there's some silly demo code that does this:
def baba(xdummy, ydummy):
    clearscreen()
    bye()
...
tri.write(" Click me!", font = ("Courier", 12, "bold") )
tri.onclick(baba, 1)
onclick expects you to pass an event-handler function, so we define one and pass it in.

Decorators are a very popular use for nested functions. Here's an example of a decorator that prints a statement before and after any call to the decorated function.
def entry_exit(f):
    def new_f(*args, **kwargs):
        print "Entering", f.__name__
        f(*args, **kwargs)
        print "Exited", f.__name__
    return new_f
@entry_exit
def func1():
    print "inside func1()"
@entry_exit
def func2():
    print "inside func2()"
func1()
func2()
print func1.__name__
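A wrapper written this way reports its own name (`new_f`) as `__name__` and discards the wrapped function's return value. The following variant is a sketch for Python 3 that uses `functools.wraps` and passes the result through; the decorated function `add` is a made-up example.

```python
from functools import wraps

def entry_exit(f):
    @wraps(f)  # copy f's __name__ and docstring onto new_f
    def new_f(*args, **kwargs):
        print("Entering", f.__name__)
        result = f(*args, **kwargs)
        print("Exited", f.__name__)
        return result  # pass the wrapped function's result through
    return new_f

@entry_exit
def add(a, b):
    return a + b

print(add(2, 3))
print(add.__name__)
```

With `wraps` applied, `add.__name__` is `add` rather than `new_f`, which keeps tracebacks and logging readable.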

Nested functions avoid cluttering other parts of the program with other functions and variables that only make sense locally.
A function that returns Fibonacci numbers could be defined as follows:
>>> def fib(n):
...     def rec():
...         return fib(n-1) + fib(n-2)
...     if n == 0:
...         return 0
...     elif n == 1:
...         return 1
...     else:
...         return rec()
...
>>> map(fib, range(10))
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
EDIT: In practice, generators would be a better solution for this, but the example shows how to take advantage of nested functions.

They are useful when using functions that take other functions as input. Say you're in a function, and want to sort a list of items based on the items' value in a dict:
import random
def f(items):
    vals = {}
    for i in items: vals[i] = random.randint(0,100)
    def key(i): return vals[i]
    items.sort(key=key)
You can just define key right there and have it use vals, a local variable.
Another use-case is callbacks.

I have only had to use nested functions when creating decorators. A nested function is basically a way of adding some behavior to a function without knowing what the function is that you are adding behavior to.
from functools import wraps
from types import InstanceType
def printCall(func):
    def getArgKwargStrings(*args, **kwargs):
        argsString = "".join(["%s, " % (arg) for arg in args])
        kwargsString = "".join(["%s=%s, " % (key, value) for key, value in kwargs.items()])
        if not len(kwargs):
            if len(argsString):
                argsString = argsString[:-2]
        else:
            kwargsString = kwargsString[:-2]
        return argsString, kwargsString
    @wraps(func)
    def wrapper(*args, **kwargs):
        ret = None
        if args and isinstance(args[0], InstanceType) and getattr(args[0], func.__name__, None):
            instance, args = args[0], args[1:]
            argsString, kwargsString = getArgKwargStrings(*args, **kwargs)
            ret = func(instance, *args, **kwargs)
            print "Called %s.%s(%s%s)" % (instance.__class__.__name__, func.__name__, argsString, kwargsString)
            print "Returned %s" % str(ret)
        else:
            argsString, kwargsString = getArgKwargStrings(*args, **kwargs)
            ret = func(*args, **kwargs)
            print "Called %s(%s%s)" % (func.__name__, argsString, kwargsString)
            print "Returned %s" % str(ret)
        return ret
    return wrapper
def sayHello(name):
    print "Hello, my name is %s" % (name)
if __name__ == "__main__":
    sayHelloAndPrintDebug = printCall(sayHello)
    name = "Nimbuz"
    sayHelloAndPrintDebug(name)
Ignore all the mumbo jumbo in the "printCall" function for right now and focus only on the "sayHello" function and below. What we want here is to print out how the "sayHello" function was called every time it is called, without knowing or altering what the "sayHello" function does. So we redefine the "sayHello" function by passing it to "printCall", which returns a NEW function that does what the "sayHello" function does AND prints how the "sayHello" function was called. This is the concept of decorators.
Putting "@printCall" above the sayHello definition accomplishes the same thing:
@printCall
def sayHello(name):
    print "Hello, my name is %s" % (name)
if __name__ == "__main__":
    name = "Nimbuz"
    sayHello(name)

Yet another (very simple) example. A function that returns another function. Note how the inner function (that is returned) can use variables from the outer function's scope.
def create_adder(x):
    def _adder(y):
        return x + y
    return _adder
add2 = create_adder(2)
add100 = create_adder(100)
>>> add2(50)
52
>>> add100(50)
150

Python Decorators
This is actually another topic to learn, but if you look at the stuff on 'Using Functions as Decorators', you'll see some examples of nested functions.

OK, besides decorators: Say you had an application where you needed to sort a list of strings based on substrings which varied from time to time. The sorted function takes a key= argument, which is a function of one argument: the item (a string in this case) to be sorted. So how do you tell this function which substring to sort on? A closure, or nested function, is perfect for this:
def sort_key_factory(start, stop):
    def sort_key(string):
        return string[start: stop]
    return sort_key
Simple eh? You can expand on this by encapsulating start and stop in a tuple or a slice object and then passing a sequence or iterable of these to the sort_key_factory.
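As a rough illustration of both the basic factory and the slice-based expansion, here is a sketch for Python 3. The `records` data and the `multi_key_factory` helper are hypothetical; `sort_key_factory` is repeated so the block runs on its own.

```python
def sort_key_factory(start, stop):
    def sort_key(string):
        return string[start:stop]
    return sort_key

def multi_key_factory(slices):
    # Hypothetical expansion: sort on several substrings, in priority order.
    def sort_key(string):
        return tuple(string[s] for s in slices)
    return sort_key

records = ["2019-beta", "2021-alpha", "2020-beta", "2018-gamma"]

# Sort on the first four characters (the year).
by_year = sorted(records, key=sort_key_factory(0, 4))

# Sort on the name first, then on the year to break ties.
by_name_then_year = sorted(
    records, key=multi_key_factory([slice(5, None), slice(0, 4)])
)

print(by_year)
print(by_name_then_year)
```

Each call to a factory produces a fresh key function that remembers its own `start`/`stop` or `slices`, so different sort orders can coexist without globals.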

Python equivalent of Ruby's StringScanner?

Is there a Python class equivalent to Ruby's StringScanner class? I could hack something together, but I don't want to reinvent the wheel if this already exists.

Interestingly there's an undocumented Scanner class in the re module:
import re
def s_ident(scanner, token): return token
def s_operator(scanner, token): return "op%s" % token
def s_float(scanner, token): return float(token)
def s_int(scanner, token): return int(token)
scanner = re.Scanner([
    (r"[a-zA-Z_]\w*", s_ident),
    (r"\d+\.\d*", s_float),
    (r"\d+", s_int),
    (r"=|\+|-|\*|/", s_operator),
    (r"\s+", None),
    ])
print scanner.scan("sum = 3*foo + 312.50 + bar")
Following the discussion it looks like it was left in as experimental code/a starting point for others.

There is nothing exactly like Ruby's StringScanner in Python. It is of course easy to put something together:
import re
class Scanner(object):
    def __init__(self, s):
        self.s = s
        self.offset = 0
    def eos(self):
        return self.offset == len(self.s)
    def scan(self, pattern, flags=0):
        if isinstance(pattern, basestring):
            pattern = re.compile(pattern, flags)
        match = pattern.match(self.s, self.offset)
        if match is not None:
            self.offset = match.end()
            return match.group(0)
        return None
along with an example of using it interactively
>>> s = Scanner("Hello there!")
>>> s.scan(r"\w+")
'Hello'
>>> s.scan(r"\s+")
' '
>>> s.scan(r"\w+")
'there'
>>> s.eos()
False
>>> s.scan(r".*")
'!'
>>> s.eos()
True
>>>
However, for the work I do I tend to just write those regular expressions in one go and use groups to extract the needed fields. Or for something more complicated I would write a one-off tokenizer or look to PyParsing or PLY to tokenize for me. I don't see myself using something like StringScanner.
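The "one regular expression with groups" approach can look roughly like the sketch below, written for Python 3. It is only an illustration of the idea, not the answerer's actual code; `TOKEN_RE`, the group names and `tokenize` are assumptions.

```python
import re

# One pattern with a named group per token type. Order matters:
# FLOAT must come before INT so that "312.50" is not split at the dot.
TOKEN_RE = re.compile(r"""
    (?P<FLOAT>\d+\.\d*)      |
    (?P<INT>\d+)             |
    (?P<IDENT>[A-Za-z_]\w*)  |
    (?P<OP>[=+\-*/])         |
    (?P<SKIP>\s+)
""", re.VERBOSE)

def tokenize(text):
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError("unexpected character %r at %d" % (text[pos], pos))
        pos = m.end()
        if m.lastgroup != "SKIP":       # drop whitespace tokens
            yield m.lastgroup, m.group()

print(list(tokenize("sum = 3*foo + 312.50")))
```

`m.lastgroup` gives the name of the alternative that matched, so the token type comes for free from the pattern itself.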

Looks like a variant on re.split( pattern, string ).
http://docs.python.org/library/re.html
http://docs.python.org/library/re.html#re.split

https://pypi.python.org/pypi/scanner/
Seems a more maintained and feature complete solution. But it uses oniguruma directly.

Maybe look into the built in module tokenize. It looks like you can pass a string into it using the StringIO module.

Today there is a project by Mark Watkinson that implements StringScanner in Python:
http://asgaard.co.uk/p/Python-StringScanner
https://github.com/markwatkinson/python-string-scanner
http://code.google.com/p/python-string-scanner/

Are you looking for regular expressions in Python? Check this link from official docs:
http://docs.python.org/library/re.html

How do I get the name of a function or method from within a Python function or method?

I feel like I should know this, but I haven't been able to figure it out...
I want to get the name of a method--which happens to be an integration test--from inside it so it can print out some diagnostic text. I can, of course, just hard-code the method's name in the string, but I'd like to make the test a little more DRY if possible.

This seems to be the simplest way using module inspect:
import inspect
def somefunc(a,b,c):
    print "My name is: %s" % inspect.stack()[0][3]
You could generalise this with:
def funcname():
    return inspect.stack()[1][3]
def somefunc(a,b,c):
    print "My name is: %s" % funcname()
Credit to Stefaan Lippens, whose solution I found via Google.

The answers involving introspection via inspect and the like are reasonable. But there may be another option, depending on your situation:
If your integration test is written with the unittest module, then you could use self.id() within your TestCase.
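A minimal sketch of that idea for Python 3 follows; the class `IntegrationTests` and the method `test_login` are hypothetical names. `self.id()` returns a dotted string that ends in the class and method name, so the last component is the method name.

```python
import unittest

class IntegrationTests(unittest.TestCase):
    def test_login(self):
        # self.id() is a dotted name ending in "IntegrationTests.test_login".
        method_name = self.id().split(".")[-1]
        print("running", method_name)
        self.assertEqual(method_name, "test_login")

# Run the test case explicitly so the sketch works when pasted into a script.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(IntegrationTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Printing `self.id()` unmodified is often enough for diagnostics, since it also identifies the module and class.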

This decorator makes the name of the method available inside the function by passing it as a keyword argument.
from functools import wraps
def pass_func_name(func):
    "Name of decorated function will be passed as keyword arg _func_name"
    @wraps(func)
    def _pass_name(*args, **kwds):
        kwds['_func_name'] = func.func_name
        return func(*args, **kwds)
    return _pass_name
You would use it this way:
@pass_func_name
def sum(a, b, _func_name):
    print "running function %s" % _func_name
    return a + b
print sum(2, 4)
But maybe you'd want to write what you want directly inside the decorator itself. Then the code is an example of a way to get the function name in a decorator. If you give more details about what you want to do in the function, that requires the name, maybe I can suggest something else.

# file "foo.py"
import sys
import os
def LINE( back = 0 ):
    return sys._getframe( back + 1 ).f_lineno
def FILE( back = 0 ):
    return sys._getframe( back + 1 ).f_code.co_filename
def FUNC( back = 0):
    return sys._getframe( back + 1 ).f_code.co_name
def WHERE( back = 0 ):
    frame = sys._getframe( back + 1 )
    return "%s/%s %s()" % ( os.path.basename( frame.f_code.co_filename ),
                            frame.f_lineno, frame.f_code.co_name )
def testit():
    print "Here in %s, file %s, line %s" % ( FUNC(), FILE(), LINE() )
    print "WHERE says '%s'" % WHERE()
testit()
Output:
$ python foo.py
Here in testit, file foo.py, line 17
WHERE says 'foo.py/18 testit()'
Use "back = 1" to find info regarding two levels back down the stack, etc.

I think the traceback module might have what you're looking for. In particular, the extract_stack function looks like it will do the job.
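One way this could look is sketched below, assuming Python 3.5 or later, where each stack entry is a `FrameSummary` with a `name` attribute (on Python 2 the entries are plain tuples instead). The helper name `funcname` is illustrative.

```python
import traceback

def funcname():
    # The last stack entry is funcname itself; the one before it is the caller.
    return traceback.extract_stack()[-2].name

def somefunc():
    return "My name is: %s" % funcname()

print(somefunc())
```

Note that `extract_stack` walks the whole call stack, so this is better suited to diagnostics than to code that runs in a tight loop.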

To elaborate on @mhawke's answer:
Rather than
def funcname():
    return inspect.stack()[1][3]
You can use
def funcname():
    frame = inspect.currentframe().f_back
    return inspect.getframeinfo(frame).function
Which, on my machine, is about 5x faster than the original version according to timeit.
