Is there a way to create a python function from a string? For example, I have the following expression as a string:
dSdt = "-1* beta * s * i"
I've found a way to tokenize it:
>>> import re
>>> re.findall(r"(\b\w*[\.]?\w+\b|[\(\)\+\*\-\/])", dSdt)
['-', '1', '*', 'beta', '*', 's', '*', 'i']
And now I want to (somehow - and this is the part I don't know) convert it to something with the same behavior as:
def dSdt(beta, s, i):
    return -1*beta*s*i
I've thought about something like eval(dSdt), but I want it to be more general (the parameters beta, s and i would have to be known to exist ahead of time).
Some close requests have linked to this question for evaluating a mathematical expression in a string. This is not quite the same as this question, as I'm looking to define a function from that string.
One way is to use exec to define a new function from the string:
expr = "-1* beta * s * i"
name = "dSdt"
params = ["beta","s","i"] # Figure out how to build this array from expression
param_str = ",".join(params)
exec (f"def {name}({param_str}): return {expr}")
dSdt(1,2,3)
Out[]: -6
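The params list above can also be derived from the expression itself. Here is a minimal sketch, assuming the string is a valid Python expression and that every bare name in it is meant to be a parameter (so a call such as sqrt(x) would wrongly turn sqrt into a parameter too). The helper names variable_names and make_function are made up for this illustration:

```python
import ast

def variable_names(expr):
    """Return the names used in expr, in order of first appearance."""
    tree = ast.parse(expr, mode="eval")
    nodes = [n for n in ast.walk(tree) if isinstance(n, ast.Name)]
    # ast.walk does not guarantee source order, so sort by position.
    nodes.sort(key=lambda n: (n.lineno, n.col_offset))
    ordered = []
    for node in nodes:
        if node.id not in ordered:
            ordered.append(node.id)
    return ordered

def make_function(name, expr):
    """Build a function called `name` whose parameters are the names in expr."""
    params = variable_names(expr)
    namespace = {}
    # exec runs arbitrary code: only use this on expressions you trust.
    exec(f"def {name}({', '.join(params)}): return {expr}", namespace)
    return namespace[name], params

dSdt, params = make_function("dSdt", "-1* beta * s * i")
print(params)
print(dSdt(1, 2, 3))
```

Executing the definition in its own namespace dict keeps the generated function out of the caller's globals until you decide where to store it.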
If you don't care about defining a reusable function, you can also use eval and pass the variable values as its globals argument.
expr = "-1* beta * s * i"
param= {"beta":1, "s":2, "i":3} # Find way to build this.
eval(expr,param)
Out[]: -6
This is exactly what a compiler or interpreter does: translate from one language syntax to another. The main question here is when do you want to be able to execute the resulting function? Is it enough to write the function to a file to be used later by some other program? Or do you need to use it immediately by the parser in some way? For both situations, I would write a parser that creates an Abstract Syntax Tree from the tokens. This means you will need to make a more complex tokenizer that labels each token as an "operator", "number", or "variable". Usually this is done by writing a single regular expression for each type of token.
Then you can build a parser that consumes each token one at a time and builds an Abstract Syntax Tree that represents the expression. There is plenty of material online explaining how to do this, so I suggest some googling. You might also want to look for libraries that help with this.
Finally, you can traverse the AST and either write out the corresponding Python syntax to a file or evaluate the expression with some input for values of variables.
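As a rough illustration of the "traverse the AST and evaluate" step, the sketch below borrows Python's own parser (the ast module) instead of a hand-written tokenizer and parser, which only works because the example expression happens to be valid Python syntax. The function name evaluate and the set of supported operators are choices made for this example, not part of the answer above:

```python
import ast
import operator

# Whitelisted operators; anything else is rejected.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}
_UNARY = {ast.USub: operator.neg, ast.UAdd: operator.pos}

def evaluate(expr, variables):
    """Evaluate an arithmetic expression by walking its syntax tree."""
    def visit(node):
        if isinstance(node, ast.Expression):
            return visit(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.Name):
            return variables[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](visit(node.left), visit(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](visit(node.operand))
        raise ValueError("unsupported syntax: " + type(node).__name__)
    return visit(ast.parse(expr, mode="eval"))

print(evaluate("-1* beta * s * i", {"beta": 1, "s": 2, "i": 3}))
```

Because only whitelisted node types are visited, function calls and attribute access raise ValueError instead of being executed.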
You're talking about how you cannot know the arguments beforehand - that's where *args and **kwargs are very useful!
I like this idea of yours, and the tokenize function you made works pretty well.
I made a very general function for you that can handle any expression as long as you add the operators and functions you want to use inside the 'ignore' list. Then you simply need to add the variable values in the order that they appear in the expression.
import re
from math import sqrt
ignore = ["+", "-", "*", "/", "(", ")", "sqrt"]
def tokenize(expression):
    return re.findall(r"(\b\w*[\.]?\w+\b|[\(\)\+\*\-\/])", expression)
def calculate(expression, *args):
    seenArgs = {}
    newTokens = []
    tokens = tokenize(expression)
    for token in tokens:
        try:
            float(token)
        except ValueError:
            tokenIsFloat = False
        else:
            tokenIsFloat = True
        if token in ignore or tokenIsFloat:
            newTokens.append(token)
        else:
            if token not in seenArgs:
                seenArgs[token] = str(args[len(seenArgs)])
            newTokens.append(seenArgs[token])
    return eval("".join(newTokens))
print(calculate("-1* beta * s * i", 1, 2, 3))
print(calculate("5.5 * x * x", 3))
print(calculate("sqrt(x) * y", 9, 2))
Results in:
-6
49.5
6.0
In Python, by convention, the underscore (_) is often used to throw away parts of an unpacked tuple, like so
>>> tup = (1,2,3)
>>> meaningfulVariableName,_,_ = tup
>>> meaningfulVariableName
1
I'm trying to do the same for a tuple argument of a lambda. It seems unfair that it can only be done with 2-tuples...
>>> map(lambda (meaningfulVariableName,_): meaningfulVariableName*2, [(1,10), (2,20), (3,30)]) # This is fine
[2, 4, 6]
>>> map(lambda (meaningfulVariableName,_,_): meaningfulVariableName*2, [(1,10,100), (2,20,200), (3,30,300)]) # But I need this!
SyntaxError: duplicate argument '_' in function definition (<pyshell#24>, line 1)
Any ideas why, and what the best way to achieve this is?
As mentioned in the comments, just use starred arguments to throw any remaining arguments into "_":
lambda x, *_: x*2
If you were using these in a map call, note that map does not unpack each tuple into separate parameters. itertools.starmap does exactly that:
from itertools import starmap
result = starmap(lambda x, *_: x, [(0,1,2),])
But there is no equivalent to that on the key parameter to sort or sorted.
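To make the difference concrete, here is a small sketch for Python 3 with made-up sample data: starmap unpacks each tuple into the lambda's parameters, while a key function for sorted receives the whole tuple and has to index into it.

```python
from itertools import starmap

data = [(2, 20, 200), (1, 10, 100), (3, 30, 300)]

# starmap calls the lambda as f(2, 20, 200), f(1, 10, 100), ...
# so *_ swallows everything after the first element.
doubled = list(starmap(lambda x, *_: x * 2, data))

# A key function gets one argument (the whole tuple), so index it instead.
by_second = sorted(data, key=lambda t: t[1])

print(doubled)
print(by_second)
```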
If you won't be using arguments in the middle of the tuple,
just number those:
lambda x, _1, _2, _3, w: x*2 + w
If some linter tool complains about the parameters not being used: the purpose of a linter is to suggest more readable code. My personal preference is not to let that get in the way of practicality, and if this happens, I just turn off the linter for that line of code, without a second thought.
Otherwise, you will really have to do the "beautiful" thing; just use good sense about whether it is to please you and your team, or solely to please the linter. In this case, that means writing a full-fledged function and pretending to consume the unused arguments.
def my_otherwise_lambda(x, unused_1, unused_2, w):
    """My make linter-happy docstring here"""
    unused_1, unused_2 # Use the unused variables
    return 2 * x + w
Linter problems aside, if the purpose is to make the lambda's parameters readable, then a full-fledged function is the recommended approach anyway. lambda came really close to being stripped from the language in v. 3.0, in order to commit to readability.
And last, but not least, if the semantics of the values in your tuples are that meaningful, maybe you should consider using a class to hold them. That way you could just pass instances of that class to the lambda function and access the values by their respective names.
Namedtuple is one that would work well:
from collections import namedtuple
vector = namedtuple("vector", "x y z")
mydata = [(1,10,100), (2,20,200), (3,30,300)]
mydata = [vector(*v) for v in mydata]
sorted_data = sorted(mydata, key=lambda v: v.x * 2)
Tuples are immutable in Python so you won't be able to "throw away" (modify) the extraneous values.
Additionally, since you don't care about what those values are, there is absolutely no need to assign them to variables.
What I would do, is to simply index the tuple at the index you are interested in, like so:
>>> list(map(lambda x: x[0] * 2, [(1,10,100), (2,20,200), (3,30,300)]))
[2, 4, 6]
No need for *args or dummy variables.
You are often better off to use list comprehensions rather than lambdas:
some_list = [(1, 10, 100), (2, 20, 200), (3, 30, 300)]
processed_list = [2 * x for x, dummy1, dummy2 in some_list]
If you really insist, you could use _ instead of dummy1 and dummy2 here. However, I recommend against this, since I've frequently seen this causing confusion. People often think _ is some kind of special syntax (which it is e.g. in Haskell and Rust), while it is just some unusual variable name without any special properties. This confusion is completely avoidable by using names like dummy1. Moreover, _ clashes with the common gettext alias, and it also does have a special meaning in the interactive interpreter, so overall I prefer using dummy to avoid all the confusion.
I designed a toy language :
# comments
n = 2 #an integer
k = 3.14 # a float
f(x): # a function
    return (x-k)^n
g(z): #another function
    return max(z-2.9, 0.1) # max should be understood
h(x,y): # another function ; note that function can call functions
    return x*g(y)
pointable x # a formal variable
pointable y # another formal variable
gives h (x # 3.2, y # 1.9) # 6.1 # an instruction, the meaning will be clear later
gives f (x # 1.2) # 3.1 #another instruction
give # the final instruction
The purpose of this language is the following : the previous code, when parsed, is supposed to create the following list :
[[1, (f), (1.2), 3.1], [2, (f, h), (3.2, 1.9), 6.1]]
There are as many elements as there are give instructions.
I have no idea how to design a grammar for this language. I know that it has to be in EBNF form because I want to use plyplus, which I have already toyed with for parsing simple algebraic expressions, but there is a big step from simple algebraic expressions that are evaluated directly to what I intend to do. (Even if I know that formally it is the same.)
(By the way, how are functions treated once you have the abstract syntax tree of the above code?)
For information, at the end, from [[1, (f), (1.2), 3.1], [2, (f, h), (3.2, 1.9), 6.1]] I want to produce the following python function :
def F(X1,X2):
    # code defining f and g "as" defined above
    # ...
    # code using f and g
    S1 = exp(-3.1)*f(X1/1.2)
    S2 = exp(-6.1)*g(X1/3.2,X2/1.9)
    res = S1 + S2
    return res
and I would like to produce it as
ast = language_grammar.parse(code_above) #tree
F = ast.generatefunction()
I got the Python grammar in plyplus form and tried to reduce it (I would still like to have conditions and loops in my language) and adapt it, without success, as this is really not my area.
I also feel that I would need in the grammar an algebraic subgrammar using all classical math functions and combining them, but I cannot really give a formal definition to this idea.
So, my code is like this:
def func(s,x):
    return eval(s.replace('x', str(x)))
#Example:
>>> func('x**2 + 3*x',1)
4
The first argument of the function func must be a string because the function eval accepts only string or code objects. However, I'd like to use this function in a kind of calculator, where the user types for example 2 + sin(2*pi-0.15) + func(1.8*x-32,273) and gets the answer of the expression, and it's annoying always to have to write the quotes before in the expression inside func().
Is there a way to make Python understand that the s argument is always a string, even when it's not between quotes?
No, it is not possible. You can't intercept the Python interpreter before it parses and evaluates 1.8*x-32.
Using eval as a glorified calculator is a highly questionable idea. The user could pass in all kinds of malicious Python code. If you're going to do it, you should provide as minimal an environment as possible for the code to run in. Pass in your own globals dict containing only the variables the user is allowed to reference.
return eval(s, {'x': x})
Besides being safer, this is also a better way to substitute x into the expression.
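A minimal sketch of such a restricted environment might look like the following. The particular selection of math names in ALLOWED is an assumption made for this example, and emptying __builtins__ only raises the bar: it is not a real sandbox, so this should still not be exposed to untrusted users.

```python
import math

# Names the user is allowed to reference (an arbitrary selection for this sketch).
ALLOWED = {name: getattr(math, name)
           for name in ("sin", "cos", "tan", "sqrt", "log", "exp", "pi", "e")}

def func(s, x):
    """Evaluate the expression string s with x bound to the given value."""
    env = {"__builtins__": {}}   # hide open, __import__, etc.
    env.update(ALLOWED)
    env["x"] = x
    return eval(s, env)

print(func("x**2 + 3*x", 1))
print(func("2 + sin(2*pi - 0.15)", 0))
```

With this environment, an input such as __import__('os') fails with a NameError instead of importing the module.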
You could have it handle both cases:
def func(s, x=0):
    if isinstance(s, basestring):
        # x is in the scope, so you don't need to replace the string
        return eval(s)
    else:
        return s
And the output:
>>> from math import *
>>> func('2 + sin(2*pi-0.15) + func(1.8*x-32,273)')
-30.1494381324736
>>> func('x**2 + 3*x', 1)
4
Caution: eval can do more than just add numbers. I can type __import__('os').system('rm /your/homework.doc') and your calculator will delete your homework.
In a word: no, if I understand you.
In a few more, you can sort of get around the problem by making x be a special object. This is how the Python math library SymPy works. For example:
>>> from sympy import Symbol
>>> x = Symbol('x')
>>> x**2+3*x
x**2 + 3*x
>>> (x**2+3*x).subs(x,1)
4
There's even a handy function to turn strings into sympy objects:
>>> from sympy import sympify, pi
>>> sympify("x**2 - sin(x)")
x**2 - sin(x)
>>> _.subs(x, pi)
pi**2
All the warnings about untrusted user input hold. [I'm too lazy to check whether or not eval or exec is used on the sympify code path, and as they say, every weapon is loaded, even the unloaded ones.]
You can write an interpreter:
import code
def readfunc(prompt):
    raw = input(prompt)
    if raw.count(',')!=1:
        print('Bad expression: {}'.format(raw))
        return ''
    s, x = raw.split(',')
    return '''x={}; {}'''.format(x, s)
code.interact('Calc 0.1', readfunc)
I find that in lots of different projects I'm writing a lot of code where I need to evaluate a (moderately complex, possibly costly-to-evaluate) expression and then do something with it (e.g. use it for string formatting), but only if the expression is True/non-None.
For example in lots of places I end up doing something like the following:
result += '%s '%( <complexExpressionForGettingX> ) if <complexExpressionForGettingX> else ''
... which I guess is basically a special-case of the more general problem of wanting to return some function of an expression, but only if that expression is True, i.e.:
f( e() ) if e() else somedefault
but without re-typing the expression (or re-evaluating it, in case it's a costly function call).
Obviously the required logic can be achieved easily enough in various long-winded ways (e.g. by splitting the expression into multiple statements and assigning the expression to a temporary variable), but that's a bit grungy and since this seems like quite a generic problem, and since python is pretty cool (especially for functional stuff) I wondered if there's a nice, elegant, concise way to do it?
My current best options are either defining a short-lived lambda to take care of it (better than multiple statements, but a bit hard to read):
(lambda e: '%s ' % e if e else '')( <complexExpressionForGettingX> )
or writing my own utility function like:
def conditional(expr, formatStringIfTrue, default='')
... but since I'm doing this in lots of different code-bases I'd much rather use a built-in library function or some clever python syntax if such a thing exists
I like one-liners, definitely. But sometimes they are the wrong solution.
In professional software development, if the team size is > 2, you spend more time understanding code someone else wrote than writing new code. The one-liners presented here are definitely confusing, so just use two lines (even though you mentioned multiple statements in your post):
X = <complexExpressionForGettingX>
result += '%s '% X if X else ''
This is clear, concise, and everybody immediately understands what's going on here.
Python doesn't have expression scope (see "Is there a Python equivalent of the Haskell 'let'"), presumably because the abuses and confusion of the syntax outweigh the advantages.
If you absolutely have to use an expression scope, the least bad option is to abuse a generator expression:
result += next('%s '%(e) if e else '' for e in (<complexExpressionForGettingX>,))
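If you can rely on Python 3.8 or later, an assignment expression (the := operator from PEP 572) gives a similar effect without a generator. This is only a sketch: expensive_lookup and table are stand-ins invented here for the costly expression.

```python
def expensive_lookup(key, table):
    """Stand-in for a costly expression; returns None when key is missing."""
    return table.get(key)

table = {"name": "parrot"}
result = ""

# The condition is evaluated first, so x is already bound when '%s ' % x runs.
result += "%s " % x if (x := expensive_lookup("name", table)) else ""
result += "%s " % x if (x := expensive_lookup("missing", table)) else ""

print(repr(result))
```

Note that x stays bound in the enclosing scope afterwards, so this is a naming shortcut, not a true expression scope.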
You could define a conditional formatting function once, and use it repeatedly:
def cond_format(expr, form, alt):
    if expr:
        return form % expr
    else:
        return alt
Usage:
result += cond_format(<costly_expression>, '%s ', '')
After hearing the responses (thanks guys!) I'm now convinced there's no way to achieve what I want in Python without defining a new function (or lambda function) since that's the only way to introduce a new scope.
For best clarity I decided this needed to be implemented as a reusable function (not a lambda), so for the benefit of others I thought I'd share the function I finally came up with. It is flexible enough to cope with multiple additional format string arguments (in addition to the main argument used to decide whether to do the formatting at all), and it comes with doctests to show correctness and illustrate usage. If you're not sure how the **kwargs thing works, just ignore it: it's an implementation detail and was the only way I could see to implement an optional defaultValue= kwarg following the variable list of format string arguments.
def condFormat(formatIfTrue, expr, *otherFormatArgs, **kwargs):
    """ Helper for returning the result of string.format() on a
    specified expression if the expression's bool(expr) is True
    (i.e. it's not None, an empty list or an empty string or the number zero),
    or return a default string (typically '') if not.

    For more complicated cases where the operation on expr is more complicated
    than a format string, or where a different condition is required, use:
    (lambda e=myexpr: '' if not e else '%s ' % e)()

    formatIfTrue -- a format string suitable for use with string.format(), e.g.
    "{}, {}" or "{1}, {0:d}".
    expr -- the expression to evaluate. May be of any type.
    defaultValue -- set this keyword arg to override the default return value ('')

    >>> 'x' + condFormat(', {}.', 'foobar')
    'x, foobar.'
    >>> 'x' + condFormat(', {}.', [])
    'x'
    >>> condFormat('{}; {}', 123, 456, defaultValue=None)
    '123; 456'
    >>> condFormat('{0:,d}; {2:d}; {1:d}', 12345, 678, 9, defaultValue=None)
    '12,345; 9; 678'
    >>> condFormat('{}; {}; {}', 0, 678, 9, defaultValue=None) == None
    True
    """
    defaultValue = kwargs.pop('defaultValue','')
    assert not kwargs, 'unexpected kwargs: %s'%kwargs
    if not bool(expr): return defaultValue
    if otherFormatArgs:
        return formatIfTrue.format( *((expr,)+otherFormatArgs) )
    else:
        return formatIfTrue.format(expr)
Presumably, you want to do this repeatedly to build up a string. With a more global view, you might find that filter (or itertools.ifilter) does what you want to the collection of values.
You'll wind up with something like this:
' '.join(map(str, filter(None, <iterable of <complexExpressionForGettingX>>)))
Using None as the first argument for filter indicates to accept any true value. As a concrete example with a simple expression:
>>> ' '.join(map(str, filter(None, range(-3, 3))))
'-3 -2 -1 1 2'
Depending on how you're calculating the values, it may be that an equivalent list or generator comprehension would be more readable.
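For comparison, the filter/map pipeline above could be written as a generator expression along these lines (a small sketch for the same range example; the variable names are illustrative):

```python
values = range(-3, 3)

# "if v" drops falsy values (here the 0), just like filter(None, ...).
joined = ' '.join(str(v) for v in values if v)

print(repr(joined))
```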
I have a small Python script which I use every day. It reads a file and, for each line, applies different string functions like strip(), replace(), etc. I'm constantly editing it and commenting lines in and out to change the functions. Depending on the file I'm dealing with, I use different functions. For example, I have a file where, for each line, I need to use line.replace(' ','') and line.strip().
What's the best way to make all of these part of my script, so that I can assign a number to each function and just say: apply functions 1 and 4 to each line?
First of all, many functions in the string module, including strip and replace, are deprecated. The following answer uses string methods instead. (Instead of string.strip(" Hello "), I use the equivalent " Hello ".strip().)
Here's some code that will simplify the job for you. The following code assumes that whatever methods you call on your string, that method will return another string.
class O(object):
    c = str.capitalize
    r = str.replace
    s = str.strip
def process_line(line, *ops):
    i = iter(ops)
    while True:
        try:
            # Arguments come in (method, argument list) pairs.
            op = next(i)
            args = next(i)
        except StopIteration:
            break
        line = op(line, *args)
    return line
The O class exists so that your highly abbreviated method names don't pollute your namespace. When you want to add more string methods, you add them to O in the same format as those given.
The process_line function is where all the interesting things happen. First, here is a description of the argument format:
The first argument is the string to be processed.
The remaining arguments must be given in pairs.
The first argument of the pair is a string method. Use the shortened method names here.
The second argument of the pair is a list representing the arguments to that particular string method.
The process_line function returns the string that emerges after all these operations have been performed.
Here is some example code showing how you would use the above code in your own scripts. I've separated the arguments of process_line across multiple lines to show the grouping of the arguments. Of course, if you're just hacking away and using this code in day-to-day scripts, you can compress all the arguments onto one line; this actually makes it a little easier to read.
f = open("parrot_sketch.txt")
for line in f:
    p = process_line(
        line,
        O.r, ["He's resting...", "This is an ex-parrot!"],
        O.c, [],
        O.s, []
    )
    print(p)
Of course, if you very specifically wanted to use numerals, you could name your functions O.f1, O.f2, O.f3… but I'm assuming that wasn't the spirit of your question.
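As a variation on the same idea, the method/argument pairs could be passed as explicit tuples, which avoids the manual iterator handling. This is only a sketch: the steps list and the sample line are invented for illustration, and this process_line takes a different signature from the one above.

```python
def process_line(line, steps):
    """Apply each (method, args) pair to the line in order."""
    for method, args in steps:
        line = method(line, *args)
    return line

steps = [
    (str.replace, ("He's resting...", "This is an ex-parrot!")),
    (str.capitalize, ()),
    (str.strip, ()),
]

print(process_line("He's resting...\n", steps))
```

Keeping each method next to its arguments makes a malformed step fail at the unpacking line rather than silently shifting every later pair.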
If you insist on numbers, you can't do much better than a dict (as gimel suggests) or list of functions (with indices zero and up). With names, though, you don't necessarily need an auxiliary data structure (such as gimel's suggested dict), since you can simply use getattr to retrieve the method to call from the object itself or its type. E.g.:
def all_lines(somefile, methods):
    """Apply a sequence of methods to all lines of some file and yield the results.
    Args:
      somefile: an open file or other iterable yielding lines
      methods: a string that's a whitespace-separated sequence of method names.
        (note that the methods must be callable without arguments beyond the
         str to which they're being applied)
    """
    tobecalled = [getattr(str, name) for name in methods.split()]
    for line in somefile:
        for tocall in tobecalled: line = tocall(line)
        yield line
It is possible to map string operations to numbers:
>>> import string
>>> ops = {1:string.split, 2:string.replace}
>>> my = "a,b,c"
>>> ops[1](",", my)
[',']
>>> ops[1](my, ",")
['a', 'b', 'c']
>>> ops[2](my, ",", "-")
'a-b-c'
>>>
But maybe string descriptions of the operations will be more readable.
>>> ops2={"split":string.split, "replace":string.replace}
>>> ops2["split"](my, ",")
['a', 'b', 'c']
>>>
Note:
Instead of using the string module, you can use the str type for the same effect.
>>> ops={1:str.split, 2:str.replace}
To map names (or numbers) to different string operations, I'd do something like
OPERATIONS = dict(
    strip = str.strip,
    lower = str.lower,
    removespaces = lambda s: s.replace(' ', ''),
    maketitle = lambda s: s.title().center(80, '-'),
    # etc
)
def process(myfile, ops):
    for line in myfile:
        for op in ops:
            line = OPERATIONS[op](line)
        yield line
which you use like this
for line in process(afile, ['strip', 'removespaces']):
    ...
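If you would rather select operations by number, as in "apply functions 1 and 4", the same pattern works with integer keys. The numbering and the sample line below are arbitrary choices for this sketch:

```python
OPERATIONS = {
    1: str.strip,
    2: str.lower,
    3: str.title,
    4: lambda s: s.replace(' ', ''),
}

def process(lines, op_numbers):
    """Apply the numbered operations, in the given order, to every line."""
    for line in lines:
        for number in op_numbers:
            line = OPERATIONS[number](line)
        yield line

for line in process(["  Hello World \n"], [1, 4]):
    print(line)
```

The list of numbers could come from the command line (for example via sys.argv), so the script itself no longer needs editing for each file.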