Expanding expression trees in sympy, keeping certain binomials together

I am manipulating multivariate polynomials in sympy that are all expressed as sums of products of terms of the form (x_i + y_j), where the i and j are indices, and I want to keep it that way, i.e. express everything in terms of sums of one x symbol and one y symbol.
For example, I want
(y_{1} + z_{2})*((0 + 1)*(y_{3} + z_{2}) + y_{1} + z_{1} + 0 + 0)
to become
(y_{1} + z_{2})*(y_{3} + z_{2}) + (y_{1} + z_{2})*(y_{1} + z_{1})

The first thing you can do is replace the binomials that fit the pattern with Dummy symbols and then expand. The problem is that you will have some dangling terms left to group together. If there are always two, then it's easy. If there are more, it will require more work and a good definition of which subscripted x should go with which y (or whatever letters you want paired up).
So let's start with your unevaluated expression, which we will call u.
Get the free symbols (assuming there are only x and y of interest):
>>> free = u.free_symbols
Replace existing binomials with a unique dummy variable
>>> reps = {}
>>> u.replace(lambda x:x.is_Add and len(x.args) == 2 and all(
... i in free for i in x.args),
... lambda x: reps.setdefault(x, Dummy()))
_Dummy_45*(_Dummy_47*(0 + 1) + y1 + z1)
Now expand
>>> expand(_)
_Dummy_45*_Dummy_47 + _Dummy_45*y1 + _Dummy_45*z1
and collect products of dummy symbols together
>>> _.replace(lambda x:x.is_Mul and len(x.args) == 2 and all(
... i in reps.values() for i in x.args),
... lambda x: reps.setdefault(x, Dummy()))
_Dummy_45*y1 + _Dummy_45*z1 + _Dummy_51
Collect on dummy symbols to get binomials to appear that were previously dangling
>>> collect(_, reps.values())
_Dummy_45*(y1 + z1) + _Dummy_51
Now replace Dummy symbols with their values (which are the keys in reps so we have to invert that dictionary):
>>> _.xreplace({v:k for k,v in reps.items()})
_Dummy_45*_Dummy_47 + (y1 + z1)*(y1 + z2)
Do it again
>>> _.xreplace({v:k for k,v in reps.items()})
(y1 + z1)*(y1 + z2) + (y1 + z2)*(y3 + z2)
Posting specific expressions that you would like to see re-arranged in some way would help to focus a more robust solution, but these techniques can get you started. Here, too, is a function that pairs up free symbols in an Add and replaces them with Dummy symbols.
from sympy import Add, Dummy, ordered
from sympy.core.traversal import bottom_up
from sympy.utilities.iterables import sift

def collect_pairs(e, X, Y):
    # symbols whose names start with the prefixes X and Y
    free = e.free_symbols
    xvars, yvars = [[i for i in free if i.name.startswith(j)] for j in (X, Y)]
    reps = {}
    def do(e):
        if not e.is_Add: return e
        x, cy = sift(e.args, lambda x: x in xvars, binary=True)
        y, c = sift(cy, lambda x: x in yvars, binary=True)
        # only pair up when every X symbol has a Y partner
        if len(x) != len(y): return e
        args = []
        for i,j in zip(ordered(x), ordered(y)):
            args.append(reps.setdefault(i+j, Dummy()))
        return Add(*(c + args))
    # hmmm...this destroys the reps and returns {}
    #return {v:k for k,v in reps.items()}, bottom_up(e, do)
    return reps, bottom_up(e, do)
>>> e1
(y1 + z2)*(y1 + y3 + z1 + z2)
>>> r, e = collect_pairs(e1,'y','z')
>>> expand(e).xreplace({v:k for k,v in r.items()})
(y1 + z1)*(y1 + z2) + (y1 + z2)*(y3 + z2)
This works with the fully expanded e1 if you factor it first:
>>> e2 = factor(expand(e1)); e2
(y1 + z2)*(y1 + y3 + z1 + z2)
>>> r, e = collect_pairs(e2, 'y', 'z')
>>> expand(e).xreplace({v:k for k,v in r.items()})
(y1 + z1)*(y1 + z2) + (y1 + z2)*(y3 + z2)
Looking at the code you originally posted, I would suggest keeping the binomials together and only replacing them at the end, like this:
...
def single_variable_diff(perm_dict,k,m):
    ret_dict = {}
    for perm,val in perm_dict.items():
        if len(perm)<k:
            ret_dict[perm] = Add(ret_dict.get(perm,0), reps.setdefault(U(var2[k],var3[m]), Dummy())*val,evaluate=False)
        else:
            ret_dict[perm] = Add(ret_dict.get(perm,0), reps.setdefault(U(var2[perm[k-1]],var3[m]), Dummy())*val,evaluate=False)
...
reps = {}
U = lambda x,y: UnevaluatedExpr(Add(*ordered((x,y))))
ireps = lambda: {v:k for k,v in reps.items()}
perms=[]
curperm = []
...
coeff_perms.sort(key=lambda x: (inv(x),*x))
def touch(e):
    from sympy.core.traversal import bottom_up
    def do(e):
        return e if not e.args else e.func(*e.args)
    return bottom_up(e, do)
undo = ireps()
for perm in coeff_perms:
    val = touch(coeff_dict[perm]).expand().xreplace(undo)
    print(f"{str(perm):>{width}} {str(val)}")
(3, 4, 1, 2) will be given in terms of binomial products, but some elements will not be: they are just sums of binomials. In order to keep them together, you can create them as UnevaluatedExpr, e.g. with the U lambda that is defined above. I am guessing you then don't have to use evaluate=False and will not need the touch function.
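To illustrate why wrapping a binomial in UnevaluatedExpr keeps it together, here is a minimal sketch. The symbols a, b, x and y are placeholders chosen for the example, not names from the original code.

```python
from sympy import UnevaluatedExpr, expand, symbols

a, b, x, y = symbols('a b x y')

# The wrapper is not an Add, so expand() distributes (a + b) over it
# but does not break up x + y.
pair = UnevaluatedExpr(x + y)
expanded = expand(pair*(a + b))

# doit() strips the wrapper again once the expansion is finished.
restored = expanded.doit()
print(restored)
```

Each term of `restored` still contains the factor (x + y) as a unit, which is the behaviour the U lambda relies on.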

Related

Sympy's sqf() and sqf_list() give different results once I use poly() or .as_poly()

If I use the sqf() function or the sqf_list() function on a nice example I created it gives me a nice result. For example this:
v = (x1 + 2) ** 2 * (x2 + 4) ** 5
sqf_list(v) = (1, [(x1 + 2, 2), (x2 + 4, 5)])
But if I first use v = poly(v), it only finds (1, [(Poly(x1 + 2, x1, x2, domain='ZZ'), 2)]).
Is this intended behaviour? I would assume it is a result of the transformation of
v = (x1 + 2) ^ 2 * (x2 + 4) ^ 5
into
v = Poly(x1^2*x2^5 + 20*x1^2*x2^4 + 160*x1^2*x2^3 + 640*x1^2*x2^2 + 1280*x1^2*x2 + 1024*x1^2 + 4*x1*x2^5 + 80*x1*x2^4 + 640*x1*x2^3 + 2560*x1*x2^2 + 5120*x1*x2 + 4096*x1 + 4*x2^5 + 80*x2^4 + 640*x2^3 + 2560*x2^2 + 5120*x2 + 4096, x1, x2, domain='ZZ')
Is there some way to get the real result out of the transformed v? Or, if that isn't possible, to at least see that it prints out a 'wrong'(?) result?
Until this is fixed, a light work-around is:
def mysqf(expr):
    s = sqf(expr)
    r = cancel(expr/s)
    if r == 1:
        return s
    return s*sqf(r)

def mysqf_list(expr):
    s = mysqf(expr)
    c, m = s.as_coeff_Mul()
    return tuple([c, [f.as_base_exp() for f in Mul.make_args(m)]])
Alternatively, you could make sure you are working with univariate expressions before passing them to sqf by doing a separation of variables:
def mvsqf(expr):
    # dict=True is needed to get a dictionary back (None if not separable)
    d = separatevars(expr, dict=True)
    assert d
    return Mul(*[sqf(v) for v in d.values()])
Writing a mvsqf_list is left as an exercise for the interested reader.
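As for noticing a 'wrong' result, one option is to multiply the returned factors back together and compare with the input. The helper below is a sketch; the name sqf_list_multiplies_back is made up for this example and is not part of SymPy.

```python
from sympy import Mul, Poly, expand, sqf_list, symbols

def sqf_list_multiplies_back(expr):
    """Return True if the factors reported by sqf_list reproduce expr."""
    coeff, factors = sqf_list(expr)
    # sqf_list returns Poly factors for Poly input, plain expressions otherwise.
    to_expr = lambda f: f.as_expr() if isinstance(f, Poly) else f
    rebuilt = coeff*Mul(*[to_expr(base)**exp for base, exp in factors])
    return expand(rebuilt - to_expr(expr)) == 0

x1, x2 = symbols('x1 x2')
v = (x1 + 2)**2*(x2 + 4)**5
print(sqf_list_multiplies_back(v))
```

A False return value means that some factors went missing from the list, as in the Poly case described in the question.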

Why doesn't SymPy simplify the expression?

I am just looking at the Python module SymPy and trying, as a simple (useless) example, to fit a function f(x) with a set of functions g_i(x) on a given interval.
import sympy as sym

def functionFit(f, funcset, interval):
    N = len(funcset) - 1
    A = sym.zeros(N+1, N+1)
    b = sym.zeros(N+1, 1)
    x = sym.Symbol('x')
    for i in range(N+1):
        for j in range(i, N+1):
            A[i,j] = sym.integrate(funcset[i]*funcset[j],
                                   (x, interval[0], interval[1]))
            A[j,i] = A[i,j]
        b[i,0] = sym.integrate(funcset[i]*f, (x, interval[0], interval[1]))
    c = A.LUsolve(b)
    u = 0
    for i in range(len(funcset)):
        u += c[i,0]*funcset[i]
    return u, c

x = sym.Symbol('x')
f = 10*sym.cos(x)+3*sym.sin(x)
fooset=(sym.sin(x), sym.cos(x))
interval = (1,2)
print("function to approximate:", f)
print("Basic functions:")
for foo in fooset:
    print(" - ", foo)
u,c = functionFit(f, fooset, interval)
print()
print("simplified u:")
print(sym.simplify(u))
print()
print("simplified c:")
print(sym.simplify(c))
functionFit returns the fit function u(x) together with the coefficients.
In my case
f(x) = 10 * sym.cos(x) + 3 * sym.sin(x)
and I want to fit it according to a linear combination of sin(x), cos(x).
So the coefficients should be 3 and 10.
The result is OK, but for u(x) I get
u(x) = (12*sin(2)**2*sin(4)*sin(x) + 3*sin(8)*sin(x) + 12*sin(2)*sin(x) + 40*sin(2)**2*sin(4)*cos(x) + 10*sin(8)*cos(x) + 40*sin(2)*cos(x))/(2*(sin(4) + 2*sin(2))) :
Function to approximate: 3*sin(x) + 10*cos(x)
Basic functions:
- sin(x)
- cos(x)
Simplified u: (12*sin(2)**2*sin(4)*sin(x) + 3*sin(8)*sin(x) + 12*sin(2)*sin(x) + 40*sin(2)**2*sin(4)*cos(x) + 10*sin(8)*cos(x) + 40*sin(2)*cos(x))/(2*(sin(4) + 2*sin(2)))
Simplified c: Matrix([[3], [10]])
which is indeed the same as 10 * cos(x) + 3 * sin(x).
However, I wonder why it is not simplified to that expression. I tried several of the simplifying functions available, but none of them gives the expected result.
Is there something wrong in my code, or are my expectations too high?
Don't know if this is a solution for you, but I'd simply use the .evalf method that every SymPy expression has:
In [26]: u.simplify()
Out[26]: (12*sin(2)**2*sin(4)*sin(x) + 3*sin(8)*sin(x) + 12*sin(2)*sin(x) + 40*sin(2)**2*sin(4)*cos(x) + 10*sin(8)*cos(x) + 40*sin(2)*cos(x))/(2*(sin(4) + 2*sin(2)))
In [27]: u.evalf()
Out[27]: 3.0*sin(x) + 10.0*cos(x)
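Note that evalf produces floating-point coefficients, so it confirms the result numerically rather than simplifying it symbolically. A small spot check along the same lines is sketched below; the expression for u is copied from the output above, while the helper name max_abs_difference and the sample points are arbitrary choices made for this example.

```python
from sympy import cos, sin, symbols

x = symbols('x')
f = 3*sin(x) + 10*cos(x)
# The unsimplified fit result, copied from the output shown above.
u = (12*sin(2)**2*sin(4)*sin(x) + 3*sin(8)*sin(x) + 12*sin(2)*sin(x)
     + 40*sin(2)**2*sin(4)*cos(x) + 10*sin(8)*cos(x)
     + 40*sin(2)*cos(x))/(2*(sin(4) + 2*sin(2)))

def max_abs_difference(lhs, rhs, points):
    # Largest numerical deviation between lhs and rhs over the sample points.
    return max(abs(float((lhs - rhs).subs(x, p).evalf())) for p in points)

print(max_abs_difference(u, f, [0.1, 0.7, 1.3, 2.9]))
```

A value close to zero at several points suggests that the two expressions agree, although it is not a proof.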

How to represent a variable with other variables given an equation set in SymPy?

I tried to eliminate r and z from the equation set and get the expression of S without r and z:
var('xi R R_bfs k S z r')
solve(r**2 - 2*R*z + (k + 1)*z**2, S*cos(xi)+z-R_bfs, S*sin(xi)-r, S, r, z)
This returns an empty list for S, but I am sure there is a solution for S. Is there any method or function to handle this problem?
When I run into problems like this I try to use the CAS to do the steps for me that lead to the solution that I want. With only 3 equations this is pretty straightforward.
We can solve the last 2 equations for r and z in terms of S
>>> eqs = r**2 - 2*R*z + (k + 1)*z**2, S*cos(xi)+z-R_bfs, S*sin(xi)-r
>>> solve(eqs[1:],(r,z))
{r: S*sin(xi), z: R_bfs - S*cos(xi)}
This solution can be substituted into the first equation
>>> e1 = eqs[0].subs(_)
This results in a polynomial in S, of degree = 2, that does not contain r or z
>>> degree(e1, S)
2
>>> e1.has(r, z)
False
And the solutions of a general quadratic (with a, b, c and x declared as symbols) are
>>> q = solve(a*x**2 + b*x + c, x); q
[(-b + sqrt(-4*a*c + b**2))/(2*a), -(b + sqrt(-4*a*c + b**2))/(2*a)]
So all we need are the values of a, b and c from e1 and we should have our solutions for S, free of r and z:
>>> A, B, C = Poly(e1, S).all_coeffs()
>>> solns = [i.subs({a: A, b: B, c: C}) for i in q]
Before we look at those, let's let cse remove common expressions
>>> reps, sols = cse(solns)
Here are the replacements that are identified
>>> for i in reps:
... print(i)
(x0, cos(xi))
(x1, x0**2)
(x2, k*x1 + x1 + sin(xi)**2)
(x3, 1/(2*x2))
(x4, 2*R)
(x5, x0*x4)
(x6, 2*R_bfs*x0)
(x7, k*x6)
(x8, x5 - x6 - x7)
(x9, R_bfs**2)
(x10, sqrt(-4*x2*(-R_bfs*x4 + k*x9 + x9) + x8**2))
And in terms of those, here are the solutions:
>>> for i in sols:
... print(i)
x3*(x10 - x5 + x6 + x7)
-x3*(x10 + x8)
If you prefer the non-cse form, you can look at that, too. Here is one solution:
>>> print(filldedent(solns[0]))
(-2*R*cos(xi) + 2*R_bfs*k*cos(xi) + 2*R_bfs*cos(xi) +
sqrt(-4*(-2*R*R_bfs + R_bfs**2*k + R_bfs**2)*(k*cos(xi)**2 +
sin(xi)**2 + cos(xi)**2) + (2*R*cos(xi) - 2*R_bfs*k*cos(xi) -
2*R_bfs*cos(xi))**2))/(2*(k*cos(xi)**2 + sin(xi)**2 + cos(xi)**2))
If your initial all-in-one-go solution fails, try to let SymPy be your Swiss Army Knife :-)
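The steps above can also be collected into one script with explicit imports. This is a sketch rather than the exact session shown: it applies the quadratic formula directly to the coefficients instead of substituting into a generic solution, and the numeric sample values at the end are arbitrary.

```python
from sympy import Poly, cos, sin, solve, sqrt, symbols

xi, R, R_bfs, k, S, z, r = symbols('xi R R_bfs k S z r')
eqs = (r**2 - 2*R*z + (k + 1)*z**2,
       S*cos(xi) + z - R_bfs,
       S*sin(xi) - r)

# Solve the two linear equations for r and z in terms of S.
rz = solve(eqs[1:], [r, z], dict=True)[0]
# Substitute into the first equation: a quadratic in S without r or z.
e1 = eqs[0].subs(rz)
A, B, C = Poly(e1, S).all_coeffs()

# Quadratic formula applied to the coefficients of e1.
disc = sqrt(B**2 - 4*A*C)
solutions = [(-B + disc)/(2*A), (-B - disc)/(2*A)]

# Numeric spot check with arbitrary sample values (assumed for illustration).
sample = {xi: 0.4, R: 2, R_bfs: 1, k: 0.5}
residuals = [abs(complex(e1.subs(S, s).subs(sample).evalf())) for s in solutions]
print(residuals)
```

Residuals near zero indicate that both candidate expressions for S satisfy the reduced equation at the sample point.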

Remove mixed-variable terms in SymPy series expansion

Consider two functions of SymPy symbols e and i:
from sympy import Symbol, expand, Order
i = Symbol('i')
e = Symbol('e')
f = (i**3 + i**2 + i + 1)
g = (e**3 + e**2 + e + 1)
z = expand(f*g)
This will produce
z = e**3*i**3 + e**3*i**2 + e**3*i + e**3 + e**2*i**3 + e**2*i**2 + e**2*i + e**2 + e*i**3 + e*i**2 + e*i + e + i**3 + i**2 + i + 1
However, assume that e and i are both small and we can neglect terms that are of order three or higher. Using SymPy's series tool or simply adding an O-notation Order class can handle this:
In : z = expand(f*g + Order(i**3) + Order(e**3))
Out: 1 + i + i**2 + e + e*i + e*i**2 + e**2 + e**2*i + e**2*i**2 + O(i**3) + O(e**3)
Looks great. However, I am still left with mixed terms e**2 * i**2. Individual variables in these terms are less than the desired cut-off so SymPy keeps them. However, mathematically small²·small² = small⁴. Likewise, e·i² = small·small² = small³.
At least for my purposes, I want these mixed terms dropped. Adding a mixed Order does not produce the desired result (it seems to ignore the first two orders).
In : expand(f*g + Order(i**3) + Order(e**3) + Order((i**2)*(e**2)))
Out: 1 + i + i**2 + i**3 + e + e*i + e*i**2 + e*i**3 + e**2 + e**2*i + e**3 + e**3*i + O(e**2*i**2, e, i)
Question: Does SymPy have an easy system to quickly remove the n-th order terms, as well as terms that are (e^a)·(i^b) where a+b > n?
Messy Solution: I have found a way to solve this, but it is messy and potentially not general.
z = expand(f*g + Order((e**2)*i) + Order(e*(i**2)))
zz = expand(z.removeO() + Order(e**3) + Order(i**3))
produces
zz = 1 + i + i**2 + e + e*i + e**2 + O(i**3) + O(e**3)
which is exactly what I want. So to specify my question: Is there a way to do this in one step that can be generalized to any n? Also, my solution loses the big-O notation that indicates mixed-terms were lost. This is not needed but would be nice.
As you have a dual limit, you must specify both infinitesimal variables (e and i) in all Order objects, even if they don’t appear in the first argument.
The reason for this is that Order(expr) only automatically chooses those symbols as infinitesimal that actually appear in the expr and thus, e.g., O(e) is only for the limit e→0.
Now, Order objects with different limits don’t mix well, e.g.:
O(e*i)+O(e) == O(e*i) != O(e)+O(e*i) == O(e) # True
This leads to a mess where results depend on the order of addition, which is a good indicator that this is something to avoid.
This can be avoided by explicitly specifying the infinitesimal symbols (as additional arguments of Order), e.g.:
O(e*i)+O(e,e,i) == O(e,e,i)+O(e*i) == O(e,e,i) # True
I haven’t found a way to avoid going through all combinations of e and i manually, but this can be done by a simple iteration:
orders = sum( Order(e**a*i**(n-a),e,i) for a in range(n+1) )
expand(f*g+orders)
# 1 + i + i**2 + e + e*i + e**2 + O(e**2*i, e, i) + O(e*i**2, e, i) + O(i**3, e, i) + O(e**3, e, i)
Without using Order you might try something simple like this:
>>> eq = expand(f*g) # as you defined
>>> def total_degree(e):
...     x = Dummy()
...     free = e.free_symbols
...     if not free: return S.Zero
...     for f in free:
...         e = e.subs(f, x)
...     return degree(e)
>>> eq.replace(lambda x: total_degree(x) > 2, lambda x: S.Zero)
e**2 + e*i + e + i**2 + i + 1
There is a way about it using Poly. I have made a function that keeps the O(...) term and another that does not (faster).
from sympy import Symbol, expand, Order, Poly
i = Symbol('i')
e = Symbol('e')
f = (i**3 + i**2 + i + 1)
g = (e**3 + e**2 + e + 1)
z = expand(f*g)
def neglect(expr, order=3):
    z = Poly(expr)
    # extract all terms and keep the lower order ones
    d = z.as_dict()
    d = {t: c for t,c in d.items() if sum(t) < order}
    # Build resulting polynomial
    return Poly(d, z.gens).as_expr()

def neglectO(expr, order=3):
    # This one keeps O terms
    z = Poly(expr)
    # extract terms of higher "order"
    d = z.as_dict()
    large = {t: c for t,c in d.items() if sum(t) >= order}
    for t in large: # Add each O(large monomial) to the expression
        expr += Order(Poly({t:1},z.gens).as_expr(), *z.gens)
    return expr
print(neglect(z))
print(neglectO(z))
This code prints the following:
e**2 + e*i + e + i**2 + i + 1
1 + i + i**2 + e + e*i + e**2 + O(e**2*i, e, i) + O(e*i**2, e, i) + O(i**3, e, i) + O(e**3, e, i)
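Another way to truncate by total degree for any n is to scale every small variable by a common bookkeeping parameter and keep only the low powers of that parameter. The sketch below uses a different technique from the answers above; the function name truncate_total_degree and the parameter t are introduced here purely for illustration.

```python
from sympy import Add, Dummy, Poly, expand, symbols

e, i = symbols('e i')
f = i**3 + i**2 + i + 1
g = e**3 + e**2 + e + 1

def truncate_total_degree(expr, variables, n):
    """Drop every monomial whose total degree in `variables` is >= n."""
    t = Dummy('t')  # bookkeeping parameter; cannot clash with user symbols
    # A monomial of total degree d in the variables picks up a factor t**d.
    scaled = expr.subs({v: t*v for v in variables}, simultaneous=True)
    by_power = Poly(scaled, t)
    # Keep only the coefficients of t**0 .. t**(n-1).
    return Add(*[coeff for (power,), coeff in by_power.terms() if power < n])

print(truncate_total_degree(expand(f*g), (e, i), 3))
```

This variant does not keep any O(...) terms; it only returns the truncated polynomial.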

How to do "inverse" substitution in sympy?

I need to find and substitute subexpression with a symbol, doing an "inverse" substitution of sorts.
Here is direct substitution example:
(simplify and collect added to make the resulting expression have the form that I need to work with)
In [1]: from sympy.abc import a, b, x, y, z
...: expr = (1 + b) * z + (1 + b) * y
...: z_expr = a / (1 + b) + x
...: subs_expr = expr.subs(z, z_expr).simplify().collect(1+b)
...: print(expr)
...: print(z_expr)
...: print(subs_expr)
y*(b + 1) + z*(b + 1)
a/(b + 1) + x
a + (b + 1)*(x + y)
Now I want to go back, and subs does not do anything:
In [2]: orig_expr = subs_expr.subs(z_expr, z)
...: print(orig_expr)
a + (b + 1)*(x + y)
How can I get back to y*(b + 1) + z*(b + 1)?
The substitution attempt fails because subs_expr does not actually contain z_expr in its expression tree. "Substitute for an expression that isn't there" is not really a well-defined goal. A well-defined goal would be "eliminate a using the relation z = z_expr". That can be done as follows:
var('orig_expr')
orig_expr = solve([orig_expr - subs_expr, z - z_expr], [orig_expr, a])[orig_expr]
Now orig_expr is equal to b*y + b*z + y + z
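If the grouped form is wanted, the same elimination can be done in two steps followed by factor. This is a sketch of an equivalent route, not the answer's exact code: it solves the relation z = z_expr for a first and then substitutes.

```python
from sympy import factor, simplify, solve, symbols

a, b, x, y, z = symbols('a b x y z')
z_expr = a/(1 + b) + x
subs_expr = a + (b + 1)*(x + y)

# Express a through z using the relation z = z_expr.
a_value = solve(z - z_expr, a)[0]
# Eliminate a, then factor to pull out the common (b + 1).
orig_expr = factor(subs_expr.subs(a, a_value))
print(orig_expr)
```

The printed result should be equivalent to y*(b + 1) + z*(b + 1), with the common factor pulled out.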
