from sympy import symbols, series
x = symbols('x')
aa = list(symbols('a0:2'))
q1 = series(aa[0]/(1-x) + aa[1]/(1-x**2), x, n=6)
q1.subs(aa[0], 1)
print(q1)
Output: x**2*(a0 + a1) + x**4*(a0 + a1) + a1 + a0 + a0*x + a0*x**3 + a0*x**5 + O(x**6)
But what I would like is for all the a0's in the series to be substituted by the value 1:
Output: x**2*(1 + a1) + x**4*(1 + a1) + a1 + 1 + 1*x + 1*x**3 + 1*x**5 + O(x**6)
My understanding is that:
q1.subs(aa[0],1)
would do exactly that. Is there any other way to do this? Thanks!
With the exception of mutable matrices, SymPy objects are immutable. Their methods do not modify them; a new object is returned instead. This object needs to be assigned to something (or printed, or returned):
q2 = q1.subs(...)
print(q1.subs(...))
return q1.subs(...)
all make sense; the lonely q1.subs(...) is useless.
This is covered in the "Gotchas and Pitfalls" article under Immutability of Expressions; I recommend reading the rest of that page too.
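As a minimal sketch of the difference, reusing the series from the question (the name q2 is just an illustrative choice):

```python
from sympy import series, symbols

x = symbols('x')
a0, a1 = symbols('a0:2')

q1 = series(a0/(1 - x) + a1/(1 - x**2), x, n=6)

q1.subs(a0, 1)       # builds a new expression, which is immediately discarded
q2 = q1.subs(a0, 1)  # keeps the new expression under a name

print(q1.has(a0))  # q1 itself is untouched, so it still contains a0
print(q2.has(a0))  # q2 is the substituted copy
```

Rebinding the same name (q1 = q1.subs(a0, 1)) works just as well if the original expression is no longer needed.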
I have a dataframe containing the following columns:
y as the dependent variable
A, B, C, D, E, F as the independent variables.
I want to make a regression using the statsmodels module and I don't want to express the formula argument as follows:
formula = 'y ~ A + B + C + D + E + F'
R's glm library offers a shortcut for this: formula = y ~ .
I was wondering whether statsmodels has a similar shortcut.
P.S.: the actual dataframe that I'm working with has 27 variables.
There is no shortcut like "." in patsy's formula handling, which is what statsmodels uses.
However, Python string manipulation is simple.
Here is an example that I'm currently using.
DATA is my dataframe, docvis is the outcome variable, and I have a constant column that is not needed in the formula.
formula = "docvis ~ " + " + ".join([i for i in DATA if i not in ["docvis", "const"]])
formula
'docvis ~ offer + ssiratio + age + educyr + physician + nonphysician + medicaid + private + female + phylim + actlim + income + totchr + insured + age2 + linc + bh + ldocvis + ldocvisa + docbin + aget + aget2 + incomet'
It would be more explicit to use the column names directly via DATA.columns.
In modern Python we don't need to build an intermediate list with a list comprehension; a generator expression is enough, so we can use
formula = "docvis ~ " + " + ".join(i for i in DATA.columns if i not in ["docvis", "const"])
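The same idea can be wrapped in a small helper. This is only a sketch: build_formula and the column names below are made up for illustration, and a plain list stands in for DATA.columns so that the snippet runs without pandas.

```python
def build_formula(columns, outcome, exclude=()):
    """Return a formula string of the form 'outcome ~ c1 + c2 + ...'."""
    skip = {outcome, *exclude}
    predictors = [str(c) for c in columns if c not in skip]
    return f"{outcome} ~ " + " + ".join(predictors)

# Hypothetical column names; with a real dataframe, pass DATA.columns instead.
columns = ["y", "A", "B", "C", "const"]
formula = build_formula(columns, outcome="y", exclude=["const"])
print(formula)  # y ~ A + B + C
```

Column names that are not valid Python identifiers would need patsy's Q("...") quoting, which this sketch does not handle.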
How can I make all low values in a SymPy expression zero? For example, my result is:
1.0*a1*cos(q1) - 6.12e-17*(a2*sin(q2) + a3*sin(q2 + q3) + a4*sin(q2 + q3 + q4))*sin(q1) + 1.0*(a2*cos(q2) + a3*cos(q2 + q3) + a4*cos(q2 + q3 + q4))*cos(q1)
and I want to change the second term (starting with 6.12e-17) to zero.
A direct way to do this is to replace such numbers with 0. A naive eq.subs(small, 0) will fail because the value of small that you type in is unlikely to be exactly the same as the number stored in the expression. But eq.atoms(Float) will give you the set of such numbers:
>>> eq.xreplace(dict([(n,0) for n in eq.atoms(Float) if abs(n) < 1e-12]))
1.0*a1*cos(q1) + (1.0*a2*cos(q2) + 1.0*a3*cos(q2 + q3) + 1.0*a4*cos(q2 + q3 + q4))*cos(q1)
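For reference, here is a self-contained sketch of that approach on a shortened version of the expression. The helper name chop and the 1e-12 threshold are arbitrary choices for illustration, not part of SymPy.

```python
from sympy import Float, S, cos, sin, symbols

a1, a2, q1, q2 = symbols("a1 a2 q1 q2")
eq = 1.0*a1*cos(q1) - 6.12e-17*a2*sin(q2)*sin(q1) + 1.0*a2*cos(q2)*cos(q1)

def chop(expr, threshold=1e-12):
    """Replace every Float atom whose magnitude is below threshold by exact zero."""
    small = {n: S.Zero for n in expr.atoms(Float) if abs(n) < threshold}
    return expr.xreplace(small)

cleaned = chop(eq)
print(cleaned)  # the term multiplied by 6.12e-17 is gone
```

Because xreplace rebuilds the expression, any product that contains the replaced number collapses to zero and drops out of the sum.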
There are probably much more efficient ways (I am not familiar with that library), but I tried to do it with a regex. If e- appears in a part of the equation, that part is replaced with 0 (you can remove it entirely if you prefer). To be able to do this, I had to remove the spaces around the +/- operators inside the parentheses, so that I could build a list by splitting at the remaining +/- operators.
import re
result='''1.0*a1*cos(q1) - 6.12e-17*(a2*sin(q2) + a3*sin(q2+q3)
+ a4*sin(q2+q3+q4))sin(q1) + 1.0(a2*cos(q2)
+ a3*cos(q2+q3) + a4*cos(q2+q3+q4))*cos(q1)'''
too_small='e-'
mylist=re.split(r"\s+", result)
for i in range(len(mylist)):
    if too_small in mylist[i]:
        mylist[i]='0'
new_result=''.join(mylist)
print(new_result)
And this is the output:
1.0*a1*cos(q1)-0+a3*sin(q2+q3)+a4*sin(q2+q3+q4))sin(q1)+1.0(a2*cos(q2)+a3*cos(q2+q3)+a4*cos(q2+q3+q4))*cos(q1)
As I said, there are probably much better ways than this.
Could you give more details? I guess you want to replace part of a string that holds a symbolic calculation. Regular expressions in Python could be helpful; you can write something like this:
In [1]: import re
In [2]: s = '1.0*a1*cos(q1) - 6.12e-17*(a2*sin(q2) + a3*sin(q2 + q3) + ' \
...: 'a4*sin(q2 + q3 + q4))sin(q1) + 1.0(a2*cos(q2) + ' \
...: 'a3*cos(q2 + q3) + a4*cos(q2 + q3 + q4))*cos(q1)'
In [3]: s = re.sub(r'[+-/*/]\s\S*e-[1-9]\d+\S*\s', '', s)
In [4]: s
Out[4]: '1.0*a1*cos(q1) + a3*sin(q2 + q3) + a4*sin(q2 + q3 + q4))sin(q1) + 1.0(a2*cos(q2) + a3*cos(q2 + q3) + a4*cos(q2 + q3 + q4))*cos(q1)'
The first argument of re.sub() is the pattern that decides what gets removed. Here e-[1-9]\d+ matches a number whose exponent is e-10 or smaller; you can modify it to suit your threshold. I hope it helps.
SymPy’s nsimplify function with the rational=True argument converts floats within an expression to rational numbers (within a given tolerance). Something like 6.12e-17 will be converted to 0 if below the threshold. So, in your case:
from sympy import sin, cos, symbols, nsimplify
a1, a2, a3, a4 = symbols("a1, a2, a3, a4")
q1, q2, q3, q4 = symbols("q1, q2, q3, q4")
expr = (
1.0*a1*cos(q1)
- 6.12e-17*(a2*sin(q2) + a3*sin(q2 + q3) + a4*sin(q2 + q3 + q4))*sin(q1)
+ 1.0*(a2*cos(q2) + a3*cos(q2 + q3) + a4*cos(q2 + q3 + q4))*cos(q1)
)
nsimplify(expr,tolerance=1e-10,rational=True)
# a1*cos(q1) + (a2*cos(q2) + a3*cos(q2 + q3) + a4*cos(q2 + q3 + q4))*cos(q1)
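One thing to keep in mind is that this converts every float in the expression, not only the tiny ones. A small sketch with made-up symbols x and y shows the effect:

```python
from sympy import Float, nsimplify, symbols

x, y = symbols("x y")
expr = 0.5*x + 6.12e-17*y

result = nsimplify(expr, tolerance=1e-10, rational=True)
print(result)

# No floating-point coefficients are left: the tiny one became 0
# and 0.5 became the exact rational 1/2.
print(result.atoms(Float))
```

If the remaining coefficients should stay as floats, the xreplace approach from the other answer leaves them untouched.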
I am trying to factor a polynomial of booleans to get the minimal form of a logic net. My variables are a1, a2, a3 ... and the negative counterparts na1, na2, na3 ...
I would expect a function
f = a1*a2*b2*nb1 + a1*b1*na2*nb2 + a1*b1*na2 + a2*b2*na1*nb1
to be factored like this (at least) :
f = a1*b1*(b2*nb1 + na2*(nb2 + 1)) + a2*b2*na1*nb1
I run this script:
import sympy
a1,a2,b1,b2,b3,na1,na2,na3,nb1,nb2,nb3 = \
sympy.symbols("a1:3, b1:4, na1:4, nb1:4", bool=True)
f = "a1*na2*b1 + a1*a2*nb1*b2 + a1*na2*b1*nb2 + na1*a2*nb1*b2"
sympy.init_printing(use_unicode=True)
sympy.factor(f)
and this returns me the same function, not factored.
a1*a2*b2*nb1 + a1*b1*na2*nb2 + a1*b1*na2 + a2*b2*na1*nb1
What am I doing wrong?
Your expected output
f = a1*b1*(b2*nb1 + na2*(nb2 + 1)) + a2*b2*na1*nb1
is not a factorization of f, so factor is not going to produce it. To factor something means to write it as a product, not "a product plus some other stuff".
If you give a polynomial that can actually be factored, say f = a1*na2*b1 + a1*a2*nb1*b2 + a1*na2*b1*nb2, then factor(f) has an effect.
What you are looking for is closer to collecting the terms with the same variable, which is done with collect.
f = a1*na2*b1 + a1*a2*nb1*b2 + a1*na2*b1*nb2 + na1*a2*nb1*b2
collect(f, a1)
outputs
a1*(a2*b2*nb1 + b1*na2*nb2 + b1*na2) + a2*b2*na1*nb1
The method coeff also works in that direction, e.g., f.coeff(a1) returns the contents of the parentheses in the previous formula.
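Putting the three calls side by side, as a runnable sketch (plain symbols are used here, without the bool=True assumption from the question):

```python
from sympy import collect, expand, factor, symbols

a1, a2, b1, b2, na1, na2, nb1, nb2 = symbols("a1 a2 b1 b2 na1 na2 nb1 nb2")

f = a1*na2*b1 + a1*a2*nb1*b2 + a1*na2*b1*nb2 + na1*a2*nb1*b2

# f cannot be written as a product, so factor leaves it as a sum.
print(factor(f))

# collect groups the terms that contain a1.
collected = collect(f, a1)
print(collected)

# coeff returns what multiplies a1, i.e. the contents of the parentheses.
inner = f.coeff(a1)
print(inner)

# Dropping the last term gives something with a genuine common factor a1.
g = a1*na2*b1 + a1*a2*nb1*b2 + a1*na2*b1*nb2
factored = factor(g)
print(factored)
```

collect also accepts a list of symbols, which may get closer to a nested form when several variables are of interest.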
I have some matrices of decent size (2000*2000) and I wish to have symbolic expressions in the elements of the matrices - i.e. .9**b + .8**b + .7**b ... is an example of an element. The matrices are quite sparse.
I am creating these matrices by adding up intermediate calculations. I would like to store them to disk to be read in later and evaluated with different values of b.
I have played around with SymPy and it does exactly what I need, but it is mind-numbingly slow at simple additions. From what I have read, Theano or TensorFlow might be able to do this with tensors, but I could not figure out how to put a symbol in a tensor.
Can anyone point me in the right direction as to the best tool to use for this task? I'd prefer it to be in python but if something outside python would do the job that'd be nice too.
The problem likely comes from the fact that you are taking a symbolic power: for whatever reason, SymPy tries to find an explicit form for a symbolic power. For example:
In [12]: x = Symbol('x')
In [13]: print(Matrix([[1, 2], [3, 4]])**x)
Matrix([[-2*(5/2 + sqrt(33)/2)**x*(-2/((-3/2 + sqrt(33)/2)*(-1/2 + sqrt(33)/6)**2*(sqrt(33)/4 + 11/4)) + 1/(-1/2 + sqrt(33)/6))/(-sqrt(33)/2 - 3/2) + 2*(-sqrt(33)/2 + 5/2)**x/((-3/2 + sqrt(33)/2)*(-1/2 + sqrt(33)/6)*(sqrt(33)/4 + 11/4)), -4*(5/2 + sqrt(33)/2)**x/((-3/2 + sqrt(33)/2)*(-1/2 + sqrt(33)/6)*(-sqrt(33)/2 - 3/2)*(sqrt(33)/4 + 11/4)) - 2*(-sqrt(33)/2 + 5/2)**x/((-3/2 + sqrt(33)/2)*(sqrt(33)/4 + 11/4))], [(5/2 + sqrt(33)/2)**x*(-2/((-3/2 + sqrt(33)/2)*(-1/2 + sqrt(33)/6)**2*(sqrt(33)/4 + 11/4)) + 1/(-1/2 + sqrt(33)/6)) - (-sqrt(33)/2 + 5/2)**x/((-1/2 + sqrt(33)/6)*(sqrt(33)/4 + 11/4)), 2*(5/2 + sqrt(33)/2)**x/((-3/2 + sqrt(33)/2)*(-1/2 + sqrt(33)/6)*(sqrt(33)/4 + 11/4)) + (-sqrt(33)/2 + 5/2)**x/(sqrt(33)/4 + 11/4)]])
Is this actually what you want to do? Do you know the value of b ahead of time? You can leave the expression unevaluated as a power by using MatPow(arr, b).
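A minimal sketch of the MatPow route, assuming a reasonably recent SymPy in which MatPow can be imported from the top-level namespace (arr and b are just the names used above):

```python
from sympy import MatPow, Matrix, Symbol

b = Symbol('b')
arr = Matrix([[1, 2], [3, 4]])

# The power is stored symbolically; no closed form is computed here.
power = MatPow(arr, b)
print(power.args[1])  # the exponent is still the symbol b

# Once a concrete exponent is known, substitute it and evaluate explicitly.
value = power.subs(b, 2).doit()
print(value)
```

The expensive step is then postponed until b has a numeric value, at which point it is an ordinary matrix power.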
I have an equation like:
     R₂⋅V₁ + R₃⋅V₁ - R₃⋅V₂
i₁ = ─────────────────────
     R₁⋅R₂ + R₁⋅R₃ + R₂⋅R₃
defined and I'd like to split it into terms that each include only a single variable, in this case V1 and V2.
So as a result I'd expect
                 -R₃                     (R₂ + R₃)
i₁ = V₂⋅───────────────────── + V₁⋅─────────────────────
        R₁⋅R₂ + R₁⋅R₃ + R₂⋅R₃      R₁⋅R₂ + R₁⋅R₃ + R₂⋅R₃
But the best I could get so far is
     -R₃⋅V₂ + V₁⋅(R₂ + R₃)
i₁ = ─────────────────────
     R₁⋅R₂ + R₁⋅R₃ + R₂⋅R₃
using equation.factor(V1,V2). Is there some other option to factor or another method to separate the variables even further?
If it were possible to exclude something from the factor algorithm (the denominator in this case), this would be easy. I don't know a way to do this, so here is a manual solution:
In [1]: a
Out[1]:
r₁⋅v₁ + r₂⋅v₂ + r₃⋅v₂
─────────────────────
r₁⋅r₂ + r₁⋅r₃ + r₂⋅r₃
In [2]: b,c = factor(a,v2).as_numer_denom()
In [3]: b.args[0]/c + b.args[1]/c
Out[3]:
        r₁⋅v₁                v₂⋅(r₂ + r₃)
───────────────────── + ─────────────────────
r₁⋅r₂ + r₁⋅r₃ + r₂⋅r₃   r₁⋅r₂ + r₁⋅r₃ + r₂⋅r₃
You may also look at the evaluate=False options in Add and Mul, to build those expressions manually. I don't know of a nice general solution.
In[3] can be a list comprehension if you have many terms.
You may also check if it is possible to treat this as multivariate polynomial in v1 and v2. It may give a better solution.
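Applied to the expression from the question, the manual approach could look like the sketch below. It uses collect on the numerator and Add.make_args instead of indexing .args directly; both are my own choices for illustration rather than the only way to do it.

```python
from sympy import Add, cancel, collect, symbols

r1, r2, r3, v1, v2 = symbols("r1 r2 r3 v1 v2")
i1 = (r2*v1 + r3*v1 - r3*v2)/(r1*r2 + r1*r3 + r2*r3)

# Separate numerator and denominator, then group the numerator by v1 and v2.
numer, denom = i1.as_numer_denom()
grouped = collect(numer.expand(), (v1, v2))

# Divide each grouped term by the denominator separately (the list comprehension
# plays the role of In[3] above).
split = Add(*[term/denom for term in Add.make_args(grouped)])
print(split)
```

SymPy does not recombine the two fractions automatically, so the result stays as one term per voltage.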
Here I have sympy 0.7.2 installed and the sympy.collect() works for this purpose:
import sympy
r1, r2, r3, v1, v2 = sympy.symbols('r1 r2 r3 v1 v2')
i1 = (r2*v1 + r3*v1 - r3*v2)/(r1*r2 + r1*r3 + r2*r3)
sympy.pretty_print(sympy.collect(i1, (v1, v2)))
# -r3*v2 + v1*(r2 + r3)
# ---------------------
# r1*r2 + r1*r3 + r2*r3