I'm playing around with Julia and I'm using SymPy, which I think uses PyCall to call Python.
When I run the script below, I get a long error. It's too long to post in full, but here is the start of it:
LoadError: PyError (ccall(@pysym(:PyObject_Call), PyPtr, (PyPtr, PyPtr,
PyPtr), o, arg, C_NULL)) <type 'exceptions.RuntimeError'>
RuntimeError('maximum recursion depth exceeded while calling a Python object',)
File "d:\Users\OEM\AppData\Local\JuliaPro-0.6.0.1\pkgs-0.6.0.1\v0.6\Conda\deps\usr\lib\site-packages\sympy\core\cache.py", line 93, in wrapper
retval = cfunc(*args, **kwargs)
File "d:\Users\OEM\AppData\Local\JuliaPro-0.6.0.1\pkgs-0.6.0.1\v0.6\Conda\deps\usr\lib\site-packages\sympy\core\compatibility.py", line 809, in wrapper
result = user_function(*args, **kwds)
File "d:\Users\OEM\AppData\Local\JuliaPro-0.6.0.1\pkgs-0.6.0.1\v0.6\Conda\deps\usr\lib\site-packages\sympy\core\function.py", line 427, in __new__
result = super(Function, cls).__new__(cls, *args, **options)
File "d:\Users\OEM\AppData\Local\JuliaPro-0.6.0.1\pkgs-0.6.0.1\v0.6\Conda\deps\usr\lib\site-packages\sympy\core\cache.py", line 93, in wrapper
retval = cfunc(*args, **kwargs)
File "d:\Users\OEM\AppData\Local\JuliaPro-0.6.0.1\pkgs-0.6.0.1\v0.6\Conda\deps\usr\lib\site-packages\sympy\core\compatibility.py", line 809, in wrapper
result = user_function(*args, **kwds)
File "d:\Users\OEM\AppData\Local\JuliaPro-0.6.0.1\pkgs-0.6.0.1\v0.6\Conda\deps\usr\lib\site-packages\sympy\core\function.py", line 250, in __new__
evaluated = cls.eval(*args)
File "d:\Users\OEM\AppData\Local\JuliaPro-0.6.0.1\pkgs-0.6.0.1\v0.6\Conda\deps\usr\lib\site-packages\sympy\functions\elementary\integers.py", line 25, in eval
if arg.is_imaginary or (S.ImaginaryUnit*arg).is_real:
File "d:\Users\OEM\AppData\Local\JuliaPro-0.6.0.1\pkgs-0.6.0.1\v0.6\Conda\deps\usr\lib\site-packages\sympy\core\decorators.py", line 91, in __sympifyit_wrapper
return func(a, b)
File "d:\Users\OEM\AppData\Local\JuliaPro-0.6.0.1\pkgs-0.6.0.1\v0.6\Conda\deps\usr\lib\site-packages\sympy\core\decorators.py", line 132, in binary_op_wrapper
return func(self, other)
File "d:\Users\OEM\AppData\Local\JuliaPro-0.6.0.1\pkgs-0.6.0.1\v0.6\Conda\deps\usr\lib\site-packages\sympy\core\expr.py", line 140, in __mul__
return Mul(self, other)
File "d:\Users\OEM\AppData\Local\JuliaPro-0.6.0.1\pkgs-0.6.0.1\v0.6\Conda\deps\usr\lib\site-packages\sympy\core\cache.py", line 93, in wrapper
retval = cfunc(*args, **kwargs)
File "d:\Users\OEM\AppData\Local\JuliaPro-0.6.0.1\pkgs-0.6.0.1\v0.6\Conda\deps\usr\lib\site-packages\sympy\core\compatibility.py", line 809, in wrapper
result = user_function(*args, **kwds)
And as you may be able to see, towards the end it repeats: line 93 at the end, then line 140, then line 93 again...
Here is my code:
function oddPeriodSquareRoots()
    #=
    Get the length of the continued fraction for the square root of the number i.
    E.g. √7=[2;(1,1,1,4)]
    =#
    irrationalNumber, intPart, fractionalPart = symbols(string("irrationalNumber intPart fractionalPart"))
    for i in [6451]
        # For perfect squares, the period is 0
        irrationalNumber = BigFloat(sqrt(BigFloat(i)))
        if irrationalNumber == floor(irrationalNumber)
            continue
        end
        # Get the continued fraction using symbolic programming
        irrationalNumber = sqrt(Sym(i))
        continuedFractionLength = 0
        while true
            intPart = Sym(BigInt(floor(irrationalNumber)))
            if continuedFractionLength == 0
                firstContinuedFractionTimes2 = intPart*2
            end
            continuedFractionLength += 1
            if intPart == firstContinuedFractionTimes2
                break
            end
            fractionalPart = irrationalNumber - intPart
            irrationalNumber = 1 / fractionalPart
        end
        continuedFractionLength -= 1 # We ignore the first term.
    end
    return continuedFractionLength
end
This routine calculates the length of the continued fraction for the square root of a given number. For the number 6451 it gives the error.
So my question is: can this be resolved?
I'm glad the recursion-limit solution was found; this hadn't been seen before. This answer is about how to streamline your SymPy code, since you seem to be confused about that. Basically, you just need to make your initial value symbolic, and then Julia's methods should (in almost all cases) take care of the rest. Here is a slight rewrite:
using SymPy
using PyCall
@pyimport sys
sys.setrecursionlimit(10000)
"""
Get the length of the continued fraction for square root of for the number i.
E.g. √7=[2;(1,1,1,4)]
"""
function oddPeriodSquareRoots(n)
i = Sym(n)
# For perfect squares, the period is 0
continuedFractionLength = 0
irrationalNumber = sqrt(i)
if is_integer(irrationalNumber)
return continuedFractionLength
end
# Get the continued fraction using symbolic programming
while true
intPart = floor(irrationalNumber)
if continuedFractionLength == 0
firstContinuedFractionTimes2 = intPart*2
end
continuedFractionLength += 1
if intPart == firstContinuedFractionTimes2
break
end
fractionalPart = irrationalNumber - intPart
irrationalNumber = 1 / fractionalPart
end
continuedFractionLength -= 1 # We ignore the first term.
return continuedFractionLength
end
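For example, oddPeriodSquareRoots(7) should return 4, matching the period of √7 = [2;(1,1,1,4)] given in the docstring.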
Thanks very much for everyone's input. I managed to solve this by putting these lines at the top of the file (in addition to the using SymPy line, which I had the whole time):
using SymPy
using PyCall
@pyimport sys
sys.setrecursionlimit(10000)
This sets the recursion limit in Python. I'm not sure why it has to be so large for this to work.
I did also remove some of my type conversions etc. I thought this might help with the error and/or speed, but it didn't really.
Also, removing the line where I declare the variables to be symbols doesn't stop the code from working:
irrationalNumber, intPart, fractionalPart = symbols(string("irrationalNumber intPart fractionalPart"))
Same in Python. So I'm not sure what the point of it is.
But in Julia, either way I have to have the Sym() wrapper in these two lines:
irrationalNumber = sqrt(Sym(i))
...
intPart = Sym(floor(irrationalNumber))
By inspecting these types with typeof, I can see they are symbolic, not floats. Without the wrappers, everything turns into floats, so I'm not doing it symbolically.
Related
At my site, we're having an issue with MetPy raising a units error when we try to call surface_based_cape_cin. I am seeing the following error:
Traceback (most recent call last):
File "Advanced_Sounding_3Dnetcdf2.py", line 202, in <module>
sbcape, sbcin = mpcalc.surface_based_cape_cin(p1, T1, Td1)
File "/gpfs/group/kal6112/default/sw/anaconda3/lib/python3.6/site-packages/metpy/xarray.py", line 677, in wrapper
return func(*args, **kwargs)
File "/gpfs/group/kal6112/default/sw/anaconda3/lib/python3.6/site-packages/metpy/units.py", line 320, in wrapper
return func(*args, **kwargs)
File "/gpfs/group/kal6112/default/sw/anaconda3/lib/python3.6/site-packages/metpy/calc/thermo.py", line 1851, in surface_based_cape_cin
return cape_cin(p, t, td, profile)
File "/gpfs/group/kal6112/default/sw/anaconda3/lib/python3.6/site-packages/metpy/xarray.py", line 677, in wrapper
return func(*args, **kwargs)
File "/gpfs/group/kal6112/default/sw/anaconda3/lib/python3.6/site-packages/metpy/units.py", line 319, in wrapper
raise ValueError(msg)
ValueError: `cape_cin` given arguments with incorrect units: `temperature` requires "[temperature]" but given "none", `dewpt` requires "[temperature]" but given "none".
When I check the incoming values p1, T1, and Td1, they all have the correct units (hectopascal, degree_Celsius).
Just to be sure I added the following and checked the results prior to the call to surface_based_cape_cin:
p1 = units.hPa * phPa
T1 = units.degC * TdegC
Td1 = units.degC * TddegC
I'm running the following version of MetPy
# Name Version Build Channel
metpy 0.12.2 py_0 conda-forge
I don't recall seeing this prior to updating to this version, but I can't be certain whether the problem arose after the update.
Thanks for any help you can provide.
This is definitely a bug in MetPy, likely due to more challenges with masked arrays and preserving units. I've opened a new issue. In the meantime, as a workaround, it's probably best to just eliminate masked arrays with something like:
p1 = p1.compressed() * p1.units
T1 = T1.compressed() * T1.units
Td1 = Td1.compressed() * Td1.units
This will work so long as the data have no actual masked values, or all 3 arrays are masked in the same spots. If not, you'll need to do some more work to remove any of the levels where one of the values is masked.
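For that latter case, here is a minimal sketch of the extra work, assuming p1, T1, and Td1 behave like masked arrays that carry a units attribute (as in the workaround above):

import numpy as np

# Combine the three masks so that a level masked in *any* array is dropped
# from all of them, keeping the arrays aligned.
bad = (np.ma.getmaskarray(p1) | np.ma.getmaskarray(T1) | np.ma.getmaskarray(Td1))
p1 = np.asarray(p1)[~bad] * p1.units
T1 = np.asarray(T1)[~bad] * T1.units
Td1 = np.asarray(Td1)[~bad] * Td1.units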
How do I do interval arithmetic in SymPy 1.3? (Specifically, addition and multiplication.)
For example, given:
q1 = Interval(0,255)
q2 = Interval(0,255)
The addition of those two intervals should be Interval(0, 510). (The plus operator is overloaded to mean "union", so q1+q2 yields Interval(0,255).)
If I try Add(q1, q2), I get an exception:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python2.7/dist-packages/sympy/core/cache.py", line 93, in wrapper
retval = cfunc(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/sympy/core/compatibility.py", line 850, in wrapper
result = user_function(*args, **kwds)
File "/usr/local/lib/python2.7/dist-packages/sympy/core/operations.py", line 45, in __new__
c_part, nc_part, order_symbols = cls.flatten(args)
File "/usr/local/lib/python2.7/dist-packages/sympy/core/add.py", line 223, in flatten
newseq.append(Mul(c, s))
File "/usr/local/lib/python2.7/dist-packages/sympy/core/cache.py", line 93, in wrapper
retval = cfunc(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/sympy/core/compatibility.py", line 850, in wrapper
result = user_function(*args, **kwds)
File "/usr/local/lib/python2.7/dist-packages/sympy/core/operations.py", line 45, in __new__
c_part, nc_part, order_symbols = cls.flatten(args)
File "/usr/local/lib/python2.7/dist-packages/sympy/core/mul.py", line 186, in flatten
r, b = b.as_coeff_Mul()
AttributeError: 'Interval' object has no attribute 'as_coeff_Mul'
(I get a similar exception for Mul).
Yet, the code to add two intervals seems to be right here: https://github.com/sympy/sympy/blob/sympy-1.3/sympy/sets/handlers/add.py#L22
But the dispatcher mechanism doesn't seem to be catching the case of Interval + Interval.
How do I do addition and multiplication on intervals in SymPy?
SymPy Intervals do not perform interval arithmetic. The function you found in the repository is one of the handlers for sympy.sets.setexpr.SetExpr, an expression type that takes values in a given set:
from sympy import Interval
from sympy.sets.setexpr import SetExpr
q1 = SetExpr(Interval(0, 255))
q2 = SetExpr(Interval(0, 255))
result = q1 + q2
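For what it's worth, inspecting the results should show the arithmetic was done on the underlying intervals (the values in the comments are my expectation, not verified output):

result   # should be SetExpr(Interval(0, 510))
q1 * q2  # multiplication dispatches the same way; expect SetExpr(Interval(0, 65025))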
SetExpr is currently hidden-ish and mostly undocumented.
In addition to SetExpr(Interval(...)) you can also use AccumBounds, which is older and was originally intended to give answers for limits of oscillating functions. As far as arithmetic is concerned, it works about the same:
AccumBounds(3, 5) + AccumBounds(2, 8) # AccumBounds(5, 13)
AccumBounds(-2, 5) * AccumBounds(2, 8) # AccumBounds(-16, 40)
but there are some interval computations where the implementation of AccumBounds is more complete:
sin(AccumBounds(0, 3)) # AccumBounds(0, 1)
sin(SetExpr(Interval(0, 3))) # SetExpr(ImageSet(Lambda(x, sin(x)), Interval(0, 3)))
The piece of code below gives me a memory error when I run it. It is part of a simple neural net I'm building. Keep in mind that all the math is written out for learning purposes:
learning_rate = 0.2
costs = []
for i in range(50000):
    ri = np.random.randint(len(data))
    point = data[ri]
    ## print (point)
    z = point[0] * w1 + point[1] * w2 + b
    pred = sigmoid(z)
    target = point[2]
    cost = np.square(pred-target)
    costs.append(cost)
    dcost_pred = 2* (pred-target)
    dpred_dz = sigmoid_p(z)
    dz_dw1 = point[0]
    dz_dw2 = point[1]
    dz_db = 1
    dcost_dz = dcost_pred*dpred_dz
    dcost_dw1 = dcost_pred*dpred_dz*dz_dw1
    dcost_dw2 = dcost_pred*dpred_dz
    dcost_db = dcost_dz * dz_db
    w1 = w1 - learning_rate*dcost_dw1
    w2 = w2 - learning_rate*dcost_dw2
    b = b - learning_rate*dcost_db
    plt.plot(costs)
    if i % 100 == 0:
        for j in range(len(data)):
            cost_sum = 0
            point = data[ri]
            z = point[0]*w1+point[1]*w2+b
            pred = sigmoid(z)
            target = point[2]
            cost_sum += np.square(pred-target)
        costs.append(cost_sum/len(data))
When the program gets to this part it results in the following error:
Traceback (most recent call last):
File "D:\First Neual Net.py", line 89, in <module>
plt.plot(costs)
File "C:\Program Files (x86)\Python36-32\lib\site-packages\matplotlib\pyplot.py", line 3261, in plot
ret = ax.plot(*args, **kwargs)
File "C:\Program Files (x86)\Python36-32\lib\site-packages\matplotlib\__init__.py", line 1717, in inner
return func(ax, *args, **kwargs)
File "C:\Program Files (x86)\Python36-32\lib\site-packages\matplotlib\axes\_axes.py", line 1373, in plot
self.add_line(line)
File "C:\Program Files (x86)\Python36-32\lib\site-packages\matplotlib\axes\_base.py", line 1779, in add_line
self._update_line_limits(line)
File "C:\Program Files (x86)\Python36-32\lib\site-packages\matplotlib\axes\_base.py", line 1801, in _update_line_limits
path = line.get_path()
File "C:\Program Files (x86)\Python36-32\lib\site-packages\matplotlib\lines.py", line 957, in get_path
self.recache()
File "C:\Program Files (x86)\Python36-32\lib\site-packages\matplotlib\lines.py", line 667, in recache
self._xy = np.column_stack(np.broadcast_arrays(x, y)).astype(float)
File "C:\Program Files (x86)\Python36-32\lib\site-packages\numpy\lib\shape_base.py", line 353, in column_stack
return _nx.concatenate(arrays, 1)
MemoryError
Are there any ways to make this code more efficient? Maybe using generators?
Bugs
You probably meant to move the cost_sum = 0 out of the loop!
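A minimal sketch of the intended average-cost pass, with stand-in data, weights, and sigmoid (the question doesn't show them), and presumably indexing with j rather than the stale ri:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Stand-in values so the sketch runs; shapes match the question's usage.
data = [(0.5, 1.0, 1), (2.0, 0.3, 0)]
w1, w2, b = 0.1, -0.2, 0.05
costs = []

cost_sum = 0  # initialised once, before the loop
for j in range(len(data)):
    point = data[j]  # index with j, not the ri left over from the outer loop
    z = point[0] * w1 + point[1] * w2 + b
    pred = sigmoid(z)
    target = point[2]
    cost_sum += np.square(pred - target)
costs.append(cost_sum / len(data))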
Memory Error
You're attempting to plot 50000 points; matplotlib certainly doesn't take kindly to that, so you might want to reduce the number of points you plot. It appears you tried to do this in the loop at the bottom of the code?
Efficiency
Let's address the efficiency part of your question. I can't comment on how to make the algorithm itself more efficient, but I thought I'd offer my wisdom on making Python code run faster.
I'll start off by saying this: the Python compiler (that sits behind the interpreter) does next to no optimisation, so the common optimisations compilers usually perform are left for us to apply by hand.
Common Subexpression Elimination
In your code you make use of stuff like this:
dcost_dz = dcost_pred*dpred_dz
dcost_dw1 = dcost_pred*dpred_dz*dz_dw1
dcost_dw2 = dcost_pred*dpred_dz
This is really inefficient! We are computing dcost_pred*dpred_dz 3 times! It would be much better to just make use of the variable we've already assigned to:
dcost_dz = dcost_pred*dpred_dz
dcost_dw1 = dcost_dz*dz_dw1
dcost_dw2 = dcost_dz
To put this into perspective, this is 4 fewer instructions each iteration of the loop (14 vs 10).
In a similar vein, you recompute point[0] and point[1] twice per iteration; why not unpack the point once, e.g. (x, y, target) = data[ri]?
Loop Hoisting
You also do a bit of recomputation during your iterations of the loop. For instance, every iteration you find the size of the data up to 3 different times and this doesn't change. So compute it before the loop:
for i in range(50000):
ri = np.random.randint(len(data))
becomes
datasz = len(data)
for i in range(50000):
ri = np.random.randint(datasz)
Also, that np.random.randint is costing you 3 instructions to even access, without even thinking about the call. If you were being really performance sensitive, you'd move it out of the loop too:
datasz = len(data)
randint = np.random.randint
for i in range(50000):
ri = randint(datasz)
Strength Reduction
Sometimes, some operations are faster than others. For instance:
b = b - learning_rate*dcost_db
Uses a BINARY_SUBTRACT instruction, but
b -= learning_rate*dcost_db
uses an INPLACE_SUBTRACT instruction. It's likely the in-place instruction is a bit faster, so it's something to consider (but you'd need to test that theory).
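You can check which instructions are emitted yourself with the standard library's dis module; a quick sketch:

import dis

def plain_sub(b, d):
    b = b - d
    return b

def inplace_sub(b, d):
    b -= d
    return b

dis.dis(plain_sub)    # disassembly shows a BINARY_SUBTRACT instruction
dis.dis(inplace_sub)  # disassembly shows an INPLACE_SUBTRACT instruction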
Plot Every Iteration
As stated in the comments (nicely spotted @HFBrowning), you are plotting all the points every iteration, which is going to absolutely cripple performance!
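A minimal sketch of the fix, with a stand-in cost so it runs on its own: build the list inside the loop, and call plot/show exactly once afterwards.

import matplotlib.pyplot as plt

costs = []
for i in range(50000):
    costs.append(i * 0.001)  # stand-in for the real cost computation
plt.plot(costs)  # a single plot call, after the loop finishes
plt.show()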
List append
Since you know the size of the list you are inserting into (50500 or something?), you can allocate the list at that size up front with something like costs = [0] * 50500; this saves a lot of time reallocating the list when it gets full. You'd stop using append and start assigning to an index. Bear in mind, though, this will cause weirdness in your plot unless you only plot once after the loop has finished!
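Concretely, a sketch of the preallocation pattern (50500 is just the rough figure from above, and the cost is a stand-in):

n_costs = 50500
costs = [0.0] * n_costs      # allocate the whole list once, up front
for i in range(n_costs):
    costs[i] = i * 0.001     # assign by index instead of appending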
I have a function which accepts a list R. In this function, I have defined an optimization problem using PuLP. This is my function:
import pulp
from multiprocessing.dummy import Pool as ThreadPool

def optimize(R):
    variables = ["x1", "x2", "x3", "x4"]
    costs = {"x1": R[0], "x2": R[1], "x3": R[2], "x4": R[3]}
    constraint = {"x1": 5, "x2": 7, "x3": 4, "x4": 3}
    prob_variables = pulp.LpVariable.dicts("Intg", variables,
                                           lowBound=0,
                                           upBound=1,
                                           cat=pulp.LpInteger)
    prob = pulp.LpProblem("test1", pulp.LpMaximize)
    # defines the constraints
    prob += pulp.lpSum([constraint[i]*prob_variables[i] for i in variables]) <= 14
    # defines the objective function to maximize
    prob += pulp.lpSum([costs[i]*prob_variables[i] for i in variables])
    pulp.GLPK().solve(prob)
    # Solution
    return pulp.value(prob.objective)
To get the output, I used a list as my input and the output is correct:
my_input = [[1,2,3,4],[5,6,7,8],[9,10,11,12],[13,14,15,16]]
results = []
for i in range(0, len(my_input)):
    results.append(optimize(my_input[i]))
print("*"*20)
print(results)
But, I want to use multi-threading instead of the for loop. So, I used:
my_input = [[1,2,3,4],[5,6,7,8],[9,10,11,12],[13,14,15,16]]
pool = ThreadPool(4)
results = pool.map(optimize, my_input)
But it gives me some errors:
Traceback (most recent call last):
File "/Users/Mohammad/PycharmProjects/untitled10/multi_thread.py", line 35, in <module>
results = pool.map(optimize, my_input)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/multiprocessing/pool.py", line 260, in map
return self._map_async(func, iterable, mapstar, chunksize).get()
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/multiprocessing/pool.py", line 608, in get
raise self._value
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/multiprocessing/pool.py", line 119, in worker
result = (True, func(*args, **kwds))
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/multiprocessing/pool.py", line 44, in mapstar
return list(map(*args))
File "/Users/Mohammad/PycharmProjects/untitled10/multi_thread.py", line 27, in optimize
pulp.GLPK().solve(prob)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/PuLP-1.6.1-py3.5.egg/pulp/solvers.py", line 179, in solve
return lp.solve(self)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/PuLP-1.6.1-py3.5.egg/pulp/pulp.py", line 1643, in solve
status = solver.actualSolve(self, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/PuLP-1.6.1-py3.5.egg/pulp/solvers.py", line 377, in actualSolve
raise PulpSolverError("PuLP: Error while executing "+self.path)
pulp.solvers.PulpSolverError: PuLP: Error while executing glpsol
Can anybody help me?
In my actual code, the my_input list has a length of 27 (instead of 4 in the above code), and for each item my function has to perform 80k optimizations (instead of one in the above code). So multi-threading would be a big help for me.
I have seen that the class pulp.solvers.COIN_CMD has a threads argument, although the documentation is quite laconic. Taking a look at the source code, it does indeed seem to be a way to provide threads to the solver.
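A minimal sketch of switching to that solver (whether the threads argument actually speeds things up depends on the CBC build installed alongside PuLP):

import pulp

prob = pulp.LpProblem("threaded_test", pulp.LpMaximize)
x = pulp.LpVariable("x", 0, 10)
prob += x  # trivial objective: maximize x
prob.solve(pulp.solvers.COIN_CMD(threads=4))  # hand the problem to CBC with 4 threads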
If naming is indeed the issue, consider adding the desired name index for a given problem as an input argument to the function. Something like:
def optimize(tup):  # here, tup contains (idx, R), so as to be callable using pool.map
    idx, R = tup
    ...
    prob = pulp.LpProblem('test'+str(idx), pulp.LpMaximize)
    ...
and then something like:
my_input = [[1,2,3,4],[5,6,7,8],[9,10,11,12],[13,14,15,16]]
pool = ThreadPool(4)
results = pool.map(optimize, enumerate(my_input))
I recently started exploring Python and have encountered a problem with a package named PaCal.
Everything looks to be working fine, except that I keep getting this error any time I want to print out some data (like in print A.mean()).
The error is:
Traceback (most recent call last):
File "C:\Users\rmobenta\Desktop\tt.py", line 12, in <module>
print A.interval(0.95)
File "C:\Python27\lib\site-packages\pacal\distr.py", line 229, in interval
return self.quantile(p_lim), self.quantile(1.0 - p_lim)
File "C:\Python27\lib\site-packages\pacal\distr.py", line 215, in quantile
return self.get_piecewise_cdf().inverse(y)
File "C:\Python27\lib\site-packages\pacal\segments.py", line 1721, in inverse
x = findinv(segi.f, a = segi.a, b = segi.b, c = y, rtol = params.segments.cumint.reltol, maxiter = params.segments.cumint.maxiter) # TODO PInd, MInf
File "C:\Python27\lib\site-packages\pacal\utils.py", line 384, in findinv
return brentq(lambda x : fun(x) - c, a, b, **kwargs)
File "C:\Python27\lib\site-packages\scipy\optimize\zeros.py", line 414, in brentq
raise ValueError("rtol too small (%g < %g)" % (rtol, _rtol))
ValueError: rtol too small (1e-16 < 4.44089e-16)
I am using a short demo script (given by the author of this package) and have no idea how to tackle this issue.
Here is the script:
from pacal import *
Y = UniformDistr(1, 2)
X = UniformDistr(3, 4)
A = atan(Y / X)
A.plot()
print A.mean()
print A.interval(0.95)
The problem comes from PaCal, which defines segments.cumint.reltol = 1e-16 at l.141 of params.py.
This is the value passed as rtol in segments.py to the SciPy function brentq().
It is ultimately compared to numpy.finfo(float).eps * 2 (l.413 and l.10 of scipy/optimize/zeros.py) and is unfortunately smaller.
So it looks like a problem in the PaCal implementation, not in your code.
Note that the value you provided to interval() corresponds to the default value (l.222 of distr.py).
I think you should contact the PaCal developers to get more information and probably open an issue.
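In the meantime, a possible workaround is to loosen that tolerance at runtime before computing anything, so it no longer falls below SciPy's minimum; a sketch, assuming the params module is importable from the pacal package as the traceback suggests:

from pacal import params

# 1e-14 is an arbitrary value safely above numpy.finfo(float).eps * 2,
# which is what brentq() compares rtol against.
params.segments.cumint.reltol = 1e-14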