Problem with Sympy lambdify with functions of 'x' or constants [duplicate] - python

I need to evaluate the derivative of functions (f') given by the user in many points. The points are in a list (or numpy.array, pandas.Series...). I obtain the expected value when f' depends on a sympy variable, but not when f' is a constant:
import sympy as sp
f1 = sp.sympify('1')
f2 = sp.sympify('t')
lamb1 = sp.lambdify('t',f1)
lamb2 = sp.lambdify('t',f2)
print(lamb1([1,2,3]))
print(lamb2([1,2,3]))
I obtain:
1
[1, 2, 3]
The second is alright, but I expected that the first would be a list of ones.
These functions are in a matrix and the end result of sympy operations, such as taking derivatives. The exact form of f1 and f2 varies per problem.

lamb1 is a function that returns the constant 1: def lamb1(x): return 1.
lamb2 is a function that returns its argument: def lamb2(x): return x.
So the output is exactly what one should expect.
Here is an approach that might work. I changed the test function for f2 to t*t as that was more annoying in my tests (dealing with Pow(t,2)).
import sympy as sp
import numpy as np
f1 = sp.sympify('1')
f2 = sp.sympify('t*t')
def np_lambdify(varname, func):
    lamb = sp.lambdify(varname, func, modules=['numpy'])
    if func.is_constant():
        return lambda t: np.full_like(t, lamb(t))
    else:
        return lambda t: lamb(np.array(t))
lamb1 = np_lambdify('t', f1)
lamb2 = np_lambdify('t', f2)
print(lamb1(1))
print(lamb1([1, 2, 3]))
print(lamb2(2))
print(lamb2([1, 2, 3]))
Outputs:
1
[1 1 1]
4
[1 4 9]

With isympy/ipython introspection:
In [28]: lamb2??
Signature: lamb2(t)
Docstring:
Created with lambdify. Signature:
func(arg_0)
Expression:
t
Source code:
def _lambdifygenerated(t):
    return (t)
and for the first:
In [29]: lamb1??
Signature: lamb1(t)
Docstring:
Created with lambdify. Signature:
func(arg_0)
Expression:
1
Source code:
def _lambdifygenerated(t):
    return (1)
So one returns the input argument; the other returns just the constant, regardless of the input. lambdify does a rather simple lexical translation from sympy to numpy Python.
edit
Putting your functions in a sp.Matrix:
In [55]: lamb3 = lambdify('t',Matrix([f1,f2]))
In [56]: lamb3??
...
def _lambdifygenerated(t):
    return (array([[1], [t]]))
...
In [57]: lamb3(np.arange(3))
Out[57]:
array([[1],
[array([0, 1, 2])]], dtype=object)
So this returns a numpy array; but because of the mix of shapes the result is object dtype, not 2d.
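We can see this with a direct array generation (recent NumPy releases raise a ValueError for ragged input like the first example unless dtype=object is passed):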
In [53]: np.array([[1],[1,2,3]])
Out[53]: array([list([1]), list([1, 2, 3])], dtype=object)
In [54]: np.array([np.ones(3,int),[1,2,3]])
Out[54]:
array([[1, 1, 1],
[1, 2, 3]])
Neither sympy nor the np.array attempts to 'broadcast' that constant. There are numpy constructs that will do that, such as multiplication and addition, but this simple sympy function and lambdify don't.
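As a hedged sketch of one such construct, np.broadcast_arrays can expand the constant row to the length of the array row before the rows are stacked. The helper name rows_to_2d is hypothetical, and the two rows are written by hand here as stand-ins for the matrix entries rather than produced by lambdify.

```python
import numpy as np

def rows_to_2d(*rows):
    """Broadcast scalar and array rows to a common shape, then stack them.

    Hypothetical helper for illustration; not part of sympy or numpy.
    """
    return np.vstack(np.broadcast_arrays(*rows))

t = np.arange(3)
# Stand-ins for the two matrix entries: the constant 1 and the variable t.
result = rows_to_2d(1, t)
print(result)
```

The result is an ordinary 2d integer array instead of an object-dtype array.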
edit
frompyfunc is a way of passing an array (or arrays) to a function that only works with scalar inputs. While lamb2 works with an array input, you aren't happy with the lamb1 case, or presumably lamb3.
In [60]: np.frompyfunc(lamb1,1,1)([1,2,3])
Out[60]: array([1, 1, 1], dtype=object)
In [61]: np.frompyfunc(lamb2,1,1)([1,2,3])
Out[61]: array([1, 2, 3], dtype=object)
This [61] is slower than simply lamb2([1,2,3]) since it effectively iterates.
In [62]: np.frompyfunc(lamb3,1,1)([1,2,3])
Out[62]:
array([array([[1],
[1]]), array([[1],
[2]]),
array([[1],
[3]])], dtype=object)
In this Matrix case the result is an array of arrays. But since shapes match they can be combined into one array (in various ways):
In [66]: np.concatenate(_62, axis=1)
Out[66]:
array([[1, 1, 1],
[1, 2, 3]])

Usually it isn't actually a problem for lambdify to return a constant, because NumPy's broadcasting semantics will automatically treat a constant as an array of that constant of the appropriate shape.
If it is a problem, you can use a wrapper like
import numpy

def broadcast(fun):
    return lambda *x: numpy.broadcast_arrays(fun(*x), *x)[0]
(this is taken from https://github.com/sympy/sympy/issues/5642, which has more discussion on this issue).
Note that using broadcast is better than full_like as in JohanC's answer, because broadcasted constant arrays do not actually take up more memory, whereas full_like will copy the constant in memory to make the array.
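The memory difference can be seen by inspecting strides. This is only a sketch: const_one is a hand-written stand-in for a lambdified constant, and the variable names are made up for the illustration.

```python
import numpy as np

def broadcast(fun):
    # Same wrapper as above: broadcast fun's result against its inputs.
    return lambda *x: np.broadcast_arrays(fun(*x), *x)[0]

# Stand-in for a lambdified constant expression (assumption: written by hand).
const_one = lambda t: 1

t = np.linspace(0.0, 1.0, 1000)
ones_view = broadcast(const_one)(t)
ones_copy = np.full_like(t, const_one(t))

# A stride of 0 means every element refers to the same single stored value.
print(ones_view.shape, ones_view.strides)
# full_like allocates one element per entry of t.
print(ones_copy.shape, ones_copy.strides)
```

Because all elements of the broadcast result share one memory location, it is best treated as read-only.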

I often use the trick t * 0 + 1 to create a zero-vector the same length as my input, but then add 1 to each of its elements. It works with NumPy; check if it works with Sympy!
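A quick check of that suggestion, as a sketch: with NumPy the arithmetic runs elementwise, but SymPy evaluates the product with zero as soon as the expression is built, so lambdify still only sees the constant. The variable names here are illustrative.

```python
import numpy as np
import sympy as sp

# NumPy: the arithmetic is carried out elementwise, so the shape survives.
x = np.array([1, 2, 3])
numpy_result = x * 0 + 1
print(numpy_result)

# SymPy: t*0 is evaluated to 0 when the expression is constructed,
# so the whole expression collapses to the constant 1.
t = sp.Symbol('t')
expr = t * 0 + 1
print(expr)

# The lambdified function therefore returns a plain scalar again.
sympy_result = sp.lambdify(t, expr, 'numpy')(x)
print(sympy_result)
```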

I never use lambdify so I can't be too critical of how it is working. But it appears that you will need to fool it by giving it an expression that doesn't simplify to a scalar which, when evaluated with numbers will reduce to the desired value:
>>> import numpy as np
>>> lambdify('t','(1+t)*t-t**2-t+42','numpy')(np.array([1,2,3]))
array([42, 42, 42])

Related

PyTorch - Efficient way to apply different functions to different 'row/column' of a tensor

Let's say I have a 2-d tensor:
x = torch.Tensor([[1, 2], [3, 4]])
Is there an efficient way to apply one function to the first 'row' [1, 2] and apply a second different function to the second row [3, 4]? (Doesn't have to be a row, could be across any dimension)
At the moment, I use the following code: Say I have my two functions, f and g, for example,
def f(z):
    return 2 * z
def g(z):
    return 0.5 * z
Then, to apply them to separate rows I would do:
torch.cat([f(x[0]).unsqueeze(0), g(x[1]).unsqueeze(0)], dim = 0)
which gives the desired tensor [[2, 4], [1.5, 2]].
Obviously, in this 2-d example this solution is fine, but it seems a bit clunky. Is there a better way of doing this? Particularly in higher dimensions or when there are a large number of elements in the chosen dimension
A handy tip is to slice instead of selecting to avoid the unsqueeze step. Indeed, notice how x[:1] keeps the indexed dimension compared to x[0].
This way you can perform the desired operation in a slightly shorter form:
>>> torch.vstack((f(x[:1]), g(x[1:])))
Here vstack replaces torch.cat with dim=0, so the dim argument can be dropped.
Alternatively, you can use a helper function that will apply both f and g:
>>> fn = lambda a,b: (f(a), g(b))
And split the tensor inline with torch.Tensor.split:
>>> torch.vstack(fn(*x.split(1)))
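For more than two functions, or for a dimension other than the first, the same idea can be wrapped in a small helper. This is a sketch only: apply_per_slice is a hypothetical name, it assumes every function preserves the shape of its slice, and it still loops in Python, so it is tidier rather than faster.

```python
import torch

def apply_per_slice(fns, x, dim=0):
    """Apply fns[i] to the i-th slice of x along dim and restack.

    Hypothetical helper; assumes each function preserves the slice's shape.
    """
    slices = x.unbind(dim)
    if len(fns) != len(slices):
        raise ValueError("need exactly one function per slice")
    return torch.stack([fn(s) for fn, s in zip(fns, slices)], dim=dim)

def f(z):
    return 2 * z

def g(z):
    return 0.5 * z

x = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
rows = apply_per_slice([f, g], x, dim=0)   # f on row 0, g on row 1
cols = apply_per_slice([f, g], x, dim=1)   # f on column 0, g on column 1
print(rows)
print(cols)
```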

NumPy - assigning to views returned by function

Suppose I have a NumPy N-D array a and a function f(a) that returns some arbitrarily complex view v of a, plus an array b with the same shape as v.
What is the easiest way to assign b to v? Both can be multi-dimensional.
The simplest attempt, assigning directly to the function's return value as in the code below, fails with SyntaxError: can't assign to function call:
import numpy as np
a, b = np.arange(10), np.arange(2)
a[2:4] = b # Working
f = lambda a: a[2:4] # Returns any view of a
f(a) = b # Not working, syntax error
The task does not allow passing b as an argument to f; the function itself must stay unmodified.
@hpaulj suggested the following solution, which works for any dimensionality (unlike the [:] solution further down):
f(a)[...] = b
I also found a simple solution myself; it works correctly for any N-D case except 0-dimensional arrays (scalars):
f(a)[:] = b
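The difference between the two forms only shows up in the 0-dimensional case. A minimal sketch, using a hand-made 0-d array as a stand-in for whatever f(a) might return:

```python
import numpy as np

# 0-dimensional array standing in for a view returned by some f(a).
scalar_view = np.array(5)

scalar_view[...] = 7          # Ellipsis matches zero or more dimensions
print(scalar_view)

slice_error = None
try:
    scalar_view[:] = 9        # a slice needs at least one dimension
except IndexError as exc:
    slice_error = exc
    print("IndexError:", exc)
```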
Before trying to find a solution, make sure you understand the problem.
In [27]: a, b = np.arange(10), np.arange(2)
In [28]: f = lambda a: a[2:4]
In [29]: f(a)
Out[29]: array([2, 3])
In [30]: f(a) = b
File "<ipython-input-30-df88b52b4d3c>", line 1
f(a) = b
^
SyntaxError: can't assign to function call
This error is a fundamental Python one. A matter of syntax.
But look at what happens when we use indexing.
The slicing you do in f is:
In [31]: a[2:4]
Out[31]: array([2, 3])
In [32]: a.__getitem__(slice(2,4))
Out[32]: array([2, 3])
The desired assignment slicing is:
In [33]: a[2:4] = b
In [34]: a.__setitem__(slice(2,4),b)
In [35]: a
Out[35]: array([0, 1, 0, 1, 4, 5, 6, 7, 8, 9])
Note that setitem takes b as an argument. a.__setitem__(slice(2,4))=b would run into that same syntax error.
This use of setitem allows us to use advanced indexing (a list):
In [38]: a[[0,2]] = b
In [39]: a.__setitem__([0,2],b)
Whereas this does not work:
In [40]: a[[0,2]][...] = b
because it is actually a.__getitem__([0,2]).__setitem__(Ellipsis,b). The set modifies the copy produced by the get. This chaining only works when the first index produces a view.
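Whether the chained assignment reaches the original array can be checked with np.shares_memory. A small sketch with illustrative variable names:

```python
import numpy as np

b = np.array([100, 200])

a = np.arange(10)
view = a[2:4]                 # basic slicing returns a view
view[...] = b                 # writes through to a
print(np.shares_memory(a, view), a)

a2 = np.arange(10)
picked = a2[[0, 2]]           # advanced (list) indexing returns a copy
picked[...] = b               # only the copy changes
print(np.shares_memory(a2, picked), a2)
```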

np.vectorize and np.apply_along_axis passing same argument twice to mapping function

I want to map a function f over an array of strings. I construct a vectorized version of f and apply it to my array. But the first element of the array gets passed twice:
import numpy as np
def f(string):
    print('called with', string)
a = np.array(['110', '012'])
fv = np.vectorize(f)
np.apply_along_axis(fv, axis=0, arr=a)
called with 110
called with 110
called with 012
Why is that? I would not have expected 110 to get passed to f two times and I don't see why it would be.
What is my misconception about np.vectorize or np.apply_along_axis?
In [145]: def f(string):
     ...:     print('called with', string)
     ...:
     ...: a = np.array(['110', '012'])
     ...:
     ...: fv = np.vectorize(f)
     ...:
In [146]: fv(a)
called with 110
called with 110
called with 012
Out[146]: array([None, None], dtype=object)
The function with just a print returns None. vectorize made one extra call, with the first element, to determine the return dtype; in this case it deduced object.
If we specify an otypes like int, we get an error:
In [147]: fv = np.vectorize(f, otypes=[int])
In [148]: fv(a)
called with 110
called with 012
---------------------------------------------------------------------------
...
TypeError: int() argument must be a string, a bytes-like object or a number, not 'NoneType'
That otypes was not compatible with the returned object (None). With object it works:
In [149]: fv = np.vectorize(f, otypes=[object])
In [150]: fv(a)
called with 110
called with 012
Out[150]: array([None, None], dtype=object)
A better, and slightly more meaningful function:
In [151]: def f(string):
     ...:     print('called with', string)
     ...:     return len(string)
     ...:
     ...:
In [152]: fv = np.vectorize(f, otypes=[int])
In [153]: fv(a)
called with 110
called with 012
Out[153]: array([3, 3])
Keep in mind that vectorize passes scalar values to your function. In effect it evaluates each element of the input arrays, returning an array with matching shape:
In [154]: fv(np.array([a,a,a]))
called with 110
called with 012
called with 110
called with 012
called with 110
called with 012
Out[154]:
array([[3, 3],
[3, 3],
[3, 3]])
Compared to plain iteration, eg. np.array([f(i) for i in a]), it is slower, but a little more convenient if the input array can have multiple dimensions, and even better if there are several arrays that need to be broadcast against each other.
For a simple one array like a, np.vectorize is overkill.
vectorize has another parameter, cache which can avoid this double call, while still allowing for auto dtype detection:
In [156]: fv = np.vectorize(f, cache=True)
In [157]: fv(a)
called with 110
called with 012
Out[157]: array([3, 3])
Auto dtype detection has sometimes caused bugs. For example if the trial calculation returns a different dtype:
In [160]: def foo(var):
     ...:     if var<0:
     ...:         return -var
     ...:     elif var>0:
     ...:         return var
     ...:     else:
     ...:         return 0
In [161]: np.vectorize(foo)([0,1.2, -1.2])
Out[161]: array([0, 1, 1]) # int dtype
In [162]: np.vectorize(foo)([0.1,1.2, -1.2])
Out[162]: array([0.1, 1.2, 1.2]) # float dtype
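Passing otypes avoids that trap, since the dtype no longer depends on what the trial call happens to return. A sketch using the same foo, with illustrative variable names:

```python
import numpy as np

def foo(var):
    if var < 0:
        return -var
    elif var > 0:
        return var
    else:
        return 0

data = [0, 1.2, -1.2]
# dtype inferred from the trial call on the first element, which returns the int 0
guessed = np.vectorize(foo)(data)
# dtype fixed up front, so the float results are kept
explicit = np.vectorize(foo, otypes=[float])(data)
print(guessed.dtype, guessed)
print(explicit.dtype, explicit)
```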
apply_along_axis takes a function that accepts a 1d array. It iterates over all other dimensions, passing a set of 1d slices to your function. For a 1d array like your a this doesn't do anything useful. And even if your a was nd, it isn't going to help much. Your fv doesn't expect a 1d input.
It too does a trial calculation to determine the return array shape and dtype. It does automatically cache that result.
Like vectorize, apply_along_axis is a convenience tool, not a performance tool.
Compare
np.apply_along_axis(fv, axis=0, arr=[a,a,a])
np.apply_along_axis(fv, axis=1, arr=[a,a,a])
to get an idea of how apply_along affects the evaluation order.
Or do something with a whole row (or column) with:
np.apply_along_axis(lambda x: fv(x).mean(), axis=0, arr=[a,a,a])
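To make the slicing visible with plain numbers, the following sketch records every 1d slice that apply_along_axis hands to the function. The names make_recorder and seen are made up for the illustration.

```python
import numpy as np

arr = np.arange(6).reshape(2, 3)
seen = {0: [], 1: []}

def make_recorder(axis):
    # Build a 1d function that logs each slice it receives, then sums it.
    def record_and_sum(slice_1d):
        seen[axis].append(slice_1d.tolist())
        return slice_1d.sum()
    return record_and_sum

col_sums = np.apply_along_axis(make_recorder(0), axis=0, arr=arr)
row_sums = np.apply_along_axis(make_recorder(1), axis=1, arr=arr)
print(seen[0], col_sums)   # slices along axis 0 are the columns
print(seen[1], row_sums)   # slices along axis 1 are the rows
```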
From the docs:
The data type of the output of vectorized is determined by calling the function with the first element of the input. This can be avoided by specifying the otypes argument.
The extra call is made to determine the output dtype.
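The extra call can be counted directly. In this sketch count_calls is a hypothetical helper that wraps a function like the one in the question and reports how often np.vectorize invoked it:

```python
import numpy as np

def count_calls(**vectorize_kwargs):
    """Return how many times the wrapped function runs for a 2-element input."""
    calls = []

    def f(s):
        calls.append(s)
        return len(s)

    np.vectorize(f, **vectorize_kwargs)(np.array(['110', '012']))
    return len(calls)

default_calls = count_calls()               # includes the trial call
otypes_calls = count_calls(otypes=[int])    # dtype given, no trial call
cache_calls = count_calls(cache=True)       # trial result is reused
print(default_calls, otypes_calls, cache_calls)
```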

numpy composition of einsums?

Suppose that I have a np.einsum that performs some calculation, and then pump that directly into yet another np.einsum to do some other thing. Can I, in general, compose those two einsums into a single einsum?
My specific use case is that I am doing a transpose, a matrix multiplication, and then another matrix multiplication to compute b a^T a :
import numpy as np
from numpy import array
a = array([[1, 2],
[3, 4]])
b = array([[1, 2],
[3, 4],
[5, 6]])
matrix_multiply_by_transpose = 'ij,kj->ik'
matrix_multiply = 'ij,jk->ik'
test_answer = np.einsum(matrix_multiply,
np.einsum(matrix_multiply_by_transpose,
b, a
),
a
)
assert np.array_equal(test_answer,
np.einsum(an_answer_to_this_question, b, a, a))
#or, the ultimate most awesomest answer ever, if such a thing even exists
assert np.array_equal(test_answer,
np.einsum(the_bestest_answer(matrix_multiply_by_transpose, matrix_multiply),
b, a, a)
)
In a single einsum call, it would be:
np.einsum('ij,kj,kl->il',b,a,a)
The intuition involved would be :
Start off from the innermost einsum call : 'ij,kj->ik'.
Moving out, the second one is : 'ij,jk->ik'. The first argument in it is the output from step#1. So, let's mould this argument for the second one based on the output from the first one, introducing new strings for new iterators : 'ik,kl->il'. Note that 'kl' is the second arg in this second einsum call, which is a.
Thus, combining, we have : 'ij,kj,kl->il' with the inputs in the same sequence, i.e. b,a for the innermost einsum call and then a incoming as the third input.
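A quick numerical check of the combined subscript string against the nested calls from the question, as a sketch with illustrative variable names:

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])
b = np.array([[1, 2],
              [3, 4],
              [5, 6]])

# The two-step version from the question.
nested = np.einsum('ij,jk->ik', np.einsum('ij,kj->ik', b, a), a)
# The composed version derived above.
combined = np.einsum('ij,kj,kl->il', b, a, a)
# The same product written with the matrix operator.
matmul = b @ a.T @ a

print(combined)
```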

Unpacking a list in python using .T?

I'm using scipy's method integrate.odeint to solve a second order LDE. The method requires that the equation be put in the form of a system of two first-order equations in two unknowns. The method
odeint(system_matrix,initial_conditions_matrix,time_values)
outputs the solution vector at each point of time in time_values. The solution vector is actually of the form [u,u'], where u is the variable I am interested in. So I want to plot only u. I found online one way of accomplishing this is to use
u,u'=odeint(system_matrix,initial_conditions_matrix,time_values).T
but I don't understand why this works and what does the .T at the end mean?
odeint(system_matrix,initial_conditions_matrix,time_values) is a matrix of 2 columns.
To be able to get the first column, first use .T (transpose) and then you are able to unpack since the elements are oriented like you want.
BTW, u' is not a valid Python variable name. I would do:
u,_ = odeint(system_matrix,initial_conditions_matrix,time_values).T
since second value is of no interest to you.
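Why the transpose is needed can be seen from the shapes alone. The sketch below does not call odeint; sol is a hand-built stand-in (an assumption) with the same layout, one row per time point and one column per variable.

```python
import numpy as np

t = np.linspace(0.0, 2.0 * np.pi, 101)
# Stand-in for odeint's output (assumption): one row per time point,
# columns [u, u'] for u = sin(t).
sol = np.column_stack([np.sin(t), np.cos(t)])
print(sol.shape)

# Iterating over a 2-D array yields its rows, so unpacking sol directly
# would need 101 names. The transpose has 2 rows, one per variable.
u, du = sol.T
print(u.shape, du.shape)
```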
The example I have in mind is:
>>> sol = odeint(pend, y0, t, args=(b, c))
The solution is an array with shape (101, 2). The first column is theta(t), and the second is omega(t). The following code plots both components.
>>>
>>> import matplotlib.pyplot as plt
>>> plt.plot(t, sol[:, 0], 'b', label='theta(t)')
>>> plt.plot(t, sol[:, 1], 'g', label='omega(t)')
sol[:,0] selects the first column of sol
Unpacking is usually used with a function that returns a tuple, for example:
def foo():
    ...
    return [1,2,3],{3:3}
x, y = foo()
should end up with x being a list, y a dictionary.
But it works with any iterable, provided the number of terms matches. For example, a 2-row array can be unpacked into 2 arrays.
In [1]: x, y = np.arange(6).reshape(2,3)
In [4]: x,y
Out[4]: (array([0, 1, 2]), array([3, 4, 5]))
If I'd created a (3,2) array I would have needed x,y,z= ..., or .T.
Because we can index columns and rows, unpacking isn't used a lot in numpy. Usually we have too many rows to unpack. But it works just as basic Python intends.
As a matter of curiosity, transpose works on a tuple
In [6]: np.transpose((x,y))
Out[6]:
array([[0, 3],
[1, 4],
[2, 5]])
This is actually used in np.argwhere, which turns the tuple of indices produced by np.where into an array with the same number of columns as dimensions.
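That relationship can be checked on a small boolean mask. A sketch with illustrative variable names:

```python
import numpy as np

mask = np.array([[True, False, True],
                 [False, True, False]])

# One index array per dimension, as np.where(mask) would return.
index_tuple = np.nonzero(mask)
# Transposing the tuple gives one row per True element.
stacked = np.transpose(index_tuple)
print(stacked)
```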
