Applying two equations to one array - python

I'm new to Python and Numpy, and I've spent a lot of time (days) searching for answers to my question, but I'm getting stumped. I have an array of magnitudes for earthquakes, and I need to convert them to a different form of magnitude (Mb to Mo). For magnitudes less than 4.3, I need to apply one conversion, and for magnitudes greater than or equal, I need to apply a second conversion. I need the output to be in the same order as the input, and that's where I'm hitting a wall. I can get the conversions to output into two separate arrays, but I can't figure out how to write a program that chooses one equation based on the magnitude, applies it and moves on to the next magnitude in the array. Even though the following example is obviously incorrect on many levels, I think it shows what I'm trying to achieve:
data = numpy.genfromtxt('OK_mag3.csv')
mag = numpy.asarray(data)
for x in mag:
if x < 4.3:
mw = 1.03 + 0.67 * x
else:
mw = 0.1 + 0.88 * x
Also, an example of getting half of this correct is:
mw = mag[mag<4.3]*0.67+1.03
but then I don't know how to incorporate the second equation.
Any assistance is greatly appreciated!

mw = numpy.where(mag < 4.3, 1.03 + 0.67 * mag, 0.1 + 0.88 * mag)
See docs on numpy.where. The first parameter will transform data into a boolean list, the second two params will calculate the whole vector with one function or another; then where selects which of the two results is better based on the boolean.
EDIT: Based on the issues raised in comments, the following does not double the work, and will avoid cases when one of the computations would be invalid - but does still take up a bit of memory for the selector array.
mw = numpy.zeros(len(mag))
select = mag < 4.3
mw[select] = 1.03 + 0.67 * mag[select]
select = ~select
mw[select] = 0.1 + 0.88 * mag[select]

You can achieve this using np.where:
In [1]: import numpy as np
In [2]: arr = np.eye(3)
In [3]: arr
Out[3]:
array([[ 1., 0., 0.],
[ 0., 1., 0.],
[ 0., 0., 1.]])
In [4]: arr[np.where(arr==0)] += 2
In [5]: arr
Out[5]:
array([[ 1., 2., 2.],
[ 2., 1., 2.],
[ 2., 2., 1.]])
In your case, for each condition, you could do something like:
mw[np.where(mw < 4.3)] = mw[np.where(mw < 4.3)]*0.67 + 1.03

Related

Does pytorch do eager pruning of its computational graph?

This is a very simple example:
import torch
x = torch.tensor([1., 2., 3., 4., 5.], requires_grad=True)
y = torch.tensor([2., 2., 2., 2., 2.], requires_grad=True)
z = torch.tensor([1., 1., 0., 0., 0.], requires_grad=True)
s = torch.sum(x * y * z)
s.backward()
print(x.grad)
This will print,
tensor([2., 2., 0., 0., 0.]),
since, of course, ds/dx is zero for the entries where z is zero.
My question is: Is pytorch smart and stop the computations when it reaches a zero? Or does in fact do the calculation "2*5", only to later do "10 * 0 = 0"?
In this simple example it doesn't make a big difference, but in the (bigger) problem I am looking at, this will make a difference.
Thank you for any input.
No, pytorch does no such thing as pruning any subsequent calculations when zero is reached. Even worse, due to how float arithmetic works all subsequent multiplication by zero will take roughly the same time as any regular multiplication.
For some cases there are ways around it though, for example if you want to use a masked loss you can just set the masked outputs to be zero, or detach them from gradients.
This example makes the difference clear:
def time_backward(do_detach):
x = torch.tensor(torch.rand(100000000), requires_grad=True)
y = torch.tensor(torch.rand(100000000), requires_grad=True)
s2 = torch.sum(x * y)
s1 = torch.sum(x * y)
if do_detach:
s2 = s2.detach()
s = s1 + 0 * s2
t = time.time()
s.backward()
print(time.time() - t)
time_backward(do_detach= False)
time_backward(do_detach= True)
outputs:
0.502875089645
0.198422908783

Identity matrix stacking in NumPy

I need a 2n x n matrix in NumPy consisting of the n x n identity matrix and the negative n x n identity matrix stacked on top of one another.
This was my original solution, which works fine.
def id_stack(n):
id_ = np.identity(n)
return np.vstack((id_, -id_))
id_stack(3)
# array([[ 1., 0., 0.],
# [ 0., 1., 0.],
# [ 0., 0., 1.],
# [-1., -0., -0.],
# [-0., -1., -0.],
# [-0., -0., -1.]])
Then I figured I could just set the diagonals instead and be faster like this, which also works.
def id_stack2(n):
full = np.zeros((2*n, n))
rng = np.arange(n)
full[rng, rng] = 1
full[rng + n, rng] = -1
return full
I was wondering if there is an even faster way of accomplishing this, maybe using some kind of stride tricks?
As you probably noticed from your own examples, allocating one big buffer and setting elements in it is generally faster than allocating two smaller buffers and a big buffer to copy them into.
The neat thing about numpy is that you can get views to the same buffer without allocating a new array. For example:
output = np.zeros((2 * n, n))
A useful view in this case is
flat = output.ravel()
You can set every n + 1st element to 1, starting from the first, for a total of n elements in the flattened view, and similar for -1. This requires only a simple indexing operation on the raveled view:
output[:n * n:n + 1] = 1
output[n * n::n + 1] = -1
This avoids creating the full range arrays, and triggering advanced indexing semantics, which are more memory intensive as well.

Expanding tensor using native tensorflow ops

I have a single dimensional data (floats) as shown below:
[-8., 18., 9., -3., 12., 11., -13., 38., ...]
I want to replace each negative element with an equivalent number of zeros.
My result would look something like this for the example above:
[0., 0., 0., 0., 0., 0., 0., 0., 18., 9., 0., 0., 0., 12., ...]
I am able to do this in Tensorflow by using tf.py_func().
But it turns out the graph is not serializable if I use that method.
Are there native tensorflow ops that can help me get the same result?
Not a straightforward task! Here is a pure TensorFlow implementation:
import tensorflow as tf
# Input vector
inp = tf.placeholder(tf.int32, [None])
# Find positive and negative indices
mask = inp < 0
num_inputs = tf.size(inp)
pos_idx, neg_idx = tf.dynamic_partition(tf.range(num_inputs), tf.cast(mask, tf.int32), 2)
# Negative values
negs = -tf.gather(inp, neg_idx)
total_neg = tf.reduce_sum(negs)
cum_neg = tf.cumsum(negs)
# Compute the final index of each positive element
pos_neg_idx = tf.cast(pos_idx[:, tf.newaxis] > neg_idx, inp.dtype)
neg_ref = tf.reduce_sum(pos_neg_idx, axis=1)
shifts = tf.gather(tf.concat([[0], cum_neg], axis=0), neg_ref) - neg_ref
final_pos_idx = pos_idx + shifts
# Compute the final size
final_size = num_inputs + total_neg - tf.size(negs)
# Make final vector by scattering positive values
result = tf.scatter_nd(final_pos_idx[:, tf.newaxis], tf.gather(inp, pos_idx), [final_size])
with tf.Session() as sess:
print(sess.run(result, feed_dict={inp: [-1, 1, -2, 2, 1, -3]}))
Output:
[0 1 0 0 2 1 0 0 0]
There is some "more than necessary" computational cost in this solution, namely the computation of final indices of positive elements through pos_neg_idx, which is O(n2), while it could be done iteratively in O(n). However, I cannot think of a way to replicate the loop iteratively, and a TensorFlow loop (using tf.while_loop) would be awkward and slow. In any case, unless you are using quite large vectors (with evenly distributed positive and negative values) it should not be a big issue.

Is it possible to speed up an element-by-element array operation in Python?

I have a list of times (called times in my code, produced by the code suggested to me in the thread astropy.io fits efficient element access of a large table) and I want to do some statistical tests for periodicity, using Zn^2 and epoch folding tests. Some steps in the code take quite a while to run, and I am wondering if there is a faster way to do it. I have tried the equivalent map and lambda functions, but that takes even longer. My list of times has several hundred or maybe thousands of elements, depending on the dataset. Here is my code:
phase=[(x-mintime)*testfreq[m]-int((x-mintime)*testfreq[m]) for x in times]
# the above step takes 3 seconds for the dataset I am using for testing
# testfreq[m] is just one of several hundred frequencies I am testing
# times is of type numpy.ndarray
phasebin=[int(ph*numbins)for ph in phase]
# 1 second (numbins is 20)
powerarray=[phasebin.count(n) for n in range(0,numbins-1)]
# 0.3 seconds
poweravg=np.mean(powerarray)
chisq[m]=sum([(pow-poweravg)**2/poweravg for pow in powerarray])
# the above 2 steps are very quick
for n in range(0,maxn): # maxn is 3
cosparam=sum([(np.cos(2*np.pi*(n+1)*ph)) for ph in phase])
sinparam=sum([(np.sin(2*np.pi*(n+1)*ph)) for ph in phase])
# these steps each take 4 seconds
z2[m,n]=sum(z2[m,])+(cosparam**2+sinparam**2)/count
# this is quick (count is the number of times)
As this steps through several hundred frequencies on either side of frequencies identified through an FFT search, it takes a very long time to run. The same functionality in a lower level language runs much more quickly, but I need some of the Python modules for plotting, etc. I am hoping that Python can be persuaded to do some of the operations, particularly the phase, phasebin, powerarray, cosparam, and sinparam calculations, significantly faster, but I am not sure how to make this happen. Can anyone tell me how this can be done, or do I have to write and call functions in C or fortran? I know that this could be done in a few minutes e.g. in fortran, but this Python code takes hours as it is.
Thanks very much.
Instead of Python lists, you could use the numpy library, it is much faster for linear algebra type operations. For example to add two arrays in an element-wise fashion
>>> import numpy as np
>>> a = np.array([1,2,3,4,5])
>>> b = np.array([2,3,4,5,6])
>>> a + b
array([ 3, 5, 7, 9, 11])
Similarly, you can multiply arrays by scalars which multiplies each element as you'd expect
>>> 2 * a
array([ 2, 4, 6, 8, 10])
As far as speed, here is the Python list equivalent of adding two lists
>>> c = [1,2,3,4,5]
>>> d = [2,3,4,5,6]
>>> [i+j for i,j in zip(c,d)]
[3, 5, 7, 9, 11]
Then timing the two
>>> from timeit import timeit
>>> setup = '''
import numpy as np
a = np.array([1,2,3,4,5])
b = np.array([2,3,4,5,6])'''
>>> timeit('a+b', setup)
0.521275608325351
>>> setup = '''
c = [1,2,3,4,5]
d = [2,3,4,5,6]'''
>>> timeit('[i+j for i,j in zip(c,d)]', setup)
1.2781205834379108
In this small example numpy was more than twice as fast.
for loop substitute - operating on complete arrays
First multiply phase by 2*pi*n using broadcasting
phase = np.arange(10)
maxn = 3
ens = np.arange(1, maxn+1) # array([1, 2, 3])
two_pi_ens = 2*np.pi*ens
b = phase * two_pi_ens[:, np.newaxis]
b.shape is (3,10) one row for each value of range(1, maxn)
Take the cosine then sum to get the three cosine parameters
c = np.cos(b)
c_param = c.sum(axis = 1) # c_param.shape is 3
Take the sine then sum to get the three sine parameters
s = np.sin(b)
s_param = s.sum(axis = 1) # s_param.shape is 3
Sum of the squares divided by count
d = (np.square(c_param) + np.square(s_param)) / count
# d.shape is (3,)
Assign to z2
for n in range(maxn):
z2[m,n] = z2[m,:].sum() + d[n]
That loop is doing a cumulative sum. numpy ndarrays have a cumsum method.
If maxn is small (3 in your case) it may not be noticeably faster.
z2[m,:] += d
z2[m,:].cumsum(out = z2[m,:])
To illustrate:
>>> a = np.ones((3,3))
>>> a
array([[ 1., 1., 1.],
[ 1., 1., 1.],
[ 1., 1., 1.]])
>>> m = 1
>>> d = (1,2,3)
>>> a[m,:] += d
>>> a
array([[ 1., 1., 1.],
[ 2., 3., 4.],
[ 1., 1., 1.]])
>>> a[m,:].cumsum(out = a[m,:])
array([ 2., 5., 9.])
>>> a
array([[ 1., 1., 1.],
[ 2., 5., 9.],
[ 1., 1., 1.]])
>>>

Solve seemingly (but not actually!) overdetermined sparse linear system in Python

I have a sparse matrix A (using scipy.sparse) and a vector b, and want to solve Ax = b for x. A has more rows than columns, so it appears to be overdetermined; however, the rows of A are linearly dependent, so that in actuality the row rank of A is equal to the number of columns. For example, A could be
A = np.array([[1., 1.], [-1., -1.], [1., 0.]])
while b is
b = np.array([0., 0., 1.])
The solution is then x = [1., -1.]. I'm wondering how to solve this system in Python, using the functions available in scipy.sparse.linalg. Thanks!
Is your system possibly underdetermined? If it is not, and there is actually a solution, then the least squares solution will be that solution, so you can try
from scipy.sparse.linalg import lsqr
return_values = lsqr(A, b)
x = return_values[0]
If your system is actually underdetermined, this should find you the minimum L2 norm solution. If it doesn't work, set the parameter damp to something very small (e.g. 1e-5).
If your system is exactly determined (i.e. A is of full rank) and has a solution, and your matrix A is tall, as you describe it, then you can find an equivalent system in the normal equations:
A.T.dot(A).dot(x) == A.T.dot(b)
has a unique solution in x. This is a square linear system and is thus solvable using linear system solvers such as scipy.sparse.linalg.spsolve
The formally correct way of solving your problem is to use SVD. You have a system of the form
A [MxN] * x [Nx1] = b [Mx1]
The SVD decomposes the matrix A into three others, so you get:
U [MxM] * S[MxN] * V[N*N] * x[Nx1] = b[Mx1]
The matrices U and V are both orthogonal (their inverse is their transpose), and S is a diagonal matrix. If we rewrite the above we get:
S[MxN] * V [N * N] * x[Nx1] = U.T [MxM] * b [Mx1]
If M > N then the matrix S will have its last M - N rows full of zeros, and if your system is truly determined, then U.T b should also have the last M - N rows zero. That means that you can solve your system as:
>>> a = np.array([[1., 1.], [-1., -1.], [1., 0.]])
>>> b = np.array([0., 0., 1.])
>>> u, s, v = np.linalg.svd(a)
>>> np.allclose(u.T.dot(b)[-m+n:], 0) #check system is not overdetermined
True
>>> np.linalg.solve(s[:, None] * v, u.T.dot(b)[:n])
array([ 1., -1.])

Categories