Setting up boundary conditions to solve PDEs using the method of lines - Python

Objective: To add boundary/initial conditions (BCs/ICs) to a system of ODEs
I have used the method of lines to convert a system of PDEs into a system of ODEs. The ODEs themselves involve a lot of variables, so I will present simplified versions here.
For the purpose of this simulation, all I am interested in is the structure of the code to set up the BCs/ICs, so I have removed any correlations etc. and just put in constants.
For reference, the actual system is the flow of a gas through a packed bed. For terminology, I am calling the initial conditions the constant conditions along the bed initially (the bed temperature, the fluid temperature, the density of the fluid in the bed) and the boundary conditions would be the mass flow into the bed, the temperature of the flow into the bed.
I have two functions: df, which is used to set up the non-IC/BC ODEs (I am generally happy with the structure of this one):
import numpy as np

def df(t, x, *constants):
    ## the time derivative of x
    a, b, c, d = constants  ## unpack constants
    ## set up the solution array
    x = x.reshape(2, number_of_nodes)
    dxdt = np.zeros_like(x)
    ## internal nodes
    for j in range(2, 2*n, 2):  ## start at index 2 (to avoid ICs/BCs), go to 2*n and increment by 2
        ## equation 1
        dxdt[0][j] = a*(x[1][j-1] - b*x[1][j])
        ## equation 2
        dxdt[1][j] = c*(x[1][j-1] - d*x[1][j])
    return dxdt.reshape(-1)
And initialize_system, which is supposed to set up the BCs/ICs (which is currently incorrect):
def initialize_system(*constants):
    # sets the appropriate initial/boundary conditions
    a, b, c, d = constants  # unpack constants
    x0 = np.zeros((2, number_of_nodes))
    y01 = 1
    y02 = 1
    # set up initial boundary values
    # eq 1
    x0[0][0] = a*(y01 - b*x0[1][0])
    # eq 2
    x0[1][0] = c*(y02 - d*x0[1][0])
    for j in range(2, 2*n, 2):
        # eq 1
        x0[0][j] = a*(y0CO2 - b*x0[1][j])
        # eq 2
        x0[1][j] = b*(y0H2O - d*x0[1][j])
    return x0.reshape(-1)
My question is:
How can I correctly set up the ICs/BCs in initialize_system here?
Edit: There are very few examples of implementing the method of lines in Python on the internet, so I would also highly appreciate any pointers to resources on this.
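For context, a minimal sketch of how I am driving these two functions (the sizes and constants below are placeholders, not the real values):

# minimal driver sketch; sizes and constants are placeholder assumptions
import numpy as np
from scipy.integrate import solve_ivp

number_of_nodes = 10               # placeholder grid size
n = number_of_nodes // 2           # placeholder: df() loops j in range(2, 2*n, 2)
constants = (1.0, 1.0, 1.0, 1.0)   # placeholder values for a, b, c, d

x0 = initialize_system(*constants)            # flattened initial state
sol = solve_ivp(df, (0.0, 10.0), x0,
                args=constants, method='BDF')  # stiff-friendly solver
# each column sol.y[:, k] can be reshaped back with .reshape(2, number_of_nodes)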

Related

Solving natural convection equations (heat and flow) with the shooting method

TL;DR: I've been implementing a Python program to numerically solve the equations for natural convection, based on a particular similarity variable, using Runge-Kutta 4 and the shooting method. However, I don't get the right solutions when I plot them. Did I make a mistake somewhere?
Hi!
Starting from a special case of natural convection, we get these similarity equations:

f''' + 3ff'' - 2(f')^2 + theta = 0
theta'' + 3 Pr f theta' = 0

The first describes the fluid flow, the second describes the heat flow. "Pr" is the Prandtl number, a dimensionless number used in fluid dynamics.
These equations are subject to the boundary values f(0) = f'(0) = 0, theta(0) = 1, f'(eta -> infty) = 0 and theta(eta -> infty) = 0, such that the temperature near the plate is greater than the temperature outside the boundary layer and such that the fluid velocity is 0 far away from the boundary layer.
I've been trying to solve these numerically with Runge-Kutta 4 and the shooting method to transform the boundary value problem into an initial value problem. The shooting method is implemented with Newton's method.
However, I don't get the right solutions.
As you can see in the resulting plot (not reproduced here), the temperature (in red) increases as we move away from the plate, whereas it should decrease exponentially.
It's more consistent for the fluid velocity (in blue); however, I think the speed should go up faster and then come down faster, while here the curve is smoother.
Now, the fact is that we have a system of 2 coupled ODEs. However, right now I'm only trying to find one of the two initial values (e.g. f''(0) = a) such that we have a solution to the boundary value problem (shooting method). Once found, I suppose we have the solution for the whole problem.
I guess I should manage the two together (f''(0) = a; theta'(0) = b), but I don't know how to handle these two in parallel.
Last thing to mention: if I instead try to find the initial value of theta' (i.e. theta'(0)), I don't get the right heat profile.
Here is the code :
"""
The goal is to resolve a 3rd order non-linear ODE for the blasius problem.
It's made of 2 equations (flow / heat)
f''' = 3ff'' - 2(f')^2 + theta
3 Pr f theta' + theta'' = 0
RK4 + Shooting Method
"""
import numpy as np
import math
from scipy.integrate import odeint
from scipy.optimize import newton
from edo_solver.plot import plot
from constants import PRECISION
def blasius_edo(y, t, prandtl):
    f = y[0:3]
    theta = y[3:5]
    return np.array([
        # flow edo
        f[1],  # f' = df/dn
        f[2],  # f'' = d^2f/dn^2
        - 3 * f[0] * f[2] + (2 * math.pow(f[1], 2)) - theta[0],  # f''' = -3ff'' + 2(f')^2 - theta
        # heat edo
        theta[1],  # theta' = dtheta/dn
        - 3 * prandtl * f[0] * theta[1],  # theta'' = -3 Pr f theta'
    ])
def rk4(eta_range, shoot):
    prandtl = 0.01
    # initial values
    f_init = [0, 0, shoot]   # f(0), f'(0), f''(0)
    theta_init = [1, shoot]  # theta(0), theta'(0)
    ci = f_init + theta_init  # concatenate the two sets of initial conditions
    # note: a tuple with a single element must have "," at the end of the tuple
    return odeint(func=blasius_edo, y0=ci, t=eta_range, args=(prandtl,))
"""
if we have :
f'(t_0) = fprime_t0 ; f'(eta -> infty) = fprime_inf
we can transform it into :
f'(t_0) = fprime_t0 ; f''(t_0) = a
we define the function F(a) = f'(infty ; a) - fprime_inf
if F(a) has a root in "a",
then the solutions to the initial value problem with f''(t_0) = a
is also the solution the boundary problem with f'(eta -> infty) = fprime_inf
our goal is to find the root, we have the root...we have the solution.
it can be done with bissection method or newton method.
"""
def shooting(eta_range):
    # boundary value
    fprimeinf = 0  # f'(eta -> infty) = 0
    # initial guess: as far as I understand, it has to be a good guess,
    # otherwise the result can be completely wrong
    initial_guess = 10  # guess for f''(0)
    # define the function to optimize
    # our goal is to take a big eta because eta should approach infty
    # [-1, 1] : last row, second column => f'(eta_final) ~ f'(eta -> infty)
    fun = lambda initial_guess: rk4(eta_range, initial_guess)[-1, 1] - fprimeinf
    # the Newton method solves the ODE system until eta_final,
    # then adjusts the shoot and solves again until the shoot is correct
    shoot = newton(func=fun, x0=initial_guess)
    # solve our system of ODEs with the good "a"
    y = rk4(eta_range, shoot)
    return y
def compute_blasius_edo(title, eta_final):
    ETA_0 = 0
    ETA_INTERVAL = 0.1
    ETA_FINAL = eta_final
    # default values
    x_label = "$\\eta$"
    y_label = "velocity profile $(f'(\\eta))$ / temperature profile $(\\theta)$"
    legends = ["$f'(\\eta)$", "$\\theta$"]
    eta_range = np.arange(ETA_0, ETA_FINAL + ETA_INTERVAL, ETA_INTERVAL)
    # shoot
    y_set = shooting(eta_range)
    plot(eta_range, y_set, title, legends, x_label, y_label)

compute_blasius_edo(
    title="Natural convection - similarity solution",
    eta_final=10
)
I could be completely off base here, but I wrote something similar to solve 1D fluid-reaction-heat equations. Try using solve_ivp with the 'Radau' solver method; it helps with more difficult systems.
Also, maybe try converting your system of ODEs to a system of first-order ODEs, as that may help.
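A minimal sketch of what that could look like for the system above (assuming the same initial conditions as rk4 with shoot = 10; note that solve_ivp expects fun(t, y), the reverse of odeint's (y, t) argument order):

from scipy.integrate import solve_ivp

# wrap blasius_edo, which is written with (y, t) ordering, for solve_ivp
sol = solve_ivp(lambda t, y: blasius_edo(y, t, 0.01),
                t_span=(0, 10),
                y0=[0, 0, 10, 1, 10],   # same ci as rk4 with shoot = 10
                method='Radau', dense_output=True)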
You are implementing an additional but wrong boundary condition, f''(0) = theta'(0), as both slots get the same initial value in the shooting method. You need to keep them separate, giving 2 free variables and thus the need for a 2-dimensional Newton method or any other solver for non-scalar functions.
You could just as well use the solve_bvp routine with a sensible initial guess.
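A minimal sketch of the solve_bvp route (imposing the far-field conditions at eta = 10, with an exponential decay as my assumed initial guess for theta):

import numpy as np
from scipy.integrate import solve_bvp

prandtl = 0.01

def rhs(eta, y):
    # y = [f, f', f'', theta, theta']
    f, fp, fpp, th, thp = y
    return np.vstack([fp,
                      fpp,
                      -3*f*fpp + 2*fp**2 - th,   # f''' from the flow equation
                      thp,
                      -3*prandtl*f*thp])         # theta'' from the heat equation

def bc(ya, yb):
    # f(0) = 0, f'(0) = 0, theta(0) = 1, f'(inf) = 0, theta(inf) = 0
    return np.array([ya[0], ya[1], ya[3] - 1, yb[1], yb[3]])

eta = np.linspace(0, 10, 101)
y_guess = np.zeros((5, eta.size))
y_guess[3] = np.exp(-eta)   # assumed guess: theta decays from 1 to 0
sol = solve_bvp(rhs, bc, eta, y_guess)
# sol.sol(eta)[1] is f'(eta), sol.sol(eta)[3] is theta(eta)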

Using m.CV vs m.Var

I'm optimizing a tubular column design using Gekko Python. I experimented with the code using the different variable types m.SV and m.CV in place of m.Var, and there was no apparent effect on the solver or the results. What purpose do these different variable types serve?
I've included my model below.
from gekko import GEKKO

m = GEKKO()
#%% Constants
pi = m.Const(3.14159,'pi')
P = 2300     # compressive load (kg_f)
o_y = 450    # yield stress (kg_f/cm^2)
E = 0.65e6   # elasticity (kg_f/cm^2)
p = 0.0020   # weight density (kg_f/cm^3)
l = 300      # length of the column (cm)
#%% Variables
d = m.CV(value=8.0,lb=2.0,ub=14.0)  # mean diameter (cm)
t = m.SV(value=0.3,lb=0.2,ub=0.8)   # thickness (cm)
cost = m.Var()
#%% Intermediates
d_i = m.Intermediate(d - t)
d_o = m.Intermediate(d + t)
W = m.Intermediate(p*l*pi*(d_o**2 - d_i**2)/4)  # weight (kg_f)
o_i = m.Intermediate(P/(pi*d*t))  # induced stress
# second moment of area of the cross section of the column
I = m.Intermediate((pi/64)*(d_o**4 - d_i**4))
# buckling stress (Euler buckling load / cross-sectional area)
o_b = m.Intermediate((pi**2*E*I/l**2)*(1/(pi*d*t)))
#%% Equations
m.Equations([
    o_i - o_y <= 0,
    o_i - o_b <= 0,
    cost == 5*W + 2*d
])
#%% Objective
m.Obj(cost)
#%% Solve and print solution
m.options.SOLVER = 1
m.solve()
print('Optimal cost: ' + str(cost[0]))
print('Optimal mean diameter: ' + str(d[0]))
print('Optimal thickness: ' + str(t[0]))
Variables
Variables are values that are adjusted by the solver to satisfy an equation or determine the best outcome among many options. There is typically at least one variable for every equation. To avoid over-specification, a simulation often has equal numbers of equations and variables. For optimization problems, there are typically more variables than equations. The extra variables are changed to minimize or maximize an objective function. There is more information on these objects in the Gekko documentation and APMonitor documentation.
x = m.Var(5) # declare a variable with initial condition
There are also "special" types of variables that perform certain functions. For example, additional equations are added to the model for variables that have a measurement for data reconciliation. To avoid adding these extra equations for all variables, the measurement equations are only added for those designated as Controlled Variables (CVs). State Variables (SVs) may also be measured and are typically designated as such just for monitoring purposes.
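For example, a measurement can be attached to a CV for data reconciliation (a minimal sketch; the measured value here is made up):

from gekko import GEKKO
m = GEKKO()
x = m.CV(value=5)
x.FSTATUS = 1   # feedback status on: use the measurement
x.MEAS = 4.8    # hypothetical measured value, reconciled against the model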
State Variables (SVs)
States are model variables that may be measured or are of special interest for observation. For time-varying simulations, the SVs change over the time horizon to satisfy equation feasibility.
x = m.SV() # state variable
Controlled Variables (CVs)
Controlled variables are model variables that are included in the objective of a controller or optimizer. These variables are controlled to a range, maximized, or minimized. Controlled variables may also be measured values that are included for data reconciliation. For time-varying simulations, the CVs change over the time horizon to satisfy the equations and minimize the objective function.
x = m.CV() # controlled variable
Example Application
There is documentation for options for the different variable and parameter types (FV, MV, SV, CV). Below is a Model Predictive Control Application that shows the use of a Manipulated Variable and Controlled Variable.
from gekko import GEKKO
import numpy as np
import matplotlib.pyplot as plt
m = GEKKO()
m.time = np.linspace(0,20,41)
# Parameters
mass = 500
b = m.Param(value=50)
K = m.Param(value=0.8)
# Manipulated variable
p = m.MV(value=0, lb=0, ub=100)
p.STATUS = 1 # allow optimizer to change
p.DCOST = 0.1 # smooth out gas pedal movement
p.DMAX = 20 # slow down change of gas pedal
# Controlled Variable
v = m.CV(value=0)
v.STATUS = 1 # add the SP to the objective
m.options.CV_TYPE = 2 # squared error
v.SP = 40 # set point
v.TR_INIT = 1 # set point trajectory
v.TAU = 5 # time constant of trajectory
# Process model
m.Equation(mass*v.dt() == -v*b + K*b*p)
m.options.IMODE = 6 # control
m.solve(disp=False)
# get additional solution information
import json
with open(m.path+'//results.json') as f:
    results = json.load(f)
plt.figure()
plt.subplot(2,1,1)
plt.plot(m.time,p.value,'b-',label='MV Optimized')
plt.legend()
plt.ylabel('Input')
plt.subplot(2,1,2)
plt.plot(m.time,results['v1.tr'],'k-',label='Reference Trajectory')
plt.plot(m.time,v.value,'r--',label='CV Response')
plt.ylabel('Output')
plt.xlabel('Time')
plt.legend(loc='best')
plt.show()

Weights minimization issue with linprog

I am trying to use Python (and at present failing) to come to a more efficient solution than Excel Solver provides for an optimization problem.
Matrices
The problem is of the form AB = C → D,
where AB produces C, and the absolute value of C − D for each row of the matrix is to be minimized.
I have seven funds contained in matrix B, all of which have geographic exposure of the form
FUND_NAME = np.array([UK, USA, EuroZone, Japan, EM, Apac])
as below
RLS = np.array([0.788743177, 0.168048481,0,0.043208342,0,0])
LIOGLB=np.array([0.084313978,0.578528092,0,0.23641746,0.033709666,0.067030804])
LIONEUR=np.array([0.055032339,0,0,0.944967661,0,0])
STEW_WLDWD=np.array([0.09865472,0.210582713,0.053858632,0.431968002,0.086387178,0.118548755])
EMMK=np.array([0.080150377,0.025212864,0.597285513,0.031832241,0.212440426,0.053078578])
PAC=np.array([0,0.013177633,0.41273195,0,0.510644775,0.063445642])
PICTET=np.array([0.089520913,0.635857603,0,0.218148413,0.023290413,0.033182659])
From this I need to construct an optimal weighting of the seven funds using a matrix (imaginatively named A) [x1,x2,x3,x4,x5,x6,x7] with x1+x2+...+x7 = 1, and also, for i = 1,...,7:
xi lower bound = 0
xi upper bound = 0.25
to arrive at actual regional weights (matrix C) as close as possible to the Target array below (which corresponds to matrix D above):
Target=np.array([0.2310,0.2576,0.1047,0.1832,0.1103,0.1131])
I've tried using linprog, but I know that the answer I am getting is wrong.
Funds =np.array([RLS,LIOGLB, LIONEUR,STEW_WLDWD, EMMK,PAC,PICTET])
twentyfive=np.full((1, 7), 0.25)
bounds=[0,0.25]
res = linprog(Target,A_ub=Funds,b_ub=twentyfive,bounds=[bounds])
Can anyone help me move on from excel ?
This is really a LAD regression problem (LAD = Least Absolute Deviation) with some side constraints. Different LP formulations for LAD regression problems can be found here. Based on the sparse bounding problem, we can formulate the LP model below, where x are the allocations (fractions) we need to find, d are the residuals of the linear fit, and r are the absolute values of d:

min  sum(r)
s.t. B x + d = target
     sum(x) = 1
     -r <= d <= r
     0 <= x <= 0.25, d free, r >= 0

linprog requires an explicit LP matrix. For the model above, the decision vector is the stacked (x, d, r), so A_eq has the block structure [[B, I, 0], [1', 0, 0]] and A_ub encodes -d - r <= 0 and d - r <= 0.
With this it is no longer very difficult to develop a Python implementation. The Python code can look like:
import numpy as np
import scipy.optimize as sp
B = np.array([[0.788743177, 0.168048481,0,0.043208342,0,0],
[0.084313978,0.578528092,0,0.23641746,0.033709666,0.067030804],
[0.055032339,0,0,0.944967661,0,0],
[0.09865472,0.210582713,0.053858632,0.431968002,0.086387178,0.118548755],
[0.080150377,0.025212864,0.597285513,0.031832241,0.212440426,0.053078578],
[0,0.013177633,0.41273195,0,0.510644775,0.063445642],
[0.089520913,0.635857603,0,0.218148413,0.023290413,0.033182659]]).T
target = np.array([0.2310,0.2576,0.1047,0.1832,0.1103,0.1131])
m,n = np.shape(B)
A_eq = np.block([[B, np.eye(m), np.zeros((m,m))],
[np.ones(n), np.zeros(m), np.zeros(m)]])
A_ub = np.block([[np.zeros((m,n)),-np.eye(m), -np.eye(m)],
[np.zeros((m,n)),np.eye(m), -np.eye(m)]])
b_eq = np.block([target,1])
b_ub = np.zeros(2*m)
c = np.block([np.zeros(n),np.zeros(m),np.ones(m)])
bnd = n*[(0,0.25)] + m*[(None,None)] + m*[(0,None)]
res = sp.linprog(c,A_ub,b_ub,A_eq,b_eq,bnd,options={'disp':True})
allocation = res.x[0:n]
The results look like:
Primal Feasibility Dual Feasibility Duality Gap Step Path Parameter Objective
1.0 1.0 1.0 - 1.0 6.0
0.3777262386888 0.3777262386888 0.3777262386888 0.6478228594143 0.3777262386888 0.3200496644143
0.08438152300367 0.08438152300366 0.08438152300367 0.8087424108466 0.08438152300366 0.1335722585582
0.01563291142478 0.01563291142478 0.01563291142478 0.8341722620104 0.01563291142478 0.1118298108651
0.004083901923022 0.004083901923022 0.004083901923023 0.7432737130498 0.004083901923024 0.1049630948572
0.0006190254179117 0.0006190254179117 0.0006190254179116 0.8815177164943 0.000619025417913 0.1016021916581
3.504935403199e-05 3.504935403066e-05 3.504935403079e-05 0.9676694788778 3.504935402756e-05 0.1012177893279
5.983549975387e-07 5.98354980932e-07 5.983549810074e-07 0.9885372873161 5.983549719474e-07 0.1011921413019
3.056236812029e-11 3.056401712736e-11 3.056394819773e-11 0.9999489201822 3.056087926755e-11 0.1011915586046
Optimization terminated successfully.
Current function value: 0.101192
Iterations: 8
print(allocation)
[2.31621461e-01 2.50000000e-01 9.07425872e-12 2.50000000e-01
4.45030949e-10 2.39692743e-01 2.86857955e-02]
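As a quick sanity check (a small sketch using B, target and allocation from the code above), we can recompute the achieved regional weights and the LAD objective; the sum of absolute deviations should match the reported objective of about 0.1012:

achieved = B @ allocation                # actual regional weights (matrix C)
print(achieved - target)                 # per-region deviations d
print(np.abs(achieved - target).sum())   # ~0.101192, the LP objective value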

Working out an equation

I'm trying to solve a differential equation numerically, and am writing code that will give me an array of the solution at each time point.
import numpy as np
import matplotlib.pylab as plt
pi=np.pi
sin=np.sin
cos=np.cos
sqrt=np.sqrt
alpha=pi/4
g=9.80665
y0=0.0
theta0=0.0
sina = sin(alpha)**2
second_term = g*sin(alpha)*cos(alpha)
x0 = float(raw_input('What is the initial x in meters?'))
x_vel0 = float(raw_input('What is the initial velocity in the x direction in m/s?'))
y_vel0 = float(raw_input('what is the initial velocity in the y direction in m/s?'))
t_f = int(raw_input('What is the maximum time in seconds?'))
r0 = x0
vtan = sqrt(x_vel0**2+y_vel0**2)
dt = 1000
n = range(0,t_f)
r_n = r0*(n*dt)
r_nm1 = r0((n-1)*dt)
F_r = ((vtan**2)/r_n)*sina-second_term
r_np1 = 2*r_n - r_nm1 + dt**2 * F_r
data = [r0]
for time in n:
    data.append(float(r_np1))
print data
I'm not sure how to make the equation solve for r_np1 at each time in the range n. I'm still new to Python and would like some help understanding how to do something like this.
First issue is:
n = range(0,t_f)
r_n = r0*(n*dt)
Here you define n as a list and try to multiply the list n with the integer dt. This will not work. Pure Python is NOT a vectorized language like NumPy or Matlab where you can do vector multiplication like this. You could make this line work with
n = np.arange(0,t_f)
r_n = r0*(n*dt),
but you don't have to. Instead, you should move everything inside the for loop to do the calculation at each timestep. At the present point, you do the calculation once, then append that same single result t_f times to the data list.
Of course, you have to leave your initial conditions (which are a key part of ODE solving) OUTSIDE of the loop, because they only affect the first step of the solution, not all of them.
So:
# Initial conditions
r0 = x0
data = [r0]

# Loop over timesteps
for n in range(t_f):
    # calculations performed at each timestep
    vtan = sqrt(x_vel0**2 + y_vel0**2)
    dt = 1000
    r_n = r0*(n*dt)
    r_nm1 = r0*((n-1)*dt)
    F_r = ((vtan**2)/r_n)*sina - second_term
    r_np1 = 2*r_n - r_nm1 + dt**2 * F_r
    # append result to output list
    data.append(float(r_np1))

# do something with output list
print data
plt.plot(data)
plt.show()
I did not add any code, only rearranged your lines. Notice that the part:
n = range(0,t_f)
for time in n:
can be simplified to:
for time in range(0,t_f):
However, you use n as a time variable in the calculation (previously - and wrongly - defined as a list instead of a single number). Thus you can write:
for n in range(0,t_f):
Note 1: I do not know if this code is right mathematically, as I don't even know the equation you're solving. The code runs now and produces a result - you have to check whether that result is good.
Note 2: Pure Python is not the best tool for this purpose. You should try some of the highly optimized built-ins of SciPy for ODE solving, as you have already been given hints in the comments.
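For illustration, if the intended equation is r'' = (vtan^2/r)*sin^2(alpha) - g*sin(alpha)*cos(alpha), which is what the central-difference update above approximates, a minimal SciPy sketch could look like this (the initial values are placeholders I made up):

import numpy as np
from scipy.integrate import solve_ivp

alpha, g = np.pi/4, 9.80665
sina = np.sin(alpha)**2
second_term = g*np.sin(alpha)*np.cos(alpha)
vtan = 5.0               # placeholder tangential speed
r0, rdot0 = 1.0, 0.0     # placeholder initial radius and radial velocity

def rhs(t, y):
    r, rdot = y          # state: radius and its time derivative
    return [rdot, (vtan**2/r)*sina - second_term]

sol = solve_ivp(rhs, (0.0, 10.0), [r0, rdot0], max_step=0.1)
print(sol.y[0])          # r(t) at the solver's time points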

python - combining argsort with masking to get nearest values in moving window

I have some code for calculating missing values in an image, based on neighbouring values in a 2D circular window. It also uses the values from one or more temporally-adjacent images at the same locations (i.e. the same 2D window shifted in the 3rd dimension).
For each position that is missing, I need to calculate the value based not necessarily on all the values available in the whole window, but only on the spatially-nearest n cells that do have values (in both images / Z-axis positions), where n is some value less than the total number of cells in the 2D window.
At the moment, it's much quicker to calculate for everything in the window, because my means of sorting to get the nearest n cells with data is the slowest part of the function, as it has to be repeated each time even though the distances in terms of window coordinates do not change. I'm not sure this is necessary and feel I must be able to get the sorted distances once, and then mask them in the process of only selecting available cells.
Here's my code for selecting the data to use within a window of the gap cell location:
import numpy as np

# radius will in reality be ~100
radius = 2
y,x = np.ogrid[-radius:radius+1, -radius:radius+1]
dist = np.sqrt(x**2 + y**2)
circle_template = dist > radius
# this will in reality be a very large 3 dimensional array
# representing daily images with some gaps, indicated by 0s
dataStack = np.zeros((2,5,5))
dataStack[1] = (np.random.random(25) * 100).reshape(dist.shape)
dataStack[0] = (np.random.random(25) * 100).reshape(dist.shape)
testdata = dataStack[1]
alternatedata = dataStack[0]
random_gap_locations = (np.random.random(25) * 30).reshape(dist.shape) > testdata
testdata[random_gap_locations] = 0
testdata[radius, radius] = 0
# in reality we will go through every gap (zero) location in the data
# for each image and for each gap use slicing to get a window of
# size (radius*2+1, radius*2+1) around it from each image, with the
# gap being at the centre i.e.
# testgaplocation = [radius, radius]
# and the variables testdata, alternatedata below will refer to these
# slices
locations_to_exclude = np.logical_or(circle_template,
                                     np.logical_or(testdata == 0, alternatedata == 0))
# the places that are inside the circular mask and where both images
# have data
locations_to_include = ~locations_to_exclude
number_available = np.count_nonzero(locations_to_include)
# we only want to do the interpolation calculations from the nearest n
# locations that have data available, n will be ~100 in reality
number_required = 3
available_distances = dist[locations_to_include]
available_data = testdata[locations_to_include]
available_alternates = alternatedata[locations_to_include]
if number_available > number_required:
    # In this case we need to find the closest number_required elements, based
    # on distances recorded in dist, from available_data and available_alternates.
    # Having to repeat this argsort for each gap-cell calculation is slow and feels
    # like it should be avoidable
    sortedDistanceIndices = available_distances.argsort(kind='mergesort', axis=None)
    requiredIndices = sortedDistanceIndices[0:number_required]
    selected_data = np.take(available_data, requiredIndices)
    selected_alternates = np.take(available_alternates, requiredIndices)
else:
    # we just use available_data and available_alternates as they are...
    pass

# now do stuff with the selected data to calculate a value for the gap cell
This works, but over half of the total time of the function is taken by the argsort of the masked spatial distance data (~900 µs of a total 1.4 ms - and this function will be running tens of billions of times, so this is an important difference!).
I am sure that I must be able to do this argsort just once, outside of the function, when the spatial distance window is originally set up, and then include those sort indices in the masking, to get the first howManyToCalculate indices without having to re-do the sort. The answer might involve putting the various bits that we are extracting from into a record array - but I can't figure out how, if so. Can anyone see how I can make this part of the process more efficient?
So you want to do the sorting outside of the loop:
sorted_dist_idcs = dist.argsort(kind='mergesort', axis=None)
Then, using some variables from the original code, this is what I could come up with, though it still feels like a major round-trip...
loc_to_incl_sorted = locations_to_include.take(sorted_dist_idcs)
sorted_dist_idcs_to_incl = sorted_dist_idcs[loc_to_incl_sorted]
required_idcs = sorted_dist_idcs_to_incl[:number_required]
selected_data = testdata.take(required_idcs)
selected_alternates = alternatedata.take(required_idcs)
Note that required_idcs refers to locations in testdata and not available_data as in the original code. In this snippet I used take for the purpose of conveniently indexing the flattened array.
#moarningsun - thanks for the comment and answer. These got me on the right track, but don't quite work for me when the gap is < radius from the edge of the data: in this case I use a window around the gap cell which is "trimmed" to the data bounds. In this situation the indices reflect the "full" window and thus can't be used to select cells from the bounded window.
Unfortunately I edited that part of my code out when I clarified the original question but it's turned out to be relevant.
I've realised now that if you use argsort again on the output of argsort, then you get ranks, i.e. the position that each item would have when the overall array was sorted. We can safely mask these and then take the smallest number_required of them (and do this on a structured array to get the corresponding data at the same time).
This implies another sort within the loop, but in fact we can use partitioning rather than a full sort, because all we need are the smallest num_required items. If num_required is substantially less than the number of data items, then this is much faster than doing the argsort.
For example, with num_required = 80 and num_available = 15000, the full argsort takes ~900 µs whereas argpartition followed by index and slice to get the first 80 takes ~110 µs. We still need to do the argsort to get the ranks at the outset (rather than just partitioning based on distance) in order to get the stability of the mergesort, and thus get the "right one" when distance is not unique.
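A tiny self-contained illustration of the double-argsort / argpartition trick (with made-up values):

import numpy as np

dist_flat = np.array([3.0, 1.0, 2.0, 1.0])
ranks = dist_flat.argsort(kind='mergesort').argsort()  # stable ranks: [3 0 2 1]

mask = np.array([True, False, True, True])   # cells that have data
masked_ranks = ranks[mask]
number_required = 2
# indices, within the masked subset, of the number_required nearest cells
nearest = np.argpartition(masked_ranks, number_required)[:number_required]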
My code as shown below now runs in ~610 µs on real data, including the actual calculations that aren't shown here. I'm happy with that now, but there seem to be several other, apparently minor, factors that can influence the runtime in ways that are hard to understand.
For example, putting the circle_template in the structured array along with dist, ranks, and another field not shown here doubles the runtime of the overall function (even if we don't access circle_template in the loop!). Even worse, using np.partition on the structured array with order=['ranks'] increases the overall function runtime by almost two orders of magnitude vs using np.argpartition as shown below!
import numpy as np

# radius will in reality be ~100
radius = 2
y,x = np.ogrid[-radius:radius+1, -radius:radius+1]
dist = np.sqrt(x**2 + y**2)
circle_template = dist > radius
ranks = dist.argsort(axis=None,kind='mergesort').argsort().reshape(dist.shape)
diam = radius * 2 + 1
# putting circle_template in this array too doubles overall function runtime!
# (float64 dtypes assumed here for the data fields, matching dataStack below)
fullWindowArray = np.zeros((diam, diam), dtype=[('ranks', ranks.dtype.str),
                                                ('thisdata', 'f8'),
                                                ('alternatedata', 'f8'),
                                                ('dist', dist.dtype.str)])
fullWindowArray['ranks'] = ranks
fullWindowArray['dist'] = dist
# this will in reality be a very large 3 dimensional array
# representing daily images with some gaps, indicated by 0s
dataStack = np.zeros((2,5,5))
dataStack[1] = (np.random.random(25) * 100).reshape(dist.shape)
dataStack[0] = (np.random.random(25) * 100).reshape(dist.shape)
testdata = dataStack[1]
alternatedata = dataStack[0]
random_gap_locations = (np.random.random(25) * 30).reshape(dist.shape) > testdata
testdata[random_gap_locations] = 0
testdata[radius, radius] = 0
# in reality we will loop here to go through every gap (zero) location in the data
# for each image
gapz, gapy, gapx = 1, radius, radius
desLeft, desRight = gapx - radius, gapx + radius+1
desTop, desBottom = gapy - radius, gapy + radius+1
extentB, extentR = dataStack.shape[1:]
# handle the case where the gap is < search radius from the edge of
# the data. If this is the case, we can't use the full
# diam * diam window
dataL = max(0, desLeft)
maskL = 0 if desLeft >= 0 else abs(dataL - desLeft)
dataT = max(0, desTop)
maskT = 0 if desTop >= 0 else abs(dataT - desTop)
dataR = min(desRight, extentR)
maskR = diam if desRight <= extentR else diam - (desRight - extentR)
dataB = min(desBottom,extentB)
maskB = diam if desBottom <= extentB else diam - (desBottom - extentB)
# get the slice that we will be working within
# ranks, dist and circle are already populated
boundedWindowArray = fullWindowArray[maskT:maskB,maskL:maskR]
boundedWindowArray['alternatedata'] = alternatedata[dataT:dataB, dataL:dataR]
boundedWindowArray['thisdata'] = testdata[dataT:dataB, dataL:dataR]
# circle_template is sliced to the bounded window here, since it is
# (deliberately) not stored in the structured array
locations_to_exclude = np.logical_or(circle_template[maskT:maskB, maskL:maskR],
                                     np.logical_or(boundedWindowArray['thisdata'] == 0,
                                                   boundedWindowArray['alternatedata'] == 0))
# the places that are inside the circular mask and where both images
# have data
locations_to_include = ~locations_to_exclude
number_available = np.count_nonzero(locations_to_include)
# we only want to do the interpolation calculations from the nearest n
# locations that have data available, n will be ~100 in reality
number_required = 3
data_to_use = boundedWindowArray[locations_to_include]
if number_available > number_required:
    # argpartition seems to be very fast when number_required is
    # substantially < data_to_use.size.
    # But partition on the structured array itself with order=['ranks']
    # is almost 2 orders of magnitude slower!
    reqIndices = np.argpartition(data_to_use['ranks'], number_required)[:number_required]
    data_to_use = np.take(data_to_use, reqIndices)
else:
    # we just use the available data as it is...
    pass
# now do stuff with the selected data to calculate a value for the gap cell
