I am currently trying to run a Monte Carlo simulation for which I need more than 5,000,000 iterations (the outputs are still not consistent at that point).
When I try to run it for more than 5 million iterations, however, I get a memory error when re-arranging my array to get the data into a shape I can easily plot.
The error occurs at:
np.array([np.array([run_single_regression(inputs) for x in xrange(iterations)])]).transpose()
This is the function I run:
def Monte_Carlo_regressions(filename, iterations, do_plot=False):
    inputs = data_assignment_regression(filename)
    total_pow, total_energy = np.array([np.array([run_single_regression(inputs) for x in xrange(iterations)])]).transpose()
    if do_plot:
        plot(total_pow, 'Total Power Capacity (GW)')
        plot(total_energy, 'Total Energy Storage Capacity (TWh)')
    return total_pow.mean(0), total_pow.std(0), total_energy.mean(0), total_energy.std(0)
The data_assignment_regression(filename) function returns a set of 1D arrays assigned to the inputs.
The run_single_regression(inputs) function estimates the power and energy outputs for that iteration and returns a numpy array containing the power and energy for that iteration.
How can I avoid the memory error? Is there a way I could re-arrange the array without having to store all the values?
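Not a definitive fix, but since the function ultimately only reports the mean and standard deviation of each quantity, one sketch (reusing the question's own data_assignment_regression and run_single_regression, and assuming each call returns a length-2 array of [power, energy]) is to accumulate running sums instead of keeping every iteration in memory:

import numpy as np

def monte_carlo_streaming(filename, iterations):
    # Accumulate running sums so memory use is constant in the number of iterations.
    inputs = data_assignment_regression(filename)
    total = np.zeros(2)     # running sum of [power, energy]
    total_sq = np.zeros(2)  # running sum of squares
    for _ in range(iterations):
        sample = np.asarray(run_single_regression(inputs), dtype=float)
        total += sample
        total_sq += sample ** 2
    mean = total / iterations
    std = np.sqrt(total_sq / iterations - mean ** 2)  # population std, like ndarray.std()
    return mean[0], std[0], mean[1], std[1]

With millions of samples the naive sum of squares can lose precision, so Welford's online update is the more robust form of the same idea; the trade-off is that the full distribution is no longer available for plotting unless a histogram is accumulated alongside.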
I am fairly new to coding in general and to Python in particular. I am trying to apply a weighted-average scheme to a big dataset. At the moment it takes hours to complete, and I would love to speed it up, also because it has to be repeated several times.
The weighted average is a method used in marine biogeochemistry that incorporates the history of gas transfer velocities (k) between sampling dates. k is weighted by the fraction of the water column (f) ventilated by the atmosphere, as a function of the history of k, with more importance given to values closer to sampling time (the weight at the sampling time step is 1 and decreases moving back in time):
Weighted average equation from https://doi.org/10.1029/2017GB005874, p. 1168; in the notation of the code below it reads
weighted_kw(t) = sum_{i=0..n-1}( omega_i * k(t+i) ) / sum_{i=0..n-1}( omega_i ), with omega_0 = 1, omega_i = omega_{i-1} * (1 - f(t+i-1)), and f = k * Dt / depth.
In my attempt I used a nested for loop where at each time step t I calculated the weighted average:
def kw_omega(k, depth, window, samples_day):
    """
    Calculate the scheme weights for gas transfer velocity of oxygen
    over the previous window of time, where the most recent gas transfer velocity
    has a weight of 1, and the weighting decreases going back in time. The rate of
    decrease depends on the wind history and MLD.

    Parameters
    ----------
    k: ndarray
        instantaneous O2 gas transfer velocity
    depth: ndarray
        water depth
    window: integer
        weighting period in days, which equals the residence time of oxygen at sampling day
    samples_day: integer
        number of samples in each day composing window

    Returns
    ---------
    weighted_kw: ndarray

    Notes
    ---------
    n = the weighting period / the time resolution of the wind data
    samples_day = the time resolution of the wind data
    omega = the weighting coefficient at each time step within the weighting window
    f = the fraction of the water column (mixed layer, photic zone or full water column) ventilated at each time
    """
    Dt = 1. / samples_day
    f = (k * Dt) / depth
    f = np.flip(f)
    k = np.flip(k)
    n = window * samples_day
    weighted_kw = np.zeros(len(k))
    for t in np.arange(len(k) - n):
        omega = np.zeros((n))
        omega[0] = 1.
        for i in np.arange(1, len(omega)):
            omega[i] = omega[i - 1] * (1 - f[t + (i - 1)])
        weighted_kw[t] = sum(k[t:t + n] * omega) / sum(omega)
        print(f"t = {t}")
    return np.flip(weighted_kw)
This should be used on model simulation data which was set to run for almost 2 years with a model time step of 60 seconds, and sampling is done at intervals of 7 days. Therefore k has shape (927360,) and n, the number of minutes in 7 days, equals 10080. At the moment it takes several hours to run. Is there a way to make this calculation faster?
I would recommend using the numba package to speed up your calculation.
import numpy as np
from numba import njit
from numpy.lib.stride_tricks import sliding_window_view

@njit
def k_omega(k_win, f_win):
    # weighted mean over a single window: omega starts at 1 and decays by (1 - f) per step
    delta_t = len(k_win)
    omega_sum = omega = 1.0
    k_omega_sum = k_win[0]
    for t in range(1, delta_t):
        omega *= (1 - f_win[t])
        omega_sum += omega
        k_omega_sum += k_win[t] * omega
    return k_omega_sum / omega_sum

@njit
def windows_k_omega(k_wins, f_wins):
    # apply k_omega to every window
    size = len(k_wins)
    result = np.empty(size)
    for i in range(size):
        result[i] = k_omega(k_wins[i], f_wins[i])
    return result

def kw_omega(k, depth, window, samples_day):
    n = window * samples_day        # delta_t
    f = k / depth / samples_day     # f = k * Dt / depth with Dt = 1 / samples_day
    k_wins = sliding_window_view(k, n)
    f_wins = sliding_window_view(f, n)
    k_omegas = windows_k_omega(k_wins, f_wins)
    weighted_kw = np.pad(k_omegas, (len(k) - len(k_omegas), 0))
    return weighted_kw
Here, I have split the function into three in order to make it more comprehensible. The function k_omega basically applies your weighted-mean formula to a single k and f window. The function windows_k_omega just loops over the windows to apply that function element-wise, and is compiled with numba to make the loop fast. Finally, the outer function kw_omega implements your original interface. It uses the numpy function sliding_window_view to create the moving windows (note that this is a strided view under the hood, so it does not copy the original array), performs the calculation with the helper functions, and takes care of padding the result array with the initial zeros.
A short test against your original function showed somewhat different results, most likely because your np.flip calls reverse the arrays before indexing. I just implemented the formula you provided without checking your indexing in depth, so I leave that task to you. You should call both versions with some small dummy inputs that you can check manually.
An additional note on your code: if you loop over indices, use the built-in range instead of np.arange; range produces the indices lazily instead of allocating an array of indices up front. Also try to reduce the number of arrays you create and re-use them instead, e.g. omega = np.zeros(n) could be created once outside the outer for loop as omega = np.empty(n) and only re-initialized at the start of each iteration with omega[:] = 0.0. This kind of memory management, along with element-by-element array access, is the typical speed penalty in plain numpy code, and there is no compiler to help you with it. That is why I recommend numba, which compiles your Python code and helps in many ways to make the number crunching faster.
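As a concrete way to do that comparison, a small sketch (the dummy numbers are arbitrary, and kw_omega_loop is assumed to be the original loop version from the question, renamed so both can live in one module):

import numpy as np

rng = np.random.default_rng(0)
k = rng.uniform(1.0, 5.0, size=20)    # made-up gas transfer velocities
depth = np.full(20, 30.0)             # made-up mixed-layer depth
window, samples_day = 2, 4            # window of n = 8 samples

print(kw_omega(k, depth, window, samples_day))       # numba/sliding-window version above
print(kw_omega_loop(k, depth, window, samples_day))  # original nested-loop version, for comparison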
I'm stuck on storing arrays of regression coefficients (of different lengths) into a Series. I use a for loop to call the regressions over time. During the training period, the regression only returns an array of size 1 containing the intercept. Beyond the training period, the coefficients consist of the intercept plus the different factors (the factors applicable at each time point may vary). That's where my Python code crashes with the error: ValueError: setting an array element with a sequence
tempS = pd.Series(index=range(len(temp)))
...... # for loop over i
    model = linear_model.OLS(y, x)
    results = model.fit()
    tempS[i] = results.params.values
Can someone shed some light on how to store arrays of different lengths in my Series variable tempS?
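One pattern that can help here (a sketch with made-up data, not tied to the statsmodels objects above): a numeric Series cannot hold a whole array in a single cell, but an object-dtype Series can, and assigning with .at avoids the "setting an array element with a sequence" error:

import numpy as np
import pandas as pd

n = 5
tempS = pd.Series(index=range(n), dtype=object)   # object dtype: each cell may hold an arbitrary array

for i in range(n):
    coeffs = np.arange(i + 1, dtype=float)        # stand-in for results.params.values; length varies with i
    tempS.at[i] = coeffs                          # store the whole coefficient array in one cell

print(tempS)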
Background: I have millions of points in 2D space with (x_position, y_position, value) associated with each point. I am trying to summarize these points by creating an image, where each pixel can contain multiple points. To summarize, each pixel stores the sum of values at that (x_pixel, y_pixel) location in the image.
Question: How can I do this efficiently? Currently, my code does something like this:
image = np.zeros((4096, 4096))
for point in data:
    x_pixel, y_pixel = convertPointPos2PixelPos(point)
    image[x_pixel, y_pixel] += point.getValue()
but the ETA for this code to complete is 450 hours, which is unacceptable. Is there a way to parallelize this? The code writes to the same image[x, y] index multiple times. I found StackOverflow posts that suggest using multiprocessing, but I think the locking needed to prevent race conditions would make it just as slow as not parallelizing at all.
Assuming you want something on a regular grid, you can use simple division to bin your data. Here is an example:
size = (4096, 4096)
data = np.random.rand(100000000, 3)
image = np.zeros(size)
coords = data[:, :2]
min = coords.min(0)
max = coords.max(0)
index = np.floor_divide(coords - min, (max - min) / np.subtract(size, 1), out=np.empty(coords.shape, dtype=int), casting='unsafe')
index is now an array of indices into image where you want to add the corresponding values. You can do an unbuffered add using np.add.at:
np.add.at(image, tuple(index.T), data[:, -1])
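For illustration, a tiny example with made-up values shows why the unbuffered form matters: plain fancy-index assignment adds only once per duplicated index, while np.add.at accumulates every occurrence.

import numpy as np

a = np.zeros(3)
a[[0, 0, 1]] += 1           # buffered: the duplicate index 0 is incremented only once
print(a)                    # [1. 1. 0.]

b = np.zeros(3)
np.add.at(b, [0, 0, 1], 1)  # unbuffered: both occurrences of index 0 accumulate
print(b)                    # [2. 1. 0.]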
If your data range is better defined than just the bounding box of the coordinates, you can save a little time by not computing coords.min(0) and coords.max(0).
The result is a 4096x4096 heatmap of the binned sums (the plot itself is not reproduced here).
This entire operation takes 6.4 sec on my very moderately powered machine for 10M points, including the calls to plt.imshow and plt.colorbar and garbage collection before runs.
Timing collected using the %%timeit cell magic in IPython.
Either way, you're well under 450 hours. Even if your coordinate transformation is not linear binning, I expect you can run in reasonable time as long as you vectorize it properly. Also, multiprocessing is not likely to give you a huge boost, since it requires copying data around.
I am working on a Monte Carlo simulation for oil wells. The end goal is to have all the wells with a smoothed probabilistic production curve. I have optimized what I can, but each of the 3 apply statements I am listing takes a very long time (hours) when I use my full dataset and the number of simulations I want. The code I included runs 10 iterations; if you crank it up to 10,000, which is the goal, it really starts to drag.
I have generated a pandas DataFrame that has all the future wells I want to model, with the probability of each well being chosen next to be drilled.
I then created a DataFrame where I grouped everything into the categories I want to use to figure out the order in which the model will choose the wells. So my "timing" DataFrame contains my categories, an array of the indexes of the wells in each category, and an array of those wells' probabilities.
This all is done in a few seconds. The next part works, but gets very slow.
Next I use a numpy Generator choice with probabilities to randomly generate the order of the wells for i simulations. As other posts have noted, @njit does not work with the probability array. One dimension of the resulting array is the order in which the wells will be chosen within each category, and the other dimension is each simulation. There are about 150 categories, with 10,000s of wells in each category. I am hoping to run 10,000 simulations.
a is an array of indexes of wells that can be chosen
size is the length of that array
p is the probability that each well will get chosen
Next I link my timing DataFrame to the DataFrame with all of the wells in it. This attaches the previous array to the wells array. Then I search this array for the well index to figure out, for each simulation, when that specific well is going to get run. This generates a 1D array of the order in which that well is drilled in each simulation.
This function gets called on 100,000s of wells, and as I increase the number of simulations it really slows down.
order is an array of the order each well is drilled per simulation
index is the index of that well
The final difficulty I am having is averaging out the production curve for the wells. I have how much oil each well will produce per month. I need to insert that curve into the array at each point when the well is drilled, then average all of those values together to get the average production of the well given all the simulations.
I have also tried creating an np.zeros array and then using the np.insert function, but I could not figure out how to insert an array multiple times without a loop, and generating the initial array of 0's took longer than my current method. (I got around inserting the array multiple times by converting everything to a string, inserting the type curve as a string and converting back to an array of numbers, but this did not seem efficient.) I need the number of leading 0's to match the month in which the well is drilled.
order is the time in months at which each well will get drilled
curve is the production curve, passed as a list
m is the highest month in which the well is drilled across all simulations
import numpy as np
from numba import njit
import datetime
import math

def TimingGenerator(a, size, p):
    i = 10
    g = np.random.Generator(np.random.PCG64())
    order = np.concatenate([g.choice(a=a, size=size, replace=False, p=p) for z in range(i)]).reshape(i, size)
    return order

@njit
def OrderGenerator(order, index):
    result = np.where(order == index)[1]
    return result

def CurveAverager(order, curve, m):
    matrix = np.array([[0] * math.ceil(i) + curve + [0] * int(m - math.ceil(i)) for i in order])
    result = np.mean(matrix, axis=0)
    return result

begin_time = datetime.datetime.now()

size = 8000
g = np.random.Generator(np.random.PCG64())
a = g.choice(20_000, size=size, replace=False)
p = np.random.randint(1, 100, size=size)
p = p / np.sum(p)

for i in range(150):
    q = TimingGenerator(a, size, p)
print(datetime.datetime.now() - begin_time)

index = np.amin(q)
for i in range(100000):
    order = OrderGenerator(q, index)
print(datetime.datetime.now() - begin_time)

order = order / 15
curve = list(range(600, 0, -1))
for i in range(20000):
    avgcurve = CurveAverager(order, curve, size)
print(datetime.datetime.now() - begin_time)
Thanks for any help you can offer. I am willing to greatly alter my code if you can think of anything to help speed it up. Not sure if there is a better way to apply probabilities and smooth out the production curve which is really the end goal.
Cheers.
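For the curve-averaging step specifically, one direction worth trying (a sketch, not tested on the real data; it assumes, as in CurveAverager above, that each simulation contributes the curve shifted right by ceil(order[i]) leading zeros in an array of length m + len(curve)) is to accumulate the shifted curves into one running total instead of building the full matrix:

import numpy as np

def curve_averager_accumulate(order, curve, m):
    # Same quantity as CurveAverager's np.mean(matrix, axis=0), computed without
    # materialising the (n_simulations x (m + len(curve))) matrix.
    curve = np.asarray(curve, dtype=float)
    offsets = np.ceil(order).astype(np.int64)        # leading-zero count per simulation
    total = np.zeros(m + len(curve))
    cols = offsets[:, None] + np.arange(len(curve))  # target indices of the curve in each row
    np.add.at(total, cols.ravel(), np.tile(curve, len(offsets)))
    return total / len(offsets)

np.add.at can itself be slow for very large index arrays; np.bincount(cols.ravel(), weights=np.tile(curve, len(offsets)), minlength=len(total)) over the same flattened indices is often a faster way to do the same accumulation.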
I'm using scipy.integrate.ode and would like to know what happens internally when I get the message UserWarning: zvode: Excess work done on this call. (Perhaps wrong MF.) 'Unexpected istate=%s' % istate))
This appears when I call ode.integrate(t1) with a t1 that is too large, so I'm forced to use a for loop and integrate my equation incrementally, which lowers the speed since the solver cannot use its adaptive step size very effectively. I have already tried different methods and settings for the integrator. The maximum number of steps nsteps=100000 is already very large, but with this setting I still can't integrate up to 1000 in one call, which is what I would like to do.
The code I use is:
import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import ode
h_bar=0.658212 #reduced Planck's constant (meV*ps)
m0=0.00568563 #free electron mass (meV*ps**2/nm**2)
m_e=0.067*m0 #effective electron mass (meV*ps**2/nm**2)
m_h=0.45*m0 #effective hole mass (meV*ps**2/nm**2)
m_reduced=1/((1/m_e)+(1/m_h)) #reduced mass of electron and holes combined
kB=0.08617 #Boltzmann's constant (meV/K)
mu_e=-50 #initial chemical potential for electrons
mu_h=-100 #initial chemical potential for holes
k_array=np.arange(0,1.5,0.02) #a list of different k-values
n_k=len(k_array) #number of k-values
def derivative(t, y_list, Gamma, g, kappa, k_list, n_k):
    # initialize output vector
    y_out = np.zeros(3*n_k+1, dtype=complex)
    y_out[0:n_k] = -g*g*2*np.real(y_list[2*n_k:3*n_k])/h_bar
    y_out[n_k:2*n_k] = -g*g*2*np.real(y_list[2*n_k:3*n_k])/h_bar
    y_out[2*n_k:3*n_k] = ((-1.j*(k_list**2/(2*m_reduced))-(Gamma+kappa))*y_list[2*n_k:3*n_k]-y_list[-1]*(1-y_list[n_k:2*n_k]-y_list[0:n_k])+y_list[0:n_k]*y_list[n_k:2*n_k])/h_bar
    y_out[-1] = (2*np.real(g*g*sum(y_list[2*n_k:3*n_k]))-2*kappa*y_list[-1])/h_bar
    return y_out

def dynamics(t_list, N_ini=1e-3, T=300, Gamma=1.36, kappa=0.02, g=0.095):
    # initial values
    t0 = 0  # initial time
    y_initial = np.zeros(3*n_k+1, dtype=complex)
    y_initial[0:n_k] = 1/(1+np.exp(((h_bar*k_array)**2/(2*m_e)-mu_e)/(kB*T)))  # Fermi-Dirac distributions
    y_initial[n_k:2*n_k] = 1/(1+np.exp(((h_bar*k_array)**2/(2*m_h)-mu_h)/(kB*T)))
    t_list = t_list[1:]  # remove t=0 from list (not feasible for the integrator)
    r = ode(derivative).set_integrator('zvode', method='adams', atol=10**-6, rtol=10**-6, nsteps=100000)  # define ode solver
    r.set_initial_value(y_initial, t0)
    r.set_f_params(Gamma, g, kappa, k_array, n_k)
    # create array for output (the +1 accounts for the values at t0=0)
    y_output = np.zeros((len(t_list)+1, len(y_initial)), dtype=complex)
    # insert initial data in output array
    y_output[0] = y_initial
    # perform integration for time steps given by t_list (the +1 accounts for the initial values already in the array)
    for i in range(len(t_list)):
        print(r't = %s' % t_list[i])
        r.integrate(t_list[i])
        if not r.successful():
            print('Integration not successful!!')
            break
        y_output[i+1] = r.y
    return y_output
t_list=np.arange(0,100,5)
data=dynamics(t_list,N_ini=1e-3, T=300, Gamma=1.36,kappa=0.02,g=1.095)
The message means that the solver reached the number of steps specified by the nsteps parameter. Since you asked about the internals, I looked into the Fortran source, which offers this explanation:
-1 means an excessive amount of work (more than MXSTEP steps) was done on this call, before completing the requested task, but the integration was otherwise successful as far as T. (MXSTEP is an optional input and is normally 500.)
The conditional statement that triggers the error is the "GO TO 500" in that routine.
According to LutzL, for your ODE the solver chooses a step size of 2e-4, which means 5,000,000 steps to integrate up to 1000. Your options are:
try such a large value of nsteps (which translates to MXSTEP in the aforementioned Fortran routine),
reduce the error tolerances,
or keep the for loop, as you already do.
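For the first option, the change is confined to the set_integrator call. A sketch, reusing the names from the question (the value 5_000_000 is just the ~1000 / 2e-4 estimate above, not a tested setting):

# Allow enough internal steps for a single integrate() call to reach t = 1000.
r = ode(derivative).set_integrator('zvode', method='adams',
                                   atol=1e-6, rtol=1e-6,
                                   nsteps=5_000_000)  # roughly 1000 / 2e-4 internal steps
r.set_initial_value(y_initial, 0)
r.set_f_params(Gamma, g, kappa, k_array, n_k)
y_final = r.integrate(1000.0)
if not r.successful():
    print('Integration not successful!!')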