How to interpolate with numpy.polyfit and numpy.polyval in Python

I did a numpy.polyfit() for latitude, longitude, and altitude data for a satellite orbit and interpolated (50 points) with numpy.polyval().
Now I want to take a window (0-4.5 degrees longitude) and do a higher-resolution interpolation (6,000 points). I think I need to use the fit coefficients from the first low-res fit in order to interpolate within my longitude window, but I am not quite sure how to do this.
Inputs:
lat = [27.755611104020687, 22.50661883405905, 17.083576087905502, 11.53891099628959, 5.916633366002468, 0.2555772624429494, -5.407902834141322, -11.037514984810027, -16.594621304857206, -22.03556688048686, -27.308475759820045, -32.34927891621322, -37.07690156937186, -41.38803163295967, -45.15306971601912, -48.21703193866987, -50.41165326774015, -51.58419672864487, -51.63883932997542, -50.57025116952513, -48.46557920053242, -45.47329014246061, -41.76143266388077, -37.48707787049647, -32.782653540783, -27.754184631685046, -22.48503337048438, -17.041097574740743, -11.475689837873944, -5.833592289780744, -0.1543286595142316, 5.525119007560692, 11.167878192881306, 16.73476477885508, 22.18160021405449, 27.455997555900108, 32.493386953033685, 37.21222272985329, 41.508824407948275, 45.25350232626601, 48.291788915858554, 50.45698534747271, 51.59925055739275, 51.62660832560593, 50.53733379179681, 48.420673231121725, 45.42531420150485, 41.71819693220144, 37.45473807165676, 32.76569228387106]
lon = [-109.73105744378498, -104.28690174554579, -99.2435132929552, -94.48533149079628, -89.91054414962821, -85.42671400689177, -80.94616150449806, -76.38135021210172, -71.6402674905218, -66.62178379632216, -61.21120467960157, -55.27684029674759, -48.66970878028004, -41.23083703244677, -32.813881865289346, -23.332386757370532, -12.832819226213942, -1.5659455609661785, 10.008077792630402, 21.33116444634303, 31.92601575632583, 41.51883213364072, 50.04498630545507, 57.58103957109249, 64.26993028992476, 70.2708323505337, 75.73441871754586, 80.7944079829813, 85.56734813043659, 90.1558676264546, 94.65309120129724, 99.14730128118617, 103.72658922048785, 108.48349841714494, 113.51966824008079, 118.95024882101737, 124.9072309203375, 131.5395221402974, 139.00523971191907, 147.44847902856114, 156.95146022590976, 167.46163867248032, 178.72228750873975, -169.72898181991064, -158.44642409799974, -147.8993300787564, -138.35373014113995, -129.86955508919888, -122.36868103811106, -115.70852432245486]
alt = [374065.49207488785, 372510.1635949105, 371072.75959230476, 369836.3092635453, 368866.7921820211, 368209.0950216997, 367884.3703536549, 367888.97894243425, 368195.08833668986, 368752.88080031495, 369494.21701128664, 370337.49662954226, 371193.3839051864, 371971.0136622536, 372584.272228585, 372957.752022573, 373032.0104747458, 372767.8112563471, 372149.0940816824, 371184.49208500446, 369907.2992362557, 368373.8795969478, 366660.5935723809, 364859.4071422184, 363072.42955020745, 361405.69765685993, 359962.58417682414, 358837.24421522504, 358108.5277743581, 357834.7679493668, 358049.8054538341, 358760.531463618, 359946.1257064284, 361559.04646970675, 363527.70518032915, 365760.6377191965, 368151.8843206526, 370587.2165838985, 372950.8014553002, 375131.8814988529, 377031.06540952163, 378565.8596562773, 379675.13241518533, 380322.2707576381, 380496.8682141012, 380214.86538256245, 379517.14674525027, 378466.68079100474, 377144.36811517406, 375643.83731560566]
myOrbitJ2000Time =[ 20027712., 20027713., 20027714., 20027715., 20027716.,
20027717., 20027718., 20027719., 20027720., 20027721.,
20027722., 20027723., 20027724., 20027725., 20027726.,
20027727., 20027728., 20027729., 20027730., 20027731.,
20027732., 20027733., 20027734., 20027735., 20027736.,
20027737., 20027738., 20027739., 20027740., 20027741.,
20027742., 20027743., 20027744., 20027745., 20027746.,
20027747., 20027748., 20027749., 20027750., 20027751.,
20027752., 20027753., 20027754., 20027755., 20027756.,
20027757., 20027758., 20027759., 20027760., 20027761.]
Code:
import numpy as np

deg = 30  # polynomial degree for fit
fittime = np.asarray(myOrbitJ2000Time) - myOrbitJ2000Time[0]

# Latitude interpolation
fitLat = np.polyfit(fittime, lat, deg)
polyval_lat = np.polyval(fitLat, fittime)

# Longitude interpolation
fitLon = np.polyfit(fittime, lon, deg)
polyval_lon = np.polyval(fitLon, fittime)

# Altitude interpolation
fitAlt = np.polyfit(fittime, alt, deg)
polyval_alt = np.polyval(fitAlt, fittime)
# Get lat, lon, & alt values for a window of 0-4.5 deg longitude
lonwindow = []
latwindow = []
altwindow = []
for i in range(len(polyval_lat)):
    if 0 < polyval_lon[i] < 4.5:          # get lon vals in window
        lonwindow.append(polyval_lon[i])  # append lon vals
        latwindow.append(polyval_lat[i])  # append corresponding lat vals
        altwindow.append(polyval_alt[i])  # append corresponding alt vals
lonwindow = np.array(lonwindow)
To be clear: only one of the interpolated points falls in the window range, so I want to reuse the fitted curve from the previous step to interpolate again and generate 6,000 points inside the window.

First, generate the polynomial fit coefficients using the original time (x-axis) values and the interpolated longitude (y-axis) values from the first fit.
import numpy as np
import matplotlib.pyplot as plt

poly_deg = 3  # degree of the polynomial fit
# original_times: the time values used in the first fit (e.g. fittime above)
# interp_lon: the interpolated longitudes from the first fit (e.g. polyval_lon above)
polynomial_fit_coeff = np.polyfit(original_times, interp_lon, poly_deg)
Next, use np.linspace() to generate arbitrary time values based on the number of desired points in the window.
start = 0
stop = 4
num_points = 6000
arbitrary_time = np.linspace(start, stop, num_points)
Finally, use the fit coefficients and the arbitrary time to get the actual interpolated longitude (y-axis) values and plot.
lon_intrp_2 = np.polyval(polynomial_fit_coeff, arbitrary_time)
plt.plot(arbitrary_time, lon_intrp_2, 'r') #interpolated window as a red curve
plt.plot(myOrbitJ2000Time, lon, '.') #original data plotted as points
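Putting the pieces together, here is a minimal end-to-end sketch of the same two-step approach, assuming the question's myOrbitJ2000Time and lon arrays and the degree-3 second fit and 0-4 time range chosen above:
import numpy as np
import matplotlib.pyplot as plt

# low-res fit from the question (degree 30 over the 50 samples)
fittime = np.asarray(myOrbitJ2000Time) - myOrbitJ2000Time[0]
fitLon = np.polyfit(fittime, lon, 30)
interp_lon = np.polyval(fitLon, fittime)

# second fit on the interpolated curve, evaluated at 6,000 points
poly_deg = 3
coeff = np.polyfit(fittime, interp_lon, poly_deg)
arbitrary_time = np.linspace(0, 4, 6000)
lon_intrp_2 = np.polyval(coeff, arbitrary_time)

plt.plot(arbitrary_time, lon_intrp_2, 'r')  # high-resolution window as a red curve
plt.plot(fittime, lon, '.')                 # original samples as points
plt.show()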

Related

Subtract, Average, Squeeze, then Subset Variable in NetCDF via Python

I have a NetCDF file with daily pollution data across the U.S. I am hoping to subtract a background variable ('nosmoke background PM25') from another ('PM25'), then average the result to create an annual value. From what I understand, creating an annual value changes my array from three dimensions (time, lat, lon) to two (lat, lon), since there are no longer daily values. I then hope to subset this variable to include only a region of interest. Below is my code. Any help is much appreciated.
import netCDF4 as nc
import numpy as np
fn = '24hr_SmokePM25_2018.nc'
ds = nc.Dataset(fn)
#All data defining total PM25
TOT_PM25 = ds.variables['PM25']
NS_BKGD_PM25 = ds.variables['nosmoke background PM25']
#create empty array that is the same size as the TOT_PM25 variable, to fill
Smoke_PM25 = np.full_like(TOT_PM25, 0)
#For loop subtracting no-smoke PM25 from total PM25
for i in range(1, 365):
    Smoke_PM25[i,:,:] = TOT_PM25[i,:,:] - NS_BKGD_PM25[i,:,:]
#Averaging and squeezing the daily values
np.mean(Smoke_PM25[i,:,:])
np.squeeze(Smoke_PM25, axis=0)
#boundary input for desired region to pull
latbounds = [ 24 , 39 ]
lonbounds = [ -96 , -77 ]
lons = ds.variables['lon'][:]
lats = ds.variables['lat'][:]
# latitude lower and upper index
latli = np.argmin( np.abs( lats - latbounds[0] ) )
latui = np.argmin( np.abs( lats - latbounds[1] ) )
# longitude lower and upper index
lonli = np.argmin( np.abs( lons - lonbounds[0] ) )
lonui = np.argmin( np.abs( lons - lonbounds[1] ) )
# Value subsets (SmokeP25, latitude, longitude)
Smoke_Region_PM25 = ds.variables['PM25'][ : , latli:latui , lonli:lonui ]
#Close the NetCDF
ds.close()
You can do this easily with v0.3.0 of my nctoolkit package (https://pypi.org/project/nctoolkit/).
First read the data in:
import nctoolkit as nc
fn = '24hr_SmokePM25_2018.nc'
data = nc.open_data(fn)
Then subtract one variable from the other to create your new variable:
data.assign(new = lambda x: x.PM25 - x.nosmoke_background_PM25)
Once that is done you can calculate an annual mean:
data.tmean("year")
Then visualize the results:
data.plot("new")
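Alternatively, staying with the question's netCDF4/numpy approach, the daily loop can be replaced by whole-array operations. A minimal sketch, assuming the file name, variable names, and bounds from the question:
import netCDF4 as nc
import numpy as np

ds = nc.Dataset('24hr_SmokePM25_2018.nc')
tot = ds.variables['PM25'][:]                      # (time, lat, lon)
bkgd = ds.variables['nosmoke background PM25'][:]  # (time, lat, lon)

smoke = tot - bkgd                  # daily smoke contribution
smoke_annual = smoke.mean(axis=0)   # annual mean -> (lat, lon)

# regional subset using the grid indices nearest to the bounds
lats = ds.variables['lat'][:]
lons = ds.variables['lon'][:]
latli, latui = np.argmin(np.abs(lats - 24)), np.argmin(np.abs(lats - 39))
lonli, lonui = np.argmin(np.abs(lons - (-96))), np.argmin(np.abs(lons - (-77)))

smoke_region = smoke_annual[latli:latui, lonli:lonui]
ds.close()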

Xarray Data Array from netcdf returns numpy grid array larger than input

I have a netcdf file with float values representing chlorophyll concentration at latitudes and longitudes. I am trying to draw a line between two sets of lats/lons and return all chlorophyll values from points on the line.
I'm approaching it from a geometry point of view: for points (x1, y1) and (x2, y2), find the slope and intercept of the line and compute the y value for each chosen x value along the line. Once I have all the x and y values (longitude and latitude), I hope to pass them to xarray's select method to return the chlorophyll concentration.
import xarray
import numpy as np

ds = '~/apr1.nc'
ds = xarray.open_dataset(ds, decode_times=False)

x1, y1 = [34.3282, 32.4791]
x2, y2 = [34.7, 32.21]
slope = (y2 - y1) / (x2 - x1)
intercept = y1 - (slope * x1)

step = 0.1  # spacing consistent with the line_lons printed below
line_lons = np.arange(x1, x2, step)
line_lats = [slope * x + intercept for x in line_lons]

values = ds.CHL.sel(lat=line_lats, lon=line_lons, method='nearest')
values.values
>>> array([[[0.0908799 , 0.06634101, 0.07615771, 0.16289435],
            [0.06787204, 0.07480557, 0.0655338 , 0.06064864],
            [0.06352911, 0.06586582, 0.06702182, 0.10024723],
            [0.0789495 , 0.07035938, 0.07455409, 0.08405576]]], dtype=float32)
line_lons
>>> array([34.3282, 34.4282, 34.5282, 34.6282])
I want to create a plot with longitude on the x axis and chlorophyll values on the y axis. The problem is that the .values of the selection is a numpy array with shape (1, 4, 4), while there are only 4 longitudes, so there are far more values in the returned array than points on the line.
plt.plot(line_lons, values.values)
Any idea why that is, and how I can return one value per input point?
Thanks.
I assume it is because, by default, your selection is taken over the whole box (every lat paired with every lon) instead of along the selected transect.
I propose a more involved solution with Numpy and netCDF4: first build the transect from arbitrary coordinates, then snap those coordinates to the closest unique coordinates from the input file (unique, so that each point along the transect is counted only once).
Afterwards, once you know your output coordinates, there are two ways to pull out the data along the transect:
a) find the indices of the corresponding coordinates, or
b) interpolate the original data onto those coordinates (nearest-neighbour or bilinear).
Here is the code:
#!/usr/bin/env ipython
# --------------------------------------------------------------------------------------------------------------
import numpy as np
from netCDF4 import Dataset
# -----------------------------
# coordinates:
x1, y1 = [10., 55.]
x2, y2 = [20., 58.]
# --------------------------------
# ==============================================================================================================
# create some test data:
nx,ny = 100,100
dataout = np.random.random((ny,nx));
# -------------------------------
lonout=np.linspace(9.,30.,nx);
latout=np.linspace(54.,66.,ny);
# make data:
ncout = Dataset('test.nc', 'w', format='NETCDF3_CLASSIC');
ncout.createDimension('lon',nx);
ncout.createDimension('lat',ny);
ncout.createDimension('time',None);
ncout.createVariable('lon','float64',('lon'));ncout.variables['lon'][:]=lonout;
ncout.createVariable('lat','float64',('lat'));ncout.variables['lat'][:]=latout;
ncout.createVariable('var','float32',('lat','lon'));ncout.variables['var'][:]=dataout;
ncout.close()
#=================================================================================================================
# CUT THE DATA FROM FILE:
# make some arbitrary line between start-end point, later let us convert it to indices:
coords=np.linspace(x1+1j*y1,x2+1j*y2,1000);
xo=np.real(coords);yo=np.imag(coords);
# ------------------------------------------------------
# get transect:
ncin = Dataset('test.nc');
lonin=ncin.variables['lon'][:];
latin=ncin.variables['lat'][:];
# ------------------------------------------------------
# get the transect indices:
rxo=np.array([np.squeeze(np.min(lonout[np.where(np.abs(lonout-val)==np.abs(lonout-val).min())])) for val in xo]);
ryo=np.array([np.squeeze(np.min(latout[np.where(np.abs(latout-val)==np.abs(latout-val).min())])) for val in yo]);
rcoords=np.unique(rxo+1j*ryo);
rxo=np.real(rcoords);ryo=np.imag(rcoords);
# ------------------------------------------------------
ixo=[int(np.squeeze(np.where(lonin==val))) for val in rxo];
jxo=[int(np.squeeze(np.where(latin==val))) for val in ryo];
# ------------------------------------------------------
# get var data along transect:
trans_data=np.array([ncin.variables['var'][jxo[ii],ixo[ii]] for ii in range(len(ixo))]);
# ------------------------------------------------------
ncin.close()
# ================================================================================================================
# Another solution using interpolation, when we already know the target coordinates (original coordinates along the transect):
from scipy.interpolate import griddata
ncin = Dataset('test.nc');
lonin=ncin.variables['lon'][:];
latin=ncin.variables['lat'][:];
varin=ncin.variables['var'][:];
ncin.close()
# ----------------------------------------------------------------------------------------------------------------
lonm,latm = np.meshgrid(lonin,latin);
trans_data_b=griddata((lonm.flatten(),latm.flatten()),varin.flatten(),(rxo,ryo),'nearest')
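As a side note, xarray itself can select along a transect if the target coordinates are wrapped in DataArrays that share a new dimension, which avoids the box-shaped result from the question. A minimal sketch, assuming the question's ds, line_lats, and line_lons:
import xarray as xr
import matplotlib.pyplot as plt

lat_pts = xr.DataArray(line_lats, dims='points')
lon_pts = xr.DataArray(line_lons, dims='points')

# pointwise (vectorized) selection: one CHL value per (lat, lon) pair
transect = ds.CHL.sel(lat=lat_pts, lon=lon_pts, method='nearest').squeeze()

plt.plot(line_lons, transect.values)
plt.show()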

how to isolate data that deviate 2 and 3 sigma from the mean, and mark them in a plot, in python?

I am reading from a dataset, plotting it in matplotlib, and fitting a best-fit curve using linear regression.
A sample of the data looks like the following:
# ID X Y px py pz M R
1.04826492772e-05 1.04828050287e-05 1.048233088e-05 0.000107002791008 0.000106552433081 0.000108704469007 387.02 4.81947797625e+13
1.87380963036e-05 1.87370588085e-05 1.87372620448e-05 0.000121616280029 0.000151924707761 0.00012371156585 428.77 6.54636174067e+13
3.95579877816e-05 3.95603773653e-05 3.95610756809e-05 0.000163470663023 0.000265203868883 0.000228031803626 470.74 8.66961875758e+13
My code looks like the following:
# Regression Function
def regress(x, y):
    """Return a tuple of predicted y values and parameters for linear regression."""
    p = sp.stats.linregress(x, y)
    b1, b0, r, p_val, stderr = p
    y_pred = sp.polyval([b1, b0], x)
    return y_pred, p

# plotting z
xz, yz = M, Y_z  # data, non-transformed
y_pred, _ = regress(xz, np.log(yz))  # change here # transformed input
plt.semilogy(xz, yz, marker='o', color='b', markersize=4, linestyle='None', label="l.o.s within R500")
plt.semilogy(xz, np.exp(y_pred), "b", label='best fit')  # transformed output
However, I can see a lot of upward scatter in the data, and the best-fit curve is affected by those points. So first I want to isolate the data points that are 2 and 3 sigma away from the mean and mark them with circles in the plot.
Then I want to fit the curve using only the points that fall within 1 sigma of the mean.
Is there a good function in Python that can do this for me?
In addition, can I also isolate those rows from my actual dataset? For example, if the third row of the sample input is a 2-sigma deviation, can I get that row as output to save and investigate later?
Your help is most appreciated.
Here is some code that goes through the data in a given number of windows, calculates statistics in each window, and separates the data into well-behaved and misbehaved lists.
Hope this helps.
from scipy import stats
from scipy import polyval
import numpy as np
import matplotlib.pyplot as plt
num_data = 10000
fake_data_x = np.sort(12.8+np.random.random(num_data))
fake_data_y = np.exp(fake_data_x) + np.random.normal(0,scale=50000,size=num_data)
# Regression Function
def regress(x, y):
    """Return a tuple of predicted y values and parameters for linear regression."""
    p = stats.linregress(x, y)
    b1, b0, r, p_val, stderr = p
    y_pred = polyval([b1, b0], x)
    return y_pred, p
# plotting z
xz, yz = fake_data_x, fake_data_y # data, non-transformed
y_pred, _ = regress(xz, np.log(yz)) # change here # transformed input
plt.figure()
plt.semilogy(xz, yz, marker='o',color ='b', markersize=4,linestyle='None', label="l.o.s within R500")
plt.semilogy(xz, np.exp(y_pred), "b", label = 'best fit') # transformed output
plt.show()
num_bin_intervals = 10 # approx number of averaging windows
window_boundaries = np.linspace(min(fake_data_x),max(fake_data_x),int(len(fake_data_x)/num_bin_intervals)) # window boundaries
y_good = [] # list to collect the "well-behaved" y-axis data
x_good = [] # list to collect the "well-behaved" x-axis data
y_outlier = []
x_outlier = []
for i in range(len(window_boundaries)-1):
    # create a boolean mask to select the data within the averaging window
    window_indices = (fake_data_x <= window_boundaries[i+1]) & (fake_data_x > window_boundaries[i])
    # separate the pieces of data in the window
    fake_data_x_slice = fake_data_x[window_indices]
    fake_data_y_slice = fake_data_y[window_indices]
    # calculate the mean and std of the y values in the window
    y_mean = np.mean(fake_data_y_slice)
    y_std = np.std(fake_data_y_slice)
    # choose and select the outliers
    y_outliers = fake_data_y_slice[np.abs(fake_data_y_slice - y_mean) >= 2*y_std]
    x_outliers = fake_data_x_slice[np.abs(fake_data_y_slice - y_mean) >= 2*y_std]
    # choose and select the good ones
    y_goodies = fake_data_y_slice[np.abs(fake_data_y_slice - y_mean) < 2*y_std]
    x_goodies = fake_data_x_slice[np.abs(fake_data_y_slice - y_mean) < 2*y_std]
    # extend the lists with all the good and the bad
    y_good.extend(list(y_goodies))
    y_outlier.extend(list(y_outliers))
    x_good.extend(list(x_goodies))
    x_outlier.extend(list(x_outliers))
plt.figure()
plt.semilogy(x_good,y_good,'o')
plt.semilogy(x_outlier,y_outlier,'r*')
plt.show()
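To address the circling and the 1-sigma refit specifically, a possible follow-up using the same names as above (x_good, y_good, x_outlier, y_outlier, regress) might look like this; for a 1-sigma cut, just replace the 2*y_std threshold above with y_std:
x_good = np.array(x_good); y_good = np.array(y_good)
x_outlier = np.array(x_outlier); y_outlier = np.array(y_outlier)

plt.figure()
plt.semilogy(x_good, y_good, 'o', markersize=3)
# draw open circles around the outliers
plt.scatter(x_outlier, y_outlier, s=120, facecolors='none', edgecolors='r')

# refit using only the well-behaved points
y_pred_good, _ = regress(x_good, np.log(y_good))
plt.semilogy(x_good, np.exp(y_pred_good), 'g', label='refit on good points')
plt.legend()
plt.show()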

How to do a second interpolation in python

I did my first interpolation with numpy.polyfit() and numpy.polyval() for 50 longitude values for a full satellite orbit.
Now, I just want to look at a window of 0-4.5 degrees longitude and do a second interpolation so that I have 6,000 points for longitude in the window.
I need to use the equation/curve from the first interpolation to create the second one because there is only one point in the window range. I'm not sure how to do the second interpolation.
Inputs:
lon = [-109.73105744378498, -104.28690174554579, -99.2435132929552, -94.48533149079628, -89.91054414962821, -85.42671400689177, -80.94616150449806, -76.38135021210172, -71.6402674905218, -66.62178379632216, -61.21120467960157, -55.27684029674759, -48.66970878028004, -41.23083703244677, -32.813881865289346, -23.332386757370532, -12.832819226213942, -1.5659455609661785, 10.008077792630402, 21.33116444634303, 31.92601575632583, 41.51883213364072, 50.04498630545507, 57.58103957109249, 64.26993028992476, 70.2708323505337, 75.73441871754586, 80.7944079829813, 85.56734813043659, 90.1558676264546, 94.65309120129724, 99.14730128118617, 103.72658922048785, 108.48349841714494, 113.51966824008079, 118.95024882101737, 124.9072309203375, 131.5395221402974, 139.00523971191907, 147.44847902856114, 156.95146022590976, 167.46163867248032, 178.72228750873975, -169.72898181991064, -158.44642409799974, -147.8993300787564, -138.35373014113995, -129.86955508919888, -122.36868103811106, -115.70852432245486]
myOrbitJ2000Time = [ 20027712., 20027713., 20027714., 20027715., 20027716.,
20027717., 20027718., 20027719., 20027720., 20027721.,
20027722., 20027723., 20027724., 20027725., 20027726.,
20027727., 20027728., 20027729., 20027730., 20027731.,
20027732., 20027733., 20027734., 20027735., 20027736.,
20027737., 20027738., 20027739., 20027740., 20027741.,
20027742., 20027743., 20027744., 20027745., 20027746.,
20027747., 20027748., 20027749., 20027750., 20027751.,
20027752., 20027753., 20027754., 20027755., 20027756.,
20027757., 20027758., 20027759., 20027760., 20027761.]
Code:
import numpy as np

deg = 30  # polynomial degree for fit
fittime = np.asarray(myOrbitJ2000Time) - myOrbitJ2000Time[0]

# Longitude interpolation
fitLon = np.polyfit(fittime, lon, deg)     # gets fit coefficients
polyval_lon = np.polyval(fitLon, fittime)  # evaluates the fit at the sample times

# Get longitude values for a window of 0-4.5 deg longitude
lonwindow = []
for i in range(len(polyval_lon)):
    if 0 < polyval_lon[i] < 4.5:          # get lon vals in window
        lonwindow.append(polyval_lon[i])  # append lon vals
lonwindow = np.array(lonwindow)
First, generate the polynomial fit coefficients using the original time (x-axis) values and the interpolated longitude (y-axis) values from the first fit.
import numpy as np
import matplotlib.pyplot as plt
poly_deg = 3 #degree of the polynomial fit
polynomial_fit_coeff = np.polyfit(original_times, interp_lon, poly_deg)
Next, use np.linspace() to generate arbitrary time values based on the number of desired points in the window.
start = 0
stop = 4
num_points = 6000
arbitrary_time = np.linspace(start, stop, num_points)
Finally, use the fit coefficients and the arbitrary time to get the actual interpolated longitude (y-axis) values and plot.
lon_intrp_2 = np.polyval(polynomial_fit_coeff, arbitrary_time)
plt.plot(arbitrary_time, lon_intrp_2, 'r') #interpolated window as a red curve
plt.plot(myOrbitJ2000Time, lon, '.') #original data plotted as points

linear interpolation with gridded data in python

I have a gridded weather data set with dimensions 33 x 77 x 77. The first dimension is time and the other two are lat and lon, respectively. I need to interpolate (linear or nearest-neighbour) the data to different points (lat and lon) for each time and write the result to a CSV file. I've used scipy's interp2d function and it works for a single time step. Since I have many locations, I don't want to loop over time.
Shown below is the code that I wrote. Can anyone suggest a better method to accomplish the task?
import sys
import datetime
import time
import numpy as np
import scipy as sp
from scipy.interpolate import interp2d
import pygrib as pg

grb_f = pg.open('20150331/gfs.20150331.grb2')
# tmp: grib messages selected from grb_f (selection step omitted in the question)
lat = tmp[0].data(lat1=4, lat2=42, lon1=64, lon2=102)[1]; lat = lat[:,0]
lon = tmp[0].data(lat1=4, lat2=42, lon1=64, lon2=102)[2]; lon = lon[0,:]
temp = np.empty((0, lon.shape[0]))
for i in range(0, tmp.shape[0]):
    dat = tmp[i].data(lat1=4, lat2=42, lon1=64, lon2=102)
    temp = np.concatenate([temp, dat[0] - 273.15], axis=0)
temp1 = temp.reshape(tmp.shape[0], lat.shape[0], lon.shape[0])
x = 77; y = 28  # (many points)
f = interp2d(lon, lat, temp1[0,:,:], kind='linear', copy=False, bounds_error=True)
Z = f(x, y)
EDIT:
Instead of making a 3D matrix, I appended the data vertically to make a data matrix of size 2541 x 77, with lat and lon of size 2541 x 1. The interp2d function then gives an invalid-length error.
f=interp2d(lon,lat, temp1[0,:,:],kind='linear',copy=False,bounds_error=True )
"Invalid length for input z for non rectangular grid")
ValueError: Invalid length for input z for non rectangular grid
The lengths of my x, y, and z arrays are the same (2541, 2541, 2541), so why does it throw an error? Could anyone explain? Your help will be highly appreciated.
Processing of time series is very easy with RedBlackPy.
import datetime as dt
import redblackpy as rb

index = [dt.date(2018,1,1), dt.date(2018,1,3), dt.date(2018,1,5)]
lat = [10.0, 30.0, 50.0]

# create Series object
lat_series = rb.Series(index=index, values=lat, dtype='float32',
                       interpolate='linear')

# Now you can access any key using linear interpolation.
# Interpolation does not create new items in the Series;
# it uses the neighbours to compute the value on the fly when you call getitem.
print(lat_series[dt.date(2018,1,2)])  # prints 20.0
So, if you just want to write the interpolated values to a CSV file, you can iterate over the list of needed keys, call getitem on the Series object, and write each value to the file:
# generator for a range of dates
def date_range(start, stop, step=dt.timedelta(1)):
    it = start - step
    while it < stop:
        it += step
        yield it

#------------------------------------------------
# create list for keeping output strings
out_data = []
# create output file
out_file = open('data.csv', 'w')
# add header for output table
out_data.append('Time,Lat\n')
for date in date_range(dt.date(2018,1,1), dt.date(2018,1,5)):
    out_data.append('{:},{:}\n'.format(date, lat_series[date]))
# write output rows
out_file.writelines(out_data)
out_file.close()
In the same way, you can add lon data to your processing, as in the sketch below.
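For instance, a sketch that mirrors the lat example above for a hypothetical lon list and writes both columns:
lon = [100.0, 102.0, 104.0]  # hypothetical lon values at the same dates
lon_series = rb.Series(index=index, values=lon, dtype='float32',
                       interpolate='linear')

with open('data.csv', 'w') as out_file:
    out_file.write('Time,Lat,Lon\n')
    for date in date_range(dt.date(2018,1,1), dt.date(2018,1,5)):
        out_file.write('{:},{:},{:}\n'.format(date, lat_series[date], lon_series[date]))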
If you want to create an "interpolator" object once and then use it to sequentially query just the specific points you need, you could take a look at the scipy.interpolate.Rbf module:
"A class for radial basis function approximation/interpolation of n-dimensional scattered data."
Here, n-dimensional covers your data if you adjust the ratio between the temporal and spatial dimensions, and scattered means you can also use it for regular/uniform data.
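A minimal sketch of that idea; the small synthetic grid and the time-scaling factor are assumptions for illustration, not values from the question:
import numpy as np
from scipy.interpolate import Rbf

# small synthetic grid (the real case would use the 33 x 77 x 77 data, possibly subsampled)
t = np.arange(5.0)
lat = np.linspace(4.0, 42.0, 10)
lon = np.linspace(64.0, 102.0, 10)
tt, la, lo = np.meshgrid(t, lat, lon, indexing='ij')
vals = np.sin(la / 10.0) + np.cos(lo / 20.0) + 0.1 * tt

# scale time so one time step is comparable to one degree (assumed ratio)
rbf = Rbf(tt.ravel() * 10.0, la.ravel(), lo.ravel(), vals.ravel(), function='linear')

# query one (lat, lon) location at every time step, without looping over time
query_t = t * 10.0
query_lat = np.full_like(t, 28.0)
query_lon = np.full_like(t, 77.0)
print(rbf(query_t, query_lat, query_lon))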
If the lat and lon are the same for each time, you could do it using slices and manual bilinear interpolation. For example, to get a 1D array of values at lat = 4.875, lon = 8.4 in index units (you would obviously need to scale to match your actual spacing):
b = a[:, 4:6, 8:10]
# weights: lat 4.875 -> 0.125/0.875 between rows 4 and 5; lon 8.4 -> 0.6/0.4 between columns 8 and 9
c = ((b[:,0,0] * 0.6 + b[:,0,1] * 0.4) * 0.125 +
     (b[:,1,0] * 0.6 + b[:,1,1] * 0.4) * 0.875)
Obviously you could do it all in one line, but it would be even uglier.
EDIT to allow variable lat and lon at each time period.
lat = np.linspace(55.0, 75.0, 33)
lon = np.linspace(1.0, 25.0, 33)
data = np.linspace(18.0, 25.0, 33 * 77 * 77).reshape(33, 77, 77)
# NB for simplicity I map 0-360 and 0-180 rather than -180+180
# also need to ensure values on grid lines or edges work ok
lat_frac = lat * 77.0 / 360.0
lat_fr = np.floor(lat_frac).astype(int)
lat_to = lat_fr + 1
lat_frac -= lat_fr
lon_frac = lon * 77.0 / 180.0
lon_fr = np.floor(lon_frac).astype(int)
lon_to = lon_fr + 1
lon_frac -= lon_fr
data_interp = ((data[:,lat_fr,lon_fr] * (1.0 - lat_frac) +
                data[:,lat_fr,lon_to] * lat_frac) * (1.0 - lon_frac) +
               (data[:,lat_to,lon_fr] * (1.0 - lat_frac) +
                data[:,lat_to,lon_to] * lat_frac) * lon_frac)
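As another option for avoiding an explicit loop over time, scipy's RegularGridInterpolator can interpolate over the full (time, lat, lon) grid in one call. A minimal sketch with a stand-in grid shaped like the question's 33 x 77 x 77 data (the coordinate ranges and query point are assumptions):
import numpy as np
from scipy.interpolate import RegularGridInterpolator

times = np.arange(33.0)
lats = np.linspace(4.0, 42.0, 77)
lons = np.linspace(64.0, 102.0, 77)
data = np.random.rand(33, 77, 77)  # stand-in for temp1

interp = RegularGridInterpolator((times, lats, lons), data, method='linear')

# one (lat, lon) location evaluated at every time step in a single call
target_lat, target_lon = 28.0, 77.0
pts = np.column_stack([times,
                       np.full_like(times, target_lat),
                       np.full_like(times, target_lon)])
series = interp(pts)  # shape (33,): one value per time step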
