Xarray select value based on variable - python

I have a .nc file that I open with xarray as a dataset. This dataset has 3 variables:
Band (5000x300x250)
latitude (300x250)
longitude (300x250)
Its dimensions are:
time (5000)
y (300)
x (250)
I created the dataset myself and made a mistake, because I would like to "grab" the timeseries of a specific point of "Band" based on its coordinates value:
dataset.Band.sel(longitude=6.696e+06,latitude=4.999e+05,method='nearest')
(I based the values to grab on the first values of both variables).
The issue is that when I created the .nc file, I entered latitude and longitude as variables rather than as dimensions. Is there a way to keep my code but modify a few things so I can grab the point based on the nearest values of the latitude and longitude variables? Or should I completely redefine the dimensions of my .nc file, replacing x and y with longitude and latitude?

There isn't a great way to select data using the lat/lon values. As your data is structured, you essentially have multidimensional coordinates.
That said, if your lat/lon are actually only indexed by x OR y (that is, latitude has the same value repeated for all levels of x, and likewise for longitude with y), you could reorganize your data pretty easily:
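# collapse the redundant dimension of each 2-D coordinate
lats = dataset.latitude.mean(dim='x')
lons = dataset.longitude.mean(dim='y')
# drop the 2-D variables and attach the 1-D versions as coordinates
dataset = dataset.drop_vars(['latitude', 'longitude'])
dataset.coords['latitude'] = lats
dataset.coords['longitude'] = lons
# make latitude/longitude the indexing dimensions instead of y/x
dataset = dataset.swap_dims({'x': 'longitude', 'y': 'latitude'})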
At this point, your data is indexed by time, latitude and longitude, and you can select the data however you'd like, including with .sel(..., method='nearest').
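If latitude and longitude genuinely vary along both x and y, one workaround is to locate the closest grid cell by hand and then index by position. The following is only a minimal sketch and not part of the original answer: `nearest_yx` is a hypothetical helper, the small synthetic arrays stand in for the real variables, and plain Euclidean distance in coordinate units is assumed to be good enough (reasonable for projected coordinates, only approximate for degrees).

```python
import numpy as np

def nearest_yx(lat2d, lon2d, lat, lon):
    """Return (iy, ix) of the grid cell closest to (lat, lon).

    lat2d and lon2d are 2-D arrays of shape (y, x). Distance is plain
    Euclidean distance in coordinate units (an assumption of this sketch).
    """
    dist2 = (np.asarray(lat2d) - lat) ** 2 + (np.asarray(lon2d) - lon) ** 2
    iy, ix = np.unravel_index(np.argmin(dist2), dist2.shape)
    return int(iy), int(ix)

# Synthetic 4 x 5 grid standing in for the 300 x 250 one in the question.
lat2d, lon2d = np.meshgrid(
    np.arange(10.0, 14.0), np.arange(50.0, 55.0), indexing="ij"
)
iy, ix = nearest_yx(lat2d, lon2d, 11.2, 52.9)
```

With the dataset from the question, this would presumably be called as `nearest_yx(dataset.latitude.values, dataset.longitude.values, lat, lon)`, and the time series taken with `dataset.Band.isel(y=iy, x=ix)`.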

Related

Produce bin averaged latitude and longitude grids

I have a set of data with each row representing a series of observations taken at a particular point. Each column represents a different observation, plus information about where the data was collected (longitude and latitude, in columns 3 and 4 respectively). The data needs to be bin-averaged onto a grid of 5 degrees latitude by 5 degrees longitude. How would I go about doing this in Python?
I don't know how to solve this. I hope someone can help. Thanks so much.
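One possible approach, offered as a hedged sketch since this question has no answer here: compute a weighted and an unweighted 2-D histogram with `np.histogram2d` and divide them. The function name `bin_average`, the global -180..180 / -90..90 extent, and the sample points are all assumptions made for illustration.

```python
import numpy as np

def bin_average(lon, lat, values, step=5.0):
    """Average `values` in step x step degree lat/lon bins.

    Returns (mean, lat_edges, lon_edges); mean has shape (n_lat, n_lon)
    and is NaN where a bin holds no observations.
    """
    lon_edges = np.arange(-180.0, 180.0 + step, step)
    lat_edges = np.arange(-90.0, 90.0 + step, step)
    bins = [lat_edges, lon_edges]
    # Sum of values per bin, and number of observations per bin.
    sums, _, _ = np.histogram2d(lat, lon, bins=bins, weights=values)
    counts, _, _ = np.histogram2d(lat, lon, bins=bins)
    with np.errstate(invalid="ignore", divide="ignore"):
        mean = sums / counts  # 0/0 becomes NaN for empty bins
    return mean, lat_edges, lon_edges

# Three made-up observations: two fall in the same 5-degree bin.
lon = np.array([1.0, 2.0, 7.0])
lat = np.array([1.0, 3.0, 1.0])
obs = np.array([10.0, 20.0, 30.0])
grid, lat_edges, lon_edges = bin_average(lon, lat, obs)
```

For a table with several observation columns, the same function could be called once per column, using columns 3 and 4 as `lon` and `lat`.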

Xarray find lat/lon coordinates for maximum/minimum values for each timestep

As the title says, suppose I have a ds with coords [time, lat, lon]. How can I obtain, for each timestep in time, the ['lat', 'lon'] pair at which the maximum (or minimum) value of a given variable is located?
Use xr.Dataset.idxmax to find the index label of the maximum along a dimension (one at a time). Same for xr.Dataset.idxmin.
max_lons = ds.max(dim="lat").idxmax(dim="lon")
max_lats = ds.max(dim="lon").idxmax(dim="lat")
The results will be datasets, with each variable giving the lon or lat corresponding to the maximum in each time step for that variable.
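For a single variable held as a plain NumPy array, the same result can be cross-checked without xarray. This is a minimal sketch, assuming the array is ordered (time, lat, lon) and contains no NaNs; `argmax_latlon` is a hypothetical helper name, not an xarray or NumPy function.

```python
import numpy as np

def argmax_latlon(data, lat, lon):
    """For each timestep, return the lat and lon of the maximum value.

    data has shape (time, lat, lon); lat and lon are 1-D coordinate arrays.
    """
    nt = data.shape[0]
    # Flatten the two spatial axes, take one argmax per timestep,
    # then convert the flat positions back to (lat, lon) indices.
    flat = data.reshape(nt, -1).argmax(axis=1)
    ilat, ilon = np.unravel_index(flat, data.shape[1:])
    return lat[ilat], lon[ilon]

lat = np.array([10.0, 20.0])
lon = np.array([100.0, 110.0, 120.0])
data = np.zeros((2, 2, 3))
data[0, 1, 2] = 5.0  # maximum of timestep 0
data[1, 0, 1] = 7.0  # maximum of timestep 1
max_lats, max_lons = argmax_latlon(data, lat, lon)
```

Replacing `argmax` with `argmin` gives the location of the minimum in the same way.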

Is there any way to fill values from a finer resolution grid to coarse resolution grid using nearest neighbour in python?

I am working with MODIS active fire data (data resolution 1 km). After deriving meaningful information from it, I have an array of size (72x4797x4797) [time x lat x lon], and meshes for lat (4797 x 4797) and lon (4797 x 4797). The latitude mesh decreases from 40 N to 0 with a uniform dy of 0.0108, so its values change from row to row while every column is identical. However, the longitude mesh has values that change along both rows and columns, which I guess is because the longitude differs for each latitude due to the satellite swath.
My objective is to have this data on the WRF grid (lat 129 x lon 109 at 30 km resolution). The data has NaN for all points with no fire and values at points of active fire. Using scipy interpolation with griddata returns an array of all NaNs, which is of no use as all information is lost.
To put the data on the new coarse lat-lon grid, I am trying to use a nearest-neighbour approach: for example, if multiple active fires are present in the fine grid, assign each one to the nearest coarse-grid lat-lon and take the average of all such points in that coarse grid square.
In the code below:
lon_wrf, lat_wrf are the new longitude and latitude of interest at coarse resolution.
lon_mosaic, lat_mosaic are both 2-D arrays of longitude and latitude at fine resolution.
Block_OC_day is a 2-D array to be made on the newer coarse resolution.
First I find all the fire locations that are not NaN and extract the lon and lat for these locations. Next I find the nearest lon and lat in the target grid; the function gives me the indices and values. What I would like to do next is average the values of Block_OC_day over all points that share the same lat-lon on the target grid.
import numpy as np

lat_1d = lat_mosaic[:,0] # making 2-D latitude mesh to 1-D
fire_loc = np.argwhere(~np.isnan(Block_OC_day)) # finding all locations with non-NaN
fire_lat = lat_1d[fire_loc[:,0]] # finding latitude of fine resolution
fire_lon = lon_mosaic[fire_loc[:,0],fire_loc[:,1]] # finding longitude of fine resolution
# Function to find nearest values and index
def nearest(value_to_search,lookup_array):
    '''Finds nearest value and index closest to a value in an array'''
    idx = np.argmin(np.abs(lookup_array-value_to_search))
    closest_value = lookup_array[idx]
    return closest_value, idx
# Storing target latitude and index
va_lat = []
li_lat = []
for i in range(0,len(fire_lat)):
    a, idx = nearest(fire_lat[i],lat_wrf)
    va_lat.append(a)
    li_lat.append(idx)
# storing target longitude and index
va_lon = []
li_lon = []
for i in range(0,len(fire_lon)):
    a, idx = nearest(fire_lon[i],lon_wrf)
    va_lon.append(a)
    li_lon.append(idx)
I have found a solution to the problem. It involves finding the fine-grid points that share the same coarse lat-lon indices and then filling each coarse cell with the average of those points.
There may be a better solution, but this code works fine.
# li_lon and li_lat are Python lists; convert them so that == compares element-wise
li_lon = np.asarray(li_lon)
li_lat = np.asarray(li_lat)
# Using specific indices
i = 108
j = 128
# np.argwhere(li_lon==i)
# np.argwhere(li_lat == j)
inter = np.intersect1d(np.argwhere(li_lon==i), np.argwhere(li_lat == j)) # finding locations to replace
# fire_lon[inter]
# fire_lat[inter]
NoNaNFire = Block_OC_day[~np.isnan(Block_OC_day)] # extracting values at those locations
AvgFire = np.nanmean(NoNaNFire[inter]) # averaging all fine-grid points
# AvgFire returns the average value of variable at i and j of coarse grid
# For any 2-D coarse grid
Block_WRF = np.empty((lat_wrf.shape[0],lon_wrf.shape[0]))
for i in range(0,len(lon_wrf)):
    for j in range(0,len(lat_wrf)):
        inter = np.intersect1d(np.argwhere(li_lon==i), np.argwhere(li_lat == j))
        Block_WRF[j,i] = np.nanmean(NoNaNFire[inter]) # NaN where no fire falls in the cell
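The double loop scans every fire point once per coarse cell, which may become slow for larger grids. A possible vectorized alternative is sketched below, using broadcasting for the nearest-index lookup and `np.add.at` for the per-cell sums. The helper names `nearest_index` and `average_to_coarse` and the tiny sample arrays are hypothetical; the sketch assumes 1-D target axes, as `lat_wrf` and `lon_wrf` are used in the question.

```python
import numpy as np

def nearest_index(values, lookup):
    """Index of the nearest entry in 1-D `lookup` for every entry of `values`."""
    values = np.asarray(values, dtype=float)
    lookup = np.asarray(lookup, dtype=float)
    # (n_values, n_lookup) table of absolute differences, argmin per row.
    return np.abs(values[:, None] - lookup[None, :]).argmin(axis=1)

def average_to_coarse(values, li_lat, li_lon, shape):
    """Average `values` over coarse cells given per-point cell indices."""
    sums = np.zeros(shape)
    counts = np.zeros(shape)
    # np.add.at accumulates correctly when the same cell appears repeatedly.
    np.add.at(sums, (li_lat, li_lon), values)
    np.add.at(counts, (li_lat, li_lon), 1)
    out = np.full(shape, np.nan)  # cells without any point stay NaN
    filled = counts > 0
    out[filled] = sums[filled] / counts[filled]
    return out

# Made-up coarse axes and three fine-grid fire points.
lat_wrf = np.array([0.0, 10.0, 20.0])
lon_wrf = np.array([100.0, 110.0])
fire_lat = np.array([1.0, 2.0, 19.0])
fire_lon = np.array([109.0, 108.0, 101.0])
fire_val = np.array([1.0, 3.0, 10.0])

li_lat = nearest_index(fire_lat, lat_wrf)
li_lon = nearest_index(fire_lon, lon_wrf)
Block_WRF = average_to_coarse(fire_val, li_lat, li_lon, (len(lat_wrf), len(lon_wrf)))
```

The difference table in `nearest_index` holds one row per fire point, so for very many points it may be worth processing them in chunks.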

How to make groups using latitude and longitude coordinates?

I have a time series dataset where the pickup and drop off latitude and longitude coordinates are given.
Since coordinates of a city hardly vary, how to categorize them in python?
I want to make groups so that Classification algorithm can be applied.
I am pasting a single row of pick up and drop off longitude and latitude coordinates of New York city.
-73.973052978515625 40.793209075927734 -73.972923278808594 40.782520294189453
I have fixed the range of latitude from 40.6 to 40.9 and longitude range from -73.9 to -74.25
Now, I want to make them into groups so that classification algorithm can be applied.
For example, you can put your coordinates in a list of tuples called coordinates. Note that I have also added a coordinate that is out of range. Here is the code:
coordinates = [
    (-73.973052978515625,40.793209075927734),
    (-73.972923278808594,40.782520294189453),
    (-75.9,40.7)
]
filtered = list()
# filtering coordinates
for c in coordinates:
    if -74.25 <= c[0] <= -73.9 and 40.6 <= c[1] <= 40.9:
        filtered.append(c)
print(filtered) # here you have your filtered coordinates
Output:
[(-73.97305297851562, 40.793209075927734), (-73.9729232788086, 40.78252029418945)]
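Filtering only removes out-of-range points. To turn the remaining coordinates into discrete groups, one option is to lay a regular grid over the fixed range and use the cell number as the class label. The sketch below is an assumption-laden illustration, not part of the answer above: `grid_cell_ids` is a hypothetical helper and the 10 x 10 grid size is arbitrary.

```python
import numpy as np

def grid_cell_ids(lons, lats, lon_range=(-74.25, -73.9),
                  lat_range=(40.6, 40.9), n=10):
    """Label each point with the id of its cell in an n x n grid.

    Points outside the given ranges get the label -1.
    """
    lons = np.asarray(lons, dtype=float)
    lats = np.asarray(lats, dtype=float)
    lon_edges = np.linspace(lon_range[0], lon_range[1], n + 1)
    lat_edges = np.linspace(lat_range[0], lat_range[1], n + 1)
    # np.digitize returns 1-based bin numbers; shift to 0-based.
    ix = np.digitize(lons, lon_edges) - 1
    iy = np.digitize(lats, lat_edges) - 1
    inside = (ix >= 0) & (ix < n) & (iy >= 0) & (iy < n)
    return np.where(inside, iy * n + ix, -1)

lons = [-73.973052978515625, -73.972923278808594, -75.9]
lats = [40.793209075927734, 40.782520294189453, 40.7]
labels = grid_cell_ids(lons, lats)
```

The two New York points fall in the same cell and share a label, while the out-of-range point is marked -1. A clustering method could replace the fixed grid if the groups should follow the data instead.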

Can we convert longitude and latitude to cartesian co-ordinates without elevation?

Can I convert from longitude/latitude (x, y) coordinates to Cartesian (x, y, z) without having elevation? I have checked some forums discussing converting longitude/latitude into Cartesian coordinates, and here's the code in Python.
R = numpy.float64(6371000) # in meters
longitude = numpy.float64(lon)
latitude = numpy.float64(lat)
X = R * math.cos(longitude) * math.sin(latitude)
Y = R * math.sin(latitude) * math.sin(longitude)
Z = R * math.cos(latitude)
Problem statement: I have data gathered from different locations. However, these locations are in longitude and latitude format. Are these two attributes enough to convert the locations into Cartesian format? Is the code above correct?
You need the distance from the center of the Earth to convert latitude+longitude to X,Y,Z (either Earth-Centered, Earth-Fixed or Earth-Centered Inertial). Earth-Centered Inertial also requires the time to convert from longitude to angle.
The reason for this is simple: you need three independent variables since it's a 3D coordinate system. Latitude and longitude are only two variables, so you also need the distance from the center (R in your equations above).
If you are looking for ground coordinates, you can use the Google Elevation API to get R in your equations above. But either way you need this information for the coordinate transform.
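As a rough illustration of the transform, here is a minimal sketch that treats the Earth as a perfect sphere and, when no better value for R is available, falls back to a mean radius of 6371000 m (the value used in the question). Both are simplifying assumptions, and `latlon_to_cartesian` is a hypothetical function name. It uses the common convention with latitude measured from the equator, and it converts degrees to radians first, which `math.cos` and `math.sin` require.

```python
import math

def latlon_to_cartesian(lat_deg, lon_deg, radius=6371000.0):
    """Convert latitude/longitude in degrees to Earth-centered x, y, z in metres.

    Assumes a spherical Earth; `radius` is the distance from the centre.
    """
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    x = radius * math.cos(lat) * math.cos(lon)
    y = radius * math.cos(lat) * math.sin(lon)
    z = radius * math.sin(lat)
    return x, y, z

# Equator at the prime meridian, and the North Pole.
origin_xyz = latlon_to_cartesian(0.0, 0.0)
pole_xyz = latlon_to_cartesian(90.0, 0.0)
```

If elevation is known, it can be added to `radius` per point; for higher accuracy an ellipsoidal model such as WGS84 would be needed instead of a sphere.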
