How to insert TXT data into netCDF in Python
I'm new to Python, so I'm sorry if I make any beginner mistakes. I'm trying to insert my text file into a netCDF file.
I'm using the netCDF4 package and followed the example on this website: https://pyhogs.github.io/intro_netcdf4.html, and I managed to reproduce it (the example uses random data).
Problem: my text file contains Lon, Lat and SST, and when I try to insert these values, the netCDF file is created; however, it's not correct.
In my code I'm trying to apply either a Barnes interpolation (var) or a griddata interpolation (interp).
I think this is what has to go into my netCDF variable (maybe I'm wrong).
Here is my code so far:
import os
import numpy as np
from scipy.interpolate import griddata
import matplotlib.pyplot as plt
import numpy.ma as ma
import netCDF4 as nc4
from numpy.random import uniform, seed
from metpy.interpolate import (interpolate_to_grid, remove_nan_observations, inverse_distance_to_grid, remove_repeat_coordinates)
# Open file
arq_sst = np.loadtxt(fname = "C:\\Users\\Rodrigo\\XYZ.txt", skiprows=0, delimiter=",")
# Getting the Arrays
lonf = arq_sst[:, 0]
latf = arq_sst[:, 1]
sstf = arq_sst[:, 2]
# Atmosphere level
z = [1]
# shaping grid
x_1, y_1 = np.meshgrid(lonf, latf)
#Barnes Interpolation
var = inverse_distance_to_grid(lonf, latf, sstf, x_1, y_1, r=100000, gamma=0.25, kappa=5.052, min_neighbors=3, kind='barnes')
#Or
#Another interpolation
interp = griddata((lonf, latf), sstf, (lonf[None,:], latf[:,None]), method='nearest')
#Open netcdf to write
f = nc4.Dataset('file_created.nc','w', format='NETCDF4')
#Creating group in netcdf file
tempgrp = f.createGroup('SAT_DATA')
#Specifying dimensions
tempgrp.createDimension('lon', len(lonf))
tempgrp.createDimension('lat', len(latf))
tempgrp.createDimension('z', len(z))
tempgrp.createDimension('time', None)
#Building variables
longitude = tempgrp.createVariable('Longitude', 'f4', 'lon')
latitude = tempgrp.createVariable('Latitude', 'f4', 'lat')
levels = tempgrp.createVariable('Levels', 'i4', 'z')
sst = tempgrp.createVariable('sst', 'f4', ('time', 'lon', 'lat', 'z'))
time = tempgrp.createVariable('Time', 'i4', 'time')
#Passing data into variables
longitude[:] = lonf
latitude[:] = latf
levels[:] = z
sst[0,:,:,:] = var
# get time in days since Jan 01, 0001
from datetime import datetime
today = datetime.today()
time_num = today.toordinal()
time[0] = time_num
#Add global attributes
f.description = "XYZ dataset containing one group"
f.history = "Created " + today.strftime("%d/%m/%y")
#Add local attributes to variable instances
longitude.units = 'degrees east'
latitude.units = 'degrees north'
time.units = 'days since Jan 01, 0001'
sst.units = 'degrees'
levels.units = 'meters'
sst.warning = 'This data is not real!'
#Closing the dataset
f.close()
Here is my text data (header: Longitude,Latitude,SST). I decreased the number of lines to fit here:
-42.1870,-22.9940,22.4844
-37.4000,-29.9700,20.2000
-37.4200,-29.9600,20.1000
-39.1800,-30.0000,20.5000
-39.2100,-30.0000,20.4000
-39.2300,-30.0000,20.4000
-39.2200,-29.9800,20.4000
-39.2300,-29.9900,20.4000
-39.2000,-29.9800,20.4000
-39.1900,-30.0000,20.5000
-39.2800,-29.9900,20.5000
-39.2700,-29.9900,20.4000
-39.3400,-29.9700,20.5000
-39.3300,-29.9600,20.4000
-39.3100,-29.9600,20.4000
-39.3600,-29.9700,20.6000
-39.3500,-29.9900,20.4000
-39.3900,-29.9900,20.4000
-38.4600,-30.0000,20.3000
-38.4900,-29.9800,20.7000
-37.4800,-29.8800,20.4000
-37.5000,-29.8600,20.3000
-37.4600,-29.8900,20.3000
-41.3800,-29.9900,20.0000
-41.4000,-29.9900,20.1000
-41.0400,-29.9300,20.1000
-41.0200,-29.9200,20.2000
-41.0600,-29.9300,20.1000
-41.1000,-29.9400,19.9000
-41.0900,-29.9600,19.9000
-41.1100,-29.9800,19.9000
-41.1100,-29.9600,20.0000
-41.1200,-29.9400,20.0000
-41.1400,-29.9400,20.0000
-41.1600,-29.9500,20.1000
-41.1700,-29.9500,20.1000
-41.1900,-29.9700,20.0000
-41.1900,-29.9500,20.1000
-40.6800,-29.9900,20.1000
-40.7400,-29.9600,20.1000
-40.7700,-29.9700,20.1000
-40.7800,-29.9700,20.1000
-40.7100,-29.9000,20.1000
-40.7600,-29.9100,20.1000
-40.7400,-29.9000,20.1000
-40.7200,-29.9000,20.2000
-40.7600,-29.9200,20.1000
-40.7500,-29.9400,20.1000
-40.7800,-29.9100,20.2000
-40.8000,-29.9100,20.2000
-40.8100,-29.9300,20.1000
-40.8200,-29.9200,20.2000
-40.7900,-29.9300,20.2000
-40.7900,-29.9500,20.1000
-40.7700,-29.9300,20.1000
-40.8400,-29.9600,20.2000
-40.8600,-29.9600,20.3000
-40.9000,-29.9100,20.1000
-40.9100,-29.9100,20.0000
-40.3900,-29.9400,20.0000
-40.3900,-29.9200,20.0000
-40.4100,-29.9200,20.0000
-40.4100,-29.9400,20.0000
-40.3800,-29.9000,20.0000
-40.3800,-29.9200,20.0000
-40.4000,-29.9000,20.1000
-40.3700,-29.9600,20.0000
-40.3600,-29.9700,20.0000
-40.3800,-29.9800,20.0000
-40.4200,-29.9000,20.0000
-40.4300,-29.9300,20.1000
-40.4500,-29.9300,20.1000
-40.4700,-29.9300,20.0000
-40.4400,-29.9100,20.0000
-40.4500,-29.9100,20.0000
-40.4700,-29.9100,20.0000
-40.5000,-29.9400,19.9000
-40.5300,-29.9200,20.1000
-40.5100,-29.9200,20.1000
-40.4900,-29.9400,19.9000
-40.4900,-29.9200,20.0000
-40.6200,-30.0000,20.2000
-40.6000,-30.0000,20.1000
-40.6800,-29.9900,20.1000
-40.4000,-29.8400,20.1000
-40.4800,-29.8700,20.1000
-40.4500,-29.8300,20.3000
-40.4600,-29.8900,20.1000
-40.4600,-29.8700,20.0000
-40.5000,-29.8800,20.3000
-40.4900,-29.9000,20.1000
-40.5100,-29.9000,20.3000
-40.5300,-29.9000,20.2000
-40.5600,-29.8500,20.3000
-40.5800,-29.8500,20.3000
-40.6300,-29.9000,19.9000
-40.7100,-29.9000,20.1000
-40.0500,-29.9600,20.3000
-40.1100,-29.9800,20.2000
-40.1100,-30.0000,20.2000
Can anybody help me?
So there are a couple of things. First of all, you are not providing equally spaced dimensions for the interpolation and the resulting netCDF file. This is how I created the spacing for the meshgrid (I chose a linear space of 100 points, but depending on what resolution you want your data at, you may want to change this to whatever suits your purpose):
spacing_x = np.linspace(np.min(lonf),np.max(lonf),100)
spacing_y = np.linspace(np.min(latf),np.max(latf),100)
x_1, y_1 = np.meshgrid(spacing_x, spacing_y)
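As a quick sanity check (a minimal sketch), both grid arrays should now be 100 x 100, which is also the shape the interpolated field will come back with:
print(x_1.shape, y_1.shape)  # expect (100, 100) (100, 100)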
Then doing the interpolation as follows:
#Barnes Interpolation
var = inverse_distance_to_grid(lonf, latf, sstf, x_1, y_1, r=100000, gamma=0.25, kappa=5.052, min_neighbors=3, kind='barnes')
#Or
#Another interpolation
interp = griddata((lonf, latf), sstf, (x_1, y_1), method='nearest')
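One caveat: if the observations can contain NaNs, the gridding may propagate them. A minimal sketch using the remove_nan_observations helper that is already imported above would drop such points first:
# Drop observations with NaN SST before gridding (only needed if the
# input file can contain missing values; harmless otherwise)
lonf, latf, sstf = remove_nan_observations(lonf, latf, sstf)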
Finally, you will want to use the linear spaces as the latitude and longitude dimension values, since the interpolated data is broadcast onto them:
#Passing data into variables
longitude[:] = x_1[0]
latitude[:] = y_1[:,0]
Another note: for Panoply or other software to show your data in a Geo2D format, you will want to name your lat/lon dimensions the same as the corresponding variables. The full code is below:
import os
import numpy as np
from scipy.interpolate import griddata
import matplotlib.pyplot as plt
import numpy.ma as ma
import netCDF4 as nc4
from numpy.random import uniform, seed
from metpy.interpolate import (interpolate_to_grid, remove_nan_observations, inverse_distance_to_grid, remove_repeat_coordinates)
# Open file
arq_sst = np.loadtxt(fname = r"C:\Users\Rodrigo\XYZ.txt", skiprows=0, delimiter=",")
# Getting the Arrays
lonf = arq_sst[:, 0]
latf = arq_sst[:, 1]
sstf = arq_sst[:, 2]
# Atmosphere level
z = [1]
# shaping grid
spacing_x = np.linspace(np.min(lonf),np.max(lonf),100)
spacing_y = np.linspace(np.min(latf),np.max(latf),100)
x_1, y_1 = np.meshgrid(spacing_x, spacing_y)
#Barnes Interpolation
var = inverse_distance_to_grid(lonf, latf, sstf, x_1, y_1, r=100000, gamma=0.25, kappa=5.052, min_neighbors=3, kind='barnes')
#Or
#Another interpolation
interp = griddata((lonf, latf), sstf, (x_1, y_1), method='nearest')
#Open netcdf to write
f = nc4.Dataset('file_created.nc','w', format='NETCDF4')
#Creating group in netcdf file
tempgrp = f.createGroup('SAT_DATA')
#Specifying dimensions
tempgrp.createDimension('longitude', len(spacing_x))
tempgrp.createDimension('latitude', len(spacing_y))
tempgrp.createDimension('z', len(z))
tempgrp.createDimension('time', None)
#Building variables
longitude = tempgrp.createVariable('longitude', 'f8', 'longitude', fill_value=np.nan)
latitude = tempgrp.createVariable('latitude', 'f8', 'latitude', fill_value=np.nan)
levels = tempgrp.createVariable('z', 'i4', 'z')
# latitude comes before longitude here because rows of the meshgrid output vary with latitude
sst = tempgrp.createVariable('sst', 'f8', ('time', 'latitude', 'longitude', 'z'), fill_value=np.nan)
time = tempgrp.createVariable('time', 'f8', 'time', fill_value=np.nan)
#Passing data into variables
longitude[:] = x_1[0]
latitude[:] = y_1[:,0]
levels[:] = z
sst[0, :, :, 0] = var
# get time in days since Jan 01, 0001
from datetime import datetime
today = datetime.today()
time_num = today.toordinal()
time[0] = time_num
#Add global attributes
f.description = "XYZ dataset containing one group"
f.history = "Created " + today.strftime("%d/%m/%y")
#Add local attributes to variable instances
longitude.units = 'degrees_east'
longitude.point_spacing = "even"
longitude._CoordinateAxisType = "Lon"
latitude.units = 'degrees_north'
latitude.point_spacing = "even"
latitude._CoordinateAxisType = "Lat"
time.units = "days since Jan 01, 0001"
time._ChunkSizes = [1]
sst.long_name = "SEA SURFACE TEMPERATURE"
sst.history = "From coads_climatology"
sst.units = "Deg C"
sst.missing_value = -1.0
sst._ChunkSizes = [1, 100, 100, 1]
levels.units = 'meters'
sst.warning = 'This data is not real!'
#Closing the dataset
f.close()
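As a quick way to verify the result (a minimal sketch, assuming the file was written to the current working directory), you can read the group back and check the shapes:
check = nc4.Dataset('file_created.nc', 'r')
grp = check.groups['SAT_DATA']
print(grp.variables['sst'].shape)      # expect (1, 100, 100, 1)
print(grp.variables['longitude'][:5])  # first few grid longitudes
check.close()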
Let me know if you have any questions.
Related
How to collocate large datasets most efficiently, comparing time, latitude (x), and longitude (y)
I would like some help trying to efficiently collocate two datasets. One is, let's say, observations of rainfall in terms of datetime, latitude and longitude. The other is meteorological data, e.g. reanalysis, given also in terms of datetime, latitude and longitude. Below I provide two example random datasets (a DataFrame and an xarray) and then collocate them.

from numpy.random import rand
from random import randint
from datetime import datetime, timedelta
import xarray as xr
import numpy as np
import pandas as pd

# create example data for the dataframe we want to collocate with the meteorological data
datetimes = pd.date_range(start='2002-01-01 10:00:00', end='2002-01-05 10:00:00', freq='H')
rainfall = rand(len(datetimes))
latitudes = [randint(0, 90) for p in range(0, len(datetimes))]
longitudes = [randint(0, 180) for p in range(0, len(datetimes))]
df_obs = pd.DataFrame({'datetime': datetimes, 'rainfall': rainfall,
                       'latitude': latitudes, 'longitude': longitudes})

# create an xarray which is the example met data
met_type = np.ones((720, 1440))
rainfall = rand(len(datetimes))
met_list = [x * met_type for x in rainfall]

def produce_xarray(met_list, datetimes, met_type='rain', datetime_var="datetime"):
    if isinstance(datetimes[0], datetime) == False:
        dates = [datetime.strptime(x, '%Y%m') for x in datetimes]
    if isinstance(datetimes[0], datetime) == True:
        dates = datetimes
    met_list_dstack = np.dstack(met_list)
    lats = np.arange(90, -90, -0.25)
    lons = np.arange(-180, 180, 0.25)
    ds = xr.Dataset(data_vars={met_type: (["latitude", "longitude", datetime_var], met_list_dstack)},
                    coords={"latitude": lats, "longitude": lons, datetime_var: dates})
    ds[met_type].attrs["units"] = "g " + str(met_type) + "m$^{-2}$"
    return ds

xr_met = produce_xarray(met_list, datetimes, datetime_var="datetime")

# now I wish to collocate the data as quickly as possible, as my datasets are huge -
# here I have a function which finds the closest value using the datetime, latitude
# and longitude, then I apply this function to the df of my random observations
var = 'rain'

def find_value_lat_lon(lat, lon, traj_datetime):
    array = xr_met[var].sel(latitude=lat, longitude=lon, datetime=traj_datetime, method='nearest').squeeze()
    value = array.values
    return value

def append_var_columnwise(df, var_name):
    df = df.copy()
    df.loc[:, var_name] = df[['latitude', 'longitude', 'datetime']].apply(
        lambda x: find_value_lat_lon(*x), axis=1)
    return df

print(df_obs)
print(xr_met)
df_obs = append_var_columnwise(df_obs, var_name='rain_met')
print(df_obs)

The final output is the df with an additional 'rain_met' column; for 97 data points this takes 212 ms.
I don't know that it is any faster, but .sel supports vectorized indexing (see https://docs.xarray.dev/en/stable/user-guide/indexing.html#vectorized-indexing — the last example in that section is a 2D version of your code):

df.loc[:, var_name] = xr_met[var].sel(
    latitude=xr.DataArray(df['latitude']),
    longitude=xr.DataArray(df['longitude']),
    datetime=xr.DataArray(df['datetime']),
    method='nearest')

Because the three DataArrays share the same dimension (built from the DataFrame index), the selection is pointwise: one value per row rather than a full cross-product.
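A slightly more explicit variant (a sketch, with the shared dimension named by hand so the pointwise alignment is guaranteed regardless of index names):

df.loc[:, var_name] = xr_met[var].sel(
    latitude=xr.DataArray(df['latitude'].values, dims='points'),
    longitude=xr.DataArray(df['longitude'].values, dims='points'),
    datetime=xr.DataArray(df['datetime'].values, dims='points'),
    method='nearest').values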
Reverse Array in a dataframe
Hi, I am trying to extract data from a netCDF file, but the data is upside down. How can I reverse it? The data I want to extract is the height data from the netCDF at the points I have in the CSV file. My code:

import numpy as np
from netCDF4 import Dataset
import matplotlib.pyplot as plt
import pandas as pd
from mpl_toolkits.basemap import Basemap
from matplotlib.patches import Path, PathPatch

csv_data = np.loadtxt('CSV with target coordinates', skiprows=1, delimiter=',')
num_el = csv_data[:, 0]
lat = csv_data[:, 1]
lon = csv_data[:, 2]
value = csv_data[:, 3]

data = Dataset("elevation Data", 'r')
lon_range = data.variables['x_range'][:]
lat_range = data.variables['y_range'][:]
topo_range = data.variables['z_range'][:]
spacing = data.variables['spacing'][:]
dimension = data.variables['dimension'][:]
z = data.variables['z'][:]

lon_num = dimension[0]
lat_num = dimension[1]
etopo_lon = np.linspace(lon_range[0], lon_range[1], dimension[0])
etopo_lat = np.linspace(lat_range[0], lat_range[1], dimension[1])
topo = np.reshape(z, (lat_num, lon_num))

height = np.empty_like(num_el)
desired_lat_idx = np.empty_like(num_el)
desired_lon_idx = np.empty_like(num_el)
for i in range(len(num_el)):
    tmp_lat = np.abs(etopo_lat - lat[i]).argmin()
    tmp_lon = np.abs(etopo_lon - lon[i]).argmin()
    desired_lat_idx[i] = tmp_lat
    desired_lon_idx[i] = tmp_lon
    height[i] = topo[tmp_lat, tmp_lon]
height[height < -10] = 0

print(len(desired_lat_idx))
print(len(desired_lon_idx))
print(len(height))

dfl = pd.DataFrame({
    'Latitude': lat.reshape(-1),
    'Longitude': lon.reshape(-1),
    'Altitude': height.reshape(-1)
})
print(dfl)

# but the Lat should not be changed here (the dfl must be correct)
df = dfl
lat = np.array(df['Latitude'])
lon = np.array(df['Longitude'])
val = np.array(df['Altitude'])

m = Basemap(projection='robin', lon_0=0, lat_0=0, resolution='l', area_thresh=1000)
m.drawcoastlines(color='black')
x, y = m(lon, lat)
colormesh = m.contourf(x, y, val, 100, tri=True, cmap='terrain')
plt.colorbar(location='bottom', pad=0.04, fraction=0.06)
plt.show()

I have already tried:

lat = csv_data[:, 1]
lat = lat*(-1)

But this didn't work.
It's a plotting artifact. Just do:

colormesh = m.contourf(x, y[::-1], val, 100, tri=True, cmap='terrain')

y[::-1] will reverse the order of the y latitude elements (as opposed to the land-mass outlines, and while keeping the x longitude coordinates the same) and hence flip them. I've often had this problem with plotting numpy image data in the past. Your raw CSV data are unlikely to be flipped themselves (why would they be?). You should try sanity-checking them [I am not a domain expert, I'm afraid]! Overlaying an actual coordinate grid can help with this. Another way to do it is given here: Reverse Y-Axis in PyPlot. You could also therefore just do:

ax = plt.gca()
ax.invert_yaxis()
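If the elevation grid itself turns out to be stored north-to-south (common for ETOPO-style files), another option, offered here only as a sketch under that assumption, is to flip both the grid and its latitude axis once, right after reshaping, so everything downstream sees a consistent south-to-north ordering:

# Only do this if etopo_lat really runs north-to-south in the file
topo = topo[::-1, :]
etopo_lat = etopo_lat[::-1]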
Masking a variable with lat and lon but needed 3d array
What I am trying is masking a value from an nc file with a numpy array, according to a specific location, but it gives me a 1D array and I cannot use that array for plotting. Here is my code:

from netCDF4 import Dataset
import matplotlib.pyplot as plt
import numpy as np
import numpy.ma as ma

file = './sample_data/NSS.AMBX.NK.D08214.S0740.E0931.B5312324.WI.nc'
data = Dataset(file, mode='r')
fcdBT89gHz = np.asarray(data.groups['Data_Fields']['fcdr_brightness_temperature_1'][:])
fcdBT150gHz = np.asarray(data.groups['Data_Fields']['fcdr_brightness_temperature_2'][:])
fcdBT183_1gHz = np.asarray(data.groups['Data_Fields']['fcdr_brightness_temperature_3'][:])
fcdBT183_3gHz = np.asarray(data.groups['Data_Fields']['fcdr_brightness_temperature_4'][:])
fcdBT183_7gHz = np.asarray(data.groups['Data_Fields']['fcdr_brightness_temperature_5'][:])
lats = data.groups['Geolocation_Time_Fields']['latitude']   # latitude values
lons = data.groups['Geolocation_Time_Fields']['longitude']  # longitude values
latlar = np.asarray(lats[:])  # Lat
lonlar = np.asarray(lons[:])  # Lon
lo = ma.masked_outside(lonlar, 105, 110)
la = ma.masked_outside(latlar, 30, 35)
merged_coord = ~ma.mask_or(la.mask, lo.mask)
h = plt.plot(fcdBT150gHz[merged_coord])

The output is not what I need: I want latitudes on the x axis. In case the shapes of the variables help:

lo.shape = (2495, 90)
la.shape = (2495, 90)
fcdBT150gHz[merged_coord].shape = (701,)

Maybe I did not use the right way of masking.
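Boolean indexing with merged_coord necessarily flattens the selected values to 1D. One way to keep the 2D swath shape for plotting (a minimal sketch, assuming you want points outside the box blanked out rather than dropped) is np.where:

# Keep the (2495, 90) shape, blanking everything outside the lat/lon box
masked_bt = np.where(merged_coord, fcdBT150gHz, np.nan)
masked_lat = np.where(merged_coord, latlar, np.nan)
plt.plot(masked_lat.ravel(), masked_bt.ravel(), '.')  # latitude on the x axis
plt.show()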
Calculate relative phase between two angles
I'm trying to calculate the relative phase between a time series of two angles. Using the code below, the angles are measured by the rotation derived from the xy points associated with Label A and Label B. The angles move in a similar direction for the first 3 time points and then deviate for the remaining 3 time points. My understanding was that the relative phase calculation using a Hilbert transform signified that values closer to 0° referred to a pattern of coordination, or in-phase. Conversely, values closer to 180° referred to asynchronous patterns, or anti-phase. Yet when I export the results below, I'm not seeing this.

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from scipy.signal import hilbert

df = pd.DataFrame({
    'Time' : [1,1,2,2,3,3,4,4,5,5,6,6],
    'Label' : ['A','B','A','B','A','B','A','B','A','B','A','B'],
    'x' : [-2.0,-1.0,-1.0,0.0,0.0,1.0,0.0,1.0,0.0,1.0,0.0,1.0],
    'y' : [-2.0,-1.0,-2.0,-1.0,-2.0,-1.0,-3.0,0.0,-4.0,1.0,-5.0,2.0],
})

x = df.groupby('Label')['x'].diff().fillna(0).astype(float)
y = df.groupby('Label')['y'].diff().fillna(0).astype(float)
df['Rotation'] = np.arctan2(y, x)
df['Angle'] = np.degrees(df['Rotation'])

df_A = df[df['Label'] == 'A'].reset_index(drop=True)
df_B = df[df['Label'] == 'B'].reset_index(drop=True)

y1 = df_A['Angle'].values
y2 = df_B['Angle'].values

ang1 = np.angle(hilbert(y1), deg=False)
ang2 = np.angle(hilbert(y2), deg=False)

f, ax = plt.subplots(3, 1, figsize=(20,5), sharex=True)
ax[0].plot(y1, color='r', label='y1')
ax[0].plot(y2, color='b', label='y2')
ax[0].legend(bbox_to_anchor=(0., 1.02, 1., .102), ncol=2)
ax[1].plot(ang1, color='r')
ax[1].plot(ang2, color='b')
ax[1].set(title='Angle at each Timepoint')

phase_synchrony = 1 - np.sin(np.abs(ang1 - ang2) / 2)
ax[2].plot(phase_synchrony)
ax[2].set(ylim=[0, 1.1], title='Instantaneous Phase Synchrony', xlabel='Time', ylabel='Phase Synchrony')
plt.tight_layout()
plt.show()
By your description I would simply use

phase_synchrony = 1 - np.sin(np.abs(y1 - y2) / 2)

The analytic representation via Hilbert transform applies when you have only the real part of a signal you know (or assume, based on reasonable principles) to be analytic; under such conditions you can find an imaginary part that makes the resulting function analytic. But in your case you already have x and y, so you can calculate the angle directly, as you have done already.

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

df = pd.DataFrame({
    'Time' : [1,1,2,2,3,3,4,4,5,5,6,6],
    'Label' : ['A','B','A','B','A','B','A','B','A','B','A','B'],
    'x' : [-2.0,-1.0,-1.0,0.0,0.0,1.0,0.0,1.0,0.0,1.0,0.0,1.0],
    'y' : [-2.0,-1.0,-2.0,-1.0,-2.0,-1.0,-3.0,0.0,-4.0,1.0,-5.0,2.0],
})

x = df.groupby('Label')['x'].diff().fillna(0).astype(float)
y = df.groupby('Label')['y'].diff().fillna(0).astype(float)
df['Rotation'] = np.arctan2(y, x)
df['Angle'] = np.degrees(df['Rotation'])

df_A = df[df['Label'] == 'A'].reset_index(drop=True)
df_B = df[df['Label'] == 'B'].reset_index(drop=True)

y1 = df_A['Angle'].values
y2 = df_B['Angle'].values

# no need to compute the Hilbert transforms here

f, ax = plt.subplots(3, 1, figsize=(20,5), sharex=True)
ax[0].plot(y1, color='r', label='y1')
ax[0].plot(y2, color='b', label='y2')
ax[0].legend(bbox_to_anchor=(0., 1.02, 1., .102), ncol=2)
ax[1].plot(y1, color='r')  # plot the angles themselves now that the Hilbert step is gone
ax[1].plot(y2, color='b')
ax[1].set(title='Angle at each Timepoint')

# all I changed
phase_synchrony = 1 - np.sin(np.abs(y1 - y2) / 2)
ax[2].plot(phase_synchrony)
ax[2].set(ylim=[0, 1.1], title='Instantaneous Phase Synchrony', xlabel='Time', ylabel='Phase Synchrony')
plt.tight_layout()
plt.show()
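Note that Angle here is stored in degrees while np.sin works in radians; if the synchrony values look off, a minimal variant that converts the difference first may behave more as described (1 near in-phase, approaching 0 toward anti-phase):

# convert the angle difference from degrees to radians before the sine
phase_synchrony = 1 - np.sin(np.deg2rad(np.abs(y1 - y2)) / 2)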
Python 2D array -- How to plug in x and retrieve y value?
I have been looking for an answer since yesterday but no luck. So I have a 1D spectrum (.fits) file with a flux value at each wavelength. I have converted them into a 2D array (x, y) = (wavelength, flux) and want to write a program which will return the flux (y) at some assigned wavelengths (x). I have tried this:

#modules
import scipy
import numpy as np
import pyfits as pf

#Target Global Variables
hdulist_tg = pf.open('cutmask1-2.0001.fits')
hdr_tg = hdulist_tg[0].header
flux_tg = hdulist_tg[0].data
crval_tg = hdr_tg['CRVAL1'] #Starting wavelength
cdel_tg = hdr_tg['CDELT1'] #Wavelength axis width
wave_tg = crval_tg + np.arange(3183)*cdel_tg #Create an x-axis
wavelist = [6207,6315,6369,6438,6490,6565,6588]

wave_flux = []
diff = 10
for wave in wave_tg:
    for flux in flux_tg:
        wave_flux.append((wave, flux))

for item in wave_flux:
    wave = item[0]
    flux = item[1]
    #Where I got my actual wavelength that exists in wave_tg
    diffmatch = np.abs(wave - wavelist[0])
    if diffmatch < diff:
        flux_wave = flux
        diff = diffmatch
        wavematch = wave
print wavelist[0], flux_wave, wavematch

but the program always returns the same flux value even though the wavelength is different. Please help...
I would skip the creation of the two-dimensional table altogether and just use interp:

fluxvalues = np.interp(wavelist, wave_tg, flux_tg)

For the file you posted, the code you posted doesn't work due to the hard-coded length of the wave_tg array. I would therefore recommend you rather use

wave_tg = crval_tg + np.arange(len(flux_tg))*cdel_tg

Also, for some reason it seems that the file you posted doesn't actually go up to the wavelengths you are looking up. You might need to check that you are calculating the corresponding wavelengths correctly, or check that you are looking up the right wavelengths.
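One thing to keep in mind (a hedged note): np.interp clamps to the endpoint values for lookups outside the data range, which can silently mask exactly the out-of-range problem described above. Passing the optional left/right arguments makes such lookups obvious:

# out-of-range wavelengths come back as NaN instead of a clamped endpoint value
fluxvalues = np.interp(wavelist, wave_tg, flux_tg, left=np.nan, right=np.nan)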
I've made some changes in your code: using numpy to create wave_flux as an ndarray (via np.vstack(), np.repeat() and np.tile()), and using fancy indexing to get the values matching your search. The resulting code is:

#modules
import scipy
import numpy as np
import pyfits as pf

#Target Global Variables
hdulist_tg = pf.open('cutmask1-2.0001.fits')
hdr_tg = hdulist_tg[0].header
flux_tg = hdulist_tg[0].data
crval_tg = hdr_tg['CRVAL1'] #Starting wavelength
cdel_tg = hdr_tg['CDELT1'] #Wavelength axis width
wave_tg = crval_tg + np.arange(3183)*cdel_tg #Create an x-axis
wavelist = [6207,6315,6369,6438,6490,6565,6588]

wave_flux = np.vstack((np.repeat(wave_tg, len(flux_tg)),
                       np.tile(flux_tg, len(wave_tg)))).transpose()

wave_ref = wavelist[0]
diff = 10
print wave_flux[np.abs(wave_flux[:, 0] - wave_ref) < diff]

Which will return a sub-group of wave_flux with the wave values in column 0 and the flux values in column 1:

[[ 6197.10300138   500.21020508]
 [ 6197.10300138   523.24102783]
 [ 6197.10300138   510.6390686 ]
 ...,
 [ 6216.68436446   674.94732666]
 [ 6216.68436446   684.74255371]
 [ 6216.68436446   712.20098877]]