I know there is software like wgrib2 that will convert files in grib and grib2 format to NetCDF files, but I need to go the other way: from NetCDF to grib2, because the local weather offices here can only consume gridded data in grib2 format.
It appears that one solution could be in Python, using the NetCDF4-Python library (or another) to read the NetCDF files and pygrib to write GRIB2.
Is there a better way?
After some more research, I ended up using the British Met Office "Iris" package (http://scitools.org.uk/iris/docs/latest/index.html), which can read NetCDF as well as OPeNDAP, GRIB and several other formats, and allows saving as NetCDF or GRIB.
Basically the code looks like:
import iris

cubes = iris.load('input.nc')        # each variable in the netcdf file is a cube
iris.save(cubes[0], 'output.grib2')  # save a specific variable to grib
But if your netcdf file doesn't contain sufficient metadata, you may need to add it, which you can also do with Iris. Here's a full working example:
https://github.com/rsignell-usgs/ipython-notebooks/blob/master/files/Iris_CFSR_wave_wind.ipynb
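For reference, a minimal sketch of that kind of metadata fix (the attribute values here are hypothetical placeholders; use whatever CF metadata your variable actually needs):

import iris

cube = iris.load('input.nc')[0]
# GRIB encoding needs proper CF metadata; these values are placeholders
cube.standard_name = 'air_temperature'
cube.units = 'K'
iris.save(cube, 'output.grib2')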
One can also use Climate Data Operators (CDO) for the task (https://code.zmaw.de/projects/cdo/wiki), but you need to install the software along with all its additional libraries.
I know CDO is mentioned above, but I thought it would be useful to give the full command:
cdo -f grb2 copy in.nc out.grb
ECMWF has a command-line tool for this: https://software.ecmwf.int/wiki/display/GRIB/grib_to_netcdf
I've got a GRIB file with an ECMWF forecast, and I'm keen to pull data from it based on coordinate inputs. As in: provide coordinates, and get the forecast for the next 5 days for a specific time (wind speed, gust speed, wind direction, wave height...).
I think Python is probably the best option to accomplish this. Can someone point me in the right direction? Even a few pointers would help.
I'm guessing the binary needs to be converted to JSON (or another readable format), and then I can parse through it and look for the data based on the coordinates provided?
One way of doing this in native Python is using xarray and cfgrib. Here is a tutorial. Here is the key code from the tutorial:
import xarray as xr

# Use xr.open_dataset (not xr.tutorial.load_dataset, which only fetches named sample datasets)
ds = xr.open_dataset('<your_grib>.grib', engine='cfgrib')
Once you have done this, all the fields in the grib file will be available. The general form is
ds.<field_name>[<index>].values
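For the coordinate-based lookup in the question, xarray's .sel with nearest-neighbour matching is one way to do it. A sketch, continuing from the ds above; 'u10' and the 'latitude'/'longitude' coordinate names are assumptions typical of ECMWF output, so check print(ds) for yours:

# Nearest grid point to the requested coordinates
point = ds.sel(latitude=52.37, longitude=4.90, method='nearest')

# All forecast steps of the 10 m wind u-component at that point
print(point['u10'].values)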
Be warned that this code is very slow compared to using the GRIB tools provided by the US National Weather Service. Check out degrib. Most of the weather processing code is written in C and Fortran, because it is so much faster than Python. Depending on your available compute resources and data size, you may not be able to process a whole grib file in Python before the forecast it contains expires.
Finally, this topic is discussed more extensively on the GIS stack exchange. "grib" is a tag over there.
I need to read, manipulate and write PLY files in Python. PLY is a format for storing 3D objects. Through a simple search I've found two relevant libraries, PyMesh and plyfile. Has anyone had any experience with either of them, and does anyone have any recommendations? plyfile seems to have been dormant for a year now, judging by Github.
I know this question invites opinion-based answers, but I don't really know where else to ask it.
As of January 2020: neither. Use open3d. It's the easiest option and reads .ply files directly into numpy.
import numpy as np
import open3d as o3d
# Read .ply file
input_file = "input.ply"
pcd = o3d.io.read_point_cloud(input_file) # Read the point cloud
# Visualize the point cloud within open3d
o3d.visualization.draw_geometries([pcd])
# Convert open3d format to numpy array
# Here, you have the point cloud in numpy format.
point_cloud_in_numpy = np.asarray(pcd.points)
References:
http://www.open3d.org/docs/release/tutorial/Basic/visualization.html
http://www.open3d.org/docs/release/tutorial/Basic/working_with_numpy.html
I have successfully used plyfile while working with point clouds.
It's true that the project hasn't shown any activity for a long time, but it serves its purpose.
And parsing a PLY file is not exactly a problem that demands a constant stream of new features.
On the other hand, PyMesh offers many other features besides parsing PLY files.
So maybe the question is:
Do you want to just 'read, manipulate and write PLY files' or are you looking for a library that provides more extra features?
What made me choose plyfile was that I'm able to incorporate it into my project by just copying a single source file. Also, I wasn't interested in any of the other features that PyMesh offers.
Update
I ended up writing my own functions to read/write PLY files (supporting both ascii and binary) because I found the plyfile source code a little messy.
If anyone is interested, here is a link to the file:
ply reader/writer
I've just updated meshio to support PLY as well, next to about 20 other formats. Install with
pip install meshio
and use either on the command line
meshio convert in.ply out.vtk
or from within Python like
import meshio
mesh = meshio.read("in.ply")
# mesh.points, mesh.cells, ...
I rolled my own ascii PLY writer (the format is so simple that I didn't want to take a dependency). Later, I was lazy and took a dependency on plyfile for loading binary .ply files coming from other places. Nothing has caught on fire yet.
One thing to mention: for better or worse, the .ply format is extensible. We shoehorned custom data into it, and that was easy since we also wrote our own writer.
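For illustration, here's a sketch of writing one made-up custom per-vertex property ('confidence' is hypothetical) with plyfile rather than our in-house writer:

import numpy as np
from plyfile import PlyData, PlyElement

# Structured array: standard x/y/z plus a custom 'confidence' field
vertex = np.array([(0.0, 0.0, 0.0, 0.95), (1.0, 0.0, 0.0, 0.80)],
                  dtype=[('x', 'f4'), ('y', 'f4'), ('z', 'f4'), ('confidence', 'f4')])
PlyData([PlyElement.describe(vertex, 'vertex')], text=True).write('custom.ply')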
I would like to find a way in Python to aggregate over the slow index (time) of a NetCDF dataset with dimensions (time, y, x), where the files store blocks of time. Apparently NetCDF4-python does this for NetCDF4 classic or NetCDF3 files, but the files are a done deal. Can anyone explain if there is a way to do this with NetCDF4, either with multifile access or with something like NcML wrappers?
Or does NetCDF4 not do this for a reason that cannot be overcome?
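For context, the multifile access I mean is netCDF4-python's MFDataset, which as far as I can tell only handles NETCDF3 and NETCDF4_CLASSIC files (the file and variable names below are made up):

from netCDF4 import MFDataset

nc = MFDataset('block_*.nc')                 # each file holds a block of time
var = nc.variables['myvar']                  # appears as one (time, y, x) variable
first_month = var[0:31, :, :].mean(axis=0)   # aggregate over the slow time index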
Thanks.
I have downloaded climate model output in the form of netcdf files with one variable (pr) for the whole world with a daily time-step. My final goal is to have monthly data for Europe.
I have never used netcdf files before, and all the specific software for netcdf I could find doesn't seem to work in Windows. Since I programme in R, I tried using the ncdf4 package but ran into memory-size problems (my files are around 2 GB)... I am now trying the netCDF4 module in Python (first time I am using Python, so go easy on me).
I have managed to install everything and found some code online to import the dataset:
from netCDF4 import Dataset

nc_fid = Dataset(nc_f, 'r')  # nc_f is the path to the .nc file
# Extract data from NetCDF file
lats = nc_fid.variables['lat'][:]
lons = nc_fid.variables['lon'][:]
time = nc_fid.variables['time'][:]
pp = nc_fid.variables['pr'][:]
However, all the tutorials I found are on how to make a netcdf file... I have no idea how to aggregate this daily rainfall (variable pr) into monthly values. Also, I have different types of calendar in different files, but I don't even know how to access that information:
time.calendar
AttributeError: 'numpy.ndarray' object has no attribute 'calendar'
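(From what I can tell, slicing with [:] returns a plain numpy array, which drops the attributes; the calendar seems to live on the variable object itself:)

time_var = nc_fid.variables['time']
print(time_var.calendar)  # e.g. 'standard' or '365_day'
print(time_var.units)     # e.g. 'days since 1850-01-01'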
Please help, I don't want to have to learn Linux just so I can sort out some data :(
Why not avoid programming entirely and use NCO, which supplies the ncrcat command that aggregates data thusly:
ncrcat day*.nc month.nc
Voilà. See more ncrcat examples here.
Added 20160628: If instead of a month-long timeseries you want a monthly average, then use the same command, only with ncra instead of ncrcat. The manual explains things like this.
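For example (same wildcard pattern as above; the output file name is arbitrary):

ncra day*.nc month_avg.nc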
If you have a daily timestep and you want to calculate the monthly mean, then you can do
cdo monmean input_yyyy.nc output_yyyy.nc
It sounds as if you have several of these files, so you will need to merge them with
cdo mergetime file_*.nc timeseries.nc
where the * is a wildcard for the years.
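If I remember the operator-chaining syntax correctly, the two steps can also be combined into a single call (CDO applies the chained operators right to left):

cdo monmean -mergetime file_*.nc monthly_timeseries.nc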
I have a few big sets of HDF5 files, and I am looking for an efficient way of converting the data in these files into XML, TXT, or some other easily readable format. I tried working with the Python package h5py (www.h5py.org), but I was not able to figure out a method to get this done fast enough. I am not restricted to Python and can also code in Java, Scala or Matlab. Can someone give me some suggestions on how to proceed with this?
Thanks,
TM
Mathias711's method is the best direct way. If you want to do it within python, then use pandas.HDFStore:
from pandas import HDFStore

store = HDFStore('inputFile.hd5')
store['table1Name'].to_csv('outputFileForTable1.csv')  # dump one table to CSV
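If you don't know the table names in advance, store.keys() lists them, so you can dump everything in a loop (the output file names here are just a convention):

from pandas import HDFStore

store = HDFStore('inputFile.hd5')
for key in store.keys():                    # keys look like '/table1Name'
    store[key].to_csv(key.strip('/') + '.csv')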
You can use h5dump -o dset.asci -y -w 400 dset.h5
-o dset.asci specifies the output file
-y suppresses printing of the array indices alongside the data, and -w 400 sets the width of the output lines (number of columns). You should use a very large number here so that rows are not wrapped.
dset.h5 is of course the hdf5 file you want to convert
I think this is the easiest way to convert it to an ascii file, which you can import into Excel or whatever you want. I did it a couple of times, and it worked for me. I got this information from this website.