Preserving the WCS information of a FITS file when rebinned - python

Aim: Rebin an existing image (a FITS file) and write the rebinned data to a new FITS file.
Issue: The rebinned FITS file and the original FITS file seem to have mismatched co-ordinates (figure shown later in the question).
Process: I will briefly describe my process to shed more light. The first step is to read the existing FITS file and define numpy arrays:
import numpy as np
import matplotlib.pyplot as plt
from astropy.visualization import astropy_mpl_style
from astropy.io import fits
%matplotlib notebook
import aplpy
from aplpy import FITSFigure
file = 'F0621_HA_POL_0700471_HAWDHWPD_PMP_070-199Augy20.fits'
hawc = fits.open(file)
stokes_i = np.array(hawc[0].data)
# congrid and newdim are defined elsewhere; newdim is the target shape
stokes_i_rebinned = congrid(stokes_i,newdim,method="neighbour", centre=False, minusone=False)
Here "congrid" is a function I have used for nearest-neighbour rebinning; it resamples the original array to the new dimensions given by "newdim". Now the goal is to write this rebinned array back to disk as a new FITS file. I have several more such arrays, but for brevity I include just one as an example. To keep the header information the same, I read the header of that array from the existing FITS file and use it to write the new array into a new FITS file. After writing, the rebinned file can be read just like the original:
header_0= hawc[0].header
fits.writeto("CasA_HAWC+_rebinned_congrid.fits", stokes_i_rebinned, header_0, overwrite=True)
rebinned_file = 'CasA_HAWC+_rebinned_congrid.fits'
hawc_rebinned= fits.open(rebinned_file)
To check how the rebinned image looks, I plot the original and the rebinned image side by side:
cmap = 'rainbow'
stokes_i = hawc[0]
stokes_i_rebinned = hawc_rebinned[0]
axi = FITSFigure(stokes_i, subplot=(1,2,1)) # generate FITSFigure as subplot to have two axes together
axi.show_colorscale(cmap=cmap) # show I
axi_rebinned = FITSFigure(stokes_i_rebinned, subplot=(1,2,2),figure=plt.gcf())
axi_rebinned.show_colorscale(cmap=cmap) # show I rebinned
# FORMATTING
axi.set_title('Stokes I (146 x 146)')
axi_rebinned.set_title('Rebinned Stokes I (50 x 50)')
axi_rebinned.axis_labels.set_yposition('right')
axi_rebinned.tick_labels.set_yposition('right')
axi.tick_labels.set_font(size='small')
axi.axis_labels.set_font(size='small')
axi_rebinned.tick_labels.set_font(size='small')
axi_rebinned.axis_labels.set_font(size='small')
As you can see from the original and rebinned images, the X,Y co-ordinates seem mismatched. My best guess is that the WCS (world co-ordinate system) of the original FITS file wasn't properly carried over to the new FITS file, which causes the mismatch. So how do I align these co-ordinates?
Any help will be deeply appreciated! Thanks

I'm posting the answer I gave in an Astropy Slack channel here in case it is useful for others.
congrid will not work because it knows nothing about the WCS. For example, your CD matrix is tied to the original image, not to the re-binned one. There are a number of ways to re-bin data with a proper WCS. You might consider reproject, although this often requires another WCS header to re-bin to.
Montage (not a Python tool itself, but it has Python wrappers) is potentially another way.

As @astrochun already said, your re-binning function does not adjust the WCS of the re-binned image. In addition to reproject and Montage, the astropy.wcs.WCS object has a slice() method. You could try using it to "re-bin" the WCS like this:
from astropy.wcs import WCS
import numpy as np
wcs = WCS(hawc[0].header, hawc)
wcs_rebinned = wcs.slice((np.s_[::2], np.s_[::2]))
wcs_hdr = wcs_rebinned.to_header()
header_0.update(wcs_hdr) # but watch out for CD->PC conversion
You should also make a "real" copy of the header: header_0= hawc[0].header only creates a second reference to the same object, so use, for example, header_0= hawc[0].header.copy(), or else header_0.update(wcs_hdr) will modify hawc[0].header as well.
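To make the header arithmetic behind such a "re-binned" WCS concrete, here is a minimal sketch that does not depend on astropy. It assumes block-binning by the same factor on both axes, with each new pixel covering `factor` original pixels starting at the first pixel. The helper name rebin_wcs_keywords and the header values are hypothetical, and congrid's nearest-neighbour sampling may follow a slightly different pixel-centre convention.

```python
def rebin_wcs_keywords(header, factor):
    """Return linear WCS keywords adjusted for block-binning by `factor`.

    Hypothetical helper: assumes a 2-D image, the same factor on both axes,
    and FITS-style 1-based pixel centres. `header` can be any dict-like object.
    """
    updated = {}
    for axis in (1, 2):
        key = f"CRPIX{axis}"
        if key in header:
            # Pixel centres map as p_new = (p_old - 0.5) / factor + 0.5
            updated[key] = (header[key] - 0.5) / factor + 0.5
    # The pixel scale grows by `factor`, whether it is stored as CDELTi
    # or as the elements of a CD matrix
    for key in ("CDELT1", "CDELT2", "CD1_1", "CD1_2", "CD2_1", "CD2_2"):
        if key in header:
            updated[key] = header[key] * factor
    return updated


# Made-up values for a 146 x 146 image, not taken from the HAWC+ file
original = {"CRPIX1": 73.5, "CRPIX2": 73.5,
            "CD1_1": -0.001, "CD1_2": 0.0, "CD2_1": 0.0, "CD2_2": 0.001}
rebinned = rebin_wcs_keywords(original, factor=2)
print(rebinned)
```

The returned dictionary contains only the keywords that were present in the input, so it could be applied to a copied header with header_0.update(rebinned). For a non-integer ratio such as 146/50 the same formulas apply, but the result is only as accurate as the assumption about where the re-binned pixel centres fall.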

Related

How to convert 3D .npy files to .nii files using torchio

That is, I want to generate a heatmap: I process a set of 3D NumPy arrays and finally want to turn this set of arrays into a .nii file to represent it.
These arrays are not extracted from a .nii file using image.numpy(); I generated them myself with NumPy.
I have been using a reader function to do this, but it is cumbersome and very strict about the path parameters that have to be passed in, so I would like to ask if there is a simpler way.
import numpy as np
import torch
import os
import torchio as tio
def numpy_reader(npy_name):
    data = torch.from_numpy(np.load(npy_name)).unsqueeze(0)
    affine = np.eye(4)
    return data, affine
image_nii = tio.ScalarImage(npy_name, reader=numpy_reader)
image_nii.save(os.path.join('./nii_temp', savenii_name))

How to extract a profile of values from a raster along a given line?

How to extract a profile of values from a raster along a given shapefile line in Python?
I am struggling to find a method to extract a profile of values (e.g. a topographic profile) from a raster (GeoTIFF). The Rasterio library has a method to clip/extract values from a raster based on a polygon, but I cannot find an equivalent method for a line shapefile.
There is a basic method with scipy, but it does not inherently preserve geographic information the way a method based on a higher-level toolbox like rasterio could.
In other words, I am looking for an equivalent in Python of what the tool Terrain Profile in QGIS offers.
Thanks
This is a bit different than extracting for a polygon, as you want to sample every pixel touched by the line, in the order they are touched (the polygon approaches don't care about pixel order).
It looks like it would be possible to adapt this approach to use rasterio instead. Given a line read from a shapefile using geopandas or fiona as a shapely object, you use the endpoints to derive a new equidistant projection that you use as dst_crs in a WarpedVRT and read pixel values from that. It looks like you would need to calculate the length of your line in terms of the number of pixels you want sampled, this is the width parameter of the WarpedVRT.
This approach may need to be adapted further if your line is not an approximately straight line between the endpoints.
If you just want the raw pixel values under the line, you should be able to use a mask in rasterio, or rasterize directly, for each line. For lines, you may want to set all_touched=True.
I had a similar problem and found a solution which works for me. The solution uses shapely to sample points on a line/lines and then accesses respective values from the GeoTiff, therefore the extracted profile follows the direction of the line. Here is the method that I ended up with:
def extract_along_line(xarr, line, n_samples=256):
    profile = []
    for i in range(n_samples):
        # get next point on the line (fractions run from 0 to 1 inclusive)
        point = line.interpolate(i / (n_samples - 1), normalized=True)
        # access the nearest pixel in the xarray
        value = xarr.sel(x=point.x, y=point.y, method="nearest").data
        profile.append(value)
    return profile
Here is a working example with data from the copernicus-dem database and the line is the diagonal of the received tile:
import rioxarray
import shapely.geometry
import matplotlib.pyplot as plt
sample_tif = ('https://elevationeuwest.blob.core.windows.net/copernicus-dem/'
              'COP30_hh/Copernicus_DSM_COG_10_N35_00_E138_00_DEM.tif')
# Load xarray
tile = rioxarray.open_rasterio(sample_tif).squeeze()
# create a line (here it's the diagonal of the tile)
line = shapely.geometry.MultiLineString([[
    [tile.x[-1],tile.y[-1]],
    [tile.x[0], tile.y[0]]]])
# use the method from above to extract the profile
profile = extract_along_line(tile, line)
plt.plot(profile)
plt.show()
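For readers who want to see what the nearest-pixel lookup amounts to without xarray or shapely, here is a rough numpy-only sketch of the same idea for a straight segment. It assumes a north-up raster with square pixels. The function name sample_profile and its parameters (origin_x, origin_y, pixel_size) are made up for illustration and do not come from any library.

```python
import numpy as np


def sample_profile(raster, origin_x, origin_y, pixel_size, start, end, n_samples=5):
    """Nearest-pixel values along the straight segment from `start` to `end`.

    Assumes a north-up raster whose top-left corner is (origin_x, origin_y)
    and whose pixels are squares of side `pixel_size` (hypothetical names).
    """
    # Equally spaced sample points, including both endpoints
    xs = np.linspace(start[0], end[0], n_samples)
    ys = np.linspace(start[1], end[1], n_samples)
    # World -> array indices: columns grow with x, rows grow as y decreases
    cols = np.floor((xs - origin_x) / pixel_size).astype(int)
    rows = np.floor((origin_y - ys) / pixel_size).astype(int)
    # Clamp so points exactly on the far edge stay inside the array
    cols = np.clip(cols, 0, raster.shape[1] - 1)
    rows = np.clip(rows, 0, raster.shape[0] - 1)
    return raster[rows, cols]


raster = np.arange(16).reshape(4, 4)  # toy 4 x 4 "elevation" grid
profile = sample_profile(raster, origin_x=0.0, origin_y=4.0, pixel_size=1.0,
                         start=(0.5, 3.5), end=(3.5, 0.5), n_samples=4)
print(profile)  # [ 0  5 10 15]
```

The samples come back in the order they lie along the segment, which is the property the polygon-based approaches do not guarantee. For a real GeoTIFF the origin and pixel size would come from the file's affine transform, and rotated or non-square grids would need the full transform rather than this simplification.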

APLpy coordinate change to pixel values

Does anyone know how to change the coordinate values to the pixel values of a fits file image in APLpy?
The only way I can think of is to pass in the data array instead of the FITS file. If it has no WCS information, it must operate in pixel space.
import aplpy
from astropy.io import fits # or import pyfits
with fits.open(filename) as hdus:
    data = hdus[0].data
    f1 = aplpy.FITSFigure(data)
    # ... whatever you want to do thereafter.
I haven't used APLpy, so maybe there is a better way, but I haven't found anything in the documentation.

Incorrect reshape of vtk->numpy array?

I am reading a VTK uniform grid into python. When I visualize a slice through the data in Paraview, I get the following (correct) image:
Then I visualize the slice via numpy & pylab using the following script:
import vtk
from vtk.util.numpy_support import vtk_to_numpy
import pylab
imr=vtk.vtkXMLImageDataReader()
imr.SetFileName('flow.vti')
imr.Update()
im=imr.GetOutput()
nx,ny,nz=im.GetDimensions()
orig=im.GetOrigin()
extent=im.GetExtent()
spacing=im.GetSpacing()
flowVtk=im.GetPointData().GetArray("|flow|")
flow=vtk_to_numpy(flowVtk).reshape(nx,ny,nz)
# bottom z-slice
flowZ0=flow[:,:,0]
# set extent so that axes units are physical
img=pylab.imshow(flowZ0,extent=[orig[0],orig[0]+extent[1]*spacing[0],orig[1],orig[1]+extent[3]*spacing[1]],cmap=pylab.gray())
img.set_clim(vmin=0,vmax=1000)
pylab.show()
which seems to be out-of-phase. I tried reordering dimensions in reshape(...), it did something, but it has never shown the data it is actually supposed to show.
Is there something obviously wrong?
EDIT: I also tried reshape((nx,ny,nz),order="F") (fortran ordering) and now I get a much better image (with jet colormap for better clarity) which is almost correct, but the data is suspiciously rotated by 90°, plus I would like some authoritative explanation which ordering to use and why (which one is used by VTK internally?).
EDIT2: to get the same view as in Paraview, I had to do pylab.imshow(np.rot90(flowZ0)); not sure why, so the question is still open:
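As a hedged illustration of why the Fortran-ordered reshape helps: VTK image data is laid out with x varying fastest, then y, then z. The numpy-only sketch below builds a small flat array with that layout (the sizes and the value encoding are arbitrary choices for the demo, not taken from flow.vti) and compares the possible reshapes.

```python
import numpy as np

nx, ny, nz = 2, 3, 4

# Flat array with x varying fastest: flat index = i + nx * (j + ny * k).
# Each value encodes its own (i, j, k) as 100*i + 10*j + k.
flat = np.array([100 * i + 10 * j + k
                 for k in range(nz) for j in range(ny) for i in range(nx)])

wrong = flat.reshape(nx, ny, nz)                  # C order treats z as fastest
fortran = flat.reshape((nx, ny, nz), order="F")   # x fastest, indexed [i, j, k]
c_order = flat.reshape(nz, ny, nx)                # x fastest, indexed [k, j, i]

print(fortran[1, 0, 2])   # 102, the value stored for (i=1, j=0, k=2)
print(c_order[2, 0, 1])   # 102, same element with the axes reversed
print(wrong[1, 0, 2])     # 12, a different element

# A z-slice indexed [x, y]: imshow draws the first axis as rows, top to bottom,
# so x would run down the picture unless the slice is transposed.
z_slice = fortran[:, :, 0]
rotated = np.rot90(z_slice)
print(np.array_equal(rotated, z_slice.T[::-1]))   # True
```

Under these assumptions, either reshape((nx, ny, nz), order="F") or reshape(nz, ny, nx) recovers the data. The remaining 90° rotation comes from imshow's convention rather than from VTK: np.rot90 of a slice indexed [x, y] is the transposed slice flipped vertically, which is what imshow(z_slice.T, origin="lower") would display.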

Different colors for each line in the graph

I am trying to plot some data from a file. The file contains 13 columns, but I want to plot just the first and the fourth column. Also, there is more than one such file, and I want to plot them all on the same diagram. I have succeeded in showing the lines on the diagram, and I have added my plotting code below. The problem is that I want a different color for each file, but my code uses the same color for all of them. How can I correct it?
Thank you.
# gen_len is an array, same for all files
# gen_number is an array contains information
# of files
colors="bgrcmyk"
index=0
for gen in gen_number:
    plt.plot(gen,gen_len,color=colors[index])
    index=index+1
plt.savefig('result.png')
plt.show()
A more elegant solution for reading in your files would be to use numpy's genfromtxt, which can import just your desired columns, and also ignore lines starting with a certain character (the comments='#' keyword). I think this code does as you want:
import numpy as np
import matplotlib.pyplot as plt
import sys
colors="bgrcmyk"
for i in range(1,len(sys.argv)):
    gen,gen_len=np.genfromtxt(sys.argv[i],usecols=(0,3),unpack=True,comments='#')
    # i starts at 1, so shift by one and wrap around the available colors
    plt.plot(gen,gen_len,c=colors[(i-1)%len(colors)])
plt.savefig('result.png')
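Indexing a fixed palette string raises an IndexError once there are more files than colors. One way around that, sketched here with only the standard library and made-up file names, is to pair each file with a color from an endlessly repeating palette:

```python
from itertools import cycle

colors = "bgrcmyk"
files = [f"run_{n}.dat" for n in range(9)]  # hypothetical file names

# zip stops at the shorter input; cycle() repeats the palette indefinitely
assignment = list(zip(files, cycle(colors)))
for name, color in assignment:
    print(name, color)
```

Each (name, color) pair could then be passed to plt.plot(..., c=color) inside the loop. With nine files and seven colors, the eighth and ninth files reuse "b" and "g", so a longer palette or different line styles would be needed if every line must be distinguishable.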
