zonal_stats: width and height must be > 0 error

I am trying to use the zonal_stats function from the rasterstats Python package to get the raster statistics of a .tif file for each shape in a .shp file. I managed to do it in QGIS without any problems, but I have to do the same with more than 200 files, which would take a lot of time, so I'm trying the Python way. Both files and the replication code are in my Google Drive.
My script is:
import rasterio
import geopandas as gpd
import numpy as np
from rasterio.plot import show
from rasterstats import zonal_stats
from rasterio.transform import Affine
# Import .tif file
raster = rasterio.open(r'M:\PUBLIC\Felipe Dias\Pesquisa\Interpolação Espacial\Arroz_2019-03.tif')
# Read the raster values
array = raster.read(1)
# Get the affine
affine = raster.transform
# Import shape file
shapefile = gpd.read_file(r'M:\PUBLIC\Felipe Dias\Pesquisa\Interpolação Espacial\Setores_Censit_SP_WGS84.shp')
# Zonal stats
zs_shapefile = zonal_stats(shapefile, array, affine = affine,
stats=['min', 'max', 'mean', 'median', 'majority'])
I get the following error:
Input In [1] in <cell line: 22>
zs_shapefile = zonal_stats(shapefile, array, affine = affine,
File ~\Anaconda3\lib\site-packages\rasterstats\main.py:32 in zonal_stats
return list(gen_zonal_stats(*args, **kwargs))
File ~\Anaconda3\lib\site-packages\rasterstats\main.py:164 in gen_zonal_stats
rv_array = rasterize_geom(geom, like=fsrc, all_touched=all_touched)
File ~\Anaconda3\lib\site-packages\rasterstats\utils.py:41 in rasterize_geom
rv_array = features.rasterize(
File ~\Anaconda3\lib\site-packages\rasterio\env.py:387 in wrapper
return f(*args, **kwds)
File ~\Anaconda3\lib\site-packages\rasterio\features.py:353 in rasterize
raise ValueError("width and height must be > 0")
I found this question about the same problem, but its solution does not work for me. I tried reversing the signs of the items in the Affine of my raster data:
''' Trying to use the same solution of question: https://stackoverflow.com/questions/62010050/from-zonal-stats-i-get-this-error-valueerror-width-and-height-must-be-0 '''
old_tif = rasterio.open(r'M:\PUBLIC\Felipe Dias\Pesquisa\Interpolação Espacial\Arroz_2019-03.tif')
print(old_tif.profile) # copy & paste the output and change signs
new_tif_profile = old_tif.profile
# Affine(0.004611149999999995, 0.0, -46.828504575,
# 0.0, 0.006521380000000008, -24.01169169)
new_tif_profile['transform'] = Affine(0.004611149999999995, 0.0, -46.828504575,
0.0, -0.006521380000000008, 24.01169169)
new_tif_array = old_tif.read(1)
new_tif_array = np.fliplr(np.flip(new_tif_array))
with rasterio.open(r'M:\PUBLIC\Felipe Dias\Pesquisa\Interpolação Espacial\tentativa.tif', "w", **new_tif_profile) as dest:
dest.write(new_tif_array, indexes=1)
dem = rasterio.open(r'M:\PUBLIC\Felipe Dias\Pesquisa\Interpolação Espacial\tentativa.tif')
# Read the raster values
array = dem.read(1)
# Get the affine
affine = dem.transform
# Import shape file
shapefile = gpd.read_file(r'M:\PUBLIC\Felipe Dias\Pesquisa\Interpolação Espacial\Setores_Censit_SP_WGS84.shp')
# Zonal stats
zs_shapefile = zonal_stats(shapefile, array, affine=affine,
stats=['min', 'max', 'mean', 'median', 'majority'])
Doing it this way, I don't get the "width and height must be > 0" error! But every stat in zs_shapefile is None, so it doesn't solve my problem.
Does anyone understand why this error happens, and which sign I have to reverse to make it work? Thanks in advance!

I would be careful about overriding the geotransform of your raster like this, unless you are really convinced the original metadata is incorrect. I'm not too familiar with Affine, but it looks like you're now setting the latitude to a positive value, which places the raster in the northern hemisphere. My guess is that the resulting lack of intersection between the vector and the raster causes the None results.
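To make the intersection guess concrete, here is a minimal pure-Python sketch that derives a raster's extent from its affine coefficients and tests it against a vector bounding box. The pixel sizes and origins are the ones shown in the question; the 100 x 100 raster size and the vector bounding box are made-up values for illustration, and raster_bounds and boxes_overlap are hypothetical helper names.

```python
def raster_bounds(a, c, e, f, width, height):
    """Extent (minx, miny, maxx, maxy) of an unrotated raster.

    a, e: pixel width and height (e may be negative).
    c, f: x and y of the origin corner.
    """
    x_edges = (c, c + a * width)
    y_edges = (f, f + e * height)
    return (min(x_edges), min(y_edges), max(x_edges), max(y_edges))


def boxes_overlap(box1, box2):
    """True if two (minx, miny, maxx, maxy) boxes intersect."""
    return (box1[0] <= box2[2] and box2[0] <= box1[2]
            and box1[1] <= box2[3] and box2[1] <= box1[3])


# Assumed raster size and vector extent (not read from the real files)
width, height = 100, 100
vector_box = (-46.7, -23.8, -46.5, -23.5)

# Coefficients before and after the sign change made in the question
original = raster_bounds(0.00461115, -46.828504575, 0.00652138, -24.01169169, width, height)
modified = raster_bounds(0.00461115, -46.828504575, -0.00652138, 24.01169169, width, height)

print(boxes_overlap(original, vector_box))  # True
print(boxes_overlap(modified, vector_box))  # False
```

With the real files, the same comparison could probably be made from the bounds attribute of the opened rasterio dataset and the total_bounds attribute of the GeoDataFrame.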
I'm also not familiar with rasterstats, but I'm guessing it boils down to GDAL and NumPy at its core. So something you could try as a test is to add the all_touched=True keyword:
https://pythonhosted.org/rasterstats/manual.html#rasterization-strategy
If that works, it might indicate that the rasterization fails because your polygons are so small compared to the pixels, that the default rasterization method results in a rasterized polygon of size 0 (in at least one of the dimensions). And that's what the error also hints at (my guess).
Keep in mind that all_touched=True changes the stats you get in result, so I would only do it for testing, or if you're comfortable with this difference.
If you really need a valid value for these (too) small polygons, there are a few workarounds you could try. Something I've done is to simply take the centroid for these polygons, and take the value of the pixel where this centroid falls on.
A potential way to identify these polygons would be to use all_touched with the "count" statistic: every polygon with a count of only 1 might be too small to get rasterized correctly. To really find this out you would probably have to do the rasterization yourself using GDAL, given that rasterstats doesn't seem to allow it.
Note that due to the shape of some of the polygons you use, the centroid might fall outside of the polygon. But given how coarse your raster data is relative to the vector, I don't think it would impact the result all that much.
An alternative is, instead of modifying the vector, to significantly increase the resolution of your raster. You could use gdal_translate to output this to a VRT, with some form of resampling, and avoid having to write this data to disk. Once the resolution is high enough that all polygons rasterize to at least a 1x1 array, it should probably work. But your polygons are tiny compared to the pixels, so it'll be a lot. You could guess it, or analyze the envelopes of all polygons. For example take the smallest edge of the envelope as more or less the resolution that's necessary for a correct rasterization.
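As a rough sketch of the envelope analysis, the snippet below finds the smallest envelope edge in a list of bounding boxes and turns it into an integer upsampling factor. The helper names and the sample envelopes are hypothetical; with geopandas, the boxes could probably be taken from the bounds attribute of the GeoDataFrame, which has minx, miny, maxx and maxy columns.

```python
import math


def smallest_envelope_edge(bounds_list):
    """Shortest side among all (minx, miny, maxx, maxy) envelopes."""
    return min(min(maxx - minx, maxy - miny)
               for minx, miny, maxx, maxy in bounds_list)


def upsampling_factor(pixel_size, target_size):
    """How many times each pixel must be split to reach target_size or finer."""
    return max(1, math.ceil(pixel_size / target_size))


# Made-up envelopes and pixel size, for illustration only
envelopes = [(0.0, 0.0, 0.5, 0.25), (1.0, 1.0, 3.0, 2.0)]
edge = smallest_envelope_edge(envelopes)
factor = upsampling_factor(1.0, edge)

print(edge)    # 0.25
print(factor)  # 4
```

A degenerate polygon with a zero-width envelope would need to be filtered out first, since it would make the division fail.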
Edit: to clarify the above a bit further.
The default rasterization strategy of GDAL (all_touched=False) is to consider a pixel "within" the polygon if the centroid of the pixel intersects with the polygon.
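The pixel-centre rule can be illustrated without GDAL. The sketch below is an assumption-laden simplification: it handles only axis-aligned rectangles on a north-up grid with square pixels, and covered_pixel_centres is a hypothetical helper, not part of GDAL or rasterstats. It shows how a polygon smaller than a pixel can end up with no pixels at all.

```python
import math


def covered_pixel_centres(rect, x0, y0, size):
    """Return (col, row) of every pixel whose centre lies inside rect.

    rect is (minx, miny, maxx, maxy); (x0, y0) is the upper-left corner
    of the raster and size is the edge length of a square pixel.
    """
    minx, miny, maxx, maxy = rect
    # Pixel (c, r) has its centre at x0 + (c + 0.5) * size, y0 - (r + 0.5) * size
    col_start = math.ceil((minx - x0) / size - 0.5)
    col_end = math.floor((maxx - x0) / size - 0.5)
    row_start = math.ceil((y0 - maxy) / size - 0.5)
    row_end = math.floor((y0 - miny) / size - 0.5)
    return [(c, r)
            for r in range(row_start, row_end + 1)
            for c in range(col_start, col_end + 1)]


# A rectangle that sits inside one pixel but misses its centre at (2.5, 7.5)
small = covered_pixel_centres((2.1, 7.1, 2.4, 7.4), x0=0.0, y0=10.0, size=1.0)
# A rectangle large enough to cover four pixel centres
large = covered_pixel_centres((2.1, 6.1, 3.9, 7.9), x0=0.0, y0=10.0, size=1.0)

print(small)       # []
print(len(large))  # 4
```

The empty result for the small rectangle corresponds to the zero-sized rasterization described above.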
Using QGIS you can, for example, convert the pixels to points and then do a spatial join with your vector. If you remove polygons that can't be joined (there's a checkbox), you'll get a different vector that most likely should work with rasterstats, given your current raster.
You could perhaps use that in the normal way (all_touched=False), and get the stats for the small polygons using all_touched=True.
In the image below, the green polygons are the ones that intersect with the centroid of a pixel; the red ones don't (and those are probably the ones rasterstats "tries" to rasterize to a size 0 array).

Related

convert rgb png and depth txt to point cloud

I have a series of RGB files in PNG format, as well as the corresponding depth files in TXT format, which can be loaded with np.loadtxt. How can I merge these two files into a point cloud using open3d?
I followed the procedure in "obtain point cloud from depth numpy array using open3d - python", but the result is not human-readable.
An example is shown here:
the source png:
the pcd result:
You can get the source files from this Google Drive link to reproduce my result.
By the way, the depth and RGB images are not registered.
Thanks.

I had to play a bit with the settings and data and used mainly the answer of your SO link.
import cv2
import numpy as np
import open3d as o3d
color = o3d.io.read_image("a542c.png")
depth = np.loadtxt("a542d.txt")
vertices = []
for x in range(depth.shape[0]):
    for y in range(depth.shape[1]):
        vertices.append((float(x), float(y), depth[x][y]))
pcd = o3d.geometry.PointCloud()
point_cloud = np.asarray(np.array(vertices))
pcd.points = o3d.utility.Vector3dVector(point_cloud)
pcd.estimate_normals()
pcd = pcd.normalize_normals()
o3d.visualization.draw_geometries([pcd])
However, if you keep the code as provided, the whole scene looks very weird and unfamiliar. That is because your depth file contains data between 0 and almost 2.5 m.
I introduced a cut-off at 500 or 1000 mm plus removed all 0s as suggested in the other answer. Additionally I flipped the x-axis (float(-x) instead of float(x)) to resemble your photo.
# ...
vertices = []
for x in range(depth.shape[0]):
    for y in range(depth.shape[1]):
        if 0 < depth[x][y] < 500:
            vertices.append((float(-x), float(y), depth[x][y]))
For a good perspective I had to rotate the images manually. Probably open3d provides methods to do it automatically (I quickly tried pcd.transform() from your SO link above, it can help you if needed).
Results
500 mm cut-off and 1000 mm cut-off (images not shown).
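The nested loops with the cut-off can also be written as a vectorised NumPy expression. This is only a sketch of the same filtering (drop zeros, drop everything at or beyond the cut-off, flip the x-axis); depth_to_points is a hypothetical helper name and the tiny depth array is made up.

```python
import numpy as np


def depth_to_points(depth, max_depth=500.0):
    """Return an (N, 3) array of (-row, col, depth) for valid depth values."""
    # Keep only pixels with 0 < depth < max_depth
    rows, cols = np.nonzero((depth > 0) & (depth < max_depth))
    return np.column_stack((-rows.astype(float),
                            cols.astype(float),
                            depth[rows, cols]))


# Made-up 2 x 2 depth map in the same units as the cut-off
sample_depth = np.array([[0.0, 100.0],
                         [600.0, 250.0]])
points = depth_to_points(sample_depth)
print(points.shape)  # (2, 3)
```

The resulting array has the layout that o3d.utility.Vector3dVector was given in the code above.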

I used laspy instead of open3d because I wanted to give some colors to your image:
import imageio
import matplotlib.pyplot as plt
import numpy as np
# first reading the image for RGB values
image = imageio.imread(".../a542c.png")
# loading the depth file
depth = np.loadtxt("/home/shaig93/Documents/internship_FWF/a542d.txt")
# creating fake x, y coordinates with meshgrid
xv, yv = np.meshgrid(np.arange(400), np.arange(640), indexing='ij')
# save_las is a function based on laspy that was provided to me by my supervisor
save_las("fn.laz", image[:400, :, 0].flatten(), np.c_[yv.flatten(), xv.flatten(), depth.flatten()], cmap = plt.cm.magma_r)
The result is this. As you can see, the objects are visible from the front.
However, from the side they are not easy to distinguish.
This makes me think that your depth file is not that good.
Another idea would be to get rid of the 0 values in your depth file, so that you get a point cloud without a wall-like structure at the front. That still does not solve the depth issue, of course.
PS: I know this is not a proper answer, but I hope it helps to identify the problem.

Mask raster by extent in Python using rasterio

I want to clip one raster based on the extent of another (smaller) raster. First I determine the coordinates of the corners of the smaller raster using
import rasterio as rio
from osgeo import gdal
from shapely.geometry import Polygon
src = gdal.Open("smaller_file.tif")
ulx, xres, xskew, uly, yskew, yres = src.GetGeoTransform()
lrx = ulx + (src.RasterXSize * xres)
lry = uly + (src.RasterYSize * yres)
geometry = [[ulx,lry], [ulx,uly], [lrx,uly], [lrx,lry]]
This gives me the following output geometry = [[-174740.0, 592900.0], [-174740.0, 2112760.0], [900180.0, 2112760.0], [900180.0, 592900.0]]. (Note that the crs is EPSG: 32651).
Now I would like to clip the larger file using rio.mask.mask(). According to the documentation, the shape variable should be GeoJSON-like dict or an object that implements the Python geo interface protocol (such as a Shapely Polygon). Therefore I create a Shapely Polygon out of the variable geometry, using
roi = Polygon(geometry)
Now everything is ready to use the rio.mask.mask() function.
output = rio.mask.mask(larger_file.tif, roi, crop = True)
But this gives me the following error
TypeError: 'Polygon' object is not iterable
What do I do wrong? Or if someone knows a more elegant way to do it, please let me know.
(Unfortunately I cannot upload the two files since they're too large)

I found your question when I needed to figure out this kind of clipping myself. I got the same error and fixed it the following way:
rasterio.mask.mask expects a list of geometries, not a single geometry. The algorithm wants to run the masking over several features bundled in an iterable (e.g. a list or tuple), so we need to pass it our polygon inside a list (or tuple).
The code you posted works after following change:
roi = [Polygon(geometry)]
All we have to do is to enclose the geometry in a list/tuple and then rasterio.mask works as expected.
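Since the question notes that a GeoJSON-like dict is also accepted, the extent can be built without Shapely. The following is a minimal sketch: extent_polygon is a hypothetical helper, and the 20 m pixel size and the column and row counts are assumed values chosen so that the corners match the coordinates printed in the question.

```python
def extent_polygon(geotransform, n_cols, n_rows):
    """Build a GeoJSON-like Polygon dict covering a raster's extent."""
    ulx, xres, _xskew, uly, _yskew, yres = geotransform
    lrx = ulx + n_cols * xres
    lry = uly + n_rows * yres
    # GeoJSON rings are closed: the first point is repeated at the end
    ring = [[ulx, lry], [ulx, uly], [lrx, uly], [lrx, lry], [ulx, lry]]
    return {"type": "Polygon", "coordinates": [ring]}


# Assumed geotransform and size (20 m pixels), not read from the real file
gt = (-174740.0, 20.0, 0.0, 2112760.0, 0.0, -20.0)
shapes = [extent_polygon(gt, 53746, 75993)]  # note the enclosing list

print(shapes[0]["coordinates"][0][:4])
```

The shapes list is what would be passed as the second argument of rasterio.mask.mask, together with a dataset opened with rasterio.open.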

Given a geotiff file, how does one find the single pixel closest to a given latitude/longitude?

I have a geotiff file that I'm opening with gdal in Python, and I need to find the single pixel closest to a specified latitude/longitude. I was previously working with an unrelated file type for similar data, so I'm completely new to both gdal and geotiff.
How does one do this? What I have so far is
import numpy as np
from osgeo import gdal
ds = gdal.Open('foo.tiff')
width = ds.RasterXSize
height = ds.RasterYSize
gt = ds.GetGeoTransform()
gp = ds.GetProjection()
data = np.array(ds.ReadAsArray())
print(gt)
print(gp)
which produces (for my files)
(-3272421.457337171, 2539.703, 0.0, 3790842.1060354356, 0.0, -2539.703)
and
PROJCS["unnamed",GEOGCS["Coordinate System imported from GRIB file",DATUM["unnamed",SPHEROID["Sphere",6371200,0]],PRIMEM["Greenwich",0],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]]],PROJECTION["Lambert_Conformal_Conic_2SP"],PARAMETER["latitude_of_origin",25],PARAMETER["central_meridian",265],PARAMETER["standard_parallel_1",25],PARAMETER["standard_parallel_2",25],PARAMETER["false_easting",0],PARAMETER["false_northing",0],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AXIS["Easting",EAST],AXIS["Northing",NORTH]]
Ideally, there'd be a single simple function call, and it would also return an indication whether the specified location falls outside the bounds of the raster.
My fallback is to obtain a grid from another source containing the latitudes and longitudes for each pixel and then do a brute force search for the desired location, but I'm hoping there's a more elegant way.
Note: I think what I'm trying to do is equivalent to the command line
gdallocationinfo -wgs84 foo.tif <longitude> <latitude>
which returns results like
Report:
Location: (1475P,1181L)
Band 1:
Value: 66
This suggests to me that the functionality is probably already in the gdal module, if I can just find the right method to call.

You basically need two steps:
1. Convert the lat/lon point to the raster projection
2. Convert the mapx/mapy (in raster projection) to pixel coordinates
Given the code you already posted above, defining both projection systems can be done with:
from osgeo import gdal, osr
point_srs = osr.SpatialReference()
point_srs.ImportFromEPSG(4326) # hardcode for lon/lat
# GDAL>=3: make sure it's x/y
# see https://trac.osgeo.org/gdal/wiki/rfc73_proj6_wkt2_srsbarn
point_srs.SetAxisMappingStrategy(osr.OAMS_TRADITIONAL_GIS_ORDER)
file_srs = osr.SpatialReference()
file_srs.ImportFromWkt(gp)
Create the coordinate transformation and use it to convert the point from lon/lat to mapx/mapy coordinates (whatever projection it is) with:
ct = osr.CoordinateTransformation(point_srs, file_srs)
point_x = -114.06138 # lon
point_y = 51.03163 # lat
mapx, mapy, z = ct.TransformPoint(point_x, point_y)
To go from map coordinates to pixel coordinates, the geotransform needs to be inverted first. It can then be used to retrieve the pixel coordinates like:
gt_inv = gdal.InvGeoTransform(gt)
pixel_x, pixel_y = gdal.ApplyGeoTransform(gt_inv, mapx, mapy)
Flooring those pixel coordinates gives the indices of the pixel that contains the point, because the geotransform maps integer pixel coordinates to pixel corners rather than pixel centres (rounding could shift the result by one pixel). You might need to clip them if the point you're querying is outside the raster.
import math
# floor to the containing pixel
pixel_x = math.floor(pixel_x)
pixel_y = math.floor(pixel_y)
# clip to file extent
pixel_x = max(min(pixel_x, width-1), 0)
pixel_y = max(min(pixel_y, height-1), 0)
pixel_data = data[pixel_y, pixel_x]
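The second step can also be sketched without GDAL, which makes it easy to report whether the location falls outside the raster, as the question asks. In this sketch locate_pixel is a hypothetical helper, the geotransform is the one printed in the question, and the raster size is an assumed value. It floors the fractional coordinates because the geotransform maps integer pixel coordinates to pixel corners.

```python
import math


def locate_pixel(gt, mapx, mapy, width, height):
    """Return (col, row, inside) for a map coordinate, given a GDAL-style geotransform."""
    dx = mapx - gt[0]
    dy = mapy - gt[3]
    # Invert the 2 x 2 part of the geotransform (also handles rotation terms)
    det = gt[1] * gt[5] - gt[2] * gt[4]
    col = math.floor((gt[5] * dx - gt[2] * dy) / det)
    row = math.floor((-gt[4] * dx + gt[1] * dy) / det)
    inside = 0 <= col < width and 0 <= row < height
    return col, row, inside


gt = (-3272421.457337171, 2539.703, 0.0, 3790842.1060354356, 0.0, -2539.703)
width, height = 2000, 1500  # assumed raster size, for illustration only

# Map coordinates of the centre of pixel (1475, 1181)
mapx = gt[0] + 1475.5 * gt[1]
mapy = gt[3] + 1181.5 * gt[5]
print(locate_pixel(gt, mapx, mapy, width, height))  # (1475, 1181, True)

# One metre to the west of the raster's left edge: outside
print(locate_pixel(gt, gt[0] - 1.0, gt[3] - 1.0, width, height))
```

The map coordinates passed in are those produced by the coordinate transformation in the first step.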

delete pixel from the raster image with specific range value

Update:
Any idea how to delete the pixels within a specific value range from a raster image
using numpy/scipy or gdal?
Or, better, how can I create a new raster with only some of the classes using raster calculator expressions?
For example, I have a raster image with
5 classes:
1. 0-100
2. 100-200
3. 200-300
4. 300-500
5. 500-1000
and I want to delete the value range of class 1,
or maybe classes 1, 2, 4 and 5.
I began with this script:
import numpy as np
from osgeo import gdal
ds = gdal.Open("raster3.tif")
myarray = np.array(ds.GetRasterBand(1).ReadAsArray())
#print myarray.shape
#print myarray.size
#print myarray
new=np.delete(myarray[::2], 1)
but I can't complete it.
The image:
White is class 5 and black is class 1.

Rasters are 2-D arrays of values, with each value being stored in a pixel (which stands for picture element). Each pixel must contain some information. It is not possible to delete or remove pixels from the array because rasters are usually encoded as a simple 1-dimensional string of bits. Metadata commonly helps explain where line breaks are and the length of the bitstring, so that the 1-D bitstring can be understood as a 2-D array. If you "remove" a pixel, then you break the raster. The 2-D grid is no longer valid.
Of course, there are many instances where you do want to effectively discard or clean the raster of data. Such an example might be to remove pixels that cover land from a raster of sea-surface temperatures. To accomplish this goal, many geospatial raster formats hold metadata describing what are called NoData values. Pixels containing a NoData value are interpreted as not existing. Recall that in a raster, each pixel must contain some information. The NoData paradigm allows the structure and format of rasters to be met, while also giving a way to mask pixels from being displayed or analyzed. There is still data (bits, 1s and 0s) at the masked pixels, but it only serves to identify the pixel as invalid.
With this in mind, here is an example using gdal which will mask values in the range of 0-100 so they are NoData, and "do not exist". The NoData value will be specified as 0.
from osgeo import gdal
# open dataset read-only, and get a numpy array
ds = gdal.Open("raster3.tif", gdal.GA_ReadOnly)
myarray = ds.GetRasterBand(1).ReadAsArray()
# modify numpy array to mask values
myarray[myarray <= 100] = 0
# open output dataset, which is a copy of original
driver = gdal.GetDriverByName('GTiff')
ds_out = driver.CreateCopy("raster3_with_nodata.tif", ds)
# write the modified array to the raster
ds_out.GetRasterBand(1).WriteArray(myarray)
# set the NoData metadata flag
ds_out.GetRasterBand(1).SetNoDataValue(0)
# clear the buffer, and ensure file is written
ds_out.FlushCache()
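The question also mentions removing classes 1, 2, 4 and 5, which amounts to keeping only class 3 (200-300). A minimal NumPy sketch of that array step follows; keep_range is a hypothetical helper, the sample values are made up, and treating the lower bound as inclusive and the upper bound as exclusive is an assumption, since the class limits in the question overlap.

```python
import numpy as np


def keep_range(values, low, high, nodata=0):
    """Keep values with low <= value < high; set everything else to nodata."""
    keep = (values >= low) & (values < high)
    return np.where(keep, values, nodata)


# Made-up pixel values covering all five classes
sample = np.array([[50, 150, 250],
                   [350, 750, 299]])
result = keep_range(sample, 200, 300)
print(result)
```

The returned array could then be written with WriteArray and flagged with SetNoDataValue in the same way as above.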

I want to contribute an example for Landsat data.
This quick guide shows how to exclude Landsat cloud pixels.
Landsat offers the Quality Assessment Band (BQA), which contains integer values (classes) describing features such as clouds, rocks, ice, water, cloud shadow, etc.
We will use the BQA to mask out the cloud pixels in the other bands.
# Import Packages
import rasterio as rio
import earthpy.plot as ep
from matplotlib import pyplot as plt
import rioxarray as rxr
from numpy import ma
# Open the Landsat Band 3
Landsat_Image = rxr.open_rasterio(r"C:\...\LC08_L1TP_223075_20210311_20210317_01_T1_B3.tif")
# Open the Quality Assessment Band
BQA = rxr.open_rasterio(r"C:\...\LC08_L1TP_223075_20210311_20210317_01_T1_BQA.tif").squeeze()
# Create a list with the QA values that represent cloud, cloud_shadow, etc.
Cloud_Values = [6816, 6848, 6896, 7072]
# Mask the data using the pixel QA layer
landsat_masked = Landsat_Image.where(~BQA.isin(Cloud_Values))
landsat_masked
# Plot the masked data
landsat_masked_plot = ma.masked_array(landsat_masked.values,landsat_masked.isnull())
# Plot
ep.plot_rgb(landsat_masked_plot, rgb=[2, 1, 0], title = "Masked Data")
plt.show()
###############################################################################
# Export the masked Landsat Scenes to Directory "Masked_Bands_QA"
out_img = landsat_masked
out_img.shape
out_transform = landsat_masked.rio.transform()
# Get a Band of the same Scene for reference
rastDat = rio.open(r"C:\Dados_Espaciais\NDVI_Usinas\Adeco\Indices\Imagens\LC08_L1TP_223075_20210311_20210317_01_T1\LC08_L1TP_223075_20210311_20210317_01_T1_B3.tif")
#copying metadata from original raster
out_meta = rastDat.meta.copy()
#amending original metadata
out_meta.update({'nodata': 0,
'height' : out_img.shape[1],
'width' : out_img.shape[2],
'transform' : out_transform})
# writing and then re-reading the output data to see if it looks good
for i in range(out_img.shape[0]):
    with rio.open(rf"C:\Dados_Espaciais\DSM\Bare_Soil_Landsat\Teste_{i+1}_masked.tif",'w',**out_meta) as dst:
        dst.write(out_img[i,:,:],1)
This way you tell the program:
Check the areas in BQA with these "Cloud_Values" and exclude these areas, but in the landsat image that I provided.
I hope it works.
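The masking step itself can be tried on plain NumPy arrays. This is only a small sketch of the same idea as the where/isin call above: the band values are made up, and 2720 is used as a stand-in QA code that is assumed not to be in the cloud list.

```python
import numpy as np

# Made-up 2 x 2 band and matching QA values
band = np.array([[10.0, 20.0],
                 [30.0, 40.0]])
qa = np.array([[2720, 6816],
               [7072, 2720]])
cloud_values = [6816, 6848, 6896, 7072]

# Replace every pixel whose QA value is in the cloud list with NaN
masked = np.where(np.isin(qa, cloud_values), np.nan, band)
print(masked)
```

Pixels flagged by the QA band become NaN, while the remaining pixels keep their original values.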

Cartesian projection issue in a FITS image through PyFITS / AstroPy

I've looked and looked for a solution to this problem and am turning up nothing.
I'm generating rectangular FITS images through matplotlib and subsequently applying WCS coordinates to them using AstroPy (or PyFITS). My images are in galactic latitude and longitude, so the header keywords appropriate for my maps should be GLON-CAR and GLAT-CAR (for Cartesian projection). I've looked at other maps that use this same map projection in SAO DS9 and the coordinates work great... the grid is perfectly orthogonal as it should be. The FITS standard projections can be found here.
But when I generate my maps, the coordinates are not at all Cartesian. Here's a side-by-side comparison of my map (left) and another reference map of roughly the same region (right). Both are listed GLON-CAR and GLAT-CAR in the FITS header, but mine is screwy when looked at in SAO DS9 (note that the coordinate grid is something SAO DS9 generates based on the data in the FITS header, or at least stored somewhere in the FITS file):
This is problematic, because the coordinate-assigning algorithm will assign incorrect coordinates to each pixel if the projection is wrong.
Has anyone encountered this, or know what could be the problem?
I've tried applying other projections (just to see how they perform in SAO DS9) and they come out fine... but my Cartesian and Mercator projections do not come out with the orthogonal grid like they should.
I can't believe this would be a bug in AstroPy, but I can't find any other cause... unless my arguments in the header are incorrectly formatted, but I still don't see how that could cause the problem I'm experiencing. Or would you recommend using something else? (I've looked at matplotlib basemap but have had some trouble getting that to work on my computer).
My header code is below:
from __future__ import division
import numpy as np
from astropy.io import fits as pyfits # or use 'import pyfits, same thing'
#(lots of code in between: defining variables and simple calculations...
#probably not relevant)
header['BSCALE'] = (1.00000, 'REAL = TAPE*BSCALE + BZERO')
header['BZERO'] = (0.0)
header['BUNIT'] = ('mag ', 'UNIT OF INTENSITY')
header['BLANK'] = (-100.00, 'BLANK VALUE')
header['CRVAL1'] = (glon_center, 'REF VALUE POINT DEGR') #FIRST COORDINATE OF THE CENTER
header['CRPIX1'] = (center_x+0.5, 'REF POINT PIXEL LOCATION') ## REFERENCE X PIXEL
header['CTYPE1'] = ('GLON-CAR', 'COORD TYPE : VALUE IS DEGR')
header['CDELT1'] = (-glon_length/x_length, 'COORD VALUE INCREMENT WITH COUNT DGR') ### degrees per pixel
header['CROTA1'] = (0, 'CCW ROTATION in DGR')
header['CRVAL2'] = (glat_center, 'REF VALUE POINT DEGR') #Y COORDINATE OF THE CENTER
header['CRPIX2'] = (center_y+0.5, 'REF POINT PIXEL LOCATION') #Y REFERENCE PIXEL
header['CTYPE2'] = ('GLAT-CAR', 'COORD TYPE: VALUE IS DEGR') # WAS CAR OR TAN
header['CDELT2'] = (glat_length/y_length, 'COORD VALUE INCREMENT WITH COUNT DGR') #degrees per pixel
header['CROTA2'] = (rotation, 'CCW ROTATION IN DEGR') #NEGATIVE ROTATES CCW around origin (bottom left).
header['DATAMIN'] = (data_min, 'Minimum data value in the file')
header['DATAMAX'] = (data_max, 'Maximum data value in the file')
header['TELESCOP'] = ("Produced from 2MASS")
pyfits.update(filename, map_data, header)
Thanks for any help you can provide.

In the modern definition of the -CAR projection (from Calabretta et al.), GLON-CAR/GLAT-CAR projection only produces a rectilinear grid if CRVAL2 is set to zero. If CRVAL2 is not zero, then the grid is curved (this should have nothing to do with Astropy). You can try and fix this by adjusting CRVAL2 and CRPIX2 so that CRVAL2 is zero. Does this help?
Just to clarify what I mean, try, after your code above, and before writing out the file:
header['CRPIX2'] -= header['CRVAL2'] / header['CDELT2']
header['CRVAL2'] = 0.
Any luck?
If you look at the header for the 'reference' file you looked at, you'll see that CRVAL2 is zero there. Just to be clear, there's nothing wrong with CRVAL2 being non-zero, but the grid is then no longer rectilinear.
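To check that the adjustment only moves the reference point, here is a small sketch using a plain dict in place of the FITS header. It treats the latitude axis as purely linear, latitude = CRVAL2 + (pixel - CRPIX2) * CDELT2, which is the relation the adjustment relies on; the numeric values and the latitude_at helper are made up for illustration.

```python
def latitude_at(pixel, header):
    """Linear pixel-to-latitude relation for the second axis."""
    return header['CRVAL2'] + (pixel - header['CRPIX2']) * header['CDELT2']


# Hypothetical header values
header = {'CRVAL2': 5.0, 'CRPIX2': 100.5, 'CDELT2': 0.01}
test_pixels = (1.0, 100.5, 300.0)
before = [latitude_at(p, header) for p in test_pixels]

# The adjustment from the answer: move the reference point to latitude zero
header['CRPIX2'] -= header['CRVAL2'] / header['CDELT2']
header['CRVAL2'] = 0.

after = [latitude_at(p, header) for p in test_pixels]
print(before)
print(after)
```

Up to floating-point rounding, both lists agree, so every pixel keeps its latitude while CRVAL2 becomes zero.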
