plotting a large matrix in python - python

I have a data file in Excel (.xlsx). The data represents a 100 micrometer by 100 micrometer area. The number of steps was set to 50 for x and 50 for y, meaning each pixel is 2 micrometers in size. How can I create a 2D image from this data?

Data can be read from xlsx files with the openpyxl Python module. After installing the module, a simple example is (assuming you have an xlsx file as in the image attached):
from openpyxl import load_workbook
wb = load_workbook("/path/to/matrix.xlsx")
cell_range = wb['Sheet1']['B2:G16']
for row in cell_range:
    for cell in row:
        print(str(cell.value) + " ", end='')
    print("")
This prints all the values in the range. You could also read them into a NumPy array and plot that.

If you are happy to plot pixels rather than points with matplotlib, you can convert your dataframe into a NumPy array and plot that array with matplotlib's imshow() function, after manipulating the array as needed.
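As a rough illustration of the imshow() approach, here is a minimal sketch. It assumes the spreadsheet values have already been loaded into a 50 x 50 NumPy array (random numbers stand in for them here), and it assumes that row 0 of the array corresponds to y = 0, which is why origin="lower" is used. The names n_steps and side_um are illustrative only.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

n_steps = 50      # steps along x and along y (from the question)
side_um = 100.0   # side length of the scanned area in micrometers

# Stand-in for the values read from the spreadsheet (assumption)
rng = np.random.default_rng(0)
data = rng.random((n_steps, n_steps))

fig, ax = plt.subplots()
# extent maps the 50 x 50 pixels onto the 100 x 100 micrometer area,
# so each pixel covers 2 x 2 micrometers on the axes
image = ax.imshow(data, extent=[0, side_um, 0, side_um],
                  origin="lower", cmap="viridis")
ax.set_xlabel("x (µm)")
ax.set_ylabel("y (µm)")
fig.colorbar(image, ax=ax, label="measured value")
```

Calling plt.show() (with an interactive backend) or fig.savefig(...) afterwards displays or saves the image.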

Related

How to save multiple images generated using matplotlib & imshow in a folder

I'm working with matplotlib, specifically its imshow() operation. I have a multi-dimension array generated by random.rand function from NumPy.
data_array = np.random.rand(63, 4, 4, 3)
Now I want to generate an image with matplotlib's imshow() function from each of the 63 entries of this array. The code below already produces the image I wanted for a single entry.
plt.imshow(data_array[0]) #image constructed for the 1st element of the array
Now I want to save all the images produced by imshow() from the array's entries in a specific folder on my computer, each under a specific name. I tried the code below.
def create_image(array):
    return plt.imshow(array, interpolation='nearest', cmap='viridis')
count = 0
for i in data_array:
    count += 63
    image_machine = create_image(i)
    image_machine.savefig('C:\Users\Asus\Save Images\close_'+str(count)+".png")
With this code, I want to save the image produced by imshow() from each entry of data_array to the folder 'C:\Users\Asus\Save Images' under a numbered name, so that the 1st image is saved as close_0.png.
Please help me in this saving step, I'm stuck here.
You can do the following:
from pathlib import Path
import matplotlib.pyplot as plt
import numpy as np
output_folder = "path/to/folder/"
data_array = np.random.rand(63, 4, 4, 3)
for index, array in enumerate(data_array):
    fig = plt.figure()
    plt.imshow(array, interpolation="nearest", cmap="viridis")
    fig.savefig(Path(output_folder, f"close_{index}.png"))
    plt.close()
I added plt.close() because otherwise you end up with a lot of figures open at the same time.

How to replace values in netcdf file with Nan?

I'm using a NASA GISS netCDF file with gridded monthly temperature values. According to the readme file, "Missing data is flagged with a value of 9999.f". I am trying to plot the data but keep getting blank maps. I think it's because this 9999.f value is throwing off my scale. How do I replace it with NaN? I tried:
from netCDF4 import Dataset
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap
data2 = Dataset(r'GriddedAir250.nc')
lats=data2.variables['lat'][:]
lons=data2.variables['lon'][:]
time=data2.variables['time'][:]
air=data2.variables['air'][:]
air=air.astype('float')
air[air==9999]=np.nan
But it looks like this gives me an array of boolean values:
netCDF4 creates masked arrays, and automatically masks the value 9999.0. In your code, this means the result of air = data2.variables['air'][:] is a masked array. So I suspect the problem is that the plotting code that you are trying to use does not handle masked arrays. If you think the plotting code can handle nan, you could try
air = air.filled(fill_value=np.nan)
This converts air to a regular NumPy array, with the masked values (i.e. the values that were originally 9999.0 in the .nc file) converted to nan.
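The effect of filled() can be tried without the .nc file. The following is a small sketch that uses a hand-made masked array as a stand-in for what netCDF4 returns; the sample values are made up for illustration.

```python
import numpy as np

# Made-up sample values; 9999.0 plays the role of the missing-data flag
raw = np.array([[12.5, 9999.0],
                [9999.0, -3.0]])

# Build a masked array similar to the one netCDF4 hands back
air = np.ma.masked_values(raw, 9999.0)

# Convert to a plain ndarray with NaN wherever the mask was set
air_nan = air.filled(fill_value=np.nan)

# NaN-aware reductions now ignore the former 9999.0 entries
print(np.nanmin(air_nan), np.nanmax(air_nan))
```

After this step the minimum and maximum used for a color scale are no longer dominated by the flag value.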

convert tiff to netcdf

I am trying to convert a TIFF file to a netCDF file, but I get an index error:
import numpy as np
from netCDF4 import Dataset
import rasterio
with rasterio.drivers():
    src=rasterio.open(r"ia.tiff","r")
dst_transform=src.transform
dst_width=src.width
dst_height=src.height
print (dst_transform)
xmin = dst_transform[0]
xmax = dst_transform[0] + dst_transform[1]*dst_width
print (xmax)
ymin = dst_transform[3] + dst_transform[5]*dst_height
print(ymin)
ymax = dst_transform[3]
dst_width=dst_width+1
dst_height=dst_height+1
outf=Dataset(r'ia.nc','w',format='NETCDF4_CLASSIC')
lats=np.linspace(ymin,ymax,dst_width)
lons=np.linspace(xmin,xmax,dst_height)
lat=outf.createDimension('lon',len(lats))
lon=outf.createDimension('lat',len(lons))
longitude=outf.createVariable('longitude',np.float64,('lon',))
latitude=outf.createVariable('latitude',np.float64,('lat',))
SHIA=outf.createVariable('SHIA',np.int8,('lon','lat'))
outf.variables['longitude'][:]=lons
outf.variables['longitude'][:]=lat
im=src.read()
SHIA[:,:]=im
outf.description="IA for"
longitude.units="degrees east"
latitude.units='degrees north'
print ("created empty array")
outf.close()
The error is "index error: size of the data array does not conform to slice". Can somebody take a look and tell me where I went wrong? Much appreciated!
I use xarray for this kind of thing. Create an xarray DataArray for each variable you have (just SHIA in your case), then create a Dataset and add the DataArray to it. Don't forget to register the coordinate variables in the Dataset as coordinates.
See:
http://xarray.pydata.org/en/stable/io.html
You can also convert your netCDF / TIFF data into a dataframe and back again, but I wouldn't recommend it unless you have to, because netCDF data is multidimensional and a dataframe flattens it all into a single table.
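The coordinate variables mentioned above can be derived from the raster's geotransform. The following is a minimal NumPy-only sketch, not code from the question or the answer: it assumes the GDAL-style ordering that the question indexes into, a north-up raster, and a hypothetical 4 x 3 grid. pixel_center_coords is an illustrative helper name. It also shows that the data grid should be indexed as (rows, columns) = (lat, lon) once the leading band axis of a rasterio read() result is dropped.

```python
import numpy as np

def pixel_center_coords(geotransform, width, height):
    """Return (lons, lats) of pixel centers for a north-up raster.

    geotransform is assumed to use the GDAL ordering
    (x_origin, pixel_width, 0, y_origin, 0, pixel_height),
    with a negative pixel_height.
    """
    x0, dx, _, y0, _, dy = geotransform
    lons = x0 + dx * (np.arange(width) + 0.5)
    lats = y0 + dy * (np.arange(height) + 0.5)
    return lons, lats

# Hypothetical raster: 4 columns, 3 rows, 0.5 degree pixels
gt = (10.0, 0.5, 0.0, 50.0, 0.0, -0.5)
width, height = 4, 3
lons, lats = pixel_center_coords(gt, width, height)

# rasterio's read() returns (bands, rows, cols); zeros stand in for real data
band_stack = np.zeros((1, height, width), dtype=np.int8)
grid = band_stack[0]  # shape (rows, cols), i.e. dimensions ('lat', 'lon')
```

With coordinates of length width for longitude and length height for latitude, a variable declared on ('lat', 'lon') has the same shape as grid.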
The easiest way I could think of is to use the GDAL tool.
# Convert TIF to netCDF
gdal_translate -of netCDF -co "FORMAT=NC4" ia.tif ia.nc
# Convert SHP to netCDF
gdal_rasterize -of netCDF -burn 1 -tr 0.01 0.01 input.shp output.nc

Image of Mnist data Python - Error when displaying the image

I'm working with the MNIST data set in order to learn about machine learning. For now I'm trying to display the first digit in the data set as an image, and I have run into a problem.
I have a matrix with the dimensions 784x10000, where each column is a digit in the data set. I created the matrix myself, because the MNIST data set came in the form of a text file, which caused me quite a lot of problems in itself, but that's a separate question.
The MN_train matrix below is my large 784x10000 matrix. What I'm trying to do below is fill a 28x28 matrix in order to display my image.
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
grey = np.zeros(shape=(28,28))
k = 0
for l in range(28):
    for p in range(28):
        grey[p,l]=MN_train[k,0]
        k = k + 1
print grey
plt.show(grey)
But when I try to display the image, I get the following error:
The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
This is followed by an image plot that does not look like the number five, as I would expect.
Is there something I have overlooked, or does this tell me that my manipulation of the text file, in order to construct the MN_train matrix, has resulted in an error?
The error occurs because you pass the array to show. show accepts only a single boolean argument, block=True or False.
In order to create an image plot, you need to use imshow.
plt.imshow(grey)
plt.show() # <- no argument here
Also note that the loop is rather inefficient. You may just reshape the input column array.
The complete code would then look like
import numpy as np
import matplotlib.pyplot as plt
MN_train = np.loadtxt( ... )
grey = MN_train[:,0].reshape((28,28))
plt.imshow(grey)
plt.show()
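One detail worth checking when replacing the loop with reshape is the element order. The sketch below uses random numbers as a stand-in for MN_train[:, 0] and compares the loop from the question with both reshape orders. Which orientation is the right one depends on how the text file was written, so this only illustrates the difference.

```python
import numpy as np

rng = np.random.default_rng(0)
column = rng.random(784)  # stand-in for MN_train[:, 0]

# Loop from the question: l is the outer index, so the image
# is filled column by column
grey_loop = np.zeros((28, 28))
k = 0
for l in range(28):
    for p in range(28):
        grey_loop[p, l] = column[k]
        k += 1

grey_c = column.reshape((28, 28))             # row-major (NumPy default)
grey_f = column.reshape((28, 28), order="F")  # column-major

# The loop matches the column-major reshape and is the transpose
# of the default row-major reshape
print(np.array_equal(grey_loop, grey_f), np.array_equal(grey_loop, grey_c.T))
```

If the digit appears mirrored along the diagonal with one variant, the other order (or a .T on the result) is the one to use.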

delete pixel from the raster image with specific range value

Is there any way to delete pixels within a specific value range from a raster image using numpy/scipy or gdal?
Or, better, how can I create a new raster containing only some classes using raster calculation expressions?
For example, I have a raster image with 5 classes:
1. 0-100
2. 100-200
3. 200-300
4. 300-500
5. 500-1000
I want to delete the value range of class 1, or maybe of classes 1, 2, 4 and 5.
I began with this script:
import numpy as np
from osgeo import gdal
ds = gdal.Open("raster3.tif")
myarray = np.array(ds.GetRasterBand(1).ReadAsArray())
#print myarray.shape
#print myarray.size
#print myarray
new=np.delete(myarray[::2], 1)
but I can't complete it.
In the image, white is class 5 and black is class 1.
Rasters are 2-D arrays of values, with each value being stored in a pixel (which stands for picture element). Each pixel must contain some information. It is not possible to delete or remove pixels from the array because rasters are usually encoded as a simple 1-dimensional string of bits. Metadata commonly helps explain where line breaks are and the length of the bitstring, so that the 1-D bitstring can be understood as a 2-D array. If you "remove" a pixel, then you break the raster. The 2-D grid is no longer valid.
Of course, there are many instances where you do want to effectively discard or clean the raster of data. Such an example might be to remove pixels that cover land from a raster of sea-surface temperatures. To accomplish this goal, many geospatial raster formats hold metadata describing what are called NoData values. Pixels containing a NoData value are interpreted as not existing. Recall that in a raster, each pixel must contain some information. The NoData paradigm allows the structure and format of rasters to be met, while also giving a way to mask pixels from being displayed or analyzed. There is still data (bits, 1s and 0s) at the masked pixels, but it only serves to identify the pixel as invalid.
With this in mind, here is an example using gdal which will mask values in the range of 0-100 so they are NoData, and "do not exist". The NoData value will be specified as 0.
from osgeo import gdal
# open dataset to read, and get a numpy array
ds = gdal.Open("raster3.tif", gdal.GA_ReadOnly)
myarray = ds.GetRasterBand(1).ReadAsArray()
# modify numpy array to mask values
myarray[myarray <= 100] = 0
# open output dataset, which is a copy of original
driver = gdal.GetDriverByName('GTiff')
ds_out = driver.CreateCopy("raster3_with_nodata.tif", ds)
# write the modified array to the raster
ds_out.GetRasterBand(1).WriteArray(myarray)
# set the NoData metadata flag
ds_out.GetRasterBand(1).SetNoDataValue(0)
# clear the buffer, and ensure file is written
ds_out.FlushCache()
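The same idea extends to several classes at once. The following NumPy-only sketch is an illustration rather than part of the answer above: the class boundaries are taken from the question, the choice of treating each range as (lower, upper] is an assumption (the question's ranges share their end points), and mask_classes is a hypothetical helper name.

```python
import numpy as np

NODATA = 0  # NoData value, as in the gdal example above

# Class boundaries from the question, interpreted as (lower, upper]
CLASS_RANGES = {
    1: (0, 100),
    2: (100, 200),
    3: (200, 300),
    4: (300, 500),
    5: (500, 1000),
}

def mask_classes(array, classes, nodata=NODATA):
    """Return a copy of array with all pixels of the given classes set to nodata."""
    out = array.copy()
    for class_id in classes:
        lower, upper = CLASS_RANGES[class_id]
        out[(array > lower) & (array <= upper)] = nodata
    return out

# Tiny made-up raster for demonstration
raster = np.array([[50, 150, 250],
                   [350, 750, 100]])

# Keep only class 3 by masking classes 1, 2, 4 and 5
only_class_3 = mask_classes(raster, classes=[1, 2, 4, 5])
```

The resulting array can then be written with WriteArray and flagged with SetNoDataValue in the same way as in the gdal example.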
I want to contribute an example for Landsat data.
This quick guide shows how to exclude Landsat cloud pixels.
Landsat offers the Quality Assessment Band (BQA), which holds integer values (classes) describing features such as clouds, rocks, ice, water, cloud shadow, etc.
We will use the BQA to clip the cloud pixels in the other bands.
# Import Packages
import rasterio as rio
import earthpy.plot as ep
from matplotlib import pyplot as plt
import rioxarray as rxr
from numpy import ma
# Open the Landsat Band 3
Landsat_Image = rxr.open_rasterio(r"C:\...\LC08_L1TP_223075_20210311_20210317_01_T1_B3.tif")
# Open the Quality Assessment Band
BQA = rxr.open_rasterio(r"C:\...\LC08_L1TP_223075_20210311_20210317_01_T1_BQA.tif").squeeze()
# Create a list with the QA values that represent cloud, cloud_shadow, etc.
Cloud_Values = [6816, 6848, 6896, 7072]
# Mask the data using the pixel QA layer
landsat_masked = Landsat_Image.where(~BQA.isin(Cloud_Values))
landsat_masked
# Plot the masked data
landsat_masked_plot = ma.masked_array(landsat_masked.values,landsat_masked.isnull())
# Plot (rgb=[2, 1, 0] assumes a stack with at least three bands)
ep.plot_rgb(landsat_masked_plot, rgb=[2, 1, 0], title = "Masked Data")
plt.show()
###############################################################################
# Export the masked Landsat Scenes to Directory "Masked_Bands_QA"
out_img = landsat_masked
out_img.shape
out_transform = landsat_masked.rio.transform()
# Get a Band of the same Scene for reference
rastDat = rio.open(r"C:\Dados_Espaciais\NDVI_Usinas\Adeco\Indices\Imagens\LC08_L1TP_223075_20210311_20210317_01_T1\LC08_L1TP_223075_20210311_20210317_01_T1_B3.tif")
#copying metadata from original raster
out_meta = rastDat.meta.copy()
#amending original metadata
out_meta.update({'nodata': 0,
                 'height': out_img.shape[1],
                 'width': out_img.shape[2],
                 'transform': out_transform})
# write each masked band to its own file
for i in range(out_img.shape[0]):
    with rio.open(rf"C:\Dados_Espaciais\DSM\Bare_Soil_Landsat\Teste_{i+1}_masked.tif", 'w', **out_meta) as dst:
        dst.write(out_img[i, :, :], 1)
This way you tell the program:
find the areas in the BQA that have these "Cloud_Values" and exclude those same areas from the Landsat image that I provided.
I hope it works.
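The masking step itself can be tried on small arrays without any Landsat files. This is a minimal NumPy sketch of the same logic as the where(~BQA.isin(...)) call above; the band values and the non-cloud QA code 2720 are made up for illustration.

```python
import numpy as np

# QA codes treated as cloud in the answer above
cloud_values = [6816, 6848, 6896, 7072]

# Made-up 2 x 3 band and matching QA layer (2720 stands for "not cloud")
band = np.array([[0.1, 0.2, 0.3],
                 [0.4, 0.5, 0.6]])
qa = np.array([[2720, 6816, 2720],
               [7072, 2720, 6896]])

# True wherever the QA layer flags a cloud
is_cloud = np.isin(qa, cloud_values)

# Keep band values where there is no cloud, NaN elsewhere
band_masked = np.where(is_cloud, np.nan, band)
```

Here np.isin plays the role of the DataArray's isin method and np.where the role of its where method.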
