I'm reasonably new to Python, and I'm trying to plot long-term mean rainfall data for the African continent. I have various NetCDF files, which have already been cut to just contain the long term mean value - I just need to plot it.
My issue is that the data is only plotting to the right of the 0 degree longitude line. I gather this is due to Basemap wanting -180 to 180 coordinates, and my data is 0 to 360. However, nothing I've tried seems to work.
Here's the code (which gives the correct plot, just cut off to the left of 0 degrees):
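from netCDF4 import Dataset
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap
nc = Dataset('GISS-E2-H_MAM_plots.nc')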
prcp = nc.variables['pr'][0,:,:]
pr = 86400*prcp[:]
lon=nc.variables['lon']
lat=nc.variables['lat']
[lonall, latall] = np.meshgrid(lon, lat)
fig = plt.figure()
m = Basemap(projection='cyl', llcrnrlat=-25, urcrnrlat=15, llcrnrlon=-20, urcrnrlon=60)
m.drawcoastlines()
m.drawcountries()
m.drawparallels(np.arange(-90.,90.,10.), labels = [1,0,0,0], fontsize = 10)
m.drawmeridians(np.arange(-180., 180., 10.), labels = [0,0,0,1], fontsize = 10)
levels=np.arange(2, 11.6, 0.8)
mymapf = plt.contourf(lonall, latall, pr, levels, cmap=plt.cm.gist_rainbow_r)
I've tried to shift the data by 180 using the following, and then np.roll to move it all along.
lonall= lonall-180
nlon=len(lonall)
pr=np.roll(pr, nlon/2, axis=1)
This worked for a colleague in a similar instance, but hasn't worked for me.
Any help would be greatly appreciated!
I think the problem is that you don't have [:] after you read in latitude and longitude. I.e. change the above lines to:
lon=nc.variables['lon'][:]
lat=nc.variables['lat'][:]
Also, you don't need the brackets around [lonall, latall]; lonall, latall = np.meshgrid(lon, lat) is enough.
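For reference, here is a minimal sketch of the 0..360 to -180..180 conversion the question is after, using plain NumPy. The function name wrap_longitudes and the tiny arrays are made up for illustration, and longitude is assumed to be the last axis of the data. Note that in the attempt above, len(lonall) on the 2-D meshgrid array returns the number of rows (latitudes), not the number of longitudes, and nlon/2 is a float under Python 3 rather than an integer shift.

```python
import numpy as np

def wrap_longitudes(lon, data):
    """Convert 0..360 longitudes to -180..180 and reorder data to match.

    Assumes longitude is the last axis of `data` (hypothetical helper).
    """
    lon = np.asarray(lon, dtype=float)
    wrapped = ((lon + 180.0) % 360.0) - 180.0   # e.g. 190 -> -170, 350 -> -10
    order = np.argsort(wrapped)                 # keep longitudes increasing
    return wrapped[order], np.asarray(data)[..., order]

# Tiny illustrative grid: one latitude row, four longitudes.
lon = np.array([0.0, 90.0, 180.0, 270.0])
pr = np.array([[0.0, 1.0, 2.0, 3.0]])
new_lon, new_pr = wrap_longitudes(lon, pr)
```

The reordered new_lon and new_pr can then go through np.meshgrid and contourf exactly as in the original code.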
In Python I have longitude, latitude and height information that I want to plot (3D):
latitudeMeshGrid (100, 100)
longitudeMeshGrid (100, 100)
heightMeshGrid (100, 100)
heightMeshGrid contains 'NaN's for points that are located outside of a specific (long, lat) region (red points in the figure below):
If I try plotting this using:
ax.plot_surface(longitudeMeshGrid, latitudeMeshGrid, heightMeshGrid, cmap=plt.cm.jet, vmin=np.nanmin(heightMeshGrid), vmax=np.nanmax(heightMeshGrid))
I get the following result:
First of all, the colors at the boundaries seem to be incorrect. Secondly, the plot seems rather "coarse" even though I use a fine grid of data. Is there a possibility to reduce the size of the rectangles?
If I remove the NaN data as follows:
longitudeMeshGrid = longitudeMeshGrid[insideBoundary]
latitudeMeshGrid = latitudeMeshGrid[insideBoundary]
heightMeshGrid = heightMeshGrid[insideBoundary]
Then I end up with:
latitudeMeshGrid (7023,)
longitudeMeshGrid (7023,)
heightMeshGrid (7023,)
This I can plot using:
ax.plot_trisurf(longitudeMeshGrid, latitudeMeshGrid, heightMeshGrid, cmap=plt.cm.jet)
With the result:
At least the artifacts at the edges are gone now, but still the plot looks really coarse.
I expected to get something similar to what I get in MATLAB using:
surf(longitudeMeshGrid, latitudeMeshGrid, heightMeshGrid)
Which ends up as:
which doesn't have any artifacts at the edges and looks much finer/smoother.
I am plotting a 2D histogram to show, for example, the concentration of lightning strikes (given by their positions in longitude and latitude). The number of data points is small (53) and the result is too coarse. Here is a picture of the result:
For this reason, I am trying to find a way to weight in data from surrounding bins. For example, there is a bin at longitude = 130 and latitude = 34.395 with 0 lightning registered, but with several around it. I would want this bin to reflect somehow the concentration around it. In other words, I want to smooth the data by having overlapping bins (so that a data point can be counted more than once, by different contiguous bins).
I understand that hist2d has the input option for "weights", but this would only work to make a data point more "important" within its bin.
The simplified code is below and I can clarify anything needed.
import numpy as np
import matplotlib.pyplot as plt
# Here are the data, to experiment if needed
longitude = np.array([119.165, 115.828, 110.354, 117.124, 119.16 , 107.068, 108.628, 126.914, 125.685, 116.608, 122.455, 116.278, 123.43, 128.84, 128.603, 130.192, 124.508, 121.916, 133.245, 125.088, 126.641, 127.224, 113.686, 129.376, 127.312, 121.353, 117.834, 125.219, 138.077, 153.299, 135.66 , 128.391, 118.011, 117.313, 119.986, 118.619, 119.178, 120.295, 121.991, 123.519, 135.948, 132.224, 129.317, 135.334, 132.923, 129.828, 139.006, 140.813, 116.207, 139.254, 120.922, 112.171, 143.508])
latitude = np.array([34.381, 34.351, 34.359, 34.357, 34.364, 34.339, 34.351, 34.38, 34.381, 34.366, 34.373, 34.366, 34.369, 34.387, 34.39 , 34.39 , 34.386, 34.371, 34.394, 34.386, 34.384, 34.387, 34.369, 34.4 , 34.396, 34.37 , 34.374, 34.383, 34.403, 34.429, 34.405, 34.385, 34.367, 34.36 , 34.367, 34.364, 34.363, 34.367, 34.367, 34.369, 34.399, 34.396, 34.382, 34.401, 34.396, 34.392, 34.401, 34.401, 34.362, 34.404, 34.382, 34.346, 34.406])
# Number of bins
Nbins = 15
# Plot histogram of the positions
plt.hist2d(longitude,latitude, bins=Nbins)
plt.plot(longitude,latitude,'o',markersize = 8, color = 'k')
plt.plot(longitude,latitude,'o',markersize = 6, color = 'w')
plt.colorbar()
plt.show()
Perhaps you're confusing what a 2D histogram (or any histogram) is. Besides being a bar plot that groups data into bins, a histogram is also a discretized estimate of a probability function; in your case, the probability of presence. For this reason, I would not try to overlap histograms.
Moreover, because the histogram is 'discrete', it will necessarily be coarse. The resolution of a histogram is an important parameter for the desired visualization.
Going back to your question, if you want to diminish the coarseness, you may simply want to play with Nbins.
Perhaps another graph type would suit your use better: see this gallery and the 2D density plot with shading.
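One way to get a smoother picture than a histogram, in the spirit of the density plot suggested above, is to sum a Gaussian bump centred on each data point. The sketch below does this with plain NumPy. The function name gaussian_density, the sample points, and the bandwidths bw_x and bw_y are assumptions chosen for illustration, not values tuned for the lightning data.

```python
import numpy as np

def gaussian_density(x, y, grid_x, grid_y, bw_x, bw_y):
    """Sum a Gaussian bump centred on every point, evaluated on a regular grid.

    Returns an array of shape (len(grid_y), len(grid_x)).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    gx, gy = np.meshgrid(grid_x, grid_y)      # each has shape (ny, nx)
    dx = (gx[..., None] - x) / bw_x           # shape (ny, nx, npoints)
    dy = (gy[..., None] - y) / bw_y
    return np.exp(-0.5 * (dx**2 + dy**2)).sum(axis=-1)

# Illustrative data: three points, two of them close together.
px = np.array([0.0, 0.2, 2.0])
py = np.array([0.0, 0.1, 1.0])
grid_x = np.linspace(-1.0, 3.0, 41)
grid_y = np.linspace(-1.0, 2.0, 31)
density = gaussian_density(px, py, grid_x, grid_y, bw_x=0.5, bw_y=0.5)
```

The result could be drawn with plt.contourf(grid_x, grid_y, density). Larger bandwidths give a smoother but blurrier map, so the bandwidth plays the role that Nbins plays for the histogram.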
I'm trying to plot wind barbs, which are spaced 100km by 100km from each other. The data I have is for the northern hemisphere (0.25 degree). I tried to reproduce the problem in the code below:
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap
lons,lats = np.meshgrid(np.linspace(-180,180,1440),np.linspace(0,90,360))
m = Basemap(projection='merc',resolution='l',llcrnrlat=33,llcrnrlon=-50,urcrnrlat=68,urcrnrlon=40)
m.drawcoastlines(linewidth=0.6)
X, Y = m(lons,lats)
UWind = np.ones((360,1440))
VWind = np.zeros((360,1440))
xx = np.arange(0, X.shape[1], 8)
yy = np.sin(np.deg2rad(np.linspace(0,90,45)))
yy = yy*360
yy[-1] = 359
yy = yy.astype(int)
# Index with a tuple of arrays; recent NumPy no longer accepts a list here
points = tuple(np.meshgrid(yy, xx))
m.barbs(X[points], Y[points], UWind[points], VWind[points],length=4,linewidth=0.6,pivot='middle')
plt.show()
xx is chosen to plot a barb every 2 degrees (8 boxes at 0.25 degree resolution). Of course, in this projection, latitude spacing will increase as latitude increases. So, to avoid this, I created yy, which varies with sin (to counteract this). It doesn't seem to do anything. Any help is very welcome. The plot this currently produces does not have equal spacing (in terms of distance, not degrees).
Replace the m.barbs call and add a new line above it which does the transform:
uproj,vproj,xx,yy = m.transform_vector(UWind,VWind,np.linspace(-180,180,1440),np.linspace(0,90,360),31,31,returnxy=True,masked=True)
m.barbs(xx,yy,uproj,vproj,length=4,linewidth=0.6,pivot='middle')
Reference: https://matplotlib.org/basemap/users/examples.html
More Information here: https://matplotlib.org/basemap/api/basemap_api.html#module-mpl_toolkits.basemap
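To see why a sine-based row selection does not give even spacing on this map, it may help to pick rows directly in projected coordinates. The sketch below assumes a spherical Mercator projection, where the northing is ln(tan(pi/4 + lat/2)). The helper names mercator_y and evenly_spaced_rows are hypothetical and not part of Basemap.

```python
import numpy as np

def mercator_y(lat_deg):
    """Mercator northing on a unit sphere for latitudes given in degrees."""
    lat = np.deg2rad(lat_deg)
    return np.log(np.tan(np.pi / 4 + lat / 2))

def evenly_spaced_rows(lats, lat_min, lat_max, n):
    """Indices into `lats` that are roughly evenly spaced on a Mercator map."""
    # Evenly spaced targets in projected (map) coordinates ...
    y_targets = np.linspace(mercator_y(lat_min), mercator_y(lat_max), n)
    # ... converted back to latitude with the inverse Mercator formula.
    lat_targets = np.rad2deg(2 * np.arctan(np.exp(y_targets)) - np.pi / 2)
    # Nearest grid row for each target latitude.
    return np.abs(lats[None, :] - lat_targets[:, None]).argmin(axis=1)

lats = np.linspace(0, 90, 360)                  # same latitude axis as the question
rows = evenly_spaced_rows(lats, 33, 68, 10)     # 10 rows across the map window
```

The selected rows get closer together in latitude towards the pole, which is the opposite of what taking the sine of the latitude produces. transform_vector achieves the same effect by interpolating onto a grid that is regular in map coordinates.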
I am attempting to plot weather variables on a map of Oklahoma using mpl_toolkits.basemap, but am having issues figuring out how to interpolate the data to plot on top of the map.
Here is a general idea of the current code I have:
lons = [-97.9547, -97.9747, -97.4256]
lats = [35.5322, 35.864, 35.4111]
data = [2,2,2]
map = Basemap(llcrnrlon = -103.068237, llcrnrlat = 33.610045, urcrnrlon = -94.359076, urcrnrlat = 37.040928, resolution = 'i')
CS = map.contour(X, Y, data)
map.drawstates()
plt.show()
What I am attempting to accomplish is to plot the data values on the map based on the related reference index in the lons/lats lists, and then contour the values of the data variable.
Now this obviously won't work, because I need to interpolate the data. Is there a way that I could accomplish this using the griddata function? I am very confused on how I would establish the boundaries of the grid given that latitude and longitude values are not linearly spaced.
Is there an easier way to do this that I am missing?
Any help and/or hints would be greatly appreciated, this is holding me back from moving on to the next major portion of the research project!
I don't have Python installed on this machine, so I can't test this. But something like this should get you the required inputs for the contour plot...
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap
lons = [-97.9547, -97.9747, -97.4256]
lats = [35.5322, 35.864, 35.4111]
data = [2,2,2]
xs, ys = np.meshgrid(lons, lats)
dataMesh = np.empty_like(xs)
# meshgrid arrays are indexed [row, column], i.e. [latitude, longitude]
for i, j, d in zip(lons, lats, data):
    dataMesh[lats.index(j), lons.index(i)] = d
map = Basemap(llcrnrlon = -103.068237, llcrnrlat = 33.610045, urcrnrlon = -94.359076, urcrnrlat = 37.040928, resolution = 'i')
CS = map.contour(xs, ys, dataMesh)
map.drawstates()
plt.show()
Like I said though, I haven't tested this. I don't know what happens if you try to plot uninitialised values. You might need to use a different NumPy array initialisation.
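Since the question asks about griddata specifically, here is a hedged sketch of that route using scipy.interpolate.griddata. The 50 x 50 grid size and the variable names grid_lon, grid_lat and grid_data are arbitrary choices for illustration. The grid simply spans the bounding box of the stations, which also sets the grid boundaries even though the stations themselves are irregularly spaced.

```python
import numpy as np
from scipy.interpolate import griddata

lons = np.array([-97.9547, -97.9747, -97.4256])
lats = np.array([35.5322, 35.864, 35.4111])
data = np.array([2.0, 2.0, 2.0])

# Regular grid covering the stations' bounding box (size is an assumption).
grid_lon = np.linspace(lons.min(), lons.max(), 50)
grid_lat = np.linspace(lats.min(), lats.max(), 50)
xs, ys = np.meshgrid(grid_lon, grid_lat)

# Linear interpolation between stations; grid cells outside the
# stations' convex hull are left as NaN rather than extrapolated.
grid_data = griddata((lons, lats), data, (xs, ys), method="linear")
```

The gridded arrays could then be passed to map.contour(xs, ys, grid_data). With three identical values the field is constant, so no contour lines will appear until real, varying data is used.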
Suppose I've been driving a set route with a 3g modem and GPS on my laptop, while my computer back at home records the ping delay. I've correlated ping with GPS lat/long, and now I'd like to visualise this data.
I've got about 80,000 points of data per day, and I'd like to display several months' worth. I'm especially interested in displaying areas where ping consistently times out (i.e. ping == 1000).
Scatter plot
My first attempt was with a scatter plot, with one point per data entry. I made the size of the point 5x larger if it was a timeout, so it was obvious where these areas were. I also dropped the alpha to 0.1, for a crude way to see overlaid points.
# Colour
c = pings
# Size
s = [2 if ping < 1000 else 10 for ping in pings]
# Scatter plot
plt.scatter(longs, lats, s=s, marker='o', c=c, cmap=cm.jet, edgecolors='none', alpha=0.1)
The obvious problem with this is that it displays one marker per data point, which is a very poor way to display large amounts of data. If I've driven past the same area twice, then the first pass data is just displayed on top of the second pass.
Interpolate over an even grid
I then had a try at using numpy and scipy to interpolate over an even grid.
# Convert python list to np arrays
x = np.array(longs, dtype=float)
y = np.array(lats, dtype=float)
z = np.array(pings, dtype=float)
# Make even grid (200 rows/cols)
xi = np.linspace(min(longs), max(longs), 200)
yi = np.linspace(min(lats), max(lats), 200)
# Interpolate data points to grid
zi = griddata((x, y), z, (xi[None,:], yi[:,None]), method='linear', fill_value=0)
# Plot contour map
plt.contour(xi,yi,zi,15,linewidths=0.5,colors='k')
plt.contourf(xi,yi,zi,15,cmap=plt.cm.jet)
From this example
This looks interesting (lots of colours and shapes), but it extrapolates too far around areas I haven't explored. You can't see the routes I've travelled, just red/blue blotches.
If I've driven in a large curve, it'll interpolate for the area between (see below):
Interpolate over an uneven grid
I then had a try at using meshgrid (xi, yi = np.meshgrid(lats, longs)) instead of a fixed grid, but I'm told my array is too big.
Is there an easy way I can create a grid from my points?
My requirements:
Handle large data sets (80,000 x 60 = ~5m points)
Display duplicate data for each point either by averaging (I assume interpolation will do this), or by taking a minimum value for each point.
Don't extrapolate too far from data points
I'm happy with a scatter plot (top), but I need some way to average the data before I display it.
(Apologies for the dodgy mspaint drawings, I can't upload actual data)
Solution:
# Get sum
hsum, long_range, lat_range = np.histogram2d(longs, lats, bins=(res_long,res_lat), range=((a,b),(c,d)), weights=pings)
# Get count
hcount, ignore1, ignore2 = np.histogram2d(longs, lats, bins=(res_long,res_lat), range=((a,b),(c,d)))
# Get average
h = hsum/hcount
x, y = np.where(h)
average = h[x, y]
# Make scatter plot
scatterplot = ax.scatter(long_range[x], lat_range[y], s=3, c=average, linewidths=0, cmap="jet", vmin=0, vmax=1000)
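A self-contained sketch of the same sum-divided-by-count idea is shown below, with two small changes that are my assumptions rather than part of the solution above: empty bins are set to NaN explicitly instead of relying on a 0/0 division, and the bin edges are returned so that bin centres can be computed. The function name binned_mean and the sample values are made up for illustration.

```python
import numpy as np

def binned_mean(x, y, values, bins, extent):
    """Mean of `values` in each 2-D bin; bins with no points are NaN."""
    total, x_edges, y_edges = np.histogram2d(
        x, y, bins=bins, range=extent, weights=values)
    count, _, _ = np.histogram2d(x, y, bins=bins, range=extent)
    mean = np.full(total.shape, np.nan)
    # Divide only where at least one point fell into the bin.
    np.divide(total, count, out=mean, where=count > 0)
    return mean, x_edges, y_edges

# Illustrative data: two pings share a bin, one timeout sits in another.
longs = np.array([0.1, 0.2, 0.9])
lats = np.array([0.1, 0.2, 0.9])
pings = np.array([100.0, 300.0, 1000.0])
mean, x_edges, y_edges = binned_mean(
    longs, lats, pings, bins=(2, 2), extent=((0, 1), (0, 1)))
```

The first axis of mean follows x (longitude) and the second follows y (latitude). Bin centres for plotting can be taken as 0.5 * (x_edges[:-1] + x_edges[1:]), whereas indexing the edge arrays directly places each marker at the lower edge of its bin.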
To simplify your question, you have two sets of points, one for ping<1000, one for ping>=1000.
Since the count of points is very large, you can't plot them directly by scatter(). I created some sample data by:
longs = (np.random.rand(60, 1) + np.linspace(-np.pi, np.pi, 80000)).reshape(-1)
lats = np.sin(longs) + np.random.rand(len(longs)) * 0.1
bad_index = (longs>0) & (longs<1)
bad_longs = longs[bad_index]
bad_lats = lats[bad_index]
(longs, lats) are the points for ping<1000, and (bad_longs, bad_lats) are the points for ping>=1000.
You can use numpy.histogram2d() to count the points:
ranges = [[np.min(lats), np.max(lats)], [np.min(longs), np.max(longs)]]
h, lat_range, long_range = np.histogram2d(lats, longs, bins=(400,400), range=ranges)
bad_h, lat_range2, long_range2 = np.histogram2d(bad_lats, bad_longs, bins=(400,400), range=ranges)
h and bad_h are the point counts in every little square area.
Then you can choose many methods to visualize it. For example, you can plot it by scatter():
y, x = np.where(h)
count = h[y, x]
pl.scatter(long_range[x], lat_range[y], s=count/20, c=count, linewidths=0, cmap="Blues")
count = bad_h[y, x]
pl.scatter(long_range2[x], lat_range2[y], s=count/20, c=count, linewidths=0, cmap="Reds")
pl.show()
Here is the full code:
import numpy as np
import pylab as pl
longs = (np.random.rand(60, 1) + np.linspace(-np.pi, np.pi, 80000)).reshape(-1)
lats = np.sin(longs) + np.random.rand(len(longs)) * 0.1
bad_index = (longs>0) & (longs<1)
bad_longs = longs[bad_index]
bad_lats = lats[bad_index]
ranges = [[np.min(lats), np.max(lats)], [np.min(longs), np.max(longs)]]
h, lat_range, long_range = np.histogram2d(lats, longs, bins=(300,300), range=ranges)
bad_h, lat_range2, long_range2 = np.histogram2d(bad_lats, bad_longs, bins=(300,300), range=ranges)
y, x = np.where(h)
count = h[y, x]
pl.scatter(long_range[x], lat_range[y], s=count/20, c=count, linewidths=0, cmap="Blues")
count = bad_h[y, x]
pl.scatter(long_range2[x], lat_range2[y], s=count/20, c=count, linewidths=0, cmap="Reds")
pl.show()
The output figure is:
The GDAL libraries, including the Python API and associated utilities (particularly gdal_grid), should work for you. gdal_grid includes a number of interpolation and averaging methods and options for generating gridded data from scattered points. You should be able to manipulate the grid cell size to get a pleasing resolution.
GDAL handles a number of data formats, but you should be able to pass your coordinates and ping values as CSV and get back a PNG or JPEG without much trouble.
Keep in mind that lat/lon data is not a planar coordinate system. If you intend to combine your results with other map data, you'll have to figure out what map projection, units, etc. to use.
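As a rough illustration of why lat/lon is not planar, the sketch below estimates how long one degree is on the ground at a given latitude. It assumes a spherical Earth with a mean radius of 6371 km, and the function name degree_lengths_km is hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius; spherical-Earth assumption

def degree_lengths_km(lat_deg):
    """Approximate length in km of one degree of latitude and of longitude."""
    one_degree = math.radians(1.0) * EARTH_RADIUS_KM
    # Meridians converge towards the poles, so a degree of longitude shrinks
    # with the cosine of the latitude.
    return one_degree, one_degree * math.cos(math.radians(lat_deg))

lat_km_equator, lon_km_equator = degree_lengths_km(0.0)
lat_km_60, lon_km_60 = degree_lengths_km(60.0)
```

At 60 degrees latitude a degree of longitude covers about half the distance it does at the equator, so a grid with equal cell sizes in degrees does not have equal cell sizes on the ground.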