Radial Heatmap from data sheet - python

I have a file with 3 columns of data: zenith (Z, from 0 to 90°), azimuth (A, from 0 to 360°), and radiance, which is the color variable.
I need to use python with matplotlib to plot this data into something resembling this:
This is my code so far (it returns an error):
import matplotlib.pyplot as plt
import numpy as np
# `data` has the following shape:
# [
# [Zenith value going from 0 to 90],
# [Azimuth values (0 to 360) increasing by 1 and looping back after 360],
# [radiance: floats that need to be mapped by the color value]
#]
data = [[6.000e+00 1.200e+01 1.700e+01 2.300e+01 2.800e+01 3.400e+01 3.900e+01
4.500e+01 5.000e+01 5.600e+01 6.200e+01 6.700e+01 7.300e+01 7.800e+01
8.400e+01 8.900e+01 3.934e+01 4.004e+01 4.054e+01 4.114e+01 4.154e+01
4.204e+01 4.254e+01 4.294e+01 4.334e+01 4.374e+01 4.414e+01 4.454e+01
4.494e+01 4.534e+01 4.564e+01 4.604e+01 4.644e+01 4.684e+01 4.714e+01
4.754e+01 4.794e+01 4.824e+01 4.864e+01 4.904e+01 4.944e+01 4.984e+01
5.014e+01 5.054e+01 5.094e+01 5.134e+01 5.174e+01 5.214e+01 5.264e+01
5.304e+01 5.344e+01 5.394e+01 5.444e+01 5.494e+01 5.544e+01 5.604e+01
5.674e+01 5.764e+01]
[1.960e+02 3.600e+01 2.360e+02 7.600e+01 2.760e+02 1.160e+02 3.160e+02
1.560e+02 3.560e+02 1.960e+02 3.600e+01 2.360e+02 7.600e+01 2.760e+02
1.160e+02 3.160e+02 6.500e+00 3.400e+00 3.588e+02 2.500e+00 3.594e+02
3.509e+02 5.000e-01 6.900e+00 1.090e+01 3.478e+02 1.250e+01 1.050e+01
7.300e+00 2.700e+00 3.571e+02 3.507e+02 1.060e+01 3.200e+00 3.556e+02
3.480e+02 7.300e+00 3.597e+02 3.527e+02 1.260e+01 6.600e+00 1.200e+00
3.570e+02 3.538e+02 3.520e+02 3.516e+02 3.528e+02 3.560e+02 1.200e+00
8.800e+00 3.567e+02 1.030e+01 6.800e+00 8.300e+00 3.583e+02 3.581e+02
3.568e+02 3.589e+02]
[3.580e-04 6.100e-04 3.220e-04 4.850e-04 4.360e-04 2.910e-04 1.120e-03
2.320e-04 4.300e-03 2.680e-04 1.700e-03 3.790e-04 7.460e-04 8.190e-04
1.030e-03 3.650e-03 3.050e-03 3.240e-03 3.340e-03 3.410e-03 3.490e-03
3.290e-03 3.630e-03 3.510e-03 3.320e-03 3.270e-03 3.280e-03 3.470e-03
3.720e-03 3.960e-03 3.980e-03 3.700e-03 3.630e-03 4.100e-03 4.080e-03
3.600e-03 3.990e-03 4.530e-03 4.040e-03 3.630e-03 4.130e-03 4.370e-03
4.340e-03 4.210e-03 4.100e-03 4.090e-03 4.190e-03 4.380e-03 4.460e-03
4.080e-03 4.420e-03 3.960e-03 4.230e-03 4.120e-03 4.440e-03 4.420e-03
4.370e-03 4.380e-03]]
rad = data[0]
azm = data[1]
# From what I understand, I need to create a meshgrid from the zenith and azimuth values
r, th = np.meshgrid(rad, azm)
z = data[2] # This doesn't work as `pcolormesh` expects this to be a 2d array
plt.subplot(projection="polar")
plt.pcolormesh(th, r, z, shading="auto")
plt.plot(azm, r, color="k", ls="none")
plt.show()
Note: my actual data goes on for 56k lines and looks like this (Ignore the 4th column):
The example data above is my attempt to reduce the resolution of this massive file, so I only used 1/500 of the lines of data. This might be the wrong way to reduce the resolution, please correct me if it is!
Every tutorial I've seen generates the z value from the r array produced by meshgrid. This leaves me confused about how to convert my z column into a 2d array that properly maps to the zenith and azimuth values.
They'll use something like this:
z = (r ** 2.0) / 4.0
So, taking the exact shape of r and applying a transformation to create the color.

The solution was in the data file all along. I needed to better understand what np.meshgrid actually does. It turns out the data already is a 2d array; it just needed to be reshaped. I also found a flaw in the file, and fixing it reduced the file from 56k lines to 15k. That was small enough that I did not need to reduce the resolution.
Here's how I reshaped my data, and what the solution looked like:
import matplotlib.pyplot as plt
import numpy as np
with open("data.txt") as f:
    lines = np.array(
        [
            [float(n) for n in line.split("\t")]
            for i, line in enumerate(f.read().splitlines())
        ]
    )
data = [np.reshape(a, (89, 180)) for a in lines.T]
rad = np.radians(data[1])
azm = data[0]
z = data[2]
plt.subplot(projection="polar")
plt.pcolormesh(rad, azm, z, cmap="coolwarm", shading="auto")
plt.colorbar()
plt.show()
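To see why a reshape is enough, here is a minimal sketch on a small synthetic grid. The 3 x 4 size and the radiance values are assumptions made up for illustration, not the real file. When a file lists every zenith/azimuth combination in a regular order, each column is simply a flattened 2D grid, and reshaping recovers it:

```python
import numpy as np

# Hypothetical small grid: 3 zenith values x 4 azimuth values.
zenith = np.array([10.0, 20.0, 30.0])
azimuth = np.array([0.0, 90.0, 180.0, 270.0])

# Build the flat three-column layout a data file would have:
# azimuth varies fastest, zenith changes every 4 rows.
az_grid, zen_grid = np.meshgrid(azimuth, zenith)      # both have shape (3, 4)
radiance_grid = zen_grid / 100.0 + az_grid / 1000.0   # made-up values
columns = np.column_stack(
    [zen_grid.ravel(), az_grid.ravel(), radiance_grid.ravel()]
)  # shape (12, 3), one row per file line

# Reshaping each column restores the 2D arrays that pcolormesh expects.
zen_2d, az_2d, rad_2d = (col.reshape(3, 4) for col in columns.T)
```

The reshape only works when the file covers a complete, regularly ordered grid; the (89, 180) shape used above is specific to that particular file.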

The simplest way to plot the given data is with a polar scatter plot.
Using blue for low values and red for high values, it could look like:
import matplotlib.pyplot as plt
import numpy as np
data = [[6.000e+00, 1.200e+01, 1.700e+01, 2.300e+01, 2.800e+01, 3.400e+01, 3.900e+01, 4.500e+01, 5.000e+01, 5.600e+01, 6.200e+01, 6.700e+01, 7.300e+01, 7.800e+01, 8.400e+01, 8.900e+01, 3.934e+01, 4.004e+01, 4.054e+01, 4.114e+01, 4.154e+01, 4.204e+01, 4.254e+01, 4.294e+01, 4.334e+01, 4.374e+01, 4.414e+01, 4.454e+01, 4.494e+01, 4.534e+01, 4.564e+01, 4.604e+01, 4.644e+01, 4.684e+01, 4.714e+01, 4.754e+01, 4.794e+01, 4.824e+01, 4.864e+01, 4.904e+01, 4.944e+01, 4.984e+01, 5.014e+01, 5.054e+01, 5.094e+01, 5.134e+01, 5.174e+01, 5.214e+01, 5.264e+01, 5.304e+01, 5.344e+01, 5.394e+01, 5.444e+01, 5.494e+01, 5.544e+01, 5.604e+01, 5.674e+01, 5.764e+01],
[1.960e+02, 3.600e+01, 2.360e+02, 7.600e+01, 2.760e+02, 1.160e+02, 3.160e+02, 1.560e+02, 3.560e+02, 1.960e+02, 3.600e+01, 2.360e+02, 7.600e+01, 2.760e+02, 1.160e+02, 3.160e+02, 6.500e+00, 3.400e+00, 3.588e+02, 2.500e+00, 3.594e+02, 3.509e+02, 5.000e-01, 6.900e+00, 1.090e+01, 3.478e+02, 1.250e+01, 1.050e+01, 7.300e+00, 2.700e+00, 3.571e+02, 3.507e+02, 1.060e+01, 3.200e+00, 3.556e+02, 3.480e+02, 7.300e+00, 3.597e+02, 3.527e+02, 1.260e+01, 6.600e+00, 1.200e+00, 3.570e+02, 3.538e+02, 3.520e+02, 3.516e+02, 3.528e+02, 3.560e+02, 1.200e+00, 8.800e+00, 3.567e+02, 1.030e+01, 6.800e+00, 8.300e+00, 3.583e+02, 3.581e+02, 3.568e+02, 3.589e+02],
[3.580e-04, 6.100e-04, 3.220e-04, 4.850e-04, 4.360e-04, 2.910e-04, 1.120e-03, 2.320e-04, 4.300e-03, 2.680e-04, 1.700e-03, 3.790e-04, 7.460e-04, 8.190e-04, 1.030e-03, 3.650e-03, 3.050e-03, 3.240e-03, 3.340e-03, 3.410e-03, 3.490e-03, 3.290e-03, 3.630e-03, 3.510e-03, 3.320e-03, 3.270e-03, 3.280e-03, 3.470e-03, 3.720e-03, 3.960e-03, 3.980e-03, 3.700e-03, 3.630e-03, 4.100e-03, 4.080e-03, 3.600e-03, 3.990e-03, 4.530e-03, 4.040e-03, 3.630e-03, 4.130e-03, 4.370e-03, 4.340e-03, 4.210e-03, 4.100e-03, 4.090e-03, 4.190e-03, 4.380e-03, 4.460e-03, 4.080e-03, 4.420e-03, 3.960e-03, 4.230e-03, 4.120e-03, 4.440e-03, 4.420e-03, 4.370e-03, 4.380e-03]]
rad = np.radians(data[1])
azm = data[0]
z = data[2]
plt.subplot(projection="polar")
plt.scatter(rad, azm, c=z, cmap='coolwarm')
plt.colorbar()
plt.show()
Creating such a scatter plot with your real data gives an idea of what it looks like. You might want to choose a different colormap, depending on what you want to convey. You can also choose a smaller dot size (for example plt.scatter(rad, azm, c=z, cmap='plasma', s=1, ec='none')) if there are too many points.
A simple way to create a filled image from non-gridded data uses tricontourf with 256 colors (it looks quite dull with the given data, so I didn't add an example plot):
plt.subplot(projection="polar")
plt.tricontourf(rad, azm, z, levels=256, cmap='coolwarm')

Related

Drawing 2D multiple vectors from an ordered list in Python

tl;dr How do I draw connecting 2D Vectors from a list of Coordinates
First-time questioner, so bear with me if I am not following proper etiquette :D
I study mechanical engineering and I have been trying to create a visual aide for the addition of rotating and oscillating forces within an engine. Every piston initially has its own vector, consisting of a length and an angle.
vectors = [[Force, Angle], [Force, Angle], [Force, Angle], ...]
These are converted to x,y components in EXCEL (will include this in python later)
Individual vector components
vectors = [[0.00, 1296.16],
[-1013.38, -808.14],
[421.22, -96.14],
[-374.92, 778.53],
[-374.92, -778.53],
[0.00, 0.00],
[337.79, -269.38]]
These vectors are then added 1 by 1 for a step-by-step resultant
x_coords = [0.0, -1013.38, -592.16, -967.08, -1342.0, -1342.0, -1004.21]
y_coords = [1296.16, 488.02, 391.88, 1170.41, 391.88, 391.88, 122.5]
WHAT I WANT
What I have
EXCEL CONVERSION
Converting the data in excel
drawing in tkinter/mpl is appreciated, but I will accept any help :)
CODE
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
n=['A','B','C','D','E','F','G'] # These will be replaced with Engine Firing Order
vectors = [[0.00, 1296.16],
[-1013.38, -808.14],
[421.22, -96.14],
[-374.92, 778.53],
[-374.92, -778.53],
[0.00, 0.00],
[337.79, -269.38]]
vectors = pd.DataFrame(np.array(vectors),columns=('x1', 'y1'))
x_coords = []
y_coords = []
sum_x = 0
sum_y = 0
for x in vectors['x1']:
    sum_x += x
    x_coords.append(round(sum_x,2))
for y in vectors['y1']:
    sum_y += y
    y_coords.append(round(sum_y, 2))
coords = []
fig, ax = plt.subplots()
ax.scatter(x_coords, y_coords)
for i, txt in enumerate(n):
    ax.annotate(txt, (x_coords[i], y_coords[i]))
plt.show()
PS. Any suggestions/comments/tips about the code are appreciated :)
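One possible way to get the head-to-tail picture is sketched below, under the assumption that the component vectors are already available as (x, y) pairs. The running sum gives the tip of every vector, and shifting that by one row gives the start of every vector. The helper name draw_chain is hypothetical; it only wraps matplotlib's Axes.quiver call.

```python
import numpy as np

# Component vectors from the question, one (x, y) pair per row.
vectors = np.array([
    [0.00, 1296.16],
    [-1013.38, -808.14],
    [421.22, -96.14],
    [-374.92, 778.53],
    [-374.92, -778.53],
    [0.00, 0.00],
    [337.79, -269.38],
])

# Running sum down the rows: the tip of each vector in the chain.
heads = np.round(np.cumsum(vectors, axis=0), 2)
# Each vector starts where the previous one ended; the first starts at the origin.
tails = np.vstack([[0.0, 0.0], heads[:-1]])


def draw_chain(ax, tails, vectors):
    """Draw each vector as an arrow starting at its tail (hypothetical helper)."""
    ax.quiver(tails[:, 0], tails[:, 1], vectors[:, 0], vectors[:, 1],
              angles="xy", scale_units="xy", scale=1)
```

Calling draw_chain(ax, tails, vectors) on an Axes from plt.subplots() should draw the arrows in data coordinates. quiver does not always rescale the axes to fit the arrows, so setting the limits with ax.set_xlim and ax.set_ylim may be needed.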

Contour of scattered data via interpolation or QHull in python

I'm trying to plot a contour at z = 0.95 from my data, but I couldn't manage to interpolate it the way I want. I tried griddata as follows:
from scipy.interpolate import griddata
N = 1000
xi = np.linspace(min(x), max(y), N)
yi = np.linspace(min(x), max(y), N)
c = griddata((np.array(x),np.array(y)),
np.array(z), (xi[None,:],
yi[:,None]), method='linear')
fig, sys = plt.subplots()
sys.contour(xi, yi, c, levels = [.95],
colors=('darkred',),linestyles=('solid',),linewidths=(2,))
As can be seen in the graph below, I also tried QHull, cutting the z-axis at 0.95.
a = genfromtxt('data.txt')[:,[0,1]] #data where z <= .95
hull = ConvexHull(a)
sys.plot(a[hull.vertices,0], a[hull.vertices,1], color='red',
linestyle='--', lw=2.5, zorder=90, label=r"QHUL")
Below I illustrate both methods, along with how the result should essentially look (the target curve uses different data, just for illustration). Because of the dip in my data around (1.7, 420), the interpolation zigzags in that region, and I could not fix this even by treating pieces of the data separately. The QHull method is not accurate enough for this data, so I cannot use it. Is there any way to interpolate the data to get a curve similar to the one shown below?
Thanks!
My data is as follows (x,y,z);
1.950e+00 1.500e+02 9.557e-01
1.950e+00 4.800e+02 9.302e-01
1.950e+00 3.100e+02 9.467e-01
1.900e+00 5.500e+02 9.493e-01
1.700e+00 6.000e+02 9.359e-01
1.700e+00 5.500e+02 9.447e-01
8.430e-01 7.800e+02 9.906e-01
1.300e+00 9.000e+02 9.349e-01
1.655e+00 8.132e+02 9.406e-01
1.138e+00 8.453e+02 9.542e-01
1.728e+00 4.895e+02 9.335e-01
1.953e+00 2.254e+02 9.507e-01
1.932e+00 4.706e+01 9.552e-01
1.661e+00 8.081e+02 9.287e-01
1.956e+00 9.931e+00 9.320e-01
1.947e+00 4.457e+01 9.396e-01
1.949e+00 9.769e+01 9.575e-01
1.912e+00 4.441e+02 9.616e-01
1.956e+00 3.739e+01 9.344e-01
1.953e+00 1.042e+02 9.277e-01
1.957e+00 0.000e+00 9.329e-01
1.938e+00 3.455e+01 9.411e-01
1.946e+00 6.045e+01 9.381e-01
1.951e+00 8.227e+01 9.571e-01
1.962e+00 2.500e+01 9.478e-01
1.951e+00 2.778e+01 9.559e-01
1.949e+00 6.736e+01 9.630e-01
1.949e+00 1.097e+02 9.331e-01
1.708e+00 4.998e+02 9.526e-01
1.951e+00 1.250e+02 9.516e-01
1.730e+00 4.642e+02 9.332e-01
1.912e+00 4.780e+02 9.558e-01
1.927e+00 5.145e+02 9.401e-01
1.712e+00 5.203e+02 9.519e-01
1.722e+00 5.470e+02 9.396e-01
1.962e+00 1.117e+02 9.519e-01
1.962e+00 2.195e+01 9.269e-01
1.962e+00 3.366e+01 9.514e-01
1.959e+00 9.610e+01 9.270e-01
1.959e+00 4.537e+01 9.281e-01
1.959e+00 6.488e+01 9.277e-01
1.959e+00 7.659e+01 9.346e-01
1.953e+00 4.537e+01 9.615e-01
1.950e+00 1.820e+02 9.552e-01
1.950e+00 1.702e+02 9.547e-01
1.950e+00 1.415e+01 9.389e-01
1.947e+00 2.639e+02 9.517e-01
1.947e+00 2.015e+02 9.533e-01
1.941e+00 3.029e+02 9.533e-01
1.935e+00 2.873e+02 9.573e-01
1.959e+00 1.415e+01 9.314e-01
1.959e+00 2.439e+00 9.335e-01
1.899e+00 5.137e+02 9.549e-01
1.896e+00 5.371e+02 9.563e-01
1.888e+00 5.839e+02 9.531e-01
1.870e+00 5.917e+02 9.553e-01
1.722e+00 4.746e+02 9.468e-01
1.716e+00 4.278e+02 9.604e-01
1.704e+00 5.644e+02 9.482e-01
1.683e+00 5.800e+02 9.574e-01
1.609e+00 6.854e+02 9.477e-01
1.263e+00 8.766e+02 9.417e-01
1.198e+00 8.532e+02 9.524e-01
1.172e+00 8.532e+02 9.394e-01
1.927e+00 3.807e+02 9.540e-01
1.582e+00 8.424e+02 9.569e-01
1.000e+00 8.415e+02 9.526e-01
8.817e-01 7.985e+02 9.348e-01
1.954e+00 3.139e+00 9.364e-01
1.932e+00 3.583e+02 9.585e-01
1.910e+00 5.018e+02 9.500e-01
1.891e+00 5.628e+02 9.505e-01
1.858e+00 5.987e+02 9.470e-01
1.752e+00 4.874e+02 9.974e-01
1.711e+00 4.803e+02 9.477e-01
1.698e+00 5.341e+02 9.545e-01
1.687e+00 5.628e+02 9.570e-01
1.638e+00 6.596e+02 9.525e-01
1.624e+00 7.996e+02 9.559e-01
1.624e+00 8.211e+02 9.523e-01
1.619e+00 6.632e+02 9.550e-01
1.611e+00 8.283e+02 9.510e-01
1.605e+00 8.354e+02 9.537e-01
1.597e+00 6.776e+02 9.566e-01
1.592e+00 8.426e+02 9.445e-01
1.956e+00 7.908e+01 9.259e-01
It turns out that the span of the interpolation grid and the number of grid points matter:
N = 40
x = linspace(0.5,2.4,N)
y = linspace(0.,1100.,N)
mean_CL = griddata((Mgo,Mn1), mean_CLs, (x[None,:], y[:,None]), method='linear')
sc.contour(x,y,mean_CL,levels = [.95],colors=('darkred',),linestyles=('solid',),linewidths=(2,))
did the job. However, rather than having the data clustered in one region, the sample points may need to span the entire x-y plane; they don't need to be close together. I sampled a 25x0.025 grid and it worked perfectly.
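A minimal sketch of building the target grid so that each axis covers its own variable's range. Note that the code in the question built both axes from min(x) and max(y). The sample coordinates and the 10% padding here are assumptions for illustration only.

```python
import numpy as np

# Made-up scattered sample coordinates standing in for the real x and y columns.
x = np.array([0.843, 1.3, 1.7, 1.95])
y = np.array([0.0, 150.0, 600.0, 900.0])

N = 40
# Each axis spans its own variable's range (padded by an arbitrary 10%).
pad_x = 0.1 * (x.max() - x.min())
pad_y = 0.1 * (y.max() - y.min())
xi = np.linspace(x.min() - pad_x, x.max() + pad_x, N)
yi = np.linspace(y.min() - pad_y, y.max() + pad_y, N)

# xi[None, :] and yi[:, None] broadcast to an (N, N) grid, which is the
# shape griddata returns when given these as target points.
grid_shape = np.broadcast(xi[None, :], yi[:, None]).shape
```

With linear interpolation, grid nodes that fall outside the convex hull of the samples come back as NaN, so padding the range is harmless for drawing a single contour level.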

flipping and rotating numpy arrays for contour plots

Short Version:
I have a 10x10 numpy array whose contour plot (plotted with pyplot.contourf) looks like this
Now I want it to look something like this, assuming the plot is symmetric across the X and Y axes.
Long version
I have a 10x10 numpy array z as a function of x and y, where x=y=np.arange(0.002,0.022,0.002). Here is what I tried:
import numpy as np
import matplotlib.pyplot as plt
z=np.array([[ 2.08273679, -0.06591932, -1.14525488, -1.49923222, -1.74361248,
-1.81418446, -1.90115591, -1.94329043, -1.93130228, -1.96064259],
[ 0.20180514, -0.94522815, -1.34635828, -1.58844515, -1.7528935 ,
-1.84438752, -1.86257547, -1.9439332 , -1.99009407, -1.94829146],
[-1.09749238, -1.48234452, -1.64234357, -1.75344742, -1.83019763,
-1.88547473, -1.92958533, -1.940775 , -1.95535063, -1.9629588 ],
[-1.62892483, -1.70176401, -1.76263555, -1.84966414, -1.87139241,
-1.91879916, -1.90796703, -1.96632612, -1.95794984, -1.94585536],
[-1.71551518, -1.91806287, -1.86999609, -1.90800839, -1.92515012,
-1.93386969, -1.96487487, -1.95405297, -1.97032435, -1.96087146],
[-1.81904322, -1.94790171, -2. , -1.96932249, -1.91842475,
-1.98101775, -1.98521938, -1.97618539, -1.95892852, -2.01410874],
[-1.8138236 , -1.90877811, -1.93966404, -1.98406259, -1.95253807,
-1.95867436, -1.96679456, -2.01126218, -1.99885932, -1.99369292],
[-1.9927308 , -1.97658099, -1.91586737, -1.96813381, -1.98416011,
-1.98639893, -1.99997964, -1.99746813, -1.98126505, -1.97767361],
[-1.96406473, -1.92609437, -1.99171257, -1.94687523, -1.9823819 ,
-1.97786533, -2.02323228, -1.98559114, -1.99172681, -2.00881064],
[-1.92470024, -1.99537152, -1.99419303, -1.97261023, -1.9673841 ,
-1.98801505, -2.02412735, -2.01394008, -2.01956817, -2.04963448]])
x=y=np.arange(0.002,0.022,0.002)
#The following gives the plot I currently have
plt.figure()
plt.contourf(x,y,z)
plt.show()
#Tried to flip the matrix z using np.flipud and np.fliplr
plt.figure()
plt.contourf(x,y,z)
plt.contourf(-x,y,np.fliplr(z))
plt.contourf(x,-y,np.flipud(z))
plt.contourf(-x,-y,np.flipud(np.fliplr(z)))
plt.show()
#Also tried to rotate the matrix z using np.rot90
plt.figure()
plt.contourf(x,y,z)
plt.contourf(x,-y,np.rot90(z))
plt.contourf(-x,-y,np.rot90(z,2))
plt.contourf(-x,y,np.rot90(z,3))
plt.show()
I get the following plots with the above code
and
Ideally I would also like to fill the discontinuity at the origin by interpolation of the plot. But for starters, would like to get the orientation right. Any help is greatly appreciated.
Your problem is that, even though you negate x and y, their order stays the same, so with negative x, you go from -0.002 to -0.022, which means that the flipped z gets flipped back during the plotting. To achieve what you want, you can do the following:
#either don't flip z
plt.figure()
plt.contourf(x,y,z)
plt.contourf(-x,y,z)
plt.contourf(x,-y,z)
plt.contourf(-x,-y,z)
plt.show()
#or reverse also -x and -y:
plt.figure()
plt.contourf(x,y,z)
plt.contourf(-x[::-1],y,np.fliplr(z))
plt.contourf(x,-y[::-1],np.flipud(z))
plt.contourf(-x[::-1],-y[::-1],np.flipud(np.fliplr(z)))
plt.show()
If you had simply concatenated z and the flipped z, everything would have worked as expected. plt.contourf takes care of the interpolation itself.
ztotal = np.concatenate([np.fliplr(z),z],axis=1)
ztotal = np.concatenate([np.flipud(ztotal),ztotal],axis=0)
xtotal = np.concatenate([-x[::-1],x],axis=0)
ytotal = np.concatenate([-y[::-1],y],axis=0)
plt.figure()
plt.contourf(xtotal,ytotal,ztotal)
plt.show()
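The orientation argument can be checked numerically without plotting. The sketch below uses a stand-in 10x10 array (an assumption; any values work for checking orientation) and builds x as 0.002 * np.arange(1, 11), which is intended to match the question's axis while avoiding floating-point surprises in np.arange with a fractional step.

```python
import numpy as np

x = 0.002 * np.arange(1, 11)
# Stand-in for the 10x10 array from the question.
z = np.arange(100, dtype=float).reshape(10, 10)

# Negating x reverses its direction: -x runs from -0.002 down to -0.020.
neg_x = -x
# Reversing it as well gives an increasing axis again.
neg_x_sorted = -x[::-1]

# Mirror z left-right and top-bottom, then stitch the four quadrants together.
top = np.concatenate([np.fliplr(z), z], axis=1)
ztotal = np.concatenate([np.flipud(top), top], axis=0)
xtotal = np.concatenate([neg_x_sorted, x])
```

Here ztotal equals its own left-right and top-bottom mirror images, and xtotal is strictly increasing, which is the layout the concatenation approach relies on.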
Combine the results of fliplr and flipud of your array z into a new, double-sized array zz, then plot it. To reproduce the gap in your first figure, fill the interval (-0.002, +0.002) in x and y with nan values:
import numpy as np
import matplotlib.pyplot as plt
z=np.array([[ 2.08273679, -0.06591932, -1.14525488, -1.49923222, -1.74361248,
-1.81418446, -1.90115591, -1.94329043, -1.93130228, -1.96064259],
[ 0.20180514, -0.94522815, -1.34635828, -1.58844515, -1.7528935 ,
-1.84438752, -1.86257547, -1.9439332 , -1.99009407, -1.94829146],
[-1.09749238, -1.48234452, -1.64234357, -1.75344742, -1.83019763,
-1.88547473, -1.92958533, -1.940775 , -1.95535063, -1.9629588 ],
[-1.62892483, -1.70176401, -1.76263555, -1.84966414, -1.87139241,
-1.91879916, -1.90796703, -1.96632612, -1.95794984, -1.94585536],
[-1.71551518, -1.91806287, -1.86999609, -1.90800839, -1.92515012,
-1.93386969, -1.96487487, -1.95405297, -1.97032435, -1.96087146],
[-1.81904322, -1.94790171, -2. , -1.96932249, -1.91842475,
-1.98101775, -1.98521938, -1.97618539, -1.95892852, -2.01410874],
[-1.8138236 , -1.90877811, -1.93966404, -1.98406259, -1.95253807,
-1.95867436, -1.96679456, -2.01126218, -1.99885932, -1.99369292],
[-1.9927308 , -1.97658099, -1.91586737, -1.96813381, -1.98416011,
-1.98639893, -1.99997964, -1.99746813, -1.98126505, -1.97767361],
[-1.96406473, -1.92609437, -1.99171257, -1.94687523, -1.9823819 ,
-1.97786533, -2.02323228, -1.98559114, -1.99172681, -2.00881064],
[-1.92470024, -1.99537152, -1.99419303, -1.97261023, -1.9673841 ,
-1.98801505, -2.02412735, -2.01394008, -2.01956817, -2.04963448]])
x=y=np.linspace(-0.020,0.020,21)
zz = np.empty((21,21)); zz[:,:] = np.nan
zz[11:,11:] = z
zz[11:,:10] = np.fliplr(z)
zz[:10,:] = np.flipud(zz[11:,:])
plt.figure()
plt.contourf(x,y,zz)
plt.show()
To fill the gap, drop one point from the coordinate arrays:
...
x=y=np.linspace(-0.020,0.020,20)
zz = np.empty((20,20)); zz[:,:] = np.nan
zz[10:,10:] = z
zz[10:,:10] = np.fliplr(z)
zz[:10,:] = np.flipud(zz[10:,:])
...

Adding a single label to the legend for a series of different data points plotted inside a designated bin in Python using matplotlib.pyplot.plot()

I have a script for plotting astronomical data of redmapping clusters from a csv file. I can read the data points and want to plot them in different colors depending on their redshift values: I am binning the dataset into 3 bins (0.1-0.2, 0.2-0.25, 0.25-0.31) based on the redshift.
The problem arises with my code after I distinguish to what bin the datapoint belongs: I want to have 3 labels in the legend corresponding to red, green and blue data points, but this is not happening and I don't know why. I am using plot() instead of scatter() as I also had to do the best fit from the data in the same figure. So everything needs to be in 1 figure.
import numpy as np
import matplotlib.pyplot as py
import csv
z = open("Sheet4CSV.csv","rU")
data = csv.reader(z)
x = []
y = []
ylow = []
yupp = []
xlow = []
xupp = []
redshift = []
for r in data:
    x.append(float(r[2]))
    y.append(float(r[5]))
    xlow.append(float(r[3]))
    xupp.append(float(r[4]))
    ylow.append(float(r[6]))
    yupp.append(float(r[7]))
    redshift.append(float(r[1]))
from operator import sub
xerr_l = map(sub,x,xlow)
xerr_u = map(sub,xupp,x)
yerr_l = map(sub,y,ylow)
yerr_u = map(sub,yupp,y)
py.xlabel("$Original\ Tx\ XCS\ pipeline\ Tx\ keV$")
py.ylabel("$Iterative\ Tx\ pipeline\ keV$")
py.xlim(0,12)
py.ylim(0,12)
py.title("Redmapper Clusters comparison of Tx pipelines")
ax1 = py.subplot(111)
##Problem starts here after the previous line##
for p in redshift:
    for i in xrange(84):
        p=redshift[i]
        if 0.1<=p<0.2:
            ax1.plot(x[i],y[i],color="b", marker='.', linestyle = " ")#, label = "$z < 0.2$")
            exit
        if 0.2<=p<0.25:
            ax1.plot(x[i],y[i],color="g", marker='.', linestyle = " ")#, label="$0.2 \leq z < 0.25$")
            exit
        if 0.25<=p<=0.3:
            ax1.plot(x[i],y[i],color="r", marker='.', linestyle = " ")#, label="$z \geq 0.25$")
            exit
##There seems nothing wrong after this point##
py.errorbar(x,y,yerr=[yerr_l,yerr_u],xerr=[xerr_l,xerr_u], fmt= " ",ecolor='magenta', label="Error bars")
cof = np.polyfit(x,y,1)
p = np.poly1d(cof)
l = np.linspace(0,12,100)
py.plot(l,p(l),"black",label="Best fit")
py.plot([0,15],[0,15],"black", linestyle="dotted", linewidth=2.0, label="line $y=x$")
py.grid()
box = ax1.get_position()
ax1.set_position([box.x1,box.y1,box.width, box.height])
py.legend(loc='center left',bbox_to_anchor=(1,0.5))
py.show()
In the first 'for' loop, I index every value 'p' in the list 'redshift' so that the bins can be created with 'if' statements. But if I add the labels that are commented out after each plot() call inside the 'if' statements, every data point 'i' plotted at (x[i], y[i]) gets its own label, and my legend ends up with 87 labels in total (including the 3 set elsewhere in the code).
I essentially need 1 label for each bin.
Please tell me what needs to be done after the bins are created and the plot() commands are used. Thanks in advance :-)
Sorry I cannot post my image here due to low reputation!
The data 'appended' for x, y and redshift lists from the csv file are as follows:
x=[5.031,10.599,10.589,8.548,9.089,8.675,3.588,1.244,3.023,8.632,8.953,7.603,7.513,2.917,7.344,7.106,3.889,7.287,3.367,6.839,2.801,2.316,1.328,6.31,6.19,6.329,6.025,5.629,6.123,5.892,5.438,4.398,4.542,4.624,4.501,4.504,5.033,5.068,4.197,2.854,4.784,2.158,4.054,3.124,3.961,4.42,3.853,3.658,1.858,4.537,2.072,3.573,3.041,5.837,3.652,3.209,2.742,2.732,1.312,3.635,2.69,3.32,2.488,2.996,2.269,1.701,3.935,2.015,0.798,2.212,1.672,1.925,3.21,1.979,1.794,2.624,2.027,3.66,1.073,1.007,1.57,0.854,0.619,0.547]
y=[5.255,10.897,11.045,9.125,9.387,17.719,4.025,1.389,4.152,8.703,9.051,8.02,7.774,3.139,7.543,7.224,4.155,7.416,3.905,6.868,2.909,2.658,1.651,6.454,6.252,6.541,6.152,5.647,6.285,6.079,5.489,4.541,4.634,8.851,4.554,4.555,5.559,5.144,5.311,5.839,5.364,3.18,4.352,3.379,4.059,4.575,3.914,5.736,2.304,4.68,3.187,3.756,3.419,9.118,4.595,3.346,3.603,6.313,1.816,4.34,2.732,4.978,2.719,3.761,2.623,2.1,4.956,2.316,4.231,2.831,1.954,2.248,6.573,2.276,2.627,3.85,3.545,25.405,3.996,1.347,1.679,1.435,0.759,0.677]
redshift = [0.12,0.25,0.23,0.23,0.27,0.26,0.12,0.27,0.17,0.18,0.17,0.3,0.23,0.1,0.23,0.29,0.29,0.12,0.13,0.26,0.11,0.24,0.13,0.21,0.17,0.2,0.3,0.29,0.23,0.27,0.25,0.21,0.11,0.15,0.1,0.26,0.23,0.12,0.23,0.26,0.2,0.17,0.22,0.26,0.25,0.12,0.19,0.24,0.18,0.15,0.27,0.14,0.14,0.29,0.29,0.26,0.15,0.29,0.24,0.24,0.23,0.26,0.29,0.22,0.13,0.18,0.24,0.14,0.24,0.24,0.17,0.26,0.29,0.11,0.14,0.26,0.28,0.26,0.28,0.27,0.23,0.26,0.23,0.19]
Working with numerical data like this, you should really consider using a numerical library, like numpy.
The problem in your code arises from processing each record (a coordinate (x,y) and the corresponding value redshift) one at a time. You are calling plot for each point, thereby creating legends for each of those 84 datapoints. You should consider your "bins" as groups of data that belong to the same dataset and process them as such. You could use "logical masks" to distinguish between your "bins", as shown below.
It's also not clear why you write exit after each plotting action: a bare exit without parentheses only evaluates the name and does nothing.
import numpy as np
import matplotlib.pyplot as plt
x = np.array([5.031,10.599,10.589,8.548,9.089,8.675,3.588,1.244,3.023,8.632,8.953,7.603,7.513,2.917,7.344,7.106,3.889,7.287,3.367,6.839,2.801,2.316,1.328,6.31,6.19,6.329,6.025,5.629,6.123,5.892,5.438,4.398,4.542,4.624,4.501,4.504,5.033,5.068,4.197,2.854,4.784,2.158,4.054,3.124,3.961,4.42,3.853,3.658,1.858,4.537,2.072,3.573,3.041,5.837,3.652,3.209,2.742,2.732,1.312,3.635,2.69,3.32,2.488,2.996,2.269,1.701,3.935,2.015,0.798,2.212,1.672,1.925,3.21,1.979,1.794,2.624,2.027,3.66,1.073,1.007,1.57,0.854,0.619,0.547])
y = np.array([5.255,10.897,11.045,9.125,9.387,17.719,4.025,1.389,4.152,8.703,9.051,8.02,7.774,3.139,7.543,7.224,4.155,7.416,3.905,6.868,2.909,2.658,1.651,6.454,6.252,6.541,6.152,5.647,6.285,6.079,5.489,4.541,4.634,8.851,4.554,4.555,5.559,5.144,5.311,5.839,5.364,3.18,4.352,3.379,4.059,4.575,3.914,5.736,2.304,4.68,3.187,3.756,3.419,9.118,4.595,3.346,3.603,6.313,1.816,4.34,2.732,4.978,2.719,3.761,2.623,2.1,4.956,2.316,4.231,2.831,1.954,2.248,6.573,2.276,2.627,3.85,3.545,25.405,3.996,1.347,1.679,1.435,0.759,0.677])
redshift = np.array([0.12,0.25,0.23,0.23,0.27,0.26,0.12,0.27,0.17,0.18,0.17,0.3,0.23,0.1,0.23,0.29,0.29,0.12,0.13,0.26,0.11,0.24,0.13,0.21,0.17,0.2,0.3,0.29,0.23,0.27,0.25,0.21,0.11,0.15,0.1,0.26,0.23,0.12,0.23,0.26,0.2,0.17,0.22,0.26,0.25,0.12,0.19,0.24,0.18,0.15,0.27,0.14,0.14,0.29,0.29,0.26,0.15,0.29,0.24,0.24,0.23,0.26,0.29,0.22,0.13,0.18,0.24,0.14,0.24,0.24,0.17,0.26,0.29,0.11,0.14,0.26,0.28,0.26,0.28,0.27,0.23,0.26,0.23,0.19])
bin3 = 0.25 <= redshift
bin2 = np.logical_and(0.2 <= redshift, redshift < 0.25)
bin1 = np.logical_and(0.1 <= redshift, redshift < 0.2)
plt.ion()
# Raw strings keep the LaTeX backslashes intact
labels = (r"$z < 0.2$", r"$0.2 \leq z < 0.25$", r"$z \geq 0.25$")
colors = ('r', 'g', 'b')
# One plot call per bin, so each bin contributes a single legend entry
for mask, label, co in zip((bin1, bin2, bin3), labels, colors):
    plt.plot(x[mask], y[mask], color=co, ls='none', marker='o', label=label)
plt.legend()
plt.show()
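As an alternative to writing one mask per bin by hand, np.digitize can assign a bin index to every value in one call. This is a sketch of a different route to the same masks, not part of the answer above; the short redshift array is a handful of values taken from the question's list.

```python
import numpy as np

# A few redshift values taken from the question's list.
redshift = np.array([0.12, 0.25, 0.23, 0.27, 0.17, 0.3, 0.1, 0.2])

# Interior bin edges; np.digitize returns 0 for z < 0.2,
# 1 for 0.2 <= z < 0.25 and 2 for z >= 0.25.
edges = [0.2, 0.25]
bin_index = np.digitize(redshift, edges)

# One boolean mask per bin, ready for plt.plot(x[mask], y[mask], label=...).
masks = [bin_index == k for k in range(3)]
```

This scales more easily when the number of bins changes, since only the edges list needs editing.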

Regridding regular netcdf data

I have a netcdf file containing global sea-surface temperatures. Using matplotlib and Basemap, I've managed to make a map of this data, with the following code:
from netCDF4 import Dataset
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap
filename = '/Users/Nick/Desktop/SST/SST.nc'
fh = Dataset(filename, mode='r')
lons = fh.variables['LON'][:]
lats = fh.variables['LAT'][:]
sst = fh.variables['SST'][:].squeeze()
fig = plt.figure()
m = Basemap(projection='merc', llcrnrlon=80.,llcrnrlat=-25.,urcrnrlon=150.,urcrnrlat=25.,lon_0=115., lat_0=0., resolution='l')
lon, lat = np.meshgrid(lons, lats)
xi, yi = m(lon, lat)
cs = m.pcolormesh(xi,yi,sst, vmin=18, vmax=32)
m.drawmapboundary(fill_color='0.3')
m.fillcontinents(color='0.3', lake_color='0.3')
cbar = m.colorbar(cs, location='bottom', pad="10%", ticks=[18., 20., 22., 24., 26., 28., 30., 32.])
cbar.set_label('January SST (' + u'\u00b0' + 'C)')
plt.savefig('SST.png', dpi=300)
The problem is that the data is very high resolution (9km grid) which makes the resulting image quite noisy. I would like to put the data onto a lower resolution grid (e.g. 1 degree), but I'm struggling to work out how this could be done. I followed a worked solution to try and use the matplotlib griddata function by inserting the code below into my above example, but it resulted in 'ValueError: condition must be a 1-d array'.
xi, yi = np.meshgrid(lons, lats)
X = np.arange(min(x), max(x), 1)
Y = np.arange(min(y), max(y), 1)
Xi, Yi = np.meshgrid(X, Y)
Z = griddata(xi, yi, z, Xi, Yi)
I'm a relative beginner to Python and matplotlib, so I'm not sure what I'm doing wrong (or what a better approach might be). Any advice appreciated!
If you regrid your data to a coarser lat/lon grid using e.g. bilinear interpolation, this will result in a smoother field.
The NCAR ClimateData guide has a nice introduction to regridding (general, not Python-specific).
The most powerful implementation of regridding routines available for Python is, to my knowledge, the Earth System Modeling Framework (ESMF) Python interface (ESMPy). If this is a bit too involved for your application, you should look into
EarthPy tutorials on regridding (e.g. using Pyresample, cKDTree, or Basemap).
Turning your data into an Iris cube and using Iris' regridding functions.
Perhaps start by looking at the EarthPy regridding tutorial using Basemap, since you are using it already.
The way to do this in your example would be
import numpy as np
from mpl_toolkits import basemap
from netCDF4 import Dataset
filename = '/Users/Nick/Desktop/SST/SST.nc'
with Dataset(filename, mode='r') as fh:
    lons = fh.variables['LON'][:]
    lats = fh.variables['LAT'][:]
    sst = fh.variables['SST'][:].squeeze()
# Target grid: every fourth point of the original axes
lons_sub, lats_sub = np.meshgrid(lons[::4], lats[::4])
# order=1 selects bilinear interpolation
sst_coarse = basemap.interp(sst, lons, lats, lons_sub, lats_sub, order=1)
This performs bilinear interpolation (order=1) on your SST data onto a sub-sampled grid (every fourth point). Your plot will look more coarse-grained afterwards. If you do not like that, interpolate back onto the original grid with e.g.
sst_smooth = basemap.interp(sst_coarse, lons_sub[0,:], lats_sub[:,0], *np.meshgrid(lons, lats), order=1)
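If averaging is preferred over picking every fourth point, block averaging is another way to coarsen a regular grid. The sketch below is a swapped-in technique, not what basemap.interp does, and the helper name block_mean and the small demo array are assumptions for illustration.

```python
import numpy as np


def block_mean(field, factor):
    """Average non-overlapping factor x factor blocks of a 2D array."""
    ny, nx = field.shape
    # Trim edge rows/columns that do not fill a whole block.
    ny_trim, nx_trim = ny - ny % factor, nx - nx % factor
    trimmed = field[:ny_trim, :nx_trim]
    blocks = trimmed.reshape(ny_trim // factor, factor, nx_trim // factor, factor)
    # nanmean skips NaN cells (e.g. land) inside a block.
    return np.nanmean(blocks, axis=(1, 3))


# Made-up 4 x 6 field standing in for the SST array.
sst_demo = np.arange(24, dtype=float).reshape(4, 6)
sst_coarse_demo = block_mean(sst_demo, 2)
```

Data read with netCDF4 is often a masked array; in that case it would probably need converting first, for example with np.ma.filled(sst, np.nan), so that masked cells are skipped rather than averaged in.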
I usually run my data through a Laplace filter for smoothing. Perhaps you could try the function below and see if it helps with your data. The function can be called with or without a mask (e.g. a land/ocean mask for ocean data points). Hope this helps.
# Laplace filter for 2D field with/without mask
# M = 1 on - cells used
# M = 0 off - grid cells not used
# Default is without masking
import numpy as np
def laplace_X(F,M):
    jmax, imax = F.shape
    # Add strips of land
    F2 = np.zeros((jmax, imax+2), dtype=F.dtype)
    F2[:, 1:-1] = F
    M2 = np.zeros((jmax, imax+2), dtype=M.dtype)
    M2[:, 1:-1] = M
    MS = M2[:, 2:] + M2[:, :-2]
    FS = F2[:, 2:]*M2[:, 2:] + F2[:, :-2]*M2[:, :-2]
    return np.where(M > 0.5, (1-0.25*MS)*F + 0.25*FS, F)
def laplace_Y(F,M):
    jmax, imax = F.shape
    # Add strips of land
    F2 = np.zeros((jmax+2, imax), dtype=F.dtype)
    F2[1:-1, :] = F
    M2 = np.zeros((jmax+2, imax), dtype=M.dtype)
    M2[1:-1, :] = M
    MS = M2[2:, :] + M2[:-2, :]
    FS = F2[2:, :]*M2[2:, :] + F2[:-2, :]*M2[:-2, :]
    return np.where(M > 0.5, (1-0.25*MS)*F + 0.25*FS, F)
# The mask may cause laplace_X and laplace_Y to not commute
# Take average of both directions
def laplace_filter(F, M=None):
    # "is None" avoids an elementwise comparison when M is an array
    if M is None:
        M = np.ones_like(F)
    return 0.5*(laplace_X(laplace_Y(F, M), M) +
                laplace_Y(laplace_X(F, M), M))
To answer your original question regarding scipy.interpolate.griddata, too:
Have a close look at the parameter specs for that function (e.g. in the SciPy documentation) and make sure that your input arrays have the right shapes. You might need to do something like
import numpy as np
points = np.vstack([a.flat for a in np.meshgrid(lons,lats)]).T # (n,D)
values = sst.ravel() # (n)
etc.
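A small sketch of the shapes involved, using made-up coordinate axes and a made-up field (assumptions for illustration): the points array needs one row per data point and one column per dimension, and the values array needs one entry per point in the same order.

```python
import numpy as np

# Made-up 1D coordinate axes and a matching 2D field (3 lats x 4 lons).
lons = np.array([100.0, 101.0, 102.0, 103.0])
lats = np.array([-1.0, 0.0, 1.0])
sst = np.arange(12, dtype=float).reshape(3, 4)

lon2d, lat2d = np.meshgrid(lons, lats)                    # both (3, 4)
points = np.column_stack([lon2d.ravel(), lat2d.ravel()])  # (n, D) = (12, 2)
values = sst.ravel()                                      # (n,) = (12,)
```

Because meshgrid and ravel both use row-major order here, points[k] and values[k] refer to the same grid cell.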
If you are working on Linux, you can achieve this using nctoolkit (https://nctoolkit.readthedocs.io/en/latest/).
You have not stated the latlon extent of your data, so I will assume it is a global dataset. Regridding to 1 degree resolution would require the following:
import nctoolkit as nc
filename = '/Users/Nick/Desktop/SST/SST.nc'
data = nc.open_data(filename)
data.to_latlon(lon = [-179.5, 179.5], lat = [-89.5, 89.5], res = [1,1])
# visualize the data
data.plot()
Have a look at the interpolation example in the xarray documentation: use the ds.interp method and specify the new latitude and longitude values.
http://xarray.pydata.org/en/stable/interpolation.html#example
