I have a file with 3 columns of data: Zenith (Z, from 0 to 90°) and Azimuth (A, from 0 to 360°). And radiance as the color variable.
I need to use python with matplotlib to plot this data into something resembling this:
This is my code so far (it returns an error):
import matplotlib.pyplot as plt
import numpy as np
# `data` has the following shape:
# [
# [Zenith value going from 0 to 90],
# [Azimuth values (0 to 365) increasing by 1 and looping back after 365],
# [radiance: floats that need to be mapped by the color value]
#]
data = [[6.000e+00 1.200e+01 1.700e+01 2.300e+01 2.800e+01 3.400e+01 3.900e+01
4.500e+01 5.000e+01 5.600e+01 6.200e+01 6.700e+01 7.300e+01 7.800e+01
8.400e+01 8.900e+01 3.934e+01 4.004e+01 4.054e+01 4.114e+01 4.154e+01
4.204e+01 4.254e+01 4.294e+01 4.334e+01 4.374e+01 4.414e+01 4.454e+01
4.494e+01 4.534e+01 4.564e+01 4.604e+01 4.644e+01 4.684e+01 4.714e+01
4.754e+01 4.794e+01 4.824e+01 4.864e+01 4.904e+01 4.944e+01 4.984e+01
5.014e+01 5.054e+01 5.094e+01 5.134e+01 5.174e+01 5.214e+01 5.264e+01
5.304e+01 5.344e+01 5.394e+01 5.444e+01 5.494e+01 5.544e+01 5.604e+01
5.674e+01 5.764e+01]
[1.960e+02 3.600e+01 2.360e+02 7.600e+01 2.760e+02 1.160e+02 3.160e+02
1.560e+02 3.560e+02 1.960e+02 3.600e+01 2.360e+02 7.600e+01 2.760e+02
1.160e+02 3.160e+02 6.500e+00 3.400e+00 3.588e+02 2.500e+00 3.594e+02
3.509e+02 5.000e-01 6.900e+00 1.090e+01 3.478e+02 1.250e+01 1.050e+01
7.300e+00 2.700e+00 3.571e+02 3.507e+02 1.060e+01 3.200e+00 3.556e+02
3.480e+02 7.300e+00 3.597e+02 3.527e+02 1.260e+01 6.600e+00 1.200e+00
3.570e+02 3.538e+02 3.520e+02 3.516e+02 3.528e+02 3.560e+02 1.200e+00
8.800e+00 3.567e+02 1.030e+01 6.800e+00 8.300e+00 3.583e+02 3.581e+02
3.568e+02 3.589e+02]
[3.580e-04 6.100e-04 3.220e-04 4.850e-04 4.360e-04 2.910e-04 1.120e-03
2.320e-04 4.300e-03 2.680e-04 1.700e-03 3.790e-04 7.460e-04 8.190e-04
1.030e-03 3.650e-03 3.050e-03 3.240e-03 3.340e-03 3.410e-03 3.490e-03
3.290e-03 3.630e-03 3.510e-03 3.320e-03 3.270e-03 3.280e-03 3.470e-03
3.720e-03 3.960e-03 3.980e-03 3.700e-03 3.630e-03 4.100e-03 4.080e-03
3.600e-03 3.990e-03 4.530e-03 4.040e-03 3.630e-03 4.130e-03 4.370e-03
4.340e-03 4.210e-03 4.100e-03 4.090e-03 4.190e-03 4.380e-03 4.460e-03
4.080e-03 4.420e-03 3.960e-03 4.230e-03 4.120e-03 4.440e-03 4.420e-03
4.370e-03 4.380e-03]]
rad = data[0]
azm = data[1]
# From what I understand, I need to create a meshgrid from the zenith and azimuth values
r, th = np.meshgrid(rad, azm)
z = data[2] # This doesn't work as `pcolormesh` expects this to be a 2d array
plt.subplot(projection="polar")
plt.pcolormesh(th, r, z, shading="auto")
plt.plot(azm, r, color="k", ls="none")
plt.show()
Note: my actual data goes on for 56k lines and looks like this (Ignore the 4th column):
The example data above is my attempt to reduce the resolution of this massive file, so I only used 1/500 of the lines of data. This might be the wrong way to reduce the resolution, please correct me if it is!
Every tutorial I've seen generate the z value from the r array generated by meshgrid. This is leaving me confused about how I would convert my z column into a 2d array that would properly map to the zenith and azimuth values.
They'll use something like this:
z = (r ** 2.0) / 4.0
So, taking the exact shape of r and applying a transformation to create the color.
The solution was in the data file all along. I needed to better understand what np.meshrid actually did. Turns out the data already is a 2d array, it just needed to be reshaped. I also found a flaw in the file, fixing it reduced its lines from 56k to 15k. This was small enough that I did not need to reduce the resolution.
Here's how I reshaped my data, and what the solution looked like:
import matplotlib.pyplot as plt
import numpy as np
with open("data.txt") as f:
lines = np.array(
[
[float(n) for n in line.split("\t")]
for i, line in enumerate(f.read().splitlines())
]
)
data = [np.reshape(a, (89, 180)) for a in lines.T]
rad = np.radians(data[1])
azm = data[0]
z = data[2]
plt.subplot(projection="polar")
plt.pcolormesh(rad, azm, z, cmap="coolwarm", shading="auto")
plt.colorbar()
plt.show()
The simplest way to plot the given data is with a polar scatter plot.
Using blue for low values and red for high values, it could look like:
import matplotlib.pyplot as plt
import numpy as np
data = [[6.000e+00, 1.200e+01, 1.700e+01, 2.300e+01, 2.800e+01, 3.400e+01, 3.900e+01, 4.500e+01, 5.000e+01, 5.600e+01, 6.200e+01, 6.700e+01, 7.300e+01, 7.800e+01, 8.400e+01, 8.900e+01, 3.934e+01, 4.004e+01, 4.054e+01, 4.114e+01, 4.154e+01, 4.204e+01, 4.254e+01, 4.294e+01, 4.334e+01, 4.374e+01, 4.414e+01, 4.454e+01, 4.494e+01, 4.534e+01, 4.564e+01, 4.604e+01, 4.644e+01, 4.684e+01, 4.714e+01, 4.754e+01, 4.794e+01, 4.824e+01, 4.864e+01, 4.904e+01, 4.944e+01, 4.984e+01, 5.014e+01, 5.054e+01, 5.094e+01, 5.134e+01, 5.174e+01, 5.214e+01, 5.264e+01, 5.304e+01, 5.344e+01, 5.394e+01, 5.444e+01, 5.494e+01, 5.544e+01, 5.604e+01, 5.674e+01, 5.764e+01],
[1.960e+02, 3.600e+01, 2.360e+02, 7.600e+01, 2.760e+02, 1.160e+02, 3.160e+02, 1.560e+02, 3.560e+02, 1.960e+02, 3.600e+01, 2.360e+02, 7.600e+01, 2.760e+02, 1.160e+02, 3.160e+02, 6.500e+00, 3.400e+00, 3.588e+02, 2.500e+00, 3.594e+02, 3.509e+02, 5.000e-01, 6.900e+00, 1.090e+01, 3.478e+02, 1.250e+01, 1.050e+01, 7.300e+00, 2.700e+00, 3.571e+02, 3.507e+02, 1.060e+01, 3.200e+00, 3.556e+02, 3.480e+02, 7.300e+00, 3.597e+02, 3.527e+02, 1.260e+01, 6.600e+00, 1.200e+00, 3.570e+02, 3.538e+02, 3.520e+02, 3.516e+02, 3.528e+02, 3.560e+02, 1.200e+00, 8.800e+00, 3.567e+02, 1.030e+01, 6.800e+00, 8.300e+00, 3.583e+02, 3.581e+02, 3.568e+02, 3.589e+02],
[3.580e-04, 6.100e-04, 3.220e-04, 4.850e-04, 4.360e-04, 2.910e-04, 1.120e-03, 2.320e-04, 4.300e-03, 2.680e-04, 1.700e-03, 3.790e-04, 7.460e-04, 8.190e-04, 1.030e-03, 3.650e-03, 3.050e-03, 3.240e-03, 3.340e-03, 3.410e-03, 3.490e-03, 3.290e-03, 3.630e-03, 3.510e-03, 3.320e-03, 3.270e-03, 3.280e-03, 3.470e-03, 3.720e-03, 3.960e-03, 3.980e-03, 3.700e-03, 3.630e-03, 4.100e-03, 4.080e-03, 3.600e-03, 3.990e-03, 4.530e-03, 4.040e-03, 3.630e-03, 4.130e-03, 4.370e-03, 4.340e-03, 4.210e-03, 4.100e-03, 4.090e-03, 4.190e-03, 4.380e-03, 4.460e-03, 4.080e-03, 4.420e-03, 3.960e-03, 4.230e-03, 4.120e-03, 4.440e-03, 4.420e-03, 4.370e-03, 4.380e-03]]
rad = np.radians(data[1])
azm = data[0]
z = data[2]
plt.subplot(projection="polar")
plt.scatter(rad, azm, c=z, cmap='coolwarm')
plt.colorbar()
plt.show()
Creating such a scatter plot with your real data gives an idea how it looks like. You might want to choose a different colormap, depending on what you want to convey. You also can choose a smaller dot size (for example plt.scatter(rad, azm, c=z, cmap='plasma', s=1, ec='none')) if there would be too many points.
A simple way to create a filled image from non-gridded data uses tricontourf with 256 colors (it looks quite dull with the given data, so I didn't add an example plot):
plt.subplot(projection="polar")
plt.tricontourf(rad, azm, z, levels=256, cmap='coolwarm')
I have a script for plotting astronomical data of redmapping clusters using a csv file. I could get the data points in it and want to plot them using different colors depending on their redshift values: I am binning the dataset into 3 bins (0.1-0.2, 0.2-0.25, 0.25,0.31) based on the redshift.
The problem arises with my code after I distinguish to what bin the datapoint belongs: I want to have 3 labels in the legend corresponding to red, green and blue data points, but this is not happening and I don't know why. I am using plot() instead of scatter() as I also had to do the best fit from the data in the same figure. So everything needs to be in 1 figure.
import numpy as np
import matplotlib.pyplot as py
import csv
z = open("Sheet4CSV.csv","rU")
data = csv.reader(z)
x = []
y = []
ylow = []
yupp = []
xlow = []
xupp = []
redshift = []
for r in data:
x.append(float(r[2]))
y.append(float(r[5]))
xlow.append(float(r[3]))
xupp.append(float(r[4]))
ylow.append(float(r[6]))
yupp.append(float(r[7]))
redshift.append(float(r[1]))
from operator import sub
xerr_l = map(sub,x,xlow)
xerr_u = map(sub,xupp,x)
yerr_l = map(sub,y,ylow)
yerr_u = map(sub,yupp,y)
py.xlabel("$Original\ Tx\ XCS\ pipeline\ Tx\ keV$")
py.ylabel("$Iterative\ Tx\ pipeline\ keV$")
py.xlim(0,12)
py.ylim(0,12)
py.title("Redmapper Clusters comparison of Tx pipelines")
ax1 = py.subplot(111)
##Problem starts here after the previous line##
for p in redshift:
for i in xrange(84):
p=redshift[i]
if 0.1<=p<0.2:
ax1.plot(x[i],y[i],color="b", marker='.', linestyle = " ")#, label = "$z < 0.2$")
exit
if 0.2<=p<0.25:
ax1.plot(x[i],y[i],color="g", marker='.', linestyle = " ")#, label="$0.2 \leq z < 0.25$")
exit
if 0.25<=p<=0.3:
ax1.plot(x[i],y[i],color="r", marker='.', linestyle = " ")#, label="$z \geq 0.25$")
exit
##There seems nothing wrong after this point##
py.errorbar(x,y,yerr=[yerr_l,yerr_u],xerr=[xerr_l,xerr_u], fmt= " ",ecolor='magenta', label="Error bars")
cof = np.polyfit(x,y,1)
p = np.poly1d(cof)
l = np.linspace(0,12,100)
py.plot(l,p(l),"black",label="Best fit")
py.plot([0,15],[0,15],"black", linestyle="dotted", linewidth=2.0, label="line $y=x$")
py.grid()
box = ax1.get_position()
ax1.set_position([box.x1,box.y1,box.width, box.height])
py.legend(loc='center left',bbox_to_anchor=(1,0.5))
py.show()
In the 1st 'for' loop, I have indexed every value 'p' in the list 'redshift' so that bins can be created using 'if' statement. But if I add the labels that are hashed out against each py.plot() inside the 'if' statements, each data point 'i' that gets plotted in the figure as an intersection of (x[i],y[i]) takes the label and my entire legend attains in total 87 labels (including the 3 mentioned in the code at other places)!!!!!!
I essentially need 1 label for each bin...
Please tell me what needs to done after the bins are created and py.plot() commands used...Thanks in advance :-)
Sorry I cannot post my image here due to low reputation!
The data 'appended' for x, y and redshift lists from the csv file are as follows:
x=[5.031,10.599,10.589,8.548,9.089,8.675,3.588,1.244,3.023,8.632,8.953,7.603,7.513,2.917,7.344,7.106,3.889,7.287,3.367,6.839,2.801,2.316,1.328,6.31,6.19,6.329,6.025,5.629,6.123,5.892,5.438,4.398,4.542,4.624,4.501,4.504,5.033,5.068,4.197,2.854,4.784,2.158,4.054,3.124,3.961,4.42,3.853,3.658,1.858,4.537,2.072,3.573,3.041,5.837,3.652,3.209,2.742,2.732,1.312,3.635,2.69,3.32,2.488,2.996,2.269,1.701,3.935,2.015,0.798,2.212,1.672,1.925,3.21,1.979,1.794,2.624,2.027,3.66,1.073,1.007,1.57,0.854,0.619,0.547]
y=[5.255,10.897,11.045,9.125,9.387,17.719,4.025,1.389,4.152,8.703,9.051,8.02,7.774,3.139,7.543,7.224,4.155,7.416,3.905,6.868,2.909,2.658,1.651,6.454,6.252,6.541,6.152,5.647,6.285,6.079,5.489,4.541,4.634,8.851,4.554,4.555,5.559,5.144,5.311,5.839,5.364,3.18,4.352,3.379,4.059,4.575,3.914,5.736,2.304,4.68,3.187,3.756,3.419,9.118,4.595,3.346,3.603,6.313,1.816,4.34,2.732,4.978,2.719,3.761,2.623,2.1,4.956,2.316,4.231,2.831,1.954,2.248,6.573,2.276,2.627,3.85,3.545,25.405,3.996,1.347,1.679,1.435,0.759,0.677]
redshift = [0.12,0.25,0.23,0.23,0.27,0.26,0.12,0.27,0.17,0.18,0.17,0.3,0.23,0.1,0.23,0.29,0.29,0.12,0.13,0.26,0.11,0.24,0.13,0.21,0.17,0.2,0.3,0.29,0.23,0.27,0.25,0.21,0.11,0.15,0.1,0.26,0.23,0.12,0.23,0.26,0.2,0.17,0.22,0.26,0.25,0.12,0.19,0.24,0.18,0.15,0.27,0.14,0.14,0.29,0.29,0.26,0.15,0.29,0.24,0.24,0.23,0.26,0.29,0.22,0.13,0.18,0.24,0.14,0.24,0.24,0.17,0.26,0.29,0.11,0.14,0.26,0.28,0.26,0.28,0.27,0.23,0.26,0.23,0.19]
Working with numerical data like this, you should really consider using a numerical library, like numpy.
The problem in your code arises from processing each record (a coordinate (x,y) and the corresponding value redshift) one at a time. You are calling plot for each point, thereby creating legends for each of those 84 datapoints. You should consider your "bins" as groups of data that belong to the same dataset and process them as such. You could use "logical masks" to distinguish between your "bins", as shown below.
It's also not clear why you call exit after each plotting action.
import numpy as np
import matplotlib.pyplot as plt
x = np.array([5.031,10.599,10.589,8.548,9.089,8.675,3.588,1.244,3.023,8.632,8.953,7.603,7.513,2.917,7.344,7.106,3.889,7.287,3.367,6.839,2.801,2.316,1.328,6.31,6.19,6.329,6.025,5.629,6.123,5.892,5.438,4.398,4.542,4.624,4.501,4.504,5.033,5.068,4.197,2.854,4.784,2.158,4.054,3.124,3.961,4.42,3.853,3.658,1.858,4.537,2.072,3.573,3.041,5.837,3.652,3.209,2.742,2.732,1.312,3.635,2.69,3.32,2.488,2.996,2.269,1.701,3.935,2.015,0.798,2.212,1.672,1.925,3.21,1.979,1.794,2.624,2.027,3.66,1.073,1.007,1.57,0.854,0.619,0.547])
y = np.array([5.255,10.897,11.045,9.125,9.387,17.719,4.025,1.389,4.152,8.703,9.051,8.02,7.774,3.139,7.543,7.224,4.155,7.416,3.905,6.868,2.909,2.658,1.651,6.454,6.252,6.541,6.152,5.647,6.285,6.079,5.489,4.541,4.634,8.851,4.554,4.555,5.559,5.144,5.311,5.839,5.364,3.18,4.352,3.379,4.059,4.575,3.914,5.736,2.304,4.68,3.187,3.756,3.419,9.118,4.595,3.346,3.603,6.313,1.816,4.34,2.732,4.978,2.719,3.761,2.623,2.1,4.956,2.316,4.231,2.831,1.954,2.248,6.573,2.276,2.627,3.85,3.545,25.405,3.996,1.347,1.679,1.435,0.759,0.677])
redshift = np.array([0.12,0.25,0.23,0.23,0.27,0.26,0.12,0.27,0.17,0.18,0.17,0.3,0.23,0.1,0.23,0.29,0.29,0.12,0.13,0.26,0.11,0.24,0.13,0.21,0.17,0.2,0.3,0.29,0.23,0.27,0.25,0.21,0.11,0.15,0.1,0.26,0.23,0.12,0.23,0.26,0.2,0.17,0.22,0.26,0.25,0.12,0.19,0.24,0.18,0.15,0.27,0.14,0.14,0.29,0.29,0.26,0.15,0.29,0.24,0.24,0.23,0.26,0.29,0.22,0.13,0.18,0.24,0.14,0.24,0.24,0.17,0.26,0.29,0.11,0.14,0.26,0.28,0.26,0.28,0.27,0.23,0.26,0.23,0.19])
bin3 = 0.25 <= redshift
bin2 = np.logical_and(0.2 <= redshift, redshift < 0.25)
bin1 = np.logical_and(0.1 <= redshift, redshift < 0.2)
plt.ion()
labels = ("$z < 0.2$", "$0.2 \leq z < 0.25$", "$z \geq 0.25$")
colors = ('r', 'g', 'b')
for bin, label, co in zip( (bin1, bin2, bin3), labels, colors):
plt.plot(x[bin], y[bin], color=co, ls='none', marker='o', label=label)
plt.legend()
plt.show()