I am trying to create a graph, which works fine until I add a third column of data to my .csv file.
I am taking pressure-area isotherms, and I have been tasked with making a pressure-area graph, which I achieved (woot!):
import matplotlib.pyplot as plt
import numpy as np
x, y = np.loadtxt("Example.csv", delimiter=',', unpack=True)
plt.plot(x,y)
plt.xlabel('Area-mm^2')
plt.ylabel('Pressure mN/m')
plt.title('Pressure-Area Isotherm\nKibron')
plt.legend()
plt.show()
This is what I got. What I need to do now is also put the average pixel value of some photos I took into the graph, so that I can show the inverse relation between area and pressure/light intensity.
My .csv (Excel) file has three columns. If it is not possible to plot all of them at the same time, could someone show me a way to pick only two of the three columns to put on the graph, i.e. pressure/area, pressure/pixel values, or area/pixel values? I assume it would involve assigning each column a number (n) and having pyplot graph "n" vs "n".
Edit: I would also like there to be a second scale so that the overall graph doesn't look wonky. Again, thanks for the help!
The columns are, in order: area, then pressure, then average pixel value.
You can use zip and create overlaying plots:
import csv
import matplotlib.pyplot as plt
with open('filename.csv') as f:
    headers = iter(['area', 'pressure', 'pixel'])
    # zip(*rows) transposes the rows into columns; `_, *b` drops the first cell (header row) of each column
    data = {next(headers): list(map(float, b)) for _, *b in zip(*csv.reader(f))}
# Each label reads 'x-data/y-data'
labels = ['pressure/area', 'pressure/pixel', 'area/pixel']
for i in labels:
    num, denom = i.split('/')
    plt.plot(data[num], data[denom], label=i)
plt.legend(loc='upper left')
plt.show()
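For the second scale asked about in the question's edit, one option is matplotlib's `twinx()`, which adds a second y-axis sharing the same x-axis. The sketch below is only an illustration: the sample values and the output filename are made up, and it assumes the file has no header row and the column order area, pressure, pixel value. `usecols` is also one way to read only two of the three columns.

```python
import io

import matplotlib.pyplot as plt
import numpy as np

# Hypothetical stand-in for the three-column file (area, pressure, pixel value).
sample_csv = io.StringIO(
    "100,5.0,40\n"
    "90,8.0,55\n"
    "80,12.0,75\n"
    "70,18.0,100\n"
)

# unpack=True returns one array per column; usecols selects which columns to read.
# Passing usecols=(0, 2) instead would read only area and pixel value.
area, pressure, pixel = np.loadtxt(
    sample_csv, delimiter=",", usecols=(0, 1, 2), unpack=True
)

fig, ax_pressure = plt.subplots()
line_p, = ax_pressure.plot(area, pressure, color="tab:blue", label="Pressure")
ax_pressure.set_xlabel("Area (mm^2)")
ax_pressure.set_ylabel("Pressure (mN/m)")

# twinx() creates a second y-axis that shares the x-axis of the first.
ax_pixel = ax_pressure.twinx()
line_px, = ax_pixel.plot(area, pixel, color="tab:red", label="Average pixel value")
ax_pixel.set_ylabel("Average pixel value")

# Pass the handles from both axes so that one legend lists both lines.
ax_pressure.legend(handles=[line_p, line_px], loc="upper center")
ax_pressure.set_title("Pressure-Area Isotherm")
fig.savefig("isotherm_twin_axes.png")  # or plt.show() for an interactive window
```

Each axis keeps its own scale, so a pressure range of a few mN/m and a pixel range of tens to hundreds do not squash each other.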
I am new to Python and trying to plot a color-magnitude diagram (CMD) for a selected cluster with matplotlib. There are 3,400,000 stars to plot; for each star, the color goes on the x-axis and the magnitude on the y-axis. My code reads two columns from a csv file and plots them. The problem is that when I use part of the data (3000 stars), I can plot a CMD successfully, but when I use all the data, the plot is a mess (see figure below), and it seems that points are plotted by their position in the column instead of by their value. For example, a point with data (0.92, 20.64) should be close to the y-axis, but is actually located at the far right of the plot just because it sits near the end of the dataset. How can I plot the entire dataset and get a plot like the first figure? Thanks for your time. This is my code:
import matplotlib.pyplot as plt
import pandas as pd
import csv
data = pd.read_csv(r'C:\Users\Peter\Desktop\F275W test.csv', low_memory=False)
# Select the color (x) and magnitude (y) columns
x = data['F275W-F336W']
y = data['F275W']
#remove the axis
plt.axis('off')
plt.plot(x,y, ',')
plt.show()
This is the plot I got for 3000 stars; it is a CMD.
This is the plot I got for the entire dataset, which is very messy.
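The data file is not shown, so the cause here is only a guess. One thing that can produce this symptom is a column being read as text (for example because of a stray non-numeric entry somewhere in the full file), in which case matplotlib treats every value as a category and places points in order of appearance. A minimal sketch under that assumption, using the column names from the question and made-up sample values, converts both columns to numbers before plotting:

```python
import io

import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample; the "bad" entry stands in for whatever non-numeric
# value might be hiding in the full file.
sample_csv = io.StringIO(
    "F275W-F336W,F275W\n"
    "0.92,20.64\n"
    "1.10,19.80\n"
    "bad,21.05\n"
    "0.75,22.30\n"
)
data = pd.read_csv(sample_csv)

# One non-numeric cell makes pandas read the whole column as text.
# errors="coerce" turns anything unparseable into NaN instead of raising.
x = pd.to_numeric(data["F275W-F336W"], errors="coerce")
y = pd.to_numeric(data["F275W"], errors="coerce")

# Keep only the rows where both values are real numbers.
valid = x.notna() & y.notna()

fig, ax = plt.subplots()
ax.plot(x[valid], y[valid], ",")
fig.savefig("cmd_numeric.png")  # or plt.show() for an interactive window
```

Printing `data.dtypes` on the real file is a quick way to check whether this assumption applies at all.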
I have a dataframe that consists of a bunch of x,y data that I'd like to see in scatter form along with a line. The dataframe consists of data with its form repeated over multiple categories. The end result I'd like to see is some kind of grid of the plots, but I'm not totally sure how matplotlib handles multiple subplots of overplotted data.
Here's an example of the kind of data I'm working with:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
category = np.arange(1,10)
frames = []
for i in category:
    x = np.arange(0,100)
    y = 2*x + 10
    data = np.random.normal(0,1,100) * y
    frames.append(pd.DataFrame({'x':x, 'y':y, 'data':data, 'category':i}))
# DataFrame.append was removed in pandas 2.0; pd.concat builds the same frame
total_data = pd.concat(frames)
We have x data, we have y data which is a linear model of some kind of generated dataset (the data variable).
I had been able to generate individual plots based on subsetting the master dataset, but I'd like to see them all side-by-side in a 3x3 grid in this case. However, calling the plots within the loop just overplots them all onto one single image.
Is there a good way to take the following code block and make a grid out of the category subsets? Am I overcomplicating it by doing the subset within the plot call?
plt.scatter(total_data['x'][total_data['category']==1], total_data['data'][total_data['category']==1])
plt.plot(total_data['x'][total_data['category']==1], total_data['y'][total_data['category']==1], linewidth=4, color='black')
If there's a simpler way to generate the by-category scatter plus line, I'm all for it. I don't know if seaborn has a similar or more intuitive method to use than pyplot.
You can use either sns.FacetGrid or manual plt.plot. For example:
import seaborn as sns
g = sns.FacetGrid(data=total_data, col='category', col_wrap=3)
g = g.map(plt.scatter, 'x','data')
g = g.map(plt.plot,'x','y', color='k');
Gives:
Or manual plt with groupby:
fig, axes = plt.subplots(3,3)
for (cat, data), ax in zip(total_data.groupby('category'), axes.ravel()):
    ax.scatter(data['x'], data['data'])
    ax.plot(data['x'], data['y'], color='k')
gives:
I remember seeing in a blog post a nice technique for visualizing geographical data. It was just lines representing latitude, with the height of each line showing the variable of interest. I tried to sketch it in the following picture:
Does any of you remember the library, or even the blog post, that explained how to generate these maps?
(I vaguely remember it being matplotlib and Python, but I could very well be wrong.)
I think this is the kind of thing you want - plotting lines of constant latitude on a 3d axis. I've explained what each section does in comments
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # registers the '3d' projection on older matplotlib versions
import numpy as np
# read in data from csv organised in columns labelled 'lat','lon','elevation'
# (np.recfromcsv is no longer in the main NumPy namespace as of NumPy 2.0;
# genfromtxt with names=True reads the header row into field names)
data = np.genfromtxt('elevation-sample.csv', delimiter=',', names=True)
# create a 3d axis on a figure
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
# Find unique (i.e. constant) latitude points
id_list = np.unique(data['lat'])
# stride is how many lines to miss. set to 1 to get every line
# higher to miss more
stride = 5
# Extract each line from the dataset and plot it on the axes
for lat_value in id_list[::stride]:
    this_line_data = data[data['lat'] == lat_value]
    # select the fields by name so the column order in the file does not matter
    ax.plot(this_line_data['lon'], this_line_data['lat'],
            this_line_data['elevation'], color='black')
# set the viewpoint so we're looking straight at the longitude (x) axis
ax.view_init(elev=45., azim=90)
ax.set_xlabel('Longitude')
ax.set_ylabel('Latitude')
ax.set_zlabel('Elevation')
ax.set_zlim([0,1500])
plt.show()
The data set I used to test is not mine, but I found it on github here.
This gives output as follows:
Note - you can swap latitude and longitude if I've misinterpreted the axis labels in your sketch.
Are you thinking a 3D plot similar to this? Possibly you could also do a cascade plot like this? The code for the last type of plot is something like this:
import matplotlib.pyplot as plt
import numpy as np
# Input parameters:
padding = 1 # Relative distance between plots
ax = plt.gca() # Matplotlib axes to plot in
spectra = np.random.rand(10, 100) # Series of Y-data (rand takes the dimensions as separate arguments)
x_data = np.arange(len(spectra[0])) # X-data
# Figure out distance between plots:
max_value = 0
for spectrum in spectra:
    spectrum_yrange = (np.nanmax(spectrum) -
                       np.nanmin(spectrum))
    if spectrum_yrange > max_value:
        max_value = spectrum_yrange
# Plot the individual lines
for i, spectrum in enumerate(spectra):
    # Normalize the data to max_value
    data = (spectrum - spectrum.min()) / float(max_value)
    # Offset the individual lines
    data += i * padding
    ax.plot(x_data, data)
plt.show()
I'm starting out with plotting in Python using the very nice pyplot. I aim to show the evolution of two series of data over time. Instead of a conventional plot of the data as a function of time, I'd like a scatter plot (data1, data2) where the time component is shown as a color gradient.
In my two-column file, the time would be given by the line number, either written as a 3rd column in the file or obtained through pyplot's own ability to get the line number.
Can anyone help me do that?
Thanks a lot.
Nicolas
When plotting using matplotlib.pyplot.scatter you can pass a third array via the keyword argument c. This array determines the colors of your scatter points. You then also pick an appropriate colormap from matplotlib.cm and assign it with the cmap keyword argument.
This toy example creates two datasets, data1 and data2. It then also creates an array colors, an array of values evenly spaced between 0 and 1, with the same length as data1 and data2. It doesn't need to know the "line number"; it just needs to know the total number of data points, and then spaces the colors evenly.
I've also added a colorbar. You can remove this by removing the plt.colorbar() line.
import matplotlib.pyplot as plt
from matplotlib import cm
import numpy as np
N = 500
data1 = np.random.randn(N)
data2 = np.random.randn(N)
colors = np.linspace(0,1,N)
plt.scatter(data1, data2, c=colors, cmap=cm.Blues)
plt.colorbar()
plt.show()
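To tie this back to the two-column file in the question, the line number itself can serve as the color array. The sketch below assumes a whitespace-separated file with one sample per line; the sample values, the colormap choice and the output filename are made up for illustration.

```python
import io

import matplotlib.pyplot as plt
import numpy as np

# Hypothetical stand-in for the two-column file; each line is one time step.
sample_file = io.StringIO(
    "0.0 1.0\n"
    "0.5 0.8\n"
    "1.0 0.5\n"
    "1.5 0.1\n"
)
data1, data2 = np.loadtxt(sample_file, unpack=True)

# The row index plays the role of time: 0 for the first line, 1 for the second, ...
time_index = np.arange(len(data1))

fig, ax = plt.subplots()
points = ax.scatter(data1, data2, c=time_index, cmap="viridis")
# The colorbar maps each color back to its line number.
fig.colorbar(points, ax=ax, label="Line number (time)")
fig.savefig("time_coloured_scatter.png")  # or plt.show() for an interactive window
```

If the file already carries the time in a third column, that column can be passed as `c` instead of `time_index`.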
I am trying to plot multiple lines in a 3D plot using matplotlib. I have 6 data sets with x and y values. What I've tried so far is to give each point in a data set a z-value, so all points in data set 1 have z=1, all points in data set 2 have z=2, and so on.
Then I exported them into three files. "X.txt" containing all x-values, "Y.txt" containing all y-values, same for "Z.txt".
Here's the code so far:
#!/usr/bin/python
from mpl_toolkits.mplot3d import axes3d
import matplotlib.pyplot as plt
import numpy as np
import pylab
xdata = '/X.txt'
ydata = '/Y.txt'
zdata = '/Z.txt'
X = np.loadtxt(xdata)
Y = np.loadtxt(ydata)
Z = np.loadtxt(zdata)
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.plot_wireframe(X,Y,Z)
plt.show()
What I get looks pretty close to what I need, but with wireframe the first and last points of each data set are connected. How can I change the colour of the line for each data set, and how can I remove the connecting lines between the data sets?
Is there a better plotting style than wireframe?
Load the data sets individually, and then plot each one individually.
I don't know what formats you have, but you want something like this
from mpl_toolkits.mplot3d.axes3d import Axes3D
import matplotlib.pyplot as plt
fig, ax = plt.subplots(subplot_kw={'projection': '3d'})
datasets = [{"x":[1,2,3], "y":[1,4,9], "z":[0,0,0], "colour": "red"} for _ in range(6)]
for dataset in datasets:
    ax.plot(dataset["x"], dataset["y"], dataset["z"], color=dataset["colour"])
plt.show()
Each time you call plot (or plot_wireframe, but I don't know why you'd need that) on an axes object, it adds the data as a new series. If you leave out the color argument, matplotlib will choose colours for you, but it's not too smart: after you add too many series it loops around and starts reusing the same colours.
N.B. I haven't tested this and can't remember if color is the correct argument. Pretty sure it is, though.
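For reference, color is indeed a valid keyword for plot on 3D axes. If distinct colours are needed for more series than the default cycle provides (ten entries in current matplotlib), one option is to sample them from a colormap. The sketch below uses made-up curves, with data set k placed at z = k as in the question; the choice of viridis is arbitrary.

```python
import matplotlib.pyplot as plt
import numpy as np

n_datasets = 6
x = np.linspace(0, 1, 20)

# Made-up data: each data set is an (x, y, z) triple with a constant z-value.
datasets = [
    (x, (k + 1) * x**2, np.full_like(x, k + 1))
    for k in range(n_datasets)
]

# Sample one RGBA colour per data set, evenly spaced across the colormap.
colours = plt.cm.viridis(np.linspace(0, 1, n_datasets))

fig, ax = plt.subplots(subplot_kw={"projection": "3d"})
for k, ((xs, ys, zs), colour) in enumerate(zip(datasets, colours)):
    # One plot call per data set, so no line joins one data set to the next.
    ax.plot(xs, ys, zs, color=colour, label=f"data set {k + 1}")
ax.legend()
fig.savefig("lines_3d.png")  # or plt.show() for an interactive window
```

Because every data set gets its own plot call, the connecting lines seen with plot_wireframe do not appear.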