Matplotlib 2.02 plotting within a for loop - python

I am having trouble with two things on a plot I am generating within a for loop. My code loads some data, fits it to a function using curve_fit, and then plots the measured data and the fit on the same axes for 5 different sets of measured y values. The measured data is represented by empty circle markers and the fit by a solid line in the same color as the markers.
First, I cannot reduce the linewidth of the fit (solid line): however small I make the linewidth value, the lines never get thinner than what is shown in the output below, although I can make them thicker. Second, I would like the legend to display only circle markers, not circles with lines through them. I cannot get this to work. Any ideas?
Here is my code; the output plot and data file are attached via a Google Drive share link (for some reason long lines of text are being cut off in this post):
import scipy
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
# define Vogel-Fulcher-Tammann (VFT) function
def vft(x, sigma_0, temp_vf, D):
    return np.log(sigma_0) - ((D * temp_vf) / (x - temp_vf))
# load and sort data
data = np.genfromtxt('data file', skip_header=3)
temp = data[:, 0]
inverse_temp = data[:, 1]
dc_conduct = np.log10(data[:, 2:11])
only_adam = dc_conduct[:, 4:9]
colors = ['b', 'g', 'r', 'c', 'm']
labels = ['50mg 2-adam', '300mg 2-adam', '100 mg 2-adam', '150 mg 2-adam', '250mg 2-adam']
for i in range(0, len(only_adam)):
    # fit VFT function
    y = only_adam[:, i]
    popt, pcov = curve_fit(vft, temp, y)
    # plotting
    plt.plot(inverse_temp, y, color=colors[i], marker='o', markerfacecolor='none',
             label=labels[i])
    plt.plot(inverse_temp, vft(temp, *popt), linewidth=0.00001, linestyle='-',
             color=colors[i])
plt.ylabel("Ionic Conductivity [Scm**2/mol]")
plt.xlabel("1000 / [T(K)]")
plt.axis('tight')
plt.legend(loc='lower left')

You are looping over the rows of only_adam, but you index the columns of that array with the loop variable i. This does not make sense and leads to the error shown.
The plot that shows the data points also draws lines between them, and those are the lines you see (in the plot and in the legend). You cannot make them thinner by decreasing the linewidth of the other plot. Instead, turn off the linestyle of the data plot, e.g. plot(..., ls="").
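As a rough illustration of both points, here is a minimal sketch on synthetic data. The arrays, labels, and the straight-line fit standing in for curve_fit are all assumptions for the sake of a runnable example; they are not the asker's data or model.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in for the measured data: one series per column (assumption)
x = np.linspace(1.0, 2.0, 20)
measured = np.column_stack([k * x + 0.05 * np.sin(40 * x) for k in (1, 2, 3)])
colors = ["b", "g", "r"]
labels = ["series A", "series B", "series C"]  # hypothetical labels

fig, ax = plt.subplots()
for i in range(measured.shape[1]):  # loop over columns, matching the column index
    y = measured[:, i]
    # straight-line fit used here only as a placeholder for curve_fit
    fit = np.polyval(np.polyfit(x, y, 1), x)
    # data: hollow circles with no connecting line, so the legend shows markers only
    ax.plot(x, y, ls="", marker="o", markerfacecolor="none",
            color=colors[i], label=labels[i])
    # fit: thin solid line, left unlabeled so it stays out of the legend
    ax.plot(x, fit, ls="-", linewidth=0.5, color=colors[i])
legend = ax.legend(loc="lower left")
```

To view the figure interactively, drop the matplotlib.use("Agg") line and call plt.show() at the end.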

Related

Plots not visible when using a line plot

I am new to Python and I am trying to plot x against y (both contain a large number of data points), but when I use plt.plot no plot is visible in the output.
The code I have been using is
for i in range(len(a)):
    plt.plot(a[i],b[i])
plt.figure()
plt.show()
When I tried a scatter plot:
for i in range(len(a)):
    plt.scatter(a[i],b[i])
plt.figure()
plt.show()
I do not understand why the line plot is missing. Even when I try seaborn, it gives me the error ValueError: If using all scalar values, you must pass an index
import numpy as np
import matplotlib.pyplot as plt
a = np.linspace(0,5,100)
b = np.linspace(0,10,100)
plt.plot(a,b)
plt.show()
I think this answers your question. I have used sample values for a and b. Matplotlib line plots do not need to be drawn in a loop.
A line is created between two points. If you are plotting single values, a line can't be constructed.
Well, you might say "but I am plotting many points," which already contains part of the answer (points). Actually, matplotlib.pyplot.plot() plots line objects. So every time you call plot, it creates a new one (no matter whether you are calling it on the same axes or on new ones). The reason why you don't get lines is that only single points are plotted. The reason why you're not even seeing these points is that plot() does not show the points with markers by default. If you add marker='o' to plot(), you will end up with the same figure as with scatter.
A scatter plot, on the other hand, is an unordered collection of points. Their characteristic is that there are no lines between these points, because they are usually not a sequence. And because there are no lines between them, you can plot them all at once. By default they all have the same color, but you can also pass a color vector to encode a third piece of information.
import matplotlib.pyplot as plt
import numpy as np
# create random data
a = np.random.rand(10)
b = np.random.rand(10)
# open figure + axes
fig,axs = plt.subplots(1,2)
# standard scatter-plot
axs[0].scatter(a,b)
axs[0].set_title("scatter plot")
# standard line-plot
axs[1].plot(a,b)
axs[1].set_title("line plot")
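To make the difference concrete, here is a small sketch with made-up data (the values of a and b are assumptions). It compares plot() called once per point, the same with a marker, plot() called once on the whole arrays, and a scatter plot colored by a third quantity.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

a = np.linspace(0, 5, 10)
b = a ** 2

fig, axs = plt.subplots(1, 4, figsize=(12, 3))

# One point per call: every call creates a one-point line, so nothing visible is drawn
for ai, bi in zip(a, b):
    axs[0].plot(ai, bi, color="C0")
axs[0].set_title("plot, one point per call")

# Same calls with a marker: the points now show up, much like a scatter plot
for ai, bi in zip(a, b):
    axs[1].plot(ai, bi, marker="o", color="C0")
axs[1].set_title("plot with marker='o'")

# Whole arrays in a single call: one connected line
axs[2].plot(a, b, color="C0")
axs[2].set_title("plot, arrays at once")

# Scatter in a single call, with a color vector carrying extra information
axs[3].scatter(a, b, c=a, cmap="viridis")
axs[3].set_title("scatter with color vector")

# Number of line objects created on the first three axes
n_lines = [len(ax.get_lines()) for ax in axs[:3]]
```

The first two panels each hold ten separate line objects, while the third holds a single one, which is why no loop is needed for a line plot.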

Plot average of an array in python

I have a 2D array of temperature-over-time data. There are about 7500 x-values and as many corresponding y-values (so one y for every x).
It looks like this:
The blue line in the middle is the result of my unsuccessful attempt to draw a plot line, which would represent the average of my data. Code:
import numpy as np
import matplotlib.pyplot as plt
data=np.genfromtxt("data.csv")
temp_av=[np.mean(data[1])]*len(data[0])
plt.figure()
plt.subplot(111)
plt.scatter(data[0],data[1])
plt.plot(data[0],temp_av)
plt.show()
However what I need is a curve, which will follow the rise in the temperature. Basically a line which will be somewhere in the middle of data points.
I googled for some solutions, but all I found were suggestions how to compute an average in cases where you have multiple y-values for one x. I understand how to do that, but it doesn't help in this case.
My next idea would be to use a loop to compute an average for every 2 neighbor points. But I am not sure how to do that best and if there aren't better solutions.
Also, I understand that what I need is to compute another array. Plotting is only for representation.
If I understand correctly, what you are trying to plot is a trend line. You can do this with the NumPy function polyfit. If that's what you are looking for, try this small modification to your code:
import numpy as np
import matplotlib.pyplot as plt
data=np.genfromtxt("data.csv")
plt.figure()
plt.subplot(111)
plt.scatter(data[0],data[1])
pfit = np.polyfit(data[0], data[1], 1)
trend_line_model = np.poly1d(pfit)
plt.plot(data[0], trend_line_model(data[0]), "m--")
plt.show()
This will plot the trend line in dashed magenta
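If a straight trend line is too rigid, the asker's own idea of averaging neighboring points can be sketched as a moving average. This is not part of the answer above; it is an alternative under the assumption that the samples are roughly evenly spaced. The synthetic data, the helper name moving_average, and the window size are all made up for illustration.

```python
import numpy as np

def moving_average(y, window):
    """Average over a sliding window; returns len(y) - window + 1 values."""
    kernel = np.ones(window) / window
    return np.convolve(y, kernel, mode="valid")

# Synthetic rising "temperature" with noise (assumption, stands in for data.csv)
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 500)
y = 20 + 2 * x + rng.normal(scale=1.0, size=x.size)

window = 25  # odd, so each averaged value has a well-defined centre sample
y_smooth = moving_average(y, window)

# x positions aligned with the centre of each window
start = (window - 1) // 2
x_smooth = x[start:start + y_smooth.size]
```

The result can then be drawn on top of the scatter plot with plt.scatter(x, y) followed by plt.plot(x_smooth, y_smooth). A larger window gives a smoother curve but trims more points at both ends.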

Python Matplotlib: Large dataset that is cyclic. How can I make the shade of the color darken with index?

I'm plotting large data sets. The data is cyclical. Without a color gradient, it is very difficult to see how the data evolves because of its cyclical behavior. Multiple data sets are plotted at once, so having a single data set change between different colors is out of the question. (See example figure.)
How can I make my line's color darken with index and still use a legend?
EDIT: (editing for clarity and to fulfil the request for a minimum working example)
I want my dataset's color to change its shade from lighter to darker as the index increases. However, I still need the legend to work.
Minimum working example code:
import pandas as pd
import matplotlib.pyplot as plt
x = range(0,1000)
y = []
for i in x:
    y.append(2*x[i])
    i+=1
plt.plot(x,y, label = "poop")
plt.legend()
plt.show()

Show a (discrete) colorbar next to a plot as a legend for the (automatically chosen) line colors

I tried to make a plot showing many lines, but it is hard to tell them apart. They have different colors, but I would like to make it easy to show which line is which. A normal legend does not really work so well, since I have more than 10 lines.
The lines follow a logical sequence. I would like to (1) have their colors automatically chosen from a colormap (preferably one that has a smooth ordering, such as viridis or a rainbow). Then I would like (2) the tick marks next to the color bar to correspond to the index i of each line (or, better, to a text label from an array of strings textlabels[i]).
Here's a minimal piece of code (with some gaps where I am not sure what to use). I hope this illustrates what I am trying.
import numpy as np
import matplotlib.pyplot as plt
# Generate some values to plot on the x-axis
x = np.linspace(0,1,1000)
# Some code to select a (discrete version of) a rainbow/viridis color map
...
# Loop over lines that should appear in the plot
for i in range(0,9):
    # Plot something (using straight lines with different slope as example)
    plt.plot(i*x)
# Some code to plot a discrete color bar next
# to the plot with ticks showing the value of i
...
I currently have this. I would like the color bar to have the ticks with values of i, i.e. 0, 1, 2, ... next to it as tick marks.
Example figure of what I have now. It is hard to tell the lines apart now.
One gets a colormap via plt.get_cmap("name of cmap", number_of_colors).
This colormap can be used to compute the colors for the plots. It can also be used to generate a colorbar.
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.colors
n = 10 # how many lines to draw or number of discrete color levels
x = np.linspace(0,1,17)
cmap = plt.get_cmap("viridis", n)
for i in range(0,n):
    plt.plot(i*x, color=cmap(i))
norm = matplotlib.colors.BoundaryNorm(np.arange(0,n+1)-0.5, n)
sm = plt.cm.ScalarMappable(cmap=cmap, norm=norm)
sm.set_array([])
# pass ax explicitly: newer Matplotlib versions need it for a mappable not attached to any axes
plt.colorbar(sm, ax=plt.gca(), ticks=np.arange(0,n))
plt.show()
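The question also asked for text labels instead of the index i next to the color bar. One possible way, sketched here with a hypothetical textlabels list, is to set the tick labels on the colorbar after creating it.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import matplotlib.colors
import numpy as np

# Hypothetical labels, one per line (assumption, not from the question)
textlabels = ["run A", "run B", "run C", "run D", "run E"]
n = len(textlabels)
x = np.linspace(0, 1, 17)
cmap = plt.get_cmap("viridis", n)  # discrete colormap with n colors

fig, ax = plt.subplots()
for i in range(n):
    ax.plot(x, i * x, color=cmap(i))

# Boundaries at -0.5, 0.5, ..., n-0.5 so that each integer i sits mid-color
norm = matplotlib.colors.BoundaryNorm(np.arange(n + 1) - 0.5, n)
sm = plt.cm.ScalarMappable(cmap=cmap, norm=norm)
sm.set_array([])
cbar = fig.colorbar(sm, ax=ax, ticks=np.arange(n))
# Replace the numeric tick labels 0..n-1 with the text labels
cbar.set_ticklabels(textlabels)
```

The ticks stay at the integer positions, so the order of textlabels has to match the order in which the lines are plotted.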

Plot 2 histograms with different length of data points in one graph using matplotlib

I have two sets of data, one containing around 11 million data points and the other around 5000. I would like to plot them both as histograms on one figure. Because of the difference in size, I need to normalise the frequency so I can plot them on the same figure. Below I have simulated what I have done with my data to be able to plot them. I have used normed=True.
from numpy.random import randn
import matplotlib.pyplot as plt
import random
datalist1=[]
for x in range(1,50000):
    datalist1.append(random.uniform(1,2))
datalist2=randn(5000000)
fig= plt.figure(1)
plt.hist(datalist1,bins=20,color='b',alpha=0.3,label='theoretical',histtype='stepfilled', normed=True)
plt.hist(datalist2,bins=20,alpha=0.5,color='g',label='experimental',histtype='stepfilled',normed=True)
plt.xlabel("Value")
plt.ylabel("Normalised Frequency")
plt.legend()
plt.show()
Can you please tell me if this is a good way to get around this issue? I would like the tallest bar in each of the two histograms to have a height of 1 (or 100%).
The normed=True setting normalizes the histogram to an area of 1. This lets each histogram be read as an estimate of a probability density function.
In short, it actually makes sense not to normalize on the peak but on the area.
But if you really want to normalize by height you can modify the polygon data of the histogram:
h = plt.hist(datalist1,bins=20,color='b',alpha=0.3,label='theoretical',histtype='stepfilled', normed=True)
p = h[2][0]
p.xy[:,1] /= p.xy[:, 1].max()
h = plt.hist(datalist2,bins=20,alpha=0.5,color='g',label='experimental',histtype='stepfilled',normed=True)
p = h[2][0]
p.xy[:,1] /= p.xy[:, 1].max()
This solution feels a bit hackish, but at least it's quick and dirty :)
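A less hackish variant, offered only as a sketch, is to compute the bin counts with np.histogram and divide by the largest count before plotting. The helper name peak_normalised_hist and the sample sizes are made up for illustration. Note also that newer Matplotlib versions removed the normed argument of hist in favor of density=True, so the snippets above need that substitution there.

```python
import numpy as np

def peak_normalised_hist(data, bins=20):
    """Return bin heights scaled so the tallest bin is 1, plus the bin edges."""
    counts, edges = np.histogram(data, bins=bins)
    heights = counts / counts.max()
    return heights, edges

# Synthetic stand-ins for the two data sets of very different size (assumption)
rng = np.random.default_rng(0)
small = rng.uniform(1, 2, size=5_000)
large = rng.normal(size=500_000)

h_small, e_small = peak_normalised_hist(small)
h_large, e_large = peak_normalised_hist(large)
```

The heights can then be drawn with, for example, plt.bar(e_small[:-1], h_small, width=np.diff(e_small), align="edge", alpha=0.3), and likewise for the second data set.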
