Defining and plotting a Schechter function: plot problems

I'm currently defining a function in python as:
def schechter_fit(logM, phi=5.96E-11, log_M0=11.03, alpha=-1.35, e=2.718281828):
    schechter = phi*(10**((alpha+1)*(logM-log_M0)))*(e**(pow(-10,logM-log_M0)))
    return schechter
schechter_range = numpy.linspace(10.0, 11.9, 10000)
And then plotting said function as:
import numpy
import matplotlib.pyplot as plt
from mpl_toolkits.axes_grid.axislines import SubplotZero
schechter_range = numpy.linspace(10, 12, 10000)
fig = plt.figure(1)
ax = SubplotZero(fig, 111)
fig.add_subplot(ax)
ax.plot(schechter_range, schechter_fit(schechter_range), 'k')
The graphical output I am receiving is just a blank plot with no curve plotted. There must be a problem with how I have defined the function, but I can't see it. The plot should look something like this:
I'm new to Python functions, so perhaps my equation isn't quite right. This is what I am looking to plot and the parameters I am starting with:

The function you describe returns a complex result over most of your input range. Here I added +0j to the input to allow for an imaginary result; if you don't do this you just get a bunch of nans (which mpl doesn't plot). Here are the plots:
import numpy
import matplotlib.pyplot as plt
from mpl_toolkits.axisartist.axislines import SubplotZero  # axes_grid is gone from recent Matplotlib
schechter_range = numpy.linspace(10, 12, 10000)
fig = plt.figure(1)
ax = SubplotZero(fig, 111)
fig.add_subplot(ax)
def schechter_fit(logM, phi=5.96E-11, log_M0=11.03, alpha=-1.35, e=2.718281828):
    schechter = phi*(10**((alpha+1)*(logM-log_M0)))*(e**(pow(-10,logM-log_M0)))
    return schechter
y = schechter_fit(schechter_range+0j) # Note the +0j here to allow an imaginary result
ax.plot(schechter_range, y.real, 'b', label="Re Part")
ax.plot(schechter_range, y.imag, 'r', label="Im Part")
ax.legend()
plt.show()
Now that you can see why the data is not plotting, that complex numbers are being generated, and that physically you don't want them, it is reasonable to figure out where they come from. Hopefully it's obvious that they originate from pow(-10,logM-log_M0), and from there it's clear that this assumes the wrong operator precedence: the equation isn't pow(-10,logM-log_M0), but -pow(10,logM-log_M0). Making this correction gives (after a log is taken, because I can see the log in the plot in the question):
I also extended the lower bound from 10 to 8, so the region of constant slope is clear and it better matches the graph shown in the question. This is still off by a factor on the y-axis, but I'm guessing that's a factor of (SFR/M*) that's not being applied correctly (it's difficult to know without seeing the context and the full y-axis).
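A minimal sketch of the corrected function, assuming the intended form is phi * 10^((alpha+1)(logM - log_M0)) * exp(-10^(logM - log_M0)) as the precedence fix implies. The use of log10 for the plotted quantity is an assumption (the answer only says "a log is taken"), and the names log_mass, values, buggy and fixed are illustrative.

```python
import numpy as np

def schechter_fit(logM, phi=5.96e-11, log_M0=11.03, alpha=-1.35):
    """Schechter function with the minus sign applied after the power."""
    x = logM - log_M0
    return phi * 10.0 ** ((alpha + 1) * x) * np.exp(-(10.0 ** x))

log_mass = np.linspace(8, 12, 1000)   # lower bound 8, as in the answer
values = schechter_fit(log_mass)      # real and positive everywhere
log_values = np.log10(values)         # assumed base-10 log for the y-axis

# The original bug in isolation: a negative base raised to a
# fractional power has no real result.
buggy = pow(-10, 0.5)    # complex in Python 3
fixed = -pow(10, 0.5)    # real, roughly -3.162
```

Passing log_mass and log_values to ax.plot should then draw a curve instead of a blank plot, since no NaN or complex values remain.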

I did almost the same as tom10, except that I took the log of your expression directly, which turns the factors into summands and may make things easier to debug.
I did not really test the formula!
import numpy
import matplotlib.pyplot as plt
from mpl_toolkits.axisartist.axislines import SubplotZero  # axes_grid is gone from recent Matplotlib
def log_schechter_fit(logM, SFR_M=5.96E-11, log_M0=11.03,
                      alpha=-1.35):
    schechter = numpy.log(SFR_M)
    schechter += (alpha+1)*(logM-log_M0)*numpy.log(10)
    schechter += pow(-10,logM-log_M0)
    return schechter
schechter_range = numpy.linspace(10, 12, 10000)
# for i in range(10,13):
for i in numpy.linspace(10, 11.03, 10):
    print(i, log_schechter_fit(i+0j))
fig = plt.figure(1)
ax = SubplotZero(fig, 111)
fig.add_subplot(ax)
ax.set_xlim([10,12])
y = log_schechter_fit(schechter_range+0j)
ax.plot(schechter_range, y.real, 'b', label="Re Part")
ax.plot(schechter_range, y.imag, 'r', label="Im Part")
ax.legend()
and I got:
UPDATE
Again using tom10's comments on operator precedence, I changed the last term in the function:
LOG_10 = numpy.log(10)
SFR_M = 5.96E-11
LOG_SFR_M = numpy.log(SFR_M)
def log_schechter_fit(logM, log_SFR_M=LOG_SFR_M, log_M0=11.03,
                      alpha=-1.35):
    schechter = log_SFR_M
    schechter += (alpha+1)*(logM-log_M0)*LOG_10
    schechter -= pow(10,logM-log_M0)
    return schechter
I can reproduce the plot of the accepted answer. The shape of the curve fits, but I cannot explain the discrepancy between these values and those in the original plot posted in the question.
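One way to check the log form is to compare it numerically with the logarithm of the direct expression. The sketch below does that with NumPy only; both function bodies are restated here so the block runs on its own, and the names log_mass, natural_log and log10_values are illustrative. Note that this log form returns a natural logarithm; dividing by ln(10) converts it to base 10. Whether the choice of base has anything to do with the offset from the original plot is not something the thread confirms.

```python
import numpy as np

LOG_10 = np.log(10)

def schechter_fit(logM, phi=5.96e-11, log_M0=11.03, alpha=-1.35):
    """Direct form, with the corrected operator precedence."""
    x = logM - log_M0
    return phi * 10.0 ** ((alpha + 1) * x) * np.exp(-(10.0 ** x))

def log_schechter_fit(logM, log_phi=np.log(5.96e-11), log_M0=11.03, alpha=-1.35):
    """Natural log of the direct form: products become sums."""
    x = logM - log_M0
    return log_phi + (alpha + 1) * x * LOG_10 - 10.0 ** x

log_mass = np.linspace(8, 12, 500)
natural_log = log_schechter_fit(log_mass)
log10_values = natural_log / LOG_10   # same curve expressed in base 10
```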

Related

Using timedelta to properly align peaks on graph

I am using the scipy.signal library to find the peaks of a time graph. I passed in the y values of my pandas series, and it gave me the locations of the peaks. Now I am trying to use the locations from the find_peaks function to return the position in time of the peaks. Here is my function:
def turn_peaks_to_time_series(df,t_interval):
    df_values = df['l'].values
    fig, ax1 = plt.subplots()
    x_of_peaks, _ = find_peaks(df_values, height=None)
    y_of_peaks = df_values[x_of_peaks]
    x_values_to_t_values = lambda x : timedelta(minutes=x) * t_interval
    time_initial = np.min(df.index)
    t_of_peaks = [ time_initial + x_values_to_t_values(int(i)) for i in x_of_peaks ] #source of issue
    ax1.plot(t_of_peaks, y_of_peaks, "rp",label='peak') #plot peaks on graph
    ax1.plot(df.index,df.l) # plot df line
    plt.show()
However, the peaks are not aligning properly.
I know the issue is with my x_values_to_t_values function. In addition, any suggestions to optimize my code are very welcome.

Turns out I was trying to reinvent the wheel. The solution to my problem was extremely simple. I also adjusted the code to be more general.
def turn_peaks_to_time_series(series):
    series_values = series.values
    series_index = series.index
    fig, ax1 = plt.subplots()
    x_of_peaks, _ = find_peaks(series_values, height=None)
    y_of_peaks = series_values[x_of_peaks]
    ax1.plot(series_index[x_of_peaks], y_of_peaks, "rp",label='peak') #plot peaks on graph
    ax1.plot(series_index,series_values) # plot df line
    plt.show()
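The fix works because find_peaks returns integer positions into the values array, and those same positions can index the series' own index, so no timedelta arithmetic (and no assumption about a fixed sampling interval) is needed. A minimal sketch without plotting; the sine signal, the 5-minute spacing and the variable names are illustrative assumptions, not the asker's data:

```python
import numpy as np
import pandas as pd
from scipy.signal import find_peaks

# Synthetic series: two full sine periods sampled every 5 minutes.
index = pd.date_range("2021-01-01", periods=200, freq="5min")
values = np.sin(np.linspace(0, 4 * np.pi, 200))
series = pd.Series(values, index=index)

# Integer positions of the local maxima in the values array.
peak_positions, _ = find_peaks(series.values)

# The same positions select the matching timestamps and heights.
peak_times = series.index[peak_positions]
peak_values = series.values[peak_positions]
```

peak_times and peak_values can then be passed straight to ax1.plot, as in the function above.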

Offset secondary axis in matplotlib

I'm trying to bring together two different plot settings in matplotlib. I found nice examples for each of them in the matplotlib example gallery/documentation and on Stack Overflow, but I couldn't find anything on my specific problem.
So what I know so far is, how to add one or more axes with offset y-axis for plotting different data with respect to the same x-axis, by using ax.twinx(). The third y-axis is called parasite axis in the example Parasite axis demo. However, if you want to add an additional axis which is just a scaled version of the existing one, you can use ax.secondary_yaxis(), as shown in the Secondary axis demo. There is no additional data to be plotted.
What I could not achieve so far is a secondary y-axis which is offset from the original one. This can be very helpful to make plots more readable across scientific communities. For instance, while some scientists use frequency as reference for the electromagnetic spectrum, others use the wavelength or the wavenumber. Afsar [1] used a very convenient axis labeling which includes all the three variables in the same plot:
I would like to do something similar, just on the y-axis instead of the x-axis. Is there a way to offset the secondary axis from the primary axis? I tried a few parameters but couldn't figure it out.
Thank you for any help!
[1] Afsar, Mohammed Nurul. “Precision Millimeter-Wave Measurements of Complex Refractive Index, Complex Dielectric Permittivity, and Loss Tangent of Common Polymers.” IEEE Transactions on Instrumentation and Measurement IM–36, no. 2 (June 1987): 530–36. https://doi.org/10.1109/TIM.1987.6312733.

A complete example. The third-to-last line is the relevant one.
import matplotlib.pyplot as plt
import matplotlib.dates as mdates  # needed for date2num below
import numpy as np
import datetime
dates = [datetime.datetime(2018, 1, 1) + datetime.timedelta(hours=k * 6)
         for k in range(240)]
temperature = np.random.randn(len(dates)) * 4 + 6.7
fig, ax = plt.subplots(constrained_layout=True)
ax.plot(dates, temperature)
ax.set_ylabel(r'$T\ [^oC]$')
plt.xticks(rotation=70)
def date2yday(x):
    """Convert matplotlib datenum to days since 2018-01-01."""
    y = x - mdates.date2num(datetime.datetime(2018, 1, 1))
    return y
def yday2date(x):
    """Return a matplotlib datenum for *x* days after 2018-01-01."""
    y = x + mdates.date2num(datetime.datetime(2018, 1, 1))
    return y
secax_x = ax.secondary_xaxis('top', functions=(date2yday, yday2date))
secax_x.set_xlabel('yday [2018]')
def celsius_to_fahrenheit(x):
    return x * 1.8 + 32
def fahrenheit_to_celsius(x):
    return (x - 32) / 1.8
secax_y = ax.secondary_yaxis(
    'right', functions=(celsius_to_fahrenheit, fahrenheit_to_celsius))
secax_y.set_ylabel(r'$T\ [^oF]$')
def celsius_to_anomaly(x):
    return (x - np.mean(temperature))
def anomaly_to_celsius(x):
    return (x + np.mean(temperature))
# document use of a float for the position:
secax_y2 = ax.secondary_yaxis(
    1.2, functions=(celsius_to_anomaly, anomaly_to_celsius))
secax_y2.set_ylabel(r'$T - \overline{T}\ [^oC]$')
plt.show()

Here is another approach, although maybe it's more of a hack:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter
#FuncFormatter
def twin1_formatter(x, pos):
    return f'{x/np.pi*180:.0f}'
#FuncFormatter
def twin2_formatter(x, pos):
    return rf'{x/np.pi:.1f} $\pi$'  # raw string so \p is not read as an escape
data = np.arange(0, 2*np.pi, 0.1)
fig, ax = plt.subplots()
twin1 = ax.twiny()
twin1.spines['top'].set_position(('axes', 1.2))
twin1.set_xlabel('Degrees')
twin1.xaxis.set_major_formatter(FuncFormatter(twin1_formatter))
twin2 = ax.twiny()
twin2.set_xlabel('Pies')
twin2.xaxis.set_major_formatter(FuncFormatter(twin2_formatter))
twin2.xaxis.set_ticks(np.array([0, 1/2, 1, 3/2, 2])*np.pi)
ax.plot(data, np.sin(data))
ax.set_xlabel('Radians')
twin1.set_xlim(ax.get_xlim())
twin2.set_xlim(ax.get_xlim())
fig.show()

Matplotlib: get colors and x/y data from a bar plot

I have a bar plot and I want to get its colors and x/y values. Here is a sample code:
import matplotlib.pyplot as plt
def main():
    x_values = [1,2,3,4,5]
    y_values_1 = [1,2,3,4,5]
    y_values_2 = [2,4,6,8,10]
    f, ax = plt.subplots(1,1)
    ax.bar(x_values,y_values_2,color='r')
    ax.bar(x_values,y_values_1,color='b')
    #Any methods?
    plt.show()
if __name__ == '__main__':
    main()
Are there any methods like ax.get_xvalues(), ax.get_yvalues(), ax.get_colors(), which I can use so I could extract back from ax the lists x_values, y_values_1, y_values_2 and the colors 'r' and 'b'?

The ax knows what geometric objects it's drawing, but nothing about it keeps track of when those geometric objects were added, and of course it doesn't know what they "mean": which patch comes from which bar plot, etc. The coder needs to keep track of that to re-extract the right parts for further use. The way to do this is common to many Python programs: the call to ax.bar returns a BarContainer, which you can name at the time and use later:
import matplotlib.pyplot as plt
def main():
    x_values = [1,2,3,4,5]
    y_values_1 = [1,2,3,4,5]
    y_values_2 = [2,4,6,8,10]
    f, ax = plt.subplots(1,1)
    rbar = ax.bar(x_values,y_values_2,color='r')
    bbar = ax.bar(x_values,y_values_1,color='b')
    return rbar, bbar
if __name__ == '__main__':
    rbar, bbar = main()
    # do stuff with the barplot data:
    assert(rbar.patches[0].get_facecolor()==(1.0,0.,0.,1.))
    assert(rbar.patches[0].get_height()==2)
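Building on the returned BarContainer, the sketch below recovers all three things the question asks for. The helper name bar_data is hypothetical, and the x values are reconstructed as bar centres, which assumes the default align='center' of ax.bar. The Agg backend is selected only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt

x_values = [1, 2, 3, 4, 5]
y_values_1 = [1, 2, 3, 4, 5]
y_values_2 = [2, 4, 6, 8, 10]

fig, ax = plt.subplots()
rbar = ax.bar(x_values, y_values_2, color='r')
bbar = ax.bar(x_values, y_values_1, color='b')

def bar_data(container):
    """Return (x centres, heights, face colours) for one BarContainer."""
    # get_x() is the left edge; adding half the width gives the centre
    # (assumes the default align='center').
    xs = [p.get_x() + p.get_width() / 2 for p in container.patches]
    heights = [p.get_height() for p in container.patches]
    colors = [p.get_facecolor() for p in container.patches]
    return xs, heights, colors

red_x, red_heights, red_colors = bar_data(rbar)
blue_x, blue_heights, blue_colors = bar_data(bbar)
```

If the return values were not kept, the same containers should also be reachable through ax.containers, in the order the bar calls were made.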

A slight variation to the above answers to put it all within the call to another plotting command:
# plot various patch objects to ax2
ax2 = plt.subplot(1,4,2)
ax2.hist(...)
# start a new plot with same colors as i'th patch object
ax3 = plt.subplot(1,4,3)
plot(...,...,color=ax2.axes.containers[i].patches[0].get_facecolor() )
In other words, I seemed to need an axes attribute in between the axis handle and the containers handle in order for it to be a bit more general.

producing histogram with y axis as relative frequency?

Today my task is to produce a histogram where the y axis is a relative frequency rather than just an absolute count. I've located another question regarding this (see: Setting a relative frequency in a matplotlib histogram) however, when I try to implement it, I get the error message:
'list' object has no attribute size
despite having the exact same code given in the answer -- and despite their information also being stored in a list.
In addition, I have tried the method here (http://www.bertplot.com/visualization/?p=229) to no avail, as the output still doesn't show the y label as ranging from 0 to 1.
import numpy as np
import matplotlib.pyplot as plt
import random
from tabulate import tabulate
import matplotlib.mlab as mlab
precision = 100000000000
def MarkovChain(n,s) :
    """
    """
    matrix = []
    for l in range(n) :
        lineLst = []
        sum = 0
        crtPrec = precision
        for i in range(n-1) :
            val = random.randrange(crtPrec)
            sum += val
            lineLst.append(float(val)/precision)
            crtPrec -= val
        lineLst.append(float(precision - sum)/precision)
        matrix2 = matrix.append(lineLst)
    print("The intial probability matrix.")
    print(tabulate(matrix2))
    baseprob = []
    baseprob2 = []
    baseprob3 = []
    baseprob4 = []
    for i in range(1,s): #changed to do a range 1-s instead of 1000
        #must use the loop variable here, not s (s is always the same)
        matrix_n = np.linalg.matrix_power(matrix2, i)
        baseprob.append(matrix_n.item(0))
        baseprob2.append(matrix_n.item(1))
        baseprob3.append(matrix_n.item(2))
    baseprob = np.array(baseprob)
    baseprob2 = np.array(baseprob2)
    baseprob3 = np.array(baseprob3)
    baseprob4 = np.array(baseprob4)
    # Here I tried to make a histogram using the plt.hist() command, but the normed=True doesn't work like I assumed it would.
    '''
    plt.hist(baseprob, bins=20, normed=True)
    plt.show()
    '''
    #Here I tried to make a histogram using the method from the second link in my post.
    # The code runs, but then the graph that is outputted isn't doesn't have the relative frequency on the y axis.
    '''
    n, bins, patches = plt.hist(baseprob, bins=30,normed=True,facecolor = "green",)
    y = mlab.normpdf(bins,mu,sigma)
    plt.plot(bins,y,'b-')
    plt.title('Main Plot Title',fontsize=25,horizontalalignment='right')
    plt.ylabel('Count',fontsize=20)
    plt.yticks(fontsize=15)
    plt.xlabel('X Axis Label',fontsize=20)
    plt.xticks(fontsize=15)
    plt.show()
    '''
    # Here I tried to make a histogram using the method seen in the Stackoverflow question I mentioned.
    # The figure that pops out looks correct in terms of the axes, but no actual data is posted. Instead the error below is shown in the console.
    # AttributeError: 'list' object has no attribute 'size'
    fig = plt.figure()
    ax = fig.add_subplot(111)
    ax.hist(baseprob, weights=np.zeros_like(baseprob)+1./ baseprob.size)
    n, bins, patches = ax.hist(baseprob, bins=100, normed=1, cumulative=0)
    ax.set_xlabel('Bins', size=20)
    ax.set_ylabel('Frequency', size=20)
    ax.legend
    plt.show()
    print("The final probability matrix.")
    print(tabulate(matrix_n))
    matrixTranspose = zip(*matrix_n)
    evectors = np.linalg.eig(matrixTranspose)[1][:,0]
    print("The steady state vector is:")
    print(evectors)
MarkovChain(5, 1000)
The methods I tried are each commented out, so to reproduce my errors, make sure to erase the comment markers.
As you can tell, I'm really new to Programming. Also this is not for a homework assignment in a computer science class, so there are no moral issues associated with just providing me with code.

Matplotlib functions usually expect NumPy arrays, which have a size attribute. Lists have no size attribute, so evaluating baseprob.size in your weights expression raises this error while baseprob is still a list. You need to convert, using nparray = np.array(list). You can do this after the loop where you build the lists with append, something like:
baseprob = []
baseprob2 = []
baseprob3 = []
baseprob4 = []
for i in range(1,s): #changed to do a range 1-s instead of 1000
    #must use the loop variable here, not s (s is always the same)
    matrix_n = np.linalg.matrix_power(matrix, i)
    baseprob.append(matrix_n.item(0))
    baseprob2.append(matrix_n.item(1))
    baseprob3.append(matrix_n.item(2))
baseprob = np.array(baseprob)
baseprob2 = np.array(baseprob2)
baseprob3 = np.array(baseprob3)
baseprob4 = np.array(baseprob4)
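The relative-frequency weights from the question can be checked without drawing anything, because ax.hist accepts the same weights argument as np.histogram. A minimal sketch on random data (the data, the seed and the names rel_freq and density are illustrative assumptions): each sample gets weight 1/N, so the bar heights sum to 1, whereas a density histogram integrates to 1 over the bin widths instead.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=1000)            # stand-in for baseprob

# A plain list has no .size, which is what triggers the AttributeError.
list_has_size = hasattr([0.1, 0.2], "size")      # False
array_has_size = hasattr(np.array([0.1, 0.2]), "size")  # True

# Relative frequency: every sample contributes 1/N to its bin.
weights = np.ones_like(data) / data.size
rel_freq, edges = np.histogram(data, bins=20, weights=weights)

# Density: heights are scaled so that the area under the bars is 1.
density, _ = np.histogram(data, bins=20, density=True)
```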
EDIT: minimal hist example
import numpy as np
import matplotlib.pyplot as plt
fig = plt.figure()
ax = fig.add_subplot(111)
baseprob = np.random.randn(1000000)
ax.hist(baseprob, weights=np.zeros_like(baseprob)+1./ baseprob.size, bins=100)
n, bins, patches = ax.hist(baseprob, bins=100, density=True, cumulative=0, alpha = 0.4)  # density replaces normed, which newer Matplotlib no longer accepts
ax.set_xlabel('Bins', size=20)
ax.set_ylabel('Frequency', size=20)
ax.legend
plt.show()
which gives,

Put a gap/break in a line plot

I have a data set with effectively "continuous" sensor readings, with the occasional gap.
However there are several periods in which no data was recorded. These gaps are significantly longer than the sample period.
By default, pyplot connects each data point to the next (if I have a line style set), however I feel that this is slightly misleading when it connects the two data points either side of a long gap.
I would prefer to simply have no line there; that is, I would like the line to stop and to start again after the gap.
I have tried adding an element in these gap sections with the y-value None, but this seems to send the line back to an earlier part of the plot (though strangely these lines don't appear at all zoom levels).
The other option I have thought of is to simply plot each piece with a separate call to plot, but this would be a bit ugly and cumbersome.
Is there a more elegant way of achieving this?
Edit: Below is a minimal working example demonstrating the behaviour. The first plot is the joining line I am trying to avoid. The second plot shows that adding a None value appears to work, however if you pan the view of the plot, you get what is shown in the third figure, a line jumping to an earlier part of the plot.
import numpy as np
import matplotlib.pyplot as plt
t1 = np.arange(0, 8, 0.05)
t2 = np.arange(10, 14, 0.05)
t = np.concatenate([t1, t2])
c = np.cos(t)
fig = plt.figure()
ax = fig.gca()
ax.plot(t, c)
ax.set_title('Undesirable joining line')
t1 = np.arange(0, 8, 0.05)
t2 = np.arange(10, 14, 0.05)
c1 = np.cos(t1)
c2 = np.cos(t2)
t = np.concatenate([t1, t1[-1:], t2])
c = np.concatenate([c1, [None,], c2])
fig = plt.figure()
ax = fig.gca()
ax.plot(t, c)
ax.set_title('Ok if you don\'t pan the plot')
fig = plt.figure()
ax = fig.gca()
ax.plot(t, c)
ax.axis([-1, 12, -0.5, 1.25])
ax.set_title('Strange jumping line')
plt.show()

Masked arrays work well for this. You just need to mask the first of the points you don't want to connect:
import numpy as np
import numpy.ma as ma
import matplotlib.pyplot as plt
t1 = np.arange(0, 8, 0.05)
mask_start = len(t1)
t2 = np.arange(10, 14, 0.05)
t = np.concatenate([t1, t2])
c = np.cos(t) # an aside, but it's better to use numpy ufuncs than list comps
mc = ma.array(c)
mc[mask_start] = ma.masked
plt.figure()
plt.plot(t, mc)
plt.title('Using masked arrays')
plt.show()
At least on my system (OSX, Python 2.7, mpl 1.1.0), I don't have any issues with panning, etc.

The strange lines were a bug in matplotlib 1.1.1.
There is no need to have the t component of the dummy points in chronological order, zero values will also work.
For the c component, I use np.nan instead of None, which (on conversion from a list) forces the dtype to 'float64' instead of 'O' (object).
Dummy points are best inserted at the time of filling the array with samples (or appending to a list), like so:
samples = [] # (t,c) data pairs.
# Waiting for samples in a loop.
if samples and current_sample[0] > samples[-1][0] + GAP_TOLERANCE:
    samples.append((0, np.nan))
samples.append(current_sample)
t, c = np.array(samples).T
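The same idea can be applied to data that has already been collected. The sketch below is one possible vectorised version, not the answerer's code: the helper name insert_gap_markers and its gap_tolerance parameter are hypothetical, and the dummy point reuses the preceding t value rather than 0 (either works, as noted above).

```python
import numpy as np

def insert_gap_markers(t, c, gap_tolerance):
    """Insert a NaN sample wherever consecutive t values are too far apart."""
    t = np.asarray(t, dtype=float)
    c = np.asarray(c, dtype=float)
    # Index i is flagged when t[i+1] - t[i] exceeds the tolerance.
    gap_after = np.flatnonzero(np.diff(t) > gap_tolerance)
    # np.insert places new values before the given index, hence the +1.
    t_out = np.insert(t, gap_after + 1, t[gap_after])
    c_out = np.insert(c, gap_after + 1, np.nan)
    return t_out, c_out

t1 = np.arange(0, 8, 0.05)
t2 = np.arange(10, 14, 0.05)
t = np.concatenate([t1, t2])
c = np.cos(t)

t_out, c_out = insert_gap_markers(t, c, gap_tolerance=1.0)
```

Plotting t_out against c_out should leave the stretch between the two recorded segments unconnected, because Matplotlib does not draw line segments through NaN values.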
