The Y limits on my imshow subplot are stuck on a seemingly arbitrary range.
In this example, I'm trying to show the mean of N trials and then plot all the N trials over time as a 2d plot.
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(42)
N = 20 # number of trials
M = 3000 # number of samples in each trial
data = np.random.randn(N, M)
x = np.linspace(0,1,M) # the M samples occur in the range 0-1
# ie the sampling rate is 3000 samples per second
f, (ax1, ax2) = plt.subplots(2,1, sharex=True)
ax1.plot(x, np.mean(data, 0))
ax2.imshow(data, cmap="inferno", interpolation="nearest", extent=[0, 1, 0, N])
ax2.set_ylim(0, N)
ax1.set_ylabel("mean over trials")
ax2.set_ylabel("trial")
ax2.set_xlabel("time")
Are there any tricks to set the Y limits correctly?
By default, imshow uses an equal aspect ratio.
Because the x-axis is shared with the plot above (ax1) and therefore spans a range of 1, an equal aspect ratio leaves the y-axis room to extend over only a fraction of 1 instead of the full 0 to N.
The solution is actually quite simple: You just need to add
ax2.set_aspect('auto')
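For reference, here is a minimal sketch of the question's example with the fix applied. It passes aspect="auto" directly to imshow, which should be equivalent to calling ax2.set_aspect('auto') afterwards; the Agg backend line is only an assumption so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption); drop this line for interactive use
import numpy as np
import matplotlib.pyplot as plt

np.random.seed(42)
N = 20     # number of trials
M = 3000   # number of samples in each trial
data = np.random.randn(N, M)
x = np.linspace(0, 1, M)

fig, (ax1, ax2) = plt.subplots(2, 1, sharex=True)
ax1.plot(x, data.mean(axis=0))
# aspect="auto" lets the image stretch to fill the axes
# instead of forcing square data units
ax2.imshow(data, cmap="inferno", interpolation="nearest",
           extent=[0, 1, 0, N], aspect="auto")
ax2.set_ylim(0, N)
ax1.set_ylabel("mean over trials")
ax2.set_ylabel("trial")
ax2.set_xlabel("time")
```

With the aspect set to auto, the y-axis of the lower subplot keeps the requested 0 to N range.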
I am trying to smooth my data using a spline. The data is basically a cumulative percentile on the y-axis and the reference point it refers to on the x-axis. I get most of it right; however, the challenge I am facing is that the smoothed curve is not monotonic: as seen in the spline plot below, the y values increase and then decrease, instead of only increasing.
I still want a smooth curve, but I want y to increase with x, i.e. each subsequent y value should be equal to or slightly greater than the previous one, as opposed to increasing and then decreasing later.
Reproducible code:
import pandas as pd
import numpy as np
from scipy.interpolate import make_interp_spline
import matplotlib.pyplot as plt
from matplotlib.pyplot import figure
percentile_block_df = pd.DataFrame({
'x' : [0.5,100.5,200.5,400.5,800.5,900.5,1000.5],
'percentile' : [0.0001,0.01,0.065,0.85,0.99,0.9973,0.9999]
})
figure(figsize=(8, 6), dpi=80)
y = percentile_block_df.percentile
x = percentile_block_df.x
X_Y_Spline = make_interp_spline(x, y)
# Returns evenly spaced numbers
# over a specified interval.
X_ = np.linspace(x.min(), x.max(), 1000)
Y_ = X_Y_Spline(X_)
figure(figsize=(18, 6), dpi=80)
plt.subplot(1, 2, 1) # row 1, col 2 index 1
plt.plot(x, y,"ro")
plt.plot(x, y)
plt.title("Original")
plt.xlabel('X')
plt.ylabel('Percentile ')
plt.subplot(1, 2, 2) # index 2
plt.plot(x, y,"ro")
plt.plot(X_, Y_,"green")
plt.title("Spline Plot")
plt.xlabel('X')
plt.ylabel('Percentile ')
plt.show()
What you are looking for is "monotonicity preserving interpolation". A quick search shows that scipy.interpolate.PchipInterpolator does just that. Here is the result for your example when simply plugging in from scipy.interpolate import PchipInterpolator instead of from scipy.interpolate import make_interp_spline.
Whether or not that's appropriate depends of course on your specific requirements for the interpolation. I encourage you to research the other options which are out there.
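As a rough check of the claim above, the sketch below interpolates the question's data with both methods and tests whether each result is non-decreasing. The helper is_non_decreasing and its tolerance are my own illustrative additions, not part of SciPy.

```python
import numpy as np
from scipy.interpolate import PchipInterpolator, make_interp_spline

x = np.array([0.5, 100.5, 200.5, 400.5, 800.5, 900.5, 1000.5])
y = np.array([0.0001, 0.01, 0.065, 0.85, 0.99, 0.9973, 0.9999])
x_fine = np.linspace(x.min(), x.max(), 1000)

y_spline = make_interp_spline(x, y)(x_fine)  # cubic B-spline, may overshoot
y_pchip = PchipInterpolator(x, y)(x_fine)    # shape-preserving piecewise cubic


def is_non_decreasing(values, tol=1e-12):
    """Hypothetical helper: True if no step goes down by more than tol."""
    return bool(np.all(np.diff(values) >= -tol))


print("B-spline non-decreasing:", is_non_decreasing(y_spline))
print("PCHIP non-decreasing:   ", is_non_decreasing(y_pchip))
```

PCHIP still passes through every data point, so the trade-off is a curve that is only once continuously differentiable rather than twice.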
Similar question:
Fully monotone interpolation in python
Code that eventually worked for me:
This link explains the need for monotone cubic interpolation.
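from scipy.interpolate import PchipInterpolator
# this code "smooths" the data: monotone PCHIP interpolation of log(y),
# mapped back with exp (x and y are the columns defined in the question)
B_spline_coeff1 = PchipInterpolator(x, np.log(y))
X1_Final = np.linspace(x.min(), x.max(), 1000)
Y1_Final = np.exp(B_spline_coeff1(X1_Final))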
#plot subplots
figure(figsize=(18, 6), dpi=80)
plt.subplot(1, 2, 1) # row 1, col 2 index 1
plt.plot(x, y,"ro")
plt.plot(x, y)
plt.title("Original")
plt.xlabel('X')
plt.ylabel('Percentile ')
plt.subplot(1, 2, 2) # index 2
plt.plot(x, y,"ro")
plt.plot(X1_Final, Y1_Final,"green")
plt.title("Spline Plot")
plt.xlabel('X')
plt.ylabel('Percentile ')
plt.show()
I am plotting a 5th degree polynomial... for simplicity, lets just go with y=(x-3)(x-2)x(x+2)(x+3). On reasonable intervals for x, say from -5 to 5, the graph isn't very informative because the function grows very quickly outside of the "interesting" range, about -3 to 3:
A symlog scale is somewhat better, but now I'm looking at the log of a 5th degree polynomial, which is a bit hard for me to interpret:
Ideally, I could plot this on a polynomial scale. Since I know I have a 5th degree polynomial, then a 5th root scale would be able to fit all of my data, and the graph should behave linearly out near the edges. Is it possible to scale my axes with an arbitrary function?
I adjusted this example as follows:
import numpy as np
import matplotlib.pyplot as plt
y = np.random.normal(loc=0.5, scale=0.4, size=1000)
x = np.arange(len(y))
fig, ax = plt.subplots(figsize=(6, 8), constrained_layout=True)
t = np.arange(1, 170.0, 0.1)
s = t / 2.
ax.plot(t, s, '-', lw=2)
ax.set_yscale('function', functions=(lambda x: x**5, lambda x: x**(0.2)))
ax.grid(True)
ax.set_ylim(0,5)
plt.show()
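For the fifth-root scale described in the question, the forward function would be the root and the inverse the fifth power. The sketch below is one way to write that for the polynomial from the question; the sign handling is my own assumption, added so that negative y values stay defined, and the Agg backend line is only there so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption); drop this line for interactive use
import numpy as np
import matplotlib.pyplot as plt


def forward(y):
    # signed fifth root: keeps the sign so negative values map to real numbers
    return np.sign(y) * np.abs(y) ** 0.2


def inverse(y):
    # signed fifth power, the exact inverse of forward
    return np.sign(y) * np.abs(y) ** 5


x = np.linspace(-5, 5, 1001)
y = (x - 3) * (x - 2) * x * (x + 2) * (x + 3)

fig, ax = plt.subplots()
ax.plot(x, y)
ax.set_yscale("function", functions=(forward, inverse))
ax.grid(True)
```

The default tick locations are still chosen in data space, so a custom locator may be needed to get readable ticks near zero.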
I have code that plots ctg(x), but I don't want the asymptotes to be drawn, or I want them to have a different color. I'm a beginner and I don't know what to change in this code:
import matplotlib.ticker as tck
import matplotlib.pyplot as plt
import numpy as np
f,ax=plt.subplots(figsize=(8,5))
x=np.linspace(-np.pi, np.pi,100)
y=np.cos(x)/np.sin(x)
plt.ylim([-4, 4])
ax.plot(x/np.pi,y)
plt.title("f(x) = ctg(x)")
plt.xlabel("x")
plt.ylabel("y")
ax.xaxis.set_major_formatter(tck.FormatStrFormatter(r'%g $\pi$'))  # raw string avoids the invalid "\p" escape
plt.savefig('ctg')
plt.show()
It is not an asymptote being drawn, but the line connecting the points on either side of zero.
To overcome this, plot the positive and negative parts separately, making sure that the color (and style) of the two plots is the same (optionally using the first default matplotlib color).
Since np.linspace() includes the endpoints of each interval, these might accidentally create the same artifact.
To avoid that, it is enough to add/subtract a small number (epsilon) at the endpoints.
import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np
f,ax=plt.subplots(figsize=(8,5))
# get first default color
color = plt.rcParams['axes.prop_cycle'].by_key()['color'][0]
epsilon = 1e-7
intervals = (
(-np.pi, 0),
(0, np.pi), )
for a, b in intervals:
x=np.linspace(a + epsilon, b - epsilon, 50)
y=np.cos(x) / np.sin(x)
ax.plot(x/np.pi,y, color=color)
plt.title("f(x) = ctg(x)")
plt.xlabel("x")
plt.ylabel("y")
plt.ylim([-4, 4])
ax.xaxis.set_major_formatter(mpl.ticker.FormatStrFormatter(r'%g $\pi$'))  # raw string avoids the invalid "\p" escape
plt.savefig('ctg')
plt.show()
This code creates a figure with one subplot for the cotangent function. NaNs are inserted where sin(x) approaches 0 (NaN means "Not a Number"; NaNs are neither plotted nor connected).
matplot-fmt-pi, created by k-donn (https://pypi.org/project/matplot-fmt-pi/), is used to change the formatter so that the x labels and ticks correspond to multiples of π/8 in fractional format.
Plot formatting (grid, legend, limits, axis) is performed as commented.
import matplotlib.pyplot as plt
import numpy as np
from matplot_fmt_pi import MultiplePi
fig, ax = plt.subplots() # creates a figure and one subplot
x = np.linspace(-2 * np.pi, 2 * np.pi, 1000)
y = 1/np.tan(x)
y[np.abs(np.sin(x)) <= np.abs(np.sin(x[1]-x[0]))] = np.nan
# This operation inserts a NaN where sin(x) is reaching 0
# NaN means "Not a Number" and NaNs are not plotted or connected
ax.plot(x, y, lw=2, color="blue", label='Cotangent')
# Set up grid, legend, and limits
ax.grid(True)
ax.axhline(0, color='black', lw=.75)
ax.axvline(0, color='black', lw=.75)
ax.set_title("Trigonometric Functions")
ax.legend(frameon=False) # remove the legend frame
# axis formatting
ax.set_xlim(-2 * np.pi, 2 * np.pi)
pi_manager = MultiplePi(8) # number= ticks between 0 - pi
ax.xaxis.set_major_locator(pi_manager.locator())
ax.xaxis.set_major_formatter(pi_manager.formatter())
plt.ylim(bottom=-10, top=10) # y axis limit values
y_ticks = np.arange(-10, 11, 1) # the stop value is exclusive, so use 11 to include 10
plt.yticks(y_ticks)
plt.show()
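If installing an extra package is not an option, the same NaN idea can be sketched with NumPy alone by masking every value whose magnitude exceeds a cutoff. The cutoff of 10 below is an arbitrary assumption, chosen larger than the visible y range so the branches still run off the edge of the plot; the Agg backend line is likewise only there so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption); drop this line for interactive use
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-np.pi, np.pi, 1000)
with np.errstate(divide="ignore", invalid="ignore"):
    y = np.cos(x) / np.sin(x)

cutoff = 10                      # assumed threshold, larger than the y limits below
y[np.abs(y) > cutoff] = np.nan   # NaNs break the line, so branches are not joined

fig, ax = plt.subplots(figsize=(8, 5))
ax.plot(x / np.pi, y)
ax.set_ylim(-4, 4)
ax.set_title("f(x) = ctg(x)")
ax.set_xlabel(r"x / $\pi$")
ax.set_ylabel("y")
```

Because the gaps are created from the values themselves, this works for any sampling of x, at the cost of picking the cutoff by hand.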
A good way to show the concentration of the data points in a plot is using a scatter plot with non-unit transparency. As a result, the areas with more concentration would appear darker.
# this is synthetic example
N = 10000 # a very very large number
x = np.random.normal(0, 1, N)
y = np.random.normal(0, 1, N)
plt.scatter(x, y, marker='.', alpha=0.1) # an area full of dots, darker wherever the number of dots is more
which gives something like this:
Now imagine that we want to emphasize the outliers instead. The situation is almost reversed: we want a plot in which the less-concentrated areas are bolder. (There might be a trick that works for my simple example, but imagine a general case where the distribution of points is not known beforehand, or where it's difficult to define a rule for transparency/weight on color.)
I was wondering whether there is anything as handy as alpha that is designed specifically for this job. Other ideas for emphasizing outliers are also welcome.
UPDATE: This is what happens when more than one data point is scattered on the same area:
I'm looking for something like the picture below: the more data points, the more transparent the marker.
To answer the question: You can calculate the density of points, normalize it and encode it in the alpha channel of a colormap.
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt
from matplotlib.colors import LinearSegmentedColormap
# this is synthetic example
N = 10000 # a very very large number
x = np.random.normal(0, 1, N)
y = np.random.normal(0, 1, N)
fig, (ax,ax2) = plt.subplots(ncols=2, figsize=(8,5))
ax.scatter(x, y, marker='.', alpha=0.1)
values = np.vstack([x,y])
kernel = stats.gaussian_kde(values)
weights = kernel(values)
weights = weights/weights.max()
cols = plt.cm.Blues([0.8, 0.5])
cols[:,3] = [1., 0.005]
cmap = LinearSegmentedColormap.from_list("", cols)
ax2.scatter(x, y, c=weights, s = 1, marker='.', cmap=cmap)
plt.show()
Left is the original image; right is the image where higher-density points have a lower alpha.
Note, however, that this is undesirable, because transparent high-density points are indistinguishable from low-density regions. That is, in the right image it really looks as though you have a hole in the middle of your distribution.
Clearly, a solution with a colormap that does not contain the color of the background is a lot less confusing to the reader.
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt
# this is synthetic example
N = 10000 # a very very large number
x = np.random.normal(0, 1, N)
y = np.random.normal(0, 1, N)
fig, ax = plt.subplots(figsize=(5,5))
values = np.vstack([x,y])
kernel = stats.gaussian_kde(values)
weights = kernel(values)
weights = weights/weights.max()
ax.scatter(x, y, c = weights, s=9, edgecolor="none", marker='.', cmap="magma")
plt.show()
Here, low-density points are still emphasized by a darker color, but at the same time it's clear to the viewer that the highest density lies in the middle.
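The density can also be written straight into a per-point alpha value instead of going through a colormap. The sketch below is only an illustration of that variant: the floor min_alpha, the smaller N, and the base color are my own assumptions, and the same caveat applies that very transparent dense regions can look empty.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption); drop this line for interactive use
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt
from matplotlib.colors import to_rgba

rng = np.random.default_rng(0)
N = 2000  # smaller than in the question to keep the KDE fast
x = rng.normal(0, 1, N)
y = rng.normal(0, 1, N)

values = np.vstack([x, y])
density = stats.gaussian_kde(values)(values)
density = density / density.max()   # normalize to (0, 1]

min_alpha = 0.05                     # assumed floor so dense points never vanish
colors = np.tile(to_rgba("tab:blue"), (N, 1))          # one RGBA row per point
colors[:, 3] = min_alpha + (1 - min_alpha) * (1 - density)

fig, ax = plt.subplots(figsize=(5, 5))
ax.scatter(x, y, s=9, marker=".", c=colors, edgecolors="none")
```

Raising min_alpha keeps the dense core visible, which reduces the "hole in the middle" effect at the cost of less contrast for the outliers.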
As far as I know, there is no "direct" solution to this quite interesting problem. As a workaround, I propose this solution:
import numpy as np
import matplotlib.pyplot as plt
N = 10000 # a very very large number
x = np.random.normal(0, 1, N)
y = np.random.normal(0, 1, N)
fig = plt.figure() # create figure directly to be able to extract the bg color
ax = fig.gca()
ax.scatter(x, y, marker='.') # plot all markers without alpha
bgcolor = ax.get_facecolor() # extract current background color
# plot with alpha, "overwriting" dense points
ax.scatter(x, y, marker='.', color=bgcolor, alpha=0.2)
This plots all points without transparency and then plots all points again in the background color with some transparency, "overwriting" the points in the densest regions the most. A higher alpha value puts more emphasis on the outliers, and a lower one less.
Of course, the color of the second scatter plot needs to match your background color. In my example this is done by extracting the background color and setting it as the second scatter plot's color.
This solution is independent of the kind of distribution; it depends only on the density of the points. However, it draws twice as many points, so it may take slightly longer to render.
Reproducing the edit in the question, my solution shows exactly the desired behavior. The leftmost marker is a single point and is the darkest; the rightmost consists of three points and has the lightest color.
x = [0, 1, 1, 2, 2, 2]
y = [0, 0, 0, 0, 0, 0]
fig = plt.figure() # create figure directly to be able to extract the bg color
ax = fig.gca()
ax.scatter(x, y, marker='.', s=10000) # plot all markers without alpha
bgcolor = ax.get_facecolor() # extract current background color
# plot with alpha, "overwriting" dense points
ax.scatter(x, y, marker='.', color=bgcolor, alpha=0.2, s=10000)
Assuming that the distributions are centered around a specific point (e.g. (0,0) in this case), I would use this:
import numpy as np
import matplotlib.pyplot as plt
N = 500
# 0 mean, 0.2 std
x = np.random.normal(0,0.2,N)
y = np.random.normal(0,0.2,N)
# calculate the distance to (0, 0).
color = np.sqrt((x-0)**2 + (y-0)**2)
plt.scatter(x , y, c=color, cmap='plasma', alpha=0.7)
plt.show()
Results:
I don't know if this helps, because it's not exactly what you asked for, but you can simply color the points whose values exceed some threshold. For example:
import numpy as np
import matplotlib.pyplot as plt
num = 100
threshold = 80
x = np.linspace(0, 100, num=num)
y = np.random.normal(size=num)*45
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
ax.scatter(x[np.abs(y) < threshold], y[np.abs(y) < threshold], color="#00FFAA")
ax.scatter(x[np.abs(y) >= threshold], y[np.abs(y) >= threshold], color="#AA00FF")
plt.show()
Since the complete simulation is too big to post here, only the code to plot the spectrum is given (I think this is enough).
import pylab
d = i.sum(axis=2)  # i is the simulation output (not shown)
pylab.figure(figsize=(15,15))
pylab.imshow(d)
pylab.axis('tight')
pylab.show()
This spectrum is given in pixels, but I would like to have it in units of length. I hope you can give me some advice.
Do you mean that you want axis ticks to show your custom dimensions instead of the number of pixels in d? If yes, use the extent keyword of imshow:
import numpy
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
d = numpy.random.normal(size=(20, 40))
fig = plt.figure()
s = fig.add_subplot(1, 1, 1)
s.imshow(d, extent=(0, 1, 0, 0.5), interpolation='none')
fig.tight_layout()
fig.savefig('tt.png')
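When the physical size of a pixel is known, the extent can be computed from the array shape rather than typed in by hand. In the sketch below the pixel sizes dx and dy are hypothetical values, and d is random stand-in data rather than the actual spectrum.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, as in the example above
import numpy as np
import matplotlib.pyplot as plt

d = np.random.normal(size=(20, 40))  # stand-in for the spectrum (rows, columns)
dx, dy = 0.025, 0.025                # hypothetical pixel size in length units

n_rows, n_cols = d.shape
# extent is (left, right, bottom, top) in data coordinates
extent = (0, n_cols * dx, 0, n_rows * dy)

fig, ax = plt.subplots()
ax.imshow(d, extent=extent, origin="lower", interpolation="none")
ax.set_xlabel("x (length units)")
ax.set_ylabel("y (length units)")
```

With origin="lower", row 0 of d is drawn at the bottom, so the y tick values increase upwards together with the row index.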
I'm guessing a bit at what your problem is, so let's start by stating my interpretation. You have some 2D data d that you plot using imshow, and the units on the x and y axes are the number of pixels. For example, in the following we see the x axis labelled from 0 -> 10 for the number of data points:
import numpy as np
import matplotlib.pyplot as plt
# Generate a fake d
x = np.linspace(-1, 1, 10)
y = np.linspace(-1, 1, 10)
X, Y = np.meshgrid(x, y)
d = np.sin(X**2 + Y**2)
plt.imshow(d)
If this correctly describes your issue, then the solution is to avoid using imshow, which is designed to plot images. First, this helps because imshow attempts to interpolate to give a smoother image (which may hide features in the spectrum); second, because it is an image, there is no meaningful x and y data, so none is plotted.
The best alternative would be to use plt.pcolormesh, which generates a pseudocolor plot of a 2D array and takes as arguments X and Y, which are both 2D arrays of the points to which the values of d correspond.
For example:
# Generate a fake d
x = np.linspace(-1, 1, 10)
y = np.linspace(-1, 1, 10)
X, Y = np.meshgrid(x, y)
d = np.sin(X**2 + Y**2)
plt.pcolormesh(X, Y, d)
Now the x and y values correspond to the values of X and Y.