Plotting two different arrays of different lengths

Plotting two different arrays of different lengths - python

I have two arrays. One is the raw signal of length (1000, ) and the other one is the smooth signal of length (100,). I want to visually represent how the smooth signal represents the raw signal. Since these arrays are of different length, I am not able to plot them one over the other. Is there a way to do so in matplotlib?
Thanks!

As rth suggested, define
x1 = np.linspace(0, 1, 1000)
x2 = np.linspace(0, 1, 100)
and then plot raw versus x1, and smooth versus x2:
plt.plot(x1, raw)
plt.plot(x2, smooth)
np.linspace(0, 1, N) returns an array of length N with equally spaced values from 0 to 1 (inclusive).
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(2015)
raw = (np.random.random(1000) - 0.5).cumsum()
smooth = raw.reshape(-1,10).mean(axis=1)
x1 = np.linspace(0, 1, 1000)
x2 = np.linspace(0, 1, 100)
plt.plot(x1, raw)
plt.plot(x2, smooth)
plt.show()
yields

You will need two different x-axes for this job. You cannot plot two variables with different lengths in one single plot.
import matplotlib.pyplot as plt
import numpy as np
y = np.random.random(100) # the smooth signal
x = np.linspace(0,100,100) # it's x-axis
y1 = np.random.random(1000) # the raw signal
x1 = np.linspace(0,100,1000) # it's x-axis
fig = plt.figure()
ax = fig.add_subplot(121)
ax.plot(x,y,label='smooth-signal')
ax.legend(loc='best')
ax2 = fig.add_subplot(122)
ax2.plot(x1,y1,label='raw-signal')
ax2.legend(loc='best')
plt.suptitle('Smooth-vs-raw signal')
fig.show()

Related

How to convert a matrix to heatmap image in torch [duplicate]

Using Matplotlib, I want to plot a 2D heat map. My data is an n-by-n Numpy array, each with a value between 0 and 1. So for the (i, j) element of this array, I want to plot a square at the (i, j) coordinate in my heat map, whose color is proportional to the element's value in the array.
How can I do this?

The imshow() function with parameters interpolation='nearest' and cmap='hot' should do what you want.
Please review the interpolation parameter details, and see Interpolations for imshow and Image antialiasing.
import matplotlib.pyplot as plt
import numpy as np
a = np.random.random((16, 16))
plt.imshow(a, cmap='hot', interpolation='nearest')
plt.show()

Seaborn is a high-level API for matplotlib, which takes care of a lot of the manual work.
seaborn.heatmap automatically plots a gradient at the side of the chart etc.
import numpy as np
import seaborn as sns
import matplotlib.pylab as plt
uniform_data = np.random.rand(10, 12)
ax = sns.heatmap(uniform_data, linewidth=0.5)
plt.show()
You can even plot upper / lower left / right triangles of square matrices. For example, a correlation matrix, which is square and is symmetric, so plotting all values would be redundant.
corr = np.corrcoef(np.random.randn(10, 200))
mask = np.zeros_like(corr)
mask[np.triu_indices_from(mask)] = True
with sns.axes_style("white"):
ax = sns.heatmap(corr, mask=mask, vmax=.3, square=True, cmap="YlGnBu")
plt.show()

I would use matplotlib's pcolor/pcolormesh function since it allows nonuniform spacing of the data.
Example taken from matplotlib:
import matplotlib.pyplot as plt
import numpy as np
# generate 2 2d grids for the x & y bounds
y, x = np.meshgrid(np.linspace(-3, 3, 100), np.linspace(-3, 3, 100))
z = (1 - x / 2. + x ** 5 + y ** 3) * np.exp(-x ** 2 - y ** 2)
# x and y are bounds, so z should be the value *inside* those bounds.
# Therefore, remove the last value from the z array.
z = z[:-1, :-1]
z_min, z_max = -np.abs(z).max(), np.abs(z).max()
fig, ax = plt.subplots()
c = ax.pcolormesh(x, y, z, cmap='RdBu', vmin=z_min, vmax=z_max)
ax.set_title('pcolormesh')
# set the limits of the plot to the limits of the data
ax.axis([x.min(), x.max(), y.min(), y.max()])
fig.colorbar(c, ax=ax)
plt.show()

For a 2d numpy array, simply use imshow() may help you:
import matplotlib.pyplot as plt
import numpy as np
def heatmap2d(arr: np.ndarray):
plt.imshow(arr, cmap='viridis')
plt.colorbar()
plt.show()
test_array = np.arange(100 * 100).reshape(100, 100)
heatmap2d(test_array)
This code produces a continuous heatmap.
You can choose another built-in colormap from here.

Here's how to do it from a csv:
import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import griddata
# Load data from CSV
dat = np.genfromtxt('dat.xyz', delimiter=' ',skip_header=0)
X_dat = dat[:,0]
Y_dat = dat[:,1]
Z_dat = dat[:,2]
# Convert from pandas dataframes to numpy arrays
X, Y, Z, = np.array([]), np.array([]), np.array([])
for i in range(len(X_dat)):
X = np.append(X, X_dat[i])
Y = np.append(Y, Y_dat[i])
Z = np.append(Z, Z_dat[i])
# create x-y points to be used in heatmap
xi = np.linspace(X.min(), X.max(), 1000)
yi = np.linspace(Y.min(), Y.max(), 1000)
# Interpolate for plotting
zi = griddata((X, Y), Z, (xi[None,:], yi[:,None]), method='cubic')
# I control the range of my colorbar by removing data
# outside of my range of interest
zmin = 3
zmax = 12
zi[(zi<zmin) | (zi>zmax)] = None
# Create the contour plot
CS = plt.contourf(xi, yi, zi, 15, cmap=plt.cm.rainbow,
vmax=zmax, vmin=zmin)
plt.colorbar()
plt.show()
where dat.xyz is in the form
x1 y1 z1
x2 y2 z2
...

Use matshow() which is a wrapper around imshow to set useful defaults for displaying a matrix.
a = np.diag(range(15))
plt.matshow(a)
https://matplotlib.org/stable/api/_as_gen/matplotlib.axes.Axes.matshow.html
This is just a convenience function wrapping imshow to set useful defaults for displaying a matrix. In particular:
Set origin='upper'.
Set interpolation='nearest'.
Set aspect='equal'.
Ticks are placed to the left and above.
Ticks are formatted to show integer indices.

Here is a new python package to plot complex heatmaps with different kinds of row/columns annotations in Python: https://github.com/DingWB/PyComplexHeatmap

Fill between subplots with matplotlib cmap

I have 2 line plots on the same figure, plotted from pandas dataframes.
I want to fill between them with a gradient/colour map of sorts.
I understand I can do this with a cmap, only it will not work for me (see code below).
General example I found are filling between x axis and line, i do not want that and also i am interested in simplest solution possible for this as i am a begginer at this and complicated, though maybe better code will just make it more confusing honestly.
Code for which fill is plain blue:
import matplotlib.pyplot as plt
import pandas as pd
ax = plt.gca()
df0.plot(kind='line', x='something', y='other', color='orange', ax=ax, legend=False, figsize=(20,10))
df1.plot(kind='line', x='something', y='other2', color='c', ax=ax, legend=False, figsize=(20,10))
ax.fill_between(x=df0['daysInAYear'], y1=df0['other'], y2 = df1['other2'], alpha=0.2, cmap=plt.cm.get_cmap("winter"))
plt.show()
EDIT/UPDATE: DATA EXAMPLE
other is ALWAYS >= other2
other other2 something (same for both)
15.6 -16.0 1
13.9 -26.7 2
13.3 -26.7 3
10.6 -26.1 4
12.8 -15.0 5
Final graph example:
I would like the fill to go from orange on top to blue at the bottom

Edit
In response to the edited question, here is an alternative approach which does the gradient vertically but doesn't use imshow.
import matplotlib.pyplot as plt
from matplotlib import colors, patches
import numpy as np
import pandas as pd
n = 100
nc = 100
x = np.linspace(0, np.pi*5, n)
y1 = [-50.0]
y2 = [50.0]
for ii in range(1, n):
y1.append(y1[ii-1] + (np.random.random()-0.3)*3)
y2.append(y2[ii-1] + (np.random.random()-0.5)*3)
y1 = np.array(y1)
y2 = np.array(y2)
z = np.linspace(0, 10, nc)
normalize = colors.Normalize(vmin=z.min(), vmax=z.max())
cmap = plt.cm.get_cmap('winter')
fig, ax = plt.subplots(1)
for ii in range(len(df['x'].values)-1):
y = np.linspace(y1[ii], y2[ii], nc)
yn = np.linspace(y1[ii+1], y2[ii+1], nc)
for kk in range(nc - 1):
p = patches.Polygon([[x[ii], y[kk]],
[x[ii+1], yn[kk]],
[x[ii+1], yn[kk+1]],
[x[ii], y[kk+1]]], color=cmap(normalize(z[kk])))
ax.add_patch(p)
plt.plot(x, y1, 'k-', lw=1)
plt.plot(x, y2, 'k-', lw=1)
plt.show()
The idea here being similar to that in my original answer, except the trapezoids are divided into nc pieces and each piece is colored separately. This has the advantage of scaling correctly for varying y1[ii], y2[ii] distances, as shown in this comparison,
It does, however, have the disadvantages of being much, much slower than imshow or the horizontal gradient method and of being unable to handle 'crossing' correctly.
The code to generate the second image in the above comparison:
import matplotlib.pyplot as plt
import numpy as np
from matplotlib import patches
from matplotlib.path import Path
x = np.linspace(0, 10, n)
y1 = [-50.0]
y2 = [50.0]
for ii in range(1, n):
y1.append(y1[ii-1] + (np.random.random()-0.2)*3)
y2.append(y2[ii-1] + (np.random.random()-0.5)*3)
y1 = np.array(y1)
y2 = np.array(y2)
verts = np.vstack([np.stack([x, y1], 1), np.stack([np.flip(x), np.flip(y2)], 1)])
path = Path(verts)
patch = patches.PathPatch(path, facecolor='k', lw=2, alpha=0.0)
plt.gca().add_patch(patch)
plt.imshow(np.arange(10).reshape(10,-1), cmap=plt.cm.winter, interpolation="bicubic",
origin='upper', extent=[0,10,-60,60], aspect='auto', clip_path=patch,
clip_on=True)
plt.show()
Original
This is a bit of a hack, partly based on the answers in this question. It does seem to work fairly well but works best with higher density along the x axis. The idea is to call fill_between separately for each trapezoid corresponding to x pairs, [x[ii], x[ii+1]]. Here is a complete example using some generated data
import matplotlib.pyplot as plt
from matplotlib import colors
import numpy as np
import pandas as pd
n = 1000
X = np.linspace(0, np.pi*5, n)
Y1 = np.sin(X)
Y2 = np.cos(X)
Z = np.linspace(0, 10, n)
normalize = colors.Normalize(vmin=Z.min(), vmax=Z.max())
cmap = plt.cm.get_cmap('winter')
df = pd.DataFrame({'x': X, 'y1': Y1, 'y2': Y2, 'z': Z})
x = df['x'].values
y1 = df['y1'].values
y2 = df['y2'].values
z = df['z'].values
for ii in range(len(df['x'].values)-1):
plt.fill_between([x[ii], x[ii+1]], [y1[ii], y1[ii+1]],
[y2[ii], y2[ii+1]], color=cmap(normalize(z[ii])))
plt.plot(x, y1, 'k-', x, y2, 'k-')
plt.show()
This can be generalized to a 2 dimensional color grid but would require non-trivial modification

How to avoid overlapping error bars in matplotlib?

I want to create a plot for two different datasets similar to the one presented in this answer:
In the above image, the author managed to fix the overlapping problem of the error bars by adding some small random scatter in x to the new dataset.
In my problem, I must plot a similar graphic, but having some categorical data in the x axis:
Any ideas on how to slightly move one the error bars of the second dataset using categorical variables at the x axis? I want to avoid the overlapping between the bars for making the visualization easier.

You can translate each errorbar by adding the default data transform to a prior translation in data space. This is possible when knowing that categories are in general one data unit away from each other.
import numpy as np; np.random.seed(42)
import matplotlib.pyplot as plt
from matplotlib.transforms import Affine2D
x = list("ABCDEF")
y1, y2 = np.random.randn(2, len(x))
yerr1, yerr2 = np.random.rand(2, len(x))*4+0.3
fig, ax = plt.subplots()
trans1 = Affine2D().translate(-0.1, 0.0) + ax.transData
trans2 = Affine2D().translate(+0.1, 0.0) + ax.transData
er1 = ax.errorbar(x, y1, yerr=yerr1, marker="o", linestyle="none", transform=trans1)
er2 = ax.errorbar(x, y2, yerr=yerr2, marker="o", linestyle="none", transform=trans2)
plt.show()
Alternatively, you could translate the errorbars after applying the data transform and hence move them in units of points.
import numpy as np; np.random.seed(42)
import matplotlib.pyplot as plt
from matplotlib.transforms import ScaledTranslation
x = list("ABCDEF")
y1, y2 = np.random.randn(2, len(x))
yerr1, yerr2 = np.random.rand(2, len(x))*4+0.3
fig, ax = plt.subplots()
trans1 = ax.transData + ScaledTranslation(-5/72, 0, fig.dpi_scale_trans)
trans2 = ax.transData + ScaledTranslation(+5/72, 0, fig.dpi_scale_trans)
er1 = ax.errorbar(x, y1, yerr=yerr1, marker="o", linestyle="none", transform=trans1)
er2 = ax.errorbar(x, y2, yerr=yerr2, marker="o", linestyle="none", transform=trans2)
plt.show()
While results look similar in both cases, they are fundamentally different. You will observe this difference when interactively zooming the axes or changing the figure size.

Consider the following approach to highlight plots - combination of errorbar and fill_between with non-zero transparency:
import random
import matplotlib.pyplot as plt
# create sample data
N = 8
data_1 = {
'x': list(range(N)),
'y': [10. + random.random() for dummy in range(N)],
'yerr': [.25 + random.random() for dummy in range(N)]}
data_2 = {
'x': list(range(N)),
'y': [10.25 + .5 * random.random() for dummy in range(N)],
'yerr': [.5 * random.random() for dummy in range(N)]}
# plot
plt.figure()
# only errorbar
plt.subplot(211)
for data in [data_1, data_2]:
plt.errorbar(**data, fmt='o')
# errorbar + fill_between
plt.subplot(212)
for data in [data_1, data_2]:
plt.errorbar(**data, alpha=.75, fmt=':', capsize=3, capthick=1)
data = {
'x': data['x'],
'y1': [y - e for y, e in zip(data['y'], data['yerr'])],
'y2': [y + e for y, e in zip(data['y'], data['yerr'])]}
plt.fill_between(**data, alpha=.25)
Result:

Threre is example on lib site: https://matplotlib.org/stable/gallery/lines_bars_and_markers/errorbar_subsample.html
enter image description here
You need parameter errorevery=(m, n),
n - how often plot error lines, m - shift with range from 0 to n

How to automatically set the scale for x-axis to be equal for all subplots in matplotlib?

In this example:
import numpy as np
import matplotlib.pyplot as plt
t1 = np.linspace(0, 1, 1000)
t2 = np.linspace(0, 0.5, 1000)
plt.figure(figsize=(10,5))
plt.subplot(121)
plt.plot(t1, np.sin(t1 * np.pi))
plt.subplot(122)
plt.plot(t2, np.sin(t2 * np.pi))
plt.show()
How can I squeeze the size of the second plot so that the x-axis for both subplots would have the same scale?, so it would look something like this:
I am looking for a simple and automatic way to do this because I have more than 30 subplots and would want them all have the same x-axis scale.

You could approximate the same unit length on both x-axes by specifying the gridspec_kw parameter that defines the ratio of the subplots.
import numpy as np
from matplotlib import pyplot as plt
t1 = np.linspace(0, 1, 1000)
t2 = np.linspace(0, 0.5, 1000)
fig, (ax1, ax2) = plt.subplots(1, 2, gridspec_kw = {"width_ratios": [np.max(t1)-np.min(t1), np.max(t2)-np.min(t2)]})
ax1.plot(t1, np.sin(t1 * np.pi))
ax2.plot(t2, np.sin(t2 * np.pi))
plt.show()
Sample output:

You can use plt.xlim(xmin, xmax) to set the domain of the graph. Using plt.xlim() without giving it parameters returns the current xmin/xmax of the plot. The same applies for plt.ylim().

A presumably not very proper way of doing so but in my opinion useful for a work around would be the use of subplot2grid:
plt.subplot2grid((ROWS, COLS), (START_ROW, START_COL), rowspan=ROWSPAN, colspan=COLSPAN)
using this, you could create two subplots which add up to the desired length and passing the colspan accordingly to the length of your x axis like for example:
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0, 5)
y = np.linspace(0, 10)
plt.figure(figsize=(10,5)) # as specified from your code
# x.max() + y.max() is the total length of your x axis
# this can then be split in the parts of the length x.max() and y.max()
# the parts then should have the correct aspect ratio
ax1 = plt.subplot2grid((1, int(x.max()+y.max()), (0, 0), colspan=int(x.max()))
ax2 = plt.subplot2grid((1, int(x.max()+y.max()), (0, int(x.max())), colspan=int(y.max()))
ax1.plot(x, np.sin(x))
ax2.plot(y, np.sin(y))
plt.show()
The scales seem same for me, you would still have to adjust the xticklabels in case those are supposed to be same as well

You can achieve it by changing the aspect ratio:
import numpy as np
import matplotlib.pyplot as plt
t1 = np.linspace(0, 1, 1000)
t2 = np.linspace(0, 0.5, 1000)
plt.figure(figsize=(10,5))
fig,ax = plt.subplots(nrows = 1,ncols = 2)
ax[0].plot(t1, np.sin(t1 * np.pi))
x1,x2 =ax[1].get_xlim()
x_diff = x2-x1
y1,y2 = ax[1].get_ylim()
y_diff = y2-y1
#plt.subplot(122)
ax[1].plot(t2, np.sin(t2 * np.pi))
ax[1].set_aspect(y_diff/x_diff)
Output:

How to draw a line with matplotlib?

I cannot find a way to draw an arbitrary line with matplotlib Python library. It allows to draw horizontal and vertical lines (with matplotlib.pyplot.axhline and matplotlib.pyplot.axvline, for example), but i do not see how to draw a line through two given points (x1, y1) and (x2, y2). Is there a way? Is there a simple way?

This will draw a line that passes through the points (-1, 1) and (12, 4), and another one that passes through the points (1, 3) et (10, 2)
x1 are the x coordinates of the points for the first line, y1 are the y coordinates for the same -- the elements in x1 and y1 must be in sequence.
x2 and y2 are the same for the other line.
import matplotlib.pyplot as plt
x1, y1 = [-1, 12], [1, 4]
x2, y2 = [1, 10], [3, 2]
plt.plot(x1, y1, x2, y2, marker = 'o')
plt.show()
I suggest you spend some time reading / studying the basic tutorials found on the very rich matplotlib website to familiarize yourself with the library.
What if I don't want line segments?
[edit]:
As shown by #thomaskeefe, starting with matplotlib 3.3, this is now builtin as a convenience: plt.axline((x1, y1), (x2, y2)), rendering the following obsolete.
There are no direct ways to have lines extend to infinity... matplotlib will either resize/rescale the plot so that the furthest point will be on the boundary and the other inside, drawing line segments in effect; or you must choose points outside of the boundary of the surface you want to set visible, and set limits for the x and y axis.
As follows:
import matplotlib.pyplot as plt
x1, y1 = [-1, 12], [1, 10]
x2, y2 = [-1, 10], [3, -1]
plt.xlim(0, 8), plt.ylim(-2, 8)
plt.plot(x1, y1, x2, y2, marker = 'o')
plt.show()

As of matplotlib 3.3, you can do this with plt.axline((x1, y1), (x2, y2)).

I was checking how ax.axvline does work, and I've written a small function that resembles part of its idea:
import matplotlib.pyplot as plt
import matplotlib.lines as mlines
def newline(p1, p2):
ax = plt.gca()
xmin, xmax = ax.get_xbound()
if(p2[0] == p1[0]):
xmin = xmax = p1[0]
ymin, ymax = ax.get_ybound()
else:
ymax = p1[1]+(p2[1]-p1[1])/(p2[0]-p1[0])*(xmax-p1[0])
ymin = p1[1]+(p2[1]-p1[1])/(p2[0]-p1[0])*(xmin-p1[0])
l = mlines.Line2D([xmin,xmax], [ymin,ymax])
ax.add_line(l)
return l
So, if you run the following code you will realize how does it work. The line will span the full range of your plot (independently on how big it is), and the creation of the line doesn't rely on any data point within the axis, but only in two fixed points that you need to specify.
import numpy as np
x = np.linspace(0,10)
y = x**2
p1 = [1,20]
p2 = [6,70]
plt.plot(x, y)
newline(p1,p2)
plt.show()

Just want to mention another option here.
You can compute the coefficients using numpy.polyfit(), and feed the coefficients to numpy.poly1d(). This function can construct polynomials using the coefficients, you can find more examples here
https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.poly1d.html
Let's say, given two data points (-0.3, -0.5) and (0.8, 0.8)
import numpy as np
import matplotlib.pyplot as plt
# compute coefficients
coefficients = np.polyfit([-0.3, 0.8], [-0.5, 0.8], 1)
# create a polynomial object with the coefficients
polynomial = np.poly1d(coefficients)
# for the line to extend beyond the two points,
# create the linespace using the min and max of the x_lim
# I'm using -1 and 1 here
x_axis = np.linspace(-1, 1)
# compute the y for each x using the polynomial
y_axis = polynomial(x_axis)
fig = plt.figure()
axes = fig.add_axes([0.1, 0.1, 1, 1])
axes.set_xlim(-1, 1)
axes.set_ylim(-1, 1)
axes.plot(x_axis, y_axis)
axes.plot(-0.3, -0.5, 0.8, 0.8, marker='o', color='red')
Hope it helps.

In case somebody lands here trying to plot many segments in one go, here is a way. Say the segments are defined by two 2-d arrays of same length, e.g. a and b. We want to plot segments between each a[i] and b[i]. In that case:
Solution 1
ab_pairs = np.c_[a, b]
plt_args = ab_pairs.reshape(-1, 2, 2).swapaxes(1, 2).reshape(-1, 2)
ax.plot(*plt_args, ...)
Example:
np.random.seed(0)
n = 32
a = np.random.uniform(0, 1, (n, 2))
b = np.random.uniform(0, 1, (n, 2))
fig, ax = plt.subplots(figsize=(3, 3))
ab_pairs = np.c_[a, b]
ab_args = ab_pairs.reshape(-1, 2, 2).swapaxes(1, 2).reshape(-1, 2)
# segments
ax.plot(*ab_args, c='k')
# identify points: a in blue, b in red
ax.plot(*a.T, 'bo')
ax.plot(*b.T, 'ro')
plt.show()
Solution 2
The above creates many matplotlib.lines.Line2D. If you'd like a single line, we can do it by interleaving NaN between pairs:
ax.plot(*np.c_[a, b, a*np.nan].reshape(-1, 2).T, ...)
Example:
# same init as example above, then
fig, ax = plt.subplots(figsize=(3, 3))
# segments (all at once)
ax.plot(*np.c_[a, b, a*np.nan].reshape(-1, 2).T, 'k')
# identify points: a in blue, b in red
ax.plot(*a.T, 'bo')
ax.plot(*b.T, 'ro')
plt.show()
(Same figure as above).

Based on #Alejandro's answer:
if you want to add a line to an existing Axes (e.g. a scatter plot), and
all you know is the slope and intercept of the desired line (e.g. a regression line), and
you want it to cover the entire visible X range (already computed), and
you want to use the object-oriented interface (not pyplot).
Then you can do this (existing Axes in ax):
# e.g. slope, intercept, r_value, p_value, std_err = scipy.stats.linregress(xs, ys)
xmin, xmax = ax.get_xbound()
ymin = (xmin * slope) + intercept
ymax = (xmax * slope) + intercept
l = matplotlib.lines.Line2D([xmin, xmax], [ymin, ymax])
ax.add_line(l)

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

Plotting two different arrays of different lengths - python

Related

How to convert a matrix to heatmap image in torch [duplicate]

Fill between subplots with matplotlib cmap

How to avoid overlapping error bars in matplotlib?

How to automatically set the scale for x-axis to be equal for all subplots in matplotlib?

How to draw a line with matplotlib?

Categories

Resources