I would like my subplots to be generated in 2 columns and 5 rows.
I've also tried adding ncols=2, nrows=5 to the code, but that didn't work.
When I change plt.subplots(10,1) to plt.subplots(5,2), I get this error (see the attached picture of the code and error message):
AttributeError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_9844/709244097.py in
13
14 for ax, afstand, tid in zip(ax, afstande, tider):
---> 15 ax.plot(tid, afstand)
16 ax.set_title("x(t)", fontsize=12)
17 ax.set_xlabel("tid (s)", fontsize=12)
AttributeError: 'numpy.ndarray' object has no attribute 'plot'
My code:
from scipy.optimize import fmin
a = -75.64766759
b = 68.02691163
f = lambda x: a * x + b
afstand1, afstand2, afstand3, afstand4, afstand5, afstand6, afstand7, afstand8, afstand9, afstand10 = f(U1), f(U2), f(U3), f(U4), f(U5),f(U6), f(U7), f(U8), f(U9), f(U10)
afstande = [afstand1, afstand2, afstand3, afstand4, afstand5, afstand6, afstand7, afstand8, afstand9, afstand10]
tider = [tid1, tid2, tid3, tid4, tid5, tid6, tid7, tid8, tid9, tid10]
fig, ax = plt.subplots(10,1, figsize=(7,25))
plt.subplots_adjust(hspace=0.55)
# loop
for ax, afstand, tid in zip(ax, afstande, tider):
    ax.plot(tid, afstand)
    ax.set_title("x(t)", fontsize=12)
    ax.set_xlabel("tid (s)", fontsize=12)
    ax.set_ylabel("Position", fontsize=12)
First of all, you're using the same variable name for the array of axes and for the loop variable; you should change that. Subplot axes are stored in NumPy arrays. If you have only one row or one column, looping over the array gives you the individual axes, but with an x*y grid of subplots you loop over a two-dimensional array, which yields the rows. Each row is itself an array, hence the AttributeError. You can solve that by using .flat to get a one-dimensional iterator over all axes.
fig, axs = plt.subplots(ncols=2, nrows=5)
for ax in axs.flat:
    ax.plot(...)
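A fuller sketch of the same idea, adapted to the loop from the question. The contents of `tider` and `afstande` here are made-up placeholders (the asker's `tid1`..`tid10` and `U1`..`U10` are not shown), and the `Agg` backend is only selected so that the snippet also runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Placeholder data standing in for tid1..tid10 and afstand1..afstand10.
tider = [np.linspace(0, 1, 20) for _ in range(10)]
afstande = [(k + 1) * t for k, t in enumerate(tider)]

# 5 rows x 2 columns; axs is a 2-D array of shape (5, 2).
fig, axs = plt.subplots(nrows=5, ncols=2, figsize=(10, 20))
fig.subplots_adjust(hspace=0.55)

# axs.flat walks the grid row by row and yields one Axes at a time.
# The loop variable is called ax, so the array axs is not overwritten.
for ax, afstand, tid in zip(axs.flat, afstande, tider):
    ax.plot(tid, afstand)
    ax.set_title("x(t)", fontsize=12)
    ax.set_xlabel("tid (s)", fontsize=12)
    ax.set_ylabel("Position", fontsize=12)
```

Because `zip` stops at the shortest input, a grid with more cells than data sets simply leaves the extra axes empty.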
I'm trying to display the topic extraction results of an LDA text analysis across several data sets in the form of a matplotlib subplot.
Here's where I'm at:
I think my issue is my unfamiliarity with matplotlib. I have done all my number crunching ahead of time so that I can focus on how to plot the data:
top_words_master = []
top_weights_master = []
for i in range(len(tf_list)):
    tf = tf_vectorizer.fit_transform(tf_list[i])
    lda.fit(tf)
    n_top_words = 20
    tf_feature_names = tf_vectorizer.get_feature_names_out()
    top_features_ind = lda.components_[0].argsort()[: -n_top_words - 1 : -1]
    top_features = [tf_feature_names[i] for i in top_features_ind]
    weights = lda.components_[0][top_features_ind]
    top_words_master.append(top_features)
    top_weights_master.append(weights)
This gives me my words and my weights (the x axis values) to make my sub-plot matrix of row/bar charts.
My attempt to construct this with matplotlib:
fig, axes = plt.subplots(2, 5, figsize=(30, 15), sharex=True)
plt.subplots_adjust(hspace=0.5)
fig.suptitle("Topics in LDA Model", fontsize=18, y=0.95)
axes = axes.flatten()
for i in range(len(tf_list)):
    ax = axes[i]
    ax.barh(top_words_master[i], top_weights_master[i], height=0.7)
    ax.set_title(topic_map[f"Topic {i +1}"], fontdict={"fontsize": 30})
    ax.invert_yaxis()
    ax.tick_params(axis="both", which="major", labelsize=20)
    for j in "top right left".split():
        ax.spines[j].set_visible(False)
fig.suptitle("Topics in LDA Model", fontsize=40)
plt.subplots_adjust(top=0.90, bottom=0.05, wspace=0.90, hspace=0.3)
plt.show()
However, it only showed one, the first one. For the remaining 6 data sets it just printed:
<Figure size 432x288 with 0 Axes> <Figure size 432x288 with 0 Axes> <Figure size 432x288 with 0 Axes> <Figure size 432x288 with 0 Axes> <Figure size 432x288 with 0 Axes>
Question
I've been at this for days. I feel I'm close, but this kind of result is really puzzling me, anyone have a solution or able to point me in the right direction?
As far as I understand your question, the problem is getting the right indices for your subplots.
In your case, you have range(len(tf_list)) to index your data, some data (e.g. top_words_master[i]) to plot, and a figure with 10 subplots (rows=2, cols=5). For example, to plot the 7th item (i=6) of your data, the matching subplot is axes[1,1].
To get the correct indices for the subplot axes, you can use numpy.unravel_index. With this approach, keep axes two-dimensional, i.e. do not flatten it.
import matplotlib.pyplot as plt
import numpy as np
# dummy function
my_func = lambda x: np.random.random(x)
x_max = 100
# fig properties
rows = 2
cols = 5
fig, axes = plt.subplots(rows,cols,figsize=(30, 15), sharex=True)
for i in range(rows*cols):
    ax_i = np.unravel_index(i,(rows,cols))
    axes[ax_i[0],ax_i[1]].barh(np.arange(x_max),my_func(x_max), height=0.7)
plt.show()
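For comparison, a small sketch showing that a flat index and the (row, column) pair from numpy.unravel_index address the same cell of a 2x5 grid. It uses an array of text labels as a stand-in for the axes array; the names `grid`, `via_unravel` and `via_flat` are only illustrative:

```python
import numpy as np

rows, cols = 2, 5
# Stand-in for the axes array: each cell holds its own (row, col) label.
grid = np.array([[f"ax[{r},{c}]" for c in range(cols)] for r in range(rows)])

i = 6  # the 7th data set
r, c = np.unravel_index(i, (rows, cols))  # flat index -> (row, col)

via_unravel = grid[r, c]   # two-dimensional indexing
via_flat = grid.flat[i]    # one-dimensional indexing, row by row
print(via_unravel, via_flat)  # ax[1,1] ax[1,1]
```

Both lookups follow row-major order, so either style works as long as it is applied consistently.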
You should create the figure first:
def top_word_comparison(axes, model, feature_names, n_top_words):
    for topic_idx, topic in enumerate(model.components_):
        top_features_ind = topic.argsort()[: -n_top_words - 1 : -1]
        top_features = [feature_names[i] for i in top_features_ind]
        weights = topic[top_features_ind]
        ax = axes[topic_idx]
        ax.barh(top_features, weights, height=0.7)
        ax.set_title(topic_map[f"Topic {topic_idx +1}"], fontdict={"fontsize": 30})
        ax.invert_yaxis()
        ax.tick_params(axis="both", which="major", labelsize=20)
        for i in "top right left".split():
            ax.spines[i].set_visible(False)
tf_list = [cm_array, xb_array]
fig, axes = plt.subplots(len(tf_list), 5, figsize=(30, 15), sharex=True)
fig.suptitle("Topics in LDA model", fontsize=40)
# range() needs an integer, so iterate over the indices of tf_list
for i in range(len(tf_list)):
    tf = tf_vectorizer.fit_transform(tf_list[i])
    n_components = 1
    lda.fit(tf)
    n_top_words = 20
    tf_feature_names = tf_vectorizer.get_feature_names_out()
    # axes[i] is the row of subplots reserved for data set i
    top_word_comparison(axes[i], lda, tf_feature_names, n_top_words)
plt.subplots_adjust(top=0.90, bottom=0.05, wspace=0.90, hspace=0.3)
plt.show()
Motivation:
I'm trying to visualize a dataset of many n-dimensional vectors (let's say i have 10k vectors with n=300 dimensions). What i'd like to do is calculate a histogram for each of the n dimensions and plot it as a single line in a bins*n heatmap.
So far i've got this:
import numpy as np
import matplotlib
from matplotlib import pyplot as plt
%matplotlib inline
import seaborn as sns
# sample data:
vectors = np.random.randn(10000, 300) + np.random.randn(300)
def ndhist(vectors, bins=500):
    limits = (vectors.min(), vectors.max())
    hists = []
    dims = vectors.shape[1]
    for dim in range(dims):
        h, bins = np.histogram(vectors[:, dim], bins=bins, range=limits)
        hists.append(h)
    hists = np.array(hists)
    fig = plt.figure(figsize=(16, 9))
    sns.heatmap(hists)
    axes = fig.gca()
    axes.set(ylabel='dimensions', xlabel='values')
    print(dims)
    print(limits)
ndhist(vectors)
This generates the following output:
300
(-6.538069472429366, 6.52159540162285)
Problem / Question:
How can i change the axes ticks?
for the y-axis i'd like to simply change this back to matplotlib's default, so it picks nice ticks like 0, 50, 100, ..., 250 (bonus points for 299 or 300)
for the x-axis i'd like to convert the shown bin indices into the bin (left) boundaries, then, as above, i'd like to change this back to matplotlib's default selection of some "nice" ticks like -5, -2.5, 0, 2.5, 5 (bonus points for also including the actual limits -6.538, 6.522)
Own solution attempts:
I've tried many things like the following already:
def ndhist_axlabels(vectors, bins=500):
    limits = (vectors.min(), vectors.max())
    hists = []
    dims = vectors.shape[1]
    for dim in range(dims):
        h, bins = np.histogram(vectors[:, dim], bins=bins, range=limits)
        hists.append(h)
    hists = np.array(hists)
    fig = plt.figure(figsize=(16, 9))
    sns.heatmap(hists, yticklabels=False, xticklabels=False)
    axes = fig.gca()
    axes.set(ylabel='dimensions', xlabel='values')
    #plt.xticks(np.linspace(*limits, len(bins)), bins)
    plt.xticks(range(len(bins)), bins)
    axes.xaxis.set_major_locator(matplotlib.ticker.AutoLocator())
    plt.yticks(range(dims+1), range(dims+1))
    axes.yaxis.set_major_locator(matplotlib.ticker.AutoLocator())
    print(dims)
    print(limits)
ndhist_axlabels(vectors)
As you can see however, the axes labels are pretty wrong. My guess is that the extent or limits are somewhere stored in the original axis, but lost when switching back to the AutoLocator. Would greatly appreciate a nudge in the right direction.
Maybe you're overthinking this. To plot image data, one can use imshow and get the ticking and formatting for free.
import numpy as np
from matplotlib import pyplot as plt
# sample data:
vectors = np.random.randn(10000, 300) + np.random.randn(300)
def ndhist(vectors, bins=500):
    limits = (vectors.min(), vectors.max())
    hists = []
    dims = vectors.shape[1]
    for dim in range(dims):
        h, _ = np.histogram(vectors[:, dim], bins=bins, range=limits)
        hists.append(h)
    hists = np.array(hists)
    fig, ax = plt.subplots(figsize=(16, 9))
    extent = [limits[0], limits[-1], hists.shape[0]-0.5, -0.5]
    im = ax.imshow(hists, extent=extent, aspect="auto")
    fig.colorbar(im)
    ax.set(ylabel='dimensions', xlabel='values')
ndhist(vectors)
plt.show()
If you read the docs, you will notice that the xticklabels/yticklabels arguments are overloaded: if you provide an integer instead of a list of labels, it is interpreted as a tick interval, and only every n-th label is drawn. So in your case, seaborn.heatmap(hists, yticklabels=50) fixes your y-axis problem.
Regarding your xtick labels, I would simply provide them explicitly:
xtickevery = 50
xticklabels = ['{:.1f}'.format(b) if ii%xtickevery == 0 else '' for ii, b in enumerate(bins)]
sns.heatmap(hists, yticklabels=50, xticklabels=xticklabels)
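The label list can be built and inspected without plotting anything. This is a sketch under assumed values: `limits` and `n_bins` are made up, and `edges` mimics the bin edges that np.histogram returns for an integer bin count and a fixed range (one more edge than there are bins):

```python
import numpy as np

n_bins = 500
limits = (-6.5, 6.5)  # made-up value range
edges = np.linspace(limits[0], limits[1], n_bins + 1)  # n_bins + 1 edges
left_edges = edges[:-1]  # one left edge per heatmap column

xtickevery = 50
# Keep a formatted label on every 50th column, an empty string elsewhere.
xticklabels = [
    f"{edge:.1f}" if idx % xtickevery == 0 else ""
    for idx, edge in enumerate(left_edges)
]
shown = [label for label in xticklabels if label]
print(len(xticklabels), len(shown))  # 500 10
```

Dropping the last edge gives exactly one label per column, which matches the left-boundary labelling asked for in the question.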
Finally came up with a version that works for me for now and uses AutoLocator based on some simple linear mapping...
import time  # needed in addition to the imports from the question
def ndhist(vectors, bins=1000, title=None):
    t = time.time()
    limits = (vectors.min(), vectors.max())
    hists = []
    dims = vectors.shape[1]
    for dim in range(dims):
        h, bs = np.histogram(vectors[:, dim], bins=bins, range=limits)
        hists.append(h)
    hists = np.array(hists)
    fig = plt.figure(figsize=(16, 12))
    sns.heatmap(
        hists,
        yticklabels=50,
        xticklabels=False
    )
    axes = fig.gca()
    axes.set(
        ylabel=f'dimensions ({dims} total)',
        xlabel=f'values (min: {limits[0]:.4g}, max: {limits[1]:.4g}, {bins} bins)',
        title=title,
    )
    def val_to_idx(val):
        # calc (linearly interpolated) index loc for given val
        return bins*(val - limits[0])/(limits[1] - limits[0])
    xlabels = [round(l, 3) for l in limits] + [
        v for v in matplotlib.ticker.AutoLocator().tick_values(*limits)[1:-1]
    ]
    # drop auto-gen labels that might be too close to limits
    d = (xlabels[4] - xlabels[3])/3
    if (xlabels[1] - xlabels[-1]) < d:
        del xlabels[-1]
    if (xlabels[2] - xlabels[0]) < d:
        del xlabels[2]
    xticks = [val_to_idx(val) for val in xlabels]
    axes.set_xticks(xticks)
    axes.set_xticklabels([f'{l:.4g}' for l in xlabels])
    plt.show()
    print(f'histogram generated in {time.time() - t:.2f}s')
ndhist(np.random.randn(100000, 300), bins=1000, title='randn')
Thanks to Paul for his answer giving me the idea.
If there's an easier or more elegant solution, i'd still be interested though.
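The linear mapping at the heart of this approach can be checked on its own. A minimal sketch, with `limits` and `bins` passed in explicitly instead of being taken from the enclosing function, and with made-up values:

```python
def val_to_idx(val, limits, bins):
    """Map a data value to a (fractional) heatmap column position."""
    lo, hi = limits
    return bins * (val - lo) / (hi - lo)

limits = (-5.0, 5.0)  # made-up value range
bins = 1000
positions = [val_to_idx(v, limits, bins) for v in (-5.0, 0.0, 2.5, 5.0)]
print(positions)  # [0.0, 500.0, 750.0, 1000.0]
```

The lower limit lands on column position 0 and the upper limit on position `bins`, so any "nice" tick value in between can be placed by interpolation.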
I was wondering how I can plot images side by side using matplotlib, for example something like this:
The closest I got is this:
This was produced by using this code:
f, axarr = plt.subplots(2,2)
axarr[0,0] = plt.imshow(image_datas[0])
axarr[0,1] = plt.imshow(image_datas[1])
axarr[1,0] = plt.imshow(image_datas[2])
axarr[1,1] = plt.imshow(image_datas[3])
But I can't seem to get the other images to show. I'm thinking that there must be a better way to do this, as I would imagine trying to manage the indexes would be a pain. I have looked through the documentation, although I have a feeling I may be looking at the wrong one. Would anyone be able to provide me with an example or point me in the right direction?
EDIT:
See the answer from @duhaime if you want a function to automatically determine the grid size.
The problem you face is that you try to assign the return value of imshow (which is a matplotlib.image.AxesImage) to an existing axes object.
The correct way of plotting image data to the different axes in axarr would be
f, axarr = plt.subplots(2,2)
axarr[0,0].imshow(image_datas[0])
axarr[0,1].imshow(image_datas[1])
axarr[1,0].imshow(image_datas[2])
axarr[1,1].imshow(image_datas[3])
The concept is the same for all subplots, and in most cases the axes instance provides the same methods as the pyplot (plt) interface.
E.g. if ax is one of your subplot axes, for plotting a normal line plot you'd use ax.plot(..) instead of plt.plot(). This can actually be found exactly in the source from the page you link to.
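A short sketch of the difference, using random arrays as placeholder images (the real `image_datas` from the question is not shown). Calling imshow as a method draws on that axes and returns an AxesImage; the array of axes itself is left untouched:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.image import AxesImage

image_datas = [np.random.rand(8, 8) for _ in range(4)]  # placeholder images

f, axarr = plt.subplots(2, 2)
# Draw each image on its own Axes; axarr keeps holding the Axes objects.
results = [ax.imshow(img) for ax, img in zip(axarr.flat, image_datas)]

print(type(results[0]).__name__)  # AxesImage
```

Writing `axarr[0,0] = plt.imshow(...)` instead would draw on whichever axes is current and then replace the array entry with the returned image object.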
One thing that I found quite helpful to use to print all images :
_, axs = plt.subplots(n_row, n_col, figsize=(12, 12))
axs = axs.flatten()
for img, ax in zip(imgs, axs):
    ax.imshow(img)
plt.show()
You are plotting all your images on one axes. What you want is to get a handle for each axes individually and plot your images there. Like so:
fig = plt.figure()
ax1 = fig.add_subplot(2,2,1)
ax1.imshow(...)
ax2 = fig.add_subplot(2,2,2)
ax2.imshow(...)
ax3 = fig.add_subplot(2,2,3)
ax3.imshow(...)
ax4 = fig.add_subplot(2,2,4)
ax4.imshow(...)
For more info have a look here: http://matplotlib.org/examples/pylab_examples/subplots_demo.html
For complex layouts, you should consider using gridspec: http://matplotlib.org/users/gridspec.html
If the images are in an array and you want to iterate through each element and print it, you can write the code as follows:
plt.figure(figsize=(10,10)) # specifying the overall grid size
for i in range(25):
    plt.subplot(5,5,i+1) # the number of images in the grid is 5*5 (25)
    plt.imshow(the_array[i])
plt.show()
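Also note that I used subplot and not subplots. They are different functions: subplot adds a single axes to the current figure, while subplots creates a new figure together with a whole grid of axes.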
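A minimal sketch contrasting the two calls; the variable names are only illustrative:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

# plt.subplots creates a new figure and the whole grid of Axes at once.
fig_a, axes_grid = plt.subplots(5, 5)
print(type(axes_grid).__name__, axes_grid.shape)

# plt.subplot adds (or selects) one Axes in the current figure.
fig_b = plt.figure()
single_ax = plt.subplot(5, 5, 1)  # position 1 of a 5x5 grid
print(type(single_ax).__name__, len(fig_b.axes))
```

With subplots you index into the returned array; with subplot you call it once per position, and each call makes that axes the current one for subsequent plt.* calls.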
Below is a complete function show_image_list() that displays images side-by-side in a grid. You can invoke the function with different arguments.
Pass in a list of images, where each image is a Numpy array. It will create a grid with 2 columns by default. It will also infer if each image is color or grayscale.
list_images = [img, gradx, grady, mag_binary, dir_binary]
show_image_list(list_images, figsize=(10, 10))
Pass in a list of images, a list of titles for each image, and other arguments.
show_image_list(list_images=[img, gradx, grady, mag_binary, dir_binary],
                list_titles=['original', 'gradx', 'grady', 'mag_binary', 'dir_binary'],
                num_cols=3,
                figsize=(20, 10),
                grid=False,
                title_fontsize=20)
Here's the code:
import matplotlib.pyplot as plt
import numpy as np
def img_is_color(img):
    if len(img.shape) == 3:
        # Check the color channels to see if they're all the same.
        c1, c2, c3 = img[:, : , 0], img[:, :, 1], img[:, :, 2]
        if (c1 == c2).all() and (c2 == c3).all():
            # Identical channels mean the image is effectively grayscale.
            return False
        return True
    return False
def show_image_list(list_images, list_titles=None, list_cmaps=None, grid=True, num_cols=2, figsize=(20, 10), title_fontsize=30):
    '''
    Shows a grid of images, where each image is a Numpy array. The images can be either
    RGB or grayscale.
    Parameters:
    ----------
    list_images: list
        List of the images to be displayed.
    list_titles: list or None
        Optional list of titles to be shown for each image.
    list_cmaps: list or None
        Optional list of cmap values for each image. If None, then cmap will be
        automatically inferred.
    grid: boolean
        If True, show a grid over each image
    num_cols: int
        Number of columns to show.
    figsize: tuple of width, height
        Value to be passed to pyplot.subplots()
    title_fontsize: int
        Value to be passed to set_title().
    '''
    assert isinstance(list_images, list)
    assert len(list_images) > 0
    assert isinstance(list_images[0], np.ndarray)
    if list_titles is not None:
        assert isinstance(list_titles, list)
        assert len(list_images) == len(list_titles), '%d imgs != %d titles' % (len(list_images), len(list_titles))
    if list_cmaps is not None:
        assert isinstance(list_cmaps, list)
        assert len(list_images) == len(list_cmaps), '%d imgs != %d cmaps' % (len(list_images), len(list_cmaps))
    num_images = len(list_images)
    num_cols = min(num_images, num_cols)
    num_rows = int(num_images / num_cols) + (1 if num_images % num_cols != 0 else 0)
    # Create a grid of subplots.
    fig, axes = plt.subplots(num_rows, num_cols, figsize=figsize)
    # Create list of axes for easy iteration.
    if isinstance(axes, np.ndarray):
        list_axes = list(axes.flat)
    else:
        list_axes = [axes]
    for i in range(num_images):
        img = list_images[i]
        title = list_titles[i] if list_titles is not None else 'Image %d' % (i)
        cmap = list_cmaps[i] if list_cmaps is not None else (None if img_is_color(img) else 'gray')
        list_axes[i].imshow(img, cmap=cmap)
        list_axes[i].set_title(title, fontsize=title_fontsize)
        list_axes[i].grid(grid)
    # Hide the unused axes at the end of the grid.
    for i in range(num_images, len(list_axes)):
        list_axes[i].set_visible(False)
    fig.tight_layout()
    _ = plt.show()
As per matplotlib's suggestion for image grids:
import matplotlib.pyplot as plt
from mpl_toolkits.axes_grid1 import ImageGrid
fig = plt.figure(figsize=(4., 4.))
grid = ImageGrid(fig, 111, # similar to subplot(111)
                 nrows_ncols=(2, 2), # creates 2x2 grid of axes
                 axes_pad=0.1, # pad between axes in inch.
                 )
for ax, im in zip(grid, image_data):
    # Iterating over the grid returns the Axes.
    ax.imshow(im)
plt.show()
I end up at this url about once a week. For those who want a little function that just plots a grid of images without hassle, here we go:
import matplotlib.pyplot as plt
import numpy as np
def plot_image_grid(images, ncols=None, cmap='gray'):
    '''Plot a grid of images'''
    if not ncols:
        factors = [i for i in range(1, len(images)+1) if len(images) % i == 0]
        ncols = factors[len(factors) // 2] if len(factors) else len(images) // 4 + 1
    # one extra row if the images do not fill the last row completely
    nrows = int(len(images) / ncols) + int(len(images) % ncols > 0)
    imgs = [images[i] if len(images) > i else None for i in range(nrows * ncols)]
    f, axes = plt.subplots(nrows, ncols, figsize=(3*ncols, 2*nrows))
    axes = axes.flatten()[:len(imgs)]
    for img, ax in zip(imgs, axes.flatten()):
        if np.any(img):
            if len(img.shape) > 2 and img.shape[2] == 1:
                img = img.squeeze()
            ax.imshow(img, cmap=cmap)
# make 16 images with 60 height, 80 width, 3 color channels
images = np.random.rand(16, 60, 80, 3)
# plot them
plot_image_grid(images)
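When the number of columns is given by the caller, the row count is a ceiling division. A small sketch of just that step; `grid_shape` is a hypothetical helper name, not part of the function above:

```python
import math

def grid_shape(n_images, ncols):
    """Return (nrows, ncols) for a grid that fits n_images."""
    nrows = math.ceil(n_images / ncols)  # round up so a partial row still fits
    return nrows, ncols

print(grid_shape(16, 4))  # (4, 4)
print(grid_shape(10, 4))  # (3, 4)
print(grid_shape(5, 2))   # (3, 2)
```

The remainder only signals whether one extra row is needed; adding the remainder itself to the row count would create too many rows for inputs such as 10 images in 4 columns.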
Sample code to visualize one random image from the dataset
import os
import random
import cv2
import matplotlib.pyplot as plt
def get_random_image(num):
    path=os.path.join("/content/gdrive/MyDrive/dataset/",images[num])
    image=cv2.imread(path)
    return image
Call the function
images=os.listdir("/content/gdrive/MyDrive/dataset")
random_num=random.randint(0, len(images) - 1)  # randint includes the upper bound
img=get_random_image(random_num)
plt.figure(figsize=(8,8))
plt.imshow(cv2.cvtColor(img,cv2.COLOR_BGR2RGB))
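One detail worth checking when picking a random index: random.randint includes both endpoints, so an upper bound of len(images) can produce an index one past the end of the list. A sketch of three safe alternatives, using a made-up list of file names:

```python
import random

images = ["a.png", "b.png", "c.png"]  # placeholder file names

# random.randint(a, b) includes b, so the upper bound must be len - 1.
idx_a = random.randint(0, len(images) - 1)

# random.randrange(n) excludes n, which matches normal list indexing.
idx_b = random.randrange(len(images))

# random.choice skips the index altogether and returns an element.
name = random.choice(images)

print(images[idx_a], images[idx_b], name)
```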
Display cluster of random images from the given dataset
#Making a figure containing 16 images
lst=random.sample(range(0,len(images)), 16)
plt.figure(figsize=(12,12))
for index,value in enumerate(lst):
    img=get_random_image(value)
    img_resized=cv2.resize(img,(400,400))
    #print(path)
    plt.subplot(4,4,index+1)
    # OpenCV loads images as BGR; convert to RGB before showing them
    plt.imshow(cv2.cvtColor(img_resized,cv2.COLOR_BGR2RGB))
    plt.axis('off')
plt.tight_layout()
plt.subplots_adjust(wspace=0, hspace=0)
#plt.savefig(f"Images/{lst[0]}.png")
plt.show()
Plotting images present in a dataset
Here rand is a random index used to select an image from the dataset, labels holds the integer label of every image, and labels_dict is a dictionary that maps each integer label to its name.
fig,ax = plt.subplots(5,5,figsize = (15,15))
ax = ax.ravel()
for i in range(25):
    rand = np.random.randint(0,len(image_dataset))
    image = image_dataset[rand]
    ax[i].imshow(image,cmap = 'gray')
    ax[i].set_title(labels_dict[labels[rand]])
plt.show()
A previous question,
colobar label matplotlib in ImageGrid,
had a solution for adding a label to the colorbar, but it seems to be broken with the current version.
Platforms I have tried:
Mac w/ Canopy:
python: 2.7
matplotlib: 1.4.3-6
Linux:
python: 2.7
matplotlib: 1.3.1
Below is the code from the previous question, with some extra code for running in an iPython notebook:
import matplotlib.pyplot as plt
from mpl_toolkits.axes_grid1 import AxesGrid
def get_demo_image():
    import numpy as np
    from matplotlib.cbook import get_sample_data
    f = get_sample_data("axes_grid/bivariate_normal.npy", asfileobj=False)
    z = np.load(f)
    # z is a numpy array of 15x15
    return z, (-3,4,-4,3)
def demo_grid_with_single_cbar(fig):
    """
    A grid of 2x2 images with a single colorbar
    """
    grid = AxesGrid(fig, 132, # similar to subplot(132)
                    nrows_ncols = (2, 2),
                    axes_pad = 0.0,
                    share_all=True,
                    label_mode = "L",
                    cbar_location = "top",
                    cbar_mode="single",
                    )
    Z, extent = get_demo_image()
    for i in range(4):
        im = grid[i].imshow(Z, extent=extent, interpolation="nearest")
    #plt.colorbar(im, cax = grid.cbar_axes[0])
    #grid.cbar_axes[0].colorbar(im)
    cbar = grid.cbar_axes[0].colorbar(im)
    cbar.ax.set_label_text("$[a.u.]$")
    for cax in grid.cbar_axes:
        cax.toggle_label(False)
    # This affects all axes as share_all = True.
    grid.axes_llc.set_xticks([-2, 0, 2])
    grid.axes_llc.set_yticks([-2, 0, 2])
#
F = plt.figure(1, (10.5, 2.5))
F.subplots_adjust(left=0.05, right=0.95)
demo_grid_with_single_cbar(F)
plt.draw()
plt.show()
The error message from the code is of the form:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-1-60ebdb832699> in <module>()
40 F = plt.figure(1, (10.5, 2.5))
41 F.subplots_adjust(left=0.05, right=0.95)
---> 42 demo_grid_with_single_cbar(F)
43
44 plt.draw()
<ipython-input-1-60ebdb832699> in demo_grid_with_single_cbar(fig)
29 #grid.cbar_axes[0].colorbar(im)
30 cbar = grid.cbar_axes[0].colorbar(im)
---> 31 cbar.ax.set_label_text("$[a.u.]$")
32
33 for cax in grid.cbar_axes:
AttributeError: 'CbarAxes' object has no attribute 'set_label_text'
Has the matplotlib interface changed since the original question was asked? If so, how do I add the colorbar label?
Personally, I've always perceived matplotlib as black magic, similar to TeX, so I cannot guarantee that my answer is the "official" way of doing what you want, or that it will continue to work in later versions. But thanks to this gallery example, I could devise the following incantation:
grid[0].cax.colorbar(im)
cax = grid.cbar_axes[0]
axis = cax.axis[cax.orientation]
axis.label.set_text("$[a.u.]$")
(Don't forget to remove all your other colorbar-related code.) This works in matplotlib 1.4.3, the current version at the time of writing.
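On newer Matplotlib releases a different route may be simpler. The following is a sketch, not verified against the 1.x versions listed in the question: it assumes a recent Matplotlib 3.x, uses random placeholder data instead of the sample file, and places the colorbar on the right so that the default vertical orientation applies. Figure.colorbar returns a Colorbar object, and Colorbar.set_label sets its label:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
from mpl_toolkits.axes_grid1 import ImageGrid

Z = np.random.rand(15, 15)  # placeholder image data

fig = plt.figure(figsize=(5, 5))
grid = ImageGrid(fig, 111, nrows_ncols=(2, 2), axes_pad=0.0,
                 share_all=True, label_mode="L",
                 cbar_location="right", cbar_mode="single")

for ax in grid:
    im = ax.imshow(Z, extent=(-3, 4, -4, 3), interpolation="nearest")

# Draw the colorbar into the grid's dedicated colorbar axes and label it.
cbar = fig.colorbar(im, cax=grid.cbar_axes[0])
cbar.set_label("$[a.u.]$")
```

For a colorbar at the top or bottom, pass orientation="horizontal" to fig.colorbar so that the label ends up on the long axis.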