matplotlib: why would plotting a histogram cause an IndexError in Python?

I am learning with the horse-colic dataset.
Thanks to help from @Vaishali, @Ultra TLC and @Tom, the data is imported:
p_data = 'https://raw.githubusercontent.com/MachineIntellect/dataset.ml/master/horse_colic.csv'
df = pd.read_csv(p_data)
df = df.replace("?", np.NaN)
df = df.astype(np.float)
To get the number of columns and rows to plot, this piece of code works well too:
%matplotlib inline
n_col = 4
n_row = int(math.ceil(df.shape[1] * 1.0/n_col))
When I try to plot a histogram:
fig, axes = plt.subplots(n_row, n_col, figsize=(15, 30))
plt.tight_layout()
for i, col in enumerate(df.columns):
    pos_i = i / n_col
    pos_j = i % n_col
    df.groupby("cp_data")[col].plot.hist(title=col, alpha=0.5, ax=axes[pos_i, pos_j]);
the following error shows up:
IndexError Traceback (most recent call last)
<ipython-input-1-e6f18b850dfa> in <module>()
17 pos_i = i / n_col
18 pos_j = i % n_col
---> 19 df.groupby("cp_data")[col].plot.hist(title=col, alpha=0.5, ax=axes[pos_i, pos_j]);
20
21
IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
Why would plotting a histogram cause an IndexError?
It seems that something goes wrong somewhere in plot.hist(). How can I figure out what?
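One likely reading of the traceback, offered as a sketch rather than a confirmed diagnosis: under Python 3, i / n_col is true division and returns a float, and NumPy does not accept a float as an index into axes, which matches the IndexError text above. Floor division (//) or divmod keeps both positions as integers. The helper name grid_position is hypothetical, and the range of 6 subplots is only an illustrative value, not taken from the dataset.

```python
def grid_position(i, n_col):
    """Return (row, column) of the i-th subplot in a grid with n_col columns."""
    # divmod(i, n_col) == (i // n_col, i % n_col); both parts are integers
    return divmod(i, n_col)


n_col = 4
positions = [grid_position(i, n_col) for i in range(6)]
print(positions)

# True division gives a float, floor division gives an int
print(type(5 / n_col).__name__, type(5 // n_col).__name__)
```

In the loop from the question this would amount to pos_i, pos_j = divmod(i, n_col) before indexing axes[pos_i, pos_j].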

Related

Automatically generate subplots based on length of a list

I want to generate color-plot subplots based on the length of a numpy array, which I call dets. I want to use the array length to determine the right number of columns and rows for the subplots: if the length is a perfect square, plot a square matrix of subplots; if not, add another row. For starters, I have written some code that checks whether the array's length gives a square matrix of subplots:
import numpy as np
import matplotlib.pyplot as plt

dets = np.arange(-5,-0.75,0.25)
# dets has to exist before data_f, which is sized from it
data_f = np.random.rand(len(dets),2,5)
x = np.array([1,5,6,3,8,9,2,3,10,12,3])
v = np.linspace(0,10,len(x))
square = np.sqrt(len(dets))
check_square = len(dets)%square
non_square = 1
print(len(data_f))
if check_square == 0:
    nrows = int(np.sqrt(len(dets)))
    ncols = int(np.sqrt(len(dets)))
else:
    nrows = int(np.sqrt(len(dets)))+non_square
    ncols = int(np.sqrt(len(dets)))
fig, ax = plt.subplots(nrows, ncols, sharex='col', sharey='row')
for i in range(nrows):
    for j in range(ncols):
        if i==0:
            im = ax[i,j].imshow(data_f[j],extent=(x.min(), x.max(), v.min(), v.max()),origin='lower',aspect='auto')
        else:
            im = ax[i,j].imshow(data_f[j+ncols*i],extent=(x.min(), x.max(), v.min(), v.max()),origin='lower',aspect='auto')
This draws the 17 plots I want, but they are squashed together in a weird way and I cannot adjust them because of the following error:
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
~\AppData\Local\Temp\1/ipykernel_4560/3817292743.py in <module>
6 im = ax[i,j].imshow(data_f[j],extent=(x.min(), x.max(), v.min(), v.max()),origin='lower',aspect='auto')
7 else:
----> 8 im = ax[i,j].imshow(data_f[j+ncols*i],extent=(x.min(), x.max(), v.min(), v.max()),origin='lower',aspect='auto')
9
IndexError: index 17 is out of bounds for axis 0 with size 17
This happens because you have a total of 20 axes (5 rows of 4), so looping over nrows and ncols gives 20 iterations, but len(data_f) is only 17.
At the start of the inner loop body, add:
if (j + ncols*i) == len(data_f):
    break
I did this and it stopped the error.
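The mismatch between 20 axes and 17 panels can also be worked out ahead of time. The sketch below is one possible way to do that, not the method used in the thread; grid_shape and unused_cells are hypothetical helper names. It uses a ceiling division for the row count, which differs from the int(sqrt) + 1 rule in the question: that rule gives the same 5 x 4 grid for 17 items, but would leave too few cells for a length such as 7.

```python
import math


def grid_shape(n_items):
    """Near-square grid (nrows, ncols) with at least n_items cells."""
    ncols = max(1, math.isqrt(n_items))   # floor of the square root
    nrows = math.ceil(n_items / ncols)    # enough rows to hold every item
    return nrows, ncols


def unused_cells(n_items):
    """Grid positions (row, col) left over once n_items panels are placed."""
    nrows, ncols = grid_shape(n_items)
    return [divmod(k, ncols) for k in range(n_items, nrows * ncols)]


print(grid_shape(17))
print(unused_cells(17))
```

The leftover positions are the axes that receive no data; one option is to hide them, for example with ax[i, j].set_visible(False), instead of breaking out of the loop.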

Subplots in columns and rows

I would like my subplots to be arranged in 2 columns and 5 rows.
I've also tried adding ncols=2, nrows=5 to the code, but that didn't work.
And when I change the subplots to plt.subplots(5,2) instead of plt.subplots(10,1), it says:
AttributeError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_9844/709244097.py in
13
14 for ax, afstand, tid in zip(ax, afstande, tider):
---> 15 ax.plot(tid, afstand)
16 ax.set_title("x(t)", fontsize=12)
17 ax.set_xlabel("tid (s)", fontsize=12)
AttributeError: 'numpy.ndarray' object has no attribute 'plot'
My code:
from scipy.optimize import fmin
a = -75.64766759
b = 68.02691163
f = lambda x: a * x + b
afstand1, afstand2, afstand3, afstand4, afstand5, afstand6, afstand7, afstand8, afstand9, afstand10 = f(U1), f(U2), f(U3), f(U4), f(U5),f(U6), f(U7), f(U8), f(U9), f(U10)
afstande = [afstand1, afstand2, afstand3, afstand4, afstand5, afstand6, afstand7, afstand8, afstand9, afstand10]
tider = [tid1, tid2, tid3, tid4, tid5, tid6, tid7, tid8, tid9, tid10]
fig, ax = plt.subplots(10,1, figsize=(7,25))
plt.subplots_adjust(hspace=0.55)
#loop
for ax, afstand, tid in zip(ax, afstande, tider):
    ax.plot(tid, afstand)
    ax.set_title("x(t)", fontsize=12)
    ax.set_xlabel("tid (s)", fontsize=12)
    ax.set_ylabel("Position", fontsize=12)
First of all, you're using the same variable name for the array of axes and for the loop variable; you should change that. Subplot axes are stored in numpy arrays. If you only have one row or one column, looping over the array gives you the individual axes, but in an x*y pattern of subplots you loop over a two-dimensional array of axes, which yields the rows (each one an array, hence the AttributeError). You can solve that by using .flat to iterate over all the axes one by one.
fig, axs = plt.subplots(ncols=2, nrows=5)
for ax in axs.flat:
    ax.plot(...)
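The row-versus-element behaviour can be seen without matplotlib. The sketch below is only an analogy: it uses a nested list of strings as a stand-in for the 5 x 2 array of axes, and itertools.chain.from_iterable plays the role that .flat plays for a numpy array. The names grid and series are made up for illustration.

```python
from itertools import chain

# Stand-in for the 5x2 array returned by plt.subplots(5, 2):
# strings instead of real Axes objects (an assumption for illustration).
grid = [[f"ax{r}{c}" for c in range(2)] for r in range(5)]
series = [f"run{k}" for k in range(1, 11)]  # hypothetical per-plot data

# Looping over the grid directly yields rows, so only 5 pairs are formed
# and each "axes" in a pair is really a whole row.
by_row = list(zip(grid, series))

# Flattening first yields one cell per data series.
by_cell = list(zip(chain.from_iterable(grid), series))

print(len(by_row), by_row[0])
print(len(by_cell), by_cell[0])
```

With the real axes array, the equivalent loop would be for ax, afstand, tid in zip(axs.flat, afstande, tider).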

Display a tensor image in matplotlib

I'm doing a project for Udacity's AI with Python nanodegree.
I'm trying to display a torch.cuda.FloatTensor that I obtained from an image file path. Below that image will be a bar chart showing the top 5 most likely flower names with their associated probabilities.
plt.figure(figsize=(3,3))
path = 'flowers/test/1/image_06743.jpg'
top5_probs, top5_class_names = predict(path, model,5)
print(top5_probs)
print(top5_class_names)
flower_np_image = process_image(Image.open(path))
flower_tensor_image = torch.from_numpy(flower_np_image).type(torch.cuda.FloatTensor)
flower_tensor_image = flower_tensor_image.unsqueeze_(0)
axs = imshow(flower_tensor_image, ax = plt)
axs.axis('off')
axs.title(top5_class_names[0])
axs.show()
fig, ax = plt.subplots()
y_pos = np.arange(len(top5_class_names))
plt.barh(y_pos, list(reversed(top5_probs)))
plt.yticks(y_pos, list(reversed(top5_class_names)))
plt.ylabel('Flower Type')
plt.xlabel('Class Probability')
The imshow function was given to me as
def imshow(image, ax=None, title=None):
    if ax is None:
        fig, ax = plt.subplots()
    # PyTorch tensors assume the color channel is the first dimension
    # but matplotlib assumes is the third dimension
    image = image.transpose((1, 2, 0))
    # Undo preprocessing
    mean = np.array([0.485, 0.456, 0.406])
    std = np.array([0.229, 0.224, 0.225])
    image = std * image + mean
    # Image needs to be clipped between 0 and 1 or it looks like noise when displayed
    image = np.clip(image, 0, 1)
    ax.imshow(image)
    return ax
But I get this output
[0.8310797810554504, 0.14590543508529663, 0.013837042264640331, 0.005048676859587431, 0.0027143193874508142]
['petunia', 'pink primrose', 'balloon flower', 'hibiscus', 'tree mallow']
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-17-f54be68feb7a> in <module>()
12 flower_tensor_image = flower_tensor_image.unsqueeze_(0)
13
---> 14 axs = imshow(flower_tensor_image, ax = plt)
15 axs.axis('off')
16 axs.title(top5_class_names[0])
<ipython-input-15-9c543acc89cc> in imshow(image, ax, title)
5 # PyTorch tensors assume the color channel is the first dimension
6 # but matplotlib assumes is the third dimension
----> 7 image = image.transpose((1, 2, 0))
8
9 # Undo preprocessing
TypeError: transpose(): argument 'dim0' (position 1) must be int, not tuple
<matplotlib.figure.Figure at 0x7f5855792160>
My predict function works, but the imshow just chokes with the call to transpose. Any ideas on how to fix this? I think it vaguely has something to do with converting back to a numpy array.
The notebook that I'm working on can be found at https://github.com/BozSteinkalt/ImageClassificationProject
Thanks!
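You are passing a torch.Tensor to imshow, so image.transpose((1, 2, 0)) calls torch.Tensor.transpose, which takes two integer dimensions (dim0, dim1), instead of numpy's transpose, which accepts a tuple of axes.
You should convert flower_tensor_image to a numpy array first, using .numpy() (after .detach().cpu(), since the tensor lives on the GPU):
axs = imshow(flower_tensor_image.detach().cpu().numpy(), ax = plt)
Note that unsqueeze_(0) has added a batch dimension, so the array is probably 4-D at this point; the three-axis transpose inside imshow would then also need that dimension removed, for example by skipping the unsqueeze_ call for display.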

How does the indexing of subplots work

I have the following:
import matplotlib.pyplot as plt
fig = plt.figure()
for i in range(10):
    ax = fig.add_subplot(551 + i)
    ax.plot([1,2,3,4,5], [10,5,10,5,10], 'r-')
I was imagining that the 55 means that it is creating a grid that is 5 subplots wide and 5 subplots deep, so it can cater for 25 subplots?
The for loop will just iterate 10 times, so I thought (obviously wrongly) that 25 possible plots would accommodate those iterations, but I get the following:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-118-5775a5ea6c46> in <module>()
10
11 for i in range(10):
---> 12 ax = fig.add_subplot(551 + i)
13 ax.plot([1,2,3,4,5], [10,5,10,5,10], 'r-')
14
/home/blah/anaconda/lib/python2.7/site-packages/matplotlib/figure.pyc in add_subplot(self, *args, **kwargs)
1003 self._axstack.remove(ax)
1004
-> 1005 a = subplot_class_factory(projection_class)(self, *args, **kwargs)
1006
1007 self._axstack.add(key, a)
/home/blah/anaconda/lib/python2.7/site-packages/matplotlib/axes/_subplots.pyc in __init__(self, fig, *args, **kwargs)
62 raise ValueError(
63 "num must be 1 <= num <= {maxn}, not {num}".format(
---> 64 maxn=rows*cols, num=num))
65 self._subplotspec = GridSpec(rows, cols)[int(num) - 1]
66 # num - 1 for converting from MATLAB to python indexing
ValueError: num must be 1 <= num <= 30, not 0
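In the convenience shorthand notation, the 55 does mean there are 5 rows and 5 columns. However, the shorthand notation only works for single-digit integers (i.e. for nrows, ncols and plot_number all less than 10). Once i reaches 9, 551 + i is 560, which is read as 5 rows, 6 columns and plot number 0, hence "1 <= num <= 30, not 0".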
You can expand it to full notation (i.e. use commas: add_subplot(nrows, ncols, plot_number)) and then all will work fine for you:
for i in range(10):
    ax = fig.add_subplot(5, 5, 1 + i)
    ax.plot([1,2,3,4,5], [10,5,10,5,10], 'r-')
From the docs for plt.subplot (which uses the same args as fig.add_subplot) :
Typical call signature:
subplot(nrows, ncols, plot_number)
Where nrows and ncols are used to notionally split the figure into nrows * ncols sub-axes, and
plot_number is used to identify the particular subplot that this
function is to create within the notional grid. plot_number starts at
1, increments across rows first and has a maximum of nrows * ncols.
In the case when nrows, ncols and plot_number are all less than 10, a convenience exists, such that the a 3 digit number can be given
instead, where the hundreds represent nrows, the tens represent ncols
and the units represent plot_number.
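The digit-by-digit reading described in the docs excerpt can be reproduced in a few lines. This is only a sketch of that description, not matplotlib's actual parsing code, and decode_shorthand is a hypothetical helper name. It shows why 551 + i stops working at i = 9.

```python
def decode_shorthand(code):
    """Split a three-digit subplot code into (nrows, ncols, plot_number)."""
    if not 100 <= code <= 999:
        raise ValueError("shorthand must be a three-digit integer")
    nrows, rest = divmod(code, 100)         # hundreds digit
    ncols, plot_number = divmod(rest, 10)   # tens digit and units digit
    return nrows, ncols, plot_number


for i in (0, 8, 9):
    print(551 + i, decode_shorthand(551 + i))
```

For i = 9 the code 560 decodes to 5 rows, 6 columns and plot number 0, which is consistent with the message "num must be 1 <= num <= 30, not 0" in the traceback.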
Although tom answered your question, in this sort of situation you should be using fig, axs = plt.subplots(n, m). This creates a new figure with n rows and m columns of subplots. fig is the figure created, and axs is a 2D numpy array in which each element is the subplot at the corresponding location in the figure, so the top-right element of axs is the top-right subplot in the figure. You can access the subplots through normal indexing, or loop over them.
So in your case you can do
import matplotlib.pyplot as plt
# axs is a 5x5 numpy array of axes objects
fig, axs = plt.subplots(5, 5)
# "ravel" flattens the numpy array without making a copy
for ax in axs.ravel():
    ax.plot([1,2,3,4,5], [10,5,10,5,10], 'r-')

Setting the spacing between values on the x-axis in matplotlib using Python

I have an FPR and TPR plot. I want to ask how to arrange the spacing between values on the x-axis. My code is below:
In [85]:
fig, ax = plt.subplots(figsize=(8,6), dpi=80)
ax.plot(x_iter1_TF , y_iter1_TF, label='Iter1', marker='o')
ax.plot(x_iter5_TF, y_iter5_TF ,label='Iter5', marker='v')
ax.plot(x_iter10_TF, y_iter10_TF , label='Iter10', marker='x')
ax.plot(x_iter25_TF, y_iter25_TF , label='Iter20', marker='+')
ax.plot(x_iter50_TF, y_iter50_TF , label='Iter50',marker='D')
ax.legend(loc=1); # upper right corner
ax.set_xlabel('FPR')
ax.set_ylabel('TPR')
ax.set_xlim([0,1, 0.001])
ax.set_ylim([0,1, 0.001])
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-85-87b9ef379a9b> in <module>()
8 ax.set_xlabel('FPR')
9 ax.set_ylabel('TPR')
---> 10 ax.set_xlim([0,1, 0.001])
11 ax.set_ylim([0,1, 0.001])
C:\Python27\lib\site-packages\matplotlib\axes\_base.pyc in set_xlim(self, left, right, emit, auto, **kw)
2524
2525 if right is None and iterable(left):
-> 2526 left, right = left
2527
2528 self._process_unit_info(xdata=(left, right))
ValueError: too many values to unpack
Here I used ax.set_xlim([0,1, 0.001]), where 0.001 is meant to be the spacing between values on the x-axis. Unfortunately, I got an error. I think I set this up the wrong way.
As mentioned in my comment, set_xlim does not accept a "step" parameter. Also, the method I think you want is set_xticks, which can be used as follows:
In [13]: import numpy as np
...: ticks = np.arange(0, 2, 0.1)
...: ax.set_xticks(ticks)
...: fig
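The tick positions can also be built without NumPy. The sketch below is one way to do it, assuming the 0 to 1 range from the question; evenly_spaced_ticks is a hypothetical helper name. It multiplies by an integer index rather than adding the step repeatedly, so rounding error does not accumulate.

```python
def evenly_spaced_ticks(start, stop, step):
    """Tick positions from start to stop (inclusive) in increments of step."""
    n_steps = round((stop - start) / step)
    # Multiply by an integer index instead of adding step repeatedly,
    # then round away the small floating-point residue.
    return [round(start + k * step, 10) for k in range(n_steps + 1)]


ticks = evenly_spaced_ticks(0, 1, 0.1)
print(ticks)
```

The resulting list can be passed to ax.set_xticks(ticks), with ax.set_xlim(0, 1) setting the range separately. A step of 0.001 would produce 1001 ticks over that range, which is likely too dense to read.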
