unable to plot histogram(hist2d) - python

I am trying to plot the length of objects against the total count of objects using hist2d, but I get the following error. Can you help me find the cause?
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
count=799.000000
plt.hist2d(length,count,bins = 10)
plt.xlabel('count')
plt.ylabel('length')
plt.grid(True)
plt.show()
print(length)
1 3.978235
2 4.740024
3 3.470375
4 3.978235
5 3.808948
...
807 5.078597
808 4.655381
809 4.232164
810 4.655381
811 3.470375
Name: length_mm, Length: 799, dtype: float64

I believe the issue in your code is the use of hist2d rather than plain old hist. With hist, you don't have to pass the number of items; it gets that from the Series:
plt.hist(length, bins=10)
plt.xlabel('length')
plt.ylabel('count')
plt.grid(True)
plt.show()
(Figure: the resulting histogram, for a small amount of data.)
If, on the other hand, you're looking for a bar chart, here's the way to do it for length:
fig, ax = plt.subplots()
ax.bar(length.index, length)
fig.show()
(Figure: the resulting bar chart, for limited data, of course.)
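As a quick check of the point that hist works out the counts on its own, here is a minimal sketch. The Series below is a hypothetical stand-in built from the first five values printed in the question, and np.histogram is used only to expose the kind of bin counts that a histogram plot is built from.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's Series (first five printed values).
length = pd.Series(
    [3.978235, 4.740024, 3.470375, 3.978235, 3.808948],
    index=[1, 2, 3, 4, 5],
    name="length_mm",
)

# Binning a 1-D sample needs only the sample itself;
# the per-bin counts fall out of the binning.
counts, edges = np.histogram(length, bins=10)

print(counts)        # how many values landed in each of the 10 bins
print(counts.sum())  # equals the number of observations in the Series
```

The bin counts add up to len(length), so there is no separate count argument to pass.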

Related

plotly multiple lines chart with a varying dataframe

I'm trying to make a function that plots all a Dataframe content.
DF Sample:
print(df2)
Close Close Close
Date
2018-12-12 00:00:00-05:00 53.183998 24.440001 104.500504
2018-12-13 00:00:00-05:00 53.095001 25.119333 104.854973
2018-12-14 00:00:00-05:00 52.105000 24.380667 101.578560
2018-12-17 00:00:00-05:00 50.826500 23.228001 98.570381
2018-12-18 00:00:00-05:00 51.435501 22.468666 99.605042
Python:
fig = px.line(df2, x=df2.index, y=df2.columns[1:])
I'm trying to plot it but get this error:
ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
My data frame may have various numbers of columns, so I need my code to plot all columns.
By the way:
print(df2.columns[1:])
Index(['Close', 'Close'], dtype='object')
Try making the column names unique; all three columns are currently named 'Close'. I ran the code above on this data with unique column names and the plot rendered as expected.
You can also use the date column as the x-axis, and Plotly will draw a time-series chart from it.
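A minimal sketch of the renaming step, assuming a small hypothetical frame shaped like the sample in the question. The suffix scheme (Close_0, Close_1, ...) is only an illustration; any unique names would do.

```python
import pandas as pd

# Hypothetical two-row copy of the question's frame, with duplicate column labels.
df2 = pd.DataFrame(
    [[53.183998, 24.440001, 104.500504],
     [53.095001, 25.119333, 104.854973]],
    index=pd.to_datetime(["2018-12-12", "2018-12-13"]),
    columns=["Close", "Close", "Close"],
)
df2.index.name = "Date"

# With duplicate labels, selecting "Close" returns a DataFrame, not a Series.
print(type(df2["Close"]))

# Give every column a unique name by appending its position.
unique = df2.copy()
unique.columns = [f"{name}_{i}" for i, name in enumerate(df2.columns)]
print(list(unique.columns))
```

With unique names, a call such as px.line(unique, x=unique.index, y=unique.columns) should then plot every column. Note that y=df2.columns[1:], as in the question, skips the first column.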

Plot histogram of all numerical columns in pandas, with mean axvline using tight layout

I am trying to get all the numerical columns plotted within a tight layout with a mean line in each of the subplots for the average of each column. I have tried several options and get the following error:
the truth value of a series is ambiguous. use a.empty, a.bool(), a.item(), a.any() or a.all()
FYI: the code works without the plt.axvline
This is the code I have tried:
from scipy.stats import norm
all_col = data.select_dtypes(include=np.number).columns.tolist()
plt.figure(figsize=(17,75))
for i in range(len(all_col)):
    plt.subplot(18,3,i+1)
    sns.distplot(data[all_col[i]])
    plt.tight_layout()
    plt.title(all_col[i],fontsize=25)
    plt.axvline(data.mean(), color='k', linestyle='dashed', linewidth=1)#displaying the mean on the chart
plt.show()
Using a sample dataset, I found that the error comes from plt.axvline(data.mean()): data.mean() returns the means of all columns, while axvline draws only one line at one x value.
I would do all this as follows:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
data = sns.load_dataset('tips') # Sample data
num = data.select_dtypes(include=np.number) # Get numeric columns
n = num.shape[1] # Number of cols
fig, axes = plt.subplots(n, 1, figsize=(14/2.54, 12/2.54)) # create subplots
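for ax, col in zip(axes, num): # For each column...
    # Plot histogram (distplot is deprecated in newer seaborn; histplot(..., kde=True) is its replacement)
    sns.distplot(num[col], ax=ax)
    ax.axvline(num[col].mean(), c='k') # Plot mean of this column only
fig.tight_layout() # Keep subplots from overlapping
plt.show()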
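The difference between the two mean calls can be seen without any plotting. This is a minimal sketch on a hypothetical two-column frame (the names a and b are made up for illustration):

```python
import pandas as pd

# Hypothetical numeric frame standing in for the question's data.
data = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [10.0, 20.0, 60.0]})

all_means = data.mean()      # one mean per column, returned as a Series
one_mean = data["b"].mean()  # a single number for one column

print(type(all_means))
print(one_mean)
```

axvline needs a single x value, which is why the per-column mean works and the whole-frame mean does not.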

Plt.Scatter at seaborn : Error : The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()

I'm looking for a solution to the following error, which I get when I try to plot my figure: "The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()."
If I remove the data_t_quali['TOR'] for example and just put 'TOR' (as it says on the seaborn website) then I get empty graphics.
g = sns.FacetGrid(data_t_quali1, col='Reussie',col_order=['Reprise en main reussie','Echec de la reprise en main'],hue='Age',hue_order=[20,30,40,50,60 ],size=6,aspect=1.2,palette=sns.light_palette('navy', 4)[1:])
g.map(plt.scatter,data_t_quali1['NombreFixation'],data_t_quali1['TOR'],alpha=0.9, edgecolor='white', linewidth=2, s=300)
fig = g.fig
fig.subplots_adjust(top=0.79, wspace=0.3)
fig.suptitle('Time to take over control (TOR) in fonction of the fixation numbers 2 minutes before the regaining control of the vehicle', fontsize=20, fontweight='bold')
g.add_legend(title='Age du conducteur')

Matplotlib ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

When I use the code below to make a plot, in which I assign custom values to the x axis and set xlim, it works.
x = np.array([0,1,2,3])
y = np.array([0.650, 0.660, 0.675, 0.685])
customized_x = np.array([3, 4, 5, 6])
plt.xticks(x, customized_x)
plt.xlim(2, 3)
plt.plot(x, y)
plt.show()
However, because I want to place a few more plots together, I switched to subplots, and now an error is raised at the line ax1.set_xticks(x, customized_x):
fig = plt.figure(figsize=[6.0, 9.0])
ax1 = fig.add_subplot(111)
x = np.array([0,1,2,3])
y = np.array([0.650, 0.660, 0.675, 0.685])
customized_x = np.array([3, 4, 5, 6])
ax1.set_xticks(x, customized_x)
ax1.set_xlim(2, 3)
ax1.plot(x, y)
plt.show()
Error obtained:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
Although I can avoid the issue by not using subplots, I am still wondering what's wrong with subplots.
EDIT
I found a solution to a similar question here. Sorry for the repetition.
You appear to be using set_xticks incorrectly. From the matplotlib documentation:
Axes.set_xticks(self, ticks, minor=False)
Axes.set_xticks takes different arguments from plt.xticks, so the code tries to evaluate customized_x as the minor parameter.
I think what you need is Axes.set_xticklabels: see the documentation here
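A minimal sketch of that suggestion, reusing the arrays from the question. The Agg backend line is an assumption added so the snippet runs without a display; drop it for interactive use.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove for interactive use
import matplotlib.pyplot as plt
import numpy as np

x = np.array([0, 1, 2, 3])
y = np.array([0.650, 0.660, 0.675, 0.685])
customized_x = np.array([3, 4, 5, 6])

fig = plt.figure(figsize=[6.0, 9.0])
ax1 = fig.add_subplot(111)
ax1.plot(x, y)

# Set tick positions and tick labels in two separate calls.
ax1.set_xticks(x)
ax1.set_xticklabels([str(v) for v in customized_x])
tick_labels = [t.get_text() for t in ax1.get_xticklabels()]

ax1.set_xlim(2, 3)
```

In more recent Matplotlib releases, set_xticks also accepts the labels as a second argument, so the original call may work unchanged there; the two-call form above should work on older versions as well.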

Matplotlib bar chart error: "the truth value of an array with more than one element is ambiguous. use a.any() or a.all()"

I'm making a bar chart in matplotlib, and getting an error as follows:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
My code is like this:
N = 5
set_A = (Table1['A'], Table1['B'],
Table1['C'], Table1['D'],
Table1['E'])
ind = np.arange(N)
width = 0.35
plt.subplot(111)
rects1 = plt.bar(ind, set_A, width, color='g')
set_B = (Table2['A'], Table2['B'],
Table2['C'], Table2['D'],
Table2['E'])
rects2 = plt.bar(ind+width, set_B, width, color='b')
The line the error refers to is
rects1 = plt.bar(ind, set_A, width, color='g')
I don't really understand what's wrong. The code is pretty much taken straight from the example at http://matplotlib.org/users/screenshots.html
My Table1 was of the wrong form: it had 2 rows, instead of the 1 I had assumed. Thus "Table1['A']" was 2 elements, rather than 1.
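To illustrate the fix, here is a minimal sketch with a hypothetical two-row Table1 (the values are made up). It selects one row explicitly, so each bar gets a single scalar height. The Agg backend line is an assumption for running without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

cols = ["A", "B", "C", "D", "E"]

# Hypothetical table with two rows, like the one that caused the error.
Table1 = pd.DataFrame([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]], columns=cols)

# Each column holds two values here, so Table1['A'] is not a single bar height.
print(len(Table1["A"]))

# Pick one row so that every bar height is a scalar.
set_A = Table1.loc[0, cols].to_numpy(dtype=float)

ind = np.arange(len(cols))
width = 0.35
fig, ax = plt.subplots()
rects1 = ax.bar(ind, set_A, width, color="g")
```

Checking set_A.shape before calling bar is a cheap way to catch this kind of mismatch: it should be (5,), one height per bar.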
