How can I plot a legend with multiple rows and columns? - python

I'm working on a bar-plot like this:
I want to move the legend to the upper-left corner, above the top spine. But it is too tall and would cover some bars, so I want to lay it out as 2 rows and 4 columns. How should I do that? I've searched for about an hour without any result; maybe I used the wrong keywords.

From the fine docs
ncol : integer
      The number of columns that the legend has. Default is 1.
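
As a rough sketch of how that parameter could be applied to the question: with 8 legend entries, ncol=4 gives 2 rows of 4. The series names and bar data below are made up for illustration, and the loc / bbox_to_anchor values are just one way to sit the legend above the top spine at the left edge.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data: 8 bar series, so the legend has 8 entries.
labels = [f"series {i}" for i in range(1, 9)]
x = np.arange(3)

fig, ax = plt.subplots()
for i, label in enumerate(labels):
    ax.bar(x + i * 0.1, np.full(3, i + 1), width=0.1, label=label)

# ncol=4 spreads the 8 entries over 4 columns, i.e. 2 rows.
# loc="lower left" with bbox_to_anchor=(0, 1.0) pins the legend's lower-left
# corner to the top-left corner of the axes, so it sits above the top spine.
legend = ax.legend(ncol=4, loc="lower left", bbox_to_anchor=(0, 1.0))
```

Call plt.show() or fig.savefig(...) afterwards to see the result; saving with bbox_inches="tight" keeps a legend placed outside the axes from being clipped.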

Related

Plot with many data points

New to matplotlib. Basically, I have 2 axes using ax.twinx(). I have daily data going back 20 years. One of the plots (LHS - Red) shows up as expected, but when I add the second plot (RHS - Blue), it doesn't show up the way I want because there is a large variation in the data points.
How can I fix it? I want it to be a smooth line.
When I add the second subplot
This is what I want the blue line to look like:
Before I add the second subplot

Center nested boxplots in Python/Seaborn with unequal classes

I have grouped data from which I want to generate boxplots using seaborn. However, not every group has all classes. As a result, the boxplots are not centered if classes are missing within one group:
Figure
The graph is generated using the following code:
sns.boxplot(x="label2", y="value", hue="variable",palette="Blues")
Is there any way to force seaborn to center these boxes? I didn't find any appropriate way.
Thank you in advance.
Yes, there is, but you are not going to like it.
Centering these will mean that you will have the same y value for median values, so normalize your data so that the median is 0.5 for each y value for each value of x. That will give you the plot you want, but you should note that somewhere in the plot so people will not be confused.

python bar chart not centered

I am trying to build a simple histogram. For some reason, my bars are behaving abnormally. As you can see in this picture, my bar over "3" is moved to the right side. I am not sure what caused it. I did align='mid' but it did not fix it.
This is the code that I used to create it:
def createBarChart(colName):
    df[colName].hist(align='mid')
    plt.title(str(colName))
    RUNS = [1,2,3,4,5]
    plt.xticks(RUNS)
    plt.show()
for column in colName:
    createBarChart(column)
And this is what I got:
bar is not centered over 3
To recreate my data:
df = pd.DataFrame(np.random.randint(1,6,size=(100, 4)), columns=list('ABCD'))
Thank you for your help!
P/s: idk if this info is relevant, but I am using seaborn-whitegrid style. I tried to recreate a plot with sample data and it's still showing up. Is it a bug?
hist created using random data
The hist function is behaving exactly as it is supposed to. By default it splits the data you pass into 10 bins, with the left edge of the first bin at the data's minimum value and the right edge of the last bin at its maximum. The chart below shows the randomly generated data binned this way, with red dashed lines to mark the edges of the bins.
The way around this is to define the bin edges yourself, with a slight adjustment to the minimum and maximum values to centre the bars over the x axis ticks. This can be done quite easily with numpy's linspace function (using column A in the randomly generated data frame as an example):
bins = np.linspace(df["A"].min() - .5, df["A"].max() + .5, 6)
df["A"].hist(bins=bins)
We ask for 6 values because we are defining the bin edges; this results in 5 bins, as shown in this chart:
If you wanted to keep the gaps between the bars you can increase the number of bins to 9 and adjust the offset slightly, but this wouldn't work in all cases (it works here because every value is either 1, 2, 3, 4 or 5).
bins = np.linspace(df["A"].min() - .25, df["A"].max() + .25, 10)
df["A"].hist(bins=bins)
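
To see the binning described above in numbers rather than pictures, here is a small sketch using np.histogram, which applies the same rule of 10 equal-width bins between the minimum and maximum. The sample array is a made-up stand-in for one column of the data frame.

```python
import numpy as np

# Small deterministic sample of the discrete values 1..5 (stand-in for df["A"]).
data = np.array([1, 2, 2, 3, 3, 3, 4, 4, 5])

# Default rule: 10 equal-width bins from the minimum (1) to the maximum (5),
# so each bin is 0.4 wide and the value 3 is the LEFT edge of the bin [3.0, 3.4).
default_counts, default_edges = np.histogram(data, bins=10)

# Edges shifted by 0.5 so that each integer sits in the middle of its own bin.
centred_edges = np.linspace(data.min() - 0.5, data.max() + 0.5, 6)
centred_counts, _ = np.histogram(data, bins=centred_edges)

print(np.round(default_edges, 1))  # 11 edges for the 10 default bins
print(default_counts)              # several bins are empty
print(centred_edges)               # 6 edges for 5 bins
print(centred_counts)              # one bin per integer value
```

With the default edges the bar holding the 3s spans 3.0 to 3.4, which is why it appears to the right of the tick; with the shifted edges it spans 2.5 to 3.5.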
Finally, as this data contains discrete values and really you are plotting the counts, you could use the value_counts function to create a series that can then be plotted as a bar chart:
df["A"].value_counts().sort_index().plot(kind="bar")
# Provide a 'color' argument if you need all of the bars to look the same.
df["A"].value_counts().sort_index().plot(kind="bar", color="steelblue")
Try something like this to centre every histogram bar over its tick:
plt.hist(data, bins=range(1,7), align='left', rwidth=1, density=True)
Replace data with your own values (for example df["A"]). This call originally used normed=True, but that argument has been removed from recent matplotlib releases; density=True is its replacement.

Change distance between bar groups in grouped bar chart (plotting with Pandas)

I have a Dataframe with 14 rows and 7 columns where the columns represent groups and the rows represent months. I am trying to create a grouped bar plot such that at each month (on the x-axis) I will have the values for each of the groups as bars. The code is simply
ax = df.plot.bar(width=1,color=['b','g','r','c','orange','purple','y']);
ax.legend(loc='center left', bbox_to_anchor=(1.0, 0.5))
ax.set_xticklabels(months2,rotation=45)
Which produces the following result:
I would like to make the individual bars in each group wider but without them overlapping and I would also like to increase the distance between each group of bars so that there is enough space in the plot.
It might be worth mentioning that the index of the dataframe is 0,...,13.
Help would be greatly appreciated!
TH
If you want to pack 10 apples in a box and want the apples to have more space between them you have two options: (1) take a larger box, or (2) use smaller apples.
(1) How do you change the size of figures drawn with matplotlib?
(2) change the width argument.
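
A minimal sketch of both options together, assuming a made-up 14 x 7 frame in place of the real data: figsize is the "larger box" and width is the "smaller apples". In pandas, width is the total width of one group (the distance between neighbouring groups is 1), so anything below 1 leaves a gap between groups.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's frame: 14 months (rows) x 7 groups (columns).
df = pd.DataFrame(np.arange(1, 99).reshape(14, 7), columns=list("ABCDEFG"))

# width=0.8 leaves a gap of 0.2 between neighbouring groups of bars;
# figsize=(14, 5) widens the figure so each bar is still drawn reasonably wide.
ax = df.plot.bar(width=0.8, figsize=(14, 5))
ax.legend(loc="center left", bbox_to_anchor=(1.0, 0.5))
```

The two numbers are starting points to tune by eye, not recommended values.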

sns.factorplot separate charts

I'm trying to create factorplots for values from one column with 18 values and I'm adding hue parameters as a different column also with 18 unique values, this results in huge chart that is not easy to read. So I want to create separate charts for every unique value from the column so that it's more clearly visible.
So currently it looks like this:
factorplot
And I want to split those 18 charts divided by hue into separate charts.
I was thinking of using loop but I'm stuck at this point:
for i in dframe.type1.unique():
    sns.factorplot(x='type1',data=dframe, kind='count')
You need to use the col parameter. Check out more examples on the seaborn doc page for factorplots about 2/3 of the way down.
sns.factorplot(x='type1', col='type2', col_wrap=4, data=dframe, kind='count',
               sharex=False, sharey=False)
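
Note that seaborn 0.9 renamed factorplot to catplot, and as far as I know the old name is no longer available in current releases. A sketch of the same call with the newer name, using a tiny made-up frame (the column values are hypothetical; only the column names type1 and type2 come from the question):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import pandas as pd
import seaborn as sns

# Hypothetical miniature of the asker's frame: two categorical columns.
dframe = pd.DataFrame({
    "type1": ["fire", "water", "fire", "grass", "water", "fire"],
    "type2": ["flying", "flying", "rock", "rock", "ice", "ice"],
})

# One count plot per unique value of type2, wrapped onto rows of 2 facets.
g = sns.catplot(data=dframe, x="type1", col="type2", col_wrap=2,
                kind="count", sharex=False, sharey=False)
```

The returned FacetGrid holds one axes per unique type2 value; with the real data, col_wrap=4 as in the answer above would give rows of four charts.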
