Matplotlib violinplots overlap on the same column - python

I want to create a figure with several violin plots on the same axes, each in its own column (x-position) rather than all in the same one.
My data is a list of dataframes, and I want one violin plot of a single column per dataframe. (Ideally each violin would be labelled with a name that is stored in another column of its dataframe.)
I used this code:
for i in range(0, len(sta_list)):
    plt.violinplot(sta_list[i]['diff_APS_1'])
I know that this is wrong: the violins all end up in the same column, and I want them split up side by side in the figure.

You can specify the x-position of each violin plot using the positions argument:
for i in range(0, len(sta_list)):
    plt.violinplot(sta_list[i]['diff_APS_1'], positions=[i])
A sample for demonstration, using a dataset taken from another post:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
x = np.random.poisson(lam =3, size=100)
y = np.random.choice(["S{}".format(i+1) for i in range(4)], size=len(x))
df = pd.DataFrame({"Scenario":y, "LMP":x})
fig, ax = plt.subplots()
for i, key in enumerate(['S1', 'S2', 'S3', 'S4']):
    ax.violinplot(df[df.Scenario == key]["LMP"].values, positions=[i])
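
The question also asks for each violin to be labelled with a name stored inside its dataframe. A minimal sketch of that, assuming every dataframe carries one repeated name in a label column (the column name 'station', the helper violins_by_frame and the synthetic data are made up for illustration):

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


def violins_by_frame(frames, value_col, label_col):
    """Draw one violin per DataFrame, side by side, labelled from label_col."""
    fig, ax = plt.subplots()
    labels = []
    for i, frame in enumerate(frames):
        # positions=[i] puts the i-th violin in its own column
        ax.violinplot(frame[value_col].to_numpy(), positions=[i])
        # Assumption: label_col holds the same name on every row of a frame
        labels.append(str(frame[label_col].iloc[0]))
    ax.set_xticks(range(len(frames)))
    ax.set_xticklabels(labels)
    return fig, ax


# Synthetic stand-in for sta_list; 'station' is a hypothetical column name
rng = np.random.default_rng(0)
sta_list = [
    pd.DataFrame({"diff_APS_1": rng.normal(loc=k, size=50), "station": f"STA{k}"})
    for k in range(3)
]
fig, ax = violins_by_frame(sta_list, "diff_APS_1", "station")
```

Call plt.show() afterwards to display the figure.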


How to force display of x- and y-axis for each subplot in plotly express

I want to plot a histogram with row and column facets using plotly.express, where each subplot gets its own x- and y-axis (for better readability). When looking at the documentation (e.g. the section "Histogram Facet Grids") I can see a lot of examples where the x- and y-axes are repeated. But in my case, this is not done automatically.
import numpy as np
import pandas as pd
import plotly.express as px
# create a dummy dataframe with lots of variables
rng = np.random.default_rng(42)
n_vars = 3
n_samples = 10
random_vars = [rng.normal(size=n_samples) for v in range(n_vars)]
m = np.vstack(random_vars).T
columns = pd.MultiIndex.from_tuples([('a','b'),('a','c'),('b','c')],names=['src','tgt'])
df = pd.DataFrame(m,columns=columns)
# convert to long format
df_long = df.melt()
# plot with plotly
fig = px.histogram(df_long,x='value',facet_row='src',facet_col='tgt')
fig.update_layout(yaxis={'side': 'left'})
which gives me:
How do I post-hoc configure the figure so that the x- and y-axis are shown for each subplot?
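All you need to do is customize each y and x axis: showticklabels=True displays the tick labels on every subplot, and matches=None stops the axes from being linked to each other: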
fig.for_each_yaxis(lambda y: y.update(showticklabels=True,matches=None))
fig.for_each_xaxis(lambda x: x.update(showticklabels=True,matches=None))

Multiple boxplot in a single Graphic in Python

I'm a beginner in Python.
In my internship project I am trying to plot boxplots from data contained in a CSV file.
I need to plot boxplots for each of the 4 (four) variables shown above (AAG, DENS, SRG and RCG). Since each variable has columns numbered from [001] to [100], there will be 100 boxplots for each variable, which need to be plotted in a single graph as shown in the image.
This is the graph I need to plot, but for each variable there will be 100 boxplots, as each one has 100 columns of values:
The x-axis is the "Year", which ranges from 2025 to 2030, so I need a graph like the one shown in figure 2 for each year, and the y-axis is the set of values for each variable.
Using the pandas melt function and the seaborn library I was able to plot the boxplots of only one column. But that's not what I need:
import pandas as pd
import seaborn as sns
df = pd.read_csv("2DBM_50x50_Central_Aug21_Sim.cliped.csv")
mdf= df.melt(id_vars=['Year'], value_vars='AAG[001]')
ax=sns.boxplot(x='Year', y='value',width = 0.2, data=mdf)
Result of the code above:
What can I try to resolve this?
The following code gives you five subplots, where each subplot only contains the data of one variable. Then a boxplot is generated for each year. To change the range of columns used for each variable, change the upper limit in var_range = range(1, 101), and to see the outliers change showfliers to True.
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
df = pd.read_csv("2DBM_50x50_Central_Aug21_Sim.cliped.csv")
variables = ["AAG", "DENS", "SRG", "RCG", "Thick"]
period = range(2025, 2031)
var_range = range(1, 101)
fig, axes = plt.subplots(2, 3)
flattened_axes = fig.axes
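for i, var in enumerate(variables):
    # Column names for this variable; adjust the pattern to match your CSV header
    var_columns = [f"TB_acc_{var}[{j:05}]" for j in var_range]
    # Long format: one row per (Period, column) pair, values in a column named after the variable
    data = df.melt(id_vars=["Period"], value_vars=var_columns, value_name=var)
    ax = flattened_axes[i]
    sns.boxplot(x="Period", y=var, width=0.2, data=data, ax=ax, showfliers=False)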
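
To see what the melt step produces before it is handed to seaborn, here is a small sketch on synthetic data. It assumes the column naming from the question (AAG[001] to AAG[100]) and a Year column; the real CSV may be laid out differently:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
years = list(range(2025, 2031))

# Synthetic stand-in for the CSV: one row per year, columns AAG[001]..AAG[100]
columns = [f"AAG[{j:03}]" for j in range(1, 101)]
wide = pd.DataFrame(rng.normal(size=(len(years), len(columns))), columns=columns)
wide.insert(0, "Year", years)

# Long format: one row per (Year, column) pair
long = wide.melt(id_vars=["Year"], value_vars=columns, value_name="AAG")

# Each year now owns 100 values, which one box per year then summarises
counts = long.groupby("Year")["AAG"].size()
```

The long frame has the columns Year, variable and AAG, so it can be passed to sns.boxplot with x="Year" and y="AAG".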

Changing the order of pandas/matplotlib line plotting without changing data order

Given the following example:
df = pd.DataFrame(np.random.randint(1,10, size=(8,3)), columns=list('XYZ'))
The order of plotting puts the last column on top:
How can I make this keep the data & legend order but change the behaviour so that it plots X on top of Y on top of Z?
(I know I can change the data column order and edit the legend order but I am hoping for a simpler easier method leaving the data as is)
UPDATE: final solution used:
(Thanks to r-beginners) I used get_lines to modify the z-order of each line:
df = pd.DataFrame(np.random.randint(1,10, size=(8,3)), columns=list('XYZ'))
fig = plt.figure()
ax = fig.add_subplot(111)
df.plot(ax=ax, linewidth=10)
lines = ax.get_lines()
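for i, line in enumerate(lines, -len(lines)):
    # i runs from -len(lines) to -1, so abs(i) gives the first column the highest zorder
    line.set_zorder(abs(i))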
In a notebook produces:
Get the default zorder and sort it in the desired order.
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randint(1,10, size=(8,3)), columns=list('XYZ'))
ax = df.plot(linewidth=10)
l = ax.get_children()
Before definition
After defining zorder
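
A minimal sketch of the zorder idea, assuming the goal is X on top of Y on top of Z: artists with a higher zorder are drawn later, so the first column gets the largest value. The data and the legend are left untouched.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame(np.random.randint(1, 10, size=(8, 3)), columns=list("XYZ"))
ax = df.plot(linewidth=10)

lines = ax.get_lines()
# First column gets the highest zorder, so it is drawn last (on top)
for rank, line in enumerate(lines):
    line.set_zorder(len(lines) - rank)

# Map each column label to the zorder it ended up with
zorders = {line.get_label(): line.get_zorder() for line in lines}
```

The values stay below the default legend zorder of 5, so the legend is still drawn over the lines.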
I will just put this answer here because it is a solution to the problem, but probably not the one you are looking for.
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
# generate data
df = pd.DataFrame(np.random.randint(1,10, size=(8,3)), columns=list('XYZ'))
# read columns in reverse order and plot them
# so normally, the legend will be inverted as well, but if we invert it again, you should get what you want
df[df.columns[::-1]].plot(linewidth=10, legend="reverse")
Note that in this example, you don't change the order of your data, you just read it differently, so I don't really know if that's what you want.
You can also make it easier on the eyes by creating a corresponding method.
def plot_dataframe(df: pd.DataFrame) -> None:
    df[df.columns[::-1]].plot(linewidth=10, legend="reverse")
# then you just have to call this
df = pd.DataFrame(np.random.randint(1,10, size=(8,3)), columns=list('XYZ'))
plot_dataframe(df)

How to remove certain values before plotting data

I'm using Python for the first time. I have a CSV file with a few columns of data: location, height, density, day, etc. I am plotting height (i_h100) vs. density (i_cd) and have managed to constrain the height to values below 50 with the code below. I now want to restrict the plotted points to those whose 'day' value lies within a certain range, say 85-260. I can't work out how to do this.
import pandas
import matplotlib.pyplot as plt
Use .loc to subset data going into graph.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# Make some dummy data
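# 'b' and 'c' are assumed here to be uniform random values in [0, 1)
df = pd.DataFrame({'a': np.random.randint(0, 365, 20),
                   'b': np.random.rand(20),
                   'c': np.random.rand(20)})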
# all data: plot of 'b' vs. 'c'
df.plot(kind='scatter', x='b', y='c')
# use .loc to subset data displayed based on value in 'a'
# can also use .loc to restrict values of 'b' displayed rather than plt.xlim
df.loc[df['a'].between(85,260) & (df['b'] < 0.5)].plot(kind='scatter', x='b', y='c')
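
Applied to the column names from the question, the same filter might look like the sketch below. The data is synthetic, and the column name 'day' is an assumption based on the description of the CSV:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
# Synthetic stand-in for the CSV; 'day' is an assumed column name
df = pd.DataFrame({
    "day": rng.integers(0, 365, size=200),
    "i_h100": rng.uniform(0, 80, size=200),
    "i_cd": rng.uniform(0, 1, size=200),
})

# between() is inclusive on both ends by default
mask = df["day"].between(85, 260) & (df["i_h100"] < 50)
subset = df.loc[mask]
```

subset.plot(kind='scatter', x='i_h100', y='i_cd') then draws only the rows that pass both conditions, so no axis limits are needed to hide the rest.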

Can we plot a particular column of a row against another column of same row in matplotlib

I am using python to plot my data set. I want a particular column of a row to be plotted against another column of same row. To be precise, I want my two columns to be the x-axis and y-axis and then plot a particular value entered by the user to be plotted on that graph.
import matplotlib.pyplot as plt
import pandas
import numpy as np
filename = 'friuts.csv'
raw_data = open(filename, 'rb')
data = pandas.read_csv(raw_data)
mydata = pandas.DataFrame(np.random.randn(10,2), columns=['col1','col2'])
My data set has a column with the fruit name and two further columns with weights. Can those two weights be used as the x- and y-axis? I only want to plot a single row at a time.
What I have tried so far plots the entire columns, i.e. all rows.
Is this what you are looking for?
plt.scatter(mydata.col1, mydata.col2)
Assuming that you want to plot a single point with the information in a given row:
Select the row using the fruit name. With a boolean mask this returns a one-row pd.DataFrame.
Call the plot method on the result.
For example:
import pandas as pd
import matplotlib.pyplot as plt
# Create the data frame
mydata = pd.DataFrame({
    'name': ['banana', 'mango', 'lima', 'apple'],
    'weight': [1, 2, 3, 4]})
# Select the fruit you want to plot. The boolean mask returns a one-row
# pd.DataFrame including the columns 'name' and 'weight'
to_plot = mydata[mydata['name'] == 'banana']
# Call the plot method, indicating which columns to use for the X and Y axes.
fig, ax = plt.subplots()
to_plot.plot(x='name', y='weight', marker='o', ax=ax)
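
If the two weight columns should serve as the x- and y-axis, as the question describes, a single row can be drawn as one scatter point. This is only a sketch: the column names weight_1 and weight_2, the helper plot_fruit and the values are made up for illustration.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical data: one name column and two weight columns
fruits = pd.DataFrame({
    "name": ["banana", "mango", "lima", "apple"],
    "weight_1": [1.0, 2.0, 3.0, 4.0],
    "weight_2": [1.5, 2.5, 2.0, 4.5],
})


def plot_fruit(df, fruit, ax=None):
    """Plot weight_1 against weight_2 for the row(s) matching the fruit name."""
    row = df.loc[df["name"] == fruit]
    if row.empty:
        raise ValueError(f"no row found for {fruit!r}")
    if ax is None:
        _, ax = plt.subplots()
    ax.scatter(row["weight_1"], row["weight_2"], label=fruit)
    ax.set_xlabel("weight_1")
    ax.set_ylabel("weight_2")
    ax.legend()
    return ax


ax = plot_fruit(fruits, "mango")
```

The fruit name could come from input() so that the user chooses the row. Call plt.show() to display the figure.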
