I've come across the issue of overlapping xlabels on a Seaborn line plot. The data set has multiple occurrences of those labels, which is understandable. But is there a way to fix the xlabels without having to change the format of the plot or the data frame?
The xlabels were converted to the Timestamp type earlier on, and the plot is shown below:
code:
plt.figure(figsize=(15,10))
sns.lineplot(data=data_no_orkney, x="Data_Month_Date", y=percentage)
plt.xticks(list(set(data_no_orkney.Data_Month_Date)))
#plt.axvline(x=pd.Timestamp(year=2020,month=3,day=23), color='r', ls='--', label="Date of first lockdown")
plt.xlabel("Year")
plt.ylabel("Percentage meeting target")
plt.show()
Also, would it be correct of me to assume that the solid blue line in the middle is the mean of the values shown in the lighter blue area? I've never seen such a line plot before, but that's more or less my understanding, judging by the looks of it.
I tried using plt.xticks(list), where the list contained the de-duplicated Timestamp (date) values. The only result was that the code took longer to run, and the labels did not change.
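One possible cause, assuming the column really holds Timestamps: plt.xticks(list(set(...))) asks for a tick at every unique date, so the labels collide no matter how the data frame is shaped. Below is a minimal sketch of an alternative that uses matplotlib's date locators to place one tick per year. It uses plain matplotlib with made-up monthly data (dates, values and the function name are hypothetical stand-ins for data_no_orkney and its plot); sns.lineplot returns a matplotlib Axes, so the same two xaxis calls should apply there too. On the side question: by default, seaborn's lineplot aggregates repeated x values and draws their mean as the solid line, and the shaded band is a 95% confidence interval for that mean rather than the full range of the values.

```python
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd


def plot_with_yearly_ticks(dates, values):
    """Plot values against dates with one x tick per year."""
    fig, ax = plt.subplots(figsize=(15, 10))
    ax.plot(dates, values)
    # One major tick per year instead of one per unique timestamp.
    ax.xaxis.set_major_locator(mdates.YearLocator())
    # Show only the year in each label.
    ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y"))
    ax.set_xlabel("Year")
    ax.set_ylabel("Percentage meeting target")
    return ax


# Hypothetical monthly data, only for illustration.
dates = pd.date_range("2018-01-01", periods=36, freq="MS")
values = [90 - 0.3 * i for i in range(36)]
```

Calling plot_with_yearly_ticks(dates, values) followed by plt.show() displays the figure. mdates.MonthLocator(interval=6) is an option if yearly ticks turn out to be too coarse.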
Related
I am trying to plot line charts for both nighttime and daytime to compare the differences in traffic volume in both time periods.
plt.subplot(2,1,1) #plot in grid chart to better compare differences
by_hour_business_night['traffic_volume'].plot.line()
plt.title('Business Nights Traffic Volume by Hours')
plt.ylabel('Traffic Volume')
plt.ylim(0,6500)
plt.show()
The chart for nighttime shows up alright, but the xtick labels are [0,5,10,15,20,25]. How can I change the labels to fit the hours? Something along the lines of: [0,1,2,3,4,5,6,19,20,21,22,23]
I have tried
x=[0,1,2,3,4,5,6,19,20,21,22,23]
plt.xticks(x)
But then I just got [0-6] on the left, and [19-23] on the right, both crammed on either side, leaving the middle of the xticks blank.
Or is there a better way to plot the chart? Since there will be a breaking point between 6 and 19 hours, is there a way to avoid this?
I am new to python and matplotlib, so forgive me if my wordings aren't precise enough.
xticks takes in two arguments: an array-like object of the placements and an array-like object of the labels. So you can do something like this:
plt.xticks(x, x)
This sets each label equal to the position of its tick. For more info, see the matplotlib documentation for xticks.
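If the goal is also to get rid of the empty stretch between 6 and 19, one option is to plot against evenly spaced positions and use the hours purely as labels. The sketch below assumes that approach; the function name and the volumes are made up for illustration, and the hours are reordered so the evening runs straight into the early morning.

```python
import matplotlib.pyplot as plt


def plot_hours_evenly(hours, volumes):
    """Plot volumes at evenly spaced positions, labelled with the hours."""
    positions = list(range(len(hours)))
    fig, ax = plt.subplots()
    ax.plot(positions, volumes)
    # Ticks sit at 0..n-1; the labels carry the real hour values.
    ax.set_xticks(positions)
    ax.set_xticklabels([str(h) for h in hours])
    ax.set_title("Business Nights Traffic Volume by Hours")
    ax.set_ylabel("Traffic Volume")
    ax.set_ylim(0, 6500)
    return ax


# Night hours ordered so that 23 is followed directly by 0.
hours = [19, 20, 21, 22, 23, 0, 1, 2, 3, 4, 5, 6]
# Made-up volumes, one per hour, only for illustration.
volumes = [3000, 2600, 2200, 1800, 1200, 800, 500, 400, 350, 600, 1500, 4000]
```

Calling plot_hours_evenly(hours, volumes) and then plt.show() draws the chart. The trade-off is that the x axis is no longer a true numeric hour scale, which is usually acceptable when only the night hours are shown.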
Hello, I am very new to Python and am starting to use it for creating graphs at work (for papers, reports, etc.). I was wondering if someone could help with the problem detailed below. I am guessing there is a very simple solution, but I can't figure it out and it is driving me insane!
Basically, I am plotting the results of an experiment. The y-axis shows the result, which in this case is numerical (Result), and the x-axis is categorical and labeled Location. The data is then split across four graphs based on which machine the experiment was carried out on (Machine, also categorical).
This first part is easy; the code used is this:
sns.catplot(x='Location', y='Result', data=df3, hue='Machine', col='Machine', col_wrap=2, linewidth=2, kind='swarm')
This gives me the following graph:
I now want to add another layer to the plot: a red line representing the upper spec limit for the data.
So I add the following line of code to the above:
sns.lineplot(x='Location', y=1.8, data=df3, linestyle='--', color='r', linewidth=2)
This then gives the following graph:
As you can see, the red line I want appears on only one of the graphs. All I want to do is add the same red line to all four graphs, in exactly the same position.
Can anyone help me???
You could use .map to draw a horizontal line on each of the subplots. You need to capture the generated FacetGrid object in a variable.
Here is an example:
import matplotlib.pyplot as plt
import seaborn as sns
titanic = sns.load_dataset('titanic').dropna()
g = sns.catplot(x='class', y='age', data=titanic,
                hue='embark_town', col='embark_town', col_wrap=2, linewidth=2, kind='swarm')
g.map(plt.axhline, y=50, ls='--', color='r', linewidth=2)
plt.tight_layout()
plt.show()
I am making a simple plot in Python with Matplotlib that shows populations of different regions over time. I have a CSV file with a column for each region's population over the years, so the years are on the x-axis and population is on the y-axis. The plot looks okay except for the y-axis. As you can see in the image, every single population value is included on the y-axis, which is too many values and is unnecessary. I would like the y-axis to have regular increments (such as 100 million). Is there a simple way to do that, or would I have to manually add my own increments?
I also tried scaling it linearly and logarithmically, but I would still prefer to have increments on the y-axis.
This is what the plot looks like right now.
(I took out unnecessary code such as legend and formatting):
data2 = pd.read_csv('data02_world.csv')
for region in data2:
    if region != 'Year':
        plt.plot(data2.Year, data2[region], marker='.', label=region)
plt.xlabel('Year')
plt.ylabel('Population')
plt.show()
I think you can simply do this with pandas:
data2 = pd.read_csv('data02_world.csv')
data2.set_index('Year', inplace=True)
data2.plot()
If you would rather stay with matplotlib, plt.yticks is what you need.
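A tick for every single value often means the population columns were read as text rather than numbers, though that is only a guess from the description. Here is a rough sketch that covers both points, assuming that is the case: it coerces each column with pd.to_numeric and then uses matplotlib's MultipleLocator to put a tick every 100 million. The function name, the region names and the figures in the sample frame are invented for illustration and are not from data02_world.csv.

```python
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker
import pandas as pd


def plot_population(data, step=100_000_000):
    """Plot every non-Year column against Year with y ticks every `step`."""
    fig, ax = plt.subplots()
    for region in data.columns:
        if region != "Year":
            # Coerce to numbers in case the CSV column was read as text.
            population = pd.to_numeric(data[region])
            ax.plot(data["Year"], population, marker=".", label=region)
    # Place a major y tick at every multiple of `step`.
    ax.yaxis.set_major_locator(mticker.MultipleLocator(step))
    ax.set_xlabel("Year")
    ax.set_ylabel("Population")
    return ax


# Hypothetical sample data, stored as strings to mimic the suspected problem.
data = pd.DataFrame({
    "Year": [2000, 2005, 2010, 2015, 2020],
    "Region A": ["120000000", "180000000", "260000000", "340000000", "410000000"],
    "Region B": ["90000000", "110000000", "150000000", "210000000", "280000000"],
})
```

Calling plot_population(data) and then plt.show() draws the chart. If the real file uses thousands separators, passing thousands=',' to pd.read_csv may be the simpler fix.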
I have created two line plots with this dataset. The first lineplot shows the number of flight accidents in a given year. The second lineplot shows the number of fatalities in a given year. I want to put both line plots on the same graph. This is the code I have used:
fatalities=df[['Fatalities','Date']]
fatalities['Year of Fatality']=fatalities['Date'].dt.year
fatalities.drop(columns='Date',inplace=True)
fatalities.set_index('Year of Fatality',inplace=True)
fatalities.sort_index(inplace=True)
plt.figure(figsize=(12,9))
plt.title("Number of Flight Accidents Since 1908",fontsize=20)
plt.ylabel("Number of Flight Accidents")
plt.xlabel("Year")
plt.xticks(year.index,rotation=90)
year.plot()
fatalities.plot()
plt.show()
What I get are two plots, one above the other: the plot which shows the number of fatalities and the plot which shows the number of flight accidents.
What I want is one graph that shows the two line plots. Any help would be much appreciated. (Side note: how can I rotate the xticks 90 degrees? I used the rotation argument in plt.xticks() but this had no effect.)
Given the use of .plot() and variables called df, I assume you're using pandas dataframes (if that's not the case, the answer still probably applies, look up the docs for your plot function).
Pandas' plot by default puts each plot on its own axes, unless you pass one to draw on via the ax argument:
fig, ax = plt.subplots()
year.plot(ax=ax)
fatalities.plot(ax=ax)
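For a fuller picture, here is a self-contained sketch of the same idea that also covers the side note about rotation. It assumes year and fatalities can each be reduced to a pandas Series indexed by year; the numbers and the function name are made up. The rotation is set on the shared Axes with tick_params after both lines are drawn.

```python
import matplotlib.pyplot as plt
import pandas as pd


def plot_accidents_and_fatalities(year, fatalities):
    """Draw both series on one Axes and rotate the x tick labels."""
    fig, ax = plt.subplots(figsize=(12, 9))
    # Passing the same ax to both calls keeps them on one graph.
    year.plot(ax=ax)
    fatalities.plot(ax=ax)
    # Rotate the x tick labels on this Axes by 90 degrees.
    ax.tick_params(axis="x", labelrotation=90)
    ax.set_title("Flight Accidents and Fatalities by Year", fontsize=20)
    ax.set_xlabel("Year")
    ax.legend()
    return ax


# Hypothetical yearly counts, only for illustration.
years = list(range(2000, 2010))
year = pd.Series([30, 28, 35, 31, 27, 25, 29, 24, 22, 20],
                 index=years, name="Accidents")
fatalities = pd.Series([900, 850, 1100, 700, 650, 800, 600, 550, 500, 480],
                       index=years, name="Fatalities")
```

Calling plot_accidents_and_fatalities(year, fatalities) and then plt.show() displays the combined graph. If the two quantities differ a lot in scale, a second y-axis via ax.twinx() may be worth considering.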
I'm making a scatter plot from a Pandas DataFrame with 3 columns. The first two would be the x and y axes, and the third would be classification data that I want to visualize by giving the points different colors. My question is, how can I add a legend to this plot:
df= df.groupby(['Month', 'Price'])['Quantity'].sum().reset_index()
df.plot(kind='scatter', x='Month', y='Quantity', c=df.Price , s = 100, legend = True);
As you can see, I'd like to automatically color the dots based on their price, so adding labels manually is a bit of an inconvenience. Is there a way I could add something to this code, that would also show a legend to the Price values?
Also, this colors the scatter plot dots on a range from black to white. Can I add custom colors without giving up the easy usage of c=df.Price?
Thank you!
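One way this might be approached, sketched under the assumption that Price is numeric: call matplotlib's scatter directly, keep the c=df["Price"] convenience, pick a colormap with cmap, and build the legend from the scatter's legend_elements() method. The function name and the sample frame are hypothetical, and "viridis" is just one colormap choice.

```python
import matplotlib.pyplot as plt
import pandas as pd


def scatter_by_price(df, cmap="viridis"):
    """Scatter Quantity against Month, colored by Price, with a legend."""
    fig, ax = plt.subplots()
    points = ax.scatter(df["Month"], df["Quantity"], c=df["Price"],
                        s=100, cmap=cmap)
    # Build one legend entry per color level found in the data.
    handles, labels = points.legend_elements()
    ax.legend(handles, labels, title="Price")
    ax.set_xlabel("Month")
    ax.set_ylabel("Quantity")
    return ax


# Hypothetical grouped data, only for illustration.
df = pd.DataFrame({
    "Month": [1, 2, 3, 4, 5, 6],
    "Price": [10, 10, 20, 20, 30, 30],
    "Quantity": [5, 8, 3, 9, 4, 7],
})
```

Calling scatter_by_price(df) and then plt.show() draws the plot. With many distinct prices a colorbar (fig.colorbar(points)) is likely to read better than a legend.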