I'm trying to plot a line chart based on 2 columns using seaborn from a dataframe imported as a .csv with pandas.
The data consists of ~97000 records spanning 19 years.
First part of the code: (I assume the code directly below shouldn't contribute to the issue, but will list it just in case)
# use pandas to read CSV files and prepare the timestamp column for recognition
temporal_fires = pd.read_csv("D:/GIS/undergraduateThesis/data/fires_csv/mongolia/modis_2001-2019_Mongolia.csv")
temporal_fires = temporal_fires.rename(columns={"acq_date": "datetime"})
# recognize the datetime column from the data
temporal_fires["datetime"] = pd.to_datetime(temporal_fires["datetime"])
# add a year column to the dataframe
temporal_fires["year"] = temporal_fires["datetime"].dt.year
temporal_fires['count'] = temporal_fires['year'].map(temporal_fires['year'].value_counts())
The plotting part of the code:
# plotting (seaborn)
plot1 = sns.lineplot(x="year",
                     y="count",
                     data=temporal_fires,
                     color='firebrick')
plt.gca().xaxis.set_major_formatter(FuncFormatter(lambda x, _: int(x)))
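plt.xlabel("Шаталт бүртгэгдсэн он", fontsize=10)  # "Year the fire was recorded"
plt.ylabel("Бүртгэгдсэн шаталтын тоо")  # "Number of recorded fires"
plt.title("2001-2019 он тус бүрт бүртгэгдсэн шаталтын график")  # "Graph of fires recorded in each year, 2001-2019"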
plt.xticks(fontsize=7.5, rotation=45)
plt.yticks(fontsize=7.5)
Python doesn't return any errors and does show the figure:
... but (1) the tick labels are not aligned with the vertices of the line and (2) I want the x-axis ticks to show every year instead of skipping some. For the latter, I found a Stack Overflow post, but it was about a heatmap, so I'm not sure how to apply it in this case.
How do I align them properly and show all ticks?
Thank you.
I found my answer, just in case anyone makes the same mistake.
The line
plt.gca().xaxis.set_major_formatter(FuncFormatter(lambda x, _: int(x)))
only changed the text of the X tick labels by truncating each tick value to an integer; the tick positions themselves stayed the same. The misalignment happened because I had merely relabelled a tick at "2001.5" as "2001" without moving it, so the label no longer sat on the data point for that year.
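To make the effect concrete, here is a small sketch that calls the same formatter directly. The tick value 2001.5 is only an illustrative position of the kind the default locator can pick; it is not read from the original figure.

```python
from matplotlib.ticker import FuncFormatter

# Same formatter as in the question: it only decides which text is drawn.
formatter = FuncFormatter(lambda x, _: int(x))

tick_position = 2001.5  # hypothetical tick position chosen by the default locator
tick_label = formatter(tick_position, None)

# The label reads 2001, but the tick is still drawn at x = 2001.5,
# halfway between the data points for 2001 and 2002.
print(tick_position, tick_label)
```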
As for the label intervals, the addition of this line...
plt.xticks(np.arange(min(temporal_fires['year']), max(temporal_fires['year'])+1, 1.0))
...showed me all of the year values in the plot instead of skipping them.
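For reference, a minimal sketch of the whole fix on synthetic data. The weekly dates below are made up, and the per-year aggregation with groupby is my own simplification rather than part of the original code; plain matplotlib is used here, though sns.lineplot accepts the same ax argument.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Synthetic stand-in for the fire records: one row per detection.
fires = pd.DataFrame(
    {"datetime": pd.date_range("2001-01-01", "2019-12-31", freq="7D")}
)
fires["year"] = fires["datetime"].dt.year

# One row per year instead of repeating the yearly total on every record.
yearly = fires.groupby("year").size().reset_index(name="count")

fig, ax = plt.subplots()
ax.plot(yearly["year"], yearly["count"], color="firebrick")

# Put a tick on every integer year so each label sits exactly on a data point.
years = np.arange(yearly["year"].min(), yearly["year"].max() + 1)
ax.set_xticks(years)
ax.tick_params(axis="x", labelsize=7.5, labelrotation=45)
```

Because the ticks are placed on integer years, no formatter is needed to hide fractional values. Call plt.show() afterwards to display the figure.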
I'm new to python and trying to plot the PSD in separate plots for each electrode of my EEG dataset via a for loop. The title of the plot should include the respective electrode name.
Here is the code I use to load the data from a .txt file:
k = pd.read_csv(r'C:\Users\LPC\Desktop\rest txt 7min\AB24_rest_asr_ICA_MARA_7min.txt',usecols=['AFp2','F9','AFF5h','AFF1h','AFF2h','AFF6h','F10','FFT9h','FFT7h','FFC5h','FFC3h','FFC1h','FFC2h','FFC4h','FFC6h','FFT8h','FFT10h','FC1','FCz','FC2','FTT9h','FTT7h','FCC5h','FCC3h','FCC1h','FCC2h','FCC4h','FCC6h','FTT8h','FTT10h','Cz','TTP7h','CCP5h','CCP3h','CCP1h','CCP2h','CCP4h','CCP6h','TTP8h','CPz','TPP9h','TPP7h','CPP5h','CPP3h','CPP1h','CPP2h','CPP4h','CPP6h','TPP8h','TPP10h','Pz','PPO1h','PPO2h','P9','PPO9h','POO1','POO2','PPO10h','P10','POO9h','OI1h','OI2h','POO10h'], sep=",")
k.columns = ['AFp2','F9','AFF5h','AFF1h','AFF2h','AFF6h','F10','FFT9h','FFT7h','FFC5h','FFC3h','FFC1h','FFC2h','FFC4h','FFC6h','FFT8h','FFT10h','FC1','FCz','FC2','FTT9h','FTT7h','FCC5h','FCC3h','FCC1h','FCC2h','FCC4h','FCC6h','FTT8h','FTT10h','Cz','TTP7h','CCP5h','CCP3h','CCP1h','CCP2h','CCP4h','CCP6h','TTP8h','CPz','TPP9h','TPP7h','CPP5h','CPP3h','CPP1h','CPP2h','CPP4h','CPP6h','TPP8h','TPP10h','Pz','PPO1h','PPO2h','P9','PPO9h','POO1','POO2','PPO10h','P10','POO9h','OI1h','OI2h','POO10h']
I don't know if this is a sensible approach, but the idea is for k to hold the data and k.columns to hold the electrode names.
Then I use the following for loop:
for columns in k:
    # Welch PSD estimate for one electrode; 'hann' is the canonical SciPy name for the Hanning window
    freqs, psd = signal.welch(k[columns], fs=500,
                              window='hann', nperseg=40, noverlap=20,
                              scaling='density', average='mean')
    plt.figure(figsize=(5, 4))
    plt.plot(freqs, psd)
    plt.title('PSD: power spectral density')
    plt.xlabel('Frequency')
    plt.ylabel('Power')
    plt.axis([0, 50, -1, 5])
    plt.show()
How can I add a loop in the title of the plot that contains the electrode name?
Thank you very much for your precious help! :)
The comment from @Mr. T was really helpful:
Use f-string formatting plt.title(f'PSD: power spectral density for {columns}')? You probably will also benefit from getting familiar with subplots and axis objects. – Mr. T
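A minimal sketch of that suggestion, assuming random noise in place of the EEG recording and only three of the electrode names; the loop variable electrode is my own naming.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy import signal

# Synthetic stand-in for the EEG table: three channels, 10 s at 500 Hz.
fs = 500
rng = np.random.default_rng(0)
k = pd.DataFrame(rng.standard_normal((10 * fs, 3)), columns=["AFp2", "F9", "Cz"])

titles = []
for electrode in k.columns:
    freqs, psd = signal.welch(k[electrode], fs=fs, window="hann",
                              nperseg=40, noverlap=20, scaling="density")
    fig, ax = plt.subplots(figsize=(5, 4))
    ax.plot(freqs, psd)
    # The f-string inserts the current column name into the title.
    ax.set_title(f"PSD: power spectral density for {electrode}")
    ax.set_xlabel("Frequency")
    ax.set_ylabel("Power")
    ax.set_xlim(0, 50)
    titles.append(ax.get_title())
```

Each pass through the loop creates its own figure, so the titles end in AFp2, F9 and Cz in turn. Call plt.show() after the loop to display them.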
I'm new to matplotlib. I'm writing a Stock Market application in Python.
In the application I have a chart with 2 different line graphs to display. One is "Price" and the other is "VWAP Indicator". I'm trying to plot them using matplotlib's twinx() function, so that both share the same x-axis.
The problem is: the price dataset has a length of 100, while the "VWAP Indicator" dataset has a length of just 1 (it will grow to 100 as new data is fetched from the server and calculated).
Here is my code:
self.figure, ax1 = plt.subplots()
ax1.plot(prices_dataframe, 'b-')
ax2 = ax1.twinx()
ax2.plot(vwaps_dataframe, 'r-')
plt.autoscale(enable=True, axis='x')
plt.title("Intraday with VWAP")
plt.grid()
helper.chart_figure = self.figure
Here are the datasets:
And here is what I get on the charts:
How do I solve this? Do I need to pad the second dataset with dummy rows, or is there a simpler and more elegant solution? Any help would be appreciated.
Thanks in advance.
I found the cause of the issue by looking at the dataset image:
The first dataset had timezone-aware datetime objects, while in the second dataset I was inserting the date and time as a string.
I also had to add the following lines to make the x axis display the proper time in my timezone:
ax1.xaxis_date(tz='Asia/Kolkata')
ax2.xaxis_date(tz='Asia/Kolkata')
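A sketch of how the string timestamps could be converted before plotting. The prices, the single VWAP row and the variable names are invented for illustration; the real application presumably builds these frames differently.

```python
import pandas as pd
import matplotlib.pyplot as plt

tz = "Asia/Kolkata"

# Prices indexed by timezone-aware timestamps (synthetic values).
price_index = pd.date_range("2021-01-04 09:15", periods=100, freq="min", tz=tz)
prices = pd.Series(range(100, 200), index=price_index, dtype=float)

# The VWAP row arrives with its time as a plain string, as in the original problem.
vwaps = pd.Series({"2021-01-04 09:15:00": 100.0})
# Parse the strings and attach the same timezone so both series share one time scale.
vwaps.index = pd.to_datetime(vwaps.index).tz_localize(tz)

fig, ax1 = plt.subplots()
ax1.plot(prices.index, prices.values, "b-")
ax2 = ax1.twinx()
# A single point has no line segment, so draw it with a marker.
ax2.plot(vwaps.index, vwaps.values, "r.")
ax1.xaxis_date(tz=tz)
ax2.xaxis_date(tz=tz)
ax1.set_title("Intraday with VWAP")
```

With both indexes holding real timestamps, no padding rows are needed: the single VWAP point is simply placed at its own time on the shared x-axis.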
I have created two line plots with this dataset. The first lineplot shows the number of flight accidents in a given year. The second lineplot shows the number of fatalities in a given year. I want to put both line plots on the same graph. This is the code I have used:
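fatalities = df[['Fatalities', 'Date']].copy()  # copy so the new column is not set on a view of df
fatalities['Year of Fatality'] = fatalities['Date'].dt.year
fatalities.drop(columns='Date', inplace=True)  # without columns=, drop() looks for a row labelled 'Date'
fatalities.set_index('Year of Fatality', inplace=True)
fatalities.sort_index(inplace=True)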
plt.figure(figsize=(12,9))
plt.title("Number of Flight Accidents Since 1908",fontsize=20)
plt.ylabel("Number of Flight Accidents")
plt.xlabel("Year")
plt.xticks(year.index,rotation=90)
year.plot()
fatalities.plot()
plt.show()
What I get are two plots, one above the other: the plot which shows the number of fatalities and the plot which shows the number of flight accidents.
What I want is one graph that shows the two line plots. Any help would be much appreciated. (Side note: how can I rotate the xticks 90 degrees? I used the rotation argument in plt.xticks() but this had no effect.)
Given the use of .plot() and a variable called df, I assume you're using pandas dataframes (if that's not the case, the answer probably still applies; look up the docs for your plot function).
By default, pandas' plot puts each plot on its own axes, unless you pass it one to draw on via the ax argument:
fig, ax = plt.subplots()
year.plot(ax=ax)
fatalities.plot(ax=ax)
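A fuller sketch with made-up yearly numbers, which also covers the side note about rotation by setting it on the same Axes object. The data and the title are assumptions for illustration only.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Synthetic stand-ins for the two yearly series (values are made up).
years = np.arange(1990, 2000)
year = pd.Series(np.arange(10, 20), index=years, name="Accidents")
fatalities = pd.DataFrame({"Fatalities": np.arange(100, 200, 10)}, index=years)

fig, ax = plt.subplots(figsize=(12, 9))
year.plot(ax=ax, legend=True)  # both calls draw on the same Axes
fatalities.plot(ax=ax)
ax.set_title("Number of Flight Accidents and Fatalities per Year", fontsize=20)
ax.set_xlabel("Year")
ax.set_xticks(years)
# Rotate the labels on the Axes that actually holds the lines.
ax.tick_params(axis="x", labelrotation=90)
```

Setting the ticks and rotation through ax, rather than through plt, avoids depending on which figure happens to be current when the call is made.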
I have separated my training data to carry out validation and was trying to follow an example on how to graph the training and validation data in one time series chart where the lines would connect and change colors at the validation data (see picture for example I was following).
I am following the example fairly closely (adjusted for my own data) and am getting two separate charts.
Code and output from the example I followed:
Train.Count.plot(figsize=(15,8), title= 'Daily Ridership', fontsize=14, label='train')
valid.Count.plot(figsize=(15,8), title= 'Daily Ridership', fontsize=14, label='valid')
plt.xlabel("Datetime") plt.ylabel("Passenger count") plt.legend(loc='best') plt.show()
My code (using different data that is set up to provide the sum of values in pandas dataframes bananas_train and bananas_val) is as follows:
bananas_train.plot(figsize=(15,8), title= 'Bananas Daily', fontsize=14, label='train')
bananas_val.plot(figsize=(15,8), title= 'Bananas Daily', fontsize=14, label='valid')
plt.xlabel("Date")
plt.ylabel("Quantity Picked")
plt.legend(loc='best')
plt.show()
I split the last line of the example into separate statements because it raised a syntax error as written.
I have attempted to redo with the following code and still get two separate graphs:
#attempt 2
plt.figure(figsize=(28,8))
ax1 = bananas_train.plot(color='blue', grid=True, label='train')
ax2 = bananas_val.plot(color='red', grid=True, label='valid')
plt.xlabel("Date")
plt.ylabel("Quantity Picked")
h1, l1 = ax1.get_legend_handles_labels()
h2, l2 = ax2.get_legend_handles_labels()
plt.legend(h1+h2, l1+l2, loc='best')
plt.show()
My result shows two separate time series charts rather than one. Any ideas on how I can get these two charts to merge?
Edit: If I follow the pages that were listed as duplicates of this question...
Plotting multiple dataframes using pandas functionality
Plot different DataFrames in the same figure
I get a graph with both lines, but with the x axis matching up from the beginning rather than continuing from where the last line left off.
Here is my code where I attempted to follow those directions and got this result:
#attempt 3
ax = bananas_train.plot()
bananas_val.plot(ax=ax)
plt.show()
I need my result to continue the line graphed with the newly merged data, not overlap the lines.
Next time it would be nice to include some sample data in your question.
I am sure there are better ways of achieving what you want but I will list one here:
# get the length of your training and checking sets, assuming your data are
# in 2 dataframes (test and check respectively), each with a column named 'count'
t_len=len(test['count'].index)
c_len=len(check['count'].index)
# concatenate your dataframes
test_check= pd.concat([test['count'],check['count']],ignore_index=True)
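# make two copies and zero out one set in each:
# test_1 is zeroed over the training rows, check_1 over the checking rows
test_1=test_check.copy()
check_1= test_check.copy()
test_1.iloc[0:t_len]=0
check_1.iloc[t_len:]=0  # the checking rows start at position t_len, not c_len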
#make the plot
test_1.plot(figsize=(15,8),color='blue')
check_1.plot(figsize=(15,8),color='red')
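As an alternative sketch, if both frames are indexed by date, drawing them on one Axes may already give a continuing line without concatenating or zero-filling. This assumes a DatetimeIndex on both sets, which the question does not confirm; the data and the split date below are invented.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Synthetic daily totals; the split date is an arbitrary choice for this sketch.
dates = pd.date_range("2020-01-01", periods=120, freq="D")
daily = pd.Series(range(120), index=dates, dtype=float)
bananas_train = daily.loc[:"2020-03-31"]
bananas_val = daily.loc["2020-04-01":]

fig, ax = plt.subplots(figsize=(15, 8))
# Each point is placed at its own date, so the second line starts
# where the first one ends instead of restarting at the left edge.
ax.plot(bananas_train.index, bananas_train.values, color="blue", label="train")
ax.plot(bananas_val.index, bananas_val.values, color="red", label="valid")
ax.set_xlabel("Date")
ax.set_ylabel("Quantity Picked")
ax.set_title("Bananas Daily")
ax.legend(loc="best")
```

If the two lines still overlap with real data, the indexes are probably plain row numbers that restart at 0 in each frame, in which case the concatenation approach above (or setting the date column as the index first) is needed.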