I have a datetime series (called date) like this:
0 2012-06-26
1 2011-02-22
2 2012-06-06
3 2013-02-10
4 2004-01-01
5 2011-01-25
6 2015-11-02
I want to make a scatter plot with dates on the Y axis and months and years on the X axis.
I've played around with pyplot.plot_date, but can't figure out any solution.
It should be something like this, only with dates on the Y axis.
Any advice?
Take a look at matplotlib; it's really useful.
This might also help you to start the plotting.
dates = [
'2012-06-26',
'2011-02-22',
'2012-06-06',
'2013-02-10',
'2004-01-01',
'2011-01-25',
'2015-11-02',
]
year = []
month = []
day = []
# Sorts your data into a usable dataset
def sort_data():
    for i in range(len(dates)):
        extracted_year = dates[i][0:4]
        extracted_year = int(extracted_year)
        year.append(extracted_year)
    for j in range(len(dates)):
        extracted_month = dates[j][5:7]
        extracted_month = int(extracted_month)
        month.append(extracted_month)
    for k in range(len(dates)):
        extracted_day = dates[k][8:10]
        extracted_day = int(extracted_day)
        day.append(extracted_day)
sort_data()
# Just checking that sort_data() worked correctly
print(year)
print(month)
print(day)
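The slicing above can also be replaced by the standard library's date parser. A minimal sketch, assuming the same list of 'YYYY-MM-DD' strings (the name `parsed` is introduced here only for illustration):

```python
from datetime import datetime

dates = [
    '2012-06-26', '2011-02-22', '2012-06-06', '2013-02-10',
    '2004-01-01', '2011-01-25', '2015-11-02',
]

# Parse each string once, then read the parts off the datetime object.
parsed = [datetime.strptime(d, '%Y-%m-%d') for d in dates]
year = [p.year for p in parsed]
month = [p.month for p in parsed]
day = [p.day for p in parsed]

print(year)
print(month)
print(day)
```

Parsing also rejects malformed strings with a ValueError, which fixed-position slicing would silently accept.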
So far, my best solution is converting the year and month to a float and the day to an int:
import pandas as pd
from matplotlib import pyplot
dates = [
'2012-06-26',
'2011-02-22',
'2012-06-06',
'2013-02-10',
'2004-01-01',
'2011-01-25',
'2015-11-02',]
# Build the datetime series (called date in the question) from the strings
date = pd.to_datetime(pd.Series(dates))
fig, ax = pyplot.subplots()
ax.scatter(date.apply(lambda x: float(x.strftime('%Y.%m'))), date.apply(lambda x: x.day), marker='o')
pyplot.show()
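One caveat with the '%Y.%m' float: December becomes 2012.12 and the following January 2013.01, so months are not evenly spaced along the X axis. A possible alternative, sketched here with a hypothetical helper `year_fraction`, places each month at year + (month - 1) / 12:

```python
from datetime import datetime

def year_fraction(d):
    """Return the year plus the fraction of it elapsed, counted in whole months."""
    return d.year + (d.month - 1) / 12

dates = [
    '2012-06-26', '2011-02-22', '2012-06-06', '2013-02-10',
    '2004-01-01', '2011-01-25', '2015-11-02',
]
parsed = [datetime.strptime(s, '%Y-%m-%d') for s in dates]

x = [year_fraction(d) for d in parsed]  # evenly spaced month positions
y = [d.day for d in parsed]             # day of month for the Y axis
```

The two lists can then be passed to ax.scatter(x, y) in place of the strftime-based values.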
Since your question is not completely clear to me, I assume you want a scatter plot of all the input datetime entries between the oldest and the most recent year (taken from the given input only).
Also, this is my very first attempt with the matplotlib library, so I just hope the following answer leads you to your expected result.
import datetime
import numpy
import matplotlib.pyplot as plt
from matplotlib.dates import MONDAY
from matplotlib.dates import DateFormatter, MonthLocator, WeekdayLocator
from matplotlib.dates import date2num
# Input datetime series.
dt_series = ["2012-06-26", "2011-02-22", "2012-06-06", "2013-02-10", "2004-01-01", "2011-01-25", "2015-11-02"]
mondays = WeekdayLocator(MONDAY)
months = MonthLocator(range(1, 13), bymonthday=1, interval=6)
monthsFmt = DateFormatter("%b'%y")
# Loop to create our own X-axis and Y-axis values.
# mths is for X-axis values, which are months along with year.
# dates is for Y-axis values, which are dates.
mths = list()
dates = list()
for dt in dt_series:
    mths.append(date2num(datetime.datetime.strptime(dt.replace('-', ''), "%Y%m%d")))
    dates.append(numpy.float64(float(dt.split('-')[2])))
fig, ax = plt.subplots(squeeze=True)
ax.plot_date(mths, dates, 'o', tz=None, xdate=True, ydate=False)
ax.xaxis.set_major_locator(months)
ax.xaxis.set_major_formatter(monthsFmt)
ax.xaxis.set_minor_locator(mondays)
ax.autoscale_view()
ax.grid(True)
fig.autofmt_xdate()
plt.show()
I would also suggest going through the following link, which I used as my reference:
http://matplotlib.org/examples/pylab_examples/date_demo2.html
Thank You
Related
I have read in a monthly temperature anomalies CSV file using the pandas read_csv() function. The years run from 1881 to 2022 (I excluded the last 3 months of 2022 to avoid -999 values). The date format is yyyy-mm-dd. How can I plot just the year, with only one label per year instead of 12 on the x-axis (i.e., I don't need 12 1851s, 1852s, etc.)?
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.dates import YearLocator, MonthLocator, DateFormatter
import matplotlib.dates as mdates
ds = pd.read_csv('path_to_file.csv', header='infer', engine='python', skipfooter=3)
dates = ds['Date']
tAnoms = ds[' Berkeley Earth 2m Air Temperature (degree C) 0N-90N;0E-360E']
fig = plt.figure(figsize=(10,10))
ax = plt.subplot(111)
ax.plot(dates,tAnoms)
ax.plot(dates,tAnoms.rolling(60, center=True).mean())
ax.xaxis.set_major_locator(mdates.YearLocator(month=1)) # EDIT
years_fmt = mdates.DateFormatter('%Y') # EDIT 2
ax.xaxis.set_major_formatter(years_fmt) # EDIT 2
plt.show()
EDIT: adding the following gives me the 2nd plot
EDIT 2: Gives me yearly values, but only from 1970-1975. 3rd plot
You could:
Create a new column year from your Date column.
Compute the average temperature for each year (using mean or median): df.groupby(['year']).mean()
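A minimal sketch of those two steps, using a small made-up frame in place of the real CSV (the column name `tAnom` and the values are placeholders, not the actual data):

```python
import pandas as pd

# Placeholder data: two years of monthly values, not the real anomalies.
df = pd.DataFrame({
    'Date': pd.date_range('1881-01-01', periods=24, freq='MS'),
    'tAnom': [0.0] * 12 + [1.0] * 12,
})

# Step 1: derive a year column from the datetime column.
df['year'] = df['Date'].dt.year

# Step 2: one value per year (mean here; median works the same way).
yearly = df.groupby('year')['tAnom'].mean()
print(yearly)
```

Plotting `yearly.index` against `yearly.values` then gives a single point per year.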
So, I found a good, but maybe not perfect, solution. The first thing I needed to do was use parse_dates and infer_datetime_format when reading in the CSV file. Then I converted the dates with to_pydatetime(). mdates.AutoDateLocator() was what I needed, along with set_major_formatter. I'm not sure how I could manually change the interval, however (e.g., to every 10 or 25 years instead of the default). This works well enough, though.
ds = pd.read_csv('path_to_file.csv', parse_dates=['Date'], infer_datetime_format=True,
header='infer', engine='python', skipfooter=3)
dates = ds['Date'].dt.to_pydatetime() # Convert to pydatetime()
tAnoms = ds[' Berkeley Earth 2m Air Temperature (degree C) 0N-90N;0E-360E']
fig = plt.figure(figsize=(10,10))
ax = plt.subplot(111)
# Produce plot
ax.plot(dates,tAnoms.rolling(60, center=True).mean())
# Use AutoDateLocator() from matplotlib.dates (mdates)
# Set date format to years
ax.xaxis.set_major_locator(mdates.AutoDateLocator())
years_fmt = mdates.DateFormatter('%Y')
ax.xaxis.set_major_formatter(years_fmt)
plt.show()
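The answer above leaves open how to change the tick interval manually. One option that may work is mdates.YearLocator, whose base argument places a tick every base years. A sketch with synthetic monthly data standing in for the CSV (the values are random placeholders, and 10 is just an example interval):

```python
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# Synthetic stand-in: one value per month from 1881-01 through 2022-09.
dates = np.arange('1881-01', '2022-10', dtype='datetime64[M]').astype('datetime64[D]')
rng = np.random.default_rng(0)
values = rng.normal(size=dates.size)

fig, ax = plt.subplots(figsize=(10, 4))
ax.plot(dates, values)

# One major tick every 10 years, labelled with the year only.
ax.xaxis.set_major_locator(mdates.YearLocator(base=10))
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y'))

fig.canvas.draw()  # force tick computation; call plt.show() to view the figure
tick_years = [mdates.num2date(t).year for t in ax.xaxis.get_majorticklocs()]
print(tick_years)
```

Changing base to 25 should give ticks every 25 years in the same way.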
I am able to create a plot with the name of the month when I have 365 data points with the following code:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
y = np.random.normal(size=365)
x = np.array(range(len(y)))
plt.plot(x, y)
plt.xlabel('Month')
locator = mdates.MonthLocator()
fmt = mdates.DateFormatter('%b')
X = plt.gca().xaxis
X.set_major_locator(locator)
X.set_major_formatter(fmt)
Here is the result, which is exactly what I'm looking for:
I would like to do the same thing but with only 12 data points (one for each month). If I just change the 365 to 12 (y = np.random.normal(size=12)), it looks like this:
How can I get it to show all the months in the x axis as in the first graph?
I tried passing arguments to MonthLocator (bymonth, bymonthday, interval) but none of them seemed to do what I'm looking for.
You only have 12 points, so MonthLocator won't work as desired: it treats the x values as dates measured in days, so 0 to 11 covers less than two weeks rather than twelve months.
It will be easier to set the x-axis to the month names with a list.
Import calendar to get the list of month names (or type them manually), then use x = calendar.month_abbr[1:]:
import calendar # part of the standard library
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(365)
y = np.random.normal(size=12)
x = calendar.month_abbr[1:]
plt.plot(x, y)
plt.xlabel('Month')
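If you would rather keep MonthLocator, another option is to give the 12 points real dates on the x-axis, so the locator has actual months to find. A sketch, assuming an arbitrary year (2021 is only a placeholder):

```python
import datetime
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

np.random.seed(365)
y = np.random.normal(size=12)
# First day of each month; the year is an arbitrary placeholder.
x = [datetime.date(2021, m, 1) for m in range(1, 13)]

fig, ax = plt.subplots()
ax.plot(x, y)
ax.set_xlabel('Month')
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b'))

fig.canvas.draw()  # force tick computation; call plt.show() to view the figure
tick_months = [mdates.num2date(t).month for t in ax.xaxis.get_majorticklocs()]
```

The string-label approach is simpler; this variant is mainly useful when the rest of the figure already works with dates.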
I am trying to create a plot with an amount (int) on the y-axis and days on the x-axis.
I want the plot to always show the whole month on the x-axis, even though I don't have data for every day.
This is the code I tried:
import matplotlib.pyplot as plt
import numpy as np
import matplotlib.dates as mdates
import datetime as dt
df=get_pandas_data(datab) #Taking data from database in pandas DataFrame
fig = plt.figure(figsize=(10,10)) #Initialize plot
ax1 = fig.add_subplot(1,1,1)
dates=[dt.datetime.strptime(d,'%Y-%m-%d').date() for d in df['date']]
dates=list(set(dates)) #Takes all the dates from the DataFrame and uses a set to drop repeated dates
s=df.resample('D', on='date')['amount'].sum() #Takes the total amount for the same date
ax1.bar(dates,s) #Bar plot for dates and amount
ax1.set(xlabel="Date",
ylabel="Balance (€)",
title="Total Monthly balance") # Plot information
ax1.xaxis.set_major_formatter(mdates.DateFormatter('%d-%m-%Y'))
#this is supposed to set all days of the month in the x-axis
ax1.xaxis.set_major_locator(mdates.DayLocator(interval=1))
fig.autofmt_xdate()
plt.show()
The result I get from this is a plot with only the days that have data.
How can I make the plot show all days in the month, with bars only on the days that have data?
This works fine with bare datetimes and matplotlib, so your data is probably getting malformed somewhere in your pandas manipulations. We can't really help, though, because we don't have your dataframe. It's always preferable to create a standalone example with dummy data and as little code as possible to recreate the issue: a) 90% of the time you will spot your problem yourself, and b) if not, we can help...
import numpy as np
import matplotlib.pyplot as plt
import datetime
x = np.array([1, 3, 7, 8, 10])
y = x * 2
dates = [datetime.datetime(2000, 2, xx) for xx in x]
fig, ax = plt.subplots()
ax.bar(dates, y)
fig.autofmt_xdate()
plt.show()
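To force the axis to cover the whole month even where there is no data, one approach that may help is to set the x limits explicitly. A sketch with the same kind of dummy data, assuming February 2000 as the month of interest (`half_day` and `last_day` are names introduced only for this example):

```python
import calendar
import datetime
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# Dummy data: only five days of February 2000 have values.
days = [1, 3, 7, 8, 10]
dates = [datetime.datetime(2000, 2, d) for d in days]
amounts = [d * 2 for d in days]

fig, ax = plt.subplots(figsize=(10, 4))
ax.bar(dates, amounts)

# Pin the x-axis to the whole month, with half a day of padding for the bars.
last_day = calendar.monthrange(2000, 2)[1]
half_day = datetime.timedelta(hours=12)
ax.set_xlim(datetime.datetime(2000, 2, 1) - half_day,
            datetime.datetime(2000, 2, last_day) + half_day)

# One tick per day, formatted as in the question.
ax.xaxis.set_major_locator(mdates.DayLocator(interval=1))
ax.xaxis.set_major_formatter(mdates.DateFormatter('%d-%m-%Y'))
fig.autofmt_xdate()

fig.canvas.draw()  # force tick computation; call plt.show() to view the figure
tick_days = [mdates.num2date(t).day for t in ax.xaxis.get_majorticklocs()]
```

Days without data simply show no bar, while the ticks still run across the full month.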
I have a series whose index is datetime that I wish to plot. I want to plot the values of the series on the y axis and the index of the series on the x axis. The Series looks as follows:
2014-01-01 7
2014-02-01 8
2014-03-01 9
2014-04-01 8
...
I generate a graph using plt.plot(series.index, series.values). But the graph looks like:
The problem is that I would like to have only year and month (yyyy-mm or 2016 March). However, the graph contains hours, minutes and seconds. How can I remove them so that I get my desired formatting?
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
# sample data
N = 30
drange = pd.date_range("2014-01", periods=N, freq="MS")
np.random.seed(365) # for a reproducible example of values
values = {'values':np.random.randint(1,20,size=N)}
df = pd.DataFrame(values, index=drange)
fig, ax = plt.subplots()
ax.plot(df.index, df.values)
ax.set_xticks(df.index)
# use formatters to specify major and minor ticks
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m"))
ax.xaxis.set_minor_formatter(mdates.DateFormatter("%Y-%m"))
_ = plt.xticks(rotation=90)
You can try something like this:
import numpy as np
import pandas as pd
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
df = pd.DataFrame({'values':np.random.randint(0,1000,36)},index=pd.date_range(start='2014-01-01',end='2016-12-31',freq='M'))
fig,ax1 = plt.subplots()
plt.plot(df.index,df.values)
monthyearFmt = mdates.DateFormatter('%Y %B')
ax1.xaxis.set_major_formatter(monthyearFmt)
_ = plt.xticks(rotation=90)
You should check out this native function of matplotlib:
fig.autofmt_xdate()
See the examples on the source website: "Custom tick formatter".
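For reference, here is a sketch of how fig.autofmt_xdate() might be combined with a DateFormatter to show only the year and month. The series below is sample data, and the three-month tick spacing is an arbitrary choice:

```python
import datetime
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# Sample data: first day of each month for 2014 and 2015.
index = [datetime.datetime(2014 + i // 12, i % 12 + 1, 1) for i in range(24)]
values = [7 + (i % 3) for i in range(24)]

fig, ax = plt.subplots()
ax.plot(index, values)

# Tick every third month and show only year and month.
ax.xaxis.set_major_locator(mdates.MonthLocator(interval=3))
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m'))

# Rotate and right-align the date labels so they do not overlap.
fig.autofmt_xdate()

fig.canvas.draw()  # force tick computation; call plt.show() to view the figure
labels = [label.get_text() for label in ax.get_xticklabels()]
```

Note that autofmt_xdate() only handles the rotation and alignment; the year-month text itself comes from the formatter.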
My dataframe looks like this:
Energy_MWh Month
0 39686.82 1979-01
1 35388.78 1979-02
2 50134.02 1979-03
3 37499.22 1979-04
4 20104.08 1979-05
5 17440.26 1979-06
It goes on like this to the month 2015-12. So you can imagine all the data.
I want to plot a continuous graph with the months as the x-axis and the Energy_MWh as the y-axis. How to best represent this using matplotlib?
I would also like to know whether there is a way to print 1979-01 as Jan-1979 on the x-axis, and so on, perhaps with a lambda function or something similar while plotting.
Borrowed liberally from this answer, which you should go out and upvote:
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.dates import DateFormatter
df = <set_your_data_frame_here>
myDates = pd.to_datetime(df['Month'])
myValues = df['Energy_MWh']
fig, ax = plt.subplots()
ax.plot(myDates,myValues)
myFmt = DateFormatter("%b-%Y")
ax.xaxis.set_major_formatter(myFmt)
## Rotate date labels automatically
fig.autofmt_xdate()
plt.show()
Set Month as the index:
df.set_index('Month', inplace=True)
Convert the index to Datetime:
df.index = pd.DatetimeIndex(df.index)
Plot:
df.plot()
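Putting those steps together with the Jan-1979 label format from the question, a self-contained sketch might look like this. It uses only the six rows shown in the question, and it plots through ax.plot rather than df.plot() because pandas sets up its own date tick formatting, which is an assumption about what makes DateFormatter apply cleanly:

```python
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.dates import DateFormatter

# First rows of the frame shown in the question.
df = pd.DataFrame({
    'Energy_MWh': [39686.82, 35388.78, 50134.02, 37499.22, 20104.08, 17440.26],
    'Month': ['1979-01', '1979-02', '1979-03', '1979-04', '1979-05', '1979-06'],
})

# Set Month as the index and convert it to datetimes.
df.set_index('Month', inplace=True)
df.index = pd.DatetimeIndex(df.index)

fig, ax = plt.subplots()
ax.plot(df.index, df['Energy_MWh'])
ax.xaxis.set_major_formatter(DateFormatter('%b-%Y'))  # e.g. Jan-1979
fig.autofmt_xdate()
```

Call plt.show() afterwards to display the figure.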