Plotly: using fig.update_xaxes shows wrong month - python

I am looking for a way to show the x-axis correctly. The date 2021-01-31 is displayed as "Feb 2021", but I would like it to be shown as "Jan 2021". Thanks for your help!
from datetime import date, timedelta
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
import plotly.io
sdate = date(2021,1,31)
edate = date(2021,8,30)
date_range = pd.date_range(sdate,edate-timedelta(days=1),freq='m')
df_test = pd.DataFrame({ 'Datum': date_range})
df_test['values'] = 10
fig = px.line(df_test, x=df_test['Datum'], y=df_test['values'])
fig.layout = go.Layout(yaxis=dict(tickformat=".0%"))
fig.update_xaxes(dtick="M1", tickformat="%b %Y")
fig.update_layout(width=1485, height=1100)
plotly.io.write_image(fig, file='test_line.png', format='png')

You can force the ticks to start at 2021-01-31 by setting tick0, the position of the first tick, to sdate, the first date in your data.
from datetime import date, timedelta
import pandas as pd
import plotly.graph_objects as go
import plotly.express as px
sdate = date(2021,1,31)
edate = date(2021,8,30)
date_range = pd.date_range(sdate,edate-timedelta(days=1),freq='m')
df_test = pd.DataFrame({ 'Datum': date_range})
df_test['values'] = 10
fig = px.line(df_test, x=df_test['Datum'], y=df_test['values'])
fig.layout = go.Layout(yaxis=dict(tickformat=".0%"))
fig.update_xaxes(dtick="M1", tickformat="%b %Y")
## set tick0 to the starting date
fig.update_layout(
    xaxis=dict(tick0=sdate),
    width=1485, height=1100
)
fig.show()
I should point out that this plot could be misleading: if the tickformat does not include the day, I believe most people would read each tick mark as the beginning of the month (for example, they would assume that the data starts on 2021-01-01). Whether that matters is up to you and depends on what you want the chart to show.
If you instead change the tickformat by rewriting the line as fig.update_xaxes(dtick="M1", tickformat="%b %d %Y"), the tick labels also show the day of the month.
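As an alternative that is not part of the answer above, the tick positions and labels can be computed explicitly so that every month-end date carries the label of its own month. The following is only a sketch using the standard library; the helper name month_end_ticks is hypothetical, and the resulting lists are intended for Plotly's tickmode="array", tickvals and ticktext axis settings.

```python
import calendar
from datetime import date


def month_end_ticks(start, end):
    """Return (tickvals, ticktext) for every month-end date in [start, end]."""
    tickvals, ticktext = [], []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        last_day = calendar.monthrange(year, month)[1]
        month_end = date(year, month, last_day)
        if start <= month_end <= end:
            tickvals.append(month_end.isoformat())        # position on the axis
            ticktext.append(month_end.strftime("%b %Y"))  # label shown to the reader
        # advance to the next month
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return tickvals, ticktext


# Same range as in the question: 2021-01-31 up to the day before 2021-08-30
tickvals, ticktext = month_end_ticks(date(2021, 1, 31), date(2021, 8, 29))
print(ticktext[0], tickvals[0])

# Assumed usage with the figure from the answer:
# fig.update_xaxes(tickmode="array", tickvals=tickvals, ticktext=ticktext)
```

With explicit tick arrays the label text no longer depends on where Plotly places its automatic monthly ticks, at the cost of having to recompute the lists when the data range changes.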

Related

plotly express shows mid of the day marks that are not in the dataframe, how to fix?

How can I avoid showing the 12 pm (midday) marks on the chart? I want days only.
Here is the dataset from the chart https://easyupload.io/gxfiq1
df_combined = df_combined.sort_values(by='sold_date', ascending=True)
df_smt = df_combined.loc[df_combined['model'] == 'Adapt'].groupby('date').agg({'price': 'sum', 'sold_date': 'count'}).reset_index()
fig = px.line(df_smt, x='date', y='price', title='Adapt')
fig.show()
Looks like you are trying to update the "x-tick" labels. The plotly documentation has some information on how to do this (https://plotly.com/python/tick-formatting/)
Something like
fig.update_layout(
    xaxis_tickformat='%b %d %Y',
)
should format the x-ticks as month (%b), day (%d), year (%Y).
You can see more information on formatting options here (https://github.com/d3/d3-time-format/blob/main/README.md)
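The directives used here (%b, %d, %Y) mean the same thing in Python's strftime, so a label can be previewed without drawing a chart. The sketch below also computes one day in milliseconds, because on date axes Plotly expects a numeric dtick in milliseconds. That the asker wants exactly one tick per day is an assumption, and MS_PER_DAY is just an illustrative name.

```python
from datetime import datetime

MS_PER_DAY = 24 * 60 * 60 * 1000  # one day expressed in milliseconds

# Preview what the tick format produces for a midday timestamp
sample = datetime(2021, 1, 31, 12, 0)
label = sample.strftime("%b %d %Y")
print(label)

# Assumed usage: one tick per day, labelled without the time of day
# fig.update_xaxes(dtick=MS_PER_DAY, tickformat="%b %d %Y")
```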

Plot datetime data in 24 hour window on x axis

I have a dataframe with datetime data:
Start_time: eg(2013-09-21 00:14:00) - the timestamp a task has started
End_time: eg(2013-09-22 11:04:00) - the timestamp a task has ended
Time_diff:eg(0 days 06:07:00) - the time the task took.
I want to plot a histogram of the times at which events start and end, without considering the date (so only the 24-hour clock).
I have tried to use:
df['Start_time'].dt.time
to just get the time and plot.
However, I am then unable to bin the resulting values (now Python objects rather than timestamps) into 20 bins.
This is my result so far:
This is what I am trying to get, a plot with 24hours on the x axis, and the binned distribution of start time & end_time for the y
Here is the code
from random import randrange
import datetime
import pandas as pd
import plotly.express as px
# make the EXAMPLE dataset
startDate = datetime.datetime(2013, 9, 20,13,00)
start_lst = []
end_lst = []
for i in range(200):
    start_time= startDate + datetime.timedelta(hours=randrange(23), minutes= randrange(60))
    end_time = start_time + datetime.timedelta(hours=randrange(2,7), minutes= randrange(60))
    startDate = startDate + datetime.timedelta(days=randrange(4))
    start_lst.append(start_time)
    end_lst.append(end_time)
df = pd.DataFrame({'Start_time': start_lst,
                   'End_time': end_lst
                   })
df['Time_diff'] = df['End_time']-df['Start_time']
#start of code
#tried just using histogram, but since the date changes, it won't plot over 24 hours
fig = px.histogram(df, x=['Start_time', 'End_time'], nbins=20)
fig.show()
#so tried removing the date part, and just leaving time, however now it won't bin properly
df['Start_time_nodate'] = df['Start_time'].dt.time
df['End_time_nodate'] = df['End_time'].dt.time
fig = px.histogram(df, x=['Start_time_nodate', 'End_time_nodate'], nbins=20)
fig.show()
If I understand correctly, with your example dataframe, here is one way to do it with Matplotlib:
from matplotlib import pyplot as plt
# Setup
df["Start_time_nodate"] = df["Start_time"].dt.hour
df["End_time_nodate"] = df["End_time"].dt.hour
fig, ax = plt.subplots(figsize=(8, 4))
# Plot frequencies
ax.plot(df["Start_time_nodate"].value_counts(sort=False).sort_index())
ax.plot(df["End_time_nodate"].value_counts(sort=False).sort_index())
# Style plot
ax.legend(["Start time", "End time"])
ax.set_xticks(ticks=[i for i in range(0, 25)])
ax.set_xticklabels([i for i in range(0, 25)])
plt.xlabel("24 hours")
plt.ylabel("Frequency")
ax.margins(x=0)
In a Jupyter notebook, this code outputs the following image:
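To keep the 20 bins asked for in the question, one option (not taken from the answer above) is to convert each timestamp to a fractional hour of the day, which is a plain number and can therefore be binned. The sketch below uses only the standard library; hour_of_day and bin_hours are hypothetical helper names, and the three sample timestamps are made up for illustration.

```python
from datetime import datetime


def hour_of_day(ts):
    """Time of day as a fractional hour in [0, 24), ignoring the date."""
    return ts.hour + ts.minute / 60 + ts.second / 3600


def bin_hours(timestamps, nbins=20):
    """Count timestamps per equal-width bin over a 24-hour window."""
    width = 24 / nbins
    counts = [0] * nbins
    for ts in timestamps:
        # clamp to the last bin to guard against floating-point edge cases
        index = min(int(hour_of_day(ts) / width), nbins - 1)
        counts[index] += 1
    return counts


starts = [
    datetime(2013, 9, 21, 0, 14),
    datetime(2013, 9, 22, 11, 4),
    datetime(2013, 9, 25, 23, 59),
]
counts = bin_hours(starts)
print(counts)

# Assumed pandas equivalent of hour_of_day for the example dataframe:
# df["Start_hour"] = df["Start_time"].dt.hour + df["Start_time"].dt.minute / 60
```

A numeric column like this can be handed to a histogram function directly, whereas datetime.time objects are treated as unordered categories.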

How to add hourly ticks in an axis from datetime formatted data

I have a dataframe of daily temperature variation with time
time temp temp_mean
00:01:51.57 185.94 185.94
00:01:52.54 187.48 186.71
00:01:53.51 197.85 190.4233333
00:01:54.49 195.71 191.745
00:01:55.46 197.22 192.84
00:01:56.43 187.33 191.9216667
00:01:57.41 194.18 192.2442857
00:01:58.38 199.9 193.20125
00:01:59.35 184.23 192.2044444
00:02:00.33 201.34 193.118
00:02:01.30 200.12 193.7545455
00:02:02.27 199.13 194.2025
00:02:03.24 187.47 193.6846154
00:02:04.22 187.65 193.2535714
00:02:05.19 195.59 193.4093333
00:02:06.17 188.7 193.115
00:02:07.14 196.16 193.2941176
00:02:08.11 191.17 193.1761111
00:02:09.08 198.62 193.4626316
00:02:10.06 190.79 193.329
00:02:11.03 193.35 193.33
00:02:12.00 199.36 193.6040909
00:02:12.98 190.76 193.4804348
00:02:13.95 205.16 193.9670833
00:02:14.92 194.89 194.004
00:02:15.90 185.3 193.6692308
like this. (12000+ rows)
I want to plot time vs temp as a line plot, with hourly ticks on x-axis(1 hr interval).
But somehow I couldn't assign x ticks with proper frequency.
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
fig, ax = plt.subplots()
ax.plot(data['time'], data['temp'])
ax.plot(data['time'], data['temp_mean'],color='red')
xformatter = mdates.DateFormatter('%H:%M')
xlocator = mdates.HourLocator(interval = 1)
## Set xtick labels to appear every hour
ax.xaxis.set_major_locator(xlocator)
## Format xtick labels as HH:MM
ax.xaxis.set_major_formatter(xformatter)
fig.autofmt_xdate()
ax.tick_params(axis='x', rotation=45)
plt.show()
Here the xticks are crowded and overlapping, but I need ticks from 0:00 to 23:00 at one-hour intervals.
What should I do?
Convert the 'time' column to a datetime dtype with pd.to_datetime, and then extract the time component with the .dt accessor.
See python datetime format codes to specify the format=... string.
Plot with pandas.DataFrame.plot
Tested in python 3.8.12, pandas 1.3.3, matplotlib 3.4.3
import pandas as pd
# sample data
data = {'time': ['00:01:51.57', '00:01:52.54', '00:01:53.51', '00:01:54.49', '00:01:55.46', '00:01:56.43', '00:01:57.41', '00:01:58.38', '00:01:59.35', '00:02:00.33', '00:02:01.30', '00:02:02.27', '00:02:03.24', '00:02:04.22', '00:02:05.19', '00:02:06.17', '00:02:07.14', '00:02:08.11', '00:02:09.08', '00:02:10.06', '00:02:11.03', '00:02:12.00', '00:02:12.98', '00:02:13.95', '00:02:14.92', '00:02:15.90'],
'temp': [185.94, 187.48, 197.85, 195.71, 197.22, 187.33, 194.18, 199.9, 184.23, 201.34, 200.12, 199.13, 187.47, 187.65, 195.59, 188.7, 196.16, 191.17, 198.62, 190.79, 193.35, 199.36, 190.76, 205.16, 194.89, 185.3],
'temp_mean': [185.94, 186.71, 190.4233333, 191.745, 192.84, 191.9216667, 192.2442857, 193.20125, 192.2044444, 193.118, 193.7545455, 194.2025, 193.6846154, 193.2535714, 193.4093333, 193.115, 193.2941176, 193.1761111, 193.4626316, 193.329, 193.33, 193.6040909, 193.4804348, 193.9670833, 194.004, 193.6692308]}
df = pd.DataFrame(data)
# convert column to datetime and extract time component
df.time = pd.to_datetime(df.time, format='%H:%M:%S.%f').dt.time
# plot
ax = df.plot(x='time', color=['tab:blue', 'tab:red'])
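For the hourly ticks themselves, a possible approach (an assumption, not something the answer above shows) is to keep the values as full datetimes on a dummy day and build the 24 tick positions by hand. The sketch uses only the standard library; DUMMY_DAY, hourly_ticks and hourly_labels are illustrative names.

```python
from datetime import datetime, timedelta

# strptime fills in 1900-01-01 when the format string contains no date
DUMMY_DAY = datetime(1900, 1, 1)

# Same format string as in the answer; two fractional digits are read as 0.57 s
parsed = datetime.strptime("00:01:51.57", "%H:%M:%S.%f")
print(parsed)

# One tick per hour from 00:00 to 23:00 on the dummy day
hourly_ticks = [DUMMY_DAY + timedelta(hours=h) for h in range(24)]
hourly_labels = [t.strftime("%H:%M") for t in hourly_ticks]
print(hourly_labels[0], hourly_labels[-1])

# Assumed usage, provided the x values were plotted as datetimes on DUMMY_DAY:
# ax.set_xticks(hourly_ticks)
# ax.set_xticklabels(hourly_labels, rotation=45)
```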

How to remove ending 00:00:00 in timestamp in a series

I am trying to graph coronavirus cases over time, but my timestamps are being weird. I want to remove the 00:00:00 at the end of the timestamp. How can I do this?
the index of the series I am plotting:
DatetimeIndex(['2020-03-01', '2020-03-02', '2020-03-03', '2020-03-04',
'2020-03-05', '2020-03-06', '2020-03-07', '2020-03-08',
'2020-03-09', '2020-03-10',
...
'2020-07-09', '2020-07-10', '2020-07-11', '2020-07-12',
'2020-07-13', '2020-07-14', '2020-07-15', '2020-07-16',
'2020-07-17', '2020-07-18'],
dtype='datetime64[ns]', name='Date', length=140, freq=None)
code:
plt.figure()
totalconfirm['Cases'].plot(kind='bar', rot=15, title="Cases per Day in all Michigan Counties", color='r', label = 'Confirmed')
totalprob['Cases'].plot(kind = 'bar', rot=15, bottom=totalconfirm['Cases'], color = 'b', label = 'Probable')
#totalconfirm['Deaths'].plot(color = 'black', label = 'Deaths')
ax = plt.gca()
ax.xaxis.set_major_locator(plt.MaxNLocator(10))
ax.xaxis.set_major_formatter( mdates.DateFormatter("%b %d", tz=None) )
plt.show()
after the new code:
Have looked at remove 00:00:00 from 2015-05-14 00:00:00 string jquery and Remove "days 00:00:00"from dataframe but both didn't work/gave me errors.
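Calling df.index.format() returns the index values as strings without the trailing 00:00:00, and those strings can be used as tick labels. Note that newer pandas releases deprecate Index.format(); df.index.strftime('%Y-%m-%d') is an alternative that produces the same kind of labels.
Also, you can create your DatetimeIndex with pandas in a simpler way.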
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
#Create the DatetimeIndex automatically
df = pd.DataFrame( index= pd.date_range(start="2020-03-01",end="2020-07-18"))
#for y value only
df['cases'] = np.arange(140)
ax = df.plot( kind='bar')
ax.set_xticklabels(df.index.format(), rotation='vertical', size=6)
plt.locator_params(axis='x', nbins=30)
You can cast your column as date before plotting:
df['Date'] = df['Date'].dt.date
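Both answers come down to turning each timestamp into a date-only string before it is used as a label. Pandas bar plots place the bars at integer positions 0, 1, 2, ..., so a date formatter applied to that axis does not see real dates. A minimal standard-library sketch of building such labels for the 140 days in the question follows; showing a label only every 14th day is an arbitrary choice made here to reduce crowding.

```python
from datetime import date, timedelta

start = date(2020, 3, 1)
days = [start + timedelta(days=i) for i in range(140)]

# Date-only labels, e.g. "Mar 01", with no time-of-day part
labels = [d.strftime("%b %d") for d in days]

# Keep every 14th label and blank out the rest so the axis stays readable
sparse_labels = [lab if i % 14 == 0 else "" for i, lab in enumerate(labels)]
print(labels[0], labels[-1])

# Assumed usage on a bar plot with one bar per day:
# ax.set_xticklabels(sparse_labels, rotation=15)
```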

Date into matplotlib graph

How can I use a date from a Sqlite database on the x-axis to make a bar graph with matplotlib?
If I convert the date to unix timestamp the graph works, but I would like to get something like this: http://i.stack.imgur.com/ouKBy.png
lowestNumber = self.c.execute('SELECT number,date, time FROM testDB ORDER BY number ASC LIMIT 1')
for rows in lowestNumber:
    datesLow = rows[1]#returns 2016-02-23
    splitDate = datesLow.split('-' )
    spaces = ""
    # tabs = '/'
    # tabsDatesLow = tabs.join( splitDate )
    joinDatesLow = spaces.join( splitDate )
    x = int(joinDatesLow)
plt.bar(x,low, label="Minimum number of players", color="red")
plt.show()
Matplotlib needs dates in a numeric format for plotting, and a date formatter object is then passed to the axis to format the tick labels. Matplotlib's date2num function can do the conversion for you. Another good example is in Matplotlib's documentation: http://matplotlib.org/examples/pylab_examples/date_demo1.html. Here is a solution you may find useful:
import datetime
import matplotlib.pyplot as plt
from matplotlib.dates import AutoDateLocator, AutoDateFormatter, date2num
#make my own data:
date = '2016-02-23'
low = 10
#how to format dates:
date_datetime = datetime.datetime.strptime(date, '%Y-%m-%d')
int_date = date2num( date_datetime)
#create plots:
fig, ax = plt.subplots()
#plot data:
ax.bar(int_date,low, label="Minimum number of players", color="red")
#format date strings on xaxis:
locator = AutoDateLocator()
ax.xaxis.set_major_locator(locator)
ax.xaxis.set_major_formatter( AutoDateFormatter(locator) )
#adjust x limits and apply autoformatter for display of dates
min_date = date2num( datetime.datetime.strptime('2016-02-16', '%Y-%m-%d') )
max_date = date2num( datetime.datetime.strptime('2016-02-28', '%Y-%m-%d') )
ax.set_xlim([min_date, max_date])
fig.autofmt_xdate()
#show plot:
plt.show()
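It may help to see why joining the date parts into one integer, as in the question, distorts the axis: consecutive days are no longer one unit apart once a month boundary is crossed. A short standard-library sketch, where as_joined_int is a hypothetical name for the question's approach:

```python
from datetime import date


def as_joined_int(text):
    """The question's approach: '2016-02-23' becomes the integer 20160223."""
    return int(text.replace("-", ""))


a, b = "2016-02-29", "2016-03-01"  # two consecutive days

# Distance on the axis when dates are joined into integers
int_gap = as_joined_int(b) - as_joined_int(a)

# Distance when the strings are parsed as real dates
day_gap = (date.fromisoformat(b) - date.fromisoformat(a)).days

print(int_gap, day_gap)
```

Parsing the SQLite text into date or datetime objects first keeps the spacing proportional to elapsed time.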
