Formatting timestamps on x axis using matplotlib - python

How do I format timestamps on the x axis as "%Y-%m-%d %H:%M"? ts is a list of timestamps, and I want the x axis to show only these values:
"2018-05-23 14:00", "2018-05-23 14:15" and "2018-05-23 14:30".
My current chart shows:
23 14:00, 23 14:05, 23 14:10, 23 14:15, 23 14:20, 23 14:25, 23 14:30.
import datetime
import matplotlib.pyplot as plt
from matplotlib import style
style.use('fivethirtyeight')
ts = [datetime.datetime(2018, 5, 23, 14, 0), datetime.datetime(2018, 5, 23, 14, 15), datetime.datetime(2018, 5, 23, 14, 30)]
values =[3, 7, 6]
plt.plot(ts, values, 'o-')
plt.show()

Firstly, you need to set your x ticks so that only the values you want will be displayed. This can be done using plt.xticks(tick_locations, tick_labels).
To get the dates in the right format you need to specify a DateFormatter and apply it to your x axis.
Your code would look like:
import datetime
import matplotlib.pyplot as plt
from matplotlib import style
from matplotlib.dates import DateFormatter
style.use('fivethirtyeight')
ts = [datetime.datetime(2018, 5, 23, 14, 0), datetime.datetime(2018, 5, 23, 14, 15), datetime.datetime(2018, 5, 23, 14, 30)]
values =[3, 7, 6]
plt.plot(ts, values, 'o-')
plt.xticks(ts, ts) # set the x ticks to your dates
date_formatter = DateFormatter("%Y-%m-%d %H:%M") # choose desired date format
ax = plt.gca()
ax.xaxis.set_major_formatter(date_formatter)
plt.show()
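A variant of the same idea, offered as a sketch rather than the definitive approach: instead of passing the data points themselves to plt.xticks, a locator can place ticks on a fixed grid, and the DateFormatter then formats whatever ticks the locator produces. The 15-minute grid is an assumption based on the spacing of ts, and the Agg backend line is only there so the snippet runs without a display.

```python
import datetime

import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

ts = [datetime.datetime(2018, 5, 23, 14, m) for m in (0, 15, 30)]
values = [3, 7, 6]

fig, ax = plt.subplots()
ax.plot(ts, values, "o-")

# Place a major tick at minutes 0, 15, 30 and 45 of every hour
ax.xaxis.set_major_locator(mdates.MinuteLocator(byminute=range(0, 60, 15)))
# Format each tick as year-month-day hour:minute
formatter = mdates.DateFormatter("%Y-%m-%d %H:%M")
ax.xaxis.set_major_formatter(formatter)
fig.autofmt_xdate()  # rotate the long labels so they do not overlap

# Labels the formatter produces for the current tick positions
labels = [formatter(x) for x in ax.get_xticks()]
print(labels)
```

This keeps working if more points are appended to ts later, whereas plt.xticks(ts, ts) puts one tick on every data point.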

Related

Python - Bland-Altman Plot with Text Customization

I am trying to create a Bland-Altman plot with the text on the left side of the plot instead of the default position on the right-hand side.
This is my code:
import pandas as pd
df = pd.DataFrame({'A': [5, 5, 5, 6, 6, 7, 7, 7, 8, 8, 9,
10, 11, 13, 14, 14, 15, 18, 22, 25],
'B': [4, 4, 5, 5, 5, 7, 8, 6, 9, 7, 7, 11,
13, 13, 12, 13, 14, 19, 19, 24]})
import statsmodels.api as sm
import matplotlib.pyplot as plt
#create Bland-Altman plot
f, ax = plt.subplots(1, figsize = (8,5))
sm.graphics.mean_diff_plot(df.A, df.B, ax = ax)
#display Bland-Altman plot
plt.show()
So I want to have the "mean", the "SD+" and the "SD-" labels on the left side of the plot, not on the right.
Thanks for your help or any suggestions!
I don't know how to do it with statsmodels, but it can be done with pyplot directly:
# df and plt are defined in the question's code above
means = (df.A + df.B) / 2  # x axis: mean of the two measurements
diffs = df.A - df.B        # y axis: difference between the two measurements
mean_diff = diffs.mean()
diff_range = diffs.std() * 1.96
x_min, x_max = means.min() - 2, means.max() + 2
plt.figure(figsize=(9, 6))
plt.scatter(means, diffs, alpha=.5)
plt.hlines(mean_diff, x_min, x_max, color="k", linewidth=1)
# the labels start at x_min + 1, i.e. near the left edge of the plot
plt.text(
    x_min + 1, mean_diff + .05 * diff_range, "mean diff: %.2f" % mean_diff,
    fontsize=13,
)
plt.hlines(
    [mean_diff + diff_range, mean_diff - diff_range],
    x_min, x_max, color="k", linewidth=1,
    linestyle="--"
)
plt.text(
    x_min + 1, mean_diff + diff_range + .05 * diff_range,
    "+SD1.96: %.2f" % (mean_diff + diff_range),
    fontsize=13,
)
plt.text(
    x_min + 1, mean_diff - diff_range + .05 * diff_range,
    "-SD1.96: %.2f" % (mean_diff - diff_range),
    fontsize=13,
)
plt.xlim(x_min, x_max)
plt.ylim(mean_diff - diff_range * 1.5, mean_diff + diff_range * 1.5)
plt.xlabel("Means", fontsize=15)
plt.ylabel("Difference", fontsize=15)
plt.show()
result:
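The same plot can also be sketched with ax.annotate, which avoids computing label positions from the data range: the x coordinate is given as an axes fraction (0.01 is close to the left edge) while the y coordinate stays in data units. This is only an illustration using plain matplotlib and NumPy with the data from the question. It does not modify what sm.graphics.mean_diff_plot draws, and the label wording is an assumption.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

a = np.array([5, 5, 5, 6, 6, 7, 7, 7, 8, 8, 9, 10, 11, 13, 14, 14, 15, 18, 22, 25])
b = np.array([4, 4, 5, 5, 5, 7, 8, 6, 9, 7, 7, 11, 13, 13, 12, 13, 14, 19, 19, 24])

means = (a + b) / 2                     # x axis of a Bland-Altman plot
diffs = a - b                           # y axis
mean_diff = diffs.mean()
half_width = 1.96 * diffs.std(ddof=1)   # 1.96 sample standard deviations

fig, ax = plt.subplots(figsize=(8, 5))
ax.scatter(means, diffs, alpha=0.5)

lines = [
    ("mean diff", mean_diff, "-"),
    ("+1.96 SD", mean_diff + half_width, "--"),
    ("-1.96 SD", mean_diff - half_width, "--"),
]
for label, y, style in lines:
    ax.axhline(y, color="k", linewidth=1, linestyle=style)
    # x is an axes fraction (left edge), y is a data value;
    # the text is shifted 4 points upward so it sits just above the line
    ax.annotate(f"{label}: {y:.2f}", xy=(0.01, y),
                xycoords=("axes fraction", "data"),
                xytext=(0, 4), textcoords="offset points",
                ha="left", va="bottom", fontsize=12)

ax.set_xlabel("Means")
ax.set_ylabel("Difference")
fig.savefig("bland_altman_left.png")
```

Because the x position is relative to the axes, the labels stay on the left even if the x limits change.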

How to display Hours on the x-axis using Prophet plot function

I'd like some assistance in using a more granular time-series in my Prophet forecast plots, specifically an Hour grain on the x-axis.
My data is aggregated for each Hour of the day. In addition to the aggregated data, I create the necessary Prophet variables with:
ads_mod['y'] = ads_mod[target1]
ads_mod['ds'] = ads_mod['hour']
I then start the modeling process:
m = Prophet(interval_width=interval_width)
m.add_seasonality(name='hourly', period=1, fourier_order=30)
m.fit(ads_mod)
future = m.make_future_dataframe(periods=1,freq='H')
forecast = m.predict(future)
I plot the Forecast with:
fig = m.plot(forecast)
I have reviewed the actual code in the plot function and tried a variety of modifications to display the hour along with date(i.e., datetime value) on the x-axis, without success.
In particular, I looked at the date transform:
fcst_t = fcst['ds'].dt.to_pydatetime()
After the transform, my data is in the following format, with the hour still included. On the x-axis of the plot, however, only the date (i.e., YYYY-MM-DD) is displayed:
fcst_t[:10]
Out[277]:
array([datetime.datetime(2019, 12, 2, 0, 0),
datetime.datetime(2019, 12, 2, 1, 0),
datetime.datetime(2019, 12, 2, 2, 0),
datetime.datetime(2019, 12, 2, 3, 0),
datetime.datetime(2019, 12, 2, 4, 0),
datetime.datetime(2019, 12, 2, 5, 0),
datetime.datetime(2019, 12, 2, 6, 0),
datetime.datetime(2019, 12, 2, 7, 0),
datetime.datetime(2019, 12, 2, 8, 0),
datetime.datetime(2019, 12, 2, 9, 0)], dtype=object)
import matplotlib.dates as mdates
fig = m.plot(forecast)
ax = fig.gca()  # axes of the matplotlib figure returned by m.plot
# .... additional plot code here ......
hours = mdates.HourLocator(interval=1)
h_fmt = mdates.DateFormatter('%Y-%m-%d %H:%M:%S')
ax.xaxis.set_major_locator(hours)
ax.xaxis.set_major_formatter(h_fmt)
Here is the link: https://stackoverflow.com/questions/48790378/how-to-get-ticks-every-hour
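A tick on every single hour gets crowded once the forecast spans more than a day or so. As a rough sketch that uses plain matplotlib with made-up hourly data (no Prophet involved, and the 6-hour spacing is an arbitrary assumption), the byhour argument of HourLocator pins the ticks to fixed hours of the day:

```python
import datetime

import matplotlib
matplotlib.use("Agg")  # headless backend; not needed in a notebook
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# Two days of synthetic hourly data standing in for forecast['ds'] / forecast['yhat']
start = datetime.datetime(2019, 12, 2, 0, 0)
ds = [start + datetime.timedelta(hours=h) for h in range(48)]
y = [h % 24 for h in range(48)]

fig, ax = plt.subplots(figsize=(10, 4))
ax.plot(ds, y)

# Ticks at 00:00, 06:00, 12:00 and 18:00 of every day
ax.xaxis.set_major_locator(mdates.HourLocator(byhour=range(0, 24, 6)))
formatter = mdates.DateFormatter("%Y-%m-%d %H:%M")
ax.xaxis.set_major_formatter(formatter)
fig.autofmt_xdate()  # rotate the labels

labels = [formatter(x) for x in ax.get_xticks()]
print(labels)
```

With a Prophet figure, the same two set_major_* calls would be applied to the axes obtained from the figure that m.plot returns.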

How do I customize the colours in the bars using custom number set in matplotlib?

I am trying to color the bars according to their integer value. Say the values range from 1 to 20: 1 should be the lightest and 20 the darkest, and no two values should share a color. So far I have only managed an incorrect colorbar approach:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df = pd.DataFrame({'values': [17, 16, 16, 15, 15, 15, 14, 13, 13, 13]})
df.plot(kind='barh')
plt.imshow(df)
plt.colorbar()
plt.show()
But it gives a strange result.
How do I fix it?
I just realized that plt.barh with a list of grayscale colors, one per bar, gives a better plot. Use:
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({'values': [0, 0, 0, 0, 0, 17, 16, 16, 15, 15, 15, 14, 13, 13, 13]})
df = df.sort_values(by='values').reset_index(drop=True)
s = df['values'].replace(0, df.loc[df['values'] != 0, 'values'].min())
s = s.sub(s.min())
colors = (1 - (s / s.max())).astype(str).tolist()
plt.barh(df.index, df['values'].values, color=colors)
plt.show()
Which gives:
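As an alternative sketch, a colormap plus a Normalize object can map each value to a shade directly, and the same pair can drive a colorbar. The choice of the "Blues" colormap and the fixed 1 to 20 range are assumptions taken from the question's description. Bars with equal values still share a color, since the color is a function of the value.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.cm import ScalarMappable
from matplotlib.colors import Normalize

values = np.array([17, 16, 16, 15, 15, 15, 14, 13, 13, 13])

norm = Normalize(vmin=1, vmax=20)   # 1 maps to the lightest shade, 20 to the darkest
cmap = plt.get_cmap("Blues")
colors = cmap(norm(values))         # one RGBA row per bar

fig, ax = plt.subplots()
bars = ax.barh(np.arange(len(values)), values, color=colors)

# A colorbar that uses the same value-to-color mapping as the bars
mappable = ScalarMappable(norm=norm, cmap=cmap)
mappable.set_array([])              # only needed on older matplotlib versions
fig.colorbar(mappable, ax=ax, label="value")
fig.savefig("bars_by_value.png")
```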

Advice on plotting large amount of data

I'm working on a very cheap seismometer, mainly for educational purposes and some research. Every few hours I would like to plot the seismic signal of one of the channels, as in the image I have attached, using matplotlib.
The problem is that I get 100 data points every second, and when I plot this data on a Raspberry Pi, it usually hangs and stops working.
For each 4-hour subplot, I currently pass all the data again and only restrict the axis limits to that subplot's time range. I find this inefficient, and it is probably what makes the Raspberry Pi hang.
I've been thinking for days about how to avoid using so much memory for each subplot, but I can't find an answer; I'm a geologist, and programming is a big challenge for me.
Does anybody have a better idea for doing this?
import matplotlib.pyplot as plt
import time
import os.path
import datetime
import sys
import numpy
import pytz
import matplotlib.dates as mdates
import ftplib
from pylab import *
import numpy as np
from itertools import islice
from time import sleep
from matplotlib.pyplot import specgram
from scipy.signal import medfilt
import csv
archivo='sismo1545436800'
def subirftp(archivosubir):
    session = ftplib.FTP('---', 's--- ', '----')
    file = open(archivosubir+'.png', 'rb') # file to send
    session.storbinary('STOR '+ archivosubir +'.png', file) # send the file
    dirlist = session.retrlines('LIST')
    file.close() # close file and FTP
    session.quit()
font = {'family': 'serif',
        'color': 'darkred',
        'weight': 'normal',
        'size': 16,
        }
fu = open('Z:/nchazarra/sismografos/' + str(archivo) + '.txt')
nr_of_lines = sum(1 for line in fu)
fu.close()
f = open('Z:/nchazarra/sismografos/' + str(archivo) + '.txt')
print(nr_of_lines)
csv_f = csv.reader(f)
#row_count = sum(1 for row in csv_f)
#print(row_count)
tiempo = []
valora = []
valores = []
tiempor = []
i=0
final=0
empiezo=time.time()
for row in islice(csv_f,0,nr_of_lines-1):
    # print (row[0])
    if i == 0:
        inicio = double(row[0])
        valor = datetime.datetime.fromtimestamp(float(row[0]),tz=pytz.utc)
        tiempo.append(valor)
        i = i + 1
    else:
        valor = datetime.datetime.fromtimestamp(float(row[0]),tz=pytz.utc)
        #print(valor)
        tiempo.append(valor)
    # print(row)
    try:
        valora.append(int(row[1]))
        # print(row[0])
    except IndexError:
        valora.append(0)
    except ValueError:
        valora.append(0)
valores = valora
tiempor = tiempo
mediana = np.mean(valores)
minimo = np.amin(valores)
maximo = np.amax(valores)
std = np.std(valores)
for index in range(len(valores)):
    valores[index] = float(((valores[index] - minimo) / (maximo - minimo))-1)
mediananueva = float(np.median(valores))
for index in range(len(valores)):
    valores[index] = float(valores[index] - mediananueva)
valores2=np.asarray(valores)
tiempo2=np.asarray(tiempo)
# Band from 0 to 4 h
franja1=plt.subplot(611)
franja1.axis([datetime.datetime(2018, 12, 22,00,00), datetime.datetime(2018, 12, 22,3,59,59),-0.05,0.05])
franja1.plot(tiempo2, valores2, lw=0.2,color='red')
# Band from 4 to 8 h
franja2=plt.subplot(612)
franja2.axis([datetime.datetime(2018, 12, 22,4,00), datetime.datetime(2018, 12, 22,8,00),-0.05,0.05])
franja2.plot(tiempo2, valores2, lw=0.2,color='green')
# Band from 8 to 12 h
franja3=plt.subplot(613)
franja3.axis([datetime.datetime(2018, 12, 22,8,00), datetime.datetime(2018, 12, 22,12,00),-0.05,0.05])
franja3.plot(tiempo2, valores2, lw=0.2,color='blue')
# Band from 12 to 16 h
franja4=plt.subplot(614)
franja4.axis([datetime.datetime(2018, 12, 22,12,00), datetime.datetime(2018, 12, 22,16,00),-0.05,0.05])
franja4.plot(tiempo2, valores2, lw=0.2,color='red')
# Band from 16 to 20 h
franja5=plt.subplot(615)
franja5.axis([datetime.datetime(2018, 12, 22,16,00), datetime.datetime(2018, 12, 22,20,00),-0.05,0.05])
franja5.plot(tiempo2, valores2, lw=0.2,color='green')
# Band from 20 to 24 h
franja6=plt.subplot(616)
franja6.axis([datetime.datetime(2018, 12, 22,20,00), datetime.datetime(2018, 12, 22,23,59,59),-0.05,0.05])
franja6.plot(tiempo2, valores2, lw=0.2,color='blue')
franja1.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))
franja2.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))
franja3.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))
franja4.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))
franja5.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))
franja6.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))
acabo=time.time()
cuantotardo=acabo-empiezo
print('Wow, this took me '+str(cuantotardo)+' seconds')
savefig(archivo + ".png", dpi=300)
subirftp(archivo)
plt.show()
Do you need to plot every data point? You could consider plotting only every 100th point or so. As long as the frequency of your signal isn't too high, I think it could work. Something like this:
import matplotlib.pyplot as plt
import numpy as np
X = np.arange(10000) / 10000 * 2 * np.pi
Y = np.sin(X) + np.random.normal(size=10000) / 10
plt.plot(X[::100], Y[::100])
versus all points:
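One caveat with taking every 100th sample is that a short spike between two kept samples disappears, which matters for seismic data. A possible refinement, sketched here under the assumption that the signal sits in NumPy arrays (the function name minmax_decimate is made up for this example), keeps the minimum and the maximum of every bucket of samples instead:

```python
import numpy as np


def minmax_decimate(t, y, bucket):
    """Keep the min and max sample of every `bucket` consecutive samples.

    Trailing samples that do not fill a whole bucket are dropped.
    """
    n = (len(y) // bucket) * bucket
    yb = y[:n].reshape(-1, bucket)
    tb = t[:n].reshape(-1, bucket)
    rows = np.arange(yb.shape[0])[:, None]
    # Column index of the min and the max in each bucket, sorted so that
    # the two kept samples stay in time order
    idx = np.sort(np.stack([yb.argmin(axis=1), yb.argmax(axis=1)], axis=1), axis=1)
    return tb[rows, idx].ravel(), yb[rows, idx].ravel()


# Flat signal with one single-sample spike that plain slicing would miss
t = np.arange(1000, dtype=float)
y = np.zeros(1000)
y[537] = 5.0

t_small, y_small = minmax_decimate(t, y, bucket=100)
print(len(y_small), y_small.max(), y[::100].max())
```

The reduced arrays can be passed to plot in place of the originals. With a bucket of 100, the number of plotted points drops by a factor of 50.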
You can save a fair bit of memory by sub-setting the arrays before you plot them:
import datetime
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import numpy as np
n_times = 24 * 60 * 60 * 100
times = [
datetime.datetime(2018, 12, 22,00,00) +
datetime.timedelta(milliseconds=10 * x) for x in range(n_times)]
tiempo2 = np.array(times)
valores2 = np.random.normal(size=n_times)
# One subplot per 4-hour band; plot only the samples that fall inside each band
day_start = datetime.datetime(2018, 12, 22, 0, 0)
colors = ['red', 'green', 'blue', 'red', 'green', 'blue']
for k, color in enumerate(colors):
    franja = plt.subplot(6, 1, k + 1)
    start = day_start + datetime.timedelta(hours=4 * k)
    end = start + datetime.timedelta(hours=4)
    index = np.logical_and(tiempo2 >= start, tiempo2 < end)
    franja.plot(tiempo2[index], valores2[index], lw=0.2, color=color)
    franja.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))
plt.show()
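If the timestamps are sorted, the band boundaries can also be located with a binary search instead of comparing every sample against both limits. The following is only a sketch: it assumes the times are stored as a NumPy datetime64 array rather than as Python datetime objects, it uses one sample per second to keep the demo small, and band_slices is a hypothetical helper name.

```python
import numpy as np


def band_slices(times, day_start, hours_per_band=4, n_bands=6):
    """Return one slice per band for a sorted datetime64 array."""
    edges = day_start + np.arange(n_bands + 1) * np.timedelta64(hours_per_band, "h")
    # Index of the first sample at or after each band edge
    cuts = np.searchsorted(times, edges, side="left")
    return [slice(int(a), int(b)) for a, b in zip(cuts[:-1], cuts[1:])]


day_start = np.datetime64("2018-12-22T00:00:00")
times = day_start + np.arange(24 * 3600) * np.timedelta64(1, "s")
values = np.random.normal(size=times.size)

slices = band_slices(times, day_start)
for k, s in enumerate(slices):
    print(k, times[s][0], times[s][-1], values[s].size)
```

Slicing with times[s] and values[s] returns views rather than copies, so each subplot only touches its own 4-hour window. Matplotlib accepts datetime64 arrays on the x axis directly.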

Custom Yaxis plot in matplotlib python

Let's say I have Height = [3, 12, 5, 18, 45] and plot it. The y-axis then has ticks from 0 up to 45 at an interval of 5, i.e. 0, 5, 10, 15, 20 and so on up to 45. Is there a way to define the interval (the step)? For example, I want the y-axis ticks to be 0, 15, 30, 45 for the same data set.
import matplotlib.pyplot as plt
import numpy as np
plt.plot([3, 12, 5, 18, 45])
plt.yticks(np.arange(0,45+1,15))
plt.show()
This should work
matplotlib.pyplot.yticks(np.arange(start, stop+1, step))
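When the data range is not known in advance, a tick locator can be used instead of a fixed list of positions. The sketch below uses MultipleLocator from matplotlib.ticker with a step of 15. The explicit y limits of 0 to 45 are an assumption made so that the visible ticks match the ones asked for.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.ticker import MultipleLocator

height = [3, 12, 5, 18, 45]

fig, ax = plt.subplots()
ax.plot(height)
ax.yaxis.set_major_locator(MultipleLocator(15))  # a major tick at every multiple of 15
ax.set_ylim(0, 45)

# Tick positions that fall inside the visible y range
visible = [float(t) for t in ax.get_yticks() if 0 <= t <= 45]
print(visible)
```

Unlike a hard-coded np.arange, the locator keeps producing multiples of 15 if the y limits change later.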
