In the code below I plot a timeline chart, but I don't know how to show all Y-axis values as integers at a regular interval. Does anyone have an idea?
code link: https://colab.research.google.com/drive/1Fq91PXlylJMKh6oUpysM95gpLwBfUcGx?usp=sharing
import numpy as np
import pandas as pd
import matplotlib
import matplotlib.pyplot as plt
from matplotlib.patches import Patch
matplotlib.rcParams.update(matplotlib.rcParamsDefault)
fig, ax = plt.subplots(1, figsize=(20,7))
ax.barh(df.year, df.days_start_to_end, left=df.startNum, color=df.color)
xticks = np.arange(0, df.endNum.max()+1, 3)
xticks_labels = pd.date_range(proj_start, end=df.end.max()).strftime("%m/%d ")
xticks_minor = np.arange(0, df.endNum.max()+1, 1)
#ax.set_yticks(np.arange(len(df.year)))
ax.set_xticks(xticks)
#ax.set_xticks(xticks_minor, minor=True)
ax.set_xticklabels(xticks_labels[::3])
c_dict = { 'Perfect': '#4db249', 'Good':'#539de3',
'Normal':'#ffbd63', 'Severe': '#ff7361', 'Drastic':'#ff2626'}
legendEl = [Patch(facecolor = c_dict[i], label = i) for i in c_dict]
plt.legend(handles = legendEl)
plt.show()
The values in your year column are plain numbers, not datetimes. If you would like to see all the labels on the Y-axis, you can convert that column to strings when plotting. This tells matplotlib that the values are text and should be plotted as categorical data. Below is the updated line. Note that this plots the years without a decimal part; I assume you don't need one.
ax.barh(df.year.astype('string'), df.days_start_to_end, left=df.startNum, color=df.color)
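If the years should stay numeric instead of becoming categories, another option is to pin the ticks to whole numbers with a locator and a formatter. The following is only a sketch: the DataFrame is a hypothetical stand-in for the one in the question (only the column names year, startNum and days_start_to_end are taken from it), and the Agg backend line can be dropped for interactive use.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove for interactive use
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib import ticker

# Hypothetical stand-in for the question's DataFrame.
df = pd.DataFrame({
    "year": [2015, 2016, 2017, 2018, 2019, 2020],
    "startNum": [0, 3, 5, 2, 8, 1],
    "days_start_to_end": [10, 7, 12, 9, 4, 15],
})

fig, ax = plt.subplots(figsize=(8, 4))
ax.barh(df.year, df.days_start_to_end, left=df.startNum)

# One tick per whole year, printed without a decimal part.
ax.yaxis.set_major_locator(ticker.MultipleLocator(1))
ax.yaxis.set_major_formatter(ticker.StrMethodFormatter("{x:.0f}"))

fig.canvas.draw()
ticks = ax.get_yticks()
labels = [t.get_text() for t in ax.get_yticklabels()]
```

Compared with astype('string'), this keeps the axis numeric, so a missing year would still leave a gap of the right size.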
I was trying to plot multiple bar charts as subplot but the y axis keeps on getting scientific notation values. The initial code I ran was:
from matplotlib import pyplot as plt
fig, axes = plt.subplots(7,3,figsize = (25, 40)) # axes is a numpy array of pyplot Axes
axes = iter(axes.ravel())
cat_columns=['Source','Side','State','Timezone',
'Amenity', 'Bump', 'Crossing', 'Give_Way',
'Junction', 'No_Exit', 'Railway', 'Roundabout', 'Station', 'Stop',
'Traffic_Calming', 'Traffic_Signal', 'Turning_Loop', 'Sunrise_Sunset',
'Civil_Twilight', 'Nautical_Twilight', 'Astronomical_Twilight']
for col in cat_columns:
    ax = df[col].value_counts().plot(kind='bar', label=col, ax=next(axes))
And the output looks like this: [image not available]
fig, axes = plt.subplots(7,3,figsize = (25, 40)) # axes is a numpy array of pyplot Axes
axes = iter(axes.ravel())
cat_columns=['Source','Side','State','Timezone',
'Amenity', 'Bump', 'Crossing', 'Give_Way',
'Junction', 'No_Exit', 'Railway', 'Roundabout', 'Station', 'Stop',
'Traffic_Calming', 'Traffic_Signal', 'Turning_Loop', 'Sunrise_Sunset',
'Civil_Twilight', 'Nautical_Twilight', 'Astronomical_Twilight']
for col in cat_columns:
    ax = df[col].value_counts().plot(kind='bar', label=col, ax=next(axes))
    ax.ticklabel_format(useOffset=False, style='plain')
After adding the line ax.ticklabel_format(useOffset=False, style='plain') I get an error: [image not available]
Please guide me on this error.
You can turn this off by creating a custom ScalarFormatter object and turning scientific notation off. For more details, see the matplotlib documentation pages on tick formatters and on ScalarFormatter.
# additional import statement at the top
from matplotlib import pyplot as plt
from matplotlib import ticker
# additional code for every axis
formatter = ticker.ScalarFormatter()
formatter.set_scientific(False)
ax.yaxis.set_major_formatter(formatter)
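As a self-contained illustration of the formatter approach, here is a minimal sketch on a single bar chart. The category names and counts are made up for the example; they are only chosen large enough that matplotlib would otherwise switch to scientific notation.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove for interactive use
import matplotlib.pyplot as plt
from matplotlib import ticker

# Hypothetical category counts, large enough to trigger scientific notation.
counts = {"Day": 2_500_000, "Night": 1_200_000}

fig, ax = plt.subplots()
ax.bar(list(counts), list(counts.values()))

formatter = ticker.ScalarFormatter()
formatter.set_scientific(False)  # no "1e6" multiplier on the axis
formatter.set_useOffset(False)   # no additive offset either
ax.yaxis.set_major_formatter(formatter)

fig.canvas.draw()
labels = [t.get_text() for t in ax.get_yticklabels()]
print(labels)
```

If the error in the question comes from the categorical x axis of the bar plot (an assumption, since the traceback was only posted as an image), restricting the original call to the y axis with ax.ticklabel_format(axis='y', useOffset=False, style='plain') may also be enough.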
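I am trying to create a heat map from a pandas DataFrame using the seaborn library. Here is the code:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.dates as mdates
# pd.date_range replaces the older pd.DatetimeIndex(start=..., end=...) form
test_df = pd.DataFrame(np.random.randn(367, 5),
                       index=pd.date_range(start='01-01-2000', end='01-01-2001', freq='1D'))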
ax = sns.heatmap(test_df.T)
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_minor_locator(mdates.DayLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b'))
ax.xaxis.set_minor_formatter(mdates.DateFormatter('%d'))
However, I am getting a figure with nothing printed on the x-axis.
A seaborn heatmap is a categorical plot. Its axis scales from 0 to number of columns - 1, in this case from 0 to 366. The datetime locators and formatters expect values that are dates (or, more precisely, numbers that correspond to dates). For the year in question, with the date epoch matplotlib used before version 3.3, those would be numbers between 730120 (= 01-01-2000) and 730486 (= 01-01-2001); with the default epoch of 1970-01-01 used since 3.3 they are 10957 and 11323.
So in order to use the matplotlib.dates formatters and locators, you first need to convert your dataframe index to datetime objects. You then cannot use a heatmap; you need a plot that allows numerical axes, e.g. an imshow plot. You can set the extent of that imshow plot to correspond to the date range you want to show.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
# pd.date_range replaces the older pd.DatetimeIndex(start=..., end=...) form
df = pd.DataFrame(np.random.randn(367, 5),
                  index=pd.date_range(start='01-01-2000', end='01-01-2001', freq='1D'))
dates = df.index.to_pydatetime()
dnum = mdates.date2num(dates)
start = dnum[0] - (dnum[1]-dnum[0])/2.
stop = dnum[-1] + (dnum[1]-dnum[0])/2.
extent = [start, stop, -0.5, len(df.columns)-0.5]
fig, ax = plt.subplots()
im = ax.imshow(df.T.values, extent=extent, aspect="auto")
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_minor_locator(mdates.DayLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b'))
fig.colorbar(im)
plt.show()
I found this question when trying to do a similar thing and you can hack together a solution but it's not very pretty.
For example I get the current labels, loop over them to find the ones for January and set those to just the year, setting the rest to be blank.
This gives me year labels in the correct position.
xticklabels = ax.get_xticklabels()
for label in xticklabels:
    text = label.get_text()
    if text[5:7] == '01':
        label.set_text(text[0:4])
    else:
        label.set_text('')
ax.set_xticklabels(xticklabels)
Hopefully from that you can figure out what you want to do.
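A variation on the same idea is to compute the tick positions from the index instead of editing the label text. The sketch below uses ax.pcolormesh rather than sns.heatmap so that it runs without seaborn; it assumes that the heatmap lays out its cells the same way (cell i spanning x = i to x = i + 1), so the tick code should carry over to the Axes returned by sns.heatmap, but that is not verified here.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

index = pd.date_range("2000-01-01", "2001-01-01", freq="D")
df = pd.DataFrame(np.random.randn(len(index), 5), index=index)

fig, ax = plt.subplots(figsize=(10, 3))
# Cell i is drawn between x = i and x = i + 1.
ax.pcolormesh(df.T.values)

# Column positions of the first day of each month, shifted to the cell centre.
month_starts = np.flatnonzero(index.day == 1)
month_names = list(index[month_starts].strftime("%b"))
ax.set_xticks(month_starts + 0.5)
ax.set_xticklabels(month_names)

fig.canvas.draw()
```

Because the ticks are placed by position, this does not depend on what text the plotting library happened to put in the labels.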
I am trying to plot a simple .csv file downloaded from Yahoo-finance (file example here), but I cannot understand why the years appear as (apparently) random numbers. Please see image below:
Another thing that I would like to do is to remove the x-axis labels from the top graph (since the same axis is already in the bottom plot) while keeping the dashed grid. I tried to use ax[0].set_xticklabels([]), but it didn't work.
Here is my code:
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.dates import DateFormatter, MonthLocator, YearLocator
#LOAD DATA
df_name = "0P0000UL8U.L.csv"
col_list = ["Date", "Adj Close"] #list of column to import
df = pd.read_csv(df_name, header=0, usecols=col_list, na_values=['null'], thousands=r',', parse_dates=["Date"], dayfirst=True)
df = df.dropna() #Drop the rows where at least one element is missing.
df.set_index("Date", inplace = True)
df.index = [pd.to_datetime(date).date() for date in df.index] #convert index to datetime.date, not datetime.datetime.
print("Opening df:\n", df)
print("\nLength of df: ", len(df.index))
#PLOT DATA
fig, ax = plt.subplots(2,1, figsize=(11,5))
plt.subplots_adjust(left=None, bottom=None, right=None, top=None, wspace=0.25, hspace=0.8) #Adjust space between graphs
df[["Adj Close"]].plot(ax=ax[0], kind="line", style="-", color="blue", stacked=False, rot=90)
ax[0].set_axisbelow(True) # To put plot grid below plots
ax[0].yaxis.grid(color='gray', linestyle='dashed')
ax[0].xaxis.grid(color='gray', linestyle='dashed')
ax[0].xaxis.set_major_locator(YearLocator()) # one major tick per year
ax[0].xaxis.set_major_formatter(DateFormatter("%b %Y"))
ax[0].set(xlabel=None, ylabel="Price") # Set title and labels for axes
df[["Adj Close"]].plot(ax=ax[1], kind="line", style="-", color="blue", stacked=False, rot=90)
ax[1].set_axisbelow(True) # To put plot grid below plots
ax[1].yaxis.grid(color='gray', linestyle='dashed')
ax[1].xaxis.grid(color='gray', linestyle='dashed')
ax[1].xaxis.set_major_locator(YearLocator()) # one major tick per year
ax[1].xaxis.set_major_formatter(DateFormatter("%b %Y"))
ax[1].set(xlabel="Time", ylabel="Price") # Set title and labels for axes
fig.savefig("0P0000UL8U.L.png", bbox_inches='tight', dpi=300)
What am I doing wrong? Thanks in advance for any help.
To remove the x-Axis labels from the top graph, you can add the following line:
ax[0].tick_params(labelbottom=False)
before ax[0].set(xlabel=None, ylabel="Price")
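A minimal sketch of that suggestion on toy data (the values are made up), showing that tick_params(labelbottom=False) hides only the tick labels while the dashed grid lines stay:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove for interactive use
import matplotlib.pyplot as plt

fig, ax = plt.subplots(2, 1, figsize=(6, 4))
for a in ax:
    a.plot([0, 1, 2, 3], [1, 3, 2, 4])
    a.xaxis.grid(color="gray", linestyle="dashed")

# Hide only the tick labels on the top plot; ticks and grid lines remain.
ax[0].tick_params(labelbottom=False)

fig.canvas.draw()
top_tick = ax[0].xaxis.get_major_ticks()[0]
bottom_tick = ax[1].xaxis.get_major_ticks()[0]
print(top_tick.label1.get_visible(), top_tick.gridline.get_visible())
```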
It's not your fault. This looks like a regression in date handling in matplotlib releases after 3.2.2, and there are several related bug reports (#18010, #17983, #34850).
In the meantime you can downgrade matplotlib to v3.2.2, which works correctly, and wait for a fix.
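Another thing that may be worth trying, without changing versions, is to plot with matplotlib directly instead of through DataFrame.plot, so that the date locator and formatter see the values matplotlib itself converted. This is only a sketch and rests on the assumption that the odd years come from the interaction between pandas plotting and matplotlib's date units; the price series is synthetic, not the CSV from the question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.dates import DateFormatter, YearLocator

# Synthetic price series standing in for the CSV data.
dates = pd.date_range("2018-01-01", "2020-12-31", freq="D")
df = pd.DataFrame({"Adj Close": np.linspace(100, 150, len(dates))}, index=dates)

fig, ax = plt.subplots(figsize=(8, 3))
# Plot with matplotlib directly instead of DataFrame.plot.
ax.plot(df.index, df["Adj Close"], color="blue")
ax.xaxis.set_major_locator(YearLocator())
ax.xaxis.set_major_formatter(DateFormatter("%Y"))

fig.canvas.draw()
labels = [t.get_text() for t in ax.get_xticklabels()]
print(labels)
```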
I want to set the x tick density by specifying how many ticks to skip each time. For example, if the x axis is labelled by 100 consecutive dates, and I want to skip every 10 dates, then I will do something like
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
ts = pd.period_range("20060101", periods=100).strftime("%Y%m%d")
y = np.random.randn(100)
ax = plt.subplot(1, 1, 1)
ax.plot(ts, y)
xticks = ax.get_xticks()
ax.set_xticks(xticks[::10])
plt.xticks(rotation="vertical")
plt.show()
However, the output is out of place. Pyplot only picks the first few ticks and places them all in the wrong positions, although the spacing is correct:
What can I do to get the desired output? Namely the ticks should be instead:
['20060101' '20060111' '20060121' '20060131' '20060210' '20060220'
'20060302' '20060312' '20060322' '20060401']
@klim's answer seems to put the correct marks on the axis, but the labels still won't show. An example where the date axis is correctly marked yet without labels:
Set xticklabels also. Like this.
xticks = ax.get_xticks()
xticklabels = ax.get_xticklabels()
ax.set_xticks(xticks[::10])
ax.set_xticklabels(xticklabels[::10], rotation=90)
Forget the above, which doesn't work.
How about this?
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
ts = pd.period_range("20060101", periods=100).strftime("%Y%m%d")
x = np.arange(len(ts))
y = np.random.randn(100)
ax = plt.subplot(1, 1, 1)
ax.plot(x, y)
ax.set_xticks(x[::10])
ax.set_xticklabels(ts[::10], rotation="vertical")
plt.show()
This works on my machine.
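A related variant, sketched here under the same setup, keeps the numeric x positions but lets a locator and a formatter pick the labels, so the tick spacing is stated once rather than by slicing two lists. The helper name label_for is made up for this example.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib import ticker

ts = pd.period_range("20060101", periods=100).strftime("%Y%m%d")
y = np.random.randn(100)

def label_for(value, pos=None):
    """Return the date string at an integer position, or '' outside the data."""
    i = int(round(value))
    return ts[i] if 0 <= i < len(ts) else ""

fig, ax = plt.subplots()
ax.plot(np.arange(len(ts)), y)
ax.xaxis.set_major_locator(ticker.MultipleLocator(10))  # a tick every 10 points
ax.xaxis.set_major_formatter(ticker.FuncFormatter(label_for))
ax.tick_params(axis="x", labelrotation=90)
fig.canvas.draw()
```

Since the labels are looked up on demand, they should stay correct if the axis limits change later.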
I have data that shows some values collected on three different dates: 2015-01-08, 2015-01-09 and 2015-01-12. For each date there are several data points that have timestamps.
Date/times are in a list and it looks as follows:
['2015-01-08-09:00:00', '2015-01-08-10:00:00', '2015-01-08-11:00:00', '2015-01-08-12:00:00', '2015-01-08-13:00:00', '2015-01-09-14:00:00', '2015-01-09-15:00:00', '2015-01-09-16:00:00', '2015-01-12-09:00:00', '2015-01-12-10:00:00', '2015-01-12-11:00:00']
On the other hand I have corresponding values (floats) in another list:
[12210.0, 12210.0, 12180.0, 12240.0, 12250.0, 12420.0, 12390.0, 12400.0, 12380.0, 12450.0, 12460.0]
To put all this together and plot a graph I use following code:
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import matplotlib.dates as md
import dateutil.parser
from matplotlib.font_manager import FontProperties
timestamps = ['2015-01-08-09:00:00', '2015-01-08-10:00:00', '2015-01-08-11:00:00', '2015-01-08-12:00:00', '2015-01-08-13:00:00', '2015-01-09-14:00:00', '2015-01-09-15:00:00', '2015-01-09-16:00:00', '2015-01-12-09:00:00', '2015-01-12-10:00:00', '2015-01-12-11:00:00']
ticks = [12210.0, 12210.0, 12180.0, 12240.0, 12250.0, 12420.0, 12390.0, 12400.0, 12380.0, 12450.0, 12460.0]
plt.subplots_adjust(bottom=0.2)
plt.xticks( rotation=90 )
dates = [dateutil.parser.parse(s) for s in timestamps]
ax=plt.gca()
ax.set_xticks(dates)
ax.tick_params(axis='x', labelsize=8)
xfmt = md.DateFormatter('%H:%M:%S')
ax.xaxis.set_major_formatter(xfmt)
plt.plot(dates, ticks, label="Price")
plt.xlabel("Date and time", fontsize=12)
plt.ylabel("Price", fontsize=12)
plt.suptitle("Price during last three days", fontsize=12)
plt.legend(loc=0,prop={'size':8})
plt.savefig("figure.pdf")
When I try to plot these datetimes and values I get a messy graph with the line going back and forth.
It looks like the dates are being ignored and only the times are taken into account, which is the reason for the messy chart. I tried editing the datetimes to have the same date and consecutive times, and that fixed the chart. However, I need the dates as well.
What am I doing wrong?
"When I try to plot these datetimes and values I get a messy graph with the line going back and forth."
Your plots are going all over the place because plt.plot connects the dots in the order you give it. If this order is not monotonically increasing in x, then it looks "messy". You can sort the points by x first to fix this. Here is a minimal example:
import numpy as np
import matplotlib.pyplot as plt
X = np.random.random(20)
Y = 2*X+np.random.random(20)
idx = np.argsort(X)
X2 = X[idx]
Y2 = Y[idx]
fig,ax = plt.subplots(2,1)
ax[0].plot(X,Y)
ax[1].plot(X2,Y2)
plt.show()
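Applied to the timestamps from the question, the same idea might look like the sketch below. It also parses the strings with an explicit format, on the assumption (not confirmed in the thread) that part of the problem is the date portion being read differently than intended. Only a few of the original values are used to keep it short.

```python
from datetime import datetime

timestamps = ['2015-01-08-09:00:00', '2015-01-08-10:00:00',
              '2015-01-09-14:00:00', '2015-01-12-09:00:00']
prices = [12210.0, 12210.0, 12420.0, 12380.0]

# Spell out the format so the date part cannot be misread.
dates = [datetime.strptime(s, "%Y-%m-%d-%H:%M:%S") for s in timestamps]

# Sort (date, price) pairs by date so the line is drawn left to right.
pairs = sorted(zip(dates, prices))
dates_sorted, prices_sorted = zip(*pairs)
print(dates_sorted[0], dates_sorted[-1])
```

The sorted sequences can then be passed to plt.plot(dates_sorted, prices_sorted) in place of the original lists.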