I want to plot data with datetime values on the x axis, but it takes extremely long: after more than 30 minutes there is still no figure. When I use the data from another column instead, the plot appears almost immediately. All the data are stored as Python lists.
I'm confused: since both are lists, why can't I plot with the datetime column on the x axis?
Here is my code. I appreciate your time and help!
from matplotlib import pyplot as plt
import csv
names = []
x = []
y = []
with open('all.csv','r') as csvfile: #this csv file contains over 16000 datas
    plots= csv.reader(csvfile,delimiter=',')
    for row in plots:
        x.append(row[1]) #row1 is the datetime format data
        y.append(row[2])
print(x,y)
plt.plot(x,y)
plt.show()
Lines of my csv file look something like:
2016/05/02 10:47:45,14.1,20.1,N.C.,170.7,518.3,-1259,-12.61,375.8,44.92,13.76,92.74,132.6,38.86,165.3,170.9,311.5,252.3,501.2,447.2,378.4,35.48,7.868,181.2,
I want the first column as the x axis and the following columns as the y axis.
Also, the y axis doesn't change, no matter how I change the y-axis limit:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
df = pd.read_csv('all.csv')
x = df.iloc[:,1]
y = df.iloc[:,3]
x = pd.to_datetime(x)
plt.figure(num=3, figsize=(15, 5))
plt.plot(x,y)
my_y_ticks = np.arange(0, 40, 10)
plt.xticks(rotation = 90)
plt.show()
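One possible reason the y axis ignores new limits, offered as a guess rather than a diagnosis: if the selected column contains text such as N.C. (as the fourth field of the sample row does), pandas reads the whole column as strings, and matplotlib then treats each distinct string as a category instead of a number. A minimal sketch, with made-up values, of forcing such a column to numbers first:

```python
import pandas as pd

# Made-up stand-in for one CSV column that mixes numbers with a text marker.
raw = pd.Series(["20.1", "N.C.", "19.8", "21.4"])

# errors="coerce" turns anything that is not a number into NaN
# instead of raising, so the result has a float dtype.
y = pd.to_numeric(raw, errors="coerce")

print(y.dtype)
print(y.isna().sum())
```

With a numeric column, plt.ylim(0, 40) or plt.yticks(my_y_ticks) should then take effect. Note that the code above computes my_y_ticks but never passes it to plt.yticks.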
I haven't understood exactly what you mean by "all the datas' formats are list", but I think you could use something like this:
import matplotlib.pyplot as plt
import pandas as pd
df = pd.read_csv('all.csv')
x = df.iloc[:,0]
y = df.iloc[:,1]
x = pd.to_datetime(x)
plt.plot(x,y)
plt.show()
Showing a few rows of your file would help.
EDIT:
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import pandas as pd
df = pd.read_csv('all.csv')
x = df.iloc[:,0]
y = df.iloc[:,1]
x = pd.to_datetime(x, format="%Y/%m/%d %H:%M:%S") #must match the file, e.g. 2016/05/02 10:47:45
fig, ax = plt.subplots()
ax.plot(x, y)
xfmt = mdates.DateFormatter("%Y/%m/%d %H:%M:%S")
ax.xaxis.set_major_formatter(xfmt)
plt.xticks(rotation=70)
plt.show()
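As background on the slowness: csv.reader yields every field as a string, and recent matplotlib versions treat a list of strings as categories, creating one tick per distinct value, which gets very slow with thousands of timestamps. If you would rather stay with the csv module than switch to pandas, here is a sketch that parses the fields before plotting. The helper name load_series and the two sample rows are made up for illustration; the format string matches the row shown in the question.

```python
import csv
import io
from datetime import datetime


def load_series(csvfile, x_col=0, y_col=1, fmt="%Y/%m/%d %H:%M:%S"):
    """Return (timestamps, values) parsed from an open CSV file object."""
    xs, ys = [], []
    for row in csv.reader(csvfile):
        if not row:
            continue  # ignore blank lines
        try:
            value = float(row[y_col])
        except ValueError:
            continue  # skip non-numeric entries such as 'N.C.'
        xs.append(datetime.strptime(row[x_col], fmt))
        ys.append(value)
    return xs, ys


# Two made-up rows in the same layout as the question's file.
sample = io.StringIO(
    "2016/05/02 10:47:45,14.1,20.1,N.C.,170.7\n"
    "2016/05/02 10:47:46,14.3,20.0,N.C.,170.9\n"
)
x, y = load_series(sample)
print(x[0], y)
```

For the real file you would pass open('all.csv', newline='') instead of the StringIO object and then call plt.plot(x, y) as before.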
Related
I have a simple dataframe with the time as index and dummy values as an example.
I did a simple scatter plot as you see here:
Simple question: How to adjust the xaxis, so that all time values from 00:00 to 23:00 are visible in the xaxis? The rest of the plot is fine, it shows all the datapoints, it is just the labeling. Tried different things but didn't work out.
All my code so far is:
import pandas as pd
import seaborn as sns
import matplotlib.dates as mdates
from datetime import time
data = []
for i in range(0, 24):
    temp_list = []
    temp_list.append(time(i))
    temp_list.append(i)
    data.append(temp_list)
my_df = pd.DataFrame(data, columns=["time", "values"])
my_df.set_index(['time'],inplace=True)
my_df
fig = sns.scatterplot(my_df.index, my_df['values'])
fig.set(xlabel='time', ylabel='values')
I think you're gonna have to go down to the matplotlib level for this:
import pandas as pd
import seaborn as sns
import matplotlib.dates as mdates
from datetime import time
import matplotlib.pyplot as plt
data = []
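for i in range(0, 24):
    temp_list = []
    temp_list.append(time(i))
    temp_list.append(i)
    data.append(temp_list)
df = pd.DataFrame(data, columns=["time", "values"])
# to_datetime parses strings, so turn the datetime.time objects into "HH:MM:SS" text first
df["time"] = pd.to_datetime(df["time"].astype(str), format='%H:%M:%S')
df.set_index(['time'],inplace=True)
# recent seaborn releases expect x and y as keyword arguments
ax = sns.scatterplot(x=df.index, y=df["values"])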
ax.set(xlabel="time", ylabel="measured values")
ax.set_xlim(df.index[0], df.index[-1])
ax.xaxis.set_major_locator(mdates.HourLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M:%S"))
ax.tick_params(axis="x", rotation=45)
This produces
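As far as I know, matplotlib's date locators need full datetimes rather than bare datetime.time values, which is why the time column is converted in the code above. The same conversion can be sketched with the standard library only. The anchor date 1900-01-01 is an arbitrary choice (it is also the date strptime fills in when the format has no date part), and it never shows up on the axis as long as the formatter is "%H:%M:%S".

```python
from datetime import date, datetime, time

anchor_day = date(1900, 1, 1)  # arbitrary; hidden by an "%H:%M:%S" formatter
times = [time(hour) for hour in range(24)]

# Attach the same day to every time so the values become plottable datetimes.
stamps = [datetime.combine(anchor_day, t) for t in times]

print(stamps[0], stamps[-1])
```

The resulting list can be passed as x to ax.plot or ax.scatter and combined with mdates.HourLocator() exactly as in the answer.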
I think you have two options.
First, convert the time to the hour only. For that, just extract the hour into a new column in your df:
df['hour_'] = [t.hour for t in df.index]  # assumes the index holds datetime.time values, as in the question
Then use it as your x axis.
Second, if you need the time in the format you described, you may run into a visibility problem where the timestamps overlap each other. For that I use:
plt.xticks(rotation=45, horizontalalignment='right')
ax.xaxis.set_major_locator(plt.MaxNLocator(12))
So first I rotate the text, then I limit the number of ticks.
Here is a full script where I used it:
import matplotlib.pyplot as plt
import seaborn as sns
sns.set()
sns.set_style("whitegrid")
# df_forPlots is my own dataframe; replace the column names with yours
for k, g in df_forPlots.groupby('your_column'):
    fig = plt.figure(figsize=(10,5))
    wide_df = g[['x', 'y', 'z']].set_index('x')
    ax = sns.lineplot(data=wide_df)
    plt.xticks(rotation=45,
               horizontalalignment='right')
    ax.yaxis.set_major_locator(plt.MaxNLocator(14))
    ax.xaxis.set_major_locator(plt.MaxNLocator(35))
    plt.title(f"your {k} in something {g.z.unique()}")
    plt.tight_layout()
Hope this helps.
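To make the tick-limiting idea above concrete without the answerer's data, here is a small self-contained sketch. The hourly timestamps and values are made up, and the Agg backend is selected only so that the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
from matplotlib.ticker import MaxNLocator
from datetime import datetime, timedelta

# Made-up data: 200 hourly samples.
start = datetime(2021, 1, 1)
x = [start + timedelta(hours=i) for i in range(200)]
y = list(range(200))

fig, ax = plt.subplots(figsize=(10, 5))
ax.plot(x, y)

# Ask for at most 12 intervals on the x axis; matplotlib picks the positions.
ax.xaxis.set_major_locator(MaxNLocator(12))
# Rotate the labels and right-align them so they do not run into each other.
plt.xticks(rotation=45, horizontalalignment="right")
fig.tight_layout()
fig.canvas.draw()  # render once; use plt.show() or fig.savefig(...) in practice
```

MaxNLocator places ticks at "nice" numeric positions, so on a date axis they will not necessarily fall on whole hours or days. If that matters, the locators in matplotlib.dates are the better fit.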
I am trying to plot information against dates. I have a list of dates in the format "01/02/1991".
I converted them by doing the following:
x = parser.parse(date).strftime('%Y%m%d')
which gives 19910102
Then I tried to use num2date
import matplotlib.dates as dates
new_x = dates.num2date(x)
Plotting:
plt.plot_date(new_x, other_data, fmt="bo", tz=None, xdate=True)
But I get an error. It says "ValueError: year is out of range". Any solutions?
You can do this more simply using plot() instead of plot_date().
First, convert your strings to instances of Python datetime.date:
import datetime as dt
dates = ['01/02/1991','01/03/1991','01/04/1991']
x = [dt.datetime.strptime(d,'%m/%d/%Y').date() for d in dates]
y = range(len(x)) # many thanks to Kyss Tao for setting me straight here
Then plot:
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%m/%d/%Y'))
plt.gca().xaxis.set_major_locator(mdates.DayLocator())
plt.plot(x,y)
plt.gcf().autofmt_xdate()
Result:
I don't have enough reputation to comment on @bernie's answer in reply to @user1506145, but I have run into the same issue.
The fix is the interval parameter of the locator:
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import numpy as np
import datetime as dt
np.random.seed(1)
N = 100
y = np.random.rand(N)
now = dt.datetime.now()
then = now + dt.timedelta(days=100)
days = mdates.drange(now,then,dt.timedelta(days=1))
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d'))
plt.gca().xaxis.set_major_locator(mdates.DayLocator(interval=5))
plt.plot(days,y)
plt.gcf().autofmt_xdate()
plt.show()
As @KyssTao has been saying, help(dates.num2date) says that x has to be a float giving the number of days since 0001-01-01 plus one (matplotlib 3.3 and later count from 1970-01-01 by default instead). Either way, 19910102 is not 2 January 1991: counting 19910102 days from 0001-01-01 lands somewhere around the year 54511 (divide by 365.25, the average number of days in a year).
Use datestr2num instead (see help(dates.datestr2num)):
new_x = dates.datestr2num(date) # where date is '01/02/1991'
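To illustrate the point about what num2date expects, here is a short sketch of the round trip between datetimes and matplotlib's internal floats. The actual float depends on the epoch of the installed matplotlib version, so the sketch only compares values and does not print a specific number.

```python
from datetime import datetime
import matplotlib.dates as mdates

d = datetime(1991, 1, 2)

# date2num gives matplotlib's internal float: days since the library's epoch.
num = mdates.date2num(d)

# num2date is the inverse; it returns a timezone-aware datetime (UTC by default).
back = mdates.num2date(num)

# datestr2num parses the string with dateutil and converts it in one step.
# dateutil reads "01/02/1991" month-first, i.e. as 2 January 1991.
from_string = mdates.datestr2num("01/02/1991")

print(back.replace(tzinfo=None) == d)
print(from_string == num)
```

A string such as '19910102' is a label, not a day count, which is why feeding it to num2date produces a year far out of range.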
Adapting @Jacek Szałęga's answer for the use of a figure fig and corresponding axes object ax:
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import numpy as np
import datetime as dt
np.random.seed(1)
N = 100
y = np.random.rand(N)
now = dt.datetime.now()
then = now + dt.timedelta(days=100)
days = mdates.drange(now,then,dt.timedelta(days=1))
fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(days,y)
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d'))
ax.xaxis.set_major_locator(mdates.DayLocator(interval=5))
ax.tick_params(axis='x', labelrotation=45)
plt.show()
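If picking the interval by hand is inconvenient, a different technique from the answers above is to let matplotlib choose the spacing with AutoDateLocator and shorten the labels with ConciseDateFormatter (available in matplotlib 3.1 and later). This is only a sketch with made-up data and a fixed start date; the Agg backend is used so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import numpy as np
import datetime as dt

rng = np.random.default_rng(1)
start = dt.datetime(2020, 1, 1)
days = [start + dt.timedelta(days=i) for i in range(100)]
y = rng.random(100)

fig, ax = plt.subplots()
ax.plot(days, y)

# The locator decides how many ticks fit; the formatter keeps labels short
# by moving the shared part of the date (for example the year) to the axis offset.
locator = mdates.AutoDateLocator()
ax.xaxis.set_major_locator(locator)
ax.xaxis.set_major_formatter(mdates.ConciseDateFormatter(locator))

fig.canvas.draw()  # render once; use plt.show() in an interactive session
```

The trade-off is less control: DayLocator(interval=5) guarantees a tick every five days, while the automatic locator may switch to weekly or monthly ticks depending on the axis range.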
I am plotting two pandas series. The index is a date (1-1 to 12-31)
s1.plot()
s2.plot()
The pandas plot() method interprets the dates and assigns them to axis values as such:
I would like to modify the major ticks to be the 1st of every month and the minor ticks to be the days in between.
This works:
%matplotlib notebook
import matplotlib as mpl
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd
df = pd.read_csv('data.csv')
df['Date'] = pd.to_datetime(df['Date']).dt.strftime('%m-%d')
s2014max = df2014.groupby(['Date'], sort=True)['Data_Value'].max()/10
s2014min = df2014.groupby(['Date'], sort=True)['Data_Value'].min()/10
#remove the leap day and convert to datetime for plotting
s2014min = s2014min[s2014min.index != '02-29']
s2014max = s2014max[s2014max.index != '02-29']
dateslist = s2014min.index.tolist()
dates = [pd.datetime.strptime(date, '%m-%d').date() for date in dateslist]
plt.figure()
ax = plt.gca()
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_minor_locator(mdates.DayLocator())
monthFmt = mdates.DateFormatter('%b')
dayFmt = mdates.DateFormatter('%d')
ax.xaxis.set_major_formatter(monthFmt)
ax.xaxis.set_minor_formatter(dayFmt)
ax.tick_params(direction='out', pad=15)
s2014min.plot()
s2014max.plot()
This results in no ticks:
A possible way is to use matplotlib for plotting the dates instead of pandas.
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import numpy as np
dates = pd.date_range("2016-01-01", "2016-12-31" )
y = np.cumsum(np.random.normal(size=len(dates)))
df = pd.DataFrame({"Dates" : dates, "y": y})
fig, ax = plt.subplots()
ax.plot(df["Dates"], df["y"], '-')  # plot() handles datetimes; plot_date() is deprecated in recent matplotlib
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_minor_locator(mdates.DayLocator())
monthFmt = mdates.DateFormatter('%b')
ax.xaxis.set_major_formatter(monthFmt)
plt.show()
You were so close! All you needed to do was add the formatters similar to how the other answer did it. Here is a working sample similar to your code (note I did mine in ipython notebook hence the %matplotlib inline).
%matplotlib inline
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from datetime import datetime, timedelta
from random import random
y = [random() for i in range(25)]
x = [(datetime.now() - timedelta(days=i)) for i in range(25)]
x.reverse()
s = pd.Series(y, index=x) # NOTE: S, not df, since you said you were using series
# format the ticks
ax = plt.gca()
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_minor_locator(mdates.DayLocator())
monthFmt = mdates.DateFormatter('%b')
dayFmt = mdates.DateFormatter('%d')
ax.xaxis.set_major_formatter(monthFmt) # This is what you needed
ax.xaxis.set_minor_formatter(dayFmt) # This is what you needed
ax.tick_params(direction='out', pad=15)
# format the coords message box
s.plot(figsize=(10,3))
which will look like this:
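A possible explanation for why pandas plots sometimes ignore matplotlib's date locators, offered with some caution: for a Series with a regular DatetimeIndex, Series.plot() installs pandas' own date handling on the axis. pandas documents an x_compat=True option that switches this off so that the matplotlib.dates locators behave as usual. A sketch with made-up data (the Agg backend is used so it runs without a display):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in a notebook
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import numpy as np
import pandas as pd

# Made-up daily series covering one non-leap year.
idx = pd.date_range("2014-01-01", "2014-12-31")
s = pd.Series(np.linspace(0.0, 1.0, len(idx)), index=idx)

fig, ax = plt.subplots(figsize=(10, 3))
# x_compat=True makes pandas hand ordinary matplotlib dates to the axis.
s.plot(ax=ax, x_compat=True)

# Set locators and formatter after plotting so nothing overrides them.
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_minor_locator(mdates.DayLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%b"))
fig.canvas.draw()  # render once; use plt.show() interactively
```

Whether this applies to the question's data depends on the index type: an index of '%m-%d' strings, as produced by strftime there, is not a DatetimeIndex and would need converting first.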
I cannot figure this out at all: I can read a date from a CSV file, but I cannot get the date to show as a label on the x axis. I have tried all the approaches that people have suggested, but I cannot get any of them to work. Could someone look at the stripped-down version of my code and tell me what I am missing, please?
A sample of the data being read from the csv file:
2015-08-04 02:14:05.249392,AA,0.0193103612,0.0193515212,0.0249713335,30.6542480634,30.7195875454,39.640763021,0.2131498442,29.0406746589,13524.5347810182,89,57,99
2015-08-05 02:14:05.325113,AAPL,0.0170506271,0.0137941891,0.0105915637,27.0670313481,21.8975963326,16.8135861893,-19.0986405157,-23.2172064279,21.5647072302,33,26,75
2015-08-06 02:14:05.415193,AIG,0.0080808151,0.0073296055,0.0076213535,12.8278962785,11.635388035,12.0985236788,-9.2962105215,3.980405659,-142.8175077335,71,42,33
2015-08-07 02:14:05.486185,AMZN,0.0235649449,0.0305828226,0.0092703502,37.4081902773,48.5487257749,14.7162247572,29.7810062852,-69.6877219282,-334.0005615016,2,92,10
stripped down code
import matplotlib.pyplot as plt
import numpy as np
import matplotlib.font_manager as fm
ax = plt.subplots(1, 1, figsize=(16, 20), dpi=50) #width height inches
data=np.genfromtxt('/home/dave/Desktop/development/hvanal2015s.csv',
dtype='M8[us],S5,float,float,float',delimiter=',',usecols= [0,1,11,12,13])
my_dates = np.array([d[0] for d in data]).astype('datetime64[D]')
dates = np.unique(my_dates)
print(dates)
x_list = []
y_list = [10,11,12,13]
x_list = dates
plt.plot(x_list,y_list)
plt.title('hv 20 to 10 ranks',fontsize=20)
plt.xlabel('dates')
plt.ylabel('symbol ranks',fontsize=30)
plt.show()
and the output as a png file
Older versions of matplotlib (before 2.2) do not support numpy datetime64 objects, so you need to convert them to Python datetime objects and then select a formatter, as in the code below:
import matplotlib.pyplot as plt
import numpy as np
import matplotlib.font_manager as fm
from datetime import datetime
import matplotlib.dates as mdates
fig,ax = plt.subplots(1, 1)
data=np.genfromtxt('data',
    dtype='M8[us],S5,float,float,float',delimiter=',',usecols= [0,1,11,12,13])
my_dates = np.array([d[0] for d in data]).astype('datetime64[D]')
dates = np.unique(my_dates)
print(dates)
# datetime64[D] values convert to datetime.date objects, which matplotlib can plot
x_list = dates.astype(datetime).tolist()
y_list = [10,11,12,13]
plt.plot(x_list,y_list)
plt.title('hv 20 to 10 ranks',fontsize=20)
plt.xlabel('dates',fontsize=16)
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d'))
plt.ylabel('symbol ranks',fontsize=30)
plt.show()
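The conversion step on its own can be checked without a CSV file or a plot. The sketch below uses three made-up timestamps in the same layout as the sample data and shows what numpy hands back when datetime64 values are turned into Python objects.

```python
from datetime import date
import numpy as np

# Made-up timestamps; two of them fall on the same day.
stamps = np.array(
    [
        "2015-08-04T02:14:05.249392",
        "2015-08-04T09:30:00.000000",
        "2015-08-05T02:14:05.325113",
    ],
    dtype="datetime64[us]",
)

# Truncate to whole days, then drop duplicates (np.unique also sorts).
days = np.unique(stamps.astype("datetime64[D]"))

# tolist() converts day-resolution values to datetime.date objects.
as_dates = days.tolist()
print(as_dates)
```

Because np.unique collapses repeated days, the y values must have one entry per unique day, which is what the hard-coded four-element y_list in the answer assumes.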
I am trying to plot subplots in matplotlib with pandas data, but I am facing an issue: the subplots do not show the dates of the stock data. Here is my program:
import pandas as pd
import datetime
import matplotlib.pyplot as plt
import pandas.io.data
df = pd.io.data.get_data_yahoo('goog', start=datetime.datetime(2008,1,1),end=datetime.datetime(2014,10,23))
fig = plt.figure()
r = fig.patch
r.set_facecolor('#0070BB')
ax1 = fig.add_subplot(2,1,1, axisbg='#0070BB')
ax1.grid(True)
ax1.plot(df['Close'])
ax2 = fig.add_subplot(2,1,2, axisbg='#0070BB')
ax2.plot(df['Volume'])
plt.show()
Please run this program yourself to reproduce the date issue.
When you're calling matplotlib's plot(), you are only giving it one array (e.g. df['Close'] in the first case). When there's only one array, matplotlib doesn't know what to use for the x axis data, so it just uses the index of the array. This is why your x axis shows the numbers 0 to 160: there are presumably 160 items in your array.
Use ax1.plot(df.index, df['Close']) instead, since df.index should hold the date values in your pandas dataframe.
import pandas as pd
import datetime
import matplotlib.pyplot as plt
from pandas_datareader import data as web  # pandas.io.data moved to the separate pandas-datareader package
df = web.get_data_yahoo('goog', start=datetime.datetime(2008,1,1),end=datetime.datetime(2014,10,23))
fig = plt.figure()
r = fig.patch
r.set_facecolor('#0070BB')
ax1 = fig.add_subplot(2,1,1, facecolor='#0070BB')  # axisbg was renamed to facecolor in newer matplotlib
ax1.grid(True)
ax1.plot(df.index, df['Close'])
ax2 = fig.add_subplot(2,1,2, facecolor='#0070BB')
ax2.plot(df.index, df['Volume'])
plt.show()
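The difference between passing one sequence and passing two can be seen without downloading any stock data. The sketch below uses plain lists with made-up prices and dates; the Agg backend is used so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
from datetime import date

# Made-up closing prices and their dates.
closes = [10.0, 10.5, 10.2]
days = [date(2014, 10, 21), date(2014, 10, 22), date(2014, 10, 23)]

fig, (ax1, ax2) = plt.subplots(2, 1)

# Only y given: matplotlib invents x as 0, 1, 2, ...
(line_default,) = ax1.plot(closes)

# x given explicitly: the axis is labelled with the dates.
(line_dated,) = ax2.plot(days, closes)

print(list(line_default.get_xdata()))
```

The first subplot gets positions 0, 1, 2 on its x axis, while the second one is drawn against the supplied dates.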