Python scatterplot vs plot - python

I want to use Python's plt.scatter or ax.scatter to show car finishing times as a scatter plot. My x axis contains an array:
'car001','car002','car003', ...
The y axis should contain the finish times in datetime format, like:
'2019-01-01 23:32:01','2019-01-01 23:32:01','2019-01-01 23:32:01', ...
Why is it so difficult to use datetime values from a pandas DataFrame with a scatter plot?
I don't want to use plt.plot() with the 'o' marker.
Thank you very much!

Did you try something like this?
import pandas as pd
import matplotlib.pyplot as plt
dates = ['2017-01-01 23:32:01','2018-01-01 23:32:01','2019-01-01 23:32:01']
PM_25 = ['car001','car002','car003']
dates = [pd.to_datetime(d) for d in dates]
plt.scatter(dates, PM_25)
plt.show()
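Note that the snippet above puts the dates on the x axis. For the layout the question describes (car IDs on x, finish times on y), a minimal sketch might look like the following. The three finish times are made-up sample values, and the "%H:%M:%S" tick format is just one reasonable choice, not something the question specifies:

```python
from datetime import datetime

import matplotlib.pyplot as plt
import matplotlib.dates as mdates

cars = ["car001", "car002", "car003"]
# Made-up sample finish times, for illustration only
raw_times = ["2019-01-01 23:32:01", "2019-01-01 23:35:47", "2019-01-01 23:41:12"]
# Parse the strings so matplotlib treats the y values as dates, not categories
finish = [datetime.strptime(t, "%Y-%m-%d %H:%M:%S") for t in raw_times]

fig, ax = plt.subplots()
ax.scatter(cars, finish)  # string x values are plotted as categories
ax.yaxis.set_major_formatter(mdates.DateFormatter("%H:%M:%S"))
ax.set_xlabel("car")
ax.set_ylabel("finish time")
# plt.show()  # uncomment to display the figure
```

If the data lives in a DataFrame, converting the time column with pd.to_datetime first and passing the two columns to ax.scatter should behave the same way.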

Related

How to graph data using purely lists in python

I have two lists representing dates and values respectively:
dates = ['10/6/2020',
'10/7/2020',
'10/8/2020',
'10/9/2020',
'10/12/2020',
'10/13/2020',
'10/14/2020',
'10/15/2020',
'10/16/2020',
'10/19/2020']
and
values = ['40.660',
'39.650',
'41.010',
'41.380',
'39.950',
'40.790',
'41.050',
'40.370',
'40.880',
'40.860']
I want to use seaborn/matplotlib to plot them without using pandas. Is that possible? I've made a few attempts but it doesn't seem to be going too well.
Here's what I've got so far:
def plots(values=values,dates=dates):
    sns.lineplot(x=dates,y=sorted(values)[::-1])
    sns.scatterplot(x=dates,y=sorted(values)[::-1])
    plt.show()
    return
crude_data = plots()
But it gives me this:
This is obviously wrong but I don't know how to fix it. The x-axis is also messy, and I'd like to fix that as well and make it more legible without expanding the width of the graph if possible. If that's impossible, then a way to do so while expanding the graph would be happily accepted as well.
Cheers!
Just parse dates from string to datetime.
Otherwise they are treated as strings and lose all their properties besides sortability (which can be wrong, depending on date format).
Also add xticks rotation for better x axis labels.
edit:
Also, as others noticed, your numeric data is stored as strings as well.
import seaborn as sns
import matplotlib.pyplot as plt
import datetime
dates = ['10/6/2020',
'10/7/2020',
'10/8/2020',
'10/9/2020',
'10/12/2020',
'10/13/2020',
'10/14/2020',
'10/15/2020',
'10/16/2020',
'10/19/2020']
values = ['40.660',
'39.650',
'41.010',
'41.380',
'39.950',
'40.790',
'41.050',
'40.370',
'40.880',
'40.860']
dates = [datetime.datetime.strptime(date, '%m/%d/%Y') for date in dates]
values = [float(val) for val in values]
def plots(values=values,dates=dates):
    sns.lineplot(x=dates,y=sorted(values)[::-1])
    sns.scatterplot(x=dates,y=sorted(values)[::-1])
    plt.xticks(rotation=45)
    plt.show()
    return
crude_data = plots()
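To see why leaving dates and numbers as strings is a problem, here is a small standard-library sketch. The sample values are made up (they are not the ones from the question) and were chosen so that the string order and the real order differ:

```python
from datetime import datetime

# Made-up samples, chosen so that string order differs from real order
dates = ["10/6/2020", "10/12/2020", "10/19/2020"]
values = ["9.5", "40.66", "100.0"]

# As strings, both lists are compared character by character
print(sorted(dates))   # '10/12/2020' sorts before '10/6/2020' because '1' < '6'
print(sorted(values))  # '100.0' sorts before '40.66' because '1' < '4'

# After parsing, sorting follows calendar and numeric order
parsed_dates = [datetime.strptime(d, "%m/%d/%Y") for d in dates]
parsed_values = [float(v) for v in values]
print(sorted(parsed_dates))
print(sorted(parsed_values))
```

Plotting libraries treat unparsed strings as categories in the same way, which is why the axis spacing and ordering come out wrong.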
You can use the datetime library.
import matplotlib.pyplot as plt
from datetime import datetime
import seaborn as sns
dates = ['10/6/2020',
'10/7/2020',
'10/8/2020',
'10/9/2020',
'10/12/2020',
'10/13/2020',
'10/14/2020',
'10/15/2020',
'10/16/2020',
'10/19/2020']
x = [datetime.strptime(i, '%m/%d/%Y') for i in dates]
values = ['40.660',
'39.650',
'41.010',
'41.380',
'39.950',
'40.790',
'41.050',
'40.370',
'40.880',
'40.860']
nvalues = [float(i) for i in values]
def plots(values=nvalues,dates=x):
    sns.lineplot(x=dates,y=sorted(values)[::-1])
    sns.scatterplot(x=dates,y=sorted(values)[::-1])
    plt.xticks(rotation=45, ha='right')
    plt.show()
    return
crude_data = plots()
Output:
I made 3 changes to your code:
I set the y axis data to be float instead of string by removing the '',
I removed sorted from the two plot lines, and
I added sort=False to the lineplot call
import matplotlib.pyplot as plt
import seaborn as sns
dates = ['10/6/2020',
'10/7/2020',
'10/8/2020',
'10/9/2020',
'10/12/2020',
'10/13/2020',
'10/14/2020',
'10/15/2020',
'10/16/2020',
'10/19/2020']
values = [40.660,
39.650,
41.010,
41.380,
39.950,
40.790,
41.050,
40.370,
40.880,
40.860]
def plots(values=values,dates=dates):
    sns.lineplot(x=dates,y=values[::-1], sort=False)
    s = sns.scatterplot(x=dates,y=values[::-1])
    plt.xticks(rotation=45, ha='right')
    plt.show()
    return s
crude_data = plots()
This gives:

Represent a continuous graph using `matplotlib` and `pandas`

My dataframe looks like this:
Energy_MWh Month
0 39686.82 1979-01
1 35388.78 1979-02
2 50134.02 1979-03
3 37499.22 1979-04
4 20104.08 1979-05
5 17440.26 1979-06
It goes on like this to the month 2015-12. So you can imagine all the data.
I want to plot a continuous graph with the months on the x-axis and Energy_MWh on the y-axis. What is the best way to represent this using matplotlib?
I would also like to know whether there is a way to print 1979-01 as Jan-1979 on the x-axis, and so on, perhaps with a lambda function or something similar while plotting.
Borrowed liberally from this answer, which you should go out and upvote:
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.dates import DateFormatter
df = <set_your_data_frame_here>
myDates = pd.to_datetime(df['Month'])
myValues = df['Energy_MWh']
fig, ax = plt.subplots()
ax.plot(myDates,myValues)
myFmt = DateFormatter("%b-%Y")
ax.xaxis.set_major_formatter(myFmt)
## Rotate date labels automatically
fig.autofmt_xdate()
plt.show()
Set Month as the index:
df.set_index('Month', inplace=True)
Convert the index to Datetime:
df.index = pd.DatetimeIndex(df.index)
Plot:
df.plot()
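For a self-contained version of the DateFormatter approach, here is a sketch built from the six rows shown in the question. Passing format="%Y-%m" to pd.to_datetime is an assumption about how the Month column is stored (as 'YYYY-MM' strings); adjust it if your column differs:

```python
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.dates import DateFormatter

# The six rows shown in the question
df = pd.DataFrame({
    "Energy_MWh": [39686.82, 35388.78, 50134.02, 37499.22, 20104.08, 17440.26],
    "Month": ["1979-01", "1979-02", "1979-03", "1979-04", "1979-05", "1979-06"],
})
# Parse 'YYYY-MM' strings into timestamps (first day of each month)
df["Month"] = pd.to_datetime(df["Month"], format="%Y-%m")

fig, ax = plt.subplots()
ax.plot(df["Month"], df["Energy_MWh"])
ax.xaxis.set_major_formatter(DateFormatter("%b-%Y"))  # month abbreviation, then year
fig.autofmt_xdate()  # rotate the date labels
# plt.show()  # uncomment to display the figure

# The same format string also works on the column itself
month_labels = df["Month"].dt.strftime("%b-%Y").tolist()
```

No lambda is needed for the relabelling: the "%b-%Y" format string does the work, both for the axis formatter and for strftime on the column.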

Dates in X-axis using pandas and matplotlib

I am trying to plot some data from pandas. First I group by week and count the rows in each week, then I want to plot a point for each date. However, when I plot, only some of the dates appear on the x-axis, not all of them.
I am using the following code:
my_data = res1.groupby(pd.Grouper(key='d', freq='W-MON')).agg('count').u
p1, = plt.plot(my_data, '.-')
a = plt.xticks(rotation=45)
My result is the following:
I wanted a value in the x-axis for each date in the grouped dataframe.
EDIT: I tried to use plt.xticks(list(my_data.index.astype(str)), rotation=45)
The plot I get is the following:
Please find a working chunk of code below:
from datetime import date
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import numpy as np
import pandas as pd
a = pd.Series(np.random.randint(10, 99, 10))
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%m/%d/%Y'))
plt.gca().xaxis.set_major_locator(mdates.DayLocator())
plt.plot(pd.date_range(date(2016,1,1), periods=10, freq='D'), a)
plt.gcf().autofmt_xdate()
Hope it helps :)
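The snippet above uses daily data with a DayLocator. For weekly groups like the ones in the question, one option is to place a tick at every value of the grouped index. The sketch below is only an illustration: res1 is replaced by a hypothetical random DataFrame, and the column name 'u' is taken from the truncated line in the question without knowing what followed it:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# Hypothetical stand-in for res1: 200 events with a datetime column 'd'
rng = np.random.default_rng(0)
days = rng.integers(0, 70, size=200)
res1 = pd.DataFrame({"d": pd.Timestamp("2016-01-01") + pd.to_timedelta(days, unit="D")})
res1["u"] = 1

# Count events per week; W-MON labels each week by the Monday that closes it
my_data = res1.groupby(pd.Grouper(key="d", freq="W-MON"))["u"].count()

fig, ax = plt.subplots()
ax.plot(my_data.index, my_data.values, ".-")
# One tick per group instead of letting the default locator pick a subset
ax.set_xticks(mdates.date2num(my_data.index.to_pydatetime()))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m-%d"))
fig.autofmt_xdate(rotation=45)
# plt.show()  # uncomment to display the figure
```

With many weeks the labels will eventually overlap, so for long ranges a locator that skips groups may be the more readable choice.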

How to mark the beginning of a new year while plotting pandas Series?

I am plotting such data:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import pandas as pd
a = pd.date_range(start='2010-01-01', end='2011-06-01', freq='M')
b = pd.Series(np.random.randn(len(a)), index=a)
I would like the plot to be in the format of bars, so I use this:
b.plot(kind='bar')
This is what I get:
As you can see, the dates are formatted in full, which is very ugly and unreadable. I happened to test this command which creates a very nice Date format:
b.plot()
As you can see:
I like this format very much, it includes the months, marks the beginning of the year and is easily readable.
After some searching, the closest I could get to that format is this:
fig, ax = plt.subplots()
ax.plot(b.index, b)
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b'))
However the output looks like this:
I am able to have month names on x axis this way, but I like the first formatting much more. That is much more elegant. Does anyone know how I can get the same exact xticks for my bar plot?
Here's a solution that will get you the format you're looking for. You can build the tick labels yourself and pass them to the set_major_formatter() method:
import matplotlib.ticker as mticker
fig, ax = plt.subplots()
ax.bar(b.index, b)
# Month abbreviation for every bar; add the year on every 12th label
ticklabels = [item.strftime('%b') for item in b.index]
ticklabels[::12] = [item.strftime('%b\n%Y') for item in b.index[::12]]
ax.set_xticks(b.index)
ax.xaxis.set_major_formatter(mticker.FixedFormatter(ticklabels))
plt.gcf().autofmt_xdate()
Output:
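The ticklabels[::12] slice assumes the series starts in January. If it might start in another month, a variation is to add the year on the first label and on every January. The helper below is a hypothetical sketch (month_labels is not part of matplotlib or pandas), and it uses plain date objects as a stand-in for b.index:

```python
from datetime import date

# One date per month from Jan 2010 to May 2011, standing in for b.index
dates = [date(2010, m, 28) for m in range(1, 13)] + [date(2011, m, 28) for m in range(1, 6)]

def month_labels(dates):
    """Return '%b' labels, adding the year on the first label and on every January."""
    labels = []
    for i, d in enumerate(dates):
        if i == 0 or d.month == 1:
            labels.append(d.strftime("%b\n%Y"))
        else:
            labels.append(d.strftime("%b"))
    return labels

labels = month_labels(dates)
print(labels)
```

The resulting list can be passed to FixedFormatter in place of ticklabels.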

Create a weekly timetable using matplotlib

Edit: I changed the data type to a pandas DataFrame that looks like this (datetime.datetime,int) in order to make the problem simpler.
Original Post:
I have a numpy array of data reports that looks like this (datetime.datetime,int,int) and I can't seem to plot it right. I need the X axis to span 24 hours, using this array:
np.array([datetime.datetime.time(x) for x in DataArr])
The Y axis should show the days (Monday, Tuesday and so on) taken from the datetime, and the int should give me different colors for different events, but I can't find an example on matplotlib's web site.
An example of what I'm looking for:
It sounds like you want something like this?
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
# I'm using pandas here just to easily create a series of dates.
time = pd.date_range('01/01/2013', '05/20/2013', freq='2H')
z = np.random.random(time.size)
# There are other ways to do this, but we'll exploit how matplotlib internally
# handles dates. They're floats where a difference of 1.0 corresponds to 1 day.
# Therefore, modulo 1 results in the time of day. The +1000 yields a valid date.
t = mdates.date2num(time) % 1 + 1000
# Pandas makes getting the day of the week trivial...
day = time.dayofweek
fig, ax = plt.subplots()
scat = ax.scatter(t, day, c=z, s=100, edgecolor='none')
ax.xaxis_date()
ax.set(yticks=range(7),
       yticklabels=['Mon', 'Tues', 'Wed', 'Thurs', 'Fri', 'Sat', 'Sun'])
# Optional formatting tweaks
# '%I' is the portable 12-hour code; the original '%l' is a platform-specific extension
ax.xaxis.set_major_formatter(mdates.DateFormatter('%I%p'))
ax.margins(0.05)
fig.colorbar(scat)
plt.show()
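The modulo trick in the comments above can be checked on its own. The sketch below uses two made-up timestamps and follows the same steps: convert to matplotlib date numbers, keep the fractional day, then shift by 1000 so the result is a valid date again:

```python
from datetime import datetime

import matplotlib.dates as mdates

# Two made-up timestamps: 06:00 on one day, 18:00 on another
stamps = [datetime(2013, 1, 1, 6, 0), datetime(2013, 3, 15, 18, 0)]

nums = mdates.date2num(stamps)   # floats measured in days
time_of_day = nums % 1           # fraction of a day, independent of the date
shifted = time_of_day + 1000     # same time of day on one arbitrary valid date

# Converting back shows that only the time of day survived
hours = [mdates.num2date(v).hour for v in shifted]
weekdays = [s.weekday() for s in stamps]  # 0 = Monday, as with pandas dayofweek
print(time_of_day, hours, weekdays)
```

Because every point ends up on the same arbitrary date, the x axis effectively becomes a 24-hour clock, and the day of the week can go on the y axis.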
