Read time series csv file to plot with matplotlib

I'm trying to plot a time series from a CSV file.
For example, datalog.csv contains:
19:06:17.188,12.2
19:06:22.360,3.72
19:06:27.348,72
19:06:32.482,72
19:06:37.515,74
19:06:47.660,72
I tried something like the following:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
time, impressions = np.loadtxt("datalog_new.csv", unpack=True,
converters={ 0: mdates.strptime2num('%H:%M:%S.%f')})
plt.plot_date(x=time, y=impressions)
plt.show()
but the time could not be parsed with mdates.strptime2num('%H:%M:%S.%f').
Any suggestions are greatly appreciated.

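You have to use the bytespdate2num function as the converter, because np.loadtxt reads the file in binary mode and so the converter receives bytes rather than str. You also need delimiter=',' because the columns are comma-separated. Note that bytespdate2num exists only in older Matplotlib releases; it was deprecated and later removed.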
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.dates import bytespdate2num
time, impressions = np.loadtxt("datalog_new.csv",
unpack=True, delimiter=',', converters={0: bytespdate2num('%H:%M:%S.%f')})
plt.plot_date(x=time, y=impressions)
plt.show()
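Since bytespdate2num is unavailable in recent Matplotlib releases, here is a minimal sketch of an alternative that parses the timestamps with the standard library instead. The helper names read_series and plot_series are hypothetical, and the sample rows are copied from the question. Matplotlib is imported only inside plot_series, so the parsing part can be tried on its own.

```python
import csv
import io
from datetime import datetime

# Sample rows copied from the question
SAMPLE = """19:06:17.188,12.2
19:06:22.360,3.72
19:06:27.348,72
"""


def read_series(handle):
    """Return (times, values) parsed from rows of 'HH:MM:SS.fff,value'."""
    times, values = [], []
    for row in csv.reader(handle):
        if not row:
            continue  # skip blank lines
        # Only a time of day is given, so strptime fills in 1900-01-01 as the date
        times.append(datetime.strptime(row[0], "%H:%M:%S.%f"))
        values.append(float(row[1]))
    return times, values


def plot_series(times, values):
    """Plot the parsed series; Matplotlib accepts datetime objects directly."""
    import matplotlib.pyplot as plt
    import matplotlib.dates as mdates

    fig, ax = plt.subplots()
    ax.plot(times, values, marker="o")
    # Show only the time of day on the x-axis
    ax.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M:%S"))
    fig.autofmt_xdate()
    plt.show()


times, values = read_series(io.StringIO(SAMPLE))
```

With a real file, open it with open("datalog.csv", newline="") in a with block, pass the handle to read_series, and hand the result to plot_series.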

Related

Why does Pandas Plot looks different when using csv or xlsx data?

I have two datasets with exactly the same data, but they look different when plotted the same way. One is a .xlsx file and the other is a .csv file.
Here is the code for both:
For the CSV:
import numpy as np
import pandas as pd
import statsmodels.api as sm
import matplotlib.pyplot as plt
import seaborn as sns
sns.set()
from sklearn.cluster import KMeans
daten = pd.read_csv(r"Path\Übungsdaten.csv", header=0, sep=";")
print("Total rows: {0}".format(len(daten)))
print(daten.columns)
plt.scatter(daten['InsuredValue'], daten['Policy'])
plt.xlim(2500000)
plt.ylim(100100)
plt.show()
And for the xlsx:
import numpy as np
import pandas as pd
import statsmodels.api as sm
import matplotlib.pyplot as plt
import seaborn as sns
sns.set()
from sklearn.cluster import KMeans
daten = pd.read_excel(r"Path\Übungsdaten.xlsx")
print("Total rows: {0}".format(len(daten)))
plt.scatter(daten['InsuredValue'],daten['Policy'] )
plt.xlim(2500000)
plt.ylim(100100)
plt.show()
Here are the plots (images not included): the CSV plot with plt.xlim(2500000) and plt.ylim(100100), the CSV plot without axis limits, and the .xlsx plot.
My first question is: why is there a black bar at the bottom of the first two plots? (I'm guessing it is every single value of "InsuredValue".) And how can I give the CSV plot the same proportions as the xlsx plot?
Thank you very much

I had to convert the "InsuredValue" column to int. astype returns a new DataFrame instead of changing the existing one, so the result has to be assigned back:
daten = daten.astype({'InsuredValue': 'int'})
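A likely explanation, assuming the CSV column was read as text while the Excel column was read as numbers: Matplotlib treats strings as categories and spaces them evenly, so the axis looks different from a numeric one. The sketch below uses made-up values, not the asker's data, and shows that the conversion only takes effect on the DataFrame that astype returns.

```python
import pandas as pd

# Made-up rows standing in for the CSV, where 'InsuredValue' arrived as text
daten = pd.DataFrame({
    "InsuredValue": ["2500000", "300000", "1200000"],
    "Policy": [100100, 100101, 100102],
})

# astype does not modify 'daten' in place; keep the returned copy
converted = daten.astype({"InsuredValue": "int64"})

# The original column is still text, the converted one is numeric
print(pd.api.types.is_numeric_dtype(daten["InsuredValue"]))
print(pd.api.types.is_numeric_dtype(converted["InsuredValue"]))
```

Checking daten.dtypes right after reading each file is a quick way to confirm whether the two sources really differ in column type.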

Python netcdf cartopy - Plotting a selection of data

I have a netcdf file ('test.nc'). The variables of the netcdf file are the following:
variables(dimensions): float64 lon(lon), float64 lat(lat), int32 crs(), int16 Band1(lat,lon)
I am interested in the 'Band1' variable.
Using cartopy, I could plot the data using the following code:
import numpy as np
import pandas as pd
import gzip
from netCDF4 import Dataset,num2date
import time
import matplotlib.pyplot as plt
import os
import matplotlib as mplt
#mplt.use('Agg')
import cartopy.crs as ccrs
import cartopy.feature as cfea
import matplotlib.pyplot as plt
from mpl_toolkits.axes_grid1 import make_axes_locatable
projection=ccrs.PlateCarree()
bbox=[-180,180,-60,85];creg='glob'
mplt.rc('xtick', labelsize=9)
mplt.rc('ytick', labelsize=9)
nc = Dataset('test.nc','r')
lat = nc.variables['lat'][:]
lon = nc.variables['lon'][:]
kopi= (nc.variables['Band1'][:,:])
nc.close()
fig=plt.figure(figsize=(11,5))
ax=fig.add_subplot(1,1,1,projection=projection)
ax.set_extent(bbox,projection)
ax.add_feature(cfea.COASTLINE,lw=.5)
ax.add_feature(cfea.RIVERS,lw=.5)
ax.add_feature(cfea.BORDERS, linewidth=0.6, edgecolor='dimgray')
ax.background_patch.set_facecolor('.9')
levels=[1,4,8,11,14,17,21,25,29]
cmap=plt.cm.BrBG
norm=mplt.colors.BoundaryNorm(levels,cmap.N)
ddlalo=.25
pc=ax.contourf(lon,lat,kopi,levels=levels,transform=projection,cmap=cmap,norm=norm,extend='both')
divider = make_axes_locatable(ax)
ax_cb = divider.new_horizontal(size="3%", pad=0.1, axes_class=plt.Axes)
fig.add_axes(ax_cb)
fig.colorbar(pc,extend='both', cax=ax_cb)
ttitle='Jony'
ax.set_title(ttitle,loc='left',fontsize=9)
plt.show()
However, I would like to plot only a selection of the values in the 'Band1' variable. I thought I could use the following code:
kopi= (nc.variables['Band1'][:,:])<=3
However, it does not work: instead of plotting only the area that matches the selection, it fills the whole map.
How can I select and plot a desired range of values from the 'Band1' variable?

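Mask the unwanted values with np.nan. Band1 is stored as int16, which cannot hold NaN, so convert the array to float first:
kopi = kopi.astype(float)
kopi[kopi <=3] = np.nan
This should leave those pixels blank (white) in your plot.
Please provide test data in the future.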
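As a small illustration with made-up numbers (not the contents of test.nc), the sketch below contrasts a bare comparison, which yields booleans, with a mask, which keeps the original values and hides the rest. It keeps values of 3 or less; flip the condition to hide them instead. np.ma.masked_where is used because it also works on integer arrays, and Matplotlib's contourf accepts masked arrays and leaves masked regions unfilled.

```python
import numpy as np

# Made-up int16 grid standing in for Band1
band = np.array([[1, 4, 8],
                 [2, 11, 3],
                 [29, 0, 17]], dtype=np.int16)

# A comparison alone gives True/False, not the data values
selection = band <= 3

# Hide everything outside the selection but keep the original values
only_small = np.ma.masked_where(band > 3, band)

# Alternative: a float copy with NaN where values fall outside the selection
as_float = np.where(band <= 3, band.astype(float), np.nan)
```

Either only_small or as_float can be passed to contourf in place of the boolean array.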

Over lapping of timeseries on x-axis

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
load_file=pd.read_excel(r'E:\CCNC\CCNCCodes\Modulated output\plot_oriented_ss_data.xlsx',header=0)
load_file.columns
s=load_file.loc[0:49,['Timeseries','ccn_0.1']]
s
s1=s
s['Timeseries'] = s['Timeseries'].astype(str)
plt.plot(s1['Timeseries'],s1['ccn_0.1'],color='b')
plt.grid()
plt.show()
Please tell me exactly where I need to make a change to stop the time series labels from overlapping on the x-axis.

Instead of converting your 'Timeseries' to str, you should convert them to datetime using:
s['Timeseries'] = pd.to_datetime(s['Timeseries'])
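A minimal sketch of why this helps, using made-up timestamps rather than the spreadsheet from the question: strings are plotted as categories, so every row gets its own tick label, while datetime values let Matplotlib choose a handful of evenly spaced ticks. Only the conversion step is shown here.

```python
import pandas as pd

# Made-up stand-in for the 'Timeseries' and 'ccn_0.1' columns
s = pd.DataFrame({
    "Timeseries": ["2019-01-01 00:00:00",
                   "2019-01-01 00:00:01",
                   "2019-01-01 00:00:02"],
    "ccn_0.1": [10.0, 12.5, 11.0],
})

# Convert the text column to real timestamps
s["Timeseries"] = pd.to_datetime(s["Timeseries"])

# The column now holds datetime64 values instead of text
is_datetime = pd.api.types.is_datetime64_any_dtype(s["Timeseries"])
```

Passing s['Timeseries'] and s['ccn_0.1'] to plt.plot afterwards gives a proper date axis.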

How to display dates in matplotlib x-axis instead of sequence numbers

I am trying to build a candlestick chart with matplotlib, but for some reason the dates do not show up on the x-axis. After searching on Stack Overflow, I understood that the dates need to be converted to float numbers, so I converted them as well, but it still does not work. I am new to Python and matplotlib. Any help would be greatly appreciated.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.finance import candlestick2_ohlc
import matplotlib.dates as dts
import matplotlib.ticker as mTicker
from datetime import datetime
my_file=pd.read_csv("C:\\path\\to\\file\\file.csv",sep=",",names=['Date','Open','High','Low','Close','AdjClose','Volume'],skiprows=1)
dateseries=[]
for i in my_file['Date']:
    dateseries.append(dts.date2num(datetime.strptime(i,'%Y-%m-%d')))
print(dateseries)
fig,ax1=plt.subplots()
candlestick2_ohlc(ax1,my_file['Open'], my_file['High'],my_file['Low'], my_file['Close'], width=0.7,colorup='#008000', colordown='#FF0000')
plt.show()
Sample data:
Date,Open,High,Low,Close,Volume1,Volume2
2017-05-08,149.029999,153.699997,149.029999,153.009995,153.009995,48752400
2017-05-09,153.869995,154.880005,153.449997,153.990005,153.990005,39130400
2017-05-10,153.630005,153.940002,152.110001,153.259995,153.259995,25805700

In general, you are right that the dates need to be converted to float numbers. To display them as dates on the x-axis, you then need to convert them back with a date formatter. If you don't mind using candlestick_ohlc instead of candlestick2_ohlc, setting up the x-axis is easier in your case:
import io
import matplotlib.pyplot as plt
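# matplotlib.finance was removed from later Matplotlib releases;
# the separate mplfinance package provides the same function
from mplfinance.original_flavor import candlestick_ohlc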
from matplotlib.dates import date2num, DayLocator, DateFormatter
import pandas as pd
s = """Date,Open,High,Low,Close,Volume1,Volume2
2017-05-08,149.029999,153.699997,149.029999,153.009995,153.009995,48752400
2017-05-09,153.869995,154.880005,153.449997,153.990005,153.990005,39130400
2017-05-10,153.630005,153.940002,152.110001,153.259995,153.259995,25805700"""
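my_file = pd.read_csv(io.StringIO(s), header=0)
# Convert the dates to Matplotlib's float day numbers
my_file['Date'] = date2num(pd.to_datetime(my_file['Date']).tolist())
fig, ax=plt.subplots()
# as_matrix() was removed from pandas; to_numpy() returns the same 2-D array
candlestick_ohlc(ax, my_file.to_numpy())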
ax.xaxis.set_major_locator(DayLocator())
ax.xaxis.set_major_formatter(DateFormatter('%Y-%m-%d'))
plt.show()

Plotting timestampt data from CSV using matplotlib

I am trying to plot data from a csv file using matplotlib. There is 1 column against a timestamp:
26-08-2016 00:01 0.062964691
26-08-2016 00:11 0.047209214
26-08-2016 00:21 0.047237823
So far I have only been able to create a simple plot of integers with the code below, which doesn't work when one of the columns is a timestamp. What do I need to add?
This may seem simple, but I am pressed for time.
Thanks!
from matplotlib import pyplot as plt
from matplotlib import style
import numpy as np
import datetime as dt
x,y = np.loadtxt('I112-1.csv',
unpack=True,
delimiter = ',')
plt.plot(x,y)
plt.title('Title')
plt.ylabel('Y axis')
plt.xlabel('X axis')
plt.show()

Here is an example for this problem (in a Jupyter notebook you can add %matplotlib inline; in a script, plt.show() displays the figure):
import pandas as pd
from io import StringIO
import matplotlib.pyplot as plt
data_file = StringIO("""time,value
26-08-2016 00:01,0.062964691
26-08-2016 00:11,0.047209214
26-08-2016 00:21,0.047237823""")
df = pd.read_csv(data_file)
# Parse day-first timestamps such as "26-08-2016 00:01"
df['datetime'] = pd.to_datetime(df['time'], format='%d-%m-%Y %H:%M')
# Use the parsed timestamps as the index so they become the x-axis
ax = df.set_index('datetime')['value'].plot(title="Title")
plt.show()
