Generating an animation from a list of pandas DataFrames - python

I have a list of pandas DataFrames, each representing one frame of a process. The DataFrames all have the same size and structure (same columns and indices). Is there a way to animate them and save the animation as an MPEG movie?
I tried the following, without any luck:
#X: a list of 1000 dataframe, all of same size
fig = plt.figure()
ax = fig.add_subplot(111)
im = ax.imshow(X,interpolation='nearest')
def update_img(n):
    im.set_data(n.values)
    return im
animation = animation.FuncAnimation(fig,update_img, frames = X, interval=30)
The above gets stuck on the first frame.

To save the animation as mp4, you can use
import matplotlib.pyplot as plt
import matplotlib.animation
import numpy as np; np.random.seed(1)
import pandas as pd
X = [pd.DataFrame(np.random.rand(10,10)) for i in range(100)]
fig = plt.figure()
ax = fig.add_subplot(111)
im = ax.imshow(X[0],interpolation='nearest')
def update_img(n):
    im.set_data(n.values)
ani = matplotlib.animation.FuncAnimation(fig,update_img, frames = X, interval=30)
FFwriter = matplotlib.animation.FFMpegWriter(fps=30)
ani.save(__file__+".mp4",writer = FFwriter)
plt.show()
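The answer differs from the question's code in two visible ways: imshow is given a single frame (X[0]) rather than the whole list, and the FuncAnimation object is stored under a name (ani) that does not shadow the animation module. Below is a minimal sketch of the same pattern. The helper names build_animation and pick_writer are hypothetical, the shared vmin/vmax colour range is an addition that is not in the original answer, and the GIF fallback assumes Pillow is installed when ffmpeg is not.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: render off-screen; remove to view interactively
import matplotlib.pyplot as plt
import matplotlib.animation as mpl_animation
import numpy as np
import pandas as pd


def build_animation(frames, interval=30):
    """Animate a list of equally shaped DataFrames as an image sequence."""
    # One colour scale for all frames (an addition, not in the original answer).
    vmin = min(df.to_numpy().min() for df in frames)
    vmax = max(df.to_numpy().max() for df in frames)

    fig, ax = plt.subplots()
    # imshow needs a single 2-D frame, not the whole list.
    im = ax.imshow(frames[0].to_numpy(), interpolation="nearest",
                   vmin=vmin, vmax=vmax)

    def update_img(df):
        im.set_data(df.to_numpy())
        return (im,)

    # Keep a reference to the animation, under a name that does not
    # shadow the matplotlib.animation module.
    ani = mpl_animation.FuncAnimation(fig, update_img, frames=frames,
                                      interval=interval)
    return fig, im, ani, update_img


def pick_writer(fps=30):
    """Return (writer, file suffix) for the first available writer, else (None, None)."""
    if mpl_animation.writers.is_available("ffmpeg"):
        return mpl_animation.FFMpegWriter(fps=fps), ".mp4"
    if mpl_animation.writers.is_available("pillow"):
        return mpl_animation.PillowWriter(fps=fps), ".gif"
    return None, None


rng = np.random.default_rng(1)
X = [pd.DataFrame(rng.random((10, 10))) for _ in range(20)]
fig, im, ani, update_img = build_animation(X)
```

To write the file, call writer, suffix = pick_writer() and, if writer is not None, ani.save("frames" + suffix, writer=writer).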

Related

matplotlib: Add AxesSubplot instances to a figure

I'm going insane here ... this should be a simple exercise but I'm stuck:
I have a Jupyter notebook and am using the ruptures Python package. All I want to do is, take the figure or AxesSubplot(s) that the display() function returns and add it to a figure of my own, so I can share the x-axis, have a single image, etc.:
import pandas as pd
import matplotlib.pyplot as plt
myfigure = plt.figure()
l = len(df.columns)
for index, series in enumerate(df):
    data = series.to_numpy().astype(int)
    algo = rpt.KernelCPD(kernel='rbf', min_size=4).fit(data)
    result = algo.predict(pen=3)
    myfigure.add_subplot(l, 1, index+1)
    rpt.display(data, result)
    plt.title(series.name)
plt.show()
What I get is a figure with the desired number of subplots (all empty) and n separate figures from ruptures, when instead I want the subplots to be filled with those figures ...
I basically had to recreate the plot that ruptures.display(data,result) produces, to get my desired figure:
import pandas as pd
import numpy as np
import ruptures as rpt
import matplotlib.pyplot as plt
from matplotlib.ticker import EngFormatter
fig, axs = plt.subplots(len(df.columns), figsize=(22,20), dpi=300)
for index, series in enumerate(df):
    # .ffill() replaces .pad(), which was removed in pandas 2.0
    resampled = df[series].dropna().resample('6H').mean().ffill()
    data = resampled.to_numpy().astype(int)
    algo = rpt.KernelCPD(kernel='rbf', min_size=4).fit(data)
    result = algo.predict(pen=3)
    # Create ndarray of tuples from the result
    result = np.insert(result, 0, 0) # Insert 0 as first result
    tuples = np.array([ result[i:i+2] for i in range(len(result)-1) ])
    ax = axs[index]
    # Fill area between results alternating blue/red
    for i, tup in enumerate(tuples):
        if i%2==0:
            ax.axvspan(tup[0], tup[1], lw=0, alpha=.25)
        else:
            ax.axvspan(tup[0], tup[1], lw=0, alpha=.25, color='red')
    ax.plot(data)
    ax.set_title(series)
    ax.yaxis.set_major_formatter(EngFormatter())
plt.subplots_adjust(hspace=.3)
plt.show()
I've wasted more time on this than I can justify, but it's pretty now and I can sleep well tonight :D
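The recreated plot boils down to shading alternate segments between consecutive breakpoints on an Axes you own. Here is a small sketch of that step on its own, without ruptures. The helper name shade_segments is hypothetical, the signals are synthetic, and the breakpoints are hard-coded in the shape algo.predict returns (sorted segment end indices, the last one equal to the signal length).

```python
import matplotlib
matplotlib.use("Agg")  # assumption: render off-screen; remove to view interactively
import matplotlib.pyplot as plt
import numpy as np


def shade_segments(ax, signal, breakpoints, colors=("tab:blue", "tab:red")):
    """Plot signal on ax and shade the segments between breakpoints.

    Returns the number of shaded segments.
    """
    edges = [0] + list(breakpoints)
    for i, (start, end) in enumerate(zip(edges[:-1], edges[1:])):
        # Alternate the two colours from one segment to the next.
        ax.axvspan(start, end, lw=0, alpha=0.25, color=colors[i % 2])
    ax.plot(signal)
    return len(edges) - 1


# Synthetic stand-ins for the DataFrame columns and the ruptures results.
rng = np.random.default_rng(0)
signals = {
    "a": np.concatenate([rng.normal(0, 1, 50), rng.normal(5, 1, 50)]),
    "b": np.concatenate([rng.normal(2, 1, 30), rng.normal(-3, 1, 70)]),
}
breakpoints = {"a": [50, 100], "b": [30, 100]}

# One figure, one subplot per signal, shared x axis.
fig, axs = plt.subplots(len(signals), sharex=True)
counts = []
for ax, (name, signal) in zip(axs, signals.items()):
    counts.append(shade_segments(ax, signal, breakpoints[name]))
    ax.set_title(name)
```

Because the helper takes the Axes as an argument, it can fill subplots of any figure, which is what the question was after.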

Reading specific rows and plotting using Matplotlib

I have an Excel sheet that has a column of image frame numbers. The frame numbers are not uniformly distributed, e.g. frame 1 may have entries from row 1 to 20, frame 2 from 21 to 25, and so on. The sheet has x and y coordinates for each frame, and I want to read them and draw them as a scatter plot using matplotlib. Here's my code; frame numbers are identified as image index.
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline
from matplotlib.pyplot import figure
df_xlsx = pd.read_excel('X10.xlsx')
temp = df_xlsx['Image index'][0]
i = 0; #number of the row
xList = []
yList = []
dt = df_xlsx.loc[df_xlsx['Image index'] == 19]
xList = np.array(dt['X position'])
yList = np.array(dt['Y position'])
rList = np.array(dt['Diameter'])
figure(figsize=(10.24,7.68), dpi=100)
fig, ax = plt.subplots()
plt.xlim([0,1024])
plt.ylim([0,768])
plt.scatter(xList, yList, color ='r')
plt.axis('off')
plt.gcf().set_size_inches((10.24,7.68))
for i in range(len(xList)):
    circle1 = plt.Circle((xList[i], yList[i]), rList[i], color='r')
    ax.add_artist(circle1)
plt.tight_layout(pad=0)
plt.savefig('f=19.png',dpi=100)
plt.show()
The problem is every time I need to enter the image index and then save the plot. Can this be done in a loop such that the plot is continuously generated as different plots for each frame number (index frames)? This will save me a lot of time, as I have lots of frames and excel sheets. I am new to Python.
You can use groupby to step over the image index:
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_excel('X10.xlsx')
for idx, group in df.groupby('Image index'):
    fig, ax = plt.subplots(figsize=(10.24, 7.68), dpi=100)
    diameter = 1 * group['Diameter']**2
    ax.scatter(group['X position'], group['Y position'], s=diameter)
    ax.set_xlim(0, 1024)
    ax.set_ylim(0, 768)
    ax.axis('off')
    plt.tight_layout(pad=0)
    plt.savefig(f'Image_index_{idx}.png', dpi=100)
    plt.close(fig)  # free the figure so many frames do not pile up in memory
Some notes on the other changes I made:
You don't need to cast the DataFrame columns to arrays or lists.
You can pass a size parameter s to ax.scatter() to draw the circles; you'll just need to scale the numbers to fit your plot, e.g. by multiplying by some factor other than 1. Note that s specifies the area of the marker (in points squared), not the diameter, which is why the code squares the Diameter column.
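Since s is a marker area in points squared, scatter markers do not scale with the data coordinates. If the circles need a size in data units (pixels of the 1024 x 768 frame), Circle patches are one alternative. This is a sketch on a made-up DataFrame with the same column names; the helper name draw_frame is hypothetical, and treating Diameter as a diameter in data units is an assumption.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: render off-screen; remove to view interactively
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.patches import Circle

# Made-up rows standing in for the Excel sheet.
df = pd.DataFrame({
    "Image index": [1, 1, 1, 2, 2],
    "X position": [100, 400, 900, 250, 700],
    "Y position": [100, 300, 600, 500, 200],
    "Diameter": [40, 80, 20, 60, 100],
})


def draw_frame(group):
    """Draw one circle per row of group, sized in data units."""
    fig, ax = plt.subplots(figsize=(10.24, 7.68), dpi=100)
    for x, y, d in zip(group["X position"], group["Y position"], group["Diameter"]):
        ax.add_patch(Circle((x, y), radius=d / 2, color="r"))
    ax.set_xlim(0, 1024)
    ax.set_ylim(0, 768)
    ax.set_aspect("equal")  # keep circles round
    ax.axis("off")
    return fig, ax


patch_counts = {}
for idx, group in df.groupby("Image index"):
    fig, ax = draw_frame(group)
    patch_counts[int(idx)] = len(ax.patches)
    # fig.savefig(f"Image_index_{idx}.png", dpi=100)  # uncomment to write the files
    plt.close(fig)
```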

How can I plot the animation from the csv data with date time information?

First, I would like to share the data in the CSV file.
date, total_cases, total_deaths
12-5-2020,6,2
13-5-2020,7,3
14-5-2020,10,2
15-5-2020,18,5
Now I want to make an animated comparison graph with the dates on the x axis and total_cases and total_deaths on the y axis.
from matplotlib import dates as mdate
from matplotlib import pyplot as plt
import matplotlib.animation as animation
import pandas as pd
m=pd.read_csv("covid-data.csv")
m['date']=pd.to_datetime(m['date'])
m.sort_values('date',inplace=True)
cdate=m['date']
ccase=m['total_cases']
cdeath=m['total_deaths']
fig = plt.figure()
ax1 = fig.add_subplot(111)
def animate(i):
    ax1.clear()
    ax1.plot(cdate,ccase)
    ax1.plot(cdate,cdeath)
ani = animation.FuncAnimation(fig, animate, interval=1000)
plt.show()
Now I can't get the desired output or animation. How can I overcome this issue?
Sorry for my English.
Check this code:
from matplotlib import dates as mdate
from matplotlib import pyplot as plt
import matplotlib.animation as animation
import pandas as pd
m = pd.read_csv("covid-data.csv")
m['date'] = pd.to_datetime(m['date'], format = '%d-%m-%Y')
m.sort_values('date', inplace = True)
cdate = m['date']
ccase = m['total_cases']
cdeath = m['total_deaths']
fig = plt.figure()
ax1 = fig.add_subplot(111)
def animate(i):
    ax1.clear()
    ax1.plot(cdate[:i], ccase[:i], label = 'cases')
    ax1.plot(cdate[:i], cdeath[:i], label = 'deaths')
    ax1.legend(loc = 'upper left')
    ax1.set_xlim([cdate.iloc[0],
                  cdate.iloc[-1]])
    ax1.set_ylim([min(ccase.iloc[0], cdeath.iloc[0]),
                  max(ccase.iloc[-1], cdeath.iloc[-1])])
    ax1.xaxis.set_major_locator(mdate.DayLocator(interval = 5))
    ax1.xaxis.set_major_formatter(mdate.DateFormatter('%d-%m-%Y'))
ani = animation.FuncAnimation(fig, animate, interval = 1000)
plt.show()
I changed your animate function to use the i counter (which increases by 1 at each frame). You can control what changes during the animation with this counter. Then I added some formatting to improve the visualization. The code above produces an animation in which both lines grow by one day per frame.
To get an appreciable animation, I added some "fake" data to the data you provided, so that there are more days over which to run the animation. Replace them with your data.
In the case of the error
TypeError: 'builtin_function_or_method' object is not subscriptable
Replace the .iloc[0] with [m.index[0]] and the same for .iloc[-1] with [m.index[-1]]. For example ccase.iloc[0] becomes ccase[m.index[0]].
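The core idea in the answer is that frame i draws only the first i rows. As a variation, here is a sketch that updates two existing lines instead of clearing and replotting the axes. It uses only the four rows from the question, read from an inline string; skipinitialspace=True is there because the sample header has a space after each comma. Converting the dates with date2num up front and the choice of frames=len(m) + 1 are my additions, not part of the original answer.

```python
import io
import matplotlib
matplotlib.use("Agg")  # assumption: render off-screen; remove to view interactively
import matplotlib.pyplot as plt
import matplotlib.animation as animation
import matplotlib.dates as mdates
import pandas as pd

csv_text = """date, total_cases, total_deaths
12-5-2020,6,2
13-5-2020,7,3
14-5-2020,10,2
15-5-2020,18,5
"""

m = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)
m["date"] = pd.to_datetime(m["date"], format="%d-%m-%Y")
m = m.sort_values("date").reset_index(drop=True)

# Convert once to matplotlib date numbers and plain arrays.
x = mdates.date2num(m["date"].to_numpy())
cases = m["total_cases"].to_numpy()
deaths = m["total_deaths"].to_numpy()

fig, ax = plt.subplots()
(case_line,) = ax.plot([], [], label="cases")
(death_line,) = ax.plot([], [], label="deaths")
ax.legend(loc="upper left")
ax.xaxis_date()  # interpret x values as dates
ax.xaxis.set_major_formatter(mdates.DateFormatter("%d-%m-%Y"))
ax.set_xlim(x[0], x[-1])
ax.set_ylim(0, max(cases.max(), deaths.max()))


def animate(i):
    # Frame i shows the first i rows.
    case_line.set_data(x[:i], cases[:i])
    death_line.set_data(x[:i], deaths[:i])
    return case_line, death_line


ani = animation.FuncAnimation(fig, animate, frames=len(m) + 1,
                              interval=1000, repeat=False)
```

Call plt.show() (with an interactive backend) or ani.save(...) to see the result.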

parse_dates causes grid() to be displaced

The following code is a sample showing how the problem arises.
import pandas as pd
import matplotlib.pyplot as plt
#Reading data
data = pd.read_csv("mydata.csv",parse_dates=['date'])
data = data.iloc[0:17, :]
#Plotting data
fig = plt.figure(figsize=(10, 7))
ax = fig.add_subplot(111)
ax.plot(data['date'],data['y'],'-o')
ax.set(xlabel='Date', ylabel='y')
ax.grid()
plt.show()
The result is the following: the grid is displaced with respect to the data points.
If I remove ,parse_dates=['date'], everything works fine and the grid lines pass through the data points.
Here is the link to the data file https://drive.google.com/file/d/1AWcyIKgtDY_xkT_gaUxsiwjq9vLGfMog/view?usp=sharing

Problem when using datetime data to draw graphic

I want to draw a graph using data in datetime format as the x axis, but the process takes extremely long: after more than 30 minutes there is still no graph. When I use the data from another column instead, the graph appears very quickly. All the data are stored as lists.
I'm confused about this: since they are all in the same format, why can't I draw the graph using the datetime data as the x axis?
Here is my code. I appreciate your time and help!
from matplotlib import pyplot as plt
import csv
names = []
x = []
y = []
names=[]
with open('all.csv','r') as csvfile: #this csv file contains over 16000 datas
    plots= csv.reader(csvfile,delimiter=',')
    for row in plots:
        x.append(row[1]) #row1 is the datetime format data
        y.append(row[2])
print(x,y)
plt.plot(x,y)
plt.show()
Lines of my csv file look something like:
2016/05/02 10:47:45,14.1,20.1,N.C.,170.7,518.3,-1259,-12.61,375.8,44.92,13.76,92.74,132.6,38.86,165.3,170.9,311.5,252.3,501.2,447.2,378.4,35.48,7.868,181.2,
I want the first column as the x axis and the following columns as the y axis.
Also, the y axis doesn't change, no matter how I change the y axis limit.
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
df = pd.read_csv('all.csv')
x = df.iloc[:,1]
y = df.iloc[:,3]
x = pd.to_datetime(x)
plt.figure(num=3, figsize=(15, 5))
plt.plot(x,y)
my_y_ticks = np.arange(0, 40, 10)
plt.xticks(rotation = 90)
plt.show()
I haven't understood exactly what you mean by all the data being in list format, but I think you could use something like this:
import matplotlib.pyplot as plt
import pandas as pd
df = pd.read_csv('all.csv')
x = df.iloc[:,0]
y = df.iloc[:,1]
x = pd.to_datetime(x)
plt.plot(x,y)
plt.show()
It might help if you showed some rows of the file.
EDIT:
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import pandas as pd
df = pd.read_csv('all.csv')
x = df.iloc[:,0]
y = df.iloc[:,1]
x = pd.to_datetime(x, format="%Y/%m/%d %H:%M:%S") # matches e.g. 2016/05/02 10:47:45; if the format is different, change here
fig, ax = plt.subplots()
ax.plot(x, y)
xfmt = mdates.DateFormatter("%Y/%m/%d %H:%M:%S")
ax.xaxis.set_major_formatter(xfmt)
plt.xticks(rotation=70)
plt.show()
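A likely reason the csv.reader version is so slow is that every value stays a string, and matplotlib treats strings as categories, so each distinct timestamp becomes its own tick; string y values would also explain why the y limits appear to have no effect. A minimal sketch of parsing first, then plotting: the first row below is taken (shortened) from the question, the other two rows are made up, and header=None assumes the file has no header line, as the sample row suggests.

```python
import io
import matplotlib
matplotlib.use("Agg")  # assumption: render off-screen; remove to view interactively
import matplotlib.pyplot as plt
import pandas as pd

# First row shortened from the question; the other rows are made up.
csv_text = (
    "2016/05/02 10:47:45,14.1,20.1,N.C.,170.7,\n"
    "2016/05/02 10:47:55,14.3,20.0,N.C.,171.2,\n"
    "2016/05/02 10:48:05,14.2,19.9,N.C.,169.8,\n"
)

# header=None: treat the first line as data, not as column names.
df = pd.read_csv(io.StringIO(csv_text), header=None)

# Parse the timestamps with an explicit format (fast and unambiguous).
x = pd.to_datetime(df.iloc[:, 0], format="%Y/%m/%d %H:%M:%S")

# Make sure y is numeric; entries such as "N.C." become NaN.
y = pd.to_numeric(df.iloc[:, 1], errors="coerce")
not_connected = pd.to_numeric(df.iloc[:, 3], errors="coerce")

fig, ax = plt.subplots(figsize=(15, 5))
(line,) = ax.plot(x, y)
ax.set_ylim(0, 40)   # limits now apply because y is numeric
fig.autofmt_xdate()  # rotate the date labels
```

For the real file, replace io.StringIO(csv_text) with 'all.csv'.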
