I have an Excel sheet with a column of image frame numbers. The frame numbers are not uniformly distributed, e.g. frame 1 may have entries in rows 1 to 20, frame 2 in rows 21 to 25, and so on. The sheet has x and y coordinates for each frame, and I want to read this data and plot the coordinates as a scatter plot using matplotlib. Here's my code; frame numbers are identified as 'Image index'.
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline
from matplotlib.pyplot import figure
df_xlsx = pd.read_excel('X10.xlsx')
temp = df_xlsx['Image index'][0]
i = 0; #number of the row
xList = []
yList = []
dt = df_xlsx.loc[df_xlsx['Image index'] == 19]
xList = np.array(dt['X position'])
yList = np.array(dt['Y position'])
rList = np.array(dt['Diameter'])
figure(figsize=(10.24,7.68), dpi=100)
fig, ax = plt.subplots()
plt.xlim([0,1024])
plt.ylim([0,768])
plt.scatter(xList, yList, color ='r')
plt.axis('off')
plt.gcf().set_size_inches((10.24,7.68))
for i in range(len(xList)):
    circle1 = plt.Circle((xList[i], yList[i]), rList[i], color='r')
    ax.add_artist(circle1)
plt.tight_layout(pad=0)
plt.savefig('f=19.png',dpi=100)
plt.show()
Excel sheet example
The problem is that every time I need to enter the image index by hand and then save the plot. Can this be done in a loop, so that a separate plot is generated and saved for each frame number (image index)? This would save me a lot of time, as I have lots of frames and Excel sheets. I am new to Python.
You can use groupby to step over the image index:
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_excel('X10.xlsx')
for idx, group in df.groupby('Image index'):
    fig, ax = plt.subplots(figsize=(10.24, 7.68), dpi=100)
    # scatter's s is a marker area in points**2, so square the diameter and scale it
    size = 1 * group['Diameter']**2
    ax.scatter(group['X position'], group['Y position'], s=size)
    ax.set_xlim(0, 1024)
    ax.set_ylim(0, 768)
    ax.axis('off')
    fig.tight_layout(pad=0)
    fig.savefig(f'Image_index_{idx}.png', dpi=100)
    plt.close(fig)  # free the figure before moving on to the next frame
Some notes on the other changes I made:
You don't need to cast the DataFrame columns to arrays or lists.
You can pass a size parameter s to scatter() to draw the circles; you'll just need to scale the numbers to fit your plot, e.g. by multiplying by some factor other than 1. Note that in matplotlib s specifies the area of the marker (in points squared), not the diameter.
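If the circles need to match the Diameter column in data units, one way to pick that factor is to work out how many points one data unit covers. This is only a sketch: diameter_to_s is a hypothetical helper (not part of matplotlib), it assumes the axes limits and figure size are already final, and it uses add_axes so the axes fill the whole canvas instead of relying on tight_layout.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np


def diameter_to_s(ax, diameter_data):
    """Convert diameters in x-data units to scatter sizes in points**2.

    Hypothetical helper: assumes axes limits and figure size are final.
    """
    fig = ax.figure
    # Pixels covered by one data unit along the x axis.
    (x0, _), (x1, _) = ax.transData.transform([(0, 0), (1, 0)])
    points_per_unit = abs(x1 - x0) * 72.0 / fig.dpi  # 72 points per inch
    # For a round marker, sqrt(s) is its diameter in points.
    return (np.asarray(diameter_data) * points_per_unit) ** 2


fig = plt.figure(figsize=(10.24, 7.68), dpi=100)
ax = fig.add_axes([0, 0, 1, 1])  # axes cover the full 1024x768 px canvas
ax.set_xlim(0, 1024)
ax.set_ylim(0, 768)
ax.axis("off")

diameters = np.array([10.0, 20.0])  # made-up values in data units
sizes = diameter_to_s(ax, diameters)
ax.scatter([200, 600], [300, 400], s=sizes, linewidths=0)
```

With this layout one data unit is one pixel, i.e. 0.72 points, so a diameter of 10 data units needs s of about 51.8. The marker edge adds a little to the drawn size unless linewidths=0 is passed.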
That's not easy to describe in words, so here is a picture to make it clearer:
As the image shows, I want to plot a line on each row separately based on their values on a data frame. Is it possible with Python libraries?
Here's an example to get you started: it uses table to plot the dataframe and overplots the stacked lines. The line for each row is shifted by ymax, the maximum value in the dataframe, to prevent overlapping.
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
# make sample data
np.random.seed(0)
df = pd.DataFrame(np.random.rand(41,5))
df.index = [f'Row {i}' for i in df.index]
fig, ax = plt.subplots(figsize=(4,10))
ax.set_axis_off()
# plot data as table
plt.matplotlib.table.table(ax, df.applymap('{:.1f}'.format).values.tolist(), rowLabels=df.index, bbox=[0,0,1,1])
# plot curve over table
ymax = df.max().max()
ax.set_ylim(0, ymax * len(df))
ax.plot((df.to_numpy() + ((len(df) - 1 - df.reset_index(drop=True).index.to_numpy()) * ymax)[:, None]).T, color='C0')
To use alternating colors, you can set the color cycler:
from cycler import cycler
# ...
ax.set_prop_cycle(cycler(color='rg'))
ax.plot((df.to_numpy() + ((len(df) - 1 - df.reset_index(drop=True).index.to_numpy()) * ymax)[:, None]).T)
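To see what the offset expression in ax.plot is doing, here is the same arithmetic on a tiny array. This is just an illustration with made-up numbers, and it assumes non-negative data (as np.random.rand produces), so each row stays inside its own band of height ymax.

```python
import numpy as np

# Made-up 3x3 data standing in for the dataframe values.
data = np.array([[0.2, 0.8, 0.5],
                 [0.9, 0.1, 0.4],
                 [0.3, 0.3, 1.0]])
n_rows = len(data)
ymax = data.max()

# Row 0 gets the largest offset so it is drawn at the top, like in the table.
offsets = (n_rows - 1 - np.arange(n_rows)) * ymax
stacked = data + offsets[:, None]  # broadcast one offset per row

for row, off in zip(stacked, offsets):
    print(f"band [{off:.1f}, {off + ymax:.1f}]  values {row}")
```

Transposing stacked (as the .T in the answer does) makes each original row a separate line for ax.plot.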
I have a numpy array with:
col[0]=time=xaxis_data
col[1:32]= lines for y axis.
Every second a new row of data is added to the array.
I am plotting the data and updating the plots, however I cannot get the colors of each line to stay fixed.
import numpy as np
import time
import matplotlib.pyplot as plt
#add time column
start_measurment = time.time()
#storing the updated data
to_plot = np.zeros((1, 33))
#maybe using this? my_colors = plt.rcParams['axes.prop_cycle'][:32]()
fig,ax = plt.subplots(1,1)
ax.set_xlabel('time(s)')
ax.set_ylabel('sim. Data')
for i in range(20): #updating plot 20 times
    #simulate the data for Stack example
    Simulated_data = (np.arange(32)*i).reshape((1, 32))
    #insert the time as col[0]
    Simulated_data = np.insert(Simulated_data, 0, [time.time()-start_measurment], axis=1) #insert time
    #append new data to a numpy array
    to_plot = np.append(to_plot,Simulated_data , axis=0)
    #Plot Data
    ax.plot(to_plot[:,0], to_plot[:,1:]) #Add here how to fix colours
    fig.canvas.draw()
    time.sleep(1)
I don't think you can fix the colour of each line in a single plot call, but with a nested for loop that sets the colour explicitly it is possible:
import numpy as np
import time
import matplotlib.pyplot as plt
#add time column
start_measurment = time.time()
#storing the updated data
to_plot = np.zeros((1, 33))
#maybe using this? my_colors = plt.rcParams['axes.prop_cycle'][:32]()
fig,ax = plt.subplots(1,1)
ax.set_xlabel('time(s)')
ax.set_ylabel('sim. Data')
for i in range(100): #updating plot 100 times
    #simulate the data for Stack example
    Simulated_data = (np.arange(32)*i).reshape((1, 32))
    #insert the time as col[0]
    Simulated_data = np.insert(Simulated_data, 0, [time.time()-start_measurment], axis=1) #insert time
    #append new data to a numpy array
    to_plot = np.append(to_plot,Simulated_data , axis=0)
    #Plot Data one column at a time, with a fixed colour per column
    for j in range(1, to_plot.shape[1]): #columns 1..32; column 0 is the time
        ax.plot(to_plot[:,0], to_plot[:,j], c=f"C{j}")
    fig.canvas.draw()
    time.sleep(1)
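A different approach from the nested loop above, in case it helps: create the 32 line objects once and only replace their data on each update, so each line keeps the colour it was created with and no new lines pile up. This is a sketch with simulated data; the viridis colormap and the integer stand-in for the time stamp are my own arbitrary choices.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

n_lines = 32
# One fixed colour per line, sampled from a colormap (an arbitrary choice).
colors = [tuple(c) for c in plt.get_cmap("viridis")(np.linspace(0, 1, n_lines))]

fig, ax = plt.subplots()
ax.set_xlabel("time(s)")
ax.set_ylabel("sim. Data")

# Create the line objects once; later updates only replace their data.
lines = [ax.plot([], [], color=colors[j])[0] for j in range(n_lines)]

to_plot = np.empty((0, n_lines + 1))
for i in range(5):
    # Simulated row: a stand-in time stamp in column 0, then 32 values.
    new_row = np.concatenate(([float(i)], np.arange(n_lines) * i))
    to_plot = np.vstack([to_plot, new_row])
    for j, line in enumerate(lines):
        line.set_data(to_plot[:, 0], to_plot[:, j + 1])
    ax.relim()           # recompute data limits from the updated lines
    ax.autoscale_view()  # rescale the axes to the new limits
    fig.canvas.draw()
```

In a live measurement you would keep the time.sleep(1) (or a timer) between updates; it is left out here so the sketch finishes immediately.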
I have a set of data captured in a pandas data frame which I would like to plot as a contourf plot. When plotting, I see a lot of white space in certain areas of the contour, and I'm not sure how to fix it. My x-data is on a log scale. I'm not sure whether some kind of interpolation would help, or whether the problem lies in how I am generating my mesh grid and the contour itself. I will attach an image and 2 sets of data frames as examples.
contourplot
Data file can be found here: https://drive.google.com/drive/folders/13aO1_P0wzLCjZSTIgalXyaR4cdW1_Rh8?usp=sharing
import os,sys
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
#dev
import pprint
np.set_printoptions(threshold=sys.maxsize)
np.set_printoptions(suppress=True)
# start
Data = pd.read_csv('DF.csv',index_col=0)
plt.rcParams['figure.figsize'] = (16,10)
Freqs = Data.iloc[:,0] # Frequencies for data
angleFullset= ['{:03}'.format(x) for x in [*range(0,360,15)]] # test set, name of df cols in degrees
angleContour = [[int(x) for x in angleFullset],[int(x) if int(x) < 181 else int(x) - 360 for x in angleFullset]] # rename colum names to -180 to 180 deg
angleContour[0].append(angleContour[0][0]); angleContour[1].append(angleContour[1][0] - 1) # append 1 more column for last data set (which is same as first)
idx_180 = angleContour[1].index(180)
angleContour[0].insert(idx_180 + 1,-180); angleContour[1].insert(idx_180 + 1,-180) # insert another column after 180 to cover -180 case
[X,Y] = np.meshgrid(Freqs,angleContour[1])
fig,ax = plt.subplots(1,1)
ax.semilogx()
plt.hlines(0,20,20000,'k',linewidth=1.5) # zero axis
plt.vlines(100,-200,200,'k',linewidth=2) # 100Hz axis
plt.vlines(1000,-200,200,'k',linewidth=2) # 1kHz axis
plt.vlines(10000,-200,200,'k',linewidth=2) # 10kHz axis
plt.xlim([85,8000])
plt.ylim([-180,180])
plt.xticks([100,1000,8000],('100','1000','8000'))
plt.yticks(range(-180,181,30))
plt.xlabel('Frequency [Hz]')
plt.ylabel('Angle [deg]')
plt.grid(True,which='major'); plt.grid(True,which='minor')
plt.title('Contour')
newData = Data.copy()
newData.drop("Freq",axis=1,inplace=True)
newData['001'] = newData['000'] # for data from -345 to 0
newData.insert(newData.columns.get_loc('180')+1,'-180',newData['180']) # for data from -180 to -165
lev_min,lev_max,levels = -70,-19,range(-70,-19,1)
CM = ax.contourf(X,Y,newData.transpose(),cmap=matplotlib.cm.jet,levels=levels,vmin=lev_min,vmax=lev_max)
plt.colorbar(CM,label='Magnitude [dB]',fraction=0.1)
outputFileName = os.path.join(os.getcwd(),'Contour.png')
plt.savefig(outputFileName,orientation='landscape',format='png')
plt.clf()
plt.cla()
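One possible cause worth checking (I have not run this against the linked data, so treat it as a guess): contourf only fills between the lowest and highest level, so any values below -70 or above -20 are left blank unless extend is set. NaNs in the data also show up as white and would need separate handling. A sketch with synthetic data standing in for DF.csv:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, as in the question
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in for the measured data (not the data from the question).
freqs = np.logspace(np.log10(85), np.log10(8000), 60)
angles = np.arange(-180, 181, 15)
X, Y = np.meshgrid(freqs, angles)
Z = -45 + 35 * np.cos(np.radians(Y)) * np.sin(np.log10(X) * 3)

levels = np.arange(-70, -19)  # same span as range(-70, -19, 1)
outside = (Z < levels[0]) | (Z > levels[-1])
print(f"{outside.mean():.1%} of the samples fall outside the levels")

fig, ax = plt.subplots()
ax.set_xscale("log")
# extend='both' also fills below the lowest and above the highest level
# instead of leaving those regions blank.
cs = ax.contourf(X, Y, Z, levels=levels, cmap="jet", extend="both")
fig.colorbar(cs, label="Magnitude [dB]")
```

Running the same outside check on newData.transpose() would show whether this applies to the real measurements.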
I'm going insane here ... this should be a simple exercise but I'm stuck:
I have a Jupyter notebook and am using the ruptures Python package. All I want to do is, take the figure or AxesSubplot(s) that the display() function returns and add it to a figure of my own, so I can share the x-axis, have a single image, etc.:
import pandas as pd
import matplotlib.pyplot as plt
myfigure = plt.figure()
l = len(df.columns)
for index, series in enumerate(df):
    data = series.to_numpy().astype(int)
    algo = rpt.KernelCPD(kernel='rbf', min_size=4).fit(data)
    result = algo.predict(pen=3)
    myfigure.add_subplot(l, 1, index+1)
    rpt.display(data, result)
    plt.title(series.name)
plt.show()
What I get is a figure with the desired number of subplots (all empty) and n separate figures from ruptures:
What I want instead is for the subplots to be filled with those figures ...
I basically had to recreate the plot that ruptures.display(data,result) produces, to get my desired figure:
import pandas as pd
import numpy as np
import ruptures as rpt
import matplotlib.pyplot as plt
from matplotlib.ticker import EngFormatter
fig, axs = plt.subplots(len(df.columns), figsize=(22,20), dpi=300)
for index, series in enumerate(df):
    # iterating a DataFrame yields column names, so `series` is a column label
    resampled = df[series].dropna().resample('6H').mean().ffill()
    data = resampled.to_numpy().astype(int)
    algo = rpt.KernelCPD(kernel='rbf', min_size=4).fit(data)
    result = algo.predict(pen=3)
    # Create ndarray of tuples from the result
    result = np.insert(result, 0, 0) # Insert 0 as first result
    tuples = np.array([ result[i:i+2] for i in range(len(result)-1) ])
    ax = axs[index]
    # Fill area between results alternating blue/red
    for i, tup in enumerate(tuples):
        if i%2==0:
            ax.axvspan(tup[0], tup[1], lw=0, alpha=.25)
        else:
            ax.axvspan(tup[0], tup[1], lw=0, alpha=.25, color='red')
    ax.plot(data)
    ax.set_title(series)
    ax.yaxis.set_major_formatter(EngFormatter())
plt.subplots_adjust(hspace=.3)
plt.show()
I've wasted more time on this than I can justify, but it's pretty now and I can sleep well tonight :D
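For anyone who wants to try the shading part without installing ruptures, here is a stripped-down sketch of the same idea. The breakpoints list is hypothetical and hard-coded; I'm assuming it has the same shape as the predict() result used above, i.e. sorted indices ending with the number of samples.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
# Synthetic signal with mean shifts at samples 40 and 70.
data = np.concatenate([rng.normal(0, 1, 40),
                       rng.normal(5, 1, 30),
                       rng.normal(2, 1, 30)])
breakpoints = [40, 70, 100]  # hypothetical, hard-coded change points

fig, ax = plt.subplots()
# Pair each breakpoint with the previous one to get (start, end) segments.
starts = [0] + breakpoints[:-1]
for i, (start, end) in enumerate(zip(starts, breakpoints)):
    color = "C0" if i % 2 == 0 else "red"  # alternate blue/red
    ax.axvspan(start, end, lw=0, alpha=0.25, color=color)
ax.plot(data)
```

Because everything is drawn on an Axes you pass in yourself, the same loop works on any subplot of your own figure.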
I have a list of pandas dataframes, with the dataframes representing subsequent frames in a process. As such, the dataframes are all of the same size and structure (same columns and indices). I am wondering if there is a way to animate these dataframes and save the animation as mpeg movie.
I have tried to do the following, however, did not have any luck:
#X: a list of 1000 dataframe, all of same size
fig = plt.figure()
ax = fig.add_subplot(111)
im = ax.imshow(X,interpolation='nearest')
def update_img(n):
    im.set_data(n.values)
    return im
animation = animation.FuncAnimation(fig,update_img, frames = X, interval=30)
The above gets stuck in the first frame.
Initialise imshow with the first frame (X[0]) rather than with the whole list X. To save the animation as mp4, you can use:
import matplotlib.pyplot as plt
import matplotlib.animation
import numpy as np; np.random.seed(1)
import pandas as pd
X = [pd.DataFrame(np.random.rand(10,10)) for i in range(100)]
fig = plt.figure()
ax = fig.add_subplot(111)
im = ax.imshow(X[0],interpolation='nearest')
def update_img(n):
    im.set_data(n.values)
ani = matplotlib.animation.FuncAnimation(fig,update_img, frames = X, interval=30)
FFwriter = matplotlib.animation.FFMpegWriter(fps=30)
ani.save(__file__+".mp4",writer = FFwriter)
plt.show()
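FFMpegWriter needs the ffmpeg executable on your PATH. If you are not sure it is installed, a sketch like the following checks for it first and falls back to an animated GIF via PillowWriter. The fallback, the temporary output path and the small frame count are my own choices for illustration, not part of the code above.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import matplotlib.animation as animation
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
# A short list of same-shaped dataframes standing in for the real frames.
frames = [pd.DataFrame(rng.random((10, 10))) for _ in range(5)]

fig, ax = plt.subplots()
# Start from the first frame; fix vmin/vmax so colours are comparable across frames.
im = ax.imshow(frames[0].to_numpy(), interpolation="nearest", vmin=0, vmax=1)


def update_img(frame):
    im.set_data(frame.to_numpy())
    return (im,)


ani = animation.FuncAnimation(fig, update_img, frames=frames, interval=30)

# Use ffmpeg when available, otherwise fall back to a GIF written by Pillow.
if animation.writers.is_available("ffmpeg"):
    writer, suffix = animation.FFMpegWriter(fps=30), ".mp4"
else:
    writer, suffix = animation.PillowWriter(fps=30), ".gif"

out_path = os.path.join(tempfile.mkdtemp(), "frames" + suffix)
ani.save(out_path, writer=writer)
```

Writing to an explicit path like this also avoids __file__, which is not defined inside a Jupyter notebook.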