I generate two plots in seaborn that share a y-axis. How can I center-align the shared y-axis tick labels between the two plots? I am looking for ideas and improvements. The plot is attached.
import seaborn as sns
import matplotlib.pylab as plt
import numpy as np
import string
import random
labels = []
for i in range(10):
    labels.append(''.join(random.choices(string.ascii_lowercase, k=4)))
    labels.append(''.join(random.choices(string.ascii_lowercase, k=7)))
score = np.abs(np.random.randn(20))
fig, axes = plt.subplots(1, 2, figsize=(5, 5), sharey=True)
for ii in range(2):
    ti = sns.barplot(y=[j for j in range(len(score))], x=score, ax=axes[ii],
                     orient='h')
    ti.set_yticklabels(labels)
    if ii == 0:
        ti.invert_xaxis()
        ti.yaxis.tick_right()
fig.tight_layout(w_pad=0, pad=1)
plt.show()
ti.set_yticklabels(labels) accepts text properties such as ha (horizontal alignment) and ma (alignment of multi-line text). I used ha together with the position argument, and the labels now appear centered. The same parameters can be tuned further to get the exact layout you want. Reference here (look at the other parameters).
Below is the updated code snippet for the part of the changes only:
for ii in range(2):
    ti = sns.barplot(y=[j for j in range(len(score))], x=score, ax=axes[ii],
                     orient='h')
    ti.set_yticklabels(labels, ha="center", position=(1.2, 0))  # ha and position adjustments.
    if ii == 0:
        ti.invert_xaxis()
        ti.yaxis.tick_right()
plt.tight_layout(w_pad=0, pad=2.5)  # Some padding adjustments.
plt.show()
Finally the output looks like this:
I have a df containing x,y coordinates of a mouse's snout that I want to use for an animated scatterplot. Currently, I have the code for a static scatterplot.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation
import os
from pathlib import Path
from IPython.display import HTML
#import video pose estimation data
video='topNoF.mp4'
DLCscorer='C:/Users/Bri-Guy/Desktop/DLC/h5/topFDLC_resnet50_topAnalysisJan20shuffle1_100000'
dataname = DLCscorer+'.h5' #can change to .csv instead; make sure to change pd.read_hdf() to pd.read_csv()
df=pd.read_hdf(os.path.join(dataname))
#get X & Y coordinates of snout bdp
scorer=df.columns.get_level_values(0)[0]
x=df[scorer]['snout']['x'].values
y=df[scorer]['snout']['y'].values
#produce a static scatterplot trace of snout movement
length=len(x)
n = len(x)
color = []
for i in range(1, n+1):
    color.append(i/n)
scatter=plt.scatter(x,y,c=color, cmap='inferno')
ax = scatter.axes
which yields
I want to use a variation of the code from this Stack Exchange answer: Matplotlib Plot Points Over Time Where Old Points Fade to animate the scatterplot in a manner that looks like this:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation
from IPython.display import HTML
from matplotlib.colors import LinearSegmentedColormap
plt.rcParams['animation.ffmpeg_path'] = r'C:/Users/Bri-Guy/anaconda3/envs/pWGA/Library/bin/ffmpeg.exe'
x=df[scorer]['snout']['x'].values
y=df[scorer]['snout']['y'].values
fig, ax = plt.subplots()
plt.xlim(0, max(x)+100)
plt.ylim(0, max(y)+100)
graph, = plt.plot([], [], 'o')
def animate(i):
    graph.set_data(x[:i+1], y[:i+1])
    return graph
ani = FuncAnimation(fig, animate, frames=len(y), interval=20)
html = ani.to_html5_video()
HTML(html)
Ideally, I'd like for the points in this animation to fade over time (smooth fade like the example linked). Also, I'm not sure how to set a colormap such as inferno to the animated scatterplot. Fade & the ability to set a colormap are my most important priorities.
The main issue I'm running into with the code from the linked example is the following area:
def get_new_vals():
    n = np.random.randint(1,5)
    x = np.random.rand(n)
    y = np.random.rand(n)
    return list(x), list(y)
def update(t):
    global x_vals, y_vals, intensity
    # Get intermediate points
    new_xvals, new_yvals = get_new_vals()
    x_vals.extend(new_xvals)
    y_vals.extend(new_yvals)
    # Put new values in your plot
    scatter.set_offsets(np.c_[x_vals,y_vals])
    #calculate new color values
    intensity = np.concatenate((np.array(intensity)*0.96, np.ones(len(new_xvals))))
    scatter.set_array(intensity)
    # Set title
    ax.set_title('Time: %0.3f' %t)
I need to get one new value from x=df[scorer]['snout']['x'].values & y=df[scorer]['snout']['y'].values. These values have to be called in order from x[0] to x[len(x)-1] so that the plot will update chronologically. However, I get errors when trying to add a parameter in get_new_vals(i) because I think the t variable for animation isn't an integer. I'm not sure if there is also an issue with the element types inside my arrays x & y since they are floating points.
I appreciate your help in advance! Please let me know if I can clarify anything for you. Below I have posted some of the data inside my x,y variables:
print(x[:200])
[276.89370728 280.57974243 285.25439453 285.55096436 284.71258545
283.52386475 284.31976318 285.08609009 285.56118774 285.38183594
289.21246338 295.28497314 303.41043091 315.51828003 324.87826538
333.36367798 338.73730469 341.11685181 349.20300293 357.63671875
366.72702026 395.68356323 385.37298584 387.86871338 387.58526611
382.20205688 378.13674927 373.97241211 368.39953613 591.94116211
347.27310181 616.52069092 608.12902832 605.11340332 602.10974121
598.72052002 598.48504639 599.19256592 601.30432129 603.32104492
604.9621582 605.21533203 621.36779785 627.51617432 626.20269775
621.00164795 618.92498779 617.44885254 615.73883057 598.8916626
594.17883301 593.38647461 589.3248291 592.67895508 593.67053223
589.05767822 589.08850098 303.0085144 568.39239502 555.08520508
550.79425049 547.77197266 547.21954346 313.01544189 333.96121216
348.59899902 353.26141357 358.76705933 360.81588745 363.94262695
527.38165283 522.80316162 518.20489502 521.84442139 525.30664062
526.43286133 532.38995361 536.35961914 536.51574707 540.41906738
545.77844238 545.22381592 545.34112549 540.22357178 537.93457031
534.03442383 532.6651001 522.52618408 505.38290405 489.37664795
469.75460815 448.28039551 424.54315186 403.87719727 383.25265503
359.93307495 335.06869507 318.53125 296.43450928 288.86499023
442.87780762 443.70950317 440.31652832 439.50854492 439.11328125
434.4161377 216.55622864 216.7456665 215.32554626 213.63644409
213.8143158 213.96568298 213.87882996 214.10801697 214.05957031
212.54373169 216.25740051 214.80444336 216.47532654 218.31072998
215.78303528 213.40249634 292.92352295 290.80630493 287.03222656
283.45129395 390.10848999 274.83648682 269.53741455 215.76341248
218.75086975 220.38156128 219.87997437 219.83804321 218.52023315
216.93737793 218.18110657 218.31959534 224.79884338 224.69064331
221.88998413 218.67016602 216.9510498 216.63031006 215.88612366
217.3243103 217.01783752 214.08659363 213.87808228 211.14770508
206.47595215 214.88208008 214.39358521 212.50665283 212.39123535
213.95169067 217.72639465 313.20504761 288.23443604 283.30273438
283.1756897 281.46990967 276.20397949 273.39535522 274.38088989
267.42678833 269.19915771 271.11810303 331.80795288 330.61746216
329.03930664 227.14578247 226.81338501 227.80999756 229.65690613
229.97644043 207.91325378 219.31289673 225.3374939 230.50515747
313.22525024 310.83474731 306.02667236 299.73217773 288.91854858
278.45489502 266.55349731 264.91229248 256.15029907 249.70783997
245.47244263 246.11851501 245.35572815 246.03157043 246.50708008
246.63691711 245.77215576 245.09873962 241.44792175 243.87677002]
print(y[:200])
[1321.18652344 1316.84301758 1316.04064941 1315.66455078 1315.38586426
1315.74560547 1317.04711914 1318.11218262 1320.09631348 1321.45703125
1328.39794922 1339.04956055 1353.28076172 1364.76757812 1371.4901123
1376.57568359 1381.58361816 1383.17236328 1390.66870117 1401.93652344
1412.72961426 1393.1583252 1437.81616211 1440.15380859 1440.01843262
1438.75891113 1437.50012207 1436.55932617 1429.39416504 1620.10900879
1400.08654785 1506.79125977 1497.49243164 1491.12316895 1488.84606934
1487.79296875 1489.73840332 1488.69665527 1490.09631348 1490.75927734
1490.99291992 1487.58154297 1610.09851074 1612.06115723 1613.1282959
1615.44006348 1619.22009277 1620.50915527 1618.51501465 1495.30175781
1496.44470215 1497.99816895 1492.10546875 1474.48339844 1481.59130859
1475.67248535 1477.48083496 1375.00964355 1480.91625977 1486.29675293
1493.73193359 1501.6204834 1507.31176758 1366.4050293 1358.88977051
1346.67797852 1339.04455566 1333.7755127 1328.54821777 1322.38439941
1502.17102051 1497.28198242 1493.86022949 1479.5690918 1469.38928223
1458.9095459 1448.56921387 1448.53613281 1443.38757324 1444.77124023
1443.64233398 1440.03857422 1431.31848145 1427.53820801 1427.53393555
1428.34887695 1429.39526367 1431.30249023 1440.09594727 1457.33044434
1474.85864258 1492.55554199 1498.40905762 1497.65649414 1503.4185791
1507.8626709 1507.43164062 1506.86315918 1507.16052246 1506.81115723
1313.21789551 1317.2467041 1322.02380371 1320.96960449 1314.60107422
1314.20336914 1640.76843262 1656.12866211 1667.98669434 1675.00683594
1675.64379883 1678.5567627 1678.24157715 1673.58984375 1667.44628906
1651.3112793 1635.78564453 1626.48413086 1616.28967285 1594.74316406
1580.41320801 1579.52124023 1506.03894043 1504.37646484 1501.6105957
1500.49902344 1316.76806641 1491.15759277 1471.89294434 1575.55322266
1585.22619629 1594.93151855 1598.60961914 1597.25134277 1598.54443359
1599.59936523 1597.15466309 1594.58544922 1585.51281738 1582.63903809
1584.89025879 1585.41784668 1589.9654541 1592.00085449 1596.66369629
1598.4128418 1598.8269043 1600.46069336 1596.1394043 1597.69421387
1598.36975098 1582.69897461 1580.45617676 1581.58618164 1580.9498291
1578.98144531 1575.28918457 1494.18151855 1490.47180176 1490.87109375
1485.41467285 1480.73291016 1481.30554199 1485.26867676 1489.50756836
1497.07263184 1501.99645996 1505.94543457 1267.27172852 1270.28515625
1281.31738281 1568.67248535 1576.60119629 1578.63500977 1577.04748535
1577.89709473 1681.06103516 1681.95227051 1684.63745117 1686.96484375
1325.62084961 1331.21569824 1331.70007324 1331.41809082 1336.49401855
1345.08703613 1351.00073242 1356.37145996 1360.50805664 1367.5177002
1371.69470215 1376.6151123 1377.27844238 1376.92382812 1376.37487793
1377.03894043 1377.87390137 1378.53796387 1373.02990723 1365.00915527]
In case anyone's curious: I managed to adapt the linked code to my dataset by converting the x and y NumPy arrays to lists with list(), then stepping through each list chronologically in get_new_vals() with .pop(0), which avoids introducing a parameter. The last piece of troubleshooting was with extend() and len() in update(). Since get_new_vals() now returns a single numpy.float64 rather than a list, extend() fails on it, so I switched to append(). For the same reason new_xvals has no length, so I replaced np.ones(len(new_xvals)) with np.ones(1), because I'm adding one data point per frame anyway. Final code:
import numpy as np
import pandas as pd  # needed for pd.read_hdf() below
import matplotlib.pyplot as plt
import matplotlib.animation
import os
from matplotlib.colors import LinearSegmentedColormap
from matplotlib.animation import PillowWriter
from IPython.display import HTML
#set path to ffmpeg for animation processing
plt.rcParams['animation.ffmpeg_path'] = r'C:/Users/Bri-Guy/anaconda3/envs/pWGA/Library/bin/ffmpeg.exe'
#import video pose estimation data
video='topNoF.mp4'
DLCscorer='C:/Users/Bri-Guy/Desktop/DLC/h5/topFDLC_resnet50_topAnalysisJan20shuffle1_100000'
dataname = DLCscorer+'.h5' #can change to .csv instead; make sure to change pd.read_hdf() to pd.read_csv()
df=pd.read_hdf(os.path.join(dataname))
#get X & Y coordinates of snout bdp
scorer=df.columns.get_level_values(0)[0]
x=df[scorer]['snout']['x'].values
y=df[scorer]['snout']['y'].values
x=list(x)
y=list(y)
#set plt & axes elements (empty canvas to be iterated over)
fig, ax = plt.subplots()
ax.set_xlabel('X Axis', size = 12)
ax.set_ylabel('Y Axis', size = 12)
ax.axis([0,max(x),0,max(y)])
x_vals = []
y_vals = []
intensity = []
iterations = 1000 #set number of frames for video
t_vals = np.linspace(0,1, iterations)
#define colormap and scatterplot
colors = [[0,0,1,0],[0,0,1,0.5],[0,0.2,0.4,1], [1,0.2,0.4,1]]
cmap = LinearSegmentedColormap.from_list("", colors)
scatter = ax.scatter(x_vals,y_vals, c=[], cmap=cmap, vmin=0,vmax=1)
def get_new_vals():
    xp = x.pop(0)
    yp = y.pop(0)
    return xp, yp
def update(t):
    global x_vals, y_vals, intensity
    # Get intermediate points
    new_xvals, new_yvals = get_new_vals()
    x_vals.append(new_xvals)
    y_vals.append(new_yvals)
    # Put new values in your plot
    scatter.set_offsets(np.c_[x_vals,y_vals])
    #calculate new color values
    intensity = np.concatenate((np.array(intensity)*0.98, np.ones(1)))
    scatter.set_array(intensity)
    # Set title
    ax.set_title('title')
ani = matplotlib.animation.FuncAnimation(fig, update, frames=t_vals,interval=50)
html = ani.to_html5_video() #necessary to view anim in Jup.NoteBook
HTML(html)
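The pop()-based approach above changes x and y as the animation runs, so the frame function cannot safely be called more than once per frame. A minimal alternative sketch, assuming the frame argument is an integer index (frames=len(x) instead of the float values in t_vals), recomputes everything from the index. The random stand-in data and the helper name fade_values are assumptions made for illustration and are not part of the original post.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation

# Stand-in data (assumption); in the post these come from df[scorer]['snout'].
rng = np.random.default_rng(0)
x = rng.uniform(200, 600, size=50)
y = rng.uniform(1300, 1700, size=50)

decay = 0.98  # same per-frame fade factor as in the code above

fig, ax = plt.subplots()
ax.axis([0, x.max(), 0, y.max()])
scatter = ax.scatter([], [], c=[], cmap="inferno", vmin=0, vmax=1)

def fade_values(i):
    """Intensities for points 0..i: the newest is 1, each older one is scaled by decay."""
    return decay ** np.arange(i, -1, -1)

def update(i):
    # i is an integer frame index, so slicing keeps the points in chronological order
    scatter.set_offsets(np.c_[x[:i + 1], y[:i + 1]])
    scatter.set_array(fade_values(i))
    return (scatter,)

ani = FuncAnimation(fig, update, frames=len(x), interval=50)
```

With inferno, older points drift toward the dark end of the colormap rather than becoming transparent. For a fade to transparent, a colormap with an alpha ramp such as the LinearSegmentedColormap defined above can be passed as cmap instead.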
I want to plot multiple signals in one figure and label them according to the predefined range.
This is the beginning of the code. I put some random values for the arrays tot_voltage and for x, because the real ones are too big.
import numpy as np
import matplotlib.pyplot as plt
shot_min= 1
shot_max = 4
shot_range = range(shot_min, shot_max)
tot_voltage = np.array([[ 0.00140459, 0.000847097, 0.000388473, 0.000223704],
[0.000415936, -4.54262e-05, 0.000577968, 0.000638384],
[-0.000237666, 0.000836115, 0.000229195, 0.000336297],
[-0.00045187, 0.00135515, 0.000566982, 0.000523042],
[-0.000179999, 0.000448897, 0.00120137, 0.000998143],
[0.000127584, 0.00027588, -0.000350259, 0.00130298]])
x=np.array([-1.8401, -1.84, -1.8399, -1.8398, -1.8397, -1.8396])
Now I want to plot the data and put labels according to the shot_range. But this syntax is invalid and I don't know how to fix it.
for i in range(0, len(tot_voltage[1,:])):
    y=tot_voltage[:,i]*1e3
    plt.figure(1)
    plt.plot(x, y, label = 'shot {}'.format(for i in shot_range))
plt.legend()
plt.show()
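One possible fix, sketched under the assumption that each column of tot_voltage belongs to one shot number, is to iterate over the shot numbers and the columns together with zip() and format one number per curve. The arrays below are shortened stand-ins for the ones in the question. Note that range(1, 4) yields only three shots while the array has four columns, so zip() stops after the third column; if four shots are intended, range(shot_min, shot_max + 1) would be needed.

```python
import numpy as np
import matplotlib.pyplot as plt

shot_min = 1
shot_max = 4
shot_range = range(shot_min, shot_max)  # 1, 2, 3

# Shortened stand-in data (assumption): rows are samples, columns are shots
tot_voltage = np.array([[0.0014, 0.0008, 0.0004, 0.0002],
                        [0.0004, -0.00005, 0.0006, 0.0006],
                        [-0.0002, 0.0008, 0.0002, 0.0003]])
x = np.array([-1.8401, -1.84, -1.8399])

fig, ax = plt.subplots()
# tot_voltage.T iterates over columns; zip pairs each column with its shot number
for shot, column in zip(shot_range, tot_voltage.T):
    ax.plot(x, column * 1e3, label='shot {}'.format(shot))
ax.legend()
```

Calling plt.show() afterwards displays the figure with one legend entry per plotted shot.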
I have two dictionaries: selected candidates and rejected candidates.
The structure of each is shown below:
selected={"name":score} #same for rejected
I want to show selected candidates in green and rejected candidates in red.
How can I do that?
I have tried it this way, but it gives me an absurd result:
#Husain Shaikh
#test 3 matplotlib
import matplotlib.pyplot as plt
selected={"Husain":92, "Asim":65,"Chirag": 74 }
rejected={"Absar":70,"premraj":57}
plt.bar(range(len(selected)),list(selected.values()),color="green")
plt.xticks(range(len(selected)),list(selected.keys()))
plt.bar(range(len(rejected)),list(rejected.values()),color="red")
plt.xticks(range(len(rejected)),list(rejected.keys()))
plt.xlabel("Candidates")
plt.ylabel("Score")
plt.plot()
plt.show()
You can try something like this:
import matplotlib.pyplot as plt
selected={"Husain":92, "Asim":65,"Chirag": 74 }
rejected={"Absar":70,"premraj":57}
selected_candidates_number = len(selected)
rejected_candidates_number = len(rejected)
plt.bar(range(selected_candidates_number ),list(selected.values()),color="green")
plt.bar(range(selected_candidates_number,selected_candidates_number +rejected_candidates_number ),list(rejected.values()),color="red")
plt.xticks(range(selected_candidates_number +rejected_candidates_number), list(selected.keys()) + list(rejected.keys()))
plt.xlabel("Candidates")
plt.ylabel("Score")
plt.plot()
plt.show()
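The same idea can also be written with a single bar call, as a minimal sketch: concatenate the names and scores, and pass a list with one color per bar. It relies on Matplotlib accepting strings as categorical x values; the variable names names, scores and colors are introduced here for illustration only.

```python
import matplotlib.pyplot as plt

selected = {"Husain": 92, "Asim": 65, "Chirag": 74}
rejected = {"Absar": 70, "premraj": 57}

# Selected candidates first, then rejected ones
names = list(selected) + list(rejected)
scores = list(selected.values()) + list(rejected.values())
# One color entry per bar, in the same order as names
colors = ["green"] * len(selected) + ["red"] * len(rejected)

fig, ax = plt.subplots()
bars = ax.bar(names, scores, color=colors)
ax.set_xlabel("Candidates")
ax.set_ylabel("Score")
```

Because every bar gets its own x position, the second group no longer overwrites the first one or its tick labels, which is what went wrong in the original attempt.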
Here you go. I am only showing the relevant/modified part of the code
# Import numpy, matplotlib and data here
loc_s = np.arange(len(selected))+0.1 # Offsetting the tick-label location
loc_r = np.arange(len(rejected))-0.1 # Offsetting the tick-label location
xtick_loc = list(loc_s) + list(loc_r)
xticks = list(selected.keys())+ list(rejected.keys())
plt.bar(loc_s,list(selected.values()),color="green", width=0.2,label='Selected')
plt.bar(loc_r,list(rejected.values()),color="red", width=0.2,label='Rejected')
plt.xticks(xtick_loc, xticks, rotation=45)
# Labels and legend here
Output
I am translating a set of R visualizations to Python. I have the following target R multiple plot histograms:
Using Matplotlib and Seaborn combination and with the help of a kind StackOverflow member (see the link: Python Seaborn Distplot Y value corresponding to a given X value), I was able to create the following Python plot:
I am satisfied with its appearance, except, I don't know how to put the Header information in the plots. Here is my Python code that creates the Python Charts
""" Program to draw the sampling histogram distributions """
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages
import seaborn as sns
def main():
    """ Main routine for the sampling histogram program """
    sns.set_style('whitegrid')
    markers_list = ["s", "o", "*", "^", "+"]
    # create the data dataframe as df_orig
    df_orig = pd.read_csv('lab_samples.csv')
    df_orig = df_orig.loc[df_orig.hra != -9999]
    hra_list_unique = df_orig.hra.unique().tolist()
    # create and subset df_hra_colors to match the actual hra colors in df_orig
    df_hra_colors = pd.read_csv('hra_lookup.csv')
    df_hra_colors['hex'] = np.vectorize(rgb_to_hex)(df_hra_colors['red'], df_hra_colors['green'], df_hra_colors['blue'])
    df_hra_colors.drop(labels=['red', 'green', 'blue'], axis=1, inplace=True)
    df_hra_colors = df_hra_colors.loc[df_hra_colors['hra'].isin(hra_list_unique)]
    # hard coding the current_component to pc1 here, we will extend it by looping
    # through the list of components
    current_component = 'pc1'
    num_tests = 5
    df_columns = df_orig.columns.tolist()
    start_index = 5
    for test in range(num_tests):
        current_tests_list = df_columns[start_index:(start_index + num_tests)]
        # now create the sns distplots for each HRA color and overlay the tests
        i = 1
        for _, row in df_hra_colors.iterrows():
            plt.subplot(3, 3, i)
            select_columns = ['hra', current_component] + current_tests_list
            df_current_color = df_orig.loc[df_orig['hra'] == row['hra'], select_columns]
            y_data = df_current_color.loc[df_current_color[current_component] != -9999, current_component]
            axs = sns.distplot(y_data, color=row['hex'],
                               hist_kws={"ec":"k"},
                               kde_kws={"color": "k", "lw": 0.5})
            data_x, data_y = axs.lines[0].get_data()
            axs.text(0.0, 1.0, row['hra'], horizontalalignment="left", fontsize='x-small',
                     verticalalignment="top", transform=axs.transAxes)
            for current_test_index, current_test in enumerate(current_tests_list):
                # this_x defines the series of current_component(pc1,pc2,rhob) for this test
                # indicated by 1, corresponding R program calls this test_vector
                x_series = df_current_color.loc[df_current_color[current_test] == 1, current_component].tolist()
                for this_x in x_series:
                    this_y = np.interp(this_x, data_x, data_y)
                    axs.plot([this_x], [this_y - current_test_index * 0.05],
                             markers_list[current_test_index], markersize = 3, color='black')
            axs.xaxis.label.set_visible(False)
            axs.xaxis.set_tick_params(labelsize=4)
            axs.yaxis.set_tick_params(labelsize=4)
            i = i + 1
        start_index = start_index + num_tests
    # plt.show()
    pp = PdfPages('plots.pdf')
    pp.savefig()
    pp.close()

def rgb_to_hex(red, green, blue):
    """Return color as #rrggbb for the given color values."""
    return '#%02x%02x%02x' % (red, green, blue)

if __name__ == "__main__":
    main()
The Pandas code works fine and does what it is supposed to. The bottleneck is my lack of knowledge and experience with 'PdfPages' in Matplotlib. How can I show in Python/Matplotlib/Seaborn the header information that I can show in the corresponding R visualization? By header information, I mean what the R visualization has at the top, before the histograms, i.e., 'pc1', MRP, XRD, ....
I can get their values easily from my program, e.g., current_component is 'pc1', etc. But I don't know how to format the plots with the header. Can someone provide some guidance?
You may be looking for a figure title or super title, fig.suptitle:
fig.suptitle('this is the figure title', fontsize=12)
In your case you can easily get the figure with plt.gcf(), so try
plt.gcf().suptitle("pc1")
The rest of the information in the header would be called a legend.
For the following let's suppose all subplots have the same markers. It would then suffice to create a legend for one of the subplots.
To create legend labels, you can pass the label argument to the plot call, i.e.
axs.plot( ... , label="MRP")
When axs.legend() is called later, a legend is automatically generated with the respective labels. Ways to position the legend are detailed e.g. in this answer.
Here, you may want to place the legend in terms of figure coordinates, i.e.
axs.legend(loc="lower center",bbox_to_anchor=(0.5,0.8),bbox_transform=plt.gcf().transFigure)
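Putting the two pieces together, a minimal self-contained sketch might look as follows. It uses random data and plain histograms instead of the question's CSV files and distplot calls, and the test names "MRP" and "XRD" and their markers are taken as placeholders; all of that is an assumption made for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
test_names = ["MRP", "XRD"]  # placeholder test names (assumption)
markers = ["s", "o"]

fig, axes = plt.subplots(3, 3, figsize=(6, 6))
for ax in axes.flat:
    ax.hist(rng.normal(size=200), bins=20, color="lightgray", edgecolor="k")
    for name, marker in zip(test_names, markers):
        # the label is what later shows up in the legend
        ax.plot(rng.normal(size=5), np.zeros(5), marker,
                markersize=3, color="black", label=name)

# Figure-wide title, playing the role of the 'pc1' header
fig.suptitle("pc1", fontsize=12)

# One legend, built from the first subplot, positioned in figure coordinates
legend = axes.flat[0].legend(loc="lower center", bbox_to_anchor=(0.5, 0.9),
                             bbox_transform=fig.transFigure,
                             ncol=len(test_names), fontsize="x-small")
# Leave room at the top for the title and the legend
fig.subplots_adjust(top=0.88)
```

The bbox_to_anchor and subplots_adjust values are starting points and will likely need tuning for a given figure size.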
Is there any way to make a bar plot without the y- (or x-)axis? In presentations all redundant information has to be removed, so I would like to start by deleting the axis. I did not find helpful information in the matplotlib documentation. Maybe you have a better solution than pyplot?
Edit: I would like to have lines around the bars, except for the axis at the bottom. Is this possible?
#!/usr/bin/env python
import matplotlib.pyplot as plt
ind = (1,2,3)
width = 0.8
fig = plt.figure(1)
p1 = plt.bar(ind,ind)
# plt.show()
fig.savefig("test.svg")
Edit: When using plt.show(), I did not notice that the y-axis is still there, just without ticks.
To make the axes not visible, try something like
import matplotlib.pyplot as plt
ind = (1,2,3)
width = 0.8
fig,a = plt.subplots()
p1 = a.bar(ind,ind)
a.xaxis.set_visible(False)
a.yaxis.set_visible(False)
plt.show()
Is this what you meant?
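If only part of the frame should disappear, the spines can also be hidden one at a time. The following is a minimal sketch of one reading of the edit in the question: the bars get an outline, the bottom axis stays, and the left, right and top spines as well as the y-axis are removed. The edgecolor choice is an assumption.

```python
import matplotlib.pyplot as plt

ind = (1, 2, 3)
fig, ax = plt.subplots()
# edgecolor draws a line around each bar
bars = ax.bar(ind, ind, edgecolor="black")

# Hide every spine except the bottom one
for side in ("left", "right", "top"):
    ax.spines[side].set_visible(False)
# Remove the y-axis ticks and labels as well
ax.yaxis.set_visible(False)
ax.set_xticks(ind)
```

Hiding the bottom spine too, with ax.spines["bottom"].set_visible(False), would leave only the outlined bars.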
Here is the code I used at the end. It is not minimal anymore. Maybe it helps.
import matplotlib.pyplot as plt
import numpy as np
def adjust_spines(ax,spines):
    for loc, spine in ax.spines.items():
        if loc in spines:
            # set_smart_bounds() was removed in newer Matplotlib releases,
            # so only call it where it still exists
            if hasattr(spine, 'set_smart_bounds'):
                spine.set_smart_bounds(True)
        else:
            spine.set_color('none') # don't draw spine
    # turn off ticks where there is no spine
    if 'left' in spines:
        ax.yaxis.set_ticks_position('left')
    else:
        # no yaxis ticks
        ax.yaxis.set_ticks([])

def nbar(samples, data, err, bWidth=0.4, bSafe=True, svgName='out'):
    fig,a = plt.subplots(frameon=False)
    if len(data)!=len(samples):
        print("length(data) must be equal to length(samples)!")
        return
    ticks = np.arange(len(data))
    p1 = plt.bar(ticks, data, bWidth, yerr=err)
    plt.xticks(ticks+bWidth/2., samples )
    adjust_spines(a,['bottom'])
    a.xaxis.tick_bottom()
    if bSafe:
        fig.savefig(svgName+".svg")

samples = ('Sample1', 'Sample2','Sample3')
qyss = (91, 44, 59)
qysserr = (1,5,4)
nbar(samples,qyss,qysserr,svgName="test")
Thx to all contributors.