Python matplotlib: legend gives wrong result for scatter

I'm trying to visualize the Fashion-MNIST dataset with different dimensionality reduction techniques, and I want to attach a legend to the resulting figure using real_labels, which holds the real name of each class. For Fashion-MNIST the real labels are:
real_labels = ['t-shirt','trouser','pullover','dress','coat','sandal','shirt','sneaker','bag','ankle boot']
I'm doing the plotting inside the following function:
def Draw_datasamples_to_figure(X_scaled, labels, axis):
    y = ['${}$'.format(i) for i in labels]
    num_cls = len(list(set(labels)))
    for (X_plot, Y_plot, y1, label1) in zip(X_scaled[:,0], X_scaled[:,1], y, labels):
        axis.scatter(X_plot, Y_plot, color=cm.gnuplot(int(label1)/num_cls),label=y1, marker=y1, s=60)
Here X_scaled holds the x and y coordinates, labels holds the integer class of each point (0-9), and axis is the subplot in which the picture is drawn.
The legend is drawn with the following command:
ax3.legend(real_labels, loc='center left', bbox_to_anchor=(1, 0.5))
Everything seems to work well until the legend is drawn. As you can see from the picture below, instead of running from 0 to 9, the numbers shown in the legend are arbitrary.
I know that the problem is probably in the scatter part and that I should implement it another way, but I hope there is still something simple I am missing that can fix my implementation. I also don't want to use a hand-made legend where markers and names are hard-coded, because I have other datasets with different classes and real label names. Thanks in advance!
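The legend looks arbitrary because scatter is called once per point, so every point becomes its own legend handle, and legend(real_labels) simply pairs the ten names with the first ten points that were drawn. A minimal sketch of one possible fix, assuming labels is an array of integers that index into real_labels: call scatter once per class and pass the class name as label. The function name draw_by_class and the demo data are hypothetical, and plt.get_cmap('gnuplot') is used in place of cm.gnuplot.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np

real_labels = ['t-shirt', 'trouser', 'pullover', 'dress', 'coat',
               'sandal', 'shirt', 'sneaker', 'bag', 'ankle boot']

def draw_by_class(X_scaled, labels, axis, names):
    """Scatter one artist per class so the legend gets one entry per class."""
    labels = np.asarray(labels)
    classes = np.unique(labels)          # sorted class ids present in the data
    num_cls = len(classes)
    cmap = plt.get_cmap('gnuplot')
    for c in classes:
        mask = labels == c               # all points belonging to class c
        axis.scatter(X_scaled[mask, 0], X_scaled[mask, 1],
                     color=cmap(int(c) / num_cls),
                     marker='${}$'.format(int(c)),
                     label=names[int(c)], s=60)
    # No explicit label list: the legend picks up the label of each artist
    axis.legend(loc='center left', bbox_to_anchor=(1, 0.5))

# Hypothetical demo data standing in for the reduced Fashion-MNIST points
demo_labels = np.arange(200) % 10
demo_X = np.random.default_rng(0).normal(size=(200, 2))
fig, ax3 = plt.subplots()
draw_by_class(demo_X, demo_labels, ax3, real_labels)
legend_texts = [t.get_text() for t in ax3.get_legend().get_texts()]
```

Because the names come from the list passed in, the same function should work for other datasets with different classes, as long as the integer labels index into that list.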

Related

How to align y labels when using 'add_axes'?

I have created a plot with some data points (in blue), a fit (in purple) and I have managed to include the fit residuals (fit-datapoints) by using 'add_axes' as shown below:
#Plot and fit:
fig1 = plt.figure(1)
frame1 = fig1.add_axes((.1,.3,.8,.6))
plt.scatter(a/nshots,m/nshots,zorder=-1,s=1)
plt.plot(a/nshots,fit(a/nshots),color='purple')
plt.xlabel(r'$a_i/N_s$ (mV)')
plt.ylabel(r'$m_i/N_s$ (count/$N_s$)')
plt.tick_params(axis='both',which='both',direction='in',right=True,top=True)
#Residuals:
frame2=fig1.add_axes((.1,.1,.8,.2))
plt.scatter(a/nshots,m/nshots-fit(a/nshots),zorder=-1,s=1,color='pink')
plt.xlabel(r'$a_i/N_s$ (mV)')
plt.ylabel(r'residuals')
plt.tick_params(axis='both',which='both',direction='in',right=True,top=True)
However, I cannot seem to align the y labels on the resulting figure:
I have tried using things like plt.gca().yaxis.set_label_coords(-0.1,0.1) and plt.gca().yaxis.labelpad=20 but I would very much prefer an approach where alignment is automated and I need not align the labels by hand.
Thank you very much for your help.
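If the two frames can be created from a grid instead of add_axes, Figure.align_ylabels (available since matplotlib 2.2) can do the alignment automatically. As far as I know it only considers axes that belong to a subplot grid, which is why this sketch swaps add_axes for plt.subplots with height_ratios. The data is a hypothetical stand-in for a/nshots, m/nshots and the fit.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical stand-in data
x = np.linspace(0.0, 1.0, 50)
fit = 1000.0 * x + 5.0
y = fit + np.sin(40 * x)

# Two stacked axes in a 3:1 height ratio with no gap between them
fig1, (frame1, frame2) = plt.subplots(
    2, 1, sharex=True,
    gridspec_kw={"height_ratios": [3, 1], "hspace": 0})

frame1.scatter(x, y, s=1)
frame1.plot(x, fit, color='purple')
frame1.set_ylabel(r'$m_i/N_s$ (count/$N_s$)')

frame2.scatter(x, y - fit, s=1, color='pink')
frame2.set_xlabel(r'$a_i/N_s$ (mV)')
frame2.set_ylabel('residuals')

# Ask matplotlib to keep both y labels at the same horizontal position
fig1.align_ylabels([frame1, frame2])

# Labels are positioned at draw time, so draw before inspecting them
fig1.canvas.draw()
label_x = [ax.yaxis.label.get_position()[0] for ax in (frame1, frame2)]
```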

Plot colours in custom function (matplotlib)

I am attempting to write a function that can plot a best fit curve and its original data points. I would ideally like to run the function for 4-5 data sets and have them all appear on the same figure. The function I have at the moment does this well for plotting the best fit curve, but when I add in the individual data points they show up as a different colour to the best fit curve.
I would like them both to be the same colour so that when I run the function 4-5 times it is not too messy with 10 or so different colours. Ideally I would like the output to be like this
My code:
def plot(k, w, lab):
    popt, pcov = cf(linfunc, np.log(k), np.log(w))
    yfit = linfunc(np.log(k), *popt)
    plt.plot(np.log(k), yfit, '-', label = lab)
    plt.plot(np.log(k), np.log(w), 'o')
    plt.legend()
plot(k2ml, w2ml, '2ml')
Additionally, is there a way that I could make my function take any input for the parameter "lab" and have it automatically converted to a string so it can be used in the legend?

So what you want is to plot the data and its fit in the same color.
To achieve that, you can plot the first line, get its color, and then pass that color when plotting the second one.
Here is a small code snippet doing that:
# Plot first line and get list of plotted lines
lines = plt.plot([0,1,2,3,4], [5,6,7,8,9])
# Get color of first (and only) line
line_color = lines[0].get_color()
# Plot Your fit with same color parameter
plt.plot([0,1,2,3,4], [0,1,2,3,4], color=line_color)
As for the label, I would just convert it to a string with str(lab).
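Put together, the function from the question could look roughly like this. It is only a sketch: np.polyfit stands in for the cf(linfunc, ...) call, which is not shown in the question, and the name plot_fit and the demo arrays are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np

def plot_fit(k, w, lab):
    """Plot a straight-line fit in log-log space and its data in one colour."""
    logk, logw = np.log(k), np.log(w)
    # Assumption: a degree-1 polynomial fit replaces the asker's cf(linfunc, ...)
    slope, intercept = np.polyfit(logk, logw, 1)
    # str(lab) lets the caller pass numbers or other objects as the label
    (fit_line,) = plt.plot(logk, slope * logk + intercept, '-', label=str(lab))
    # Reuse the colour matplotlib picked for the fit line
    (points,) = plt.plot(logk, logw, 'o', color=fit_line.get_color())
    plt.legend()
    return fit_line, points

# Hypothetical demo data following w = 3 * k**0.5
k_demo = np.array([1.0, 2.0, 4.0, 8.0])
w_demo = 3.0 * k_demo ** 0.5
fit_line, points = plot_fit(k_demo, w_demo, 2)
```

Each call advances the colour cycle only once, so running it for 4-5 data sets should give 4-5 colours rather than 10.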

How to reproduce this legend with multiple curves?

I've been working hard on a package of functions for my work, and I'm stuck on a layout problem. Sometimes I need to work with many column subplots (1 row x N columns), and the standard matplotlib legend is sometimes unhelpful and makes it hard to visualize all the data.
I've been trying to create something like the picture below. I already tried to create one subplot for the curves and another one for the legends (and display the x-axis scale as a horizontal plot). I also tried offsetting the x-axis spines, but when I have a lot of curves plotted inside the same subplot the legend becomes huge.
The following image is from another piece of software. I'd like to create a similar look. Notice that these legends are "static": they remain fixed regardless of zooming. Another observation: I don't need all the ticks or anything like that.
What I already have is the following (the code is a mess because I'm trying many different solutions, and it is not organized or pythonic yet):
import matplotlib.ticker
import matplotlib.pyplot as plt
import numpy as np
fig, ax = plt.subplots(1,2, sharey = True)
ax[0].semilogx(np.zeros_like(dados.Depth)+0.02, dados.Depth)
ax[0].semilogx(dados.AHT90, dados.Depth, label = 'aht90')
ax[0].set_xlim(0.2,2000)
ax[0].grid(True, which = 'both', axis = 'both')
axres1 = ax[0].twiny()
axres1.semilogx(dados.AHT90, dados.Depth, label = 'aht90')
axres1.set_xlim(0.2 , 2000)
axres1.set_xticks(np.logspace(np.log10(0.2),np.log10(2000),2))
axres1.spines["top"].set_position(("axes", 1.02))
axres1.get_xaxis().set_major_formatter(matplotlib.ticker.ScalarFormatter())
axres1.tick_params(axis='both', which='both', labelsize=6)
axres1.set_xlabel('sss')#, labelsize = 5)
axres2 = ax[0].twiny()
axres2.semilogx(dados.AHT10, dados.Depth, label = 'aht90')
axres2.set_xlim(0.2 , 2000)
axres2.set_xticks(np.logspace(np.log10(0.2),np.log10(2000),2))
axres2.spines["top"].set_position(("axes", 1.1))
axres2.get_xaxis().set_major_formatter(matplotlib.ticker.ScalarFormatter())
axres2.tick_params(axis='both', which='both', labelsize=6)
axres2.set_xlabel('aht10')#, labelsize = 5)
fig.show()
and the result is:
However, I'm having trouble making this automatic. If I add more curves, it is not practical to keep setting the spine position by hand with
set_position(("axes", 1.02))
Another problem is that the more curves I add, the further this kind of "legend" grows upward, and I have to adjust the subplot size with
fig.subplots_adjust(top=0.75)
I would also like this adjustment to be automatic, so that I don't have to update that parameter whenever I add more curves.
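One possible way to automate both steps, as a rough sketch rather than a finished solution: compute each spine offset from the curve's index, then derive the top value for subplots_adjust from the number of curves. The helper name add_curve_headers, the first and step defaults, the 0.98 figure margin and the demo data are all assumptions chosen for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np

def add_curve_headers(fig, ax, curves, depth, first=1.02, step=0.08):
    """Add one twiny axis per curve and stack their top spines automatically."""
    twins = []
    for i, (name, values) in enumerate(curves.items()):
        colour = f"C{i}"                      # one colour per curve
        twin = ax.twiny()
        twin.semilogx(values, depth, color=colour)
        twin.set_xlim(0.2, 2000)
        # Offset grows with the curve index instead of being set by hand
        twin.spines["top"].set_position(("axes", first + i * step))
        twin.spines["top"].set_color(colour)
        twin.set_xlabel(name, color=colour)
        twin.tick_params(axis="x", which="both", labelsize=6)
        twins.append(twin)
    # Total height needed, in axes units: the axes plus all stacked headers
    extent = first + len(twins) * step
    bottom = fig.subplotpars.bottom
    # Shrink the axes so the whole stack ends near 0.98 of the figure height
    fig.subplots_adjust(top=bottom + (0.98 - bottom) / extent)
    return twins

# Hypothetical demo data in place of the dados columns
depth = np.linspace(1000.0, 1100.0, 50)
curves = {"AHT90": np.geomspace(1.0, 1000.0, 50),
          "AHT10": np.geomspace(500.0, 2.0, 50)}
fig, ax = plt.subplots(1, 2, sharey=True)
twins = add_curve_headers(fig, ax[0], curves, depth)
```

The step value still has to be tuned once for the font size in use, but after that adding a curve only means adding an entry to the dictionary.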

Plot two datasets at same position based on their index

I'm trying to plot two datasets (called Height and Temperature) on different y axes.
Both datasets have the same length.
Both datasets are linked together by a third dataset, RH.
I have tried to use matplotlib to plot the data using twinx(), but I am struggling to align both datasets on the same plot.
Here is the plot I want to align.
The horizontal black line on the figure marks the 0°C level, found from Height, and was used to test whether both datasets would be aligned when plotted. They are not: there is a noticeable difference between the black line and the 0°C tick from Temperature.
Rather than the two y axes changing independently from each other I would like to plot each index from Height and Temperature at the same y position on the plot.
Here is the code that I used to create the plot:
#Define number of subplots sharing y axis
f, ax1 = plt.subplots()
ax1.minorticks_on()
ax1.grid(which='major',axis='both',c='grey')
#Set axis parameters
ax1.set_ylabel('Height $(km)$')
ax1.set_ylim([np.nanmin(Height), np.nanmax(Height)])
#Plot RH
ax1.plot(RH, Height, label='Original', lw=0.5)
ax1.set_xlabel(r'RH $(\%)$')
ax2 = ax1.twinx()
ax2.plot(RH, Temperature, label='Original', lw=0.5, c='black')
ax2.set_ylabel(r'Temperature ($^\circ$C)')
ax2.set_ylim([np.nanmin(Temperature), np.nanmax(Temperature)])
Any help on this would be amazing. Thanks.

Maybe the atmosphere is wrong. :)
It sounds like you are trying to align the two y axes at particular values. Why are you doing this? The relationship of Height vs. Temperature is non-linear, so I think you are setting the stage for a confusing graph. Any particular line you plot can only be interpreted against one vertical axis.
If needed, I think you will be forced to "do some math" on the limits of the y axes. This link may be helpful:
align scales
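As an illustration of the "math on the limits", here is a sketch that rescales the second axis so that one chosen value (for example 0°C) sits at the same vertical position as one chosen value on the first axis (for example the height of the 0°C level). The helper name align_twin_at and the demo numbers are hypothetical, and the sketch assumes the second axis is not inverted. Only this single pair of values ends up aligned, for the reason given above.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np

def align_twin_at(ax1, ax2, v1, v2):
    """Rescale ax2's y limits so that v2 sits at the same height as v1 on ax1."""
    lo1, hi1 = ax1.get_ylim()
    frac = (v1 - lo1) / (hi1 - lo1)   # relative position of v1 on ax1
    if not 0 < frac < 1:
        raise ValueError("v1 must lie strictly inside ax1's y limits")
    lo2, hi2 = ax2.get_ylim()
    # Smallest span that puts v2 at `frac` and keeps the old ax2 range visible
    span = max((v2 - lo2) / frac, (hi2 - v2) / (1 - frac))
    ax2.set_ylim(v2 - frac * span, v2 + (1 - frac) * span)

# Hypothetical demo values, not real sounding data
height = np.linspace(0.0, 10.0, 50)
temperature = 15.0 - 6.5 * height
rh = 80.0 - 5.0 * height
zero_height = 15.0 / 6.5              # height at which temperature is 0

f, ax1 = plt.subplots()
ax1.plot(rh, height, lw=0.5)
ax1.axhline(zero_height, color='black')
ax2 = ax1.twinx()
ax2.plot(rh, temperature, lw=0.5, c='black')
align_twin_at(ax1, ax2, zero_height, 0.0)
```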

Matplotlib markers which plot and render fast

I'm using matplotlib to plot 5 sets of approx. 400,000 data points each. Although each set of points is plotted in a different color, I need different markers for people reading the graph on black and white print-outs. The issue I'm facing is that almost all of the possible markers available in the documentation at http://matplotlib.org/api/markers_api.html take too much time to plot and render while displaying. I could only find two markers which plot and render quickly, these are '-' and '--'. Here's my code:
plt.plot(series1,'--',label='Label 1',lw=5)
plt.plot(series2,'-',label='Label 2',lw=5)
plt.plot(series3,'^',label='Label 3',lw=5)
plt.plot(series4,'*',label='Label 4',lw=5)
plt.plot(series5,'_',label='Label 5',lw=5)
I tried multiple markers. Series 1 and series 2 plot quickly and render in no time. But series 3, 4, and 5 take forever to plot and AGES to display.
I'm not able to figure out the reason behind this. Does someone know of more markers that plot and render quickly?

The first two ('--' and '-') are linestyles, not markers. That's why they are rendered faster.
It doesn't make sense to plot ~400,000 markers. You won't be able to see all of them. However, what you could do is plot only a subset of the points.
So add the line with all your data (even though you could probably subsample that too) and then add a second "line" with only the markers.
For that you need an "x" vector, which you can subsample too:
# define the number of markers you want
nrmarkers = 100
# define a x-vector
x = np.arange(len(series3))
# calculate the subsampling step size
subsample = int(len(series3) / nrmarkers)
# plot the line
plt.plot(x, series3, color='g', label='Label 3', lw=5)
# plot the markers (using every `subsample`-th data point)
plt.plot(x[::subsample], series3[::subsample], color='g',
         lw=5, linestyle='', marker='*')
# similar procedure for series4 and series5
Note: The code is written from scratch and not tested
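A shorter variant of the same idea, offered as a sketch: plot accepts a markevery argument, so the line can be drawn from all points while a marker is placed only on every N-th one. The random series below is a hypothetical stand-in for series3.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical stand-in for series3 with 400,000 points
series3 = np.cumsum(np.random.default_rng(0).normal(size=400_000))

# Number of markers wanted along the line
nrmarkers = 100
# Draw a marker on every `step`-th point only (at least every point)
step = max(1, len(series3) // nrmarkers)

(line,) = plt.plot(series3, color='g', marker='*', markevery=step,
                   label='Label 3')
plt.legend()
```

This keeps the line and its markers in a single artist, so the legend entry shows both without needing a second plot call.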
