Python: Highlighting, marking or indicating a point in a (scatter) plot

UPDATE
Trying some more, I managed to run this code without error:
from matplotlib.pyplot import figure
dict = pd.DataFrame({"Return": mkw_returns, "Standard Deviation": mkw_stds})
dict.head()
#plt.annotate("Sharpe Ratio", xytext=(0.5,0.5), xy=(0.03,0.03) , arrowprops=dict(facecolor='blue', shrink=0.01, width=220)) # arrowprops={width = 3, "facecolor":
#dict.plot(x="Standard Deviation", y = "Return", kind="scatter", figsize=(10,6))
#plt.xlabel("Standard Deviations")
#plt.ylabel("log_Return YoY")
figure(num=None, figsize=(15, 10), dpi=100, facecolor='w', edgecolor='k')
plt.plot( 'Standard Deviation', 'Return', data=dict, linestyle='none', marker='o')
plt.xlabel("Standard Deviations")
plt.ylabel("log_Return YoY")
# Annotate with text + Arrow
plt.annotate(
# Label and coordinate
'This is a Test', xy=(0.01, 1), xytext=(0.01, 1), color= "r", arrowprops={"facecolor": 'black', "shrink": 0.05}
)
This now works, yay! Can anybody shed some light on this issue? I'm not sure why it suddenly started working. Thank you :)
Also, how would I simply mark a point, instead of using the arrow?
Problem: Cannot figure out how to mark/select/highlight a specific point in my scatter graph
(Python 3 Beginner)
So my goal is to highlight one or more points in a scatter graph with some text by it or supplied by a legend.
https://imgur.com/a/VWeO1EH
(not enough reputation to post images, sorry)
dict = pd.DataFrame({"Return": mkw_returns, "Standard Deviation": mkw_stds})
dict.head()
#plt.annotate("Sharpe Ratio", xytext=(0.5,0.5), xy=(0.03,0.03) , arrowprops=dict(facecolor='blue', shrink=0.01, width=220)) # arrowprops={width = 3, "facecolor":
dict.plot(x="Standard Deviation", y = "Return", kind="scatter", figsize=(10,6))
plt.xlabel("Standard Deviations")
plt.ylabel("log_Return YoY")
The commented-out plt.annotate call would give the error specified below.
Specifically, I would like to select the Sharpe ratio, but for now I'm happy if I manage to select any point in the scatter graph.
I'm truly confused about how to work with matplotlib, so any help is welcome.
I tried the following solutions I found online:
I)
This shows a simple way to use annotate in a plot, to mark a specific point by an arrow.
https://www.youtube.com/watch?v=ItHDZEE5wSk
However, the pd.DataFrame environment does not seem to like annotate, and I get the error:
TypeError: 'DataFrame' object is not callable
II)
Since I'm running into issues with annotate in a DataFrame environment, I looked at the following solution:
Annotate data points while plotting from Pandas DataFrame
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import string
df = pd.DataFrame({'x':np.random.rand(10), 'y':np.random.rand(10)},
index=list(string.ascii_lowercase[:10]))
fig, ax = plt.subplots()
df.plot('x', 'y', kind='scatter', ax=ax, figsize=(10,6))
for k, v in df.iterrows():
    ax.annotate(k, v)
However, when applied to my problem, the resulting plot does not show any annotation whatsoever, only a very long horizontal scroll bar:
https://imgur.com/a/O8ykmeg
III)
Further, I stumbled upon this solution, to use a marker instead of an arrow,
Matplotlib annotate with marker instead of arrow
import matplotlib.pyplot as plt
fig = plt.figure()
ax = fig.add_subplot(111)
x=[1,2,3,4,5,6,7,8,9,10]
y=[1,1,1,2,10,2,1,1,1,1]
line, = ax.plot(x, y)
ymax = max(y)
xpos = y.index(ymax)
xmax = x[xpos]
# Add dot and corresponding text
ax.plot(xmax, ymax, 'ro')
ax.text(xmax, ymax+2, 'local max:' + str(ymax))
ax.set_ylim(0,20)
plt.show()
However, the code does absolutely nothing when applied to my situation like so:
dict = pd.DataFrame({"Return": mkw_returns, "Standard Deviation": mkw_stds})
dict.head()
plt.annotate("Sharpe Ratio", xytext=(0.5,0.5), xy=(0.03,0.03) , arrowprops=dict(facecolor='blue', shrink=0.01, width=220)) # arrowprops={width = 3, "facecolor":
dict.plot(x="Standard Deviation", y = "Return", kind="scatter", figsize=(10,6))
plt.xlabel("Standard Deviations")
plt.ylabel("log_Return YoY")
ymax = max(y)
xpos = y.index(ymax)
xmax = x[xpos]
# Add dot and corresponding text
ax.plot(xmax, ymax, 'ro')
ax.text(xmax, ymax+2, 'local max:' + str(ymax))
ax.set_ylim(0,20)
plt.show()
IV)
Lastly, I tried a solution that apparently works flawlessly with an arrow in a pd.dataframe,
https://python-graph-gallery.com/193-annotate-matplotlib-chart/
# Library
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
# Basic chart
df=pd.DataFrame({'x': range(1,101), 'y': np.random.randn(100)*15+range(1,101) })
plt.plot( 'x', 'y', data=df, linestyle='none', marker='o')
# Annotate with text + Arrow
plt.annotate(
# Label and coordinate
'This point is interesting!', xy=(25, 50), xytext=(0, 80),
# Custom arrow
arrowprops=dict(facecolor='black', shrink=0.05)
)
However, running this code gives me the same error as above:
TypeError: 'DataFrame' object is not callable
Version:
import sys; print(sys.version)
3.7.1 (default, Dec 10 2018, 22:54:23) [MSC v.1915 64 bit (AMD64)]
Sorry for the wall of text, but I thought it's best to have everything I tried together in one post.
Any help is appreciated, thank you!
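A likely explanation for the TypeError quoted in the question, offered as a sketch rather than a confirmed diagnosis: the DataFrame is stored in a variable named dict, which hides Python's built-in dict. After that assignment, arrowprops=dict(facecolor=...) tries to call the DataFrame itself. The version under "UPDATE" passes arrowprops as a {...} literal, so it never calls dict, which would explain why it runs. The numbers and the function names below are made up for illustration.

```python
import pandas as pd

# Made-up numbers standing in for mkw_returns / mkw_stds from the question.
returns = [0.02, 0.05, 0.03]
stds = [0.10, 0.20, 0.12]


def arrowprops_with_shadowed_dict():
    # Hypothetical helper: reproduces the naming used in the question.
    # The local name `dict` now refers to the DataFrame, not the built-in type.
    dict = pd.DataFrame({"Return": returns, "Standard Deviation": stds})
    return dict(facecolor="black", shrink=0.05)  # this calls the DataFrame


def arrowprops_with_other_name():
    # Hypothetical helper: same data under a different name, so the
    # built-in `dict` stays available.
    frame = pd.DataFrame({"Return": returns, "Standard Deviation": stds})
    return dict(facecolor="black", shrink=0.05), frame


try:
    arrowprops_with_shadowed_dict()
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)

arrowprops, frame = arrowprops_with_other_name()
print(arrowprops)
```

Renaming the DataFrame (for example to frame or df) and restarting the interpreter or kernel should let both the dict(...) and the {...} spelling of arrowprops work.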

I think one solution is the code posted above under "UPDATE".
One question remains: how can I use a different marker or color and describe it in the legend instead?
Thanks in advance :)
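One way to approach the remaining question, as a minimal sketch: draw all points once, then draw the point of interest a second time with its own marker, color and label, and let the legend describe it. The data is invented, and picking the row with the highest Return / Standard Deviation ratio is only an assumption about what "the Sharpe ratio point" means here (no risk-free rate is considered).

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Invented stand-in data; mkw_returns / mkw_stds from the question are not shown.
df = pd.DataFrame({
    "Return": [0.02, 0.05, 0.03, 0.08, 0.06],
    "Standard Deviation": [0.10, 0.20, 0.12, 0.30, 0.18],
})

# Assumption: the point to highlight is the row with the highest ratio.
ratio = df["Return"] / df["Standard Deviation"]
best = df.loc[ratio.idxmax()]

fig, ax = plt.subplots(figsize=(10, 6))

# All points in the default style.
ax.scatter(df["Standard Deviation"], df["Return"], label="All points")

# The highlighted point drawn on top, with its own marker, color and label.
ax.scatter(best["Standard Deviation"], best["Return"],
           color="red", marker="*", s=250,
           label="Highest Return / Std ratio")

ax.set_xlabel("Standard Deviations")
ax.set_ylabel("log_Return YoY")
ax.legend(loc="upper left")
# In an interactive session, plt.show() would display the figure.
```

Because the highlight is just a second scatter call, several points can be marked by passing several rows instead of one.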

Related

Is it possible to plot multiple buffers in python

I'm rather new to coding, so sorry if my question is stupid, but I can't find a solution anywhere.
My question is whether you can plot multiple buffers on top of each other, in multiple colors. I'm trying to make a map where I would like buffers showing the 20, 30 and 50 km range from a coordinate. My attempt so far looks like this:
gdf = geopandas.GeoDataFrame(df, geometry=geopandas.points_from_xy(df.x, df.y), crs="EPSG:25832")
gdf30=gdf
gdf30['geometry'] = gdf30.geometry.buffer(30*1000)
gdf20=gdf
gdf20['geometry'] = gdf20.geometry.buffer(20*1000)
Map = geopandas.read_file("Map_DK_SWE.gpkg")
Map = Map.to_crs(25832)
fig,ax=plt.subplots()
Map.plot(ax=ax,color='white', edgecolor='black')
ax.set_ylim([6000000, 6500000])
ax.set_xlim([400000, 850000])
gdf30.plot(ax=ax, color='blue',zorder=2)
gdf20.plot(ax=ax, color='green',zorder=1)
[This is what I get from the code][1]
I don't know what exactly your issue is since I can't see your plot, but you can do it like this:
from matplotlib import pyplot as plt
import geopandas as gpd
cities = gpd.read_file(gpd.datasets.get_path('naturalearth_cities'))
world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
centroid = cities[cities.name == 'Tokyo']
buffer_1 = cities[cities.name == 'Tokyo'].geometry.buffer(3)
buffer_2 = cities[cities.name == 'Tokyo'].geometry.buffer(2)
buffer_3 = cities[cities.name == 'Tokyo'].geometry.buffer(1)
f, ax = plt.subplots()
# plot basemap
world.plot(edgecolor='k', facecolor='w', ax=ax)
# plot buffers
buffer_1.plot(color='r', label='buffer 1', ax=ax, alpha=.5)
buffer_2.plot(color='b', label='buffer 2', ax=ax, alpha=.5)
buffer_3.plot(color='g', label='buffer 3', ax=ax, alpha=.5)
# plot original coordinates
centroid.plot(marker='X', color='r', ax=ax)
# crop map to extent
ax.set_xlim(120, 145)
ax.set_ylim(25, 50)
plt.show()
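One thing that may contribute to the result in the question (a guess, since the plot is not visible): gdf30 = gdf does not copy anything, so gdf, gdf30 and gdf20 are the same object, and the second buffer is applied on top of the first. The sketch below shows the effect with a plain pandas DataFrame and a made-up size_km column standing in for the geometry; calling .copy() gives independent frames, and the same method is available on a GeoDataFrame.

```python
import pandas as pd

# Made-up stand-in: size_km plays the role of the buffered geometry.
points = pd.DataFrame({"name": ["A"], "size_km": [0]})

# Plain assignment only creates another name for the same object.
alias30 = points
alias30["size_km"] = alias30["size_km"] + 30
alias20 = points
alias20["size_km"] = alias20["size_km"] + 20

print(alias30 is alias20)              # both names point at one DataFrame
print(alias30["size_km"].tolist())     # the two changes have accumulated

# With explicit copies, each frame keeps its own result.
base = pd.DataFrame({"name": ["A"], "size_km": [0]})
copy30 = base.copy()
copy30["size_km"] = copy30["size_km"] + 30
copy20 = base.copy()
copy20["size_km"] = copy20["size_km"] + 20

print(copy30["size_km"].tolist(), copy20["size_km"].tolist())
```

When plotting overlapping buffers, drawing the largest one first (lowest zorder) keeps the smaller ones visible on top.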

Why can't I change the title of the x-axis and add a curve to a graph in a module containing a plotting function?

I am looking to study the evolution of the population using the pyworld3 module.
For this I entered the parameters and details I wanted. I get the result I wanted with my code.
Here is my code:
import pyworld3
from pyworld3 import World3
import numpy as np
import matplotlib.pyplot as plt
from pyworld3.utils import plot_world_variables
world3 = World3(year_min=1951, year_max=2100, dt=1)
world3.init_world3_constants(p1i=92e7,
p2i=70e7, p3i=19e7, p4i=6e7,
dcfsn=3,
fcest=4000, hsid=20, ieat=3,
len=42, # life expectancy normal.
lpd=20, mtfn=12, pet=4000, rlt=30, sad=20,
zpgt=4000, ici=2.1e11, sci=1.44e11, iet=4000,
iopcd=400,lfpf=0.75, lufdt=2, icor1=3, icor2=3,
scor1=1,
scor2=1, alic1=14, alic2=14, alsc1=20, alsc2=20,
fioac1=0.43, fioac2=0.43,
ali=0.9e9, pali=2.3e9, lfh=0.7, palt=3.2e9,
pl=0.1, alai1=2, alai2=2, io70=7.9e11, lyf1=1,
lyf2=1, sd=0.07, uili=8.2e6, alln=6000, uildt=10,
lferti=600, ilf=600, fspd=2, sfpc=230,
ppoli=2.5e7, ppol70=1.36e8, ahl70=1.5, amti=1,
imti=10, imef=0.1, fipm=0.001, frpm=0.02,
ppgf1=1, ppgf2=1, ppgf21=1, pptd1=20, pptd2=20,
nri=1e12, nruf1=1, nruf2=1)
world3.init_world3_variables()
world3.set_world3_table_functions(json_file=None)
world3.set_world3_delay_functions(method= 'odeint')
world3.run_world3()
plot_world_variables(world3.time,
[world3.nrfr, world3.iopc, world3.fpc, world3.pop,
world3.ppolx],
["NRFR", "IOPC", "FPC", "POP", "PPOLX"],
[[0, 1], [0, 1e3], [0, 1e3], [5e9, 12e9], [0, 32]],
# img_background="./img/fig7-7.png",
figsize=(12, 8),
title="Evolution of the world population",
grid=True)
Here is the output I get:
However I would like to change the title of the x-axis and also add a curve on the graph with plt.plot.
I can choose the title I want to give to the graph because there is an argument for that in plot_world_variables but there is no argument to choose the title of the x-axis.
So I tried to make these changes with plt.gcf() and plt.gca().
Here is what I added after my previous code:
# First we get its Axes:
axes: plt.Axes = plt.gcf().gca()
# On it, we can plot:
X = np.linspace(-2, 0, 100)
Y = X*2-1
axes.plot(X, Y, label="another curve")
plt.legend()
# And adjust things:
axes.set_xlabel("Year")
plt.show()
I don't get an error when adding this code. In fact, I get nothing at all. Nothing changes when I run the code. Python gives me exactly the same output as the one I got before.
Where do you think this problem comes from and how can I fix it?
P.S.: I saw that someone had asked the same question before, but even after reading that post I still can't figure out my problem.
Sadly, plot_world_variables doesn't return anything. A quick and dirty solution: you can easily copy the source code of that function and apply the necessary edits. I've looked at it and there is nothing fancy going on, easy edit to do :)
EDIT: source code of that function.
from matplotlib.ticker import EngFormatter
from matplotlib.image import imread
from numpy import isnan
import matplotlib.pyplot as plt
def plot_world_variables(time, var_data, var_names, var_lims,
                         img_background=None,
                         title=None,
                         figsize=None,
                         dist_spines=0.09,
                         grid=False):
    """
    Plots world state from an instance of World3 or any single sector.
    """
    prop_cycle = plt.rcParams['axes.prop_cycle']
    colors = prop_cycle.by_key()['color']
    var_number = len(var_data)
    fig, host = plt.subplots(figsize=figsize)
    axs = [host, ]
    for i in range(var_number-1):
        axs.append(host.twinx())
    fig.subplots_adjust(left=dist_spines*2)
    for i, ax in enumerate(axs[1:]):
        ax.spines["left"].set_position(("axes", -(i + 1)*dist_spines))
        ax.spines["left"].set_visible(True)
        ax.yaxis.set_label_position('left')
        ax.yaxis.set_ticks_position('left')
    if img_background is not None:
        im = imread(img_background)
        axs[0].imshow(im, aspect="auto",
                      extent=[time[0], time[-1],
                              var_lims[0][0], var_lims[0][1]], cmap="gray")
    ps = []
    for ax, label, ydata, color in zip(axs, var_names, var_data, colors):
        ps.append(ax.plot(time, ydata, label=label, color=color)[0])
    axs[0].grid(grid)
    axs[0].set_xlim(time[0], time[-1])
    for ax, lim in zip(axs, var_lims):
        ax.set_ylim(lim[0], lim[1])
    for ax_ in axs:
        formatter_ = EngFormatter(places=0, sep="\N{THIN SPACE}")
        ax_.tick_params(axis='y', rotation=90)
        ax_.yaxis.set_major_locator(plt.MaxNLocator(5))
        ax_.yaxis.set_major_formatter(formatter_)
    tkw = dict(size=4, width=1.5)
    axs[0].set_xlabel("time [years] asd")
    axs[0].tick_params(axis='x', **tkw)
    for i, (ax, p) in enumerate(zip(axs, ps)):
        ax.set_ylabel(p.get_label(), rotation="horizontal")
        ax.yaxis.label.set_color(p.get_color())
        ax.tick_params(axis='y', colors=p.get_color(), **tkw)
        ax.yaxis.set_label_coords(-i*dist_spines, 1.01)
    if title is not None:
        fig.suptitle(title, x=0.95, ha="right", fontsize=10)
    plt.tight_layout()
Now you can copy it and modify it to your liking.
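A possible alternative to copying the function, sketched under the assumption that the figure is built as in the source above (one host axes plus several twinx() axes): twinx() hides the x-axis of the axes it creates, and plt.gca() typically returns the most recently created axes, so an x-label set through it may never become visible. Fetching the host axes with fig.axes[0] avoids that. The function plot_two_variables below is a hypothetical stand-in for plot_world_variables, and the data is invented.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np


def plot_two_variables(time, data_a, data_b):
    """Hypothetical stand-in: host axes plus one twinx axes, returns nothing."""
    fig, host = plt.subplots()
    twin = host.twinx()
    host.plot(time, data_a, color="C0")
    twin.plot(time, data_b, color="C1")
    host.set_xlim(time[0], time[-1])


time = np.arange(1951, 2101)
plot_two_variables(time,
                   np.linspace(0, 1, time.size),
                   np.linspace(5e9, 12e9, time.size))

# The function returned nothing, so fetch the figure it left behind.
fig = plt.gcf()
host = fig.axes[0]  # axes are stored in creation order, so the host comes first

# The x-label belongs on the host axes, whose x-axis is the visible one.
host.set_xlabel("Year")

# An extra curve has to lie inside the existing x-limits to be seen.
x = np.linspace(1951, 2100, 100)
host.plot(x, 0.5 + 0.4 * np.sin((x - 1951) / 20), label="another curve")
host.legend()
```

Note that a curve defined on np.linspace(-2, 0, 100), as in the question, would fall outside x-limits of 1951 to 2100 and so would not appear even on the right axes.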

How to solve " 'PathCollection' object has no attribute 'yaxis' " error?

I'm an MSc student and I used to make graphs and plots with commercial packages like OriginPro, Excel and Matlab. Although these packages provide a great user experience, they have some major disadvantages: they depend on a specific OS and are, in general, very expensive.
Hence, I started to learn Python using the matplotlib library with VS Code. However, I'm having problems with some library functions and statements that seem to be standard in matplotlib and NumPy but don't work.
For example, I'm making some templates for scatter plots and I can't control minor ticks because it doesn't recognize the attributes xaxis and yaxis:
Sample of the code:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import MultipleLocator, AutoMinorLocator
.
.
.
fig = plt.figure(figsize=(x_pixels/my_dpi, y_pixels/my_dpi), dpi=my_dpi)
ax = plt.scatter(x*format_x, y*format_y, s = size, alpha = transparency, color = color, label = legend_text)
.
.
.
# Major Ticks
plt.tick_params(axis = 'both', which = 'major', length = majorT_length, direction = majorT_direction, color = majorT_color, labelsize = label_size, top = 'on', right = 'on')
# Minor Ticks
plt.minorticks_on()
plt.tick_params(axis='both', which='minor', length = minorT_length, direction = minorT_direction, color = minorT_color, top = 'on', right = 'on')
ax.yaxis.set_minor_locator(AutoMinorLocator(2))
ax.xaxis.set_minor_locator(AutoMinorLocator(2))
# Figure Layout
plt.tight_layout()
plt.savefig(output_file, dpi=my_dpi, bbox_inches=borders)
plt.show()
and the Terminal show this error:
File "c:/Users/luagu/Desktop/Python Matplotlib Training/Scatter_Template.py", line 128, in <module>
ax.yaxis.set_minor_locator(AutoMinorLocator(2))
AttributeError: 'PathCollection' object has no attribute 'yaxis'
What am I doing wrong?
Thanks in advance!
You wrote ax = plt.scatter(...), but plt.scatter returns the artist it draws (a PathCollection), not an Axes object, so ax has no xaxis or yaxis. What you want to do is:
plt.scatter(...)
...
ax = plt.gca()
ax.yaxis.set_minor_locator(AutoMinorLocator(2))
ax.xaxis.set_minor_locator(AutoMinorLocator(2))
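The same idea in the object-oriented style, as a short sketch with invented data: create the Axes explicitly with plt.subplots(), keep the return value of scatter under a separate name, and set the minor locators on the Axes.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.ticker import AutoMinorLocator

# Invented sample data.
rng = np.random.default_rng(0)
x = rng.random(20)
y = rng.random(20)

# plt.subplots() returns the Figure and the Axes.
fig, ax = plt.subplots(figsize=(6, 4))

# scatter() returns the drawn artist, kept here under its own name.
points = ax.scatter(x, y, s=30, alpha=0.8, label="samples")

# Minor ticks are configured on the Axes, not on the artist.
ax.minorticks_on()
ax.tick_params(axis="both", which="minor", length=3, direction="in",
               top=True, right=True)
ax.xaxis.set_minor_locator(AutoMinorLocator(2))
ax.yaxis.set_minor_locator(AutoMinorLocator(2))

fig.tight_layout()
print(type(points).__name__)
```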

How to create a basic legend to a multicolored line?

I am currently finishing a bigger project and the last part is to add a simple legend to a plot of a multicolored line. The line only contains two different colors.
The following image shows the plot when created.
The next image shows the same plot with higher resolution.
The plot displays the distance between Earth and Mars over time. For the months March to August the line is orange, for the other months it's blue. The legend should come in a simple box in the upper right corner of the plot showing a label each for the used colors. Something like this would be nice.
The data for the plot comes from a huge matrix I named master_array. It contains a lot more information that is needed for some tasks prior to showing the plot this question is about.
Important for the plot I am struggling with are the columns 0, 1 and 6 which are containing the date, distance between the planets at related date and in column 6 I set a flag to determine whether the given point belongs to the 'March to August' set or not (0 is for Sep-Feb / "winter", 1 is for Mar-Aug / "summer"). The master_array is a numpy array, dtype is float64. It contains approximately 45k data points.
It looks like:
In [3]: master_array
Out[3]:
array([[ 1.89301010e+07, 1.23451036e+00, -8.10000000e+00, ...,
1.00000000e+00, 1.00000000e+00, 1.89300000e+03],
[ 1.89301020e+07, 1.24314818e+00, -8.50000000e+00, ...,
2.00000000e+00, 1.00000000e+00, 1.89300000e+03],
[ 1.89301030e+07, 1.25179997e+00, -9.70000000e+00, ...,
3.00000000e+00, 1.00000000e+00, 1.89300000e+03],
...,
[ 2.01903100e+07, 1.84236878e+00, 7.90000000e+00, ...,
1.00000000e+01, 3.00000000e+00, 2.01900000e+03],
[ 2.01903110e+07, 1.85066892e+00, 5.50000000e+00, ...,
1.10000000e+01, 3.00000000e+00, 2.01900000e+03],
[ 2.01903120e+07, 1.85894904e+00, 9.40000000e+00, ...,
1.20000000e+01, 3.00000000e+00, 2.01900000e+03]])
This is the function to get the plot I described in the beginning:
def md_plot3(dt64=np.array, md=np.array, swFilter=np.array):
    """ not finished yet """
    y, m, d = dt64.astype(int) // np.c_[[10000, 100, 1]] % np.c_[[10000, 100, 100]]
    dt64 = y.astype('U4').astype('M8') + (m-1).astype('m8[M]') + (d-1).astype('m8[D]')
    cmap = ListedColormap(['b','darkorange'])
    plt.figure('zeitlich-global betrachtet')
    plt.title("Marsdistanz unter Berücksichtigung der Halbjahre der steigenden und sinkenden Temperaturen",
              loc='left', wrap=True)
    plt.xlabel("Zeit in Jahren\n")
    plt.xticks(rotation = 45)
    plt.ylabel("Marsdistanz in AE\n(1 AE = 149.597.870,7 km)")
    # plt.legend(loc='upper right', frameon=True) # worked formerly
    ax=plt.gca()
    plt.style.use('seaborn-whitegrid')
    #convert dates to numbers first
    inxval = mdates.date2num(dt64)
    points = np.array([inxval, md]).T.reshape(-1,1,2)
    segments = np.concatenate([points[:-1],points[1:]], axis=1)
    lc = LineCollection(segments, cmap=cmap, linewidth=3)
    # set color to s/w values
    lc.set_array(swFilter)
    ax.add_collection(lc)
    loc = mdates.AutoDateLocator()
    ax.xaxis.set_major_locator(loc)
    ax.xaxis.set_major_formatter(mdates.AutoDateFormatter(loc))
    ax.autoscale_view()
In the bigger script there is also another function (scatter plot) to mark the minima and maxima of the curve, but I guess this is not so important here.
I already tried this, which resulted in a legend that shows a vertical colorbar and only one label. I also tried both options described in the answers to this question, because they look more like what I am aiming for, but I couldn't make them work for my case.
Maybe I should add that I am only a beginner in Python. This is my first project, so I am not familiar with the deeper functionality of matplotlib, which is probably why I am not able to adapt the mentioned answers to my case.
UPDATE
Thanks to the help of the user ImportanceOfBeingErnest I made some improvements:
import matplotlib.dates as mdates
from matplotlib.collections import LineCollection
from matplotlib.colors import ListedColormap
from matplotlib.lines import Line2D
def md_plot4(dt64=np.array, md=np.array, swFilter=np.array):
    y, m, d = dt64.astype(int) // np.c_[[10000, 100, 1]] % np.c_[[10000, 100, 100]]
    dt64 = y.astype('U4').astype('M8') + (m-1).astype('m8[M]') + (d-1).astype('m8[D]')
    z = np.unique(swFilter)
    cmap = ListedColormap(['b','darkorange'])
    fig = plt.figure('Test')
    plt.title("Test", loc='left', wrap=True)
    plt.xlabel("Zeit in Jahren\n")
    plt.xticks(rotation = 45)
    plt.ylabel("Marsdistanz in AE\n(1 AE = 149.597.870,7 km)")
    # plt.legend(loc='upper right', frameon=True) # worked formerly
    ax=plt.gca()
    plt.style.use('seaborn-whitegrid')
    #plt.style.use('classic')
    #convert dates to numbers first
    inxval = mdates.date2num(dt64)
    points = np.array([inxval, md]).T.reshape(-1,1,2)
    segments = np.concatenate([points[:-1],points[1:]], axis=1)
    lc = LineCollection(segments, array=z, cmap=plt.cm.get_cmap(cmap),
                        linewidth=3)
    # set color to s/w values
    lc.set_array(swFilter)
    ax.add_collection(lc)
    fig.colorbar(lc)
    loc = mdates.AutoDateLocator()
    ax.xaxis.set_major_locator(loc)
    ax.xaxis.set_major_formatter(mdates.AutoDateFormatter(loc))
    ax.autoscale_view()
    def make_proxy(zvalue, scalar_mappable, **kwargs):
        color = scalar_mappable.cmap(scalar_mappable.norm(zvalue))
        return Line2D([0, 1], [0, 1], color=color, **kwargs)
    proxies = [make_proxy(item, lc, linewidth=2) for item in z]
    ax.legend(proxies, ['Winter', 'Summer'])
    plt.show()
md_plot4(dt64, md, swFilter)
+What is good about it:
Well it shows a legend and it shows the right colors according to the labels.
-What is still to optimize:
1) The legend is not in a box and the 'lines' of the legend interfere with the bottom layers of the plot. As the user ImportanceOfBeingErnest pointed out, this is caused by using plt.style.use('seaborn-whitegrid'). So a way to use plt.style.use('seaborn-whitegrid') together with the legend style of plt.style.use('classic') might help.
2) The bigger issue is the colorbar. I added the fig.colorbar(lc) line to the original code to achieve what I was looking for according to this answer.
So I tried some other changes:
I used the plt.style.use('classic') to get a legend in the way I need it but this costs me the nice style of plt.style.use('seaborn-whitegrid') as mentioned before. Moreover I disabled the colorbar line I added prior according to the mentioned answer.
This is what I got:
import matplotlib.dates as mdates
from matplotlib.collections import LineCollection
from matplotlib.colors import ListedColormap
from matplotlib.lines import Line2D
def md_plot4(dt64=np.array, md=np.array, swFilter=np.array):
    y, m, d = dt64.astype(int) // np.c_[[10000, 100, 1]] % np.c_[[10000, 100, 100]]
    dt64 = y.astype('U4').astype('M8') + (m-1).astype('m8[M]') + (d-1).astype('m8[D]')
    z = np.unique(swFilter)
    cmap = ListedColormap(['b','darkorange'])
    #fig =
    plt.figure('Test')
    plt.title("Test", loc='left', wrap=True)
    plt.xlabel("Zeit in Jahren\n")
    plt.xticks(rotation = 45)
    plt.ylabel("Marsdistanz in AE\n(1 AE = 149.597.870,7 km)")
    # plt.legend(loc='upper right', frameon=True) # worked formerly
    ax=plt.gca()
    #plt.style.use('seaborn-whitegrid')
    plt.style.use('classic')
    #convert dates to numbers first
    inxval = mdates.date2num(dt64)
    points = np.array([inxval, md]).T.reshape(-1,1,2)
    segments = np.concatenate([points[:-1],points[1:]], axis=1)
    lc = LineCollection(segments, array=z, cmap=plt.cm.get_cmap(cmap),
                        linewidth=3)
    # set color to s/w values
    lc.set_array(swFilter)
    ax.add_collection(lc)
    #fig.colorbar(lc)
    loc = mdates.AutoDateLocator()
    ax.xaxis.set_major_locator(loc)
    ax.xaxis.set_major_formatter(mdates.AutoDateFormatter(loc))
    ax.autoscale_view()
    def make_proxy(zvalue, scalar_mappable, **kwargs):
        color = scalar_mappable.cmap(scalar_mappable.norm(zvalue))
        return Line2D([0, 1], [0, 1], color=color, **kwargs)
    proxies = [make_proxy(item, lc, linewidth=2) for item in z]
    ax.legend(proxies, ['Winter', 'Summer'])
    plt.show()
md_plot4(dt64, md, swFilter)
+What is good about it:
It shows the legend in the way I need it.
It doesn't show a colorbar anymore.
-What is to optimize:
The plot isn't multicolored anymore.
Neither is the legend.
The classic style is not what I was looking for as I explained before...
So if anyone has a good advice please let me know!
I am using numpy version 1.16.2 and matplotlib version 3.0.3
To get a multicoloured plot in matplotlib, label your plots and then call the legend() function. The following sample code is taken from a link, but as links break, here's the post..
The chart used here is a line, but the same principle applies to other chart types, as you can see from this other SO answer
import matplotlib.pyplot as plt
import numpy as np
y = [2,4,6,8,10,12,14,16,18,20]
y2 = [10,11,12,13,14,15,16,17,18,19]
x = np.arange(10)
fig = plt.figure()
ax = plt.subplot(111)
ax.plot(x, y, label='$y = numbers')
ax.plot(x, y2, label='$y2 = other numbers')
plt.title('Legend inside')
ax.legend()
plt.show()
This code will show the following image (with the legend inside the chart)
Hope this helps
So here is the answer to how to create a basic legend for a multicolored line, with one label per color and without a colorbar next to the plot (see the update of the original question for more on the issues):
Thanks to a lot of helpful comments, I figured out that adding a norm to LineCollection() avoids ending up with a single-colored line when the colorbar is removed by disabling fig.colorbar() (also see this).
The additional argument to add was norm=plt.Normalize(z.min(), z.max()), where z is the array that holds the values responsible for the different segment colors. Note that z only needs to hold one element for each distinct color. This is why I wrapped my swFilter array, which holds one flag per data point, in np.unique().
To get a proper legend inside a box without touching plt.style.use(), I simply had to add the right arguments to ax.legend(). In my case a simple frameon=True did the job.
The result is the following:
Here is the code:
import matplotlib.dates as mdates
from matplotlib.collections import LineCollection
from matplotlib.colors import ListedColormap
from matplotlib.lines import Line2D
def md_plot4(dt64=np.array, md=np.array, swFilter=np.array):
    y, m, d = dt64.astype(int) // np.c_[[10000, 100, 1]] % np.c_[[10000, 100, 100]]
    dt64 = y.astype('U4').astype('M8') + (m-1).astype('m8[M]') + (d-1).astype('m8[D]')
    z = np.unique(swFilter)
    cmap = ListedColormap(['b','darkorange'])
    #fig =
    plt.figure('Test')
    plt.title("Marsdistanz unter Berücksichtigung der Halbjahre der steigenden und sinkenden Temperaturen\n",
              loc='left', wrap=True)
    plt.xlabel("Zeit in Jahren\n")
    plt.xticks(rotation = 45)
    plt.ylabel("Marsdistanz in AE\n(1 AE = 149.597.870,7 km)")
    plt.tight_layout()
    ax=plt.gca()
    plt.style.use('seaborn-whitegrid')
    #convert dates to numbers first
    inxval = mdates.date2num(dt64)
    points = np.array([inxval, md]).T.reshape(-1,1,2)
    segments = np.concatenate([points[:-1],points[1:]], axis=1)
    lc = LineCollection(segments, array=z, cmap=plt.cm.get_cmap(cmap),
                        linewidth=3, norm=plt.Normalize(z.min(), z.max()))
    # set color to s/w values
    lc.set_array(swFilter)
    ax.add_collection(lc)
    loc = mdates.AutoDateLocator()
    ax.xaxis.set_major_locator(loc)
    ax.xaxis.set_major_formatter(mdates.AutoDateFormatter(loc))
    ax.autoscale_view()
    def make_proxy(zvalue, scalar_mappable, **kwargs):
        color = scalar_mappable.cmap(scalar_mappable.norm(zvalue))
        return Line2D([0, 1], [0, 1], color=color, **kwargs)
    proxies = [make_proxy(item, lc, linewidth=2) for item in z]
    ax.legend(proxies, ['Halbjahr der sinkenden \nTemperaturen',
                        'Halbjahr der steigenden \nTemperaturen'], frameon=True)
    plt.show()
md_plot4(dt64, md, swFilter)
Note that I added plt.tight_layout() to ensure the title of the plot and the axis labels are shown without being cut off in window mode.
A new issue (resulting from adding tight_layout()) is that the plot gets compressed horizontally, even though there is plenty of space available on the right side of the plot (where a colorbar would appear if one were added).
This requires another fix, but currently I don't know how to do it. So if anyone knows how to keep the plot's title and axis labels from getting cut off in window mode, I would be very grateful if you left a comment.
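For readers who want to try the technique without the Mars data, here is a reduced sketch with synthetic values: a LineCollection colored by a 0/1 flag through an explicit norm, plus proxy Line2D handles for the legend. The sine curve, the flag rule and the label texts are invented for illustration; only the overall approach follows the answer above.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.collections import LineCollection
from matplotlib.colors import ListedColormap
from matplotlib.lines import Line2D

# Synthetic curve and an invented 0/1 flag per sample.
x = np.linspace(0, 4 * np.pi, 200)
y = 1.5 + np.sin(x)
flags = (np.sin(x) > 0).astype(float)

# Build one short segment between each pair of neighbouring points.
points = np.column_stack([x, y]).reshape(-1, 1, 2)
segments = np.concatenate([points[:-1], points[1:]], axis=1)

cmap = ListedColormap(["b", "darkorange"])
norm = plt.Normalize(0, 1)  # fixes the value-to-color mapping without a colorbar

lc = LineCollection(segments, cmap=cmap, norm=norm, linewidth=3)
lc.set_array(flags[:-1])  # one color value per segment

fig, ax = plt.subplots()
ax.add_collection(lc)
ax.autoscale_view()

# Proxy artists: plain lines in the two colors, used only for the legend.
proxies = [Line2D([0], [0], color=cmap(norm(v)), linewidth=3) for v in (0, 1)]
legend = ax.legend(proxies, ["flag = 0", "flag = 1"],
                   loc="upper right", frameon=True)
```

Passing flags[:-1] gives one value per segment, since N points produce N-1 segments.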

Exception has occurred: TypeError:only size-1 arrays can be converted to Python scalars

That's my first post here.
I'm doing a project in Python about Football Scores statistics and prediction.
I got the idea from this project and was trying to recreate it, but it gives me an error like this.
I'm rewriting the code for my needs, but even if I copy and paste the original, it gives me the same error, while in the original post everything seems to work.
This is the offending line:
ax1.bar(chel_home.index-0.4,chel_home.values,width=0.4,color="#034694",label="Chelsea")
And it just says that "only size-1 arrays can be converted to Python scalars", but I don't really know where the problem could be because that's one of my first approaches with Python.
The full code is this:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import seaborn
from scipy.stats import poisson,skellam
epl_1617 = pd.read_csv("http://www.football-data.co.uk/mmz4281/1617/E0.csv")
epl_1617 = epl_1617[['HomeTeam','AwayTeam','FTHG','FTAG']]
epl_1617 = epl_1617.rename(columns={'FTHG': 'HomeGoals', 'FTAG': 'AwayGoals'})
epl_1617.head()
epl_1617 = epl_1617[:-10]
epl_1617.mean()
# construct Poisson for each mean goals value
poisson_pred = np.column_stack([[poisson.pmf(i, epl_1617.mean()[j]) for i in range(8)] for j in range(2)])
# plot histogram of actual goals
plt.hist(epl_1617[['HomeGoals', 'AwayGoals']].values, range(9),
alpha=0.7, label=['Home', 'Away'],normed=True, color=["#FFA07A", "#20B2AA"])
# add lines for the Poisson distributions
pois1, = plt.plot([i-0.5 for i in range(1,9)], poisson_pred[:,0],
linestyle='-', marker='o',label="Home", color = '#CD5C5C')
pois2, = plt.plot([i-0.5 for i in range(1,9)], poisson_pred[:,1],
linestyle='-', marker='o',label="Away", color = '#006400')
leg=plt.legend(loc='upper right', fontsize=13, ncol=2)
leg.set_title("Poisson Actual ", prop = {'size':'14',
'weight':'bold'})
plt.xticks([i-0.5 for i in range(1,9)],[i for i in range(9)])
plt.xlabel("Goals per Match",size=13)
plt.ylabel("Proportion of Matches",size=13)
plt.title("Number of Goals per Match (EPL 2016/17 Season)",size=14,fontweight='bold')
plt.ylim([-0.004, 0.4])
plt.tight_layout()
plt.show()
# probability of draw between home and away team
skellam.pmf(0.0, epl_1617.mean()[0], epl_1617.mean()[1])
# probability of home team winning by one goal
skellam.pmf(1, epl_1617.mean()[0], epl_1617.mean()[1])
skellam_pred = [skellam.pmf(i, epl_1617.mean()[0], epl_1617.mean()[1]) for i in range(-6,8)]
plt.hist(epl_1617[['HomeGoals']].values - epl_1617[['AwayGoals']].values, range(-6,8),
alpha=0.7, label='Actual',normed=True)
plt.plot([i+0.5 for i in range(-6,8)], skellam_pred,
linestyle='-', marker='o',label="Skellam", color = '#CD5C5C')
plt.legend(loc='upper right', fontsize=13)
plt.xticks([i+0.5 for i in range(-6,8)],[i for i in range(-6,8)])
plt.xlabel("Home Goals - Away Goals",size=13)
plt.ylabel("Proportion of Matches",size=13)
plt.title("Difference in Goals Scored (Home Team vs Away Team)",size=14,fontweight='bold')
plt.ylim([-0.004, 0.26])
plt.tight_layout()
plt.show()
It works perfectly until this point, then there's the part that is giving me that error:
fig,(ax1,ax2) = plt.subplots(2, 1)
chel_home = epl_1617[epl_1617['HomeTeam']=='Chelsea'][['HomeGoals']].apply(pd.value_counts,normalize=True)
chel_home_pois = [poisson.pmf(i,np.sum(np.multiply(chel_home.values.T,chel_home.index.T),axis=1)[0]) for i in range(8)]
sun_home = epl_1617[epl_1617['HomeTeam']=='Sunderland'][['HomeGoals']].apply(pd.value_counts,normalize=True)
sun_home_pois = [poisson.pmf(i,np.sum(np.multiply(sun_home.values.T,sun_home.index.T),axis=1)[0]) for i in range(8)]
chel_away = epl_1617[epl_1617['AwayTeam']=='Chelsea'][['AwayGoals']].apply(pd.value_counts,normalize=True)
chel_away_pois = [poisson.pmf(i,np.sum(np.multiply(chel_away.values.T,chel_away.index.T),axis=1)[0]) for i in range(8)]
sun_away = epl_1617[epl_1617['AwayTeam']=='Sunderland'][['AwayGoals']].apply(pd.value_counts,normalize=True)
sun_away_pois = [poisson.pmf(i,np.sum(np.multiply(sun_away.values.T,sun_away.index.T),axis=1)[0]) for i in range(8)]
ax1.bar(chel_home.index-0.4,chel_home.values,width=0.4,color="#034694",label="Chelsea")
ax1.bar(sun_home.index,sun_home.values,width=0.4,color="#EB172B",label="Sunderland")
pois1, = ax1.plot([i for i in range(8)], chel_home_pois,
linestyle='-', marker='o',label="Chelsea", color = "#0a7bff")
pois1, = ax1.plot([i for i in range(8)], sun_home_pois,
linestyle='-', marker='o',label="Sunderland", color = "#ff7c89")
leg=ax1.legend(loc='upper right', fontsize=12, ncol=2)
leg.set_title("Poisson Actual ", prop = {'size':'14', 'weight':'bold'})
ax1.set_xlim([-0.5,7.5])
ax1.set_ylim([-0.01,0.65])
ax1.set_xticklabels([])
# mimicing the facet plots in ggplot2 with a bit of a hack
ax1.text(7.65, 0.585, ' Home ', rotation=-90,
bbox={'facecolor':'#ffbcf6', 'alpha':0.5, 'pad':5})
ax2.text(7.65, 0.585, ' Away ', rotation=-90,
bbox={'facecolor':'#ffbcf6', 'alpha':0.5, 'pad':5})
ax2.bar(chel_away.index-0.4,chel_away.values,width=0.4,color="#034694",label="Chelsea")
ax2.bar(sun_away.index,sun_away.values,width=0.4,color="#EB172B",label="Sunderland")
pois1, = ax2.plot([i for i in range(8)], chel_away_pois,
linestyle='-', marker='o',label="Chelsea", color = "#0a7bff")
pois1, = ax2.plot([i for i in range(8)], sun_away_pois,
linestyle='-', marker='o',label="Sunderland", color = "#ff7c89")
ax2.set_xlim([-0.5,7.5])
ax2.set_ylim([-0.01,0.65])
ax1.set_title("Number of Goals per Match (EPL 2016/17 Season)",size=14,fontweight='bold')
ax2.set_xlabel("Goals per Match",size=13)
ax2.text(-1.15, 0.9, 'Proportion of Matches', rotation=90, size=13)
plt.tight_layout()
plt.show()
Here another graph should appear, but instead it just says: "only size-1 arrays can be converted to Python scalars".
I don't really know what to do and I'm starting to go crazy, so I really hope that you can help me.
Thank you in advance and have a nice day everybody!
The problem is that your arrays for the bar plot are 2d arrays and you have to flatten them. This can be easily done using .flatten() which converts the 2d arrays in your code into 1-d arrays. If you look at chel_home.values, it looks like
array([[0.33333333],
[0.22222222],
[0.22222222],
[0.16666667],
[0.05555556]])
whereas what you need is
array([0.33333333, 0.22222222, 0.22222222, 0.16666667, 0.05555556])
Just replace the plotting commands in your code with the following lines
ax1.bar(chel_home.index-0.4,chel_home.values.flatten(),width=0.4,color="#034694",label="Chelsea")
ax1.bar(sun_home.index, sun_home.values.flatten(),width=0.4,color="#EB172B",label="Sunderland")
ax2.bar(chel_away.index-0.4,chel_away.values.flatten(),width=0.4,color="#034694",label="Chelsea")
ax2.bar(sun_away.index,sun_away.values.flatten(),width=0.4,color="#EB172B",label="Sunderland")
You can also use .ravel() instead of .flatten()
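A small NumPy-only sketch of the difference, using the values printed above: both calls turn the (5, 1) column into a 1-D array of length 5; flatten() always returns a copy, while ravel() returns a view of the original data when it can.

```python
import numpy as np

# The 2-D column as shown for chel_home.values above.
column = np.array([[0.33333333],
                   [0.22222222],
                   [0.22222222],
                   [0.16666667],
                   [0.05555556]])

flat_copy = column.flatten()  # always a new 1-D array
flat_view = column.ravel()    # a view here, since the data is contiguous

print(column.shape, flat_copy.shape, flat_view.shape)
print(np.shares_memory(column, flat_copy), np.shares_memory(column, flat_view))
```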
