Memory leak when using matplotlib.collection.LineCollection

I am using the following code to create a collection of color coded line plots:
for j in idlist[i]:
    single_traj(lonarray, latarray, parray)
plt.savefig(savename, dpi = 400)
plt.close('all')
plt.clf()
where:
def single_traj(lonarray, latarray, parray, linewidth = 0.7):
    """
    Plots XY Plot of one trajectory, with color as a function of p
    Helper Function for DrawXYTraj
    """
    global lc
    x = lonarray
    y = latarray
    p = parray
    points = np.array([x,y]).T.reshape(-1,1,2)
    segments = np.concatenate([points[:-1], points[1:]], axis=1)
    lc = col.LineCollection(segments, cmap=plt.get_cmap('Spectral'),
                            norm=plt.Normalize(100, 1000), alpha = 0.8)
    lc.set_array(p)
    lc.set_linewidth(linewidth)
    plt.gca().add_collection(lc)
Somehow, this loop uses a lot of memory (> ~10GB), which is still being used after the plot is saved.
I used hpy to look at memory usage:
Partition of a set of 27472988 objects. Total size = 10990671168 bytes.
 Index   Count   %       Size   %  Cumulative   %  Kind (class / dict of class)
     0 8803917  32 9226505016  84  9226505016  84  dict of matplotlib.path.Path
     1 8888542  32  711083360   6  9937588376  90  numpy.ndarray
     2 8803917  32  563450688   5 10501039064  96  matplotlib.path.Path
     3      11   0  219679112   2 10720718176  98  guppy.sets.setsc.ImmNodeSet
     4   25407   0   77593848   1 10798312024  98  list
     5   89367   0   28232616   0 10826544640  99  dict (no owner)
     6    7642   0   25615984   0 10852160624  99  dict of matplotlib.collections.LineCollection
     7   15343   0   16079464   0 10868240088  99  dict of matplotlib.transforms.CompositeGenericTransform
     8   15327   0   16062696   0 10884302784  99  dict of matplotlib.transforms.Bbox
     9   53741   0   15047480   0 10899350264  99  dict of weakref.WeakValueDictionary
At this point the plot is already saved, so all matplotlib-related objects should be gone. But I can't "find" these objects, which means I don't know how to delete them.
EDIT:
Here is a stand-alone example which reproduces the leak (savefig throws an error for some reason but isn't relevant anyway):
# Memory leak test!
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
import matplotlib.collections as col
def draw():
    x = range(1000)
    y = range(1000)
    p = range(1000)
    fig = plt.figure(figsize = (12,8))
    ax = plt.gca()
    ax.set_aspect('equal')
    # add 1000 identical line collections to the same axes
    for i in range(1000):
        if i%100 == 0:
            print(i)
        points = np.array([x,y]).T.reshape(-1,1,2)
        segments = np.concatenate([points[:-1], points[1:]], axis=1)
        lc = col.LineCollection(segments, cmap=plt.get_cmap('Spectral'),
                                norm=plt.Normalize(0, 1000), alpha = 0.8)
        lc.set_array(p)
        lc.set_linewidth(0.7)
        plt.gca().add_collection(lc)
    cb = fig.colorbar(lc, shrink = 0.7)
    cb.set_label('p')
    cb.ax.invert_yaxis()
    plt.tight_layout()
    #plt.savefig('./mem_test.png', dpi = 400)
    plt.close('all')
    plt.clf()
draw()
a = input('Wait...')
The draw() function should delete all plt objects, but they still use up memory after the function has returned. I checked this with top/htop.

It seems from your hpy dump that the memory hog consists of a large number of matplotlib.path.Path objects. This may be due to your variable lc. Have you tried del lc? plt.close may not be able to delete them (and at least should not be able to!), because they are still referenced by your global variable lc.
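A minimal sketch of that idea, assuming the non-interactive Agg backend and a hypothetical helper add_trajectory (neither comes from the question): keep the collection in a local variable rather than a global, close the figure, drop your own references, and then let the garbage collector run. The weak reference is only there to check whether the collection is still alive; whether the memory is actually returned may depend on the matplotlib version.

```python
import gc
import weakref

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.collections as col
import matplotlib.pyplot as plt
import numpy as np


def add_trajectory(ax, x, y, p, linewidth=0.7):
    """Hypothetical helper: add one colour-coded trajectory without using a global."""
    points = np.array([x, y]).T.reshape(-1, 1, 2)
    segments = np.concatenate([points[:-1], points[1:]], axis=1)
    lc = col.LineCollection(segments, cmap=plt.get_cmap("Spectral"),
                            norm=plt.Normalize(100, 1000), alpha=0.8)
    lc.set_array(np.asarray(p)[:-1])  # one colour value per segment
    lc.set_linewidth(linewidth)
    ax.add_collection(lc)
    return lc


fig, ax = plt.subplots()
values = np.linspace(100.0, 1000.0, 50)
lc = add_trajectory(ax, values, values, values)
probe = weakref.ref(lc)  # used only to check later whether the collection survived

plt.close(fig)    # drop pyplot's reference to the figure
del lc, ax, fig   # drop our own references
gc.collect()      # collect reference cycles between figure, axes and artists

open_figures = plt.get_fignums()
print("open figures:", open_figures)
print("collection still alive:", probe() is not None)
```

If the last line still reports True, something else is holding a reference to the collection, and gc.get_referrers(probe()) is one way to look for it.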

Related

Matplotlib PathPatch Colors and Legends not Matching

I have a dataset that is a list of lists.
Each list is a category to be plotted as a box plot.
Each list has a list of up to 9 components to be plotted into subplots.
The function I am using is below and was based on this answer. I pulled it out of my work and added some mock data, so it should be a minimal example.
neonDict = {
    0:0, 1:1, 2:2, 3:3, 4:4, 5:5, 6:6, 7:7, 8:8
}
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
def coloredBoxPlot(axis, data,edgeColor,fillColor):
    bp = axis.boxplot(data,vert=False,patch_artist=True)
    for element in ['boxes', 'whiskers', 'fliers', 'means', 'medians', 'caps']:
        plt.setp(bp[element], color=edgeColor)
    for patch in bp['boxes']:
        patch.set(facecolor=fillColor)
    return bp
def plotCalStats(data, prefix='Channel', savedir=None,colors=['#00597c','#a8005c','#00aeea','#007d50','#400080','#e07800'] ):
    csize = mpl.rcParams['figure.figsize']
    cdpi = mpl.rcParams['figure.dpi']
    mpl.rcParams['figure.figsize'] = (12,8)
    mpl.rcParams['figure.dpi'] = 1080
    pkdata = []
    labels = []
    lstyles = []
    fg, ax = plt.subplots(3,3)
    for pk in range(len(neonDict)):
        px = pk // 3
        py = pk % 3
        ax[px,py].set_xlabel('Max Pixel')
        ax[px,py].set_ylabel('')
        ax[px,py].set_title(str(neonDict[pk]) + ' nm')
        pkdata.append([])
    for cat in range(len(data)):
        bp = ''
        for acal in data[cat]:
            for apeak in acal.peaks:
                pkdata[apeak].append(acal.peaks[apeak][0])
        for pk in range(9):
            px = pk // 3
            py = pk % 3
            bp = coloredBoxPlot(ax[px,py], pkdata[pk], colors[cat], '#ffffff')
        if len(data[cat]) > 0:
            #print(colors[cat])
            #print(bp['boxes'][0].get_edgecolor())
            labels.append(prefix+' '+str(cat))
            lstyles.append(bp['boxes'][0])
    fg.legend(lstyles,labels)
    fg.suptitle('Calibration Summary by '+prefix)
    fg.tight_layout()
    if savedir is not None:
        plt.savefig(savedir + 'Boxplots.png')
    plt.show()
    mpl.rcParams['figure.figsize'] = csize
    mpl.rcParams['figure.dpi'] = cdpi
    return
class acal:
    def __init__(self):
        self.peaks = {}
        for x in range(9):
            self.peaks[x] = (np.random.randint(20*x,20*(x+1)),)
mockData = [[acal() for y in range(100)] for x in range(6)]
#Some unused channels
mockData[2] = []
mockData[3] = []
mockData[4] = []
plotCalStats(mockData)
So the issue is that the plot colors do not match the legend. This happens even though I only add a label if data exists (which should rule out any issue with calling boxplot with an empty data set and not getting an appropriate PathPatch).
The printouts verify that the colors are correctly stored in the PathPatch. (I can add my digits -> hex converter if that is questioned.)
Attached is the output. One can see I get a purple box but no purple in the legend. Purple is category 4, which is empty.
Any ideas why the labels don't match the actual style? Thanks much!
EDITS:
To address question on 'confusing'.
I have six categories of data, and each category comes from a single event. Each event has 9 components. I want to compare all events, for each individual component, for each category, on a single plot as shown below.
Each subplot is an individual component, built from the series of data for each category (Channel).
The link I provided (which, as I said, my code is adapted from) shows how to create a single box plot on one axis for 2 data sets. I've basically done the same thing for 6 data sets on 9 axes, where 3 data sets are empty. They don't have to be empty; I did that to illustrate the issue. If all 6 data sets are there, how can you tell the colors are messed up?
Regarding the alpha:
The alphas are always 'ff' when giving only RGB data to matplotlib. If I call get_edgecolors, it will return a tuple (RGBA) where A = 1.0.
See commented out print statement.
EDIT2:
If I restrict it down to a single category, it makes the box plot view less confusing.
Single Example (see how box plot color is orange, figure says it's blue)
All colors off
Feel like this used to work....

I am not certain why the error presented the way it did, but the issue comes from how the data is reformatted before the box plots are created.
Removing pkdata.append([]) from the loop that sets up the subplots, and instead resetting
pkdata = [[],[],[],[],[],[],[],[],[]] at the start of each iteration of the category loop, fixed the issue. The original version was passing all previous channel data to every box plot.
Output is now better. Full solution attached.
Most likely, because the plot uses data from pkdata, an empty channel (data[cat]) plotted the previous data, which was still in pkdata (in fact, the data of all previous categories was still there). I only check data[cat] on each loop to decide whether to add to the legend. So the legend was set up for channels 0, 1 and 5, but the plots showed channel 0 as 0, 0+1 as 1, 0+1 as 2, 0+1 as 3, 0+1 as 4, and 0+1+5 as 5. Channel 4 (purple) therefore had data to plot but was not added to the legend, which gave the impression of misaligned legend entries when the real problem was data without a legend entry.
The single-channel plot is actually all 6 channels overlapping: the final channel 5 color, orange, is drawn on top of all the others, including the original channel 0 data, which is where the data belongs and which was properly added to the legend.
neonDict = {
    0:0, 1:1, 2:2, 3:3, 4:4, 5:5, 6:6, 7:7, 8:8
}
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
def getHex(r,g,b,a=1.0):
    # Convert RGBA floats in [0, 1] to a '#rrggbbaa' string
    colors = [int(round(r * 255)),int(round(g * 255)),int(round(b * 255)),int(round(a * 255))]
    s = '#'
    for c in colors:
        # zero-pad on the left so that e.g. 5 becomes '05', not '50'
        s += '{:02x}'.format(c)
    return s
def getRGB(colstr):
    # Convert '#rrggbb' or '#rrggbbaa' to an (r, g, b, a) tuple of floats
    try:
        r = int(colstr[1:3],16) / 255
        g = int(colstr[3:5],16) / 255
        b = int(colstr[5:7],16) / 255
        if len (colstr) == 7:
            a = 1.0
        else:
            a = int(colstr[7:],16) / 255
        return (r,g,b,a)
    except Exception as e:
        print(e)
        raise e
def compareHexColors(col1,col2):
    ## ASSUME #RRGGBB or #RRGGBBAA
    ## If no alpha is given (shorter than 9 characters), append 'ff'
    if len(col1) < 9:
        col1 += 'ff'
    if len(col2) < 9:
        col2 += 'ff'
    return col1.lower() == col2.lower()
def coloredBoxPlot(axis, data,edgeColor,fillColor):
    bp = axis.boxplot(data,vert=False,patch_artist=True)
    for element in ['boxes', 'whiskers', 'fliers', 'means', 'medians', 'caps']:
        plt.setp(bp[element], color=edgeColor)
    for patch in bp['boxes']:
        patch.set(facecolor=fillColor)
    return bp
def plotCalStats(data, prefix='Channel', savedir=None,colors=['#00597c','#a8005c','#00aeea','#007d50','#400080','#e07800'] ):
    csize = mpl.rcParams['figure.figsize']
    cdpi = mpl.rcParams['figure.dpi']
    mpl.rcParams['figure.figsize'] = (12,8)
    mpl.rcParams['figure.dpi'] = 1080
    pkdata = []
    labels = []
    lstyles = []
    fg, ax = plt.subplots(3,3)
    for pk in range(len(neonDict)):
        px = pk // 3
        py = pk % 3
        ax[px,py].set_xlabel('Max Pixel')
        ax[px,py].set_ylabel('')
        ax[px,py].set_title(str(neonDict[pk]) + ' nm')
    for cat in range(len(data)):
        bp = ''
        pkdata = [[],[],[],[],[],[],[],[],[]]
        for acal in data[cat]:
            for apeak in acal.peaks:
                pkdata[apeak].append(acal.peaks[apeak][0])
        for pk in range(9):
            px = pk // 3
            py = pk % 3
            bp = coloredBoxPlot(ax[px,py], pkdata[pk], colors[cat], '#ffffff')
        if len(data[cat]) > 0:
            print(compareHexColors(colors[cat],getHex(*bp['boxes'][0].get_edgecolor())))
            labels.append(prefix+' '+str(cat))
            lstyles.append(bp['boxes'][0])
    fg.legend(lstyles,labels)
    fg.suptitle('Calibration Summary by '+prefix)
    fg.tight_layout()
    if savedir is not None:
        plt.savefig(savedir + 'Boxplots.png')
    plt.show()
    mpl.rcParams['figure.figsize'] = csize
    mpl.rcParams['figure.dpi'] = cdpi
    return
class acal:
    def __init__(self,center):
        self.peaks = {}
        for x in range(9):
            self.peaks[x] = [10*x + (center) + (np.random.randint(10)-1)/2.0,0,0]
mockData = [[acal(x) for y in range(1000)] for x in range(6)]
#Some unused channels
mockData[2] = []
mockData[3] = []
mockData[4] = []
plotCalStats(mockData)
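
The accumulation effect described in this answer can be reproduced without any plotting. This is a minimal sketch, not the code from the question: collect_shared and collect_reset are hypothetical helpers, and the three tiny categories are made-up data. Each helper records what would be handed to the box plot for every category.

```python
def collect_shared(categories, n_peaks=3):
    """Buggy pattern: one pkdata buffer is created once and reused for all categories."""
    pkdata = [[] for _ in range(n_peaks)]
    seen = []
    for cat in categories:
        for peaks in cat:
            for idx, value in enumerate(peaks):
                pkdata[idx].append(value)
        # snapshot of what would be plotted for this category
        seen.append([list(p) for p in pkdata])
    return seen


def collect_reset(categories, n_peaks=3):
    """Fixed pattern: pkdata is rebuilt at the start of every category."""
    seen = []
    for cat in categories:
        pkdata = [[] for _ in range(n_peaks)]
        for peaks in cat:
            for idx, value in enumerate(peaks):
                pkdata[idx].append(value)
        seen.append(pkdata)
    return seen


# made-up data: category 1 is empty, like the unused channels in the question
categories = [[(1, 2, 3)], [], [(7, 8, 9)]]
shared = collect_shared(categories)
reset = collect_reset(categories)

print(shared[1])  # the empty category still carries category 0's values
print(reset[1])   # the empty category really is empty
```

With the shared buffer, the empty category is drawn with the previous category's data but never gets a legend entry, which matches the symptom in the question.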

Bar chart with customised width in Python

I have this dataframe df which contains -
Name  Team Name  Category  Challenge  Points  Time
A     B          1         1ABC       50      2019-11-04 07:37:02
D     B          2         2ACE       150     2019-11-04 09:57:02
X     P          4         4PQR       500     2019-11-05 08:45:02
A     B          3         3PQR       10      2019-11-04 10:25:20
N     P          4         4ABC       120     2019-11-05 08:35:00
C     G          1         1ABC       50      2019-11-04 07:37:02
D     B          4         4RST       200     2019-11-04 10:57:02
I have this ambitious plan of visualizing this dataset as a customised bar chart where each team has a building (bar) made of different blocks of varying width (depending on the points associated with that challenge), and the vertical order of the blocks depends on the time (the first one goes at the bottom). In short, the plot for the above data should roughly look like this -
The different colours represent the different categories here. I know how to group the data by team and then plot each team's number of attempts with -
df.groupby(['Team Name'])['Challenge'].count().plot.bar()
but beyond that, I'm clueless as to how to change the bar widths. Can someone help with this?
Alternatively, if someone has a better idea of how to visualise this using any of the conventional plots, I'd love to hear your opinions too.
Thanks!

Does this look like what you want?
You can accomplish this by manually plotting the 'blocks' via matplotlib.patches; it just requires some extra manipulation to do so algorithmically. Here is an example using the data supplied in the question:
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle
import numpy as np
import pandas as pd
# df is the DataFrame shown in the question
# first 4 colors of the tableau 20 palette, scaled to the 0-1 range
t20 = [(31, 119, 180), (174, 199, 232), (255, 127, 14), (255, 187, 120)]
for i in range(len(t20)):
    r, g, b = t20[i]
    t20[i] = (r / 255., g / 255., b / 255.)
fig, ax = plt.subplots(1)
df['Time'] = pd.to_datetime(df['Time'])
df = df.sort_values('Time')
cat = df['Category'].unique()
cidx = dict(zip(cat, range(len(cat))))
mw = max(df['Points'])
names = list(df['Team Name'].unique())
nt = len(names)
h = 0.5
# current stack height of each team's bar
hs = [0]*nt
for ii in range(len(df.index)):
    # block width is proportional to the points, centered on the team's x position
    w = float(df['Points'].iloc[ii])/mw
    idx = names.index(df['Team Name'].iloc[ii])
    r = Rectangle((idx - w/2.0, hs[idx]), w, h, color=t20[cidx[df['Category'].iloc[ii]]])
    hs[idx] += h
    ax.add_patch(r)
plt.xlim([-0.5, len(names)-0.5])
plt.ylim([0, max(hs)+3])
plt.xticks(range(len(names)), names)
plt.show()
I used the first 4 colors in the tableau 20 palette in case you were interested.
Edit
You can add a legend with the line
plt.legend(handles=[Patch(facecolor=t20[ii], label=cat[ii]) for ii in range(len(t20))])
as long as Patch is also imported from matplotlib.patches, i.e.
from matplotlib.patches import Rectangle, Patch
And the output will be
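
For reference, the block geometry used in the answer above can be computed separately from the drawing, which makes it easier to check. This is only a sketch: block_geometry is a hypothetical helper, the DataFrame is rebuilt by hand from the table in the question, and a stable sort is assumed so that rows with equal timestamps keep their original order.

```python
import pandas as pd


def block_geometry(df, height=0.5):
    """Return one (team, x_left, y_bottom, width, height, category) tuple per row."""
    # mergesort is stable, so ties in Time keep their original row order
    ordered = df.assign(Time=pd.to_datetime(df["Time"])).sort_values("Time", kind="mergesort")
    teams = list(ordered["Team Name"].unique())
    max_points = ordered["Points"].max()
    stack_top = {team: 0.0 for team in teams}  # current height of each team's bar
    blocks = []
    for team, points, category in zip(ordered["Team Name"], ordered["Points"], ordered["Category"]):
        width = float(points) / float(max_points)  # widest block has width 1
        x_center = teams.index(team)
        blocks.append((team, x_center - width / 2.0, stack_top[team], width, height, category))
        stack_top[team] += height
    return blocks


# data copied from the table in the question
df = pd.DataFrame({
    "Name": ["A", "D", "X", "A", "N", "C", "D"],
    "Team Name": ["B", "B", "P", "B", "P", "G", "B"],
    "Category": [1, 2, 4, 3, 4, 1, 4],
    "Challenge": ["1ABC", "2ACE", "4PQR", "3PQR", "4ABC", "1ABC", "4RST"],
    "Points": [50, 150, 500, 10, 120, 50, 200],
    "Time": ["2019-11-04 07:37:02", "2019-11-04 09:57:02", "2019-11-05 08:45:02",
             "2019-11-04 10:25:20", "2019-11-05 08:35:00", "2019-11-04 07:37:02",
             "2019-11-04 10:57:02"],
})
blocks = block_geometry(df)
team_b = [b for b in blocks if b[0] == "B"]
for block in blocks:
    print(block)
```

Each tuple can be passed on to Rectangle((x_left, y_bottom), width, height) as in the answer.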

2D bin (x,y) and calculate mean of values (c) of 10 deepest data points (z)

For a data set consisting of:
coordinates x, y
depth z
a certain value c
I would like to do the following more efficiently:
bin the data set in 2D bins based on the coordinates (x, y)
take the 10 deepest data points (z) per bin
calculate the mean value of c of these 10 data points per bin
Finally show a 2d heatmap with the calculated mean values.
I have found a working solution, but this takes too long for small bins and/or large data sets.
Is there a more efficient way of achieving the same result?
Current working example
Example dataframe:
import numpy as np
from numpy.random import rand
import pandas as pd
import math
import matplotlib.pyplot as plt
n = 10000
df = pd.DataFrame({'x':rand(n), 'y':rand(n), 'z':rand(n), 'c':rand(n)})
Bin the data set:
cell_size = 0.01
nx = math.ceil((max(df['x']) - min(df['x'])) / cell_size)
ny = math.ceil((max(df['y']) - min(df['y'])) / cell_size)
x_range = np.arange(0, nx)
y_range = np.arange(0, ny)
df['xbin'], x_edges = pd.cut(x=df['x'], bins=nx, labels=x_range, retbins=True)
df['ybin'], y_edges = pd.cut(x=df['y'], bins=ny, labels=y_range, retbins=True)
Code that now takes too long:
df = df.groupby(['xbin', 'ybin']).apply(
    lambda d: d.sort_values('z').head(10).mean())
Update an empty DataFrame for the bins without data and show result:
index = pd.MultiIndex.from_product([x_range, y_range],
                                   names=['xbin', 'ybin'])
tot_df = pd.DataFrame(index=index, columns=['z', 'c'])
tot_df.update(df)
zval = tot_df['c'].astype('float').values
zval = zval.reshape((nx, ny))
zval = zval.T
zval = np.flipud(zval)
extent = [min(x_edges), max(x_edges), min(y_edges), max(y_edges)]
plt.matshow(zval, aspect='auto', extent=extent)
plt.show()

You can use np.searchsorted to bin the rows by x and y, and then use groupby to take the 10 deepest values per bin and calculate their mean. Because groupby maintains the row order within each group, you can sort by z once before assigning the bins. groupby also performs better without apply.
df = pd.DataFrame({'x':rand(n), 'y':rand(n), 'z':rand(n), 'c':rand(n)})
df = df.sort_values("z", ascending=False)
bins = np.linspace(0, 1, 11)
df["bin_x"] = np.searchsorted(bins, df['x'].values) - 1
df["bin_y"] = np.searchsorted(bins, df['y'].values) - 1
result = df.groupby(["bin_x", "bin_y"]).head(10)
result.groupby(["bin_x", "bin_y"])["c"].mean()
Result
bin_x  bin_y
0      0        0.369531
       1        0.601803
       2        0.554452
       3        0.575464
       4        0.455198
                  ...
9      5        0.469838
       6        0.420772
       7        0.367549
       8        0.379200
       9        0.523083
Name: c, Length: 100, dtype: float64
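
The same approach can be wrapped in a small function and checked on a tiny hand-made data set. This is a sketch under stated assumptions: mean_of_deepest is a hypothetical name, "deepest" is taken to mean the largest z (as in the answer's ascending=False), and the bins are assigned with floor division by a cell size instead of np.searchsorted, which is a swapped-in but equivalent way to get regular bins starting at 0.

```python
import numpy as np
import pandas as pd


def mean_of_deepest(df, cell_size=0.1, k=10):
    """Mean of c over the k largest-z rows in each (x, y) cell."""
    # sort once, deepest first; groupby().head(k) then keeps the first k rows per cell
    out = df.sort_values("z", ascending=False).copy()
    out["bin_x"] = np.floor(out["x"] / cell_size).astype(int)
    out["bin_y"] = np.floor(out["y"] / cell_size).astype(int)
    top = out.groupby(["bin_x", "bin_y"]).head(k)
    return top.groupby(["bin_x", "bin_y"])["c"].mean()


# made-up data: three points fall in cell (0, 0), one in cell (5, 5)
demo = pd.DataFrame({
    "x": [0.05, 0.06, 0.07, 0.55],
    "y": [0.05, 0.06, 0.07, 0.55],
    "z": [1.0, 3.0, 2.0, 5.0],
    "c": [10.0, 20.0, 30.0, 40.0],
})
result = mean_of_deepest(demo, cell_size=0.1, k=2)
print(result)
```

In cell (0, 0) the two deepest rows have c = 20 and c = 30, so the mean is 25; the row with z = 1 is dropped.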

Adding a single label to the legend for a series of different data points plotted inside a designated bin in Python using matplotlib.pyplot.plot()

I have a script for plotting astronomical data of redmapping clusters from a csv file. I can read the data points from it and want to plot them in different colors depending on their redshift values: I am binning the dataset into 3 bins (0.1-0.2, 0.2-0.25, 0.25-0.31) based on the redshift.
The problem arises in my code after I determine which bin a data point belongs to: I want to have 3 labels in the legend corresponding to the red, green and blue data points, but this is not happening and I don't know why. I am using plot() instead of scatter() because I also have to draw the best fit to the data in the same figure, so everything needs to be in 1 figure.
import numpy as np
import matplotlib.pyplot as py
import csv
z = open("Sheet4CSV.csv","rU")
data = csv.reader(z)
x = []
y = []
ylow = []
yupp = []
xlow = []
xupp = []
redshift = []
for r in data:
    x.append(float(r[2]))
    y.append(float(r[5]))
    xlow.append(float(r[3]))
    xupp.append(float(r[4]))
    ylow.append(float(r[6]))
    yupp.append(float(r[7]))
    redshift.append(float(r[1]))
from operator import sub
xerr_l = map(sub,x,xlow)
xerr_u = map(sub,xupp,x)
yerr_l = map(sub,y,ylow)
yerr_u = map(sub,yupp,y)
py.xlabel("$Original\ Tx\ XCS\ pipeline\ Tx\ keV$")
py.ylabel("$Iterative\ Tx\ pipeline\ keV$")
py.xlim(0,12)
py.ylim(0,12)
py.title("Redmapper Clusters comparison of Tx pipelines")
ax1 = py.subplot(111)
##Problem starts here after the previous line##
for p in redshift:
    for i in xrange(84):
        p=redshift[i]
        if 0.1<=p<0.2:
            ax1.plot(x[i],y[i],color="b", marker='.', linestyle = " ")#, label = "$z < 0.2$")
            exit
        if 0.2<=p<0.25:
            ax1.plot(x[i],y[i],color="g", marker='.', linestyle = " ")#, label="$0.2 \leq z < 0.25$")
            exit
        if 0.25<=p<=0.3:
            ax1.plot(x[i],y[i],color="r", marker='.', linestyle = " ")#, label="$z \geq 0.25$")
            exit
##There seems nothing wrong after this point##
py.errorbar(x,y,yerr=[yerr_l,yerr_u],xerr=[xerr_l,xerr_u], fmt= " ",ecolor='magenta', label="Error bars")
cof = np.polyfit(x,y,1)
p = np.poly1d(cof)
l = np.linspace(0,12,100)
py.plot(l,p(l),"black",label="Best fit")
py.plot([0,15],[0,15],"black", linestyle="dotted", linewidth=2.0, label="line $y=x$")
py.grid()
box = ax1.get_position()
ax1.set_position([box.x1,box.y1,box.width, box.height])
py.legend(loc='center left',bbox_to_anchor=(1,0.5))
py.show()
In the first 'for' loop, I index every value 'p' in the list 'redshift' so that the bins can be created with 'if' statements. But if I enable the labels that are commented out after each py.plot() call inside the 'if' statements, every data point 'i' plotted at (x[i],y[i]) gets its own label, and my legend ends up with 87 labels in total (including the 3 set elsewhere in the code)!
I essentially need 1 label for each bin.
Please tell me what needs to be done after the bins are created and the py.plot() commands are used. Thanks in advance :-)
Sorry I cannot post my image here due to low reputation!
The data 'appended' for x, y and redshift lists from the csv file are as follows:
x=[5.031,10.599,10.589,8.548,9.089,8.675,3.588,1.244,3.023,8.632,8.953,7.603,7.513,2.917,7.344,7.106,3.889,7.287,3.367,6.839,2.801,2.316,1.328,6.31,6.19,6.329,6.025,5.629,6.123,5.892,5.438,4.398,4.542,4.624,4.501,4.504,5.033,5.068,4.197,2.854,4.784,2.158,4.054,3.124,3.961,4.42,3.853,3.658,1.858,4.537,2.072,3.573,3.041,5.837,3.652,3.209,2.742,2.732,1.312,3.635,2.69,3.32,2.488,2.996,2.269,1.701,3.935,2.015,0.798,2.212,1.672,1.925,3.21,1.979,1.794,2.624,2.027,3.66,1.073,1.007,1.57,0.854,0.619,0.547]
y=[5.255,10.897,11.045,9.125,9.387,17.719,4.025,1.389,4.152,8.703,9.051,8.02,7.774,3.139,7.543,7.224,4.155,7.416,3.905,6.868,2.909,2.658,1.651,6.454,6.252,6.541,6.152,5.647,6.285,6.079,5.489,4.541,4.634,8.851,4.554,4.555,5.559,5.144,5.311,5.839,5.364,3.18,4.352,3.379,4.059,4.575,3.914,5.736,2.304,4.68,3.187,3.756,3.419,9.118,4.595,3.346,3.603,6.313,1.816,4.34,2.732,4.978,2.719,3.761,2.623,2.1,4.956,2.316,4.231,2.831,1.954,2.248,6.573,2.276,2.627,3.85,3.545,25.405,3.996,1.347,1.679,1.435,0.759,0.677]
redshift = [0.12,0.25,0.23,0.23,0.27,0.26,0.12,0.27,0.17,0.18,0.17,0.3,0.23,0.1,0.23,0.29,0.29,0.12,0.13,0.26,0.11,0.24,0.13,0.21,0.17,0.2,0.3,0.29,0.23,0.27,0.25,0.21,0.11,0.15,0.1,0.26,0.23,0.12,0.23,0.26,0.2,0.17,0.22,0.26,0.25,0.12,0.19,0.24,0.18,0.15,0.27,0.14,0.14,0.29,0.29,0.26,0.15,0.29,0.24,0.24,0.23,0.26,0.29,0.22,0.13,0.18,0.24,0.14,0.24,0.24,0.17,0.26,0.29,0.11,0.14,0.26,0.28,0.26,0.28,0.27,0.23,0.26,0.23,0.19]

Working with numerical data like this, you should really consider using a numerical library, like numpy.
The problem in your code arises from processing each record (a coordinate (x,y) and the corresponding redshift value) one at a time. You are calling plot for each point, thereby creating a legend entry for each of those 84 data points. You should consider your "bins" as groups of data that belong to the same dataset and process them as such. You could use "logical masks" to distinguish between your "bins", as shown below.
It's also not clear why you write exit after each plotting action. Without parentheses it is just an expression that is evaluated and discarded, so it has no effect.
import numpy as np
import matplotlib.pyplot as plt
x = np.array([5.031,10.599,10.589,8.548,9.089,8.675,3.588,1.244,3.023,8.632,8.953,7.603,7.513,2.917,7.344,7.106,3.889,7.287,3.367,6.839,2.801,2.316,1.328,6.31,6.19,6.329,6.025,5.629,6.123,5.892,5.438,4.398,4.542,4.624,4.501,4.504,5.033,5.068,4.197,2.854,4.784,2.158,4.054,3.124,3.961,4.42,3.853,3.658,1.858,4.537,2.072,3.573,3.041,5.837,3.652,3.209,2.742,2.732,1.312,3.635,2.69,3.32,2.488,2.996,2.269,1.701,3.935,2.015,0.798,2.212,1.672,1.925,3.21,1.979,1.794,2.624,2.027,3.66,1.073,1.007,1.57,0.854,0.619,0.547])
y = np.array([5.255,10.897,11.045,9.125,9.387,17.719,4.025,1.389,4.152,8.703,9.051,8.02,7.774,3.139,7.543,7.224,4.155,7.416,3.905,6.868,2.909,2.658,1.651,6.454,6.252,6.541,6.152,5.647,6.285,6.079,5.489,4.541,4.634,8.851,4.554,4.555,5.559,5.144,5.311,5.839,5.364,3.18,4.352,3.379,4.059,4.575,3.914,5.736,2.304,4.68,3.187,3.756,3.419,9.118,4.595,3.346,3.603,6.313,1.816,4.34,2.732,4.978,2.719,3.761,2.623,2.1,4.956,2.316,4.231,2.831,1.954,2.248,6.573,2.276,2.627,3.85,3.545,25.405,3.996,1.347,1.679,1.435,0.759,0.677])
redshift = np.array([0.12,0.25,0.23,0.23,0.27,0.26,0.12,0.27,0.17,0.18,0.17,0.3,0.23,0.1,0.23,0.29,0.29,0.12,0.13,0.26,0.11,0.24,0.13,0.21,0.17,0.2,0.3,0.29,0.23,0.27,0.25,0.21,0.11,0.15,0.1,0.26,0.23,0.12,0.23,0.26,0.2,0.17,0.22,0.26,0.25,0.12,0.19,0.24,0.18,0.15,0.27,0.14,0.14,0.29,0.29,0.26,0.15,0.29,0.24,0.24,0.23,0.26,0.29,0.22,0.13,0.18,0.24,0.14,0.24,0.24,0.17,0.26,0.29,0.11,0.14,0.26,0.28,0.26,0.28,0.27,0.23,0.26,0.23,0.19])
bin3 = 0.25 <= redshift
bin2 = np.logical_and(0.2 <= redshift, redshift < 0.25)
bin1 = np.logical_and(0.1 <= redshift, redshift < 0.2)
plt.ion()
# raw strings keep the LaTeX backslashes intact
labels = (r"$z < 0.2$", r"$0.2 \leq z < 0.25$", r"$z \geq 0.25$")
colors = ('r', 'g', 'b')
# one plot() call per bin, so each bin gets exactly one legend entry
for mask, label, co in zip( (bin1, bin2, bin3), labels, colors):
    plt.plot(x[mask], y[mask], color=co, ls='none', marker='o', label=label)
plt.legend()
plt.show()
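
The masking idea generalises to any number of bins. The sketch below is an illustration rather than part of the original answer: plot_by_bin is a hypothetical helper, the four data points are made up, and the Agg backend is assumed so that it runs without a display. Each bin is drawn with a single plot() call, so the legend has one entry per bin no matter how many points the bin contains.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np


def plot_by_bin(ax, x, y, z, edges, colors):
    """Draw one plot() call, and hence one legend entry, per [lo, hi) bin of z."""
    x, y, z = (np.asarray(v, dtype=float) for v in (x, y, z))
    for lo, hi, color in zip(edges[:-1], edges[1:], colors):
        mask = (z >= lo) & (z < hi)  # boolean mask selecting this bin's points
        ax.plot(x[mask], y[mask], color=color, ls="none", marker=".",
                label="{} <= z < {}".format(lo, hi))
    return ax.legend()


fig, ax = plt.subplots()
legend = plot_by_bin(
    ax,
    x=[1.0, 2.0, 3.0, 4.0],          # made-up coordinates
    y=[1.1, 2.2, 2.9, 4.3],
    z=[0.12, 0.22, 0.27, 0.19],      # made-up redshifts
    edges=[0.1, 0.2, 0.25, 0.31],
    colors=["b", "g", "r"],
)
n_entries = len(legend.get_texts())
print(n_entries)
plt.close(fig)
```

Further plot() or errorbar() calls on the same axes, such as the best-fit line, each add one more entry when they are given a label.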

Matplotlib: Same title for 8 plots plotted using loop

I have the following code, which generates 8 plots. I want to use the phases as the title of each plot. I have succeeded in putting a phase on each plot, but instead of the corresponding phase, every plot shows the last phase. The 8phases.txt file has the following 8 lines, one for each plot -
-1 1 -1
-1 1 1
1 1 1
1 -1 1
-1 -1 -1
1 1 -1
1 -1 -1
-1 -1 1
Here is the code -
import numpy as np
import matplotlib.pyplot as plt
D=12
n=np.arange(1,4)
x = np.linspace(-D/2,D/2, 3000)
I = np.array([125,300,75])
phase = np.genfromtxt('8phases.txt')
I_phase = I*phase
for i in I_phase:
    F = sum(m*np.cos(2*np.pi*l*x/D) for m,l in zip(i,n))
    f,(ax1,ax2) = plt.subplots(2)
    for row in phase:
        ax1.plot(x,F,'g')
        ax1.set_title(row)
plt.show()

I think your inner-most loop is unnecessary; it is recreating the same plot 8 times and updating the title 8 times with each of the 8 values.
If I understood what you are asking for, I believe this gives the correct results:
...
for index,i in enumerate(I_phase):
    F = sum(m*np.cos(2*np.pi*l*x/D) for m,l in zip(i,n))
    f,(ax1,ax2) = plt.subplots(2)
    ax1.plot(x,F,'g')
    ax1.set_title(phase[index])
...
(I would normally use "i" instead of "index", but you had already used "i")
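
An alternative to indexing is to iterate over the phases and the scaled amplitudes together with zip. The sketch below is an assumption-laden illustration: the phases are typed in directly (the first three rows of 8phases.txt) instead of being read from the file, the Agg backend is used so that it runs without a display, and the titles are collected in a list only so they can be inspected.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np

D = 12
n = np.arange(1, 4)
x = np.linspace(-D / 2, D / 2, 300)
I = np.array([125, 300, 75])
# first three rows of 8phases.txt, typed in instead of using np.genfromtxt
phase = np.array([[-1, 1, -1], [-1, 1, 1], [1, 1, 1]])

titles = []
# pair each phase row with its own scaled amplitudes, one figure per pair
for row, amplitudes in zip(phase, I * phase):
    F = sum(m * np.cos(2 * np.pi * l * x / D) for m, l in zip(amplitudes, n))
    fig, ax = plt.subplots()
    ax.plot(x, F, "g")
    ax.set_title(" ".join(str(v) for v in row))
    titles.append(ax.get_title())
    plt.close(fig)

print(titles)
```

Because the title is set once per figure, from the same row that produced the curve, each plot keeps its own phase.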
