store values and continue running function - python

I am using astropy to detect sources in an image. I am trying to write a function that will detect my sources, with an option of storing their coordinates in an array, and another option to plot the sources. This is my code:
def SourceDetect(file, SIG, FWHM, THRESH, store = False, PlotStars = False):
image = pf.open(file)
im = image[0].data
mean, median, std = sigma_clipped_stats(im, sigma=SIG)
daofind = DAOStarFinder(fwhm = FWHM, threshold = THRESH * std)
sources = daofind(im - median)
positions = np.transpose((sources['xcentroid'], sources['ycentroid']))
if(store):
return positions
if(PlotStars):
apertures = CircularAperture(positions, r = 6.)
norm = ImageNormalize(stretch = SqrtStretch())
plt.imshow(im, cmap = 'Greys', origin = 'lower', norm = norm,
interpolation = 'nearest')
apertures.plot(color = 'blue', lw = 1.5, alpha = 1)
for i in range(0, len(sources)):
plt.text(sources[i]['xcentroid'], sources[i]['ycentroid'], i, color='black');
However when I run this code with both store and plot set to True only the store part of the function runs and I can't get it to plot unless I make store False. Is there a way to write this code where I will be able to have my coordinates stored and plotted?

Simple change order - first PlotStars, next store
if PlotStars:
# ... code ...
if store:
return positions
and this will first display plot and later it exits function with value.
But if you want first get value and later plot then you should run it two times - first only with store=True and later only with PlotStars=True

Related

NameError: name 'IMG_H' is not defined

I am a new programming Interface. I am using the PIL and Matplotlib libraries for the contract streaching.When I am using the Histogram Equalizer I am getting the error as name 'IMG_H' is not defined.I am also Converting my image to numpy array, calculate the histogram, cumulative sum, mapping and then apply the mapping to create a new image.
You can see my code below -
# HISTOGRAM EQUALIZATION
from PIL import Image
import matplotlib.pyplot as plt
import numpy as np
def make_histogram(img):
""" Take an image and create a historgram from it's luma values """
y_vals = img[:,:,0].flatten()
histogram = np.zeros(256, dtype=int)
for y_index in range(y_vals.size):
histogram[y_vals[y_index]] += 1
return histogram
def make_cumsum(histogram):
""" Create an array that represents the cumulative sum of the histogram """
cumsum = np.zeros(256, dtype=int)
cumsum[0] = histogram[0]
for i in range(1, histogram.size):
cumsum[i] = cumsum[i-1] + histogram[i]
return cumsum
def make_mapping(histogram, cumsum):
mapping = np.zeros(256, dtype=int)
luma_levels = 256
for i in range(histogram.size):
mapping[i] = max(0, round((luma_levels*cumsum[i])/(IMG_H*IMG_W))-1)
return mapping
def apply_mapping(img, mapping):
""" Apply the mapping to our image """
new_image = img.copy()
new_image[:,:,0] = list(map(lambda a : mapping[a], img[:,:,0]))
return new_image
# Load image
pillow_img = Image.open('pout.jpg')
# Convert our image to numpy array, calculate the histogram, cumulative sum,
# mapping and then apply the mapping to create a new image
img = np.array(pillow_img)
histogram = make_histogram(img)
cumsum = make_cumsum(histogram)
mapping = make_mapping(histogram, cumsum)
new_image = apply_mapping(img, mapping)
output_image = Image.fromarray(np.uint8(new_image))
imshow(output_image, cmap='gray')
# Display the old (black) and new (red) histograms next to eachother
x_axis = np.arange(256)
fig = plt.figure()
fig.add_subplot(1,2,1)
plt.bar(x_axis , histogram, color = "black")
fig.add_subplot(1,2,2)
plt.bar(x_axis , make_histogram(new_image), color = "red")
plt.show()
You have this variable here:
mapping[i] = max(0, round((luma_levels*cumsum[i])/(IMG_H*IMG_W))-1)
But you didn't define it (or import) before, therefore you get this error.
for i in range(histogram.size):
mapping[i] = max(0, round((luma_levels*cumsum[i])/(IMG_H*IMG_W))-1)
In above stated line. you are using 2 Variables, IMG_H and IMG_W.
where you defined these variables?
EDITED PART
for i in range(histogram.size):
mapping[i] = max(0, round((luma_levels*cumsum[i])/(IMG_H*IMG_W))-1)
in the above stated line you are using 2 variables try to do multiplication (IMG_H*IMG_W) but you did not define and import these variables in the whole code.
You can do like this.
you can define these variables on the top of the code.
your code shows that these variables are defined for Image width and height
IMG_W = 120 #Any value in integer for Image Width
IMG_H = 124 #Any value in integer for Image Height

How to Create a Boxplot / Group Boxplot from [Min ,Q1 ,Q2 ,Q3 ,Max] in Python? [duplicate]

From what I can see, boxplot() method expects a sequence of raw values (numbers) as input, from which it then computes percentiles to draw the boxplot(s).
I would like to have a method by which I could pass in the percentiles and get the corresponding boxplot.
For example:
Assume that I have run several benchmarks and for each benchmark I've measured latencies ( floating point values ). Now additionally, I have precomputed the percentiles for these values.
Hence for each benchmark, I have the 25th, 50th, 75th percentile along with the min and max.
Now given these data, I would like to draw the box plots for the benchmarks.
As of 2020, there is a better method than the one in the accepted answer.
The matplotlib.axes.Axes class provides a bxp method, which can be used to draw the boxes and whiskers based on the percentile values. Raw data is only needed for the outliers, and that is optional.
Example:
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
boxes = [
{
'label' : "Male height",
'whislo': 162.6, # Bottom whisker position
'q1' : 170.2, # First quartile (25th percentile)
'med' : 175.7, # Median (50th percentile)
'q3' : 180.4, # Third quartile (75th percentile)
'whishi': 187.8, # Top whisker position
'fliers': [] # Outliers
}
]
ax.bxp(boxes, showfliers=False)
ax.set_ylabel("cm")
plt.savefig("boxplot.png")
plt.close()
This produces the following image:
To draw the box plot using just the percentile values and the outliers ( if any ) I made a customized_box_plot function that basically modifies attributes in a basic box plot ( generated from a tiny sample data ) to make it fit according to your percentile values.
The customized_box_plot function
def customized_box_plot(percentiles, axes, redraw = True, *args, **kwargs):
"""
Generates a customized boxplot based on the given percentile values
"""
box_plot = axes.boxplot([[-9, -4, 2, 4, 9],]*n_box, *args, **kwargs)
# Creates len(percentiles) no of box plots
min_y, max_y = float('inf'), -float('inf')
for box_no, (q1_start,
q2_start,
q3_start,
q4_start,
q4_end,
fliers_xy) in enumerate(percentiles):
# Lower cap
box_plot['caps'][2*box_no].set_ydata([q1_start, q1_start])
# xdata is determined by the width of the box plot
# Lower whiskers
box_plot['whiskers'][2*box_no].set_ydata([q1_start, q2_start])
# Higher cap
box_plot['caps'][2*box_no + 1].set_ydata([q4_end, q4_end])
# Higher whiskers
box_plot['whiskers'][2*box_no + 1].set_ydata([q4_start, q4_end])
# Box
box_plot['boxes'][box_no].set_ydata([q2_start,
q2_start,
q4_start,
q4_start,
q2_start])
# Median
box_plot['medians'][box_no].set_ydata([q3_start, q3_start])
# Outliers
if fliers_xy is not None and len(fliers_xy[0]) != 0:
# If outliers exist
box_plot['fliers'][box_no].set(xdata = fliers_xy[0],
ydata = fliers_xy[1])
min_y = min(q1_start, min_y, fliers_xy[1].min())
max_y = max(q4_end, max_y, fliers_xy[1].max())
else:
min_y = min(q1_start, min_y)
max_y = max(q4_end, max_y)
# The y axis is rescaled to fit the new box plot completely with 10%
# of the maximum value at both ends
axes.set_ylim([min_y*1.1, max_y*1.1])
# If redraw is set to true, the canvas is updated.
if redraw:
ax.figure.canvas.draw()
return box_plot
USAGE
Using inverse logic ( code at the very end ) I extracted the percentile values from this example
>>> percentiles
(-1.0597368367634488, 0.3977683984966961, 1.0298955252405229, 1.6693981537742526, 3.4951447843464449)
(-0.90494930553559483, 0.36916539612108634, 1.0303658700697103, 1.6874542731392828, 3.4951447843464449)
(0.13744105279440233, 1.3300645202649739, 2.6131540656339483, 4.8763411136047647, 9.5751914834437937)
(0.22786243898199182, 1.4120860286080519, 2.637650402506837, 4.9067126578493259, 9.4660357513550899)
(0.0064696168078617741, 0.30586770128093388, 0.70774153557312702, 1.5241965711101928, 3.3092932063051976)
(0.007009744579241136, 0.28627373934008982, 0.66039691869500572, 1.4772725266672091, 3.221716765477217)
(-2.2621660374110544, 5.1901313713883352, 7.7178532139979357, 11.277744848353247, 20.155971739152388)
(-2.2621660374110544, 5.1884411864079532, 7.3357079047721054, 10.792299385806913, 18.842012119715388)
(2.5417888074435702, 5.885996170695587, 7.7271286220368598, 8.9207423361593179, 10.846938621419374)
(2.5971767318505856, 5.753551925927133, 7.6569980004033464, 8.8161056254143233, 10.846938621419374)
Note that to keep this short I haven't shown the outliers vectors which will be the 6th element of each of the percentile array.
Also note that all usual additional kwargs / args can be used since they are simply passed to the boxplot method inside it :
>>> fig, ax = plt.subplots()
>>> b = customized_box_plot(percentiles, ax, redraw=True, notch=0, sym='+', vert=1, whis=1.5)
>>> plt.show()
EXPLANATION
The boxplot method returns a dictionary mapping the components of the boxplot to the individual matplotlib.lines.Line2D instances that were created.
Quoting from the matplotlib.pyplot.boxplot documentation :
That dictionary has the following keys (assuming vertical boxplots):
boxes: the main body of the boxplot showing the quartiles and the median’s confidence intervals if enabled.
medians: horizonal lines at the median of each box.
whiskers: the vertical lines extending to the most extreme, n-outlier data points. caps: the horizontal lines at the ends of the whiskers.
fliers: points representing data that extend beyond the whiskers (outliers).
means: points or lines representing the means.
For example observe the boxplot of a tiny sample data of [-9, -4, 2, 4, 9]
>>> b = ax.boxplot([[-9, -4, 2, 4, 9],])
>>> b
{'boxes': [<matplotlib.lines.Line2D at 0x7fe1f5b21350>],
'caps': [<matplotlib.lines.Line2D at 0x7fe1f54d4e50>,
<matplotlib.lines.Line2D at 0x7fe1f54d0e50>],
'fliers': [<matplotlib.lines.Line2D at 0x7fe1f5b317d0>],
'means': [],
'medians': [<matplotlib.lines.Line2D at 0x7fe1f63549d0>],
'whiskers': [<matplotlib.lines.Line2D at 0x7fe1f5b22e10>,
<matplotlib.lines.Line2D at 0x7fe20c54a510>]}
>>> plt.show()
The matplotlib.lines.Line2D objects have two methods that I'll be using in my function extensively. set_xdata ( or set_ydata ) and get_xdata ( or get_ydata ).
Using these methods we can alter the position of the constituent lines of the base box plot to conform to your percentile values ( which is what the customized_box_plot function does ). After altering the constituent lines' position, you can redraw the canvas using figure.canvas.draw()
Summarizing the mappings from percentile to the coordinates of the various Line2D objects.
The Y Coordinates :
The max ( q4_end - end of 4th quartile ) corresponds to the top most cap Line2D object.
The min ( q1_start - start of the 1st quartile ) corresponds to the lowermost most cap Line2D object.
The median corresponds to the ( q3_start ) median Line2D object.
The 2 whiskers lie between the ends of the boxes and extreme caps ( q1_start and q2_start - lower whisker; q4_start and q4_end - upper whisker )
The box is actually an interesting n shaped line bounded by a cap at the lower portion. The extremes of the n shaped line correspond to the q2_start and the q4_start.
The X Coordinates :
The Central x coordinates ( for multiple box plots are usually 1, 2, 3... )
The library automatically calculates the bounding x coordinates based on the width specified.
INVERSE FUNCTION TO RETRIEVE THE PERCENTILES FROM THE boxplot DICT:
def get_percentiles_from_box_plots(bp):
percentiles = []
for i in range(len(bp['boxes'])):
percentiles.append((bp['caps'][2*i].get_ydata()[0],
bp['boxes'][i].get_ydata()[0],
bp['medians'][i].get_ydata()[0],
bp['boxes'][i].get_ydata()[2],
bp['caps'][2*i + 1].get_ydata()[0],
(bp['fliers'][i].get_xdata(),
bp['fliers'][i].get_ydata())))
return percentiles
NOTE:
The reason why I did not make a completely custom boxplot method is because, there are many features offered by the inbuilt box plot that cannot be fully reproduced.
Also excuse me if I may have unnecessarily explained something that may have been too obvious.
Here is an updated version of this useful routine. Setting the vertices directly appears to work for both filled boxes (patchArtist=True) and unfilled ones.
def customized_box_plot(percentiles, axes, redraw = True, *args, **kwargs):
"""
Generates a customized boxplot based on the given percentile values
"""
n_box = len(percentiles)
box_plot = axes.boxplot([[-9, -4, 2, 4, 9],]*n_box, *args, **kwargs)
# Creates len(percentiles) no of box plots
min_y, max_y = float('inf'), -float('inf')
for box_no, pdata in enumerate(percentiles):
if len(pdata) == 6:
(q1_start, q2_start, q3_start, q4_start, q4_end, fliers_xy) = pdata
elif len(pdata) == 5:
(q1_start, q2_start, q3_start, q4_start, q4_end) = pdata
fliers_xy = None
else:
raise ValueError("Percentile arrays for customized_box_plot must have either 5 or 6 values")
# Lower cap
box_plot['caps'][2*box_no].set_ydata([q1_start, q1_start])
# xdata is determined by the width of the box plot
# Lower whiskers
box_plot['whiskers'][2*box_no].set_ydata([q1_start, q2_start])
# Higher cap
box_plot['caps'][2*box_no + 1].set_ydata([q4_end, q4_end])
# Higher whiskers
box_plot['whiskers'][2*box_no + 1].set_ydata([q4_start, q4_end])
# Box
path = box_plot['boxes'][box_no].get_path()
path.vertices[0][1] = q2_start
path.vertices[1][1] = q2_start
path.vertices[2][1] = q4_start
path.vertices[3][1] = q4_start
path.vertices[4][1] = q2_start
# Median
box_plot['medians'][box_no].set_ydata([q3_start, q3_start])
# Outliers
if fliers_xy is not None and len(fliers_xy[0]) != 0:
# If outliers exist
box_plot['fliers'][box_no].set(xdata = fliers_xy[0],
ydata = fliers_xy[1])
min_y = min(q1_start, min_y, fliers_xy[1].min())
max_y = max(q4_end, max_y, fliers_xy[1].max())
else:
min_y = min(q1_start, min_y)
max_y = max(q4_end, max_y)
# The y axis is rescaled to fit the new box plot completely with 10%
# of the maximum value at both ends
axes.set_ylim([min_y*1.1, max_y*1.1])
# If redraw is set to true, the canvas is updated.
if redraw:
ax.figure.canvas.draw()
return box_plot
Here is a bottom-up approach where the box_plot is build up using matplotlib's vline, Rectangle, and normal plot functions
def boxplot(df, ax=None, box_width=0.2, whisker_size=20, mean_size=10, median_size = 10 , line_width=1.5, xoffset=0,
color=0):
"""Plots a boxplot from existing percentiles.
Parameters
----------
df: pandas DataFrame
ax: pandas AxesSubplot
if to plot on en existing axes
box_width: float
whisker_size: float
size of the bar at the end of each whisker
mean_size: float
size of the mean symbol
color: int or rgb(list)
If int particular color of property cycler is taken. Example of rgb: [1,0,0] (red)
Returns
-------
f, a, boxes, vlines, whisker_tips, mean, median
"""
if type(color) == int:
color = plt.rcParams['axes.prop_cycle'].by_key()['color'][color]
if ax:
a = ax
f = a.get_figure()
else:
f, a = plt.subplots()
boxes = []
vlines = []
xn = []
for row in df.iterrows():
x = row[0] + xoffset
xn.append(x)
# box
y = row[1][25]
height = row[1][75] - row[1][25]
box = plt.Rectangle((x - box_width / 2, y), box_width, height)
a.add_patch(box)
boxes.append(box)
# whiskers
y = (row[1][95] + row[1][5]) / 2
vl = a.vlines(x, row[1][5], row[1][95])
vlines.append(vl)
for b in boxes:
b.set_linewidth(line_width)
b.set_facecolor([1, 1, 1, 1])
b.set_edgecolor(color)
b.set_zorder(2)
for vl in vlines:
vl.set_color(color)
vl.set_linewidth(line_width)
vl.set_zorder(1)
whisker_tips = []
if whisker_size:
g, = a.plot(xn, df[5], ls='')
whisker_tips.append(g)
g, = a.plot(xn, df[95], ls='')
whisker_tips.append(g)
for wt in whisker_tips:
wt.set_markeredgewidth(line_width)
wt.set_color(color)
wt.set_markersize(whisker_size)
wt.set_marker('_')
mean = None
if mean_size:
g, = a.plot(xn, df['mean'], ls='')
g.set_marker('o')
g.set_markersize(mean_size)
g.set_zorder(20)
g.set_markerfacecolor('None')
g.set_markeredgewidth(line_width)
g.set_markeredgecolor(color)
mean = g
median = None
if median_size:
g, = a.plot(xn, df['median'], ls='')
g.set_marker('_')
g.set_markersize(median_size)
g.set_zorder(20)
g.set_markeredgewidth(line_width)
g.set_markeredgecolor(color)
median = g
a.set_ylim(np.nanmin(df), np.nanmax(df))
return f, a, boxes, vlines, whisker_tips, mean, median
This is how it looks in action:
import numpy as np
import pandas as pd
import matplotlib.pylab as plt
nopts = 12
df = pd.DataFrame()
df['mean'] = np.random.random(nopts) + 7
df['median'] = np.random.random(nopts) + 7
df[5] = np.random.random(nopts) + 4
df[25] = np.random.random(nopts) + 6
df[75] = np.random.random(nopts) + 8
df[95] = np.random.random(nopts) + 10
out = boxplot(df)

gee 'sampleRectangle()' returning 1x1 array

I'm facing an issue when trying to use 'sampleRectangle()' function in GEE, it is returning 1x1 arrays and I can't seem to find a workaround. Please, see below a python code in which I'm using an approach posted by Justin Braaten. I suspect there's something wrong with the geometry object I'm passing to the function, but at the same time I've tried several ways to check how this argument is behaving and couldn't no spot any major issue.
Can anyone give me a hand trying to understand what is happening?
Thanks!
import json
import ee
import numpy as np
import matplotlib.pyplot as plt
ee.Initialize()
point = ee.Geometry.Point([-55.8571, -9.7864])
box_l8sr = ee.Geometry(point.buffer(50).bounds())
box_l8sr2 = ee.Geometry.Polygon(box_l8sr.coordinates())
# print(box_l8sr2)
# Define an image.
# l8sr_y = ee.Image('LANDSAT/LC08/C01/T1_SR/LC08_038029_20180810')
oli_sr_coll = ee.ImageCollection('LANDSAT/LC08/C01/T1_SR')
## Function to mask out clouds and cloud-shadows present in Landsat images
def maskL8sr(image):
## Bits 3 and 5 are cloud shadow and cloud, respectively.
cloudShadowBitMask = (1 << 3)
cloudsBitMask = (1 << 5)
## Get the pixel QA band.
qa = image.select('pixel_qa')
## Both flags should be set to zero, indicating clear conditions.
mask = qa.bitwiseAnd(cloudShadowBitMask).eq(0)
mask = qa.bitwiseAnd(cloudsBitMask).eq(0)
return image.updateMask(mask)
l8sr_y = oli_sr_coll.filterDate('2019-01-01', '2019-12-31').map(maskL8sr).mean()
l8sr_bands = l8sr_y.select(['B2', 'B3', 'B4']).sampleRectangle(box_l8sr2)
print(type(l8sr_bands))
# Get individual band arrays.
band_arr_b4 = l8sr_bands.get('B4')
band_arr_b3 = l8sr_bands.get('B3')
band_arr_b2 = l8sr_bands.get('B2')
# Transfer the arrays from server to client and cast as np array.
np_arr_b4 = np.array(band_arr_b4.getInfo())
np_arr_b3 = np.array(band_arr_b3.getInfo())
np_arr_b2 = np.array(band_arr_b2.getInfo())
print(np_arr_b4.shape)
print(np_arr_b3.shape)
print(np_arr_b2.shape)
# Expand the dimensions of the images so they can be concatenated into 3-D.
np_arr_b4 = np.expand_dims(np_arr_b4, 2)
np_arr_b3 = np.expand_dims(np_arr_b3, 2)
np_arr_b2 = np.expand_dims(np_arr_b2, 2)
# # print(np_arr_b4.shape)
# # print(np_arr_b5.shape)
# # print(np_arr_b6.shape)
# # Stack the individual bands to make a 3-D array.
rgb_img = np.concatenate((np_arr_b2, np_arr_b3, np_arr_b4), 2)
# print(rgb_img.shape)
# # Scale the data to [0, 255] to show as an RGB image.
rgb_img_test = (255*((rgb_img - 100)/3500)).astype('uint8')
# plt.imshow(rgb_img)
plt.show()
# # # create L8OLI plot
# fig, ax = plt.subplots()
# ax.set(title = "Satellite Image")
# ax.set_axis_off()
# plt.plot(42, 42, 'ko')
# img = ax.imshow(rgb_img_test, interpolation='nearest')
I have the same issue. It seems to have something to do with .mean(), or any reduction of image collections for that matter.
One solution is to reproject after the reduction. For example, you could try adding "reproject" at the end:
l8sr_y = oli_sr_coll.filterDate('2019-01-01', '2019-12-31').map(maskL8sr).mean().reproject(crs = ee.Projection('EPSG:4326'), scale=30)
It should work.

How to speed up matplotlib animation made from database results?

I'm using matplotlib to make contourplots over some maps I have stored in a database but the process is taking several hours to produce each movie. The code I'm using to make this movie is:
def load_img(option, obs_id, columnshape):
'''
This function will load the image
from the database and make the
conversion from string to nparray
'''
#starting a db session
session = makesession()
#defining the cases to query the db
case = {'Bz': Observations.mean_bz,
'Es': Observations.poyn_Es,
'En': Observations.poyn_En,
'Et': Observations.poyn_Et}
#checking if the option selected was
#a valid one
while option not in case.keys():
#feedback
print('Invalid option. Select a valid one: ', case.keys())
option = str(input())
#querying the db
s = sql.select([case[option]]).where(Observations.id == obs_id)
#fetching the result
rp = session.execute(s)
result = rp.fetchone()
#restoring the image
img = ajuste(result[0],columnshape)
return(img)
def db_animation(ar_id, option):
'''
Description
'''
vmin = -1e17
vmax = 1e17
levels = [vmin, 0.8*vmin, 0.6*vmin, 0.4*vmin, 0.2*vmin,
0.2*vmax, 0.4*vmax, 0.6*vmax,0.8*vmax,vmax]
#getting the observation ids
obs_ids = scout_obs_ids(ar_id)
#obs_ids = [x for x in range(400,450)]
#getting the columnshape
columnshape = scout_colshape(obs_ids[0])
#creating the figure objects
fig, ax = plt.subplots(figsize = (12,8))
#loading the first data
data_bz = load_img('Bz', obs_ids[0], columnshape)
data_E = load_img(option, obs_ids[0], columnshape)
#making the image objects
img1 = ax.imshow(data_bz, origin = 'lower', cmap = plt.cm.gray,
animated = True)
img2 = [ax.contourf(data_E, alpha = 0.35,
#vmax = 1e17, vmin = -1e17,
levels = levels,
origin = 'lower',
cmap = 'PiYG')]
#adding a colorbar
fig.colorbar(img2[0], shrink = 0.75, label = 'W')
def refresher(frame_number, img1,img2):
'''
description
'''
#taking the new data
new_data_bz = load_img('Bz', obs_id = obs_ids[frame_number+1],
columnshape = columnshape)
new_data_E = load_img(option, obs_id = obs_ids[frame_number+1],
columnshape = columnshape)
#setting the new data
img1.set_data(new_data_bz)
#removing the contours to start anew
for tp in img2[0].collections:
tp.remove()
img2[0] = ax.contourf(new_data_E, alpha = 0.35,
levels = levels,
origin = 'lower', cmap = 'PiYG')
return(img1, img2[0].collections,)
#using the animation function
ani = FuncAnimation(fig, refresher,
frames=range(len(obs_ids)-1),
interval = 100,
#blit = True,
fargs = [img1,img2])
#saving
ani.save("test.mp4")
return
On average those movies take 1200 images for each img object (a total around 2400) from the database. Each pair of images are individually loaded and restored to make the background image and the contour plot.
I was wondering about reasons why the processing time was escalating quickly as I increase the number of images to make the movie but could not get to a conclusion on my own. I find it particularly intriguing that when I set blit to True (which according to the documentation should help improve performance) I get the following error:
AttributeError: 'silent_list' object has no attribute 'set_animated'
I imagine that either my queries or the way I constructed my animation function are then highly inefficient. But I suspect more on the latter since when I'm normally using the DB the results are loaded in what I imagine is a reasonable time.
Can someone cast some light at this struggle for me?

Adding a single label to the legend for a series of different data points plotted inside a designated bin in Python using matplotlib.pyplot.plot()

I have a script for plotting astronomical data of redmapping clusters using a csv file. I could get the data points in it and want to plot them using different colors depending on their redshift values: I am binning the dataset into 3 bins (0.1-0.2, 0.2-0.25, 0.25,0.31) based on the redshift.
The problem arises with my code after I distinguish to what bin the datapoint belongs: I want to have 3 labels in the legend corresponding to red, green and blue data points, but this is not happening and I don't know why. I am using plot() instead of scatter() as I also had to do the best fit from the data in the same figure. So everything needs to be in 1 figure.
import numpy as np
import matplotlib.pyplot as py
import csv
z = open("Sheet4CSV.csv","rU")
data = csv.reader(z)
x = []
y = []
ylow = []
yupp = []
xlow = []
xupp = []
redshift = []
for r in data:
x.append(float(r[2]))
y.append(float(r[5]))
xlow.append(float(r[3]))
xupp.append(float(r[4]))
ylow.append(float(r[6]))
yupp.append(float(r[7]))
redshift.append(float(r[1]))
from operator import sub
xerr_l = map(sub,x,xlow)
xerr_u = map(sub,xupp,x)
yerr_l = map(sub,y,ylow)
yerr_u = map(sub,yupp,y)
py.xlabel("$Original\ Tx\ XCS\ pipeline\ Tx\ keV$")
py.ylabel("$Iterative\ Tx\ pipeline\ keV$")
py.xlim(0,12)
py.ylim(0,12)
py.title("Redmapper Clusters comparison of Tx pipelines")
ax1 = py.subplot(111)
##Problem starts here after the previous line##
for p in redshift:
for i in xrange(84):
p=redshift[i]
if 0.1<=p<0.2:
ax1.plot(x[i],y[i],color="b", marker='.', linestyle = " ")#, label = "$z < 0.2$")
exit
if 0.2<=p<0.25:
ax1.plot(x[i],y[i],color="g", marker='.', linestyle = " ")#, label="$0.2 \leq z < 0.25$")
exit
if 0.25<=p<=0.3:
ax1.plot(x[i],y[i],color="r", marker='.', linestyle = " ")#, label="$z \geq 0.25$")
exit
##There seems nothing wrong after this point##
py.errorbar(x,y,yerr=[yerr_l,yerr_u],xerr=[xerr_l,xerr_u], fmt= " ",ecolor='magenta', label="Error bars")
cof = np.polyfit(x,y,1)
p = np.poly1d(cof)
l = np.linspace(0,12,100)
py.plot(l,p(l),"black",label="Best fit")
py.plot([0,15],[0,15],"black", linestyle="dotted", linewidth=2.0, label="line $y=x$")
py.grid()
box = ax1.get_position()
ax1.set_position([box.x1,box.y1,box.width, box.height])
py.legend(loc='center left',bbox_to_anchor=(1,0.5))
py.show()
In the 1st 'for' loop, I have indexed every value 'p' in the list 'redshift' so that bins can be created using 'if' statement. But if I add the labels that are hashed out against each py.plot() inside the 'if' statements, each data point 'i' that gets plotted in the figure as an intersection of (x[i],y[i]) takes the label and my entire legend attains in total 87 labels (including the 3 mentioned in the code at other places)!!!!!!
I essentially need 1 label for each bin...
Please tell me what needs to done after the bins are created and py.plot() commands used...Thanks in advance :-)
Sorry I cannot post my image here due to low reputation!
The data 'appended' for x, y and redshift lists from the csv file are as follows:
x=[5.031,10.599,10.589,8.548,9.089,8.675,3.588,1.244,3.023,8.632,8.953,7.603,7.513,2.917,7.344,7.106,3.889,7.287,3.367,6.839,2.801,2.316,1.328,6.31,6.19,6.329,6.025,5.629,6.123,5.892,5.438,4.398,4.542,4.624,4.501,4.504,5.033,5.068,4.197,2.854,4.784,2.158,4.054,3.124,3.961,4.42,3.853,3.658,1.858,4.537,2.072,3.573,3.041,5.837,3.652,3.209,2.742,2.732,1.312,3.635,2.69,3.32,2.488,2.996,2.269,1.701,3.935,2.015,0.798,2.212,1.672,1.925,3.21,1.979,1.794,2.624,2.027,3.66,1.073,1.007,1.57,0.854,0.619,0.547]
y=[5.255,10.897,11.045,9.125,9.387,17.719,4.025,1.389,4.152,8.703,9.051,8.02,7.774,3.139,7.543,7.224,4.155,7.416,3.905,6.868,2.909,2.658,1.651,6.454,6.252,6.541,6.152,5.647,6.285,6.079,5.489,4.541,4.634,8.851,4.554,4.555,5.559,5.144,5.311,5.839,5.364,3.18,4.352,3.379,4.059,4.575,3.914,5.736,2.304,4.68,3.187,3.756,3.419,9.118,4.595,3.346,3.603,6.313,1.816,4.34,2.732,4.978,2.719,3.761,2.623,2.1,4.956,2.316,4.231,2.831,1.954,2.248,6.573,2.276,2.627,3.85,3.545,25.405,3.996,1.347,1.679,1.435,0.759,0.677]
redshift = [0.12,0.25,0.23,0.23,0.27,0.26,0.12,0.27,0.17,0.18,0.17,0.3,0.23,0.1,0.23,0.29,0.29,0.12,0.13,0.26,0.11,0.24,0.13,0.21,0.17,0.2,0.3,0.29,0.23,0.27,0.25,0.21,0.11,0.15,0.1,0.26,0.23,0.12,0.23,0.26,0.2,0.17,0.22,0.26,0.25,0.12,0.19,0.24,0.18,0.15,0.27,0.14,0.14,0.29,0.29,0.26,0.15,0.29,0.24,0.24,0.23,0.26,0.29,0.22,0.13,0.18,0.24,0.14,0.24,0.24,0.17,0.26,0.29,0.11,0.14,0.26,0.28,0.26,0.28,0.27,0.23,0.26,0.23,0.19]
Working with numerical data like this, you should really consider using a numerical library, like numpy.
The problem in your code arises from processing each record (a coordinate (x,y) and the corresponding value redshift) one at a time. You are calling plot for each point, thereby creating legends for each of those 84 datapoints. You should consider your "bins" as groups of data that belong to the same dataset and process them as such. You could use "logical masks" to distinguish between your "bins", as shown below.
It's also not clear why you call exit after each plotting action.
import numpy as np
import matplotlib.pyplot as plt
x = np.array([5.031,10.599,10.589,8.548,9.089,8.675,3.588,1.244,3.023,8.632,8.953,7.603,7.513,2.917,7.344,7.106,3.889,7.287,3.367,6.839,2.801,2.316,1.328,6.31,6.19,6.329,6.025,5.629,6.123,5.892,5.438,4.398,4.542,4.624,4.501,4.504,5.033,5.068,4.197,2.854,4.784,2.158,4.054,3.124,3.961,4.42,3.853,3.658,1.858,4.537,2.072,3.573,3.041,5.837,3.652,3.209,2.742,2.732,1.312,3.635,2.69,3.32,2.488,2.996,2.269,1.701,3.935,2.015,0.798,2.212,1.672,1.925,3.21,1.979,1.794,2.624,2.027,3.66,1.073,1.007,1.57,0.854,0.619,0.547])
y = np.array([5.255,10.897,11.045,9.125,9.387,17.719,4.025,1.389,4.152,8.703,9.051,8.02,7.774,3.139,7.543,7.224,4.155,7.416,3.905,6.868,2.909,2.658,1.651,6.454,6.252,6.541,6.152,5.647,6.285,6.079,5.489,4.541,4.634,8.851,4.554,4.555,5.559,5.144,5.311,5.839,5.364,3.18,4.352,3.379,4.059,4.575,3.914,5.736,2.304,4.68,3.187,3.756,3.419,9.118,4.595,3.346,3.603,6.313,1.816,4.34,2.732,4.978,2.719,3.761,2.623,2.1,4.956,2.316,4.231,2.831,1.954,2.248,6.573,2.276,2.627,3.85,3.545,25.405,3.996,1.347,1.679,1.435,0.759,0.677])
redshift = np.array([0.12,0.25,0.23,0.23,0.27,0.26,0.12,0.27,0.17,0.18,0.17,0.3,0.23,0.1,0.23,0.29,0.29,0.12,0.13,0.26,0.11,0.24,0.13,0.21,0.17,0.2,0.3,0.29,0.23,0.27,0.25,0.21,0.11,0.15,0.1,0.26,0.23,0.12,0.23,0.26,0.2,0.17,0.22,0.26,0.25,0.12,0.19,0.24,0.18,0.15,0.27,0.14,0.14,0.29,0.29,0.26,0.15,0.29,0.24,0.24,0.23,0.26,0.29,0.22,0.13,0.18,0.24,0.14,0.24,0.24,0.17,0.26,0.29,0.11,0.14,0.26,0.28,0.26,0.28,0.27,0.23,0.26,0.23,0.19])
bin3 = 0.25 <= redshift
bin2 = np.logical_and(0.2 <= redshift, redshift < 0.25)
bin1 = np.logical_and(0.1 <= redshift, redshift < 0.2)
plt.ion()
labels = ("$z < 0.2$", "$0.2 \leq z < 0.25$", "$z \geq 0.25$")
colors = ('r', 'g', 'b')
for bin, label, co in zip( (bin1, bin2, bin3), labels, colors):
plt.plot(x[bin], y[bin], color=co, ls='none', marker='o', label=label)
plt.legend()
plt.show()

Categories