How to plot specific parts of a matrix in matplotlib? - python

I have a matrix that represents temperature distribution in a hollow square plate (hope the attached figure helps). The problem is with the hollow part in the plate which doesn't represent any solid material so I need to exclude this part from the plot.
The simulation returns an np.array() with the temperature results (except, of course, for the hollow part). This is where I define the dimensions of the grid:
import numpy as np
plate_height = 0.4 #meters
hollow_square_height = 0.2 #meters
#discretization data
delta_x = delta_y = 0.05 #meters
grid_points_n = int(round(plate_height / delta_x)) + 1 # np.zeros needs an integer shape
grid = np.zeros(shape=(grid_points_n, grid_points_n))
# the simulation assures that the hollow part will remain zero valued.
So, how do I approach this?

Instead of changing the original data, you can mask the values that you don't want to be used in calculations, plots, etc.:
import matplotlib.pyplot as plt
import numpy as np
data = [
[11, 11, 12, 13],
[9, 0, 0, 12],
[8, 0, 0, 11],
[8, 9, 10, 11]
]
#Here's what you have:
data_array = np.array(data)
#Mask every position where there is a 0:
masked_data = np.ma.masked_equal(data_array, 0)
#Plot the matrix:
fig = plt.figure()
ax = fig.gca()
ax.matshow(masked_data, cmap=plt.cm.autumn_r) #_r => reverse the standard color map
plt.show()
#plt.savefig('heatmap.png')
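If a real temperature could ever be exactly 0, masking by value would hide it too. A minimal sketch of building the mask from the plate geometry instead, assuming the hollow square is centred in the plate and that nodes on its boundary belong to the solid (neither is stated in the question; the 20.0 fill value is only a stand-in for the simulation output):

```python
import numpy as np

plate_height = 0.4          # metres, from the question
hollow_square_height = 0.2  # metres, from the question
delta = 0.05                # grid spacing in metres, from the question

n = int(round(plate_height / delta)) + 1  # grid points per side
coords = np.arange(n) * delta             # node coordinates along one axis

# Assumption: hollow square centred in the plate; only nodes strictly
# inside it are masked, boundary nodes are treated as solid.
lo = (plate_height - hollow_square_height) / 2
hi = lo + hollow_square_height
tol = 1e-9                                # guards against float round-off
inside = (coords > lo + tol) & (coords < hi - tol)
hollow = np.outer(inside, inside)         # True where both x and y are inside

temperature = np.full((n, n), 20.0)       # stand-in for the simulation result
masked_temperature = np.ma.masked_array(temperature, mask=hollow)
```

The masked array can then be passed to ax.matshow exactly as in the answer above.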

Replace the zeros with nan; matshow and imshow leave nan cells blank. For example:
import matplotlib.pyplot as plt
from numpy import nan,matrix
M = matrix([
[20,30,25,20,50],
[22,nan,nan,nan,27],
[30,nan,nan,nan,20],
[33,nan,nan,nan,31],
[21,28,29,23,36]])
fig = plt.figure()
ax = fig.add_subplot(111)
ax.matshow(M, cmap=plt.cm.jet) # Show matrix color
plt.show()
You can replace zeros by nan in a matrix as follows (A must have a floating-point dtype, because nan cannot be stored in an integer array):
from numpy import nan
A[A==0.0]=nan # A is your matrix
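When the data is an integer array, the in-place assignment cannot store nan. A small sketch of one way around that, using np.where to build a float copy (the sample values are borrowed from the first answer; note that a genuine temperature of 0 would be replaced as well):

```python
import numpy as np

grid = np.array([[11, 11, 12, 13],
                 [ 9,  0,  0, 12],
                 [ 8,  0,  0, 11],
                 [ 8,  9, 10, 11]])   # integer dtype

# np.where promotes the result to float because nan is a float
with_nan = np.where(grid == 0, np.nan, grid)
```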

Related

How to plot histogram, when the number of values in interval is given? (python)

I know that when you usually plot a histogram you have an array of values and intervals.
But if I have intervals and the number of values that are in those intervals, how can I plot the histogram?
I have something that looks like this:
amounts = np.array([23, 7, 18, 5])
and my interval is from 0 to 4 with step 1,
so on interval [0,1] there are 23 values and so on.
You could probably try matplotlib.pyplot.stairs for this.
import matplotlib.pyplot as plt
import numpy as np
amounts = np.array([23, 7, 18, 5])
plt.stairs(amounts, range(5))
plt.show()
Please mark it as solved if this helps.
I find it easier to just simulate some data having the desired distribution, and then use plt.hist to plot the histogram.
Here is an example. Hopefully it will be helpful!
import numpy as np
import matplotlib.pyplot as plt
amounts = np.array([23, 7, 18, 5])
bin_edges = np.arange(5)
bin_centres = (bin_edges[1:] + bin_edges[:-1]) / 2
# fake some data having the desired distribution
data = [[bc] * amount for bc, amount in zip(bin_centres, amounts)]
data = np.concatenate(data)
hist = plt.hist(data, bins=bin_edges, histtype='step')[0]
plt.show()
# the plotted distribution is consistent with amounts
assert np.allclose(hist, amounts)
If you already know the values, then the histogram just becomes a bar plot.
import numpy as np
import matplotlib.pyplot as plt
amounts = np.array([23, 7, 18, 5])
interval = np.arange(5)
midvals = (interval + 0.5)[:len(interval) - 1] # 0.5, 1.5, 2.5, 3.5
plt.bar(midvals, amounts)
plt.xticks(interval) # Shows the interval ranges rather than the centers of the bars
plt.show()
If the gap between the bars looks too wide, you can change the width of the bars by passing a width argument (as a fraction of 1; the default is 0.8) to plt.bar().
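Another option that avoids both faking data and computing bar centres is to give plt.hist one dummy sample per bin and pass the known counts as weights. A minimal sketch using the numbers from the question:

```python
import numpy as np
import matplotlib.pyplot as plt

amounts = np.array([23, 7, 18, 5])
bin_edges = np.arange(5)          # 0, 1, 2, 3, 4

fig, ax = plt.subplots()
# One dummy sample per bin (its left edge), weighted by that bin's count
counts, edges, _ = ax.hist(bin_edges[:-1], bins=bin_edges, weights=amounts)
```

Calling plt.show() afterwards displays the figure; the bars fill their intervals, so no width tweaking is needed.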

How to numerically compute the mass map and density map for a collection of masses?

Good day to everyone. I was wondering if there is any way to extract a mass map and a mass density map for a scatter plot of mass distributions.
Developing the code for the mass distributions:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm
from scipy.ndimage import gaussian_filter # scipy.ndimage.filters is deprecated
from numpy.random import rand
# Finds nran number of random points in two dimensions
def randomizer(nran):
    arr = rand(nran, 2)
    return arr
# Calculates a sort of 'density' plot. Using this from a previous StackOverflow Question: https://stackoverflow.com/questions/2369492/generate-a-heatmap-in-matplotlib-using-a-scatter-data-set
def myplot(x, y, s, bins = 1000):
    plot, xedges, yedges = np.histogram2d(x, y, bins = bins)
    plot = gaussian_filter(plot, sigma = s)
    extent = [xedges[0], xedges[-1], yedges[0], yedges[-1]]
    return plot.T, extent
Trying out an example:
arr = randomizer(1000)
plot, extent = myplot(arr[:, 0], arr[:, 1], 20)
fig, ax = plt.subplots(1, 2, figsize = (15, 5))
ax[0].scatter(arr[:, 0], arr[:, 1])
ax[0].set_aspect('equal')
ax[0].set_xlabel('x')
ax[0].set_ylabel('y')
ax[0].set_title('Scatter Plot')
img = ax[1].imshow(plot)
ax[1].set_title('Density Plot?')
ax[1].set_aspect('equal')
ax[1].set_xlabel('x')
ax[1].set_ylabel('y')
plt.colorbar(img)
This yields a scatter plot and what I think is a kind of density plot (please correct me if I'm wrong). Now, suppose that each dot has a mass of 50 kg. Does the "density plot" represent a map of the total mass distribution (if that makes sense), given that the colorbar has a max value much less than 50? And how can I use this to compute a mass density for this mass distribution? I would really appreciate it if someone could help. Thank you.
Edit: Added the website from where I got the heatmap function.
Okay, I think I've got the solution. I've been meaning to upload this for quite an amount of time. Here it goes:
# Importing packages
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm
from numpy.random import random
from scipy.stats import binned_statistic_2d
# Finds nran number of random points in two dimensions
def randomizer(nran):
    arr_x = []
    arr_y = []
    for i in range(nran):
        arr_x += [10 * random()] # Since random() only produces floats in [0, 1), I multiply by 10 (for illustrative purposes)
        arr_y += [10 * random()] # Since random() only produces floats in [0, 1), I multiply by 10 (for illustrative purposes)
    return arr_x, arr_y
# Computing weight array
def weights_array(weight, length):
    weights = np.array([weight] * length)
    return weights
# Computes a weighted histogram and divides it by the total grid area to get the density
def histogramizer(x_array, y_array, weights, num_pixels, Dimension):
    Range = [0, Dimension] # Assumes the weights are distributed in a square area
    grid, _, _, _ = binned_statistic_2d(x_array, y_array, weights, 'sum', bins=num_pixels, range=[Range,Range])
    area = int(np.max(x_array)) * int(np.max(y_array))
    density = grid/area
    return density
Then, actually implementing this, one finds:
arr_x, arr_y = randomizer(1000000)
weights = []
for i in range(len(arr_x)):
    weights += [50]
density = histogramizer(arr_x, arr_y, weights, [400,400], np.max(arr_x))
fig, ax = plt.subplots(figsize = (15, 5))
plt.imshow(density, extent = [0, int(np.max(arr_x)), 0, int(np.max(arr_x))]);
plt.colorbar(label = 'kg m$^{-2}$'); # braces keep the whole exponent -2 in the superscript
The result I got for this was the following plot (I know it's generally not recommended to add a photo, but I wanted to add it for sake of showing my code's output):
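Note that the code above divides every pixel by the area of the whole region. If what is wanted is a surface density per pixel, a common alternative is to divide each pixel's summed mass by the area of that pixel. A minimal sketch with np.histogram2d, assuming a square region of side 10 (the names side, mass_map and density_map are illustrative, not from the answer):

```python
import numpy as np

rng = np.random.default_rng(0)
side = 10.0                      # side length of the square region (assumed)
n_points, mass = 100_000, 50.0   # 50 kg per point, as in the question
x = rng.uniform(0, side, n_points)
y = rng.uniform(0, side, n_points)

bins = 400
# Weighted 2-D histogram: each pixel holds the total mass that fell into it
mass_map, xedges, yedges = np.histogram2d(
    x, y, bins=bins, range=[[0, side], [0, side]],
    weights=np.full(n_points, mass),
)
pixel_area = (side / bins) ** 2          # area of one pixel
density_map = mass_map / pixel_area      # mass per unit area in each pixel
```

With this normalisation, summing density_map * pixel_area over all pixels recovers the total mass, and the mean of density_map equals the total mass divided by the total area.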

Set 'global' colorbar range for multiple matplotlib subplots of different ranges

I would like to plot data in subplots using matplotlib.pyplot in python. Each subplot will contain data of different ranges. I would like to plot them using pyplot.scatter, and use one single colorbar for the entire plot. Thus, the colorbar should encompass the entire range of the values in every subplot. However, when I use a loop to plot the subplots and call a colorbar outside of the loop, it only uses the range of values from the last subplot. A lot of the available examples concern the sizing and positioning of the colorbar, so the answer to this question (how to make one universal colorbar for multiple subplots) is not obvious.
I have the following self-contained example code. Here, two subplots are rendered, one that should be colored with frigid temperatures typical of Russia and the other with tropical temperatures of Brazil. However, the end result shows a colorbar that only ranges the tropical Brazilian temperatures, making the Russia subplot erroneous:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
core_list = ['Russia', 'Brazil']
core_depth = [0, 2, 4, 6, 8, 10]
lo = [-33, 28]
hi = [10, 38]
df = pd.DataFrame([], columns = ['Location', 'Depth', '%TOC', 'Temperature'])
#Fill df
for ii, name in enumerate(core_list):
    for jj in core_depth:
        df.loc[len(df.index)] = [name, jj, (np.random.randint(1, 20))/10, np.random.randint(lo[ii], hi[ii])]
#Russia data have much colder temperatures than Brazil data due to hi and lo
#Plot data from each location using scatter plots
fig, axs = plt.subplots(nrows = 1, ncols = 2, sharey = True)
for nn, name in enumerate(core_list):
    core_mask = df['Location'] == name
    data = df.loc[core_mask]
    plt.sca(axs[nn])
    plt.scatter(data['Depth'], data['%TOC'], c = data['Temperature'], s = 50, edgecolors = 'k')
    axs[nn].set_xlabel('%TOC')
    plt.text(1.25*min(data['%TOC']), 1.75, name)
    if nn == 0:
        axs[nn].set_ylabel('Depth')
cbar = plt.colorbar()
cbar.ax.set_ylabel('Temperature, degrees C')
#How did Russia get so warm?!? Temperatures and ranges of colorbar are set to last called location.
#How do I make one colorbar encompass global temperature range of both data sets?
The output of this code shows that the temperatures in Brazil and Russia fall within the same range of colors:
We know intuitively, and from glancing at the data, that this is wrong. So, how do we tell pyplot to plot this correctly?
The answer is straightforward using the vmax and vmin controls of pyplot.scatter. These must be set with a universal range of data, not just the data focused on in any single iteration of a loop. Thus, to change the code above:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
core_list = ['Russia', 'Brazil']
core_depth = [0, 2, 4, 6, 8, 10]
lo = [-33, 28]
hi = [10, 38]
df = pd.DataFrame([], columns = ['Location', 'Depth', '%TOC', 'Temperature'])
#Fill df
for ii, name in enumerate(core_list):
    for jj in core_depth:
        df.loc[len(df.index)] = [
            name,
            jj,
            (np.random.randint(1, 20))/10,
            np.random.randint(lo[ii], hi[ii])
        ]
#Russia data have much colder temperatures than Brazil data due to hi and lo
#Plot data from each location using scatter plots
fig, axs = plt.subplots(nrows = 1, ncols = 2, sharey = True)
for nn, name in enumerate(core_list):
    core_mask = df['Location'] == name
    data = df.loc[core_mask]
    plt.sca(axs[nn])
    plt.scatter(
        data['Depth'],
        data['%TOC'],
        c=data['Temperature'],
        s=50,
        edgecolors='k',
        vmax=max(df['Temperature']),
        vmin=min(df['Temperature'])
    )
    axs[nn].set_xlabel('%TOC')
    plt.text(1.25*min(data['%TOC']), 1.75, name)
    if nn == 0:
        axs[nn].set_ylabel('Depth')
cbar = plt.colorbar()
cbar.ax.set_ylabel('Temperature, degrees C')
Now the output shows a temperature difference between Russia and Brazil, which one would expect after a cursory glance at the data. The change that fixes the problem occurs inside the for loop, but it references all of the data to find the max and min:
plt.scatter(data['Depth'], data['%TOC'], c = data['Temperature'], s = 50, edgecolors = 'k', vmax = max(df['Temperature']), vmin = min(df['Temperature']) )
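An equivalent way to express the same idea is to build one Normalize object from the global range and hand it to every scatter call, then attach a single colorbar to all axes. This is only a sketch with made-up data (the temps dictionary and the random %TOC values are stand-ins for the DataFrame above):

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import Normalize

rng = np.random.default_rng(1)
depth = np.array([0, 2, 4, 6, 8, 10])
temps = {                                   # illustrative stand-in data
    "Russia": rng.integers(-33, 10, size=depth.size),
    "Brazil": rng.integers(28, 38, size=depth.size),
}

# One normalisation built from the temperatures of every location
all_temps = np.concatenate(list(temps.values()))
norm = Normalize(vmin=all_temps.min(), vmax=all_temps.max())

fig, axs = plt.subplots(1, 2, sharey=True)
for ax, (name, t) in zip(axs, temps.items()):
    toc = rng.integers(1, 20, size=depth.size) / 10
    sc = ax.scatter(toc, depth, c=t, norm=norm, s=50, edgecolors="k")
    ax.set_title(name)
    ax.set_xlabel("%TOC")
axs[0].set_ylabel("Depth")

# Any of the scatter artists can drive the colorbar, since they share norm
cbar = fig.colorbar(sc, ax=axs)
cbar.set_label("Temperature, degrees C")
```

Passing ax=axs makes the colorbar take its space from both subplots rather than only the last one.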

Plotting discrete colorbar in legend style using Matplotlib

Sometimes I want to plot discrete values in pcolormesh style.
For example, to represent a 100x100 2-D array that contains the integers 0 to 7:
import numpy as np
import matplotlib.pyplot as plt
data = np.random.randint(8, size=(100,100))
cmap = plt.get_cmap('PiYG', 8) # plt.cm.get_cmap was removed in recent matplotlib releases
plt.pcolormesh(data,cmap = cmap,alpha = 0.75)
plt.colorbar()
The figure looks like this:
How can I generate the colorbar in legend style? In other words, each color box should correspond to its value (e.g. pink color box --> 0).
An illustration (it does not fit this example):
Maybe the easiest way is to create corresponding number of Patch instances:
import matplotlib.patches as mpatches
import matplotlib.pyplot as plt
import numpy as np
data = np.random.randint(8, size=(100,100))
cmap = plt.get_cmap('PiYG', 8) # plt.cm.get_cmap was removed in recent matplotlib releases
plt.pcolormesh(data,cmap = cmap,alpha = 0.75)
# Set borders in the interval [0, 1]
bound = np.linspace(0, 1, 9)
# Preparing borders for the legend
bound_prep = np.round(bound * 7, 2)
# Creating 8 Patch instances
plt.legend([mpatches.Patch(color=cmap(b)) for b in bound[:-1]],
['{} - {}'.format(bound_prep[i], bound_prep[i+1] - 0.01) for i in range(8)])
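Since the data here only takes the integer values 0 to 7, the legend can also carry one entry per value rather than a range. A possible variation on the code above, with vmin and vmax chosen so that each integer lands in the middle of its colour band (these limits and the legend placement are my own choices, not part of the original answer):

```python
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches

n_values = 8
data = np.random.randint(n_values, size=(100, 100))
cmap = plt.get_cmap("PiYG", n_values)     # colormap resampled to 8 colours

fig, ax = plt.subplots()
# Limits of -0.5 .. 7.5 map integer k onto the k-th colour
ax.pcolormesh(data, cmap=cmap, vmin=-0.5, vmax=n_values - 0.5)

# One legend entry per integer value; cmap(i) is the i-th of the 8 colours
handles = [mpatches.Patch(color=cmap(i), label=str(i)) for i in range(n_values)]
legend = ax.legend(handles=handles, title="value",
                   bbox_to_anchor=(1.02, 1), loc="upper left")
```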

Setting different color for each series in scatter plot on matplotlib

Suppose I have three data sets:
X = [1,2,3,4]
Y1 = [4,8,12,16]
Y2 = [1,4,9,16]
I can scatter plot this:
from matplotlib import pyplot as plt
plt.scatter(X,Y1,color='red')
plt.scatter(X,Y2,color='blue')
plt.show()
How can I do this with 10 sets?
I searched for this and couldn't find any reference to what I'm asking.
Edit: clarifying (hopefully) my question
If I call scatter multiple times, I can only set the same color on each scatter. Also, I know I can set a color array manually but I'm sure there is a better way to do this.
My question is then, "How can I automatically scatter-plot my several data sets, each with a different color?"
If that helps, I can easily assign a unique number to each data set.
I don't know what you mean by 'manually'. You can choose a colourmap and make a colour array easily enough:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm
x = np.arange(10)
ys = [i+x+(i*x)**2 for i in range(10)]
colors = cm.rainbow(np.linspace(0, 1, len(ys)))
for y, c in zip(ys, colors):
    plt.scatter(x, y, color=c)
Or you can make your own colour cycler using itertools.cycle and specifying the colours you want to loop over, using next to get the one you want. For example, with 3 colours:
import itertools
colors = itertools.cycle(["r", "b", "g"])
for y in ys:
    plt.scatter(x, y, color=next(colors))
Come to think of it, maybe it's cleaner not to use zip with the first one either:
colors = iter(cm.rainbow(np.linspace(0, 1, len(ys))))
for y in ys:
    plt.scatter(x, y, color=next(colors))
The normal way to plot points in different colors in matplotlib is to pass a list of colors as a parameter.
E.g.:
import matplotlib.pyplot
matplotlib.pyplot.scatter([1,2,3],[4,5,6],color=['red','green','blue'])
When you have a list of lists and want each list to get its own color,
I think the most elegant way is the one suggested by @DSM:
just loop, making multiple calls to scatter.
But if for some reason you want to do it with just one call, you can build one big list of colors with a list comprehension and a bit of floor division:
import matplotlib.cm
import matplotlib.pyplot # a bare "import matplotlib" does not load pyplot
import numpy as np
X = [1,2,3,4]
Ys = np.array([[4,8,12,16],
[1,4,9,16],
[17, 10, 13, 18],
[9, 10, 18, 11],
[4, 15, 17, 6],
[7, 10, 8, 7],
[9, 0, 10, 11],
[14, 1, 15, 5],
[8, 15, 9, 14],
[20, 7, 1, 5]])
nCols = len(X)
nRows = Ys.shape[0]
colors = matplotlib.cm.rainbow(np.linspace(0, 1, len(Ys)))
cs = [colors[i//len(X)] for i in range(len(Ys)*len(X))] # could also be done with np.repeat
Xs=X*nRows #use list multiplication for repetition
matplotlib.pyplot.scatter(Xs,Ys.flatten(),color=cs)
cs = [array([ 0.5, 0. , 1. , 1. ]),
array([ 0.5, 0. , 1. , 1. ]),
array([ 0.5, 0. , 1. , 1. ]),
array([ 0.5, 0. , 1. , 1. ]),
array([ 0.28039216, 0.33815827, 0.98516223, 1. ]),
array([ 0.28039216, 0.33815827, 0.98516223, 1. ]),
array([ 0.28039216, 0.33815827, 0.98516223, 1. ]),
array([ 0.28039216, 0.33815827, 0.98516223, 1. ]),
...
array([ 1.00000000e+00, 1.22464680e-16, 6.12323400e-17,
1.00000000e+00]),
array([ 1.00000000e+00, 1.22464680e-16, 6.12323400e-17,
1.00000000e+00]),
array([ 1.00000000e+00, 1.22464680e-16, 6.12323400e-17,
1.00000000e+00]),
array([ 1.00000000e+00, 1.22464680e-16, 6.12323400e-17,
1.00000000e+00])]
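The same per-point colour array can be built without the list comprehension by repeating each series colour with np.repeat. A short sketch under the same layout as above (one row of Ys per series, only three rows here to keep it small):

```python
import numpy as np
import matplotlib.pyplot as plt

X = np.array([1, 2, 3, 4])
Ys = np.array([[4, 8, 12, 16],
               [1, 4, 9, 16],
               [17, 10, 13, 18]])
n_series, n_points = Ys.shape

# One RGBA colour per series, then each colour repeated once per point
series_colors = plt.get_cmap("rainbow")(np.linspace(0, 1, n_series))
point_colors = np.repeat(series_colors, n_points, axis=0)
xs = np.tile(X, n_series)              # X repeated once per series

fig, ax = plt.subplots()
pc = ax.scatter(xs, Ys.ravel(), color=point_colors)
```

Ys.ravel() flattens row by row, so the order of the points matches the order of the repeated colours.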
An easy fix
If you have only one type of collection (e.g. scatter with no error bars), you can also change the colours after you have plotted them, which is sometimes easier.
import matplotlib.pyplot as plt
from random import randint
import numpy as np
#Let's generate some random X, Y data: X = [[first group], [second group], ...]
X = [ [randint(0,50) for i in range(0,5)] for i in range(0,24)]
Y = [ [randint(0,50) for i in range(0,5)] for i in range(0,24)]
labels = range(1,len(X)+1)
fig = plt.figure()
ax = fig.add_subplot(111)
for x,y,lab in zip(X,Y,labels):
    ax.scatter(x,y,label=lab)
The only piece of code that you need:
# This is the part you actually need; just copy and paste it. Note that it requires the Axes object ax.
colormap = plt.cm.gist_ncar #nipy_spectral, Set1,Paired
colorst = [colormap(i) for i in np.linspace(0, 0.9,len(ax.collections))]
for t,j1 in enumerate(ax.collections):
    j1.set_color(colorst[t])
ax.legend(fontsize='small')
The output gives you different colors even when you have many different scatter plots in the same subplot.
You can always use the plot() function like so:
import matplotlib.pyplot as plt
import numpy as np
x = np.arange(10)
ys = [i+x+(i*x)**2 for i in range(10)]
plt.figure()
for y in ys:
    plt.plot(x, y, 'o')
plt.show()
This question was a bit tricky before January 2013 and matplotlib 1.3.1 (Aug 2013), which is the oldest stable version you can find on the matplotlib website. After that, it is quite trivial,
because the present version of matplotlib.pylab.scatter supports assigning an array of colour name strings, an array of floats with a colour map, or an array of RGB or RGBA values.
This answer is dedicated to @Oxinabox's endless passion for correcting the 2013 version of myself in 2015.
You have two options for using the scatter command with multiple colours in a single call.
First, the pylab.scatter command supports an RGBA array, so you can use whatever colours you want.
Second, back in early 2013 there was no way to do so, since the command only supported a single colour for the whole scatter point collection. While working on my 10000-line project I figured out a general workaround. It is very tacky, but it lets me draw points in whatever shape, colour, size and transparency I want. The trick can also be applied to path collections, line collections, and so on.
The code is inspired by the source code of pyplot.scatter; I just duplicated what scatter does without triggering a draw.
The pyplot.scatter command returns a PathCollection object. In the file "matplotlib/collections.py", the Collection class has a private variable _facecolors and a method set_facecolors.
So whenever you have scatter points to draw, you can do this:
# rgbaArr is a N*4 array of float numbers you know what I mean
# X is a N*2 array of coordinates
# axx is the axes object that current draw, you get it from
# axx = fig.gca()
# also import these, to recreate the environment inside the scatter command
import numpy as np
import matplotlib.markers as mmarkers
import matplotlib.transforms as mtransforms
from matplotlib.collections import PatchCollection
import matplotlib.patches as mpatches
# define this function
# m is a scatter marker string; it could be 'o', 's', etc.
# s is the size of the point, use 1.0
# dpi, get it from axx.figure.dpi
def addPatch_point(m, s, dpi):
    marker_obj = mmarkers.MarkerStyle(m)
    path = marker_obj.get_path()
    trans = mtransforms.Affine2D().scale(np.sqrt(s*5)*dpi/72.0)
    ptch = mpatches.PathPatch(path, fill = True, transform = trans)
    return ptch
patches = []
# markerArr is an array of marker strings, ['o', 's', 'o', ...]
# sizeArr is an array of size floats, [1.0, 1.0, 0.5, ...]
for m, s in zip(markerArr, sizeArr):
    patches.append(addPatch_point(m, s, axx.figure.dpi))
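pclt = PatchCollection(
    patches,
    offsets = list(zip(X[:,0], X[:,1])), # zip is lazy in Python 3, so materialise it
    transOffset = axx.transData) # newer matplotlib releases name this parameter offset_transform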
pclt.set_transform(mtransforms.IdentityTransform())
pclt.set_edgecolors('none') # it's up to you
pclt._facecolors = rgbaArr
# in the end, when you decide to draw
axx.add_collection(pclt)
# and call axx's parent to draw_idle()
A MUCH faster solution for a large dataset and a limited number of colors is to use Pandas and the groupby function:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import time
# a generic set of data with associated colors
nsamples=1000
x=np.random.uniform(0,10,nsamples)
y=np.random.uniform(0,10,nsamples)
colors={0:'r',1:'g',2:'b',3:'k'}
c=[colors[i] for i in np.round(np.random.uniform(0,3,nsamples),0)]
plt.close('all')
# "Fast" Scatter plotting
starttime=time.time()
# 1) make a dataframe
df=pd.DataFrame()
df['x']=x
df['y']=y
df['c']=c
plt.figure()
# 2) group the dataframe by color and loop
for g,b in df.groupby(by='c'):
    plt.scatter(b['x'],b['y'],color=g)
print('Fast execution time:', time.time()-starttime)
# "Slow" Scatter plotting
starttime=time.time()
plt.figure()
# one scatter call per point (this is what makes it slow)
for i in range(len(x)):
    plt.scatter(x[i],y[i],color=c[i])
print('Slow execution time:', time.time()-starttime)
plt.show()
This works for me:
for each series, use a random RGB colour generator:
c = (np.random.random_sample(), np.random.random_sample(), np.random.random_sample()) # random RGB tuple; pass it to scatter as color=c
You can also create a list of colors, with one entry per point in your scatter plot, and pass it as a parameter like this:
colors = ["red", "blue", "green"]
plt.scatter(X, Y, color = colors)
