Library/tool for drawing ternary/triangle plots [closed] - python

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions asking us to recommend or find a tool, library or favorite off-site resource are off-topic for Stack Overflow as they tend to attract opinionated answers and spam. Instead, describe the problem and what has been done so far to solve it.
Closed 9 years ago.
I need to draw ternary/triangle plots representing mole fractions (x, y, z) of various substances/mixtures (x + y + z = 1). Each plot represents iso-valued substances, e.g. substances which have the same melting point. The plots need to be drawn on the same triangle with different colors/symbols and it would be nice if I could also connect the dots.
I have looked at matplotlib, R and gnuplot, but they don't seem to be able to draw this kind of plot. The 3rd party ade4 package for R seems to be able to draw it, but I'm not sure if I can draw multiple plots on the same triangle.
I need something that runs under Linux or Windows. I'm open to any suggestions, including libraries for other languages, e.g. Perl, PHP, Ruby, C# and Java.

Created a very basic script for generating ternary (or more) plots. No gridlines or ticklines, but those wouldn't be too hard to add using the vectors in the "basis" array.
from pylab import *
def ternaryPlot(
        data,
        # Scale data for ternary plot (i.e. a + b + c = 1)
        scaling=True,
        # Direction of first vertex.
        start_angle=90,
        # Orient labels perpendicular to vertices.
        rotate_labels=True,
        # Labels for vertices.
        labels=('one','two','three'),
        # Can accommodate more than 3 dimensions if desired.
        sides=3,
        # Offset for label from vertex (percent of distance from origin).
        label_offset=0.10,
        # Any matplotlib keyword args for plots.
        edge_args={'color':'black','linewidth':2},
        # Any matplotlib keyword args for figures.
        fig_args = {'figsize':(8,8),'facecolor':'white','edgecolor':'white'},
        ):
    '''
    This will create a basic "ternary" plot (or quaternary, etc.)
    '''
    basis = array(
        [
            [
                cos(2*_*pi/sides + start_angle*pi/180),
                sin(2*_*pi/sides + start_angle*pi/180)
            ]
            for _ in range(sides)
        ]
    )
    # If data is Nxsides, newdata is Nx2.
    if scaling:
        # Scales data for you.
        newdata = dot((data.T / data.sum(-1)).T,basis)
    else:
        # Assumes data already sums to 1.
        newdata = dot(data,basis)
    fig = figure(**fig_args)
    ax = fig.add_subplot(111)
    for i,l in enumerate(labels):
        if i >= sides:
            break
        x = basis[i,0]
        y = basis[i,1]
        if rotate_labels:
            angle = 180*arctan(y/x)/pi + 90
            if angle > 90 and angle <= 270:
                angle = mod(angle + 180,360)
        else:
            angle = 0
        ax.text(
            x*(1 + label_offset),
            y*(1 + label_offset),
            l,
            horizontalalignment='center',
            verticalalignment='center',
            rotation=angle
        )
    # Clear normal matplotlib axes graphics.
    ax.set_xticks(())
    ax.set_yticks(())
    ax.set_frame_on(False)
    # Plot border (list() is needed on Python 3, where range() is not a list)
    ax.plot(
        [basis[_,0] for _ in list(range(sides)) + [0,]],
        [basis[_,1] for _ in list(range(sides)) + [0,]],
        **edge_args
    )
    return newdata,ax
if __name__ == '__main__':
    k = 0.5
    s = 1000
    data = vstack((
        array([k,0,0]) + rand(s,3),
        array([0,k,0]) + rand(s,3),
        array([0,0,k]) + rand(s,3)
    ))
    color = array([[1,0,0]]*s + [[0,1,0]]*s + [[0,0,1]]*s)
    newdata,ax = ternaryPlot(data)
    ax.scatter(
        newdata[:,0],
        newdata[:,1],
        s=2,
        alpha=0.5,
        color=color
    )
    show()

R has an external package called vcd which should do what you want.
The documentation is very good (a 122-page manual is distributed with the package); there's also a book whose title gives the package its name, Visualizing Categorical Data, by Prof. Michael Friendly, which inspired the package.
To create ternary plots using vcd, just call ternaryplot() and pass in an m x 3 matrix, i.e., a matrix with three columns.
The function signature is very simple: only a single parameter (the m x 3 data matrix) is required, and all of the keyword parameters relate to the plot's aesthetics, except for scale, which, when set to 1, normalizes the data so that each row sums to 1.
To plot data points on the ternary plot, the coordinates of a given point are calculated as the center of gravity of mass points, with each feature value in a row of the data matrix acting as a separate weight; hence the coordinates of a point V(a, b, c), with a + b + c = 1, are
V(b + c/2, c * (3^.5)/2)
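As a rough sketch of that mapping in Python (not vcd's actual implementation), the function below normalises a composition and returns x = b + c/2, y = c * sqrt(3)/2. The name ternary_to_xy is made up for illustration.

```python
import math

def ternary_to_xy(a, b, c):
    """Map a composition (a, b, c) to Cartesian (x, y) inside a unit-side triangle."""
    total = a + b + c
    if total <= 0:
        raise ValueError("components must sum to a positive value")
    # Normalise so that a + b + c = 1 (what a scale of 1 amounts to)
    a, b, c = a / total, b / total, c / total
    x = b + c / 2.0
    y = c * math.sqrt(3) / 2.0
    return x, y

# The three pure components land on the triangle's corners
corner_a = ternary_to_xy(1, 0, 0)   # bottom left
corner_b = ternary_to_xy(0, 1, 0)   # bottom right
corner_c = ternary_to_xy(0, 0, 1)   # top
```

Feeding every row of an m x 3 matrix through such a function gives points that any ordinary 2-D scatter routine can draw.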
To generate the diagram below, I just created some fake data to represent four different chemical mixtures, each composed of varying fractions of three substances (x, y, z). I scaled the input (so x + y + z = 1), but the function will do it for you if you pass in a value for its 'scale' parameter (in fact, the default is 1, which I believe is what your question requires). I used different colors and symbols to represent the four data points, but you can also just use a single color/symbol and label each point (via the 'id' argument).

A package I have authored in R has just been accepted on CRAN; the webpage is www.ggtern.com.
It is based on ggplot2, which I have used as a platform. The driving force for me was a desire for consistency in my work and, since I use ggplot2 heavily, developing the package was a logical progression.
For those of you who use ggplot2, using ggtern should be a breeze. Here are a couple of demonstrations of what can be achieved.
Produced with the following code:
# Load data
data(Feldspar)
# Sort it by decreasing pressure
# (so small grobs sit on top of large grobs)
Feldspar <- Feldspar[with(Feldspar, order(-P.Gpa)), ]
# Build and Render the Plot
ggtern(data = Feldspar, aes(x = An, y = Ab, z = Or)) +
#the layer
geom_point(aes(fill = T.C,
size = P.Gpa,
shape = Feldspar)) +
#scales
scale_shape_manual(values = c(21, 24)) +
scale_size_continuous(range = c(2.5, 7.5)) +
scale_fill_gradient(low = "green", high = "red") +
#theme tweaks
theme_tern_bw() +
theme(legend.position = c(0, 1),
legend.justification = c(0, 1),
legend.box.just = "left") +
#tweak guides
guides(shape= guide_legend(order =1,
override.aes=list(size=5)),
size = guide_legend(order =2),
fill = guide_colourbar(order=3)) +
#labels and title
labs(size = "Pressure/GPa",
fill = "Temperature/C") +
ggtitle("Feldspar - Elkins and Grove 1990")
Contour plots have also been patched for the ternary environment, and a new geometry has been included for representing confidence intervals via the Mahalanobis distance.
Produced with the following code:
ggtern(data=Feldspar,aes(An,Ab,Or)) +
geom_confidence(aes(group=Feldspar,
fill=..level..,
alpha=1-..level..),
n=2000,
breaks=c(0.01,0.02,0.03,0.04,
seq(0.05,0.95,by=0.1),
0.99,0.995,0.9995),
color=NA,linetype=1) +
geom_density2d(aes(color=..level..)) +
geom_point(fill="white",aes(shape=Feldspar),size=5) +
theme_tern_bw() +
theme_tern_nogrid() +
theme(ternary.options=element_ternary(padding=0.2),
legend.position=c(0,1),
legend.justification=c(0,1),
legend.box.just="left") +
labs(color="Density",fill="Confidence",
title="Feldspar - Elkins and Grove 1990 + Confidence Levels + Density") +
scale_color_gradient(low="gray",high="magenta") +
scale_fill_gradient2(low="red",mid="orange",high="green",
midpoint=0.8) +
scale_shape_manual(values=c(21,24)) +
guides(shape= guide_legend(order =1,
override.aes=list(size=5)),
size = guide_legend(order =2),
fill = guide_colourbar(order=3),
color= guide_colourbar(order=4),
alpha= "none")

Veusz supports ternary plots. Here is an example from the documentation:

Chloë Lewis developed a general triangle-plot class, meant to support the soil texture triangle,
with Python and Matplotlib. It's available here: http://nature.berkeley.edu/~chlewis/Sourcecode.html https://github.com/chlewissoil/TernaryPlotPy
Chloe editing to add: Moved it to a more reliable host! Also, it's a public repo, so if you want to request library-ization, you could add an issue. Hope it's useful to someone.

I just discovered a tool which uses Python/Matplotlib to generate ternary plots called wxTernary. It's available via http://wxternary.sourceforge.net/ -- I was able to successfully generate a ternary plot on the first try.

There seems to be an implementation at work here in gnuplot:
(source: ugm.ac.id)

There is an R package named soiltexture. It's aimed at soil texture triangle plots, but some aspects can be customized.

Find a vector drawing library and draw it from scratch if you can't find an easier way to do it.

Related

How to make a 2D plot with color density as the 3rd argument in python 3

I'd like to make a plot where each point has its x and y values plus a third value expressing the color density at that point. By feeding the output of my Python code into Mathematica I am able to do it using the following code, but now I want to do it using only Python (preferably with matplotlib).
def printMath2DTableMethod():
    print('{', end="")
    for i in range(0, lines, 1):
        print('{', end="")
        for j in range(0, columns, 1):
            f = int(columns * rearrange_.rearrangeMethod(i) + rearrange_.rearrangeMethod(j))
            print('%d' % size[f], end = '')
            if (j < columns - 1):
                print(',', end='')
        if (i < lines - 1):
            print('},')
        else:
            print('}}')
The plot should look similar to the images in these two questions:
How can I make a scatter plot colored by density in matplotlib?
How to plot a density map in python?
It should have a colorbar at the side, and the points with the highest density should be drawn on top of the other points (if they overlap).
I append the data that this method produces to a file, and it looks like:
1,2,4,5,6,2,6 x256 columns in total
3,2,4,5,1,6,4
4,2,5,6,1,7,5
x256 rows in total
The plot can be made either directly from the code or by reading the data from the file. What I don't know is how to assign values to x (the i of the first for loop in the code above), to y (the j of the second for loop) and especially to the third argument, the one which will show the color density (size[f] in the code above), since it depends on the i and j of the for loops.
I have been trying to research and solve it myself all these days, but not much success, so any help would be highly appreciated. Thanks in advance :)
Here are examples for both plots you linked
import matplotlib.pyplot as plt
import numpy as np
# scatterplot as link 1
Data = np.random.randn(1000,3)
plt.scatter(Data[:,0],Data[:,1],c=Data[:,2],cmap='magma')
plt.colorbar()
# density matrix as link 2
Nbins = 50
M = np.zeros((Nbins+1,Nbins+1))
xinds = np.digitize(Data[:,0],np.linspace(-3,3,Nbins)) # choose limits accordingly
yinds = np.digitize(Data[:,1],np.linspace(-3,3,Nbins))
# sort ascending so that, within a bin, the highest value is written last and wins
sort_inds = np.argsort(Data[:,2])
Data = Data[sort_inds,:]
xinds = xinds[sort_inds]
yinds = yinds[sort_inds]
for i in range(Data.shape[0]):
    M[xinds[i],yinds[i]] = Data[i,2]
plt.matshow(M,cmap='magma',
            extent=(Data[:,0].min(),Data[:,0].max(),Data[:,1].max(),Data[:,1].min()),
            aspect='equal')
plt.colorbar()
plt.show()
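For the grid described in the question (row index i as x, column index j as y, stored value as colour), one possible approach is to flatten the grid into three arrays and sort them so the largest values are drawn last. This is only a sketch: grid_to_points is a made-up helper, the random grid stands in for the real 256 x 256 data, and the file name in the comment is hypothetical.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # off-screen backend; drop this line for interactive use
import matplotlib.pyplot as plt

def grid_to_points(values):
    """Flatten a 2-D grid into x, y, c arrays, sorted so the largest c comes last."""
    values = np.asarray(values)
    rows, cols = np.indices(values.shape)   # rows[i, j] == i, cols[i, j] == j
    x, y, c = rows.ravel(), cols.ravel(), values.ravel()
    order = np.argsort(c)                   # ascending: big values are drawn last, on top
    return x[order], y[order], c[order]

# Stand-in data; a real file could be read with np.loadtxt("data.txt", delimiter=",")
grid = np.random.randint(1, 8, size=(64, 64))
x, y, c = grid_to_points(grid)

fig, ax = plt.subplots()
points = ax.scatter(x, y, c=c, cmap="magma", s=4)
fig.colorbar(points, ax=ax, label="value")
# finish with plt.show() or fig.savefig(...) as needed
```

If every cell of the grid holds a value, ax.imshow(grid) with a colorbar gives much the same picture with less work; the scatter form is mainly useful when points can overlap.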

Differing length of matplotlib.pyplot.pcolorfast edges on symlog scale

I'm currently trying to create a coloured grid plot on a logarithmic scale using matplotlib's pcolorfast. As I want to include the area from 0 to 1, I'm using "symlog" as a scale instead of "log".
fig, ax = plt.subplots()
Z = np.random.random(size=(RATE_EXPONENT + 1, BLOCK_EXPONENT + 1))
x_edges = [0] + [AXIS_BASE ** i for i in range(RATE_EXPONENT + 1)]
y_edges = [0] + [AXIS_BASE ** i for i in range(BLOCK_EXPONENT + 1)]
ax.set_xbound(0.0, MAX_FEE_RATE)
ax.set_ybound(0.0, MAX_CONFIRMATION_BLOCKS)
ax.set_xlabel('Fee rate in satoshis / byte')
ax.set_ylabel('Confirmation time in blocks')
ax.set_xscale('symlog')
ax.set_yscale('symlog')
ax.set_xticks(x_edges)
ax.set_yticks(y_edges)
ax.get_xaxis().set_major_formatter(ticker.ScalarFormatter())
ax.get_yaxis().set_major_formatter(ticker.ScalarFormatter())
colour_map = colors.LinearSegmentedColormap.from_list('GreenRed', ['red', 'green'], N=256)
ax.pcolorfast(x_edges, y_edges, Z, cmap=colour_map)
plt.show()
Unfortunately, the edges aren't quite predictably spaced to the point where I'd know how to input my data and in fact, the edges are moved depending on the zoom factor.
For reference, this is what it looks like all zoomed out
and this is what it looks like when you zoom into the interval from 2 to 4
As you can see, the grid edges move as I zoom in. I'd also like for the edges to be placed at the same intervals as the axis ticks, however I've not found anything useful in the pyplot docs.
Any help would be much appreciated!
PS: Using a linear instead of a symlog scale works. Same thing when using xlim / ylim.
Use matplotlib.pyplot.pcolormesh instead of matplotlib.axes.Axes.pcolorfast. The docstring of pcolorfast says that it is experimental and
"...it lacks support for log scaling of the axes...",
as of the current version 2.0.0.
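A minimal sketch of the pcolormesh variant follows. The constants are assumed values (the question does not state them), and a built-in colormap is used for brevity. Note that pcolormesh expects Z to have one row per y interval and one column per x interval.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # off-screen backend; drop this line for interactive use
import matplotlib.pyplot as plt

AXIS_BASE = 2        # assumed values; the question does not give them
RATE_EXPONENT = 5
BLOCK_EXPONENT = 4

x_edges = np.array([0] + [AXIS_BASE ** i for i in range(RATE_EXPONENT + 1)])
y_edges = np.array([0] + [AXIS_BASE ** i for i in range(BLOCK_EXPONENT + 1)])
# Z needs shape (number of y intervals, number of x intervals)
Z = np.random.random(size=(len(y_edges) - 1, len(x_edges) - 1))

fig, ax = plt.subplots()
mesh = ax.pcolormesh(x_edges, y_edges, Z, cmap="RdYlGn")
ax.set_xscale("symlog")
ax.set_yscale("symlog")
ax.set_xticks(x_edges)
ax.set_yticks(y_edges)
ax.set_xlabel("Fee rate in satoshis / byte")
ax.set_ylabel("Confirmation time in blocks")
fig.colorbar(mesh, ax=ax)
# finish with plt.show() or fig.savefig(...) as needed
```

Because the cell corners are given in data coordinates, they should stay on the tick positions when zooming.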

How to remove/omit smaller contour lines using matplotlib

I am trying to plot contour lines of pressure levels. I am using a netCDF file which contains high-resolution data (ranging from 3 km to 27 km). Because of the high resolution, I get a lot of pressure contours that do not need to be plotted (I don't mind omitting contour lines of insignificant size). I have written a plotting script based on the examples given at http://matplotlib.org/basemap/users/examples.html.
After plotting the image looks like this
In the image I have circled the contours which are small and do not need to be plotted. I would also like all the contour lines to be smoother, as mentioned in the image above. Overall I would like to get a contour image like this:
Possible solutions I can think of are:
Find out the number of points required for plotting each contour and mask/omit those lines that have only a few.
or
Find the area of each contour (as I want to omit only the circled contours) and omit/mask those that are smaller.
or
Reduce the resolution (of the contours only) by increasing the distance to 50 km - 100 km.
I am able to get the points using the SO thread Python: find contour lines from matplotlib.pyplot.contour()
But I am not able to implement any of the solutions suggested above using those points.
Any help implementing one of them would be much appreciated.
Edit:
@Andras Deak
I put a print 'diameter is ', diameter line just above the del(level.get_paths()[kp]) line to check whether the code filters on the required diameter. Here are the filtered messages when I set if diameter < 15000:
diameter is 9099.66295612
diameter is 13264.7838257
diameter is 445.574234531
diameter is 1618.74618114
diameter is 1512.58974168
However, the resulting image is not affected; it looks the same as the image posted above. I am pretty sure that I saved the figure (after plotting the wind barbs).
Regarding the solution of reducing the resolution, plt.contour(x[::2,::2],y[::2,::2],mslp[::2,::2]) works. I have to apply some filter to make the curves smooth.
Full working example code for removing lines:
Here is the example code for your review
#!/usr/bin/env python
from netCDF4 import Dataset
import matplotlib
matplotlib.use('agg')
import matplotlib.pyplot as plt
import numpy as np
import scipy.ndimage
from mpl_toolkits.basemap import interp
from mpl_toolkits.basemap import Basemap
# Set default map
west_lon = 68
east_lon = 93
south_lat = 7
north_lat = 23
nc = Dataset('ncfile.nc')
# Get this variable for later calculation
temps = nc.variables['T2']
time = 0 # We will take only first interval for this example
# Draw basemap
m = Basemap(projection='merc', llcrnrlat=south_lat, urcrnrlat=north_lat,
llcrnrlon=west_lon, urcrnrlon=east_lon, resolution='l')
m.drawcoastlines()
m.drawcountries(linewidth=1.0)
# This sets the standard grid point structure at full resolution
x, y = m(nc.variables['XLONG'][0], nc.variables['XLAT'][0])
# Set figure margins
width = 10
height = 8
plt.figure(figsize=(width, height))
plt.rc("figure.subplot", left=.001)
plt.rc("figure.subplot", right=.999)
plt.rc("figure.subplot", bottom=.001)
plt.rc("figure.subplot", top=.999)
plt.figure(figsize=(width, height), frameon=False)
# Convert Surface Pressure to Mean Sea Level Pressure
stemps = temps[time] + 6.5 * nc.variables['HGT'][time] / 1000.
mslp = nc.variables['PSFC'][time] * np.exp(9.81 / (287.0 * stemps) * nc.variables['HGT'][time]) * 0.01 + (
6.7 * nc.variables['HGT'][time] / 1000)
# Contour only at 2 hpa interval
level = []
for i in range(mslp.min(), mslp.max(), 1):
    if i % 2 == 0:
        if i >= 1006 and i <= 1018:
            level.append(i)
# Save mslp values to upload to SO thread
# np.savetxt('mslp.txt', mslp, fmt='%.14f', delimiter=',')
P = plt.contour(x, y, mslp, V=2, colors='b', linewidths=2, levels=level)
# Solution suggested by Andras Deak
for level in P.collections:
    for kp,path in enumerate(level.get_paths()):
        # include test for "smallness" of your choice here:
        # I'm using a simple estimation for the diameter based on the
        # x and y diameter...
        verts = path.vertices # (N,2)-shape array of contour line coordinates
        diameter = np.max(verts.max(axis=0) - verts.min(axis=0))
        if diameter < 15000: # threshold to be refined for your actual dimensions!
            #print 'diameter is ', diameter
            del(level.get_paths()[kp]) # no remove() for Path objects:(
            #level.remove() # This does not work. produces ValueError: list.remove(x): x not in list
plt.gcf().canvas.draw()
plt.savefig('dummy', bbox_inches='tight')
plt.close()
After the plot is saved I get the same image
You can see that the lines are not removed yet. Here is the link to mslp array which we are trying to play with http://www.mediafire.com/download/7vi0mxqoe0y6pm9/mslp.txt
If you want x and y data which are being used in the above code, I can upload for your review.
Smooth line
Your code to remove the smaller circles works perfectly. However, the other part of my original post (smooth lines) is still unsolved. I have used your code to slice the array to get fewer values and contoured it. I used the following code to reduce the array size:
slice = 15
CS = plt.contour(x[::slice,::slice],y[::slice,::slice],mslp[::slice,::slice], colors='b', linewidths=1, levels=levels)
The result is below.
After searching for a few hours I found this SO thread with a similar issue:
Regridding regular netcdf data
But none of the solutions provided there work. The questions similar to mine do not have proper solutions. If this issue is solved then the code is perfect and complete.
General idea
Your question seems to have 2 very different halves: one about omitting small contours, and another one about smoothing the contour lines. The latter is simpler, since I can't really think of anything else other than decreasing the resolution of your contour() call, just like you said.
As for removing a few contour lines, here's a solution which is based on directly removing contour lines individually. You have to loop over the collections of the object returned by contour(), and for each element check each Path, and delete the ones you don't need. Redrawing the figure's canvas will get rid of the unnecessary lines:
# dummy example based on matplotlib.pyplot.clabel example:
import matplotlib
import numpy as np
import matplotlib.cm as cm
import matplotlib.mlab as mlab
import matplotlib.pyplot as plt
delta = 0.025
x = np.arange(-3.0, 3.0, delta)
y = np.arange(-2.0, 2.0, delta)
X, Y = np.meshgrid(x, y)
Z1 = mlab.bivariate_normal(X, Y, 1.0, 1.0, 0.0, 0.0)
Z2 = mlab.bivariate_normal(X, Y, 1.5, 0.5, 1, 1)
# difference of Gaussians
Z = 10.0 * (Z2 - Z1)
plt.figure()
CS = plt.contour(X, Y, Z)
for level in CS.collections:
    for kp,path in reversed(list(enumerate(level.get_paths()))):
        # go in reversed order due to deletions!
        # include test for "smallness" of your choice here:
        # I'm using a simple estimation for the diameter based on the
        # x and y diameter...
        verts = path.vertices # (N,2)-shape array of contour line coordinates
        diameter = np.max(verts.max(axis=0) - verts.min(axis=0))
        if diameter<1: # threshold to be refined for your actual dimensions!
            del(level.get_paths()[kp]) # no remove() for Path objects:(
# this might be necessary on interactive sessions: redraw figure
plt.gcf().canvas.draw()
Here's the original(left) and the removed version(right) for a diameter threshold of 1 (note the little piece of the 0 level at the top):
Note that the top little line is removed while the huge cyan one in the middle isn't, even though both correspond to the same collections element, i.e. the same contour level. If we didn't want to allow this, we could've called CS.collections[k].remove(), which would probably be a much safer way of doing the same thing (but it wouldn't allow us to differentiate between multiple lines corresponding to the same contour level).
To show that fiddling around with the cut-off diameter works as expected, here's the result for a threshold of 2:
All in all it seems quite reasonable.
Your actual case
Since you've added your actual data, here's the application to your case. Note that you can directly generate the levels in a single line using np, which will almost give you the same result. The exact same can be achieved in 2 lines (generating an arange, then selecting those that fall between p1 and p2). Also, since you're setting levels in the call to contour, I believe the V=2 part of the function call has no effect.
import numpy as np
import matplotlib.pyplot as plt
# insert actual data here...
Z = np.loadtxt('mslp.txt',delimiter=',')
X,Y = np.meshgrid(np.linspace(0,300000,Z.shape[1]),np.linspace(0,200000,Z.shape[0]))
p1,p2 = 1006,1018
# this is almost the same as the original, although it will produce
# [p1, p1+2, ...] instead of `[Z.min()+n, Z.min()+n+2, ...]`
levels = np.arange(np.maximum(Z.min(),p1),np.minimum(Z.max(),p2),2)
#control
plt.figure()
CS = plt.contour(X, Y, Z, colors='b', linewidths=2, levels=levels)
#modified
plt.figure()
CS = plt.contour(X, Y, Z, colors='b', linewidths=2, levels=levels)
for level in CS.collections:
    for kp,path in reversed(list(enumerate(level.get_paths()))):
        # go in reversed order due to deletions!
        # include test for "smallness" of your choice here:
        # I'm using a simple estimation for the diameter based on the
        # x and y diameter...
        verts = path.vertices # (N,2)-shape array of contour line coordinates
        diameter = np.max(verts.max(axis=0) - verts.min(axis=0))
        if diameter<15000: # threshold to be refined for your actual dimensions!
            del(level.get_paths()[kp]) # no remove() for Path objects:(
# this might be necessary on interactive sessions: redraw figure
plt.gcf().canvas.draw()
plt.show()
Results, original(left) vs new(right):
Smoothing by resampling
I've decided to tackle the smoothing problem as well. All I could come up with is downsampling your original data, then upsampling again using griddata (interpolation). The downsampling part could also be done with interpolation, although the small-scale variation in your input data might make this problem ill-posed. So here's the crude version:
import scipy.interpolate as interp #the new one
# assume you have X,Y,Z,levels defined as before
# start resampling stuff
dN = 10 # use every dN'th element of the gridded input data
my_slice = (slice(None,None,dN),slice(None,None,dN)) # a tuple: newer NumPy rejects a list of slices as an index
# downsampled data
X2,Y2,Z2 = X[my_slice],Y[my_slice],Z[my_slice]
# same as X2 = X[::dN,::dN] etc.
# upsampling with griddata over original mesh
Zsmooth = interp.griddata(np.array([X2.ravel(),Y2.ravel()]).T,Z2.ravel(),(X,Y),method='cubic')
# plot
plt.figure()
CS = plt.contour(X, Y, Zsmooth, colors='b', linewidths=2, levels=levels)
You can freely play around with the grids used for interpolation, in this case I just used the original mesh, as it was at hand. You can also play around with different kinds of interpolation: the default 'linear' one will be faster, but less smooth.
Result after downsampling(left) and upsampling(right):
Of course you should still apply the small-line-removal algorithm after this resampling business, and keep in mind that this heavily distorts your input data (since if it wasn't distorted, then it wouldn't be smooth). Also, note that due to the crude method used in the downsampling step, we introduce some missing values near the top/right edges of the region under consideration. If this is a problem, you should consider doing the downsampling based on griddata as I've noted earlier.
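A different technique, not used in the answer above, is to blur the field with a Gaussian filter before contouring; the question's script already imports scipy.ndimage and mentions applying a filter. The sketch below uses synthetic data in place of the pressure grid, and smooth_field and the sigma value are illustrative choices only.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def smooth_field(field, sigma=3.0):
    """Return a Gaussian-blurred copy of a 2-D array; sigma is measured in grid cells."""
    return gaussian_filter(np.asarray(field, dtype=float), sigma=sigma)

# Synthetic stand-in for the pressure grid: a gentle ramp plus small-scale noise
rng = np.random.default_rng(0)
yy, xx = np.mgrid[0:100, 0:150]
noisy = 1010 + 0.05 * xx + rng.normal(scale=0.5, size=xx.shape)
smoothed = smooth_field(noisy, sigma=3.0)
# plt.contour(X, Y, smoothed, levels=levels) would then be called as before
```

A larger sigma gives smoother lines at the cost of flattening real features, so like the resampling approach it distorts the data.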
This is a pretty bad solution, but it's the only one that I've come up with. Use the get_contour_verts function in this solution you linked to, possibly with the matplotlib._cntr module so that nothing gets plotted initially. That gives you a list of contour lines, sections, vertices, etc. Then you have to go through that list and pop the contours you don't want. You could do this by calculating a minimum diameter, for example; if the max distance between points is less than some cutoff, throw it out.
That leaves you with a list of LineCollection objects. Now if you make a Figure and Axes instance, you can use Axes.add_collection to add all of the LineCollections in the list.
I checked this out really quick, but it seemed to work. I'll come back with a minimum working example if I get a chance. Hope it helps!
Edit: Here's an MWE of the basic idea. I wasn't familiar with plt._cntr.Cntr, so I ended up using plt.contour to get the initial contour object. As a result, you end up making two figures; you just have to close the first one. You can replace checkDiameter with whatever function works. I think you could turn the line segments into a Polygon and calculate areas, but you'd have to figure that out on your own. Let me know if you run into problems with this code, but it at least works for me.
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
def checkDiameter(seg, tol=.3):
    # Function for screening line segments. NB: Not actually a proper diameter.
    diam = (seg[:,0].max() - seg[:,0].min(),
            seg[:,1].max() - seg[:,1].min())
    return not (diam[0] < tol or diam[1] < tol)
# Create testing data
x = np.linspace(-1,1, 21)
xx, yy = np.meshgrid(x,x)
z = np.exp(-(xx**2 + .5*yy**2))
# Original plot with plt.contour
fig0, ax0 = plt.subplots()
# Make sure this contour object actually has a tiny contour to remove
cntrObj = ax0.contour(xx,yy,z, levels=[.2,.4,.6,.8,.9,.95,.99,.999])
# Primary loop: Copy contours into a new LineCollection
lineNew = list()
for lineOriginal in cntrObj.collections:
    # Get properties of the original LineCollection
    segments = lineOriginal.get_segments()
    propDict = lineOriginal.properties()
    propDict = {key: value for (key,value) in propDict.items()
                if key in ['linewidth','color','linestyle']} # Whatever parameters you want to carry over
    # Filter out the lines with small diameters
    segments = [seg for seg in segments if checkDiameter(seg)]
    # Create new LineCollection out of the OK segments
    if len(segments) > 0:
        lineNew.append(mpl.collections.LineCollection(segments, **propDict))
# Make new plot with only these line collections; display results
fig1, ax1 = plt.subplots()
ax1.set_xlim(ax0.get_xlim())
ax1.set_ylim(ax0.get_ylim())
for line in lineNew:
    ax1.add_collection(line)
plt.show()
FYI: The bit with propDict is just to automate bringing over some of the line properties from the original plot. You can't use the whole dictionary at once, though. First, it contains the old plot's line segments, but you can just swap those for the new ones. But second, it appears to contain a number of parameters that are in conflict with each other: multiple linewidths, facecolors, etc. The {key for key in propDict if I want key} workaround is my way to bypass that, but I'm sure someone else can do it more cleanly.
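For the area-based screening mentioned above, the shoelace formula is one option. This is a sketch under the assumption that each segment is an (N, 2) array of vertices; polygon_area and keep_segment are made-up names, and the min_area default is arbitrary.

```python
import numpy as np

def polygon_area(verts):
    """Area enclosed by an (N, 2) array of vertices, via the shoelace formula.

    The outline is treated as closed: the last vertex is joined to the first.
    """
    verts = np.asarray(verts, dtype=float)
    x, y = verts[:, 0], verts[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

def keep_segment(seg, min_area=0.05):
    """Hypothetical stand-in for checkDiameter: keep outlines that enclose enough area."""
    return polygon_area(seg) >= min_area

unit_square = [[0, 0], [1, 0], [1, 1], [0, 1]]
tiny_square = [[0, 0], [0.1, 0], [0.1, 0.1], [0, 0.1]]
```

Open contour lines that run off the edge of the domain get closed by a straight chord here, so the area is only meaningful for closed loops; a combined diameter and area test may be safer.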

How to reshape a networkx graph in Python?

So I created a really naive (probably inefficient) way of generating hasse diagrams.
Question:
I have 4 dimensions... p q r s .
I want to display it uniformly (tesseract) but I have no idea how to reshape it. How can one reshape a networkx graph in Python?
I've seen some examples of people using spring_layout() and draw_circular() but it doesn't shape in the way I'm looking for because they aren't uniform.
Is there a way to reshape my graph and make it uniform? (i.e. reshape my hasse diagram into a tesseract shape (preferably using nx.draw() )
Here's what mine currently look like:
Here's my code to generate the hasse diagram of N dimensions
#!/usr/bin/python
import networkx as nx
import matplotlib.pyplot as plt
import itertools
H = nx.DiGraph()
axis_labels = ['p','q','r','s']
D_len_node = {}
#Iterate through axis labels
for i in xrange(0,len(axis_labels)+1):
    #Create edge from empty set
    if i == 0:
        for ax in axis_labels:
            H.add_edge('O',ax)
    else:
        #Create all non-overlapping combinations
        combinations = [c for c in itertools.combinations(axis_labels,i)]
        D_len_node[i] = combinations
        #Create edge from len(i-1) to len(i) #eg. pq >>> pqr, pq >>> pqs
        if i > 1:
            for node in D_len_node[i]:
                for p_node in D_len_node[i-1]:
                    #if set.intersection(set(p_node),set(node)): Oops
                    if all(p in node for p in p_node) == True: #should be this!
                        H.add_edge(''.join(p_node),''.join(node))
#Show Plot
nx.draw(H,with_labels = True,node_shape = 'o')
plt.show()
I want to reshape it like this:
If anyone knows of an easier way to make Hasse Diagrams, please share some wisdom but that's not the main aim of this post.
This is a pragmatic, rather than purely mathematical answer.
I think you have two issues - one with layout, the other with your network.
1. Network
You have too many edges in your network for it to represent the unit tesseract. Caveat: I'm not an expert on the maths here; I just came to this from the plotting angle (matplotlib tag). Please explain if I'm wrong.
In your desired projection and, for instance, on the Wolfram MathWorld page for a Hasse diagram with n=4, every node has only 4 edges, whereas you have 6 edges to the 2-bit nodes and 7 edges to the 3-bit nodes. Your graph fully connects each "level", i.e. the 4-D vector with no 1s connects to all vectors with one 1, which then connect to all vectors with two 1s, and so on. This is most obvious in the projection based on the Wikipedia example (2nd image below).
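A quick way to check the edge count is to build the cover relation directly: connect S to T only when T is S plus exactly one label (the subset test that the question's commented-out fix points at). The sketch below uses only itertools; node_name and hasse_edges are made-up helper names, and 'O' stands for the empty set as in the question's code.

```python
from itertools import combinations

def node_name(subset, labels):
    """Join members in label order; the empty set is called 'O'."""
    return ''.join(l for l in labels if l in subset) or 'O'

def hasse_edges(labels):
    """Edges of the subset lattice: S -> T when T is S plus exactly one element."""
    nodes = [frozenset(c) for r in range(len(labels) + 1)
             for c in combinations(labels, r)]
    edges = []
    for small in nodes:
        for big in nodes:
            # strict subset that is exactly one element larger
            if len(big) == len(small) + 1 and small < big:
                edges.append((node_name(small, labels), node_name(big, labels)))
    return edges

edges = hasse_edges(['p', 'q', 'r', 's'])
print(len(edges))  # 32 for four labels: every node touches exactly 4 edges
```

The resulting pairs can be passed straight to H.add_edges_from(edges) on a networkx DiGraph.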
2. Projection
I couldn't find a pre-written algorithm or library to automatically project the 4D tesseract onto a 2D plane, but I did find a couple of examples, e.g. Wikipedia. From this, you can work out a co-ordinate set that would suit you and pass that into the nx.draw() call.
Here is an example - I've included two co-ordinate sets, one that looks like the projection you show above, one that matches this one from wikipedia.
import networkx as nx
import matplotlib.pyplot as plt
import itertools
H = nx.DiGraph()
axis_labels = ['p','q','r','s']
D_len_node = {}
#Iterate through axis labels
for i in xrange(0,len(axis_labels)+1):
    #Create edge from empty set
    if i == 0:
        for ax in axis_labels:
            H.add_edge('O',ax)
    else:
        #Create all non-overlapping combinations
        combinations = [c for c in itertools.combinations(axis_labels,i)]
        D_len_node[i] = combinations
        #Create edge from len(i-1) to len(i) #eg. pq >>> pqr, pq >>> pqs
        if i > 1:
            for node in D_len_node[i]:
                for p_node in D_len_node[i-1]:
                    if set.intersection(set(p_node),set(node)):
                        H.add_edge(''.join(p_node),''.join(node))
#This is manual two options to project tesseract onto 2D plane
# - many projections are available!!
wikipedia_projection_coords = [(0.5,0),(0.85,0.25),(0.625,0.25),(0.375,0.25),
(0.15,0.25),(1,0.5),(0.8,0.5),(0.6,0.5),
(0.4,0.5),(0.2,0.5),(0,0.5),(0.85,0.75),
(0.625,0.75),(0.375,0.75),(0.15,0.75),(0.5,1)]
#Build the "two cubes" type example projection co-ordinates
half_coords = [(0,0.15),(0,0.6),(0.3,0.15),(0.15,0),
(0.55,0.6),(0.3,0.6),(0.15,0.4),(0.55,1)]
#make the coords symmetric
example_projection_coords = half_coords + [(1-x,1-y) for (x,y) in half_coords][::-1]
print example_projection_coords
def powerset(s):
    ch = itertools.chain.from_iterable(itertools.combinations(s, r) for r in range(len(s)+1))
    return [''.join(t) for t in ch]
pos={}
for i,label in enumerate(powerset(axis_labels)):
    if label == '':
        label = 'O'
    pos[label]= example_projection_coords[i]
#Show Plot
nx.draw(H,pos,with_labels = True,node_shape = 'o')
plt.show()
Note - unless you change what I've mentioned in 1. above, they still have your edge structure, so won't look exactly the same as the examples from the web. Here is what it looks like with your existing network generation code - you can see the extra edges if you compare it to your example (e.g. I don't think pr should be connected to pqs):
'Two cube' projection
Wikimedia example projection
Note
If you want to get into the maths of doing your own projections (and building up pos mathematically), you might look at this research paper.
EDIT:
Curiosity got the better of me and I had to search for a mathematical way to do this. I found this blog - the main result of which being the projection matrix:
This led me to develop this function for projecting each label, taking the label containing 'p' to mean the point has value 1 on the 'p' axis, i.e. we are dealing with the unit tesseract. Thus:
import math
def construct_projection(label):
    r1 = r2 = 0.5
    theta = math.pi / 6
    phi = math.pi / 3
    x = int( 'p' in label) + r1 * math.cos(theta) * int('r' in label) - r2 * math.cos(phi) * int('s' in label)
    y = int( 'q' in label) + r1 * math.sin(theta) * int('r' in label) + r2 * math.sin(phi) * int('s' in label)
    return (x,y)
Gives a nice projection into a regular 2D octagon with all points distinct.
This will run in the above program, just replace
pos[label] = example_projection_coords[i]
with
pos[label] = construct_projection(label)
This gives the result:
play with r1,r2,theta and phi to your heart's content :)
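To experiment with those parameters outside the full program, a self-contained version might look like the sketch below. It uses the same formulas, but exposing r1, r2, theta and phi as keyword arguments is an addition made here for convenience.

```python
import math
from itertools import combinations

def construct_projection(label, r1=0.5, r2=0.5,
                         theta=math.pi / 6, phi=math.pi / 3):
    """Project a unit-tesseract vertex, named by the axes on which it is 1, onto 2-D."""
    x = (int('p' in label)
         + r1 * math.cos(theta) * int('r' in label)
         - r2 * math.cos(phi) * int('s' in label))
    y = (int('q' in label)
         + r1 * math.sin(theta) * int('r' in label)
         + r2 * math.sin(phi) * int('s' in label))
    return (x, y)

axis_labels = ['p', 'q', 'r', 's']
pos = {}
for r in range(len(axis_labels) + 1):
    for combo in combinations(axis_labels, r):
        label = ''.join(combo) or 'O'   # 'O' is the empty set, placed at the origin
        pos[label] = construct_projection(label)
```

The pos dictionary has the shape that nx.draw(H, pos, ...) expects, so different parameter values can be tried by rebuilding pos alone.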

how to print equation of line using scipy stats

My code performs a linear regression on 2 sets of data. It works fine, but I do not know how I can print the equation of the line onto the graph itself with scipy or numpy.
Here is my code:
import numpy as np
import pylab
from scipy import stats
y=np.array([15,1489,859,336,277,265,229,285,391,372,5,345])
x=np.array([196.16,17762.47,28542.19,30170.5,9384.06,43210.29,21819.2,16978.2,45767.54,12328.78,113.71,19257.6])
print(x)
print(y)
slope, intercept, r_value, p_value, slope_std_error = stats.linregress(x, y)
print("slope = "+ str(slope))
print("r_value = "+ str(r_value))
print("r_squared = " + str(r_value**2))
print("p_value = "+str(p_value))
# Calculate some additional outputs
predict_y = intercept + slope * x
print(predict_y)
pred_error = y - predict_y
degrees_of_freedom = len(x) - 2
residual_std_error = np.sqrt(np.sum(pred_error**2) / degrees_of_freedom)
# Plotting
pylab.xlabel('cost')
pylab.ylabel('signups')
pylab.plot(x, y, 'o')
pylab.plot(x, predict_y, 'k-')
pylab.show()
Where do you want the equation to go? To put it in the title, for example: plt.title('$y=%3.7sx+%3.7s$'%(slope, intercept)). To put it inside the plot, use plt.text.
There are lots of ways to do this, depending on the look you want. You could have the line's equation: in a box on the side; floating in the middle of the plot; with an arrow pointing to the line (see below); written along the line; as a title; as a caption (ie, in the text that usually occurs below the plot -- this would be the most common approach); or as a boxed legend in the plot (eg, with different colored lines titled with different colors).
My favorite, given no other constraints is an arrow to the line, because then the reader has no doubt what the equation is actually referring to. To do this, use annotate:
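# line_eqn is not defined in the question; this format is just one possible choice
line_eqn = 'y = %.4g x + %.4g' % (slope, intercept)
x0 = 20000
y0 = slope*x0+intercept
pylab.annotate(line_eqn, xy=(x0, y0), xytext=(x0-.4*x0, y0+.4*y0),
               arrowprops=dict(arrowstyle='->', connectionstyle='arc3,rad=-0.5'))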
It may seem that writing along the line would be even clearer, but this gets difficult to read for vertical or crossing lines, and there's less positioning flexibility. Personally, I wouldn't recommend the title, since readers expect to see the actual title or topic of the plot in this location, but it's probably the easiest to do since it requires no other parameters for its location.
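Whichever placement is chosen, the equation text itself has to be built from the fitted slope and intercept. The helper below is one possible way to do that; format_line_eqn is a made-up name, and the four significant digits are an arbitrary default.

```python
def format_line_eqn(slope, intercept, digits=4):
    """Build a label such as 'y = 0.0123x - 4.5', keeping the intercept's sign tidy."""
    sign = '+' if intercept >= 0 else '-'
    return 'y = {:.{d}g}x {} {:.{d}g}'.format(slope, sign, abs(intercept), d=digits)

line_eqn = format_line_eqn(0.0123, -4.5)
# line_eqn can then be handed to pylab.annotate(...), pylab.text(...) or pylab.title(...)
```

Handling the sign separately avoids labels such as "y = 0.0123x + -4.5" when the intercept is negative.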
