I am looking for a way to rescale a numpy 2D array to arbitrary dimensions in such a way that each cell in the rescaled array contains a weighted mean of all the cells that it (partially) covers.
I have found several methods to do this if the new dimensions are multiples of the original dimensions. For example, given a 4x4 array, this can be rescaled into a 2x2 array where the first cell is the mean of the 4 top-left cells in the original etc. But none of these methods seem to work for example when going from a 4x4 array to a 3x3 array.
This image illustrates what I'd like to do in the case of going from 4x4 (black grid) to 3x3 (red grid):
https://www.dropbox.com/s/iutym4frcphcef2/regrid.png?dl=0
Cell (0,0) in the smaller array covers the entire cell (0,0) and parts of cells (1,0), (0,1) and (1,1). I'd like the new cell to contain the mean of these cells weighted by the areas of the yellow, green, blue and orange regions.
Is there a way in to do this with numpy/scipy? Is there a name for this type of regridding (that would help when searching for a method)?
Here you go:#
It uses the Interval package to easily calculate the overlaps of the cells of the different grids, so you'll need to grab that.
from matplotlib import pyplot
import numpy
from interval import Interval, IntervalSet
def overlap(rect1, rect2):
"""Calculate the overlap between two rectangles"""
xInterval = Interval(rect1[0][0], rect1[1][0]) & Interval(rect2[0][0], rect2[1][0])
yInterval = Interval(rect1[0][1], rect1[1][1]) & Interval(rect2[0][1], rect2[1][1])
area = (xInterval.upper_bound - xInterval.lower_bound) * (yInterval.upper_bound - yInterval.lower_bound)
return area
def meanInterp(data, m, n):
newData = numpy.zeros((m,n))
mOrig, nOrig = data.shape
hBoundariesOrig, vBoundariesOrig = numpy.linspace(0,1,mOrig+1), numpy.linspace(0,1,nOrig+1)
hBoundaries, vBoundaries = numpy.linspace(0,1,m+1), numpy.linspace(0,1,n+1)
for iOrig in range(mOrig):
for jOrig in range(nOrig):
for i in range(m):
if hBoundaries[i+1] <= hBoundariesOrig[iOrig]: continue
if hBoundaries[i] >= hBoundariesOrig[iOrig+1]: break
for j in range(n):
if vBoundaries[j+1] <= vBoundariesOrig[jOrig]: continue
if vBoundaries[j] >= vBoundariesOrig[jOrig+1]: break
boxCoords = ((hBoundaries[i], vBoundaries[j]),(hBoundaries[i+1], vBoundaries[j+1]))
origBoxCoords = ((hBoundariesOrig[iOrig], vBoundariesOrig[jOrig]),(hBoundariesOrig[iOrig+1], vBoundariesOrig[jOrig+1]))
newData[i][j] += overlap(boxCoords, origBoxCoords) * data[iOrig][jOrig] / (hBoundaries[1] * vBoundaries[1])
return newData
fig = pyplot.figure()
ax1 = fig.add_subplot(1,2,1)
ax2 = fig.add_subplot(1,2,2)
m1, n1 = 37,59
m2, n2 = 10,13
dataGrid1 = numpy.random.rand(m1, n1)
dataGrid2 = meanInterp(dataGrid1, m2, n2)
mat1 = ax1.matshow(dataGrid1, cmap="YlOrRd")
mat2 = ax2.matshow(dataGrid2, cmap="YlOrRd")
#make both plots square
ax1.set_aspect(float(n1)/float(m1))
ax2.set_aspect(float(n2)/float(m2))
pyplot.show()
Here are a couple of examples with differing grids:
Down sampling is possible too.
After having done this, i'm pretty sure all i've done is some form of image sampling. If you're looking to do this on large lists, then you're going to need to make things a bit more efficient, as it will be pretty slow.
Related
I want to digitize (= average out over cells) photon count data into pixels given by a grid that tells how they are aligned. The photon count data is stored in a 2D array. I want to split that data into cells, each of which would correspond to a pixel. The idea is basically the same as changing an HD image to a smaller resolution. I'd like to achieve this in Python.
The digitizing function I've written:
import numpy as np
def digitize(function_data, grid_shape):
"""
function_data = 2D array of function values of some 3D shape,
eg.: exp(-(x^2 + y^2 -> want to digitize this
grid_shape: an array of length 2 which contains the dimensions of the smaller resolution
"""
l = len(function_data)
pixel_len_x = int(l/grid_shape[0])
pixel_len_y = int(l/grid_shape[1])
digitized_data = np.empty((grid_shape[0], grid_shape[1]))
for i in range(grid_shape[0]): #row-index of pixel in smaller-resolution grid
for j in range(grid_shape[1]): #column-index of pixel in smaller-resolution grid
hd_pixel = []
for k in range(pixel_len_y):
hd_pixel.append(z_data[k][j:j*pixel_len_x])
hd_pixel = np.ravel(hd_pixel) #turns 2D array into 1D to be able to compute average
pixel_avg = np.average(hd_pixel)
digitized_data[i][j] = pixel_avg
return digitized_data
In theory, this function should do what I want to achieve, but when tested it doesn't yield the expected results. Either a completed version of my function or any other method that achieves my goal would be extremely helpful.
You could also use a interpolation function, if you can use SciPy. Here we use one of the gridded data interpolating functions, RectBivariateSpline to upsample your function, but you can find numerous examples on this and other sites.
import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import RectBivariateSpline as rbs
# Sampling coordinates
x = np.linspace(-2,2,20)
y = np.linspace(-2,2,30)
# Your function
f = np.exp(-(x[:,None]**2 + y**2))
# Interpolator
interp = rbs(x, y, f)
# Higher resolution coordinates
x_hd = np.linspace(x.min(), x.max(), x.size * 5)
y_hd = np.linspace(y.min(), y.max(), y.size * 5)
# New higher res function
f_hd = interp(x_hd, y_hd, grid = True)
# Some plots
fig, ax = plt.subplots(ncols = 2)
ax[0].imshow(f)
ax[1].imshow(f_hd)
I have a set of points in a text file: random_shape.dat.
The initial order of points in the file is random. I would like to sort these points in a counter-clockwise order as follows (the red dots are the xy data):
I tried to achieve that by using the polar coordinates: I calculate the polar angle of each point (x,y) then sort by the ascending angles, as follows:
"""
Script: format_file.py
Description: This script will format the xy data file accordingly to be used with a program expecting CCW order of data points, By soting the points in Counterclockwise order
Example: python format_file.py random_shape.dat
"""
import sys
import numpy as np
# Read the file name
filename = sys.argv[1]
# Get the header name from the first line of the file (without the newline character)
with open(filename, 'r') as f:
header = f.readline().rstrip('\n')
angles = []
# Read the data from the file
x, y = np.loadtxt(filename, skiprows=1, unpack=True)
for xi, yi in zip(x, y):
angle = np.arctan2(yi, xi)
if angle < 0:
angle += 2*np.pi # map the angle to 0,2pi interval
angles.append(angle)
# create a numpy array
angles = np.array(angles)
# Get the arguments of sorted 'angles' array
angles_argsort = np.argsort(angles)
# Sort x and y
new_x = x[angles_argsort]
new_y = y[angles_argsort]
print("Length of new x:", len(new_x))
print("Length of new y:", len(new_y))
with open(filename.split('.')[0] + '_formatted.dat', 'w') as f:
print(header, file=f)
for xi, yi in zip(new_x, new_y):
print(xi, yi, file=f)
print("Done!")
By running the script:
python format_file.py random_shape.dat
Unfortunately I don't get the expected results in random_shape_formated.dat! The points are not sorted in the desired order.
Any help is appreciated.
EDIT: The expected resutls:
Create a new file named: filename_formatted.dat that contains the sorted data according to the image above (The first line contains the starting point, the next lines contain the points as shown by the blue arrows in counterclockwise direction in the image).
EDIT 2: The xy data added here instead of using github gist:
random_shape
0.4919261070361315 0.0861956168831175
0.4860816807027076 -0.06601587301587264
0.5023029456281289 -0.18238249845392662
0.5194784026079869 0.24347943722943777
0.5395164357511545 -0.3140611471861465
0.5570497147514262 0.36010146103896146
0.6074231036252226 -0.4142604617604615
0.6397066014669927 0.48590810704447085
0.7048302091822873 -0.5173701298701294
0.7499157837544145 0.5698170011806378
0.8000108666123336 -0.6199254449254443
0.8601249660418364 0.6500974025974031
0.9002010323281716 -0.7196585989767801
0.9703341483292582 0.7299242424242429
1.0104102146155935 -0.7931355765446666
1.0805433306166803 0.8102046438410078
1.1206193969030154 -0.865251869342778
1.1907525129041021 0.8909386068476981
1.2308285791904374 -0.9360074773711129
1.300961695191524 0.971219008264463
1.3410377614778592 -1.0076702085792988
1.4111708774789458 1.051499409681228
1.451246943765281 -1.0788793781975592
1.5213800597663678 1.1317798110979933
1.561456126052703 -1.1509956709956706
1.6315892420537896 1.2120602125147582
1.671665308340125 -1.221751279024005
1.7417984243412115 1.2923406139315234
1.7818744906275468 -1.2943211334120424
1.8520076066286335 1.3726210153482883
1.8920836729149686 -1.3596340023612745
1.9622167889160553 1.4533549783549786
2.0022928552023904 -1.4086186540731989
2.072425971203477 1.5331818181818184
2.1125020374898122 -1.451707005116095
2.182635153490899 1.6134622195985833
2.2227112197772345 -1.4884454939000387
2.292844335778321 1.6937426210153486
2.3329204020646563 -1.5192876820149541
2.403053518065743 1.774476584022039
2.443129584352078 -1.5433264462809912
2.513262700353165 1.8547569854388037
2.5533387666395 -1.561015348288075
2.6234718826405867 1.9345838252656438
2.663547948926922 -1.5719008264462806
2.7336810649280086 1.9858362849271942
2.7737571312143436 -1.5750757575757568
2.8438902472154304 2.009421487603306
2.883966313501766 -1.5687258953168035
2.954099429502852 2.023481896890988
2.9941754957891877 -1.5564797323888229
3.0643086117902745 2.0243890200708385
3.1043846780766096 -1.536523022432113
3.1745177940776963 2.0085143644234558
3.2145938603640314 -1.5088557654466737
3.284726976365118 1.9749508067689887
3.324803042651453 -1.472570838252656
3.39493615865254 1.919162731208186
3.435012224938875 -1.4285753640299088
3.5051453409399618 1.8343467138921687
3.545221407226297 -1.3786835891381335
3.6053355066557997 1.7260966810966811
3.655430589513719 -1.3197205824478546
3.6854876392284703 1.6130086580086582
3.765639771801141 -1.2544077134986225
3.750611246943765 1.5024152236652237
3.805715838087476 1.3785173160173163
3.850244800627849 1.2787337662337666
3.875848954088563 -1.1827449822904361
3.919007794704616 1.1336638361638363
3.9860581363759846 -1.1074537583628485
3.9860581363759846 1.0004485329485333
4.058012891753723 0.876878197560016
4.096267318663407 -1.0303482880755608
4.15638141809291 0.7443374218374221
4.206476500950829 -0.9514285714285711
4.256571583808748 0.6491902794175526
4.3166856832382505 -0.8738695395513574
4.36678076609617 0.593855765446675
4.426894865525672 -0.7981247540338443
4.476989948383592 0.5802489177489183
4.537104047813094 -0.72918339236521
4.587199130671014 0.5902272727272733
4.647313230100516 -0.667045454545454
4.697408312958435 0.6246979535615904
4.757522412387939 -0.6148858717040526
4.807617495245857 0.6754968516332154
4.8677315946753605 -0.5754260133805582
4.917826677533279 0.7163173947264858
4.977940776962782 -0.5500265643447455
5.028035859820701 0.7448917748917752
5.088149959250204 -0.5373268398268394
5.138245042108123 0.7702912239275879
5.198359141537626 -0.5445838252656432
5.2484542243955445 0.7897943722943728
5.308568323825048 -0.5618191656828015
5.358663406682967 0.8052154663518301
5.41877750611247 -0.5844972451790631
5.468872588970389 0.8156473829201105
5.5289866883998915 -0.6067217630853987
5.579081771257811 0.8197294372294377
5.639195870687313 -0.6248642266824076
5.689290953545233 0.8197294372294377
5.749405052974735 -0.6398317591499403
5.799500135832655 0.8142866981503349
5.859614235262157 -0.6493565525383702
5.909709318120076 0.8006798504525783
5.969823417549579 -0.6570670995670991
6.019918500407498 0.7811767020857934
6.080032599837001 -0.6570670995670991
6.13012768269492 0.7562308146399057
6.190241782124423 -0.653438606847697
6.240336864982342 0.7217601338055886
6.300450964411845 -0.6420995670995664
6.350546047269764 0.6777646595828419
6.410660146699267 -0.6225964187327819
6.4607552295571855 0.6242443919716649
6.520869328986689 -0.5922077922077915
6.570964411844607 0.5548494687131056
6.631078511274111 -0.5495730027548205
6.681173594132029 0.4686727666273125
6.7412876935615325 -0.4860743801652889
6.781363759847868 0.3679316979316982
6.84147785927737 -0.39541245791245716
6.861515892420538 0.25880333951762546
6.926639500135833 -0.28237987012986965
6.917336127605076 0.14262677798392165
6.946677533279001 0.05098957832291173
6.967431210462995 -0.13605442176870675
6.965045730326905 -0.03674603174603108
I find that an easy way to sort points with x,y-coordinates like that is to sort them dependent on the angle between the line from the points and the center of mass of the whole polygon and the horizontal line which is called alpha in the example. The coordinates of the center of mass (x0 and y0) can easily be calculated by averaging the x,y coordinates of all points. Then you calculate the angle using numpy.arccos for instance. When y-y0 is larger than 0 you take the angle directly, otherwise you subtract the angle from 360° (2𝜋). I have used numpy.where for the calculation of the angle and then numpy.argsort to produce a mask for indexing the initial x,y-values. The following function sort_xy sorts all x and y coordinates with respect to this angle. If you want to start from any other point you could add an offset angle for that. In your case that would be zero though.
def sort_xy(x, y):
x0 = np.mean(x)
y0 = np.mean(y)
r = np.sqrt((x-x0)**2 + (y-y0)**2)
angles = np.where((y-y0) > 0, np.arccos((x-x0)/r), 2*np.pi-np.arccos((x-x0)/r))
mask = np.argsort(angles)
x_sorted = x[mask]
y_sorted = y[mask]
return x_sorted, y_sorted
Plotting x, y before sorting using matplotlib.pyplot.plot (points are obvisously not sorted):
Plotting x, y using matplotlib.pyplot.plot after sorting with this method:
If it is certain that the curve does not cross the same X coordinate (i.e. any vertical line) more than twice, then you could visit the points in X-sorted order and append a point to one of two tracks you follow: to the one whose last end point is the closest to the new one. One of these tracks will represent the "upper" part of the curve, and the other, the "lower" one.
The logic would be as follows:
dist2 = lambda a,b: (a[0]-b[0])*(a[0]-b[0]) + (a[1]-b[1])*(a[1]-b[1])
z = list(zip(x, y)) # get the list of coordinate pairs
z.sort() # sort by x coordinate
cw = z[0:1] # first point in clockwise direction
ccw = z[1:2] # first point in counter clockwise direction
# reverse the above assignment depending on how first 2 points relate
if z[1][1] > z[0][1]:
cw = z[1:2]
ccw = z[0:1]
for p in z[2:]:
# append to the list to which the next point is closest
if dist2(cw[-1], p) < dist2(ccw[-1], p):
cw.append(p)
else:
ccw.append(p)
cw.reverse()
result = cw + ccw
This would also work for a curve with steep fluctuations in the Y-coordinate, for which an angle-look-around from some central point would fail, like here:
No assumption is made about the range of the X nor of the Y coordinate: like for instance, the curve does not necessarily have to cross the X axis (Y = 0) for this to work.
Counter-clock-wise order depends on the choice of a pivot point. From your question, one good choice of the pivot point is the center of mass.
Something like this:
# Find the Center of Mass: data is a numpy array of shape (Npoints, 2)
mean = np.mean(data, axis=0)
# Compute angles
angles = np.arctan2((data-mean)[:, 1], (data-mean)[:, 0])
# Transform angles from [-pi,pi] -> [0, 2*pi]
angles[angles < 0] = angles[angles < 0] + 2 * np.pi
# Sort
sorting_indices = np.argsort(angles)
sorted_data = data[sorting_indices]
Not really a python question I think, but still I think you could try sorting by - sign(y) * x doing something like:
def counter_clockwise_sort(points):
return sorted(points, key=lambda point: point['x'] * (-1 if point['y'] >= 0 else 1))
should work fine, assuming you read your points properly into a list of dicts of format {'x': 0.12312, 'y': 0.912}
EDIT: This will work as long as you cross the X axis only twice, like in your example.
If:
the shape is arbitrarily complex and
the point spacing is ~random
then I think this is a really hard problem.
For what it's worth, I have faced a similar problem in the past, and I used a traveling salesman solver. In particular, I used the LKH solver. I see there is a Python repo for solving the problem, LKH-TSP. Once you have an order to the points, I don't think it will be too hard to decide on a clockwise vs clockwise ordering.
If we want to answer your specific problem, we need to pick a pivot point.
Since you want to sort according to the starting point you picked, I would take a pivot in the middle (x=4,y=0 will do).
Since we're sorting counterclockwise, we'll take arctan2(-(y-pivot_y),-(x-center_x)) (we're flipping the x axis).
We get the following, with a gradient colored scatter to prove correctness (fyi I removed the first line of the dat file after downloading):
import numpy as np
import matplotlib.pyplot as plt
points = np.loadtxt('points.dat')
#oneliner for ordering points (transform, adjust for 0 to 2pi, argsort, index at points)
ordered_points = points[np.argsort(np.apply_along_axis(lambda x: np.arctan2(-x[1],-x[0]+4) + np.pi*2, axis=1,arr=points)),:]
#color coding 0-1 as str for gray colormap in matplotlib
plt.scatter(ordered_points[:,0], ordered_points[:,1],c=[str(x) for x in np.arange(len(ordered_points)) / len(ordered_points)],cmap='gray')
Result (in the colormap 1 is white and 0 is black), they're numbered in the 0-1 range by order:
For points with comparable distances between their neighbouring pts, we can use KDTree to get two closest pts for each pt. Then draw lines connecting those to give us a closed shape contour. Then, we will make use of OpenCV's findContours to get contour traced always in counter-clockwise manner. Now, since OpenCV works on images, we need to sample data from the provided float format to uint8 image format. Given, comparable distances between two pts, that should be pretty safe. Also, OpenCV handles it well to make sure it traces even sharp corners in curvatures, i.e. smooth or not-smooth data would work just fine. And, there's no pivot requirement, etc. As such all kinds of shapes would be good to work with.
Here'e the implementation -
import numpy as np
import matplotlib.pyplot as plt
from scipy.spatial.distance import pdist
from scipy.spatial import cKDTree
import cv2
from scipy.ndimage.morphology import binary_fill_holes
def counter_clockwise_order(a, DEBUG_PLOT=False):
b = a-a.min(0)
d = pdist(b).min()
c = np.round(2*b/d).astype(int)
img = np.zeros(c.max(0)[::-1]+1, dtype=np.uint8)
d1,d2 = cKDTree(c).query(c,k=3)
b = c[d2]
p1,p2,p3 = b[:,0],b[:,1],b[:,2]
for i in range(len(b)):
cv2.line(img,tuple(p1[i]),tuple(p2[i]),255,1)
cv2.line(img,tuple(p1[i]),tuple(p3[i]),255,1)
img = (binary_fill_holes(img==255)*255).astype(np.uint8)
if int(cv2.__version__.split('.')[0])>=3:
_,contours,hierarchy = cv2.findContours(img.copy(),cv2.RETR_TREE,cv2.CHAIN_APPROX_NONE)
else:
contours,hierarchy = cv2.findContours(img.copy(),cv2.RETR_TREE,cv2.CHAIN_APPROX_NONE)
cont = contours[0][:,0]
f1,f2 = cKDTree(cont).query(c,k=1)
ordered_points = a[f2.argsort()[::-1]]
if DEBUG_PLOT==1:
NPOINTS = len(ordered_points)
for i in range(NPOINTS):
plt.plot(ordered_points[i:i+2,0],ordered_points[i:i+2,1],alpha=float(i)/(NPOINTS-1),color='k')
plt.show()
return ordered_points
Sample run -
# Load data in a 2D array with 2 columns
a = np.loadtxt('random_shape.csv',delimiter=' ')
ordered_a = counter_clockwise_order(a, DEBUG_PLOT=1)
Output -
Using matplotlib in Python I'm plotting anywhere between 20 and 50 lines. Using matplotlib's sliding colour scales these become indistinguishable after a certain number of lines are plotted (well before 20).
While I've seen a few example of code in Matlab and C# to create colour maps of an arbitrary number of colours which are maximally distinguishable from one another I can't find anything for Python.
Can anyone point me in the direction of something in Python that will do this?
Cheers
I loved the idea of palette created by #xuancong84 and modified his code a bit to make it not depending on alpha channel. I drop it here for others to use, thank you #xuancong84!
import math
import numpy as np
from matplotlib.colors import ListedColormap
from matplotlib.cm import hsv
def generate_colormap(number_of_distinct_colors: int = 80):
if number_of_distinct_colors == 0:
number_of_distinct_colors = 80
number_of_shades = 7
number_of_distinct_colors_with_multiply_of_shades = int(math.ceil(number_of_distinct_colors / number_of_shades) * number_of_shades)
# Create an array with uniformly drawn floats taken from <0, 1) partition
linearly_distributed_nums = np.arange(number_of_distinct_colors_with_multiply_of_shades) / number_of_distinct_colors_with_multiply_of_shades
# We are going to reorganise monotonically growing numbers in such way that there will be single array with saw-like pattern
# but each saw tooth is slightly higher than the one before
# First divide linearly_distributed_nums into number_of_shades sub-arrays containing linearly distributed numbers
arr_by_shade_rows = linearly_distributed_nums.reshape(number_of_shades, number_of_distinct_colors_with_multiply_of_shades // number_of_shades)
# Transpose the above matrix (columns become rows) - as a result each row contains saw tooth with values slightly higher than row above
arr_by_shade_columns = arr_by_shade_rows.T
# Keep number of saw teeth for later
number_of_partitions = arr_by_shade_columns.shape[0]
# Flatten the above matrix - join each row into single array
nums_distributed_like_rising_saw = arr_by_shade_columns.reshape(-1)
# HSV colour map is cyclic (https://matplotlib.org/tutorials/colors/colormaps.html#cyclic), we'll use this property
initial_cm = hsv(nums_distributed_like_rising_saw)
lower_partitions_half = number_of_partitions // 2
upper_partitions_half = number_of_partitions - lower_partitions_half
# Modify lower half in such way that colours towards beginning of partition are darker
# First colours are affected more, colours closer to the middle are affected less
lower_half = lower_partitions_half * number_of_shades
for i in range(3):
initial_cm[0:lower_half, i] *= np.arange(0.2, 1, 0.8/lower_half)
# Modify second half in such way that colours towards end of partition are less intense and brighter
# Colours closer to the middle are affected less, colours closer to the end are affected more
for i in range(3):
for j in range(upper_partitions_half):
modifier = np.ones(number_of_shades) - initial_cm[lower_half + j * number_of_shades: lower_half + (j + 1) * number_of_shades, i]
modifier = j * modifier / upper_partitions_half
initial_cm[lower_half + j * number_of_shades: lower_half + (j + 1) * number_of_shades, i] += modifier
return ListedColormap(initial_cm)
These are the colours I get:
from matplotlib import pyplot as plt
import numpy as np
N = 16
M = 7
H = np.arange(N*M).reshape([N,M])
fig = plt.figure(figsize=(10, 10))
ax = plt.pcolor(H, cmap=generate_colormap(N*M))
plt.show()
Recently, I also encountered the same problem. So I created the following simple Python code to generate visually distinguishable colors for jupyter notebook matplotlib. It is not quite maximally perceptually distinguishable, but it works better than most built-in colormaps in matplotlib.
The algorithm splits the HSV scale into 2 chunks, 1st chunk with increasing RGB value, 2nd chunk with decreasing alpha so that the color can blends into the white background.
Take note that if you are using any toolkit other than jupyter notebook, you must make sure the background is white, otherwise, alpha blending will be different and the resultant color will also be different.
Furthermore, the distinctiveness of color highly depends on your computer screen, projector, etc. A color palette distinguishable on one screen does not necessarily imply on another. You must physically test it out if you want to use for presentation.
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap
def generate_colormap(N):
arr = np.arange(N)/N
N_up = int(math.ceil(N/7)*7)
arr.resize(N_up)
arr = arr.reshape(7,N_up//7).T.reshape(-1)
ret = matplotlib.cm.hsv(arr)
n = ret[:,3].size
a = n//2
b = n-a
for i in range(3):
ret[0:n//2,i] *= np.arange(0.2,1,0.8/a)
ret[n//2:,3] *= np.arange(1,0.1,-0.9/b)
# print(ret)
return ret
N = 16
H = np.arange(N*N).reshape([N,N])
fig = plt.figure(figsize=(10, 10))
ax = plt.pcolor(H, cmap=ListedColormap(generate_colormap(N*N)))
So I created a really naive (probably inefficient) way of generating hasse diagrams.
Question:
I have 4 dimensions... p q r s .
I want to display it uniformly (tesseract) but I have no idea how to reshape it. How can one reshape a networkx graph in Python?
I've seen some examples of people using spring_layout() and draw_circular() but it doesn't shape in the way I'm looking for because they aren't uniform.
Is there a way to reshape my graph and make it uniform? (i.e. reshape my hasse diagram into a tesseract shape (preferably using nx.draw() )
Here's what mine currently look like:
Here's my code to generate the hasse diagram of N dimensions
#!/usr/bin/python
import networkx as nx
import matplotlib.pyplot as plt
import itertools
H = nx.DiGraph()
axis_labels = ['p','q','r','s']
D_len_node = {}
#Iterate through axis labels
for i in xrange(0,len(axis_labels)+1):
#Create edge from empty set
if i == 0:
for ax in axis_labels:
H.add_edge('O',ax)
else:
#Create all non-overlapping combinations
combinations = [c for c in itertools.combinations(axis_labels,i)]
D_len_node[i] = combinations
#Create edge from len(i-1) to len(i) #eg. pq >>> pqr, pq >>> pqs
if i > 1:
for node in D_len_node[i]:
for p_node in D_len_node[i-1]:
#if set.intersection(set(p_node),set(node)): Oops
if all(p in node for p in p_node) == True: #should be this!
H.add_edge(''.join(p_node),''.join(node))
#Show Plot
nx.draw(H,with_labels = True,node_shape = 'o')
plt.show()
I want to reshape it like this:
If anyone knows of an easier way to make Hasse Diagrams, please share some wisdom but that's not the main aim of this post.
This is a pragmatic, rather than purely mathematical answer.
I think you have two issues - one with layout, the other with your network.
1. Network
You have too many edges in your network for it to represent the unit tesseract. Caveat I'm not an expert on the maths here - just came to this from the plotting angle (matplotlib tag). Please explain if I'm wrong.
Your desired projection and, for instance, the wolfram mathworld page for a Hasse diagram for n=4 has only 4 edges connected all nodes, whereas you have 6 edges to the 2 and 7 edges to the 3 bit nodes. Your graph fully connects each "level", i.e. 4-D vectors with 0 1 values connect to all vectors with 1 1 value, which then connect to all vectors with 2 1 values and so on. This is most obvious in the projection based on the Wikipedia answer (2nd image below)
2. Projection
I couldn't find a pre-written algorithm or library to automatically project the 4D tesseract onto a 2D plane, but I did find a couple of examples, e.g. Wikipedia. From this, you can work out a co-ordinate set that would suit you and pass that into the nx.draw() call.
Here is an example - I've included two co-ordinate sets, one that looks like the projection you show above, one that matches this one from wikipedia.
import networkx as nx
import matplotlib.pyplot as plt
import itertools
H = nx.DiGraph()
axis_labels = ['p','q','r','s']
D_len_node = {}
#Iterate through axis labels
for i in xrange(0,len(axis_labels)+1):
#Create edge from empty set
if i == 0:
for ax in axis_labels:
H.add_edge('O',ax)
else:
#Create all non-overlapping combinations
combinations = [c for c in itertools.combinations(axis_labels,i)]
D_len_node[i] = combinations
#Create edge from len(i-1) to len(i) #eg. pq >>> pqr, pq >>> pqs
if i > 1:
for node in D_len_node[i]:
for p_node in D_len_node[i-1]:
if set.intersection(set(p_node),set(node)):
H.add_edge(''.join(p_node),''.join(node))
#This is manual two options to project tesseract onto 2D plane
# - many projections are available!!
wikipedia_projection_coords = [(0.5,0),(0.85,0.25),(0.625,0.25),(0.375,0.25),
(0.15,0.25),(1,0.5),(0.8,0.5),(0.6,0.5),
(0.4,0.5),(0.2,0.5),(0,0.5),(0.85,0.75),
(0.625,0.75),(0.375,0.75),(0.15,0.75),(0.5,1)]
#Build the "two cubes" type example projection co-ordinates
half_coords = [(0,0.15),(0,0.6),(0.3,0.15),(0.15,0),
(0.55,0.6),(0.3,0.6),(0.15,0.4),(0.55,1)]
#make the coords symmetric
example_projection_coords = half_coords + [(1-x,1-y) for (x,y) in half_coords][::-1]
print example_projection_coords
def powerset(s):
ch = itertools.chain.from_iterable(itertools.combinations(s, r) for r in range(len(s)+1))
return [''.join(t) for t in ch]
pos={}
for i,label in enumerate(powerset(axis_labels)):
if label == '':
label = 'O'
pos[label]= example_projection_coords[i]
#Show Plot
nx.draw(H,pos,with_labels = True,node_shape = 'o')
plt.show()
Note - unless you change what I've mentioned in 1. above, they still have your edge structure, so won't look exactly the same as the examples from the web. Here is what it looks like with your existing network generation code - you can see the extra edges if you compare it to your example (e.g. I don't this pr should be connected to pqs:
'Two cube' projection
Wikimedia example projection
Note
If you want to get into the maths of doing your own projections (and building up pos mathematically), you might look at this research paper.
EDIT:
Curiosity got the better of me and I had to search for a mathematical way to do this. I found this blog - the main result of which being the projection matrix:
This led me to develop this function for projecting each label, taking the label containing 'p' to mean the point has value 1 on the 'p' axis, i.e. we are dealing with the unit tesseract. Thus:
def construct_projection(label):
r1 = r2 = 0.5
theta = math.pi / 6
phi = math.pi / 3
x = int( 'p' in label) + r1 * math.cos(theta) * int('r' in label) - r2 * math.cos(phi) * int('s' in label)
y = int( 'q' in label) + r1 * math.sin(theta) * int('r' in label) + r2 * math.sin(phi) * int('s' in label)
return (x,y)
Gives a nice projection into a regular 2D octagon with all points distinct.
This will run in the above program, just replace
pos[label] = example_projection_coords[i]
with
pos[label] = construct_projection(label)
This gives the result:
play with r1,r2,theta and phi to your heart's content :)
Given that I have three matrices which describe the data that I want to plot:
lons - 2D matrix with [n_lons,n_lats]
lats - 2D matrix with [n_lons,n_lats]
dataRGB - 3D matrix with [n_lons,n_lats,3]
what is the preferred way to plot such data using python and basemap.
For pseudo-color data this is quite simple using the pcolormesh method:
data - 2D matrix with [n_lons,n_lats]
m = Basemap(...)
m.pcolormesh(lons,lats,data,latlon=True)
From reading the documentation, it seems to me that the imshow command should be used in this case, but for this method regularly gridded data is needed and I would have to regridd and interpolate my data.
Is there any other way to plot the data?
I ran into this same issue awhile ago, and this is the only solution I could come up with:
(Note that this works with matplotlib 1.3.0, but not 1.1.0)
from mpl_toolkits.basemap import Basemap
import numpy.ma as ma
import numpy as np
m = Basemap() #Define your map projection here
Assuming var is your variable of interest (NxMx3),lats is (N)x(M) and lons is (N)x(M):
We need to convert pixel center lat/lons to pixel corner lat/lons (N+1)x(M+1)
cornerLats=getCorners(lat);cornerLons=getCorners(lon)
Get coordinate corners
xCorners,yCorners=m(cornerLats,cornerLons,inverse=True)
Mask the data that is invalid
var=ma.masked_where(np.isnan(var),var)
We need a flattened tuple(N*M,3) to pass to pcolormesh
colorTuple=tuple(np.array([var[:,:,0].flatten(),var[:,:,1].flatten(),var[:,:,2].flatten()]).transpose().tolist())
Setting a larger linewidth will result in more edge distortion, and a
smaller linewidth will result in a screwed up image for some reason.
m.pcolormesh(xCorners,yCorners,var[:,:,0],color=colorTuple,clip_on=True,linewidth=0.05)
def getCorners(centers):
one = centers[:-1,:]
two = centers[1:,:]
d1 = (two - one) / 2.
one = one - d1
two = two + d1
stepOne = np.zeros((centers.shape[0] + 1,centers.shape[1]))
stepOne[:-2,:] = one
stepOne[-2:,:] = two[-2:,:]
one = stepOne[:,:-1]
two = stepOne[:,1:]
d2 = (two - one) / 2.
one = one - d2
two = two + d2
stepTwo = np.zeros((centers.shape[0] + 1,centers.shape[1] + 1))
stepTwo[:,:-2] = one
stepTwo[:,-2:] = two[:,-2:]
return stepTwo