Holoviews: how to customize histogram for linked time series Curve plots

I am just getting started with Holoviews. My questions are on customizing histograms, but also I am sharing a complete example as it may be helpful for other newbies to look at, since the documentation for Holoviews is very thorough but can be overwhelming.
I have a number of time series in text files loaded as Pandas DataFrames where:
each file is for a specific location
at each location about 10 time series were collected, each with about 15,000 points
I am building a small interactive tool where a Selector can be used to choose the location / DataFrame, and then another Selector to pick 3 of 10 of the time series to be plotted together.
My goal is to allow linked zooms (both x and y scales). The questions and code will focus on this aspect of the tool.
I cannot share the actual data I am using, unfortunately, as it is proprietary, but I have created 3 random walks with specific data ranges that are consistent with the actual data.
## preliminaries ##
import pandas as pd
import numpy as np
import holoviews as hv
from holoviews.util.transform import dim
from holoviews.selection import link_selections
from holoviews import opts
from holoviews.operation.datashader import shade, rasterize
import hvplot.pandas
hv.extension('bokeh', width=100)
## create random walks (one location) ##
data_df = pd.DataFrame()
npoints=15000
np.random.seed(71)
x = np.arange(npoints)
y1 = 1300+2.5*np.random.randn(npoints).cumsum()
y2 = 1500+2*np.random.randn(npoints).cumsum()
y3 = 3+np.random.randn(npoints).cumsum()
data_df.loc[:,'x'] = x
data_df.loc[:,'rand1'] = y1
data_df.loc[:,'rand2'] = y2
data_df.loc[:,'rand3'] = y3
This first block just plots the data and shows how, by design, one of the random walks has a different range from the other two:
data_df.hvplot(x='x',y=['rand1','rand2','rand3'],value_label='y',width=800,height=400)
As a result, although hvplot subplots work out of the box (for linking), ranges are different so the scaling is not quite there:
data_df.hvplot(x='x',y=['rand1','rand2','rand3'],
value_label='y',subplots=True,width=800,height=200).cols(1)
So, my first attempt was to adapt the Python-based Points example from Linked brushing in the documentation:
colors = hv.Cycle('Category10').values
dims = ['rand1', 'rand2', 'rand3']
layout = hv.Layout([
hv.Points(data_df, dim).opts(color=c)
for c, dim in zip(colors, [['x', d] for d in dims])
])
link_selections(layout).opts(opts.Points(width=1200, height=300)).cols(1)
That is already an amazing result for a 20-minute effort!
However, what I would really like is to plot a curve rather than points, and also see a histogram, so I adapted the comprehension syntax to work with Curve (after reading the documentation pages Applying customization, and Composing elements):
colors = hv.Cycle('Category10').values
dims = ['rand1', 'rand2', 'rand3']
layout = hv.Layout([hv.Curve(data_df,'x',dim).opts(height=300,width=1200,
color=c).hist(dim) for c,
dim in zip(colors,[d for d in dims])])
link_selections(layout).cols(1)
Which is almost exactly what I want. But I still struggle with the different layers of opts syntax.
Question 1: with the comprehension from the last code block, how would I make the histogram share color with the curves?
Now, suppose I want to rasterize the plots (although I do not think it is necessary yet with 15,000 points, as in this case). I tried to adapt the first example with Points:
cmaps = ['Blues', 'Greens', 'Reds']
dims = ['rand1', 'rand2', 'rand3']
layout = hv.Layout([
shade(rasterize(hv.Points(data_df, dims),
cmap=c)).opts(width=1200, height = 400).hist(dims[1])
for c, dims in zip(cmaps, [['x', d] for d in dims])
])
link_selections(layout).cols(1)
This is a decent start, but again I struggle with the options/customization.
Question 2: in the above code block, how would I pass the colormaps (it does not work as it is now), and how do I make the histogram reflect data values as in the previous case (and also have the right colormap)?
Thank you!

Sander answered how to color the histogram, but for the other question about coloring the datashaded plot, Datashader renders your data with a colormap rather than a single color, so the parameter is named cmap rather than color. So you were correct to use cmap in the datashaded case, but (a) cmap is actually a parameter to shade (which does the colormapping of the output of rasterize), and (b) you don't really need shade, as you can let Bokeh do the colormapping in most cases nowadays, in which case cmap is an option rather than an argument. Example:
from bokeh.palettes import Blues, Greens, Reds
cmaps = [Blues[256][200:], Greens[256][200:], Reds[256][200:]]
dims = ['rand1', 'rand2', 'rand3']
layout = hv.Layout([
rasterize(hv.Points(data_df, ds)).opts(cmap=c,width=1200, height = 400).hist(ds[1])
for c, ds in zip(cmaps, [['x', d] for d in dims])
])
link_selections(layout).cols(1)

To answer your first question to make the histogram share the color of the curve, I've added .opts(opts.Histogram(color=c)) to your code.
When you have a layout you can specify the options of an element inside the layout like that.
colors = hv.Cycle('Category10').values
dims = ['rand1', 'rand2', 'rand3']
layout = hv.Layout(
[hv.Curve(data_df,'x',dim)
.opts(height=300,width=600, color=c)
.hist(dim)
.opts(opts.Histogram(color=c))
for c, dim in zip(colors,[d for d in dims])]
)
link_selections(layout).cols(1)


Accessing (the right) data when using holoviews/bokeh

I am having difficulties accessing (the right) data when using holoviews/bokeh, either for connected plots showing a different aspect of the dataset, or just customising a plot with dynamic access to the data as plotted (say a tooltip).
TLDR: How to add a projection plot of my dataset (different set of dimensions and linked to main plot, like a marginal distribution but, you know, not restricted to histogram or distribution) and probably with a similar solution a related question I asked here on SO
Let me exemplify (straight from a ipynb, should be quite reproducible):
import numpy as np
import random, pandas as pd
import bokeh
import datashader as ds
import holoviews as hv
from holoviews import opts
from holoviews.operation.datashader import datashade, shade, dynspread, spread, rasterize
hv.extension('bokeh')
With imports set up, let's create a dataset (N target 10e12 ;) to use with datashader. Beside the key dimensions, I really need some value dimensions (here z and z2).
import numpy as np
import pandas as pd
N = int(10e6)
x_r = (0,100)
y_r = (100,2000)
z_r = (0,10e8)
x = np.random.randint(x_r[0]*1000,x_r[1]*1000,size=(N, 1))
y = np.random.randint(y_r[0]*1000,y_r[1]*1000,size=(N, 1))
z = np.random.randint(z_r[0]*1000,z_r[1]*1000,size=(N, 1))
z2 = np.ones((N,1)).astype(int)
df = pd.DataFrame(np.column_stack([x,y,z,z2]), columns=['x','y','z','z2'])
df[['x','y','z']] = df[['x','y','z']].div(1000, axis=0)
df
Now I plot the data, rasterised, and also activate the tooltip to see the defaults. Sure, x/y is trivial, but as I said, I care about the value dimensions. It shows z2 as x_y z2. I have a question related to tooltips with the same sort of data here on SO for value dimension access for the tooltips.
from matplotlib.cm import get_cmap
palette = get_cmap('viridis')
# palette_inv = palette.reversed()
p=hv.Points(df,['x','y'], ['z','z2'])
P=rasterize(p, aggregator=ds.sum("z2"),x_range=(0,100)).opts(cmap=palette)
P.opts(tools=["hover"]).opts(height=500, width=500,xlim=(0,100),ylim=(100,2000))
Now I can add a histogram or a marginal distribution which is pretty close to what I want, but there are issues with this soon past the trivial defaults. (E.g.: P << hv.Distribution(p, kdims=['y']) or P.hist(dimension='y',weight_dimension='x_y z',num_bins = 2000,normed=True))
Both are close approaches, but do not give me the other value dimension I'd like to visualise. If I try to access the other value dimension ('x_y z') this fails. Also, the 'x_y z2' way seems very clumsy. Is there a better way?
When I do something like this, my browser/notebook-extension blows up, of course.
transformed = p.transform(x=hv.dim('z'))
P << hv.Curve(transformed)
So how do I access all my data in the right way?

Unwanted white column in matplotlib -- how to remove?

Here is Python code (ported from Richard McElreath's excellent Statistical Rethinking) that results in an unwanted white transparent 'column' in my resulting plot:
import numpy as np
import pandas as pd
import scipy.stats
import matplotlib.pyplot as plt
# import data
url = "https://raw.githubusercontent.com/pymc-devs/resources/master/Rethinking_2/Data/Howell1.csv"
df = pd.read_csv(url, delimiter = ';')
df2 = df[df.age >= 18]
# sample priors (prior predictive check)
n = 100
a = scipy.stats.norm.rvs(178, 20, n)
b1 = scipy.stats.norm.rvs(0, 10, n)
b2 = np.exp(scipy.stats.norm.rvs(0, 1, n))
xbar = df2.weight.mean()
# compare 2 priors
fig,ax = plt.subplots(1,2,sharey=True)
for i in range(100):
    ax[0].plot(df2.weight, a[i] + b1[i]*(df2.weight - xbar),color = 'grey',lw=.5,alpha=.2)
    ax[0].set_xlabel('weight')
    ax[0].set_ylabel('height')
    ax[0].set_title('normal prior of β')
    ax[1].plot(df2.weight, a[i] + b2[i]*(df2.weight - xbar),color = 'grey',lw=.5,alpha=.2)
    ax[1].set_xlabel('weight')
    ax[1].set_title('log-normal prior of β')
plt.axis([30,60,-100,400])
plt.show()
This occurs in my Jupyter notebook, in Google CoLab and in the pdf (plt.savefig)
My notebook versions:
numpy 1.19.4
pandas 1.1.5
scipy 1.5.4
matplotlib 3.3.3
Thanks!!
I think you mean the region where the lines are drawn thinner/lighter and not the borders.
I found out it has to do with aliasing and not the data itself.
Play around with the antialiased parameter:
ax[0].plot(..., antialiased=False)
Looks like this:
Surely it makes the plot look ugly but you may increase the figure size or dpi parameter.
fig.set_dpi(300.0)
...
plt.show();
Then you get this:
This is a data artifact interacting with anti-aliasing in an interesting way. In the final image we have to pick a color for every pixel. Without anti-aliasing when we have to draw a line we have to decide is this pixel "in" the line (and hence we color it) or "out" (in which case we do not color it) which can lead to stair-step looking lines (particularly with lines that are close to flat). With anti-aliasing we color the pixel based on how much of the pixel is "in" the line vs not. That smearing out fools our eye (in a good way) and we see a more convincing straight line. Without anti-aliasing or alpha drawing the same line multiple times does not change the appearance (any given pixel is still in or out), but with anti-aliasing or alpha, every time you draw the line any of the "partial" pixels get darker.
In the original data the values in df2.weight all fall on the same line, but they are not sorted, so as we draw, the path goes back and forth over itself (see the trace in the left-center panel). Depending on exactly where the turning points are and how many times any given segment is traversed, the line will look darker in some places than others. There is something in the exact structure of the data that is causing that "band".
If you increase the DPI, the pixels get smaller so the effect will get less pronounced (similar to zooming in), and if you turn off anti-aliasing the effect will also get less pronounced. I suspect (but have not tested) that if you shuffle the data you will be able to move the band around!
Sorting the weights (from this context I do not think their order is meaningful) gives the nicer-looking plots in the bottom two panels.
So in short, that band is "real" in the sense that it is representing something in the data rather than being a bug in the render process, but is highlighting structure in the data that I do not think is meaningful.
import numpy as np
import pandas as pd
import scipy.stats
import matplotlib.pyplot as plt
# import data
url = "https://raw.githubusercontent.com/pymc-devs/resources/master/Rethinking_2/Data/Howell1.csv"
# this is a mpl 3.3 feature
fig, ad = plt.subplot_mosaic(
[
["normal", "log-normal"],
["trace", "hist"],
["sorted normal", "sorted log-normal"],
],
constrained_layout=True,
)
df = pd.read_csv(url, delimiter=";")
df2 = df[df.age >= 18]
# sample priors (prior predictive check)
n = 100
a = scipy.stats.norm.rvs(178, 20, n)
b1 = scipy.stats.norm.rvs(0, 10, n)
b2 = np.exp(scipy.stats.norm.rvs(0, 1, n))
def inner(weights, a, b1, b2, ax_dict):
    xbar = np.mean(weights)
    for i in range(100):
        ax_dict["normal"].plot(
            weights, a[i] + b1[i] * (weights - xbar), color="grey", lw=0.5, alpha=0.2
        )
        ax_dict["normal"].set_xlabel("weight")
        ax_dict["normal"].set_ylabel("height")
        ax_dict["normal"].set_title("normal prior of β")
        ax_dict["log-normal"].plot(
            weights, a[i] + b2[i] * (weights - xbar), color="grey", lw=0.5, alpha=0.2
        )
        ax_dict["log-normal"].set_xlabel("weight")
        ax_dict["log-normal"].set_title("log-normal prior of β")
inner(df2.weight, a, b1, b2, ad)
inner(
np.array(sorted(df2.weight)),
a,
b1,
b2,
{"normal": ad["sorted normal"], "log-normal": ad["sorted log-normal"]},
)
ad["hist"].hist(df2.weight, bins=100, color="0.5")
ad["hist"].set_xlabel("weight")
ad["hist"].set_ylabel("#")
ad["trace"].plot(df2.weight, "-o", color="0.5", alpha=0.5)
ad["trace"].set_ylabel("weight")
ad["trace"].set_xlabel("index")
plt.show()
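The two ingredients named in this answer (translucent strokes that darken when repeated, and an unsorted path that doubles back over itself) can be checked numerically without plotting anything. The sketch below is not a model of matplotlib's renderer. It assumes each pass would be composited separately with the usual "over" rule, and it uses synthetic weights as a stand-in for df2.weight. The helper name crossings is made up for illustration.

```python
import numpy as np

# Opacity reached after k separately composited passes of a stroke
# with the given alpha ("over" compositing): 1 - (1 - alpha)**k.
alpha = 0.2
passes = np.array([1, 2, 5, 10])
opacity = 1.0 - (1.0 - alpha) ** passes

# Synthetic stand-in for df2.weight (an assumption, not the Howell data).
rng = np.random.default_rng(0)
weights = rng.normal(45.0, 6.5, size=350)


def crossings(x, probes):
    """Count how often the polyline x[0] -> x[1] -> ... passes each probe value."""
    lo = np.minimum(x[:-1], x[1:])[:, None]
    hi = np.maximum(x[:-1], x[1:])[:, None]
    return np.sum((lo < probes) & (probes <= hi), axis=0)


# Probe positions strictly inside the data range.
probes = np.linspace(weights.min(), weights.max(), 22)[1:-1]
unsorted_crossings = crossings(weights, probes)
sorted_crossings = crossings(np.sort(weights), probes)

print("opacity after 1, 2, 5, 10 passes:", np.round(opacity, 3))
print("unsorted crossings per probe:", unsorted_crossings)
print("sorted crossings per probe:  ", sorted_crossings)
```

With sorted input every probe position is passed exactly once, while the unsorted path passes the middle of the range far more often than the edges, which is the kind of uneven coverage the answer attributes the band to.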
I set the values manually, but if you set the axes for each graph, the margins will disappear.
ax[0].axis([33,63,-100,400])
ax[1].axis([33,60,-100,400])
If you want to make the spacing between the graphs narrower, you can do so in the following way.
fig.subplots_adjust(wspace=0.05)

Plotting a large point cloud using plotly produces a blank graph

Plotting a fairly large point cloud in python using plotly produces a graph with axes (not representative of the data range) and no data points.
The code:
import pandas as pd
import plotly.express as px
import numpy as np
all_res = np.load('fullshelf4_11_2019.npy' )
all_res.shape
(3, 6742382)
np.max(all_res[2])
697.5553566696478
np.min(all_res[2])
-676.311654692491
frm = pd.DataFrame(data=np.transpose(all_res[0:, 0:]),columns=["X", "Y", "Z"])
fig = px.scatter_3d(frm, x='X', y='Y', z='Z')
fig.update_traces(marker=dict(size=4))
fig.update_layout(margin=dict(l=0, r=0, b=0, t=0))
fig.show()
Alternatively you could generate random data and follow the process through
all_res = np.random.rand(3, 6742382)
Which also produces a blank graph with axis scales that are incorrect.
So -- what am I doing wrong, and is there a better way to plot such a moderately large data set?
Thanks for your help!
Try plotting using ipyvolume. It can handle large point cloud datasets.
It seems like that's too much data for WebGL to handle. I managed to plot 100k points, but 1M points already caused Jupyter to crash. However, a 3D scatterplot of 6.7 million points is of questionable value anyway. You probably won't be able to make any sense out of it (except for data boundaries maybe) and it will be super slow to rotate etc.
I would try to think of alternative approaches, depending on what you want to do. Maybe pick a representative subset of points and plot those.
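A minimal sketch of the "representative subset" idea, assuming a uniform random sample is representative enough for your purpose. The array below is random stand-in data with the same (3, N) layout as all_res in the question, and n_keep is an arbitrary choice near the size reported to render above.

```python
import numpy as np

rng = np.random.default_rng(42)

# Stand-in for all_res from the question: shape (3, N), rows are X, Y, Z.
# Random data and the smaller N are assumptions for illustration only.
n_points = 1_000_000
all_res = rng.random((3, n_points), dtype=np.float32)

# Draw column indices without replacement so no point is repeated.
n_keep = 100_000
keep = rng.choice(n_points, size=n_keep, replace=False)
subset = all_res[:, keep]

print(subset.shape)
```

The subset can then go through the same steps as in the question, for example pd.DataFrame(subset.T, columns=["X", "Y", "Z"]) followed by px.scatter_3d. If the cloud has sparse regions that matter, a uniform sample may thin them out too much, so a stratified or grid-based sample might be a better fit.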
I would suggest using pythreejs for a point cloud. It has very good performance, even for a large number of points.
import pythreejs as p3
import numpy as np
N = 1_000_000
# Positions centered around the origin
positions = np.random.normal(loc=0.0, scale=100.0, size=(N, 3)).astype('float32')
# Create a buffer geometry with random color for each point
geometry = p3.BufferGeometry(
attributes={'position': p3.BufferAttribute(array=positions),
'color': p3.BufferAttribute(
array=np.random.random((N, 3)).astype('float32'))})
# Create a points material
material = p3.PointsMaterial(vertexColors='VertexColors', size=1)
# Combine the geometry and material into a Points object
points = p3.Points(geometry=geometry, material=material)
# Create the scene and the renderer
view_width = 700
view_height = 500
camera = p3.PerspectiveCamera(position=[800.0, 0, 0], aspect=view_width/view_height)
scene = p3.Scene(children=[points, camera], background="#DDDDDD")
controller = p3.OrbitControls(controlling=camera)
renderer = p3.Renderer(camera=camera, scene=scene, controls=[controller],
width=view_width, height=view_height)
renderer

How to hack this Bokeh HexTile plot to fix the coords, label placement and axes?

Below is Bokeh 1.4.0 code that tries to draw a HexTile map of the input dataframe, with axes, and tries to place labels on each hex.
I've been stuck on this for two days solid, reading bokeh doc, examples and github known issues, SO, Bokeh Discourse and Red Blob Games's superb tutorial on Hexagonal Grids, and trying code. (I'm less interested in raising Bokeh issues for the future, and far more interested in pragmatic workarounds to known limitations to just get my map code working today.) Plot is below, and code at bottom.
Here are the issues, in rough decreasing order of importance (it's impossible to separate the root-cause and tell which causes which, due to the way Bokeh handles glyphs. If I apply one scale factor or coord transform it fixes one set of issues, but breaks another, 'whack-a-mole' effect):
The label placement is obviously wrong, but I can't seem to hack up any variant of either (x,y) coords or (q,r) coords to work. (I tried combinations of figure(..., match_aspect=True), I tried scaling the (x,y)-coords by 1/sqrt(2), and I tried HexTile(..., size, scale) params as per redblobgames, e.g. size = 1/sqrt(3) ~ 0.57735.)
Bokeh forces the origin to be top left, and y-coords to increase as you go down, however the default axis labels show y or r as being negative. I found I still had to use p.text(q, -r, ...). I suppose I have to manually patch the auto-supplied yaxis labels or TickFormatter to be positive.
I use np.mgrid to generate the coord grid, but I still seem to have to assign q-coords right-to-left: np.mgrid[0:8, (4+1):0:-1]. Still no matter what I do, the hexes are flipped L-to-R
(Note: empty '' counties are placeholders to get the desired shape, hence the boolean mask [counties!=''] on grid coords. This works fine and I want to leave it as-is)
The source (q,r) coords for the hexes are integers, and I use 'odd-r' offset coords (not axial or hexagonal coords). No matter what HexTile(..., size, scale) args I use, one or both dimensions in the plot is wrong or squashed. Or whether I include the 1/sqrt(2) factor in coord transform.
My +q-axis is east and my +r-axis should be 120° SSE
Ideally I'd like to have my origin at bottom-left (math plot style, not computer graphics). But Bokeh apparently doesn't support that, I can live without that. However defaulting the y-axis labels to negative, while requiring a mix of positive and negative coords, is confusing. Anyway, how to hack an automatic fix to that with minimum grief? (manual p.yrange = Range1d(?, ?)?)
Bokeh's approach to attaching (hex) glyphs to plots is a hard idiom to use. Ideally I simply want to reference (q,r)-coords everywhere for hexes, labels, axes. I never want to see (x,y)-coords appearing on axes, label coords, tick-marks, etc. but seems Bokeh won't allow you. I guess you have to manually hack the axes and ticks later. Also, the plot<->glyph interface doesn't allow you to expose a (q,r) <-> (x,y) coord transform function, certainly not a bidirectional one.
The default axes don't seem to have any accessors to automatically find their current extent/limits; p.yaxis.start/end are empty unless you specified them. The result from p.yaxis.major_tick_in,p.yaxis.major_tick_out is also wrong, for this plot it gives (2,6) for both x and y, seems to be clipping those to the interior multiples of 2(?). How to automatically get the axes' extent?
My current plot:
My code:
import pandas as pd
import numpy as np
from math import sqrt
from bokeh.plotting import figure
from bokeh.models import ColumnDataSource
from bokeh.models.glyphs import HexTile
from bokeh.io import show
# Data source is a list of county abbreviations, in (q,r) coords...
counties = np.array([
['TE','DY','AM','DN', ''],
['DL','FM','MN','AH', ''],
['SO','LM','CN','LH', ''],
['MO','RN','LD','WH','MH'],
['GA','OY','KE','D', ''],
['', 'CE','LS','WW', ''],
['LC','TA','KK','CW', ''],
['KY','CR','WF','WX', ''],
])
#counties = counties[::-1] # UNUSED: flip so origin is at bottom-left
# (q,r) Coordinate system is “odd/even-r” horizontal Offset coords
r, q = np.mgrid[0:8, (4+1):0:-1]
q = q[counties!='']
r = r[counties!='']
sqrt3 = sqrt(3)
# Try to transform odd-r (q,r) offset coords -> (x,y). Per Red Blob Games' tutorial.
x = q - (r//2) # this may be slightly dubious
y = r
counties_df = pd.DataFrame({'q': q, 'r': r, 'abbrev': counties[counties!=''], 'x': x, 'y': y })
counties_ds = ColumnDataSource(ColumnDataSource.from_df(counties_df)) # ({'q': q, 'r': r, 'abbrev': counties[counties != '']})
p = figure(tools='save,crosshair') # match_aspect=True?
glyph = HexTile(orientation='pointytop', q='x', r='y', size=0.76, fill_color='#f6f699', line_color='black') # q,r,size,scale=??!?!!? size=0.76 is an empirical hack.
p.add_glyph(counties_ds, glyph)
p.xaxis.minor_tick_line_color = None
p.yaxis.minor_tick_line_color = None
print(f'Axes: x={p.xaxis.major_tick_in}:{p.xaxis.major_tick_out} y={p.yaxis.major_tick_in}:{p.yaxis.major_tick_out}')
# Now can't manage to get the right coords for text labels
p.text(q, -r, text=["(%d, %d)" % (q,r) for (q, r) in zip(q, r)], text_baseline="middle", text_align="center")
# Ideally I ultimately want to fix this and plot `abbrev` column as the text label
show(p)
There is an axial_to_cartesian function that will just compute the hex centers for you. You can then attach the labels in a variety of orientations and anchoring from these.
Bokeh does not force the origin to be anywhere. There is one axial to cartesian mapping Bokeh uses, exactly what is given by axial_to_cartesian. The position of the Hex tiles (and hence the cartesian coordinates that the axes display) follows from this. If you want different ticks, Bokeh affords lots of control points over both tick location and tick labelling.
There is more than one convention for Axial coords. Bokeh picked the one that has the r-axis tilt "up and to the left", i.e. the one explicitly shown here:
https://docs.bokeh.org/en/latest/docs/user_guide/plotting.html#hex-tiles
Bokeh expects up-and-to-the-left axial coords. You will need to convert whatever coordinate system you have to that. For "squishing" you will need to set match_aspect=True to ensure the "data space" aspect ratio matches the "pixel space" aspect ratio 1-1.
Alternatively, if you don't or can't use auto-ranging you will need to set the plot size carefully and also control the border sizes with min_border_left etc to make sure the borders are always big enough to accommodate any tick labels you have (so that the inner region will not be resized)
I don't really understand this question, but you have absolute control over what ticks visually appear, regardless of the underlying tick data. Besides the built-in formatters, there is FuncTickFormatter that lets you format ticks any way you want with a snippet of JS code. [1] (And you also have control of where ticks are located, if you want that.)
[1] Please note the CoffeeScript and from_py_func options are both deprecated and being removed in the next 2.0 release.
Again, you'll want to use axial_to_cartesian to position anything other than Hex tiles. No other glyphs in Bokeh understand axial coordinates (which is why we provide the conversion function).
You misunderstood what major_tick_in and major_tick_out are for. They are literally how far the ticks visually extend inside and outside the plot frame, in pixels.
Auto-ranging (with DataRange1d) is only computed in the browser, in JavaScript, which is why the start/end are not available on the "Python" side. If you need to know the start/end, you will need to explicitly set the start/end yourself. Note, however, that match_aspect=True only functions with DataRange1d. If you explicitly set start/end manually, Bokeh will assume you know what you want, and will honor what you ask for, regardless of what it does to aspect.
Below are my solution and plot. Mainly per @bigreddot's advice, but there's still some coordinate hacking needed:
Expecting users to pass input coords as axial instead of offset coords is a major limitation. I work around this. There's no point in creating an offset_to_cartesian() because we need to negate r in two out of three places:
My input is even-r offset coords. I still need to manually apply the offset: q = q + (r+1)//2
I need to manually negate r in both the axial_to_cartesian() call and the datasource creation for the glyph. (But not in the text() call).
The call needs to be: axial_to_cartesian(q, -r, size=2/3, orientation='pointytop')
Need p = figure(match_aspect=True ...) to prevent squishing
I need to manually create my x,y axes to get the range right
Solution:
import pandas as pd
import numpy as np
from math import sqrt
from bokeh.plotting import figure
from bokeh.models import ColumnDataSource, Range1d
from bokeh.models.glyphs import HexTile
from bokeh.io import curdoc, show
from bokeh.util.hex import cartesian_to_axial, axial_to_cartesian
counties = np.array([
['DL','DY','AM','', ''],
['FM','TE','AH','DN', ''],
['SO','LM','CN','MN', ''],
['MO','RN','LD','MH','LH'],
['GA','OY','WH','D' ,'' ],
['' ,'CE','LS','KE','WW'],
['LC','TA','KK','CW','' ],
['KY','CR','WF','WX','' ]
])
counties = np.flip(counties, (0)) # Flip UD for bokeh
# (q,r) Coordinate system is “odd/even-r” horizontal Offset coords
r, q = np.mgrid[0:8, 0:(4+1)]
q = q[counties!='']
r = r[counties!='']
# Transform for odd-r offset coords; +r-axis goes up
q = q + (r+1)//2
#r = -r # cannot globally negate 'r', see comments
# Transform odd-r offset coords (q,r) -> (x,y)
x, y = axial_to_cartesian(q, -r, size=2/3, orientation='pointytop')
counties_df = pd.DataFrame({'q': q, 'r': -r, 'abbrev': counties[counties!=''], 'x': x, 'y': y })
counties_ds = ColumnDataSource(ColumnDataSource.from_df(counties_df)) # ({'q': q, 'r': r, 'abbrev': counties[counties != '']})
p = figure(match_aspect=True, tools='save,crosshair')
glyph = HexTile(orientation='pointytop', q='q', r='r', size=2/3, fill_color='#f6f699', line_color='black') # q,r,size,scale=??!?!!?
p.add_glyph(counties_ds, glyph)
p.x_range = Range1d(-2,6)
p.y_range = Range1d(-1,8)
p.xaxis.minor_tick_line_color = None
p.yaxis.minor_tick_line_color = None
p.text(x, y, text=["(%d, %d)" % (q,r) for (q, r) in zip(q, r)],
text_baseline="middle", text_align="center")
show(p)
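The coordinate steps in the solution above (shift the offset columns, negate r, convert to cartesian) can be checked in isolation with plain NumPy. This is only a sketch: offset_to_axial and pointy_axial_to_xy are hypothetical helper names, and the formula is the usual pointy-top conversion with the y sign chosen so that negating r moves rows upward, which is what the solution relies on. I am assuming, not verifying here, that it agrees with bokeh.util.hex.axial_to_cartesian.

```python
import numpy as np


def offset_to_axial(col, row):
    """Hypothetical helper: the same shift and r negation used in the solution."""
    return col + (row + 1) // 2, -row


def pointy_axial_to_xy(q, r, size):
    """Standard pointy-top axial -> cartesian formula (sign convention assumed)."""
    x = size * np.sqrt(3) * (q + r / 2.0)
    y = -size * 1.5 * r
    return x, y


# A small 4-row by 3-column grid of offset coordinates, row 0 at the bottom.
row, col = np.mgrid[0:4, 0:3]
q, r = offset_to_axial(col, row)
x, y = pointy_axial_to_xy(q, r, size=2 / 3)

print(np.round(x, 3))
print(np.round(y, 3))
```

With size=2/3 the row spacing comes out as exactly 1, neighbouring hexes in a row sit sqrt(3)*size apart, and every odd row is shifted right by half that width. The resulting x and y are what a text glyph needs for label placement.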

How to overplot arrays of different shape?

I'm trying to overplot two arrays with different shapes but I'm unable to project one on the top of the other. For example:
#importing the relevant packages
import numpy as np
import matplotlib.pyplot as plt
def overplot(data1,data2):
    '''
    This function should make a contour plot
    of data2 over the data1 plot.
    '''
    #creating the figure
    fig = plt.figure()
    #adding an axe
    ax = fig.add_axes([1,1,1,1])
    #making the plot for the
    #first dataset
    ax.imshow(data1)
    #overplotting the contours
    #for the second dataset
    ax.contour(data2, projection = data2,
               levels = [0.5,0.7])
    #showing the figure
    plt.show(fig)
    return
if __name__ == '__main__':
    '''
    testing zone
    '''
    #creating two mock datasets
    data1 = np.random.rand(3,3)
    data2 = np.random.rand(9,9)
    #using the overplot
    overplot(data1,data2)
Currently, my output is something like:
While what I actually would like is to project the contours of the second dataset into the first one. This way, if I got images of the same object but with different resolution for the cameras I would be able to do such plots. How can I do that?
Thanks for your time and attention.
It's generally best to make the data match, and then plot it. This way you have complete control over how things are done.
In the simple example you give, you could use repeat along each axis to expand the 3x3 data to match the 9x9 data. That is, you could use, data1b = np.repeat(np.repeat(data1, 3, axis=1), 3, axis=0) to give:
But for the more interesting case of images, like you mention at the end of your question, the axes probably won't be integer multiples and you'll be better served by a spline or other type of interpolation. This difference is an example of why it's better to have control over this yourself, since there are many ways to do this type of mapping.
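As a rough sketch of both cases: the np.repeat call is the one from this answer, while nearest_resample is a hypothetical helper that uses plain nearest-neighbour index mapping for non-integer ratios. That is a simpler stand-in for the spline interpolation suggested above, not a replacement for it.

```python
import numpy as np

rng = np.random.default_rng(1)
data1 = rng.random((3, 3))  # low-resolution image
data2 = rng.random((9, 9))  # high-resolution image

# Integer ratio: repeat every pixel 3 times along each axis.
data1b = np.repeat(np.repeat(data1, 3, axis=1), 3, axis=0)


def nearest_resample(a, shape):
    """Hypothetical helper: nearest-neighbour resampling to an arbitrary shape."""
    rows = (np.arange(shape[0]) * a.shape[0]) // shape[0]
    cols = (np.arange(shape[1]) * a.shape[1]) // shape[1]
    return a[np.ix_(rows, cols)]


# Non-integer ratio: 3x3 -> 7x10.
data1c = nearest_resample(data1, (7, 10))

print(data1b.shape, data1c.shape)
```

Once the shapes match, ax.imshow(data1b) and ax.contour(data2, levels=[0.5, 0.7]) share the same pixel coordinates, so the contours land on top of the image.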
