Bokeh is behaving in a mysterious way - python

import numpy as np
from bokeh.plotting import *
from bokeh.models import ColumnDataSource
# prepare data
N = 300
x = np.linspace(0,4*np.pi, N)
y0 = np.sin(x)
y1 = np.cos(x)
output_notebook()
#create a column data source for the plots to share
source = ColumnDataSource(data = dict(x = x, y0 = y0, y1 = y1))
Tools = "pan, wheel_zoom, box_zoom, reset, save, box_select, lasso_select"
# create a new plot and add a renderer
left = figure(tools = Tools, plot_width = 350, plot_height = 350, title = 'sinx')
left.circle(x, y0,source = source )
# create another plot and add a renderer
right = figure(tools = Tools, plot_width = 350, plot_height = 350 , title = 'cosx')
right.circle(x, y1, source = source)
# put the subplots in a gridplot and show the plot
p = gridplot([[left, right]])
show(p)
Something is wrong with the sine graph, and I don't know why Bokeh is behaving like this. But if I put the y column names in single or double quotation marks, things work fine:
left.circle(x, 'y0',source = source )
right.circle(x, 'y1', source = source)
# put the subplots in a gridplot and show the plot
p = gridplot([[left, right]])
show(p)
Things I tried to resolve the problem:
1) Restarted my notebook (the easiest way to solve a problem).
2) Generated the output in a new window.
3) Generated the plots separately instead of in a grid plot.
Please help me find out the reason behind this.
Am I doing something wrong?
Is it a bug?

If you want to configure multiple glyphs to share data from a single ColumnDataSource, then you always need to configure the glyph properties with the names of the columns, and not with the actual data literals, as you have done. In other words:
left.circle('x', 'y0',source = source )
right.circle('x', 'y1', source = source)
Note that I have quoted 'x' as well. This is the correct way to do things when sharing a source. When you pass a literal value (i.e., a real list or array), glyph functions like .circle automatically synthesize a column for you as a convenience. But they use predefined names based on the property, so if you share a source between two renderers, the second call to .circle will overwrite the 'y' column that the first call to .circle made. That is exactly what you are seeing.
As you can imagine, this behavior is confusing. Accordingly, there is an open GitHub issue to specifically and completely disallow passing in data literals whenever the source argument is provided explicitly. I can guarantee this will happen in the near future, so if you are sharing a source, you should always and only pass in column names (i.e. strings).
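The overwrite described above can be reproduced without Bokeh. The following is a toy model, not Bokeh's actual implementation: add_glyph is a hypothetical helper, and a plain dict stands in for the shared source's data. It assumes, as the answer states, that a literal passed for a property is stored under a column named after that property.

```python
# Toy model of the behaviour described above; NOT Bokeh's real code.
# `add_glyph` is a hypothetical helper and the dicts stand in for the
# data held by a shared ColumnDataSource.

def add_glyph(source_data, x, y):
    """Return the column names a glyph would read for its x and y."""
    spec = {}
    for prop, value in (("x", x), ("y", y)):
        if isinstance(value, str):
            # A string is treated as the name of an existing column.
            spec[prop] = value
        else:
            # A literal is stored under a column named after the property,
            # replacing whatever an earlier call stored under that name.
            source_data[prop] = list(value)
            spec[prop] = prop
    return spec

xs = [0, 1, 2]
sin_like = [0.0, 0.8, 0.9]
cos_like = [1.0, 0.5, -0.4]

# Literals: both glyphs end up pointing at the same synthesized 'y' column.
shared = {"x": xs, "y0": sin_like, "y1": cos_like}
left = add_glyph(shared, xs, sin_like)
right = add_glyph(shared, xs, cos_like)
print(left, right, shared["y"])  # both read 'y', which now holds cos_like

# Column names: each glyph reads its own column and nothing is overwritten.
named = {"x": xs, "y0": sin_like, "y1": cos_like}
left_named = add_glyph(named, "x", "y0")
right_named = add_glyph(named, "x", "y1")
print(left_named, right_named)
```

In the model, the left glyph silently starts reading the cosine values once the right glyph is added, which matches the broken sine plot in the question.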

Related

How to set starting zoom level in EsriImagery with datashader and bokeh?

I want to project a map with its starting position like this
The current output that I get is like this
import holoviews as hv
from geoviews.tile_sources import EsriImagery
from holoviews.operation.datashader import datashade, dynspread
import datashader as ds
import colorcet as cc
hv.extension('bokeh', 'matplotlib')
c = df.loc[(df['dropoff_latitude'] >= 40.5) &
           (df['dropoff_latitude'] <= 41) &
           (df['dropoff_longitude'] >= -74.1) &
           (df['dropoff_longitude'] <= -73.7)]
map_tiles = EsriImagery().opts(alpha=0.5, width=900, height=480, bgcolor='black')
points = hv.Points(ds.utils.lnglat_to_meters(c['dropoff_longitude'], c['dropoff_latitude']))
taxi_trips = datashade(points, dynamic = True, x_sampling=0.1, y_sampling=0.1, cmap=cc.fire, height=1000, width=1000)
map_tiles * taxi_trips
I tried to set a zoom_level or xrange, yrange in the EsriImagery opts, but there are no such parameters. The method itself also has no documentation, and I couldn't find any documentation regarding this online either. (I could be looking in the wrong place.)
There are two ways to do this:
Option 1 -- direct input
Set your wanted values using the parameter x_range and y_range in datashade(...).
taxi_trips = datashade(points, x_range=(-8250000,-8200000))
Option 2 -- indirect input
If you don't know the needed values and you want to play around a bit, you can use this workaround.
The existing figure object has a Range1d object, and this has a start and end point. This can be printed and set by a user.
This code starts with the last line of your example.
from bokeh.plotting import show
fig = hv.render(map_tiles * taxi_trips)
fig.x_range.start = -8250000
fig.x_range.end = -8200000
# fig.x_range.reset_start = -8250000
# fig.x_range.reset_end = -8200000
# the same for the y-axis
show(fig)
Here you have to get the underlying Bokeh figure and set your values on it. These values look a bit odd, and you may have to play with them a bit.
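The range values look odd because the map tiles are drawn in Web Mercator meters rather than degrees, which is also why the question converts its points with ds.utils.lnglat_to_meters. As a sketch, the bounds can be derived from the longitude/latitude filter in the question with the standard spherical Mercator formula. lnglat_to_mercator below is a hypothetical stand-alone helper written for illustration, not part of any of the libraries used above.

```python
import math

# Radius of the sphere used by Web Mercator (EPSG:3857), in meters.
EARTH_RADIUS = 6378137.0

def lnglat_to_mercator(lon, lat):
    """Convert degrees of longitude/latitude to Web Mercator meters."""
    x = math.radians(lon) * EARTH_RADIUS
    y = math.log(math.tan(math.pi / 4 + math.radians(lat) / 2)) * EARTH_RADIUS
    return x, y

# Bounds taken from the DataFrame filter in the question.
x_min, y_min = lnglat_to_mercator(-74.1, 40.5)
x_max, y_max = lnglat_to_mercator(-73.7, 41.0)

x_range = (x_min, x_max)
y_range = (y_min, y_max)
print(x_range)
print(y_range)
```

The resulting tuples are in the same units as the x_range passed to datashade(...) in Option 1 and as fig.x_range.start and fig.x_range.end in Option 2.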
Output for both options
Here is the changed output.
I hope this works for you. Good luck.

plotly - multiple traces using a shared slider variable

As the title hints, I'm struggling to create a plotly chart that has multiple lines that are functions of the same slider variable.
I hacked something together using bits and pieces from the documentation: https://pastebin.com/eBixANqA. This works for one line.
Now I want to add more lines to the same chart, but this is where I'm struggling. https://pastebin.com/qZCMGeAa.
I'm getting a PlotlyListEntryError: Invalid entry found in 'data' at index, '0'
Path To Error: ['data'][0]
Can someone please help?
It looks like you were using https://plot.ly/python/sliders/ as a reference. Unfortunately I don't have time to test with your code, but this should be easily adaptable. Create each trace you want to plot in the same way that you have been:
trace1 = [dict(
type='scatter',
visible = False,
name = "trace title",
mode = 'markers+lines',
x = x[0:step],
y = y[0:step]) for step in range(len(x))]
Note that in my example the data comes from pre-defined lists, whereas you are using a function. That is probably the only change you'll really need to make, besides your own step size etc.
If you create a second trace in the same way, for example
trace2 = [dict(
type='scatter',
visible = False,
name = "trace title",
mode = 'markers+lines',
x = x2[0:step],
y = y2[0:step]) for step in range(len(x2))]
Then you can put all your data together with the following
all_traces = trace1 + trace2
then you can just go ahead and plot it provided you have your layout set up correctly (it should remain unchanged from your single trace example):
fig = py.graph_objs.Figure(data=all_traces, layout=layout)
py.offline.iplot(fig)
Your slider should control both traces provided you were following https://plot.ly/python/sliders/ to get the slider working. You can combine multiple data dictionaries this way in order to have multiple plots controlled by the same slider.
Note that if your lists of data dictionaries have different lengths, this gets topsy-turvy.
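One detail the steps above leave implicit: in the pattern used on https://plot.ly/python/sliders/, each slider step sets a list of visible flags, one per entry in data. After concatenating two trace lists, a step therefore has to switch on one entry from each list. A minimal sketch, assuming both lists have the same length and using made-up data (x, y, x2, y2, the trace names and the step labels are placeholders). It only builds plain dictionaries and does not call plotly.

```python
# Placeholder data; substitute your own lists or function values.
x = [0, 1, 2, 3]
y = [0, 1, 4, 9]
x2 = [0, 1, 2, 3]
y2 = [0, 1, 8, 27]

def make_frames(xs, ys, name):
    """One hidden scatter trace per slider position, as in the answer."""
    return [dict(type='scatter', visible=False, name=name,
                 mode='markers+lines', x=xs[0:step], y=ys[0:step])
            for step in range(len(xs))]

trace1 = make_frames(x, y, "first line")
trace2 = make_frames(x2, y2, "second line")
all_traces = trace1 + trace2

n = len(trace1)
assert len(trace2) == n, "this sketch assumes equally long trace lists"

steps = []
for i in range(n):
    # Entry i belongs to trace1 and entry n + i to trace2; show both.
    visible = [False] * len(all_traces)
    visible[i] = True
    visible[n + i] = True
    steps.append(dict(method='restyle', args=['visible', visible], label=str(i)))

layout = dict(sliders=[dict(active=0, steps=steps)])

# Show the traces that belong to the slider's initial position.
all_traces[0]['visible'] = True
all_traces[n]['visible'] = True
```

all_traces and layout can then be passed to Figure(data=all_traces, layout=layout) as in the answer. The equal-length check makes the different-length problem fail loudly instead of pairing the wrong traces.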

how to create an object out of a seaborn plot for later reference

i'm trying to decide whether i should pursue a project involving a potentially large number of plots using matplotlib or using seaborn. the latter seems a lot more user friendly upon first examination so i am a bit biased that way. that said, i am unclear how i can create an object out of a plot that i can then call later. for example, suppose i have the following code:
x1 = np.random.randn(50)
y1 = np.random.randn(50)
data = pd.DataFrame ({})
data['x1'] = x1
data['y1'] = y1
sns.lmplot('x1', 'y1', data, fit_reg=True, ci = None)
this will display the plot as output in the iPython notebook. what i would like to do however is something like:
x = sns.lmplot('x1', 'y1', data, fit_reg=True, ci = None)
so that i can store x in a dictionary to be called later. this line runs (and will plot the output as well), but typing 'x' in a later cell displays nothing and just shows:
< seaborn.axisgrid.FacetGrid at ... >
any suggestions appreciated!
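The Figure object is accessible at the FacetGrid.fig attribute, i.e. x.fig in your example. You can store that (or the FacetGrid itself) in your dictionary and evaluate x.fig in a later cell to display the plot again.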

Possible update in bokeh is causing a strange generator bug

I had the following code snippet working:
import numpy as np
import bokeh.plotting as bp
from bokeh.models import HoverTool
bp.output_file('test.html')
fig = bp.figure(tools="reset,hover")
x = np.linspace(0,2*np.pi)
y1 = np.sin(x)
y2 = np.cos(x)
s1 = fig.scatter(x=x,y=y1,color='#0000ff',size=10,legend='sine')
s1.select(dict(type=HoverTool)).tooltips = {"x":"$x", "y":"$y"}
s2 = fig.scatter(x=x,y=y2,color='#ff0000',size=10,legend='cosine')
fig.select(dict(type=HoverTool)).tooltips = {"x":"$x", "y":"$y"}
bp.show()
Now the line s1.select ... returns a generator and gives me the following error:
AttributeError: 'generator' object has no attribute 'tooltips'
A server update took place for the process that is running this code, so it is possible that bokeh was updated. What is my fastest workaround for this? Or is there a bug I am missing?
Some time ago the glyph methods were changed to return the glyph renderer, instead of the plot. This makes configuring the visual properties of the glyph renderer much easier. Returning the plot was redundant, since a user typically already has a reference to the plot. But you want to search the plot for a hover tool, not the glyph renderer, so you need to do:
fig.select(HoverTool).tooltips = {"x":"$x", "y":"$y"}
Note that using a dictionary means there is no guarantee about the order of the tooltips. If you care about the order, you should use a list of tuples:
fig.select(HoverTool).tooltips = [("x", "$x"), ("y", "$y")]
Then the tooltip rows will show up in the same order as given, top to bottom.

Changing point color depending on value in real-time plotting with Bokeh

I am using Bokeh in an experiment to plot data in realtime and the library provides a convenient way to do that.
Here is a snippet of my code to accomplish this task:
# do the imports
import pandas as pd
import numpy as np
import time
from bokeh.plotting import *
from bokeh.models import ColumnDataSource
# here is simulated fake time series data
ts = pd.date_range("8:00", "10:00", freq="5S")
ts.name = 'timestamp'
ms = pd.Series(np.arange(0, len(ts)), index=ts)
ms.name = 'measurement'
data = pd.DataFrame(ms)
data['state'] = np.random.choice(3, len(ts))
data['observation'] = np.random.choice(2, len(ts))
data.reset_index(inplace=True)
data.head()
This is what the data looks like.
Next I used the following snippet to push the data to the server in real time:
output_server("observation")
p = figure(plot_width=800, plot_height=400, x_axis_type="datetime")
x = np.array(data.head(2).timestamp, dtype=np.datetime64)
y = np.array(data.head(2).observation)
p.diamond_cross(x,y, size=30, fill_color=None, line_width=2, name='observation')
show(p)
renderer = p.select(dict(name="observation"))[0]
ds = renderer.data_source
for mes in range(len(data)):
    x = np.append(x, np.datetime64(data.loc[mes].timestamp))
    y = np.append(y, np.int64(data.loc[mes].observation))
    ds.data["x"] = x
    ds.data["y"] = y
    ds._dirty = True
    cursession().store_objects(ds)
    time.sleep(.1)
This produces a very nice result, however I need to change the color of each data point conditioned on a value.
In this case, the condition is the state variable which takes three values -- 0, 1, and 2. So my data should be able to reflect that.
I have spent hours trying to figure it out (admittedly I am very new to Bokeh) and any help will be greatly appreciated.
When you push the data, you have to separate the groups by desired color, and then supply the corresponding colors as a palette. There's a longer discussion with several variations at https://github.com/bokeh/bokeh/issues/1967, such as the simple bokeh.charts Dot example bryevdv posted on 28 Feb:
cat = ['foo', 'bar', 'baz']
xyvalues=dict(x=[1,4,5], y=[2,7,3], z=[3,4,5])
dots = Dot(
    xyvalues, cat=cat, title="Data",
    ylabel='FP Rate', xlabel='Vendors',
    legend=False, palette=["red", "green", "blue"])
show(dots)
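The first half of that advice, separating the points by the value that decides their color, does not depend on any particular Bokeh API. A minimal sketch in plain Python, with made-up sample rows and an arbitrary palette (both are assumptions, as is the helper name group_by_state):

```python
from collections import defaultdict

# Arbitrary palette: one color per value of the `state` column (0, 1, 2).
PALETTE = {0: "red", 1: "green", 2: "blue"}

def group_by_state(timestamps, observations, states):
    """Split the points into one (xs, ys) pair per state value."""
    groups = defaultdict(lambda: ([], []))
    for t, obs, state in zip(timestamps, observations, states):
        xs, ys = groups[state]
        xs.append(t)
        ys.append(obs)
    return dict(groups)

# Made-up sample rows standing in for the DataFrame in the question.
timestamps = ["08:00:00", "08:00:05", "08:00:10", "08:00:15"]
observations = [1, 0, 1, 1]
states = [2, 0, 2, 1]

groups = group_by_state(timestamps, observations, states)
for state, (xs, ys) in sorted(groups.items()):
    # Each group can then be pushed as its own series with its own color.
    print(state, PALETTE[state], xs, ys)
```

In the streaming loop from the question, each new row would be appended to the group for its state, so that each group keeps a single fixed color.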
Please remember to read and follow the posting guidelines at https://stackoverflow.com/help/how-to-ask; I found this and several other potentially useful hits with my first search attempt, "Bokeh 'change color' plot". If none of these solve your problem, you need to differentiate what you're doing from the answers already out there.
