plotly graph object colors too similar - python

I am trying to plot values using Plotly graph objects, with one column setting the color.
fig_1.add_trace(
    go.Scattermapbox(
        lat=data.latitude,
        lon=data.longitude,
        marker=go.scattermapbox.Marker(size=9, color=data.id_nr),
    )
)
However, the values are very large numbers (ID numbers, int64), so IDs that are relatively close to each other (differing by 100k rather than 1M) appear in almost the same color.
Is there a way to set the colors as discrete colors? Using
... color=data.id_nr.astype(str)
as is done in Plotly Express to make the colors discrete, does not work:
Invalid element(s) received for the 'color' property of
scattermapbox.marker
The basic question is: can you set the colors so that each value gets a unique color, regardless of how close or far apart the values are?
EDIT:
The id values are more like:
id=[1,2,5,100004,100007,100009]
With Plotly's continuous coloring, the first three and the last three are nearly identical in color.
Plotly Express solves this by converting the int values (of id) to strings, which makes them discrete.
EDIT2:
One solution would be to separate the data by ID and then add a trace for each ID. However, this is not ... "sexy", and I would rather find a solution where Plotly handles the colors discretely.

I recreated the data and wrote the code by adapting the example from the official reference to your task. The data is a data frame with columns for the id, latitude, longitude, location name, and a color for each row. In the marker definition, I pass the colors column, which holds discrete colors taken from px.colors.qualitative.Alphabet, rather than the id column; the continuous 'Viridis' color scale is left commented out. See this for a concrete example of introducing a color scale in a scatter plot.
import pandas as pd
import plotly.express as px
lat = [38.91427,38.91538,38.91458,38.92239,38.93222,38.90842,38.91931,38.93260,38.91368,38.88516,38.921894,38.93206, 38.91275]
lon = [-77.02827,-77.02013,-77.03155,-77.04227,-77.02854,-77.02419,-77.02518,-77.03304,-77.04509,-76.99656,-77.042438,-77.02821,-77.01239]
name = ["The coffee bar","Bistro Bohem","Black Cat", "Snap","Columbia Heights Coffee","Azi's Cafe", "Blind Dog Cafe","Le Caprice","Filter","Peregrine","Tryst","The Coupe","Big Bear Cafe"]
ids = [1,2,3,4,5001,5002,5003,5004,100004,100007,100009,100010,100011]
colors = px.colors.qualitative.Alphabet[:len(lat)]
df = pd.DataFrame({'id':ids, 'lat':lat,'lon':lon,'name':name,'colors': colors})
df.head()
     id       lat       lon                     name   colors
0     1  38.91427 -77.02827           The coffee bar  #AA0DFE
1     2  38.91538 -77.02013             Bistro Bohem  #3283FE
2     3  38.91458 -77.03155                Black Cat  #85660D
3     4  38.92239 -77.04227                     Snap  #782AB6
4  5001  38.93222 -77.02854  Columbia Heights Coffee  #565656
import plotly.graph_objects as go
mapbox_access_token = open("mapbox_api_key.txt").read()
fig = go.Figure(go.Scattermapbox(
    lat=df['lat'],
    lon=df['lon'],
    mode='markers',
    marker=go.scattermapbox.Marker(
        size=16,
        color=df['colors'],
        #colorscale='Viridis'
    ),
    text=df['name'],
))
fig.update_layout(
    autosize=False,
    width=1000,
    height=500,
    hovermode='closest',
    mapbox=dict(
        accesstoken=mapbox_access_token,
        bearing=0,
        center=dict(
            lat=38.92,
            lon=-77.07
        ),
        pitch=0,
        zoom=11
    ),
)
fig.show()
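If no ready-made palette column exists, a small helper can assign one color per distinct id before the list is passed to the marker. The following is only a sketch and not part of the original answer: discrete_colors and PALETTE are hypothetical names, and the palette reuses the five Alphabet colors printed in df.head() above. Colors repeat once there are more distinct ids than palette entries.

```python
from itertools import cycle

# Assumption: the first five entries of px.colors.qualitative.Alphabet,
# copied from the df.head() output above.
PALETTE = ["#AA0DFE", "#3283FE", "#85660D", "#782AB6", "#565656"]


def discrete_colors(ids, palette=PALETTE):
    """Return one color per element of ids; equal ids share a color."""
    mapping = {}
    colors = cycle(palette)  # wraps around when the palette is exhausted
    for value in ids:
        if value not in mapping:
            mapping[value] = next(colors)
    return [mapping[value] for value in ids]


# The ids from the question, with one repeated id appended.
ids = [1, 2, 5, 100004, 100007, 100009, 2]
point_colors = discrete_colors(ids)
```

The resulting list could then be passed as color=point_colors in go.scattermapbox.Marker, in the same way the answer passes df['colors'].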

Related

Manually set color legend for Plotly line plot [duplicate]

I would like to fix the colors for my legend in a Plotly line graph. Basically, I have a dataframe df with a column Sex, and I already know that this column will hold only 3 possible values.
So I would like to fix the colors, i.e. green for Male, blue for Female and yellow for Other. This should also hold when one category, e.g. Male, has no occurrences in the df.
Currently, my code falls back to default colors. For example, if df contains all categories, it sets blue for Male, yellow for Female and green for Other. But when df only contains Male and Female, the color assignments change, so consistency is lost.
The code is as follows:
df = pd.DataFrame(...)
lineplot = px.line(
    df,
    x="Date",
    y='Ct',
    color='Sex',
    title=f"Lineplot to keep track of people across time"
)
lineplot.show()
You can define the colors individually using the color_discrete_map argument (Ref). Each series then keeps the color that was set manually, even when another series is left out:
import plotly.express as px
import pandas as pd
df = pd.DataFrame(dict(
    Date=[1,2,3],
    Male = [1,2,3],
    Female = [2,3,1],
    Others = [7,5,2]
))
fig = px.line(df, x="Date", y=["Male",
                              # "Female",  # left out to test that the other colors stay fixed
                              "Others"],
              color_discrete_map={
                  "Male": "#456987",
                  "Female": "#147852",
                  "Others": "#00D",
              })
fig.show()

Add formatting, surrounding box to Altair vertical line tooltip label?

I am new to Altair, and am attempting to plot a monthly time-series variable, and have a vertical line tooltip display the date and corresponding y-value.
The code I have (warning, probably a bit ugly) gets me most of the way there:
import altair as alt
import datetime as dt
import numpy as np
import pandas as pd
# create DataFrame
monthly_dates = pd.date_range('1997-09-01', '2022-08-01', freq = 'M')
monthly_data = pd.DataFrame(
    index=['Date', 'y_var'],
    data=[monthly_dates, np.random.normal(size = len(monthly_dates))]
).T
# Create a selection that chooses the nearest point & selects based on x-value
nearest = alt.selection(type='single', nearest=True, on='mouseover',
                        fields=['Date'], empty='none')
# The basic line
line = alt.Chart(monthly_data).mark_line().encode(
    x='Date:T',
    y=alt.Y('y_var', title='Y variable')
)
# Transparent selectors across the chart. This is what tells us
# the x-value of the cursor
selectors = alt.Chart(monthly_data).mark_point().encode(
    x='Date',
    opacity=alt.value(0),
).add_selection(
    nearest
)
# Draw points on the line, and highlight based on selection
points = line.mark_point().encode(
    opacity=alt.condition(nearest, alt.value(1), alt.value(0))
)
# Draw text labels near the points, and highlight based on selection
text_x = line.mark_text(align='left', dx=5, dy=-10).encode(
    text=alt.condition(nearest, 'Date', alt.value(' '))
)
# Draw text labels near the points, and highlight based on selection
text_y = line.mark_text(align='left', dx=5, dy=5).encode(
    text=alt.condition(nearest, 'y_var', alt.value(' '))
).transform_calculate(label='datum.y_var + "%"')
# Draw a rule at the location of the selection
rules = alt.Chart(monthly_data).mark_rule(color='gray').encode(
    x='Date',
).transform_filter(
    nearest
)
# Put the six layers into a chart and bind the data
chart = alt.layer(
    line, selectors, points, rules, text_x, text_y
).properties(
    width=600, height=300
).interactive()
chart.show()
yields the following interactive chart:
There are two things I need to do, though:
1. Add a box around the tooltip labels (and a plain background to this box), so that they are easy to read.
2. Format the labels independently: since we have monthly data, it would be great to drop the day and just have Oct 2008 or 2008-10 or something along those lines. For the value, rounding to one or two digits and adding '%' afterwards would be great. I tried using the example found here (as you can see for creating text_y) but to no avail.
Any and all help would be greatly appreciated. Apologies in advance for any dumb mistakes or poor coding practices; again, I am still learning the basics of Altair.
Update: I figured both out.
The solutions to both 1 and 2 are in the code below.
For 1: rather than adding a box around the text manually, I added tooltips to the selectors object and dropped text_x and text_y entirely.
For 2: I used transform_calculate to create new fields x_label and y_label that contain exactly what I want to display, then fed these into the tooltip objects. This page has tons of ways to transform data.
selectors = alt.Chart(monthly_data).mark_point().transform_calculate(
    x_label='timeFormat(datum.Date, "%b %Y")',
    y_label='format(datum.y_var, ".1f") + "%"'
).encode(
    x='Date',
    opacity=alt.value(0),
    tooltip=[
        alt.Tooltip('x_label:N', title='Date'),
        alt.Tooltip('y_label:N', title='Pct. Change')
    ]
).add_selection(
    nearest
)
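The two format strings in transform_calculate, "%b %Y" and ".1f", follow d3-style specifiers, which for these particular patterns read the same as Python's strftime and format codes. As a rough preview in plain Python (an approximation for illustration, not the Vega expression engine itself; preview_labels is a hypothetical helper), the labels come out as follows:

```python
from datetime import datetime


def preview_labels(date, value):
    """Mimic the x_label / y_label expressions with Python formatting."""
    x_label = date.strftime("%b %Y")          # abbreviated month + year
    y_label = format(value, ".1f") + "%"      # one decimal, then a percent sign
    return x_label, y_label


labels = preview_labels(datetime(2008, 10, 31), 1.2345)
```

With the default locale this gives the Oct 2008 style of label asked for in the question, and the value rounded to one decimal with a trailing percent sign.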

How do I get the positive/negative bars on the second facet plot to color correctly?

I have the dataframe below, where I add the column "Color" with red or green depending on whether 'netflow_USD' is negative or positive.
         date  netflow_total coin        close   netflow_USD  Color
0  2022-07-27   59784.661988  eth  1636.451360  9.783469e+07  green
1  2022-07-26  232728.945946  eth  1449.334247  3.373020e+08  green
2  2022-07-25  126246.255448  eth  1440.557443  1.818650e+08  green
3  2022-07-24   56072.035139  btc  1598.078740  8.960753e+07  green
4  2022-07-23  -16099.547813  btc  1548.982186 -2.493791e+07    red
When plotting with Plotly Express, the bars in the second facet are colored incorrectly.
Here is the code for the chart:
fig = px.bar(data, x="date", y="netflow_USD", facet_row="coin")
fig.update_traces(marker_color=data["Color"]) #reassign bar colors based on red / green column
fig.show()
The bars on the bottom chart are colored based on the colors from the top chart, not the 'Color' column. Any idea how to fix this?
I don't know whether this is a bug or intended behavior. As a workaround, use subplots: extract the rows for each crypto asset from the data frame and draw one bar chart per asset. Since the sample data contains only two asset names, the y-axis title and the annotation placement are hard-coded for two rows. The fig.add_annotation() call should be disabled when the number of asset names increases, because a different method is then required.
from plotly.subplots import make_subplots
fig = make_subplots(rows=2, cols=1)
for i,c in enumerate(data['coin'].unique()):
    # Rows for the current coin; @c refers to the loop variable c
    df = data.query('coin == @c')
    fig.add_bar(x=df['date'], y=df['netflow_USD'], marker=dict(color=df['Color']), row=i+1, col=1)
    fig.update_layout(yaxis=dict(title=c, anchor='y2'))#yaxis=dict(title='netflow_USD'),
    fig.add_annotation(x=1.005, y=0.25 if i == 1 else 0.75,
                       showarrow=False,
                       text=c,
                       textangle=90,
                       xanchor='left',
                       xref='paper',
                       yanchor='middle',
                       yref='paper',)
fig.update_layout(xaxis=dict(dtick='1D'),
                  yaxis=dict(title='netflow_USD'),
                  xaxis2=dict(dtick='1D'),
                  yaxis2=dict(title='netflow_USD'),
                  showlegend=False)
fig.show()

Fix scale bottom colour on 0 in altair

I am generating a waffle plot (github-like activity heatmap) in the following way:
import altair as alt
import pandas as pd
# Import data
df = pd.read_csv("https://pastebin.com/raw/AzwJ0va4")
# Year interactive dropdown
years = list(df["year"].unique())
year_dropdown = alt.binding_select(options=years)
selection = alt.selection_single(
    fields=["year"], bind=year_dropdown, name="Year", init={"year": 2020}
)
# Plot
(
    alt.Chart(df)
    .mark_rect()
    .encode(
        x=alt.X("week:O", title="Week"),
        y=alt.Y("day(committed_on):O", title=""),
        color=alt.Color(
            "hash:Q", scale=alt.Scale(range=["transparent", "green"]), title="Commits"
        ),
        tooltip=[
            alt.Tooltip("committed_on", title="Date"),
            alt.Tooltip("day(committed_on)", title="Day"),
            alt.Tooltip("hash", title="Commits"),
        ],
    )
    .add_selection(selection)
    .transform_filter(selection)
    .properties(width=1000, height=200)
)
The resulting plot behaves 99% as I would expect, but when I select a year with no activity (hash column populated with 0), such as 2017, the plot is filled with green squares, because 0 is anchored exactly in the middle of the scale.
How can I make sure that 0 is always placed at the bottom of the scale (the transparent color)?
You can set the domain of the color scale the same way you set it for an axis: scale=alt.Scale(range=["transparent", "green"], domain=[0, 16]). It is possible to set just domainMin in newer versions of Vega-Lite, but not yet in Altair. In your case it is probably a good idea to set both min and max anyway, so that colors are interpreted the same way for all years.

Plotly subplot represent same y-axis name with same color and single legend

I am trying to create a plot for two categories in a subplot. The 1st column of the subplot represents category FF and the 2nd column represents category RF.
The x-axis is always time and the y-axis is each of the remaining columns. In other words, it is a plot of one column vs. the rest.
The 1st and 2nd categories always have the same column names; only the values differ.
I tried to generate the plot in a for loop, but the problem is that Plotly treats each column name as distinct and therefore draws the lines for same-named y-axes in different colors. As a consequence, a separate legend entry is also created for each.
For example, in the first row, Time vs Price2010, I want both subplots FF and RF to be drawn in the same color (say blue) with a single entry in the legend.
I tried adding legendgroup in go.Scatter, but it doesn't help.
import pandas as pd
from pandas import DataFrame
from plotly import tools
from plotly.offline import init_notebook_mode, plot, iplot
import plotly.graph_objs as go
from plotly.subplots import make_subplots
CarA = {'Time': [10,20,30,40 ],
        'Price2010': [22000,26000,27000,35000],
        'Price2011': [23000,27000,28000,36000],
        'Price2012': [24000,28000,29000,37000],
        'Price2013': [25000,29000,30000,38000],
        'Price2014': [26000,30000,31000,39000],
        'Price2015': [27000,31000,32000,40000],
        'Price2016': [28000,32000,33000,41000]
        }
ff = DataFrame(CarA)
CarB = {'Time': [8,18,28,38 ],
        'Price2010': [19000,20000,21000,22000],
        'Price2011': [20000,21000,22000,23000],
        'Price2012': [21000,22000,23000,24000],
        'Price2013': [22000,23000,24000,25000],
        'Price2014': [23000,24000,25000,26000],
        'Price2015': [24000,25000,26000,27000],
        'Price2016': [25000,26000,27000,28000]
        }
rf = DataFrame(CarB)
Type = {
    'FF' : ff,
    'RF' : rf
}
fig = make_subplots(rows=len(ff.columns), cols=len(Type), subplot_titles=('FF','RF'),vertical_spacing=0.3/len(ff.columns))
labels = ff.columns[1:]
for indexC, (cat, values) in enumerate(Type.items()):
    for indexP, params in enumerate(values.columns[1:]):
        trace = go.Scatter(x=values.iloc[:,0], y=values[params], mode='lines', name=params,legendgroup=params)
        fig.append_trace(trace,indexP+1, indexC+1)
        fig.update_xaxes(title_text=values.columns[0],row=indexP+1, col=indexC+1)
        fig.update_yaxes(title_text=params,row=indexP+1, col=indexC+1)
fig.update_layout(height=2024, width=1024,title_text="Car Analysis")
iplot(fig)
It might not be a good solution, but so far this hack is all I have been able to come up with.
fig = make_subplots(rows=len(ff.columns), cols=len(Type), subplot_titles=('FF','RF'),vertical_spacing=0.2/len(ff.columns))
labels = ff.columns[1:]
colors = [ '#a60000', '#f29979', '#d98d36', '#735c00', '#778c23', '#185900', '#00a66f']
legend = True
for indexC, (cat, values) in enumerate(Type.items()):
    for indexP, params in enumerate(values.columns[1:]):
        trace = go.Scatter(x=values.iloc[:,0], y=values[params], mode='lines', name=params,legendgroup=params, showlegend=legend, marker=dict(
            color=colors[indexP]))
        fig.append_trace(trace,indexP+1, indexC+1)
        fig.update_xaxes(title_text=values.columns[0],row=indexP+1, col=indexC+1)
        fig.update_yaxes(title_text=params,row=indexP+1, col=indexC+1)
    fig.update_layout(height=1068, width=1024,title_text="Car Analysis")
    legend = False
If you combine your data into a single tidy data frame, you can use a simple Plotly Express call to make the chart: px.line() with color, facet_row and facet_col
