Change trace order in parallel plots with plotly - python

I have a question about plotly's parallel_coordinates function.
To reproduce my case use this code:
import numpy as np
import pandas as pd
import plotly.express as px
np.random.seed(100)
data = np.random.random([1000,2])
data_df = pd.DataFrame(data, columns=["data1","data2"])
fig_parallel = px.parallel_coordinates(data_df,
color="data2",
dimensions=["data1","data2"])
fig_parallel.write_html('test.html')
You will get this image:
The color bar corresponds to the last axis (data2).
Notice that the traces going to the top of the last axis (the yellow ones) are drawn in front.
What I want is the same plot but with the blue traces in front, for better visualisation (the bottom data are my data of interest).
Thank you in advance to anyone who may be able to give me a solution ^^

Related

Creating a 2 colour heatmap with Python

I have numerous sets of seasonal data that I am looking to show in a heatmap format. I am not worried about the magnitude of the values in the dataset, but more about the overall direction and any patterns that I can look at in more detail later. To do this I want to create a heatmap that only shows 2 colours (red for below zero and green for zero and above).
I can create a normal heatmap with seaborn but the normal colour maps do not have only 2 colours and I am not able to create one myself. Even if I could I am unable to set the parameters to reflect the criteria of below zero = red and zero+ = green.
I managed to create this simply by styling the dataframe, but I was unable to export it as a .png because the table_conversion='matplotlib' option removes the formatting.
Below is an example of what I would like to create, made from random data. Could someone help or point me in the direction of a helpful Stack Overflow answer?
I have also included the code I used to style and export the dataframe.
Desired output - this is created with random data in an Excel spreadsheet
#Code to create a regular heatmap - can this be easily amended?
df_hm = pd.read_csv(filename+h)
pivot = df_hm.pivot_table(index='Year', columns='Month', values='delta', aggfunc='sum')
fig, ax = plt.subplots(figsize=(10,5))
ax.set_title('M1 '+h[:-7])
sns.heatmap(pivot, annot=True, fmt='.2f', cmap='RdYlGn')
plt.savefig(chartpath+h[:-7]+" M1.png", bbox_inches='tight')
plt.close()
#code used to export dataframe that loses format in the .png
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import dataframe_image as dfi
#pivot is the dataframe name; title and root are defined elsewhere
pivot = pd.DataFrame(np.random.randint(-100,100,size=(5, 12)),columns=list('ABCDEFGHIJKL'))
styles = [dict(selector="caption", props=[("font-size", "120%"),("font-weight", "bold")])]
pivot = (pivot.style.format(precision=2)
    .highlight_between(left=-100000, right=-0.01, props='color:white;background-color:red')
    .highlight_between(left=0, right=100000, props='color:white;background-color:green')
    .set_caption(title)
    .set_table_styles(styles))
dfi.export(pivot, root+'testhm.png', table_conversion='matplotlib',chrome_path=None)
You can set the cmap argument to a list of colours. Annotations still work and show the original values, because the data are not converted to -1 or 1.
import numpy as np
import seaborn as sns
arr = np.random.randn(10,10)
sns.heatmap(arr,cmap=["grey",'green'],annot=True,center=0)
# center=0 makes zero the dividing point between the two colours
Output:
PS. If you don't want the color bar, you can pass cbar=False to sns.heatmap.
Welcome to SO!
To achieve what you need, you just need to pass delta through the sign function. Here's some example code:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
arr = np.random.randn(25,25)
sns.heatmap(np.sign(arr))
This results in a binary heatmap, albeit one with a rather ugly colormap. Still, you can fiddle around with Seaborn's colormaps to make it look like the Excel version.
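The two answers above can be combined: colour the cells by sign, but annotate them with the original values. The following is only a minimal sketch under a few assumptions: the random pivot table is a stand-in for the asker's data, the output file name is made up, and the Agg backend line is only there so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the question's pivot table (5 rows x 12 columns).
rng = np.random.default_rng(0)
pivot = pd.DataFrame(rng.uniform(-100, 100, size=(5, 12)).round(2),
                     columns=list("ABCDEFGHIJKL"))

# 0 where the value is below zero, 1 where it is zero or above.
colour_codes = (pivot >= 0).astype(int)

fig, ax = plt.subplots(figsize=(10, 5))
sns.heatmap(colour_codes,
            cmap=["red", "green"],  # 0 -> red, 1 -> green
            vmin=0, vmax=1,         # pin the scale so the mapping holds even if one sign is absent
            annot=pivot,            # label each cell with the original value
            fmt=".2f",
            cbar=False,
            ax=ax)
fig.savefig("two_colour_heatmap.png", bbox_inches="tight")  # hypothetical file name
plt.close(fig)
```

Because the colours come from a separate 0/1 table, the annotations keep their real magnitudes while the colour only encodes direction.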

How to incorporate subplots option when plotting a data frame using Pandas-Bokeh?

I have a dataframe corresponding to a multivariate time series which I'd like to plot. Each channel would appear on its own set of axes, with all plots arranged vertically. I'd also like to add the interactive options available with Bokeh, including the ability to remove one channel from view by clicking on its label.
Without Bokeh, I can use subplots to get the separate "static" plots stacked vertically as follows:
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
A=np.random.rand(800,10)
df=pd.DataFrame(data=A,columns=['a','b','c','d','e','f','g','h','i','j'])
df.plot(subplots=True)
plt.show()
I can plot the 10 channels on one set of axes using Bokeh using this:
import numpy as np
import pandas as pd
pd.set_option('plotting.backend', 'pandas_bokeh')
A=np.random.rand(800,10)
df=pd.DataFrame(data=A,columns=['a','b','c','d','e','f','g','h','i','j'])
df.plot_bokeh(kind="line")
The resulting graph allows for zooming, panning, channel de-selection, etc. However, all signals are plotted on the same set of axes, which I would rather not do.
I use this code snippet to plot my figures in a grid.
import itertools
import pandas as pd
import pandas_bokeh
from bokeh.palettes import Dark2_5 as palette
def plot_grid(df: pd.DataFrame):
    figs = []
    color = itertools.cycle(palette)  # cycle through the palette, one colour per column
    for c in df.columns:
        # build each column's figure without showing it yet
        figs.append(df[c].plot_bokeh(show_figure=False, color=next(color)))
    pandas_bokeh.plot_grid(figs, ncols=1, plot_width=1500)
The ncols parameter allows you to specify how many columns you want per row.
Hope this helps!

How to plot multiple lines on the same y-axis using Plotly Express in Python

I just installed plotly express and am trying to do something simple: plot each column of my data frame on the same y-axis, with the index as the x-axis. Here are my questions/observations:
Is it necessary for the data frame to have the index as a column to use it as the x-axis? Can I not use the index for the x-axis directly?
How can I add multiple traces, as they are called in plotly, on the y-axis for the same x-axis?
Please note that I am not trying to add traces using plotly, but rather trying to use plotly express.
Also, there are a few similar posts online; the closest was this:
https://community.plot.ly/t/multiple-traces-plotly-express/23360
However, this post shows how you can add a scatter, not a line. I want to plot a line and there is no add_line similar to add_scatter shown in the example here.
Appreciate any help in advance
Sample code:
import plotly.express as px
import pandas as pd
import numpy as np
# Get some data
df = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/master/finance-charts-apple.csv')
# Plot
fig = px.line(df, x='Date', y='AAPL.High')
# Only thing I figured is - I could do this
fig.add_scatter(x=df['Date'], y=df['AAPL.Low']) # Not what is desired - need a line
# Show plot
fig.show()
Plot:
Short answer:
fig = px.line(df, x='Date', y=df.columns[1:-6])
Here df.columns holds the column names of the dataframe; you can pass all of them, or a subset such as df.columns[1:-6].
The details
Your code works fine. But if you specifically do not want to apply the (somewhat laborious) add_trace() function to each line, you can use px.line(). This used to require you to transform your data from wide to long format, but not anymore, so just define an index and name the columns you'd like to plot. Or reference all or a subset of your dataframe columns through, for example, y=df.columns[1:-6].
Code 1:
# imports
import plotly.express as px
import pandas as pd
import numpy as np
# data
df = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/master/finance-charts-apple.csv')
fig = px.line(df, x='Date', y=df.columns[1:-6])
# Show plot
fig.show()
Plot:
If you'd like to know how to do the same thing with data of a long format, here's how you do that too using pandas and plotly:
Code 2:
# imports
import plotly.express as px
import pandas as pd
import numpy as np
# data
df_wide = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/master/finance-charts-apple.csv')
df_long=pd.melt(df_wide, id_vars=['Date'], value_vars=['AAPL.Open', 'AAPL.High', 'AAPL.Low', 'AAPL.Close', 'mavg'])
# plotly
fig = px.line(df_long, x='Date', y='value', color='variable')
# Show plot
fig.show()
Not sure what type of line you're looking for, but have you tried something like the below?
fig.add_scatter(x=df['Date'], y=df['AAPL.Low'],mode='lines')
On a standard scatter you can set the mode to be any combination of lines, markers and text.
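There is one way to draw several lines on a single graph, although it uses matplotlib rather than plotly.
import matplotlib.pyplot as plt
# x1, y1, ... are your own data; pass figsize=(width, height) in inches to plt.figure() if needed
plt.figure()
# plt.plot returns a list of lines, so unpack the single line from each call
line1, = plt.plot(x1, y1, label='line 1')
line2, = plt.plot(x2, y2, label='line 2')
line3, = plt.plot(x3, y3, label='line 3')
plt.legend(handles=[line1, line2, line3], fontsize=10)
plt.show()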

Matplotlib bar chart - overlay bars similar to stacked

I want to create a matplotlib bar plot that has the look of a stacked plot without being additive from a multi-index pandas dataframe.
The below code gives the basic behaviour
%matplotlib notebook
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import io
data = io.StringIO('''Fruit,Color,Price
Apple,Red,1.5
Apple,Green,1.0
Pear,Red,2.5
Pear,Green,2.3
Lime,Green,0.5
Lime,Red,3.0
''')
df_unindexed = pd.read_csv(data)
df_unindexed
df = df_unindexed.set_index(['Fruit', 'Color'])
df.unstack().plot(kind='bar')
The plot command df.unstack().plot(kind='bar') shows all the apple prices grouped next to each other. If you choose the option df.unstack().plot(kind='bar',stacked=True) - it adds the prices for Red and Green together and stacks them.
I am wanting a plot that is halfway between the two - it shows each group as a single bar, but overlays the values so you can see them all. The below figure (done in powerpoint) shows what behaviour I am looking for -> I want the image on the right.
Short of calculating all the values and then using the stacked option, is this possible?
This seems (to me) like a bad idea, since this representation leads to several problems. Will a reader understand that those are not stacked bars? What happens when the front bar is taller than the ones behind?
In any case, to accomplish what you want, I would simply repeatedly call plot() on each subset of the data and using the same axes so that the bars are drawn on top of each other.
In your example, the "Red" prices are always higher, so I had to adjust the order to plot them in the back, or they would hide the "Green" bars.
fig,ax = plt.subplots()
my_groups = ['Red','Green']
df_group = df_unindexed.groupby("Color")
for color in my_groups:
    temp_df = df_group.get_group(color)
    temp_df.plot(kind='bar', ax=ax, x='Fruit', y='Price', color=color, label=color)
There are two problems with this kind of plot. (1) What if the background bar is smaller than the foreground bar? It would simply be hidden and not visible. (2) A chart like this is not distinguishable from a stacked bar chart. Readers will have severe problems interpreting it.
That being said, you can plot both columns individually.
import matplotlib.pyplot as plt
import pandas as pd
import io
data = io.StringIO('''Fruit,Color,Price
Apple,Red,1.5
Apple,Green,1.0
Pear,Red,2.5
Pear,Green,2.3
Lime,Green,0.5
Lime,Red,3.0''')
df_unindexed = pd.read_csv(data)
df = df_unindexed.set_index(['Fruit', 'Color']).unstack()
df.columns = df.columns.droplevel()
plt.bar(df.index, df["Red"].values, label="Red")
plt.bar(df.index, df["Green"].values, label="Green")
plt.legend()
plt.show()
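Both answers point out that a taller front bar hides the one behind it. One possible workaround, shown here only as a sketch, is to give each bar a zorder based on its height so the taller bar in every group is drawn first. The Lime prices are swapped relative to the question so that one group has the green bar taller, and the Agg backend line is only there so the script runs without a display.

```python
import io
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Question's data, with the Lime prices swapped for illustration.
data = io.StringIO("""Fruit,Color,Price
Apple,Red,1.5
Apple,Green,1.0
Pear,Red,2.5
Pear,Green,2.3
Lime,Green,3.0
Lime,Red,0.5
""")
prices = pd.read_csv(data).pivot(index="Fruit", columns="Color", values="Price")

fig, ax = plt.subplots()
tallest = prices.to_numpy().max()
bars_by_color = {}
for color in prices.columns:
    bars = ax.bar(prices.index, prices[color], color=color.lower(), label=color)
    for rect in bars:
        # Taller bars get a lower zorder, so they are drawn first (behind).
        rect.set_zorder(1 + tallest - rect.get_height())
    bars_by_color[color] = bars
ax.legend()
```

This keeps every bar visible, but it does not solve the second objection: the result can still be mistaken for a stacked chart.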

Pandas stacked area chart with zero values

I am creating a stacked area chart using pandas df.plot(kind='area'). Some of my data values are zero at some times. I would like to not have the line show where the value is zero. Is it possible to hide the line while still showing the area?
Here is basic code that makes a simple graph. I don't want the red line to show between 3 and 4 because the values are 0.
import numpy as np
import pandas as pd
data = np.array([np.arange(10)]*3).T
df = pd.DataFrame(data, columns = ['A','B','C'])
df['C']=np.where(df.index==4,0,df['C'])
df['C']=np.where(df.index==3,0,df['C'])
df.plot(kind='area')
I have finally worked out the solution to this. Other places suggested edgecolor etc but it didn't solve the problem. linewidth, however, does.
linewidth=0
or, in your case, use the line of code:
df.plot(kind='area', linewidth=0)
