Multiline chart from dataframe using nvd3

The nvd3 line chart in the example below uses Python lists as its data source. How can I plot multiple lines from a pandas DataFrame without explicitly naming the columns, the way df.plot() does in pandas? The DataFrame could contain any number of columns.
from nvd3 import lineChart
# Open File for test
output_file = open('test_lineChart.html', 'w')
# ---------------------------------------
type = "lineChart"
chart = lineChart(name=type, x_is_date=False, x_axis_format="AM_PM")
xdata = list(range(0, 24))
ydata = [0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 4, 3, 3, 5, 7, 5, 3, 16, 6, 9, 15, 4, 12]
ydata2 = [9, 8, 11, 8, 3, 7, 10, 8, 6, 6, 9, 6, 5, 4, 3, 10, 0, 6, 3, 1, 0, 0, 0, 1]
kwargs1 = {'color': 'black'}
kwargs2 = {'color': 'red'}
extra_serie = {"tooltip": {"y_start": "There is ", "y_end": " calls"}}
chart.add_serie(y=ydata, x=xdata, name='sine', extra=extra_serie, **kwargs1)
extra_serie = {"tooltip": {"y_start": "", "y_end": " min"}}
chart.add_serie(y=ydata2, x=xdata, name='cose', extra=extra_serie, **kwargs2)
chart.buildhtml()
output_file.write(chart.htmlcontent)
# close Html file
output_file.close()
How to plot from this dataframe using nvd3:
df = pd.DataFrame(data)
df = df.set_index('datetime')
fig, ax = plt.subplots()
df.plot(ax=ax, marker='o')

IIUC, the chart takes its data as lists, so you would have to convert your index and column data to lists like so (assuming your column names are col1 and col2 respectively):
def plot_nvd3(df, ydata='col1', ydata2='col2'):
    # Open File for test
    output_file = open('test_lineChart.html', 'w')
    # ---------------------------------------
    type = "lineChart"
    chart = lineChart(name=type, x_is_date=False, x_axis_format="AM_PM")
    xdata = df.index.tolist()
    ydata = df[ydata].tolist()
    ydata2 = df[ydata2].tolist()
    kwargs1 = {'color': 'black'}
    kwargs2 = {'color': 'red'}
    extra_serie = {"tooltip": {"y_start": "There is ", "y_end": " calls"}}
    chart.add_serie(y=ydata, x=xdata, name='sine', extra=extra_serie, **kwargs1)
    extra_serie = {"tooltip": {"y_start": "", "y_end": " min"}}
    chart.add_serie(y=ydata2, x=xdata, name='cose', extra=extra_serie, **kwargs2)
    chart.buildhtml()
    output_file.write(chart.htmlcontent)
    # close Html file
    output_file.close()
Usage would be:
plot_nvd3(df, 'col1', 'col2')
I have not checked how nvd3 works with a DatetimeIndex, though, in case your df = df.set_index('datetime') results in one.
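To get closer to df.plot(), where no column is named explicitly, the same idea can be applied in a loop over all columns. The sketch below is only an illustration: add_all_series and plot_dataframe are hypothetical helper names, the chart is assumed to expose add_serie, buildhtml and htmlcontent as in the example above, and behaviour with a DatetimeIndex is still unverified.

```python
def add_all_series(chart, xdata, columns):
    """Add one series per entry of `columns` (a name -> list mapping) to `chart`."""
    for name, ydata in columns.items():
        chart.add_serie(name=str(name), x=xdata, y=ydata)
    return chart


def plot_dataframe(df, path="test_lineChart.html"):
    """Hypothetical helper: write an nvd3 line chart with one line per DataFrame column."""
    from nvd3 import lineChart  # imported here so add_all_series works without nvd3

    chart = lineChart(name="lineChart", x_is_date=False)
    # df.to_dict("list") maps every column name to a plain list of its values
    add_all_series(chart, df.index.tolist(), df.to_dict("list"))
    chart.buildhtml()
    with open(path, "w") as output_file:
        output_file.write(chart.htmlcontent)
```

Usage would then be plot_dataframe(df), with no column names passed in.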

Related

How to set order of the nodes in Sankey Diagram Plotly

I am trying to write a loop that produces a separate Sankey diagram for each country. The problem is that, because of Plotly's automatic layout, the nodes end up in different positions from one diagram to the next. I would like to fix the node order as [Formal, Informal, Unemployed, Inactive].
import matplotlib.pyplot as plt
import pandas as pd
import plotly.graph_objects as go
df = pd.read_csv(path, delimiter=",")
Lista_Paises = df["code"].unique().tolist()
Lista_DF = []
for x in Lista_Paises:
    DF_x = df[df["code"] == x]
    Lista_DF.append(DF_x)
def grafico(df):
    df = df.astype({"Source": "category", "Value": "float", "Target": "category"})

    def category(i):
        if i == "Formal":
            return 0
        if i == "Informal":
            return 1
        if i == "Unemployed":
            return 2
        if i == "Inactive":
            return 3

    def color(i):
        if i == "Formal":
            return "#9FB5D5"
        if i == "Informal":
            return "#E3EEF9"
        if i == "Unemployed":
            return "#E298AE"
        if i == "Inactive":
            return "#FCEFBC"

    df['Source_cat'] = df["Source"].apply(category).astype("int")
    df['Target_cat'] = df["Target"].apply(category).astype("int")
    # df['Source_cat'] = LabelEncoder().fit_transform(df.Source)
    # df['Target_cat'] = LabelEncoder().fit_transform(df.Target)
    df["Color"] = df["Source"].apply(color).astype("str")
    df = df.sort_values(by=["Source_cat", "Target_cat"])
    Lista_Para_Sumar = df["Source_cat"].nunique()
    Lista_Para_Tags = df["Source"].unique().tolist()
    Suma = Lista_Para_Sumar
    df["out"] = df["Target_cat"] + Suma
    TAGS = Lista_Para_Tags + Lista_Para_Tags
    Origen = df['Source_cat'].tolist()
    Destino = df["out"].tolist()
    Valor = df["Value"].tolist()
    Color = df["Color"].tolist()
    return (TAGS, Origen, Destino, Valor, Color)
def Sankey(TAGS: object, Origen: object, Destino: object, Valor: object, Color: object, titulo: str) -> object:
    label = TAGS
    source = Origen
    target = Destino
    value = Valor
    link = dict(source=source, target=target, value=value,
                color=Color)
    node = dict(x=[0, 0, 0, 0, 1, 1, 1, 1], y=[1, 0.75, 0.5, 0.25, 0, 1, 0.75, 0.5, 0.25, 0], label=label, pad=35,
                thickness=10,
                color=["#305CA3", "#C1DAF1", "#C9304E", "#F7DC70", "#305CA3", "#C1DAF1", "#C9304E", "#F7DC70"])
    data = go.Sankey(link=link, node=node, arrangement='snap')
    fig = go.Figure(data)
    fig.update_layout(title_text=titulo + "-" + "Mujeres", font_size=10, )
    plt.plot(alpha=0.01)
    titulo_guardar = (str(titulo) + ".png")
    fig.write_image("/Users/agudelo/Desktop/GRAFICOS PNUD/Graficas/MUJERES/" + titulo_guardar, engine="kaleido")
for y in Lista_DF:
    TAGS, Origen, Destino, Valor, Color = grafico(y)
    titulo = str(y["code"].unique())
    titulo = titulo.replace("[", "")
    titulo = titulo.replace("]", "")
    titulo = titulo.replace("'", "")
    Sankey(TAGS, Origen, Destino, Valor, Color, titulo)
The original post showed two images here: the expected result, with the nodes in the correct order, and the actual result.

I had a similar problem earlier, and I hope this works for you. As I did not have your data, I created some dummy data. Sorry about the long explanation. Here are the steps that should help you reach your goal.
This is what I did:
1. Order the data and sort it: I used pd.Categorical to set the order and then df.sort_values to sort the data, so that the input is sorted by source and then destination.
2. For the Sankey node, set the x and y positions. x=0, y=0 is the top left. This is important because it is how you tell Plotly the order you want the nodes in. One odd thing is that it sometimes errors if x or y is exactly 0 or 1, so keep the values very close to those limits but not equal to them. I wish I knew why.
3. For the other x and y entries, I used ratios, as my total adds up to 285. For example, Source-Informal starts at x = 0.001 and y = 75/285, because Source-Formal = 75 and Informal starts right after it.
4. Based on step 1, the link source and destination should also be sorted, but please do check.
Note: I didn't color the links, but I think you have already achieved that.
Hope this helps resolve your issue.
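The ratio rule from step 3 can be sketched as a running total divided by the grand total. This is only an illustration of how the numbers used in the answer can be derived: stacked_positions and the eps parameter are hypothetical names, and the per-node totals are summed by hand from the dummy data in sankey.csv.

```python
from itertools import accumulate


def stacked_positions(totals, eps=0.001):
    """Return each node's starting y as a fraction of the grand total.

    Positions are clamped to at least `eps`, because the answer avoids
    placing a node at exactly 0.
    """
    grand_total = sum(totals)
    # Running totals shifted by one: every node starts where the previous one ends
    starts = [0, *accumulate(totals)][:-1]
    return [max(start / grand_total, eps) for start in starts]


# Totals per node, in the order Formal, Informal, Unemployed, Inactive
source_y = stacked_positions([75, 85, 30, 95])  # outgoing value per source
target_y = stacked_positions([75, 55, 85, 70])  # incoming value per destination
node_y = source_y + target_y
```

With these totals, source_y corresponds to 0.001, 75/285, 160/285, 190/285 and target_y to 0.001, 75/285, 130/285, 215/285, the same values that are hard-coded in the node definition.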
My data - sankey.csv
source,destination,value
Formal,Formal,20
Formal,Informal,10
Formal,Unemployed,30
Formal,Inactive,15
Informal,Formal,20
Informal,Informal,15
Informal,Unemployed,25
Informal,Inactive,25
Unemployed,Formal,5
Unemployed,Informal,10
Unemployed,Unemployed,10
Unemployed,Inactive,5
Inactive,Formal,30
Inactive,Informal,20
Inactive,Unemployed,20
Inactive,Inactive,25
The code
import plotly.graph_objects as go
import pandas as pd
df = pd.read_csv('sankey.csv') #Read above CSV
#Sort by Source and then Destination
df['source'] = pd.Categorical(df['source'], ['Formal','Informal', 'Unemployed', 'Inactive'])
df['destination'] = pd.Categorical(df['destination'], ['Formal','Informal', 'Unemployed', 'Inactive'])
df.sort_values(['source', 'destination'], inplace = True)
df = df.reset_index(drop=True) #reset_index returns a new frame, so assign it back
mynode = dict(
    pad = 15,
    thickness = 20,
    line = dict(color = "black", width = 0.5),
    label = ['Formal', 'Informal', 'Unemployed', 'Inactive', 'Formal', 'Informal', 'Unemployed', 'Inactive'],
    x = [0.001, 0.001, 0.001, 0.001, 0.999, 0.999, 0.999, 0.999],
    y = [0.001, 75/285, 160/285, 190/285, 0.001, 75/285, 130/285, 215/285],
    color = ["#305CA3", "#C1DAF1", "#C9304E", "#F7DC70", "#305CA3", "#C1DAF1", "#C9304E", "#F7DC70"])
mylink = dict(
    source = [ 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3 ],
    target = [ 4, 5, 6, 7, 4, 5, 6, 7, 4, 5, 6, 7, 4, 5, 6, 7 ],
    value = df.value.to_list())
fig = go.Figure(data=[go.Sankey(
    arrangement='snap',
    node = mynode,
    link = mylink
)])
fig.update_layout(title_text="Basic Sankey Diagram", font_size=20)
fig.show()

Cannot update Bokeh chart with CheckboxGroup

I want to make interactive Bokeh line charts. I use a CheckboxGroup widget to update the charts, but the charts won't update.
d={'created':['02/01/2019 00:00:00','03/01/2019 00:00:00','04/01/2019 00:00:00','05/01/2019 00:00:00','06/01/2019 00:00:00','07/01/2019 00:00:00'],
'aaa': [5, 4, 10, 7, 5, 5],
'bbb':[0,10,2,9,8,4],
'ccc':[10,12,14,14,5,7]}
df=pd.DataFrame.from_dict(d)
df['created']=pd.to_datetime(df['created'])
df.set_index('created', inplace=True)
plot=figure(plot_width=700,
plot_height=500,
x_axis_type='datetime',
title='lines')
src=ColumnDataSource(df)
products=sorted(list(src.data.keys())[1:])
product_selection=CheckboxGroup(labels=products, active =[0,1])
def make_dataset(initial_dataframe, columns_to_keep):
    df=initial_dataframe[columns_to_keep].copy()
    src=ColumnDataSource(df)
    return src
for i in product_selection.active:
    plot.line(x='created', y=product_selection.labels[i], source=src)
def update(attr, old, new):
    prods=[product_selection.labels[i] for i in product_selection.active]
    src=make_dataset(df,prods)
product_selection.on_change('active', update)
layout=row(plot,product_selection)
curdoc().add_root(layout)
Please help me to correct my code.

If you want to show or hide lines based on the checkbox selection, you can use the visible attribute of each glyph renderer (run the code with bokeh serve --show app.py):
from bokeh.models import CheckboxGroup, ColumnDataSource, Row
from bokeh.plotting import figure, curdoc
import pandas as pd
d={'created':['02/01/2019 00:00:00','03/01/2019 00:00:00','04/01/2019 00:00:00','05/01/2019 00:00:00','06/01/2019 00:00:00','07/01/2019 00:00:00'],
'aaa': [5, 4, 10, 7, 5, 5],
'bbb': [0, 10, 2, 9, 8, 4],
'ccc': [10, 12, 14, 14, 5, 7]}
df=pd.DataFrame.from_dict(d)
df['created']=pd.to_datetime(df['created'])
df.set_index('created', inplace=True)
plot=figure(plot_width=700,
plot_height=500,
x_axis_type='datetime',
title='lines')
src=ColumnDataSource(df)
products=sorted(list(src.data.keys())[1:])
product_selection=CheckboxGroup(labels=products, active =[0,1])
lines = []
for i, column in enumerate(df.columns.tolist()):
    line = plot.line(x='created', y=column, source=src)
    line.visible = i in product_selection.active
    lines.append(line)
def update(attr, old, new):
    for i, renderer in enumerate(lines):
        if i in product_selection.active:
            renderer.visible = True
        else:
            renderer.visible = False
product_selection.on_change('active', update)
curdoc().add_root(Row(plot,product_selection))

Conditional color formatting of pandas data for export to Excel [duplicate]

In Excel, the cell text will be either Pass or Fail. I have to set the background color to green for Pass (pass/Passed/passed) and to red for Fail (fail/Failed/failed). How can I change the color based on the text?
My script:
import xlwt
workbook = xlwt.Workbook()
worksheet = workbook.add_sheet('Testing')
worksheet.write_merge(5, 5, 1, 1,'S.No')
worksheet.write_merge(5, 5, 2, 2,'Test Case Description')
worksheet.write_merge(5, 5, 3, 3,'Status')
worksheet.write_merge(5, 5, 4, 4,'Remarks')
worksheet.write_merge(6, 6, 1, 1,1)
worksheet.write_merge(7, 7, 1, 1,1)
worksheet.write_merge(6, 6, 2, 2,'Verify Transferring rate')
worksheet.write_merge(7, 7, 2, 2,'Verify Receiving rate')
worksheet.write_merge(6, 6, 3, 3,'Pass')
worksheet.write_merge(7, 7, 3, 3,'Fail')
workbook.save('testexcel.xls')
@Henry: modified code:
import xlwt
workbook = xlwt.Workbook()
worksheet = workbook.add_sheet('Status')
passed = xlwt.easyxf('back_color green')
failed = xlwt.easyxf('back_color red')
color = (passed if passorfail in ['pass','Passed','passed'] else
(failed if passorfail in ['fail','Failed','failed'] else xlwt.easyxf()))
worksheet.write_merge(6, 6, 3, 3,passorfail, style = color)
workbook.save('passfail2.xls')
print "Completed"
It throws the following error when executed. How can I resolve it?
Traceback (most recent call last):
  File "G:\airspan_eclipse\Excel_Gen\passfail2.py", line 5, in <module>
    passed = xlwt.easyxf('back_color green')
  File "C:\Python27\lib\site-packages\xlwt\Style.py", line 704, in easyxf
    field_sep=field_sep, line_sep=line_sep, intro_sep=intro_sep, esc_char=esc_char, debug=debug)
  File "C:\Python27\lib\site-packages\xlwt\Style.py", line 632, in _parse_strg_to_obj
    raise EasyXFCallerError('line %r should have exactly 1 "%c"' % (line, intro_sep))
xlwt.Style.EasyXFCallerError: line 'back_color green' should have exactly 1 ":"

You can create styles using easyxf and then pass them as arguments to your write method.
For example:
style_pass = xlwt.easyxf('pattern: pattern solid, fore_colour green;')
style_fail = xlwt.easyxf('pattern: pattern solid, fore_colour red;')
worksheet.write_merge(6, 6, 3, 3,'Pass', style=style_pass)
worksheet.write_merge(7, 7, 3, 3,'Fail', style=style_fail)
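To choose the style from the cell text instead of hard-coding it per row, the lookup can be made case-insensitive. The following is a minimal sketch under stated assumptions: style_spec_for and write_statuses are hypothetical helper names, the two spec strings are the ones from the example above, and the default row and column simply mirror the question's layout.

```python
PASS_SPEC = "pattern: pattern solid, fore_colour green;"
FAIL_SPEC = "pattern: pattern solid, fore_colour red;"


def style_spec_for(status):
    """Return the easyxf spec for a status string, ignoring case and surrounding spaces."""
    text = str(status).strip().lower()
    if text in ("pass", "passed"):
        return PASS_SPEC
    if text in ("fail", "failed"):
        return FAIL_SPEC
    return ""  # empty spec: xlwt's default style


def write_statuses(worksheet, statuses, first_row=6, col=3):
    """Hypothetical helper: write each status one row below the previous one."""
    import xlwt  # imported here so style_spec_for can be used without xlwt installed

    styles = {}  # build each style once and reuse it for every matching cell
    for offset, status in enumerate(statuses):
        spec = style_spec_for(status)
        if spec not in styles:
            styles[spec] = xlwt.easyxf(spec)
        worksheet.write(first_row + offset, col, status, styles[spec])
```

For the question's sheet, this would be called as write_statuses(worksheet, ['Pass', 'Fail']) before workbook.save(...).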

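You'll need an if statement to separate the cases based on pass or fail.
Then you'll use that to pick a style string, something like 'pattern: pattern solid, fore_colour grey25'. Look in Style.py for lists of all possible colors and options (GitHub page: https://github.com/python-excel/xlwt/blob/master/xlwt/Style.py). Note that easyxf expects the form 'section: key value', so a bare 'back_color green' raises the EasyXFCallerError shown in the question; with a solid pattern, the fill color is set through fore_colour. Since red and green are both valid color names, you can do:
passed = xlwt.easyxf('pattern: pattern solid, fore_colour green')
failed = xlwt.easyxf('pattern: pattern solid, fore_colour red')
# passorfail holds the cell text, e.g. 'Pass' or 'Fail'
color = (passed if passorfail in ['pass','Passed','passed'] else
(failed if passorfail in ['fail','Failed','failed'] else xlwt.easyxf()))
worksheet.write_merge(6, 6, 3, 3,passorfail, style = color)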

Python string to excel rows

Hello, I'm VERY new to Python. I just have to do one thing with it.
When I print my variable names, this is what comes up:
{'id': 1, 'xd_id': 2, 'name': 'nameea', 'description': 'somethingveryweird', 'again_id': 6, 'some_id': None, 'everything': False, 'is_ready': False, 'test_on': None, 'something': None, 'something': [], 'count_count': 28, 'other_count': 0, 'again_count': 0, 'new_count': 0, 'why_count': 0, 'custom_count': 0, 'custom2_count': 0, 'custom3_count': 0, 'custom4_count': 0, 'custom5_count': 0, 'custom_status6_count': 0, 'custom7_count': 0, 'lol_id': 7, 'wtf_id': None, 'numbers_on': 643346, 'something_by': 99, 'site': 'google.com'}
I would like to get this into Excel with the keys in the left column and the values in the right column: for example, "id" on the left and 1 on the right, or "site" on the left and "google.com" on the right. My current code puts everything into the first row of the sheet, and I can't find a tutorial for this. Thanks for all answers. My current code:
f = open('test.csv', 'w')
s = str(names)
f.write(s)
f.close()

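If Python is not going to be your key skill and you only need this one task done, here is the answer.
import csv
# newline='' stops the csv module from writing blank lines between rows on Windows
f = open('test.csv', 'w', newline='')
csvwt = csv.writer(f)
for x in names.items():  # each item is a (key, value) pair
    csvwt.writerow(x)
f.close()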
If you want to write to an Excel file instead, you have to do this:
import xlsxwriter
workbook = xlsxwriter.Workbook('test.xlsx')
worksheet = workbook.add_worksheet()
row = 0
col = 0
for x in names.items():
    worksheet.write(row, col, str(x[0]))      # key in the left column
    worksheet.write(row, col + 1, str(x[1]))  # value in the right column
    row += 1
workbook.close()

xlrd original value of the cell

I'm reading an .xls file with xlrd. The problem is that when xlrd reads a value such as "12/09/2012", I get a result like "xldate:41252.0". When I use xlrd.xldate_as_tuple, I get this result:
(2016, 12, 10, 0, 0, 0)
My code:
curr_row = -1
while curr_row < num_rows:
    curr_row += 1
    row = worksheet.row(curr_row)
    for x in xrange(num_cols):
        field_type = worksheet.cell_type(curr_row, x)
        if field_type == 3: # this is date
            field_value = worksheet.cell_value(curr_row, x)
            print worksheet.cell(curr_row, x).value
            print xlrd.xldate_as_tuple(field_value, 1)
Result:
41252.0
(2016, 12, 10, 0, 0, 0)
Both results are wrong for me. How can I get the original cell value "12/09/2012" using xlrd?

According to the docstring, you should pass your workbook's datemode to xldate_as_tuple as a second parameter:
from datetime import datetime
import xlrd
book = xlrd.open_workbook("test.xls")
sheet = book.sheet_by_index(0)
a1 = sheet.cell_value(rowx=0, colx=0)
print(a1) # prints 41252.0
print(xlrd.xldate_as_tuple(a1, 1)) # prints (2016, 12, 10, 0, 0, 0)
a1_tuple = xlrd.xldate_as_tuple(a1, book.datemode)
print(a1_tuple) # prints (2012, 12, 9, 0, 0, 0)
a1_datetime = datetime(*a1_tuple)
print(a1_datetime.strftime("%m/%d/%Y")) # prints 12/09/2012
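To see why the datemode matters, the conversion can be sketched with the standard library alone. This is not xlrd's implementation, only an illustration under the assumption that datemode 0 means the 1900 date system and datemode 1 the 1904 date system; excel_serial_to_datetime is a hypothetical name, and the sketch ignores the special handling that serials below 61 need in the 1900 system.

```python
from datetime import datetime, timedelta


def excel_serial_to_datetime(serial, datemode=0):
    """Convert an Excel serial date number to a datetime.

    datemode 0: 1900 date system (epoch 1899-12-30 for serials above 60).
    datemode 1: 1904 date system (epoch 1904-01-01).
    """
    epoch = datetime(1904, 1, 1) if datemode == 1 else datetime(1899, 12, 30)
    return epoch + timedelta(days=serial)


right = excel_serial_to_datetime(41252.0, 0)  # workbook uses the 1900 system
wrong = excel_serial_to_datetime(41252.0, 1)  # forcing the 1904 system
print(right.strftime("%m/%d/%Y"))  # prints 12/09/2012
print(wrong.strftime("%m/%d/%Y"))  # prints 12/10/2016
```

The two epochs are 1462 days apart, which accounts for the four-year offset seen in the question.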
