I have this dataframe of Euclidean distances:
import pandas as pd
df = pd.DataFrame({
'O1': [0.0, 1.7, 1.4, 0.4, 2.2, 3.7, 5.2, 0.2, 4.3, 6.8, 6.0],
'O2': [1.7, 0.0, 1.0, 2.0, 1.3, 2.6, 4.5, 1.8, 3.2, 5.9, 5.2],
'O3': [1.4, 1.0, 0.0, 1.7, 0.9, 2.4, 4.1, 1.5, 3.0, 5.5, 4.8],
'O4': [0.4, 2.0, 1.7, 0.0, 2.6, 4.0, 5.5, 0.3, 4.6, 7.1, 6.3],
'O5': [2.2, 1.3, 0.9, 2.6, 0.0, 1.7, 3.4, 2.4, 2.1, 4.8, 4.1],
'O6': [3.7, 2.6, 2.4, 4.0, 1.7, 0.0, 2.0, 3.8, 1.6, 3.3, 2.7],
'O7': [5.2, 4.5, 4.1, 5.5, 3.4, 2.0, 0.0, 5.4, 2.5, 1.6, 0.9],
'O8': [0.2, 1.8, 1.5, 0.3, 2.4, 3.8, 5.4, 0.0, 4.4, 6.9, 6.1],
'O9': [4.3, 3.2, 3.0, 4.6, 2.1, 1.6, 2.5, 4.4, 0.0, 3.4, 2.9],
'O10':[6.8, 5.9, 5.5, 7.1, 4.8, 3.3, 1.6, 6.9, 3.4, 0.0, 1.0],
'O11': [6.0, 5.2, 4.8, 6.3, 4.1, 2.7, 0.9, 6.1, 2.9, 1.0, 0.0]
})
Here O1 through O8 belong to class 0, and O9, O10 and O11 belong to class 1.
I want to convert the dataframe above into one with the columns x, y and class, so that I can split it into train and test sets and then fit a simple classifier.
I am not sure how to build that dataframe. How can this be done in Python? Is it possible?
The steps I plan to run once I have the dataframe:
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
import seaborn as sns
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.25)
model = GaussianNB()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
y_proba = model.predict_proba(X_test)
sns.scatterplot(x = X_test['x'], y = X_test['y'], hue = y_pred)
You mainly want to include the point name as an additional column in the dataframe. Here I am using point indices as x and y:
import pandas as pd
df = pd.DataFrame({
'x': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
1: [0.0, 1.7, 1.4, 0.4, 2.2, 3.7, 5.2, 0.2, 4.3, 6.8, 6.0],
2: [1.7, 0.0, 1.0, 2.0, 1.3, 2.6, 4.5, 1.8, 3.2, 5.9, 5.2],
3: [1.4, 1.0, 0.0, 1.7, 0.9, 2.4, 4.1, 1.5, 3.0, 5.5, 4.8],
4: [0.4, 2.0, 1.7, 0.0, 2.6, 4.0, 5.5, 0.3, 4.6, 7.1, 6.3],
5: [2.2, 1.3, 0.9, 2.6, 0.0, 1.7, 3.4, 2.4, 2.1, 4.8, 4.1],
6: [3.7, 2.6, 2.4, 4.0, 1.7, 0.0, 2.0, 3.8, 1.6, 3.3, 2.7],
7: [5.2, 4.5, 4.1, 5.5, 3.4, 2.0, 0.0, 5.4, 2.5, 1.6, 0.9],
8: [0.2, 1.8, 1.5, 0.3, 2.4, 3.8, 5.4, 0.0, 4.4, 6.9, 6.1],
9: [4.3, 3.2, 3.0, 4.6, 2.1, 1.6, 2.5, 4.4, 0.0, 3.4, 2.9],
10: [6.8, 5.9, 5.5, 7.1, 4.8, 3.3, 1.6, 6.9, 3.4, 0.0, 1.0],
11: [6.0, 5.2, 4.8, 6.3, 4.1, 2.7, 0.9, 6.1, 2.9, 1.0, 0.0]
})
That allows you to reshape the dataframe to your desired form:
model_df = df.melt(id_vars='x', var_name='y', value_name='distance')
Finally, define a class e.g. using:
def assign_class(x):
    return 0 if x <= 8 else 1
model_df["class_x"] = model_df["x"].apply(assign_class)
model_df["class_y"] = model_df["y"].apply(assign_class)
This will give you a dataframe that you can pass to the model. Note that the input matrix is symmetric, so you may want to only keep unique records (drop [y, x] if you already have [x, y]).
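One way to drop the mirrored records mentioned above is to keep only the rows where x is smaller than y. This is a minimal sketch on a toy three-point matrix (the values are hypothetical, not the question's data), and it assumes the point labels are integers as in the answer's dataframe:

```python
import pandas as pd

# Toy symmetric distance matrix in the same layout as the answer's dataframe
# (hypothetical values for three points, not the question's data).
df = pd.DataFrame({
    "x": [1, 2, 3],
    1: [0.0, 1.7, 4.3],
    2: [1.7, 0.0, 3.2],
    3: [4.3, 3.2, 0.0],
})

model_df = df.melt(id_vars="x", var_name="y", value_name="distance")
model_df["y"] = model_df["y"].astype(int)  # make sure y is numeric before comparing

# Keep only the upper triangle: drops the zero diagonal and the mirrored [y, x] rows.
unique_pairs = model_df[model_df["x"] < model_df["y"]].reset_index(drop=True)
print(unique_pairs)
```

For n points this leaves n * (n - 1) / 2 rows, one per unordered pair.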
I'm working on getting interactive networks so I can send datasets around to collaborators. I've found that HoloViews is the most intuitive option for interactive networks. I'm using Bokeh for the backend not for any reason other than that's what the tutorial above used and I'm pretty familiar with it.
I've gotten the hover tool to work for my network and it looks great. Below is an adaptation of the methodology using the iris dataset for the sake of this post.
What I'm having trouble with is getting custom hover fields in addition to the ones already shown. For example, I want all the nodes to have the [Node, Species] fields from the df_nodes DataFrame. However, in the second part of the code underneath the figure I generate custom fields per node that range from 0-5 categories. I would like to append this onto the existing Hover options.
For example, iris_1 would have the following where * indicates what is already there and # indicates what needs to be added:
* Node iris_1
* Species Setosa
# Category_2 0.734694
# Category_9 0.489796
# Category_8 0.469388
# Category_4 0.122449
iris_2 would only have [Node, Species] since it has 0 categories (if you index the node_to_custom dictionary you will see that). iris_3 will have the [Node, Species, Category_4, Category_5] fields.
How can I add a variable number of custom hover fields, with respect to node, on a HoloViews plot? Preferably with Bokeh but if Plot.ly is the better option for this, then let's do it.
I tried doing line breaks but they didn't render. Though, that was supposed to be a hack and not what I actually wanted.
# Iris
import numpy as np
import pandas as pd
import networkx as nx
import holoviews as hv
from holoviews import opts
hv.extension('bokeh')
defaults = dict(width=500, height=500)
hv.opts.defaults(
opts.EdgePaths(**defaults),
opts.Graph(**defaults),
opts.Nodes(**defaults),
)
X_iris = pd.DataFrame({'sepal_length': {'iris_0': 5.1, 'iris_1': 4.9, 'iris_2': 4.7, 'iris_3': 4.6, 'iris_4': 5.0, 'iris_5': 5.4, 'iris_6': 4.6, 'iris_7': 5.0, 'iris_8': 4.4, 'iris_9': 4.9, 'iris_10': 5.4, 'iris_11': 4.8, 'iris_12': 4.8, 'iris_13': 4.3, 'iris_14': 5.8, 'iris_15': 5.7, 'iris_16': 5.4, 'iris_17': 5.1, 'iris_18': 5.7, 'iris_19': 5.1, 'iris_20': 5.4, 'iris_21': 5.1, 'iris_22': 4.6, 'iris_23': 5.1, 'iris_24': 4.8, 'iris_25': 5.0, 'iris_26': 5.0, 'iris_27': 5.2, 'iris_28': 5.2, 'iris_29': 4.7, 'iris_30': 4.8, 'iris_31': 5.4, 'iris_32': 5.2, 'iris_33': 5.5, 'iris_34': 4.9, 'iris_35': 5.0, 'iris_36': 5.5, 'iris_37': 4.9, 'iris_38': 4.4, 'iris_39': 5.1, 'iris_40': 5.0, 'iris_41': 4.5, 'iris_42': 4.4, 'iris_43': 5.0, 'iris_44': 5.1, 'iris_45': 4.8, 'iris_46': 5.1, 'iris_47': 4.6, 'iris_48': 5.3, 'iris_49': 5.0, 'iris_50': 7.0, 'iris_51': 6.4, 'iris_52': 6.9, 'iris_53': 5.5, 'iris_54': 6.5, 'iris_55': 5.7, 'iris_56': 6.3, 'iris_57': 4.9, 'iris_58': 6.6, 'iris_59': 5.2, 'iris_60': 5.0, 'iris_61': 5.9, 'iris_62': 6.0, 'iris_63': 6.1, 'iris_64': 5.6, 'iris_65': 6.7, 'iris_66': 5.6, 'iris_67': 5.8, 'iris_68': 6.2, 'iris_69': 5.6, 'iris_70': 5.9, 'iris_71': 6.1, 'iris_72': 6.3, 'iris_73': 6.1, 'iris_74': 6.4, 'iris_75': 6.6, 'iris_76': 6.8, 'iris_77': 6.7, 'iris_78': 6.0, 'iris_79': 5.7, 'iris_80': 5.5, 'iris_81': 5.5, 'iris_82': 5.8, 'iris_83': 6.0, 'iris_84': 5.4, 'iris_85': 6.0, 'iris_86': 6.7, 'iris_87': 6.3, 'iris_88': 5.6, 'iris_89': 5.5, 'iris_90': 5.5, 'iris_91': 6.1, 'iris_92': 5.8, 'iris_93': 5.0, 'iris_94': 5.6, 'iris_95': 5.7, 'iris_96': 5.7, 'iris_97': 6.2, 'iris_98': 5.1, 'iris_99': 5.7, 'iris_100': 6.3, 'iris_101': 5.8, 'iris_102': 7.1, 'iris_103': 6.3, 'iris_104': 6.5, 'iris_105': 7.6, 'iris_106': 4.9, 'iris_107': 7.3, 'iris_108': 6.7, 'iris_109': 7.2, 'iris_110': 6.5, 'iris_111': 6.4, 'iris_112': 6.8, 'iris_113': 5.7, 'iris_114': 5.8, 'iris_115': 6.4, 'iris_116': 6.5, 'iris_117': 7.7, 'iris_118': 7.7, 'iris_119': 6.0, 'iris_120': 6.9, 'iris_121': 
5.6, 'iris_122': 7.7, 'iris_123': 6.3, 'iris_124': 6.7, 'iris_125': 7.2, 'iris_126': 6.2, 'iris_127': 6.1, 'iris_128': 6.4, 'iris_129': 7.2, 'iris_130': 7.4, 'iris_131': 7.9, 'iris_132': 6.4, 'iris_133': 6.3, 'iris_134': 6.1, 'iris_135': 7.7, 'iris_136': 6.3, 'iris_137': 6.4, 'iris_138': 6.0, 'iris_139': 6.9, 'iris_140': 6.7, 'iris_141': 6.9, 'iris_142': 5.8, 'iris_143': 6.8, 'iris_144': 6.7, 'iris_145': 6.7, 'iris_146': 6.3, 'iris_147': 6.5, 'iris_148': 6.2, 'iris_149': 5.9}, 'sepal_width': {'iris_0': 3.5, 'iris_1': 3.0, 'iris_2': 3.2, 'iris_3': 3.1, 'iris_4': 3.6, 'iris_5': 3.9, 'iris_6': 3.4, 'iris_7': 3.4, 'iris_8': 2.9, 'iris_9': 3.1, 'iris_10': 3.7, 'iris_11': 3.4, 'iris_12': 3.0, 'iris_13': 3.0, 'iris_14': 4.0, 'iris_15': 4.4, 'iris_16': 3.9, 'iris_17': 3.5, 'iris_18': 3.8, 'iris_19': 3.8, 'iris_20': 3.4, 'iris_21': 3.7, 'iris_22': 3.6, 'iris_23': 3.3, 'iris_24': 3.4, 'iris_25': 3.0, 'iris_26': 3.4, 'iris_27': 3.5, 'iris_28': 3.4, 'iris_29': 3.2, 'iris_30': 3.1, 'iris_31': 3.4, 'iris_32': 4.1, 'iris_33': 4.2, 'iris_34': 3.1, 'iris_35': 3.2, 'iris_36': 3.5, 'iris_37': 3.6, 'iris_38': 3.0, 'iris_39': 3.4, 'iris_40': 3.5, 'iris_41': 2.3, 'iris_42': 3.2, 'iris_43': 3.5, 'iris_44': 3.8, 'iris_45': 3.0, 'iris_46': 3.8, 'iris_47': 3.2, 'iris_48': 3.7, 'iris_49': 3.3, 'iris_50': 3.2, 'iris_51': 3.2, 'iris_52': 3.1, 'iris_53': 2.3, 'iris_54': 2.8, 'iris_55': 2.8, 'iris_56': 3.3, 'iris_57': 2.4, 'iris_58': 2.9, 'iris_59': 2.7, 'iris_60': 2.0, 'iris_61': 3.0, 'iris_62': 2.2, 'iris_63': 2.9, 'iris_64': 2.9, 'iris_65': 3.1, 'iris_66': 3.0, 'iris_67': 2.7, 'iris_68': 2.2, 'iris_69': 2.5, 'iris_70': 3.2, 'iris_71': 2.8, 'iris_72': 2.5, 'iris_73': 2.8, 'iris_74': 2.9, 'iris_75': 3.0, 'iris_76': 2.8, 'iris_77': 3.0, 'iris_78': 2.9, 'iris_79': 2.6, 'iris_80': 2.4, 'iris_81': 2.4, 'iris_82': 2.7, 'iris_83': 2.7, 'iris_84': 3.0, 'iris_85': 3.4, 'iris_86': 3.1, 'iris_87': 2.3, 'iris_88': 3.0, 'iris_89': 2.5, 'iris_90': 2.6, 'iris_91': 3.0, 'iris_92': 2.6, 'iris_93': 2.3, 
'iris_94': 2.7, 'iris_95': 3.0, 'iris_96': 2.9, 'iris_97': 2.9, 'iris_98': 2.5, 'iris_99': 2.8, 'iris_100': 3.3, 'iris_101': 2.7, 'iris_102': 3.0, 'iris_103': 2.9, 'iris_104': 3.0, 'iris_105': 3.0, 'iris_106': 2.5, 'iris_107': 2.9, 'iris_108': 2.5, 'iris_109': 3.6, 'iris_110': 3.2, 'iris_111': 2.7, 'iris_112': 3.0, 'iris_113': 2.5, 'iris_114': 2.8, 'iris_115': 3.2, 'iris_116': 3.0, 'iris_117': 3.8, 'iris_118': 2.6, 'iris_119': 2.2, 'iris_120': 3.2, 'iris_121': 2.8, 'iris_122': 2.8, 'iris_123': 2.7, 'iris_124': 3.3, 'iris_125': 3.2, 'iris_126': 2.8, 'iris_127': 3.0, 'iris_128': 2.8, 'iris_129': 3.0, 'iris_130': 2.8, 'iris_131': 3.8, 'iris_132': 2.8, 'iris_133': 2.8, 'iris_134': 2.6, 'iris_135': 3.0, 'iris_136': 3.4, 'iris_137': 3.1, 'iris_138': 3.0, 'iris_139': 3.1, 'iris_140': 3.1, 'iris_141': 3.1, 'iris_142': 2.7, 'iris_143': 3.2, 'iris_144': 3.3, 'iris_145': 3.0, 'iris_146': 2.5, 'iris_147': 3.0, 'iris_148': 3.4, 'iris_149': 3.0}, 'petal_length': {'iris_0': 1.4, 'iris_1': 1.4, 'iris_2': 1.3, 'iris_3': 1.5, 'iris_4': 1.4, 'iris_5': 1.7, 'iris_6': 1.4, 'iris_7': 1.5, 'iris_8': 1.4, 'iris_9': 1.5, 'iris_10': 1.5, 'iris_11': 1.6, 'iris_12': 1.4, 'iris_13': 1.1, 'iris_14': 1.2, 'iris_15': 1.5, 'iris_16': 1.3, 'iris_17': 1.4, 'iris_18': 1.7, 'iris_19': 1.5, 'iris_20': 1.7, 'iris_21': 1.5, 'iris_22': 1.0, 'iris_23': 1.7, 'iris_24': 1.9, 'iris_25': 1.6, 'iris_26': 1.6, 'iris_27': 1.5, 'iris_28': 1.4, 'iris_29': 1.6, 'iris_30': 1.6, 'iris_31': 1.5, 'iris_32': 1.5, 'iris_33': 1.4, 'iris_34': 1.5, 'iris_35': 1.2, 'iris_36': 1.3, 'iris_37': 1.4, 'iris_38': 1.3, 'iris_39': 1.5, 'iris_40': 1.3, 'iris_41': 1.3, 'iris_42': 1.3, 'iris_43': 1.6, 'iris_44': 1.9, 'iris_45': 1.4, 'iris_46': 1.6, 'iris_47': 1.4, 'iris_48': 1.5, 'iris_49': 1.4, 'iris_50': 4.7, 'iris_51': 4.5, 'iris_52': 4.9, 'iris_53': 4.0, 'iris_54': 4.6, 'iris_55': 4.5, 'iris_56': 4.7, 'iris_57': 3.3, 'iris_58': 4.6, 'iris_59': 3.9, 'iris_60': 3.5, 'iris_61': 4.2, 'iris_62': 4.0, 'iris_63': 4.7, 'iris_64': 3.6, 
'iris_65': 4.4, 'iris_66': 4.5, 'iris_67': 4.1, 'iris_68': 4.5, 'iris_69': 3.9, 'iris_70': 4.8, 'iris_71': 4.0, 'iris_72': 4.9, 'iris_73': 4.7, 'iris_74': 4.3, 'iris_75': 4.4, 'iris_76': 4.8, 'iris_77': 5.0, 'iris_78': 4.5, 'iris_79': 3.5, 'iris_80': 3.8, 'iris_81': 3.7, 'iris_82': 3.9, 'iris_83': 5.1, 'iris_84': 4.5, 'iris_85': 4.5, 'iris_86': 4.7, 'iris_87': 4.4, 'iris_88': 4.1, 'iris_89': 4.0, 'iris_90': 4.4, 'iris_91': 4.6, 'iris_92': 4.0, 'iris_93': 3.3, 'iris_94': 4.2, 'iris_95': 4.2, 'iris_96': 4.2, 'iris_97': 4.3, 'iris_98': 3.0, 'iris_99': 4.1, 'iris_100': 6.0, 'iris_101': 5.1, 'iris_102': 5.9, 'iris_103': 5.6, 'iris_104': 5.8, 'iris_105': 6.6, 'iris_106': 4.5, 'iris_107': 6.3, 'iris_108': 5.8, 'iris_109': 6.1, 'iris_110': 5.1, 'iris_111': 5.3, 'iris_112': 5.5, 'iris_113': 5.0, 'iris_114': 5.1, 'iris_115': 5.3, 'iris_116': 5.5, 'iris_117': 6.7, 'iris_118': 6.9, 'iris_119': 5.0, 'iris_120': 5.7, 'iris_121': 4.9, 'iris_122': 6.7, 'iris_123': 4.9, 'iris_124': 5.7, 'iris_125': 6.0, 'iris_126': 4.8, 'iris_127': 4.9, 'iris_128': 5.6, 'iris_129': 5.8, 'iris_130': 6.1, 'iris_131': 6.4, 'iris_132': 5.6, 'iris_133': 5.1, 'iris_134': 5.6, 'iris_135': 6.1, 'iris_136': 5.6, 'iris_137': 5.5, 'iris_138': 4.8, 'iris_139': 5.4, 'iris_140': 5.6, 'iris_141': 5.1, 'iris_142': 5.1, 'iris_143': 5.9, 'iris_144': 5.7, 'iris_145': 5.2, 'iris_146': 5.0, 'iris_147': 5.2, 'iris_148': 5.4, 'iris_149': 5.1}, 'petal_width': {'iris_0': 0.2, 'iris_1': 0.2, 'iris_2': 0.2, 'iris_3': 0.2, 'iris_4': 0.2, 'iris_5': 0.4, 'iris_6': 0.3, 'iris_7': 0.2, 'iris_8': 0.2, 'iris_9': 0.1, 'iris_10': 0.2, 'iris_11': 0.2, 'iris_12': 0.1, 'iris_13': 0.1, 'iris_14': 0.2, 'iris_15': 0.4, 'iris_16': 0.4, 'iris_17': 0.3, 'iris_18': 0.3, 'iris_19': 0.3, 'iris_20': 0.2, 'iris_21': 0.4, 'iris_22': 0.2, 'iris_23': 0.5, 'iris_24': 0.2, 'iris_25': 0.2, 'iris_26': 0.4, 'iris_27': 0.2, 'iris_28': 0.2, 'iris_29': 0.2, 'iris_30': 0.2, 'iris_31': 0.4, 'iris_32': 0.1, 'iris_33': 0.2, 'iris_34': 0.2, 'iris_35': 0.2, 
'iris_36': 0.2, 'iris_37': 0.1, 'iris_38': 0.2, 'iris_39': 0.2, 'iris_40': 0.3, 'iris_41': 0.3, 'iris_42': 0.2, 'iris_43': 0.6, 'iris_44': 0.4, 'iris_45': 0.3, 'iris_46': 0.2, 'iris_47': 0.2, 'iris_48': 0.2, 'iris_49': 0.2, 'iris_50': 1.4, 'iris_51': 1.5, 'iris_52': 1.5, 'iris_53': 1.3, 'iris_54': 1.5, 'iris_55': 1.3, 'iris_56': 1.6, 'iris_57': 1.0, 'iris_58': 1.3, 'iris_59': 1.4, 'iris_60': 1.0, 'iris_61': 1.5, 'iris_62': 1.0, 'iris_63': 1.4, 'iris_64': 1.3, 'iris_65': 1.4, 'iris_66': 1.5, 'iris_67': 1.0, 'iris_68': 1.5, 'iris_69': 1.1, 'iris_70': 1.8, 'iris_71': 1.3, 'iris_72': 1.5, 'iris_73': 1.2, 'iris_74': 1.3, 'iris_75': 1.4, 'iris_76': 1.4, 'iris_77': 1.7, 'iris_78': 1.5, 'iris_79': 1.0, 'iris_80': 1.1, 'iris_81': 1.0, 'iris_82': 1.2, 'iris_83': 1.6, 'iris_84': 1.5, 'iris_85': 1.6, 'iris_86': 1.5, 'iris_87': 1.3, 'iris_88': 1.3, 'iris_89': 1.3, 'iris_90': 1.2, 'iris_91': 1.4, 'iris_92': 1.2, 'iris_93': 1.0, 'iris_94': 1.3, 'iris_95': 1.2, 'iris_96': 1.3, 'iris_97': 1.3, 'iris_98': 1.1, 'iris_99': 1.3, 'iris_100': 2.5, 'iris_101': 1.9, 'iris_102': 2.1, 'iris_103': 1.8, 'iris_104': 2.2, 'iris_105': 2.1, 'iris_106': 1.7, 'iris_107': 1.8, 'iris_108': 1.8, 'iris_109': 2.5, 'iris_110': 2.0, 'iris_111': 1.9, 'iris_112': 2.1, 'iris_113': 2.0, 'iris_114': 2.4, 'iris_115': 2.3, 'iris_116': 1.8, 'iris_117': 2.2, 'iris_118': 2.3, 'iris_119': 1.5, 'iris_120': 2.3, 'iris_121': 2.0, 'iris_122': 2.0, 'iris_123': 1.8, 'iris_124': 2.1, 'iris_125': 1.8, 'iris_126': 1.8, 'iris_127': 1.8, 'iris_128': 2.1, 'iris_129': 1.6, 'iris_130': 1.9, 'iris_131': 2.0, 'iris_132': 2.2, 'iris_133': 1.5, 'iris_134': 1.4, 'iris_135': 2.3, 'iris_136': 2.4, 'iris_137': 1.8, 'iris_138': 1.8, 'iris_139': 2.1, 'iris_140': 2.4, 'iris_141': 2.3, 'iris_142': 1.9, 'iris_143': 2.3, 'iris_144': 2.5, 'iris_145': 2.3, 'iris_146': 1.9, 'iris_147': 2.0, 'iris_148': 2.3, 'iris_149': 1.8}})
y_iris = pd.Series({'iris_0': 'setosa', 'iris_1': 'setosa', 'iris_2': 'setosa', 'iris_3': 'setosa', 'iris_4': 'setosa', 'iris_5': 'setosa', 'iris_6': 'setosa', 'iris_7': 'setosa', 'iris_8': 'setosa', 'iris_9': 'setosa', 'iris_10': 'setosa', 'iris_11': 'setosa', 'iris_12': 'setosa', 'iris_13': 'setosa', 'iris_14': 'setosa', 'iris_15': 'setosa', 'iris_16': 'setosa', 'iris_17': 'setosa', 'iris_18': 'setosa', 'iris_19': 'setosa', 'iris_20': 'setosa', 'iris_21': 'setosa', 'iris_22': 'setosa', 'iris_23': 'setosa', 'iris_24': 'setosa', 'iris_25': 'setosa', 'iris_26': 'setosa', 'iris_27': 'setosa', 'iris_28': 'setosa', 'iris_29': 'setosa', 'iris_30': 'setosa', 'iris_31': 'setosa', 'iris_32': 'setosa', 'iris_33': 'setosa', 'iris_34': 'setosa', 'iris_35': 'setosa', 'iris_36': 'setosa', 'iris_37': 'setosa', 'iris_38': 'setosa', 'iris_39': 'setosa', 'iris_40': 'setosa', 'iris_41': 'setosa', 'iris_42': 'setosa', 'iris_43': 'setosa', 'iris_44': 'setosa', 'iris_45': 'setosa', 'iris_46': 'setosa', 'iris_47': 'setosa', 'iris_48': 'setosa', 'iris_49': 'setosa', 'iris_50': 'versicolor', 'iris_51': 'versicolor', 'iris_52': 'versicolor', 'iris_53': 'versicolor', 'iris_54': 'versicolor', 'iris_55': 'versicolor', 'iris_56': 'versicolor', 'iris_57': 'versicolor', 'iris_58': 'versicolor', 'iris_59': 'versicolor', 'iris_60': 'versicolor', 'iris_61': 'versicolor', 'iris_62': 'versicolor', 'iris_63': 'versicolor', 'iris_64': 'versicolor', 'iris_65': 'versicolor', 'iris_66': 'versicolor', 'iris_67': 'versicolor', 'iris_68': 'versicolor', 'iris_69': 'versicolor', 'iris_70': 'versicolor', 'iris_71': 'versicolor', 'iris_72': 'versicolor', 'iris_73': 'versicolor', 'iris_74': 'versicolor', 'iris_75': 'versicolor', 'iris_76': 'versicolor', 'iris_77': 'versicolor', 'iris_78': 'versicolor', 'iris_79': 'versicolor', 'iris_80': 'versicolor', 'iris_81': 'versicolor', 'iris_82': 'versicolor', 'iris_83': 'versicolor', 'iris_84': 'versicolor', 'iris_85': 'versicolor', 'iris_86': 'versicolor', 'iris_87': 
'versicolor', 'iris_88': 'versicolor', 'iris_89': 'versicolor', 'iris_90': 'versicolor', 'iris_91': 'versicolor', 'iris_92': 'versicolor', 'iris_93': 'versicolor', 'iris_94': 'versicolor', 'iris_95': 'versicolor', 'iris_96': 'versicolor', 'iris_97': 'versicolor', 'iris_98': 'versicolor', 'iris_99': 'versicolor', 'iris_100': 'virginica', 'iris_101': 'virginica', 'iris_102': 'virginica', 'iris_103': 'virginica', 'iris_104': 'virginica', 'iris_105': 'virginica', 'iris_106': 'virginica', 'iris_107': 'virginica', 'iris_108': 'virginica', 'iris_109': 'virginica', 'iris_110': 'virginica', 'iris_111': 'virginica', 'iris_112': 'virginica', 'iris_113': 'virginica', 'iris_114': 'virginica', 'iris_115': 'virginica', 'iris_116': 'virginica', 'iris_117': 'virginica', 'iris_118': 'virginica', 'iris_119': 'virginica', 'iris_120': 'virginica', 'iris_121': 'virginica', 'iris_122': 'virginica', 'iris_123': 'virginica', 'iris_124': 'virginica', 'iris_125': 'virginica', 'iris_126': 'virginica', 'iris_127': 'virginica', 'iris_128': 'virginica', 'iris_129': 'virginica', 'iris_130': 'virginica', 'iris_131': 'virginica', 'iris_132': 'virginica', 'iris_133': 'virginica', 'iris_134': 'virginica', 'iris_135': 'virginica', 'iris_136': 'virginica', 'iris_137': 'virginica', 'iris_138': 'virginica', 'iris_139': 'virginica', 'iris_140': 'virginica', 'iris_141': 'virginica', 'iris_142': 'virginica', 'iris_143': 'virginica', 'iris_144': 'virginica', 'iris_145': 'virginica', 'iris_146': 'virginica', 'iris_147': 'virginica', 'iris_148': 'virginica', 'iris_149': 'virginica'})
c_iris = pd.Series({'setosa': '#66c2a5', 'versicolor': '#fc8d62', 'virginica': '#8da0cb'})
# Get edge to weight mapping
weights = X_iris.T.corr().stack()
weights.index = weights.index.map(frozenset)
print(weights.size)
# 22500 = 150**2
# Get rid of diagonal b/c the weights are non-informative
weights = weights[weights.index.map(lambda nodes: len(nodes) == 2)]
print(weights.size)
# 22350 = 150**2 - 150
# Get non-redundant edges ([upper/lower]triangle)
weights = pd.Series(weights.to_dict() )
print(weights.size)
# 11175 = (150**2 - 150)/2
# Create graph
tol = 0.99
graph = nx.Graph()
for edge, w in weights.abs().items():  # For the sake of demonstration, just take the absolute value, though I wouldn't normally do this
    if w > tol:
        graph.add_edge(*edge, weight=w)
# Get positions
pos = nx.circular_layout(graph)#, seed=0)
# Prepare nodes for HoloViews
df_nodes = pd.DataFrame(pos, index=list("xy")).T
df_nodes.index.name = "Node"
df_nodes["Species"] = y_iris
df_nodes = df_nodes.reset_index()[["x","y", "Node", "Species"]]
df_nodes.head()
# x y Node Species
# 0 0.002421 -0.765592 iris_1 setosa
# 1 0.116149 -0.721862 iris_0 setosa
# 2 0.012620 -0.730962 iris_2 setosa
# 3 0.053972 -0.611302 iris_3 setosa
# 4 0.049840 -0.687669 iris_4 setosa
# Prepare edges for HoloViews
df_edges = list()
for node_a, node_b, edge_data in graph.edges(data=True):
    df_edges.append([node_a, node_b, edge_data["weight"]])
df_edges = pd.DataFrame(df_edges, columns=["start", "end", "weight"])
df_edges.head()
# start end weight
# 0 iris_1 iris_0 0.995999
# 1 iris_1 iris_2 0.996607
# 2 iris_1 iris_3 0.997397
# 3 iris_1 iris_4 0.992233
# 4 iris_1 iris_5 0.993592
hv_nodes = hv.Nodes(df_nodes)
hv_graph = hv.Graph((df_edges, hv_nodes), label='Iris Dataset')
hv_graph.opts(cmap=c_iris.to_dict(), node_size=10, edge_line_width="weight",
node_line_color='white', node_color='Species', xaxis=None, yaxis=None)
# Custom mapping
categories = list(map(lambda i: "Category_{}".format(i), range(10)))
range_of_values = np.linspace(0,1)
node_to_custom = dict()
for i, node in enumerate(graph.nodes()):
    rng = np.random.RandomState(i)
    # Get a random number of categories (real data will obviously not look like this)
    number_of_categories = rng.choice([0,1,2,3,4,5], size=1)[0]
    # Grab N categories w/o replacement
    categories_wrt_node = rng.choice(categories, size=number_of_categories, replace=False)
    # Get values ranging from [0,1] for those categories
    values_wrt_categories = rng.choice(range_of_values, size=number_of_categories)
    # Get a mapping between categories and values
    categories_to_values = pd.Series(dict(zip(categories_wrt_node, values_wrt_categories)), dtype=float)
    # Get non-zero values, sort, and store
    node_to_custom[node] = categories_to_values[lambda v: v > 0].sort_values(ascending=False)
# Example of {key:value} showing {node:series}
list(node_to_custom.items())[0]
# ('iris_1',
# Category_2 0.734694
# Category_9 0.489796
# Category_8 0.469388
# Category_4 0.122449
# dtype: float64)
I don't have a definitive answer to your question, but maybe I can still help.
To the best of my knowledge, HoloViews doesn't support a variable number of tooltips. What it does support is custom tooltips.
Custom tooltips look like this:
from bokeh.models import HoverTool
# each tuple will be a row in the tooltip
# (@name refers to a data column, $name to a special Bokeh variable)
tooltips = [
    ('Name', '@name'),
    ('Symbol', '@symbol'),
    ('CPK', '$color[hex, swatch]:CPK')
]
custom_hover_tool = HoverTool(tooltips=tooltips)
points.opts(tools=[custom_hover_tool])
Example from here:
http://holoviews.org/user_guide/Plotting_with_Bokeh.html
More details on the usable $variables and @variables:
https://docs.bokeh.org/en/latest/docs/user_guide/tools.html#hovertool
So if this could be good enough for you, you could aggregate your categorical data into a string for each record like "Category_2: 0.734694, Category_9: 0.489796 ..." and display that as a row in the tooltip with a label like "Categories:".
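The aggregation into one string per node could look roughly like this. It is only a sketch: the node data below is a hypothetical stand-in for node_to_custom (plain dicts instead of pandas Series), and the helper name summarize is made up for illustration.

```python
# Hypothetical per-node data in the shape of the question's node_to_custom
# (plain dicts here instead of pandas Series, to keep the sketch dependency-free).
node_to_custom = {
    "iris_1": {"Category_2": 0.734694, "Category_9": 0.489796},
    "iris_2": {},
}

def summarize(categories):
    """Join 'name: value' pairs into one tooltip-friendly string."""
    return ", ".join(f"{name}: {value:.6f}" for name, value in categories.items())

node_to_summary = {node: summarize(cats) for node, cats in node_to_custom.items()}
print(node_to_summary["iris_1"])  # Category_2: 0.734694, Category_9: 0.489796
```

A node without categories simply gets an empty string. The strings would then be stored in one extra column of the node dataframe and referenced from a single tooltip row.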
But the tooltips variable actually can be an HTML template too, something like this:
tooltips = """
<div class="row">
    <div class="col label">Node</div>
    <div class="col value">@name</div>
</div>
<div class="row">
    <div class="col label">Species</div>
    <div class="col value">@species</div>
</div>
@categories{safe}
"""
The {safe} part forces the tooltip to render the content of that variable as HTML. So this time you first have to aggregate your categorical data into a data column that already contains the final HTML code for every record. For your example record it should look like this:
'\
<div class="row">\
<div class="col label">Category_2:</div>\
<div class="col value">0.734694</div>\
</div>\
<div class="row">\
<div class="col label">Category_9:</div>\
<div class="col value">0.489796</div>\
</div>\
...\
'
(Most likely you would have two for loops inside each other, one for every node and another for every category in it, and only adding one "row" at the time, but something like this would be the end result for each node.)
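The two nested loops described above might be sketched as follows. As with the mockups, this is untested against a real HoloViews plot: the node data is a hypothetical stand-in for node_to_custom, and categories_to_html is a made-up helper name.

```python
import html

def categories_to_html(categories):
    """Render {category: value} pairs as tooltip rows; an empty mapping gives ''."""
    rows = []
    for name, value in categories.items():
        rows.append(
            '<div class="row">'
            f'<div class="col label">{html.escape(str(name))}:</div>'
            f'<div class="col value">{value:.6f}</div>'
            '</div>'
        )
    return "".join(rows)

# Hypothetical per-node data shaped like the question's node_to_custom.
node_to_custom = {
    "iris_1": {"Category_2": 0.734694, "Category_9": 0.489796},
    "iris_2": {},
}
node_to_html = {node: categories_to_html(cats) for node, cats in node_to_custom.items()}
print(node_to_html["iris_1"])
```

Since pandas Series also provide .items(), the same helper should work on the Series stored in the question's dictionary. The resulting strings would go into a node column whose name matches the one used with {safe} in the template.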
If you use the exact same HTML/CSS structure in both code blocks, they should merge seamlessly.
Treat these code blocks as mockups that only demonstrate the idea; I improvised them here without testing, but I hope they help.
Let me know if you try it, and show me a working example if you get stuck, and I will try to go into the details.
I have a list of lists with sublists all of which contain float values.
For example the one below has 2 lists with sublists each:
mylist = [[[2.67, 2.67, 0.0, 0.0], [2.67, 2.67, 2.0, 2.0], [2.67, 2.67, 2.0, 2.0], [2.67, 2.67, 2.0, 2.0], [2.67, 2.67, 2.0, 2.0], [2.67, 2.67, 2.0, 2.0], [0.0, 2.67, 2.0, 2.0], [2.67, 2.67, 2.0, 2.0], [2.67, 2.67, 2.0, 2.0], [2.67, 2.67, 2.0, 2.0]], [[2.67, 2.67, 2.0, 2.0], [0.0, 2.67, 2.0, 2.0], [2.67, 2.67, 2.0, 2.0], [2.67, 2.67, 2.0, 2.0], [2.67, 2.67, 2.0, 2.0], [0.0, 0.0, 0.0, 0.0], [2.67, 2.67, 2.0, 2.0], [2.67, 2.67, 2.0, 2.0], [2.67, 2.67, 2.0, 2.0], [2.67, 2.67, 2.0, 2.0], [2.67, 2.67, 2.0, 2.0], [2.67, 2.67, 2.0, 2.0], [2.67, 2.67, 2.0, 2.0], [2.67, 2.67, 2.0, 2.0]]]
I want to calculate the standard deviation and the mean of the sublists and what I applied was this:
mean = [statistics.mean(d) for d in mylist]
stdev = [statistics.stdev(d) for d in mylist]
but this also includes the 0.0 values, which I do not want: I set them to 0 only so that the entries would not be empty. Is there a way to ignore these zeros, as if they did not exist in the sublists, so that they are not taken into account at all? I could not find a way to do this with my approach.
You can use numpy's nanmean and nanstd functions.
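import numpy as np
def zero_to_nan(d):
    array = np.array(d, dtype=float)
    array[array == 0] = np.nan  # np.NaN was removed in NumPy 2.0
    return array
mean = [np.nanmean(zero_to_nan(d)) for d in mylist]
# np.nanstd defaults to ddof=0 (population); pass ddof=1 to match statistics.stdev
stdev = [np.nanstd(zero_to_nan(d)) for d in mylist]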
You can do this with a list comprehension.
The following lambda function flattens the nested list into a single list and filters out all zeros:
flatten = lambda nested: [x for sublist in nested for x in sublist if x != 0]
Note that the list comprehension has two for clauses and one if clause, similar to this code snippet, which does essentially the same thing:
flat_list = []
for sublist in nested:
    for x in sublist:
        if x != 0:
            flat_list.append(x)
To apply this to your list you can use map. The map function returns an iterator, so to get a list we pass the iterator to list:
flat = list(map(flatten, mylist))
Now you can calculate the mean and standard deviation:
import statistics
mean = [statistics.mean(d) for d in flat]
stdev = [statistics.stdev(d) for d in flat]
print(mean)
print(stdev)
mean = [statistics.mean([k for row in d for k in row if k != 0]) for d in mylist]
stdev = [statistics.stdev([k for row in d for k in row if k != 0]) for d in mylist]
Try:
mean = [statistics.mean([k for row in d for k in row if k]) for d in mylist]
stdev = [statistics.stdev([k for row in d for k in row if k]) for d in mylist]
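One thing the snippets above do not cover is a group that has fewer than two non-zero values, where statistics.stdev raises an error. Here is a small standard-library sketch of one way to guard against that. The input is hypothetical but nested like mylist, and returning None for undefined results is just one possible convention.

```python
import statistics

# Small hypothetical input with the same nesting as the question's mylist.
groups = [
    [[2.0, 4.0, 0.0], [0.0, 6.0, 0.0]],   # non-zero values: 2, 4, 6
    [[0.0, 0.0], [5.0, 0.0]],             # only one non-zero value
]

def nonzero_stats(group):
    """Return (mean, sample stdev) of the non-zero values, or None where undefined."""
    values = [x for row in group for x in row if x != 0]
    mean = statistics.mean(values) if values else None
    # statistics.stdev needs at least two data points
    stdev = statistics.stdev(values) if len(values) >= 2 else None
    return mean, stdev

results = [nonzero_stats(g) for g in groups]
print(results)
```

The first group gives a mean of 4 and a sample standard deviation of 2; the second group has a mean but no defined standard deviation.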
import csv
def process_timecards():
    timecards = []
    with open("timecards.txt") as f:
        reader = csv.reader(f)
        listoftimecards = [list(map(float, row)) for row in reader]
    print(listoftimecards)
    list1 = listoftimecards.pop(0)
    print(list1)
[[688997.0, 5.0, 6.8, 8.0, 7.7, 6.6, 5.2, 7.1, 4.0, 7.5, 7.6], [939825.0, 7.9, 6.6, 6.8, 7.4, 6.4, 5.1, 6.7, 7.3, 6.8, 4.1], [900100.0, 5.1, 6.8, 5.0, 6.6, 7.7, 5.1, 7.5], [969829.0, 6.4, 6.6, 4.4, 5.0, 7.1, 7.1, 4.1, 6.5], [283809.0, 7.2, 5.8, 7.6, 5.3, 6.4, 4.6, 6.4, 5.0, 7.5], [224568.0, 5.2, 6.9, 4.2, 6.4, 5.3, 6.8, 4.4], [163695.0, 4.8, 7.2, 7.2, 4.7, 5.1, 7.3, 7.5, 4.5, 4.6, 7.0], [454912.0, 5.5, 5.3, 4.5, 4.3, 5.5], [285767.0, 7.5, 6.5, 6.3, 4.7, 6.8, 7.1, 6.6, 6.6], [674261.0, 7.2, 6.2, 4.9, 6.5, 7.2, 7.5, 5.0, 7.9], [426824.0, 7.4, 6.5, 5.7, 8.0, 6.9, 7.5, 6.5, 7.5], [934003.0, 5.8, 7.5, 5.8, 4.8, 5.9, 4.8, 4.0, 6.6, 5.5, 7.2]]
This is the list of lists that I have, and I need to grab the first value of each inner list and store those values in a new list.
I thought I could use pop, but that only acts on the outer list: it removes and returns the first inner list, so the whole first list gets printed instead of the first values.
Any advice? I was thinking of a for loop, but I have no idea how to write it.
list1 = [sublist[0] for sublist in listoftimecards]
Since you only want the first element, you can grab just the first element of each row using the built-in function next:
import csv
def process_timecards():
    timecards = []
    with open("timecards.txt") as f:
        reader = csv.reader(f)
        list1 = [next(map(float, row)) for row in reader]
    print(list1)
list1 = []
for item in listoftimecards:
    list1.append(item[0])
This loops through each item in listoftimecards and appends the first element of every inner list to list1.
The code below does the same thing as the code above.
list1 = [x[0] for x in listoftimecards]
In Python 2, you can use:
map(lambda lst: lst[0], listoftimecards)
In Python 3, map returns an iterator, so you need to call list() to get a list:
list(map(lambda lst: lst[0], listoftimecards))
If you only need to iterate over the result, though, using map without list() is more memory-efficient.
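To illustrate the lazy variant, here is a sketch that reads only the first field of each CSV row through a generator expression. The in-memory sample is a hypothetical stand-in for timecards.txt, with shortened rows.

```python
import csv
import io

# Hypothetical stand-in for timecards.txt: an id followed by a ragged list of hours.
sample = io.StringIO("688997,5.0,6.8,8.0\n939825,7.9,6.6\n900100,5.1\n")

reader = csv.reader(sample)
# Generator expression: converts only the first field of each row, one row at a time.
first_values = (float(row[0]) for row in reader if row)

ids = list(first_values)  # materialize only when a list is actually needed
print(ids)  # [688997.0, 939825.0, 900100.0]
```

The `if row` guard skips empty lines, which csv.reader returns as empty lists.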
df is the 3D array shown below, which consists of three 2D arrays. I need to access the last column of one of the 2D arrays across all of its rows.
array([[[4.3, 3.0, 1.1, 0.1, 'Setosa'],
[4.4, 3.2, 1.3, 0.2, 'Setosa'],
[4.4, 3.0, 1.3, 0.2, 'Setosa'],
[4.4, 2.9, 1.4, 0.2, 'Setosa'],
[4.5, 2.3, 1.3, 0.3, 'Setosa'],
[4.6, 3.6, 1.0, 0.2, 'Setosa'],
[4.6, 3.1, 1.5, 0.2, 'Setosa'],
[4.6, 3.4, 1.4, 0.3, 'Setosa'],
[4.6, 3.2, 1.4, 0.2, 'Setosa'],
[4.7, 3.2, 1.3, 0.2, 'Setosa'],
[4.7, 3.2, 1.6, 0.2, 'Setosa'],
[4.8, 3.0, 1.4, 0.1, 'Setosa'],
[4.8, 3.0, 1.4, 0.3, 'Setosa'],
[4.8, 3.4, 1.9, 0.2, 'Setosa'],
[4.8, 3.4, 1.6, 0.2, 'Setosa'],
[4.8, 3.1, 1.6, 0.2, 'Setosa'],
[4.9, 2.4, 3.3, 1.0, 'Versicolor'],
[4.9, 2.5, 4.5, 1.7, 'Virginica'],
[4.9, 3.1, 1.5, 0.2, 'Setosa'],
[4.9, 3.1, 1.5, 0.1, 'Setosa'],
[4.9, 3.6, 1.4, 0.1, 'Setosa'],
[4.9, 3.0, 1.4, 0.2, 'Setosa'],
[5.0, 3.5, 1.3, 0.3, 'Setosa'],
[5.0, 3.4, 1.6, 0.4, 'Setosa'],
[5.0, 3.3, 1.4, 0.2, 'Setosa'],
[5.0, 3.2, 1.2, 0.2, 'Setosa'],
[5.0, 3.5, 1.6, 0.6, 'Setosa'],
[5.0, 2.0, 3.5, 1.0, 'Versicolor'],
[5.0, 3.4, 1.5, 0.2, 'Setosa'],
[5.0, 2.3, 3.3, 1.0, 'Versicolor'],
[5.0, 3.6, 1.4, 0.2, 'Setosa'],
[5.0, 3.0, 1.6, 0.2, 'Setosa'],
[5.1, 3.8, 1.9, 0.4, 'Setosa'],
[5.1, 3.8, 1.6, 0.2, 'Setosa'],
[5.1, 2.5, 3.0, 1.1, 'Versicolor'],
[5.1, 3.5, 1.4, 0.2, 'Setosa'],
[5.1, 3.4, 1.5, 0.2, 'Setosa'],
[5.1, 3.5, 1.4, 0.3, 'Setosa'],
[5.1, 3.3, 1.7, 0.5, 'Setosa'],
[5.1, 3.7, 1.5, 0.4, 'Setosa'],
[5.1, 3.8, 1.5, 0.3, 'Setosa'],
[5.2, 4.1, 1.5, 0.1, 'Setosa'],
[5.2, 3.4, 1.4, 0.2, 'Setosa'],
[5.2, 3.5, 1.5, 0.2, 'Setosa'],
[5.2, 2.7, 3.9, 1.4, 'Versicolor'],
[5.3, 3.7, 1.5, 0.2, 'Setosa'],
[5.4, 3.0, 4.5, 1.5, 'Versicolor'],
[5.4, 3.9, 1.7, 0.4, 'Setosa'],
[5.4, 3.4, 1.7, 0.2, 'Setosa'],
[5.4, 3.4, 1.5, 0.4, 'Setosa']],
[[5.4, 3.7, 1.5, 0.2, 'Setosa'],
[5.4, 3.9, 1.3, 0.4, 'Setosa'],
[5.5, 3.5, 1.3, 0.2, 'Setosa'],
[5.5, 2.6, 4.4, 1.2, 'Versicolor'],
[5.5, 4.2, 1.4, 0.2, 'Setosa'],
[5.5, 2.3, 4.0, 1.3, 'Versicolor'],
[5.5, 2.4, 3.7, 1.0, 'Versicolor'],
[5.5, 2.4, 3.8, 1.1, 'Versicolor'],
[5.5, 2.5, 4.0, 1.3, 'Versicolor'],
[5.6, 3.0, 4.1, 1.3, 'Versicolor'],
[5.6, 2.8, 4.9, 2.0, 'Virginica'],
[5.6, 3.0, 4.5, 1.5, 'Versicolor'],
[5.6, 2.5, 3.9, 1.1, 'Versicolor'],
[5.6, 2.7, 4.2, 1.3, 'Versicolor'],
[5.6, 2.9, 3.6, 1.3, 'Versicolor'],
[5.7, 2.6, 3.5, 1.0, 'Versicolor'],
[5.7, 2.9, 4.2, 1.3, 'Versicolor'],
[5.7, 2.8, 4.1, 1.3, 'Versicolor'],
[5.7, 4.4, 1.5, 0.4, 'Setosa'],
[5.7, 2.8, 4.5, 1.3, 'Versicolor'],
[5.7, 2.5, 5.0, 2.0, 'Virginica'],
[5.7, 3.8, 1.7, 0.3, 'Setosa'],
[5.7, 3.0, 4.2, 1.2, 'Versicolor'],
[5.8, 2.7, 4.1, 1.0, 'Versicolor'],
[5.8, 4.0, 1.2, 0.2, 'Setosa'],
[5.8, 2.6, 4.0, 1.2, 'Versicolor'],
[5.8, 2.8, 5.1, 2.4, 'Virginica'],
[5.8, 2.7, 5.1, 1.9, 'Virginica'],
[5.8, 2.7, 3.9, 1.2, 'Versicolor'],
[5.8, 2.7, 5.1, 1.9, 'Virginica'],
[5.9, 3.0, 5.1, 1.8, 'Virginica'],
[5.9, 3.0, 4.2, 1.5, 'Versicolor'],
[5.9, 3.2, 4.8, 1.8, 'Versicolor'],
[6.0, 2.9, 4.5, 1.5, 'Versicolor'],
[6.0, 2.7, 5.1, 1.6, 'Versicolor'],
[6.0, 3.0, 4.8, 1.8, 'Virginica'],
[6.0, 3.4, 4.5, 1.6, 'Versicolor'],
[6.0, 2.2, 4.0, 1.0, 'Versicolor'],
[6.0, 2.2, 5.0, 1.5, 'Virginica'],
[6.1, 3.0, 4.9, 1.8, 'Virginica'],
[6.1, 2.6, 5.6, 1.4, 'Virginica'],
[6.1, 2.8, 4.0, 1.3, 'Versicolor'],
[6.1, 2.9, 4.7, 1.4, 'Versicolor'],
[6.1, 2.8, 4.7, 1.2, 'Versicolor'],
[6.1, 3.0, 4.6, 1.4, 'Versicolor'],
[6.2, 2.2, 4.5, 1.5, 'Versicolor'],
[6.2, 2.9, 4.3, 1.3, 'Versicolor'],
[6.2, 3.4, 5.4, 2.3, 'Virginica'],
[6.2, 2.8, 4.8, 1.8, 'Virginica'],
[6.3, 2.5, 4.9, 1.5, 'Versicolor']],
[[6.3, 2.7, 4.9, 1.8, 'Virginica'],
[6.3, 2.5, 5.0, 1.9, 'Virginica'],
[6.3, 3.3, 4.7, 1.6, 'Versicolor'],
[6.3, 2.8, 5.1, 1.5, 'Virginica'],
[6.3, 3.3, 6.0, 2.5, 'Virginica'],
[6.3, 2.3, 4.4, 1.3, 'Versicolor'],
[6.3, 3.4, 5.6, 2.4, 'Virginica'],
[6.3, 2.9, 5.6, 1.8, 'Virginica'],
[6.4, 2.8, 5.6, 2.2, 'Virginica'],
[6.4, 2.8, 5.6, 2.1, 'Virginica'],
[6.4, 3.1, 5.5, 1.8, 'Virginica'],
[6.4, 3.2, 4.5, 1.5, 'Versicolor'],
[6.4, 3.2, 5.3, 2.3, 'Virginica'],
[6.4, 2.9, 4.3, 1.3, 'Versicolor'],
[6.4, 2.7, 5.3, 1.9, 'Virginica'],
[6.5, 3.0, 5.8, 2.2, 'Virginica'],
[6.5, 3.0, 5.5, 1.8, 'Virginica'],
[6.5, 3.0, 5.2, 2.0, 'Virginica'],
[6.5, 2.8, 4.6, 1.5, 'Versicolor'],
[6.5, 3.2, 5.1, 2.0, 'Virginica'],
[6.6, 2.9, 4.6, 1.3, 'Versicolor'],
[6.6, 3.0, 4.4, 1.4, 'Versicolor'],
[6.7, 3.1, 4.7, 1.5, 'Versicolor'],
[6.7, 3.1, 5.6, 2.4, 'Virginica'],
[6.7, 2.5, 5.8, 1.8, 'Virginica'],
[6.7, 3.0, 5.0, 1.7, 'Versicolor'],
[6.7, 3.1, 4.4, 1.4, 'Versicolor'],
[6.7, 3.3, 5.7, 2.5, 'Virginica'],
[6.7, 3.0, 5.2, 2.3, 'Virginica'],
[6.7, 3.3, 5.7, 2.1, 'Virginica'],
[6.8, 3.2, 5.9, 2.3, 'Virginica'],
[6.8, 2.8, 4.8, 1.4, 'Versicolor'],
[6.8, 3.0, 5.5, 2.1, 'Virginica'],
[6.9, 3.1, 5.4, 2.1, 'Virginica'],
[6.9, 3.1, 5.1, 2.3, 'Virginica'],
[6.9, 3.1, 4.9, 1.5, 'Versicolor'],
[6.9, 3.2, 5.7, 2.3, 'Virginica'],
[7.0, 3.2, 4.7, 1.4, 'Versicolor'],
[7.1, 3.0, 5.9, 2.1, 'Virginica'],
[7.2, 3.0, 5.8, 1.6, 'Virginica'],
[7.2, 3.2, 6.0, 1.8, 'Virginica'],
[7.2, 3.6, 6.1, 2.5, 'Virginica'],
[7.3, 2.9, 6.3, 1.8, 'Virginica'],
[7.4, 2.8, 6.1, 1.9, 'Virginica'],
[7.6, 3.0, 6.6, 2.1, 'Virginica'],
[7.7, 2.8, 6.7, 2.0, 'Virginica'],
[7.7, 2.6, 6.9, 2.3, 'Virginica'],
[7.7, 3.8, 6.7, 2.2, 'Virginica'],
[7.7, 3.0, 6.1, 2.3, 'Virginica'],
[7.9, 3.8, 6.4, 2.0, 'Virginica']]], dtype=object)
I tried:
element = df[g_index[[i],[4]]]
but it returns an error
def Numberofoccurences(data,sortcolindex,g_index):
    df = DivideColumns(data,sortcolindex)
    num_Iris_setosa = 0
    num_Iris_versicolor = 0
    num_Iris_virginica = 0
    for i in range(0,50):
        element = df[g_index[[i],[4]]]
        if(element == 'Setosa'):
            num_Iris_setosa+=1
        elif(element == 'Versicolor'):
            num_Iris_versicolor+=1
        elif (element == 'Virginica'):
            num_Iris_virginica+=1
    array1D_for_occ = np.array([num_Iris_virginica,num_Iris_versicolor,num_Iris_setosa])
    return array1D_for_occ
numberofocc = Numberofoccurences(iris_array,0,0)
The error I get is
TypeError: 'int' object is not subscriptable
'Not subscriptable' means the object cannot be indexed. Here g_index is an int, so g_index[[i],[4]] is not allowed.
Maybe you want:
element = df[g_index, i, 4]
Here is a fix for your function. Loop over each row in df[g_index]; to get the last element of a row, use each[-1], where -1 refers to the last element.
def Numberofoccurences(data,sortcolindex,g_index):
    df = DivideColumns(data,sortcolindex)
    num_Iris_setosa = 0
    num_Iris_versicolor = 0
    num_Iris_virginica = 0
    for each in df[g_index]:
        element = each[-1]
        if(element == 'Setosa'):
            num_Iris_setosa+=1
        elif(element == 'Versicolor'):
            num_Iris_versicolor+=1
        elif (element == 'Virginica'):
            num_Iris_virginica+=1
    array1D_for_occ = np.array([num_Iris_virginica,num_Iris_versicolor,num_Iris_setosa])
    return array1D_for_occ
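The counting itself can also be written more compactly with collections.Counter. This sketch uses a hypothetical miniature of the data as nested lists, and count_labels is a made-up name. Iterating over df[g_index] for the NumPy array from the question yields rows in the same way, and df[g_index, :, -1] would select the label column directly.

```python
from collections import Counter

# Hypothetical miniature of the question's data: 2 groups of rows, label in the last position.
groups = [
    [[4.3, 3.0, 1.1, 0.1, "Setosa"], [4.9, 2.4, 3.3, 1.0, "Versicolor"], [4.4, 3.2, 1.3, 0.2, "Setosa"]],
    [[6.3, 2.7, 4.9, 1.8, "Virginica"], [6.3, 3.3, 4.7, 1.6, "Versicolor"]],
]

def count_labels(groups, g_index, labels=("Virginica", "Versicolor", "Setosa")):
    """Count how often each label occurs in the last column of groups[g_index]."""
    counts = Counter(row[-1] for row in groups[g_index])
    # Same order as the array returned by the function above; missing labels count as 0.
    return [counts[label] for label in labels]

print(count_labels(groups, 0))  # [0, 1, 2]
```

Unlike the fixed range(0, 50) in the question, this does not depend on how many rows a group contains.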