Network is broken after loading from pickle [duplicate]

Context: I'm trying to run another researcher's code - it describes a traffic model for the Bay Area road network, which is subject to seismic hazard. I'm new to Python and therefore would really appreciate some help debugging the following error.
Issue: When I try to run the code for the sample data provided with the file, following the instructions in the README, I get the following error.
DN0a226926:quick_traffic_model gitanjali$ python mahmodel_road_only.py
You are considering 2 ground-motion intensity maps.
You are considering 1743 different site locations.
You are considering 2 different damage maps (1 per ground-motion intensity map).
Traceback (most recent call last):
  File "mahmodel_road_only.py", line 288, in <module>
    main()
  File "mahmodel_road_only.py", line 219, in main
    G = get_graph()
  File "mahmodel_road_only.py", line 157, in get_graph
    G = add_superdistrict_centroids(G)
  File "mahmodel_road_only.py", line 46, in add_superdistrict_centroids
    G.add_node(str(1000000 + i))
  File "/Library/Python/2.7/site-packages/networkx-2.0-py2.7.egg/networkx/classes/digraph.py", line 412, in add_node
    if n not in self._succ:
AttributeError: 'DiGraph' object has no attribute '_succ'
Debugging: Based on some other questions, it seems like this error stems from an issue with the networkx version (I'm using 2.0) or the Python version (I'm using 2.7.10). I went through the migration guide cited in other questions and found nothing that I needed to change in mahmodel_road_only.py. I also checked the digraph.py file and found that self._succ is defined. I also checked the definition of get_graph(), shown below, which calls networkx, but didn't see any obvious issues.
def get_graph():
    import networkx
    '''loads full mtc highway graph with dummy links and then adds a few
    fake centroidal nodes for max flow and traffic assignment'''
    G = networkx.read_gpickle("input/graphMTC_CentroidsLength3int.gpickle")
    G = add_superdistrict_centroids(G)
    assert not G.is_multigraph()  # Directed! only one edge between nodes
    G = networkx.freeze(G)  # prevents edges or nodes to be added or deleted
    return G
Question: How can I resolve this problem? Is it a matter of changing the Python or Networkx versions? If not, what next steps could you recommend for debugging?

I believe your problem is similar to the one in the question "AttributeError: 'DiGraph' object has no attribute '_node'".
The issue there is that the graph being investigated was created in networkx 1.x and then pickled. The pickled graph therefore has the attributes of a networkx 1.x object. I believe this happened for you as well.
You've now opened it and you're applying tools from networkx 2.x to that graph. But those tools assume that it's a networkx 2.x DiGraph, with all the attributes expected of a 2.x DiGraph. In particular, add_node expects the graph object to have a _succ attribute, which a DiGraph pickled under 1.x does not have.
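The mechanism can be reproduced without networkx. The sketch below is only an illustration: toy_graphlib, DiGraphV1 and DiGraphV2 are made-up stand-ins, not networkx code. It relies on the fact that unpickling an ordinary object restores its saved instance dictionary without calling __init__, so an object pickled when the attribute was named succ never gains _succ after the "upgrade".

```python
import pickle
import sys
import types

# A throwaway module standing in for an installed library (hypothetical name).
toy_lib = types.ModuleType("toy_graphlib")
sys.modules["toy_graphlib"] = toy_lib


def publish(cls):
    """Expose cls as toy_graphlib.DiGraph so pickle can find it by name."""
    cls.__module__ = "toy_graphlib"
    cls.__name__ = cls.__qualname__ = "DiGraph"
    toy_lib.DiGraph = cls
    return cls


class DiGraphV1(object):
    """Version 1 of the toy class keeps successors in `succ`."""

    def __init__(self):
        self.succ = {}


class DiGraphV2(object):
    """Version 2 of the toy class renamed the attribute to `_succ`."""

    def __init__(self):
        self._succ = {}

    def add_node(self, n):
        if n not in self._succ:
            self._succ[n] = {}


publish(DiGraphV1)
payload = pickle.dumps(DiGraphV1())  # pickled while version 1 is "installed"

publish(DiGraphV2)                   # simulate upgrading the library
G = pickle.loads(payload)            # __init__ is not called; the old __dict__ is restored

restored_attributes = sorted(vars(G))
try:
    G.add_node("1000001")
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(restored_attributes)  # only the old attribute name is present
print(error_message)
```

The restored object is an instance of the new class but carries only the old attribute, which is the same shape of failure as in the traceback above.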
So here are two approaches that I believe will work:
Short term solution
Remove networkx 2.x and replace with networkx 1.11.
This is not optimal because networkx 2.x is more capable. Also, code that has been written to work in both 2.x and 1.x (following the migration guide you mentioned) will be less efficient in 1.x (for example, there will be places where the 1.x code builds lists and the 2.x code uses generators).
Long term solution
Convert the 1.x graph into a 2.x graph. (I can't easily test this, as I don't have 1.x on my computer at the moment. If anyone tries it, please leave a comment saying whether it works and whether your network was weighted.)
# you need commands here to load the 1.x graph G
#
import networkx as nx  # networkx 2.0
H = nx.DiGraph()  # if it's a DiGraph()
# H = nx.Graph()  # if it's a typical networkx Graph()
H.add_nodes_from(G.nodes(data=True))
H.add_edges_from(G.edges(data=True))
Passing data=True ensures that any node and edge attributes (such as weights) are preserved. H is now a networkx 2.x DiGraph, with the edges and nodes having whatever attributes G had. The networkx 2.x commands should work on it.
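One possible way to get the node and edge data across versions, assuming you can run networkx 1.x once (for example in a separate environment), is to save the plain node and edge lists rather than the graph object. Plain lists, tuples and dicts do not depend on any networkx class layout. This is an untested sketch: the sample values are made up, and the in-memory buffer stands in for a real file such as a hypothetical graph_data.pickle.

```python
import io
import pickle

# In the networkx 1.x environment these lists would come from
#   nodes = list(G.nodes(data=True))
#   edges = list(G.edges(data=True))
# The values below are made-up stand-ins with the same shape.
nodes = [("a", {"name": "A"}), ("b", {}), ("1000001", {})]
edges = [("a", "b", {"distance": 3.0}), ("b", "1000001", {"distance": 0.5})]

# Step 1 (old environment): write plain data only.
buffer = io.BytesIO()  # a real script would open a file in "wb" mode instead
pickle.dump({"nodes": nodes, "edges": edges}, buffer, protocol=2)

# Step 2 (new environment): read the plain data back.
buffer.seek(0)
data = pickle.load(buffer)

# Step 3 (new environment, needs networkx 2.x, shown as comments only):
#   H = nx.DiGraph()
#   H.add_nodes_from(data["nodes"])
#   H.add_edges_from(data["edges"])
print(len(data["nodes"]), len(data["edges"]))
```

Protocol 2 is chosen here only because it can be read by both Python 2.7 and Python 3.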
Bonus longer term solution
Contact the other researcher and warn him/her that the code example is now out of date.

Related

code runs in VScode, but not in Dynamo embedded python

I'm trying to connect a module that provides a straight skeleton algorithm to the embedded Python in Autodesk Revit's Dynamo environment.
I'm debugging in VS Code using serialized data from Revit, and I believe VS Code is using the same embedded Python that runs in Dynamo.
My test code runs in VS Code, and Dynamo is able to find the module.
I'm bringing the code over incrementally. When I add the line that actually calls the function, I get this error:
Warning: RuntimeError : generator raised StopIteration [' File "<string>", line 40, in <module>\n', ' File "C:\\Users\\mclough\\src\\polyskel2\\polyskel\\polyskel.py", line 437, in skeletonize\n slav = _SLAV(polygon, holes)\n', ' File "C:\\Users\\mclough\\src\\polyskel2\\polyskel\\polyskel.py", line 207, in __init__\n self._original_edges = [_OriginalEdge(LineSegment2(vertex.prev.point, vertex.point), vertex.prev.bisector, vertex.bisector) for vertex in chain.from_iterable(self._lavs)]\n', ' File "C:\\Users\\mclough\\src\\polyskel2\\polyskel\\polyskel.py", line 207, in <listcomp>\n self._original_edges = [_OriginalEdge(LineSegment2(vertex.prev.point, vertex.point), vertex.prev.bisector, vertex.bisector) for vertex in chain.from_iterable(self._lavs)]\n']
I did debug this error (or a similar one?) in VS Code, and it seems to be related to changes in Python 3.8. Is it possible that Dynamo's embedded Python is somehow looking at an old version of the file? It just seems odd that the code runs in VS Code but not in Dynamo when both use the same version of Python.
I don't understand how I'm supposed to interpret the last line of the error. I don't see anything that raises a StopIteration exception. The only candidate I can think of is the call to chain.from_iterable, which is part of itertools. Does it make sense that this would be the issue?
Code from the python node:
# Load the Python Standard and DesignScript Libraries
import sys
import clr
#this is temporary. final path should be the KS 'common lib'
sys.path.append(r'C:\Users\mclough\src\polyskel2\polyskel')
clr.AddReference('ProtoGeometry')
from Autodesk.DesignScript.Geometry import *
#sys.path.append(r'C:\ProgramData\Anaconda3\envs\Dynamo212\Lib\site-packages')
import polyskel
#import skgeom as sg
# The inputs to this node will be stored as a list in the IN variables.
# input data is serialized geometry of surface polycurves
# first element is coordinates of the boundary, listed counterclockwise
# remainder elements describe any holes, listed clockwise.
data = IN[0]
# Place your code below this line
skeletons = []
for shape in data:
    bdyPoints = []
    boundary = shape.pop(0)
    for b in boundary:
        x, y = b[0]
        bdyPoints.append([float(x), float(y)])
    #remaining entries are the holes
    holes = []
    for curve in shape:
        hlePoints = []
        for h in curve:
            x, y = h[0]
            hlePoints.append([float(x), float(y)])
        holes.append(hlePoints)
    sk = polyskel.skeletonize(bdyPoints, holes)
    #shapes.append([bdyPoints,holes])
    skeletons.append(sk)
# Assign your output to the OUT variable.
OUT = []
I've checked that VScode and Dynamo are using the same version of python.
I'm using the same input (geometry) data in both instances.
I verified that the modules are discoverable in both environments
I've searched for information on the Runtime Error
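A note on the message itself: since Python 3.7 (PEP 479), a StopIteration that escapes from inside a generator body is re-raised as "RuntimeError: generator raised StopIteration". One plausible source, and this is an assumption because the code of polyskel.py is not shown here, is an __iter__ method written as a generator that ends with an explicit raise StopIteration; chain.from_iterable would then merely be where the error surfaces during iteration. The sketch below uses hypothetical BrokenRing and FixedRing classes to reproduce the message and show the usual fix, which is to return from the generator instead.

```python
from itertools import chain


class BrokenRing(object):
    """Hypothetical container whose generator ends with `raise StopIteration`."""

    def __init__(self, items):
        self.items = list(items)

    def __iter__(self):
        for item in self.items:
            yield item
        raise StopIteration  # older idiom; converted to RuntimeError since Python 3.7


class FixedRing(BrokenRing):
    """Same container, but the generator finishes with a plain return."""

    def __iter__(self):
        for item in self.items:
            yield item
        return  # ends the generator cleanly


try:
    list(chain.from_iterable([BrokenRing([1, 2]), BrokenRing([3])]))
    broken_error = None
except RuntimeError as exc:
    broken_error = str(exc)

fixed_result = list(chain.from_iterable([FixedRing([1, 2]), FixedRing([3])]))

print(broken_error)
print(fixed_result)
```

If the module contains a pattern like BrokenRing, the error would not depend on Dynamo at all, so it would be worth checking that both environments really import the same copy of the file.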

Code g.nodes(data=True)[0:10] in networkx not working

The code:
print(g.nodes(data=True)[0:10])
Taken from Graph tutorial does not work.
I had to make two changes to the code as described in two previous questions:
First Question
Second Question
Given the two errors already reported, which seem to point to newer versions of NetworkX, is there some incompatibility with the latest version of NetworkX? I'm running it in Python 3.7.
The error I get after running the whole code and getting all the expected outputs as described in the tutorial is:
Traceback (most recent call last):
  File "Drawing-graphs.py", line 44, in <module>
    print(list(g.nodes(data=True)[0:10]))
  File "/opt/anaconda3/lib/python3.7/site-packages/networkx/classes/reportviews.py", line 277, in __getitem__
    ddict = self._nodes[n]
TypeError: unhashable type: 'slice'
The code in the tutorial is a bit long but very straight forward. It loads a graph and it prints some pieces of it. Here is all the code (without the last row it does what's expected without errors):
import itertools
import copy
import networkx as nx
import pandas as pd
import matplotlib.pyplot as plt
edgelist = pd.read_csv('https://gist.githubusercontent.com/brooksandrew/e570c38bcc72a8d102422f2af836513b/raw/89c76b2563dbc0e88384719a35cba0dfc04cd522/edgelist_sleeping_giant.csv')
# Grab node list data hosted on Gist
nodelist = pd.read_csv('https://gist.githubusercontent.com/brooksandrew/f989e10af17fb4c85b11409fea47895b/raw/a3a8da0fa5b094f1ca9d82e1642b384889ae16e8/nodelist_sleeping_giant.csv')
# Create empty graph
g = nx.Graph()
# Add edges and edge attributes
for i, elrow in edgelist.iterrows():
    g.add_edge(elrow[0], elrow[1], attr_dict=elrow[2:].to_dict())
# Add node attributes (see question)
for i, nlrow in nodelist.iterrows():
    g.node[nlrow['id']].update(nlrow[1:].to_dict())
print(list(g.edges(data=True))[0:5])
# Preview first 10 nodes
print(g.nodes(data=True)[0:10])

You should convert the result of g.nodes(data=True) into a list before slicing, because in networkx 2.x g.nodes() returns a view object (a NodeView, or a NodeDataView when data=True) which can't be sliced.
print(list(g.nodes(data=True))[0:10])
This should work on Python 3.7 and networkx 2.4.
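If only the first few entries are needed, itertools.islice avoids building the full list first. The sketch below uses the items() view of a plain dict as a stand-in for g.nodes(data=True); that substitution is an assumption made so the example runs without networkx, and the attribute values are made up. Both are iterable views that cannot be sliced directly.

```python
from itertools import islice

# Stand-in for g.nodes(data=True): node id -> attribute dict (made-up values).
node_data = {
    "n1": {"X": 1, "Y": 2},
    "n2": {"X": 3, "Y": 4},
    "n3": {"X": 5, "Y": 6},
}
view = node_data.items()

# Slicing the view directly fails, just as it does for the networkx view.
try:
    view[0:2]
    slicing_works = True
except TypeError:
    slicing_works = False

# Option 1: materialise the whole view, then slice the list.
first_two_via_list = list(view)[0:2]

# Option 2: stop iterating after two items without copying everything.
first_two_via_islice = list(islice(view, 2))

print(first_two_via_islice)
```

With a real graph the second option would read list(islice(g.nodes(data=True), 10)).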

conversion newick to graphml using python

I would like to convert a tree from newick to a format like graphml, that I can open with cytoscape.
So, I have a file "small.newick" that contain:
((raccoon:1,bear:6):0.8,((sea_lion:11.9, seal:12):7,((monkey:100,cat:47):20, weasel:18):2):3,dog:25);
So far, I did that way (Python 3.6.5 |Anaconda):
from Bio import Phylo
import networkx
Tree = Phylo.read("small.newick", 'newick')
G = Phylo.to_networkx(Tree)
networkx.write_graphml(G, 'small.graphml')
There is a problem with the Clade, that I can fix using this code:
from Bio import Phylo
import networkx
def clade_names_fix(tree):
    for idx, clade in enumerate(tree.find_clades()):
        if not clade.name:
            clade.name = str(idx)
Tree = Phylo.read("small.newick", 'newick')
clade_names_fix(Tree)
G = Phylo.to_networkx(Tree)
networkx.write_graphml(G, 'small.graphml')
This gives me something that seems nice enough.
My questions are:
Is that a good way to do it? It seems odd to me that the function does not take care of the internal node names.
If you replace one node name with a long enough string, it gets trimmed by the call to Phylo.to_networkx(Tree). How can I avoid that?
Example: substitution of "dog" by "test_tring_that_create_some_problem_later_on"

Looks like you got pretty far on this already. I can only suggest a few alternatives/extensions to your approach...
Unfortunately, I couldn't find a Cytoscape app that can read this format. I tried searching for PHYLIP, NEWICK and PHYLO. You might have more luck:
http://apps.cytoscape.org/
There is an old Cytoscape 2.x plugin that could read this format, but to run this you would need to install Cytoscape 2.8.3, import the network, then export as xGMML (or save as CYS) and then try to open in Cytoscape 3.7 in order to migrate back into the land of living code. Then again, if 2.8.3 does what you need for this particular case, then maybe you don't need to migrate:
http://apps.cytoscape.org/apps/phylotree
The best approach is programmatic, which you already explored. Finding an R or Python package that turns NEWICK into iGraph or GraphML is a solid strategy. Note that there are updated and slick Cytoscape libs in those languages as well, so you can do all label cleanup, layout, data visualization, analysis, export, etc all within the scripting environment:
https://bioconductor.org/packages/release/bioc/html/RCy3.html
https://py2cytoscape.readthedocs.io/en/latest/

After some research, I actually found a solution that works.
I decided to provide the link here for you, dear reader:
going to github

FYI for anyone coming across this now I think the first issue mentioned here has now been solved in BioPython. Using the same data as above, the networkx graph which is built contains all the internal nodes of the tree as well as the terminal nodes.
import matplotlib.pyplot as plt
import networkx
from Bio import Phylo
Tree = Phylo.read("small.newick", 'newick')
G = Phylo.to_networkx(Tree)
networkx.draw_networkx(G)
plt.savefig("small_graph.png")
Specs:
Python 3.8.10,
Bio 1.78,
networkx 2.5

Why do I get a Graph when I specify create_using=nx.DiGraph

I'm new to networkx in python and I had a problem in creating a DiGraph. I specify create_using=nx.DiGraph when I try to create a DiGraph using the adjacency matrix in pd dataframe, but I got a Graph instead of a DiGraph. Can anyone explain why?

This is apparently a bug that is fixed in newer versions of networkx. The problem is on this line.
You can either install a newer version of networkx, or apply the fix yourself (you only need to add one word). If you choose the second option, open networkx/convert_matrix.py, which on my system is at /usr/local/lib/python3.6/site-packages/networkx/convert_matrix.py (open the file as root using sudo), and change line 191 from:
G = from_numpy_matrix(A, create_using)
to
G = from_numpy_matrix(A, create_using=create_using)
And that's it, the bug should be fixed. Note: the argument should be create_using=nx.DiGraph().
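Why one word matters: an argument passed positionally binds to whichever parameter sits in that position, not to the parameter with the same name. The sketch below is a toy reconstruction, not the networkx source. It assumes the inner function's second parameter is something other than create_using (here called parallel_edges), and from_matrix, from_adjacency_buggy and from_adjacency_fixed are hypothetical names.

```python
def from_matrix(A, parallel_edges=False, create_using=None):
    """Hypothetical inner builder: reports which graph kind it would create."""
    kind = create_using if create_using is not None else "Graph"
    return {"kind": kind, "parallel_edges": parallel_edges, "matrix": A}


def from_adjacency_buggy(A, create_using=None):
    # Positional call: create_using lands in the parallel_edges slot.
    return from_matrix(A, create_using)


def from_adjacency_fixed(A, create_using=None):
    # Keyword call: create_using reaches the parameter of the same name.
    return from_matrix(A, create_using=create_using)


A = [[0, 1], [0, 0]]
buggy = from_adjacency_buggy(A, create_using="DiGraph")
fixed = from_adjacency_fixed(A, create_using="DiGraph")

print(buggy["kind"], fixed["kind"])
```

In the buggy variant the requested graph type is silently swallowed by the wrong parameter, so the default (undirected) kind is used and no error is raised.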
