Django finding paths between two vertexes in a graph - python

This is mostly a logic question, but the context is Django.
In our database we have Vertex and Line classes. These form a (neural) network, but it is unordered and I can't change it: it's a legacy database.
class Vertex(models.Model):
    code = models.AutoField(primary_key=True)
    lines = models.ManyToManyField('Line', through='Vertex_Line')

class Line(models.Model):
    code = models.AutoField(primary_key=True)

class Vertex_Line(models.Model):
    line = models.ForeignKey(Line, on_delete=models.CASCADE)
    vertex = models.ForeignKey(Vertex, on_delete=models.CASCADE)
Now, in the application, the user will be able to visually select TWO vertexes (the green circles below)
The JavaScript will then send the pk of these two Vertexes to Django, which has to find the Lines that form a route between them; in this case, the following 4 red Lines:
Business Logic:
A Vertex can have 1-4 Lines related to it
A Line can have 1-2 Vertexes related to it
There will only be one possible route between two Vertexes
What I have so far:
I understand that the answer probably includes recursion
The path must be found by trying every path from one Vertex until the other is found; it can't be found directly
Since there are four-way and three-way junctions, all the routes being tried must be saved throughout the recursion (unsure of this one)
I know the basic logic is looping through all the lines of each Vertex, and then get the other Vertex of these lines, and keep walking recursively, but I really don't know where to start on this one.
This is as far as I could get, but it probably does not help (views.py) :
import json

def findRoute(request):
    data = json.loads(request.body.decode("utf-8"))
    v1 = Vertex.objects.get(pk=data.get('v1_pk'))
    v2 = Vertex.objects.get(pk=data.get('v2_pk'))
    lines = v1.lines.all()
    routes = []
    for line in lines:
        starting_line = line
        # Trying a new route
        this_route_index = len(routes)
        routes.append([starting_line.pk])
        other_vertex = line.vertex_set.exclude(pk=v1.pk)
        # There are cases with dead-ends
        if other_vertex.exists():
            pass  # Mind block...

As you pointed out, this is not a Django/Python question, but a logical/algorithmic matter.
To find paths between two vertexes in a graph you can use many algorithms: Dijkstra, A*, DFS, BFS, Floyd–Warshall, etc. You can choose depending on what you need: shortest/minimum path, all paths...
How to implement this in Django? I suggest not applying the algorithm to the models themselves, since this could be expensive (in terms of time, db queries, etc.), especially for large graphs; instead, I'd rather map the graph into an in-memory data structure and run the algorithm over it.
You can take a look at NetworkX, which is a very complete (data structure + algorithms) and well documented library, or python-graph, which provides a suitable data structure and a whole set of important algorithms (including some of those mentioned above). More options at Python Graph Library
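As a minimal sketch of the in-memory approach, the following builds an adjacency map from (line pk, vertex pk) pairs and runs a breadth-first search over it. The function names build_adjacency and find_route and the sample rows are hypothetical; in the Django app the rows could presumably be loaded in a single query with something like Vertex_Line.objects.values_list("line_id", "vertex_id").

```python
from collections import defaultdict, deque


def build_adjacency(vertex_line_rows):
    """Map each vertex pk to a list of (line pk, neighbouring vertex pk)."""
    vertices_by_line = defaultdict(list)
    for line_pk, vertex_pk in vertex_line_rows:
        vertices_by_line[line_pk].append(vertex_pk)

    adjacency = defaultdict(list)
    for line_pk, ends in vertices_by_line.items():
        if len(ends) == 2:  # a line with a single vertex is a dead end
            first, second = ends
            adjacency[first].append((line_pk, second))
            adjacency[second].append((line_pk, first))
    return adjacency


def find_route(adjacency, start, goal):
    """Breadth-first search; returns the line pks along the route, or None."""
    if start == goal:
        return []
    previous = {start: None}  # vertex -> (previous vertex, line used to reach it)
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for line_pk, neighbour in adjacency.get(current, ()):
            if neighbour in previous:
                continue  # already visited
            previous[neighbour] = (current, line_pk)
            if neighbour == goal:
                # Walk back from the goal to the start, collecting lines.
                route = []
                node = goal
                while previous[node] is not None:
                    node, line_used = previous[node]
                    route.append(line_used)
                return route[::-1]
            queue.append(neighbour)
    return None


# Hypothetical sample data: (line pk, vertex pk) pairs.
rows = [(10, 1), (10, 2), (11, 2), (11, 3), (12, 3), (12, 4),
        (13, 2), (13, 5), (14, 5)]
adjacency = build_adjacency(rows)
route = find_route(adjacency, 1, 4)
print(route)
```

Because the visited set is kept in the previous dictionary, junctions and loops do not cause infinite recursion, and no per-step database queries are needed.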

Related

pyosmium - Build a Geojson linestring based on OSM Relation

I have a python script to analyse OSM data, and the objective is to build a GeoJson with specific data issued from OSM relation.
I'm currently focusing on OSM relation that represents 'hiking' trail like this one.
According to the document:
members
(read-only) Ordered list of relation members. See osmium.osm.RelationMemberList.
the relation object has an attribute members which collects all members of the relation.
Hence the first part of the script extracts all relations that have a tag sac_scale=hiking and collects all their ways.
The following script is on purpose focusing only on 1 specific relation : r104369
class HikingWaysfromRelations(osmium.SimpleHandler):
    def __init__(self):
        super(HikingWaysfromRelations, self).__init__()
        self.dict = {}

    def _getWays(self, elem, elem_type):
        # tag
        if 'sac_scale' in elem.tags and elem.tags['sac_scale']=='hiking' and elem.id==104369:
            list=[]
            for mem in elem.members:
                if mem.type=="w":
                    list.append(str(mem.ref))
            self.dict["r"+str(elem.id)]=list
        else:
            pass

    def relation(self,r):
        self._getWays(r, "relation")

ml = HikingWaysfromRelations()
ml.apply_file('../pbf/new-zealand.osm.pbf')
The result is a dictionary containing the expected relation as the only key, and its ways:
{"r104369": ["191668175", "765285136", "765285135", "765285138", "765285139", "191668225", "765542429", "765542430", "765542432", "765542431", "765542435", "765542436", "765542434", "765542433", "765542437", "765542438", "765542439", "765542440", "765542441", "765542442", "765548983", "271277386", "765548985", "765548984", "684295241", "684295239", "464603363", "464603364", "464607430", "299788481", "178920047", "155711655", "155711646", "684294192", "259362037", "684294189", "259362038", "259362041", "259362036", "259362043", "259362039", "259362040"]}
Now the question is: How to build a GeoJson containing a single Feature MultiLineString that connects all those ways and rebuild the expected hiking trail?
Based on what I've found on the net, I should re-run a SimpleHandler on the full .pbf file, and each time I encounter a way I'm looking for (based on the values of the above dictionary) I could reconstruct a LineString with:
import osmium
import shapely.wkb as wkblib

wkbfab = osmium.geom.WKBFactory()

def getRelationGeometry(elem):
    wkb=wkbfab.create_linestring(elem)
    return wkb
The issue is that some ways appear to have only 1 node, which triggers the following error:
RuntimeError: need at least two points for linestring (way_id=155711655)
So what would be the solution to rebuild a GeoJSON feature (a MultiLineString) from multiple ways, so that I can plot the result on https://geojson.io/#map=2/20.0/0.0 ?
How, for instance, does OpenStreetMap manage to rebuild the track of a relation when I open the link, if not by connecting all nodes (from all ways) of the relation?
Thanks a lot for your help
I know there is a way with bash: first filter the initial pbf to keep only relations with the tag sac_scale=hiking, then transform the filtered result to GeoJSON. But I really want to generate the same with Python, to understand how OSM data are stored. I just can't figure out an easy way to do so; since pyosmium is (supposedly) equivalent to osmium, I believe there should be an easy way there too.
osmium export output/output_food-drinks.pbf -f geojson
Looking at the way with the id shown in the error in your post (155711655), it has two nodes, not one. Visible here as of the time of this answer.
Knowing that, I can think of two reasons why you would get that error:
You're not passing the argument locations=True to the apply_file method, as suggested by the documentation:
Because of the way that OSM data is structured, osmium needs to internally cache node geometries, when the handler wants to process the geometries of ways and areas. The apply_file() method cannot deduce by itself if this cache is needed. Therefore locations need to be explicitly enabled by setting the locations parameter to True:
h.apply_file("test.osm.pbf", locations=True, idx='flex_mem')
Looking at your code above, the apply_file method only has the input file as an argument, so I think this is likely your problem.
The way may be referencing a node that is missing in your pbf extract. This is simple to verify with the osmium cli tool:
osmium check-refs <your pbf file>
This is the result I get from running that on a valid pbf of my own:
There are 6702885 nodes, 431747 ways, and 2474 relations in this file.
Nodes in ways missing: 0
Note the Nodes in ways missing: 0.
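Once node locations are available, the remaining step is assembling the GeoJSON itself. The sketch below assumes the coordinates of each way have already been collected as (lon, lat) tuples; the function name ways_to_feature and the sample coordinates are made up for illustration and do not come from the real relation. Ways with fewer than two points are skipped, since a line needs at least two positions.

```python
import json


def ways_to_feature(relation_id, way_coords):
    """Build one GeoJSON Feature with a MultiLineString geometry.

    way_coords is a list of ways, each a list of (lon, lat) tuples.
    """
    lines = [
        [[lon, lat] for lon, lat in coords]
        for coords in way_coords
        if len(coords) >= 2  # a line needs at least two positions
    ]
    return {
        "type": "Feature",
        "properties": {"relation": relation_id},
        "geometry": {"type": "MultiLineString", "coordinates": lines},
    }


# Made-up sample coordinates, in GeoJSON order (longitude, latitude).
sample_ways = [
    [(174.70, -41.30), (174.71, -41.31), (174.72, -41.32)],
    [(174.72, -41.32), (174.73, -41.33)],
    [(174.80, -41.40)],  # single-node way, dropped
]
feature = ways_to_feature("r104369", sample_ways)
geojson_text = json.dumps({"type": "FeatureCollection", "features": [feature]})
print(geojson_text)
```

The printed text can be pasted into geojson.io to check the result visually.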

How do I pull specific property values from an injected object when adding a batch of edges in a Gremlin/TinkerPop traversal?

I want to add batches of edges to a JanusGraph db that already contains nodes. I want my edges to support setting dynamic/optional properties.
I've cobbled together the following traversal (based on this SO question) that I believe illustrates what I want to do:
1. .inject() a batch of edges
2. Pull to/from vertex ids from the objects in the injected edge batch
3. Set all fields in edge batch objects as edge properties with .sideEffect()
from gremlin_python.process.graph_traversal import __
from gremlin_python.process.traversal import Column

uuid_1 = "89079f8fa3ee849a61a45e0b3e6d28cd"
uuid_2 = "00a9ae430dc812f483b0660212264190"
edge_batch = [
    {
        "from_uuid": uuid_1,
        "to_uuid": uuid_2,
        "posted_at": 1650012568000,
        "test_property_2": "I was here"
    },
    {
        "from_uuid": uuid_2,
        "to_uuid": uuid_1,
        "posted_at": 1650012568888,
        "test_property_3": "I'M STILL HERE"
    }
]
# g is an existing graph traversal source connected to JanusGraph
new_edges = (
    g
    .inject(edge_batch)
    .unfold()
    .as_("edge_batch")
    .V()
    .has("uuid", __.select("edge_batch").select("to_uuid"))
    .as_("to_v")
    .V()
    .has("uuid", __.select("edge_batch").select("from_uuid"))
    .addE("MY_EDGE_TYPE")
    .to("to_v")
    .as_("new_edge")
    .sideEffect(
        __.select("edge_batch")
        .unfold()
        .as_("kvp")
        .select("new_edge")
        .property(
            __.select("kvp").by(Column.keys), __.select("kvp").by(Column.values)
        )
    )
    .iterate()
)
As written, the code above results in a traversal timeout when the referenced vertices exist. If I replace the first two __.select("edge_batch")... expressions above with references to the uuid_1 and uuid_2 variables, the code works. I think my problem is I just can't figure out how I'm supposed to reference properties of the injected, unfolded edge batch objects.
I'm using gremlin-python v3.6.0, JanusGraph v0.6.1, TinkerPop v3.5.1.
Your code runs just fine with a small graph. However, the two unfold() steps applied to a list with two elements make your code run four times, which I guess is unintentional.
As to why the code does not run on your janusgraph installation:
Be sure uuid is an indexed property if your graph is large
Maybe you were confused by vertices with uuid_1 and uuid_2 being present in the janusgraph cache, because .has("uuid", __.select("edge_batch").select("to_uuid")) and .has("uuid", uuid_1) really do the same.

Getting element density from abaqus output database using python scripting

I'm trying to get the element density from the abaqus output database. I know you can request a field output for the volume using 'EVOL', is something similar possible for the density?
I'm afraid it's not because of this: Getting element mass in Abaqus postprocessor
What would be the most efficient way to get the density? Look up, for every element, which section set it belongs to?
Found a solution, I don't know if it's the fastest but it works:
odb_file_path=r'your_path\file.odb'
odb = session.openOdb(name=odb_file_path)
instance = odb.rootAssembly.instances['MY_PART']
material_name = instance.elements[0].sectionCategory.name[8:-2]
density=odb.materials[material_name].density.table[0][0]
note: the 'name' attribute will give you a string like, 'solid MATERIALNAME'. So I just cut out the part of the string that gave me the real material name. So it's the sectionCategory attribute of an OdbElementObject that is the answer.
EDIT: This doesn't seem to work after all, it turns out that it gives all elements the same material name, being the name of the first material.
The properties are associated something like this:
sectionAssignment connects section to set
set is the container for element
section connects sectionAssignment to material
instance is connected to part (could be from a part from another model)
part is connected to model
model is connected to section
Use the .inp or .cae file if you can. The following gets it from an opened cae file. To thoroughly get elements from materials, you would do something like the following, assuming you're starting your search in rootAssembly.instances:
Find the parts which the instances were created from.
Find the models which contain these parts.
Look for all sections with material_name in these parts, and store all the sectionNames associated with this section
Look for all sectionAssignments which reference these sectionNames
Under each of these sectionAssignments, there is an associated region object which has the name (as a string) of an elementSet and the name of a part. Get all the elements from this elementSet in this part.
Cleanup:
Use the Python set object to remove any multiple references to the same element.
Multiply the number of elements in this set by the number of identical part instances that refer to this material in rootAssembly.
E.g., for some cae model variable called model:
model_part_repeats = {}
model_part_elemLabels = {}
for instance in model.rootAssembly.instances.values():
    p = instance.part.name
    m = instance.part.modelName
    try:
        model_part_repeats[(m, p)] += 1
        continue
    except KeyError:
        model_part_repeats[(m, p)] = 1
    # Get all sections in model
    sectionNames = []
    for s in mdb.models[m].sections.values():
        if s.material == material_name: # material_name is already known
            # This is a valid section - search for section assignments
            # in part for this section, and then the associated set
            sectionNames.append(s.name)
    if sectionNames:
        labels = []
        for sa in mdb.models[m].parts[p].sectionAssignments:
            if sa.sectionName in sectionNames:
                eset = sa.region[0]
                labels = labels + [e.label for e in mdb.models[m].parts[p].sets[eset].elements]
        labels = list(set(labels))
        model_part_elemLabels[(m,p)] = labels
    else:
        model_part_elemLabels[(m,p)] = []
num_elements_with_material = sum([model_part_repeats[k]*len(model_part_elemLabels[k]) for k in model_part_repeats])
Finally, grab the material density associated with material_name then multiply it by num_elements_with_material.
Of course, this method will be extremely slow for larger models, and it is more advisable to use string techniques on the .inp file for faster performance.
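As a rough sketch of the string-based route, the following reads the text of an .inp file and maps each element set named in a *Solid Section line to the density of its material. The function name densities_by_elset and the sample input are hypothetical, and the sketch makes simplifying assumptions: only *Solid Section keywords are handled, only the first data line under *Density is read, and the *Elset definitions that list the actual elements are not parsed.

```python
def densities_by_elset(inp_text):
    """Return {elset name: density} for *Solid Section definitions."""
    material_density = {}   # material name -> density
    elset_material = {}     # elset name -> material name
    current_material = None
    expect_density = False  # True when the next data line holds a density

    for raw_line in inp_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("**"):
            continue  # blank line or comment
        if line.startswith("*"):
            # Keyword line: "*Keyword, param=value, ..."
            parts = [part.strip() for part in line[1:].split(",")]
            keyword = parts[0].lower()
            params = {}
            for part in parts[1:]:
                if "=" in part:
                    key, value = part.split("=", 1)
                    params[key.strip().lower()] = value.strip()
            expect_density = False
            if keyword == "material":
                current_material = params.get("name")
            elif keyword == "density":
                expect_density = current_material is not None
            elif keyword == "solid section":
                if "elset" in params and "material" in params:
                    elset_material[params["elset"]] = params["material"]
        elif expect_density:
            # First data line under *Density: "density, [temperature]"
            material_density[current_material] = float(line.split(",")[0])
            expect_density = False

    return {
        elset: material_density[material]
        for elset, material in elset_material.items()
        if material in material_density
    }


# Hypothetical excerpt of an input file.
sample_inp = """\
** Section: Section-1
*Solid Section, elset=Set-1, material=Steel
,
*Material, name=Steel
*Density
7.85e-09,
*Elastic
210000., 0.3
"""
densities = densities_by_elset(sample_inp)
print(densities)
```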

Python friends network visualization

I have hundreds of lists (each list corresponds to 1 person). Each list contains 100 strings, which are the 100 friends of that person.
I want to 3D visualize this people network based on the number of common friends they have. Considering any 2 lists, the more same strings they have, the closer they should appear together in this 3D graph. I wanted to show each list as a dot on the 3D graph without nodes/connections between the dots.
For brevity, I have included only 3 people here.
person1 = ['mike', 'alex', 'arker','locke','dave','david','ross','rachel','anna','ann','darl','carl','karle']
person2 = ['mika', 'adlex', 'parker','ocke','ave','david','rosse','rachel','anna','ann','darla','carla','karle']
person3 = ['mika', 'alex', 'parker','ocke','ave','david','rosse','ross','anna','ann','darla','carla','karle', 'sasha', 'daria']
Gephi Setup steps:
Install Gephi and then start it
You probably want to upgrade all the plugins now, see the button in the lower right corner.
Now create a new project.
Make sure the current workspace is Workspace1
Enable Graph Streaming plugin
In Streaming tab that then appears configure server to use http and port 8080
start the server (it will then have a green dot underneath it instead of a red dot).
Python steps:
install gephistreamer package (pip install gephistreamer)
Copy the following Python code to something like friends.py:
from gephistreamer import graph
from gephistreamer import streamer
import random as rn

stream = streamer.Streamer(streamer.GephiWS(hostname="localhost",port=8080,workspace="workspace1"))
szfak = 100 # this scales up everything - somehow it is needed
cdfak = 3000
nodedict = {}

def addfnode(fname):
    # grab the node out of the dictionary if it is there, otherwise make a new one
    if (fname in nodedict):
        nnode = nodedict[fname]
    else:
        nnode = graph.Node(fname,size=szfak,x=cdfak*rn.random(),y=cdfak*rn.random(),color="#8080ff",type="f")
        nodedict[fname] = nnode # new node into the dictionary
    return nnode

def addnodes(pname,fnodenamelist):
    pnode = graph.Node(pname,size=szfak,x=cdfak*rn.random(),y=cdfak*rn.random(),color="#ff8080",type="p")
    stream.add_node(pnode)
    for fname in fnodenamelist:
        print(pname+"-"+fname)
        fnode = addfnode(fname)
        stream.add_node(fnode)
        pfedge = graph.Edge(pnode,fnode,weight=rn.random())
        stream.add_edge(pfedge)

person1friends = ['mike','alex','arker','locke','dave','david','ross','rachel','anna','ann','darl','carl','karle']
person2friends = ['mika','adlex','parker','ocke','ave','david','rosse','rachel','anna','ann','darla','carla','karle']
person3friends = ['mika','alex','parker','ocke','ave','david','rosse','ross','anna','ann','darla','carla','karle','sasha','daria']
addnodes("p1",person1friends)
addnodes("p2",person2friends)
addnodes("p3",person3friends)
Run it with the command python friends.py
You should see all the nodes appear. There are then lots of ways you can lay it out to make it look better, I am using the Force Atlas layouter here and you can see the parameters I am using on the left.
Some notes:
you can get the labels to show or disappear by clicking on the T on the bottom status/control bar.
View the data in the nodes and edges by opening Window/Data Table.
It is a very rich program, there are more options than you can shake a stick at.
You can set more properties on your nodes and edges in the python code and then they will show up in the data table view and can be used to filter, etc.
You want to pay attention to that update button in the bottom right corner of Gephi, there are a lot of bugs to fix.
This will get you started (as you asked), but for your particular problem:
you will also need to calculate weights for your persons (the "p" nodes), and link them to each other with those weights
Then you need to find a layouter and parameters that position those nodes the way you want them based on the new weights.
So you don't really need to show the type="f" nodes, you need just the "p" nodes.
The weight between two "p" nodes should be based on the intersection of the sets of the friend names.
There are also Gephi plugins that can then display this in 3D, but that is actually a completely separate issue, you probably want to get it working in 2D first.
This is running on Windows 10 using Anaconda 4.4.1 and Python 3.5.2 and Gephi 0.9.1.
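The weight calculation described above, based on the intersection of the friend sets, could be sketched as follows. The function name common_friend_weights is hypothetical, and using the raw count of common friends as the weight is one possible choice among several (a normalised measure would also work).

```python
from itertools import combinations


def common_friend_weights(friends_by_person):
    """Return {(person_a, person_b): number of friends they share}."""
    friend_sets = {person: set(friends) for person, friends in friends_by_person.items()}
    return {
        (a, b): len(friend_sets[a] & friend_sets[b])
        for a, b in combinations(sorted(friend_sets), 2)
    }


friends_by_person = {
    "p1": ['mike','alex','arker','locke','dave','david','ross','rachel','anna','ann','darl','carl','karle'],
    "p2": ['mika','adlex','parker','ocke','ave','david','rosse','rachel','anna','ann','darla','carla','karle'],
    "p3": ['mika','alex','parker','ocke','ave','david','rosse','ross','anna','ann','darla','carla','karle','sasha','daria'],
}
weights = common_friend_weights(friends_by_person)
print(weights)
# {('p1', 'p2'): 5, ('p1', 'p3'): 6, ('p2', 'p3'): 11}
```

Each resulting pair and weight can then be streamed as an edge between two "p" nodes, in the same way the script above streams person-to-friend edges.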

Follow-up on iterating over a graph using XML minidom

This is a follow-up to the question (Link)
What I intend to do is use the XML to create a graph using NetworkX. Looking at the DOM structure below, all author nodes within the same paper node should have an edge between them, and all authors that have attended the same conference should have an edge to that conference. To summarize, all authors that worked together on a paper should be connected to each other, and all authors who have attended a particular conference should be connected to that conference.
<conference name="CONF 2009">
<paper>
<author>Yih-Chun Hu(UIUC)</author>
<author>David McGrew(Cisco Systems)</author>
<author>Adrian Perrig(CMU)</author>
<author>Brian Weis(Cisco Systems)</author>
<author>Dan Wendlandt(CMU)</author>
</paper>
<paper>
<author>Dan Wendlandt(CMU)</author>
<author>Ioannis Avramopoulos(Princeton)</author>
<author>David G. Andersen(CMU)</author>
<author>Jennifer Rexford(Princeton)</author>
</paper>
</conference>
I've figured out how to connect authors to conferences, but I'm unsure about how to connect authors to each other. The thing that I'm having difficulty with is how to iterate over the authors that have worked on the same paper and connect them together.
dom = parse(filepath)
conference=dom.getElementsByTagName('conference')
for node in conference:
    conf_name=node.getAttribute('name')
    print(conf_name)
    G.add_node(conf_name)
    #The nodeValue is split in order to get the name of the author
    #and to exclude the university they are part of
    plist=node.getElementsByTagName('paper')
    for p in plist:
        author=str(p.childNodes[0].nodeValue)
        author= author.split("(")
        #Figure out a way to create edges between authors in the same <paper> </paper>
    alist=node.getElementsByTagName('author')
    for a in alist:
        authortext= str(a.childNodes[0].nodeValue).split("(")
        if authortext[0] in dict:
            edgeQuantity=dict[authortext[0]]
            edgeQuantity+=1
            dict[authortext[0]]=edgeQuantity
            G.add_edge(authortext[0],conf_name)
        #Otherwise, add it to the dictionary and create an edge to the conference.
        else:
            dict[authortext[0]]= 1
            G.add_node(authortext[0])
            G.add_edge(authortext[0],conf_name)
        i+=1
I'm unsure about how to connect authors to each other.
You need to generate (author, otherauthor) pairs so you can add them as edges. The typical way to do that would be a nested iteration:
for thing in things:
    for otherthing in things:
        add_edge(thing, otherthing)
This is a naïve implementation that includes self-loops (giving an author an edge connecting himself to himself), which you may or may not want; it also includes both (1,2) and (2,1), which is redundant if you're building an undirected graph. (In Python 2.6, the itertools.permutations generator also yields both orderings.) Here's a generator that fixes these things:
def pairs(l):
    for i in range(len(l)-1):
        for j in range(i+1, len(l)):
            yield l[i], l[j]
I've not used NetworkX, but looking at the doc it seems to say you can call add_node on the same node twice (with nothing happening the second time). If so, you can discard the dict you were using to try to keep track of what nodes you'd inserted. Also, it seems to say that if you add an edge to an unknown node, it'll add that node for you automatically. So it should be possible to make the code much shorter:
for conference in dom.getElementsByTagName('conference'):
    conf_name= conference.getAttribute('name')
    for paper in conference.getElementsByTagName('paper'):
        authors= paper.getElementsByTagName('author')
        auth_names= [author.firstChild.data.split('(')[0] for author in authors]
        # Note author's conference attendance
        #
        for auth_name in auth_names:
            G.add_edge(auth_name, conf_name)
        # Note combinations of authors working on same paper
        #
        for auth_name, other_name in pairs(auth_names):
            G.add_edge(auth_name, other_name)
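For reference, the standard library's itertools.combinations yields the same unordered pairs as the pairs() generator above. The sketch below is self-contained: it parses a shortened copy of the sample XML and collects the edges in a plain set rather than a NetworkX graph, so the name collect_edges and the set-based storage are illustrative assumptions only.

```python
from itertools import combinations
from xml.dom.minidom import parseString

SAMPLE_XML = """<conference name="CONF 2009">
<paper>
<author>Yih-Chun Hu(UIUC)</author>
<author>David McGrew(Cisco Systems)</author>
</paper>
<paper>
<author>Dan Wendlandt(CMU)</author>
</paper>
</conference>"""


def collect_edges(xml_text):
    """Return a set of (author, conference) and (author, co-author) edges."""
    edges = set()
    dom = parseString(xml_text)
    for conference in dom.getElementsByTagName("conference"):
        conf_name = conference.getAttribute("name")
        for paper in conference.getElementsByTagName("paper"):
            names = [
                author.firstChild.data.split("(")[0].strip()
                for author in paper.getElementsByTagName("author")
            ]
            # Each author attended the conference.
            for name in names:
                edges.add((name, conf_name))
            # Each unordered pair of co-authors, with no self-loops.
            for first, second in combinations(names, 2):
                edges.add((first, second))
    return edges


edges = collect_edges(SAMPLE_XML)
for edge in sorted(edges):
    print(edge)
```

With NetworkX, each collected tuple could be passed to G.add_edge in place of edges.add.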
I'm not entirely sure what you're looking for, but based on your description I threw together a graph which I think encapsulates the relationships you describe.
http://imgur.com/o2HvT.png
I used OpenFst to do this. I find it much easier to clearly lay out the graphical relationships before plunging into the code for something like this.
Also, do you actually need to generate an explicit edge between authors? This seems like a traversal issue.
