Python friends network visualization

Python friends network visualization - python

I have hundreds of lists (each list corresponds to 1 person). Each list contains 100 strings, which are the 100 friends of that person.
I want to 3D visualize this people network based on the number of common friends they have. Considering any 2 lists, the more same strings they have, the closer they should appear together in this 3D graph. I wanted to show each list as a dot on the 3D graph without nodes/connections between the dots.
For brevity, I have included only 3 people here.
person1 = ['mike', 'alex', 'arker','locke','dave','david','ross','rachel','anna','ann','darl','carl','karle']
person2 = ['mika', 'adlex', 'parker','ocke','ave','david','rosse','rachel','anna','ann','darla','carla','karle']
person3 = ['mika', 'alex', 'parker','ocke','ave','david','rosse','ross','anna','ann','darla','carla','karle', 'sasha', 'daria']

Gephi Setup steps:
Install Gephi and then start it
You probably want to upgrade all the plugins now, see the button in the lower right corner.
Now create a new project.
Make sure the current workspace is Workspace1
Enable Graph Streaming plugin
In Streaming tab that then appears configure server to use http and port 8080
start the server (it will then have a green dot underneath it instead of a red dot).
Python steps:
install gephistreamer package (pip install gephistreamer)
Copy the following python cod to something like friends.py:
from gephistreamer import graph
from gephistreamer import streamer
import random as rn
stream = streamer.Streamer(streamer.GephiWS(hostname="localhost",port=8080,workspace="workspace1"))
szfak = 100 # this scales up everything - somehow it is needed
cdfak = 3000
nodedict = {}
def addfnode(fname):
# grab the node out of the dictionary if it is there, otherwise make a newone
if (fname in nodedict):
nnode = nodedict[fname]
else:
nnode = graph.Node(fname,size=szfak,x=cdfak*rn.random(),y=cdfak*rn.random(),color="#8080ff",type="f")
nodedict[fname] = nnode # new node into the dictionary
return nnode
def addnodes(pname,fnodenamelist):
pnode = graph.Node(pname,size=szfak,x=cdfak*rn.random(),y=cdfak*rn.random(),color="#ff8080",type="p")
stream.add_node(pnode)
for fname in fnodenamelist:
print(pname+"-"+fname)
fnode = addfnode(fname)
stream.add_node(fnode)
pfedge = graph.Edge(pnode,fnode,weight=rn.random())
stream.add_edge(pfedge)
person1friends = ['mike','alex','arker','locke','dave','david','ross','rachel','anna','ann','darl','carl','karle']
person2friends = ['mika','adlex','parker','ocke','ave','david','rosse','rachel','anna','ann','darla','carla','karle']
person3friends = ['mika','alex','parker','ocke','ave','david','rosse','ross','anna','ann','darla','carla','karle','sasha','daria']
addnodes("p1",person1friends)
addnodes("p2",person2friends)
addnodes("p3",person3friends)
Run it with the command python friends.py
You should see all the nodes appear. There are then lots of ways you can lay it out to make it look better, I am using the Force Atlas layouter here and you can see the parameters I am using on the left.
Some notes:
you can get the labels to show or disappear by clicking on the T on the bottom status/control bar.
View the data in the nodes and edges by opening Window/Data Table.
It is a very rich program, there are more options than you can shake a stick at.
You can set more properties on your nodes and edges in the python code and then they will show up in the data table view and can be used to filter, etc.
You want to pay attention to that update button in the bottom right corner of Gephi, there are a lot of bugs to fix.
This will get you started (as you asked), but for your particular problem:
you will also need to calculate weights for your persons (the "p" nodes), and link them to each other with those weights
Then you need to find a layouter and paramters that positions those nodes the way you want them based on the new weights.
So you don't really need to show the type="f" nodes, you need just the "p" nodes.
The weight between to "p" nodes should be based on the intersection of the sets of the friend names.
There are also Gephi plugins that can then display this in 3D, but that is actually a completely separate issue, you probably want to get it working in 2D first.
This is running on Windows 10 using Anaconda 4.4.1 and Python 3.5.2 and Gephi 0.9.1.

Related

Retrieving node locations from pydotplus (or any layered graph drawing engine)

I'm preparing a layered graph drawing using a dataframe containing node data:
type label
0 Class Insurance Product
1 Class Person
2 Class Address
3 Class Insurance Policy
And another containing relationship data:
froml tol rel fromcard tocard
0 Insurance Policy Insurance Product ConveysProduct One One
1 Person Insurance Policy hasPolicy One Many
2 Person Address ResidesAt None None
I populate a pydotplus dot graph with the content, which I can then use to generate a rendering:
pdp_graph = pydotplus.graphviz.Dot(graph_name="pdp_graph", graph_type='digraph', prog="dot")
for i,e in b_rels_df.iterrows():
edge = pydotplus.graphviz.Edge(src=e['froml'], dst=e['tol'], label=e['rel'])#, set_fromcard=e['fromcard'], set_tocard=e['tocard'])
pdp_graph.add_edge(edge)
for i,n in ents_df.iterrows():
node = pydotplus.graphviz.Node(name=n['label'], set_type=n['type'], set_label=n['label'])
pdp_graph.add_node(node)
png = pdp_graph.create_png()
display(Image(png))
So far so good - but now I want to retrieve the node positions for use in my own interactive layout (the png is a nice example/diagram, but I want to build upon it), so am attempting to retrieve the node locations calculated via:
[n.get_pos() for n in pdp_graph.get_nodes()]
But this only returns:
> [None, None, None, None]
I've tried lots of different methods, graphviz/dot are installed fine - as proven by the image of the layout - how can I extract the positions of the nodes as data from any type of dot-style layout?
There is a way I can do this via the pygraphviz library via networkx, but the installation-overhead restricts me (pygraphviz needs to be recompiled to cinch with the graphviz install) from being able to use that for the target installations where I've less control over the base environments, hence my attempt to use pydotplus, which appears less demanding in terms of install requirements.
How do I retrieve the layout data from a layered graph drawing using this setup (or one similar), such that I can use it elsewhere? I'm looking for x,y values that I can map back to the nodes that they belong to.

I know nothing about your python setup. ( As usual with python it seems awkward and restrictive )
I suggest using Graphviz directly. In particular the 'attributed DOT format' which is a plain text file containing the layout ( resulting from the layout engine ) produced when the engine is run with the command option -Tdot. The text file is easily parsed to get exactly what you need.
Here is a screenshot of the first paragraph of the relevant documentation
The graphviz.org website contains all the additional details you may need.

You are creating an output file of png format. All position data is lost in this process. Instead, create output format="dot". Then, read that back in & modify as desired.

How do I pull specific property values from an injected object when adding a batch of edges in a Gremlin/TinkerPop traversal?

I want to add batches of edges to a JanusGraph db that already contains nodes. I want my edges to support setting dynamic/optional properties.
I've cobbled together the following traversal (based on this SO question) that I believe illustrates what I want to do:
1..inject() a batch of edges
2. Pull to/from vertex ids from the objects in the injected edge batch
3. Set all fields in edge batch objects as edge properties with .sideEffect()
uuid_1 = "89079f8fa3ee849a61a45e0b3e6d28cd"
uuid_2 = "00a9ae430dc812f483b0660212264190"
edge_batch = [
{
"from_uuid": uuid_1,
"to_uuid": uuid_2,
"posted_at": 1650012568000,
"test_property_2": "I was here"
},
{
"from_uuid": uuid_2,
"to_uuid": uuid_1,
"posted_at": 1650012568888,
"test_property_3": "I'M STILL HERE"
}
]
new_edges = (
g
.inject(edge_batch)
.unfold()
.as_("edge_batch")
.V()
.has("uuid", __.select("edge_batch").select("to_uuid"))
.as_("to_v")
.V()
.has("uuid", __.select("edge_batch").select("from_uuid"))
.addE("MY_EDGE_TYPE")
.to("to_v")
.as_("new_edge")
.sideEffect(
__.select("edge_batch")
.unfold()
.as_("kvp")
.select("new_edge")
.property(
__.select("kvp").by(Column.keys), __.select("kvp").by(Column.values)
)
)
.iterate()
)
As written, the code above results in a traversal timeout when the referenced vertices exist. If I replace the first two __.select("edge_batch")... expressions above with references to the uuid_1 and uuid_2 variables, the code works. I think my problem is I just can't figure out how I'm supposed to reference properties of the injected, unfolded edge batch objects.
I'm using gremlin-python v3.6.0, JanusGraph v0.6.1, TinkerPop v3.5.1.

Your code runs just fine with a small graph. Only, the two unfolds from a list with two elements makes your code run four times, unintentionally I guess.
As to why the code does not run on your janusgraph installation:
Be sure uuid is an indexed property if your graph is large
Maybe you were confused by vertices with uuid_1 and uuid_2 being present in the janusgraph cache, because .has("uuid", __.select("edge_batch").select("to_uuid")) and .has("uuid", uuid_1) really do the same.

Django finding paths between two vertexes in a graph

This is mostly a logical question, but the context is done in Django.
In our Database we have Vertex and Line Classes, these form a (neural)network, but it is unordered and I can't change it, it's a Legacy Database
class Vertex(models.Model)
code = models.AutoField(primary_key=True)
lines = models.ManyToManyField('Line', through='Vertex_Line')
class Line(models.Model)
code = models.AutoField(primary_key=True)
class Vertex_Line(models.Model)
line = models.ForeignKey(Line, on_delete=models.CASCADE)
vertex = models.ForeignKey(Vertex, on_delete=models.CASCADE)
Now, in the application, the user will be able to visually select TWO vertexes (the green circles below)
The javascript will then send the pk of these two Vertexes to Django, and it has to find the Line classes that satisfy a route between them, in this case, the following 4 red Lines :
Business Logic:
A Vertex can have 1-4 Lines related to it
A Line can have 1-2 Vertexes related to it
There will only be one possible route between two Vertexes
What I have so far:
I understand that the answer probably includes recursion
The path must be found by trying every path from one Vertex untill the other is find, it can't be directly found
Since there are four and three-way junctions, all the routes being tried must be saved throughout the recursion(unsure of this one)
I know the basic logic is looping through all the lines of each Vertex, and then get the other Vertex of these lines, and keep walking recursively, but I really don't know where to start on this one.
This is as far as I could get, but it probably does not help (views.py) :
def findRoute(request):
data = json.loads(request.body.decode("utf-8"))
v1 = Vertex.objects.get(pk=data.get('v1_pk'))
v2 = Vertex.objects.get(pk=data.get('v2_pk'))
lines = v1.lines.all()
routes = []
for line in lines:
starting_line = line
#Trying a new route
this_route_index = len(routes)
routes[this_route_index] = [starting_line.pk]
other_vertex = line.vertex__set.all().exclude(pk=v1.pk)
#There are cases with dead-ends
if other_vertex.length > 0:
#Mind block...

As you has pointed, this is not a Django/Python related question, but a logical/algorithmic matter.
To find paths between two vertexes in a graph you can use lot of algorithms: Dijkstra, A*, DFS, BFS, Floyd–Warshall etc.. You can choose depending on what you need: shortest/minimum path, all paths...
How to implement this in Django? I suggest to don't apply the algorithm over the models itself, since this could be expensive (in term of time, db queries, etc...) specially for large graphs; instead, I'd rather to map the graph in an in-memory data structure and execute the algorithm over it.
You can take a look to this Networkx, which is a very complete (data structure + algorithms) and well documented library; python-graph, which provides a suitable data structure and a whole set of important algorithms (including some of the mentioned above). More options at Python Graph Library

maya component id data

Does anyone know how to obtain the data in maya called the component ID of the vertices.
I know how to get the vert number but the component ID on the vert is something that changes as the model has been changed.
It seems that there is data in the vertex but just can't find any command to extract it. Any help would be helpful.
I even tried using the maya api but this also just seem to give me the vertices index number and not the actual ID (which is not a sequence as the vertices indexes)
Thanks

try that,
import maya.OpenMaya as om
sel = om.MSelectionList()
om.MGlobal.getActiveSelectionList(sel)
dag = om.MDagPath()
comp = om.MObject()
sel.getDagPath(0, dag, comp)
itr = om.MItMeshFaceVertex(dag, comp)
print '| %-15s| %-15s| %-15s' % ('Face ID', 'Object VertID', 'Face-relative VertId')
while not itr.isDone():
print '| %-15s| %-15s| %-15s' % (itr.faceId(), itr.vertId(), itr.faceVertId())
itr.next()
there are many solutions, i find this one.... src: Link

Maya components do not have persistent identities; the 'vertex id' is just the index of one entry in the vertex table (or the tables for faces, normals, etc). That's why its so easy to mess up a model with construction history if you go 'upstream' and change things that affect the model's component count or topology.
You can attach persistent data to a vertex using the PolyBlindData system, which attaches arbitrary info to faces, vertices or edges. You could attach data to a particular vertex and the data would probably survive, though the same considerations which can mess up things like vert colors or UVs when construction history changes upstream will also mess with blind data.

ArcMap Data Driven Pages Dynamic Feature Labels

I am trying to work out a way to change between two sets of labels on a map. I have a map with zip codes that are labeled and I want to be able to output two maps: one with the zip code label (ZIP) and one with a value from a field I have joined to the data (called chrlabel). The goal is to have one map showing data for each zip code, and a second map giving the zip code as reference.
My initial attempt which I can't get working looks like this:
1) I added a second data frame to my map and add a new layer that contains two polygons with names "zip" and "chrlabel".
2) I use this frame to enable data driven pages and then I hide it behind the primary frame (I don't want to see those polygons, I just want to use them to control the data driven pages).
3) In the zip code labels I tried to write a VBScript expression like this pseudo-code:
test = "
If test = "zip" then
label = ZIP
else
label = CHRLABEL
endif
This does not work because the dynamic text does not resolve to the page name in the VBScript.
Is there some way to call the page name in VBScript so that I can make this work?
If not, is there another way to do this?
My other thought is to add another field to the layer that gets filled with a one or a zero. Then I could replace the if-then test condition with if NewField = 1.
Then I would just need to write a script that updates all the NewFields for the zipcode features when the data driven page advances to the second page. Is there a way to trigger a script (python or other) when a data driven page changes?
Thanks

8 months too late, but for posterity...
You're making things hard on yourself - it would be much easier to set up a duplicate layer and use different layers, then adjust layer visibility. I'm not familiar with VBScript for this sort of thing, but in Python (using ESRI's library) it would look something like so [python 2.6, ArcMap 10 - sample only, haven't debugged this but I do similar things quite often]:
from arcpy import mapping
## Load the map from disk
mxdFilePath = "C:\\GIS_Maps_Folder\\MyMap.mxd"
mapDoc = mapping.MapDocument(mxdFilePath)
## Load map elements
dataFrame = mapping.ListDataFrames(mapDoc)[0] #assumes you want the first dataframe; you can also search by name
mxdLayers = mapping.ListLayers(dataFrame)
## Adjust layers
for layer in mxdLayers:
if (layer.name == 'zip'):
zip_lyr = layer
elif(layer.name == 'sample_units'):
labels_lyr = layer
## Print zip code map
zip_lyr.visible = True
zip_lyr.showLabels = True
labels_lyr.visible = False
labels_lyr.showLabels = False
zip_path = "C:\\Output_Folder\\Zips.pdf"
mapping.ExportToPDF(mapDoc, zip_path, layers_attributes="NONE", resolution=150)
## Print labels map
zip_lyr.visible = False
zip_lyr.showLabels = False
labels_lyr.visible = True
labels_lyr.showLabels = True
labels_path = "C:\\Output_Folder\\Labels.pdf"
mapping.ExportToPDF(mapDoc, labels_path, layers_attributes="NONE", resolution=150)
## Combine files (if desired)
pdfDoc = mapping.PDFDocumentCreate("C:\\Output_Folder\\Output.pdf"")
pdfDoc.appendPages(zip_path)
pdfDoc.appendPages(labels_path)
pdfDoc.saveAndClose()
As far as the Data Driven Pages go, you can export them all at once or in a loop, and adjust whatever you want, although I'm not sure why you'd need to if you use something similar to the above. The ESRI documentation and examples are actually quite good on this. (You should be able to get to all the other Python documentation pretty easily from that page.)

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.