Python: create graph based on degree correlation

I want to create a graph using networkx which has positive or negative degree correlation.
Like a graph for a social network or citations in academic papers etc.
Can you suggest some function for this?

If you are talking about producing a visual graph (a diagram), you could look at using matplotlib to generate it. I'm not sure there is a single function that will do what you want (the question doesn't give enough detail), but it's a comprehensive library used in many projects for complex graphing tasks.
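If the goal is the graph structure itself rather than a drawing, one possible approach in networkx is to generate any graph and then rewire it while keeping every node's degree fixed. The sketch below is only an illustration under stated assumptions: `rewire_for_degree_correlation` is a hypothetical helper (not a networkx function) in the spirit of the Xulvi-Brunet and Sokolov rewiring scheme, and the graph size, swap count and seeds are arbitrary. `nx.degree_assortativity_coefficient` is used to report the resulting degree correlation.

```python
import random

import networkx as nx


def rewire_for_degree_correlation(G, assortative=True, n_swaps=2000, seed=0):
    """Hypothetical helper: degree-preserving rewiring of a simple graph.

    Picks two edges at random, sorts their four endpoints by degree, and
    reconnects them so that similar degrees are paired (assortative) or
    dissimilar degrees are paired (disassortative).
    """
    rng = random.Random(seed)
    H = G.copy()
    deg = dict(H.degree())  # degrees never change under these swaps
    for _ in range(n_swaps):
        (a, b), (c, d) = rng.sample(list(H.edges()), 2)
        nodes = [a, b, c, d]
        if len(set(nodes)) < 4:
            continue  # the two edges share a node; skip to avoid self-loops
        lo1, lo2, hi1, hi2 = sorted(nodes, key=deg.get)
        if assortative:
            new_edges = [(lo1, lo2), (hi1, hi2)]
        else:
            new_edges = [(lo1, hi2), (lo2, hi1)]
        if any(H.has_edge(u, v) for u, v in new_edges):
            continue  # would create a duplicate edge (or change nothing)
        H.remove_edges_from([(a, b), (c, d)])
        H.add_edges_from(new_edges)
    return H


# Arbitrary starting graph; any simple undirected graph would do.
G = nx.barabasi_albert_graph(200, 2, seed=1)
G_pos = rewire_for_degree_correlation(G, assortative=True)
G_neg = rewire_for_degree_correlation(G, assortative=False)

for name, g in [("original", G), ("assortative", G_pos), ("disassortative", G_neg)]:
    print(name, round(nx.degree_assortativity_coefficient(g), 3))
```

Because each swap removes one edge from and adds one edge to each of the four nodes involved, the degree sequence of the starting graph is preserved; only who is connected to whom changes.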

Related

How to calculate forces between individual atoms in OpenMM

I am new to OpenMM and I would appreciate some guidance on the following matter:
Currently I am not interested in running molecular dynamics simulations. For starters, I would just like to compute the forces or free energies between individual pairs of atoms using, for example, OpenMM's AMBER force field. Essentially, I would like to end up with a heat map that represents the forces between atom pairs, something like this:
Where numbers represent strength of the force or value of free energy.
I have trouble finding out how to access such lower level functionality of OpenMM where I could write a custom script that calculates only desired forces provided the 3D coordinates of atoms and their types. In their tutorials I have just found how to run fully fledged simulations by providing force field data and PDB files of molecular systems.
Preferably I would like to achieve this with python.
Any concrete example or guidance is much appreciated.
I have found an answer in OpenMM's issue tracker on GitHub.
In short: there is no API to achieve exactly that in OpenMM, because what I am trying to do is not well defined from a purely physical/chemical perspective. My best bet is to compute something that looks like an energy based only on pairwise inter-atom distances, which can be queried from an OpenMM state like this (as suggested in the discussion referenced above):
from openmm.unit import nanometer  # simtk.unit in OpenMM releases before 7.6
state = simulation.context.getState(getPositions=True)
positions = state.getPositions(asNumpy=True).value_in_unit(nanometer)
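Once the positions are available as a plain array, the pairwise distances can be computed with NumPy alone. The following is a minimal sketch, assuming `positions` is an `(n_atoms, 3)` array in nanometers; the coordinates here are made-up stand-ins so the snippet runs without OpenMM, and the `score` matrix is a hypothetical distance-based quantity, not a physical energy or force.

```python
import numpy as np

# Stand-in for the (n_atoms, 3) array obtained from the OpenMM state;
# these coordinates are invented for illustration only.
positions = np.array([
    [0.00, 0.00, 0.00],
    [0.10, 0.00, 0.00],
    [0.00, 0.20, 0.00],
    [0.30, 0.40, 0.00],
])

# Pairwise displacement vectors via broadcasting: shape (n, n, 3).
diff = positions[:, None, :] - positions[None, :, :]

# Euclidean distance matrix in nanometers: shape (n, n), zero on the diagonal.
distances = np.linalg.norm(diff, axis=-1)

# Hypothetical pair score that decays with distance (NOT a physical energy).
# The diagonal is set to zero to avoid dividing by a zero self-distance.
with np.errstate(divide="ignore"):
    score = np.where(distances > 0, 1.0 / distances, 0.0)

print(distances)
```

A matrix like `distances` or `score` could then be passed to a plotting function such as matplotlib's `imshow` to get the heat map described in the question.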

Graph matching algorithms

I've been searching for graph matching algorithms written in Python but I haven't been able to find much.
I'm currently trying to match two different graphs that derive from two distinct sets of character sequences. I know that there is an underlying connection between the two graphs, more precisely a one-to-one mapping between the nodes. But the graphs don't have the same labels and as such I need graph matching algorithms that return nodes mappings just by comparing topology and/or attributes. By testing, I hope to maximize correct matches.
I've been using Blondel and Heymans from the graphsim package and intend to also use Tacsim from the same package.
I would like to test other options, probably more standard, like maximum subgraph isomorphism or finding subgraphs with very good matchings between the two graphs. Graph edit distance might also help if it manages to give a matching.
The problem is that I can't find anything implemented, even in Networkx that I'm using. Does anyone know of any Python implementations? Would be a plus if those options used Networkx.
I found this implementation of Graph Edit Distance algorithms which uses NetworkX in Python.
https://github.com/Jacobe2169/GMatch4py
"GMatch4py is a library dedicated to graph matching. Graph structure are stored in NetworkX graph objects. GMatch4py algorithms were implemented with Cython to enhance performance."

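For small graphs, networkx itself also ships some matching tools that return node mappings: `isomorphism.GraphMatcher` for exact (sub)graph isomorphism and `nx.optimal_edit_paths` for graph edit distance. The sketch below is a toy illustration, assuming two graphs with identical topology and unrelated labels; the graphs and labels are made up, and edit-distance search is exponential in the worst case, so it is unlikely to scale to large inputs.

```python
import networkx as nx
from networkx.algorithms import isomorphism

# Toy graph: a triangle with a two-node tail.
G1 = nx.Graph([("a", "b"), ("b", "c"), ("c", "a"), ("c", "d"), ("d", "e")])
# Same topology under different, unrelated labels.
G2 = nx.relabel_nodes(G1, {"a": 10, "b": 20, "c": 30, "d": 40, "e": 50})

# Exact matching: recover a node mapping from topology alone.
matcher = isomorphism.GraphMatcher(G1, G2)
if matcher.is_isomorphic():
    mapping = matcher.mapping  # dict: node in G1 -> node in G2
    print(mapping)

# Inexact matching: graph edit distance also yields a node correspondence.
paths, cost = nx.optimal_edit_paths(G1, G2)
node_edits, edge_edits = paths[0]  # one of possibly several optimal paths
print(cost, node_edits)
```

Node and edge attributes can be taken into account through the `node_match` and `edge_match` arguments that both tools accept.
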
Simple phase diagram using pymatgen

I've been getting familiar with the pymatgen package and need to make phase diagrams. There's a quick tutorial on this web page that goes through how to make a ternary diagram, but I actually want to make a much simpler one of a pure substance.
I have in mind something like this. I've gone through the documentation and done a lot of Google searches but haven't been able to find what I'm looking for. Perhaps it's possible to combine the data from pymatgen with a graphing package like matplotlib?
T-P phase diagrams like those show phase stability with pressure and temperature as the independent variables. The data on the Materials Project was calculated using density functional theory (DFT) at a temperature of 0 K and a pressure of 0 Pa, so unfortunately it's not possible to create a T-P phase diagram from the MP data.

Using Python to generate a connection/network graph

I have a text file with about 8.5 million data points in the form:
Company 87178481
Company 893489
Company 2345788
[...]
I want to use Python to create a connection graph to see what the network between companies looks like. From the above sample, two companies would share an edge if the value in the second column is the same (clarification from/for Hooked).
I've been using the NetworkX package and have been able to generate a network for a few thousand points, but it's not making it through the full 8.5 million-node text file. I ran it and left for about 15 hours, and when I came back, the cursor in the shell was still blinking, but there was no output graph.
Is it safe to assume that it was still running? Is there a better/faster/easier approach to graph millions of points?
If you have 1000K points of data, you'll need some way of looking at the broad picture. Depending on what exactly you are looking for, if you can assign a "distance" between companies (say, the number of connections apart), you can visualize relationships (or clustering) via a dendrogram.
Scipy does clustering:
http://docs.scipy.org/doc/scipy/reference/cluster.hierarchy.html#module-scipy.cluster.hierarchy
and has a function to turn them into dendrograms for visualization:
http://docs.scipy.org/doc/scipy/reference/generated/scipy.cluster.hierarchy.dendrogram.html#scipy.cluster.hierarchy.dendrogram
An example for a shortest path distance function via networkx:
http://networkx.lanl.gov/reference/generated/networkx.algorithms.shortest_paths.generic.shortest_path.html#networkx.algorithms.shortest_paths.generic.shortest_path
Ultimately you'll have to decide how you want to weight the distance between two companies (vertices) in your graph.
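The pieces mentioned above can be combined roughly as follows. This is a minimal sketch, assuming the distance between two companies is the unweighted shortest-path length; the karate club graph is only a small connected stand-in for the real data, and the "average" linkage method is an arbitrary choice. An all-pairs distance matrix grows quadratically, so this would only be practical on a reduced graph.

```python
import networkx as nx
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage
from scipy.spatial.distance import squareform

# Small connected stand-in graph; the real company graph would replace it.
G = nx.karate_club_graph()
nodes = list(G.nodes())

# Hop-count distance between every pair of nodes.
lengths = dict(nx.all_pairs_shortest_path_length(G))
D = np.array([[lengths[u][v] for v in nodes] for u in nodes], dtype=float)

# scipy's linkage expects a condensed (upper-triangle) distance vector.
condensed = squareform(D)
Z = linkage(condensed, method="average")

# no_plot=True returns the tree layout without requiring matplotlib.
tree = dendrogram(Z, no_plot=True, labels=nodes)
print(tree["ivl"][:10])
```

Dropping `no_plot=True` draws the dendrogram with matplotlib instead of only returning its layout.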
You have too many data points, and if you did visualize the whole network it wouldn't make any sense. You need ways to 1) reduce the number of companies by removing those that are less important or less connected, and 2) summarize the graph somehow and then visualize it.
To reduce the size of the data, it might be better to create the network independently (using your own code to create an edge list of companies). This way you can reduce the size of your graph (for example by removing singletons, of which there may be many).
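Building the edge list by hand might look like the sketch below. It assumes each line has the form "name value" and that two companies share an edge when their values match, as stated in the question; the company names and sample lines are made up. Values that occur only once produce no edges, which is how singletons drop out.

```python
from collections import defaultdict
from itertools import combinations

# Made-up sample in the "name value" format from the question.
lines = [
    "CompanyA 87178481",
    "CompanyB 893489",
    "CompanyC 87178481",
    "CompanyD 2345788",
    "CompanyE 87178481",
]

# Group company names by the value in the second column.
by_value = defaultdict(set)
for line in lines:
    name, value = line.rsplit(maxsplit=1)
    by_value[value].add(name)

# Emit one edge per pair of companies that share a value.
edges = set()
for companies in by_value.values():
    if len(companies) < 2:
        continue  # a value seen once connects nobody
    edges.update(combinations(sorted(companies), 2))

print(sorted(edges))
```

Note that a value shared by k companies yields k(k-1)/2 edges, so very common values may need to be capped or handled separately before the edge list is passed to a graph library.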
For summarization I recommend running a clustering or a community detection algorithm. This can be done very fast even for very large networks. Use the "fastgreedy" method in the igraph package: http://igraph.sourceforge.net/doc/R/fastgreedy.community.html
(There is also a faster algorithm, by Blondel et al., described at http://perso.uclouvain.be/vincent.blondel/publications/08BG.pdf and I know their code is available online somewhere.)
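For readers staying in Python, networkx offers `greedy_modularity_communities`, which implements the same Clauset-Newman-Moore greedy modularity method as igraph's "fastgreedy". The sketch below swaps in networkx for igraph and uses the karate club graph as a small stand-in, so it illustrates the call rather than the performance on millions of nodes.

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# Small stand-in graph; the reduced company graph would replace it.
G = nx.karate_club_graph()

# Returns a list of frozensets, one per detected community.
communities = greedy_modularity_communities(G)

for i, members in enumerate(communities):
    print(i, sorted(members))
```

Each community could then be collapsed into a single node to obtain a summary graph that is small enough to draw.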

Python: How to generate a power law graph

I have installed the networkx and matplotlib packages. How can I generate a power-law graph based on degree correlation, i.e. graphs with a high or low degree of homophily?
Have you looked at the examples on the Networkx site? This example might help you get started with this.
There are also a number of functions within Networkx which generate random graphs which will probably be helpful. Have a look for the random_powerlaw_tree(....) function detailed in the graph generators section of the documentation.
networkx.generators.barabasi_albert_graph will generate a graph according to the Barabasi-Albert model, which will have a power-law degree distribution.
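As a rough illustration of the generator mentioned above, the sketch below builds a Barabasi-Albert graph and tabulates its degree distribution. The values of `n`, `m` and `seed` are arbitrary assumptions; the degree assortativity coefficient is printed as one way to check how degree-correlated the result is.

```python
from collections import Counter

import networkx as nx

# n nodes, each new node attaching to m existing nodes (values are arbitrary).
n, m = 1000, 3
G = nx.barabasi_albert_graph(n, m, seed=42)

# Degree distribution: how many nodes have each degree.
degree_counts = Counter(d for _, d in G.degree())
for k in sorted(degree_counts)[:10]:
    print(k, degree_counts[k])

# Degree correlation of the generated graph.
print("degree assortativity:", nx.degree_assortativity_coefficient(G))
```

Plotting `degree_counts` on log-log axes with matplotlib is a common way to inspect the heavy tail visually.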
