Visualize a clickable graph in an HTML page - python

I have defined a data structure using pygraph. I can display that data easily as PNG using graphviz.
I would like to display the data graphically, but making each node in the graph clickable. Each node must be linked to a different web page. How would I approach this?
What I would like is:
Assign an href for each node
Display all the graph as image
Make each node in the image clickable
Tooltips for hover event: whenever the cursor is positioned on top of an edge / node, a tooltip should be displayed

I believe graphviz can already output an image as a map for use in html.
Check this doc or this one for how to tell graphviz to output a coordinate map. It will even append the URL you specify, and there is even a version that uses only rectangles for mapping links.
Edit:
You can also check this document by LanyeThomas which outlines the basic steps:
Create a GraphViz dot file with the required linking information, maybe with a script.
Run GraphViz once to generate an image of the graph.
Run GraphViz again to generate an html image-map of the graph.
Create an index.html (or wiki page) with an IMG tag of the graph, followed by the image-map's html.
Direct the image-map urls to a Wiki page with each node's name - generate the Wiki pages automatically if needed.
Optionally, link directly to the class hierarchy image generated by DoxyGen. Have the Wiki page link to any additional documentation, including DoxyGen docs, images, etc.
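The steps above could be scripted roughly as follows. This is only a sketch: the helper names (build_dot, dot_commands, build_page) and the example.com URLs are made up for illustration, and it assumes the dot executable is on the PATH when the commands are actually run. It relies on the URL and tooltip node attributes and on the -Tpng / -Tcmapx output formats of dot; the map name emitted by cmapx is taken to be the graph name.

```python
def dot_quote(text):
    """Quote a string as a DOT identifier (only double quotes need escaping)."""
    return '"' + text.replace('"', '\\"') + '"'


def build_dot(graph_name, nodes, edges):
    """Return DOT source in which every node carries a URL and a tooltip.

    nodes: dict mapping node name -> (url, tooltip)
    edges: iterable of (source, target) node-name pairs
    """
    lines = ["digraph %s {" % dot_quote(graph_name)]
    for name, (url, tooltip) in nodes.items():
        lines.append("  %s [URL=%s, tooltip=%s];"
                     % (dot_quote(name), dot_quote(url), dot_quote(tooltip)))
    for src, dst in edges:
        lines.append("  %s -> %s;" % (dot_quote(src), dot_quote(dst)))
    lines.append("}")
    return "\n".join(lines)


def dot_commands(dot_file, stem):
    """Command lines for the two GraphViz runs: the image, then the image map."""
    return [
        ["dot", "-Tpng", dot_file, "-o", stem + ".png"],
        ["dot", "-Tcmapx", dot_file, "-o", stem + ".map"],
    ]


def build_page(image_file, map_name, map_html):
    """Return a minimal HTML page: the IMG tag followed by the image-map html."""
    return ('<html><body>\n<img src="%s" usemap="#%s">\n%s\n</body></html>'
            % (image_file, map_name, map_html))


if __name__ == "__main__":
    # Hypothetical two-node graph; the URLs are placeholders.
    nodes = {
        "home": ("http://example.com/wiki/home", "Home page"),
        "docs": ("http://example.com/wiki/docs", "Documentation"),
    }
    print(build_dot("site", nodes, [("home", "docs")]))
    for command in dot_commands("site.dot", "site"):
        print(" ".join(command))  # each list could be passed to subprocess.run
```

After both runs, the content of site.map would be passed to build_page as map_html and the result written to index.html.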

You can use pygraphviz and cmapx:
import pygraphviz
my_graph = pygraphviz.AGraph(id="my_graph", name="my_graph")
my_graph.add_node("node",
                  label="my_node",
                  tooltip="tooltip text \r next line",  # use \r for new line
                  URL="http://ya.ru/",
                  target="_blank")
my_graph.layout(prog='dot')
my_graph.draw(path="my_graph.map", format="cmapx")
my_graph.draw(path="my_graph.svg", format="svg")
Then use the content of my_graph.map in the HTML:
<IMG SRC="my_graph.svg" USEMAP="#my_graph" />
... [content of my_graph.map] ...

You could use click maps:
<img src="graph.png" width="400" height="300" usemap="#mygraphmap">
<map name="mygraphmap">
<area shape="circle" coords="100,100,30" href="f8a08.htm">
<area shape="circle" coords="200,100,30" href="1d0f.htm">
</map>
You'd obviously have to find out the coordinates somehow.
edit:
You can also use rect or poly (polygon) for the shape attribute.
I think you can also add an onmouseover handler or a title attribute to the area elements; the title is what browsers typically show as a tooltip.
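If the node coordinates are already known, the map markup itself can be generated rather than typed by hand. A minimal sketch, assuming pixel coordinates in the same space as the displayed image; area_tag and image_map are hypothetical helper names, not part of any library:

```python
from html import escape


def area_tag(shape, coords, href, title=""):
    """Return one <area> element; coords is a sequence of pixel values."""
    if shape not in ("circle", "rect", "poly"):
        raise ValueError("unsupported shape: %r" % (shape,))
    coord_text = ",".join(str(int(c)) for c in coords)
    tag = '<area shape="%s" coords="%s" href="%s"' % (
        shape, coord_text, escape(href, quote=True))
    if title:
        # The title attribute doubles as a hover tooltip.
        tag += ' title="%s"' % escape(title, quote=True)
    return tag + ">"


def image_map(name, areas):
    """Return a <map> element; areas is a list of area_tag argument tuples."""
    body = "\n".join(area_tag(*area) for area in areas)
    return '<map name="%s">\n%s\n</map>' % (escape(name, quote=True), body)


if __name__ == "__main__":
    print(image_map("mygraphmap", [
        ("circle", (100, 100, 30), "f8a08.htm", "First node"),
        ("circle", (200, 100, 30), "1d0f.htm", "Second node"),
    ]))
```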
more edit:
Or you could have graphviz output SVG, which can be embedded in the HTML5 DOM. You might then be able to handle the elements inside the graph as DOM objects.

Most of the suggested solutions involve an Image map format, but that's tricky. Not only that, you won't be able to edit the clickable SVG file in e.g. Inkscape afterwards to customize your drawing.
The SVG files that graphviz generates already have clickable links if you provide URLs. If you're having trouble with them, then perhaps you need to use an <object> tag rather than an <img> tag:
<object width="100%" data="./example.gv.svg" type="image/svg+xml"></object>
See also Insert clickable SVG image into Sphinx documentation.
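To check that a generated SVG file really carries the links before embedding it, the anchors can be listed with the standard library. This is a sketch only: list_svg_links is a hypothetical helper, and SAMPLE_SVG is a hand-written fragment that approximates what graphviz emits for a node with a URL (an <a> element with xlink:href and xlink:title), not real graphviz output.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
XLINK_NS = "http://www.w3.org/1999/xlink"


def list_svg_links(svg_text):
    """Return (href, title) pairs for every <a> element in an SVG document."""
    root = ET.fromstring(svg_text)
    links = []
    for anchor in root.iter("{%s}a" % SVG_NS):
        # Prefer xlink:href; fall back to a plain href attribute.
        href = anchor.get("{%s}href" % XLINK_NS) or anchor.get("href")
        title = anchor.get("{%s}title" % XLINK_NS, "")
        links.append((href, title))
    return links


# Hand-written sample, assumed to resemble graphviz SVG output.
SAMPLE_SVG = """<svg xmlns="http://www.w3.org/2000/svg"
     xmlns:xlink="http://www.w3.org/1999/xlink">
<g id="node1" class="node"><title>node</title>
<g id="a_node1"><a xlink:href="http://ya.ru/" xlink:title="tooltip text" target="_blank">
<ellipse cx="27" cy="-18" rx="27" ry="18"/>
<text x="27" y="-14">my_node</text>
</a></g></g>
</svg>"""

if __name__ == "__main__":
    for href, title in list_svg_links(SAMPLE_SVG):
        print(href, title)
```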

If you are looking for a graphviz alternative, maybe you could use the jsPlumb library. See some samples here.
Also check out the JavaScript InfoVis Toolkit.

Related

Saving HTML Element Tree including CSS properties using Selenium

I'm using Python with Selenium.
I am attempting to do some web scraping. I have a WebElement (which contains child elements) that I would like to save to an offline file. So far, I have managed to get the raw HTML for my WebElement using WebElement.get_attribute('innerHTML'). This works, but no CSS is present in the final product because the website uses a stylesheet. So I'd like to convert these CSS properties to inline styles.
I found this stackoverflow solution which shows how to get the CSS properties of an element. However, getting this data, then parsing the HTML as a string to add these properties inside the HTML tag's style attribute, then doing this for all the child elements, feels like it'd be a significant undertaking.
So, I was wondering whether there was a more straightforward way of doing this.

How do I get the list of all images on a page?

In Firefox, I can get a list of all images from the "Media" tab of the Page Info window:
How can I obtain such a list using Python Selenium? In addition to getting such a list of image URLs, I would also like to be able to get each image's data (i.e. the image itself) without needing to make additional network requests.
Please DO NOT suggest that I parse the HTML to look for <img ... /> tags. That is clearly not what I'm looking for. I am looking for image responses. Not all image responses are present in the DOM. Example: some image responses from AJAX requests.

Ipython notebook: generate log output in one cell; move to specific line in that cells output based on actions of function call in subsequent cell?

I want to be able to display a log of a few thousand lines in a scrolling window/cell/frame; execute a python function and as a consequence have the log window scroll to a particular line of the log. I thought the ipython notebook environment would help in the further processing of the logged data that must be done, but if it is easiest done with some other GUI ...
It is for exploration of the logged data. We don't yet know how best to separate the wheat from the chaff.
Since posting, I have found this solution of:
<html>
<body>
<script>
function jump2iframe(ifrname, ifrlabel)
{
document.getElementById(ifrname).contentWindow.location.hash = ifrlabel;
};
</script>
...
<iframe src="rad_1_file_5.html" width="100%" id="ifr">
<p>Your browser does not support iframes.</p>
</iframe>
...
Jump to Anchor line0200 in iframe?
...
The problem is that although it works in Firefox, I want to continue using Chrome, where it fails due to problems with the "Same Origin Policy".
My attempts at trying to use Cross-document messaging with the iframe that successfully loads the file from the same directory as the parent document all fail.
Possible solution found. Instead of including a generated file I will try using:
In first ipython cell
Create html with embedded links for every line:
from IPython.display import display, HTML
for n, line in enumerate(logdata, 1):
    display(HTML('<a id="line%06i">%s</a>' % (n, line)))
Click to scroll the cells output. This will create a scrolling html "div" section when there are many lines.
In another cell
I can create a link to scroll to line 22 for example by calling display again:
display(HTML('<a href="#line000022">go to line000022</a>'))
I would prefer to get the cross document messaging for an embedded iframe working, but that solution eludes me.

Python pisa/xhtml2pdf messy rendering

In a django project, I want to generate an html page from a view and convert the html/css generated to pdf. I am using xhtml2pdf for this (https://github.com/chrisglass/xhtml2pdf/blob/master/doc/usage.rst#using-xhtml2pdf-in-django).
Browser -> django view -> mysql DB -> django template -> html/css -> pdf
I have made sure that:
I am using a function (link_callback) to convert all relative paths to proper absolute ones, so xhtml2pdf is able to retrieve all the images needed.
Instead of relying on a <link> tag to include the CSS (which does not work), I have directly used an @import rule with an absolute path to the CSS file (see: CSS not rendered by Pisa's pdf generation in Django).
The CSS file is taken into account, as I find some style elements in the output; however, the generated PDF is very different from the HTML output. Images are all messed up (partly visible and partly just outside the document), forms are not respected, font size is not correct, and <ul> elements are not properly rendered. Moreover, I had to remove a -moz-placeholder rule from the CSS as it was not properly handled by xhtml2pdf.
Are there known issues with CSS interpretation in xhtml2pdf? Are there restrictions?
I already spent a lot of time customizing the CSS file to make it work on Chrome/Firefox and IE7, and I don't want to spend another round adapting it for xhtml2pdf. Is there a reliable solution to convert HTML/CSS templated through django to PDF? Even a special type of link to call the 'print pdf' function of the browser would do...
And no, I don't want to use ReportLab and draw squares and circles, thank you !

Detect the size of an image in HTML, using python

I'm trying to implement a similar functionality to Facebook's thumbnail preview. The idea is, a user enters the URL of a product, and selects the best image of that product.
In order to filter out images that obviously aren't a product, I want to filter them based on height and width > 150px.
I'm using python and BeautifulSoup to download the HTML and extract images, but can't find a way to gather the height or width when it is specified in CSS.
GD is a library that's been around for quite some time and it has a pretty easy interface to work with...Here's a link to GD
See the "size" method to get width and height.
EDIT
Ah, how about this?
Parse the HTML content and retrieve URLs to the CSS file(s) and inline styles
Download the CSS file(s) and parse CSS files, in order, building a rule-set of the CSS rules.
Next, parse the rest of the HTML from Step 1, gathering IMG tags and if the IMG tag has a class name, look up the class name in your CSS rules and check for width or height.
It might sound a little complicated, but I bet downloading a few CSS stylesheets is much lighter than downloading the images and having to use an image library on the server side.
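The steps above might look roughly like this using only the standard library instead of BeautifulSoup. It is a deliberately simplified sketch: it only understands plain ".classname { ... }" rules with pixel values, ignores inline styles, ids and CSS specificity, and assumes the CSS text has already been downloaded. All function and class names here are made up for illustration.

```python
import re
from html.parser import HTMLParser

# ".name { ... }" rules only; real stylesheets need a proper CSS parser.
RULE_RE = re.compile(r"\.([\w-]+)\s*\{([^}]*)\}")
# width/height in px, but not max-width, min-height and the like.
SIZE_RE = re.compile(r"(?<![\w-])(width|height)\s*:\s*(\d+)px")


def parse_class_sizes(css_text):
    """Build {class_name: {"width": px, "height": px}}; later rules win."""
    sizes = {}
    for class_name, body in RULE_RE.findall(css_text):
        entry = sizes.setdefault(class_name, {})
        for prop, value in SIZE_RE.findall(body):
            entry[prop] = int(value)
    return sizes


class ImgCollector(HTMLParser):
    """Collect (src, [class names]) for every IMG tag."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attr = dict(attrs)
            self.images.append(
                (attr.get("src"), (attr.get("class") or "").split()))


def image_sizes(html_text, css_text):
    """Return (src, width, height) per image; None where CSS gives no size."""
    rules = parse_class_sizes(css_text)
    collector = ImgCollector()
    collector.feed(html_text)
    result = []
    for src, classes in collector.images:
        size = {}
        for name in classes:
            size.update(rules.get(name, {}))
        result.append((src, size.get("width"), size.get("height")))
    return result


def big_enough(images, minimum=150):
    """Keep images whose CSS width and height both exceed the minimum."""
    return [src for src, width, height in images
            if width is not None and height is not None
            and width > minimum and height > minimum]
```

Images for which the CSS gives no size come back with None and would still need another check, for example downloading them.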
