PyV8, can I manipulate DOM structure?


Let's assume that we have PyV8:
import PyV8
ctxt = PyV8.JSContext()
and a Python DOM structure, for example xml.dom.
How can I feed a .js file to PyV8 so that it can change the DOM structure that I have?
If the file's content is:
$("#id").remove();
I want the DOM item to be removed.
PyV8 has a perfect hello-world example, but I'd like to see something useful.
To be clear, what I want to do is:
"JavaScript file" -->-- magic -->-- DOM (already built from an HTML file), now changed by the JavaScript file that was passed in

A good example for what you're trying to do can be found here:
https://github.com/buffer/thug
It's a Python HTTP client that executes JS via PyV8 for security research purposes, but it can easily be stripped down for simpler needs.

Apologies for the formatting. I spaced it as best I could, but my screen reader doesn't like SO's formatting controls.
I'm going to take a shot at answering your question, though it seems a tad vague. Please let me know if I need to rewrite this answer to fit a different situation.
I assume you are trying to get an HTML file from the web and run the JavaScript inside that file so that it acts on the document.
Unfortunately, none of the Python XML libraries have true DOM support, and W3C DOM compliance is nonexistent in every package I have found.
What you can do is use the PyV8 w3c.py DOM file as a starting example, and create your own full DOM.
W3C Sample Dom
You will need to rewrite this module, though, as it does not respect quotes or apostrophes. BeautifulSoup is also not the speediest parser.
I would recommend using something like lxml.etree's target parser option.
LXML Target Parser
Search for "The feed parser interface".
Then you can load an HTML/script document with lxml, parse it as below, and run each of the scripts you need on the created DOM.
Find a partial example below. (Please note that the HTML standards are massive, scattered, and highly browser-specific, so your mileage may vary.)
import lxml.etree

class domParser(object):
    # Target object for lxml's parser: it receives parse events and builds the DOM.
    def __init__(self):
        # Initialize the DOM object here, and obtain the root for the destination document.
        # newAwesomeCompliantDom is a placeholder for the DOM implementation you write yourself.
        self.dom = newAwesomeCompliantDom()
        self.document = self.dom.document
        self.this = self.document
    def comment(self, commentText):
        # Add commentText to self.document or the DOM object you created above.
        self.this.appendChild(self.document.DOMImplementation.createComment(commentText))
    def start(self, tag, attrs):
        # Same here: create the element and descend into it.
        self.this = self.this.appendChild(self.document.DOMImplementation.newElement(tag, attrs))
    def data(self, dataText):
        # Append data to the last accessed element, as a new Text child.
        self.this.appendChild(self.document.DOMImplementation.createDataNode(dataText))
    def end(self, tag):
        # Closing element, so move up the tree.
        self.this = self.this.parentNode
    def close(self):
        return self.document

# Unchecked, please validate yourself. htmlFile holds the HTML text to parse.
x = lxml.etree.HTMLParser(target=domParser())
x.feed(htmlFile)
newDom = x.close()
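As a rough, standard-library-only illustration of the event-driven approach above, the sketch below swaps lxml's target parser for html.parser.HTMLParser and uses xml.dom.minidom as a stand-in for the compliant DOM. It does not run any JavaScript; remove_by_id only mimics in Python what $("#id").remove() would do once such a DOM is exposed to PyV8. The names DomBuilder, build_dom and remove_by_id are made up for this example, and minidom is not a W3C-complete HTML DOM.

```python
from html.parser import HTMLParser
from xml.dom import minidom


class DomBuilder(HTMLParser):
    """Build an xml.dom.minidom tree from HTML parser events."""

    # Elements that never have a closing tag, so we must not descend into them.
    VOID = {"br", "hr", "img", "input", "meta", "link"}

    def __init__(self):
        super().__init__()
        self.document = minidom.Document()
        self.current = self.document  # node that receives new children

    def handle_starttag(self, tag, attrs):
        element = self.document.createElement(tag)
        for name, value in attrs:
            element.setAttribute(name, value or "")
        self.current.appendChild(element)
        if tag not in self.VOID:
            self.current = element

    def handle_endtag(self, tag):
        # Closing element, so move up the tree.
        if self.current is not self.document and tag not in self.VOID:
            self.current = self.current.parentNode

    def handle_data(self, data):
        # minidom does not allow text directly under the document node.
        if self.current is not self.document:
            self.current.appendChild(self.document.createTextNode(data))

    def handle_comment(self, data):
        self.current.appendChild(self.document.createComment(data))


def build_dom(html_text):
    """Parse well-formed HTML with a single root element into a minidom Document."""
    builder = DomBuilder()
    builder.feed(html_text)
    builder.close()
    return builder.document


def remove_by_id(document, element_id):
    """Python stand-in for $("#id").remove(); returns True if a node was removed."""
    for element in document.getElementsByTagName("*"):
        if element.getAttribute("id") == element_id:
            element.parentNode.removeChild(element)
            return True
    return False
```

For example, build_dom('<html><body><p id="id">x</p></body></html>') followed by remove_by_id(doc, "id") leaves an empty body element. Real-world HTML with unclosed or mismatched tags would need more careful handling than this sketch provides.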

Related

Is it possible to display file size in a directory served using http.server in python?

I've served a directory using
python -m http.server
It works well, but it only shows file names. Is it possible to show created/modified dates and file size, like you see in ftp servers?
I looked through the documentation for the module but couldn't find anything related to it.
Thanks!

http.server is meant for dead-simple use cases, and to serve as sample code.[1] That's why the docs link right to the source.
That means that, by design, it doesn't have a lot of configuration settings; instead, you configure it by reading the source and choosing what methods you want to override, then building a subclass that does that.
In this case, what you want to override is list_directory. You can see how the base-class version works, and write your own version that does other stuff—either use scandir instead of listdir, or just call stat on each file, and then work out how you want to cram the results into the custom-built HTML.
Since there's little point in doing this except as a learning exercise, I won't give you complete code, but here's a skeleton:
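import http.server
import os

# list_directory is a method of the request handler, so subclass the handler, not the server.
class StattyHandler(http.server.SimpleHTTPRequestHandler):
    def list_directory(self, path):
        try:
            dirents = os.scandir(path)
        except OSError:
            # blah blah blah (report the error, as the base class does)
            self.send_error(404, "No permission to list directory")
            return None
        # etc. up to the end of the header-creating bit
        for dirent in dirents:
            fullname = dirent.path
            displayname = linkname = dirent.name
            st = dirent.stat()
            # pull stuff out of st
            # build a table row to append to r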
1. Although really, it's sample code for an obsolete and clunky way of building servers, so maybe that should be "to serve as sample code to understand legacy code that you probably won't ever need to look at but just in case…".
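For readers who want to see the skeleton filled in, here is one possible minimal version. It is a sketch rather than a drop-in replacement for the base-class method: the names format_row, SizeListingHandler and serve are invented for this example, the HTML layout is arbitrary, and entries that cannot be stat'ed (such as broken symlinks) are simply skipped.

```python
import html
import http.server
import io
import os
import time
import urllib.parse
from http import HTTPStatus


def format_row(name, is_dir, size, mtime):
    """Build one HTML table row for a directory entry."""
    display = name + "/" if is_dir else name
    href = urllib.parse.quote(display, errors="surrogatepass")
    size_text = "-" if is_dir else str(size)
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(mtime))
    return '<tr><td><a href="{}">{}</a></td><td>{}</td><td>{}</td></tr>'.format(
        href, html.escape(display), size_text, stamp)


class SizeListingHandler(http.server.SimpleHTTPRequestHandler):
    """Directory listings with size and modification time columns."""

    def list_directory(self, path):
        try:
            entries = sorted(os.scandir(path), key=lambda e: e.name.lower())
        except OSError:
            self.send_error(HTTPStatus.NOT_FOUND, "No permission to list directory")
            return None
        rows = []
        for entry in entries:
            try:
                st = entry.stat()
            except OSError:
                continue  # e.g. a broken symlink; leave it out of the listing
            rows.append(format_row(entry.name, entry.is_dir(), st.st_size, st.st_mtime))
        title = html.escape("Directory listing for " + urllib.parse.unquote(self.path))
        page = ('<!DOCTYPE html><html><head><meta charset="utf-8">'
                "<title>{0}</title></head><body><h1>{0}</h1><table>"
                "<tr><th>Name</th><th>Size (bytes)</th><th>Modified</th></tr>"
                "{1}</table></body></html>").format(title, "\n".join(rows))
        encoded = page.encode("utf-8", "surrogateescape")
        self.send_response(HTTPStatus.OK)
        self.send_header("Content-type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(encoded)))
        self.end_headers()
        # The caller copies this file-like object to the response body.
        return io.BytesIO(encoded)


def serve(port=8000):
    """Serve the current directory until interrupted."""
    with http.server.ThreadingHTTPServer(("", port), SizeListingHandler) as httpd:
        httpd.serve_forever()
```

Calling serve() from the directory you want to share plays the role of python -m http.server, with the extra columns in the listing.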

In Allure reports, how can I make links with a name and a URL from Python?

I want to create links that have both a name and a URL for a test item.
For example, in the report I have a link named "Task 55188", and this link redirects me to 'http://tfs.com/tfs/company/rnd/QA/_testManagement?planId=41890&suiteId=55188&_a=tests'.
How do I create this in Python code?

With Decorators
Some allure-python integrations allow setting a link pattern with a cmd line switch. For example:
allure-pytest:
--allure-link-pattern=http://tfs.com/tfs/company/rnd/QA/_testManagement?planId=41890&suiteId={}&_a=tests
allure-behave:
-D AllureFormatter.link_pattern=http://tfs.com/tfs/company/rnd/QA/_testManagement?planId=41890&suiteId={}&_a=tests
When you set a link pattern, you don't need to create your own allure_wrapper.py like in Viktoriia's answer and can use @allure.link:55188 directly.
With Dynamic
In addition to the decorator approach above, a Dynamic class is available to dynamically add links to the report at runtime. For example:
import allure
def some_test_function():
allure.dynamic.link('http://tfs.com/tfs/company/rnd/QA/_testManagement?planId=41890&suiteId=55188&_a=tests', name='55188')
Some integrations may not support dynamic linking and will do nothing when allure.dynamic.link is called. For example, I had to add support for allure-behave by implementing the relevant allure hooks in a PR.
We use dynamic linking to conditionally add Jira defect links for failing tests. When a test fails, we create a Jira defect with a tag specific to the test. The next time that test fails, it queries the Jira REST API to find all issues matching the tag and links them. This way we can add/remove test links from Jira and avoid fumbling around with decorators in the test code.

You can create allure_wrapper.py file inside your project and use decorator with task number/task title for your tests.
For example:
In your project you have a list of tasks:
constants.py
TASKS = {
    '55188': 'Test task'
}
Import this list and use in allure_wrapper.py for tasks decorator
allure_wrapper.py
from constants import TASKS
from allure import link, story

# Specify your link pattern
TFS_LINK_PATTERN = 'http://tfs.com/tfs/company/rnd/QA/_testManagement?planId=41890&suiteId={}&_a=tests'

def task_link(task_id):
    # The link name shown in the report, e.g. "Task 55188"
    return link(TFS_LINK_PATTERN.format(task_id), name=f'Task {task_id}')

def task_links(task_ids):
    decos = []
    for task_id in task_ids:
        decos.append(task_link(task_id))
        decos.append(story(TASKS[task_id]))
    return compose_decos(decos)

def compose_decos(decos):
    # Combine a list of decorators into a single decorator
    def composition(func):
        for deco in reversed(decos):
            func = deco(func)
        return func
    return composition
Use created decorator to attach link:
from allure_wrapper import task_links

@task_links(['55188'])
def test_task_link():
    # do smth
    pass
As a result, a clickable link will be available in your Allure report.
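To see what compose_decos does without installing Allure, here is a small self-contained sketch. The tag decorator and check_task_link function are hypothetical stand-ins for allure.link, allure.story and a real test; only compose_decos is taken from the answer above.

```python
def compose_decos(decos):
    """Combine several decorators into one; decos[0] ends up outermost."""
    def composition(func):
        for deco in reversed(decos):
            func = deco(func)
        return func
    return composition


def tag(label):
    """Hypothetical decorator that records a label on the function it wraps."""
    def deco(func):
        func.labels = [label] + getattr(func, "labels", [])
        return func
    return deco


@compose_decos([tag("link 55188"), tag("story Test task")])
def check_task_link():
    return "ok"


# Labels appear in the same order as the decorators were listed.
print(check_task_link.labels)
```

Because the list is applied in reverse, the result matches stacking the decorators by hand in the listed order, with the first one on top.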

I did a workaround for this:
allure.attach('<head></head><body>Link to ...</body>', 'Logs', allure.attachment_type.HTML)

Python: Mako template lookups per app

I'm using cherrypy with Mako as a template engine.
I want Mako to lookup different directories based on what app is being requested.
I.e.
I have three 'apps': Site, Admin and Install.
They all have their own template folder, structure looking something like:
/template
/template/site
/template/admin
/template/install
/template/system
/system contains some system wide templates, like 404 pages, etc.
I'm using Twiseless as a reference whilst trying to get to grips with cherrypy / mako, but I'm stuck with how to do this.
Read on for a brief overview of how I've tried to do this, but a warning: I think I'm going about this completely the wrong way! :) So, if you have any ideas/pointers, it might be a good idea to save yourself the trouble of reading any further than this.
In my main file, server.py, I do something like:
from libs.plugins.template import MakoTemplatePlugin
engine = cherrypy.engine
makoTemplate = MakoTemplatePlugin(engine, self.base_dir)
setTemplateDirs(makoTemplate, self.template_path)
MakoTemplatePlugin is a slightly modified version of the plugin by the same name found in Twiseless, linked above.
What this code does is set the TemplateLookup to use the default template directories from my global config file. i.e.
/template
/template/system
Then, each time an app is loaded, I call a function (setTemplateDirs) to update the directories where Mako searches.
I thought this would work, but it doesn't. Initially I made the error of creating a new instance of MakoTemplatePlugin for each app. This just resulted in them all being called on each page load, starting with the first one instantiated, containing just the basic, non-app specific directories.
As this was called first, it was triggering a 404 error, as it was searching in the wrong folders.
I instead made sure to pass a reference to the MakoTemplatePlugin to all of my apps. I thought if I ran setTemplateDirs each time each app is called, this would solve the problem... but it doesn't.
I don't know where to put the function so it will run every time a page is requested...
e.g.
# /apps/site/app.py
import cherrypy
from somemodule import setTemplateDirs

class Site(object):
    def __init__(self, params):
        self.params = params
        self.makoTemplate = params['makoTemplate']
        self.base_path = params['base_path']
        setTemplateDirs(self.makoTemplate, self.base_path, '', '/')

    @cherrypy.expose
    @cherrypy.tools.render(template='index.html')
    def index(self):
        pass
This obviously only works when the application is first loaded... I tried moving the update function call into a separate method, update, and tried calling that for each page, e.g.:
    @cherrypy.expose
    @cherrypy.tools.render(template='index.html')
    @update
    def index(self):
        pass
But this just gives me config related errors.
Rather than to continue to mess about with this, there must be an easier way.
How would you do it?
Thanks a lot,
Tom

I got this working. Thanks to stephan for providing the link to the mako tool example: http://tools.cherrypy.org/wiki/Mako.
I just modified that slightly to get it working.
If anyone's wondering, the basis of it is that you define tools.mako.directories in your global config and then override it in the individual app config files.
e.g.
server.conf
...
tools.mako.directories: ['', 'system']
...
site.conf
...
tools.mako.directories: ['site', 'system']
...
I did some extra work to translate the relative URIs to absolute paths, but the crux of it is explained above.
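The answer does not show how the relative entries were turned into absolute paths. One plausible way, sketched here with the standard library only, is to join each entry onto a template root directory. The function name resolve_template_dirs and the root path /srv/app/template are assumptions made for this example, not part of the original setup.

```python
import os


def resolve_template_dirs(template_root, relative_dirs):
    """Turn config entries such as ['site', 'system'] into absolute paths.

    An empty entry ('') resolves to the template root itself.
    """
    return [
        os.path.normpath(os.path.join(template_root, relative_dir))
        for relative_dir in relative_dirs
    ]


# Hypothetical root directory holding the site, admin, install and system folders.
site_dirs = resolve_template_dirs("/srv/app/template", ["site", "system"])
default_dirs = resolve_template_dirs("/srv/app/template", ["", "system"])
```

The resulting list is the kind of value a Mako TemplateLookup expects for its directories argument, searched in order.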

Global include in restructured text

I'm using reStructuredText for my blog/website and I want to add a global include file. I have access to, and am happy to change, the settings file I'm using to generate the HTML output; I just can't figure out the syntax for either:
adding a default include file to the parser
defining directives, inline roles, etc. in Python with docutils
I tried reading the source code and the documentation and just find them a bit hard to follow. I'm hoping that I just missed something super obvious, but I'd like to do something like the following. The first part is just what is already there; you can see the rest of the file in the jekyll-rst plugin source (links right to it).
import sys
from docutils.core import publish_parts
from optparse import OptionParser
from docutils.frontend import OptionParser as DocutilsOptionParser
from docutils.parsers.rst import Parser

# sets up a writer that is then called to parse rst pages repeatedly
def transform(writer=None, part=None):
    p = OptionParser(add_help_option=False)
    # Collect all the command line options
    docutils_parser = DocutilsOptionParser(components=(writer, Parser()))
    for group in docutils_parser.option_groups:
        p.add_option_group(group.title, None).add_options(group.option_list)
    p.add_option('--part', default=part)
    opts, args = p.parse_args()
    # ... more settings, etc
    # then I just tell the parser/writer to process specified file X.rst every time
    # (or alternately a python file defining more roles...but nicer if in rst)
Is there a simple way to do this? It'd be great to define a file defaults.rst and have that load each time.
EDIT: Here are some examples of what I'd like to be able to globally include (custom directives would be nice too, but I'd probably write those in code)
.. role:: raw-html(raw)
   :format: html

.. |common-substitution| replace:: apples and orange
.. |another common substitution| replace:: etc

I'm not quite sure if I understand the question. Do you want to define a number of, for example, substitutions in some file and have these available in all your other reStructuredText files, or do you want to include some common HTML in your output files? Can you clarify your question?
If it is the former that you want to do you can use the include directive, as I outline in this answer.
Alternatively, if you want some common HTML included in the generated output, try copying and editing the template.txt file which is included in the module path/to/docutils/writers/html4css1/. You can include arbitrary HTML elements in this file and modify the layout of the HTML generated by Docutils. Neither of these methods requires you to modify the Docutils source code, which is always an advantage.
Edit: I don't think it is possible to set a flag to set an include file using Docutils. However, if you can use Sphinx, which is based on Docutils but has a load of extensions, then this package has a setting rst_prolog which does exactly what you need (see this answer). rst_prolog is:
A string of reStructuredText that will be included at the beginning of every source file that is read.
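As an illustration of that setting, a Sphinx conf.py could carry the role and substitution definitions from the question as shown below. This is only a sketch of the configuration value; the rest of conf.py is omitted, and it applies to Sphinx projects rather than plain Docutils.

```python
# conf.py (Sphinx): rst_prolog is prepended to every source file that is read.
rst_prolog = """
.. role:: raw-html(raw)
   :format: html

.. |common-substitution| replace:: apples and orange
.. |another common substitution| replace:: etc
"""
```

With this in place, any document in the project can use |common-substitution| or the raw-html role without an explicit include directive.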

I needed the exact same thing: A way to have some global reStructuredText files being automatically imported into every reStructuredText article without having to specify them each time by hand.
One solution to this problem is the following plugin:
import os
from pelican import signals
from pelican.readers import RstReader


class RstReaderWrapper(RstReader):
    enabled = RstReader.enabled
    file_extensions = ['rst']

    class FileInput(RstReader.FileInput):
        def __init__(self, *args, **kwargs):
            RstReader.FileInput_.__init__(self, *args, **kwargs)
            self.source = RstReaderWrapper.SourceWrapper(self.source)

    # Hook into RstReader
    RstReader.FileInput_ = RstReader.FileInput
    RstReader.FileInput = FileInput

    class SourceWrapper():
        """
        Mimics and wraps the result of a call to `open`
        """
        content_to_prepend = None

        def __init__(self, source):
            self.source = source

        def read(self):
            content = self.source.read()
            if self.content_to_prepend is not None:
                content = "{}\n{}".format(self.content_to_prepend, content)
            return content

        def close(self):
            self.source.close()


def process_settings(pelicanobj):
    include_files = pelicanobj.settings.get('RST_GLOBAL_INCLUDES', []) or []
    base_path = pelicanobj.settings.get('PATH', ".")

    def read(fn):
        with open(os.path.join(base_path, fn), 'r') as res:
            content = res.read()
            return ".. INCLUSION FROM {}\n{}\n".format(fn, content)

    inclusion = "".join(map(read, include_files)) if include_files else None
    RstReaderWrapper.SourceWrapper.content_to_prepend = inclusion


def register():
    signals.initialized.connect(process_settings)
Usage in short:
Create a plugin from the above code (best clone the repository from GitHub)
Import the plugin (adapt PLUGINS in pelicanconf.py)
Define the list of RST files (relative paths to project root) to include by setting the variable RST_GLOBAL_INCLUDES in pelicanconf.py
Please note that neither Pelican nor Docutils is designed to allow this. No signal is provided that gives clean access to the raw contents of a source file before processing begins, and there is no way to intercept the framework reading the file in "a normal way" (such as subclassing or changing hardcoded configuration).
This plugin subclasses the internal class FileInput of RstReader and sets the class reference RstReader.FileInput to the subclass. Python file objects are also emulated through SourceWrapper.
Nevertheless, this approach works for me and is not cumbersome in the daily workflow.
I know this question is from 2012 but I think this answer can still be helpful to others.

Can I add "Smartypants" to restructuredText?

I use restructuredText, and I like what smartypants does for Markdown. Is there a way to enable the same thing for restructuredText?

Have you tried smartypants.py? I don't know how well it's implemented, much less how well it works for your specific use cases, but it does seem to target exactly your goal, unicode-ification of some ascii constructs (however, it runs on HTML, so I guess you'd run it after restructuredText or whatever other "producer of HTML" component).
If that doesn't work well for you, a user has submitted a patch to python-markdown2 which he calls "this SmartyPants patch" -- it's been accepted and since a month ago it's part of the current source tree of python-markdown2 (r259 or better). That may offer smoother sailing (e.g. if you just get and build python-markdown2 as a read-only svn tree). Or, you could wait for the next downloadable release (there hasn't been one since May and this patch was accepted in mid-July), but who knows when that'll happen.

As Alex Martelli says, smartyPants is what I need. However, I was looking for a little more detailed info on how to use it. So here's a Python script that reads the file named in the first command line argument, converts it to HTML, using Pygments for source code, and then passes it through smartypants for prettifying.
#!/usr/bin/python
# EASY-INSTALL-SCRIPT: 'docutils==0.5','rst2html.py'
"""
A minimal front end to the Docutils Publisher, producing HTML.
"""
try:
    from ulif.rest import directives_plain
    from ulif.rest import roles_plain
    from ulif.rest import pygments_directive
    import locale
    locale.setlocale(locale.LC_ALL, '')
except:
    pass

from docutils.core import publish_doctree, publish_from_doctree
from smartypants import smartyPants
import sys

description = ('Personal docutils parser with extra features.')

doctree = publish_doctree(file(sys.argv[1]).read())
result = publish_from_doctree(doctree, writer_name='html')
result = smartyPants(result)
print result
