Global include in restructured text - python

I'm using reStructuredText for my blog/website and I want to add a global include file. I have access to and am happy to change the settings file I'm using to generate the html output, I just can't figure out the syntax for either:
adding a default include file to the parser
defining directive/inline-roles, etc in python with docutils in python
I tried reading the source code and the documentation and just find it a bit hard to follow. I'm hoping that I just missed something super-obvious, but I'd like to do something like the following (the first part is just what is already there -- you can see the rest of the file in the jekyll-rst plugin source (links right to it)
import sys
from docutils.core import publish_parts
from optparse import OptionParser
from docutils.frontend import OptionParser as DocutilsOptionParser
from docutils.parsers.rst import Parser
# sets up a writer that is then called to parse rst pages repeatedly
def transform(writer=None, part=None):
p = OptionParser(add_help_option=False)
# Collect all the command line options
docutils_parser = DocutilsOptionParser(components=(writer, Parser()))
for group in docutils_parser.option_groups:
p.add_option_group(group.title, None).add_options(group.option_list)
p.add_option('--part', default=part)
opts, args = p.parse_args()
# ... more settings, etc
# then I just tell the parser/writer to process specified file X.rst every time
# (or alternately a python file defining more roles...but nicer if in rst)
Is there a simple way to do this? It'd be great to define a file defaults.rst and have that load each time.
EDIT: Here are some examples of what I'd like to be able to globally include (custom directives would be nice too, but I'd probably write those in code)
.. role:: raw-html(raw)
:format: html
.. |common-substitution| replace:: apples and orange
.. |another common substitution| replace:: etc

I'm not quite sure if I understand the question. Do you want to define a number of, for example, substitutions in some file and have these available in all your other reStructuredText files, or do you want to include some common HTML in your output files? Can you clarify your question?
If it is the former that you want to do you can use the include directive, as I outline in this answer.
Alternatively, if you want some common HTML included in the generated output, try copying and editing the template.txt file which is include in the module path/to/docutils/writers/html4css1/. You can include arbitrary HTML elements in this file and modify the layout of the HTML generated by Docutils. Neither of these methods require you to modify the Docuitls source code, which is always an advantage.
Edit: I don't think it is possible to set a flag to set an include file using Docuitls. However, if you can use Sphinx, which is based on Docuitls but has a load of extensions, then this package has a setting rst_prolog which does exactly what you need (see this answer). rst_prolog is:
A string of reStructuredText that will be included at the beginning of every source file that is read.

I needed the exact same thing: A way to have some global reStructuredText files being automatically imported into every reStructuredText article without having to specify them each time by hand.
One solution to this problem is the following plugin:
import os
from pelican import signals
from pelican.readers import RstReader
class RstReaderWrapper(RstReader):
enabled = RstReader.enabled
file_extensions = ['rst']
class FileInput(RstReader.FileInput):
def __init__(self, *args, **kwargs):
RstReader.FileInput_.__init__(self, *args, **kwargs)
self.source = RstReaderWrapper.SourceWrapper(self.source)
# Hook into RstReader
RstReader.FileInput_ = RstReader.FileInput
RstReader.FileInput = FileInput
class SourceWrapper():
"""
Mimics and wraps the result of a call to `open`
"""
content_to_prepend = None
def __init__(self, source):
self.source = source
def read(self):
content = self.source.read()
if self.content_to_prepend is not None:
content = "{}\n{}".format(self.content_to_prepend, content)
return content
def close(self):
self.source.close()
def process_settings(pelicanobj):
include_files = pelicanobj.settings.get('RST_GLOBAL_INCLUDES', []) or []
base_path = pelicanobj.settings.get('PATH', ".")
def read(fn):
with open(os.path.join(base_path, fn), 'r') as res:
content = res.read()
return ".. INLCUSION FROM {}\n{}\n".format(fn, content)
inclusion = "".join(map(read, include_files)) if include_files else None
RstReaderWrapper.SourceWrapper.content_to_prepend = inclusion
def register():
signals.initialized.connect(process_settings)
Usage in short:
Create a plugin from the above code (best clone the repository from GitHub)
Import the plugin (adapt PLUGINS in pelicanconf.py)
Define the list of RST files (relative paths to project root) to include by setting the variable RST_GLOBAL_INCLUDES in pelicanconf.py
Please note that pelican and docutils are both not designed to allow this. Neither a signal is provided which provides a clean access to the raw contents of a source file before processing begins, nor is there a possibility to intercept the framework reading the file in "a normal way" (like subclassing, changing hardcoded configuration, etc).
This plugin subclasses the internal class FileInput of RstReader and sets the class reference of RstReader.FileInput to the subclass. Also python file objects are emulated through SourceWrapper.
Nevertheless, this approach works for me and is not cumbersome in the daily workflow.
I know this question is from 2012 but I think this answer can still be helpful to others.

Related

Is it possible to display file size in a directory served using http.server in python?

I've served a directory using
python -m http.server
It works well, but it only shows file names. Is it possible to show created/modified dates and file size, like you see in ftp servers?
I looked through the documentation for the module but couldn't find anything related to it.
Thanks!
http.server is meant for dead-simple use cases, and to serve as sample code.1 That's why the docs link right to the source.
That means that, by design, it doesn't have a lot of configuration settings; instead, you configure it by reading the source and choosing what methods you want to override, then building a subclass that does that.
In this case, what you want to override is list_directory. You can see how the base-class version works, and write your own version that does other stuff—either use scandir instead of listdir, or just call stat on each file, and then work out how you want to cram the results into the custom-built HTML.
Since there's little point in doing this except as a learning exercise, I won't give you complete code, but here's a skeleton:
class StattyServer(http.server.HTTPServer):
def list_directory(self, path):
try:
dirents = os.scandir(path)
except OSError:
# blah blah blah
# etc. up to the end of the header-creating bit
for dirent in dirents:
fullname = dirent.path
displayname = linkname = dirent.name
st = dirent.stat()
# pull stuff out of st
# build a table row to append to r
1. Although really, it's sample code for an obsolete and clunky way of building servers, so maybe that should be "to serve as sample code to understand legacy code that you probably won't ever need to look at but just in case…".

What is the best method for setting up a config file in Python

I realise this question as been asked before (What's the best practice using a settings file in Python?) but seeing as this was asked 7 years ago, I feel it is valid to discuss again seeing as how technologies have evolved.
I have a python project that requires different configurations to be used based on the value of an environment variable. Since making use of the environment variable to choose a config file is simple enough, my question is as follows:
What format is seen as the best practice in the software industry for setting up a configuration file in python, when multiple configurations are needed based on the environment?
I realise that python comes with a ConfigParser module but I was wondering if it might be better to use a format such as YAML or JSON because of there raise in popularity due to their ease of use across languages. Which format is seen as easier to maintain when you have multiple configurations?
If you really want to use an environment-based YAML configuration, you could do so like this:
config.py
import yaml
import os
config = None
filename = getenv('env', 'default').lower()
script_dir = os.path.dirname(__file__)
abs_file_path = os.path.join(script_dir, filename)
with open(abs_file_path, 'r') as stream:
try:
config = yaml.load(stream)
except yaml.YAMLError as exc:
print(exc)
I think looking at the standard configuration for a Python Django settings module is a good example of this, since the Python Django web framework is extremely popular for commercial projects and therefore is representative of the software industry.
It doesn't get too fancy with JSON or YAML config files - It simply uses a python module called settings.py that can be imported into any other module that needs to access the settings. Environment variable based settings are also defined there. Here is a link to an example settings.py file for Django on Github:
https://github.com/deis/example-python-django/blob/master/helloworld/settings.py
This is really late to the party, but this is what I use and I'm pleased with it (if you're open to a pure Python solution). I like it because my configurations can be set automatically based on where this is deployed using environment variables. I haven't been using this that long so if someone sees an issue, I'm all ears.
Structure:
|--settings
|--__init__.py
|--config.py
config.py
class Common(object):
XYZ_API_KEY = 'AJSKDF234328942FJKDJ32'
XYZ_API_SECRET = 'KDJFKJ234df234fFW3424##ewrFEWF'
class Local(Common):
DB_URI = 'local/db/uri'
DEBUG = True
class Production(Common):
DB_URI = 'remote/db/uri'
DEBUG = False
class Staging(Production):
DEBUG = True
__init__.py
from settings.config import Local, Production, Staging
import os
config_space = os.getenv('CONFIG_SPACE', None)
if config_space:
if config_space == 'LOCAL':
auto_config = Local
elif config_space == 'STAGING':
auto_config = Staging
elif config_space == 'PRODUCTION':
auto_config = Production
else:
auto_config = None
raise EnvironmentError(f'CONFIG_SPACE is unexpected value: {config_space}')
else:
raise EnvironmentError('CONFIG_SPACE environment variable is not set!')
If my environment variable is set in each place where my app exists, I can bring this into my modules as needed:
from settings import auto_config as cfg
That really depends on your requirements, rather than the format's popularity. For instance, if you just need simple key-value pairs, an INI file would be more than enough. As soon as you need complex structures (e.g., arrays or dictionaries), I'd go for JSON or YAML. JSON simply stores data (it's more intended for automated data flow between systems), while YAML is better for human-generated (or maintained, or read) files, as it has comments, you can reference values elsewhere in the file... And on top of that, if you want robustness, flexibility, and means to check the correct structure of the file (but don't care much about the manual edition of the data), I'd go for XML.
I recommend giving trapdoor a try for turn-key configuration (disclaimer: I'm the author of trapdoor).
I also like to take advantage of the fact that you do not have to compile Python source and use plain Python files for configuration. But in the real world you may have multiple environments, each requires a different configuration, and you may also want to read some (mostly sensitive) information from env vars or files that are not in source control (to prevent committing those by mistake).
That's why I wrote this library: https://github.com/davidohana/kofiko,
which let you use plain Python files for configuration, but is also able to override those config settings from .ini or env-vars, and also support customization for different environments.
Blog post about it: https://medium.com/swlh/code-first-configuration-approach-for-python-f975469433b9

PyV8, can I manipulate DOM structure?

Lets assume that we have PyV8:
import PyV8
ctxt = PyV8.JSContext()
and a python DOM structure, for example xml.dom
How can I feed a .js-file to PyV8 so that it could change DOM-structure that I have.
If I had it's content:
$("#id").remove();
I want dom item to be removed.
PyV8 has perfect hello-world example. But I'd like to see something usefull.
To be clear, what I want to do is:
"Javascript file" -->-- magic -->-- DOM, (already built with html file) and changed now with passed javascript file
A good example for what you're trying to do can be found here:
https://github.com/buffer/thug
It's a python http client executing JS via PyV8 for security research purposes, but can be strapped down easily for simpler needs.
Appologies for the formatting. I spaced as best I could, but my screen reader doesn't like SO's formatting controls.
I'm going to take a shot at answering your question, though it seems a tad vague. Please let me know if I need to rewrite this answer to fit a different situation.
I assume you are trying to get an HTML file from the web, and run Javascript from inside this file, to act on said document.
Unfortunately, none of the Python xml libraries have true DOM support, and W3C DOM compliance is nonexistent in every package I have found.
What you can do is use the PyV8 w3c.py dom file as a starting example, and create your own full DOM.
W3C Sample Dom
You will need to rewrite this module, though, as it does not respect quotes or apostrophys. BeautifulSoup is also not the speediest parser.
I would recommend using something like lxml.etree's target parser option.
LXML Target Parser
Search for "The feed parser interface".
Then, you can load an HTML/Script document with LXML, parse it as below, and run each of the scripts you need on the created DOM.
Find a partial example below. (Please note that the HTML standards are massive, scattered, and _highly browser specific, so your milage may vary).
class domParser(object):
def __init__(self):
#initialize dom object here, and obtain the root for the destination file object.
self.dom = newAwesomeCompliantDom()
self.document = self.dom.document
self.this = self.document
def comment(self, commentText):
#add commentText to self.document or the above dom object you created
self.this.appendChild(self.document.DOMImplementation.createComment(commentText))
def start(self, tag, attrs):
#same here
self.this = self.this.appendChild(self.document.DOMImplimentation.newElement(tag,attrs))
def data(self, dataText):
#append data to the last accessed element, as a new Text child
self.this.appendChild(self.document.DOMImpl.createDataNode(dataText))
def end(self):
#closing element, so move up the tree
self.this = self.this.parentNode
def close(self):
return self.document
#unchecked, please validate yourself
x = lxml.etree.parse(target=domParser)
x.feed(htmlFile)
newDom = x.close()

In Python 3.2 how would I dynamically open and compile a class from a specified file?

Code is much more precise than English; Here's what I'd like to do:
import sys
fileName = sys.argv[1]
className = sys.argv[2]
# open py file here and import the class
# ???
# Instantiante new object of type "className"
a = eval(className + "()") # I don't know if this is the way to do that.
# I "know" that className will have this method:
a.writeByte(0x0)
EDIT:
Per the request of the answers, here's what I'm trying to do:
I'm writing a virtual processor adhering to the SIC/XE instruction set. It's an educational theoretical processor used to teach the fundamentals of assembly language and systems software to computer science students. There is a notion of a "device" that I'm trying to abstract from the programming of the "processor." Essentially, I want the user of my program to be able to write their own device plugin (limited to "read_byte" and "write_byte" functionality) and then I want them to be able to "hook up" their devices to the processor at command-line time, so that they can write something like:
python3 sicsim -d1 dev1module Dev1Class -d2 ...
They would also supply the memory image, which would know how to interact with their device. I basically want both of us to be able to write our code without it interfering with each other.
Use importlib.import_module and the built in function getattr. No need for eval.
import sys
import importlib
module_name = sys.argv[1]
class_name = sys.argv[2]
module = importlib.import_module(module_name)
cls = getattr(module, class_name)
obj = cls()
obj.writeByte(0x0)
This will require that the file lives somewhere on your python path. Most of the time, the current directory is on said path. If this is not sufficient, you'll have to parse the directory out of it and append it to the sys.path. I'll be glad to help with that. Just give me a sample input for the first commandline argument.
Valid input for this version would be something like:
python3 myscript.py mypackage.mymodule MyClass
As aaronasterling mentions, you can take advantage of the import machinery if the file in question happens to be on the python path (somewhere under the directories listed in sys.path), but if that's not the case, use the built in exec() function:
fileVars = {}
exec(file(fileName).read(), fileVars)
Then, to get an instance of the class, you can skip the eval():
a = fileVars[className]()

Can I add "Smartypants" to restructuredText?

I use restructuredText, and I like what smartypants does for Markdown. Is there a way to enable the same thing for restructuredText?
Have you tried smartypants.py? I don't know how well it's implemented, much less how well it works for your specific use cases, but it does seem to target exactly your goal, unicode-ification of some ascii constructs (however, it runs on HTML, so I guess you'd run it after restructuredText or whatever other "producer of HTML" component).
If that doesn't work well for you, a user has submitted a patch to python-markdown2 which he calls "this SmartyPants patch" -- it's been accepted and since a month ago it's part of the current source tree of python-markdown2 (r259 or better). That may offer smoother sailing (e.g. if you just get and built python-markdown2 as a read-only svn tree). Or, you could wait for the next downloadable release (there hasn't been one since May and this patch was accepted in mid-July), but who knows when that'll happen.
As Alex Martelli says, smartyPants is what I need. However, I was looking for a little more detailed info on how to use it. So here's a Python script that reads the file named in the first command line argument, converts it to HTML, using Pygments for sourcecode, and then passses it through smartypants for prettifying.
#!/usr/bin/python
# EASY-INSTALL-SCRIPT: 'docutils==0.5','rst2html.py'
"""
A minimal front end to the Docutils Publisher, producing HTML.
"""
try:
from ulif.rest import directives_plain
from ulif.rest import roles_plain
from ulif.rest import pygments_directive
import locale
locale.setlocale(locale.LC_ALL, '')
except:
pass
from docutils.core import publish_doctree, publish_from_doctree
from smartypants import smartyPants
import sys
description = ('Personal docutils parser with extra features.')
doctree = publish_doctree(file(sys.argv[1]).read())
result = publish_from_doctree(doctree, writer_name='html')
result = smartyPants(result)
print result

Categories