Specifying anchor names in reST - python

I'm creating HTML from reST using the rst2html tool which comes with docutils. It seems that the code already assigns id attributes to the individual sections, which can be used as fragment identifiers in a URL, i.e. as anchors to jump to a specific part of the page. Those id values are based on the text of the section headline. When I change the wording of that headline, the identifier will change as well, rendering old URLs invalid.
Is there a way to specify the name to use as an identifier for a given section, so that I can edit the headline without invalidating links? Would there be a way if I were to call the docutils publisher myself, from my own script?

I don't think you can set an explicit id in reST sections, but I could be mistaken.
If you'd rather have numbered ids, which will depend on the ordering of the sections in the document tree, rather than their titles, you can do it with a small change to document.set_id() method in docutils/nodes.py (at line 997 on my version.)
Here is the patch:
 def set_id(self, node, msgnode=None):
     for id in node['ids']:
         if id in self.ids and self.ids[id] is not node:
             msg = self.reporter.severe('Duplicate ID: "%s".' % id)
             if msgnode != None:
                 msgnode += msg
     if not node['ids']:
-        for name in node['names']:
-            id = self.settings.id_prefix + make_id(name)
-            if id and id not in self.ids:
-                break
-        else:
+        if True: #forcing numeric ids
             id = ''
             while not id or id in self.ids:
                 id = (self.settings.id_prefix +
                       self.settings.auto_id_prefix + str(self.id_start))
                 self.id_start += 1
         node['ids'].append(id)
     self.ids[id] = node
     return id
I just tested it and it generates the section ids as id1, id2...
If you don't want to change this system-wide file, you can probably monkey-patch it from a custom rst2html command.

I'm not sure if I really understand your question.
You can create explicit hyperlink targets to arbitrary locations in your document which can be used to reference these locations independent of the implicit hyperlink targets created by docutils:
.. _my_rstfile:

------------------
This is my rstfile
------------------

.. _a-section:

First Chapter
-------------

This is a link to a-section_ which is located in my_rstfile_.
However, since it seems you want to create links between multiple rst files, I would advise using Sphinx: it can handle references to arbitrary locations across different files and has further advantages, such as a toctree and theming. You can use Sphinx not only for source code documentation but also for general text processing. The Sphinx documentation itself is one example (there are hundreds of others on readthedocs).
Getting started with Sphinx should be simple using sphinx-quickstart. You can simply add your existing rst files to the toctree in index.rst and run make html. If you want to document Python code, you can use sphinx-apidoc, which automatically generates API documentation.

I made a Sphinx extension to solve this problem. The extension takes the preceding internal target and uses that as the section's ID. Example (from bmu's answer):
.. _a-section:

First Chapter
-------------
The permalink on "First Chapter" would point to #a-section instead of #first-chapter. If there are multiple targets before the section, it takes the last one.
Link to the extension: https://github.com/GeeTransit/sphinx-better-subsection

Related

office365-rest-client: How to get name of file modifier

I am mapping a SharePoint document library using office365-rest-client. My intention is to make a dictionary of the form:
file_dict = {File.serverRelativeUrl: [file_attribute_1, file_attribute_2, ...]}
Where File.serverRelativeUrl is a string, and one of the above-mentioned attributes is to be the name of the latest person to modify the file.
The File class (seen here) also has a method modified_by() that I have been trying to use to determine the name of the person who last modified the file. However, using this returns an instance of the User class (seen here).
When looking at the code behind User, it doesn't appear to contain any method that would allow for the name of the modifier to be retrieved.
When looking at the files saved within my SharePoint document library, it is clear that the names of these users are present:
Therefore, I would like to know the following:
Does anybody know of the correct method to determine the names of the file modifiers (if one exists)?
Alternatively, is it possible to determine the email addresses of these users instead?
I have already attempted to use the File.properties attribute, but have found that the modifier name / mailing address is not included within these properties:
import json
# Get file
print(json.dumps(file.properties))

Sphinx (for Python) - Search bar disappeared

I'm new to Sphinx and having completed my documentation I noticed the search bar has disappeared when it was there previously.
I haven't (knowingly) changed anything in the conf.py file and am not sure why this has happened.
Does anybody know how to get it back?
You need to include
* :ref:`modindex`
in the file named ./index.rst.
From the docs ...
The special document names (and pages generated for them) are:
genindex, modindex, search
These are used for the general index, the Python module index, and the
search page, respectively.
The general index is populated with entries from modules, all
index-generating object descriptions, and from index directives.

Word & Python - Create Table of Contents

I'm using the win32com.client module (from pywin32) for Python to build a Word document. I have tried a good number of methods to generate a ToC, but all have failed.
I think what I want to do is call the ActiveDocument object and create one with something like this example from the MSDN page:
Set myRange = ActiveDocument.Range(Start:=0, End:=0)
ActiveDocument.TablesOfContents.Add Range:=myRange, _
UseFields:=False, UseHeadingStyles:=True, _
LowerHeadingLevel:=3, _
UpperHeadingLevel:=1
Except in Python it would be something like:
wordObject.ActiveDocument.TablesOfContents.Add(Range=???, UseFields=False, UseHeadingStyles=True, LowerHeadingLevel=3, UpperHeadingLevel=1)
I've built everything so far using the 'Selection' object (example below) and wish to add this ToC after the first page break.
Here's a sample of what the document looks like:
import win32com.client

objWord = win32com.client.Dispatch("Word.Application")
objDoc = objWord.Documents.Open('pathtotemplate.docx') #
objSel = objWord.Selection
#These seem to work but I don't know why...
objWord.ActiveDocument.Sections(1).Footers(1).PageNumbers.Add(1,True)
objWord.ActiveDocument.Sections(1).Footers(1).PageNumbers.NumberStyle = 57
objSel.Style = objWord.ActiveDocument.Styles("Heading 1")
objSel.TypeText("TITLE PAGE AND STUFF")
objSel.InsertParagraph()
objSel.TypeText("Some data or another")
objSel.TypeParagraph()
objWord.Selection.InsertBreak()
####INSERT TOC HERE####
Any help would be greatly appreciated! In a perfect world I'd use the default first option which is available from the Word GUI but that seems to point to a file and be harder to access (something about templates).
Thanks
Manually edit your template in Word: add the ToC (which will be empty initially) along with any intro material, headers/footers, etc. Then, at the point where you want your text content inserted (i.e. after the ToC), put a uniquely named bookmark. In your code, create a new document based on the template (or open the template and save it under a different name), find the bookmark, and insert your content there. Save the result under a different filename.
This approach has all sorts of advantages: you can format your template in Word rather than coding all the formatting details, so when someone wants the Normal font to be bigger, smaller, or pink, you can do it just by editing the template. Make sure to use styles in your code and only apply direct formatting where it specifically differs from the default style.
I'm not sure how to make sure the ToC is actually generated; it might be updated automatically on every save.

Add a field to existing document in CouchDB

I have a database with a bunch of regular documents that look something like this (example from wiki):
{
"_id":"some_doc_id",
"_rev":"D1C946B7",
"Subject":"I like Plankton",
"Author":"Rusty",
"PostedDate":"2006-08-15T17:30:12-04:00",
"Tags":["plankton", "baseball", "decisions"],
"Body":"I decided today that I don't like baseball. I like plankton."
}
I'm working in Python with couchdb-python and I want to know if it's possible to add a field to each document. For example, if I wanted to have a "Location" field or something like that.
Thanks!
Regarding IDs
Every document in couchdb has an id, whether you set it or not. Once the document is stored you can access it through the doc._id field.
If you want to set your own ids you'll have to assign the id value to doc._id. If you don't set it, then couchdb will assign a uuid.
If you want to update a document, then you need to make sure you have the same id and a valid revision. If say you are working from a blog post and the user adds the Location, then the url of the post may be a good id to use. You'd be able to instantly access the document in this case.
So what's a revision
In your code snippet above you have the doc._rev element. This is the identifier of the revision. If you save a document with an id that already exists, couchdb requires you to prove that the document is still the valid doc and that you are not trying to overwrite someone else's document.
So how do I update a document
If you have the id of your document, you can just access each document by using the db.get(id) function. You can then update the document like this:
doc = db.get(id)
doc['Location'] = "On a couch"
db.save(doc)
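To apply the same change to every document rather than a single one, a loop along these lines may work. This is a minimal sketch: add_field_to_all is a hypothetical helper (not part of couchdb-python), and it assumes db behaves like a couchdb-python Database, i.e. iterating it yields document ids, db[doc_id] returns the document, and db.save(doc) stores it. Design documents (ids starting with _design/) are skipped, and documents that already have the field are left alone.

```python
def add_field_to_all(db, field, default):
    """Add `field` with value `default` to every regular document lacking it.

    Hypothetical helper; assumes `db` acts like a couchdb-python Database.
    Returns the number of documents that were changed.
    """
    updated = 0
    for doc_id in list(db):            # iterating a Database yields document ids
        if doc_id.startswith("_design/"):
            continue                   # leave design documents (views) untouched
        doc = db[doc_id]               # the fetched copy includes the current _rev
        if field in doc:
            continue                   # keep any value that is already set
        doc[field] = default
        db.save(doc)                   # the _rev fetched above authorizes the update
        updated += 1
    return updated
```

With a real database object, a call such as add_field_to_all(db, "Location", "unknown") would be the intended usage; the default value "unknown" is only an example.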
I have an example where I store weather forecast data. I update the forecasts approximately every 2 hours. A separate process is looking for data that I get from a different provider looking at characteristics of tweets on the day.
This looks something like this.
doc = db.get(id)
doc_with_loc = GetLocationInformationFromOtherProvider(doc) # takes about 40 seconds.
doc_with_loc["_rev"] = doc["_rev"]
db.save(doc_with_loc) # This will fail if weather update has also updated the file.
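If you have concurrent processes, the _rev can become stale, so you need a fail-safe. For example, something like this could do (with couchdb-python, a rejected save raises ResourceConflict):
import couchdb
doc = db.get(id)
doc_with_loc = GetLocationInformationFromAltProvider(doc)
update_outstanding = True
while update_outstanding:
    doc = db.get(id)  # re-retrieve this to get the latest revision
    doc_with_loc["_rev"] = doc["_rev"]
    try:
        db.save(doc_with_loc)
        update_outstanding = False
    except couchdb.http.ResourceConflict:
        pass  # another process saved the document in between; retry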
So how do I get the Ids?
One option suggested above is that you actively set the id, so you can retrieve it, i.e. if a user sets a given location that is attached to a URL, use the URL. But you may not know which document you want to update, or you may even have a process that finds all the documents that don't have a location and assigns one.
You'll most likely be using a view for this. Views have a mapper and a reducer. You'll use the first one, forget about the last one. A view with a mapper does the following:
It returns a simplified/transformed way of looking at your data. You can return multiple values per document or skip some. It gives the data you emit a key, and if you use the include_docs option it will also give you the document (with _id and _rev alongside).
The simplest view is the default view db.view('_all_docs'). This returns all documents, and you may not want to update all of them. Note that when you define views, they are stored as documents as well.
The next simplest way is to have a view that only returns documents of a given type. I tend to have a _type="article" field in my documents. Think of this as marking that a document belongs to a certain table, as if you had stored it in a relational database.
Finally, you can filter for documents that do not yet have a location, so you'd have a view over which you can iterate all the docs that still need a location and assign one in a separate process. The best documentation on writing views can be found here.
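A view that lists documents missing a field could be defined roughly as follows. This is a sketch under assumptions: the design document name maintenance, the view name missing_location, and the helper build_missing_field_design_doc are made up for illustration. The map function itself is JavaScript stored as a string, which is what CouchDB's default view server runs.

```python
import json

def build_missing_field_design_doc(field, design_name="maintenance"):
    """Return a design document with one view listing docs that lack `field`.

    Hypothetical helper; the design and view names are illustrative only.
    """
    # The map function is JavaScript and is stored as a plain string.
    map_fun = (
        "function (doc) {\n"
        "  if (doc[%s] === undefined) {\n"
        "    emit(doc._id, null);\n"
        "  }\n"
        "}" % json.dumps(field)
    )
    return {
        "_id": "_design/" + design_name,
        "views": {"missing_" + field.lower(): {"map": map_fun}},
    }

design_doc = build_missing_field_design_doc("Location")
print(json.dumps(design_doc, indent=2))
```

With couchdb-python, saving this dictionary once with db.save(design_doc) and then iterating db.view('maintenance/missing_location', include_docs=True) should give rows whose row.doc can be updated and saved in the same way as a document fetched with db.get(id).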

PyYAML and unusual tags

I am working on a project that uses the Unity3D game engine. For some of the pipeline requirements, it is best to be able to update some files from external tools using Python. Unity's meta and anim files are in YAML, so I thought this would be straightforward enough using PyYAML.
The problem is that Unity's format uses custom attributes and I am not sure how to work with them as all the examples show more common tags used by Python and Ruby.
Here is what the top lines of a file look like:
%YAML 1.1
%TAG !u! tag:unity3d.com,2011:
--- !u!74 &7400000
AnimationClip:
  m_ObjectHideFlags: 0
  m_PrefabParentObject: {fileID: 0}
  ...
When I try to read the file I get this error:
could not determine a constructor for the tag 'tag:unity3d.com,2011:74'
Having looked at the other questions on this topic, this tag scheme does not seem to resemble anything in those questions and answers. For example, this file uses "!u!", and I was unable to figure out what it means or how something similar would behave (my wild, uneducated guess is that it is an alias or namespace).
I can do a hack way and strip the tags out but that is not the ideal way to try to do this. I am looking for help on a solution that will properly handle the tags and allow me to parse & encode the data in a way that preserves the proper format.
Thanks,
-R
I also had this problem, and the internet was not very helpful. After bashing my head against this problem for 3 days, I was able to sort it out...or at least get a working solution. If anyone wants to add more info, please do. But here's what I got.
1) The documentation on Unity's YAML file format(they call it a "textual scene file" because it contains text that is human readable) - http://docs.unity3d.com/Manual/TextualSceneFormat.html
It is a YAML 1.1 compliant format. So you should be able to use PyYAML or any other Python YAML library to load up a YAML object.
Okay, great. But it doesn't work. Every YAML library has issues with this file.
2) The file is not correctly formed. It turns out, the Unity file has some syntactical issues that make YAML parsers error out on it. Specifically:
2a) At the top, it uses a %TAG directive to create an alias for the string "tag:unity3d.com,2011:". It looks like:
%TAG !u! tag:unity3d.com,2011:
What this means is that anywhere you see "!u!", it stands for "tag:unity3d.com,2011:".
2b) Then it goes on to use "!u!" all over the place before each object stream. But the problem is that - to be YAML 1.1 compliant - it should actually declare a tag alias for each stream (any time a new object starts with "--- "). Declaring it once at the top and never again is only valid for the first stream, and the next stream knows nothing about "!u!", so it errors out.
Also, this tag is useless. It basically appends "tag:unity3d.com,2011" to each entry in the stream. Which we don't care about. We already know it's a Unity YAML file. Why clutter the data?
3) The object types are given by Unity's Class ID. Here is the documentation on that:
http://docs.unity3d.com/Manual/ClassIDReference.html
Basically, each stream is defined as a new class of object...corresponding to the IDs in that link. So a "GameObject" is "1", etc. The line looks like this:
--- !u!1 &100000
So the "--- " defines a new stream. The "!u!" is an alias for "tag:unity3d.com,2011" and the "&100000" is the file ID for this object (inside this file, if something references this object, it uses this ID....remember YAML is a node-based representation, so that ID is used to denote a node connection).
The next line is the root of the YAML object, which happens to be the name of the Unity Class...example "GameObject". So it turns out we don't actually need to translate from Class ID to Human Readable node type. It's right there. If you ever need to use it, just take the root node. And if you need to construct a YAML object for Unity, just keep a dictionary around based on that documentation link to translate "GameObject" to "1", etc.
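To make the header line concrete, the sketch below splits a line such as "--- !u!1 &100000" into its class ID and file ID, and expands the "!u!" handle the way the %TAG directive describes. The function name parse_unity_header and the tiny CLASS_NAMES table are illustrative assumptions; only the 1 = GameObject and 74 = AnimationClip entries come from the examples in this thread, and a real table would be filled in from the Class ID reference.

```python
import re

TAG_PREFIX = "tag:unity3d.com,2011:"   # what the %TAG line maps "!u!" to
CLASS_NAMES = {1: "GameObject", 74: "AnimationClip"}  # partial, illustrative

# "--- !u!<class id> &<file id>"; a leading minus on the file ID is allowed
# here as a precaution (assumption, not taken from the thread).
HEADER_RE = re.compile(r"^--- !u!(\d+) &(-?\d+)")

def parse_unity_header(line):
    """Return (class_id, file_id, full_tag) for a Unity document header,
    or None if the line is not such a header."""
    match = HEADER_RE.match(line)
    if match is None:
        return None
    class_id = int(match.group(1))
    file_id = int(match.group(2))
    return class_id, file_id, TAG_PREFIX + str(class_id)

print(parse_unity_header("--- !u!1 &100000"))
print(CLASS_NAMES.get(74, "unknown class"))
```

The expanded tag is exactly the string PyYAML complains about in the error message from the question, which is why removing the "!u!<id>" part from the header makes the error go away.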
The other problem is that most YAML parsers (PyYAML is the one I tested) only support 3 types of YAML objects out of the box:
Scalar
Sequence
Mapping
You can define/extend custom nodes. But this amounts to hand writing your own YAML parser because you have to define EXPLICITLY how each YAML constructor is created, and outputs. Why would I use a Library like PyYAML, then go ahead and write my own parser to read these custom nodes? The whole point of using a library is to leverage previous work and get all that functionality from day one. I spent 2 days trying to make a new constructor for each class ID in unity. It never worked, and I got into the weeds trying to build the constructors correctly.
THE GOOD NEWS/SOLUTION:
Turns out, all the Unity nodes I've ever run into so far are basic "Mapping" nodes in YAML. So you can throw away the custom node mapping and just let PyYAML auto-detect the node type. From there, everything works great!
In PyYAML, you can pass a file object or a string. So my solution was to write a simple five-line pre-parser that strips out the bits that confuse PyYAML (the bits where Unity's syntax is off) and feeds the resulting string to PyYAML.
1) Remove line 2 entirely, or just ignore it:
%TAG !u! tag:unity3d.com,2011:
We don't care. We know it's a unity file. And the tag does nothing for us.
2) For each stream declaration, remove the tag alias ("!u!") and remove the class ID. Leave the fileID. Let PyYAML auto-detect the node as a Mapping node.
--- !u!1 &100000
becomes...
--- &100000
3) The rest, output as is.
The code for the pre-parser looks like this:
def removeUnityTagAlias(filepath):
    """
    Name:               removeUnityTagAlias()
    Description:        Loads a file object from a Unity textual scene file, which is in a pseudo YAML style, and strips the
                        parts that are not YAML 1.1 compliant. Then returns a string as a stream, which can be passed to PyYAML.
                        Essentially removes the "!u!" tag and class type from each header line but keeps the "&" file ID.
                        PyYAML seems to handle the rest just fine after that.
    Returns:            String (YAML stream as string)
    """
    result = str()
    sourceFile = open(filepath, 'r')
    for lineNumber,line in enumerate( sourceFile.readlines() ):
        if line.startswith('--- !u!'):
            result += '--- ' + line.split(' ')[2] + '\n'   # remove the tag, but keep file ID
        else:
            # Just copy the contents...
            result += line
    sourceFile.close()
    return result
To create a PyYAML object from a Unity textual scene file, call your pre-parser function on the file:
import yaml
# This fixes Unity's YAML %TAG alias issue.
fileToLoad = '/Users/vlad.dumitrascu/<SOME_PROJECT>/Client/Assets/Gear/MeleeWeapons/SomeAsset_test.prefab'
UnityStreamNoTags = removeUnityTagAlias(fileToLoad)
ListOfNodes = list()
# safe_load_all needs no explicit Loader argument (yaml.load_all requires one in PyYAML 6+)
for data in yaml.safe_load_all(UnityStreamNoTags):
    ListOfNodes.append( data )
# Example, print each object's name and type
for node in ListOfNodes:
    nodeType = list(node.keys())[0]   # node.keys()[0] only works on Python 2
    if 'm_Name' in node[nodeType]:
        print( 'Name: ' + node[nodeType]['m_Name'] + ' NodeType: ' + nodeType )
    else:
        print( 'Name: ' + 'No Name Attribute' + ' NodeType: ' + nodeType )
Hope that helps!
-Vlad
PS. To Answer the next issue in making this usable:
You also need to walk the entire project directory and parse all ".meta" files for the "GUID", which is Unity's inter-file reference. So, when you see a reference in a Unity YAML file for something like:
m_Materials:
- {fileID: 2100000, guid: 4b191c3a6f88640689fc5ea3ec5bf3a3, type: 2}
That file is somewhere else, and you can recursively open it to find any dependencies.
I just ripped through the game project and saved a dictionary of GUID:Filepath Key:Value pairs which I can match against.
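A GUID-to-path dictionary of that kind could be built along these lines. This is a minimal sketch, assuming each ".meta" file contains a top-level line of the form "guid: <32 hex characters>" and sits next to the asset it describes (so the asset path is the ".meta" path without the suffix); build_guid_index is a hypothetical name.

```python
import os
import re

# Matches a line like "guid: 4b191c3a6f88640689fc5ea3ec5bf3a3".
GUID_RE = re.compile(r"^guid:\s*([0-9a-f]{32})\s*$", re.MULTILINE)

def build_guid_index(project_root):
    """Map each GUID found in a .meta file to the path of the asset it describes."""
    index = {}
    for dirpath, _dirnames, filenames in os.walk(project_root):
        for filename in filenames:
            if not filename.endswith(".meta"):
                continue
            meta_path = os.path.join(dirpath, filename)
            with open(meta_path, "r", encoding="utf-8", errors="replace") as handle:
                match = GUID_RE.search(handle.read())
            if match:
                # Assumption: "Foo.prefab.meta" describes "Foo.prefab".
                index[match.group(1)] = meta_path[:-len(".meta")]
    return index
```

A reference such as {fileID: 2100000, guid: 4b191c3a6f88640689fc5ea3ec5bf3a3, type: 2} can then be resolved with index.get("4b191c3a6f88640689fc5ea3ec5bf3a3"), which returns None when no matching ".meta" file was found.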
