Auto-generate title from filename in Pelican - python

1. Summary
I can’t automatically generate the correct title from the filenames of my articles and pages.
For example, I can’t automatically generate the metadata key title with the value Kira Goddess from the article Kira-Goddess.md.
2. Argumentation
DRY, automation. I don’t want to write the title by hand for every article and page if it can be generated automatically.
The exception is files whose names contain hyphenated words such as “well-known” or “English-speaking”. For those I must specify the title explicitly in the article metadata, but hyphenated words are rare in my article filenames.
3. MCVE
3.1. Data
You can see it in my KiraTitleFromFilename branch of my repository for Pelican debugging.
pelicanconf.py:
"""MCVE."""
AUTHOR = 'Sasha Chernykh'
SITENAME = 'SashaPelicanDebugging'
SITEURL = '.'
PATH = 'content'
TIMEZONE = 'Europe/Moscow'
DEFAULT_LANG = 'en'
# [INFO] Use the article filename as the source of the slug:
# https://docs.getpelican.com/en/stable/settings.html#url-settings
SLUGIFY_SOURCE = 'basename'
# [INFO] Preserve case of article filename
SLUGIFY_PRESERVE_CASE = True
# [INFO] Get title from article filename:
# https://docs.getpelican.com/en/stable/settings.html#metadata
# https://github.com/getpelican/pelican/issues/2107
# https://github.com/getpelican/pelican/commit/2e82a53cdf3f1f9d66557850cc2811479d5bb645
FILENAME_METADATA = '(?P<title>.*)'
Kira-Goddess.md:
Date: 2020-09-24 18:57:33
Kira Goddess!
The other Pelican files were generated by pelican-quickstart.
Simplified part of base.html:
<title>{{ article.title }}</title>
3.2. Steps to reproduce
See .travis.yml:
Run Pelican build:
pelican content -s pelicanconf.py --fatal warnings --verbose
Finding content of <title> tag:
grep -E "<title>.*</title>" output/Kira-Goddess.html
3.3. Behavior
3.3.1. Current
See Travis build:
<title>Kira-Goddess</title>
3.3.2. Desired
It would be nice if:
<title>{{ article.title }}</title>
were rendered as:
<title>Kira Goddess</title>
4. Not helped
In the description of the EXTRA_PATH_METADATA variable I read that Pelican uses Python’s named-group notation (?P<name>…). I couldn’t find out how to make substitutions in the FILENAME_METADATA string (print(type(FILENAME_METADATA)) → <class 'str'>). I tried variants such as:
import re
KIRA_DEFAULT_FILENAME_REGEX = '(?P<title>.*)'
FILENAME_METADATA = re.sub(KIRA_DEFAULT_FILENAME_REGEX, "-", " ")
or
KIRA_DEFAULT_FILENAME_REGEX = '(?P<title>.*)'
FILENAME_METADATA = KIRA_DEFAULT_FILENAME_REGEX.replace("-", "")
It doesn’t work.
5. Don’t offer
5.1. Use Jinja2 filters in templates
5.1.1. Suggestion
Use replace() filter in your template files like this:
<title>{{ article.title|replace('-', " ") }}</title>
5.1.2. Why is it not good
Pelican plugins (e.g. Pelican Open Graph) will still use article.title, so the unwanted value Kira-Goddess, not Kira Goddess, will still be passed to plugins.
5.2. Use spaces in your Markdown
5.2.1. Suggestion
For example, name your file Kira Goddess.md, not Kira-Goddess.md.
5.2.2. Why is it not good
Whitespace in filenames is bad practice.

It seems that Pelican doesn't provide a way to do what you want.
The FILENAME_METADATA regex can only capture a value from the filename; it cannot replace - with spaces in the captured value.
So I think the best option for now is to specify the title manually in each file's metadata.
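To illustrate the limitation, here is a minimal sketch in plain Python. It is not Pelican's internal code. It only shows that a named group returns the matched text unchanged, and that replacing hyphens is a separate step which has to run somewhere after the match. The helper names captured_title and spaced_title are hypothetical.

```python
import re

# Same pattern as FILENAME_METADATA in the question.
FILENAME_METADATA = r"(?P<title>.*)"


def captured_title(basename: str) -> str:
    """Return what the named group 'title' captures from a file's base name."""
    match = re.match(FILENAME_METADATA, basename)
    return match.group("title")


def spaced_title(basename: str) -> str:
    """Hypothetical post-processing step: swap hyphens for spaces."""
    return captured_title(basename).replace("-", " ")


print(captured_title("Kira-Goddess"))  # Kira-Goddess
print(spaced_title("Kira-Goddess"))    # Kira Goddess
```

The pattern itself has no place to express the second step, which is why changing only the FILENAME_METADATA string cannot produce Kira Goddess.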

Related

Django debug toolbar

I can't see the Django Debug Toolbar on a simple HTML page. I'm new to Django, and the tutorial I'm following is a little outdated. I have met all the requirements: STATIC_URL = "static/", INSTALLED_APPS = ["django.contrib.staticfiles"], and Backend and APP_DIRS are correct. debug_toolbar is in INSTALLED_APPS, I added the debug URL to the urlpatterns list, and 'debug_toolbar.middleware.DebugToolbarMiddleware' is at the top of the middleware list. INTERNAL_IPS is set to 127.0.0.1; if I change it, the debug toolbar code disappears from the source of the web page. I made sure that DEBUG = True.
I mostly use PyCharm and heard that running the runserver command there might be a problem, so I also tried it from cmd, multiple times. When I view the page source, I see the code for the debug toolbar as well as my HTML.
I thought my HTML might be badly written (I have never used it before). This is what it looks like, word for word:
<html>
<head>
<title>Example</title>
</head>
<body>
<p>This is an example of a simple HTML page with one paragraph.</p>
</body>
</html>
The latest versions of django-debug-toolbar and Django are installed. I tried Chrome, Edge and Explorer; the result is the same in all of them. I've tried a few tricks, such as
def show_toolbar(request):
    return True
DEBUG_TOOLBAR_CONFIG = {
    "SHOW_TOOLBAR_CALLBACK": show_toolbar,
}
DEBUG_TOOLBAR_CONFIG = {"SHOW_TOOLBAR_CALLBACK": lambda x: True}
Someone suggested changing HKEY_CLASSES_ROOT.js\ because I'm on Windows 10. I haven't figured out how to do that yet.
EDIT: this is the video I'm following along with
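A commonly suggested alternative to the registry edit mentioned above is to register the JavaScript MIME type from Python, for example near the top of settings.py (the placement is an assumption). The sketch below uses only the standard mimetypes module. Whether it fixes this particular setup depends on the registry really being the cause, which the question does not confirm.

```python
import mimetypes

# Assumption: on this Windows machine the registry maps ".js" to a
# non-JavaScript type, so the browser refuses to run the toolbar script.
# Registering the type explicitly overrides that mapping for this process.
mimetypes.add_type("application/javascript", ".js", True)

print(mimetypes.guess_type("toolbar.js")[0])
```

After restarting runserver, the Content-Type of the toolbar's .js files can be checked in the browser's network tab.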

How to combine two mechanism for localization site written on Pelican?

I use two mechanisms to localize my site:
1. I use the standard template tag {{ gettext('some_text') }} in my index.html.
2. I wrote a custom Jinja extension that loads the content of a Markdown file according to the language used on the site.
I use Babel to create the messages.pot file and then the messages.po file.
I have this Babel configuration in babel.cfg:
[jinja2: theme/templates/index.html]
silent = false
And this is my custom jinja extension - custom_jinja_extension.py :
from jinja2 import nodes
from jinja2.ext import Extension
from markdown import Markdown
class IncludeMarkdownExtension(Extension):
    """
    a set of names that trigger the extension.
    """
    tags = set(['include_markdown'])
    def __init__(self, environment):
        super(IncludeMarkdownExtension, self).__init__(environment)
    def parse(self, parser):
        tag = parser.stream.__next__()
        ctx_ref = nodes.ContextReference()
        if tag.value == 'include_markdown':
            args = [ctx_ref, parser.parse_expression(), nodes.Const(tag.lineno)]
            body = parser.parse_statements(['name:endinclude_markdown'], drop_needle=True)
            callback = self.call_method('convert', args)
            return nodes.CallBlock(callback, [], [], body).set_lineno(tag.lineno)
    def convert(self, context, tagname, linenum, caller):
        """
        Function for converting markdown to html
        :param tagname: name of converting file
        :return: converting html
        """
        for item in context['extra_siteurls']:
            if item == context['main_lang']:
                input_file = open('content/{}/{}'.format('en', tagname))
            else:
                input_file = open('content/{}/{}'.format(context['main_lang'], tagname))
        text = input_file.read()
        html = Markdown().convert(text)
        return html
I use this template tag - {% include_markdown 'slide3.md' %}{% endinclude_markdown %}
In my pelicanconf.py I add these lines for the Jinja extensions:
# Jinja2 extensions
JINJA_ENVIRONMENT = {
    'extensions': [
        'jinja2_markdown.MarkdownExtension',
        'jinja2.ext.i18n',
        'custom_jinja_extension.IncludeMarkdownExtension'
    ]
}
When I run the command:
pybabel extract --mapping babel.cfg --output messages.pot ./
I get this error
jinja2.exceptions.TemplateSyntaxError: Encountered unknown tag
'include_markdown'. Jinja was looking for the following tags:
'endblock'. The innermost block that needs to be closed is 'block'.
When I remove every use of the custom template tag, gettext works well. What am I doing wrong?
The trouble was in the path. Babel was looking for the Jinja extension in the virtualenv, in the jinja folder, but my custom Jinja extension was in the project folder.
That's why I ran this command in the terminal:
export PYTHONPATH=$PYTHONPATH:/local/path/to/the/project/
and changed my babel.cfg:
[jinja2: theme/templates/index.html]
extensions=jinja2.ext.i18n, custom_jinja_extension.IncludeMarkdownExtension
silent = false
After these changes, Babel found my custom extension custom_jinja_extension and created the messages.pot file correctly!
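The role of the search path can be checked with a small sketch using only the standard library. It is an illustration, not part of Babel or Pelican. A temporary directory stands in for the project folder, the placeholder file is hypothetical, and appending to sys.path at runtime is used as a stand-in comparable to extending PYTHONPATH before starting pybabel.

```python
import importlib.util
import sys
import tempfile
from pathlib import Path


def is_importable(module_name: str) -> bool:
    """Return True if Python can locate module_name on the current sys.path."""
    return importlib.util.find_spec(module_name) is not None


# Stand-in for the project folder: a temporary directory holding a
# hypothetical module named like the extension file in the question.
project_dir = tempfile.mkdtemp()
Path(project_dir, "custom_jinja_extension.py").write_text("# placeholder\n")

before = is_importable("custom_jinja_extension")
sys.path.append(project_dir)  # comparable to extending PYTHONPATH
importlib.invalidate_caches()
after = is_importable("custom_jinja_extension")

print(before, after)
```

If the module cannot be located, an entry under extensions= in babel.cfg that names it cannot be loaded either, whatever the template contains.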

Disable rendering some md and html files in Pelican

1. Briefly
I can't find a way to disable rendering of some files with the md and html extensions.
2. Detail
I use Pelican and write my articles in Markdown. For example, I want to create a custom 404 page on GitHub Pages. I need two files in the root directory of my site: 404.md and 404.html. I create these files in my content folder, run the pelican content command, and get this output:
D:\Kristinita>pelican content
WARNING: Meta tag in file D:\Kristinita\content\404.html does not have a 'name' attribute, skipping. Attributes: http-equiv="X-UA-Compatible", content="IE=edge"
ERROR: Skipping .\404.md: could not find information about 'title'
3. Example of expected behavior
I set in pelicanconf.py:
NOT_RENDERING = ['404.md', '404.html']
I run pelican content, and the 404.md and 404.html files are copied to the output unmodified.
4. Did not help
I set in pelicanconf.py file:
STATIC_PATHS = ['']
Files with other extensions are copied to the output directory without modification, warnings or errors, but this doesn't work for md and html files.
I use a “hack”: I write the extensions in UPPERCASE, for example 404.MD and 404.HTML instead of 404.md and 404.html. But GitHub Pages doesn't serve a custom 404 page with UPPERCASE extensions.
I found the OUTPUT_SOURCES setting in the documentation and set in pelicanconf.py:
OUTPUT_SOURCES = True
OUTPUT_SOURCES_EXTENSION = '.md'
I run the pelican content command and still get the error and the warning; the original 404.md doesn't appear in the output. It doesn't solve my problem.
I would suggest moving those files into a separate directory within the content directory, e.g.:
content/
    static/
        404.html
        404.md
Then you can configure Pelican to treat that directory as a static source:
STATIC_PATHS = [
    'static',
]
and move the two files to the root of the output directory on processing:
EXTRA_PATH_METADATA = {
    'static/404.html': {'path': '404.html'},
    'static/404.md': {'path': '404.md'},
}
To make the processor ignore those files, per this GitHub issue, you will also need to set:
ARTICLE_EXCLUDES = [
    'static'
]

How to add custom css file to Sphinx?

How can I add a custom css file? The following config does not work:
# conf.py
html_static_path = ['_static']
html_theme = 'default'
html_theme_options = {
    'cssfiles': ['_static/style.css']
}
Result:
$ make html
Running Sphinx v1.2.2
loading pickled environment... not yet created
building [html]: targets for 2 source files that are out of date
updating environment: 2 added, 0 changed, 0 removed
reading sources... [ 50%] help
reading sources... [100%] index
looking for now-outdated files... none found
pickling environment... done
checking consistency... done
preparing documents...
Theme error:
unsupported theme option 'cssfiles' given
A simpler way is to add this to your conf.py:
def setup(app):
    app.add_css_file('css/custom.css')  # may also be a URL
Then put the file into the _static/css/ folder.
You should be able to include custom CSS by extending the default Sphinx theme. In your conf.py, specify where your extensions to the theme live, for example:
# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']
Then, in _templates, create an extension to the default theme named layout.html that includes your CSS files, for example:
{# layout.html #}
{# Import the layout of the theme. #}
{% extends "!layout.html" %}
{% set css_files = css_files + ['_static/style.css'] %}
See Sphinx's documentation on templating for more information.
The options that you can configure via html_theme_options are theme-dependent. Check out the [options] section of your theme’s theme.conf to find out what is available.
On a global basis, though, you can define html_context in your conf.py to override the settings for css_files (and, for that matter, script_files too):
html_context = {
    'css_files': ['_static/custom.css'],
}
(For reference, have a look at Sphinx’s builders.html.StandaloneHTMLBuilder.prepare_writing() and see how self.globalcontext gets populated there.)
I'm using Sphinx 3.2.
I was able to add some simple custom CSS by doing the following:
1. Add this line in conf.py right under html_static_path = ['_static']:
html_css_files = ['css/custom.css']
2. Go to docs/_static/ and add css/custom.css.
3. Add your custom CSS to that file, then run $ make html.

Generating HTML documents in python

In Python, what is the most elegant way to generate HTML documents? I currently append all of the tags manually to a giant string and write that to a file. Is there a more elegant way of doing this?
You can use yattag to do this in an elegant way. FYI I'm the author of the library.
from yattag import Doc
doc, tag, text = Doc().tagtext()
with tag('html'):
    with tag('body'):
        with tag('p', id = 'main'):
            text('some text')
        with tag('a', href='/my-url'):
            text('some link')
result = doc.getvalue()
It reads like html, with the added benefit that you don't have to close tags.
I would suggest using one of the many template languages available for python, for example the one built into Django (you don't have to use the rest of Django to use its templating engine) - a google query should give you plenty of other alternative template implementations.
I find that learning a template library helps in so many ways - whenever you need to generate an e-mail, HTML page, text file or similar, you just write a template, load it with your template library, then let the template code create the finished product.
Here's some simple code to get you started:
#!/usr/bin/env python
from django.template import Template, Context
from django.conf import settings
settings.configure() # We have to do this to use django templates standalone - see
# http://stackoverflow.com/questions/98135/how-do-i-use-django-templates-without-the-rest-of-django
# Our template. Could just as easily be stored in a separate file
template = """
<html>
<head>
<title>Template {{ title }}</title>
</head>
<body>
Body with {{ mystring }}.
</body>
</html>
"""
t = Template(template)
c = Context({"title": "title from code",
             "mystring": "string from code"})
print(t.render(c))
It's even simpler if you have templates on disk - check out the render_to_string function for django 1.7 that can load templates from disk from a predefined list of search paths, fill them with data from a dictionary and render to a string - all in one function call. (removed from django 1.8 on, see Engine.from_string for comparable action)
If you're building HTML documents then I highly suggest using a template system (like jinja2) as others have suggested. If you're in need of some low-level generation of HTML bits (perhaps as an input to one of your templates), then the xml.etree package is a standard Python package and might fit the bill nicely.
import sys
from xml.etree import ElementTree as ET
html = ET.Element('html')
body = ET.Element('body')
html.append(body)
div = ET.Element('div', attrib={'class': 'foo'})
body.append(div)
span = ET.Element('span', attrib={'class': 'bar'})
div.append(span)
span.text = "Hello World"
if sys.version_info < (3, 0, 0):
    # python 2
    ET.ElementTree(html).write(sys.stdout, encoding='utf-8',
                               method='html')
else:
    # python 3
    ET.ElementTree(html).write(sys.stdout, encoding='unicode',
                               method='html')
Prints the following:
<html><body><div class="foo"><span class="bar">Hello World</span></div></body></html>
There is also a nice, modern alternative: airium: https://pypi.org/project/airium/
from airium import Airium
a = Airium()
a('<!DOCTYPE html>')
with a.html(lang="pl"):
    with a.head():
        a.meta(charset="utf-8")
        a.title(_t="Airium example")
    with a.body():
        with a.h3(id="id23409231", klass='main_header'):
            a("Hello World.")
html = str(a) # casting to string extracts the value
print(html)
Prints such a string:
<!DOCTYPE html>
<html lang="pl">
<head>
<meta charset="utf-8" />
<title>Airium example</title>
</head>
<body>
<h3 id="id23409231" class="main_header">
Hello World.
</h3>
</body>
</html>
The greatest advantage of airium is that it also has a reverse translator, which builds Python code out of an HTML string. If you wonder how to implement a given HTML snippet, the translator gives you the answer right away.
Its repository contains tests with example pages translated automatically with airium in tests/documents. A good starting point (any existing tutorial) is this one: tests/documents/w3_architects_example_original.html.py
I would recommend using xml.dom to do this.
http://docs.python.org/library/xml.dom.html
Read this manual page, it has methods for building up XML (and therefore XHTML). It makes all XML tasks far easier, including adding child nodes, document types, adding attributes, creating texts nodes. This should be able to assist you in the vast majority of things you will do to create HTML.
It is also very useful for analysing and processing existing xml documents.
Here is a tutorial that should help you with applying the syntax:
http://www.postneo.com/projects/pyxml/
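As a minimal sketch of that approach, the snippet below uses xml.dom.minidom, the lightweight DOM implementation in the standard library. The element names and the sample text are arbitrary choices for illustration.

```python
from xml.dom import minidom

# Create an empty document whose root element is <html>.
doc = minidom.getDOMImplementation().createDocument(None, "html", None)
html = doc.documentElement

body = doc.createElement("body")
html.appendChild(body)

paragraph = doc.createElement("p")
paragraph.setAttribute("id", "main")
# Text nodes are escaped on output, so "<" becomes "&lt;".
paragraph.appendChild(doc.createTextNode("1 < 2"))
body.appendChild(paragraph)

result = html.toxml()
print(result)
```

The output is XML serialization, so it suits XHTML-style markup; elements without children are written in self-closing form.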
I am using the code snippet known as throw_out_your_templates for some of my own projects:
https://github.com/tavisrudd/throw_out_your_templates
https://bitbucket.org/tavisrudd/throw-out-your-templates/src
Unfortunately, there is no pypi package for it and it's not part of any distribution as this is only meant as a proof-of-concept. I was also not able to find somebody who took the code and started maintaining it as an actual project. Nevertheless, I think it is worth a try even if it means that you have to ship your own copy of throw_out_your_templates.py with your code.
Similar to the suggestion to use yattag by John Smith Optional, this module does not require you to learn any templating language and also makes sure that you never forget to close tags or quote special characters. Everything stays written in Python. Here is an example of how to use it:
html(lang='en')[
    head[title['An example'], meta(charset='UTF-8')],
    body(onload='func_with_esc_args(1, "bar")')[
        div['Escaped chars: ', '< ', u'>', '&'],
        script(type='text/javascript')[
            'var lt_not_escaped = (1 < 2);',
            '\nvar escaped_cdata_close = "]]>";',
            '\nvar unescaped_ampersand = "&";'
        ],
        Comment('''
        not escaped "< & >"
        escaped: "-->"
        '''),
        div['some encoded bytes and the equivalent unicode:',
            '你好', unicode('你好', 'utf-8')],
        safe_unicode('<b>My surrounding b tags are not escaped</b>'),
    ]
]
I am attempting to make an easier solution called PyperText, in which you can do things like this:
from PyperText.html import Script
from PyperText.htmlButton import Button
#from PyperText.html{WIDGET} import WIDGET; ex from PyperText.htmlEntry import Entry; variations shared in file
myScript=Script("myfile.html")
myButton=Button()
myButton.setText("This is a button")
myScript.addWidget(myButton)
myScript.createAndWrite()
I wrote a simple wrapper for the lxml module (it should work fine with xml as well) that makes tags for HTML/XML-esque documents.
Really, I liked the format of the answer by John Smith, but I didn't want to install yet another module to accomplish something that seemed so simple.
Example first, then the wrapper.
Example
from Tag import Tag
with Tag('html') as html:
    with Tag('body'):
        with Tag('div'):
            with Tag('span', attrib={'id': 'foo'}) as span:
                span.text = 'Hello, world!'
            with Tag('span', attrib={'id': 'bar'}) as span:
                span.text = 'This was an example!'
html.write('test_html.html')
Output:
<html><body><div><span id="foo">Hello, world!</span><span id="bar">This was an example!</span></div></body></html>
Output after some manual formatting:
<html>
    <body>
        <div>
            <span id="foo">Hello, world!</span>
            <span id="bar">This was an example!</span>
        </div>
    </body>
</html>
Wrapper
from dataclasses import dataclass, field
from lxml import etree
PARENT_TAG = None
@dataclass
class Tag:
    tag: str
    attrib: dict = field(default_factory=dict)
    parent: object = None
    _text: str = None
    @property
    def text(self):
        return self._text
    @text.setter
    def text(self, value):
        self._text = value
        self.element.text = value
    def __post_init__(self):
        self._make_element()
        self._append_to_parent()
    def write(self, filename):
        etree.ElementTree(self.element).write(filename)
    def _make_element(self):
        self.element = etree.Element(self.tag, attrib=self.attrib)
    def _append_to_parent(self):
        if self.parent is not None:
            self.parent.element.append(self.element)
    def __enter__(self):
        global PARENT_TAG
        if PARENT_TAG is not None:
            self.parent = PARENT_TAG
            self._append_to_parent()
        PARENT_TAG = self
        return self
    def __exit__(self, typ, value, traceback):
        global PARENT_TAG
        if PARENT_TAG is self:
            PARENT_TAG = self.parent
