performance of internationalized html templates - python

I'm writing a web application using Python and the Flask microframework. The application will support several languages, and I'm now trying to decide how to write i18n enabled html templates. My template engine is Jinja2 (though it is early enough in my project that I can switch to something else if necessary).
Let me start by showing an example portion of a template with gettext tags:
{% if error %}<div class="error">{{ _(error) }}</div>{% endif %}
<h1>{{ _("Hello, World!") }}</h1>
In this template there are two kinds of strings that the application will need to know how to translate:
dynamic strings that in the context of the template will only be known at runtime (the error string)
static strings that are known at any time (the "Hello, World!" string)
The first case is easy to handle. The string is passed to the gettext engine at runtime to obtain the translated version. No issues there.
While the second case can be handled in the same way, my impression is that there's got to be a more efficient way to handle these static strings. None of the documentation I read for gettext, Babel or Jinja2 mentions anything about optimizing the translation of static strings, which otherwise have to be looked up every time the template is rendered.
An approach that I think makes a lot of sense is to pre-render each template into a set of language specific sub-templates, where each sub-template has the static strings resolved, leaving only the dynamic text sections for gettext to handle at runtime.
So, for example, if I wanted to support English and Spanish, my template above would be processed offline by some tool that will generate two sub-templates that will get written to a template cache:
template-en.html:
{% if error %}<div class="error">{{ _(error) }}</div>{% endif %}
<h1>Hello, World!</h1>
template-es.html:
{% if error %}<div class="error">{{ _(error) }}</div>{% endif %}
<h1>¡Hola, Mundo!</h1>
Then at runtime the template engine needs to check if a specific sub-template for the requested locale is available in the cache, and if one is found then rendering will be much faster.
Are there any frameworks, tools, template engines, etc. that implement something like this?
Or are there any other ways to avoid the overhead of searching the translation database for the same little snippets of text over and over again?
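To make the pre-rendering idea above concrete, here is a minimal sketch of what such an offline tool might do. It is not an existing Jinja2 or Babel feature: the `prerender` function, the regular expression, and the in-memory `spanish` catalog are all hypothetical, and the sketch only handles double-quoted literals passed directly to `_()`.

```python
import re

# Matches {{ _("literal") }} but not {{ _(variable) }}.
# Hypothetical pattern: single-quoted literals and extra arguments are ignored.
STATIC_CALL = re.compile(r'\{\{\s*_\(\s*"((?:[^"\\]|\\.)*)"\s*\)\s*\}\}')


def prerender(template_source, catalog):
    """Replace static gettext calls with their translation from `catalog`.

    Untranslated messages fall back to the original text. Note that the
    result is inserted as-is, so translations are not HTML-escaped here.
    """
    def substitute(match):
        message = match.group(1)
        return catalog.get(message, message)

    return STATIC_CALL.sub(substitute, template_source)


source = (
    '{% if error %}<div class="error">{{ _(error) }}</div>{% endif %}\n'
    '<h1>{{ _("Hello, World!") }}</h1>'
)
spanish = {"Hello, World!": "¡Hola, Mundo!"}  # stand-in for a real catalog

template_es = prerender(source, spanish)
print(template_es)
```

The dynamic `{{ _(error) }}` call is left untouched, so it would still be resolved by gettext at render time.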

Have you measured the impact of such an "optimization" on the whole request-response cycle? I'd be very surprised if you would see any meaningful speedups, especially when using a template language that is already known to be pretty darn fast, like Jinja2.
As a general rule: never optimize before having measured the potential gains of the optimization. Intuition is oftentimes completely wrong in this area. This is especially true when an optimization introduces complication in code, development or deployment - always measure if it is worth the hassle.
You'd probably have more success looking into optimizing database queries or caching of expensive operations (a dict lookup, which gettext basically is, is not expensive at all).
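One rough way to measure the lookup cost before optimizing is to time it directly. The sketch below assumes a hypothetical `DictTranslations` class that stands in for a compiled catalog; a real application would load a `.mo` file instead, but the lookup is similarly dictionary-based.

```python
import gettext
import timeit


class DictTranslations(gettext.NullTranslations):
    """Hypothetical in-memory catalog, standing in for a compiled .mo file."""

    def __init__(self, catalog):
        super().__init__()
        self._catalog = catalog

    def gettext(self, message):
        # A plain dict lookup, falling back to the untranslated message.
        return self._catalog.get(message, message)


es = DictTranslations({"Hello, World!": "¡Hola, Mundo!"})


def translate_static():
    return es.gettext("Hello, World!")


calls = 100_000
seconds = timeit.timeit(translate_static, number=calls)
per_call_us = seconds / calls * 1e6
print(f"{per_call_us:.3f} microseconds per lookup")
```

Comparing that figure with the time a full request takes gives a sense of whether pre-rendering could pay off at all.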

You're doing it wrong. You should never I18Nize variables (your first type), only static text (your second type). You need to I18Nize the strings that could be used in error, not the variable itself.
And optimization should be done at the message catalog (gettext) level, not in your code.
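A minimal sketch of that advice, assuming a hypothetical `validate_age` helper: each possible error message is a string literal wrapped in `_()` where it is produced, so extraction tools can find it, and the template then prints the already-translated value instead of calling `_` on a variable.

```python
import gettext

# With no catalog installed, gettext.gettext returns the message unchanged.
_ = gettext.gettext


def validate_age(value):
    """Return a translated error message, or None if the value is valid.

    Hypothetical helper: the messages are literals, so they can be
    extracted into the message catalog.
    """
    if not value.isdigit():
        return _("Age must be a number.")
    if int(value) > 150:
        return _("Age is out of range.")
    return None


error = validate_age("abc")
print(error)
```

The template would then use `{{ error }}` rather than `{{ _(error) }}`.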


Is it possible to pass dynamic values into a dbt source freshness test?

I'm trying to dynamically determine warnings and errors on freshness checks, specified in dbt sources.yml, based on the median and std dev of the "synced_at" column of the underlying source.
To accomplish this, I thought I might try to pass a macro in the freshness block of the source.yml file as so:
# sources.yml
...
tables:
  - name: appointment_type
    freshness:
      error_after:
        count: test_macro()
        period: hour
...
Where:
{%- macro test_macro(this) -%}
{# /*
The idea is {{ this.table }} would parameterize a query,
going over the same column name for all sources, _fivetran_synced,
and spit out the calculated values I want. This makes me feel like
it needs to be a prehook, that somehow stores the value in a var,
and that is accessed in the source.yml, instead of calling it directly.
In this case a trivial integer is attempted to be returned, just as an example.
*/ #}
{{ return(24) }}
{%- endmacro -%}
However this results in a type error. Presumably the macro is not called at all. Wrapping it in jinja quotes also returns an error.
I am curious if passing dynamic values to freshness checks can currently be achieved in any way?
It isn't possible today to call macros from .yml files, for precisely this reason: dbt needs to be able to statically parse those files and validate internal objects (including resource properties like source freshness) before it runs any queries against the database.
I think you could maybe hack this by overriding the collect_freshness macro to return, instead of simply max(synced_at), a timestamp that is Z-score diffed from current_timestamp, normalized based on all Fivetran max(synced_at) timestamps. It feels tricky but possible.
At the same time, I'd gently push back on your larger goal here. We think of source freshness as something that should be prescriptive. You get to tell Fivetran how often you want it to sync data, and add freshness blocks to test those expectations. You can run ad hoc queries like the one you envision above to determine if those expectations are reasonable. Obviously, some tables are updated infrequently or unpredictably, but I find it's more useful to override or remove these tables' freshness expectations than to add significant complexity on their account.
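The kind of ad hoc check mentioned above could look roughly like the following sketch, run outside dbt. The timestamps are made-up sample data, and the "median plus three standard deviations" rule is only an assumption for illustration; the resulting number would then be written into sources.yml by hand as a static value.

```python
import math
from datetime import datetime, timedelta
from statistics import median, stdev

# Hypothetical _fivetran_synced values for one source table.
start = datetime(2024, 1, 1)
synced_at = [start + timedelta(hours=h) for h in (0, 6, 12, 19, 24, 30)]

# Hours elapsed between consecutive syncs.
gaps_hours = [
    (later - earlier).total_seconds() / 3600
    for earlier, later in zip(synced_at, synced_at[1:])
]

# Assumed rule of thumb: flag an error once the gap exceeds
# the median by three sample standard deviations.
threshold_hours = median(gaps_hours) + 3 * stdev(gaps_hours)
error_after_count = math.ceil(threshold_hours)

print(f"observed gaps (hours): {gaps_hours}")
print(f"suggested error_after count: {error_after_count}")
```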

Is there a right way to define additional packages for asset specification in Pyramid?

I have several Pyramid projects I'm combining into a single project, with jinja files that have lines like:
{% extends 'some_project:templates/layout.jinja2'%}
and
{% extends 'other_project:templates/layout.jinja2'%}
It would be great if I could simply put all the sub project files into subfolders then register an additional asset specification so some_project:templates/ got turned into combo_projects:templates/some_project/templates and I wouldn't have to touch any of the templates.
I added...
config.override_asset(to_override='other_project:templates/', override_with='combo_projects:templates/some_project/templates')
...which initially complained about missing module other_project, so I made a dummy module and things seem to work but I'm worried I've abused the system and am standing on a house of cards.
Is there a better way to do this? Reading the docs on asset specifications https://docs.pylonsproject.org/projects/pyramid/en/latest/narr/assets.html#asset-specifications, or the section on overriding linked from it, isn't giving me any insight.
Obviously I could also update all the files, which I might, but I want to know if what I did is safe and if there is a better way to accomplish the same thing.
In Pyramid a real module prefix is required to override an asset, but that's it. Now that you've made a module and reserved that namespace, what you're doing is not what I'd consider an abuse of the system.

Django Template Arithmetic

In my template, I am looping through a list, trying to make a two-column layout. Because of the desired two-column layout, the markup I need to write in the for loop is dependent on whether forloop.counter0 is even or odd. If I had the full power of Python in the template language, determining the parity of forloop.counter0 would be trivial, but unfortunately that is not the case. How can I test whether forloop.counter0 is even or odd using the Django template language, or just as good, is there another way I could get elements in the list to display alternatively in the left and right columns?
Thanks in advance!
You should probably use cycle here instead. As for your question, there is a filter called divisibleby.
The philosophy behind Django's template system is to avoid doing any serious logic in the template. Thus they only provide tools to do fairly basic calculations for cases like drawing grids etc.
You can use the divisibleby filter with forloop.counter:
{% if forloop.counter|divisibleby:"2" %}even{% else %}odd{% endif %}
Alternatively, use the cycle template tag.
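Separately from the template-level options above, the parity test can also be moved into the view, in keeping with the advice to keep logic out of templates. This is a sketch with a hypothetical `split_columns` helper; the two resulting lists would be passed to the template and looped over independently.

```python
def split_columns(items):
    """Split a list into left/right columns by alternating elements."""
    left = items[0::2]   # elements at even indexes (forloop.counter0 even)
    right = items[1::2]  # elements at odd indexes
    return left, right


left, right = split_columns(["a", "b", "c", "d", "e"])
print(left, right)
```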

Best way to denormalize data in Django? [closed]

Closed 11 years ago.
I'm developing a simple web app, and it makes a lot of sense to store some denormalized data.
Imagine a blogging platform that keeps track of Comments, and the BlogEntry model has a "CommentCount" field that I'd like to keep up to date.
One way of doing this would be to use Django signals.
Another way of doing this would be to put hooks directly in my code that creates and destroys Comment objects to synchronously call some methods on BlogEntry to increment/decrement the comment count.
I suppose there are other pythonic ways of accomplishing this with decorators or some other voodoo.
What is the standard Design Pattern for denormalizing in Django? In practice, do you also have to write consistency checkers and data fixers in case of errors?
You have managers in Django.
Use a customized manager to do creates and maintain the FK relationships.
The manager can update the counts as the sets of children are updated.
If you don't want to make customized managers, just extend the save method. Everything you want to do for denormalizing counts and sums can be done in save.
You don't need signals. Just extend save.
I found django-denorm to be useful. It uses database-level triggers instead of signals, but as far as I know, there is also a branch based on a different approach.
The first approach (signals) has the advantage of loosening the coupling between models.
However, signals are somewhat more difficult to maintain, because dependencies are less explicit (at least, in my opinion).
If the correctness of the comment count is not so important, you could also think of a cron job that will update it every n minutes.
However, no matter the solution, denormalizing will make maintenance more difficult; for this reason I would try to avoid it as much as possible, resolving instead to using caches or other techniques -- for example, using with comments.count as cnt in templates may improve performance quite a lot.
Then, if everything else fails, and only in that case, think about what could be the best approach for the specific problem.
Django offers a great and efficient (though not very well known) alternative to counter denormalization.
It will save you many lines of code and it's really fast, since you retrieve the count in the same SQL query.
I will suppose you have these classes:
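from django.db import models

class BlogEntry(models.Model):
    title = models.CharField(max_length=200)  # most backends need max_length; 200 is arbitrary
    ...

class Comment(models.Model):
    body = models.TextField()
    # on_delete is required since Django 2.0
    blog_entry = models.ForeignKey(BlogEntry, on_delete=models.CASCADE)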
In your views.py, use annotations:
from django.db.models import Count

def blog_entry_list(request):
    # 'comment' is the default reverse query name for Comment.blog_entry
    blog_entries = BlogEntry.objects.annotate(count=Count('comment'))
    ...
And you will have an extra field on each BlogEntry that contains the count of comments, plus the rest of the fields of BlogEntry.
You can use this extra field in the templates too:
{% for blog_entry in blog_entries %}
{{ blog_entry.title }} has {{ blog_entry.count }} comments!
{% endfor %}
This will not only save you coding and maintenance time but it is really efficient (the query takes only a bit longer to be executed).
Why not just get the set of comments, and find the number of elements, using the count() method:
count = blog_entry.comment_set.count()
Then you can pass that into your template.
Or, alternatively, in the template itself, you can do:
{{ blog_entry.comment_set.count }}
to get the number of comments.

Syntax error whenever I put Python code inside a Django template

I'm trying to do the following in my Django template:
{% for embed in embeds %}
    {% embed2 = embed.replace("&lt;", "<") %}
    {{embed2}}<br />
{% endfor %}
However, I always get an invalid block or some syntax error when I do anything like that (by that I mean {% %} code inside a loop). Python doesn't have {} to signify "scope" so I think this might be my problem? Am I formatting my code wrong?
Edit: the exact error is: Invalid block tag: 'embed2'
Edit2: Since someone said what I'm doing is not supported by Django templates, I rewrote the code, putting the logic in the view. I now have:
embed_list = []
for embed in embeds:
    embed_list[len(embed_list):] = [embed.replace("&lt;", "<")] #this is line 35
return render_to_response("scanvideos.html", {
    "embed_list" :embed_list
})
However, I now get an error: "'NoneType' object is not callable" on line 35.
I am quite sure that Django templates do not support that.
For your replace operation I would look into different filters.
You really should try to keep as much logic as you can in your views and not in the templates.
Django's template language is deliberately hobbled. When used by non-programming designers, this is definitely a Good Thing, but there are times when you need to do a little programming. (No, I don't want to argue about that. This has come up several times on django-users and django-dev.)
Two ways to accomplish what you were trying:
Use a different template engine. See Jinja2 for a good example that is fully explained for integrating with Django.
Use a template tag that permits you to do Python expressions. See limodou's Expr tag.
I have used the expr tag in several places and it has made life much easier. My next major Django site will use jinja2.
I don't see why you'd get "NoneType object is not callable". That should mean that somewhere on the line is an expression like "foo(...)", and it means foo is None.
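To illustrate how that error message arises, here is a small sketch. The `BrokenEmbed` class is hypothetical and exists only to reproduce the situation where the thing being called turns out to be None.

```python
class BrokenEmbed:
    """Hypothetical object whose 'replace' attribute is None, not a method."""
    replace = None


def unescape_lt(embed):
    # Works for strings; fails if embed.replace is not callable.
    return embed.replace("&lt;", "<")


try:
    unescape_lt(BrokenEmbed())
except TypeError as exc:
    message = str(exc)

print(message)
```

Printing `type(embed)` for each item just before the failing line is a quick way to find out which object is not a string.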
BTW: You are trying to extend the embed_list, and it's easier to do it like this:
embed_list = []
for embed in embeds:
    embed_list.append(embed.replace("&lt;", "<")) #this is line 35
return render_to_response("scanvideos.html", {"embed_list":embed_list})
and even easier to use a list comprehension:
embed_list = [embed.replace("&lt;", "<") for embed in embeds]
Instead of using a slice assignment to grow a list
embed_list[len(embed_list):] = [foo]
you should probably just do
embed_list.append(foo)
But really you should try unescaping html with a library function rather than doing it yourself.
That NoneType error sounds like embed.replace is None at some point, which only makes sense if your list is not a list of strings - you might want to double-check that with some asserts or something similar.
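The library-function suggestion above can be sketched with the standard library's `html.unescape`, assuming Python 3.4 or later (the original code predates it). The sample `embeds` list is made up for illustration.

```python
from html import unescape

# Hypothetical escaped embed snippets, as they might arrive in the view.
embeds = [
    "&lt;object width=&quot;425&quot;&gt;&lt;/object&gt;",
    "plain text &amp; nothing else",
]

# unescape() handles all named and numeric character references,
# not just "&lt;".
embed_list = [unescape(embed) for embed in embeds]

for embed in embed_list:
    print(embed)
```

Because the result contains raw markup, it should only be rendered unescaped when the embed source is trusted.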
Django templates use their own syntax, not like Kid or Genshi.
You have to roll your own Custom Template Tag.
I guess the main reason is enforcing good practice. In my case, I already have a hard time explaining those special template tags to the designer on our team. If it were plain Python, I'm pretty sure we wouldn't have chosen Django at all. I think there's also a performance angle: Django templates benchmark fast, while last time I checked Genshi was much slower. I don't know if that's due to freely embedded Python, though.
You either need to review your approach and write your own custom template tags (more or less equivalent to "helpers" in Ruby on Rails), or try another template engine.
For your edit, there's a better syntax in Python:
embed_list.append(embed.replace("&lt;", "<"))
I don't know if it'll fix your error, but at least it's less JavaScriptesque ;-)
Edit 2: Django automatically escapes all variables. You can enforce raw HTML with |safe filter : {{embed|safe}}.
You'd better take some time reading the documentation, which is really great and useful.
