Display all jinja object attributes - python

Is there a way to display the name/content/functions of all attributes of a given object in a jinja template? This would make it easier to debug a template that is not acting as expected.
I am building a website using the hyde framework and this would come in quite handy since I am still learning the intricacies of both jinja and hyde.
Originally, I had thought it would work to use the attr filter, but this seems to require a name value. I would like to not have to specify the name in order to get all available attributes for the object.
Some google searching showed django syntax looks like the following, but I am not familiar with django so this may only apply to database items. Long story short, I would like a method that works kind of like this for any object named obj
{% for field, value in obj.get_fields %}
    {{ field }} : {{ value }} <br>
{% endfor %}
final solution:
@jayven was right, I could create my own jinja2 filter. Unfortunately, using the stable version of hyde (0.8.4), this is not a trivial act of having a filter in the pythonpath and setting a simple yaml value in the site.yaml file (there is a pull request for that). That being said, I was able to figure it out! So the following is my final solution, which ends up being very helpful for debugging any unknown attributes.
It's easy enough to create site-specific hyde extensions: just create a local python package with the following directory tree:
hyde_ext
    __init__.py
    custom_filters.py
Now create the extension:
from hyde.plugin import Plugin
from jinja2 import environmentfilter, Environment

debug_attr_fmt = '''name: %s
type: %r
value: %r'''

@environmentfilter
def debug_attr(env, value, verbose=False):
    '''
    A jinja2 filter that creates a <pre> block
    that lists all the attributes of a given object,
    including the value of those attributes and type.

    This filter takes an optional variable "verbose",
    which prints underscore attributes if set to True.
    Verbose printing is off by default.
    '''
    begin = "<pre class='debug'>\n"
    end = "\n</pre>"

    result = ["{% filter escape %}"]
    for attr_name in dir(value):
        if not verbose and attr_name[0] == "_":
            continue
        a = getattr(value, attr_name)
        result.append(debug_attr_fmt % (attr_name, type(a), a))
    result.append("{% endfilter %} ")
    tmpl = Environment().from_string("\n\n".join(result))

    return begin + tmpl.render() + end
    #return "\n\n".join(result)

# list of custom-filters for jinja2
filters = {
    'debug_attr' : debug_attr
}

class CustomFilterPlugin(Plugin):
    '''
    The custom-filter plugin allows any
    filters added to the "filters" dictionary
    to be added to hyde
    '''
    def __init__(self, site):
        super(CustomFilterPlugin, self).__init__(site)

    def template_loaded(self,template):
        super(CustomFilterPlugin, self).template_loaded(template)
        self.template.env.filters.update(filters)
To let hyde know about the extension add hyde_ext.custom_filters.CustomFilterPlugin to the "plugins" list of the site.yaml file.
Lastly, test it out on a file. You can add {{resource|debug_attr}} to some random page, or {{resource|debug_attr(verbose=True)}} to get even the underscore attributes.
Of course, I should add, that it seems like this might become much easier in the future whenever hyde 1.0 is released. Especially since there is already a pull request waiting to implement a simpler solution. This was a great way to learn a little more about how to use jinja and hyde though!

I think you can implement a filter yourself, for example:
from jinja2 import Environment

def show_all_attrs(value):
    res = []
    for k in dir(value):
        res.append('%r %r\n' % (k, getattr(value, k)))
    return '\n'.join(res)

env = Environment()
env.filters['show_all_attrs'] = show_all_attrs

# using the filter
tmpl = env.from_string('''{{v|show_all_attrs}}''')

class Myobj(object):
    a = 1
    b = 2

print(tmpl.render(v=Myobj()))
Also see the doc for details: http://jinja.pocoo.org/docs/api/#custom-filters
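The attribute walk that both filters above rely on can be tried without Jinja at all. Below is a minimal sketch in plain Python; the names describe_attrs and Page are made up for illustration and are not part of jinja2 or hyde. The try/except is an extra precaution I am assuming is useful, since some properties raise when read.

```python
def describe_attrs(obj, verbose=False):
    """Return one 'name: type = value' line per attribute of obj."""
    lines = []
    for name in dir(obj):
        # Skip underscore attributes unless verbose output is requested.
        if not verbose and name.startswith("_"):
            continue
        try:
            value = getattr(obj, name)
        except Exception as exc:  # some properties raise when accessed
            value = "<unreadable: %s>" % exc
        lines.append("%s: %s = %r" % (name, type(value).__name__, value))
    return lines


class Page(object):  # hypothetical object, for illustration only
    title = "Home"
    order = 1


print("\n".join(describe_attrs(Page())))
```

A function like this could then be registered the same way as show_all_attrs, for example env.filters['describe_attrs'] = lambda v: "\n".join(describe_attrs(v)).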


Is there a way to check if a path is absolute in jinja2?

In the pydata-sphinx-theme we need to check if a path is absolute or not before adding it to the template. Currently we use the following:
{% set image_light = image_light if image_light.startswith("http") else pathto('_static/' + image_light, 1) %}
It's working but fails to capture local files and many other absolute configurations. Is there a more elegant way to perform this check ?
I would consider implementing this logic in Python proper, and bundle it as a custom template function. This way it'd be much easier to implement, debug and test.
Thanks @Klas Š. for the guidance.
For anyone coming here, I added:
from urllib.parse import urlparse

# The registration function
def setup_is_absolute(app, pagename, templatename, context, doctree):
    def is_absolute(link):
        return bool(urlparse(link).netloc) or link.startswith("/")
    context['is_absolute'] = is_absolute

# Your extension's setup function
def setup(app):
    app.connect("html-page-context", setup_is_absolute)
and in my template:
{{ is_absolute(logo) }}
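To see what this check accepts, here is a small standalone sketch of the same is_absolute logic run against a few sample links. The sample paths are invented for illustration and nothing here depends on Sphinx.

```python
from urllib.parse import urlparse


def is_absolute(link):
    # A network location (scheme://host or //host) or a leading slash counts as absolute.
    return bool(urlparse(link).netloc) or link.startswith("/")


samples = [
    "https://example.com/logo.png",   # scheme and host
    "//cdn.example.com/logo.png",     # protocol-relative URL
    "/_static/logo.png",              # leading slash
    "_static/logo.png",               # relative path
    "file:///tmp/logo.png",           # scheme but empty netloc
]
results = {link: is_absolute(link) for link in samples}
for link, absolute in results.items():
    print(link, absolute)
```

Note that a file:/// URL has an empty netloc and does not start with a slash, so this check reports it as not absolute. If such links matter, also testing urlparse(link).scheme might be worth considering.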

How to use dag_run.conf for typed arguments

I have a DAG that creates a Google Dataproc cluster and submits a job to it.
I would like to be able to customize the cluster (number of workers) and the job (arguments passed to it) through the dag_run.conf parameter.
Cluster creation
For the cluster creation, I wrote a logic with something like:
DataprocCreateClusterOperator(...
    cluster_config = {...
        num_workers = "{% if 'cluster' is in dag_run.conf and 'secondary_worker_config' is in dag_run.conf['cluster'] and 'num_instances' is in dag_run.conf['cluster']['secondary_worker_config'] %}{{ dag_run.conf['cluster']['secondary_worker_config']['num_instances'] }}{% else %}16{% endif %}"
    }
)
That is to say, if cluster.secondary_worker_config.num_instances is available in dag_run.conf, use it, else fallback on default value 16.
However, when rendered, this is expanded as a Python string, like "16", leading to failure because the num_workers parameter must be an int or a long.
I cannot parse it to int during operator declaration:
num_workers = int("{% ... %}")
because this will try to interpret the whole jinja script as an integer (and not the resulting value).
Using the | int jinja filter does not solve the problem either.
Job submission
I have a similar problem for job submission.
The operator expects a job dict argument, with a field spark.args that provides the arguments to the spark job. This field must be an iterable, and is expected to be a list of strings, e.g: ["--arg=foo", "bar"].
I want to be able to add some arguments by providing them through dag_run.conf:
{
args = ["--new_arg=baz", "bar2"]
}
But adding these arguments to the initial list doesn't seem to be possible. You either get a single argument for all additional args: ["--arg=foo", "bar", "--new_arg=baz bar2"], or a single string with all arguments.
In any case, the resulting job submission is not working as expected...
Is there an existing way to work around this problem?
If not, is there a way to add a "casting step" after "template rendering" one, either in the provider operators or directly in the BaseOperator abstract class?
Edit
I think that the solution proposed by Josh Fell is the way to go. However, for those that don't want to upgrade Airflow, I tried to implement the solution proposed by Jarek.
import unittest
import datetime
from typing import Any

from airflow import DAG
from airflow.models import BaseOperator, TaskInstance

# Define an operator which check its argument type at runtime (during "execute")
class TypedOperator(BaseOperator):
    def __init__(self, int_param: int, **kwargs):
        super(TypedOperator, self).__init__(**kwargs)
        self.int_param = int_param

    def execute(self, context: Any):
        assert(type(self.int_param) is int)

# Extend the "typed" operator with an operator handling templating
class TemplatedOperator(TypedOperator):
    template_fields = ['templated_param']

    def __init__(self,
                 templated_param: str = "{% if 'value' is in dag_run.conf %}{{ dag_run.conf['value'] }}{% else %}16{% endif %}",
                 **kwargs):
        super(TemplatedOperator, self).__init__(int_param=int(templated_param), **kwargs)

# Run a test, instantiating a task and executing it
class JinjaTest(unittest.TestCase):
    def test_templating(self):
        print("Start test")
        dag = DAG("jinja_test_dag", default_args=dict(
            start_date=datetime.date.today().isoformat()
        ))
        print("Task intanciation (regularly done by scheduler)")
        task = TemplatedOperator(task_id="my_task", dag=dag)
        print("Done")
        print("Task execution (only done when DAG triggered)")
        context = TaskInstance(task=task, execution_date=datetime.datetime.now()).get_template_context()
        task.execute(context)
        print("Done")
        self.assertTrue(True)
Which gives the output:
Start test
Task intanciation (regularly done by scheduler)

Ran 1 test in 0.006s

FAILED (errors=1)

Error
Traceback (most recent call last):
  File "/home/alexis/AdYouLike/Repositories/data-airflow-dags/tests/data_airflow_dags/utils/tasks/test_jinja.py", line 38, in test_templating
    task = TemplatedOperator(task_id="my_task", dag=dag)
  File "/home/alexis/AdYouLike/Repositories/data-airflow-dags/.venv/lib/python3.6/site-packages/airflow/models/baseoperator.py", line 89, in __call__
    obj: BaseOperator = type.__call__(cls, *args, **kwargs)
  File "/home/alexis/AdYouLike/Repositories/data-airflow-dags/tests/data_airflow_dags/utils/tasks/test_jinja.py", line 26, in __init__
    super(TemplatedOperator, self).__init__(int_param=int(templated_param), **kwargs)
ValueError: invalid literal for int() with base 10: "{% if 'value' is in dag_run.conf %}{{ dag_run.conf['value'] }}{% else %}16{% endif %}"
As you can see, this fails at the task instantiation step, because in TemplatedOperator.__init__ we try to cast the Jinja template (and not the rendered value) to int.
Maybe I missed a point in this solution, but it seems to be unusable as is.
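The failure comes from timing: __init__ runs before any rendering, while execute runs after it. The sketch below illustrates that ordering with a stand-in class that does not use Airflow at all. DeferredCastTask, its render method and the use of string.Template in place of Jinja are all assumptions made purely for illustration.

```python
from string import Template


class DeferredCastTask:
    """Hypothetical stand-in for an operator; this is not an Airflow class."""

    def __init__(self, templated_param="${value}"):
        # Store the raw template only: nothing has been rendered yet,
        # so int(templated_param) here would raise ValueError.
        self.templated_param = templated_param

    def render(self, conf):
        # Stand-in for the rendering step that runs before execute().
        values = {"value": conf.get("value", 16)}
        self.templated_param = Template(self.templated_param).substitute(values)

    def execute(self):
        # The field now holds rendered text such as "4", so the cast is safe.
        return int(self.templated_param)


task = DeferredCastTask()      # instantiation: no cast, no error
task.render({"value": "4"})    # rendering: "${value}" becomes "4"
workers = task.execute()       # execution: int("4")
print(workers)
```

Carried over to a real operator, this would presumably mean keeping the template string in a templated field and doing the int() conversion inside execute rather than in __init__. I have not verified that against the Dataproc operators.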
Unfortunately all Jinja templates are rendered as strings so the solution proposed by @JarekPotiuk is your best bet.
However, for anyone using Airflow 2.1+ or if you'd like to upgrade, there is a new parameter that can be set at the DAG level: render_template_as_native_obj
When enabling this parameter, the output from Jinja templating will be returned as native Python types (e.g. list, tuple, int, etc.). Learn more here: https://airflow.apache.org/docs/apache-airflow/stable/concepts/operators.html#rendering-fields-as-native-python-objects
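Conceptually, native rendering amounts to interpreting the rendered text as a Python literal instead of leaving it as a string. The sketch below imitates that "casting step" with the standard library only. cast_rendered is a made-up helper, not Airflow's or Jinja's implementation, and the sample values are taken from the question.

```python
import ast


def cast_rendered(text):
    """Turn rendered template text into a Python value when it is a literal."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        # Not a Python literal (for example a plain word): keep the string.
        return text


num_workers = cast_rendered("16")                       # int instead of "16"
extra_args = cast_rendered('["--new_arg=baz", "bar2"]')  # list of two strings
args = ["--arg=foo", "bar"] + extra_args                 # one entry per argument
print(num_workers, args)
```

With values typed this way, the worker count is an int and the extra arguments extend the list item by item instead of collapsing into one string.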
The easiest way is to define your custom operator deriving from DataprocCreateClusterOperator. It's super easy and you can even do it within the dag file.
Conceptually, something like this:
class MyDataprocCreateClusterOperator(DataprocCreateClusterOperator):
    template_fields = DataprocCreateClusterOperator.template_fields + ['my_param']
    def __init__(self, my_param='{{ ... }}', .....):
        super().__init__(int_param=int(my_param), ....)

Understanding lambdas in pystache

Say I want to write a function that formats floats with 2 digits in pystache.
I want both the float number and the function in the context (I believe this is the correct philosophy of Mustache).
What should I do?
In order to make the problem clear, I will show a pair of code snippets I have written. They do not work.
Snippet (a):
import pystache
tpl = "{{#fmt}}{{q}}{{/fmt}}"
ctx = {
    'q' : 1234.5678,
    'fmt' : lambda x: "%.2f" % float(x)
}
print(pystache.render(tpl, ctx))
This fails, with error "could not convert string to float: {{q}}". I can understand this error: {{fmt}} is evaluated before {{q}}.
Snippet (b):
import pystache
tpl = "{{#q}}{{fmt}}{{/q}}"
ctx = {
    'q' : 1234.5678,
    'fmt' : lambda x: "%.2f" % float(x)
}
print(pystache.render(tpl, ctx))
This fails with error: "lambda() takes exactly 1 argument (0 given)". I can't understand this error: shouldn't the context be passed as argument?
Short answer: mustache doesn't support this. It expects all data values to come preprocessed.
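In practice, "preprocessed" means formatting the numbers before the context ever reaches the template. Here is a minimal sketch of that idea. preformat_floats is a hypothetical helper, not part of pystache, and it only handles top-level float values.

```python
def preformat_floats(context, digits=2):
    """Return a copy of context with every float replaced by a formatted string."""
    fmt = "%%.%df" % digits  # e.g. "%.2f" for digits=2
    return {key: (fmt % value if isinstance(value, float) else value)
            for key, value in context.items()}


ctx = preformat_floats({"q": 1234.5678, "name": "total"})
print(ctx)
# ctx can then be handed to the renderer, e.g. pystache.render("{{q}}", ctx)
```

The template stays a plain {{q}} and needs no lambda, at the cost of deciding the format in Python rather than in the template.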
In 2012, in "formating of dates, numbers and more · Issue #41 · mustache/spec", people suggested various implementations, including ones from other template engines, and couldn't reach any conclusion.
As per "mustache.js date formatting", people have constructed several extensions and/or workarounds (to me, the most promising one looks to be the {{value | format}} syntax extension), or suggested moving to other markup engines.
Additional info:
The spec at http://mustache.github.io/mustache.5.html (linked from http://mustache.github.io front page no less) is obsolete, dated 2009. The latest spec that pystache follows resides at https://github.com/mustache/spec , and looks abandoned, too: the latest commit is dated 02.2015 and the latest spec update is from 2011. Nor does it have a successor.
So, by this point, the standard is dead, and the markup is thus free-for-all to augment.
I'd still suggest to consider other formats linked to in the aforementioned discussions before reinventing the wheel.
Following ivan_pozdeev's comments, I will post my own answer. I will not accept it, because I think it's too ugly a solution.
import pystache
import copy

tpl = "{{#fmt}}{{q}}{{/fmt}}"

ctx = {
    'q' : 1234.5678,
}

class OtherContext(object):
    def __init__(self, renderer):
        self.renderer = renderer

    def fmt(self):
        return lambda s: "%.2f" % float(copy.deepcopy(self.renderer).render(s, self.renderer.context))

renderer = pystache.Renderer(missing_tags='strict')
ctx2 = OtherContext(renderer)
print(renderer.render(tpl, ctx, ctx2))
I also had this problem. After setting a breakpoint on a defined method that I called with my lambda, I found that the data Pystache passes is only the lambda subsection of the template. Not very helpful.
Then, after a lot of digging around, I found that on Nov 5, 2013, cjerdonek mentions accessing the current context via the renderer, and ultimately this code comment from the pystache.renderer module:
# This is an experimental way of giving views access to the current context.
# [...]
@property
def context(self):
    """
    Return the current rendering context [experimental].
    """
    return self._context
Thankfully, this experiment works 😄 Ultimately:
Your lambda must call a defined method.
A strict inline lambda doesn't work in my experience.
Your defined method must be able to access an instance of the Renderer class.
In the method, you'll get the context, process the value, and print it.
Your Renderer instance must be used to render your template and context, so that it then has access to the context.
I built the following to meet your requirements. This is in Python 3.6.5, and I expanded the names for readability.
import pystache

RENDERER = None

def format_float(template):
    '''
    @param template: not actually used in your use case
    (returns '{{value}}' in this case, which you could render if needed)
    '''
    context_stack = RENDERER.context._stack
    # the data has been last in the stack in my experience
    context_data = context_stack[len(context_stack) - 1]
    x = float(context_data['value'])
    print("%.2f" % x)

if __name__ == '__main__':
    RENDERER = pystache.Renderer()
    template = '{{#formatter}}{{value}}{{/formatter}}'
    context = {
        'value' : 1234.5678,
        'formatter' : lambda template: format_float(template)
    }
    RENDERER.render(template, context)
This prints 1234.57 to the console. 👍

Should I be worried about Django template inefficiencies?

I'm relatively new to Django, and I'm using version 1.5 to build a REST api. Calls to the api expect JSON to be returned (I'm using this with an Ember.js front-end).
I'm wondering if I can't do something like this:
def listproject(request, pk_id):
    # list single project at /projects/<pk_id>
    project = Project.objects.get(pk = pk_id)
    snapshots = Snapshot.objects.filter(project = project)
    # (both are same up to here)
    return render_to_response('project.json',
                              {"project":project, "snapshots":snapshots},
                              mimetype="text/json")
Where project.json is this Django template:
{
    "id": "{{ project.pk }}",
    "name": "{{ project.name }}",
    "snapshot_ids": [ {% for snapshot in snapshots %}"{{ snapshot.pk }}"{% if not forloop.last %}, {% endif %}{% endfor %} ]
}
Someone who has worked with Django much longer than I have is suggesting that using templates for this will be inefficient. He suggests I do the following instead:
def listproject(request, pk_id):
    # list single project at /projects/<pk_id>
    project = Project.objects.get(pk = pk_id)
    snapshots = Snapshot.objects.filter(project = project)
    # (both are same up to here)
    ret_json = []
    ret_json.append('{"id": "' + str(project.pk) + '", ')
    ret_json.append('"name": "' + project.name + '", "snapshot_ids": [')
    snapshot_json = []
    for snapshot in snapshots:
        snapshot_json.append('"' + str(snapshot.pk) + '",')
    ret_json.append(''.join(snapshot_json)[0:-1] + ']}')
    return HttpResponse(content=''.join(ret_json), mimetype="text/json")
I've tested both. They work identically, though my version produces more readable JSON.
Please help us end our debate! Which is more efficient (and why)?
It's true that Django templates are not particularly efficient. However, that's only really a problem when you have very large templates that themselves extend or include many other templates, for example in a complex content management system. With a single template containing a small number of fields like you have, template rendering is insignificant compared to the overall overhead of serving the request.
That said I'm a bit confused about both of your alternatives. Why aren't you generating JSON via the standard json library? That's the proper way to do it, not by building up strings either in templates or in Python code.
import json

ret = {'id': project.id,
       'name': project.name,
       'snapshot_ids': [snapshot.id for snapshot in snapshots]}
ret_json = json.dumps(ret)
Both of these options look horrible to me. I'd prefer to avoid 'hand-writing' the JSON as much as possible and just convert directly from Python data structures.
Fortunately the json module is designed for this.
import json

def listproject(request, pk_id):
    # list single project at /projects/<pk_id>
    project = Project.objects.get(pk=pk_id)
    snapshots = Snapshot.objects.filter(project=project)
    data = {
        "id": project.pk,
        "name": project.name,
        "snapshot_ids": [snapshot.pk for snapshot in snapshots],
    }
    return HttpResponse(content=json.dumps(data), mimetype="text/json")
Reasons to avoid 'hand-writing' the code are obvious - avoid bugs from typos, code is shorter and simpler, json module is likely to be faster.
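One concrete bug of the hand-written kind: a project name containing a double quote. The sketch below uses an invented name and standalone data instead of Django models. It shows the concatenated string failing to parse while the json.dumps output round-trips.

```python
import json

name = 'My "best" project'  # hypothetical project name containing quotes

# Hand-built JSON: the quotes inside the name are not escaped.
hand_written = '{"id": "1", "name": "' + name + '"}'
try:
    json.loads(hand_written)
    hand_written_is_valid = True
except ValueError:
    hand_written_is_valid = False

# json.dumps escapes the value, so the result parses back to the same name.
generated = json.dumps({"id": 1, "name": name})
round_trip = json.loads(generated)

print(hand_written_is_valid, round_trip["name"] == name)
```

The template version has the same weakness unless every value is escaped for JSON, which HTML autoescaping does not do.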
If you are concerned about the 'readability' of the generated JSON the json module provides some options for controlling the output (indents etc):
http://docs.python.org/2/library/json.html
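As a small illustration of those options, the same data can be serialized compactly or in an indented, key-sorted form. The data dict below is made up. indent, sort_keys and separators are standard json.dumps parameters.

```python
import json

data = {"id": 7, "name": "demo", "snapshot_ids": [1, 2, 3]}

# Compact form: no whitespace between items, smallest payload.
compact = json.dumps(data, separators=(",", ":"))

# Readable form: one item per line, keys in a stable order.
pretty = json.dumps(data, indent=2, sort_keys=True)

print(compact)
print(pretty)
```

Both strings decode to the same object, so the choice only affects readability and payload size.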
I usually use this little function:
import json
from django.http import HttpResponse

def json_response(ob):
    return HttpResponse(
        json.dumps(ob), mimetype="application/json")
So then you can just return the result of that from a view:
def listproject(request, pk_id):
    project = Project.objects.get(pk=pk_id) # Use get_object_or_404 ?
    snapshots = Snapshot.objects.filter(project=project)
    return json_response({
        "id": project.pk,
        "name": project.name,
        "snapshot_ids": [snapshot.pk for snapshot in snapshots],
    })

Math on Django Templates

Here's another question about Django.
I have this code:
views.py
cursor = connections['cdr'].cursor()
calls = cursor.execute("SELECT * FROM cdr where calldate > '%s'" %(start_date))
result = [SQLRow(cursor, r) for r in cursor.fetchall()]
return render_to_response("cdr_user.html",
                          {'calls':result }, context_instance=RequestContext(request))
I use a MySQL query like that because the database is not part of a django project.
My cdr table has a field called duration, I need to divide that by 60 and multiply the result by a float number like 0.16.
Is there a way to multiply this values using the template tags? If not, is there a good way to do it in my views?
My template is like this:
{% for call in calls %}
    <tr class="{% cycle 'odd' 'even' %}"><h3>
        <td valign="middle" align="center"><h3>{{ call.calldate }}</h3></td>
        <td valign="middle" align="center"><h3>{{ call.disposition }}</h3></td>
        <td valign="middle" align="center"><h3>{{ call.dst }}</h3></td>
        <td valign="middle" align="center"><h3>{{ call.billsec }}</h3></td>
        <td valign="middle" align="center">{{ (call.billsec/60)*0.16 }}</td></h3>
    </tr>
{% endfor %}
The last is where I need to show the value, I know the "(call.billsec/60)*0.16" is impossible to be done there. I wrote it just to represent what I need to show.
You can do it on three different layers:
Database level. SQL is a powerful language capable of mathematics. You could write your equation in the select part of your query. In your case, that should be along the lines of SELECT (duration/60*0.16) FROM cdr;. Examples are easy to find online. Note that in this case, the load (algorithm complexity) is put on the MySQL server process and not the Python process.
View level. In your example, just before your return, you could loop over every element of your result variable to modify its value. You can follow the example that was given by Lie Ryan for this level.
Template level. This is done by a custom filter. You can write your custom filter as written in the documentation and pipe your template variable through this filter in order to get the desired value.
Something along these lines would represent a custom filter applicable on your template(s):
from django import template

register = template.Library()

@register.filter
def customFilter(value):
    return value / 60.0 * 0.16
You would then use it this way in your template, after {% load %}-ing the custom filter (read the documentation for more implementation information):
{{ call.billsec|customFilter }}
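For the view-level option, the calculation can be attached to each row before it is handed to the template. The sketch below uses plain dicts rather than the SQLRow objects from the question. add_cost, the cost key and the rounding to two decimals are assumptions added for illustration.

```python
RATE_PER_MINUTE = 0.16  # price per minute, taken from the question


def add_cost(rows):
    """Attach a computed 'cost' entry to each call row (rows are plain dicts here)."""
    for row in rows:
        row["cost"] = round(row["billsec"] / 60.0 * RATE_PER_MINUTE, 2)
    return rows


calls = add_cost([
    {"dst": "100", "billsec": 90},
    {"dst": "101", "billsec": 600},
])
for call in calls:
    print(call["dst"], call["cost"])
```

The template would then only need {{ call.cost }}, keeping the arithmetic out of the markup.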
If the math operations are not too complex, I normally use custom template filters. The add operation is already available as a built-in filter, and I use the snippet below in my project for multiplication, subtraction and division respectively. Put this code inside a .py file in your app/templatetags location and also add an __init__.py in there.
from django import template

#Django template custom math filters
#Ref : https://code.djangoproject.com/ticket/361
register = template.Library()

def mult(value, arg):
    "Multiplies the arg and the value"
    return int(value) * int(arg)

def sub(value, arg):
    "Subtracts the arg from the value"
    return int(value) - int(arg)

def div(value, arg):
    "Divides the value by the arg"
    return int(value) / int(arg)

register.filter('mult', mult)
register.filter('sub', sub)
register.filter('div', div)
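One caveat for the use case in the question: these filters cast both operands to int, so a rate such as 0.16 is truncated to 0. Below is a float-based variant, as a sketch only. It is my adaptation rather than part of the original snippet, and the registration lines are left as comments so the example runs without Django.

```python
def mult(value, arg):
    "Multiplies the value by the arg, keeping fractions"
    return float(value) * float(arg)


def div(value, arg):
    "Divides the value by the arg, keeping fractions"
    return float(value) / float(arg)


# In a real templatetags module these would be registered with
# register.filter('mult', mult) and register.filter('div', div).

cost = mult(div(90, 60), 0.16)          # 90 seconds at 0.16 per minute
truncated = int(90 / 60.0) * int(0.16)  # what the int-based filters would yield
print(cost, truncated)
```

In a template this would presumably be chained as {{ call.billsec|div:60|mult:0.16 }}.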
EDIT: the following answer is totally wrong, since I thought OP was using sqlite. MySQL has its own way of wrapping things into dictionary, see far below.
You can subclass sqlite3.Row and write your own "computed field":
import sqlite3

class MyRow(sqlite3.Row):
    def comp_billsec(self):
        return (self['billsec'] / 60) * 0.16

cursor = ...
cursor.row_factory = MyRow
for r in cursor.execute('...'):
    print(r['billsec'], r.comp_billsec())
Note that our comp_billsec is accessed using method call syntax, while the sqlite3.Row factory provides access through dictionary syntax. This discrepancy disappears in a django template, since inside a django template dictionary access and zero-argument function calls have the same syntax, so you can do {{ call.billsec }} and {{ call.comp_billsec }}.
EDIT: In MySQL, you can insert computed values in the view along the lines of:
cursor = connections['cdr'].cursor(cursorclass=MySQLdb.cursors.DictCursor)
calls = cursor.execute("...")
# dicts cannot be combined with +, so build a copy with the extra key instead
result = [dict(r, comp_billsec=r['billsec'] / 60 * 0.16) for r in cursor.fetchall()]
return render_to_response("cdr_user.html",
                          {'calls':result }, context_instance=RequestContext(request))
Additionally, you should use a parameterized query (note the comma instead of %, and that the placeholder is not wrapped in quotes, since the driver quotes the value itself):
cursor.execute("SELECT * FROM cdr where calldate > %s", (start_date,))
Your previous code is subject to SQL injection, since you're interpolating start_date into the SQL query directly. If start_date contains ' OR 1 OR ', for example, your query will be interpolated as SELECT * FROM cdr where calldate > '' OR 1 OR '' which will select all rows in the table; it could be even worse.
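The difference is easy to reproduce with an in-memory SQLite database from the standard library. This is only a sketch: the table contents and the hostile input are invented, and sqlite3 uses ? as its placeholder where MySQLdb uses %s.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cdr (calldate TEXT, billsec INTEGER)")
conn.executemany("INSERT INTO cdr VALUES (?, ?)",
                 [("2020-01-01", 60), ("2021-01-01", 120)])

start_date = "9999' OR '1'='1"  # hostile input

# String interpolation: the input rewrites the WHERE clause.
unsafe_rows = conn.execute(
    "SELECT * FROM cdr WHERE calldate > '%s'" % start_date).fetchall()

# Placeholder: the driver passes the input as a plain value.
safe_rows = conn.execute(
    "SELECT * FROM cdr WHERE calldate > ?", (start_date,)).fetchall()

print(len(unsafe_rows), len(safe_rows))
```

The interpolated query returns every row because the injected OR clause is always true, while the parameterized query compares calldate against the whole input string and matches nothing.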
