Split tags in python - python

I have a file that contains this:
<html>
<head>
<title> Hello! - {{ today }}</title>
</head>
<body>
{{ runner_up }}
avasd
{{ blabla }}
sdvas
{{ oooo }}
</body>
</html>
What is the best or most Pythonic way to extract the {{today}}, {{runner_up}}, etc.?
I know it can be done with splits/regular expressions, but I wondered if there were another way.
PS: consider the data loaded in a variable called thedata.
Edit: I think that the HTML example was bad, because it directed some commenters to BeautifulSoup. So, here is some new input data:
Fix grammatical or {{spelling}} errors.
Clarify meaning without changing it.
Correct minor {{mistakes}}.
Add related resources or links.
Always respect the original {{author}}.
Output:
spelling
mistakes
author

Mmkay, well here's a generator solution that seems to work well for me. You can also provide different open and close tags if you like.
def get_tags(s, open_delim='{{', close_delim='}}'):
    while True:
        # Search for the next two delimiters in the source text
        start = s.find(open_delim)
        end = s.find(close_delim)
        # We found a non-empty match
        if -1 < start < end:
            # Skip the length of the open delimiter
            start += len(open_delim)
            # Spit out the tag
            yield s[start:end].strip()
            # Truncate string to start from last match
            s = s[end + len(close_delim):]
        else:
            return
Run against your target input like so:
# prints: today, runner_up, blabla, oooo
for tag in get_tags(html):
    print(tag)
Edit: it also works against your new example :). In my obviously quick testing, it also seemed to handle malformed tags in a reasonable way, though I make no guarantees of its robustness!
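One input the generator above gives up on is a stray close delimiter that appears before the first open delimiter, because both searches start from the beginning of the string. Here is a minimal sketch of a variant that looks for the close delimiter only after the open one. The name iter_tags is made up for illustration, and it has only been checked against the small sample shown.

```python
def iter_tags(text, open_delim="{{", close_delim="}}"):
    """Yield the stripped contents of each open_delim ... close_delim pair."""
    pos = 0
    while True:
        start = text.find(open_delim, pos)
        if start == -1:
            return
        start += len(open_delim)
        # Look for the close delimiter only after the open delimiter,
        # so an earlier stray close delimiter is ignored.
        end = text.find(close_delim, start)
        if end == -1:
            return
        yield text[start:end].strip()
        pos = end + len(close_delim)


sample = "stray }} first, then {{ spelling }} and {{author}}"
print(list(iter_tags(sample)))
```

Tracking a position instead of truncating the string also avoids copying the remaining text on every match.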

Try templatemaker, a reverse-template maker. It can actually learn templates automatically from examples!

I know you said no regex/split, but I couldn't help but try for a one-liner solution:
import re
for s in re.findall(r"\{\{.*\}\}", thedata):
    print(s.replace("{", "").replace("}", ""))
EDIT: JFS
Compare:
>>> re.findall(r'\{\{.*\}\}', '{{a}}b{{c}}')
['{{a}}b{{c}}']
>>> re.findall('{{(.+?)}}', '{{a}}b{{c}}')
['a', 'c']

If the data is that straightforward, a simple regex would do the trick.

J.F. Sebastian wrote this in a comment but I thought it was good enough to deserve its own answer:
re.findall(r'{{(.+?)}}', thestring)
I know the OP was asking for a way that didn't involve splits or regexes - so maybe this doesn't quite answer the question as stated. But this one line of code definitely gets my vote as the most Pythonic way to accomplish the task.
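For anyone who wants the one-liner wrapped up with the whitespace stripping and the configurable delimiters of the generator answer, here is a rough sketch. The function name find_tags and the sample text are assumptions for illustration, not part of the original answer.

```python
import re


def find_tags(text, open_delim="{{", close_delim="}}"):
    """Return the contents of every delimited tag, without surrounding spaces."""
    # re.escape keeps delimiters such as "{{" from being read as regex syntax.
    pattern = re.escape(open_delim) + r"\s*(.+?)\s*" + re.escape(close_delim)
    return re.findall(pattern, text)


thedata = (
    "Fix grammatical or {{spelling}} errors.\n"
    "Correct minor {{mistakes}}.\n"
    "Always respect the original {{ author }}."
)
print(find_tags(thedata))
```

The non-greedy (.+?) is what keeps two tags on the same line from being merged into one match.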

Related

Custom templating using Jinja2

I am trying to use Jinja2 as follows.
Suppose the following tags are defined:
tags: {"world":"WORLD", "c language": "Dennis Ritchie", "apple":"JOBS" }
Input:
HELLO {{ world }}, C is written by **{{ c language }}**, **}}** while **{{** java is written by {{ java }}, hola.
Output:
HELLO WORLD, C is written by Dennis Ritchie, **}}** while **{{** java is written by, hola.
In short, these are the requirements:
1. The delimiters are {{ and }}.
2. If a tag is not predefined, it should be replaced with an empty string.
3. If a delimiter {{ or }} appears on its own (not as part of a pair), it should not be treated as a tag and should be printed as it is.
4. Tags should allow spaces.
Of these four, Jinja2 handles only 1 and 2 correctly.
from jinja2 import Template
t = Template(input_string)
t.render(context)
But for the 3rd and 4th it's not working (or I am making a mistake).
I found only one template engine, "mustache", that supports all four conditions, but I don't know how well it performs.
As Jinja2 is a mature template engine, I think it should be possible to customize the default behaviour.
Does anybody know a solution?
Thanks in advance.
My preliminary testing shows that Mustache (Pystache) is much faster than Jinja2. If possible, please give an expert opinion.
http://mustache.github.io/
https://github.com/defunkt/pystache
In the end I went with mustache. It's a really awesome template engine.
http://mustache.github.io/
The mustache implementation for Python:
https://github.com/defunkt/pystache
I don't think this is possible. The documentation is quite clear on identifiers:
Jinja2 uses the regular Python 2.x naming rules. Valid identifiers
have to match [a-zA-Z_][a-zA-Z0-9_]*. As a matter of fact non ASCII
characters are currently not allowed. This limitation will probably go
away as soon as unicode identifiers are fully specified for Python 3.
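If neither Jinja2 nor mustache is a requirement, the four conditions from the question can also be approximated with the standard library alone. This is only a sketch under that assumption: the render function and the TAG pattern are invented here, and they do none of the escaping or control flow a real template engine provides.

```python
import re

# A tag is "{{", optional spaces, any text without braces, optional spaces, "}}".
# Excluding braces from the tag body means an unpaired "{{" or "}}" never matches.
TAG = re.compile(r"\{\{\s*([^{}]*?)\s*\}\}")


def render(template, tags):
    """Replace each {{ name }} with tags[name], or with "" if name is unknown."""
    def substitute(match):
        return tags.get(match.group(1), "")
    return TAG.sub(substitute, template)


tags = {"world": "WORLD", "c language": "Dennis Ritchie", "apple": "JOBS"}
text = ("HELLO {{ world }}, C is written by {{ c language }}, "
        "}} while {{ java is written by {{ java }}, hola.")
print(render(text, tags))
```

Tag names are looked up as plain dictionary keys, which is why a name containing a space such as "c language" works here.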

How to transform hyperlink codes into normal URL strings?

I'm trying to build a blog system, so I need to do things like transforming '\n' into <br /> and transforming http://example.com into <a href='http://example.com'>http://example.com</a>.
The former is easy: just use the string replace() method.
The latter is more difficult, but I found a solution here: Find Hyperlinks in Text using Python (twitter related)
But now I need to implement an "Edit Article" function, so I have to do the reverse of this.
So, how can I transform <a href='http://example.com'>http://example.com</a> into http://example.com?
Thanks! And I'm sorry for my poor English.
Sounds like the wrong approach. Making round-trips work correctly is always challenging. Instead, store the source text only, and only format it as HTML when you need to display it. That way, alternate output formats / views (RSS, summaries, etc) are easier to create, too.
Separately, we wonder whether this particular wheel needs to be reinvented again ...
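To make the "store the source, format on display" idea concrete, here is a minimal sketch using only the standard library. The function name render_article and the URL pattern are assumptions for illustration; a real blog would likely want a more careful URL matcher.

```python
import html
import re

# Deliberately simple URL pattern (an assumption): stop at whitespace or "<".
URL = re.compile(r"https?://[^\s<]+")


def render_article(source):
    """Turn stored plain text into HTML at display time."""
    # Escape first, so any markup typed by the author is shown literally.
    escaped = html.escape(source)
    linked = URL.sub(
        lambda m: '<a href="{0}">{0}</a>'.format(m.group(0)), escaped
    )
    return linked.replace("\n", "<br />")


stored = "see http://example.com\nbye"
print(render_article(stored))
```

With this layout the "Edit Article" form simply shows the stored text again, so no reverse transformation is needed.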
Since you are using the answer from that other question, your links will always be in the same format, so it should be pretty easy using a regex. I don't know Python, but going by the answer from the last question:
import re
myString = 'This is my tweet check it out http://tinyurl.com/blah'
r = re.compile(r'(http://[^ ]+)')
print(r.sub(r'\1', myString))
Should work.
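For the reverse direction the question actually asks about, a sketch along the same lines might look like this. It assumes every link was generated in exactly the single-quoted form shown in the question; the name unlink is made up, and a regex like this is not a general HTML parser.

```python
import re

# Matches only anchors of the exact generated form <a href='URL'>text</a>.
ANCHOR = re.compile(r"<a href='([^']*)'>.*?</a>")


def unlink(html_text):
    """Replace generated anchors with their URL and <br /> with newlines."""
    without_links = ANCHOR.sub(r"\1", html_text)
    return without_links.replace("<br />", "\n")


page = "Read <a href='http://example.com'>http://example.com</a><br />now"
print(unlink(page))
```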

Split in py. write in django

asd = "qweasdzxc"
qwen = list(asd)  # split into characters; asd.split("") raises ValueError
self.response.out.write(qwen[0])  # writes q
I can split the string this way, but I want to do this with a Django template in my HTML document.
How can I do this?
Thanks for helping.
You might want to write a custom filter for Django for this. Here is a snippet for splitting.
There is also make_list and slice (look on the same page as the last link) to accomplish the same goal.
Given the example, slice is probably what you want.

Python Django Templates and testing if a variable is null or empty string

I am pretty new to Django, but have many years of experience coding in the Java world, so I feel ridiculous asking this question. I am sure the answer is obvious and I am just missing it. I can't seem to find the right way to query this in Google, and I have searched through the Django docs and it either isn't there or I am just not seeing it.
All I want to do is test, in a template, if the var is not null OR an empty string OR just a bunch of spaces. I have an issue where spaces are getting introduced into my field (another issue I have to, and will, work out), but I want my logic to work regardless. Right now, because my string contains just spaces, simply doing this: {% if lesson.assignment %} always passes even though I don't want it to.
I have looked for trim-type functionality that would work between {% %}, but I can't seem to find anything. I have tried strip, but it doesn't work between {% %}. Can someone please point me in the direction of the answer... some docs I might have missed... something?
Thanks a ton in advance!
{% if lesson.assignment and lesson.assignment.strip %}
The .strip calls str.strip() so you can handle whitespace-only strings as empty, while the preceding check makes sure we weed out None first (which would not have the .strip() method)
Proof that it works (in ./manage.py shell):
>>> import django
>>> from django.template import Template, Context
>>> t = Template("{% if x and x.strip %}OK{% else %}Empty{% endif %}")
>>> t.render(Context({"x": "ola"}))
u'OK'
>>> t.render(Context({"x": " "}))
u'Empty'
>>> t.render(Context({"x": ""}))
u'Empty'
>>> t.render(Context({"x": None}))
u'Empty'
This may not have been available when the question was posted but one option could be using the built in template filter default_if_none (Django Documentation Link).
For example:
{{ lesson.assignment|default_if_none:"Empty" }}
If lesson.assignment is a model field, you could define a helper function in your model and use it:
class Lesson(models.Model):
    assignment = models.CharField(..., null=True, ...)
    # ...

    @property
    def has_assignment(self):
        return self.assignment is not None and self.assignment.strip() != ""
Then use {% if lesson.has_assignment %} in your template.
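The check inside has_assignment can be tried outside Django as well. The sketch below uses a made-up standalone function, has_text, with the same logic, just to show how it treats each of the cases from the question.

```python
def has_text(value):
    """Return True only for strings with at least one non-whitespace character."""
    return value is not None and value.strip() != ""


for candidate in ("ola", " ", "", None):
    print(repr(candidate), has_text(candidate))
```

Note that a plain truthiness test treats " " as true, which is why the bare {% if lesson.assignment %} in the question passes for a whitespace-only value.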
You can simulate it by using the cut filter to remove all spaces. But you should probably find out why it contains all spaces in the first place.
You can call any method that takes no arguments on a variable in a Django template. For example, you can call the Python string method strip. So this will work:
{% if lesson.assignment.strip %}
Here is a simple way to do this, from Django version 3.2
{{ lesson.assignment|default:"nothing" }}
This works for both cases (empty string and None object).
ref link: https://docs.djangoproject.com/en/3.2/ref/templates/builtins/#std:templatefilter-default

Limit number of characters with Django Template filter

I am trying to output the first 255 characters of a description on a list of items and am looking for a method to get that.
Example: I have a variable that contains 300 or so characters.
I call that variable like this, {{ my_variable|characterlimit:255 }}
and it would return only the first 255 characters of that variable.
If this tag doesn't exist, I will simply create it (and suggest that it goes into django), but I wanted to make sure it didn't before I took the time to do that. Thanks!
If the "my_variable" is a string, you can take advantage of the slice filter, which treats the string as a list of characters. If it's a set of words, the rough equivalent is truncatewords, but that doesn't quite sound like what you need.
truncatewords also adds an ellipsis ... at the end of the truncated result.
Usage would be something like
{{ my_variable|slice:":255" }}
There is an official built-in filter:
{{ variable|truncatechars:255 }}
A simpler way, using a standard template filter, is:
{{ variable|stringformat:".10s" }}
In this case the 10 is the precision, which for a string is the maximum number of characters to be displayed.
If you want to truncate by word, take a look at this:
https://docs.djangoproject.com/en/1.4/ref/templates/builtins/#truncatechars
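For reference, the slice and stringformat filters above correspond to ordinary Python operations, which makes it easy to check what they produce. The helper truncate_with_ellipsis is hypothetical and only approximates what a truncating filter does; Django's own ellipsis handling differs between versions.

```python
text = "abcdefghijklmnopqrstuvwxyz"

# What the slice filter with ":10" does: plain string slicing.
sliced = text[:10]

# What stringformat with ".10s" does: the precision caps the string length.
formatted = "%.10s" % text


def truncate_with_ellipsis(value, limit):
    """Cut value to at most limit characters, ending in "..." when shortened."""
    if len(value) <= limit:
        return value
    return value[: limit - 3] + "..."


print(sliced, formatted, truncate_with_ellipsis(text, 10))
```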
It doesn't exist unfortunately. There are moves to implement it, but it's still in the design stage (well, implemented, but waiting for design decision), as described here.
Patch attached to that ticket contains implementation.
Using a template filter to truncate your text isn't really suitable for a responsive design, because the cutoff is fixed regardless of the available width. I know the OP asked to do this with a Django template filter, but you could also use CSS so that the truncation adapts to the layout.
You can achieve responsive truncated text using this:
.class {
    width: 100%;
    white-space: nowrap;
    overflow: hidden;
    text-overflow: ellipsis;
}
