Templates with argument in string formatting - python

I'm looking for a package or any other approach (other than manual replacement) that supports templates with arguments inside string formatting.
I want to achieve something like this (this is just an example so you could get the idea, not the actual working code):
text = "I {what:like,love} {item:pizza,space,science}".format(what=2,item=3)
print(text)
So the output would be:
I love science
How can I achieve this? I have been searching but cannot find anything appropriate. I probably used the wrong search terms.
If there isn't any ready-to-use package around, I would love to read some tips on a starting point for coding this myself.

I think using a list is sufficient, since Python lists keep their elements in order and can be indexed by position:
what = ["like","love"]
items = ["pizza","space","science"]
text = "I {} {}".format(what[1],items[2])
print(text)
output:
I love science

Maybe use a list or a tuple for what and item, as both data types preserve insertion order.
what = ['like', 'love']
item = ['pizza', 'space', 'science']
text = "I {what} {item}".format(what=what[1],item=item[2])
print(text) # I love science
Or even this is possible:
text = "I {what[1]} {item[2]}".format(what=what, item=item)
print(text) # I love science
Hope this helps!

Why not use a dictionary?
options = {'what': ('like', 'love'), 'item': ('pizza', 'space', 'science')}
print("I " + options['what'][1] + ' ' + options['item'][2])
This returns: "I love science"
Or if you wanted a method to rid yourself of having to reformat to accommodate/remove spaces, then incorporate this into your dictionary structure, like so:
options = {'what': (' like', ' love'), 'item': (' pizza', ' space', ' science'), 'fullstop': '.'}
print("I" + options['what'][0] + options['item'][0] + options['fullstop'])
And this returns: "I like pizza."

Since no one has provided an answer that addresses my question directly, I decided to work on this myself.
I had to use double braces, because single ones are reserved for string formatting.
I ended up with the following class:
import re
class ArgTempl:
    def __init__(self, _str):
        self._str = _str
    def format(self, **args):
        # Build the result separately so repeated calls do not alter the stored template.
        result = self._str
        for k in re.finditer(r"{{(\w+):([\w,]+?)}}", self._str,
                             flags=re.DOTALL | re.MULTILINE | re.IGNORECASE):
            key, replacements = k.groups()
            if key not in args:
                continue
            # Pick the option at the index passed for this key.
            result = result.replace(k.group(0), replacements.split(',')[args[key]])
        return result
This is primitive code written in 5 minutes, so it lacks checks and so on. It works as expected and can be improved easily.
Tested on Python 2.7 & 3.6~
Usage:
test = "I {{what:like,love}} {{item:pizza,space,science}}"
print(ArgTempl(test).format(what=1, item=2))
> I love science
Thanks for all of the replies.
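A possible variation on the class above, as a rough sketch rather than a finished solution: re.sub with a callback does the replacement in one pass. The function name format_choices and the precompiled pattern are my own illustrative choices; the {{key:a,b,c}} syntax is the one from this answer.

```python
import re

# Matches {{key:opt1,opt2,...}} as used in the answer above.
_PLACEHOLDER = re.compile(r"\{\{(\w+):([\w,]+?)\}\}")

def format_choices(template, **choices):
    """Replace each {{key:a,b,c}} with the option at index choices[key].

    format_choices is a hypothetical helper name, not part of any library.
    """
    def pick(match):
        key, options = match.group(1), match.group(2).split(",")
        if key not in choices:
            return match.group(0)  # leave placeholders without an argument untouched
        return options[choices[key]]
    return _PLACEHOLDER.sub(pick, template)

print(format_choices("I {{what:like,love}} {{item:pizza,space,science}}", what=1, item=2))
```

An out-of-range index still raises IndexError here, so a real version would probably want a check for that.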

Related

Replacing similar strings with single unified string in python

Currently working on a data science project and I'm having trouble with data preparation.
Specifically this one: What's Cooking?
The dataset has strings like 'medium eggs', 'large free range egg', 'eggplants', 'large egg whites', 'chinese egg noodles' and 'eggs'
So in this case, I would like to find and replace all the 'medium eggs' and 'large free range egg' as just 'eggs' while strings like 'eggplants' and 'chinese egg noodles' are supposed to be left alone. I would also need to replace 'large egg whites' as 'egg whites'
Another case would be 'garbanzo beans' and 'chick peas' since they refer to the same ingredient.
The initial attempt was just to find any string with 'egg' in its string and replace it, but because there are so many conditions, I'm not sure what kind of approach to take now.
Since this is a classification project, the code needs to be able to take potential ingredients like 'small egg' and still understand it as 'eggs'
This can be done most cleanly with regex, checking for spaces on either side of the query string:
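import re
def replace_eggs(string_to_replace, replacement_text, *query_strings):
    # Apply every query string in turn; returning inside the loop would stop after the first one.
    for query_string in query_strings:
        string_to_replace = re.sub(rf"\s?{query_string}([\.,]?)\s?", replacement_text, string_to_replace)
    return string_to_replace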
WARNING: This code is very bad. It doesn't work very well, and I don't have enough time to fix it. I am sorry. I would suggest learning about regex and catch groups to do this a bit better. Just to re-iterate (ba-dum ching!), I'm sorry but I have many things to do.
As a partial solution, you could write a simple function using this:
import spacy
# Assumes the small English model is installed: python -m spacy download en_core_web_sm
nlp = spacy.load('en_core_web_sm')
items = ['medium eggs', 'large free range egg', 'eggplants', 'large egg whites', 'chinese egg noodles', 'eggs']
clean = []
for i in items:
    doc = nlp(i)
    temp = ''
    for token in doc:
        #print(token.text , token.pos_)
        # Keep only nouns and proper nouns.
        if token.pos_=='NOUN' or token.pos_=='PROPN':
            temp += ' ' + token.text
    clean.append(temp)
print(clean)
Output: [' eggs', ' range egg', ' eggplants', ' egg whites', ' egg noodles', ' eggs']
NOTE: You might need to take care of a few cases like 'garbanzo beans' and 'chick peas' manually
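For the manual cases, one option is a small ordered table of regex rules. This is only a sketch: the RULES table, the normalize name and the canonical spelling 'chickpeas' are assumptions of mine, and a real dataset would need many more rules. Anchoring the egg rule at the end of the string is what keeps 'eggplants' and 'chinese egg noodles' untouched.

```python
import re

# Hypothetical rule table: (pattern, canonical name). Order matters, first match wins.
RULES = [
    (re.compile(r"\begg whites?\b"), "egg whites"),
    (re.compile(r"\b(chick ?peas?|garbanzo beans?)\b"), "chickpeas"),
    # 'egg' or 'eggs' as the final word, so 'eggplants' and 'egg noodles' do not match.
    (re.compile(r"\beggs?$"), "eggs"),
]

def normalize(ingredient):
    """Return the canonical name for an ingredient, or the input if no rule applies."""
    for pattern, canonical in RULES:
        if pattern.search(ingredient):
            return canonical
    return ingredient

for name in ["medium eggs", "large free range egg", "eggplants",
             "large egg whites", "chinese egg noodles", "small egg", "chick peas"]:
    print(name, "->", normalize(name))
```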

Generate text from a given template

For example I have a string such as
text = '{Hello|Good morning|Hi}{. We|, we} have a {good |best }offer for you.'
How can I generate a set of all possible strings with variants of words in braces?
Hello. We have a good offer for you.
Good morning, we have a best offer for you.
etc...
You can use the re and random module, like this:
import random
import re
def randomize(match):
    res = match.group(1).split('|')
    random.shuffle(res)
    return res[0]
def random_sentence(tpl):
    return re.sub(r'{(.*?)}', randomize, tpl)
tpl = '{Hello|Good morning|Hi}{. We|, we} have a {good |best }offer for you.'
print(random_sentence(tpl))
I would use tree-traversal method to get all possible variants:
import re
text = '{Hello|Good morning|Hi}{. We|, we} have a {good |best }offer for you.'
variants = ['']
elements = re.split(r'([{\|}])',text)
inside = False
options = []
for elem in elements:
    if elem=='{':
        inside = True
        continue
    if not inside:
        variants = [v+elem for v in variants]
    if inside and elem not in '|}':
        options.append(elem)
    if inside and elem=='}':
        variants = [v+opt for opt in options for v in variants]
        options = []
        inside = False
print(*variants,sep='\n')
Output:
Hello. We have a good offer for you.
Good morning. We have a good offer for you.
Hi. We have a good offer for you.
Hello, we have a good offer for you.
Good morning, we have a good offer for you.
Hi, we have a good offer for you.
Hello. We have a best offer for you.
Good morning. We have a best offer for you.
Hi. We have a best offer for you.
Hello, we have a best offer for you.
Good morning, we have a best offer for you.
Hi, we have a best offer for you.
Explanation: I use re.split to split the string into elements:
['', '{', 'Hello', '|', 'Good morning', '|', 'Hi', '}', '', '{', '. We', '|', ', we', '}', ' have a ', '{', 'good ', '|', 'best ', '}', 'offer for you.']
Then I create a flag, inside, which records whether I am currently between { and }, and act accordingly:
If I find {, I set the flag and go to the next element (continue).
If I am not inside braces, I simply append the element to every variant.
If I am inside and the element is neither | nor }, I append it to the options list.
If I am inside and find }, I build every combination of (one of variants) + (one of options), and variants becomes the result of this operation.
Note that I assume the text is always well formed, and that {, } and | are used solely as control characters.
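The same enumeration can probably be written more compactly with itertools.product. This is a sketch under the same assumptions as above (well-formed input, no nested braces); all_variants is a name I made up. Note that the results come out in a different order than in the loop version, because product varies the last group fastest.

```python
import itertools
import re

def all_variants(template):
    """Return every string obtainable by picking one option per {a|b|c} group."""
    # With a capturing group, re.split keeps the brace contents at odd indices.
    parts = re.split(r"\{(.*?)\}", template)
    choices = [part.split("|") if i % 2 else [part] for i, part in enumerate(parts)]
    return ["".join(combo) for combo in itertools.product(*choices)]

text = '{Hello|Good morning|Hi}{. We|, we} have a {good |best }offer for you.'
variants = all_variants(text)
print(*variants, sep='\n')
```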

Search multiple keywords in string

I'm looking for a way to complete this task as efficiently as possible.
Here is how I want it to work. Say the user inputs
"My screen is broken"
The script finds the two keywords "screen" and "broken" and then prints an appropriate string. Being a noob I thought I might be able to use a dictionary like this
{"screen", "broken", "smashed":"use a repair kit"}
Then I would just search all the keys in the dictionary.
But upon further research it seems this is not possible.
So what would be the best way to do this? I thought maybe SQL, but I was wondering if there was a better way that involves just Python. Thanks
If you are just looking for "screen" and "broken", something like this can work.
sentence = "My screen is broken"
keys = ["screen", "broken"]
# Substring check: every key must occur somewhere in the sentence.
if all(i in sentence for i in keys):
    print("Use a repair kit")
Building on zyxue's answer, you could make it check for certain values but not all of them. This will work with your above code, but you can nest multiple tuples together if you'd like to group other names.
sentence = "My screen is smashed"
solution_dict = {}
solution_dict[("screen", ("broken", "smashed"))] = "use a repair kit"
#If value is a tuple, run function on every value and return if there are any matches
#If not, check the word is part of the sentence
def check_match(sentence_words, keyword):
    if isinstance(keyword, tuple):
        return any(check_match(sentence_words, i) for i in keyword)
    return keyword in sentence_words
#Make sure each value in the main tuple has a match
sentence_words = [i.lower() for i in sentence.split()]
for k, v in solution_dict.items():
    if all(check_match(sentence_words, i) for i in k):
        print(v)
So you'll get results like this:
>>> sentence = "My screen is smashed"
use a repair kit
>>> sentence = "My screen is smashed and broken"
use a repair kit
>>> sentence = "My screen is broken"
use a repair kit
>>> sentence = "My phone is broken"
(nothing)
To work with phone too, along with iphone and android, you could set it like this. Having iphone and android in another tuple makes no difference, it just groups them a little better.
solution_dict[(("screen", "phone", ("android", "iphone")), ("broken", "smashed"))] = "use a repair kit"
Dictionary keys need to be immutable; you could use a tuple, e.g.
# {("screen", "broken"): "use a repair kit"}
# input is a built-in function in Python, so use input_ as the variable name
# (suggest is only an illustrative wrapper name, so that return is valid)
def suggest(input_):
    words = input_.split()
    if 'screen' in words and 'broken' in words:
        return "use a repair kit"
solutions = [{'keywords': ['screen', 'broken', 'smashed'], 'solution': 'use a repair kit'}]
s = 'My screen is broken'
words = set(s.lower().split())
# Prints a solution when at least one of its keywords occurs in the sentence.
print('\n'.join([x.get('solution') for x in solutions if words & set(x.get('keywords', []))]))
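Putting the ideas from these answers together, here is one way it might look with sets: each rule is a list of keyword groups, and a rule fires only when every group has at least one word in the sentence. The RULES table, its extra keywords ('display', 'cracked') and the find_solutions name are assumptions for illustration only.

```python
import re

# Hypothetical rule table: every group in a rule must contribute at least one word.
RULES = [
    ([{"screen", "display"}, {"broken", "smashed", "cracked"}], "use a repair kit"),
]

def find_solutions(sentence, rules=RULES):
    """Return the solutions of all rules whose keyword groups all match the sentence."""
    # Tokenize on letters/digits so trailing punctuation does not hide a keyword.
    words = set(re.findall(r"[a-z0-9']+", sentence.lower()))
    return [solution for groups, solution in rules
            if all(words & group for group in groups)]

print(find_solutions("My screen is broken."))
print(find_solutions("My phone is broken"))
```

Set intersection keeps each check cheap, and adding a new problem is just another entry in the table.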

Function's result is not updated?

I started learning Python quite recently. I am trying to write a piece of code that does some simple text editing. The program is supposed to take a txt file encoded in UTF-8, make sure everything is indented 1 space starting from the second line, and delete any double or triple spaces.
My plan is to read the information from the txt file and store it in a list, then process the elements in the list, and finally write them back to the file (which has not been implemented yet). The first part, the auto-indent code, is working, I think.
However, for the code that detects and deletes unwanted spaces, which I wrote as a function, I think it is working; but when I check the list contents in the main code, the contents seem unaltered (still in their original state). What could I have done wrong?
To give an idea of an example file, I will post parts of the txt file I am trying to process
Original:
There are various kinds of problems concerning human rights. Every day we hear news reporting on human rights violation. Human rights NGOs (For example, Amnesty International or Human Rights Watch) have been trying to deal with and resolve these problems in order to restore the human rights of individuals.
Expected:
There are various kinds of problems concerning human rights. Every day we hear news reporting on human rights violation. Human rights NGOs (For example, Amnesty International or Human Rights Watch) have been trying to deal with and resolve these problems in order to restore the human rights of individuals.
My code is as follows
import os
os.getcwd()
os.chdir('D:')
os.chdir('/Documents/2011_data/TUFS_08_2011')
words = []
def indent(string):
    for x in range(0, len(string)):
        if x>0:
            if string[x]!= "\n":
                if string[x][0] != " ":
                    y = " " + string[x]
def delete(self):
    for x in self:
        x = x.replace(" ", " ")
        x = x.replace(" ", " ")
        x = x.replace(" ", " ")
        print(x, end='')
    return self
with open('dummy.txt', encoding='utf_8') as file:
    for line in file:
        words.append(line)
file.close()
indent(words)
words = delete(words)
for x in words:
    print(x, end='')
You can easily remove spaces with a split() and a join;
In [1]: txt = ' This is a text with multiple spaces. '
Using the split() method of a string gives you a list of words without whitespace.
In [3]: txt.split()
Out[3]: ['This', 'is', 'a', 'text', 'with', 'multiple', 'spaces.']
Then you can use the join method with a single space;
In [4]: ' '.join(txt.split())
Out[4]: 'This is a text with multiple spaces.'
If you want an extra space in front, insert an empty string in the list;
In [7]: s = txt.split()
In [8]: s
Out[8]: ['This', 'is', 'a', 'text', 'with', 'multiple', 'spaces.']
In [9]: s.insert(0, '')
In [10]: s
Out[10]: ['', 'This', 'is', 'a', 'text', 'with', 'multiple', 'spaces.']
In [11]: ' '.join(s)
Out[11]: ' This is a text with multiple spaces.'
Your delete function iterates through a list, assigning each string to x, then successively reassigns x with the result of various replaces. But it never then puts the result back in the list, which is returned unchanged.
The easiest thing to do would be to build up a new list consisting of the results of the modifications, and then return that.
def delete(words):
    result = []
    for x in words:
        ... modify...
        result.append(x)
    return result
(Note it's not a good idea to use the name 'self', as that implies you're in an object method, which you're not.)
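To make that concrete, here is a rough sketch of both steps written so that they return new lists. The function names collapse_spaces and indent_lines are mine, and using re.sub for the space runs is a substitution for the chained replace calls in the question, not the asker's original approach.

```python
import re

def collapse_spaces(lines):
    """Return a new list with every run of 2+ spaces reduced to a single space."""
    return [re.sub(r" {2,}", " ", line) for line in lines]

def indent_lines(lines):
    """Return a new list where each non-blank line after the first starts with a space."""
    result = []
    for index, line in enumerate(lines):
        if index > 0 and line.strip() and not line.startswith(" "):
            line = " " + line
        result.append(line)
    return result

words = ["First  line\n", "second   line\n", "\n", " third line\n"]
words = indent_lines(collapse_spaces(words))
for x in words:
    print(x, end='')
```

The original indent function has the same issue as delete: it computes y but never stores it, so the list it was given stays as it was.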

Can you have variables within triple quotes? If so, how?

This is probably a very simple question for some, but it has me stumped. Can you use variables within python's triple-quotes?
In the following example, how do you use variables in the text:
wash_clothes = 'tuesdays'
clean_dishes = 'never'
mystring =""" I like to wash clothes on %wash_clothes
I like to clean dishes %clean_dishes
"""
print(mystring)
I would like it to result in:
I like to wash clothes on tuesdays
I like to clean dishes never
If not what is the best way to handle large chunks of text where you need a couple variables, and there is a ton of text and special characters?
The preferred way of doing this is using str.format() rather than the method using %:
This method of string formatting is the new standard in Python 3.0, and should be preferred to the % formatting described in String Formatting Operations in new code.
Example:
wash_clothes = 'tuesdays'
clean_dishes = 'never'
mystring =""" I like to wash clothes on {0}
I like to clean dishes {1}
"""
print(mystring.format(wash_clothes, clean_dishes))
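For large chunks of text, named fields may be easier to keep track of than positional ones. A small sketch of the same example with keyword arguments; the chores dictionary and the result variable are just illustrative names:

```python
# Illustrative mapping of field names to values.
chores = {"wash_clothes": "tuesdays", "clean_dishes": "never"}

mystring = """I like to wash clothes on {wash_clothes}
I like to clean dishes {clean_dishes}
"""

# ** unpacks the dictionary into keyword arguments for str.format().
result = mystring.format(**chores)
print(result)
```

Literal braces in the text still have to be doubled ({{ and }}) with this approach.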
Yes! Starting from Python 3.6 you can use f-strings for this. They're interpolated in place, so mystring would have the desired value after the mystring = ... line:
wash_clothes = 'tuesdays'
clean_dishes = 'never'
mystring = f"""I like to wash clothes on {wash_clothes}
I like to clean dishes {clean_dishes}
"""
print(mystring)
Should you need to add a literal { or } in the string, you would just double it:
if use_squiggly:
    kind = 'squiggly'
else:
    kind = 'curly'
print(f"""The {kind} brackets are:
- '{{', or the left {kind} bracket
- '}}', or the right {kind} bracket
""")
would print, depending on the value of use_squiggly, either
The squiggly brackets are:
- '{', or the left squiggly bracket
- '}', or the right squiggly bracket
or
The curly brackets are:
- '{', or the left curly bracket
- '}', or the right curly bracket
One of the ways in Python 2 :
>>> mystring =""" I like to wash clothes on %s
... I like to clean dishes %s
... """
>>> wash_clothes = 'tuesdays'
>>> clean_dishes = 'never'
>>>
>>> print mystring % (wash_clothes, clean_dishes)
I like to wash clothes on tuesdays
I like to clean dishes never
Also look at string formatting
http://docs.python.org/library/string.html#string-formatting
Yes. I believe this will work.
do_stuff = "Tuesday"
mystring = """I like to do stuff on %(tue)s""" % {'tue': do_stuff}
EDIT: forgot an 's' in the format specifier.
I think the simplest way is str.format() as others have said.
However, I thought I'd mention that Python has had a string.Template class since Python 2.4.
Here's an example from the docs.
>>> from string import Template
>>> s = Template('$who likes $what')
>>> s.substitute(who='tim', what='kung pao')
'tim likes kung pao'
One of the reasons I like this is the use of a mapping instead of positional arguments.
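Since the question mentions a ton of text and special characters, it may be worth knowing about safe_substitute as well. A short sketch with made-up sample text: braces and percent signs need no escaping in a Template, $$ gives a literal dollar sign, and placeholders without a value are left in place instead of raising KeyError.

```python
from string import Template

# Sample text is invented for illustration; {} and % are ordinary characters here.
text = Template("""Total: $$5 for $who (100% {guaranteed})
Unknown: $missing
""")

# safe_substitute leaves $missing as-is because no value was supplied for it.
result = text.safe_substitute(who="tim")
print(result)
```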
Also note that you don't need the intermediate variable:
name = "Alain"
print("""
Hello %s
""" % name)
Pass multiple arguments in a simple way:
wash_clothes = 'tuesdays'
clean_dishes = 'never'
a=""" I like to wash clothes on %s I like to clean dishes %s"""%(wash_clothes,clean_dishes)
print(a)
