Can I include variable in string formatting mini-language (Python)? [duplicate] - python

Is it possible to use variables in the format specifier in the format()-function in Python? I have the following code, and I need VAR to equal field_size:
def pretty_printer(*numbers):
    str_list = [str(num).lstrip('0') for num in numbers]
    field_size = max([len(string) for string in str_list])
    i = 1
    for num in numbers:
        print("Number", i, ":", format(num, 'VAR.2f'))  # VAR needs to equal field_size

You can use the str.format() method, which lets you interpolate other variables for things like the width:
'Number {i}: {num:{field_size}.2f}'.format(i=i, num=num, field_size=field_size)
Each {} is a placeholder, filling in named values from the keyword arguments (you can use numbered positional arguments too). The part after the optional : gives the format (the second argument to the format() function, basically), and you can use more {} placeholders there to fill in parameters.
Using numbered positions would look like this:
'Number {0}: {1:{2}.2f}'.format(i, num, field_size)
but you could also mix the two or pick different names:
'Number {0}: {1:{width}.2f}'.format(i, num, width=field_size)
If you omit the numbers and names, the fields are automatically numbered, so the following is equivalent to the preceding format:
'Number {}: {:{width}.2f}'.format(i, num, width=field_size)
Note that the whole string is a template, so things like the Number string and the colon are part of the template here.
You need to take into account that the field size includes the decimal point, however; you may need to adjust your size to add those 3 extra characters.
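One way to avoid adjusting the size by hand is to measure the already-formatted numbers. This is only a sketch; the helper name field_width and the sample numbers are made up for illustration:

```python
def field_width(numbers, decimals=2):
    """Return the width of the longest number once formatted with `decimals` digits."""
    return max(len(f"{num:.{decimals}f}") for num in numbers)


numbers = [3.14159, 25, 1234.5]  # sample data, not from the question
width = field_width(numbers)

for i, num in enumerate(numbers, start=1):
    # The nested {width} fills in the field size, as in the examples above.
    print(f"Number {i}: {num:{width}.2f}")
```

Because the width is taken from the formatted text, the decimal point and the fractional digits are already counted.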
Demo:
>>> i = 3
>>> num = 25
>>> field_size = 7
>>> 'Number {i}: {num:{field_size}.2f}'.format(i=i, num=num, field_size=field_size)
'Number 3:   25.00'
Last but not least, as of Python 3.6 you can put the variables directly into the string literal by using a formatted string literal:
f'Number {i}: {num:{field_size}.2f}'
The advantage of using a regular string template and str.format() is that you can swap out the template; the advantage of f-strings is that they make for very readable and compact string formatting, inline in the string literal itself.
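To illustrate what "swapping out the template" can look like, here is a small sketch. The TEMPLATES dictionary and the render helper are hypothetical names, not part of any library:

```python
# Templates are plain strings, so they can live in a dict, a config file, etc.
TEMPLATES = {
    "aligned": "Number {i}: {num:{width}.2f}",
    "csv": "{i},{num:.2f}",
}


def render(template, i, num, width):
    # str.format() ignores keyword arguments the template does not use,
    # so the "csv" template can simply leave out {width}.
    return template.format(i=i, num=num, width=width)


print(render(TEMPLATES["aligned"], 3, 25, 7))
print(render(TEMPLATES["csv"], 3, 25, 7))
```

An f-string cannot be chosen at runtime like this, because it is evaluated where it is written.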

I prefer this (new 3.6) style:
name = 'Eugene'
f'Hello, {name}!'
or a multi-line string:
f'''
Hello,
{name}!!!
{a_number_to_format:.1f}
'''
which is really handy.
I find the old style formatting sometimes hard to read. Even concatenation could be more readable. See an example:
'{} {} {} {} which one is which??? {} {} {}'.format('1', '2', '3', '4', '5', '6', '7')

I just assigned the value of field_size to VAR and changed the print statement. It works.
def pretty_printer(*numbers):
    str_list = [str(num).lstrip('0') for num in numbers]
    field_size = max([len(string) for string in str_list])
    VAR = field_size
    i = 1
    for num in numbers:
        print("Number", i, ":", format(num, f'{VAR}.2f'))

Related

python: extracting variables from string templates

I am familiar with the ability to insert variables into a string using Templates, like this:
Template('value is between $min and $max').substitute(min=5, max=10)
What I now want to know is if it is possible to do the reverse. I want to take a string, and extract the values from it using a template, so that I have some data structure (preferably just named variables, but a dict is fine) that contains the extracted values. For example:
>>> string = 'value is between 5 and 10'
>>> d = Backwards_template('value is between $min and $max').extract(string)
>>> print d
{'min': '5', 'max':'10'}
Is this possible?
That's called regular expressions:
import re
string = 'value is between 5 and 10'
m = re.match(r'value is between (.*) and (.*)', string)
print(m.group(1), m.group(2))
Output:
5 10
Update 1. Names can be given to groups:
m = re.match(r'value is between (?P<min>.*) and (?P<max>.*)', string)
print(m.group('min'), m.group('max'))
But this feature is not used often, as there are usually enough problems with a more important aspect: how to capture exactly what you want. In this particular case that's not a big deal, but even here: what if the string is 'value is between 1 and 2 and 3'? Should the string be accepted, and what are the min and max?
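A short sketch of how that ambiguous string behaves under different patterns. The variable names greedy, lazy and strict are only for illustration:

```python
import re

text = 'value is between 1 and 2 and 3'

# Greedy .* makes the first group as long as possible.
greedy = re.match(r'value is between (?P<min>.*) and (?P<max>.*)', text)
# Lazy .*? makes the first group as short as possible.
lazy = re.match(r'value is between (?P<min>.*?) and (?P<max>.*)', text)
# Restricting groups to digits and requiring a full match rejects the string.
strict = re.fullmatch(r'value is between (?P<min>\d+) and (?P<max>\d+)', text)

print(greedy.groupdict())  # {'min': '1 and 2', 'max': '3'}
print(lazy.groupdict())    # {'min': '1', 'max': '2 and 3'}
print(strict)              # None
```

Which of the three behaviours is "right" depends on the data, which is exactly the difficulty described above.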
Update 2. Rather than making a precise regex, it's sometimes easier to combine regular expressions and "regular" code like this:
m = re.match(r'value is between (?P<min>.*) and (?P<max>.*)', string)
try:
    value_min = float(m.group('min'))
    value_max = float(m.group('max'))
except (AttributeError, ValueError):  # no match or failed conversion
    value_min = None
    value_max = None
This combined approach is especially worth remembering when your text consists of many chunks (like phrases in quotes of different types) to be processed: in tricky cases, it's harder to define a single regex to handle both delimiters and contents of chunks than to define several steps like text.split(), optional merging of chunks, and independent processing of each chunk (using regexes and other means).
It's not possible to perfectly reverse the substitution. The problem is that some strings are ambiguous, for example
value is between 5 and 7 and 10
would have two possible solutions: min = "5", max = "7 and 10" and min = "5 and 7", max = "10"
However, you might be able to achieve useful results with regex:
import re
string = 'value is between 5 and 10'
template = 'value is between $min and $max'
pattern = re.escape(template)  # escapes '$' as '\$'
pattern = re.sub(r'\\\$(\w+)', r'(?P<\1>.*)', pattern)  # turn each \$name into a named group
match = re.match(pattern, string)
print(match.groupdict())  # output: {'min': '5', 'max': '10'}
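The same idea can be wrapped in a reusable function. This is a sketch under a few assumptions: the function name extract is made up, placeholders are limited to the simple $name form, each value is matched lazily, and the whole string must match:

```python
import re


def extract(template, text):
    """Return a dict of placeholder values, or None if text does not fit template."""
    # re.split with a capturing group yields: literal, name, literal, name, ..., literal
    parts = re.split(r'\$(\w+)', template)
    pattern = ''.join(
        re.escape(part) if index % 2 == 0 else f'(?P<{part}>.+?)'
        for index, part in enumerate(parts)
    )
    match = re.fullmatch(pattern, text)
    return match.groupdict() if match else None


print(extract('value is between $min and $max', 'value is between 5 and 10'))
print(extract('value is between $min and $max', 'something else'))
```

With the lazy groups, the ambiguous string 'value is between 5 and 7 and 10' resolves to min = '5' and max = '7 and 10'; a different group pattern would resolve it differently.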
The behave module for Behavior-Driven Development provides a few different mechanisms for specifying and parsing templates.
Depending on the complexity of your templates, and the other needs of your app, you might find one or the other most useful. (Plus, you can steal their pre-written code.)
You can use the difflib module to compare the two strings and pull out the information you want.
https://docs.python.org/3.6/library/difflib.html
For example:
import difflib
def backwards_template(my_string, template):
    my_lib = {}
    entry = ''
    value = ''
    for s in difflib.ndiff(my_string, template):
        if s[0] == ' ':
            if entry != '' and value != '':
                my_lib[entry] = value
                entry = ''
                value = ''
        elif s[0] == '-':
            value += s[2]
        elif s[0] == '+':
            if s[2] != '$':
                entry += s[2]
    # check ending if non-empty
    if entry != '' and value != '':
        my_lib[entry] = value
    return my_lib
my_string = 'value is between 5 and 10'
template = 'value is between $min and $max'
print(backwards_template(my_string, template))
Gives:
{'min': '5', 'max': '10'}

How to use in python regular expression

I would like to use a numeric variable as part of a regular expression.
What should I do if I want to use a variable in this part: (?P<hh>\d)
I want to output the lines that contain the input number.
Using string interpolation:
m = re.compile(r'\d{%d}:\d{%d}' % (var1, var2))
If the vars aren't already integers you may need to convert types like so:
m = re.compile(r'\d{%d}:\d{%d}' % (int(var1), int(var2)))
Your question isn't clear.
If you want to capture some specific part of the regex, you have to create groups (using parentheses):
import re
import sys

hh = sys.argv[1]
m = re.compile(r'(?P<hh>\d):(\d{2})')
match = m.match(hh)
print(match.group(1))
print(match.group(2))
For example, if hh = '1:23', the above code will print:
1
23
Now, if what you need is to replace \d{2} with some variable, you can do:
variable = r'\d{2}'
m = re.compile(r'(?P<hh>\d):%s' % variable)
or if you just want to replace the 2, you can do:
variable = '2'
m = re.compile(r'(?P<hh>\d):\d{%s}' % variable)
Another option could be using:
r'(?P<hh>\d):{0}'.format(variable)
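One caveat when combining str.format() with regular expressions: quantifier braces such as {2} clash with replacement fields, so literal braces have to be doubled. A minimal sketch, where digits and pattern are names chosen for illustration:

```python
import re

digits = 2  # hypothetical variable holding the repeat count

# '{{' and '}}' produce literal braces; '{n}' is the replacement field.
pattern = r'(?P<hh>\d):\d{{{n}}}'.format(n=digits)
print(pattern)  # (?P<hh>\d):\d{2}

match = re.match(pattern, '1:23')
print(match.group('hh'))  # 1
```

The % operator shown earlier does not have this problem, since braces mean nothing to it.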
You can pass it in as a string (I'd escape it first):
m = re.compile(re.escape(hh) + r':\d{2}')
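If the goal is to output the lines that contain a given number, an escaped variable can be combined with lookarounds so that 7 does not match inside 17 or 70. This is only a sketch of one reading of the question; the function name and the sample lines are invented:

```python
import re


def lines_containing_number(lines, number):
    """Return the lines in which `number` appears as a standalone run of digits."""
    # re.escape() guards against regex metacharacters in the input;
    # the lookarounds reject matches that are part of a longer number.
    pattern = re.compile(r'(?<!\d)%s(?!\d)' % re.escape(str(number)))
    return [line for line in lines if pattern.search(line)]


sample = ['room 7', 'room 17', '7:30 start', 'no digits here']
print(lines_containing_number(sample, 7))
```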

parsing a line of text to get a specific number

I have a line of text in the form " some spaces variable = 7 = '0x07' some more data"
I want to parse it and get the number 7 from "some variable = 7". How can this be done in python?
I would use a simpler solution, avoiding regular expressions.
Split on '=' and get the value at the position you expect:
text = 'some spaces variable = 7 = ...'
if '=' in text:
    chunks = text.split('=')
    assignedval = chunks[1].strip()  # second chunk, '7'
    print('assigned value is', assignedval)
else:
    print('no assignment in line')
Use a regular expression.
Essentially, you create an expression that goes something like "variable = (\d+)", do a match, and then take the first group, which will give you the string 7. You can then convert it to an int.
Read the tutorial in the link above.
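A minimal sketch of that approach, assuming the name before the equals sign is literally "variable" and allowing optional whitespace around the '=':

```python
import re

text = " some spaces variable = 7 = '0x07' some more data"

# The parentheses capture the digits that follow "variable =".
match = re.search(r'variable\s*=\s*(\d+)', text)
value = int(match.group(1)) if match else None
print(value)  # 7
```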
Basic regex code snippet to find numbers in a string.
>>> import re
>>> input = " some spaces variable = 7 = '0x07' some more data"
>>> nums = re.findall("[0-9]*", input)
>>> nums = [i for i in nums if i] # remove empty strings
>>> nums
['7', '0', '07']
Check out the documentation and How-To on python.org.

variable length of %s with the % operator in python

I'm trying to do this:
max_title_width = max([len(text) for text in columns])
for column in columns:
    print("%10s, blah" % column)
But I want to replace the 10 with the value of max_title_width. How do I do this in the most pythonic way possible?
This is a carryover from the C formatting markup:
print("%*s, blah" % (max_title_width, column))
If you want left-justified text (for entries shorter than max_title_width), put a '-' before the '*'.
>>> text = "abcdef"
>>> print("<%*s>" % (len(text)+2, text))
<  abcdef>
>>> print("<%-*s>" % (len(text)+2, text))
<abcdef  >
If the len field is shorter than the text string, the string just overflows:
>>> print("<%*s>" % (len(text)-2, text))
<abcdef>
If you want to clip at a maximum length, use the '.' precision field of the format placeholder:
>>> print("<%.*s>" % (len(text)-2, text))
<abcd>
Put them all together this way:
%
-               - if left justified
* or integer    - min width (if '*', insert variable length in data tuple)
.* or .integer  - max width (if '*', insert variable length in data tuple)
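Combining the pieces above in one placeholder might look like this. The values of width and max_len are arbitrary, chosen only to show both padding and clipping:

```python
text = "abcdef"
width, max_len = 8, 4  # illustrative values

# Both '*' fields are filled from the tuple, in order: min width, then max width.
left_clipped = "<%-*.*s>" % (width, max_len, text)
right_clipped = "<%*.*s>" % (width, max_len, text)

print(left_clipped)   # <abcd    >
print(right_clipped)  # <    abcd>
```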
You have the new strings formatting methods from Python 3 and Python 2.6.
Starting in Python 2.6, the built-in str and unicode classes provide the ability to do complex variable substitutions and value formatting via the str.format() method described in PEP 3101. The Formatter class in the string module allows you to create and customize your own string formatting behaviors using the same implementation as the built-in format() method.
(...)
For example, suppose you wanted to have a replacement field whose field width is determined by another variable:
>>> "A man with two {0:{1}}.".format("noses", 10)
'A man with two noses     .'
>>> print("A man with two {0:{1}}.".format("noses", 10))
A man with two noses     .
So for your example it would be
max_title_width = max(len(text) for text in columns)
for column in columns:
    print("A man with two {0:{1}}".format(column, max_title_width))
I personally love the new formatting methods, as they are far more powerful and readable in my humble opinion.
Python 2.7+ alternate version examples (auto-numbered {} fields and the ',' separator are not available in 2.6):
>>> '{:{n}s}, blah'.format('column', n=10)
'column    , blah'
>>> '{:*>{l}s}'.format(password[-3:], l=len(password)) # password = 'stackoverflow'
'**********low'
>>> '{:,.{n}f} {}'.format(1234.567, 'USD', n=2)
'1,234.57 USD'
Hint: first non-keyword args, then keyword args.
You could create your template outside of the loop:
tmpl = '%%%ds, blah' % max_title_width
for column in columns:
    print(tmpl % column)
You could also learn about the new formatting in python.
By the way, max doesn't require a list; you can pass it any iterable:
max_title_width = max(len(i) for i in columns)
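For comparison, here is a sketch of the same right-aligned column written three ways. The contents of columns are made up; note that str.format() and f-strings left-align strings by default, so '>' is needed to match what %*s does:

```python
columns = ["id", "title", "author"]  # sample data, not from the question
max_title_width = max(len(name) for name in columns)

percent_style = ["%*s, blah" % (max_title_width, name) for name in columns]
format_style = ["{:>{width}}, blah".format(name, width=max_title_width) for name in columns]
fstring_style = [f"{name:>{max_title_width}}, blah" for name in columns]

for line in fstring_style:
    print(line)
```

All three lists hold identical strings, so the choice is mostly a matter of readability and Python version.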
