What does = (equal) do in f-strings inside the expression curly brackets? - python

It is well known that {} in Python f-strings evaluates an expression and inserts the result as a string. However, what does the '=' at the end of the expression mean?
log_file = open("log_aug_19.txt", "w")
console_error = '...stuff...' # the real code generates it with regex
log_file.write(f'{console_error=}')

This is actually a brand-new feature as of Python 3.8.
Added an = specifier to f-strings. An f-string such as f'{expr=}'
will expand to the text of the expression, an equal sign, then the
representation of the evaluated expression.
Essentially, it facilitates the frequent use-case of print-debugging, so, whereas we would normally have to write:
f"some_var={some_var}"
we can now write:
f"{some_var=}"
So, as a demonstration, using a shiny-new Python 3.8.0 REPL:
>>> foo = 42
>>> print(f"{foo=}")
foo=42
>>>
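For anyone who wants to try this outside the REPL, here is a minimal sketch, assuming Python 3.8 or later. The names foo and items are made up for illustration. It also shows that whitespace around the = is kept and that any expression can be echoed, not just a bare name.

```python
# Requires Python 3.8+ for the "=" specifier.
foo = 42
items = [1, 2, 3]

plain = f"{foo=}"            # text of the expression, "=", then repr of the value
spaced = f"{foo = }"         # whitespace around "=" is kept in the output
computed = f"{len(items)=}"  # any expression can be echoed, not only names

print(plain)     # foo=42
print(spaced)    # foo = 42
print(computed)  # len(items)=3
```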

From Python 3.8, f-strings support "self-documenting expressions", mostly for print debugging. From the docs:
Added an = specifier to f-strings. An f-string such as f'{expr=}' will
expand to the text of the expression, an equal sign, then the
representation of the evaluated expression. For example:
>>> user = 'eric_idle'
>>> member_since = date(1975, 7, 31)
>>> f'{user=} {member_since=}'
"user='eric_idle' member_since=datetime.date(1975, 7, 31)"
The usual f-string format specifiers allow more control over how the
result of the expression is displayed:
>>> delta = date.today() - member_since
>>> f'{user=!s} {delta.days=:,d}'
'user=eric_idle delta.days=16,075'
The = specifier will display the whole expression so that calculations
can be shown:
>>> print(f'{theta=} {cos(radians(theta))=:.3f}')
theta=30 cos(radians(theta))=0.866

This was introduced in Python 3.8. It saves writing a lot of f'expr = {expr}' boilerplate. You can check the docs at What's new in Python 3.8.
A nice example was shown by Raymond Hettinger in his tweet:
>>> from math import radians, sin
>>> for angle in range(360):
...     print(f'{angle=}\N{degree sign} {(theta:=radians(angle))=:.3f}')
angle=0° (theta:=radians(angle))=0.000
angle=1° (theta:=radians(angle))=0.017
angle=2° (theta:=radians(angle))=0.035
angle=3° (theta:=radians(angle))=0.052
angle=4° (theta:=radians(angle))=0.070
angle=5° (theta:=radians(angle))=0.087
angle=6° (theta:=radians(angle))=0.105
angle=7° (theta:=radians(angle))=0.122
angle=8° (theta:=radians(angle))=0.140
angle=9° (theta:=radians(angle))=0.157
angle=10° (theta:=radians(angle))=0.175
...
You can also check out this to get the underlying idea on why this was proposed in the first place.

As mentioned here:
Equals signs are now allowed inside f-strings starting with Python 3.8. This lets you quickly evaluate an expression while outputting the expression that was evaluated. It's very handy for debugging.
This means the code inside the f-string braces is evaluated, and the result is placed after the expression text and the equals sign.
So it effectively means:
"something={executed something}"

f'{a_string=}' is not exactly the same as f'a_string={a_string}'
The former escapes special characters while the latter does not.
e.g:
a_string = 'word 1 tab \t double quote \\" last words'
print(f'a_string={a_string}')
print(f'{a_string=}')
gets:
a_string=word 1 tab double quote \" last words
a_string='word 1 tab \t double quote \\" last words'
I just realised that the difference is that f'{a_string=}' prints the repr of the value, while f'a_string={a_string}' just prints the value itself. So, it would be more accurate to say:
f'{a_string=}' is the same as f'a_string={a_string!r}'
and allows formatting specifications.
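The equivalence is easy to check. Below is a small sketch, assuming Python 3.8+, with a made-up sample value. It also illustrates a detail from the documentation: once a format specification is given without an explicit conversion, the value is formatted with format() (str-style) rather than repr(), unless !r is added.

```python
# Requires Python 3.8+ for the "=" specifier.
a_string = 'tab\there'

# Without a format spec, "=" uses repr(), exactly like !r.
default_form = f'{a_string=}'
explicit_repr = f'a_string={a_string!r}'

# With a format spec and no conversion, the plain value is formatted instead.
with_spec = f'{a_string=:>12}'

# Adding !r brings the repr back, and the spec is applied to that repr.
with_repr_and_spec = f'{a_string=!r:>12}'

print(default_form == explicit_repr)
print(with_spec)
print(with_repr_and_spec)
```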

Related

strings inside python template- and f-strings

Could someone please break down why "{dic['string_key']}".format(dic=dic) considers the single quotations to be part of the string-key and does a lookup under dic["'string_key'"] ?
a) and b) show the correct way, however I am missing the reason.
a = "{dic[string_key]}"
print(a.format(dic=dic))
b = f"{dic['string_key']}"
print(b)
format and f-strings use braces differently.
With str.format, the contents of the braces are part of a mini-language used by format to substitute its arguments into the format string.
In an f-string, it's an arbitrary Python expression to evaluate.
In this case:
a = "{dic[string_key]}"
print(a.format(dic=dic))
... the string is formatted when .format() is called and that function uses a formatting language that is documented here https://docs.python.org/3/library/string.html#formatstrings.
But in this case:
b = f"{dic['string_key']}"
print(b)
... the string is formatted when the assignment to b is executed, by Python itself. The expression inside the f-string follows normal Python syntax, with the exception that, before Python 3.12, you cannot reuse the quotes used to enclose the f-string.
As a result, you need to specify the quotes around the dictionary key as you would normally, while the mini-language for .format() expects you to omit them.
Also note that this makes a lot of sense: b = f"{dic[string_key]}" should use the value of the variable string_key to index the dictionary.
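To make the difference concrete, here is a minimal sketch. The dictionary contents are invented for illustration and deliberately contain both a plain key and a key that includes the quote characters, so each lookup visibly lands on a different entry.

```python
# Hypothetical data: one plain key, one key that literally contains quotes.
dic = {"string_key": "plain", "'string_key'": "quoted"}

# str.format: the text inside [...] is taken literally as the key, quotes included.
via_format_unquoted = "{dic[string_key]}".format(dic=dic)
via_format_quoted = "{dic['string_key']}".format(dic=dic)

# f-string: the braces hold an ordinary Python expression, so the key needs quotes.
via_fstring = f"{dic['string_key']}"

print(via_format_unquoted)  # plain
print(via_format_quoted)    # quoted
print(via_fstring)          # plain
```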

Why isn't it possible to use backslashes inside the braces of f-strings? How can I work around the problem?

In Python >=3.6, f-strings can be used as a replacement for the str.format method. As a simple example, these are equivalent:
'{} {}'.format(2+2, "hey")
f'{2+2} {"hey"}'
Disregarding format specifiers, I can basically move the positional arguments of str.format inside braces in an f-string. Note specifically that I am allowed to just put str literals in here, although it may seem a bit unwieldy.
There are however some limitations. Specifically, before Python 3.12, backslashes in any shape or form are disallowed inside the braces of an f-string:
'{}'.format("new\nline") # legal
f'{"new\nline"}' # illegal
f'{"\\"}' # illegal
I cannot even use \ to split up a long line if it's inside the braces;
f'{2+\
2}' # illegal
even though this usage of \ is perfectly allowed inside normal str's;
'{\
}'.format(2+2) # legal
It seems to me that a hard stop is coded into the parser if it sees the \ character at all inside the braces of an f-string. Why is this limitation implemented? Though the docs specify this behavior, they do not explain why.
You seem to expect
'{}'.format("new\nline")
and
f'{"new\nline"}'
to be equivalent. That's not what I would expect, and it's not how backslashes in f-strings worked back in the pre-release versions of Python 3.6 where backslashes between the braces were allowed. Back then, you'd get an error because
"new
line"
is not a valid Python expression.
As just demonstrated, backslashes in the braces are confusing and ambiguous, and they were banned to avoid confusion:
The point of this is to disallow convoluted code like:
>>> d = {'a': 4}
>>> f'{d[\'a\']}'
'4'
In addition, I'll disallow escapes to be used for brackets, as in:
>>> f'\x7bd["a"]}'
'4'
(where chr(0x7b) == "{").
It's annoying that you can't do this:
things = ['Thing one','Thing two','Thing three']
print(f"I have a list of things: \n{'\n'.join(things)}")
But you can do this:
things = ['Thing one','Thing two','Thing three']
nl = '\n'
print(f"I have a list of things:\n{nl.join(things)}")
For new lines, you can use os.linesep instead of \n (note that on Windows os.linesep is '\r\n', so the result is not always identical). For example:
>>> import os
>>>
>>> print(f"Numbers:\n{os.linesep.join(map(str, [10, 20, 30]))}")
Numbers:
10
20
30
I am not sure if this helps, but instead of the illegal
f'{"new\nline"}'
one could use
f'{"new"+chr(10)+"line"}'
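Two more workarounds that behave the same on every Python 3.6+ version, as a rough sketch: do the backslash work outside the braces and interpolate the result, or fall back to str.format. The names joined, message_a and message_b are just for illustration.

```python
things = ['Thing one', 'Thing two', 'Thing three']

# Option 1: do the backslash work outside the braces, then interpolate the result.
joined = '\n'.join(things)
message_a = f"I have a list of things:\n{joined}"

# Option 2: str.format arguments are ordinary Python code,
# so backslashes in them were never restricted.
message_b = "I have a list of things:\n{}".format('\n'.join(things))

print(message_a)
```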

Replicating behavior of the Python string.split() function in Qt

I'm currently trying to exactly replicate the behavior of the Python split() function (the default version, without any arguments) in Qt.
I have been told that the default delimiter is any number of CR/LF/TAB symbols, therefore I tried using the following:
s_body.split(QRegExp("[\r\n\t ]+"), QString::SkipEmptyParts);
However, this does not replicate its behavior precisely.
If I run this on approximately 4 megabytes worth of text, and count the number of unique words, I get 133293. However, if I do the same using the Python function, the result becomes 133367 - therefore there is still something amiss.
Any feedback on how to fix this would be greatly welcome.
My guess is that Python is not skipping empty strings, and they account for the difference. If you want your function to mimic Python's behavior, you can choose to include empty strings; or, if you want the behavior you've implemented, you can write s_body.split() in Python: with no arguments, it treats any run of whitespace as a single separator, which means you get no empty strings back.
With a unicode string, python's split() will, quite naturally, split on the set of all unicode whitespace characters, not just the feeble ascii set:
>>> s = '\t_\n_\x0b_\x0c_\r_ _\x85_\xa0_\u1680_\u2000_\u2001_\u2002_\u2003_\u2004_\u2005_\u2006_\u2007_\u2008_\u2009_\u200a_\u2028_\u2029_\u202f_\u205f_\u3000_'
>>> len(s)
50
>>> len(s.split())
25
>>> ''.join(s.split())
'_________________________'
Now let's see what Qt does (using PyQt4):
>>> qs = QString(s)
>>> r = qs.split(QRegExp('\\s+'), QString.SkipEmptyParts)
>>> r.count()
24
>>> str(r.join(''))
'______\x85___________________'
So, almost there, but for some reason U+0085 NEL (Next Line) is not recognized as whitespace in Qt4 - but that's easily remedied:
>>> r = qs.split(QRegExp('[\\s\x85]+'), QString.SkipEmptyParts)
>>> r.count()
25
>>> str(r.join(''))
'_________________________'
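If you need to build the character class for another regex engine, one option is to ask Python which code points it treats as whitespace. This is only a sketch in plain Python 3 (no Qt involved), and the exact set depends on the Unicode version bundled with your interpreter. The sample string is made up.

```python
import re
import sys

# Every code point that str.isspace() accepts; str.split() with no
# arguments uses the same notion of whitespace for str objects.
whitespace = [cp for cp in range(sys.maxunicode + 1) if chr(cp).isspace()]
print([hex(cp) for cp in whitespace])

# Cross-check: splitting on runs of \s and dropping empty pieces
# should agree with the default split().
sample = 'one\x85two\u2003three \t\n four'
tokens = [w for w in re.split(r'\s+', sample) if w]
print(tokens == sample.split())
```

The printed list can then be turned into an explicit character class for an engine whose \s is narrower than Python's.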

Printing subscript in python

In Python 3.3, is there any way to make a part of text in a string subscript when printed?
e.g. H₂ (H and then a subscript 2)
If all you care about are digits, you can use the str.maketrans() and str.translate() methods:
example_string = "A0B1C2D3E4F5G6H7I8J9"
SUB = str.maketrans("0123456789", "₀₁₂₃₄₅₆₇₈₉")
SUP = str.maketrans("0123456789", "⁰¹²³⁴⁵⁶⁷⁸⁹")
print(example_string.translate(SUP))
print(example_string.translate(SUB))
Which will output:
A⁰B¹C²D³E⁴F⁵G⁶H⁷I⁸J⁹
A₀B₁C₂D₃E₄F₅G₆H₇I₈J₉
Note that this won't work in Python 2 - see Python 2 maketrans() function doesn't work with Unicode for an explanation of why that's the case, and how to work around it.
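For chemical formulas specifically, the translation table can be wrapped in a small helper. This is just a sketch for Python 3, and the function name subscript_digits is made up.

```python
SUB = str.maketrans("0123456789", "₀₁₂₃₄₅₆₇₈₉")

def subscript_digits(formula: str) -> str:
    """Return formula with every ASCII digit replaced by its Unicode subscript."""
    return formula.translate(SUB)

print(subscript_digits("H2O"))      # H₂O
print(subscript_digits("C6H12O6"))  # C₆H₁₂O₆
```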
The output performed on the console is simple text. If the terminal supports unicode (most do nowadays) you can use unicode's subscripts. (e.g H₂) Namely the subscripts are in the ranges:
0x208N for numbers, +, -, =, (, ) (N goes from 0 to F)
0x209N for letters
For example:
In [6]: print(u'H\u2082O\u2082')
H₂O₂
For more complex output you must use a markup language (e.g. HTML) or a typesetting language (e.g. LaTeX).
Using code like this works too:
print('\N{GREEK SMALL LETTER PI}r\N{SUPERSCRIPT TWO}')
print('\N{GREEK CAPITAL LETTER THETA}\N{SUBSCRIPT TWO}')
The output being:
πr²
Θ₂
Note that this works on Python versions 3.3 and higher only.
If you want to use it on the axes of a plot you can do:
import matplotlib.pyplot as plt
plt.plot([1])
plt.ylabel(r'$H_{2}$')
plt.show()
which gives a plot whose y-axis label is rendered as H with a subscript 2.
With this approach you can also put letters in superscript or subscript.
Pass the Unicode escape for the character you want ('\uXXXX') to format(). The table of Unicode subscripts and superscripts on Wikipedia lists the available code points.
For example:
"10{}".format('\u00B2') # superscript 2

regarding backslash from postgresql

I have a noob question.
I have a record in a table that looks like '\1abc'
I then use this string as a regex replacement in re.sub("([0-9])",thereplacement,"2")
I'm a little confused by the backslashes. The string I got back was "\\1abc"
Are you using Python interactively?
In a regular string literal you need to escape backslashes in your code, or use r"...". If you are running Python interactively and don't assign the result from your database to a variable, it will be printed using its __repr__() method.
>>> s = "\\1abc"
>>> s
'\\1abc' # <-- How it's represented in Python code
>>> print(s)
\1abc # <-- The actual string
Also, your re.sub is a bit weird. Maybe you meant [0-9] as the pattern (matching a single digit)? The arguments are probably switched too, if thereplacement is your input. This is the syntax:
re.sub(pattern, repl, string, count=0)
So my guess is you expect something like this:
>>> s_in = yourDbMagic() # Which returns \1abc
>>> s_out = re.sub("[0-9]", "2", s_in)
>>> print(s_in, s_out)
\1abc \2abc
Edit: Tried to better explain escaping/representation.
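Here is a short sketch of the same point as a script. The variable stored is a stand-in for the value fetched from the database (an assumption, since the query code isn't shown). It shows that the string holds a single backslash, and what happens when it is used as the replacement exactly as in the question.

```python
import re

# Stand-in for the database value; same string as r"\1abc".
stored = "\\1abc"

print(len(stored))   # 5 characters: one backslash, then "1abc"
print(repr(stored))  # the repr shows the backslash doubled
print(stored)        # the actual text, with a single backslash

# Used as a replacement, \1 is a backreference to group 1 of the pattern,
# so the matched digit is kept and "abc" is appended.
result = re.sub("([0-9])", stored, "2")
print(result)        # 2abc
```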
Note that in PostgreSQL you can make \ stop being an escape character in string literals by setting standard_conforming_strings to on.
