Printing subscript in python - python

In Python 3.3, is there any way to make a part of text in a string subscript when printed?
e.g. H₂ (H and then a subscript 2)

If all you care about are digits, you can use the str.maketrans() and str.translate() methods:
example_string = "A0B1C2D3E4F5G6H7I8J9"
SUB = str.maketrans("0123456789", "₀₁₂₃₄₅₆₇₈₉")
SUP = str.maketrans("0123456789", "⁰¹²³⁴⁵⁶⁷⁸⁹")
print(example_string.translate(SUP))
print(example_string.translate(SUB))
Which will output:
A⁰B¹C²D³E⁴F⁵G⁶H⁷I⁸J⁹
A₀B₁C₂D₃E₄F₅G₆H₇I₈J₉
Note that this won't work in Python 2 - see Python 2 maketrans() function doesn't work with Unicode for an explanation of why that's the case, and how to work around it.
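For the chemical-formula case in the question, the same translation table can be wrapped in a small helper. This is only a minimal sketch: the function name to_subscript is hypothetical, and it assumes that every ASCII digit in the input should become a subscript.

```python
# Translation table mapping ASCII digits to their Unicode subscript forms.
SUB = str.maketrans("0123456789", "₀₁₂₃₄₅₆₇₈₉")


def to_subscript(formula):
    """Return formula with every ASCII digit replaced by a subscript digit.

    Hypothetical helper: it does not distinguish counts from other digits.
    """
    return formula.translate(SUB)


print(to_subscript("H2O"))      # H₂O
print(to_subscript("C6H12O6"))  # C₆H₁₂O₆
```

Characters that are not in the table (letters, punctuation) pass through unchanged.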

Output written to the console is plain text. If the terminal supports Unicode (most do nowadays) you can use Unicode's subscript characters (e.g. H₂). The subscripts are in the ranges:
0x208N for numbers, +, -, =, (, ) (N goes from 0 to E)
0x209N for letters
For example:
In [6]: print(u'H\u2082O\u2082')
H₂O₂
For more complex output you must use a markup language (e.g. HTML) or a typesetting language (e.g. LaTeX).
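Because the subscript digits sit at consecutive code points starting at U+2080, they can also be computed instead of typed. The sketch below assumes only single digits are needed; subscript_digit is a hypothetical name, not a standard function.

```python
import unicodedata


def subscript_digit(n):
    """Return the Unicode subscript character for a single digit 0-9."""
    if not 0 <= n <= 9:
        raise ValueError("expected a single digit between 0 and 9")
    # SUBSCRIPT ZERO is U+2080; the other digits follow it in order.
    return chr(0x2080 + n)


print("H" + subscript_digit(2) + "O")        # H₂O
print(unicodedata.name(subscript_digit(2)))  # SUBSCRIPT TWO
```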

Using code like this works too:
print('\N{GREEK SMALL LETTER PI}r\N{SUPERSCRIPT TWO}')
print('\N{GREEK CAPITAL LETTER THETA}\N{SUBSCRIPT TWO}')
The output being:
πr²
Θ₂
Note that this works on Python versions 3.3 and higher. In Python 2, the \N{...} escape is only recognised inside unicode literals (u'...').

If you want to use it on the axes of a plot you can do:
import matplotlib.pyplot as plt
plt.plot([1])
plt.ylabel(r'$H_{2}$')
plt.show()
which labels the y-axis with H followed by a subscript 2.

You can also use letters, not only digits, as superscripts and subscripts with code like this.
Here format() is a function, and the argument passed to it is a Unicode escape ('\uXXXX').
Using the table "Unicode subscripts and superscripts" on Wikipedia, you can look up the code point for the character you need, for both superscripts and subscripts:
"10{}".format('\u00B2') # superscript 2

Related

What does = (equal) do in f-strings inside the expression curly brackets?

The usage of {} in Python f-strings is well known to execute pieces of code and give the result in string format (some tutorials here). However, what does the '=' at the end of the expression mean?
log_file = open("log_aug_19.txt", "w")
console_error = '...stuff...' # the real code generates it with regex
log_file.write(f'{console_error=}')
This is actually a brand-new feature as of Python 3.8.
Added an = specifier to f-strings. An f-string such as f'{expr=}'
will expand to the text of the expression, an equal sign, then the
representation of the evaluated expression.
Essentially, it facilitates the frequent use-case of print-debugging, so, whereas we would normally have to write:
f"some_var={some_var}"
we can now write:
f"{some_var=}"
So, as a demonstration, using a shiny-new Python 3.8.0 REPL:
>>> foo = 42
>>> print(f"{foo=}")
foo=42
>>>
From Python 3.8, f-strings support "self-documenting expressions", mostly for print de-bugging. From the docs:
Added an = specifier to f-strings. An f-string such as f'{expr=}' will
expand to the text of the expression, an equal sign, then the
representation of the evaluated expression. For example:
>>> user = 'eric_idle'
>>> member_since = date(1975, 7, 31)
>>> f'{user=} {member_since=}'
"user='eric_idle' member_since=datetime.date(1975, 7, 31)"
The usual f-string format specifiers allow more control over how the
result of the expression is displayed:
>>> delta = date.today() - member_since
>>> f'{user=!s} {delta.days=:,d}'
'user=eric_idle delta.days=16,075'
The = specifier will display the whole expression so that calculations
can be shown:
>>> print(f'{theta=} {cos(radians(theta))=:.3f}')
theta=30 cos(radians(theta))=0.866
This was introduced in Python 3.8. It saves writing a lot of f'expr = {expr}' when coding. You can check the docs at What's new in Python 3.8.
A nice example was shown by Raymond Hettinger in his tweet:
>>> from math import radians, sin
>>> for angle in range(360):
...     print(f'{angle=}\N{degree sign} {(theta:=radians(angle))=:.3f}')
angle=0° (theta:=radians(angle))=0.000
angle=1° (theta:=radians(angle))=0.017
angle=2° (theta:=radians(angle))=0.035
angle=3° (theta:=radians(angle))=0.052
angle=4° (theta:=radians(angle))=0.070
angle=5° (theta:=radians(angle))=0.087
angle=6° (theta:=radians(angle))=0.105
angle=7° (theta:=radians(angle))=0.122
angle=8° (theta:=radians(angle))=0.140
angle=9° (theta:=radians(angle))=0.157
angle=10° (theta:=radians(angle))=0.175
...
You can also check out this to get the underlying idea on why this was proposed in the first place.
As mentioned here:
Equals signs are now allowed inside f-strings starting with Python 3.8. This lets you quickly evaluate an expression while outputting the expression that was evaluated. It's very handy for debugging.
It means the code inside the f-string braces is evaluated, and the result is appended after the expression text and the equals sign.
So it virtually means:
"something={executed something}"
f'{a_string=}' is not exactly the same as f'a_string={a_string}'
The former escapes special characters while the latter does not.
e.g:
a_string = 'word 1 tab \t double quote \\" last words'
print(f'a_string={a_string}')
print(f'{a_string=}')
gets:
a_string=word 1 tab double quote \" last words
a_string='word 1 tab \t double quote \\" last words'
I just realised that the difference is that f'{a_string=}' prints the repr of the value, while f'a_string={a_string}' prints the value itself. So, it would be more accurate to say:
f'{a_string=}' is the same as f'a_string={a_string!r}'
and it also allows format specifications.
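The equivalence described above can be checked directly. This is a small sketch that assumes Python 3.8 or newer; the variable names are made up for illustration.

```python
value = 'tab\there'

# With no conversion or format spec, '=' shows the repr of the value.
debug_form = f'{value=}'
explicit_form = f'value={value!r}'
print(debug_form)                    # value='tab\there'
print(debug_form == explicit_form)   # True

# Whitespace around '=' is preserved in the output.
print(f'{value = }')                 # value = 'tab\there'

# When a format spec is given, the value is formatted with it instead of repr().
ratio = 2 / 3
print(f'{ratio=:.2f}')               # ratio=0.67
```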

Is there a proper way to set compound greek letters as a symbol in SymPy?

As silly as it may sound, I would like to use compound greek letters as a single symbol in SymPy. For example, if the following is entered in a Jupyter notebook:
import sympy as sp
ab = sp.Symbol("alpha beta")
sp.pprint(ab)
ab behaves as desired when used in symbolic manipulations, but the output is:
alpha beta
I would like the output to be:
α⋅β
I could use the subs command after manipulations, like so:
ab.subs({ab : sp.Symbol("alpha") * sp.Symbol("beta")})
but this is tedious and undesirable.
Symbol names can be any string, but the automatic conversion of Greek letter names to Greek letters for printing doesn't work for all input. I guess it doesn't try to split the string on spaces.
If you are using the Jupyter notebook, you can just set the symbol name to be the LaTeX of what you want
ab = Symbol(r'\alpha\cdot\beta')
(don't forget to prefix the string with r, so that Python doesn't eat the backslashes)
If you are using plain text output, you can set it to the Unicode string. This should work in the Jupyter notebook as well, although it will render slightly differently since it will be rendering the actual Unicode characters instead of the LaTeX.
ab = Symbol(u'α⋅β')
The proper way to use any Unicode letter as a symbol in SymPy is to pass it directly, i.e. symbols(u'any_unicode_you_want_just_copy_paste').
Code:
from sympy import symbols
ab = symbols(r'\alpha\cdot\beta')
print(ab)
a = symbols(u'α')
b = symbols(u'β')
bangla = symbols(u'ম')
print(a)
print(b)
print(bangla)
I cut those lines in my snap because it doesn't work currently and has higher scores

Replicating behavior of the Python string.split() function in Qt

I'm currently trying to exactly replicate the behavior of the Python split() function (the default version, without any arguments) in Qt.
I have been told that the default delimiter is any number of CR/LF/TAB symbols, therefore I tried using the following:
s_body.split(QRegExp("[\r\n\t ]+"), QString::SkipEmptyParts);
However, this does not replicate its behavior precisely.
If I run this on approximately 4 megabytes of text and count the number of unique words, I get 133293. However, if I do the same using the Python function, the result is 133367, so there is still something amiss.
Any feedback on how to fix this would be greatly welcome.
My guess is that Python is not skipping empty strings, and they account for the difference. If you want your function to mimic Python's functionality, you can choose to include empty strings, or if you want to get the behavior you've implemented, you can write s_body.split() in Python; with no arguments, it strips all whitespace between non-whitespace characters, which means you get no empty strings back.
With a unicode string, python's split() will, quite naturally, split on the set of all unicode whitespace characters, not just the feeble ascii set:
>>> s = '\t_\n_\x0b_\x0c_\r_ _\x85_\xa0_\u1680_\u2000_\u2001_\u2002_\u2003_\u2004_\u2005_\u2006_\u2007_\u2008_\u2009_\u200a_\u2028_\u2029_\u202f_\u205f_\u3000_'
>>> len(s)
50
>>> len(s.split())
25
>>> ''.join(s.split())
'_________________________'
Now let's see what Qt does (using PyQt4):
>>> qs = QString(s)
>>> r = qs.split(QRegExp('\\s+'), QString.SkipEmptyParts)
>>> r.count()
24
>>> str(r.join(''))
'______\x85___________________'
So, almost there, but for some reason U+0085 NEL (Next Line) is not recognized as whitespace in Qt4. That's easily remedied:
>>> r = qs.split(QRegExp('[\\s\x85]+'), QString.SkipEmptyParts)
>>> r.count()
25
>>> str(r.join(''))
'_________________________'
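To see which characters a Qt pattern would need to cover, one can list every code point that Python 3 treats as whitespace. This is a rough sketch resting on the assumption that str.split() with no arguments splits on exactly the characters for which str.isspace() is true; the loop at the end checks that assumption rather than taking it for granted.

```python
import sys

# Collect every code point that Python's str.isspace() accepts.
whitespace = [chr(cp) for cp in range(sys.maxunicode + 1) if chr(cp).isspace()]

# Print them as U+XXXX so they can be copied into a regex character class.
print(' '.join(f'U+{ord(ch):04X}' for ch in whitespace))

# Check that split() with no arguments really breaks on each of them.
for ch in whitespace:
    assert ('a' + ch + 'b').split() == ['a', 'b']
```

Any character in this list that Qt's \s does not match (such as U+0085 above) would have to be added to the character class by hand.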

get escaped unicode code from string

I seem to be having the opposite issue as everyone else in the development world. I need to generate escaped characters from strings. For instance, say I have the word MESSAGE:, I need to generate:
\\u004D\\u0045\\u0053\\u0053\\u0041\\u0047\\u0045\\u003A\\u0053\\u0069\\u006D
The closest thing I could get using Python was:
u'MESSAGE:'.encode('utf16')
# output = '\xff\xfeM\x00E\x00S\x00S\x00A\x00G\x00E\x00:\x00'
My first thought was that I could replace \x with \u00 (or something to that effect), but I quickly realized that wouldn't work. What can I do to output the escaped (unescaped?) string in Python (preferably)?
Before everyone starts "answering" and down voting, the escaped \u00... string is what my app is getting from another 3rd party app which I have no control over. I'm trying to generate my own test data so I don't have to rely on that 3rd party app.
Pierre's answer is nearly right, but the for x in u'MESSAGE:' bit would fail for characters above U+FFFF, except for ‘narrow builds’ (primarily Python 1.6–3.2 on Windows) which use UTF-16 for Unicode strings.
On ‘wide builds’ (and in 3.3+ where the distinction no longer exists), len(unichr(0x10000)) is 1 not 2. When this code point is UTF-16BE-encoded you get two surrogates taking up four bytes, so the output is '\\uD800DC00' instead of what you probably wanted, u'\\uD800\\uDC00'.
To cover it on both variants of Python you can do:
>>> h = u'MESSAGE:\U00010000'.encode('utf-16be').encode('hex')
# '004d004500530053004100470045003ad800dc00'
>>> ''.join(r'\u' + h[i:i+4] for i in range(0, len(h), 4))
'\\u004d\\u0045\\u0053\\u0053\\u0041\\u0047\\u0045\\u003a\\ud800\\udc00'
I think this (quick & dirty) code does what you want:
''.join('\\u' + x.encode('utf_16_be').encode('hex') for x in u'MESSAGE:')
# output: '\\u004d\\u0045\\u0053\\u0053\\u0041\\u0047\\u0045\\u003a'
Or if you want more '\':
''.join('\\\\u' + x.encode('utf_16_be').encode('hex') for x in u'MESSAGE:')
# output: '\\\\u004d\\\\u0045\\\\u0053\\\\u0053\\\\u0041\\\\u0047\\\\u0045\\\\u003a'
print _
# output: \\u004d\\u0045\\u0053\\u0053\\u0041\\u0047\\u0045\\u003a
If you absolutely need upper-case for hexadecimal codes:
''.join('\\u' + x.encode('utf_16_be').encode('hex').upper() for x in u'MESSAGE:')
# output: '\\u004D\\u0045\\u0053\\u0053\\u0041\\u0047\\u0045\\u003A'
There's no need to go through the .encode() step if you don't have characters outside the BMP (>0xFFFF):
>>> ''.join('\\u{:04x}'.format(ord(a)) for a in u'Message')
'\\u004d\\u0065\\u0073\\u0073\\u0061\\u0067\\u0065'
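The snippets above are Python 2 code (str.encode('hex') and print _ do not exist in Python 3). A possible Python 3 version of the same idea is sketched below; escape_utf16 is a hypothetical name, and it assumes one \uXXXX escape per UTF-16 code unit is wanted, so characters outside the BMP become a surrogate pair.

```python
def escape_utf16(text):
    """Return text as a run of \\uXXXX escapes, one per UTF-16 code unit."""
    # Big-endian UTF-16 without a BOM gives two bytes (four hex digits) per unit.
    hex_digits = text.encode('utf-16-be').hex()
    return ''.join('\\u' + hex_digits[i:i + 4]
                   for i in range(0, len(hex_digits), 4))


print(escape_utf16('MESSAGE:'))    # \u004d\u0045\u0053\u0053\u0041\u0047\u0045\u003a
print(escape_utf16('\U00010000'))  # \ud800\udc00
```

Appending .upper() to hex_digits gives upper-case hexadecimal digits if the consumer requires them.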

How to do a fancy print in ipython using sympy.pprint()

I'm trying to pprint() in Sympy a variable that I call barphi. What I want to get is
$\bar{\phi}$
when printed as pprint(barphi).
I tried
barphi = Symbol('\bar{phi}')
but it does not work. Any help? Thanks in advance.
This was answered on the SymPy mailing list.
There are two issues with what you wrote
First, Python treats a backslash followed by a character in a string as an escape sequence. The \b in your string becomes a backspace (see https://en.wikipedia.org/wiki/ASCII#ASCII_control_code_chart).
You need to either escape the \, i.e., use '\\bar{\\phi}', or, much easier, use a raw string, which just means putting an r in front of the quotes, like r'\bar{\phi}'.
Second, if you want to get LaTeX, pprint() will not do it (pprint pretty prints to 2D text). You should use init_printing() to enable LaTeX printing in the notebook.
Finally, as pointed out by Julien Rioux on the mailing list, you can just name the symbol phibar, and SymPy will automatically render it as \bar{\phi}, as you can see here even in Unicode
In [11]: Symbol('phibar')
Out[11]: φ̅
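The escaping issue from the first point can be seen in plain Python, without SymPy. A short illustrative sketch:

```python
plain = '\bar{phi}'       # \b is read as a backspace character (U+0008)
raw = r'\bar{\phi}'       # raw string: the backslashes are kept as written
doubled = '\\bar{\\phi}'  # escaped backslashes: same text as the raw string

print(repr(plain))        # '\x08ar{phi}'
print(raw == doubled)     # True
print(len(plain), len(r'\bar{phi}'))  # 8 9
```

The non-raw string is one character shorter because the two characters \b collapse into a single backspace.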
If you still want to get the latex code rather than printing it, you can do so by:
In [2]: from sympy.printing.latex import latex, translate
In [3]: latex(translate('phibar'),mode='inline')
Out[3]: '$\\bar{\\phi}$'
You can see the documentation for the latex function here.
The documentation for the translate function is:
Check for a modifier ending the string. If present, convert the
modifier to latex and translate the rest recursively.
Given a description of a Greek letter or other special character,
return the appropriate latex.
Let everything else pass as given.
>>> from sympy.printing.latex import translate
>>> translate('alphahatdotprime')
"{\\dot{\\hat{\\alpha}}}'"
