Conversion of strings like \\uXXXX in Python [duplicate]

This question already has answers here:
Process escape sequences in a string in Python
(8 answers)
Closed 7 months ago.
I receive a string like this from a third-party service:
>>> s
'\\u0e4f\\u032f\\u0361\\u0e4f'
I know that this string actually contains sequences of a single backslash, lowercase u etc. How can I convert the string such that the '\\u0e4f' is replaced by '\u0e4f' (i.e. '๏'), etc.? The result for this example input should be '๏̯͡๏'.

In 2.x:
>>> u'\\u0e4f\\u032f\\u0361\\u0e4f'.decode('unicode-escape')
u'\u0e4f\u032f\u0361\u0e4f'
>>> print u'\\u0e4f\\u032f\\u0361\\u0e4f'.decode('unicode-escape')
๏̯͡๏

The Python documentation lists the encodings supported by the .encode() and .decode() methods. The Python-specific ones in the second table include unicode_escape.

Python3:
bytes("\\u0e4f\\u032f\\u0361\\u0e4f", "ascii").decode("unicode-escape")
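As a minimal sketch of the Python 3 approach above, the conversion can be wrapped in a small helper. The function name decode_unicode_escapes is made up for illustration, and the sketch assumes the input is pure ASCII, as in the question; anything else raises UnicodeEncodeError instead of being silently mangled.

```python
def decode_unicode_escapes(text):
    """Turn literal backslash-u sequences in text into the characters they name."""
    # Encoding to ASCII fails loudly on non-ASCII input
    # rather than corrupting it.
    return text.encode("ascii").decode("unicode-escape")


s = "\\u0e4f\\u032f\\u0361\\u0e4f"  # literal backslashes, as received
decoded = decode_unicode_escapes(s)
print(len(s), len(decoded))  # 24 4
```

If the input can mix escapes with real non-ASCII characters, this helper is not enough as written and the string would need to be handled piece by piece.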

Related

How Python decodes UTF8 Encoding in String Format [duplicate]

This question already has answers here:
Convert "\x" escaped string into readable string in python
(4 answers)
Closed 1 year ago.
I have a string that contains the octal escapes of UTF-8 encoded bytes:
s = '\\346\\235\\216\\346\\265\\267\\347\\216\\211'
I need to decode it, but the only way I have found is this:
result = eval(bytes(f"b'{s}'", encoding="utf8")).decode('utf-8')
This is not safe, so is there a better way?
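Use ast.literal_eval(). Unlike eval(), it only evaluates Python literals, so it cannot execute arbitrary code.
You also don't need to call bytes(), since literal_eval() already returns a bytes object here.
import ast
result = ast.literal_eval(f"b'{s}'").decode('utf-8')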
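Decoding with unicode-escape gets you most of the way, but it yields one character per byte, so re-encode as latin-1 and decode as UTF-8 to get the text:
'\\346\\235\\216\\346\\265\\267\\347\\216\\211'.encode('utf8').decode('unicode-escape').encode('latin-1').decode('utf-8')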
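Note that decoded_string = s.decode("utf8") will not work here: in Python 3 a str has no .decode() method, and even on a bytes object, UTF-8 decoding leaves the backslash escapes untouched.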
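For comparison, here is a hedged sketch that runs both eval-free routes on the string from the question and checks that they agree. The variable names are illustrative. The literal_eval route assumes s contains no quote characters or newlines, and the codec route assumes s is pure ASCII.

```python
import ast

s = "\\346\\235\\216\\346\\265\\267\\347\\216\\211"  # literal backslashes

# Option 1: let the parser read the text as a bytes literal.
# This assumes s contains no quote characters or newlines.
via_literal = ast.literal_eval(f"b'{s}'").decode("utf-8")

# Option 2: no parsing at all. unicode-escape turns each octal escape into
# the code point with that value; latin-1 maps those back to bytes one for one.
raw_bytes = s.encode("ascii").decode("unicode-escape").encode("latin-1")
via_codecs = raw_bytes.decode("utf-8")

print(via_literal == via_codecs)  # True
```

The second option avoids building source code from untrusted text, which may be preferable when the string comes from outside the program.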

Python console returns single quotes for string? [duplicate]

This question already has answers here:
Understanding difference between Double Quote and Single Quote with __repr__()
(3 answers)
Closed 2 years ago.
Why does the Python console return single quotes for string literals regardless of the quote delimiters used?
>>> '1'
'1'
>>> "1"
'1'
>>> """1"""
'1'
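The console displays the repr() of the value, and a str object does not remember which delimiters were used to write it. repr() prefers single quotes and switches to double quotes only when the string contains a single quote and no double quotes. If you really need the quotes to be part of the string, include them as literal quote characters.
>>> "\"\"Look, I'm around two double quotes!!!\"\""
'""Look, I\'m around two double quotes!!!""'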
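The quoting behaviour can be checked directly with repr(), which is what the interactive console prints for a bare expression. This is only a small illustration; the sample strings are made up.

```python
samples = ['1', "1", """1""", "it's", 'say "hi"', "it's \"both\""]

for text in samples:
    # repr() picks the quote style from the content, not from how
    # the literal was written in the source.
    print(repr(text))

# The three spellings of 1 are equal strings, so they display alike.
print('1' == "1" == """1""")  # True
```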

Changing string to ascii in python [duplicate]

This question already has answers here:
Convert a Unicode string to a string in Python (containing extra symbols)
(12 answers)
Closed 3 years ago.
I need to convert the word
name = 'Łódź'
to ASCII characters
output: 'Lodz'
I can't import any library like unicodedata.
I need to do it in pure Python.
I've tried to encode and then decode, but nothing worked.
Well, a simple method would be to map and replace. This also does not require any special imports.
name = 'Łódź'
name = name.replace('Ł', 'L')
name = name.replace('ó', 'o')
name = name.replace('ź', 'z')
print(name)
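When more than a few characters are involved, the same map-and-replace idea can be done in one pass with the built-in str.maketrans() and str.translate(), still without any imports. The table below is a hypothetical one covering Polish letters only, and the helper name to_ascii is made up for illustration.

```python
# Hypothetical mapping for illustration; extend it with whatever
# characters your data actually contains.
POLISH_TO_ASCII = str.maketrans("ąćęłńóśźżĄĆĘŁŃÓŚŹŻ", "acelnoszzACELNOSZZ")


def to_ascii(text):
    """Replace mapped characters in one pass; others are left unchanged."""
    return text.translate(POLISH_TO_ASCII)


print(to_ascii("Łódź"))  # Lodz
```

Characters missing from the table pass through as they are, so the result is only guaranteed to be ASCII if the table covers the input.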

Are there literal strings in Python [duplicate]

This question already has answers here:
How to write string literals in Python without having to escape them?
(6 answers)
Closed 5 years ago.
In F# there is something called a literal string (not string literal), basically if a string literal is preceded by # then it is interpreted as-is, without any escapes.
For example, if you want to write the path of a file in Windows (for an os.walk, for example), you would do it like this:
"d:\\projects\\re\\p1\\v1\\pjName\\log\\"
Or you could do this (the F# way):
#"d:\projects\re\p1\v1\pjName\log\"
The second variant looks much clearer and more pleasing to the eye. Is there something of the sort in Python? The documentation doesn't seem to have anything regarding that.
I am working in Python 3.6.3.
There are: https://docs.python.org/3/reference/lexical_analysis.html#string-and-bytes-literals
You can use the r prefix, which makes a raw string literal.
https://docs.python.org/2.0/ref/strings.html
TL;DR: use a lowercase r prefix
myString = r'\n'
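Applied to the path from the question, a raw literal might look like the sketch below. One caveat worth showing: a raw literal cannot end in a single backslash, so the trailing separator from the question has to be added separately. The variable names are illustrative.

```python
raw = r"d:\projects\re\p1\v1\pjName\log"
escaped = "d:\\projects\\re\\p1\\v1\\pjName\\log"
print(raw == escaped)  # True

# A raw literal cannot end in a single backslash (r"...\log\" is a
# syntax error), so append the trailing separator separately.
with_sep = raw + "\\"
print(with_sep)
```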

Convert bytes data inside a string to a true bytes object [duplicate]

This question already has answers here:
What is the difference between UTF-8 and ISO-8859-1? [closed]
(8 answers)
Converting Byte to String and Back Properly in Python3?
(2 answers)
Process escape sequences in a string in Python
(8 answers)
Closed 7 months ago.
In Python 3, I have a string like the following:
mystr = "\x00\x00\x01\x01\x80\x02\xc0\x02\x00"
This string was read from a file and it is the bytes representation of some text. To be clear, this is a unicode string, not a bytes object.
I need to transform mystr into a bytes object like the following:
mybytes = b"\x00\x00\x01\x01\x80\x02\xc0\x02\x00"
Notice that the translation should be literal. I don't want to encode the string.
Running .encode('utf-8') will escape the \.
If I manually copy and paste the content into a bytes literal, then everything works. What I couldn't find anywhere is how to convert it without copy and paste.
mystr.encode("latin-1") is what you want.
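A short check of why this works, using the values from the question: Latin-1 maps each code point from 0 to 255 to the single byte with the same value, whereas UTF-8 expands everything above 127 into two bytes. This sketch assumes every character in the string is at most U+00FF; anything higher makes the latin-1 encode raise UnicodeEncodeError.

```python
mystr = "\x00\x00\x01\x01\x80\x02\xc0\x02\x00"

# latin-1 maps code points 0-255 straight to the byte with the same value.
mybytes = mystr.encode("latin-1")
print(mybytes == b"\x00\x00\x01\x01\x80\x02\xc0\x02\x00")  # True

# utf-8 would instead expand the two characters above 127 into two bytes each.
print(len(mybytes), len(mystr.encode("utf-8")))  # 9 11
```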
