How to ignore backslashes as escape characters in Python? [duplicate]

This question already has answers here:
How to write string literals in Python without having to escape them?
(6 answers)
Closed 7 months ago.
I know this is similar to many other questions about backslashes, but it deals with a specific problem that has not yet been addressed. Is there a mode that completely disables backslashes as escape characters in a print statement? I need this for ASCII art, since it is very difficult to get the positioning right when every backslash must be doubled.
print('''
/\\/\\/\\/\\/\\
\\/\\/\\/\\/\\/
''')

Prefix the string literal with r (for "raw") and its backslashes will be taken literally, with no escape processing:
>>> # Your original
>>> print('''
... /\\/\\/\\/\\/\\
... \\/\\/\\/\\/\\/
... ''')
/\/\/\/\/\
\/\/\/\/\/
>>> # as a raw string instead
>>> print(r'''
... /\\/\\/\\/\\/\\
... \\/\\/\\/\\/\\/
... ''')
/\\/\\/\\/\\/\\
\\/\\/\\/\\/\\/
Raw strings are often used for regular expressions, where doubling every backslash gets tedious. Note that with the r prefix you write each backslash once, exactly as it should appear; as the second example shows, doubled backslashes in a raw string are printed doubled. There are a few other prefix letters, including f (for format strings, which behave differently), b (a literal bytes object instead of a string), and u, which designated Unicode strings in Python 2 and is accepted but has no effect in Python 3.
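As a minimal sketch of what the question asks for, the ASCII art can be written with single backslashes in a raw triple-quoted string. The variable names `art_raw` and `art_escaped` are made up for this illustration.

```python
# Raw triple-quoted string: each backslash is written once, as it should appear.
art_raw = r'''
/\/\/\/\/\
\/\/\/\/\/
'''

# The same text as a regular string, where every backslash must be doubled.
art_escaped = '''
/\\/\\/\\/\\/\\
\\/\\/\\/\\/\\/
'''

print(art_raw)
print(art_raw == art_escaped)
```

Both literals produce the same string, so the raw form only changes how the source code looks, not the value that gets printed.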

Related

Why is the string automatically being changed? Is it because of backslash \ in the string? [duplicate]

This question already has answers here:
How can I put an actual backslash in a string literal (not use it for an escape sequence)?
(4 answers)
Closed 2 years ago.
Say I assign variable
x = '\\dnassmb1\biloadfiles_dev\Workday'
print(x)
Output:
'\\dnassmb1\x08iloadfiles_dev\\Workday'
I would like to know why it changes to "x08.." specifically, how to avoid that automatic change, and how to use the string as it is. Thank you!

The backslash has a special meaning inside Python string literals: it starts an escape sequence, which is how special characters are written in a string. In your literal, \b is the escape for the backspace character, which is displayed as \x08.
To get the string above with its backslashes intact, double each one:
x = '\\\\dnassmb1\\biloadfiles_dev\\Workday'
print(x)
Every backslash that should appear in the result is written twice. The first backslash of each pair tells Python that the character after it is an ordinary part of the string with no special meaning.

Use raw strings:
x = r'\\dnassmb1\biloadfiles_dev\Workday'
This will prevent Python from treating your backslashes as the start of escape sequences. See string and byte literals in the Python documentation for a full treatment of string parsing.
It's important to pay close attention to the difference between representation and value here. Just because a string appears to have four backslashes in it, doesn't mean that those backslashes are in the value of the string. Consider:
>>> x = '\\dnassmb1\biloadfiles_dev\Workday' # regular string
>>> y = r'\\dnassmb1\biloadfiles_dev\Workday' # raw string
>>> print(x); print(y)
\dnassmbiloadfiles_dev\Workday
\\dnassmb1\biloadfiles_dev\Workday
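Here, x and y are both just strings, once Python has parsed them. But even though the parts inside the quotes are the same, the characters in the two strings are different. In x, \\ collapsed to a single backslash and \b became a backspace character, which is why the 1 appears to be missing when printed on most terminals. In y's case, you see exactly the number of backslashes you put in.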
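To see the difference without depending on how a terminal renders the backspace, it can help to compare the repr and the length of the two strings. This is only a sketch: the regular literal below writes `\\W` instead of `\W` so that it does not rely on an unrecognized escape, which is a small change from the original example.

```python
x = '\\dnassmb1\biloadfiles_dev\\Workday'   # regular literal: \b becomes a backspace
y = r'\\dnassmb1\biloadfiles_dev\Workday'   # raw literal: backslashes kept as typed

print(repr(x))          # repr shows the backspace as \x08
print(repr(y))          # repr doubles every real backslash
print(len(x), len(y))   # y is longer: it keeps both leading backslashes and "\b" as two characters
```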

What is the right way to encode a string with backslashes? [duplicate]

This question already has answers here:
How to fix "<string> DeprecationWarning: invalid escape sequence" in Python?
(2 answers)
Closed 4 years ago.
In the given example: "\info\more info\nName"
how would I turn this into bytes?
I tried using unicode-escape, but that didn't seem to work :(
data = "\info\more info\nName"
dataV2 = str.encode(data)
FinalData = dataV2.decode('unicode-escape').encode('utf_8')
print(FinalData)
This is where I should get b'\info\more info\nName',
but something unexpected happens and I get DeprecationWarnings in my terminal.
I'm assuming that it's because the backslashes form an invalid escape sequence, but I need them for this project.

Backslashes before characters indicate an attempt to escape the character that follows to make it into a special character of some sort. You get the DeprecationWarning because Python is (finally) going to make unrecognized escapes an error, rather than silently treating them as a literal backslash followed by the character.
To fix, either double your backslashes (I'm not sure whether you intended a newline; if so, leave the backslash before the n single):
data = "\\info\\more info\\nName"
or, if you want all the backslashes to be literal backslashes (the \n shouldn't be a newline), then you can use a raw string by prefixing with r:
data = r"\info\more info\nName"
which disables backslash escape processing; the only remaining special case is a backslash directly before the quote character.
Note that if you just let data echo in the interactive interpreter, it will show the backslashes as doubled (because it implicitly uses the repr of the str, which is what you'd type to reproduce it). To avoid that, print the str to see what it actually looks like:
>>> "\\info\\more info\\nName" # repr produced by simply evaluating it, which shows backslashes doubled, but there's really only one each time
'\\info\\more info\\nName'
>>> print("\\info\\more info\\nName") # print shows the "real" contents
\info\more info\nName
>>> print("\\info\\more info\nName") # With new line left in place
\info\more info
Name
>>> print(r"\info\more info\nName") # Same as first option, but raw string means no doubling backslashes
\info\more info\nName
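For the "turn this into bytes" part of the question, a minimal sketch could look like the following, assuming UTF-8 is the wanted encoding and that every backslash (including the one before the n) is meant literally. The name `encoded` is made up for this example.

```python
data = r"\info\more info\nName"    # raw literal: every backslash is a real backslash
encoded = data.encode("utf-8")     # str -> bytes, no unicode-escape round trip needed

print(encoded)                     # the bytes repr shows each backslash doubled
print(encoded.decode("utf-8"))     # decoding and printing shows the real contents
```

The doubled backslashes in the first line of output are only the repr of the bytes object; the value contains one backslash at each position.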

You can escape a backslash with another backslash.
data = "\\info\\more info\nName"
You could also use a raw string for the parts that don't need escapes.
data = r"\info\more info""\nName"
Note that a raw string literal cannot end in an odd number of backslashes, so a trailing backslash has to come from a regular string literal.
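One possible workaround for the trailing-backslash limitation is to place a short regular literal next to the raw one; adjacent string literals are joined into a single string. The path `C:\temp` and the name `folder` are hypothetical, chosen only for illustration.

```python
# Writing the whole path as one raw literal ending in a backslash is a syntax error,
# because the final backslash keeps the closing quote from ending the string.
folder = r"C:\temp" "\\"             # raw part + regular literal for the trailing backslash

# The same adjacency trick keeps a real newline only where it is wanted.
data = r"\info\more info" "\nName"

print(folder)
print(data)
```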

Difference between u"string" and ur"string" in Python [duplicate]

This question already has answers here:
What exactly do "u" and "r" string prefixes do, and what are raw string literals?
(7 answers)
Closed 6 years ago.
From documentation:
The solution is to use Python’s raw string notation for regular
expression patterns; backslashes are not handled in any special way in
a string literal prefixed with 'r'. So r"\n" is a two-character string
containing '\' and 'n', while "\n" is a one-character string
containing a newline. Usually patterns will be expressed in Python
code using this raw string notation.
Types also match; type(u"text") == type(ur"text"), and same goes when you remove u. Therefore, I have to ask: what is the difference between these two? If there is no difference, why use r at all?

For example:
>>> len(ur"tex\t")
5
>>> len(u"tex\t")
4
Without r, the \t is one character (the tab) so the string has length 4.
Use r if you want to build a regular expression that involves \. In a non-r string, you'd have to escape each of these, which is no fun.
>>> len(u"\\")
1
>>> len(ur"\\")
2
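The sessions above use Python 2 syntax; in Python 3 the ur prefix is a syntax error and every str is already Unicode, so plain r is enough. A rough Python 3 equivalent, with a made-up regular expression to show why r is convenient, might be:

```python
import re

raw_len = len(r"tex\t")      # backslash and t stay two separate characters
plain_len = len("tex\t")     # \t is a single tab character

# The same pattern written raw and with doubled backslashes.
pattern_raw = r"\d+\.\d+"
pattern_escaped = "\\d+\\.\\d+"

print(raw_len, plain_len)
print(pattern_raw == pattern_escaped)
print(bool(re.fullmatch(pattern_raw, "3.14")))
```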

python regex: how to remove hex dec characters from string [duplicate]

This question already has answers here:
What does a leading `\x` mean in a Python string `\xaa`
(2 answers)
Closed 7 years ago.
text="\xe2\x80\x94"
print re.sub(r'(\\(?<=\\)x[a-z0-9]{2})+',"replacement_text",text)
output is —
How can I handle the hexadecimal escape characters in this situation?

Your input doesn't have backslashes. It has 3 bytes, the UTF-8 encoding for the U+2014 EM DASH character:
>>> text = "\xe2\x80\x94"
>>> len(text)
3
>>> text[0]
'\xe2'
>>> text.decode('utf8')
u'\u2014'
>>> print text.decode('utf8')
—
You either need to match those UTF-8 bytes directly, or decode from UTF-8 to unicode and match the codepoint. The latter is preferable; always try to deal with text as Unicode to simplify how many characters you have to transform at a time.
Also note that Python's repr() output (which is used implicitly when echoing in the interactive interpreter or when printing lists, dicts or other containers) uses \xhh escape sequences to represent any non-printable character. For UTF-8 byte strings, that includes anything outside the ASCII range. You could just replace anything outside that range with:
re.sub(r'[\x80-\xff]+', "replacement_text", text)
Take into account that this'll match multiple UTF-8-encoded characters in a row, and replace these together as a group!
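The session above is Python 2. A sketch of the "decode first, then match" approach in Python 3 could look like this, assuming the data arrives as a bytes object; the surrounding letters "a" and "b" and the names `raw` and `cleaned` are added only to make the replacement visible.

```python
import re

raw = b"\xe2\x80\x94"           # three UTF-8 bytes
text = raw.decode("utf-8")      # one character: U+2014 EM DASH

# Replace each run of non-ASCII characters in the decoded text.
cleaned = re.sub(r"[^\x00-\x7f]+", "replacement_text", "a" + text + "b")

print(len(raw), len(text))
print(cleaned)
```

As with the byte-level pattern, consecutive non-ASCII characters are replaced together as one group.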

Your input contains three bytes written in hex notation, not the literal characters "\xe2\x80\x94".
\x is just the way to say that the following two characters should be interpreted as a hexadecimal value.
This was explained in this post.

backslash in Yaml string [duplicate]

This question already has answers here:
Why do backslashes appear twice?
(2 answers)
Closed 8 years ago.
So I'm using YAML for some configuration files and PyYAML to parse them.
For one field I have something like:
host: HOSTNAME\SERVER,5858
But when it gets parsed, here is what I get:
{
"host": "HOSTNAME\\SERVER,5858"
}
With 2 backslashes. I tried every combination of single quotes, double quotes, etc.
What's the best way to parse it correctly?
Thanks

len("\\") == 1. What you see is the representation of the string as a Python string literal. The backslash has a special meaning in a Python literal, e.g. "\n" is a single character (a newline). To get a literal backslash in a string, it has to be escaped: "\\".

You aren't getting two backslashes. Python is displaying the single backslash as \\ so that you don't think you've actually got a \S character (which doesn't exist... but e.g. \n does, and Python is trying to be as unambiguous as possible) in your string. Here's proof:
>>> data = {"host": "HOSTNAME\\SERVER,5858"}
>>> print(data["host"])
HOSTNAME\SERVER,5858
>>>
For more background, check out the documentation for repr().
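The same effect can be reproduced without any YAML library. The sketch below assumes the output in the question came from a repr or a JSON dump of the parsed dict, which the question does not actually state; `config` and `dumped` are names made up here.

```python
import json

config = {"host": "HOSTNAME\\SERVER,5858"}   # the value holds one real backslash

print(config["host"])          # printing the value shows a single backslash
print(repr(config["host"]))    # the repr doubles it
dumped = json.dumps(config)    # JSON also escapes the backslash on output
print(dumped)
print(json.loads(dumped) == config)
```

In each case the doubling happens only when the string is written out in an escaped notation; the stored value does not change.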
