This question already has answers here:
How can I put an actual backslash in a string literal (not use it for an escape sequence)?
(4 answers)
Closed 7 months ago.
How can I prevent Python from interpreting a \ followed by digits as an escape sequence?
e.g.
I get DirectoryNameFromAnotherProgram (say it is equal to 'N:\Some Directory')
print(DirectoryNameFromAnotherProgram + '\1234.txt')
# prints:
# N:\Some DirectoryS4.txt
Since the string containing "\" comes as output from another script, I cannot change it.
Put a "\" in front of the "\". The meaning of "\" in a string literal is: the next character doesn't mean what it normally means. If the next character is not normally special (for example, a digit), it means something special now. If the next character does normally mean something special (for example, a backslash), it's not special now. Either way, the initial "\" has done its job and is removed.
Special case: if the next character is not normally special (for example, the "S" in your string), but cannot be made special (the sequence "\S" has no special meaning), then the backslash doesn't do anything and is not removed.
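A minimal sketch of the rule above, reusing the directory name from the question (the variable names `broken`, `fixed` and `raw` are only for illustration). It shows that '\123' is read as an octal escape for 'S', and that either doubling the backslash or using a raw string literal keeps it literal:

```python
# The directory string holds one real backslash after "N:".
directory = 'N:\\Some Directory'

# '\123' is an octal escape (0o123 == 83 == 'S'), so the backslash
# and the three digits collapse into one character.
broken = directory + '\1234.txt'

# Doubling the backslash, or using a raw string literal, keeps it literal.
fixed = directory + '\\1234.txt'
raw = directory + r'\1234.txt'

print(broken)  # N:\Some DirectoryS4.txt
print(fixed)   # N:\Some Directory\1234.txt
print(raw)     # N:\Some Directory\1234.txt
```

Note that escape processing only happens in string literals in source code. A string that arrives at run time from another program is never reinterpreted.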
This question already has answers here:
Why do backslashes appear twice?
(2 answers)
Closed 7 months ago.
I found a Python package on GitHub that doesn't work. It attempts to replace a substring within a URL with another string.
import re

string = "filename.txt"
rewrite = "c:\\windows\\system32\\drivers\\hosts"
url = "https://www.example.com/path?parameter=filename.txt"
# Fails: backslashes in the replacement are treated as regex escapes.
fullrewrite = re.sub(string, rewrite, url)
The string, rewrite, and url parameters are arbitrary and not hard-coded. I just put them there as an example (this is a path traversal testing library I'm trying to play around with).
When I run this code, I get a KeyError from re, which is expected according to the docs:
If you’re not using a raw string to express the pattern, remember that Python also uses the backslash as an escape sequence in string literals; if the escape sequence isn’t recognized by Python’s parser, the backslash and subsequent character are included in the resulting string. However, if Python would recognize the resulting sequence, the backslash should be repeated twice. This is complicated and hard to understand, so it’s highly recommended that you use raw strings for all but the simplest expressions.
I tried using repr to convert the string into a raw string:
raw = repr(rewrite)[1:-1] # [1:-1] removes extra quotes.
fullrewrite = re.sub(string, raw, url)
But this creates double backslashes in the resulting url: https://www.example.com/path?parameter=c:\\windows\\system32\\drivers\\hosts
My question is how am I supposed to have it replace the key word so that the resulting string is: https://www.example.com/path?parameter=c:\windows\system32\drivers\hosts?
This is my understanding; please correct me if I'm wrong.
You don't get double backslashes, but escaped backslashes. In both the re module and Python string literals, a single backslash is a special character: in most cases it does not stand for a literal backslash. To get one literal backslash, you usually need to escape it with another. So a double backslash is the escaped, source-level representation of a single backslash.
If one passes 'c:\\' to print() or saves it to a text file, one gets c:\.
P.S. Since '\q' is not a special sequence in Python, '\q'=='\\q' returns True.
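One way to get the single-backslash result, sketched here with the values from the question: pass a function as the replacement, because `re.sub` inserts a callable's return value verbatim without processing backslash escapes. Using `re.escape` on the search string is an extra assumption here, so that the "." in "filename.txt" is matched literally.

```python
import re

string = "filename.txt"
rewrite = "c:\\windows\\system32\\drivers\\hosts"
url = "https://www.example.com/path?parameter=filename.txt"

# A callable replacement is inserted as-is, so the backslashes in
# `rewrite` are not parsed as regex escape sequences.
fullrewrite = re.sub(re.escape(string), lambda match: rewrite, url)

# For a purely literal search string, str.replace avoids regex entirely.
same_result = url.replace(string, rewrite)

print(fullrewrite)
# https://www.example.com/path?parameter=c:\windows\system32\drivers\hosts
```

If a template string is required instead of a function, doubling the backslashes first with `rewrite.replace('\\', '\\\\')` should give the same result.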
This question already has answers here:
How can I put an actual backslash in a string literal (not use it for an escape sequence)?
(4 answers)
Closed 7 months ago.
I am writing documentation for a Python package with a clear_stop_character function, in which users can provide extra stop characters in a list. In the documentation I have written:
"""
stoplist (list, default empty): Accepts a list of extra stop characters,
should escape special regex characters (e.g., stoplist=['\\*']).
"""
It is crucial for the users to see the double backslash before the stop-char. However, the help() outcome of the built package shows:
"""
stoplist (list, default empty): Accepts a list of extra stop characters,
should escape special regex characters (e.g., stoplist=['\*']).
"""
So, it will be misleading for the users.
BTW, I did not find a solution based on the previous questions.
Any ideas?
In a regular Python string literal, \ is an escape character: it changes how the character after it is interpreted. \\ therefore stands for a single literal backslash, which is why only one backslash shows up in the help() output.
The simplest solution is to write four backslashes: \\\\. Python reads the first pair as one literal backslash and the second pair as another, so the docstring ends up containing \\.
Simply rewrite your code as:
"""
stoplist (list, default empty): Accepts a list of extra stop characters,
should escape special regex characters (e.g., stoplist=['\\\\*']).
"""
This question already has answers here:
How to escape “\” characters in python
(4 answers)
Closed 4 years ago.
I am totally confused by the escape characters in Python. Sometimes I expect it to output a single '\' and it prints '\\'; sometimes I use '\\' for '\' and it works, but other times it doesn't; and so on.
Some examples:
print('\\hello') #output --> \hello
print(['\\hello']) #output --> ['\\hello']
So how should I understand '\hello', as '\hello' or '\\hello'? Can anyone explain the mechanism of escape characters more generally?
Firstly there is the question of getting the right characters into your strings. Then there is the question of how Python displays your string. The same string can be displayed in two different ways.
>>> s = '\\asd'
>>> s
'\\asd'
>>> print(s)
\asd
In this example the string has only one backslash. We use two backslashes to create it, but that results in a string with one backslash. We can see that there is only one when we print the string.
But when we display the string simply by typing s, we see two backslashes. Why is that? In that situation the interpreter shows the repr of the string. That is, it shows us the code that would be needed to make the string: we need quotes and two backslashes in our code to make a string that then has one backslash (as s does).
When you print a list with a string as an element, you see the repr of the string inside the list:
>>> print([s])
['\\asd']
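As a small check of the explanation above, the length of the string confirms that only one backslash is stored, while `repr` (which is also what a list uses for its elements) shows the escaped, source-code form:

```python
s = '\\asd'

print(len(s))    # 4 characters: one backslash plus 'a', 's', 'd'
print(s)         # \asd
print(repr(s))   # '\\asd'
print([s])       # ['\\asd']  (a list shows the repr of each element)
```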
This question already has answers here:
What does the "r" in pythons re.compile(r' pattern flags') mean?
(3 answers)
Closed 5 years ago.
I understand that the 'r' prefix indicates a raw string. So why is the 'r' prefix used in the following example, given that the string contains special regex characters which should not be taken literally?
The 'string' being searched is an nltk Text object, so I suppose it has something to do with this? However, I don't understand how it affects the usage of findall.
moby.findall(r"<a> (<.*>) <man>")
In this particular case, r makes no difference, as this string does not contain any sequences which could be misinterpreted. However, it is a good habit to use r when writing regular expressions, to avoid misinterpretation of sequences like \n or \t; with r, they are treated literally, as two characters - backslash followed by a letter; without r, they evaluate to newline and tab, respectively.
The r preceding the string is a string prefix that marks it as a raw string literal.
For example, '\n' will be treated as a newline character, while r'\n' will be treated as the characters \ followed by n.
But for your regex:
moby.findall(r"<a> (<.*>) <man>")
it doesn't make a difference, but it is always a good idea to write regex patterns as raw strings so that you don't have to double the backslashes.
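A short sketch of where the prefix does and does not matter. It uses the standard `re` module on a plain string rather than the nltk Text object from the question, and the sample text "a man" is made up for illustration:

```python
import re

# No backslashes in this pattern, so the raw and regular literals are identical.
print(r"<a> (<.*>) <man>" == "<a> (<.*>) <man>")  # True

# With a backslash the two literals differ.
print(len('\n'), len(r'\n'))  # 1 2

# '\b' in a regular literal is a backspace character, while r'\b'
# reaches the regex engine as the word-boundary escape.
print(re.findall('\bman\b', 'a man'))   # []
print(re.findall(r'\bman\b', 'a man'))  # ['man']
```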
This question already has answers here:
Why do backslashes appear twice?
(2 answers)
Closed 7 years ago.
Pretty much as the question states: I have code that finds sentences in a big string using the regex findall(). It then uses each sentence later; however, when it does, a backslash appears in front of any apostrophe, for example Today's becomes Today\'s. Why is this happening, and how can I stop it?
It's called escaping. When " or ' appears inside a string literal delimited by that same quote character, it is preceded by \ so that it is not read as the end of the string. I believe there is a method that removes the escape character from a string if that's what you'd like to do.
The backslash denotes a so-called escape sequence, which basically tells Python that this character has to be interpreted differently from a "normal" ' character (which would signal the beginning or end of a string for the interpreter).
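A likely explanation, sketched with a made-up sentence: the backslash is only part of the repr shown when the string is displayed inside a list or at the interactive prompt, and it is not stored in the string itself. Python's repr escapes an apostrophe when the string contains both kinds of quote characters.

```python
# Hypothetical sentence containing both a double quote and an apostrophe.
sentence = 'He said "Today\'s the day."'

print(repr(sentence))    # 'He said "Today\'s the day."'
print(sentence)          # He said "Today's the day."
print('\\' in sentence)  # False: no backslash is stored in the string
```

So nothing needs to be removed: printing or writing the string itself gives the text without the backslash.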