(Python) Print all '\' escape characters - python

I have several strings containing \. I want the backslashes printed literally rather than interpreted as escape characters, so I tried the replace function like this:
string.replace("\\", "\\\\")
It does not work.
When I use string.replace("\n", "\\n") for \n directly it works.
Is there an easy working solution for that?

Note that replace does not modify the variable in place; it returns a copy with the replacement applied (see the docs):
this = "string\\string"
this.replace("\\", "")
print(this)
> string\string
If you want the original string to be replaced, assign the result back to the variable:
this = this.replace("\\", "")
However, converting \n to n requires a different approach, i.e. replace('\n', 'n'), because \n is a single character (a newline), not a backslash followed by n.
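A minimal sketch of the two points above (the variable names are only illustrative): replace returns a new string, and "\n" is matched as one character.

```python
# str.replace never changes the string it is called on.
original = "string\\string"              # contains ONE backslash
original.replace("\\", "")               # result is thrown away
print(original)                          # string\string

cleaned = original.replace("\\", "")     # keep the result by assigning it
print(cleaned)                           # stringstring

# "\n" is a single newline character, so it is matched as one unit.
two_lines = "line1\nline2"
visible = two_lines.replace("\n", "\\n")
print(visible)                           # line1\nline2
```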

You can try this:
def test():
    foo = r"this is a \\ and this is a \ "
    foo = foo.replace("\\", r'\\')  # added the r to ignore string escapes as escapes
    print(foo)

test()
output:
this is a \\\\ and this is a \\

Use raw strings.
r"This \ string \ includes '\'s"
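As a small illustration (the variable names are made up here), a raw string literal is just another way of writing the same text; the resulting string object is identical to one written with doubled backslashes. One caveat worth knowing: a raw string literal cannot end in an odd number of backslashes.

```python
raw = r"This \ string \ includes '\'s"
plain = "This \\ string \\ includes '\\'s"

print(raw == plain)          # True: same characters, different spelling
print(raw.count("\\"))       # 3 backslashes in the text

# Doubling every backslash, e.g. before writing the text somewhere
# that will interpret escapes again.
doubled = raw.replace("\\", "\\\\")
print(doubled.count("\\"))   # 6
```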

Related

How to properly unescape select sequences in python

I'm escaping certain characters in strings (e.g., \n, \\) with double backslashes, like this: text.replace("\\", "\\\\").replace("\n", "\\n")
Naïvely, I tried to unescape using: text.replace("\\n", "\n").replace("\\\\", "\\")
However, this fails on strings like:
>>> text = "\\\n\\n"
>>> print(text)
\
\n
>>> etext = text.replace("\\", "\\\\").replace("\n", "\\n")
>>> print(etext)
\\\n\\n
>>> ftext = etext.replace("\\n", "\n").replace("\\\\", "\\")
>>> print(ftext)
\
\
>>>
As you can see the original string doesn't survive the round trip.
Even changing the order of replaces around would not solve the issue.
The only way to correctly unescape is to do the replacements in one go.
Python's str has maketrans and translate to achieve a similar effect
but they only work on single characters as keys.
re.sub also does not work since the substitution would need to distinguish the case somehow. (\1 does not work since if the second character is n we want the newline character as output instead of n)
A correct (but slow) solution would be:
def unescape(text: str) -> str:
    res: list[str] = []
    in_escape = False
    for c in text:
        if in_escape:
            in_escape = False
            if c == "\\":
                res.append("\\")
                continue
            if c == "n":
                res.append("\n")
                continue
        if c == "\\":
            in_escape = True
            continue
        res.append(c)
    return "".join(res)
>>> text = "\\\n\\n"
>>> print(text)
\
\n
>>> etext = text.replace("\\", "\\\\").replace("\n", "\\n")
>>> print(etext)
\\\n\\n
>>> print(unescape(etext))
\
\n
>>>
Is there a proper/canonical/fast way of escaping (only certain sequences in) strings?
(EDIT: To answer why a subset of escapes is preferred: in my case other escapes are not needed, and it's easy to permanently corrupt your data by escaping things that don't need to be escaped. For example, off the top of my head I can think of three different escape functions in Python alone that all escape completely different subsets of characters. Even the str.escape function changes what it escapes between Python versions. Most of the time an unescape function can handle a wider set of escape sequences than its corresponding escape function, but this is not always the case. None of this even takes into account trying to load the escaped data in a different language.)
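One possible single-pass approach, offered as a sketch rather than a canonical answer: re.sub accepts a function as the replacement, so each matched escape sequence can be looked up in a table. The names _UNESCAPE, _PATTERN and escape are assumptions made for this example. Unlike the loop above, this version leaves a backslash followed by any other character untouched instead of dropping the backslash.

```python
import re

# Hypothetical table: only these two escape sequences are recognised.
_UNESCAPE = {"\\\\": "\\", "\\n": "\n"}
# A backslash followed by either a backslash or the letter n.
_PATTERN = re.compile(r"\\[\\n]")


def escape(text: str) -> str:
    # Backslashes must be doubled first, then newlines made visible.
    return text.replace("\\", "\\\\").replace("\n", "\\n")


def unescape(text: str) -> str:
    # re.sub scans left to right without overlapping matches, so an
    # escaped backslash is consumed before a following "n" is examined.
    return _PATTERN.sub(lambda m: _UNESCAPE[m.group(0)], text)


text = "\\\n\\n"
print(unescape(escape(text)) == text)   # True
```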

Python prevent decoding HEX to ASCII while removing backslashes from my Var

I want to strip some unwanted symbols from my variable. In this case the symbols are backslashes. I am using a HEX number, and as an example I will show some short, simple code below. But I don't want Python to convert my HEX to ASCII; how would I prevent this from happening? I have some long shellcode for asm to work with later, and removing \ by hand is a long process. I know there are different ways, like using echo -e "x\x\x\x" > output etc., but my whole script will be written in Python.
Thanks
>>> a = "\x31\xC0\x50\x68\x74\x76"
>>> b = a.strip("\\")
>>> print b
1�Phtv
>>> a = "\x31\x32\x33\x34\x35\x36"
>>> b = a.strip("\\")
>>> print b
123456
At the end I would like it to print my var:
>>> print b
x31x32x33x34x35x36
There are no backslashes in your variable:
>>> a = "\x31\xC0\x50\x68\x74\x76"
>>> print(a)
1ÀPhtv
Take newline for example: writing "\n" in Python gives you a string with one character (a newline) and no backslashes. See the string literals docs for the full syntax.
Now, if you really want to write a string containing such backslashes, you can do it with the r prefix:
>>> a = r"\x31\xC0\x50\x68\x74\x76"
>>> print(a)
\x31\xC0\x50\x68\x74\x76
>>> print(a.replace('\\', ''))
x31xC0x50x68x74x76
But if you want to convert a regular string to hex-coded symbols, you can do it character by character, converting it to number ("\x31" == "1" --> 49), then to hex ("0x31"), and finally stripping the first character:
>>> a = "\x31\xC0\x50\x68\x74\x76"
>>> print(''.join([hex(ord(x))[1:] for x in a]))
x31xc0x50x68x74x76
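One thing to watch with hex(ord(x))[1:] is that hex does not zero-pad, so a character such as "\x05" comes out as x5. A hedged variant using a format spec keeps two digits per character; the helper name to_hex_text is made up for this sketch.

```python
def to_hex_text(data: str) -> str:
    # "{:02x}" pads each code point to at least two lowercase hex digits.
    return "".join("x{:02x}".format(ord(ch)) for ch in data)


a = "\x31\xC0\x50\x68\x74\x76"
print(to_hex_text(a))          # x31xc0x50x68x74x76
print(hex(ord("\x05"))[1:])    # x5  (no padding)
print(to_hex_text("\x05"))     # x05
```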
There are two problems in your code.
First, the simple one:
strip() only removes characters from the beginning and end of the string, never from the middle. So you should use replace("\\", ""). This replaces every backslash with "", which is the same as removing it.
The second problem is Python's handling of backslashes:
To get your example working you need to put an 'r' in front of your string to indicate that it is a raw string: a = r"\x31\xC0\x50\x68\x74\x76". In a raw string, a backslash doesn't escape a character but just stays a backslash.
>>> r"\x31\xC0\x50\x68\x74\x76"
'\\x31\\xC0\\x50\\x68\\x74\\x76'
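To make the strip versus replace difference concrete, here is a short sketch with an illustrative string that really does contain backslashes:

```python
s = "\\x31\\x32\\"              # the text is: \x31\x32\

# strip only trims matching characters from both ends.
print(s.strip("\\"))            # x31\x32

# replace removes every occurrence, wherever it is.
print(s.replace("\\", ""))      # x31x32
```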

Python replace backward (\) with forward (/)

I am trying to replace \ with /. However, I'm having no success.
Following is the snapshot of the scenario that I am trying to achieve
string = "//SQL-SERVER/Lacie/City of X/Linservo\171002"
print string.replace("\\","/")
Output:
//SQL-SERVER/Lacie/City of X/Linservoy002
Desired output:
//SQL-SERVER/Lacie/City of X/Linservo/171002
You need to escape "\" with an extra "\".
>>> string = "//SQL-SERVER/Lacie/City of X/Linservo\\171002"
>>> string
'//SQL-SERVER/Lacie/City of X/Linservo\\171002'
>>> print string.replace("\\","/")
//SQL-SERVER/Lacie/City of X/Linservo/171002
string = r"//SQL-SERVER/Lacie/City of X/Linservo\171002"
print string.replace("\\","/")
output
//SQL-SERVER/Lacie/City of X/Linservo/171002
You have errors both in the replace call and in the string definition.
In your string definition, \171 gives the character with octal value 171, which is y.
In your replace call, the backslash escapes the quote.
You should escape backslashes:
string = "//SQL-SERVER/Lacie/City of X/Linservo\\171002"
string.replace("\\","/")
You can simply use .replace in Python, or, if you prefer, a regex:
import re
string = r"//SQL-SERVER/Lacie/City of X/Linservo\171002"
pattern=r'[\\]'
replaced_string=re.sub(pattern,"/",string)
print(replaced_string)
Your original question shows "X/Linservo\171002". Here \171 is an octal character escape, so Python turns \171 into "y". You can try this in the Python interpreter:
In[2]: print("\171")
y
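A short sketch of what happens, using a shortened version of the path for illustration: in a normal literal the octal escape has already consumed the backslash before replace ever runs, whereas a raw literal keeps it.

```python
print("\171" == "y" == chr(0o171))   # True: octal 171 is the letter y

broken = "Linservo\171002"           # the escape is resolved by the parser
print(broken)                        # Linservoy002
print("\\" in broken)                # False: nothing left to replace

fixed = r"Linservo\171002"           # raw literal keeps the backslash
print(fixed.replace("\\", "/"))      # Linservo/171002
```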

Removing the single quotes after using re.sub() in python

After replacing all word characters in a string with the character '^', using re.sub("\w", "^" , stringorphrase) I'm left with :
>>> '^^^ ^^ ^^^^'
Is there any way to remove the single quotes so it looks cleaner?
>>> ^^^ ^^ ^^^^
Are you sure that's not just how it's displayed in the interactive prompt (and there aren't actually apostrophes in your string)?
If the ' is actually part of the string, and is first/last then either:
string = string.strip("'")
or:
string = string[1:-1] # lop ending characters off
Use the print statement. The quotes aren't actually part of the string.
To remove all occurrences of single quotes:
mystr = some_string_with_single_quotes
answer = mystr.replace("'", '')
To remove single quotes ONLY at the ends of the string:
mystr = some_string_with_single_quotes
answer = mystr.strip("'")
Hope this helps
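For reference, a small sketch of why the quotes appear at the prompt (the input phrase is invented for the example): the interactive interpreter echoes the repr of a value, which adds quotes, while print shows the string itself.

```python
import re

masked = re.sub(r"\w", "^", "the is that")   # raw string for the pattern

print(repr(masked))    # '^^^ ^^ ^^^^'  (what the prompt echoes)
print(masked)          # ^^^ ^^ ^^^^
print("'" in masked)   # False: the quotes are not part of the string
```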

Escape string and split it right after

I have the following code:
import re
key = re.escape('#one #two #some #tests #are #done')
print(key)
key = key.split()
print(key)
and the following output:
\#one\ \#two\ \#some\ \#tests\ \#are\ \#done
['\\#one\\', '\\#two\\', '\\#some\\', '\\#tests\\', '\\#are\\', '\\#done']
How come the backslashes are duplicated? I just want them once in my list, because I would like to use this list in a regular expression.
Thanks in advance! John
There is only one backslash each, but when printing the repr of the strings, they are duplicated (escaped) - just as you would need to duplicate them when using a string to build a regex. So everything is fine.
For example:
>>> len("\\")
1
>>> len("\\n")
2
>>> len("\n")
1
>>> print "\\n"
\n
>>> print "\n"
>>>
The \ character is an escape character, that is a character that changes the meaning of the subsequent character[s]. For example the "n" character is simply an "n". But if you escape it like "\n" it becomes the "newline" character. So, if you need to use a \ literal, you need to escape it with... itself: \\
The backslashes are not duplicated. To realize this, try to do:
for element in key:
    print element
And you will see this output:
\#one\
\#two\
\#some\
\#tests\
\#are\
\#done
When you printed the whole list, Python used the representation in which strings are shown not as they are but as Python expressions (notice the quotes; they are not part of the strings).
To write a string literal containing a backslash, you need to double that backslash. That is all.
When you convert a list to a string (e.g. to print it), it calls repr on each object contained in the list. That's why you get the quotes and extra backslashes in your second line of output. Try this:
s = "\\a string with an escaped backslash"
print s # prints: \a string with an escaped backslash
print repr(s) # prints: '\\a string with an escaped backslash'
The repr call puts quotes around the string, and shows the backslash escapes.
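The same point can be checked by measuring the strings, as in the sketch below. It also shows one possible way to build the regular expression: escape each token after splitting, so no stray trailing backslashes end up in the list. The sample strings are illustrative.

```python
import re

items = ["\\#one\\", "\\#two\\"]    # each item: backslash, #, word, backslash
print(items)                        # list display uses repr: backslashes look doubled
for item in items:
    print(item, len(item))          # \#one\ 6, then \#two\ 6

# Split first, then escape each token and join into an alternation.
tags = "#one #two #some"
pattern = "|".join(re.escape(tag) for tag in tags.split())
print(re.findall(pattern, "x #two y #one"))   # ['#two', '#one']
```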
