python prefix string with backslash - python

I am looking for a way to prefix strings in python with a single backslash, e.g. "]" -> "]". Since "\" is not a valid string in python, the simple
mystring = '\' + mystring
won't work. What I am currently doing is something like this:
mystring = r'\###' + mystring
mystring.replace('###','')
While this works most of the time, it is not elegant and also can cause problems for strings containing "###" or whatever the "filler" is set to. Is there a bette way of doing this?

You need to escape the backslash with a second one, to make it a literal backslash:
mystring = "\\" + mystring
Otherwise it thinks you're trying to escape the ", which in turn means you have no quote to terminate the string
Ordinarily, you can use raw string notation (r'string'), but that won't work when the backslash is the last character
The difference between print a and just a:
>>> a = 'hello'
>>> a = '\\' + a
>>> a
'\\hello'
>>> print a
\hello

Python strings have a feature called escape characters. These allow you to do special things inside as string, such as showing a quote (" or ') without closing the string you're typing
See this table
So when you typed
mystring = '\' + mystring
the \' is an escaped apostrophe, meaning that your string now has an apostrophe in it, meaning it isn't actually closed, which you can see because the rest of that line is coloured.
To type a backslash, you must escape one, which is done like this:
>>> aBackSlash = '\\'
>>> print(aBackSlash)
\

You should escape the backslash as follows:
mystring = "\\" + mystring
This is because if you do '\' it will end up escaping the second quotation. Therefore to treat the backslash literally, you must escape it.
Examples
>>> s = 'hello'
>>> s = '\\' + s
>>> print
\hello
Your case
>>> mystring = 'it actually does work'
>>> mystring = '\\' + mystring
>>> print mystring
\it actually does work

As a different way of approaching the problem, have you considered string formatting?
r'\%s' % mystring
or:
r'\{}'.format(mystring)

Related

Opening a simple file using OS in python [duplicate]

This question already has answers here:
Why do backslashes appear twice?
(2 answers)
How should I write a Windows path in a Python string literal?
(5 answers)
Closed 4 years ago.
I have a dictionary:
my_dictionary = {"058498":"table", "064165":"pen", "055123":"pencil"}
I iterate over it:
for item in my_dictionary:
PDF = r'C:\Users\user\Desktop\File_%s.pdf' %item
doIt(PDF)
def doIt(PDF):
part = MIMEBase('application', "octet-stream")
part.set_payload( open(PDF,"rb").read() )
But I get this error:
IOError: [Errno 2] No such file or directory: 'C:\\Users\\user\\Desktop\\File_055123.pdf'
It can't find my file. Why does it think there are double backslashes in file path?
The double backslash is not wrong, python represents it way that to the user. In each double backslash \\, the first one escapes the second to imply an actual backslash. If a = r'raw s\tring' and b = 'raw s\\tring' (no 'r' and explicit double slash) then they are both represented as 'raw s\\tring'.
>>> a = r'raw s\tring'
>>> b = 'raw s\\tring'
>>> a
'raw s\\tring'
>>> b
'raw s\\tring'
For clarification, when you print the string, you'd see it as it would get used, like in a path - with just one backslash:
>>> print(a)
raw s\tring
>>> print(b)
raw s\tring
And in this printed string case, the \t doesn't imply a tab, it's a backslash \ followed by the letter 't'.
Otherwise, a string with no 'r' prefix and a single backslash would escape the character after it, making it evaluate the 't' following it == tab:
>>> t = 'not raw s\tring' # here '\t' = tab
>>> t
'not raw s\tring'
>>> print(t) # will print a tab (and no letter 't' in 's\tring')
not raw s ring
So in the PDF path+name:
>>> item = 'xyz'
>>> PDF = r'C:\Users\user\Desktop\File_%s.pdf' % item
>>> PDF # the representation of the string, also in error messages
'C:\\Users\\user\\Desktop\\File_xyz.pdf'
>>> print(PDF) # "as used"
C:\Users\user\Desktop\File_xyz.pdf
More info about escape sequences in the table here. Also see __str__ vs __repr__.
Double backslashes are due to r, raw string:
r'C:\Users\user\Desktop\File_%s.pdf' ,
It is used because the \ might escape some of the characters.
>>> strs = "c:\desktop\notebook"
>>> print strs #here print thinks that \n in \notebook is the newline char
c:\desktop
otebook
>>> strs = r"c:\desktop\notebook" #using r'' escapes the \
>>> print strs
c:\desktop\notebook
>>> print repr(strs) #actual content of strs
'c:\\desktop\\notebook'
save yourself from getting a headache you can use other slashes as well.
if you know what I saying. the opposite looking slashes.
you're using now
PDF = 'C:\Users\user\Desktop\File_%s.pdf' %item
try to use
**
PDF = 'C:/Users/user/Desktop/File_%s.pdf' %item
**
it won't be treated as a escaping character .
It doesn't. Double backslash is just the way of the computer of saying backslash. Yes, I know this sounds weird, but think of it this way - in order to represent special characters, backslash was chosen as an escaping character (e.g. \n means newline, and not the backslash character followed by the n character). But what happens if you actually want to print (or use) a backslash (possibly followed by more characters), but you don't want the computer to treat it as an escaping character? In that case we escape the backslash itself, meaning we use a double backslash so the computer will understand it's a single backslash.
It's done automatically in your case because of the r you added before the string.
alwbtc #
I dare say: "I found the bug..."
replace
PDF = r'C:\Users\user\Desktop\File_%s.pdf' %item
doIt(PDF)`
with
for item in my_dictionary:
PDF = r'C:\Users\user\Desktop\File_%s.pdf' % mydictionary[item]
doIt(PDF)`
in fact you were really looking for File_pencil.pdf (not File_055123.pdf).
You were sliding the index dictionary not its contents.
This forum topic maybe a side-effect.

Python / Remove special character from string

I'm writing server side in python.
I noticed that the client sent me one of the parameter like this:
"↵ tryit1.tar↵ "
I want to get rid of spaces (and for that I use the replace command), but I also want to get rid of the special character: "↵".
How can I get rid of this character (and other weird characters, which are not -,_,*,.) using python command?
A regex would be good here:
re.sub('[^a-zA-Z0-9-_*.]', '', my_string)
>>> import string
>>> my_string = "↵ tryit1.tar↵ "
>>> acceptable_characters = string.letters + string.digits + "-_*."
>>> filter(lambda c: c in acceptable_characters, my_string)
'tryit1.tar'
I would use a regex like this:
import re
string = "↵ tryit1.tar↵ "
print re.sub(r'[^\w.]', '', string) # tryit1.tar

How to get rid of double backslash in python windows file path string? [duplicate]

This question already has answers here:
Why do backslashes appear twice?
(2 answers)
How should I write a Windows path in a Python string literal?
(5 answers)
Closed 4 years ago.
I have a dictionary:
my_dictionary = {"058498":"table", "064165":"pen", "055123":"pencil"}
I iterate over it:
for item in my_dictionary:
PDF = r'C:\Users\user\Desktop\File_%s.pdf' %item
doIt(PDF)
def doIt(PDF):
part = MIMEBase('application', "octet-stream")
part.set_payload( open(PDF,"rb").read() )
But I get this error:
IOError: [Errno 2] No such file or directory: 'C:\\Users\\user\\Desktop\\File_055123.pdf'
It can't find my file. Why does it think there are double backslashes in file path?
The double backslash is not wrong, python represents it way that to the user. In each double backslash \\, the first one escapes the second to imply an actual backslash. If a = r'raw s\tring' and b = 'raw s\\tring' (no 'r' and explicit double slash) then they are both represented as 'raw s\\tring'.
>>> a = r'raw s\tring'
>>> b = 'raw s\\tring'
>>> a
'raw s\\tring'
>>> b
'raw s\\tring'
For clarification, when you print the string, you'd see it as it would get used, like in a path - with just one backslash:
>>> print(a)
raw s\tring
>>> print(b)
raw s\tring
And in this printed string case, the \t doesn't imply a tab, it's a backslash \ followed by the letter 't'.
Otherwise, a string with no 'r' prefix and a single backslash would escape the character after it, making it evaluate the 't' following it == tab:
>>> t = 'not raw s\tring' # here '\t' = tab
>>> t
'not raw s\tring'
>>> print(t) # will print a tab (and no letter 't' in 's\tring')
not raw s ring
So in the PDF path+name:
>>> item = 'xyz'
>>> PDF = r'C:\Users\user\Desktop\File_%s.pdf' % item
>>> PDF # the representation of the string, also in error messages
'C:\\Users\\user\\Desktop\\File_xyz.pdf'
>>> print(PDF) # "as used"
C:\Users\user\Desktop\File_xyz.pdf
More info about escape sequences in the table here. Also see __str__ vs __repr__.
Double backslashes are due to r, raw string:
r'C:\Users\user\Desktop\File_%s.pdf' ,
It is used because the \ might escape some of the characters.
>>> strs = "c:\desktop\notebook"
>>> print strs #here print thinks that \n in \notebook is the newline char
c:\desktop
otebook
>>> strs = r"c:\desktop\notebook" #using r'' escapes the \
>>> print strs
c:\desktop\notebook
>>> print repr(strs) #actual content of strs
'c:\\desktop\\notebook'
save yourself from getting a headache you can use other slashes as well.
if you know what I saying. the opposite looking slashes.
you're using now
PDF = 'C:\Users\user\Desktop\File_%s.pdf' %item
try to use
**
PDF = 'C:/Users/user/Desktop/File_%s.pdf' %item
**
it won't be treated as a escaping character .
It doesn't. Double backslash is just the way of the computer of saying backslash. Yes, I know this sounds weird, but think of it this way - in order to represent special characters, backslash was chosen as an escaping character (e.g. \n means newline, and not the backslash character followed by the n character). But what happens if you actually want to print (or use) a backslash (possibly followed by more characters), but you don't want the computer to treat it as an escaping character? In that case we escape the backslash itself, meaning we use a double backslash so the computer will understand it's a single backslash.
It's done automatically in your case because of the r you added before the string.
alwbtc #
I dare say: "I found the bug..."
replace
PDF = r'C:\Users\user\Desktop\File_%s.pdf' %item
doIt(PDF)`
with
for item in my_dictionary:
PDF = r'C:\Users\user\Desktop\File_%s.pdf' % mydictionary[item]
doIt(PDF)`
in fact you were really looking for File_pencil.pdf (not File_055123.pdf).
You were sliding the index dictionary not its contents.
This forum topic maybe a side-effect.

Python Replace \\ with \ [duplicate]

This question already has answers here:
Process escape sequences in a string in Python
(8 answers)
Closed 7 months ago.
So I can't seem to figure this out... I have a string say, "a\\nb" and I want this to become "a\nb". I've tried all the following and none seem to work;
>>> a
'a\\nb'
>>> a.replace("\\","\")
File "<stdin>", line 1
a.replace("\\","\")
^
SyntaxError: EOL while scanning string literal
>>> a.replace("\\",r"\")
File "<stdin>", line 1
a.replace("\\",r"\")
^
SyntaxError: EOL while scanning string literal
>>> a.replace("\\",r"\\")
'a\\\\nb'
>>> a.replace("\\","\\")
'a\\nb'
I really don't understand why the last one works, because this works fine:
>>> a.replace("\\","%")
'a%nb'
Is there something I'm missing here?
EDIT I understand that \ is an escape character. What I'm trying to do here is turn all \\n \\t etc. into \n \t etc. and replace doesn't seem to be working the way I imagined it would.
>>> a = "a\\nb"
>>> b = "a\nb"
>>> print a
a\nb
>>> print b
a
b
>>> a.replace("\\","\\")
'a\\nb'
>>> a.replace("\\\\","\\")
'a\\nb'
I want string a to look like string b. But replace isn't replacing slashes like I thought it would.
There's no need to use replace for this.
What you have is a encoded string (using the string_escape encoding) and you want to decode it:
>>> s = r"Escaped\nNewline"
>>> print s
Escaped\nNewline
>>> s.decode('string_escape')
'Escaped\nNewline'
>>> print s.decode('string_escape')
Escaped
Newline
>>> "a\\nb".decode('string_escape')
'a\nb'
In Python 3:
>>> import codecs
>>> codecs.decode('\\n\\x21', 'unicode_escape')
'\n!'
You are missing, that \ is the escape character.
Look here: http://docs.python.org/reference/lexical_analysis.html
at 2.4.1 "Escape Sequence"
Most importantly \n is a newline character.
And \\ is an escaped escape character :D
>>> a = 'a\\\\nb'
>>> a
'a\\\\nb'
>>> print a
a\\nb
>>> a.replace('\\\\', '\\')
'a\\nb'
>>> print a.replace('\\\\', '\\')
a\nb
r'a\\nb'.replace('\\\\', '\\')
or
'a\nb'.replace('\n', '\\n')
Your original string, a = 'a\\nb' does not actually have two '\' characters, the first one is an escape for the latter. If you do, print a, you'll see that you actually have only one '\' character.
>>> a = 'a\\nb'
>>> print a
a\nb
If, however, what you mean is to interpret the '\n' as a newline character, without escaping the slash, then:
>>> b = a.replace('\\n', '\n')
>>> b
'a\nb'
>>> print b
a
b
It's because, even in "raw" strings (=strings with an r before the starting quote(s)), an unescaped escape character cannot be the last character in the string. This should work instead:
'\\ '[0]
In Python string literals, backslash is an escape character. This is also true when the interactive prompt shows you the value of a string. It will give you the literal code representation of the string. Use the print statement to see what the string actually looks like.
This example shows the difference:
>>> '\\'
'\\'
>>> print '\\'
\
In Python 3 it will be:
bytes(s, 'utf-8').decode("unicode_escape")
This works on Windows with Python 3.x:
import os
str(filepath).replace(os.path.sep, '/')
Where: os.path.sep is \ on Windows and / on Linux.
Case study
Used this to prevent errors when generating a Markdown file then rendering it to pdf.
path = "C:\\Users\\Programming\\Downloads"
# Replace \\ with a \ along with any random key multiple times
path.replace('\\', '\pppyyyttthhhooonnn')
# Now replace pppyyyttthhhooonnn with a blank string
path.replace("pppyyyttthhhooonnn", "")
print(path)
#Output...
C:\Users\Programming\Downloads

How can I strip first and last double quotes?

I want to strip double quotes from:
string = '"" " " ""\\1" " "" ""'
to obtain:
string = '" " " ""\\1" " "" "'
I tried to use rstrip, lstrip and strip('[^\"]|[\"$]') but it did not work.
How can I do this?
If the quotes you want to strip are always going to be "first and last" as you said, then you could simply use:
string = string[1:-1]
If you can't assume that all the strings you process have double quotes you can use something like this:
if string.startswith('"') and string.endswith('"'):
string = string[1:-1]
Edit:
I'm sure that you just used string as the variable name for exemplification here and in your real code it has a useful name, but I feel obliged to warn you that there is a module named string in the standard libraries. It's not loaded automatically, but if you ever use import string make sure your variable doesn't eclipse it.
IMPORTANT: I'm extending the question/answer to strip either single or double quotes. And I interpret the question to mean that BOTH quotes must be present, and matching, to perform the strip. Otherwise, the string is returned unchanged.
To "dequote" a string representation, that might have either single or double quotes around it (this is an extension of #tgray's answer):
def dequote(s):
"""
If a string has single or double quotes around it, remove them.
Make sure the pair of quotes match.
If a matching pair of quotes is not found,
or there are less than 2 characters, return the string unchanged.
"""
if (len(s) >= 2 and s[0] == s[-1]) and s.startswith(("'", '"')):
return s[1:-1]
return s
Explanation:
startswith can take a tuple, to match any of several alternatives. The reason for the DOUBLED parentheses (( and )) is so that we pass ONE parameter ("'", '"') to startswith(), to specify the permitted prefixes, rather than TWO parameters "'" and '"', which would be interpreted as a prefix and an (invalid) start position.
s[-1] is the last character in the string.
Testing:
print( dequote("\"he\"l'lo\"") )
print( dequote("'he\"l'lo'") )
print( dequote("he\"l'lo") )
print( dequote("'he\"l'lo\"") )
=>
he"l'lo
he"l'lo
he"l'lo
'he"l'lo"
(For me, regex expressions are non-obvious to read, so I didn't try to extend #Alex's answer.)
To remove the first and last characters, and in each case do the removal only if the character in question is a double quote:
import re
s = re.sub(r'^"|"$', '', s)
Note that the RE pattern is different than the one you had given, and the operation is sub ("substitute") with an empty replacement string (strip is a string method but does something pretty different from your requirements, as other answers have indicated).
If string is always as you show:
string[1:-1]
Almost done. Quoting from http://docs.python.org/library/stdtypes.html?highlight=strip#str.strip
The chars argument is a string
specifying the set of characters to be
removed.
[...]
The chars argument is not a prefix or
suffix; rather, all combinations of
its values are stripped:
So the argument is not a regexp.
>>> string = '"" " " ""\\1" " "" ""'
>>> string.strip('"')
' " " ""\\1" " "" '
>>>
Note, that this is not exactly what you requested, because it eats multiple quotes from both end of the string!
Remove a determinated string from start and end from a string.
s = '""Hello World""'
s.strip('""')
> 'Hello World'
Starting in Python 3.9, you can use removeprefix and removesuffix:
'"" " " ""\\1" " "" ""'.removeprefix('"').removesuffix('"')
# '" " " ""\\1" " "" "'
If you are sure there is a " at the beginning and at the end, which you want to remove, just do:
string = string[1:len(string)-1]
or
string = string[1:-1]
I have some code that needs to strip single or double quotes, and I can't simply ast.literal_eval it.
if len(arg) > 1 and arg[0] in ('"\'') and arg[-1] == arg[0]:
arg = arg[1:-1]
This is similar to ToolmakerSteve's answer, but it allows 0 length strings, and doesn't turn the single character " into an empty string.
in your example you could use strip but you have to provide the space
string = '"" " " ""\\1" " "" ""'
string.strip('" ') # output '\\1'
note the \' in the output is the standard python quotes for string output
the value of your variable is '\\1'
Below function will strip the empty spces and return the strings without quotes. If there are no quotes then it will return same string(stripped)
def removeQuote(str):
str = str.strip()
if re.search("^[\'\"].*[\'\"]$",str):
str = str[1:-1]
print("Removed Quotes",str)
else:
print("Same String",str)
return str
find the position of the first and the last " in your string
>>> s = '"" " " ""\\1" " "" ""'
>>> l = s.find('"')
>>> r = s.rfind('"')
>>> s[l+1:r]
'" " " ""\\1" " "" "'

Categories