This question already has answers here:
Why do backslashes appear twice?
(2 answers)
How should I write a Windows path in a Python string literal?
(5 answers)
Closed 4 years ago.
I have a dictionary:
my_dictionary = {"058498":"table", "064165":"pen", "055123":"pencil"}
I iterate over it:
for item in my_dictionary:
PDF = r'C:\Users\user\Desktop\File_%s.pdf' %item
doIt(PDF)
def doIt(PDF):
part = MIMEBase('application', "octet-stream")
part.set_payload( open(PDF,"rb").read() )
But I get this error:
IOError: [Errno 2] No such file or directory: 'C:\\Users\\user\\Desktop\\File_055123.pdf'
It can't find my file. Why does it think there are double backslashes in file path?
The double backslash is not wrong, python represents it way that to the user. In each double backslash \\, the first one escapes the second to imply an actual backslash. If a = r'raw s\tring' and b = 'raw s\\tring' (no 'r' and explicit double slash) then they are both represented as 'raw s\\tring'.
>>> a = r'raw s\tring'
>>> b = 'raw s\\tring'
>>> a
'raw s\\tring'
>>> b
'raw s\\tring'
For clarification, when you print the string, you'd see it as it would get used, like in a path - with just one backslash:
>>> print(a)
raw s\tring
>>> print(b)
raw s\tring
And in this printed string case, the \t doesn't imply a tab, it's a backslash \ followed by the letter 't'.
Otherwise, a string with no 'r' prefix and a single backslash would escape the character after it, making it evaluate the 't' following it == tab:
>>> t = 'not raw s\tring' # here '\t' = tab
>>> t
'not raw s\tring'
>>> print(t) # will print a tab (and no letter 't' in 's\tring')
not raw s ring
So in the PDF path+name:
>>> item = 'xyz'
>>> PDF = r'C:\Users\user\Desktop\File_%s.pdf' % item
>>> PDF # the representation of the string, also in error messages
'C:\\Users\\user\\Desktop\\File_xyz.pdf'
>>> print(PDF) # "as used"
C:\Users\user\Desktop\File_xyz.pdf
More info about escape sequences in the table here. Also see __str__ vs __repr__.
Double backslashes are due to r, raw string:
r'C:\Users\user\Desktop\File_%s.pdf' ,
It is used because the \ might escape some of the characters.
>>> strs = "c:\desktop\notebook"
>>> print strs #here print thinks that \n in \notebook is the newline char
c:\desktop
otebook
>>> strs = r"c:\desktop\notebook" #using r'' escapes the \
>>> print strs
c:\desktop\notebook
>>> print repr(strs) #actual content of strs
'c:\\desktop\\notebook'
save yourself from getting a headache you can use other slashes as well.
if you know what I saying. the opposite looking slashes.
you're using now
PDF = 'C:\Users\user\Desktop\File_%s.pdf' %item
try to use
**
PDF = 'C:/Users/user/Desktop/File_%s.pdf' %item
**
it won't be treated as a escaping character .
It doesn't. Double backslash is just the way of the computer of saying backslash. Yes, I know this sounds weird, but think of it this way - in order to represent special characters, backslash was chosen as an escaping character (e.g. \n means newline, and not the backslash character followed by the n character). But what happens if you actually want to print (or use) a backslash (possibly followed by more characters), but you don't want the computer to treat it as an escaping character? In that case we escape the backslash itself, meaning we use a double backslash so the computer will understand it's a single backslash.
It's done automatically in your case because of the r you added before the string.
alwbtc #
I dare say: "I found the bug..."
replace
PDF = r'C:\Users\user\Desktop\File_%s.pdf' %item
doIt(PDF)`
with
for item in my_dictionary:
PDF = r'C:\Users\user\Desktop\File_%s.pdf' % mydictionary[item]
doIt(PDF)`
in fact you were really looking for File_pencil.pdf (not File_055123.pdf).
You were sliding the index dictionary not its contents.
This forum topic maybe a side-effect.
Related
This question already has answers here:
Why do backslashes appear twice?
(2 answers)
How should I write a Windows path in a Python string literal?
(5 answers)
Closed 4 years ago.
I have a dictionary:
my_dictionary = {"058498":"table", "064165":"pen", "055123":"pencil"}
I iterate over it:
for item in my_dictionary:
PDF = r'C:\Users\user\Desktop\File_%s.pdf' %item
doIt(PDF)
def doIt(PDF):
part = MIMEBase('application', "octet-stream")
part.set_payload( open(PDF,"rb").read() )
But I get this error:
IOError: [Errno 2] No such file or directory: 'C:\\Users\\user\\Desktop\\File_055123.pdf'
It can't find my file. Why does it think there are double backslashes in file path?
The double backslash is not wrong, python represents it way that to the user. In each double backslash \\, the first one escapes the second to imply an actual backslash. If a = r'raw s\tring' and b = 'raw s\\tring' (no 'r' and explicit double slash) then they are both represented as 'raw s\\tring'.
>>> a = r'raw s\tring'
>>> b = 'raw s\\tring'
>>> a
'raw s\\tring'
>>> b
'raw s\\tring'
For clarification, when you print the string, you'd see it as it would get used, like in a path - with just one backslash:
>>> print(a)
raw s\tring
>>> print(b)
raw s\tring
And in this printed string case, the \t doesn't imply a tab, it's a backslash \ followed by the letter 't'.
Otherwise, a string with no 'r' prefix and a single backslash would escape the character after it, making it evaluate the 't' following it == tab:
>>> t = 'not raw s\tring' # here '\t' = tab
>>> t
'not raw s\tring'
>>> print(t) # will print a tab (and no letter 't' in 's\tring')
not raw s ring
So in the PDF path+name:
>>> item = 'xyz'
>>> PDF = r'C:\Users\user\Desktop\File_%s.pdf' % item
>>> PDF # the representation of the string, also in error messages
'C:\\Users\\user\\Desktop\\File_xyz.pdf'
>>> print(PDF) # "as used"
C:\Users\user\Desktop\File_xyz.pdf
More info about escape sequences in the table here. Also see __str__ vs __repr__.
Double backslashes are due to r, raw string:
r'C:\Users\user\Desktop\File_%s.pdf' ,
It is used because the \ might escape some of the characters.
>>> strs = "c:\desktop\notebook"
>>> print strs #here print thinks that \n in \notebook is the newline char
c:\desktop
otebook
>>> strs = r"c:\desktop\notebook" #using r'' escapes the \
>>> print strs
c:\desktop\notebook
>>> print repr(strs) #actual content of strs
'c:\\desktop\\notebook'
save yourself from getting a headache you can use other slashes as well.
if you know what I saying. the opposite looking slashes.
you're using now
PDF = 'C:\Users\user\Desktop\File_%s.pdf' %item
try to use
**
PDF = 'C:/Users/user/Desktop/File_%s.pdf' %item
**
it won't be treated as a escaping character .
It doesn't. Double backslash is just the way of the computer of saying backslash. Yes, I know this sounds weird, but think of it this way - in order to represent special characters, backslash was chosen as an escaping character (e.g. \n means newline, and not the backslash character followed by the n character). But what happens if you actually want to print (or use) a backslash (possibly followed by more characters), but you don't want the computer to treat it as an escaping character? In that case we escape the backslash itself, meaning we use a double backslash so the computer will understand it's a single backslash.
It's done automatically in your case because of the r you added before the string.
alwbtc #
I dare say: "I found the bug..."
replace
PDF = r'C:\Users\user\Desktop\File_%s.pdf' %item
doIt(PDF)`
with
for item in my_dictionary:
PDF = r'C:\Users\user\Desktop\File_%s.pdf' % mydictionary[item]
doIt(PDF)`
in fact you were really looking for File_pencil.pdf (not File_055123.pdf).
You were sliding the index dictionary not its contents.
This forum topic maybe a side-effect.
I want to strip some unwanted symbols from my variable. In this case the symbols are backslashes. I am using a HEX number, and as an example I will show some short simple code down bellow. But I don't want python to convert my HEX to ASCII, how would I prevent this from happening.? I have some long shell codes for asm to work with later which are really long and removing \ by hand is a long process. I know there are different ways like using echo -e "x\x\x\x" > output etc, but my whole script will be written in python.
Thanks
>>> a = "\x31\xC0\x50\x68\x74\x76"
>>> b = a.strip("\\")
>>> print b
1�Phtv
>>> a = "\x31\x32\x33\x34\x35\x36"
>>> b = a.strip("\\")
>>> print b
123456
At the end I would like it to print my var:
>>> print b
x31x32x33x34x35x36
There are no backslashes in your variable:
>>> a = "\x31\xC0\x50\x68\x74\x76"
>>> print(a)
1ÀPhtv
Take newline for example: writing "\n" in Python will give you string with one character -- newline -- and no backslashes. See string literals docs for full syntax of these.
Now, if you really want to write string with such backslashes, you can do it with r modifier:
>>> a = r"\x31\xC0\x50\x68\x74\x76"
>>> print(a)
\x31\xC0\x50\x68\x74\x76
>>> print(a.replace('\\', ''))
x31xC0x50x68x74x76
But if you want to convert a regular string to hex-coded symbols, you can do it character by character, converting it to number ("\x31" == "1" --> 49), then to hex ("0x31"), and finally stripping the first character:
>>> a = "\x31\xC0\x50\x68\x74\x76"
>>> print(''.join([hex(ord(x))[1:] for x in a]))
'x31xc0x50x68x74x76'
There are two problems in your Code.
First the simple one:
strip() just removes one occurrence. So you should use replace("\\", ""). This will replace every backslash with "", which is the same as removing it.
The second problem is pythons behavior with backslashes:
To get your example working you need to append an 'r' in front of your string to indicate, that it is a raw string. a = r"\x31\xC0\x50\x68\x74\x76". In raw strings, a backlash doesn't escape a character but just stay a backslash.
>>> r"\x31\xC0\x50\x68\x74\x76"
'\\x31\\xC0\\x50\\x68\\x74\\x76'
At some point our python script receives string like that:
In [1]: ab = 'asd\xeffe\ctive'
In [2]: print ab
asd�fe\ctve \ \\ \\\k\\\
Data is damaged we need escape \x to be properly interpreted as \x but \c has not special meaning in string thus must be intact.
So far the closest solution I found is do something like:
In [1]: ab = 'asd\xeffe\ctve \\ \\\\ \\\\\\k\\\\\\'
In [2]: print ab.encode('string-escape').replace('\\\\', '\\').replace("\\'", "'")
asd\xeffe\ctve \ \\ \\\k\\\
Output taken from IPython, I assumed that ab is a string not unicode string (in the later case we would have to do something like that:
def escape_string(s):
if isinstance(s, str):
s = s.encode('string-escape').replace('\\\\', '\\').replace("\\'", "'")
elif isinstance(s, unicode):
s = s.encode('unicode-escape').replace('\\\\', '\\').replace("\\'", "'")
return s
\xhh is an escape character and \x is seen as the start of this escape.
'\\' is the same as '\x5c'. It is just two different ways to write the backslash character as a Python string literal.
These literal strings: r'\c', '\\c', '\x5cc', '\x5c\x63' are identical str objects in memory.
'\xef' is a single byte (239 as an integer), but r'\xef' (same as '\\xef') is a 4-byte string: '\x5c\x78\x65\x66'.
If s[0] returns '\xef' then it is what s object actually contains. If it is wrong then fix the source of the data.
Note: string-escape also escapes \n and the like:
>>> print u'''\xef\c\\\N{SNOWMAN}"'\
... ☃\u2603\"\'\n\xa0'''.encode('unicode-escape')
\xef\\c\\\u2603"'\u2603\u2603"'\n\xa0
>>> print b'''\xef\c\\\N{SNOWMAN}"'\
... ☃\u2603\"\'\n\xa0'''.encode('string-escape')
\xef\\c\\\\N{SNOWMAN}"\'\xe2\x98\x83\\u2603"\'\n\xa0
backslashreplace is used only on characters that cause UnicodeEncodeError:
>>> print u'''\xef\c\\\N{SNOWMAN}"'\
... ☃\u2603\"\'\n\xa0'''
ï\c\☃"'☃☃"'
>>> print b'''\xef\c\\\N{SNOWMAN}"'\
... ☃\u2603\"\'\n\xa0'''
�\c\\N{SNOWMAN}"'☃\u2603"'
�
>>> print u'''\xef\c\\\N{SNOWMAN}"'\
... ☃\u2603\"\'\n\xa0'''.encode('ascii', 'backslashreplace')
\xef\c\\u2603"'\u2603\u2603"'
\xa0
>>> print b'''\xef\c\\\N{SNOWMAN}"'\
... ☃\u2603\"\'\n\xa0'''.decode('latin1').encode('ascii', 'backslashreplace')
\xef\c\\N{SNOWMAN}"'\xe2\x98\x83\u2603"'
\xa0
Backslashes introduce "escape sequences". \x specifically allows you to specify a byte, which is given as two hexadecimal digits after the x. ef are two hexadecimal digits, hence you get no error. Double the backslash to escape it, or use a raw string r"\xeffective".
Edit: While the Python console may show you '\\', this is precisely what you expect. You just say you expect something else because you confuse the string and its representation. It's a string containing a single backslash. If you were to output it with print, you'd see a single backslash.
But the string literal '\' is ill-formed (not closed because \' is an apostrophe, not a backslash and end-of-string-literal), so repr, which formats the results at the interactive shell, does not produce it. Instead it produces a string literal which you could paste into Python source code and get the same string object. For example, len('\\') == 1.
The \x escape sequence signifies a Unicode character in the string, and ef is being interpreted as the hex code. You can sanitize the string by adding an additional \, or else make it a raw string (r'\xeffective').
>>> r'\xeffective'[0]
'\\'
EDIT: You could convert an existing string using the following hack:
>>> a = '\xeffective'
>>> b = repr(a).strip("'")
>>> b
'\\xeffective'
This question already has answers here:
Why do backslashes appear twice?
(2 answers)
How should I write a Windows path in a Python string literal?
(5 answers)
Closed 4 years ago.
I have a dictionary:
my_dictionary = {"058498":"table", "064165":"pen", "055123":"pencil"}
I iterate over it:
for item in my_dictionary:
PDF = r'C:\Users\user\Desktop\File_%s.pdf' %item
doIt(PDF)
def doIt(PDF):
part = MIMEBase('application', "octet-stream")
part.set_payload( open(PDF,"rb").read() )
But I get this error:
IOError: [Errno 2] No such file or directory: 'C:\\Users\\user\\Desktop\\File_055123.pdf'
It can't find my file. Why does it think there are double backslashes in file path?
The double backslash is not wrong, python represents it way that to the user. In each double backslash \\, the first one escapes the second to imply an actual backslash. If a = r'raw s\tring' and b = 'raw s\\tring' (no 'r' and explicit double slash) then they are both represented as 'raw s\\tring'.
>>> a = r'raw s\tring'
>>> b = 'raw s\\tring'
>>> a
'raw s\\tring'
>>> b
'raw s\\tring'
For clarification, when you print the string, you'd see it as it would get used, like in a path - with just one backslash:
>>> print(a)
raw s\tring
>>> print(b)
raw s\tring
And in this printed string case, the \t doesn't imply a tab, it's a backslash \ followed by the letter 't'.
Otherwise, a string with no 'r' prefix and a single backslash would escape the character after it, making it evaluate the 't' following it == tab:
>>> t = 'not raw s\tring' # here '\t' = tab
>>> t
'not raw s\tring'
>>> print(t) # will print a tab (and no letter 't' in 's\tring')
not raw s ring
So in the PDF path+name:
>>> item = 'xyz'
>>> PDF = r'C:\Users\user\Desktop\File_%s.pdf' % item
>>> PDF # the representation of the string, also in error messages
'C:\\Users\\user\\Desktop\\File_xyz.pdf'
>>> print(PDF) # "as used"
C:\Users\user\Desktop\File_xyz.pdf
More info about escape sequences in the table here. Also see __str__ vs __repr__.
Double backslashes are due to r, raw string:
r'C:\Users\user\Desktop\File_%s.pdf' ,
It is used because the \ might escape some of the characters.
>>> strs = "c:\desktop\notebook"
>>> print strs #here print thinks that \n in \notebook is the newline char
c:\desktop
otebook
>>> strs = r"c:\desktop\notebook" #using r'' escapes the \
>>> print strs
c:\desktop\notebook
>>> print repr(strs) #actual content of strs
'c:\\desktop\\notebook'
save yourself from getting a headache you can use other slashes as well.
if you know what I saying. the opposite looking slashes.
you're using now
PDF = 'C:\Users\user\Desktop\File_%s.pdf' %item
try to use
**
PDF = 'C:/Users/user/Desktop/File_%s.pdf' %item
**
it won't be treated as a escaping character .
It doesn't. Double backslash is just the way of the computer of saying backslash. Yes, I know this sounds weird, but think of it this way - in order to represent special characters, backslash was chosen as an escaping character (e.g. \n means newline, and not the backslash character followed by the n character). But what happens if you actually want to print (or use) a backslash (possibly followed by more characters), but you don't want the computer to treat it as an escaping character? In that case we escape the backslash itself, meaning we use a double backslash so the computer will understand it's a single backslash.
It's done automatically in your case because of the r you added before the string.
alwbtc #
I dare say: "I found the bug..."
replace
PDF = r'C:\Users\user\Desktop\File_%s.pdf' %item
doIt(PDF)`
with
for item in my_dictionary:
PDF = r'C:\Users\user\Desktop\File_%s.pdf' % mydictionary[item]
doIt(PDF)`
in fact you were really looking for File_pencil.pdf (not File_055123.pdf).
You were sliding the index dictionary not its contents.
This forum topic maybe a side-effect.
This question already has answers here:
Process escape sequences in a string in Python
(8 answers)
Closed 7 months ago.
So I can't seem to figure this out... I have a string say, "a\\nb" and I want this to become "a\nb". I've tried all the following and none seem to work;
>>> a
'a\\nb'
>>> a.replace("\\","\")
File "<stdin>", line 1
a.replace("\\","\")
^
SyntaxError: EOL while scanning string literal
>>> a.replace("\\",r"\")
File "<stdin>", line 1
a.replace("\\",r"\")
^
SyntaxError: EOL while scanning string literal
>>> a.replace("\\",r"\\")
'a\\\\nb'
>>> a.replace("\\","\\")
'a\\nb'
I really don't understand why the last one works, because this works fine:
>>> a.replace("\\","%")
'a%nb'
Is there something I'm missing here?
EDIT I understand that \ is an escape character. What I'm trying to do here is turn all \\n \\t etc. into \n \t etc. and replace doesn't seem to be working the way I imagined it would.
>>> a = "a\\nb"
>>> b = "a\nb"
>>> print a
a\nb
>>> print b
a
b
>>> a.replace("\\","\\")
'a\\nb'
>>> a.replace("\\\\","\\")
'a\\nb'
I want string a to look like string b. But replace isn't replacing slashes like I thought it would.
There's no need to use replace for this.
What you have is a encoded string (using the string_escape encoding) and you want to decode it:
>>> s = r"Escaped\nNewline"
>>> print s
Escaped\nNewline
>>> s.decode('string_escape')
'Escaped\nNewline'
>>> print s.decode('string_escape')
Escaped
Newline
>>> "a\\nb".decode('string_escape')
'a\nb'
In Python 3:
>>> import codecs
>>> codecs.decode('\\n\\x21', 'unicode_escape')
'\n!'
You are missing, that \ is the escape character.
Look here: http://docs.python.org/reference/lexical_analysis.html
at 2.4.1 "Escape Sequence"
Most importantly \n is a newline character.
And \\ is an escaped escape character :D
>>> a = 'a\\\\nb'
>>> a
'a\\\\nb'
>>> print a
a\\nb
>>> a.replace('\\\\', '\\')
'a\\nb'
>>> print a.replace('\\\\', '\\')
a\nb
r'a\\nb'.replace('\\\\', '\\')
or
'a\nb'.replace('\n', '\\n')
Your original string, a = 'a\\nb' does not actually have two '\' characters, the first one is an escape for the latter. If you do, print a, you'll see that you actually have only one '\' character.
>>> a = 'a\\nb'
>>> print a
a\nb
If, however, what you mean is to interpret the '\n' as a newline character, without escaping the slash, then:
>>> b = a.replace('\\n', '\n')
>>> b
'a\nb'
>>> print b
a
b
It's because, even in "raw" strings (=strings with an r before the starting quote(s)), an unescaped escape character cannot be the last character in the string. This should work instead:
'\\ '[0]
In Python string literals, backslash is an escape character. This is also true when the interactive prompt shows you the value of a string. It will give you the literal code representation of the string. Use the print statement to see what the string actually looks like.
This example shows the difference:
>>> '\\'
'\\'
>>> print '\\'
\
In Python 3 it will be:
bytes(s, 'utf-8').decode("unicode_escape")
This works on Windows with Python 3.x:
import os
str(filepath).replace(os.path.sep, '/')
Where: os.path.sep is \ on Windows and / on Linux.
Case study
Used this to prevent errors when generating a Markdown file then rendering it to pdf.
path = "C:\\Users\\Programming\\Downloads"
# Replace \\ with a \ along with any random key multiple times
path.replace('\\', '\pppyyyttthhhooonnn')
# Now replace pppyyyttthhhooonnn with a blank string
path.replace("pppyyyttthhhooonnn", "")
print(path)
#Output...
C:\Users\Programming\Downloads