I just started learning Python 2.7.11 from the docs.
I was working through the examples given there, but got confused when print and r were introduced, specifically about how backslashes are displayed in the output. This is related to character escaping, but until now I had only used \b for backspace, \n for a new line, and \\ for a backslash.
This is what I am talking about:
There are four different cases:
using neither print nor r
using print but not r
using r but not print
using both print and r
and all of them give different outputs. I can't get my head around what's actually happening in all the four cases.
There's no connection between the two, necessarily. r introduces a raw string literal, which doesn't parse escape codes, which means that a \ between the quotes will be an actual \ character in the string data.
print will print out the actual string, while just typing a string literal at the >>> prompt will cause Python to implicitly print the value of the string, but it will be printed as a string literal (with special characters escaped), so you could take it, paste it back into >>> and get the same string.
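A minimal sketch of the four cases, not taken from the docs: the names plain and raw are made up for illustration, and repr() stands in for what the >>> prompt echoes. It uses print() with a single argument so that it should behave the same on Python 2.7 and Python 3.

```python
# Illustrative names; repr() stands in for the echo of the >>> prompt.
plain = 'one\ntwo'   # \n is parsed into a single newline character
raw = r'one\ntwo'    # raw literal: backslash and n stay two separate characters

print(repr(plain))   # neither print nor r (what >>> would echo): 'one\ntwo'
print(plain)         # print but not r: "one" and "two" on separate lines
print(repr(raw))     # r but not print: 'one\\ntwo'
print(raw)           # both print and r: one\ntwo

print(len(plain))    # 7
print(len(raw))      # 8
```

The two lengths differ because the raw literal keeps the backslash as a character of its own, which is also why its representation needs a doubled backslash.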
Edit: repr() was mentioned in some of the comments. Basically, it's meant to return a representation of an object that can be parsed back into an equivalent object, whereas str() is aimed more towards a human-readable description. For built-in objects, both are predefined as appropriate, while for user objects, you can define them by defining the special methods __repr__() and __str__().
>>> obj will print out repr(obj), while print obj will print the value of str(obj).
>>> class Foo(object):
...     def __str__(self):
...         return "str of Foo"
...     def __repr__(self):
...         return "Foo()"
...
>>> f = Foo()
>>> f
Foo()
>>> print f
str of Foo
>>> eval(repr(f))
Foo()
Also, regarding the backslashes, it is worth noting that Python leaves escape sequences it doesn't recognize unchanged, keeping the backslash in the string (newer Python 3 releases also warn about them):
>>> [c for c in '\w']
['\\', 'w']
>>> [c for c in '\n']
['\n']
\n becomes a newline character, while \w remains a backslash followed by a w.
For example, in 'hello\\\world', the first \ escapes the second one, and the third one is left alone because it's followed by a w.
This sample code prints the representation of a line from a file. It allows its contents to be viewed, including control characters like '\n', on a single line—so we refer to it as the "raw" output of the line.
print("%r" % (self.f.readline()))
The output, however, appears with ' characters added to each end which aren't in the file.
'line of content\n'
How to get rid of the single quotes around the output?
(Behavior is the same in both Python 2.7 and 3.6.)
%r formats the string with repr(). That escapes newlines and other control characters as you want, but it also adds quotes. To fix this, strip off the quotes yourself by slicing.
print("%s" %(repr(self.f.readline())[1:-1]))
If this is all you are printing, you don't need to pass it through a string formatter at all
print(repr(self.f.readline())[1:-1])
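This also works, as long as the slice is applied to the formatted result rather than to the line itself (note the extra parentheses):
print(("%r" % self.f.readline())[1:-1])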
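As a sketch of the slicing approach, the hypothetical helper raw_view below (not part of the question's code) wraps it in a function. It assumes a plain str whose repr() is delimited by exactly one quote character on each side; a Python 2 unicode string carries a u prefix and would need different slicing.

```python
def raw_view(line):
    """Return repr(line) without its surrounding quote characters."""
    # Slicing drops whichever delimiter repr() chose, ' or ".
    return repr(line)[1:-1]


print(raw_view('line of content\n'))   # line of content\n
print(raw_view("it's here\n"))         # it's here\n
```

Slicing rather than stripping a specific quote character keeps the helper working when repr() switches to double quotes because the line contains an apostrophe.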
Although this approach would be overkill, in Python you can subclass most, if not all, of the built-in types, including str. This means you could define your own string class whose representation is whatever you want.
The following illustrates using that ability:
class MyStr(str):
    """ Special string subclass to override the default representation method
        which puts single quotes around the result.
    """
    def __repr__(self):
        # Drop the first and last character: repr() may use ' or " as the delimiter.
        return super(MyStr, self).__repr__()[1:-1]
s1 = 'hello\nworld'
s2 = MyStr('hello\nworld')
print("s1: %r" % s1)
print("s2: %r" % s2)
Output:
s1: 'hello\nworld'
s2: hello\nworld
New to python and I am learning this tutorial:
http://learnpythonthehardway.org/book/ex8.html
I just cannot see why the line "But it didn't sing." got printed out with double-quote and all the others got printed with single quote.. Cannot see any difference from the code...
The quotes depend on the string. If it contains no quotes, Python uses single quotes:
>>> """no quotes"""
'no quotes'
if there is a single quote, it will use double quotes:
>>> """single quote:'"""
"single quote:'"
if there is a double quote, it will use single quotes:
>>> """double quote:" """
'double quote:" '
if there are both, it will use single quotes, hence escaping the single one:
>>> """mix quotes:'" """
'mix quotes:\'" '
>>> """mix quotes:"' """
'mix quotes:"\' '
>>> '''mix quotes:"' '''
'mix quotes:"\' '
There won't be a difference though when you print the string:
>>> print '''mix quotes:"' '''
mix quotes:"'
The surrounding quotes are only part of the string's representation:
>>> print str('''mix quotes:"' ''')
mix quotes:"'
>>> print repr('''mix quotes:"' ''')
'mix quotes:"\' '
You might want to check the python tutorial on strings.
The representation of a value should be equivalent to the Python code required to generate it. Since the string "But it didn't sing." contains a single quote, using single quotes to delimit it would create invalid code. Therefore double quotes are used instead.
Python has several rules for outputting the repr of strings.
Normally, it uses ' to surround them, unless the string itself contains a ', in which case it uses " so that nothing needs escaping.
If a string contains both ' and " characters, it uses ' as the delimiter and escapes each ' inside with a backslash.
As there can be several valid and equivalent representations of a string, these rules might change from version to version.
BTW, in the site you linked to the answer is given as well:
Q: Why does %r sometimes print things with single-quotes when I wrote them with double-quotes?
A: Python is going to print the strings in the most efficient way it can, not replicate exactly the way you wrote them. This is perfectly fine since %r is used for debugging and inspection, so it's not necessary that it be pretty.
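The quoting rules described in the answers above can be checked with a short sketch. The sample strings are made up for illustration (one is the line from the exercise), and as noted the exact choice is an implementation detail that may differ between versions.

```python
samples = [
    'no quotes',            # shown as 'no quotes'
    "But it didn't sing.",  # shown as "But it didn't sing."
    'double quote:"',       # shown as 'double quote:"'
    'both:\'"',             # shown as 'both:\'"'
]

for s in samples:
    shown = repr(s)
    # Whatever delimiter is chosen, the representation evaluates
    # back to a string equal to the original.
    assert eval(shown) == s
    print(shown)
```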
Is there a way to declare a string variable in Python such that everything inside of it is automatically escaped, or has its literal character value?
I'm not asking how to escape the quotes with slashes, that's obvious. What I'm asking for is a general-purpose way to have everything in a string taken literally, so that I don't have to manually go through and escape everything in very large strings.
Raw string literals:
>>> r'abc\dev\t'
'abc\\dev\\t'
If you're dealing with very large strings, specifically multiline strings, be aware of the triple-quote syntax:
a = r"""This is a multiline string
with more than one line
in the source code."""
There is no such thing. It looks like you want something like "here documents" in Perl and the shells, but Python doesn't have that.
Using raw strings or multiline strings only means that there are fewer things to worry about. If you use a raw string then you still have to work around a terminal "\" and with any string solution you'll have to worry about the closing ", ', ''' or """ if it is included in your data.
That is, there's no way to have the string
' ''' """ " \
properly stored in any Python string literal without internal escaping of some sort.
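To make that concrete, here is a sketch of writing that exact string with escapes, plus the usual workaround for a raw string that has to end in a backslash. The variable names and the sample path are illustrative only.

```python
# The awkward string from above, with the single quotes and the
# backslash escaped inside an ordinary single-quoted literal.
tricky = '\' \'\'\' """ " \\'
print(tricky)   # prints the five groups exactly as listed above

# A raw literal cannot end in a single backslash, so the final
# backslash is appended from an ordinary literal instead.
folder = r'C:\temp' + '\\'
print(folder)   # prints the path with its trailing backslash
print(len(folder))   # 8
```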
You will find Python's string literal documentation here:
http://docs.python.org/tutorial/introduction.html#strings
and here:
http://docs.python.org/reference/lexical_analysis.html#literals
The simplest example would be using the 'r' prefix:
ss = r'Hello\nWorld'
print(ss)
Hello\nWorld
(Assuming you are not required to input the string from directly within Python code)
To get around the issue Andrew Dalke pointed out, simply type the literal string into a text file and then use this:
input_ = '/directory_of_text_file/your_text_file.txt'
with open(input_, 'r') as input_open:  # read-only is enough; 'r+' would also allow writing
    input_string = input_open.read()
print input_string
This will print the literal text of whatever is in the text file, even if it is:
' ''' """ " \
Not fun or optimal, but it can be useful, especially if you have 3 pages of code that would have needed character escaping.
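A self-contained sketch of the file-based approach, assuming a temporary file stands in for the hand-typed text file (the content written here is made up). Text read from a file is never parsed for escape sequences, so backslashes and quotes arrive unchanged.

```python
import os
import tempfile

# Hypothetical stand-in for a text file that was typed by hand.
fd, path = tempfile.mkstemp(suffix='.txt')
with os.fdopen(fd, 'w') as handle:
    handle.write(r'C:\new\table "quoted" ' + "'single'")

# Reading the file back performs no escape processing at all.
with open(path) as handle:
    text = handle.read()
os.remove(path)

print(text)           # C:\new\table "quoted" 'single'
print('\n' in text)   # False: the \n in the file is a backslash and an n
```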
Use print and repr:
>>> s = '\tgherkin\n'
>>> s
'\tgherkin\n'
>>> print(s)
gherkin
>>> repr(s)
"'\\tgherkin\\n'"
# print(repr(..)) gets literal
>>> print(repr(s))
'\tgherkin\n'
>>> repr('\tgherkin\n')
"'\\tgherkin\\n'"
>>> print('\tgherkin\n')
gherkin
>>> print(repr('\tgherkin\n'))
'\tgherkin\n'
I have a noob question.
I have a record in a table that looks like '\1abc'
I then use this string as a regex replacement in re.sub("([0-9])",thereplacement,"2")
I'm a little confused by the backslashes. The string I got back was "\\1abc"
Are you using Python interactively?
In a regular string literal you need to escape backslashes in your code, or use r"...". If you are running Python interactively and don't assign the result from your database to a variable, it will be echoed using its __repr__() method.
>>> s = "\\1abc"
>>> s
'\\1abc' # <-- How it's represented in Python code
>>> print s
\1abc # <-- The actual string
Also, your re.sub call looks a bit odd. Maybe you meant [0-9] as the pattern (matching a single digit)? The arguments are probably switched too, if thereplacement is your input. This is the syntax:
re.sub(pattern, repl, string, count=0)
So my guess is you expect something like this:
>>> s_in = yourDbMagic() # Which returns \1abc
>>> s_out = re.sub("[0-9]", "2", s_in)
>>> print s_in, s_out
\1abc \2abc
Edit: Tried to better explain escaping/representation.
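One further point that may explain the confusion, sketched here as an assumption about what the asker was after: re.sub also interprets backslash escapes in the replacement string, so \1 there is a reference to group 1 rather than literal text. If the value from the table should be inserted verbatim, passing a function as the replacement is one way to do it. The name thereplacement comes from the question; its value is typed in by hand here.

```python
import re

thereplacement = r'\1abc'   # backslash, 1, a, b, c: five characters

# As a replacement template, \1 is substituted with the text of group 1.
print(re.sub("([0-9])", thereplacement, "2"))             # 2abc

# A function's return value is inserted as-is, with no escape processing.
print(re.sub("([0-9])", lambda m: thereplacement, "2"))   # \1abc
```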
Note that if the table lives in PostgreSQL, you can make \ stop being an escape character in its string literals by setting standard_conforming_strings to on.
Given a string named line whose raw version has this value:
\rRAWSTRING
how can I detect if it has the escape character \r? What I've tried is:
if repr(line).startswith('\r'):
    blah...
but it doesn't catch it. I also tried find, such as:
if repr(line).find('\r') != -1:
    blah
doesn't work either. What am I missing?
thx!
EDIT:
Thanks for all the replies and the corrections re terminology, and sorry for the confusion.
OK, if I do this
print repr(line)
then what it prints is:
'\rSET ENABLE ACK\n'
(including the single quotes). i have tried all the suggestions, including:
line.startswith(r'\r')
line.startswith('\\r')
each of which returns False. also tried:
line.find(r'\r')
line.find('\\r')
each of which returns -1
If:
print repr(line)
Returns:
'\rSET ENABLE ACK\n'
Then:
line.find('\r')
line.startswith('\r')
'\r' in line
are what you are looking for. Example:
>>> line = '\rSET ENABLE ACK\n'
>>> print repr(line)
'\rSET ENABLE ACK\n'
>>> line.find('\r')
0
>>> line.startswith('\r')
True
>>> '\r' in line
True
repr() returns a display string that actually contains the quotes and backslashes you see when you print it, which is why it is longer than the line itself:
>>> print line
SET ENABLE ACK
>>> print repr(line)
'\rSET ENABLE ACK\n'
>>> print len(line)
16
>>> print len(repr(line))
20
Dude, it seems you have tried everything but the simplest thing, line.startswith('\r'):
>>> line = '\rSET ENABLE ACK\n'
>>> line.startswith('\r')
True
>>> '\rSET ENABLE ACK\n'.startswith('\r')
True
For now, just hold off on using repr(), \r and r'string' and go with the simplest thing to avoid confusion.
You can try either:
if repr(line)[1:].startswith(r'\r'):
(the [1:] skips the opening quote that repr adds) or
if line.startswith('\r'):
The latter is better: it seems like you are using repr only to get at the escaped character.
It's not entirely clear what you're asking. You speak of a string, but also a "raw version", and your string contains "RAWSTRING", which seems to imply you are talking about raw strings.
None of these are quite the same thing.
If you have an ordinary string with the character represented by '\r' in it, then you can use any ordinary means to match:
>>> "\rastring".find('\r')
0
>>>
If you defined an actual "raw string", that won't work because what you put in was not the '\r' character, but the two characters '\' and 'r':
>>> r"\rastring".find('\r')
-1
>>>
In this case, or in the case of an ordinary string that happens to have the characters '\' and 'r', and you want to find those two characters, you'll need to search using a raw string:
>>> r"\rastring".find(r'\r')
0
>>>
Or you can search for the sequence by escaping the backslash itself:
>>> r"\rastring".find('\\r')
0
>>>
if '\r' in line:
If that isn't what you mean, tell us PRECISELY what is in line. Do this:
print repr(line)
and copy/paste the result into an edit of your question.
Re your subject: backslash is an escape character, "\r" is an escaped character.
Simplest:
>>> s = r'\rRAWSTRING'
>>> s.startswith(r'\r')
True
Unless I badly misunderstand what you're saying, you need to look for r'\r' (or, equivalently but perhaps a tad less readably, '\\r'), the escape sequence, not for '\r', the carriage return character itself.
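The two readings of the question discussed in these answers can be told apart with a small sketch. The helper name leading_r is hypothetical; the first sample is the line shown in the question's edit.

```python
def leading_r(line):
    """Classify how a line starts (illustrative helper)."""
    if line.startswith('\r'):
        return 'carriage return'
    if line.startswith('\\r'):   # same two characters as r'\r'
        return 'backslash followed by r'
    return 'neither'


print(leading_r('\rSET ENABLE ACK\n'))   # carriage return
print(leading_r(r'\rRAWSTRING'))         # backslash followed by r
print(leading_r('SET ENABLE ACK\n'))     # neither
```

Since print repr(line) showed a single backslash before the r, the line in the question falls into the first case.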