how to detect an escape sequence in a string - python

Given a string named line whose raw version has this value:
\rRAWSTRING
how can I detect if it has the escape character \r? What I've tried is:
if repr(line).startswith('\r'):
    blah...
but it doesn't catch it. I also tried find, such as:
if repr(line).find('\r') != -1:
    blah
That doesn't work either. What am I missing?
thx!
EDIT:
Thanks for all the replies and the corrections re terminology, and sorry for the confusion.
OK, if I do this
print repr(line)
then what it prints is:
'\rSET ENABLE ACK\n'
(including the single quotes). i have tried all the suggestions, including:
line.startswith(r'\r')
line.startswith('\\r')
each of which returns False. also tried:
line.find(r'\r')
line.find('\\r')
each of which returns -1

If:
print repr(line)
Returns:
'\rSET ENABLE ACK\n'
Then:
line.find('\r')
line.startswith('\r')
'\r' in line
are what you are looking for. Example:
>>> line = '\rSET ENABLE ACK\n'
>>> print repr(line)
'\rSET ENABLE ACK\n'
>>> line.find('\r')
0
>>> line.startswith('\r')
True
>>> '\r' in line
True
repr() returns a display string. It actually contains the quotes and backslashes you see when you print the line:
>>> print line
SET ENABLE ACK
>>> print repr(line)
'\rSET ENABLE ACK\n'
>>> print len(line)
16
>>> print len(repr(line))
20
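The difference between the string and its repr() can be checked directly. This is a minimal sketch in Python 3 syntax (the transcript above is Python 2); line mirrors the value from the question, and the other variable names are only illustrative.

```python
# Python 3 sketch; `line` mirrors the value shown in the question.
line = '\rSET ENABLE ACK\n'

# repr() builds a display string: quotes plus backslash escapes.
shown = repr(line)

# The real string starts with a single carriage return character.
starts_with_cr = line.startswith('\r')

# The display string starts with a quote, then a backslash, then 'r',
# so it never starts with an actual carriage return.
repr_starts_with_cr = shown.startswith('\r')
repr_prefix = shown[:3]

print(starts_with_cr, repr_starts_with_cr, repr_prefix)
print(len(line), len(shown))
```

This is why the original repr(line).startswith('\r') test could not match: the first character of the display string is the opening quote.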

Dude, it seems you have tried everything but the simplest thing, line.startswith('\r'):
>>> line = '\rSET ENABLE ACK\n'
>>> line.startswith('\r')
True
>>> '\rSET ENABLE ACK\n'.startswith('\r')
True
For now, just hold off on using repr(), \r and r'string' and go with the simplest thing to avoid confusion.

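You can try either:
if repr(line).startswith(r"'\r"):
or
if line.startswith('\r'):
The latter is better: it seems like you are using repr only to get at the escaped character. Note that repr(line) begins with the opening quote character, so the first form has to include that quote in the prefix.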

It's not entirely clear what you're asking. You speak of a string, but also a "raw version", and your string contains "RAWSTRING", which seems to imply you are talking about raw strings.
None of these are quite the same thing.
If you have an ordinary string with the character represented by '\r' in it, then you can use any ordinary means to match:
>>> "\rastring".find('\r')
0
>>>
If you defined an actual "raw string", that won't work because what you put in was not the '\r' character, but the two characters '\' and 'r':
>>> r"\rastring".find('\r')
-1
>>>
In this case, or in the case of an ordinary string that happens to have the characters '\' and 'r', and you want to find those two characters, you'll need to search using a raw string:
>>> r"\rastring".find(r'\r')
0
>>>
Or you can search for the sequence by escaping the backslash itself:
>>> r"\rastring".find('\\r')
0
>>>
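The two situations above can be told apart with a small helper. This is only a sketch: describe_leading_r is a hypothetical name introduced for illustration, and the code uses Python 3 syntax.

```python
def describe_leading_r(text):
    """Classify how `text` begins (hypothetical helper, for illustration only)."""
    if text.startswith('\r'):
        # One character: the carriage return itself.
        return 'carriage return character'
    if text.startswith('\\r'):
        # Two characters: a backslash followed by the letter r.
        return 'backslash followed by r'
    return 'neither'


print(describe_leading_r('\rastring'))   # ordinary literal
print(describe_leading_r(r'\rastring'))  # raw literal
print(describe_leading_r('astring'))
```

Checking the real carriage return first matters only for readability here; the two prefixes can never both match the same string.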

if '\r' in line:
If that isn't what you mean, tell us PRECISELY what is in line. Do this:
print repr(line)
and copy/paste the result into an edit of your question.
Re your subject: backslash is an escape character, "\r" is an escaped character.

Simplest:
>>> s = r'\rRAWSTRING'
>>> s.startswith(r'\r')
True
Unless I badly misunderstand what you're saying, you need to look for r'\r' (or, equivalently but perhaps a tad less readably, '\\r'), the escape sequence, not for '\r', the carriage return character itself.

Related

Why does '\b' look like invalid as the last character of sentence in Python?

I feel puzzled about '\b'.
I know that '\b' means backspace, but in Python it seems to have no effect when it is the last character of the string.
Example:
>>> print 'abc\be'
abe
>>> print 'abc\b'
abc
Why?
And another example, on OSX/python2.7.10 IPython:
>>> import sys
>>> sys.stdout.write('abc\b')
abc
>>> sys.stdout.write('abc\be')
abe

There's an implied newline after the print finishes, which causes a newline immediately after the \b is echoed. This causes the cursor to move to the next line, so there won't be anything overwriting the c from the previous line.
If you did something like:
print 'abc\b', 'def'
you would see output like:
ab def
i.e. it's not 'invalid' at the end of a sentence, it's just that because you immediately print a newline, nothing gets an opportunity to overwrite the character that was backspaced over.
To make this a little clearer (hopefully), here is what happens when typing the lines into Python directly.
print adds the newline; if we use sys.stdout.write, it won't add a newline automatically:
>>> import sys
>>> sys.stdout.write('abc')
abc>>> sys.stdout.write('abc\b')
ab>>>
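The behaviour described above can be imitated with a toy model of a terminal. This is a rough sketch under stated assumptions, not how any real terminal is implemented: render is a hypothetical function in which '\b' only moves the cursor left, '\n' starts a new line at column 0, and any other character overwrites whatever is under the cursor.

```python
def render(text):
    """Toy terminal model (hypothetical): return the resulting screen lines."""
    lines = [[]]
    col = 0
    for ch in text:
        if ch == '\b':
            # Backspace moves the cursor left; it erases nothing by itself.
            col = max(col - 1, 0)
        elif ch == '\n':
            # Newline moves to a fresh line, leaving the old one untouched.
            lines.append([])
            col = 0
        else:
            row = lines[-1]
            if col < len(row):
                row[col] = ch      # overwrite the character under the cursor
            else:
                row.append(ch)
            col += 1
    return [''.join(row) for row in lines]


print(render('abc\be\n'))      # 'e' overwrites 'c'
print(render('abc\b\n'))       # nothing is written after the backspace
print(render('abc\b def\n'))   # the space overwrites 'c'
```

In this model the 'c' survives in the second case simply because the newline arrives before any character that could overwrite it.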

Python - What happens when I write statements in the interpreter without 'print'?

I just started learning python 2.7.11 from the docs.
I was doing the examples given there, but got confused when print and r were introduced. I am confused about how backslashes are displayed in the output. This is related to character escaping, but until now I have only used \b for backspace, \n for a new line, and \\ for a backslash.
This is what I am talking about:
There are four different cases:
using neither print nor r
using print but not r
using r but not print
using both print and r
and all of them give different outputs. I can't get my head around what's actually happening in all the four cases.

There's no connection between the two, necessarily. r introduces a raw string literal, which doesn't parse escape codes, which means that a \ between the quotes will be an actual \ character in the string data.
print will print out the actual string, while just typing a string literal at the >>> prompt will cause Python to implicitly print the value of the string, but it will be printed as a string literal (with special characters escaped), so you could take it, paste it back into >>> and get the same string.
Edit: repr() was mentioned in some of the comments. Basically, it's meant to return a representation of an object that can be parsed back into an equivalent object, whereas str() is aimed more towards a human-readable description. For built-in objects, both are predefined as appropriate, while for user objects, you can define them by defining the special methods __repr__() and __str__().
>>> obj will print out repr(obj), while print obj will print the value of str(obj).
>>> class Foo(object):
...     def __str__(self):
...         return "str of Foo"
...     def __repr__(self):
...         return "Foo()"
...
>>> f = Foo()
>>> f
Foo()
>>> print f
str of Foo
>>> eval(repr(f))
Foo()
Also, regarding the \es, it's worth noting that Python leaves escape sequences it doesn't recognize unchanged in the string (newer Python 3 versions also emit a warning for them):
>>> [c for c in '\w']
['\\', 'w']
>>> [c for c in '\n']
['\n']
\n becomes a newline character, while \w remains a backslash followed by a w.
For example, in 'hello\\\world', the first \ escapes the second one, and the third one is left alone because it's followed by a w.
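The four cases from the question can be reproduced side by side. This is a sketch in Python 3 syntax: repr() stands in for typing an expression at the >>> prompt and str() for what print shows, and the sample strings are made up for illustration.

```python
plain = 'a\tb\\c'   # escapes processed: a, TAB, b, one backslash, c
raw = r'a\tb\\c'    # raw literal: every backslash is kept as typed

cases = {
    'neither print nor r': repr(plain),
    'print but not r': str(plain),
    'r but not print': repr(raw),
    'both print and r': str(raw),
}

for label, shown in cases.items():
    print(label, '->', shown)

print(len(plain), len(raw))
```

The r prefix changes which characters end up in the string, while print versus the bare prompt only changes how those same characters are displayed.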

How to print variable inside quotation marks?

I would like to print a variable within quotation marks. I want to print out "variable"
I have tried a lot, what worked was:
print('"', variable, '"')
but then I have two spaces in the output:
" variable "
How can I print something within a pair of quotation marks?

you can use format:
>>> s='hello'
>>> print '"{}"'.format(s)
"hello"
Learn more in the documentation for str.format.
In Python 3.6+ you can use an f-string:
>>> f'"{s}"'
'"hello"'

If apostrophes ("single quotes") are okay, then the easiest way is to:
print repr(str(variable))
Otherwise, prefer the .format method over the % operator (see Hackaholic's answer).
The % operator (see Bhargav Rao's answer) also works, even in Python 3; it was once expected to be removed in a future version, but it is still supported.
The advantage to using repr() is that quotes within the string will be handled appropriately. If you have an apostrophe in the text, repr() will switch to "" quotes. It will always produce something that Python recognizes as a string constant.
Whether that's good for your user interface, well, that's another matter. With % or .format, you get a shorthand for the way you might have done it to begin with:
print '"' + str(variable) + '"'
...as mentioned by Charles Duffy in comment.

Simply do:
print '"A word that needs quotation marks"'
Or you can use a triple quoted string:
print( """ "something" """ )

There are two simple ways to do this. The first is to just use a backslash before each quotation mark, like so:
s = "\"variable\""
The other way is, if there are double quotation marks inside the string, to surround it with single quotes, and Python will recognize the double quotes as part of the string (and vice versa):
s = '"variable"'

format is the best. These are alternatives.
>>> s='hello' # Used widely in Python 2, might get deprecated
>>> print '"%s"'%(s)
"hello"
>>> print '"',s,'"' # Using the built-in comma technique of print; adds spaces
" hello "
>>> print '"'+s+'"' # Plain old-fashioned concatenation
"hello"
"hello"

You can use string concatenation; see the example below:
n= "test"
print ('"' + n + '"')
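For comparison, here are the approaches from the answers above in Python 3 syntax, plus the sep argument of the print function, which removes the extra spaces the asker saw. This is a sketch; the value of variable is an example, not taken from the question.

```python
variable = 'hello'  # example value, chosen for illustration

with_format = '"{}"'.format(variable)
with_fstring = f'"{variable}"'            # Python 3.6+
with_concat = '"' + str(variable) + '"'

print(with_format)
print(with_fstring)
print(with_concat)

# print() puts a space between its arguments by default; sep='' turns that off.
print('"', variable, '"', sep='')
```

All four lines print the same text, so the choice is mostly a matter of readability.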

How to write string literals in Python without having to escape them?

Is there a way to declare a string variable in Python such that everything inside of it is automatically escaped, or has its literal character value?
I'm not asking how to escape the quotes with slashes, that's obvious. What I'm asking for is a general purpose way for making everything in a string literal so that I don't have to manually go through and escape everything for very large strings.

Raw string literals:
>>> r'abc\dev\t'
'abc\\dev\\t'

If you're dealing with very large strings, specifically multiline strings, be aware of the triple-quote syntax:
a = r"""This is a multiline string
with more than one line
in the source code."""

There is no such thing. It looks like you want something like "here documents" in Perl and the shells, but Python doesn't have that.
Using raw strings or multiline strings only means that there are fewer things to worry about. If you use a raw string then you still have to work around a terminal "\" and with any string solution you'll have to worry about the closing ", ', ''' or """ if it is included in your data.
That is, there's no way to have the string
' ''' """ " \
properly stored in any Python string literal without internal escaping of some sort.
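The trailing-backslash limitation is usually worked around by building the string from more than one literal. This is a sketch in Python 3 syntax; the Windows-style path is a made-up example.

```python
# r'C:\temp\' is a syntax error: a raw literal cannot end in a single backslash.
# One workaround is to append the final backslash from an ordinary literal.
path_prefix = r'C:\temp' + '\\'

# The awkward string from the text, assembled from pieces so that each piece
# can use a quoting style that does not clash with its own contents.
tricky = "' ''' " + '""" " ' + '\\'

print(path_prefix)
print(tricky)
```

Even here the last backslash still has to be escaped, which matches the point above: some internal escaping is unavoidable.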

You will find Python's string literal documentation here:
http://docs.python.org/tutorial/introduction.html#strings
and here:
http://docs.python.org/reference/lexical_analysis.html#literals
The simplest example would be using the 'r' prefix:
ss = r'Hello\nWorld'
print(ss)
Hello\nWorld

(Assuming you are not required to input the string from directly within Python code)
To get around the issue Andrew Dalke pointed out, simply type the literal string into a text file and then read it back:
input_ = '/directory_of_text_file/your_text_file.txt'
# Read-only mode is enough; the with block closes the file afterwards.
with open(input_, 'r') as input_open:
    input_string = input_open.read()
print input_string
This will print the literal text of whatever is in the text file, even if it is:
' ''' """ " \
Not fun or optimal, but can be useful, especially if you have 3 pages of code that would’ve needed character escaping.
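A self-contained way to try the file approach is to write the awkward text to a temporary file first and read it back. This is a sketch in Python 3 syntax; the temporary file stands in for the hand-typed text file, and the escaping in awkward exists only so the demo can create that file.

```python
import os
import tempfile

# Escaped here only so that the demo can create the file; a hand-typed
# text file would contain these characters with no escaping at all.
awkward = "' ''' \"\"\" \" \\"

with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False,
                                 encoding='utf-8') as handle:
    handle.write(awkward)
    path = handle.name

# Reading the file returns its characters exactly as stored.
with open(path, 'r', encoding='utf-8') as source:
    input_string = source.read()

os.remove(path)  # clean up the temporary file
print(input_string)
```

File contents are never parsed as string literals, which is why no escaping is involved on the reading side.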

Use print and repr:
>>> s = '\tgherkin\n'
>>> s
'\tgherkin\n'
>>> print(s)
gherkin
>>> repr(s)
"'\\tgherkin\\n'"
# print(repr(..)) gets literal
>>> print(repr(s))
'\tgherkin\n'
>>> repr('\tgherkin\n')
"'\\tgherkin\\n'"
>>> print('\tgherkin\n')
gherkin
>>> print(repr('\tgherkin\n'))
'\tgherkin\n'

regarding backslash from postgresql

i have a noob question.
I have a record in a table that looks like '\1abc'
I then use this string as a regex replacement in re.sub("([0-9])",thereplacement,"2")
I'm a little confused with the backslashes. The string i got back was "\\1abc"

Are you using Python interactively?
In a regular string literal you need to escape backslashes in your code, or use r"..." (Link to docs). If you are running Python interactively and don't assign the result from your database to a variable, it'll be printed out using its __repr__() method.
>>> s = "\\1abc"
>>> s
'\\1abc' # <-- How it's represented in Python code
>>> print s
\1abc # <-- The actual string
Also, your re.sub call looks a bit odd. Maybe you meant [0-9] as the pattern (matching a single digit)? The arguments are probably switched too, if thereplacement is your input. This is the syntax:
re.sub(pattern, repl, string, count=0)
So my guess is you expect something like this:
>>> s_in = yourDbMagic() # Which returns \1abc
>>> s_out = re.sub("[0-9]", "2", s_in)
>>> print s_in, s_out
\1abc \2abc
Edit: Tried to better explain escaping/representation.
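If the database value really is meant to be the replacement, keep in mind that re.sub treats a backslash followed by a digit in the replacement string as a group reference. This is a sketch in Python 3 syntax; replacement stands in for the value read from the table.

```python
import re

# Stand-in for the database value: the five characters \1abc.
replacement = '\\1abc'

# Used as a template, \1 is replaced by whatever group 1 matched.
as_template = re.sub('([0-9])', replacement, '2')

# A function replacement is inserted as-is, with no escape processing.
as_literal = re.sub('([0-9])', lambda match: replacement, '2')

print(as_template)
print(as_literal)
```

Passing a function is one way to insert text literally; which behaviour is wanted depends on what the stored value is supposed to mean.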

Note that you can make \ stop being an escape character by setting standard_conforming_strings to on.
