Is there a way to declare a string variable in Python such that everything inside of it is automatically escaped, or has its literal character value?
I'm not asking how to escape the quotes with slashes, that's obvious. What I'm asking for is a general purpose way for making everything in a string literal so that I don't have to manually go through and escape everything for very large strings.
Raw string literals:
>>> r'abc\dev\t'
'abc\\dev\\t'
If you're dealing with very large strings, specifically multiline strings, be aware of the triple-quote syntax:
a = r"""This is a multiline string
with more than one line
in the source code."""
There is no such thing. It looks like you want something like "here documents" in Perl and the shells, but Python doesn't have that.
Using raw strings or multiline strings only means that there are fewer things to worry about. If you use a raw string then you still have to work around a terminal "\" and with any string solution you'll have to worry about the closing ", ', ''' or """ if it is included in your data.
That is, there's no way to have the string
' ''' """ " \
properly stored in any Python string literal without internal escaping of some sort.
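A minimal sketch of those limits, assuming Python 3 (the variable names are only illustrative). It shows what a raw literal does and does not save you from:

```python
# A raw literal keeps every backslash as an ordinary character.
raw = r'abc\dev\t'
assert raw == 'abc\\dev\\t'
assert len(raw) == 9

# A raw literal cannot end in a single backslash, so one workaround is to
# append the final backslash from a separate, regular literal.
trailing = r'C:\temp' '\\'
assert trailing.endswith('\\')

# The awkward string from above still needs some escaping in source code.
tricky = "' ''' \"\"\" \" \\"
print(tricky)
```

Adjacent string literals are concatenated at compile time, which is why the `r'...' '\\'` form works.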
You will find Python's string literal documentation here:
http://docs.python.org/tutorial/introduction.html#strings
and here:
http://docs.python.org/reference/lexical_analysis.html#literals
The simplest example would be using the 'r' prefix:
ss = r'Hello\nWorld'
print(ss)
Hello\nWorld
(Assuming you are not required to input the string from directly within Python code)
To get around the issue Andrew Dalke pointed out, simply type the literal string into a text file and then read it back:
input_ = '/directory_of_text_file/your_text_file.txt'
with open(input_, 'r') as input_open:
    input_string = input_open.read()
print(input_string)
This will print the literal text of whatever is in the text file, even if it is:
' ''' """ " \
Not fun or optimal, but can be useful, especially if you have 3 pages of code that would’ve needed character escaping.
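Here is a self-contained sketch of the same idea, assuming Python 3. The file name and its content are hypothetical stand-ins for a file you would normally type by hand; the sketch writes the file first only so that it can run on its own:

```python
import tempfile
from pathlib import Path

# Stand-in for the hand-typed file content (hypothetical example text).
awkward = "' ''' \"\"\" \" \\"

with tempfile.TemporaryDirectory() as tmp:
    text_file = Path(tmp) / "your_text_file.txt"
    text_file.write_text(awkward, encoding="utf-8")

    # Reading a file applies no escape processing: what is in the file
    # is exactly what ends up in the string.
    input_string = text_file.read_text(encoding="utf-8")

print(input_string)
```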
Use print and repr:
>>> s = '\tgherkin\n'
>>> s
'\tgherkin\n'
>>> print(s)
gherkin
>>> repr(s)
"'\\tgherkin\\n'"
# print(repr(..)) gets literal
>>> print(repr(s))
'\tgherkin\n'
>>> repr('\tgherkin\n')
"'\\tgherkin\\n'"
>>> print('\tgherkin\n')
gherkin
>>> print(repr('\tgherkin\n'))
'\tgherkin\n'
Related
Let's say I have a multiline string in Python; I would like to convert it to a single-line representation, where the line endings are written as \n escapes (and tabs as \t etc) - say, in order to use it as some command line argument. So far, I figured that pprint.pformat can be used for the purpose - but the problem is, I cannot convert from this single-line representation back to a "proper" multi-line string; here's an example:
import string
import pprint
MYSTRTEMP = """Hello $FNAME!
I am writing this.
Just to test.
"""
print("--Testing multiline:")
print(MYSTRTEMP)
print("--Testing single-line (escaped) representation:")
testsingle = pprint.pformat(MYSTRTEMP)
print(testsingle)
# http://stackoverflow.com/questions/12768107/string-substitutions-using-templates-in-python
MYSTR = string.Template(testsingle).substitute({'FNAME': 'Bobby'})
print("--Testing single-line replaced:")
print(MYSTR)
print("--Testing going back to multiline - cannot:")
print("%s"%(MYSTR))
This example with Python 2.7 outputs:
$ python test.py
--Testing multiline:
Hello $FNAME!
I am writing this.
Just to test.
--Testing single-line (escaped) representation:
'Hello $FNAME!\n\nI am writing this.\n\nJust to test.\n'
--Testing single-line replaced:
'Hello Bobby!\n\nI am writing this.\n\nJust to test.\n'
--Testing going back to multiline - cannot:
'Hello Bobby!\n\nI am writing this.\n\nJust to test.\n'
One problem is that the single-line representation seems to include the ' single quotes in the string itself - second problem is that I cannot go back from this representation back to a proper multiline string.
Is there a standard method in Python to achieve this, such that - as in the example - I could go from multi-line to escaped single-line, then do templating, then convert the templated single-line back to a multi-line representation?
To go from the repr of a string (which is what pformat gives you for a string) to the actual string, you can use ast.literal_eval:
>>> import ast
>>> repr(MYSTRTEMP)
"'Hello $FNAME!\\n\\nI am writing this.\\n\\nJust to test.\\n'"
>>> ast.literal_eval(repr(MYSTRTEMP))
'Hello $FNAME!\n\nI am writing this.\n\nJust to test.\n'
Converting to repr just to convert back is probably not a good way to accomplish your original goal, but this is how you would do it.
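Putting the whole round trip together, as a rough sketch assuming Python 3 (the template text is shortened from the question, and the variable names are mine):

```python
import ast
import string

MYSTRTEMP = "Hello $FNAME!\nI am writing this.\nJust to test.\n"

# Escaped single-line form; repr() adds the surrounding quotes.
single_line = repr(MYSTRTEMP)

# Template substitution works on the escaped text like on any other string.
templated = string.Template(single_line).substitute({'FNAME': 'Bobby'})

# literal_eval parses the quoted, escaped text back into the real string.
restored = ast.literal_eval(templated)
print(restored)
```

Note that this only holds as long as the substituted values contain no quotes or backslashes; otherwise the templated text may no longer be a valid string literal.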
New to python and I am learning this tutorial:
http://learnpythonthehardway.org/book/ex8.html
I just cannot see why the line "But it didn't sing." got printed with double quotes while all the others got printed with single quotes. I cannot see any difference in the code.
The quotes depend on the string: if there are no quotes, it will use single quotes:
>>> """no quotes"""
'no quotes'
if there is a single quote, it will use double quotes:
>>> """single quote:'"""
"single quote:'"
if there is a double quote, it will use single quotes:
>>> """double quote:" """
'double quote:" '
if there are both, it will use single quotes, hence escaping the single one:
>>> """mix quotes:'" """
'mix quotes:\'" '
>>> """mix quotes:"' """
'mix quotes:"\' '
>>> '''mix quotes:"' '''
'mix quotes:"\' '
There won't be a difference though when you print the string:
>>> print '''mix quotes:"' '''
mix quotes:"'
The surrounding quotes belong to the representation of the string:
>>> print str('''mix quotes:"' ''')
mix quotes:"'
>>> print repr('''mix quotes:"' ''')
'mix quotes:"\' '
You might want to check the python tutorial on strings.
The representation of a value should be equivalent to the Python code required to generate it. Since the string "But it didn't sing." contains a single quote, using single quotes to delimit it would create invalid code. Therefore double quotes are used instead.
Python has several rules for outputting the repr of strings.
Normally, it uses ' to surround them, except if there are 's within it - then it uses " to avoid the need for escaping.
If a string contains both ' and " characters, it uses ' and escapes the ' characters inside.
As there can be several valid and equivalent representations of a string, these rules might change from version to version.
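A quick way to see these rules in action; this is a sketch that reflects how current CPython 3 behaves, and the sample strings are made up:

```python
samples = [
    "no quotes",
    "didn't",
    'say "hi"',
    "both ' and \"",
]

# Collect the representation Python chooses for each sample.
reprs = [repr(s) for s in samples]

for text, shown in zip(samples, reprs):
    # print(text) shows the string itself; print(shown) shows its repr.
    print(shown)
```

Only the second sample comes back wrapped in double quotes; the last one is wrapped in single quotes with the inner single quote escaped.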
BTW, in the site you linked to the answer is given as well:
Q: Why does %r sometimes print things with single-quotes when I wrote them with double-quotes?
A: Python is going to print the strings in the most efficient way it can, not replicate exactly the way you wrote them. This is perfectly fine since %r is used for debugging and inspection, so it's not necessary that it be pretty.
I'm trying to find a way to print a string in raw form from a variable. For instance, if I add an environment variable to Windows for a path, which might look like 'C:\\Windows\Users\alexb\', I know I can do:
print(r'C:\\Windows\Users\alexb\')
But I can't put an r in front of a variable... for instance:
test = 'C:\\Windows\Users\alexb\'
print(rtest)
Clearly would just try to print rtest.
I also know there's
test = 'C:\\Windows\Users\alexb\'
print(repr(test))
But this returns 'C:\\Windows\\Users\x07lexb'
as does
test = 'C:\\Windows\Users\alexb\'
print(test.encode('string-escape'))
So I'm wondering if there's any elegant way to make a variable holding that path print RAW, still using test? It would be nice if it was just
print(raw(test))
But it's not.
I had a similar problem and stumbled upon this question, and now know, thanks to Nick Olson-Harris' answer, that the solution lies in changing the string.
Two ways of solving it:
Get the path you want using native python functions, e.g.:
import os
test = os.getcwd() # In case the path in question is your current directory
print(repr(test))
This makes it platform independent and it now works with .encode. If this is an option for you, it's the more elegant solution.
If your string is not a path, define it in a way compatible with python strings, in this case by escaping your backslashes:
test = 'C:\\Windows\\Users\\alexb\\'
print(repr(test))
In general, to make a raw string out of a string variable, I use this:
string = "C:\\Windows\Users\alexb"
raw_string = r"{}".format(string)
output:
'C:\\\\Windows\\Users\\alexb'
You can't turn an existing string "raw". The r prefix on literals is understood by the parser; it tells it to ignore escape sequences in the string. However, once a string literal has been parsed, there's no difference between a raw string and a "regular" one. If you have a string that contains a newline, for instance, there's no way to tell at runtime whether that newline came from the escape sequence \n, from a literal newline in a triple-quoted string (perhaps even a raw one!), from calling chr(10), by reading it from a file, or whatever else you might be able to come up with. The actual string object constructed from any of those methods looks the same.
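To make that concrete, here is a small sketch (assuming Python 3.6+ for the f-string; the names are illustrative). It shows that strings built in different ways are indistinguishable, and that pushing an existing string through a raw template leaves it unchanged:

```python
# Three ways of producing the same two-line string.
regular = "line one\nline two"                 # escape sequence
from_chr = "line one" + chr(10) + "line two"   # chr(10) is the newline
triple_raw = r"""line one
line two"""                                    # literal newline in a raw string
same = regular == from_chr == triple_raw

# The r prefix only affects how the *literal* is parsed. The templates below
# contain no backslashes, so r"..." and "..." are the same string, and
# formatting a variable through them changes nothing.
s = "tab\there"
via_format = r"{}".format(s)
via_percent = r"%s" % s
via_fstring = rf"{s}"

print(same, via_format == s, via_percent == s, via_fstring == s)
```

If what you want is the escaped text for display, `repr(s)` is the tool for that; it produces a new string rather than a "raw" version of the old one.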
I know I'm late with this answer, but for people reading this, I found a much easier way of doing it:
myVariable = 'This string is supposed to be raw \'
print(r'%s' %myVariable)
Try this, depending on what type of output you want. Sometimes you may not need the single quotes around the printed string.
test = "qweqwe\n1212as\t121\\2asas"
print(repr(test)) # output: 'qweqwe\n1212as\t121\\2asas'
print( repr(test).strip("'")) # output: qweqwe\n1212as\t121\\2asas
Get rid of the escape characters before storing or manipulating the raw string:
You could change any backslashes of the path '\' to forward slashes '/' before storing them in a variable. The forward slashes don't need to be escaped:
>>> import os
>>> mypath = os.getcwd().replace('\\','/')
>>> os.path.exists(mypath)
True
>>>
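An alternative sketch using the standard library's pathlib, assuming Python 3; the path itself is a hypothetical example. `PureWindowsPath` parses Windows-style paths on any platform, and `as_posix()` returns the forward-slash form:

```python
from pathlib import PureWindowsPath

# Hypothetical Windows path, written as a raw literal so the
# backslashes are kept as-is.
win_path = PureWindowsPath(r"C:\Windows\Users\alexb")

# as_posix() gives the same path with forward slashes.
forward = win_path.as_posix()
print(forward)
```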
Just simply use r'string'. Hope this will help you as I see you haven't got your expected answer yet:
test = 'C:\\Windows\Users\alexb\'
rawtest = r'%s' %test
I have a variable assigned to a big, complex pattern string for use with the re module. It is concatenated with a few other strings, and in the end I want to print it, then copy it and check it on regex101.com.
But when I display it in interactive mode I get a double backslash - '\\w'
As @Jimmynoarms said:
The solution for Python 3.x:
print(r'%s' % your_variable_pattern_str)
Your particular string won't work as typed, because the backslash at the end escapes the closing quote, so the string literal is never closed.
Maybe I'm wrong on that one, because I'm still very new to Python, so please correct me if so. But after changing it slightly to adjust for that, the repr() function will do the job of reproducing any string stored in a variable as a raw string.
You can do it two ways:
>>> print("C:\\Windows\Users\alexb\\")
C:\Windows\Users\alexb\
>>> print(r"C:\\Windows\Users\alexb\\")
C:\\Windows\Users\alexb\\
Store it in a variable:
test = "C:\\Windows\Users\alexb\\"
Use repr():
>>> print(repr(test))
'C:\\Windows\Users\alexb\\'
or string replacement with %r
print("%r" %test)
'C:\\Windows\Users\alexb\\'
The string will be reproduced with single quotes though so you would need to strip those off afterwards.
To turn a variable to raw str, just use
rf"{var}"
r is raw and f is f-str; put them together and boom it works.
Replace backslashes with forward slashes using one of the below:
re.sub(r"\\", "/", x)
re.sub("\\\\", "/", x)
This does the trick
>>> repr(string)[1:-1]
Here is the proof
>>> repr("\n")[1:-1] == r"\n"
True
And it can be easily extrapolated into a function if need be
>>> raw = lambda string: repr(string)[1:-1]
>>> raw("\n")
'\\n'
I wrote a small function; it works for me:
def conv(strng):
    # Replace each control character with its two-character escape sequence.
    k = strng
    k = k.replace('\a', '\\a')
    k = k.replace('\b', '\\b')
    k = k.replace('\f', '\\f')
    k = k.replace('\n', '\\n')
    k = k.replace('\r', '\\r')
    k = k.replace('\t', '\\t')
    k = k.replace('\v', '\\v')
    return k
Here is a straightforward solution.
address = 'C:\Windows\Users\local'
directory ="r'"+ address +"'"
print(directory)
"r'C:\\Windows\\Users\\local'"
Ok on this link it shows the last line of output that has ' around everything except the third sentence and I do not know why. This bothered me at the beginning and thought it was just a weird mistake but its on the "extra credit" so now I am even more curious.
This is because the %r formatter prints the argument in the form you may use in source code, which, for strings, means that it is quote-delimited and escaped. For boolean values, this is just True or False. To print the string as it is, use %s instead.
>>> print '%s' % '"Hello, you\'re"'
"Hello, you're"
>>> print '%r' % '"Hello, you\'re"'
'"Hello, you\'re"'
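For Python 3, where print is a function, the same comparison can be sketched like this (the variable names are mine). It also shows the `!r` conversion, which is the equivalent of `%r` in `str.format` and f-strings:

```python
s = '"Hello, you\'re"'

as_text = '%s' % s              # the string itself
as_repr = '%r' % s              # its source-code representation
via_format = '{!r}'.format(s)   # same as repr(s)
via_fstring = f'{s!r}'          # same as repr(s)

print(as_text)
print(as_repr)
```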
python's repr() function, which is invoked by interpolating the %r formatting directive, has the approximate effect of printing objects the way they would appear in source code.
There are several ways to format strings in python source, using single or double quotes, with backslash escapes or as raw strings, as simple, single line strings or multi line strings (in any combination). Python picks only two ways to format strings, as single or double quoted, single line strings with escapes instead of raw.
Python makes a crude attempt at picking a minimal format, with a slight bias in favor of the single quote version (since that would be one fewer keystrokes on most keyboards).
The rules are very simple. If a string contains a single quote but no double quotes, Python prints the string as it would appear in Python source if it were double quoted. Otherwise it uses single quotes.
Some examples to illustrate. Note for simplicity all of the inputs use triple quotes to avoid backslash escapes.
>>> ''' Hello world '''
' Hello world '
>>> ''' "Hello world," he said. '''
' "Hello world," he said. '
>>> ''' You don't say? '''
" You don't say? "
>>> ''' "Can't we all just get along?" '''
' "Can\'t we all just get along?" '
>>>
I have a noob question.
I have a record in a table that looks like '\1abc'
I then use this string as a regex replacement in re.sub("([0-9])",thereplacement,"2")
I'm a little confused by the backslashes. The string I got back was "\\1abc"
Are you using Python interactively?
In a regular string you need to escape backslashes in your code, or use r"...". If you are running Python interactively and don't assign the results from your database to a variable, the value will be printed using its __repr__() method.
>>> s = "\\1abc"
>>> s
'\\1abc' # <-- How it's represented in Python code
>>> print s
\1abc # <-- The actual string
Also, your re.sub is a bit weird. Maybe you meant [0-9] as the pattern (matching a single digit)? The arguments are probably switched too, if thereplacement is your input. This is the syntax:
re.sub(pattern, repl, string, count=0)
So my guess is you expect something like this:
>>> s_in = yourDbMagic() # Which returns \1abc
>>> s_out = re.sub("[0-9]", "2", s_in)
>>> print s_in, s_out
\1abc \2abc
Edit: Tried to better explain escaping/representation.
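If the database value really is meant to be the replacement, as in the question, keep in mind that re.sub interprets backslashes in a replacement string. A small sketch, assuming Python 3; `repl_from_db` is a hypothetical stand-in for the value read from the table:

```python
import re

# Five characters: backslash, '1', 'a', 'b', 'c'.
repl_from_db = "\\1abc"

# As a replacement *string*, \1 is treated as a backreference to group 1,
# so the matched digit is re-inserted in front of 'abc'.
expanded = re.sub("([0-9])", repl_from_db, "2")

# Passing a function instead bypasses that processing, so the text is
# inserted literally, backslash included.
literal = re.sub("([0-9])", lambda m: repl_from_db, "2")

print(expanded)
print(literal)
```

The first call yields `2abc`, while the second yields the five-character text `\1abc`, which the interactive prompt would display as `'\\1abc'`.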
Note that you can make \ stop being an escape character by setting standard_conforming_strings to on.