Printing Single Quote inside the string - python

I want to output
XYZ's "ABC"
I tried the following 3 statements in Python IDLE.
1st and 2nd statement output a \ before '.
3rd statement with print function doesn't output \ before '.
Being new to Python, I wanted to understand why \ is output before ' in the 1st and 2nd statements.
>>> "XYZ\'s \"ABC\""
'XYZ\'s "ABC"'
>>> "XYZ's \"ABC\""
'XYZ\'s "ABC"'
>>> print("XYZ\'s \"ABC\"")
XYZ's "ABC"

Here are my observations about calling repr() on a string (the behaviour is the same in IDLE, the REPL, etc.):
If the string contains neither a single nor a double quote, repr() wraps it in single quotes. (Note: when you press Enter in the REPL, repr() is called on the result, not __str__, which is what the print function uses.)
If the string contains only ' or only ": there is no backslash in the output. The output is wrapped in " if the string contains ', and in ' if the string contains ".
If the string contains both ' and ": the output is wrapped in single quotes. The ' is escaped with a backslash, but the " is not.
Examples:
def print_it(s):
    print(repr(s))
    print("-----------------------------------")
print_it('Soroush')
print_it("Soroush")
print_it('Soroush"s book')
print_it("Soroush's book")
print_it('Soroush"s book and Soroush\' pen')
print_it("Soroush's book and Soroush\" pen")
output:
'Soroush'
-----------------------------------
'Soroush'
-----------------------------------
'Soroush"s book'
-----------------------------------
"Soroush's book"
-----------------------------------
'Soroush"s book and Soroush\' pen'
-----------------------------------
'Soroush\'s book and Soroush" pen'
-----------------------------------
So with that being said, the way to get your desired output is to have str() applied to the string rather than repr(), which is what print() does.
I know Soroush"s book is grammatically incorrect in English. I just want to put it inside an expression.
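To connect this with the original question, here is a minimal Python 3 sketch of the difference between what the interactive prompt echoes (repr()) and what print() shows (str()). The variable names are only illustrative.

```python
# The asker's target text, written with double-quote delimiters.
s = "XYZ's \"ABC\""

echoed = repr(s)   # what the REPL / IDLE shows when you just press Enter
printed = str(s)   # what print(s) writes

print(echoed)      # 'XYZ\'s "ABC"'
print(printed)     # XYZ's "ABC"
print(len(s))      # 11: the backslash is not part of the string
```

The backslash appears only in the repr() form, because that form has to be a valid Python literal wrapped in single quotes.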

I'm not sure what you want it to print.
Do you want it to output XYZ\'s \"ABC\" or XYZ's "ABC"?
The \ escapes the character that follows it (such as a quote), so if you want to print a literal \ the code needs two of them: \\.
string = "Im \\"
print(string)
Output: Im \
If you want to print double quotes, you can delimit the string with single quotes:
string = 'theres a "lot of "" in" my "" script'
print(string)
Output: theres a "lot of "" in" my "" script
Delimiting the string with single quotes lets you put double quotes inside it without escaping them.
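As a small illustration (a Python 3 sketch, with made-up variable names), the choice of delimiter only affects how the literal is typed, not the resulting string:

```python
# Three ways of typing the same text; only the delimiters and escapes differ.
with_single = 'XYZ\'s "ABC"'
with_double = "XYZ's \"ABC\""
with_triple = '''XYZ's "ABC"'''

print(with_single == with_double == with_triple)  # True
print(with_triple)                                # XYZ's "ABC"
```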

Related

take a list in python3 and make into a string but escape double quotes in order to pass as search parameter for azure search

I have an application in python that accepts a list of text (strings) that we want to use as search terms in Azure Cognitive Search. The search parameter needs to be a string, so if I have a list of words I can do something like:
words_to_search_list = ["toy", "durable"]
words_to_search_str = ' '.join(words_to_search_list)
and then pass words_to_search_str as the "search" parameter in Azure Search, and it can search for text that has "durable" or "toy".
"toy durable"
However, I am not sure how to handle situations where there are bigrams or trigrams in the words_to_search_list like here:
words_to_search_list = ["more toys", "free treats"]
In order to get back text from Azure that contains either "more toys" or "free treats" we'd need to pass the parameter like this:
"\"more toys\" \"free treats\""
Meaning the bigrams need to be in double quotes, but escaped.
I started this:
words_to_search_str = ""
for words in words_to_search_list:
    words_list = words.split()
    if len(words_list) > 1:
        words_escaped = '\\"' + words + '\\"'
        words_to_search_str += words_escaped
    else:
        words_to_search_str += words
But this makes words_to_search_str into the following:
'\\"more toys\\"\\"free treats\\"'
which is not what I want (the double backslashes won't work).
Is there any way to take that list of strings and end up with one string, but where the bigrams are each in (escaped) double quotes?
Edit: I'd like to add that in the solution I have here, if you print it, you get what looks to be the right object (single backslashes, not double), but the actual object still seems to have the double backslashes and they don't give the same result when you pass into the search parameter...
This should do it if you are running 3.6+:
words_to_search_list = [
"toy", "durable", "more toys", "free treats", "big durable toys"
]
words_to_search_str = '\"search\": \"' + ' '.join([
f'\\"{word}\\"' if ' ' in word else word for word in words_to_search_list
]) + '\"'
print(words_to_search_str)
If not, try:
words_to_search_list = [
"toy", "durable", "more toys", "free treats", "big durable toys"
]
words_to_search_str = '\"search\": \"' + ' '.join([
'\\"{}\\"'.format(word) if ' ' in word else word for word in words_to_search_list
]) + '\"'
print(words_to_search_str)
When you display a string in the interpreter, you basically see how you would enter that string in Python syntax. The double backslash isn't really a double backslash: just as you wrote a double backslash in your Python code to indicate one actual backslash by escaping it, Python does the same when it shows the string. That's also why the double quotes are not escaped: Python is showing the string in single quotes. I hope that was helpful.
The following should give you that format:
words_to_search_list = ["more toys", "free treats"]
updated_words = ['\\"{}\\"'.format(words) for words in words_to_search_list]
words_to_search_str = '"{}"'.format(' '.join(updated_words))
print(words_to_search_str)
The problem arises from escaping the \ itself: words_escaped='\\"'+ words + '\\"'
You should escape only the ", like this:
words_escaped='\"'+ words + '\"'
That should produce the anticipated result.
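One way to see where the backslashes belong is the sketch below. It assumes the search text ends up in a JSON request body, which is an assumption about how the query is sent and not something stated in the question. The helper name build_search_string is hypothetical. The Python string itself holds plain double quotes, and the escaping is added only when the value is serialized.

```python
import json

def build_search_string(terms):
    # Wrap multi-word terms in plain double quotes; single words stay bare.
    return " ".join('"{}"'.format(t) if " " in t else t for t in terms)

search = build_search_string(["toy", "more toys", "free treats"])
print(search)    # toy "more toys" "free treats"

# Serializing to JSON adds the backslashes; they are not stored in `search`.
payload = json.dumps({"search": search})
print(payload)   # {"search": "toy \"more toys\" \"free treats\""}
```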

How to replace single quotes within a string with quoted double quotes

For example:
Input:
{'match_all':{}}
Output:
{'"match_all"':{}}
Is there some regex that can do this?
I know I could iterate through the string and, whenever I encounter a key, replace the quote on each side of it with '" and "' respectively; however, I was wondering whether any of you knew a more Pythonic way of doing this.
Why not try this method: https://www.tutorialspoint.com/python/string_replace.htm and use it to replace the first ' with '" and the second ' with "'...
str = "this is string example....wow!!! this is really string"
print str.replace("is", "was")
print str.replace("is", "was", 3)
the output returns:
thwas was string example....wow!!! thwas was really string
thwas was string example....wow!!! thwas is really string
print str.replace("'", "'\"")
print str.replace("'", "\"'", 1)
use ' and " (escaped as \" where needed) to avoid errors...
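Since the question asks about a regex, here is a minimal Python 3 sketch of that route. The function name quote_keys is made up for illustration, and the pattern assumes that single-quoted runs never contain a nested single quote.

```python
import re

def quote_keys(text):
    # Find each '...' run and put double quotes just inside the single quotes.
    return re.sub(r"'([^']*)'", lambda m: "'\"" + m.group(1) + "\"'", text)

result = quote_keys("{'match_all':{}}")
print(result)   # {'"match_all"':{}}
```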

How to print "\" in python?

print "\\"
It print me in console...
But I want to get the string \
How do I get the string \?
There's clearly some sort of configuration with your console that's wrong. Doing this:
print "\\"
Clearly prints \ for me.
You can add the letter r before the opening quote of the string (r for raw). Backslash escape sequences in it are then left uninterpreted.
For example
>>> print '\x63\\ Hello \n\n8'
c\ Hello

8
>>> print r'\x63\\ Hello \n\n8'
\x63\\ Hello \n\n8
Note, however, that a raw string cannot end in a single backslash: r'\' is a syntax error. So to print one backslash, use print '\\'.
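For reference, a short Python 3 sketch of how an escaped backslash and a raw string behave (the variable names are illustrative only):

```python
one_backslash = "\\"          # the literal needs two backslashes for one character
print(one_backslash)          # \
print(len(one_backslash))     # 1
print(repr(one_backslash))    # '\\'

# In a raw string, backslashes are kept as typed.
raw_path = r"C:\new\table"
print(raw_path)                      # C:\new\table
print(raw_path == "C:\\new\\table")  # True
```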

Remove all line breaks from a long string of text

Basically, I'm asking the user to input a string of text into the console, but the string is very long and includes many line breaks. How would I take the user's string and delete all line breaks to make it a single line of text. My method for acquiring the string is very simple.
string = raw_input("Please enter string: ")
Is there a different way I should be grabbing the string from the user? I'm running Python 2.7.4 on a Mac.
P.S. Clearly I'm a noob, so even if a solution isn't the most efficient, the one that uses the most simple syntax would be appreciated.
How do you enter line breaks with raw_input? But, once you have a string with some characters in it you want to get rid of, just replace them.
>>> mystr = raw_input('please enter string: ')
please enter string: hello world, how do i enter line breaks?
>>> # pressing enter didn't work...
...
>>> mystr
'hello world, how do i enter line breaks?'
>>> mystr.replace(' ', '')
'helloworld,howdoienterlinebreaks?'
>>>
In the example above, I replaced all spaces. The string '\n' represents newlines. And \r represents carriage returns (if you're on windows, you might be getting these and a second replace will handle them for you!).
basically:
# you probably want to use a space ' ' to replace `\n`
mystring = mystring.replace('\n', ' ').replace('\r', '')
Note also, that it is a bad idea to call your variable string, as this shadows the module string. Another name I'd avoid but would love to use sometimes: file. For the same reason.
You can try using string replace:
string = string.replace('\r', '').replace('\n', '')
You can split the string with no separator arg, which will treat consecutive whitespace as a single separator (including newlines and tabs). Then join using a space:
In : " ".join("\n\nsome text \r\n with multiple whitespace".split())
Out: 'some text with multiple whitespace'
https://docs.python.org/2/library/stdtypes.html#str.split
The canonical answer, in Python, would be:
s = ''.join(s.splitlines())
It splits the string into lines (letting Python do it according to its own best practices). Then you join them back together. Two possibilities here:
replace the newline by a whitespace (' '.join())
or without a whitespace (''.join())
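The two possibilities can be sketched as follows in Python 3 (the sample text is made up for illustration):

```python
text = "first line\nsecond line\r\nthird line"

lines = text.splitlines()   # handles \n, \r\n and \r alike
spaced = " ".join(lines)    # line breaks become single spaces
glued = "".join(lines)      # line breaks are simply dropped

print(spaced)   # first line second line third line
print(glued)    # first linesecond linethird line
```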
Update, based on Xbello's comment: if you only need to drop line breaks at the end of the string, rstrip is enough:
string = my_string.rstrip('\r\n')
Another option is regex:
>>> import re
>>> re.sub("\n|\r", "", "Foo\n\rbar\n\rbaz\n\r")
'Foobarbaz'
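If your text contains the literal two-character sequences \n and \r (a backslash followed by a letter) rather than real line breaks, use the raw strings r'\n' and r'\r' instead of '\n' and '\r' with replace: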
mystring = mystring.replace(r'\n', ' ').replace(r'\r', '')
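A quick Python 3 check of what the raw-string form actually matches (the sample strings are illustrative): r'\n' is a backslash followed by the letter n, so it leaves real line breaks untouched.

```python
real_break = "a\nb"       # 3 characters: a, newline, b
literal_seq = r"a\nb"     # 4 characters: a, backslash, n, b

print(len(real_break), len(literal_seq))             # 3 4
print(real_break.replace("\n", " "))                 # a b
print(real_break.replace(r"\n", " ") == real_break)  # True: nothing matched
print(literal_seq.replace(r"\n", " "))               # a b
```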
A method that takes into consideration:
additional whitespace characters at the beginning/end of the string
additional whitespace characters at the beginning/end of every line
various end-of-line characters
It takes a multi-line string, which may be messy, e.g.
test_str = '\nhej ho \n aaa\r\n a\n '
and produces a nice one-line string
>>> ' '.join([line.strip() for line in test_str.strip().splitlines()])
'hej ho aaa a'
UPDATE:
To fix multiple new-line character producing redundant spaces:
' '.join([line.strip() for line in test_str.strip().splitlines() if line.strip()])
This works for the following too
test_str = '\nhej ho \n aaa\r\n\n\n\n\n a\n '
Regular expressions are another fast way to do this:
import re
s='''some kind of
string with a bunch\r of
extra spaces in it'''
re.sub(r'\s(?=\s)','',re.sub(r'\s',' ',s))
result:
'some kind of string with a bunch of extra spaces in it'
The problem with rstrip() is that it only removes line breaks at the end of the string, so it does not work in all cases (I have run into a few myself). Instead you can use
text = text.replace("\n"," ")
This replaces every newline '\n' with a space.
You don't necessarily need to remove ALL of the line-break characters (LF, CR, CRLF).
# As Python regex patterns:
r'\n', r'\r', r'\r\n'
Some texts must keep their breaks, but you probably want to join broken lines so that each sentence stays together.
It is natural for a line break to come after a period, semicolon, or colon, but not after a comma.
My code takes these conditions into account. It works well with text copied from PDFs.
Enjoy!:
import re

def unbreak_pdf_text(raw_text):
    """Join lines that were broken in the middle of a sentence.

    Args:
        raw_text (str): string containing unwanted newline signs: \\n or \\r or \\r\\n,
            e.g. imported from OCR or copied from a pdf document.

    Returns:
        str: the text in which every line break that follows a comma, a space,
            or a word character is replaced by a single space.
    """
    # Try \r\n first so that a Windows line break is consumed as one unit.
    pat = re.compile(r"([, \w])(?:\r\n|\r|\n)")
    # Keep the character before the break and put a space where the break was.
    return pat.sub(r"\1 ", raw_text)

Escape string and split it right after

I have the following code:
import re
key = re.escape('#one #two #some #tests #are #done')
print(key)
key = key.split()
print(key)
and the following output:
\#one\ \#two\ \#some\ \#tests\ \#are\ \#done
['\\#one\\', '\\#two\\', '\\#some\\', '\\#tests\\', '\\#are\\', '\\#done']
Why are the backslashes duplicated? I want just one of each in my list, because I would like to use this list in a regular expression.
Thanks in advance! John
There is only one backslash each, but when printing the repr of the strings, they are duplicated (escaped) - just as you would need to duplicate them when using a string to build a regex. So everything is fine.
For example:
>>> len("\\")
1
>>> len("\\n")
2
>>> len("\n")
1
>>> print "\\n"
\n
>>> print "\n"
>>>
The \ character is an escape character, that is a character that changes the meaning of the subsequent character[s]. For example the "n" character is simply an "n". But if you escape it like "\n" it becomes the "newline" character. So, if you need to use a \ literal, you need to escape it with... itself: \\
The backslashes are not duplicated. To realize this, try to do:
for element in key:
    print element
And you will see this output:
\#one\
\#two\
\#some\
\#tests\
\#are\
\#done
When you printed the whole list, Python used the representation in which strings are shown not as they are but as Python expressions (notice the quotes around them, which are not part of the strings).
To write a string literal containing a backslash, you need to double that backslash. That is all.
When you convert a list to a string (e.g. to print it), it calls repr on each object contained in the list. That's why you get the quotes and extra backslashes in your second line of output. Try this:
s = "\\a string with an escaped backslash"
print s # prints: \a string with an escaped backslash
print repr(s) # prints: '\\a string with an escaped backslash'
The repr call puts quotes around the string, and shows the backslash escapes.
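The same effect can be reproduced in Python 3 with a small made-up list (the names here are illustrative and not from the question):

```python
items = ["\\#one\\", "\\#two"]   # each "\\" in the literal is one backslash

list_text = str(items)           # str() of a list uses repr() on each element
print(list_text)                 # ['\\#one\\', '\\#two']

for item in items:
    print(item)                  # \#one\  and then  \#two

print(len(items[0]))             # 6
```

The doubled backslashes exist only in the list's text form. Each element still contains single backslashes, so it can be used in a regular expression as is.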
