I need to store a string that contains ' in a dictionary, with every ' replaced by \".
An example is shown below.
code:
ss = "{'userName': {'suffix': None}"
print ss
print ss.replace("'", '\\"')
temp = dict()
temp["key"] = ss.replace("'", '\\"')
print str(temp)
output:
{'userName': {'suffix': None}
{\"userName\": {\"suffix\": None}
{'key': '{\\"userName\\": {\\"suffix\\": None}'}
Please let me know if anyone has a solution or an alternative for this.
You are looking at the repr() representation of a string. This is normal. A string representation uses escape sequences for non-printable characters and for anything else that requires escaping, such as a backslash.
Python containers show their contents, when printed, as string representations for debugging purposes. The resulting string representation is reusable as a string literal: you can paste it right back into Python and it will produce the same value.
Print individual values if you want to see the output unescaped:
print temp["key"]
and if you feel so inclined, compare that with the repr() result of the string:
print repr(temp["key"])
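To make the difference concrete, here is a minimal sketch (written for Python 3, whereas the question uses Python 2 print statements) that counts the backslashes actually stored and the ones shown by repr(). The variable names backslashes_in_value, backslashes_in_repr and round_tripped are only illustrative.

```python
import ast

ss = "{'userName': {'suffix': None}"
temp = {"key": ss.replace("'", '\\"')}
value = temp["key"]

# The stored string has exactly one backslash before each double quote.
print(value)        # the raw characters
print(repr(value))  # the literal form, with each backslash doubled

backslashes_in_value = value.count("\\")
backslashes_in_repr = repr(value).count("\\")

# Pasting the repr back into Python reproduces the same value.
round_tripped = ast.literal_eval(repr(value))
```

The doubled backslashes exist only in the displayed form; the value held in the dictionary is the one printed by the first print call.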
I want to strip some unwanted symbols from my variable. In this case the symbols are backslashes. I am using hex values, and I show some short example code below. I don't want Python to convert my hex to ASCII; how can I prevent this from happening? I have some long shellcode strings for asm to work with later, and removing the \ characters by hand is a slow process. I know there are other ways, like using echo -e "x\x\x\x" > output, but my whole script will be written in Python.
Thanks
>>> a = "\x31\xC0\x50\x68\x74\x76"
>>> b = a.strip("\\")
>>> print b
1�Phtv
>>> a = "\x31\x32\x33\x34\x35\x36"
>>> b = a.strip("\\")
>>> print b
123456
At the end I would like it to print my var:
>>> print b
x31x32x33x34x35x36
There are no backslashes in your variable:
>>> a = "\x31\xC0\x50\x68\x74\x76"
>>> print(a)
1ÀPhtv
Take newline, for example: writing "\n" in Python gives you a string with one character (a newline) and no backslashes. See the string literals docs for the full syntax of these escapes.
Now, if you really want to write a string containing such backslashes, you can do it with the r prefix:
>>> a = r"\x31\xC0\x50\x68\x74\x76"
>>> print(a)
\x31\xC0\x50\x68\x74\x76
>>> print(a.replace('\\', ''))
x31xC0x50x68x74x76
But if you want to convert a regular string to hex-coded symbols, you can do it character by character, converting it to number ("\x31" == "1" --> 49), then to hex ("0x31"), and finally stripping the first character:
>>> a = "\x31\xC0\x50\x68\x74\x76"
>>> print(''.join([hex(ord(x))[1:] for x in a]))
x31xc0x50x68x74x76
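One caveat with hex(ord(x))[1:] is that it does not zero-pad, so a character such as "\x05" comes out as x5. A possible variant, sketched here for Python 3 under the assumption that every character has a code point below 256, uses a format specifier instead. The helper name to_x_hex is made up for this example.

```python
def to_x_hex(text):
    """Render each character of `text` as 'x' plus two hex digits."""
    # Assumes every character is below 0x100; larger code points
    # would simply produce more than two digits.
    return "".join("x%02x" % ord(ch) for ch in text)

shellcode = "\x31\xC0\x50\x68\x74\x76"
print(to_x_hex(shellcode))
```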
There are two problems in your code.
First, the simple one:
strip() only removes matching characters from the start and end of the string, not from the middle. So you should use replace("\\", ""). This replaces every backslash with "", which is the same as removing it.
The second problem is Python's behavior with backslashes:
To get your example working, you need to put an 'r' in front of your string to indicate that it is a raw string: a = r"\x31\xC0\x50\x68\x74\x76". In raw strings, a backslash doesn't escape a character but simply stays a backslash.
>>> r"\x31\xC0\x50\x68\x74\x76"
'\\x31\\xC0\\x50\\x68\\x74\\x76'
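Both points can be checked with a short sketch (Python 3 syntax; the variable names are only for illustration). It compares a raw string with a regular one, and strip() with replace():

```python
raw = r"\x31\x32\x33"      # 12 characters, three of them backslashes
cooked = "\x31\x32\x33"    # 3 characters: "123"

stripped = raw.strip("\\")        # only touches the ends of the string
replaced = raw.replace("\\", "")  # removes every backslash

print(len(raw), len(cooked))
print(stripped)
print(replaced)
```

Here strip() removes only the leading backslash, while replace() removes all three.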
For example:
Input:
{'match_all':{}}
Output:
{'"match_all"':{}}
Is there some regex that can do this?
I know I could iterate through the string and, whenever I encounter a key, replace its opening quote with '" and its closing quote with "'; however, I was wondering whether any of you know a more Pythonic way of doing this.
Why not try this method: https://www.tutorialspoint.com/python/string_replace.htm and replace the first ' with '" and the second ' with "'...
str = "this is string example....wow!!! this is really string"
print str.replace("is", "was")
print str.replace("is", "was", 3)
the output returns:
thwas was string example....wow!!! thwas was really string
thwas was string example....wow!!! thwas is really string
print str.replace("'", "'\"")
print str.replace("'", "\"'", 1)
Escape or alternate ' and " as needed to avoid syntax errors...
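Since the question asked for a regex, here is one possible sketch (Python 3). It assumes that a key is any single-quoted token immediately followed by a colon and that keys contain no embedded single quotes. The names KEY and quote_keys are hypothetical.

```python
import re

# A single-quoted token that is immediately followed by a colon, i.e. a key.
KEY = re.compile(r"'([^']*)'(?=\s*:)")

def quote_keys(text):
    """Wrap the contents of each single-quoted key in double quotes."""
    return KEY.sub(lambda m: "'\"" + m.group(1) + "\"'", text)

print(quote_keys("{'match_all':{}}"))
```

Single-quoted values are left alone because they are not followed by a colon, although a value that happens to contain ': would confuse this pattern.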
I have a number of strings from which I am trying to remove characters using replace. However, this doesn't seem to work. To give a simplified example, this code:
row = "b'James Bray,/citations?user=8IqSrdIAAAAJ&hl=en&oe=ASCII,1985,6020,188.12,42,1.31,76,2.38'"
row = row.replace("b'", "").replace("'", "").replace('b"', '').replace('"', '')
print(row.encode('ascii', errors='ignore'))
still outputs b'James Bray,/citations?user=8IqSrdIAAAAJ&hl=en&oe=ASCII,1985,6020,188.12,42,1.31,76,2.38' whereas I would like it to output James Bray,/citations?user=8IqSrdIAAAAJ&hl=en&oe=ASCII,1985,6020,188.12,42,1.31,76,2.38. How can I do this?
Edit: Updated the code with a better example.
You seem to be mistaking single quotes for double quotes. Simply replace 'b:
>>> row = "xyz'b"
>>> row.replace("'b", "")
'xyz'
As an alternative to str.replace, you can simply slice the string to remove the unwanted leading and trailing characters:
>>> row[2:-1]
'James Bray,/citations?user=8IqSrdIAAAAJ&hl=en&oe=ASCII,1985,6020,188.12,42,1.31,76,2.38'
In your first .replace, change b' to 'b. Hence your code should be:
>>> row = "xyz'b"
>>> row = row.replace("'b", "").replace("'", "").replace('b"', '').replace('"', '')
# ^ changed here
>>> print(row.encode('ascii', errors='ignore'))
xyz
I am assuming the rest of the conditions are part of other tasks/matches that you didn't mention here.
If all you want is to take the string before the first ', then you may just do:
row.split("'")[0]
You haven't listed this replacement, which removes 'b:
.replace("'b", '')
import ast
row = "b'James Bray,/citations?user=8IqSrdIAAAAJ&hl=en&oe=ASCII,1985,6020,188.12,42,1.31,76,2.38'"
b_string = ast.literal_eval(row)
print(b_string)
u_string = b_string.decode('utf-8')
print(u_string)
out:
b_string:b'James Bray,/citations?user=8IqSrdIAAAAJ&hl=en&oe=ASCII,1985,6020,188.12,42,1.31,76,2.38'
u_string: James Bray,/citations?user=8IqSrdIAAAAJ&hl=en&oe=ASCII,1985,6020,188.12,42,1.31,76,2.38
The real question is how to convert a string to a Python object.
You have a string that contains the literal of a bytes (binary) string. To convert it to Python's bytes object, you could use eval(), but ast.literal_eval() is a safer way to do it.
Now that you have a bytes string, you can convert it to a Unicode string, which does not start with "b", by using decode().
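As a side note on why the code in the question still printed a b'...' wrapper: assuming Python 3, str.encode() returns a bytes object, and printing bytes always shows the b'...' form, whatever was replaced beforehand. A minimal sketch with a shortened, made-up row value:

```python
row = "James Bray,1985,6020"

encoded = row.encode("ascii", errors="ignore")  # bytes, not str
decoded = encoded.decode("ascii")               # back to str

print(type(encoded).__name__, repr(encoded))
print(decoded)
```

Printing the str (or the decoded bytes) rather than the encoded bytes avoids the wrapper.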
I have the following code:
import re
key = re.escape('#one #two #some #tests #are #done')
print(key)
key = key.split()
print(key)
and the following output:
\#one\ \#two\ \#some\ \#tests\ \#are\ \#done
['\\#one\\', '\\#two\\', '\\#some\\', '\\#tests\\', '\\#are\\', '\\#done']
How come the backslashes are duplicated? I just want them once in my list, because I would like to use this list in a regular expression.
Thanks in advance! John
There is only one backslash each, but when printing the repr of the strings, they are duplicated (escaped) - just as you would need to duplicate them when using a string to build a regex. So everything is fine.
For example:
>>> len("\\")
1
>>> len("\\n")
2
>>> len("\n")
1
>>> print "\\n"
\n
>>> print "\n"
>>>
The \ character is an escape character, that is a character that changes the meaning of the subsequent character[s]. For example the "n" character is simply an "n". But if you escape it like "\n" it becomes the "newline" character. So, if you need to use a \ literal, you need to escape it with... itself: \\
The backslashes are not duplicated. To realize this, try to do:
for element in key:
    print element
And you will see this output:
\#one\
\#two\
\#some\
\#tests\
\#are\
\#done
When you printed the whole list, Python used a representation in which strings are shown not as they are but as Python expressions (notice the quotes; they are not part of the strings).
To write a string literal containing a backslash, you need to double that backslash. That is all.
When you convert a list to a string (e.g. to print it), it calls repr on each object contained in the list. That's why you get the quotes and extra backslashes in your second line of output. Try this:
s = "\\a string with an escaped backslash"
print s # prints: \a string with an escaped backslash
print repr(s) # prints: '\\a string with an escaped backslash'
The repr call puts quotes around the string, and shows the backslash escapes.
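Since the stated goal was to use the list in a regular expression, one possible approach (a sketch in Python 3, not the only way) is to split first and escape each element afterwards. This avoids the trailing backslash that each element picks up when the spaces are escaped before splitting. The sample text passed to findall is made up.

```python
import re

tags = "#one #two #some #tests #are #done".split()

# Escape each tag on its own, then join them into one alternation.
pattern = re.compile("|".join(re.escape(tag) for tag in tags))

found = pattern.findall("notes: #one, #three and #done")
print(found)
```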
By definition, JSON strings are wrapped in double quotes.
In fact:
json.loads('{"v":1}') #works
json.loads("{'v':1}") #doesn't work
But how do I deal with the second statement?
I'm looking for a solution other than eval or replace.
Thanks.
If you get malformed JSON, why not just replace the single quotes with double quotes before calling
json.loads
If you cannot fix the other side you will have to convert invalid JSON into valid JSON. I think the following treats escaped characters properly:
import json
import re

def fixEscapes(value):
    # Replace \' by '
    value = re.sub(r"[^\\]|\\.", lambda match: "'" if match.group(0) == "\\'" else match.group(0), value)
    # Replace " by \"
    value = re.sub(r"[^\\]|\\.", lambda match: '\\"' if match.group(0) == '"' else match.group(0), value)
    return value

text = "{'vt\"e\\'st':1}"
# Rewrite every single-quoted string as a double-quoted JSON string
text = re.sub(r"'(([^\\']|\\.)+)'", lambda match: '"%s"' % fixEscapes(match.group(1)), text)
print(json.loads(text))
Not sure if I got your requirements right, but are you looking for something like this?
def fix_json(string_):
    if string_[0] == string_[-1] == "'":
        return '"' + string_[1:-1] + '"'
    return string_
Example usage:
>>> fix_json("'{'key':'val\"'...cd'}'")
"{'key':'val"'...cd'}"
EDIT: it seems that the humour I attempted in the example above is not self-explanatory. So, here's another example:
>>> fix_json("'This string has - I'm sure - single quotes delimiters.'")
"This string has - I'm sure - single quotes delimiters."
These examples show how the "replacement" only happens at the ends of the string, not within it.
You could also achieve the same with a regular expression, of course, but if you are just checking the first and last characters of a string, I find plain string indexing more readable.
Unfortunately, you have to do this:
f = open('filename.json', 'rb')
json = eval(f.read())
Done!
This works, but apparently people don't like the eval function. Let me know if you find a better approach. I used this on some Twitter data...
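One candidate for a better approach, sketched here for Python 3, is ast.literal_eval(), which parses Python literals without executing arbitrary code. This assumes the single-quoted input is a valid Python literal. The sample string is made up for illustration.

```python
import ast
import json

single_quoted = "{'v': 1, 'name': 'it\"s'}"

# literal_eval only accepts Python literals (dicts, lists, strings,
# numbers and so on), so unlike eval() it cannot run arbitrary expressions.
data = ast.literal_eval(single_quoted)

# Re-serialise to get standards-compliant, double-quoted JSON.
valid_json = json.dumps(data)
print(valid_json)
```

Note that input using the JSON literals true, false or null is not a Python literal, so literal_eval() raises an error for it.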