python 2.7 get rid of double backslashes - python

I have a list with one string element, see below:
>>> s
['{\\"SrcIP\\":\\"1.1.1.1\\",\\"DstIP\\":\\"2.2.2.2\\",\\"DstPort\\":\\"80\\"}']
I want to get rid of these '\\' and get a dict instead:
{"SrcIP":"1.1.1.1","DstIP":"2.2.2.2","DstPort":"80"}

It looks like a JSON object. You can load it into a dict with the json package, but first take the string out of the list and strip the backslashes with s[0].replace('\\', ''):
import json
my_dict = json.loads(s[0].replace('\\', ''))
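Stripping every backslash works for this input, but it would also remove backslashes that legitimately belong to a value. A possible alternative, sketched below on the assumption that the string is a JSON document that was escaped once more as the body of a JSON string, is to decode it twice. The helper name unescape_json is made up for illustration.

```python
import json

s = ['{\\"SrcIP\\":\\"1.1.1.1\\",\\"DstIP\\":\\"2.2.2.2\\",\\"DstPort\\":\\"80\\"}']

def unescape_json(text):
    """Decode text that holds JSON escaped as the body of a JSON string.

    Assumption: the only problem with the input is one extra level of escaping.
    """
    # First pass: wrap in quotes so the escaped text is a valid JSON string literal.
    inner = json.loads('"' + text + '"')
    # Second pass: parse the unescaped JSON document itself.
    return json.loads(inner)

my_dict = unescape_json(s[0])
print(my_dict["DstIP"])  # prints 2.2.2.2
```

If the input ever contains an unescaped double quote, the first pass will fail, so this only fits data that really was escaped one extra time.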

You can try this:
import re
import ast
s = ['{\\"SrcIP\\":\\"1.1.1.1\\",\\"DstIP\\":\\"2.2.2.2\\",\\"DstPort\\":\\"80\\"}']
final_response = [ast.literal_eval(re.sub('\\\\', '', i)) for i in s][0]
Output:
{'SrcIP': '1.1.1.1', 'DstIP': '2.2.2.2', 'DstPort': '80'}

Just use the string replace method:
list_1=['{\\"SrcIP\\":\\"1.1.1.1\\",\\"DstIP\\":\\"2.2.2.2\\",\\"DstPort\\":\\"80\\"}']
for i in list_1:
    print(str(i).replace("\\",""))
Or you can do it in one line:
print(str(list_1[0]).replace("\\",""))
Output:
{"SrcIP":"1.1.1.1","DstIP":"2.2.2.2","DstPort":"80"}

s is a list with one string item. You could get your desired output as follows:
import ast
s = ['{\\"SrcIP\\":\\"1.1.1.1\\",\\"DstIP\\":\\"2.2.2.2\\",\\"DstPort\\":\\"80\\"}']
s_dict = ast.literal_eval(s[0].replace('\\', ''))
print s_dict
print s_dict['DstIP']
Giving you the following output:
{'SrcIP': '1.1.1.1', 'DstIP': '2.2.2.2', 'DstPort': '80'}
2.2.2.2
The Python function ast.literal_eval() can be used to safely convert a string into a Python object, in this case a dictionary.
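To illustrate what "safely" means here, and where ast.literal_eval() differs from json.loads(), here is a small sketch. The sample text and the helper try_literal_eval are invented for the demonstration; they are not part of the question.

```python
import ast
import json

# Hypothetical input: valid JSON, but "true" is not a Python literal.
text = '{"SrcIP": "1.1.1.1", "Active": true}'

def try_literal_eval(source):
    """Return (True, value) on success, (False, None) if the text is rejected."""
    try:
        return True, ast.literal_eval(source)
    except (ValueError, SyntaxError):
        return False, None

# literal_eval only accepts Python literals, so JSON's true/false/null are rejected.
literal_ok, _ = try_literal_eval(text)
# It also refuses to run code such as function calls, unlike eval().
call_ok, _ = try_literal_eval("__import__('os').getcwd()")
# json.loads understands the JSON keywords.
parsed = json.loads(text)

print(literal_ok, call_ok, parsed["Active"])  # prints False False True
```

So for data that is really JSON, json.loads() is usually the better fit, while ast.literal_eval() suits text written with Python literal syntax.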

Related

python split string into multiple delimiters and put into dictionary

I have the string below, which I am trying to split into a dictionary with specific names.
string1 = "fdsfsf:?x=klink:apple&nn=specialtime&tr=instruction1&tr=instruction2&tr=instruction3"
What I am hoping to obtain is:
>>> print(dict)
{'namy_names': 'specialtime', 'tracks': ['instruction1', 'instruction2', 'instruction3']}
I'm quite new to working with dictionaries, so I'm not sure how it is supposed to turn out.
I have tried the code below, but it only provides instruction1 instead of the full list of instructions:
import re
delimiters = ['&nn', '&tr']
values = re.split('|'.join(delimiters), string1)
values.pop(0) # remove the initial empty string
keys = re.findall('|'.join(delimiters), string1)
output = dict(zip(keys, values))
print(output)
Use URL parsing:
from urllib import parse
url = "fdsfsf:?x=klink:apple&nn=specialtime&tr=instruction1&tr=instruction2&tr=instruction3"
d = parse.parse_qs(parse.urlparse(url).query)
print(d)
Returns:
{'nn': ['specialtime'],
 'tr': ['instruction1', 'instruction2', 'instruction3'],
 'x': ['klink:apple']}
From this point, if necessary, you simply rename and pick the keys you need, like this:
d = {
    'namy_names': d.get('nn', ['Empty'])[0],
    'tracks': d.get('tr', [])
}
# {'namy_names': 'specialtime', 'tracks': ['instruction1', 'instruction2', 'instruction3']}
This looks like URL-encoded data, so you can (and should) use urllib.parse.parse_qs:
import urllib.parse
string1 = "fdsfsf:?x=klink:apple&nn=specialtime&tr=instruction1&tr=instruction2&tr=instruction3"
dic = urllib.parse.parse_qs(string1)
dic = {'namy_names': dic['nn'][0],
       'tracks': dic['tr']}
# result: {'namy_names': 'specialtime',
#          'tracks': ['instruction1', 'instruction2', 'instruction3']}
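For what it's worth, the attempt in the question loses values because dict(zip(keys, values)) keeps only one value per key, so the repeated '&tr' entries overwrite each other. A rough sketch of grouping repeated keys by hand follows; it assumes everything after the first '?' is the query string and uses parse_qsl plus a defaultdict, which is roughly what parse_qs does for you.

```python
from collections import defaultdict
from urllib.parse import parse_qsl

string1 = "fdsfsf:?x=klink:apple&nn=specialtime&tr=instruction1&tr=instruction2&tr=instruction3"

# Assumption: everything after the first '?' is the query string.
query = string1.split("?", 1)[1]

grouped = defaultdict(list)
for key, value in parse_qsl(query):
    # Repeated keys accumulate in a list instead of overwriting each other.
    grouped[key].append(value)

result = {"namy_names": grouped["nn"][0], "tracks": grouped["tr"]}
print(result)
```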

Python- urllib.urlencode: parse dictionary items into string

I need to encode a dictionary item like this
data = OrderedDict([('mID', ['54a309ae1c61be23aba0da54', '54a309ae1c61be23aba0da63'])])
into a string formatted like this
mID=[54a309ae1c61be23aba0da54,54a309ae1c61be23aba0da63]
When I use url_values = urllib.urlencode(data)
I get mID=%5B%2754a309ae1c61be23aba0da54%27%2C+%2754a309ae1c61be23aba0da63%27%5D
What could I do?
Maybe:
"{}=[{}]".format("mID",",".join(data["mID"]))
With urllib.parse module for Python v3.x:
import collections
from urllib import parse
data = collections.OrderedDict([('mID', ['54a309ae1c61be23aba0da54', '54a309ae1c61be23aba0da63'])])
urlenc_str = parse.unquote_plus(parse.urlencode(data))
urlenc_str = urlenc_str.replace("'", '').replace(' ', '')
print(urlenc_str)
The output:
mID=[54a309ae1c61be23aba0da54,54a309ae1c61be23aba0da63]
Checking type:
print(type(urlenc_str)) # <class 'str'>
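If there can be more than one key, the format approach can be generalised. The sketch below uses a made-up helper name, encode_bracketed, and assumes every value is a list of strings that need no percent-encoding. It also shows urlencode with doseq=True, which is the conventional way to send a list (one key=value pair per item), in case the receiving side accepts that form.

```python
from collections import OrderedDict
from urllib.parse import urlencode

data = OrderedDict([('mID', ['54a309ae1c61be23aba0da54', '54a309ae1c61be23aba0da63'])])

def encode_bracketed(mapping):
    """Build key=[v1,v2] pairs joined by '&'.

    Assumption: values are lists of strings that are safe to send unescaped.
    """
    return "&".join(
        "{}=[{}]".format(key, ",".join(values)) for key, values in mapping.items()
    )

bracketed = encode_bracketed(data)
# Conventional encoding: the key is repeated once per list item.
repeated = urlencode(data, doseq=True)

print(bracketed)  # mID=[54a309ae1c61be23aba0da54,54a309ae1c61be23aba0da63]
print(repeated)   # mID=54a309ae1c61be23aba0da54&mID=54a309ae1c61be23aba0da63
```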

Remove a parameter from string

I have this string:
orderby=alphabetical&page=3
but it can be even like this:
orderby=alphabetical&other=param&page=1234
What I want to do is delete the parameter &page=[Number] from that string,
so that I get the following string:
orderby=alphabetical&other=param
How can I do that?
You could use parse.parse_qsl to decompose the param string into a list of name/value pairs. Then use a list comprehension to filter out any name/value pair for which name equals 'page'. Finally, rebuild the param string using parse.urlencode:
import urllib.parse as parse
paramstr = 'orderby=alphabetical&other=param&page=1234'
params = parse.parse_qsl(paramstr)
params = [(name, val) for name, val in params if name != 'page']
print(parse.urlencode(params))
yields
orderby=alphabetical&other=param
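The same steps can be wrapped in a small reusable function. This is only a sketch: remove_param is a name chosen here, and keep_blank_values=True is added on the assumption that parameters with empty values should survive the round trip.

```python
from urllib.parse import parse_qsl, urlencode

def remove_param(query, name):
    """Return the query string without any parameter called `name`."""
    pairs = parse_qsl(query, keep_blank_values=True)
    # Compare whole parameter names, so 'homepage' is not mistaken for 'page'.
    return urlencode([(key, value) for key, value in pairs if key != name])

print(remove_param('orderby=alphabetical&other=param&page=1234', 'page'))
# orderby=alphabetical&other=param
print(remove_param('page=2&orderby=alphabetical&homepage=1', 'page'))
# orderby=alphabetical&homepage=1
```

Note that urlencode percent-encodes the values again, so the result can differ in spelling from the input if the original string contained unusual characters.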
Simply
url = 'orderby=alphabetical&other=param&page=1234'
params = url.split('&')
print('&'.join(i for i in params if not i.startswith('page=')))
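Is this what you're looking for? (It assumes &page is the last parameter, since everything after it is cut off.)
>>> string = 'orderby=alphabetical&page=3'
>>> newstring = string[0:string.find("&page")]
>>> print(newstring)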
orderby=alphabetical

Regex to extract multiple fields from pattern

I have a pattern like this in a txt file:
["kiarix moreno","116224357500406255237","z120gbkosz2oc3ckv23bc10hhwrudlcjy04",1409770337,"com.youtube.www/watch?v\u003dp1JPKLa-Ofc:https","es"]
and I need a regex to extract each field in python. Every field can contain any character (not only alphanumeric) except for the 4th which is a long number. How can I do it? Many thanks.
EDIT: the file contains other html elements, that's why I can't parse it directly in a python List.
The following provides three different options for getting your data:
>>> TEXT = '["kiarix moreno","116224357500406255237","z120gbkosz2oc3ckv23bc10hhwrudlcjy04",1409770337,"com.youtube.www/watch?v\u003dp1JPKLa-Ofc:https","es"]'
>>> import json, ast, re
>>> json.loads(TEXT)
['kiarix moreno', '116224357500406255237', 'z120gbkosz2oc3ckv23bc10hhwrudlcjy04', 1409770337, 'com.youtube.www/watch?v=p1JPKLa-Ofc:https', 'es']
>>> ast.literal_eval(TEXT)
['kiarix moreno', '116224357500406255237', 'z120gbkosz2oc3ckv23bc10hhwrudlcjy04', 1409770337, 'com.youtube.www/watch?v=p1JPKLa-Ofc:https', 'es']
>>> re.search(r'\["(?P<name>[^"]*)","(?P<number1>[^"]*)","(?P<data>[^"]*)",(?P<number2>\d*),"(?P<website>[^"]*)","(?P<language>[^"]*)"\]', TEXT).groupdict()
{'website': 'com.youtube.www/watch?v=p1JPKLa-Ofc:https', 'number2': '1409770337', 'language': 'es', 'data': 'z120gbkosz2oc3ckv23bc10hhwrudlcjy04', 'number1': '116224357500406255237', 'name': 'kiarix moreno'}
>>>
In particular, your regular expression would be the following: r'\["(?P<name>[^"]*)","(?P<number1>[^"]*)","(?P<data>[^"]*)",(?P<number2>\d*),"(?P<website>[^"]*)","(?P<language>[^"]*)"\]'
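A long pattern like this is easier to read when compiled with re.VERBOSE. The sketch below is the same idea spread over several lines. The name FIELD_PATTERN, the surrounding <li> markup, and the choice to require at least one digit and convert number2 to int are assumptions of mine.

```python
import re

# Same field layout as the one-line pattern, split across lines.
# In VERBOSE mode, whitespace in the pattern is ignored outside character classes.
FIELD_PATTERN = re.compile(r'''
    \["(?P<name>[^"]*)",
    "(?P<number1>[^"]*)",
    "(?P<data>[^"]*)",
    (?P<number2>\d+),
    "(?P<website>[^"]*)",
    "(?P<language>[^"]*)"\]
''', re.VERBOSE)

# Hypothetical line from the file, wrapped in some HTML.
text = r'<li>["kiarix moreno","116224357500406255237","z120gbkosz2oc3ckv23bc10hhwrudlcjy04",1409770337,"com.youtube.www/watch?v\u003dp1JPKLa-Ofc:https","es"]</li>'

match = FIELD_PATTERN.search(text)
record = match.groupdict()
record["number2"] = int(record["number2"])  # the 4th field is always a number

print(record["name"], record["number2"], record["language"])
```

Unlike json.loads, the regex leaves the \u003d escape in the website field untouched, so decode it separately if you need the '=' character.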
"([^"]*")|(\d+)
You can try this. Grab the matches. See the demo:
http://regex101.com/r/dK1xR4/5
You can:
1) open the file;
2) use readline() (or iterate over the file object) to scan each line;
3) use the split() function to split on "," and then use the resulting list however you want.
I'm going to combine re, try/except, ast.literal_eval and reading the whole file to collect all possible elements, rather than relying on readline, which won't work if a [ ] spans several lines.
Here is my solution:
import re
import ast
# grab all possible lists in the file
with open('yourfile.txt', 'r') as f:
    found = re.findall(r'\[.*\]', f.read())
for each in found:
    try:
        # literal_eval turns the matched text into a Python list
        for el in ast.literal_eval(each):
            print(el)
    except (SyntaxError, ValueError):
        # skip bracketed text that is not a valid Python literal
        pass
Output:
kiarix moreno
116224357500406255237
z120gbkosz2oc3ckv23bc10hhwrudlcjy04
1409770337
com.youtube.www/watch?v\u003dp1JPKLa-Ofc:https
es
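A variation on the same idea, as a sketch only: since the fields are JSON strings, json.loads can be used in place of ast.literal_eval, which also decodes the \u003d escape. The page text below is invented to stand in for the file contents, and the non-greedy pattern assumes no field contains a ']' character.

```python
import json
import re

# Hypothetical file contents: the record sits between unrelated HTML lines.
lines = [
    '<div>some markup</div>',
    r'["kiarix moreno","116224357500406255237","z120gbkosz2oc3ckv23bc10hhwrudlcjy04",1409770337,"com.youtube.www/watch?v\u003dp1JPKLa-Ofc:https","es"]',
    '<p>[not a list]</p>',
]
page = "\n".join(lines)

records = []
# Non-greedy match so each [...] on a line is tried separately.
for candidate in re.findall(r'\[.*?\]', page):
    try:
        value = json.loads(candidate)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        continue        # bracketed text that is not valid JSON is skipped
    if isinstance(value, list):
        records.append(value)

print(records[0][4])  # com.youtube.www/watch?v=p1JPKLa-Ofc:https
```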

Converting a string representation of a list in python to a list object

I have some data of the format
[[prod149090160, prod146340131, prod160860042, prod147040186, prod147860348, prod157590283, prod153940219, prod162460011, prod160410115, prod157370014], [prod162290002, prod151790213, prod159380278, prod154180602, prod160020244, prod161410007, prod155540059, prod152810207, prod152870263, prod159300061], [prod156900051, prod157590288, prod153540027, prod162940222, prod160330181, prod162680033, prod155370061, prod156970034, prod159310027, prod159410165]]
This is a list of lists in string format. Is there a simple way to convert this into a built-in Python list?
Or PyYAML:
>>> import yaml
>>> s = '[[prod149090160, prod146340131, prod160860042, prod147040186, prod147860348, prod157590283, prod153940219, prod162460011, prod160410115, prod157370014], [prod162290002, prod151790213, prod159380278, prod154180602, prod160020244, prod161410007, prod155540059, prod152810207, prod152870263, prod159300061], [prod156900051, prod157590288, prod153540027, prod162940222, prod160330181, prod162680033, prod155370061, prod156970034, prod159310027, prod159410165]]'
>>> yaml.safe_load(s)
Use regular expressions:
>>> import re
>>> s = '[[prod149090160, prod146340131, prod160860042, prod147040186, prod147860348, prod157590283, prod153940219, prod162460011, prod160410115, prod157370014], [prod162290002, prod151790213, prod159380278, prod154180602, prod160020244, prod161410007, prod155540059, prod152810207, prod152870263, prod159300061], [prod156900051, prod157590288, prod153540027, prod162940222, prod160330181, prod162680033, prod155370061, prod156970034, prod159310027, prod159410165]]'
>>> groups = re.findall(r'\[([^\]]*)\]', s[1:-1])
>>> [re.findall(r'(prod\d+)', group) for group in groups]
[['prod149090160', 'prod146340131', 'prod160860042', 'prod147040186', 'prod147860348', 'prod157590283', 'prod153940219', 'prod162460011', 'prod160410115', 'prod157370014'], ['prod162290002', 'prod151790213', 'prod159380278', 'prod154180602', 'prod160020244', 'prod161410007', 'prod155540059', 'prod152810207', 'prod152870263', 'prod159300061'], ['prod156900051', 'prod157590288', 'prod153540027', 'prod162940222', 'prod160330181', 'prod162680033', 'prod155370061', 'prod156970034', 'prod159310027', 'prod159410165']]
This is what Bakuriu was talking about:
data = '''[["prod149090160", "prod146340131", "prod160860042", "prod147040186",
"prod147860348", "prod157590283", "prod153940219", "prod162460011",
"prod160410115", "prod157370014"],
["prod162290002", "prod151790213", "prod159380278", "prod154180602",
"prod160020244", "prod161410007", "prod155540059", "prod152810207",
"prod152870263", "prod159300061"],
["prod156900051", "prod157590288", "prod153540027", "prod162940222",
"prod160330181", "prod162680033", "prod155370061", "prod156970034",
"prod159310027", "prod159410165"]]'''
import ast
print(ast.literal_eval(data))
Output:
[['prod149090160', 'prod146340131', 'prod160860042', 'prod147040186',
'prod147860348', 'prod157590283', 'prod153940219', 'prod162460011',
'prod160410115', 'prod157370014'],
['prod162290002', 'prod151790213', 'prod159380278', 'prod154180602',
'prod160020244', 'prod161410007', 'prod155540059', 'prod152810207',
'prod152870263', 'prod159300061'],
['prod156900051', 'prod157590288', 'prod153540027', 'prod162940222',
'prod160330181', 'prod162680033', 'prod155370061', 'prod156970034',
'prod159310027', 'prod159410165']]
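The format shown (with the items quoted) is also a valid JSON string, so it can be parsed with json:
import json
print(json.loads(data))
import json
import re
# i holds the unquoted string from the question;
# wrap every bare token in double quotes, then parse the result as JSON
print(json.loads(re.sub(r'([^\[\],\s]+)', r'"\1"', i)))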
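The quote-then-parse trick can be packaged as a function. A minimal sketch, with parse_bare_list as a made-up name and a shortened sample string: it assumes items contain no commas, brackets or whitespace, and it uses json.dumps to quote each token so that any awkward characters get escaped properly.

```python
import json
import re

def parse_bare_list(text):
    """Parse a nested list whose string items are written without quotes.

    Assumption: items never contain commas, square brackets or whitespace.
    """
    # Quote every run of characters that is not a bracket, comma or whitespace.
    quoted = re.sub(r'[^\[\],\s]+', lambda m: json.dumps(m.group(0)), text)
    return json.loads(quoted)

# Shortened sample in the same format as the question's data.
sample = '[[prod149090160, prod146340131], [prod162290002, prod151790213]]'
rows = parse_bare_list(sample)
print(rows[1][0])  # prod162290002
```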
