I have a dictionary of bigrams, obtained by importing a csv and transforming it to a dictionary:
bigram_dict = {"('key1', 'key2')": 'meaning', "('key22', 'key13')": 'mean2'}
I want the dictionary's keys to be tuples rather than strings, i.e. without the surrounding quotation marks:
desired_bigram_dict={('key1', 'key2'): 'meaning', ('key22', 'key13'): 'mean2'}
Could you suggest how to do this?
This can be done using a dictionary comprehension, where you call literal_eval on the key:
from ast import literal_eval
bigram_dict = {"('key1', 'key2')": 'meaning', "('key22', 'key13')": 'mean2'}
res = {literal_eval(k): v for k,v in bigram_dict.items()}
Result:
{('key22', 'key13'): 'mean2', ('key1', 'key2'): 'meaning'}
You can literal_eval each key and reassign:
from ast import literal_eval
bigram_dict = {"('key1', 'key2')": 'meaning', "('key22', 'key13')": 'mean2'}
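for k in list(bigram_dict):  # iterate over a copy of the keys: the dict changes inside the loop
    bigram_dict[literal_eval(k)] = bigram_dict.pop(k)  # drop the string key, add the tuple key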
Or to create a new dict, just use the same logic with a dict comprehension:
{literal_eval(k):v for k,v in bigram_dict.items()}
Both will give you:
{('key1', 'key2'): 'meaning', ('key22', 'key13'): 'mean2'}
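If some keys might not be valid tuple literals (for example a header row that slipped in from the CSV), the same idea can be wrapped in a small helper. This is only a sketch: the name parse_tuple_keys, the extra "bigram" entry, and the choice to keep unparseable keys unchanged are assumptions, not part of the answers above.

```python
from ast import literal_eval


def parse_tuple_keys(d):
    """Return a new dict whose string keys are parsed with literal_eval.

    Hypothetical helper: keys that cannot be parsed are kept unchanged.
    """
    result = {}
    for key, value in d.items():
        try:
            result[literal_eval(key)] = value
        except (ValueError, SyntaxError):
            # Not a Python literal (e.g. a stray header cell): keep it as-is.
            result[key] = value
    return result


bigram_dict = {
    "('key1', 'key2')": 'meaning',
    "('key22', 'key13')": 'mean2',
    "bigram": 'header',  # made-up entry to show the fallback
}
res = parse_tuple_keys(bigram_dict)
print(res)
```

Building a new dict rather than editing in place avoids changing a dictionary while looping over it.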
I have the string below that I am trying to split into a dictionary with specific names.
string1 = "fdsfsf:?x=klink:apple&nn=specialtime&tr=instruction1&tr=instruction2&tr=instruction3"
What I am hoping to obtain is:
>>> print(dict)
{'namy_names': 'specialtime', 'tracks': ['instruction1', 'instruction2', 'instruction3']}
I'm quite new to working with dictionaries, so I'm not sure how it is supposed to turn out.
I have tried the code below, but it only returns instruction1 instead of the full list of instructions:
import re
delimiters = ['&nn', '&tr']
values = re.split('|'.join(delimiters), string1)
values.pop(0) # remove the initial empty string
keys = re.findall('|'.join(delimiters), string1)
output = dict(zip(keys, values))
print(output)
Use url-parsing.
from urllib import parse
url = "fdsfsf:?x=klink:apple&nn=specialtime&tr=instruction1&tr=instruction2&tr=instruction3"
d = parse.parse_qs(parse.urlparse(url).query)
print(d)
Returns:
{'nn': ['specialtime'],
'tr': ['instruction1', 'instruction2', 'instruction3'],
'x': ['klink:apple']}
From here, if necessary, you only have to pick the values you need and rename them, like this:
d = {
    'namy_names': d.get('nn', ['Empty'])[0],
    'tracks': d.get('tr', [])
}
# {'namy_names': 'specialtime', 'tracks': ['instruction1', 'instruction2', 'instruction3']}
This looks like url-encoded data, so you can/should use urllib.parse.parse_qs:
import urllib.parse
string1 = "fdsfsf:?x=klink:apple&nn=specialtime&tr=instruction1&tr=instruction2&tr=instruction3"
dic = urllib.parse.parse_qs(string1)
dic = {'namy_names': dic['nn'][0],
'tracks': dic['tr']}
# result: {'namy_names': 'specialtime',
# 'tracks': ['instruction1', 'instruction2', 'instruction3']}
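One difference between the two answers is worth seeing side by side: parse_qs only splits on & and =, so passing the whole string leaves the prefix attached to the first key, while going through urlparse first isolates the query part. A short sketch using the string from the question (the variable names whole, query_only and output are illustrative):

```python
from urllib.parse import parse_qs, urlparse

string1 = "fdsfsf:?x=klink:apple&nn=specialtime&tr=instruction1&tr=instruction2&tr=instruction3"

# Parsing the raw string: the text before the first '=' becomes one key.
whole = parse_qs(string1)

# Parsing only the query component: the first key is the plain 'x'.
query_only = parse_qs(urlparse(string1).query)

print(sorted(whole))
print(sorted(query_only))

# Pick and rename the fields; .get() with a default avoids a KeyError
# when a parameter is missing.
output = {
    'namy_names': query_only.get('nn', [None])[0],
    'tracks': query_only.get('tr', []),
}
print(output)
```

Either route gives the same 'nn' and 'tr' entries here; the difference only matters if you also need the first parameter.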
I have a list with one string element, see below:
>>> s
['{\\"SrcIP\\":\\"1.1.1.1\\",\\"DstIP\\":\\"2.2.2.2\\",\\"DstPort\\":\\"80\\"}']
I want to get rid of these '\\' and have a dict instead:
{"SrcIP":"1.1.1.1","DstIP":"2.2.2.2","DstPort":"80"}
It looks like a JSON object. You can load it into a dict with the json package, but first you have to take the string out of the list and strip the backslashes, which s[0].replace('\\', '') does:
import json
my_dict = json.loads(s[0].replace('\\', ''))
You can try this:
import re
import ast
s = ['{\\"SrcIP\\":\\"1.1.1.1\\",\\"DstIP\\":\\"2.2.2.2\\",\\"DstPort\\":\\"80\\"}']
final_response = [ast.literal_eval(re.sub('\\\\', '', i)) for i in s][0]
Output:
{'SrcIP': '1.1.1.1', 'DstIP': '2.2.2.2', 'DstPort': '80'}
Just use the string replace method:
list_1=['{\\"SrcIP\\":\\"1.1.1.1\\",\\"DstIP\\":\\"2.2.2.2\\",\\"DstPort\\":\\"80\\"}']
for i in list_1:
    print(str(i).replace("\\",""))
Or you can do it in one line:
print(str(list_1[0]).replace("\\",""))
Output:
{"SrcIP":"1.1.1.1","DstIP":"2.2.2.2","DstPort":"80"}
s is a list with one text item; you could get your desired output as follows:
import ast
s = ['{\\"SrcIP\\":\\"1.1.1.1\\",\\"DstIP\\":\\"2.2.2.2\\",\\"DstPort\\":\\"80\\"}']
s_dict = ast.literal_eval(s[0].replace('\\', ''))
print(s_dict)
print(s_dict['DstIP'])
Giving you the following output:
{'SrcIP': '1.1.1.1', 'DstIP': '2.2.2.2', 'DstPort': '80'}
2.2.2.2
The Python function ast.literal_eval() can be used to safely convert a string into a Python object, in this case a dictionary.
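Removing every backslash also removes any that legitimately belong to the data (an escaped quote inside a value, for instance). If the string is really JSON that was escaped one extra time, an alternative sketch is to decode it twice with json: once as a JSON string literal, once as an object. This assumes the extra layer of escaping is the only damage; the names inner_text and my_dict are illustrative.

```python
import json

s = ['{\\"SrcIP\\":\\"1.1.1.1\\",\\"DstIP\\":\\"2.2.2.2\\",\\"DstPort\\":\\"80\\"}']

# Wrap the text in double quotes so it becomes a valid JSON string literal,
# decode that to recover the inner JSON text, then decode the inner text.
inner_text = json.loads('"' + s[0] + '"')
my_dict = json.loads(inner_text)

print(my_dict)
print(my_dict['DstIP'])
```

The first json.loads undoes the escaping properly, so backslashes that are part of a value survive.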
How would I remove a \n or newline character from a dict value in Python?
testDict = {'salutations': 'hello', 'farewell': 'goodbye\n'}
testDict.strip('\n') # I know this part is incorrect :)
print(testDict)
To update the dictionary in-place, just iterate over it and apply str.rstrip() to values:
for key, value in testDict.items():
    testDict[key] = value.rstrip()
To create a new dictionary, you can use a dictionary comprehension:
testDict = {key: value.rstrip() for key, value in testDict.items()}
Use dictionary comprehension:
testDict = {key: value.strip('\n') for key, value in testDict.items()}
You're trying to strip a newline from the dictionary object itself.
What you want is to iterate over all of the dictionary's keys and update their values.
for key in testDict.keys():
    testDict[key] = testDict[key].strip()
That would do the trick.
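The loops above assume every value is a string. If the dictionary may also hold numbers or None, calling strip on them raises AttributeError, so a guarded variant can help. A minimal sketch; the extra 'count' entry is made up for illustration:

```python
testDict = {'salutations': 'hello', 'farewell': 'goodbye\n', 'count': 3}

# Strip trailing newlines from string values only; leave other types alone.
cleaned = {
    key: value.rstrip('\n') if isinstance(value, str) else value
    for key, value in testDict.items()
}
print(cleaned)
```

rstrip('\n') removes only trailing newlines, whereas rstrip() with no argument would also remove trailing spaces and tabs.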
I am trying to convert:
datalist = [u"{gallery: 'gal1', smallimage: 'http://www.styleever.com/media/catalog/product/cache/1/small_image/445x370/17f82f742ffe127f42dca9de82fb58b1/2/_/2_12.jpg',largeimage: 'http://www.styleever.com/media/catalog/product/cache/1/image/9df78eab33525d08d6e5fb8d27136e95/2/_/2_12.jpg'}",
u"{gallery: 'gal1', smallimage: 'http://www.styleever.com/media/catalog/product/cache/1/small_image/445x370/17f82f742ffe127f42dca9de82fb58b1/3/_/3_13.jpg',largeimage: 'http://www.styleever.com/media/catalog/product/cache/1/image/9df78eab33525d08d6e5fb8d27136e95/3/_/3_13.jpg'}",
u"{gallery: 'gal1', smallimage: 'http://www.styleever.com/media/catalog/product/cache/1/small_image/445x370/17f82f742ffe127f42dca9de82fb58b1/5/_/5_3_1.jpg',largeimage: 'http://www.styleever.com/media/catalog/product/cache/1/image/9df78eab33525d08d6e5fb8d27136e95/5/_/5_3_1.jpg'}",
u"{gallery: 'gal1', smallimage: 'http://www.styleever.com/media/catalog/product/cache/1/small_image/445x370/17f82f742ffe127f42dca9de82fb58b1/1/_/1_22.jpg',largeimage: 'http://www.styleever.com/media/catalog/product/cache/1/image/9df78eab33525d08d6e5fb8d27136e95/1/_/1_22.jpg'}",
u"{gallery: 'gal1', smallimage: 'http://www.styleever.com/media/catalog/product/cache/1/small_image/445x370/17f82f742ffe127f42dca9de82fb58b1/4/_/4_7_1.jpg',largeimage: 'http://www.styleever.com/media/catalog/product/cache/1/image/9df78eab33525d08d6e5fb8d27136e95/4/_/4_7_1.jpg'}"]
into a list of Python dicts. If I try to extract a value by key, I get this error:
for i in datalist:
print i['smallimage']
....:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-20-686ea4feba66> in <module>()
1 for i in datalist:
----> 2 print i['smallimage']
3
TypeError: string indices must be integers
How do I convert a list of Unicode strings that look like dicts into actual dicts?
You could use the demjson module which has a non-strict mode that handles the data you have:
import demjson
for data in datalist:
    dct = demjson.decode(data)
    print(dct['gallery'])  # etc...
In this case, I'd hand-craft a regular expression to make these into something you can evaluate as Python:
import re
import ast
from functools import partial
keys = re.compile(r'(gallery|smallimage|largeimage)')
fix_keys = partial(keys.sub, r'"\1"')
for entry in datalist:
    entry = ast.literal_eval(fix_keys(entry))
Yes, this is limited; but it works for this set and is robust as long as the keys match. The regular expression is simple to maintain. Moreover, this doesn't use any external dependencies, it's all based on batteries already included.
Result:
>>> for entry in datalist:
...     print ast.literal_eval(fix_keys(entry))
...
{'largeimage': 'http://www.styleever.com/media/catalog/product/cache/1/image/9df78eab33525d08d6e5fb8d27136e95/2/_/2_12.jpg', 'gallery': 'gal1', 'smallimage': 'http://www.styleever.com/media/catalog/product/cache/1/small_image/445x370/17f82f742ffe127f42dca9de82fb58b1/2/_/2_12.jpg'}
{'largeimage': 'http://www.styleever.com/media/catalog/product/cache/1/image/9df78eab33525d08d6e5fb8d27136e95/3/_/3_13.jpg', 'gallery': 'gal1', 'smallimage': 'http://www.styleever.com/media/catalog/product/cache/1/small_image/445x370/17f82f742ffe127f42dca9de82fb58b1/3/_/3_13.jpg'}
{'largeimage': 'http://www.styleever.com/media/catalog/product/cache/1/image/9df78eab33525d08d6e5fb8d27136e95/5/_/5_3_1.jpg', 'gallery': 'gal1', 'smallimage': 'http://www.styleever.com/media/catalog/product/cache/1/small_image/445x370/17f82f742ffe127f42dca9de82fb58b1/5/_/5_3_1.jpg'}
{'largeimage': 'http://www.styleever.com/media/catalog/product/cache/1/image/9df78eab33525d08d6e5fb8d27136e95/1/_/1_22.jpg', 'gallery': 'gal1', 'smallimage': 'http://www.styleever.com/media/catalog/product/cache/1/small_image/445x370/17f82f742ffe127f42dca9de82fb58b1/1/_/1_22.jpg'}
{'largeimage': 'http://www.styleever.com/media/catalog/product/cache/1/image/9df78eab33525d08d6e5fb8d27136e95/4/_/4_7_1.jpg', 'gallery': 'gal1', 'smallimage': 'http://www.styleever.com/media/catalog/product/cache/1/small_image/445x370/17f82f742ffe127f42dca9de82fb58b1/4/_/4_7_1.jpg'}
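If the key names are not known in advance, the same quote-then-literal_eval idea can be sketched with a more general pattern that quotes any bare identifier that directly follows { or , and is itself followed by a colon. This is a variant under stated assumptions, not the answer's code: it relies on no value containing a comma followed by a word and a colon, and the sample entry is shortened, hypothetical data. The names bare_key and to_dict are illustrative.

```python
import ast
import re

# A bare key: '{' or ',' then optional spaces, an identifier, then ':'.
bare_key = re.compile(r'([{,]\s*)([A-Za-z_]\w*)\s*:')


def to_dict(entry):
    """Quote the bare keys, then parse the result as a Python literal."""
    quoted = bare_key.sub(r'\1"\2":', entry)
    return ast.literal_eval(quoted)


sample = u"{gallery: 'gal1', smallimage: 'http://example.com/small/2_12.jpg',largeimage: 'http://example.com/large/2_12.jpg'}"
parsed = to_dict(sample)
print(parsed['smallimage'])
```

The "http:" inside each URL is left alone because it follows a quote character rather than { or ,.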
Just as another thought, your list is properly formatted YAML (this needs the third-party PyYAML package; safe_load avoids constructing arbitrary objects):
> yaml.safe_load(u'{foo: "bar"}')['foo']
'bar'
And if you want to be really fancy and parse everything at once:
> data = yaml.safe_load('['+','.join(datalist)+']')
> data[0]['smallimage']
'http://www.styleever.com/media/catalog/product/cache/1/small_image/445x370/17f82f742ffe127f42dca9de82fb58b1/2/_/2_12.jpg'
> data[3]['gallery']
'gal1'
If your keys and values were double-quoted, as JSON requires, you could use json.loads to load the string.
import json
for i in datalist:
    print(json.loads(i)['smallimage'])
(ast.literal_eval would have worked too...)
However, as it is, this will work with an old-school eval:
>>> class Mdict(dict):
...     def __missing__(self, key):
...         return key
...
>>> eval(datalist[0], Mdict(__builtins__=None))
{'largeimage': 'http://www.styleever.com/media/catalog/product/cache/1/image/9df78eab33525d08d6e5fb8d27136e95/2/_/2_12.jpg', 'gallery': 'gal1', 'smallimage': 'http://www.styleever.com/media/catalog/product/cache/1/small_image/445x370/17f82f742ffe127f42dca9de82fb58b1/2/_/2_12.jpg'}
Note that this is probably vulnerable to injection attacks, so only use it if the string is from a trusted source.
Finally, for anyone wanting a short, although somewhat dense solution that uses only the standard library and isn't vulnerable to injection attacks... This little gem does the trick (assuming the dictionary keys are valid identifiers)!
import ast
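class RewriteName(ast.NodeTransformer):
    def visit_Name(self, node):
        # Replace each bare name (an unquoted key) with a string constant.
        # ast.Str is deprecated and gone from recent Python versions.
        return ast.Constant(value=node.id)
transformer = RewriteName()
for x in datalist:
    tree = ast.parse(x, mode='eval')
    transformer.visit(tree)
    print(ast.literal_eval(tree)['smallimage'])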
Your datalist is a list of unicode strings.
You could use eval, except your keys are not properly quoted. What you can do is requote your keys on the fly with replace:
for i in datalist:
    my_dict = eval(i.replace("gallery", "'gallery'").replace("smallimage", "'smallimage'").replace("largeimage", "'largeimage'"))
    print(my_dict["smallimage"])
I don't see why the need for all the extra things such as using re or json...
fdict = {str(k): v for (k, v) in udict.items()}
Where udict is the dict that has unicode keys. Simply convert them to str. In your given data, you can simply...
datalist = [dict((str(k), v) for (k, v) in i.items()) for i in datalist]
Simple test:
>>> datalist = [{u'a':1,u'b':2},{u'a':1,u'b':2}]
[{u'a': 1, u'b': 2}, {u'a': 1, u'b': 2}]
>>> datalist = [dict((str(k), v) for (k, v) in i.items()) for i in datalist]
>>> datalist
[{'a': 1, 'b': 2}, {'a': 1, 'b': 2}]
No import re or import json. Simple and quick.
I'd like to automatically build a dictionary from files that have the following structure:
str11 str12 str13
str21 str22
str31 str32 str33 str34
...
That is, each line has two, three or four strings separated by spaces. The dictionary I'd like to construct from this list must have the following structure:
{str11:(str12,str13),str21:(str22),str31:(str32,str33,str34), ... }
(that is, all entries str*1 are the keys -- all of them different -- and the remaining ones are the values). What can I use?
>>> with open('abc') as f:
...     dic = {}
...     for line in f:
...         key, val = line.split(None, 1)
...         dic[key] = tuple(val.split())
...
>>> dic
{'str31': ('str32', 'str33', 'str34'),
'str21': ('str22',),
'str11': ('str12', 'str13')}
If you want the order of items to be preserved on Python versions before 3.7 (where plain dicts are unordered, as in the output above), consider using OrderedDict:
>>> from collections import OrderedDict
>>> with open('abc') as f:
...     dic = OrderedDict()
...     for line in f:
...         key, val = line.split(None, 1)
...         dic[key] = tuple(val.split())
...
>>> dic
OrderedDict([
('str11', ('str12', 'str13')),
('str21', ('str22',)),
('str31', ('str32', 'str33', 'str34'))
])
Using a StringIO instance for simplicity:
import io
fobj = io.StringIO("""str11 str12 str13
str21 str22
str31 str32 str33 str34""")
One line does the trick:
>>> {line.split(None, 1)[0]: tuple(line.split()[1:]) for line in fobj}
{'str11': ('str12', 'str13'),
'str21': ('str22',),
'str31': ('str32', 'str33', 'str34')}
Note the line.split(None, 1): the 1 limits the split to a single cut, so element [0] is the key and the rest of the line stays in one piece. The None means split at any whitespace. We call .split() twice because, inside a dict comprehension, an intermediate result cannot be stored in a separate statement for reuse as it can in a loop.
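To make the two split calls concrete, here is a small sketch of what each returns for one line. It also shows, as an optional variant that is not part of the answer, how Python 3.8+ can bind the split result once with an assignment expression (:=) and reuse it; the names first_cut, all_parts and parts are illustrative.

```python
import io

line = "str31 str32 str33 str34\n"
first_cut = line.split(None, 1)  # one cut: the key, then the rest of the line
all_parts = line.split()         # every whitespace-separated token
print(first_cut)
print(all_parts)

fobj = io.StringIO("str11 str12 str13\nstr21 str22\n")
# Python 3.8+ only: split once, use parts[0] as key and parts[1:] as value.
dic = {(parts := line.split())[0]: tuple(parts[1:]) for line in fobj}
print(dic)
```

Whether the := form is clearer than calling split twice is a matter of taste; for short lines the cost of the second split is negligible.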
For an OrderedDict you can also get away with one line using a generator expression:
>>> from collections import OrderedDict
>>> OrderedDict((line.split(None, 1)[0], tuple(line.split()[1:]))
...             for line in fobj)
OrderedDict([('str11', ('str12', 'str13')), ('str21', ('str22',)),
('str31', ('str32', 'str33', 'str34'))])
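Both answers assume every line has a key and at least one value. For files that may contain blank lines or a key with nothing after it, a plain loop is easier to guard. A sketch under those assumptions; the function name lines_to_dict, the sample 'str41' line, and the choice to map a lone key to an empty tuple are illustrative:

```python
import io


def lines_to_dict(lines):
    """Map the first token of each line to a tuple of the remaining tokens."""
    result = {}
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank or whitespace-only lines
        key, *values = parts
        result[key] = tuple(values)  # a lone key maps to an empty tuple
    return result


fobj = io.StringIO("""str11 str12 str13
str21 str22

str31 str32 str33 str34
str41
""")
dic = lines_to_dict(fobj)
print(dic)
```

The function accepts any iterable of lines, so an open file object works in place of the StringIO instance.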