Dictionary is below
a = {'querystring': {'dataproduct.keyword': 'abc,def'}}
How can I split this into two dictionaries, one per value?
a['querystring'] = {'dataproduct.keyword': 'abc,def'}
Expected output when printing:
{'dataproduct.keyword': 'abc'}
{'dataproduct.keyword': 'def'}
Since a dictionary is a hash map (one value per key), the result would be a list:
[{'dataproduct.keyword': 'abc'}, {'dataproduct.keyword': 'def'}]
Note:
Before splitting, check whether the value contains a comma.
If a['querystring'] = {'dataproduct.keyword': 'abc'}, there is nothing to split.
If a['querystring'] = {'dataproduct.keyword': 'abc,def,efg'}, split on the commas.
[{key: item} for key, value in a['querystring'].items() for item in value.split(',')]
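On the comma check mentioned in the question: str.split(',') returns a one-element list when the string contains no comma, so the comprehension above should not need a separate test. A small sketch to confirm this (the helper name split_values is made up for illustration):

```python
def split_values(mapping):
    """Return one single-entry dict per comma-separated item in each value."""
    return [{key: item}
            for key, value in mapping.items()
            for item in value.split(',')]


# No comma: the value comes back unchanged in a single dict.
single = split_values({'dataproduct.keyword': 'abc'})
# Two commas: three dicts, one per item.
triple = split_values({'dataproduct.keyword': 'abc,def,efg'})

print(single)
print(triple)
```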
A solution that works across all top-level entries, not just the entry with key "querystring":
a = {'querystring': {'dataproduct.keyword': 'abc,def'}}
split_a = []
for value in a.values():
    for sub_key, sub_value in value.items():
        for split_sub_value in sub_value.split(","):
            split_a.append({sub_key: split_sub_value})
Resulting value of split_a is [{'dataproduct.keyword': 'abc'}, {'dataproduct.keyword': 'def'}].
Related
I have a list in dict which I have extracted the data that I need; 'uni', 'gp', 'fr', 'rn'.
uni:1 gp:CC fr:c2 rn:DS
uni:1 gp:CC fr:c2 rn:PP
uni:1 gp:CC fr:c2 rn:LL
uni:2 gp:CC fr:c2 rn:DS
uni:2 gp:CC fr:c2 rn:LL
.
.
.
The output above is what I write to a text file, using the code below:
uni, gp, fr, rn = [], [], [], []
for line in new_l:
    for key, value in line.items():
        if key == 'uni':
            uni.append(value)
        elif key == 'gp':
            gp.append(value)
        elif key == 'fr':
            fr.append(value)
        elif key == 'rn':
            rn.append(value)
with open('sampel1.list', mode='w') as f:
    for u, g, r_f, r_n in zip(uni, gp, fr, rn):
        f.write('uni:{uni}\t,gp:{gp}\t,fr:{fr}\t,rn:{rn}-\n'.format(uni=u, gp=g, fr=r_f, rn=r_n))
The expected output merges the 'rn' values of rows that share the same 'uni', 'gp', and 'fr':
uni:1 gp:CC fr:c2 rn:DS+PP+LL
uni:2 gp:CC fr:c2 rn:DS+LL
Here's one way I might do something like this using pure Python. Note: this particular solution relies on the fact that dicts preserve insertion order in Python 3.7 and later:
from collections import defaultdict
# This will map the (uni, gp, fr) triplets to the list of merged rn values
merged = defaultdict(list)
for l in new_l:
    # Assuming these keys are always present; if not you will need to check
    # that and skip invalid entries
    key = (l['uni'], l['gp'], l['fr'])
    merged[key].append(l['rn'])
# Now if you wanted to write this to a file, say:
with open(filename, 'w') as f:
    for (uni, gp, fr), rn in merged.items():
        f.write(f'uni:{uni}\tgp:{gp}\tfr:{fr}\trn:{"+".join(rn)}\n')
Note, when I wrote "pure Python" I meant just using the standard library. In practice I might use Pandas if I'm working with tabular data.
You need to study a little about algorithms and data structures.
In this case you can use the first 3 elements to create a unique hash key and, based on whether that key already exists, either append the last element or create a new entry.
Example:
lst = []
lst.append({'uni':1, 'gp':'CC', 'fr':'c2', 'rn':'DS'})
lst.append({'uni':1, 'gp':'CC', 'fr':'c2', 'rn':'PP'})
lst.append({'uni':1, 'gp':'CC', 'fr':'c2', 'rn':'LL'})
lst.append({'uni':2, 'gp':'CC', 'fr':'c2', 'rn':'DS'})
lst.append({'uni':2, 'gp':'CC', 'fr':'c2', 'rn':'PP'})
lst.append({'uni':3, 'gp':'CC', 'fr':'c2', 'rn':'DS'})
hash = {}
for line in lst:
    hashkey = str(line['uni'])+line['gp']+line['fr']
    if hashkey in hash.keys():
        hash[hashkey]['rn']+="+"+line['rn']
    else:
        hash[hashkey]={'uni':line['uni'], 'gp':line['gp'], 'fr':line['fr'], 'rn':line['rn']}
print(hash)
result: {'1CCc2': {'uni': 1, 'gp': 'CC', 'fr': 'c2', 'rn': 'DS+PP+LL'}, '2CCc2': {'uni': 2, 'gp': 'CC', 'fr': 'c2', 'rn': 'DS+PP'}, '3CCc2': {'uni': 3, 'gp': 'CC', 'fr': 'c2', 'rn': 'DS'}}
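One caveat with building the key by string concatenation: different field combinations can produce the same string, which would merge unrelated rows. A tuple key keeps the fields separate. The two sample rows below are invented purely to show the collision:

```python
rows = [
    {'uni': 1, 'gp': '1C', 'fr': 'c2', 'rn': 'DS'},
    {'uni': 11, 'gp': 'C', 'fr': 'c2', 'rn': 'PP'},
]

# Concatenated string keys: both rows collapse to '11Cc2'.
string_keys = {str(r['uni']) + r['gp'] + r['fr'] for r in rows}

# Tuple keys keep the fields separate, so the rows stay distinct.
tuple_keys = {(r['uni'], r['gp'], r['fr']) for r in rows}

print(string_keys)
print(len(tuple_keys))
```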
I thought I'd add how I approach problems like this. You are grouping by the first 3 fields, so I would place them in a tuple (not a list; the dictionary index must be an immutable object) and use that as an index to a dictionary. Then, as you read each line from your input file, test if the tuple is already in the dictionary or not. If it is, concatenate to the previous values already saved.
myDict = {}
with open("InputData.txt", "r") as f:
    for line in f:
        tup = line.strip().split('\t')
        ind = (tup[0], tup[1], tup[2])
        rn = tup[3][3:]  # drop the leading "rn:" prefix
        if ind in myDict:
            # only add rn values not already recorded for this group
            if rn not in myDict[ind].split("+"):
                myDict[ind] = myDict[ind] + "+" + rn
        else:
            myDict[ind] = rn
print(myDict)
Once the data is in the dictionary object, you can iterate over it and write your output like in the other answers above. (My answer assumes your input text file is tab delimited.)
I find dictionaries very helpful in cases like this.
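As a rough sketch of that last step, the dictionary could be written back out as follows. The contents of myDict here are hypothetical (what the loop above would produce for the question's sample rows), and io.StringIO stands in for a real output file:

```python
import io

# Hypothetical contents of myDict after the loop above has run.
myDict = {
    ('uni:1', 'gp:CC', 'fr:c2'): 'DS+PP+LL',
    ('uni:2', 'gp:CC', 'fr:c2'): 'DS+LL',
}

# io.StringIO stands in for a real file opened with open(..., 'w').
out = io.StringIO()
for ind, rn in myDict.items():
    # The key fields still carry their "uni:", "gp:", "fr:" prefixes.
    out.write('\t'.join(ind) + '\trn:' + rn + '\n')

print(out.getvalue(), end='')
```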
I have the following code to create an empty dictionary:
empty_dict = dict.fromkeys(['apple','ball'])
which gives:
empty_dict = {'apple': None, 'ball': None}
Now I want to add the values from value.txt, which has the following content:
value.txt
1
2
I want the resultant dictionary to be as:
{
"apple" : 1,
"ball" : 2
}
I'm not sure how to update only the values in the dictionary.
You don't really need to make the dict first; doing so makes it inconvenient to get the order correct. You can just zip() the keys and the file lines and pass the result to the dictionary constructor like:
keys = ['apple','ball']
# path is the location of value.txt
with open(path, 'r') as file:
    d = dict(zip(keys, map(str.strip, file)))
print(d)
# {'apple': '1', 'ball': '2'}
This uses strip() to remove the \n characters from the lines in the file.
It's not clear what should happen if you have more lines than keys, but the above will ignore them.
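The question asks for integer values, while the snippet above yields strings. A minimal variation, assuming every line holds a valid integer, converts with int() instead of strip(). Here io.StringIO stands in for the opened file, and a third line is added to show that zip() stops at the shorter input:

```python
import io

keys = ['apple', 'ball']

# io.StringIO stands in for the opened value.txt; the third line is extra.
file = io.StringIO('1\n2\n3\n')

# int() ignores surrounding whitespace, so no explicit strip() is needed.
d = dict(zip(keys, map(int, file)))
print(d)
```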
I have a Python dictionary like this, and sometimes I get an mdash or some other character (like "\u2014") that I need to replace with a simple hyphen. I've seen lots of ways to change keys and whole values of a dictionary, but I just need to change this character. Is there a way to do this such that the output is the dictionary with the character changed? Thanks
dict={1:{'name':'Event','description':'Fireside Chat'}, 2: {'name':'Class','description':'Friendship — Day'} }
Iterate through the values with a loop:
data = {1: {'name': 'Event', 'description': 'Fireside Chat'}, 2: {'name': 'Class', 'description': 'Friendship — Day'}}
for row_id, row in data.items():
    for k, v in row.items():
        data[row_id][k] = v.replace('\u2014', '-')
Or with a dict comprehension:
data = {row_id: {k: v.replace('\u2014', '-') for k, v in row.items()} for row_id, row in data.items()}
Also: using built-in names like 'dict' as variable names is bad practice, because it shadows the built-in.
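If more than one character needs replacing ("or some other character" in the question), one option is str.translate with a table built by str.maketrans. This is only a sketch: U+2014 is the character from the question, while including U+2013 (en dash) is an assumption added for illustration.

```python
# Map each unwanted character to a plain hyphen.
# U+2014 comes from the question; U+2013 is an assumed extra example.
DASHES = str.maketrans({'\u2014': '-', '\u2013': '-'})

data = {
    1: {'name': 'Event', 'description': 'Fireside Chat'},
    2: {'name': 'Class', 'description': 'Friendship \u2014 Day'},
}

# Build a new nested dict with every string value translated.
cleaned = {
    row_id: {k: v.translate(DASHES) for k, v in row.items()}
    for row_id, row in data.items()
}
print(cleaned[2]['description'])
```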
How would I remove a \n or newline character from a dict value in Python?
testDict = {'salutations': 'hello', 'farewell': 'goodbye\n'}
testDict.strip('\n') # I know this part is incorrect :)
print(testDict)
To update the dictionary in-place, just iterate over it and apply str.rstrip() to values:
for key, value in testDict.items():
    testDict[key] = value.rstrip()
To create a new dictionary, you can use a dictionary comprehension:
testDict = {key: value.rstrip() for key, value in testDict.items()}
Use dictionary comprehension:
testDict = {key: value.strip('\n') for key, value in testDict.items()}
You're trying to strip a newline from the dictionary object itself, which has no strip() method.
What you want is to iterate over all dictionary keys and update their values.
for key in testDict.keys():
    testDict[key] = testDict[key].strip()
That would do the trick.
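The answers above differ slightly in what they remove: rstrip() with no argument drops all trailing whitespace, strip() does so at both ends, and rstrip('\n') removes only trailing newlines. All of them raise AttributeError on non-string values. A hedged sketch that removes only trailing newlines and leaves other value types alone (the helper name strip_newlines and the extra sample entries are made up):

```python
testDict = {'salutations': '  hello', 'farewell': 'goodbye \n', 'count': 3}


def strip_newlines(d):
    """Return a copy with trailing newlines removed from string values only."""
    return {k: v.rstrip('\n') if isinstance(v, str) else v
            for k, v in d.items()}


cleaned = strip_newlines(testDict)
print(cleaned)
```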
Quick question: in Python 3, if I have the following code
def file2dict(filename):
    dictionary = {}
    data = open(filename, 'r')
    for line in data:
        [ key, value ] = line.split(',')
        dictionary[key] = value
    data.close()
    return dictionary
It means that file MUST contain exactly 2 strings(or numbers, or whatever) on every line in the file because of this line:
[ key, value ] = line.split(',')
So, if in my file I have something like this
John,45,65
Jack,56,442
The function throws an exception.
The question: why are key, value in square brackets? Why, for example,
adr, port = s.accept()
does not use square brackets?
And how to modify this code if I want to attach 2 values to every key in a dictionary? Thank you.
The [ and ] around key, value aren't getting you anything.
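To illustrate that point, the bracketed, bare, and parenthesized target forms all unpack the same way. The sample line is invented:

```python
line = "John,45"

# List-style target: the form used in the question.
[key1, value1] = line.split(',')

# Bare comma-separated targets: the form used with s.accept().
key2, value2 = line.split(',')

# Parentheses work as well.
(key3, value3) = line.split(',')

print(key1, value1)
```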
The error that you're getting, ValueError: too many values to unpack is because you are splitting text like John,45,65 by the commas. Do "John,45,65".split(',') in a shell. You get
>>> "John,45,65".split(',')
['John', '45', '65']
Your code is trying to assign 3 values, "John", 45, and 65, to two variables, key and value, thus the error.
There are a few options:
1) str.split has an optional maxsplit parameter:
>>> "John,45,65".split(',', 1)
['John', '45,65']
if "45,65" is the value you want to set for that key in the dictionary.
2) Cut the extra value.
If the 65 isn't what you want, then you can do something like
>>> name, age, unwanted = "John,45,65".split(',')
>>> name, age, unwanted
('John', '45', '65')
>>> dictionary[name] = age
>>> dictionary
{'John': '45'}
and just not use the unwanted variable, or split into a list and don't use the last element:
>>> data = "John,45,65".split(',')
>>> dictionary[data[0]] = data[1]
>>> dictionary
{'John': '45'}
You can use three variables instead of two, and make the first one the key:
def file2dict(filename):
    dictionary = {}
    data = open(filename, 'r')
    for line in data:
        key, value1, value2 = line.split(',')
        dictionary[key] = [int(value1), int(value2)]
    data.close()
    return dictionary
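If the number of values per line can vary, starred unpacking is one way to collect everything after the key into a list. This is a sketch rather than a drop-in replacement: the function name lines2dict is made up, it takes any iterable of lines instead of a filename, and the values are left as strings.

```python
import io


def lines2dict(lines):
    """Map the first field of each line to the list of remaining fields."""
    dictionary = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # The starred target collects all fields after the first into a list.
        key, *values = line.split(',')
        dictionary[key] = values
    return dictionary


# io.StringIO stands in for an open file; the third row is invented.
sample = io.StringIO("John,45,65\nJack,56,442\nJill,7\n")
result = lines2dict(sample)
print(result)
```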
When splitting lines into a dictionary, consider limiting the number of splits by specifying maxsplit, and checking that the line contains a comma. Stripping each line first keeps the trailing newline out of the values:
def file2dict(filename):
    with open(filename, 'r') as data:
        dictionary = dict(item.strip().split(",", 1) for item in data if "," in item)
    return dictionary