Python 3.6 merge dictionaries fails - python

I am trying to merge two dictionaries. After searching for a similar question on Stack Overflow, I found the following solution:
mergeDicts = {**dict1, **dict2}
but it doesn't work for me. I know the rest of my code is fine, because I get the right results for each single dictionary, but once I merge them the results are wrong:
def readFiles(path1):
    ...  # count words (body omitted)

if __name__ == '__main__':
    a = readFiles('C:/University/learnPy/dir')
    b = readFiles('C:/Users/user/Anaconda3/dir')
    bigdict = {**a, **b}
    print(a['wee'])
    print(b['wee'])
    print(bigdict['wee'])
Directory a contains one .txt file in which 'wee' occurs 2 times.
Directory b contains one .txt file in which 'wee' occurs 1 time.
So I'd expect bigdict['wee'] to be 3, but what I observe is that bigdict just gets the numbers of the first dict, {**dict1 (THIS ONE), **dict2}, and the merge is not working.
Question: what went wrong? Why does this fail on Python 3.6 when the answers state it should work?

{**x, **y} is doing what it's supposed to do: it creates bigdict from the entries of the first dict and overwrites the value of any duplicate key with the value from the second dict. It does not add values together, so you will need to sum them yourself.
You can use a Counter:
from collections import Counter
a = {'wee':1, 'woo':2 }
b = {'wee':10, 'woo': 20 }
bigdict = dict(Counter(a)+Counter(b))
Out[23]: {'wee': 11, 'woo': 22}
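To make the difference concrete, here is a small sketch comparing the two approaches. The counts for 'wee' mirror the question (2 and 1 occurrences); the extra key 'woo' and its count are made up for illustration.

```python
from collections import Counter

# Sample word counts; 'wee' mirrors the question, 'woo' is a made-up extra key.
a = {'wee': 2}
b = {'wee': 1, 'woo': 5}

# Unpacking keeps one value per key: the right-hand dict wins on duplicates.
overwritten = {**a, **b}

# Counter addition sums the counts for keys present in both dicts.
summed = dict(Counter(a) + Counter(b))

print(overwritten)  # {'wee': 1, 'woo': 5}
print(summed)       # {'wee': 3, 'woo': 5}
```

Note that Counter addition drops keys whose summed count is zero or negative, which is fine for word counts but may matter for other data.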

Related

How can I make a dictionary - list structure in python?

I think this is a basic question, but I would like to know the answer.
For example, in the situation below:
str1 = {}
str1['a'] = 1
str1['b'] = 2
I want to access the following value:
str1['a'][0]
I think I've solved it.
I can do it this way: when I get a key, I check whether it already exists and, if it does not, create a list for it.
Sorry for the confusion.
key = 'a'
str1 = {}
if key in str1:
    # The key already has a list, so just append to it.
    str1[key].append(0)
else:
    # First time we see this key: create its list without discarding other keys.
    str1[key] = [0]
print(str1[key][0])
str1[key].append(3)
print(str1[key])
Sang Yoon Kim,
I think you might be confusing nested lists with dictionaries.
A dictionary is a collection of key-value pairs.
Take a look at this:
L = [1,2,3] ----> L[0] is 1
G = [[1,2,3],[4,5,6]]
Now G[1] gives the list [4,5,6].
To access the element 6, use G[1][2].
Dictionaries are entirely different.
Please take a look at these links:
https://www.geeksforgeeks.org/python-list/
https://www.geeksforgeeks.org/python-dictionary/
To answer your question: you cannot access str1['a'][0] because str1['a'] is the integer 1, and an integer cannot be indexed, so the expression raises a TypeError.
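If the goal is for every key to hold a list, one possible approach is collections.defaultdict. This is only a sketch, and the appended values are made up for illustration.

```python
from collections import defaultdict

# defaultdict(list) creates an empty list the first time a key is accessed.
str1 = defaultdict(list)
str1['a'].append(1)
str1['a'].append(3)
str1['b'].append(2)

print(str1['a'][0])  # 1
print(dict(str1))    # {'a': [1, 3], 'b': [2]}
```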

Python : comparing lines of a file with keys of a dictionary in the memory

I am trying to compare the value of the key "rrname" in each line of a JSON-format file (each line is a dictionary, loaded into jdata) with the keys of a dictionary d that is already loaded in memory.
Here is my code:
import simplejson
ap = '/data/data/2014/A.1/ap.txt'
ddb = '/data/data/2014/A.1/test'
d={}
f = open(ap,'r')
g = open(ddb,'r')
for line in f:
    domain,bl_date= line.split('|')
    d[domain]=bl_date
for line in g:
    line=line.strip('')
    try:
        jdata = simplejson.loads(line)
        if jdata.get('rrname') == d.keys():
            print rrname
    except:
        raise
Here is my ddb file:
{"rrname": "bba186684.alshamil.net.ae.", "time_last": 1389295255, "time_first": 1389241418, }
{"rrname": "bba186686.alshamil.net.ae.", "time_last": 1390910891, "time_first": 1390910891}
{"rrname": "0001ewm.rcomhost.com", "time_last": 1390147425, "time_first": 1390124988}
Here is my ap file:
0001elk.rcomhost.com|1391726703
0001ewm.rcomhost.com|1393472522
0001qz6.wcomhost.com|1399977648
When I run this code, it cannot find the matches, although there are some. Can somebody help me with this?
jdata.get('rrname') == d.keys()
will always fail: the single entry on the left of the == can never equal the whole collection of keys on the right.
Rather, check whether:
jdata.get('rrname') in d
The in operator tests whether the left side is contained in the right side. It's important for performance to use d, not d.keys(), as the right side, since checking for containment in a dictionary is much faster than checking in a list (which is what .keys() returns in Python 2, which I guess is what you're using, based on the syntax of that print, even though you don't tell us!). Also note that the name rrname is never defined in your code, so print jdata['rrname'] instead.
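Putting this together, a minimal Python 3 sketch might look like the following. It assumes the standard-library json module in place of simplejson and uses in-memory io.StringIO objects as stand-ins for the two files; the sample lines are taken from the question.

```python
import io
import json

# In-memory stand-ins for the two files; the sample lines come from the question.
ap_file = io.StringIO(
    "0001elk.rcomhost.com|1391726703\n"
    "0001ewm.rcomhost.com|1393472522\n"
)
ddb_file = io.StringIO(
    '{"rrname": "bba186686.alshamil.net.ae.", "time_last": 1390910891}\n'
    '{"rrname": "0001ewm.rcomhost.com", "time_last": 1390147425}\n'
)

# Build the lookup dictionary: domain -> date string.
d = {}
for line in ap_file:
    domain, bl_date = line.strip().split('|')
    d[domain] = bl_date

# Collect every rrname that is also a key of d.
matches = []
for line in ddb_file:
    jdata = json.loads(line)
    rrname = jdata.get('rrname')
    if rrname in d:  # dictionary membership test, O(1) on average
        matches.append(rrname)

print(matches)  # ['0001ewm.rcomhost.com']
```

The comparison is an exact string match, so a name with a trailing dot will not match the same name without one.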

Creating (seeding) large dictionaries efficiently in Python

I have a long (500K+ rows) two column spreadsheet that looks like this:
Name Code
1234 A
1234 B
1456 C
4556 A
4556 B
4556 C
...
So there is an element (with a Name) that can have a number of Codes. But instead of one row per code, I would like a list of all codes that occur for each element. What I want is a dictionary like this:
{"1234":["A","B"],"1456":["C"],"4556":["A","B","C"], ...}
What I have tried is this (I'm not including the file-reading code):
codelist = {}
for row in rows:
    name,code = row.split()
    if name in codelist.keys():
        codelist[name].append(code)
    else:
        codelist[name] = [code]
This creates the right output but progress becomes incredibly slow. So I've tried priming my dictionary with keys:
allnames = [.... list of all the names ...]
codelist = dict.fromkeys(allnames)
for row in rows:
    name,code = row.split()
    if codelist[name]:
        codelist[name].append(code)
    else:
        codelist[name] = [code]
This is dramatically faster, and my question is why? Doesn't the program still have to search all the keys in the dict each time? Is there another way to speed up the dict search that doesn't involve traversing a tree?
What is interesting is the error I get when I use the same conditional check as before (if name in codelist.keys():) after priming my dictionary:
Traceback (most recent call last):
  File ....
    codelist[name].append(code)
AttributeError: 'NoneType' object has no attribute 'append'
Now there is a key but no list to append to. So I use the check if codelist[name]:, whose value is None as well, and that appears to work. What does it mean when mydict["primed key"] is None?
The first version is slower because, in Python 2, .keys() has to create a list of all keys in memory first, and then the in operator performs a linear search on it. So it is an O(N) search for each row, which is why it is slow.
A plain key in dict test, on the other hand, takes O(1) time on average.
dict.fromkeys(allnames)
The default value assigned by dict.fromkeys is None, so you can't use append on it. Because None is falsy, your check if codelist[name]: fails the first time each name is seen, the else branch replaces None with a list, and later rows append to that list.
>>> d = dict.fromkeys('abc')
>>> d
{'a': None, 'c': None, 'b': None}
A better solution is to use collections.defaultdict here; if that is not an option, use a normal dict with either a simple if-else check or dict.setdefault.
In Python 3, .keys() returns a view object, so name in codelist.keys() is also O(1) on average there. It is still slightly slower than a plain key in dict test because of the extra method call.
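For the plain-dict route, a sketch with dict.setdefault could look like this. The rows list is a made-up stand-in for the spreadsheet lines, based on the sample in the question.

```python
# Sample rows standing in for the spreadsheet lines (file reading omitted).
rows = ["1234 A", "1234 B", "1456 C", "4556 A", "4556 B", "4556 C"]

codelist = {}
for row in rows:
    name, code = row.split()
    # Insert an empty list the first time a name is seen, then append to it.
    codelist.setdefault(name, []).append(code)

print(codelist)
# {'1234': ['A', 'B'], '1456': ['C'], '4556': ['A', 'B', 'C']}
```

Each row costs one hash lookup, so no priming of the dictionary is needed.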
You might want to have a look at the defaultdict container to avoid the checks:
from collections import defaultdict
# allnames is no longer needed: defaultdict creates each list on demand
codelist = defaultdict(list)
for row in rows:
    name,code = row.split()
    codelist[name].append(code)

Adding a new key to a nested dictionary in python

I need to add a key with a value that increases by one for every item in the nested dictionary. I have been trying to use the dict['key']='value' syntax but can't get it to work for a nested dictionary. I'm sure it's very simple.
My Dictionary:
mydict={'a':{'result':[{'key1':'value1','key2':'value2'},
{'key1':'value3','key2':'value4'}]}}
This is the code that will add the key to the main part of the dictionary:
for x in range(len(mydict)):
    number = 1+x
    str(number)
    mydict[u'index']=number
print mydict
#out: {u'index':u'1',u'a':{u'result':[...]}}
I want to add the new key and value to the small dictionaries inside the square brackets:
{'a':{'result':[{'key1':'value1',...,'index':'number'}]}}
If I try adding more levels of indexing to the last line of the for loop, I get a traceback:
Traceback (most recent call last):
  File "C:\Python27\program.py", line 34, in <module>
    main()
  File "C:\Python27\program.py", line 23, in main
    mydict['a']['result']['index']=number
TypeError: list indices must be integers, not unicode
I've tried various different ways of listing the nested items but no joy. Can anyone help me out here?
The problem is that mydict is not simply a collection of nested dictionaries. It contains a list as well. Breaking up the definition helps clarify the internal structure:
dictlist = [{'key1':'value1','key2':'value2'},
{'key1':'value3','key2':'value4'}]
resultdict = {'result':dictlist}
mydict = {'a':resultdict}
So to access the innermost values, we have to index one level at a time, working from the outside in:
mydict['a']
returns resultdict. Then this:
mydict['a']['result']
returns dictlist. Then this:
mydict['a']['result'][0]
returns the first item in dictlist. Finally, this:
mydict['a']['result'][0]['key1']
returns 'value1'
So now you just have to amend your for loop to iterate correctly over mydict. There are probably better ways, but here's a first approach:
for inner_dict in mydict['a']['result']: # remember that this returns `dictlist`
    for key in inner_dict:
        do_something(inner_dict, key)
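For the specific goal in the question, one possible sketch uses enumerate to number the inner dictionaries. It assumes the 'index' value should be an integer starting at 1; wrap it in str() if a string is wanted.

```python
mydict = {'a': {'result': [{'key1': 'value1', 'key2': 'value2'},
                           {'key1': 'value3', 'key2': 'value4'}]}}

# enumerate yields (counter, item) pairs; start=1 makes the counter begin at 1.
for number, inner_dict in enumerate(mydict['a']['result'], start=1):
    inner_dict['index'] = number

print(mydict['a']['result'][0]['index'])  # 1
print(mydict['a']['result'][1]['index'])  # 2
```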
I'm not fully sure what you're trying to do, but I think itertools.count could help here.
>>> import itertools
>>> c = itertools.count()
>>> next(c)
0
>>> next(c)
1
>>> next(c)
2
>>> next(c)
3
... and so on.
Using this, you can keep incrementing the value that you want to store in your dicts.
Hope this helps.

Join two CSV files in python using dictreader

I realise the info to answer this question is probably already on here, but as a Python newbie I've been trying to piece it together for a few weeks now and I'm running into trouble.
The question "Python "join" function like unix "join"" answers how to do a join on two lists easily, but the problem is that DictReader objects are iterables and not straightforward lists, which adds a layer of complication.
Basically, I am looking for an inner join on two CSV files using DictReader objects. Here's the code I have so far:
def test(dictreader1, dictreader2):
    matchedlist = []
    for dictline1 in dictreader1:
        for dictline2 in dictreader2:
            if dictline1['member']=dictline2['member']:
                matchedlist.append(dictline1, dictline2)
            else: continue
    return matchedlist
This gives me an error at the if statement but, more importantly, I don't seem to be able to access the ['member'] element of the dictionary within the iterable: the error says it has no attribute "getitem".
Does anyone have any thoughts on how to do this? For reference, I need to keep the lists as iterables because each file is too big to fit in memory. The plan is to control this entire function within another for loop that only feeds it a few lines at a time to iterate over. So it will read one line of the left hand file, iterate over the whole second file to find a member field that matches and then join the two lines, similar to an SQL join statement.
Thanks for any help in advance, please forgive any obvious errors on my part.
A few thoughts:
Replace the = with ==. The latter is used for equality tests; the former for assignments.
Add a line at the beginning, dictreader2 = list(dictreader2). A reader is exhausted after one pass, so this makes it possible to loop over its entries more than once. Note that it loads the second file into memory.
Add a second pair of parentheses: matchedlist.append((dictline1, dictline2)). The list.append method takes just one argument, so you want to create a tuple out of dictline1 and dictline2.
The final else: continue is unnecessary. A for-loop will automatically move on to the next item.
Use a print statement or some such to verify that dictline1 and dictline2 are both dictionary objects that have member as a key. It could be that your function is correct, but is being called with something other than a DictReader object.
Here is a worked out example using a list of dicts as input (similar to what a DictReader would return):
>>> def test(dictreader1, dictreader2):
...     dictreader2 = list(dictreader2)
...     matchedlist = []
...     for dictline1 in dictreader1:
...         for dictline2 in dictreader2:
...             if dictline1['member'] == dictline2['member']:
...                 matchedlist.append((dictline1, dictline2))
...     return matchedlist
...
>>> dr1 = [{'member': 2, 'value':'abc'}, {'member':3, 'value':'def'}]
>>> dr2 = [{'member': 4, 'tag':'t4'}, {'member':3, 'tag':'t3'}]
>>> test(dr1, dr2)
[({'member': 3, 'value': 'def'}, {'member': 3, 'tag': 't3'})]
A further suggestion is to combine the two dictionaries into a single entry (this is closer to what an SQL inner join would do):
>>> def test(dictreader1, dictreader2):
...     dictreader2 = list(dictreader2)
...     matchedlist = []
...     for dictline1 in dictreader1:
...         for dictline2 in dictreader2:
...             if dictline1['member'] == dictline2['member']:
...                 entry = dictline1.copy()
...                 entry.update(dictline2)
...                 matchedlist.append(entry)
...     return matchedlist
...
>>> test(dr1, dr2)
[{'member': 3, 'tag': 't3', 'value': 'def'}]
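If at least one of the two files can be indexed in memory, a dictionary lookup can replace the inner loop. The following is only a sketch under that assumption, written for Python 3: the function name inner_join, its key parameter, and the CSV data are all made up for illustration, with io.StringIO standing in for the real files.

```python
import csv
import io

def inner_join(reader1, reader2, key='member'):
    """Yield merged rows for every pair of rows that share the same key value."""
    # Index the second input by key; only this input is held in memory.
    index = {}
    for row in reader2:
        index.setdefault(row[key], []).append(row)
    # Stream the first input and look up matches in O(1) on average.
    for row1 in reader1:
        for row2 in index.get(row1[key], []):
            merged = dict(row1)
            merged.update(row2)
            yield merged

# Made-up CSV data standing in for the two files.
csv1 = io.StringIO("member,value\n2,abc\n3,def\n")
csv2 = io.StringIO("member,tag\n4,t4\n3,t3\n")

joined = list(inner_join(csv.DictReader(csv1), csv.DictReader(csv2)))
print(joined)  # [{'member': '3', 'value': 'def', 'tag': 't3'}]
```

Because inner_join is a generator, the first file is read one row at a time, and each file is scanned only once instead of once per row of the other file. Note that csv.DictReader returns every field as a string.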
Good luck with your project :-)
