I have a huge file (with around 200k inputs). The inputs are in the form:
A B C D
B E F
C A B D
D
I am reading this file and storing it in a list as follows:
text = f.read().split('\n')
This splits the file at every newline, so text looks like this:
['A B C D', 'B E F', 'C A B D', 'D']
I now have to store these values in a dictionary whose keys are the first element of each line, i.e. the keys will be A, B, C, D.
I am finding it difficult to store the remaining elements of each line as the values, i.e. the dictionary should look like:
{A: [B C D]; B: [E F]; C: [A B D]; D: []}
I have done the following:
inlinkDict = {}
for doc in text:
    adoc= doc.split(' ')
    docid = adoc[0]
    inlinkDict[docid] = inlinkDict.get(docid,0) + {I do not understand what to put in here}
How should I add the values to my dictionary? The value should be 0 if the line contains nothing except the key, as for D in the example.
A dictionary comprehension makes short work of this task:
>>> s = [['A','B','C','D'], ['B','E','F'], ['C','A','B','D'], ['D']]
>>> {t[0]:t[1:] for t in s}
{'A': ['B', 'C', 'D'], 'C': ['A', 'B', 'D'], 'B': ['E', 'F'], 'D': []}
Try using a slice:
inlinkDict[docid] = adoc[1:]
This will give you an empty list instead of a 0 for the case where only the key value is on the line. To get a 0 instead, use an or (which always returns one of the operands):
inlinkDict[docid] = adoc[1:] or 0
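Dropped into the question's loop, the slice gives something like the sketch below. It assumes the lines are already in a list like text from the question, and it skips blank lines, which f.read().split('\n') produces when the file ends with a newline. The function name build_inlinks is made up for illustration.

```python
def build_inlinks(lines):
    """Map the first token of each line to the list of remaining tokens."""
    inlink_dict = {}
    for line in lines:
        tokens = line.split()      # split on any run of whitespace
        if not tokens:             # skip blank lines (e.g. after a trailing newline)
            continue
        inlink_dict[tokens[0]] = tokens[1:]   # [] when only the key is present
    return inlink_dict


text = "A B C D\nB E F\nC A B D\nD\n".split("\n")
result = build_inlinks(text)
print(result)
```

Replace tokens[1:] with tokens[1:] or 0 if a 0 is really wanted for key-only lines.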
Easier way with a dict comprehension:
>>> with open('/tmp/spam.txt') as f:
...     data = [line.split() for line in f]
...
>>> {d[0]: d[1:] for d in data}
{'A': ['B', 'C', 'D'], 'C': ['A', 'B', 'D'], 'B': ['E', 'F'], 'D': []}
>>> {d[0]: ' '.join(d[1:]) if d[1:] else 0 for d in data}
{'A': 'B C D', 'C': 'A B D', 'B': 'E F', 'D': 0}
Note: dict keys must be unique, so if you have, say, two lines beginning with 'C' the first one will be over-written.
The accepted answer is correct, except that it reads the entire file into memory (may not be desirable if you have a large file), and it will overwrite duplicate keys.
An alternative approach using defaultdict, which is available from Python 2.5, solves this:
from collections import defaultdict
d = defaultdict(list)
with open('/tmp/spam.txt') as f:
    for line in f:
        parts = line.strip().split()
        d[parts[0]] += parts[1:]
Input:
A B C D
B E F
C A B D
D
C H I J
Result:
>>> d = defaultdict(list)
>>> with open('/tmp/spam.txt') as f:
...     for line in f:
...         parts = line.strip().split()
...         d[parts[0]] += parts[1:]
...
>>> d['C']
['A', 'B', 'D', 'H', 'I', 'J']
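One caveat with the loop above: parts[0] raises IndexError on a blank line. A minimal sketch that guards against that follows. It uses an in-memory io.StringIO in place of the real file, and the name merge_lines and the sample data are illustrative only.

```python
from collections import defaultdict
import io


def merge_lines(lines):
    """Accumulate values per key, merging lines that share a first token."""
    merged = defaultdict(list)
    for line in lines:
        parts = line.split()
        if not parts:            # blank line: there is no key to index
            continue
        merged[parts[0]].extend(parts[1:])
    return merged


# StringIO stands in for the open file; note the blank line in the middle
sample = io.StringIO("A B C D\nB E F\n\nC A B D\nD\nC H I J\n")
d = merge_lines(sample)
print(d["C"])
```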
Related
If I have a nested dictionary and varying lists:
d = {'a': {'b': {'c': {'d': 0}}}}
list1 = ['a', 'b']
list2 = ['a', 'b', 'c']
list3 = ['a', 'b', 'c', 'd']
How can I access dictionary values like so:
>>> d[list1]
{'c': {'d': 0}}
>>> d[list3]
0
You can use functools.reduce. Real Python has a nice post on reduce.
from functools import reduce
reduce(dict.get, list3, d)
>>> 0
EDIT: mix of lists and dictionaries
In case of mixed list and dictionary values, the following is possible:
d = {'a': [{'b0': {'c': 1}}, {'b1': {'c': 1}}]}
list1 = ['a', 1, 'b1', 'c']
fun = lambda element, indexer: element[indexer]
reduce(fun, list1, d)
>>> 1
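The lambda above does the same job as operator.getitem from the standard library, so the mixed case could also be sketched as follows. Unlike dict.get, getitem raises KeyError or IndexError at a missing step instead of passing None along.

```python
from functools import reduce
from operator import getitem

d = {'a': [{'b0': {'c': 1}}, {'b1': {'c': 1}}]}
path = ['a', 1, 'b1', 'c']

# getitem(obj, key) is obj[key], so it works for dicts and lists alike
value = reduce(getitem, path, d)
print(value)
```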
Use a short function:
def nested_get(d, lst):
    out = d
    for x in lst:
        out = out[x]
    return out
nested_get(d, list1)
# {'c': {'d': 0}}
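If some paths may not exist, a variant that returns a default is one option. This is only a sketch: the name nested_get_default and the default parameter are illustrative and not part of the answer above.

```python
def nested_get_default(d, keys, default=None):
    """Follow keys into nested containers; return default if a step is missing."""
    out = d
    for key in keys:
        try:
            out = out[key]
        except (KeyError, IndexError, TypeError):
            # KeyError: missing dict key; IndexError: list index out of range;
            # TypeError: tried to index something that is not a container
            return default
    return out


d = {'a': {'b': {'c': {'d': 0}}}}
print(nested_get_default(d, ['a', 'b']))
print(nested_get_default(d, ['a', 'x', 'c'], default='missing'))
```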
I have an existing dict that maps single values to lists.
I want to reverse this dictionary and map every list entry to its original key.
The list entries are unique.
Given:
dict { 1: ['a', 'b'], 2: ['c'] }
Result:
dict { 'a' : 1, 'b' : 1, 'c' : 2 }
How can this be done?
Here's an option
new_dict = {v: k for k, l in d.items() for v in l}
{'a': 1, 'b': 1, 'c': 2}
You can use a nested list comprehension to build (value, key) tuples, flatten the resulting list, and pass it to the built-in dict constructor:
d = { 1: ['a', 'b'], 2: ['c'] }
new_d = dict([c for h in [[(i, a) for i in b] for a, b in d.items()] for c in h])
Output:
{'a': 1, 'c': 2, 'b': 1}
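Both versions silently keep only one key if a list entry appears under more than one key. The question says the entries are unique, but if that is not guaranteed, a sketch like the following raises instead of overwriting. The name invert_strict is made up for this example.

```python
def invert_strict(mapping):
    """Invert {key: [values]} to {value: key}, rejecting repeated values."""
    inverted = {}
    for key, values in mapping.items():
        for value in values:
            if value in inverted:
                raise ValueError(
                    "%r appears under both %r and %r" % (value, inverted[value], key)
                )
            inverted[value] = key
    return inverted


d = {1: ['a', 'b'], 2: ['c']}
new_d = invert_strict(d)
print(new_d)
```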
I have this data.
CITY1 CITY2
A B
A C
A D
B C
B D
C D
How can I create a dictionary that looks like this from the above data?
x={A:[B,C,D],
B:[A,C,D],
C:[A,B,D],
D:[A,B,C]
}
Thanks
Is it in a CSV? From the data you provide, it looks like you are building an undirected graph. Assuming the data is in some kind of row format that you can loop through (i.e. row[0] is the city1 value and row[1] is the city2 value):
from collections import defaultdict
def make_graph(data):
    graph = defaultdict(set)
    for a, b in data:
        graph[a].add(b)
        graph[b].add(a)  # delete this line if you want a directed graph
    return graph
data = [
    ['A','B'],
    ['C','D'],
    ['A','C']
]
print make_graph(data)
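To get the list-valued dictionary from the question, the sets can be converted to sorted lists at the end. A sketch, assuming the file is whitespace-separated with a header row; the inline string stands in for the real file.

```python
from collections import defaultdict

raw = """CITY1 CITY2
A B
A C
A D
B C
B D
C D
"""

graph = defaultdict(set)
for line in raw.splitlines()[1:]:      # [1:] skips the header row
    parts = line.split()
    if len(parts) != 2:                # ignore blank or malformed rows
        continue
    a, b = parts
    graph[a].add(b)
    graph[b].add(a)                    # undirected: record both directions

# Convert each neighbour set to a sorted list, as in the desired output
x = {city: sorted(neighbours) for city, neighbours in graph.items()}
print(x)
```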
I tried to do it without importing any library.
I made a simple dictionary first.
x={'A':['B','C','D'],'B':['C','D'],'C':['D']}
for i,j in x.items():
    for p in j:
        if p not in x.keys():
            x[p]=[]
        if p in x[i] and i not in x[p]:
            x[p].append(i)
print x
{'A': ['B', 'C', 'D'], 'C': ['D', 'A', 'B'], 'B': ['C', 'D', 'A'], 'D': ['A', 'C', 'B']}
I have a dictionary and want to remove certain values in bad_list from its value list, and return the remainder. Here is the code:
d = {1: ['a', 'c', 'd'], 2: ['b'], 5: ['e']}
bad_list = ['d','e']
ad = {k:d[k].remove(i) for k in d.keys() for sublist in d[k] for i in sublist if i in bad_list}
print 'd =', d
print 'ad =', ad
Unfortunately what that does is it changes the values in d permanently, and returns None for values in ad.
d = {1: ['a', 'c'], 2: ['b'], 5: []}
ad = {1: None, 5: None}
How can I get a dictionary that looks like this:
new_dict = {1: ['a','c'], 2:['b']}
without looping through? I have a much larger dictionary to deal with, and I'd like to do it in the most efficient way.
There is no way to do it without a loop:
d = dict((key, [x for x in value if x not in bad_list]) for key, value in d.iteritems())
or with filter:
d = dict((key, filter(lambda x: x not in bad_list, d[key])) for key in d)
UPDATE
To exclude empty values:
d = dict((key, list(x)) for key in d for x in [set(d[key]).difference(bad_list)] if x)
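Since the real dictionary is large, it may help to turn bad_list into a set first, so that each membership test is a hash lookup rather than a scan of the list. Below is a sketch of the same filtering, written to run on Python 3 (d.items() rather than iteritems()). Unlike the set.difference version, it keeps the original order of each list. The names bad and new_dict are illustrative.

```python
d = {1: ['a', 'c', 'd'], 2: ['b'], 5: ['e']}
bad_list = ['d', 'e']

bad = set(bad_list)   # average O(1) membership tests

new_dict = {}
for key, values in d.items():
    kept = [v for v in values if v not in bad]   # builds a new list; d is untouched
    if kept:                                     # drop keys left with nothing
        new_dict[key] = kept

print(new_dict)
```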
Well, you could just use a comprehension. This one-liner works, though I find it ugly.
ad = {k:v for k,v in {k:[i for i in v if i not in bad_list] for k,v in d.items()}.items() if v}
I'd rather use a for loop.
ad2 = dict()
for k,v in d.items():
    _data_ = [item for item in v if item not in bad_list]
    if _data_:
        ad2[k]=_data_
Output:
print 'd =', d
print 'ad =', ad
print 'ad2=', ad2
>d = {1: ['a', 'c', 'd'], 2: ['b'], 5: ['e']}
>ad = {1: ['a', 'c'], 2: ['b']}
>ad2= {1: ['a', 'c'], 2: ['b']}
The following code written in Python 3.5 appears to do as requested in your question. Minimal change should be required for it to work with Python 2.x instead. Just use print statements instead of functions.
d = {1: ['a', 'c', 'd'], 2: ['b'], 5: ['e']}
bad_list = ['d', 'e']
ad = {a: b for a, b in ((a, [c for c in b if c not in bad_list]) for a, b in d.items()) if b}
print('d =', d)
print('ad =', ad)
In the dictionaries below I want to check whether the value in aa matches the value in bb and produce a mapping of the keys of aa to the keys of bb. Do I need to rearrange the dictionaries? I import the data from a tab separated file, so I am not attached to dictionaries. Note that aa is about 100 times bigger than bb (100k lines for aa), but this is to be run infrequently and offline.
Input:
aa = {1: 'a', 3: 'c', 2 : 'b', 4 : 'd'}
bb = {'apple': 'a', 'pear': 'b', 'mango' : 'g'}
Desired output (or any similar data structure):
dd = {1 : 'apple', 2 : 'pear'}
aa = {1:'a', 3:'c', 2:'b', 4:'d'}
bb = {'apple':'a', 'pear':'b', 'mango': 'g'}
bb_rev = dict((value, key)
              for key, value in bb.iteritems())  # bb.items() in python3
dd = dict((key, bb_rev[value])
          for key, value in aa.iteritems()  # aa.items() in python3
          if value in bb_rev)
print dd
You can do something like this:
>>> aa = {1: 'a', 3: 'c', 2 : 'b', 4 : 'd'}
>>> bb = {'apple': 'a', 'pear': 'b', 'mango' : 'g'}
>>> tmp = {v: k for k, v in bb.iteritems()}
>>> dd = {k: tmp[v] for k, v in aa.iteritems() if v in tmp}
>>> dd
{1: 'apple', 2: 'pear'}
but note that this will only work if each value of the aa dictionary appears as a value of the bb dictionary either once or not at all.
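If a value can occur more than once in bb, one option is to reverse bb into lists of keys so that nothing is lost. A sketch written for Python 3 (items() instead of iteritems()); the extra 'apricot' entry is made up here to show the duplicate case.

```python
from collections import defaultdict

aa = {1: 'a', 3: 'c', 2: 'b', 4: 'd'}
bb = {'apple': 'a', 'pear': 'b', 'mango': 'g', 'apricot': 'a'}

# Reverse bb, collecting every key that shares a value
bb_rev = defaultdict(list)
for name, value in bb.items():
    bb_rev[value].append(name)

# Test with "in" before indexing so the defaultdict gains no empty entries
dd = {key: bb_rev[value] for key, value in aa.items() if value in bb_rev}
print(dd)
```

Each key of aa then maps to a list of matching bb keys, and keys of aa with no match are left out.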