how to remove a specific element in the tuple? - python

How do I remove a specific element from the tuples used as dictionary keys?
For example:
L={('a','b','c','d'):1,('a','b','c','e'):2}
remove='b'
I want to get this result:
{('a','c','d'):1,('a','c','e'):2}

In [20]: L={('a','b','c','d'):1,('a','b','c','e'):2}
In [21]: {tuple(y for y in x if y != "b"):L[x] for x in L}
Out[21]: {('a', 'c', 'd'): 1, ('a', 'c', 'e'): 2}
or using filter():
In [24]: { tuple(filter(lambda y:y!="b",x)) : L[x] for x in L}
Out[24]: {('a', 'c', 'd'): 1, ('a', 'c', 'e'): 2}

You could create an updated version of the dictionary using a dictionary comprehension expression:
L = {('a', 'b', 'c', 'd'): 1, ('a', 'b', 'c', 'e'): 2, ('f', 'g', 'h'): 3}
remove='b'
L = {tuple(i for i in k if i != remove) if remove in k else k:v for (k,v) in L.items()}
print(L)
Output:
{('a', 'c', 'd'): 1, ('a', 'c', 'e'): 2, ('f', 'g', 'h'): 3}
As you can see, it leaves items without the element in their tuple key alone.
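The same idea can be wrapped in a small helper. This is only a sketch: the name remove_from_keys is made up for illustration, and it assumes that dropping the element never makes two keys equal (if it does, the later value silently overwrites the earlier one).

```python
def remove_from_keys(d, item):
    """Return a new dict whose tuple keys no longer contain `item`.

    Hypothetical helper for illustration, not part of the answers above.
    """
    return {tuple(x for x in key if x != item): value
            for key, value in d.items()}


L = {('a', 'b', 'c', 'd'): 1, ('a', 'b', 'c', 'e'): 2, ('f', 'g', 'h'): 3}
result = remove_from_keys(L, 'b')
print(result)
```

Keys that do not contain the element come out unchanged, so no separate "if remove in k" check is needed.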

Related

How do I convert a list of pairs into a dictionary with each element as a key to a list of paired values?

I'm doing coursework which involves graphs. I have edge lists E=[('a','b'),('a','c'),('a','d'), ('b','c') etc. ] and I want a function to convert them into adjacency matrices in the form of dictionaries {'a':['b','c','d'], 'b':['a', etc. } so that I can use a function that only accepts these dictionaries as input.
My main issue is I can't figure out how to use a loop to add key:values without just overwriting the lists. A previous version of my function would output [] as all values because 'f' has no connections.
I've tried this:
V = ['a','b','c','d','e','f']
E=[('a', 'b'), ('a', 'c'), ('a', 'd'), ('b', 'c'), ('b', 'd'), ('c', 'd')]
def EdgeListtoAdjMat(V,E):
    GA={}
    conneclist=[]
    for v in V:
        for i in range(len(V)):
            conneclist.append([])
            if (v,V[i]) in E:
                conneclist[i].append(V[i])
    for i in range(len(V)):
        GA[V[i]]=conneclist[i]
    return(GA)
EdgeListtoAdjMat(V,E) outputs:
{'a': [], 'b': ['b'], 'c': ['c', 'c'], 'd': ['d', 'd', 'd'], 'e': [], 'f': []}
whereas it should output:
{'a':['b','c','d'],
'b':['a','c','d'],
'c':['a','b','d'],
'd':['a','b','c'],
'e':[],
'f':[]
}
The logic of what you're trying to achieve is actually quite simple:
V = ['a','b','c','d','e','f']
E=[('a', 'b'), ('a', 'c'), ('a', 'd'), ('b', 'c'), ('b', 'd'), ('c', 'd')]
result = {}
for elem in V:
    tempList = []
    for item in E:
        if elem in item:
            if elem == item[0]:
                tempList.append(item[1])
            else:
                tempList.append(item[0])
    result[elem] = tempList
    tempList = []
print(result)
Result:
{'a': ['b', 'c', 'd'], 'b': ['a', 'c', 'd'], 'c': ['a', 'b', 'd'], 'd': ['a', 'b', 'c'], 'e': [], 'f': []}
For every element in V, check whether that element appears in any tuple in E. If it does, take the other element of that tuple and append it to a temporary list. After checking every tuple in E, store the list in the result dictionary and move on to the next element of V until you're done.
To get back to your code, you need to modify it as follows:
def EdgeListtoAdjMat(V,E):
    GA={}
    conneclist=[]
    for i in range(len(V)):
        for j in range(len(V)):
            # Checking if a pair of two different elements exists in either format inside E.
            if not i==j and ((V[i],V[j]) in E or (V[j],V[i]) in E):
                conneclist.append(V[j])
        GA[V[i]]=conneclist
        conneclist = []
    return(GA)
A more efficient approach is to iterate through the edges and append to the output dict of lists the vertices in both directions. Use dict.setdefault to initialize each new key with a list. And when the iterations over the edges finish, iterate over the rest of the vertices that are not yet in the output dict to assign to them empty lists:
def EdgeListtoAdjMat(V,E):
    GA = {}
    for a, b in E:
        GA.setdefault(a, []).append(b)
        GA.setdefault(b, []).append(a)
    for v in V:
        if v not in GA:
            GA[v] = []
    return GA
so that given:
V = ['a', 'b', 'c', 'd', 'e', 'f']
E = [('a', 'b'), ('a', 'c'), ('a', 'd'), ('b', 'c'), ('b', 'd'), ('c', 'd')]
EdgeListtoAdjMat(V, E) would return:
{'a': ['b', 'c', 'd'], 'b': ['a', 'c', 'd'], 'c': ['a', 'b', 'd'], 'd': ['a', 'b', 'c'], 'e': [], 'f': []}
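For comparison, the setdefault pattern can also be written with collections.defaultdict. This is a sketch of the same approach, not a different algorithm; the function name edge_list_to_adj_dict is made up here, and it assumes every edge is a 2-tuple.

```python
from collections import defaultdict


def edge_list_to_adj_dict(V, E):
    """Build {vertex: [neighbours]} for an undirected edge list.

    Illustrative name; assumes each edge in E is a 2-tuple.
    """
    adj = defaultdict(list)
    for a, b in E:
        adj[a].append(b)
        adj[b].append(a)
    for v in V:
        adj[v]  # looking up a missing key creates an empty list for it
    return dict(adj)  # plain dict, so later lookups no longer create keys


V = ['a', 'b', 'c', 'd', 'e', 'f']
E = [('a', 'b'), ('a', 'c'), ('a', 'd'), ('b', 'c'), ('b', 'd'), ('c', 'd')]
adjacency = edge_list_to_adj_dict(V, E)
print(adjacency)
```

Converting back to a plain dict at the end is a design choice: it avoids a typo in a later lookup quietly adding a new vertex.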
Since you already have your list of vertices in V, it is easy to prepare a dictionary with an empty list of connections for each vertex. Then, simply go through the edge list and add each endpoint to the other endpoint's list:
V = ['a','b','c','d','e','f']
E = [('a', 'b'), ('a', 'c'), ('a', 'd'), ('b', 'c'), ('b', 'd'), ('c', 'd')]
GA = {v:[] for v in V}
for v1,v2 in E:
    GA[v1].append(v2)
    GA[v2].append(v1)
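One caveat with appending to lists: if the edge list contains the same edge twice (or once in each direction), the neighbour shows up twice. A possible variation, assuming neighbour order does not matter, is to collect neighbours in sets and convert to sorted lists at the end. The extra ('b', 'a') edge below is added purely for illustration.

```python
V = ['a', 'b', 'c', 'd', 'e', 'f']
# Same edges as above, plus a repeated edge ('b', 'a') added for illustration
E = [('a', 'b'), ('a', 'c'), ('a', 'd'), ('b', 'c'), ('b', 'd'), ('c', 'd'),
     ('b', 'a')]

neighbours = {v: set() for v in V}
for v1, v2 in E:
    neighbours[v1].add(v2)
    neighbours[v2].add(v1)

# Convert back to sorted lists so the result has the requested shape
GA = {v: sorted(ns) for v, ns in neighbours.items()}
print(GA)
```

Like the list version, this raises KeyError if an edge mentions a vertex that is not in V.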
I think your code is not very Pythonic. You could write more readable code that is simpler to debug and also faster by using Python's built-ins and NumPy's indexing. Note that this returns an adjacency matrix (a NumPy array) rather than a dictionary:
import numpy as np
def EdgeListToAdjMat(V, E):
    AdjMat = np.zeros((len(V), len(V)))  # shape of the adjacency matrix
    connectlist = {
        # Mapping each vertex to its index
        x: idx for idx, x in enumerate(V)
    }
    for e in E:
        v1, v2 = e
        idx_1, idx_2 = connectlist[v1], connectlist[v2]
        AdjMat[idx_1, idx_2] = 1
        AdjMat[idx_2, idx_1] = 1
    return AdjMat
If you'd consider using a library, networkx is designed for this type of network problem:
import networkx as nx
V = ['a','b','c','d','e','f']
E = [('a', 'b'), ('a', 'c'), ('a', 'd'), ('b', 'c'), ('b', 'd'), ('c', 'd')]
G=nx.Graph(E)
G.add_nodes_from(V)
GA = nx.to_dict_of_lists(G)
print(GA)
# {'a': ['c', 'b', 'd'], 'c': ['a', 'b', 'd'], 'b': ['a', 'c', 'd'], 'e': [], 'd': ['a', 'c', 'b'], 'f': []}
You can convert the edge list to the map using itertools.groupby
from itertools import groupby
from operator import itemgetter
V = ['a','b','c','d','e','f']
E = [('a', 'b'), ('a', 'c'), ('a', 'd'), ('b', 'c'), ('b', 'd'), ('c', 'd')]
# add edge in the other direction. E.g., for a -> b, add b -> a
nondirected_edges = E + [tuple(reversed(pair)) for pair in E]
# extract start and end vertices from an edge
v_start = itemgetter(0)
v_end = itemgetter(1)
# group edges by their starting vertex
groups = groupby(sorted(nondirected_edges), key=v_start)
# make a map from each vertex -> adjacent vertices
mapping = {vertex: list(map(v_end, edges)) for vertex, edges in groups}
# if you don't need all the vertices to be present
# and just want to be able to lookup the connected
# list of vertices to a given vertex at some point
# you can use a defaultdict:
from collections import defaultdict
adj_matrix = defaultdict(list, mapping)
# if you need all vertices present immediately:
adj_matrix = dict(mapping)
adj_matrix.update({vertex: [] for vertex in V if vertex not in mapping})
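The sorted() call in the snippet above matters: itertools.groupby only groups consecutive items with the same key. A small sketch with a made-up, deliberately unsorted edge list shows what happens without it:

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical edge list, deliberately not sorted by start vertex
edges = [('b', 'c'), ('a', 'b'), ('b', 'd'), ('a', 'c')]

# Unsorted: 'b' and 'a' each appear in two separate runs,
# so the dict ends up holding only the last run for each key
unsorted_map = {k: [end for _, end in grp]
                for k, grp in groupby(edges, key=itemgetter(0))}

# Sorted: each start vertex forms exactly one group
sorted_map = {k: [end for _, end in grp]
              for k, grp in groupby(sorted(edges), key=itemgetter(0))}

print(unsorted_map)  # {'b': ['d'], 'a': ['c']}
print(sorted_map)    # {'a': ['b', 'c'], 'b': ['c', 'd']}
```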

Find elements that appear in more than k sets in python

I am implementing a basic spell-correction system and I have built an inverted index for my domain's language, where every character bigram is mapped to a list of words that contain that bigram.
Now I want to find all words that share more than 3 character bigrams with the given word w. So the main problem is: given a set of lists, how can one efficiently find elements that occur in 3 or more of them?
For example, given sets:
('a', 'b', 'c', 'd') , ('a', 'e', 'f', 'g'), ('e', 'f', 'g', 'h'), ('b', 'c', 'z', 'y'), ('e', 'k', 'a', 'j')
I would like to get the output:
('a', 'e')
since a and e have each appeared in at least 3 sets.
I would appreciate your ideas.
In addition to @Ralf's answer: you can use a dict to build a histogram.
someCollection = [('a', 'b', 'c', 'd') , ('a', 'e', 'f', 'g'), ('e', 'f', 'g', 'h'), ('b', 'c', 'z', 'y'), ('e', 'k', 'a', 'j')]
hist = {}
for collection in someCollection:
    for member in collection:
        hist[member] = hist.get(member, 0) + 1
hist is now:
{'a': 3,
'b': 2,
'c': 2,
'd': 1,
'e': 3,
'f': 2,
'g': 2,
'h': 1,
'z': 1,
'y': 1,
'k': 1,
'j': 1}
which can be sorted with sorted(hist.items(), key=lambda x: x[1])  # sort by value
You could try using collections.Counter:
from collections import Counter
data = [
    ('a', 'b', 'c', 'd'),
    ('a', 'e', 'f', 'g'),
    ('e', 'f', 'g', 'h'),
    ('b', 'c', 'z', 'y'),
    ('e', 'k', 'a', 'j'),
]
c = Counter()
for e in data:
    c.update(e)
# print(c)
# for k, v in c.items():
#     if v >= 3:
#         print(k, v)
You get the output by using this (or something similar):
>>> [k for k, v in c.items() if v >= 3]
['a', 'e']
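If a collection can contain the same element more than once (for example, a posting list built from a word with a repeated bigram), counting raw occurrences would overstate how many collections the element appears in. A possible refinement, sketched below with a made-up function name, is to count each element once per collection and make the threshold a parameter:

```python
from collections import Counter


def members_in_at_least(collections_, k):
    """Return the elements that occur in at least k of the given collections.

    Hypothetical helper for illustration.
    """
    counts = Counter()
    for coll in collections_:
        counts.update(set(coll))  # set(): count each element once per collection
    return sorted(elem for elem, n in counts.items() if n >= k)


data = [
    ('a', 'b', 'c', 'd'),
    ('a', 'e', 'f', 'g'),
    ('e', 'f', 'g', 'h'),
    ('b', 'c', 'z', 'y'),
    ('e', 'k', 'a', 'j'),
]
print(members_in_at_least(data, 3))  # ['a', 'e']
```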

Python - Sorting a Tuple-Keyed Dictionary by Tuple Value

I have a dictionary made up of tuple keys and integer counts, and I want to sort it by the third value of the tuple (key[2]), like so:
data = {(a, b, c, d): 1, (b, c, b, a): 4, (a, f, l, s): 3, (c, d, j, a): 7}
print sorted(data.iteritems(), key = lambda x: data.keys()[2])
with this desired output
>>> {(b, c, b, a): 4, (a, b, c, d): 1, (c, d, j, a): 7, (a, f, l, s): 3}
but my current code seems to do nothing. How should this be done?
Edit: The appropriate code is
sorted(data.iteritems(), key = lambda x: x[0][2])
but in the context
from collections import OrderedDict
data = {('a', 'b', 'c', 'd'): 1, ('b', 'c', 'b', 'a'): 4, ('a', 'f', 'l', 's'): 3, ('c', 'd', 'j', 'a'): 7}
xxx = []
yyy = []
zzz = OrderedDict()
for key, value in sorted(data.iteritems(), key = lambda x: x[0][2]):
    x = key[2]
    y = key[3]
    xxx.append(x)
    yyy.append(y)
    zzz[x + y] = 1
print xxx
print yyy
print zzz
zzz is unordered. I know that this is because dictionaries are by default unordered and that I need to use OrderedDict to sort it but I don't know where to use it. If I use it as the checked answer suggests I get a 'tuple index out of range' error.
Solution:
from collections import OrderedDict
data = {('a', 'b', 'c', 'd'): 1, ('b', 'c', 'b', 'a'): 4, ('a', 'f', 'l', 's'): 3, ('c', 'd', 'j', 'a'): 7}
xxx = []
yyy = []
zzz = OrderedDict()
for key, value in sorted(data.iteritems(), key = lambda x: x[0][2]):
    x = key[2]
    y = key[3]
    xxx.append(x)
    yyy.append(y)
    zzz[x + y] = 1
print xxx
print yyy
print zzz
Dictionaries are unordered in Python 2 (insertion order is only guaranteed from Python 3.7 on). You can, however, use an OrderedDict.
You then have to sort like this:
from collections import OrderedDict
result = OrderedDict(sorted(data.iteritems(),key=lambda x:x[0][2]))
You need to use key=lambda x:x[0][2] because the elements are tuples (key,val) and so to obtain the key, you use x[0].
This gives:
>>> data = {('a', 'b', 'c', 'd'): 1, ('b', 'c', 'b', 'a'): 4, ('a', 'f', 'l', 's'): 3, ('c', 'd', 'j', 'a'): 7}
>>> from collections import OrderedDict
>>> result = OrderedDict(sorted(data.iteritems(),key=lambda x:x[0][2]))
>>> result
OrderedDict([(('b', 'c', 'b', 'a'), 4), (('a', 'b', 'c', 'd'), 1), (('c', 'd', 'j', 'a'), 7), (('a', 'f', 'l', 's'), 3)])
EDIT:
In order to make zzz ordered as well, you can update your code to:
data = {('a', 'b', 'c', 'd'): 1, ('b', 'c', 'b', 'a'): 4, ('a', 'f', 'l', 's'): 3, ('c', 'd', 'j', 'a'): 7}
xxx = []
yyy = []
zzz = OrderedDict()
for key, value in sorted(data.iteritems(), key = lambda x: x[0][2]):
    x = key[2]
    y = key[3]
    xxx.append(x)
    yyy.append(y)
    zzz[x + y] = 1
print xxx
print yyy
print zzz
Your key function is completely broken. It's passed the current item as x, but you ignore that and instead always get the third item from the list of keys.
Use key=lambda x: x[0][2] instead.
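For readers on Python 3, here is a rough equivalent of the same sort. It assumes Python 3.7 or later, where a plain dict keeps insertion order, so OrderedDict is optional; iteritems() no longer exists there and items() takes its place.

```python
data = {('a', 'b', 'c', 'd'): 1, ('b', 'c', 'b', 'a'): 4,
        ('a', 'f', 'l', 's'): 3, ('c', 'd', 'j', 'a'): 7}

# Each item is a (key, value) pair, so kv[0][2] is the third element of the key
ordered = dict(sorted(data.items(), key=lambda kv: kv[0][2]))
print(ordered)
# {('b', 'c', 'b', 'a'): 4, ('a', 'b', 'c', 'd'): 1, ('c', 'd', 'j', 'a'): 7, ('a', 'f', 'l', 's'): 3}
```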

Custom list of lists into dictionary

I have a list of lists and I wish to create a dictionary with each sublist as a key and its length as the value. I tried the following:
tmp = [['A', 'B', 'E'], ['B', 'E', 'F'], ['A', 'G']]
tab = []
for line in tmp:
    tab.append(dict((k, len(tmp)) for k in line))
But it gives the output as:
[{'A': 3, 'B': 3, 'E': 3}, {'B': 3, 'E': 3, 'F': 3}, {'A': 3, 'G': 3}]
What is the modification that I should make to get the output:
{['A', 'B', 'E']:3, ['B', 'E', 'F']:3, ['A', 'G']:2}
Thanks in advance.
AP
You can't use list objects as dictionary keys, they are mutable and unhashable. You can convert them to tuple. Also note that you are looping over each sub list. You can use a generator expression by only looping over the main list:
In [3]: dict((tuple(sub), len(sub)) for sub in tmp)
Out[3]: {('A', 'B', 'E'): 3, ('A', 'G'): 2, ('B', 'E', 'F'): 3}
{tuple(t):len(t) for t in tmp}
Input :
[['A', 'B', 'E'], ['B', 'E', 'F'], ['A', 'G']]
Output :
{('A', 'G'): 2, ('A', 'B', 'E'): 3, ('B', 'E', 'F'): 3}
A dictionary does not accept a list as a key, but it does accept a tuple.
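To see the difference concretely, here is a short sketch that first tries a list as a key and then falls back to tuples. The exact wording of the error message can vary between Python versions, so it is only printed, not relied on.

```python
tmp = [['A', 'B', 'E'], ['B', 'E', 'F'], ['A', 'G']]

try:
    bad = {tmp[0]: len(tmp[0])}  # a list is mutable and cannot be hashed
except TypeError as exc:
    print("list as key failed:", exc)

# Tuples are immutable and hashable, so they work as keys
lengths = {tuple(sub): len(sub) for sub in tmp}
print(lengths)
# {('A', 'B', 'E'): 3, ('B', 'E', 'F'): 3, ('A', 'G'): 2}
```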

Summing similar elements within a tuple-of-tuples

Following on from this question, I now need to sum similar entries (tuples) within an overall tuple.
So given a tuple-of-tuples such as:
T = (('a', 'b', 2),
('a', 'c', 4),
('b', 'c', 1),
('a', 'b', 8),)
For all tuples where the first and second element are identical, I want to sum the third element, otherwise, leave the tuple in place. So I will end up with the following tuple-of-tuples:
(('a', 'b', 10),
('a', 'c', 4),
('b', 'c', 1),)
The order of the tuples within the enclosing tuple (and the summing) doesn't matter.
We are dealing with tuples so we can't take advantage of something like dict.get(). If we go the defaultdict route :
In [1218]: d = defaultdict(lambda: defaultdict(int))
In [1220]: for t in T:
   ......:     d[t[0]][t[1]] += t[2]
   ......:
In [1225]: d
Out[1225]:
defaultdict(<function __main__.<lambda>>,
{'a': defaultdict(int, {'b': 10, 'c': 4}),
'b': defaultdict(int, {'c': 1})})
I'm not quite sure how to reconstruct that into a tuple-of-tuples. And anyway, although the order of the three elements within each tuple will be consistent, I'm not comfortable with my indexing of the tuples. Can this be done without any conversion to other data types?
Code -
from collections import defaultdict
T1 = (('a', 'b', 2),
('a', 'c', 4),
('b', 'c', 1),
('a', 'b', 8),)
d = defaultdict(int)
for x, y, z in T1:
    d[(x, y)] += z
T2 = tuple([(*k, v) for k, v in d.items()])
print(T2)
Output -
(('a', 'c', 4), ('b', 'c', 1), ('a', 'b', 10))
If you're interested in maintaining the original order, then -
from collections import OrderedDict
T1 = (('a', 'b', 2), ('a', 'c', 4), ('b', 'c', 1), ('a', 'b', 8),)
d = OrderedDict()
for x, y, z in T1:
    d[(x, y)] = d[(x, y)] + z if (x, y) in d else z
T2 = tuple((*k, v) for k, v in d.items())
print(T2)
Output -
(('a', 'b', 10), ('a', 'c', 4), ('b', 'c', 1))
In Python 2, you should use this -
T2 = tuple([(x, y, z) for (x, y), z in d.items()])
You just need a defaultdict(int):
>>> from collections import defaultdict
>>>
>>> d = defaultdict(int)
>>> T = (('a', 'b', 2),
... ('a', 'c', 4),
... ('b', 'c', 1),
... ('a', 'b', 8),)
>>>
>>> for key1, key2, value in T:
... d[(key1, key2)] += value
...
>>> [(key1, key2, value) for (key1, key2), value in d.items()]
[
('b', 'c', 1),
('a', 'b', 10),
('a', 'c', 4)
]
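A plain dict with dict.get() works too. The sketch below assumes Python 3.7 or later, where dicts keep insertion order, so the result follows the first appearance of each pair; the variable names are chosen here for illustration.

```python
T = (('a', 'b', 2),
     ('a', 'c', 4),
     ('b', 'c', 1),
     ('a', 'b', 8))

totals = {}
for first, second, amount in T:
    # Key on the (first, second) pair and accumulate the third element
    totals[(first, second)] = totals.get((first, second), 0) + amount

# Rebuild a tuple-of-tuples from the accumulated totals
summed = tuple((first, second, total)
               for (first, second), total in totals.items())
print(summed)
# (('a', 'b', 10), ('a', 'c', 4), ('b', 'c', 1))
```

Unpacking each tuple in the for statement also avoids the t[0], t[1], t[2] indexing the question was uneasy about.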
