Dictionary with multiple values per key via for loop - python

Given a list in Python, I want to create a dictionary that stores all possible two-element sums as keys and the corresponding index pairs as values, e.g.
list = [1,0,-1, 0]
Then I would like to compute the dictionary {1: [{0,1}, {0,3}], 0: [{1,3}, {0,2}], -1: [{1,2}, {2,3}]}.
I am having trouble working out how to have a dictionary where one key corresponds to multiple values. If I use dict[sum]={i,j}, I always replace the entry in my dictionary, whereas I would like to add to it.
Does anyone know a solution?

IIUC, use a dictionary with setdefault to add the results and itertools.combinations to generate the combinations of indices:
lst = [1,0,-1, 0]
from itertools import combinations
out = {}
for i,j in combinations(range(len(lst)), 2):
    a = lst[i] # first value
    b = lst[j] # second value
    S = a+b # sum of values
    # if the key is missing, add empty list
    # append combination of indices as value
    out.setdefault(S, []).append((i,j))
print(out)
Condensed variant:
out = {}
for i,j in combinations(range(len(lst)), 2):
    out.setdefault(lst[i]+lst[j], []).append((i,j))
output:
{ 1: [(0, 1), (0, 3)],
0: [(0, 2), (1, 3)],
-1: [(1, 2), (2, 3)]}
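A variant of the same idea, sketched here with collections.defaultdict instead of setdefault (the name pairs_by_sum is only illustrative). The default factory creates the empty list the first time a key is seen, so the loop body only appends:

```python
from collections import defaultdict
from itertools import combinations

lst = [1, 0, -1, 0]

# Missing keys are created on first access with an empty list as value.
pairs_by_sum = defaultdict(list)
for i, j in combinations(range(len(lst)), 2):
    pairs_by_sum[lst[i] + lst[j]].append((i, j))

# Convert to a plain dict for display.
print(dict(pairs_by_sum))
```

Whether setdefault or defaultdict reads better is mostly a matter of taste; defaultdict avoids building a throwaway empty list on every iteration.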

Try this:
arr = [1, 0, -1, 0]
map = {}
for i in range(len(arr)):
    for j in range(i + 1, len(arr)):
        s = arr[i] + arr[j]
        if s not in map:
            map[s] = []
        map[s].append((i, j))
print(map)

Related

Sparse matrix subtraction

I need to write a function which gets a list of dictionaries (every dictionary represents a sparse matrix) and returns a dictionary of the subtraction matrix.
For example: for the list [{(1, 3): 2, (2, 7): 1}, {(1, 3): 6}] it needs to return {(1, 3): -4, (2, 7): 1} .
The matrices don't have to be the same size, the list can have more than two matrices and if the subtraction is 0 then it should not appear in the final dictionary.
I succeeded in getting the -4 but no matter what I write after defining x I get x == -6 and I can't tell why. I want to insert the -4 as the new value for the element.
lst = [{(1, 3): 2, (2, 7): 1}, {(1, 3): 6}]
def diff_sparse_matrices(lst):
    result = {}
    for dictionary in lst:
        for element in dictionary:
            if element not in result:
                result[element] = dictionary[element]
            if element in result:
                x = result[element] - dictionary[element]
def diff_sparse_matrices(lst):
    result = lst[0].copy()
    for matrix in lst[1:]:
        for coordinates, value in matrix.items():
            result[coordinates] = result.get(coordinates, 0) - value
            if result[coordinates] == 0:
                del result[coordinates]
    return result
def diff_sparse_matrices(lst):
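    result = lst[0].copy()
    for d in lst[1:]:
        for tup in d:
            if tup in result:
                result[tup] -= d[tup]
            else:
                result[tup] = -d[tup]
    # Entries that cancel to 0 must not appear in the final dictionary
    return {tup: value for tup, value in result.items() if value != 0}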
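To check the behaviour with more than two matrices and an entry that cancels out, here is a minimal sketch. The helper name subtract_sparse and the three-matrix example are made up for illustration; the logic follows the dict.get approach above, with zeros filtered once at the end:

```python
def subtract_sparse(matrices):
    """Subtract every later sparse matrix from the first one (hypothetical helper)."""
    if not matrices:
        return {}
    result = dict(matrices[0])
    for matrix in matrices[1:]:
        for coords, value in matrix.items():
            # Missing coordinates count as 0 in the running result.
            result[coords] = result.get(coords, 0) - value
    # Drop entries that cancelled out.
    return {coords: value for coords, value in result.items() if value != 0}


# (0, 0) cancels: 5 - 3 - 2 == 0, so it is absent from the result.
example = [{(0, 0): 5, (1, 1): 2}, {(0, 0): 3}, {(0, 0): 2, (2, 2): 1}]
print(subtract_sparse(example))
print(subtract_sparse([{(1, 3): 2, (2, 7): 1}, {(1, 3): 6}]))
```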

Order dictionary with x and y coordinates in python

I have this problem.
I need to order these points, numbered 1-7:
1(4,2), 2(3, 5), 3(1,4), 4(1,1), 5(2,2), 6(1,3), 7(1,5)
and get this result:
4 , 6 , 3 , 5 , 2 , 1 , 7.
I am using a Python script that sorts by x, and that works, but the sort by y is wrong.
I have tried sorted(dicts,key=itemgetter(1,2)).
Can someone help me, please?
Try this:
sorted(dicts,key=itemgetter(1,0))
Indexing in Python starts at 0, so itemgetter(1,0) sorts by the second element and then by the first element.
This sorts by the first coordinate of the tuple and then sub-orders by the second coordinate, i.e. like alphabetical order, where "Aa" comes first, then "Ab", then "Ba", then "Bb". More literally: (1,1), (1,2), (2,1), (2,2), etc.
This will work if (and only if) the tuple associated with #7 is actually out of order in your question (and should really be between #3 and #5).
If this is NOT the case, see my other answer.
# Make it a dictionary, with the VALUETUPLES as the KEYS, and the designator as the value
d = {(1,1):4, (1,3):6, (1,4):3, (2,2):5, (3,5):2, (4,2):1,(1,5):7}
# ALSO make a list of just the value tuples
l = [ (1,1), (1,3), (1,4), (2,2), (3,5), (4,2), (1,5)]
# Sort the list by the first element in each tuple. ignoring the second
new = sorted(l, key=lambda x: x[0])
# Create a new dictionary, basically for temp sorting
new_d = {}
# This iterates through the first sorted list "new"
# and creates a dictionary where the key is the first number of value tuples
count = 0
# The extended range is because we don't know if any of the Tuple Values share any same numbers
for r in range(0, len(new)+1,1):
    count += 1
    new_d[r] = []
    for item in new:
        if item[0] == r:
            new_d[r].append(item)
print(new_d) # So it makes sense
# Make a final list to capture the ordered TUPLE VALUES
final_list = []
# Go through the same range as above
for r in range(0, len(new)+1,1):
    _list = new_d[r] # Grab the list stored for this first coordinate. Order does not matter here
    if len(_list) > 0: # If the list has any values...
        # Sort that list now by the SECOND tuple value
        _list = sorted(_list, key=lambda x: x[1])
        # Lists are ordered. So we can now just tack that ordered list onto the final list.
        # The order remains
        for item in _list:
            final_list.append(item)
# This is all the tuple values in order
print(final_list)
# If you need them correlated to their original numbers
by_designator_num = []
for i in final_list: # Each tuple value, in sorted order
    by_designator_num.append(d[i]) # Use the tuple value as the key, to get the original designator number from the original "d" dictionary
print(by_designator_num)
OUTPUT:
[(1, 1), (1, 3), (1, 4), (1, 5), (2, 2), (3, 5), (4, 2)]
[4, 6, 3, 7, 5, 2, 1]
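Because Python compares tuples lexicographically (first element, then second), the same ordering can also be produced with a single sorted call. A minimal sketch, assuming the points are stored as a dictionary from designator to tuple (the names points and order are illustrative):

```python
# Designator -> (first, second) coordinate, as listed in the question.
points = {1: (4, 2), 2: (3, 5), 3: (1, 4), 4: (1, 1), 5: (2, 2), 6: (1, 3), 7: (1, 5)}

# Sorting the keys by their tuple orders by first coordinate, then second.
order = sorted(points, key=points.get)
print(order)
```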
Since you're searching visually from top to bottom, then left to right, this code is much simpler and provides the correct result. It basically does the equivalent of a visual scan, by checking for all tuples that are at each "y=n" position, and then sorting any "y=n" tuples by their x coordinate (left to right).
Just to be more consistent with the Cartesian number system, I've converted the points on the graph to (x,y) coordinates, with X-positive (increasing to the right) and y-negative (decreasing as they go down).
d = {(2,-4):1, (5,-3):2, (4,-1):3, (1,-1):4, (2,-2):5, (3,-1):6, (1,-5):7}
l = [(2,-4), (5,-3), (4,-1), (1,-1), (2,-2), (3,-1), (1,-5)]
results = []
# Use the length of the list. It's more than needed, but guarantees enough loops
for y in range(0, -len(l), -1):
    # For ONLY the items found at the specified y coordinate
    temp_list = []
    for i in l: # Loop through ALL the items in the list
        if i[1] == y: # If tuple is at this "y" coordinate then...
            temp_list.append(i) # ... append it to the temp list
    # Now sort the list based on the "x" position of the coordinate
    temp_list = sorted(temp_list, key=lambda x: x[0])
    results += temp_list # And just append it to the final result list
# Final TUPLES in order
print(results)
# If you need them correlated to their original numbers
by_designator_num = []
for i in results: # Each tuple value, in scan order
    by_designator_num.append(d[i]) # Use the tuple value as the key, to get the original designator number from the original "d" dictionary
print(by_designator_num)
OR if you want it faster and more compact
d = {(2,-4):1, (5,-3):2, (4,-1):3, (1,-1):4, (2,-2):5, (3,-1):6, (1,-5):7}
l = [(2,-4), (5,-3), (4,-1), (1,-1), (2,-2), (3,-1), (1,-5)]
results = []
for y in range(0, -len(l), -1):
    results += sorted([i for i in l if i[1] == y ], key=lambda x: x[0])
print(results)
by_designator_num = [d[i] for i in results]
print(by_designator_num)
OUTPUT:
[(1, -1), (3, -1), (4, -1), (2, -2), (5, -3), (2, -4), (1, -5)]
[4, 6, 3, 5, 2, 1, 7]
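The row-by-row scan can also be expressed as one sort with a composite key: descending y first (top to bottom), then ascending x (left to right). This is only a sketch of that alternative, reusing the dictionary d from above; scan_order and designators are illustrative names:

```python
d = {(2, -4): 1, (5, -3): 2, (4, -1): 3, (1, -1): 4, (2, -2): 5, (3, -1): 6, (1, -5): 7}

# Negating y makes the highest row (y closest to 0) sort first; x breaks ties.
scan_order = sorted(d, key=lambda point: (-point[1], point[0]))
designators = [d[point] for point in scan_order]

print(scan_order)
print(designators)
```

Unlike the loop over range(0, -len(l), -1), this does not depend on the y values being small integers.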

Iterating multiple variables over a list without double counting?

I was working on part of a program in which I'm trying to input a list of numbers and return all groups of 3 numbers which sum to 0, without double or triple counting each number. Here's where I'm up to:
def threeSumZero2(array):
    sums = []
    apnd=[sorted([x,y,z]) for x in array for y in array for z in array if x+y+z==0]
    for sets in apnd:
        if sets not in sums:
            sums.append(sets)
    return sums
Is there any code I can put in the third line to make sure I don't return [0,0,0] as an answer?
This is my test list:
[-1,0,1,2,-1,4]
Thank you
*Edit: I should have clarified for repeated input values: the result expected for this test list is:
[[-1,-1,2],[-1,0,1]]
You want combinations without replacement, which is something itertools offers. Your sums can then be collected in a set to remove duplicates that differ only in ordering.
from itertools import combinations
def threeSumZero2(array):
    sums = set()
    for comb in combinations(array, 3):
        if sum(comb) == 0:
            sums.add(tuple(sorted(comb)))
    return sums
print(threeSumZero2([-1,0,1,2,-1,4]))
Output
{(-1, -1, 2), (-1, 0, 1)}
This solution can also be written more concisely using a set-comprehension.
def threeSumZero2(nums):
    return {tuple(sorted(comb)) for comb in combinations(nums, 3) if sum(comb) == 0}
More efficient algorithm
However, the algorithm above traverses all combinations of three items, which makes it O(n^3).
A general strategy for this kind of n-sum problem is to traverse the (n-1)-combinations and hash their sums, so that they can be tested efficiently against the numbers in the list.
The complexity drops by one order, to roughly O(n^2), as long as only a few pairs share the same sum; if many pairs collide on one sum, the second loop approaches O(n^3) again.
from itertools import combinations
def threeSumZero2(nums, r=3):
    two_sums = {}
    for (i_x, x), (i_y, y) in combinations(enumerate(nums), r - 1):
        two_sums.setdefault(x + y, []).append((i_x, i_y))
    sums = set()
    for i, n in enumerate(nums):
        if -n in two_sums:
            sums |= {tuple(sorted([nums[idx[0]], nums[idx[1]], n]))
                     for idx in two_sums[-n] if i not in idx}
    return sums
print(threeSumZero2([-1,0,1,2,-1,4]))
Output
{(-1, -1, 2), (-1, 0, 1)}
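For comparison, a different and commonly used O(n^2) technique is to sort the list and move two pointers inward for each fixed first element. This is not the hashing approach above, just a sketch of that alternative; the function name three_sum_sorted is hypothetical:

```python
def three_sum_sorted(nums):
    """Return the set of value triples summing to 0 (sort + two pointers)."""
    nums = sorted(nums)
    found = set()
    for i in range(len(nums) - 2):
        lo, hi = i + 1, len(nums) - 1
        while lo < hi:
            total = nums[i] + nums[lo] + nums[hi]
            if total == 0:
                # Already in ascending order because nums is sorted.
                found.add((nums[i], nums[lo], nums[hi]))
                lo += 1
                hi -= 1
            elif total < 0:
                lo += 1  # need a larger sum
            else:
                hi -= 1  # need a smaller sum
    return found


print(three_sum_sorted([-1, 0, 1, 2, -1, 4]))
```

Each index is used at most once per triple, so [0, 0, 0] is only reported when the input really contains three zeros.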
You could do this with itertools (see Oliver's answer), but you can also achieve the result with three nested for-loops:
def threeSumZero2(lst):
    groups = []
    for i in range(len(lst)-2):
        for j in range(i + 1, len(lst)-1):
            for k in range(j + 1, len(lst)):
                if lst[i] + lst[j] + lst[k] == 0:
                    groups.append((lst[i], lst[j], lst[k]))
    return groups
and your test:
>>> threeSumZero2([-1, 0, 1, 2, -1, 4])
[(-1, 0, 1), (-1, 2, -1), (0, 1, -1)]
Oh and list != array!

Find indexes of repeated elements in an array (Python, NumPy)

Assume, I have a NumPy-array of integers, as:
[34,2,3,22,22,22,22,22,22,18,90,5,-55,-19,22,6,6,6,6,6,6,6,6,23,53,1,5,-42,82]
I want to find the start and end indices of the stretches of the array where a value is repeated more than x times (say 5 times). In the case above, these are the values 22 and 6. The start index of the repeated 22 is 3 and the end index is 8. The same goes for the repeated 6.
Is there a special tool in Python that is helpful?
Otherwise, I would loop through the array index by index and compare the current value with the previous one.
Regards.
Using np.diff and the method given here by @WarrenWeckesser for finding runs of zeros in an array:
import numpy as np
def zero_runs(a): # from link
    iszero = np.concatenate(([0], np.equal(a, 0).view(np.int8), [0]))
    absdiff = np.abs(np.diff(iszero))
    ranges = np.where(absdiff == 1)[0].reshape(-1, 2)
    return ranges
a = [34,2,3,22,22,22,22,22,22,18,90,5,-55,-19,22,6,6,6,6,6,6,6,6,23,53,1,5,-42,82]
zero_runs(np.diff(a))
Out[87]:
array([[ 3, 8],
[15, 22]], dtype=int32)
This can then be filtered on the difference between the start & end of the run:
runs = zero_runs(np.diff(a))
runs[runs[:, 1]-runs[:, 0]>5] # runs of 7 or more, to illustrate filter
Out[96]: array([[15, 22]], dtype=int32)
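The same idea can be written directly in terms of run boundaries, without going through runs of zeros. The sketch below is one way to do that under the assumption that inclusive (start, end) index pairs are wanted; find_runs and min_len are illustrative names, not part of NumPy:

```python
import numpy as np


def find_runs(values, min_len):
    """Return inclusive (start, end) index pairs of runs with at least min_len equal values."""
    arr = np.asarray(values)
    if arr.size == 0:
        return []
    # Positions where the value differs from its predecessor start a new run.
    change = np.flatnonzero(np.diff(arr) != 0) + 1
    starts = np.concatenate(([0], change))
    ends = np.concatenate((change, [arr.size])) - 1
    keep = (ends - starts + 1) >= min_len
    return [(int(s), int(e)) for s, e in zip(starts[keep], ends[keep])]


a = [34, 2, 3, 22, 22, 22, 22, 22, 22, 18, 90, 5, -55, -19, 22,
     6, 6, 6, 6, 6, 6, 6, 6, 23, 53, 1, 5, -42, 82]
print(find_runs(a, 5))
```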
Here is a solution using Python's native itertools.
Code
import itertools as it
def find_ranges(lst, n=2):
    """Return ranges for `n` or more repeated values."""
    groups = ((k, tuple(g)) for k, g in it.groupby(enumerate(lst), lambda x: x[-1]))
    repeated = (idx_g for k, idx_g in groups if len(idx_g) >=n)
    return ((sub[0][0], sub[-1][0]) for sub in repeated)
lst = [34,2,3,22,22,22,22,22,22,18,90,5,-55,-19,22,6,6,6,6,6,6,6,6,23,53,1,5,-42,82]
list(find_ranges(lst, 5))
# [(3, 8), (15, 22)]
Tests
import nose.tools as nt
def test_ranges(f):
    """Verify list results identifying ranges."""
    nt.eq_(list(f([])), [])
    nt.eq_(list(f([0, 1,1,1,1,1,1, 2], 5)), [(1, 6)])
    nt.eq_(list(f([1,1,1,1,1,1, 2,2, 1, 3, 1,1,1,1,1,1], 5)), [(0, 5), (10, 15)])
    nt.eq_(list(f([1,1, 2, 1,1,1,1, 2, 1,1,1], 3)), [(3, 6), (8, 10)])
    nt.eq_(list(f([1,1,1,1, 2, 1,1,1, 2, 1,1,1,1], 3)), [(0, 3), (5, 7), (9, 12)])
test_ranges(find_ranges)
This example captures (index, element) pairs in lst, and then groups them by element. Only repeated pairs are retained. Finally, first and last pairs are sliced, yielding (start, end) indices from each repeated group.
See also this post for finding ranges of indices using itertools.groupby.
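To make the grouping step more concrete, here is a small illustration of what itertools.groupby yields over enumerate for a short list (the list data and the name runs are made up for this sketch). Consecutive equal values end up in one group, and a value that reappears later starts a new group:

```python
from itertools import groupby

data = [7, 7, 7, 2, 7]

# Group (index, value) pairs by value; keep only the indices of each group.
runs = [
    (value, [idx for idx, _ in group])
    for value, group in groupby(enumerate(data), key=lambda pair: pair[1])
]
print(runs)
```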
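There really isn't a great shortcut for this. You can do something like:
mult = 5
found_at = None
for i in range(len(val_list) - mult + 1):
    # Compare each window of length mult against mult copies of its first element
    if val_list[i:i + mult] == [val_list[i]] * mult:
        found_at = i
        break
I leave handling of the not-found case and longer sequence detection to you.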
If you're looking for value repeated n times in list L, you could do something like this:
def find_repeat(value, n, L):
    look_for = [value for _ in range(n)]
    for i in range(len(L)):
        if L[i] == value and L[i:i+n] == look_for:
            return i, i+n
Here is a relatively quick, error-free solution which also tells you how many copies were in the run. Some of this code was borrowed from KAL's solution.
# Return the start and (1-past-the-end) indices of the first instance of
# at least min_count copies of element value in container l
def find_repeat(value, min_count, l):
    for i in range(len(l)):
        count = 0
        # The bounds check keeps a run that reaches the end of l from raising IndexError
        while i + count < len(l) and l[i + count] == value:
            count += 1
        if count >= min_count:
            return i, i + count
I had a similar requirement. This is what I came up with, using NumPy and a single comprehension:
import numpy as np
A=[34,2,3,22,22,22,22,22,22,18,90,5,-55,-19,22,6,6,6,6,6,6,6,6,23,53,1,5,-42,82]
Find the unique values and return their indices:
_, ind = np.unique(A,return_index=True)
np.unique sorts the array, so sort the indices to get them in the original order:
ind = np.sort(ind)
ind contains the index of the first element of each repeating group, visible as non-consecutive indices.
Their diff gives the number of elements in a group. Filtering with np.diff(ind)>5 gives a boolean array with True at the starting indices of groups. The entry of ind just after each True is one past the end index of that group.
Create a dict with the repeating element as the key and a tuple of the start and end indices of that group as the value:
rep_groups = dict((A[ind[i]], (ind[i], ind[i+1]-1)) for i,v in enumerate(np.diff(ind)>5) if v)
Note that np.unique reports only the first occurrence of each value, so a gap in ind can also include later reappearances of earlier values and then overstates the run length; for the array above the result is still correct.

top n keys with highest values in dictionary with tuples as keys

I want to get the top n keys of a dictionary with tuples as keys, where the first value of the tuple is a particular number (1 in the example below):
a = {}
a[1,2] = 3
a[1,0] =4
a[1,5] = 1
a[2,3] = 9
I want [1,0] and [1,2] to be returned, where the first element of the tuple/key = 1
this
import heapq
k = heapq.nlargest(2, a, key=a.get(1,))
returns [1,4] and [1,3], the highest keys/tuples with first element = 1, though if I make it
k = heapq.nlargest(2, a, key=a.get(2,))
it returns the same thing?
First you should take only the keys with first coordinate 1. Otherwise, if only a few keys have 1 as their first coordinate, you may get other tuples as well. Then you can use heapq normally. For example:
a = {
(1, 2): 3,
(1, 0): 4,
(1, 5): 1,
(2, 3): 9
}
import heapq
print(heapq.nlargest(2, (k for k in a if k[0] == 1), key=lambda k: a[k]))
print(heapq.nlargest(2, (k for k in a if k[0] == 2), key=lambda k: a[k]))
Output:
[(1, 0), (1, 2)]
[(2, 3)]
The key parameter should be a function, but you are passing in a.get(1,). This calls a.get(1,) immediately, which is the same as a.get(1), which is the same as a.get(1, None).
The dictionary doesn't have a 1 key so it returns None which means you are doing the equivalent of passing key=None which is the same as not passing a key at all: you are using the identity function as key.
Then heapq.nlargest returns the top 2 elements which are, correctly, [1, 4] and [1, 3].
This explains why using a.get(1,) and a.get(2,) does the same thing. The above reasoning works for both values and you end up with key=None in both cases.
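This can be checked directly. The sketch below uses the small dictionary from the question (so the resulting keys differ from the ones reported by the asker, whose real data is not shown) and confirms that both calls evaluate to None, which makes nlargest fall back to comparing the tuple keys themselves:

```python
import heapq

a = {(1, 2): 3, (1, 0): 4, (1, 5): 1, (2, 3): 9}

# a.get(...) is evaluated right away; neither 1 nor 2 is a key of a.
key_for_1 = a.get(1,)
key_for_2 = a.get(2,)
print(key_for_1, key_for_2)

# With key=None the keys are ranked by plain tuple comparison.
top = heapq.nlargest(2, a, key=key_for_1)
print(top)
```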
To achieve what you want use something like:
key=lambda x: (x[0] == 1, a[x])
If you find yourself using this kind of keys often you can create a key maker function:
def make_key(value, container):
    def key(x):
        return x[0] == value, container[x]
    return key
using it as:
heapq.nlargest(2, a, key=make_key(1, a))
heapq.nlargest(2, a, key=make_key(2, a))
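One thing to keep in mind with this key: it only ranks matching keys first, it does not exclude the others. A small sketch with the dictionary from the question shows that when fewer than n keys match, non-matching keys fill the remaining slots, so a final filter may be wanted (the name only_twos is illustrative):

```python
import heapq

a = {(1, 2): 3, (1, 0): 4, (1, 5): 1, (2, 3): 9}


def make_key(value, container):
    def key(x):
        # Matching keys sort above non-matching ones, then by stored value.
        return x[0] == value, container[x]
    return key


top_ones = heapq.nlargest(2, a, key=make_key(1, a))
top_twos = heapq.nlargest(2, a, key=make_key(2, a))
print(top_ones)
print(top_twos)

# Only one key starts with 2, so filter out the filler entry.
only_twos = [k for k in top_twos if k[0] == 2]
print(only_twos)
```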
