x = [8,2,3,4,5]
y = [6,3,7,2,1]
How can I find the first common element in two lists (in this case, 2) in a concise and elegant way? Either list can be empty, or there can be no common elements - in that case None is fine.
I need this to show Python to someone who is new to it, so the simpler the better.
UPD: the order is not important for my purposes, but let's assume I'm looking for the first element in x that also occurs in y.
This should be straightforward and almost as efficient as it gets (for a more efficient solution check Ashwini Chaudhary's answer, and for the most efficient check jamylak's answer and comments):
result = None
# Go through one list
for i in x:
    # The element also occurs in the other list...
    if i in y:
        # Store the result and break the loop
        result = i
        break
Or, even more elegant, would be to encapsulate the same functionality in a function, using PEP 8 style coding conventions:
def get_first_common_element(x, y):
    '''Fetch the first element from x that is common to both lists,
    or return None if no such element is found.
    '''
    for i in x:
        if i in y:
            return i
    # In case no common element is found, you could raise an exception.
    # Or, if "no common element" is a valid and ordinary state of your
    # application, you can simply return None and test the return value.
    # raise Exception('No common element found')
    return None
And if you want all common elements you can do it simply like this:
>>> [i for i in x if i in y]
[2, 3]
A sort is not the fastest way of doing this; this gets it done in O(N) time with a set (hash table).
>>> x = [8,2,3,4,5]
>>> y = [6,3,7,2,1]
>>> set_y = set(y)
>>> next((a for a in x if a in set_y), None)
2
Or:
next(ifilter(set(y).__contains__, x), None)  # ifilter is itertools.ifilter (Python 2)
This is what it does:
>>> def foo(x, y):
        seen = set(y)
        for item in x:
            if item in seen:
                return item
        else:
            return None

>>> foo(x, y)
2
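Note that itertools.ifilter exists only in Python 2; in Python 3 the built-in filter is already lazy, so the equivalent one-liner is:

first = next(filter(set(y).__contains__, x), None)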
To show the time differences between the different methods (naive approach, binary search, and sets), here are some timings. I had to do this to disprove the surprising number of people who believed binary search was faster...:
from itertools import ifilter
from bisect import bisect_left

a = [1, 2, 3, 9, 1, 1] * 100000
b = [44, 11, 23, 9, 10, 99] * 10000
c = [1, 7, 2, 4, 1, 9, 9, 2] * 1000000  # repeats early
d = [7, 6, 11, 13, 19, 10, 19] * 1000000
e = range(50000)
f = range(40000, 90000)  # repeats in the middle
g = [1] * 10000000  # no repeats at all
h = [2] * 10000000
from random import randrange
i = [randrange(10000000) for _ in xrange(5000000)]  # some randoms
j = [randrange(10000000) for _ in xrange(5000000)]

def common_set(x, y, ifilter=ifilter, set=set, next=next):
    return next(ifilter(set(y).__contains__, x), None)
def common_b_sort(x, y, bisect=bisect_left, sorted=sorted, min=min, len=len):
    sorted_y = sorted(y)
    for a in x:
        if a == sorted_y[min(bisect_left(sorted_y, a), len(sorted_y) - 1)]:
            return a
    else:
        return None
def common_naive(x, y):
    for a in x:
        for b in y:
            if a == b: return a
    else:
        return None
from timeit import timeit
from itertools import repeat
import threading, thread

print 'running tests - time limit of 20 seconds'
for x, y in [('a', 'b'), ('c', 'd'), ('e', 'f'), ('g', 'h'), ('i', 'j')]:
    for func in ('common_set', 'common_b_sort', 'common_naive'):
        try:
            timer = threading.Timer(20, thread.interrupt_main)  # 20 second time limit
            timer.start()
            res = timeit(stmt="print '[', {0}({1}, {2}), ".format(func, x, y),
                         setup='from __main__ import common_set, common_b_sort, common_naive, {0}, {1}'.format(x, y),
                         number=1)
        except:
            res = "Too long!!"
        finally:
            print '] Function: {0}, {1}, {2}. Time: {3}'.format(func, x, y, res)
            timer.cancel()
Results:
running tests - time limit of 20 seconds
[ 9 ] Function: common_set, a, b. Time: 0.00569520707241
[ 9 ] Function: common_b_sort, a, b. Time: 0.0182240340602
[ 9 ] Function: common_naive, a, b. Time: 0.00978832505249
[ 7 ] Function: common_set, c, d. Time: 0.249175872911
[ 7 ] Function: common_b_sort, c, d. Time: 1.86735751332
[ 7 ] Function: common_naive, c, d. Time: 0.264309220865
[ 40000 ] Function: common_set, e, f. Time: 0.00966861710078
[ 40000 ] Function: common_b_sort, e, f. Time: 0.0505980508696
[ ] Function: common_naive, e, f. Time: Too long!!
[ None ] Function: common_set, g, h. Time: 1.11300018578
[ None ] Function: common_b_sort, g, h. Time: 14.9472068377
[ ] Function: common_naive, g, h. Time: Too long!!
[ 5411743 ] Function: common_set, i, j. Time: 1.88894859542
[ 5411743 ] Function: common_b_sort, i, j. Time: 6.28617268396
[ 5411743 ] Function: common_naive, i, j. Time: 1.11231867458
This gives you an idea of how it will scale for larger inputs: O(N) vs O(N log N) vs O(N^2).
A one-liner, using next to take the first item from a generator:
x = [8,2,3,4,5]
y = [6,3,7,2,1]
first = next((a for a in x if a in y), None)
Or, more efficiently, since set.__contains__ is faster than list.__contains__:
set_y = set(y)
first = next((a for a in x if a in set_y), None)
Or, more efficiently but still in one line (don't do this):
first = next((lambda set_y: (a for a in x if a in set_y))(set(y)), None)
Using a for loop with in will result in O(N^2) complexity, but you can sort y here and use binary search to improve the time complexity to O(N log N).
def binary_search(lis, num):
    low = 0
    high = len(lis) - 1
    ret = -1  # return -1 if item is not found
    while low <= high:
        mid = (low + high) // 2
        if num < lis[mid]:
            high = mid - 1
        elif num > lis[mid]:
            low = mid + 1
        else:
            ret = mid
            break
    return ret
x = [8,2,3,4,5]
y = [6,3,7,2,1]
y.sort()
for z in x:
    ind = binary_search(y, z)
    if ind != -1:
        print z
        break
output:
2
Using the bisect module to perform the same thing as above:
import bisect

x = [8,2,3,4,5]
y = [6,3,7,2,1]
y.sort()
for z in x:
    ind = bisect.bisect(y, z) - 1  # or use `ind = min(bisect.bisect_left(y, z), len(y) - 1)`
    if ind != -1 and y[ind] == z:
        print z  # prints 2
        break
I assume you want to teach this person Python, not just programming. Therefore I do not hesitate to use zip instead of ugly loop variables; it's a very useful part of Python and not hard to explain.
def first_common(x, y):
    common = set(x) & set(y)
    for current_x, current_y in zip(x, y):
        if current_x in common:
            return current_x
        elif current_y in common:
            return current_y
print first_common([8,2,3,4,5], [6,3,7,2,1])
If you really don't want to use zip, here's how to do it without:
def first_common2(x, y):
    common = set(x) & set(y)
    for i in xrange(min(len(x), len(y))):
        if x[i] in common:
            return x[i]
        elif y[i] in common:
            return y[i]
And for those interested, this is how it extends to any number of sequences:
def first_common3(*seqs):
    common = set.intersection(*[set(seq) for seq in seqs])
    for current_elements in zip(*seqs):
        for element in current_elements:
            if element in common:
                return element
Finally, please note that, in contrast to some other solutions, this works as well if the first common element appears first in the second list.
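For example, a quick illustration of that point: with x = [1, 3] and y = [3, 7], the common element 3 occurs earlier in y than in x, and first_common still finds it:

print first_common([1, 3], [3, 7])  # 3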
I just noticed your update, which makes for an even simpler solution:
def first_common4(x, y):
    ys = set(y)  # We don't want this to be recreated for each element in x
    for element in x:
        if element in ys:
            return element
The above is arguably more readable than the generator expression.
Too bad there is no built-in ordered set. It would have made for a more elegant solution.
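That said, in Python 3.7 and later a plain dict preserves insertion order, so dict.fromkeys can serve as a makeshift ordered set; a minimal sketch (the name first_common5 is just for illustration):

def first_common5(x, y):
    ys = set(y)
    # dict.fromkeys(x) behaves like an insertion-ordered set of x
    return next((e for e in dict.fromkeys(x) if e in ys), None)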
Using for loops seems easiest to explain to someone new.
for number1 in x:
    for number2 in y:
        if number1 == number2:
            print number1, number2
            print x.index(number1), y.index(number2)
            exit(0)
print "No common numbers found."
NB: not tested, just off the top of my head.
This one uses sets. It returns the first common element or None if no common element.
def findcommon(x, y):
    common = None
    for i in range(0, max(len(x), len(y))):
        common = set(x[0:i]).intersection(set(y[0:i]))
        if common:
            break
    return list(common)[0] if common else None
def first_common_element(x, y):
    common = set(x).intersection(set(y))
    if common:
        return x[min([x.index(i) for i in common])]
Just for fun (probably not efficient), another version using itertools:
from itertools import dropwhile, product
from operator import __ne__

def accept_pair(f):
    "Make a version of f that takes a pair instead of 2 arguments."
    def accepting_pair(pair):
        return f(*pair)
    return accepting_pair

def get_first_common(x, y):
    try:
        # This *_ unpacking syntax works only in Python 3
        ((first_common, _), *_) = dropwhile(
            accept_pair(__ne__),
            product(x, y))
    except ValueError:
        return None
    return first_common

x = [8, 2, 3, 4, 5]
y = [6, 3, 7, 2, 1]
print(get_first_common(x, y))  # 2
y = [6, 7, 1]
print(get_first_common(x, y))  # None
It is simpler, but not as fun, to use lambda pair: pair[0] != pair[1] instead of accept_pair(__ne__).
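That simpler variant might look like this sketch (same imports as above; using next instead of unpacking, so the product isn't fully consumed):

def get_first_common_simple(x, y):
    pair = next(dropwhile(lambda p: p[0] != p[1], product(x, y)), None)
    return pair[0] if pair else None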
Use set - this is the generic solution for an arbitrary number of lists:
def first_common(*lsts):
    common = reduce(lambda c, l: c & set(l), lsts[1:], set(lsts[0]))
    if not common:
        return None
    firsts = [min(lst.index(el) for el in common) for lst in lsts]
    index_in_list = min(firsts)
    trgt_lst_index = firsts.index(index_in_list)
    return lsts[trgt_lst_index][index_in_list]
An afterthought - not a fundamentally different solution, but this one reduces some redundant overhead:

import itertools

def first_common(*lsts):
    common = reduce(lambda c, l: c & set(l), lsts[1:], set(lsts[0]))
    if not common:
        return None
    for lsts_slice in itertools.izip_longest(*lsts):
        slice_intersection = common.intersection(lsts_slice)
        if slice_intersection:
            return slice_intersection.pop()
Suppose I have two lists list_1 and list_2
list_1 = [1, 5, 10]
list_2 = [3, 4, 15]
I want to get a list of tuples containing elements from both list_1 and list_2 such that the difference between the numbers in a tuple is under some constant c.
E.g. suppose c is 2 then the tuples I would have would be:
[(1, 3), (5, 3), (5, 4)]
Of course one can iterate over both lists and check that the difference between two elements is less than c, but that has O(n^2) complexity and I would rather reduce that.
Here is an implementation of the idea of Marat from the comments:
import bisect
def close_pairs(list1, list2, c):
    # assumes that list2 is sorted
    for x in list1:
        i = bisect.bisect_left(list2, x - c)
        j = bisect.bisect_right(list2, x + c)
        yield from ((x, y) for y in list2[i:j])
list_1 = [1, 5, 10]
list_2 = [3, 4, 15]
print(list(close_pairs(list_1,list_2,2)))
#prints [(1, 3), (5, 3), (5, 4)]
To demonstrate the potential improvement of this strategy over what might be thought of as the "naive" approach, let's timeit.
import timeit
setup_naive = '''
import numpy
list_a = numpy.random.randint(0, 2500, 500).tolist()
list_b = numpy.random.randint(0, 2500, 500).tolist()
c = 2

def close_pairs(list_a, list_b, c):
    yield from ((x, y) for x in list_a for y in list_b if abs(x - y) <= c)
'''

setup_john_coleman = '''
import bisect
import numpy
list_a = numpy.random.randint(0, 2500, 500).tolist()
list_b = numpy.random.randint(0, 2500, 500).tolist()
c = 2

def close_pairs(list_a, list_b, c):
    list_a = sorted(list_a)
    list_b = sorted(list_b)
    for x in list_a:
        i = bisect.bisect_left(list_b, x - c)
        j = bisect.bisect_right(list_b, x + c)
        yield from ((x, y) for y in list_b[i:j])
'''
print(f"john_coleman: {timeit.timeit('list(close_pairs(list_a, list_b, c))', setup=setup_john_coleman, number=1000):.2f}")
print(f"naive: {timeit.timeit('list(close_pairs(list_a, list_b, c))', setup=setup_naive, number=1000):.2f}")
On a handy laptop that gives results like:
john_coleman: 0.50
naive: 18.35
If the lists are sorted, as your example suggests, you can remove the sorting; this then has runtime complexity O(M+N+P), where M and N are the list sizes and P is the number of close pairs. It keeps an index i such that ys[i] is the smallest y-value that is not too small, then walks over ys[i:...] as long as the values are not too large, yielding each pair.
def close_pairs(xs, ys, c):
    xs = sorted(xs)
    ys = sorted(ys) + [float('inf')]
    i = 0
    for x in xs:
        while x - ys[i] > c:
            i += 1
        j = i
        while ys[j] - x <= c:
            yield x, ys[j]
            j += 1
Benchmark results with lists/ranges 1000 times larger than your example:
904.4 ms close_pairs_naive
4.9 ms close_pairs_John_Coleman
1.8 ms close_pairs_Kelly_Bundy
Benchmark code:
from timeit import timeit
import random
import bisect
from collections import deque

def close_pairs_naive(list_a, list_b, c):
    yield from ((x, y) for x in list_a for y in list_b if abs(x - y) <= c)

def close_pairs_John_Coleman(list_a, list_b, c):
    list_a = sorted(list_a)
    list_b = sorted(list_b)
    for x in list_a:
        i = bisect.bisect_left(list_b, x - c)
        j = bisect.bisect_right(list_b, x + c)
        yield from ((x, y) for y in list_b[i:j])

def close_pairs_Kelly_Bundy(xs, ys, c):
    xs = sorted(xs)
    ys = sorted(ys) + [float('inf')]
    i = 0
    for x in xs:
        while x - ys[i] > c:
            i += 1
        j = i
        while ys[j] - x <= c:
            yield x, ys[j]
            j += 1

funcs = [
    close_pairs_naive,
    close_pairs_John_Coleman,
    close_pairs_Kelly_Bundy,
]

xs = random.choices(range(15000), k=3000)
ys = random.choices(range(15000), k=3000)
c = 2
args = xs, ys, c

expect = sorted(funcs[0](*args))
for func in funcs:
    result = sorted(func(*args))
    print(result == expect, func.__name__, len(result))
print()

for _ in range(3):
    for func in funcs:
        t = timeit(lambda: deque(func(*args), 0), number=1)
        print('%6.1f ms ' % (t * 1e3), func.__name__)
    print()
I have a list:
lst = [ 1,2,3,4,5,6,7,8]
I want to increment all numbers above index 4.
for i in range(4, len(lst)):
    lst[i] += 2

Since this operation needs to be done many times, I want to do it in the most efficient way possible.
How can I do this fast?
Use NumPy for fast array manipulations; check the example below:
import numpy as np
lst = np.array([1,2,3,4,5,6,7,8])
# add 2 at all indices from 4 till the end of the array
lst[4:] += 2
print(lst)
# array([ 1, 2, 3, 4, 7, 8, 9, 10])
If you are updating large ranges of a large list many times, use a more suitable data structure so that the updates don't take O(n) time each.
One such data structure is a segment tree, where each list element corresponds to a leaf node in a tree; the true value of the list element can be represented as the sum of the values on the path between the leaf node and the root node. This way, adding a number to a single internal node is effectively like adding it to all of the list elements represented by that subtree.
The data structure supports get/set operations by index in O(log n) time, and add-in-range operations also in O(log n) time. The solution below uses a binary tree, implemented using a list of length <= 2n.
class RangeAddList:
    def __init__(self, vals):
        # list length
        self._n = len(vals)
        # smallest power of 2 >= list length
        self._m = 1 << (self._n - 1).bit_length()
        # list representing binary tree; leaf nodes offset by _m
        self._vals = [0]*self._m + vals
    def __repr__(self):
        return '{}({!r})'.format(self.__class__.__name__, list(self))
    def __len__(self):
        return self._n
    def __iter__(self):
        for i in range(self._n):
            yield self[i]
    def __getitem__(self, i):
        if i not in range(self._n):
            raise IndexError()
        # add up values from leaf to root node
        t = 0
        i += self._m
        while i > 0:
            t += self._vals[i]
            i >>= 1
        return t + self._vals[0]
    def __setitem__(self, i, x):
        # add difference (new value - old value)
        self._vals[self._m + i] += x - self[i]
    def add_in_range(self, i, j, x):
        if i not in range(self._n + 1) or j not in range(self._n + 1):
            raise IndexError()
        # add at internal nodes spanning range(i, j)
        i += self._m
        j += self._m
        while i < j:
            if i & 1:
                self._vals[i] += x
                i += 1
            if j & 1:
                j -= 1
                self._vals[j] += x
            i >>= 1
            j >>= 1
Example:
>>> r = RangeAddList([0] * 10)
>>> r.add_in_range(0, 4, 10)
>>> r.add_in_range(6, 9, 20)
>>> r.add_in_range(3, 7, 100)
>>> r
RangeAddList([10, 10, 10, 110, 100, 100, 120, 20, 20, 0])
It turns out that NumPy is so well-optimized, you need to go up to lists of length 50,000 or so before the segment tree catches up. The segment tree is still only about twice as fast as NumPy's O(n) range updates for lists of length 100,000 on my machine. You may want to benchmark with your own data to be sure.
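A rough benchmark sketch along those lines (the sizes and the update range are arbitrary; RangeAddList is the class defined above):

import timeit
import numpy as np

n = 100000
r = RangeAddList([0] * n)
a = np.zeros(n, dtype=np.int64)

def tree_update():
    r.add_in_range(10, n - 10, 2)  # O(log n) per update

def numpy_update():
    a[10:n - 10] += 2  # O(n) per update, but with a tiny constant factor

print('segment tree:', timeit.timeit(tree_update, number=1000))
print('numpy       :', timeit.timeit(numpy_update, number=1000))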
This is a fast way of doing it:
lst1 = [1, 2, 3, 4, 5, 6, 7, 8]
new_list = [*lst1[:4], *[x + 2 for x in lst1[4:]]]
# or even better, assign to the slice in place:
lst1[4:] = [x + 2 for x in lst1[4:]]
In terms of speed, numpy isn't faster for lists this small:
import timeit
import numpy as np

lst1 = [1, 2, 3, 4, 5, 6, 7, 8]
npa = np.array(lst1)

def numpy_it():
    global npa
    npa[4:] += 2

def python_it():
    global lst1
    lst1 = [*lst1[:4], *[x + 2 for x in lst1[4:]]]

print(timeit.timeit(numpy_it))
print(timeit.timeit(python_it))
For me this gives:
1.7008036
0.6737076000000002
But for anything serious, numpy beats generating a new list for just the slice that needs replacing, which beats regenerating the entire list (which in turn beats in-place replacement with a loop, as in your example):
import timeit
import numpy as np

lst1 = list(range(0, 10000))
npa = np.array(lst1)
lst2 = list(range(0, 10000))
lst3 = list(range(0, 10000))

def numpy_it():
    global npa
    npa[4:] += 2

def python_it():
    global lst1
    lst1 = [*lst1[:4], *[x + 2 for x in lst1[4:]]]

def python_it_slice():
    global lst2
    lst2[4:] = [x + 2 for x in lst2[4:]]

def python_inplace():
    global lst3
    for i in range(4, len(lst3)):
        lst3[i] = lst3[i] + 2

n = 10000
print(timeit.timeit(numpy_it, number=n))
print(timeit.timeit(python_it_slice, number=n))
print(timeit.timeit(python_it, number=n))
print(timeit.timeit(python_inplace, number=n))
Results:
0.057994199999999996
4.3747423
4.5193105000000005
9.949074000000001
Assign to a slice:
lst[4:] = [x+2 for x in lst[4:]]
Test (on my ancient ThinkPad i3-3110, Python 3.5.2):
import timeit

lst = [1, 2, 3, 4, 5, 6, 7, 8]

def python_it():
    global lst
    lst = [*lst[:4], *[x + 2 for x in lst[4:]]]

def python_it2():
    global lst
    lst[4:] = [x + 2 for x in lst[4:]]

print(timeit.timeit(python_it))
print(timeit.timeit(python_it2))
Prints:
1.2732834180060308
0.9285018060181756
Use Python's built-in map function with a lambda:
lst = [1,2,3,4,5,6,7,8]
lst[4:] = map(lambda x:x+2, lst[4:])
print(lst)
# [1, 2, 3, 4, 7, 8, 9, 10]
Given an array of positive integers, find the minimum number of subsets where:
The sum of the elements in each subset does not exceed a value, k.
Each element from the array is only used once in any of the subsets.
All values in the array must be present in one of the subsets.
Basically a 'filling' algorithm, but I need to minimize the number of containers and ensure everything gets filled. My current idea is to sort in descending order and start creating sets, starting the next set when the sum would exceed k, but I'm not sure what the better way is.
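For reference, that greedy idea (first-fit decreasing) might look like the following minimal sketch; it is a heuristic and is not guaranteed to produce the minimum number of subsets:

def greedy_fill(arr, k):
    bins = []
    for v in sorted(arr, reverse=True):
        # Put each value into the first subset that still has room...
        for b in bins:
            if sum(b) + v <= k:
                b.append(v)
                break
        else:
            # ...or start a new subset if none does
            bins.append([v])
    return bins

print(greedy_fill([1, 2, 3, 4, 5], 10))  # [[5, 4, 1], [3, 2]]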
EDIT:
Ex:
Inputs: arr = [1,2,3,4,5], k= 10
Output: [[1,4,5], [2,3]]
# Other solutions such as [[2,3,4],[1,5]] are also acceptable
# But the important thing is the number of sets returned is 2
In the output sets, all of 1-5 are used, and each is used only once across the sets. Hope this clears it up.
There may be a smarter way to just find the minimal number of sets, but here's some code which uses Knuth's Algorithm X to do the Exact Cover operation, and a function I wrote last year to generate subsets whose sums are less than a given value. My test code first finds a solution for the data given in the question, and then it finds a solution for a larger random list. It finds the solution for [1, 2, 3, 4, 5] with maximum sum 10 almost instantly, but it takes almost 20 seconds on my old 32 bit 2GHz machine to solve the larger problem.
This code just prints a single solution that is of the minimum size, but it wouldn't be hard to modify it to print all solutions that are of the minimum size.
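For instance, a sketch of that modification (it reuses solve, X and Y from the code below, and relies on solve restoring its arguments as it backtracks):

solutions = list(solve(X, Y, []))
smallest = min(len(s) for s in solutions)
for s in solutions:
    if len(s) == smallest:
        print(s)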
""" Find the minimal number of subsets of a set of integers
which conform to these constraints:
The sum of each subset does not exceed a value, k.
Each element from the full set is only used once in any of the subsets.
All values from the full set must be present in some subset.
See https://stackoverflow.com/q/50066757/4014959
Uses Knuth's Algorithm X for the exact cover problem,
using dicts instead of doubly linked circular lists.
Written by Ali Assaf
From http://www.cs.mcgill.ca/~aassaf9/python/algorithm_x.html
and http://www.cs.mcgill.ca/~aassaf9/python/sudoku.txt
Written by PM 2Ring 2018.04.28
"""
from itertools import product
from random import seed, sample
from operator import itemgetter
#Algorithm X functions
def solve(X, Y, solution):
    if X:
        c = min(X, key=lambda c: len(X[c]))
        for r in list(X[c]):
            solution.append(r)
            cols = select(X, Y, r)
            yield from solve(X, Y, solution)
            deselect(X, Y, r, cols)
            solution.pop()
    else:
        yield list(solution)
def select(X, Y, r):
    cols = []
    for j in Y[r]:
        for i in X[j]:
            for k in Y[i]:
                if k != j:
                    X[k].remove(i)
        cols.append(X.pop(j))
    return cols
def deselect(X, Y, r, cols):
    for j in reversed(Y[r]):
        X[j] = cols.pop()
        for i in X[j]:
            for k in Y[i]:
                if k != j:
                    X[k].add(i)
#Invert subset collection
def exact_cover(X, Y):
    newX = {j: set() for j in X}
    for i, row in Y.items():
        for j in row:
            newX[j].add(i)
    return newX
#----------------------------------------------------------------------
def subset_sums(seq, goal):
    totkey = itemgetter(1)
    # Store each subset as a (sequence, sum) tuple
    subsets = [([], 0)]
    for x in seq:
        subgoal = goal - x
        temp = []
        for subseq, subtot in subsets:
            if subtot <= subgoal:
                temp.append((subseq + [x], subtot + x))
            else:
                break
        subsets.extend(temp)
        subsets.sort(key=totkey)
    for subseq, _ in subsets:
        yield tuple(subseq)
#----------------------------------------------------------------------
# Tests
nums = [1, 2, 3, 4, 5]
k = 10
print("Numbers:", nums, "k:", k)
Y = {u: u for u in subset_sums(nums, k)}
X = exact_cover(nums, Y)
minset = min(solve(X, Y, []), key=len)
print("Minimal:", minset, len(minset))
# Now test with a larger list of random data
seed(42)
hi = 20
k = 2 * hi
size = 10
nums = sorted(sample(range(1, hi+1), size))
print("\nNumbers:", nums, "k:", k)
Y = {u: u for u in subset_sums(nums, k)}
X = exact_cover(nums, Y)
minset = min(solve(X, Y, []), key=len)
print("Minimal:", minset, len(minset))
output
Numbers: [1, 2, 3, 4, 5] k: 10
Minimal: [(2, 3, 5), (1, 4)] 2
Numbers: [1, 2, 3, 4, 8, 9, 11, 12, 17, 18] k: 40
Minimal: [(1, 8, 9, 18), (4, 11, 17), (2, 3, 12)] 3
I want to multiply an element of a list with all other elements.
For example:
def product(a, b, c):
    return (a*b, a*c, a*b*c)
I have done this
def product(*args):
    list = []
    for index, element in enumerate(args):
        for i in args:
            if (args[index] * i) not in list:
                list.append(args[index] * i)
    return list
but this gives me [a*a, a*b, a*c, b*b], etc. I don't want the a*a, b*b, c*c bits in there.
You could check for equality:

if (args[index] * i) not in list and args[index] != i:
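In context, the question's function with that extra check would look like this (an illustrative rewrite):

def product(*args):
    list = []  # note: shadows the built-in list, as in the question
    for index, element in enumerate(args):
        for i in args:
            if (args[index] * i) not in list and args[index] != i:
                list.append(args[index] * i)
    return list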
itertools is your friend here:
from itertools import combinations
from functools import reduce, partial
from operator import mul

# Make a sum-like function for multiplication; I'd call it product,
# but that overlaps a name in itertools and our own function
multiplyall = partial(reduce, mul)

def product(*args):
    # Loop so you get all two-element combinations, then all three-element, etc.
    for n in range(2, len(args) + 1):
        # Get the combinations for the current combo count
        for comb in combinations(args, n):
            # Compute the product; yielding comb as well, just for illustration
            yield comb, multiplyall(comb)
I made it a generator function because, frankly, almost any function that slowly builds a list element by element and returns it should really be a generator function (if the caller wants a list, they can just do mylist = list(generatorfunc(...))). This makes it easier to use iteratively without blowing main memory when many arguments are passed.
Example usage:
>>> for pieces, prod in product(2, 3, 4):
...     print ' * '.join(map(str, pieces)), '=', prod
Which outputs:
2 * 3 = 6
2 * 4 = 8
3 * 4 = 12
2 * 3 * 4 = 24
So if the values are 2, 3, 4, 5 you want all and only these products:
2*3=6, 2*4=8, 2*5=10, 2*3*4=24, 2*3*5=30, 2*4*5=40, 2*3*4*5=120
This means taking all combinations of 3, 4, 5 and then multiplying them together with 2. The itertools module has a combinations function, and reduce can be used in conjunction with operator.mul to do the calculation:
from itertools import combinations
from functools import reduce
from operator import mul

def product(first, *other):
    for n in range(1, len(other) + 1):
        for m in combinations(other, n):
            yield reduce(mul, m, first)

list(product(2, 3, 4, 5))
Output:
[6, 8, 10, 24, 30, 40, 120]
Does your list have duplicate elements, like [2, 3, 4, 2]?
If it does not, here is a one-liner:
First, with tags to illustrate the pattern:
>>> a = ['a1', 'a2', 'a3']
>>> lsta = [[x + y for y in [z for z in a if z != x]] for x in a]
>>> lsta
[['a1a2', 'a1a3'], ['a2a1', 'a2a3'], ['a3a1', 'a3a2']]
And here, with numbers:
>>> a = [2, 3, 4, 5]
>>> print [[x * y for y in [z for z in a if z != x]] for x in a]
[[6, 8, 10], [6, 12, 15], [8, 12, 20], [10, 15, 20]]
or the sum of the products, if you wish:
>>> a = [2, 3, 4, 5]
>>> print [sum([x * y for y in [z for z in a if z != x]]) for x in a]
[24, 33, 40, 45]
If the list has duplicates, it gets more complicated. Do you want the first occurrence and the second occurrence of 2 in [2,3,4,2] to be separately calculated (you might need that for some purposes even though you will get the same value for both)?
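If you do want each occurrence handled separately, one approach (a sketch along the same lines) is to compare indices instead of values, so duplicates stay distinct:

>>> a = [2, 3, 4, 2]
>>> print [[a[i] * a[j] for j in range(len(a)) if j != i] for i in range(len(a))]
[[6, 8, 4], [6, 12, 6], [8, 12, 8], [4, 6, 8]]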
I have a list of integers...
[1,2,3,4,5,8,9,10,11,200,201,202]
I would like to group them into a list of lists where each sublist contains integers whose sequence has not been broken. Like this...
[[1,5],[8,11],[200,202]]
I have a rather clunky work around...
lSequenceOfNum = [1,2,3,4,5,8,9,10,11,200,201,202]
lGrouped = []
start = 0
for x in range(0, len(lSequenceOfNum)):
    if x != len(lSequenceOfNum) - 1:
        if (lSequenceOfNum[x+1] - lSequenceOfNum[x]) > 1:
            lGrouped.append([lSequenceOfNum[start], lSequenceOfNum[x]])
            start = x + 1
    else:
        lGrouped.append([lSequenceOfNum[start], lSequenceOfNum[x]])
print lGrouped
It is the best I could do. Is there a more "pythonic" way to do this? Thanks..
Assuming the list will always be in ascending order:
from itertools import groupby, count
numberlist = [1,2,3,4,5,8,9,10,11,200,201,202]
def as_range(g):
    l = list(g)
    return l[0], l[-1]

print [as_range(g) for _, g in groupby(numberlist, key=lambda n, c=count(): n - next(c))]
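To see why the key works: n - next(c) stays constant within each consecutive run, so groupby splits the list exactly at the breaks. For example:

>>> c = count()
>>> [n - next(c) for n in numberlist]
[1, 1, 1, 1, 1, 3, 3, 3, 3, 191, 191, 191]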
I realised I had overcomplicated this a little; it's far easier to just count manually than to use a slightly convoluted generator:
def ranges(seq):
    start, end = seq[0], seq[0]
    count = start
    for item in seq:
        if not count == item:
            yield start, end
            start, end = item, item
            count = item
        end = item
        count += 1
    yield start, end
print(list(ranges([1,2,3,4,5,8,9,10,11,200,201,202])))
Producing:
[(1, 5), (8, 11), (200, 202)]
This method is pretty fast. Timings for it (the old method performs almost exactly the same):
python -m timeit -s "from test import ranges" "ranges([1,2,3,4,5,8,9,10,11,200,201,202])"
1000000 loops, best of 3: 0.47 usec per loop
Jeff Mercado's Method:
python -m timeit -s "from test import as_range; from itertools import groupby, count" "[as_range(g) for _, g in groupby([1,2,3,4,5,8,9,10,11,200,201,202], key=lambda n, c=count(): n-next(c))]"
100000 loops, best of 3: 11.1 usec per loop
That's over 20x faster - although, naturally, unless speed matters this isn't a real concern.
My old solution using generators:
import itertools

def resetable_counter(start):
    while True:
        for i in itertools.count(start):
            reset = yield i
            if reset:
                start = reset
                break
def ranges(seq):
    start, end = seq[0], seq[0]
    counter = resetable_counter(start)
    for count, item in zip(counter, seq):  # In 2.x: itertools.izip(counter, seq)
        if not count == item:
            yield start, end
            start, end = item, item
            counter.send(item)
        end = item
    yield start, end
print(list(ranges([1,2,3,4,5,8,9,10,11,200,201,202])))
Producing:
[(1, 5), (8, 11), (200, 202)]
You can do this efficiently in three steps. Given:

list1 = [1, 2, 3, 4, 5, 8, 9, 10, 11, 200, 201, 202]

First, calculate the discontinuities by subtracting the list from itself, shifted by one position:

    [2, 3, 4, 5, 8, 9, 10, 11, 200, 201, 202]    (list1[1:])
  - [1, 2, 3, 4, 5, 8,  9, 10,  11, 200, 201]    (list1)
  --------------------------------------------
    [1, 1, 1, 1, 3, 1,  1,  1, 189,   1,   1]
     1  2  3  4  5  6   7   8    9   10   11     (index)
                 *               *
rng = [i+1 for i,e in enumerate((x-y for x,y in zip(list1[1:],list1))) if e!=1]
>>> rng
[5, 9]
Add the boundaries:

rng = [0] + rng + [len(list1)]
>>> rng
[0, 5, 9, 12]
Now calculate the actual continuity ranges:

[(list1[i], list1[j-1]) for i, j in zip(rng, rng[1:])]
[(1, 5), (8, 11), (200, 202)]

LB   [0, 5, 9, 12]
UB      [0, 5, 9, 12]
---------------------------------------------
index pairs (LB, UB-1): (0, 4) (5, 8) (9, 11)
The question is quite old, but I thought I'd share my solution anyway.
Assuming import numpy as np:

a = [1, 2, 3, 4, 5, 8, 9, 10, 11, 200, 201, 202]
np.split(a, np.array(np.add(np.where(np.diff(a) > 1), 1)).tolist()[0])
Pseudo code (with off-by-one errors to fix):

jumps = new array;
for idx from 0 to len(array)
    if array[idx] + 1 != array[idx+1] then jumps.push(idx);
I think this is actually a case where it makes sense to work with the indices (as in C, before java/python/perl/etc. improved upon this) instead of the objects in the array.
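A runnable sketch of that index-based idea, with the off-by-one handled (my reading of the pseudocode above, not the original author's code):

def split_points(arr):
    jumps = []
    # Stop one short of the end so arr[idx + 1] is always valid
    for idx in range(len(arr) - 1):
        if arr[idx] + 1 != arr[idx + 1]:
            jumps.append(idx)
    return jumps

print(split_points([1, 2, 3, 4, 5, 8, 9, 10, 11, 200, 201, 202]))  # [4, 8]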
Here's a version that should be easy to read:
def close_range(el, it):
    while True:
        el1 = next(it, None)
        if el1 != el + 1:
            return el, el1
        el = el1

def compress_ranges(seq):
    iterator = iter(seq)
    left = next(iterator, None)
    while left is not None:
        right, left1 = close_range(left, iterator)
        yield (left, right)
        left = left1
list(compress_ranges([1, 2, 3, 4, 5, 8, 9, 10, 11, 200, 201, 202]))
Similar questions:
Python - find incremental numbered sequences with a list comprehension
Pythonic way to convert a list of integers into a string of comma-separated ranges
input = [1, 2, 3, 4, 8, 10, 11, 12, 17]
i, ii, result = iter(input), iter(input[1:]), [[input[0]]]
for x, y in zip(i, ii):
    if y - x != 1:
        result.append([y])
    else:
        result[-1].append(y)
>>> result
[[1, 2, 3, 4], [8], [10, 11, 12], [17]]
>>> print ", ".join("-".join(map(str,(g[0],g[-1])[:len(g)])) for g in result)
1-4, 8, 10-12, 17
>>> [(g[0],g[-1])[:len(g)] for g in result]
[(1, 4), (8,), (10, 12), (17,)]