Summing Consecutive Ranges Pythonically - python

I have a sumranges() function, which sums all the ranges of consecutive numbers found in a tuple of tuples. To illustrate:
def sumranges(nums):
return sum([sum([1 for j in range(len(nums[i])) if
nums[i][j] == 0 or
nums[i][j - 1] + 1 != nums[i][j]]) for
i in range(len(nums))])
>>> nums = ((1, 2, 3, 4), (1, 5, 6), (19, 20, 24, 29, 400))
>>> print sumranges(nums)
7
As you can see, it returns the number of ranges of consecutive digits within the tuple, that is: len((1, 2, 3, 4), (1), (5, 6), (19, 20), (24), (29), (400)) = 7. The tuples are always ordered.
My problem is that my sumranges() is terrible. I hate looking at it. I'm currently just iterating through the tuple and each subtuple, assigning a 1 if the number is not (1 + previous number), and summing the total. I feel like I am missing a much easier way to accomplish my stated objective. Does anyone know a more pythonic way to do this?
Edit: I have benchmarked all the answers given thus far. Thanks to all of you for your answers.
The benchmarking code is as follows, using a sample size of 100K:
from time import time
from random import randrange
nums = [sorted(list(set(randrange(1, 10) for i in range(10)))) for
j in range(100000)]
for func in sumranges, alex, matt, redglyph, ephemient, ferdinand:
start = time()
result = func(nums)
end = time()
print ', '.join([func.__name__, str(result), str(end - start) + ' s'])
Results are as follows. Actual answer shown to verify that all functions return the correct answer:
sumranges, 250281, 0.54171204567 s
alex, 250281, 0.531121015549 s
matt, 250281, 0.843333005905 s
redglyph, 250281, 0.366822004318 s
ephemient, 250281, 0.805964946747 s
ferdinand, 250281, 0.405596971512 s
RedGlyph does edge out in terms of speed, but the simplest answer is probably Ferdinand's, and probably wins for most pythonic.

My 2 cents:
>>> sum(len(set(x - i for i, x in enumerate(t))) for t in nums)
7
It's basically the same idea as descriped in Alex' post, but using a set instead of itertools.groupby, resulting in a shorter expression. Since sets are implemented in C and len() of a set runs in constant time, this should also be pretty fast.

Consider:
>>> nums = ((1, 2, 3, 4), (1, 5, 6), (19, 20, 24, 29, 400))
>>> flat = [[(x - i) for i, x in enumerate(tu)] for tu in nums]
>>> print flat
[[1, 1, 1, 1], [1, 4, 4], [19, 19, 22, 26, 396]]
>>> import itertools
>>> print sum(1 for tu in flat for _ in itertools.groupby(tu))
7
>>>
we "flatten" the "increasing ramps" of interest by subtracting the index from the value, turning them into consecutive "runs" of identical values; then we identify and could the "runs" with the precious itertools.groupby. This seems to be a pretty elegant (and speedy) solution to your problem.

Just to show something closer to your original code:
def sumranges(nums):
return sum( (1 for i in nums
for j, v in enumerate(i)
if j == 0 or v != i[j-1] + 1) )
The idea here was to:
avoid building intermediate lists but use a generator instead, it will save some resources
avoid using indices when you already have selected a subelement (i and v above).
The remaining sum() is still necessary with my example though.

Here's my attempt:
def ranges(ls):
for l in ls:
consec = False
for (a,b) in zip(l, l[1:]+(None,)):
if b == a+1:
consec = True
if b is not None and b != a+1:
consec = False
if consec:
yield 1
'''
>>> nums = ((1, 2, 3, 4), (1, 5, 6), (19, 20, 24, 29, 400))
>>> print sum(ranges(nums))
7
'''
It looks at the numbers pairwise, checking if they are a consecutive pair (unless it's at the last element of the list). Each time there's a consecutive pair of numbers it yields 1.

This could probably be put together in a more compact form, but I think clarity would suffer:
def pairs(seq):
for i in range(1,len(seq)):
yield (seq[i-1], seq[i])
def isadjacent(pair):
return pair[0]+1 == pair[1]
def sumrange(seq):
return 1 + sum([1 for pair in pairs(seq) if not isadjacent(pair)])
def sumranges(nums):
return sum([sumrange(seq) for seq in nums])
nums = ((1, 2, 3, 4), (1, 5, 6), (19, 20, 24, 29, 400))
print sumranges(nums) # prints 7

You could probably do this better if you had an IntervalSet class because then you would scan through your ranges to build your IntervalSet, then just use the count of set members.
Some tasks don't always lend themselves to neat code, particularly if you need to write the code for performance.

There is a formula for this, the sum of the first n numbers, 1+ 2+ ... + n = n(n+1) / 2 . Then if you want to have the sum of i-j then it is (j(j+1)/2) - (i(i+1)/2) this I am sure simplifies but you can work that out. It might not be pythonic but it is what I would use.

Related

python get indexes of where number starts and ends in a list

I have a query which is a list of numbers. I want to get the ranges of indices in which the number 1 appears. The range starts when 1 appears and ends on the index in which it doesn't appear. I have made an example to illustrate this.
query= [0,0,0,0,0,1,1,1,0,1,0,0,1,1,0]
answer = [[5,8],[9,10],[12,14]]
Note: I am not looking for the first and last index of some value in a list in Python. I'm looking for all the places in which they start and end.
Update: From some of the suggested answers below it looks like Itertools is quite handy for this stuff.
You can use itertools.dropwhile to do this.
>>> query = [0,0,0,0,0,1,1,1,0,1,0,0,1,1,0]
>>> n = 1
>>>
>>> from itertools import dropwhile
>>> itr = enumerate(query)
>>> [[i, next(dropwhile(lambda t: t[1]==n, itr), [len(query)])[0]] for i,x in itr if x==n]
[[5, 8], [9, 10], [12, 14]]
You could also use itertools.groupby for this. Use enumerate to get the indices, then groupby the actual value, then filter by the value, finally get the first and last index from the group.
>>> from itertools import groupby
>>> query = [0,0,0,0,0,1,1,1,0,1,0,0,1,1,0]
>>> [(g[0][0], g[-1][0]+1) for g in (list(g) for k, g in
... groupby(enumerate(query), key=lambda t: t[1]) if k == 1)]
...
[(5, 8), (9, 10), (12, 14)]
query= [0,0,0,0,0,1,1,1,0,1,0,0,1,1,0]
first = 0 # Track the first index in the current group
ingroup = False # Track whether we are currently in a group of ones
answer = []
for i, e in enumerate(query):
if e:
if not ingroup:
first = i
else:
if ingroup:
answer.append([first, i])
ingroup = e
if ingroup:
answer.append([first, len(query)])
>>> answer
[[5, 8], [9, 10], [12, 14]]
I think you probably want something like this.
you can just use basic for loop and if statement where are you checking
where the series of '0' changes to a series of '1' and vice versa
query= [0,0,0,0,0,1,1,1,0,1,0,0,1,1,0]
r_0 = []
r_1 = []
for i in range(len(query)-1):
if query[i] == 0 and query[i+1] == 1:
r_0.append(i+1) # [5, 9, 12]
if query[i] == 1 and query[i + 1] == 0:
r_1.append(i + 1) # [8, 10, 14]
print (list(zip(r_0,r_1)))
output:
[(5, 8), (9, 10), (12, 14)]
Hope this helps. Its a solution without foor-loops
from itertools import chain
query = [0,0,0,0,0,1,1,1,0,1,0,0,1,1,0]
result = list(zip(
filter(lambda i: query[i] == 1 and (i == 0 or query[i-1] != 1), range(len(query))),
chain(filter(lambda i: query[i] != 1 and query[i-1] == 1, range(1, len(query))), [len(query)-1])
))
print(result)
The output is:
[(2, 3), (5, 8), (9, 10), (12, 14)]
Wanted to share a recursive approach
query= [0,0,0,0,0,1,1,1,0,1,0,0,1,1,0]
def findOccurrences(of, at, index=0, occurrences=None):
if occurrences == None: occurrences = [] # python has a weird behavior over lists as the default
# parameter, unfortunately this neets to be done
try:
last = start = query.index(of, index)
for i in at[start:]:
if i == of:
last += 1
else:
break
occurrences.append([start, last])
return findOccurrences(of, at, last, occurrences)
except:
pass
return occurrences
print(findOccurrences(1, query))
print(findOccurrences(1, query, 0)) # Offseting
print(findOccurrences(0, query, 9)) # Offseting with defaul list

Python - Combination of Numbers Summing to Greater than or Equal to a Value

I was recently posed the following interview question to be answered in Python - given a list of quantity-value pairs, find the optimal combination(s) of sets of values whose sum is as close to, and at least as large as, some provided value.
For example, given: [(1, $5), (3, $10), (2, $15)], and a desired value of $36, the answer would be [(2,$15), (1,$10)] or [(1,$15), (2,$10), (1,$5)]. The reason is that $40 is the least sum greater than or equal to $36 that can be achieved, and these are the two ways to achieve that sum.
I got stumped. Does anyone have a solution?
The numbers are so small you can just brute force it:
In []:
notes = [(1, 5), (3, 10), (2, 15)]
wallet = [n for a, b in notes for n in [b]*a]
combs = {sum(x): x for i in range(1, len(wallet)) for x in it.combinations(wallet, i)}
target = 36
for i in sorted(combs):
if i >= target:
break
i, combs[i]
Out[]:
(40, (5, 10, 10, 15))
You can extend this for all combinations, just replace the combs dictionary comprehension with:
combs = {}
for i in range(1, len(wallet)):
for x in it.combinations(wallet, i):
combs.setdefault(sum(x), set()).add(x)
...
i, combs[i]
Out[]:
(40, {(5, 10, 10, 15), (10, 15, 15)})

Print a line if conditions have not been met

Hello fellow stackoverflowers, I am practising my Python with an example question given to me (actually a Google interview practice question) and ran into a problem I did not know how to a) pose properly (hence vague title), b) overcome.
The question is: For an array of numbers (given or random) find unique pairs of numbers within the array which when summed give a given number. E.G: find the pairs of numbers in the array below which add to 6.
[1 2 4 5 11]
So in the above case:
[1,5] and [2,4]
The code I have written is:
from secrets import *
i = 10
x = randbelow(10)
number = randbelow(100) #Generate a random number to be the sum that we are after#
if number == 0:
pass
else:
number = number
array = []
while i>0: #Generate a random array to use#
array.append(x)
x = x + randbelow(10)
i -= 1
print("The following is a randomly generated array:\n" + str(array))
print("Within this array we are looking for a pair of numbers which sum to " + str(number))
for i in range(0,10):
for j in range(0,10):
if i == j or i>j:
pass
else:
elem_sum = array[i] + array[j]
if elem_sum == number:
number_one = array[i]
number_two = array[j]
print("A pair of numbers within the array which satisfy that condition is: " + str(number_one) + " and " + str(number_two))
else:
pass
If no pairs are found, I want the line "No pairs were found". I was thinking a try/except, but wasn't sure if it was correct or how to implement it. Also, I'm unsure on how to stop repeated pairs appearing (unique pairs only), so for example if I wanted 22 as a sum and had the array:
[7, 9, 9, 13, 13, 14, 23, 32, 41, 45]
[9,13] would appear twice
Finally forgive me if there are redundancies/the code isn't written very efficiently, I'm slowly learning so any other tips would be greatly appreciated!
Thanks for reading :)
You can simply add a Boolean holding the answer to "was at least one pair found?".
initialize it as found = false at the beginning of your code.
Then, whenever you find a pair (the condition block that holds your current print command), just add found = true.
after all of your search (the double for loop`), add this:
if not found:
print("No pairs were found")
Instead of actually comparing each pair of numbers, you can just iterate the list once, subtract the current number from the target number, and see if the remainder is in the list. If you convert the list to a set first, that lookup can be done in O(1), reducing the overall complexity from O(n²) to just O(n). Also, the whole thing can be done in a single line with a list comprehension:
>>> nums = [1, 2, 4, 5, 11]
>>> target = 6
>>> nums_set = set(nums)
>>> pairs = [(n, target-n) for n in nums_set if target-n in nums_set and n <= target/2]
>>> print(pairs)
[(1, 5), (2, 4)]
For printing the pairs or some message, you can use the or keyword. x or y is interpreted as x if x else y, so if the result set is empty, the message is printed, otherwise the result set itself.
>>> pairs = []
>>> print(pairs or "No pairs found")
No pairs found
Update: The above can fail, if the number added to itself equals the target, but is only contained once in the set. In this case, you can use a collections.Counter instead of a set and check the multiplicity of that number first.
>>> nums = [1, 2, 4, 5, 11, 3]
>>> nums_set = set(nums)
>>> [(n, target-n) for n in nums_set if target-n in nums_set and n <= target/2]
[(1, 5), (2, 4), (3, 3)]
>>> nums_counts = collections.Counter(nums)
>>> [(n, target-n) for n in nums_counts if target-n in nums_counts and n <= target/2 and n != target-n or nums_counts[n] > 1]
[(1, 5), (2, 4)]
List your constraints first!
numbers added must be unique
only 2 numbers can be added
the length of the array can be arbitrary
the number to be summed to can be arbitrary
& Don't skip preprocessing! Reduce your problem-space.
2 things off the bat:
Starting after your 2 print statements, the I would do array = list(set(array)) to reduce the problem-space to [7, 9, 13, 14, 23, 32, 41, 45].
Assuming that all the numbers in question will be positive, I would discard numbers above number. :
array = [x for x in array if x < number]
giving [7, 9, 9, 13, 13, 14]
Combine the last 2 steps into a list comprehension and then use that as array:
smaller_array = [x for x in list(set(array)) if x < number]
which gives array == [7, 9, 13, 14]
After these two steps, you can do a bunch of stuff. I'm fully aware that I haven't answered your question, but from here you got this. ^this is the kind of stuff I'd assume google wants to see.

Adding values from loop to a tuple results in nested tuples instead of a flat tuple or list

I'm trying to perform the following:
tup1 = ()
for i in range(1, 10, 2):
tup1 = (tup1, i)
print tup1
I expect the output to be the sequence 1 to 10.
However, I end up with the following:
((((((), 0), 2), 4), 6), 8)
What would be a correct way to meet the requirement?
If you just want an iterable with the even numbers 1 to 10 then the simplest way to do it:
seq = range(2, 11, 2)
If you are doing this as a means of learning Python and you want to build up your own data structure, use a list:
l = []
for i in range(2, 11, 2):
l.append(i)
The above for loop can be rewritten as a list comprehension:
l = [i for i in range(2, 11, 2)]
or using an if clause in the loop comprehension:
l = [ i for i in range(1, 11) if i % 2 == 0]
You can append an item to a tuple using the += operator.
tup1=()
for i in range(1,10,2):
tup1+= (i,)
print tup1
This prints (1, 3, 5, 7, 9)
Tuples are immutable objects in Python. Thus means you can't modify them. What you're doing right now is creating a new tuple with the previous one inside
You could do:
lst = []
for i in range(1,10,2):
lst.append(i)
tup = tuple(lst) #If you really want a tuple
print tup
But lst = range(1,10,2) or tup = tuple(range(1,10,2)) is much better (Unless you want to use append for some reason)
Read about List Comprehension
tuple(i for i in range(1, 10, 2))
Or
tup1 = ()
for i in range(1, 10, 2):
tup1 += (i,)
print tup1
it's something like this:
print range(1, 11)
You are skipping by two by using for i in range(1,10,2): if you use for i in range(1,11): if will increment by 1. As for tup1=(tup1,i) you are constantly adding a tuple to each other which is creating the weird output. You could use a list if you want to store them. Otherwise using will do it just fine:
print(range(10))
List item
For appending into list or tuple you can use append() function or you can use += operator which does the same.
s=()
for sequence of numbers from 1 to 10
for i in range(1,11):
s+=(i,)
print(s) #(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
for sequence of numbers from 1 to 10 with step size 2
x=()
for i in range(1,11,2):
x+=(i,)
print(x) #odd nos from 1-9 (1, 3, 5, 7, 9)
x=()
for i in range(2,11,2):
x+=(i,)
print(x) #even nos from 2-10 (2, 4, 6, 8, 10)
List item
Storing values from loop in a list or tuple in Python by following ways -
-> By appending the value in the list (here new_data1) as join will not work here.
new_data1 = []
for line in all_words:
new_data=' '.join(lemmatize_sentence(line))
new_data1.append(new_data)
#print (new_data)
print (new_data1)
P.S. - This is just a snapshot of a code just for hint .
Hope this helps!!

Python shuffle such that position will never repeat

I'd like to do a random shuffle of a list but with one condition: an element can never be in the same original position after the shuffle.
Is there a one line way to do such in python for a list?
Example:
list_ex = [1,2,3]
each of the following shuffled lists should have the same probability of being sampled after the shuffle:
list_ex_shuffled = [2,3,1]
list_ex_shuffled = [3,1,2]
but the permutations [1,2,3], [1,3,2], [2,1,3] and [3,2,1] are not allowed since all of them repeat one of the elements positions.
NOTE: Each element in the list_ex is a unique id. No repetition of the same element is allowed.
Randomize in a loop and keep rejecting the results until your condition is satisfied:
import random
def shuffle_list(some_list):
randomized_list = some_list[:]
while True:
random.shuffle(randomized_list)
for a, b in zip(some_list, randomized_list):
if a == b:
break
else:
return randomized_list
I'd describe such shuffles as 'permutations with no fixed points'. They're also known as derangements.
The probability that a random permutation is a derangement is approximately 1/e (fun to prove). This is true however long the list. Thus an obvious algorithm to give a random derangement is to shuffle the cards normally, and keep shuffling until you have a derangement. The expected number of necessary shuffles is about 3, and it's rare you'll have to shuffle more than ten times.
(1-1/e)**11 < 1%
Suppose there are n people at a party, each of whom brought an umbrella. At the end of the party, each person takes an umbrella at random from the basket. What is the probability that no-one holds their own umbrella?
You could generate all possible valid shufflings:
>>> list_ex = [1,2,3]
>>> import itertools
>>> list(itertools.ifilter(lambda p: not any(i1==i2 for i1,i2 in zip(list_ex, p)),
... itertools.permutations(list_ex, len(list_ex))))
[(2, 3, 1), (3, 1, 2)]
For some other sequence:
>>> list_ex = [7,8,9,0]
>>> list(itertools.ifilter(lambda p: not any(i1==i2 for i1,i2 in zip(list_ex, p)),
... itertools.permutations(list_ex, len(list_ex))))
[(8, 7, 0, 9), (8, 9, 0, 7), (8, 0, 7, 9), (9, 7, 0, 8), (9, 0, 7, 8), (9, 0, 8, 7), (0, 7, 8, 9), (0, 9, 7, 8), (0, 9, 8, 7)]
You could also make this a bit more efficient by short-circuiting the iterator if you just want one result:
>>> list_ex = [1,2,3]
>>> i = itertools.ifilter(lambda p: not any(i1==i2 for i1,i2 in zip(list_ex, p)),
... itertools.permutations(list_ex, len(list_ex)))
>>> next(i)
(2, 3, 1)
But, it would not be a random choice. You'd have to generate all of them and choose one for it to be an actual random result:
>>> list_ex = [1,2,3]
>>> i = itertools.ifilter(lambda p: not any(i1==i2 for i1,i2 in zip(list_ex, p)),
... itertools.permutations(list_ex, len(list_ex)))
>>> import random
>>> random.choice(list(i))
(2, 3, 1)
Here is another take on this. You can pick one solution or another depending on your needs. This is not a one liner but shuffles the indices of elements instead of the elements themselves. Thus, the original list may have duplicate values or values of types that cannot be compared or may be expensive to compare.
#! /usr/bin/env python
import random
def shuffled_values(data):
list_length = len(data)
candidate = range(list_length)
while True:
random.shuffle(candidate)
if not any(i==j for i,j in zip(candidate, range(list_length))):
yield [data[i] for i in candidate]
list_ex = [1, 2, 3]
list_gen = shuffled_values(list_ex)
for i in range(0, 10):
print list_gen.next()
This gives:
[2, 3, 1]
[3, 1, 2]
[3, 1, 2]
[2, 3, 1]
[3, 1, 2]
[3, 1, 2]
[2, 3, 1]
[2, 3, 1]
[3, 1, 2]
[2, 3, 1]
If list_ex is [2, 2, 2], this method will keep yielding [2, 2, 2] over and over. The other solutions will give you empty lists. I am not sure what you want in this case.
Use Knuth-Durstenfeld to shuffle the list. As long as it is found to be in the original position during the shuffling process, a new shuffling process is started from the beginning until it returns to a qualified arrangement. The time complexity of this algorithm is the smallest constant term:
def _random_derangement(x: list, randint: Callable[[int, int], int]) -> None:
'''
Random derangement list x in place, and return None.
An element can never be in the same original position after the shuffle. provides uniform distribution over permutations.
The formal parameter randint requires a callable object such as rand_int(b, a) that generates a random integer within the specified closed interval.
'''
from collections import namedtuple
sequence_type = namedtuple('sequence_type', ('sequence_number', 'elem'))
x_length = len(x)
if x_length > 1:
for i in range(x_length):
x[i] = sequence_type(sequence_number = i, elem = x[i])
end_label = x_length - 1
while True:
for i in range(end_label, 0, -1):
random_location = randint(i, 0)
if x[random_location].sequence_number != i:
x[i], x[random_location] = x[random_location], x[i]
else:
break
else:
if x[0].sequence_number != 0: break
for i in range(x_length):
x[i] = x[i].elem
complete_shuffle
Here's another algorithm. Take cards at random. If your ith card is card i, put it back and try again. Only problem, what if when you get to the last card it's the one you don't want. Swap it with one of the others.
I think this is fair (uniformally random).
import random
def permutation_without_fixed_points(n):
if n == 1:
raise ArgumentError, "n must be greater than 1"
result = []
remaining = range(n)
i = 0
while remaining:
if remaining == [n-1]:
break
x = i
while x == i:
j = random.randrange(len(remaining))
x = remaining[j]
remaining.pop(j)
result.append(x)
i += 1
if remaining == [n-1]:
j = random.randrange(n-1)
result.append(result[j])
result[j] = n
return result

Categories