Say I have an array of positive whole integers; I'd like to manipulate the order so that the concatenation of the resultant array is the largest number possible. For example [97, 9, 13] results in 99713; [9,1,95,17,5] results in 9955171. I'm not sure of an answer.
sorted(x, cmp=lambda a, b: -1 if str(b)+str(a) < str(a)+str(b) else 1)
Intuitively, we can see that a reverse sort of single digit numbers would lead to the highest number:
>>> ''.join(sorted(['1', '5', '2', '9'], reverse=True))
'9521'
so reverse sorting should work. The problem arises when there are multi-digit snippets in the input. Here, intuition again lets us order 9 before 95 and 17 before 1, but why does that work? Again, if they had been the same length, it would have been clear how to sort them:
95 < 99
96 < 97
14 < 17
The trick then, is to 'extend' shorter numbers so they can be compared with the longer ones and can be sorted automatically, lexicographically. All you need to do, really, is to repeat the snippet to beyond the maximum length:
comparing 9 and 95: compare 999 and 9595 instead and thus 999 comes first.
comparing 1 and 17: compare 111 and 1717 instead and thus 1717 comes first.
comparing 132 and 13: compare 132132 and 1313 instead and thus 132132 comes first.
comparing 23 and 2341: compare 232323 and 23412341 instead and thus 2341 comes first.
This works because python only needs to compare the two snippets until they differ somewhere; and it's (repeating) matching prefixes that we need to skip when comparing two snippets to determine which order they need to be in to form a largest number.
You only need to repeat a snippet until it is longer than the longest snippet * 2 in the input to guarantee that you can find the first non-matching digit when comparing two snippets.
You can do this with a key argument to sorted(), but you need to determine the maximum length of the snippets first. Using that length, you can 'pad' all snippets in the sort key until they are longer than that maximum length:
def largestpossible(snippets):
    snippets = [str(s) for s in snippets]
    mlen = max(len(s) for s in snippets) * 2  # double the length of the longest snippet
    return ''.join(sorted(snippets, reverse=True, key=lambda s: s*(mlen//len(s)+1)))
where s*(mlen//len(s)+1) pads the snippet with itself to be more than mlen in length.
This gives:
>>> combos = {
... '12012011': [1201, 120, 1],
... '87887': [87, 878],
... '99713': [97, 9, 13],
... '9955171': [9, 1, 95, 17, 5],
... '99799713': [97, 9, 13, 979],
... '10100': [100, 10],
... '13213': [13, 132],
... '8788717': [87, 17, 878],
... '93621221': [936, 21, 212],
... '11101110': [1, 1101, 110],
... }
>>> def test(f):
...     for k,v in combos.items():
...         print '{} -> {} ({})'.format(v, f(v), 'correct' if f(v) == k else 'incorrect, should be {}'.format(k))
...
>>> test(largestpossible)
[97, 9, 13] -> 99713 (correct)
[1, 1101, 110] -> 11101110 (correct)
[936, 21, 212] -> 93621221 (correct)
[13, 132] -> 13213 (correct)
[97, 9, 13, 979] -> 99799713 (correct)
[87, 878] -> 87887 (correct)
[1201, 120, 1] -> 12012011 (correct)
[100, 10] -> 10100 (correct)
[9, 1, 95, 17, 5] -> 9955171 (correct)
[87, 17, 878] -> 8788717 (correct)
Note that this solution is a) 3 lines short and b) works on Python 3 as well without having to resort to functools.cmp_to_key() and c) does not bruteforce the solution (which is what the itertools.permutations option does).
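For readers on Python 3 who still prefer the pairwise comparison from the one-liner at the top, a minimal sketch using functools.cmp_to_key might look like this. The name largest_concat is made up for illustration; the comparator places a before b whenever the concatenation a+b is larger than b+a.

```python
from functools import cmp_to_key

def largest_concat(nums):
    """Return the largest number (as a string) formed by concatenating nums."""
    strs = [str(n) for n in nums]

    def compare(a, b):
        ab, ba = a + b, b + a
        # Negative result means "a sorts before b", which we want when ab > ba.
        return (ab < ba) - (ab > ba)

    return ''.join(sorted(strs, key=cmp_to_key(compare)))

print(largest_concat([9, 1, 95, 17, 5]))
```

This is the comparison-based alternative to the padded key, so it trades the single key computation per element for one string concatenation pair per comparison.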
Hint one: you concatenate strings, not integers.
Hint two: itertools.permutations().
import itertools
nums = ["9", "97", "13"]
m = max(("".join(p) for p in itertools.permutations(nums)), key = int)
You can use itertools.permutations as hinted and use the key argument of the max function (which tells which function to apply to each element in order to decide the maximum) after you concat them with the join function.
It's easier to work with strings to begin with.
I don't like the brute force approach to this. It requires a massive amount of computation for large sets.
You can write your own comparison function for the sorted builtin method, which will return a sorting parameter for any pair, based on any logic you put in the function.
Sample code:
def compareInts(a,b):
    # create string representations
    sa = str(a)
    sb = str(b)
    # compare character by character, left to right
    # up to first inequality
    # if you hit the end of one str before the other,
    # and all is equal up til then, continue to next step
    for i in xrange(min(len(sa), len(sb))):
        if sa[i] > sb[i]:
            return 1
        elif sa[i] < sb[i]:
            return -1
    # if we got here, they are both identical up to the length of the shorter
    # one.
    # this means we need to compare the shorter number again to the
    # remainder of the longer
    # at this point we need to know which is shorter
    if len(sa) > len(sb): # sa is longer, so slice it
        return compareInts(sa[len(sb):], sb)
    elif len(sa) < len(sb): # sb is longer, slice it
        return compareInts(sa, sb[len(sa):])
    else:
        # both are the same length, and therefore equal, return 0
        return 0
def NumberFromList(numlist):
    return int(''.join('{}'.format(n) for n in numlist))
nums = [97, 9, 13, 979]
sortednums = sorted(nums, cmp = compareInts, reverse = True)
print nums # [97, 9, 13, 979]
print sortednums # [9, 979, 97, 13]
print NumberFromList(sortednums) # 99799713
Well, there's always the brute force approach...
from itertools import permutations
lst = [9, 1, 95, 17, 5]
max(int(''.join(str(x) for x in y)) for y in permutations(lst))
=> 9955171
Or this, an adaptation of @Zah's answer that receives a list of integers and returns an integer, as specified in the question:
int(max((''.join(y) for y in permutations(str(x) for x in lst)), key=int))
=> 9955171
You can do this with some clever sorting.
If two strings are the same length, choose the larger of the two to come first. Easy.
If they're not the same length, figure out what would be the result if the best possible combination were appended to the shorter one. Since everything that follows the shorter one must be equal to or less than it, you can determine this by appending the short one to itself until it's the same size as the longer one. Once they're the same length you do a direct comparison as before.
If the second comparison is equal, you've proven that the shorter string can't possibly be better than the longer one. Depending on what it's paired with it could still come out worse, so the longer one should come first.
def compare(s1, s2):
    if len(s1) == len(s2):
        return -1 if s1 > s2 else int(s2 > s1)
    s1x, s2x = s1, s2
    m = max(len(s1), len(s2))
    while len(s1x) < m:
        s1x = s1x + s1
    s1x = s1x[:m]
    while len(s2x) < m:
        s2x = s2x + s2
    s2x = s2x[:m]
    return -1 if s1x > s2x or (s1x == s2x and len(s1) > len(s2)) else 1
def solve_puzzle(seq):
    return ''.join(sorted([str(x) for x in seq], cmp=compare))
>>> solve_puzzle([9, 1, 95, 17, 5])
'9955171'
>>> solve_puzzle([97, 9, 13])
'99713'
>>> solve_puzzle([936, 21, 212])
'93621221'
>>> solve_puzzle([87, 17, 878])
'8788717'
>>> solve_puzzle([97, 9, 13, 979])
'99799713'
This should be much more efficient than running through all the permutations.
import itertools
def largestInt(a):
    b = list(itertools.permutations(a))
    c = []
    x = ""
    for i in xrange(len(b)):
        c.append(x.join(map(str, b[i])))
    return max(c)
I am trying to find the elements of an integer array or list that are unique and not divisible by any other element of the same array or list.
You can answer in any language like python, java, c, c++ etc.
I have tried this code in Python3 and it works perfectly, but I am looking for a better solution in terms of time complexity.
assuming array or list A is already sorted and having unique elements
A = [2,3,4,5,6,7,8,9,10,11,12,13,14,15,16]
i = 0
j = 1
while i<len(A)-1:
    while j<len(A):
        if A[j]%A[i]==0:
            A.pop(j)
        else:
            j+=1
    i+=1
    j=i+1
For the given array A=[2,3,4,5,6,7,8,9,10,11,12,13,14,15,16] answer would be like ans=[2,3,5,7,11,13]
another example,A=[4,5,15,16,17,23,39] then ans would be like, ans=[4,5,17,23,39]
ans is having unique numbers
any element i from array only exists if (i%j)!=0, where i!=j
I think it's more natural to do it in reverse, by building a new list containing the answer instead of removing elements from the original list. If I'm thinking correctly, both approaches do the same number of mod operations, but you avoid the issue of removing an element from a list.
A = [2,3,4,5,6,7,8,9,10,11,12,13,14,15,16]
ans = []
for x in A:
    for y in ans:
        if x % y == 0:
            break
    else: ans.append(x)
Edit: Promoting the completion else.
This algorithm will perform much faster:
A = [2,3,4,5,6,7,8,9,10,11,12,13,14,15,16]
if (A[-1]-A[0])/A[0] > len(A)*2:
    result = list()
    for v in A:
        for f in result:
            d,m = divmod(v,f)
            if m == 0: v=0;break
            if d<f: break
        if v: result.append(v)
else:
    retain = set(A)
    minMult = 1
    maxVal = A[-1]
    for v in A:
        if v not in retain : continue
        minMult = v*2
        if minMult > maxVal: break
        if v*len(A)<maxVal:
            retain.difference_update([m for m in retain if m >= minMult and m%v==0])
        else:
            retain.difference_update(range(minMult,maxVal+1,v))  # +1 so maxVal itself is excluded when it is a multiple
        if maxVal%v == 0:
            maxVal = max(retain)
    result = list(retain)
print(result) # [2, 3, 5, 7, 11, 13]
In the spirit of the sieve of Eratosthenes, each number that is retained removes its multiples from the remaining eligible numbers. Depending on the magnitude of the highest value, it is sometimes more efficient to exclude multiples than to check for divisibility. The divisibility check takes several times longer for an equivalent number of factors to check.
At some point, when the data is widely spread out, assembling the result instead of removing multiples becomes faster (this last addition was inspired by Imperishable Night's post).
TEST RESULTS
A = [2,3,4,5,6,7,8,9,10,11,12,13,14,15,16] (100000 repetitions)
Original: 0.55 sec
New: 0.29 sec
A = list(range(2,5000))+[9697] (100 repetitions)
Original: 3.77 sec
New: 0.12 sec
A = list(range(1001,2000))+list(range(4000,6000))+[9697**2] (10 repetitions)
Original: 3.54 sec
New: 0.02 sec
I know that this is totally insane but i want to know what you think about this:
A = [4,5,15,16,17,23,39]
prova=[[x for x in A if x!=y and y%x==0] for y in A]
print([A[idx] for idx,x in enumerate(prova) if len(prova[idx])==0])
And i think it's still O(n^2)
If you care about speed more than algorithmic efficiency, numpy would be the package to use here in python:
import numpy as np
# Note: doesn't have to be sorted
a = [2, 2, 3, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 16, 29, 29]
a = np.unique(a)
result = a[np.all((a % a[:, None] + np.diag(a)), axis=0)]
# array([2, 3, 5, 7, 11, 13, 29])
This divides all elements by all other elements and stores the remainder in a matrix, checks which columns contain only non-0 values (other than the diagonal), and selects all elements corresponding to those columns.
This is O(n*M) where M is the max size of an integer in your list. The integers are all assumed to be non-negative. This also assumes your input list is sorted (I came to that assumption since all the lists you provided are sorted).
a = [4, 7, 7, 8]
# a = [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]
# a = [4, 5, 15, 16, 17, 23, 39]
M = max(a)
used = set()
final_list = []
for e in a:
    if e in used:
        continue
    else:
        used.add(e)
        for i in range(e, M + 1):
            if not (i % e):
                used.add(i)
        final_list.append(e)
print(final_list)
Maybe this can be optimized even further...
If the list is not sorted, it must be sorted first for the above method to work. The time complexity then becomes O(nlogn + Mn), which reduces to O(nlogn) only when M is small compared to log n.
My self-learning task is to find how many sequences are in the list. A sequence is a group of numbers where each is 1 bigger than the previous one. So, in the list:
[1,2,3,5,8,10,12,13,14,15,17,19,21,23,24,25,26]
there are 3 sequences:
1,2,3
12,13,14,15
23,24,25,26
I've spent a few hours and got a solution, which I think is a workaround rather than the real solution.
My solution is to keep a separate list for collecting sequences and to count the attempts to update this list. I count the very first append, and every new append except when the number belongs to a sequence that already exists.
I believe there is a solution without the additional list, one that counts the sequences themselves rather than the list manipulation attempts.
numbers = [1,2,3,5,8,10,12,13,14,15,17,19,21,23,24,25,26]
goods = []
count = 0
for i in range(len(numbers)-1):
    if numbers[i] + 1 == numbers[i+1]:
        if goods == []:
            goods.append(numbers[i])
            count = count + 1
        elif numbers[i] != goods[-1]:
            goods.append(numbers[i])
            count = count + 1
        if numbers[i+1] != goods[-1]:
            goods.append(numbers[i+1])
The output from my debugging:
Number 1 added to: [1]
First count change: 1
Number 12 added to: [1, 2, 3, 12]
Normal count change: 2
Number 23 added to: [1, 2, 3, 12, 13, 14, 15, 23]
Normal count change: 3
Thanks everyone for your help!
Legman suggested the original solution that I had failed to implement before I ended up with the other solution in this post.
MSeifert helped me find the right way with the lists:
numbers = [1,2,3,5,8,10,12,13,14,15,17,19,21,23,24,25,26]
print("Numbers:", numbers)
goods = []
count = 0
for i in range(len(numbers)-1):
    if numbers[i] + 1 == numbers[i+1]:
        if goods == []:
            goods.append([numbers[i]])
            count = count + 1
        elif numbers[i] != goods[-1][-1]:
            goods.append([numbers[i]])
            count = count + 1
        if numbers[i+1] != goods[-1]:
            goods[-1].extend([numbers[i+1]])
print("Sequences:", goods)
print("Number of sequences:", len(goods))
One way would be to iterate over pairwise elements:
l = [1,2,3,5,8,10,12,13,14,15,17,19,21,23,24,25,26]
res = [[]]
for item1, item2 in zip(l, l[1:]): # pairwise iteration
    if item2 - item1 == 1:
        # The difference is 1, if we're at the beginning of a sequence add both
        # to the result, otherwise just the second one (the first one is already
        # included because of the previous iteration).
        if not res[-1]: # index -1 means "last element".
            res[-1].extend((item1, item2))
        else:
            res[-1].append(item2)
    elif res[-1]:
        # The difference isn't 1 so add a new empty list in case it just ended a sequence.
        res.append([])
# In case "l" doesn't end with a "sequence" one needs to remove the trailing empty list.
if not res[-1]:
    del res[-1]
>>> res
[[1, 2, 3], [12, 13, 14, 15], [23, 24, 25, 26]]
>>> len(res) # the amount of these sequences
3
A solution without zip only requires small changes (the loop and the beginning of the loop) compared to the approach above:
l = [1,2,3,5,8,10,12,13,14,15,17,19,21,23,24,25,26]
res = [[]]
for idx in range(1, len(l)):
    item1 = l[idx-1]
    item2 = l[idx]
    if item2 - item1 == 1:
        if not res[-1]:
            res[-1].extend((item1, item2))
        else:
            res[-1].append(item2)
    elif res[-1]:
        res.append([])
if not res[-1]:
    del res[-1]
Taken from python itertools documentation, as demonstrated here you can use itemgetter and groupby to do that using only one list, like so:
>>> from itertools import groupby
>>> from operator import itemgetter
>>>
>>> l = [1, 2, 3, 5, 8, 10, 12, 13, 14, 15, 17, 19, 21, 23, 24, 25, 26]
>>>
>>> counter = 0
>>> for k, g in groupby(enumerate(l), lambda (i,x):i-x):
...     seq = map(itemgetter(1), g)
...     if len(seq)>1:
...         print seq
...         counter+=1
...
[1, 2, 3]
[12, 13, 14, 15]
[23, 24, 25, 26]
>>> counter
3
Notice: As correctly mentioned by @MSeifert, tuple unpacking in the signature is only possible in Python 2 and it will fail on Python 3, so this is a Python 2.x solution.
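For Python 3, the same groupby idea can be written without tuple unpacking in the lambda. The following is a minimal sketch; the function name consecutive_runs is hypothetical. It relies on the fact that index minus value stays constant inside a run of consecutive numbers.

```python
from itertools import groupby

def consecutive_runs(values):
    """Return the runs of length > 1 in which each value is 1 more than the previous."""
    runs = []
    # Within a consecutive run, (index - value) is constant, so it works as a group key.
    for _, group in groupby(enumerate(values), key=lambda pair: pair[0] - pair[1]):
        run = [value for _, value in group]
        if len(run) > 1:
            runs.append(run)
    return runs

l = [1, 2, 3, 5, 8, 10, 12, 13, 14, 15, 17, 19, 21, 23, 24, 25, 26]
runs = consecutive_runs(l)
print(runs)
print(len(runs))
```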
This could be solved with dynamic programming. If you only want the number of sequences and don't need the sequences themselves, a couple of variables are enough. As you go through the list, you only need to know whether you are currently in a sequence. If you are not, check whether the next element is exactly 1 greater, which marks the beginning of a sequence; if you are, check whether the next element differs by something other than 1, which marks the end of the sequence. Finally, end the loop one cell before the end of the list: the last cell can't form a sequence by itself, and stopping early avoids an index error when you look at the next element. Below is example code:
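isSeq = False
count = 0
for i in range(len(numbers)-1):
    if not isSeq:
        # not in a sequence: does one start here?
        if numbers[i]+1 == numbers[i+1]:
            isSeq = True
            count = count+1
    elif numbers[i]+1 != numbers[i+1]:
        # in a sequence and the next element breaks it
        isSeq = False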
Here is a link to a dynamic programming tutorial.
https://www.codechef.com/wiki/tutorial-dynamic-programming
I've achieved these two things.
Find all possible sublists of a list in given range (i ,j).
A = [ 44, 55, 66, 77, 88, 99, 11, 22, 33 ]
Let, i = 2 and j = 4
Then, Possible sublists of the list "A" in the given range (2,4) is :
[66], [66,77], [66,77,88], [77], [77,88], [88]
And the minimum of the resultant products after multiplying all the elements of each sublist:
So, the resultant list after multiplying all the elements in the above sublists will become
X = [66, 5082, 447216, 77, 6776, 88]
Now, the minimum of the above list, which is min(X) i.e 66
My Code:
from functools import reduce  # built in on Python 2, must be imported on Python 3
i, j = 2, 4
A = [ 44, 55, 66, 77, 88, 99, 11, 22, 33 ]
O, P = i, i
mini = A[O]
while O <= j and P <= j:
    if O == P:
        mini = min(mini, reduce(lambda x, y: x * y, [A[O]]))
    else:
        mini = min(mini, reduce(lambda x, y: x * y, A[O:P + 1]))
    P += 1
    if P > j:
        O += 1
        P = O
print(mini)
My Question:
This code is taking more time to get executed for the Larger Lists and Larger Ranges ! Is there any possible "Pythonic" way of reducing the time complexity of the above code ? Thanks in advance !
EDIT :
Got it. But, If there is more than one such possible sublist with the same minimum product,
I need the longest sub list range (i,j)
If there are still more than one sublists with the same "longest sub range", I need to print the sub-interval which has the lowest start index.
Consider this list A = [2, 22, 10, 12, 2] if (i,j) = (0,4).
There is a tie. Min product = 2 with two possibilities '(0,0)' and '(4,4)' . Both sub list range = 0 [ (0-0) and (4-4) ]
In this case i need to print (minproduct, [sublist-range]) = 2, [0,0]
Tried using dictionaries, It works for some inputs but not for all ! How to do this 'efficiently' ?
Thank you !
First, given the list and the index range, we can get the sublist A[i : j + 1]
[66, 77, 88]
For positive integers a and b, a * b is no less than a or b. So you don't need to do multiplying, it's not possible that multiplying of two or more elements has a smaller result. The minimum of this list is the minimum of all the multiplying results.
So the result is:
min(A[i : j + 1])
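As a sanity check of that argument, the brute-force product over every contiguous sublist can be compared with the plain minimum. This is only a sketch for positive integers; the helper name min_product_bruteforce is made up for illustration.

```python
from functools import reduce
from operator import mul

def min_product_bruteforce(A, i, j):
    """Minimum product over all contiguous sublists of A[i..j] (inclusive bounds)."""
    return min(
        reduce(mul, A[m:n + 1])
        for m in range(i, j + 1)
        for n in range(m, j + 1)
    )

A = [44, 55, 66, 77, 88, 99, 11, 22, 33]
i, j = 2, 4
# For positive integers both approaches agree, so the products are unnecessary.
print(min_product_bruteforce(A, i, j), min(A[i:j + 1]))
```

The brute-force version does O(n^2) slices with up to n multiplications each, while min(A[i:j+1]) is a single linear pass.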
For generating the sublists, it is as simple as two nested for loops in a list comprehension:
def sublists(l,i,j):
    return [l[m:n+1] for m in range(i,j+1) for n in range(m,j+1)]
example:
>>> sublists(A,2,4)
[[66], [66, 77], [66, 77, 88], [77], [77, 88], [88]]
For finding the minimum product:
>>> min(map(prod, sublists(A,2,4)))
66
(you import prod from numpy, or define it as def prod(x): return reduce(lambda i,j:i*j,x))
The accepted answer is correct for all positive ints as you cannot multiply the smallest element by any number and get a smaller result. It might make more sense if you were getting all the slices greater than length 1.
If you were going to calculate it then you could use itertools.islice to get each slice and get the min using a generator expression:
from itertools import islice
from operator import mul
print(min(reduce(mul, islice(A, n, k + 1), 1)
for n in range(i, j + 1) for k in range(n, j + 1)))
66
If for i = 0 and j = 4 you considered (44, 55, 66, 88) a legitimate slice then you would need to use itertools.combinations.
#EDIT: Quick Solution:
min(A[i:j+1])
Since all the numbers are positive integers and you want the minimum product over all possible sublists of the A[i:j+1] slice, the candidates include the sublists of length 1. The minimum product over all such sublists is therefore the lowest number in the A[i:j+1] slice.
Another Solution:
The below method will be useful when you need to find the maximum product of sublists or you need all the possible combinations of A[i:j+1] list slice.
We'll use itertools.combinations to solve this. We can do this in 3 steps.
Step1: Get the slice of the list
my_list = A[i:j+1]
This will give us the slice to work on.
my_list = A[2:5]
my_list
[66, 77, 88]
Step-2 Generate all possible combinations:
import itertools
my_combinations = []
for x in range(1, len(my_list)+1):
    my_combinations.extend(list(itertools.combinations(my_list,x)))
my_combinations
[(66,), (77,), (88,), (66, 77), (66, 88), (77, 88), (66, 77, 88)]
itertools.combinations returns r-length subsequences of elements from the input iterable.
So, we will use this to generate subsequences of length 1 to length equal to length of my_list. We will get a list of tuples with each element being a subsequence.
Step-3 : Find min product of all possible combinations
products_list = [reduce(lambda i,j:i*j, x) for x in my_combinations]
[66, 77, 88, 5082, 5808, 6776, 447216]
min(products_list)
66
After getting the subsequences, we apply list comprehension along with reduce() to get the list of products for all the subsequences in my_combinations list. Then we apply min() function to get the minimum product out of the products_list which will give us our answer.
Take a look at itertools.combinations()
https://docs.python.org/3/library/itertools.html#itertools.combinations
Call it passing the sublist, in a loop, with the other parameter varying from 1 to the length of the sublist.
It will definitely take "more time to get executed for the Larger Lists and Larger Ranges", i think that's inevitable. But might be much faster than your approach. Measure and see.
def solution(a_list):
    sub = [[]]
    for i in range(len(a_list)):
        for j in range(len(a_list)):
            if(i == j):
                sub.append([a_list[i]])
            elif(i > j):
                sub.append([a_list[j],a_list[i]])
    sub.append(a_list)
    return sub
solution([10, 20, 30])
[[], [10], [10, 20], [20], [10, 30], [20, 30], [30], [10, 20, 30]]
I am after a string format to efficiently represent a set of indices.
For example "1-3,6,8-10,16" would produce [1,2,3,6,8,9,10,16]
Ideally I would also be able to represent infinite sequences.
Is there an existing standard way of doing this? Or a good library? Or can you propose your own format?
thanks!
Edit: Wow! - thanks for all the well considered responses. I agree I should use ':' instead. Any ideas about infinite lists? I was thinking of using "1.." to represent all positive numbers.
The use case is for a shopping cart. For some products I need to restrict product sales to multiples of X, for others any positive number. So I am after a string format to represent this in the database.
You don't need a string for that, This is as simple as it can get:
from types import SliceType
class sequence(object):
    def __getitem__(self, item):
        for a in item:
            if isinstance(a, SliceType):
                i = a.start
                step = a.step if a.step else 1
                while True:
                    if a.stop and i > a.stop:
                        break
                    yield i
                    i += step
            else:
                yield a
print list(sequence()[1:3,6,8:10,16])
Output:
[1, 2, 3, 6, 8, 9, 10, 16]
I'm using Python slice type power to express the sequence ranges. I'm also using generators to be memory efficient.
Please note that I'm treating the slice stop as inclusive; otherwise the ranges would be different, because the stop in ordinary slices is not included.
It supports steps:
>>> list(sequence()[1:3,6,8:20:2])
[1, 2, 3, 6, 8, 10, 12, 14, 16, 18, 20]
And infinite sequences:
sequence()[1:3,6,8:]
1, 2, 3, 6, 8, 9, 10, ...
If you have to give it a string then you can combine @ilya n.'s parser with this solution. I'll extend @ilya n.'s parser to support indexes as well as ranges:
def parser(input):
    ranges = [a.split('-') for a in input.split(',')]
    return [slice(*map(int, a)) if len(a) > 1 else int(a[0]) for a in ranges]
Now you can use it like this:
>>> print list(sequence()[parser('1-3,6,8-10,16')])
[1, 2, 3, 6, 8, 9, 10, 16]
If you're into something Pythonic, I think 1:3,6,8:10,16 would be a better choice, as x:y is a standard notation for index range and the syntax allows you to use this notation on objects. Note that the call
z[1:3,6,8:10,16]
gets translated into
z.__getitem__((slice(1, 3, None), 6, slice(8, 10, None), 16))
Even though this is a TypeError if z is a built-in container, you're free to create the class that will return something reasonable, e.g. as NumPy's arrays.
You might also say that by convention 5: and :5 represent infinite index ranges (this is a bit stretched as Python has no built-in types with negative or infinitely large positive indexes).
And here's the parser (a beautiful one-liner that suffers from slice(16, None, None) glitch described below):
def parse(s):
    return [slice(*map(int, x.split(':'))) for x in s.split(',')]
There's one pitfall, however: 8:10 by definition includes only indices 8 and 9 -- without upper bound. If that's unacceptable for your purposes, you certainly need a different format and 1-3,6,8-10,16 looks good to me. The parser then would be
def myslice(start, stop=None, step=None):
    return slice(start, (stop if stop is not None else start) + 1, step)
def parse(s):
    return [myslice(*map(int, x.split('-'))) for x in s.split(',')]
Update: here's the full parser for a combined format:
from sys import maxsize as INF
def indices(s: 'string with indices list') -> 'indices generator':
    for x in s.split(','):
        splitter = ':' if (':' in x) or (x[0] == '-') else '-'
        ix = x.split(splitter)
        # compare with != rather than "is not": identity checks on string literals are unreliable
        start = int(ix[0]) if ix[0] != '' else -INF
        if len(ix) == 1:
            stop = start + 1
        else:
            stop = int(ix[1]) if ix[1] != '' else INF
        step = int(ix[2]) if len(ix) > 2 else 1
        for y in range(start, stop + (splitter == '-'), step):
            yield y
This handles negative numbers as well, so
print(list(indices('-5, 1:3, 6, 8:15:2, 20-25, 18')))
prints
[-5, 1, 2, 6, 7, 8, 10, 12, 14, 20, 21, 22, 23, 24, 25, 18, 19]
Yet another alternative is to use ... (which Python recognizes as the built-in constant Ellipsis so you can call z[...] if you want) but I think 1,...,3,6, 8,...,10,16 is less readable.
This is probably about as lazily as it can be done, meaning it will be okay for even very large lists:
def makerange(s):
    for nums in s.split(","): # whole list comma-delimited
        range_ = nums.split("-") # number might have a dash - if not, no big deal
        start = int(range_[0])
        for i in xrange(start, start + 1 if len(range_) == 1 else int(range_[1]) + 1):
            yield i
s = "1-3,6,8-10,16"
print list(makerange(s))
output:
[1, 2, 3, 6, 8, 9, 10, 16]
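The same lazy approach can be stretched to cover the infinite sequences asked about in the question. The sketch below is written for Python 3 and assumes a notation that is not taken from any standard: a trailing dash such as "8-" means "8 and everything after it". The function name make_range is hypothetical, and negative numbers are not handled.

```python
from itertools import count, islice

def make_range(spec):
    """Lazily yield the integers described by a spec such as '1-3,6,8-10,16'.

    Assumed extension: a part with a trailing dash ('8-') is open-ended.
    """
    for part in spec.split(','):
        start, dash, end = part.strip().partition('-')
        if not dash:
            yield int(start)                             # single index
        elif end == '':
            yield from count(int(start))                 # open-ended: never terminates
        else:
            yield from range(int(start), int(end) + 1)   # inclusive upper bound

print(list(make_range("1-3,6,8-10,16")))
# An open-ended spec must be consumed with a limit, for example via islice.
print(list(islice(make_range("1-3,6,8-"), 7)))
```

Because an open-ended part never finishes, anything listed after it in the spec is never reached, so it only makes sense as the last part.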
import sys
class Sequencer(object):
    def __getitem__(self, items):
        if not isinstance(items, (tuple, list)):
            items = [items]
        for item in items:
            if isinstance(item, slice):
                for i in xrange(*item.indices(sys.maxint)):
                    yield i
            else:
                yield item
>>> s = Sequencer()
>>> print list(s[1:3,6,8:10,16])
[1, 2, 6, 8, 9, 16]
Note that I am using the xrange builtin to generate the sequence. That seems awkward at first because it doesn't include the upper number of sequences by default, however it proves to be very convenient. You can do things like:
>>> print list(s[1:10:3,5,5,16,13:5:-1])
[1, 4, 7, 5, 5, 16, 13, 12, 11, 10, 9, 8, 7, 6]
Which means you can use the step part of xrange.
This looked like a fun puzzle to go with my coffee this morning. If you settle on your given syntax (which looks okay to me, with some notes at the end), here is a pyparsing converter that will take your input string and return a list of integers:
from pyparsing import *
integer = Word(nums).setParseAction(lambda t : int(t[0]))
intrange = integer("start") + '-' + integer("end")
def validateRange(tokens):
    # use the results names defined on intrange ("start" and "end")
    if tokens.start > tokens.end:
        raise Exception("invalid range, start must be <= end")
intrange.setParseAction(validateRange)
intrange.addParseAction(lambda t: list(range(t.start, t.end+1)))
indices = delimitedList(intrange | integer)
def mergeRanges(tokens):
    ret = set()
    for item in tokens:
        if isinstance(item,int):
            ret.add(item)
        else:
            ret.update(item)  # sets do not support +=
    return sorted(ret)
indices.setParseAction(mergeRanges)
test = "1-3,6,8-10,16"
print indices.parseString(test)
This also takes care of any overlapping or duplicate entries, such as "3-8,4,6,3,4", and returns a list of just the unique integers.
The parser takes care of validating that ranges like "10-3" are not allowed. If you really wanted to allow this, and have something like "1,5-3,7" return 1,5,4,3,7, then you could tweak the intrange and mergeRanges parse actions to get this simpler result (and discard the validateRange parse action altogether).
You are very likely to get whitespace in your expressions, I assume that this is not significant. "1, 2, 3-6" would be handled the same as "1,2,3-6". Pyparsing does this by default, so you don't see any special whitespace handling in the code above (but it's there...)
This parser does not handle negative indices, but if that were needed too, just change the definition of integer to:
integer = Combine(Optional('-') + Word(nums)).setParseAction(lambda t : int(t[0]))
Your example didn't list any negatives, so I left it out for now.
Python uses ':' for a ranging delimiter, so your original string could have looked like "1:3,6,8:10,16", and Pascal used '..' for array ranges, giving "1..3,6,8..10,16" - meh, dashes are just as good as far as I'm concerned.
I have a generator that takes a number as an argument and yields other numbers.
I want to use the numbers yielded by this generator and pass them as arguments to the same generator, creating a chain of some length.
For example, mygenerator(2) yields 5, 4 and 6. Apply mygenerator to each of these numbers, over and over again to the numbers yielded. The generator always yields bigger numbers than the one passed as argument, and for 2 different numbers will never yield the same number.
mygenerator(2): 4 5
mygenerator(4) : 10 11 12
mygenerator(5): 9 300 500
So the set (9,10,11,12,300,500) has "distance" 2 from the original number, 2. If I apply it to the number 9, I will get a set of numbers with distance "3" from the original 2.
Essentially what I want is to create a set that has a specified distance from a given number and I have problems figuring out how to do that in Python. Help much appreciated :)
Suppose our generator yields the square and the cube of the given number, so that each call produces distinct values.
To get the numbers at distance D, in the simplest case we can recursively get the numbers at distance D-1 and then apply the generator to each of them:
def mygen(N):
    yield N**2
    yield N**3
def getSet(N, dist):
    if dist == 0:
        return [N]
    numbers = []
    for n in getSet(N, dist-1):
        numbers += list(mygen(n))
    return numbers
print getSet(2,0)
print getSet(2,1)
print getSet(2,2)
print getSet(2,3)
output is
[2]
[4, 8]
[16, 64, 64, 512]
[256, 4096, 4096, 262144, 4096, 262144, 262144, 134217728]
This solution does not require to keep all results in memory: (in case it doesn't fit in memory etc)
import itertools
def grandKids(generation, kidsFunc, val):
    layer = [val]
    for i in xrange(generation):
        layer = itertools.chain.from_iterable(itertools.imap(kidsFunc, layer))
    return layer
Example:
def kids(x): # children indices in a 1-based binary heap
    yield x*2
    yield x*2+1
>>> list(grandKids(3, kids, 2))
[16, 17, 18, 19, 20, 21, 22, 23]
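On Python 3, itertools.imap and xrange no longer exist, but the built-in map is already lazy, so a port of the same idea could look like this sketch. The snake_case names are just a renaming for illustration.

```python
from itertools import chain

def kids(x):
    """Children indices in a 1-based binary heap (same example as above)."""
    yield x * 2
    yield x * 2 + 1

def grand_kids(generation, kids_func, val):
    """Lazily produce the values `generation` steps away from val."""
    layer = [val]
    for _ in range(generation):
        # map() is lazy on Python 3, so no intermediate layer is built in memory.
        layer = chain.from_iterable(map(kids_func, layer))
    return layer

print(list(grand_kids(3, kids, 2)))
```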
Btw, solution in Haskell:
grandKids generation kidsFunc val =
iterate (concatMap kidsFunc) [val] !! generation
I have just started learning Python so bear with me if my answer seems a tad amateurish. What you could do is use a list of lists to populate the values returned from the myGenerator function.
So for eg. with 2 as the starting argument your data-structure would resemble something like
resDataSet = [[2],
[4, 5],
[9, 10, 11, 12, 300 , 500]
...
]
The row index should give you the distance and you can use methods like extend to add on more data to your list.
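A minimal sketch of that list-of-lists idea follows. The generator used here (doubling and doubling plus one) is only a stand-in, since the real myGenerator from the question is not shown, and the function name build_levels is hypothetical.

```python
def example_generator(n):
    """Stand-in for the question's generator; yields numbers larger than n."""
    yield n * 2
    yield n * 2 + 1

def build_levels(gen, start, max_distance):
    """Return rows where rows[d] holds every number at distance d from start."""
    rows = [[start]]
    for _ in range(max_distance):
        next_row = []
        for n in rows[-1]:
            next_row.extend(gen(n))  # extend adds all numbers yielded for n
        rows.append(next_row)
    return rows

rows = build_levels(example_generator, 2, 2)
print(rows[2])  # the numbers at distance 2 from the starting value
```

Unlike the lazy chained version, this keeps every level in memory, which is convenient when the intermediate distances are needed too.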