How to split an integer into a list of digits? - python

Suppose I have an input integer 12345. How can I split it into a list like [1, 2, 3, 4, 5]?

Convert the number to a string so you can iterate over it, then convert each digit (character) back to an int inside a list-comprehension:
>>> [int(i) for i in str(12345)]
[1, 2, 3, 4, 5]

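Return the digits as a list of strings:
>>> list(str(12345))
['1', '2', '3', '4', '5']
Return the digits as a list of integers (in Python 3, map returns an iterator, so wrap it in list):
>>> list(map(int, str(12345)))
[1, 2, 3, 4, 5]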

I'd rather not turn an integer into a string, so here's the function I use for this:
def digitize(n, base=10):
    if n == 0:
        yield 0
    while n:
        n, d = divmod(n, base)
        yield d
Examples:
tuple(digitize(123456789)) == (9, 8, 7, 6, 5, 4, 3, 2, 1)
tuple(digitize(0b1101110, 2)) == (0, 1, 1, 1, 0, 1, 1)
tuple(digitize(0x123456789ABCDEF, 16)) == (15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1)
As you can see, this will yield digits from right to left. If you'd like the digits from left to right, you'll need to create a sequence out of it, then reverse it:
reversed(tuple(digitize(x)))
You can also use this function for base conversion as you split the integer. The following example splits a hexadecimal number into binary nibbles as tuples:
import itertools as it
tuple(it.zip_longest(*[digitize(0x123456789ABCDEF, 2)]*4, fillvalue=0)) == ((1, 1, 1, 1), (0, 1, 1, 1), (1, 0, 1, 1), (0, 0, 1, 1), (1, 1, 0, 1), (0, 1, 0, 1), (1, 0, 0, 1), (0, 0, 0, 1), (1, 1, 1, 0), (0, 1, 1, 0), (1, 0, 1, 0), (0, 0, 1, 0), (1, 1, 0, 0), (0, 1, 0, 0), (1, 0, 0, 0))
Note that this method doesn't handle decimals, but could be adapted to.
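As a minimal sketch of the left-to-right variant described above, the helper below (digits_ltr is a hypothetical name, not part of the answer) collects the digits with divmod and reverses them once at the end. It assumes the sign of a negative input should simply be dropped; that is a design choice for this sketch, not something the original function does.

```python
def digits_ltr(n, base=10):
    """Return the digits of n in the given base, most significant first."""
    n = abs(n)  # assumption: ignore the sign of negative inputs
    if n == 0:
        return [0]
    out = []
    while n:
        n, d = divmod(n, base)  # peel off the least significant digit
        out.append(d)
    out.reverse()  # digits were collected right to left
    return out

print(digits_ltr(12345))      # [1, 2, 3, 4, 5]
print(digits_ltr(0b1101, 2))  # [1, 1, 0, 1]
```

Returning a list rather than a generator makes the single reversal cheap and keeps the result indexable.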

[int(i) for i in str(number)]
or, if you do not want to use a list comprehension, or you want to use a base other than 10 (note that this returns the digits from right to left):
from __future__ import division # for compatibility of // between Python 2 and 3
def digits(number, base=10):
    assert number >= 0
    if number == 0:
        return [0]
    l = []
    while number > 0:
        l.append(number % base)
        number = number // base
    return l

While list(map(int, str(x))) is the Pythonic approach, you can formulate logic to derive digits without any type conversion:
from math import log10
def digitize(x):
    n = int(log10(x))
    for i in range(n, -1, -1):
        factor = 10**i
        k = x // factor
        yield k
        x -= k * factor
res = list(digitize(5243))
[5, 2, 4, 3]
One benefit of a generator is you can feed seamlessly to set, tuple, next, etc, without any additional logic.

Like #nd says, but using the built-in int function, which accepts a base argument for parsing digits written in another base:
>>> [ int(i,16) for i in '0123456789ABCDEF' ]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
>>> [int(i,2) for i in "100 010 110 111".split()]
[4, 2, 6, 7]

Another solution that does not involve converting to/from strings:
from math import log10
def decompose(n):
    if n == 0:
        return [0]
    b = int(log10(n)) + 1
    return [(n // (10 ** i)) % 10 for i in reversed(range(b))]
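One caveat with the log10-based versions: log10 works in floating point, so for very large integers the computed digit count may be off by one. A sketch that stays in integer arithmetic, assuming a non-negative input (decompose_int is a hypothetical name):

```python
def decompose_int(n):
    """Split a non-negative integer into decimal digits, left to right."""
    if n == 0:
        return [0]
    # Find the largest power of 10 not exceeding n, using integers only.
    power = 1
    while power * 10 <= n:
        power *= 10
    digits = []
    while power:
        digits.append(n // power % 10)
        power //= 10
    return digits

print(decompose_int(5243))  # [5, 2, 4, 3]
```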

Using join and split methods of strings:
>>> a=12345
>>> list(map(int,' '.join(str(a)).split()))
[1, 2, 3, 4, 5]
>>> [int(i) for i in ' '.join(str(a)).split()]
[1, 2, 3, 4, 5]
>>>
Here we also use map or a list comprehension to get a list.

Strings are just as iterable as arrays, so just convert it to string:
str(12345)

Simply turn it into a string, iterate over its characters, and convert each one back to an integer to build an array:
import numpy as np
nums = []
c = 12345
for ch in str(c):
    nums.append(int(ch))  # convert each character back to an int
np.array(nums)

Related

Random sampling a large Cartesian product of iterables

I have multiple iterables and I need to create the Cartesian product of those iterables and then randomly sample from the resulting pool of tuples. The problem is that the total number of combinations of these iterables is somewhere around 1e19, so I can't possibly load all of this into memory.
What I thought was using itertools.product in combination with a random number generator to skip random number of items, then once I get to the randomly selected item, I perform my calculations and continue until I run out of the generator. The plan was to do something like:
from itertools import product
from random import randint
iterables = () # tuple of 18 iterables
versions = product(iterables)
def do_stuff():
    # do stuff
STEP_SIZE = int(1e6)
# start both counts from 0.
# First value to be taken is start + step
# after that increment start to be equal to count and repeat
start = 0
count = 0
while True:
    try:
        step = randint(1, 100) * STEP_SIZE
        for v in versions:
            # if the count is less than required skip values while incrementing count
            if count < start + step:
                versions.next()
                count += 1
            else:
                do_stuff(*v)
                start = count
    except StopIteration:
        break
Unfortunately, itertools.product objects don't have the next() method, so this doesn't work. What other way is there to go through this large number of tuples and either take a random sample or directly run calculations on the values?
Don't try to generate the Cartesian product. Sample from one iterable at a time to generate your result using random.choice(). The number of elements across all iterables is small, so you can store all the elements in memory directly.
Here's an example using 18 iterables with 10 elements each (as specified in the comment):
import random
iterables = [list(range(i, i + 10)) for i in range(0, 180, 10)]
result = [random.choice(iterable) for iterable in iterables]
print(result)
Which version of Python are you using? Somewhere along the way, .next() methods were deprecated in favor of a new next() built-in function. That works fine with all iterators. Here, for example, under the currently released 3.10.1:
>>> import itertools
>>> itp = itertools.product(range(5), repeat=6)
>>> next(itp)
(0, 0, 0, 0, 0, 0)
>>> next(itp)
(0, 0, 0, 0, 0, 1)
>>> next(itp)
(0, 0, 0, 0, 0, 2)
>>> next(itp)
(0, 0, 0, 0, 0, 3)
>>> for ignore in range(50):
...     ignore = next(itp)
>>> next(itp)
(0, 0, 0, 2, 0, 4)
Beyond that, you didn't show us the most important part of your code: how you build your product.
Without seeing that, I can only guess that it would be far more efficient to make a random choice from the first sequence passed to product(), then another from the second, and so on. Build a random element of the product from one component at a time.
Picking a random product tuple efficiently
Perhaps overkill, but this class shows an especially efficient way to do this. The .index() method maps an integer i to the i'th tuple (0-based) in the product. Then picking a random tuple from the product is simply applying .index() to a random integer in range(total number of elements in the product).
from math import prod
from random import randrange
class RanProduct:
    def __init__(self, iterables):
        self.its = list(map(list, iterables))
        self.n = prod(map(len, self.its))
    def index(self, i):
        if i not in range(self.n):
            raise ValueError(f"index {i} not in range({self.n})")
        result = []
        for it in reversed(self.its):
            i, r = divmod(i, len(it))
            result.append(it[r])
        return tuple(reversed(result))
    def pickran(self):
        return self.index(randrange(self.n))
and then
>>> r = RanProduct(["abc", range(2)])
>>> for i in range(6):
...     print(i, '->', r.index(i))
0 -> ('a', 0)
1 -> ('a', 1)
2 -> ('b', 0)
3 -> ('b', 1)
4 -> ('c', 0)
5 -> ('c', 1)
>>> r = RanProduct([range(10)] * 19)
>>> r.pickran()
(3, 5, 8, 8, 3, 6, 7, 6, 8, 6, 2, 0, 5, 6, 1, 0, 0, 8, 2)
>>> r.pickran()
(4, 5, 0, 5, 7, 1, 7, 2, 7, 4, 8, 4, 2, 0, 2, 9, 3, 6, 2)
>>> r.pickran()
(8, 7, 4, 1, 3, 0, 4, 6, 4, 3, 9, 8, 5, 8, 9, 9, 7, 1, 8)
>>> r.pickran()
(8, 6, 6, 0, 6, 7, 1, 3, 9, 5, 1, 4, 5, 8, 6, 8, 4, 9, 9)
>>> r.pickran()
(4, 9, 4, 7, 1, 5, 5, 1, 6, 7, 1, 8, 9, 0, 7, 9, 1, 7, 0)
>>> r.pickran()
(3, 0, 3, 9, 8, 6, 3, 0, 3, 0, 9, 9, 3, 5, 2, 3, 7, 8, 8)
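Building on the index idea, here is a hedged sketch for drawing several distinct tuples without materializing the product. The function names are illustrative, and it assumes the sample size k is much smaller than the product size, so re-drawing repeated indices is cheap. It draws indices with randrange instead of random.sample(range(n), k), because in CPython taking len() of a range longer than sys.maxsize raises OverflowError.

```python
from math import prod
from random import randrange

def index_to_tuple(pools, i):
    """Map index i to the i-th tuple of itertools.product(*pools)."""
    result = []
    for pool in reversed(pools):
        i, r = divmod(i, len(pool))  # the last pool varies fastest
        result.append(pool[r])
    return tuple(reversed(result))

def sample_product(iterables, k):
    """Return k distinct random tuples from the Cartesian product."""
    pools = [list(it) for it in iterables]
    n = prod(len(pool) for pool in pools)
    if k > n:
        raise ValueError("sample larger than the product")
    seen = set()
    while len(seen) < k:
        seen.add(randrange(n))  # duplicates are simply re-drawn
    return [index_to_tuple(pools, i) for i in seen]

for combo in sample_product([range(10)] * 19, 3):
    print(combo)
```

Only the chosen indices and the pools themselves are held in memory, so the size of the full product does not matter.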

Fast python algorithm to find all possible partitions from a list of numbers that has subset sums equal to given ratios

Say I have a list of 20 random integers from 0 to 9. I want to divide the list into N subsets so that the ratio of the subset sums equals given values, and I want to find all possible partitions. I wrote the following code and got it to work for the N = 2 case.
import random
import itertools
#lst = [random.randrange(10) for _ in range(20)]
lst = [2, 0, 1, 7, 2, 4, 9, 7, 6, 0, 5, 4, 7, 4, 5, 0, 4, 5, 2, 3]
def partition_sum_with_ratio(numbers, ratios):
    target1 = round(int(sum(numbers) * ratios[0] / (ratios[0] + ratios[1])))
    target2 = sum(numbers) - target1
    p1 = [seq for i in range(len(numbers), 0, -1) for seq in
          itertools.combinations(numbers, i) if sum(seq) == target1
          and sum([s for s in numbers if s not in seq]) == target2]
    p2 = [tuple(n for n in numbers if n not in seq) for seq in p1]
    return list(zip(p1, p2))
partitions = partition_sum_with_ratio(lst, ratios=[4, 3])
print(partitions[0])
Output:
((2, 0, 1, 2, 4, 6, 0, 5, 4, 4, 5, 0, 4, 5, 2), (7, 9, 7, 7, 3))
If you calculate the sum of each subset, you will find the ratio is 44 : 33 = 4 : 3, which are exactly the input values. However, I want the function to work for any number of subsets. For example, I expect
partition_sum_with_ratio(lst, ratios=[4, 3, 3])
to return something like
((2, 0, 1, 2, 4, 6, 0, 5, 4, 4, 3), (5, 0, 4, 5, 2, 7), (9, 7, 7))
I have been thinking about this problem for a month and have found it extremely hard. My conclusion is that it can only be solved by recursion. I would like to know if there is any relatively fast algorithm for this. Any suggestions?
Yes, recursion is called for. The basic logic is to do a bipartition into one part and the rest and then recursively split the rest in all possible ways. I've followed your lead in assuming that everything is distinguishable, which creates a lot of possibilities, possibly too many to enumerate. Nevertheless:
import itertools
def totals_from_ratios(sum_numbers, ratios):
    sum_ratios = sum(ratios)
    totals = [(sum_numbers * ratio) // sum_ratios for ratio in ratios]
    residues = [(sum_numbers * ratio) % sum_ratios for ratio in ratios]
    for i in sorted(
        range(len(ratios)), key=lambda i: residues[i] * ratios[i], reverse=True
    )[: sum_numbers - sum(totals)]:
        totals[i] += 1
    return totals
def bipartitions(numbers, total):
    n = len(numbers)
    for k in range(n + 1):
        for combo in itertools.combinations(range(n), k):
            if sum(numbers[i] for i in combo) == total:
                set_combo = set(combo)
                yield sorted(numbers[i] for i in combo), sorted(
                    numbers[i] for i in range(n) if i not in set_combo
                )
def partitions_into_totals(numbers, totals):
    assert totals
    if len(totals) == 1:
        yield [numbers]
    else:
        for first, remaining_numbers in bipartitions(numbers, totals[0]):
            for rest in partitions_into_totals(remaining_numbers, totals[1:]):
                yield [first] + rest
def partitions_into_ratios(numbers, ratios):
    totals = totals_from_ratios(sum(numbers), ratios)
    yield from partitions_into_totals(numbers, totals)
lst = [2, 0, 1, 7, 2, 4, 9, 7, 6, 0, 5, 4, 7, 4, 5, 0, 4, 5, 2, 3]
for part in partitions_into_ratios(lst, [4, 3, 3]):
    print(part)
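To see how the target sums come about, here is a simplified sketch of the rounding step, using a largest-remainder scheme. split_total is a hypothetical name, and its tie-breaking differs from totals_from_ratios in the answer. Each part gets the floor of its exact share, and the leftover units go to the parts with the largest remainders, so the targets always add up to the total.

```python
def split_total(total, ratios):
    """Split an integer total into integer parts proportional to ratios."""
    denom = sum(ratios)
    parts = [total * r // denom for r in ratios]  # floor of each share
    remainders = [total * r % denom for r in ratios]
    leftover = total - sum(parts)
    # Hand the leftover units to the largest remainders (ties: lowest index).
    order = sorted(range(len(ratios)), key=lambda i: -remainders[i])
    for i in order[:leftover]:
        parts[i] += 1
    return parts

print(split_total(77, [4, 3]))     # [44, 33]
print(split_total(77, [4, 3, 3]))  # [31, 23, 23]
```

For the sample list the sum is 77, so the ratio 4 : 3 gives exact targets, while 4 : 3 : 3 cannot be met exactly and one unit has to be assigned by the rounding rule.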

Find unique pairs of array with Python

I'm searching for a pythonic way to do this operation faster
import numpy as np
von_knoten = np.array([0, 0, 1, 1, 1, 2, 2, 2, 3, 4])
zu_knoten = np.array([1, 2, 0, 2, 3, 0, 1, 4, 1, 2])
try:
    for i in range(0,len(von_knoten)-1):
        for j in range(0,len(von_knoten)-1):
            if (i != j) & ([von_knoten[i],zu_knoten[i]] == [zu_knoten[j],von_knoten[j]]):
                print(str(i)+".column equal " +str(j)+".column")
                von_knoten = sp.delete(von_knoten , j)
                zu_knoten = sp.delete(zu_knoten , j)
                print(von_knoten)
                print(zu_knoten)
except:
    print('end')
so I need the fastest way to get
[0 0 1 1 4]
[1 2 2 3 2]
from
[0 0 1 1 1 2 2 2 3 4]
[1 2 0 2 3 0 1 4 1 2]
Thanks ;)
Some comments about your code: as-is, it does not do what you want. It should at least print something; did you try running it? Could you show us what you get?
First, simply use range(len(von_knoten)); range starts at 0 by default and stops one step before the end, so this already covers every index.
If you delete items from the input arrays while looping and then access indices near the end, you will likely get an IndexError before the analysis of the inputs is complete.
You call sp.delete, but we do not know what sp is (and neither does the code), so this will raise a NameError.
Lastly, please do not use a bare except:. It catches exceptions you never dreamt of, and may explain why you don't understand what's wrong.
Then, what about using the built-in zip function to build sorted two-element tuples and removing the duplicates? Something like:
>>> von_knoten = [0, 0, 1, 1, 1, 2, 2, 2, 3, 4]
>>> zu_knoten = [1, 2, 0, 2, 3, 0, 1, 4, 1, 2]
>>> set(tuple(sorted([m, n])) for m, n in zip(von_knoten, zu_knoten))
{(0, 1), (0, 2), (1, 2), (1, 3), (2, 4)}
I'll let you work from this to obtain exactly what you're looking for.
You are trying to build up a collection of pairs you haven't seen before.
You can use not in but need to check this either way round:
L = []
for x,y in zip(von_knoten, zu_knoten):
    if (x, y) not in L and (y, x ) not in L:
        L.append((x, y))
This gives a list of tuples
[(0, 1), (0, 2), (1, 2), (1, 3), (2, 4)]
which you can reshape.
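The `not in L` test scans the whole list for every pair, which is quadratic in the number of pairs. Here is a sketch of the same idea using a set of order-independent keys, followed by the reshape back into two sequences. unique_undirected is a hypothetical name, and it keeps the first occurrence of each pair.

```python
def unique_undirected(von, zu):
    """Keep the first occurrence of each pair, treating (x, y) == (y, x)."""
    seen = set()
    kept = []
    for x, y in zip(von, zu):
        key = (x, y) if x <= y else (y, x)  # order-independent key
        if key not in seen:
            seen.add(key)
            kept.append((x, y))
    # Reshape the list of pairs back into two parallel lists.
    return [x for x, _ in kept], [y for _, y in kept]

von_knoten = [0, 0, 1, 1, 1, 2, 2, 2, 3, 4]
zu_knoten = [1, 2, 0, 2, 3, 0, 1, 4, 1, 2]
print(unique_undirected(von_knoten, zu_knoten))
# ([0, 0, 1, 1, 2], [1, 2, 2, 3, 4])
```

Because the first occurrence wins, the last pair comes out as (2, 4) rather than the (4, 2) shown in the question; both describe the same undirected edge.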
Here's a vectorized approach -
def unique_pairs(von_knoten, zu_knoten):
    s = np.max([von_knoten, zu_knoten])+1
    p1 = zu_knoten*s + von_knoten
    p2 = von_knoten*s + zu_knoten
    p = np.maximum(p1,p2)
    sidx = p.argsort(kind='mergesort')
    ps = p[sidx]
    m = np.concatenate(([True],ps[1:] != ps[:-1]))
    sm = sidx[m]
    return von_knoten[sm],zu_knoten[sm]
Sample run -
In [417]: von_knoten = np.array([0, 0, 1, 1, 1, 2, 2, 2, 3, 4])
...: zu_knoten = np.array([1, 2, 0, 2, 3, 0, 1, 4, 1, 2])
In [418]: unique_pairs(von_knoten, zu_knoten)
Out[418]: (array([0, 0, 1, 1, 2]), array([1, 2, 2, 3, 4]))
Using np.unique and the void view method from here
def unique_pairs(a, b):
    c = np.sort(np.stack([a, b], axis = 1), axis = 1)
    c_view = np.ascontiguousarray(c).view(np.dtype((np.void,
                                          c.dtype.itemsize * c.shape[1])))
    _, i = np.unique(c_view, return_index = True)
    return a[i], b[i]

Consecutive numbers list where each number repeats

How can I create a list of consecutive numbers where each number repeats N times, for example:
list = [0,0,0,1,1,1,2,2,2,3,3,3,4,4,4,5,5,5]
Another idea, without any need for other packages or sums:
[x//N for x in range((M+1)*N)]
Where N is your number of repeats and M is the maximum value to repeat. E.g.
N = 3
M = 5
[x//N for x in range((M+1)*N)]
yields
[0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5]
My first instinct is to get some functional help from the funcy package. If N is the number of times to repeat each value, and M is the maximum value to repeat, then you can do
import funcy as fp
fp.flatten(fp.repeat(i, N) for i in range(M + 1))
This will return a generator, so to get the array you can just call list() around it
sum([[i]*n for i in range(0,x)], [])
The following piece of code is the simplest version I can think of.
It’s a bit dirty and long, but it gets the job done.
In my opinion, it’s easier to comprehend.
def mklist(s, n):
    l = []  # An empty list that will contain the list of elements
            # and their duplicates.
    for i in range(s):  # We iterate from 0 to s
        for j in range(n):  # and appending each element (i) to l n times.
            l.append(i)
    return l  # Finally we return the list.
If you run the code …:
print(mklist(10, 2))
[0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9]
print(mklist(5, 3))
[0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4]
Another version a little neater, with list comprehension.
But uhmmm… We have to sort it though.
def mklist2(s, n):
    # list(range(s)) is needed in Python 3, where a range cannot be multiplied
    return sorted([l for l in list(range(s)) * n])
Running that version will give the following results.
print(mklist2(5, 3))
Raw : [0, 1, 2, 3, 4, 0, 1, 2, 3, 4, 0, 1, 2, 3, 4]
Sorted: [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4]
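A nested comprehension avoids the sort entirely by emitting each value n times before moving on to the next. A minimal sketch (mklist3 is a hypothetical name following the naming above):

```python
def mklist3(s, n):
    """Return the values 0 .. s-1, each repeated n times, as one flat list."""
    return [i for i in range(s) for _ in range(n)]

print(mklist3(5, 3))
# [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4]
```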

Decompress an array in Python

I need to decompress an array and I am not sure where to start.
Here is the input of the function
def main():
    # Test case for Decompress function
    B = [6, 2, 7, 1, 3, 5, 1, 9, 2, 0]
    A = Decompress(B)
    print(A)
I want this to come out
A = [2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 5, 5, 5, 9, 0, 0]
If you can't see the pattern: using Python's 0-based indexing, B[0] is how many times B[1] shows up in A, then B[2] is how many times B[3] shows up in A, and so on.
How do I write a function for this?
Compact version with zip() and itertools.chain.from_iterable:
from itertools import chain
list(chain.from_iterable([v] * c for c, v in zip(*([iter(B)]*2))))
Demo:
>>> B = [6, 2, 7, 1, 3, 5, 1, 9, 2, 0]
>>> from itertools import chain
>>> list(chain.from_iterable([v] * c for c, v in zip(*([iter(B)]*2))))
[2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 5, 5, 5, 9, 0, 0]
Breaking this down:
zip(*([iter(B)]*2)) pairs counts with values (in Python 3, zip returns an iterator, so wrap it in list() to display the pairs):
>>> list(zip(*([iter(B)]*2)))
[(6, 2), (7, 1), (3, 5), (1, 9), (2, 0)]
It is a fairly standard Python trick to get pairs out of an input iterable.
([v] * c for c, v in zip(*([iter(B)]*2))) is a generator expression that takes the counts and values and produces lists with the value repeated count times:
>>> next([v] * c for c, v in zip(*([iter(B)]*2)))
[2, 2, 2, 2, 2, 2]
chain.from_iterable takes the various lists produced by the generator expression and lets you iterate over them as if they were one long list.
list() turns it all back to a list.
def unencodeRLE(i):
    i = list(i)  # Copies the list to a new list, so the original one is not changed.
    r = []
    while i:
        count = i.pop(0)
        n = i.pop(0)
        r += [n for _ in range(count)]  # range works in both Python 2 and 3
    return r
One more one-liner:
def decompress(vl):
    return sum([vl[i] * [vl[i+1]] for i in range(0, len(vl), 2)], [])
A list comprehension extracts and unpacks pairs (range(0, len(vl), 2) iterates through the start indices of the pairs, vl[i] is the number of repetitions, vl[i+1] is what to repeat).
sum() joins the results together ([] is the initial value the unpacked lists are sequentially added to).
A slightly faster solution (with Python 2.7.3):
A=list(chain.from_iterable( [ B[i]*[B[i+1]] for i in xrange(0,len(B),2) ] ) )
>>> timeit.Timer(
setup='B=[6,2,7,1,3,5,1,9,2,0];from itertools import chain',
stmt='A=list(chain.from_iterable( [ B[i]*[B[i+1]] for i in xrange(0,len(B),2) ] ) )').timeit(100000)
0.22841787338256836
Comparing with:
>>> timeit.Timer(
setup='B=[6,2,7,1,3,5,1,9,2,0];from itertools import chain',
stmt='A=list(chain.from_iterable([v] * c for c, v in zip(*([iter(B)]*2))))').timeit(100000)
0.31104111671447754
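The timing snippets above target Python 2.7 (xrange). Here is a sketch of an equivalent for Python 3, assuming the input has even length and alternates count, value. decompress_rle is a hypothetical name.

```python
from itertools import chain, repeat

def decompress_rle(encoded):
    """Expand [count, value, count, value, ...] into a flat list."""
    if len(encoded) % 2:
        raise ValueError("expected an even number of items")
    counts = encoded[0::2]  # items at even positions
    values = encoded[1::2]  # items at odd positions
    return list(chain.from_iterable(repeat(v, c) for c, v in zip(counts, values)))

B = [6, 2, 7, 1, 3, 5, 1, 9, 2, 0]
print(decompress_rle(B))
# [2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 5, 5, 5, 9, 0, 0]
```

Slicing into counts and values makes the pairing explicit, and repeat avoids building a temporary list for each run.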
