Need to get set of numbers in cyclic order - Python - python

I am writing some python script to generate JSON. I constructed the jSON successfully. but stuck in getting selecting number in cyclic order.
let us say, i have a list of 1,2,3,4,5. I need to select first 4 numbers (1,2,3,4) here for first first item and 2,3,4,5 for second and 3,4,5,1 for third and it should go on till 30 times.
import json
import random
json_dict = {}
number = []
brokers = [1,2,3,4,5]
json_dict["version"] = version
json_dict["partitions"] = [{"topic": "topic1", "name": i,"replicas":
random.choice(brokers)} for i in range(0, 30)]
with open("output.json", "w") as outfile:
json.dump(json_dict, outfile, indent=4)
Output
"version": "1",
"partitions": [
{
"topic": "topic1",
"name": 0,
"replicas": 1,2,3,4
},
{
"topic": "topic1",
"name": 1,
"replicas": 2,3,4,5
},
{
"topic": "topic1",
"name": 3,
"replicas": 3,4,5,1
Anyway, how can i achieve this?

In order to get a cyclic elements from your brokers list you can use deque from collections module and do a deque.rotation(-1) like this example:
from collections import deque
def grouper(iterable, elements, rotations):
if elements > len(iterable):
return []
b = deque(iterable)
for _ in range(rotations):
yield list(b)[:elements]
b.rotate(-1)
brokers = [1,2,3,4,5]
# Pick 4 elements from brokers and yield 30 cycles
cycle = list(grouper(brokers, 4, 30))
print(cycle)
Output:
[[1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 5, 1], [4, 5, 1, 2], [5, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 5, 1], [4, 5, 1, 2], [5, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 5, 1], [4, 5, 1, 2], [5, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 5, 1], [4, 5, 1, 2], [5, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 5, 1], [4, 5, 1, 2], [5, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 5, 1], [4, 5,1, 2], [5, 1, 2, 3]]
Also, this is a way how to implement this solution to your final dict:
# in this example i'm using only 5 cycles
cycles = grouper(brokers, 4, 5)
partitions = [{"topic": "topic1", "name": i, "replicas": cycle_elem} for i, cycle_elem in zip(range(5), cycles)]
final_dict = {"version": "1", "partitions": partitions}
print(final_dict)
Output:
{'partitions': [{'name': 0, 'replicas': [1, 2, 3, 4], 'topic': 'topic1'}, {'name': 1, 'replicas': [2, 3, 4, 5], 'topic': 'topic1'}, {'name': 2, 'replicas': [3, 4, 5, 1], 'topic': 'topic1'}, {'name': 3, 'replicas': [4, 5, 1, 2], 'topic': 'topic1'}, {'name': 4, 'replicas': [5, 1, 2, 3], 'topic': 'topic1'}], 'version': '1'}

Here's a purely procedural solution which also adds flexibility of selecting any number of groups, of any size (even bigger than the original 'brokers' list) with any offset:
def get_subgroups(groups, base, size, offset=1):
# cover the group size > len(base) case by expanding the base
# this step is completely optional if your group size will never be bigger
base *= -(-size // len(base))
result = [] # storage for our groups
base_size = len(base) # no need to call len() all the time
current_offset = 0 # tracking current cycle offset
for i in range(groups): # use xrange() on Python 2.x instead
tail = current_offset + size # end index for our current slice
end = min(tail, base_size) # normalize to the base size
group = base[current_offset:end] + base[:tail - end] # get our slice
result.append(group) # append it to our result storage
current_offset = (current_offset + offset) % base_size # increase our current offset
return result
brokers = [1, 2, 3, 4, 5]
print(get_subgroups(5, brokers, 4)) # 5 groups of size 4, with default offset
# prints: [[1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 5, 1], [4, 5, 1, 2], [5, 1, 2, 3]]
print(get_subgroups(3, brokers, 7, 2)) # 3 groups of size 7, with offset 2
# prints: [[1, 2, 3, 4, 5, 1, 2], [3, 4, 5, 1, 2, 3, 4], [5, 1, 2, 3, 4, 5, 1]]
And it does it in a O(N) time with a single loop.
If you're planning to run this on a very large generator, you can turn get_subgroups() function into a generator by forgoing the result collection and doing yield group instead of result.append(group). That way you can call it in a loop as: for group in get_subgroups(30, broker, 4): and store the group in whatever structure you want.
UPDATE
If memory is not an issue, we can optimize (processing-wise) this even more by expanding the whole base (or brokers in your case) to fit the whole set:
def get_subgroups(groups, base, size, offset=1): # warning, heavy memory usage!
base *= -(-(offset * groups + size) // len(base))
result = [] # storage for our groups
current_offset = 0 # tracking current cycle offset
for i in range(groups): # use xrange() on Python 2.x instead
result.append(base[current_offset:current_offset+size])
current_offset += offset
return result
Or we can make it even faster with list comprehension if we don't need the ability to turn it into a generator:
def get_subgroups(groups, base, size, offset=1): # warning, heavy memory usage!
base *= -(-(offset * groups + size) // len(base))
return [base[i:i+size] for i in range(0, groups * offset, offset)]
# as previously mentioned, use xrange() on Python 2.x instead

This is a pretty cool problem, and I think I have a pretty cool solution:
items = [1, 2, 3, 4, 5]
[(items * 2)[x:x+4] for i in range(30) for x in [i % len(items)]]
which gives
[[1, 2, 3, 4],
[2, 3, 4, 5],
[3, 4, 5, 1],
[4, 5, 1, 2],
[5, 1, 2, 3],
[1, 2, 3, 4],
[2, 3, 4, 5],
[3, 4, 5, 1],
[4, 5, 1, 2],
[5, 1, 2, 3],
[1, 2, 3, 4],
[2, 3, 4, 5],
[3, 4, 5, 1],
[4, 5, 1, 2],
[5, 1, 2, 3],
[1, 2, 3, 4],
[2, 3, 4, 5],
[3, 4, 5, 1],
[4, 5, 1, 2],
[5, 1, 2, 3],
[1, 2, 3, 4],
[2, 3, 4, 5],
[3, 4, 5, 1],
[4, 5, 1, 2],
[5, 1, 2, 3],
[1, 2, 3, 4],
[2, 3, 4, 5],
[3, 4, 5, 1],
[4, 5, 1, 2],
[5, 1, 2, 3]]
What this is doing is taking your set of things and appending it to itself (items * 2 -> [1, 2, 3, 4, 5, 1, 2, 3 ,4, 5]), and then picking a starting place (x) by taking our loop iteration (i) and modulating it (probably not the right word) by the number of items we have (i in [x % len(items)]).

Related

Single vector multiple times in 2d array

This is my code:
b = [6 * [1, 3, 4, 2],
4 * [2, 1, 4, 3],
3 * [3, 4, 2, 1],
4 * [4, 2, 1, 3],
4 * [4, 3, 2, 1],
]
Which returns an array which has 6X4=24 elements in the first line 4X4=16 in the second etc...
What i want to achieve is adding the exact same line multiple times like:
1, 3, 4, 2
1, 3, 4, 2
1, 3, 4, 2
1, 3, 4, 2
1, 3, 4, 2
1, 3, 4, 2 #6 tines the first line
2, 1, 4, 3
2, 1, 4, 3
2, 1, 4, 3
2, 1, 4, 3 # 4 times the second
..........
but of course by not copying the same line again and again
Try:
b = [
*[[1, 3, 4, 2] for _ in range(6)],
*[[2, 1, 4, 3] for _ in range(4)],
*[[3, 4, 2, 1] for _ in range(3)],
*[[4, 2, 1, 3] for _ in range(4)],
*[[4, 3, 2, 1] for _ in range(4)],
]
print(b)
Prints:
[
[1, 3, 4, 2],
[1, 3, 4, 2],
[1, 3, 4, 2],
[1, 3, 4, 2],
[1, 3, 4, 2],
[1, 3, 4, 2],
[2, 1, 4, 3],
[2, 1, 4, 3],
[2, 1, 4, 3],
[2, 1, 4, 3],
[3, 4, 2, 1],
[3, 4, 2, 1],
[3, 4, 2, 1],
[4, 2, 1, 3],
[4, 2, 1, 3],
[4, 2, 1, 3],
[4, 2, 1, 3],
[4, 3, 2, 1],
[4, 3, 2, 1],
[4, 3, 2, 1],
[4, 3, 2, 1],
]
You can also put it in one line with
b = 6*[[1, 3, 4, 2]] + 4*[[2, 1, 4, 3]] + 3*[[3, 4, 2, 1]] + 4*[[4, 2, 1, 3]] + 4* [[4, 3, 2, 1]])
print(b)
Only two steps if you don't want to change how b was created.
import numpy as np
B = np.concatenate(b).ravel()
b = np.reshape(B,(21,4))
You can use a list comprehension of a list comprehension to repeat y times your x lists (from bb and aa, respectively)
The notation is a bit strange, the "center" one is the outer element, then it expands on the right.
Advantage: numpy not required, and it will work for any size entries, unlike other answers "harcoding" the rows
aa=[[1, 3, 4, 2],
[2, 1, 4, 3],
[3, 4, 2, 1],
[4, 2, 1, 3],
[4, 3, 2, 1]]
bb=[6,4,3,4,4]
xx = [x for x,y in zip(aa,bb) for _ in range(y) ]
returns:
[[1, 3, 4, 2], [1, 3, 4, 2], [1, 3, 4, 2], [1, 3, 4, 2], [1, 3, 4, 2], [1, 3, 4, 2], [2, 1, 4, 3], [2, 1, 4, 3], [2, 1, 4, 3], [2, 1, 4, 3], [3, 4, 2, 1], [3, 4, 2, 1], [3, 4, 2, 1], [4, 2, 1, 3], [4, 2, 1, 3], [4, 2, 1, 3], [4, 2, 1, 3], [4, 3, 2, 1], [4, 3, 2, 1], [4, 3, 2, 1], [4, 3, 2, 1]]
It can be done systematically by pairing the factors and the sublists and multiplying them together. Then flat the list.
Here a hacky-way
b = [[1, 3, 4, 2],
[2, 1, 4, 3],
[3, 4, 2, 1],
[4, 2, 1, 3],
[4, 3, 2, 1],
]
factors = [6, 4, 3, 4, 4]
print(sum(f*[l] for f, l in zip(factors, b), []))
The flattening part (chaining of lists) is done with sum( , []) but is not performant (and designed for that)... but still useful, in my opinion, to test some small stuffs on the fly. The list chaining have to be done with i.e. itertools.chain. Here an example
import itertools as it
...
list(it.chain.from_iterable(f*[l] for f, l in zip(factors, b)))
# or (for many iterators)
list(it.chain(*(f*[l] for f, l in zip(factors, b))))
# or with repeat
list(it.chain.from_iterable(it.starmap(it.repeat, zip(b, factors))))
# or (another) with repeat
list(it.chain.from_iterable(map(it.repeat, b, factors))
# or ...
list(it.chain.from_iterable(map(list.__mul__, ([i] for i in b) , factors)))
# or with reduce
import functools as fc
list(fc.reduce(lambda i, j: i + j, zip(b, factors)))

How to generate all possible k size vectors in a matrix (diagonals included)?

Given this matrix:
x = [[1,2,3,4,5], [1,2,3,4,5], [1,2,3,4,5], [1,2,3,4,5], [1,2,3,4,5], [1,2,3,4,5]]
What is an efficient way to return all the possible 4 by 1 vectors and 1 by 4 matrix in this matrix. As well as any 4 diagonal spaces joined in a line.
For example:
[1,1,1,1] would appear 3 times
Diagonals also need to be addressed so [1,2,3,4] would be included as a row but also a diagonal.
Splitting the problem into two steps:
Step 1 - get all horizontal, vertical and diagonal lines
Diagonals are tackled using the fact either i+j, or respectively i-j, is constant for the indexes i, j
x = [[1,2,3,4,5], [1,2,3,4,5], [1,2,3,4,5], [1,2,3,4,5], [1,2,3,4,5], [1,2,3,4,5]]
pprint.pprint(x)
# [[1, 2, 3, 4, 5],
# [1, 2, 3, 4, 5],
# [1, 2, 3, 4, 5],
# [1, 2, 3, 4, 5],
# [1, 2, 3, 4, 5],
# [1, 2, 3, 4, 5]]
all_lines = (
# Horizontal
[x[i] for i in range(len(x))] +
# Vertical
[[x[i][j] for i in range(len(x))] for j in range(len(x[0]))] +
# Diagonal k = i - j
[[x[k+j][j] for j in range(len(x[0])) if 0 <= k+j < len(x)] for k in range(-len(x[0])+1, len(x))] +
# Diagonal k = i + j
[[x[k-j][j] for j in range(len(x[0])) if 0 <= k-j < len(x)] for k in range(len(x[0])+len(x)-1)]
)
>>> pprint.pprint(all_lines)
[[1, 2, 3, 4, 5],
[1, 2, 3, 4, 5],
[1, 2, 3, 4, 5],
[1, 2, 3, 4, 5],
[1, 2, 3, 4, 5],
[1, 2, 3, 4, 5],
[1, 1, 1, 1, 1, 1],
[2, 2, 2, 2, 2, 2],
[3, 3, 3, 3, 3, 3],
[4, 4, 4, 4, 4, 4],
[5, 5, 5, 5, 5, 5],
[5],
[4, 5],
[3, 4, 5],
[2, 3, 4, 5],
[1, 2, 3, 4, 5],
[1, 2, 3, 4, 5],
[1, 2, 3, 4],
[1, 2, 3],
[1, 2],
[1],
[1],
[1, 2],
[1, 2, 3],
[1, 2, 3, 4],
[1, 2, 3, 4, 5],
[1, 2, 3, 4, 5],
[2, 3, 4, 5],
[3, 4, 5],
[4, 5],
[5]]
Step 2 - for each line select each 4-length slice
ans = [a[i:i+4] for i in range(len(a)-4+1) for a in all_lines if len(a[i:i+4]) == 4]
>>> ans = [a[i:i+4] for i in range(len(a)-4+1) for a in all_lines if len(a[i:i+4]) == 4]
>>> pprint.pprint(ans)
[[1, 2, 3, 4],
[1, 2, 3, 4],
[1, 2, 3, 4],
[1, 2, 3, 4],
[1, 2, 3, 4],
[1, 2, 3, 4],
[1, 1, 1, 1],
[2, 2, 2, 2],
[3, 3, 3, 3],
[4, 4, 4, 4],
[5, 5, 5, 5],
[2, 3, 4, 5],
[1, 2, 3, 4],
[1, 2, 3, 4],
[1, 2, 3, 4],
[1, 2, 3, 4],
[1, 2, 3, 4],
[1, 2, 3, 4],
[2, 3, 4, 5],
[2, 3, 4, 5],
[2, 3, 4, 5],
[2, 3, 4, 5],
[2, 3, 4, 5],
[2, 3, 4, 5],
[2, 3, 4, 5],
[1, 1, 1, 1],
[2, 2, 2, 2],
[3, 3, 3, 3],
[4, 4, 4, 4],
[5, 5, 5, 5],
[2, 3, 4, 5],
[2, 3, 4, 5],
[2, 3, 4, 5],
[2, 3, 4, 5],
[1, 1, 1, 1],
[2, 2, 2, 2],
[3, 3, 3, 3],
[4, 4, 4, 4],
[5, 5, 5, 5]]
Maybe not be the most efficient but it's at least a way of doing it.
There will likely be a way of using itertools combinations to simplify this dramatically.

Why do all the lists in my array change when I append a new list?

I'm coding with python 3.6 and am working on a Genetic Algorithm. When generating a new population, when I append the new values to the array all the values in the array are changed to the new value. Is there something wrong with my functions?
Code:
from fuzzywuzzy import fuzz
import numpy as np
import random
import time
def mutate(parent):
x = random.randint(0,len(parent)-1)
parent[x] = random.randint(0,9)
print(parent)
return parent
def gen(cur_gen, pop_size, fittest):
if cur_gen == 1:
population = []
for _ in range(pop_size):
add_to = []
for _ in range(6):
add_to.append(random.randint(0,9))
population.append(add_to)
return population
else:
population = []
for _ in range(pop_size):
print('\n')
population.append(mutate(fittest))
print(population)
return population
def get_fittest(population):
fitness = []
for x in population:
fitness.append(fuzz.ratio(x, [9,9,9,9,9,9]))
fittest = fitness.index(max(fitness))
fittest_fitness = fitness[fittest]
fittest = population[fittest]
return fittest, fittest_fitness
done = False
generation = 1
population = gen(generation, 10, [0,0,0,0,0,0])
print(population)
while not done:
generation += 1
time.sleep(0.5)
print('Current Generation: ',generation)
print('Fittest: ',get_fittest(population))
if get_fittest(population)[1] == 100:
done = True
population = gen(generation, 10, get_fittest(population)[0])
print('Population: ',population)
Output:
Fittest: ([7, 4, 2, 7, 8, 9], 72)
[3, 4, 2, 7, 8, 9]
[[3, 4, 2, 7, 8, 9]]
[3, 4, 2, 7, 5, 9]
[[3, 4, 2, 7, 5, 9], [3, 4, 2, 7, 5, 9]]
[3, 4, 2, 7, 4, 9]
[[3, 4, 2, 7, 4, 9], [3, 4, 2, 7, 4, 9], [3, 4, 2, 7, 4, 9]]
[3, 1, 2, 7, 4, 9]
[[3, 1, 2, 7, 4, 9], [3, 1, 2, 7, 4, 9], [3, 1, 2, 7, 4, 9], [3, 1, 2, 7, 4, 9]]
[3, 1, 2, 7, 4, 2]
[[3, 1, 2, 7, 4, 2], [3, 1, 2, 7, 4, 2], [3, 1, 2, 7, 4, 2], [3, 1, 2, 7, 4, 2], [3, 1, 2, 7, 4, 2]]
[3, 1, 2, 5, 4, 2]
[[3, 1, 2, 5, 4, 2], [3, 1, 2, 5, 4, 2], [3, 1, 2, 5, 4, 2], [3, 1, 2, 5, 4, 2], [3, 1, 2, 5, 4, 2], [3, 1, 2, 5, 4, 2]]
[3, 1, 2, 5, 4, 2]
[[3, 1, 2, 5, 4, 2], [3, 1, 2, 5, 4, 2], [3, 1, 2, 5, 4, 2], [3, 1, 2, 5, 4, 2], [3, 1, 2, 5, 4, 2], [3, 1, 2, 5, 4, 2], [3, 1, 2, 5, 4, 2]]
[3, 1, 2, 5, 4, 5]
[[3, 1, 2, 5, 4, 5], [3, 1, 2, 5, 4, 5], [3, 1, 2, 5, 4, 5], [3, 1, 2, 5, 4, 5], [3, 1, 2, 5, 4, 5], [3, 1, 2, 5, 4, 5], [3, 1, 2, 5, 4, 5], [3, 1, 2, 5, 4, 5]]
[3, 1, 2, 5, 4, 3]
[[3, 1, 2, 5, 4, 3], [3, 1, 2, 5, 4, 3], [3, 1, 2, 5, 4, 3], [3, 1, 2, 5, 4, 3], [3, 1, 2, 5, 4, 3], [3, 1, 2, 5, 4, 3], [3, 1, 2, 5, 4, 3], [3, 1, 2, 5, 4, 3], [3, 1, 2, 5, 4, 3]]
You keep passing fittest, a reference to the same list, to mutate, which modifies the list in-place, and append it to population, so changes to the list in the next iteration are reflected across all the references to fittest in the population list.
You should pass a copy of the fittest list to mutate instead:
population.append(mutate(fittest[:]))
It's right there in the name:
def mutate(parent):
x = random.randint(0,len(parent)-1)
parent[x] = random.randint(0,9)
print(parent)
return parent
mutate isn't making a new list, it's modifying the existing list in place and returning a reference to the same list it was passed. population.append(mutate(fittest)) is happily storing aliases to fittest over and over, and since fittest is never replaced, it just keeps getting mutated and new aliases to it stored.
If the goal is to store a snapshot of the list at a given stage each time, copy it, changing:
population.append(mutate(fittest))
to:
population.append(mutate(fittest)[:])
where a complete slice of the list makes a shallow copy. population.append(mutate(fittest).copy()) would also work on modern Python, as would import copy, then doing population.append(copy.copy(mutate(fittest))) or (if the contents might themselves be mutable, though they aren't in this case) population.append(copy.deepcopy(mutate(fittest))).
Note that as a rule, Python functions aren't really supposed to both mutate and return the mutated value, as it leads to confusion like this.
Pythonic code intended to mutate in place returns None (implicitly usually, by not returning at all), so the code you'd use would actually be:
def mutate(parent):
x = random.randint(0,len(parent)-1)
parent[x] = random.randint(0,9)
print(parent)
and used with:
mutate(fittest) # Mutate fittest in place
population.append(fittest[:]) # Append copy of most recently updated fittest

List of list, compare the last item with python

I need to compare the elements of a list of list. My code is for two items inside of the list of list but when I have more than two I don't know how proceed.
My inputs have the same len ever. And I need to compare d[][:1] and if it is repeated check the d[][:-1] and print the d[] with the less d[][:-1]
The print I need: d = [[1, 2, 3, 4, 4], [3, 2, 4, 2, 1]]
Code:
d = [[1, 2, 3, 4, 5],
[1, 2, 3, 4, 6],
[1, 2, 3, 4, 4],
[3, 2, 4, 2, 5],
[3, 2, 4, 2, 1]]
if d[0][:-1] == d[1][:-1]:
if d[0][-1] < d[1][-1]:
d.remove(d[1])
else:
d.remove(d[0])
>>> print d
[[1, 2, 3, 4, 5], [1, 2, 3, 4, 4], [3, 2, 4, 2, 5], [3, 2, 4, 2, 1]]
Edited:
from operator import itemgetter
from itertools import groupby
d = [['4027221', 'MX', '0.4', 3],
['4027221', 'MX', '30', 1],
['4027222', 'MX', '0.4', 3],
['4027222', 'MX', '30', 1]]
d.sort()
d = [min(g, key=lambda s: s[-2]) for _, g in groupby(d, key=lambda s: s[:-2])]
[['4027221', 'MX', '0.4', 3], ['4027222', 'MX', '0.4', 3]]
You can use itertools.groupby to group the list by all but the last item first, and then sort the sub-lists by the last item with min:
from operator import itemgetter
from itertools import groupby
d = [[1, 2, 3, 4, 5],
[1, 2, 3, 4, 6],
[1, 2, 3, 4, 4],
[3, 2, 4, 2, 5],
[3, 2, 4, 2, 1]]
print([min(g, key=itemgetter(-1)) for _, g in groupby(d, key=lambda s: s[:-1])])
This outputs:
[[1, 2, 3, 4, 4], [3, 2, 4, 2, 1]]
if I understood well what you want, this should do the trick:
d = [[1, 2, 3, 4, 5],
[1, 2, 3, 4, 6],
[1, 2, 3, 4, 4],
[3, 2, 4, 2, 5],
[3, 2, 4, 2, 1]]
mins = {}
for a_list in d:
list_key = ','.join(map(str, a_list[:-1]))
list_orderer = a_list[-1]
if list_key not in mins or mins[list_key] > list_orderer:
mins[list_key] = a_list
print(sorted(mins.values())) # [[1, 2, 3, 4, 4], [3, 2, 4, 2, 1]]
It works in Python 2 and 3, it does not require the input to be sorted and it does not require any dependency (which is not a real argument).
You can do it this way too:
d = [[1, 2, 3, 4, 5],
[1, 2, 3, 4, 6],
[1, 2, 3, 4, 4],
[3, 2, 4, 2, 5],
[3, 2, 4, 2, 1]]
sublists = list(set(tuple(i[:-1]) for i in d))
mins = [min([elem for elem in d if elem[:-1]==list(s)])for s in sublists]
print(mins)
Output:
[[3, 2, 4, 2, 1], [1, 2, 3, 4, 4]]
Scaling up blhsing's solution. For bigger data and skipping the need to sort.
import pandas as pd
cols = ['v1', 'v2', 'v3', 'v4', 'v5']
df = pd.DataFrame(d, columns=cols)
ndf = df.groupby(cols[:-1], as_index=False).min()
out = ndf.values.tolist()
print(out)
[[1, 2, 3, 4, 4], [3, 2, 4, 2, 1]]
You can use a dictionary, taking advantage of the fact that, as you iterate, only the last value will be attached to any given key. The solution does not require sorting.
d2 = {tuple(key): val for *key, val in d}
res = [list(k) + [v] for k, v in d2.items()]
print(res)
[[1, 2, 3, 4, 4],
[3, 2, 4, 2, 1]]
Note tuple conversion is required since lists are not hashable, so they cannot be used as dictionary keys.
Edit: as #JonClements suggests, you can write this more simply as:
res = list({tuple(el[:-1]): el for el in d}.values())

[Python]: generate and array of all possible combinations

I have a very straightforward combination problem. I have two arrays (a and b). Array a indicates all the values one of the three slots in array b can take on. Each slot in array b can have a value between 1 and 5. An example of which would be [1, 4, 5]. I would like to generate an array (c) with all possible combinations. I like to extend the basic example larger arrays.
Input:
a = [1, 2, 3, 4, 5]
b = [1, 2, 3]
Output:
c = [[1, 1, 1], [1, 1, 2],[1, 1, 3], [1, 1, 4], [1, 1, 5],
[1, 2, 1], [1, 2, 2],[1, 2, 3], [1, 2, 4], [1, 2, 5],
[1, 3, 1], [1, 3, 2],[1, 3, 3], [1, 3, 4], [1, 3, 5],
[1, 4, 1], [1, 4, 2],[1, 4, 3], [1, 4, 4], [1, 4, 5],
[1, 5, 1], [1, 5, 2],[1, 5, 3], [1, 5, 4], [1, 5, 5],
[2, 1, 1], [2, 1, 2],[2, 1, 3], [2, 1, 4], [2, 1, 5],
[2, 2, 1], [2, 2, 2],[2, 2, 3], [2, 2, 4], [2, 2, 5],
[2, 3, 1], [2, 3, 2],[2, 3, 3], [2, 3, 4], [2, 3, 5],
[2, 4, 1], [2, 4, 2],[2, 4, 3], [2, 4, 4], [2, 4, 5],
[2, 5, 1], [2, 5, 2],[2, 5, 3], [2, 5, 4], [2, 5, 5],
[3, 1, 1], [3, 1, 2],[3, 1, 3], [3, 1, 4], [3, 1, 5],
[3, 2, 1], [3, 2, 2],[3, 2, 3], [3, 2, 4], [3, 2, 5],
[3, 3, 1], [3, 3, 2],[3, 3, 3], [3, 3, 4], [3, 3, 5],
[3, 4, 1], [3, 4, 2],[3, 4, 3], [3, 4, 4], [3, 4, 5],
[3, 5, 1], [3, 5, 2],[3, 5, 3], [3, 5, 4], [3, 5, 5],
[4, 1, 1], [4, 1, 2],[4, 1, 3], [4, 1, 4], [4, 1, 5],
[4, 2, 1], [4, 2, 2],[4, 2, 3], [4, 2, 4], [4, 2, 5],
[4, 3, 1], [4, 3, 2],[4, 3, 3], [4, 3, 4], [4, 3, 5],
[4, 4, 1], [4, 4, 2],[4, 4, 3], [4, 4, 4], [4, 4, 5],
[5, 5, 1], [5, 5, 2],[5, 5, 3], [5, 5, 4], [5, 5, 5],
[5, 1, 1], [5, 1, 2],[5, 1, 3], [5, 1, 4], [5, 1, 5],
[5, 2, 1], [5, 2, 2],[5, 2, 3], [5, 2, 4], [5, 2, 5],
[5, 3, 1], [5, 3, 2],[5, 3, 3], [5, 3, 4], [5, 3, 5],
[5, 4, 1], [5, 4, 2],[5, 4, 3], [5, 4, 4], [5, 4, 5],
[5, 5, 1], [5, 5, 2],[5, 5, 3], [5, 5, 4], [5, 5, 5]]
Solution for the problem above:
d = []
for i in range(len(a)):
for j in range(len(a)):
for k in range(len(a)):
e = []
e.append(i+1)
e.append(j+1)
e.append(k+1)
d.append(e)
I am looking for a more generic way. One which can accommodate larger arrays (see below) without the need to use a nested for loop structure. I searched for a comparable example, but was unable to find one on stackoverflow.
Input:
a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
b = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
You are looking for itertools.product().
a = [1, 2, 3, 4, 5]
b = 3 # Actually, you just need the length of the array, values do not matter
c = itertools.product(a, repeat=b)
Note that this returns an iterator, you may need to cast it using list() but be aware this can take forever and highly consume memory if the sizes grow.
In the general case, you should of course use the itertools module, in this particular case itertools.product, as explained in the other answer.
If you want to implement the function yourself, you can use recursion to make it applicable to any array sizes. Also, you should probably make it a generator function (using yield instead of return), as the result could be rather long. You can try something like this:
def combinations(lst, num):
if num > 0:
for x in lst:
for comb in combinations(lst, num - 1):
yield [x] + comb
else:
yield []

Categories