Find smallest repeated piece of a list - python

I've got some lists of integers like:
l1 = [8,9,8,9,8,9,8]
l2 = [3,4,2,4,3]
My goal is to reduce each one to its smallest repeated piece. So:
output_l1 = [8,9]
output_l2 = [3,4,2,4]
The biggest problem is that the sequence is not always complete. So it is not always
'abcabcabc'
but may be just
'abcabcab'.

def shortest_repeating_sequence(inp):
    for i in range(1, len(inp)):
        if all(inp[j] == inp[j % i] for j in range(i, len(inp))):
            return inp[:i]
    # inp doesn't have a repeating pattern if we got this far
    return inp[:]
This code is O(n^2). The worst case is one element repeated a lot of times followed by something that breaks the pattern at the end, for example [1, 1, 1, 1, 1, 1, 1, 1, 1, 8].
You start with 1, and then iterate over the entire list checking that each inp[i] is equal to inp[i % 1]. Any number % 1 is equal to 0, so you're checking if each item in the input is equal to the first item in the input. If all items are equal to the first element then the repeating pattern is a list with just the first element so we return inp[:1].
If at some point you hit an element that isn't equal to the first element (all() stops as soon as it finds a False), you try with 2. So now you're checking if each element at an even index is equal to the first element (4 % 2 is 0) and if every odd index is equal to the second item (5 % 2 is 1). If you get all the way through this, the pattern is the first two elements so return inp[:2], otherwise try again with 3 and so on.
You could do range(1, len(inp)+1) and then the for loop would handle the case where inp doesn't contain a repeating pattern, but then you would needlessly iterate over the entire inp at the end. And you'd still have to have return [] at the end to handle inp being the empty list.
I return a copy of the list (inp[:]) instead of the list itself to have consistent behavior. If I returned the original list with return inp and someone called the function on a list that didn't have a repeating pattern (i.e. their repeating pattern is the original list) and then modified the returned pattern, it would modify their original list as well.
shortest_repeating_sequence([4, 2, 7, 4, 6]) # no pattern
[4, 2, 7, 4, 6]
shortest_repeating_sequence([2, 3, 1, 2, 3]) # pattern doesn't repeat fully
[2, 3, 1]
shortest_repeating_sequence([2, 3, 1, 2]) # pattern doesn't repeat fully
[2, 3, 1]
shortest_repeating_sequence([8, 9, 8, 9, 8, 9, 8])
[8, 9]
shortest_repeating_sequence([1, 1, 1, 1, 1])
[1]
shortest_repeating_sequence([])
[]
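The O(n^2) scan can be replaced by a linear-time one. The following is a minimal sketch, not part of the answer above, which assumes that the smallest repeated piece is the sequence's shortest period and computes it with the KMP failure function. The name shortest_period is made up for illustration.

```python
def shortest_period(seq):
    """Return the shortest prefix whose (possibly cut-off) repetition gives seq."""
    seq = list(seq)
    n = len(seq)
    if n == 0:
        return []
    # border[i] = length of the longest proper prefix of seq[:i+1]
    # that is also a suffix of it (the KMP failure function).
    border = [0] * n
    k = 0
    for i in range(1, n):
        while k > 0 and seq[i] != seq[k]:
            k = border[k - 1]
        if seq[i] == seq[k]:
            k += 1
        border[i] = k
    # The shortest period is the length minus the longest border.
    return seq[:n - border[-1]]


print(shortest_period([8, 9, 8, 9, 8, 9, 8]))
print(shortest_period([3, 4, 2, 4, 3]))
```

Each element is compared a bounded number of times on average, so the whole scan is O(n), at the cost of an extra list of length n.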

The following code is a rework of your solution that addresses some issues:
Your solution as posted doesn't handle your own 'abcabcab' example.
Your solution keeps processing even after it's found a valid result, and then filters through both the valid and non-valid results. Instead, once a valid result is found, we process and return it. Additional valid results, and non-valid results, are simply ignored.
@Boris' issue regarding returning the input if there is no repeating pattern.
CODE
def repeated_piece(target):
    target = list(target)
    length = len(target)
    for final in range(1, length):
        result = []
        while len(result) < length:
            for i in target[:final]:
                result.append(i)
        if result[:length] == target:
            return result[:final]
    return target
l1 = [8, 9, 8, 9, 8, 9, 8]
l2 = [3, 4, 2, 4, 3]
l3 = 'abcabcab'
l4 = [1, 2, 3]
print(*repeated_piece(l1), sep='')
print(*repeated_piece(l2), sep='')
print(*repeated_piece(l3), sep='')
print(*repeated_piece(l4), sep='')
OUTPUT
% python3 test.py
89
3424
abc
123
%
You can still use:
print(''.join(map(str, repeated_piece(l1))))
if you're uncomfortable with the simpler Python 3 idiom:
print(*repeated_piece(l1), sep='')

SOLUTION
target = [8,9,8,9,8,9,8]
length = len(target)
results = []
for j in range(1, length):
    # repeat the first j elements until the candidate is at least as long as target
    result = []
    while len(result) < length:
        for i in target[:j]:
            result.append(i)
    results.append(result)
# flag each candidate: 1 if its first `length` elements reproduce target, else 0
final = []
for i in range(0, len(results)):
    if results[i][:length] == target:
        final.append(1)
    else:
        final.append(0)
if 1 in final:
    solution = results[final.index(1)][:final.index(1)+1]
else:
    solution = target
print(f'result: {solution}')
# result: [8, 9]

Simple Solution:
def get_unique_items_list(some_list):
    new_list = []
    for i in range(len(some_list)):
        if not some_list[i] in new_list:
            new_list.append(some_list[i])
    return new_list
l1 = [8,9,8,9,8,9,8]
l2 = [3,4,2,4,3]
print(get_unique_items_list(l1))
print(get_unique_items_list(l2))
#### Output ####
# [8, 9]
# [3, 4, 2]

Related

How to find a greedy solution in a list with duplicates

I am trying to solve a sequencing/scheduling problem. In this work I have encountered a problem regarding finding a greedy solution for a sequence with duplicates.
My initial sequence looks like this: [2, 9, 4, 11, 11] (or can be similar, containing duplicates).
My algorithm will select an arbitrary/random number of the sequence as its first element and then further pick the elements of lowest value as the following elements.
The code I have proposed is as follows:
import random
l = [2, 9, 4, 11, 11]
N = len(l)  # assumed: N is the length of the sequence
t=l.copy()
seq = [random.randint(0,N-1)]
i = 0
while i < N:
    a = min(t)
    if seq[0] == i:
        i +=1
    else:
        seq.append(l.index(a))
        t.remove(t[t.index(a)])
        i+=1
print(seq)
One example of a solution from this code is: [3, 0, 2, 1, 3], which is not desired as I want it to be [1, 0, 2, 1, 4].
Thanks for the help!
Now the problem is clear: you're trying to return the list of indices of the integers in ascending order of their values. I have a solution for that.
def sort(lst):
    # Make 2d array containing elements and their index
    pairs = [[num,ind] for ind, num in enumerate(lst)]
    sorted_indices = []
    # Go through sorted pairs and collect indices into sorted_indices
    for pair in sorted(pairs):
        sorted_indices.append(pair[1])
    return sorted_indices
Test:
sort([2, 1, 3, 1])
Output:
[1, 3, 0, 2]
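The same index ordering can be had without building the intermediate pairs. This is a small sketch rather than the answerer's code: it sorts the positions themselves, using each position's value as the key. The name sorted_indices_of is hypothetical.

```python
def sorted_indices_of(lst):
    # Sort the positions 0..len(lst)-1 by the value stored at each position.
    # sorted() is stable, so equal values keep their original index order.
    return sorted(range(len(lst)), key=lambda i: lst[i])


print(sorted_indices_of([2, 1, 3, 1]))
print(sorted_indices_of([2, 9, 4, 11, 11]))
```

Because the sort is stable, duplicates such as the two 11s come out in their original left-to-right order, which matches what sorting the [num, ind] pairs produces.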

Checking sum of items in a list if equals target value

I am trying to make a program that finds which items in a list sum to a target value and then outputs their indexes.
E.g.
li = [2, 5, 7, 9, 3]
target = 16
output: [2, 3]
li = [2, 5, 7, 9, 3]
target = 7
output: [0, 1]
Another way, assuming you can sort the list is the following
original_l = [1,2,6,4,9,3]
my_l = [ [index, item] for item,index in zip(original_l, range(0,len(original_l)))]
my_l_sort = sorted(my_l, key=lambda x: x[1])
start_i = 0
end_i = len(my_l_sort)-1
result = []
target = 7
while start_i < end_i:
    if my_l_sort[start_i][1] + my_l_sort[end_i][1] == target:
        result.append([my_l_sort[start_i][0], my_l_sort[end_i][0]])
        break
    elif my_l_sort[start_i][1] + my_l_sort[end_i][1] < target:
        start_i+=1
    else:
        end_i-=1
if len(result) != 0:
    print(f"Match for indices {result[0]}")
else:
    print("No match")
The two entries of result[0], at indices 0 and 1, are the two positions in original_l, given as a 2-element list, whose values sum to the target.
Is this homework?
Anyway, here is the answer you are looking for:
def check_sum(l, target):
    for i in range(len(l)):
        sum_temp = 0
        for j in range(i, len(l)):
            # add first, then compare, so a run ending at the last element is also found
            sum_temp += l[j]
            if sum_temp == target:
                return [i, j]
    return None
print(check_sum([2, 5, 7, 9, 3], 16))
"""
check_sum([2, 5, 7, 9, 3], 16)
>>> [2, 3]
check_sum([2, 5, 7, 9, 3], 7)
>>> [0, 1]
check_sum([2, 5, 7, 9, 3], 99)
>>> None
"""
The code is self-explanatory and does not require extra commenting. It simply iterates over the list of integers you have as an input and tries to find a sequence of values that add up to your target.
If the input is small and you are not worried about blowing the stack, you can use recursion.
We split the solutions into those that contain a given index and those that do not, then merge them. It returns the indices of all possible solutions.
It explores O(2^n) subsets.
def solve(residual_sum, original_list, present_index):
    '''Returns list of list of indices where sum gives residual_sum'''
    if present_index == len(original_list)-1:
        # If at end of list
        if residual_sum == original_list[-1]:
            # if residual sum is equal to present element
            # then this index is part of solution
            return [[present_index]]
        if residual_sum == 0:
            # 0 sum, empty solution
            return [[]]
        # Reaching here would mean list at caller side can not
        # lead to desired sum, so there is no solution possible
        return []
    all_sols = []
    # Get all solutions which contain i
    # since i is part of solution,
    # so we only need to find for residual_sum-original_list[present_index]
    solutions_with_i = solve(residual_sum-original_list[present_index], original_list, present_index+1)
    if solutions_with_i:
        # Add solutions containing i
        all_sols.extend([[present_index] + x for x in solutions_with_i])
    # solutions that don't contain i, so use same residual sum
    solutions_without_i = solve(residual_sum, original_list, present_index+1)
    if solutions_without_i:
        all_sols.extend(solutions_without_i)
    return all_sols
print(solve(16, [2, 5, 7, 9, 3], 0))
Indices
[[0, 1, 3], [2, 3]]
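If recursion depth is a concern, the same exhaustive search can be written iteratively. This is a hedged sketch, not the answerer's code: it enumerates every non-empty set of indices with itertools.combinations and keeps those whose values sum to the target. The name subsets_with_sum is hypothetical.

```python
from itertools import combinations


def subsets_with_sum(values, target):
    """Return every non-empty list of indices whose values sum to target."""
    indices = range(len(values))
    return [
        list(combo)
        for size in range(1, len(values) + 1)
        for combo in combinations(indices, size)
        if sum(values[i] for i in combo) == target
    ]


print(subsets_with_sum([2, 5, 7, 9, 3], 16))
```

It still examines all 2^n - 1 non-empty subsets, so it is only practical for small inputs, and it lists the solutions by subset size rather than in the order the recursive version produces.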

Group Consecutive Increasing Numbers in List [duplicate]

How can I group together consecutive increasing integers in a list? For example, I have the following list of integers:
numbers = [0, 5, 8, 3, 4, 6, 1]
I would like to group elements together as follow:
[[0, 5, 8], [3, 4, 6], [1]]
While the next integer is greater than the previous one, keep adding to the same nested list; once the next integer is smaller, add the nested list to the main list and start again.
I have tried a few different ways (while loop, for loop, enumerate and range), but cannot figure out how to make it append to the same nested list as long as the next integer is larger.
result = []
while (len(numbers) - 1) != 0:
    group = []
    first = numbers.pop(0)
    second = numbers[0]
    while first < second:
        group.append(first)
        if first > second:
            result.append(group)
            break
You could use a for loop:
numbers = [0, 5, 8, 3, 4, 6, 1]
result = [[]]
last_num = numbers[0] # last number (to check if the next number is greater or equal)
for number in numbers:
    if number < last_num:
        result.append([]) # add a new consecutive list
    result[-1].append(number)
    last_num = number # set last_num to this number, so it can be used later
print(result)
NOTE: This doesn't use .pop(), so the numbers list stays intact. It also uses a single loop, so it runs in O(N) time.
If pandas is allowed, I would do this:
import pandas as pd
numbers = [0, 5, 8, 3, 4, 6, 1]
df = pd.DataFrame({'n':numbers})
[ g['n'].values.tolist() for _,g in df.groupby((df['n'].diff()<0).cumsum())]
produces
[[0, 5, 8], [3, 4, 6], [1]]
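The pandas one-liner labels each element with the number of drops seen so far and then groups by that label. A rough standard-library sketch of the same idea, assuming plain Python lists, could look like this (group_increasing is a made-up name):

```python
from itertools import accumulate, groupby


def group_increasing(numbers):
    """Split numbers into runs, starting a new run whenever a value drops."""
    if not numbers:
        return []
    # 1 where an element is smaller than its predecessor, else 0.
    drops = [0] + [int(b < a) for a, b in zip(numbers, numbers[1:])]
    # The running total of drops gives every element its group label.
    labels = accumulate(drops)
    return [
        [number for _, number in group]
        for _, group in groupby(zip(labels, numbers), key=lambda pair: pair[0])
    ]


print(group_increasing([0, 5, 8, 3, 4, 6, 1]))
```

Here accumulate plays the role of cumsum and groupby the role of DataFrame.groupby, so no third-party dependency is needed.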
You can do this:
numbers = [0, 5, 8, 3, 4, 6, 1]
result = []
while len(numbers) != 0:
    secondresult = []
    for _ in range(3):
        if numbers != []:
            toappend = numbers.pop(0)
            secondresult.append(toappend)
        else:
            continue
    result.append(secondresult)
print(result)
This uses while and for loops and appends the items to secondresult and then to result.

Grouping numbers in a list of floats in ascending order [duplicate]

Assume no consecutive integers are in the list.
I've tried using NumPy (np.diff) for the difference between each element, but haven't been able to use that to achieve the answer. Two examples of the input (first line) and expected output (second line) are below.
[6, 0, 4, 8, 7, 6]
[[6], [0, 4, 8], [7], [6]]
[1, 4, 1, 2, 4, 3, 5, 4, 0]
[[1, 4], [1, 2, 4], [3, 5], [4], [0]]
You could use itertools.zip_longest to enable iteration over sequential element pairs in your list along with enumerate to keep track of index values where the sequences are not increasing in order to append corresponding slices to your output list.
from itertools import zip_longest
nums = [1, 4, 1, 2, 4, 3, 5, 4, 0]
results = []
start = 0
for i, (a, b) in enumerate(zip_longest(nums, nums[1:])):
    if b is None or b <= a:
        results.append(nums[start:i+1])
        start = i + 1
print(results)
# [[1, 4], [1, 2, 4], [3, 5], [4], [0]]
Here's a simple way to do what you're asking without any extra libraries:
result_list = []
sublist = []
previous_number = None
for current_number in inp:
    if previous_number is None or current_number > previous_number:
        # still ascending, add to the current sublist
        sublist.append(current_number)
    else:
        # no longer ascending, add the current sublist
        result_list.append(sublist)
        # start a new sublist
        sublist = [current_number]
    previous_number = current_number
if sublist:
    # add the last sublist, if there's anything there
    result_list.append(sublist)
Just cause I feel kind, this will also work with negative numbers.
seq = [6, 0, 4, 8, 7, 6]
seq_by_incr_groups = [] # Will hold the result
incr_seq = [] # Needed to create groups of increasing values.
previous_value = 0 # Needed to assert whether or not it's an increasing value.
for curr_value in seq: # Iterate over the list
    if curr_value > previous_value: # It's an increasing value and belongs to the group of increasing values.
        incr_seq.append(curr_value)
    else: # It was lower, lets append the previous group of increasing values to the result and reset the group so that we can create a new one.
        if incr_seq: # It could be that it's empty, in the case that the first number in the input list is a negative.
            seq_by_incr_groups.append(incr_seq)
        incr_seq = []
        incr_seq.append(curr_value)
    previous_value = curr_value # Needed so that we in the next iteration can assert that the value is increasing compared to the prior one.
if incr_seq: # Check if we have to add any more increasing number groups.
    seq_by_incr_groups.append(incr_seq) # Add them.
print(seq_by_incr_groups)
The code below should help you. However, I would recommend that you use proper nomenclature and consider handling corner cases:
li1 = [6, 0, 4, 8, 7, 6]
li2 = [1, 4, 1, 2, 4, 3, 5, 4, 0]
def inc_seq(li1):
    lix = []
    li_t = []
    for i in range(len(li1)):
        #print (i)
        if i < (len(li1) - 1) and li1[i] >= li1[i + 1]:
            li_t.append(li1[i])
            lix.append(li_t)
            li_t = []
        else:
            li_t.append(li1[i])
    if li_t:
        # add the final group, which the loop never closes
        lix.append(li_t)
    print (lix)
inc_seq(li1)
inc_seq(li2)
You can write a simple script and you don't need numpy as far as I have understood your problem statement. Try the script below. I have tested it using Python 3.6.7 and Python 2.7.15+ on my Ubuntu machine.
def breakIntoList(inp):
    if not inp:
        return []
    sublist = [inp[0]]
    output = []
    for a in inp[1:]:
        if a > sublist[-1]:
            sublist.append(a)
        else:
            output.append(sublist)
            sublist = [a]
    output.append(sublist)
    return output
numbers = [1, 4, 1, 2, 4, 3, 5, 4, 0]  # renamed from `list` to avoid shadowing the built-in
print(numbers)
print(breakIntoList(numbers))
Explanation:
The script first checks whether the input list passed to it has one or more elements.
It then initialises a sublist (variable name) to hold elements in increasing order, starting with the input list's first element.
We iterate through the input list beginning from its second element (index 1). We keep checking whether the current element of the input list is greater than the last element of sublist (sublist[-1]). If yes, we append the current element to sublist. If not, the current element cannot go into sublist, so we append sublist to the output list, start a new sublist (to hold the next increasing run), and put the current element into it.
At the end, we append the remaining sublist to the output list.
Here's an alternative using dict, list comprehensions, and zip:
seq = [1, 4, 1, 2, 4, 3, 5, 4, 0]
dict_seq = {i:j for i,j in enumerate(seq)}
# Get the index where numbers start to decrease
idx = [0] # Adding a zero seems counter-intuitive now; we'll see the benefit later.
for k, v in dict_seq.items():
    if k>0:
        if dict_seq[k]<dict_seq[k-1]:
            idx.append(k)
# Using zip, slice and handling the last entry
inc_seq = [seq[i:j] for i, j in zip(idx, idx[1:])] + [seq[idx[-1]:]]
Output
print(inc_seq)
>>> [[1, 4], [1, 2, 4], [3, 5], [4], [0]]
By initiating idx = [0] and creating 2 sublists idx, idx[1:], we can zip these sublists to form [0:2], [2:5], [5:7] and [7:8] with the list comprehension.
>>> print(idx)
[0, 2, 5, 7, 8]
>>> for i, j in zip(idx, idx[1:]):
...     print('[{}:{}]'.format(i,j))
[0:2]
[2:5]
[5:7]
[7:8] # <-- need to add the last slice [8:]

Remove sublist from list

I want to do the following in Python:
A = [1, 2, 3, 4, 5, 6, 7, 7, 7]
C = A - [3, 4] # Should be [1, 2, 5, 6, 7, 7, 7]
C = A - [4, 3] # Should not be removing anything, because sequence 4, 3 is not found
So, I simply want to remove the first appearance of a sublist (as a sequence) from another list. How can I do that?
Edit: I am talking about lists, not sets. Which implies that ordering (sequence) of items matter (both in A and B), as well as duplicates.
Use sets:
C = list(set(A) - set(B))
In case you want to maintain duplicates and/or order:
filter_set = set(B)
C = [x for x in A if x not in filter_set]
If you want to remove exact sequences, here is one way:
Find the bad indices by checking to see if the sublist matches the desired sequence:
bad_ind = [list(range(i,i+len(B))) for i,x in enumerate(A) if A[i:i+len(B)] == B]
print(bad_ind)
#[[2, 3]]
Since this returns a list of lists, flatten it and turn it into a set:
bad_ind_set = set([item for sublist in bad_ind for item in sublist])
print(bad_ind_set)
#{2, 3}
Now use this set to filter your original list, by index:
C = [x for i,x in enumerate(A) if i not in bad_ind_set]
print(C)
#[1, 2, 5, 6, 7, 7, 7]
The above bad_ind_set will remove all matches of the sequence. If you only want to remove the first match, it's even simpler. You just need the first element of bad_ind (no need to flatten the list):
bad_ind_set = set(bad_ind[0])
Update: Here is a way to find and remove the first matching sub-sequence using a short circuiting for loop. This will be faster because it will break out once the first match is found.
start_ind = None
for i in range(len(A)):
    if A[i:i+len(B)] == B:
        start_ind = i
        break
C = [x for i, x in enumerate(A)
     if start_ind is None or not(start_ind <= i < (start_ind + len(B)))]
print(C)
#[1, 2, 5, 6, 7, 7, 7]
I see this question as a kind of substring search, so substring search algorithms such as KMP or BM can be applied here. If you need to support multiple patterns, there are multi-pattern algorithms such as Aho-Corasick and Wu-Manber.
Below is a Python implementation of the KMP algorithm, taken from a GitHub Gist.
PS: I am not the author. I just want to share the idea.
class KMP:
    def partial(self, pattern):
        """ Calculate partial match table: String -> [Int]"""
        ret = [0]
        for i in range(1, len(pattern)):
            j = ret[i - 1]
            while j > 0 and pattern[j] != pattern[i]:
                j = ret[j - 1]
            ret.append(j + 1 if pattern[j] == pattern[i] else j)
        return ret

    def search(self, T, P):
        """
        KMP search main algorithm: String -> String -> [Int]
        Return all the matching positions of pattern string P in T
        """
        partial, ret, j = self.partial(P), [], 0
        for i in range(len(T)):
            while j > 0 and T[i] != P[j]:
                j = partial[j - 1]
            if T[i] == P[j]: j += 1
            if j == len(P):
                ret.append(i - (j - 1))
                j = 0
        return ret
Then use it to calculate the matched positions, and finally remove the first match:
A = [1, 2, 3, 4, 5, 6, 7, 7, 7, 3, 4]
B = [3, 4]
result = KMP().search(A, B)
print(result)
#assuming at least one match is found
print(A[:result[0]] + A[result[0]+len(B):])
Output:
[2, 9]
[1, 2, 5, 6, 7, 7, 7, 3, 4]
PS: You can try other algorithms as well. @Pault's answer is good enough unless you care a lot about performance.
Here is another approach:
# Returns the starting and ending point (index) of the sublist, if it exists, otherwise 'None'.
def findSublist(subList, inList):
    subListLength = len(subList)
    # + 1 so that a match at the very end of inList is also checked
    for i in range(len(inList)-subListLength+1):
        if subList == inList[i:i+subListLength]:
            return (i, i+subListLength)
    return None
# Removes the sublist, if it exists and returns a new list, otherwise returns the old list.
def removeSublistFromList(subList, inList):
    indices = findSublist(subList, inList)
    if indices is not None:
        return inList[0:indices[0]] + inList[indices[1]:]
    else:
        return inList
A = [1, 2, 3, 4, 5, 6, 7, 7, 7]
s1 = [3,4]
B = removeSublistFromList(s1, A)
print(B)
s2 = [4,3]
C = removeSublistFromList(s2, A)
print(C)
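If the literal A - [3, 4] syntax from the question is wanted, one option is a small list subclass that overloads the minus operator. This is only a sketch under that assumption. The class name SeqList is made up for illustration and is not part of any library.

```python
class SeqList(list):
    """A list whose `-` removes the first occurrence of a contiguous sub-sequence."""

    def __sub__(self, other):
        other = list(other)
        n = len(other)
        if n == 0:
            return SeqList(self)
        # Slide a window of length n over the list, including the final position.
        for i in range(len(self) - n + 1):
            if self[i:i + n] == other:
                return SeqList(self[:i] + self[i + n:])
        # No match: return an unchanged copy.
        return SeqList(self)


A = SeqList([1, 2, 3, 4, 5, 6, 7, 7, 7])
print(A - [3, 4])
print(A - [4, 3])
```

The operator always returns a new SeqList, so A itself is never modified, and order and duplicates are respected because the comparison is done on contiguous slices.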
