Remove sublist from list - python

I want to do the following in Python:
A = [1, 2, 3, 4, 5, 6, 7, 7, 7]
C = A - [3, 4] # Should be [1, 2, 5, 6, 7, 7, 7]
C = A - [4, 3] # Should not be removing anything, because sequence 4, 3 is not found
So, I simply want to remove the first appearance of a sublist (as a sequence) from another list. How can I do that?
Edit: I am talking about lists, not sets, which means that the ordering (sequence) of items matters (in both A and B), as do duplicates.

Use sets:
C = list(set(A) - set(B))
In case you want to maintain duplicates and/or order:
filter_set = set(B)
C = [x for x in A if x not in filter_set]

If you want to remove exact sequences, here is one way:
Find the bad indices by checking to see if the sublist matches the desired sequence:
bad_ind = [range(i,i+len(B)) for i,x in enumerate(A) if A[i:i+len(B)] == B]
print(bad_ind)
#[[2, 3]]
Since this returns a list of lists, flatten it and turn it into a set:
bad_ind_set = set([item for sublist in bad_ind for item in sublist])
print(bad_ind_set)
#set([2, 3])
Now use this set to filter your original list, by index:
C = [x for i,x in enumerate(A) if i not in bad_ind_set]
print(C)
#[1, 2, 5, 6, 7, 7, 7]
The above bad_ind_set will remove all matches of the sequence. If you only want to remove the first match, it's even simpler. You just need the first element of bad_ind (no need to flatten the list). Note that bad_ind[0] raises an IndexError when there is no match at all:
bad_ind_set = set(bad_ind[0])
Update: Here is a way to find and remove the first matching sub-sequence using a short circuiting for loop. This will be faster because it will break out once the first match is found.
start_ind = None
for i in range(len(A)):
    if A[i:i+len(B)] == B:
        start_ind = i
        break
C = [x for i, x in enumerate(A)
     if start_ind is None or not(start_ind <= i < (start_ind + len(B)))]
print(C)
#[1, 2, 5, 6, 7, 7, 7]
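The short-circuiting idea above can also be wrapped in a small helper that slices around the first match. This is only a minimal sketch: the name remove_first_sublist is hypothetical, and it assumes that an empty B should leave A unchanged.

```python
def remove_first_sublist(a, b):
    """Return a copy of a with the first contiguous occurrence of b removed."""
    n = len(b)
    if n == 0:
        return list(a)  # assumption: an empty pattern removes nothing
    # +1 so that a match ending at the last element is also checked
    for i in range(len(a) - n + 1):
        if a[i:i + n] == b:
            return a[:i] + a[i + n:]  # stop at the first match
    return list(a)  # no match: return an unchanged copy


A = [1, 2, 3, 4, 5, 6, 7, 7, 7]
print(remove_first_sublist(A, [3, 4]))  # [1, 2, 5, 6, 7, 7, 7]
print(remove_first_sublist(A, [4, 3]))  # [1, 2, 3, 4, 5, 6, 7, 7, 7]
```

Returning a new list in every branch keeps the original A untouched, which matches the C = A - B style the question asks for.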

I see this question as a substring search, so substring-search algorithms such as KMP or BM (Boyer-Moore) can be applied here. If you need to support multiple patterns, there are multi-pattern algorithms such as Aho-Corasick and Wu-Manber.
Below is the KMP algorithm implemented in Python, taken from a GitHub Gist.
PS: I am not the author. I just want to share the idea.
class KMP:
    def partial(self, pattern):
        """ Calculate partial match table: String -> [Int]"""
        ret = [0]
        for i in range(1, len(pattern)):
            j = ret[i - 1]
            while j > 0 and pattern[j] != pattern[i]:
                j = ret[j - 1]
            ret.append(j + 1 if pattern[j] == pattern[i] else j)
        return ret
    def search(self, T, P):
        """
        KMP search main algorithm: String -> String -> [Int]
        Return all the matching positions of pattern string P in T
        """
        partial, ret, j = self.partial(P), [], 0
        for i in range(len(T)):
            while j > 0 and T[i] != P[j]:
                j = partial[j - 1]
            if T[i] == P[j]: j += 1
            if j == len(P):
                ret.append(i - (j - 1))
                j = 0  # restart, so only non-overlapping matches are reported
        return ret
Then use it to calculate the matched positions, and finally remove the first match:
A = [1, 2, 3, 4, 5, 6, 7, 7, 7, 3, 4]
B = [3, 4]
result = KMP().search(A, B)
print(result)
#assuming at least one match is found
print(A[:result[0]] + A[result[0]+len(B):])
Output:
[2, 9]
[1, 2, 5, 6, 7, 7, 7, 3, 4]
PS: You can try other algorithms as well. And @Pault's answer is good enough unless you care a lot about performance.

Here is another approach:
# Returns the starting and ending point (index) of the sublist if it exists, otherwise None.
def findSublist(subList, inList):
    subListLength = len(subList)
    # +1 so that a match at the very end of inList is also checked
    for i in range(len(inList) - subListLength + 1):
        if subList == inList[i:i+subListLength]:
            return (i, i+subListLength)
    return None
# Removes the sublist if it exists and returns a new list, otherwise returns the old list.
def removeSublistFromList(subList, inList):
    indices = findSublist(subList, inList)
    if indices is not None:
        return inList[0:indices[0]] + inList[indices[1]:]
    else:
        return inList
A = [1, 2, 3, 4, 5, 6, 7, 7, 7]
s1 = [3,4]
B = removeSublistFromList(s1, A)
print(B)
s2 = [4,3]
C = removeSublistFromList(s2, A)
print(C)

Related

Checking sum of items in a list if equals target value

I am trying to write a program that checks which items in a list sum to a target value and then outputs their indexes.
E.g.
li = [2, 5, 7, 9, 3]
target = 16
output: [2, 3]
li = [2, 5, 7, 9, 3]
target = 7
output: [0, 1]
Another way, assuming you can sort the list, is the following:
original_l = [1,2,6,4,9,3]
my_l = [[index, item] for index, item in enumerate(original_l)]
my_l_sort = sorted(my_l, key=lambda x: x[1])
start_i = 0
end_i = len(my_l_sort)-1
result = []
target = 7
while start_i < end_i:
    if my_l_sort[start_i][1] + my_l_sort[end_i][1] == target:
        result.append([my_l_sort[start_i][0], my_l_sort[end_i][0]])
        break
    elif my_l_sort[start_i][1] + my_l_sort[end_i][1] < target:
        start_i+=1
    else:
        end_i-=1
if len(result) != 0:
    print(f"Match for indices {result[0]}")
else:
    print("No match")
The two elements of result[0] (at indices 0 and 1) are the positions in original_l, given as a 2-element list, of the values that sum to the target.
Is this homework?
Anyway, here is the answer you are looking for:
def check_sum(l, target):
    for i in range(len(l)):
        sum_temp = 0
        for j in range(i, len(l)):
            # add first, then compare, so a run ending at the last element is found too
            sum_temp += l[j]
            if sum_temp == target:
                return [i, j]
    return None
print(check_sum([2, 5, 7, 9, 3], 16))
"""
check_sum([2, 5, 7, 9, 3], 16)
>>> [2, 3]
check_sum([2, 5, 7, 9, 3], 7)
>>> [0, 1]
check_sum([2, 5, 7, 9, 3], 99)
>>> None
"""
The code is self-explanatory and does not require extra commenting. It simply iterates over the list of integers you have as an input and tries to find a sequence of values that add up to your target.
If you are not worried about recursion depth, the following works for smaller inputs.
We split the solutions into those that contain a given index and those that do not, and then merge them. It returns the indices of all possible solutions.
There can be O(2^n) solutions.
def solve(residual_sum, original_list, present_index):
    '''Returns list of list of indices where sum gives residual_sum'''
    if present_index == len(original_list)-1:
        # If at end of list
        if residual_sum == original_list[-1]:
            # if residual sum is equal to present element
            # then this index is part of solution
            return [[present_index]]
        if residual_sum == 0:
            # 0 sum, empty solution
            return [[]]
        # Reaching here would mean list at caller side can not
        # lead to desired sum, so there is no solution possible
        return []
    all_sols = []
    # Get all solutions which contain i
    # since i is part of solution,
    # so we only need to find for residual_sum-original_list[present_index]
    solutions_with_i = solve(residual_sum-original_list[present_index], original_list, present_index+1)
    if solutions_with_i:
        # Add solutions containing i
        all_sols.extend([[present_index] + x for x in solutions_with_i])
    # solution doesn't contain i, so use same residual sum
    solutions_without_i = solve(residual_sum, original_list, present_index+1)
    if solutions_without_i:
        all_sols.extend(solutions_without_i)
    return all_sols
print(solve(16, [2, 5, 7, 9, 3], 0))
Indices
[[0, 1, 3], [2, 3]]

Find smallest repeated piece of a list

I've got some list with integers like:
l1 = [8,9,8,9,8,9,8]
l2 = [3,4,2,4,3]
My goal is to slice each list down to its smallest repeated piece. So:
output_l1 = [8,9]
output_l2 = [3,4,2,4]
The biggest problem is that the sequences are not always complete. So not
'abcabcabc'
but
'abcabcab'.
def shortest_repeating_sequence(inp):
    for i in range(1, len(inp)):
        if all(inp[j] == inp[j % i] for j in range(i, len(inp))):
            return inp[:i]
    # inp doesn't have a repeating pattern if we got this far
    return inp[:]
This code is O(n^2). The worst case is one element repeated a lot of times followed by something that breaks the pattern at the end, for example [1, 1, 1, 1, 1, 1, 1, 1, 1, 8].
You start with 1, and then iterate over the entire list checking that each inp[i] is equal to inp[i % 1]. Any number % 1 is equal to 0, so you're checking if each item in the input is equal to the first item in the input. If all items are equal to the first element then the repeating pattern is a list with just the first element so we return inp[:1].
If at some point you hit an element that isn't equal to the first element (all() stops as soon as it finds a False), you try with 2. So now you're checking if each element at an even index is equal to the first element (4 % 2 is 0) and if every odd index is equal to the second item (5 % 2 is 1). If you get all the way through this, the pattern is the first two elements so return inp[:2], otherwise try again with 3 and so on.
You could do range(1, len(inp)+1) and then the for loop will handle the case where inp doesn't contain a repeating pattern, but then you have to needlessly iterate over the entire inp at the end. And you'd still need return [] at the end to handle inp being the empty list.
I return a copy of the list (inp[:]) instead of the list to have consistent behavior. If I returned the original list with return inp and someone called that function on a list that didn't have a repeating pattern (ie their repeating pattern is the original list) and then did something with the repeating pattern, it would modify their original list as well.
shortest_repeating_sequence([4, 2, 7, 4, 6]) # no pattern
[4, 2, 7, 4, 6]
shortest_repeating_sequence([2, 3, 1, 2, 3]) # pattern doesn't repeat fully
[2, 3, 1]
shortest_repeating_sequence([2, 3, 1, 2]) # pattern doesn't repeat fully
[2, 3, 1]
shortest_repeating_sequence([8, 9, 8, 9, 8, 9, 8])
[8, 9]
shortest_repeating_sequence([1, 1, 1, 1, 1])
[1]
shortest_repeating_sequence([])
[]
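If the O(n^2) worst case matters, the shortest repeating piece can also be derived in linear time from the prefix (failure) table used by KMP: the shortest period of a sequence of length n is n minus the length of its longest border (a proper prefix that is also a suffix). This is a sketch of that alternative, not the code above; the name shortest_period is hypothetical.

```python
def shortest_period(seq):
    """Return the shortest prefix whose repetition (possibly cut off) gives seq."""
    n = len(seq)
    if n == 0:
        return seq[:0]
    # border[i] = length of the longest proper prefix of seq[:i+1]
    # that is also a suffix of seq[:i+1]
    border = [0] * n
    k = 0
    for i in range(1, n):
        while k > 0 and seq[i] != seq[k]:
            k = border[k - 1]
        if seq[i] == seq[k]:
            k += 1
        border[i] = k
    return seq[:n - border[-1]]


print(shortest_period([8, 9, 8, 9, 8, 9, 8]))  # [8, 9]
print(shortest_period([3, 4, 2, 4, 3]))        # [3, 4, 2, 4]
print(shortest_period('abcabcab'))             # abc
```

Because it only uses indexing, slicing and len, the same function works for lists and strings, and slicing always returns a new object rather than the input itself.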
The following code is a rework of your solution that addresses some issues:
Your solution as posted doesn't handle your own 'abcabcab' example.
Your solution keeps processing even after it's found a valid result, and then filters through both the valid and non-valid results. Instead, once a valid result is found, we process and return it. Additional valid results, and non-valid results, are simply ignored.
@Boris' issue regarding returning the input if there is no repeating pattern.
CODE
def repeated_piece(target):
    target = list(target)
    length = len(target)
    for final in range(1, length):
        result = []
        while len(result) < length:
            for i in target[:final]:
                result.append(i)
        if result[:length] == target:
            return result[:final]
    return target
l1 = [8, 9, 8, 9, 8, 9, 8]
l2 = [3, 4, 2, 4, 3]
l3 = 'abcabcab'
l4 = [1, 2, 3]
print(*repeated_piece(l1), sep='')
print(*repeated_piece(l2), sep='')
print(*repeated_piece(l3), sep='')
print(*repeated_piece(l4), sep='')
OUTPUT
% python3 test.py
89
3424
abc
123
%
You can still use:
print(''.join(map(str, repeated_piece(l1))))
if you're uncomfortable with the simpler Python 3 idiom:
print(*repeated_piece(l1), sep='')
SOLUTION
target = [8,9,8,9,8,9,8]
length = len(target)
results = []
for j in range(1, length):
    result = []
    while len(result) < length:
        for i in target[:j]:
            result.append(i)
    results.append(result)
final = []
for i in range(0, len(results)):
    if results[i][:length] == target:
        final.append(1)
    else:
        final.append(0)
if 1 in final:
    solution = results[final.index(1)][:final.index(1)+1]
else:
    solution = target
print('result:', solution)
# result: [8, 9]
Simple Solution:
def get_unique_items_list(some_list):
    new_list = []
    for i in range(len(some_list)):
        if not some_list[i] in new_list:
            new_list.append(some_list[i])
    return new_list
l1 = [8,9,8,9,8,9,8]
l2 = [3,4,2,4,3]
print(get_unique_items_list(l1))
print(get_unique_items_list(l2))
#### Output ####
# [8, 9]
# [3, 4, 2]

Loop from a specific point in a list of lists Python

I would like to append to a new list all elements of an existing list of lists after a specific point
m = [[1,2,3],[4,5,10],[6,2,1]]
specific point = m[0][2]
newlist = [3,4,5,10,6,2,1]
You can directly slice off the remainder of the first target list and then add on all subsequent elements, eg:
m = [[1,2,3],[4,5,10],[6,2,1]]
y, x = 0, 2
new_list = m[y][x:] + [v for el in m[y+1:] for v in el]
# [3, 4, 5, 10, 6, 2, 1]
Here's a couple of functional approaches for efficiently iterating over your data.
If sublists are evenly sized, and you know the index from where to begin extracting elements, use chain + islice:
from itertools import chain, islice
n = 3 # Sublist size.
i,j = 0,2
newlist = list(islice(chain.from_iterable(m), i*n + j, None))
If you don't know the size of your sublists in advance, you can use next to discard the first portion of your data.
V = chain.from_iterable(m)
next(v for v in V if v == m[i][j])  # consumes V up to and including the first match
newlist = list(V)
newlist.insert(0, m[i][j])  # put the matched element back at the front
This assumes there is no identical value earlier in the sequence.
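When the row and column indices are known, the value-matching step (and its assumption about duplicates) can be avoided by slicing the starting sublist and chaining the remaining ones. A minimal sketch, assuming m is a list of lists and i, j are valid indices; the helper name tail_from is hypothetical.

```python
from itertools import chain


def tail_from(m, i, j):
    """Flatten m starting at element m[i][j]; sublists may differ in length."""
    return list(chain(m[i][j:], chain.from_iterable(m[i + 1:])))


m = [[1, 2, 3], [4, 5, 10], [6, 2, 1]]
print(tail_from(m, 0, 2))  # [3, 4, 5, 10, 6, 2, 1]

uneven = [[1, 2], [3], [3, 4, 5]]
print(tail_from(uneven, 2, 0))  # [3, 4, 5]
```

The second call shows why indices are safer than values here: the value 3 appears earlier in the data, but the result still starts at the requested position.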
You can put a conditional in your iteration and only add based on that condition. Once you hit that specific index, make your condition true. Something like this:
m = [[1,2,3],[4,5,10],[6,2,1]]
specific_point = (0,2)
newlist = [3,4,5,10,6,2,1]
output = []
for i in range(len(m)):
    for j in range(len(m[i])):
        if (i,j) < specific_point:
            continue
        output.append(m[i][j])
output:
[3, 4, 5, 10, 6, 2, 1]
Why not flatten the initial list and go from there?
flat_list = [item for sublist in m for item in sublist]
This returns [1,2,3,4,5,10,6,2,1], so now you just need flat_list[2:].
Most of the answers only work for this specific shape of nested list, but it's also possible to create a solution that works with any shape of nested list.
def flatten_from(sequence, path=[]):
    start = path.pop(0) if path else 0
    for item in sequence[start:]:
        if isinstance(item, (list, tuple)):
            yield from flatten_from(item, path)
        else:
            yield item
With the example from the question
>>> list(flatten_from([[1, 2, 3], [4, 5, 10], [6, 2, 1]], [0, 2]))
[3, 4, 5, 10, 6, 2, 1]
It also works with any shape and level of nesting of the input data
m = [[1], [[2], [3, 4, 5, 6, 7]], 8, [9, [10, 11]]]
flatten_from(m, []) # 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
flatten_from(m, [2]) # 8, 9, 10, 11
flatten_from(m, [1, 1, 3]) # 6, 7, 8, 9, 10, 11
This is a bit of a bastard algorithm, though. On one hand, it uses nice functional programming concepts: recursion and yield.
On the other hand it relies on the side effect of mutating the path argument with list.pop, so it's not a pure function.
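The side effect can be avoided by passing the remaining path down explicitly instead of popping from a shared list. This is a sketch of one such variant; the name flatten_from_pure is hypothetical, and it assumes that only the first item at each level continues along the path (if that item is not a list, the rest of the path is ignored).

```python
def flatten_from_pure(sequence, path=()):
    """Yield the leaves of a nested list/tuple, starting at the position given by path."""
    start = path[0] if path else 0
    rest = path[1:]
    for offset, item in enumerate(sequence[start:]):
        if isinstance(item, (list, tuple)):
            # Only the first item at this level keeps following the path;
            # every later item is flattened from its beginning.
            yield from flatten_from_pure(item, rest if offset == 0 else ())
        else:
            yield item


m = [[1], [[2], [3, 4, 5, 6, 7]], 8, [9, [10, 11]]]
path = [1, 1, 3]
print(list(flatten_from_pure(m, path)))  # [6, 7, 8, 9, 10, 11]
print(path)                              # [1, 1, 3]
```

The caller's path list is left unchanged, so it can be reused for another call.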
The solution below works for your case, where the array is restricted to a list of lists and the size of each sublist is consistent throughout, i.e. 3 in your case.
m = [[1,2,3],[4,5,10],[6,2,1]] #input 2D array
a, b = 0, 2 #user input --> specific point a and b
flat_list_m = [item for firstlist in m for item in firstlist] #flat the 2D list
print (flat_list_m[len(m[0])*a+b:]) #print from specific position a and b, considering your sublist length is consistent throughout.
I hope this helps! :)

Checking if you can divide list into 2 with exact sum of numbers that are in them

My problem is that I have a list, for example
l =[1, 2, 3, 4, 5, 15]
and I would like to divide it into two lists: one containing a single element of the original list, which must be the sum of all the other numbers in the list, and one containing the rest. So the output for this would be ([1, 2, 3, 4, 5], [15]) if it is possible; if not, return False.
This is one way, though not necessarily optimal. It uses the, in my opinion underused, for...else... construct.
I've also reversed the range iterator. This is more efficient in the case you provided.
l = [1, 2, 3, 4, 5, 15]
def splitter(l):
    for i in reversed(range(len(l))):
        if sum(l[:i]) == sum(l[i:]):
            return [l[:i], l[i:]]
    else:
        return False
splitter(l) # [[1, 2, 3, 4, 5], [15]]
Should it be possible for the positions of the values to change in the list? If not you can try an iteration such as:
l = [1, 2, 3, 4, 5, 15]
dividable = "False"
x = 0
while dividable == "False":
    l1 = l[0:x]
    l2 = l[x:len(l)]
    if sum(l1) == sum(l2):
        dividable = "True"
    elif x == len(l):
        #not possible
        break
    else:
        x += 1
This answer should help in all cases.
No imports required and no sorting required for the data.
def split_list(l):
    dividable=False
    index=0
    for i in range(len(l)):
        if l[i]==sum(l)-l[i]:
            dividable=True
            index=i
            break
    if dividable:
        l1=[l[index]]  # wrap in a list so that both parts are lists
        l.remove(l[index])  # note: this modifies the caller's list
        return (l1,l)
    else:
        return False
It might not be the most optimised way, but it is clear and easy for beginners to understand.
split_list([1,2,3,4,5,15])
([15], [1, 2, 3, 4, 5])
Hope this helps. Thanks
what about this?
l =[1, 2, 3, 4, 5, 15]
l=sorted(l)
track=[]
for i in l:
    track.append(i)
    if sum(track) in l and len(track)==len(l[1:]):
        print(track,[sum(track)])
output:
[1, 2, 3, 4, 5], [15]
You need to do a couple of steps:
1) Sort the list from small to large. (Into a new list if you don't want to alter the original)
2) Sum all the other elements of the list (everything except the last, biggest one) and see if the sum equals that last element.
3) If false return false
4) if true:
Store the last (biggest) value in a variable and delete this from the duplicate of the original list.
Make a second list with only that last value in it.
Create another new list and add the altered duplicate list and the list made of the biggest element.
Return the last created list.
Then you're done
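The steps above can be sketched as follows. This is only an illustration under stated assumptions: the name split_largest is hypothetical, the numbers are assumed to be non-negative (so the element that equals the sum of the others is always the largest), and lists with fewer than two elements are treated as not splittable.

```python
def split_largest(values):
    """Return [rest, [largest]] if the largest value equals the sum of the rest, else False."""
    if len(values) < 2:
        return False  # assumption: nothing sensible to split
    ordered = sorted(values)  # step 1: sort into a new list, original untouched
    rest, largest = ordered[:-1], ordered[-1]
    if sum(rest) != largest:  # steps 2 and 3: compare the sum of the others
        return False
    return [rest, [largest]]  # step 4: the two resulting lists


print(split_largest([1, 2, 3, 4, 5, 15]))  # [[1, 2, 3, 4, 5], [15]]
print(split_largest([15, 1, 5, 2, 4, 3]))  # [[1, 2, 3, 4, 5], [15]]
print(split_largest([1, 2, 4]))            # False
```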
Brute force:
import itertools
x = [1, 2, 3, 4, 5, 15]
for size in range(1,len(x)):
    for sublist in itertools.combinations(x, size):
        comp = x[:]
        for n in sublist:
            comp.remove(n)
        if sum(comp) == sum(sublist):
            print(comp, sublist)
[1, 2, 3, 4, 5] (15,)
[15] (1, 2, 3, 4, 5)
This approach can handle duplicated numbers.
Using numpy:
import numpy as np
def split(l):
    c = np.cumsum(l)
    idx = np.flatnonzero(np.equal(l, c[-1] / 2.0))
    return (l[:idx[0]], l[idx[0]:]) if idx.size > 0 else False
Alternatively, if using Python >= 3.2 (where itertools.accumulate was added):
import itertools
def split(l):
    c = list(itertools.accumulate(l))
    h = c[-1] / 2.0
    if h in c:
        i = l.index(h)
        return l[:i], l[i:]
    return False
Finally, if you want to use "pure" Python (no imports):
def split(l):
    c = [sum(l[:k]) for k in range(1, len(l) + 1)]
    h = c[-1] / 2.0
    if h in c:
        i = l.index(h)
        return l[:i], l[i:]
    return False

python - Comparing two lists to see if one occurs in another consecutively

I've been trying to make a function that can take two lists of any size (say, list A and list B) and sees if list B occurs in list A, but consecutively and in the same order. If the above is true, it returns True, else it'll return False
e.g.
A:[9,0,1,2,3,4,5,6,7,8] and B:[1,2,3,4,5,6] is successful
A:[1,2,0,3,4,0,5,6,0] and B:[1,2,3,4,5,6] is unsuccessful.
A:[1,2,3,4,5,6] and B:[6,5,3,2,1,4] fails because, despite having the same numbers, they aren't in the same order
I've tried doing this using nested loops so far and am a bit confused as to where to go
Just try this:
L1 = [9,0,1,2,3,4,5,6,7,8]
L2 = [1,2,3,4,5,6]
c = 0
w = 0
for a in range(len(L2)):
    for b in range(w+1, len(L1)):
        if L2[a] == L1[b]:
            c = c+1
            w = b
            break
        else:
            c = 0
    if c == len(L2):
        print('yes')
        break
Here you check whether each element of L2 is in L1. If so, you break out of the inner loop, remember where you left off, and check whether the next element of L2 is the same as the next element of L1, and so on.
The last part checks whether this happened as many times as the length of L2. If so, you know that the statement is true.
If your arrays are not huge and you can find a way to map each element in your array to a string, you can use:
list1 = [9,0,1,2,3,4,5,6,7,8]
list2 = [1,2,3,4,5,6]
if ''.join(str(e) for e in list2) in ''.join(str(e) for e in list1):
    print('true')
It just makes two strings from the lists and then uses 'in' to find any occurrence.
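One caveat with joining elements without a separator is that element boundaries are lost, so [12] would appear to occur inside [1, 2]. A possible guard, shown here only as a sketch, is to join with a separator and also wrap both ends. The name contains_run and the sep parameter are hypothetical, and the sketch assumes a non-empty needle and that no element's string form contains the separator.

```python
def contains_run(haystack, needle, sep=","):
    """String-based check that needle occurs consecutively in haystack."""
    def wrap(items):
        # Leading and trailing separators mark where each element starts and ends
        return sep + sep.join(str(x) for x in items) + sep
    return wrap(needle) in wrap(haystack)


print(contains_run([9, 0, 1, 2, 3, 4, 5, 6, 7, 8], [1, 2, 3, 4, 5, 6]))  # True
print(contains_run([1, 2, 0, 3, 4, 0, 5, 6, 0], [1, 2, 3, 4, 5, 6]))     # False
print(contains_run([1, 2], [12]))                                       # False
print("".join(str(e) for e in [12]) in "".join(str(e) for e in [1, 2]))  # True (false positive)
```

The last line reproduces the unguarded comparison to show the false positive that the separators prevent.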
Use any function
any(A[i:i+len(B)] == B for i in range(len(A) - len(B) + 1))
I converted the entire list into a string and then searched for a substring of that string.
When the list is converted to a string, it becomes
str(a)='[9, 0, 1, 2, 3, 4, 5, 6, 7, 8]'
which, when we strip the brackets, becomes
str(a).strip('[]')='9, 0, 1, 2, 3, 4, 5, 6, 7, 8'
Now the problem is reduced to checking whether one string is a substring of the other,
so we can use the in operator to check for the substring.
The solution
a=[9,0,1,2,3,4,5,6,7,8]
b=[1,2,3,4,5,6]
print(str(b).strip('[]') in str(a).strip(']['))
Try this:
L1 = [9,2,1,2,0,4,5,6,7,8]
L2 = [1,2,3,4,5,6]
def sameorder(L1,L2):
    for i in range(len(L1)-len(L2)+1):
        if L1[i:len(L2)+i]==L2:
            return True
    return False
You can create sublists of a that can be analyzed:
def is_consecutive(a, b):
    # +1 so that a match at the very end of a is also checked
    return any(all(c == d for c, d in zip(b, i)) for i in [a[e:e+len(b)] for e in range(len(a)-len(b)+1)])
cases = [[[9, 0, 1, 2, 3, 4, 5, 6, 7, 8], [1, 2, 3, 4, 5, 6]], [[1, 2, 0, 3, 4, 0, 5, 6, 0], [1, 2, 3, 4, 5, 6]], [[1, 2, 3, 4, 5, 6], [6, 5, 3, 2, 1, 4]]]
final_cases = {"case_{}".format(i):is_consecutive(*a) for i, a in enumerate(cases, start=1)}
Output:
{'case_3': False, 'case_2': False, 'case_1': True}
