The Partial Digest Problem is one of the ways to recover the positions of the cuts in a DNA strand. Given the multiset of all pairwise distances between cut positions, for example [2,2,3,3,4,5,6,7,8,10], I have to find the actual cut positions. In this example the total length of the DNA is 10, and the actual cut positions are [0,3,6,8,10].
I'm trying to implement the algorithm above in Python, and even after tracing it by hand I can't tell what I've done wrong.
The desired output for this code is
[0,3,6,8,10]
but I'm only getting
"None"
Can anyone please tell me which part of my code is wrong?
# function to remove multiple elements given as list
def delete(elements,A):
    for el in elements:
        A.remove(el)
    return A
# y is given as integer, X as list
def delta(y,X):
    n = len(X)
    for i in range(n):
        X[i] -= y
        X[i] = abs(X[i])
    return sorted(X)
# If former contains latter, return true. Else, return false
def contains(small, big):
    for i in range(len(big)-len(small)+1):
        for j in range(len(small)):
            if big[i+j] != small[j]:
                break
        else:
            return True
    return False
def partialDigest(L):
    global width
    width = (max(L))
    delete([width], L) # Needs to be in list to feed to 'delete' function
    X = [0, width]
    X = place(L,X)
    return X
def place(L,X):
    if len(L) == 0: # Baseline condition
        return X
    y = max(L)
    if contains(delta(y,X),L): # If former is the subset of L
        delete(delta(y,X), L) # Remove lengths from L
        X += list(y) # assert that this y is one of the fixed points, X
        X = sorted(X) # To maintain order
        print(X)
        place(L,X) # Recursive call of the function to redo the upper part
        # If none of the if statements match the condition, continue
        X.remove(y) # If the code reaches down here, it means the assumption that
        # y is one of the points is wrong. Thus undo
        L += delta(y,X) # undo L
        L = sorted(L) # To maintain order
    # Do the same thing except this time it's (width-y)
    elif contains(delta(width-y,X),L):
        delete(delta(width-y,X), L)
        X += list(width - y)
        X = sorted(X)
        place(L,X)
        X.remove(width-y)
        L += delta(y,X)
        L = sorted(L)
L = [2,2,3,3,4,5,6,7,8,10]
X = partialDigest(L)
print(X)
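For comparison, here is a minimal sketch of the backtracking idea. It is not the asker's code: it assumes the distances are integers and keeps them in a collections.Counter so the multiset test does not depend on ordering, and the names partial_digest and place are only illustrative. Three things in the code above appear to lead to the None result: delta changes X in place, contains only matches a contiguous run rather than a sub-multiset, and the results of the recursive place calls are never returned.

```python
from collections import Counter

def partial_digest(distances):
    """Return one sorted list of cut positions, or None if none fits."""
    remaining = Counter(distances)      # multiset of unexplained distances
    width = max(remaining)
    remaining[width] -= 1
    points = [0, width]

    def place():
        live = [d for d, c in remaining.items() if c > 0]
        if not live:                    # every distance is explained
            return sorted(points)
        y = max(live)
        # The largest remaining distance is measured from 0 or from width.
        for candidate in (y, width - y):
            needed = Counter(abs(candidate - p) for p in points)
            if all(remaining[d] >= c for d, c in needed.items()):
                remaining.subtract(needed)
                points.append(candidate)
                result = place()
                if result is not None:
                    return result
                points.pop()            # undo and try the other side
                remaining.update(needed)
        return None

    return place()

print(partial_digest([2, 2, 3, 3, 4, 5, 6, 7, 8, 10]))  # [0, 3, 6, 8, 10]
```

The key difference from the original is that every recursive call returns its result, and every failed placement is undone before the next candidate is tried.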
Related
Suppose we have an array: x = [10,0,30,40]. I would like to extract the first non-zero element and store it in a different variable, say y. In this example, y = 10. There can also be leading zeros, as in x = [0,0,30,40], which should give y = 30 as the extracted value.
I tried a Python snippet like this:
i = 0
while x[i] != 0:
    y = arr[i]
    if x[i] == 0:
        break
This only works if the array is [10,0,30,40]. It does not work if I have [0,20,30,40]; the loop stops before reaching the value. What is an efficient way to implement this? I would rather not use any special NumPy functions, just generic loops, because I might need to port it to other languages.
You can do this
x = [10,0,30,40]
for var in x:
    if var != 0:
        y = var
        break
You could use list comprehension to get all the non-zero values, then provided there are some non-zero values extract the first one.
x = [10,0,30,40]
lst = [v for v in x if v != 0]
if lst:
    y = lst[0]
    print(y)
The problem with your code is that you never increment i, so it stays stuck on the first element. The loop condition is also inverted: it should keep going while the element is zero. To keep the code portable you could do:
i = 0
while x[i] == 0:
    i += 1
y = x[i]
This code is still not clean, because if there is no non-zero element in your array you will get an IndexError as soon as you reach the end.
I'm kind of new to this, but I think this should work:
x = [0, 12, 24, 32, 0, 11]
y = []
for num in x:
    if num != 0:
        y.append(num)
        break
print(y)
You can use the next function:
x = [10,0,30,40]
next(n for n in x if n) # 10
x = [0,0,30,40]
next(n for n in x if n) # 30
If you need to support a list with no non-zero element at all, you can use the second parameter of the next() function to supply a default:
x = [0,0,0,0]
next((n for n in x if n),0) # 0
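Since portability to other languages was a concern, the same behaviour can be sketched as a plain loop wrapped in a function. The name first_nonzero and its default parameter are assumptions made for this illustration, not something from the answers above.

```python
def first_nonzero(values, default=None):
    """Return the first non-zero element of values, or default if none exists."""
    for value in values:
        if value != 0:
            return value
    return default

print(first_nonzero([10, 0, 30, 40]))   # 10
print(first_nonzero([0, 0, 30, 40]))    # 30
print(first_nonzero([0, 0, 0, 0], 0))   # 0
```

Returning from inside the loop avoids both the index bookkeeping and the IndexError when every element is zero.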
I want to test if a list contains consecutive integers and no repetition of numbers.
For example, if I have
l = [1, 3, 5, 2, 4, 6]
It should return True.
How should I check if the list contains up to n consecutive numbers without modifying the original list?
I thought about copying the list and removing each number that appears in the original list and if the list is empty then it will return True.
Is there a better way to do this?
For the whole list, it should just be as simple as
sorted(l) == list(range(min(l), max(l)+1))
This preserves the original list, but making a copy (and then sorting) may be expensive if your list is particularly long.
Note that in Python 2 you could simply use the line below, because range returned a list object. In 3.x and higher the function returns a range object instead, so an explicit conversion to list is needed before comparing to sorted(l).
sorted(l) == range(min(l), max(l)+1)
To check if n entries are consecutive and non-repeating, it gets a little more complicated:
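def check(n, l):
    # every contiguous window of length n
    subs = [l[i:i+n] for i in range(len(l) - n + 1)]
    # a window qualifies if, once sorted, it equals the full run from its min to its max
    return any(sorted(sub) == list(range(min(sub), max(sub)+1)) for sub in subs)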
The first code removes duplicates but keeps order:
from itertools import groupby, count
l = [1,2,4,5,2,1,5,6,5,3,5,5]
def remove_duplicates(values):
    output = []
    seen = set()
    for value in values:
        if value not in seen:
            output.append(value)
            seen.add(value)
    return output
l = remove_duplicates(l) # output = [1, 2, 4, 5, 6, 3]
The next step is to identify which ones are in order, taken from here:
def as_range(iterable):
    l = list(iterable)
    if len(l) > 1:
        return '{0}-{1}'.format(l[0], l[-1])
    else:
        return '{0}'.format(l[0])
l = ','.join(as_range(g) for _, g in groupby(l, key=lambda n, c=count(): n-next(c)))
l outputs as: 1-2,4-6,3
You can customize the functions depending on your output.
We can use a well-known formula to check for consecutiveness, assuming the smallest number is always 1 and the list contains no duplicates:
sum of the consecutive numbers 1...n = n * (n+1) / 2
def check_is_consecutive(l):
    maximum = max(l)
    # integer division keeps the comparison exact for large values
    return sum(l) == maximum * (maximum + 1) // 2
Once you verify that the list has no duplicates, just compute the sum of the integers between min(l) and max(l):
def check(l):
    total = 0
    minimum = float('+inf')
    maximum = float('-inf')
    seen = set()
    for n in l:
        if n in seen:
            return False
        seen.add(n)
        if n < minimum:
            minimum = n
        if n > maximum:
            maximum = n
        total += n
    if 2 * total != maximum * (maximum + 1) - minimum * (minimum - 1):
        return False
    return True
import numpy as np
import pandas as pd
(sum(np.diff(sorted(l)) == 1) >= n) & (all(pd.Series(l).value_counts() == 1))
We test both conditions, first by finding the iterative difference of the sorted list np.diff(sorted(l)) we can test if there are n consecutive integers. Lastly, we test if the value_counts() are all 1, indicating no repeats.
I split your query into two parts. Part a, "no repetition of numbers", is the first check: if len(l) != len(set(l)):
Part b splits the list into shorter windows of length n and checks whether any of them is already in ascending order.
def example (l, n):
    if len(l) != len(set(l)): # part a
        return False
    for i in range(0, len(l)-n+1): # part b
        if l[i:i+n] == sorted(l[i:i+n]):
            return True
    return False
l = [1, 3, 5, 2, 4, 6]
print(example(l, 3))
def solution(A):
    counter = [0]*len(A)
    limit = len(A)
    for element in A:
        if not 1 <= element <= limit:
            return False
        else:
            if counter[element-1] != 0:
                return False
            else:
                counter[element-1] = 1
    return True
The input to this function is your list. It returns False if a number is repeated or falls outside 1..len(A).
The below code works even if the list does not start with 1.
def check_is_consecutive(l):
    """
    sorts the list and
    checks if the elements in the list are consecutive
    This function does not handle any exceptions.
    returns true if the list contains consecutive numbers, else False
    """
    l = list(filter(None,l))
    l = sorted(l)
    if len(l) > 1:
        maximum = l[-1]
        minimum = l[0] - 1
        if minimum == 0:
            if sum(l) == (maximum * (maximum+1) /2):
                return True
            else:
                return False
        else:
            if sum(l) == (maximum * (maximum+1) /2) - (minimum * (minimum+1) /2) :
                return True
            else:
                return False
    else:
        return True
1. l.sort()
2. print(all(l[i+1]-l[i]==1 for i in range(0,len(l)-1)))
The list must be sorted first!
lst = [9,10,11,12,13,14,15,16]
final = all(x + 1 == y for x, y in zip(lst, lst[1:]))  # compare each element with its successor
I don't know how efficient this is, but it should do the trick.
With sorting
In Python 3, I use this simple solution:
def check(lst):
    lst = sorted(lst)
    if lst:
        return lst == list(range(lst[0], lst[-1] + 1))
    else:
        return True
Note that, after sorting the list, its minimum and maximum come for free as the first (lst[0]) and the last (lst[-1]) elements.
I'm returning True when the argument is empty, but this decision is arbitrary. Choose whatever best fits your use case.
In this solution, we first sort the argument and then compare it with another list that we know is consecutive and has no repetitions.
Without sorting
In one of the answers, the OP commented asking if it would be possible to do the same without sorting the list. This is interesting, and this is my solution:
def check(lst):
    if lst:
        r = range(min(lst), max(lst) + 1) # *r* is our reference
        return (
            len(lst) == len(r)
            and all(map(lst.__contains__, r))
            # alternative: all(x in lst for x in r)
            # test if every element of the reference *r* is in *lst*
        )
    else:
        return True
In this solution, we build a reference range r that is a consecutive (and thus non-repeating) sequence of ints. With this, our test is simple: first we check that lst has the correct number of elements (not more, which would indicate repetitions, nor fewer, which would indicate gaps) by comparing it with the reference. Then we check that every element in our reference is also in lst (this is what all(map(lst.__contains__, r)) is doing: it iterates over r and tests if all of its elements are in lst).
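The membership test on a list is linear, so the check above can become quadratic on long inputs. A possible variant, sketched here under the assumption that the elements are hashable integers, swaps the list lookups for a set and relies on a counting argument: distinct values whose count equals max - min + 1 must fill the whole run. The name is_consecutive is made up for this example.

```python
def is_consecutive(values):
    """True if values are distinct integers forming an unbroken run."""
    if not values:
        return True  # arbitrary choice for an empty list
    unique = set(values)
    # No duplicates, and exactly as many values as the span min..max holds.
    return len(unique) == len(values) == max(unique) - min(unique) + 1

print(is_consecutive([1, 3, 5, 2, 4, 6]))  # True
print(is_consecutive([1, 2, 2, 3]))        # False
print(is_consecutive([1, 2, 4]))           # False
```

This neither sorts nor modifies the original list.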
l = [1, 3, 5, 2, 4, 6]
from itertools import chain
def check_if_consecutive_and_no_duplicates(my_list=None):
    return all(
        list(
            chain.from_iterable(
                [
                    [a + 1 in sorted(my_list) for a in sorted(my_list)[:-1]],
                    [sorted(my_list)[-2] + 1 in my_list],
                    [len(my_list) == len(set(my_list))],
                ]
            )
        )
    )
Add 1 to any number in the list except for the last number(6) and check if the result is in the list. For the last number (6) which is the greatest one, pick the number before it(5) and add 1 and check if the result(6) is in the list.
Here is a really short, easy solution that doesn't need any imports:
r = list(range(10))
L = [1,3,5,2,4,6]
L = sorted(L)
r[L[0]:len(L)+L[0]] == L
>>True
This works for lists of non-negative integers that fit inside the reference range (here 0 to 9; widen it as needed) and detects duplicates.
Basically, you are creating a range your list could potentially be in, slicing that range to match your list's criteria (length, starting value) and making a snapshot comparison. I came up with this for a card game I am coding where I need to detect straights/runs in a hand and it seems to work pretty well.
I could really use some help understanding where I am going wrong with 2D list comprehensions. I have spent hours on this, and the finer points of why it's not working continue to elude me.
The following code is a very basic Lights Out game that takes an input such as
runGenerations2d([0,1,1,0],[1,0,1,0],[1,0,1,0])
It sets up an N x N game board, and each click needs to change the value of the clicked box.
I believe the problem is that setNewElement is taking x,y data and the rest of my functions haven't a clue what to do with the values passed on.
import time # provides time.sleep(0.5)
from csplot import choice
from random import * # provides choice( [0,1] ), etc.
import sys # larger recursive stack
sys.setrecursionlimit(100000) # 100,000 deep
def runGenerations2d(L , x = 0,y=0):
    show(L)
    print( L ) # display the list, L
    time.sleep(.1) # pause a bit
    newL = evolve2d( L ) # evolve L into newL
    print(newL)
    if min(L) == 1:
        #I like read outs to be explained so I added an extra print command.
        if x<=1: # Takes into account the possibility of a 1 click completition.
            print ('BaseCase Reached!... it took %i click to complete' % (x))
            print (x)
            done()#removes the need to input done() into the shell
        else:
            print ('BaseCase Reached!... it took %i clicks to complete' % (x))
            print (x)
            done()#removes the need to input done() into the shell
        return
    x = x+1 # add 1 to x before every recusion
    runGenerations2d( newL , x,y ) # recurse
def evolve2d( L ):
    N = len(L) # N now holds the size of the list L
    x,y = sqinput2() # Get 2D mouse input from the user
    print(x,y) #confirm the location clicked
    return [ setNewElement2d( L, i,x,y ) for i in range(N) ]
def setNewElement2d( L, i, x=0,y=0 ):
    if i == (x,y): # if it's the user's chosen column,
        if L[i]==1: # if the cell is already one
            return L[i]-1 # make it 0
        else: # else the cell must be 0
            return L[i]+1 # so make it 1
The error after a click
[None, None, None, None]
[None, None, None, None]
The data does not seem 2d.
Try using sqinput instead.
setNewElement2d returns a single number but the calling code is expecting two numbers.
This line
return [ setNewElement2d( L, i,x,y ) for i in range(N) ]
Is setting i to 0, then 1, then 2, ... then N-1. These are single numbers.
You then compare the single numbers to two numbers on this line:
if i == (x,y):
You seem to be assuming i is an x,y pair but it isn't.
Here's how to create every x-y pair for a 3x3 grid:
# Makes (0,0),(0,1)...(2,2)
[(x,y) for x in range(3) for y in range(3)]
I think this code is closer to what you want, still might need changing:
def evolve2d( L ):
    N = len(L)
    x,y = sqinput2()
    print(x,y)
    return [setNewElement2d(L, xx, yy, x, y) for xx in range(N) for yy in range(N)]
def setNewElement2d( L, xx, yy, x=0,y=0 ):
    if (xx,yy) == (x,y): # if it's the user's chosen row and column
        # If it's already 1 return 0 else return 1
        return 0 if L[xx][yy]==1 else 1
    return L[xx][yy] # any other cell keeps its value instead of becoming None
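To see the row and column pairing in isolation, here is a small self-contained sketch that leaves out the csplot helpers (show, sqinput2, done) entirely. The function name toggle_cell is hypothetical, and it assumes the board is a list of rows that may be rectangular rather than square.

```python
def toggle_cell(board, row, col):
    """Return a new board with the cell at (row, col) flipped between 0 and 1."""
    return [
        [1 - cell if (r, c) == (row, col) else cell
         for c, cell in enumerate(line)]
        for r, line in enumerate(board)
    ]

board = [[0, 1, 1, 0], [1, 0, 1, 0], [1, 0, 1, 0]]
print(toggle_cell(board, 0, 0))  # [[1, 1, 1, 0], [1, 0, 1, 0], [1, 0, 1, 0]]
```

The nested comprehension keeps the 2D shape of the board, whereas a single comprehension with two for clauses flattens it into one list.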
After analyzing the fastest subset sum algorithm which runs in 2^(n/2) time, I noticed a slight optimization that can be done. I'm not sure if it really counts as an optimization and if it does, I'm wondering if it can be improved by recursion.
Basically from the original algorithm: http://en.wikipedia.org/wiki/Subset_sum_problem (see part with title Exponential time algorithm)
it takes the list and splits it into two
then it generates the sorted power sets of both in 2^(n/2) time
then it does a linear search in both lists to see if 1 value in both lists sum to x using a clever trick
In my version with the optimization
it takes the list and removes the last element, called last
then it splits the list in two
then it generates the sorted power sets of both in 2^((n-1)/2) time
then it does a linear search in both lists to see if 1 value in both lists sum to x or x-last (at same time with same running time) using a clever trick
If it finds either, then I know it worked. I timed both versions with Python's time functions on lists of size 22, and my version appears to be about twice as fast.
After running the below code, it shows
0.050999879837 <- the original algorithm
0.0250000953674 <- my algorithm
My logic for the recursion part is: if it works for a size n list in 2^((n-1)/2) time, can we not repeat this again and again?
Does any of this make sense, or am I totally wrong?
Thanks
I created this python code:
from math import log, ceil, floor
import helper # my own code
from random import randint, uniform
import time
# gets a list of unique random floats
# s = how many random numbers
# l = smallest float can be
# h = biggest float can be
def getRandomList(s, l, h):
    lst = []
    while len(lst) != s:
        r = uniform(l,h)
        if not r in lst:
            lst.append(r)
    return lst
# This just generates the two powerset sorted lists that the 2^(n/2) algorithm makes.
# This is just a lazy way of doing it, this running time is way worse, but since
# this can be done in 2^(n/2) time, I just pretend its that running time lol
def getSortedPowerSets(lst):
    n = len(lst)
    l1 = lst[:n/2]
    l2 = lst[n/2:]
    xs = range(2**(n/2))
    ys1 = helper.getNums(l1, xs)
    ys2 = helper.getNums(l2, xs)
    return ys1, ys2
# this just checks using the regular 2^(n/2) algorithm to see if two values
# sum to the specified value
def checkListRegular(lst, x):
    lst1, lst2 = getSortedPowerSets(lst)
    left = 0
    right = len(lst2)-1
    while left < len(lst1) and right >= 0:
        sum = lst1[left] + lst2[right]
        if sum < x:
            left += 1
        elif sum > x:
            right -= 1
        else:
            return True
    return False
# this is my improved version of the above version
def checkListSmaller(lst, x):
    last = lst.pop()
    x1, x2 = x, x - last
    return checkhelper(lst, x1, x2)
# this is the same as the function 'checkListRegular', but it checks 2 values
# at the same time
def checkhelper(lst, x1, x2):
    lst1, lst2 = getSortedPowerSets(lst)
    left = [0,0]
    right = [len(lst2)-1, len(lst2)-1]
    while 1:
        check = 0
        if left[0] < len(lst1) and right[0] >= 0:
            check += 1
            sum = lst1[left[0]] + lst2[right[0]]
            if sum < x1:
                left[0] += 1
            elif sum > x1:
                right[0] -= 1
            else:
                return True
        if left[1] < len(lst1) and right[1] >= 0:
            check += 1
            sum = lst1[left[1]] + lst2[right[1]]
            if sum < x2:
                left[1] += 1
            elif sum > x2:
                right[1] -= 1
            else:
                return True
        if check == 0:
            return False
n = 22
lst = getRandomList(n, 1, 3000)
startTime = time.time()
print checkListRegular(lst, -50) # -50 so it does worst case scenario
startTime2 = time.time()
print checkListSmaller(lst, -50) # -50 so it does worst case scenario
startTime3 = time.time()
print (startTime2 - startTime)
print (startTime3 - startTime2)
This is the helper library which I just use to generate the powerset list.
def dec_to_bin(x):
    return int(bin(x)[2:])
def getNums(lst, xs):
    sums = []
    n = len(lst)
    for i in xs:
        bin = str(dec_to_bin(i))
        bin = (n-len(bin))*"0" + bin
        chosen_items = getList(bin, lst)
        sums.append(sum(chosen_items))
    sums.sort()
    return sums
def getList(binary, lst):
    s = []
    for i in range(len(binary)):
        if binary[i]=="1":
            s.append(float(lst[i]))
    return s
then it generates the sorted power sets of both in 2^((n-1)/2) time
OK, since the list now has one less element. However, this is not a big deal; it's just a constant-factor improvement of 2^(1/2)...
then it does a linear search in both lists to see if 1 value in both lists sum to x or x-last (at same time with same running time) using a clever trick
... and this improvement will go away because now you do twice as many operations to check for both x and x-last sums instead of only for x
can we not repeat this again and again?
No, you can't, for the same reason you couldn't split the original algorithm again and again. The trick only works once, because once you start looking for values in more than two lists you can't use the sorting trick anymore.
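For reference, the two-list search that both versions rely on can be sketched in Python 3 as follows. This is an illustration rather than the asker's code: it uses integers to avoid floating-point equality issues, the names subset_sums and has_subset_sum are invented here, and sorting each half makes the real cost closer to n * 2^(n/2) than to 2^(n/2).

```python
def subset_sums(items):
    """Sorted sums of every subset of items, including the empty subset."""
    sums = [0]
    for value in items:
        # Each existing sum either skips value or includes it.
        sums += [s + value for s in sums]
    return sorted(sums)

def has_subset_sum(items, target):
    """Meet in the middle: pair one sum from each half to reach target."""
    half = len(items) // 2
    left = subset_sums(items[:half])
    right = subset_sums(items[half:])
    i, j = 0, len(right) - 1
    while i < len(left) and j >= 0:
        total = left[i] + right[j]
        if total == target:
            return True
        if total < target:
            i += 1      # need a larger sum from the left half
        else:
            j -= 1      # need a smaller sum from the right half
    return False

print(has_subset_sum([3, 34, 4, 12, 5, 2], 9))   # True
print(has_subset_sum([3, 34, 4, 12, 5, 2], 30))  # False
```

The single pass works because one pointer only ever moves up and the other only ever moves down, which is exactly what stops generalising once a third list is involved.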
I'm trying to answer this question on a practice test:
Write a function, def eliminate(x, y), that copies all the elements of the list x except the largest value into the list y.
The best thing I could come up with is:
def eliminate(x, y):
    print(x)
    y = x
    big = max(y)
    y.remove(big)
    print(y)
def main():
    x = [1, 3, 5, 6, 7, 9]
    y = [0]
    eliminate(x, y)
main()
I don't think that'll cut it if a question like that comes up on my final, and I'm pretty sure I shouldn't be writing a main function with it, just the eliminate one. So how would I answer this? (keep in mind this is an introductory course, I shouldn't be using more advanced coding)
I'd probably do this:
def eliminate(x, y):
    largest = max(x)
    y[:] = [elem for elem in x if elem != largest]
This fills y with all the elements in x except whichever is largest. For example:
>>> x = [1,2,3]
>>> y = []
>>> eliminate(x, y)
>>> y
[1, 2]
>>> x = [7,10,10,3,4]
>>> eliminate(x, y)
>>> y
[7, 3, 4]
This assumes that by "copies" the question is asking for the contents of y to be replaced. If the non-maximal elements of x are to be appended to y, you could use y.extend instead.
Note that your version doesn't handle the case where there are multiple elements with the maximum value (e.g. [1,2,2]): .remove() only removes the first matching element, not all of them.
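If the exercise is instead read as leaving out only a single copy of the largest value, a sketch along these lines would do it. The name eliminate_one is hypothetical and used only for this illustration. Note the slice assignment: writing y = x inside the function would just rebind the local name and then modify x itself.

```python
def eliminate_one(x, y):
    """Copy x into y, leaving out a single occurrence of the largest value."""
    y[:] = x              # slice assignment fills the caller's list with a copy
    if y:
        y.remove(max(y))  # removes only the first matching element

x = [7, 10, 10, 3, 4]
y = []
eliminate_one(x, y)
print(y)  # [7, 10, 3, 4]
print(x)  # [7, 10, 10, 3, 4]
```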
In order to find the largest number in a list you need to iterate over that list and keep track of the largest element along the way. There are several ways to achieve this.
So this code answers the question:
y.extend([n for n in x if n != max(x)])
but I'm worried it might not solve your real problem, which is learning how and why this works. Here is that code expanded in a very straightforward way that just uses for loops and if statements.
def transfer_all_but_largest(transfer_from_list, transfer_to_list):
    current_index = 0
    index_of_current_largest_element = 0
    largest_element_so_far = None
    for element in transfer_from_list:
        if current_index == 0:
            largest_element_so_far = element
        else:
            if element > largest_element_so_far:
                largest_element_so_far = element
                index_of_current_largest_element = current_index
        current_index = current_index + 1
    index_of_largest_element = index_of_current_largest_element
    current_index = 0 # reset our index counter
    for element in transfer_from_list:
        if current_index == index_of_largest_element:
            current_index = current_index + 1 # still count the skipped element
            continue # continue means skip ahead to the next element
        else:
            transfer_to_list = transfer_to_list + [element]
        current_index = current_index + 1
    return transfer_to_list
list_with_large_number = [1, 2, 100000]
list_were_transfering_to = [40, 50]
answer_list = transfer_all_but_largest(list_with_large_number, list_were_transfering_to)
print(answer_list)