(Binary) Summing the elements of a list


I need to sum the elements of a list, containing all zeros or ones, so that the result is 1 if there is a 1 in the list, but 0 otherwise.
def binary_search(l, low=0,high=-1):
    if not l: return -1
    if(high == -1): high = len(l)-1
    if low == high:
        if l[low] == 1: return low
        else: return -1
    mid = (low + high)//2
    upper = [l[mid:high]]
    lower = [l[0:mid-1]]
    u = sum(int(x) for x in upper)
    lo = sum(int(x) for x in lower)
    if u == 1: return binary_search(upper, mid, high)
    elif lo == 1: return binary_search(lower, low, mid-1)
    return -1
l = [0 for x in range(255)]
l[123] = 1
binary_search(l)
The code I'm using to test
u = sum(int(x) for x in upper)
works fine in the interpreter, but gives me the error
TypeError: int() argument must be a string or a number, not 'list'
I've just started to use python, and can't figure out what's going wrong (the version I've written in c++ doesn't work either).
Does anyone have any pointers?
Also, how would I do the sum so that it is a binary xor, not simply decimal addition?

You don't actually want a sum; you want to know whether upper or lower contains a 1 value. Just take advantage of Python's basic container-type syntax:
if 1 in upper:
    # etc
if 1 in lower:
    # etc
The reason you're getting the error, by the way, is that you're wrapping upper and lower in an extra nested list when you split l (rename this variable, by the way!). You just want to slice it like this:
upper = the_list[mid:high]
lower = the_list[:mid-1]
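A minimal sketch of why the extra brackets matter, assuming a hypothetical six-element list. The names the_list, wrapped, and sliced are illustrative only and do not come from the question:

```python
the_list = [0, 0, 1, 0, 0, 0]

wrapped = [the_list[2:5]]   # a list holding one list: [[1, 0, 0]]
sliced = the_list[2:5]      # a plain list of ints: [1, 0, 0]

try:
    sum(int(x) for x in wrapped)   # int() is handed the inner list
    wrapped_ok = True
except TypeError:
    wrapped_ok = False             # this branch runs

total = sum(int(x) for x in sliced)  # works: every x is an int
```

With the brackets removed, the generator iterates over the ints themselves rather than over a single inner list.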
Finally, it's worth noting that your logic is pretty weird. This is not a binary search in the classic sense of the term. It looks like you're implementing "find the index of the first occurrence of 1 in this list". Even ignoring the fact that there's a built-in function to do this already, you would be much better served by just iterating through the whole list until you find a 1. Right now, you've got O(nlogn) time complexity (plus a bunch of extra one-off loops), which is pretty silly considering the output can be replicated in O(n) time by:
def first_one(the_list):
    for i in range(len(the_list)):
        if the_list[i] == 1:
            return i
    return -1
Or of course even more simply by using the built-in function index:
def first_one(the_list):
    try:
        return the_list.index(1)
    except ValueError:
        return -1

I need to sum the elements of a list, containing all zeros or ones, so that the result is 1 if there is a 1 in the list, but 0 otherwise.
What's wrong with
int(1 in l)

I need to sum the elements of a list, containing all zeros or ones, so that the result is 1 if there is a 1 in the list, but 0 otherwise.
No need to sum the whole list; you can stop at the first 1. Simply use any(). It will return True if there is at least one truthy value in the container and False otherwise, and it short-circuits (i.e. if a truthy value is found early in the list, it doesn't scan the rest). Conveniently, 1 is truthy and 0 is not.
True and False work as 1 and 0 in an arithmetic context (Booleans are a subclass of integers), but if you want specifically 1 and 0, just wrap any() in int().
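A short sketch of that suggestion. The function name has_one is made up for illustration:

```python
def has_one(bits):
    """Return 1 if at least one element of bits is truthy, else 0."""
    # any() stops at the first truthy element; int() turns True/False into 1/0.
    return int(any(bits))


result_with_one = has_one([0, 0, 1, 0])
result_all_zero = has_one([0, 0, 0, 0])
```

An empty list yields 0, because any() of an empty iterable is False.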

Stop making nested lists.
upper = l[mid:high]
lower = l[0:mid-1]

Related

Find The Parity Outlier using dictionary {Python}

During the Codewars kata 'Find The Parity Outlier' I ran into a problem that I have been trying to solve using a dictionary. I pass all the tests except 4.
Instruction for the Kata is:
You are given an array (which will have a length of at least 3, but could be very large) containing integers. The array is either entirely comprised of odd integers or entirely comprised of even integers except for a single integer N. Write a method that takes the array as an argument and returns this "outlier" N.
The function is:
def find_outlier(integers):
    d = dict()
    count = 0
    count1 = 0
    for i in range(len(integers)):
        if integers[i] % 2 != 0 :
            d['odd'] = integers[i]
        else:
            d['even'] = integers[i]
    for j in range(len(integers)):
        if integers[j] % 2 == 0:
            count += 1
        else:
            count1 += 1
            if count > count1:
                return d['odd']
            return d['even']
Test Results:
2 should equal 1
36 should equal 17
36 should equal -123456789
0 should equal 1
So the question is: why does this happen? Can you help me sort the problem out? Thanks a lot!

I'm not sure what exactly you're referring to with that list of test results. In general though, your method with the dictionary seems like it might be overcomplicating things a bit as well. You shouldn't need to use a dict, and you shouldn't need two for loops either. Here's an alternative solution to this problem using only list comprehension.
def find_outlier(arr):
    # input should be an array-like of integers (and length >= 3) with either only one odd element OR only one even element
    odd_mask = [n%2 != 0 for n in arr] # boolean array with True in the location(s) where the elements are odd
    even_mask = [n%2 == 0 for n in arr] # boolean array with True in the location(s) where the elements are even
    N_odd = sum(odd_mask) # number of odd elements in the input
    N_even = sum(even_mask) # number of even elements in the input
    if N_even == 1: # if even is the 'outlier'...
        return arr[even_mask.index(True)] # return the element of the input array at the index we determined we had an even
    elif N_odd == 1: # if odd is the 'outlier'...
        return arr[odd_mask.index(True)] # return the element of the input array at the index we determined we had an odd
    else: # something has gone wrong or the input did not adhere to the standards set by the problem
        return None
And even this is technically not as efficient as it could be. Let me know if you try this and whether it solves whatever issue you were experiencing with expected results.

In your code the final part should not be in the else block, nor even in the for loop:
    if count > count1:
        return d['odd']
    return d['even']
As written, it may give a wrong result. For instance, if the first number in the input is odd, and is the only odd one, then this code will return d['even'], which is obviously wrong.
Place these lines after the loop (with correct indentation) and it should work.
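A sketch of what that fix could look like. It keeps the asker's variable names but also merges the two loops into one, which is an editorial simplification rather than part of the advice above:

```python
def find_outlier(integers):
    d = dict()
    count = 0    # how many even values were seen
    count1 = 0   # how many odd values were seen
    for value in integers:
        if value % 2 != 0:
            d['odd'] = value
            count1 += 1
        else:
            d['even'] = value
            count += 1
    # Decide only after every element has been counted.
    if count > count1:
        return d['odd']
    return d['even']
```

Because the comparison now runs once, after the loop, the position of the outlier in the input no longer affects the result.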
However, this problem can be solved without dictionary or extra lists. Have a go at it.
def find_outlier(integers):
    parity = integers[-2] % 2
    if integers[-1] % 2 != parity:
        if integers[0] % 2 != parity:
            return integers[-2]
        else:
            return integers[-1]
    for i in integers:
        if i % 2 != parity:
            return i

Given 2 strings, return number of positions where the two strings contain the same length 2 substring

Here is my code:
def string_match(a, b):
    count = 0
    if len(a) < 2 or len(b) < 2:
        return 0
    for i in range(len(a)):
        if a[i:i+2] == b[i:i+2]:
            count = count + 1
    return count
And here are the results:
Correct me if I am wrong, but I suspect it fails because the two strings have the same length. If I change the for loop statement to:
for i in range(len(a)-1):
then it works for all the cases provided. But can someone explain to me why adding the -1 makes it work? Perhaps I'm miscomprehending how the for loop works in this case. And can someone show me a better way to write this, because this is probably really bad code? Thank you!

But can someone explain to me why adding the -1 makes it work?
Observe:
test = 'food'
i = len(test) - 1
test[i:i+2] # produces 'd'
Using len(a) as your bound means that len(a) - 1 will be used as an i value, and therefore a slice is taken at the end of a that would extend past the end. In Python, such slices succeed, but produce fewer characters.

String slicing can return strings that are shorter than requested. In your first failing example that checks "abc" against "abc", in the third iteration of the for loop, both a[i:i+2] and b[i:i+2] are equal to "c", and therefore count is incremented.
Using range(len(a)-1) ensures that your loop stops before it gets to a slice that would be just one letter long.
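To make the off-by-one visible, here is a small sketch that runs the same comparison with both loop bounds. The helper count_pairs and its stop parameter are hypothetical, introduced only for this illustration:

```python
def count_pairs(a, b, stop):
    """Count positions i in range(stop) where the two-character slices match."""
    return sum(1 for i in range(stop) if a[i:i+2] == b[i:i+2])


a = b = "abc"
last_slice = a[2:4]                               # slicing past the end just yields "c"
with_len = count_pairs(a, b, len(a))              # also counts "c" == "c"
with_len_minus_1 = count_pairs(a, b, len(a) - 1)  # only "ab" and "bc"
```

The extra match in the first call comes entirely from the one-character slice at the last index.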

Since the strings may be of different lengths, you want to iterate only up to the end of the shortest one. In addition, you're accessing i+2, so you only want i to iterate up to the index before the last item (otherwise you might get a false positive at the end of the string by going off the end and getting a single-character string).
def string_match(a: str, b: str) -> int:
    return len([
        a[i:i+2]
        for i in range(min(len(a), len(b)) - 1)
        if a[i:i+2] == b[i:i+2]
    ])
(You could also do this counting with a sum, but this makes it easy to get the actual matches as well!)
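For comparison, a sketch of the sum-based variant mentioned in passing. It relies on True counting as 1 in arithmetic and builds no intermediate list; the local name limit is an illustrative choice:

```python
def string_match(a: str, b: str) -> int:
    # Stop one short of the end of the shorter string so every slice has two characters.
    limit = min(len(a), len(b)) - 1
    # range() of a non-positive limit is empty, so short inputs give 0.
    return sum(a[i:i+2] == b[i:i+2] for i in range(limit))
```

This version only returns the count; the list-based form is the better fit if the matching substrings themselves are needed.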

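You can use this. Note that it counts the length-2 substrings of a that occur anywhere in b, not only at the same position:
def string_match(a, b):
    if len(a) < 2 or len(b) < 2:
        return 0
    subs = [a[i:i+2] for i in range(len(a)-1)]
    occurrence = list(map(lambda x: x in b, subs))
    return occurrence.count(True)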

How does Python handle multiple conditions in a list comprehension?

I was trying to create a list comprehension from a function that I had and I came across an unexpected behavior. Just for a better understanding, my function gets an integer and checks which of its digits divides the integer exactly:
# Full function
divs = list()
for i in str(number):
    digit = int(i)
    if digit > 0 and number % digit == 0:
        divs.append(digit)
return len(divs)
# List comprehension
return len([x for x in str(number) if x > 0 and number % int(x) == 0])
The problem is that, if I give a 1012 as an input, the full function returns 3, which is the expected result. The list comprehension returns a ZeroDivisionError: integer division or modulo by zero instead. I understand that it is because of this condition:
if x > 0 and number % int(x) == 0
In the full function, the multiple condition is handled from the left to the right, so it is fine. In the list comprehension, I do not really know, but I was guessing that it was not handled in the same way.
Until I tried with a simpler function:
# Full function
positives = list()
for i in numbers:
    if i > 0 and 20 % i ==0:
        positives.append(i)
return positives
# List comprehension
return [i for i in numbers if i > 0 and 20 % i == 0]
Both of them worked. So I am thinking that maybe it has something to do with the number % int(x)? This is just curiosity on how this really works? Any ideas?

The list comprehension is different, because you compare x > 0 without converting x to int. In Py2, mismatched types will compare in an arbitrary and stupid but consistent way, which in this case sees all strs (the type of x) as greater than all int (the type of 0) meaning that the x > 0 test is always True and the second test always executes (see Footnote below for details of this nonsense). Change the list comprehension to:
[x for x in str(number) if int(x) > 0 and number % int(x) == 0]
and it will work.
Note that you could simplify a bit further (and limit redundant work and memory consumption) by importing a Py3 version of map at the top of your code (from future_builtins import map), and using a generator expression with sum, instead of a list comprehension with len:
return sum(1 for i in map(int, str(number)) if i > 0 and number % i == 0)
That only calls int once per digit, and constructs no intermediate list.
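A self-contained sketch of the same idea, assuming Python 3, where map is already lazy (so no future_builtins import is needed) and comparing a str with an int raises TypeError instead of silently succeeding. The function name count_dividing_digits is made up here, and number is assumed to be a non-negative integer:

```python
def count_dividing_digits(number):
    """Count the digits of number that divide it exactly; zero digits are skipped."""
    # d > 0 is checked first, so number % d never runs with d == 0.
    return sum(1 for d in map(int, str(number)) if d > 0 and number % d == 0)


result = count_dividing_digits(1012)
```

For 1012 the digits 1, 1, and 2 divide the number and the 0 is skipped, which matches the result of the full function in the question.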
Footnote: 0 is a numeric type, and all numeric types are "smaller" than everything except None, so a str is always greater than 0. In non-numeric cases, it would be comparing the string type names, so dict < frozenset < list < set < str < tuple, except oops, frozenset and set compare "naturally" to each other, so you can have non-transitive relationships; frozenset() < [] is true, [] < set() is true, but frozenset() < set() is false, because the type specific comparator gets invoked in the final version. Like I said, arbitrary and confusing; it was removed from Python 3 for a reason.

You should say int(x) > 0 in the list comprehension

Difficulty solving a problem in O(log n)

I wrote a function that takes as input a list of unique ints in ascending order. I'm supposed to find an index in the list that matches the value at that index; for example, if L[2]==2 the output is true.
Having done that in O(log n), I now want to count how many indices behave like that in the given list, with the same O(log n) complexity.
Below is my first function, which does the first part, followed by the second function, which I need help with:
def steady_state(L):
    lower= 0
    upper= len(L) -1
    while lower<=upper:
        middle_i= (upper+ lower)//2
        if L[middle_i]== middle_i:
            return middle_i
        elif L[middle_i]>middle_i:
            upper= middle_i-1
        else:
            lower= middle_i +1
    return None
def cnt_steady_states(L):
    lower= 0
    upper= len(L) -1
    a=b=steady_state(L)
    if steady_state(L)== None:
        return 0
    else:
        cnt=1
        while True:
            if L[upper] == upper and a<=upper:
                cnt+= upper-a
                upper= a
            if L[lower]== lower and b>=lower:
                cnt+= b- lower
                lower = b

It's not possible with the restrictions you've given so far. The best complexity you can theoretically achieve is O(n).
O() assumes the worst case (just a definition, you could drop that part). And in the worst case you will always have to look at each item in order to check whether it equals its index.
The case changes if you have more restrictions (e.g. the numbers are all ints and none may appear more than once, i.e. no two consecutive numbers are equal). Maybe this is the case?
EDIT:
After hearing that my assumed restrictions do in fact apply (i.e. ints that appear only once), I now propose this approach: because the list is sorted and its values are distinct ints, value minus index never decreases from one position to the next, so all matching entries lie in one contiguous range. You therefore only need to find a lower bound and an upper bound; the wanted result is the size of that range.
Each bound can safely be found using a binary search, each of which is O(log n).
def binsearch(field, lower=True, a=-1, b=None):
    # a starts at -1 (not 0) so that index 0 is also examined
    if b is None:
        b = len(field)
    while a + 1 < b:
        c = (a + b) // 2  # floor division keeps c an int in Python 3 as well
        if lower:
            if field[c] < c:
                a = c
            else:
                b = c
        else: # search for upper bound
            if field[c] > c:
                b = c
            else:
                a = c
    return b if lower else a
def indexMatchCount(field):
    upper = binsearch(field, lower=False)
    lower = binsearch(field, b=upper+1)
    return upper - lower + 1
This I used for testing:
import random
field = list({ random.randint(-10, 30) for i in range(30) })
field.sort()
upper = binsearch(field, lower=False)
lower = binsearch(field, b=upper+1)
for i, f in enumerate(field):
    print(lower <= i <= upper, i == f, i, f)
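An alternative sketch of the two-bound idea, written independently of the code above and assuming the input is sorted ascending with distinct integers. The names count_fixed_points and first_index are hypothetical. It searches for the first index where the value reaches the index and the first index where the value exceeds it; the matches are exactly the indices in between:

```python
def count_fixed_points(values):
    """Count indices i with values[i] == i in O(log n).

    Assumes values is sorted ascending and holds distinct integers,
    so values[i] - i never decreases as i grows.
    """
    def first_index(pred):
        # Smallest i in [0, len(values)] with pred(i) true, given that
        # pred is false for a prefix and true for the rest.
        lo, hi = 0, len(values)
        while lo < hi:
            mid = (lo + hi) // 2
            if pred(mid):
                hi = mid
            else:
                lo = mid + 1
        return lo

    start = first_index(lambda i: values[i] >= i)  # first i with values[i] - i >= 0
    end = first_index(lambda i: values[i] > i)     # first i with values[i] - i > 0
    return end - start
```

Both searches use half-open bounds, so an empty list and a match at index 0 need no special handling.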

Assuming negative integers are OK:
I think the key is that if you get a value less than your index, you know all indices to the left also do not match their value (since the integers are strictly increasing). Also, once you get an index whose value is greater than the index, everything to the right is incorrect (same reason). You can then do a divide and conquer algorithm like you did in the first case. Something along the lines of:
check middle index:
    if equal:
        count = count + 1
        check both halves, minus this index
    elif value > index:
        check left side (lower to index)
    elif index > value:
        check right side (index to upper)
In the worst case (every index matches the value), we still have to check every index.
If the integers are non-negative, then you know even more. You now also know that if an index matches the value, all indices to the left must also match the value (why?). Thus, you get:
check middle index:
    if equal:
        count = count + indices to the left (index-lower)
        check the right side (index to upper)
    elif value > index:
        check left side (lower to index)
    elif index > value:
        ##Can't happen in this case
Now our worst case is significantly improved. Instead of finding an index that matches and not gaining any new information from it, we gain a ton of information when we find one that matches, and now know half of the indices match.
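One possible way to turn the non-negative case into running code, assuming the list is sorted ascending with distinct non-negative integers, so that the matches form a prefix of the list. The function name count_prefix_matches is an illustrative choice, and the loop is an iterative rendering of the pseudocode rather than a literal translation:

```python
def count_prefix_matches(values):
    """Count indices i with values[i] == i, assuming distinct, sorted, non-negative ints."""
    lo, hi = 0, len(values)
    while lo < hi:
        mid = (lo + hi) // 2
        if values[mid] == mid:
            lo = mid + 1   # everything up to mid matches; keep looking to the right
        else:
            hi = mid       # values[mid] > mid, so the matches end before mid
    return lo              # length of the matching prefix
```

Each step halves the remaining range, which gives the O(log n) bound the question asks for under these assumptions.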

If "all of the numbers are ints and they appear only once", then you can simply do a binary search for the first pair of numbers where L[i]==i && L[i+1]!=i+1.
To allow negative ints, check if L[0]<0, and if so, search between 1..N for:
i>0 && L[i]==i && L[i-1]!=i-1. Then perform the previous search between i and N.

Median code explanation

My professor wrote this median function and I don't understand it very well. Can someone please explain the part about i = len(list)/2 and median = avg() and the else statement?
def avg_list(numbers):
    sum = 0
    for num in numbers:
        sum += num
    avg = float(sum)/len(numbers)
    print avg
def median(list):
    list.sort()
    if len(list)%2 == 0:
        #have to take avg of middle two
        i = len(list)/2
        median = avg()
    else:
        #find the middle (remembering that lists start at 0)
        i = len(list)/2
        median = list
    return median
To add from an example I saw, for even list length:
def median(s):
    i = len(s)
    if not i%2:
        return (s[(i/2)-1]+s[i/2])/2.0
    return s[i/2]
This works very well but I don't understand the last return s[i/2]?
For odd list length:
x = [1,2,5,2,3,763,234,23,1,234,21,3,2134,23,54]
median = sorted(x)[len(x)/2]
Since x has a list length of odd, wouldn't the [len(x)/2] be a floating number index? I'm not getting this all the way? Any explanation better than mine is much appreciated.

Why this is very wrong, line by line:
def median(list): # 1
    list.sort() # 2
    if len(list)%2 == 0:
        #have to take avg of middle two
        i = len(list)/2 # 3
        median = avg() # 4
    else:
        #find the middle (remembering that lists start at 0)
        i = len(list)/2 # 5
        median = list # 6
    return median
#1: It's a bad idea to give your variables the same name as data types, namely list.
#2: list.sort() will modify the list that is being passed. One would expect a getter like median() not to do that.
#4: It calls a function avg() with no arguments, which is completely meaningless, even if such a function were defined.
#3 and #5 are calculated the same way regardless of the if branch taken. Regardless, i is never used.
#6: It sets median to the original list, which makes zero sense.
Here's how I would rewrite this (while maintaining clarity):
def median(alist):
    srtd = sorted(alist) # returns a sorted copy
    mid = len(alist)//2 # floor division truncates, so mid is always an int
    if len(alist) % 2 == 0: # take the avg of middle two
        return (srtd[mid-1] + srtd[mid]) / 2.0
    else:
        return srtd[mid]
Also, the avg_list() function (which is not used nor could be used in median()) could be rewritten as:
def avg_list(numbers):
    return float(sum(numbers))/len(numbers)
sum() is a function that returns the sum of all elements in an iterable.
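On the question of whether len(x)/2 gives a float index: in Python 2 the / operator on two ints truncates, so the index is an int. In Python 3 it does produce a float, and floor division (//) is needed instead. A sketch assuming Python 3 and a non-empty input, checked against the standard library's statistics.median:

```python
import statistics


def median(values):
    ordered = sorted(values)      # sorted copy; the caller's list is left untouched
    mid = len(ordered) // 2       # floor division always yields an int index
    if len(ordered) % 2 == 0:
        # Even length: average the two middle elements.
        return (ordered[mid - 1] + ordered[mid]) / 2
    # Odd length: the single middle element.
    return ordered[mid]


x = [1, 2, 5, 2, 3, 763, 234, 23, 1, 234, 21, 3, 2134, 23, 54]
same_as_stdlib = median(x) == statistics.median(x)
```

For the 15-element list from the question, mid is 7, which is the index of the middle element of the sorted copy.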

We're missing some code here, but we can puzzle it out.
The comments here are instructive. When we check:
if len(list)%2 == 0:
Then we're checking to see if the list is of even length. If a list has an even number of members, then there is no true "middle" element, and so:
#have to take avg of middle two
i = len(list)/2
median = avg()
We assume that the avg() function is going to return the average of the two middle elements. Since you didn't include a definition of an avg function, it's possible that this is really supposed to be an avg_list function taking the middle two elements of the list.
Now, if the list is of odd length, there is a middle element, and so:
else:
    #find the middle (remembering that lists start at 0)
    i = len(list)/2
    median = list
Now this looks kinda wrong to me too, but my guess is that the intention is that this should read:
median = list[i]
That would be us returning the middle element of the list. Since the list has been sorted, that middle element is the true median of the list.
Hope this helps!

I'm sure it's trying to say, "If the list is of odd size, just take the central element; otherwise take the mean of the central two elements" - but I can't see that that's what the code is actually doing at all.
In particular:
It's calling an avg() function (not avg_list, note) but without any arguments
It's ignoring the value of i after computing it in the same way in both branches
Are you sure that's the complete code which is meant to work?

You can also decide to always return the average of the middle sub-array of the ordered list:
For instance return the average of [4,5] out of [1,2,3,4,5,6,7,8], and that of [5] out of [1,2,3,4,5,6,7,8,9].
A python implementation would be:
def median(a):
    ordered = sorted(a)
    length = len(a)
    # // keeps both indices ints in Python 2 and Python 3
    return float((ordered[length//2] + ordered[-(length+1)//2]))/2
