Find The Parity Outlier, CodeWars question on python - python

I am trying to solve this problem but it doesn't give me the correct answer. Here is the problem:
You are given an array (which will have a length of at least 3, but could be very large) containing integers. The array is either entirely comprised of odd integers or entirely comprised of even integers except for a single integer N. Write a method that takes the array as an argument and returns this "outlier" N.
Here is my code:
a = [2, 4, 6, 8, 10, 3]
b = [2, 4, 0, 100, 4, 11, 2602, 36]
c = [160, 3, 1719, 19, 11, 13, -21]
def find_outlier(list_integers):
    for num in list_integers:
        if num % 2 != 0:
            odd = num
        elif num % 2 == 0:
            even = num
    for num in list_integers:
        if len([odd]) < len([even]):
            return odd
        else:
            return even
print(find_outlier(a))
print(find_outlier(b))
print(find_outlier(c))
It spits out 10, 36, 160 and obviously only the last one is correct. Can anyone help me out with it?
Thanks!

You could analyze the first three and find the outlier if it is there.
If it is there, you are done. If not, you know the expected parity and can test each subsequent element accordingly.
Creating lists for odd/even numbers, while in principle leading to a result, is unnecessarily memory inefficient.
In code this could look something like:
def find_outlier(seq):
    par0 = seq[0] % 2
    par1 = seq[1] % 2
    if par0 != par1:
        return seq[1] if seq[2] % 2 == par0 else seq[0]
    # the checks on the first 2 elements are redundant, but this avoids copying
    # `for x in seq[2:]:` would do less iteration but would copy the input
    for x in seq:
        if x % 2 != par0:
            return x
a = [2, 4, 6, 8, 10, 3]
b = [2, 4, 0, 100, 4, 11, 2602, 36]
c = [160, 3, 1719, 19, 11, 13, -21]
print(find_outlier(a))
# 3
print(find_outlier(b))
# 11
print(find_outlier(c))
# 160
Your code could not work in its current form:
this block:
    for num in list_integers:
        if num % 2 != 0:
            odd = num
        elif num % 2 == 0:
            even = num
will just have the last seen odd in odd and the last seen even in even, without any info on how many were seen. You would need to count how many even/odd numbers are there and eventually need to store the first value encountered for each parity.
this second block
    for num in list_integers:
        if len([odd]) < len([even]):
            return odd
        else:
            return even
is always checking the length of the length-1 lists, and will always return even.
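To make the point above concrete, here is a minimal sketch using the values that the first loop leaves behind for the list a. The variable names are only illustrative.

```python
# After the first loop over a = [2, 4, 6, 8, 10, 3]:
odd, even = 3, 10  # last odd and last even value seen

# Wrapping a single value in a list always gives a list of length 1,
# no matter how many odd or even numbers were in the input.
wrapped_lengths = (len([odd]), len([even]))
comparison = len([odd]) < len([even])

print(wrapped_lengths)  # (1, 1)
print(comparison)       # False, so the else branch runs and even is returned
```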
I see no simple way of fixing your code so that it is as efficient as the solution above. But you could adapt it to be reasonably efficient (O(n) in time, though without short-circuiting, and O(1) in memory):
def find_outlier_2(seq):
    odd = None
    even = None
    n_odd = n_even = 0
    for x in seq:
        if x % 2 == 0:
            if even is None:  # save first occurrence
                even = x
            n_even += 1
        else:  # no need to compute the modulus again
            if odd is None:  # save first occurrence
                odd = x
            n_odd += 1
    if n_even > 1:
        return odd
    else:
        return even
The above is significantly more efficient than some of the other answers in that it does not create unnecessary lists.
For example, these solutions are unnecessarily more memory consuming (being O(n) in time and O(n) in memory):
def find_outlier_3(list_integers):
    odd = []
    even = []
    for num in list_integers:
        if num % 2 != 0:
            odd.append(num)
        elif num % 2 == 0:
            even.append(num)
    if len(odd) < len(even):
        return odd[0]
    else:
        return even[0]
def find_outlier_4(lst):
    odds = [el % 2 for el in lst]
    if odds.count(0) == 1:
        return lst[odds.index(0)]
    else:
        return lst[odds.index(1)]
Simple benchmarks show that these solutions are also slower:
%timeit [find_outlier(x) for x in (a, b, c) * 100]
# 10000 loops, best of 3: 128 µs per loop
%timeit [find_outlier_2(x) for x in (a, b, c) * 100]
# 1000 loops, best of 3: 229 µs per loop
%timeit [find_outlier_3(x) for x in (a, b, c) * 100]
# 1000 loops, best of 3: 341 µs per loop
%timeit [find_outlier_4(x) for x in (a, b, c) * 100]
# 1000 loops, best of 3: 248 µs per loop

You can nicely use list comprehensions for this:
a = [2, 4, 6, 8, 10, 3]
b = [2, 4, 0, 100, 4, 11, 2602, 36]
c = [160, 3, 1719, 19, 11, 13, -21]
def outlier(lst):
    odds = [ el % 2 for el in lst ] # list with 1's when odd, 0's when even
    print(odds) # just to show what odds contains
    if odds.count(0) == 1: # if the amount of zeros (even numbers) = 1 in this list
        print(lst[odds.index(0)]) # find the index of the 'zero' and use it to read the value from the input lst
    else:
        print(lst[odds.index(1)]) # find the index of the 'one' and use it to read the value from the input lst
outlier(a)
outlier(b)
outlier(c)
Output
[0, 0, 0, 0, 0, 1] # only 1 'one' so use the position of that 'one'
3
[0, 0, 0, 0, 0, 1, 0, 0] # only 1 'one' so use the position of that 'one'
11
[0, 1, 1, 1, 1, 1, 1] # only 1 'zero' so use the position of that 'zero'
160

Count the number of odd values in the first three items in the list. This can be done using sum(). If the sum is greater than 1, the list has mostly odd numbers, so find the even outlier. Otherwise, find the odd outlier.
def find_outlier(sequence):
    if sum(x & 1 for x in sequence[:3]) > 1:
        # find the even outlier
        for n in sequence:
            if not n & 1:
                return n
    else:
        # find the odd outlier
        for n in sequence:
            if n & 1:
                return n

I imagine it would be a bit more efficient to first determine if the outlier is odd or even by looking at a small sample, then return just the outlier using list comprehension. This way, if the list is massive, you won't have timeout issues.
Here's what I would do:
def findoutlier(yourlist):
    if (yourlist[0] % 2 == 0 and yourlist[1] % 2 == 0) or (yourlist[0] % 2 == 0 and yourlist[2] % 2 == 0) or (yourlist[1] % 2 == 0 and yourlist[2] % 2 == 0):
        oddeven = "even"
    else:
        oddeven = "odd"
    if oddeven == "even":
        return [i for i in yourlist if i % 2 != 0][0]
    else:
        return [i for i in yourlist if i % 2 == 0][0]
a = [2, 4, 6, 8, 10, 3]
b = [2, 4, 0, 100, 4, 11, 2602, 36]
c = [160, 3, 1719, 19, 11, 13, -21]
print(findoutlier(a))
print(findoutlier(b))
print(findoutlier(c))
This will return 3, 11, and 160 as expected.
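Note that a list comprehension still walks the whole list before the first match is picked with [0]. If stopping at the first match matters, one possible variation (not part of the answer above) is a generator expression with next(). The sketch below assumes the input follows the problem statement, so exactly one outlier exists; the name find_outlier_lazy is made up for illustration.

```python
def find_outlier_lazy(seq):
    # Majority parity from a sample of three: 1 if at least two are odd, else 0.
    majority = 1 if sum(x % 2 for x in seq[:3]) >= 2 else 0
    # The generator expression is consumed lazily, so the scan stops
    # at the first element whose parity differs from the majority.
    return next(x for x in seq if x % 2 != majority)

a = [2, 4, 6, 8, 10, 3]
b = [2, 4, 0, 100, 4, 11, 2602, 36]
c = [160, 3, 1719, 19, 11, 13, -21]
print(find_outlier_lazy(a), find_outlier_lazy(b), find_outlier_lazy(c))
# 3 11 160
```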

You want to use a list to store your odd/even numbers. Right now you're storing them as int and they're getting replaced on your loop's next iteration.
def find_outlier(list_integers):
    odd = []
    even = []
    for num in list_integers:
        if num % 2 != 0:
            odd.append(num)
        elif num % 2 == 0:
            even.append(num)
    if len(odd) < len(even):
        return odd[0]
    else:
        return even[0]

Related

Find large number in a list, where all previous numbers are also in the list

I am trying to implement a Yellowstone Integer calculation which suggests that "Every number appears exactly once: this is a permutation of the positive numbers". The formula I have implemented to derive the values is as follows:
import math
yellowstone_list = []
item_list = []
i = 0
while i <= 1000:
    if i <= 3:
        yellowstone_list.append(i)
    else:
        j = 1
        inList = 1
        while inList == 1:
            minus_1 = math.gcd(j, yellowstone_list[i-1])
            minus_2 = math.gcd(j, yellowstone_list[i-2])
            if minus_1 == 1 and minus_2 > 1:
                if j in yellowstone_list:
                    inList = 1
                else:
                    inList = 0
            j += 1
        yellowstone_list.append(j - 1)
    item_list.append(i)
    i += 1
The issue becomes that as i increases, the time taken for the formula to determine the value of j also increases (naturally as i is increasingly further away from the start point of j).
What I would like to do is determine the largest value of j in the yellowstone_list, where all the values of 1 to j are already in the list.
As an example, in the below list, j would be 9, as all the values 0 - 9 are in the list:
yellowstone_list = [0, 1, 2, 3, 4, 9, 8, 15, 14, 5, 6, 25, 12, 35, 16, 7]
Any suggestions on how to implement this in an efficient manner?
For the "standalone" problem as stated the algorithm would be:
Sort the list.
Run a counter from 0 while in parallel traversing the list. Once the counter value is unequal to the list element, then you have found one-past the wanted element.
Something like the following:
x=[0, 1, 2, 3, 4, 9, 8, 15, 14, 5, 6, 25, 12, 35, 16, 7]
y=sorted(x)
for i in range(1, len(y)):
    if y[i]!=i:
        print(i-1)
        break
But in your case it appears that the initial list is being built gradually. So each time a number is added to the list, it can be inserted in a sorted manner and can be checked against the previous element and the traversal can start from there for more efficient process.
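The incremental idea described above could be sketched roughly as follows. This is only an illustration under some assumptions: the values are distinct non-negative integers, the gap-free run starts at 0 (as in the example list), and the helper name add_and_track is made up. It uses bisect.insort from the standard library for the sorted insertion.

```python
import bisect

def add_and_track(sorted_vals, value, prefix_end):
    """Insert value into sorted_vals (kept sorted) and return the updated
    prefix_end: the largest j such that 0..j are all present, or -1."""
    bisect.insort(sorted_vals, value)
    # With distinct values, 0..j all present means sorted_vals[j] == j,
    # so the scan can resume where the previous one stopped.
    while (prefix_end + 1 < len(sorted_vals)
           and sorted_vals[prefix_end + 1] == prefix_end + 1):
        prefix_end += 1
    return prefix_end

values = [0, 1, 2, 3, 4, 9, 8, 15, 14, 5, 6, 25, 12, 35, 16, 7]
sorted_vals, prefix_end = [], -1
for v in values:
    prefix_end = add_and_track(sorted_vals, v, prefix_end)
print(prefix_end)  # 9
```

Because the scan resumes from the previous position, the pointer only ever moves forward over the whole run, although each insertion into a Python list still shifts elements.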
This is how I would do it:
lst.sort()
for c, i in enumerate(lst):
    if c + 1 < len(lst) and lst[c + 1] != i + 1:
        j = i
        break
else:
    j = i
Basically, the list is sorted, and then, it loops through each value, checking if the next value is only 1 greater than the current.
After some time to sit down and think about it, and using the suggestions to sort the list, I came up with two solutions:
Sorting
I implemented @eugebe Sh.'s solution within the while i <= 1000 loop as follows:
while i <= 1000:
    m = sorted(yellowstone_list)
    for n in range(1, len(m)):
        if m[n]!=n:
            break
    if i == 0:
        ....
In List
I ran an increment to check whether the value was in the list using the "in" operator, also within the while i <= 1000 loop, as follows:
while i <= 1000:
    while k in yellowstone_list:
        k += 1
    if i == 0:
        ....
Running both codes 100 times, I got the following:
Sorting: Total: 1:56.403527 seconds, Average: 1.164035 seconds.
In List: Total: 1:14.225230 seconds, Average: 0.742252 seconds.
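Building on the "In List" idea, a further speed-up that is not in the answers above would be to keep a set alongside the list, so that membership tests take constant time on average, and to start each search from the smallest unused value rather than from 1. The sketch below is only an illustration under those assumptions: the function name yellowstone is made up, and it starts from 1, 2, 3 rather than from the leading 0 used in the question's list.

```python
from math import gcd

def yellowstone(count):
    """Return the first `count` terms of the sequence, starting 1, 2, 3."""
    seq = [1, 2, 3]
    seen = set(seq)        # O(1) average membership tests
    smallest_unused = 4    # every value below this is already in seq
    while len(seq) < count:
        j = smallest_unused
        # Next term: smallest unused j coprime to the previous term
        # and sharing a factor with the term before that.
        while j in seen or gcd(j, seq[-1]) != 1 or gcd(j, seq[-2]) == 1:
            j += 1
        seq.append(j)
        seen.add(j)
        while smallest_unused in seen:
            smallest_unused += 1
    return seq[:count]

print(yellowstone(15))
# [1, 2, 3, 4, 9, 8, 15, 14, 5, 6, 25, 12, 35, 16, 7]
```

Here smallest_unused - 1 is exactly the value asked for in the question: the largest j such that all of 1 to j are already in the list.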

Automatically generate list from math function?

My idea is to run the 3n + 1 process (Collatz conjecture) on numbers ending in 1, 3, 7, and 9, within any arbitrary range, and to tell the code to send the lengths of each action to a list, so I can run functions on that list separately.
What I have so far is to specify unit digits 1,3,7 and 9 as: if n % 10 == 1; if n % 10 == 3 ...etc, and I think my plan needs some form of nested for loops; where I'm at with list appending is to have temp = [] and leng = [] and find a way for the code to automatically temp.clear() before each input to leng. I'm assuming there's different ways to do this, and I'm open to any ideas.
leng = []
temp = []
def col(n):
    while n != 1:
        print(n)
        temp.append(n)
        if n % 2 == 0:
            n = n // 2
        else:
            n = n * 3 + 1
    temp.append(n)
    print(n)
It's unclear what specifically you're asking about and want to know, so this is only a guess. Since you only want to know the lengths of the sequences, there's no need to actually save the numbers in each one—which means there's only one list created.
def collatz(n):
    """ Return length of Collatz sequence beginning with positive integer "n".
    """
    count = 0
    while n != 1:
        n = n // 2 if n % 2 == 0 else n*3 + 1
        count += 1
    return count
def process_range(start, stop):
    """ Return list of results of calling the collatz function on all the
        numbers in the closed interval [start...stop] that end with a digit
        in the set {1, 3, 7, or 9}.
    """
    return [collatz(n) for n in range(start, stop+1) if n % 10 in {1, 3, 7, 9}]
print(process_range(1, 42))
Output:
[0, 7, 16, 19, 14, 9, 12, 20, 7, 15, 111, 18, 106, 26, 21, 34, 109]

Python: parsing a string of concatenated ascending integers

The objective is to parse the output of an ill-behaving program which concatenates a list of numbers, e.g., 3, 4, 5, into a string "345", without any non-number separating the numbers. I also know that the list is sorted in ascending order.
I came up with the following solution which reconstructs the list from a string:
a = '3456781015203040'
numlist = []
numlist.append(int(a[0]))
i = 1
while True:
    j = 1
    while True:
        if int(a[i:i+j]) <= numlist[-1]:
            j = j + 1
        else:
            numlist.append(int(a[i:i+j]))
            i = i + j
            break
    if i >= len(a):
        break
This works, but I have a feeling that the solution reflects too much the fact that I have been trained in Pascal, decades ago. Is there a better or more pythonic way to do it?
I am aware that the problem is ill-posed, i.e., I could start with '34' as the initial element and get a different solution (or possibly end up with remaining trailing numeral characters which don't form the next element of the list).
This finds solutions for all possible initial number lengths:
a = '3456781015203040'
def numbers(a,n):
    current_num, i = 0, 0
    while True:
        while i+n <= len(a) and int(a[i:i+n]) <= current_num:
            n += 1
        if i+n <= len(a):
            current_num = int(a[i:i+n])
            yield current_num
            i += n
        else:
            return
for n in range(1,len(a)):
    l = list(numbers(a,n))
    # print only solutions that use up all digits of a
    if ''.join(map(str,l)) == a:
        print(l)
[3, 4, 5, 6, 7, 8, 10, 15, 20, 30, 40]
[34, 56, 78, 101, 520, 3040]
[34567, 81015, 203040]
A small modification that allows parsing data such as "7000000000001" and gives the best output (maximum list size):
a = '30000001'
def numbers(a, n):
    current_num, i = 0, 0
    while True:
        while i+n <= len(a) and int(a[i:i+n]) <= current_num:
            n += 1
        if i+2*n > len(a):
            # fewer than 2*n digits remain, so take the rest as a single number
            current_num = int(a[i:])
            yield current_num
            return
        elif i+n <= len(a):
            current_num = int(a[i:i+n])
            yield current_num
            i += n
        else:
            return
        print(current_num)  # debug output
for n in range(1, len(a)):
    l = list(numbers(a, n))
    if "".join(map(str, l)) == a:
        print(l)

if a number is divisible by all the entries of a list then

This came up while attempting Problem 5 of Project Euler, I'm sorry if this is vague or obvious I am new to programming.
Suppose I have a list of integers
v = range(1,n) = [1, ..., n]
What I want to do is this:
if m is divisible by all the entries of v then I want to set
m/v[i] for i starting at 2 and iterating up
then I want to keep repeating this process until I eventually get something which is not divisible by all the entries of v.
Here is a specific example:
let v=[1,2,3,4] and m = 24
m is divisible by 1, 2, 3, and 4, so we divide m by 2 giving us
m=12 which is divisible by 1, 2, 3, and 4 , so we divide by 3
giving us m=4 which is not divisible by 1, 2, 3, and 4. So we stop here.
Is there a way to do this in python using a combination of loops?
I think this code will solve your problem:
i = 1
while True:
    w = [x for x in v if m % x == 0]
    # keep dividing while every entry of v divides m and v has entries left
    if w == v and i < len(v):
        m //= v[i]
        i += 1
    else:
        break
Try this on for size; I have a feeling this is what you were asking for:
v = [1,2,3,4]
m = 24
cont = True
c = 1
d = m
while cont:
    d = d/c
    for i in v:
        if d % i != 0:
            cont = False
            result = d
            break
    c+=1
print (d)
Got an output of 4.
I think this piece of code should do what you're asking for:
v = [1,2,3,4]
m = 24
index = 1
done = False
while not done:
    if all([m % x == 0 for x in v]):
        m = m // v[index]
        if index + 1 == len(v):
            print('Exhausted v')
            done = True
        else:
            index += 1
    else:
        done = True
        print('Not all elements in v evenly divide m')
That said, this is not the best way to go about solving Project Euler problem 5. A more straightforward and faster approach would be:
solved = False
num = 2520
while not solved:
    num += 2520
    if all([num % x == 0 for x in [11, 13, 14, 16, 17, 18, 19, 20]]):
        solved = True
print(num)
In this approach, we know that the answer will be a multiple of 2520 (the smallest number divisible by every integer from 1 to 10), so we increment the value we're checking by that amount. We also know that the only values that need to be checked are in [11, 13, 14, 16, 17, 18, 19, 20], because every other number in the range [1, 20] either divides 2520 already or is a factor of at least one of the numbers in the list.
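As an aside, the number being searched for is the least common multiple of 1 to 20, so it can also be computed directly without a search loop. This is a different technique from the one in the answer above; the helper names lcm and smallest_multiple are chosen here for illustration, and the sketch relies only on math.gcd and functools.reduce.

```python
from functools import reduce
from math import gcd

def lcm(a, b):
    # Least common multiple of two positive integers.
    return a * b // gcd(a, b)

def smallest_multiple(limit):
    # Fold lcm over 1..limit to get the smallest number divisible by all of them.
    return reduce(lcm, range(1, limit + 1))

print(smallest_multiple(10))  # 2520
print(smallest_multiple(20))  # 232792560
```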

Python - Memoization and Collatz Sequence

When I was struggling to do Problem 14 in Project Euler, I discovered that I could use a thing called memoization to speed up my process (I let it run for a good 15 minutes, and it still hadn't returned an answer). The thing is, how do I implement it? I've tried to, but I get a keyerror(the value being returned is invalid). This bugs me because I am positive I can apply memoization to this and get this faster.
lookup = {}
def countTerms(n):
    arg = n
    count = 1
    while n is not 1:
        count += 1
        if not n%2:
            n /= 2
        else:
            n = (n*3 + 1)
    if n not in lookup:
        lookup[n] = count
    return lookup[n], arg
print max(countTerms(i) for i in range(500001, 1000000, 2))
Thanks.
There is also a nice recursive way to do this, which probably will be slower than poorsod's solution, but it is more similar to your initial code, so it may be easier for you to understand.
lookup = {}
def countTerms(n):
    if n not in lookup:
        if n == 1:
            lookup[n] = 1
        elif not n % 2:
            # integer division keeps the keys as ints
            lookup[n] = countTerms(n // 2)[0] + 1
        else:
            lookup[n] = countTerms(n*3 + 1)[0] + 1
    return lookup[n], n
print(max(countTerms(i) for i in range(500001, 1000000, 2)))
The point of memoising, for the Collatz sequence, is to avoid calculating parts of the list that you've already done. The remainder of a sequence is fully determined by the current value. So we want to check the table as often as possible, and bail out of the rest of the calculation as soon as we can.
def collatz_sequence(start, table={}):  # cheeky trick: store the (mutable) table as a default argument
    """Returns the Collatz sequence for a given starting number"""
    l = []
    n = start
    while n not in l:  # break if we find ourself in a cycle
                       # (don't assume the Collatz conjecture!)
        if n in table:
            l += table[n]
            break
        elif n%2 == 0:
            l.append(n)
            n = n//2
        else:
            l.append(n)
            n = (3*n) + 1
    table.update({n: l[i:] for i, n in enumerate(l) if n not in table})
    return l
Is it working? Let's spy on it to make sure the memoised elements are being used:
class NoisyDict(dict):
    def __getitem__(self, item):
        print("getting", item)
        return dict.__getitem__(self, item)
def collatz_sequence(start, table=NoisyDict()):
    # etc
In [26]: collatz_sequence(5)
Out[26]: [5, 16, 8, 4, 2, 1]
In [27]: collatz_sequence(5)
getting 5
Out[27]: [5, 16, 8, 4, 2, 1]
In [28]: collatz_sequence(32)
getting 16
Out[28]: [32, 16, 8, 4, 2, 1]
In [29]: collatz_sequence.__defaults__[0]
Out[29]:
{1: [1],
2: [2, 1],
4: [4, 2, 1],
5: [5, 16, 8, 4, 2, 1],
8: [8, 4, 2, 1],
16: [16, 8, 4, 2, 1],
32: [32, 16, 8, 4, 2, 1]}
Edit: I knew it could be optimised! The secret is that there are two places in the function (the two return points) that we know l and table share no elements. While previously I avoided calling table.update with elements already in table by testing them, this version of the function instead exploits our knowledge of the control flow, saving lots of time.
[collatz_sequence(x) for x in range(500001, 1000000)] now takes around 2 seconds on my computer, while a similar expression with @welter's version clocks in at 400 ms. I think this is because the functions don't actually compute the same thing: my version generates the whole sequence, while @welter's just finds its length. So I don't think I can get my implementation down to the same speed.
def collatz_sequence(start, table={}):  # cheeky trick: store the (mutable) table as a default argument
    """Returns the Collatz sequence for a given starting number"""
    l = []
    n = start
    while n not in l:  # break if we find ourself in a cycle
                       # (don't assume the Collatz conjecture!)
        if n in table:
            table.update({x: l[i:] for i, x in enumerate(l)})
            return l + table[n]
        elif n%2 == 0:
            l.append(n)
            n = n//2
        else:
            l.append(n)
            n = (3*n) + 1
    table.update({x: l[i:] for i, x in enumerate(l)})
    return l
PS - spot the bug!
This is my solution to PE14:
memo = {1: 1}
def get_collatz(n):
    if n in memo:
        return memo[n]
    if n % 2 == 0:
        # integer division keeps the memo keys as ints
        terms = get_collatz(n // 2) + 1
    else:
        terms = get_collatz(3*n + 1) + 1
    memo[n] = terms
    return terms
compare = 0
for x in range(1, 999999):
    if x not in memo:
        ctz = get_collatz(x)
        if ctz > compare:
            compare = ctz
            culprit = x
print(culprit)
