Fibonacci sequence using list comprehension - python

I want to write a list comprehension that will give out Fibonacci numbers up to 4 million. I then want to use that list comprehension to sum the even-valued terms.
from math import sqrt
Phi = (1 + sqrt(5)) / 2
phi = (1 - sqrt(5)) / 2
series = [int((Phi**n - phi**n) / sqrt(5)) for n in range(1, 10)]
print(series)
[1, 1, 2, 3, 5, 8, 13, 21, 34]
This is sample code that works, and I want to write similar code using a list comprehension. Please help.
a, b = 1, 1
total = 0
while a <= 4000000:
    if a % 2 == 0:
        total += a
    a, b = b, a + b
print(total)

Since there's no actual list required for what you need to do, it's a bit wasteful having a list comprehension. Far better would be to just provide a function to do all the heavy lifting for you, something like:
def sumEvenFibsBelowOrEqualTo(n):
    a, b = 1, 1
    total = 0
    while a <= n:
        if a % 2 == 0:
            total += a
        a, b = b, a + b
    return total
Then just call it with print(sumEvenFibsBelowOrEqualTo(4000000)).
If you really do want a list of Fibonacci numbers (perhaps you want to run different comprehensions on it), you can make a small modification to do this - this returns a list rather than the sum of the even values:
def listOfFibsBelowOrEqualTo(n):
    a, b = 1, 1
    mylist = []
    while a <= n:
        mylist.append(a)
        a, b = b, a + b
    return mylist
You can then use the following list comprehension to sum the even ones:
print(sum([x for x in listOfFibsBelowOrEqualTo(4000000) if x % 2 == 0]))
This is probably not too bad given that the Fibonacci numbers get very big very fast (so the list won't be that big) but, for other sequences that don't do that (or for much larger limits), constructing a list may use up large chunks of memory unnecessarily.
A better method may be to use a generator: if you want a list, you can always construct one from it, and if you don't need a list, you can still use it in comprehensions:
def fibGen(limit):
    a, b = 1, 1
    while a <= limit:
        yield a
        a, b = b, a + b

mylist = list(fibGen(4000000))                         # a list
print(sum(x for x in fibGen(4000000) if x % 2 == 0))   # sum evens, no list

A list comprehension is by its nature a parallel process: an input iterable is fed in, some function is applied to each element, and an output list is created. The function is applied to each element independently of the other elements. Thus, list comprehensions are not suited to iterative algorithms such as the one you present. A list comprehension could be used with your closed-form formula, though:
sum([int((Phi**n - phi**n) / sqrt(5)) for n in range(1, 10) if int((Phi**n - phi**n) / sqrt(5))%2 == 0])
If you want to use an iterative algorithm, a generator is more suitable.
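For example, here is a sketch (not from the original answers) that keeps the closed-form formula but produces the values lazily with itertools, so nothing is computed twice and no intermediate list is built:
from itertools import count, takewhile
from math import sqrt

Phi = (1 + sqrt(5)) / 2
phi = (1 - sqrt(5)) / 2

# Lazily generate closed-form Fibonacci values and stop once they exceed 4 million.
fibs = takewhile(lambda f: f <= 4000000,
                 (round((Phi**n - phi**n) / sqrt(5)) for n in count(1)))
print(sum(f for f in fibs if f % 2 == 0))  # 4613732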

Related

Check elements sum in two different list [duplicate]

I've read that one of the key beliefs of Python is that flat > nested. However, if I have several variables counting up, what is the alternative to multiple for loops?
My code is for counting grid sums and goes as follows:
def horizontal():
    for x in range(20):
        for y in range(17):
            temp = grid[x][y: y + 4]
            sum = 1
            for n in temp:
                sum += int(n)
            print sum # EDIT: the return instead of print was a mistype
This seems to me like it is too heavily nested. Firstly, what is considered too many nested loops in Python (I have certainly seen 2 nested loops before)? Secondly, if this is too heavily nested, what is an alternative way to write this code?
from itertools import product

def horizontal():
    for x, y in product(range(20), range(17)):
        print 1 + sum(int(n) for n in grid[x][y: y + 4])
You should be using the built-in sum function. Of course you can't if you shadow it with a variable of the same name, so rename any such variable (e.g. to my_sum).
grid = [range(20) for i in range(20)]
sum(sum( 1 + sum(grid[x][y: y + 4]) for y in range(17)) for x in range(20))
The above outputs 13260, for the particular grid created in the first line of code. It uses sum() three times. The innermost sum adds up the numbers in grid[x][y: y + 4], plus the slightly strange initial value sum = 1 shown in the code in the question. The middle sum adds up those values for the 17 possible y values. The outer sum adds up the middle values over possible x values.
If elements of grid are strings instead of numbers, replace
sum(grid[x][y: y + 4])
with
sum(int(n) for n in grid[x][y: y + 4])
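Putting that together, here is a minimal sketch of the string variant in context, assuming a grid holding the same 0–19 values as the example grid above (so the total is still 13260):
# Hypothetical grid of numeric strings with the same values as the example grid.
grid = [[str(c) for c in range(20)] for _ in range(20)]

total = sum(sum(1 + sum(int(n) for n in grid[x][y:y + 4])
                for y in range(17))
            for x in range(20))
print(total)  # 13260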
You can use a dictionary to optimize performance significantly. Here is another example:
locations = {}
for i in range(len(airports)):
    locations[airports["abb"][i][1:-1]] = (airports["height"][i], airports["width"][i])

for i in range(len(uniqueData)):
    h, w = locations[uniqueData["dept_apt"][i]]
    uniqueData["dept_apt_height"][i] = h
    uniqueData["dept_apt_width"][i] = w

Sum of two squares in Python

I have written a code based on the two pointer algorithm to find the sum of two squares. My problem is that I run into a memory error when running this code for an input n=55555**2 + 66666**2. I am wondering how to correct this memory error.
def sum_of_two_squares(n):
    look = tuple(range(n))
    i = 0
    j = len(look) - 1
    while i < j:
        x = (look[i])**2 + (look[j])**2
        if x == n:
            return (j, i)
        elif x < n:
            i += 1
        else:
            j -= 1
    return None

n = 55555**2 + 66666**2
print(sum_of_two_squares(n))
The problem I'm trying to solve using the two-pointer algorithm is:
return a tuple of two positive integers whose squares add up to n, or return None if the integer n cannot be so expressed as a sum of two squares. The returned tuple must present the larger of its two numbers first. Furthermore, if some integer can be expressed as a sum of two squares in several ways, return the breakdown that maximizes the larger number. For example, the integer 85 allows two such representations 7*7 + 6*6 and 9*9 + 2*2, of which this function must therefore return (9, 2).
You're creating a tuple of size 55555^2 + 66666^2 = 7530713581
So if each element of the tuple takes one byte, the tuple will take up 7.01 GiB.
You'll need to either reduce the size of the tuple, or possibly make each element take up less space by specifying the type of each element: I would suggest looking into Numpy for the latter.
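As a rough illustration of the second option (a memory comparison only, not a fix for this particular algorithm): a NumPy array with an explicit small dtype stores each element in a fixed number of bytes, rather than as a full Python int object.
import numpy as np

# 10 million 32-bit integers occupy a predictable 40 MB.
a = np.arange(10_000_000, dtype=np.int32)
print(a.nbytes)  # 40000000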
Specifically for this problem:
Why use a tuple at all?
You create the variable look which is just a list of integers:
look=tuple(range(n)) # = (0, 1, 2, ..., n-1)
Then you reference it, but never modify it. So: look[i] == i and look[j] == j.
So you're looking up numbers in a list of numbers. Why look them up? Why not just use i in place of look[i] and remove look altogether?
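A minimal sketch of that change, keeping the original two-pointer logic and starting points (this removes the memory error, but it is still very slow for an n this large; the answers below address the speed):
def sum_of_two_squares(n):
    # Same search as the original, with look[i] and look[j] replaced by i and j.
    i, j = 0, n - 1
    while i < j:
        x = i**2 + j**2
        if x == n:
            return (j, i)
        elif x < n:
            i += 1
        else:
            j -= 1
    return None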
As others have pointed out, there's no need to use tuples at all.
One reasonably efficient way of solving this problem is to generate a series of integer square values (0, 1, 4, 9, etc...) and test whether or not subtracting these values from n leaves you with a value that is a perfect square.
You can generate a series of perfect squares efficiently by adding successive odd numbers together: 0 (+1) → 1 (+3) → 4 (+5) → 9 (etc.)
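As a small sketch of that trick (not part of the original answer), a generator of perfect squares needs only additions:
from itertools import islice

def squares():
    # 0, 1, 4, 9, 16, ... produced by adding successive odd numbers: +1, +3, +5, ...
    x, step = 0, 1
    while True:
        yield x
        x, step = x + step, step + 2

print(list(islice(squares(), 6)))  # [0, 1, 4, 9, 16, 25]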
There are also various tricks you can use to test whether or not a number is a perfect square (for example, see the answers to this question), but — in Python, at least — it seems that simply testing the value of int(n**0.5) is faster than iterative methods such as a binary search.
def integer_sqrt(n):
    # If n is a perfect square, return its (integer) square
    # root. Otherwise return -1
    r = int(n**0.5)
    if r * r == n:
        return r
    return -1
def sum_of_two_squares(n):
    # If n can be expressed as the sum of two squared integers,
    # return these integers as a tuple. Otherwise return <None>
    # i: iterator variable
    # x: value of i**2
    # y: value we need to add to x to obtain (i+1)**2
    i, x, y = 0, 0, 1
    # If i**2 > n / 2, then we can stop searching
    max_x = n >> 1
    while x <= max_x:
        r = integer_sqrt(n - x)
        if r >= 0:
            return (i, r)
        i, x, y = i + 1, x + y, y + 2
    return None
This returns a solution to sum_of_two_squares(55555**2 + 66666**2) in a fraction of a second.
You do not need the ranges at all, and certainly do not need to convert them into tuples. They take a ridiculous amount of space, but you only need their current elements, numbers i and j. Also, as the friendly commenter suggested, you can start with sqrt(n) to improve the performance further.
def sum_of_two_squares(n):
    i = 1
    j = int(n ** (1/2))
    while i < j:
        x = i * i + j * j
        if x == n:
            return j, i
        if x < n:
            i += 1
        else:
            j -= 1
Bear in mind that the problem takes a very long time to be solved. Be patient. And no, NumPy won't help. There is nothing here to vectorize.

You have a list of integers, and for each index you want to find the product of every integer except the integer at that index. (Can't use division)

I am currently working on practice interview questions. Attached is a screenshot of the question I'm working on below
I tried the brute-force approach of using nested loops, with the intent of refactoring out the nested loop later, but it failed the tests even as a brute-force solution.
Here is the code that I tried:
def get_products_of_all_ints_except_at_index(int_list):
    # Make a list with the products
    products = []
    for i in int_list:
        for j in int_list:
            if(i != j):
                k = int_list[i] * int_list[j]
                products.append(k)
    return products
I am curious about both the brute-force solution and the more efficient solution that avoids nested loops.
Linear algorithm using cumulative products from the right side and from the left side
def productexcept(l):
    n = len(l)
    right = [1] * n
    for i in reversed(range(n - 1)):
        right[i] = right[i + 1] * l[i + 1]
    # print(right)
    prod = 1
    for i in range(n):
        t = l[i]
        l[i] = prod * right[i]
        prod *= t
    return l

print(productexcept([2,3,7,5]))
>> [105, 70, 30, 42]
If you are allowed to use imports and really ugly list comprehensions you could try this:
from functools import reduce
l = [1,7,3,4]
[reduce(lambda x,y: x*y, [l[i] for i in range(len(l)) if i != k],1) for k,el in enumerate(l)]
If you are not allowed to use functools you can write your own function:
def prod(x):
    prod = 1
    for i in x:
        prod = prod * i
    return prod

l = [1,7,3,4]
[prod([l[i] for i in range(len(l)) if i != k]) for k,el in enumerate(l)]
I leave it as an exercise to the reader to put the two solutions in a function.
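For what it's worth, one way that exercise might come out (a sketch based on the reduce version above):
from functools import reduce

def products_except_index(l):
    # For each index k, multiply together every element except l[k].
    return [reduce(lambda x, y: x * y, (l[i] for i in range(len(l)) if i != k), 1)
            for k, _ in enumerate(l)]

print(products_except_index([1, 7, 3, 4]))  # [84, 12, 28, 21]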
If you want to get the index of each element in a list, you should use for i in range(len(int_list)). for i in int_list in fact yields the values in the list, not the indices.
So the brute force solution should be:
def get_products_of_all_ints_except_at_index(int_list):
    # Make a list with the products
    products = []
    for i in range(len(int_list)):
        k = 1
        for j in range(len(int_list)):
            if(i != j):
                k *= int_list[j]
        products.append(k)
    return products
I've come up with this:
def product(larg):
    result = 1
    for n in larg:
        result *= n
    return result

List = [1, 7, 3, 4]
N = len(List)
BaseList = [List[0:i] + List[i+1:N] for i in range(N)]
Products = [product(a) for a in BaseList]
print(Products)
From the input list List, you create a list of sublists, each with the corresponding integer removed. Then you just build a new list with the products of those sublists.
Here's a solution using a recursive function instead of a loop:
from functools import reduce
def get_products_of_all_ints_except_at_index(int_list, n=0, results=None):
    # Use None as the default so one results list is not shared between calls
    if results is None:
        results = []
    new_list = int_list.copy()
    if n == len(int_list):
        return results
    new_list.pop(n)
    results.append(reduce((lambda x, y: x * y), new_list))
    return get_products_of_all_ints_except_at_index(int_list, n+1, results)

int_list = [1, 7, 3, 4]
print(get_products_of_all_ints_except_at_index(int_list))
# expected results [84, 12, 28, 21]
Output:
[84, 12, 28, 21]
Assume N to be a power of 2.
Brute force takes N(N-2) products. You can improve on that by precomputing the N/2 products of pairs of elements, then N/4 pairs of pairs, then pairs of pairs of pairs, until you have a single pair left. This takes N-2 products in total.
Next, you form all the requested products by multiplying the required partial products, in a dichotomic way. This takes lg(N) - 1 multiplies per product, hence a total of N·(lg(N) - 1) multiplies.
This is an O(N log N) solution.
Illustration for N=8:
Using 6 multiplies,
a b c d e f g h
ab cd ef gh
abcd efgh
Then with 16 multiplies,
b.cd.efgh
a.cd.efgh
ab.d.efgh
ab.c.efgh
abcd.f.gh
abcd.e.gh
abcd.ef.h
abcd.ef.g
The desired expressions can be obtained from the binary structure of the numbers from 0 to N-1.
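A possible Python sketch of this scheme (a recursive formulation of the same pairing idea; the function name and structure are illustrative, and the power-of-2 assumption is not actually required here):
def products_except_self(xs):
    # Returns (product of all elements, list of products-except-self).
    # Each level combines two halves: every "except" product from one half
    # is multiplied by the total product of the other half.
    if len(xs) == 1:
        return xs[0], [1]
    mid = len(xs) // 2
    prod_left, exc_left = products_except_self(xs[:mid])
    prod_right, exc_right = products_except_self(xs[mid:])
    total = prod_left * prod_right
    return total, ([p * prod_right for p in exc_left] +
                   [p * prod_left for p in exc_right])

print(products_except_self([1, 7, 3, 4])[1])  # [84, 12, 28, 21]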
Brute force
Your brute force approach does not work for multiple reasons:
(I'm assuming at least fixed indentation)
List index out of range
for i in int_list
This already gives you i which is an element of the list, not an index. When i is 7,
int_list[i]
is not possible any more.
The loops should be for ... in range(len(int_list)).
With that fixed, the result contains too many elements. There are 12 elements in the result, but only 4 are expected. That's because of another indentation issue at products.append(...). It needs to be outdented 2 steps.
With that fixed, most k are overwritten by a new value each time i*j is calculated. To fix that, start k at the identity element for multiplication, which is 1, and then multiply int_list[j] onto it.
The full code is now
def get_products_of_all_ints_except_at_index(int_list):
    products = []
    for i in range(len(int_list)):
        k = 1
        for j in range(len(int_list)):
            if i != j:
                k *= int_list[j]
        products.append(k)
    return products
Optimization
I would propose the "brute force" solution as an answer first. Don't worry about performance as long as there are no performance requirements. That would be premature optimization.
The solution with division would be:
def get_products_of_all_ints_except_at_index(int_list):
    products = []
    product = 1
    for i in int_list:
        product *= i
    for i in int_list:
        products.append(product // i)
    return products
and thus does not need a nested loop and has linear Big-O time complexity.
A little exponent trick may save you the division: simply multiply by the inverse:
for i in int_list:
    products.append(int(product * i ** -1))
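One caveat worth adding (not from the original answer): i ** -1 is a float, so for very large integer products the rounded result can drift, while the floor-division version above stays exact:
product = 10**30
i = 3
print(product // i)            # 333333333333333333333333333333 (exact)
print(int(product * i ** -1))  # float rounding: may differ in the low-order digits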

Finding maximum sum of occurrences of one element in two attempts from a list

Best explained by example. If a python list is -
[[0,1,2,0,4],
 [0,1,2,0,2],
 [1,0,0,0,1],
 [1,0,0,1,0]]
I want to select two sub-lists which will yield the max sum of occurrences of zeros present - where sum is to be calculated as below
SUM = number of zeros present in the first selected sub-list + number of zeros present in the second selected sub-list at positions where the first selected sub-list does not have a zero.
In this case, the answer is 5 (the first or second sub-list together with the last sub-list). (Note that the third sub-list should not be selected, because it has a zero at index 3, the same position as in the first/second sub-list we would select; that pairing only gives a sum of 4, which is not the maximum once we consider the last sub-list.)
What kind of algorithm is best suited if we apply this to a big input? Is there a way to do this in better than O(N^2) time?
Binary operations are fairly useful for this task:
Convert each sublist to a binary number, where a 0 is turned into a 1 bit, and other numbers are turned into a 0 bit.
For example, [0,1,2,0,4] would be turned into 10010, which is 18.
Eliminate duplicate numbers.
Take the remaining numbers pairwise and combine each pair with a binary OR.
Find the number with the most 1 bits.
The code:
lists = [[0,1,2,0,4],
         [0,1,2,0,2],
         [1,0,0,0,1],
         [1,0,0,1,0]]

import itertools

def to_binary(lst):
    num = ''.join('1' if n == 0 else '0' for n in lst)
    return int(num, 2)

def count_ones(num):
    return bin(num).count('1')

# Step 1 & 2: Convert to binary and remove duplicates
binary_numbers = {to_binary(lst) for lst in lists}

# Step 3: Create pairs
combinations = itertools.combinations(binary_numbers, 2)

# Step 4 & 5: Compute binary OR and count 1 digits
zeros = (count_ones(a | b) for a, b in combinations)

print(max(zeros))  # output: 5
The efficiency of the naive algorithm is O(n(n-1)·m) ~ O(n^2·m), where n is the number of lists and m is the length of each list. When n and m are comparable in magnitude, this equates to O(n^3).
It might be helpful to observe that naive matrix multiplication is also O(n^3). This might lead us to the following algorithm:
Write each list with only 1's and 0's, where a 1 indicates a non-zero entry.
Arrange these lists in a matrix A.
Compute the product M = A·A^T.
Find the minimum element in M; its row and column correspond to the lists which produce the maximum number of non-overlapping zeros.
Here, (3) is the limiting step of the algorithm. Asymptotically, depending on your matrix multiplication algorithm, you can achieve a complexity down to roughly O(n^2.4).
An example Python implementation would look like:
import numpy as np

lists = [[0,1,2,0,4],
         [0,1,2,0,2],
         [1,0,0,0,1],
         [1,0,0,1,0]]

filtered = list(set(tuple(1 if e else 0 for e in sub) for sub in lists))
A = np.mat(filtered)
D = np.einsum('ik,jk->ij', A, A)
indices = np.unravel_index(np.argmin(D), D.shape)
print(f'{indices}: {len(lists[0]) - D[indices]}')  # e.g. (0, 2): 5 (the indices depend on set ordering)
Note that this algorithm on its own has the fundamental inefficiency that it calculates both the lower-triangular and upper-triangular halves of the dot-product matrix. However, the numpy speed-up will probably offset this compared to the combinations approach. See the timing results below:
import itertools
import random
from time import time

def numpy_approach(lists):
    filtered = list(set(tuple(1 if e else 0 for e in sub) for sub in lists))
    A = np.mat(filtered, dtype=bool).astype(int)
    D = np.einsum('ik,jk->ij', A, A)
    return len(lists[0]) - D.min()

def itertools_approach(lists):
    binary_numbers = {int(''.join('1' if n == 0 else '0' for n in lst), 2)
                      for lst in lists}
    combinations = itertools.combinations(binary_numbers, 2)
    zeros = (bin(a | b).count('1') for a, b in combinations)
    return max(zeros)

N = 1000
lists = [[random.randint(0, 5) for _ in range(10)] for _ in range(100)]

for name, function in {
    'numpy approach': numpy_approach,
    'itertools approach': itertools_approach
}.items():
    start = time()
    for _ in range(N):
        function(lists)
    print(f'{name}: {time() - start}')

# numpy approach: 0.2698099613189697
# itertools approach: 0.9693171977996826
The algorithm should look something like the following (with Haskell code as the example, so as not to make the process trivial for you in Python):
turn each sublist into "Is zero" or "Isn't zero"
map (map (\x -> if x==0 then 1 else 0)) bigList
Enumerate the list so you can keep indices
enumList = zip [0..] bigList
Compare each sublist with its successive sublists
myCompare = concat . go
  where
    go [] = []
    go ((ix, xs):xss) = [((ix, iy), zipWith (.|.) xs ys) | (iy, ys) <- xss] : go xss
Calculate your maxes
best = maximumBy (compare `on` (sum . snd)) $ myCompare enumList
Pull out the indices
result = fst best

Two value in two lists, simplify the code

I have two lists, and I want to compare the values from the two lists to see if their difference is within a certain range, and count how many such pairs there are. Here is my code, first version:
m = [1,3,5,7]
n = [1,4,7,9,5,6,34,52]
k = 0
for i in xrange(0, len(m)):
    for j in xrange(0, len(n)):
        if abs(m[i] - n[j]) <= 0.5:
            k += 1
        else:
            continue
The output is 3. I also tried a second version:
for i, j in zip(m, n):
    if abs(i - j) <= 0.5:
        t += 1
    else:
        continue
The output is 1, which is wrong. So I am wondering if there is simpler and more efficient code for the first version; I have a big amount of data to deal with. Thank you!
The first thing you could do is remove the else: continue, since that doesn't add anything. Also, you can directly use for a in m to avoid iterating over a range and indexing.
If you wanted to write it more succinctly, you could use itertools.
import itertools
m = [1,3,5,7]
n = [1,4,7,9,5,6,34,52]
k = sum(abs(a - b) <= 0.5 for a, b in itertools.product(m, n))
The runtime of this (and your solution) is O(m * n), where m and n are the lengths of the lists.
If you need a more efficient algorithm, you can use a sorted data structure like a binary tree or a sorted list to achieve better lookup.
import bisect

m = [1,3,5,7]
n = [1,4,7,9,5,6,34,52]
n.sort()
k = 0
for a in m:
    i = bisect.bisect_left(n, a - 0.5)
    j = bisect.bisect_right(n, a + 0.5)
    k += j - i
The runtime is O((m + n) * log n). That's n * log n for sorting and m * log n for lookups. So you'd want to make n the shorter list.
More pythonic version of your first version:
ms = [1, 3, 5, 7]
ns = [1, 4, 7, 9, 5, 6, 34, 52]
k = 0
for m in ms:
    for n in ns:
        if abs(m - n) <= 0.5:
            k += 1
I don't think it will run faster, but it's simpler (more readable).
It's simpler, and probably slightly faster, to simply iterate over the lists directly rather than to iterate over range objects to get index values. You already do this in your second version, but you're not constructing all possible pairs with that zip() call. Here's a modification of your first version:
m = [1,3,5,7]
n = [1,4,7,9,5,6,34,52]
k = 0
for x in m:
    for y in n:
        if abs(x - y) <= 0.5:
            k += 1
You don't need the else: continue part, which does nothing at the end of a loop, so I left it out.
If you want to explore generator expressions to do this, you can use:
k = sum(sum(abs(x - y) <= 0.5 for y in n) for x in m)
That should run reasonably fast using just the core language and no imports.
Your two code snippets are doing two different things. The first one is comparing each element of n with each element of m, but the second one is only doing a pairwise comparison of corresponding elements of m and n, stopping when the shorter list runs out of elements. We can see exactly which elements are being compared in the second case by printing the zip:
>>> m = [1,3,5,7]
>>> n = [1,4,7,9,5,6,34,52]
>>> zip(m,n)
[(1, 1), (3, 4), (5, 7), (7, 9)]
pawelswiecki has posted a more Pythonic version of your first snippet. Generally, it's better to iterate directly over containers rather than using an indexed loop unless you actually need the index. And even then, it's more Pythonic to use enumerate() to generate the index than to use xrange(len(m)). E.g.:
>>> for i, v in enumerate(m):
...     print i, v
...
0 1
1 3
2 5
3 7
A rule of thumb is that if you find yourself writing for i in xrange(len(m)), there's probably a better way to do it. :)
William Gaul has made a good suggestion: if your lists are sorted you can break out of the inner loop once the absolute difference gets bigger than your threshold of 0.5. However, Paul Draper's answer using bisect is my favourite. :)
