I have written code based on the two-pointer algorithm to find a sum of two squares. My problem is that I run into a MemoryError when running it for the input n = 55555**2 + 66666**2. How can I fix this memory error?
def sum_of_two_squares(n):
    look=tuple(range(n))
    i=0
    j = len(look)-1
    while i < j:
        x = (look[i])**2 + (look[j])**2
        if x == n:
            return (j,i)
        elif x < n:
            i += 1
        else:
            j -= 1
    return None
n=55555**2 + 66666**2
print(sum_of_two_squares(n))
The problem I'm trying to solve using the two-pointer algorithm is:
return a tuple of two positive integers whose squares add up to n, or return None if the integer n cannot be so expressed as a sum of two squares. The returned tuple must present the larger of its two numbers first. Furthermore, if some integer can be expressed as a sum of two squares in several ways, return the breakdown that maximizes the larger number. For example, the integer 85 allows two such representations 7*7 + 6*6 and 9*9 + 2*2, of which this function must therefore return (9, 2).
You're creating a tuple with 55555^2 + 66666^2 = 7530713581 elements.
So even if each element of the tuple took only one byte, the tuple would take up 7.01 GiB. In practice it is worse: on a 64-bit CPython build each tuple slot is an 8-byte pointer to a separate int object.
You'll need to either reduce the size of the tuple, or possibly make each element take up less space by specifying the type of each element. I would suggest looking into NumPy for the latter.
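To make the arithmetic concrete, here is a minimal sketch of the size estimate (Python 3). The one-byte case is the optimistic figure quoted above; the 8-byte case is an assumption about a 64-bit CPython build, where every tuple slot stores a pointer. Nothing is actually allocated.

```python
# Rough memory estimate for tuple(range(n)); no tuple is built here.
n = 55555**2 + 66666**2          # 7530713581 elements

GIB = 2**30

# Optimistic case from the text: one byte per element.
one_byte_estimate = n / GIB

# Assumption: 64-bit CPython, where each tuple slot is an 8-byte pointer
# (the int objects themselves need additional memory on top of this).
pointer_estimate = 8 * n / GIB

print(round(one_byte_estimate, 2))   # 7.01
print(round(pointer_estimate, 2))    # 56.11
```

Either way the tuple is far too large for a typical machine, which is why the answers below avoid building it at all.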
Specifically for this problem:
Why use a tuple at all?
You create the variable look, which is just a tuple of consecutive integers:
look=tuple(range(n)) # = (0, 1, 2, ..., n-1)
Then you reference it, but never modify it. So: look[i] == i and look[j] == j.
So you're looking up numbers in a list of numbers. Why look them up? Why not just use i in place of look[i] and remove look altogether?
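A small check of that point, as a sketch assuming Python 3: a range object already answers the same index lookups as the tuple without storing any elements, and each lookup just returns the index itself, so look adds nothing.

```python
n = 55555**2 + 66666**2

# range(n) is lazy in Python 3: it stores only start, stop and step.
look = range(n)

# Indexing gives back the index, so look[i] can be replaced by i.
for i in (0, 1, 12345, n - 1):
    assert look[i] == i

print(look[-1] == n - 1)   # True
```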
As others have pointed out, there's no need to use tuples at all.
One reasonably efficient way of solving this problem is to generate a series of integer square values (0, 1, 4, 9, etc...) and test whether or not subtracting these values from n leaves you with a value that is a perfect square.
You can generate a series of perfect squares efficiently by adding successive odd numbers together: 0 (+1) → 1 (+3) → 4 (+5) → 9 (etc.)
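As an illustration of the odd-number trick, here is a minimal generator sketch. The name squares is made up for this example; it relies only on the identity (i + 1)**2 == i**2 + (2*i + 1).

```python
from itertools import islice

def squares():
    """Yield 0, 1, 4, 9, ... using only additions."""
    square, odd = 0, 1
    while True:
        yield square
        square += odd   # (i + 1)**2 == i**2 + (2*i + 1)
        odd += 2        # next odd number

print(list(islice(squares(), 6)))   # [0, 1, 4, 9, 16, 25]
```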
There are also various tricks you can use to test whether or not a number is a perfect square (for example, see the answers to this question), but — in Python, at least — it seems that simply testing the value of int(n**0.5) is faster than iterative methods such as a binary search.
def integer_sqrt(n):
    # If n is a perfect square, return its (integer) square
    # root. Otherwise return -1
    r = int(n**0.5)
    if r * r == n:
        return r
    return -1
def sum_of_two_squares(n):
    # If n can be expressed as the sum of two squared positive integers,
    # return these integers as a tuple (larger first). Otherwise return None
    # i: iterator variable (starts at 1, since both numbers must be positive)
    # x: value of i**2
    # y: value we need to add to x to obtain (i+1)**2
    i, x, y = 1, 1, 3
    # If i**2 > n / 2, then we can stop searching
    max_x = n >> 1
    while x <= max_x:
        r = integer_sqrt(n-x)
        if r >= 0:
            # i is the smaller number, so the first hit has the largest r
            return (r, i)
        i, x, y = i+1, x+y, y+2
    return None
This returns a solution to sum_of_two_squares(55555**2 + 66666**2) in a fraction of a second.
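One caveat: int(n**0.5) goes through floating point, which can be off by one for very large n. A hedged alternative, assuming Python 3.8 or later, is math.isqrt, which works on exact integers. The function names below are illustrative only, and the search uses a plain loop variable rather than the running odd-number sums.

```python
import math

def exact_sqrt_or_none(n):
    """Return the integer square root of n if n is a perfect square, else None."""
    r = math.isqrt(n)          # exact for arbitrarily large ints (Python 3.8+)
    return r if r * r == n else None

def sum_of_two_squares_exact(n):
    """Return (larger, smaller) positive integers whose squares sum to n, or None."""
    i = 1
    while 2 * i * i <= n:      # the smaller number never exceeds sqrt(n / 2)
        r = exact_sqrt_or_none(n - i * i)
        if r is not None:
            return (r, i)      # smallest i is found first, so r is maximal
        i += 1
    return None

print(sum_of_two_squares_exact(85))   # (9, 2)
```

Whether this is faster or slower than the float version was not measured here; the point is only that the perfect-square test stays exact.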
You do not need the ranges at all, and certainly do not need to convert them into tuples. They take a ridiculous amount of space, but you only need their current elements, numbers i and j. Also, as the friendly commenter suggested, you can start with sqrt(n) to improve the performance further.
def sum_of_two_squares(n):
    i = 1
    j = int(n ** 0.5)
    while i <= j:  # <= so that i == j (e.g. n = 2 -> (1, 1)) is also checked
        x = i * i + j * j
        if x == n:
            return j, i
        if x < n:
            i += 1
        else:
            j -= 1
    return None
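Bear in mind that if j starts at n - 1, as in your original code, the problem takes a very long time to be solved, even without the tuple. With j starting at int(sqrt(n)), the loop needs at most about sqrt(n) steps. And no, NumPy won't help. There is nothing here to vectorize.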
Useful links: "Project Euler #8, I don't understand where I'm going wrong" and "Project Euler 8"
My approach to the problem involves:
1) Starting from the first digit, slicing the integer into required slice lengths (13 here)
2) Creating a list of individual elements in a particular slice.
3) Evaluating the product of the digits in the list using numpy.
4) Appending the results of multiplication in a separate list.
5) Printing the maximum valued product from the list
Here is the attempt:
import numpy
import math
i = 7316717653133062491922511967442657474235534919493496983520312774506326239578318016984801869478851843858615607891129494954595017379583319528532088055111254069874715852386305071569329096329522744304355766896648950445244523161731856403098711121722383113622298934233803081353362766142828064444866452387493035890729629049156044077239071381051585930796086670172427121883998797908792274921901699720888093776657273330010533678812202354218097512545405947522435258490771167055601360483958644670632441572215539753697817977846174064955149290862569321978468622482839722413756570560574902614079729686524145351004748216637048440319989000889524345065854122758866688116427171479924442928230863465674813919123162824586178664583591245665294765456828489128831426076900422421902267105562632111110937054421750694165896040807198403850962455444362981230987879927244284909188845801561660979191338754992005240636899125607176060588611646710940507754100225698315520005593572972571636269561882670428252483600823257530420752963450
i = str(i)
multiple_list = []
for j in range(len(i)-14):
    p = i[j:j+13]
    l = list(p)
    l = [int(x) for x in l]
    y = numpy.prod(l)
    multiple_list.append(y)
print(max(multiple_list))
The output of the above block of code is : 2091059712
Which of course is the wrong answer! Please help me in figuring out the reason for this discrepancy.
Why complicate your program? You can make it simpler -
s = "7316717653133062491922511967442657474235534919493496983520312774506326239578318016984801869478851843858615607891129494954595017379583319528532088055111254069874715852386305071569329096329522744304355766896648950445244523161731856403098711121722383113622298934233803081353362766142828064444866452387493035890729629049156044077239071381051585930796086670172427121883998797908792274921901699720888093776657273330010533678812202354218097512545405947522435258490771167055601360483958644670632441572215539753697817977846174064955149290862569321978468622482839722413756570560574902614079729686524145351004748216637048440319989000889524345065854122758866688116427171479924442928230863465674813919123162824586178664583591245665294765456828489128831426076900422421902267105562632111110937054421750694165896040807198403850962455444362981230987879927244284909188845801561660979191338754992005240636899125607176060588611646710940507754100225698315520005593572972571636269561882670428252483600823257530420752963450"
largest_product = 0
# len(s) - 12 so that the final 13-digit window is included
for i in range(0, len(s) - 12):
    product = 1
    for j in range(i, i + 13):
        product *= int(s[j])
    if product > largest_product:
        largest_product = product
print(largest_product)
The reason your code doesn't work is that NumPy uses fixed-width integers, like C, and on some platforms the default integer type is only 32 bits wide, so the product of 13 digits can silently overflow. You are experiencing the same problem as in the question you linked. You can find a complete explanation in this answer:
https://stackoverflow.com/a/39089671
To fix your code, you can use a for loop to multiply the 13 digits instead of using numpy.prod().
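To see how such an overflow corrupts a product, here is a sketch that imitates signed 32-bit wraparound in plain Python. The helper wrap_int32 is hypothetical and only models two's-complement storage; it is not a NumPy function. The digits used are one 13-digit window taken from the number in the question.

```python
def wrap_int32(value):
    """Model how a signed 32-bit integer would store value (two's complement)."""
    value &= 0xFFFFFFFF                 # keep the low 32 bits
    return value - 2**32 if value >= 2**31 else value

window = "5576689664895"                # 13 consecutive digits from the number

exact = 1
wrapped = 1
for digit in window:
    exact *= int(digit)                 # Python ints never overflow
    wrapped = wrap_int32(wrapped * int(digit))

print(exact)     # 23514624000
print(wrapped)   # 2039787520
```

The wrapped value is smaller than the true product, so a maximum taken over wrapped products can end up pointing at the wrong window.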
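You could also do it in a single expression. This version takes prod from the math module (Python 3.8+) rather than from NumPy, so the product is computed with Python's arbitrary-precision integers and cannot overflow:
from math import prod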
n = "7316717653133062491922511967442657474235534919493496983520312774506326239578318016984801869478851843858615607891129494954595017379583319528532088055111254069874715852386305071569329096329522744304355766896648950445244523161731856403098711121722383113622298934233803081353362766142828064444866452387493035890729629049156044077239071381051585930796086670172427121883998797908792274921901699720888093776657273330010533678812202354218097512545405947522435258490771167055601360483958644670632441572215539753697817977846174064955149290862569321978468622482839722413756570560574902614079729686524145351004748216637048440319989000889524345065854122758866688116427171479924442928230863465674813919123162824586178664583591245665294765456828489128831426076900422421902267105562632111110937054421750694165896040807198403850962455444362981230987879927244284909188845801561660979191338754992005240636899125607176060588611646710940507754100225698315520005593572972571636269561882670428252483600823257530420752963450"
p = max(prod([int(d) for d in a]) for a in zip(*[n[i:] for i in range(13)]))
print(p)
Pardon me for a basic question (I am new to Theano)!
I want to get the difference of two matrices for only those positions that satisfy a condition. So, suppose we have two matrices A and B; this (Python equivalent code) is what I want to calculate:
sum = 0
n,m = A.shape
for i in xrange(n):
    for j in xrange(m):
        if(A[i][j] != 3.5): #some random condition!
            sum += A[i][j] - B[i][j]
I want equivalent Theano code to calculate the sum.
I know there is theano.scan, which can be used to scan an ndarray, but I could not find any example that has an if condition.
Thank you in advance :)
I have found lots of help on creating arrays with specific number values but I cannot seem to find anything to help me set up the array in the first or second problems.
I am not asking for the answers to this assignment, this is just my first Python assignment so I am a beginner and cannot figure out how to set up the arrays I need as I am not given numbers.
So far, I have found this to create an empty array:
import itertools
import numpy as np
my_array = np.empty([n, n])
And then set the value at coordinate i, j to f(i, j).
for i, j in itertools.product(range(n), range(n)):
    my_array[i, j] = f(i, j)
I just cannot seem to figure out how to actually apply this code to my question. Would sin(z) be my f(i, j)?
Yes, sin(z_{i,j}) would be your f(i, j). It's probably more efficient to do without the loop, though:
np.sin((2 * np.pi) * (1 - np.random.random_sample((n, n))))
I have two lists, and I want to compare every value in one list with every value in the other and count the pairs whose difference is within a certain range. Here is the first version of my code:
m = [1,3,5,7]
n = [1,4,7,9,5,6,34,52]
k=0
for i in xrange(0, len(m)):
    for j in xrange(0, len(n)):
        if abs(m[i] - n[j]) <=0.5:
            k+=1
        else:
            continue
The output is 3. I also tried a second version:
t = 0
for i, j in zip(m,n):
    if abs(i - j) <=0.5:
        t+=1
    else:
        continue
The output is 1, which is the wrong answer. So I am wondering if there is simpler and more efficient code for the first version; I have a large amount of data to deal with. Thank you!
The first thing you could do is remove the else: continue, since that doesn't add anything. Also, you can directly use for a in m to avoid iterating over a range and indexing.
If you wanted to write it more succinctly, you could use itertools.
import itertools
m = [1,3,5,7]
n = [1,4,7,9,5,6,34,52]
k = sum(abs(a - b) <= 0.5 for a, b in itertools.product(m, n))
The runtime of this (and your solution) is O(m * n), where m and n are the lengths of the lists.
If you need a more efficient algorithm, you can use a sorted data structure like a binary tree or a sorted list to achieve better lookup.
import bisect
m = [1,3,5,7]
n = [1,4,7,9,5,6,34,52]
n.sort()
k = 0
for a in m:
    i = bisect.bisect_left(n, a - 0.5)
    j = bisect.bisect_right(n, a + 0.5)
    k += j - i
The runtime is O((m + n) * log n). That's n * log n for sorting and m * log n for lookups. So you'd want to make n the shorter list.
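As a sanity check on the bisect approach, the sketch below wraps it in a function and compares it against a brute-force count. The function names and the tol parameter are made up for this example.

```python
import bisect

def count_close_pairs(ms, ns, tol=0.5):
    """Count pairs (a, b), a from ms and b from ns, with abs(a - b) <= tol."""
    ordered = sorted(ns)                             # O(n log n), done once
    count = 0
    for a in ms:
        lo = bisect.bisect_left(ordered, a - tol)    # first b >= a - tol
        hi = bisect.bisect_right(ordered, a + tol)   # first b >  a + tol
        count += hi - lo
    return count

def count_close_pairs_brute(ms, ns, tol=0.5):
    """Reference O(m * n) version used only to cross-check the result."""
    return sum(abs(a - b) <= tol for a in ms for b in ns)

m = [1, 3, 5, 7]
n = [1, 4, 7, 9, 5, 6, 34, 52]
print(count_close_pairs(m, n), count_close_pairs_brute(m, n))   # 3 3
```

Using sorted() instead of n.sort() leaves the caller's list untouched, which is a design choice of this sketch rather than something the approach requires.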
More pythonic version of your first version:
ms = [1, 3, 5, 7]
ns = [1, 4, 7, 9, 5, 6, 34, 52]
k = 0
for m in ms:
    for n in ns:
        if abs(m - n) <= 0.5:
            k += 1
I don't think it will run faster, but it's simpler (more readable).
It's simpler, and probably slightly faster, to simply iterate over the lists directly rather than to iterate over range objects to get index values. You already do this in your second version, but you're not constructing all possible pairs with that zip() call. Here's a modification of your first version:
m = [1,3,5,7]
n = [1,4,7,9,5,6,34,52]
k=0
for x in m:
    for y in n:
        if abs(x - y) <=0.5:
            k+=1
You don't need the else: continue part, which does nothing at the end of a loop, so I left it out.
If you want to explore generator expressions to do this, you can use:
k = sum(sum( abs(x-y) <= 0.5 for y in n) for x in m)
That should run reasonably fast using just the core language and no imports.
Your two code snippets are doing two different things. The first one is comparing each element of n with each element of m, but the second one is only doing a pairwise comparison of corresponding elements of m and n, stopping when the shorter list runs out of elements. We can see exactly which elements are being compared in the second case by printing the zip:
>>> m = [1,3,5,7]
>>> n = [1,4,7,9,5,6,34,52]
>>> zip(m,n)
[(1, 1), (3, 4), (5, 7), (7, 9)]
pawelswiecki has posted a more Pythonic version of your first snippet. Generally, it's better to directly iterate over containers rather than using an indexed loop unless you actually need the index. And even then, it's more Pythonic to use enumerate() to generate the index than to use xrange(len(m)). E.g.
>>> for i, v in enumerate(m):
...     print i, v
...
0 1
1 3
2 5
3 7
A rule of thumb is that if you find yourself writing for i in xrange(len(m)), there's probably a better way to do it. :)
William Gaul has made a good suggestion: if your lists are sorted you can break out of the inner loop once the absolute difference gets bigger than your threshold of 0.5. However, Paul Draper's answer using bisect is my favourite. :)
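The early-exit idea can be sketched as follows. This is one possible reading of that suggestion, not the original commenter's code, and the function name is hypothetical: once the second list is sorted, the inner loop can stop as soon as a value exceeds the current one by more than the threshold.

```python
def count_close_pairs_sorted(ms, ns, tol=0.5):
    """Count pairs with abs(a - b) <= tol, stopping each inner scan early."""
    ordered = sorted(ns)
    count = 0
    for a in ms:
        for b in ordered:
            if b - a > tol:      # every later b is even larger: stop here
                break
            if a - b <= tol:     # b is not too far below a either
                count += 1
    return count

print(count_close_pairs_sorted([1, 3, 5, 7], [1, 4, 7, 9, 5, 6, 34, 52]))   # 3
```

Each inner scan still starts from the beginning of the sorted list, so the worst case remains O(m * n); the bisect version avoids that by jumping straight to the matching range.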