I am trying to solve the arithmetic progression problem from USACO. Here is the problem statement.
An arithmetic progression is a sequence of the form a, a+b, a+2b, ..., a+nb where n=0, 1, 2, 3, ... . For this problem, a is a non-negative integer and b is a positive integer.
Write a program that finds all arithmetic progressions of length n in the set S of bisquares. The set of bisquares is defined as the set of all integers of the form p^2 + q^2 (where p and q are non-negative integers).
The two lines of input are n and m: the length of each sequence and the upper bound that limits the search for bisquares (0 <= p, q <= m), respectively.
I have implemented an algorithm which correctly solves the problem, yet it takes too long. With the max constraints of n = 25 and m = 250, my program does not solve the problem in the 5 second time limit.
Here is the code:
n = 25
m = 250
bisq = set()
for i in range(m+1):
    for j in range(i,m+1):
        bisq.add(i**2+j**2)
seq = []
for b in range(1, max(bisq)):
    for a in bisq:
        x = a
        for i in range(n):
            if x not in bisq:
                break
            x += b
        else:
            seq.append((a,b))
The program outputs the correct answer, but it takes too long. I tried running the program with the max n/m values, and after 30 seconds, it was still going.
Disclaimer: this is not a full answer. It is more of a general direction to look in.
For each member of a sequence, you're looking for four parameters: two numbers to be squared and summed (q_i and p_i), and two differences to be used in the next step (x and y) such that
q_i**2 + p_i**2 + b = (q_i + x)**2 + (p_i + y)**2
Subject to:
0 <= q_i <= m
0 <= p_i <= m
0 <= q_i + x <= m
0 <= p_i + y <= m
There are too many unknowns so we can't get a closed form solution.
Let's fix b. (Still too many unknowns.)
Let's fix q_i, and also state that this is the first member of the sequence. I.e., let's start searching from q_1 = 0, extend as much as possible and then extract all sequences of length n. Still, there are too many unknowns.
Let's fix x: we only have p_i and y to solve for. At this point, note that the range of possible values that satisfy the equation is much smaller than the full range 0..m. Expanding the squares gives b = x*(2*q_i + x) + y*(2*p_i + y), and there are really not many values to check.
This last pruning step is what distinguishes it from a full search. If you write down this condition explicitly, you can get the range of possible p_i values and from that find the length of the possible sequence with step b as a function of q_i and x. Rejecting sequences shorter than n should further prune the search.
This should get you from O(m**4) complexity to ~O(m**2). It should be enough to get into the time limit.
A couple more things that might help prune the search space:
b <= 2*m*m//(n-1) (the last term, a + (n-1)*b, cannot exceed the largest bisquare, 2*m*m)
a <= 2*m*m - b*(n-1)
An answer on math.stackexchange says that for a number x to be a bisquare, any prime factor of x of the form 3 + 4k (e.g., 3, 7, 11, 19, ...) must have an even power. I think this means that for any n > 3, b has to be even. The first item in the sequence a is a bisquare, so it has an even number of factors of 3. If b is odd, then one of a+1b or a+2b will have an odd number of factors of 3 and therefore isn't a bisquare.
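The bounds on a and b can be put to work in a direct search. The following is only a minimal sketch, not the answerer's code and not tuned for the time limit: it assumes a progression of length n has terms a, a+b, ..., a+(n-1)*b with n >= 2, and the name find_progressions and the bytearray lookup table are choices made here for illustration.

```python
def find_progressions(n, m):
    """Return all (a, b) whose n-term progression lies in the bisquares.

    Assumes n >= 2 and terms a, a+b, ..., a+(n-1)*b.
    """
    limit = 2 * m * m  # largest possible bisquare

    # Lookup table: is_bisquare[v] == 1 iff v = p*p + q*q with 0 <= p, q <= m.
    is_bisquare = bytearray(limit + 1)
    for p in range(m + 1):
        for q in range(p, m + 1):
            is_bisquare[p * p + q * q] = 1
    bisquares = [v for v in range(limit + 1) if is_bisquare[v]]

    found = []
    for a in bisquares:
        # The last term a + (n-1)*b must not exceed limit.
        max_b = (limit - a) // (n - 1)
        for b in range(1, max_b + 1):
            if all(is_bisquare[a + k * b] for k in range(1, n)):
                found.append((a, b))

    # Order by difference b first, then by starting value a.
    found.sort(key=lambda ab: (ab[1], ab[0]))
    return found


print(find_progressions(5, 7))
```

Bounding b per starting value a keeps every index inside the table, so no range check is needed in the inner loop.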
n = 5
cube = n**3
def get_sum(n):
    a1 = n * (n - 1) + 1
    for i in range(a1, cube, 2):
        print(i, end='+')
print(f'{get_sum(n)}')
print(cube)
I have output:
21+23+25+27+29+31+33+35+37+39+41+43+45+47+49+51+53+55+57+59+61+63+65+67+69+71+73+75+77+79+81+83+85+87+89+91+93+95+97+99+101+103+105+107+109+111+113+115+117+119+121+123+None
125
How can I get a range till 29 so the sum of these numbers will be equal to cube in Python?
For example, 21+23+25+27+29 = 5^3
First, there is no need to write print(f'{get_sum(n)}'): your function doesn't return anything, so it returns None, which you can see at the end of your output. Calling get_sum(n) is enough.
Since you always loop n times, you can simplify your condition. In my solution I used a while loop with a variable that keeps track of the current sum of the numbers.
You can apply the same logic with a for loop, of course; this is just my implementation.
def get_sum(n):
    a1 = n * (n - 1) + 1
    total = a1  # running sum of the terms printed so far
    while total < cube:
        print(a1, end='+')
        a1 += 2
        total += a1
    print(a1, end='=')
n = 5
cube = n**3
get_sum(n)
print(cube)
output:
21+23+25+27+29=125
Inefficient approach:
Keep a variable that tracks the current sum to check if we need to break the loop or not (as mentioned in the other answers).
Efficient Approach:
n^3 can be expressed as a sum of n odd integers, which are symmetric about n^2. Examples:
3^3 = 7+9+11 (symmetric about 9)
4^3 = 13+15+17+19 (symmetric about 16)
5^3 = 21+23+25+27+29 (symmetric about 25)
Use this approach to get a simpler algorithm
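A minimal sketch of that idea: the n consecutive odd integers starting at n*(n-1)+1 are centred on n^2, so no running sum is needed. The function name cube_as_odd_sum is made up for this example.

```python
def cube_as_odd_sum(n):
    """Return the n consecutive odd integers, centred on n*n, that sum to n**3."""
    first = n * (n - 1) + 1  # smallest term, same formula as a1 in the question
    return [first + 2 * k for k in range(n)]


terms = cube_as_odd_sum(5)
print('+'.join(map(str, terms)) + '=' + str(sum(terms)))
# prints 21+23+25+27+29=125
```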
Suppose Alice and Bob have to number pages 1 to 200 by hand. If the range is simply split in half, Alice writes 1, 2, 3, ... up to 100 and Bob writes 101, 102, 103, ... up to 200, so Bob has to write a lot more digits than Alice! How can we split this numbering task fairly between the two of them?
Considering two integers, start & end, for the starting and ending page numbers (inclusive) defining the range of pages that needs handwritten numbering.
A page number has to be written by either Alice or Bob. They cannot jointly write one number.
Page numbers are all decimal integers count from 1. The missing number of pages can start from page 1 or anywhere in the middle of the notes.
Input: There are multiple tests in each test case.
Line 1: Integer N, the number of tests to follow.
Following N lines: Each line has two integers, st ed, for the starting and ending page numbers (inclusive) defining the range of pages that needs handwritten numbering.
#Input examples
4 #N=4 it means 4 following lines
1 200
8 10
9 10
8 11
import sys
import math
n = int(input()) #1 ≤ N ≤ 200
for i in range(n): #1 ≤ start < end ≤ 10,000,000
    start, end = [int(j) for j in input().split()]
Output:
N lines: for each test in the input, output the last page number that Alice is responsible for writing.
#Output examples
118
9
9
9
I tried, unsuccessfully, to get inspiration from this post on fair casting dice. I also wondered whether the solution is far from the one in Checking number of elements for Counter.
The first thing to note is that this cannot always be done exactly: consider the sequence 99, 100, which cannot be split fairly. That said, you can get pretty close (within ±1 digit), assuming you always start counting from 1.
start = 1
end = 200
bobs_numbers = []
alices_numbers = []
count = 0
for i in range(end, start - 1, -1):
    if count > 0:
        bobs_numbers.append(i)
        count -= len(str(i))
    else:
        alices_numbers.append(i)
        count += len(str(i))
print(bobs_numbers, alices_numbers, count)
This is an answer to the initial question. Since the question has been changed, I posted another answer for the new question.
The initial question was: Partition the set [1, 200] into two subsets such that the total number of digits in one subset is as close as possible to the total number of digits in the other subset.
Since user Mitchel Paulin already gave a straightforward iterative solution, let me give an arithmetic solution.
Counting the number of digits
First, let's count the total number of digits we want to split between Alice and Bob:
there are 9 numbers with 1 digit;
there are 90 numbers with 2 digits;
there are 101 numbers with 3 digits.
Total: 492 digits.
We want to give 246 digits to Alice and 246 digits to Bob.
What is the simplest way to get 246 digits by summing up numbers with 1, 2 and 3 digits?
246 = 3 * 82.
Let's give 82 numbers with 3 digits to Bob, and all the other numbers to Alice.
Finally Bob can handle numbers [119, 200] and Alice can handle numbers [1, 118].
Generalizing to any range [1, n]
Counting the numbers of numbers with each possible number of digits should be O(log n).
Dividing by 2 to get the number of digits for Bob is O(1).
Decomposing this number using dynamic programming is linear in the maximum number of digits, i.e., O(log n) space and time (this is exactly the coin change problem).
Transforming this decomposition into a union of ranges is straightforward, and linear in the maximum number of digits, so again O(log n). Deducing the ranges for Alice by "subtracting" Bob's ranges from [1, n] is also straightforward.
Conclusion: the algorithm is O(log n) space and time, as opposed to Mitchel Paulin's O(n) algorithm. The output is also logarithmic instead of linear, since it can be written as a union of ranges, instead of a long list.
This algorithm is a bit more complex to write, but the output being in the form of ranges mean that Alice and Bob won't bother each other too much by writing adjacent pages, which they would do a lot with the simpler algorithm (which mostly alternates between giving a number to Bob and giving a number to Alice).
Since the question has changed, this is an answer to the new question.
The new question is: Given a range [a, b], find number m such that the total number of digits in range [a, m] is as close as possible to the number of digits in range [m+1, b].
Algorithm explanation
The algorithm is simple: Start with m = (a + b) / 2, count the digits, then move m to the right or to the left to adjust.
To count the total number of digits in a range [1, n], we first count the number of unit digits (which is n); then add the number of tens digits (which is n - 9); then add the number of hundreds digits (which is n - 99); etc.
To count the total number of digits in a range [a, b], we take the difference between the total number of digits in ranges [1, b] and [1, a-1].
Note that the number of digits of a positive integer n is len(str(n)), or equivalently math.floor(math.log10(n)) + 1. The expression math.ceil(math.log10(n)) is not equivalent: it is one too small whenever n is a power of ten (it gives 0 for n = 1 and 2 for n = 100). The code below uses len(str(n)), so it needs no import.
Code in python
def count_digits_from_1(n):
    # total number of digits needed to write every number in [1, n]
    if n < 1:
        return 0
    total_digits = 0
    for i in range(1, len(str(n)) + 1):
        # how many numbers in [1, n] have a digit at position i
        digits_at_pos_i = n - (10**(i-1) - 1)
        total_digits += digits_at_pos_i
    return total_digits
def count_digits(a, b):
    # assumes 1 <= a <= b
    return count_digits_from_1(b) - count_digits_from_1(a-1)
def share_digits(a, b):
    total_digits = count_digits(a, b)
    m = (a + b) // 2
    alices_digits = count_digits(a, m)
    bobs_digits = total_digits - alices_digits
    direction = 1 if alices_digits < bobs_digits else -1
    could_be_more_fair = True
    while could_be_more_fair:
        new_m = m + direction
        # the page that changes hands: new_m when moving right, m when moving left
        moved_page = new_m if direction == 1 else m
        diff = len(str(moved_page))
        new_alices_digits = alices_digits + direction * diff
        new_bobs_digits = bobs_digits - direction * diff
        if abs(alices_digits - bobs_digits) > abs(new_alices_digits - new_bobs_digits):
            alices_digits = new_alices_digits
            bobs_digits = new_bobs_digits
            m = new_m
        else:
            could_be_more_fair = False
    return ((a, m), (m+1, b))
if __name__=='__main__':
    for (a, b) in [(1, 200), (8, 10), (9, 10), (8, 11)]:
        print('{},{} ---> '.format(a,b), end='')
        print(share_digits(a, b))
Output:
1,200 ---> ((1, 118), (119, 200))
8,10 ---> ((8, 9), (10, 10))
9,10 ---> ((9, 9), (10, 10))
8,11 ---> ((8, 9), (10, 11))
Remark: This code uses the assumption 1 <= a <= b.
Performance analysis
Function count_digits_from_1 executes in O(log n); its for loop iterates over the position of the digits to count the number of unit digits, then the number of tens digits, then the number of hundreds digits, etc. There are about log10(n) positions.
The question is: how many iterations will the while loop in share_digits have?
If we're lucky, the final value of m will be very close to the initial value (a+b)//2, so the number of iterations of this loop might be O(1). This remains to be proven.
If the number of iterations of this loop is too high, the algorithm could be improved by getting rid of this loop entirely, and calculating the final value of m directly. Indeed, replacing m with m+1 or m-1 changes the difference abs(alices_digits - bobs_digits) by exactly two times the number of digits of m+1 (or m-1). Therefore, the final value of m should be given approximately by:
new_m = m + direction * (abs(alices_digits - bobs_digits) // (2 * len(str(m))))
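Another way to avoid the step-by-step loop, which is not what the answer above proposes: since the number of digits in [a, m] grows with m, the split point can be found by binary search. The sketch below is self-contained; digits_up_to and fair_split are names invented here, and ties are broken in favour of the smaller page number, which is an assumption made to match the sample outputs in the question.

```python
def digits_up_to(n):
    """Total number of digits needed to write every integer in [1, n]."""
    total = 0
    power = 1
    while power <= n:
        total += n - power + 1  # numbers in [1, n] with a digit at this position
        power *= 10
    return total


def fair_split(a, b):
    """Return the last page m for Alice so that [a, m] and [m+1, b] are balanced.

    Assumes 1 <= a < b.
    """
    base = digits_up_to(a - 1)
    total = digits_up_to(b) - base

    # Binary search for the smallest m whose prefix holds at least half the digits.
    lo, hi = a, b
    while lo < hi:
        mid = (lo + hi) // 2
        if 2 * (digits_up_to(mid) - base) >= total:
            hi = mid
        else:
            lo = mid + 1

    # The best split is either lo or lo - 1; prefer the smaller page on a tie.
    best = lo
    if lo > a:
        over = 2 * (digits_up_to(lo) - base) - total
        under = total - 2 * (digits_up_to(lo - 1) - base)
        if under <= over:
            best = lo - 1
    return best


for a, b in [(1, 200), (8, 10), (9, 10), (8, 11)]:
    print(a, b, fair_split(a, b))
```

Each probe costs O(log n) and the search makes O(log n) probes, so the number of iterations no longer depends on how far the final m is from the midpoint.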
I'm trying to maximize the ratio n/phi(n), where phi is Euler's totient function, in Python, given that it can handle arbitrarily large integers. The problem is that the program gets killed after some time, so it never reaches the desired ratio. I have thought of starting from a larger number, but I don't think it's prudent to do so. I'm trying to find a number whose ratio to its totient is higher than 10. Essentially, I'm trying to find a sparsely totient number that fits these criteria.
Here's my phi function:
import math  # fractions.gcd was removed in Python 3.9; math.gcd replaces it

def phi(n):
    amount = 0
    for k in range(1, n + 1):
        if math.gcd(n, k) == 1:
            amount += 1
    return amount
The most likely candidates for high ratios of N/phi(N) are products of prime numbers. If you're just looking for one number with a ratio > 10, then you can generate primes and only check the product of primes up to the point where you get the desired ratio
def totientRatio(maxN,ratio=10):
    primes = []
    primeProd = 1
    isPrime = [1]*(maxN+1)
    p = 2
    while p*p<=maxN:
        if isPrime[p]:
            isPrime[p*p::p] = [0]*len(range(p*p,maxN+1,p))
            primes.append(p)
            primeProd *= p
            tot = primeProd
            for f in primes:
                tot -= tot//f
            if primeProd/tot >= ratio:
                return primeProd,primeProd/tot,len(primes)
        p += 1 + (p&1)
output:
totientRatio(10**6)
16516447045902521732188973253623425320896207954043566485360902980990824644545340710198976591011245999110,
10.00371973209101,
55
This gives you the smallest number with that ratio. Multiples of that number will have the same ratio.
n = 16516447045902521732188973253623425320896207954043566485360902980990824644545340710198976591011245999110
n*2/totient(n*2) = 10.00371973209101
n*11*13/totient(n*11*13) = 10.00371973209101
No number will have a higher ratio until you reach the next product of primes (i.e. that number multiplied by the next prime).
n*263/totient(n*263) = 10.041901868473037
Removing a prime from the product affects the ratio by a proportion of (1-1/P).
For example if m = n/109, then m/phi(m) = n/phi(n) * (1-1/109)
(n//109) / totient(n//109) = 9.91194248684247
10.00371973209101 * (1-1/109) = 9.91194248684247
This should allow you to navigate the ratios efficiently and find the numbers that meet your needs.
For example, to get a number with a ratio that is >= 10 but closer to 10, you can go to the next prime product(s) and remove one or more of the smaller primes to reduce the ratio. This can be done using combinations (from itertools) and will allow you to find very specific ratios:
m = n*263/241
m/totient(m) = 10.000234225865265
m = n*(263...839) / (7 * 61 * 109 * 137) # 839 is 146th prime
m/totient(m) = 10.000000079805726
I have a partial solution for you, but the results don't look good. (This solution may not give you an answer on current computer hardware; the amount of RAM is the limiting factor.) I took an answer from this code golf challenge and modified it to output the ratios n/phi(n) up to a particular n.
import numba as nb
import numpy as np
import time
n = int(2**31)
@nb.njit("i4[:](i4[:])", locals=dict(
    n=nb.int32, i=nb.int32, j=nb.int32, q=nb.int32, f=nb.int32))
def summarum(phi):
    #calculate phi(i) for i: 1 - n
    #taken from https://codegolf.stackexchange.com/a/26753/42652
    phi[1] = 1
    i = 2
    while i < n:
        if phi[i] == 0:
            phi[i] = i - 1
            j = 2
            while j * i < n:
                if phi[j] != 0:
                    q = j
                    f = i - 1
                    while q % i == 0:
                        f *= i
                        q //= i
                    phi[i * j] = f * phi[q]
                j += 1
        i += 1
    #divide each i by phi(i) to get the (integer) ratio i/phi(i)
    i = 1
    while i < n: #jit compiled while loop is faster than: for i in range(): blah blah blah
        phi[i] = i//phi[i]
        i += 1
    return phi
if __name__ == "__main__":
    s1 = time.time()
    a = summarum(np.zeros(n, np.int32))
    locations = np.where(a >= 10)
    print(len(locations[0]))  # np.where returns a tuple of index arrays
I only have enough RAM on my work computer to test about 0 < n < 10^8, and the largest ratio was about 6. You may or may not have any luck going up to larger n, although 10^8 already took several seconds (not sure what the overhead was... Spyder's been acting strange lately).
p_55# is a sparsely totient number satisfying the desired condition.
Furthermore, all subsequent primorial numbers are as well, because p_n# / phi(p_n#) is a strictly increasing sequence:
p_1# / phi(p_1#) is 2, which is positive. For n > 1, p_n# / phi(p_n#) is equal to (p_(n-1)# * p_n) / phi(p_(n-1)# * p_n), which, since p_n and p_(n-1)# are coprime, is equal to (p_(n-1)# / phi(p_(n-1)#)) * (p_n / phi(p_n)). We know p_n > phi(p_n) > 0 for all n, so p_n / phi(p_n) > 1. So we have that the sequence p_n# / phi(p_n#) is strictly increasing.
I do not believe these to be the only sparsely totient numbers satisfying your request, but I don't have an efficient way of generating the others coming to mind. Generating primorials, by comparison, amounts to generating the first n primes and multiplying the list together (whether by using functools.reduce(), math.prod() in 3.8+, or ye old for loop).
As for the general question of writing a phi(n) function, I would probably first find the prime factors of n, then use Euler's product formula for phi(n). As an aside, make sure to NOT use floating-point division. Even finding the prime factors of n by trial division should outperform computing gcd n times, but when working with large n, replacing this with an efficient prime factorization algorithm will pay dividends. Unless you want a good cross to die on, don't write your own. There's one in sympy that I'm aware of, and given the ubiquity of the problem, probably plenty of others around. Time as needed.
Speaking of timing, if this is still relevant enough to you (or a future reader) to want to time... definitely throw the previous answer in the mix as well.
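The "factor first, then apply Euler's product formula" approach described in this answer can be sketched as follows. This is a minimal illustration using plain trial division and integer arithmetic only; for large n a proper factorization routine should replace the loop.

```python
def phi(n):
    """Euler's totient via trial-division factorization and Euler's product formula."""
    result = n
    p = 2
    while p * p <= n:
        if n % p == 0:
            # Strip every copy of the prime factor p from n.
            while n % p == 0:
                n //= p
            # Multiply result by (1 - 1/p) without leaving the integers.
            result -= result // p
        p += 1
    if n > 1:
        # Whatever remains is a single prime factor larger than sqrt(original n).
        result -= result // n
    return result


print(phi(36))  # prints 12
```

The update result -= result // p is exact because result is always divisible by p at that point, so no floating-point division is involved.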
I have a simple math algorithm. All it does is take an input and find i, j such that i^2 + j^2 = input, with the restriction that j >= i (so that it doesn't print its counterpart: e.g., 2^2 + 3^2 == 3^2 + 2^2, but I only need the former, as j >= i).
For my code, I did the following: I have 2 for loops, the first for i and the second for j. It takes both the i and j values and tests whether i^2 + j^2 == input and j >= i. If yes, it prints the pair and updates the count.
The problem is that with large inputs it takes a very long time, as it loops from 1 to 2000 and then, inside that loop, from 1 to 2000 again.
def some_mathfn(n):
    count = 0
    for i in range(1,n+1):
        for j in range(1,n+1):
            if(i**2 + j**2 == n and j >= i):
                print(i, '^2 + ', j,'^2')
                count += 1
    return count
some_mathfn(2001)
You've got an O(n^2) algorithm for no obvious reason. It's easy to make this O(n^(1/2))...
Loop from 1 to the square root of n/2 (for variable i), because when i is greater than sqrt(n/2), i*i + j*j will be greater than n for any j greater than or equal to i.
Subtract the square of i from n
Take the square root of the result, and find the nearest integer; call that j
Check whether the condition you're interested in holds
The last two steps are effectively just checking that the square root of n - i*i is actually an integer, but in some cases (for very large values of n) finding the nearest integer and then checking the condition could be a more reliable approach, in order to avoid floating point limitations causing issues, where the nearest-representable double to the theoretical result could be an integer, despite that actual result not being an integer. This would only happen for really large values of n, but...
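The steps above can be sketched in Python as follows. This is an illustrative version, not the answerer's code: it uses math.isqrt (Python 3.8+), which works on exact integers and so sidesteps the floating-point concern entirely, and the function name is made up for this example.

```python
import math


def sum_of_two_squares(n):
    """Return all (i, j) with 1 <= i <= j and i*i + j*j == n."""
    pairs = []
    i = 1
    while 2 * i * i <= n:          # i only needs to go up to sqrt(n/2)
        rest = n - i * i           # the value j*j would have to equal
        j = math.isqrt(rest)       # exact integer square root, rounded down
        if j * j == rest:          # rest is a perfect square, and j >= i holds
            pairs.append((i, j))
        i += 1
    return pairs


print(sum_of_two_squares(50))  # prints [(1, 7), (5, 5)]
```

Because the loop stops at i*i <= n/2, the remainder n - i*i is always at least i*i, so j >= i comes for free and needs no separate test.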
I want to generate the digits of the square root of two to 3 million digits.
I am aware of Newton-Raphson but I don't have much clue how to implement it in C or C++ due to lack of biginteger support. Can somebody point me in the right direction?
Also, if anybody knows how to do it in python (I'm a beginner), I would also appreciate it.
You could try using the mapping:
a/b -> (a+2b)/(a+b) starting with a= 1, b= 1. This converges to sqrt(2) (in fact gives the continued fraction representations of it).
Now the key point: This can be represented as a matrix multiplication (similar to fibonacci)
If a_n and b_n are the nth numbers in the steps then
[1 2] [a_n b_n]^T = [a_(n+1) b_(n+1)]^T
[1 1]
which now gives us
[1 2]^n [a_1 b_1]^T = [a_(n+1) b_(n+1)]^T
[1 1]
Thus if the 2x2 matrix is A, we need to compute A^n, which can be done by repeated squaring and only uses integer arithmetic (so you don't have to worry about precision issues).
Also note that the a/b you get will always be in reduced form (as gcd(a,b) = gcd(a+2b, a+b)), so if you are thinking of using a fraction class to represent the intermediate results, don't!
Since the nth denominator grows like (1+sqrt(2))^n, to get 3 million digits you would likely need to compute up to the 3671656th term.
Note, even though you are looking for the ~3.6 millionth term, repeated squaring will allow you to compute the nth term in O(Log n) multiplications and additions.
Also, this can easily be made parallel, unlike the iterative ones like Newton-Raphson etc.
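A minimal sketch of the repeated-squaring idea, using Python's built-in big integers. The helper names mat_mul, mat_pow and sqrt2_convergent are invented for this example, and extracting decimal digits from the resulting fraction is left out.

```python
def mat_mul(x, y):
    """Multiply two 2x2 integer matrices given as ((a, b), (c, d))."""
    (a, b), (c, d) = x
    (e, f), (g, h) = y
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))


def mat_pow(m, n):
    """Raise a 2x2 matrix to the n-th power by repeated squaring."""
    result = ((1, 0), (0, 1))  # identity matrix
    while n:
        if n & 1:
            result = mat_mul(result, m)
        m = mat_mul(m, m)
        n >>= 1
    return result


def sqrt2_convergent(n):
    """Apply a/b -> (a+2b)/(a+b) n times to 1/1; return (a, b)."""
    (p, q), (r, s) = mat_pow(((1, 2), (1, 1)), n)
    # A^n applied to the column vector (a_1, b_1) = (1, 1)
    return p + q, r + s


a, b = sqrt2_convergent(3)
print(a, b)  # prints 17 12
```

Only O(log n) matrix multiplications are needed, though each one works on ever larger integers.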
EDIT: I like this version better than the previous. It's a general solution that accepts both integers and decimal fractions; with n = 2 and precision = 100000, it takes about two minutes. Thanks to Paul McGuire for his suggestions & other suggestions welcome!
def sqrt_list(n, precision):
    ndigits = [] # break n into list of digits
    n_int = int(n)
    n_fraction = n - n_int
    while n_int: # generate list of digits of integral part
        ndigits.append(n_int % 10)
        n_int //= 10 # integer division (plain / would produce a float in Python 3)
    if len(ndigits) % 2: ndigits.append(0) # ndigits will be processed in groups of 2
    decimal_point_index = len(ndigits) // 2 # remember decimal point position
    while n_fraction: # insert digits from fractional part
        n_fraction *= 10
        ndigits.insert(0, int(n_fraction))
        n_fraction -= int(n_fraction)
    if len(ndigits) % 2: ndigits.insert(0, 0) # ndigits will be processed in groups of 2
    rootlist = []
    root = carry = 0 # the algorithm
    while root == 0 or (len(rootlist) < precision and (ndigits or carry != 0)):
        carry = carry * 100
        if ndigits: carry += ndigits.pop() * 10 + ndigits.pop()
        x = 9
        while (20 * root + x) * x > carry:
            x -= 1
        carry -= (20 * root + x) * x
        root = root * 10 + x
        rootlist.append(x)
    return rootlist, decimal_point_index
As for arbitrary big numbers you could have a look at The GNU Multiple Precision Arithmetic Library (for C/C++).
For work? Use a library!
For fun? Good for you :)
Write a program to imitate what you would do with pencil and paper. Start with 1 digit, then 2 digits, then 3, ..., ...
Don't worry about Newton or anybody else. Just do it your way.
Here is a short version for calculating the square root of an integer a to digits of precision. It works by finding the integer square root of a after multiplying by 10 raised to the 2 x digits.
def sqroot(a, digits):
    a = a * (10**(2*digits))
    x_prev = 0
    x_next = 1 * (10**digits)
    while x_prev != x_next:
        x_prev = x_next
        x_next = (x_prev + (a // x_prev)) >> 1
    return x_next
Just a few caveats.
You'll need to convert the result to a string and add the decimal point at the correct location (if you want the decimal point printed).
Converting a very large integer to a string isn't very fast.
Dividing very large integers isn't very fast (in Python) either.
Depending on the performance of your system, it may take an hour or longer to calculate the square root of 2 to 3 million decimal places.
I haven't proven the loop will always terminate. It may oscillate between two values differing in the last digit. Or it may not.
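For the first caveat, placing the decimal point could look like the sketch below. It is not the answerer's code: it swaps the hand-written Newton loop for math.isqrt (Python 3.8+), which is guaranteed to terminate, and sqrt_digits is a hypothetical helper name. It assumes a is an integer >= 1 and digits >= 1.

```python
import math


def sqrt_digits(a, digits):
    """Return sqrt(a) as a string, truncated to `digits` decimal places.

    Assumes a is an integer >= 1 and digits >= 1.
    """
    # Integer square root of a scaled by 10**(2*digits), as in sqroot above.
    scaled = math.isqrt(a * 10 ** (2 * digits))
    text = str(scaled)
    # The last `digits` characters are the fractional part.
    return text[:-digits] + '.' + text[-digits:]


print(sqrt_digits(2, 10))  # prints 1.4142135623
```

For millions of digits, recent Python versions limit int-to-str conversion length by default, so sys.set_int_max_str_digits may need to be raised first.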
The nicest way is probably using the continued fraction expansion [1; 2, 2, ...] of the square root of two.
def root_two_cf_expansion():
    yield 1
    while True:
        yield 2
def z(a,b,c,d, contfrac):
    for x in contfrac:
        while a > 0 and b > 0 and c > 0 and d > 0:
            t = a // c
            t2 = b // d
            if not t == t2:
                break
            yield t
            a = (10 * (a - c*t))
            b = (10 * (b - d*t))
            # continue with same fraction, don't pull new x
        a, b = x*a+b, a
        c, d = x*c+d, c
    for digit in rdigits(a, c):
        yield digit
def rdigits(p, q):
    while p > 0:
        if p > q:
            d = p // q
            p = p - q * d
        else:
            d = (10 * p) // q
            p = 10 * p - q * d
        yield d
def decimal(contfrac):
    return z(1,0,0,1,contfrac)
decimal(root_two_cf_expansion()) returns an iterator of all the decimal digits. t and t2 in the algorithm are the minimum and maximum values of the next digit. When they are equal, we output that digit.
Note that this does not handle certain exceptional cases such as negative numbers in the continued fraction.
(This code is an adaptation of Haskell code for handling continued fractions that has been floating around.)
Well, the following is the code that I wrote. It generated a million digits after the decimal point for the square root of 2 in about 60800 seconds for me, but my laptop was sleeping while it was running the program, so it should be faster than that. You can try to generate 3 million digits, but it might take a couple of days to get them.
def sqrt(number,digits_after_decimal=20):
    import time
    start=time.time()
    original_number=number
    number=str(number)
    list=[]
    # locate the decimal point, appending one if the input has none
    for a in range(len(number)):
        if number[a]=='.':
            decimal_point_locaiton=a
            break
        if a==len(number)-1:
            number+='.'
            decimal_point_locaiton=a+1
    # pad so that both the integer and fractional parts have an even number of digits
    if decimal_point_locaiton/2!=round(decimal_point_locaiton/2):
        number='0'+number
        decimal_point_locaiton+=1
    if len(number)/2!=round(len(number)/2):
        number+='0'
    number=number[:decimal_point_locaiton]+number[decimal_point_locaiton+1:]
    decimal_point_ans=int((decimal_point_locaiton-2)/2)+1
    # split the digits into pairs
    for a in range(0,len(number),2):
        if number[a]!='0':
            list.append(int(number[a:a+2]))
        else:
            try:
                list.append(int(number[a+1]))
            except IndexError:
                pass
    p=0
    c=list[0]
    x=0
    ans=''
    # digit-by-digit square root over the pairs of the input
    for a in range(len(list)):
        x=0 # restart the digit search for every pair
        while c>=(20*p+x)*(x):
            x+=1
        y=(20*p+x-1)*(x-1)
        p=p*10+x-1
        ans+=str(x-1)
        c-=y
        try:
            c=c*100+list[a+1]
        except IndexError:
            c=c*100
    # keep producing digits after the input is exhausted
    while c!=0:
        x=0
        while c>=(20*p+x)*(x):
            x+=1
        y=(20*p+x-1)*(x-1)
        p=p*10+x-1
        ans+=str(x-1)
        c-=y
        c=c*100
        if len(ans)-decimal_point_ans>=digits_after_decimal:
            break
    ans=ans[:decimal_point_ans]+'.'+ans[decimal_point_ans:]
    total=time.time()-start
    return ans,total
Python already supports big integers out of the box, and if that's the only thing holding you back in C/C++ you can always write a quick container class yourself.
The only problem you've mentioned is a lack of big integers. If you don't want to use a library for that, then are you looking for help writing such a class?
Here's a more efficient integer square root function (in Python 3.x) that should terminate in all cases. It starts with a number much closer to the square root, so it takes fewer steps. Note that int.bit_length requires Python 3.1+. Error checking left out for brevity.
def isqrt(n):
    x = (n >> n.bit_length() // 2) + 1
    result = (x + n // x) // 2
    while abs(result - x) > 1:
        x = result
        result = (x + n // x) // 2
    while result * result > n:
        result -= 1
    return result