Hopefully a simple one: I have a number, say 1234567.890. The number could be anything, but it will always be this length.
How do I strip the first 3 digits so that it becomes 4567.890?
The number could be anything, so simply subtracting 1230000 will not work.
I'm working with map data in UTM coordinates (but that should not matter)
Example
x = 580992.528
y = 4275267.719
For x, I want 992.528
For y, I want 267.719
Note: y has an extra digit to the left of the decimal point, so 4 digits need removing.
You can use slices for this:
x = 1234567.890
# This is still a float
x = float(str(x)[3:])
print(x)
Outputs:
4567.89
The slice [3:] starts at index 3 and continues to the end of the string.
Update after your edit
The simplest way is to use Decimal:
from decimal import Decimal

# values from the question
x = 580992.528
y = 4275267.719

def fmod(v, m=1000, /):
    return float(Decimal(str(v)) % m)

print(fmod(x))
print(fmod(y))
Output
992.528
267.719
If you don't go through a string, you will run into floating-point rounding issues in Python.
Demo:
n = 1234567.890
i = 0
while True:
    m = int(n // 10**i)
    if m < 1000:
        break
    i += 1
r = n % 10**i
Output:
>>> r
4567.889999999898
>>> round(r, 3)
4567.89
Same with Decimal from decimal module:
from decimal import Decimal
n = 1234567.890
n = Decimal(str(n))
i = 0
while True:
    m = int(n // 10**i)
    if m < 1000:
        break
    i += 1
r = n % 10**i
Output:
>>> r
Decimal('4567.89')
>>> float(r)
4567.89
This approach simply implements your idea.
int_len is the length of the integer part that we keep
sub is the rounded value that we will subtract from the original float
Code
Here is the code that implements your idea.
import math
def trim(n, digits):
    int_len = len(str(int(n))) - digits  # length of 4567
    sub = math.floor(n / 10**int_len) * 10**int_len
    print(n - sub)
But as Kelly Bundy has pointed out, you can use modulo operation to avoid the complicated process of finding the subtrahend.
def trim(n, digits):
    int_len = len(str(int(n))) - digits  # length of 4567
    print(n % 10**int_len)
Output
The floating point thing is a bit cursed and you may want to take Corralien's answer as an alternative.
>>> n = 1234567.890
>>> trim(n, 3)
4567.889999999898
def get_slice(number, split_n):
    return number - (number // 10**split_n) * 10**split_n
To explain this, this is basically a way to shrink floating point vector data into 8-bit or 16-bit signed or unsigned integers with a single common unsigned exponent (the most common of which being bs16 for precision with a common exponent of 11).
I'm not sure what this pseudo-float method is called; all I know is to get the resulting float, you need to do this:
float_result = int_value / ( 2.0 ** exponent )
What I'd like to do is match this data by basically guessing the exponent by attempting to re-calculate it from the given floats.
(if done properly, it should be able to be re-calculated in other formats as well)
So if all I'm given is a large group of 1140 floats to work with, how can I find the common exponent and convert these floats into this shrunken bu8, bs8, bu16, or bs16 (specified) format?
EDIT: samples
>>> for value in array('h','\x28\xC0\x04\xC0\xF5\x00\x31\x60\x0D\xA0\xEB\x80'):
...     print( value / ( 2. ** 11 ) )
-7.98046875
-7.998046875
0.11962890625
12.0239257812
-11.9936523438
-15.8852539062
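The REPL sample above relies on Python 2 string and array semantics. A rough Python 3 sketch of the same decoding, plus the reverse step (floats back to 16-bit signed integers with a known exponent), might look like the following. The little-endian byte order and the helper names decode_bs16 and encode_bs16 are assumptions made for illustration.

```python
import struct

# Same six 16-bit values as in the sample, assumed to be little-endian.
RAW = b'\x28\xC0\x04\xC0\xF5\x00\x31\x60\x0D\xA0\xEB\x80'
EXPONENT = 11


def decode_bs16(raw, exponent):
    """Unpack signed 16-bit ints and scale them by 2**-exponent."""
    count = len(raw) // 2
    ints = struct.unpack('<%dh' % count, raw)
    return [value / 2.0 ** exponent for value in ints]


def encode_bs16(floats, exponent):
    """Scale floats by 2**exponent, round, and pack as signed 16-bit ints."""
    ints = [int(round(f * 2.0 ** exponent)) for f in floats]
    return struct.pack('<%dh' % len(ints), *ints)


floats = decode_bs16(RAW, EXPONENT)
round_trip = encode_bs16(floats, EXPONENT)
```

With the correct exponent the round trip reproduces the original bytes exactly, which is the property any guessed exponent has to satisfy.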
EDIT2:
I wouldn't exactly call this "compression", as all it really is, is an extracted mantissa to be re-computed via the shared exponent.
Maybe something like this:
def validExponent(x,e,a,b):
    """checks if x*2.0**e is an integer in range [a,b]"""
    y = x*2.0**e
    return a <= y <= b and y == int(y)

def allValid(xs,e,a,b):
    return all(validExponent(x,e,a,b) for x in xs)

def firstValid(xs,a,b,maxE = 100):
    for e in xrange(1+maxE):
        if allValid(xs,e,a,b):
            return e
    return "None found"

#test:
xs = [x / ( 2. ** 11 ) for x in [-12,14,-5,16,28]]
print xs
print firstValid(xs,-2**15,2**15-1)
Output:
[-0.005859375, 0.0068359375, -0.00244140625, 0.0078125, 0.013671875]
11
You could of course write a wrapper function which will take a string argument such as 'bs16' and automatically compute the bounds a,b
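A minimal sketch of such a wrapper, assuming the format names follow the pattern 'b' + 's' or 'u' + bit width (as in bu8, bs8, bu16, bs16); the function name bounds_for is hypothetical:

```python
def bounds_for(fmt):
    """Map a format name such as 'bs16' or 'bu8' to its integer range (a, b)."""
    if len(fmt) < 3 or fmt[0] != 'b' or fmt[1] not in 'su':
        raise ValueError("unrecognised format: %r" % (fmt,))
    bits = int(fmt[2:])
    if fmt[1] == 's':
        # two's-complement signed range
        return -2 ** (bits - 1), 2 ** (bits - 1) - 1
    # unsigned range
    return 0, 2 ** bits - 1
```

The returned pair can be passed straight through as the a and b arguments of the search function.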
On Edit:
1) If you have the exact values of the floats, the above should work. If anything has introduced round-off error, you might want to replace y == int(y) by abs(y-round(y)) < 0.00001 (or something similar).
2) The first valid exponent will be the exponent you want unless all of the integers in the original integer list are even. If you have 1140 values and they are in some sense random, the chance of this happening is vanishingly small.
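Both points can be illustrated with a small Python 3 sketch. The names is_valid and first_valid are hypothetical stand-ins for the functions above, and the tolerance of 1e-5 is only the example value suggested in point 1.

```python
def is_valid(x, e, a, b, tol=1e-5):
    """True if x * 2**e lies in [a, b] and is within tol of an integer."""
    y = x * 2.0 ** e
    return a <= y <= b and abs(y - round(y)) < tol


def first_valid(xs, a, b, max_e=100, tol=1e-5):
    """Smallest exponent that works for every value, or None if there is none."""
    for e in range(max_e + 1):
        if all(is_valid(x, e, a, b, tol) for x in xs):
            return e
    return None


# Point 1: values carrying a little round-off noise are still matched.
noisy = [v / 2.0 ** 11 + 1e-9 for v in [-12, 14, -5, 16, 28]]
e_noisy = first_valid(noisy, -2 ** 15, 2 ** 15 - 1)

# Point 2: if every original integer is even, a smaller exponent also fits.
all_even = [v / 2.0 ** 11 for v in [-12, 14, 16, 28]]
e_even = first_valid(all_even, -2 ** 15, 2 ** 15 - 1)
```

Here e_noisy comes out as 11, while e_even is 10 because halving every integer still leaves whole numbers.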
On Further Edit: If the floats in question are not generated by this process but you want to find an optimal exponent which allows for (lossy) compression to ints of a given size you can do something like this (not thoroughly tested):
import math

def maxExp(x,a,b):
    """returns largest nonnegative integer exponent e with
    a <= x*2**e <= b, where a, b are integers with a <= 0 and b > 0
    Throws an error if no such e exists"""
    if x == 0.0:
        e = -1
    elif x < 0.0:
        e = -1 if a == 0 else math.floor(math.log(a/float(x),2))
    else:
        e = math.floor(math.log(b/float(x),2))
    if e >= 0:
        return int(e)
    else:
        raise ValueError()

def bestExponent(floats,a,b):
    m = min(floats)
    M = max(floats)
    e1 = maxExp(m,a,b)
    e2 = maxExp(M,a,b)
    MSE = []
    for e in range(1+min(e1,e2)):
        MSE.append(sum((x - round(x*2.0**e)/2.0**e)**2 for x in floats)/float(len(floats)))
    minMSE = min(MSE)
    for e,error in enumerate(MSE):
        if error == minMSE:
            return e
To test it:
>>> import random
>>> xs = [random.uniform(-10,10) for i in xrange(1000)]
>>> bestExponent(xs,-2**15,2**15-1)
11
It seems like the common exponent 11 is chosen for a reason.
If you've got the original values and the corresponding results, you can use a logarithm to find the exponent. The math module has a log function: take the base-2 logarithm of int_value / float_result.
EG:
import math
x = int_value / float_result
math.log(x, 2)
I was curious if any of you could come up with a more streamlined version of this code to calculate Brown numbers. At the moment, it can get to ~650! before it slows to a crawl. Brown numbers are the pairs (n, m) satisfying the equation n! + 1 = m**2, where m is an integer.
brownNum = 8
import math

def squareNum(n):
    x = n // 2
    seen = set([x])
    while x * x != n:
        x = (x + (n // x)) // 2
        if x in seen: return False
        seen.add(x)
    return True

while True:
    for i in range(math.factorial(brownNum)+1,math.factorial(brownNum)+2):
        if squareNum(i) is True:
            print("pass")
            print(brownNum)
            print(math.factorial(brownNum)+1)
            break
    else:
        print(brownNum)
        print(math.factorial(brownNum)+1)
        brownNum = brownNum + 1
        continue
    break
print(input(" "))
Sorry, I don't understand the logic behind your code.
I don't understand why you calculate math.factorial(brownNum) 4 times with the same value of brownNum each time through the while True loop. And in the for loop:
for i in range(math.factorial(brownNum)+1,math.factorial(brownNum)+2):
i will only take on the value of math.factorial(brownNum)+1
Anyway, here's my Python 3 code for a brute force search of Brown numbers. It quickly finds the only 3 known pairs, and then proceeds to test all the other numbers under 1000 in around 1.8 seconds on this 2GHz 32 bit machine. After that point you can see it slowing down (it hits 2000 around the 20 second mark) but it will chug along happily until the factorials get too large for your machine to hold.
I print progress information to stderr so that it can be separated from the Brown_number pair output. Also, stderr doesn't require flushing when you don't print a newline, unlike stdout (at least, it doesn't on Linux).
import sys

# Calculate the integer square root of `m` using Newton's method.
# Returns r: r**2 <= m < (r+1)**2
def int_sqrt(m):
    if m <= 0:
        return 0
    n = m << 2
    r = n >> (n.bit_length() // 2)
    while True:
        d = (n // r - r) >> 1
        r += d
        if -1 <= d <= 1:
            break
    return r >> 1

# Search for Browns numbers
fac = i = 1
while True:
    if i % 100 == 0:
        print('\r', i, file=sys.stderr, end='')
    fac *= i
    n = fac + 1
    r = int_sqrt(n)
    if r*r == n:
        print('\nFound', i, r)
    i += 1
You might want to:
- precalculate your square numbers, instead of testing for them on the fly
- precalculate your factorial once per loop iteration, num_fac = math.factorial(brownNum), instead of making multiple calls
- implement your own, memoized, factorial
That should let you run to the hard limits of your machine.
One optimization I would make is to implement a 'wrapper' function around math.factorial that caches previous values of the factorial, so that as your brownNum increases, factorial doesn't have as much work to do. This is known as 'memoization' in computer science.
edit: found another SO answer with similar intention: Python: Is math.factorial memoized?
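A minimal sketch of that idea, written as a small callable class rather than a literal wrapper around math.factorial (the class name FactorialCache is made up for this example). Each new value is obtained from the previous one with a single multiplication:

```python
class FactorialCache:
    """Caches k! for every k requested so far and extends the cache on demand."""

    def __init__(self):
        self._values = [1]  # self._values[k] == k!

    def __call__(self, n):
        if n < 0:
            raise ValueError("factorial is not defined for negative numbers")
        # Extend the table one step at a time: k! = (k-1)! * k
        while len(self._values) <= n:
            k = len(self._values)
            self._values.append(self._values[-1] * k)
        return self._values[n]


factorial = FactorialCache()
```

Keeping every factorial in memory gets expensive for large n. If the values are only ever needed in increasing order, holding just the most recent one (a running product) is enough.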
You should also initialize the square root more closely to the root.
e = int(math.log(n,4))
x = n//2**e
Because 4**e <= n < 4**(e+1), the square root lies between x/2 and x, which should yield quadratic convergence of Heron's formula from the first iteration on.
For a given x < 10^15, quickly and accurately determine the maximum integer p such that 2^p <= x.
Here are some things I've tried:
First I tried this but it's not accurate for large numbers:
>>> from math import log
>>> x = 2**3
>>> x
8
>>> p = int(log(x, 2))
>>> 2**p == x
True
>>> x = 2**50
>>> p = int(log(x, 2))
>>> 2**p == x #not accurate for large numbers?
False
I could try something like:
p = 0  # start at 0 so that 2**p <= n holds when the loop exits
i = 1
while True:
    if i * 2 > n:
        break
    i *= 2
    p += 1
not_p = n - p
Which would take up to 50 operations if p was 50
I could pre-compute all the powers of 2 up until 2^50, and use binary search to find p. This would take around log(50) operations but seems a bit excessive and ugly?
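For reference, the precompute-and-search idea is only a few lines with the standard bisect module. This is a sketch assuming integer x with 1 <= x < 10**15; the names POWERS and floor_log2 are illustrative.

```python
from bisect import bisect_right

# 2**0 .. 2**50; since 10**15 < 2**50, this covers every x below 10**15.
POWERS = [1 << k for k in range(51)]


def floor_log2(x):
    """Largest p with 2**p <= x, found by binary search over POWERS."""
    if x < 1:
        raise ValueError("x must be at least 1")
    # bisect_right returns how many table entries are <= x.
    return bisect_right(POWERS, x) - 1
```

Everything stays in integer arithmetic, so there is no rounding issue at exact powers of two.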
I found this thread for C based solutions: Compute fast log base 2 ceiling
However It seems a bit ugly and I wasn't exactly sure how to convert it to python.
In Python >= 2.7, you can use the .bit_length() method of integers:
def brute(x):
    # determine max p such that 2^p <= x
    p = 0
    while 2**p <= x:
        p += 1
    return p-1

def easy(x):
    return x.bit_length() - 1
which gives
>>> brute(0), brute(2**3-1), brute(2**3)
(-1, 2, 3)
>>> easy(0), easy(2**3-1), easy(2**3)
(-1, 2, 3)
>>> brute(2**50-1), brute(2**50), brute(2**50+1)
(49, 50, 50)
>>> easy(2**50-1), easy(2**50), easy(2**50+1)
(49, 50, 50)
>>>
>>> all(brute(n) == easy(n) for n in range(10**6))
True
>>> nums = (max(2**x+d, 0) for x in range(200) for d in range(-50, 50))
>>> all(brute(n) == easy(n) for n in nums)
True
You specify in comments your x is an integer, but for anyone coming here where their x is already a float, then math.frexp() would be pretty fast at extracting log base 2:
log2_slow = int(floor(log(x, 2)))
log2_fast = frexp(x)[1]-1
The C function that frexp() calls just grabs and tweaks the exponent. Some more 'splainin:
The subscript[1] is because frexp() returns a tuple (significand, exponent).
The subtract-1 accounts for the significand being in the range [0.5,1.0). For example 2**50 is stored as 0.5 x 2**51.
The floor() is because you specified 2^p <= x, so p == floor(log(x,2)).
(Derived from another answer.)
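As a self-contained sketch of the frexp() approach for positive floats (the function name floor_log2_float is only illustrative):

```python
from math import frexp


def floor_log2_float(x):
    """floor(log2(x)) for a positive float, read off the binary exponent."""
    if x <= 0:
        raise ValueError("x must be positive")
    # frexp(x) returns (m, e) with x == m * 2**e and 0.5 <= m < 1
    return frexp(x)[1] - 1
```

Because no logarithm is evaluated, exact powers of two such as 2.0**50 come out right.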
Be careful! The accepted answer returns floor(log(n, 2)), NOT ceil(log(n, 2)) like the title of the question implies!
If you came here for a clog2 implementation, do this:
def clog2(x):
    """Ceiling of log2"""
    if x <= 0:
        raise ValueError("domain error")
    return (x-1).bit_length()
And for completeness:
def flog2(x):
    """Floor of log2"""
    if x <= 0:
        raise ValueError("domain error")
    return x.bit_length() - 1
You could try the log2 function from numpy, which appears to work for powers up to 2^62:
>>> 2**np.log2(2**50) == 2**50
True
>>> 2**np.log2(2**62) == 2**62
True
Above that (at least for me) it fails due to the limitations of numpy's internal number types, but that will handle data in the range you say you're dealing with.
Works for me, Python 2.6.5 (CPython) on OSX 10.7:
>>> x = 2**50
>>> x
1125899906842624L
>>> p = int(log(x,2))
>>> p
50
>>> 2**p == x
True
It continues to work at least for exponents up to 1e9, by which time it starts to take quite a while to do the math. What are you actually getting for x and p in your test? What version of Python, on what OS, are you running?
With respect to "not accurate for large numbers" your challenge here is that the floating point representation is indeed not as precise as you need it to be (49.999999999993 != 50.0). A great reference is "What Every Computer Scientist Should Know About Floating-Point Arithmetic."
The good news is that the transformation of the C routine is very straightforward:
def getpos(value):
    if (value == 0):
        return -1
    pos = 0
    if (value & (value - 1)):
        pos = 1
    if (value & 0xFFFFFFFF00000000):
        pos += 32
        value = value >> 32
    if (value & 0x00000000FFFF0000):
        pos += 16
        value = value >> 16
    if (value & 0x000000000000FF00):
        pos += 8
        value = value >> 8
    if (value & 0x00000000000000F0):
        pos += 4
        value = value >> 4
    if (value & 0x000000000000000C):
        pos += 2
        value = value >> 2
    if (value & 0x0000000000000002):
        pos += 1
        value = value >> 1
    return pos
Another alternative is that you could round to the nearest integer, instead of truncating:
log(x,2)
=> 49.999999999999993
round(log(x,2),1)
=> 50.0
I needed to calculate the upper bound power of two (to figure out how many bytes of entropy was needed to generate a random number in a given range using the modulus operator).
From a rough experiment I think the calculation below gives the minimum integer p such that val < 2^p
It's probably about as fast as you can get, and uses exclusively bitwise integer arithmetic.
def log2_approx(val):
    from math import floor
    val = floor(val)
    approx = 0
    while val != 0:
        val &= ~ (1<<approx)
        approx += 1
    return approx
Your slightly different value would be calculated for a given n by
log2_approx(n) - 1
...maybe. But in any case, the bitwise arithmetic could give you a clue how to do this fast.
I want to generate the digits of the square root of two to 3 million digits.
I am aware of Newton-Raphson but I don't have much clue how to implement it in C or C++ due to lack of biginteger support. Can somebody point me in the right direction?
Also, if anybody knows how to do it in python (I'm a beginner), I would also appreciate it.
You could try using the mapping:
a/b -> (a+2b)/(a+b) starting with a= 1, b= 1. This converges to sqrt(2) (in fact gives the continued fraction representations of it).
Now the key point: This can be represented as a matrix multiplication (similar to fibonacci)
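If a_n and b_n are the nth numbers in the steps, then

$$\begin{pmatrix} 1 & 2 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} a_n \\ b_n \end{pmatrix} = \begin{pmatrix} a_{n+1} \\ b_{n+1} \end{pmatrix}$$

which now gives us

$$\begin{pmatrix} 1 & 2 \\ 1 & 1 \end{pmatrix}^{n} \begin{pmatrix} a_1 \\ b_1 \end{pmatrix} = \begin{pmatrix} a_{n+1} \\ b_{n+1} \end{pmatrix}$$

Thus if the 2x2 matrix is A, we need to compute A^n, which can be done by repeated squaring and uses only integer arithmetic (so you don't have to worry about precision issues).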
Also note that the a/b you get will always be in reduced form (as gcd(a,b) = gcd(a+2b, a+b)), so if you are thinking of using a fraction class to represent the intermediate results, don't!
Since the nth denominator grows like (1+sqrt(2))^n, to get 3 million digits you would likely need to compute up to the 3671656th term.
Note, even though you are looking for the ~3.6 millionth term, repeated squaring will allow you to compute the nth term in O(Log n) multiplications and additions.
Also, this can easily be made parallel, unlike the iterative ones like Newton-Raphson etc.
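A small Python sketch of the matrix approach, assuming the starting values a = 1, b = 1 from above. The helper names mat_mul, mat_pow, and sqrt2_convergent are made up for illustration, and the number of steps is left to the caller rather than derived from a target digit count.

```python
def mat_mul(x, y):
    """Multiply two 2x2 integer matrices given as ((p, q), (r, s))."""
    (p, q), (r, s) = x
    (t, u), (v, w) = y
    return ((p * t + q * v, p * u + q * w),
            (r * t + s * v, r * u + s * w))


def mat_pow(m, n):
    """Raise a 2x2 matrix to the n-th power by repeated squaring."""
    result = ((1, 0), (0, 1))  # identity matrix
    while n:
        if n & 1:
            result = mat_mul(result, m)
        m = mat_mul(m, m)
        n >>= 1
    return result


def sqrt2_convergent(n):
    """Apply the map a/b -> (a+2b)/(a+b) n times, starting from 1/1."""
    (p, q), (r, s) = mat_pow(((1, 2), (1, 1)), n)
    # A^n applied to the column vector (1, 1)
    return p + q, r + s


a, b = sqrt2_convergent(40)
first_digits = a * 10 ** 10 // b  # sqrt(2) scaled by 10**10, truncated
```

The truncated quotient only matches the true digits while the convergent's error stays well below the last digit requested, so the step count has to grow with the number of digits wanted.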
EDIT: I like this version better than the previous. It's a general solution that accepts both integers and decimal fractions; with n = 2 and precision = 100000, it takes about two minutes. Thanks to Paul McGuire for his suggestions & other suggestions welcome!
def sqrt_list(n, precision):
    ndigits = []        # break n into list of digits
    n_int = int(n)
    n_fraction = n - n_int

    while n_int:                                # generate list of digits of integral part
        ndigits.append(n_int % 10)
        n_int //= 10                            # floor division, so this also works on Python 3
    if len(ndigits) % 2: ndigits.append(0)      # ndigits will be processed in groups of 2

    decimal_point_index = len(ndigits) // 2     # remember decimal point position
    while n_fraction:                           # insert digits from fractional part
        n_fraction *= 10
        ndigits.insert(0, int(n_fraction))
        n_fraction -= int(n_fraction)
    if len(ndigits) % 2: ndigits.insert(0, 0)   # ndigits will be processed in groups of 2

    rootlist = []
    root = carry = 0                            # the algorithm
    while root == 0 or (len(rootlist) < precision and (ndigits or carry != 0)):
        carry = carry * 100
        if ndigits: carry += ndigits.pop() * 10 + ndigits.pop()
        x = 9
        while (20 * root + x) * x > carry:
            x -= 1
        carry -= (20 * root + x) * x
        root = root * 10 + x
        rootlist.append(x)
    return rootlist, decimal_point_index
As for arbitrary big numbers you could have a look at The GNU Multiple Precision Arithmetic Library (for C/C++).
For work? Use a library!
For fun? Good for you :)
Write a program to imitate what you would do with pencil and paper. Start with 1 digit, then 2 digits, then 3, ..., ...
Don't worry about Newton or anybody else. Just do it your way.
Here is a short version for calculating the square root of an integer a to digits decimal places. It works by finding the integer square root of a after multiplying it by 10 raised to the power 2 x digits.
def sqroot(a, digits):
    a = a * (10**(2*digits))
    x_prev = 0
    x_next = 1 * (10**digits)
    while x_prev != x_next:
        x_prev = x_next
        x_next = (x_prev + (a // x_prev)) >> 1
    return x_next
Just a few caveats.
You'll need to convert the result to a string and add the decimal point at the correct location (if you want the decimal point printed).
Converting a very large integer to a string isn't very fast.
Dividing very large integers isn't very fast (in Python) either.
Depending on the performance of your system, it may take an hour or longer to calculate the square root of 2 to 3 million decimal places.
I haven't proven the loop will always terminate. It may oscillate between two values differing in the last digit. Or it may not.
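To illustrate the first caveat, here is one way the decimal point could be placed. This sketch swaps in math.isqrt (Python 3.8+) for the Newton loop so that termination is not a concern; the function name sqrt_digits is hypothetical.

```python
import math


def sqrt_digits(a, digits):
    """Return sqrt(a), truncated to `digits` decimal places, as a string."""
    # floor(sqrt(a) * 10**digits), computed entirely with integers
    scaled_root = math.isqrt(a * 10 ** (2 * digits))
    # pad so there is always at least one digit before the decimal point
    text = str(scaled_root).zfill(digits + 1)
    if digits == 0:
        return text
    return text[:-digits] + "." + text[-digits:]
```

For example, sqrt_digits(2, 10) gives '1.4142135623'. On recent Python versions, str() on integers with more than a few thousand digits raises an error unless the limit is raised first with sys.set_int_max_str_digits.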
The nicest way is probably using the continued fraction expansion [1; 2, 2, ...] of the square root of two.
def root_two_cf_expansion():
    yield 1
    while True:
        yield 2

def z(a,b,c,d, contfrac):
    for x in contfrac:
        while a > 0 and b > 0 and c > 0 and d > 0:
            t = a // c
            t2 = b // d
            if not t == t2:
                break
            yield t
            a = (10 * (a - c*t))
            b = (10 * (b - d*t))
            # continue with same fraction, don't pull new x
        a, b = x*a+b, a
        c, d = x*c+d, c
    for digit in rdigits(a, c):
        yield digit

def rdigits(p, q):
    while p > 0:
        if p > q:
            d = p // q
            p = p - q * d
        else:
            d = (10 * p) // q
            p = 10 * p - q * d
        yield d

def decimal(contfrac):
    return z(1,0,0,1,contfrac)
decimal(root_two_cf_expansion()) returns an iterator of all the decimal digits. t and t2 in the algorithm are the minimum and maximum values of the next digit. When they are equal, we output that digit.
Note that this does not handle certain exceptional cases such as negative numbers in the continued fraction.
(This code is an adaptation of Haskell code for handling continued fractions that has been floating around.)
Well, the following is the code that I wrote. It generated a million digits after the decimal point for the square root of 2 in about 60800 seconds for me, but my laptop was sleeping for part of the run, so it should be faster than that. You can try to generate 3 million digits, but it might take a couple of days.
def sqrt(number,digits_after_decimal=20):
    import time
    start=time.time()
    original_number=number
    number=str(number)
    list=[]
    for a in range(len(number)):
        if number[a]=='.':
            decimal_point_locaiton=a
            break
        if a==len(number)-1:
            number+='.'
            decimal_point_locaiton=a+1
    if decimal_point_locaiton/2!=round(decimal_point_locaiton/2):
        number='0'+number
        decimal_point_locaiton+=1
    if len(number)/2!=round(len(number)/2):
        number+='0'
    number=number[:decimal_point_locaiton]+number[decimal_point_locaiton+1:]
    decimal_point_ans=int((decimal_point_locaiton-2)/2)+1
    for a in range(0,len(number),2):
        if number[a]!='0':
            list.append(eval(number[a:a+2]))
        else:
            try:
                list.append(eval(number[a+1]))
            except IndexError:
                pass
    p=0
    c=list[0]
    x=0
    ans=''
    for a in range(len(list)):
        while c>=(20*p+x)*(x):
            x+=1
        y=(20*p+x-1)*(x-1)
        p=p*10+x-1
        ans+=str(x-1)
        c-=y
        try:
            c=c*100+list[a+1]
        except IndexError:
            c=c*100
    while c!=0:
        x=0
        while c>=(20*p+x)*(x):
            x+=1
        y=(20*p+x-1)*(x-1)
        p=p*10+x-1
        ans+=str(x-1)
        c-=y
        c=c*100
        if len(ans)-decimal_point_ans>=digits_after_decimal:
            break
    ans=ans[:decimal_point_ans]+'.'+ans[decimal_point_ans:]
    total=time.time()-start
    return ans,total
Python already supports big integers out of the box, and if that's the only thing holding you back in C/C++ you can always write a quick container class yourself.
The only problem you've mentioned is a lack of big integers. If you don't want to use a library for that, then are you looking for help writing such a class?
Here's a more efficient integer square root function (in Python 3.x) that should terminate in all cases. It starts with a number much closer to the square root, so it takes fewer steps. Note that int.bit_length requires Python 3.1+. Error checking left out for brevity.
def isqrt(n):
    x = (n >> n.bit_length() // 2) + 1
    result = (x + n // x) // 2
    while abs(result - x) > 1:
        x = result
        result = (x + n // x) // 2
    while result * result > n:
        result -= 1
    return result