Bulk updating a slice of a Python list - python

I have written a simple implementation of the Sieve of Eratosthenes, and I would like to know if there is a more efficient way to perform one of the steps.
def eratosthenes(n):
    primes = [2]
    is_prime = [False] + ((n - 1)/2)*[True]
    for i in xrange(len(is_prime)):
        if is_prime[i]:
            p = 2*i + 1
            primes.append(p)
            is_prime[i*p + i::p] = [False]*len(is_prime[i*p + i::p])
    return primes
I am using Python's list slicing to update my list of booleans is_prime. Each element is_prime[i] corresponds to an odd number 2*i + 1.
is_prime[i*p + i::p] = [False]*len(is_prime[i*p + i::p])
When I find a prime p, I can mark all elements corresponding to multiples of that prime False, and since all multiples smaller than p**2 are also multiples of smaller primes, I can skip marking those. The index of p**2 is i*p + i.
I'm worried about the cost of computing [False]*len(is_prime[i*p + i::p]), so I tried to compare it with two other strategies, neither of which I could get to work.
For some reason, the formula (len(is_prime) - (i*p + i))/p (if positive) is not always equal to len(is_prime[i*p + i::p]). Is it because I've calculated the length of the slice wrong, or is there something subtle about slicing that I haven't caught?
When I use the following lines in my function:
print len(is_prime[i*p + i::p]), ((len(is_prime) - (i*p + i))/p)
is_prime[i*p + i::p] = [False]*((len(is_prime) - (i*p + i))/p)
I get the following output (case n = 50):
>>> eratosthenes2(50)
7 7
3 2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 9, in eratosthenes2
ValueError: attempt to assign sequence of size 2 to extended slice of size 3
I also tried replacing the bulk updating line with the following:
for j in xrange(i*p + i, len(is_prime), p):
    is_prime[j] = False
But this fails for large values of n because xrange doesn't accept arguments that don't fit in a C long. I gave up on trying to wrestle itertools.count into what I needed.
Are there faster and more elegant ways to bulk-update the list slice? Is there anything I can do to fix the other strategies that I tried, so that I can compare them to the working one? Thanks!

Use itertools.repeat():
is_prime[i*p + i::p] = itertools.repeat(False, len(is_prime[i*p + i::p]))
Slice assignment will iterate over whatever you put on the right-hand side; it doesn't need to be a full-blown sequence.
So let's fix that formula. I'll just borrow the Python 3 formula since we know that works:
1 + (hi - 1 - lo) / step
Since step > 0, hi = stop and lo = start, so we have:
1 + (len(is_prime) - 1 - (i*p + i))//p
(// is integer division; writing it explicitly future-proofs our code for Python 3, and it has been available since Python 2.2.)
Now, put it all together:
slice_len = 1 + (len(is_prime) - 1 - (i*p + i))//p
is_prime[i*p + i::p] = itertools.repeat(False, slice_len)
Python 3 users: Please do not use this formula directly. Instead, just write len(range(start, stop, step)). That gives the same result with similar performance (i.e. it's O(1)) and is much easier to read.
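For Python 3 readers, that advice can be sketched as a complete function. This is only a sketch, not the asker's code: it assumes Python 3 (range instead of xrange, // for integer division), and the names start, size and count are introduced here for illustration.

```python
def eratosthenes(n):
    """Return all primes <= n using an odds-only sieve (Python 3 sketch)."""
    if n < 2:
        return []
    primes = [2]
    # is_prime[i] stands for the odd number 2*i + 1; index 0 is the number 1.
    is_prime = [False] + [True] * ((n - 1) // 2)
    size = len(is_prime)
    for i in range(size):
        if is_prime[i]:
            p = 2 * i + 1
            primes.append(p)
            start = i * p + i  # index of p**2
            # len(range(...)) is O(1) and matches the length of the slice.
            count = len(range(start, size, p))
            is_prime[start::p] = [False] * count
    return primes
```

Calling eratosthenes(50) should give the primes up to 47, and the slice assignment never hits the size mismatch from the question because count is computed the same way the slice itself is.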


Why is the value you get from heapq.heappop(H) different from the value you get from H[0]?

I am trying to solve LeetCode problem 295. Find Median from Data Stream:
The median is the middle value in an ordered integer list. If the size of the list is even, there is no middle value and the median is the mean of the two middle values.
For example, for arr = [2,3,4], the median is 3.
For example, for arr = [2,3], the median is (2 + 3) / 2 = 2.5.
Implement the MedianFinder class:
MedianFinder() initializes the MedianFinder object.
void addNum(int num) adds the integer num from the data stream to the data structure.
double findMedian() returns the median of all elements so far.
Answers within 10^-5 of the actual answer will be accepted.
Example 1:
[...]
MedianFinder medianFinder = new MedianFinder();
medianFinder.addNum(1); // arr = [1]
medianFinder.addNum(2); // arr = [1, 2]
medianFinder.findMedian(); // return 1.5 (i.e., (1 + 2) / 2)
medianFinder.addNum(3); // arr = [1, 2, 3]
medianFinder.findMedian(); // return 2.0
For this question, I am adding numbers to a heap and then later use the smallest one to do some operations.
When I tried to do the operation, I found out that my heap returns a different value when doing heapq.heappop(self.small) than when doing self.small[0].
Could you please explain this to me? Any hint is much appreciated.
(Every number in self.small is added using heapq.heappush)
Here is my code when it works:
import heapq
class MedianFinder:
    def __init__(self):
        # small is a max-heap stored as negated values; large is a min-heap.
        self.small, self.large = [], []
    def addNum(self, num):
        heapq.heappush(self.small, -1 * num)
        if (self.small and self.large) and -1 * self.small[0] > self.large[0]:
            val = -1 * heapq.heappop(self.small)
            heapq.heappush(self.large, val)
        if len(self.small) > len(self.large) + 1:
            val = -1 * heapq.heappop(self.small)
            heapq.heappush(self.large, val)
        if len(self.large) > len(self.small) + 1:
            val = -1 * heapq.heappop(self.large)
            heapq.heappush(self.small, val)
    def findMedian(self):
        if len(self.small) > len(self.large):
            return -1 * self.small[0]
        elif len(self.small) < len(self.large):
            return self.large[0]
        else:
            return (-1 * self.small[0] + self.large[0]) / 2
For the last line, if I change:
-1 * self.small[0] + self.large[0]
into:
-1 * heapq.heappop(self.small) + heapq.heappop(self.large)
then the tests fail.
Why would that be any different?
When you change -1 * self.small[0] + self.large[0] into -1 * heapq.heappop(self.small) + heapq.heappop(self.large) then it will still work the first time findMedian is called, but when it is called again, it will return (in general) a different result. The reason is that with heappop, you remove a value from the heap. This should not happen, as this changes the data that a next call of findMedian will have to deal with. findMedian is supposed to leave the data structure unchanged.
Note how the challenge says this:
double findMedian() returns the median of all elements so far.
The last two words, "so far", are the important part. They indicate that findMedian is not (only) called once the whole stream of data has been processed, but will be called several times while the stream is being processed. That makes it crucial that findMedian does not modify the data structure, and so heappop should not be used.
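The difference between peeking and popping can be seen on a tiny heap. This is just an illustrative sketch; the variable names and the sample values 5, 1, 3 are made up for the demonstration.

```python
import heapq

heap = []
for value in (5, 1, 3):
    heapq.heappush(heap, value)

# Peeking twice returns the same smallest element and leaves the heap intact.
peek_first = heap[0]
peek_second = heap[0]
size_after_peeks = len(heap)

# Popping twice removes elements, so the second pop returns the next smallest.
popped_first = heapq.heappop(heap)
popped_second = heapq.heappop(heap)
size_after_pops = len(heap)
```

After the two peeks the heap still holds three items; after the two pops only one is left, which is why a second findMedian call built on heappop sees different data.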

Python OverflowError: int too large to convert to float

I have a number and I want to find the sum of all of its possible substrings. Since the sum may be very large, I am taking modulo 1e9+7. Here is the code I wrote for it:
n = input()
total = 0
for i in range(len(n)):
    for j in range(i, len(n)):
        total = (total + int(n[i:j+1]))%(1e9+7)
print(int(total))
But this gives me Overflow error:
OverflowError: int too large to convert to float
Taking modulo inside also doesn't help:
total = (total%(1e9+7) + int(n[i:j+1])%(1e9+7))%(1e9+7)
Neither does converting total to int every step:
total = int((total%(1e9+7) + int(n[i:j+1])%(1e9+7))%(1e9+7))
I searched online, and many people were using decimal, so I tried that too:
import decimal
n = input()
total = 0
for i in range(len(n)):
    for j in range(i, len(n)):
        total = decimal.Decimal((int(total)%(1e9+7) + int(n[i:j+1])%(1e9+7))%(1e9+7))
print(int(total))
This also gave me the same error. So how can I fix it?
EDIT:
This is the input value causing the error:

Convert it into Decimal
from decimal import Decimal
Decimal
The best solution to this question runs in linear time. Here's the idea:
let's look at a shorter example: 1234
consider the total sum of the substrings which end on the ith digit (0..3)
substringsum[0]: 1 (we have only a single substring)
substringsum[1]: 2 + 12 (two substrings)
substringsum[2]: 3+23+123 (three substrings)
substringsum[3]: 4 + 34+234+1234
see a pattern?
let's look at substringsum[2]:
3 + 23 + 123 = 3 + 20+3 + 120+3 = 3*3 + 20+120 = 3*3 + 10*(2+12) = 3*3 +10*substringsum[1]
in general:
substringsum[k] = (k+1)*digit[k] + 10 * substringsum[k-1]
This you can compute in linear time.
This kind of approach is called "dynamic programming".
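The recurrence above can be sketched in a few lines. This is one possible implementation, not necessarily what the answerer had in mind; the names substring_sum, ending_here and MOD are introduced here, and the modulus is written as the integer 10**9 + 7 so that no float is involved.

```python
MOD = 10**9 + 7  # integer modulus, deliberately not the float 1e9+7


def substring_sum(digits):
    """Sum of all substrings of a digit string, modulo MOD, in linear time."""
    total = 0
    ending_here = 0  # sum of the substrings that end at the current digit
    for k, ch in enumerate(digits):
        # substringsum[k] = (k+1)*digit[k] + 10*substringsum[k-1]
        ending_here = (10 * ending_here + (k + 1) * int(ch)) % MOD
        total = (total + ending_here) % MOD
    return total
```

For "1234" the per-position sums are 1, 14, 149 and 1506, so the total is 1670.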
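The overflow error suggests that there's an undesired conversion to float, which I assume you did not intend. The conversion happens because 1e9 is a float, so 1e9+7 is a float too, and the % operator then converts its integer operand to float as well. To fix this, use % int(1e9+7) (or % (10**9 + 7)) instead of % (1e9+7).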
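A small experiment shows the difference between the two moduli. The 400-digit value below is made up for the demonstration; it only needs to be larger than the biggest float.

```python
big = int("9" * 400)  # an integer far beyond the float range

# int % float forces big to be converted to float, which overflows.
try:
    big % (1e9 + 7)
    raised = False
except OverflowError:
    raised = True

# int % int stays in exact integer arithmetic and works for any size.
result = big % (10**9 + 7)
```

Here raised ends up True, while result is an ordinary int smaller than the modulus.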

Binary mask with shift operation without a loop

We have some large binary number N (large means millions of digits). We also have a binary mask M, where a 1 means that we must remove the digit at that position in N and shift all higher bits one position to the right.
Example:
N = 100011101110
M = 000010001000
Res = 1000110110
Is it possible to solve this problem without a loop, using some set of logical or arithmetic operations? We can assume that we have access to bignum arithmetic in Python.
Feels like it should be something like this:
Res = N - (N xor M)
But it doesn't work.
UPD: My current solution with a loop is the following:
def prepare_reduced_arrays(dict_of_N, mask):
    '''
    mask: string '0000011000'
    each element of dict_of_N - big python integer
    '''
    capacity = len(mask)
    answer = dict()
    for el in dict_of_N:
        answer[el] = 0
    new_capacity = 0
    for i in range(capacity - 1, -1, -1):
        if mask[i] == '1':
            continue
        cap2 = (1 << new_capacity)
        pos = (capacity - i - 1)
        for el in dict_of_N:
            current_bit = (dict_of_N[el] >> pos) & 1
            if current_bit:
                answer[el] |= cap2
        new_capacity += 1
    return answer, new_capacity
While this may not be possible without a loop in Python, it can be made extremely fast with numba and just-in-time compilation. I went on the assumption that your inputs could easily be represented as boolean arrays, which would be very simple to construct from a binary file using struct. The method I have implemented involves iterating over a few different objects; however, these iterations were chosen carefully to make sure they are compiler-optimized and never do the same work twice. The first iteration uses np.where to locate the indices of all the bits to delete. This specific function (among many others) is optimized by the numba compiler. I then use this list of bit indices to build the start and end indices of the slices of bits to keep. The final loop copies these slices into an empty output array.
import numpy as np
from numba import jit
from time import time
def binary_mask(num, mask):
    num_nbits = num.shape[0] #how many bits are in our big num
    mask_bits = np.where(mask)[0] #which bits are we deleting
    mask_n_bits = mask_bits.shape[0] #how many bits are we deleting
    start = np.empty(mask_n_bits + 1, dtype=int) #preallocate array for slice start indexes
    start[0] = 0 #first slice starts at 0
    start[1:] = mask_bits + 1 #subsequent slices start 1 after each True bit in mask
    end = np.empty(mask_n_bits + 1, dtype=int) #preallocate array for slice end indexes
    end[:mask_n_bits] = mask_bits #each slice ends on (but does not include) True bits in the mask
    end[mask_n_bits] = num_nbits + 1 #last slice goes all the way to the end
    out = np.empty(num_nbits - mask_n_bits, dtype=np.uint8) #preallocate return array
    for i in range(mask_n_bits + 1): #for each slice
        a = start[i] #use local variables to reduce number of lookups
        b = end[i]
        c = a - i
        d = b - i
        out[c:d] = num[a:b] #copy slices
    return out
jit_binary_mask = jit("b1[:](b1[:], b1[:])")(binary_mask) #decorator without syntax sugar
###################### Benchmark ########################
bignum = np.random.randint(0,2,1000000, dtype=bool) # 1 million random bits
bigmask = np.random.randint(0,10,1000000, dtype=np.uint8)==9 #delete about 1 in 10 bits
t = time()
for _ in range(10): #10 cycles of just numpy implementation
    out = binary_mask(bignum, bigmask)
print(f"non-jit: {time()-t} seconds")
t = time()
out = jit_binary_mask(bignum, bigmask) #once ahead of time to compile
compile_and_run = time() - t
t = time()
for _ in range(10): #10 cycles of compiled numpy implementation
    out = jit_binary_mask(bignum, bigmask)
jit_runtime = time()-t
print(f"jit: {jit_runtime} seconds")
print(f"estimated compile_time: {compile_and_run - jit_runtime/10}")
In this example, I execute the benchmark on a boolean array of length 1,000,000 a total of 10 times for both the compiled and un-compiled version. On my laptop, the output is:
non-jit: 1.865583896636963 seconds
jit: 0.06370806694030762 seconds
estimated compile_time: 0.1652850866317749
As you can see with a simple algorithm like this, very significant performance gains can be seen from compilation. (in my case about 20-30x speedup)
As far as I know, this can be done without the use of loops if and only if M is a power of 2.
Let's take your example, and modify M so that it is a power of 2:
N = 0b100011101110 = 2286
M = 0b000000001000 = 8
Removing the fourth lowest bit from N and shifting the higher bits to the right would result in:
N = 0b10001110110 = 1142
We achieved this using the following algorithm:
Begin with N = 0b100011101110 = 2286
Iterate from the most-significant bit to the least-significant bit in M.
If the current bit in M is set to 1, then store the lower bits in some variable, x:
x = 0b110
Then, subtract every bit up to and including the current bit in M from N, so that we end up with the following:
N - (0b1000 + x) = N - (0b1000 + 0b110) = 0b100011101110 - 0b1110 = 0b100011100000
This step can also be achieved by and-ing the bits with 0, which may be more efficient.
Next, we shift the result once to the right:
0b100011100000 >> 1 = 0b10001110000
Finally, we add back x to the shifted result:
0b10001110000 + x = 0b10001110000 + 0b110 = 0b10001110110 = 1142
There may be a possibility that this can somehow be done without loops, but it would actually be efficient if you were to simply iterate over M (from the most-significant bit to the least-significant bit) and performed this process on every set bit, as the time complexity would be O(M.bit_length()).
I wrote up the code for this algorithm as well, and I believe it's relatively efficient, but I don't have any big binary numbers to test it with:
def remove_bits(N, M):
    # Highest set bit of M (0 when M == 0, so the loop is skipped).
    bit = (1 << M.bit_length()) >> 1
    while bit != 0:
        if M & bit:
            ones = bit - 1
            # Store the bits below `bit`.
            temp = N & ones
            # Clear the masked bit itself and everything below it.
            N &= ~(ones | bit)
            # Shift once to the right.
            N >>= 1
            # Restore the stored lower bits.
            N |= temp
        bit >>= 1
    return N
if __name__ == '__main__':
    N = 0b100011101110
    M = 0b000010001000
    print(bin(remove_bits(N, M)))
Using your example, this returns your result: 0b1000110110
I don't think there's any way to do this in a constant number of calls to the built-in bitwise operators. Python would have to provide something like PEXT for that to be possible.
For literally millions of digits, you may actually get best performance by working in terms of sequences of bits, sacrificing the space advantages of Python ints and the time advantages of bitwise operations in favor of more flexibility in the operations you can perform. I don't know where the break-even point would be:
import itertools
bits = bin(N)[2:]
maskbits = bin(M)[2:].zfill(len(bits))
bits = bits.zfill(len(maskbits))
chosenbits = itertools.compress(bits, map('0'.__eq__, maskbits))
result = int(''.join(chosenbits), 2)
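The same string-based idea can be wrapped in a function so it is easy to check against the example from the question. This is only a sketch: the name remove_masked_bits and the handling of the "every bit removed" case (returning 0) are assumptions added here.

```python
from itertools import compress


def remove_masked_bits(n, mask):
    """Drop every bit of n whose position is set in mask, closing the gaps."""
    # Pad both numbers to a common width so positions line up.
    width = max(n.bit_length(), mask.bit_length(), 1)
    n_bits = format(n, "b").zfill(width)
    mask_bits = format(mask, "b").zfill(width)
    # Keep a digit of n only where the mask digit is '0'.
    kept = compress(n_bits, (m == "0" for m in mask_bits))
    # An empty string (all bits removed) is treated as 0.
    return int("".join(kept) or "0", 2)
```

With N = 0b100011101110 and M = 0b000010001000 this gives 0b1000110110, matching the expected result in the question.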

Find length of a string that includes its own length?

I want to get the length of a string including a part of the string that represents its own length without padding or using structs or anything like that that forces fixed lengths.
So for example I want to be able to take this string as input:
"A string|"
And return this:
"A string|11"
On the basis of the OP tolerating such an approach (and to provide an implementation technique for the eventual python answer), here's a solution in Java.
final String s = "A String|";
int n = s.length(); // `length()` returns the length of the string.
String t; // the result
do {
    t = s + n; // append the stringified n to the original string
    if (n == t.length()){
        return t; // string length no longer changing; we're good.
    }
    n = t.length(); // n must hold the total length
} while (true); // round again
The problem, of course, is that appending n changes the string length. But luckily, the length only ever increases or stays the same, so it will converge very quickly, due to the logarithmic nature of the length of n. In this particular case, the attempted values of n are 9, 10, and 11. And that's a pernicious case.
A simple solution is :
def addlength(string):
    n1=len(string)
    n2=len(str(n1))+n1
    n2 += len(str(n2))-len(str(n1)) # a carry can arise
    return string+str(n2)
Since a possible carry will increase the length by at most one unit.
Examples :
In [2]: addlength('a'*8)
Out[2]: 'aaaaaaaa9'
In [3]: addlength('a'*9)
Out[3]: 'aaaaaaaaa11'
In [4]: addlength('a'*99)
Out[4]: 'aaaaa...aaa102'
In [5]: addlength('a'*999)
Out[5]: 'aaaa...aaa1003'
Here is a simple python port of Bathsheba's answer :
def str_len(s):
    n = len(s)
    t = ''
    while True:
        t = s + str(n)
        if n == len(t):
            return t
        n = len(t)
This is a much more clever and simple way than anything I was thinking of trying!
Suppose you had s = 'abcdefgh|'. On the first pass, n = 9 and t = 'abcdefgh|9'.
Since n != len(t) (which is now 10), it goes around again: t = 'abcdefgh|' + str(n) with str(n) = '10', so you have 'abcdefgh|10', which is still not quite right because len(t) is now 11. On the next pass n = 11, t = 'abcdefgh|11', and n == len(t), so that is the result. Pretty clever solution!
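To see the passes described above, the loop can be instrumented to record each value of n it tries. This is a small variation on the port, with the name length_attempts and the attempts list added here purely for illustration.

```python
def length_attempts(s):
    """Return the finished string plus every candidate length that was tried."""
    attempts = []
    n = len(s)
    while True:
        attempts.append(n)
        t = s + str(n)
        if len(t) == n:
            return t, attempts
        n = len(t)
```

For "A string|" the attempts are 9, 10 and 11, and the result is "A string|11".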
It is a tricky one, but I think I've figured it out.
Done in a hurry in Python 2.7, please fully test - this should handle strings up to 996 characters (from 997 characters on, the appended length needs four digits):
import sys
orig = sys.argv[1]
origLen = len(orig)
if (origLen >= 98):
    extra = str(origLen + 3)
elif (origLen >= 8):
    extra = str(origLen + 2)
else:
    extra = str(origLen + 1)
final = orig + extra
print final
Results of very brief testing
C:\Users\PH\Desktop>python test.py "tiny|"
tiny|6
C:\Users\PH\Desktop>python test.py "myString|"
myString|11
C:\Users\PH\Desktop>python test.py "myStringWith98Characters.........................................................................|"
myStringWith98Characters.........................................................................|101
Just find the length of the string. Then iterate through each value of the number of digits the length of the resulting string can possibly have. While iterating, check if the sum of the number of digits to be appended and the initial string length is equal to the length of the resulting string.
def get_length(s):
    s = s + "|"
    result = ""
    len_s = len(s)
    i = 1
    while True:
        candidate = len_s + i
        if len(str(candidate)) == i:
            result = s + str(len_s + i)
            break
        i += 1
    return result
This code returns the result.
I used a few variables, but in the end it produces the output you want:
def len_s(s):
    s = s + '|'
    b = len(s)
    z = s + str(b)
    length = len(z)
    new_s = s + str(length)
    new_len = len(new_s)
    return s + str(new_len)
s = "A string"
print len_s(s)
Here's a direct equation for this (so it's not necessary to construct the string). If s is the string, then the length of the string including the length of the appended length will be:
L1 = len(s) + 1 + int(log10(len(s) + 1 + int(log10(len(s)))))
The idea here is that a direct calculation is only problematic when the appended length will push the length past a power of ten; that is, at 9, 98, 99, 997, 998, 999, 9996, etc. To work this through, 1 + int(log10(len(s))) is the number of digits in the length of s. If we add that to len(s), then 9->10, 98->100, 99->101, etc, but still 8->9, 97->99, etc, so we can push past the power of ten exactly as needed. That is, adding this produces a number with the correct number of digits after the addition. Then do the log again to find the length of that number and that's the answer.
To test this:
from math import log10
def find_length(s):
    L1 = len(s) + 1 + int(log10(len(s) + 1 + int(log10(len(s)))))
    return L1
# test, just looking at lengths around 10**n
for i in range(9):
    for j in range(30):
        L = abs(10**i - j + 10) + 1
        s = "a"*L
        x0 = find_length(s)
        new0 = s+`x0`
        if len(new0)!=x0:
            print "error", len(s), x0, log10(len(s)), log10(x0)
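The same two-step idea (count the digits, add them, count again) can also be written with integer digit counts instead of log10, which sidesteps floating point and also handles the empty string. This is a Python 3 sketch of a swapped-in variant, not the equation above verbatim; the names total_length and append_length are introduced here, and append_length is the iterative approach used as a reference.

```python
def total_length(base_len):
    """Final length of a string of base_len characters plus its own length."""
    # Digits of the length after a first, naive append.
    digits = len(str(base_len + len(str(base_len))))
    return base_len + digits


def append_length(s):
    """Reference implementation: iterate until the length stops changing."""
    n = len(s)
    while True:
        candidate = s + str(n)
        if len(candidate) == n:
            return candidate
        n = len(candidate)
```

The two agree on the awkward lengths near powers of ten, for example 9 gives 11, 98 gives 101 and 997 gives 1001.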

Python: "long int too large to convert to float" when calculating pi

I get this error when using a python script that calculates pi using the Gauss-Legendre algorithm. You can only use up to 1024 iterations before getting this:
C:\Users\myUsernameHere>python Desktop/piWriter.py
End iteration: 1025
Traceback (most recent call last):
  File "Desktop/piWriter.py", line 15, in <module>
    vars()['t' + str(sub)] = vars()['t' + str(i)] - vars()['p' + str(i)] * math.pow((vars()['a' + str(i)] - vars()['a' + str(sub)]), 2)
OverflowError: long int too large to convert to float
Here is my code:
import math
a0 = 1
b0 = 1/math.sqrt(2)
t0 = .25
p0 = 1
finalIter = input('End iteration: ')
finalIter = int(finalIter)
for i in range(0, finalIter):
    sub = i + 1
    vars()['a' + str(sub)] = (vars()['a' + str(i)] + vars()['b' + str(i)])/ 2
    vars()['b' + str(sub)] = math.sqrt((vars()['a' + str(i)] * vars()['b' + str(i)]))
    vars()['t' + str(sub)] = vars()['t' + str(i)] - vars()['p' + str(i)] * math.pow((vars()['a' + str(i)] - vars()['a' + str(sub)]), 2)
    vars()['p' + str(sub)] = 2 * vars()['p' + str(i)]
    n = i
pi = math.pow((vars()['a' + str(n)] + vars()['b' + str(n)]), 2) / (4 * vars()['t' + str(n)])
print(pi)
Ideally, I want to be able to plug in a very large number as the iteration value and come back a while later to see the result.
Any help appreciated!
Thanks!
Floats can only represent numbers up to sys.float_info.max, or 1.7976931348623157e+308. Once you have an int with more than 308 digits (or so), you are stuck. Your iteration fails when p1024 has 309 digits:
179769313486231590772930519078902473361797697894230657273430081157732675805500963132708477322407536021120113879871393357658789768814416622492847430639474124377767893424865485276302219601246094119453082952085005768838150682342462881473913110540827237163350510684586298239947245938479716304835356329624224137216L
You'll have to find a different algorithm for pi, one that doesn't require such large values.
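The limit is easy to reproduce in isolation. The snippet below is only a demonstration of the float ceiling, using 2**1024 because that is the value p reaches after 1024 doublings.

```python
import sys

# 2**1023 still fits in a float.
largest_ok = float(2**1023)

# 2**1024 is just past sys.float_info.max, so the conversion fails.
try:
    float(2**1024)
    overflowed = False
except OverflowError:
    overflowed = True

digits = len(str(2**1024))  # number of decimal digits in 2**1024
```

Here overflowed is True and digits is 309, which matches the point at which the script in the question breaks.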
Actually, you'll have to be careful with floats all around, since they are only approximations. If you modify your program to print the successive approximations of pi, it looks like this:
2.914213562373094923430016933707520365715026855468750000000000
3.140579250522168575088244324433617293834686279296875000000000
3.141592646213542838751209274050779640674591064453125000000000
3.141592653589794004176383168669417500495910644531250000000000
3.141592653589794004176383168669417500495910644531250000000000
3.141592653589794004176383168669417500495910644531250000000000
3.141592653589794004176383168669417500495910644531250000000000
In other words, after only 4 iterations, your approximation has stopped getting better. This is due to inaccuracies in the floats you are using, perhaps starting with 1/math.sqrt(2). Computing many digits of pi requires a very careful understanding of the numeric representation.
As noted in the previous answer, the float type has an upper bound on number size. In typical implementations, sys.float_info.max is 1.7976931348623157e+308, which reflects the 11-bit exponent field (in effect 10 bits plus sign) of a 64-bit floating point number. (Note that 1024*math.log(2)/math.log(10) is about 308.2547155599.)
You can add another half dozen decades to the exponent size by using the Decimal number type. Here is an example (snipped from an ipython interpreter session):
In [48]: import decimal, math
In [49]: g=decimal.Decimal('1e12345')
In [50]: g.sqrt()
Out[50]: Decimal('3.162277660168379331998893544E+6172')
In [51]: math.sqrt(g)
Out[51]: inf
This illustrates that decimal's sqrt() function performs correctly with larger numbers than does math.sqrt().
As noted above, getting lots of digits is going to be tricky, but looking at all those vars hurts my eyes. So here's a version of your code after (1) replacing your use of vars with dictionaries, and (2) using ** instead of the math functions:
a, b, t, p = {}, {}, {}, {}
a[0] = 1
b[0] = 2**-0.5
t[0] = 0.25
p[0] = 1
finalIter = 4
for i in range(finalIter):
    sub = i + 1
    a[sub] = (a[i] + b[i]) / 2
    b[sub] = (a[i] * b[i])**0.5
    t[sub] = t[i] - p[i] * (a[i] - a[sub])**2
    p[sub] = 2 * p[i]
    n = i
pi_approx = (a[n] + b[n])**2 / (4 * t[n])
Instead of playing games with vars, I've used dictionaries to store the values, which makes your code much more readable. You can probably even see an optimization or two now.
As noted in the comments, you really don't need to store all the values, only the last, but I think it's more important that you see how to do things without dynamically creating variables. Instead of a dict, you could also have simply appended the values to a list, but lists are always zero-indexed and you can't easily "skip ahead" and set values at arbitrary indices. That can occasionally be confusing when working with algorithms, so let's start simple.
Anyway, the above gives me
>>> print(pi_approx)
3.141592653589794
>>> print(pi_approx-math.pi)
8.881784197001252e-16
A simple solution is to install and use the arbitrary-precision mpmath module, which now supports Python 3. However, since I completely agree with DSM that your use of vars() to create variables on the fly is an undesirable way to implement the algorithm, I've based my answer on his rewrite of your code and [trivially] modified it to make use of mpmath to do the calculations.
If you insist on using vars(), you could probably do something similar, although I suspect it might be more difficult and the result would definitely be harder to read, understand, and modify.
from mpmath import mpf # arbitrary-precision float type
a, b, t, p = {}, {}, {}, {}
a[0] = mpf(1)
b[0] = mpf(2**-0.5)
t[0] = mpf(0.25)
p[0] = mpf(1)
finalIter = 10000
for i in range(finalIter):
    sub = i + 1
    a[sub] = (a[i] + b[i]) / 2
    b[sub] = (a[i] * b[i])**0.5
    t[sub] = t[i] - p[i] * (a[i] - a[sub])**2
    p[sub] = 2 * p[i]
    n = i
pi_approx = (a[n] + b[n])**2 / (4 * t[n])
print(pi_approx) # 3.14159265358979
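If installing a third-party module is not an option, roughly the same thing can be sketched with the standard library's decimal module. This swaps decimal in for mpmath and keeps only the latest a, b, t and p instead of storing every iteration; the function name gauss_legendre_pi and its digits and iterations parameters are assumptions made for this example.

```python
from decimal import Decimal, localcontext


def gauss_legendre_pi(digits=50, iterations=10):
    """Approximate pi with the Gauss-Legendre iteration at a chosen precision."""
    with localcontext() as ctx:
        ctx.prec = digits + 10  # a few guard digits for rounding error
        a = Decimal(1)
        b = Decimal(1) / Decimal(2).sqrt()
        t = Decimal(1) / Decimal(4)
        p = Decimal(1)
        for _ in range(iterations):
            a_next = (a + b) / 2
            b = (a * b).sqrt()
            t -= p * (a - a_next) ** 2
            a = a_next
            p *= 2
        return (a + b) ** 2 / (4 * t)
```

Because the number of correct digits roughly doubles with each iteration, a handful of iterations is enough; raising digits matters far more than raising iterations, and p never overflows because Decimal has a much larger exponent range than float.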
