This question is a parallel to python - How do I decompose a number into powers of 2?. Indeed, it is the same question, but rather than using Python (or Javascript, or C++, as these also seem to exist), I'm wondering how it can be done using Lua. I have a very basic understanding of Python, so I took the code first listed in the site above and attempted to translate it to Lua, with no success. Here's the original, and following, my translation:
Python
def myfunc(x):
powers = []
i = 1
while i <= x:
if i & x:
powers.append(i)
i <<= 1
return powers
Lua
function powerfind(n)
local powers = {}
i = 1
while i <= n do
if bit.band(i, n) then -- bitwise and check
table.insert(powers, i)
end
i = bit.shl(i, 1) -- bitwise shift to the left
end
return powers
end
Unfortunately, my version locks and "runs out of memory". This was after using the number 12 as a test. It's more than likely that my primitive knowledge of Python is failing me, and I'm not able to translate the code from Python to Lua correctly, so hopefully someone can offer a fresh set of eyes and help me fix it.
Thanks to the comments from user2357112, I've got it fixed, so I'm posting the answer in case anyone else comes across this issue:
function powerfind(n)
local powers = {}
i = 1
while i <= n do
if bit.band(i, n) ~= 0 then -- bitwise and check
table.insert(powers, i)
end
i = bit.shl(i, 1) -- bitwise shift to the left
end
return powers
end
I saw that in the other one, it became a sort of speed contest. This one should also be easy to understand.
i is the current power. It isn't used for calculations.
n is the current place in the array.
r is the remainder after a division of x by two.
If the remainder is 1 then you know that i is a power of two which is used in the binary representation of x.
local function powerfind(x)
local powers={
nil,nil,nil,nil,
nil,nil,nil,nil,
nil,nil,nil,nil,
nil,nil,nil,nil,
}
local i,n=1,0
while x~=0 do
local r=x%2
if r==1 then
x,n=x-1,n+1
powers[n]=i
end
x,i=x/2,2*i
end
end
Running a million iterations, x from 1 to 1000000, takes me 0.29 seconds. I initialize the size of the powers table to 16.
Related
I want to be able to access the sign bit of a number in python. I can do something like n >> 31 in C since int is represented as 32 bits.
I can't make use of the conditional operator and > <.
in python 3 integers don't have a fixed size, and aren't represented using the internal CPU representation (which allows to handle very large numbers without trouble).
So the best way is
signbit = 1 if n < 0 else 0
or
signbit = int(n < 0)
EDIT: if you cannot use < or > (which is ludicrious but so be it) you could use the fact that a-b will be positive if a is greater than b, so you could do
abs(a-b) == a-b
that doesn't use < or > (at least in the text, because abs uses it you can trust me)
I would argue that in Python there is not really a concept of a sign bit. As far as the programmer is concerned, an int is just a type with certain behavior. You don't get access to the low-level representation. Even bin(-3) returns a "negative" binary representation: '-0b11'
However, there are ways to get the sign or the bigger if two integers without comparisons. The following approach abuses floating point math to avoid comparisons.
def sign(a):
try:
return (1 - int(a / (a**2)**0.5)) // 2
except ZeroDivisionError:
return 0
def return_bigger(a, b):
s = sign(b - a)
return a * s + b * (1 - s)
assert sign(-33) == 1
assert sign(33) == 0
assert return_bigger(10, 15) == 15
assert return_bigger(25, 3) == 25
assert return_bigger(42, 42) == 42
(a**2)**0.5 could be replaced with abs but I bet internally this is implemented with a comparison.
The try/except is not needed if you don't care about 0 or equal integers (or there may be another horrible math workaround).
Finally, I'd like to point out that I have absolutely no idea why on earth anybody would want to implement something like that, except for the hell of it.
Conceptually, the bit representation of a negative integer is padded with an infinite number of 1 bits to the left (just like a non-negative number is regarded as padded with an infinite number of 0 bits). The operation n >> 31 does work (given that n is in the range of signed 32-bit numbers) in the sense that it places the sign bit (or if you prefer, one of the left-padding bits) in the lowest bit position. You just need to get rid of the rest of the left-padding bits, which you can do with a bitwise and operation like this:
n >> 31 & 1
Or you can make use of the fact that all one bits is how −1 is represented, and simply negate the result:
-(n >> 31)
Alternatively, you can cut off all but the lowest 32 1 bits before you do the shift, like this:
(n & 0xffffffff) >> 31
This is all under the assumption that you are working with numbers that fit into a signed 32-bit int. If n might need a 64 bit representation, shift by 63 places instead of 31, and if it's just 16 bits numbers, shifting by 15 places is enoough. (If you use the (n & 0xffffffff) >> 31 variant, adjust the number of fs accordingly).
On machine code level, and-ing/negating and shifting is potentially much more efficient than using comparison. The former is just a couple of machine instructions, while the latter would usually boil down to a branch. Branching doesn't just take more machine instructions, it also has a bad influence on the pipelining and out-of-order execution of modern CPUs. But Python execution takes place in a higher-level layer than machine code execution, and therefore it's difficult to say anything about the performance impact in Python: it may depend on the context – as it would in machine code – and may therefore also be difficult to test generally. (Caveat: I don't know much about how low-level execution happens in CPython, or in Python in general. For someone who does, this might not be so difficult to say.)
If you don't know how big n is (in Python, an integer is not required to fit into any specific number of bits), you can use bit_length() to find out. This will work for integers of any size:
-(n >> n.bit_length())
The bit_length() operation might boil down to a single machine instruction, or it might actually need a loop to find the result, depending on the implementation and the underlying machine architecture. In the latter case, this should be noticeably more costly than using a constant
Final remark: in C, n >> 31 is actually not guaranteed to work like you assume, because the C language leaves it undefined whether >> does logical right shift (like you assume) or arithmetic shift right (like Python does). In some languages, these are different operators, which makes it clear what you get. In Java for instance, logical shift right is >>>, and arithmetic shift right is >>.
How about this?
def is_negative(num, places):
return not long(num * 10**places) & 0xFFFFFFFF == long(num * 10**places)
Not efficient, but not using < or > definitely restricts you to weirdness. Note that zero will evaluate as positive.
I was wondering whether Gaussian elimination in modulo 2 (or even generally in modulo k for that purpose) has ever been implemented somewhere, so that I do not have to reinvent the wheel and just use the available resources?
The pseudo code of the algorithm you are looking for exists and is:
// A is n by m binary matrix
i := 1 // row and column index
for i := 1 to m do // for every column
// find non-zero element in column i, starting in row i:
maxi := i
for k := i to n do
if A[k,i] = 1 then maxi := k
end for
if A[maxi,i] = 1 then
swap rows i and maxi in A and b, but do not change the value of i
Now A[i,i] will contain the old value of A[maxi,i], that is 1
for u := i+1 to m do
Add A[u,i] * row i to row u, do this for BOTH, matrix A and RHS vector b
Now A[u,i] will be 0
end for
else
declare error – more than one solution exist
end if
end for
if n>m and if you can find zero row in A with nonzero RHS element, then
declare error – no solution.
end if
// now, matrix A is in upper triangular form and solution can be found
use back substitution to find vector x
Taken from this pdf
Binary arithmetic means arithmetic in modulo 2, which is what you are looking for in your question, if I am not wrong.
Unfortunately I don't code in Python, but if you are familiar with Python, you can simply translate the pseudo code above to Python line by line in your own way for your convenience, and this task should be neither difficult nor long.
I googled "gaussian elimination modulo 2 python", but didn't find the python code you are looking for, but I think that this is for good, because during the translation you may understand the algorithm and the method better.
EDIT 1: If you are also familiar with C# and it's not effort for you to translate C# to Python, then Michael Anderson's answer to this question may also help you.
EDIT 2: After posting the answer, I continued the search and found this
"over any field" implies "over modulo 2" and even "over modulo k" for any k≥2.
It contains Source Code for Java Version and Python version too.
According to the last link I gave you for the Python version fieldmath.py includes the class BinaryField which suppose to be the modulo 2 as you wish.
Enjoy!
I just hope that Gauss-Jordan elimination and Gaussian Elimination are not two different things.
EDIT 3: If you are also familiar with VC++ and translating VC++ to Python isn't effort for you then you can also try this.
I hope this well answers your question.
I am attempting problem 10 of Project Euler, which is the summation of all primes below 2,000,000. I have tried implementing the Sieve of Erasthotenes using Python, and the code I wrote works perfectly for numbers below 10,000.
However, when I attempt to find the summation of primes for bigger numbers, the code takes too long to run (finding the sum of primes up to 100,000 took 315 seconds). The algorithm clearly needs optimization.
Yes, I have looked at other posts on this website, like Fastest way to list all primes below N, but the solutions there had very little explanation as to how the code worked (I am still a beginner programmer) so I was not able to actually learn from them.
Can someone please help me optimize my code, and clearly explain how it works along the way?
Here is my code:
primes_below_number = 2000000 # number to find summation of all primes below number
numbers = (range(1, primes_below_number + 1, 2)) # creates a list excluding even numbers
pos = 0 # index position
sum_of_primes = 0 # total sum
number = numbers[pos]
while number < primes_below_number and pos < len(numbers) - 1:
pos += 1
number = numbers[pos] # moves to next prime in list numbers
sum_of_primes += number # adds prime to total sum
num = number
while num < primes_below_number:
num += number
if num in numbers[:]:
numbers.remove(num) # removes multiples of prime found
print sum_of_primes + 2
As I said before, I am new to programming, therefore a thorough explanation of any complicated concepts would be deeply appreciated. Thank you.
As you've seen, there are various ways to implement the Sieve of Erasthotenes in Python that are more efficient than your code. I don't want to confuse you with fancy code, but I can show how to speed up your code a fair bit.
Firstly, searching a list isn't fast, and removing elements from a list is even slower. However, Python provides a set type which is quite efficient at performing both of those operations (although it does chew up a bit more RAM than a simple list). Happily, it's easy to modify your code to use a set instead of a list.
Another optimization is that we don't have to check for prime factors all the way up to primes_below_number, which I've renamed to hi in the code below. It's sufficient to just go to the square root of hi, since if a number is composite it must have a factor less than or equal to its square root.
We don't need to keep a running total of the sum of the primes. It's better to do that at the end using Python's built-in sum() function, which operates at C speed, so it's much faster than doing the additions one by one at Python speed.
# number to find summation of all primes below number
hi = 2000000
# create a set excluding even numbers
numbers = set(xrange(3, hi + 1, 2))
for number in xrange(3, int(hi ** 0.5) + 1):
if number not in numbers:
#number must have been removed because it has a prime factor
continue
num = number
while num < hi:
num += number
if num in numbers:
# Remove multiples of prime found
numbers.remove(num)
print 2 + sum(numbers)
You should find that this code runs in a a few seconds; it takes around 5 seconds on my 2GHz single-core machine.
You'll notice that I've moved the comments so that they're above the line they're commenting on. That's the preferred style in Python since we prefer short lines, and also inline comments tend to make the code look cluttered.
There's another small optimization that can be made to the inner while loop, but I let you figure that out for yourself. :)
First, removing numbers from the list will be very slow. Instead of this, make a list
primes = primes_below_number * True
primes[0] = False
primes[1] = False
Now in your loop, when you find a prime p, change primes[k*p] to False for all suitable k. (You wouldn't actually do multiply, you'd continually add p, of course.)
At the end,
primes = [n for n i range(primes_below_number) if primes[n]]
This should be a great deal faster.
Second, you can stop looking once your find a prime greater than the square root of primes_below_number, since a composite number must have a prime factor that doesn't exceed its square root.
Try using numpy, should make it faster. Replace range by xrange, it may help you.
Here's an optimization for your code:
import itertools
primes_below_number = 2000000
numbers = list(range(3, primes_below_number, 2))
pos = 0
while pos < len(numbers) - 1:
number = numbers[pos]
numbers = list(
itertools.chain(
itertools.islice(numbers, 0, pos + 1),
itertools.ifilter(
lambda n: n % number != 0,
itertools.islice(numbers, pos + 1, len(numbers))
)
)
)
pos += 1
sum_of_primes = sum(numbers) + 2
print sum_of_primes
The optimization here is because:
Removed the sum to outside the loop.
Instead of removing elements from a list we can just create another one, memory is not an issue here (I hope).
When creating the new list we create it by chaining two parts, the first part is everything before the current number (we already checked those), and the second part is everything after the current number but only if they are not divisible by the current number.
Using itertools can make things faster since we'd be using iterators instead of looping through the whole list more than once.
Another solution would be to not remove parts of the list but disable them like #saulspatz said.
And here's the fastest way I was able to find: http://www.wolframalpha.com/input/?i=sum+of+all+primes+below+2+million 😁
Update
Here is the boolean method:
import itertools
primes_below_number = 2000000
numbers = [v % 2 != 0 for v in xrange(primes_below_number)]
numbers[0] = False
numbers[1] = False
numbers[2] = True
number = 3
while number < primes_below_number:
n = number * 3 # We already excluded even numbers
while n < primes_below_number:
numbers[n] = False
n += number
number += 1
while number < primes_below_number and not numbers[number]:
number += 1
sum_of_numbers = sum(itertools.imap(lambda index_n: index_n[1] and index_n[0] or 0, enumerate(numbers)))
print(sum_of_numbers)
This executes in seconds (took 3 seconds on my 2.4GHz machine).
Instead of storing a list of numbers, you can instead store an array of boolean values. This use of a bitmap can be thought of as a way to implement a set, which works well for dense sets (there aren't big gaps between the values of members).
An answer on a recent python sieve question uses this implementation python-style. It turns out a lot of people have implemented a sieve, or something they thought was a sieve, and then come on SO to ask why it was slow. :P Look at the related-questions sidebar from some of them if you want more reading material.
Finding the element that holds the boolean that says whether a number is in the set or not is easy and extremely fast. array[i] is a boolean value that's true if i is in the set, false if not. The memory address can be computed directly from i with a single addition.
(I'm glossing over the fact that an array of boolean might be stored with a whole byte for each element, rather than the more efficient implementation of using every single bit for a different element. Any decent sieve will use a bitmap.)
Removing a number from the set is as simple as setting array[i] = false, regardless of the previous value. No searching, not comparison, no tracking of what happened, just one memory operation. (Well, two for a bitmap: load the old byte, clear the correct bit, store it. Memory is byte-addressable, but not bit-addressable.)
An easy optimization of the bitmap-based sieve is to not even store the even-numbered bytes, because there is only one even prime, and we can special-case it to double our memory density. Then the membership-status of i is held in array[i/2]. (Dividing by powers of two is easy for computers. Other values are much slower.)
An SO question:
Why is Sieve of Eratosthenes more efficient than the simple "dumb" algorithm? has many links to good stuff about the sieve. This one in particular has some good discussion about it, in words rather than just code. (Nevermind the fact that it's talking about a common Haskell implementation that looks like a sieve, but actually isn't. They call this the "unfaithful" sieve in their graphs, and so on.)
discussion on that question brought up the point that trial division may be fast than big sieves, for some uses, because clearing the bits for all multiples of every prime touches a lot of memory in a cache-unfriendly pattern. CPUs are much faster than memory these days.
I am trying to get an accepted answer for this question:http://www.spoj.com/problems/PRIME1/
It's nothing new, just wanting prime numbers to be generated between two given numbers. Eventually, I have coded the following. But spoj is giving me runtime-error(nzec), and I have no idea how it should be dealt with. I hope you can help me with it. Thanks in advance.
def is_prime(m,n):
myList= []
mySieve= [True] * (n+1)
for i in range(2,n+1):
if mySieve[i]:
myList.append(i)
for x in range(i*i,n+1,i):
mySieve[x]= False
for a in [y for y in myList if y>=m]:
print(a)
t= input()
count = 0
while count <int(t):
m, n = input().split()
count +=1
is_prime(int(m),int(n))
if count == int(t):
break
print("\n")
Looking at the problem definition:
In each of the next t lines there are two numbers m and n (1 <= m <= n <= 1000000000, n-m<=100000) separated by a space.
Looking at your code:
mySieve= [True] * (n+1)
So, if n is 1000000000, you're going to try to create a list of 1000000001 boolean values. That means you're asking Python to allocate storage for a billion pointers. On a 64-bit platform, that's 8GB—which is fine as far as Python's concerned, but might well throw your system into swap hell or get it killed by a limit or watchdog. On a 32-bit platform, that's 4GB—which will guarantee you a MemoryError.
The problem also explicitly has this warning:
Warning: large Input/Output data, be careful with certain languages
So, if you want to implement it this way, you're going to have to come up with a more compact storage. For example, array.array('B', [True]) * (n+1) will only take 1GB instead of 4 or 8. And you can make it even smaller (128MB) if you store it in bits instead of bytes, but that's not quite as trivial a change to code.
Calculating prime numbers between two numbers is meaningless. You can only calculate prime numbers until a given number by using other primes you found before, then show only range you wanted.
Here is a python code and some calculated primes you can continue by using them:
bzr branch http://bzr.ceremcem.net/calc-primes
This code is somewhat trivial but is working correctly and tested well.
I am new to StackOverflow, and I am extremely new to Python.
My problem is this... I am needing to write a double-sum, as follows:
The motivation is that this is the angular correction to the gravitational potential used for the geoid.
I am having difficulty writing the sums. And please, before you say "Go to such-and-such a resource," or get impatient with me, this is the first time I have ever done coding/programming/whatever this is.
Is this a good place to use a "for" loop?
I have data for the two indices (n,m) and for the coefficients c_{nm} and s_{nm} in a .txt file. Each of those items is a column. When I say usecols, do I number them 0 through 3, or 1 through 4?
(the equation above)
\begin{equation}
V(r, \phi, \lambda) = \sum_{n=2}^{360}\left(\frac{a}{r}\right)^{n}\sum_{m=0}^{n}\left[c_{nm}*\cos{(m\lambda)} + s_{nm}*\sin{(m\lambda)}\right]*\sqrt{\frac{(n-m)!}{(n+m)!}(2n + 1)(2 - \delta_{m0})}P_{nm}(\sin{\lambda})
\end{equation}
(2) Yes, a "for" loop is fine. As #jpmc26 notes, a generator expression is a good alternative to a "for" loop. IMO, you'll want to use numpy if efficiency is important to you.
(3) As #askewchan notes, "usecols" refers to an argument of genfromtxt; as specified in that documentation, column indexes start at 0, so you'll want to use 0 to 3.
A naive implementation might be okay since the larger factorial is the denominator, but I wouldn't be surprised if you run into numerical issues. Here's something to get you started. Note that you'll need to define P() and a. I don't understand how "0 through 3" relates to c and s since their indexes range much further. I'm going to assume that each (and delta) has its own file of values.
import math
import numpy
c = numpy.getfromtxt("the_c_file.txt")
s = numpy.getfromtxt("the_s_file.txt")
delta = numpy.getfromtxt("the_delta_file.txt")
def V(r, phi, lam):
ret = 0
for n in xrange(2, 361):
for m in xrange(0, n + 1):
inner = c[n,m]*math.cos(m*lam) + s[n,m]*math.sin(m*lam)
inner *= math.sqrt(math.factorial(n-m)/math.factorial(n+m)*(2*n+1)*(2-delta[m,0]))
inner *= P(n, m, math.sin(lam))
ret += math.pow(a/r, n) * inner
return ret
Make sure to write unittests to check the math. Note that "lambda" is a reserved word.