Binary negation in Python

I can't seem to find logical negation of integers as an operator anywhere in Python.
Currently I'm using this:
def not_(x):
    assert x in (0, 1)
    return abs(1 - x)
But I feel a little stupid. Isn't there a built-in operator for this? The logical negation (not) returns a Boolean -- that's not really what I want. Is there a different operator, or a way to make not return an integer, or am I stuck with this dodgy workaround?

You can use:
int(not x)
to convert the boolean to 0 or 1.

Did you mean:
int(not(x))
? Assuming that any non-zero integer value is true and 0 is false you'll always get integer 0 or 1 as a result.
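A quick interactive check (relying only on the usual truthiness rules) confirms this always yields a plain int:

```python
# int(not x): `not` collapses any integer to a bool, int() maps it to 0/1.
print(int(not 0))        # 1
print(int(not 1))        # 0
print(int(not 42))       # 0 -- any non-zero value counts as true
print(type(int(not 0)))  # <class 'int'>
```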

If what you expect is to get 1 when the input is 0, and 0 when the input is 1, then XOR is your friend. You need to XOR your value with 1:
negate = lambda x: x ^ True
negate(0)
Out: 1
negate(1)
Out: 0
negate(False)
Out: True
negate(True)
Out: False

If you are looking for Bitwise Not, then ~ is what you are looking for. However, it works in the two's complement form.
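To be concrete: ~x computes -(x + 1), so on its own it never produces 0 or 1. If the input is known to be 0 or 1, masking off the low bit recovers a logical negation - a sketch, valid only for 0/1 inputs:

```python
# Bitwise NOT in two's complement: ~x == -(x + 1)
print(~0)      # -1
print(~1)      # -2
# Masking with 1 keeps only the lowest bit, giving a 0/1 "not"
print(~0 & 1)  # 1
print(~1 & 1)  # 0
```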

This will raise a KeyError if x is not in (0, 1):
def not_(x):
    return {1: 0, 0: 1}[x]
The tuple version would also accept -1 (it indexes from the end) if you don't add a check for it, but is probably faster:
def not_(x):
    return (1, 0)[x]
$ python -m timeit "(1,0)[0]"
10000000 loops, best of 3: 0.0629 usec per loop
$ python -m timeit "(1,0)[1]"
10000000 loops, best of 3: 0.0646 usec per loop
$ python -m timeit "1^1"
10000000 loops, best of 3: 0.063 usec per loop
$ python -m timeit "1^0"
10000000 loops, best of 3: 0.0638 usec per loop
$ python -m timeit "int(not(0))"
1000000 loops, best of 3: 0.354 usec per loop
$ python -m timeit "int(not(1))"
1000000 loops, best of 3: 0.354 usec per loop
$ python -m timeit "{1:0,0:1}[0]"
1000000 loops, best of 3: 0.446 usec per loop
$ python -m timeit "{1:0,0:1}[1]"
1000000 loops, best of 3: 0.443 usec per loop

You can use not, but then convert the result to an integer:
int(False)
0
int(True)
1

I think your approach is very good, for two reasons:
It is fast, clear and understandable
It does error checking
I assume there cannot be such an operator defined on the integers because of the following problem: what should it return if the given value is not 0 or 1? Throw an exception? Treat positive integers as 1? But what about negative integers?
Your approach defines concrete behaviour: accept only 0 or 1.

This can be done using basic binary and string manipulation features in Python. If x is an integer for which we want the bitwise negation (called x_bar in digital logic class :)):
>>> x_bar = x ^ int('1' * len(bin(x).split('b')[1]), 2)
>>> bin(x_bar)  # the binary string representation of x_bar
The bin(int_value) function returns the binary string representation of any integer, e.g. '0b11011011'. The XOR is done against a string of '1's the same length as x's binary representation.
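The same idea reads more cleanly with int.bit_length(), which avoids the string round-trip; a small sketch (the helper name bit_complement is made up here, and it assumes x > 0):

```python
def bit_complement(x):
    # A mask of all ones across x's bit width; XOR with it flips every bit.
    mask = (1 << x.bit_length()) - 1
    return x ^ mask

print(bin(bit_complement(0b11011011)))  # 0b100100
```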

Related

Which is the efficient way to convert a float into an int in python?

I've been using n = int(n) to convert a float into an int.
Recently, I came across another way to do the same thing :
n = n // 1
Which is the most efficient way, and why?
Test it with timeit:
$ bin/python -mtimeit -n10000000 -s 'n = 1.345' 'int(n)'
10000000 loops, best of 3: 0.234 usec per loop
$ bin/python -mtimeit -n10000000 -s 'n = 1.345' 'n // 1'
10000000 loops, best of 3: 0.218 usec per loop
So floor division is only faster by a small margin. Note that these values are very close, and I had to crank up the loop repeat count to iron out random influences on my machine. Even with such a high count, you need to repeat the experiments a few times to see how much the numbers still vary and what comes out faster most of the time.
This is logical, as int() requires a global lookup and a function call (so state is pushed and popped):
>>> import dis
>>> def use_int(n):
...     return int(n)
...
>>> def use_floordiv(n):
...     return n // 1
...
>>> dis.dis(use_int)
2 0 LOAD_GLOBAL 0 (int)
3 LOAD_FAST 0 (n)
6 CALL_FUNCTION 1
9 RETURN_VALUE
>>> dis.dis(use_floordiv)
2 0 LOAD_FAST 0 (n)
3 LOAD_CONST 1 (1)
6 BINARY_FLOOR_DIVIDE
7 RETURN_VALUE
It is the LOAD_GLOBAL and CALL_FUNCTION opcodes that are slower than the LOAD_CONST and BINARY_FLOOR_DIVIDE opcodes; LOAD_CONST is a simple array lookup, LOAD_GLOBAL needs to do a dictionary lookup instead.
Binding int() to a local name can make a small difference, giving it the edge again (as it has to do less work than // 1 floor division):
$ bin/python -mtimeit -n10000000 -s 'n = 1.345' 'int(n)'
10000000 loops, best of 3: 0.233 usec per loop
$ bin/python -mtimeit -n10000000 -s 'n = 1.345; int_=int' 'int_(n)'
10000000 loops, best of 3: 0.195 usec per loop
$ bin/python -mtimeit -n10000000 -s 'n = 1.345' 'n // 1'
10000000 loops, best of 3: 0.225 usec per loop
Again, you need to run this with 10 million loops to see the differences consistently.
That said, int(n) is a lot more explicit and unless you are doing this in a time-critical loop, int(n) wins it in readability over n // 1. The timing differences are too small to make the cognitive cost of having to work out what // 1 does here worthwhile.
Although Martijn Pieters answered your question of what is faster and how to test it, I feel that speed isn't that important for such a small operation. I would use int() for readability, as Inbar Rose said. When dealing with something this small, readability is far more important - although a time-critical section of code can be an exception.
Actually, int seems to be faster than the division. The slow part is looking the function up in the global scope.
Here are my numbers if we avoid it:
$ python -mtimeit -s 'i=int; a=123.456' 'i(a)'
10000000 loops, best of 3: 0.122 usec per loop
$ python -mtimeit -s 'i=int; a=123.456' 'a//1'
10000000 loops, best of 3: 0.145 usec per loop
Notice that you are not converting from float to int with the floor division operator: the result of this operation is still a float. In Python 2.7.5 (CPython), n = n // 1 is exactly the same as:
n.__floordiv__(1)
which is basically the same as:
n.__divmod__(1)[0]
Both functions return a float instead of an int. Inside CPython's __divmod__ implementation, the denominator and numerator must be converted from PyObject to double. So, in this case, it's faster to use the floor function instead of the // operator, because only one conversion is needed:
from math import floor
n = floor(n)
In the case you really want to convert a float to an integer, I don't think there is a way to beat the performance of int(n).
Too long; didn't read:
Using float.__trunc__() is 30% faster than builtins.int()
I like long explanations:
@MartijnPieters' trick of binding builtins.int is interesting indeed, and it reminds me of An Optimization Anecdote. However, calling builtins.int is not the most efficient approach.
Let's take a look at this:
python -m timeit -n10000000 -s "n = 1.345" "int(n)"
10000000 loops, best of 5: 48.5 nsec per loop
python -m timeit -n10000000 -s "n = 1.345" "n.__trunc__()"
10000000 loops, best of 5: 33.1 nsec per loop
That's a 30% gain! What's happening here?
It turns out all builtins.int does is invoke the following method chain:
If 1.345.__int__ is defined, return 1.345.__int__(); else:
If 1.345.__index__ is defined, return 1.345.__index__(); else:
If 1.345.__trunc__ is defined, return 1.345.__trunc__()
1.345.__int__ is not defined 1 - and neither is 1.345.__index__. Therefore, directly calling 1.345.__trunc__() allows us to skip all the unnecessary method lookups - which are relatively expensive.
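That lookup chain can be sketched in plain Python - a rough simplification of what CPython does in C (the real code has extra checks, and which of these slots float actually defines varies between Python versions):

```python
def int_like(obj):
    # Approximate the slot lookup order that builtins.int follows.
    cls = type(obj)
    if hasattr(cls, '__int__'):
        return cls.__int__(obj)
    if hasattr(cls, '__index__'):
        return cls.__index__(obj)
    if hasattr(cls, '__trunc__'):
        return cls.__trunc__(obj)
    raise TypeError("cannot convert %r to int" % cls.__name__)

print(int_like(1.345))  # 1
```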
What about the binding trick? Well float.__trunc__ is essentially just an instance method and we can pass 1.345 as the self argument.
python -m timeit -n10000000 -s "n = 1.345; f=int" "f(n)"
10000000 loops, best of 5: 43 nsec per loop
python -m timeit -n10000000 -s "n = 1.345; f=float.__trunc__" "f(n)"
10000000 loops, best of 5: 27.4 nsec per loop
Both methods improved as expected 2 and they maintain roughly the same ratio!
1 I'm not entirely certain about this - correct me if somebody knows otherwise.
2 This surprised me because I was under the impression that float.__trunc__ is bound to 1.345 during instance creation. It'd be great if someone would be kind enough to explain this to me.
There is also the method builtins.float.__floor__, which is not mentioned in the documentation - it is faster than builtins.int but slower than builtins.float.__trunc__.
python -m timeit -n10000000 -s "n = 1.345; f=float.__floor__" "f(n)"
10000000 loops, best of 5: 32.4 nsec per loop
It seems to produce the same results on both negative and positive floats. Would be awesome if someone could explain how this fits among the other methods.
Just a statistical test to have a little fun - change the timeit tests to whatever you prefer:
import timeit
from scipy import mean, std, stats, sqrt

# Parameters:
reps = 100000
dups = 50
signif = 0.01
timeit_setup1 = 'i=int; a=123.456'
timeit_test1 = 'i(a)'
timeit_setup2 = 'i=int; a=123.456'
timeit_test2 = 'a//1'

# Some vars
t1_data = []
t2_data = []
frmt = '{:.3f}'
testformat = '{:<' + str(max([len(timeit_test1), len(timeit_test2)])) + '}'

def reportdata(mylist):
    string = 'mean = ' + frmt.format(mean(mylist)) + ' seconds, st.dev. = ' + \
             frmt.format(std(mylist))
    return string

for i in range(dups):
    t1_data.append(timeit.timeit(timeit_test1, setup=timeit_setup1,
                                 number=reps))
    t2_data.append(timeit.timeit(timeit_test2, setup=timeit_setup2,
                                 number=reps))

print testformat.format(timeit_test1) + ':', reportdata(t1_data)
print testformat.format(timeit_test2) + ':', reportdata(t2_data)

ttest = stats.ttest_ind(t1_data, t2_data)
print 't-test: the t value is ' + frmt.format(float(ttest[0])) + \
      ' and the p-value is ' + frmt.format(float(ttest[1]))

isit = ''
if float(ttest[1]) > signif:
    isit = "not "
print 'The difference of ' + \
      '{:.2%}'.format(abs((mean(t1_data) - mean(t2_data)) / mean(t1_data))) + \
      ' +/- ' + \
      '{:.2%}'.format(3 * sqrt((std(t1_data)**2 + std(t2_data)**2) / dups)) + \
      ' is ' + isit + 'significant.'

String multiplication versus for loop

I was solving a Python question on CodingBat.com. I wrote following code for a simple problem of printing a string n times-
def string_times(str, n):
    return n * str
Official result is -
def string_times(str, n):
    result = ""
    for i in range(n):
        result = result + str
    return result

print string_times('hello', 3)
The output is the same for both functions. I am curious how string multiplication (the first function) performs against the for loop (the second function): which one is faster, and which is mostly used?
Also, please suggest a way I could answer this question myself (using time.clock() or something like that).
We can use the timeit module to test this:
python -m timeit "100*'string'"
1000000 loops, best of 3: 0.222 usec per loop
python -m timeit "''.join(['string' for _ in range(100)])"
100000 loops, best of 3: 6.9 usec per loop
python -m timeit "result = ''" "for i in range(100):" " result = result + 'string'"
100000 loops, best of 3: 13.1 usec per loop
You can see that multiplying is by far the faster option. Note that while the string concatenation version isn't that bad in CPython, that may not be true in other Python implementations. You should always opt for string multiplication or str.join() for this reason - not only for speed, but for readability and conciseness.
I've timed the following three functions:
def string_times_1(s, n):
    return s * n

def string_times_2(s, n):
    result = ""
    for i in range(n):
        result = result + s
    return result

def string_times_3(s, n):
    return "".join(s for _ in range(n))
The results are as follows:
In [4]: %timeit string_times_1('hello', 10)
1000000 loops, best of 3: 262 ns per loop
In [5]: %timeit string_times_2('hello', 10)
1000000 loops, best of 3: 1.63 us per loop
In [6]: %timeit string_times_3('hello', 10)
100000 loops, best of 3: 3.87 us per loop
As you can see, s * n is not only the clearest and the most concise, it is also the fastest.
You can use the timeit stuff from either the command line or in code to see how fast some bit of python code is:
$ python -m timeit "\"something\" * 100"
1000000 loops, best of 3: 0.608 usec per loop
Do something similar for your other function and compare.

Converting a list of integers into a single value

If I had list of integers say,
x = [1,2,3,4,5]
Is there an in-built function that can convert this into a single number like 12345? If not, what's the easiest way?
>>> listvar = [1,2,3,4,5]
>>> reduce(lambda x,y:x*10+y, listvar, 0)
12345
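Note that in Python 3, reduce moved into functools; the same left fold then looks like this (assuming the list holds single digits):

```python
from functools import reduce

digits = [1, 2, 3, 4, 5]
# Each step shifts the accumulator one decimal place and adds the next digit.
number = reduce(lambda acc, d: acc * 10 + d, digits, 0)
print(number)  # 12345
```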
If they're digits like this,
sum(digit * 10 ** place for place, digit in enumerate(reversed(x)))
int("".join(str(X) for X in x))
You have not told us what the result for x = [1, 23, 4] should be, by the way...
My answer gives 1234, while the others give 334.
Just for fun :)
int(str(x)[1:-1].replace(', ', ''))
Surprisingly, this is even faster for large list:
$ python -m timeit -s "x=[1,2,3,4,5,6,7,8,9,0]*100" "int(str(x)[1:-1].replace(', ', ''))"
10000 loops, best of 3: 128 usec per loop
$ python -m timeit -s "x=[1,2,3,4,5,6,7,8,9,0]*100" "int(''.join(map(str, x)))"
10000 loops, best of 3: 183 usec per loop
$ python -m timeit -s "x=[1,2,3,4,5,6,7,8,9,0]*100" "reduce(lambda x,y:x*10+y, x, 0)"
1000 loops, best of 3: 649 usec per loop
$ python -m timeit -s "x=[1,2,3,4,5,6,7,8,9,0]*100" "sum(digit * 10 ** place for place, digit in enumerate(reversed(x)))"
100 loops, best of 3: 7.19 msec per loop
But for a very small list (maybe the more common case?), this one is the slowest.

Python comparing strings to their equivalent integers efficiently

What's the most efficient way to compare two Python values, both of which are probably strings but might be integers? So far I'm using str(x) == str(y), but that feels inefficient and (more importantly) ugly:
>>> a = 1.0
>>> b = 1
>>> c = '1'
>>> a == b
True
>>> b == c
False # here I wanted this to be true
>>> str(b)==str(c)
True # true, as desired
My actual objects are dictionary values retrieved with get(), and most of them are strings.
Test it out. I like using %timeit in ipython:
In [1]: %timeit str("1") == str(1)
1000000 loops, best of 3: 702 ns per loop
In [2]: %timeit "1" == str(1)
1000000 loops, best of 3: 412 ns per loop
In [3]: %timeit int("1") == 1
1000000 loops, best of 3: 906 ns per loop
Apart from that, though, if you truly don't know what the input type is, there isn't much you can do about it, unless you want to make assumptions about the input data. For example, if you assume that most of the inputs are equal (same type, same value), you could do something like:
if a == b or str(a) == str(b):
    ... they are equal ...
Which would be faster if they are normally the same type and normally equal... But it will be slower if they aren't normally the same type, or aren't normally equal.
However, are you sure you can't cast everything to a str/int when they enter your code?
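If you do control the point where values enter your code, normalizing once up front avoids repeated conversions on every comparison - a sketch, with a made-up helper and field names for illustration:

```python
def normalize(value):
    # Hypothetical ingestion helper: store everything as str once,
    # so later comparisons are plain str == str.
    return value if isinstance(value, str) else str(value)

data = {"id": 1, "code": "1"}        # mixed types straight from input
clean = {k: normalize(v) for k, v in data.items()}
print(clean["id"] == clean["code"])  # True
```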
wim@wim-acer:~/sandpit$ python -mtimeit "str('69') == str(69)"
1000000 loops, best of 3: 0.28 usec per loop
wim@wim-acer:~/sandpit$ python -mtimeit "int('69') == int(69)"
1000000 loops, best of 3: 0.5 usec per loop
wim@wim-acer:~/sandpit$ python -mtimeit "str('32767') == str(32767)"
1000000 loops, best of 3: 0.317 usec per loop
wim@wim-acer:~/sandpit$ python -mtimeit "int('32767') == int(32767)"
1000000 loops, best of 3: 0.492 usec per loop
Conclusion: Probably how you're already doing it is plenty fast enough. Optimise the slowest parts of your program, after everything is working.

Python simple iteration

I would like to ask what is the best way to do a simple iteration. Suppose I want to repeat a certain task 1000 times; which of the following is best, or is there a better way?
for i in range(1000):
    # do something with no reference to i
i = 0
while i < 1000:
    # do something with no reference to i
    i += 1
thanks very much
The first is considered idiomatic. In Python 2.x, use xrange instead of range.
The for loop is more concise and more readable. while loops are rarely used in Python (with the exception of while True).
A bit of idiomatic Python: if you're trying to do something a set number of times with a range (with no need to use the counter), it's good practice to name the counter _. Example:
for _ in range(1000):
    # do something 1000 times
In Python 2, use
for i in xrange(1000):
    pass
In Python 3, use
for i in range(1000):
    pass
Performance figures for Python 2.6:
$ python -s -m timeit '' 'i = 0
> while i < 1000:
> i += 1'
10000 loops, best of 3: 71.1 usec per loop
$ python -s -m timeit '' 'for i in range(1000): pass'
10000 loops, best of 3: 28.8 usec per loop
$ python -s -m timeit '' 'for i in xrange(1000): pass'
10000 loops, best of 3: 21.9 usec per loop
xrange is preferable to range in this case because it produces a generator rather than the whole list [0, 1, 2, ..., 998, 999]. It'll use less memory, too. If you needed the actual list to work with all at once, that's when you use range. Normally you want xrange: that's why in Python 3, xrange(...) becomes range(...) and range(...) becomes list(range(...)).
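If you truly never touch the counter, itertools.repeat avoids creating the loop integers at all; this is a micro-optimization, so measure before relying on it:

```python
import itertools

count = 0
# repeat(None, 1000) yields the same object 1000 times with no counting.
for _ in itertools.repeat(None, 1000):
    count += 1  # stand-in for the real work
print(count)  # 1000
```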
The first, because the counting is handled in the interpreter's internals rather than in interpreted bytecode. It also leaves one less variable in scope.
