I couldn't find much on this. I'm trying to compare two values to check that they are not equal. In my case, one can be (and often is) either greater than or less than the other.
Should I use:
if a <> b:
    dostuff
or
if a != b:
    dostuff
This page says they're similar, which implies there's at least something different about them.
Quoting from Python language reference,
The comparison operators <> and != are alternate spellings of the same operator. != is the preferred spelling; <> is obsolescent.
So, they both are one and the same, but != is preferred over <>.
I tried disassembling the code in Python 2.7.8
from dis import dis
form_1 = compile("'Python' <> 'Python'", "string", 'exec')
form_2 = compile("'Python' != 'Python'", "string", 'exec')
dis(form_1)
dis(form_2)
And got the following
1 0 LOAD_CONST 0 ('Python')
3 LOAD_CONST 0 ('Python')
6 COMPARE_OP 3 (!=)
9 POP_TOP
10 LOAD_CONST 1 (None)
13 RETURN_VALUE
1 0 LOAD_CONST 0 ('Python')
3 LOAD_CONST 0 ('Python')
6 COMPARE_OP 3 (!=)
9 POP_TOP
10 LOAD_CONST 1 (None)
13 RETURN_VALUE
Both <> and != are generating the same byte code
6 COMPARE_OP 3 (!=)
So they both are one and the same.
Note:
<> is removed in Python 3.x, as per the Python 3 Language Reference.
Quoting official documentation,
!= can also be written <>, but this is an obsolete usage kept for backwards compatibility only. New code should always use !=.
Conclusion
Since <> is removed in 3.x and the documentation names != as the preferred spelling, it is better not to use <> at all.
Just stick to !=.
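As a quick check on a Python 3 interpreter, the sketch below tries to compile both spellings. The helper name compiles_ok is made up for this illustration, and the result assumes that `from __future__ import barry_as_FLUFL` is not in effect.

```python
def compiles_ok(source):
    """Return True if `source` is a valid expression for this interpreter."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False

# On Python 3, only the != spelling should compile.
print(compiles_ok("1 != 2"))
print(compiles_ok("1 <> 2"))
```

On Python 2 both calls would succeed, which matches the identical bytecode shown above.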
<> is outdated! Please check a recent Python reference manual.
I understand that, in CPython 2.x and 3.x, some integers are singletons:
>>> a = 256; a is 256 # or any integer from -5 to 256
True
>>> a = 257; a is 257 # or any other integer outside the magic range
False
Accordingly, if I run sys.getrefcount on an integer in the range -5 to 256, I find that a lot of my imported packages have referenced that integer:
>>> sys.getrefcount(1)
1470
I also understand that sys.getrefcount returns 1 more than you might expect, because of its own reference to the argument:
>>> a = 257; sys.getrefcount(a)
2
What I don't get is this:
>>> sys.getrefcount(257)
3
Why 3, not 2? I could understand that I might have created a temporary variable in my own scope (count 1), and clearly sys.getrefcount would add another reference of its own to that (count 2), but where does the third one come from, and why didn't it happen in the previous example? And more importantly: are there other contexts in which this might occur, leading to possible misinterpretation of sys.getrefcount outputs?
All of the above are replicable for me on 64-bit Python 2.7.12 and 3.5.1 by Anaconda, running on OSX, and also on a 32-bit Python 2.7.5 distribution running on Windows. However, on an older Python version (32-bit Python 2.5.4 on Windows), sys.getrefcount(257) returns 2 which is (to me) more expected.
You are running into an implementation detail here. The compiler can often cache immutable literal values:
>>> import dis
>>> compile("sys.getrefcount(257)", '', 'single').co_consts
(257, None)
>>> dis.dis(compile("sys.getrefcount(257)", '', 'single'))
1 0 LOAD_NAME 0 (sys)
2 LOAD_ATTR 1 (getrefcount)
4 LOAD_CONST 0 (257)
6 CALL_FUNCTION 1
8 PRINT_EXPR
10 LOAD_CONST 1 (None)
12 RETURN_VALUE
('single' is the mode used by the interactive interpreter).
We see 3 references here: one from the co_consts tuple on the code object, one on the stack (from the LOAD_CONST instruction), and one held by the sys.getrefcount() function itself for its argument.
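To see the extra owner without relying on exact reference counts (which vary between CPython versions), one can inspect the constants of the compiled code directly. This is a minimal sketch; constants_of is a hypothetical helper name, not part of any library.

```python
def constants_of(source):
    """Compile `source` in 'single' mode, as the interactive prompt does,
    and return the constants stored on the resulting code object."""
    return compile(source, "<example>", "single").co_consts

# The literal 257 is stored on the code object, which holds a reference to it.
literal_consts = constants_of("sys.getrefcount(257)")
# When the value is passed by name, the code object stores no such constant.
name_consts = constants_of("sys.getrefcount(a)")

print(257 in literal_consts)
print(257 in name_consts)
```

The code object is the owner of the reference that is missing in the `a = 257; sys.getrefcount(a)` case.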
In Python, for comparisons like this, does the interpreter create a temporary object for the string constant "help" and then continue with the equality comparison? The object would be GCed at some point.
s1 = "nohelp"
if s1 == "help":
    # Blah Blah
String literals, like all Python constants, are created during compile time, when the source code is translated to byte code. And because all Python strings are immutable the interpreter can re-use the same string object if it encounters the same string literal in multiple places. It can even do that if the literal string is created via concatenation of literals, but not if the string is built by concatenating a string literal to an existing string object.
Here's a short demo that creates a few identical strings inside and outside of functions. It also dumps the disassembled byte code of one of the functions.
from __future__ import print_function
from dis import dis

def f1(s):
    a = "help"
    print('f1', id(s), id(a))
    return s > a

def f2(s):
    a = "help"
    print('f2', id(s), id(a))
    return s > a

a = "help"
print(id(a))
print(f1("he" + "lp"))
b = "h"
print(f2(b + "elp"))
print("\nf1")
dis(f1)
Typical output on a 32-bit machine running Python 2.6.6:
3073880672
f1 3073880672 3073880672
False
f2 3073636576 3073880672
False
f1
26 0 LOAD_CONST 1 ('help')
3 STORE_FAST 1 (a)
27 6 LOAD_GLOBAL 0 (print)
9 LOAD_CONST 2 ('f1')
12 LOAD_GLOBAL 1 (id)
15 LOAD_FAST 0 (s)
18 CALL_FUNCTION 1
21 LOAD_GLOBAL 1 (id)
24 LOAD_FAST 1 (a)
27 CALL_FUNCTION 1
30 CALL_FUNCTION 3
33 POP_TOP
28 34 LOAD_FAST 0 (s)
37 LOAD_FAST 1 (a)
40 COMPARE_OP 4 (>)
43 RETURN_VALUE
Note that the ids of all the "help" strings are identical, apart from the one constructed with b + "elp".
(BTW, Python will concatenate adjacent string literals, so instead of writing "he" + "lp" I could've written "he" "lp", or even "he""lp").
The string literals themselves are not freed until the process is cleaning itself up at termination. However, a string built at run time, like the result of b + "elp", would be GC'ed once it went out of scope.
Note that in CPython (standard Python) when objects are GC'ed their memory is returned to Python's allocation system for recycling, not to the OS. Python does return unneeded memory to the OS, but only in special circumstances. See Releasing memory in Python and Why doesn't memory get released to system after large queries (or series of queries) in django?
Another question that discusses this topic: Why strings object are cached in python
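A smaller way to see the same thing is to look at where the literal lives. The sketch below assumes CPython; the function names literal_id and built_at_runtime are invented for this example.

```python
def literal_id():
    a = "help"          # loaded from the function's constants on every call
    return id(a)

def built_at_runtime(prefix):
    return prefix + "elp"   # a new string object is built when this runs

first = literal_id()
second = literal_id()

# The literal is stored once on the code object, so both calls see one object.
print("help" in literal_id.__code__.co_consts)
print(first == second)
# The run-time string is equal in value, whatever its identity turns out to be.
print(built_at_runtime("h") == "help")
```

No temporary object is created for the literal at comparison time; the code object keeps it alive for as long as the function exists.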
I do this:
>>> dis.dis(lambda: 1 + 1)
0 LOAD_CONST 2 (2)
3 RETURN_VALUE
I was expecting a BINARY_ADD opcode to perform the addition. How was the sum computed?
This is the work of Python's peephole optimizer. It evaluates simple operations with only constants during the compile time itself and stores the result as a constant in the generated bytecode.
Quoting from the Python 2.7.9 Source code,
/* Fold binary ops on constants.
   LOAD_CONST c1 LOAD_CONST c2 BINOP --> LOAD_CONST binop(c1,c2) */
case BINARY_POWER:
case BINARY_MULTIPLY:
case BINARY_TRUE_DIVIDE:
case BINARY_FLOOR_DIVIDE:
case BINARY_MODULO:
case BINARY_ADD:
case BINARY_SUBTRACT:
case BINARY_SUBSCR:
case BINARY_LSHIFT:
case BINARY_RSHIFT:
case BINARY_AND:
case BINARY_XOR:
case BINARY_OR:
    if (lastlc >= 2 &&
        ISBASICBLOCK(blocks, i-6, 7) &&
        fold_binops_on_constants(&codestr[i-6], consts)) {
        i -= 2;
        assert(codestr[i] == LOAD_CONST);
        cumlc = 1;
    }
    break;
Basically, it looks for instructions like this
LOAD_CONST c1
LOAD_CONST c2
BINARY_OPERATION
and evaluates that and replaces those instructions with the result and a LOAD_CONST instruction. Quoting the comment in the fold_binops_on_constants function,
/* Replace LOAD_CONST c1. LOAD_CONST c2 BINOP
with LOAD_CONST binop(c1,c2)
The consts table must still be in list form so that the
new constant can be appended.
Called with codestr pointing to the first LOAD_CONST.
Abandons the transformation if the folding fails (i.e. 1+'a').
If the new constant is a sequence, only folds when the size
is below a threshold value. That keeps pyc files from
becoming large in the presence of code like: (None,)*1000.
*/
The actual evaluation of this particular code happens in this block,
case BINARY_ADD:
newconst = PyNumber_Add(v, w);
break;
Python processes the expression from the inside out: the compiler reads 1 + 1, evaluates it to 2, and then creates a function object that returns the constant 2 (notice the order here!). Finally, the dis function disassembles the newly created lambda function object, which simply returns 2.
Thus, the 1+1 has already been computed when the lambda function object is created, and the dis.dis() function knows nothing about the addition that took place when the interpreter read 1+1 and evaluated it to 2.
If you do something like:
>>> dis.dis(lambda: x + 1)
1 0 LOAD_GLOBAL 0 (x)
3 LOAD_CONST 1 (1)
6 BINARY_ADD
7 RETURN_VALUE
You'll notice that a BINARY_ADD instruction is used, since x + 1 can't be further simplified by itself.
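The same comparison can be made programmatically on Python 3 with dis.get_instructions. This is a sketch only: the helper name binary_ops is made up, and the exact opcode name differs by version (BINARY_ADD on older releases, BINARY_OP on newer ones), which is why it matches on the prefix.

```python
import dis

def binary_ops(func):
    """Return the names of all BINARY_* instructions in func's bytecode."""
    return [ins.opname
            for ins in dis.get_instructions(func)
            if ins.opname.startswith("BINARY_")]

# Constant operands: the addition is folded away at compile time.
folded = binary_ops(lambda: 1 + 1)
# x is only known at run time, so the addition must stay in the bytecode.
# (The lambda is never called, so the undefined name x is harmless here.)
unfolded = binary_ops(lambda: x + 1)

print(folded)
print(unfolded)
```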
I'd like to develop a small debugging tool for Python programs. For dynamic slicing, how can I find the variables that are accessed in a statement, and the type of access (read or write) for each of them?
Write: a statement can change the program state.
Read: a statement can read the program state.
For example, for these 4 lines we have:
(1) x = a+b => write{x} & reads{a,b}
(2) y = 6 => write{y} & reads{}
(3) while(n>1) => write{} & reads{n}
(4) n = n-1 => write{n} & reads{n}
Not sure what your goal is. Perhaps dis is what you're looking for?
>>> import dis
>>> dis.dis("x=a+b")
1 0 LOAD_NAME 0 (a)
3 LOAD_NAME 1 (b)
6 BINARY_ADD
7 STORE_NAME 2 (x)
10 LOAD_CONST 0 (None)
13 RETURN_VALUE
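In the listing above, LOAD_NAME marks a read and STORE_NAME marks a write. If parsing disassembly feels fragile, a different route is the ast module, where each name node carries a Load or Store context. The sketch below uses ast rather than dis, handles plain variable names only (not attributes or subscripts), and reads_and_writes is a hypothetical helper name.

```python
import ast

def reads_and_writes(statement):
    """Return (reads, writes): the sets of plain names a statement loads and stores."""
    reads, writes = set(), set()
    for node in ast.walk(ast.parse(statement)):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                writes.add(node.id)
            elif isinstance(node.ctx, ast.Load):
                reads.add(node.id)
    return reads, writes

for stmt in ["x = a + b", "y = 6", "while n > 1: pass", "n = n - 1"]:
    print(stmt, "=>", reads_and_writes(stmt))
```

This is static information per statement; a dynamic slicer would still have to combine it with the statements actually executed.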
The dis module can be effectively used to disassemble Python methods, functions and classes into low-level interpreter instructions.
I know that dis information can be used for:
1. Finding race conditions in programs that use threads
2. Finding possible optimizations
From your experience, do you know any other scenarios where Python's disassembly feature could be useful?
dis is useful, for example, when you have different code doing the same thing and you wonder where the performance difference lies.
Example: list += [item] vs list.append(item)
def f(x): return 2*x

def f1(fun, nums):
    result = []
    for item in nums:
        result += [fun(item)]  # builds a one-element list on every iteration
    return result

def f2(fun, nums):
    result = []
    for item in nums:
        result.append(fun(item))  # appends in place
    return result
timeit.timeit says that f2(f, range(100)) is approximately twice as fast as f1(f, range(100)). Why?
(Interestingly f2 is roughly as fast as map(f, range(100)) is.)
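For anyone who wants to repeat the measurement, here is one way it could be set up. The repetition count of 1000 is an arbitrary choice for this sketch, and the ratio you see will depend on the Python version and machine, so no particular numbers are claimed.

```python
import timeit

def f(x):
    return 2 * x

def f1(fun, nums):
    result = []
    for item in nums:
        result += [fun(item)]
    return result

def f2(fun, nums):
    result = []
    for item in nums:
        result.append(fun(item))
    return result

nums = range(100)
# Time 1000 calls of each variant on the same input.
t1 = timeit.timeit(lambda: f1(f, nums), number=1000)
t2 = timeit.timeit(lambda: f2(f, nums), number=1000)
print("f1 (+=):     %.4f s" % t1)
print("f2 (append): %.4f s" % t2)
```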
f1
You can see the whole output of dis by calling dis.dis(f1); here is line 4:
4 19 LOAD_FAST 2 (result)
22 LOAD_FAST 1 (fun)
25 LOAD_FAST 3 (item)
28 CALL_FUNCTION 1
31 BUILD_LIST 1
34 INPLACE_ADD
35 STORE_FAST 2 (result)
38 JUMP_ABSOLUTE 13
>> 41 POP_BLOCK
f2
Again, here is only line 4:
4 19 LOAD_FAST 2 (result)
22 LOAD_ATTR 0 (append)
25 LOAD_FAST 1 (fun)
28 LOAD_FAST 3 (item)
31 CALL_FUNCTION 1
34 CALL_FUNCTION 1
37 POP_TOP
38 JUMP_ABSOLUTE 13
>> 41 POP_BLOCK
Spot the difference
In f1 we need to:
Call fun on item (offset 28)
Make a list out of it (offset 31, expensive!)
Add it to result (offset 34)
Store the returned value in result (offset 35)
In f2, instead, we just:
Call fun on item (offset 31)
Call append on result (offset 34; C code: fast!)
This explains why the (imho) more expressive list += [value] is much slower than the list.append() method.
Other than that, dis.dis is mainly useful for curiosity and for trying to reconstruct code out of .pyc files you don't have the source of without spending a fortune :)
I see the dis module as being, essentially, a learning tool. Understanding what opcodes a certain snippet of Python code generates is a start to getting more "depth" to your grasp of Python -- rooting the "abstract" understanding of its semantics into a sample of (a bit more) concrete implementation. Sometimes the exact reason a certain Python snippet behaves the way it does may be hard to grasp "top-down" with pure reasoning from the "rules" of Python semantics: in such cases, reinforcing the study with some "bottom-up" verification (based on a possible implementation, of course -- other implementations would also be possible;-) can really help the study's effectiveness.
For day-to-day Python programming, not much. However, it is useful if you want to find out why doing something one way is faster than another way. I've also sometimes used it to figure out exactly how the interpreter handles some obscure bits of code. But really, I come up with a practical use-case for it very infrequently.
On the other hand, if your goal is to understand Python rather than just being able to program in it, then it is an invaluable tool. For instance, ever wonder how function definition works? Here you go:
>>> def f():
...     def foo(x=[1, 2, 3]):
...         y = [4,]
...         return x + y
...
>>> dis(f)
2 0 LOAD_CONST 1 (1)
3 LOAD_CONST 2 (2)
6 LOAD_CONST 3 (3)
9 BUILD_LIST 3
12 LOAD_CONST 4 (<code object foo at 0xb7690770, file "<stdin>", line 2>)
15 MAKE_FUNCTION 1
18 STORE_FAST 0 (foo)
21 LOAD_CONST 0 (None)
24 RETURN_VALUE
You can see that this happens by pushing the constants 1, 2, and 3 onto the stack, building a list from them (the default value for x), loading the code object for foo, making a function object out of that code object, and storing it in the variable foo.
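The effects of those instructions can also be observed from Python itself. In this sketch f is changed to return foo so that it can be inspected from outside; that return statement is an addition for illustration and is not in the session above.

```python
def f():
    def foo(x=[1, 2, 3]):
        y = [4]
        return x + y
    return foo  # added so the inner function can be examined

foo = f()

# The default list was built when the def statement ran, then attached to foo.
print(foo.__defaults__)
print(foo())

# The compiled body of foo is stored as a constant on f's code object.
inner_code = [c for c in f.__code__.co_consts if hasattr(c, "co_code")]
print([c.co_name for c in inner_code])
```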