I am confused about the leading 0 in 0b10101000:
It does not seem to be a sign symbol.
In [1]: bin(168)
Out[1]: '0b10101000'
In [2]: int(bin(168), 2)
Out[2]: 168
I assume it should be sufficient, and it would certainly be more succinct, to say b10101000.
Why is the leading 0 needed?
It's to not confuse binary literals with variables.
You can express numbers as literals in whatever base (0b -> binary, 0x -> hexadecimal for instance):
>>> 0b100
4
>>> 0x100
256
The problem arises when there isn't a leading 0. Python identifiers (such as variable names) must start with a letter or an underscore, never a digit. With the leading 0, the interpreter can tell whether it's a literal or a variable.
It would be more succinct, but Python would interpret b10101000 as a variable name if you used it in code whereas it would interpret 0b10101000 as a binary number.
It would be confusing (to you, the programmer) if Python presented the value to you differently from the way it would expect you to present the value to it in code that you write.
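A small sketch of the distinction, assuming Python 3. The assignment to b10101000 is purely illustrative; the point is only that the name is a legal identifier while 0b10101000 is not:

```python
# Without the leading 0, the token is an ordinary identifier.
print("b10101000".isidentifier())   # True
print("0b10101000".isidentifier())  # False: identifiers cannot start with a digit

b10101000 = "just a variable"  # hypothetical name, legal because it starts with a letter
value = 0b10101000             # binary integer literal
print(b10101000, value)        # just a variable 168
```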
Why do you have to use def instead of df, or class instead of cls? That is what the language grammar dictates; it's enforced by language design.
I always understood that if something can be converted to integer (ie; something is string representation of numeric), isdigit() return True. This is not the case with the new feature. Here is the sample below:
Code Sample
But why?
To answer your question, look at the Python 3.6 documentation for the isdigit method:
Return true if all characters in the string are digits and there is at least one character, false otherwise.
Since an underscore isn't a digit, the new format does not work with the current implementation of isdigit. As I commented before, the immediate workaround would be s.replace("_", "").isdigit(), where s is the string containing the newly formatted number; this avoids a try-except block around int.
You also need to strip the minus sign so that negative integers work as well: s.replace("_", "").lstrip("-").isdigit().
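As a rough sketch of that workaround (the helper names looks_like_int and parses_as_int are made up for illustration), compared with simply deferring to int(). Note that the string-based check is more permissive than int() itself, for example with doubled underscores:

```python
def looks_like_int(text):
    """Rough check: optional leading '-', then digits possibly separated by underscores."""
    return text.replace("_", "").lstrip("-").isdigit()


def parses_as_int(text):
    """Stricter check that defers to int() itself."""
    try:
        int(text)
    except ValueError:
        return False
    return True


print(looks_like_int("1_000"))    # True
print(looks_like_int("-1_000"))   # True
print(looks_like_int("1.5"))      # False
print(looks_like_int("1__000"))   # True, although int("1__000") raises ValueError
print(parses_as_int("1__000"))    # False
```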
I just need to know how to put in a numerical value such as 1.5x10^15 into json. I assumed the same syntax as python would work but json doesn't like the *s it seems.
1.5x10^15 isn't a "numerical value," it's an expression. You could put that numerical value in JSON ({"value":1500000000000000}, or {"value":1.5e15} also works), but JSON has no syntax for expressions.
You can use the exponential notation in JSON. RFC 7159 -- 6. Numbers, says:
A number is represented in base 10 using decimal digits. It
contains an integer component that may be prefixed with an optional
minus sign, which may be followed by a fraction part and/or an
exponent part.
So you could use something like 1E400 in theory, although keep in mind that different implementations will have different limits.
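For instance, here is how Python's standard json module treats these forms. This is only one implementation; other parsers may handle the out-of-range case differently:

```python
import json
import math

doc = json.loads('{"value": 1.5e15}')
print(doc["value"])  # 1500000000000000.0 (a float, because of the exponent)

print(json.loads('{"value": 1500000000000000}')["value"])  # 1500000000000000 (an int)

# Valid per the JSON number grammar, but outside the range of a double:
# CPython's default float parsing turns it into infinity.
huge = json.loads("1E400")
print(math.isinf(huge))  # True
```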
When I type int("1.7") Python returns error (specifically, ValueError). I know that I can convert it to integer by int(float("1.7")). I would like to know why the first method returns error.
From the documentation:
If x is not a number or if base is given, then x must be a string or Unicode object representing an integer literal in radix base ...
Obviously, "1.7" does not represent an integer literal in radix base.
If you want to know why the Python devs decided to limit themselves to integer literals in radix base, there is a possibly infinite number of reasons, and you'd have to ask Guido et al. to know for sure. One guess would be ease of implementation plus efficiency. You might think it would be easy for them to implement it as:
1. Interpret the number as a float
2. Truncate it to an integer
Unfortunately, that doesn't work in Python, as integers can have arbitrary precision and floats cannot. Special-casing big numbers could lead to inefficiency for the common case [1].
Additionally, forcing you to do int(float(...)) has the additional benefit of clarity: it makes it more obvious what the input string probably looks like, which can help in debugging elsewhere. In fact, I might argue that even if int accepted strings like "1.7", it'd be better to write int(float("1.7")) anyway for the increased code clarity.
[1] Assuming some validation. Other languages skip this, e.g. Ruby will evaluate '1e6'.to_i and give you 1, since it stops parsing at the first non-integral character. Seems like that could lead to fun bugs to track down ...
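The precision problem with routing every string through float can be seen in a short sketch. The value 2**53 + 1 is chosen here only because it is the smallest positive integer a double cannot represent exactly:

```python
big = "9007199254740993"  # 2**53 + 1, not exactly representable as a double

exact = int(big)
via_float = int(float(big))

print(exact)      # 9007199254740993
print(via_float)  # 9007199254740992: the float step lost the last bit
print(int(float("1.7")))  # 1
```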
We have a good, obvious idea of what "make an int out of this float" means because we think of a float as two parts and we can throw one of them away.
It's not so obvious when we have a string. "Make this string into a float" implies all kinds of subtle things about the contents of the string, and that is not the kind of thing a sane person wants to see in code where the value is not obvious.
So the short answer is: Python likes obvious things and discourages magic.
The Python documentation has a good description of why you cannot do this:
https://docs.python.org/2/library/functions.html#int
If x is not a number or if base is given, then x must be a string or Unicode object representing an integer literal in radix base. Optionally, the literal can be preceded by + or - (with no space in between) and surrounded by whitespace. A base-n literal consists of the digits 0 to n-1, with a to z (or A to Z) having values 10 to 35. The default base is 10. The allowed values are 0 and 2-36. Base-2, -8, and -16 literals can be optionally prefixed with 0b/0B, 0o/0O/0, or 0x/0X, as with integer literals in code. Base 0 means to interpret the string exactly as an integer literal, so that the actual base is 2, 8, 10, or 16.
Basically to typecast to an integer from a string, the string must not contain a "."
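A few of the rules from that paragraph, tried out in Python 3 (where the bare leading-0 octal form from the Python 2 text no longer applies):

```python
print(int("  -42  "))    # -42: a sign and surrounding whitespace are allowed
print(int("0x1f", 16))   # 31: the 0x prefix is optional with base 16
print(int("1f", 16))     # 31
print(int("0b101", 0))   # 5: base 0 infers the base from the prefix
print(int("z", 36))      # 35

try:
    int("1.7")
except ValueError as exc:
    print("rejected:", exc)
```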
Breaks backwards-compatibility. It is certainly possible; however, this would be a terrible idea, since it would break backwards-compatibility with the very old and well-established Python idiom of relying on a try...except ladder ("Easier to ask forgiveness than permission") to determine the type of the string's contents. This idiom has been around and in use since at least Python 1.5, AFAIK; here are two citations: [1] [2]
s = "foo12.7"
#s = "-12.7"
#s = -12
try:
    n = int(s)  # raises ValueError if s is not an integer literal
    print("Do integer stuff with", n)
except ValueError:
    try:
        f = float(s)  # raises ValueError if s is not a float literal
        print("Do float stuff with", f)
    except ValueError:
        print("Handle case for when s is neither float nor integer")
        raise  # if you want to reraise the exception
And another minor thing: it's not just about whether the number contains '.' Scientific notation, or arbitrary letters, could also break the int-ness of the string.
Examples: int("6e7") is not an integer in base 10 (it raises ValueError). However, int("6e7", 16) = 1767 is an integer in base 16 (and the string parses in any base >= 15, with a different value). But int("6e-7") is never an int.
(And if you expand the base to base-36, any legal alphanumeric string (or Unicode) can be interpreted as representing an integer, but doing that by default would generally be a terrible behavior, since "dog" or "cat" are unlikely to be references to integers).
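These cases can be checked directly. The helper try_int below is a hypothetical convenience wrapper, not part of the standard library:

```python
def try_int(text, base=10):
    """Return int(text, base), or None if int() rejects the string."""
    try:
        return int(text, base)
    except ValueError:
        return None


print(try_int("6e7"))       # None
print(try_int("6e7", 16))   # 1767
print(try_int("6e7", 15))   # 1567: also parses, but as a different number
print(try_int("6e-7", 16))  # None
print(try_int("dog", 36))   # 17728
```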
I've got string like x='0x08h, 0x0ah' in Python, wanting to convert it to [8,10] (like unsigned ints). I could split and index it like [int(a[-3:-1],16) for a in x.split(', ')] but is there a better way to convert it to a list of ints?
Would it matter if I had y='080a'?
edit (for plus points:).) what (sane) string-based hexadecimal notations have python support, and which not?
You really have to know what the pattern you're trying to parse is, before you write a parser.
But it looks like your pattern is: optional 0x, then hex digits, then optional h. At least that's the most reasonable thing I can come up with that handles both '0x08h' and '080a'. So:
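def parse_hex(s):
    s = s.rstrip('h')
    # Drop an optional 0x prefix; lstrip('0x') would also strip leading zeros.
    return int(s[2:] if s.startswith('0x') else s, 16)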
Then:
numbers = [parse_hex(s) for s in x.split(', ')]
Of course you don't actually need to remove the 0x prefix, because Python accepts that as part of a hex string, so you could write it as:
def parse_hex(s):
    return int(s.rstrip('h'), 16)
However, I think the intention is clearer if you're more explicit.
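Putting it together for the two inputs from the question, as a sketch assuming Python 3. The name parse_hex_token is made up here, and reading '080a' as two bytes (rather than as the single number 0x080a) is an assumption about what the asker wants:

```python
def parse_hex_token(token):
    """Parse one token such as '0x08h', '0ah' or '0x0A' as a base-16 integer."""
    token = token.strip().lower()
    if token.endswith("h"):
        token = token[:-1]
    return int(token, 16)  # int() accepts an optional 0x prefix for base 16


x = "0x08h, 0x0ah"
numbers = [parse_hex_token(t) for t in x.split(",")]
print(numbers)  # [8, 10]

# For a packed string of two-digit hex bytes, bytes.fromhex does the splitting.
y = "080a"
print(list(bytes.fromhex(y)))  # [8, 10]
print(parse_hex_token(y))      # 2058, if the whole string is meant as one number
```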
From your edit:
edit what (sane) string-based hexadecimal notations have python support, and which not?
See the documentation for int:
Base-2, -8, and -16 literals can be optionally prefixed with 0b/0B, 0o/0O, or 0x/0X, as with integer literals in code.
That's it. (If you read the rest of the paragraph, if you're guaranteed to have 0x/0X, you don't have to explicitly use base=16. But that doesn't help you here, so that one sentence is really all you need.) The docs on Numeric Types and Numeric literals detail exactly what "as with integer literals in code" means; the only surprising things there are that negative numbers aren't literals, complex numbers aren't literals (but pure imaginary numbers are), and non-ASCII digits can be used but the documentation doesn't explain how.
You can also use map (wrapped in list, since map returns an iterator in Python 3): list(map(lambda s: int(s.lower().replace('0x', '').replace('h', ''), 16), x.split(', ')))
Just starting out with Python, so this is probably my mistake, but...
I'm trying out Python. I like to use it as a calculator, and I'm slowly working through some tutorials.
I ran into something weird today. I wanted to find out 2013*2013, but I wrote the wrong thing and wrote 2013*013, and got this:
>>> 2013*013
22143
I checked with my calculator, and 22143 is the wrong answer! 2013 * 13 is supposed to be 26169.
Why is Python giving me a wrong answer? My old Casio calculator doesn't do this...
Because of octal arithmetic, 013 is actually the integer 11.
>>> 013
11
With a leading zero, 013 is interpreted as a base-8 number, and 1*8**1 + 3*8**0 = 11.
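The same arithmetic, written so that it runs on Python 3 (which spells the octal literal 0o13 rather than 013):

```python
# Positional expansion of octal 13: one eight plus three ones.
print(1 * 8**1 + 3 * 8**0)  # 11
print(int("13", 8))         # 11
print(0o13)                 # 11
print(2013 * 0o13)          # 22143
```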
Note: this behaviour was changed in Python 3. Here is a particularly appropriate quote from PEP 3127:
The default octal representation of integers is silently confusing to
people unfamiliar with C-like languages. It is extremely easy to
inadvertently create an integer object with the wrong value, because
'013' means 'decimal 11', not 'decimal 13', to the Python language
itself, which is not the meaning that most humans would assign to this
literal.
013 is an octal integer literal (equivalent to the decimal integer literal 11), due to the leading 0.
>>> 2013*013
22143
>>> 2013*11
22143
>>> 2013*13
26169
It is very common (certainly in most of the languages I'm familiar with) to have octal integer literals start with 0 and hexadecimal integer literals start with 0x. Due to the exact confusion you experienced, Python 3 raises a SyntaxError:
>>> 2013*013
  File "<stdin>", line 1
    2013*013
           ^
SyntaxError: invalid token
and requires either 0o or 0O instead:
>>> 2013*0o13
22143
>>> 2013*0O13
22143
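The rule for literals in source code is not the same as the rule for strings passed to int(). A short Python 3 sketch of the difference:

```python
print(int("013"))     # 13: in a string, leading zeros are allowed in base 10
print(int("013", 8))  # 11
print(oct(11))        # 0o13

try:
    int("013", 0)     # base 0 follows literal syntax, so this is rejected
except ValueError as exc:
    print("rejected:", exc)
```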
Python's 'leading zero' syntax for octal literals is a common gotcha:
Python 2.7.3
>>> 010
8
The syntax was changed in Python 3.x: http://docs.python.org/3.0/whatsnew/3.0.html#integers
This is mostly just expanding on @wim's answer a bit, but Python indicates the base of integer literals using certain prefixes. Without a prefix, integers are interpreted as being in base 10. With a "0x" prefix, the integer will be interpreted as a hexadecimal int. The full grammar specification is here, though it's a bit tricky to understand if you're not familiar with formal grammars: http://docs.python.org/2/reference/lexical_analysis.html#integers
The table essentially says that if you want a long value (i.e. one that exceeds the capacity of a normal int), write the number followed by the letter "L" or "l"; if you want your number to be interpreted in decimal, write the number normally (with no leading 0); if you want it interpreted in octal, prefix it with "0", "0o", or "0O"; if you want it in hex, prefix it with "0x"; and if you want it in binary, prefix it with "0b" or "0B".
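For comparison, a sketch of the prefixes that survive in Python 3, where the bare leading "0" and the "L" suffix described above are no longer accepted:

```python
n = 10
print(0b1010 == 0o12 == 0xA == n)  # True
print(bin(n), oct(n), hex(n))      # 0b1010 0o12 0xa
print(format(n, "b"), format(n, "o"), format(n, "x"))  # 1010 12 a
```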