python string formatting {:d} vs %d on floating point number

I realise that this question could be construed as similar to others, so before I start, here is a list of some possible "duplicates" before everyone starts pointing them out. None of these seem to really answer my question properly.
Python string formatting: % vs. .format
"%s" % format vs "{0}".format() vs "?" format
My question specifically pertains to the use of the string.format() method for displaying integer numbers.
Running the following code with % string formatting in the Python 2.7 interpreter truncates the float:
>>> print "%d" %(1.2345)
1
whereas the str.format() method raises an exception:
>>> print "{:d}".format(1.2345)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: Unknown format code 'd' for object type 'float'
I was expecting the same behavior in both; for the interpreter to actually convert my floating point number to an integer prior to displaying. I realise that I could just use the int function to convert the floating point number to integer format, but I was looking for the same functionality you get with the %d formatting method. Is there any string.format() method that would do this for me?

The two implementations are quite separate, and some warts in the % implementation were ironed out. Using %d for floats may mask problems in your code, where you thought you had integers but got floating point values instead. Imagine a value of 1.999999 and only seeing 1 instead of 2 as %d truncates the value.
As such, the float.__format__() hook method called by str.format() to do the actual conversion work does not support the d format and throws an exception instead.
You can use the {:.0f} format to explicitly display (rounded) floating point values with no decimal places:
>>> '{:.0f}'.format(1.234)
'1'
>>> '{:.0f}'.format(1.534)
'2'
or use int() before formatting to explicitly truncate your floating point number.
As a side note, if all you are doing is formatting a number as a string (and not interpolating into a larger string), use the format() function:
>>> format(1.234, '.0f')
'1'
This communicates your intent better and is a little faster to boot.
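As a minimal sketch of the options above (the variable names are illustrative and not from the question), the following contrasts the silent truncation of %d, explicit truncation via int(), rounding via .0f, and the ValueError raised by the d format code on a float:

```python
value = 1.999999

# %d silently truncates the float toward zero.
percent_style = '%d' % value

# str.format() needs an explicit choice: truncate with int() ...
truncated = '{:d}'.format(int(value))

# ... or round with the .0f presentation type.
rounded = '{:.0f}'.format(value)

# Passing the float straight to the d format code raises ValueError.
try:
    '{:d}'.format(value)
except ValueError as exc:
    error_message = str(exc)

print('%s %s %s' % (percent_style, truncated, rounded))
print(error_message)
```

With 1.999999 the truncating variants give '1' and the rounding variant gives '2', which is exactly the kind of discrepancy that %d hides.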

There is an important change between 2.7 and 3.0 regarding "automatic type conversion" (coercion). While 2.7 was relatively relaxed about this, 3.0 forces you to be more disciplined.
Automatic conversion can be dangerous, as it may silently truncate or reduce some data. Besides, this behavior is inconsistent, and you never know what to expect until you are faced with the problem. Python 3.0 requires that you specify precisely what you want to do.
However, the new str.format() adds some very powerful and useful formatting techniques. It is even very clear with the "free" format '{}'. Like this:
'{}'.format(234)
'{:10}'.format(234)
'{:<10}'.format(234)
See? I didn't need to specify 'integer', 'float' or anything else. This works for any type of value, as long as the format carries no type code such as d:
for v in (234, 1.234, 'toto'):
    for fmt in ('[{}]', '[{:10}]', '[{:<10}]', '[{:>10}]'):
        print(fmt.format(v))
Besides, % formatting is the older style and is generally discouraged in new code. The new str.format() is easier to use and has more features than the old formatting techniques, which, IMHO, renders the old technique less attractive.

Related

Getting error in python: Value Error: invalid literal for int() with base 10: '470.21'

I want to add and subtract values of this type: $12,587.30, and get the answer in the same format. How can I do this?
Here is my code example:
print(int(col_ammount2.lstrip('$'))-int(col_ammount.lstrip('$')))
I removed the $ sign and converted the value to int, but it gives me the base 10 error.

You mentioned you want to do arithmetic operations on the numbers (addition/subtraction), so you probably want them as floats instead. The difference between an integer (int) and a float is that integers do not carry a fractional part.
Additionally, as @officialaimm mentioned, you need to remove the commas too, for example
float('$3,333.33'.replace('$', '').replace(',', ''))
will give you
3333.33
So putting it into your code
print(float(col_ammount2.lstrip('$').replace(',', ''))
- float(col_ammount.lstrip('$').replace(',', '')))
An additional note for when you parse a floating point number (the same applies to integers): watch out for empty values, i.e.
float('')
is bad. If col_ammount and col_ammount2 may be empty at some point, one thing you can do is default them to 0 when that happens:
float(col_ammount.lstrip(...).replace(...) or 0)
You may also want to read this to learn about the problems you may face with floating point arithmetic and their workarounds: https://docs.python.org/3/tutorial/floatingpoint.html
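Since the question asks for the result in the same format, here is a rough sketch that parses, subtracts and re-formats the amounts. It swaps float for decimal.Decimal, which is not what the answer above uses, so that cents stay exact; parse_currency and format_currency are hypothetical helper names, and the sample values are made up for illustration:

```python
from decimal import Decimal


def parse_currency(text):
    """Turn a string such as '$12,587.30' into a Decimal (empty input -> 0)."""
    cleaned = text.strip().lstrip('$').replace(',', '')
    return Decimal(cleaned or '0')


def format_currency(amount):
    """Format a Decimal back into the '$12,587.30' style."""
    sign = '-' if amount < 0 else ''
    return '{}${:,.2f}'.format(sign, abs(amount))


# Illustrative values, not taken from the question's data.
col_ammount = '$2,567.67'
col_ammount2 = '$12,587.30'

difference = parse_currency(col_ammount2) - parse_currency(col_ammount)
print(format_currency(difference))
```

The ',' in the format specification inserts the thousands separators, and .2f keeps two decimal places.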

There are two things you are missing here. Firstly, Python's int(...) cannot parse numbers with commas, so you will need to remove the commas as well, using .replace(',',''). Secondly, int() cannot parse strings containing floating point values; you will have to use float(...) first and then convert the result to an integer using int, math.ceil or math.floor, as appropriate for your needs.
Maybe something like this will solve your problem:
col_ammount2='$1,587.30'
col_ammount = '$2,567.67'
print(int(float(col_ammount2.lstrip('$').replace(',','')))-int(float(col_ammount.lstrip('$').replace(',',''))))
If you do this sort of thing often in your code, a function like this might be handy:
integerify_currency = lambda x:int(float(x.lstrip('$').replace(',','')))
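To make the choice between int, math.floor and math.ceil concrete, here is a small sketch using the '470.21' value from the error message (the variable names are illustrative). The three conversions only disagree in how they treat the fractional part, which matters most for negative numbers:

```python
import math

text = '470.21'

# int() refuses a string that contains a decimal point.
try:
    int(text)
except ValueError:
    direct_conversion_failed = True
else:
    direct_conversion_failed = False

number = float(text)

print(direct_conversion_failed)
print(int(number))          # truncates toward zero
print(math.floor(number))   # rounds down
print(math.ceil(number))    # rounds up
print(int(-number))         # truncation moves toward zero for negatives
print(math.floor(-number))  # floor moves away from zero for negatives
```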

Python : Do not include "L" at the end of the outcome for : randint(100000000000000000000, 999999999999999999999)

So far this is what I found:
from random import randint
randint(100000000000000000000, 999999999999999999999)
The output is:
922106555361958347898L
but I do not want that L there.
I can only use this as an int if there is no "L" at the end of it.
UPDATE
Would it be a better idea to generate two small numbers and then combine them if the goal is simply to have a random number that is 30 digits long?

The reason there's an L there is because this is too large to fit into an int,* so it's a long. See Numeric Types — int, long, float, complex in the docs for more details.
So, why do you get that L, and how do you get rid of it?
Python has two different ways to turn a value into a string representation:
repr returns either the canonical source-code representation or something like <__main__.Eggs at 0x10fbd8908>.
str returns a human-friendly representation.
For example, for strings, str('abc') is abc, while repr('abc') is 'abc'.
And for longs, str(1L) is 1, while repr(1L) is 1L.
When you just type an expression at the interactive prompt, it uses repr. But when you use the print statement, it uses str. So, if you want to see the value without the L, just print it:
print randint(100000000000000000000, 999999999999999999999)
If you want to, e.g., save the string in a variable or write it to a file, you have to call str explicitly.
But if you just want to use it as a number, you don't have to worry about this at all; it's a number, and int and long values can be intermixed freely (as of Python 2.3 or so).
And if you're trying to store it in a MySQL database, whichever MySQL interface you use won't care whether you're giving it int values or long, as long as they fit into the column type.**
Or you could upgrade to Python 3.x, where there is no separate long type anymore (all integers are int, no matter how big) and no L suffix.
* The exact cutoff isn't documented anywhere, but at least for CPython, it's whatever fits into a C long on your platform. So, on most 64-bit platforms, the max value is (1<<63)-1; on the other 64-bit platforms, and all 32-bit platforms, it's (1<<31)-1. You can see for yourself on your platform by printing sys.maxint. At any rate, your number takes 70 bits, so unless someone ports Python 2.x to a platform with 128-bit C longs, it won't fit.
** Note that your values are too big to fit into even a MySQL BIGINT, so you're going to be using either DECIMAL or NUMERIC. Depending on which interface you're using, and how you've set things up, you may have to convert to and from strings manually. But you can do that with the str and int functions, without worrying about which values fit into the int type and which don't.
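The str/repr distinction and the size of these numbers can be checked with a short sketch like the one below. It is written so that it also runs on Python 3, where there is no L suffix at all; the last part addresses the 30-digit goal from the update with a single randint call, and the variable names are only illustrative:

```python
from random import randint

text = 'abc'
print(str(text))    # human-friendly form
print(repr(text))   # source-code form, with quotes

n = randint(100000000000000000000, 999999999999999999999)
digits = str(n)     # str() never carries an L suffix
print(digits)

# More than 63 bits, so Python 2 has to store it as a long.
print(n.bit_length())

# A 30-digit random number does not need to be assembled from pieces.
big = randint(10**29, 10**30 - 1)
print(len(str(big)))
```

Converting back with int(digits) gives the original number again, whichever integer type Python picks internally.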

If you're on the interactive prompt, explicitly print the value. The repr of the value has an L, but the str of the value doesn't.
>>> 922106555361958347898
922106555361958347898L
>>> print 922106555361958347898
922106555361958347898

The output in the REPL has an L suffixed; if you print the value, it is not displayed.
>>> from random import randint
>>> print randint(100000000000000000000, 999999999999999999999)
106315199286113607384
>>>

Precise definition of float string formatting?

Is the following behavior defined in Python's documentation (Python 2.7)?
>>> '{:20}'.format(1e10)
'       10000000000.0'
>>> '{:20g}'.format(1e10)
'               1e+10'
In fact, the first result surprises me: the documentation indicates that not indicating the format type ('f', 'e', etc.) for floats is equivalent to using the general format 'g'. This example shows that this does not seem to be the case, so I'm confused.
Maybe this is related to the fact that "A general convention is that an empty format string ("") produces the same result as if you had called str() on the value."? In fact:
>>> str(1e10)
'10000000000.0'
However, in the case of the {:20} format, the format string is not empty (it is 20), so I'm confused.
So, is this behavior of {:20} defined precisely in the documentation? Is the precise behavior of str() on floats precisely defined (str(1e11) has an exponent, but not str(1e10)…)?
PS: My goal is to format numbers with an uncertainty so that the output is very close to what floats would give (presence or not of an exponent, etc.). However, I'm having a hard time finding the exact formatting rules.
PPS: '{:20}'.format(1e10) gives a result that differs from the string formatting '{!s:20}'.format(1e10), where the string is flushed to the left (as usual for string) instead of to the right.

As @blckknght explains in the comments, '{:20}' specifies a field width of 20; to specify float precision you need a decimal point before it: {:.20} or {:.20g}.
As to why the number is formatted as it is, OP said it: "A general convention is that an empty format string ("") produces the same result as if you had called str() on the value." That's what you're getting, space-padded as per the format string (it's empty as to the number format, and the format can accommodate the full str representation).
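The difference between width and precision, and the alignment difference noted in the PPS, can be seen side by side in a short sketch (the variable names are illustrative):

```python
x = 1e10

# No presentation type: str()-like digits, right-aligned in 20 columns.
default_fmt = '{:20}'.format(x)

# Explicit general format: switches to exponent notation here.
general_fmt = '{:20g}'.format(x)

# '.20' is a precision (significant digits), not a width.
precision_fmt = '{:.20g}'.format(x)

# Converted to str first, so it is left-aligned like any string.
string_fmt = '{!s:20}'.format(x)

for label, text in [('default', default_fmt), ('general', general_fmt),
                    ('precision', precision_fmt), ('string', string_fmt)]:
    print('%-10s %r' % (label, text))
```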

Integer literal is an object in Python? [duplicate]

Possible Duplicate:
accessing a python int literals methods
Everything in Python is an object. Even a number is an object:
>>> a=1
>>> type(a)
<class 'int'>
>>> a.real
1
I tried the following, because we should be able to access class members of an object:
>>> type(1)
<class 'int'>
>>> 1.real
  File "<stdin>", line 1
    1.real
         ^
SyntaxError: invalid syntax
Why does this not work?

Yes, an integer literal is an object in Python. To summarize, the parser needs to be able to understand it is dealing with an object of type integer, while the statement 1.real confuses the parser into thinking it has a float 1. followed by the word real, and therefore raises a syntax error.
To test this you can also try
>>> (1).real
1
as well as
>>> 1.0.real
1.0
So in the case of 1.real, Python is interpreting the . as a decimal point.
Edit
BasicWolf puts it nicely too: 1. is interpreted as the floating point representation of 1, so 1.real is equivalent to writing (1.)real, with no attribute access operator (i.e. period / full stop). Hence the syntax error.
Further edit
As mgilson alludes to in a comment: the parser can handle access to int's attributes and methods, but only as long as the statement makes it clear that it is being given an int and not a float.
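A quick way to check which spellings the parser accepts is to feed them to eval and compile as strings. This is only an illustrative sketch; the list of expressions and the '<example>' filename are made up for the demonstration:

```python
# Each of these separates the integer literal from the attribute access,
# or starts from a float literal, so the parser accepts them.
expressions = ['(1).real', '1 .real', '1.0.real', '1..real']
for source in expressions:
    print('%-10s -> %r' % (source, eval(source)))

# '1.real' is read as the float literal '1.' followed by a stray name,
# so it is rejected before any attribute lookup can happen.
try:
    compile('1.real', '<example>', 'eval')
except SyntaxError:
    parses = False
else:
    parses = True
print(parses)
```

Note that '1..real' works because '1.' is a complete float literal and the second dot is then free to act as the attribute access operator.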

a language is usually built in three layers.
when you provide a program to a language it first has to "read" the program. then it builds what it has read into something it can work with. and finally it runs that thing as "a program" and (hopefully) prints a result.
the problem here is that the first part of python - the part that reads programs - is confused. it's confused because it's not clever enough to know the difference between
1.234
and
1.letters
what seems to be happening is that it thinks you were trying to type a number like 1.234 but made a mistake and typed letters instead(!).
so this has nothing to do with what 1 "really is" and whether or not it is an object. all that kind of logic happens in the second and third stages i described earlier, when python tries to build and then run the program.
what you've uncovered is just a strange (but interesting!) wrinkle in how python reads programs.
[i'd call it a bug, but it's probably like this for a reason. it turns out that some things are hard for computers to read. python is probably designed so that it's easy (fast) for the computer to read programs. fixing this "bug" would probably make the part of python that reads programs slower or more complicated. so it's probably a trade-off.]

Although the behaviour of 1.real seems illogical, it is expected given the language specification: Python interprets 1. as a float (see floating point literals). But as @mutzmatron pointed out, (1).real works because the expression in brackets is a valid Python object.
Update: Note the following pitfalls, which arise because attribute access binds more tightly than +:
>>> 1 + 2j.real  # 2j.real == 0.0
1.0
>>> 1 + 2j.imag  # 2j.imag == 2.0
3.0

You can still access 1.real:
>>> hasattr(1, 'real')
True
>>> getattr(1, 'real')
1

floats inside tuples changing values when accessed

So I have a list of tuples of two floats each. Each tuple represents a range. I am going through another list of floats which represent values to be fit into the ranges. All of these floats are < 1 but positive, so precision matters. One of my tests to determine if a value fits into a range is failing when it should pass. If I print the value and the range that is causing problems I can tell this much:
curValue = 0.00145000000671
range = (0.0014500000067055225, 0.0020968749796738849)
The conditional that is failing is:
if curValue > range[0] and ... blah :
    # do some stuff
From the values given by curValue and range, the test should clearly pass (don't worry about what is in the conditional). Now, if I print explicitly what the value of range[0] is I get:
range[0] = 0.00145000000671
Which would explain why the test is failing. So my question, then, is why the float is changing when it is accessed. It has decimal values available up to a certain precision when part of a tuple, and a different precision when accessed. Why would this be? What can I do to ensure my data maintains a consistent amount of precision across my calculations?

The float doesn't change. The built-in numeric types are all immutable. The cause of what you're observing is that:
print range[0] uses str on the float, which (up until very recent versions of Python) printed fewer digits of a float.
Printing a tuple (be it with repr or str) uses repr on the individual items, which gives a much more accurate representation (again, this isn't true anymore in recent releases, which use a better algorithm for both).
As for why the condition doesn't work out the way you expect, it's probably the usual culprit, the limited precision of floats. Try print repr(curValue), repr(range[0]) to see what Python decided was the closest possible representation of your float literal.
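The effect can be reproduced with the lower bound from the question. In this sketch, '%.12g' stands in for the 12 significant digits that str used for floats in Python 2, and the assignment cur_value = lower is an assumption about the asker's data (that the value sits exactly on the range boundary), not something the question confirms:

```python
lower = 0.0014500000067055225

# 12 significant digits, as Python 2's str() produced for floats.
short_text = '%.12g' % lower
print(short_text)

# repr() gives enough digits to round-trip the stored value exactly.
full_text = repr(lower)
print(full_text)

# Assumption: the tested value is the very same float as the bound.
cur_value = lower
strictly_greater = cur_value > lower
greater_or_equal = cur_value >= lower
print(strictly_greater)
print(greater_or_equal)
```

If that assumption holds, the two printed forms differ only in how many digits are shown, and the strict comparison fails simply because the values are equal; using >= for the lower bound would make the test pass.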

On modern PCs, floats aren't that precise. Even if you enter pi as a constant to 100 decimals, only the first several are stored accurately. The same is happening to you. A Python float is a 64-bit IEEE 754 double with a 53-bit significand (a 32-bit float has only 24 bits), which limits your precision to roughly 15 to 17 significant decimal digits, and in unexpected ways because it's in base 2.
Please note, 0.00145000000671 isn't the exact value as stored by Python. Python only displays a few decimals of the complete stored float if you use print. If you want to see more precisely how Python stores the float, use repr.
If you want better precision, use the decimal module.

It isn't changing per se. Python is doing its best to store the data as a float, but that number is too precise for float, so Python modifies it before it is even accessed (in the very process of storing it). Funny how something so small is such a big pain.
You need to use an arbitrary-precision or fixed-point module such as Simple Python Fixed Point or the decimal module.
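As a rough sketch of what the decimal module buys you, the following compares decimal values built from strings, which are stored exactly as written, with the binary approximation a float holds. The range bounds are the ones from the question; the variable names are illustrative:

```python
from decimal import Decimal

# Binary floats cannot represent 0.1, 0.2 or 0.3 exactly.
float_sum_matches = (0.1 + 0.2 == 0.3)
decimal_sum_matches = (Decimal('0.1') + Decimal('0.2') == Decimal('0.3'))
print(float_sum_matches)
print(decimal_sum_matches)

# Decimal(float) exposes the exact value the float really stores.
exact = Decimal(0.0014500000067055225)
print(exact)

# Built from strings, the bounds keep every digit that was typed.
low = Decimal('0.0014500000067055225')
high = Decimal('0.0020968749796738849')
value = Decimal('0.00145000000671')
in_range = low < value < high
print(in_range)
```

The trade-off is speed: Decimal arithmetic is noticeably slower than float arithmetic, so it is worth it mainly when exact decimal digits matter.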

Not sure it would work in this case, because I don't know if Python's limiting in the output or in the storage itself, but you could try doing:
if curValue - range[0] > 0 and...
