Add 'decimal-mark' thousands separators to a number - python

How do I format 1000000 as 1.000.000 in Python, where '.' is the decimal-mark thousands separator?

If you want to add a thousands separator, you can write:
>>> '{0:,}'.format(1000000)
'1,000,000'
But it only works in Python 2.7 and above (3.1 and above on the 3.x line).
See format string syntax.
In older versions, you can use locale.format():
>>> import locale
>>> locale.setlocale(locale.LC_ALL, '')
'en_AU.utf8'
>>> locale.format('%d', 1000000, 1)
'1,000,000'
The added benefit of using locale.format() is that it will use your locale's thousands separator, e.g.
>>> import locale
>>> locale.setlocale(locale.LC_ALL, 'de_DE.utf-8')
'de_DE.utf-8'
>>> locale.format('%d', 1000000, 1)
'1.000.000'
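Note that locale.format() was deprecated in Python 3.7 and removed in Python 3.12; locale.format_string() takes the same kind of arguments. Below is a minimal sketch for newer interpreters. The helper name format_with_locale is hypothetical (it is not part of the answer above), and the sketch assumes the requested locale is installed on your system; otherwise setlocale() raises locale.Error.

```python
import locale


def format_with_locale(n, locale_name=""):
    """Format an integer using the grouping rules of the given locale.

    locale_name="" means "use the user's default locale".
    """
    # Remember the current numeric locale so it can be restored afterwards
    old = locale.setlocale(locale.LC_NUMERIC)
    try:
        locale.setlocale(locale.LC_NUMERIC, locale_name)
        return locale.format_string("%d", n, grouping=True)
    finally:
        locale.setlocale(locale.LC_NUMERIC, old)


# The "C" locale defines no thousands separator, so nothing is inserted
print(format_with_locale(1000000, "C"))
```

With a German locale such as 'de_DE.utf-8' installed, the same call would be expected to use '.' as the separator, as in the session above.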

I didn't fully understand the question, but here is how I read it:
You want to convert 1123000 to 1,123,000. You can do that by using format:
http://docs.python.org/release/3.1.3/whatsnew/3.1.html#pep-378-format-specifier-for-thousands-separator
Example:
>>> format(1123000,',d')
'1,123,000'

Just extending the answer a bit here :)
I needed both a thousands separator and a limit on the precision of a floating-point number.
This can be achieved by using the following format string:
>>> my_float = 123456789.123456789
>>> "{:0,.2f}".format(my_float)
'123,456,789.12'
This describes the format()-specifier's mini-language:
[[fill]align][sign][#][0][width][,][.precision][type]
Source: https://www.python.org/dev/peps/pep-0378/#current-version-of-the-mini-language
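To illustrate how the pieces of that mini-language combine with the ',' option, here is a small sketch. The variable names and the sample value are only illustrative.

```python
value = 1234567.891

plain = format(value, ",.2f")      # separator plus two decimal places
signed = format(value, "+,.1f")    # explicit sign, one decimal place
padded = format(value, ">16,.2f")  # right-aligned in a field of width 16

print(plain)
print(signed)
print(repr(padded))
```

The ',' always goes after the width and before the precision, which is the order shown in the grammar line above.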

An idea
def itanum(x):
    return format(x, ',d').replace(",", ".")
>>> itanum(1000)
'1.000'
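The same replace idea can be stretched to floats, where the decimal point has to become a comma at the same time. A minimal sketch, assuming Python 3 (for str.maketrans); the function name decimal_mark_format and its precision parameter are made up for this example:

```python
def decimal_mark_format(value, precision=2):
    """Format a number as 1.234.567,89 (dot for thousands, comma for decimals)."""
    # First format the usual way: ',' for thousands and '.' for decimals
    text = "{:,.{p}f}".format(value, p=precision)
    # Then swap the two characters in a single pass
    return text.translate(str.maketrans(",.", ".,"))


print(decimal_mark_format(1234567.891))
print(decimal_mark_format(1000000, precision=0))
```

Swapping both characters with translate() avoids the temporary-placeholder trick that chained replace() calls would need.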

Strange that nobody mentioned a straightforward solution with regex:
import re
print(re.sub(r'(?<!^)(?=(\d{3})+$)', r'.', "12345673456456456"))
Gives the following output:
12.345.673.456.456.456
It also works if you want to separate the digits only before comma:
re.sub(r'(?<!^)(?=(\d{3})+,)', r'.', "123456734,56456456")
gives:
123.456.734,56456456
The regex uses a lookahead to check that the number of digits after a given position is divisible by 3.
Update 2021: Please use this for scripting only (i.e. only in situations where you can throw the code away after using it). When used in an application, this approach would expose it to ReDoS (regular expression denial of service).

Using itertools can give you some more flexibility:
>>> from itertools import zip_longest
>>> num = "1000000"
>>> sep = "."
>>> places = 3
>>> args = [iter(num[::-1])] * places
>>> sep.join("".join(x) for x in zip_longest(*args, fillvalue=""))[::-1]
'1.000.000'
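If the itertools recipe feels too dense, the same flexibility (any separator, any group size) can be sketched with plain slicing. The function name group_digits and its parameters are hypothetical, and the sketch assumes the input is a string of digits without a sign or decimal part:

```python
def group_digits(digits, sep=".", places=3):
    """Insert sep between groups of `places` digits, counting from the right."""
    # Length of the leading (possibly shorter) group
    head = len(digits) % places
    groups = [digits[:head]] if head else []
    # The remaining digits split evenly into full groups
    groups += [digits[i:i + places] for i in range(head, len(digits), places)]
    return sep.join(groups)


print(group_digits("1000000"))
print(group_digits("12345", sep=",", places=4))
```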

Drawing on the answer by Mikel, I implemented his solution like this in my matplotlib plot. I figured some might find it helpful:
ax=plt.gca()
ax.get_xaxis().set_major_formatter(matplotlib.ticker.FuncFormatter(lambda x, loc: locale.format('%d', x, 1)))

DIY solution
def format_number(n):
    result = ""
    for i, digit in enumerate(reversed(str(n))):
        if i != 0 and (i % 3) == 0:
            result += ","
        result += digit
    return result[::-1]
built-in solution
def format_number(n):
    return "{:,}".format(n)

Here's just an alternative answer.
You can build the result by indexing into the list of digits, with some slightly convoluted logic.
Here's the code:
i = 1234567890
s = str(i)
str1 = ""
s1 = [elm for elm in s]
if len(s1) % 3 == 0:
    # Length is a multiple of 3: every group but the last gets a trailing "."
    for i in range(0, len(s1) - 3, 3):
        str1 += s1[i] + s1[i+1] + s1[i+2] + "."
    # The final group takes no trailing separator
    str1 += "".join(s1[-3:])
else:
    rem = len(s1) % 3
    for i in range(rem):
        str1 += s1[i]
    for i in range(rem, len(s1) - 1, 3):
        str1 += "." + s1[i] + s1[i+1] + s1[i+2]
print(str1)
Output
1.234.567.890

Related

How to format a number with comma every four digits in Python?

I have a number 12345 and I want the result '1,2345'. I tried the following code, but failed:
>>> n = 12345
>>> f"{n:,}"
'12,345'
Regex will work for you:
import re
def format_number(n):
    return re.sub(r"(\d)(?=(\d{4})+(?!\d))", r"\1,", str(n))
>>> format_number(123)
'123'
>>> format_number(12345)
'1,2345'
>>> format_number(12345678)
'1234,5678'
>>> format_number(123456789)
'1,2345,6789'
Explanation:
Match:
(\d) Match a digit...
(?=(\d{4})+(?!\d)) ...that is followed by one or more groups of exactly 4 digits.
Replace:
\1, Replace the matched digit with itself and a ,
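Since the pattern only hard-codes the group size, it can be parameterised. The following is a sketch under the assumption that the input is an integer (a fractional part would get grouped as well); the name group_every and its parameters are invented for illustration:

```python
import re


def group_every(n, size=4, sep=","):
    """Insert sep every `size` digits, counting from the right."""
    # A digit followed by one or more complete groups of `size` digits,
    # with no further digit after them
    pattern = r"(\d)(?=(?:\d{%d})+(?!\d))" % size
    # A function replacement avoids escaping issues if sep contains a backslash
    return re.sub(pattern, lambda m: m.group(1) + sep, str(n))


print(group_every(123456789))
print(group_every(1000000, size=3, sep="."))
```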
Sounds like a locale thing(*). This prints 12,3456,7890:
import locale
n = 1234567890
locale._override_localeconv["thousands_sep"] = ","
locale._override_localeconv["grouping"] = [4, 0]
print(locale.format_string('%d', n, grouping=True))
That's an admittedly hackish way, based on this answer. The other answer there talks about using babel; maybe that's a cleaner way to achieve it.
(*) Quick googling found this talking about Chinese grouping four digits, and OP's name seems somewhat Chinese, so...
Using babel:
>>> from babel.numbers import format_decimal
>>> format_decimal(1234, format="#,####", locale="en")
'1234'
>>> format_decimal(12345, format="#,####", locale="en")
'1,2345'
>>> format_decimal(1234567890, format="#,####", locale="en")
'12,3456,7890'
This format syntax is specified in the Unicode Locale Data Markup Language (LDML). Some light bedtime reading there.
Using stdlib only (hackish):
>>> from textwrap import wrap
>>> n = 12345
>>> ",".join(wrap(str(n)[::-1], width=4))[::-1]
'1,2345'
You can break your number into chunks of four digits using divmod with 10000, then join the chunks with ',' delimiters:
def commas(n):
    s = []
    while n > 0:
        n, chunk = divmod(n, 10000)
        # Zero-pad every chunk except the most significant one
        s.append(str(chunk).zfill(4) if n > 0 else str(chunk))
    return ','.join(reversed(s))
>>> commas(123456789)
'1,2345,6789'
>>> commas(123)
'123'

How to extract a part of a string

I have this string:
-1007.88670550662*p**(-1.0) + 67293.8347365694*p**(-0.416543501823503)
but actually I have a lot of strings like this:
a*p**(-1.0) + b*p**(c)
where a, b and c are doubles. I would like to extract a, b and c from this string. How can I do this using Python?
import re
s = '-1007.88670550662*p**(-1.0) + 67293.8347365694*p**(-0.416543501823503)'
pattern = r'-?\d+\.\d*'
a,_,b,c = re.findall(pattern,s)
print(a, b, c)
Output
('-1007.88670550662', '67293.8347365694', '-0.416543501823503')
Here s is your test string and pattern is the regex pattern. We are looking for floats; once findall() has found them, we unpack them into a, b and c (the second match, the fixed -1.0 exponent, is discarded via _).
Note that this method only works if your string has the format you've given; otherwise you can adjust the pattern to match what you want.
Edit: as several people pointed out in the comments, if you need to allow a + in front of positive numbers, you can use the pattern r'[-+]?\d+\.\d*'
Using the regular expression
(-?\d+\.?\d*)\*p\*\*\(-1\.0\)\s*\+\s*(-?\d+\.?\d*)\*p\*\*\((-?\d+\.?\d*)\)
We can do
import re
pat = r'(-?\d+\.?\d*)\*p\*\*\(-1\.0\)\s*\+\s*(-?\d+\.?\d*)\*p\*\*\((-?\d+\.?\d*)\)'
regex = re.compile(pat)
print(regex.findall('-1007.88670550662*p**(-1.0) + 67293.8347365694*p**(-0.416543501823503)'))
will print [('-1007.88670550662', '67293.8347365694', '-0.416543501823503')]
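A variation on the same pattern is to give the three groups names and convert them to floats in one go. This is only a sketch: the names TERM, PATTERN and parse_expression are hypothetical, and it assumes the expression always has exactly the a*p**(-1.0) + b*p**(c) shape.

```python
import re

# One (optionally signed) decimal number, as in the pattern above
TERM = r"[-+]?\d+\.?\d*"
PATTERN = re.compile(
    r"(?P<a>{n})\*p\*\*\(-1\.0\)\s*\+\s*(?P<b>{n})\*p\*\*\((?P<c>{n})\)".format(n=TERM)
)


def parse_expression(expr):
    """Return {'a': ..., 'b': ..., 'c': ...} as floats, or raise ValueError."""
    match = PATTERN.fullmatch(expr.strip())
    if match is None:
        raise ValueError("unexpected expression format: %r" % expr)
    return {name: float(value) for name, value in match.groupdict().items()}


s = "-1007.88670550662*p**(-1.0) + 67293.8347365694*p**(-0.416543501823503)"
print(parse_expression(s))
```

Using fullmatch() rather than findall() means a string that deviates from the expected shape raises an error instead of silently yielding partial results.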
If your formats are consistent, and you don't want to deep dive into regex (check out regex101 for this, btw) you could just split your way through it.
Here's a start:
>>> s= "-1007.88670550662*p**(-1.0) + 67293.8347365694*p**(-0.416543501823503)"
>>> a, buf, c = s.split("*p**")
>>> b = buf.split()[-1]
>>> a,b,c
('-1007.88670550662', '67293.8347365694', '(-0.416543501823503)')
>>> [float(x.strip("()")) for x in (a,b,c)]
[-1007.88670550662, 67293.8347365694, -0.416543501823503]
The re module can certainly be made to work for this, although as some of the comments on the other answers have pointed out, the corner cases can be interesting -- decimal points, plus and minus signs, etc. It could be even more interesting; e.g. can one of your numbers be imaginary?
Anyway, if your string is always a valid Python expression, you can use Python's built-in tools to process it. Here is a good generic explanation about the ast module's NodeVisitor class. To use it for your example is quite simple:
import ast
x = "-1007.88670550662*p**(-1.0) + 67293.8347365694*p**(-0.416543501823503)"
def getnums(s):
    result = []
    class GetNums(ast.NodeVisitor):
        def visit_Num(self, node):
            result.append(node.n)
        def visit_UnaryOp(self, node):
            if (isinstance(node.op, ast.USub) and
                    isinstance(node.operand, ast.Num)):
                result.append(-node.operand.n)
            else:
                ast.NodeVisitor.generic_visit(self, node)
    GetNums().visit(ast.parse(s))
    return result
print(getnums(x))
This will return a list with all the numbers in your expression:
[-1007.88670550662, -1.0, 67293.8347365694, -0.416543501823503]
The visit_UnaryOp method is only required for Python 3.x.
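ast.Num and visit_Num have been deprecated since Python 3.8 in favour of ast.Constant, so on recent interpreters the visitor can be sketched as follows. The names get_numbers, NumberCollector and _is_number are made up for this example; the traversal logic mirrors the code above.

```python
import ast


def _is_number(node):
    """True for numeric literals (bool is excluded, since it subclasses int)."""
    return (isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool))


def get_numbers(expr):
    numbers = []

    class NumberCollector(ast.NodeVisitor):
        def visit_UnaryOp(self, node):
            # Fold a leading minus sign into the literal it applies to
            if isinstance(node.op, ast.USub) and _is_number(node.operand):
                numbers.append(-node.operand.value)
            else:
                self.generic_visit(node)

        def visit_Constant(self, node):
            if _is_number(node):
                numbers.append(node.value)

    NumberCollector().visit(ast.parse(expr))
    return numbers


x = "-1007.88670550662*p**(-1.0) + 67293.8347365694*p**(-0.416543501823503)"
print(get_numbers(x))
```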
You can use something like:
import re
a,_,b,c = re.findall(r"[\d\-.]+", subject)
print(a,b,c)
While I prefer MooingRawr's answer as it is simple, I would extend it a bit to cover more situations.
A floating-point number can be written as a string in a surprising variety of formats:
Exponential format (e.g. 2.0e+07)
Without leading digit (e.g. .5, which is equal to 0.5)
Without trailing digit (e.g. 5., which is equal to 5)
Positive numbers with plus sign (e.g. +5, which is equal to 5)
Numbers without decimal part (integers) (e.g. 0 or 5)
Script
import re
test_values = [
    '-1007.88670550662*p**(-1.0) + 67293.8347365694*p**(-0.416543501823503)',
    '-2.000e+07*p**(-1.0) + 1.23e+07*p**(-5e+07)',
    '+2.*p**(-1.0) + -1.*p**(5)',
    '0*p**(-1.0) + .123*p**(7.89)'
]
pattern = r'([-+]?\.?\d+\.?\d*(?:[eE][-+]?\d+)?)'
for value in test_values:
    print("Test with '%s':" % value)
    matches = re.findall(pattern, value)
    del matches[1]  # drop the fixed -1.0 exponent
    print(matches, end='\n\n')
Output:
Test with '-1007.88670550662*p**(-1.0) + 67293.8347365694*p**(-0.416543501823503)':
['-1007.88670550662', '67293.8347365694', '-0.416543501823503']
Test with '-2.000e+07*p**(-1.0) + 1.23e+07*p**(-5e+07)':
['-2.000e+07', '1.23e+07', '-5e+07']
Test with '+2.*p**(-1.0) + -1.*p**(5)':
['+2.', '-1.', '5']
Test with '0*p**(-1.0) + .123*p**(7.89)':
['0', '.123', '7.89']

How to remove spaces between letters and print them on one line?

I am stuck guys...
I have a for loop that works perfectly, but I don't know how to remove spaces. I tried using the sep="" in the print function, but that didn't work out. I get this error:
"sytnax error while detecting tuple"
What I want to achieve is this:
abcd.... (so glued together, on one line).
I've placed them on one line, like this:
for letter in range(96,126):
    a_y = chr(letter)
    print(a_y),
Hence the , which I use to print them all on one line. My question: is this approach correct?
And the other one: how on earth can I glue the outputs together? I tried using append and sep="", but both just don't work. Am I doing something wrong?
You can use a list comprehension:
>>> ''.join([chr(n) for n in range(96, 126)])
'`abcdefghijklmnopqrstuvwxyz{|}'
And the reverse:
>>> ''.join([chr(n) for n in reversed(range(96, 126))])
'}|{zyxwvutsrqponmlkjihgfedcba`'
Or, if you really want to use the print function:
from __future__ import print_function
for letter in range(96, 126):
    print(chr(letter), end='')
# Reverse
for letter in reversed(range(96, 126)):
    print(chr(letter), end='')
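The sep="" attempt from the question also works once print is a function: in Python 2 without the __future__ import, print is a statement, so keyword arguments such as sep or end are a syntax error. A small sketch, where the choice of the range a to z is an assumption about what the asker actually wants:

```python
from __future__ import print_function  # only needed on Python 2

# Build the characters first, then let print() glue them with an empty separator
letters = [chr(code) for code in range(ord("a"), ord("z") + 1)]
print(*letters, sep="")
```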
In addition to Tiger-222's answer, you can combine with map:
print ''.join(map(chr, xrange(96, 126)))
Result
`abcdefghijklmnopqrstuvwxyz{|}
Try whichever of the following suits your need (string.letters exists only in Python 2; Python 3 has string.ascii_letters instead).
>>> import string
>>> string.ascii_lowercase
'abcdefghijklmnopqrstuvwxyz'
>>> string.ascii_uppercase
'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
>>> string.letters
'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'
Try,
import string
print string.letters[26:]
or
print ''.join([chr(n) for n in range(97, 123)])

Python - Most elegant way to extract a substring, being given left and right borders [duplicate]

This question already has answers here:
How to extract the substring between two markers?
I have a string - Python :
string = "/foo13546897/bar/Atlantis-GPS-coordinates/bar457822368/foo/"
Expected output is :
"Atlantis-GPS-coordinates"
I know that the expected output is ALWAYS surrounded by "/bar/" on the left and "/" on the right :
"/bar/Atlantis-GPS-coordinates/"
Proposed solution would look like :
a = string.find("/bar/")
b = string.find("/",a+5)
output=string[a+5:b]
This works, but I don't like it.
Does someone know a beautiful function or tip ?
You can use split:
>>> string.split("/bar/")[1].split("/")[0]
'Atlantis-GPS-coordinates'
Some efficiency from adding a max split of 1 I suppose:
>>> string.split("/bar/", 1)[1].split("/", 1)[0]
'Atlantis-GPS-coordinates'
Or use partition:
>>> string.partition("/bar/")[2].partition("/")[0]
'Atlantis-GPS-coordinates'
Or a regex:
>>> re.search(r'/bar/([^/]+)', string).group(1)
'Atlantis-GPS-coordinates'
Depends on what speaks to you and your data.
What you have isn't all that bad. I'd write it as:
start = string.find('/bar/') + 5
end = string.find('/', start)
output = string[start:end]
as long as you know that /bar/WHAT-YOU-WANT/ is always going to be present. Otherwise, I would reach for the regular expression knife:
>>> import re
>>> PATTERN = re.compile('^.*/bar/([^/]*)/.*$')
>>> s = '/foo13546897/bar/Atlantis-GPS-coordinates/bar457822368/foo/'
>>> match = PATTERN.match(s)
>>> match.group(1)
'Atlantis-GPS-coordinates'
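If the markers may be missing and you would rather not bring in a regex, the find-based version can be wrapped in a small helper that reports failure explicitly. This is a sketch; the name between and the choice of returning None are assumptions, not part of the answer above.

```python
def between(text, left, right):
    """Return the text between the first `left` and the next `right`, or None."""
    start = text.find(left)
    if start == -1:
        return None  # left marker not present
    start += len(left)
    end = text.find(right, start)
    if end == -1:
        return None  # no right marker after the left one
    return text[start:end]


string = "/foo13546897/bar/Atlantis-GPS-coordinates/bar457822368/foo/"
print(between(string, "/bar/", "/"))
```

Using len(left) instead of a literal 5 keeps the offset correct if the left marker ever changes.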
import re
pattern = '(?<=/bar/).+?/'
string = "/foo13546897/bar/Atlantis-GPS-coordinates/bar457822368/foo/"
result = re.search(pattern, string)
print string[result.start():result.end() - 1]
# "Atlantis-GPS-coordinates"
That is a Python 2.x example. Here is how the pattern works:
1. (?<=/bar/) is a lookbehind: the rest of the regex only matches if /bar/ comes immediately before it
2. '.+?/' matches as few characters as possible up to and including the next '/' char (the code then drops that trailing '/' with result.end() - 1)
Hope that helps some.
If you need to run this kind of search many times, it is better to compile the pattern for performance, but if you only need to do it once, don't bother.
Using re (slower than other solutions):
>>> import re
>>> string = "/foo13546897/bar/Atlantis-GPS-coordinates/bar457822368/foo/"
>>> re.search(r'(?<=/bar/)[^/]+(?=/)', string).group()
'Atlantis-GPS-coordinates'

Python: Dividing a string into substrings

I have a bunch of mathematical expressions stored as strings. Here's a short one:
stringy = "((2+2)-(3+5)-6)"
I want to break this string up into a list that contains ONLY the information in each "sub-parenthetical phrase" (I'm sure there's a better way to phrase that.) So my yield would be:
['2+2','3+5']
I have a couple of ideas about how to do this, but I keep running into an "okay, now what" issue.
For example:
for x in stringy:
    substring = stringy[stringy.find('(')+1 : stringy.find(')')+1]
    stringlist.append(substring)
Works just peachy to return 2+2, but that's about as far as it goes, and I am completely blanking on how to move through the remainder...
One way using regex:
import re
stringy = "((2+2)-(3+5)-6)"
for exp in re.findall(r"\(([\s\d+*/-]+)\)", stringy):
    print(exp)
Output
2+2
3+5
You could use regular expressions like the following:
import re
x = "((2+2)-(3+5)-6)"
re.findall(r"(?<=\()[0-9+/*-]+(?=\))", x)
Result:
['2+2', '3+5']
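The same result can also be reached without a regex by scanning the string once and remembering where the most recent "(" was. A minimal sketch, with the hypothetical name innermost_groups; it only collects the innermost parenthesised pieces, which is what the question asks for:

```python
def innermost_groups(expr):
    """Collect the contents of every innermost pair of parentheses."""
    groups = []
    start = None  # index just after the most recent unmatched "("
    for index, char in enumerate(expr):
        if char == "(":
            start = index + 1
        elif char == ")" and start is not None:
            groups.append(expr[start:index])
            # This ")" is consumed; an outer ")" must not reuse the same "("
            start = None
    return groups


stringy = "((2+2)-(3+5)-6)"
print(innermost_groups(stringy))
```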
