Read numbers from string into float

Read numbers from string into float - python

I need to convert some strings to float. Most of them are only numbers but some of them have letters too. The regular float() function throws an error.
a='56.78'
b='56.78 ab'
float(a) >> 56.78
float(b) >> ValueError: invalid literal for float()
One solution is to check for the presence of other characters than numbers, but I was wondering if there is some built-in or other short function which gives:
magicfloat(a) >> 56.78
magicfloat(b) >> 56.78

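You can try stripping letters from the ends of your input (strip() only trims leading and trailing characters, and ascii_lowercase covers lowercase letters only):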
from string import ascii_lowercase
b='56.78 ab'
float(b.strip(ascii_lowercase))
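Because strip() only trims the ends, a slightly more forgiving variant might also trim uppercase letters and whitespace. This is a minimal sketch: strip_to_float is a hypothetical helper name, not a built-in, and it still raises ValueError when letters sit in the middle of the number.

```python
from string import ascii_letters, whitespace

def strip_to_float(text):
    # Trim letters and whitespace from both ends only; characters left
    # in the middle of the string will still make float() raise ValueError.
    return float(text.strip(ascii_letters + whitespace))

print(strip_to_float('56.78 ab'))  # 56.78
print(strip_to_float('USD 12.5'))  # 12.5
```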

Use a regex:
import re
def magicfloat(text):
    numbers = re.findall(r"[-+]?[0-9]*\.?[0-9]+", text)
    # TODO: Decide what to do if you get more than one number in your string
    if numbers:
        return float(numbers[0])
    return None
a = magicfloat('56.78')
b = magicfloat('56.78 ab')
print(a)
print(b)
Output:
56.78
56.78
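One way to resolve the TODO about multiple numbers is to return all of them and let the caller choose. The sketch below reuses the same pattern; all_floats and NUMBER_RE are hypothetical names introduced here for illustration.

```python
import re

# Same pattern as in the answer: optional sign, optional integer part,
# optional decimal point, then at least one digit.
NUMBER_RE = re.compile(r"[-+]?[0-9]*\.?[0-9]+")

def all_floats(text):
    """Return every number found in text as a float, in order of appearance."""
    return [float(match) for match in NUMBER_RE.findall(text)]

print(all_floats("56.78 ab"))           # [56.78]
print(all_floats("a 1.5 and -2, .75"))  # [1.5, -2.0, 0.75]
```

An empty list then signals "no number found", which avoids mixing None and float return values.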

Short answer: No.
There is no built-in function that can accomplish this.
Longish answer: Yes:
One thing you can do is go through each character in the string, keep it only if it is a digit or a period, and convert what remains:
def magicfloat(var):
    # Keep only digits and periods, then convert the result to a float
    kept = [char for char in var if char.isdigit() or char == '.']
    return float("".join(kept))
As such:
>>> magicfloat('56.78 ab')
56.78
>>> magicfloat('56.78')
56.78
>>> magicfloat('56.78ashdusaid')
56.78

Related

How to get string and int as input in one line in Python3?

I have looked at a related answer, but I still did not get the result I expected.
If I provide a name and a number on one line, Python should take the first value as a string and the second as an integer.

a, b = [int(x) if x.isnumeric() else x for x in input().split()]
This converts each whitespace-separated part of the input to int where possible.

s, num = input("Give str and int :\n").split()
num = int(num)
print(s, type(s))
print(num, type(num))
Output :
Give str and int :
hello 23
hello <class 'str'>
23 <class 'int'>
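If the input might be malformed, the same split-then-convert idea can be wrapped in a small function that checks the number of parts first. This is only a sketch; parse_name_and_number is a hypothetical name, and it takes the line as an argument instead of calling input() so that it is easy to try out.

```python
def parse_name_and_number(line):
    """Split 'name number' into (str, int); raise ValueError on bad input."""
    parts = line.split()
    if len(parts) != 2:
        raise ValueError(f"expected two values, got {len(parts)}")
    name, number = parts
    return name, int(number)  # int() raises ValueError for non-numeric text

name, number = parse_name_and_number("hello 23")
print(name, type(name))      # hello <class 'str'>
print(number, type(number))  # 23 <class 'int'>
```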

If you're certain to have a string and a number you can use list comprehension to get the values of both.
x = "Hello 345"
str_value = "".join([s for s in x if s.isalpha()]) # list of alpha characters and join into one string
num_value = int("".join([n for n in x if n.isdigit()])) # list of numbers converted into an int
print(str_value)
>> Hello
print(num_value)
>> 345

You can get the string and int by using regular expressions as follows:
import re
input_text = "string489234"
re_match = re.match(r"([^0-9]*) ?([0-9]+)", input_text)
if re_match:
    input_str = re_match.group(1)
    input_int = int(re_match.group(2))  # group(2) is a string of digits
See: https://docs.python.org/3/library/re.html

Removing non numeric characters from a string

I have been given the task of removing all non-numeric characters, including spaces, from either a text file or a string, and then printing the new result next to the old characters. For example:
Before:
sd67637 8
After:
676378
As I am a beginner, I do not know where to start with this task. Please help.

The easiest way is with a regexp:
import re
a = 'lkdfhisoe78347834 (())&/&745 '
result = re.sub(r'[^0-9]', '', a)
print(result)
# 78347834745

Loop over your string, char by char, and only include digits:
s = 'sd67637 8'
new_string = ''.join(ch for ch in s if ch.isdigit())
Or use a regex on your string (if at some point you wanted to treat non-contiguous groups separately):
import re
s = 'sd67637 8'
new_string = ''.join(re.findall(r'\d+', s))
# 676378
Then just print them out:
print(s, '=', new_string)

There is a builtin for this in Python 2:
string.translate(s, table[, deletechars])
Delete all characters from s that are in deletechars (if present), and then translate the characters using table, which must be a 256-character string giving the translation for each character value, indexed by its ordinal. If table is None, then only the character deletion step is performed.
>>> import string
>>> non_numeric_chars = ''.join(set(string.printable) - set(string.digits))
>>> non_numeric_chars = string.printable[10:] # more efficient method (choose one)
>>> 'sd67637 8'.translate(None, non_numeric_chars)
'676378'
Or you could do it with no imports (but there is no reason for this):
>>> chars = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~ \t\n\r\x0b\x0c'
>>> 'sd67637 8'.translate(None, chars)
'676378'
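The two-argument translate(None, chars) call only works on Python 2 byte strings. On Python 3, the same deletion can be sketched with str.maketrans, whose third argument lists the characters to delete. The variable names below are illustrative, and the table only covers printable ASCII characters.

```python
import string

# Every printable character that is not a digit; string.printable starts
# with the ten digits, so slicing them off leaves the characters to delete.
non_numeric_chars = string.printable[10:]

# The third argument to str.maketrans lists characters to map to None (delete).
delete_table = str.maketrans('', '', non_numeric_chars)

print('sd67637 8'.translate(delete_table))  # 676378
```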

I would not use RegEx for this. It is a lot slower!
Instead let's just use a simple for loop.
TLDR;
This function will get the job done fast...
def filter_non_digits(string: str) -> str:
    result = ''
    for char in string:
        if char in '1234567890':
            result += char
    return result
The Explanation
Let's create a very basic benchmark to test a few different methods that have been proposed. I will test three methods...
For loop method (my idea).
List Comprehension method from Jon Clements' answer.
RegEx method from Moradnejad's answer.
# filters.py
import re
# For loop method
def filter_non_digits_for(string: str) -> str:
    result = ''
    for char in string:
        if char in '1234567890':
            result += char
    return result
# Comprehension method
def filter_non_digits_comp(s: str) -> str:
    return ''.join(ch for ch in s if ch.isdigit())
# RegEx method
def filter_non_digits_re(string: str) -> str:
    return re.sub(r'[^\d]', '', string)
Now that we have an implementation of each way of removing non-digits, let's benchmark each one.
Here is some very basic and rudimentary benchmark code. However, it will do the trick and give us a good comparison of how each method performs.
# tests.py
import time, platform
from filters import (
    filter_non_digits_re,
    filter_non_digits_comp,
    filter_non_digits_for,
)
def benchmark_func(func):
    start = time.time()
    # the "_" in the number just makes it more readable
    for i in range(100_000):
        func('afes098u98sfe')
    end = time.time()
    return (end-start)/100_000
def bench_all():
    print(f'# System ({platform.system()} {platform.machine()})')
    print(f'# Python {platform.python_version()}\n')
    tests = [
        filter_non_digits_re,
        filter_non_digits_comp,
        filter_non_digits_for,
    ]
    for t in tests:
        duration = benchmark_func(t)
        ns = round(duration * 1_000_000_000)
        print(f'{t.__name__.ljust(30)} {str(ns).rjust(6)} ns/op')
if __name__ == "__main__":
    bench_all()
Here is the output from the benchmark code.
# System (Windows AMD64)
# Python 3.9.8
filter_non_digits_re 2920 ns/op
filter_non_digits_comp 1280 ns/op
filter_non_digits_for 660 ns/op
As you can see, the filter_non_digits_for() function is more than four times faster than using RegEx, and about twice as fast as the comprehension method. Sometimes simple is best.
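Timings like these depend on the machine, the Python version and the input, so it is worth re-measuring locally. A possible single-file variant uses the standard timeit module instead of time.time(). The shortened function names and the ns_per_call helper are assumptions made for this sketch, and the lambda wrapper adds a small constant overhead to every measurement.

```python
import re
import timeit

def filter_for(s):
    result = ''
    for char in s:
        if char in '1234567890':
            result += char
    return result

def filter_comp(s):
    return ''.join(ch for ch in s if ch.isdigit())

def filter_re(s):
    return re.sub(r'[^\d]', '', s)

def ns_per_call(func, sample='afes098u98sfe', number=100_000):
    """Average nanoseconds per call, measured with timeit."""
    seconds = timeit.timeit(lambda: func(sample), number=number)
    return seconds / number * 1_000_000_000

for func in (filter_re, filter_comp, filter_for):
    print(f'{func.__name__:<14} {ns_per_call(func):8.0f} ns/op')
```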

You can use string.ascii_letters to identify your non-digits:
from string import ascii_letters
a = 'sd67637 8'
a = a.replace(' ', '')
for i in ascii_letters:
    a = a.replace(i, '')
In case you want to replace a single quote, wrap it in double quotes " instead of single quotes '.

To extract integers
Example: sd67637 8 ==> 676378
import re
def extract_int(x):
    return re.sub(r'[^\d]', '', x)
To extract a single float/int number (possible decimal separator)
Example: sd7512.sd23 ==> 7512.23
import re
def extract_single_float(x):
    return re.sub(r'[^\d.]', '', x)
To extract multiple float/int numbers
Example: 123.2 xs12.28 4 ==> ['123.2', '12.28', '4']
import re
def extract_floats(x):
    return re.findall(r"\d+(?:\.\d+)?", x)

Adding to @Moradnejad's answer: you can use the following code to extract integer values, floats and even signed values.
import re
a = re.findall(r"[-+]?\d*\.\d+|\d+", "Over th44e same pe14.1riod of time, p-0.8rices also rose by 82.8p")
And then you can convert the list items to a numeric data type using map.
print(list(map(float, a)))
[44.0, 14.1, -0.8, 82.8]
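In that pattern the optional sign belongs only to the decimal alternative, so a signed integer such as -5 would come back as 5. A possible variant groups the two alternatives so the sign applies to both. SIGNED_NUMBER_RE and signed_numbers are hypothetical names used for this sketch.

```python
import re

# Assumed variant: the sign sits outside a non-capturing group,
# so it applies to both the decimal and the integer alternative.
SIGNED_NUMBER_RE = re.compile(r"[-+]?(?:\d*\.\d+|\d+)")

def signed_numbers(text):
    """Return every (optionally signed) number in text as a float."""
    return [float(token) for token in SIGNED_NUMBER_RE.findall(text)]

print(signed_numbers("-5 and +3.2 and .5"))  # [-5.0, 3.2, 0.5]
```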

import re
result = re.sub(r'\D', '', 'sd67637 8')
print(result)  # 676378

This function converts numeric strings with or without unit abbreviations. If the source string uses decimal-comma notation, you must say so with the parameter dec=','. The default result is a float; set toInt=True to get an integer instead. Unit abbreviations are recognized automatically and can be edited in the md dictionary, where the key is the unit abbreviation and the value is the multiplier. The result is a number you can calculate with, or None if the decimal separator is not recognized. This all-in-one function is not the fastest method, but it handles many input formats in one place.
import re
'''
units: gr=grams, K=thousands, M=millions, B=billions, ms=milliseconds, mt=metric tonnes
'''
md = {'gr': 0.001, '%': 0.01, 'K': 1000, 'M': 1000000, 'B': 1000000000, 'ms': 0.001, 'mt': 1000}
seps = {'.': True, ',': False}
kl = list(md.keys())
def to_Float_or_Int(strVal, toInt=None, dec=None):
    toInt = False if toInt is None else toInt
    dec = '.' if dec is None else dec
    def chck_char_in_string(strVal):
        rs = None
        for el in kl:
            if el in strVal:
                rs = el
                break
        return rs
    if dec in seps.keys():
        dcp = seps[dec]
        strVal = strVal.strip()
        mpk = chck_char_in_string(strVal)
        mp = 1 if mpk is None else md[mpk]
        strVal = re.sub(r'[^\de.,-]+', '', strVal)
        if dcp:
            strVal = strVal.replace(',', '')
        else:
            strVal = strVal.replace('.', '')
            strVal = strVal.replace(',', '.')
        dcnm = float(strVal)
        dcnm = dcnm * mp
        dcnm = int(round(dcnm)) if toInt else dcnm
    else:
        print('wrong decimal separator')
        dcnm = None
    return dcnm
Call the function as follows:
pvals = ['-123,456', '-45,145.01 K', '753,159.456', '1,000,000', '985 ms' , '888 745.23', '1.753 e-04']
cvals = ['-123,456', '1,354852M', '+10.000,12 gr', '-87,24%', '10,2K', '985 ms', '(mt) 0,475', ' ,159']
print('decimal point strings')
for val in pvals:
    result = to_Float_or_Int(val)
    print(result)
print()
print('decimal comma strings')
for val in cvals:
    result = to_Float_or_Int(val, dec=',')
    print(result)
exit()
The output results:
decimal point strings
-123456.0
-45145010.0
753159.456
1000000.0
0.985
888745.23
0.0001753
decimal comma strings
-123.456
1354852.0
10.00012
-0.8724
10200.0
0.985
475.0
0.159

How do you write a function in Python that takes a string and returns a new string that is the original string with all of the characters repeated?

I am trying to write a function that returns a string that is the inputted string with double characters. For example, if the input was 'hello', then the function should return 'hheelllloo'. I have been trying but I can't seem to find a way to write the function. Any help would be greatly appreciated--Thanks.

With a simple generator:
>>> s = 'hello'
>>> ''.join(c * 2 for c in s)
'hheelllloo'

def repeatChars(text, numOfRepeat):
    ans = ''
    for c in text:
        ans += c * numOfRepeat
    return ans
To use:
repeatChars('hello', 2)
output: 'hheelllloo'
Since strings are immutable, repeatedly concatenating them as in the repeatChars function above is not a good idea. It's fine if the text is short, like 'hello', but it gets expensive for 'superfragilisticexpialidocious' or longer strings. So as an alternative, I've merged my previous code with @Roman Bodnarchuk's code.
Alternate method:
def repeatChars(text, numOfRepeat):
    return ''.join([c * numOfRepeat for c in text])
Why? Read this: Efficient String Concatenation in Python

s = 'hello'
''.join(c+c for c in s)
# returns 'hheelllloo'

>>> s = "hello"
>>> "".join(map(str.__add__, s, s))
'hheelllloo'

def doublechar(s):
    if s:
        return s[0] + s[0] + doublechar(s[1:])
    else:
        return ""

In Python, how do I check if a string has alphabets or numbers?

If the string contains a letter or a number, return true. Otherwise, return false.
I have to do this, right?
return re.match('[A-Z0-9]',thestring)

Use thestring.isalnum() method.
>>> '123abc'.isalnum()
True
>>> '123'.isalnum()
True
>>> 'abc'.isalnum()
True
>>> '123#$%abc'.isalnum()
False
>>> a = '123abc'
>>> (a.isalnum()) and (not a.isalpha()) and (not a.isnumeric())
True

If you want to check if ALL characters are alphanumeric:
thestring.isalnum() (as @DrTyrsa pointed out), or
bool(re.match('[a-z0-9]+$', thestring, re.IGNORECASE))
If you want to check if at least one alphanumeric character is present:
import string
alnum = set(string.ascii_letters + string.digits)
len(set(thestring) & alnum) > 0
or
bool(re.search('[a-z0-9]', thestring, re.IGNORECASE))
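Both checks can also be written without regular expressions or the string module. The sketch below uses hypothetical helper names; note that str.isalnum() is Unicode-aware, so unlike the [a-z0-9] patterns it also accepts letters and digits outside ASCII.

```python
def has_alnum(text):
    """True if at least one character is a letter or digit."""
    return any(ch.isalnum() for ch in text)

def all_alnum(text):
    """True if text is non-empty and every character is a letter or digit."""
    return text.isalnum()

print(has_alnum('#$%a'))     # True
print(has_alnum('#$%'))      # False
print(all_alnum('123abc'))   # True
print(all_alnum('123 abc'))  # False
```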

This is a late answer, but if you want to check whether the string has at least one letter or numeral, you could use
re.match('.*[a-zA-Z0-9].*', yourstring)

What about
stringsample.isalpha()
method?

String reversal in Python

I have taken an integer input and tried to reverse it in Python but in vain! I changed it into a string but still I am not able to. Is there any way to reverse it ? Is there any built-in function?
I am not able to convert the integer into a list so not able to apply the reverse function.

You can use the slicing operator to reverse a string:
s = "hello, world"
s = s[::-1]
print(s) # prints "dlrow ,olleh"
To convert an integer to a string, reverse it, and convert it back to an integer, you can do:
x = 314159
x = int(str(x)[::-1])
print(x) # prints 951413
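The int(str(x)[::-1]) idiom fails for negative numbers, because the minus sign ends up at the end of the reversed string. A minimal sketch that keeps the sign in front could look like this (reverse_int is a hypothetical name):

```python
def reverse_int(n):
    """Reverse the decimal digits of n, keeping the sign in front."""
    sign = -1 if n < 0 else 1
    return sign * int(str(abs(n))[::-1])

print(reverse_int(314159))  # 951413
print(reverse_int(-120))    # -21
```

Trailing zeros disappear after reversal because the result is converted back to an integer.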

Code:
>>> n = 1234
>>> print(str(n)[::-1])
4321

>>> int(''.join(reversed(str(12345))))
54321
