How to extract a number from an alphanumeric string in Python?


I want to extract the numbers contained in alphanumeric strings. Please help me with this.
Example:
line = ["frame_117", "frame_11","frame_1207"]
Result:
[117, 11, 1207]

You can split on the '_' separator like this:
numbers = []
line = ["frame_117", "frame_11","frame_1207"]
for item in line:
    number = int(item.split("_",1)[1])
    numbers.append(number)
print(numbers)

import re
temp = []
lines = ["frame_117", "frame_11","frame_1207"]
for line in lines:
    num = re.search(r'-?\d+', line)
    temp.append(int(num.group(0)))
print(temp) # [117, 11, 1207]
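
Note that re.search returns None when a string contains no digits, so the loop above would raise AttributeError on such input. A minimal sketch that skips those items, assuming skipping is the behaviour you want (the helper name extract_first_int and the extra sample string are hypothetical):

```python
import re

def extract_first_int(text):
    """Return the first integer found in text, or None if there is none."""
    match = re.search(r'-?\d+', text)
    return int(match.group(0)) if match else None

lines = ["frame_117", "frame_11", "no_digits_here", "frame_1207"]
# Keep only the items where a number was actually found.
numbers = [n for n in map(extract_first_int, lines) if n is not None]
print(numbers)  # [117, 11, 1207]
```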

Rationale
The first thing I see is that the names in the list follow a pattern: the string frame, an underscore _, and a number, i.e. "frame_number".
Step-By-Step
With that in mind, you can:
Loop through the list. We'll use a list comprehension.
Get each item from the list (the names of the form "frame_number")
Split each one on the separator (getting a sublist like ["frame", "number"])
And then create a new list with the last item of each sublist
numbers = [x.split("_")[-1] for x in line]
['117', '11', '1207']
Solution
But you need numbers and here you have a list of strings. We make one extra step and use int().
numbers = [int(x.split("_")[-1]) for x in line]
[117, 11, 1207]
This works only because we detected a pattern in the target name.
But what happens if you need to find all the numbers in an arbitrary string? Or floats? Or complex numbers? Or negative numbers?
That's a bit more complex and out of the scope of this answer.
See How to extract numbers from a string in Python?
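
For a rough idea of what the more general case can look like, here is a minimal sketch that pulls signed integers and simple decimals out of an arbitrary string. The pattern and the helper name extract_numbers are assumptions made for illustration, not part of the original answer; complex numbers and exponent notation are not handled, and a hyphen directly before a digit (as in "frame-117") is read as a minus sign.

```python
import re

# Assumed pattern: optional sign, then either a decimal with digits on both
# sides of the point, a decimal with a leading point, or a plain integer.
NUMBER = re.compile(r'[-+]?(?:\d+\.\d+|\.\d+|\d+)')

def extract_numbers(text):
    """Return every number in text as int or float, in order of appearance."""
    return [float(tok) if '.' in tok else int(tok)
            for tok in NUMBER.findall(text)]

print(extract_numbers("x=-3.5, frame_117, gain +0.25"))  # [-3.5, 117, 0.25]
```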

Related

How to use sep command with sorted

I want to separate sorted numbers with "<", but can't do it.
Here is the code:
numbers = [3, 7, 5]
print(sorted(numbers), sep="<")

The * operator, as mentioned by @MisterMiyagi, can be used to unpack the sorted list into separate arguments, so that sep is placed between them.
Code:
print(*sorted(numbers), sep="<")
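
To see why the unpacking matters, this short sketch runs both calls side by side, using the list from the question:

```python
numbers = [3, 7, 5]

# Passing the list itself gives print() a single argument, so sep is unused.
print(sorted(numbers), sep="<")   # [3, 5, 7]

# Unpacking passes 3, 5 and 7 as separate arguments, joined by sep.
print(*sorted(numbers), sep="<")  # 3<5<7
```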

I don't know if this is the answer you want, but I wrote some Python code that separates the sorted numbers with "<" using join, after converting the numbers to strings.
Because the items in the iterable passed to str.join() must be strings, I first use a list comprehension to create a list containing each integer as a string, and then pass that list to str.join().
# Initial list
test_list = [5,1,2,3,4]
# Sort the numbers
sorted_nums = sorted(test_list)
# Convert each number to a string (avoid naming the loop variable int, which shadows the built-in)
string_ints = [str(num) for num in sorted_nums]
# Join the strings with "<"
output = '<'.join(string_ints)
print(output)

How to split up certain characters but not others?

I want to take a string of elements as input and make one list with the atoms and the amount of each atom.
["H3", "He4"]
That section works. However, I also need to make a list of only the elements. It would look something like
["H", "He"]
However, when I try to split it into individual atoms it comes out like this:
["H", "H", "He"]
Here is my current code for the function:
def molar_mass():
    nums = "0123456789"
    print("Please use the format H3 He4")
    elements = input("Please leave spaces between elements and their multipliers: ")
    element_list = elements.split()
    print(element_list)
    elements_only_list = []
    for element_pair in element_list:
        for char in element_pair:
            if char not in nums:
                elements_only_list.append(char)
        test = element_pair.split()
        print(test)
    print(elements_only_list)
I'm aware that there is a library for something similar, but I don't wish to use it.

Your problem here is that you are appending each non-numeric character to elements_only_list, as a new element of that list. You want instead to get the portion of element_pair that contains non-numeric characters, and append that string to the list. A simple way to do this is to use the rstrip method to remove the numeric characters from the end of the string.
for element_pair in element_list:
    element_only = element_pair.rstrip(nums)
    elements_only_list.append(element_only)
It could also be done using regular expressions, but that's more complicated than you need right now.
FYI, you don't really need your nums variable. The string module contains constants for various standard groups of characters. In this case you could use string.digits (after import string, or via from string import digits).
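
As a minimal sketch of both suggestions, the snippet below uses string.digits with rstrip and, for comparison, a regular expression that takes the leading run of letters. It assumes every token starts with at least one ASCII letter; the input is hard-coded here instead of being read with input().

```python
import re
from string import digits

element_list = "H3 He4".split()

# rstrip removes every trailing character that appears in digits.
elements_only = [pair.rstrip(digits) for pair in element_list]
print(elements_only)  # ['H', 'He']

# Regex alternative: keep the leading run of letters of each token.
elements_re = [re.match(r'[A-Za-z]+', pair).group(0) for pair in element_list]
print(elements_re)  # ['H', 'He']
```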

As I understand it, you will have user input such as H3 He4 and expect the output to be ['H', 'He']. I modified your function accordingly:
def molar_mass():
    print("Please use the format H3 He4")
    elements = input("Please leave spaces between elements and their multipliers: ")
    element_list = elements.split()  # split the text into a list
    print(element_list)
    results = []
    for elem in element_list:  # loop over the list of elements
        # replace each digit with '' and keep every other character
        el1 = list(map(lambda x: x if not x.isdigit() else '', elem))
        el2 = ''.join(el1)
        results.append(el2)
    return results
print(molar_mass())
Using this function with the input below:
H3 He4
the returned list will be:
['H', 'He']

How can I generate random numbers based on a pattern from a given list of numbers?

I'm trying to generate x random numbers based on lists I will provide (containing the same amount of numbers I want generated).
So far I have this code:
import random
list = []
while len(list) < 10:
    x = random.randint(1, 100)
    if x not in list:
        list.append(x)
list.sort()
print (list)
The question is, how do I input the lists I have so that Python can read some pattern (for lack of a better word) and generate numbers?
I tried Googling it but have found nothing so far.
Thanks.

With Python, a file can be read and split on whitespace into a list using str.split() with no argument, like this:
lines = []
with open('filename') as f:
    for line in f:
        tokens = line.strip().split()  # splits on whitespace
        for token in tokens:
            lines.append(token)
If the file uses a different separator, such as a colon, it can be split the same way as long as the separator is a single character or a fixed sequence of characters: use split(':') for a character or split('===') for a sequence. It can also be split on a regular expression using re.split('some_regex', 'text2split'). Additionally, it is useful to verify the format of numeric data so that invalid data does not cause an error or other undesirable behavior in subsequent processing.
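
The three splitting variants described above can be sketched on in-memory strings as follows; the sample strings are made up for illustration.

```python
import re

# Single-character separator.
by_char = "a:b:c".split(":")
print(by_char)   # ['a', 'b', 'c']

# Fixed multi-character separator.
by_seq = "a===b===c".split("===")
print(by_seq)    # ['a', 'b', 'c']

# Regular expression: a comma or semicolon followed by optional whitespace.
by_regex = re.split(r"[;,]\s*", "1, 2;3")
print(by_regex)  # ['1', '2', '3']
```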
Below is a complete example that extracts comma-separated numbers from a file and appends them to a list. The numbers are filtered to match at least one of three forms defined by regular expressions: (1) '\d+' (one or more decimal digits); (2) '\d+\.\d*' (one or more decimal digits followed by a period followed by zero or more decimal digits); or (3) '\d*\.\d+' (zero or more decimal digits followed by a period followed by one or more decimal digits). In this example the regex for matching numbers in these forms is compiled once to improve performance.
import re
numList = []
# The alternatives are grouped so that ^ and $ apply to all three forms
regex = re.compile(r'^(\d+|\d+\.\d*|\d*\.\d+)$')
with open('filename') as f:
    for data in f:
        tmpList = re.split(',', data.strip())  # could use data.strip().split(',')
        for element in tmpList:
            if regex.match(element):
                numList.append(element)
After running this the numbers in numList can be iterated like this:
for item in numList:
    print(item)
    # do other things such as calculations with item
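
The entries collected this way are still strings, so they need converting before any arithmetic. Here is a sketch of the same kind of filter using re.fullmatch, which only succeeds when the whole element is a number, followed by conversion to float. The list sample_lines is a hypothetical stand-in for the lines of the file.

```python
import re

# The three number forms; fullmatch makes explicit ^ and $ anchors unnecessary.
NUMBER = re.compile(r'\d+|\d+\.\d*|\d*\.\d+')

# Stand-in for the lines of the file (hypothetical sample data).
sample_lines = ["1,2.5,abc\n", ".75,12x,40.\n"]

values = []
for data in sample_lines:
    for element in data.strip().split(','):
        # Keep the element only if it is a number from start to end.
        if NUMBER.fullmatch(element):
            values.append(float(element))

print(values)  # [1.0, 2.5, 0.75, 40.0]
```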

Split identifier string python

DB00002
DB00914
DB00222
DB01056
I have a list of database IDs and want to trim it down to contain only the numbers, e.g. (2, 914, 222, 1056). How can I do this in Python? Many thanks!

Just exclude the first two characters and convert the rest to int.
data = ["DB00002", "DB00914", "DB00222", "DB01056"]
print([int(item[2:]) for item in data])
# [2, 914, 222, 1056]
If you are not sure how many non-digit characters there are, you can skip them using a generator expression, like this:
[int("".join(char for char in item if char.isdigit())) for item in data]
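
As a quick illustration that the digit filter does not depend on the prefix length, here is a sketch with made-up identifiers of varying prefix lengths. Be aware that an item without any digits would make int("") raise ValueError, and that separate digit runs are merged into one number.

```python
# Hypothetical identifiers with prefixes of different lengths.
data = ["DB00002", "CHEMBL00914", "X222"]

# Keep only the digit characters of each item, then convert to int.
ids = [int("".join(ch for ch in item if ch.isdigit())) for item in data]
print(ids)  # [2, 914, 222]
```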

Assuming you want to remove the first two characters and convert the rest to an integer:
text = "DB00914"
num = int(text[2:])

How to extract certain letters from a string using Python

I have a string 'A1T1730'.
From this I need to extract the second character and the last four characters. For example, from 'A1T1730' I need to extract '1' and '1730'. I'm not sure how to do this in Python.
Right now I have the following, which extracts every character from the string separately. Can someone please help me update it to do what I need?
list = ['A1T1730']
for letter in list[0]:
    print(letter)
This gives me the result A, 1, T, 1, 7, 3, 0

my_string = "A1T1730"
result = my_string[1] + my_string[-4:]
print(result)
Output
11730
If you want to extract them to different variables, you can just do
first, last = my_string[1], my_string[-4:]
print(first, last)
Output
1 1730

Using filter with str.isdigit (in its unbound method form):
>>> filter(str.isdigit, 'A1T1730')  # In Python 2.x
'11730'
>>> ''.join(filter(str.isdigit, 'A1T1730'))  # In Python 3.x
'11730'
If you want to get the numbers separately, use a regular expression (see re.findall):
>>> import re
>>> re.findall(r'\d+', 'A1T1730')
['1', '1730']
Use thefourtheye's solution if the positions of the digits are fixed.
BTW, don't use list as a variable name. It shadows the built-in list type.
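
If integers are needed instead of digit strings, the re.findall result can be converted directly. A minimal sketch:

```python
import re

# Each run of consecutive digits becomes one integer.
parts = [int(run) for run in re.findall(r'\d+', 'A1T1730')]
print(parts)  # [1, 1730]
```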

Well, you could do it like this:
# second character
_2nd = list[0][1]
# last 4 characters
numbers = list[0][-4:]

You can use the string method isdigit(). It returns True if the character is a digit and False otherwise:
list = ['A1T1730']
for letter in list[0]:
    if letter.isdigit():
        print(letter, end=" ")  # end=" " keeps the output on the same line
I hope this is useful.
