How do I add a dot into a Python list?
For example
groups = [0.122, 0.1212, 0.2112]
If I want to output this data, how would I make it so it is like
122, 1212, 2112
I tried write(groups...[0]) and further research but didn't get far. Thanks.
[str(g).split(".")[1] for g in groups]
results in
['122', '1212', '2112']
Edit:
Use it like this:
groups = [0.122, 0.1212, 0.2112]
decimals = [str(g).split(".")[1] for g in groups]
You could use a list comprehension and return a list of strings
groups = [0.122, 0.1212, 0.2112]
[str(x).split(".")[1] for x in groups]
Result
['122', '1212', '2112']
The list comprehension is doing the following:
Turn each list element into a string
Split the string about the "." character
Return the substring to the right of the split
Return a list based on the above logic
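The steps above can be written out as an ordinary loop. This is only a sketch of what the comprehension does; the names text and parts are made up for illustration.

```python
groups = [0.122, 0.1212, 0.2112]

decimals = []
for g in groups:
    text = str(g)              # 0.122 -> "0.122"
    parts = text.split(".")    # "0.122" -> ["0", "122"]
    decimals.append(parts[1])  # keep the substring to the right of the dot

# Join the strings to get the output asked for in the question.
print(", ".join(decimals))  # 122, 1212, 2112
```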
This should do it:
groups = [0.122, 0.1212, 0.2112]
import re
groups_str = ", ".join([str(x) for x in groups])
re.sub('[0-9]*[.]', "", groups_str)
[str(x) for x in groups] will make strings of the items.
", ".join will connect the items, as a string.
import re allows you to replace regular expressions:
re.sub then replaces every run of digits followed by a dot (here the leading "0.") with an empty string.
EDIT (no extra modules):
Working with Lutz' answer, this will also work in the case there is an integer (no dot):
decimals = [str(g).split("0.") for g in groups]
decimals = [i for x in decimals for i in x if i != '']
It won't work though when you have numbers like 11.11, where there is a part you don't want to ignore in front of the dot.
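If numbers like 11.11 or plain integers can show up, one possible workaround is str.partition, which splits on the first dot only. This is a rough sketch; fractional_digits is a hypothetical helper name, and it assumes str() gives plain decimal notation (no exponent).

```python
def fractional_digits(number):
    # Hypothetical helper: return the digits after the first "." as a string,
    # or an empty string when there is no dot (e.g. an integer).
    _, _, frac = str(number).partition(".")
    return frac

values = [0.122, 11.11, 5]
print([fractional_digits(v) for v in values])  # ['122', '11', '']
```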
I have a list of strings. Each string has the form of data0*(\d*) if we use a regular expression form.
The following is an example the strings:
data000000, data000003, data0172, data2312, data008212312
I would like to take only the meaningful number portion. All numbers are integers. For example, in the above case, I would like to get another list containing:
0, 3, 172, 2312, 8212312
What would be the best way in the above case?
The following is the solution that I thought:
import re
string_list = ["data0000172", ..... ]
number_list = []
for string in string_list:
    match = re.search(r"data0*(\d+)", string)
    if match:
        number_list.append(match.group(1))
    else:
        raise Exception("Wrong format.")
However, the above might be inefficient. Could you suggest a better way for doing this?
If you are sure that the strings start with "data", you can just slice the string and convert it to an integer. Leading zeroes aren't an issue there: building an integer from a zero-padded digit string works.
lst = ["data000000", "data000003", "data0172", "data2312", "data008212312"]
result = [int(x[4:]) for x in lst]
result:
[0, 3, 172, 2312, 8212312]
Or use good old replace, in case the prefix may be missing (it will be slightly slower):
result = [int(x.replace("data","")) for x in lst]
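If only some entries carry the prefix, another option is to slice conditionally with str.startswith. This is just a sketch; the entry "172" without a prefix is made up for illustration.

```python
# "172" has no prefix; it is an invented entry to show the fallback branch.
lst = ["data000000", "data000003", "172", "data2312"]
prefix = "data"

# Slice off the prefix only when it is actually there.
result = [int(x[len(prefix):]) if x.startswith(prefix) else int(x) for x in lst]
print(result)  # [0, 3, 172, 2312]
```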
import re
st = 'data0000172'
a = float(re.search(r'data(\d+)', st).group(1))
print(a)
Output:
172.0
This extracts the numeric part of the string. Apply it to each item of your list; use int instead of float if you want integers, as in the question.
If the strings might not all be of the form data<num>, or some of the entries are broken for some reason, and you want the solution to still work, you can do the following:
import re
ll = ['data000000', 'data000003', 'data0172', 'data2312', 'data008212312']
ss = ''.join(ll)
res = [int(s) for s in re.findall(r'\d+', ss)]
print(res)
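re.findall is applied to the single joined string rather than to the list. Because the pattern has no capturing groups, it returns a list of the matched digit strings, which the comprehension converts to integers. Note that joining with '' relies on each entry starting with a non-digit prefix; otherwise adjacent numbers would run together.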
Output:
[0, 3, 172, 2312, 8212312]
Note: calling re.findall on the list itself, without the join, raises a TypeError because it expects a string.
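As an alternative to joining, you could search each string on its own and skip the ones without digits. A small sketch, where 'broken' is a made-up entry used only to show the skipping:

```python
import re

# 'broken' is an invented entry with no digits, added for illustration.
ll = ['data000000', 'broken', 'data0172']

# Passing the list itself is rejected, since findall expects a string.
try:
    re.findall(r'\d+', ll)
except TypeError:
    print("re.findall needs a string, not a list")

# Search each string separately and skip entries without digits.
matches = (re.search(r'\d+', s) for s in ll)
res = [int(m.group()) for m in matches if m]
print(res)  # [0, 172]
```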
I have a list of strings
['time_10', 'time_23', 'time_345', 'date_10', 'date_23', 'date_345']
I want to use regular expression to get strings that end with a specific number.
As I understand it, I first have to combine all the strings from the list into one large string, and then form some kind of pattern to use in the regular expression.
I would be grateful if you could provide
regex(some_pattern, some_string)
that would return
['time_10', 'date_10']
or just
'time_10, date_10'
str.endswith is enough.
l = ['time_10', 'time_23', 'time_345', 'date_10', 'date_23', 'date_345']
result = [s for s in l if s.endswith('10')]
print(result)
['time_10', 'date_10']
If you insist on using regex,
import re
result = [s for s in l if re.search('10$', s)]
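One thing to watch: both versions match any string whose last two characters are "10". If names like 'time_110' could occur (that entry is made up here), including the underscore in the pattern is one way to be stricter. A short sketch:

```python
import re

# 'time_110' is an invented entry to show the difference.
l = ['time_10', 'time_110', 'date_10', 'date_23']

# Anchor on the underscore so only the exact number 10 matches.
pattern = re.compile(r'_10$')
print([s for s in l if pattern.search(s)])  # ['time_10', 'date_10']

# The plain suffix test also accepts 'time_110'.
print([s for s in l if s.endswith('10')])   # ['time_10', 'time_110', 'date_10']
```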
Assume we have a string a.
A part of a looks like ac5:9qr$28c#.
This pattern (value1:value2$value3#) repeats.
Now, my question is: How do I look for these values and extract them?
Note: These string parts aren't necessarily special characters.
re.findall works well for this problem.
Try this:
import re
data = 'abc:def$ghi#ac5:9qr$28c#1234:4567$89#'
result = re.findall(r'(.*?):(.*?)\$(.*?)#', data)
print(result)
Result:
[('abc', 'def', 'ghi'), ('ac5', '9qr', '28c'), ('1234', '4567', '89')]
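Since the question says the separators aren't necessarily these characters, the pattern could be built from whatever delimiters you have, using re.escape so they are treated literally. This is only a sketch; build_pattern is a hypothetical helper, not part of re.

```python
import re

def build_pattern(d1, d2, d3):
    # Hypothetical helper: three lazy groups, each ended by an escaped delimiter.
    return re.compile(
        r'(.*?)' + re.escape(d1) + r'(.*?)' + re.escape(d2) + r'(.*?)' + re.escape(d3)
    )

data = 'abc:def$ghi#ac5:9qr$28c#'
pattern = build_pattern(':', '$', '#')
print(pattern.findall(data))  # [('abc', 'def', 'ghi'), ('ac5', '9qr', '28c')]
```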
Something like this should do:
a = "ac5:9qr$28c#"
values = []
delimiters = [':','$','#']
while len(a) > 0:
    for delimiter in delimiters:
        delimiterIndex = a.index(delimiter)
        newValue = a[0:delimiterIndex]
        values.append(newValue)
        a = a[delimiterIndex+1:]
print(values)
Output is:
['ac5', '9qr', '28c']
Of course you could implement something similar to retain the original 'a' string.
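For instance, one way to keep 'a' intact is to track a position instead of cutting the string. A minimal sketch, assuming single-character delimiters and input that always ends with the last delimiter; pos and end are names made up here.

```python
a = "ac5:9qr$28c#abc:def$ghi#"
delimiters = [':', '$', '#']
values = []

pos = 0  # where the next value starts
while pos < len(a):
    for delimiter in delimiters:
        end = a.index(delimiter, pos)  # search from the current position
        values.append(a[pos:end])
        pos = end + 1                  # skip the one-character delimiter

print(values)  # ['ac5', '9qr', '28c', 'abc', 'def', 'ghi']
print(a)       # the original string is unchanged
```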
I have a string.
s = '1989, 1990'
I want to convert that to a list using Python, with this output:
s = ['1989', '1990']
Is there a fast one-liner for this?
Use list comprehensions:
s = '1989, 1990'
[x.strip() for x in s.split(',')]
Short and easy.
Additionally, this has been asked many times!
Use the split method:
>>> '1989, 1990'.split(', ')
['1989', '1990']
But you might want to:
remove spaces using replace
split by ','
As such:
>>> '1989, 1990,1991'.replace(' ', '').split(',')
['1989', '1990', '1991']
This will work better if your string comes from user input, as the user may forget to hit space after a comma.
Call the split function:
myList = s.split(', ')
print(s.replace(' ','').split(','))
First removes spaces, then splits by comma.
Or you can use regular expressions:
>>> import re
>>> re.split(r"\s*,\s*", "1999,2000, 1999 ,1998 , 2001")
['1999', '2000', '1999', '1998', '2001']
The expression \s*,\s* matches zero or more whitespace characters, a comma and zero or more whitespace characters again.
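Note that the pattern only consumes whitespace around the commas, so spaces at the very start or end of the input survive. Stripping first is one way around that; a quick sketch with a made-up input string:

```python
import re

raw = " 1999 , 2000 "  # invented input with outer spaces

# Without strip, the outer spaces stay attached to the first and last items.
print(re.split(r"\s*,\s*", raw))          # [' 1999', '2000 ']

# Stripping the whole string first removes them.
print(re.split(r"\s*,\s*", raw.strip()))  # ['1999', '2000']
```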
I created a generic method for this:
def convertToList(v):
    '''
    #return: input is converted to a list if needed
    '''
    if isinstance(v, list):
        return v
    elif v is None:
        return []
    else:
        return [v]
Maybe it is useful for your project.
convertToList(s)
I have a string like this:
"a word {{bla|123|456}} another {{bli|789|123}} some more text {{blu|789}} and more".
I would like to get this as an output:
(("bla", 123, 456), ("bli", 789, 123), ("blu", 789))
I haven't been able to find the proper python regex to achieve that.
>>> re.findall(r' {{(\w+)\|(\w+)(?:\|(\w+))?}} ', s)
[('bla', '123', '456'), ('bli', '789', '123'), ('blu', '789', '')]
If you still want numbers there, you need to iterate over the output and convert the digit strings to integers with int.
You need a lot of escapes in your regular expression since {, } and | are special characters in them. A first step to extract the relevant parts of the string would be this:
regex = re.compile(r'\{\{(.*?)\|(.*?)(?:\|(.*?))?\}\}')
regex.findall(line)
For the example this gives:
[('bla', '123', '456'), ('bli', '789', '123'), ('blu', '789', '')]
Then you can continue by converting the digit strings into integers and removing empty strings, such as the one in the last match.
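That follow-up step might look something like this. It is a sketch, not the only way to do it: it drops empty strings and converts anything made purely of digits.

```python
import re

line = "a word {{bla|123|456}} another {{bli|789|123}} some more text {{blu|789}} and more"
regex = re.compile(r'\{\{(.*?)\|(.*?)(?:\|(.*?))?\}\}')

result = tuple(
    # Skip empty groups, turn digit strings into ints, keep the rest as is.
    tuple(int(p) if p.isdigit() else p for p in match if p)
    for match in regex.findall(line)
)
print(result)  # (('bla', 123, 456), ('bli', 789, 123), ('blu', 789))
```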
[re.split(r'\|', i) for i in re.findall("{{(.*?)}}", s)]
Returns:
[['bla', '123', '456'], ['bli', '789', '123'], ['blu', '789']]
This method works regardless of the number of elements in the {{ }} blocks.
To get the exact output you wrote, you need a regex and a split:
import re
[m.split("|") for m in re.findall(r"\{\{([^}]*)\}\}", s)]
To get it with the numbers converted, do this:
toint = lambda x: int(x) if x.isdigit() else x
[tuple(map(toint, p.split("|"))) for p in re.findall(r"\{\{([^}]*)\}\}", s)]
Assuming your actual format is {{[a-z]+|[0-9]+|[0-9]+}}, here's a complete program with conversion to ints.
import re
s = "a word {{bla|123|456}} another {{bli|789|123}} some more text {{blu|789}} and more"
result = []
for match in re.finditer('{{.*?}}', s):
    # Split on pipe (|) and filter out non-alphanumerics
    parts = [''.join(filter(str.isalnum, part)) for part in match.group().split('|')]
    # Convert to int when possible
    for index, part in enumerate(parts):
        try:
            parts[index] = int(part)
        except ValueError:
            pass
    result.append(tuple(parts))
We might be able to get fancy and do everything in a single complicated regular expression, but that way lies madness. Let's do one regexp that grabs the groups, and then split the groups up. We could use a regexp to split the groups, but we can just use str.split(), so let's do that.
import re
pat_group = re.compile("{{([^}]*)}}")
def mixed_tuple(iterable):
    lst = []
    for x in iterable:
        try:
            lst.append(int(x))
        except ValueError:
            lst.append(x)
    return tuple(lst)
s = "a word {{bla|123|456}} another {{bli|789|123}} some more text {{blu|789}} and more"
lst_groups = re.findall(pat_group, s)
lst = [mixed_tuple(x.split("|")) for x in lst_groups]
In pat_group, "{{" just matches literal "{{". "(" starts a group. "[^}]" is a character class that matches any character except for "}", and '*' allows it to match zero or more such characters. ")" closes out the group and "}}" matches literal characters. Thus, we match the "{{...}}" patterns, and can extract everything between the curly braces as a group.
re.findall() returns a list of groups matched from the pattern.
Finally, a list comprehension splits each string and returns the result as a tuple.
Is pyparsing overkill for this? Maybe, but without too much suffering, it does deliver the desired output, without a thicket of backslashes to escape the '{', '|', or '}' characters. Plus, there's no need for post-parse conversions of integers and whatnot - the parse actions take care of this kind of stuff at parse time.
from pyparsing import Word, Suppress, alphas, alphanums, nums, delimitedList
LBRACE,RBRACE,VERT = map(Suppress,"{}|")
word = Word(alphas,alphanums)
integer = Word(nums)
integer.setParseAction(lambda t: int(t[0]))
patt = (LBRACE*2 + delimitedList(word|integer, VERT) + RBRACE*2)
patt.setParseAction(lambda toks:tuple(toks.asList()))
s = "a word {{bla|123|456}} another {{bli|789|123}} some more text {{blu|789}} and more"
print(tuple(p[0] for p in patt.searchString(s)))
Prints:
(('bla', 123, 456), ('bli', 789, 123), ('blu', 789))