How can I find the digits (the post number) after # and before ;;, given that there are other digits in the string? The final result should be [507, 19, 1].
Example:
post507 = "#507;;empty in Q1/Q2;;can list be empty;;yes"
post19 = "#19;;Allowable functions;;empty?, first, rest;;"
post1 = "#1;;CS116 W2015;;Welcome to first post;;Thanks!"
cs116 = [post507, post1, post19]
print (search_piazza(cs116, "l")) =>[507,1,19]
(?<=#)\d+
Use a lookbehind. See the demo.
https://regex101.com/r/eS7gD7/33#python
import re
p = re.compile(r'(?<=#)\d+', re.IGNORECASE | re.MULTILINE)
test_str = "\"#507empty in Q1/Q2can list be emptyyes\"\n \"#19Allowable functionsempty?, first, rest\"\n \"#1CS116 W2015Welcome to first postThanks!"
re.findall(p, test_str)
If the input is a list, use:
x=["#507;;empty in Q1/Q2;;can list be empty;;yes",
"#19;;Allowable functions;;empty?, first, rest;;",
"#1;;CS116 W2015;;Welcome to first post;;Thanks!"]
print([re.findall(r"(?<=#)\d+", k) for k in x])
How to find the digit(post number) in string after # and before ;;
Use re.findall together with a list comprehension.
>>> l = ["#507;;empty in Q1/Q2;;can list be empty;;yes",
"#19;;Allowable functions;;empty?, first, rest;;",
"#1;;CS116 W2015;;Welcome to first post;;Thanks!"]
>>> [j for i in l for j in re.findall(r'#(\d+);;', i)]
['507', '19', '1']
Finally, convert the resulting numbers to integers.
>>> [int(j) for i in l for j in re.findall(r'#(\d+);;', i)]
[507, 19, 1]
Without regex.
>>> l = ["#507;;empty in Q1/Q2;;can list be empty;;yes",
"#19;;Allowable functions;;empty?, first, rest;;",
"#1;;CS116 W2015;;Welcome to first post;;Thanks!"]
>>> for i in l:
...     for j in i.split(';;'):
...         for k in j.split('#'):
...             if k.isdigit():
...                 print(k)
...
507
19
1
As a list comprehension:
>>> [int(k) for i in l for j in i.split(';;') for k in j.split('#') if k.isdigit()]
[507, 19, 1]
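Since the post number always sits between the leading # and the first ;;, a sketch with str.partition avoids looking at the other fields at all. The helper name post_number is made up for illustration, and the code assumes every string starts with # followed by digits.

```python
def post_number(post):
    """Return the integer between the leading '#' and the first ';;'."""
    head, _sep, _rest = post.partition(';;')  # head is e.g. '#507'
    return int(head.lstrip('#'))

posts = ["#507;;empty in Q1/Q2;;can list be empty;;yes",
         "#19;;Allowable functions;;empty?, first, rest;;",
         "#1;;CS116 W2015;;Welcome to first post;;Thanks!"]
numbers = [post_number(p) for p in posts]
print(numbers)  # [507, 19, 1]
```

If a string does not start with # and digits, int() raises ValueError, which may or may not be what you want.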
Related
l = ['-t2=Idle -D2=/sv/socket0/local-core-01/local-cpu-00 -T2=0 \\\n',
'-t3=Idle -D3=/sv/socket0/local-core-02/local-cpu-01 -T3=0 \\\n',]
I want to add the core number into a variable preferably.
Maybe use a regex in a list comprehension:
l = ['-t2=Idle -D2=/sv/socket0/local-core-01/local-cpu-00 -T2=0 \\n',
'-t3=Idle -D3=/sv/socket0/local-core-02/local-cpu-01 -T3=0 \\n',]
import re
out = [m.group(1) if (m:=re.search(r'core-(\d+)', s)) else None for s in l]
Output:
['01', '02']
For integers:
out = [int(m.group(1)) if (m:=re.search(r'core-(\d+)', s)) else None for s in l]
Output:
[1, 2]
I am trying to convert this string '4-6,10-12,16' into a list that looks like this [4,"-",6,10,"-",12,16]. There would be a combination of integers and the special character "-" in the list.
I was trying to use a regex code in python but I could only do it to extract the numbers, however, I need the dashes as well in the list. How can I include dashes with numbers in the list?
Here is my code:
interval='4-6,10-12,16'
import re
l=[int(s) for s in re.findall(r'\b\d+\b', interval)]
Try this:
interval='4-6,10-12,16'
import re
l=[int(s) if s.isnumeric() else s for s in re.findall(r'\d+|-', interval)]
l
Output:
[4, '-', 6, 10, '-', 12, 16]
You can use
import re
interval='4-6,10-12,16'
l=[int(s) if all(c.isdigit() for c in s) else '-' for s in re.findall(r'\d+|-', interval)]
print(l) # => [4, '-', 6, 10, '-', 12, 16]
See the Python demo.
Details:
re.findall(r'\d+|-', interval) extracts digit sequences or - chars
int(s) if all(c.isdigit() for c in s) else '-' either casts a digit sequence to an int if the whole match consists of digits, or just returns - as a string.
Useful functions:
str.isdigit (or str.isnumeric or str.isdecimal);
itertools.groupby to group adjacent characters that share a characteristic.
from itertools import groupby
def tokenize_digits_and_dashes(s):
    for k, g in groupby(s, key=lambda c: (c.isdigit(), c == '-')):
        if k == (True, False):
            yield int(''.join(g))
        elif k == (False, True):
            yield '-'
print(list(tokenize_digits_and_dashes('4-6,10-12,16')))
# [4, '-', 6, 10, '-', 12, 16]
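The three string methods listed above are not interchangeable. As far as I know, only str.isdecimal guarantees that int() will accept the string. A small sketch, with sample strings chosen only for illustration:

```python
samples = ['12', '²', '½', '-3']
for text in samples:
    # columns: the string, isdecimal, isdigit, isnumeric
    print(repr(text), text.isdecimal(), text.isdigit(), text.isnumeric())

# Keep only the strings that int() is expected to parse.
safe = [text for text in samples if text.isdecimal()]
print(safe)  # ['12']
```

For plain ASCII input such as '4-6,10-12,16' all three methods behave the same, so the choice only matters when the input may contain other Unicode characters.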
Alternative approach
Your string already contains separators in the form of commas ,. These are useful! Don't ignore them. You can split the list on the separators using str.split.
def tokenize_intervals(s):
    for interval in s.split(','):
        i = interval.split('-')
        if len(i) == 2:
            yield tuple(int(''.join(w)) for w in i)
        elif len(i) == 1:
            x = int(''.join(i[0]))
            yield (x, x)
print(list(tokenize_intervals('4-6,10-12,16')))
# [(4, 6), (10, 12), (16, 16)]
# By Using Regex #
# -------------- #
import re
interval = '4-6,10-12,16'
s_list = re.findall(r'\d+|-', interval)  # digit runs or single dashes
x = [int(_) if _.isnumeric() else _ for _ in s_list]
print(x)
# By Using the split method #
# ------------------------- #
final_list = []
for _ in interval.split(','):
    sub_list = _.split('-')
    for pos, i in enumerate(sub_list):
        if i.isnumeric():
            final_list.append(int(i))
        # a dash separated this part from the next one
        if pos < len(sub_list) - 1:
            final_list.append('-')
print(final_list)
# By Checking Character By Character #
# ---------------------------------- #
z = ""
s = []
count = 0
for _ in interval:
    count += 1
    if _.isnumeric():
        z += _
        if count == len(interval):
            s.append(int(z))
    elif _ == '-':
        s.append(int(z))
        z = ""
        s.append('-')
    else:
        s.append(int(z))
        z = ""
print(s)
I have a string bar:
bar = 'S17H10E7S5H3E2S105H90E15'
I take this string and form groups that start with the letter S:
groups = ['S' + elem for elem in bar.split('S') if elem != '']
groups
['S17H10E7', 'S5H3E2', 'S105H90E15']
Without using the mini-language RegEx, I'd like to be able to get the integer values that follow the different letters S, H, and E in these groups. To do so, I'm using:
code = 'S'
temp_num = []
for elem in groups:
    start = elem.find(code)
    for char in elem[start + 1: ]:
        if not char.isdigit():
            break
        else:
            temp_num.append(char)
num_tests = ','.join(temp_num)
This gives me:
print(groups)
['S17H10E7', 'S5H3E2', 'S105H90E15']
print(temp_num)
['1', '7', '5', '1', '0', '5']
print(num_tests)
1,7,5,1,0,5
How would I take these individual integers 1, 7, 5, 1, 0, and 5 and put them back together to form a list of the digits following the code S? For example:
[17, 5, 105]
UPDATE:
In addition to the accepted answer, here is another solution:
def count_numbers_after_code(string_to_read, code):
    index_values = [i for i, char in enumerate(string_to_read) if char == code]
    temp_1 = []
    temp_2 = []
    for idx in index_values:
        temp_number = []
        for character in string_to_read[idx + 1: ]:
            if not character.isdigit():
                break
            else:
                temp_number.append(character)
        temp_1 = ''.join(temp_number)
        temp_2.append(int(temp_1))
    return sum(temp_2)
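Note that the function above returns the sum of the numbers. If the goal is the list from the question ([17, 5, 105]), a variant along the same lines might look like this. The name numbers_after_code is only illustrative, and the sample string is the one that matches the groups shown in the question.

```python
def numbers_after_code(string_to_read, code):
    """Collect the integer that follows each occurrence of `code`."""
    numbers = []
    for idx, char in enumerate(string_to_read):
        if char != code:
            continue
        digits = []
        for character in string_to_read[idx + 1:]:
            if not character.isdigit():
                break
            digits.append(character)
        if digits:  # skip a code letter with no digits after it
            numbers.append(int(''.join(digits)))
    return numbers

bar = 'S17H10E7S5H3E2S105H90E15'
result = numbers_after_code(bar, 'S')
print(result)  # [17, 5, 105]
```

Calling sum(result) gives the same total as the original function.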
Would something like this work?
def get_numbers_after_letter(letter, bar):
    current = True
    out = []
    for x in bar:
        if x==letter:
            out.append('')
            current = True
        elif x.isnumeric() and current:
            out[-1] += x
        elif x.isalpha() and x!=letter:
            current = False
    return list(map(int, out))
Output:
>>> get_numbers_after_letter('S', bar)
[17, 5, 105]
>>> get_numbers_after_letter('H', bar)
[10, 3, 90]
>>> get_numbers_after_letter('E', bar)
[7, 2, 15]
I think it's better to get all the numbers after every letter, since we're making a pass over the string anyway but if you don't want to do that, I guess this could work.
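Collecting the numbers for every letter in one pass, as suggested above, could be sketched like this. The function name numbers_by_letter is hypothetical, and the sketch assumes the string consists of single letters each followed by a run of digits.

```python
from collections import defaultdict

def numbers_by_letter(text):
    """Map each letter to the list of integers that directly follow it."""
    result = defaultdict(list)
    letter, digits = None, ''
    for ch in text + ' ':  # trailing sentinel flushes the last number
        if ch.isdecimal():
            digits += ch
            continue
        # a non-digit ends the current number, if there is one
        if letter is not None and digits:
            result[letter].append(int(digits))
        letter, digits = (ch if ch.isalpha() else None), ''
    return dict(result)

bar = 'S17H10E7S5H3E2S105H90E15'
by_letter = numbers_by_letter(bar)
print(by_letter)
# {'S': [17, 5, 105], 'H': [10, 3, 90], 'E': [7, 2, 15]}
```

Looking up by_letter['S'] afterwards is then a plain dictionary access, so the string is scanned only once no matter how many letters you ask about.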
The question says you would prefer a solution without regex ("unless absolutely necessary", per the comments).
It is not necessary of course, but as an alternative for future readers you can match S and capture 1 or more digits using (\d+) in a group that will be returned by re.findall.
import re
bar = 'S17H10E7S5H3E2S105H90E15'
print(re.findall(r"S(\d+)", bar))
Output
['17', '5', '105']
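re.findall returns strings, so one more step is needed to get integers. A minimal sketch that also works for other letters; the helper name numbers_after is an assumption for illustration only.

```python
import re

def numbers_after(letter, text):
    # re.escape guards against letters that are regex metacharacters
    pattern = re.escape(letter) + r"(\d+)"
    return [int(n) for n in re.findall(pattern, text)]

bar = 'S17H10E7S5H3E2S105H90E15'
print(numbers_after('S', bar))  # [17, 5, 105]
print(numbers_after('H', bar))  # [10, 3, 90]
```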
An example of the list would be this:
Name
KOI-234
KOI-123
KOI-3004
KOI-21
KOI-4325
and I simply want to make all these numbers to have at least 4 characters, so it would look like this:
Name
KOI-0234
KOI-0123
KOI-3004
KOI-0021
KOI-4325
I've already tried this code, but I guess it treats the 'KOI' part as non-numeric and doesn't add the zeros.
first_list = db['Name']
second_list = []
for pl in first_list:
    second_list.append(pl.zfill(4))
So, how can I achieve that?
You can use format specifications:
lst = ['KOI-234', 'KOI-123', 'KOI-3004', 'KOI-21', 'KOI-4325']
['{}-{:0>4}'.format(*i.split('-')) for i in lst]
# ['KOI-0234', 'KOI-0123', 'KOI-3004', 'KOI-0021', 'KOI-4325']
If you want to remove leading zeros:
[f'{i}-{int(j)}' for i, j in map(lambda x: x.split('-'), lst)]
It does not add zeroes because every element/name already has more than 4 symbols.
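To make that concrete: str.zfill pads the whole string to the requested width, so it has to be applied to the numeric part alone. A short sketch, assuming every name contains exactly one hyphen:

```python
name = 'KOI-21'
print(name.zfill(4))  # KOI-21 (already 6 characters, so nothing is added)

prefix, number = name.split('-')
padded = f'{prefix}-{number.zfill(4)}'
print(padded)  # KOI-0021
```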
You can try using regular expressions:
import re
my_list = ['KOI-123', 'KOI-3004', 'KOI-21']
pattern = r'(?<=-)\w+' # regex to capture the part of the string after the hyphen
for pl in my_list:
    match_after_dash = re.search(pattern, pl) # find the matching object after the hyphen
    pl = 'KOI-' + match_after_dash.group(0).zfill(4) # concatenate the first (fixed?) part of string with the numbers part
    print(pl) # print out the resulting value of a list element
You can use str.split:
n, *d = ['Name', 'KOI-234', 'KOI-123', 'KOI-3004', 'KOI-21', 'KOI-4325']
result = [n, *[f'{a}-{b.zfill(4)}' for a, b in map(lambda x:x.split('-'), d)]]
Output:
['Name', 'KOI-0234', 'KOI-0123', 'KOI-3004', 'KOI-0021', 'KOI-4325']
And if you want to compute the offset value generically:
n, *d = ['Name', 'KOI-234', 'KOI-123', 'KOI-3004', 'KOI-21', 'KOI-4325']
_d = [i.split('-') for i in d]
offset = max(map(len, [b for _, b in _d]))
result = [n, *[f'{a}-{b.zfill(offset)}' for a, b in _d]]
Output:
['Name', 'KOI-0234', 'KOI-0123', 'KOI-3004', 'KOI-0021', 'KOI-4325']
I would like to convert this list:
a = [['0001', '0101'], ['1100', '0011']]
to:
a' = [['1110', '1010'],['0011','1100']]
In the second example, every character is changed to its opposite (i.e. '1' is changed to '0' and '0' is changed to '1').
The code I have tried is:
for i in a:
    for j in i:
        s=list(j)
        for k in s:
            position = s.index(k)
            if k=='0':
                s[position] = '1'
            elif k=='1':
                s[position] = '0'
        ''.join(s)
But it doesn't work properly. What can I do?
Thanks
You can use a function that flips the bits like this:
flip_table = str.maketrans('01', '10')  # Python 3; on Python 2 use string.maketrans
def flip(s):
    return s.translate(flip_table)
Then just call it on each item in the list like this:
>>> flip('1100')
'0011'
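Applied to the nested list from the question, the same translation table can be used inside a nested list comprehension. This sketch assumes Python 3, where maketrans is a static method on str:

```python
flip_table = str.maketrans('01', '10')  # maps '0' -> '1' and '1' -> '0'

a = [['0001', '0101'], ['1100', '0011']]
flipped = [[s.translate(flip_table) for s in row] for row in a]
print(flipped)  # [['1110', '1010'], ['0011', '1100']]
```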
[["".join([str(int(not int(t))) for t in x]) for x in d] for d in a]
Example:
>>> a = [['0001', '0101'], ['1100', '0011']]
>>> a_ = [["".join([str(int(not int(t))) for t in x]) for x in d] for d in a]
>>> a_
[['1110', '1010'], ['0011', '1100']]
Using a simple list comprehension:
[[k.translate({48:'1', 49:'0'}) for k in i] for i in a]
48 is the code for "0", and 49 is the code for "1".
Demo:
>>> a = [['0001', '0101'], ['1100', '0011']]
>>> [[k.translate({48:'1', 49:'0'}) for k in i] for i in a]
[['1110', '1010'], ['0011', '1100']]
For Python 2.x:
from string import translate, maketrans
[[translate(k, maketrans('10', '01')) for k in i] for i in a]
from ast import literal_eval
import re
a = [['0001', '0101'], ['1100', '0011']]
print(literal_eval(re.sub('[01]',lambda m: '0' if m.group()=='1' else '1',str(a))))
ast.literal_eval() only evaluates Python literals, so it is safer than eval().