Remove different substrings from list of strings - python

I have a list of strings with two different prefixes that I would like to remove.
example_list=[
'/test1/test2/test3/ABCD_1',
'/test1/test2/test3/ABCD_2',
'/test1/test2/test3/ABCD_3',
'/test1/test4/test5/test6/ABCD_4',
'/test1/test4/test5/test6/ABCD_5',
'/test1/test4/test5/test6/ABCD_6',
'/test1/test4/test5/test6/ABCD_7']
I would like the new list to look like:
example_list=[
'ABCD_1',
'ABCD_2',
'ABCD_3',
'ABCD_4',
'ABCD_5',
'ABCD_6',
'ABCD_7']
I was trying something like this, but keep running into errors.
for i in example_list:
    if i.startswith('/test1/test2/test3/'):
        i=i[19:]
    else:
        i=i[25:]

example_list = [path.split('/')[-1] for path in example_list]
Output:
['ABCD_1', 'ABCD_2', 'ABCD_3', 'ABCD_4', 'ABCD_5', 'ABCD_6', 'ABCD_7']

Given that these are all filesystem paths, I suggest you use pathlib:
from pathlib import Path
example_list = [
'/test1/test2/test3/ABCD_1',
'/test1/test2/test3/ABCD_2',
'/test1/test2/test3/ABCD_3',
'/test1/test4/test5/test6/ABCD_4',
'/test1/test4/test5/test6/ABCD_5',
'/test1/test4/test5/test6/ABCD_6',
'/test1/test4/test5/test6/ABCD_7']
res = [Path(item).name for item in example_list]
print(res) # ['ABCD_1', 'ABCD_2', 'ABCD_3', 'ABCD_4', 'ABCD_5', 'ABCD_6', 'ABCD_7']

If every name has the same length (six characters here), just slice from the end:
new_list=[]
for i in example_list:
    j=i[-6:]
    new_list.append(j)
print(new_list)
Output will be
['ABCD_1', 'ABCD_2', 'ABCD_3', 'ABCD_4', 'ABCD_5', 'ABCD_6', 'ABCD_7']
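
The fixed-width slice above relies on every name being exactly six characters long. Here is a minimal sketch of a length-independent alternative, assuming the names themselves never contain a slash. The `mixed_list` data is hypothetical and not taken from the question:

```python
# Hypothetical data: the second path ends in a seven-character name.
mixed_list = [
    '/test1/test2/test3/ABCD_1',
    '/test1/test4/test5/test6/ABCD_10',
]

# Fixed-width slicing keeps only the last six characters of each path.
sliced = [p[-6:] for p in mixed_list]

# rsplit with maxsplit=1 cuts at the last slash, whatever the name length.
names = [p.rsplit('/', 1)[-1] for p in mixed_list]

print(sliced)  # ['ABCD_1', 'BCD_10']
print(names)   # ['ABCD_1', 'ABCD_10']
```

The slice silently truncates the longer name, while splitting on the last slash keeps it whole.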

Related

Remove Prefixes From a String

What's a cute way to do this in python?
Say we have a list of strings:
clean_be
clean_be_al
clean_fish_po
clean_po
and we want the output to be:
be
be_al
fish_po
po

Another approach which will work for all scenarios:
import re
data = ['clean_be',
'clean_be_al',
'clean_fish_po',
'clean_po', 'clean_a', 'clean_clean', 'clean_clean_1']
for item in data:
    item = re.sub('^clean_', '', item)
    print(item)
Output:
be
be_al
fish_po
po
a
clean
clean_1

Here is a possible solution that works with any prefix:
prefix = 'clean_'
result = [s[len(prefix):] if s.startswith(prefix) else s for s in lst]

You've provided only minimal information on what you're trying to achieve, but the desired output for the 4 given inputs can be created with the following function:
def func(string):
    # drop everything up to and including the first underscore
    return "_".join(string.split("_")[1:])

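You can do this:
strlist = ['clean_be','clean_be_al','clean_fish_po','clean_po']
def func(myList: list, start: str):
    ret = []
    for element in myList:
        # slice off the prefix only when it is actually present;
        # element.lstrip(start) would strip a set of characters, not a prefix
        ret.append(element[len(start):] if element.startswith(start) else element)
    return ret
print(func(strlist, 'clean_'))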

There are many ways to do this, based on what you have provided.
Apart from the answers above, you can also do it this way:
string = 'clean_be_al'
string = string.replace('clean_','',1)
This removes the first occurrence of clean_ in the string, wherever it appears.
Also, if the string is guaranteed to start with 'clean_', you can slice off the first six characters:
string = 'clean_be_al'
print(string[6:])

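lstrip and rstrip are often used to remove a prefix or a suffix, and they happen to work for this input:
line = "clean_be"
print(line.lstrip("clean_"))
Drawback:
lstrip([chars])
The [chars] argument is not a prefix; rather, all combinations of its values are stripped. Any leading run of the characters c, l, e, a, n and _ is removed, so a string such as "clean_eel" is stripped down to an empty string.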
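
To make the drawback concrete, here is a small sketch comparing lstrip with a literal prefix removal. The helper name `remove_prefix` and the input `"clean_eel"` are hypothetical, chosen only for illustration. On Python 3.9 and later the built-in `str.removeprefix` does the same thing as the helper.

```python
def remove_prefix(text, prefix):
    # Hypothetical helper: drop `prefix` only when `text` really starts with it.
    if text.startswith(prefix):
        return text[len(prefix):]
    return text

word = "clean_eel"  # hypothetical input, not from the question

# lstrip treats "clean_" as a set of characters and strips every leading match.
print(repr(word.lstrip("clean_")))          # ''

# The helper removes the literal prefix exactly once.
print(repr(remove_prefix(word, "clean_")))  # 'eel'
```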

Converting A List With Python

I have a large list of names which is in this format
list1 = ["apple", "orange", "banana", "pine-apple"]
And I want it in this format
list1 = ["'apple'", "'orange'", "'banana'", "'pine-apple'"]
Basically, I want to add quotation marks around every single word in the list,
but since the list is too large, I can't do it manually.
Is there any Python function or other way to do this? Thank you.

The names in Python are already strings, written inside quotes as you have shown here. I assume you want to wrap each string in a specific quote character so that it looks like '"apple"' or "'apple'". To do so, you can use the following snippet:
q = "'" # this will be wrapped around the string
list1 = ['apple','orange','banana','pine-apple']
list1 = [q+x+q for x in list1]
For reference, the syntax used in the last line is known as a list comprehension.
Following up on the latest comment posted by @xdhmoore:
If you are using vim/nano (Linux/macOS) or Notepad (Windows), I would suggest using IDLE instead (it ships with the Python installer).
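
The wrapping step can also be sketched with an f-string. `wrap_in_quotes` is a hypothetical helper name, and the default quote character is an assumption based on the desired output in the question:

```python
def wrap_in_quotes(words, quote="'"):
    # Return a new list with `quote` added on both sides of each word.
    return [f"{quote}{word}{quote}" for word in words]

list1 = ["apple", "orange", "banana", "pine-apple"]
quoted = wrap_in_quotes(list1)
print(quoted)  # ["'apple'", "'orange'", "'banana'", "'pine-apple'"]
```

Passing `quote='"'` would wrap each word in double quotes instead.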

str is the built-in function that converts a value into a string.
You can run this code:
for i in range(len(list1)):
    # replace each element in place with its string form
    list1[i] = str(list1[i])

Use a for loop to process each line of text; there are two ways to handle a line:
text = "list1 = [apple,orange,banana,pine-apple]"
start = text.find('[')+1
stop = text.find(']')
lst = text[start:stop].split(',') # ['apple', 'orange', 'banana', 'pine-apple']
new_lst = [f'"{item}"' for item in lst] # ['"apple"', '"orange"', '"banana"', '"pine-apple"']
new_text1 = text[:start]+','.join(new_lst)+text[stop:] # 'list1 = ["apple","orange","banana","pine-apple"]'
text = "list1 = [apple,orange,banana,pine-apple]"
new_text2 = text.replace('[', '["').replace(']', '"]').replace(',', '","')

Extract int between two different strings in python

I have a list files of strings of the following format:
files = ['/misc/lmbraid17/bensch/u-net-3d/2dcellnet/2dcellnet_v6w4l1/2dcellnet_v6w4l1_snapshot_iter_418000.caffemodel.h5',
'/misc/lmbraid17/bensch/u-net-3d/2dcellnet/2dcellnet_v6w4l1/2dcellnet_v6w4l1_snapshot_iter_502000.caffemodel.h5', ...]
I want to extract the int between iter_ and .caffemodel and return a list of those ints.
After some research I came up with this solution that does the trick, but I was wondering if there is a more elegant/pythonic way to do it, possibly using a list comprehension?
li = []
for f in files:
    tmp = re.search('iter_[\d]+.caffemodel', f).group()
    li.append(int(re.search(r'\d+', tmp).group()))

Just to add another possible solution: join the file names together into one big string (it looks like they all end with h5, so there is no danger of creating unwanted matches) and use re.findall on that:
import re
li = [int(d) for d in re.findall(r'iter_(\d+)\.caffemodel', ''.join(files))]

Just use:
li = []
for f in files:
    tmp = int(re.search(r'iter_(\d+)\.caffemodel', f).group(1))
    li.append(tmp)
If you put part of the pattern in parentheses, it becomes a capturing group; group(1) returns the text matched by the first such group.

You can also use a lookbehind assertion:
regex = re.compile(r"(?<=iter_)\d+")
for f in files:
    number = regex.search(f).group(0)  # still a string; wrap in int() if needed

Solution with list comprehension, as you wished:
import re
re_model_id = re.compile(r'iter_(?P<model_id>\d+)\.caffemodel')
li = [int(re_model_id.search(f).group('model_id')) for f in files]
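
All of the regex answers above assume that every file name matches; if one does not, `search` returns None and `.group(...)` raises an AttributeError. Below is a minimal sketch that skips non-matching names instead, assuming Python 3.8+ for the `:=` operator. The file names are shortened, hypothetical stand-ins for the ones in the question:

```python
import re

# Hypothetical file names; the second one has no iteration number.
files = [
    '/some/dir/net_snapshot_iter_418000.caffemodel.h5',
    '/some/dir/readme.txt',
    '/some/dir/net_snapshot_iter_502000.caffemodel.h5',
]

pattern = re.compile(r'iter_(\d+)\.caffemodel')

# Keep only the files where the pattern matches, then convert the group to int.
li = [int(m.group(1)) for f in files if (m := pattern.search(f))]
print(li)  # [418000, 502000]
```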

Without a regex:
files = [
'/misc/lmbraid17/bensch/u-net-3d/2dcellnet/2dcellnet_v6w4l1/2dcellnet_v6w4l1_snapshot_iter_418000.caffemodel.h5',
'/misc/lmbraid17/bensch/u-net-3d/2dcellnet/2dcellnet_v6w4l1/2dcellnet_v6w4l1_snapshot_iter_502000.caffemodel.h5']
print([f.rsplit("_", 1)[1].split(".", 1)[0] for f in files])
['418000', '502000']
Or if you want to be more specific:
print([f.rsplit("iter_", 1)[1].split(".caffemodel", 1)[0] for f in files])
But your pattern seems to repeat so the first solution is probably sufficient.
You can also slice using find and rfind:
print( [f[f.find("iter_")+5: f.rfind("caffe")-1] for f in files])
['418000', '502000']

Get all characters after a certain character?

Let's say I have a list of strings like this:
list1 = [
"filename1.txt",
"file2.py",
"fileexample.tiff"
]
How would I be able to grab all characters after the '.', if it's not too much to ask, by using "for i in" and have them come back in a list, like this: ['.txt','.py','.tiff']

If you are dealing with filepaths, then you should use the os.path module
import os.path
list1 = ["filename1.txt","file2.py","fileexample.tiff"]
print([os.path.splitext(f)[1] for f in list1])
prints
['.txt', '.py', '.tiff']

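import os
for i in list1:
    fileName, fileExtension = os.path.splitext(i)
    print(fileExtension)
Second option (this returns the extension without the leading dot, and assumes each name contains exactly one dot):
[i.split('.')[1] for i in list1]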

list(map(lambda s: s.rsplit(".", 1)[-1], list1))
is probably how I would do it.
It splits each item from the right exactly once on a period and keeps whatever is on the right-hand side (the extension without the dot).
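
As another option in the same spirit as os.path.splitext, here is a sketch using pathlib, which keeps the leading dot and only looks at the last dot in the name. The entry `"archive.tar.gz"` is a hypothetical addition to show the multi-dot case:

```python
from pathlib import PurePath

# The first three names come from the question; the last one is hypothetical.
list1 = ["filename1.txt", "file2.py", "fileexample.tiff", "archive.tar.gz"]

# .suffix returns the final extension, including the leading dot.
extensions = [PurePath(name).suffix for name in list1]
print(extensions)  # ['.txt', '.py', '.tiff', '.gz']
```

For the multi-dot name, `split('.')[1]` would return `'tar'` rather than the final extension.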

concatenate all items from two lists in Python

I want to produce a list of possible websites from two lists:
strings = ["string1", "string2", "string3"]
tlds = ["com", "net", "org"]
to produce the following output:
string1.com
string1.net
string1.org
string2.com
string2.net
string2.org
I've got to this:
for i in strings:
    print(i + tlds[0:])
But I can't concatenate str and list objects. How can I join these?

itertools.product is designed for this purpose.
import itertools
url_tuples = itertools.product(strings, tlds)
urls = ['.'.join(url_tuple) for url_tuple in url_tuples]
print(urls)

A (nested) list comprehension would be another alternative:
[s + '.' + tld for s in strings for tld in tlds]
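
A short, self-contained sketch showing that the nested comprehension and itertools.product build the same list in the same order, using the data from the question (the variable names `nested` and `via_product` are only for illustration):

```python
from itertools import product

strings = ["string1", "string2", "string3"]
tlds = ["com", "net", "org"]

# The outer loop runs over strings, the inner loop over tlds.
nested = [s + '.' + tld for s in strings for tld in tlds]

# product yields (string, tld) pairs in the same nested-loop order.
via_product = ['.'.join(pair) for pair in product(strings, tlds)]

print(nested == via_product)  # True
print(nested[:3])             # ['string1.com', 'string1.net', 'string1.org']
```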

The itertools module provides a function that does this.
from itertools import product
urls = [".".join(elem) for elem in product(strings, tlds)]
The urls variable now holds this list:
['string1.com',
'string1.net',
'string1.org',
'string2.com',
'string2.net',
'string2.org',
'string3.com',
'string3.net',
'string3.org']

One very simple way to write this is the same as in most other languages.
for s in strings:
    for t in tlds:
        print(s + '.' + t)
