Del part of list in python by for loop - python

I am trying to remove some words from a string that holds a list of words.
list1= "abc dfc kmc jhh jkl".
My goal is to remove the words from 'dfc' to 'jhh'. I am new to Python, so I tried some index-based approaches from C#, but they don't work here.
I am trying this:
index=0
for x in list1:
    if x=='dfc'
        currentindex=index
        for y in list1[currentindex:]
            if y!='jhh'
                break;
            del list1[currentindex]
        currentindex=index
    elif x=='jhh'
        break;

Instead of a long for loop, a simple slice in Python does the trick:
words = ['abc', 'dfc', 'kmc', 'jhh', 'jkl']
del words[1:4]
print(words)
Indexes start at 0, so you want to delete indexes 1 through 3. The slice is written 1:4 because the end of a slice is exclusive: Python stops one position before the end index (so at index 3). Much easier than a loop.
Here is your output:
['abc', 'jkl']
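If the positions are not known in advance, they can be looked up rather than hard-coded. The following is a minimal sketch, assuming both words are present and 'jhh' comes after 'dfc'; the function name remove_span is hypothetical and not part of the answer above.

```python
def remove_span(words, first, last):
    """Return a copy of words without the run from first to last (inclusive).

    Hypothetical helper: list.index raises ValueError if a word is missing.
    """
    start = words.index(first)
    end = words.index(last, start)  # search from start onward
    # Slice ends are exclusive, so end + 1 drops the last word as well.
    return words[:start] + words[end + 1:]


words = "abc dfc kmc jhh jkl".split()
print(remove_span(words, "dfc", "jhh"))
```

Returning a new list instead of using del leaves the original list untouched, which may or may not be what you want.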

>>> a = "abc dfc kmc jhh jkl"
>>> print(a.split("dfc")[0] + a.split("jhh")[1])
abc  jkl
Note the two spaces in the result: each piece keeps the space that sat next to the removed words. You can wrap the same treatment in a lambda:
b = lambda a,b,c : a.split(b)[0] + a.split(c)[1]
print(b(a, "dfc", "jhh"))

First, split the string into words:
list1 = "abc dfc kmc jhh jkl"
words = list1.split(" ")
Next, iterate through the words until you find a match:
start_match = "dfc"
start_index = 0
end_match = "jhh"
end_index = 0
for i in range(len(words)):
    if words[i] == start_match:
        start_index = i
    if words[i] == end_match:
        end_index = i
        break
print(' '.join(words[:start_index] + words[end_index+1:]))
Note: In the case of multiple matches, this deletes the fewest words possible (it uses the last start_match found before the first end_match).
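The behaviour with multiple matches can be checked with a small function. This is a sketch of a slight variant, not the answer's exact code: the name remove_between is hypothetical, and it ignores an end_match that appears before any start_match.

```python
def remove_between(text, start_match, end_match):
    """Remove the words from start_match to end_match (hypothetical helper)."""
    words = text.split()
    start_index = None
    for i, word in enumerate(words):
        if word == start_match:
            # Keep overwriting, so the last start before the end wins.
            start_index = i
        elif word == end_match and start_index is not None:
            # Stop at the first end that follows a start.
            return " ".join(words[:start_index] + words[i + 1:])
    return text  # no complete start..end pair found


print(remove_between("abc dfc kmc jhh jkl", "dfc", "jhh"))
print(remove_between("a dfc b dfc c jhh d jhh e", "dfc", "jhh"))
```

In the second call only the span from the second dfc to the first jhh is removed.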

list1= "abc dfc kmc jhh jkl".split() makes list1 as follows:
['abc', 'dfc', 'kmc', 'jhh', 'jkl']
Now if you want to remove a list element you can try either
list1.remove(item) #removes first occurrence of 'item' in list1
Or
list1.pop(index) #removes item at 'index' in list1

Create a list of words by splitting the string
list1= "abc dfc kmc jhh jkl".split()
Then iterate over the list, using a flag variable to indicate whether an element should be deleted from the list
flag = False
for x in list1[:]:  # iterate over a copy, because the loop removes items from list1
    if x == 'dfc':
        flag = True
    if x == 'jhh':
        list1.remove(x)
        flag = False
    if flag == True:
        list1.remove(x)
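Removing items from a list while looping over that same list can skip elements, so an alternative is to keep the flag idea but build a new list. A minimal sketch under that assumption; the name drop_range is hypothetical.

```python
def drop_range(words, first, last):
    """Return the words outside the first..last run (hypothetical helper)."""
    kept = []
    skipping = False
    for word in words:
        if word == first:
            skipping = True       # start dropping, including this word
        if not skipping:
            kept.append(word)
        if word == last:
            skipping = False      # the end word is dropped, then we resume
    return kept


list1 = "abc dfc kmc jhh jkl".split()
print(drop_range(list1, 'dfc', 'jhh'))
```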

There are several problems with what you have tried, especially:
list1 is a string, not a list
when you write list1[i], you get the character at index i (not a word)
in your for loop, you try to modify the string you iterate on: it is a very bad idea.
Here is my one-line suggestion using re.sub(), which replaces the part of the string that matches the given regex pattern. It may be sufficient for your purpose:
import re
list1= "abc dfc kmc jhh jkl"
list1 = re.sub(r'dfc .* jhh ', "", list1)
print(list1)
Note: I kept the identifier list1 even if it is a string.
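One thing to be aware of with this pattern is that .* is greedy. The sketch below uses a made-up input string with two dfc ... jhh runs to show the difference between the greedy and non-greedy forms; it is an illustration, not part of the original answer.

```python
import re

# Made-up input containing two separate "dfc ... jhh" runs.
text = "abc dfc kmc jhh x dfc y jhh jkl"

# Greedy: .* extends to the last " jhh ", removing everything in between.
greedy = re.sub(r'dfc .* jhh ', "", text)

# Non-greedy: .*? stops at the first " jhh " after each "dfc ".
lazy = re.sub(r'dfc .*? jhh ', "", text)

print(greedy)  # abc jkl
print(lazy)    # abc x jkl
```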

You can do it like this (note that this removes only the single word "dfc"):
test = list1.replace("dfc", "")

Related

I Would Like To Replace A Word With A Letter In Python

Code:
list = ['hello','world']
list2 = ['a','b']
string = 'hello'# should output a
string_fin = ''
for s in string:
    for i, j in zip (list, list2):
        if s == i:
            string_fin += j
print(string_fin)
I want to put hello or world in string = '' and get a or b as the output.
Instead I get an empty output.
I think this happens because hello and world have more characters than a and b: when I try a string with the same number of characters as a or b, it works.
Please help
Thanks
Your program's main loop never runs because string is empty! So your program is basically:
list = ['hello','world']
list2 = ['a','b']
string = ''
string_fin = ''
print(string_fin)
Based on how you worded your question, it is hard to tell exactly what you are trying to accomplish, but here is my attempt.
You have two lists: list1 and list2. (Please do not name a list list, because that shadows Python's built-in list type; use list1 instead.)
You want to check whether each word in your string matches with any word in your first list.
If it matches you want to take the corresponding word or letter from your second list, and append it into the string string_fin.
Finally, when you have looped through all the words, you print the content of string_fin.
The correct way to do this would be to split your string variable, and get each word stored in it.
string = 'hello or world'
stringWords = string.split()
Now, stringWords contains ['hello', 'or', 'world']. But I think you are not interested in the item or. So you can remove this item from the list, by using remove().
if 'or' in stringWords:
    stringWords.remove('or')
Now you have the words that you are interested in. And we want to check whether any word in the first list matches with these words. (Remember, I renamed the first list from list to list1 to prevent any unexpected behavior.)
for word in stringWords:
    tempIndex = list1.index(word)
    temp = list2[tempIndex]
    string_fin += temp
However, using index raises ValueError if a match is not found, so depending on your program logic, you may need to catch an exception and handle it.
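One way to handle that exception is sketched below. It assumes that words missing from list1 should simply be skipped, which may not match your program logic; the function name translate_words is hypothetical.

```python
list1 = ['hello', 'world']
list2 = ['a', 'b']


def translate_words(words):
    """Map each word found in list1 to its partner in list2 (hypothetical helper)."""
    result = []
    for word in words:
        try:
            result.append(list2[list1.index(word)])
        except ValueError:
            # list1.index raises ValueError for unknown words: skip them here.
            continue
    return result


print(translate_words('hello or world'.split()))
```

With this approach there is no need to remove 'or' beforehand, because it is skipped like any other unknown word.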
The string string_fin will now contain ab or a or b depending on the value inside string.
Now, since you wanted to print something like a or b, you can instead create a list, store the matching words in it, and then join this list using ' or ' as the separator.
string_fin = (' or ').join(tempList)
A complete program now will look like this:
list1 = ['hello', 'world']
list2 = ['a', 'b']
string = 'hello or world'
tempList = []
stringWords = string.split()
if 'or' in stringWords:
    stringWords.remove('or')
for word in stringWords:
    tempIndex = list1.index(word)
    temp = list2[tempIndex]
    tempList.append(temp)
string_fin = ' or '.join(tempList)
print(string_fin)
Better to store your lists as a dictionary, so you can do an easy lookup:
mapping = {'hello':'a', 'world':'b'}
string = 'hello or world'
out = []
for s in string.split():
    out.append( mapping.get( s, s ) )
print(' '.join(out))
Purists will note that the for loop can be made into a one-liner:
mapping = {'hello':'a', 'world':'b'}
string = 'hello or world'
out = ' '.join(mapping.get(s,s) for s in string.split())
print(out)
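If the data already lives in two parallel lists, the dictionary does not have to be typed by hand. A small sketch, assuming the two lists have the same length and line up position by position; the function name translate is hypothetical.

```python
list1 = ['hello', 'world']
list2 = ['a', 'b']

# zip pairs the lists element by element; dict turns the pairs into a lookup table.
mapping = dict(zip(list1, list2))


def translate(sentence, table):
    """Replace each known word, leaving unknown words unchanged (hypothetical helper)."""
    return ' '.join(table.get(word, word) for word in sentence.split())


print(translate('hello or world', mapping))  # a or b
```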

Searching for similar values within a regex string

I'm trying to use a regex to compare two lists whose strings are similar but not identical. How can I fix the problem below?
Script:
import re
list1 = [
'juice',
'potato']
list2 = [
'juice;44',
'potato;55',
'apple;66']
correlation = []
for a in list1:
    r = re.compile(r'\b{}\b'.format(a), re.I)
    for b in list2:
        if r.search(b):
            pass
        else:
            correlation.append(b)
print(correlation)
Output:
['potato;55', 'apple;66', 'juice;44', 'apple;66']
Desired Output:
['apple;66']
Regex:
You can create a single regex pattern to match terms from list1 as whole words, and then use filter:
import re
list1 = ['juice', 'potato']
list2 = ['juice;44', 'potato;55', 'apple;66']
rx = re.compile(r'\b(?:{})\b'.format("|".join(list1)))
print( list(filter(lambda x: not rx.search(x), list2)) )
# => ['apple;66']
The regex is \b(?:juice|potato)\b. The \b is a word boundary, so the regex matches juice or potato as whole words. filter(lambda x: not rx.search(x), list2) removes all items from list2 that match the regex.
First, the inner and outer for loops must be swapped to make this work.
Then set a flag to False before the inner for loop, set it to True inside the inner loop when a match is found, and after the inner loop append the item to correlation if the flag is still False.
This finally looks like:
import re
list1 = [
'juice',
'potato']
list2 = [
'juice;44',
'potato;55',
'apple;66']
correlation = []
for b in list2:
    found = False
    for a in list1:
        r = re.compile(r'\b{}\b'.format(a), re.I)
        if r.search(b):
            found = True
    if not found:
        correlation.append(b)
print(correlation)
Convert list1 into a single regexp that matches all the words. Then append the element of list2 if it doesn't match the regexp.
regex = re.compile(r'\b(?:' + '|'.join(re.escape(word) for word in list1) + r')\b')
correlation = [a for a in list2 if not regex.search(a)]
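Put together as a runnable whole, the single-pattern approach might look like the sketch below. The function name unmatched is hypothetical, and the re.I flag is carried over from the question's code as an assumption that matching should ignore case.

```python
import re

list1 = ['juice', 'potato']
list2 = ['juice;44', 'potato;55', 'apple;66']


def unmatched(terms, items):
    """Return the items that contain none of the terms as whole words (hypothetical helper).

    Assumes terms is non-empty; an empty alternation would match at any word boundary.
    """
    # re.escape guards against terms that contain regex metacharacters.
    alternation = '|'.join(re.escape(term) for term in terms)
    pattern = re.compile(r'\b(?:' + alternation + r')\b', re.I)
    return [item for item in items if not pattern.search(item)]


print(unmatched(list1, list2))  # ['apple;66']
```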

trimming words in a list of strings

I'm writing some code that trims the words in a list of strings. If the last character of a word is 't' or 's', it is removed, and if the first character is 'x', it is removed.
words = ['bees', 'xerez']
should return:
['bee', 'erez']
So far my solution is:
trim_last = [x[:-1] for x in words if x[-1] == 's' or 't']
I think this trims the last characters fine. I then try to trim the first characters if they are 'x' with this line:
trim_first = [x[1:] for x in trim_last if x[0] == 'x']
but this just returns an empty list. Can I somehow incorporate this into one working line?
[v.lstrip('x').rstrip('ts') for v in words]
You're doing a filter, not a mapping.
The right way would be
trim_first = [x[1:] if x.startswith('x') else x for x in trim_last]
Also, your solution should not return an empty list since the filter would match on the second element
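The difference between filtering and mapping in a comprehension can be seen side by side. This is a small illustration on the question's data; the variable names filtered and mapped are made up for the example.

```python
words = ['bees', 'xerez']

# Filter: a trailing "if" drops every word that fails the test.
filtered = [w[1:] for w in words if w.startswith('x')]

# Mapping: a conditional expression keeps every word and changes only some.
mapped = [w[1:] if w.startswith('x') else w for w in words]

print(filtered)  # ['erez']
print(mapped)    # ['bees', 'erez']
```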
In one step with re.sub() function:
import re
words = ['bees', 'xerez']
result = [re.sub(r'^x|[ts]$', '', w) for w in words]
print(result)
The output:
['bee', 'erez']
Just to chime in - since this is in fact, a mapping:
map(lambda x: x[1:] if x[0] == 'x' else x, words)
If you are looking for a one-liner you can use some arithmetic to play with the list slicing:
words = ['bees', 'xerez', 'xeret']
[w[w[0] == 'x' : len(w) - int(w[-1] in 'st')] for w in words]
# output: ['bee', 'erez', 'ere']
You can try this code:
trim_last = [x.lstrip('x').rstrip('t').rstrip('s') for x in words]
Why use two list comprehensions when you can do it with one?
One-line solution:
words = ['bees', 'xerez','hellot','xnewt']
print([item[:-1] if item.endswith('t') or item.endswith('s') else item for item in [item[1:] if item.startswith('x') else item for item in words]])
output:
['bee', 'erez', 'hello', 'new']
Explanation of above list comprehension :
final=[]
for item in words:
    sub_list=[]
    if item.endswith('t') or item.endswith('s'):
        sub_list.append(item[:-1])
    else:
        sub_list.append(item)
    for item in sub_list:
        if item.startswith('x'):
            final.append(item[1:])
        else:
            final.append(item)
print(final)

If list contains string print all the indexes / elements in the list that contain it

I am able to detect matches but unable to locate where they are.
Given the following list:
['A second goldfish is nice and all', 3456, 'test nice']
I need to search for match (i.e. "nice") and print all the list elements that contain it. Ideally if the keyword to search were "nice" the results should be:
'A second goldfish is nice and all'
'test nice'
I have:
list = data_array
string = str(raw_input("Search keyword: "))
print string
if any(string in s for s in list):
    print "Yes"
So it finds the match and prints both, the keyword and "Yes" but it doesn't tell me where it is.
Should I iterate through every index in list and search "string in s" on each iteration, or is there an easier way to do this?
Try this:
list = data_array
string = str(raw_input("Search keyword: "))
print string
for s in list:
    if string in str(s):
        print 'Yes'
        print list.index(s)
Edited into a working example. If you only want the first matching index, you can also break after the if statement evaluates to true.
matches = [s for s in my_list if my_string in str(s)]
or
matches = filter(lambda s: my_string in str(s), my_list)
Note that 'nice' in 3456 will raise a TypeError, which is why I used str() on the list elements. Whether that's appropriate depends on if you want to consider '45' to be in 3456 or not.
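If numbers should never count as matches, the elements can be tested with isinstance instead of being converted with str(). A minimal Python 3 sketch of that alternative; the function name string_matches is hypothetical.

```python
def string_matches(items, keyword):
    """Return only the string elements that contain keyword (hypothetical helper)."""
    # isinstance skips non-string items, so '45' does not match the number 3456.
    return [s for s in items if isinstance(s, str) and keyword in s]


data = ['A second goldfish is nice and all', 3456, 'test nice']
print(string_matches(data, 'nice'))
print(string_matches(data, '45'))
```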
print filter(lambda s: k in str(s), l)
To print all the elements that contain nice:
mylist = ['nice1', 'def456', 'ghi789', 'nice2', 'nice3']
sub = 'nice'
print("\n".join([e for e in mylist if sub in e]))
>>> nice1
nice2
nice3
To get the index of elements that contain nice (irrespective of the letter case)
mylist = ['nice1', 'def456', 'ghi789', 'Nice2', 'NicE3']
sub = 'nice'
index_list = []
i = 0
for e in mylist:
    if sub in e.lower():
        index_list.append(i)
    i +=1
print(index_list)
>>> [0, 3, 4]
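The manual counter can be replaced with enumerate, which yields each index together with its element. A sketch under the same case-insensitive assumption, with str() applied so that non-string items such as numbers do not raise an error; the name matching_indexes is hypothetical.

```python
def matching_indexes(items, sub):
    """Return the indexes of the items whose text contains sub, ignoring case."""
    needle = sub.lower()
    return [i for i, item in enumerate(items) if needle in str(item).lower()]


mylist = ['nice1', 'def456', 'ghi789', 'Nice2', 'NicE3']
print(matching_indexes(mylist, 'nice'))  # [0, 3, 4]
```

Unlike list.index(), this reports every position even when the same element appears more than once.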

Cross-matching two lists

I have two lists, and I am trying to see whether any element of the second list occurs as a substring of an element of the first.
["Po2311tato","Pin2231eap","Orange2231edg","add22131dfes"]
["2311","233412","2231"]
If an element of the first list contains any string from the second list, for example "Po2311tato" contains "2311", I want to put that element in a new list. The new list would hold every element of the first list that matches, so it would be ["Po2311tato","Pin2231eap","Orange2231edg"].
You can use the syntax 'substring' in string to do this:
a = ["Po2311tato","Pin2231eap","Orange2231edg","add22131dfes"]
b = ["2311","233412","2231"]
def has_substring(word):
    for substring in b:
        if substring in word:
            return True
    return False
print filter(has_substring, a)
Hope this helps!
This can be a little more concise than jobby's answer by using a list comprehension:
>>> list1 = ["Po2311tato","Pin2231eap","Orange2231edg","add22131dfes"]
>>> list2 = ["2311","233412","2231"]
>>> list3 = [string for string in list1 if any(substring in string for substring in list2)]
>>> list3
['Po2311tato', 'Pin2231eap', 'Orange2231edg']
Whether or not this is clearer / more elegant than jobby's version is a matter of taste!
import re
list1 = ["Po2311tato","Pin2231eap","Orange2231edg","add22131dfes"]
list2 = ["2311","233412","2231"]
matchlist = []
for str1 in list1:
    for str2 in list2:
        if (re.search(str2, str1)):
            matchlist.append(str1)
            break
print matchlist
