Search a list using a string - python

I have a list of lists:
A = [['andy', 'dear', 'boy', 'tobe', 'todo'],
     ['where', 'what', 'when', 'how'],
     ['korea', 'japan', 'china', 'usa'],
     ['tweet', 'where', 'why', 'how']]
I have three questions, to be exact:
1. How do I retrieve a sub-list from this list using a particular element as a keyword? For instance, I want to retrieve all the lists that have the element 'why' in them. What is the best way of doing so?
2. How do I retrieve a sub-list from this list using part of a particular element as a keyword? For instance, I want to retrieve all the lists that have elements beginning with the characters 'wh'.
3. How do I get the position or index of the resulting sub-lists from either of these two search methods?
I am familiar with retrieving all the elements of a list that match a particular keyword, but it's confusing when it comes to retrieving all the lists that match a particular keyword...
Any guesses? Thanks in advance.

Simple and straightforward:
elem = 'why'
index_li = []
for idx, item in enumerate(A):
    for word in item:
        if word.startswith(elem):
            index_li.append(idx)
            break
print index_li
Example:
>>> elem = 'wh'
>>> index_li = []
>>> for idx, item in enumerate(A):
...     for word in item:
...         if word.startswith(elem):
...             print word, item
...             index_li.append(idx)
...             break
...
where ['where', 'what', 'when', 'how']
where ['tweet', 'where', 'why', 'how']
>>> print index_li
[1, 3]

For all the answers combined:
mylist = [['andy', 'dear', 'boy', 'tobe', 'todo'], ['where', 'what', 'when', 'how'], ['korea', 'japan', 'china', 'usa'], ['tweet', 'where', 'why', 'how']]
for num, sublist in enumerate(mylist):
    if 'why' in sublist:
        print sublist
    for ele in sublist:
        if ele.startswith('wh'):
            print ele, num

I love list comprehensions and functional programming, so here's that approach.
A
sub_list_a = [el for el in A if 'why' in el]
B
sub_list_b = [el for el in A if any(s.startswith('wh') for s in el)]
A little more difficult to read, but concise and sensible.
C
Then, to get the indices where each of the sub-lists was found:
location_a = [ii for ii, el in enumerate(A) if el in sub_list_a]
Simply replace a with b to get the locations for part B.
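As a rough sketch of part C without the second pass: since enumerate already yields the position, the index and the sub-list can be collected together. The names matches_a and matches_b are illustrative only, and part B assumes the match must be at the start of a word, as the question describes. Written for Python 3.

```python
A = [['andy', 'dear', 'boy', 'tobe', 'todo'],
     ['where', 'what', 'when', 'how'],
     ['korea', 'japan', 'china', 'usa'],
     ['tweet', 'where', 'why', 'how']]

# (index, sublist) pairs for sub-lists containing the exact element 'why'
matches_a = [(i, el) for i, el in enumerate(A) if 'why' in el]

# (index, sublist) pairs for sub-lists with any element starting with 'wh'
matches_b = [(i, el) for i, el in enumerate(A)
             if any(s.startswith('wh') for s in el)]

print([i for i, _ in matches_a])  # [3]
print([i for i, _ in matches_b])  # [1, 3]
```

Keeping the index next to the sub-list also avoids the `el in sub_list_a` comparison, which would report every position of a sub-list that happens to appear more than once in A.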

For the first:
>>> for l in A:
...     if 'why' in l:
...         print l
...
['tweet', 'where', 'why', 'how']
For the second (matching 'wh' anywhere in a word):
>>> for l in A:
...     for i in l:
...         if 'wh' in i:
...             print l
...             break
...
['where', 'what', 'when', 'how']
['tweet', 'where', 'why', 'how']
To test only at the beginning of a word, try this (using startswith() from @Harido):
>>> for l in A:
...     for i in l:
...         if i.startswith('wh'):
...             print l
...             break
...
['where', 'what', 'when', 'how']
['tweet', 'where', 'why', 'how']
For the third:
To find the index, you can call A.index(l) after the print statement, for example:
>>> for l in A:
...     for i in l:
...         if 'wh' in i:
...             print l
...             print A.index(l)
...             break
...
['where', 'what', 'when', 'how']
1
['tweet', 'where', 'why', 'how']
3
But remember, I am not good at Python; someone else may give you better ways. (I am writing C-like code, which is poor.) I would like to share this link: Guido van Rossum
Edit:
Thanks to @Jaime for suggesting for k, l in enumerate(A):
>>> for k, l in enumerate(A):
...     for i in l:
...         if 'wh' in i:
...             print l
...             print "index =", k
...             break
...
['where', 'what', 'when', 'how']
index = 1
['tweet', 'where', 'why', 'how']
index = 3

You'd do this the same way you'd do it for a single list, except that you'll need an extra loop:
for inner_list in A:
    for element in inner_list:
        if <some condition>:
            <do something>
For positions, adding enumerate to the specific loop whose position you want will help, or simply calling something like index on each inner list will also do the job.
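One way to fill in that skeleton, as a sketch only: the condition is passed in as a function, and enumerate on the outer loop supplies the position. The name find_sublists and its parameters are hypothetical, not from any library. Written for Python 3.

```python
def find_sublists(lists, condition):
    """Return (position, sublist) pairs where some element satisfies condition."""
    found = []
    for position, inner_list in enumerate(lists):
        for element in inner_list:
            if condition(element):
                found.append((position, inner_list))
                break  # one hit is enough; don't add the same sub-list twice
    return found


A = [['andy', 'dear', 'boy', 'tobe', 'todo'],
     ['where', 'what', 'when', 'how'],
     ['korea', 'japan', 'china', 'usa'],
     ['tweet', 'where', 'why', 'how']]

exact = find_sublists(A, lambda e: e == 'why')
prefix = find_sublists(A, lambda e: e.startswith('wh'))
print(exact)                    # [(3, ['tweet', 'where', 'why', 'how'])]
print([p for p, _ in prefix])   # [1, 3]
```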

>>> A = [['andy', 'dear', 'boy', 'tobe', 'todo'], ['where', 'what', 'when', 'how'], ['korea', 'japan', 'china', 'usa'], ['tweet', 'where', 'why', 'how']]
>>> for i, sublist in enumerate(A):
...     if 'why' in sublist:
...         print i, sublist
...
3 ['tweet', 'where', 'why', 'how']
>>> for i, sublist in enumerate(A):
...     if any(word.startswith('wh') for word in sublist):
...         print i, sublist
...
1 ['where', 'what', 'when', 'how']
3 ['tweet', 'where', 'why', 'how']

a. Suppose you have a list A and the word you are looking for is word.
def howmuch(word):
    lis = []
    for x in A:
        # point A
        if x.count(word) > 0:
            # point A
            lis.append(x)
    return lis
Basically, lis is a list that holds all the sub-lists that contain the word. You iterate through the original list; each element x is itself a list. x.count(word) tells you how many times the word occurs in that sub-list, and 0 means it never appears. So if the count is greater than 0, the word must be in the sub-list, and we append the sub-list to lis. Then we return lis.
b. This is the same as problem a, except that at point A, once you have the sub-list, you run another for loop over each word in it. Python strings have a find method that tells you whether a substring occurs in a string; if it does, append the sub-list to lis:
http://docs.python.org/2/library/string.html
c. For problem a, once you have checked that the count is greater than 0, you know the word exists in the sub-list. Calling x.index(word) returns the word's index within that sub-list; for the index of the sub-list itself, A.index(x) returns its position in A. For problem b, once you find a string containing the proper substring, do the same.
All in all, you should look at the Python site; it documents a lot of the functions you can use.
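Steps a to c above could be sketched roughly as follows. The function names are made up for illustration; the only library calls used are list.count and str.find. str.find returns the position of the substring, or -1 if it is absent, so comparing the result with 0 tests for a match at the start of the word. Written for Python 3.

```python
A = [['andy', 'dear', 'boy', 'tobe', 'todo'],
     ['where', 'what', 'when', 'how'],
     ['korea', 'japan', 'china', 'usa'],
     ['tweet', 'where', 'why', 'how']]


def sublists_with_word(lists, word):
    # (a) list.count(word) > 0 means the word occurs in the sub-list
    return [x for x in lists if x.count(word) > 0]


def sublists_with_prefix(lists, prefix):
    # (b) a second loop over the words; find(prefix) == 0 means "starts with"
    result = []
    for x in lists:
        for w in x:
            if w.find(prefix) == 0:
                result.append(x)
                break  # keep each sub-list only once
    return result


def positions(lists, selected):
    # (c) identity comparison, so equal-looking sub-lists keep their own index
    return [i for i, x in enumerate(lists) if any(x is s for s in selected)]


print(positions(A, sublists_with_word(A, 'why')))    # [3]
print(positions(A, sublists_with_prefix(A, 'wh')))   # [1, 3]
```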

You can use this simple approach:
>>> key='wh'
>>> res=[(index, sublist) for index, sublist in enumerate(A) for value in sublist if value.startswith(key)]
>>> dict(res)
{1: ['where', 'what', 'when', 'how'], 3: ['tweet', 'where', 'why', 'how']}
enjoy!

Related

Split The Second String of Every Element in a List into Multiple Strings

I am very, very new to Python, so I'm still figuring out the basics.
I have a nested list with each element containing two strings, like so:
mylist = [['Wowza', 'Here is a string'],['omg', 'yet another string']]
I would like to iterate through each element in mylist and split the second string into multiple strings, so it looks like:
mylist = [['Wowza', 'Here', 'is', 'a', 'string'],['omg', 'yet', 'another', 'string']]
I have tried so many things, such as unzipping and
for elem in mylist:
    mylist.append(elem)
    NewList = [item[1].split(' ') for item in mylist]
    print(NewList)
and even
for elem in mylist:
    NewList = ' '.join(elem)
    def Convert(string):
        li = list(string.split(' '))
        return li
    print(Convert(NewList))
which just gives me a variable that contains a bunch of lists.
I know I'm way overcomplicating this, so any advice would be greatly appreciated.
You can use a list comprehension:
mylist = [['Wowza', 'Here is a string'],['omg', 'yet another string']]
req_list = [[i[0]] + i[1].split() for i in mylist]
# [['Wowza', 'Here', 'is', 'a', 'string'], ['omg', 'yet', 'another', 'string']]
I agree with @DeepakTripathi's list comprehension suggestion (+1), but I would structure it more descriptively:
>>> mylist = [['Wowza', 'Here is a string'], ['omg', 'yet another string']]
>>> newList = [[tag, *words.split()] for (tag, words) in mylist]
>>> print(newList)
[['Wowza', 'Here', 'is', 'a', 'string'], ['omg', 'yet', 'another', 'string']]
You can use the + operator on lists to combine them:
a = ['hi', 'multiple words here']
b = []
for i in a:
    b += i.split()

How to remove elements from a list on the basis of the rows in Python without messing up the indexes? [duplicate]

Let's assume I have a list of elements from which I want to remove a few rows, based on a list of row indexes. This is an example:
l = ['ciao','hopefully','we','are','going','to','sort','this','out']
idx = [1,3,5]
If I do the following, it doesn't work, because the loop doesn't account for the list getting shorter after each deletion:
for x in idx:
    del l[x]
# What happens? The code correctly removes only the first element of idx; after that it doesn't take into account that the list has shrunk, so the remaining indexes no longer correspond to the rows of the updated list.
Note: I cannot turn the list into an array and then use np.delete, as my list comes from web scraping and the conversion fails.
Can anyone suggest a way to remove a list's elements by row index in one shot, without the indexes shifting?
Thanks!
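Another approach, without enumerate (note that l.index(item) returns the first occurrence of item, so this only works if all the items in l are unique):
new_l = [item for item in l if l.index(item) not in idx]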
You can use a list comprehension and enumerate.
# enumerate gives the indexes 0, 1, 2, 3, 4, 5, 6, 7, 8 for the elements of l:
l = ['ciao','hopefully','we','are','going','to','sort','this','out']
idx = set([1,3,5])
out = [el for i, el in enumerate(l) if i not in idx]
print(out)
['ciao', 'we', 'going', 'sort', 'this', 'out']
If you must delete them in place:
>>> lst = ['ciao', 'hopefully', 'we', 'are', 'going', 'to', 'sort', 'this', 'out']
>>> idx = [1, 3, 5]
>>> for i in sorted(idx, reverse=True):
... del lst[i]
...
>>> lst
['ciao', 'we', 'going', 'sort', 'this', 'out']
If you do not need to delete in place, you can use a list comprehension to create a new list:
>>> lst = ['ciao', 'hopefully', 'we', 'are', 'going', 'to', 'sort', 'this', 'out']
>>> idx = [1, 3, 5]
>>> idx_set = set(idx)
>>> [v for i, v in enumerate(lst) if i not in idx_set]
['ciao', 'we', 'going', 'sort', 'this', 'out']
Of course, you can also achieve the effect of deleting in place through slice assignment:
>>> lst[:] = [v for i, v in enumerate(lst) if i not in idx_set]
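To make the index shifting concrete, here is a small sketch that runs the naive forward loop from the question next to the two fixes above. The helper names delete_in_place and without_indexes are illustrative only. Written for Python 3.

```python
def delete_in_place(items, indexes):
    """Delete the given positions from items, mutating it."""
    # Highest index first: removing an element only shifts the ones after it,
    # so the smaller indexes still point at the intended elements.
    for i in sorted(set(indexes), reverse=True):
        del items[i]


def without_indexes(items, indexes):
    """Return a new list that skips the given positions."""
    drop = set(indexes)  # set membership checks are cheap
    return [v for i, v in enumerate(items) if i not in drop]


words = ['ciao', 'hopefully', 'we', 'are', 'going', 'to', 'sort', 'this', 'out']

# The naive forward loop from the question, for comparison.
naive = list(words)
for x in [1, 3, 5]:
    del naive[x]
print(naive)  # ['ciao', 'we', 'are', 'to', 'sort', 'out']

fixed = list(words)
delete_in_place(fixed, [1, 3, 5])
print(fixed)  # ['ciao', 'we', 'going', 'sort', 'this', 'out']
print(without_indexes(words, [1, 3, 5]) == fixed)  # True
```

The forward loop removes 'hopefully' correctly, but by then 'going' has moved to index 3 and 'this' ends up at index 5, which is why the wrong words disappear.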

Return items around all instances of an item in a list

Say I have a list...
['a','brown','cat','runs','another','cat','jumps','up','the','hill']
...and I want to go through that list and return all instances of a specific item, as well as the 2 items leading up to and following that item. Exactly like this, if I am searching for 'cat':
[('a','brown','cat','runs','another'),('runs','another','cat','jumps','up')]
The order of the returned list of tuples is irrelevant. Ideally the code would handle instances where the word is the first or last in the list, and an efficient and compact piece of code would be better, of course.
Thanks again everybody, I am just getting my feet wet in Python and everybody here has been a huge help!
Without error checking:
words = ['a','brown','cat','runs','another','cat','jumps','up','the','hill']
the_word = 'cat'
seqs = []
for i, word in enumerate(words):
    if word == the_word:
        seqs.append(tuple(words[i-2:i+3]))
print seqs #Prints: [('a', 'brown', 'cat', 'runs', 'another'), ('runs', 'another', 'cat', 'jumps', 'up')]
A recursive solution:
def context(ls, s):
    if s not in ls:
        return []
    i = ls.index(s)  # position of the first occurrence of s
    return [tuple(ls[i-2:i+3])] + context(ls[i + 1:], s)
ls = ['a','brown','cat','runs','another','cat','jumps','up','the','hill']
print context(ls, 'cat')
Gives:
[('a','brown','cat','runs','another'),('runs','another','cat','jumps','up')]
With error checking:
def grep(in_list, word):
    out_list = []
    for i, val in enumerate(in_list):
        if val == word:
            lower = i-2 if i-2 > 0 else 0
            upper = i+3 if i+3 < len(in_list) else len(in_list)
            out_list.append(tuple(in_list[lower:upper]))
    return out_list
in_list = ['a', 'brown', 'cat', 'runs', 'another', 'cat', 'jumps', 'up', 'the', 'hill']
grep(in_list, "cat")
# output: [('a', 'brown', 'cat', 'runs', 'another'), ('runs', 'another', 'cat', 'jumps', 'up')]
grep(in_list, "the")
# output: [('jumps', 'up', 'the', 'hill')]
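A slightly more compact sketch of the same bounds handling, with the window size as a parameter. The function name context_windows and the width parameter are illustrative assumptions, not part of the answers above. Only the lower bound needs clamping: a negative slice start would count from the end of the list, whereas a slice end past the last element is simply cut off. Written for Python 3.

```python
def context_windows(words, target, width=2):
    """Return a tuple of up to `width` neighbours on each side of every match."""
    windows = []
    for i, word in enumerate(words):
        if word == target:
            start = max(i - width, 0)  # a negative start would wrap around
            # slicing past the end of the list is safe, so no upper clamp
            windows.append(tuple(words[start:i + width + 1]))
    return windows


words = ['a', 'brown', 'cat', 'runs', 'another', 'cat', 'jumps', 'up', 'the', 'hill']
print(context_windows(words, 'cat'))
# [('a', 'brown', 'cat', 'runs', 'another'), ('runs', 'another', 'cat', 'jumps', 'up')]
print(context_windows(words, 'a'))     # [('a', 'brown', 'cat')]
print(context_windows(words, 'hill'))  # [('up', 'the', 'hill')]
```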

Split strings in a list of lists

I currently have a list of lists:
[['Hi my name is'],['What are you doing today'],['Would love some help']]
And I would like to split the strings in the lists, while remaining in their current location. For example
[['Hi','my','name','is']...]..
How can I do this?
Also, if I would like to use a specific one of the lists after searching for it (say I search for "Doing" and then want to append something to that specific list), how would I go about doing that?
You can use a list comprehension to create a new list of lists with all the sentences split:
list_of_lists = [lst[0].split() for lst in list_of_lists]
Now you can loop through this and find the list matching a condition:
for sublist in list_of_lists:
    if 'doing' in sublist:
        sublist.append('something')
Or, to search case-insensitively, use any() and a generator expression; this will test only the minimum number of words needed to find a match:
for sublist in list_of_lists:
    if any(w.lower() == 'doing' for w in sublist):
        sublist.append('something')
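Put together, the split and the case-insensitive search might look like the sketch below, using the "Doing" search from the question. The helper name append_where and the variable split_lists are hypothetical. Written for Python 3.

```python
list_of_lists = [['Hi my name is'], ['What are you doing today'], ['Would love some help']]

# Split the single sentence in each inner list; positions are preserved.
split_lists = [lst[0].split() for lst in list_of_lists]


def append_where(lists, word, extra):
    """Append extra to every sub-list containing word, ignoring case."""
    wanted = word.lower()
    for sublist in lists:
        # any() stops at the first matching word in the sub-list
        if any(w.lower() == wanted for w in sublist):
            sublist.append(extra)


append_where(split_lists, 'Doing', 'something')
print(split_lists[1])  # ['What', 'are', 'you', 'doing', 'today', 'something']
```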
list1 = [['Hi my name is'],['What are you doing today'],['Would love some help']]
use
[i[0].split() for i in list1]
then you will get the output like
[['Hi', 'my', 'name', 'is'], ['What', 'are', 'you', 'doing', 'today'], ['Would', 'love', 'some', 'help']]
l = [['Hi my name is'],['What are you doing today'],['Would love some help']]
for x in l:
    l[l.index(x)] = x[0].split(' ')
print l
Or simply:
l = [x[0].split(' ') for x in l]
Output
[['Hi', 'my', 'name', 'is'], ['What', 'are', 'you', 'doing', 'today'], ['Would', 'love', 'some', 'help']]

Python: Split list based on first character of word

I'm kind of stuck on an issue, and I've gone round and round with it until I've confused myself.
What I am trying to do is take a list of words:
['About', 'Absolutely', 'After', 'Aint', 'Alabama', 'AlabamaBill', 'All', 'Also', 'Amos', 'And', 'Anyhow', 'Are', 'As', 'At', 'Aunt', 'Aw', 'Bedlam', 'Behind', 'Besides', 'Biblical', 'Bill', 'Billgone']
Then sort them under their first letter, in alphabetical order:
A
About
Absolutely
After
B
Bedlam
Behind
etc...
Is there an easy way to do this?
Use itertools.groupby() to group your input by a specific key, such as the first letter:
from itertools import groupby
from operator import itemgetter
for letter, words in groupby(sorted(somelist), key=itemgetter(0)):
    print letter
    for word in words:
        print word
    print
If your list is already sorted, you can omit the sorted() call. The itemgetter(0) callable will return the first letter of each word (the character at index 0), and groupby() will then yield that key plus an iterable that consists only of those items for which the key remains the same. In this case that means looping over words gives you all items that start with the same character.
Demo:
>>> somelist = ['About', 'Absolutely', 'After', 'Aint', 'Alabama', 'AlabamaBill', 'All', 'Also', 'Amos', 'And', 'Anyhow', 'Are', 'As', 'At', 'Aunt', 'Aw', 'Bedlam', 'Behind', 'Besides', 'Biblical', 'Bill', 'Billgone']
>>> from itertools import groupby
>>> from operator import itemgetter
>>>
>>> for letter, words in groupby(sorted(somelist), key=itemgetter(0)):
...     print letter
...     for word in words:
...         print word
...     print
...
A
About
Absolutely
After
Aint
Alabama
AlabamaBill
All
Also
Amos
And
Anyhow
Are
As
At
Aunt
Aw
B
Bedlam
Behind
Besides
Biblical
Bill
Billgone
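The reason the sorted() call matters can be seen in a small sketch: groupby() only merges adjacent items with the same key, so unsorted input produces a new group every time the key changes. The short word list here is a made-up subset of the one in the question. Written for Python 3.

```python
from itertools import groupby
from operator import itemgetter

words = ['Bill', 'About', 'Bedlam', 'After']

# Without sorting, 'B' and 'A' each show up as two separate groups.
unsorted_keys = [letter for letter, _ in groupby(words, key=itemgetter(0))]
print(unsorted_keys)  # ['B', 'A', 'B', 'A']

# Sorting first puts equal keys next to each other, giving one group per letter.
grouped = {letter: list(group)
           for letter, group in groupby(sorted(words), key=itemgetter(0))}
print(grouped)  # {'A': ['About', 'After'], 'B': ['Bedlam', 'Bill']}
```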
Here is an approach without any library imports or anything fancy.
Here is the logic:
def splitLst(x):
    dictionary = dict()
    for word in x:
        f = word[0]
        if f in dictionary:
            dictionary[f].append(word)
        else:
            dictionary[f] = [word]
    return dictionary
splitLst(['About', 'Absolutely', 'After', 'Aint', 'Alabama', 'AlabamaBill', 'All', 'Also', 'Amos', 'And', 'Anyhow', 'Are', 'As', 'At', 'Aunt', 'Aw', 'Bedlam', 'Behind', 'Besides', 'Biblical', 'Bill', 'Billgone'])
def split(n):
    n2 = []
    for i in n:
        if i[0] not in n2:
            n2.append(i[0])
    n2.sort()
    for j in n:
        z = j[0]
        z1 = n2.index(z)
        n2.insert(z1+1, j)
    return n2
word_list = ['be','have','do','say','get','make','go','know','take','see','come','think',
             'look','want','give','use','find','tell','ask','work','seem','feel','leave','call']
print(split(word_list))
