I am trying to write some code which adds an element in one list to another list and then removes it from the first list. It also should not add duplicates to the new list which is where the if statement comes in.
However, when adding to the 'individuals' list and removing from the 'sentence_list' list it misses out certain words such as 'not' and 'for'. This is also not random and the same words are missed each time. Any help?
sentence = "I am a yellow fish"
sentence_list = sentence.lower().split()
individuals = []
for i in sentence_list:
    if i in individuals:
        print ("yes")
        sentence_list.remove(i)
    else:
        individuals.append(i)
        sentence_list.remove(i)
print ("individuals", individuals)
print ("sentence_list", sentence_list)
The issue is that you are removing items from the list you are looping through. You can fix this just by making a copy of the list and looping through it instead, like this:
sentence = "ASK NOT WHAT YOUR COUNTRY CAN DO FOR YOU ASK WHAT YOU CAN DO FOR YOUR COUNTRY"
sentence_list = sentence.lower().split()
individuals = []
# We slice with [:] to make a copy of the list
orig_list = sentence_list[:]
for i in orig_list:
    if i in individuals:
        print ("yes")
        sentence_list.remove(i)
    else:
        individuals.append(i)
        sentence_list.remove(i)
print ("individuals", individuals)
print ("sentence_list", sentence_list)
The lists are now what was expected:
print(individuals)
print(sentence_list)
['ask', 'not', 'what', 'your', 'country', 'can', 'do', 'for', 'you']
[]
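To see why the original loop skips words, here is a minimal sketch with a made-up four-item list (the names letters and visited are only for illustration). The for loop advances an internal index on each pass, while remove() shifts the remaining items one slot to the left, so every other item is never visited:

```python
letters = ["a", "b", "c", "d"]
visited = []

# Removing the current item shifts everything after it one slot left,
# but the loop's internal index still moves forward by one.
for item in letters:
    visited.append(item)
    letters.remove(item)

print(visited)  # ['a', 'c']
print(letters)  # ['b', 'd'] were never visited
```

This is also why the same words are missed on every run: which items get skipped depends only on their positions in the list.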
In general you should not add or remove elements to a list as you iterate over it. Given that you are removing every single element of the list, just remove the lines with sentence_list.remove(i).
If you actually need to remove just some elements from the list you're iterating over, I'd either make a new empty list and add the elements you want to keep to it, or keep track of the relevant indices as you iterate and build the result after the loop.
For the first solution,
oldList = [1, 2, 3, 4]
newList = []
for i in oldList:
    shouldRemove = i % 2
    if not shouldRemove:
        newList.append(i)
For the second,
oldList = [1, 2, 3, 4]
indicesToKeep = []
for i, e in enumerate(oldList):
    shouldRemove = e % 2
    if not shouldRemove:
        indicesToKeep.append(i)
newList = [e for i, e in enumerate(oldList) if i in indicesToKeep]
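If the end goal is just the unique words in their original order, as in the question, no removal is needed at all. A minimal sketch, assuming Python 3.7 or later (where dicts preserve insertion order):

```python
sentence = "ASK NOT WHAT YOUR COUNTRY CAN DO FOR YOU ASK WHAT YOU CAN DO FOR YOUR COUNTRY"

# dict.fromkeys keeps the first occurrence of each key, in order
individuals = list(dict.fromkeys(sentence.lower().split()))

print(individuals)
# ['ask', 'not', 'what', 'your', 'country', 'can', 'do', 'for', 'you']
```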
I have a list of 200,000 words, a list containing indexes, and a keyword. The indexList is not predefined and can be of any size from 0 to len(keyword).
I wish to iterate through the 200,000 words and keep only the ones that contain the keyword's letters at the specified indexes.
Examples:
keyword = "BEANS"
indexList = [0, 3]
I want to keep words that contain 'B' at the 0th index and 'N' at the 3rd index.
keyword = "BEANS"
indexList = [0, 1, 2]
I want to keep words that contain 'B' at the 0th index, 'E' at the 1st index, and 'A' at the 2nd index.
keyword = "BEANS"
indexList = []
No specific indexes, so return all 200,000 words.
At the moment, I have this code. sampleSpace refers to the list of 200,000 words.
extractedList = []
for i in range(len(indexList)):
    for word in sampleSpace:
        if (word[indexList[i]] == keyword[indexList[i]]):
            extractedList.append(word)
However, this code is extracting words that have values at the first index OR values at the second index OR values at the Nth index.
I need words to have ALL of the letters at the specific index.
You can use a simple comprehension with all. Have the comprehension loop over all the words in the big word list, and then use all to check all the indices in indexList:
>>> from wordle_solver import wordle_corpus as corpus
>>> keyword = "BEANS"
>>> indexList = [0, 3]
>>> [word for word in corpus if all(keyword[i] == word[i] for i in indexList)]
['BLAND', 'BRUNT', 'BUNNY', 'BLANK', 'BRINE', 'BLEND', 'BLINK', 'BLUNT', 'BEING', 'BRING', 'BRINY', 'BOUND', 'BLOND', 'BURNT', 'BORNE', 'BRAND', 'BRINK', 'BLIND']
First, change your logic so that your outer loop is for word in sampleSpace. This is because you want to consider one word at a time and look at all the relevant indices in that word.
Next, look up the all() function, which returns True if all of the elements of the iterable you give it are truthy. How can we apply this here? We want to check if
all(
    word[index] == keyword[index] for index in indexList
)
So we have:
extractedWords = []
for word in sampleSpace:
    if all(word[index] == keyword[index] for index in indexList):
        extractedWords.append(word)
Now since this loop is just constructing a list, we can write it as a list comprehension like so:
extractedWords = [
    word
    for word in sampleSpace
    if all(word[index] == keyword[index] for index in indexList)
]
You can handle the case of an empty indexList separately using an if condition before you do any of this.
def search_keyword_index(sampleSpace, keyword, indexList):
    if not indexList:
        return sampleSpace  # or return sampleSpace[:] if you need to return a copy
    return [word for word in sampleSpace if all(word[index] == keyword[index] for index in indexList)]
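Two details are worth knowing here. all() returns True for an empty iterable, so the empty indexList case already falls out of the comprehension without a special branch. Also, word[index] raises IndexError if a word is shorter than one of the indices. A sketch that guards against that (the function name and sample words below are made up for illustration):

```python
def filter_by_positions(words, keyword, index_list):
    """Keep words that match keyword at every index in index_list."""
    return [
        word
        for word in words
        # the length check short-circuits before word[i] can raise IndexError
        if all(i < len(word) and word[i] == keyword[i] for i in index_list)
    ]

sample = ["BEANS", "BLAND", "BRINE", "CRANE", "BUN"]
print(filter_by_positions(sample, "BEANS", [0, 3]))  # ['BEANS', 'BLAND', 'BRINE']
print(filter_by_positions(sample, "BEANS", []))      # all five words
```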
You can create a set of (index,character) and use it to quickly compare each word in your list:
with open("/usr/share/dict/words") as f:
    words = f.read().upper().split('\n')  # 235,887 words
keyword = "ELEPHANT"
indexList = [0, 3, 5, 7]
letterSet = {(i,keyword[i]) for i in indexList}
for word in words:
    if letterSet.issubset(enumerate(word)):
        print(word)
EGGPLANT
ELEPHANT
ELEPHANTA
ELEPHANTIAC
ELEPHANTIASIC
ELEPHANTIASIS
ELEPHANTIC
ELEPHANTICIDE
ELEPHANTIDAE
ELEPHANTINE
ELEPHANTLIKE
ELEPHANTOID
ELEPHANTOIDAL
ELEPHANTOPUS
ELEPHANTOUS
ELEPHANTRY
EPIPLASTRAL
EPIPLASTRON
You could place the result in a list using a comprehension:
letterSet = {(i,keyword[i]) for i in indexList}
eligible = [word for word in words if letterSet.issubset(enumerate(word))]
print(len(eligible)) # 18
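Why this works: enumerate(word) yields (index, character) pairs, so the subset test asks whether every required pair occurs in the word. A word that is too short simply lacks the pair, so no IndexError is possible. A small sketch with made-up words:

```python
keyword = "BEANS"
index_list = [0, 3]
letter_set = {(i, keyword[i]) for i in index_list}  # {(0, 'B'), (3, 'N')}

matches_bland = letter_set.issubset(enumerate("BLAND"))
matches_bun = letter_set.issubset(enumerate("BUN"))  # too short to have index 3
print(matches_bland, matches_bun)  # True False

# an empty index list gives an empty set, which is a subset of anything
matches_any = set().issubset(enumerate("CRANE"))
print(matches_any)  # True
```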
I'm still very much a beginner with Python, and I am trying to create a function that accepts a list of strings as a parameter and replaces each string with two copies of it. I am having a bit of trouble with the code.
I was able to duplicate each string, but the two copies end up joined in a single string, and printing my original list shows blist instead.
Any help or guidance would be greatly appreciated
This is what I have so far:
blist = []
def double_list(alist):
    for i in alist:
        blist.append(i*2)
    return blist
print('original list: ',double_list(['how','are','you?']))
print('double list: ',blist)
Output:
original list: ['howhow', 'areare', 'you?you?']
double list: ['howhow', 'areare', 'you?you?']
EXPECTED Output:
original list: ['how', 'are', 'you?']
double list: ['how', 'how', 'are', 'are', 'you?', 'you?']
If you want to use the * 2 idiom, you can use extend() and pass it a list with two items made with [i] * 2.
For example:
def double_list(alist):
    blist = []
    for i in alist:
        blist.extend([i]*2)
    return blist
orig = ['how','are','you?']
print('double list: ',double_list(orig))
# double list: ['how', 'how', 'are', 'are', 'you?', 'you?']
Note: the reason you were getting the doubles in original list: ['howhow', 'areare', 'you?you?'] is because you are printing the return value of the function which is not the original list.
You are adding strings together. Just append them twice.
blist = []
def double_list(alist):
    for i in alist:
        blist.append(i)
        blist.append(i)
    return blist
print('original list: ',double_list(['how','are','you?']))
print('double list: ',blist)
I'm sure there is a better way to do it, but this helps understand the solution better.
Simply use the extend() method, which appends all the items of one list to another.
You will also have to change the print statements, because otherwise you will only print the "duplicated" version of the list, not the original.
Thus we extend blist by a list containing the desired string twice.
The code should look like this:
blist = []
def double_list(alist):
    for word in alist:
        blist.extend([word]*2)
    return blist
original_list = ['how','are','you?']
print('original list: ', original_list)
print('double list: ', double_list(original_list))
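One caveat with the versions that keep blist at module level: the list is shared between calls, so calling double_list a second time keeps appending to the old results. A minimal sketch that builds a fresh list on every call and leaves the input untouched:

```python
def double_list(alist):
    # the inner loop repeats each word twice, preserving order
    return [word for word in alist for _ in range(2)]

original_list = ['how', 'are', 'you?']
doubled = double_list(original_list)
print('original list: ', original_list)
print('double list: ', doubled)
```

Because nothing is stored outside the function, calling it again with the same input returns the same result.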
My for loop inserts all of the contents into the list box, not just the last item added. Any reason why?
def changeCss():
    readingCss = open(FileName.get()+'.css','r')
    FileContentsCss = readingCss.readlines()
    readingCss.close()
    index = 0
    while index < len(FileContentsCss):
        FileContentsCss[index]= FileContentsCss[index].rstrip('\n')
        index +=1
    while '' in FileContentsCss:
        FileContentsCss.remove('')
    for cont in FileContentsCss:
        Open.insert(END, cont + '\n')
Explanation of the for loop:
A for loop iterates through all the elements in a list.
Example:
list1 = ['hello', 'world']
for ele in list1:
    print(ele)
Alternatively, using slicing, we can specify a start index and an end index so that the for loop iterates only over the elements between those indexes:
l1 = ['hello', 'world', 'good']
for a in l1[1:3]:
    print(a)
For the above query, if you want to print only the last element added to the list:
l1 = ['hello', 'world', 'good']
l1.append('morning')
for a in l1[-1:]:
    print(a)
Output:
morning
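Applied to the question's code, and assuming the goal really is to insert only the last line, the cleanup loops and the final insert could be sketched as below. The helper name is made up, and the Tkinter call is left as a comment because the widget (Open) and END come from the asker's program:

```python
def last_nonempty_line(lines):
    """Strip newlines, drop empty lines, and return the last one (or None)."""
    cleaned = [line.rstrip('\n') for line in lines]
    cleaned = [line for line in cleaned if line]
    return cleaned[-1] if cleaned else None

sample_lines = ["body {\n", "\n", "  color: red;\n", "}\n", "\n"]
last = last_nonempty_line(sample_lines)
print(last)  # }

# In the original function this would replace the for loop, e.g.:
# Open.insert(END, last + '\n')
```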
I faced an issue with my code where the loop stops running once it removes a list from the list of lists.
data=[["why","why","hello"],["why","why","bell"],["why","hi","sllo"],["why","cry","hello"]]
for word_set in data:
    if word_set[-1]!="hello":
        data.remove(word_set)
print(data)
My desired output is
[['why', 'why', 'hello'], ['why', 'cry', 'hello']]
but the output is
[['why', 'why', 'hello'], ['why', 'hi', 'sllo'], ['why', 'cry', 'hello']]
How do I make the loop go on till the end of the list?
That's because, when you remove the second item (whose index is 1), the items after it move forward. In the next iteration, the index is 2. It should have been pointing to ["why","hi","sllo"]. But since the items moved forward, it points to ["why","cry","hello"]. That's why you get the wrong result.
It's not recommended to remove list items while iterating over the list.
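You can either create a new list (as mentioned in the first answer) or use the filter function. In Python 3, filter returns an iterator, so wrap it in list() if you need a list.
def filter_func(item):
    # keep only the sublists whose last element is "hello"
    return item[-1] == "hello"
new_list = list(filter(filter_func, data))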
Remember
data = [["list", "in","a list"],["list", "in","a list"],["list", "in","a list"]]
# data[0] will return ["list", "in", "a list"]
# data[0][0] will return "list"
# remember that list indices start at 0 -> data[0]
>>> data=[["why","why","hello"],["why","why","bell"],["why","hi","sllo"],["why","cry","hello"]]
>>> y = []
>>> for subData in data:
...     for dataItem in subData:
...         if dataItem == "hello":
...             y.append(subData)
...
>>> y
[['why', 'why', 'hello'], ['why', 'cry', 'hello']]
list(filter(lambda x: x[-1] == "hello", [["why","why","hello"],["why","why","bell"],["why","hi","sllo"],["why","cry","hello"]]))
OR
# in Python 3, reduce has to be imported from functools
from functools import reduce
reduce(lambda x, y: x + [y] if y[-1] == "hello" else x, [["why","why","hello"],["why","why","bell"],["why","hi","sllo"],["why","cry","hello"]], [])
OR
[i for i in [["why","why","hello"],["why","why","bell"],["why","hi","sllo"],["why","cry","hello"]] if i[-1] == "hello"]
data=[["why","why","hello"],["why","why","bell"],["why","hi","sllo"],["why","cry","hello"]]
for word_set in data[:]:
    if word_set[-1] != "hello":
        data.remove(word_set)
print(data)
Don't iterate over the original data; iterate over a copy (data[:]) instead, because removing an item from a list changes the indices of the items after it. ["why","why","bell"] is at index 1. Once it is removed from data, ["why","hi","sllo"] moves to index 1. The next iteration uses index 2, so ["why","hi","sllo"] is skipped and the loop checks ["why","cry","hello"] instead.
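If other variables refer to the same list and must see the change, slice assignment filters in place without a removal loop. A minimal sketch (same_list is a made-up second reference, added only to show that the list object is preserved):

```python
data = [["why", "why", "hello"], ["why", "why", "bell"],
        ["why", "hi", "sllo"], ["why", "cry", "hello"]]
same_list = data  # second reference to the same list object

# Build the filtered list first, then replace the contents in one step
data[:] = [word_set for word_set in data if word_set[-1] == "hello"]

print(data)               # [['why', 'why', 'hello'], ['why', 'cry', 'hello']]
print(same_list is data)  # True
```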
I'm trying to write a program that takes in a nested list and returns a new list with the proper nouns taken out.
Here is an example:
L = [['The', 'name', 'is', 'James'], ['Where', 'is', 'the', 'treasure'], ['Bond', 'cackled', 'insanely']]
I want to return:
['the', 'name', 'is', 'is', 'the', 'treasure', 'cackled', 'insanely']
Note that 'Where' is deleted. That is OK, since it does not appear anywhere else in the nested list. Each nested list is a sentence. My approach is to append the first element of every nested list to a newList. Then I check whether the elements in newList appear in the nested list, lowercasing the elements of newList for the comparison. I'm halfway done with this program, but I'm running into an error when I try to remove the element from newList at the end. Once I get the updated list, I want to delete the items in the nested list that are in newList. Lastly, I'd append all the items in the nested list to a newerList and lowercase them. That should do it.
If someone has a more efficient approach I'd gladly listen.
def lowerCaseFirst(L):
    newList = []
    for nestedList in L:
        newList.append(nestedList[0])
    print newList
    for firstWord in newList:
        sum = 0
        firstWord = firstWord.lower()
        for nestedList in L:
            for word in nestedList[1:]:
                if firstWord == word:
                    print "yes"
                    sum = sum + 1
        print newList
        if sum >= 1:
            firstWord = firstWord.upper()
            newList.remove(firstWord)
    return newList
Note this code is not finished due to the error in the second to last line
Here is with the newerList (updatedNewList):
def lowerCaseFirst(L):
    newList = []
    for nestedList in L:
        newList.append(nestedList[0])
    print newList
    updatedNewList = newList
    for firstWord in newList:
        sum = 0
        firstWord = firstWord.lower()
        for nestedList in L:
            for word in nestedList[1:]:
                if firstWord == word:
                    print "yes"
                    sum = sum + 1
        print newList
        if sum >= 1:
            firstWord = firstWord.upper()
            updatedNewList.remove(firstWord)
    return updatedNewList
error message:
Traceback (most recent call last):
File "/Applications/WingIDE.app/Contents/MacOS/src/debug/tserver/_sandbox.py", line 1, in <module>
# Used internally for debug sandbox under external interpreter
File "/Applications/WingIDE.app/Contents/MacOS/src/debug/tserver/_sandbox.py", line 80, in lowerCaseFirst
ValueError: list.remove(x): x not in list
The error in your first function occurs because you try to remove an uppercased version of firstWord (for example 'THE' rather than 'The') from newList, which contains no fully uppercase words (you can see that from the printout). Remember that you store the upper/lowercased version of a word in a new variable, but that doesn't change the contents of the original list.
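The second version has an additional catch: updatedNewList = newList does not copy anything, it only gives the same list a second name. Removing from updatedNewList therefore also changes newList, the list being looped over. A small sketch of the difference between an alias and a copy (the variable names are made up):

```python
new_list = ['The', 'Where', 'Bond']
alias = new_list            # same list object, second name
shallow_copy = new_list[:]  # independent shallow copy

alias.remove('Where')
print(new_list)      # ['The', 'Bond']
print(shallow_copy)  # ['The', 'Where', 'Bond']
```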
I still don't understand your approach. As you describe your task, you want to do two things: 1) flatten a list of lists into a list of elements (always an interesting programming exercise) and 2) remove proper nouns from this list. This means that you have to decide what a proper noun is. You could do that rudimentarily (all non-starting capitalized words, or an exhaustive list), or you could use a POS tagger (see: Finding Proper Nouns using NLTK WordNet). Unless I misunderstand your task completely, you needn't worry about the casing here.
The first task can be solved in many ways. Here is a nice way that illustrates what actually happens in the simple case where your list L is a list of lists (and not lists that can be arbitrarily nested):
def flatten(L):
    newList = []
    for sublist in L:
        for elm in sublist:
            newList.append(elm)
    return newList
You could turn this function into flattenAndFilter(L) by checking each element like this:
PN = ['James', 'Bond']
def flattenAndFilter(L):
    newList = []
    for sublist in L:
        for elm in sublist:
            if elm not in PN:
                newList.append(elm)
    return newList
You might not have such a nice list of PNs, though. In that case you would have to expand the check, for instance by parsing the sentence and checking the POS tags.
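As a sketch of the rudimentary approach for the example in the question: treat any non-lowercase word that is not sentence-initial as a proper noun, and keep a sentence-initial word only if its lowercase form occurs somewhere in a non-initial position. This is a heuristic assumed for illustration (the function name is made up), not a substitute for a POS tagger:

```python
def remove_proper_nouns(sentences):
    # lowercase words seen anywhere except at the start of a sentence
    known_lower = {w for s in sentences for w in s[1:] if w.islower()}
    result = []
    for sentence in sentences:
        for position, word in enumerate(sentence):
            if position == 0:
                # a sentence-initial capital is ambiguous; keep the word only
                # if its lowercase form is known to be an ordinary word
                if word.lower() in known_lower:
                    result.append(word.lower())
            elif word.islower():
                result.append(word)
    return result

L = [['The', 'name', 'is', 'James'],
     ['Where', 'is', 'the', 'treasure'],
     ['Bond', 'cackled', 'insanely']]
cleaned = remove_proper_nouns(L)
print(cleaned)
# ['the', 'name', 'is', 'is', 'the', 'treasure', 'cackled', 'insanely']
```

As in the question, 'Where' is dropped because 'where' never appears in a non-initial position, which shows the limits of the heuristic.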