Python lists similitudes - python

I am looking to get the number of similar characters between two lists.
The first list is:
list1=['e', 'n', 'z', 'o', 'a']
The second list is going to be a word user inputted turned into a list:
word=input("Enter word")
word=list(word)
I'll run this function below to get the number of similitudes in the two lists:
def getSimilarItems(word,list1):
    counter = 0
    for i in word:
        for j in list1:
            if i in j:
                counter = counter + 1
    return counter
What I don't know how to do is how to get the number of similitudes for each item of the list (which is going to be either 0 or 1, as the word is going to be split into a list where each item is a character).
Help would be VERY appreciated :)
For example:
If the word inputted by the user is afez:
I'd like to run the function:
wordcount= getSimilarItems(word,list1)
And get this as an output:
>>>1 (because a from afez is in list ['e', 'n', 'z', 'o', 'a'])
>>>0 (because f from afez isn't in list ['e', 'n', 'z', 'o', 'a'])
>>>1 (because e from afez is in list ['e', 'n', 'z', 'o', 'a'])
>>>1 (because z from afez is in list ['e', 'n', 'z', 'o', 'a'])

Sounds like you simply want:
def getSimilarItems(word,list1):
    return [int(letter in list1) for letter in word]
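As a quick check of the comprehension above, here is a minimal sketch run on the example word afez. The snake_case name get_similar_items and the per_letter and total variables are illustrative choices, not part of the original answer.

```python
list1 = ['e', 'n', 'z', 'o', 'a']

def get_similar_items(word, letters):
    # One entry per character of word: 1 if it occurs in letters, else 0.
    return [int(letter in letters) for letter in word]

per_letter = get_similar_items("afez", list1)
total = sum(per_letter)  # overall number of matching characters
print(per_letter)  # [1, 0, 1, 1]
print(total)       # 3
```

Summing the per-letter list gives the single count the original function was trying to compute.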

What I don't know how to do is how to get the number of similitudes
for each item of the list(which is going to be either 0 or 1 as the
word is going to be split into a list where an item is a character).
I assume that instead of counting the number of items in the list, you want to get the individual match result for each element.
For that you can use a dictionary or a list, and return that from your function.
Going off the assumption that the input is going to be the same length as the list,
def getSimilarItems(list1,list2):
    matches = []
    for i in list2:
        if i in list1:
            matches.append(1)
        else:
            matches.append(0)
    return matches
Based off your edit,
def getSimilarItems(list1,list2):
    for i in list2:
        if i in list1:
            print('1 (because ' + i + ' from temp_word is in list ' + str(list1) + ')')
        else:
            print("0 (because " + i + " from temp_word isn't in list " + str(list1) + ")")
Look at Julien's answer if you want a more condensed version (I'm not very good with list comprehension)

Related

Error "index out of range" when working with strings in a for loop in python

I'm very new to python and I'm practicing different exercises.
I need to write a program to decode a string. The original string has been modified by adding, after each vowel (letters ’a’, ’e’, ’i’, ’o’ and ’u’), the letter ’p’ and then that same vowel again.
For example, the word “kemija” becomes “kepemipijapa” and the word “paprika” becomes “papapripikapa”.
vowel = ['a', 'e', 'i', 'o', 'u']
input_word = list(input())
for i in range(len(input_word)):
    if input_word[i] in vowel:
        input_word.pop(i + 1)
        input_word.pop(i + 2)
print(input_word)
The algorithm I had in mind was to detect the index for which the item is a vowel and then remove the following 2 items after this item, so if input_word[0] == 'e' then the next 2 items (input_word[1], input_word[2]) must be removed from the list. For the sample input zepelepenapa, I get this error message: IndexError: pop index out of range. Even when I change the for loop to range(len(input_word) - 2), I get the same error.
thanks in advance
The loop will run a number of times equal to the original length of input_word, due to range(len(input_word)). An IndexError will occur if input_word is shortened inside the loop, because the code inside the loop tries to access every element in the original list input_word with the expression input_word[i] (and, for some values of input_word, the if block could even attempt to pop items off the list beyond its original length, due to the (i + 1) and (i + 2)).
Hardcoding the loop definition with a specific number like 2, e.g. with range(len(input_word) - 2), to make it run fewer times to account for removed letters isn't a general solution, because the number of letters to be removed is initially unknown (it could be 0, 2, 4, ...).
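To see the effect described above in isolation, here is a small sketch on the three-letter input "epe". The visited and error names are only there for illustration. The range is computed once from the original length, so the loop keeps using indices that no longer fit the shortened list.

```python
input_word = list("epe")
visited = []   # indices the loop actually reached
error = None

try:
    for i in range(len(input_word)):  # range(3) is fixed before any pop happens
        visited.append(i)
        if input_word[i] in "aeiou":
            # Shortens the list while the loop still plans to visit every original index.
            input_word.pop(i + 1)
except IndexError as exc:
    error = exc

print(visited)  # [0, 1]
print(error)    # pop index out of range
```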
Here are a couple of possible solutions:
Instead of removing items from input_word, create a new list output_word and add letters to it if they meet the criteria. Use a helper list skip_these_indices to keep track of indices that should be "removed" from input_word so they can be skipped when building up the new list output_word:
vowel = ['a', 'e', 'i', 'o', 'u']
input_word = list("zepelepenapa")
output_word = []
skip_these_indices = []
for i in range(len(input_word)):
    # if letter 'i' shouldn't be skipped, add it to output_word
    if i not in skip_these_indices:
        output_word.append(input_word[i])
        # check whether to skip the next two letters after 'i'
        if input_word[i] in vowel:
            skip_these_indices.append(i + 1)
            skip_these_indices.append(i + 2)
print(skip_these_indices) # [2, 3, 6, 7, 10, 11]
print(output_word) # ['z', 'e', 'l', 'e', 'n', 'a']
print(''.join(output_word)) # zelena
Alternatively, use two loops. The first loop will keep track of which letters should be removed in a list called remove_these_indices. The second loop will remove them from input_word:
vowel = ['a', 'e', 'i', 'o', 'u']
input_word = list("zepelepenapa")
remove_these_indices = []
# loop 1 -- find letters to remove
for i in range(len(input_word)):
    # if letter 'i' isn't already marked for removal,
    # check whether we should remove the next two letters
    if i not in remove_these_indices:
        if input_word[i] in vowel:
            remove_these_indices.append(i + 1)
            remove_these_indices.append(i + 2)
# loop 2 -- remove the letters (pop in reverse to avoid IndexError)
for i in reversed(remove_these_indices):
    # if input_word has a vowel in the last two positions,
    # without a "p" and the same vowel after it,
    # which it shouldn't based on the algorithm you
    # described for generating the coded word,
    # this 'if' statement will avoid popping
    # elements that don't exist
    if i < len(input_word):
        input_word.pop(i)
print(remove_these_indices) # [2, 3, 6, 7, 10, 11]
print(input_word) # ['z', 'e', 'l', 'e', 'n', 'a']
print(''.join(input_word)) # zelena
pop() removes an item at the given position in the list and returns it. This alters the list in place.
For example if I have:
my_list = [1,2,3,4]
n = my_list.pop()
will return n = 4 in this instance. If I was to print my_list after this operation it would return [1,2,3]. So the length of the list will change every time pop() is used. That is why you are getting IndexError: pop index out of range.
So to solve this we should avoid using pop(), since it's really not needed in this situation; build a new list instead of shrinking the one you are looping over. The following uses that approach to produce the encoded word from the plain one:
word = 'kemija'
vowels = ['a', 'e', 'i', 'o', 'u']
new_word = []
for w in word:
    if w in vowels:
        new_word.extend([w,'p',w])
        # alternatively you could use .append() over .extend() but would need more lines:
        # new_word.append(w)
        # new_word.append('p')
        # new_word.append(w)
    else:
        new_word.append(w)
encoded_word = ''.join(new_word)
print(encoded_word) # kepemipijapa
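For the decoding direction the question asks about, the same "build a new list, don't pop" idea can be sketched as below. The decode function and the VOWELS constant are hypothetical names, and the sketch assumes the input really was produced by the vowel + 'p' + vowel rule.

```python
VOWELS = "aeiou"

def decode(coded_word):
    # Walk the coded word with an explicit index so we can jump ahead.
    decoded = []
    i = 0
    while i < len(coded_word):
        letter = coded_word[i]
        decoded.append(letter)
        # After a vowel, skip the inserted 'p' and the repeated vowel.
        i += 3 if letter in VOWELS else 1
    return "".join(decoded)

print(decode("kepemipijapa"))   # kemija
print(decode("papapripikapa"))  # paprika
print(decode("zepelepenapa"))   # zelena
```

Because nothing is removed from the input, the index can never run past a list that has shrunk underneath it.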

How do I extend all elements of a list to another list every Nth item

I am trying to find a way to insert all elements from a list after every nth element of another. There are several similar posts, but they cover inserting a single element into another list. What I am trying to figure out is taking the following lists:
l1 = ["a","b"]
l2 = ["j","k","l","m","n","o","p","q","r","s","t","u"]
And outputting them together:
["j","k","l","a","b","m","n","o","a","b","p","q","r","a","b","s","t","u"]
What I'm thinking would work is something that at least starts with:
for x in l1:
    for i in range(len(l2)):
        l2.extend(l1)
I know this doesn't work but I'm not sure how to implement this.
In the specific output above, the first list is added after every 3rd element. It doesn't have to be every third element but I'm just using that as an example.
Can someone educate me on how to do this?
EDIT:
With the help of @Chris, I was able to find a 3rd party library called more_itertools and wrote new code that does exactly what I was looking for. Here is what I came up with:
import more_itertools as mit
l1 = ["a","b"]
l2 = ["j","k","l","m","n","o","p","q","r","s","t","u"]
# We will place the elements of l1 after every two elements of l2
l3 = list(mit.collapse(mit.intersperse(l1, l2, n=len(l1))))
Results:
>>> print(l3)
['j', 'k', 'a', 'b', 'l', 'm', 'a', 'b', 'n', 'o', 'a', 'b', 'p', 'q', 'a',
'b', 'r', 's', 'a', 'b', 't', 'u']
I found that the intersperse function will allow the user to place an element into a separate list at the "nth" interval. In this example, I place the list of l1 in the l2 list after every second element (since len(l1) is equal to 2). The collapse function will take the list that was placed in l2 and put each element in order as a separate element.
Thank you to everyone that helped me with this. I enjoyed learning something new.
numpy has the ability to split a list into evenly distributed smaller lists. You can specify in np.array_split the list itself and the number of splits. Here I used math.floor to get the whole number of times the n value goes into the length of the list. In this case you have n=3 and the list has 12 elements, so it will return 4 as the number of resulting sublists.
The [np.append(x, l1) for x in ...] part says to attach the values in l1 to the end of each sublist, chain.from_iterable will mash all of them together, and you can render that as a list with list().
It has the side effect of adding the l1 values at the very end as well; if you don't want that, you can use slicing to drop the last n values, where n is the length of the l1 list.
import numpy as np
import itertools
import math
n = 3
l1 = ["a","b"]
l2 = ["j","k","l","m","n","o","p","q","r","s","t","u"]
list(itertools.chain.from_iterable([np.append(x, l1) for x in np.array_split(l2, math.floor(len(l2) / n))]))
or if you don't want the trailing append:
list(itertools.chain.from_iterable([np.append(x,
    l1) for x in np.array_split(l2,
    math.floor(len(l2) / n))]))[:-len(l1)]
Just iterate through every n'th element of list b and insert every iteration of list a.
a = ["a", "b"]
b = ["j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u"]
n = 3 # where you want to insert
while n < len(b):
    for i in range(len(a)):
        b.insert(n, a[i])
        n += 1 # increment n by 1 so 'b' insert after 'a'
    n += 3 # increment n by 3 every iteration to insert list a
print(b)
Result: ['j', 'k', 'l', 'a', 'b', 'm', 'n', 'o', 'a', 'b', 'p', 'q', 'r', 'a', 'b', 's', 't', 'u']
list1 = ["a","b"]
list2 = ["j","k","l","m","n","o","p","q","r","s","t","u"]
list3 = []
n = 3
for x in range(1, len(list2)+1):
    if x%n == 0 and x != len(list2):
        list3.append(list2[x-1])
        for y in list1:
            list3.append(y)
    else:
        list3.append(list2[x-1])
print(list3)
import itertools
one = ["a","b"]
two = ["j","k","l","m","n","o","p","q","r","s","t","u"]
start = 0
end = 3 # or whatever number you require
complete_list = []
iterations = int(len(two)/3)
for x in range(iterations):
    sub_list = two[start:end]
    start = start+3
    end = end+3
    complete_list.append(sub_list)
    if x < iterations-1:
        complete_list.append(one)
# flatten the list of sublists into a single list
complete_list = list(itertools.chain.from_iterable(complete_list))
There may be a shorter version of code to do this, but this will work as well.
list1 = ["a","b"]
list2 = ["j","k","l","m","n","o","p","q","r","s","t","u"]
d = []
for index, item in enumerate(list2):
    d.append(item)
    # after every third element, add the contents of list1
    if (index + 1) % 3 == 0:
        d.extend(list1)
print(d)
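For comparison, here is a minimal sketch of the same idea with plain slicing and no third-party libraries. The insert_every helper is a hypothetical name, and it assumes the extra items should not be appended after the final chunk, as in the output requested in the question.

```python
def insert_every(base, extra, n):
    """Return a new list with all of extra inserted after every n items of base."""
    result = []
    for start in range(0, len(base), n):
        chunk = base[start:start + n]
        result.extend(chunk)
        # Only add extra after a full chunk that is not the end of base.
        if len(chunk) == n and start + n < len(base):
            result.extend(extra)
    return result

l1 = ["a", "b"]
l2 = ["j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u"]
l3 = insert_every(l2, l1, 3)
print(l3)
# ['j', 'k', 'l', 'a', 'b', 'm', 'n', 'o', 'a', 'b', 'p', 'q', 'r', 'a', 'b', 's', 't', 'u']
```

Building a new list avoids mutating l2 while iterating over it, and n can be changed freely.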

Cannot find glitch in program using recursion for multiple nested for-loops

alphabet = ['a', 'b', 'c', 'd', 'e', 'f', 'g',
            'h', 'i', 'j', 'k', 'l', 'm', 'n',
            'o', 'p', 'q', 'r', 's', 't', 'u',
            'v', 'w', 'x', 'y', 'z']
endlist = []
def loopfunc(n, lis):
    if n ==0:
        endlist.append(lis[0]+lis[1]+lis[2]+lis[3]+lis[4])
    for i in alphabet:
        if n >0:
            lis.append(i)
            loopfunc(n-1, lis )
loopfunc(5, [])
This program is supposed to make endlist be:
endlist = [aaaaa, aaaab, aaaac, ... zzzzy, zzzzz]
But it makes it:
endlist = [aaaaa, aaaaa, aaaaa, ... , aaaaa]
The length is right, but it won't make different words. Can anyone help me see why?
The only thing you ever add to endlist is the first 5 elements of lis, and since you have a single lis that is shared among all the recursive calls (note that you never create a new list in this code other than the initial values for endlist and lis, so every append to lis is happening to the same list), those first 5 elements are always the a values that you appended in your first 5 recursive calls. The rest of the alphabet goes onto the end of lis and is never reached by any of your other code.
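One way to avoid the shared list, sketched here with a shortened two-letter alphabet so the output stays small, is to hand each recursive call its own new list. The words function and its out parameter are illustrative names, not the asker's code.

```python
alphabet = "ab"  # shortened alphabet, assumed only for this demonstration

def words(n, prefix, out):
    """Collect every word formed by appending n more letters to prefix."""
    if n == 0:
        out.append("".join(prefix))
        return
    for letter in alphabet:
        # prefix + [letter] builds a new list, so sibling calls never
        # see each other's letters (unlike prefix.append(letter)).
        words(n - 1, prefix + [letter], out)

result = []
words(2, [], result)
print(result)  # ['aa', 'ab', 'ba', 'bb']
```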
Since you want strings in the end, it's a little easier just to use strings for collecting your items. This avoids the shared mutable references that are causing your issues. With that, the recursion becomes pretty concise:
alphabet = 'abcdefghijklmnopqrstuvwxyz'
def loopfunc(n, lis=""):
    if n < 1:
        return [lis]
    res = []
    for a in alphabet:
        res.extend(loopfunc(n-1, lis + a))
    return res
l = loopfunc(5)
print(l[0], l[1], l[-1], l[-2])
# aaaaa aaaab zzzzz zzzzy
Note that with n=5 you'll have almost 12 million combinations. If you plan on having larger n values, it may be worth rewriting this as a generator.
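A generator version could look like the sketch below. It swaps the hand-written recursion for itertools.product from the standard library, which yields the same order as n nested for-loops; iter_words is a hypothetical name.

```python
from itertools import product

alphabet = 'abcdefghijklmnopqrstuvwxyz'

def iter_words(n):
    # product yields tuples of letters lazily, one combination at a time,
    # so the full list of words is never held in memory.
    for letters in product(alphabet, repeat=n):
        yield ''.join(letters)

gen = iter_words(5)
first_two = [next(gen), next(gen)]
print(first_two)  # ['aaaaa', 'aaaab']
```

Only the words actually consumed are built, which matters once n grows beyond 5.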

Recursion, out of memory?

I wrote a function with two parameters. One is an empty string and the other is a string word. My assignment is to use recursion to reverse the word and place it in the empty string. Just as I thought I'd got it, I received an "out of memory" error. I wrote the code so that it takes the word, turns it into a list, flips it backwards, places the first letter in the empty string, then deletes that letter from the list so recursion can happen for each letter. Then it compares the length of the original word to the length of the empty string (I made lists so they can be compared) so that when they're equal the recursion will end, but I don't know why it fails.
def reverseString(prefix, aStr):
    num = 1
    if num > 0:
        #prefix = ""
        aStrlist = list(aStr)
        revaStrlist = list(reversed(aStrlist))
        revaStrlist2 = list(reversed(aStrlist))
        prefixlist = list(prefix)
        prefixlist.append(revaStrlist[0])
        del revaStrlist[0]
        if len(revaStrlist2)!= len(prefixlist):
            aStr = str(revaStrlist)
            return reverseString(prefix,aStr)
When writing something recursive I try to think about 2 things:
1. The condition to stop the recursion.
2. What I want one iteration to do and how I can pass that progress to the next iteration.
Also I'd recommend getting one iteration working first, then worrying about having it call itself. Otherwise it can be harder to debug.
Applying this to your logic:
1. Stop when the length of the output string matches the length of the input string.
2. Add one letter to the new list, in reverse order. To maintain progress, pass the list accumulated so far to the next call.
I wanted to just modify your code slightly as I thought that would help you learn the most, but I was having a hard time with that, so I tried to write what I would do with your logic.
Hopefully you can still learn something from this example.
def reverse_string(input_string, output_list=None):
    # create a fresh list on the first call; a mutable default would be shared between calls
    if output_list is None:
        output_list = []
    # condition to keep going: if the lengths don't match we still have work to do, otherwise output the result
    if len(output_list) < len(input_string):
        # let's see how much we have done so far.
        # use the length of the current new list as a way to get the current character we are on
        # as we are reversing it we need to take the length of the string minus the current character we are on
        # because indices are zero-based, the last character is at len(input_string) - 1
        character_index = len(input_string)-1 - len(output_list)
        # then add it to our output list
        output_list.append(input_string[character_index])
        # output_list is our progress so far, pass it to the next iteration
        return reverse_string(input_string, output_list)
    else:
        # combine the output list back into a string when we are all done
        return ''.join(output_list)
if __name__ == '__main__':
    print(reverse_string('hello'))
This is what the recursion will look like for this code
1.
character_index = 5-1 - 0
character_index is set to 4
output_list so far = ['o']
reverse_string('hello', ['o'])
2.
character_index = 5-1 - 1
character_index is set to 3
output_list so far = ['o', 'l']
reverse_string('hello', ['o', 'l'])
3.
character_index = 5-1 - 2
character_index is set to 2
output_list so far = ['o', 'l', 'l']
reverse_string('hello', ['o', 'l', 'l'])
4.
character_index = 5-1 - 3
character_index is set to 1
output_list so far = ['o', 'l', 'l', 'e']
reverse_string('hello', ['o', 'l', 'l', 'e'])
5.
character_index = 5-1 - 4
character_index is set to 0
output_list so far = ['o', 'l', 'l', 'e', 'h']
reverse_string('hello', ['o', 'l', 'l', 'e', 'h'])
6. lengths match just print what we have!
olleh
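If the assignment requires the two-parameter shape from the question (an initially empty prefix plus the word), the same two ideas, a stop condition and progress passed along, can be sketched more compactly with strings. The name reverse_with_prefix is hypothetical.

```python
def reverse_with_prefix(prefix, a_str):
    # Stop condition: nothing left to move, so prefix holds the reversed word.
    if not a_str:
        return prefix
    # One step: move the first remaining letter to the front of prefix,
    # then pass the progress (new prefix, shorter word) to the next call.
    return reverse_with_prefix(a_str[0] + prefix, a_str[1:])

print(reverse_with_prefix("", "hello"))  # olleh
```

Each call shortens a_str by one character, so the recursion depth equals the length of the word.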

How to return a list of strings from a list?

Trying to return a list of strings found in rows of lists. But the order has to be from left to right starting from the top row to lowest without returning duplicates. I'm not sure how to proceed. Would I need to make an IF statement for the letters A to Z if it matches with the list then append them to a new list?
def get_locations(lst):
    new_lst = [] # New list?
    for i in range(len(lst)):
        if 'A' lst[i] <= 'Z' or 'a' <= lst[i] <= 'z': #If strings are A to Z
            new_lst.append # Add to new list?
        return new_lst
List example: It should return like this get_locations(lst) → ["a","B","A","z","C"]
lst1 = [['.', '.', 'a', 'B'],
        ['.', '.', 'a', '.'],
        ['A', '.', '.', 'z'],
        ['.', '.', '.', 'z'],
        ['.', '.', 'C', 'C']]
Let's go through the function line by line:
new_lst = [] # New list?
Yes, it creates a new list.
for i in range(len(lst)):
Although this is valid, there is rarely a reason to iterate through list indices in Python. You can iterate through the elements instead. Furthermore, this is a list of lists, so that should be handled as well:
for sublist in lst:
    for character in sublist:
After that use character instead of lst[i].
if 'A' lst[i] <= 'Z' or 'a' <= lst[i] <= 'z': #If strings are A to Z
There is a syntax error at 'A' lst[i]. Otherwise, this could work if character is actually a character. If it is a longer string, it may give unexpected results (depending on what you expect):
if 'A' <= character <= 'Z' or 'a' <= character <= 'z':
So, an interesting character was found. Add it to the result?
new_lst.append # Add to new list?
The function should be called:
new_lst.append(character)
BTW, this appends the character regardless of whether it was already in new_lst or not. I gather it should only add a character once:
if character not in new_lst:
    new_lst.append(character)
The next line returns the list, but too early:
return new_lst
It should not be indented. It should be outside of the loops, so that the result is returned after all has been looped.
I had a ton of trouble understanding what you meant, but assuming I've figured it out, you weren't too far off:
def get_locations(lst):
    new_lst = [] # New list?
    for row in lst:
        for letter in row:
            if ('A' <= letter <= 'Z' or 'a' <= letter <= 'z') and letter not in new_lst: #If strings are A to Z
                new_lst.append(letter) # Add to new list?
    return new_lst
import re
def get_locations(lst):
    new_lst = [] # New list? Yes!
    # Iterate directly over "flattened" list. That involves confusing
    # nested list comprehension. Look it up, it's fun!
    for i in [e for row in lst for e in row]:
        # using regular expression gives more flexibility - just to show off :)
        if re.match('^[a-zA-Z]+$', i) and i not in new_lst: #deal with duplicates
            new_lst.append(i) # Add to new list? yes!
    return new_lst # corrected indentation
lst1 = [['.', '.', 'a', 'B'],
        ['.', '.', 'a', '.'],
        ['A', '.', '.', 'z'],
        ['.', '.', '.', 'z'],
        ['.', '.', 'C', 'C']]
print(get_locations(lst1))
Well, if you are not concerned about the ordering, the following statement is enough and avoids a lengthy method:
ans = {j for i in lst for j in i if j!='.'}
And if maintaining the order is a must, then you may consider using the following method:
def get_locations(lst):
    ans=[]
    for i in lst:
        for j in i:
            if (j != '.') and (j not in ans):
                ans.append(j)
    return ans
You can also use the following generator version for your problem:
def get_locations(lst):
    ans=[]
    for i in lst:
        for j in i:
            if (j != '.') and (j not in ans):
                ans.append(j)
                yield j
Pretty straightforward. string.ascii_letters is a good shortcut to know for all letters a-Z
import string
def get_chars(lst):
    new_list = []
    for nested_list in lst:
        for char in nested_list:
            if char not in new_list and char in string.ascii_letters:
                new_list.append(char)
    return new_list
Then:
>>> get_chars(lst1)
['a', 'B', 'A', 'z', 'C']
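A shorter order-preserving variant is sketched below, assuming Python 3.7+ where dict keys keep insertion order. The name get_locations_ordered is hypothetical, and note that str.isalpha also accepts non-ASCII letters, unlike the string.ascii_letters check above.

```python
def get_locations_ordered(lst):
    # Flatten row by row, keeping only alphabetic cells.
    letters = (cell for row in lst for cell in row if cell.isalpha())
    # dict.fromkeys drops duplicates while keeping first-seen order.
    return list(dict.fromkeys(letters))

lst1 = [['.', '.', 'a', 'B'],
        ['.', '.', 'a', '.'],
        ['A', '.', '.', 'z'],
        ['.', '.', '.', 'z'],
        ['.', '.', 'C', 'C']]
print(get_locations_ordered(lst1))  # ['a', 'B', 'A', 'z', 'C']
```

The dictionary lookup also avoids the repeated "not in list" scan, which matters only for much larger grids.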
