Search list with another list but stop on first match - python

I have two lists, a short one and a longer one.
list1= ['one', 'two']
list2= ['ten', 'seven', 'three', 'one', 'eight', 'six', 'nine', 'two', 'four', 'five']
I need to search the long list for every word in the short list. If it finds a match, stop searching and do something. If it doesn't find it, do something else. The actual list can be quite long so if it finds it I don't want it to keep looking. The only part I can't figure out is getting it to stop once found. Maybe my search terms are wrong. How do I get it to stop search once found, return None if not found? What's the most efficient or pythonic way of doing this? Here is what I have (the fuzzy search is part of something else):
for name in list1:
    for dict in reversed(list2):
        if fuzz.WRatio(name, dict['Number']) > 90:
I know I can add what to do when found and then break but then I'm not sure what to do if it isn't found except put in another if but now it's starting to seem kludgy.

The pattern you describe is usually written as a function of the form def find(content, pattern) -> offset.
You iterate over the candidates and return the first one that matches the pattern, which in your case means checking whether it matches any entry in the second list.
When no match is found, this kind of function often returns -1; for example, Python's str.find method returns -1 when nothing is found.
So in your case you may create a function like the following:
def find(candidates, patterns):
    # assumes `fuzz` is already imported, as in the question
    for i, name in enumerate(candidates):
        for entry in reversed(patterns):
            if fuzz.WRatio(name, entry['Number']) > 90:
                return i  # index of the first name that matches a pattern
    return -1  # no name matched any pattern

As far as I understand, maybe code like this is what you want.
list1 = ['one', 'two']
list2 = ['ten', 'seven', 'three', 'one', 'eight', 'six', 'nine', 'two', 'four', 'five']
list1_count = 0
for name1 in list1:
    for name2 in list2:
        if name1 == name2:
            list1_count = list1_count + 1
            break
if list1_count == len(list1):
    print("found")
else:
    print("not found")
The lines from list1_count = 0 to break can be replaced (maybe more Pythonically) with:
list1_count = 0
for name1 in list1:
    if name1 in list2:
        list1_count = list1_count + 1

I'm not sure I understand what you're looking for, but here is something that finds the first value and then stops:
list1 = ['one', 'two']
list2 = ['ten', 'seven', 'three', 'one', 'eight', 'six', 'nine', 'two', 'four', 'five']
for l in list1:
    a = list2.index(l)
    break
print(a)
If you want None when nothing is found, try
list1 = ['one', 'two']
list2 = ['ten', 'seven', 'three', 'one', 'eight', 'six', 'nine', 'two', 'four', 'five']
a = None
for l in list1:
    try:
        a = list2.index(l)
        break  # stop at the first word that is found
    except ValueError:
        continue  # this word is not in list2; try the next one
print(a)

The following will tell you if all of the values from list1 are in list2.
all_in = all([val in list2 for val in list1])
If all of the values from list1 are in list2, the value of all_in will be True, and if they weren't, the value of all_in will be False.
If you wanted, you could use this line directly to control your if-else logic.
if all([val in list2 for val in list1]):
    pass  # do thing if match
else:
    pass  # do thing if no match
Edit
If you were looking for the first match of any word in the first list, this might be closer to what you were looking for.
This will give you a True value if there is any match from the first list in the second. Again you can use this for an if statement.
any_in = any((val in list2 for val in list1))
If you need the value of the first match, or a None value if no match is found, this should work.
first_match = next((val for val in list1 if val in list2), None)
That will make use of Python's generators to stop on the very first matching case of any of the words in the first list.
Edit 2
I'm now fairly sure that the behavior you were trying to describe was a nested loop.
for val in list1:
    if val in list2:
        pass  # do something
    else:
        pass  # do something else
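One further option, offered as a hedged sketch rather than as what any of the answers above intended: Python's for ... else clause runs the else branch only when the loop finishes without hitting break, which covers the "stop on first match, otherwise do something else" case without a flag variable. The function name first_match_index and the exact-equality test are assumptions for illustration; the fuzzy comparison from the question could be substituted for the == check.

```python
def first_match_index(needles, haystack):
    """Return the index in haystack of the first needle found, or None."""
    for needle in needles:
        for index, candidate in enumerate(haystack):
            if candidate == needle:
                break  # leaves the inner loop only
        else:
            continue  # inner loop ended without a match; try the next needle
        return index  # reached only after the inner break
    return None


list1 = ['one', 'two']
list2 = ['ten', 'seven', 'three', 'one', 'eight', 'six', 'nine', 'two', 'four', 'five']

result = first_match_index(list1, list2)
if result is None:
    print("not found")
else:
    print("found at index", result)
```

Returning from inside a function is what stops both loops at once; without the function wrapper, a second break or a flag would be needed.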


Python - Sort Dictionary Key Alphabetically while sorting Value list elements by length

I have a Dictionary here:
test_dict = {'gfg': ['One', 'six', 'three'],
             'is': ['seven', 'eight', 'nine'],
             'best': ['ten', 'six']}
I tried:
for i in range(len(test_dict)):
    values = list(test_dict.values())
    keys = list(test_dict)
    value_sorted_list = values[i]
    value_sorted_list = keys[i]
    keys_sorted_list = random.shuffle(value_sorted_list)
    test_dict.update({f"{keys_sorted_list}":value_sorted_list})
I want to sort the keys alphabetically and each value list by element length.
Something like this:
test_dict = {'best': ['six', 'ten'],
             'gfg': ['One', 'six', 'three'],
             'is': ['nine', 'eight', 'seven']}
I also want another function similar to the one I mentioned above that, when elements have the same length, orders them randomly.
As well as another function to sort each value list randomly.
Sorting keys alphabetically and values by length.
new_dict = {}
for key in sorted(test_dict.keys()):
    sorted_values = sorted(test_dict[key], key=len)
    new_dict[key] = sorted_values
print(new_dict)
This can be achieved with a dictionary comprehension as follows:
test_dict = {'gfg': ['One', 'six', 'three'],
             'is': ['seven', 'eight', 'nine'],
             'best': ['ten', 'six']}
new_dict = {k:sorted(v, key=len) for k, v in sorted(test_dict.items())}
print(new_dict)
Output:
{'best': ['ten', 'six'], 'gfg': ['One', 'six', 'three'], 'is': ['nine', 'seven', 'eight']}
dict preserves insertion order since 3.7.
Changed in version 3.7: Dictionary order is guaranteed to be insertion order. This behavior was an implementation detail of CPython from 3.6.
Therefore, you can simply construct a new dictionary from the sorted keys and make sure each corresponding value list is sorted. Judging by the output you posted, the values are sorted by length first and then alphabetically.
result = {key: sorted(value, key=lambda x: (len(x), x)) for key, value in sorted(test_dict.items())} # Thanks to Masklinn
print(result)
# {'best': ['six', 'ten'], 'gfg': ['One', 'six', 'three'], 'is': ['nine', 'eight', 'seven']}
Reference:
dict-comprehension - a way to construct a dictionary
sorted - The key function here achieves sorting by length first then alphabetically. You can change it according to your sorting rules.
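The question also asks for two variants that the answers above do not cover: ordering equal-length elements randomly, and ordering each value list fully at random. A minimal sketch, assuming the helper names sort_by_length_random_ties and shuffle_values (both hypothetical): shuffle a copy first, then rely on the fact that sorted is stable, so elements of equal length keep their shuffled relative order.

```python
import random


def sort_by_length_random_ties(values):
    """Sort by length; elements of equal length end up in random order."""
    shuffled = random.sample(values, k=len(values))  # shuffled copy, input untouched
    return sorted(shuffled, key=len)  # stable sort preserves the shuffled tie order


def shuffle_values(values):
    """Return a randomly ordered copy of the list."""
    return random.sample(values, k=len(values))


test_dict = {'gfg': ['One', 'six', 'three'],
             'is': ['seven', 'eight', 'nine'],
             'best': ['ten', 'six']}

by_length = {key: sort_by_length_random_ties(value)
             for key, value in sorted(test_dict.items())}
shuffled = {key: shuffle_values(value)
            for key, value in sorted(test_dict.items())}
print(by_length)
print(shuffled)
```

random.sample is used instead of random.shuffle because shuffle works in place and returns None, which is one reason the attempt in the question did not behave as expected.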

regex for combining length, inclusion and exclusion?

A search on SO with just [regex] gave me 249'446 hits and a search with [regex] inclusion exclusion gave me 47 hits but I guess none of the latter (maybe some of the former?) fit my case.
I am also aware, e.g. about this regex page https://www.regular-expressions.info/refquick.html,
but I guess there might be a regex concept which I am not yet familiar with
and would be grateful for hints.
Here is a minimal example of what I am trying to do with a given list of strings.
Find all items which:
have a fixed defined number of characters, i.e. length
must include all characters from a certain list (doesn't matter at what position and if multiple times)
must NOT include any characters from a certain list
Constructs like: [ei^no]{4}, ((?![no])[ei]){4} and a lot of other more complex trials didn't give the desired results.
Hence, I currently implemented this as a 3 step process with checking the length, doing a search and a match. This looks pretty cumbersome and inefficient to me.
Is there a more efficient way to do this?
Script:
import re
items = ['one', 'two', 'three', 'four', 'five', 'six', 'seven', 'eight', 'nine', 'ten', 'eleven', 'twelve']
count = 4
mustContain = 'ei' # all of these characters at least once
mustNotContain = 'no' # none of those chars
hits1 = []
for item in items:
    if len(item)==count:
        hits1.append(item)
print("Hits1:",hits1)
hits2 = []
for hit in hits1:
    regex = '[{}]'.format(mustContain)
    if re.search(regex,hit):
        hits2.append(hit)
print("Hits2:", hits2)
hits3 = []
for hit in hits2:
    regex = '[{}]'.format(mustNotContain)
    if not re.search(regex,hit): # keep only items without any excluded char
        hits3.append(hit)
print("Hits3:", hits3)
Result:
Hits1: ['four', 'five', 'nine']
Hits2: ['five', 'nine']
Hits3: ['five']
If you are interested in a regex approach, you can create a single dynamic pattern that looks like:
^(?=.{4}$)(?![^no\n]*[no])(?=[^e\n]*e)[^i\n]*i.*$
Explanation
^ Start of string
(?=.{4}$) Assert 4 characters
(?![^no\n]*[no]) Assert no occurrence of n or o to the right using a leading negated character class
(?=[^e\n]*e) Assert an e char to the right
[^i\n]*i Match any char except i and then match i
.* Match the rest of the line
$ end of string
See a regex demo and a Python demo.
Example
import re
items = ['one', 'two', 'three', 'four', 'five', 'six', 'seven', 'eight', 'nine', 'ten', 'eleven', 'twelve', 'tree']
hits = [item for item in items if re.match(r"(?=.{4}$)(?![^no\n]*[no])(?=[^e\n]*e)[^i\n]*i.*$", item)]
print(hits)
Output
['five']
Using a variation of all and a list comprehension:
items = ['one', 'two', 'three', 'four', 'five', 'six', 'seven', 'eight', 'nine', 'ten', 'eleven', 'twelve', 'tree']
count = 4
mustContain = ["e", "i"] # all of these characters at least once
mustNotContain = ["n", "o"] # none of those chars
hits = [
    item for item in items if
    len(item) == count and
    all([c in item for c in mustContain]) and
    all([c not in item for c in mustNotContain])
]
print(hits)
Output
['five']
See a Python demo.
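The same three conditions can also be written with set operations, which some readers find easier to scan than nested all(...) calls. This is only a sketch; the function name matches and its parameter names are assumptions for illustration.

```python
def matches(item, length, must_contain, must_not_contain):
    """True if item has the given length, all required chars and no excluded chars."""
    letters = set(item)
    return (len(item) == length
            and set(must_contain) <= letters           # every required char is present
            and letters.isdisjoint(must_not_contain))  # no excluded char is present


items = ['one', 'two', 'three', 'four', 'five', 'six', 'seven', 'eight',
         'nine', 'ten', 'eleven', 'twelve', 'tree']
hits = [item for item in items if matches(item, 4, "ei", "no")]
print(hits)
```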
Apparently, the "trick" I was missing was the positive lookahead, (?=regex).
I think the regex in @Thefourthbird's solution can be shortened,
unless I overlooked something and somebody proves me wrong.
The regex for the included characters can be generated dynamically.
The regex for the original minimal example of the question would be:
^(?=.{4}$)(?!.*[no])(?=.*e)(?=.*i)
Script: (dynamically generated regex)
import re
items = ['one', 'two', 'three', 'four', 'five', 'six',
         'seven', 'eight', 'nine', 'ten', 'eleven', 'twelve',
         'tree', 'mean', 'mine', 'fine', 'dime', 'eire']
count = 4
mustContain = 'ei' # all of these characters at least once
mustNotContain = 'no' # none of those chars
hits = []
regex1 = '^(?=.{' + str(count) + '}$)' # limit number of chars
regex2 = '(?!.*[' + mustNotContain + '])' if mustNotContain else '' # excluded chars
regex3 = ''.join(['(?=.*{})'.format(c) for c in mustContain]) # included chars
regex = regex1 + regex2 + regex3
for item in items:
    if re.match(regex,item,re.IGNORECASE):
        hits.append(item)
print("Hits:", hits)
Result:
Hits: ['five', 'dime', 'eire']
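One caveat with building the pattern by plain string concatenation is that characters such as ], ^ or . in the include or exclude lists would be interpreted as regex syntax. A hedged variant of the same idea that passes them through re.escape and compiles the pattern once; the function name build_pattern is an assumption for illustration.

```python
import re


def build_pattern(length, must_contain, must_not_contain):
    """Compile a lookahead-only pattern for the length/include/exclude rules."""
    parts = ['(?=.{%d}$)' % length]  # exactly `length` characters
    if must_not_contain:
        # negative lookahead: none of the excluded chars anywhere in the string
        parts.append('(?!.*[' + re.escape(must_not_contain) + '])')
    # one positive lookahead per required char
    parts.extend('(?=.*' + re.escape(c) + ')' for c in must_contain)
    return re.compile(''.join(parts), re.IGNORECASE)


items = ['one', 'two', 'three', 'four', 'five', 'six',
         'seven', 'eight', 'nine', 'ten', 'eleven', 'twelve',
         'tree', 'mean', 'mine', 'fine', 'dime', 'eire']
pattern = build_pattern(4, 'ei', 'no')
hits = [item for item in items if pattern.match(item)]
print("Hits:", hits)
```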

Shift elements of a list forward (rotating a list)

My issue is as follows: I want to create a program which accepts strings divided from each other by one space. Then the program should prompt a number, which is going to be the amount of words it's going to shift forward. I also want to use lists for words as well as for the output, because of practice.
Input: one two three four five six seven 3
Output: ['four', 'five', 'six', 'seven', 'one', 'two', 'three']
This is what I've come up with. For the input I've used the same input as above. However, when I increase the prompt number by N, the number of strings appended to the list drops by N. The same happens in reverse when I decrease the prompt number by N (the number of appended strings increases by N). What could be the issue here?
l_words = list(input().split())
shift = int(input()) + 1 #shifting strings' number
l = [l_words[shift - 1]]
for k in range(len(l_words)):
    if (shift+k) < len(l_words):
        l.append(l_words[shift+k])
    else:
        if (k-shift)>=0:
            l.append(l_words[k-shift])
print(l)
You can use slicing: concatenate the part of the list from the shift index onward with the part before it.
inp = input().split()
shift_by = int(inp[-1])
li = inp[:-1]
print(li[shift_by:] + li[:shift_by]) # ['four', 'five', 'six', 'seven', 'one', 'two', 'three']
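The slicing approach above stops rotating once the shift is larger than the list, because li[shift_by:] is then empty. A small sketch of two ways to handle that, assuming the helper names rotate_left and rotate_left_deque (both hypothetical): reduce the shift modulo the list length, or use collections.deque, whose rotate method already wraps around.

```python
from collections import deque


def rotate_left(words, shift):
    """Rotate a list to the left by `shift` positions using slicing."""
    if not words:
        return []
    shift %= len(words)  # shifts larger than the list wrap around
    return words[shift:] + words[:shift]


def rotate_left_deque(words, shift):
    """Same result using deque.rotate; a negative argument rotates left."""
    rotated = deque(words)
    rotated.rotate(-shift)
    return list(rotated)


words = 'one two three four five six seven'.split()
print(rotate_left(words, 3))
print(rotate_left_deque(words, 10))  # 10 wraps around to 3 for seven words
```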

Replace all the values in a string in fewer lines of code

I have a string = "12345678"
I want to replace each character of this string with its text equivalent.
I have already built the dictionary (that is a requirement). However, I do not know how to replace all of the characters at once.
Since the dictionary is built, all I have to do is loop over the string and replace each character with its value in the dictionary.
text = [string.replace(x, dictionary[x]) for x in string]
My current output:
It replaces the characters one at a time and creates a list of 8 elements, each with only one character replaced.
Example (sorry, I can't show much):
text = [one2345678, 1two345678, 12three45678...1234567eight]
I don't know why.
My expected output:
text= onetwothreefourfivesixseveneight
The issue is that the list comprehension calls string.replace once per character, and each call starts again from the original string. Instead, try
string = "12345678"
text = ''
for x in string:
    text += dictionary[x]
or
text = "".join(dictionary[x] for x in string)
import re
s='12345678'
d={
    '1':'one',
    '2':'two'
}
# digits missing from d are left unchanged instead of raising KeyError
print(re.sub(r'\d',lambda x:d.get(x.group(), x.group()),s))
The regular expression route that SmartManoj recommended is perfect if the thing you want to replace is more than one character, but if you're mapping single characters to arbitrary-length strings, then it's waaaaaay overkill.
You can instead use str.translate alongside str.maketrans
dictionary = {'1': 'one', '2': 'two', ... }
mapping = str.maketrans(dictionary)
string = '12345678'
text = string.translate(mapping)
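For reference, a complete version of the str.translate approach as a sketch. The spelled-out dictionary is an assumption based on the eight digits in the question, since the original dictionary is not shown in full.

```python
# Assumed mapping for the digits 1-8; keys passed to str.maketrans
# must be single characters, values may be strings of any length.
dictionary = {
    '1': 'one', '2': 'two', '3': 'three', '4': 'four',
    '5': 'five', '6': 'six', '7': 'seven', '8': 'eight',
}
mapping = str.maketrans(dictionary)

string = '12345678'
text = string.translate(mapping)
print(text)  # onetwothreefourfivesixseveneight
```

Characters that have no entry in the mapping are passed through unchanged, so no explicit fallback is needed.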
Generator expression solution (characters missing from the dictionary are left unchanged):
text_in = '12345678 and the rest is not in dic'
replace_dic = {
    '1': 'one',
    '2': 'two',
    '3': 'three',
    '4': 'four',
    '5': 'five',
    '6': 'six',
    '7': 'seven',
    '8': 'eight',
}
text_out = ''.join(replace_dic.get(c, c) for c in text_in)
print(text_out) # 'onetwothreefourfivesixseveneight and the rest is not in dic'
EDIT: code edited per comment, and print converted to Python3.

List all elements, but only one of duplicated elements?

Say I have a list of strings such as
words = ['one', 'two', 'one', 'three', 'three']
I want to create a new list in alphabetical order like
newList = ['one', 'three', 'two']
Anyone have any solutions? I have seen suggestions that output duplicates, but I cannot figure out how to achieve this particular goal (or maybe I just can't figure out how to google well.)
Throw the contents into a set to remove duplicates and sort:
newList = sorted(set(words))
OR maybe this, using set:
newList=sorted({*words})
If the order of the elements in words is important to you, you can try this.
from collections import OrderedDict
words = ['one', 'two', 'one', 'three', 'three']
w1 = OrderedDict()
for i in words:
    if i in w1:
        w1[i]+=1
    else:
        w1[i] = 1
print(w1.keys())
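If the counts are not needed, a shorter order-preserving variant is possible on Python 3.7 or later, where plain dicts keep insertion order. This is a sketch of that alternative, not a replacement for the sorted answer the question asked for; the variable name unique_in_order is an assumption.

```python
words = ['one', 'two', 'one', 'three', 'three']

# dict.fromkeys keeps the first occurrence of each key, in insertion order
unique_in_order = list(dict.fromkeys(words))
print(unique_in_order)  # ['one', 'two', 'three']

# sorting afterwards gives the alphabetical result from the question
print(sorted(unique_in_order))  # ['one', 'three', 'two']
```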
