List all elements, but only one of duplicated elements? - python

Say I have a list of strings such as
words = ['one', 'two', 'one', 'three', 'three']
I want to create a new list in alphabetical order like
newList = ['one', 'three', 'two']
Anyone have any solutions? I have seen suggestions that output duplicates, but I cannot figure out how to achieve this particular goal (or maybe I just can't figure out how to google well.)

Throw the contents into a set to remove duplicates and sort:
newList = sorted(set(words))

Or, using a set literal with unpacking:
newList = sorted({*words})

If the original order of the elements in words matters to you, you can try this:
from collections import OrderedDict
words = ['one', 'two', 'one', 'three', 'three']
w1 = OrderedDict()
for i in words:
    if i in w1:
        w1[i] += 1
    else:
        w1[i] = 1
print(w1.keys())
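The loop above also counts how often each word occurs. As a minimal sketch, assuming Python 3.7 or later (where dict subclasses keep insertion order), collections.Counter does the same counting in one call; the name unique_in_order is only illustrative:

```python
from collections import Counter

words = ['one', 'two', 'one', 'three', 'three']

# Counter tallies each word; its keys stay in first-seen order on Python 3.7+.
counts = Counter(words)

# Iterating a Counter yields its keys, i.e. each word once.
unique_in_order = list(counts)
print(unique_in_order)   # ['one', 'two', 'three']
print(counts['three'])   # 2
```

Note that this keeps first-seen order, not the alphabetical order the question asks for; wrap the result in sorted() if that is what you need.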

Related

How to remove items from list that have more than 20 characters?

I have a list of items. Some of the items have more than 20 characters, and I want to remove those items, but I don't want to remove items that contain spaces. Here is a minimal reproducible example...
This is a list...
some_list = ['one', 'two', 'three', 'black', 'verylongcharacterwords', 'very long character words']
I want to remove 'verylongcharacterwords', but I don't want to remove 'very long character words'.
This is wanted output...
new_list = ['one', 'two', 'three', 'black', 'very long character words']
Thanks in advance!
List comprehensions to the rescue:
>>> lst = ['one', 'two', 'three', 'black', 'verylongcharacterwords', 'very long character words']
>>> [l for l in lst if not any(len(w) > 20 for w in l.split())]
['one', 'two', 'three', 'black', 'very long character words']
>>>
Use a list comprehension with a condition that keeps an item if it contains a space or is at most 20 characters long.
new_list = [s for s in some_list if ' ' in s or len(s) <= 20]
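The two answers agree on the example list but are not equivalent in general. The sketch below uses a made-up extra item, 'short verylongcharacterwords', to show where they differ; the names items, by_word and by_space are only illustrative:

```python
items = ['one', 'verylongcharacterwords', 'very long character words',
         'short verylongcharacterwords']

# First answer: drop an item if any single word in it is longer than 20 characters.
by_word = [s for s in items if not any(len(w) > 20 for w in s.split())]

# Second answer: keep an item if it contains a space or is at most 20 characters long.
by_space = [s for s in items if ' ' in s or len(s) <= 20]

print(by_word)   # ['one', 'very long character words']
print(by_space)  # ['one', 'very long character words', 'short verylongcharacterwords']
```

Which one is right depends on whether "more than 20 characters" is meant per word or per item.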

Compare two bigrams lists and return the matching bigram

Please suggest how to compare two lists of bigrams and return only the matching bigrams.
In the example lists below, how do I return the matching bigram ['two', 'three']?
bglist1 = [['one', 'two'],
           ['two', 'three'],
           ['three', 'four']]
bglist2 = [['one', 'six'],
           ['two', 'four'],
           ['two', 'three']]
You could just test whether each bigram is in the other list of bigrams.
out = list()
for x in bglist1:
    if x in bglist2:
        out.append(x)
This gives you a list of the bigrams that appear in both lists.
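For long lists, each `in` test against a list scans it from the start. A possible alternative, sketched here under the assumption that every bigram contains only hashable items such as strings, is to convert one list to a set of tuples first; the names seen and matches are only illustrative:

```python
bglist1 = [['one', 'two'], ['two', 'three'], ['three', 'four']]
bglist2 = [['one', 'six'], ['two', 'four'], ['two', 'three']]

# Lists are not hashable, so store each bigram of bglist2 as a tuple in a set.
seen = {tuple(bigram) for bigram in bglist2}

# Keep the bigrams of bglist1 (in their original order) that also occur in bglist2.
matches = [bigram for bigram in bglist1 if tuple(bigram) in seen]
print(matches)  # [['two', 'three']]
```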

Search list with another list but stop on first match

I have two lists, a short one and a longer one.
list1= ['one', 'two']
list2= ['ten', 'seven', 'three', 'one', 'eight', 'six', 'nine', 'two', 'four', 'five']
I need to search the long list for every word in the short list. If it finds a match, it should stop searching and do something. If it doesn't find one, it should do something else. The actual list can be quite long, so once a match is found I don't want it to keep looking. The only part I can't figure out is getting it to stop once found. Maybe my search terms are wrong. How do I get it to stop searching once a match is found, and return None if nothing is found? What's the most efficient or Pythonic way of doing this? Here is what I have (the fuzzy search is part of something else):
for name in list1:
    for dict in reversed(list2):
        if fuzz.WRatio(name, dict['Number']) > 90:
I know I can add what to do when found and then break but then I'm not sure what to do if it isn't found except put in another if but now it's starting to seem kludgy.
The pattern you described is often written as a function of the form def find(content, pattern) -> offset.
You iterate over the candidates and return the first one that matches the pattern, which in your case means checking whether it matches any entry in the second list.
When no match is found, this kind of function often returns -1; for example, Python's str.find method returns -1 when nothing is found.
So in your case you could write a function like the following:
def find(candidates, patterns):
    # fuzz is the fuzzy-matching module already used in the question
    for i, name in enumerate(candidates):
        for entry in reversed(patterns):
            if fuzz.WRatio(name, entry['Number']) > 90:
                return i  # index of the first name that matches a pattern
    return -1  # no name matched any pattern
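To show how the caller might use the -1 result, here is a runnable sketch. It is not the question's code: fuzz.WRatio is replaced by a stand-in scorer built on the standard library's difflib, the patterns are plain strings rather than dicts with a 'Number' key, and the similarity helper and threshold parameter are assumptions made for illustration.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Stand-in scorer (assumption): 0-100 similarity, not fuzz.WRatio."""
    return 100 * SequenceMatcher(None, a, b).ratio()


def find(candidates, patterns, threshold=90):
    """Return the index of the first candidate that matches any pattern, else -1."""
    for i, name in enumerate(candidates):
        for pattern in patterns:
            if similarity(name, pattern) > threshold:
                return i  # returning here stops both loops at the first match
    return -1


list1 = ['one', 'two']
list2 = ['ten', 'seven', 'three', 'one', 'eight', 'six', 'nine', 'two', 'four', 'five']

index = find(list1, list2)
if index == -1:
    print("not found")
else:
    print("found", list1[index])
```

Because the function returns as soon as a score passes the threshold, no further comparisons are made after the first match.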
As far as I understand, code like this may be what you want.
list1 = ['one', 'two']
list2 = ['ten', 'seven', 'three', 'one', 'eight', 'six', 'nine', 'two', 'four', 'five']
list1_count = 0
for name1 in list1:
    for name2 in list2:
        if name1 == name2:
            list1_count = list1_count + 1
            break
if list1_count == len(list1):
    print("found")
else:
    print("not found")
The lines from list1_count = 0 to break can be replaced (perhaps more Pythonically) with:
list1_count = 0
for name1 in list1:
    if name1 in list2:
        list1_count = list1_count + 1
I don't know if I understand what you're looking for, but here is something that finds the first value and then stops:
list1 = ['one', 'two']
list2 = ['ten', 'seven', 'three', 'one', 'eight', 'six', 'nine', 'two', 'four', 'five']
for l in list1:
    a = list2.index(l)
    break
print(a)
If you want to get None when nothing is found, try
list1 = ['one', 'two']
list2 = ['ten', 'seven', 'three', 'one', 'eight', 'six', 'nine', 'two', 'four', 'five']
try:
    for l in list1:
        a = list2.index(l)
        break
except ValueError:  # list.index raises ValueError when the value is missing
    a = None
print(a)
The following will tell you if all of the values from list1 are in list2.
all_in = all([val in list2 for val in list1])
If all of the values from list1 are in list2, the value of all_in will be True, and if they weren't, the value of all_in will be False.
If you wanted, you could use this line directly to control your if-else logic.
if all([val in list2 for val in list1]):
    pass  # do thing if match
else:
    pass  # do thing if no match
Edit
If you were looking for the first match of any word in the first list, this might be closer to what you were looking for.
This will give you a True value if there is any match from the first list in the second. Again you can use this for an if statement.
any_in = any((val in list2 for val in list1))
If you need the value of the first match, or a None value if no match is found, this should work.
first_match = next((val for val in list1 if val in list2), None)
That will make use of Python's generators to stop on the very first matching case of any of the words in the first list.
Edit 2
I'm fairly sure that the behavior you were trying to describe was nesting the loops.
for val in list1:
    if val in list2:
        pass  # do something
    else:
        pass  # do something else
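If the goal is to stop at the first match and run the fallback only once, Python's for ... else form may be a closer fit: the else branch runs only when the loop finishes without hitting break. A minimal sketch, assuming exact string equality is an acceptable match test (the names lookup and result are only illustrative):

```python
list1 = ['one', 'two']
list2 = ['ten', 'seven', 'three', 'one', 'eight', 'six', 'nine', 'two', 'four', 'five']

# Building a set once makes each membership test fast, even for a long list2.
lookup = set(list2)

for name in list1:
    if name in lookup:
        result = name   # do something with the first match
        break           # stop searching as soon as a match is found
else:
    result = None       # runs only if the loop never hit break

print(result)  # one
```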

Removing the element in a list occurring more than once [duplicate]

This question already has answers here:
How do I remove duplicates from a list, while preserving order?
(30 answers)
Closed 3 years ago.
I have a list and don't know all of its values, but I want to remove every value that occurs more than once so that only one copy of each value is left. Suppose this is the list:
lst = ['one', 'two', 'three', 'four','four','five','five','five']
This is what I need:
lst = ['one', 'two', 'three', 'four','five']
Here is what I have tried:
i = 0
for ele in lst:
    if ele[i] in lst:
        lst.remove(ele[i])
but it's not working.
This works pretty well:
lst = ['one', 'two', 'three', 'four','four','five','five','five']
newList = list(dict.fromkeys(lst))
print(newList)
output: ['one', 'two', 'three', 'four', 'five']
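As for why the attempt in the question does nothing useful: with i fixed at 0, ele[i] is the first character of each string (for example 'o'), which is never an element of the list, and removing items from a list while iterating over it would skip elements anyway. An explicit loop that builds a new list could look like the sketch below; the names seen and unique are only illustrative, and it assumes the items are hashable.

```python
lst = ['one', 'two', 'three', 'four', 'four', 'five', 'five', 'five']

seen = set()   # values already copied to the result
unique = []    # result, in first-seen order

for item in lst:
    if item not in seen:
        seen.add(item)
        unique.append(item)

print(unique)  # ['one', 'two', 'three', 'four', 'five']
```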

Converting String list to pure list in Python

I have a string type list from bash which looks like this:
inp = "["one","two","three","four","five"]"
The input is coming from bash script.
In my python script I would like to convert this to normal python list in this format:
["one","two","three","four","five"]
where all elements would be strings, but the whole thing is represented as a list.
I tried: list(inp)
it does not work. Any suggestions?
Try this code:
import ast
inp = '["one","two","three","four","five"]'
ast.literal_eval(inp)  # returns ['one', 'two', 'three', 'four', 'five']
Have a look at ast.literal_eval:
>>> import ast
>>> inp = '["one","two","three","four","five"]'
>>> converted_inp = ast.literal_eval(inp)
>>> type(converted_inp)
<class 'list'>
>>> print(converted_inp)
['one', 'two', 'three', 'four', 'five']
Notice that your original input string is not a valid python string, since it ends after "[".
>>> inp = "["one","two","three","four","five"]"
SyntaxError: invalid syntax
The solution using re.sub() and str.split() functions:
import re
inp = '["one","two","three","four","five"]'
l = re.sub(r'["\]\[]', '', inp).split(',')
print(l)
The output:
['one', 'two', 'three', 'four', 'five']
You can use replace and split as follows:
>>> inp
"['one','two','three','four','five']"
>>> inp.replace('[','').replace(']','').replace('\'','').split(',')
['one', 'two', 'three', 'four', 'five']
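When the string uses double quotes, as in the question, it also happens to be valid JSON, so the standard library's json module is another option. This is a sketch under that assumption; it would not parse the single-quoted variant shown just above, and the name items is only illustrative.

```python
import json

inp = '["one","two","three","four","five"]'

# json.loads parses a JSON array into a Python list of str.
items = json.loads(inp)

print(type(items))  # <class 'list'>
print(items)        # ['one', 'two', 'three', 'four', 'five']
```

Unlike the replace/split approaches, a real parser keeps working when an element itself contains a comma or a bracket.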
