Find the uncommon word between two lists? - python

I have 2 lists such as this:
>>> first = ['hello', 'hey', 'hi', 'hey']
>>> second = ['hey', 'hi', 'hello']
I want a new list that contains the word that is not included in the second list. In this case:
odd_word = ['hey']
What is the quickest way of doing this? Thank you.
I tried using the method shown here: Get difference between two lists, but it gives me a blank list.
>>> odd = list(set(first) - set(second))
>>> odd
[]

You could use collections.Counter:
>>> from collections import Counter
>>> first = ['hello', 'hey', 'hi', 'hey']
>>> second = ['hey', 'hi', 'hello']
>>> odd_word = list((Counter(first) - Counter(second)).elements())
>>> print(odd_word)
['hey']

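If you only want the words in first that do not appear in second at all, do this. Note that for the example lists above it returns [], because 'hey' does appear in second.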
odd_word = [s for s in first if s not in second]
This will give you duplicates if there are duplicate words in first that aren't in second. If you don't want the duplicates, do this instead.
odd_word = list({s for s in first if s not in second})
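For comparison, here is a minimal sketch that runs the approaches from this thread on the example lists. The helper name leftover_words is made up for illustration; it keeps each word in first that is not cancelled out by a matching occurrence in second, and preserves the original order.

```python
from collections import Counter

first = ['hello', 'hey', 'hi', 'hey']
second = ['hey', 'hi', 'hello']

def leftover_words(first, second):
    """Return the words of `first` left over after cancelling one
    occurrence for every matching occurrence in `second`."""
    remaining = Counter(second)  # occurrences that can still be cancelled
    result = []
    for word in first:
        if remaining[word] > 0:
            remaining[word] -= 1  # cancel against one occurrence in `second`
        else:
            result.append(word)
    return result

set_diff = list(set(first) - set(second))           # ignores duplicates
membership = [s for s in first if s not in second]  # only words absent from `second`
counted = leftover_words(first, second)             # counts occurrences

print(set_diff, membership, counted)
```

Only the counting approach reports the extra 'hey'; the set difference and the membership test both come back empty because every distinct word of first also occurs in second.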

Related

How to get the most frequently occurring element(s) in a list of strings, including when several share the same count?

Output is expected to be a list of string as well.
For example ['hey', 'ho', 'hi', 'hey', 'ho'] should output ['hey', 'ho']
def find_most_popular(list):
    popular = max(set(list), key=list.count)
    return popular
My code only outputs the one most popular
Thanks!
This is what you need (I used "l" instead of "list" to avoid shadowing the built-in type "list"):
def find_most_popular(l):
    highest = max(l.count(p) for p in set(l))  # count of the most frequent item
    popular = [i for i in l if l.count(i) == highest]
    return list(set(popular))
For
your_list=['hey', 'ho', 'hi', 'hey', 'ho']
Result will be:
find_most_popular(your_list)
['hey', 'ho']
You can use collections.Counter:
from collections import Counter
c = Counter(['hey', 'ho', 'hi', 'hey', 'ho'])
print([k for k, v in c.items() if v==c.most_common()[0][1]])
Prints:
['hey', 'ho']
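A slightly more defensive variant of the Counter approach, as a rough sketch: it computes the top count once and returns an empty list for empty input. The function name most_popular is just an illustration, and the first-seen ordering of the result assumes Python 3.7+, where Counter keeps insertion order.

```python
from collections import Counter

def most_popular(words):
    """Return every word that shares the highest count, in first-seen order."""
    counts = Counter(words)
    if not counts:
        return []  # nothing to rank in an empty list
    top = max(counts.values())  # highest count, computed once
    return [word for word, n in counts.items() if n == top]

print(most_popular(['hey', 'ho', 'hi', 'hey', 'ho']))
```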

How can I merge a list with a nested list?

I have two lists. The first one is a simple list with one entry:
l = ['hello']
The second one is a nested list, with several thousand entries
i = [['world', 'hi'],
     ['world', 'hi'],
     ['world', 'hi'],
     ['world', 'hi']]
Now I want to insert the entry from list l into the nested lists of i like this:
i = [['hello', 'world', 'hi'],
     ['hello', 'world', 'hi'],
     ['hello', 'world', 'hi'],
     ['hello', 'world', 'hi']]
How can I do this?
You can use list comprehension to achieve that
i = [l+x for x in i]
Just use the insert method on each sublist in i.
for sublist in i:
    sublist.insert(0, l[0])
If you don't want to modify i in place, and would prefer to build a new list, then
i = [[l[0], *sublist] for sublist in i]
simple: [[*l,*el] for el in i]
Output:
[['hello', 'world', 'hi'],
['hello', 'world', 'hi'],
['hello', 'world', 'hi'],
['hello', 'world', 'hi']]
I guess the benefit is that no matter how many elements you have in l, they will all be put in front of the elements of each sublist in i.
Just add the element to the list you want
l = ["hello"]
i = [["world", "hi"],
["world", "hi"],
["world", "hi"],
["world", "hi"]]
x = []
for y in i:
    x.append(l+y)
x
output:
[['hello', 'world', 'hi'],
['hello', 'world', 'hi'],
['hello', 'world', 'hi'],
['hello', 'world', 'hi']]
I believe something like this should do the trick:
l = ['hello']
i = [['world', 'hi'],
['world', 'hi'],
['world', 'hi'],
['world', 'hi']]
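for j in i:
    for pos, k in enumerate(l):
        j.insert(pos, k)
I'm just iterating through your list i, nesting another for loop to iterate through the simpler list l (to account for the case where it has multiple elements), and inserting each entry of l at the front of the current list in i, keeping the entries of l in their original order. Note that list.insert (like list.append) modifies the list in place and returns None, so its result should not be assigned back to j.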
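To make the difference between building a new list and modifying the sublists in place concrete, here is a small sketch. The two-entry l is a made-up example to show that both options handle more than one prefix element.

```python
l = ['hello', 'there']  # hypothetical two-entry prefix
i = [['world', 'hi'], ['world', 'hi']]

# Option 1: build new sublists and leave the original nested list untouched.
merged = [l + sublist for sublist in i]

# Option 2: modify each sublist in place; assigning to the empty slice [:0]
# inserts all items of l at the front in one step.
for sublist in i:
    sublist[:0] = l

print(merged)
print(i)
```

Both print the same nested list here. The in-place version avoids copying the outer list, but it also changes any other reference to the same sublists.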

Python: Remove a single entry in a list based on the position of the entry [duplicate]

This question already has answers here:
How to remove an element from a list by index
(18 answers)
Closed 7 years ago.
Is there an easy way to delete an entry in a list? I would like to remove only the first entry. In every forum that I have looked at, the only way to delete one entry is with the list.remove() function. This would be perfect, but I can only delete the entry if I know its name.
list = ['hey', 'hi', 'hello', 'phil', 'zed', 'alpha']
list.remove(0)
This doesn't work because you can only remove an entry based on its name. I would have to run list.remove('hey'). I can't do this in this particular instance.
If you require any additional information, ask.
These are methods you can try:
>>> my_list = ['hey', 'hi', 'hello', 'phil', 'zed', 'alpha']
>>> del my_list[0]
>>> my_list = ['hey', 'hi', 'hello', 'phil', 'zed', 'alpha']
>>> if 'hey' in my_list: # you're looking for this one I think
...     del my_list[my_list.index('hey')]
...
>>> my_list
['hi', 'hello', 'phil', 'zed', 'alpha']
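You can also use filter. Note that this removes every occurrence of 'hey', and that in Python 3 filter returns an iterator, so wrap it in list():
my_list = list(filter(lambda x: x != 'hey', my_list))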
Using list comprehension:
my_list = [ x for x in my_list if x!='hey']
First of all, never call something "list" since it clobbers the built-in type 'list'. Second of all, here is your answer:
>>> my_list = ['hey', 'hi', 'hello', 'phil', 'zed', 'alpha']
>>> del my_list[1]
>>> my_list
['hey', 'hello', 'phil', 'zed', 'alpha']
Lists work with positions, not keys (or names, whatever you want to call them).
If you need named access to your data, consider using a dictionary instead, which maps keys to values and lets you look up or delete an entry by its key.
d = {'hey':0, 'hi':0, 'hello':0, 'phil':0, 'zed':0, 'alpha':0}
del d['hey']
print(d) # {'hi': 0, 'hello': 0, 'phil': 0, 'zed': 0, 'alpha': 0} (insertion order, Python 3.7+)
Otherwise you will need to resort to index based deletion by getting the index of the element and calling del alist[index].
To add to the pool of answers, how about:
>>> my_list = ['hey', 'hi', 'hello', 'phil', 'zed', 'alpha']
>>> my_list=my_list[1:]
>>> my_list
['hi', 'hello', 'phil', 'zed', 'alpha']
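One more option not shown above, as a quick sketch: list.pop(index) removes the entry at a position and hands it back, and collections.deque is worth a look if you remove from the front often, since pop(0) on a long list has to shift every remaining item.

```python
from collections import deque

my_list = ['hey', 'hi', 'hello', 'phil', 'zed', 'alpha']
removed = my_list.pop(0)  # removes and returns the item at index 0
print(removed, my_list)

# A deque supports cheap removal from the left end.
queue = deque(['hey', 'hi', 'hello'])
first_item = queue.popleft()
print(first_item, list(queue))
```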

Making a new list from an existing list

Basically I'm just looking for an easy way to reverse the list.
People were getting too confused by my original question.
This was the list: words = ['hey', 'hi', 'hello', 'hi']
How can I reverse it into a new list, adding each word only if it is not already in the new list?
This snippet iterates through the list of words in reverse and adds each word that has not been seen yet to a new list.
words = ['hey', 'hi', 'hello', 'hi']
result = []
for word in reversed(words):
    if word not in result:
        result.append(word)
print(result)
Output
['hi', 'hello', 'hey']
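Converting the list to a set() removes the duplicates, and sorted(..., reverse=True) then turns the set into a list in reverse alphabetical order. Note that this orders the words alphabetically rather than by their reversed position in words; for the example it gives ['hi', 'hey', 'hello'].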
final_lst = sorted(set(words), reverse=True)
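If the goal is the reversed original order with duplicates dropped, a compact alternative is dict.fromkeys, sketched below. This relies on dictionaries preserving insertion order, which is guaranteed from Python 3.7 on.

```python
words = ['hey', 'hi', 'hello', 'hi']

# dict.fromkeys keeps only the first occurrence of each key, in the order seen,
# so feeding it the reversed list gives a reversed, de-duplicated result.
result = list(dict.fromkeys(reversed(words)))
print(result)  # ['hi', 'hello', 'hey']
```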

How to eliminate duplicate list entries in Python while preserving case-sensitivity?

I'm looking for a way to remove duplicate entries from a Python list, but with a twist: the final list has to be case sensitive, with a preference for uppercase words.
For example, between cup and Cup I only need to keep Cup and not cup. Unlike other common solutions, which suggest using lower() first, I'd prefer to maintain the string's case here, and in particular I'd prefer keeping the one with the uppercase letter over the one which is lowercase.
Again, I am trying to turn this list:
['Hello', 'hello', 'world', 'world', 'poland', 'Poland']
into this:
['Hello', 'world', 'Poland']
How should I do that?
Thanks in advance.
This does not preserve the order of words, but it does produce a list of "unique" words with a preference for capitalized ones.
In [34]: words = ['Hello', 'hello', 'world', 'world', 'poland', 'Poland', ]
In [35]: wordset = set(words)
In [36]: [item for item in wordset if item.istitle() or item.title() not in wordset]
Out[36]: ['world', 'Poland', 'Hello']
If you wish to preserve the order as they appear in words, then you could use a collections.OrderedDict:
In [43]: import collections
In [44]: wordset = collections.OrderedDict.fromkeys(words)
In [46]: [item for item in wordset if item.istitle() or item.title() not in wordset]
Out[46]: ['Hello', 'world', 'Poland']
Using set to track seen words:
def uniq(words):
    seen = set()
    for word in words:
        l = word.lower() # Use `word.casefold()` if possible. (3.3+)
        if l in seen:
            continue
        seen.add(l)
        yield word
Usage:
>>> list(uniq(['Hello', 'hello', 'world', 'world', 'Poland', 'poland']))
['Hello', 'world', 'Poland']
UPDATE
The previous version does not handle the preference for uppercase over lowercase. In the updated version I use min, as @TheSoundDefense did.
import collections
def uniq(words):
    seen = collections.OrderedDict() # Use {} if the order is not important.
    for word in words:
        l = word.lower() # Use `word.casefold()` if possible (3.3+)
        seen[l] = min(word, seen.get(l, word)) # keep the "smaller" spelling
    return list(seen.values()) # list() so Python 3 returns a list, not a view
Since an uppercase letter is "smaller" than a lowercase letter in a comparison, I think you can do this:
orig_list = ["Hello", "hello", "world", "world", "Poland", "poland"]
unique_list = []
for word in orig_list:
    for i in range(len(unique_list)):
        if unique_list[i].lower() == word.lower():
            unique_list[i] = min(word, unique_list[i])
            break
    else:
        unique_list.append(word)
The min will have a preference for words with uppercase letters earlier on.
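A quick sketch of why min behaves this way, assuming plain ASCII words: strings are compared character by character using their code points, and every ASCII uppercase letter has a smaller code point than every lowercase one. The variable names are only for illustration.

```python
# Character codes: 'H' is 72 and 'h' is 104, so 'H' sorts first.
codes = (ord('H'), ord('h'))
print(codes)

# The capitalized spelling is the "smaller" string, whichever comes first.
pick_hello = min('Hello', 'hello')
pick_poland = min('poland', 'Poland')
print(pick_hello, pick_poland)

# Caveat: an all-caps spelling beats a title-case one, because 'E' < 'e'.
pick_caps = min('HELLO', 'Hello')
print(pick_caps)
```

So min prefers whichever spelling has an uppercase letter at the first position where the two differ, which is not always the title-case form.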
There are some better answers here, but hopefully this is simple, different and useful. This code satisfies the conditions of your test (sequential pairs of matching words) but would fail on anything more complicated, such as non-sequential pairs, non-pairs or non-strings. For anything more complicated I'd take a different approach.
p1 = ['Hello', 'hello', 'world', 'world', 'Poland', 'poland']
p2 = ['hello', 'Hello', 'world', 'world', 'Poland', 'Poland']
def pref_upper(p):
    q = []
    a = 0  # index of the first word in the current pair
    b = 1  # index of the second word in the current pair
    for x in range(len(p) // 2):  # integer division: one pass per pair
        if p[a][0].isupper() and p[b][0].isupper():
            q.append(p[a])
        if p[a][0].isupper() and p[b][0].islower():
            q.append(p[a])
        if p[a][0].islower() and p[b][0].isupper():
            q.append(p[b])
        if p[a][0].islower() and p[b][0].islower():
            q.append(p[b])
        a += 2
        b += 2
    return q
print(pref_upper(p1))
print(pref_upper(p2))
