How to find common values in list using Python? - python

I have a dataframe, names, that looks like this:
   year  name
0  1990  'a', 'b', 'c'
1  2001  'a', 'd', 'c'
2  2004  'e', 'b', 'c'
I want to count how often each value occurs in the name column, so that I get:
c:3, a:2, b:2, d:1, e:1
I am not sure how to approach this.
But what I thought of is to convert the name column to a list:
names_list = name['name'].tolist()
names_list = ['a', 'b', 'c', 'a', 'd', 'c', 'e', 'b', 'c']
And then, use the below function I found in another post to get the most common value:
def most_common(lst):
    return max(set(lst), key=lst.count)
most_common(names_list)
'c'
And it only gives one most common value, but I'm trying to get at least the top 3 values from the list. How can I do this?

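Split each string, explode the resulting lists into one row per value, and then take the mode:
df.name.str.split(', ').explode().mode()
To return the count of every value (sorted from most to least common):
df.name.str.split(', ').explode().value_counts()
# if you only want the highest count:
# df.name.str.split(', ').explode().value_counts().sort_values().tail(1)
# if you want the top 3:
# df.name.str.split(', ').explode().value_counts().head(3)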

If you have
names_list = ['a', 'b', 'c', 'a', 'd', 'c', 'e', 'b', 'c']
then you can use collections.Counter in the following way:
import collections
names_list = ['a', 'b', 'c', 'a', 'd', 'c', 'e', 'b', 'c']
occurs = collections.Counter(names_list)
print(occurs)
Output:
Counter({'c': 3, 'a': 2, 'b': 2, 'e': 1, 'd': 1})
Note that collections.Counter is a subclass of dict, so occurs has .keys(), .values(), .items() and so on.
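Since the question asks for the top 3 values rather than every count, here is a minimal sketch using Counter.most_common, which takes the number of entries to return. It reuses names_list from the question; the name top_three is only for illustration. On Python 3.7+, values with equal counts come back in the order they were first seen.

```python
from collections import Counter

names_list = ['a', 'b', 'c', 'a', 'd', 'c', 'e', 'b', 'c']

# most_common(n) returns the n highest counts as (value, count) pairs
top_three = Counter(names_list).most_common(3)
print(top_three)  # [('c', 3), ('a', 2), ('b', 2)]
```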

Related

why does this code stop without raising error?

I want to collect some data which is more than two in a list.
For this, I wrote code like the one below.
A = ['a', 'a', 'b', 'b', 'a', 'c', 'd', 'b']
for ab in A:
    ab_list = list()
    for _ in range(A.count(ab)):
        ab_list.append(A.pop(A.index(ab)))
    # other code ~
When I checked the code, it didn't work at 'c', 'd'.
It just stops when all 'b' in list A are removed.
For me, it's okay because 'c', 'd' is just one, but I want to know the reason it stops at 'c' and 'd'.
Please help newbie
thanks expert
Try this
A = ['a', 'a', 'b', 'b', 'a', 'c', 'd', 'b']
ab_list = [character for character in set(A) if A.count(character) > 0]
print(ab_list)
Output:
['b', 'a', 'c', 'd']
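Your problem is in this line:
for _ in range(A.count(ab)):
The inner loop pops every occurrence of the current item out of A while the outer for loop is still iterating over A. The outer loop keeps an internal position in the list, so after all 'a' items are removed at position 0 and all 'b' items at position 1, A is ['c', 'd'] and the next position (2) is already past the end. The loop then ends normally, without an error, and 'c' and 'd' are never visited.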
But if you just want to count the elements in the list, you can use numpy:
import numpy as np
A = ['a', 'a', 'b', 'b', 'a', 'c', 'd', 'b']
# count the occurrences of each element in A
B = np.unique(A, return_counts=True)
Results:
(array(['a', 'b', 'c', 'd'], dtype='<U1'), array([3, 3, 1, 1], dtype=int64))
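To see why the original loop ends early, and one way around it, here is a small sketch. The first part reproduces the behaviour on a copy of A (the names work and visited are made up for the demonstration); the second part groups the items with collections.Counter instead of popping from the list being iterated. This is only one possible approach, not necessarily what the asker's "other code" needs.

```python
from collections import Counter

A = ['a', 'a', 'b', 'b', 'a', 'c', 'd', 'b']

# Part 1: reproduce the problem on a copy, recording which items the loop visits.
work = list(A)
visited = []
for ab in work:
    visited.append(ab)
    for _ in range(work.count(ab)):
        work.pop(work.index(ab))  # shrinks the list being iterated over

print(visited)  # ['a', 'b']  -> 'c' and 'd' were never visited
print(work)     # ['c', 'd']  -> still in the list when the loop ended

# Part 2: group equal items without modifying A while looping over it.
groups = {item: [item] * n for item, n in Counter(A).items()}
print(groups['b'])  # ['b', 'b', 'b']
```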

Python: How to update dictionary with step-index from list

I am a week-old Python learner. Let's say I have:
list = ["a", "A", "b", "B", "c", "C"]
I need to put them into a dictionary so that the result looks like this:
dict = {"a": "A", "b": "B", "c": "C"}
I tried to use list indexes with dict.update({list[n::2]: list[n+1::2]}) and for n in range(0, (len(list)/2)).
I think I did something wrong. Please correct me.
Thank you in advance.
Try the following:
>>> lst = ['a', 'A', 'b', 'B', 'c', 'C']
>>> dct = dict(zip(lst[::2],lst[1::2]))
>>> dct
{'a': 'A', 'b': 'B', 'c': 'C'}
Explanation:
>>> lst[::2]
['a', 'b', 'c']
>>> lst[1::2]
['A', 'B', 'C']
>>> zip(lst[::2], lst[1::2])
# this actually gives a zip iterator which contains:
# [('a', 'A'), ('b', 'B'), ('c', 'C')]
>>> dict(zip(lst[::2], lst[1::2]))
# here each tuple is interpreted as key value pair, so finally you get:
{'a': 'A', 'b': 'B', 'c': 'C'}
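NOTE: Don't give your variables the same names as Python built-ins such as list and dict.
A correct version of your program would be:
lst = ['a', 'A', 'b', 'B', 'c', 'C']
dct = {}
# step through the list two items at a time: n is the key index, n+1 the value index
for n in range(0, len(lst), 2):
    dct.update({lst[n]: lst[n+1]})
print(dct)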
Yours did not work because you used slices in each iteration instead of accessing individual elements. lst[0::2] gives ['a', 'b', 'c'] and lst[1::2] gives ['A', 'B', 'C']. So in the first iteration, when n == 0, you are trying to update the dictionary with the pair ['a', 'b', 'c']: ['A', 'B', 'C'], and you get a TypeError because lists are unhashable and cannot be used as dictionary keys.
You can use dictionary comprehension like this:
>>> l = list("aAbBcCdD")
>>> l
['a', 'A', 'b', 'B', 'c', 'C', 'd', 'D']
>>> { l[i] : l[i+1] for i in range(0,len(l),2)}
{'a': 'A', 'b': 'B', 'c': 'C', 'd': 'D'}
The code below should fit your question. Hope this helps.
a = ["a", "A", "B", "b", "c", "C", "d", "D"]
b = {}
for each in range(len(a)):
    # every even index holds a key; the item right after it is its value
    if each % 2 == 0:
        b[a[each]] = a[each + 1]
print(b)
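All of the answers above assume the list has an even number of items. As a hedged sketch, the pairing can be wrapped in a small helper that checks this first; pairs_to_dict is a made-up name, and raising ValueError on an odd-length list is just one possible choice.

```python
def pairs_to_dict(flat):
    """Pair up consecutive items: [k1, v1, k2, v2, ...] -> {k1: v1, k2: v2, ...}."""
    if len(flat) % 2 != 0:
        raise ValueError("expected an even number of items")
    it = iter(flat)
    # zip pulls two items from the same iterator per step: (k1, v1), (k2, v2), ...
    return dict(zip(it, it))


print(pairs_to_dict(['a', 'A', 'b', 'B', 'c', 'C']))  # {'a': 'A', 'b': 'B', 'c': 'C'}
```

Passing the same iterator to zip twice avoids building the two slices lst[::2] and lst[1::2], but the slicing version is arguably easier to read.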

Check which items exist in all sublists [duplicate]

I would like to collect all the items that exist in every sublist.
Let's say the list of lists looks like this:
list[0] = ['A', 'B', 'C', 'D']
list[1] = ['X', 'B', 'A']
list[2] = ['R', 'C', 'A', 'B', 'X']
'A' and 'B' exist in every sublist and my goal is to save these to another list:
list2 = ['A', 'B']
I've been trying to use list comprehension but I can't figure out how to get it to work the way I want. Any help is greatly appreciated!
If order doesn't matter, you can use set.intersection:
l = [['A', 'B', 'C', 'D'],
     ['X', 'B', 'A'],
     ['R', 'C', 'A', 'B', 'X']]
list2 = list(set.intersection(*(set(sl) for sl in l)))
print(list2)
Output:
['A', 'B']
# or ['B', 'A']
Use set intersections, and then convert the result back to a list.
your_list = [
    ['A', 'B', 'C', 'D'],
    ['X', 'B', 'A'],
    ['R', 'C', 'A', 'B', 'X']
]
print(set.intersection(*map(set,your_list)))
If you know the values per list are unique in the first place, and you don't care about order, then you can just use a list of sets in your original code, so this simplifies to:
your_list = [
    set(['A', 'B', 'C', 'D']),
    set(['X', 'B', 'A']),
    set(['R', 'C', 'A', 'B', 'X'])
]
print(set.intersection(*your_list))
Note: don't name your own variables list, as that shadows the built-in list type and will get really confusing.
Try this :
>>> list1 = [['A', 'B', 'C', 'D'], ['X', 'B', 'A'], ['R', 'C', 'A', 'B', 'X']]
>>> list2 = list(set(list1[0]).intersection(list1[1]).intersection(list1[2]))
>>> list2
['A', 'B']
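The set-based answers above do not guarantee any particular order in the result. If the order matters, one possible sketch is to compute the intersection with sets and then filter the first sublist, so the result follows the order of that sublist. The function name common_items is an assumption for this example.

```python
def common_items(sublists):
    """Return the items present in every sublist, in first-sublist order."""
    if not sublists:
        return []
    # set.intersection accepts any iterables, so the other sublists can be passed as-is
    shared = set(sublists[0]).intersection(*sublists[1:])
    seen = set()
    result = []
    for item in sublists[0]:
        # keep each shared item once, in the order it first appears
        if item in shared and item not in seen:
            seen.add(item)
            result.append(item)
    return result


data = [['A', 'B', 'C', 'D'], ['X', 'B', 'A'], ['R', 'C', 'A', 'B', 'X']]
print(common_items(data))  # ['A', 'B']
```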

Compare a users input list to a set list in order with duplicates

I am trying to take a set of answers either 'A' 'B' 'C' or 'D' in a specific order such as a multiple choice test and have the user input his answers. After I would like it to create a third list and print out what was right and wrong. Here is what I have so far.
userAnswersList = []
correctAnswers = ['A', 'C', 'A', 'A', 'D', 'B', 'C', 'A', 'C', 'B', 'A', 'D', 'C', 'A', 'D', 'C', 'B', 'B', 'D', 'A']
while len(userAnswersList) <= 19:
    userAnswers = input('Give me each answer total of 20 questions I\'ll let you know how many you missed.')
    userAnswersList.append(userAnswers.upper())
correctedList = []
for i in userAnswersList:
    if i in correctAnswers:
        correctedList.append(i)
    else:
        correctedList.append('XX')
print(correctedList)
So my end result would be the corrected list with 'XX' in place of each answer they missed; if the answer is right, it just puts the user's input in that place.
After the user has entered their 20 answers, it would look like this
['A', 'C', 'A', 'XX', 'D', 'B', 'C', 'XX', 'C', 'B', 'A', 'XX', 'C', 'A', 'D', 'XX', 'B', 'B', 'XX', 'A']
if they missed 5 questions in that order.
EDIT
Thank you again for all your help I was able to solve my problems with your help and some great answers. I used Nicks solution as that is how we are learning it.
I will try out others just so I can get used to them.
Rather than using:
for i in userAnswersList:
you may find it easier to iterate through the array and check if the values are equal, such as:
for i in range(len(userAnswersList)):
    if userAnswersList[i] == correctAnswers[i]:
        correctedList.append(userAnswersList[i])
    else:
        correctedList.append('XX')
There is no question here, so I'll assume you're asking what's wrong with what you have.
The compare section checks whether each answer appears anywhere in correctAnswers, not whether it matches the correct answer at the same position.
You'll need something along the following lines:
for i in range(len(correctAnswers)):
    correctedList.append(correctAnswers[i] if userAnswersList[i] == correctAnswers[i] else 'XX')
This can be done using Python's built-in map function.
As explained in the Python 2 help for map:
map(...)
map(function, sequence[, sequence, ...]) -> list
Return a list of the results of applying the function to the items of
the argument sequence(s). If more than one sequence is given, the
function is called with an argument list consisting of the corresponding
item of each sequence, substituting None for missing values when not all
sequences have the same length. If the function is None, return a list of
the items of the sequence (or a list of tuples if more than one sequence).
So in that case, you want to compare each item of your two equal lists, and apply a condition against them. The condition we will introduce will follow this logic:
if something in one list is not equal to the other, set 'XX', otherwise return the value.
So, we will introduce what is called a "lambda" function here to put that above condition. Here is documentation on what a lambda is: http://www.python-course.eu/lambda.php
lambda x, y: 'XX' if x != y else y
So, when we put it all together, we have this (in Python 3, map returns an iterator, so wrap it in list() to get a list):
d = list(map(lambda x, y: 'XX' if x != y else y, userAnswersList, correctAnswers))
Demo:
correctAnswers = ['A', 'C', 'A', 'A', 'D', 'B', 'C', 'A', 'C', 'B', 'A', 'D', 'C', 'A', 'D', 'C', 'B', 'B', 'D', 'A']
userAnswersList = ['A', 'C', 'A', 'B', 'D', 'B', 'C', 'A', 'A', 'B', 'A', 'C', 'C', 'A', 'D', 'C', 'D', 'C', 'D', 'B']
Result:
['A', 'C', 'A', 'XX', 'D', 'B', 'C', 'A', 'XX', 'B', 'A', 'XX', 'C', 'A', 'D', 'C', 'XX', 'XX', 'D', 'XX']
You can zip the two lists together and compare the elements at matching indexes. You can also collect all the answers with a list comprehension, replacing your while loop with range:
userAnswersList = [input('Give me each answer total of 20 questions I\'ll let you know how many you missed.').upper()
                   for _ in range(20)]
correctAnswers = ['A', 'C', 'A', 'A', 'D', 'B', 'C', 'A', 'C', 'B', 'A', 'D', 'C', 'A', 'D', 'C', 'B', 'B', 'D', 'A']
correctedList = ["XX" if u_a != c_a else u_a for u_a, c_a in zip(userAnswersList, correctAnswers)]
If the corresponding elements from each list are the same we add the letter, if not we add "XX" to mark an incorrect answer.
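To try the comparison without typing 20 answers, here is a self-contained sketch that uses the demo lists from the map answer above instead of input(). The function name grade and its wrong_mark parameter are assumptions for this example; it also counts the misses, since the prompt promises to report them.

```python
def grade(user_answers, correct_answers, wrong_mark="XX"):
    """Return the user's answers with wrong_mark in place of each miss."""
    if len(user_answers) != len(correct_answers):
        raise ValueError("both lists must have the same length")
    # zip pairs up the answers position by position
    return [u if u == c else wrong_mark for u, c in zip(user_answers, correct_answers)]


correctAnswers = ['A', 'C', 'A', 'A', 'D', 'B', 'C', 'A', 'C', 'B',
                  'A', 'D', 'C', 'A', 'D', 'C', 'B', 'B', 'D', 'A']
userAnswersList = ['A', 'C', 'A', 'B', 'D', 'B', 'C', 'A', 'A', 'B',
                   'A', 'C', 'C', 'A', 'D', 'C', 'D', 'C', 'D', 'B']

correctedList = grade(userAnswersList, correctAnswers)
missed = correctedList.count("XX")
print(correctedList)
print("Missed:", missed)  # Missed: 6
```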

Keep strings that occur N times or more

I have a list that is
mylist = ['a', 'a', 'a', 'b', 'b', 'c', 'c', 'd']
And I used Counter from collections on this list to get the result:
from collections import Counter
counts = Counter(mylist)
#Counter({'a': 3, 'c': 2, 'b': 2, 'd': 1})
Now I want to subset this so that I have all elements that occur some number of times, for example: 2 times or more - so that the output looks like this:
['a', 'b', 'c']
This seems like it should be a simple task - but I have not found anything that has helped me so far.
Can anyone suggest somewhere to look? I am also not attached to using Counter if I have taken the wrong approach. I should note I am new to python so I apologise if this is trivial.
[s for s, c in counts.items() if c >= 2]  # use counts.iteritems() on Python 2
# => ['a', 'b', 'c']
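For reference, the same idea as a complete, runnable snippet with the threshold as a parameter. The function name at_least is made up for this sketch; on Python 3.7+ the result keeps the order in which the values first appear in the list.

```python
from collections import Counter


def at_least(items, n):
    """Return the distinct items that occur n times or more."""
    counts = Counter(items)
    return [item for item, count in counts.items() if count >= n]


mylist = ['a', 'a', 'a', 'b', 'b', 'c', 'c', 'd']
print(at_least(mylist, 2))  # ['a', 'b', 'c']
print(at_least(mylist, 3))  # ['a']
```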
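Try this...
def get_duplicatesarrval(arrval):
    dup_array = arrval[:]
    # remove one occurrence of every distinct value;
    # whatever is left occurred at least twice
    for i in set(arrval):
        dup_array.remove(i)
    return list(set(dup_array))
mylist = ['a', 'a', 'a', 'b', 'b', 'c', 'c', 'd']
print(get_duplicatesarrval(mylist))
Result (set order may vary):
['a', 'b', 'c']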
The usual way would be to use a list comprehension as @Adaman does.
In the special case of 2 or more, you can also subtract one Counter from another:
>>> counts = Counter(mylist) - Counter(set(mylist))
>>> list(counts)
['a', 'b', 'c']
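from itertools import groupby
mylist = ['a', 'a', 'a', 'b', 'b', 'c', 'c', 'd']
# groupby only groups adjacent equal items, so sort the list first if it is not already grouped
res = [i for i, j in groupby(mylist) if len(list(j)) >= 2]
print(res)
['a', 'b', 'c']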
I think the answers above are better, but I believe this is the simplest method to understand:
mylist = ['a', 'a', 'a', 'b', 'b', 'c', 'c', 'd']
newlist = []
newlist.append(mylist[0])
# note: this collects each distinct value once; it does not filter by count
for i in mylist:
    if i in newlist:
        continue
    else:
        newlist.append(i)
print(newlist)
>>>['a', 'b', 'c', 'd']
