'Counter' object has no attribute 'count' - python
I have two lists and want to count how many elements appear in both. Say list one is l1 = ['a', 'b', 'c', 'd', 'e'] and list two is l2 = ['a', 'f', 'c', 'g']. Since a and c are in both lists, the output should be 2, meaning two elements occur in both lists. Below is my code; I want to count how many of the values in the summed Counter are 2, but I am not sure how to do that.
l1 = ['a', 'b', 'c', 'd', 'e']
l2 = ['a', 'f', 'c', 'g']
from collections import Counter
c1 = Counter(l1)
c2 = Counter(l2)
sum = c1+c2
z=sum.count(2)
What you want is set.intersection (if there are no duplicates in each list):
l1 = ['a', 'b', 'c', 'd', 'e']
l2 = ['a', 'f', 'c', 'g']
print(len(set(l1).intersection(l2)))
Output:
2
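If the lists may contain repeats, one possible variation is Counter's & operator, which keeps each element with the smaller of its two counts. This is only a sketch: shared_elements is a hypothetical helper name, and whether repeats should be counted once or several times is an assumption about what the question needs.

```python
from collections import Counter

def shared_elements(first, second):
    """Hypothetical helper: return (distinct, with_repeats) overlap counts."""
    # Counter & Counter keeps a key only if it is in both inputs,
    # with the minimum of the two counts.
    shared = Counter(first) & Counter(second)
    return len(shared), sum(shared.values())

l1 = ['a', 'a', 'b', 'c']
l2 = ['a', 'a', 'c', 'g']
print(shared_elements(l1, l2))  # (2, 3): 'a' and 'c' are shared, 'a' twice
```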
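Counter is a dict subclass, so c1+c2 is a mapping from element to count. Mappings have no count method (that is a list method), which is why the error is raised. Iterate over the counts instead. The following code counts the elements whose combined count is greater than 1, and it works the same way if you add more lists to the sum.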
# Duplicate elements in 2 lists
l1 = ['a', 'b', 'c', 'd', 'e']
l2 = ['a', 'f', 'c', 'g']# a,c are duplicate
from collections import Counter
c1 = Counter(l1)
c2 = Counter(l2)
sum = c1+c2
j = sum.values()
print(sum)
print(j)
v = 0
for i in j:
    if i>1:
        v = v+1
print("Duplicate in lists:", v)
Output:
Counter({'a': 2, 'c': 2, 'b': 1, 'd': 1, 'e': 1, 'f': 1, 'g': 1})
dict_values([2, 1, 2, 1, 1, 1, 1])
Duplicate in lists: 2
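One caveat with summing the Counters: an element that repeats inside a single list also ends up with a count above 1. A minimal sketch of the difference, assuming you only want elements present in both lists (the names naive and in_both are illustrative):

```python
from collections import Counter

l1 = ['a', 'b', 'b', 'c']   # 'b' repeats, but only inside l1
l2 = ['a', 'f', 'c', 'g']

# Summing the raw Counters also counts 'b', which never appears in l2.
total = Counter(l1) + Counter(l2)
naive = sum(1 for n in total.values() if n > 1)

# Deduplicate each list first so a count of 2 really means "in both lists".
per_list = Counter(set(l1)) + Counter(set(l2))
in_both = sum(1 for n in per_list.values() if n == 2)

print(naive)    # 3
print(in_both)  # 2
```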
Related
Iterate lists at intervals based on list values
I've been trying to accomplish this in a few different ways and just can't quite seem to get it to work for me. I'm trying to iterate over a list in blocks, where the first index value is an integer for how many elements are in the first block. After that, another integer with n elements, and another, etc. Example: test = [3, 'a', 'b', 'c', 2, 'd', 'e', 3, 'f', 'g', 'h'] I want to read 3, pull 'a', 'b', 'c' from the list and perform some operation on them. Then return to the list at 2, pull 'd', 'e' - more operations, etc. Or even just using the integers to split into sub-lists would work. I'm thinking list slicing with updated [start:stop:step] variables but am having trouble pulling it together. Any suggestions? Can only use the standard Python library.
You could create a generator to iterate lazily on the parts of the list:
test = [3, 'a', 'b', 'c', 2, 'd', 'e', 3, 'f', 'g', 'h']
def parts(lst):
    idx = 0
    while idx < len(lst):
        part_length = lst[idx]
        yield lst[idx+1: idx + part_length + 1 ]
        idx += part_length+1
for part in parts(test):
    print(part)
Output:
['a', 'b', 'c']
['d', 'e']
['f', 'g', 'h']
If your input structure is always like this you can do the following:
result = [test[i:i+j] for i, j in enumerate(test, 1) if isinstance(j, int)]
print(result)
# [['a', 'b', 'c'], ['d', 'e'], ['f', 'g', 'h']]
Using an iterator on the list makes this super simple. Just grab the next item which tells you how much more to grab next, and so on until the end of the list:
test = [3, 'a', 'b', 'c', 2, 'd', 'e', 3, 'f', 'g', 'h']
it = iter(test)
for num in it:
    print(", ".join(next(it) for _ in range(num)))
which prints:
a, b, c
d, e
f, g, h
You can also convert this to a list if you need to save the result:
>>> it = iter(test)
>>> [[next(it) for _ in range(num)] for num in it]
[['a', 'b', 'c'], ['d', 'e'], ['f', 'g', 'h']]
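The iterator idea can also be sketched with itertools.islice, which stops quietly if the last block is shorter than its declared length instead of raising from next(). The function name split_blocks is hypothetical, and returning a short final block rather than failing is a design assumption you may not want.

```python
from itertools import islice

def split_blocks(items):
    """Hypothetical helper: split [n, x1..xn, m, y1..ym, ...] into sub-lists."""
    it = iter(items)
    blocks = []
    for size in it:
        # islice takes up to `size` further items from the same iterator
        # and simply stops early if the input runs out.
        blocks.append(list(islice(it, size)))
    return blocks

test = [3, 'a', 'b', 'c', 2, 'd', 'e', 3, 'f', 'g', 'h']
print(split_blocks(test))      # [['a', 'b', 'c'], ['d', 'e'], ['f', 'g', 'h']]
print(split_blocks([2, 'x']))  # [['x']]  (truncated final block)
```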
Removing duplicate characters from a list in Python where the pattern repeats
I am monitoring a serial port that sends data that looks like this: ['','a','a','a','a','a','a','','b','b','b','b','b','b','b','b', '','','c','c','c','c','c','c','','','','d','d','d','d','d','d','d','d', '','','e','e','e','e','e','e','','','a','a','a','a','a','a', '','','','b','b','b','b','b','b','b','b','b','','','c','c','c','c','c','c', '','','','d','d','d','d','d','d','','','e','e','e','e','e','e', '','','a','a','a','a','a','a','','b','b','b','b','b','b','b','b', '','','c','c','c','c','c','c','','','','d','d','d','d','d','d','d','d', '','','e','e','e','e','e','e','','','a','a','a','a','a','a', '','','','b','b','b','b','b','b','b','b','b','','','c','c','c','c','c','c', '','','','d','d','d','d','d','d','','','e','e','e','e','e','e','',''] I need to be able to convert this into: ['a','b','c','d','a','b','c','d','a','b','c','d','a','b','c','d'] So I'm removing duplicates and empty strings, but also retaining the number of times the pattern repeats itself. I haven't been able to figure it out. Can someone help?
Here's a solution using a list comprehension and itertools.zip_longest: keep an element only if it's not an empty string, and not equal to the next element. You can use an iterator to skip the first element, to avoid the cost of slicing the list.
from itertools import zip_longest
def remove_consecutive_duplicates(lst):
    ahead = iter(lst)
    next(ahead)
    return [ x for x, y in zip_longest(lst, ahead) if x and x != y ]
Usage:
>>> remove_consecutive_duplicates([1, 1, 2, 2, 3, 1, 3, 3, 3, 2])
[1, 2, 3, 1, 3, 2]
>>> remove_consecutive_duplicates(my_list)
['a', 'b', 'c', 'd', 'e', 'a', 'b', 'c', 'd', 'e', 'a', 'b', 'c', 'd', 'e', 'a', 'b', 'c', 'd', 'e']
I'm assuming either that there are no duplicates separated by empty strings (e.g. 'a', '', 'a'), or that you don't want to remove such duplicates. If this assumption is wrong, then you should filter out the empty strings first:
>>> example = ['a', '', 'a']
>>> remove_consecutive_duplicates([ x for x in example if x ])
['a']
You can loop over the list and add the appropriate conditions. For the output you expect, you just need to check that the current character is non-empty and differs from the previous one:
current_sequence = ['','a','a','a','a','a','a','','b','b','b','b','b','b','b','b','','','c','c','c','c','c','c','','','','d','d','d','d','d','d','d','d','','','e','e','e','e','e','e','','','a','a','a','a','a','a','','','','b','b','b','b','b','b','b','b','b','','','c','c','c','c','c','c','','','','d','d','d','d','d','d','','','e','e','e','e','e','e','','','a','a','a','a','a','a','','b','b','b','b','b','b','b','b','','','c','c','c','c','c','c','','','','d','d','d','d','d','d','d','d','','','e','e','e','e','e','e','','','a','a','a','a','a','a','','','','b','b','b','b','b','b','b','b','b','','','c','c','c','c','c','c','','','','d','d','d','d','d','d','','','e','e','e','e','e','e','','']
sequence_list = []
for x in range(len(current_sequence)):
    if current_sequence[x]:
        # x == 0 has no previous element; index -1 would wrap to the end of the list
        if x == 0 or current_sequence[x] != current_sequence[x-1]:
            sequence_list.append(current_sequence[x])
print(sequence_list)
You need something like this:
li = ['','a','a','a','a','a','a','','b','b','b','b','b','b','b','b','','','c','c','c','c','c','c','','','','d','d','d','d','d','d','d','d','','','e','e','e','e','e','e','','','a','a','a','a','a','a','','','','b','b','b','b','b','b','b','b','b','','','c','c','c','c','c','c','','','','d','d','d','d','d','d','','','e','e','e','e','e','e','','','a','a','a','a','a','a','','b','b','b','b','b','b','b','b','','','c','c','c','c','c','c','','','','d','d','d','d','d','d','d','d','','','e','e','e','e','e','e','','','a','a','a','a','a','a','','','','b','b','b','b','b','b','b','b','b','','','c','c','c','c','c','c','','','','d','d','d','d','d','d','','','e','e','e','e','e','e','','']
new_li = []
e_ = ''
for e in li:
    if len(e) > 0 and e_ != e:
        new_li.append(e)
    e_ = e
print(new_li)
Output:
['a', 'b', 'c', 'd', 'e', 'a', 'b', 'c', 'd', 'e', 'a', 'b', 'c', 'd', 'e', 'a', 'b', 'c', 'd', 'e']
You can use itertools.groupby. If your list is ll:
from itertools import groupby
ll = [i for i in ll if i]
out = []
for k, g in groupby(ll, key=lambda x: ord(x)):
    out.append(chr(k))
print(out)
#prints ['a', 'b', 'c', 'd', 'e', 'a', 'b', 'c', 'd', 'e', ...
from itertools import groupby
from operator import itemgetter
# data <- your data
a = [k for k, v in groupby(data) if k]  # approach 1
b = list(filter(bool, map(itemgetter(0), groupby(data))))  # approach 2
assert a == b
print(a)
Result:
['a', 'b', 'c', 'd', 'e', 'a', 'b', 'c', 'd', 'e', 'a', 'b', 'c', 'd', 'e', 'a', 'b', 'c', 'd', 'e']
Using set you can remove the duplicates from the list. Note that a set also discards the order and the number of times the pattern repeats, and it keeps the empty string, so this gives only the distinct values:
data = ['','a','a','a','a','a','a','','b','b','b','b','b','b','b','b', '','','c','c','c','c','c','c','','','','d','d','d','d','d','d','d','d', '','','e','e','e','e','e','e','','','a','a','a','a','a','a', '','','','b','b','b','b','b','b','b','b','b','','','c','c','c','c','c','c', '','','','d','d','d','d','d','d','','','e','e','e','e','e','e', '','','a','a','a','a','a','a','','b','b','b','b','b','b','b','b', '','','c','c','c','c','c','c','','','','d','d','d','d','d','d','d','d', '','','e','e','e','e','e','e','','','a','a','a','a','a','a', '','','','b','b','b','b','b','b','b','b','b','','','c','c','c','c','c','c', '','','','d','d','d','d','d','d','','','e','e','e','e','e','e','','']
print(set(data))
Remove elements from nested list - Python
data = [['A', 'B', 'C', 'D'], ['E', 'F', 'G'], ['I', 'J'], ['A', 'B', 'C', 'E', 'F']] I would like to remove unpopular elements (appearing only once) from the lists. So the results should look like this: data = [['A', 'B', 'C'], ['E', 'F'], ['A', 'B', 'C', 'E', 'F']] I was able to count the frequency of each element using the following codes: from collections import Counter Counter(x for sublist in data for x in sublist) #output Counter({'A': 2, 'C': 2, 'B': 2, 'E': 2, 'F': 2, 'D': 1, 'G': 1, 'I': 1, 'J': 1}) However, I am not sure how to use this count information to remove unpopular elements from the list. Any help?
Generate the new list based on the frequency information. The following code uses nested list comprehension to do that:
from collections import Counter
freq = Counter(x for sublist in data for x in sublist)
data = [[x for x in row if freq[x] > 1] for row in data]  # Remove non-popular item
data = [row for row in data if row]  # Remove empty rows
# data => [['A', 'B', 'C'], ['E', 'F'], ['A', 'B', 'C', 'E', 'F']]
The complexity is similar. You can also use the map and filter functions for a more functional style. In Python 3 both return lazy iterators, so wrap them in list() to see the result. Note that the row ['I', 'J'] is left behind as an empty list here:
from collections import Counter
data = [['A', 'B', 'C', 'D'], ['E', 'F', 'G'], ['I', 'J'], ['A', 'B', 'C', 'E', 'F']]
counter = Counter({'A': 2, 'C': 2, 'B': 2, 'E': 2, 'F': 2, 'D': 1, 'G': 1, 'I': 1, 'J': 1})
result = list(map(lambda row: list(filter(lambda x: counter[x] > 1, row)), data))
print(result)
Keep strings that occur N times or more
I have a list that is
mylist = ['a', 'a', 'a', 'b', 'b', 'c', 'c', 'd']
And I used Counter from collections on this list to get the result:
from collections import Counter
counts = Counter(mylist)
#Counter({'a': 3, 'c': 2, 'b': 2, 'd': 1})
Now I want to subset this so that I have all elements that occur some number of times, for example: 2 times or more - so that the output looks like this:
['a', 'b', 'c']
This seems like it should be a simple task - but I have not found anything that has helped me so far. Can anyone suggest somewhere to look? I am also not attached to using Counter if I have taken the wrong approach. I should note I am new to Python so I apologise if this is trivial.
[s for s, c in counts.items() if c >= 2]  # on Python 2, use counts.iteritems()
# => ['a', 'b', 'c']
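The comprehension above can be wrapped so the threshold becomes a parameter. This is a minimal sketch for Python 3; the helper name at_least is made up for illustration, and the result order relies on Counter keeping first-seen order (Python 3.7+).

```python
from collections import Counter

def at_least(items, n):
    """Hypothetical helper: values that occur n or more times, in first-seen order."""
    counts = Counter(items)
    return [value for value, count in counts.items() if count >= n]

mylist = ['a', 'a', 'a', 'b', 'b', 'c', 'c', 'd']
print(at_least(mylist, 2))  # ['a', 'b', 'c']
print(at_least(mylist, 3))  # ['a']
```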
Try this...
def get_duplicatesarrval(arrval):
    dup_array = arrval[:]
    # remove one occurrence of every distinct value; what is left occurred at least twice
    for i in set(arrval):
        dup_array.remove(i)
    return list(set(dup_array))
mylist = ['a', 'a', 'a', 'b', 'b', 'c', 'c', 'd']
print(get_duplicatesarrval(mylist))
Result (set order is arbitrary):
['a', 'b', 'c']
The usual way would be to use a list comprehension as @Adaman does. In the special case of 2 or more, you can also subtract one Counter from another:
>>> counts = Counter(mylist) - Counter(set(mylist))
>>> list(counts)
['a', 'b', 'c']
from itertools import groupby
mylist = ['a', 'a', 'a', 'b', 'b', 'c', 'c', 'd']
# groupby only merges adjacent equal items, so mylist must be sorted (as it is here)
res = [i for i,j in groupby(mylist) if len(list(j))>=2]
print(res)
['a', 'b', 'c']
I think the answers above are better, but I believe this is the simplest method to understand:
mylist = ['a', 'a', 'a', 'b', 'b', 'c', 'c', 'd']
newlist=[]
newlist.append(mylist[0])
for i in mylist:
    if i in newlist:
        continue
    else:
        newlist.append(i)
print(newlist)
>>>['a', 'b', 'c', 'd']
Python, work with list, find max sequence length
For example, given test_list:
test_list = ['a', 'a', 'a', 'b', 'b', 'a', 'c', 'b', 'a', 'a']
what tool or algorithm do I need to use to get the maximum run length of each element? For this example:
'a' = 3
'b' = 2
'c' = 1
Using a dict to track max lengths, and itertools.groupby to group the sequences by consecutive value:
from itertools import groupby
max_count = {}
for val, grp in groupby(test_list):
    count = sum(1 for _ in grp)
    if count > max_count.get(val, 0):
        max_count[val] = count
Demo:
>>> from itertools import groupby
>>> test_list = ['a', 'a', 'a', 'b', 'b', 'a', 'c', 'b', 'a', 'a']
>>> max_count = {}
>>> for val, grp in groupby(test_list):
...     count = sum(1 for _ in grp)
...     if count > max_count.get(val, 0):
...         max_count[val] = count
...
>>> max_count
{'a': 3, 'c': 1, 'b': 2}
Here is a direct way to do it:
Counts, Count, Last_item = {}, 0, None
test_list = ['a', 'a', 'a', 'b', 'b', 'a', 'c', 'b', 'a', 'a']
for item in test_list:
    if Last_item == item:
        Count+=1
    else:
        Count=1
        Last_item=item
    if Count>Counts.get(item, 0):
        Counts[item]=Count
print(Counts)
# {'a': 3, 'b': 2, 'c': 1}
You should read about what a dictionary is (dict in Python) and how you could store how many occurrences there are for a sequence. Then figure out how to code the logic:
- Figure out how to loop over your list. As you go, for every item:
  - If it isn't the same as the previous item, store how many times you saw the previous item in a row into the dictionary
  - Else, increment how many times you've seen the item in the current sequence
- Print your results
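The outline above might be sketched as follows. This is one possible reading of those steps, not the answerer's own code: the name longest_runs is hypothetical, and the extra store after the loop is an addition, needed because the final run is never followed by a different item.

```python
def longest_runs(items):
    """Hypothetical helper: map each value to the length of its longest run."""
    longest = {}
    previous, run = None, 0
    for item in items:
        if run and item == previous:
            run += 1  # same item as before: the current run grows
        else:
            if run:
                # the item changed: record the run that just ended
                longest[previous] = max(longest.get(previous, 0), run)
            previous, run = item, 1
    if run:
        # the last run is never followed by a change, so record it here
        longest[previous] = max(longest.get(previous, 0), run)
    return longest

test_list = ['a', 'a', 'a', 'b', 'b', 'a', 'c', 'b', 'a', 'a']
print(longest_runs(test_list))  # {'a': 3, 'b': 2, 'c': 1}
```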
You can use the re module to find all runs of each character in a string built by joining the characters in your list. Then just pick the longest run for each character.
import re
test_list = ['a', 'a', 'b', 'b', 'a', 'c', 'b', 'a', 'a', 'a']
# First obtain the characters.
unique = set(test_list)
max_count = {}
for elem in unique:
    # Find all sequences for the same character (escaped in case it is a regex metacharacter).
    result = re.findall('{0}+'.format(re.escape(elem)), "".join(test_list))
    # Find the longest.
    longest = max(result, key=len)
    # Save result.
    max_count[elem] = len(longest)
print(max_count)
This will print (key order may vary):
{'c': 1, 'b': 2, 'a': 3}
For Python, Martijn Pieters' groupby is the best answer. That said, here is a 'basic' way to do it that could be translated to any language:
test_list = ['a', 'a', 'a', 'b', 'b', 'a', 'c', 'b', 'a', 'a']
hm={}.fromkeys(set(test_list), 0)
idx=0
ll=len(test_list)
while idx<ll:
    item=test_list[idx]
    start=idx
    while idx<ll and test_list[idx]==item:
        idx+=1
    end=idx
    hm[item]=max(hm[item],end-start)
print(hm)
# {'a': 3, 'c': 1, 'b': 2} (key order may vary)