Python, work with list, find max sequence length - python

For example, given test_list:
test_list = ['a', 'a', 'a', 'b', 'b', 'a', 'c', 'b', 'a', 'a']
what tool or algorithm do I need to get the length of the longest consecutive run of each value? For this example:
'a' = 3
'b' = 2
'c' = 1

Using a dict to track max lengths, and itertools.groupby to group the sequences by consecutive value:
from itertools import groupby
max_count = {}
for val, grp in groupby(test_list):
    count = sum(1 for _ in grp)
    if count > max_count.get(val, 0):
        max_count[val] = count
Demo:
>>> from itertools import groupby
>>> test_list = ['a', 'a', 'a', 'b', 'b', 'a', 'c', 'b', 'a', 'a']
>>> max_count = {}
>>> for val, grp in groupby(test_list):
...     count = sum(1 for _ in grp)
...     if count > max_count.get(val, 0):
...         max_count[val] = count
...
>>> max_count
{'a': 3, 'c': 1, 'b': 2}
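To see why this works, it can help to look at what groupby actually yields. A minimal sketch, assuming the same test_list as in the question (the name runs is only illustrative):

```python
from itertools import groupby

test_list = ['a', 'a', 'a', 'b', 'b', 'a', 'c', 'b', 'a', 'a']

# Each group is one run of consecutive equal values; record (value, run length).
runs = [(val, sum(1 for _ in grp)) for val, grp in groupby(test_list)]
print(runs)
# [('a', 3), ('b', 2), ('a', 1), ('c', 1), ('b', 1), ('a', 2)]
```

The loop in the answer then only has to keep the largest run length seen for each value.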

Here is a direct way to do it:
counts, count, last_item = {}, 0, None
test_list = ['a', 'a', 'a', 'b', 'b', 'a', 'c', 'b', 'a', 'a']
for item in test_list:
    if last_item == item:
        count += 1          # still inside the same run
    else:
        count = 1           # a new run starts here
        last_item = item
    if count > counts.get(item, 0):
        counts[item] = count
print(counts)
# {'a': 3, 'b': 2, 'c': 1}

You should read about what a dictionary is (dict in Python) and how you could use one to store the longest run seen so far for each item.
Then figure out how to code the logic:
Figure out how to loop over your list. As you go, for every item:
    If it isn't the same as the previous item:
        Store how many times you saw the previous item in a row into the dictionary (keep the larger value if that item already has an entry), and restart the count at 1
    Else:
        Increment how many times you've seen the item in the current sequence
After the loop, store the count for the final sequence as well
Print your results
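The steps above could be sketched roughly like this. It is only one possible reading of them; the names longest_runs and store are made up for illustration, and the final store after the loop is an assumption needed so that the last run is not lost:

```python
def longest_runs(items):
    """Return {value: length of its longest consecutive run}."""
    longest = {}
    previous = None   # value of the run currently being counted
    run_length = 0    # 0 means no run has started yet

    def store(value, length):
        # Keep only the largest run length seen for each value.
        if length > longest.get(value, 0):
            longest[value] = length

    for item in items:
        if run_length and item == previous:
            run_length += 1
        else:
            if run_length:
                store(previous, run_length)   # the previous run just ended
            previous, run_length = item, 1
    if run_length:
        store(previous, run_length)           # do not forget the final run
    return longest


test_list = ['a', 'a', 'a', 'b', 'b', 'a', 'c', 'b', 'a', 'a']
print(longest_runs(test_list))
```

Using run_length == 0 as the "nothing seen yet" marker avoids treating None as a special value, so the list may itself contain None.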

You can use the re module to find all runs of each character in a string built by joining the items of your list, then pick the longest run for each character. Note that this only works when every item is a single character.
import re
test_list = ['a', 'a', 'b', 'b', 'a', 'c', 'b', 'a', 'a', 'a']
# First obtain the distinct characters.
unique = set(test_list)
joined = "".join(test_list)
max_count = {}
for elem in unique:
    # Find all runs of the same character (re.escape guards against regex metacharacters).
    result = re.findall('{0}+'.format(re.escape(elem)), joined)
    # Find the longest.
    maximum = max(result, key=len)
    # Save result.
    max_count[elem] = len(maximum)
print(max_count)
This will print (key order may vary): {'c': 1, 'b': 2, 'a': 3}

For Python, Martijn Pieters' groupby is the best answer.
That said, here is a 'basic' way to do it that could be translated to any language:
test_list = ['a', 'a', 'a', 'b', 'b', 'a', 'c', 'b', 'a', 'a']
hm = dict.fromkeys(set(test_list), 0)
idx = 0
ll = len(test_list)
while idx < ll:
    item = test_list[idx]
    start = idx
    # advance idx to the end of the current run
    while idx < ll and test_list[idx] == item:
        idx += 1
    end = idx
    hm[item] = max(hm[item], end - start)
print(hm)
# {'a': 3, 'c': 1, 'b': 2}  (key order may vary)

Related

'Counter' object has no attribute 'count'

There are two lists and I want to check how many elements appear in both. Assume list one is l1 = ['a', 'b', 'c', 'd', 'e'] and list two is l2 = ['a', 'f', 'c', 'g']. Since a and c are in both lists, the output should be 2, meaning two elements are repeated across the lists. Below is my code; I want to count how many 2s are in the counter, but I am not sure how to do that.
l1 = ['a', 'b', 'c', 'd', 'e']
l2 = ['a', 'f', 'c', 'g']
from collections import Counter
c1 = Counter(l1)
c2 = Counter(l2)
sum = c1+c2
z=sum.count(2)
What you want is set.intersection (if there are no duplicates in each list):
l1 = ['a', 'b', 'c', 'd', 'e']
l2 = ['a', 'f', 'c', 'g']
print(len(set(l1).intersection(l2)))
Output:
2
Counter is a dict subclass, and dicts have no count method, which is why the error is raised. Instead, add the two counters and count the values greater than 1. Like the set approach, this assumes no element is repeated within a single list.
# Duplicate elements in 2 lists
l1 = ['a', 'b', 'c', 'd', 'e']
l2 = ['a', 'f', 'c', 'g']  # a, c are duplicates
from collections import Counter
c1 = Counter(l1)
c2 = Counter(l2)
total = c1 + c2  # named total so the built-in sum is not shadowed
j = total.values()
print(total)
print(j)
v = 0
for i in j:
    if i > 1:
        v = v + 1
print("Duplicate in lists:", v)
Output:
Counter({'a': 2, 'c': 2, 'b': 1, 'd': 1, 'e': 1, 'f': 1, 'g': 1})
dict_values([2, 1, 2, 1, 1, 1, 1])
Duplicate in lists: 2
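Both answers above assume that no element is repeated inside a single list. If that might not hold, one option is Counter's & operator, which keeps the minimum count of each element present in both counters. A small sketch (the names common, distinct_shared, total_shared, l3 and l4 are just for illustration):

```python
from collections import Counter

l1 = ['a', 'b', 'c', 'd', 'e']
l2 = ['a', 'f', 'c', 'g']

# Multiset intersection: each shared element with its minimum count.
common = Counter(l1) & Counter(l2)
distinct_shared = len(common)          # number of distinct shared elements
total_shared = sum(common.values())    # shared elements counted with multiplicity
print(distinct_shared, total_shared)   # 2 2

# With repeats inside the lists the two numbers differ.
l3 = ['a', 'a', 'b']
l4 = ['a', 'a', 'a', 'c']
repeated = Counter(l3) & Counter(l4)
print(len(repeated), sum(repeated.values()))  # 1 2
```

Which of the two numbers is "the" answer depends on whether repeated elements should count once or several times.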

Python: How to update dictionary with step-index from list

I am a week-old Python learner. Let's say I have:
list = ["a", "A", "b", "B", "c", "C"]
I need to put them into a dictionary so that the result looks like this:
dict = {"a": "A", "b": "B", "c": "C"}
I tried indexing the list inside dict.update({list[n::2]: list[n+1::2]}) with for n in range(0, (len(list)/2)).
I think I did something wrong. Please correct me.
Thank you in advance.
Try the following:
>>> lst = ['a', 'A', 'b', 'B', 'c', 'C']
>>> dct = dict(zip(lst[::2],lst[1::2]))
>>> dct
{'a': 'A', 'b': 'B', 'c': 'C'}
Explanation:
>>> lst[::2]
['a', 'b', 'c']
>>> lst[1::2]
['A', 'B', 'C']
>>> zip(lst[::2], lst[1::2])
# this actually gives a zip iterator which contains:
# [('a', 'A'), ('b', 'B'), ('c', 'C')]
>>> dict(zip(lst[::2], lst[1::2]))
# here each tuple is interpreted as key value pair, so finally you get:
{'a': 'A', 'b': 'B', 'c': 'C'}
NOTE: Don't give your variables the same names as Python built-ins such as list and dict.
Correct version of your program would be:
lst = ['a', 'A', 'b', 'B', 'c', 'C']
dct = {}
for n in range(0, len(lst), 2):  # step by 2 so n always points at a key
    dct.update({lst[n]: lst[n+1]})
print(dct)
Yours did not work because you used slices in each iteration, instead of accessing each individual element. lst[0::2] gives ['a', 'b', 'c'] and lst[1::2] gives ['A', 'B', 'C']. So for the first iteration, when n == 0 you are trying to update the dictionary with the pair ['a', 'b', 'c'] : ['A', 'B', 'C'] and you will get a type error as list can not be assigned as key to the dictionary as lists are unhashable.
You can use dictionary comprehension like this:
>>> l = list("aAbBcCdD")
>>> l
['a', 'A', 'b', 'B', 'c', 'C', 'd', 'D']
>>> { l[i] : l[i+1] for i in range(0,len(l),2)}
{'a': 'A', 'b': 'B', 'c': 'C', 'd': 'D'}
The code below also does what you ask:
a = ["a", "A", "b", "B", "c", "C", "d", "D"]
b = {}
for each in range(len(a)):
    if each % 2 == 0:
        b[a[each]] = a[each + 1]
print(b)
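Another way to pair up neighbouring items, without slicing or index arithmetic, is to zip a single iterator with itself. This is only a sketch of that idiom, reusing the lst from the answers above; it assumes the list has an even number of items (a trailing unpaired item would be silently dropped):

```python
lst = ['a', 'A', 'b', 'B', 'c', 'C']

it = iter(lst)
# zip pulls from the same iterator twice per pair: first the key, then the value.
dct = dict(zip(it, it))
print(dct)
# {'a': 'A', 'b': 'B', 'c': 'C'}
```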

how to print dictionary key in the even frequency order

I would like to print out the keys of a dictionary interleaved evenly according to their counts, like
a = {'A': 3, 'B': 5}
=> ['A', 'B', 'A', 'B', 'A', 'B', 'B', 'B']
a = {'A': 4, 'B': 1}
=> ['A', 'B', 'A', 'A', 'A']
I know I can use a while loop that prints each key and decrements its count until every count is 0, but is there a better way to do it?
def func(d: dict):
    res = []
    while any(i > 0 for i in d.values()):
        for k, c in d.items():
            if c > 0:
                res.append(k)
                d[k] -= 1
    return res
(I'm assuming you're using a version of Python that guarantees the iteration order of dictionaries)
Here's an itertools-y approach. It creates a generator for each letter that yields the letter the given number of times, and it combines all of them together with zip_longest so they get yielded evenly.
from itertools import repeat, zip_longest
def iterate_evenly(d):
    generators = [repeat(k, v) for k,v in d.items()]
    exhausted = object()
    for round in zip_longest(*generators, fillvalue=exhausted):
        for x in round:
            if x is not exhausted:
                yield x
print(list(iterate_evenly({"A": 3, "B": 5})))
print(list(iterate_evenly({"A": 4, "B": 1})))
Result:
['A', 'B', 'A', 'B', 'A', 'B', 'B', 'B']
['A', 'B', 'A', 'A', 'A']
You can do the same thing in fewer lines, although it becomes harder to read.
from itertools import repeat, zip_longest
def iterate_evenly(d):
    exhausted = object()
    return [x for round in zip_longest(*(repeat(k, v) for k,v in d.items()), fillvalue=exhausted) for x in round if x is not exhausted]
print(iterate_evenly({"A": 3, "B": 5}))
print(iterate_evenly({"A": 4, "B": 1}))
For a one-liner.
First, create a list with two elements: a list of As and a list of Bs:
>>> d = {'A': 3, 'B': 5}
>>> [[k]*v for k, v in d.items()]
[['A', 'A', 'A'], ['B', 'B', 'B', 'B', 'B']]
[k]*v means: a list with v ks. Second, interleave As and B. We need zip_longest because zip would stop after the end of the first list:
>>> import itertools
>>> list(itertools.zip_longest(*[[k]*v for k, v in d.items()]))
[('A', 'B'), ('A', 'B'), ('A', 'B'), (None, 'B'), (None, 'B')]
Now, just flatten the list and remove None values:
>>> [v for vs in itertools.zip_longest(*[[k]*v for k, v in d.items()]) for v in vs if v is not None]
['A', 'B', 'A', 'B', 'A', 'B', 'B', 'B']
Other example:
>>> d = {'A': 4, 'B': 1}
>>> [v for vs in itertools.zip_longest(*[[k]*v for k, v in d.items()]) for v in vs if v is not None]
['A', 'B', 'A', 'A', 'A']
You can just use sum with a generator comprehension:
res = sum(([key]*value for key, value in d.items()), [])
This exploits the fact that sum can "add" anything that can use the + operators, like lists, in addition to sequence multiplication ("A"*4 == "AAAA").
If you want the order to be randomized, use the random module:
from random import shuffle
shuffle(res)
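One caveat with the sum trick: adding lists this way builds a new list at every step, so it gets slow when there are many keys. A possible alternative, sketched here with the same d as in the earlier examples, is itertools.chain.from_iterable, which produces the same grouped (not interleaved) order:

```python
from itertools import chain

d = {'A': 3, 'B': 5}

# Flatten the per-key lists lazily instead of concatenating them one by one.
res = list(chain.from_iterable([key] * value for key, value in d.items()))
print(res)
# ['A', 'A', 'A', 'B', 'B', 'B', 'B', 'B']
```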
If, as Thierry Lathuille notes, you want to cycle through the values in the original order, you can use some itertools magic:
from itertools import chain, zip_longest
res = [*filter(
    bool,  # drop the None padding (this also drops any falsy keys)
    chain(*zip_longest(
        *([key]*val for key, val in d.items()))
    )
)]
As an alternative to the replication & zip_longest approach, let's try to simplify the OP's original code:
def function(dictionary):
    result = []
    while dictionary:
        result.extend(dictionary)
        dictionary = {k: v - 1 for k, v in dictionary.items() if v > 1}
    return result
print(function({'A': 3, 'B': 5}))
print(function({'A': 4, 'B': 1}))
OUTPUT
% python3 test.py
['A', 'B', 'A', 'B', 'A', 'B', 'B', 'B']
['A', 'B', 'A', 'A', 'A']
%
Although it might look otherwise, it's not destructive on the dictionary argument, unlike the OP's original code.
It could also be done using a sort of the (position,character) tuples formed by expanding each dictionary entry:
a = {'A': 3, 'B': 5}
result = [c for _,c in sorted( (p,c) for c,n in a.items() for p,c in enumerate(c*n))]
print(result) # ['A', 'B', 'A', 'B', 'A', 'B', 'B', 'B']
If the dictionary's order is usable, you can forgo the sort and use this:
result = [c for i in range(max(a.values())) for c,n in a.items() if i<n]

Keep strings that occur N times or more

I have a list that is
mylist = ['a', 'a', 'a', 'b', 'b', 'c', 'c', 'd']
And I used Counter from collections on this list to get the result:
from collections import Counter
counts = Counter(mylist)
#Counter({'a': 3, 'c': 2, 'b': 2, 'd': 1})
Now I want to subset this so that I have all elements that occur some number of times, for example: 2 times or more - so that the output looks like this:
['a', 'b', 'c']
This seems like it should be a simple task - but I have not found anything that has helped me so far.
Can anyone suggest somewhere to look? I am also not attached to using Counter if I have taken the wrong approach. I should note I am new to python so I apologise if this is trivial.
[s for s, c in counts.items() if c >= 2]
# => ['a', 'b', 'c']
Try this...
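def get_duplicatesarrval(arrval):
    dup_array = arrval[:]
    # remove one occurrence of every distinct value; whatever is left occurred at least twice
    for i in set(arrval):
        dup_array.remove(i)
    return list(set(dup_array))
mylist = ['a', 'a', 'a', 'b', 'b', 'c', 'c', 'd']
print(get_duplicatesarrval(mylist))
Result (order may vary):
['a', 'b', 'c']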
The usual way would be to use a list comprehension as @Adaman does.
In the special case of 2 or more, you can also subtract one Counter from another:
>>> counts = Counter(mylist) - Counter(set(mylist))
>>> list(counts)
['a', 'b', 'c']
from itertools import groupby
mylist = ['a', 'a', 'a', 'b', 'b', 'c', 'c', 'd']
# groupby only merges adjacent equal items, so this relies on mylist being sorted
res = [i for i, j in groupby(mylist) if len(list(j)) >= 2]
print(res)
['a', 'b', 'c']
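The groupby version above counts runs of adjacent items, so on an unsorted list it would miss values whose occurrences are spread out. A sketch that sorts first and takes the threshold as a parameter (the function name at_least and the sample list are made up for illustration, and the items are assumed to be sortable):

```python
from itertools import groupby


def at_least(items, n):
    """Return the distinct values that occur n or more times, in sorted order."""
    # Sorting puts equal items next to each other so groupby sees one run per value.
    return [key for key, grp in groupby(sorted(items)) if sum(1 for _ in grp) >= n]


unsorted_list = ['a', 'b', 'a', 'c', 'b', 'a', 'd', 'c']
print(at_least(unsorted_list, 2))
# ['a', 'b', 'c']
```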
I think above mentioned answers are better, but I believe this is the simplest method to understand:
mylist = ['a', 'a', 'a', 'b', 'b', 'c', 'c', 'd']
# note: this keeps every distinct item once; it does not filter by count
newlist = []
newlist.append(mylist[0])
for i in mylist:
    if i in newlist:
        continue
    else:
        newlist.append(i)
print(newlist)
>>>['a', 'b', 'c', 'd']

Reiterating over lists and dictionaries

In the following code, why does my code not iterate properly? I'm probably missing one line but I can't figure out why it doesn't work.
I have a function with the following test case:
>>> borda([['A', 'B', 'C', 'D'], ['B', 'A', 'C', 'D'], ['B', 'C', 'D', 'A']])
('B', [5, 8, 4, 1])
The lists in the parameter are rankings: each #1 rank gets 3 points, #2 gets 2 points, #3 gets 1 point, and no other ranks get anything. There may not necessarily be four choices. The first element in the tuple should be the choice with the highest number of points, and the second element is the number of points each choice got, in alphabetical order of the choices.
I'm not done with the function yet. I'm trying to build a dictionary with the choices as keys, in alphabetical order, and their point totals as values, but the output is a dictionary containing only the very last element of the last list in the parameter.
L = ['A', 'B', 'C', 'D'] #This is referenced outside the function since it might change
D = {}
i = 0
num = 0
while num < len(L):
    num += 1
    for choice in L:
        while i < len(parameter):
            for item in parameter:
                if item[0] == choice:
                    D[choice] = D.get(choice, 0) + 3
                if item[1] == choice:
                    D[choice] = D.get(choice, 0) + 2
                if item[2] == choice:
                    D[choice] = D.get(choice, 0) + 1
            i += 1
return D
The way I'd do this is something like this:
import operator
from collections import defaultdict
listoflists = [['A', 'B', 'C', 'D'], ['B', 'A', 'C', 'D'], ['B', 'C', 'D', 'A']]
def borda(listoflists):
    outdict = defaultdict(int)
    for item in listoflists:
        outdict[item[0]] += 3
        outdict[item[1]] += 2
        outdict[item[2]] += 1
    highestitem = max(outdict.items(), key=operator.itemgetter(1))[0]
    # scores in alphabetical order of the choices
    outlist = [outdict[item] for item in sorted(outdict)]
    return (highestitem, outlist)
Update:
I'm not sure why you wouldn't be able to import standard modules, but if for whatever reason you're forbidden from using the import statement, here's a version with only built-in functions:
listoflists = [['A', 'B', 'C', 'D'], ['B', 'A', 'C', 'D'], ['B', 'C', 'D', 'A']]
def borda(listoflists):
    outdict = {}
    for singlelist in listoflists:
        # Below, we're just turning singlelist around in order to
        # make use of index numbers from enumerate to add to the scores
        for index, item in enumerate(singlelist[2::-1]):
            if item not in outdict:
                outdict[item] = index + 1
            else:
                outdict[item] += index + 1
    highestitem = max(outdict.items(), key=lambda i: i[1])[0]
    outlist = [outdict[item] for item in sorted(outdict)]
    return (highestitem, outlist)
If you had 2.7:
import operator
from collections import Counter
listoflists = [['A', 'B', 'C', 'D'], ['B', 'A', 'C', 'D'], ['B', 'C', 'D', 'A']]
def borda(listoflists):
    outdict = sum([Counter({item[x]: 3 - x}) for item in listoflists for x in range(3)],
                  Counter())
    highestitem = max(outdict.items(), key=operator.itemgetter(1))[0]
    outlist = [outdict[item[0]] for item in sorted(outdict.items(),
                                                    key=operator.itemgetter(0))]
    return (highestitem, outlist)
Look ma.. no loops :-)
Check out http://ua.pycon.org/static/talks/kachayev/index.html to see why this is better.
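Since the question says there may not always be four choices, here is a sketch that does not index positions 0 to 2 directly, so rankings with fewer than three entries do not raise an IndexError. The points parameter and the tie-breaking rule (alphabetically first choice wins) are assumptions of this sketch, not part of the original question:

```python
def borda(rankings, points=(3, 2, 1)):
    """Return (winner, scores), with scores in alphabetical order of the choices."""
    scores = {}
    for ranking in rankings:
        for position, choice in enumerate(ranking):
            # Positions beyond the points table earn nothing but are still listed.
            award = points[position] if position < len(points) else 0
            scores[choice] = scores.get(choice, 0) + award
    ordered = sorted(scores)
    # max returns the first maximal element, so ties go to the alphabetically first choice.
    winner = max(ordered, key=scores.get)
    return winner, [scores[choice] for choice in ordered]


print(borda([['A', 'B', 'C', 'D'], ['B', 'A', 'C', 'D'], ['B', 'C', 'D', 'A']]))
# ('B', [5, 8, 4, 1])
```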
