Sort a list by frequency and value - python

I am trying to solve the following problem: a function takes a list A. The result must be an ordered list of lists, where each inner list contains the elements that have the same frequency in the original list A.
Example:
Input: [3, 1, 2, 2, 4]
Output: [[1, 3, 4], [2, 2]]
I managed to sort the initial list A and determine the frequency of each element.
However, I do not know how to split the original list A based on the frequencies.
My code:
from collections import Counter
def customSort(arr):
    counter = Counter(arr)
    # sort by frequency first, then by value
    y = sorted(arr, key=lambda x: (counter[x], x))
    print(y)
    x = Counter(arr)
    a = sorted(x.values())
    print(a)
customSort([3,1,2,2,4])
My current output:
[1, 3, 4, 2, 2]
[1, 1, 1, 2]

You can use a defaultdict of lists and iterate your Counter:
from collections import defaultdict, Counter
def customSort(arr):
    counter = Counter(arr)
    dd = defaultdict(list)
    for value, count in counter.items():
        # repeat each value as many times as it occurs
        dd[count].extend([value]*count)
    return dd
res = customSort([3,1,2,2,4])
# defaultdict(list, {1: [3, 1, 4], 2: [2, 2]})
This gives additional information, i.e. the key represents how many times the values in the lists are seen. If you require a list of lists, you can simply access values:
res = list(res.values())
# [[3, 1, 4], [2, 2]]
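If you need exactly the output from the question (groups ordered by frequency, values sorted inside each group), here is a minimal sketch of one way to adapt the idea above. The name group_by_frequency is made up for illustration, and this is only one possible variation, not the only way to do it.

```python
from collections import Counter, defaultdict

def group_by_frequency(arr):
    # hypothetical helper: same defaultdict idea, but with sorted output
    counter = Counter(arr)
    groups = defaultdict(list)
    # visit the distinct values in ascending order so each group ends up sorted
    for value in sorted(counter):
        count = counter[value]
        groups[count].extend([value] * count)
    # emit the groups from lowest to highest frequency
    return [groups[count] for count in sorted(groups)]

print(group_by_frequency([3, 1, 2, 2, 4]))  # [[1, 3, 4], [2, 2]]
```

Sorting the distinct values first means no second sort per group is needed.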

Doing the grunt work suggested by Scott Hunter (Python 3):
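#!/usr/bin/env python3
from collections import Counter
def custom_sort(arr):
    groups = {}
    # map each frequency to the sorted list of distinct values with that frequency
    for value, count in sorted(Counter(arr).items()):
        groups.setdefault(count, []).append(value)
    # repeat each group `count` times so every occurrence appears in the output
    return [values * count for count, values in groups.items()]
if __name__ == '__main__':
    print(custom_sort([3, 1, 2, 2, 4]))  # [[1, 3, 4], [2, 2]]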
For Python 2.7 or lower use iteritems() instead of items()
Partially taken from this answer

Having sorted the list as you do:
from collections import Counter
from itertools import groupby
counter = Counter(x)
y = sorted(x, key=lambda v: (counter[v], v))
#[1, 3, 4, 2, 2]
You could then use itertools.groupby, using the result from Counter(x) in the key argument to create groups according to the counts:
[list(v) for k,v in groupby(y, key = lambda v: counter[v])]
#[[1, 3, 4], [2, 2]]

Find your maximum frequency, and create a list of that many empty lists.
Loop over your values, and add each to the element of the above corresponding to its frequency.
There might be something in the collections module that does at least part of the above.

Another variation on the same theme, using a Counter to get the counts and then inserting the elements into the respective position in the result list-of-lists. This retains the original order of the elements (does not group same elements together) and keeps empty lists for absent counts.
>>> lst = [1,4,2,3,4,3,2,5,4,4]
>>> import collections
>>> counts = collections.Counter(lst)
>>> res = [[] for _ in range(max(counts.values()))]
>>> for x in lst:
...     res[counts[x]-1].append(x)
...
>>> res
[[1, 5], [2, 3, 3, 2], [], [4, 4, 4, 4]]

A bit late to the party, but with plain Python:
test = [3, 1, 2, 2, 4]
def my_sort(arr):
    # count[x] holds (occurrences of x) - 1, so it doubles as a 0-based index into res
    count = {}
    for x in arr:
        if x in count:
            count[x] += 1
        else:
            count[x] = 0
    max_frequency = max(count.values()) + 1
    res = [[] for i in range(max_frequency)]
    for k,v in count.items():
        for j in range(v + 1):
            res[v].append(k)
    return res
print(my_sort(test))

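Using only Python's built-in functions, no imports and a single for loop. Note that this only separates values that occur once from values that occur more than once.
def customSort(mylist):
    # keep the result lists local so repeated calls do not accumulate values
    l1 = []
    l2 = []
    sl = sorted(mylist)
    for i in sl:
        n = sl.count(i)
        if n > 1:
            l1.append(i)
        if i not in l1:
            l2.append(i)
    return [l2, l1]
print(customSort([3, 1, 2, 2, 4]))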
Output:
[[1, 3, 4], [2, 2]]

Related

Extracting lists from tuple with a condition

I've been trying to extract lists from this tuple:
E=tuple([random.randint(0,10) for x in range(10)])
Let's say the result is (3,4,5,0,0,3,4,2,2,4).
I want to extract from this tuple lists of numbers in ascending order, without sorting the tuple or anything.
Example: [[3,4,5],[0,0,3,4],[2,2,4]]
You can create a custom function (generator in my example) to group ascending elements:
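def get_ascending(itr):
    lst = []
    for v in itr:
        if not lst:
            lst = [v]
        elif v < lst[-1]:
            # the run stopped ascending: emit it and start a new one
            yield lst
            lst = [v]
        else:
            lst.append(v)
    # emit the final run, unless the input was empty
    if lst:
        yield lst
E = 3, 4, 5, 0, 0, 3, 4, 2, 2, 4
print(list(get_ascending(E)))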
Prints:
[[3, 4, 5], [0, 0, 3, 4], [2, 2, 4]]

how to convert a list in python to a set according to index number

I have a list y=[0,2,1,2,1,1,2,1] with 8 elements (indexes 0 to 7).
Since it has three unique elements, three sets should be created.
I want the output to be
s1={0}
s2={1,3,6}
s3={2,4,5,7}
If I understood correctly, you want a set for each unique value in your list, holding the indexes of all occurrences of that value in the list.
We can't tell in advance how many sets we will need, so we should create a new set for each unique value and hold these sets in a list.
y = [0,2,1,2,1,1,2,1]
y_set = set(y)
set_list = []
for unique_value in y_set:
    new_set = set()
    for i, value in enumerate(y):
        if unique_value == value:
            new_set.add(i)
    set_list.append(new_set)
This is the result you will get from this approach:
[{0}, {2, 4, 5, 7}, {1, 3, 6}]
You can use enumerate and defaultdict:
from collections import defaultdict
my_dict = defaultdict(list)
for index, element in enumerate(y):
    my_dict[element].append(index)
result = my_dict.values()
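If you want actual sets in the order the values first appear (as in the question's s1, s2, s3), a small variation on the same enumerate/defaultdict idea could look like the sketch below. The function name index_sets is made up for illustration, and the first-seen ordering relies on dicts keeping insertion order, which is guaranteed from Python 3.7 on.

```python
from collections import defaultdict

def index_sets(values):
    # hypothetical helper: map each distinct value to the set of its indexes
    groups = defaultdict(set)
    for index, element in enumerate(values):
        groups[element].add(index)
    # dict order is first-seen order on Python 3.7+
    return list(groups.values())

s1, s2, s3 = index_sets([0, 2, 1, 2, 1, 1, 2, 1])
print(s1, s2, s3)
```

Here s1 holds the indexes of 0, s2 the indexes of 2 and s3 the indexes of 1, matching the grouping asked for in the question.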
I think it's better to use the value as the key; it's clearer for further steps.
y=[0,2,1,2,1,1,2,1]
index_map = {}  # renamed from `dict` to avoid shadowing the built-in
for i, x in enumerate(y):
    if x not in index_map:
        index_map[x] = [i]
    else:
        index_map[x].append(i)
print(index_map)
{0: [0], 2: [1, 3, 6], 1: [2, 4, 5, 7]}
And if you prefer to get exactly the output mentioned in the question:
from collections import OrderedDict
y= [0, 2, 1, 2, 1, 1, 2, 1]
dict2 = OrderedDict()
dict_seq = OrderedDict()
sequence = 1
for i, x in enumerate(y):
    if x not in dict2:
        dict2[x] = [i]
        # dict_seq shares the same list object, so later appends show up here too
        dict_seq['S{0}'.format(sequence)] = dict2[x]
        sequence+=1
    else:
        dict2[x].append(i)
print("dict2 = {0}".format(dict2))
for key, value in dict_seq.items():
    print("{0} = {1}\n".format(key, value))
dict2 = OrderedDict([(0, [0]), (2, [1, 3, 6]), (1, [2, 4, 5, 7])])
S1 = [0]
S2 = [1, 3, 6]
S3 = [2, 4, 5, 7]
You can try a manual approach:
y=[0,2,1,2,1,1,2,1]
group_by={}
for i,j in enumerate(y):
    if j not in group_by:
        group_by[j]=[i]
    else:
        group_by[j].append(i)
print(group_by.values())
or you can also try the itertools groupby approach:
import itertools
list_with_index=[(j,i) for i,j in enumerate(y)]
for i,j in itertools.groupby(sorted(list_with_index),key=lambda x:x[0]):
    print(list(map(lambda x:x[1],j)))
output:
[0]
[2, 4, 5, 7]
[1, 3, 6]
Try it like this:
from collections import defaultdict
lst = [0,2,1,2,1,1,2,1]
lst_indxs = dict(enumerate(lst))
final_dict = defaultdict(list)
for i,j in lst_indxs.items():
    final_dict[j].append(i)
print(dict(final_dict))
Try something like this:
y = [0,2,1,2,1,1,2,1]
d = {}
for i in set(y):
    d[i] = []
for j in range(len(y)):
    d[y[j]].append(j)
print(d)
Output:
{0: [0], 1: [2, 4, 5, 7], 2: [1, 3, 6]}

Checking pairs in a list?

Basically I have a program which takes a number and factors it down into its smallest factors (2, 3, 5, 7, and so on). I'm having trouble figuring out how to check whether there are one or more pairs of numbers inside a list. For example:
myList = [1,1,1,4,5,6,6,3,3,1]
In myList there are four 1's, which make two pairs. Each pair then needs to go into another list, but instead of adding both numbers of the pair, only one of them should be added.
For example:
myList = [1,1,1,4,5,6,6,3,3,1]
doubles = [1,1,6,3]
So, there are four ones, which in turn make two pairs, and each pair adds a single number to the new list.
This is similar to qarma's first solution, but it avoids the double for loop.
from collections import Counter
my_list = [1, 1, 1, 4, 5, 6, 6, 3, 3, 1, 7, 7, 7]
doubles = []
for k, v in Counter(my_list).items():
    doubles.extend([k] * (v // 2))
print(doubles)
output
[1, 1, 6, 3, 7]
Something like this?
>>> myList = [1,1,1,4,5,6,6,3,3,1]
>>> mySet = set()
>>> doubles = []
>>> for i in myList:
...     if i in mySet:
...         doubles.append(i)
...         mySet.remove(i)
...     else:
...         mySet.add(i)
...
>>> doubles
[1, 6, 3, 1]
Note - This doesn't preserve the order you seem to have expected in your question, i.e. [1, 1, 6, 3].
simple solution
import collections
[k for k, v in collections.Counter([1,1,1,4,5,6,6,3,3,1]).items() for _i in range(v // 2)]
[1, 1, 3, 6]
Counter is a kind of dict, so before Python 3.7 it doesn't keep insertion order (from 3.7 on it iterates in first-seen order, which gives [1, 1, 6, 3] here). Also, it compresses the input, so for example, input like 1, 1, 3, 3, 1, 1 is guaranteed to result in either 1, 1, 3 or 3, 1, 1 and never 1, 3, 1.
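To make the "compression" point concrete, here is a small sketch wrapping the one-liner in a function (pairs_grouped is a made-up name for illustration). Because Counter collapses all occurrences of a value into one entry, the pairs of the same value always come out next to each other.

```python
from collections import Counter

def pairs_grouped(values):
    # hypothetical wrapper around the one-liner: one output item per complete pair
    return [k for k, v in Counter(values).items() for _ in range(v // 2)]

result = pairs_grouped([1, 1, 3, 3, 1, 1])
# both pairs of 1 are adjacent; the 3 can never end up between them
print(result)
```

On Python 3.7 and later the order follows first appearance; on older versions only the adjacency of equal values is guaranteed.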
more complex
In [7]: def pairs(s):
   ...:     queue = set()
   ...:     for i in s:
   ...:         if i in queue:
   ...:             yield i
   ...:             queue.remove(i)
   ...:         else:
   ...:             queue.add(i)
   ...:
In [8]: list(pairs([1,1,1,4,5,6,6,3,3,1]))
Out[8]: [1, 6, 3, 1]
This preserves the order of pairs, but pairs are ordered according to the last item in a pair, e.g. 1, 9, 9, 1 becomes 9, 1.
even more complex
In [12]: def pairs(s):
    ...:     incomplete = dict()
    ...:     done = []
    ...:     for i, v in enumerate(s):
    ...:         if v in incomplete:
    ...:             done.append((incomplete[v], v))
    ...:             del incomplete[v]
    ...:         else:
    ...:             incomplete[v] = i
    ...:     return [v[1] for v in sorted(done)]
    ...:
    ...:
In [13]: pairs([1,1,1,4,5,6,6,3,3,1])
Out[13]: [1, 1, 6, 3]
Here, the original position of the first element of each pair is kept as a value in the incomplete dict, which makes it possible to reconstruct the original order according to the first item in a pair.
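The difference between the two orderings is easiest to see on the 1, 9, 9, 1 input. The sketch below restates both variants as plain functions so they can sit side by side; the names pairs_by_completion and pairs_by_first_item are made up for illustration and the logic is meant to mirror the two snippets above, not replace them.

```python
def pairs_by_completion(seq):
    # emit a pair as soon as its second item is seen
    open_items = set()
    result = []
    for item in seq:
        if item in open_items:
            result.append(item)
            open_items.remove(item)
        else:
            open_items.add(item)
    return result

def pairs_by_first_item(seq):
    # remember where each pair started, then order pairs by that position
    first_seen = {}
    done = []
    for pos, item in enumerate(seq):
        if item in first_seen:
            done.append((first_seen.pop(item), item))
        else:
            first_seen[item] = pos
    return [item for _, item in sorted(done)]

print(pairs_by_completion([1, 9, 9, 1]))  # [9, 1]
print(pairs_by_first_item([1, 9, 9, 1]))  # [1, 9]
```

The second version costs an extra sort over the completed pairs, which only matters for long inputs.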
Using defaultdict:
from collections import defaultdict
def func(lis):
    dic = defaultdict(int)
    for i in lis:
        dic[i]+=1
    list1 =[]
    for k,v in dic.items():
        if v>=2:
            # one entry per complete pair of k
            list1.append([k]*(v//2))
    return list1
myList = [1,1,1,4,5,6,6,3,3,1]
data = [j for i in func(myList) for j in i]
print(data)
# output
# [1,1,6,3]
Another option with Counter:
from collections import Counter
myList = [1,1,1,4,5,6,6,3,3,1]
dupes = Counter({k: v // 2 for k, v in Counter(myList).items()})
sorted(dupes.elements())
# [1, 1, 3, 6]

python: sum similar values in list

Is there an easy way to sum all similar values in a list using list comprehensions?
i.e. input:
[1, 2, 1, 3, 3]
expected output:
[6, 2, 2] (sorted)
I tried using zip, but it only works for max 2 similar values:
[x + y for (x, y) in zip(l[:-1], l[1:]) if x == y]
You can use Counter.
from collections import Counter
[x*c for x,c in Counter([1, 2, 1, 3, 3]).items()]
from itertools import groupby
a=[1, 2, 1,1,4,5,5,5,5, 3, 3]
print(sorted([sum(g) for i,g in groupby(sorted(a))],reverse=True))
#output=[20, 6, 4, 3, 2]
Explanation of the code:
first, sort the list using sorted(a)
perform groupby to make groups of similar elements
for each group, use sum()
You can use collections.Counter for this; it takes O(N) time:
>>> from collections import Counter
>>> lst = [1, 2, 1, 3, 3]
>>> [k*v for k, v in Counter(lst).items()]
[2, 2, 6]
Here Counter() returns the count of each unique item, and then we multiply those numbers with their count to get the sum.
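To also get the sorted output from the question ([6, 2, 2]), the per-value sums can be passed through sorted with reverse=True. A minimal sketch, where sum_similar is a made-up name for illustration:

```python
from collections import Counter

def sum_similar(values):
    # value * count is the sum of all occurrences of that value
    sums = (value * count for value, count in Counter(values).items())
    # largest sum first, as in the expected output
    return sorted(sums, reverse=True)

print(sum_similar([1, 2, 1, 3, 3]))  # [6, 2, 2]
```

The sort adds an O(K log K) step for K distinct values on top of the O(N) counting.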

Merge two lists based on condition

I am trying to merge two lists based on position of index, so sort of a proximity intersection.
A set doesn't work in this case. What I am trying to do is match the index in each list and then, only if an element is one less than an element in the other list, collect it.
An example will explain my scenario better.
Sample Input:
print(merge_list([[0, 1, 3], [1, 2], [4, 1, 3, 5]],
                 [[0, 2, 6], [1, 4], [2, 2], [4, 1, 6]]))
Sample Output:
[[0,2],[4,6]]
So at position 0, list1 has 1, 3 and list2 has 2, 6. Since 1 is one less than 2, we collect that and move on. Now 3 is less than 6, but not exactly one less (that would be 5), so we ignore it. Next we have [1, 2] and [1, 4]: both have index/position 1, but 2 is not one less than 4, so we ignore that. Next, [2, 2] in list2 has index 2, which doesn't match any index in the first list, so there is no comparison. Finally we compare [4, 1, 3, 5] with [4, 1, 6]. Both indexes match, and only 5 in list one is one less than a value in list two (6), so we collect 6; hence [4, 6], meaning index 4 and the match.
I have tried to make it work, but I can't seem to get it right.
This is my code so far.
def merge_list(my_list1, my_list2):
    merged_list = []
    bigger_list = []
    smaller_list = []
    temp_outer_index = 0
    temp_inner_index = 0
    if(len(my_list1) > len(my_list2)):
        bigger_list = my_list1
        smaller_list = my_list2
    elif(len(my_list2) > len(my_list1)):
        bigger_list = my_list2
        smaller_list = my_list1
    else:
        bigger_list = my_list1
        smaller_list = my_list2
    for i, sublist in enumerate(bigger_list):
        for index1 , val in enumerate(sublist):
            for k, sublist2 in enumerate(smaller_list):
                for index2, val2 in enumerate(sublist2):
                    temp_outer_index = index1 + 1
                    temp_inner_index = index2 + 1
                    if(temp_inner_index < len(sublist2) and temp_outer_index < len(sublist)):
                        # print "temp_outer:%s , temp_inner:%s, sublist[temp_outer]:%s, sublist2[temp_inner_index]:%s" % (temp_outer_index, temp_inner_index, sublist[temp_outer_index], sublist2[temp_inner_index])
                        if(sublist2[temp_inner_index] < sublist[temp_outer_index]):
                            merged_list.append(sublist[temp_outer_index])
                            break
    return merged_list
No clue what you are doing, but this should work.
First, convert the list of lists to a mapping of indices to set of digits contained in that list:
def convert_list(l):
    return dict((sublist[0], set(sublist[1:])) for sublist in l)
This will make the lists a lot easier to work with:
>>> convert_list([[0, 1, 3], [1, 2], [4, 1, 3, 5]])
{0: {1, 3}, 1: {2}, 4: {1, 3, 5}}
>>> convert_list([[0, 2, 6], [1, 4], [2, 2], [4, 1, 6]])
{0: {2, 6}, 1: {4}, 2: {2}, 4: {1, 6}}
Now the merge_lists function can be written as such:
def merge_lists(l1, l2):
    result = []
    d1 = convert_list(l1)
    d2 = convert_list(l2)
    for index, l2_nums in d2.items():
        if index not in d1:
            #no matching index
            continue
        l1_nums = d1[index]
        # keep the l2 values whose predecessor (value - 1) appears in l1
        sub_nums = [l2_num for l2_num in l2_nums if l2_num - 1 in l1_nums]
        if sub_nums:
            result.append([index] + sorted(list(sub_nums)))
    return result
Works for your test case:
>>> print(merge_lists([[0, 1, 3], [1, 2], [4, 1, 3, 5]],
...                   [[0, 2, 6], [1, 4], [2, 2], [4, 1, 6]]))
[[0, 2], [4, 6]]
I believe this does what you want it to do:
import itertools
def to_dict(lst):
    # first element of each sublist is the key, the rest are its values
    dct = {sub[0]: sub[1:] for sub in lst}
    return dct
def merge_dicts(a, b):
    result = []
    overlapping_keys = set.intersection(set(a.keys()), set(b.keys()))
    for key in overlapping_keys:
        temp = [key] # initialize sublist with index
        for i, j in itertools.product(a[key], b[key]):
            if i == j - 1:
                temp.append(j)
        if len(temp) > 1: # if the sublist has anything besides the index
            result.append(temp)
    return result
dict1 = to_dict([[0, 1, 3], [1, 2], [4, 1, 3, 5]])
dict2 = to_dict([[0, 2, 6], [1, 4], [2, 2], [4, 1, 6]])
result = merge_dicts(dict1, dict2)
print(result)
Result:
[[0, 2], [4, 6]]
First, we convert your lists to dicts because they're easier to work with (this separates the key out from the other values). Then, we look for the keys that exist in both dicts (in the example, this is 0, 1, 4) and look at all pairs of values between the two dicts for each key (in the example, 1,2; 1,6; 3,2; 3,6; 2,4; 1,1; 1,6; 3,1; 3,6; 5,1; 5,6). Whenever the first element of a pair is one less than the second element, we add the second element to our temp list. If the temp list ends up containing anything besides the key (i.e. is longer than 1), we add it to the result list, which we eventually return.
(It just occurred to me that this has pretty bad performance characteristics - quadratic in the length of the sublists - so you might want to use Claudiu's answer instead if your sublists are going to be long. If they're going to be short, though, I think the cost of initializing a set is large enough that my solution might be faster.)
def merge_list(a, b):
    # index -> set of values, built from the first list
    d = dict((val[0], set(val[1:])) for val in a)
    result = []
    for val in b:
        k = val[0]
        if k in d:
            match = [x for x in val[1:] if x - 1 in d[k]]
            if match:
                result.append([k] + match)
    return result
Similar to the other answers, this will first convert one of the lists to a dictionary with the first element of each inner list as the key and the remainder of the list as the value. Then we walk through the other list and if the first element exists as a key in the dictionary, we find all values that meet your criteria using the list comprehension and if there were any, add an entry to the result list which is returned at the end.
