Matrix COUNT + GROUP BY in python

I have a matrix of vectors filled with integers. For example:
[[1, 2, 3],
[2, 3, 1],
[1, 2, 3],
[2, 3, 1],
[2, 3, 1]]
and I want to count all distinct vectors in order to obtain something like this:
[[2, [1, 2, 3]],
[3, [2, 3, 1]]]
where each entry holds the number of occurrences first and then the vector.
In SQL it can be done with COUNT + GROUP BY.
However, how can I 'smartly' compute it with python?

With Python only, you can use a Counter:
from collections import Counter
matrix = [[1, 2, 3],
[2, 3, 1],
[1, 2, 3],
[2, 3, 1],
[2, 3, 1]]
c = Counter(map(tuple, matrix))
result = [[count, list(row)] for row, count in c.items()]
print(result)
# [[2, [1, 2, 3]], [3, [2, 3, 1]]]
With NumPy, you can use np.unique:
import numpy as np
matrix = np.array([[1, 2, 3],
[2, 3, 1],
[1, 2, 3],
[2, 3, 1],
[2, 3, 1]])
rows, counts = np.unique(matrix, axis=0, return_counts=True)
result = [[int(count), row.tolist()] for row, count in zip(rows, counts)]  # convert NumPy scalars to plain ints
print(result)
# [[2, [1, 2, 3]], [3, [2, 3, 1]]]

First convert every sub-list of m to a tuple. Then use collections.Counter to count the occurrences of each tuple in the main list. Finally, loop through the counter's items, where the keys are the tuples and the values are their counts, and append them to a new list like this:
from collections import Counter
m = [[1, 2, 3],
[2, 3, 1],
[1, 2, 3],
[2, 3, 1],
[2, 3, 1]]
m = map(tuple, m)
l = []
for k, v in Counter(m).items():
    l.append([v, list(k)])
Output :
[[2, [1, 2, 3]], [3, [2, 3, 1]]]
Note : Counter(m) produces this counter object :
Counter({(2, 3, 1): 3, (1, 2, 3): 2})
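If the groups should also come back ordered by frequency (roughly what adding ORDER BY COUNT(*) DESC would do in SQL), Counter.most_common can handle the sorting. This is a minimal sketch on the same sample matrix; the helper name count_rows is made up for illustration and is not part of any library:

```python
from collections import Counter


def count_rows(matrix):
    """Return [count, row] pairs, most frequent row first.

    count_rows is a hypothetical helper name used only for this example.
    """
    # Lists are not hashable, so each row is converted to a tuple first.
    counts = Counter(map(tuple, matrix))
    # most_common() yields (row, count) pairs sorted by descending count.
    return [[count, list(row)] for row, count in counts.most_common()]


matrix = [[1, 2, 3],
          [2, 3, 1],
          [1, 2, 3],
          [2, 3, 1],
          [2, 3, 1]]
grouped = count_rows(matrix)
print(grouped)
# [[3, [2, 3, 1]], [2, [1, 2, 3]]]
```

Rows with equal counts keep the order in which they were first seen.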

Related

How to create a matrix like below using Numpy

Matrix is like
[0, 1, 2]
[1, 2, 3]
[2, 3, 4]
For clarification, it's not just to create one such matrix but many other different matrices like this.
[0, 1, 2, 3]
[1, 2, 3, 4]
[2, 3, 4, 5]
You can use a sliding_window_view
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view as swv
cols = 4
rows = 3
out = swv(np.arange(cols+rows-1), cols).copy()
NB: sliding_window_view returns a read-only view, so .copy() is needed to get a mutable array. It is not necessary if a read-only object is sufficient (e.g., for display or indexing).
Output:
array([[0, 1, 2, 3],
[1, 2, 3, 4],
[2, 3, 4, 5]])
Output with cols = 3 ; rows = 5:
array([[0, 1, 2],
[1, 2, 3],
[2, 3, 4],
[3, 4, 5],
[4, 5, 6]])
Alternative, using broadcasting:
cols = 4
rows = 3
out = np.arange(rows)[:,None] + np.arange(cols)
Output:
array([[0, 1, 2, 3],
[1, 2, 3, 4],
[2, 3, 4, 5]])
L = 3
np.array([
    np.array(range(L)) + j
    for j in range(L)
])
or, with a small optimization (build the base row only once):
L = 3
a = np.array(range(L))
np.array([
    a + j
    for j in range(L)
])
You can easily create a matrix like that using broadcasting, for instance
>>> np.arange(3)[:, None] + np.arange(4)
array([[0, 1, 2, 3],
[1, 2, 3, 4],
[2, 3, 4, 5]])
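Since many such matrices are wanted, the broadcasting idea can be wrapped in a small function. The sketch below uses np.add.outer, which computes the same row-plus-column sum as the broadcasting expression; the function name offset_matrix is an assumption made for this example:

```python
import numpy as np


def offset_matrix(rows, cols):
    """Build a rows x cols matrix whose entry (i, j) equals i + j.

    offset_matrix is a hypothetical helper name, not a NumPy function.
    """
    # np.add.outer adds every element of the first vector
    # to every element of the second one.
    return np.add.outer(np.arange(rows), np.arange(cols))


print(offset_matrix(3, 4))
# [[0 1 2 3]
#  [1 2 3 4]
#  [2 3 4 5]]
```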

checking for elements in rows and columns of lists of lists without using pandas

If I have a list of lists
matrix = [[2, 3, 1, 2],[1, 2, 3, 2],[3, 3, 1, 2], [2, 2, 3, 3]]
how can I check with a for loop whether, for example, the element 1 is present in each column?
Using numpy (axis 1 reduces along each row, so this checks every row; pass 0 instead to check every column):
np.any(a==1, 1).all()
>>> a = np.array([[2, 3, 1, 2],[1, 2, 3, 2],[3, 3, 1, 2], [2, 2, 3, 3]])
>>> np.any(a==1, 1).all()
False
>>> a = np.array([[2, 3, 1, 2],[1, 2, 3, 2],[3, 3, 1, 2], [2, 1, 3, 3]])
>>> np.any(a==1, 1).all()
True
Using all, in and a generator expression:
matrix = [[2, 3, 1, 2], [1, 2, 3, 2], [3, 3, 1, 2], [2, 2, 3, 3]]
valid = all(1 in row for row in matrix)
Or, the verbose way:
matrix = [[2, 3, 1, 2], [1, 2, 3, 2], [3, 3, 1, 2], [2, 2, 3, 3]]
valid = True
for row in matrix:
    if 1 not in row:
        valid = False
        break
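The pure-Python snippets above test each row. Since the question also mentions columns, here is a minimal sketch that covers both without pandas or NumPy. The function names are made up for this example, and the column check assumes all rows have the same length:

```python
def in_every_row(matrix, value):
    """True if value occurs at least once in each row."""
    return all(value in row for row in matrix)


def in_every_column(matrix, value):
    """True if value occurs at least once in each column."""
    # zip(*matrix) transposes the list of lists: one tuple per column.
    return all(value in column for column in zip(*matrix))


matrix = [[2, 3, 1, 2], [1, 2, 3, 2], [3, 3, 1, 2], [2, 1, 3, 3]]
print(in_every_row(matrix, 1))     # True
print(in_every_column(matrix, 1))  # False: the last column is 2, 2, 2, 3
```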

Expanding a vector to a new size

Given a vector, for example
my_list=[1, 2, 3]
how can I expand each entry into a new matrix (as a multidimensional list or, more likely, a numpy array)?
For example, when the matrix is a 2x2 numpy array, the output, expanded_my_list, would be:
[[[1, 1], [1, 1]], [[2, 2], [2, 2]], [[3, 3], [3, 3]]]
or as a numpy array:
array([[[1, 1],
[1, 1]],
[[2, 2],
[2, 2]],
[[3, 3],
[3, 3]]])
where expanded_my_list.shape is (3,2,2).
One solution may be:
for i in range(len(my_list)):
    expanded[:, i] = my_list[i].expand_as(matrix[:, i])
my_list = [1, 2, 3]
[[[e] * 2 for _ in range(2)] for e in my_list]
output:
[[[1, 1], [1, 1]], [[2, 2], [2, 2]], [[3, 3], [3, 3]]]
You could use numpy.tile().
By creating additional axes via np.newaxis and repeating the elements of the list along these axes, you can create the desired result:
import numpy as np
lst = [1, 2, 3]
arr = np.array(lst)
arr2 = np.tile(arr[:, np.newaxis, np.newaxis], reps=(1, 2, 2))
# Output:
# array([[[1, 1],
# [1, 1]],
# [[2, 2],
# [2, 2]],
# [[3, 3],
# [3, 3]]])
lst2 = arr2.tolist() # If a nested list is required
Or, more generally:
arr = np.array([[1, 2],
[3, 4]])
expanded_shape = (3, 4)
arr2 = np.tile(arr.reshape(arr.shape + (1,)*len(expanded_shape)),
reps=(1,)*arr.ndim + expanded_shape)
# Output, shape (2, 2, 3, 4)
# array([[[[1, 1, 1, 1],
# [1, 1, 1, 1],
# [1, 1, 1, 1]],
# [[2, 2, 2, 2],
# [2, 2, 2, 2],
# [2, 2, 2, 2]]],
# [[[3, 3, 3, 3],
# [3, 3, 3, 3],
# [3, 3, 3, 3]],
# [[4, 4, 4, 4],
# [4, 4, 4, 4],
# [4, 4, 4, 4]]]])
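As a possible alternative to np.tile, np.broadcast_to can produce the same expanded array without repeating data until a copy is requested. This is only a sketch under the same assumptions as above (each entry is expanded to a 2x2 block); the variable names are chosen for illustration:

```python
import numpy as np

my_list = [1, 2, 3]
arr = np.asarray(my_list)
expanded_shape = (2, 2)  # shape each entry is expanded to

# Append one length-1 axis per target dimension, then broadcast
# those axes to the requested sizes.
view = np.broadcast_to(
    arr.reshape(arr.shape + (1,) * len(expanded_shape)),
    arr.shape + expanded_shape,
)

# broadcast_to returns a read-only view; copy it if the result
# needs to be modified later.
expanded = view.copy()

print(expanded.shape)     # (3, 2, 2)
print(expanded.tolist())
# [[[1, 1], [1, 1]], [[2, 2], [2, 2]], [[3, 3], [3, 3]]]
```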

Subset Sum Problem: Returning the indices of the values instead of the values

I would like to know how to put a twist on the subset sum problem.
Given a list of integers and a target integer, I want to compute all the possible groups (consisting of 2 or 3 members) from the list that sum up to the target.
The output will be a 2D list of groups with the indices of the 2 or 3 numbers.
So for example,
nums = [3, 0, 1, 0, -1, -2, 0]
t = 0
ttsum(nums, t) returns [[1, 3], [1, 6], [2, 4], [3, 6], [0, 4, 5], [1, 3, 6], [1, 2, 4], [2, 3, 4], [2, 4, 6]]
Thank you!!!
Generate the combinations of indices alongside the combinations of values. Check the sum of each value combination, then keep the matching indices.
from itertools import combinations
l = [3, 0, 1, 0, -1, -2, 0]
[list(idx) for i in range(2, 4, 1) for seq,idx in zip(combinations(l, i), combinations(range(0, len(l), 1), i)) if sum(seq) == 0]
Output:
[[1, 3],
[1, 6],
[2, 4],
[3, 6],
[0, 4, 5],
[1, 2, 4],
[1, 3, 6],
[2, 3, 4],
[2, 4, 6]]
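The same idea may be easier to read as a function that iterates over index combinations directly and looks the values up when summing. This is a sketch, not the only way to do it; the name ttsum comes from the question, while the sizes parameter is an assumption added for this example:

```python
from itertools import combinations


def ttsum(nums, target, sizes=(2, 3)):
    """Return index groups (as lists) whose values sum to target.

    sizes is a hypothetical parameter listing the allowed group sizes.
    """
    groups = []
    for size in sizes:
        # Combine indices rather than values so duplicates stay distinct.
        for idx in combinations(range(len(nums)), size):
            if sum(nums[i] for i in idx) == target:
                groups.append(list(idx))
    return groups


nums = [3, 0, 1, 0, -1, -2, 0]
print(ttsum(nums, 0))
# [[1, 3], [1, 6], [2, 4], [3, 6], [0, 4, 5], [1, 2, 4], [1, 3, 6], [2, 3, 4], [2, 4, 6]]
```

This checks every pair and triple, so the cost grows roughly with the cube of the list length.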

Python combination and permutation code

I have the following code to generate sets of combinations, append them to a list, and return the list.
def make_combination():
    import itertools
    max_range = 5
    indexes = combinations_plus = []
    for i in range(0, max_range):
        indexes.append(i)
        for i in xrange(2, max_range):
            each_combination = [list(x) for x in itertools.combinations(indexes, i)]
            combinations_plus.append(each_combination)
    retrun combinations_plus
It generates many combinations that I don't want (too many to display). Instead, I want the following combinations:
1) [[0, 1], [0, 2], [0, 3], [0, 4], [1, 2], [1, 3], [1, 4], [2, 3], [2, 4], [3, 4]]
2) [[0, 1, 2], [0, 1, 3], [0, 1, 4], [0, 2, 3], [0, 2, 4], [0, 3, 4], [1, 2, 3], [1, 2, 4], [1, 3, 4], [2, 3, 4]]
3) [[0, 1, 2, 3], [0, 1, 2, 4], [0, 1, 3, 4], [0, 2, 3, 4], [1, 2, 3, 4]]
I think the problem is in the following line, but I don't know what it is. Any idea what the mistake is?
combinations_plus.append(each_combination)
An easier way of doing what you want is the following:
list(list(itertools.combinations(list(range(5)), i)) for i in range(2, 5))
To fix your original code, there were two problems:
indexes = combinations_plus = []
The above creates two names for the exact same list. Appending to either appends to both which is not what you want.
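The aliasing can be seen in isolation with a short experiment. This is a minimal sketch with made-up variable names, contrasting chained assignment with two separate list literals:

```python
# Chained assignment binds both names to one and the same list object.
first = second = []
first.append(0)
same_object = first is second
print(second)       # [0]
print(same_object)  # True

# Two separate literals create two independent lists.
left, right = [], []
left.append(0)
print(right)        # []
```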
The two for statements shouldn't be nested; otherwise the list of indexes is still incomplete when the combinations are generated:
for i in range(0, max_range):
    indexes.append(i)
for i in xrange(2, max_range):
    each_combination = [list(x) for x in itertools.combinations(indexes, i)]
    combinations_plus.append(each_combination)
In fact, initialize indexes with range and skip the first for loop:
indexes = range(max_range) # becomes [0,1,2,3,4]
combinations_plus = []
With these fixes (and fixing the spelling of return), you have:
def make_combination():
    import itertools
    max_range = 5
    indexes = range(max_range)
    combinations_plus = []
    for i in xrange(2, max_range):
        each_combination = [list(x) for x in itertools.combinations(indexes, i)]
        combinations_plus.append(each_combination)
    return combinations_plus
Which returns (newlines added for readability):
[[[0, 1], [0, 2], [0, 3], [0, 4], [1, 2], [1, 3], [1, 4], [2, 3], [2, 4], [3, 4]],
[[0, 1, 2], [0, 1, 3], [0, 1, 4], [0, 2, 3], [0, 2, 4], [0, 3, 4], [1, 2, 3], [1, 2, 4], [1, 3, 4], [2, 3, 4]],
[[0, 1, 2, 3], [0, 1, 2, 4], [0, 1, 3, 4], [0, 2, 3, 4], [1, 2, 3, 4]]]
