During community detection I am trying to remove duplicate nodes from a list of lists (the goal is to compute the ARI).
What I have: a few dozen lists of different lengths inside one list:
lst_of_lts= [[5192, 32896, 34357, 34976, 36683, 43315], … ,[19, 92585, 94137, 98381, 99041, 100395, 101100, 109759]]
What I am running:
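import itertools
lst_of_lts.sort()
# groupby on a sorted list collapses runs of identical sublists
lst_of_lts_2 = list(k for k, _ in itertools.groupby(lst_of_lts))
# sort each sublist so that order-insensitive duplicates compare equal
lst_of_lts_nodops = [list(i) for i in {tuple(sorted(i)) for i in lst_of_lts_2}]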
For some reason, it doesn't remove duplicates; the dimensions remain the same.
Any suggestions?
I also tried many other options, such as:
Remove duplicate items from lists in Python lists and
Remove duplicated lists in list of lists in Python
If you are removing duplicates only within each sublist, you can use set.
import numpy as np
a = np.random.randint(0, 5, (10, 10)).tolist()
a
Out[128]:
[[0, 3, 0, 2, 4, 4, 0, 0, 3, 3],
[2, 4, 0, 2, 4, 2, 2, 4, 3, 1],
[3, 2, 0, 1, 2, 0, 2, 0, 2, 1],
[3, 1, 4, 1, 0, 1, 4, 4, 3, 4],
[2, 0, 1, 1, 0, 4, 1, 4, 2, 3],
[0, 0, 1, 3, 4, 3, 1, 3, 0, 1],
[1, 2, 0, 2, 1, 3, 4, 2, 2, 0],
[3, 3, 2, 2, 0, 4, 1, 1, 0, 0],
[0, 1, 3, 0, 4, 4, 2, 1, 1, 4],
[0, 1, 4, 4, 0, 1, 3, 2, 1, 1]]
[list(set(i)) for i in a]
Out[129]:
[[0, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 1, 2, 3],
[0, 1, 3, 4],
[0, 1, 2, 3, 4],
[0, 1, 3, 4],
[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4]]
Or, if you want to preserve the order of the elements, you can use dict.fromkeys (dicts keep insertion order in Python 3.7+):
[list(dict.fromkeys(i)) for i in a]
Out[133]:
[[0, 3, 2, 4],
[2, 4, 0, 3, 1],
[3, 2, 0, 1],
[3, 1, 4, 0],
[2, 0, 1, 4, 3],
[0, 1, 3, 4],
[1, 2, 0, 3, 4],
[3, 2, 0, 4, 1],
[0, 1, 3, 4, 2],
[0, 1, 4, 3, 2]]
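The two kinds of deduplication discussed above are easy to mix up, so here is a minimal sketch that shows both side by side. The `communities` data and the variable names are made up for illustration; they are not from the question. Note that using `frozenset` as the key is an assumption that element order and repeats inside a sublist do not matter when comparing sublists.

```python
# Made-up communities; the node IDs are illustrative only.
communities = [[3, 1, 2], [1, 2, 3], [4, 5, 5], [4, 5]]

# 1) Remove duplicates inside each sublist, keeping first-seen order.
within = [list(dict.fromkeys(c)) for c in communities]

# 2) Remove duplicate sublists, ignoring element order and repeats,
#    keeping the first occurrence of each.
seen = set()
unique_lists = []
for c in communities:
    key = frozenset(c)  # [3, 1, 2] and [1, 2, 3] map to the same key
    if key not in seen:
        seen.add(key)
        unique_lists.append(c)
```

If the number of sublists stays the same after step 2, the sublists really are distinct as sets, and any remaining duplicates are nodes shared between different sublists, which neither step removes.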
I have two arrays
a = np.array([[0, 0, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 2, 2, 3, 4]])
and
b = np.array([[1, 1],
[2, 2],
[3, 3]])
I want one array in which the values of b are added to the first two columns of a, like this:
c = np.array([[1, 1, 2, 3, 4],
[2, 3, 2, 3, 4],
[3, 5, 2, 3, 4]])
If it helps, you can think of the first two columns of a as x, y coordinates and of b as dx, dy.
My current method is as follows:
c = np.concatenate([a[:, 0:2] + b, a[:, 2:]], 1)
but I am looking for a better method.
Thank you
You can use np.pad to add zeros to b to make its shape the same as a's, then add them:
>>> a + np.pad(b, ((0, 0), (0, 3)))
array([[1, 1, 2, 3, 4],
[2, 3, 2, 3, 4],
[3, 5, 2, 3, 4]])
In general (for 2-D):
>>> a = np.array([[0, 0, 2, 3, 4],
... [0, 1, 2, 3, 4],
... [0, 2, 2, 3, 4]])
>>> b = np.array([[1, 1],
... [2, 2],
... [3, 3],
... [4, 4],
... [5, 5]])
>>> a_shape, b_shape = a.shape, b.shape
>>> max_w = max(a_shape[0], b_shape[0])
>>> max_h = max(a_shape[1], b_shape[1])
>>> padded_a = np.pad(a,
((0, np.abs(a_shape[0] - max_w)),
(0, np.abs(a_shape[1] - max_h))))
>>> padded_a
array([[0, 0, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 2, 2, 3, 4],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0]])
>>> padded_b = np.pad(b,
((0, np.abs(b_shape[0] - max_w)),
(0, np.abs(b_shape[1] - max_h))))
>>> padded_b
array([[1, 1, 0, 0, 0],
[2, 2, 0, 0, 0],
[3, 3, 0, 0, 0],
[4, 4, 0, 0, 0],
[5, 5, 0, 0, 0]])
>>> padded_a + padded_b
array([[1, 1, 2, 3, 4],
[2, 3, 2, 3, 4],
[3, 5, 2, 3, 4],
[4, 4, 0, 0, 0],
[5, 5, 0, 0, 0]])
In general (2-D, using a zeros array and adding to it):
>>> c = np.zeros((max_w, max_h), dtype=a.dtype)  # max_w is the row count, max_h the column count
>>> c[:a_shape[0], :a_shape[1]] += a
>>> c[:b_shape[0], :b_shape[1]] += b
>>> c
array([[1, 1, 2, 3, 4],
[2, 3, 2, 3, 4],
[3, 5, 2, 3, 4],
[4, 4, 0, 0, 0],
[5, 5, 0, 0, 0]])
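The zeros-array approach can be wrapped in a small function. This is only a sketch: `add_top_left` is a hypothetical helper name, and it assumes both inputs are 2-D and should be aligned at their top-left corners. For the original question, where b fits entirely inside a, copying a and adding to a slice is probably enough, so that variant is shown as well.

```python
import numpy as np

def add_top_left(a, b):
    """Add two 2-D arrays aligned at their top-left corners (hypothetical helper)."""
    rows = max(a.shape[0], b.shape[0])
    cols = max(a.shape[1], b.shape[1])
    # np.result_type picks a dtype that can hold values from both inputs.
    out = np.zeros((rows, cols), dtype=np.result_type(a, b))
    out[:a.shape[0], :a.shape[1]] += a
    out[:b.shape[0], :b.shape[1]] += b
    return out

a = np.array([[0, 0, 2, 3, 4],
              [0, 1, 2, 3, 4],
              [0, 2, 2, 3, 4]])
b = np.array([[1, 1],
              [2, 2],
              [3, 3]])

c = add_top_left(a, b)

# Simpler variant for when b fits inside a: copy, then add to the leading columns.
c2 = a.copy()
c2[:, :b.shape[1]] += b
```

Both `c` and `c2` match the array the question asks for; the slice variant avoids building any padded intermediate.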
I have a list of lists in which each sublist contains numbers that are 0, 1 or 2.
I need to remove any sublist in which any of the numbers is 0.
I tried this code:
l = [list(b) for b in x.Branches]
z = 0
list2 = filter(z, l)
print list2
But it keeps telling me that int is not iterable. The first line gets my list from Rhino Grasshopper data, and my list is
[[1, 1, 2, 0], [0, 2, 2, 0], [0, 2, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [1, 1, 2, 2], [2, 2, 2, 2], [2, 2, 2, 0], [0, 2, 2, 0], [0, 2, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [1, 1, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 2, 0], [0, 2, 2, 0], [0, 2, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [1, 1, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 2, 0], [0, 2, 2, 0], [0, 2, 0, 0], [0, 0, 0, 0], [1, 1, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 2, 0], [0, 2, 0, 0], [1, 1, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 0, 2], [2, 0, 0, 0], [1, 1, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 0, 2], [2, 0, 0, 2], [2, 0, 0, 0], [0, 0, 0, 0], [1, 1, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 0, 2], [2, 0, 0, 2], [2, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [1, 1, 2, 2], [2, 2, 2, 2], [2, 2, 0, 2], [2, 0, 0, 2], [2, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [1, 1, 0, 2], [2, 0, 0, 2], [2, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
filter takes a function as its first argument. I actually got a TypeError, because 0 is not a function and so is not callable.
In [49]: list(filter(0, l))
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-49-fe89c490c585> in <module>
----> 1 list(filter(0, l))
TypeError: 'int' object is not callable
The way I tried is below; I hope it helps.
In [50]: is_zero = lambda x: x == 0
In [51]: is_all_zero = lambda x: all(map(is_zero, x))
In [52]: not_is_all_zero = lambda x: not is_all_zero(x)
is_zero checks whether x is zero.
is_all_zero checks whether every element of a list is zero.
not_is_all_zero returns the opposite of is_all_zero.
Now we can use not_is_all_zero to filter l:
In [54]: list(filter(not_is_all_zero, l))
Out[54]:
[[1, 1, 2, 0],
[0, 2, 2, 0],
[0, 2, 0, 0],
[1, 1, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 0],
[0, 2, 2, 0],
[0, 2, 0, 0],
[1, 1, 2, 2],
...
Update
You want to filter out every sublist in which any item is zero, so you can apply the function below to filter the list:
In [55]: is_any_zero = lambda x: any(map(is_zero, x))
In [56]: is_any_zero([0,1])
Out[56]: True
In [57]: is_any_zero([1,1])
Out[57]: False
In [59]: not_is_any_zero = lambda x: not is_any_zero(x)
In [60]: list(filter(not_is_any_zero, l))
Out[60]:
[[1, 1, 2, 2],
[2, 2, 2, 2],
[1, 1, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2],
[1, 1, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2],
[1, 1, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2],
[1, 1, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2],
[1, 1, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2],
[1, 1, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2],
[1, 1, 2, 2],
[2, 2, 2, 2]]
To remove lists containing at least one zero, you could use
res = [ls for ls in lst if 0 not in ls]
Here is a hacky way of removing all sublists consisting only of zeros, assuming all elements are non-negative:
res = filter(sum, lst)
This uses the fact that bool(0) == False and bool(x) == True for x > 0.
Another way to remove the sublists that contain a zero is this:
list_2 = list(filter(all, l))
Here, the all function evaluates to true if every element it is given is true. Since bool(x) is true for all non-zero values, all(sub) is true for a sublist sub only if all of its values are non-zero.
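The answers above cover two different filters: dropping sublists that contain any zero, and dropping sublists that are all zeros. A minimal sketch on a small made-up sample (the `rows` list is illustrative, not the Grasshopper data) shows how they differ:

```python
# Small made-up sample in the same shape as the question's data.
rows = [[1, 1, 2, 0], [0, 0, 0, 0], [1, 1, 2, 2], [2, 2, 2, 2], [0, 2, 0, 0]]

# Keep only sublists with no zero at all (the original requirement).
no_zero_membership = [r for r in rows if 0 not in r]
no_zero_all = list(filter(all, rows))

# Keep sublists with at least one non-zero entry (drops all-zero sublists only).
not_all_zero = [r for r in rows if any(r)]
```

The first two results are identical; `0 not in r` states the intent most directly, while `filter(all, ...)` relies on zero being falsy.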
How can I get the sorted indices of a NumPy array (distance), considering only certain positions selected by another NumPy array (val)?
For example, consider the two numpy arrays val and distance below:
val = np.array([[10, 0, 0, 0, 0],
[0, 0, 10, 0, 10],
[0, 10, 10, 0, 0],
[0, 0, 0, 10, 0],
[0, 0, 0, 0, 0]])
distance = np.array([[4, 3, 2, 3, 4],
[3, 2, 1, 2, 3],
[2, 1, 0, 1, 2],
[3, 2, 1, 2, 3],
[4, 3, 2, 3, 4]])
The distances where val == 10 are 4, 1, 3, 1, 0, 2. I would like to sort these to get 0, 1, 1, 2, 3, 4 and return the corresponding indices into the distance array.
Returning something like:
(array([2, 1, 2, 3, 1, 0], dtype=int64), array([2, 2, 1, 3, 4, 0], dtype=int64))
or:
(array([2, 2, 1, 3, 1, 0], dtype=int64), array([2, 1, 2, 3, 4, 0], dtype=int64))
since the second and third elements both have distance 1, so I guess their indices are interchangeable.
I tried combinations of np.where, np.argsort, np.argpartition and np.unravel_index but can't seem to get it working right.
Here's one way with masking -
In [20]: mask = val==10
In [21]: np.argwhere(mask)[distance[mask].argsort()]
Out[21]:
array([[2, 2],
[1, 2],
[2, 1],
[3, 3],
[1, 4],
[0, 0]])
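If the tuple-of-arrays form from the question is preferred over the N x 2 array, the same masking idea can be sketched as follows. The names `rows`, `cols`, `order` and `result` are illustrative, and `kind="stable"` is an added choice so that ties keep their row-major order; it is not part of the answer above.

```python
import numpy as np

val = np.array([[10, 0, 0, 0, 0],
                [0, 0, 10, 0, 10],
                [0, 10, 10, 0, 0],
                [0, 0, 0, 10, 0],
                [0, 0, 0, 0, 0]])
distance = np.array([[4, 3, 2, 3, 4],
                     [3, 2, 1, 2, 3],
                     [2, 1, 0, 1, 2],
                     [3, 2, 1, 2, 3],
                     [4, 3, 2, 3, 4]])

mask = val == 10
# Positions of the selected cells, in row-major order.
rows, cols = np.nonzero(mask)
# distance[mask] lists the selected values in that same row-major order.
order = np.argsort(distance[mask], kind="stable")
result = (rows[order], cols[order])
```

Here `result` holds the row indices 2, 1, 2, 3, 1, 0 and the column indices 2, 2, 1, 3, 4, 0, and `distance[result]` gives the sorted distances 0, 1, 1, 2, 3, 4.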