I need to build a system in which, once the count of a given category number reaches a maximum, later applications for that category fall through to the next column (their second choice), and so on.
For example, imagine I have this array of 'club choices', where each row is a submission, with first, second, and third choices:
array([first_choice, second_choice, third_choice])
And to track the applicants, we add a tracking number id:
array([(first_choice, id), (second_choice, id), (third_choice, id)], dtype=[('choice', np.int8),('id', np.int16)])
For simplicity's sake, we won't look at the ids for now. Earlier applicants take priority: an earlier applicant's second or third choice should count toward a club's maximum before the first choice of any later applicant.
arr['choice']:
array([
[3, 1, 2],
[0, 3, 2],
[2, 1, 3],
[3, 2, 1],
[3, 1, 2],
[2, 3, 1],
[0, 1, 2],
[3, 1, 0],
[2, 0, 3],
[2, 1, 0],
[2, 3, 0],
[0, 1, 2],
[0, 1, 2],
[0, 2, 3],
[0, 3, 1],
[0, 3, 2],
[2, 1, 3],
[3, 1, 2],
[1, 0, 3],
[0, 1, 3],
[3, 1, 0],
[0, 2, 3],
[3, 2, 0],
[0, 2, 3],
[2, 1, 0]
])
Now, I want to categorize the data so that once the number of applications assigned to a particular club reaches a limit, let's say 7, any later application drops that club and falls through to its next choice, like so:
array([
[3, 1, 2],
[0, 3, 2],
[2, 1, 3],
[3, 2, 1],
[3, 1, 2],
[2, 3, 1],
[0, 1, 2],
[3, 1, 0],
[2, 0, 3],
[2, 1, 0],
[2, 3, 0],
[0, 1, 2],
[0, 1, 2],
[0, 2, 3],
[0, 3, 1],
[0, 3, 2],
#Category 0 max
[2, 1, 3],
[3, 1, 2],
[1, 0, 3],
[1, 3], #was [0, 1, 3]
[3, 1, 0],
[2, 3], #was [0, 2, 3],
#Category 2 would have continued, but due to the changes, it now stops here.
[3, 2, 0],
[], #was [0, 2, 3], then became [2, 3], then became [3], but 3 is also full, so it is now empty.
[1, 0] #was [2, 1, 0]
])
The final choices are simply the first column transposed.
I hope what I am attempting to do is understood. I'm not restricted to numpy, if another approach is faster, I'd accept it. I made a couple of attempts at this, but none of them got anywhere. I don't even know where to start.
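One way to read the rule described above is as a greedy, in-order assignment: each applicant gets the first listed club that still has room, and gets nothing if all listed clubs are full. The sketch below follows that reading in plain Python. The helper name assign_clubs, the capacity parameter, and the small sample data are assumptions made for illustration; ids are ignored, as in the question.

```python
from collections import Counter

def assign_clubs(choices, capacity):
    """Hypothetical helper: greedy assignment in submission order."""
    counts = Counter()   # places taken so far, per club
    assigned = []
    for row in choices:
        pick = None      # None means every listed club was already full
        for club in row:
            if counts[club] < capacity:
                counts[club] += 1
                pick = club
                break
        assigned.append(pick)
    return assigned

# Made-up sample: five submissions, at most 2 places per club
sample = [[0, 1, 2], [0, 2, 1], [0, 1, 2], [0, 1, 2], [0, 1, 0]]
result = assign_clubs(sample, capacity=2)
print(result)
```

Because every submission depends on the running counts left by the earlier ones, a plain loop like this is the most direct formulation; whether a vectorized NumPy version would be faster is not something this sketch tries to settle.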
I have two lists of lists of different sizes (list A has on the order of 1,000 elements and list B on the order of 10,000).
A=[[0, 0, 0],
[0, 0, 1],
[0, 0, 2],
[0, 0, 3],
[0, 0, 4],
[0, 0, 5],
[0, 1, 0],
[0, 1, 1],
[0, 1, 2],
[0, 1, 3],
[0, 1, 4],
[0, 1, 5],
[0, 1, 6],
[0, 1, 7],
[0, 1, 8],
[0, 1, 9],
[0, 2, 0],
[0, 2, 1],
[0, 2, 2]]
B=[[1, 1, 2],
[0, 0, 2],
[0, 0, 1],
[4, 2, 2],
[3, 1, 2],
[1, 0, 1],
[1, 1, 2],
[0, 1, 2],
[0, 0, 0],
[2, 2, 3],
[1, 2, 1],
[0, 2, 1],
[0, 2, 0],
[0, 2, 1],
[0, 1, 3],
[0, 0, 0],
[1, 2, 5],
[0, 4, 3],
[0, 1, 3]]
I need to compare list A with list B and find out how many times each element of A occurs in B. For example, I need to find out how many times [0,0,0] (the first element of A) occurs in B.
Thank you for your help.
This should work:
A = [[0, 0, 0],
[0, 0, 1],
[0, 0, 2],
[0, 0, 3],
[0, 0, 4],
[0, 0, 5],
[0, 1, 0],
[0, 1, 1],
[0, 1, 2],
[0, 1, 3],
[0, 1, 4],
[0, 1, 5],
[0, 1, 6],
[0, 1, 7],
[0, 1, 8],
[0, 1, 9],
[0, 2, 0],
[0, 2, 1],
[0, 2, 2]]
B = [[1, 1, 2],
[0, 0, 2],
[0, 0, 1],
[4, 2, 2],
[3, 1, 2],
[1, 0, 1],
[1, 1, 2],
[0, 1, 2],
[0, 0, 0],
[2, 2, 3],
[1, 2, 1],
[0, 2, 1],
[0, 2, 0],
[0, 2, 1],
[0, 1, 3],
[0, 0, 0],
[1, 2, 5],
[0, 4, 3],
[0, 1, 3]]
# Collect the sublists of A that occur in B, keyed by their string form
nums = []
for i in A:
    if i in B:
        nums.append(str(i))
nums = list(dict.fromkeys(nums))  # drop duplicates, keep order
# Count how often each of them occurs in B
nums_freq = {}
for i in nums:
    for j in B:
        if i == str(j):
            nums_freq[i] = nums_freq.get(i, 0) + 1
Value of nums_freq:
{'[0, 0, 0]': 2,
'[0, 0, 1]': 1,
'[0, 0, 2]': 1,
'[0, 1, 2]': 1,
'[0, 1, 3]': 2,
'[0, 2, 0]': 1,
'[0, 2, 1]': 2}
>>> import operator
>>> for e in A:
... print(e, 'appearing in :', operator.countOf(B, e))
...
[0, 0, 0] appearing in : 2
[0, 0, 1] appearing in : 1
[0, 0, 2] appearing in : 1
[0, 0, 3] appearing in : 0
[0, 0, 4] appearing in : 0
[0, 0, 5] appearing in : 0
[0, 1, 0] appearing in : 0
[0, 1, 1] appearing in : 0
[0, 1, 2] appearing in : 1
[0, 1, 3] appearing in : 2
[0, 1, 4] appearing in : 0
[0, 1, 5] appearing in : 0
[0, 1, 6] appearing in : 0
[0, 1, 7] appearing in : 0
[0, 1, 8] appearing in : 0
[0, 1, 9] appearing in : 0
[0, 2, 0] appearing in : 1
[0, 2, 1] appearing in : 2
[0, 2, 2] appearing in : 0
I can suggest 2 different ways to do it:
Using a counter:
from collections import Counter
cb = Counter(tuple(b) for b in B)
list((a, cb[tuple(a)]) for a in A)
Using nested comprehensions:
list((a, sum(all(ia == ib for ia, ib in zip(a, b)) for b in B)) for a in A)
Actually, you should consider storing the inner collections as tuples instead of lists: both compare element-wise, but only tuples are hashable, so they can be used directly as Counter or dictionary keys.
If A and B were lists of tuples, it would be as simple as:
Counter(a for a in A for b in B if a == b)
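For reference, the Counter approach can be run end to end as in the sketch below. The short A and B lists are made-up sample data, not the lists from the question, and counts_in_b is just an illustrative name. B is scanned once to build the counter, after which each lookup for an element of A is a dictionary access.

```python
from collections import Counter

# Made-up sample data (much shorter than the lists in the question)
A = [[0, 0, 0], [0, 0, 1], [0, 1, 3], [0, 2, 2]]
B = [[0, 0, 0], [0, 1, 3], [1, 1, 2], [0, 0, 0], [0, 1, 3]]

# Lists are not hashable, so convert each sublist to a tuple first
counts_in_b = Counter(tuple(b) for b in B)

# A Counter returns 0 for missing keys, so absent elements need no special case
result = [(a, counts_in_b[tuple(a)]) for a in A]
print(result)
```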
I have a NumPy array with integer values 1 or 0 (they can be cast to booleans if necessary). The array is square and symmetric (see the note below) and I want a list of the indices where a 1 appears:
Note that array[i][j] == array[j][i] and array[i][i] == 0 by design. Also I cannot have any duplicates.
import numpy as np
array = np.array([
[0, 0, 1, 0, 1, 0, 1],
[0, 0, 1, 1, 0, 1, 0],
[1, 1, 0, 0, 0, 0, 1],
[0, 1, 0, 0, 1, 1, 0],
[1, 0, 0, 1, 0, 0, 1],
[0, 1, 0, 1, 0, 0, 0],
[1, 0, 1, 0, 1, 0, 0]
])
I would like a result that is like this (order of each sub-list is not important, nor is the order of each element within the sub-list):
[
[0, 2],
[0, 4],
[0, 6],
[1, 2],
[1, 3],
[1, 5],
[2, 6],
[3, 4],
[3, 5],
[4, 6]
]
Another point to make is that I would prefer not to loop over all indices twice using the condition j<i because the size of my array can be large but I am aware that this is a possibility - I have written an example of this using two for loops:
import pandas as pd
result = []
for i in range(array.shape[0]):
    for j in range(i):
        if array[i][j]:
            result.append([i, j])
print(pd.DataFrame(result).sort_values(1).values)
# using dataframes and arrays for formatting but looking for
# 'result' which is a list
# Returns (same as above but columns are the opposite way round):
[[2 0]
[4 0]
[6 0]
[2 1]
[3 1]
[5 1]
[6 2]
[4 3]
[5 3]
[6 4]]
idx = np.argwhere(array)
idx = idx[idx[:,0]<idx[:,1]]
Another way:
idx = np.argwhere(np.triu(array))
output:
[[0 2]
[0 4]
[0 6]
[1 2]
[1 3]
[1 5]
[2 6]
[3 4]
[3 5]
[4 6]]
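As a self-contained check, the sketch below runs both variants on the array from the question and converts the result to a plain list, which is what the question asks for. The variable names are illustrative. np.triu is called with k=1 here to exclude the diagonal explicitly; since the diagonal is zero by design, the default k=0 gives the same pairs.

```python
import numpy as np

array = np.array([
    [0, 0, 1, 0, 1, 0, 1],
    [0, 0, 1, 1, 0, 1, 0],
    [1, 1, 0, 0, 0, 0, 1],
    [0, 1, 0, 0, 1, 1, 0],
    [1, 0, 0, 1, 0, 0, 1],
    [0, 1, 0, 1, 0, 0, 0],
    [1, 0, 1, 0, 1, 0, 0],
])

# Variant 1: take every non-zero position, keep only those with row < column
all_pairs = np.argwhere(array)
pairs_filter = all_pairs[all_pairs[:, 0] < all_pairs[:, 1]]

# Variant 2: zero out everything on or below the diagonal first
pairs_triu = np.argwhere(np.triu(array, k=1))

result = pairs_triu.tolist()  # plain list of [i, j] pairs
print(result)
```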
Comparison:
# bousof's solution
def method1(array):
    return np.vstack(np.where(np.logical_and(array, np.diff(np.ogrid[:array.shape[0],:array.shape[0]])[0]>=0))).transpose()[:,::-1]
# Also mentioned by hpaulj
def method2(array):
    return np.argwhere(np.triu(array))
def method3(array):
    idx = np.argwhere(array)
    return idx[idx[:,0]<idx[:,1]]
# The original method in the question by the OP (d-man)
def method4(array):
    result = []
    for i in range(array.shape[0]):
        for j in range(i):
            if array[i][j]:
                result.append([i, j])
    return result
# Suggested by bousof in the comments
def method5(array):
    return np.vstack(np.where(np.triu(array))).transpose()
inputs = [np.random.randint(0,2,(n,n)) for n in [10,100,1000,10000]]
Seems like method1, method2 and method5 are slightly faster for large arrays while method3 is faster for smaller cases:
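The timing code itself is not shown above. A minimal way to rerun such a comparison, assuming the standard-library timeit module rather than whatever tool produced the original numbers, could look like the sketch below. The function names and the random_symmetric helper are hypothetical; the helper builds symmetric inputs with a zero diagonal so that both methods return the same pairs. Only two of the methods are timed, and the sizes are kept small so the script finishes quickly.

```python
import timeit
import numpy as np

def pairs_triu(a):
    # Same idea as method2 above
    return np.argwhere(np.triu(a))

def pairs_filter(a):
    # Same idea as method3 above
    idx = np.argwhere(a)
    return idx[idx[:, 0] < idx[:, 1]]

def random_symmetric(n, rng):
    # Hypothetical helper: random 0/1 matrix, symmetric, zero diagonal
    upper = np.triu(rng.integers(0, 2, size=(n, n)), k=1)
    return upper + upper.T

rng = np.random.default_rng(0)
for n in (10, 100, 1000):
    a = random_symmetric(n, rng)
    # Both methods must agree before their timings are worth comparing
    assert np.array_equal(pairs_triu(a), pairs_filter(a))
    for func in (pairs_triu, pairs_filter):
        seconds = timeit.timeit(lambda: func(a), number=5) / 5
        print(f"n={n:5d}  {func.__name__:13s} {seconds:.6f} s per call")
```

Timings depend on the machine and the NumPy version, so the ranking reported above may not reproduce exactly.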
In [249]: arr = np.array([
...: [0, 0, 1, 0, 1, 0, 1],
...: [0, 0, 1, 1, 0, 1, 0],
...: [1, 1, 0, 0, 0, 0, 1],
...: [0, 1, 0, 0, 1, 1, 0],
...: [1, 0, 0, 1, 0, 0, 1],
...: [0, 1, 0, 1, 0, 0, 0],
...: [1, 0, 1, 0, 1, 0, 0]
...: ])
The most common way of getting the indices of the non-zero (True) elements is with np.nonzero (np.where called with a single argument does the same):
In [250]: idx = np.nonzero(arr)
In [251]: idx
Out[251]:
(array([0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 6, 6, 6]),
array([2, 4, 6, 2, 3, 5, 0, 1, 6, 1, 4, 5, 0, 3, 6, 1, 3, 0, 2, 4]))
This is a tuple - 2 arrays for a 2d array. It can be used directly to index the array (or anything like it): arr[idx] will give all 1s.
Apply np.transpose to that and you get an array of 'pairs'; np.argwhere does exactly this:
In [252]: np.argwhere(arr)
Out[252]:
array([[0, 2],
[0, 4],
[0, 6],
[1, 2],
[1, 3],
[1, 5],
[2, 0],
[2, 1],
[2, 6],
[3, 1],
[3, 4],
[3, 5],
[4, 0],
[4, 3],
[4, 6],
[5, 1],
[5, 3],
[6, 0],
[6, 2],
[6, 4]])
Using such an array to index arr is harder - requiring a loop and conversion to tuple.
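To make the difference concrete, here is a small sketch on a made-up 3x3 array (not the one above). The tuple returned by np.nonzero indexes the array directly, while the array of pairs has to be turned back into a tuple of index arrays first. Transposing it is one way to do that without an explicit Python loop.

```python
import numpy as np

# Small made-up symmetric array with a zero diagonal
arr = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]])

# The tuple from np.nonzero can be used as an index as-is
idx = np.nonzero(arr)
values_from_tuple = arr[idx]

# The (n, 2) array of pairs must first become a tuple of two index arrays
pairs = np.argwhere(arr)
values_from_pairs = arr[tuple(pairs.T)]

print(values_from_tuple, values_from_pairs)
```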
To weed out the symmetric duplicates we could make a tri-lower array:
In [253]: np.tril(arr)
Out[253]:
array([[0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0],
[1, 1, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0, 0],
[1, 0, 0, 1, 0, 0, 0],
[0, 1, 0, 1, 0, 0, 0],
[1, 0, 1, 0, 1, 0, 0]])
In [254]: np.argwhere(np.tril(arr))
Out[254]:
array([[2, 0],
[2, 1],
[3, 1],
[4, 0],
[4, 3],
[5, 1],
[5, 3],
[6, 0],
[6, 2],
[6, 4]])
You can use numpy.where:
>>> np.vstack(np.where(np.logical_and(array, np.diff(np.ogrid[:array.shape[0],:array.shape[0]])[0]<=0))).transpose()
array([[2, 0],
[2, 1],
[3, 1],
[4, 0],
[4, 3],
[5, 1],
[5, 3],
[6, 0],
[6, 2],
[6, 4]])
np.diff(np.ogrid[:array.shape[0],:array.shape[0]])[0]<=0 is true only on the lower part of the matrix. If the order is important, you can get the same order as in the question using:
>>> np.vstack(np.where(np.logical_and(array, np.diff(np.ogrid[:array.shape[0],:array.shape[0]])[0]>=0))).transpose()[:,::-1]
array([[2, 0],
[4, 0],
[6, 0],
[2, 1],
[3, 1],
[5, 1],
[6, 2],
[4, 3],
[5, 3],
[6, 4]])
I want to use a 2D array as a heightmap that indexes axis 0 of a 3D array. Is there an efficient "numpy-way" of doing this? In my example I want to set everything at or above the heightmap's height in each corresponding pillar to zero. Example:
3D Array:
[[[1, 1, 1],
[1, 1, 1],
[1, 1, 1]],
[[1, 1, 1],
[1, 1, 1],
[1, 1, 1]],
[[1, 1, 1],
[1, 1, 1],
[1, 1, 1]],
[[1, 1, 1],
[1, 1, 1],
[1, 1, 1]]]
2D Array (heightmap):
[[0, 1, 2],
[2, 3, 4],
[2, 0, 0]]
Desired output:
[[[0, 1, 1],
[1, 1, 1],
[1, 0, 0]],
[[0, 0, 1],
[1, 1, 1],
[1, 0, 0]],
[[0, 0, 0],
[0, 1, 1],
[0, 0, 0]],
[[0, 0, 0],
[0, 0, 1],
[0, 0, 0]]]
So far I have implemented this with a Python for loop, as in
for y in range(arr2d.shape[0]):
    for x in range(arr2d.shape[1]):
        height = arr2d[y, x]
        arr3d[height:, y, x] = 0
but this seems very inefficient and I feel like there might be a much better way to do this.
Drawing inspiration from a fast way of padding arrays:
In [104]: (np.arange(4)[:,None,None]<arr2d).astype(int)
Out[104]:
array([[[0, 1, 1],
[1, 1, 1],
[1, 0, 0]],
[[0, 0, 1],
[1, 1, 1],
[1, 0, 0]],
[[0, 0, 0],
[0, 1, 1],
[0, 0, 0]],
[[0, 0, 0],
[0, 0, 1],
[0, 0, 0]]])
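The expression above builds the desired output directly, which works because the example 3D array is all ones. For a 3D array with arbitrary values, the same broadcast comparison can be used as a boolean mask to zero entries in place. The sketch below assumes the shapes from the question and checks the result against the original double loop; the names levels and mask are illustrative.

```python
import numpy as np

arr3d = np.ones((4, 3, 3), dtype=int)
arr2d = np.array([[0, 1, 2],
                  [2, 3, 4],
                  [2, 0, 0]])

# Reference result from the double loop in the question
expected = arr3d.copy()
for y in range(arr2d.shape[0]):
    for x in range(arr2d.shape[1]):
        expected[arr2d[y, x]:, y, x] = 0

# Level index along axis 0, shaped (4, 1, 1) so it broadcasts against (3, 3)
levels = np.arange(arr3d.shape[0])[:, None, None]
mask = levels >= arr2d   # True where the level is at or above the height
arr3d[mask] = 0          # zero those entries in place

print(np.array_equal(arr3d, expected))
```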
I have a list of lists in which each sublist contains numbers that are either 0, 1 or 2.
I need to remove any sublist where any of the numbers are 0.
I tried this code:
l = [list(b) for b in x.Branches]
z = 0
list2 = filter(z, l)
print list2
But it keeps telling me that int is not iterable. The first line gets my list from Rhino Grasshopper data, and my list is
[[1, 1, 2, 0], [0, 2, 2, 0], [0, 2, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [1, 1, 2, 2], [2, 2, 2, 2], [2, 2, 2, 0], [0, 2, 2, 0], [0, 2, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [1, 1, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 2, 0], [0, 2, 2, 0], [0, 2, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [1, 1, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 2, 0], [0, 2, 2, 0], [0, 2, 0, 0], [0, 0, 0, 0], [1, 1, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 2, 0], [0, 2, 0, 0], [1, 1, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 0, 2], [2, 0, 0, 0], [1, 1, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 0, 2], [2, 0, 0, 2], [2, 0, 0, 0], [0, 0, 0, 0], [1, 1, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 0, 2], [2, 0, 0, 2], [2, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [1, 1, 2, 2], [2, 2, 2, 2], [2, 2, 0, 2], [2, 0, 0, 2], [2, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [1, 1, 0, 2], [2, 0, 0, 2], [2, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
filter takes a function as its first argument. When I ran your code I actually got a TypeError, because 0 is not a function and so is not callable.
In [49]: list(filter(0, l))
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-49-fe89c490c585> in <module>
----> 1 list(filter(0, l))
TypeError: 'int' object is not callable
The approach I tried is below; I hope it helps.
In [50]: is_zero = lambda x: x == 0
In [51]: is_all_zero = lambda x: all(map(is_zero, x))
In [52]: not_is_all_zero = lambda x: not is_all_zero(x)
is_zero checks whether x is zero.
is_all_zero checks whether every element of a list is zero.
not_is_all_zero returns the opposite of is_all_zero.
Now we can use not_is_all_zero to filter l:
In [54]: list(filter(not_is_all_zero, l))
Out[54]:
[[1, 1, 2, 0],
[0, 2, 2, 0],
[0, 2, 0, 0],
[1, 1, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 0],
[0, 2, 2, 0],
[0, 2, 0, 0],
[1, 1, 2, 2],
...
Update
You want to drop every sublist in which at least one item is zero, so you can filter the list with the function below:
In [55]: is_any_zero = lambda x: any(map(is_zero, x))
In [56]: is_any_zero([0,1])
Out[56]: True
In [57]: is_any_zero([1,1])
Out[57]: False
In [59]: not_is_any_zero = lambda x: not is_any_zero(x)
In [60]: list(filter(not_is_any_zero, l))
Out[60]:
[[1, 1, 2, 2],
[2, 2, 2, 2],
[1, 1, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2],
[1, 1, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2],
[1, 1, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2],
[1, 1, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2],
[1, 1, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2],
[1, 1, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2],
[1, 1, 2, 2],
[2, 2, 2, 2]]
To remove lists containing at least one zero, you could use
res = [ls for ls in lst if 0 not in ls]
Here is a hacky way of removing all sublists consisting only of zeros, assuming all elements are non-negative:
res = filter(sum, lst)
This uses the fact that bool(0) == False and bool(x) == True for x > 0.
Another approach to accomplish the same task is this:
list_2 = list(filter(all, l))
Here, the all function evaluates to true if every element it is given is truthy. Since bool(x) is true for every non-zero number, all(sub) is true for a sublist sub exactly when none of its values is zero.
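The two ways of dropping sublists that contain a zero can be compared side by side, as in the sketch below. The rows list is a short made-up sample, not the Grasshopper data from the question. The last line uses any as a hypothetical counterpart to the sum trick above, for the different task of dropping only the all-zero sublists.

```python
# Made-up sample with the same kind of values as in the question
rows = [[1, 1, 2, 0], [0, 0, 0, 0], [1, 1, 2, 2], [2, 2, 2, 2], [2, 0, 0, 2]]

# Keep only sublists with no zero at all
no_zero_comprehension = [row for row in rows if 0 not in row]
no_zero_filter = list(filter(all, rows))

# Different task: keep sublists that have at least one non-zero value
not_all_zero = list(filter(any, rows))

print(no_zero_comprehension)
print(not_all_zero)
```

The comprehension states the intent most directly; filter(all, ...) relies on 0 being the only falsy value that can occur in the sublists.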