I was given a matrix question on a technical assessment that revolved around the idea of finding the "Max Hotspot".
Areas of connected 1's represent a hotspot, and given a matrix we want to return the size of the largest hotspot. The problem is similar to "Number of Islands" or "Word Search".
areamap = [[1, 0, 0, 0],
           [1, 1, 1, 0],
           [0, 0, 0, 1],
           [1, 1, 1, 1]]
areamap = [[0, 0, 0, 0],
           [0, 1, 1, 0],
           [0, 0, 0, 0],
           [1, 0, 0, 0]]
I tried a four-way DFS but had trouble creating and incrementing a variable that keeps track of the size of each hotspot. Any suggestions or help? My attempt is below.
The idea of my algorithm is that every time we find a 1, we "evaporate" it (set it to 0) so that it is not visited twice. For each 1 we evaporate, we increment the count of 1's. However, the count variable and the tmp list always print/return as empty.
class Solution:
    def hotSpots(self, area: List[int]):
        def evap(r, c, cnt):
            if r < 0 or c < 0 or r >= len(area) or c >= len(area[0]) or area[r][c] == "0":
                return
            cnt += 1
            area[r][c] = "0"
            evap(r, c + 1)
            evap(r, c - 1)
            evap(r + 1, c)
            evap(r - 1, c)
            return
        tmp = []
        for i in range(len(area)):
            for j in range(len(area[0])):
                count = 0
                if area[i][j] == "1":
                    evap(i, j, count)
                    tmp.append(count)
        return sum(max(tmp))
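There are three issues with the code:
1. evap never returns a result, and cnt is a local integer inside evap, so incrementing it does not change count in the caller; count never becomes anything other than 0.
2. The matrix holds integers, but its elements are compared against the strings "0" and "1".
3. max(tmp) is already a single integer, so wrapping it in sum() raises a TypeError.
Fixing those issues yields the following code, which returns the result you want:
def hotSpots(area):
    def evap(r, c):
        # Out of bounds or not part of a hotspot: contributes nothing
        if r < 0 or c < 0 or r >= len(area) or c >= len(area[0]) or area[r][c] == 0:
            return 0
        area[r][c] = 0  # "evaporate" the cell so it is not counted twice
        return 1 + evap(r, c + 1) + evap(r, c - 1) + evap(r + 1, c) + evap(r - 1, c)
    tmp = []
    for i in range(len(area)):
        for j in range(len(area[0])):
            if area[i][j] == 1:
                count = evap(i, j)
                tmp.append(count)
    # max() already returns one integer; default=0 covers a matrix without any 1
    return max(tmp, default=0)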
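The recursive approach above overwrites the input matrix and, on a large grid, could run into Python's recursion limit. As a minimal sketch of an alternative (the name max_hotspot and the seen grid are my own choices, not part of the original question), the same four-way flood fill can be written with an explicit stack:

```python
def max_hotspot(area):
    """Return the size of the largest 4-connected group of 1s (0 if there is none)."""
    if not area or not area[0]:
        return 0
    rows, cols = len(area), len(area[0])
    # Track visited cells separately so the input matrix is left untouched
    seen = [[False] * cols for _ in range(rows)]
    best = 0
    for i in range(rows):
        for j in range(cols):
            if area[i][j] != 1 or seen[i][j]:
                continue
            # Start a new hotspot and explore it with an explicit stack
            seen[i][j] = True
            stack = [(i, j)]
            size = 0
            while stack:
                r, c = stack.pop()
                size += 1
                for nr, nc in ((r, c + 1), (r, c - 1), (r + 1, c), (r - 1, c)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and area[nr][nc] == 1 and not seen[nr][nc]):
                        seen[nr][nc] = True
                        stack.append((nr, nc))
            best = max(best, size)
    return best


areamap = [[1, 0, 0, 0],
           [1, 1, 1, 0],
           [0, 0, 0, 1],
           [1, 1, 1, 1]]
print(max_hotspot(areamap))  # 5
```

Marking a cell as seen when it is pushed, rather than when it is popped, keeps each cell on the stack at most once.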
I have a problem. I am given a 2D list of non-negative integers like
0, 0, 2, 0, 1
0, 2, 1, 1, 0
3, 0, 2, 1, 0
0, 0, 0, 0, 0
I have to move each number down by its own value: the 1's move down 1 row, the 2's move down 2 rows, the 3's move down 3 rows, and so on. If a number can't be moved down far enough, it wraps around to the top (e.g. a 3 in the second-to-last row wraps around to row index 1). If two numbers map to the same slot, the biggest number takes that slot.
After this transformation the given matrix above will end up like:
0, 0, 2, 0, 0
3, 0, 0, 0, 1
0, 0, 2, 1, 0
0, 2, 0, 1, 0
Here's my trivial solution to the problem (Assumes a list l is pre-set):
new = [[0] * len(l[0]) for _ in range(len(l))]
idx = sorted([((n + x) % len(l), m, x) for n, y in enumerate(l) for m, x in enumerate(y)], key=lambda e: e[2])
for x, y, z in idx:
    new[x][y] = z
print(new)
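The strategy is:
1. Build a list new of 0s with the same shape as l.
2. Store each number's new row index, its column index, and the number itself as a tuple in idx.
3. Sort idx by the number, so that larger numbers come last.
4. Write each number from idx into new at its new indices; on a collision the larger number is written later and wins.
5. Print new.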
I am not satisfied with this strategy. Is there a neater/better way to do this? I can use numpy.
Let's say you have
import numpy as np

a = np.array([
    [0, 0, 2, 0, 1],
    [0, 2, 1, 1, 0],
    [3, 0, 2, 1, 0],
    [0, 0, 0, 0, 0]])
You can get the locations of the elements with np.where or np.nonzero:
r, c = np.nonzero(a)
And the elements themselves with the index:
v = a[r, c]
Incrementing the row is simple now:
new_r = (r + v) % a.shape[0]
To settle collisions, sort the arrays so that large values come last:
i = v.argsort()
Now you can assign to a fresh matrix of zeros directly:
result = np.zeros_like(a)
result[new_r[i], c[i]] = v[i]
The result is
[[0 0 2 0 0]
[3 0 0 0 1]
[0 0 2 1 0]
[0 2 0 1 0]]
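The sort above exists only to make the largest value win a collision. As a sketch of an alternative, assuming the same non-negative integer input, NumPy's unbuffered np.maximum.at can keep the maximum per target cell directly, without sorting (the wrapper name drop_numbers is hypothetical):

```python
import numpy as np


def drop_numbers(a):
    """Move every non-zero entry down by its own value, wrapping around the rows."""
    a = np.asarray(a)
    r, c = np.nonzero(a)              # positions of the non-zero entries
    v = a[r, c]                       # the values at those positions
    new_r = (r + v) % a.shape[0]      # target rows, wrapped around the top
    result = np.zeros_like(a)
    # Unbuffered in-place maximum: repeated (row, col) targets keep the largest value
    np.maximum.at(result, (new_r, c), v)
    return result


a = np.array([
    [0, 0, 2, 0, 1],
    [0, 2, 1, 1, 0],
    [3, 0, 2, 1, 0],
    [0, 0, 0, 0, 0]])
print(drop_numbers(a))
```

For the example matrix this produces the same result as the sorted assignment.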
I suggest doing it like this, if only because it's more readable:
L = [[0, 0, 2, 0, 1],
     [0, 2, 1, 1, 0],
     [3, 0, 2, 1, 0],
     [0, 0, 0, 0, 0]]
R = len(L)
NL = [[0]*len(L[0]) for _ in range(R)]
for i, r in enumerate(L):
    for j, c in enumerate(r):
        _r = (c + i) % R
        if c > NL[_r][j]:
            NL[_r][j] = c
print(NL)
I am trying to compute the adjusted Rand index (ARI) between two sets of clusters, using this code:
import math

#computes ARI for this type of clustering
def ARI(table,n):
    index = 0
    sum_a = 0
    sum_b = 0
    for i in range(len(table)-1):
        for j in range(len(table)-1):
            sum_a += choose(table[i][len(table)-1],2)
            sum_b += choose(table[len(table)-1][j],2)
            index += choose(table[i][j],2)
    expected_index = (sum_a*sum_b)
    expected_index = expected_index/choose(n,2)
    max_index = (sum_a+sum_b)
    max_index = max_index/2
    return (index - expected_index)/(max_index-expected_index)
#choose to compute rand
def choose(n,r):
    f = math.factorial
    if (n-r)>=0:
        return f(n) // f(r) // f(n-r)
    else:
        return 0
Assuming I have created the contingency table correctly, I still get values outside the range (-1, 1).
For instance:
Contingency table:
[1, 0, 0, 0, 0, 0, 0, 1]
[1, 0, 0, 0, 0, 0, 0, 1]
[0, 0, 0, 1, 0, 0, 0, 1]
[0, 1, 0, 0, 0, 0, 0, 1]
[0, 0, 0, 0, 0, 1, 1, 2]
[1, 0, 1, 0, 1, 0, 0, 3]
[0, 0, 0, 0, 0, 0, 1, 1]
[3, 1, 1, 1, 1, 1, 2, 0]
yields an ARI of -1.6470588235294115 when I run my code.
Is there a bug in this code?
Also, here is how I am computing the contingency matrix:
table = [[0 for _ in range(len(subjects)+1)]for _ in range(len(subjects)+1)]
#comparing all clusters
for i in range(len(clusters)):
    index_count = 0
    for subject, orgininsts in orig_clusters.items():
        madeinsts = clusters[i].instances
        intersect_count = 0
        #comparing all instances between the 2 clusters
        for orginst in orgininsts:
            for madeinst in madeinsts:
                if orginst == madeinst:
                    intersect_count += 1
        table[index_count][i] = intersect_count
        index_count += 1
for i in range(len(table)-1):
    a = 0
    b = 0
    for j in range(len(table)-1):
        a += table[i][j]
        b += table[j][i]
    table[i][len(table)-1] = a
    table[len(table)-1][i] = b
clusters is a list of cluster objects that have an attribute instances, which is a list of the instances contained in that cluster. orig_clusters is a dictionary whose keys are cluster labels and whose values are lists of the instances contained in that cluster. Is there a bug in this code?
There is a mistake in how your code calculates the ARI: sum_a and sum_b are updated inside the inner loop, so every row sum and column sum is counted once per inner iteration instead of just once.
Also, you pass n as a parameter, and apparently it is set to 10 (that is how I reproduce your result). It would be easier to just pass the table and then calculate n from there. I fixed your code a bit:
def ARI(table):
    # this version expects the table WITHOUT the row/column-sum margins
    index = 0
    sum_a = 0
    sum_b = 0
    n = sum([sum(subrow) for subrow in table]) #all items summed
    for i in range(len(table)):
        b_row = 0 #this is to hold the col sums
        for j in range(len(table)):
            index += choose(table[i][j], 2)
            b_row += table[j][i]
        #outside of j-loop b.c. we want to use a=rowsums, b=colsums
        sum_a += choose(sum(table[i]), 2)
        sum_b += choose(b_row, 2)
    expected_index = (sum_a*sum_b)
    expected_index = expected_index/choose(n,2)
    max_index = (sum_a+sum_b)
    max_index = max_index/2
    return (index - expected_index)/(max_index-expected_index)
or, if you pass the table with the row and column sums in its last column and last row:
def ARI(table):
    index = 0
    sum_a = 0
    sum_b = 0
    last = len(table) - 1 #index of the margin row/column
    #the row sums already add up to the number of items, so count them only once
    n = sum(table[i][last] for i in range(last))
    for i in range(last):
        sum_a += choose(table[i][last],2)
        sum_b += choose(table[last][i],2)
        for j in range(last):
            index += choose(table[i][j],2)
    expected_index = (sum_a*sum_b)
    expected_index = expected_index/choose(n,2)
    max_index = (sum_a+sum_b)
    max_index = max_index/2
    return (index - expected_index)/(max_index-expected_index)
then
import math

def choose(n,r):
    f = math.factorial
    if (n-r)>=0:
        return f(n) // f(r) // f(n-r)
    else:
        return 0
table = [[1, 0, 0, 0, 0, 0, 0, 1],
         [1, 0, 0, 0, 0, 0, 0, 1],
         [0, 0, 0, 1, 0, 0, 0, 1],
         [0, 1, 0, 0, 0, 0, 0, 1],
         [0, 0, 0, 0, 0, 1, 1, 2],
         [1, 0, 1, 0, 1, 0, 0, 3],
         [0, 0, 0, 0, 0, 0, 1, 1],
         [3, 1, 1, 1, 1, 1, 2, 0]]
ARI(table)
returns approximately -0.0976 (exactly -4/41), which lies inside the expected range. Note that this table already contains the margins, so it has to go to the second version; passing it to the first version counts the margins as data and gives -0.0604008667388949 instead.
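For comparison, here is a compact sketch of the same formula that works on a margin-free contingency table and does not require it to be square. The function name adjusted_rand_index is hypothetical, math.comb needs Python 3.8 or later, and returning 1.0 for the degenerate case with a zero denominator is my own assumption about a sensible convention:

```python
from math import comb


def adjusted_rand_index(table):
    """ARI from a contingency table WITHOUT row/column-sum margins."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    n = sum(row_sums)                                   # total number of items
    total_pairs = comb(n, 2)
    index = sum(comb(x, 2) for row in table for x in row)
    sum_a = sum(comb(x, 2) for x in row_sums)
    sum_b = sum(comb(x, 2) for x in col_sums)
    if total_pairs == 0:
        return 1.0                                      # assumption: fewer than two items
    expected_index = sum_a * sum_b / total_pairs
    max_index = (sum_a + sum_b) / 2
    if max_index == expected_index:
        return 1.0                                      # assumption: degenerate clustering
    return (index - expected_index) / (max_index - expected_index)


# Two identical clusterings of five items agree perfectly
print(adjusted_rand_index([[2, 0], [0, 3]]))  # 1.0
```

Computing the row and column sums inside the function avoids having to keep the margins in sync with the table.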
Consider a sequence of coin tosses: 1, 0, 0, 1, 0, 1 where tail = 0 and head = 1.
The desired output is the sequence: 0, 1, 2, 0, 1, 0
Each element of the output sequence counts the number of tails since the last head.
I have tried a naive method:
def timer(seq):
    if seq[0] == 1: time = [0]
    if seq[0] == 0: time = [1]
    for x in seq[1:]:
        if x == 0: time.append(time[-1] + 1)
        if x == 1: time.append(0)
    return time
Question: Is there a better method?
Using NumPy:
import numpy as np
seq = np.array([1,0,0,1,0,1,0,0,0,0,1,0])
arr = np.arange(len(seq))
result = arr - np.maximum.accumulate(arr * seq)
print(result)
yields
[0 1 2 0 1 0 1 2 3 4 0 1]
Why arr - np.maximum.accumulate(arr * seq)? The desired output seemed related to a simple progression of integers:
arr = np.arange(len(seq))
So the natural question is, if seq = np.array([1, 0, 0, 1, 0, 1]) and the expected result is expected = np.array([0, 1, 2, 0, 1, 0]), then what value of x makes
arr + x = expected
Since
In [220]: expected - arr
Out[220]: array([ 0, 0, 0, -3, -3, -5])
it looks like x should be the negative of the cumulative max of arr * seq:
In [234]: arr * seq
Out[234]: array([0, 0, 0, 3, 0, 5])
In [235]: np.maximum.accumulate(arr * seq)
Out[235]: array([0, 0, 0, 3, 3, 5])
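One caveat, as far as I can tell: because arr starts at 0, a sequence that begins with tails gets 0 for its first element, whereas the naive timer function in the question reports 1 there. A sketch that uses 1-based positions instead handles both cases (the function name tails_since_last_head is hypothetical):

```python
import numpy as np


def tails_since_last_head(seq):
    """Count the tails (0) since the most recent head (1) at every position."""
    seq = np.asarray(seq)
    pos = np.arange(1, len(seq) + 1)               # 1-based positions
    # Position of the most recent head so far; stays 0 until the first head
    last_head = np.maximum.accumulate(pos * seq)
    return pos - last_head


print(tails_since_last_head([1, 0, 0, 1, 0, 1]))  # [0 1 2 0 1 0]
print(tails_since_last_head([0, 0, 1, 0]))        # [1 2 0 1]
```

With 1-based positions, "no head seen yet" (0) is distinguishable from "head at the first position" (1), which is what the 0-based version cannot express.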
Step 1: Invert l:
In [311]: l = [1, 0, 0, 1, 0, 1]
In [312]: out = [int(not i) for i in l]; out
Out[312]: [0, 1, 1, 0, 1, 0]
Step 2: List comp; add previous value to current value if current value is 1.
In [319]: [out[0]] + [x + y if y else y for x, y in zip(out[:-1], out[1:])]
Out[319]: [0, 1, 2, 0, 1, 0]
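This gets rid of windy ifs by zipping adjacent elements. Note that it only looks one element back, so it is correct only when no run of tails is longer than two; for l = [1, 0, 0, 0] it returns [0, 1, 2, 2] instead of [0, 1, 2, 3].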
Using itertools.accumulate:
>>> from itertools import accumulate
>>> a = [1, 0, 0, 1, 0, 1]
>>> b = [1 - x for x in a]
>>> list(accumulate(b, lambda total,e: total+1 if e==1 else 0))
[0, 1, 2, 0, 1, 0]
accumulate is only available in Python 3. The itertools documentation gives equivalent pure-Python code, though, if you want to use it in Python 2.
It's required to invert a because the first element returned by accumulate is always the first list element, regardless of the accumulator function:
>>> list(accumulate(a, lambda total,e: 0))
[1, 0, 0, 0, 0, 0]
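If Python 3.8 or later is available, the inversion step can be avoided, since accumulate accepts an initial keyword there. The following is a sketch under that assumption (the function name is hypothetical); the seed value is emitted first, so it is sliced off:

```python
from itertools import accumulate


def tails_since_last_head(seq):
    """Count the tails (0) since the most recent head (1), using a seeded accumulate."""
    def step(total, e):
        # A head resets the counter, a tail increments it
        return 0 if e == 1 else total + 1
    # initial=0 seeds the running total; drop the seed from the output
    return list(accumulate(seq, step, initial=0))[1:]


print(tails_since_last_head([1, 0, 0, 1, 0, 1]))  # [0, 1, 2, 0, 1, 0]
```

Because the first input element now passes through step like every other element, a sequence that starts with a tail yields 1 in the first position, matching the naive timer function.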
The required output is an array with the same length as the input, and none of its values equal the corresponding input values. Therefore, any algorithm must be at least O(n) just to build the new output array. Furthermore, for this specific problem you also need to scan every value of the input array. All of these operations are O(n), so it cannot get asymptotically more efficient. Constants may differ, but your method is already O(n) and will not go any lower.
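Using reduce (in Python 3, import it first with from functools import reduce):
time = reduce(lambda l, r: l + [(l[-1]+1)*(not r)], seq, [0])[1:]
Note that l + [...] builds a new list at every step, so this one-liner takes quadratic time on long sequences.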
I try to be clear in the following code, which differs from the original in using an explicit accumulator.
>>> s = [1,0,0,1,0,1,0,0,0,0,1,0]
>>> def zero_run_length_or_zero(seq):
...     "Return the run length of zeroes so far in the sequence, or zero"
...     accumulator, answer = 0, []
...     for item in seq:
...         accumulator = 0 if item == 1 else accumulator + 1
...         answer.append(accumulator)
...     return answer
...
>>> zero_run_length_or_zero(s)
[0, 1, 2, 0, 1, 0, 1, 2, 3, 4, 0, 1]
We have a list
list = [1, 1, 1, 0, 1, 0, 0, 1]
I am trying to find a function that counts the number of 0's right before each item and then multiplies this number by 3.
def formula(n):
    for x in list:
        if x == 1:
            form = n * 3
            return form
#If x is 1, count the number of zeros right before x and multiply this by 3,
For example, for the list above, the first element is a 1 and there are no numbers right before it, so the program should compute 0 * 3 = 0. For the second item, which is also a 1, the number right before it is not a zero, so the program should also compute 0 * 3 = 0. The 4th element is 0, so the program should ignore it. For the 5th element, which is a 1, the number right before it is a 0, so the program should compute 1 * 3 = 3. For the 6th element the number right before it is a 1, so the system should compute 0 * 3 = 0. The 7th element is a 0; since x is not equal to 1, the program should not do anything. For the last element, which is a 1, the last two numbers before it are zeros, so the program should compute 2 * 3 = 6.
I believe you are looking for a generator with a simple counter:
def get_values(values):
    n = 0
    for x in values:
        if x == 1:
            yield n * 3
            n = 0 # reset the counter
        elif x == 0:
            n += 1 # increment the counter
print(list(get_values([1, 1, 1, 0, 1, 0, 0, 1])))
# [0, 0, 0, 3, 6]
Try this:
def formula(l):
    count_zero = 0
    result = []
    for i in l:
        if i == 1:
            result.append(3*count_zero)
            count_zero = 0
        elif i == 0:
            count_zero += 1
    return result
# Test case
l = [1, 1, 1, 0, 1, 0, 0, 1]
result = formula(l)
print(result)
# [0, 0, 0, 3, 6]
Here is my solution for the problem.
test_list = [1, 1, 1, 0, 1, 0, 0, 1]
def formula(values):
    track = []
    start = 0
    for i, n in enumerate(values):
        # everything since the last 1 that was preceded by zeros
        count_list_chunk = values[start:i]
        if count_list_chunk.count(0) > 0 and n != 0:
            start = i
        if n != 0:
            track.append(count_list_chunk.count(0) * 3)
    return track
print(formula(test_list))
# [0, 0, 0, 3, 6]
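Another way to look at the problem is in terms of runs: each run of zeros only matters for the first 1 that follows it. The sketch below uses itertools.groupby under the assumption that the list contains only 0s and 1s; the function name zeros_before_ones and the factor parameter are hypothetical:

```python
from itertools import groupby


def zeros_before_ones(values, factor=3):
    """For every 1, emit factor * (number of zeros directly before it)."""
    result = []
    zeros = 0
    for value, run in groupby(values):
        length = sum(1 for _ in run)        # length of this run of equal values
        if value == 0:
            zeros = length                  # remember the run of zeros
        else:
            # Only the first 1 of a run has zeros directly before it
            result.append(zeros * factor)
            result.extend([0] * (length - 1))
            zeros = 0
    return result


print(zeros_before_ones([1, 1, 1, 0, 1, 0, 0, 1]))  # [0, 0, 0, 3, 6]
```

Trailing zeros with no 1 after them produce no output, which matches the counter-based answers above.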