I'm working on a problem that requires developing code for a context model:
$$P(A \mid x) = \frac{\sum_{a \in A} S(x, a)}{\sum_{a \in A} S(x, a) + \sum_{b \in B} S(x, b)}$$
which calculates the probability that a novel stimulus x belongs to category A. S is a similarity function I defined earlier as:
def calculate_similarity(x, y, theta=0.1):
    return (np.prod([1 if x[i] == y[i] else theta for i in range(len(x))]))
The function takes the test stimuli, the exemplars, the exemplar categories, and theta, and should return an array of the probabilities that each stimulus belongs to A.
def context_model(test_stimuli, exemplars, exemplar_categories, theta=0.1):
I know that I should iterate over every test stimulus and calculate the similarity between it and the exemplars. I can even split the exemplars by category name, "A" and "B", but when I run my code I get an index error.
    result = []
    a_data = np.argwhere(exemplar_categories == 'A')
    b_data = np.argwhere(exemplar_categories == 'B')
    for x in test_stimuli:
        a = [calculate_similarity(x, exemplars[i, :], theta) for a_data[i] in range(0, len(a_data))]
        b = [calculate_similarity(x, exemplars[i, :], theta) for b_data[i] in range(0, len(b_data))]
        final = sum(a)/ (sum(a) + sum(b))
        result = np.append(result, final)
    return result
When I run,
stimuli = np.array([
[1, 1, 1, 1, 0],
[0, 0, 1, 1, 0],
[1, 0, 0, 1, 1],
[1, 1, 0, 1, 1]])
exemplars = np.array([
[0, 0, 1, 0, 1],
[0, 1, 1, 1, 0],
[1, 0, 0, 0, 1],
[0, 0, 1, 1, 1],
[0, 0, 1, 0, 0],
[1, 0, 0, 1, 1]])
exemplar_categories = np.array(['B', 'A', 'B', 'B', 'B', 'A'])
context_model(stimuli, exemplars, exemplar_categories, theta=0.1)
I get the error "index 2 is out of bounds for axis 0 with size 2".
I've been working on this problem for days and don't know where to go from here. I've tried applying the function to every X in the stimuli and then parsing out the A and B data, but nothing is working.
Any help would be greatly appreciated!! I'm still new to coding and this problem is driving me crazy.
I want to build a kernel (mask) of zeros and ones. I have a list of pairs of heights, e.g. [[191.0, 243.0], [578.0, 632.0]]. I want to set to one every row of the kernel whose height lies between the two values of a pair.
[Image: example of what I want to do (in this case 2 pairs, the values above)]
My method makes a double loop but takes minutes to execute. Is there any faster way to do this? Here is the code:
mascara = numpy.zeros((height, width), numpy.uint8)
#print(mascara)
for i in range(height):
    for j in range(width):  # For each element of the kernel (kernel=mascara)
        indice_parejas = 0
        while indice_parejas < numero_de_parejas:  # indice_parejas: index that indicates the pair we are checking
            #print(i,j)  # numero_de_parejas: how many pairs we have
            if i > vector_mascara[indice_parejas][0] and i < vector_mascara[indice_parejas][1]:  # If it is between a pair
                mascara[i][j] = 1
                break  # we don't have to check the other pairs because we know it is in this pair (get out of the loop)
            else:
                indice_parejas = indice_parejas + 1
IIUC, it is quite simple:
for h0, h1 in hpairs:
    mask[h0:h1, :] = 1
Reproducible example
h, w = 8, 4  # 8 rows (height), 4 columns (width)
mask = np.zeros((h, w), dtype=np.uint8)
hpairs = [
    [1,3],
    [5,6],
]
for h0, h1 in hpairs:
    mask[h0:h1, :] = 1
>>> mask
array([[0, 0, 0, 0],
[1, 1, 1, 1],
[1, 1, 1, 1],
[0, 0, 0, 0],
[0, 0, 0, 0],
[1, 1, 1, 1],
[0, 0, 0, 0],
[0, 0, 0, 0]], dtype=uint8)
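Two details from the question are worth handling explicitly: the heights are floats, which cannot be used directly as slice indices, and the original loop uses strict inequalities (`i > low and i < high`), whereas a slice `low:high` includes `low`. A minimal sketch that reproduces the strict bounds follows; `build_mask` is a name made up for this illustration.

```python
import math
import numpy as np

def build_mask(height, width, pairs):
    mask = np.zeros((height, width), dtype=np.uint8)
    for low, high in pairs:
        # Rows i with low < i < high, matching the strict comparison
        # in the original double loop.
        start = max(math.floor(low) + 1, 0)
        stop = max(math.ceil(high), 0)
        mask[start:stop, :] = 1
    return mask

mask = build_mask(8, 4, [[1.0, 3.0], [5.0, 7.0]])
print(mask)
```

With these pairs only rows 2 and 6 are set, because rows 1, 3, 5 and 7 sit exactly on a bound and are excluded by the strict comparison.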
Given some m by n grid of 1's and 0's, how would you find how much water would be captured by it, where the 1's are 'walls', and 0's are empty space?
Examples:
[1, 1, 1, 1, 1],
[1, 0, 0, 0, 1],
[1, 0, 0, 0, 1],
[1, 0, 0, 0, 1],
[1, 1, 1, 1, 1]
This grid would capture 9 units of water.
[1, 1, 1, 0, 1],
[1, 0, 0, 0, 1],
[1, 0, 0, 0, 1],
[1, 0, 0, 0, 1],
[1, 1, 1, 1, 1]
However, because this grid has a 'leak' in one of its walls, this would capture 0 units of water.
[1, 1, 1, 0, 1],
[1, 0, 1, 0, 1],
[1, 0, 1, 0, 1],
[1, 0, 1, 0, 1],
[1, 1, 1, 1, 1]
Likewise, because there is a partition between the two sections, the leaky one does not affect the other, and as such this grid would capture 3 units of water.
I'm just really uncertain of how to start on this problem. Are there any algorithms that would be helpful for this? I was thinking depth-first-search or some sort of flood-fill, but now I'm not sure if those are applicable to this exercise.
You can create a list of leaks starting from the positions of 0s on the edges. Then expand that list with 0s that are next to the leaking positions (until no more leaks can be added). Finally, subtract the number of leaks from the total number of zeros in the grid.
def water(G):
    rows = len(G)
    cols = len(G[0])
    # initial leaks are 0s on edges
    leaks = [ (r,c) for r in range(rows) for c in range(cols)
              if G[r][c]==0 and (r==0 or c==0 or r==rows-1 or c==cols-1) ]
    for r,c in leaks:
        for dr,dc in [(-1,0),(1,0),(0,-1),(0,1)]:  # offsets of neighbours
            nr,nc = r+dr, c+dc                      # coordinates of a neighbour
            if nr not in range(rows): continue      # out of bounds
            if nc not in range(cols): continue      # out of bounds
            if G[nr][nc] != 0: continue             # wall
            if (nr,nc) in leaks: continue           # already known
            leaks.append((nr,nc))                   # add new leak
    return sum( row.count(0) for row in G) - len(leaks)
Output:
grid = [[1, 1, 1, 1, 1],
[1, 0, 0, 0, 1],
[1, 0, 0, 0, 1],
[1, 0, 0, 0, 1],
[1, 1, 1, 1, 1]]
print(water(grid)) # 9
grid = [[1, 1, 1, 0, 1],
[1, 0, 0, 0, 1],
[1, 0, 0, 0, 1],
[1, 0, 0, 0, 1],
[1, 1, 1, 1, 1]]
print(water(grid)) # 0
grid = [[1, 1, 1, 0, 1],
[1, 0, 1, 0, 1],
[1, 0, 1, 0, 1],
[1, 0, 1, 0, 1],
[1, 1, 1, 1, 1]]
print(water(grid)) # 3
Note that this only looks for leaks in horizontal and vertical (but not diagonal) directions. To manage leaking through diagonals, you'll need to add (-1,-1),(-1,1),(1,-1),(1,1) to the list of offsets.
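As a sketch of how the diagonal option could be wired in, the same flood fill can take a flag that extends the list of offsets. This variant also keeps the leaks in a set so that the "already known" check does not scan a list; `trapped_water` and its `diagonal` parameter are names invented for this example.

```python
from collections import deque

def trapped_water(grid, diagonal=False):
    rows, cols = len(grid), len(grid[0])
    offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if diagonal:
        offsets += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    # Initial leaks: zeros located on the border of the grid.
    leaks = {(r, c) for r in range(rows) for c in range(cols)
             if grid[r][c] == 0 and (r in (0, rows - 1) or c in (0, cols - 1))}
    queue = deque(leaks)
    while queue:
        r, c = queue.popleft()
        for dr, dc in offsets:
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == 0 and (nr, nc) not in leaks:
                leaks.add((nr, nc))
                queue.append((nr, nc))
    # Water is every zero that no leak can reach.
    return sum(row.count(0) for row in grid) - len(leaks)

corner = [[1, 1, 0],
          [1, 0, 1],
          [1, 1, 1]]
print(trapped_water(corner))                 # 1
print(trapped_water(corner, diagonal=True))  # 0
```

In the small `corner` grid the centre cell only touches the open corner diagonally, so it holds water under the 4-neighbour rule and drains under the 8-neighbour rule.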
Removing zeros starting at the edges, representing the coordinates of zeros with a set (for fast lookup) of complex numbers (for easy neighbor calculation):
def water(G):
    m, n = len(G), len(G[0])
    zeros = {complex(i, j)
             for i in range(m) for j in range(n)
             if G[i][j] == 0}
    for z in list(zeros):
        if z.real in (0, m-1) or z.imag in (0, n-1):
            q = [z]
            for z in q:
                if z in zeros:
                    zeros.remove(z)
                    for a in range(4):
                        q.append(z + 1j**a)
    return len(zeros)
Or with Alain's style of a single BFS, initializing the queue with all edge zeros:
def water(G):
    m, n = len(G), len(G[0])
    zeros = {complex(i, j)
             for i in range(m) for j in range(n)
             if G[i][j] == 0}
    q = [z for z in zeros
         if z.real in (0, m-1) or z.imag in (0, n-1)]
    for z in q:
        if z in zeros:
            zeros.remove(z)
            for a in range(4):
                q.append(z + 1j**a)
    return len(zeros)
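For readers unfamiliar with the complex-number trick used in the two versions above, here is a small illustration of why `1j**a` for `a` in 0..3 yields the four orthogonal neighbour offsets: each power of `1j` is a quarter turn. The helper `neighbours` is hypothetical and only there to show the offsets as (row, column) pairs.

```python
# Successive powers of 1j rotate by 90 degrees: 1, 1j, -1, -1j.
offsets = [1j ** a for a in range(4)]

def neighbours(i, j):
    # The real part holds the row index, the imaginary part the column index.
    z = complex(i, j)
    return [(int((z + d).real), int((z + d).imag)) for d in offsets]

print(neighbours(2, 3))
```

Positions outside the grid never appear in the `zeros` set, so the answers above need no explicit bounds check.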
Rule 30 is a one-dimensional cellular automaton where only the cells in the previous generation are considered by the current generation. There are two states that a cell can be in: 1 or 0. The rules for creating the next generation are represented in the row below, and depend on the cell immediately above the current cell, as well as its immediate neighbours.
The cellular automaton is applied by the following rule (using bitwise operators):
left_cell ^ (central_cell | right_cell)
This rule forms the table below:
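As a quick check, the lookup table can be regenerated directly from the bitwise expression. The sketch below enumerates all eight (left, centre, right) patterns; `rule30_cell` and `table` are names chosen for this example.

```python
from itertools import product

def rule30_cell(left, center, right):
    # The bitwise form of Rule 30 quoted above.
    return left ^ (center | right)

# Patterns in the conventional order 111, 110, ..., 000.
table = {f"{l}{c}{r}": rule30_cell(l, c, r)
         for l, c, r in product((1, 0), repeat=3)}

for pattern, new_state in table.items():
    print(pattern, "->", new_state)
```

Reading the new states in that order gives the bits 00011110, which is 30 in binary and is where the rule gets its name.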
Now I tried to implement these rules into Python, using numpy. I defined an initial state that accepts width as a parameter and produces an initial row of zeros with 1 in the middle.
def initial_state(width):
    initial = np.zeros((1, width), dtype=int)
    if width % 2 == 0:
        initial = np.insert(initial, int(width / 2), values=0, axis=1)
        initial[0, int(width / 2)] = 1
        return initial
    else:
        initial[0, int(width / 2)] = 1
        return initial
The function below just produces the second generation given an initial row. How do I create a for loop that keeps producing new generations until the first element of the last bottom row becomes 1?
def rule30(array):
    row1 = np.pad(array,[(0,0), (1,1)], mode='constant')
    next_row = array.copy()
    for x in range(1, array.shape[0]+1):
        for y in range(1, array.shape[1]+1):
            if row1[x-1][y-1] == 1 ^ (row1[x-1][y] == 1 or row1[x-1][y+1] == 1):
                next_row[x - 1, y - 1] = 1
            else:
                next_row[x - 1, y - 1] = 0
    return np.concatenate((array, next_row))
For example, if the input is
A = [0, 0, 0, 1, 0, 0, 0]
The output should be
>>> print(rule30(A))
[[0, 0, 0, 1, 0, 0, 0],
[0, 0, 1, 1, 1, 0, 0],
[0, 1, 1, 0, 0, 1, 0],
[1, 1, 0, 1, 1, 1, 1]]
Here is code based on string representations and a lookup table. It uses some of the ideas from the comments above. I also added zero padding to handle the edge cells, since the question did not specify how to treat them. Also note that your proposed patterns table is not symmetric: compare the new states for '110' and '011'.
import numpy as np

def rule30(a):
    patterns = {'111': '0', '110': '0', '101': '0', '100': '1',
                '011': '1', '010': '1', '001': '1', '000': '0', }
    a = '0' + a + '0'  # padding
    return ''.join([patterns[a[i:i+3]] for i in range(len(a)-2)])

a = '0001000'
result = [list(map(int, a))]
while a[0] != '1':
    a = rule30(a)
    result.append(list(map(int, a)))
print(result)  # list of lists
print(np.array(result))  # np.array
list of lists:
[[0, 0, 0, 1, 0, 0, 0], [0, 0, 1, 1, 1, 0, 0], [0, 1, 1, 0, 0, 1, 0], [1, 1, 0, 1, 1, 1, 1]]
np.array:
array([[0, 0, 0, 1, 0, 0, 0],
[0, 0, 1, 1, 1, 0, 0],
[0, 1, 1, 0, 0, 1, 0],
[1, 1, 0, 1, 1, 1, 1]])
Method 1 - Numpy
You could achieve this using the following slight modification to your current code - alter the return value of rule30 to return np.array(next_row). Then you can use the following function:
def apply_rule(n):
    rv = initial_state(n)
    while rv[-1][0] == 0:
        rv = np.append(rv, rule30(rv[-1].reshape(1,-1)), axis=0)
    return rv
Usage:
>>> apply_rule(7)
array([[0, 0, 0, 1, 0, 0, 0],
[0, 0, 1, 1, 1, 0, 0],
[0, 1, 1, 0, 0, 1, 0],
[1, 1, 0, 1, 1, 1, 1]])
Or plotted:
>>> plt.imshow(apply_rule(7), cmap='hot')
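A vectorised variant of the NumPy approach is sketched below for comparison. It replaces the nested loops with slices of a padded row and is not the code from the question; `rule30_row` and `run_rule30` are names introduced here, and the seed row simply puts the 1 at index `width // 2` instead of widening even-width rows as `initial_state` does.

```python
import numpy as np

def rule30_row(row):
    # Pad with one zero on each side so the edge cells have neighbours.
    padded = np.pad(row, 1, mode="constant")
    left, center, right = padded[:-2], padded[1:-1], padded[2:]
    return left ^ (center | right)

def run_rule30(width):
    first = np.zeros(width, dtype=int)
    first[width // 2] = 1  # assumption: single seed cell in the middle
    rows = [first]
    # Stop once the first cell of the newest generation becomes 1.
    while rows[-1][0] == 0:
        rows.append(rule30_row(rows[-1]))
    return np.array(rows)

print(run_rule30(7))
```

Each generation is computed with three array slices, so the cost per row no longer depends on a Python-level loop over cells.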
Method 2 - Lists
Alternatively, you could use the following solution without using numpy, which uses a few functions to apply the Rule 30 logic across each triple in each padded list, until the stop-condition is met.
Code:
def rule(t):
    return t[0] ^ (t[1] or t[2])

def initial_state(width):
    initial = [0]*width
    if width%2:
        initial[width // 2] = 1
    else:
        initial.insert(width//2, 1)
    return initial

def get_triples(l):
    return zip(l,l[1:],l[2:])

def rule30(l):
    return [rule(t) for t in get_triples([0] + l + [0])]

def apply_rule(width):
    rv = [initial_state(width)]
    while not rv[-1][0]:
        rv.append(rule30(rv[-1]))
    return rv
Usage:
>>> apply_rule(7)
[[0, 0, 0, 1, 0, 0, 0],
[0, 0, 1, 1, 1, 0, 0],
[0, 1, 1, 0, 0, 1, 0],
[1, 1, 0, 1, 1, 1, 1]]
>>> [''.join(str(y) for y in x) for x in apply_rule(7)]
['0001000',
'0011100',
'0110010',
'1101111']
Matplotlib visualisation (using either method):
import matplotlib.pyplot as plt
plt.figure(figsize=(10,6))
plt.imshow(apply_rule(250), cmap='hot')
I was trying to write code for an identity matrix and came up with this:
def identidade(n):
    i =0
    l = [0] * n
    l1 = [l.copy()] *n
    for i in range (n):
        l1[i][i] = 1
    print(l1)
    return l1
the output is:
[[1, 1, 1, 1, 1],
[1, 1, 1, 1, 1],
[1, 1, 1, 1, 1],
[1, 1, 1, 1, 1],
[1, 1, 1, 1, 1]]
But I came across very similar code on the internet:
def identity(n):
    m=[[0 for x in range(n)] for y in range(n)]
    for i in range(0,n):
        m[i][i] = 1
    return m
that returns:
[[1, 0, 0, 0, 0],
[0, 1, 0, 0, 0],
[0, 0, 1, 0, 0],
[0, 0, 0, 1, 0],
[0, 0, 0, 0, 1]]
So my question is: why doesn't my code return the correct output when I set the element in the list of lists (l1[i][i] = 1)?
Thanks in advance.
The actual problem is that the * operator does not copy the inner list: '[l.copy()] * n' builds an outer list that holds n references to the same single copy. Calling copy() inside the square brackets only breaks the connection to the original 'l' list; it does not give each row its own list, so 'l1[i][i] = 1' writes into the one shared row every time.
Replace the * operator with a for loop (or a list comprehension) that creates a new list for each row, and the problem goes away.
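The aliasing can be seen directly with the `is` operator, and a comprehension that builds a fresh row on every iteration avoids it. The following is a minimal sketch; the function name `identity_fixed` is made up for this example.

```python
# Multiplying the outer list repeats a reference to one inner list.
aliased = [[0] * 3] * 3
print(aliased[0] is aliased[1])  # True: every row is the same object

def identity_fixed(n):
    # A new inner list is created on each iteration of the comprehension.
    rows = [[0] * n for _ in range(n)]
    for i in range(n):
        rows[i][i] = 1
    return rows

print(identity_fixed(3))  # [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
```

Using `[0] * n` for the inner list is fine because integers are immutable; only the outer multiplication causes the shared-row problem.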
So in a binary array I'm trying to find the points where a 0 and a 1 are next to each other, and redraw the array with these crossover points indicated by modifying the 0 value. Just wondering if there's a better way of comparing each of the values in a numpy array to the 8 surrounding values than using nested for loops.
Currently I have this, which compares to 4 surrounding just for readability here
for x in range(1, rows - 1):
    for y in range(1, columns - 1):
        if f2[x, y] == 0:
            if f2[x-1, y] == 1 or f2[x+1, y] == 1 or f2[x, y-1] == 1 or f2[x, y+1] == 1:
                f2[x, y] = 2
EDIT
For example
[[1, 1, 1, 1, 1, 1, 1],
[1, 1, 0, 0, 0, 1, 1],
[1, 1, 0, 0, 0, 1, 1],
[1, 1, 0, 0, 0, 1, 1],
[1, 1, 1, 1, 1, 1, 1]]
to
[[1, 1, 1, 1, 1, 1, 1],
[1, 1, 2, 2, 2, 1, 1],
[1, 1, 2, 0, 2, 1, 1],
[1, 1, 2, 2, 2, 1, 1],
[1, 1, 1, 1, 1, 1, 1]]
This problem can be solved quickly with binary morphology functions
import numpy as np
from scipy.ndimage import binary_dilation, generate_binary_structure
# Example array
f2 = np.zeros((5,5), dtype=float)
f2[2,2] = 1.
# This line determines the connectivity (all 8 neighbors or just 4)
struct_8_neighbors = generate_binary_structure(2, 2)
# Replace cell with maximum of neighbors (True if any neighbor != 0)
has_neighbor = binary_dilation(f2 != 0, structure=struct_8_neighbors)
# Was cell zero to begin with
was_zero = f2 == 0
# Update step
f2[has_neighbor & was_zero] = 2.
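For comparison, the same neighbour test can be sketched with NumPy alone by OR-ing shifted views of a padded array. This is an alternative technique, not the SciPy dilation above; `mark_boundary` and its `eight` flag are names invented for this example. Unlike the loop in the question, it also treats cells on the border of the array, with everything outside the array counted as "not 1".

```python
import numpy as np

def mark_boundary(a, eight=True):
    a = np.asarray(a)
    rows, cols = a.shape
    # Pad with False so that shifted views stay inside the array.
    ones = np.pad(a == 1, 1, mode="constant", constant_values=False)
    has_neighbor = np.zeros(a.shape, dtype=bool)
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if (dr, dc) == (0, 0):
                continue  # skip the cell itself
            if not eight and dr != 0 and dc != 0:
                continue  # skip diagonals for 4-connectivity
            has_neighbor |= ones[1 + dr:1 + dr + rows, 1 + dc:1 + dc + cols]
    out = a.copy()
    out[(a == 0) & has_neighbor] = 2
    return out

example = np.array([[1, 1, 1, 1, 1, 1, 1],
                    [1, 1, 0, 0, 0, 1, 1],
                    [1, 1, 0, 0, 0, 1, 1],
                    [1, 1, 0, 0, 0, 1, 1],
                    [1, 1, 1, 1, 1, 1, 1]])
print(mark_boundary(example))
```

On the example from the question, only the centre zero has no 1 among its neighbours, so it is the only zero left unchanged.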