code that plots a classifier's decision boundary

I'm having a hard time drawing this. Can someone help me, please? The requirements are:
1. Make linspaces of grid_resolution points in xlim and grid_resolution points in ylim. E.g., for xlim=(-1, 1), ylim=(0, 2) and grid_resolution=3, make the linspace (-1, 0, 1) of x coordinates and the linspace (0, 1, 2) of y coordinates.
2. Use np.tile() to repeat the x grid points grid_resolution times (e.g. (-1, 0, 1, -1, 0, 1, -1, 0, 1)) and np.repeat() to repeat each of the y grid points grid_resolution times (e.g. (0, 0, 0, 1, 1, 1, 2, 2, 2)).
3. Use np.stack() to combine the x grid points and y grid points into a 2D array of size grid_resolution^2 x 2 (e.g. [[-1, 0], [0, 0], [1, 0], [-1, 1], [0, 1], [1, 1], [-1, 2], [0, 2], [1, 2]]).
4. Make a dictionary keyed by -1 and 1 with values 'pink' and 'lightskyblue'.
5. Use clf.predict() on the 2D array of points to get predicted y values.
6. For each y in {-1, 1}, use plt.plot() to plot the points in your 2D array that have that predicted y value, in the color specified by your dictionary.
This is the function I have to fill in:
def plot_decision_boundary(clf, xlim, ylim, grid_resolution):
    """Display how clf classifies each point in the space specified by xlim and ylim.
    - clf is a classifier (already fit to data).
    - xlim and ylim are each 2-tuples of the form (low, high).
    - grid_resolution specifies the number of points into which the xlim is divided
      and the number into which the ylim interval is divided. The function plots
      grid_resolution * grid_resolution points."""
Below is the test code:
# Imports implied by the names used below.
import pandas as pd
from io import StringIO
from sklearn import svm
import matplotlib.pyplot as plt

data_string = """
x0, x1, y
0, 0, -1
-1, 1, -1
1, -1, -1
0, 1, 1
1, 1, 1
1, 0, 1
"""
df = pd.read_csv(StringIO(data_string), sep=r'\s*,\s+', engine='python')
clf = svm.SVC(kernel="linear", C=1000)
clf.fit(df[['x0', 'x1']], df['y'])
# Call student's function.
plot_decision_boundary(clf=clf, xlim=(-4, 4), ylim=(-4, 4), grid_resolution=100)
# Add training examples to plot.
colors = {-1:'red', 1:'blue'}
for y in (-1, 1):
    plt.plot(df.x0[df.y == y], df.x1[df.y == y], '.', color=colors[y])
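A minimal sketch that follows the six requirements listed in the question, in case it helps. The helper name make_grid is my own and not part of the assignment. matplotlib is imported inside the plotting function only so that the grid helper can be tried without a plotting backend.

```python
import numpy as np


def make_grid(xlim, ylim, grid_resolution):
    """Steps 1-3: return a (grid_resolution**2, 2) array of (x, y) grid points."""
    xs = np.linspace(xlim[0], xlim[1], grid_resolution)
    ys = np.linspace(ylim[0], ylim[1], grid_resolution)
    grid_x = np.tile(xs, grid_resolution)    # x0, x1, ..., x0, x1, ...
    grid_y = np.repeat(ys, grid_resolution)  # y0, y0, ..., y1, y1, ...
    return np.stack((grid_x, grid_y), axis=1)


def plot_decision_boundary(clf, xlim, ylim, grid_resolution):
    """Steps 4-6: color every grid point by the class clf predicts for it."""
    import matplotlib.pyplot as plt

    points = make_grid(xlim, ylim, grid_resolution)
    colors = {-1: 'pink', 1: 'lightskyblue'}
    predicted = clf.predict(points)
    for y in (-1, 1):
        selected = points[predicted == y]
        plt.plot(selected[:, 0], selected[:, 1], '.', color=colors[y])


# Reproduces the 3 x 3 example from step 3.
print(make_grid((-1, 1), (0, 2), 3).tolist())
```

Because the test code fits clf on a DataFrame, recent scikit-learn versions may warn about missing feature names when predicting on a plain array; the predictions themselves are unaffected.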

Related

Calculating neighbor-distances for all grid points in NumPy

Imagine a 3-dimensional grid in space, where each grid point has a binary value.
The values of the grid points are represented by a 3d numpy array.
For each grid point that has a value of 1, we want to know where the nearest 0-valued point is located, in 14 different directions (±x, ±y, ±z, and 8 diagonals).
That means, with a numpy array of shape (nx, ny, nz) as input, the output should be an array of shape (nx, ny, nz, 14), where each value in the last dimension of the output corresponds to distance of that point to the nearest 0-valued neighbor in one direction (but we don't need to calculate this for 0-valued grid points, so their values can be set to zero).
What is the most efficient way to calculate this in numpy?
My current approach is looping over the grid points (three nested for-loops), and for each point, first checking whether it's a 1, and if so, slicing the array 14 times to get the points in one of the directions starting from the current point, and taking the index of first 0 element as distance to nearest 0-valued neighbor in that direction.

Here's a way that should work for all the non-diagonal cases. The example is just in 1d, but should work along each axis in 3d. This just computes the distance to the nearest 0 to the left:
a = np.array([0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1])
n = len(a)
zero_index = np.arange(0, n) * (1 - a)
one_index = np.arange(0, n) * a
(one_index - np.maximum.accumulate(zero_index)) * a
# -> returns array([0, 1, 2, 3, 4, 0, 0, 0, 1, 2, 0, 1, 2])
The idea is that zero_index has the integer index wherever a is 0, and one_index has the integer index wherever a is 1. np.maximum.accumulate does a cumulative maximum, which fills the zero values in zero_index with the values from the left. The final * a just sets the final distance to 0 wherever there was a 0.
This doesn't handle the case when a[0] is 1. We can fix that by offsetting the zero_index by 1 (so it starts with 1 if a[0] is 0):
a = np.array([1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1])
n = len(a)
zero_index = np.arange(1, n+1) * (1 - a)
one_index = np.arange(0, n) * a
(one_index - np.maximum.accumulate(zero_index) + 1) * a
# -> returns array([1, 2, 3, 4, 5, 0, 0, 0, 1, 2, 0, 1, 2])
You can handle the opposite direction with reverse indexing.
Unfortunately I can't propose a vectorized way of doing the diagonals. Perhaps there's a way (some kind of shape or roll) that skews the diagonal to make it aligned with an axis.
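For the axis-aligned directions, the same trick can be applied along any axis of an n-dimensional array, and the opposite direction follows by flipping. A sketch under those assumptions (the function names dist_to_zero_before and dist_to_zero_after are mine). As in the second snippet above, cells with no 0 in the given direction get the distance to one step past the array edge.

```python
import numpy as np


def dist_to_zero_before(a, axis=0):
    """Distance from each 1 to the nearest 0 at a lower index along `axis`."""
    # Move the working axis to the end so the 1-D index ranges broadcast.
    a = np.moveaxis(np.asarray(a).astype(int), axis, -1)
    n = a.shape[-1]
    zero_index = np.arange(1, n + 1) * (1 - a)   # offset by 1, 0 where a is 1
    one_index = np.arange(n) * a
    last_zero = np.maximum.accumulate(zero_index, axis=-1)
    dist = (one_index - last_zero + 1) * a
    return np.moveaxis(dist, -1, axis)


def dist_to_zero_after(a, axis=0):
    """Same, but towards higher indices: flip, compute, flip back."""
    flipped = np.flip(np.asarray(a), axis=axis)
    return np.flip(dist_to_zero_before(flipped, axis=axis), axis=axis)


a = np.array([0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1])
print(dist_to_zero_before(a))
print(dist_to_zero_after(a))
```

For a 3d grid, calling both functions with axis=0, 1 and 2 gives six of the fourteen directions; the diagonals are not covered by this sketch.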

Function to sparsify a matrix given a specific block size

Problem Statement
I am trying to write a function that would sparsify a matrix given a target sparsity and an argument called block_shape which defines the minimum size of zeros block in the matrix. The target doesn't have to be met perfectly, but as close as possible.
For example, given the following arguments,
>>> matrix = [
[1, 1, 1, 1],
[1, 1, 1, 1],
[1, 1, 1, 1],
[1, 1, 1, 1]
]
>>> target = 0.5
>>> block_shape = (2, 2)
valid outputs of 50% sparsity could be
>>> sparse_matrix = sparsify(matrix, target, block_shape)
>>> sparse_matrix
[
[1, 1, 0, 0],
[1, 1, 0, 0],
[0, 0, 1, 1],
[0, 0, 1, 1]
]
>>> sparse_matrix = sparsify(matrix, target, block_shape)
>>> sparse_matrix
[
[1, 0, 0, 1],
[1, 0, 0, 1],
[0, 0, 1, 1],
[0, 0, 1, 1]
]
Note that there could be multiple valid sparsified versions of the input. The only criterion is to get as close to the target as possible. One of the constraints is that only zeros that form a block of shape block_shape are considered sparse.
For example, the matrix below has a sparsity level of 0%, given the arguments
>>> sparse_matrix = sparsify(matrix, target, block_shape)
>>> sparse_matrix
[
[1, 0, 0, 1],
[1, 1, 0, 0],
[0, 1, 1, 1],
[0, 0, 0, 0]
]
What I have so far
Currently, I have the following piece of code
import numpy as np
def sparsify(matrix, target, block_shape=None):
    if block_shape is None or block_shape == 1 or block_shape == (1,) or block_shape == (1, 1):
        # 1x1 is just bernoulli with p=target
        probs = np.random.uniform(size=matrix.shape)
        mask = np.zeros(matrix.shape)
        mask[probs >= target] = 1.0
    else:
        if isinstance(block_shape, int):
            block_shape = (block_shape, block_shape)
        if len(block_shape) == 1:
            block_shape = (block_shape[0], block_shape[0])
        mask = np.ones(matrix.shape)
        rows, cols = matrix.shape
        for row in range(rows):
            for col in range(cols):
                submask = mask[row:row+block_shape[0], col:col+block_shape[1]]
                if submask.shape != block_shape:
                    # we don't care about the edges, cannot partially sparsify
                    continue
                if (submask == 0).any():
                    # If current (row, col) is already in the sparsified area, skip
                    continue
                prob = np.random.random()
                if prob < target:
                    submask[:, :] = np.zeros(submask.shape)
    return matrix * mask, mask
The problem with the code above is that it does not match the target if the block size is not (1, 1)
>>> matrix = np.random.randn(100, 100)
>>> matrix, mask = sparsify(matrix, target=0.5, block_shape=(2, 2))
>>> print((matrix == 0).mean())
0.73
>>> print((mask == 0).mean())
0.73
Reason for discrepancy (I think)
I am not sure why I am not getting the target I expect, but I think it has something to do with the fact that I check the probability of every element, instead of the block as a whole. However, I have skipping conditions in my code, so I thought that should cover it
Edits
Edit 1 -- additional examples
Just giving some more examples.
Example 1: Given different block size
>>> sparse_matrix = sparsify(matrix, 0.25, (3, 3))
>>> sparse_matrix
[
[0, 0, 0, 1],
[0, 0, 0, 1],
[0, 0, 0, 1],
[1, 1, 1, 1]
]
The example above is a valid sparse matrix, although the level of sparsity is not 25%. Another valid result could be a matrix of all 1's.
Example 2: Given a different block size and target
>>> sparse_matrix = sparsify(matrix, 0.6, (1, 2))
>>> sparse_matrix
[
[0, 0, 0, 0],
[1, 0, 0, 1],
[0, 0, 1, 1],
[1, 1, 0, 0]
]
Notice that all zeroes can be put in blocks of (1, 2), and the sparsity level = 60%
Edit 2 -- forgot a constraint
Another constraint that I forgot to mention, but tried incorporating into my code is that the zero blocks must be non-overlapping.
Example 1: The result below is NOT valid
>>> sparse_matrix = sparsify(matrix, 0.5, (2, 2))
>>> sparse_matrix
[
[0, 0, 1, 1],
[0, 0, 0, 1],
[1, 0, 0, 1],
[1, 1, 1, 1]
]
Although the blocks starting at index (0, 0) and (1, 1) have valid zero-shapes, the result does not meet the requirements. The reason is that only one of those blocks can be considered valid. If we label the zero blocks as z0 and z1, this is what the matrix looks like:
[
[z0, z0, 1, 1],
[z0, z0, z1, 1],
[ 1, z1, z1, 1],
[ 1, 1, 1, 1]
]
The element at (1, 1) can be treated as belonging to z0 or z1, but not both. That means there is only one sparse block, which puts the level of sparsity at 25% (not ~44%).

The cells do not all have the same probability of becoming 0.
For example, with block_shape (2, 2), the cell at (0, 0) becomes 0 with probability target, since only one block visited by the loop covers it. The cell at (1, 0) has a probability higher than target, since two blocks cover it. Similarly, the cell at (1, 1) has a higher probability than (1, 0), because four blocks cover it: the ones starting at (0, 0), (1, 0), (0, 1) and (1, 1).
The same thing happens in the middle of the matrix, because of what earlier loop iterations have already done.
So the main variable affecting the result is block_shape.
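The effect can be measured directly. The sketch below re-implements only the block branch of the loop from the question (loop_mask and zero_frequency are names I made up) and counts how often each cell ends up 0 over many runs. For target=0.5 and block_shape=(2, 2), the cell (0, 0) should come out near 0.5, while (0, 1) should come out near 0.5 + 0.5 * 0.5 = 0.75, because it gets a second chance whenever the block at (0, 0) was not zeroed.

```python
import numpy as np


def loop_mask(shape, target, block_shape, rng):
    """Mask produced by the row-by-row loop from the question (block case only)."""
    mask = np.ones(shape)
    bh, bw = block_shape
    # Only start blocks where a full block fits (same as skipping the edges).
    for row in range(shape[0] - bh + 1):
        for col in range(shape[1] - bw + 1):
            submask = mask[row:row + bh, col:col + bw]
            if (submask == 0).any():
                continue  # would overlap an already zeroed block
            if rng.random() < target:
                submask[:, :] = 0
    return mask


def zero_frequency(shape, target, block_shape, trials=1000, seed=0):
    """Fraction of trials in which each cell ends up 0."""
    rng = np.random.default_rng(seed)
    zeros = np.zeros(shape)
    for _ in range(trials):
        zeros += loop_mask(shape, target, block_shape, rng) == 0
    return zeros / trials


freq = zero_frequency((6, 6), target=0.5, block_shape=(2, 2))
print(np.round(freq, 2))
```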
I've been fiddling around for a bit, and here's an alternative that uses a while loop instead of a for loop: keep simulating until the fraction of zeros is within err of the target. Just watch out for an infinite loop if err is too small.
import numpy as np
def sparsify(matrix, target, block_shape=None):
    if block_shape is None or block_shape == 1 or block_shape == (1,) or block_shape == (1, 1):
        # 1x1 is just bernoulli with p=target
        probs = np.random.uniform(size=matrix.shape)
        mask = np.zeros(matrix.shape)
        mask[probs >= target] = 1.0
    else:
        if isinstance(block_shape, int):
            block_shape = (block_shape, block_shape)
        if len(block_shape) == 1:
            block_shape = (block_shape[0], block_shape[0])
        mask = np.ones(matrix.shape)
        rows, cols = matrix.shape
        # vars for probability check
        total = float(rows * cols)
        zero_cnt = total - np.count_nonzero(matrix)
        err = 0.005  # .5%
        # simulate until we reach target probability range
        while not target - err < (zero_cnt / total) < target + err:
            # pick a random point in the matrix
            row = np.random.randint(rows)
            col = np.random.randint(cols)
            # submask = mask[row:row + block_shape[0], col:col + block_shape[1]]
            submask = matrix[row:row + block_shape[0], col:col + block_shape[1]]
            if submask.shape != block_shape:
                # we don't care about the edges, cannot partially sparsify
                continue
            if (submask == 0).any():
                # If current (row, col) is already in the sparsified area, skip
                continue
            # need more 0s to reach target probability range
            if zero_cnt / total < target - err:
                matrix[row:row + block_shape[0], col:col + block_shape[1]] = 0
            # need more 1s to reach target probability range
            else:
                matrix[row:row + block_shape[0], col:col + block_shape[1]] = 1
            # update 0 count
            zero_cnt = total - np.count_nonzero(matrix)
    return matrix * mask, mask
Notes:
- Didn't check for any optimization or code refactoring.
- Didn't use the mask var. Worked on the matrix directly.
matrix = np.ones((100, 100))
matrix, mask = sparsify(matrix, target=0.5, block_shape=(2, 2))
print((matrix == 0).mean())
# prints somewhere between target - err and target + err
# likely to see a lower value in the range since we're counting up (0s)
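Another option, which is a different technique from the simulation above: restrict the zero blocks to a fixed, non-overlapping tiling of the matrix and zero a fixed number of randomly chosen tiles. The sketch below assumes it is acceptable that blocks only ever start at multiples of block_shape. sparsify_tiled and its rng parameter are names I made up.

```python
import numpy as np


def sparsify_tiled(matrix, target, block_shape, rng=None):
    """Zero randomly chosen tiles of an aligned, non-overlapping tiling."""
    rng = np.random.default_rng() if rng is None else rng
    matrix = np.asarray(matrix, dtype=float)
    rows, cols = matrix.shape
    bh, bw = block_shape
    n_block_cols = cols // bw
    n_blocks = (rows // bh) * n_block_cols
    # Number of tiles whose total area is closest to target * (rows * cols),
    # capped at the number of tiles that actually fit.
    n_zero = min(int(round(target * rows * cols / (bh * bw))), n_blocks)
    chosen = rng.permutation(n_blocks)[:n_zero]
    mask = np.ones((rows, cols))
    for k in chosen:
        r, c = divmod(int(k), n_block_cols)
        mask[r * bh:(r + 1) * bh, c * bw:(c + 1) * bw] = 0
    return matrix * mask, mask


matrix = np.random.randn(100, 100)
sparse, mask = sparsify_tiled(matrix, target=0.5, block_shape=(2, 2))
print((mask == 0).mean())  # 0.5
```

The trade-off is that the result is less random than free placement, but the tiles cannot overlap and the achieved sparsity is as close to the target as this tiling allows.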

Scanning for groups of the same value in numpy array

I have a numpy array where 0 denotes empty space and 1 denotes that a location is filled. I am trying to find a quick method of scanning the numpy array for where there are multiple values of zero adjacent to each other and return the location of the central zero.
For Example if I had the following array
[0 1 0 1]
[0 0 0 1]
[0 1 0 1]
[1 1 1 1]
I want to return the locations for which there is an adjacent zero on either side of a central zero
e.g
[1,1]
as this is the centre of 3 zeros, i.e. there is a zero on either side of the zero at this location.
I'm aware that this can be done using if statements, but wondered if there was a more pythonic way of doing this.
Any help is greatly appreciated

The desired output here for arbitrary inputs is not exhaustively specified in the question, but here is a possible approach that might be useful for this kind of problem, and adapted to the details of the desired output. It uses np.cumsum, np.bincount, np.where, and np.median to find the middle index for groups of consecutive zeros along rows of a 2D array:
import numpy as np
def find_groups(x, min_size=3, value=0):
    # Compute a sequential label for groups in each row.
    xc = (x != value).cumsum(1)
    # Count the number of occurrences per group in each row.
    counts = np.apply_along_axis(
        lambda x: np.bincount(x, minlength=1 + xc.max()),
        axis=1, arr=xc)
    # Filter by minimum number of occurrences.
    i, j = np.where(counts >= min_size)
    # Compute the median index of each group.
    return [
        (ii, int(np.ceil(np.median(np.where(xc[ii] == jj)[0]))))
        for ii, jj in zip(i, j)
    ]
x = np.array([[0, 1, 0, 1],
              [0, 0, 0, 1],
              [0, 1, 0, 1],
              [1, 1, 1, 1]])
print(find_groups(x))
# [(1, 1)]
It should work properly even for multiple rows with groups of varying sizes, and even multiple groups per row:
x2 = np.array([[0, 1, 0, 1, 1, 1, 1],
[0, 0, 0, 1, 0, 0, 0],
[0, 1, 0, 0, 0, 0, 1],
[0, 0, 0, 0, 0, 0, 0]])
print(find_groups(x2))
# [(1, 1), (1, 5), (2, 3), (3, 3)]
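One thing to watch in find_groups: because groups are labelled with a cumulative sum, every group after the first in a row also counts the non-zero element that opens it, so a row like [1, 0, 0, 1] reaches min_size=3 with only two zeros. If all that is needed is "a zero with a zero on either side in the same row", as the question describes, a direct comparison of shifted slices may be enough. This is a sketch of that simpler technique, not a replacement for the grouping code (central_zeros is a name I made up), and it reports every such zero rather than one index per run.

```python
import numpy as np


def central_zeros(x):
    """(row, col) of every zero that has a zero directly left and right of it."""
    is_zero = np.asarray(x) == 0
    # Compare each interior column with its left and right neighbours.
    centre = is_zero[:, :-2] & is_zero[:, 1:-1] & is_zero[:, 2:]
    rows, cols = np.nonzero(centre)
    return list(zip(rows.tolist(), (cols + 1).tolist()))


x = np.array([[0, 1, 0, 1],
              [0, 0, 0, 1],
              [0, 1, 0, 1],
              [1, 1, 1, 1]])
print(central_zeros(x))
# [(1, 1)]
```

Applying the same function to x.T would cover vertical neighbours, with the returned pairs being (col, row).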

How can I display displacement and node numbers on a 3D truss?

I am trying to show displacement on a 3D truss example, but I am running into an error. I have simplified my code below. I am able to show displacement on a 2D problem, but not on a 3D problem. I am also trying to show the node number at each node. I managed to plot the nodes (green color), but the numbers are not showing even though I used the plt.annotate command. Can someone help me get the displacement and node numbers to show? Thank you in advance.
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import sys
np.set_printoptions(threshold=sys.maxsize)
def plot_truss(nodes, elements, areas,forces):
    # plot nodes in 3d
    fig = plt.figure()
    ax = fig.add_subplot(111, projection='3d')
    x = [i[0] for i in nodes.values()]
    y = [i[1] for i in nodes.values()]
    z = [i[2] for i in nodes.values()]
    # size = 400
    # ax.scatter(x, y, z, c='r', marker='o', s=size, zorder=5)
    size = 400
    offset = size / 4000
    ax.scatter(x, y, z, c='y', s=size, zorder=5)
    for i, location in enumerate(zip(x, y, z)):
        plt.annotate(i + 1, (location[0] - offset, location[1] - offset), zorder=10)
    # plot elements in 3d
    for element in elements:
        fromPoint = np.array(nodes[elements[element][0]])
        toPoint = np.array(nodes[elements[element][1]])
        x1 = fromPoint[0]
        y1 = fromPoint[1]
        z1 = fromPoint[2]
        x2 = toPoint[0]
        y2 = toPoint[1]
        z2 = toPoint[2]
        ax.plot([x1, x2], [y1, y2], zs=[z1, z2], c='b', linestyle='-', linewidth=5*areas[element], zorder=1)
nodes = {1: [0, 10, 0], 2: [0, 0, 0], 3: [10, 5, 0], 4: [0, 10, 10]}
areas = {1: 1.0, 2: 2.0, 3: 2.0}
elements = {1: [1, 3], 2: [2, 3], 3: [4, 3]}
forces = {1: [0, 0, 0], 2: [0, 0, 0], 3: [0, -200, 0], 4: [0, 0, 0]}
disps = {1: [0, 0, 0], 2: [0, 0, 0], 3: [ 3, -2, 4], 4: [0, 0, 0]}
def plt_displacement(elements, nodes, disp, color="red"):
    nodes_disp = np.copy(nodes)
    nodes_disp[:, 0] += disp[::2, 0]
    nodes_disp[:, 1] += disp[1::2, 0]
    plt.scatter(nodes_disp[:, 0], nodes_disp[:, 1], color=color)
    for e in elements:
        x_tmp = [nodes_disp[e[0], 0], nodes_disp[e[1], 0]]
        y_tmp = [nodes_disp[e[0], 1], nodes_disp[e[1], 1]]
        plt.plot(x_tmp, y_tmp, color=color)
plt_displacement(nodes,elements,disps)
plot_truss(nodes, elements, areas, forces)
plt.show()
When I run the code, I get the error below:
<ipython-input-47-758895b259be> in plt_displacement(elements, nodes, disp, color)
31 def plt_displacement(elements, nodes, disp, color="red"):
32 nodes_disp = np.copy(nodes)
---> 33 nodes_disp[:, 0] += disp[::2, 0]
34 nodes_disp[:, 1] += disp[1::2, 0]
35 plt.scatter(nodes_disp[:, 0], nodes_disp[:, 1], color=color)
IndexError: too many indices for array

It looks like you may have switched “nodes” and “elements” in your call to plt_displacement() (3rd and 12th to last lines) vs your definition.
plt_displacement(nodes,elements,disps)
def plt_displacement(elements, nodes, disp, color="red"):
I’m not sure exactly what plt_displacement is supposed to do. But nodes_disp is a copy of a dictionary, which gives a zero-dimensional object array (its shape is empty), so slicing won’t work.
>>> nodes_disp = np.copy(nodes)
>>> nodes_disp
array({1: [0, 10, 0], 2: [0, 0, 0], 3: [10, 5, 0], 4: [0, 10, 10]}, dtype=object)
>>> nodes_disp.shape
()
You can convert the dictionary's values to an array and slice that instead:
>>> nodes_disp = np.copy(list(nodes.values()))
>>> nodes_disp
array([[ 0, 10,  0],
       [ 0,  0,  0],
       [10,  5,  0],
       [ 0, 10, 10]])
But I’m not sure if that’s your intent.
Likewise, you’d have to convert disp to an array in order to slice it, as it is a dictionary.
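For what it's worth, here is a sketch of how the displaced shape and the node numbers could be drawn in 3D while keeping the dictionaries from the question. It assumes disps holds one [dx, dy, dz] per node, as in the posted data; displaced_nodes, plot_truss_3d and the scale parameter are names I made up. Node labels use ax.text, which takes x, y and z, whereas plt.annotate only takes 2D positions. matplotlib is imported inside the plotting function only so that the helper can be tried without a plotting backend.

```python
import numpy as np


def displaced_nodes(nodes, disps, scale=1.0):
    """Return {node_id: xyz array} with scale * displacement added to each node."""
    return {
        k: np.asarray(nodes[k], dtype=float) + scale * np.asarray(disps[k], dtype=float)
        for k in nodes
    }


def plot_truss_3d(nodes, elements, disps, scale=1.0):
    """Draw the original truss in blue, the displaced truss in red, and label nodes."""
    import matplotlib.pyplot as plt

    fig = plt.figure()
    ax = fig.add_subplot(111, projection="3d")
    moved = displaced_nodes(nodes, disps, scale)
    for coords, color in ((nodes, "b"), (moved, "r")):
        for n1, n2 in elements.values():
            p = np.asarray(coords[n1], dtype=float)
            q = np.asarray(coords[n2], dtype=float)
            ax.plot([p[0], q[0]], [p[1], q[1]], [p[2], q[2]], c=color)
    xyz = np.array(list(nodes.values()), dtype=float)
    ax.scatter(xyz[:, 0], xyz[:, 1], xyz[:, 2], c="y", s=100)
    for node_id, (x, y, z) in nodes.items():
        ax.text(x, y, z, str(node_id))  # 3D text label at the node position
    return ax


nodes = {1: [0, 10, 0], 2: [0, 0, 0], 3: [10, 5, 0], 4: [0, 10, 10]}
elements = {1: [1, 3], 2: [2, 3], 3: [4, 3]}
disps = {1: [0, 0, 0], 2: [0, 0, 0], 3: [3, -2, 4], 4: [0, 0, 0]}
print(displaced_nodes(nodes, disps)[3])
```

To see the figure, call plot_truss_3d(nodes, elements, disps) and then matplotlib.pyplot.show().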

Generate image matrix from Freeman chain code

Suppose I have a 8-direction freeman chain code as follows, in a python list:
freeman_code = [3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 5]
Where directions would be defined as follows:
I need to convert this to an image matrix of variable dimensions with values of 1s and 0s, where the 1s depict the shape. For example:
image_matrix = [
[0, 0, 1, 0, 0, 1],
[0, 0, 0, 1, 0, 1],
[0, 0, 0, 0, 1, 1]
]
Of course, the above is not an exact implementation of the above freeman code. Is there any implementation in python, or in any language that achieves this?
My idea (in python):
Use a defaultdict of defaultdicts with 0 as default:
ImgMatrixDict = defaultdict(lambda: defaultdict(lambda:0))
and then start at a midpoint, say ImgMatrixDict[25][25], and change values to 1 according to the Freeman code values as I traverse. After this I would convert ImgMatrixDict to a list of lists.
Is this a viable idea or are there any existing libraries or suggestions to implement this? Any idea/pseudo-code would be appreciated.
PS: Performance is not important, as I won't be doing this in real time. A code would generally be around 15-20 characters in length, so I assumed a 50x50 matrix would suffice for this purpose.

If I am understanding your question correctly:
import numpy as np
import matplotlib.pyplot as plt
freeman_code = [3, 3, 3, 6, 6, 4, 6, 7, 7, 0, 0, 6]
img = np.zeros((10,10))
x, y = 4, 4
img[y][x] = 1
for direction in freeman_code:
    if direction in [1,2,3]:
        y -= 1
    if direction in [5,6,7]:
        y += 1
    if direction in [3,4,5]:
        x -= 1
    if direction in [0,1,7]:
        x += 1
    img[y][x] = 1
plt.imshow(img, cmap='binary', vmin=0, vmax=1)
plt.show()

Here is a solution in Python. A dictionary is not well suited to this problem; you would be better off using a list of lists to represent the table.
D = 10
# DY, DX
FREEMAN = [(0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1), (1, 0), (1, 1)]
freeman_code = [3, 3, 3, 3, 6, 6, 6, 6, 0, 0, 0, 0]
image = [[0]*D for x in range(D)]
# Integer division: list indices must be ints in Python 3.
y = D // 2
x = D // 2
image[y][x] = 1
for i in freeman_code:
    dy, dx = FREEMAN[i]
    y += dy
    x += dx
    image[y][x] = 1
print("freeman_code")
print(freeman_code)
print("image")
for line in image:
    strline = "".join([str(x) for x in line])
    print(strline)
>0000000000
>0100000000
>0110000000
>0101000000
>0100100000
>0111110000
>0000000000
>0000000000
>0000000000
>0000000000
Note that the image creation is a condensed expression of:
image = []
for y in range(D):
    line = []
    for x in range(D):
        line.append(0)
    image.append(line)
If one day you need better performance for bigger images, there are solutions using the NumPy library, though they require a good knowledge of basic Python. Here is an example:
import numpy as np
D = 10
# DY, DX
FREEMAN = [(0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1), (1, 0), (1, 1)]
DX = np.array([1, 1, 0, -1, -1, -1, 0, 1])
DY = np.array([0, -1, -1, -1, 0, 1, 1, 1])
freeman_code = np.array([3, 3, 3, 3, 6, 6, 6, 6, 0, 0, 0, 0])
image = np.zeros((D, D), int)
y0 = D // 2
x0 = D // 2
image[y0, x0] = 1
dx = DX[freeman_code]
dy = DY[freeman_code]
xs = np.cumsum(dx)+x0
ys = np.cumsum(dy)+y0
print(xs)
print(ys)
image[ys, xs] = 1
print("freeman_code")
print(freeman_code)
print("image")
print(image)
Here, the 'for' loops of the previous solution are replaced by NumPy operations, which run in compiled C code.
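Since the question asks for a matrix of variable dimensions, the same cumulative-sum idea can also size the image to fit the path instead of using a fixed D. A sketch, assuming the direction numbering used in the answers above (0 points right, numbers increase counter-clockwise, and "up" means a smaller row index); chain_to_image is a name I made up.

```python
import numpy as np

# Row and column step for each of the 8 directions (same convention as above).
DY = np.array([0, -1, -1, -1, 0, 1, 1, 1])
DX = np.array([1, 1, 0, -1, -1, -1, 0, 1])


def chain_to_image(freeman_code):
    """Return the smallest 0/1 matrix that contains the whole path."""
    code = np.asarray(freeman_code, dtype=int)
    # Positions relative to the start point, start point included.
    ys = np.concatenate(([0], np.cumsum(DY[code])))
    xs = np.concatenate(([0], np.cumsum(DX[code])))
    # Shift so the smallest row and column index become 0.
    ys -= ys.min()
    xs -= xs.min()
    image = np.zeros((ys.max() + 1, xs.max() + 1), dtype=int)
    image[ys, xs] = 1
    return image


print(chain_to_image([3, 3, 3, 3, 6, 6, 6, 6, 0, 0, 0, 0]))
```

For this code the result is the 5 x 5 triangle from the printed output above, without the empty border.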
