Numpy zeros 2d array: substituting elements at specific indices

For a function I have to write again for CodeSignal, I create an 'empty' matrix with numpy called 'result'. During the course of a for loop, I want to add 1s to certain elements of this zeros matrix:
matrix = [[True, False, False],
          [False, True, False],
          [False, False, False]]
matrix = np.array(matrix)  ## input matrix
(row, col) = matrix.shape
result = np.zeros((row,col), dtype=int)  ## made empty matrix of same size
for i in range(0, row):
    for j in range(0, col):
        mine = matrix[i,j],[i,j]
        if mine[0] == True:  ##for indices in input matrix where element is called True..
            result[i+1,j+1][i+1,j+1] = 1  ##..replace neighbouring elements with 1 (under construction ;) )
print(result)
My very first problem comes with the last part, substituting elements at given indices with another value.
E.g. result[1,1][1,1] = 1
I always get the error
TypeError: object does not support item assignment
and this happened after setting np.zeros to various object types - int32, int8, complex, float64...
If I try:
E.g. result[1,1][1,1] == 1
I get:
IndexError: invalid index to scalar variable.
So what is the way to change or add elements to 2d np arrays at specific locations?

It makes no sense to write:
matrix[i,j][i,j]
The matrix is a 2D array, so matrix[i,j] is a scalar, not an array. Indexing a scalar again, as in 0[i,j], is nonsensical.
You can implement this as:
for i in range(row-1):
    for j in range(col-1):
        if matrix[i,j]:
            result[i+1,j+1] = 1
This "shifts" the values of matrix one position to the right and one position down. It is better, though, to do this with a single slice assignment:
result[1:,1:] = matrix[:-1,:-1]
This then gives us:
>>> result
array([[0., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])
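As a quick check, the loop and the slice assignment can be compared side by side. This is a minimal sketch using the question's input; the names result_loop and result_slice are introduced here for illustration only.

```python
import numpy as np

matrix = np.array([[True, False, False],
                   [False, True, False],
                   [False, False, False]])
row, col = matrix.shape

# Loop version: index with a single [i, j] pair, never [i, j][i, j].
result_loop = np.zeros((row, col), dtype=int)
for i in range(row - 1):
    for j in range(col - 1):
        if matrix[i, j]:
            result_loop[i + 1, j + 1] = 1

# Slice version: copy the top-left block into the bottom-right block.
result_slice = np.zeros((row, col), dtype=int)
result_slice[1:, 1:] = matrix[:-1, :-1]

print(result_slice)
```

Both versions stop one row and one column short of the edge, so no index ever falls outside the array.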

Related

Finding 2D boolean patterns in larger boolean tensors/arrays

I am looking for a way to find a 2D pattern in an MxNxR tensor/array with PyTorch or NumPy.
For instance, I want to check whether each boolean pattern in a dictionary of patterns and frequencies (e.g. {6x6 : freq}) exists in a larger boolean tensor (e.g. 3x256x256).
Then I want to update the patterns and frequencies in the dictionary.
I was hoping there was a PyTorch-native way of doing this without looping over the tensor, or at least an optimized loop.
As far as I know, torch.where works when comparing against a scalar value. I'm not sure what to do when I have a 6x6 tensor instead of a single value.
I looked into Finding Patterns in a Numpy Array, but I don't think it's feasible to follow it for a 2D pattern.

I'm thinking maybe you can pull this off using convolutions. Let's imagine you have an input made up of 0s and 1s. Here we will take a minimal example with a 3x3 input and a 2x2 pattern:
>>> x = torch.tensor([[1., 0., 0.],
                      [0., 1., 0.],
                      [1., 0., 0.]])
And the pattern would be:
>>> pattern = torch.tensor([[1., 0.],
                            [0., 1.]])
Here the pattern can be found in the upper left corner of the input.
We perform a convolution with nn.functional.conv2d, using the pattern as the kernel.
>>> import torch.nn.functional as F
>>> img, mask = x[None, None], pattern[None, None]
>>> M = F.conv2d(img, mask)
>>> M
tensor([[[[2., 0.],
          [0., 1.]]]])
There is a match where the result is equal to the number of 1s in the pattern (this checks only the pattern's 1s, so a window containing extra 1s would also pass):
>>> M == mask.sum(dim=(2,3))
tensor([[[[ True, False],
          [False, False]]]])
You can deduce the frequencies from this final boolean mask. You can extend this method to multiple patterns by adding in kernels in your convolution.
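If an exact match is needed, one that also checks the pattern's 0s, an alternative is to compare every window against the pattern directly. The following is a minimal NumPy sketch rather than the convolution method described above; it assumes NumPy 1.20 or later for sliding_window_view, and the function name find_pattern is hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def find_pattern(x, pattern):
    """Return a boolean map of top-left corners where pattern occurs in x."""
    # windows has shape (H - h + 1, W - w + 1, h, w): one h x w view per position.
    windows = sliding_window_view(x, pattern.shape)
    # A position matches only if every element of its window equals the pattern.
    return (windows == pattern).all(axis=(-2, -1))


x = np.array([[1, 0, 0],
              [0, 1, 0],
              [1, 0, 0]], dtype=bool)
pattern = np.array([[1, 0],
                    [0, 1]], dtype=bool)

matches = find_pattern(x, pattern)
count = int(matches.sum())  # frequency of the pattern in x
```

For a 3x256x256 array, the same function could be applied to each 2D slice in turn, and the counts used to update the frequency dictionary.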

How to index single element in ndarray using list?

I'm starting to learn numpy and can't understand a very basic thing. I have a list of indexes into a multidimensional array (one for each axis). How can I set the value at the point in the array that corresponds to that index? Basically, how can I use the idxs variable in the following code and produce the same result?
A = np.zeros((2, 2))
idxs = [1, 0]
A[1, 0] = 1
A
array([[0., 0.],
       [1., 0.]])

Thanks to Ivan,
A[tuple(idxs)] = 1
works
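Why the tuple matters can be seen in a short sketch: NumPy treats a tuple as one index per axis, while a list of integers is treated as an array of indices along the first axis. The variable names B and C below are introduced only for illustration.

```python
import numpy as np

idxs = [1, 0]

# Tuple: equivalent to B[1, 0], so a single element is set.
B = np.zeros((2, 2))
B[tuple(idxs)] = 1

# List: fancy indexing along axis 0, so rows 1 and 0 are both set.
C = np.zeros((2, 2))
C[idxs] = 1

print(B)
print(C)
```

The tuple form works for any number of axes, as long as idxs holds one index per axis.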

numpy - Multidimensional boolean mask

I'm quite new to Python and numpy and I just cannot get this to work without manual iteration.
I have an n-dimensional data array with floating point values and an equally shaped boolean "mask" array. From these I need to get a new array of the same shape, holding the values from the data array wherever the mask is True at the same position. Everything else should be 0:
# given
data = np.array([[1., 2.], [3., 4.]])
mask = np.array([[True, False], [False, True]])
# target
[[1., 0.], [0., 4.]]
Seems like numpy.where() might offer this but I could not get it to work.
Bonus: Don't create a new array, but replace the data values in place where the mask is False, to avoid a new memory allocation.
Thanks!

This should work
data[~mask] = 0
A NumPy boolean array can be used as an index (https://docs.scipy.org/doc/numpy-1.15.0/user/basics.indexing.html#boolean-or-mask-index-arrays). The assignment is applied only to the elements where the index array is True. Here you first need to invert your mask so that False becomes True, because you want to overwrite the elements whose mask value is False.

Also, you can just multiply them, because True and False are treated as 1 and 0 respectively when a boolean array is used in arithmetic operations. So,
#element-wise multiplication
data*mask
or
np.multiply(data, mask)
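Since the question mentions numpy.where(), here is a small sketch that puts the three approaches next to each other on the question's example. The names masked_copy and masked_product are introduced here for illustration.

```python
import numpy as np

data = np.array([[1., 2.], [3., 4.]])
mask = np.array([[True, False], [False, True]])

# New array: take data where mask is True, 0 elsewhere.
masked_copy = np.where(mask, data, 0.0)

# Multiplication gives the same result for finite data.
masked_product = data * mask

# In place: overwrite the elements where mask is False (modifies data).
data[~mask] = 0

print(masked_copy)
```

One difference worth knowing: if data holds nan or inf at a masked-out position, data * mask leaves nan there (nan * 0 is nan), whereas np.where and the in-place assignment both produce 0.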

how to use vstack in a for loop in order to append csr_matrix matrices together

I am using the following piece of code in order to concatenate matrices of type csr_matrix together. It is based on How to flatten a csr_matrix and append it to another csr_matrix?
#! /usr/bin/python
# -*- coding: utf-8 -*-
import re, sys
import os
import numpy
from scipy.sparse import csr_matrix
from scipy.sparse import vstack
if __name__ == "__main__":
    centroids = []
    for i in range(0,3):
        a = csr_matrix([[i,i,i]])
        centroids = vstack((centroids, a), format='csr')
    print("centroids : " + str(centroids.shape[0]) +" "+ str(centroids.shape[1]))
As output I am getting
centroids : 4 3
The size of centroids should be 3 and not 4. Am I concatenating them correctly?
I tried the following just to see if I can ignore the first rows:
from sklearn.metrics.pairwise import euclidean_distances
matrix = euclidean_distances(centroids[1:][:], centroids[1:][:])
print(matrix)
[[ 0.          1.73205081  3.46410162]
 [ 1.73205081  0.          1.73205081]
 [ 3.46410162  1.73205081  0.        ]]
It sounds ok to me.

Don't use vstack in a loop, since it's expensive to change the size and sparsity of the matrix in every iteration.
Instead do:
centroids = []
for i in range(3):
    a = csr_matrix([[i, i, i]])
    centroids.append(a)
centroids = vstack(centroids, format="csr")
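A self-contained version of this pattern, with the imports it needs, might look like the following sketch. The variable name rows is introduced here to keep the Python list and the stacked matrix separate.

```python
from scipy.sparse import csr_matrix, vstack

rows = []
for i in range(3):
    # Collect the 1x3 sparse rows in a plain Python list first.
    rows.append(csr_matrix([[i, i, i]]))

# Stack once at the end instead of once per iteration.
centroids = vstack(rows, format="csr")
print(centroids.shape)
```

Because the list starts empty and vstack only ever sees real rows, no extra first row appears.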

It is the concatenation of [] with csr_matrix([[i,i,i]]) that is giving you the problem.
centroids = []
a = csr_matrix([[1,2,3]])
centroids = vstack((centroids, a), format='csr')
print(centroids.toarray())
gives you
array([[ 0.,  0.,  0.],
       [ 1.,  2.,  3.]])
So just start incrementing the counter from 1
centroids = []
for i in range(1,3):
    a = csr_matrix([[i,i,i]])
    centroids = vstack((centroids, a), format='csr')
By the way, stacking csr_matrices is really inefficient, as the sparsity of centroids keeps changing in every iteration. It is perhaps better to store the row indices, column indices and coefficients and then build the sparse matrix from them in one call. Have a look here.
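Collecting the row indices, column indices and coefficients first could look like the sketch below. It uses the csr_matrix((data, (row, col)), shape=...) constructor; the list names row_idx, col_idx and values are introduced here for illustration, and the values reproduce the question's [i, i, i] rows.

```python
from scipy.sparse import csr_matrix

n_rows, n_cols = 3, 3
row_idx, col_idx, values = [], [], []
for i in range(n_rows):
    for j in range(n_cols):
        if i != 0:  # skip zeros; they do not need to be stored
            row_idx.append(i)
            col_idx.append(j)
            values.append(i)

# Build the matrix once from (data, (row, col)) triplets.
centroids = csr_matrix((values, (row_idx, col_idx)), shape=(n_rows, n_cols))
print(centroids.toarray())
```

Passing shape explicitly keeps the all-zero first row, which would otherwise be impossible to infer from the stored entries alone.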

vstack is treating that initial centroids value as a 1-row matrix:
In [1]: from scipy import sparse
In [2]: centroids = []
In [3]: a = sparse.csr_matrix([[0,0,0]])
In [4]: b=sparse.vstack((centroids,a),format='csr')
In [5]: b
Out[5]:
<2x3 sparse matrix of type '<class 'numpy.float64'>'
with 0 stored elements in Compressed Sparse Row format>
In [6]: b.A
Out[6]:
array([[ 0.,  0.,  0.],
       [ 0.,  0.,  0.]])
a is all zeros, so it is a csr with 0 stored elements. To make things more obvious make a with nonzero values:
In [7]: a = sparse.csr_matrix([[1,1,1]])
In [8]: b=sparse.vstack((centroids,a),format='csr')
In [9]: b
Out[9]:
<2x3 sparse matrix of type '<class 'numpy.float64'>'
with 3 stored elements in Compressed Sparse Row format>
In [10]: b.A
Out[10]:
array([[ 0.,  0.,  0.],
       [ 1.,  1.,  1.]])
You should have printed centroids after the iteration; the nature of the problem would have been a lot more obvious.
What you are doing is like:
In [12]: x=[0]
In [13]: for i in range(3): x.append(i)
In [14]: x
Out[14]: [0, 0, 1, 2]
Yes, you could use centroids by slicing off the 1st row, but that's a clumsy way of solving a more basic problem - the starting value of your iteration.
If I start with a centroids with 0 rows I can avoid the problem
In [30]: centroids = sparse.csr_matrix((0,3),dtype=int)
In [31]: b=sparse.vstack((centroids,a),format='csr')
In [32]: b
Out[32]:
<1x3 sparse matrix of type '<class 'numpy.int32'>'
with 3 stored elements in Compressed Sparse Row format>
In [33]: b.A
Out[33]: array([[1, 1, 1]])
If you must do iteration with something like sparse.vstack, make sure you start with a meaningful value.
But as others point out, building a sparse array by repeated vstack is an inefficient process.

Normalise 2D Numpy Array: Zero Mean Unit Variance

I have a 2D Numpy array, in which I want to normalise each column to zero mean and unit variance. Since I'm primarily used to C++, my current method is to use loops to iterate over the elements in a column and do the necessary operations, then repeat this for all columns. I wanted to know about a Pythonic way to do so.
Let class_input_data be my 2D array. I can get the column mean as:
column_mean = numpy.sum(class_input_data, axis = 0)/class_input_data.shape[0]
I then subtract the mean from all columns by:
class_input_data = class_input_data - column_mean
By now, the data should be zero mean. However, the value of:
numpy.sum(class_input_data, axis = 0)
isn't equal to 0, implying that I have done something wrong in my normalisation. By isn't equal to 0, I don't mean very small numbers which can be attributed to floating point inaccuracies.

Something like:
import numpy as np
eg_array = 5 + (np.random.randn(10, 10) * 2)
normed = (eg_array - eg_array.mean(axis=0)) / eg_array.std(axis=0)
normed.mean(axis=0)
Out[14]:
array([ 1.16573418e-16, -7.77156117e-17, -1.77635684e-16,
        9.43689571e-17, -2.22044605e-17, -6.09234885e-16,
       -2.22044605e-16, -4.44089210e-17, -7.10542736e-16,
        4.21884749e-16])
normed.std(axis=0)
Out[15]: array([ 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])
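The same idea can be wrapped in a small helper. This is a sketch only: the function name standardise_columns is hypothetical, and replacing a zero standard deviation with 1 to avoid division by zero is an assumption that goes beyond the answer above.

```python
import numpy as np


def standardise_columns(a):
    """Scale each column of a 2D array to zero mean and unit variance."""
    # Converting to float avoids integer arithmetic for integer inputs.
    a = np.asarray(a, dtype=float)
    mean = a.mean(axis=0)
    std = a.std(axis=0)
    # Assumption: constant columns (std == 0) are only centred, not scaled.
    std = np.where(std == 0, 1.0, std)
    return (a - mean) / std


rng = np.random.default_rng(0)
eg_array = 5 + rng.standard_normal((10, 10)) * 2
normed = standardise_columns(eg_array)

# Compare with a tolerance; the means are tiny floats, not exact zeros.
print(np.allclose(normed.mean(axis=0), 0.0))
```

Checking with np.allclose rather than == 0 matters here, since the column means come out around 1e-16 because of floating point rounding.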
