Replace elements in Numpy array by value and location - python

I am working on a program that creates contour data out of NumPy arrays, and I am trying to avoid calls to matplotlib.
I have an array of length L which contains NxN arrays of booleans. I want to convert this into an LxNxN array where, for example, the "True"s in the first inner array get replaced by "red", in the second by "blue", and so forth.
The following code works as expected:
import numpy as np

def new_layer(N, p):
    return np.random.choice(a=[False, True], size=(N, N), p=[p, 1 - p])

a = np.array([new_layer(3, 0.5), new_layer(3, 0.5), new_layer(3, 0.5)]).astype('object')
colors = np.array(["red", "green", "blue"])
for i in range(np.shape(a)[0]):
    b = a[i]
    b[np.where(b == True)] = colors[i]
    a[i] = b
print(a)
But I am wondering if there is a way to accomplish the same thing using NumPy's built-in tools, e.g., indexing. I am a newcomer to NumPy and I suspect there is a better way to do this, but I can't think what it would be. Thank you.

You could use np.copyto:
np.copyto(a, colors[:, None, None], where=a.astype(bool))
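For context, a minimal end-to-end sketch of the np.copyto approach, assuming the question's setup (an L x N x N object array of booleans and one color per layer); the shapes here are made up for illustration:
import numpy as np

# hypothetical stand-in for the question's data: 3 boolean layers of shape 4x4
a = (np.random.rand(3, 4, 4) > 0.5).astype('object')   # object dtype so strings fit later
colors = np.array(["red", "green", "blue"])

# broadcast one color over each layer and copy it only where the mask is True
np.copyto(a, colors[:, None, None], where=a.astype(bool))
print(a)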

Here's one way -
a_bool = a.astype(bool)
a[a_bool] = np.repeat(colors, a_bool.sum((1, 2)))  # one color repeated per layer's count of Trues
Another, by extending colors to 3D -
a_bool = a.astype(bool)
colors3D = np.broadcast_to(colors[:, None, None], a.shape)  # shape (L, N, N)
a[a_bool] = colors3D[a_bool]

You can use a combination of boolean indexes and np.indices. You can also use a as an index into itself. Then you could do what you did in the for loop with this line (although I don't think it is necessarily a good idea):
a[a.astype(bool)] = colors[np.indices(a.shape)[0][a.astype(bool)]]
Also, for the new_layer function you could just use np.random.rand(N,N) > p (not sure if the actual distribution will be exactly the same as what you had).
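A quick sketch putting both suggestions together (the rand-based mask and the one-liner); the sizes are arbitrary and only meant to mirror the question's setup:
import numpy as np

N, p = 3, 0.5
# True with probability 1 - p, matching the original p=[p, 1-p] choice
a = np.array([np.random.rand(N, N) > p for _ in range(3)]).astype('object')
colors = np.array(["red", "green", "blue"])

# np.indices(a.shape)[0] is the layer index of every element; select it where a is True
a[a.astype(bool)] = colors[np.indices(a.shape)[0][a.astype(bool)]]
print(a)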

Related

Fill a portion of a list in Python / Equivalent of std::fill

I was just wondering if there is a clean solution in python to filling a portion of a list with some value (apart from simply iterating over the sublist). E.g., in C++ I would use std::fill. So far, I found the following syntax:
x = [0]*10 # some array
x[2:5] = [7]*3
A solution using numpy would be fine as well.
You can use np.repeat:
import numpy as np
x = np.repeat(0, 10)
x[2:5] = np.repeat(7, 3)
Each class has its own methods. For NumPy,
x = np.zeros(10, int)
makes an array of zeros.
x[2:7] = 3
assigns 3 to a portion of it.
That looks similar to your list example, but it is critically different in some ways: list slice assignment does not follow the same rules as NumPy slice assignment, as the sketch below shows.
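A small illustration of the difference, reusing the question's own example: a list slice assignment can change the list's length, while a NumPy slice assignment must broadcast to the slice's shape:
import numpy as np

lst = [0] * 10
lst[2:5] = [7]            # the list shrinks: 3 elements replaced by 1
print(len(lst))           # 8

arr = np.zeros(10, int)
arr[2:5] = 7              # the scalar broadcasts; the shape stays (10,)
print(arr)                # [0 0 7 7 7 0 0 0 0 0]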

How to get arrays that output results in brackets like [1][2][3] to [1 2 3]

The title kind of says it all. I have this (excerpt):
import numpy as np
import matplotlib.pyplot as plt

number_of_particles = 1000
phi = np.arccos(1 - 2 * np.random.uniform(0.0, 1., (number_of_particles, 1)))
vc = 2 * np.pi
mux = -vc * np.sin(phi)
and I get out
[[-4.91272413]
[-5.30620302]
[-5.22400513]
[-5.5243784 ]
[-5.65050497]...]
which is correct, but I want it to be in the format
[-4.91272413 -5.30620302 -5.22400513 -5.5243784 -5.65050497....]
Feel like there should be a simple solution, but I couldn't find it.
Suppose your array is represented by the variable arr.
You can do:
l = ''
for i in arr:
    l = l + str(i[0]) + ' '   # each i is a one-element row; this builds one space-separated string
arr = [l]
Use this command:
new_mux = [i[0] for i in mux]
But I need it in an array, so then I add this
new_mux = np.array(new_mux)
and I get the desired output.
There's a method transpose in numpy's array object
mux.transpose()[0]
(I just noticed that this is a very old question, but since I have typed up this answer, and I believe it is simpler and more efficient than the existing ones, I'll post it...)
Notice that when you do
np.random.uniform(0.0,1.,(number_of_particles, 1))
you are creating a two-dimensional array with number_of_particles rows and one column. If you want a one-dimensional array throughout, you could do
np.random.uniform(0.0,1.,(number_of_particles,))
instead.
If you want to keep creating 2d arrays but flatten mux after the fact, you can... well, reshape it:
mux_1d = mux.reshape(-1)
-1 here means "reshape it to one axis (because there's just one number) and figure out automatically how many elements there should be along that axis (because the number is -1)."
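For completeness, a tiny sketch comparing the options mentioned in this thread, with a made-up mux of 5 particles standing in for the question's 1000:
import numpy as np

mux = -2 * np.pi * np.sin(np.arccos(1 - 2 * np.random.uniform(0.0, 1., (5, 1))))

print(mux.reshape(-1))   # flatten to shape (5,)
print(mux.ravel())       # same result
print(mux[:, 0])         # or just take the single column directly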

Piecewise Operation on List of Numpy Arrays

My question is, can I make a function or variable that can perform an operation or NumPy method on each np.array element within a list in a more succinct way than what I have below (preferably by just calling one function or variable)?
Generating the list of arrays:
import numpy as np
array_list = [np.random.rand(3,3) for x in range(5)]
array_list
Current Technique of operating on each element:
My current method (as seen below) involves unpacking it and doing something to it:
[arr.std() for arr in array_list]
[arr + 2 for arr in array_list]
Goal:
My hope is to get something that could perform the operations above by simply typing:
x.std()
or
x + 2
Yes - use an actual NumPy array and perform your operations over the desired axes, instead of having them stuffed in a list.
actual_array = np.array(array_list)
actual_array.std(axis=(1, 2))
# array([0.15792346, 0.25781021, 0.27554279, 0.2693581 , 0.28742179])
If you generally wanted all axes except the first, this could be something like tuple(range(1, actual_array.ndim)) instead of explicitly specifying the tuple.
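A short sketch of that generalization, assuming the same array_list setup as above:
import numpy as np

array_list = [np.random.rand(3, 3) for _ in range(5)]
actual_array = np.array(array_list)            # shape (5, 3, 3)

# reduce over every axis except the first, whatever the dimensionality
axes = tuple(range(1, actual_array.ndim))      # (1, 2) here
print(actual_array.std(axis=axes).shape)       # (5,)
print((actual_array + 2).shape)                # elementwise ops need no axis argument at all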

Defining a matrix with unknown size in python

I want to use a matrix in my Python code but I don't know the exact size of my matrix to define it.
For other matrices, I have used np.zeros(a), where a is known.
What should I do to define a matrix with unknown size?
In this case, maybe an approach is to use a Python list and append to it until it has the desired size, then convert it to an np.array.
pseudocode:
matrix = []
while matrix not full:
    matrix.append(elt)
matrix = np.array(matrix)
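A concrete, runnable version of that sketch, with a toy loop and made-up row contents standing in for the real data:
import numpy as np

rows = []
for i in range(4):                   # stand-in for "while matrix not full"
    rows.append([i, i + 1, i + 2])   # stand-in for the real elements
matrix = np.array(rows)              # shape (4, 3)
print(matrix)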
You could write a function that tries to modify the np.array and expands it if it encounters an IndexError:
import numpy as np

x = np.random.normal(size=(2, 2))
val = 3.14                # whatever you want to store
r, c = (5, 10)
try:
    x[r, c] = val
except IndexError:
    r0, c0 = x.shape
    r_ = r + 1 - r0       # rows missing
    c_ = c + 1 - c0       # columns missing
    if r_ > 0:
        x = np.concatenate([x, np.zeros((r_, x.shape[1]))], axis=0)
    if c_ > 0:
        x = np.concatenate([x, np.zeros((x.shape[0], c_))], axis=1)
    x[r, c] = val         # retry the assignment on the enlarged array
There are problems with this implementation though: First, it makes a copy of the array and returns a concatenation of it, which translates to a possible bottleneck if you use it many times. Second, the code I provided only works if you're modifying a single element. You could do it for slices, and it would take more effort to modify the code; or you can go the whole nine yards and create a new object inheriting np.array and override the .__getitem__ and .__setitem__ methods.
Or you could just use a huge matrix, or better yet, see if you can avoid having to work with matrices of unknown size.
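For reference, a rough sketch of wrapping that grow-on-IndexError idea in a helper, as the answer suggests; the name set_growing and the example values are made up:
import numpy as np

def set_growing(x, r, c, val):
    """Assign x[r, c] = val, zero-padding x first if it is too small."""
    r0, c0 = x.shape
    if r >= r0:
        x = np.concatenate([x, np.zeros((r + 1 - r0, x.shape[1]))], axis=0)
    if c >= c0:
        x = np.concatenate([x, np.zeros((x.shape[0], c + 1 - c0))], axis=1)
    x[r, c] = val
    return x

x = np.random.normal(size=(2, 2))
x = set_growing(x, 5, 10, 3.14)
print(x.shape)   # (6, 11)
It still copies the array on every expansion, so the performance caveats above apply.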
If you have a python generator you can use np.fromiter:
def gen():
    yield 1
    yield 2
    yield 3
In [11]: np.fromiter(gen(), dtype='int64')
Out[11]: array([1, 2, 3])
Beware: if you pass an infinite iterator you will most likely crash Python, so it's often a good idea to cap the length (with the count argument):
In [21]: from itertools import count # an infinite iterator
In [22]: np.fromiter(count(), dtype='int64', count=3)
Out[22]: array([0, 1, 2])
Best practice is usually to either pre-allocate (if you know the size) or build the array as a list first (using list.append). But lists don't build in 2d very well, which I assume you want since you specified a "matrix."
In that case, I'd suggest pre-allocating an oversize scipy.sparse matrix. These can be defined to have a size much larger than your memory, and lil_matrix or dok_matrix can be built sequentially. Then you can pare it down once you enter all of your data.
import numpy as np
from scipy.sparse import dok_matrix

dummy = dok_matrix((1000000, 1000000))       # as big as you think you might need
for i, j, data in generator():               # generator() stands in for your data source
    dummy[i, j] = data
s = np.array(list(dummy.keys())).max() + 1   # largest row/column index actually used
M = dummy.tocsr()[:s, :s]                    # or .tocsc(), .toarray(), ... (COO does not support slicing)
This way you build your array as a Dictionary of Keys (dictionaries support dynamic assignment much better than ndarray does), but you still get a matrix-like output that can be (somewhat) efficiently used for math, even in a partially built state.

numpy.ndarray sent as argument doesn't need loop for iteration?

In this code, np.linspace() assigns 200 evenly spaced numbers from -20 to 20 to inputs.
This function works. What I am not understanding is how could it work. How can inputs be sent as an argument to output_function() without needing a loop to iterate over the numpy.ndarray?
import numpy as np
import matplotlib.pyplot as plt

def output_function(x):
    return 100 - x ** 2

inputs = np.linspace(-20, 20, 200)
plt.plot(inputs, output_function(inputs), 'b-')
plt.show()
numpy works by defining operations on vectors the way that you really want to work with them mathematically. So, I can do something like:
a = np.arange(10)
b = np.arange(10)
c = a + b
And it works as you might hope -- each element of a is added to the corresponding element of b and the result is stored in a new array c. If you want to know how numpy accomplishes this, it's all done via the magic methods in the python data model. Specifically, in my example the __add__ method of numpy's ndarray is overridden to provide the desired behavior.
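A tiny illustration of that data-model point, using the same arrays as above: the expression a + b dispatches to ndarray.__add__, which does the elementwise work:
import numpy as np

a = np.arange(10)
b = np.arange(10)
print(a + b)          # [ 0  2  4  6  8 10 12 14 16 18]
print(a.__add__(b))   # the same call, spelled out explicitly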
What you want to use is numpy.vectorize which behaves similarly to the python builtin map.
Here is one way you can use numpy.vectorize:
outputs = (np.vectorize(output_function))(inputs)
You asked why it works: it works because NumPy arrays can perform operations on their elements en masse, for example:
a = np.array([1, 2, 3, 4])   # a NumPy array of 4 elements: [1, 2, 3, 4]
b = a - 1                    # subtracts 1 from every element, giving [0, 1, 2, 3]
Because of this property of numpy arrays you can perform certain operations on every element of a numpy array very quickly without using a loop (like what you would do if it were a regular python array).
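To tie it back to the question, a short sketch comparing an explicit loop with the vectorized call; both produce the same values:
import numpy as np

inputs = np.linspace(-20, 20, 200)
looped = np.array([100 - x ** 2 for x in inputs])   # element by element
vectorized = 100 - inputs ** 2                      # whole array at once
print(np.allclose(looped, vectorized))              # True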
