Replacing original values - python

I have a numpy array y which I'm trying to preserve, but it is getting overwritten by the following operation:
ys = np.unique(y)
y2 = y
for i,val in enumerate(ys):
    y2[y2==val]=i
Why is the original numpy array getting replaced by this operation? Originally the unique values ys were 1, 5, 7, and after the above operation np.unique(y) gives 0, 1, 2.

As already stated, y2 = y simply creates another reference to the underlying numpy array. As far as Python is concerned, y2 and y are indistinguishable: y2 is y returns True, and both names have the same id (memory location). As noted in the comments, you can make y2 a copy of y that does not share the same memory:
y2 = y.copy()
Alternatively (and perhaps more efficiently), you can rely on built-in numpy functions. In this case, I think that numpy.digitize might suit your needs:
np.digitize(y, np.unique(y)) - 1
Seems to do the trick.
>>> a = np.array([0, 0, 1, 2, 1, 3, 4, 5, 0, 10, 30])
>>> b = np.digitize(a, np.unique(a)) - 1
>>> b
array([0, 0, 1, 2, 1, 3, 4, 5, 0, 6, 7])
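For what it's worth, np.unique can also return the same relabelling directly through its return_inverse argument, without touching y. A minimal sketch, assuming y holds the values 1, 5 and 7 from the question (the sample array itself is made up for illustration):

```python
import numpy as np

# Hypothetical sample data using the values from the question
y = np.array([1, 5, 7, 5, 1, 7])
y_before = y.copy()

# ys holds the sorted unique values, y2 the index of each element of y within ys
ys, y2 = np.unique(y, return_inverse=True)

print(ys.tolist())   # [1, 5, 7]
print(y2.tolist())   # [0, 1, 2, 1, 0, 2]
print(np.array_equal(y, y_before))  # True: y itself is untouched
```

Since y2 is a freshly allocated array here, no explicit copy is needed.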

It's because y2 and y refer to the same array, so when you do y2[y2==val]=i you're modifying the original array y. Python doesn't copy numpy arrays unless you explicitly tell it to, as @John Galt mentioned.
Instead of y2 = y, use y2 = y.copy(). This creates a copy of y, so you'll be modifying the copy instead of the original.
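The difference between a second name and a real copy can be checked directly. This is only a sketch with made-up sample values; the names alias and independent are not from the question. It also compares against the untouched y while writing into the copy, which avoids a freshly written index colliding with a value that is still waiting to be relabelled:

```python
import numpy as np

y = np.array([1, 5, 7, 5, 1])  # made-up sample values

alias = y                # a second name for the same array object
independent = y.copy()   # a new array with its own memory

print(alias is y)                         # True
print(np.shares_memory(independent, y))   # False

# Relabel into the copy, testing against the unmodified original
for i, val in enumerate(np.unique(y)):
    independent[y == val] = i

print(independent.tolist())  # [0, 1, 2, 1, 0]
print(y.tolist())            # [1, 5, 7, 5, 1]
```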

Related

How to overwrite only a portion of the array?

I have an array that contains labels, for instance:
X1 = [1, 0, 1, 2, 3, 1, 3, 2, 3, 1, 0]
X1 = np.array(X1)
I also have an array X2 that contains the updated labels for label [1] in X1, for instance.
X2 = [-1, 1, -1, -1]
X2 = np.array(X2)
How can I overwrite all entries of X1 whose label equals 1 with the values in X2?
The output should look like:
New_X1 = [-1, 0, 1, 2, 3, -1, 3, 2, 3, -1, 0]
I tried something like this:
New_X1 = [np.where(X1==1)]= X2
This obviously didn't work.
Any help, please.
Assuming that the lists you've written down are indeed NumPy arrays, you are not indexing into New_X1 properly. It should be:
New_X1[np.where(X1 == 1)] = X2
However, you can achieve the same thing with logical indexing instead. It's not only cleaner, but faster:
New_X1[X1 == 1] = X2
Here's a concise way to accomplish your task with map(). Note that it needs X2 to be a plain list, because pop(0) consumes it one element at a time:
New_X1 = list(map(lambda x1: x1 if x1 != 1 else X2.pop(0), X1))
EDIT:
I've seen you edited your post specifying that the sequences are np arrays: you can easily adapt this implementation to that case too.
Thank you all for your feedback. All correct.
Here is what I did, based on your feedback.
New_X1 = X1.copy()
New_X1[X1 == 1] = X2
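Put together as a runnable sketch with the arrays from the question (the mask variable and the size check are additions for illustration): boolean-mask assignment fills the True positions in order, so X2 needs exactly one value per matching label.

```python
import numpy as np

X1 = np.array([1, 0, 1, 2, 3, 1, 3, 2, 3, 1, 0])
X2 = np.array([-1, 1, -1, -1])

mask = X1 == 1
# One replacement value is needed for every True entry of the mask
if mask.sum() != X2.size:
    raise ValueError("X2 must have one value per label being replaced")

New_X1 = X1.copy()   # keep X1 intact
New_X1[mask] = X2    # values are assigned in order of occurrence

print(New_X1.tolist())  # [-1, 0, 1, 2, 3, -1, 3, 2, 3, -1, 0]
```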

Numpy double-slice assignment with integer indexing followed by boolean indexing

I already know that NumPy "double-slice" assignment with fancy indexing creates copies instead of views, and that the solution seems to be to convert the two into one single slice (e.g. This question). However, I am facing a particular problem where I need to deal with integer indexing followed by boolean indexing, and I am at a loss as to what to do. The problem (simplified) is as follows:
a = np.random.randn(2, 3, 4, 4)
idx_x = np.array([[1, 2], [1, 2], [1, 2]])
idx_y = np.array([[0, 0], [1, 1], [2, 2]])
print(a[..., idx_y, idx_x].shape) # (2, 3, 3, 2)
mask = (np.random.randn(2, 3, 3, 2) > 0)
a[..., idx_y, idx_x][mask] = 1 # assignment doesn't work
How can I make the assignment work?
Not sure, but one idea is to do the broadcasting manually and apply the mask to the indices, just like Tim suggests. idx_x and idx_y both have the same shape (3,2), which will be broadcast to the shape (6,6) from the cartesian product (3*2)^2.
x = np.broadcast_to(idx_x.ravel(), (6,6))
y = np.broadcast_to(idx_y.ravel(), (6,6))
# this should be the same as
x,y = np.meshgrid(idx_x, idx_y)
Now reshape the mask to the broadcasted indices and use it to select
mask = mask.reshape(6,6)
a[..., x[mask], y[mask]] = 1
The assignment now works, but I am not sure if this is the exact assignment you wanted.
Ok apparently I am making things complicated. No need to combine the indexing. The following code solves the problem elegantly:
b = a[..., idx_y, idx_x]
b[mask] = 1
a[..., idx_y, idx_x] = b
print(a[..., idx_y, idx_x][mask]) # all 1s
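A self-contained version of the above for checking, offered as a sketch: the seeded generator and the names original and unchanged_after_chained are additions for illustration. The chained form fails because a[..., idx_y, idx_x] is an advanced-indexing copy, so the mask assignment lands in a temporary; writing the modified copy back through the same index expression is what updates a.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only to make the sketch reproducible
a = rng.standard_normal((2, 3, 4, 4))
idx_x = np.array([[1, 2], [1, 2], [1, 2]])
idx_y = np.array([[0, 0], [1, 1], [2, 2]])
mask = rng.standard_normal((2, 3, 3, 2)) > 0
original = a.copy()

# Chained form: the assignment goes into a temporary copy
a[..., idx_y, idx_x][mask] = 1
unchanged_after_chained = np.array_equal(a, original)
print(unchanged_after_chained)  # True

# Read, modify, write back
b = a[..., idx_y, idx_x]   # copy of shape (2, 3, 3, 2)
b[mask] = 1
a[..., idx_y, idx_x] = b

print(bool((a[..., idx_y, idx_x][mask] == 1).all()))  # True
```

Note that the write-back is only unambiguous when the (idx_y, idx_x) pairs are unique, as they are here.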
EDIT: Use @Kevin's solution which actually gets the dimensions correct!
I haven't tried it specifically on your sample code but I had a similar issue before. I think I solved it by applying the mask to the indices instead, something like:
a[..., idx_y[mask], idx_x[mask]] = 1
That way, numpy can assign the values to the a array correctly.
EDIT2: Posting some test code here, since comments remove formatting.
a = np.arange(27).reshape([3, 3, 3])
ind_x = np.array([[0, 0], [1, 2]])
ind_y = np.array([[1, 2], [1, 1]])
x = np.broadcast_to(ind_x.ravel(), (4, 4))
y = np.broadcast_to(ind_y.ravel(), (4, 4)).T
# x1, y2 = np.meshgrid(ind_x, ind_y) # above should be the same as this
mask = a[:, ind_y, ind_x] % 2 == 0 # what should this reshape to?
# a[..., x[mask], y[mask]] = 1 # Then you can mask away (may also need to reshape a or the masked x or y)

Replacing array at i-th dimension

Let's say I have a two-dimensional array
import numpy as np
a = np.array([[1, 1, 1], [2,2,2], [3,3,3]])
and I would like to replace the third vector (in the second dimension) with zeros. I would do
a[:, 2] = np.array([0, 0, 0])
But what if I would like to be able to do that programmatically? I mean, let's say that variable x = 1 contained the dimension on which I wanted to do the replacing. How would the function replace(arr, dimension, value, arr_to_be_replaced) have to look if I wanted to call it as replace(a, x, 2, np.array([0, 0, 0]))?
numpy has a similar function, insert. However, it doesn't replace at dimension i, it returns a copy with an additional vector.
All solutions are welcome, but I do prefer a solution that doesn't recreate the array, so as to save memory.
arr[:, 1]
is basically shorthand for
arr[(slice(None), 1)]
that is, a tuple with slice elements and integers.
Knowing that, you can construct a tuple of slice objects manually, adjust the values depending on an axis parameter and use that as your index. So for
import numpy as np
arr = np.array([[1, 1, 1], [2, 2, 2], [3, 3, 3]])
axis = 1
idx = 2
arr[:, idx] = np.array([0, 0, 0])
#      ^- axis position
you can use
slices = [slice(None)] * arr.ndim
slices[axis] = idx
arr[tuple(slices)] = np.array([0, 0, 0])
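Wrapped into a function, this could look like the sketch below. The name replace_along_axis and its argument order are made up for illustration, not an existing numpy function. It assigns in place, so the array is not recreated.

```python
import numpy as np

def replace_along_axis(arr, axis, index, values):
    """Overwrite arr in place at position `index` along `axis` (hypothetical helper)."""
    slices = [slice(None)] * arr.ndim   # equivalent to ':' on every axis
    slices[axis] = index                # pick one position on the chosen axis
    arr[tuple(slices)] = values         # the index must be a tuple, not a list
    return arr

a = np.array([[1, 1, 1], [2, 2, 2], [3, 3, 3]])
replace_along_axis(a, 1, 2, np.array([0, 0, 0]))
print(a.tolist())  # [[1, 1, 0], [2, 2, 0], [3, 3, 0]]

c = np.array([[1, 1, 1], [2, 2, 2], [3, 3, 3]])
replace_along_axis(c, 0, 2, np.array([0, 0, 0]))
print(c.tolist())  # [[1, 1, 1], [2, 2, 2], [0, 0, 0]]
```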

Swapping side from an array

I have thousands of data points in X and Y. I am trying to plot an interpolation graph, but for plotting the data needs to begin from the negative values.
x = [15000,14000,13000,12000,11000,0,-1000,-10000,-15000]
y = [1,1,1,1,1,0,-1,-1,-1]
How can I make it into this format?
x = [-15000,-10000,-1000,0,11000,12000,13000,14000,15000]
y = [-1,-1,-1,0,1,1,1,1,1]
Try this:
x = x[::-1]
y = y[::-1]
I'd call this "reversing" a list, not "swapping" it, but you get the idea.
Assuming you really need to sort the x list, and also move around the y values in the same way the x values were permuted, take a look at this:
>>> x = [15000,14000,13000,12000,11000,0,-1000,-10000,-15000]
>>> y = [1,1,1,1,1,0,-1,-1,-1]
>>> x1, y1 = zip(*sorted(zip(x, y)))
>>> x1
(-15000, -10000, -1000, 0, 11000, 12000, 13000, 14000, 15000)
>>> y1
(-1, -1, -1, 0, 1, 1, 1, 1, 1)
So x1 and y1 are in the order you want. But they're tuples instead of lists. If you need lists instead, then, e.g.,
x1, y1 = map(list, zip(*sorted(zip(x, y))))
is one way to do it.
Btw, if all you really need is to simply reverse the lists, then @OscarLopez's answer is much easier :-)
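With thousands of points it may be more convenient to hold the data in NumPy arrays and sort with np.argsort, which yields the permutation that sorts x so the same permutation can be applied to y. This swaps in NumPy for the zip/sorted approach above; the names order, x_sorted and y_sorted are made up for the sketch.

```python
import numpy as np

x = np.array([15000, 14000, 13000, 12000, 11000, 0, -1000, -10000, -15000])
y = np.array([1, 1, 1, 1, 1, 0, -1, -1, -1])

order = np.argsort(x)   # indices that put x in ascending order
x_sorted = x[order]
y_sorted = y[order]     # y follows the same permutation

print(x_sorted.tolist())  # [-15000, -10000, -1000, 0, 11000, 12000, 13000, 14000, 15000]
print(y_sorted.tolist())  # [-1, -1, -1, 0, 1, 1, 1, 1, 1]
```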

numpy ndarray slicing and iteration

I'm trying to slice and iterate over a multidimensional array at the same time. I have a solution that's functional, but it's kind of ugly, and I bet there's a slick way to do the iteration and slicing that I don't know about. Here's the code:
import numpy as np
x = np.arange(64).reshape(4,4,4)
y = [x[i:i+2,j:j+2,k:k+2] for i in range(0,4,2)
                          for j in range(0,4,2)
                          for k in range(0,4,2)]
y = np.array(y)
z = np.array([np.min(u) for u in y]).reshape(y.shape[1:])
Your last reshape doesn't work, because y has no shape defined. Without it you get:
>>> x = np.arange(64).reshape(4,4,4)
>>> y = [x[i:i+2,j:j+2,k:k+2] for i in range(0,4,2)
...                           for j in range(0,4,2)
...                           for k in range(0,4,2)]
>>> z = np.array([np.min(u) for u in y])
>>> z
array([ 0, 2, 8, 10, 32, 34, 40, 42])
But despite that, what you probably want is reshaping your array to 6 dimensions, which gets you the same result as above:
>>> xx = x.reshape(2, 2, 2, 2, 2, 2)
>>> zz = xx.min(axis=-1).min(axis=-2).min(axis=-3)
>>> zz
array([[[ 0,  2],
        [ 8, 10]],

       [[32, 34],
        [40, 42]]])
>>> zz.ravel()
array([ 0, 2, 8, 10, 32, 34, 40, 42])
It's hard to tell exactly what you want in the last mean, but you can use stride_tricks to get a "slicker" way. It's rather tricky.
import numpy.lib.stride_tricks
# This returns a view with custom strides, x2[i,j,k] matches y[4*i+2*j+k]
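x2 = numpy.lib.stride_tricks.as_strided(
    x, shape=(2,2,2,2,2,2),
    strides=(numpy.array([32,8,2,16,4,1])*x.dtype.itemsize))
# The within-block axes are the last three here, so reduce the trailing axis three times
z2 = x2.min(axis=-1).min(axis=-1).min(axis=-1)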
Still, I can't say this is much more readable. (Or efficient, as each min call will make temporaries.)
Note, my answer differs from Jaime's because I tried to match your elements of y. You can tell if you replace the min with max.
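To check that the blocks really line up with the elements of y, the same layout can be produced with a plain reshape and transpose instead of custom strides. A sketch, assuming the 4x4x4 array from the question; the names blocks and z_blocks are made up:

```python
import numpy as np

x = np.arange(64).reshape(4, 4, 4)

# Reference: the explicit list of 2x2x2 blocks from the question
y = np.array([x[i:i+2, j:j+2, k:k+2]
              for i in range(0, 4, 2)
              for j in range(0, 4, 2)
              for k in range(0, 4, 2)])

# Split every axis into (block index, offset inside the block) ...
xx = x.reshape(2, 2, 2, 2, 2, 2)
# ... then move the three block indices to the front
blocks = xx.transpose(0, 2, 4, 1, 3, 5)

print(np.array_equal(blocks.reshape(8, 2, 2, 2), y))  # True

# Reduce over the three within-block axes in a single call
z_blocks = blocks.min(axis=(3, 4, 5))
print(z_blocks.ravel().tolist())  # [0, 2, 8, 10, 32, 34, 40, 42]
```

Swapping min for max gives the largest element of each block, which makes it easy to see whether the blocks match y.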
