Make a numpy array with both shape and offset arguments in another style - python

I want to access my array both as a 3-element entity (a 3D position) and as individual elements (the x, y and z coordinates).
After some research, I ended up with the following.
>>> import numpy as np
>>> arr = np.zeros(5, dtype={'pos': (('<f8', (3,)), 0),
'x': (('<f8', 1), 0),
'y': (('<f8', 1), 8),
'z': (('<f8', 1), 16)})
>>> arr["x"] = 0
>>> arr["y"] = 1
>>> arr["z"] = 2
# I can access the whole array by "pos"
>>> print(arr["pos"])
[[0. 1. 2.]
 [0. 1. 2.]
 [0. 1. 2.]
 [0. 1. 2.]
 [0. 1. 2.]]
However, I've always created arrays in this style:
>>> arr = np.zeros(10, dtype=[("pos", "f8", (3,))])
But I can't find a way to specify both the offset and the shape of the element at the same time in this style. Is there a way to do this?

According to the docs page, https://docs.scipy.org/doc/numpy-1.14.0/reference/arrays.dtypes.html
you are using the fields dictionary form, whose values are (data-type, offset) tuples:
{'field1': ..., 'field2': ..., ...}
dt1 = {'pos': (('<f8', (3,)), 0),
'x': (('<f8', 1), 0),
'y': (('<f8', 1), 8),
'z': (('<f8', 1), 16)}
The display for the resulting dtype is the other dictionary format:
{'names': ..., 'formats': ..., 'offsets': ..., 'titles': ..., 'itemsize': ...}
In [15]: np.dtype(dt1)
Out[15]: dtype({'names':['x','pos','y','z'],
'formats':['<f8',('<f8', (3,)),'<f8','<f8'],
'offsets':[0,0,8,16], 'itemsize':24})
In [16]: np.dtype(dt1).fields
Out[16]:
mappingproxy({'pos': (dtype(('<f8', (3,))), 0),
'x': (dtype('float64'), 0),
'y': (dtype('float64'), 8),
'z': (dtype('float64'), 16)})
Offsets aren't mentioned anywhere else on the documentation page.
The last format is a union type. It's a little unclear whether that's allowed or discouraged. The examples don't seem to work. There have been some changes in how multifield indexing works, and that may have affected this.
Let's play around with various ways of viewing the array:
In [25]: arr
Out[25]:
array([(0., [ 0. , 10. , 0. ], 10., 0. ),
(1., [ 1. , 11. , 0.1], 11., 0.1),
(2., [ 2. , 12. , 0.2], 12., 0.2),
(3., [ 3. , 13. , 0.3], 13., 0.3),
(4., [ 4. , 14. , 0.4], 14., 0.4)],
dtype={'names':['x','pos','y','z'], 'formats':['<f8',('<f8', (3,)),'<f8','<f8'], 'offsets':[0,0,8,16], 'itemsize':24})
In [29]: dt3=[('x','<f8'),('y','<f8'),('z','<f8')]
In [30]: np.dtype(dt3)
Out[30]: dtype([('x', '<f8'), ('y', '<f8'), ('z', '<f8')])
In [31]: np.dtype(dt3).fields
Out[31]:
mappingproxy({'x': (dtype('float64'), 0),
'y': (dtype('float64'), 8),
'z': (dtype('float64'), 16)})
In [32]: arr.view(dt3)
Out[32]:
array([(0., 10., 0. ), (1., 11., 0.1), (2., 12., 0.2), (3., 13., 0.3),
(4., 14., 0.4)], dtype=[('x', '<f8'), ('y', '<f8'), ('z', '<f8')])
In [33]: arr['pos']
Out[33]:
array([[ 0. , 10. , 0. ],
[ 1. , 11. , 0.1],
[ 2. , 12. , 0.2],
[ 3. , 13. , 0.3],
[ 4. , 14. , 0.4]])
In [35]: arr.view('f8').reshape(5,3)
Out[35]:
array([[ 0. , 10. , 0. ],
[ 1. , 11. , 0.1],
[ 2. , 12. , 0.2],
[ 3. , 13. , 0.3],
[ 4. , 14. , 0.4]])
In [36]: dt4=[('pos','<f8',(3,))]
In [37]: arr.view(dt4)
Out[37]:
array([([ 0. , 10. , 0. ],), ([ 1. , 11. , 0.1],),
([ 2. , 12. , 0.2],), ([ 3. , 13. , 0.3],),
([ 4. , 14. , 0.4],)], dtype=[('pos', '<f8', (3,))])
In [38]: arr.view(dt4)['pos']
Out[38]:
array([[ 0. , 10. , 0. ],
[ 1. , 11. , 0.1],
[ 2. , 12. , 0.2],
[ 3. , 13. , 0.3],
[ 4. , 14. , 0.4]])
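To come back to the original question: as far as I can tell, the list-of-tuples style has no slot for an offset, but the 'names'/'formats'/'offsets' dictionary form accepts a shape and an offset for the same field. A minimal sketch, assuming the same 24-byte layout as above (the variable name dt is mine). Because the fields overlap, writing to x, y or z also changes pos:

```python
import numpy as np

# 'formats' carries the sub-array shape, 'offsets' the byte offset of each field.
dt = np.dtype({
    'names': ['pos', 'x', 'y', 'z'],
    'formats': [('<f8', (3,)), '<f8', '<f8', '<f8'],
    'offsets': [0, 0, 8, 16],
    'itemsize': 24,
})

arr = np.zeros(5, dtype=dt)
arr['x'] = 0
arr['y'] = 1
arr['z'] = 2

# 'pos' overlaps x, y and z, so it sees the values written through them.
print(arr['pos'])

# Writing through 'pos' is visible through the scalar fields as well.
arr['pos'][0] = [7, 8, 9]
print(arr['x'][0], arr['y'][0], arr['z'][0])
```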


Inserting complex functions in a python code

I have been trying to insert $e^{ix}$ as a matrix element.
The main aim is to find the eigenvalues of a matrix that has many complex functions as elements. Can anyone show me how to insert it? My failed attempt is below:
for i in range(0,size):
    H[i,i]=-2*(cmath.exp((i+1)*aj))
    H[i,i+1]=1.0
    H[i,i-1]=1.0
'a' is defined earlier in the program. The error says that aj is not defined. I thought that with cmath a complex number could be exponentiated as (x+yj). Unfortunately, I couldn't figure out the right way to use it. Any help would be appreciated.
Define a small float array:
In [214]: H = np.eye(3)
In [215]: H
Out[215]:
array([[1., 0., 0.],
[0., 1., 0.],
[0., 0., 1.]])
Create a complex number:
In [216]: 1+3j
Out[216]: (1+3j)
In [217]: np.exp(1+3j)
Out[217]: (-2.6910786138197937+0.383603953541131j)
Trying to assign it to H:
In [218]: H[1,1]=np.exp(1+3j)
<ipython-input-218-6c0b228d2833>:1: ComplexWarning: Casting complex values to real discards the imaginary part
H[1,1]=np.exp(1+3j)
In [219]: H
Out[219]:
array([[ 1. , 0. , 0. ],
[ 0. , -2.69107861, 0. ],
[ 0. , 0. , 1. ]])
Now make a complex dtype array:
In [221]: H = np.eye(3).astype( complex)
In [222]: H[1,1]=np.exp(1+3j)
In [223]: H
Out[223]:
array([[ 1. +0.j , 0. +0.j ,
0. +0.j ],
[ 0. +0.j , -2.69107861+0.38360395j,
0. +0.j ],
[ 0. +0.j , 0. +0.j ,
1. +0.j ]])
edit
For an array of values:
In [225]: a = np.array([1,2,3])
In [226]: np.exp(a+1j*a)
Out[226]:
array([ 1.46869394+2.28735529j, -3.07493232+6.7188497j ,
-19.88453084+2.83447113j])
In [228]: H[:,0]=np.exp(a+1j*a)
In [229]: H
Out[229]:
array([[ 1.46869394+2.28735529j, 0. +0.j ,
0. +0.j ],
[ -3.07493232+6.7188497j , -2.69107861+0.38360395j,
0. +0.j ],
[-19.88453084+2.83447113j, 0. +0.j ,
1. +0.j ]])
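Applied to the loop in the question, the NameError goes away if the imaginary unit is written as 1j and multiplied by a, since aj is parsed as a variable name. A rough sketch, assuming size = 4 and a = 0.5 as placeholder values (neither is given in the question), and assuming the off-diagonal entries should simply be skipped at the matrix edges:

```python
import numpy as np

size = 4   # placeholder, not from the question
a = 0.5    # placeholder, not from the question

# The array must have a complex dtype, or the imaginary parts are discarded.
H = np.zeros((size, size), dtype=complex)

for i in range(size):
    # 1j * a is the complex number "a times i"; aj would be an undefined name.
    H[i, i] = -2 * np.exp(1j * (i + 1) * a)
    if i + 1 < size:       # assumption: no wrap-around at the last row
        H[i, i + 1] = 1.0
    if i - 1 >= 0:         # assumption: no wrap-around at the first row
        H[i, i - 1] = 1.0

eigvals = np.linalg.eigvals(H)
print(eigvals)
```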

How to convert numpy array of lists into array of tuples

I am trying to convert my array of lists into an array of tuples.
results=
array([[1. , 0.0342787 ],
[0. , 0.04436508],
[1. , 0.09101833 ],
[0. , 0.03492954],
[1. , 0.06059857]])
results1=np.empty((5,), dtype=object)
results1[:] = np.array([tuple(i) for i in results])
results1
I tried the above following the advice given here but I get the error ValueError: could not broadcast input array from shape (5,2) into shape (5).
How do I create a numpy array of tuples from a numpy array of lists?
Try this, in order to get an array of tuples as mentioned in title:
import numpy as np
results = np.array([[1. , 0.0342787 ],
[0. , 0.04436508],
[1. , 0.09101833],
[0. , 0.03492954],
[1. , 0.06059857]])
temp = []
for item in results:
    temp.append(tuple(item))
results1= np.empty(len(temp), dtype=object)
results1[:] = temp
print(results1)
# array([(1.0, 0.0342787), (0.0, 0.04436508), (1.0, 0.09101833),
# (0.0, 0.03492954), (1.0, 0.06059857)], dtype=object)
Remove the np.array() wrapper around [tuple(i) for i in results] in the assignment step and it will work. When you pass this list to np.array, the highest possible number of axes is guessed automatically, so your 2-element tuples end up reproducing a (5,2) matrix.
Why not do this?
import numpy as np
results= np.array([[1. , 0.0342787 ],
[0. , 0.04436508],
[1. , 0.09101833 ],
[0. , 0.03492954],
[1. , 0.06059857]])
results1 = [tuple(i) for i in results]
results1
Output:
[(1.0, 0.0342787), (0.0, 0.04436508), (1.0, 0.09101833), (0.0,
0.03492954), (1.0, 0.06059857)]
Working from the examples in my answer in your link, Convert array of lists to array of tuples/triple
In [22]: results=np.array([[1. , 0.0342787 ],
...: [0. , 0.04436508],
...: [1. , 0.09101833 ],
...: [0. , 0.03492954],
...: [1. , 0.06059857]])
In [23]: a1 = np.empty((5,), object)
In [24]: a1[:]= [tuple(i) for i in results]
In [25]: a1
Out[25]:
array([(1.0, 0.0342787), (0.0, 0.04436508), (1.0, 0.09101833),
(0.0, 0.03492954), (1.0, 0.06059857)], dtype=object)
or the structured array:
In [26]: a1 = np.array([tuple(i) for i in results], dtype='i,i')
In [27]: a1
Out[27]:
array([(1, 0), (0, 0), (1, 0), (0, 0), (1, 0)],
dtype=[('f0', '<i4'), ('f1', '<i4')])
You got the error because you did not follow my answer:
In [30]: a1[:]= np.array([tuple(i) for i in results])
Traceback (most recent call last):
File "<ipython-input-30-5c1cc6c4105a>", line 1, in <module>
a1[:]= np.array([tuple(i) for i in results])
ValueError: could not broadcast input array from shape (5,2) into shape (5)
The a1[:]=... assignment works for a list, but not for an array.
Note that wrapping the tuple list in an array just reproduces the original results:
In [31]: np.array([tuple(i) for i in results])
Out[31]:
array([[1. , 0.0342787 ],
[0. , 0.04436508],
[1. , 0.09101833],
[0. , 0.03492954],
[1. , 0.06059857]])
A list of tuples:
In [32]: [tuple(i) for i in results]
Out[32]:
[(1.0, 0.0342787),
(0.0, 0.04436508),
(1.0, 0.09101833),
(0.0, 0.03492954),
(1.0, 0.06059857)]
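Note that the 'i,i' dtype in the structured example above truncates the second column to 0. If the float values should be kept, a float field dtype can be used instead. A small sketch, assuming 'f8,f8' is an acceptable dtype for your data (the name a2 is mine):

```python
import numpy as np

results = np.array([[1., 0.0342787],
                    [0., 0.04436508],
                    [1., 0.09101833],
                    [0., 0.03492954],
                    [1., 0.06059857]])

# Each row becomes one record with two float64 fields, named f0 and f1 by default.
a2 = np.array([tuple(row) for row in results], dtype='f8,f8')
print(a2)
print(a2['f1'])   # second column, values preserved
```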

Ndarray of lists with mix of floats and integers?

I have an array of lists (corr: N-Dimensional array)
s_cluster_data
Out[410]:
array([[ 0.9607611 , 0.19538569, 0. ],
[ 1.03990463, 0.22274072, 0. ],
[ 1.09430461, 0.22603228, 0. ],
...,
[ 1.10802461, -0.54190659, 2. ],
[ 0.9288097 , -0.49195368, 2. ],
[ 0.81606986, -0.47141286, 2. ]])
I would like to make the third column an integer. I've tried to assign dtype as such
dtype=[('A','f8'),('B','f8'),('C','i4')]
s_cluster_data = np.array(s_cluster_data, dtype=dtype)
s_cluster_data
Out[414]:
array([[( 0.9607611 , 0.9607611 , 0), ( 0.19538569, 0.19538569, 0),
( 0. , 0. , 0)],
[( 1.03990463, 1.03990463, 1), ( 0.22274072, 0.22274072, 0),
( 0. , 0. , 0)],
[( 1.09430461, 1.09430461, 1), ( 0.22603228, 0.22603228, 0),
( 0. , 0. , 0)],
...,
dtype=[('A', '<f8'), ('B', '<f8'), ('C', '<i4')])
This creates an array of lists of tuples (corr: array with dtype), with each element of each list becoming a separate tuple.
I've also tried to take apart the array, read it in as array of tuples, but return back to original state.
list_cluster = s_cluster_data.tolist() # py list
tuple_cluster = [tuple(l) for l in list_cluster] # list of tuples
dtype=[('A','f8'),('B','f8'),('C','i4')]
sd_cluster_data = np.array(tuple_cluster, dtype=dtype) # array of tuples with dtype
sd_cluster_data
Out: ...,
(1.0020371 , -0.56034073, 2), (1.18264038, -0.55773913, 2),
(1.00550194, -0.55359672, 2), (1.10802461, -0.54190659, 2),
(0.9288097 , -0.49195368, 2), (0.81606986, -0.47141286, 2)],
dtype=[('A', '<f8'), ('B', '<f8'), ('C', '<i4')])
So ideally the above output is what I would like to see, but with array of lists, not array of tuples.
I tried to take the array apart and merge it back as lists
x_val_arr = np.array([x[0] for x in sd_cluster_data])
y_val_arr = np.array([x[1] for x in sd_cluster_data])
cluster_id_arr = np.array([x[2] for x in sd_cluster_data])
coordinates_arr = np.stack((x_val_arr,y_val_arr,cluster_id_arr),axis=1)
But once again I get floats in the third column
coordinates_arr
Out[416]:
array([[ 0.9607611 , 0.19538569, 0. ],
[ 1.03990463, 0.22274072, 0. ],
[ 1.09430461, 0.22603228, 0. ],
...,
[ 1.10802461, -0.54190659, 2. ],
[ 0.9288097 , -0.49195368, 2. ],
[ 0.81606986, -0.47141286, 2. ]])
So this is probably a question due to my lack of domain knowledge, but do ndarrays not support mixed data types if it consists of lists, not tuples?
In [87]: import numpy.lib.recfunctions as rf
In [88]: arr = np.array([[ 0.9607611 , 0.19538569, 0. ],
...: [ 1.03990463, 0.22274072, 0. ],
...: [ 1.09430461, 0.22603228, 0. ],
...: [ 1.10802461, -0.54190659, 2. ],
...: [ 0.9288097 , -0.49195368, 2. ],
...: [ 0.81606986, -0.47141286, 2. ]])
In [89]: arr
Out[89]:
array([[ 0.9607611 , 0.19538569, 0. ],
[ 1.03990463, 0.22274072, 0. ],
[ 1.09430461, 0.22603228, 0. ],
[ 1.10802461, -0.54190659, 2. ],
[ 0.9288097 , -0.49195368, 2. ],
[ 0.81606986, -0.47141286, 2. ]])
There are various ways of constructing a structured array from 2d array like this. Recent versions provide a convenient unstructured_to_structured function:
In [90]: dt = np.dtype([('A','f8'),('B','f8'),('C','i4')])
In [92]: rf.unstructured_to_structured(arr, dt)
Out[92]:
array([(0.9607611 , 0.19538569, 0), (1.03990463, 0.22274072, 0),
(1.09430461, 0.22603228, 0), (1.10802461, -0.54190659, 2),
(0.9288097 , -0.49195368, 2), (0.81606986, -0.47141286, 2)],
dtype=[('A', '<f8'), ('B', '<f8'), ('C', '<i4')])
Each row of arr has been turned into a structured record, displayed as a tuple.
A functionally equivalent approach is to create a 'blank' array, and assign field values by name:
In [93]: res = np.zeros(arr.shape[0], dt)
In [94]: res
Out[94]:
array([(0., 0., 0), (0., 0., 0), (0., 0., 0), (0., 0., 0), (0., 0., 0),
(0., 0., 0)], dtype=[('A', '<f8'), ('B', '<f8'), ('C', '<i4')])
In [95]: res['A'] = arr[:,0]
In [96]: res['B'] = arr[:,1]
In [97]: res['C'] = arr[:,2]
In [98]: res
Out[98]:
array([(0.9607611 , 0.19538569, 0), (1.03990463, 0.22274072, 0),
(1.09430461, 0.22603228, 0), (1.10802461, -0.54190659, 2),
(0.9288097 , -0.49195368, 2), (0.81606986, -0.47141286, 2)],
dtype=[('A', '<f8'), ('B', '<f8'), ('C', '<i4')])
And, to belabor the point, we could also make the structured array from a list of tuples:
In [104]: np.array([tuple(row) for row in arr.tolist()], dt)
Out[104]:
array([(0.9607611 , 0.19538569, 0), (1.03990463, 0.22274072, 0),
(1.09430461, 0.22603228, 0), (1.10802461, -0.54190659, 2),
(0.9288097 , -0.49195368, 2), (0.81606986, -0.47141286, 2)],
dtype=[('A', '<f8'), ('B', '<f8'), ('C', '<i4')])
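This also explains why np.stack in the question gave floats again: a plain (non-structured) array has a single dtype, so mixing float and integer columns promotes everything to float. A short sketch of the round trip, using a two-row subset of the data and assuming a NumPy version that provides both recfunctions helpers (the names rec and back are mine):

```python
import numpy as np
import numpy.lib.recfunctions as rf

arr = np.array([[0.9607611, 0.19538569, 0.],
                [1.10802461, -0.54190659, 2.]])
dt = np.dtype([('A', 'f8'), ('B', 'f8'), ('C', 'i4')])

# Structured: each field keeps its own dtype.
rec = rf.unstructured_to_structured(arr, dt)
print(rec['C'].dtype)

# Back to a plain 2d array: one common dtype for all columns.
back = rf.structured_to_unstructured(rec)
print(back.dtype)
```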
The problem might be in the way you pass data to np.array. The rows of the array should be tuples.
a = np.array([( 0.9607611 , 0.19538569, 0. )], dtype='f8, f8, i4')
will create an array
array([(0.9607611, 0.19538569, 0)],
dtype=[('f0', '<f8'), ('f1', '<f8'), ('f2', '<i4')])

Zip arrays in Python

I have one 2D array and one 1D array. I would like to zip them together.
import numpy as np
arr2D = [[5.88964708e-02, -2.38142395e-01, -4.95821417e-01, -7.07269274e-01],
[0.53363666, 0.1654723 , -0.16439857, -0.44880487]]
arr2D = np.asarray(arr2D)
arr1D = np.arange(7, 8.5+0.5, 0.5)
arr1D = np.asarray(arr1D)
res = np.array(list(zip(arr1D, arr2D)))
print(res)
which results in:
[[7.0 array([ 0.05889647, -0.2381424 , -0.49582142, -0.70726927])]
[7.5 array([ 0.53363666, 0.1654723 , -0.16439857, -0.44880487])]]
But I am trying to get:
[[(7.0, 0.05889647), (7.5, -0.2381424), (8.0, -0.49582142), (8.5, -0.70726927)]]
[[(7.0, 0.53363666), (7.5, 0.1654723),(8.0, -0.16439857), (8.5, -0.44880487)]]
How can I do this?
You were almost there! Here's a solution:
list(map(lambda x: list(zip(arr1D, x)), arr2D))
[[(7.0, 0.0588964708),
(7.5, -0.238142395),
(8.0, -0.495821417),
(8.5, -0.707269274)],
[(7.0, 0.53363666), (7.5, 0.1654723), (8.0, -0.16439857), (8.5, -0.44880487)]]
In [382]: arr2D = [[5.88964708e-02, -2.38142395e-01, -4.95821417e-01, -7.07269274e-01],
...: [0.53363666, 0.1654723 , -0.16439857, -0.44880487]]
...: arr2D = np.asarray(arr2D)
...: arr1D = np.arange(7, 8.5+0.5, 0.5) # already an array
In [384]: arr2D.shape
Out[384]: (2, 4)
In [385]: arr1D.shape
Out[385]: (4,)
zip iterates on the first dimension of the arguments, and stops with the shortest:
In [387]: [[i,j[0:2]] for i,j in zip(arr1D, arr2D)]
Out[387]:
[[7.0, array([ 0.05889647, -0.2381424 ])],
[7.5, array([0.53363666, 0.1654723 ])]]
If we transpose the 2d, so it is now (4,2), we get a four element list:
In [389]: [[i,j] for i,j in zip(arr1D, arr2D.T)]
Out[389]:
[[7.0, array([0.05889647, 0.53363666])],
[7.5, array([-0.2381424, 0.1654723])],
[8.0, array([-0.49582142, -0.16439857])],
[8.5, array([-0.70726927, -0.44880487])]]
We could add another level of iteration to get the desired pairs:
In [390]: [[(i,k) for k in j] for i,j in zip(arr1D, arr2D.T)]
Out[390]:
[[(7.0, 0.0588964708), (7.0, 0.53363666)],
[(7.5, -0.238142395), (7.5, 0.1654723)],
[(8.0, -0.495821417), (8.0, -0.16439857)],
[(8.5, -0.707269274), (8.5, -0.44880487)]]
and with list transpose idiom:
In [391]: list(zip(*_))
Out[391]:
[((7.0, 0.0588964708), (7.5, -0.238142395), (8.0, -0.495821417), (8.5, -0.707269274)),
((7.0, 0.53363666), (7.5, 0.1654723), (8.0, -0.16439857), (8.5, -0.44880487))]
Or we can get that result directly by moving the zip into an inner loop:
[[(i,k) for i,k in zip(arr1D, row)] for row in arr2D]
In other words, you are pairing the elements of arr1D with the elements of each row of arr2D, rather than with the whole row.
Since you already have arrays, one of the array solutions might be better, but I'm trying to clarify what is happening with zip.
numpy
There are various ways of building a numpy array from these arrays. Since you want to repeat the arr1D values, this repeat makes a (2,4) array that matches arr2D (tile also works):
In [400]: arr1D[None,:].repeat(2,0)
Out[400]:
array([[7. , 7.5, 8. , 8.5],
[7. , 7.5, 8. , 8.5]])
In [401]: arr2D
Out[401]:
array([[ 0.05889647, -0.2381424 , -0.49582142, -0.70726927],
[ 0.53363666, 0.1654723 , -0.16439857, -0.44880487]])
which can then be joined on a new trailing axis:
In [402]: np.stack((_400, arr2D), axis=2)
Out[402]:
array([[[ 7. , 0.05889647],
[ 7.5 , -0.2381424 ],
[ 8. , -0.49582142],
[ 8.5 , -0.70726927]],
[[ 7. , 0.53363666],
[ 7.5 , 0.1654723 ],
[ 8. , -0.16439857],
[ 8.5 , -0.44880487]]])
Or a structured array with tuple-like display:
In [406]: arr = np.zeros((2,4), dtype='f,f')
In [407]: arr
Out[407]:
array([[(0., 0.), (0., 0.), (0., 0.), (0., 0.)],
[(0., 0.), (0., 0.), (0., 0.), (0., 0.)]],
dtype=[('f0', '<f4'), ('f1', '<f4')])
In [408]: arr['f1'] = arr2D
In [409]: arr['f0'] = _400
In [410]: arr
Out[410]:
array([[(7. , 0.05889647), (7.5, -0.2381424 ), (8. , -0.49582142),
(8.5, -0.70726925)],
[(7. , 0.5336367 ), (7.5, 0.1654723 ), (8. , -0.16439857),
(8.5, -0.44880486)]], dtype=[('f0', '<f4'), ('f1', '<f4')])
You can use numpy.tile to expand the 1d array, and then use numpy.dstack, namely:
import numpy as np
arr2D = np.array([[5.88964708e-02, -2.38142395e-01, -4.95821417e-01, -7.07269274e-01],
[0.53363666, 0.1654723 , -0.16439857, -0.44880487]])
arr1D = np.arange(7, 8.5+0.5, 0.5)
np.dstack([np.tile(arr1D, (2,1)), arr2D])
array([[[ 7. , 0.05889647],
[ 7.5 , -0.2381424 ],
[ 8. , -0.49582142],
[ 8.5 , -0.70726927]],
[[ 7. , 0.53363666],
[ 7.5 , 0.1654723 ],
[ 8. , -0.16439857],
[ 8.5 , -0.44880487]]])
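Another possibility, sketched here as an alternative to repeat and tile, is to let broadcasting expand arr1D. This assumes np.broadcast_arrays followed by np.stack is acceptable for your use (the name pairs is mine):

```python
import numpy as np

arr2D = np.array([[5.88964708e-02, -2.38142395e-01, -4.95821417e-01, -7.07269274e-01],
                  [0.53363666, 0.1654723, -0.16439857, -0.44880487]])
arr1D = np.arange(7, 8.5 + 0.5, 0.5)

# broadcast_arrays expands arr1D from (4,) to (2,4) without an explicit repeat;
# stacking on a new last axis then pairs each arr1D value with an arr2D value.
pairs = np.stack(np.broadcast_arrays(arr1D, arr2D), axis=-1)
print(pairs.shape)
print(pairs[1])
```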

Python numpy indexing copy

I'm reading the book Python for Data Analysis. About numpy boolean indexing, it says "Selecting data from an array by boolean indexing always creates a copy of the data", so why can I change the original array using boolean indexing? Could anyone help me? Thanks a lot.
Here is the example:
In [86]: data
Out[86]:
array([[-0.048 , 0.5433, -0.2349, 1.2792],
[-0.268 , 0.5465, 0.0939, -2.0445],
[-0.047 , -2.026 , 0.7719, 0.3103],
[ 2.1452, 0.8799, -0.0523, 0.0672],
[-1.0023, -0.1698, 1.1503, 1.7289],
[ 0.5994, 0.8174, -0.9297, -1.2564]])
In [96]: data[data < 0] = 0
In [97]: data
Out[97]:
array([[ 0. , 0.5433, 0. , 1.2792],
[ 0. , 0.5465, 0.0939, 0. ],
[ 0. , 0. , 0.7719, 0.3103],
[ 2.1452, 0.8799, 0. , 0.0672],
[ 0. , 0. , 1.1503, 1.7289],
[ 0.5994, 0.8174, 0. , 0. ]])
Boolean indexing returns a copy of the data, not a view of the original data, like one gets for slices.
>>> b=data[data<0]; b # this is a copy of data
array([-0.048 , -0.2349, -0.268 , -2.0445, -0.047 , -2.026 , -0.0523,
-1.0023, -0.1698, -0.9297, -1.2564])
I can manipulate b and data is preserved.
>>> b[:] = 0; b
array([ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])
>>> data
array([[-0.048 , 0.5433, -0.2349, 1.2792],
[-0.268 , 0.5465, 0.0939, -2.0445],
[-0.047 , -2.026 , 0.7719, 0.3103],
[ 2.1452, 0.8799, -0.0523, 0.0672],
[-1.0023, -0.1698, 1.1503, 1.7289],
[ 0.5994, 0.8174, -0.9297, -1.2564]])
Now, for a slice:
>>> a = data[0,:]; a # a is not a copy of data
array([-0.048 , 0.5433, -0.2349, 1.2792])
>>> a[:] = 0; a
array([ 0., 0., 0., 0.])
>>> data
array([[ 0. , 0. , 0. , 0. ],
[-0.268 , 0.5465, 0.0939, -2.0445],
[-0.047 , -2.026 , 0.7719, 0.3103],
[ 2.1452, 0.8799, -0.0523, 0.0672],
[-1.0023, -0.1698, 1.1503, 1.7289],
[ 0.5994, 0.8174, -0.9297, -1.2564]])
However, as you've identified, assignments made via indexed arrays are always made to the original data.
>>> data[data<0] = 1; data
array([[ 1. , 0.5433, 1. , 1.2792],
[ 1. , 0.5465, 0.0939, 1. ],
[ 1. , 1. , 0.7719, 0.3103],
[ 2.1452, 0.8799, 1. , 0.0672],
[ 1. , 1. , 1.1503, 1.7289],
[ 0.5994, 0.8174, 1. , 1. ]])
In a fetch or __getitem__ the boolean indexing does return a copy. But if used immediately before an assignment, it's a __setitem__ case, and the selected values will be changed:
In [196]: data = np.arange(10)
In [197]: d1 = data[data<5]
In [198]: d1 # a copy
Out[198]: array([0, 1, 2, 3, 4])
In [199]: d1[:] = 0
In [200]: data # the original is unchanged
Out[200]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
Masked assignment:
In [201]: data[data<5] = 0
In [202]: data
Out[202]: array([0, 0, 0, 0, 0, 5, 6, 7, 8, 9]) # changed data
Indirect assignment does nothing:
In [204]: data[data<5][:] = 1
In [205]: data
Out[205]: array([0, 0, 0, 0, 0, 5, 6, 7, 8, 9])
Think of it as data.__getitem__(mask).__setitem__(slice(None), 1). The __getitem__ call returns a copy, which the __setitem__ call changes, but the original is not changed.
So if you need advanced indexing on the LHS, make sure it is the indexing step immediately before the assignment. And you can't chain two advanced indexing steps on the LHS.
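The three cases can be put side by side in one small script. This is only a sketch that restates the session above with a named mask (the names mask and copy are mine):

```python
import numpy as np

data = np.arange(10)
mask = data < 5

# Case 1: fetch, then modify the result. The fetch made a copy.
copy = data[mask]
copy[:] = 99
print(data)   # original untouched

# Case 2: chained indexing. data[mask] is a temporary copy; only it is modified.
data[mask][:] = 99
print(data)   # still untouched

# Case 3: the mask is the last index before "=", so this is data.__setitem__.
data[mask] = 0
print(data)   # original modified
```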
view vs. copy
With basic indexing it is possible to use the original databuffer, and just change attributes like shape and strides. For example:
In [85]: x = np.arange(10)
In [86]: x.shape
Out[86]: (10,)
In [87]: x.strides
Out[87]: (4,)
In [88]: y = x[::2]
In [89]: y.shape
Out[89]: (5,)
In [90]: y.strides
Out[90]: (8,)
y has the same databuffer as x (compare the __array_interface__ dictionaries of x and y). x uses all ten 4-byte elements; y uses every other one (its stride is 8 bytes instead of 4).
But with advanced indexing you can't express the element selection in terms of shape and strides.
In [98]: z = x[[1,2,6,7,0]]
In [99]: z.shape
Out[99]: (5,)
In [100]: z.strides
Out[100]: (4,)
Items in the original array can be selected in any order and with repetitions. There's no regular pattern. So a copy is required.
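One way to check this without reading strides by hand is np.shares_memory. A minimal sketch, using the same x, y and z as above:

```python
import numpy as np

x = np.arange(10)
y = x[::2]               # basic indexing: a view
z = x[[1, 2, 6, 7, 0]]   # advanced indexing: a copy

print(np.shares_memory(x, y))   # the slice shares x's buffer
print(np.shares_memory(x, z))   # the fancy-indexed result does not

# Writing through the view changes x; writing to the copy does not.
y[0] = 100
z[0] = -1
print(x)
```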
