Why does a[:,[x]] create a column vector from an array? - python

Why does a[:,[x]] create a column vector from an array? What do the [ ] represent?
Could anyone explain the principle to me?
import numpy as np
import torch

a = np.random.randn(5,6)
a = a.astype(np.float32)
print(a)
c = torch.from_numpy(a[:,[1]])  # index with a list to keep the column axis
print(c)
[[-1.6919796 0.3160475 0.7606999 0.16881375 1.325092 0.71536326]
[ 1.217861 0.35804042 0.0285245 0.7097111 -2.1760604 0.992101 ]
[-1.6351479 0.6607222 0.9375339 0.5308735 -1.9699149 -2.002803 ]
[-1.1895325 1.1744579 -0.5980689 -0.8906375 -0.00494479 0.51751447]
[-1.7642071 0.4681248 1.3938268 -0.7519176 0.5987852 -0.5138923 ]]
###########################################
tensor([[0.3160],
[0.3580],
[0.6607],
[1.1745],
[0.4681]])

The [ ] add an extra dimension: indexing with a list keeps the column axis instead of dropping it. Compare the shape attribute in both cases.
a[:,1].shape
output:
(5,)
With [ ]:
a[:,[1]].shape
output:
(5, 1)
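To make the shape difference concrete, here is a minimal sketch. The array built with np.arange is only a stand-in for the random 5x6 array in the question, and the variable names are illustrative.

```python
import numpy as np

# Stand-in for the random 5x6 float32 array from the question
a = np.arange(30, dtype=np.float32).reshape(5, 6)

col_1d = a[:, 1]        # integer index: the column axis is dropped
col_2d = a[:, [1]]      # list index: the column axis is kept with length 1
col_slice = a[:, 1:2]   # a length-1 slice also keeps the column axis

print(col_1d.shape)     # (5,)
print(col_2d.shape)     # (5, 1)
print(col_slice.shape)  # (5, 1)
```

Both a[:, [1]] and a[:, 1:2] give a column vector with the same values; the slice returns a view, while the list index returns a copy.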

That syntax is NumPy array indexing, where arrays are indexed as a[rows, columns, page, ... (higher dimensions)].
You select a specific row/column/page by giving a single number or a range of numbers. So when you use a[1,2], NumPy gets the element from row 1, column 2.
You can select several specific indices by giving a dimension a list of values. So a[[1,3],1] gets you both elements (1,1) and (3,1). Because a list selects a set of indices, even a one-element list such as [1] keeps that dimension (with length 1), which is why a[:,[1]] is a column vector.
The : tells NumPy to get everything along that dimension. So when you use a[:,1], NumPy gets every row in column 1. Alternatively, a[1,:] gets every column in row 1.
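The indexing forms described above can be tried on a small array. This is only an illustrative sketch; the 3x4 array and the variable names are made up for the example.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
# a is:
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

single = a[1, 2]       # element at row 1, column 2 -> 6
picked = a[[0, 2], 1]  # elements (0, 1) and (2, 1) -> [1 9]
column = a[:, 1]       # every row of column 1 -> [1 5 9]
row = a[1, :]          # every column of row 1 -> [4 5 6 7]

print(single, picked, column, row)
```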

Related

Need to extract and display row with a specific minimum value

I have the following output sample:
[[-5.53759409e-01 -2.68382610e-01 4.06747784e+00]
[-1.66055379e+00 -8.08889466e-01 7.06720368e+01]
[ 2.92172488e-01 8.17347290e-01 3.18001189e+00]
[ 1.89072607e+00 -6.68502526e-01 9.08233869e+01]
[-1.31451627e+00 1.61831269e+00 5.41709058e+00]
[ 1.15886824e+00 3.31177259e-01 5.14391851e+00]
[ 1.87270676e+00 1.24100260e+00 2.64360316e+01]
[ 1.93323801e+00 -5.64255644e-02 7.28368451e+01]
[ 1.33014215e+00 1.96282476e+00 2.96295301e-01]]
The minimum function value at generation 10 is [0.2962953]
I have concatenated two arrays - the coordinate array (elements 0 and 1) and the function values (element 2) to form the above array.
However, I would like to display not only the minimum function value, e.g. 0.2962953, but also the coordinates associated with it, i.e. the whole row of the above array.
Any ideas how I would approach this?
In this case, I would need the bottom row of the above array and a way to highlight the coordinates and function value.
Problem fixed! Just used: printValues = array[np.argmin(array[:, 2]), (0,1)]
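A small sketch of the same idea that returns the whole row, so the coordinates and the function value can be printed together. The array here is a shortened, rounded version of the sample above, and the names arr, best, coords and value are only illustrative.

```python
import numpy as np

# Columns 0 and 1 are coordinates, column 2 is the function value
arr = np.array([
    [-0.55, -0.27, 4.07],
    [0.29, 0.82, 3.18],
    [1.33, 1.96, 0.30],
])

best = np.argmin(arr[:, 2])  # index of the row with the smallest function value
best_row = arr[best]         # the full row: coordinates plus value
coords = best_row[:2]
value = best_row[2]

print(f"Minimum value {value} at coordinates {coords}")
```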

How to split a python array into different arrays based on a condition?

I have a python array like this:
array([[0.34201428, 0.46875536, 0.37900415, 0.4906195 ],
[0.58203477, 0.35279346, 0.61418074, 0.37601328],
[0.3388086 , 0.21167754, 0.37330517, 0.2436498 ],
[0.57343255, 0.34535545, 0.62878576, 0.38982747]],dtype=float32)
I want to split the array into different clusters based on the zeroth column. The final output should be like this:
The first array as,
array([[0.34201428, 0.46875536, 0.37900415, 0.4906195 ],
[0.3388086 , 0.21167754, 0.37330517, 0.2436498 ]])
And the second array as,
array([[0.58203477, 0.35279346, 0.61418074, 0.37601328],
[0.57343255, 0.34535545, 0.62878576, 0.38982747]])
Can anyone help me out. Thanks!!
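The question does not say what the clustering condition is. As a sketch only, assuming the two groups are separated by a cut-off of 0.5 on the zeroth column (a value chosen here for illustration, not given in the question), a boolean mask would split the rows:

```python
import numpy as np

data = np.array([
    [0.34201428, 0.46875536, 0.37900415, 0.4906195],
    [0.58203477, 0.35279346, 0.61418074, 0.37601328],
    [0.3388086, 0.21167754, 0.37330517, 0.2436498],
    [0.57343255, 0.34535545, 0.62878576, 0.38982747],
], dtype=np.float32)

threshold = 0.5                # assumed cut-off, not from the question
mask = data[:, 0] < threshold  # True for rows whose zeroth column is below the cut-off

first = data[mask]    # rows 0 and 2
second = data[~mask]  # rows 1 and 3

print(first)
print(second)
```

If the groups are not known in advance, a real clustering method would be needed instead of a fixed threshold.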

Summing up columns of arrays of different shapes in array of arrays- Python 3.x

I have an array that contains 2D arrays.
For each 2D array I want to sum up the columns, and the result must be in column form.
I have a piece of code to do this, but I feel like I am not utilising numpy optimally. What is the fastest way to do this?
My current code:
temp = [np.sum(l_i,axis=1).reshape(-1,1) for l_i in self.layer_inputs]
Sample Array:
array([
array([[ 0.48517904, -11.10809746],
[ 13.64104864, 5.77576326]]),
array([[16.74109924, -3.28535518],
[-4.00977275, -3.39593759],
[ 5.9048581 , -1.65258805],
[13.40762143, -1.61158724],
[ 9.8634849 , 8.02993728]]),
array([[-7.61920427, -3.2314264 ],
[-3.79142779, -2.44719713],
[32.42085005, 4.79376209],
[13.97676962, -1.19746096],
[45.60100807, -3.01680368]])
], dtype=object)
Sample Expected Result:
[array([[-10.62291842],
[ 19.41681191]]),
array([[13.45574406],
[-7.40571034],
[ 4.25227005],
[11.7960342 ],
[17.89342218]]),
array([[-10.85063067],
[ -6.23862492],
[ 37.21461214],
[ 12.77930867],
[ 42.58420439]]) ]
New answer
Given your stringent requirement for a list of arrays, there is no more computationally efficient solution.
Original answer
To leverage NumPy, don't work with a list of arrays: dtype=object is the hint you won't be able to use vectorised operations.
Instead, combine into one array, e.g. via np.vstack, and store split indices. If you need a list of arrays, use np.split as a final step. But this constant flipping between lists and a single array is expensive. Really, you should attempt to just store the splits and a single array, i.e. idx and data below.
idx = np.array(list(map(len, A))).cumsum()[:-1] # [2, 7]
data = np.vstack(A).sum(1)
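For illustration, here is how idx and data could be turned back into a list of column arrays with np.split. The list A below is a small made-up stand-in for self.layer_inputs, and parts is a hypothetical name.

```python
import numpy as np

# Made-up stand-in for the list of 2D arrays (each with two columns)
A = [
    np.array([[1.0, 2.0], [3.0, 4.0]]),
    np.array([[5.0, 6.0], [7.0, 8.0], [9.0, 10.0]]),
]

idx = np.array([len(x) for x in A]).cumsum()[:-1]  # split points between the stacked arrays
data = np.vstack(A).sum(axis=1)                    # row sums of the stacked array

# Only if a list of column arrays is really needed as a final step:
parts = [p.reshape(-1, 1) for p in np.split(data, idx)]

print(idx)    # [2]
print(data)   # [ 3.  7. 11. 15. 19.]
print(parts)
```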

Numpy - Compare elements in two 2D arrays and replace values

I have a specific requirement for this problem. I need it to be simple and fast.
My problem:
I have two 2D arrays and I need to replace values in the first array with values from the second array according to a condition: if the element at position x,y in the first array is smaller than the element at position x,y in the second array, replace the element in the first array with the element from the second array.
What I tried, which does not work:
import numpy as np
arr = np.random.randint(3,size=(2, 2))
arr2 = np.random.randint(3,size=(2, 2))
print(arr)
print(arr2)
arr[arr<arr2]=arr2 # Doesnt work.
This raises TypeError:
TypeError: NumPy boolean array indexing assignment requires a 0 or 1-dimensional input, input has 2 dimensions.
I can see, that it would be possible to iterate through columns or rows, but I believe it can be done without iteration.
Thanks in advance
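One way this could be done without iteration, shown as a sketch with fixed values instead of random ones: index both sides of the assignment with the same boolean mask. Because the condition keeps the larger of the two elements, np.maximum gives the same result. The name original is only used here for the comparison.

```python
import numpy as np

arr = np.array([[0, 2], [1, 1]])
arr2 = np.array([[1, 1], [2, 0]])
original = arr.copy()  # kept only to compare against np.maximum below

mask = arr < arr2       # True where arr is smaller than arr2
arr[mask] = arr2[mask]  # both sides are indexed with the same mask, so the shapes match

print(arr)  # [[1 2]
            #  [2 1]]

# Equivalent result without in-place assignment
same = np.maximum(original, arr2)
```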

Python - split matrix data into separate columns

I have read data from a file and stored into a matrix (frag_coords):
frag_coords =
[[ 916.0907976 -91.01391344 120.83596334]
[ 916.01117655 -88.73389753 146.912555 ]
[ 924.22832597 -90.51682575 120.81734705]
...
[ 972.55384732 708.71316138 52.24644577]
[ 972.49089559 710.51583744 72.86369124]]
type(frag_coords) =
<class 'numpy.matrixlib.defmatrix.matrix'>
I do not have any issues when reordering the matrix by a specified column. For example, the code below works just fine:
order = np.argsort(frag_coords[:,2], axis=0)
My issue is that:
len(frag_coords[0]) = 1
I need to access the numbers of the first row individually. I've tried splitting it and transforming it into a list, but everything seems to return the 3 numbers not as columns but as a single element with len=1. I need help please!
Your problem is that you're using a matrix instead of an ndarray. Are you sure you want that?
For a matrix, indexing the first row alone leads to another matrix, a row matrix. Check frag_coords[0].shape: it will be (1,3). For an ndarray, it would be (3,).
If you only need to index the first row, use two indices:
frag_coords[0,j]
Or if you store the row temporarily, just index into it as a row matrix:
tmpvar = frag_coords[0] # shape (1,3)
print(tmpvar[0,2]) # for column 2 of row 0
If you don't need too many matrix operations, I'd advise that you use np.arrays instead. You can always read your data into an array directly, but at a given point you can just transform an existing matrix with np.array(frag_coords) too if you wish.
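A short sketch of the difference, using a small made-up matrix in place of frag_coords:

```python
import numpy as np

# Small made-up stand-in for frag_coords
m = np.matrix([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

print(m[0].shape)  # (1, 3): a row of a matrix is still a 2D matrix
print(len(m[0]))   # 1
print(m[0, 2])     # 3.0: use two indices to reach a single number

arr = np.array(m)    # convert to a plain ndarray
print(arr[0].shape)  # (3,): a row of an ndarray is 1D
x, y, z = arr[0]     # the three numbers can now be unpacked individually
```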
