I'm new to Python, so I need some help with this:
I have a 2D array of numbers that represents the density of a circle of material in a space, and I want to locate the centre.
So I want to get the indices of the numbers representing the diameter, and then the middle index will be the centre. In this code I only store the value of the density: tempdiameter.append(cell)
I want the index of the cell itself. How can I do that?
Also, I don't want to use lists for the diameter, so how can I create a dynamic 1D NumPy array?
Thanks
for row in x:
    for cell in row:
        if cell != 0:
            tempdensity += cell
            tempdiameter.append(cell)
    if tempdensity > maxdensity:
        maxdensity = tempdensity
    if len(tempdiameter) >= len(diameter):
        diameter = tempdiameter
    tempdensity = 0
    tempdiameter = []
To get the row with the highest number of non-zero cells and the highest sum you can do
densities = x.sum(axis=1)
lengths = (x > 0).sum(axis=1)
center = x[(densities == densities.max()) & (lengths == lengths.max())]
Try to avoid using loops in numpy. Let me know if this isn't what you want and I'll try to answer it better. You should provide sample input/output when asking a question. You can also edit your question rather than adding comments.
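If what you actually need is the index of the centre (which is what the question asks for), a minimal sketch, assuming x is the 2D density array from the question, is to average the coordinates of the non-zero cells:
import numpy as np

# Coordinates of all non-zero cells (assumes x is the 2D density array)
rows, cols = np.nonzero(x)

# For a filled circle, the centre is roughly the mean of those coordinates
center_row = int(round(rows.mean()))
center_col = int(round(cols.mean()))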
I am new to Python. I am enumerating through a large list of data, as shown below, and would like to find the mean of every line.
for index, line in enumerate(data):
    # calculate the mean
However, the lines of this particular set of data are as such:
[array([[2.3325655e-10, 2.4973504e-10],
        [1.3025138e-10, 1.3025231e-10]], dtype=float32)]
I would like to find the mean of both 2x1s separately, then the average of both means, so it outputs a single number.
Thanks in advance.
You probably do not need to enumerate through the list to achieve what you want. You can do it in two steps using list comprehension.
For example,
data = [[2.3325655e-10, 2.4973504e-10],
        [1.3025138e-10, 1.3025231e-10]]
# Calculate the average for the 2x1s, i.e. each row
avgs_along_x = [sum(line)/len(line) for line in data]
# Calculate the average along y
avg_along_y = sum(avgs_along_x)/len(avgs_along_x)
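With the sample above, avgs_along_x comes out to roughly [2.41e-10, 1.30e-10] and avg_along_y to about 1.86e-10.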
There are other ways to calculate the mean of a list in Python. You can read about them here.
If you are using numpy this can be done in one line.
import numpy as np
np.average(data, 1) # calculate the mean along the x-axis, denoted as axis 1
# To get what you want, we can pass tuples of axes.
np.average(data, (1,0))
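If each line is actually a list holding one 2x2 float32 array, as shown in the question, a hedged variant of the same idea inside the original loop would be (assuming data is the question's list of lines):
import numpy as np

for index, line in enumerate(data):
    arr = np.asarray(line[0])     # the 2x2 float32 array inside the list
    row_means = arr.mean(axis=1)  # mean of each row separately
    overall = row_means.mean()    # single number: the average of both means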
From a matrix filled with values (see picture), I want to obtain a matrix with at most one value for every row and column. If there is more than one value, the maximum should be kept and the other set to 0. I know I can do that with np.max and np.argmax, but I'm wondering if there is some clever way to do it that I'm not aware of.
Here's the solution I have for now:
tmp = np.zeros_like(matrix)
for x in np.argmax(matrix, axis=0):  # get max on x axis
    for y in np.argmax(matrix, axis=1):  # get max on y axis
        tmp[x][y] = matrix[x][y]
matrix = tmp
A sparse structure could be used for efficiency; however, right now I see a contradiction between "at most one value for every row and column" and your current implementation, which may leave more than one value per row/column.
Either you need an ordering that prefers rows over columns, or you need to work through an absolute sorting of all matrix values.
One idea that is guaranteed to produce at most one entry per row and column is to first select the maxima of the rows, and then select the maxima of the columns from that intermediate matrix:
import numpy as np

rows = 5
cols = 5
matrix = np.random.rand(rows, cols)

# Step 1: keep only the maximum of each row
rowmax = np.argmax(matrix, 1)
rowmax_matrix = np.zeros((rows, cols))
for ri, rm in enumerate(rowmax):
    rowmax_matrix[ri, rm] = matrix[ri, rm]

# Step 2: from that intermediate matrix, keep only the maximum of each column
colrowmax = np.argmax(rowmax_matrix, 0)
colrowmax_matrix = np.zeros((rows, cols))
for ci, cm in enumerate(colrowmax):
    colrowmax_matrix[cm, ci] = rowmax_matrix[cm, ci]
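For what it's worth, the same two-step idea can also be written without explicit Python loops using fancy indexing (just a sketch, reusing the matrix defined above):
# Step 1 without a loop: place each row's maximum into an otherwise-zero matrix
row_idx = np.arange(matrix.shape[0])
row_argmax = matrix.argmax(axis=1)
rowmax_matrix = np.zeros_like(matrix)
rowmax_matrix[row_idx, row_argmax] = matrix[row_idx, row_argmax]

# Step 2 without a loop: keep each column's maximum of that intermediate matrix
col_idx = np.arange(matrix.shape[1])
col_argmax = rowmax_matrix.argmax(axis=0)
colrowmax_matrix = np.zeros_like(matrix)
colrowmax_matrix[col_argmax, col_idx] = rowmax_matrix[col_argmax, col_idx]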
This is probably not the final answer, but may help to formulate the desired algorithm precisely.
I have a 2D numpy array "bigrams" of shape (851, 851) with float values inside. I want to get the top ten values from this array and I want their coordinates.
I know that np.amax(bigrams) can return the single highest value, so that's basically what I want but then for the top ten.
As a numpy noob, I wrote some code using a loop to get the top values per row and then np.where() to get the coordinates, but I feel there must be a smarter way to solve this.
You can flatten and use argsort.
idxs = np.argsort(bigrams.ravel())[-10:]
rows, cols = idxs//851, idxs%851
print(bigrams[rows,cols])
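Equivalently, np.unravel_index recovers the coordinates without hard-coding the width; note that these ten indices come back ordered from smallest to largest value:
rows, cols = np.unravel_index(idxs, bigrams.shape)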
An alternative would be to do a partial sorting with argpartition.
partition = np.argpartition(bigrams.ravel(),-10)[-10:]
max_ten = bigrams[partition//851,partition%851]
You will get the top ten values and their coordinates, but they won't be sorted. You can sort this smaller array of ten values later if you want.
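If you do want that smaller set ordered, a short follow-up sort of just those ten values (a sketch) would be:
order = np.argsort(max_ten)[::-1]  # descending by value
top_vals = max_ten[order]
top_coords = list(zip(partition[order] // 851, partition[order] % 851))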
Does anyone know how to combine integer indices in numpy? Specifically, I've got the results of a few np.wheres and I would like to extract the elements that are common between them.
For context, I am trying to populate a large 3d array with the number of elements that are between boundary values of each cell, i.e. I have records of individual events including their time, latitude and longitude. I want to grid this into a 3D frequency matrix, where the dimensions are time, lat and lon.
I could loop round the array elements doing an np.where(timeCondition & latCondition & lonCondition), populating each cell with the length of the where result, but I figured this would be very inefficient, as you would have to repeat a lot of the wheres.
Would it be better to just have a list of wheres for the cells in each dimension, and then loop through, logically combining them?
As @ali_m said, using bitwise and should be much faster, but to answer your question:
Call ravel_multi_index() to convert the multi-dimensional indices into 1-dimensional indices.
Call intersect1d() to get the indices that are in both conditions.
Call unravel_index() to convert the 1-dimensional indices back into multi-dimensional indices.
Here is the code:
import numpy as np
a = np.random.rand(10, 20, 30)
idx1 = np.where(a>0.2)
idx2 = np.where(a<0.4)
ridx1 = np.ravel_multi_index(idx1, a.shape)
ridx2 = np.ravel_multi_index(idx2, a.shape)
ridx = np.intersect1d(ridx1, ridx2)
idx = np.unravel_index(ridx, a.shape)
np.allclose(a[idx], a[(a>0.2) & (a<0.4)])
or you can use ridx directly:
a.ravel()[ridx]
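For the original gridding problem itself, a combined boolean mask per cell may still be slow; one loop-free sketch uses np.histogramdd to build the 3D frequency matrix directly (the array names and bin edges below are hypothetical, not from the question):
import numpy as np

# Hypothetical event records: one entry per event
times = np.random.rand(1000) * 24          # e.g. hour of the event
lats = np.random.rand(1000) * 180 - 90
lons = np.random.rand(1000) * 360 - 180

# Hypothetical cell boundaries for each dimension
time_edges = np.linspace(0, 24, 25)
lat_edges = np.linspace(-90, 90, 19)
lon_edges = np.linspace(-180, 180, 37)

# counts[i, j, k] = number of events falling in that (time, lat, lon) cell
counts, _ = np.histogramdd((times, lats, lons),
                           bins=(time_edges, lat_edges, lon_edges))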
I am looking for coding examples to learn NumPy.
Usage would be dtype='object'.
To construct the array, the code used would be
a = np.asarray(d, dtype='object')
not np.asarray(d) or np.asarray(d, dtype='float32').
Is sorting any different than with float32/64?
Coming from Excel "cell" equations, I'm wrapping my head around row/column math.
Ex:
A = array([['a',2,3,4],['b',5,6,2],['c',5,1,5]], dtype='object')
which gives:
[['a', 2, 3, 4],
 ['b', 5, 6, 2],
 ['c', 5, 1, 5]]
Create a new array with:
How would I sort high to low by [3]?
How would I calculate for an entire column, (1,1) - (1,0)? Example without sorting A:
['b',3],
['c',0]
How would I calculate for the entire array, (1,1) - (2,0)? Example without sorting A:
['b',2],
['c',-1]
Although I still cannot understand exactly what you are asking, here is my best guess. Let's say you want to sort A by the values in the 3rd column:
import numpy as np

A = np.array([['a', 2, 3, 4], ['b', 5, 6, 2], ['c', 5, 1, 5]], dtype='object')
ii = np.argsort(A[:, 2])  # indices that sort the rows by the 3rd column
print(A[ii, :])
Here the rows have been sorted according to the 3rd column, but each row is left unsorted.
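Since the question asked for high to low, one option (my addition, not part of the original answer) is simply to reverse that index array:
ii_desc = np.argsort(A[:, 2])[::-1]  # descending order by the 3rd column
print(A[ii_desc, :])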
Subtracting all of the columns is a problem because of the string objects; however, if you exclude them, you can, for example, subtract the 3rd row from the 1st with:
A[0,1:] - A[2,1:]
If I didn't understand the basic point of your question, then please revise it. I highly recommend you take a look at the numpy tutorial and documentation if you have not done so already:
http://docs.scipy.org/doc/numpy/reference/
http://docs.scipy.org/doc/numpy/user/