I have a numpy array of 4000*6 (6 columns), and I have a numpy row (1*6) of minimum values (made from another numpy array of 3000*6).
I want to find everything in the large array that is below those values, but with each value compared to its corresponding column.
I've tried the simple way, based on a one column solution I already had:
largearray=[float('nan') if x<min_values else x for x in largearray]
but sadly it didn't work :(.
I can do a for loop for each column and each value, but I was wondering if there is a faster, more elegant solution.
Thanks
EDIT: I'll try to rephrase: I have 6 values, and 6 columns.
I want to find the values in each column that are lower than the corresponding one of the 6 values.
By array I mean a 2D array. Sorry if it wasn't clear;
I'm still thinking in Matlab a bit.
This is my loop solution. It's on a df, not numpy. Still, is there a faster way?
a = 0
for y in dfnames:
    df[y] = [float('nan') if x < minvalues[a] else x for x in df[y]]
    a = a + 1
df is the large array or dataframe
dfnames are the column names i'm interested in.
minvalues are the minimum values for each column. I'm assuming that the order is the same. bad assumption, but works for now.
will appreciate any help making it better
I think you just need
result = largearray.copy()
result[result < min_values] = np.nan
That is, result is a copy of largearray, but any element less than the corresponding entry of min_values is set to nan.
If you want to blank entire rows only when all entries in the row are less than the corresponding column of min_values, then you want:
result = largearray.copy()
result[np.all(result < min_values, axis=1)] = np.nan
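To make the per-column comparison concrete, here is a minimal sketch on a tiny made-up array (the values are illustrative, not the asker's data). It relies on broadcasting: an array of shape (rows, 6) compared with one of shape (6,) lines each column up with its own minimum. Note that the array has to be float for this to work, since nan cannot be stored in an integer array.

```python
import numpy as np

# Hypothetical small stand-ins for the 4000x6 array and its column minimums.
largearray = np.array([[1.0, 5.0, 9.0],
                       [4.0, 2.0, 7.0]])
min_values = np.array([2.0, 3.0, 8.0])

# Shape (2, 3) compared with shape (3,): column j is tested against min_values[j].
mask = largearray < min_values

# np.where builds a new array and leaves largearray untouched.
result = np.where(mask, np.nan, largearray)
print(mask)
print(result)
```

If only the positions are needed rather than a blanked copy, np.argwhere(mask) should return the (row, column) pairs.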
I don't use numpy much, so this may not be the commonly used solution, but this works:
import numpy
largearray = numpy.array([[1,2,3], [3,4,5]])
min_values = numpy.array([3,4,5])
largearray1 = [(float('nan') if not numpy.all(numpy.less(x, min_values)) else x) for x in largearray]
The result should be: [array([1, 2, 3]), nan]
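For the DataFrame loop in the question, a single masked assignment can probably replace the per-column loop. This is only a sketch, assuming pandas is available and using a made-up frame. Wrapping minvalues in a Series keyed by column name is my addition: it lets pandas align the minimums by name, which sidesteps the "same order" assumption the asker mentions.

```python
import numpy as np
import pandas as pd

# Hypothetical data standing in for df, dfnames and minvalues.
df = pd.DataFrame({"a": [1.0, 5.0], "b": [2.0, 0.5], "other": [9.0, 9.0]})
dfnames = ["a", "b"]
minvalues = pd.Series([2.0, 1.0], index=dfnames)

# Keep values that are >= their own column's minimum; the rest become NaN.
sub = df[dfnames]
df[dfnames] = sub.where(sub.ge(minvalues, axis="columns"))
print(df)
```

Columns not listed in dfnames (here "other") are left as they were.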
Related
Probably this is a very easy and silly fix, but I cannot think of anyway to solve this issue.
I have a three-column array with 2000 rows, like so, where the columns represent x, y, z coordinates.
Final_array =np.zeros([2000,3])
Through some for loops I am trying to populate this array's columns. I was able to populate the first 1000 rows of the y values (second column) with the information of another array y_coords1, but I don't know how to populate the remaining 1000 bottom rows with another array y_coords2. Can someone please help?
# putting y values
for y in range(len(y_coords1)):
    Final_array[y, 1] = y_coords1[y]
for w in Final_array[999:2000]:
    for y in range(len(y_coords2)):
        Final_array[y, 1] = y_coords2[y]
print(Final_array)
I have tried some variations on the for loops but I get errors.
I was looking for a different approach to answer this question, but the solution provided by Ulises Bussi worked to solve the issue
Final_array[:, 1] = np.concatenate([y_coords1, y_coords2], axis=0)
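The loop in the question goes wrong because the inner loop indexes Final_array[y, 1] with y starting at 0 again, so it overwrites the first 1000 rows instead of filling the last 1000. Slice assignment is another way to fill the two halves. The sketch below uses made-up coordinate arrays (the real y_coords1 and y_coords2 are not shown in the question) and checks that it matches the concatenate version.

```python
import numpy as np

# Hypothetical stand-ins for the two 1000-element coordinate arrays.
y_coords1 = np.arange(1000, dtype=float)
y_coords2 = np.arange(1000, 2000, dtype=float)

Final_array = np.zeros((2000, 3))

# Rows 0..999 take y_coords1, rows 1000..1999 take y_coords2.
Final_array[:1000, 1] = y_coords1
Final_array[1000:, 1] = y_coords2

# Compare against the concatenate one-liner.
expected = np.concatenate([y_coords1, y_coords2], axis=0)
print(np.array_equal(Final_array[:, 1], expected))
```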
Looking to print the minimum values of numpy array columns.
I am using a loop in order to do this.
The array is shaped (20, 3) and I want to find the min values of columns, starting with the first (i.e. col_value=0)
I have coded
col_value = 0
for col_value in X:
    print(X[:, col_value].min)
    col_value += 1
However, it is coming up with an error
"arrays used as indices must be of integer (or boolean) type"
How do I fix this?
Let me suggest an alternative approach that you might find useful. NumPy's min() has an axis argument that you can use to find minimum values along various dimensions.
Example:
X = np.random.randn(20, 3)
print(X.min(axis=0))
This prints a numpy array with the minimum values of the columns of X.
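As for the error itself: `for col_value in X` iterates over the rows of X, so col_value becomes a row of floats, and using that as an index triggers the "arrays used as indices" message. Also, `.min` without parentheses refers to the method instead of calling it. A minimal sketch with a small made-up array, showing a corrected index loop next to the axis version:

```python
import numpy as np

# Small illustrative array; the asker's X has shape (20, 3).
X = np.array([[4.0, 9.0, 1.0],
              [7.0, 2.0, 8.0],
              [5.0, 6.0, 3.0]])

# Loop over integer column indices and call min() with parentheses.
for col_index in range(X.shape[1]):
    print(X[:, col_index].min())

# Same values in one call, without a loop.
col_mins = X.min(axis=0)
print(col_mins)
```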
You don't need col_value=0 nor do you need col_value+=1.
x = numpy.array([1,23,4,6,0])
print(x.min())
EDIT:
Sorry didn't see that you wanted to iterate through columns.
import numpy as np
X = np.array([[1,2], [3,4]])
for col in X.T:
    print(col.min())
Transposing the matrix is one of the best solutions.
X = np.array([[11, 2, 14],
              [5, 15, 7],
              [8, 9, 20]])
X = X.T  # Transposing the array
for i in X:
    print(min(i))
I have a Pandas dataframe with two columns, x and y, that correspond to a large signal. It is about 3 million rows in size.
[Image: Wavelength from dataframe]
I am trying to isolate the peaks from the signal. After using scipy, I got a 1D Python list corresponding to the indexes of the peaks. However, they are not the actual x-values of the signal, but just the index of their corresponding row:
from scipy.signal import find_peaks
peaks, _ = find_peaks(y, height=(None, peakline))
So, I decided I would just filter the original dataframe by setting all values in its y column to NaN unless they were on an index found in the peak list. I did this iteratively, however, since it is 3000000 rows, it is extremely slow:
peak_index = 0
for data_index in list(data.index):
    if data_index != peaks[peak_index]:
        data[data_index, 1] = float('NaN')
    else:
        peak_index += 1
Does anyone know what a faster method of filtering a Pandas dataframe might be?
Looping is in most cases extremely inefficient when it comes to pandas. Assuming you just need a filtered DataFrame that contains the values of both the x and y columns only where y is a peak, you may use the following piece of code:
df.iloc[peaks]
Alternatively, if you are hoping to retrieve an original DataFrame with y column retaining its peak values and having NaN otherwise, then please use:
df.y = df.y.where(df.y.iloc[peaks] == df.y.iloc[peaks])
Finally, since you seem to care about just the x values of the peaks, you might just rework the first piece in the following way:
df.iloc[peaks].x
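The NaN-everywhere-except-peaks variant can also be written with an explicit boolean mask, which some may find easier to read. This is a sketch on a tiny made-up frame; the peaks array is hard-coded here as a hypothetical stand-in for the positions find_peaks would return.

```python
import numpy as np
import pandas as pd

# Tiny illustrative signal; the real frame has about 3 million rows.
df = pd.DataFrame({"x": [0.0, 0.1, 0.2, 0.3, 0.4],
                   "y": [1.0, 3.0, 2.0, 5.0, 4.0]})
peaks = np.array([1, 3])  # hypothetical row positions of the peaks

# True at peak positions, False everywhere else.
mask = np.zeros(len(df), dtype=bool)
mask[peaks] = True

# Keep y only at the peaks; other rows become NaN.
df["y"] = df["y"].where(mask)

# x-values of the peaks, looked up by position.
peak_x = df["x"].to_numpy()[peaks]
print(df)
print(peak_x)
```

Because the mask is positional, this does not depend on what the DataFrame index looks like.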
Can someone please help me out? I am trying to get the minimum value of each row and of each column of this matrix
matrix =[[12,34,28,16],
[13,32,36,12],
[15,32,32,14],
[11,33,36,10]]
So for example: I would want my program to print out that 12 is the minimum value of row 1 and so on.
Let's repeat the task statement: "get the minimum value of each row and of each column of this matrix".
Okay, so, if the matrix has n rows, you should get n minimum values, one for each row. Sounds interesting, doesn't it? So, the code'll look like this:
result1 = [<something> for row in matrix]
Well, what do you need to do with each row? Right, find the minimum value, which is super easy:
result1 = [min(row) for row in matrix]
As a result, you'll get a list of n values, just as expected.
Wait, by now we've only found the minimums for each row, but not for each column, so let's do this as well!
Given that you're using Python 3.x, you can do some pretty amazing stuff. For example, you can loop over columns easily:
result2 = [min(column) for column in zip(*matrix)] # notice the asterisk!
The asterisk in zip(*matrix) passes each row of matrix as a separate argument to zip, like this:
zip(matrix[0], matrix[1], matrix[2], matrix[3])
This doesn't look very readable and is dependent on the number of rows in matrix (basically, you'll have to hard-code them), and the asterisk lets you write much cleaner code.
zip returns tuples, and the ith tuple contains the ith values of all the rows, so these tuples are actually the columns of the given matrix.
Now, you may find this code a bit ugly and want to write the same thing in a more concise way. Sure enough, you can use some functional programming magic:
result1 = list(map(min, matrix))
result2 = list(map(min, zip(*matrix)))
These two approaches are absolutely equivalent.
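Putting the pieces together for the matrix from the question, a short sketch that also prints the results in the "12 is the minimum value of row 1" style the asker mentions (the exact wording of the messages is my guess at what is wanted):

```python
matrix = [[12, 34, 28, 16],
          [13, 32, 36, 12],
          [15, 32, 32, 14],
          [11, 33, 36, 10]]

# One minimum per row, then one per column via zip(*matrix).
row_mins = [min(row) for row in matrix]
col_mins = [min(column) for column in zip(*matrix)]

# Report with 1-based row and column numbers.
for i, value in enumerate(row_mins, start=1):
    print(f"{value} is the minimum value of row {i}")
for j, value in enumerate(col_mins, start=1):
    print(f"{value} is the minimum value of column {j}")
```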
Use numpy.
>>> import numpy as np
>>> matrix =[[12,34,28,16],
... [13,32,36,12],
... [15,32,32,14],
... [11,33,36,10]]
>>> np.min(matrix, axis=1) # computes minimum in each row
array([12, 12, 14, 10])
>>> np.min(matrix, axis=0) # computes minimum in each column
array([11, 32, 28, 10])
I am looking for coding examples to learn Numpy.
Usage would be dtype='object'.
To construct the array, the code used would be
a = np.asarray(d, dtype='object')
not np.asarray(d) or np.asarray(d, dtype='float32').
Is sorting any different than float32/64?
Coming from Excel "cell" equations, I'm still wrapping my head around row/column math.
Ex:
A = array([['a',2,3,4],
           ['b',5,6,2],
           ['c',5,1,5]], dtype='object')
Create a new array with:
How would I sort high to low by [3]?
How would I calculate (1,1) - (1,0) for the entire column? Example without sorting A:
['b',3],
['c',0]
How would I calculate (1,1) - (2,0) for the entire array? Example without sorting A:
['b',2],
['c',-1]
Despite the fact that I still cannot understand exactly what you are asking, here is my best guess. Let's say you want to sort A by the values in the 3rd column:
import numpy as np
A = np.array([['a',2,3,4],['b',5,6,2],['c',5,1,5]], dtype='object')
ii = np.argsort(A[:,2])
print(A[ii,:])
Here the rows have been sorted according to the 3rd column, but each row is left unsorted.
Subtracting all of the columns is a problem due to the string objects, however if you exclude them, you can for example subtract the 3rd row from the 1st by:
A[0,1:] - A[2,1:]
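Two more guesses at what the question may be after, sketched under stated assumptions: that "[3]" means the column at index 3, and that the column arithmetic is meant to run over the numeric columns only. Reversing the argsort result gives a high-to-low order, and casting the numeric part to float avoids doing arithmetic on Python objects.

```python
import numpy as np

A = np.array([['a', 2, 3, 4], ['b', 5, 6, 2], ['c', 5, 1, 5]], dtype='object')

# Sort rows high to low by the column at index 3 (assumed meaning of "[3]").
order = np.argsort(A[:, 3])[::-1]
A_desc = A[order, :]
print(A_desc)

# Drop the label column and cast the rest to float for arithmetic.
numeric = A[:, 1:].astype(float)

# Column 1 of A minus column 3 of A, for every row at once.
diff = numeric[:, 0] - numeric[:, 2]
print(diff)
```

For rows 'b' and 'c' this gives 3 and 0, which appears to match the first example in the question.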
If I didn't understand the basic point of your question, then please revise it. I highly recommend you take a look at the numpy tutorial and documentation if you have not done so already:
http://docs.scipy.org/doc/numpy/reference/
http://docs.scipy.org/doc/numpy/user/