finding the max of a column in an array - python

def maxvalues():
    for n in range(1,15):
        dummy=[]
        for k in range(len(MotionsAndMoorings)):
            dummy.append(MotionsAndMoorings[k][n])
        max(dummy)
        L = [x + [max(dummy)]] ## to be corrected (adding columns with value max(dummy))
        ## suggest code to add new row to L and for next function call, it should save values here.
I have an array of size (k x n) and I need to pick the max values of the first column in that array. Is there a simpler way than what I tried? My main aim is to append the result to L in columns rather than rows. If I just append, it adds the values at the end. I would like this to be done in columns for row 0 in L, because I'll call this function again, add a new row to L and do the same. Please suggest.

General suggestions for your code
First of all, accessing globals inside a function is not very handy. It works, but it's not considered good style. So instead of using:
def maxvalues():
    do_something_with(MotionsAndMoorings)
you should do it with an argument:
def maxvalues(array):
    do_something_with(array)
MotionsAndMoorings = something
maxvalues(MotionsAndMoorings) # pass it to the function.
The next strange thing is that you seem to exclude the first column of your array:
for n in range(1,15):
I think that's unintended. The first element of a list has the index 0 and not 1. So I guess you wanted to write:
for n in range(0,15):
or even better for arbitrary lengths:
for n in range(len(array[0])): # len(array[0]) is the length of the first row, i.e. the number of columns
Alternatives to your iterations
But you don't need to write this loop yourself, because the max function already accepts a very nice keyword argument (key) that lets you compare the rows by a single column:
import operator
column = 2
max(array, key=operator.itemgetter(column))[column]
Here max returns the whole row whose column-th element is the largest (you just choose the column you want), and the trailing [column] then extracts that element from the row.
So to get a list of all your maximums for each column you could do:
[max(array, key=operator.itemgetter(column))[column] for column in range(len(array[0]))]
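As a cross-check, the same per-column maxima can also be obtained by walking over the columns with zip and calling max on each one. This is only a minimal sketch, assuming a rectangular list of lists; the name column_maxima and the sample data are made up for illustration and are not from the question.

```python
import operator

def column_maxima(array):
    """Return the maximum of every column of a rectangular list of lists."""
    # zip(*array) yields the columns of array as tuples
    return [max(column) for column in zip(*array)]

# Hypothetical sample data, not taken from the question
sample = [[1, 2, 3], [4, 5, 6], [0, 2, 10], [0, 2, 10]]

via_zip = column_maxima(sample)
via_key = [max(sample, key=operator.itemgetter(c))[c] for c in range(len(sample[0]))]

print(via_zip)  # [4, 5, 10]
print(via_key)  # [4, 5, 10]
```

Both variants give the same list; the zip version just avoids indexing each row twice.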
I'm not sure what your L is, but you should probably also pass it as an argument to the function:
def maxvalues(array, L): # another argument here
Since I don't know what x and L are supposed to be, I won't go further into that. It looks like you want to turn the columns of MotionsAndMoorings into rows and the rows into columns (a transpose). If so, you can do it with:
dummy = [[MotionsAndMoorings[j][i] for j in range(len(MotionsAndMoorings))] for i in range(len(MotionsAndMoorings[0]))]
that's a list comprehension that converts a list like:
[[1, 2, 3], [4, 5, 6], [0, 2, 10], [0, 2, 10]]
to a transposed list, with rows and columns swapped:
[[1, 4, 0, 0], [2, 5, 2, 2], [3, 6, 10, 10]]
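The same swap of rows and columns can also be written with the built-in zip. A short sketch, assuming every row has the same length; the variable names are illustrative only.

```python
rows = [[1, 2, 3], [4, 5, 6], [0, 2, 10], [0, 2, 10]]

# The index-based comprehension from the answer
by_index = [[rows[j][i] for j in range(len(rows))] for i in range(len(rows[0]))]

# zip(*rows) pairs up the i-th elements of all rows; list() turns each tuple into a list
by_zip = [list(column) for column in zip(*rows)]

print(by_zip)              # [[1, 4, 0, 0], [2, 5, 2, 2], [3, 6, 10, 10]]
print(by_zip == by_index)  # True
```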
Alternative packages
But, like roadrunner66 already said, sometimes it's easiest to use a library like numpy or pandas that already has very advanced and fast functions that do exactly what you want and are very easy to use.
For example, you convert a Python list to a numpy array simply by:
import numpy as np
Motions_numpy = np.array(MotionsAndMoorings)
you get the maximum of the columns by using:
maximums_columns = np.max(Motions_numpy, axis=0)
You don't even need to convert it to an np.array to use np.max or to transpose it (turn rows into columns and columns into rows):
transposed = np.transpose(MotionsAndMoorings)
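Coming back to the goal stated in the question (one row of column maxima added to L per function call), one possible sketch follows. It assumes L is a plain list of rows owned by the caller; the name append_max_row and the sample arrays are hypothetical, since the question does not define x or L.

```python
import numpy as np

def append_max_row(L, array):
    """Append the column maxima of array to L as one new row and return L."""
    # np.max(..., axis=0) takes the maximum down each column
    L.append(np.max(array, axis=0).tolist())
    return L

L = []  # assumed to start empty; each call adds one row
append_max_row(L, [[1, 2, 3], [4, 5, 6]])
append_max_row(L, [[7, 0, 1], [2, 9, 3]])
print(L)  # [[4, 5, 6], [7, 9, 3]]
```

Passing L in and returning it keeps the function free of globals, in line with the first suggestion above.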
I hope this answer is not too unstructured. Some parts are suggestions for your function and some are alternatives. You should pick the parts that you need, and if you have any trouble with it, just leave a comment or ask another question. :-)

An example with a random input array, showing that you can take the max in either axis easily with one command.
import numpy as np
aa = np.random.random([4,3])
print(aa)
print()
print(np.max(aa, axis=0)) # maximum of each column
print()
print(np.max(aa, axis=1)) # maximum of each row
Output:
[[ 0.51972266 0.35930957 0.60381998]
[ 0.34577217 0.27908173 0.52146593]
[ 0.12101346 0.52268843 0.41704152]
[ 0.24181773 0.40747905 0.14980534]]
[ 0.51972266 0.52268843 0.60381998]
[ 0.60381998 0.52146593 0.52268843 0.40747905]

Related

How to get specific index of np.array of np.arrays fast

At the most basic I have the following dataframe:
a = {'possibility' : np.array([1,2,3])}
b = {'possibility' : np.array([4,5,6])}
df = pd.DataFrame([a,b])
This gives me a dataframe of size 2x1, like so:
row 1: np.array([1,2,3])
row 2: np.array([4,5,6])
I have another vector of length 2. Like so:
[1,2]
These represent the index I want from each row.
So if I have [1,2] I want: from row 1: 2, and from row 2: 6.
Ideally, my output is [2,6] in a vector form, of length 2.
Is this possible? I can easily run through a for loop, but I am looking for FAST approaches, ideally vectorized ones, since the data is already in pandas/numpy.
In the actual use case I need this to work for roughly 300k-400k rows, and it has to run inside optimization problems (hence the need for speed).
You could transform to a multi-dimensional numpy array and take_along_axis:
v = np.array([1,2])
a = np.vstack(df['possibility'])
np.take_along_axis(a.T, v[None], axis=0)[0]
output: array([2, 6])
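An alternative that may read more directly is integer-array ("fancy") indexing with one row index per row. This is a sketch under the assumption that all arrays in the column have the same length; to keep it self-contained, the array a is built by hand here and stands in for np.vstack(df['possibility']).

```python
import numpy as np

# Stand-in for a = np.vstack(df['possibility']) from the answer above
a = np.vstack([np.array([1, 2, 3]), np.array([4, 5, 6])])  # shape (2, 3)
v = np.array([1, 2])

# Pair row i with column v[i]: picks a[0, 1] and a[1, 2]
picked = a[np.arange(len(v)), v]
print(picked)  # [2 6]
```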

How to update only selected values in a 2 dimensional list without using for loop?

I have a huge matrix with around 80000 rows and 66000 columns. I need to update selected values in each row. These selected values vary from row to row. For example, I might have to update the 346th, 446th, 789th and 321st column values for the first row, and for the second row I might have to update the 821st, 564th, 101st and 781st column values. I hope you get the situation.
Here, I am simulating the problem using a small matrix.
Suppose I have a 2 dimensional list/matrix.
matrix = [ [1,2,3], [4,5,6], [7,8,9]]
In the actual problem, I need to update all the rows but here for the sake of simplicity I am considering only 1 row. i.e. 2nd row. I wish to update 1st and 2nd values of 2nd row and keep the rest of the values in 2nd row as they are.
I need to do it without using for loops.
The code I tried is as follows :
index_list = [0,1]
matrix[1] = [ matrix[1][index] + 1 for index in index_list ]
print(matrix)
Here, index_list is the list of selected columns that need to be updated. The output I get is :
[[1, 2, 3], [5, 6], [7, 8, 9]]
The output I need / expected output is :
[[1, 2, 3], [5, 6, 6], [7, 8, 9]]
So, the question is: I wish to update only the 1st and 2nd values of the second row of the matrix given above and keep the rest of the values in the 2nd row as they are. This needs to be done without for loops because of time constraints. I am trying to use a list comprehension because it is relatively fast. Could you please help with it?
I forgot to mention the code is in python, and we can use pandas, numpy if required.
matrix[1] = [matrix[1][i] + 1 if i in index_list else matrix[1][i] for i in range(len(matrix[1]))]
This solution worked.
You need to cover the case where the index is not in the list, preserving those values:
matrix[1] = [ matrix[1][index] + 1 if index in index_list
              else matrix[1][index]
              for index in range(len(matrix[1])) ]
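Since the question allows numpy, the update can also be done in place with integer-array indexing instead of rebuilding the row. The following is a sketch on the small example matrix only; the rows and cols arrays in the second half are hypothetical and just show how different columns could be selected for different rows.

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
index_list = [0, 1]

# Add 1 to the selected columns of the 2nd row, leaving the rest untouched
matrix[1, index_list] += 1
print(matrix.tolist())  # [[1, 2, 3], [5, 6, 6], [7, 8, 9]]

# Hypothetical per-row selection: one (row, column) pair per update
rows = np.array([0, 0, 2])
cols = np.array([2, 0, 1])
matrix[rows, cols] += 1
print(matrix.tolist())  # [[2, 2, 4], [5, 6, 6], [7, 9, 9]]
```

Note that with += a (row, column) pair that appears more than once in rows and cols is still incremented only once.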

How to calculate standard deviation of count-value pairs

In numpy the function for calculating the standard deviation expects a list of values like [1, 2, 1, 1] and calculates the standard deviation from those. In my case I have a nested list of values and counts like [[1, 2], [3, 1]], where the first list contains the values and the second contains the count of how often the corresponding values appear.
I am looking for a clean way of calculating the standard deviation for a given list like above, clean meaning one of the following: an already existing function in numpy, scipy, pandas etc.; a more pythonic approach to the problem; or a more concise and nicely readable solution.
I already have a working solution that converts the nested count-value list into a flattened list of values and calculates the standard deviation with the function above, but I don't find it very pleasing and would rather have another option.
A minimal working example of my workaround is
import numpy as np
# The usual way
values = [1,2,1,1]
deviation = np.std(values)
print(deviation)
# My workaround for the problem
value_counts = [[1, 2], [3, 1]]
values, counts = value_counts
flattened = []
for value, count in zip(values, counts):
    # append the current value count times
    flattened = flattened + [value]*count
deviation = np.std(flattened)
print(deviation)
The output is
0.4330127018922193
0.4330127018922193
Thanks for any ideas or suggestions :)
You are simply looking for numpy.repeat.
np.std(np.repeat(value_counts[0], value_counts[1]))
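If the counts are large, expanding the values may be wasteful. A possible alternative, sketched here under the assumption that the population standard deviation (the default of np.std, ddof=0) is wanted, is to treat the counts as weights in np.average. The variable names are illustrative.

```python
import numpy as np

value_counts = [[1, 2], [3, 1]]
values = np.asarray(value_counts[0], dtype=float)
counts = np.asarray(value_counts[1])

# Weighted mean, then weighted mean of squared deviations (population variance)
mean = np.average(values, weights=counts)
variance = np.average((values - mean) ** 2, weights=counts)
deviation = np.sqrt(variance)

# Agrees with the expanded version up to floating point rounding
expanded = np.std(np.repeat(value_counts[0], value_counts[1]))
print(np.isclose(deviation, expanded))  # True
```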

Minimum value in each row and each column of a matrix - Python

Can someone please help me out? I am trying to get the minimum value of each row and of each column of this matrix
matrix = [[12,34,28,16],
          [13,32,36,12],
          [15,32,32,14],
          [11,33,36,10]]
So for example: I would want my program to print out that 12 is the minimum value of row 1 and so on.
Let's repeat the task statement: "get the minimum value of each row and of each column of this matrix".
Okay, so, if the matrix has n rows, you should get n minimum values, one for each row. Sounds interesting, doesn't it? So, the code'll look like this:
result1 = [<something> for row in matrix]
Well, what do you need to do with each row? Right, find the minimum value, which is super easy:
result1 = [min(row) for row in matrix]
As a result, you'll get a list of n values, just as expected.
Wait, by now we've only found the minimums for each row, but not for each column, so let's do this as well!
Given that you're using Python 3.x, you can do some pretty amazing stuff. For example, you can loop over columns easily:
result2 = [min(column) for column in zip(*matrix)] # notice the asterisk!
The asterisk in zip(*matrix) passes each row of matrix as a separate argument to zip, like this:
zip(matrix[0], matrix[1], matrix[2], matrix[3])
This doesn't look very readable and is dependent on the number of rows in matrix (basically, you'll have to hard-code them), and the asterisk lets you write much cleaner code.
zip returns tuples, and the ith tuple contains the ith values of all the rows, so these tuples are actually the columns of the given matrix.
Now, you may find this code a bit ugly and want to write the same thing in a more concise way. Sure enough, you can use some functional programming magic:
result1 = list(map(min, matrix))
result2 = list(map(min, zip(*matrix)))
These two approaches are absolutely equivalent.
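Putting the pieces together, a complete script that prints the messages asked for in the question could look like this. It is a sketch using only built-ins; the wording of the messages and the 1-based numbering are assumptions based on the example sentence in the question.

```python
matrix = [[12, 34, 28, 16],
          [13, 32, 36, 12],
          [15, 32, 32, 14],
          [11, 33, 36, 10]]

row_minimums = [min(row) for row in matrix]
column_minimums = [min(column) for column in zip(*matrix)]

# enumerate(..., start=1) numbers rows and columns from 1, as in the question
for number, value in enumerate(row_minimums, start=1):
    print(f"{value} is the minimum value of row {number}")
for number, value in enumerate(column_minimums, start=1):
    print(f"{value} is the minimum value of column {number}")
```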
Use numpy.
>>> import numpy as np
>>> matrix =[[12,34,28,16],
... [13,32,36,12],
... [15,32,32,14],
... [11,33,36,10]]
>>> np.min(matrix, axis=1) # computes minimum in each row
array([12, 12, 14, 10])
>>> np.min(matrix, axis=0) # computes minimum in each column
array([11, 32, 28, 10])

Python - Create an array from columns in file

I have a text file with two columns and n rows. Usually I work with two separate vectors using x,y=np.loadtxt('data',usecols=(0,1),unpack=True), but I would like to have them as an array of the form array=[[a,1],[b,2],[c,3]...], where all the letters correspond to the x-vector and the numbers to the y-vector, so I can ask something like array[0,2]=b. I tried defining
array[0,:]=x but I didn't succeed. Is there any simple way to do this?
In addition, I want to get the respective x-value for certain y-value. I tried with
x_value=np.argwhere(array[:,1]==3)
And I'm expecting the x_value to be c because it corresponds to 3 in column 1 but it doesn't work either.
I think you simply need to not unpack the array you get back from loadtxt. Do:
arr = np.loadtxt('data', usecols=(0,1))
If your file contained:
0 1
2 3
4 5
arr will be like:
[[0, 1],
[2, 3],
[4, 5]]
Note that to index into this array, you need to specify the row first (and indexes start at 0):
arr[1,0] == 2 # True!
You can find the x values that correspond to a given y value with:
x_vals = arr[:,0][arr[:,1]==y_val]
The indexing will return an array, though x_vals will have only a single value if the y_val was unique. If you know in advance there will be only one match for the y_val, you could tack on [0] to the end of the indexing, so you get the first result.
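The whole answer can be tried without a file on disk by feeding np.loadtxt an in-memory text buffer. In this sketch io.StringIO stands in for the 'data' file and the numeric contents are the example from above; note that loadtxt parses floats by default, so the values come back as 2.0 rather than 2.

```python
import io
import numpy as np

# In-memory stand-in for the two-column file 'data'
data = io.StringIO("0 1\n2 3\n4 5\n")
arr = np.loadtxt(data, usecols=(0, 1))

y_val = 3
mask = arr[:, 1] == y_val     # True for rows whose y equals y_val
x_vals = arr[:, 0][mask]      # x values of the matching rows

print(arr[1, 0])   # 2.0
print(x_vals)      # [2.]
print(x_vals[0])   # 2.0, the first (here the only) match
```

If the first column really contains letters as in the question, the default float parsing would fail, and reading with dtype=str is one option to look into.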
