Suppose I have an arbitrary number of arrays, each of arbitrary length. I would like to construct the n-dimensional array of all the combinations of the values in the arrays. Or, even better, a list of all the combinations.
However, I would also like the previous "diagonal" element for each combination, except when such an element does not exist, in which case the missing values are set to, say, -inf.
Take, for example, the following simple 2-D case:
v1=[-2,2]
v2=[-3,3]
From which I would get all the combinations
[[-2,-3],
[-2,3],
[2,-3],
[2,3]]
Or in 2D array / matrix form:
          -3       3
-2     -2,-3    -2,3
 2      2,-3     2,3
Now I would also like a new column with the previous "diagonal" elements (in this case there is only 1 real such case) for each element. By previous "diagonal" element I mean the element at index i-1, j-1, k-1, ..., n-1. On the margins we take all the previous values that are possible.
1         2
-2,-3     -inf,-inf
-2, 3     -inf,-3
 2,-3     -2,-inf
 2, 3     -2,-3
Edit: here is the code for the 2D case, which is not much use for the general n-case.
import math
v1=[-3,-1,2,4]
v2=[-2,0,2]
tmp=[]
tmp2=[]
for i in range(0,len(v1)):
    for j in range(0,len(v2)):
        tmp.append([v1[i],v2[j]])
        if i==0 and j==0:
            tmp2.append([-math.inf,-math.inf])
        elif i==0:
            tmp2.append([-math.inf,v2[j-1]])
        elif j==0:
            tmp2.append([v1[i-1],-math.inf])
        else:
            tmp2.append([v1[i-1],v2[j-1]])
And so
tmp
[[-3, -2],
[-3, 0],
[-3, 2],
[-1, -2],
[-1, 0],
[-1, 2],
[2, -2],
[2, 0],
[2, 2],
[4, -2],
[4, 0],
[4, 2]]
and
tmp2
[[-inf, -inf],
[-inf, -2],
[-inf, 0],
[-3, -inf],
[-3, -2],
[-3, 0],
[-1, -inf],
[-1, -2],
[-1, 0],
[2, -inf],
[2, -2],
[2, 0]]
Take a look at itertools.product().
To get the "diagonals" you could take the product of the vectors' indices instead of the vectors themselves. That way you can access the values of each combination as well as the previous values of the combination.
Example:
import itertools
v1=[-2,2]
v2=[-3,3]
vectors = [v1, v2]
combs = list(itertools.product(*[range(len(v)) for v in vectors]))
print(combs)
[(0, 0), (0, 1), (1, 0), (1, 1)]
print([[vectors[vi][ci] for vi, ci in enumerate(comb)] for comb in combs])
[[-2, -3], [-2, 3], [2, -3], [2, 3]]
print([[(vectors[vi][ci-1] if ci > 0 else -float('inf')) for vi, ci in enumerate(comb)] for comb in combs])
[[-inf, -inf], [-inf, -3], [-2, -inf], [-2, -3]]
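The index-product idea carries over to any number of vectors. The following is a minimal sketch rather than a definitive implementation; the function name combos_with_previous and its fill parameter are made up for illustration.

```python
import itertools
import math


def combos_with_previous(vectors, fill=-math.inf):
    """Return (combos, previous) for any number of input vectors.

    combos[k] is one combination of values; previous[k] holds, for each
    position, the value one index earlier in that vector, or `fill` when
    the index is already 0.
    """
    combos, previous = [], []
    # Iterate over index tuples so both the current and the previous
    # value of every vector are reachable.
    for idx in itertools.product(*(range(len(v)) for v in vectors)):
        combos.append([v[i] for v, i in zip(vectors, idx)])
        previous.append([v[i - 1] if i > 0 else fill
                         for v, i in zip(vectors, idx)])
    return combos, previous


# The 2D data from the question's own code; a third vector could be
# appended to the list without changing the function.
combos, previous = combos_with_previous([[-3, -1, 2, 4], [-2, 0, 2]])
```

Because itertools.product varies the last index fastest, the ordering matches the nested loops in the question, so combos and previous line up with tmp and tmp2 there.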
This question has three related parts. Consider the numpy array sample, P, having 4 columns.
import numpy as np
P = np.array([[-4, 5, 2, -3],
              [-5, 6, 0, -5],
              [-6, 5, -2, 5],
              [1, -2, 1, -2],
              [2, -4, -6, 8],
              [-4, 9, -4, 2],
              [0, -8, -8, 1]])
I'm hoping to learn how to build three new arrays:
a) P1: This is P where the first element of a row has a match in the last 3 elements.
b) P2: This is P where the first 2 elements of a row have a match in the last 2 elements.
c) P3: This is P where the first 3 elements of a row have a match in the last element.
The outcomes, for the small sample array, would be:
P1 = [[-5, 6, 0, -5],
      [1, -2, 1, -2],
      [-4, 9, -4, 2]]
P2 = [[-5, 6, 0, -5],
      [-6, 5, -2, 5],
      [1, -2, 1, -2],
      [-4, 9, -4, 2],
      [0, -8, -8, 1]]
P3 = [[-5, 6, 0, -5],
      [-6, 5, -2, 5],
      [1, -2, 1, -2]]
P1 and P3 are constructed the same way:
P1mask = (P[:, 0:1] == P[:, 1:]).any(axis=1)
P3mask = (P[:, -1:] == P[:, :-1]).any(axis=1)
P1 = P[P1mask, :]
P3 = P[P3mask, :]
The only really interesting thing here is that I'm indexing the columns as slices 0:1 and -1: instead of just 0 and -1 to preserve shape and enable broadcasting.
P2 can be constructed in a similar manner, although the solution is not very general:
P2mask = (P[:, 0:1] == P[:, 2:]).any(axis=1) | (P[:, 1:2] == P[:, 2:]).any(axis=1)
P2 = P[P2mask, :]
A more general solution would be to broadcast the two segments together with a new dimension so that the comparison done with | manually above can be automated:
split = 2
P2mask = (P[:, :split, None] == P[:, None, split:]).any(axis=(1, 2))
P2 = P[P2mask, :]
P1 and P3 are just the cases for split = 1 and split = 3, respectively.
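To make the role of split concrete, here is a small sketch that wraps the broadcast comparison in a helper and builds all three arrays from the sample P. The helper name shared_value_mask is an assumption introduced for this example only.

```python
import numpy as np


def shared_value_mask(P, split):
    """True for rows where any of the first `split` values also
    appears among the remaining values of the same row."""
    left = P[:, :split, None]   # shape (rows, split, 1)
    right = P[:, None, split:]  # shape (rows, 1, cols - split)
    # Broadcasting compares every left value with every right value.
    return (left == right).any(axis=(1, 2))


P = np.array([[-4, 5, 2, -3],
              [-5, 6, 0, -5],
              [-6, 5, -2, 5],
              [1, -2, 1, -2],
              [2, -4, -6, 8],
              [-4, 9, -4, 2],
              [0, -8, -8, 1]])

P1 = P[shared_value_mask(P, 1)]
P2 = P[shared_value_mask(P, 2)]
P3 = P[shared_value_mask(P, 3)]
```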
You want to select all rows that fulfill a given condition, so you need to iterate over the rows of P, build a boolean array and apply it to the rows of P. In your case, the easiest way I can think of to check whether there are shared elements is to create two sets and check whether their intersection is empty. This can be done via set.isdisjoint.
Final code:
P1 = P[[not set(row[:1]).isdisjoint(row[1:]) for row in P], :]
Analogous for P2 and P3.
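One way to spell out the analogous cases is to pass the split position as a parameter. This is only a sketch; the helper name rows_sharing_values is hypothetical and not part of NumPy.

```python
import numpy as np


def rows_sharing_values(P, split):
    """Keep rows whose first `split` values share at least one value
    with the rest of the row (set-based, one Python-level pass per row)."""
    keep = [not set(row[:split]).isdisjoint(row[split:]) for row in P]
    return P[keep]


P = np.array([[-4, 5, 2, -3],
              [-5, 6, 0, -5],
              [-6, 5, -2, 5],
              [1, -2, 1, -2],
              [2, -4, -6, 8],
              [-4, 9, -4, 2],
              [0, -8, -8, 1]])

P1 = rows_sharing_values(P, 1)
P2 = rows_sharing_values(P, 2)
P3 = rows_sharing_values(P, 3)
```

The loop runs in Python rather than inside NumPy, so for large arrays the broadcast version is likely to be faster, but this form is easy to read and adapt.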
Within each list of lists, I want to keep only the list whose second element has the smallest absolute value among all the second elements.
I've tried to use a list comprehension with filter and min(lst,key=abs), without success.
Here is an example with three lists of lists:
input_list = [[[0, -5, 'rising']],
[[0, -5, 'boost'], [0, -2, 'rise'], [0, -1, 'increase']],
[[1, -2, 'decrease'], [0, -3, 'lower']]]
For instance, the second list of lists is composed of three lists, and the second element with the smallest absolute value among them is -1, so out of this list of lists I want to keep only [0, -1, 'increase'].
Here is the desired output :
output_list = [[0, -5, 'rising'],
[0, -1, 'increase'],
[1, -2, 'decrease']]
You could try sorted to get the minimum:
>>> output_list = [sorted(inner_list, key=lambda x: abs(x[1]))[0]
...                for inner_list in input_list]
>>> output_list
[[0, -5, 'rising'], [0, -1, 'increase'], [1, -2, 'decrease']]
Use a list comprehension with the min() function and a custom key=:
input_list = [[[0, -5, 'rising']],
[[0, -5, 'boost'], [0, -2, 'rise'], [0, -1, 'increase']],
[[1, -2, 'decrease'], [0, -3, 'lower']]]
out = [min(l, key=lambda k: abs(k[1])) for l in input_list]
print(out)
Prints:
[[0, -5, 'rising'], [0, -1, 'increase'], [1, -2, 'decrease']]
Let's say that I have the following array of numbers:
array([[-3 , 3],
[ 2, -1],
[-4, -4],
[-4, -4],
[ 0, 3],
[-3, -2],
[-4, -2]])
I would then like to compute the norm of the distance between each pair of consecutive numbers in the columns, i.e.
array([[norm(2--3), norm(-1-3)],
       [norm(-4-2), norm(-4--1)],
       [norm(-4--4), norm(-4--4)],
       [norm(0--4), norm(3--4)],
       [norm(-3-0), norm(-2-3)],
       [norm(-4--3), norm(-2--2)]])
I would then like to take the mean of each column.
Is there a quick and efficient way of doing this in Python? I've been trying but have had no luck so far.
Thank you for your help!
This will do the job:
np.mean(np.absolute(a[1:]-a[:-1]),0)
This returns
array([ 3.16666667, 3.16666667])
Explanation:
First of all, np.absolute(a[1:]-a[:-1]) returns
array([[5, 4],
[6, 3],
[0, 0],
[4, 7],
[3, 5],
[1, 0]])
which is the array of the absolute values of the differences (I assume that by norm of a number you mean absolute value). Then applying np.mean with axis=0 returns the average value of every column.
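The same computation can also be written with np.diff, which takes the consecutive differences along an axis. A short sketch on the array from the question, under the same assumption that "norm" means absolute value:

```python
import numpy as np

a = np.array([[-3, 3],
              [2, -1],
              [-4, -4],
              [-4, -4],
              [0, 3],
              [-3, -2],
              [-4, -2]])

# Differences between consecutive rows, taken column by column.
step_sizes = np.abs(np.diff(a, axis=0))

# Mean absolute step per column.
column_means = step_sizes.mean(axis=0)
```

Here np.diff(a, axis=0) is equivalent to a[1:] - a[:-1], so both forms give the same column means.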
I have a data file with 2 columns, x ranging from -5 to 4 and f(x). I need to add a third column with |f(x)| the absolute value of f(x). Then I need to export the 3 columns as a new data file.
Currently my code looks like this:
from numpy import *
data = genfromtxt("task1.dat")
c = []
ab = abs(data[:,1])
ablist = ab.tolist()
datalist = data.tolist()
c.append(ablist)
c.append (datalist)
A = asarray (c)
savetxt("task1b.dat", A)
It gives me the following error message for line "A = asarray(c)":
ValueError : setting an array element with a sequence.
Does someone know a quick and efficient way to add this column and export the data file?
You are getting a list within a list in c.
Anyway, I think this is much clearer:
import numpy as np
data = np.genfromtxt("task1.dat")
data_new = np.hstack((data, np.abs(data[:,-1]).reshape((-1,1))))
np.savetxt("task_out.dat", data_new)
c is a list, and when you execute
c.append(ablist)
c.append (datalist)
it appends two lists of different shapes to c. It will probably end up looking like this:
c == [[....], [[....], [....]]]
which numpy.asarray cannot parse because of that shape difference.
(I say "probably" because I am assuming that genfromtxt("task1.dat") returns a 2-D array.)
What you can do to concatenate the columns is:
from numpy import *
data = genfromtxt("task1.dat")
ab = abs(data[:,1])
c = concatenate((data, ab.reshape(-1,1)), axis=1)
savetxt("task1b.dat", c)
data is a 2d array like:
In [54]: data=np.arange(-5,5).reshape(5,2)
In [55]: data
Out[55]:
array([[-5, -4],
[-3, -2],
[-1, 0],
[ 1, 2],
[ 3, 4]])
In [56]: ab=abs(data[:,1])
There are various ways to concatenate 2 arrays. In this case, data is 2d, and ab is 1d, so you have to take some steps to ensure they are both 2d. np.column_stack does that for us.
In [58]: np.column_stack((data,ab))
Out[58]:
array([[-5, -4, 4],
[-3, -2, 2],
[-1, 0, 0],
[ 1, 2, 2],
[ 3, 4, 4]])
With a small change in indexing we could make ab a column array from the start, and simply concatenate on the 2nd axis:
ab=abs(data[:,[1]])
np.concatenate((data,ab),axis=1)
==================
The same numbers with your tolist produce a c like
In [72]: [ab.tolist()]+[data.tolist()]
Out[72]: [[4, 2, 0, 2, 4], [[-5, -4], [-3, -2], [-1, 0], [1, 2], [3, 4]]]
That is not good input for array.
To go the list route you need to do an iteration over a zip:
In [86]: list(zip(data,ab))
Out[86]:
[(array([-5, -4]), 4),
(array([-3, -2]), 2),
(array([-1, 0]), 0),
(array([1, 2]), 2),
(array([3, 4]), 4)]
In [87]: c=[]
In [88]: for i,j in zip(data,ab):
   ....:     c.append(i.tolist()+[j])
   ....:
In [89]: c
Out[89]: [[-5, -4, 4], [-3, -2, 2], [-1, 0, 0], [1, 2, 2], [3, 4, 4]]
In [90]: np.array(c)
Out[90]:
array([[-5, -4, 4],
[-3, -2, 2],
[-1, 0, 0],
[ 1, 2, 2],
[ 3, 4, 4]])
Obviously this will be slower than the array concatenate, but studying this might help you understand both arrays and lists.
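Putting the pieces together, the whole load, append, save round trip might look like the sketch below. The sample values and the temporary directory are assumptions made so the example runs on its own; with a real file, only the genfromtxt, column_stack and savetxt lines are needed.

```python
import os
import tempfile

import numpy as np

# Made-up stand-in for task1.dat: column 0 is x, column 1 is f(x).
sample = np.array([[-5.0, -25.0],
                   [-2.0, -4.0],
                   [0.0, 0.0],
                   [4.0, 16.0]])

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "task1.dat")
    dst = os.path.join(tmp, "task1b.dat")
    np.savetxt(src, sample)

    data = np.genfromtxt(src)
    # column_stack turns the 1-D |f(x)| array into a third column.
    out = np.column_stack((data, np.abs(data[:, 1])))
    np.savetxt(dst, out)

    # Read the result back to confirm the file has three columns.
    reloaded = np.genfromtxt(dst)
```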
I have written Python code that, for every row of a matrix except the first, replaces any entry that is less than zero or NaN with the entry directly above it. It is not working as I need. What is the error, and is there a more efficient way to do this without using multiple for loops? The input matrix values may differ from case to case and may contain float values.
import numpy as np
from math import isnan
data = [[0, -1, 2],
        [7,8.1,-3],
        [-8,5, -1],
        ['N',7,-1]]
m, n = np.shape(data)
for i in range (1,m):
    for j in range (n):
        if data[i][j] < 0 or isnan:
            data[i][j] = data[i-1][j]
print data
The expected output is
[[0,-1,2],
[7,8.1,2],
[7,5,2],
[7,7,2]]
But, I am getting
[[0, -1, 2],
[0, -1, 2],
[0, -1, 2],
[0, -1, 2]]
You're saying if data[i][j] < 0 or isnan:. Here isnan is a function object, not a call, and a function object is always truthy, so the if statement is always True. You would want isnan(data[i][j]). But in this case, since the bad entry is the string 'N', it looks like what you want to check is if not isinstance(data[i][j], (int, float)).
import numpy as np
data = [
    [0, -1, 2],
    [7, 8, -3],
    [-8, 5, -1],
    ['N', 7, -1]
]
m, n = np.shape(data)
for i in range(1, m):
    for j in range(n):
        # Check the type first so a non-numeric entry such as 'N' is never compared with 0.
        if not isinstance(data[i][j], (int, float)) or data[i][j] < 0:
            data[i][j] = data[i-1][j]
for row in data:
    print(row)
Output:
[0, -1, 2]
[7, 8, 2]
[7, 5, 2]
[7, 7, 2]
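As for avoiding the nested loops, one possible NumPy approach is to compute, for every cell, the index of the last valid row at or above it and then gather from those rows. This is a sketch under two assumptions: the non-numeric placeholder 'N' has already been converted to NaN, and the first row is always treated as valid. The function name fill_from_previous is made up for this example.

```python
import numpy as np


def fill_from_previous(data):
    """Replace negative or NaN entries (below the first row) with the
    nearest valid value above them in the same column."""
    arr = np.asarray(data, dtype=float)
    invalid = np.isnan(arr) | (arr < 0)
    invalid[0] = False  # the first row is kept as-is

    # Row index of each cell, zeroed where the cell is invalid, then a
    # running maximum down each column gives the last valid row so far.
    rows = np.arange(arr.shape[0])[:, None]
    last_valid = np.maximum.accumulate(np.where(invalid, 0, rows), axis=0)

    return arr[last_valid, np.arange(arr.shape[1])]


data = [[0, -1, 2],
        [7, 8.1, -3],
        [-8, 5, -1],
        [np.nan, 7, -1]]
filled = fill_from_previous(data)
```

Because the running maximum skips over consecutive invalid cells, this matches the loop's behaviour of copying an already-repaired value from the row above. The result is a float array.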