Originally I had something like this:
a = 1 # Some randomly generated positive integer
b = -1 # Some randomly generated negative integer
c = 0 # Constant 0
i = 0 # Randomly picked from (0, 1, 2)
d = [a, b, c][i]
I would like to vectorise this so that many samples can be generated at once.
So I have three arrays of length N and an index array of length N, and I would like to use the index array to pick, element by element, which of the three arrays each value comes from:
a = np.array([1, 2, 3, 4])
b = np.array([-1, -2, -3, -4])
c = np.array([0, 0, 0, 0])
i = np.array([2, 1, 2, 0])
d = np.array([a, b, c])[i] # Doesn't work
# Would like the result:
d = np.array([0, -2, 0, 4])
d = a * (i == 0) + b * (i == 1) + c * (i == 2) works, but surely there is a way that looks more like the unvectorised code
Make a 2-D array from the three arrays, then use integer array indexing:
>>> e = np.vstack([a,b,c])
>>> i = np.array([2, 1, 2, 0])
>>> e[(i,np.arange(i.shape[0]))]
array([ 0, -2, 0, 4])
>>>
Notice that your answer is on the diagonal of
np.array([a, b, c])[i]
so you can go:
np.array([a, b, c])[i].diagonal()
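A minimal sketch of two other ways to do the same selection, assuming the arrays from the question. np.choose reads much like the unvectorised `[a, b, c][i]`, and np.take_along_axis does the same on a stacked 2-D array. Unlike the diagonal approach, neither builds an intermediate N x N array. The names d_choose, d_take and stacked are only illustrative.

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([-1, -2, -3, -4])
c = np.array([0, 0, 0, 0])
i = np.array([2, 1, 2, 0])

# For each position k, take element k of the array selected by i[k]
d_choose = np.choose(i, [a, b, c])

# Same selection on a stacked (3, N) array: row i[k], column k
stacked = np.stack([a, b, c])
d_take = np.take_along_axis(stacked, i[None, :], axis=0)[0]

print(d_choose)  # [ 0 -2  0  4]
print(d_take)    # [ 0 -2  0  4]
```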
Say I have two arrays:
a = np.asarray([0,1,2])
b = np.asarray([3,7,10])
Is there a fast way to create:
c = np.asarray([0,0,0,1,1,1,1,2,2,2])
# index 3 7 10
This can be done using a for loop but I wonder if there is a fast internal numpy function that achieves the same thing.
You can use diff to get the successive differences, r_ to add the first b value and repeat to duplicate the values:
a = np.asarray([0, 1, 2])
b = np.asarray([3, 7, 10])
c = np.repeat(a, np.r_[b[0], np.diff(b)])
Output: array([0, 0, 0, 1, 1, 1, 1, 2, 2, 2])
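As a hedged sketch of the same idea, np.diff accepts a prepend argument that avoids the np.r_ step, and np.searchsorted gives an alternative that looks up, for each output position, which interval of b it falls into. The names counts, idx, c_repeat and c_search are only illustrative, and b is assumed to be sorted in increasing order.

```python
import numpy as np

a = np.asarray([0, 1, 2])
b = np.asarray([3, 7, 10])

# Run lengths: differences between successive end indices, starting from 0
counts = np.diff(b, prepend=0)
c_repeat = np.repeat(a, counts)

# Alternative: for each output position, find the first end index beyond it
idx = np.searchsorted(b, np.arange(b[-1]), side="right")
c_search = a[idx]

print(c_repeat)  # [0 0 0 1 1 1 1 2 2 2]
print(c_search)  # [0 0 0 1 1 1 1 2 2 2]
```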
Basically, I have three arrays that I multiply with values from 0 to 2, expanding the number of rows to the number of products (the values to be multiplied are the same for each array). From there, I want to calculate the product of every combination of rows from all three arrays. So I have three arrays
A = np.array([1, 2, 3])
B = np.array([1, 2, 3])
C = np.array([1, 2, 3])
and I'm trying to reduce the operation given below
search_range = np.linspace(0, 2, 11)
results = np.array([[0, 0, 0]])
for i in search_range:
    for j in search_range:
        for k in search_range:
            sm = i*A + j*B + k*C
            results = np.append(results, [sm], axis=0)
What I tried doing:
A = np.array([[1, 2, 3]])
B = np.array([[1, 2, 3]])
C = np.array([[1, 2, 3]])
n = 11
scale = np.linspace(0, 2, n).reshape(-1, 1)
A = np.repeat(A, n, axis=0) * scale
B = np.repeat(B, n, axis=0) * scale
C = np.repeat(C, n, axis=0) * scale
results = np.array([[0, 0, 0]])
for i in range(n):
    A_i = A[i]
    for j in range(n):
        B_j = B[j]
        C_k = C
        sm = A_i + B_j + C_k
        results = np.append(results, sm, axis=0)
which only removes the last for loop. How do I reduce the other for loops?
You can get the same result like this:
search_range = np.linspace(0, 2, 11)
search_range = np.array(np.meshgrid(search_range, search_range, search_range))
search_range = search_range.T.reshape(-1, 3)
sm = search_range[:, 0, None]*A + search_range[:, 1, None]*B + search_range[:, 2, None]*C
results = np.concatenate(([[0, 0, 0]], sm))
Instead of using three nested loops to get every combination of elements in the "search_range" array, I used the meshgrid function to build a 2D array with one row per combination; the three columns of that array then play the roles of i, j and k.
And finally, as suggested by @Mercury, you can index the new "search_range" array to generate the result. For example, search_range[:, 1, None] is an array of shape (1331, 1) holding the element at index 1 of every row of "search_range"; the trailing None keeps a length-1 axis so that it broadcasts against B. The concatenate is only there because you wanted the results array to start with the row [[0, 0, 0]], so I appended sm to it; otherwise, the sm array alone contains the answer.
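A sketch of the same computation using plain broadcasting, without meshgrid. The function name all_scaled_sums is hypothetical. Each scale factor gets its own axis, so the rows come out in the order of the nested loops (i slowest, k fastest) for any A, B and C. With meshgrid's default "xy" indexing the columns are ordered differently, so the version above matches the loop order here because A, B and C are identical in the example.

```python
import numpy as np

def all_scaled_sums(A, B, C, s):
    """Return i*A + j*B + k*C for every (i, j, k) drawn from s, one row each."""
    # Shape (len(s), len(s), len(s), len(A)): axis 0 is i, axis 1 is j, axis 2 is k
    grid = (s[:, None, None, None] * A
            + s[None, :, None, None] * B
            + s[None, None, :, None] * C)
    # C-order reshape keeps i as the slowest index and k as the fastest
    return grid.reshape(-1, len(A))

A = np.array([1, 2, 3])
B = np.array([1, 2, 3])
C = np.array([1, 2, 3])
search_range = np.linspace(0, 2, 11)

sm = all_scaled_sums(A, B, C, search_range)
results = np.concatenate(([[0, 0, 0]], sm))
print(results.shape)  # (1332, 3)
```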
I have two NumPy arrays of unequal length. I would like to compare them index by index and count the mismatches, where an index that exists in only one of the arrays also counts as a mismatch.
For example, with these arrays:
import numpy as np
a = np.array([0, 1, 0, 1, 1])
b = np.array([1, 1, 0, 0, 1, 1])
Expected output: 3 (three mismatches, at indices 0, 3 and 5)
As per my comment above.
Assuming we don't know which of the arrays is the longer one:
def foo(a, b):
    # Get equal length arrays
    c = a[: min([a.shape[0], b.shape[0]])]
    d = b[: min([a.shape[0], b.shape[0]])]
    # now compare equal arrays
    mis = np.equal(c, d)
    # Add difference in shapes of arrays as mismatches
    mis = np.concatenate((mis, np.full(abs(a.shape[0] - b.shape[0]), False)))
    return np.where(~mis)[0].shape[0]
>>> a = np.array([0, 1, 0, 1, 1])
>>> b = np.array([1, 1, 0, 0, 1, 1])
>>> x = foo(a, b)
>>> x
3
EDIT: Oops... forgot to add a bitwise not to the return statement.
This is what I ended up with, building on @pavel's and @hpaulj's answers.
def comp(a, b):
    # Get equal length arrays
    c = a[: min([a.shape[0], b.shape[0]])]
    d = b[: min([a.shape[0], b.shape[0]])]
    # now compare equal arrays
    mis = np.sum(c != d)
    mis = mis + abs(a.shape[0]-b.shape[0])
    return mis
a = np.array([0, 1, 0, 1, 1, 0, 0, 1])
b = np.array([1, 1, 1, 0, 1, 1])
x = comp(a,b)
x
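For reference, a compact sketch of the same counting rule, assuming 1-D inputs. The function name count_mismatches is hypothetical. It compares the overlapping prefix with np.count_nonzero and then adds the length difference, so every unpaired trailing element counts as one mismatch.

```python
import numpy as np

def count_mismatches(a, b):
    """Count differing positions; unpaired trailing elements count as mismatches."""
    n = min(len(a), len(b))                       # length of the overlapping prefix
    differing = np.count_nonzero(a[:n] != b[:n])  # mismatches inside the overlap
    return int(differing) + abs(len(a) - len(b))  # plus the unpaired tail

a = np.array([0, 1, 0, 1, 1])
b = np.array([1, 1, 0, 0, 1, 1])
print(count_mismatches(a, b))  # 3
```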
Imagine a matrix A whose first column holds comparison operators (≥, =, ≤), and a vector b with as many elements as A has rows. In my setting, a single row would then be evaluated as, e.g.,
dot(A[0, 1:], x) ≥ b[0]
where x is some vector and column A[,0] holds the operators, so for row 0 we know we are supposed to use the ≥ operator (i.e. A[0,0] == "≥" is true). Is there a way to evaluate all rows at once, in the following so-far-imaginary way?
dot(A[, 1:], x) A[, 0] b
My hope is for a dynamic evaluation in which the operator to apply is looked up for each row.
Example, let
A = [
[">=", -2, 1, 1],
[">=", 0, 1, 0],
["==", 0, 1, 1]
]
b = [0, 1, 1]
and let x be some given vector, e.g. x = [1,1,0]. We then wish to compute the following:
A[,1:] x A[,0] b
dot([-2, 1, 1], [1, 1, 0]) >= 0
dot([0, 1, 0], [1, 1, 0]) >= 1
dot([0, 1, 1], [1, 1, 0]) == 1
The output would be [False, True, True]
If I understand correctly, this is a way to do that operation:
import numpy as np
# Input data
a = [
[">=", -2, 1, 1],
[">=", 0, 1, 0],
["==", 0, 1, 1]
]
b = np.array([0, 1, 1])
x = np.array([1, 1, 0])
# Split in comparison and data
a0 = np.array([lst[0] for lst in a])
a1 = np.array([lst[1:] for lst in a])
# Compute dot product
c = a1 @ x
# Compute comparisons
leq = c <= b
eq = c == b
geq = c >= b
# Find comparison index for each row
cmps = np.array(["<=", "==", ">="]) # This array is lex sorted
cmp_idx = np.searchsorted(cmps, a0)
# Select the right result for each row
result = np.choose(cmp_idx, [leq, eq, geq])
# Convert to numeric type if preferred
result = result.astype(np.int32)
print(result)
# [0 1 1]
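An alternative sketch that does not rely on the operator strings being lexically sorted: map each symbol to a function from the standard operator module and apply it to the rows that use it via a boolean mask. The names OPS and evaluate_rows are hypothetical, and the row layout (operator first, coefficients after) is assumed to be the one from the question.

```python
import operator
import numpy as np

# Assumed mapping from operator symbol to comparison function
OPS = {">=": operator.ge, "==": operator.eq, "<=": operator.le}

def evaluate_rows(rows, x, b):
    """Evaluate dot(row[1:], x) <op> b[k] for every row, with op = row[0]."""
    ops = np.array([row[0] for row in rows])
    coeffs = np.array([row[1:] for row in rows])
    lhs = coeffs @ np.asarray(x)
    b = np.asarray(b)
    result = np.zeros(len(rows), dtype=bool)
    for symbol, compare in OPS.items():
        mask = ops == symbol                       # rows that use this operator
        result[mask] = compare(lhs[mask], b[mask])
    return result

A = [
    [">=", -2, 1, 1],
    [">=", 0, 1, 0],
    ["==", 0, 1, 1],
]
print(evaluate_rows(A, [1, 1, 0], [0, 1, 1]))  # [False  True  True]
```

The loop runs once per distinct operator rather than once per row, so the comparisons themselves stay vectorised.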
I have a fixed amount of int arrays of the form:
[a,b,c,d,e]
for example:
[2,2,1,1,2]
where a and b can be ints from 0 to 2, c and d can be 0 or 1, and e can be ints from 0 to 2.
Therefore there are 3 * 3 * 2 * 2 * 3 = 108 possible arrays of this form.
I would like to assign to each of those arrays a unique integer code from 0 to 107.
I am stuck. I thought of summing the numbers in each array, but two arrays such as:
[0,0,0,0,1] and [1,0,0,0,0]
would both add to 1.
Any suggestion?
Thank you.
You could use np.ravel_multi_index:
>>> np.ravel_multi_index([1, 2, 0, 1, 2], (3, 3, 2, 2, 3))
65
Validation:
>>> import itertools
>>> {np.ravel_multi_index(j, (3, 3, 2, 2, 3)) for j in itertools.product(*map(range, (3,3,2,2,3)))} == set(range(np.prod((3, 3, 2, 2, 3))))
True
Going back the other way:
>>> np.unravel_index(65, shape=(3, 3, 2, 2, 3))
(1, 2, 0, 1, 2)
Just another way, similar to Horner's method for polynomials:
>>> array = [1, 2, 0, 1, 2]
>>> ranges = (3, 3, 2, 2, 3)
>>> from functools import reduce
>>> reduce(lambda i, ar: i * ar[1] + ar[0], zip(array, ranges), 0)
65
Unrolled that's ((((0 * 3 + 1) * 3 + 2) * 2 + 0) * 2 + 1) * 3 + 2 = 65.
This is a little like converting digits from a varying-size number base to a standard integer. In base-10, you could have five digits, each from 0 to 9, and then you would convert them to a single integer via i = a*10000 + b*1000 + c*100 + d*10 + e*1.
Equivalently, for the decimal conversion, you could write i = np.dot([a, b, c, d, e], bases), where bases = [10*10*10*10, 10*10*10, 10*10, 10, 1].
You can do the same thing with your bases, except that your positions introduce multipliers of [3, 3, 2, 2, 3] instead of [10, 10, 10, 10, 10]. So you could set bases = [3*2*2*3, 2*2*3, 2*3, 3, 1] (=[36, 12, 6, 3, 1]) and then use i = np.dot([a, b, c, d, e], bases). Note that this will always give answers in the range of 0 to 107 if a, b, c, d, and e fall in the ranges you specified.
To convert i back into a list of digits, you could use something like this:
digits = []
remainder = i
for base in bases:
    digit, remainder = divmod(remainder, base)
    digits.append(digit)
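Putting the mixed-base idea together, here is a minimal sketch that derives the weights from the ranges instead of writing them out by hand. The names radices, weights, encode and decode are hypothetical. The result is checked against np.ravel_multi_index, which should give the same codes.

```python
import itertools
import numpy as np

radices = np.array([3, 3, 2, 2, 3])  # number of possible values per position

# weights[k] = product of the radices to the right of position k
weights = np.append(np.cumprod(radices[::-1])[::-1][1:], 1)  # [36 12 6 3 1]

def encode(digits):
    """Map a list of digits to a single integer in 0..107."""
    return int(np.dot(digits, weights))

def decode(code):
    """Invert encode() by repeated divmod, most significant digit first."""
    digits = []
    for w in weights:
        digit, code = divmod(code, w)
        digits.append(int(digit))
    return digits

print(encode([1, 2, 0, 1, 2]))  # 65
print(decode(65))               # [1, 2, 0, 1, 2]

# Every combination gets a distinct code that agrees with ravel_multi_index
all_codes = {encode(d) for d in itertools.product(*map(range, radices))}
print(all_codes == set(range(108)))  # True
```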
On the other hand, to keep your life simple, you are probably better off using Paul Panzer's answer, which pretty much does the same thing. (I never thought of an n-digit number as the coordinates of a cell in an n-dimensional grid before, but it turns out they're mathematically equivalent. And np.ravel_multi_index is an easy way to assign a serial number to each cell.)
This data is small enough that you may simply enumerate them:
>>> L = [[a,b,c,d,e] for a in range(3) for b in range(3) for c in range(2) for d in range(2) for e in range(3)]
>>> L[0]
[0, 0, 0, 0, 0]
>>> L[107]
[2, 2, 1, 1, 2]
If you need to go the other way (from the array to the integer) make a lookup dict for it so that you will get O(1) instead of O(n):
>>> lookup = {tuple(x): i for i, x in enumerate(L)}
>>> lookup[1,1,1,1,1]
58
Taking the dot product of your vectors with decreasing powers of 10, as follows:
In [210]: a1
Out[210]: array([2, 2, 1, 1, 2])
In [211]: a2
Out[211]: array([1, 0, 1, 1, 0])
In [212]: a1.dot(np.power(10, np.arange(5,0,-1)))
Out[212]: 221120
In [213]: a2.dot(np.power(10, np.arange(5,0,-1)))
Out[213]: 101100
should produce 108 unique numbers; you can then use each number's index in the sorted list of them as its code.
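One way to read "use their indices" is sketched below, under the assumption that the code of an array is the rank of its decimal key among all 108 keys. The names combos, keys, sorted_keys and code_for are hypothetical.

```python
import itertools
import numpy as np

# All 108 valid arrays, one per row
combos = np.array(list(itertools.product(range(3), range(3), range(2), range(2), range(3))))

# Decimal key of each array, using the same powers of 10 as above
powers = np.power(10, np.arange(5, 0, -1))
keys = combos @ powers
sorted_keys = np.sort(keys)

def code_for(arr):
    """Rank of the array's decimal key among all valid keys (0..107)."""
    return int(np.searchsorted(sorted_keys, np.asarray(arr) @ powers))

print(len(np.unique(keys)))       # 108
print(code_for([2, 2, 1, 1, 2]))  # 107
```

This needs the table of all keys up front, so it is mainly of interest when the set of valid arrays is small, as it is here.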
If the array length is not very large, you can calculate the weights first and then use a simple formula to get the ID.
The code looks like this:
#Test Case
test1 = [2, 2, 1, 1, 2]
test2 = [0, 2, 1, 1, 2]
test3 = [0, 0, 0, 0, 2]
def getUniqueID(target):
    # Calculate the weights first:
    # when index == 0: weight[0] = 1
    # when index > 0: weight[index] = weight[index-1] * (number of possible values at the previous index)
    weight = [1, 3, 9, 18, 36]
    return target[0]*weight[0] + target[1]*weight[1] + target[2]*weight[2] + target[3]*weight[3] + target[4]*weight[4]
print('Test Case 1:', getUniqueID(test1))
print('Test Case 2:', getUniqueID(test2))
print('Test Case 3:', getUniqueID(test3))
#Output
#Test Case 1: 107
#Test Case 2: 105
#Test Case 3: 72