Update values in numpy array without tedious for loops

Update values in numpy array without tedious for loops - python

I have a few arrays with data like this:
a = np.random.rand(3,3)
b = np.random.rand(3,3)
Using for loops I construct larger matrix
L = np.zeros((9,9))
for i in range(9):
for j in range(9):
L[i,j] = f(a,b,i,j) # values of L depends on values of a and b
Later in my program I will change a and b and I want my L array to change too. So the logic of my program looks like this (in pseudo code)
Create a
Create b
while True:
Create L using a and b
Do the stuff
Change a
Change b
In my program the size of L is large (10^6 x 10^6 and larger).
Constructing this L matrix again and again is tedious and slow process.
Instead of doing for loops again and again I would like just to update values of L matrix according to changed values of a and b. The structure of L is the same each time, the only difference is values of cells. Something like this:
a[0,0] = 2
b[0,0] = 2
L[3,5] = 2*a[0,0]*b[0,0]
L[3,5]
# >>> 8
a[0,0] = 3
b[0,0] = 1
# do some magic here
L[3,5]
>>> 6

Can something like this solve your problem ?
>>> a = 10
>>> b = 20
>>> def func():
# fetch the values of a and b
... return a+b
...
>>> lis = [func]
>>> lis[0]
<function func at 0x109f70730>
>>> lis[0]()
30
>>> a = 20
>>> lis[0]()
40
Basically every time you fetch the value by calling a function
that computes the latest value.

Related

How can I calculate scikit-learn rbf_kernel() with very large array?

While using the rbf_kernel() function the array is too large and there is a memory issue, so I have to separate the data and calculate it.
from sklearn.metrics.pairwise import rbf_kernel
result = rbf_kernel([[1,1],[2,2],[3,3]], gamma=60) # A data:[1,1] , B data:[2,2], C data:[3,3]
And result looks like
A B C
A 1 2 1
B 1 1 1
C 1 1 2
However, if I insert larger data, there is a memory issue.
result = rbf_kernel([[1,1],[2,2],[3,3],[4,4],[5,5],.... ], gamma=60)
How can I extract the result without putting data all at once?

Try using:
l = [[1,1],[2,2],[3,3],[4,4],[5,5], ...]
newl = []
for i in range(0, len(l), 10):
newl.append(rbf_kernel(l[i:i + 10]))

Numpy function operating on two ndarrays

Given two ndarrays a = np.asarray([[0,1,2],[3,4,5]]) and b = np.asarray([[6,7,8],[9,10,11]])I want to write a function that iterates over a and b, such that
[0,1,2] and [6,7,8] are considered
[3,4,5] and [9,10,11] are considered
An example would be a functionn that takes
[0,1,2] and [6,7,8] as an input and outputs 0*6+1*7+2*8 = 23
[3,4,5] and [9,10,11] as an input and outputs 3*9+4*10+5*11 = 122
-> (23,122)
Is there any way to do this efficiently in numpy?
My idea was to zip both arrays, however, this is not efficient.
Edit: I am looking for a way to apply a customizable function myfunc(x,y). In the previous example myfunc(x,y) corresponded to the multipication.

c = a * b
sum1 = c[0].sum()
sum2 = c[1].sum()
if you want the algorithmic way ( custom function )
a = np.asarray([[0,1,2],[3,4,5]])
b = np.asarray([[6,7,8],[9,10,11]])
for i in range(a.shape[0]) :
s = 0
for j in range(a.shape[1]) :
s = s + a[i][j]*b[i][j]
print(s)

import numpy as np
a = np.asarray([[0,1,2],[3,4,5]])
b = np.asarray([[6,7,8],[9,10,11]])
c = a*b
print(sum(c[0]),sum(c1))
ans->23,122

Don't need to use zip both array, you need to understand numpy package help you work well with matrix. So you need the basic knowledge about matrix, i recommend you learn from this link http://cs231n.github.io/python-numpy-tutorial/ , from cs231n of Stanford university.
This is a function can solve your problem:
import numpy as np
def interates(matrix_a, matrix_b):
product = matrix_a*matrix_b
return (np.sum(product,1))
The value product contain a new matrix with same shape of matrix_a and matrix_b, each element in there is the result of matrix_a[i][j] * matrix_b[i][j]with i and j run from 0 to matrix_a.shape[0]and matrix_a.shape[1].
Now check with your example
a = np.asarray([[0,1,2],[3,4,5]])
b = np.asarray([[6,7,8],[9,10,11]])
result = interates(a,b)
Print result
>> print(result)
>> [23 122]
If you want a tuple
>> result = tuple(result)
>> print(result)
>> (23, 122)

Numpy where conditional statement along axis 0

I have a 1D vector Zc containing n elements that are 2D arrays. I want to find the index of each 2D array that equals np.ones(Zc[i].shape).
a = np.zeros((5,5))
b = np.ones((5,5))*4
c = np.ones((5,5))
d = np.ones((5,5))*2
Zc = np.stack((a,b,c,d))
for i in range(len(Zc)):
a = np.ones(Zc[i].shape)
b = Zc[i]
if np.array_equal(a,b):
print(i)
else:
pass
Which returns 2. The code above works and returns the correct answer, but I want to know if there a vectorized way to achieve the same result?

Going off of hpaulj's comment:
>>> allones = (Zc == np.array(np.ones(Zc[i].shape))).all(axis=(1,2))
>>> np.where(allones)[0][0]
2

fastest way to iterate a numpy array and update each element

This might be weird to you people, but I happen to have this weird goal to achieve, code goes as follows.
# A is a numpy array, dtype=int32,
# and each element is actually an ID(int), the ID range might be wide,
# but the actually existing values are quite fewer than the dense range,
A = array([[379621, 552965, 192509],
[509849, 252786, 710979],
[379621, 718598, 591201],
[509849, 35700, 951719]])
# and I need to map these sparse ID to dense ones,
# my idea is to have a dict, mapping actual_sparse_ID -> dense_ID
M = {}
# so I iterate this numpy array, and check if this sparse ID has a dense one or not
for i in np.nditer(A, op_flags=['readwrite']):
if i not in M:
M[i] = len(M) # sparse ID got a dense one
i[...] = M[i] # replace sparse one with the dense ID
My goal could be achieved with np.unique(A, return_inverse=True), and the return_inverse result is what I want.
However, the numpy array I have is too huge to fully load into memory, so I cannot run np.unique over the whole data, and this is why I came up with this dict-mapping idea...
Is this the right way to go? Any possible improvement?

I will make an attempt to provide an alternative way of doing this by using numpy.unique() on sub-arrays. This solution is not fully tested. I also did not do any side-by-side performance evaluation since your solution is not fully working for me.
Let's say we have an array c that we split into two smaller arrays. Let's create some test data, for example:
>>> a = np.array([[1,1,2,3,4],[1,2,6,6,2],[8,0,1,1,4]])
>>> b = np.array([[11,2,-1,12,6],[12,2,6,11,2],[7,0,3,1,3]])
>>> c = np.vstack([a, b])
>>> print(c)
[[ 1 1 2 3 4]
[ 1 2 6 6 2]
[ 8 0 1 1 4]
[11 2 -1 12 6]
[12 2 6 11 2]
[ 7 0 3 1 3]]
Here we assume that c is the large array and a and b are sub-arrays. Of course, one could build c first and then extract sub-arrays.
Next step is to run numpy.unique() on the two sub-arrays:
>>> ua, ia = np.unique(a, return_inverse=True)
>>> ub, ib = np.unique(b, return_inverse=True)
>>> uc, ic = np.unique(c, return_inverse=True) # this is for future reference
Now, here is an algorithm for combining the results from subarrays:
def merge_unique(ua, ia, ub, ib):
# make copies *if* changing inputs is undesirable:
ua = ua.copy()
ia = ia.copy()
ub = ub.copy()
ib = ib.copy()
# find differences between unique values in the two arrays:
diffab = np.setdiff1d(ua, ub, assume_unique=True)
diffba = np.setdiff1d(ub, ua, assume_unique=True)
# find indices in ua, ub where to insert "other" unique values:
ssa = np.searchsorted(ua, diffba)
ssb = np.searchsorted(ub, diffab)
# throw away values that are too large:
ssa = ssa[np.where(ssa < len(ua))]
ssb = ssb[np.where(ssb < len(ub))]
# increment indices past previously computed "insert" positions:
for v in ssa[::-1]:
ia[ia >= v] += 1
for v in ssb[::-1]:
ib[ib >= v] += 1
# combine results:
uc = np.union1d(ua, ub) # or use ssa, ssb, diffba, diffab to update ua, ub
ic = np.concatenate([ia, ib])
return uc, ic
Now, let's run this function on the results of numpy.unique() from sub-arrays and then compare merged indices and unique values with the reference results uc and ic:
>>> uc2, ic2 = merge_unique(ua, ia, ub, ib)
>>> np.all(uc2 == uc)
True
>>> np.all(ic2 == ic)
True
Splitting into more than two sub-arrays can be handled with little additional work - simply keep accumulating "unique" values and indices, like this:
uacc, iacc = np.unique(subarr1, return_inverse=True)
ui, ii = np.unique(subarr2, return_inverse=True)
uacc, iacc = merge_unique(uacc, iacc, ui, ii)
ui, ii = np.unique(subarr3, return_inverse=True)
uacc, iacc = merge_unique(uacc, iacc, ui, ii)
ui, ii = np.unique(subarr4, return_inverse=True)
uacc, iacc = merge_unique(uacc, iacc, ui, ii)
................................ (etc.)

Cumulative integration of elements of numpy arrays

I would like to to the following type of integration:
Say I have 2 arrays
a = np.array[1,2,3,4]
b = np.array[2,4,6,8]
I know how to integrate these using something like:
c = scipy.integrate.simps(b, a)
where c = 15 for above data set.
What I would like to do is multiply the first elements of each array and add to new array called d, i.e. a[0]*b[0] then integrate the first 2 elements the arrays then the first 3 elements, etc. So eventually for this data set, I would get
d = [2 3 8 15]
I have tried a few things but no luck; I am pretty new to writing code.

If I have understood correctly what you need you could do the following:
import numpy as np
from scipy import integrate
a = np.array([2,4,6,8])
b = np.array([1,2,3,4])
d = np.empty_like(b)
d[0] = a[0] * b[0]
for i in range(2, len(a) + 1):
d[i-1] = integrate.simps(b[0:i], a[0:i])
print(d)

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

Update values in numpy array without tedious for loops - python

Related

How can I calculate scikit-learn rbf_kernel() with very large array?

Numpy function operating on two ndarrays

Numpy where conditional statement along axis 0

fastest way to iterate a numpy array and update each element

Cumulative integration of elements of numpy arrays

Categories

Resources