I'm starting to learn GEKKO. I am solving a knapsack problem to learn, but I get the error "'int' object is not subscriptable". Can you look at this code? What is the source of the problem, and how should I define the 1×10 matrices?
from gekko import GEKKO
import numpy as np
m = GEKKO(remote=False)
x = m.Var((10),lb=0,ub=1,integer=True)
#x = m.Array(m.Var,(1,10),lb=0,ub=1,integer=True)
v=np.array([2, 2, 7, 8, 2, 1, 7, 9, 4, 10])
w=np.array([2, 2, 2, 2, 2, 1, 6, 7, 3, 3])
capacity=16
for j in range(10):
    m.Maximize(v[j]*x[j])
for i in range(10):
    m.Equation(m.sum(x[i]*w[i])<=capacity)
m.options.solver = 1
m.solve()
#print('Objective Function: ' + str(m.options.objfcnval))
print(x)
My second question: MATLAB has a function called showproblem(). Does GEKKO have an equivalent?
Thanks for the help.
New question, following up on the answer below.
Can I write the constraint in this style instead (it doesn't work as-is; if it can be made to work, please show the working form — I think this style is easier to understand),
for i in range(10):
    xw = x[i]*w[i]
m.Equation(m.sum(xw)<=capacity)
instead of this.
xw = [x[i]*w[i] for i in range(10)]
m.Equation(m.sum(xw)<=capacity)
Here is a modified version that solves the mixed-integer problem in GEKKO.
from gekko import GEKKO
import numpy as np
m = GEKKO(remote=False)
x = m.Array(m.Var,10,lb=0,ub=1,integer=True)
v=np.array([2, 2, 7, 8, 2, 1, 7, 9, 4, 10])
w=np.array([2, 2, 2, 2, 2, 1, 6, 7, 3, 3])
capacity=16
for j in range(10):
    m.Maximize(v[j]*x[j])
xw = [x[i]*w[i] for i in range(10)]
m.Equation(m.sum(xw)<=capacity)
m.options.solver = 1
m.solve()
print('Objective Function: ' + str(-m.options.objfcnval))
print(x)
Your problem formulation was close. You just needed to define a list xw that you use to form the capacity constraint.
If you want to use a loop instead of a list comprehension, then I recommend the following in place of xw = [x[i]*w[i] for i in range(10)].
xw = []
for i in range(10):
    xw.append(x[i]*w[i])
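For reference, here is a minimal end-to-end sketch (same data as above; one possible style, not the only one) that also collapses the ten Maximize calls into a single summed objective:

from gekko import GEKKO
import numpy as np

m = GEKKO(remote=False)
x = m.Array(m.Var, 10, lb=0, ub=1, integer=True)
v = np.array([2, 2, 7, 8, 2, 1, 7, 9, 4, 10])
w = np.array([2, 2, 2, 2, 2, 1, 6, 7, 3, 3])
capacity = 16

# one objective term instead of one Maximize call per item
m.Maximize(m.sum([v[i]*x[i] for i in range(10)]))
# one constraint on the summed element-wise products
m.Equation(m.sum([w[i]*x[i] for i in range(10)]) <= capacity)

m.options.solver = 1  # APOPT, the mixed-integer solver
m.solve()
print('Objective Function: ' + str(-m.options.objfcnval))
print(x)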
I am trying to compute the mode of all cells in the same zone (same value) of a NumPy array. I give an example of the code below. In this example the sequential approach works fine, but the multiprocessing approach does nothing, and I cannot find my mistake.
Does someone see my error?
I would like to parallelize the computation because my real array is 10k × 10k with 1M zones.
import numpy as np
import scipy.stats as ss
import multiprocessing as mp
def zone_mode(i, a, b, output):
    to_extract = np.where(a == i)
    val = b[to_extract]
    output[to_extract] = ss.mode(val)[0][0]
    return output

def zone_mode0(i, a, b):
    to_extract = np.where(a == i)
    val = b[to_extract]
    output = ss.mode(val)[0][0]
    return output
np.random.seed(1)
zone = np.array([[1, 1, 1, 2, 3],
                 [1, 1, 2, 2, 3],
                 [4, 2, 2, 3, 3],
                 [4, 4, 5, 5, 3],
                 [4, 6, 6, 5, 5],
                 [6, 6, 6, 5, 5]])
values = np.random.randint(8, size=zone.shape)
output = np.zeros_like(zone).astype(float)
for i in np.unique(zone):
    output = zone_mode(i, zone, values, output)
# for multiprocessing
zone0 = zone - 1
pool = mp.Pool(mp.cpu_count() - 1)
results = [pool.apply(zone_mode0, args=(u, zone0, values)) for u in np.unique(zone0)]
pool.close()
output = results[zone0]
For positive integers in the arrays zone and values, we can use np.bincount. The basic idea is to treat zone and values as rows and columns on a 2D grid, so we can map each (zone, value) pair to its linear-index equivalent. Those linear indices are used as bins for binned counting with np.bincount. The argmax ID in each row is then the mode for that zone, and these are mapped back to the zone grid by indexing with zone.
Hence, the solution would be -
m = zone.max()+1
n = values.max()+1
ids = zone*n + values  # linear index for each (zone, value) pair
c = np.bincount(ids.ravel(),minlength=m*n).reshape(-1,n).argmax(1)  # per-zone mode
out = c[zone]  # map the modes back onto the zone grid
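As a quick sanity check (a minimal sketch; it assumes the zone, values, and sequential output arrays from the question are in scope), the bincount result should agree with the sequential loop, up to how ties are broken:

# `out` comes from the bincount steps above, `output` from the question's loop
print(np.array_equal(out, output.astype(int)))  # expected: True for this sample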
For sparse data (well-spread integers in the input arrays), we can use a sparse matrix to get the argmax IDs c. Hence, with SciPy's sparse matrix -
from scipy.sparse import coo_matrix
data = np.ones(zone.size,dtype=int)
r,c = zone.ravel(),values.ravel()
c = coo_matrix((data,(r,c))).argmax(1).A1
For a slight performance boost, specify the shape -
c = coo_matrix((data,(r,c)),shape=(m,n)).argmax(1).A1
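Putting the sparse pieces together (a sketch reusing m, n, zone, and values from above; I rename the column index to cols so it does not shadow the result c as in the shorthand above):

from scipy.sparse import coo_matrix

data = np.ones(zone.size, dtype=int)
rows, cols = zone.ravel(), values.ravel()
c = coo_matrix((data, (rows, cols)), shape=(m, n)).argmax(1).A1  # duplicate entries are summed, i.e. counted
out = c[zone]  # map back to the zone grid, as in the bincount version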
Solving for generic values
We will make use of pandas.factorize, like so -
import pandas as pd
ids,unq = pd.factorize(values.flat)
v = ids.reshape(values.shape)
# .. same steps as earlier with bincount, using v in place of values
out = unq[c[zone]]
Note that for tie cases, it would pick an arbitrary element of values. If you want to pick the first one, use pd.factorize(values.flat, sort=True).
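Spelled out, a minimal sketch of the generic-values path (it just repeats the bincount steps above on the factorized codes; zone and values are assumed from the question):

import numpy as np
import pandas as pd

ids, unq = pd.factorize(values.flat)   # map arbitrary values to integer codes 0..k-1
v = ids.reshape(values.shape)

m = zone.max() + 1
n = v.max() + 1
c = np.bincount((zone*n + v).ravel(), minlength=m*n).reshape(-1, n).argmax(1)
out = unq[c[zone]]                     # map the winning codes back to the original values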
I have a question: why is the map_blocks function run twice? When I run the example below:
import dask.array as da
import numpy as np
def derivative(x):
    print(x.shape)
    return x - np.roll(x, 1)
x = np.array([1, 1, 2, 3, 3, 3, 2, 1, 1])
d = da.from_array(x, chunks = 5)
y = d.map_blocks(derivative)
res = y.compute()
I obtain this output:
(1L,)
(5L,)
(4L,)
Since my chunks are ((5, 4),), I assume that the derivative function has to be somehow run once before it is really executed on these chunks, am I right?
I have Python v2.7 and dask v0.13.0.
If you do not supply a dtype to the map_blocks call then it will try running your function on a tiny sample dataset (hence the singleton shape). You can avoid this by passing a dtype explicitly if you know it.
y = d.map_blocks(derivative, dtype=d.dtype)
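For reference, a minimal sketch of the full example with the dtype supplied (same arrays as in the question):

import dask.array as da
import numpy as np

def derivative(x):
    print(x.shape)
    return x - np.roll(x, 1)

x = np.array([1, 1, 2, 3, 3, 3, 2, 1, 1])
d = da.from_array(x, chunks=5)
y = d.map_blocks(derivative, dtype=d.dtype)
res = y.compute()
# prints only (5,) and (4,) -- one call per real chunk, no sampling call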
I am looking for a vectorized method to apply a function returning a 2-dimensional array to each row of a 2-dimensional array and produce a 3-dimensional array.
More specifically, I have a function that takes a vector of length p and returns a 2-dimensional array (m by n). The following is a stylized version of my function:
import numpy as np
def test_func(x, m, n):
    # this function is just an example and does not do anything useful.
    # but, the dimensions of input and output are what I want to convey.
    np.random.seed(x.sum())
    return np.random.randint(5, size=(m, n))
I have a t by p 2-dimensional input data:
t = 5
p = 6
input_data = np.arange(t*p).reshape(t, p)
input_data
Out[403]:
array([[ 0, 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10, 11],
[12, 13, 14, 15, 16, 17],
[18, 19, 20, 21, 22, 23],
[24, 25, 26, 27, 28, 29]])
I want to apply test_func to each row of the input_data. Since test_func returns a matrix, I expect to create a 3-dimensional (t by m by n) array. I can produce my desired result with the following code:
output_data = np.array([test_func(x, m=3, n=2) for x in input_data])
output_data
Out[405]:
array([[[0, 4],
[0, 4],
[3, 3],
[1, 0]],
[[1, 0],
[1, 0],
[4, 1],
[2, 4]],
[[3, 3],
[3, 0],
[1, 4],
[0, 2]],
[[2, 4],
[2, 1],
[3, 2],
[3, 1]],
[[3, 4],
[4, 3],
[0, 3],
[3, 0]]])
However, this code does not seem optimal. It has an explicit for loop, which reduces speed, and it uses an intermediate list that unnecessarily allocates extra memory. So I would like to find a vectorized solution. My best guess was the following code, but it does not work.
output = np.apply_along_axis(test_func, m=3, n=2, axis=1, arr=input_data)
Traceback (most recent call last):
File "<ipython-input-406-5bef44da348f>", line 1, in <module>
output = np.apply_along_axis(test_func, m=3, n=2, axis=1, arr=input_data)
File "C:\Anaconda\lib\site-packages\numpy\lib\shape_base.py", line 117, in apply_along_axis
outarr[tuple(i.tolist())] = res
ValueError: could not broadcast input array from shape (3,2) into shape (3)
Could you please suggest an efficient approach to this problem?
UPDATE
Below is the actual function that I want to apply. It performs Multidimensional Classical Scaling. The objective of the question was not to optimize the internal workings of the function, but to find a generalized method for vectorizing the function application. But, in the spirit of full disclosure, I put the actual function here. Note that this function only works if p == m*(m-1)/2.
from numpy.linalg import eig

def mds_classical_scaling(v, m, n):
    # create a symmetric distance matrix from the elements in vector v
    D = np.zeros((m, m))
    D[np.triu_indices(m, k=1)] = v
    D = (D + D.T)
    # transform the symmetric matrix
    A = -0.5 * (D**2)
    # create the centering matrix
    H = np.eye(m) - np.ones((m, m))/m
    # doubly center A and store in B (matrix products, not element-wise)
    B = H @ A @ H
    # B should be positive definite, otherwise the function
    # would not work
    mu, V = eig(B)
    # indices of the largest eigenvalues
    ndx = (-mu).argsort()
    # calculate the point configuration from the largest eigenvalues
    # and the corresponding eigenvectors
    Mu1 = np.diag(mu[ndx][:n])
    V1 = V[:, ndx[:n]]
    X = V1 @ np.sqrt(Mu1)
    return X
Any performance boost I get from vectorization is negligible compared to the actual function. The main reason was learning :)
ali_m's comment is spot-on: for serious speed gains, you should be more specific about what the function does.
That being said, if you still want to use np.apply_along_axis to get a (possibly) small speed boost, then consider (after rereading that function's docstring) that you can easily:
1. wrap your function to produce 1D arrays,
2. use np.apply_along_axis with that wrapper, and
3. reshape the resulting array:
def test_func_wrapper(*args, **kwargs):
    return test_func(*args, **kwargs).ravel()

output = np.apply_along_axis(test_func_wrapper, m=3, n=2, axis=1, arr=input_data)
np.allclose(output.reshape(5, 3, -1), output_data)
# output: True
Note that this is a generic way to speed up such loops. You'll probably get better performance if you use functionality more specific to the actual problem.
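If the intermediate list from the comprehension is the main memory concern, here is a minimal sketch (assuming test_func and input_data as defined above; output3d is just an illustrative name) that preallocates the result and fills it in place. Note the Python-level loop over rows remains, just as it does inside np.apply_along_axis:

t, m, n = input_data.shape[0], 3, 2
output3d = np.empty((t, m, n), dtype=int)  # preallocate, no intermediate list
for i, row in enumerate(input_data):
    output3d[i] = test_func(row, m=m, n=n)

np.allclose(output3d, output_data)
# output: True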
I'm pretty new to Python, and I'm doing a project in it. Part of it involves a diffusion across a map. I'm implementing it by making the current tile equal to 0.2 * the sum of its neighbors (north, west, south, east). If I were doing this in C, I'd just use a double for loop over an array, doing arr[i*width + j] = 0.2 * (the neighbors at j+1, j-1, i+1, i-1), and I'd do the same thing for several different arrays (different qualities of the map I'd be changing). However, I'm not sure this is really the fastest way in Python. Some people I have asked suggested NumPy, but the width probably won't be more than ~200 (so 40-50k elements max) and I wasn't sure if the overhead is worth it. I don't really know any built-in functions that do what I want. Any advice?
Edit: this will be very dense, i.e. every spot is going to have a non-trivial calculation.
This is quite simple to arrange with NumPy. The function np.roll returns a copy of the array, "rolled" in a specified direction.
For example, given the array x,
x=np.arange(9).reshape(3,3)
# array([[0, 1, 2],
# [3, 4, 5],
# [6, 7, 8]])
you can roll the columns to the right with
np.roll(x,shift=1,axis=1)
# array([[2, 0, 1],
# [5, 3, 4],
# [8, 6, 7]])
Using np.roll, boundaries are wrapped like on a torus. If you do not want wrapped boundaries, you could pad the array with an edge of zeros, and reset the edge to zero before every iteration.
import numpy as np
def diffusion(arr):
    while True:
        arr += 0.2*np.roll(arr, shift=1, axis=1)   # shift columns right
        arr += 0.2*np.roll(arr, shift=-1, axis=1)  # shift columns left
        arr += 0.2*np.roll(arr, shift=1, axis=0)   # shift rows down
        arr += 0.2*np.roll(arr, shift=-1, axis=0)  # shift rows up
        yield arr

N = 5
initial = np.random.random((N, N))
for state in diffusion(initial):
    print(state)
    input()  # press Enter to advance to the next state
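If you do not want wrapped boundaries, here is a minimal sketch of the zero-padding idea mentioned above (the one-cell pad width and the reset-before-each-step policy are my assumptions about how the edge should behave):

import numpy as np

def diffusion_padded(arr):
    padded = np.pad(arr, 1, mode='constant')     # one-cell border of zeros
    while True:
        # reset the border so nothing wraps around the edges
        padded[0, :] = padded[-1, :] = 0.0
        padded[:, 0] = padded[:, -1] = 0.0
        prev = padded.copy()                     # snapshot so all four rolls see the same state
        padded += 0.2*np.roll(prev, shift=1, axis=1)   # contribution from the left neighbor
        padded += 0.2*np.roll(prev, shift=-1, axis=1)  # right neighbor
        padded += 0.2*np.roll(prev, shift=1, axis=0)   # upper neighbor
        padded += 0.2*np.roll(prev, shift=-1, axis=0)  # lower neighbor
        yield padded[1:-1, 1:-1]                 # interior view, without the border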
Use convolution.
import numpy as np
from scipy.signal import convolve2d

mapArr = np.array(map)  # `map` is your 2-D grid of values (note: it shadows the Python builtin)
kernel = np.array([[0. , 0.2, 0. ],
                   [0.2, 0. , 0.2],
                   [0. , 0.2, 0. ]])
diffused = convolve2d(mapArr, kernel, mode='same', boundary='wrap')
Is this for the ants challenge? If so, in the ants context, convolve2d worked ~20 times faster than the loop, in my implementation.
This modification to unutbu's code keeps the global sum of the array constant while diffusing its values:
import numpy as np
def diffuse(arr, d):
    # each cell keeps (1 - d) of its value and spreads the rest
    # equally among its 8 neighbors
    contrib = (arr * d)
    w = contrib / 8.0
    r = arr - contrib
    N = np.roll(w, shift=-1, axis=0)
    S = np.roll(w, shift=1, axis=0)
    E = np.roll(w, shift=1, axis=1)
    W = np.roll(w, shift=-1, axis=1)
    NW = np.roll(N, shift=-1, axis=1)
    NE = np.roll(N, shift=1, axis=1)
    SW = np.roll(S, shift=-1, axis=1)
    SE = np.roll(S, shift=1, axis=1)
    diffused = r + N + S + E + W + NW + NE + SW + SE
    return diffused
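A quick check, continuing from the code above with arbitrary sample values, that the global sum is preserved:

a = np.random.random((5, 5))
print(a.sum(), diffuse(a, 0.25).sum())  # the two sums agree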