I am trying to run this neural network script (for a regression model).
The script defines two classes: Standardizer and NeuralNet. The Standardizer class normalizes all the values, and the NeuralNet class builds the neural network, which learns from the data through feed-forward and backpropagation.
The NeuralNet constructor takes the numbers of input, hidden, and output units (passed together as the list nunits).
The set_hunit function is used to either update or initialize the weights. It takes the weights as its parameter.
The pack function packs the weight matrices of all layers into one vector. The unpack function does the reverse.
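To illustrate the pack/unpack idea, here is a minimal standalone sketch. It is not the class's own implementation: in the script, unpack copies the flat vector into self._weights, and the per-layer matrices are views of that buffer. The helper names and the explicit shapes argument below are assumptions made for the example.

```python
import numpy as np

def pack(matrices):
    """Flatten each weight matrix and join them into one 1-D vector."""
    return np.hstack([np.ravel(m) for m in matrices])

def unpack(vector, shapes):
    """Split a flat vector back into matrices with the given shapes."""
    matrices, start = [], 0
    for rows, cols in shapes:
        end = start + rows * cols
        matrices.append(vector[start:end].reshape(rows, cols))
        start = end
    return matrices

# Two layers: (3 inputs + bias) x 4 hidden units, (4 hidden + bias) x 1 output.
shapes = [(4, 4), (5, 1)]
weights = [np.arange(16.0).reshape(4, 4), np.arange(5.0).reshape(5, 1)]
flat = pack(weights)
restored = unpack(flat, shapes)
print(flat.shape)  # (21,)
```

An optimizer that expects a single parameter vector can then work on flat, while the network keeps using the per-layer matrices.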
The forward pass propagates through the network as follows:
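$$
\begin{aligned}
Z &= h(\tilde{X} V) \\
Y &= \tilde{Z} W
\end{aligned}
$$
Here $\tilde{X}$ and $\tilde{Z}$ denote $X$ and $Z$ with a column of ones prepended (the bias input), $V$ and $W$ are the hidden-layer and output-layer weights, and $h$ is the activation function.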
An activation function is used to make the network nonlinear. We may use tanh, an RBF, etc.
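The forward pass described above can be sketched in NumPy for a network with one hidden layer. This is an illustration under assumptions rather than the class's own method: the function names, the matrix names V and W, and the layer sizes are chosen for the example, with tanh as the hidden activation and a linear output layer.

```python
import numpy as np

def add_ones(A):
    """Prepend a column of ones so the first weight row acts as the bias."""
    return np.hstack((np.ones((A.shape[0], 1)), A))

def forward(X, V, W):
    """Return the outputs Y and the hidden activations Z."""
    Z = np.tanh(add_ones(X) @ V)   # hidden layer: nonlinear
    Y = add_ones(Z) @ W            # output layer: linear (regression)
    return Y, Z

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))                    # 5 samples, 3 features
V = rng.uniform(-0.1, 0.1, size=(3 + 1, 4))    # 4 hidden units
W = rng.uniform(-0.1, 0.1, size=(4 + 1, 2))    # 2 outputs
Y, Z = forward(X, V, W)
print(Y.shape, Z.shape)  # (5, 2) (5, 4)
```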
In the backward pass, the function takes the Z values, the target values, and the error as input. Based on the delta value, the weights and biases are updated accordingly. The method returns the weight updates of all layers packed together into one vector. Below are the update rules applied during the backward pass.
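$$
\begin{aligned}
V &\leftarrow V + \alpha_h \frac{1}{N} \frac{1}{K} \tilde{X}^\top \left( (T - Y) W^\top \odot (1 - Z^2) \right) \\
W &\leftarrow W + \alpha_o \frac{1}{N} \frac{1}{K} \tilde{Z}^\top (T - Y)
\end{aligned}
$$
Here $\tilde{X}$ and $\tilde{Z}$ denote $X$ and $Z$ with a column of ones prepended, $N$ is the number of samples, $K$ is the number of outputs, $\odot$ is the element-wise product, and in the first rule $W$ is used without its bias row.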
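As a sanity check on these update rules, the sketch below computes the gradients for a single hidden layer and compares one entry with a finite-difference estimate. It assumes a tanh hidden layer, a linear output, and the objective 0.5 * mean((T - Y)^2); the function names and sizes are illustrative, not taken from the script.

```python
import numpy as np

def add_ones(A):
    """Prepend a column of ones so the first weight row acts as the bias."""
    return np.hstack((np.ones((A.shape[0], 1)), A))

def forward(X, V, W):
    Z = np.tanh(add_ones(X) @ V)
    Y = add_ones(Z) @ W
    return Y, Z

def loss(X, T, V, W):
    Y, _ = forward(X, V, W)
    return 0.5 * np.mean((T - Y) ** 2)

def gradients(X, T, V, W):
    """Gradients of loss() with respect to V and W."""
    Y, Z = forward(X, V, W)
    delta = (T - Y) / T.size                  # T.size equals N * K
    grad_W = -add_ones(Z).T @ delta
    # Back-propagate through W without its bias row, then through tanh.
    delta_hidden = (delta @ W[1:, :].T) * (1 - Z ** 2)
    grad_V = -add_ones(X).T @ delta_hidden
    return grad_V, grad_W

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))
T = rng.normal(size=(6, 2))
V = rng.uniform(-0.1, 0.1, size=(4, 5))
W = rng.uniform(-0.1, 0.1, size=(6, 2))
grad_V, grad_W = gradients(X, T, V, W)

# Central finite-difference check on one entry of V.
eps = 1e-6
V_plus, V_minus = V.copy(), V.copy()
V_plus[1, 2] += eps
V_minus[1, 2] -= eps
numeric = (loss(X, T, V_plus, W) - loss(X, T, V_minus, W)) / (2 * eps)
print(abs(numeric - grad_V[1, 2]))  # should be very small
```

A gradient-descent step is then V - alpha * grad_V, which has the same sign convention as the update rules above because the gradient carries the minus sign.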
The train function takes the features and the targets as input. The gradientf function unpacks the weights and runs the forward pass by calling forward. The error is then calculated from the results of the forward pass, and backpropagation proceeds by calling backward with the error, Z, T (the targets), and _lambda as parameters.
The optimtargetf function evaluates the objective function (_objectf) that the optimizer tries to reduce while it updates the weights.
The use method is applied to the test data after the model has been trained. The test data is passed as a parameter and is standardized; forward is then applied to the data and returns the predictions.
Running the script raises a module-not-found error, even though I have installed the grad module with pip.
#Importing required libraries
import pandas as pd
import numpy as np
import seaborn as sns
import grad
import matplotlib.pyplot as plt
# Reading data using pandas library
vehicle_data=pd.read_csv('processed_Data.csv')
# Overall idea about distribution of data
vehicle_data.hist(bins=40, figsize=(20,15))
plt.show()
# Count plot of Electric Range
sns.countplot(x='Electric Range',data=vehicle_data)
# Joint plot of Base MSRP (x axis) against Legislative District (y axis)
sns.jointplot(x=vehicle_data.BaseMSRP.values,y=vehicle_data.LegislativeDistrict.values,height=10)
plt.xlabel("Base MSRP",fontsize=10)
plt.ylabel("Legislative District",fontsize=10)
# Drop the rows that have null or missing values
vehicle_data=vehicle_data.dropna()
# Data is already clean and has no missing values
vehicle_data.shape
#Dropping unwanted columns
vehicle_data=vehicle_data.drop(['VIN (1-10)','County', 'City', 'State', 'ZIP Code', 'DOL Vehicle ID'],axis=1)
vehicle_data.shape
# Separating target variable
t=pd.DataFrame(vehicle_data.iloc[:,8])
vehicle_data=vehicle_data.drop(['Electric Range'],axis=1)
t
vehicle_data.head()
#NeuralNet class for regression
# standardization class
class Standardizer:
    """ class version of standardization """
    def __init__(self, X, explore=False):
        # Statistics are taken per column (axis 0); axis 8 does not exist for 2-D data
        self._mu = np.mean(X, 0)
        self._sigma = np.std(X, 0)
        if explore:
            print("mean: ", self._mu)
            print("sigma: ", self._sigma)
            print("min: ", np.min(X, 0))
            print("max: ", np.max(X, 0))

    def set_sigma(self, s):
        self._sigma[:] = s

    def standardize(self, X):
        return (X - self._mu) / self._sigma

    def unstandardize(self, X):
        return (X * self._sigma) + self._mu


def add_ones(w):
    # Prepend a column of ones (bias input); shape[0] is the number of rows
    return np.hstack((np.ones((w.shape[0], 1)), w))
from grad import scg, steepest
from copy import copy
class NeuralNet:
def __init__(self, nunits):
self._nLayers=len(nunits)-1
self.rho = [1] * self._nLayers
self._W = []
wdims = []
lenweights = 0
for i in range(self._nLayers):
nwr = nunits[i] + 1
nwc = nunits[i+1]
wdims.append((nwr, nwc))
lenweights = lenweights + nwr * nwc
self._weights = np.random.uniform(-0.1,0.1, lenweights)
start = 0 # fixed index error 20110107
for i in range(self._nLayers):
end = start + wdims[i][0] * wdims[i][1]
self._W.append(self._weights[start:end])
self._W[i].resize(wdims[i])
start = end
self.stdX = None
self.stdT = None
self.stdTarget = True
def add_ones(self, w):
return np.hstack((np.ones((w.shape[0], 1)), w))
def get_nlayers(self):
return self._nLayers
def set_hunit(self, w):
for i in range(self._nLayers-1):
if w[i].shape != self._W[i].shape:
print("set_hunit: shapes do not match!")
break
else:
self._W[i][:] = w[i][:]
def pack(self, w):
return np.hstack([np.ravel(wi) for wi in w])  # hstack needs a sequence, not a map object
def unpack(self, weights):
self._weights[:] = weights[:] # unpack
def cp_weight(self):
return copy(self._weights)
def RBF(self, X, m=None,s=None):
if m is None: m = np.mean(X)
if s is None: s = 2 #np.std(X)
r = 1. / (np.sqrt(2*np.pi)* s)
return r * np.exp(-(X - m) ** 2 / (2 * s ** 2))
def forward(self,X):
t = X
Z = []
for i in range(self._nLayers):
Z.append(t)
if i == self._nLayers - 1:
t = np.dot(self.add_ones(t), self._W[i])
else:
t = np.tanh(np.dot(self.add_ones(t), self._W[i]))
#t = self.RBF(np.dot(np.hstack((np.ones((t.shape[0],1)),t)),self._W[i]))
return (t, Z)
def backward(self, error, Z, T, lmb=0):
delta = error
N = T.size
dws = []
for i in range(self._nLayers - 1, -1, -1):
rh = float(self.rho[i]) / N
if i==0:
lmbterm = 0
else:
lmbterm = lmb * np.vstack((np.zeros((1, self._W[i].shape[1])),
self._W[i][1:,]))
dws.insert(0,(-rh * np.dot(self.add_ones(Z[i]).T, delta) + lmbterm))
if i != 0:
delta = np.dot(delta, self._W[i][1:, :].T) * (1 - Z[i]**2)
return self.pack(dws)
def _errorf(self, T, Y):
return T - Y
def _objectf(self, T, Y, wpenalty):
return 0.5 * np.mean(np.square(T - Y)) + wpenalty
def train(self, X, T, **params):
verbose = params.pop('verbose', False)
# training parameters
_lambda = params.pop('Lambda', 0.)
#parameters for scg
niter = params.pop('niter', 1000)
wprecision = params.pop('wprecision', 1e-10)
fprecision = params.pop('fprecision', 1e-10)
wtracep = params.pop('wtracep', False)
ftracep = params.pop('ftracep', False)
# optimization
optim = params.pop('optim', 'scg')
if self.stdX is None:
explore = params.pop('explore', False)
self.stdX = Standardizer(X, explore)
Xs = self.stdX.standardize(X)
if self.stdT is None and self.stdTarget:
self.stdT = Standardizer(T)
T = self.stdT.standardize(T)
def gradientf(weights):
self.unpack(weights)
Y,Z = self.forward(Xs)
error = self._errorf(T, Y)
return self.backward(error, Z, T, _lambda)
def optimtargetf(weights):
""" optimization target function : MSE
"""
self.unpack(weights)
#self._weights[:] = weights[:] # unpack
Y,_ = self.forward(Xs)
Wnb=np.array([])
for i in range(self._nLayers):
if len(Wnb)==0: Wnb=self._W[i][1:,].reshape(self._W[i].size-self._W[i][0,].size,1)
else: Wnb = np.vstack((Wnb,self._W[i][1:,].reshape(self._W[i].size-self._W[i][0,].size,1)))
wpenalty = _lambda * np.dot(Wnb.flat ,Wnb.flat)
return self._objectf(T, Y, wpenalty)
if optim == 'scg':
result = scg(self.cp_weight(), gradientf, optimtargetf,
wPrecision=wprecision, fPrecision=fprecision,
nIterations=niter,
wtracep=wtracep, ftracep=ftracep,
verbose=False)
self.unpack(result['w'][:])
self.f = result['f']
elif optim == 'steepest':
result = steepest(self.cp_weight(), gradientf, optimtargetf,
nIterations=niter,
xPrecision=wprecision, fPrecision=fprecision,
xtracep=wtracep, ftracep=ftracep )
self.unpack(result['w'][:])
if ftracep:
self.ftrace = result['ftrace']
if 'reason' in result.keys() and verbose:
print(result['reason'])
return result
def use(self, X, retZ=False):
if self.stdX:
Xs = self.stdX.standardize(X)
else:
Xs = X
Y, Z = self.forward(Xs)
if self.stdT is not None:
Y = self.stdT.unstandardize(Y)
if retZ:
return Y, Z
return Y
Try opening a command prompt and typing pip install grad. If you are using a Jupyter notebook, create a new code cell and run !pip install grad before importing it.
Hope that solves your problem
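If the error persists after installing, it may be worth checking which interpreter the script or notebook actually runs under, since pip can install into a different environment. The sketch below uses only the standard library; the helper name is_importable is made up for this example. It is also possible (an assumption, the question does not say) that grad is meant to be a local grad.py file shipped alongside the script, in which case that file has to sit in the working directory or elsewhere on sys.path.

```python
import importlib.util
import sys

def is_importable(module_name):
    """Return True if the running interpreter can locate module_name."""
    return importlib.util.find_spec(module_name) is not None

# The interpreter that is executing this code; pip must install into this one.
print("Interpreter:", sys.executable)
print("grad importable:", is_importable("grad"))
```

If this prints False, running python -m pip install grad with that same interpreter (or %pip install grad in a notebook cell) installs into the environment the code is using.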
Related
I want to formulate the objective function (a minimization problem): sum[sum[Ri*{Pi² + (Qi - Qcj*Xij)²} for j in range(Nc)] for i in range(N)], where P and Q are constants, Qc is a list of proposed solutions, and X is our decision variable (binary). R=[0.2,0.4,0.5], P=[2,4,5], Q=[1,3,4], Qc=[0,1,3,4,5], N=3=len(P), Nc=5.
I'm trying to get the matrix X that minimizes the objective function.
You can find my attempt here:
import numpy as np
from pymoo.core.problem import ElementwiseProblem


class Problem(ElementwiseProblem):

    def __init__(self, L, n_max, Q, P, T, R):
        super().__init__(n_var=len(L), n_obj=1, n_ieq_constr=1)
        self.L = L
        self.n_max = n_max
        self.Q = Q
        self.P = P
        self.T = T
        self.R = R

    def _evaluate(self, x, out, *args, **kwargs):
        # Objective value and inequality constraint for one candidate x
        out["F"] = ((np.sum(self.P))**2 + (np.sum(self.Q - self.L[x]))**2) * np.sum(self.R)
        out["G"] = np.sum(self.Q - self.L[x])


# create the actual problem to be solved
np.random.seed(1)
P = [2, 3, 4, 5, 6]
Q = [6, 11, 13, 14, 15]
R = [0.2, 0.3, 0.4, 0.5, 0.6]
L = np.array([12, 13, 14, 15, 16, 17, 18, 19, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])
n_max = 5
# NOTE: T is not defined anywhere in the posted snippet
problem = Problem(L, n_max, Q, P, T, R)
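For comparison, the objective exactly as written in the question can be evaluated directly for a candidate binary matrix X. This is a plain-Python sketch, not a pymoo formulation; the function names are hypothetical. It also assumes there are no constraints coupling the entries of X (none are stated in the formula), in which case each term can be minimized on its own: setting X[i][j] = 1 only helps when (Q[i] - Qc[j])² is smaller than Q[i]².

```python
def objective(X, R, P, Q, Qc):
    """sum_i sum_j R[i] * (P[i]**2 + (Q[i] - Qc[j] * X[i][j])**2)"""
    return sum(
        R[i] * (P[i] ** 2 + (Q[i] - Qc[j] * X[i][j]) ** 2)
        for i in range(len(P))
        for j in range(len(Qc))
    )

def separable_minimizer(Q, Qc):
    """Choose each X[i][j] independently (valid only without coupling constraints)."""
    return [
        [1 if (Q[i] - Qc[j]) ** 2 < Q[i] ** 2 else 0 for j in range(len(Qc))]
        for i in range(len(Q))
    ]

# Values from the question.
R = [0.2, 0.4, 0.5]
P = [2, 4, 5]
Q = [1, 3, 4]
Qc = [0, 1, 3, 4, 5]

X_zero = [[0] * len(Qc) for _ in range(len(P))]
X_best = separable_minimizer(Q, Qc)
print(objective(X_zero, R, P, Q, Qc))
print(X_best)
print(objective(X_best, R, P, Q, Qc))
```

If the real problem does have constraints (for example a limit such as n_max on the number of selected entries), the independent choice no longer applies and a solver is needed.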
I am completing code for a Bayesian classifier that uses the Gaussian distribution.
In one part of the code, as you can see, I need to define a dictionary called meanDict (in calculateMeanDict) that holds the means of X1 and X2, and then use that dictionary to calculate the variance in the calculateVar function.
However, when I get to the var = np.mean((X-meanDict[y])^2) line of the code, I receive the error TypeError: unhashable type: 'numpy.ndarray'.
Can anybody please help me on this?
# !!! Must Not Change the Imports !!!
from pathlib import Path
from typing import Dict
import numpy as np
import pandas as pd
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, MinMaxScaler
from TicToc import timing, Timer
def loadDataset(filepath='Data/PokemonUnivariate.csv', labelCol='type1') -> (np.ndarray, np.ndarray, LabelEncoder):
"""
!!! Must Not Change the Content !!!
This function is used to load dataset
Data will be scaled from 0 to 1
:param filepath: csv file path
:param labelCol: column name
:return: X: ndarray, y: ndarray, labelEncoder
"""
data = pd.read_csv(filepath)
y = data[labelCol].values
labelEncoder = LabelEncoder()
y = labelEncoder.fit_transform(y)
X = data.drop(columns=[labelCol]).values
X = np.nan_to_num(X)
scaler = MinMaxScaler()
X = scaler.fit_transform(X)
return X, y, labelEncoder
class LinearClassifier(object):
def __init__(self):
"""!!! Must Not Change the Content !!!"""
self.X = None
self.y = None
self.w = 0
self.b = 0
def name(self) -> str:
"""!!! Must Not Change the Content !!!"""
return self.__class__.__name__
def sigmoid(self, X, w=None, b=None):
"""
!!! Must Not Change the Content !!!
Sigmoid function
:param X: X
:param w: weight
:param b: bias
:return: sigmoid(wx + b)
"""
w, b = self.w if w is None else w, self.b if b is None else b
return 1 / (1 + np.exp(-(w * X + b)))
def fit(self, X, y):
"""!!! Must Not Change the Content !!!"""
raise NotImplementedError
def predictSample(self, x) -> int:
"""
ToDo: Implement This Function
Predict a label of given x
If sigmoid(x) is greater than 0.5, then the label is 1; otherwise the label is 0
:param x: a sample
:return: a label
"""
if sigmoid(X, w, b) > 0.5:
label = 1
else:
label = 0
return label
def predict(self, X):
"""
!!! Must Not Change the Content !!!
Predict the labels of given X
If sigmoid(x) is greater than 0.5, then the label is 1; otherwise the label is 0
"""
return np.apply_along_axis(self.predictSample, axis=1, arr=X)
class UnivariateGaussianDiscriminantAnalysisClassifier(LinearClassifier):
"""
Univariate Gaussian Classifier
Univariate & Binary Class
"""
@staticmethod
def lnPrior(X1, X2):
"""
!!! Must Not Change the Content !!!
Calculate the ln P(C1) and ln P(C2)
:param X1: X with C1 label
:param X2: X with C2 label
:return (ln P(C1), ln P(C2))
"""
(nX1, _), (nX2, _) = X1.shape, X2.shape
return np.log(nX1 / (nX1 + nX2)), np.log(nX2 / (nX1 + nX2))
@staticmethod
def calculateMeanDict(X1, X2) -> Dict[int, float]:
"""
ToDo: Implement This Function
This function should return a diction,
which 0: mean X1, and 1: mean X2
:param X1: ndarray
:param X2: ndarray
:return: mean dict
"""
meanDict = dict(mean_X1= np.mean(X1), mean_X2= np.mean(X2))
return meanDict
@staticmethod
def calculateVar(meanDict, X, y) -> float:
"""
ToDo: Implement This Function
This function calculates the variance of X
var = mean (xi-meanDict[yi])^2
:param meanDict: 0: mean X1, and 1: mean X2
:param X: X
:param y: y
:return: var
"""
var = meanDict['mean_X1']
#var = np.mean((X-meanDict[y])^2)
return var
@staticmethod
def calculateW(meanDict, var) -> float:
"""
ToDo: Implement This Function
Calculate w. w=(mean2-mean1)/var
:param meanDict: 0: mean X1, and 1: mean X2
:param var: variance
:return: w
"""
w = (meanDict['mean_X1'] - meanDict['mean_X2'])/var^2
return w
@staticmethod
def calculateB(meanDict, var, lnPrior1, lnPrior2) -> float:
"""
ToDo: Implement This Function
calculate b. b=(mean1^2-mean2^2)/(2*var) + ln(P(C2)) - ln(P(C1))
:param meanDict: 0: mean X1, and 1: mean X2
:param var: variance
:param lnPrior1: ln(P(C1))
:param lnPrior2: ln(P(C2))
:return: b
"""
b = ((meanDict['mean X1'])^2 -(meanDict['mean X2'])^2/(2*var^2)) + lnPrior2 - lnPrior1
def fit(self, X, y):
"""
!!! Must Not Change the Content !!!
Train the Univariate Gaussian Classifier
:param X: shape (N Samples, 1)
:param y: shape (1, N Samples)
"""
self.X, self.y = X, y
nX, _ = X.shape
X1 = np.array([x1 for x1, y1 in zip(X, y) if y1 == 0])
X2 = np.array([x2 for x2, y2 in zip(X, y) if y2 == 1])
lnPrior1, lnPrior2 = self.lnPrior(X1, X2)
meanDict = self.calculateMeanDict(X1, X2)
var = self.calculateVar(meanDict, X, y)
self.w = self.calculateW(meanDict, var)
self.b = self.calculateB(meanDict, var, lnPrior1, lnPrior2)
return self
@timing
def main():
"""
!!! Must Not Change the Content !!!
"""
randomState = 0
resultFolder = Path('Data/')
with Timer('Data Loaded'):
X, y, _ = loadDataset()
XTrain, XTest, yTrain, yTest = \
train_test_split(X, y, test_size=0.2, random_state=randomState)
print(f'Training Set Length: {XTrain.shape[0]}\n'
f'Testing Set Length: {XTest.shape[0]}')
classifiers = [UnivariateGaussianDiscriminantAnalysisClassifier()]
for classifier in classifiers:
with Timer(f'{classifier.name()} Trained'):
classifier.fit(XTrain, yTrain)
with Timer(f'{classifier.name()} Tested'):
yPredicted = classifier.predict(XTest)
with Timer(f'{classifier.name()} Results Saved'):
resultsCsv = pd.DataFrame()
resultsCsv['yPredicted'] = yPredicted
resultsCsv['yTrue'] = yTest
resultsCsvPath = resultFolder / f'{classifier.name()}Results.csv'
resultsCsv.to_csv(resultsCsvPath, index=False)
resultsStr = f'{classification_report(yTest, yPredicted, digits=5)}\n' \
f'{classifier.name()}: w={classifier.w}; b={classifier.b}'
resultsTxtPath = resultFolder / f'{classifier.name()}Results.txt'
with open(resultsTxtPath, 'w') as resultsTxtFile:
resultsTxtFile.write(resultsStr)
print(resultsStr)
if __name__ == '__main__':
main(timerPrefix='Total Time Costs: ', timerBeep=False)
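On the TypeError itself: meanDict[y] uses the whole array y as a dictionary key, and NumPy arrays are not hashable. One possible way around it is to look up the class mean for each sample separately. The sketch below assumes the dictionary is keyed by the integer labels 0 and 1, as the docstrings describe (the attempt above uses string keys instead); the function name pooled_variance and the toy data are made up for the example. Note also that ^ is bitwise XOR in Python, and the power operator is **.

```python
import numpy as np

def pooled_variance(mean_dict, X, y):
    """var = mean((x_i - mean_dict[y_i]) ** 2) over all samples."""
    X = np.asarray(X, dtype=float).reshape(len(y), -1)
    # Look up the mean per sample; indexing the dict with the whole
    # array y is what raises "unhashable type: 'numpy.ndarray'".
    per_sample_mean = np.array([mean_dict[int(label)] for label in y])
    return float(np.mean((X[:, 0] - per_sample_mean) ** 2))

# Toy univariate data with shape (N samples, 1), as in fit().
X = np.array([[0.0], [2.0], [10.0], [14.0]])
y = np.array([0, 0, 1, 1])
mean_dict = {0: float(np.mean(X[y == 0])), 1: float(np.mean(X[y == 1]))}
var = pooled_variance(mean_dict, X, y)
print(var)  # 2.5
```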
I implemented the algorithm of Moschopoulos, which gives the density and c.d.f. of a sum of gamma random variables. A C++ implementation exists in the dcoga R package, but I need mine to handle arbitrary-precision numbers through the mpmath library.
The major problem with the following code is the runtime: some parameters of the class (the _delta slot) need to be updated and recomputed on the fly when needed, and this takes a lot of time. I ran cProfile on a simple example so you can quickly see where the problem is, but I do not know enough to make it faster. See for yourself by running the following:
import mpmath as mp
import numpy as np
import scipy.stats.distributions as sc_dist
def gamma_density(x,a,b):
return mp.power(x,a-1) * mp.exp(-x/b) / mp.power(b,a) / mp.gamma(a)
def gamma_cdf(x,a,b):
return mp.gammainc(a,0,x/b,regularized=True)
class GammaConvolution:
def __init__(self,alpha,beta):
#alpha and beta must be provided as numpy array of mpmath.mpf objects
n = len(alpha)
if not len(beta) == n:
raise ValueError('you should provide as much alphas and betas')
if n == 1:
raise ValueError('you should provide at least 2 gammas.')
if not (type(alpha[0]) == mp.mpf and type(beta[0]) == mp.mpf):
raise ValueError('you should provide alpha and beta in mp.mpf format.')
alpha = np.array(alpha)
beta = np.array(beta)
# sanity check :
check = alpha>0
if not np.all(check):
alpha = alpha[check]
beta = beta[check]
print('Some alphas were negatives. We discarded them. {} are remaining'.format(len(alpha)))
self.signs = np.array([-1 if b < 0 else 1 for b in beta])
self.alpha = alpha
self.beta = 1/beta * self.signs
self.n = self.alpha.shape[0]
# Moschopoulos parameters:
self._beta1 = np.min(self.beta)
self._c = np.prod([mp.power(self._beta1/self.beta[i], self.alpha[i]) for i in range(self.n)])
self._rho = np.sum(self.alpha)
self._delta = [mp.mpf('1')]
self._lgam_mod = [np.sum([self.alpha[i] * (1 - (self._beta1 / self.beta[i])) for i in range(self.n)])] # this corresponds to get_lgam(k=1)
self._to_power = [1 - self._beta1/self.beta[i] for i in range(self.n)]
def _get_delta(self,k):
if(len(self._delta)<=k):
# Then we create it :
n = len(self._delta)
self._lgam_mod.extend([np.sum([self.alpha[i] * mp.power(self._to_power[i], j + 1) for i in range(self.n)])for j in range(n,k+1)])
self._delta.extend([np.sum([self._lgam_mod[i] * self._delta[j - 1 - i] for i in range(j)])/j for j in range(n, k+1)])
return self._delta[k]
def coga(self, x, type='pdf'):
if x < 0:
return 0
k = 0
out = 0
if type=='pdf':
func = gamma_density
if type=='cdf':
func = gamma_cdf
while True:
step = self._get_delta(k) * func(x, self._rho + k, self._beta1)
if mp.isinf(step) or mp.isnan(step):
print('inf or nan happened, the algorithm did not converge')
break
out += step
if mp.almosteq(step, 0):
break
k += 1
out *= self._c
return out
def pdf(self,x):
return self.coga(x, 'pdf')
def cdf(self,x):
return self.coga(x, 'cdf')
if __name__ == "__main__":
mp.mp.dps = 20
# some particular example values that 'approximate' a lognormal.
alpha = np.array([mp.mpf(28.51334751960197301778147509487793953959733053134799171090326516406790428180220147416519532643017308),
mp.mpf(11.22775884868121894986129015315963173419663023710308189240288960305130268927466536233373438048018254),
mp.mpf(6.031218085515218207945488717293490366342446718306869797877975835997607997369075273734516467130527887),
mp.mpf(3.566976340452999300401949508136750700482567798832918933344685923750679570986611068640936818600783319),
mp.mpf(2.11321149019108276673514744052403419069919543373601000373799419309581402519772983291547041971629247),
mp.mpf(1.13846760415283260768713745745968197587694610126298554688258480795156541979045502458925706173497129),
mp.mpf(0.4517330810577715647869261976064157403882011767087065171431053299996540080549312203533542184738086012),
mp.mpf(0.07749235677493576352946436194914173772169589371740264101530548860132559560092370430079007024964728383),
mp.mpf(0.002501284133093294545540492059111705453529784044424054717786717782803430937621102255478670439562804153),
mp.mpf(0.000006144939533164067887819376779035687994761732668244591993428755735056093784306786937652351425833352728)])
beta = np.array([mp.mpf(391.6072818187915081052155152400534191999174250784251976117131780922742055385769343508047998043722828),
mp.mpf(77.21898445771279675063405017644417196454232681648725486524482168571564310062495549360709158314560647),
mp.mpf(31.76440960907061013049029007869346161467203121003911885547576503605957915202107379956314233702891246),
mp.mpf(17.44293394293412500742344752991577287098138329678954573112349659319428017788092621396034528552305827),
mp.mpf(11.23444737858955404891602233256282644042451542694693191750203254839756772074776087387688524524329672),
mp.mpf(8.050341288822160015292746577166226701992193848793662515696372301333563919247892899484012106343185691),
mp.mpf(6.255867387720061672816524328464895459610937299629691008594802004659480037331191722532939009895028284),
mp.mpf(5.146993307537222489735861088512006440481952536952402160831342266591666243011939270854579656683779017),
mp.mpf(4.285958039399903253267350243950743396496148339434605882255896571795493305652009075308916145669650666),
mp.mpf(3.455673251219567018227405844933725014914508519853860295284610094904914286228770061300477809395436033)])
dist = GammaConvolution(alpha, beta)
print(sc_dist.lognorm(s=0.25).cdf(1))
import cProfile
pr = cProfile.Profile()
pr.enable()
print(dist.cdf(1))
print(sc_dist.lognorm(s=0.25).cdf(1))
pr.disable()
# after your program ends
import pstats
pstats.Stats(pr).strip_dirs().sort_stats('cumtime').print_stats(20)
Can you help me make it faster? The problem is clearly in the _get_delta function.
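One direction that might reduce the overhead in _get_delta, offered as an untested-for-speed sketch rather than a measured fix: keep a running vector of powers and update it with one multiplication per step instead of calling mp.power each time, and use plain Python sums instead of np.sum over object arrays. The function name delta_coefficients is hypothetical, and ratios stands for the values stored in self._to_power. The code uses only +, * and /, so it should also accept mp.mpf values; the demonstration uses fractions.Fraction so the result can be checked exactly.

```python
from fractions import Fraction

def delta_coefficients(alpha, ratios, k_max):
    """Return [delta_0, ..., delta_k_max] using the same recursion as _get_delta.

    ratios[i] plays the role of 1 - beta1 / beta[i].
    """
    powers = list(ratios)   # holds ratios[i] ** (j + 1) for the current j
    lgam = []               # lgam[j] = sum_i alpha[i] * ratios[i] ** (j + 1)
    delta = [1]
    for k in range(1, k_max + 1):
        lgam.append(sum(a * p for a, p in zip(alpha, powers)))
        # One multiplication per term instead of a fresh power computation.
        powers = [p * r for p, r in zip(powers, ratios)]
        acc = sum(lgam[i] * delta[k - 1 - i] for i in range(k))
        delta.append(acc / k)
    return delta

alpha = [Fraction(1), Fraction(2)]
ratios = [Fraction(0), Fraction(1, 2)]
deltas = delta_coefficients(alpha, ratios, 3)
print(deltas)
```

The convolution sum for each new delta still grows with k, so the total cost stays quadratic in the number of terms; profiling would show whether the remaining time sits in this recursion or in the gamma c.d.f. evaluations.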
I'm implementing deep deterministic policy gradient (DDPG) to solve my problem by following this tutorial (https://www.youtube.com/watch?v=GJJc1t0rtSU) and using this python source code (https://github.com/philtabor/Youtube-Code-Repository/blob/master/ReinforcementLearning/PolicyGradient/DDPG/pendulum/tensorflow/ddpg_orig_tf.py).
There are 3 actions (alpha, beta, gamma) and the state has 2 dimensions. I want each of the 3 actions to take a value in [0,1], so I changed the output layer (third layer) of the Actor class from "tanh" to "sigmoid" in the file "ddpg_orig_tf.py". However, when I ran the algorithm on my problem, the 3 actions only ever took the values 0 or 1, i.e. {0,1}, instead of varying over the interval [0,1] over time. I do not think the activation function is the problem: when I changed it back to tanh, the actions only took the values {-1,1}.
Here is my changed code in "ddpg_orig_tf.py": I changed "tanh" to "sigmoid" in output layer
def build_network(self):
    with tf.variable_scope(self.name):
        self.input = tf.placeholder(tf.float32,
                                    shape=[None, *self.input_dims],
                                    name='inputs')
        self.action_gradient = tf.placeholder(tf.float32,
                                              shape=[None, self.n_actions],
                                              name='gradients')
        f1 = 1. / np.sqrt(self.fc1_dims)
        dense1 = tf.layers.dense(self.input, units=self.fc1_dims,
                                 kernel_initializer=random_uniform(-f1, f1),
                                 bias_initializer=random_uniform(-f1, f1))
        batch1 = tf.layers.batch_normalization(dense1)
        layer1_activation = tf.nn.relu(batch1)
        f2 = 1. / np.sqrt(self.fc2_dims)
        dense2 = tf.layers.dense(layer1_activation, units=self.fc2_dims,
                                 kernel_initializer=random_uniform(-f2, f2),
                                 bias_initializer=random_uniform(-f2, f2))
        batch2 = tf.layers.batch_normalization(dense2)
        layer2_activation = tf.nn.relu(batch2)
        f3 = 0.003
        mu = tf.layers.dense(layer2_activation, units=self.n_actions,
                             activation='sigmoid',
                             kernel_initializer=random_uniform(-f3, f3),
                             bias_initializer=random_uniform(-f3, f3))
        self.mu = tf.multiply(mu, self.action_bound)
Here is my Environment:
import gym
from gym import spaces
from gym.utils import seeding
import numpy as np
from os import path
class P_NOMAEnv():
def __init__(self, distance1, distance2, power, B=15000, N0=10**-20, path_loss=2, g=1):
self.B = B #bandwidth
self.N0 = N0
self.path_loss = path_loss
self.g = g
self.alpha_low = 0.
self.alpha_high = 1.
self.beta_low = 0.
self.beta_high = 1.
self.gamma_low = 0.
self.gamma_high = 1.
self.distance1 = np.random.randint(30,500)
self.distance2 = 2*distance1
self.power = power
self.max_iteration = 1000
self.high = np.array([self.B, self.power])
self.action_space = spaces.Box(low=0., high=1., shape=(3,), dtype=np.float32)
self.observation_space = spaces.Box(low=np.array([0.1, 0.0001]), high=np.array([self.B, self.power]), dtype=np.float32)
self.seed()
def seed(self, seed=None):
self.np_random, seed = seeding.np_random(seed)
return [seed]
def cal_SINR_near(self, alpha, beta, gamma, g, distance1, path_loss, power, B, N0):
h_near = g*(distance1**-path_loss)
channel_noise = B*N0 # 1 subchannel
non_overlap = (np.square(np.absolute(h_near))*power*0.5*(1-beta))/channel_noise
overlap = (np.square(np.absolute(h_near))*power*gamma*(alpha+beta)*0.5)/(channel_noise + (np.square(np.absolute(h_near))*power*(1-gamma)*(alpha+beta)*0.5))
SINR_near = non_overlap + overlap
return SINR_near
def cal_SINR_far(self, alpha, beta, gamma, g, distance2, path_loss, power, B, N0):
h_far = g*(distance2**-path_loss)
channel_noise = B*N0 # 1 subchannel
non_overlap = (np.square(np.absolute(h_far))*power*0.5*(1-alpha))/channel_noise
overlap = (np.square(np.absolute(h_far))*power*(1-gamma)*(alpha+beta)*0.5)/(channel_noise
+ (np.square(np.absolute(h_far))*power*gamma*(alpha+beta)*0.5))
SINR_far = non_overlap + overlap
return SINR_far
def cal_sum_rate(self, SINR_near, SINR_far, B, alpha, beta):
R_near = (1+alpha)*0.5*B*np.log2(1+SINR_near)
R_far = (1+beta)*0.5*B*np.log2(1+SINR_far)
sum_rate = R_near + R_far # reward
return sum_rate
def normalize(self, x):
normalized = (x+1.2)/2.4
return normalized
def step(self, action):
self.steps_taken += 1
B,P = self.state
new_alpha = np.clip(action, self.alpha_low, self.alpha_high)[0]
new_beta = np.clip(action, self.beta_low, self.beta_high)[1]
new_gamma = np.clip(action, self.gamma_low, self.gamma_high)[2]
SINR_near = self.cal_SINR_near(new_alpha, new_beta, new_gamma, self.g, self.distance1, self.path_loss, self.power, self.B, self.N0)
SINR_far = self.cal_SINR_far(new_alpha, new_beta, new_gamma, self.g, self.distance2, self.path_loss, self.power, self.B, self.N0)
reward = self.cal_sum_rate(SINR_near, SINR_far, self.B, new_alpha, new_beta)
done = self.steps_taken >= self.max_iteration
B_new=(1-new_beta)*0.5*self.B + (new_alpha+new_beta)*0.5*self.B
P_new=(1-new_beta)*0.5*self.power + (new_alpha+new_beta)*0.5*new_gamma*self.power
self.state = np.array([B_new, P_new])
return self._get_obs(action), reward, done, {}, new_alpha, new_beta, new_gamma
def _get_obs(self, action):
new_alpha = np.clip(action, self.alpha_low, self.alpha_high)[0]
new_beta = np.clip(action, self.beta_low, self.beta_high)[1]
new_gamma = np.clip(action, self.gamma_low, self.gamma_high)[2]
B_new=(1-new_beta)*0.5*self.B + (new_alpha+new_beta)*0.5*self.B
P_new=(1-new_beta)*0.5*self.power + (new_alpha+new_beta)*0.5*new_gamma*self.power
return np.array([B_new, P_new])
def reset(self):
self.steps_taken = 0
a = np.random.random_sample((3,))
self.state = self._get_obs(a)
return self._get_obs(a)
Here is my main file:
import os
import gym
import numpy as np
from ddpg import Agent
from Environment import P_NOMAEnv
from utils import plotLearning
if __name__ == '__main__':
a = np.random.randint(30,250)
env = P_NOMAEnv(distance1=100, distance2=200, power=2, B=15000, N0=10**-20, path_loss=2, g=1)
agent = Agent(alpha=0.00005, beta=0.0005, input_dims=[2], tau=0.001,
env=env, batch_size=64, layer1_size=400, layer2_size=300,
n_actions=3)
np.random.seed(0)
score_history = []
score_history2 = []
for i in range(4000):
obs = env.reset()
done = False
score = 0
maxi = np.zeros(4,)
while not done:
act = agent.choose_action(obs)
new_state, reward, done, info, alpha, beta, gamma = env.step(act)
agent.remember(obs, act, reward, new_state, int(done))
agent.learn()
score += reward
obs = new_state
if reward>maxi[0]: maxi = [reward, alpha, beta, gamma]
#env.render()
score_history.append(maxi[0])
score_history2.append(score/1000)
print('episode ', i+1, ', reward ', np.around(score/1000, decimals=4), ', max reward: ', np.around(maxi[0], decimals=4), ', with alpha: ', np.around(alpha, decimals=4), ', beta: ', np.around(beta, decimals=4), ', gamma: ', np.around(gamma, decimals=4), 'trailing 100 episodes avg ', np.mean(score_history2[-100:]))
I tried to print out noise, mu(output of actor):
def choose_action(self, state):
    state = state[np.newaxis, :]
    mu = self.actor.predict(state) # returns list of list
    noise = self.noise()
    mu_prime = mu + noise
    print("Noise: ", noise, "mu: ", mu, "mu_prime: ", mu_prime[0])
    return mu_prime[0]
When I run main, the results showed as follows:
Noise: [-0.26362168 -0.01389367 -0.39754398] mu: [[1. 0. 0.]] mu_prime: [ 0.73637832 -0.01389367 -0.39754398]
Noise: [-0.29287953 -0.03729832 -0.39651476] mu: [[1. 0. 0.]] mu_prime: [ 0.70712047 -0.03729832 -0.39651476]
.........
As you can see, mu always takes the value 0 or 1, i.e. {0,1}, and never a value inside the interval [0,1]. I tried more than 1000 episodes, but this did not change over time.
The issue is probably in my environment, but I do not know how to fix it. If you have any idea about this, please help me solve it; I would appreciate it.
You can't use sigmoid for multi-label output. You'll need to use softmax
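Independently of which output activation is used, one thing that might be worth checking is input scale. This is a guess from the posted code, not a confirmed diagnosis: the environment returns raw observations on the order of B = 15000, and large unscaled inputs can push the pre-activations so far out that sigmoid or tanh saturates at its limits. The sketch below uses only the standard library; the weights and the sample observation are hypothetical, and the bounds are the observation_space limits from the environment with power=2.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def scale(value, low, high):
    """Map value from [low, high] to [0, 1]."""
    return (value - low) / (high - low)

low = (0.1, 0.0001)       # observation_space lower bounds
high = (15000.0, 2.0)     # observation_space upper bounds (B, power)
raw_obs = (9000.0, 1.2)   # hypothetical observation
weights = (0.01, 0.01)    # hypothetical small weights

pre_raw = sum(w * o for w, o in zip(weights, raw_obs))
pre_scaled = sum(
    w * scale(o, lo, hi) for w, o, lo, hi in zip(weights, raw_obs, low, high)
)

print(sigmoid(pre_raw))     # 1.0 (saturated)
print(sigmoid(pre_scaled))  # close to 0.5, still responsive to the input
```

If this is the cause, scaling the observations (and possibly the rewards) before they reach the agent may let the actor output intermediate values.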
For learning purposes, I want to build my own LSTM model in TensorFlow. The problem is how to train it in such a way that the states at a certain timestep get initialized using the states from the previous timestep. Is there a mechanism for this in TensorFlow?
import tensorflow as tf  # TensorFlow 1.x API (tf.placeholder)


class Lstm:

    def __init__(self, x, steps):
        # NOTE: `size` was undefined in the original snippet; 100 is assumed
        # here so that the state matches the width of the LSTM step below.
        size = 100
        self.initial = tf.placeholder(tf.float32, [None, size])
        self.state = self.initial
        for _ in range(steps):
            # The original called self.layer_lstm, which is not defined
            x = self.step_lstm(x, size)
        x = self.layer_softmax(x, 10)
        self.prediction = x

    def step_lstm(self, x, size):
        stream = self.layer(x, size)
        input_ = self.layer(x, size)
        forget = self.layer(x, size, bias=1.0)  # float, to match float32 weights
        output = self.layer(x, size)
        self.state = stream * input_ + self.state * forget
        x = self.state * output
        return x

    def layer_softmax(self, x, size):
        x = self.layer(x, size)
        x = tf.nn.softmax(x)
        return x

    def layer(self, x, size, bias=0.1):
        in_size = int(x.get_shape()[1])
        weight = tf.Variable(tf.truncated_normal([in_size, size], stddev=0.1))
        bias = tf.Variable(tf.constant(bias, shape=[size]))
        x = tf.matmul(x, weight) + bias
        return x
@danijar - you may want to look at the 'Variables' section of this page for a simple example of how to maintain state across calls to a subgraph.