ANN model for multiclass classification - python

I don't know what the problem is and why I'm getting this error:
ValueError: in user code:
ValueError: Shapes (None, 1) and (None, 6) are incompatible
Can anyone please help me with this code?
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation,Dropout
from sklearn.preprocessing import MinMaxScaler
%matplotlib inline
df = pd.read_csv('test.csv')
dft = pd.read_csv('train.csv')
X_train = df.drop('label',axis=1).values
y_train = df['label'].values
X_test = dft.drop('label',axis=1).values
y_test = dft['label'].values
scaler = MinMaxScaler()
scaler.fit(X_train)
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)
model = Sequential()
model.add(Dense(units=30, activation='relu'))
model.add(Dense(units=15, activation='relu'))
model.add(Dense(6, activation='softmax'))
model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(x=X_train, y=y_train, epochs=15, batch_size=10, validation_data=(X_test, y_test))

The issue is that the length of the second dimension of your target arrays (y_train and y_test) is equal to 1, while the model is expecting 6, given that the number of neurons of the output layer is set equal to 6. To resolve this issue you need to one-hot encode the target (you can use scikit-learn OneHotEncoder). If your target does actually have 6 classes then your model will work.
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from sklearn.datasets import make_classification
from sklearn.preprocessing import OneHotEncoder, MinMaxScaler
from sklearn.model_selection import train_test_split
tf.random.set_seed(0)
# generate the data
X, y = make_classification(n_classes=6, n_samples=1000, n_features=10, n_informative=10, n_redundant=0, random_state=42)
print(y.shape)
# (1000, )
# split the data
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
# one-hot encode the target
enc = OneHotEncoder(sparse=False, handle_unknown='ignore')
enc.fit(y_train.reshape(-1, 1))
y_train = enc.transform(y_train.reshape(-1, 1))
y_test = enc.transform(y_test.reshape(-1, 1))
print(y_train.shape, y_test.shape)
# (750, 6) (250, 6)
# scale the features
scaler = MinMaxScaler()
scaler.fit(X_train)
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)
# define the model
model = Sequential()
model.add(Dense(units=30, activation='relu'))
model.add(Dense(units=15, activation='relu'))
model.add(Dense(6, activation='softmax'))
# fit the model
model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(x=X_train, y=y_train, epochs=3, batch_size=10, validation_data=(X_test, y_test))
# Epoch 1/3
# 75/75 [==============================] - 1s 2ms/step - loss: 1.7872 - accuracy: 0.2427 - val_loss: 1.7719 - val_accuracy: 0.2600
# Epoch 2/3
# 75/75 [==============================] - 0s 781us/step - loss: 1.7660 - accuracy: 0.2547 - val_loss: 1.7549 - val_accuracy: 0.2720
# Epoch 3/3
# 75/75 [==============================] - 0s 768us/step - loss: 1.7528 - accuracy: 0.2587 - val_loss: 1.7408 - val_accuracy: 0.3280

Related

fashion_mnist Data ML Accuracy Score is only 0.1

i am pretty new to ML and trying to do an typical fashion_mnist Classification. The Problem is that the accuracy Score after I run the code is only 0.1 and the loss is below 0. So i guess the ML is not learning but I dont know what the Problem is?
Thx
from tensorflow.keras.datasets import fashion_mnist
(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()
x_train = x_train.astype('float32')
print(type(x_train))
x_train =x_train.reshape(60000,784)
x_train = x_train / 255.0
x_test =x_test.reshape(60000,784)
x_test= x_test/ 255.0
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
model= Sequential()
model.add(Dense(100, activation="sigmoid", input_shape=(784,)))
model.add(Dense(1, activation="sigmoid"))
model.compile(optimizer='sgd', loss="binary_crossentropy", metrics=["accuracy"])
model.fit(
x_train,
y_train,
epochs=10,
batch_size=1000)
Output:
Multiple issues with your code -
You have some error in the reshape x_test = x_test.reshape(10000,784) as it has 10000 images only.
You are using a sigmoid activation in the first dense layer, which is not a good practice. Instead, use relu.
Your output Dense has only 1 node. You are working with a dataset that has 10 unique classes. The output has to be Dense(10). Please understand that even though the y_train has classes 0-10, a neural network can't predict integer values with a softmax or sigmoid activation. Instead what you are trying to do is predict the probability values for EACH of the 10 classes.
You are using the incorrect activation in the final layer for multi-class classification. Use softmax.
You are using the incorrect loss function. For multi-class classification use categorical_crossentropy. Since your output is a 10-dimensional probability distribution, but your y_train is a single value for each class label, you can use sparse_categorical_crossentropy instead which is the same thing but handles label encoded y.
Try using a better optimizer to avoid getting stuck in local minima, such as adam.
It's preferred to use CNNs for image data since a simple Dense layer will not be able to capture the spatial features that make up the image. Since the images are small (28,28) and this is a toy example, it's ok the way it is.
Please refer to this table for checking out what to use. You have to ensure you know what problem you are solving in the first place though.
In your case, you want to do a multi-class single label classification but you are instead doing a multi-class multi-label classification by using the incorrect loss and output layer activation.
from tensorflow.keras.datasets import fashion_mnist
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
#Load data
(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()
#Normalize
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
#Reshape
x_train = x_train.reshape(60000,784)
x_train = x_train / 255.0
x_test = x_test.reshape(10000,784)
x_test = x_test / 255.0
print('Data shapes->',[i.shape for i in [x_train, y_train, x_test, y_test]])
#Contruct computation graph
model = Sequential()
model.add(Dense(100, activation="relu", input_shape=(784,)))
model.add(Dense(10, activation="softmax"))
#Compile with loss as cross_entropy and optimizer as adam
model.compile(optimizer='adam', loss="sparse_categorical_crossentropy", metrics=["accuracy"])
#Fit model
model.fit(x_train, y_train, epochs=10, batch_size=1000)
Data shapes-> [(60000, 784), (60000,), (10000, 784), (10000,)]
Epoch 1/10
60/60 [==============================] - 0s 5ms/step - loss: 0.8832 - accuracy: 0.7118
Epoch 2/10
60/60 [==============================] - 0s 6ms/step - loss: 0.5125 - accuracy: 0.8281
Epoch 3/10
60/60 [==============================] - 0s 6ms/step - loss: 0.4585 - accuracy: 0.8425
Epoch 4/10
60/60 [==============================] - 0s 6ms/step - loss: 0.4238 - accuracy: 0.8547
Epoch 5/10
60/60 [==============================] - 0s 7ms/step - loss: 0.4038 - accuracy: 0.8608
Epoch 6/10
60/60 [==============================] - 0s 6ms/step - loss: 0.3886 - accuracy: 0.8656
Epoch 7/10
60/60 [==============================] - 0s 6ms/step - loss: 0.3788 - accuracy: 0.8689
Epoch 8/10
60/60 [==============================] - 0s 6ms/step - loss: 0.3669 - accuracy: 0.8725
Epoch 9/10
60/60 [==============================] - 0s 6ms/step - loss: 0.3560 - accuracy: 0.8753
Epoch 10/10
60/60 [==============================] - 0s 6ms/step - loss: 0.3451 - accuracy: 0.8794
I am also adding a code for your reference with Convolutional layers, using categorical_crossentropy and functional API instead of Sequential. Please read the comments inline the code for more clarity. This should help you get an idea of some good practices when working with Keras.
from tensorflow.keras.datasets import fashion_mnist
from tensorflow.keras import layers, Model, utils
#Load data
(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()
#Normalize
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
#Reshape
x_train = x_train.reshape(60000,28,28,1)
x_train = x_train / 255.0
x_test = x_test.reshape(10000,28,28,1)
x_test = x_test / 255.0
#Set y to onehot instead of label encoded
y_train = utils.to_categorical(y_train)
y_test = utils.to_categorical(y_test)
#print([i.shape for i in [x_train, y_train, x_test, y_test]])
#Contruct computation graph
inp = layers.Input((28,28,1))
x = layers.Conv2D(32, (3,3), activation='relu', padding='same')(inp)
x = layers.MaxPooling2D((2,2))(x)
x = layers.Conv2D(32, (3,3), activation='relu', padding='same')(x)
x = layers.MaxPooling2D((2,2))(x)
x = layers.Flatten()(x)
out = Dense(10, activation='softmax')(x)
#Define model
model = Model(inp, out)
#Compile with loss as cross_entropy and optimizer as adam
model.compile(optimizer='adam', loss="categorical_crossentropy", metrics=["accuracy"])
#Fit model
model.fit(x_train, y_train, epochs=10, batch_size=1000)
utils.plot_model(model, show_layer_names=False, show_shapes=True)

Avoiding overfitting in LSTM

I have written code using Keras and TensorFlow to recognize a pattern in a cyclic dataset. The thing which I worried about was overfitting and how to avoid from being overfitted. Now, from the loss value and accuracy, it seems I have become overfitted. The code is below:
#importing libraries
import numpy as np
import pandas as pd
import os
import math
import matplotlib.pyplot as plt
import tensorflow as tf
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split
from sklearn import preprocessing
from pandas import read_csv
from matplotlib import pyplot
import tensorflow.keras.backend as K
from keras.layers import Dense, Activation
from keras.layers.recurrent import LSTM
from keras import backend as K
from keras.models import Sequential
from sklearn.metrics import mean_squared_error
from keras.layers import InputLayer
# Reading dataset
df = pd.read_excel("concate35w270.xlsx")
df = df.astype('float32')
df.head()
scaler = MinMaxScaler(feature_range= (0,1))
df = scaler.fit_transform(df)
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size = 0.25, random_state = 0)
#Looking last time step
def create_dataset(dataset, look_back=1):
dataX, dataY = [], []
for i in range(len(dataset)-look_back-1):
a = dataset[i:(i+look_back), 0]
dataX.append(a)
dataY.append(dataset[i + look_back, 0])
return np.array(dataX), np.array(dataY
# Reshaping dataset
# reshape into X=t and Y=t+1
look_back = 1
trainX, trainY = create_dataset(train, look_back)
testX, testY = create_dataset(test, look_back)
look_back = 1
trainX, trainY = create_dataset(train, look_back)
testX, testY = create_dataset(test, look_back)
# reshape input to be [samples, time steps, features]
trainX = np.reshape(trainX, (trainX.shape[0], 1, trainX.shape[1]))
testX = np.reshape(testX, (testX.shape[0], 1, testX.shape[1]))
# Network Architecture
# create and fit the LSTM network
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.LSTM(1, input_shape=(1, look_back)))
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(128, activation = tf.nn.relu))
#model.add(tf.keras.layers.Dense(128, activation = tf.nn.relu))
model.add(tf.keras.layers.Dense(1, activation = 'sigmoid'))
def coeff_determination(y_true, y_pred):
SS_res = K.sum(K.square( y_true-y_pred ))
SS_tot = K.sum(K.square( y_true - K.mean(y_true) ) )
return ( 1 - SS_res/(SS_tot + K.epsilon()) )
model.compile(optimizer='sgd',
loss='mse',
metrics = [coeff_determination])
model.fit(trainX, trainY, epochs = 30)
After starting to fit model using the training data set, I saw this massages from the machine:
Epoch 1/30
702543/702543 [==============================] - 64s 91us/sample - loss: 0.0376 - coeff_determination: 0.4673
Epoch 2/30
702543/702543 [==============================] - 61s 86us/sample - loss: 0.0015 - coeff_determination: 0.9791
Epoch 3/30
702543/702543 [==============================] - 60s 86us/sample - loss: 0.0014 - coeff_determination: 0.9802
Epoch 4/30
702543/702543 [==============================] - 64s 91us/sample - loss: 0.0013 - coeff_determination: 0.9812
Epoch 5/30
702543/702543 [==============================] - 68s 97us/sample - loss: 0.0013 - coeff_determination: 0.9820
Epoch 6/30
702543/702543 [==============================] - 67s 96us/sample - loss: 0.0012 - coeff_determination: 0.9827
Epoch 7/30
702543/702543 [==============================] - 67s 95us/sample - loss: 0.0012 - coeff_determination: 0.9834
I guess I should define a penalty to avoid from overfitting, How can I do that?
All help will be appreciated.
Your test data meant to be for monitoring the model's overfitting on train data, so you have to insert validation_data parameter in your .fit method like this:
model.fit(trainX, trainY, validation_data=(testX, testY), epochs=30)
Detailed information you can get in my answer here.

Low accuracy after training a CNN

I try to train a CNN model that classifies the handwritten digit using Keras, but I am getting low accuracy in the training (lower than 10%) and a big error.
I tried a simple neural network without concolutions and it didn't work as well.
This is my code.
import tensorflow as tf
from tensorflow import keras
import numpy as np
import matplotlib.pyplot as plt
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
#Explore data
print(y_train[12])
print(np.shape(x_train))
print(np.shape(x_test))
#we have 60000 imae for the training and 10000 for testing
# Scaling data
x_train = x_train/255
y_train = y_train/255
#reshape the data
x_train = x_train.reshape(60000,28,28,1)
x_test = x_test.reshape(10000,28,28,1)
y_train = y_train.reshape(60000,1)
y_test = y_test.reshape(10000,1)
#Create a model
model = keras.Sequential([
keras.layers.Conv2D(64,(3,3),(1,1),padding = "same",input_shape=(28,28,1)),
keras.layers.MaxPooling2D(pool_size = (2,2),padding = "valid"),
keras.layers.Conv2D(32,(3,3),(1,1),padding = "same"),
keras.layers.MaxPooling2D(pool_size = (2,2),padding = "valid"),
keras.layers.Flatten(),
keras.layers.Dense(128,activation = "relu"),
keras.layers.Dense(10,activation = "softmax")])
model.compile(optimizer = "adam",
loss = "sparse_categorical_crossentropy",
metrics = ['accuracy'])
model.fit(x_train,y_train,epochs=10)
test_loss,test_acc = model.evaluate(x_test,y_test)
print("\ntest accuracy:",test_acc)
Could anyone advice me on how to improve my model?
Your problem is here:
x_train = x_train/255
y_train = y_train/255 # makes no sense
You should have rescaled x_test, not y_train.
x_train = x_train/255
x_test = x_test/255
That was probably just a typo from your part. Change these lines and you'll have 95%+ accuracy.
You model have some scaling problem and try to use tf 2.0
x_train /= 255
x_test /= 255
you don't need to scale all data of test
as you have done :
x_train = x_train/255
y_train = y_train/255
Afterwards, we can transform the labels into a one-hot encoding
from tensorflow.keras.utils import to_categorical
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)
which helps in the :
loss='categorical_crossentropy',
The Sequential API allows us to stack layers on top of each other. The only downside is that we cannot have multiple outputs or inputs when using these models. Nevertheless, we can create a Sequential object and use the add() function to add layers to our model.
Try to use more API that make your model more smooth and accurate as using add function is present on Tf 2.0
As we can give Conv2D 4 time to make smooth :
seq_model.add(Conv2D(filters=32, kernel_size=(5,5), activation='relu',
input_shape=x_train.shape[1:]))
seq_model.add(Conv2D(filters=32, kernel_size=(5,5), activation='relu'))
seq_model.add(Conv2D(filters=64, kernel_size=(3, 3), activation='relu'))
seq_model.add(Conv2D(filters=64, kernel_size=(3, 3), activation='relu'))
in the code you can use dropout :
seq_model.add(Dropout(rate=0.25))
Full model :
%tensorflow_version 2.x
from tensorflow.keras.datasets import mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
x_train = x_train.reshape(x_train.shape[0], 28, 28, 1)
x_test = x_test.reshape(x_test.shape[0], 28, 28, 1)
from tensorflow.keras.utils import to_categorical
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPool2D, Dense, Flatten, Dropout
seq_model = Sequential()
seq_model.add(Conv2D(filters=32, kernel_size=(5,5), activation='relu',
input_shape=x_train.shape[1:]))
seq_model.add(Conv2D(filters=32, kernel_size=(5,5), activation='relu'))
seq_model.add(MaxPool2D(pool_size=(2, 2)))
seq_model.add(Dropout(rate=0.25))
seq_model.add(Conv2D(filters=64, kernel_size=(3, 3), activation='relu'))
seq_model.add(Conv2D(filters=64, kernel_size=(3, 3), activation='relu'))
seq_model.add(MaxPool2D(pool_size=(2, 2)))
seq_model.add(Dropout(rate=0.25))
seq_model.add(Flatten())
seq_model.add(Dense(256, activation='relu'))
seq_model.add(Dropout(rate=0.5))
seq_model.add(Dense(10, activation='softmax'))
seq_model.compile(
loss='categorical_crossentropy',
optimizer='adam',
metrics=['accuracy']
)
epochsz = 3 # number of epch
batch_sizez = 32 # the batch size ,can be 64 , 128 so other
seq_model.fit(x_train,y_train, batch_size=batch_sizez, epochs=epochsz)
Result :
Train on 60000 samples
Epoch 1/3
60000/60000 [==============================] - 186s 3ms/sample - loss: 0.1379 - accuracy: 0.9588
Epoch 2/3
60000/60000 [==============================] - 187s 3ms/sample - loss: 0.0677 - accuracy: 0.9804
Epoch 3/3
60000/60000 [==============================] - 187s 3ms/sample - loss: 0.0540 - accuracy: 0.9840

sparse matrix length is ambiguous

I'm very new to machine learning so this question might sound stupid.
i'm following a tutorial on Text Classification but I'm facing an error that I don't have any idea about how to solve.
This is the code I have (it is basically what it is found in the tutorial)
import pandas as pd
filepath_dict = {'yelp': 'data/yelp_labelled.txt',
'amazon': 'data/amazon_cells_labelled.txt',
'imdb': 'data/imdb_labelled.txt'}
df_list = []
for source, filepath in filepath_dict.items():
df = pd.read_csv(filepath, names=['sentence', 'label'], sep='\t')
df['source'] = source
df_list.append(df)
df = pd.concat(df_list)
print(df.iloc[0:4])
from sklearn.feature_extraction.text import CountVectorizer
df_yelp = df[df['source'] == 'yelp']
sentences = df_yelp['sentence'].values
y = df_yelp['label'].values
from sklearn.model_selection import train_test_split
sentences_train, sentences_test, y_train, y_test = train_test_split(sentences, y, test_size=0.25, random_state=1000)
from sklearn.feature_extraction.text import CountVectorizer
vectorizer = CountVectorizer()
vectorizer.fit(sentences_train)
X_train = vectorizer.transform(sentences_train)
X_test = vectorizer.transform(sentences_test)
from keras.models import Sequential
from keras import layers
input_dim = X_train.shape[1]
model = Sequential()
model.add(layers.Dense(10, input_dim=input_dim, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy',
optimizer='adam',
metrics=['accuracy'])
model.summary()
history = model.fit(X_train, y_train,
nb_epoch=100,
verbose=False,
validation_data=(X_test, y_test),
batch_size=10)
When I reach the last line, I get an error
"TypeError: sparse matrix length is ambiguous; use getnnz() or shape[0]"
I guess I'll have to perform some kind of transformation on the data I'm using, or that I should try to load those data in a different way. I tried to search on Stackoverflow already but - being new to all this - I couldn't find anything helpful.
How do I make this work? Ideally I'd like to get not only the solution but also a brief explaination about why the error happened and what the solution does in order to solve it.
thanks!
The reason you're facing this difficulty is that your X_train and X_test are of type <class scipy.sparse.csr.csr_matrix> whereas your model expects it to be a numpy array.
Try casting them to dense and you're fine to go:
X_train = X_train.todense()
X_test = X_test.todense()
Not sure, why are you getting error for this script.
The following script is working fine; even with sparse matrix. May be give a try in your machine.
sentences = ['i want to test this','let us try this',
'would this work','how about this',
'even this','this should not work']
y= [0,0,0,0,0,1]
from sklearn.model_selection import train_test_split
sentences_train, sentences_test, y_train, y_test = train_test_split(sentences, y, test_size=0.25, random_state=1000)
from sklearn.feature_extraction.text import CountVectorizer
vectorizer = CountVectorizer()
vectorizer.fit(sentences_train)
X_train = vectorizer.transform(sentences_train)
X_test = vectorizer.transform(sentences_test)
from keras.models import Sequential
from keras import layers
input_dim = X_train.shape[1]
model = Sequential()
model.add(layers.Dense(10, input_dim=input_dim, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy',
optimizer='adam',
metrics=['accuracy'])
model.summary()
model.fit(X_train, y_train,
epochs=2,
verbose=True,
validation_data=(X_test, y_test),
batch_size=2)
#
Layer (type) Output Shape Param #
=================================================================
dense_5 (Dense) (None, 10) 110
_________________________________________________________________
dense_6 (Dense) (None, 1) 11
=================================================================
Total params: 121
Trainable params: 121
Non-trainable params: 0
_________________________________________________________________
Train on 4 samples, validate on 2 samples
Epoch 1/2
4/4 [==============================] - 1s 169ms/step - loss: 0.7570 - acc: 0.2500 - val_loss: 0.6358 - val_acc: 1.0000
Epoch 2/2
4/4 [==============================] - 0s 3ms/step - loss: 0.7509 - acc: 0.2500 - val_loss: 0.6328 - val_acc: 1.0000

Linear regression getting NaN for loss

Cant understand why keras linear regression model is not working. Using Boston Housing data.Get Loss as nan
path='/Users/admin/Desktop/airfoil_self_noise.csv'
df=pd.read_csv(path,sep='\t',header=None)
y=df[5] #TARGET
df2=df.iloc[:,:-1]
X_train, X_test, y_train, y_test = train_test_split(df2, y, test_size=0.2)
p = Sequential()
p.add(Dense(units=20, activation='relu', input_dim=5))
p.add(Dense(units=20, activation='relu'))
p.add(Dense(units=1))
p.compile(loss='mean_squared_error',
optimizer='sgd')
p.fit(X_train, y_train, epochs=10, batch_size=32)
this yeilds:
Epoch 1/10
1202/1202 [==============================] - 0s 172us/step - loss: nan
Epoch 2/10
1202/1202 [==============================] - 0s 37us/step - loss: nan
Epoch 3/10
1202/1202 [==============================] - 0s 38us/step - loss: nan
Epoch 4/10
1202/1202 [==============================] - 0s 36us/step - loss: nan
Epoch 5/10
1202/1202 [==============================] - 0s 36us/step - loss: nan
Epoch 6/10
1202/1202 [==============================] - 0s 40us/step - loss: nan
Just to get you started, building on the top of NaN loss when training regression network
import pandas as pd
import keras
from keras.layers import Dense, Input
from keras import Sequential
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
#Grabbing these 2 lines from your example
path='/Users/admin/Desktop/airfoil_self_noise.csv'
df = pd.read_csv("airfoil_self_noise.csv", sep = '\t', header = None)
y = df[5]
df2 = df.iloc[:, :-1]
#preprocessing. Vectorization and Scaling
X_train, X_test, y_train, y_test = train_test_split(df2.values, y.values, test_size = 0.2)
X_train = sc.fit_transform(X_train)
X_test = sc.fit_transform(X_test)
p = Sequential()
p.add(Dense(units = 20, activation ='relu', input_dim = 5))
p.add(Dense(units = 20, activation ='relu'))
p.add(Dense(units = 1))
p.compile(loss = 'mean_squared_error', optimizer = 'adam')
print(p.fit(X_train, y_train, epochs = 100, batch_size = 64))

Categories