Work In Progress…¶
Neural Network (Regression)¶
Predicting Medical Insurance Costs
Data Source: Medical Insurance
Project Goal:
We would like to predict the individual medical costs (charges) given the rest of the columns/features. Since charges represent continuous values (in dollars), we’re performing a regression task.
Content:
Import Libraries¶
Import the Python libraries and load the dataset.
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
# from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras import layers
from tensorflow.keras.layers import Dense, Activation
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import EarlyStopping
from scikeras.wrappers import KerasClassifier, KerasRegressor
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import mean_squared_error, make_scorer, r2_score
import matplotlib.pyplot as plt
df = pd.read_csv('insurance.csv') #load the dataset
print(df.shape)
df.head(3)
Data Cleaning¶
# inspect categorical features
df.region.unique()
# clean categorical features
df.region = df.region.replace('0', 'no', regex=True)
df.region.unique()
Define X and y¶
X = df.iloc[:,0:6]
y = df.iloc[:,-1]
One-Hot Encoding For Categorical Variables¶
X = pd.get_dummies(X)
X.head(2)
Split data¶
Note:
Train, validation, and test splits work a little differently for neural networks. With a traditional ML algorithm, we usually split the data set into a training set and a test set (70% and 30% is a common choice). In the training phase, we fit the model on the training data. To evaluate the model (i.e., to check how well it predicts on unseen data), we run it against the test data and compare the predictions with the known true values to measure its performance. If the performance is not at the desired level, we repeat the process (train, test, compare) until it is.
With a neural network, we first split the data set with train_test_split. Then, in the training/fitting phase, we split the training data again into a training set and a validation set. Finally, we test the model on the test set (unseen data) and compare the predicted results to the real results.
x_train, x_test, y_train, y_test = train_test_split(X, y,
test_size = 0.1, # 10%
random_state = 42)
print(x_train.shape, x_test.shape, y_train.shape, y_test.shape)
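The arithmetic behind the two-stage split can be sketched as follows. The helper `split_sizes` is hypothetical (it is not part of the notebook), the row count of 1,000 is only an illustration, and the rounding rules are assumptions about how train_test_split and Keras behave: the test share is rounded up, and the training share left after `validation_split` is rounded down.

```python
import math

def split_sizes(n_rows, test_size=0.1, validation_split=0.2):
    """Return (train, validation, test) row counts for the two-stage split."""
    # Stage 1: hold out the test set (assumed: the test share is rounded up)
    n_test = math.ceil(n_rows * test_size)
    n_fit = n_rows - n_test
    # Stage 2: validation_split carves the validation set out of the remaining rows
    n_train = math.floor(n_fit * (1.0 - validation_split))
    n_val = n_fit - n_train
    return n_train, n_val, n_test

# hypothetical data set size, for illustration only
train, val, test = split_sizes(1000)
print(train, val, test)
```

With these settings, roughly 72% of the rows are used for weight updates, 18% for validation during training, and 10% are kept unseen for the final test.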
Standardize¶
scaler = StandardScaler()
x_train = scaler.fit_transform(x_train)
x_test = scaler.transform(x_test)
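To illustrate why the scaler is fitted on the training set only, here is a minimal pure-Python sketch of standardization for a single feature. The helper names and the sample values are hypothetical; the sketch assumes the usual z-score formula with the population standard deviation.

```python
from statistics import fmean, pstdev

def fit_scaler(column):
    """Learn the mean and population standard deviation from training values only."""
    return fmean(column), pstdev(column)

def transform(column, mean, std):
    """Apply z = (x - mean) / std using statistics learned on the training set."""
    return [(x - mean) / std for x in column]

# hypothetical values for a single feature
train_ages = [20.0, 30.0, 40.0, 50.0]
test_ages = [35.0, 60.0]

mean, std = fit_scaler(train_ages)
train_scaled = transform(train_ages, mean, std)
test_scaled = transform(test_ages, mean, std)  # reuses the training mean and std
print(train_scaled, test_scaled)
```

Reusing the training statistics for the test set keeps information about the test data from leaking into preprocessing.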
Designing Model¶
# Creating a keras sequential object
model_regr = Sequential()
# fix random seed for reproducibility
seed = 7
tf.random.set_seed(seed)
DEFINE MODEL¶
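############## INPUT LAYER ##########################################
# Note: Keras infers the input shape on the first call, so this Dense layer acts as a first hidden layer with one unit per feature.
model_regr.add(Dense(units = X.shape[1] , activation = 'relu'))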
############## HIDDEN LAYER 1 ##########################################
# `Note:`
# How do we choose the number of hidden layers and the number of units per layer? That is a tough question, and there
# is no single good answer. A rule of thumb is to start with one hidden layer and as many units as there are features in the
# dataset. However, this might not always work. We need to try things out and observe the learning curve.
# There are a number of activation functions, such as softmax and sigmoid,
# but ReLU (Rectified Linear Unit) is very effective in many applications, so we'll use it here.
model_regr.add(Dense(128, activation = 'relu'))
# Adding dropout
model_regr.add(layers.Dropout(0.1))
############## OUTPUT LAYER ##########################################
model_regr.add(Dense(1, activation = 'linear'))
OPTIMIZER¶
# Keras offers many optimizers, such as SGD (stochastic gradient descent), Adam, RMSprop, and others.
# Adam is a common default choice because it adapts the learning rate per parameter, which addresses some limitations of earlier optimizers.
opt = Adam(learning_rate = 0.01) # Adam's default learning rate is 0.001
COMPILE MODEL¶
# loss/cost
# MSE, MAE, Huber loss
model_regr.compile(loss='mse', metrics=['mae'], optimizer=opt)
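For reference, the three losses mentioned in the comment above can be written out by hand. This is a pure-Python sketch with hypothetical sample values, not the Keras implementation; the Huber threshold `delta=1.0` is an assumption chosen for illustration.

```python
def mse(y_true, y_pred):
    """Mean squared error: penalizes large errors quadratically."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mae(y_true, y_pred):
    """Mean absolute error: penalizes all errors linearly."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def huber(y_true, y_pred, delta=1.0):
    """Huber loss: quadratic for small errors, linear beyond delta."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        err = abs(t - p)
        if err <= delta:
            total += 0.5 * err ** 2
        else:
            total += delta * (err - 0.5 * delta)
    return total / len(y_true)

# hypothetical targets and predictions
y_true = [10.0, 20.0]
y_pred = [9.5, 22.0]
print(mse(y_true, y_pred), mae(y_true, y_pred), huber(y_true, y_pred))
```

MSE reacts strongly to the larger error, while MAE and Huber are less sensitive to it, which can matter when a target such as charges contains a few very large values.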
TRAINING PHASE/FIT THE MODEL¶
Add early stopping so that training halts when there is no improvement.
# reference https://keras.io/api/callbacks/early_stopping/
stop = EarlyStopping(monitor='val_loss', # validation_split 20%
mode='min',
patience=30,
verbose=1)
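The idea behind the `patience` setting can be sketched in plain Python. The function `stopping_epoch` and the loss values are hypothetical, and this is a simplified model of the logic rather than the Keras source: training stops once the monitored loss has failed to improve for `patience` consecutive epochs.

```python
def stopping_epoch(val_losses, patience):
    """Return the 0-based epoch at which training would stop, or None if it never stops."""
    best = float("inf")
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None

# hypothetical validation losses, one per epoch
losses = [5.0, 4.0, 3.0, 3.5, 3.2, 3.1, 2.9]
stopped_at = stopping_epoch(losses, patience=3)
never_stopped = stopping_epoch(losses, patience=4)
print(stopped_at, never_stopped)
```

A larger patience tolerates longer plateaus, at the cost of more epochs before training gives up.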
Here we set validation_split to 0.2, which holds out 20% of the training set for validation (an 80:20 split).
h = model_regr.fit(x_train, y_train,
validation_split=0.2, # fraction of the training data to be used in validation
epochs=100,
batch_size=1,
verbose=1,
callbacks=[stop])
Model Summary¶
# view summary
model_regr.summary()
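The parameter counts reported in the summary can be checked by hand: a Dense layer has one weight per input-unit pair plus one bias per unit, and Dropout adds none. The sketch below assumes 11 input features after one-hot encoding, which is a hypothetical value; substitute `X.shape[1]` from your own run.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

n_features = 11  # assumption: number of columns after one-hot encoding
layer_units = [n_features, 128, 1]  # the three Dense layers of the model above

counts = []
n_inputs = n_features
for units in layer_units:
    counts.append(dense_params(n_inputs, units))
    n_inputs = units  # the output of this layer feeds the next one

total = sum(counts)
print(counts, total)
```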
Visualization¶
h.history.keys()
#plotting
fig, axs = plt.subplots(1,2,
figsize=(15, 6),
gridspec_kw={'hspace': 0.5, 'wspace': 0.2})
(ax1, ax2) = axs
# MSE
ax1.plot(h.history['loss'], label='Train')
ax1.plot(h.history['val_loss'], label='Validation')
ax1.set_title('learning rate=' + str(0.01))
ax1.legend(loc="upper right")
ax1.set_xlabel("# of epochs")
ax1.set_ylabel("loss (MSE)")
#MAE
ax2.plot(h.history['mae'], label='Train')
ax2.plot(h.history['val_mae'], label='Validation')
ax2.set_title('learning rate=' + str(0.01))
ax2.legend(loc="upper right")
ax2.set_xlabel("# of epochs")
ax2.set_ylabel("MAE")
Evaluation¶
test_mse, test_mae = model_regr.evaluate(x_test, y_test, verbose = 1)
y_predict = model_regr.predict(x_test)
r2_score(y_test, y_predict)
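As a quick check on what the R² score measures, here is a hand-written version of the standard formula, 1 minus the ratio of the residual sum of squares to the total sum of squares. The function name and the sample values are hypothetical.

```python
def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# hypothetical targets and predictions
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.5, 5.0, 7.5, 9.0]
score = r2(actual, predicted)
print(score)
```

A score of 1 means perfect predictions, while a score of 0 means the model does no better than always predicting the mean of the targets.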
Predicted vs. Actual Charges¶
a = y_test.values.reshape(-1,1).flatten()
b = y_predict.flatten()
diff = (b - a)
sim_data={"Actual Charges":a, 'Predicted Charges':b, 'Difference':np.round(diff,2)}
sim_data=pd.DataFrame(sim_data)
# Showing first 5 rows
sim_data.head(5)
Visualization¶
# visualization of actual vs. predicted charges
plt.figure(figsize=(8, 6))
plt.scatter(y_test, y_predict, alpha=0.4, color = 'red')
plt.title("Actual Vs. Predicted Charges")
plt.xlabel("Actual Charges")
plt.ylabel("Predicted Charges")
GridSearchCV¶
Finding the optimal hyperparameter values.
Function For Designing Model¶
A function that creates and returns the Keras Sequential model (for use with the scikeras wrappers).
def design_model(features):
    # ANN model instance
    model_regr = Sequential()
    #### INPUT LAYER>>>>
    # first Dense layer: one unit per input feature (taken from the argument, not the global X)
    model_regr.add(Dense(units = features.shape[1], activation = 'relu'))
    #### HIDDEN LAYER1>>>>
    # There are a number of activation functions, such as softmax and sigmoid,
    # but ReLU (Rectified Linear Unit) is very effective in many applications, so we'll use it here.
    model_regr.add(Dense(128, activation = 'relu'))
    #### OUTPUT LAYER>>>>
    model_regr.add(Dense(1, activation = 'linear'))
    #### Optimizer
    # Keras offers many optimizers, such as SGD (stochastic gradient descent), Adam, and RMSprop.
    # Adam is a common default choice because it adapts the learning rate per parameter.
    opt = Adam(learning_rate = 0.01)
    # loss/cost
    # options include MSE, MAE, and Huber loss
    model_regr.compile(loss='mse', metrics=['mae'], optimizer=opt)
    return model_regr
Invoke Our Function And Pass The x_train Argument Then Save It In a Variable.¶
model_regr2 = design_model(x_train)
Training Phase/Fit The Model¶
model_regr2.fit(x_train, y_train,
validation_split=0.2,
verbose=1)
To use KerasRegressor, pass either a function that creates and returns a Keras Sequential model (such as the function above) or a model instance to the model argument when constructing the KerasRegressor class. Here we pass the model instance created above.
model = KerasRegressor(model = model_regr2)
Setting Up Hyperparameters¶
This is computationally expensive, so we will use a small grid here.
List of hyperparameters:
- the learning rate
- batch size
- number of epochs
- number of units per hidden layer
- activation functions.
param_grid = dict(
epochs = [32,64],
batch_size = [1,10])
grid = GridSearchCV(estimator=model,
param_grid=param_grid,
n_jobs=-1, # use all processor cores of our machine (faster!!)
scoring = 'r2',
return_train_score = True,
cv=3)
grid_result = grid.fit(x_train, y_train)
grid_result.best_score_ , grid_result.best_params_
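To get a feel for the cost of the search, the grid can be enumerated by hand. This sketch uses only the standard library and mirrors the `param_grid` and `cv=3` values above; the extra fit at the end assumes the default behavior of refitting the best configuration on the full training set.

```python
from itertools import product

param_grid = {"epochs": [32, 64], "batch_size": [1, 10]}
cv = 3  # number of cross-validation folds

# every combination of the listed values is one candidate configuration
keys = list(param_grid)
combos = [dict(zip(keys, values))
          for values in product(*(param_grid[k] for k in keys))]

# each candidate is trained once per fold, plus one final refit (assumed default)
n_fits = len(combos) * cv + 1

for combo in combos:
    print(combo)
print("total model fits:", n_fits)
```

Adding a third hyperparameter with two values would double the number of candidates, which is why the grid is kept small here.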