
Steps to Code Your First Neural Network Classifier with Keras

  • Nov 1, 2021
  • 8 min read

Keras is a free, open-source Python framework for building and analyzing deep learning models. It is powerful yet simple to use.

It wraps Theano and TensorFlow, two fast numerical computing libraries, and enables you to define and train neural network models in only a few lines of code.

In this tutorial, you'll learn how to use Keras to build your first deep learning neural network model in Python.






The following are the stages you'll go through in this tutorial:


  • Loading Data

  • Defining the Keras Model

  • Compiling the Keras Model

  • Fitting the Keras Model

  • Evaluating the Keras Model

  • Making Predictions


There are a few prerequisites for this Keras tutorial:

  1. Python 3 is installed and set up on your computer.

  2. You've installed and configured SciPy (including NumPy).

  3. You've installed and configured Keras as well as a backend (Theano or TensorFlow).


First, create an empty Python file and give it a relevant name, such as "myFirstClassifier.py".


Loading Data


The first step in this tutorial is to import the functions and classes we intend to use.

To load our dataset, we'll use the NumPy library, and to create our model, we'll utilize two Keras classes.


The necessary imports are given below.

#first neural network with keras tutorial
from numpy import loadtxt
from keras.models import Sequential
from keras.layers import Dense

Our dataset is now ready to be loaded.


A dataset of diabetic patients will be used in this Keras tutorial. This dataset comes from the UCI Machine Learning repository and is a standard machine learning dataset. It examines Pima Indians' medical records to determine whether they developed diabetes within five years.


It is therefore a binary classification problem (onset of diabetes as 1, or not as 0). All of each patient's input variables are numerical. This makes it excellent for our first neural network in Keras, since it can be used immediately with neural networks that expect numerical input and output values.


Download the dataset and place it in the same folder as your Python file.


Using the NumPy function loadtxt(), we can now load the file as a matrix of numbers.

Eight input variables and one output variable (the last column) are used. We'll learn a model that maps rows of input variables (X) to an output variable (y), which we often write as y = f(X).


The following is a summary of the variables:

Input variables (X):

  1. Number of times pregnant

  2. Plasma glucose concentration at 2 hours in an oral glucose tolerance test

  3. Diastolic blood pressure (mm Hg)

  4. Triceps skin fold thickness (mm)

  5. 2-Hour serum insulin (mu U/ml)

  6. Body mass index (weight in kg/(height in m)^2)

  7. Diabetes pedigree function

  8. Age (years)

Output Variables (y):

  1. Class variable (0 or 1)

We can divide the columns of data into input and output variables after the CSV file has been imported into memory.

The data will be kept as a two-dimensional array with rows as the first dimension and columns as the second dimension, e.g. [rows, columns].

By using the normal NumPy slice operator or ":" to choose subsets of columns, we can divide the array into two arrays. The slice 0:8 can be used to choose the first eight columns from index 0 to index 7. The output column (the 9th variable) can then be selected using index 8.



# load the dataset
dataset = loadtxt('diabetes.csv', delimiter=',')
# split into input (X) and output (y) variables
X = dataset[:,0:8]
y = dataset[:,8]

Our neural network model is now ready to be defined.


Note that the dataset comprises nine columns, and the range 0:8 selects columns 0 through 7, stopping before index 8. If this is new to you, it is worth reading up on NumPy array slicing and ranges.
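As a self-contained illustration of the same slicing, here is a sketch on a small made-up array (3 rows, 9 columns, mimicking the diabetes dataset's column layout):

```python
import numpy as np

# A small made-up array standing in for the real dataset:
# 3 rows, 9 columns (8 inputs + 1 output).
data = np.arange(27).reshape(3, 9)

X = data[:, 0:8]   # columns 0..7 -> the eight input variables
y = data[:, 8]     # column 8     -> the output variable

print(X.shape)  # (3, 8)
print(y.shape)  # (3,)
```

The same two slice expressions work unchanged on the real 768-row dataset.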


Defining the Keras Model


In Keras, a model is defined as a series of layers.


We start with a Sequential model and gradually add layers until we are satisfied with our network design.

The first step is to ensure that the input layer has the correct number of input features. This is done by setting the input_dim argument to 8 for the 8 input variables when creating the first layer.


How do we determine the number and type of layers?

This is a difficult question to answer. We may use heuristics, and the best network topology is often discovered by trial and error. In general, you'll need a network big enough to capture the structure of the problem.

In this example, we'll utilize a three-layer fully connected network topology.


The Dense class is used to define fully connected layers. The number of neurons or nodes in the layer is specified as the first argument, and the activation function as the second argument.

On the first two layers, we'll use the rectified linear unit (ReLU) activation function, and in the output layer, we'll utilize the Sigmoid function.


Sigmoid and Tanh activation functions were formerly the favored activation functions for all layers. These days, the ReLU activation function is used to improve performance. We use a sigmoid on the output layer to guarantee that our network output is between 0 and 1, making it simple to map to either a probability of class 1 or, with a default threshold of 0.5, a hard classification of either class.
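To make that thresholding concrete, here is a small NumPy sketch (not Keras code) of how a sigmoid output maps to a hard class label; the input values are made up for illustration:

```python
import numpy as np

def sigmoid(z):
    # Squashes any real number into the range (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

# Made-up pre-activation values and the resulting class labels.
for z in (-2.0, 0.0, 2.0):
    p = sigmoid(z)        # interpreted as the probability of class 1
    label = int(p > 0.5)  # hard classification at the 0.5 threshold
    print('z=%5.1f  p=%.3f  class=%d' % (z, p, label))
```

The same `> 0.5` comparison is what turns the network's probabilities into class predictions later in this tutorial.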


In summary:

  • The input_dim=8 argument specifies that the model expects rows of data with eight variables.

  • The relu activation function is used in the first hidden layer, which comprises 12 nodes.

  • The relu activation function is used in the second hidden layer, which includes 8 nodes.

  • The sigmoid activation function is used in the output layer, which contains just one node.


# define the keras model
model = Sequential()
model.add(Dense(12, input_dim=8, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

The line of code that creates the first Dense layer does two tasks: it defines the input or visible layer as well as the first hidden layer.


Compile the Keras Model


The model is compiled using efficient numerical libraries (the so-called backend) such as Theano or TensorFlow. The backend automatically selects the optimum method to represent the network for training and prediction on your hardware, whether it's a CPU, GPU, or distributed architecture.


We must supply some more parameters necessary for training the network while compiling. Remember that when we train a network, we're looking for the optimum collection of weights to translate our dataset's inputs to outputs.


We must specify the loss function used to evaluate a set of weights, the optimizer used to search through different weights for the network, and any optional metrics we would like to collect and report during training.


We'll use cross entropy as the loss argument in this case. This loss is specified as 'binary_crossentropy' in Keras and is used for binary classification problems.


The optimizer will be "adam," an effective stochastic gradient descent technique. Because it automatically adjusts itself and produces excellent results in a broad variety of situations, this is a popular variation of gradient descent.
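As a rough sketch of what Adam does under the hood (this is the textbook update rule, not Keras's internal implementation; the weight, gradient, and step count below are made up for illustration):

```python
import numpy as np

# Commonly cited Adam defaults.
lr, beta1, beta2, eps = 0.001, 0.9, 0.999, 1e-8

w = 0.5       # a single made-up weight
m = v = 0.0   # running first and second moment estimates
g = 0.2       # made-up gradient of the loss w.r.t. w

for t in range(1, 4):                # three update steps
    m = beta1 * m + (1 - beta1) * g  # momentum-like average of gradients
    v = beta2 * v + (1 - beta2) * g**2
    m_hat = m / (1 - beta1**t)       # bias correction for the warm-up
    v_hat = v / (1 - beta2**t)
    w -= lr * m_hat / (np.sqrt(v_hat) + eps)

print(round(w, 4))  # each step moves w by roughly the learning rate
```

The per-parameter scaling by the second moment is what lets Adam "adjust itself" without hand-tuning the learning rate for every weight.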


Finally, since this is a classification issue, we'll collect and report classification accuracy, which will be determined by the metrics input.



# compile the keras model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])


Fit the Keras Model


Now it's time to fit the model to some real data.


By using the fit() method on the model, we can train or fit our model on our loaded data.

Training is conducted in epochs, with each epoch divided into batches.


  • Epoch: A single loop across the whole training dataset.

  • Batch: One or more samples examined by the model during an epoch before weights are changed.


One epoch is made up of one or more batches, depending on the batch size selected, and the model may be used for many epochs. 


The training process will go over the dataset for a specified number of iterations called epochs, which we must provide using the epochs argument. The batch size, specified via the batch_size argument, is the number of dataset rows that are evaluated before the model weights are updated within each epoch.

We'll run for a modest number of epochs (150) with a batch size of 10 for this task.


These combinations can be determined by trial and error. You want to train your model enough such that it learns a decent (or good enough) mapping of input data rows to output classification. The model will always have some inaccuracy, but for a particular model setup, the amount of error will eventually level off. Model convergence is the term for this.
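Some back-of-the-envelope arithmetic for the settings above (assuming the Pima dataset's 768 rows) shows how many weight updates this training run performs:

```python
import math

rows, batch_size, epochs = 768, 10, 150

# The last batch may be smaller than batch_size, hence the ceiling.
batches_per_epoch = math.ceil(rows / batch_size)
total_updates = batches_per_epoch * epochs

print(batches_per_epoch)  # 77
print(total_updates)      # 11550
```

So each epoch updates the weights 77 times, and the full run performs 11,550 updates.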



# fit the keras model on the dataset
model.fit(X, y, epochs=150, batch_size=10)

This is where all the computation happens. It runs on your CPU, or on a GPU if you have configured one.


Evaluate the Keras Model


We've trained our neural network on the complete dataset and can now assess its performance on the same dataset.


This only tells us how well we fit the dataset (the train accuracy), not how well the model would perform on fresh data. We did it this way for simplicity, but ideally you would split your data into separate train and test datasets for model training and evaluation.
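One way to make such a split, sketched in plain NumPy (scikit-learn's train_test_split is a common alternative; the 80/20 ratio and seed here are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(seed=7)

n = 768                   # rows in the diabetes dataset
idx = rng.permutation(n)  # shuffle the row indices
split = int(0.8 * n)      # 80% train / 20% test

train_idx, test_idx = idx[:split], idx[split:]
# With the arrays from earlier, you would then index:
# X_train, y_train = X[train_idx], y[train_idx]
# X_test,  y_test  = X[test_idx],  y[test_idx]

print(len(train_idx), len(test_idx))  # 614 154
```

Shuffling before splitting matters: if the file happens to be ordered by class, an unshuffled split would give the model a skewed view of the data.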


You can use the evaluate() method on your model to evaluate your model on your training dataset, giving it the same input and output that you used to train it.


This will make a prediction for each input and output combination and gather scores, including the average loss and whatever metrics you've set up, including accuracy.

The evaluate() method will return a pair of values in a list. The first is the model's loss on the dataset, and the second is the model's accuracy on the dataset. We're only concerned with reporting the accuracy, so the loss value will be ignored.


# evaluate the keras model
_, accuracy = model.evaluate(X, y)
print('Accuracy: %.2f' % (accuracy*100))

If Python is on your PATH, you can run this program from the command prompt:


python myFirstClassifier.py


When you run this command, you should see a message with the loss and accuracy for each of the 150 epochs, followed by the final assessment of the trained model on the training dataset.


768/768 [==============================] - 0s 63us/step - loss: 0.4817 - acc: 0.7708
Epoch 147/150
768/768 [==============================] - 0s 63us/step - loss: 0.4764 - acc: 0.7747
Epoch 148/150
768/768 [==============================] - 0s 63us/step - loss: 0.4737 - acc: 0.7682
Epoch 149/150
768/768 [==============================] - 0s 64us/step - loss: 0.4730 - acc: 0.7747
Epoch 150/150
768/768 [==============================] - 0s 63us/step - loss: 0.4754 - acc: 0.7799
768/768 [==============================] - 0s 38us/step
Accuracy: 76.56


On my CPU-powered workstation, it takes around 10 seconds to complete.

We would like the loss to be zero and the accuracy to be 100%. Except for the simplest machine learning problems, this is not achievable; instead, your model will always have some error. For a given dataset, the aim is to choose a model configuration and training configuration that achieve the lowest loss and highest accuracy. One common way to improve results is to increase the number of epochs, but note that more epochs require more time and computational power.

Make Predictions


After training our model, how do we use it to predict and classify data?


We can modify the previous example and use it to make predictions on the training dataset, as if it were a fresh dataset we had never seen before.


Making predictions is as simple as calling the model's predict() method. We're using a sigmoid activation function on the output layer, so the predictions will be probabilities between 0 and 1. By rounding them, we can simply transform them into crisp binary predictions for this classification task.


Consider the following example:


# make class predictions with the model
predictions = (model.predict(X) > 0.5).astype(int)
for i in range(5):
    print('%s => predicted %d (expected %d)' % (X[i].tolist(), predictions[i][0], y[i]))

Once the model is fit, this code makes predictions for every case in the dataset, then prints the input row, the predicted class, and the expected class for the first 5 examples.


We can observe that the majority of the rows are predicted accurately. Based on the model's estimated performance in the preceding section, we would expect around 77% of the rows to be predicted correctly.


[6.0, 148.0, 72.0, 35.0, 0.0, 33.6, 0.627, 50.0] => predicted 0 (expected 1)

[1.0, 85.0, 66.0, 29.0, 0.0, 26.6, 0.351, 31.0] => predicted 0 (expected 0)

[8.0, 183.0, 64.0, 0.0, 0.0, 23.3, 0.672, 32.0] => predicted 1 (expected 1)

[1.0, 89.0, 66.0, 23.0, 94.0, 28.1, 0.167, 21.0] => predicted 0 (expected 0)

[0.0, 137.0, 40.0, 35.0, 168.0, 43.1, 2.288, 33.0] => predicted 1 (expected 1)


Bottom Line

In this tutorial, you have taken a major step in your machine learning journey. Hold on, it does not end there. There are many more things to learn before you step up the ladder toward becoming an ML expert. Do you need assistance today? Our experts are here to give you the resources you need in the form of training, undertaking your projects, and consultancy.


