35. Neural Networks with Keras#

import numpy as np          # advanced math library
import matplotlib.pyplot as plt   
import random            # for generating random numbers

from keras.datasets import mnist  # MNIST dataset is included in Keras  
from keras.models import Sequential # Model type to be used

from keras.layers import Dense, Dropout, Activation # Types of layers to be used in our model
from keras.utils import to_categorical       # for one-hot encoding the labels

from keras.preprocessing.image import ImageDataGenerator
from keras.layers import Conv2D, MaxPooling2D, ZeroPadding2D, GlobalAveragePooling2D, Flatten
from keras.layers import BatchNormalization
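
These imports assume TensorFlow is installed in the environment, since this keras package is the high-level API that ships with TensorFlow; if TensorFlow is missing, the imports fail with a ModuleNotFoundError, so install it first (for example, pip install tensorflow).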

First, we’ll use Keras to build networks similar to the ones we saw with sklearn. It will be a bit more complex, and we’re using a different version of the digits data (larger images), but for now we are still only learning the predictions, not the representation. On Friday, we’ll learn how to add new layers that transform the data and learn the representation at the same time.

35.1. Preparing data for deep learning#

The MNIST data is split between 60,000 training images and 10,000 test images, each 28 x 28 pixels. It’s related to the digits data that we have seen before.

We will load the data and look at a random sample.

(X_train, y_train), (X_test, y_test) = mnist.load_data()


plt.rcParams['figure.figsize'] = (9,9) # Make the figures a bit bigger

for i in range(9):
  plt.subplot(3,3,i+1)
  num = random.randint(0, len(X_train) - 1)  # pick a random training image (randint is inclusive on both ends)
  plt.imshow(X_train[num], cmap='gray', interpolation='none')
  plt.title("Class {}".format(y_train[num]))

plt.tight_layout()

Next, we need to transform the data into the form the network expects: reshape each 28 x 28 image matrix into a 784-element vector, convert the values to floats, and normalize them to the range 0 to 1.

28*28
784

Now that we know the length, we can transform.

X_train = X_train.reshape(60000, 784) # 60,000 training samples
X_test = X_test.reshape(10000, 784)  #  10,000  test samples

X_train = X_train.astype('float32')  # change integers to 32-bit floating point numbers
X_test = X_test.astype('float32')

X_train /= 255            # scale each pixel value from the 0-255 range down to the 0-1 range
X_test /= 255

print("Training matrix shape", X_train.shape)
print("Testing matrix shape", X_test.shape)

We will use one-hot encoding for the target variable, since our neural net’s output layer is actually a probability distribution over the 10 digits, not a single value from 0 to 9.

nb_classes = 10 # number of unique digits

Y_train = to_categorical(y_train, nb_classes)
Y_test = to_categorical(y_test, nb_classes)
y_train[0]
Y_train[0]
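
The two cells above should show that y_train[0] is a single integer label, while Y_train[0] is a vector of length 10 with a 1 in that label's position and 0 everywhere else. For intuition, here is a small sketch of what to_categorical does on a toy list of labels (independent of the MNIST arrays above):

from keras.utils import to_categorical

toy_labels = [0, 3, 9]                    # three toy integer labels
print(to_categorical(toy_labels, 10))     # each row is a length-10 one-hot vector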

35.2. Building a neural network in Keras#

We will start with a network similar to what we have seen in sklearn. It will have a number of layers and, when predicting, will pass data from one layer to the next sequentially.

model = Sequential()

Next we add layers. In Keras, what we saw before as a single unit (the connections plus the activation) is treated as two separate layers: a Dense layer for the weighted connections and an Activation layer for the nonlinearity. The size of the Dense layer is the number of neurons.

model.add(Dense(512, input_shape=(784,)))
model.add(Activation('relu'))
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dense(10))
model.add(Activation('softmax'))
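
The same three layers can also be written more compactly by passing the activation name directly to Dense; this is an equivalent sketch (using the Sequential and Dense imports from the top of the notebook), not a change to the model built above:

model_compact = Sequential()
model_compact.add(Dense(512, activation='relu', input_shape=(784,)))
model_compact.add(Dense(512, activation='relu'))
model_compact.add(Dense(10, activation='softmax'))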
model.summary()

Keras also has optimization parameters, like we saw before, but here we set them with the compile method: the loss function, the optimizer, and the metrics to report.

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, Y_train,
     batch_size=128, epochs=5,
     verbose=1)
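
Here batch_size=128 means the weights are updated after each batch of 128 training examples, and epochs=5 means the model makes five full passes over the training data; verbose=1 prints a progress bar for each epoch.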
score = model.evaluate(X_test, Y_test)
score
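
evaluate returns the loss followed by the metrics we listed in compile (here just accuracy), so score can be unpacked like this, assuming the cells above have run:

test_loss, test_accuracy = score           # score is [loss, accuracy]
print("test accuracy: {:.3f}".format(test_accuracy))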

35.3. What kind of mistakes did it make?#

predicted_prob = model.predict(X_test)
predicted_classes = np.argmax(predicted_prob,axis=1)


# Check which items we got right / wrong
correct_indices = np.nonzero(predicted_classes == y_test)[0]

incorrect_indices = np.nonzero(predicted_classes != y_test)[0]


plt.figure()
for i, correct in enumerate(correct_indices[:9]):
  plt.subplot(3,3,i+1)
  plt.imshow(X_test[correct].reshape(28,28), cmap='gray', interpolation='none')
  plt.title("Predicted {}, Class {}".format(predicted_classes[correct], y_test[correct]))

plt.tight_layout()

plt.figure()
for i, incorrect in enumerate(incorrect_indices[:9]):
  plt.subplot(3,3,i+1)
  plt.imshow(X_test[incorrect].reshape(28,28), cmap='gray', interpolation='none')
  plt.title("Predicted {}, Class {}".format(predicted_classes[incorrect], y_test[incorrect]))

plt.tight_layout()
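
To see the pattern of mistakes beyond a handful of example images, a confusion matrix is helpful; this is a small sketch using sklearn (which we have used before) and assumes y_test and predicted_classes from the cell above:

from sklearn.metrics import confusion_matrix

# rows are the true digit, columns are the predicted digit
print(confusion_matrix(y_test, predicted_classes))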

35.4. Dropout for less overfitting#

Dropout randomly sets a fraction of the neurons’ outputs to zero on each training update (20% here), which reduces overfitting; at prediction time all of the neurons are used.
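
To see what a Dropout layer does on its own, here is a small sketch that applies Dropout(0.2) to a vector of ones in training mode; this is an illustration only, not part of the model below:

import numpy as np
from keras.layers import Dropout

drop_demo = Dropout(0.2)
ones = np.ones((1, 10), dtype='float32')
# in training mode roughly 20% of the entries become 0 and the rest are scaled up by 1/0.8
print(drop_demo(ones, training=True))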

model_dropout = Sequential()

model_dropout.add(Dense(512, input_shape=(784,)))
model_dropout.add(Activation('relu'))
model_dropout.add(Dropout(0.2))

model_dropout.add(Dense(512))
model_dropout.add(Activation('relu'))
model_dropout.add(Dropout(0.2))

model_dropout.add(Dense(10))
model_dropout.add(Activation('softmax'))


model_dropout.summary()

Again, we set the optimization parameters with compile, and then we can fit.

model_dropout.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model_dropout.fit(X_train, Y_train,
     batch_size=128, epochs=5,
     verbose=1)
score = model_dropout.evaluate(X_test, Y_test)
score

Now we see that the gap between the training and test performance is smaller and the test performance is higher than before.

35.5. Questions#

35.5.1. ‟What is the difference between the different activation methods?”#