35. Neural Networks with Keras
import numpy as np # advanced math library
import matplotlib.pyplot as plt
import random # for generating random numbers
from keras.datasets import mnist # MNIST dataset is included in Keras
from keras.models import Sequential # Model type to be used
from keras.layers import Dense, Dropout, Activation # Types of layers to be used in our model
from keras.utils import np_utils # NumPy related tools
from keras.preprocessing.image import ImageDataGenerator
from keras.layers import Conv2D, MaxPooling2D, ZeroPadding2D, GlobalAveragePooling2D, Flatten
from keras.layers import BatchNormalization
First, we’ll use keras to build networks similar to the ones we saw with sklearn. It will be more complex, and we’re using a different version of the digits data (larger images), but for now this will still be just learning the predictions, not the representation. On Friday, we’ll learn how to add new layers that transform the data and learn the representation at the same time.
35.1. Preparing data for deep learning
The MNIST data is split between 60,000 28 x 28 pixel training images and 10,000 28 x 28 pixel test images. It’s related to the digits data that we have seen before.
We will load the data and look at a random sample.
(X_train, y_train), (X_test, y_test) = mnist.load_data()
plt.rcParams['figure.figsize'] = (9,9) # Make the figures a bit bigger
for i in range(9):
    plt.subplot(3,3,i+1)
    num = random.randrange(len(X_train)) # pick a random valid index into the training set
    plt.imshow(X_train[num], cmap='gray', interpolation='none')
    plt.title("Class {}".format(y_train[num]))
plt.tight_layout()
Next, we need to make the data compatible with the network’s input by reshaping each 28 x 28 matrix into a vector, then converting to floats and normalizing.
28*28
784
Now that we know the length, we can transform.
X_train = X_train.reshape(60000, 784) # 60,000 training samples
X_test = X_test.reshape(10000, 784) # 10,000 test samples
X_train = X_train.astype('float32') # change integers to 32-bit floating point numbers
X_test = X_test.astype('float32')
X_train /= 255 # scale every pixel value from the range [0, 255] to [0, 1]
X_test /= 255
print("Training matrix shape", X_train.shape)
print("Testing matrix shape", X_test.shape)
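To see what the reshape and scaling steps do, here is a minimal NumPy sketch on two made-up 2 x 2 "images". The array `toy_images` and its values are illustrative assumptions, not MNIST data; the same operations apply to the 28 x 28 images.

```python
import numpy as np

# Two made-up 2 x 2 "images" with integer pixel values in [0, 255]
# (stand-ins for the 28 x 28 MNIST images; not real data).
toy_images = np.array([[[0, 255],
                        [51, 102]],
                       [[255, 255],
                        [0, 204]]], dtype=np.uint8)

# Flatten each image row by row: shape (2, 2, 2) -> (2, 4)
toy_flat = toy_images.reshape(2, 4)

# Convert to floats first, then scale pixel values to [0, 1]
toy_scaled = toy_flat.astype('float32') / 255

print(toy_flat)
print(toy_scaled)
```

Each row of `toy_flat` is one image read left to right, top to bottom, which is the same layout the 784-long MNIST vectors have.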
We will use one-hot encoding for the target variable, since our neural net’s output layer is actually a probability distribution over the 10 digits, not a value from 0 to 9.
nb_classes = 10 # number of unique digits
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)
y_train[0]
Y_train[0]
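As a rough illustration of what one-hot encoding produces, the sketch below builds the encoding by hand with NumPy. The helper `one_hot` and the labels in `toy_labels` are hypothetical and only for illustration; this is not how Keras implements it internally.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) array with a single 1 per row."""
    labels = np.asarray(labels)
    encoded = np.zeros((labels.size, num_classes), dtype='float32')
    # Put a 1 in the column given by each label
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

toy_labels = [5, 0, 4]                 # made-up digit labels
toy_encoded = one_hot(toy_labels, 10)
print(toy_encoded[0])                  # the single 1 sits at index 5
```

Taking `np.argmax` along each row recovers the original integer label, which is how class predictions are read off the network’s output later on.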
35.2. Building a neural network in Keras
We will start with a network similar to what we have seen in sklearn. It will have a number of layers and, when predicting, will pass data from one layer to the next sequentially.
model = Sequential()
Next we add layers. In keras, what we saw before as one neuron (connections + activation) is treated as two separate layers, where the size of the layer is the number of neurons.
model.add(Dense(512, input_shape=(784,)))
model.add(Activation('relu'))
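The split into a Dense layer followed by an Activation layer can be sketched in plain NumPy. This is a simplified illustration, not the Keras implementation; the sizes (4 inputs, 3 neurons) and the random weights are assumptions chosen for readability.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical small sizes for readability: 4 inputs feeding 3 neurons
# (the model in the text uses 784 inputs and 512 neurons).
x = rng.normal(size=(2, 4))        # a batch of 2 input vectors
W = rng.normal(size=(4, 3))        # one weight per input-neuron connection
b = np.zeros(3)                    # one bias per neuron

# Step 1, the "Dense" part: weighted sum of the inputs plus a bias
z = x @ W + b

# Step 2, the "Activation" part: ReLU replaces negative values with 0
a = np.maximum(z, 0)

print(z.shape, a.shape)
```

The activation does not change the shape: each of the 3 neurons still produces one value per input, with negatives clipped to zero.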
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dense(10))
model.add(Activation('softmax'))
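The softmax activation on the last layer is what turns the 10 raw outputs into a probability distribution. A minimal NumPy sketch, with a hypothetical helper `softmax` and made-up scores over 3 classes instead of 10:

```python
import numpy as np

def softmax(scores):
    """Turn each row of raw scores into probabilities that sum to 1."""
    # Subtracting the row maximum does not change the result
    # but avoids overflow in exp for large scores.
    shifted = scores - scores.max(axis=1, keepdims=True)
    exp_scores = np.exp(shifted)
    return exp_scores / exp_scores.sum(axis=1, keepdims=True)

toy_scores = np.array([[2.0, 1.0, 0.1],    # made-up raw outputs
                       [0.0, 0.0, 0.0]])   # equal scores
toy_probs = softmax(toy_scores)
print(toy_probs)
```

The largest score gets the largest probability, and equal scores give a uniform distribution.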
model.summary()
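The parameter counts that a model summary reports for Dense layers can be checked by hand. The sketch below assumes each Dense layer has one weight per input-neuron connection plus one bias per neuron (the Keras default) and that Activation layers add no parameters; `dense_params` is a hypothetical helper.

```python
def dense_params(n_inputs, n_neurons):
    """Weights (one per input-neuron pair) plus one bias per neuron."""
    return n_inputs * n_neurons + n_neurons

# Input size followed by the size of each Dense layer in the model above
layer_sizes = [784, 512, 512, 10]

per_layer = [dense_params(n_in, n_out)
             for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
total_params = sum(per_layer)

print(per_layer, total_params)
```

Most of the parameters sit in the first layer, because it connects all 784 pixels to all 512 neurons.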
Keras also has parameters for the optimization, like we saw before, but we set those with the compile method.
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
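To give a sense of what the categorical cross-entropy loss measures, here is a simplified NumPy sketch. It assumes one-hot targets and predicted probabilities; the helper name, the `eps` clipping value, and the toy arrays are illustrative assumptions, and the Keras implementation differs in its details.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean over samples of -log(probability assigned to the true class)."""
    y_pred = np.clip(y_pred, eps, 1.0)   # avoid log(0)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=1)))

# Made-up one-hot targets for 2 samples and 3 classes
toy_targets = np.array([[0., 1., 0.],
                        [1., 0., 0.]])

# Two made-up sets of predictions: one confident and right, one unsure
confident = np.array([[0.05, 0.9, 0.05],
                      [0.8, 0.1, 0.1]])
unsure = np.array([[0.4, 0.3, 0.3],
                   [0.3, 0.4, 0.3]])

loss_confident = categorical_crossentropy(toy_targets, confident)
loss_unsure = categorical_crossentropy(toy_targets, unsure)
print(loss_confident, loss_unsure)
```

The loss is lower when more probability is placed on the correct class, which is what the optimizer pushes toward during training.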
model.fit(X_train, Y_train,
          batch_size=128, epochs=5,
          verbose=1)
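The `batch_size` and `epochs` settings determine how many weight updates happen during training. A small arithmetic sketch, assuming the final partial batch of each epoch is still used (variable names are illustrative):

```python
import math

n_samples = 60000    # training images
batch_size = 128
epochs = 5

# Number of batches (weight updates) needed to see every sample once
steps_per_epoch = math.ceil(n_samples / batch_size)

# The last batch holds whatever is left over
last_batch_size = n_samples - (steps_per_epoch - 1) * batch_size

total_updates = steps_per_epoch * epochs
print(steps_per_epoch, last_batch_size, total_updates)
```

With these settings the progress output should count up to 469 steps in each epoch.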
score = model.evaluate(X_test, Y_test)
score
35.3. What kind of mistakes did it make?
predicted_prob = model.predict(X_test)
predicted_classes = np.argmax(predicted_prob,axis=1)
# Check which items we got right / wrong
correct_indices = np.nonzero(predicted_classes == y_test)[0]
incorrect_indices = np.nonzero(predicted_classes != y_test)[0]
plt.figure()
for i, correct in enumerate(correct_indices[:9]):
    plt.subplot(3,3,i+1)
    plt.imshow(X_test[correct].reshape(28,28), cmap='gray', interpolation='none')
    plt.title("Predicted {}, Class {}".format(predicted_classes[correct], y_test[correct]))
plt.tight_layout()
plt.figure()
for i, incorrect in enumerate(incorrect_indices[:9]):
    plt.subplot(3,3,i+1)
    plt.imshow(X_test[incorrect].reshape(28,28), cmap='gray', interpolation='none')
    plt.title("Predicted {}, Class {}".format(predicted_classes[incorrect], y_test[incorrect]))
plt.tight_layout()
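Besides looking at individual images, one way to summarize which digits get confused with which is to count (true class, predicted class) pairs. The sketch below uses a hypothetical helper and made-up labels for 3 classes; with the real data you would pass in `y_test` and `predicted_classes` with 10 classes.

```python
import numpy as np

def count_confusions(true_labels, predicted_labels, num_classes):
    """Rows are true classes, columns are predicted classes."""
    counts = np.zeros((num_classes, num_classes), dtype=int)
    # add.at accumulates correctly even when a pair appears more than once
    np.add.at(counts, (true_labels, predicted_labels), 1)
    return counts

# Made-up labels, for illustration only
toy_actual = np.array([0, 1, 2, 2, 1, 0, 2])
toy_guess = np.array([0, 2, 2, 2, 1, 0, 1])

toy_counts = count_confusions(toy_actual, toy_guess, 3)
n_mistakes = int(toy_counts.sum() - np.trace(toy_counts))
print(toy_counts)
print(n_mistakes)
```

Correct predictions land on the diagonal, so large off-diagonal entries point to the pairs of classes the model mixes up most.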
35.4. Dropout for less overfitting
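Dropout randomly sets a fraction of a layer’s outputs to zero on each training step (20% in the model below), so the network cannot rely too heavily on any single neuron. Dropout is only active during training; at prediction time all neurons are used.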
model_dropout = Sequential()
model_dropout.add(Dense(512, input_shape=(784,)))
model_dropout.add(Activation('relu'))
model_dropout.add(Dropout(0.2))
model_dropout.add(Dense(512))
model_dropout.add(Activation('relu'))
model_dropout.add(Dropout(0.2))
model_dropout.add(Dense(10))
model_dropout.add(Activation('softmax'))
model_dropout.summary()
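What a dropout layer does during training can be sketched in NumPy. This is a simplified illustration under stated assumptions, not the Keras code: the helper `dropout_train` is hypothetical, the input is an array of ones, and kept values are scaled up by 1 / (1 - rate) so that the expected output matches the input.

```python
import numpy as np

def dropout_train(activations, rate, rng):
    """Zero out roughly `rate` of the values and rescale the rest."""
    keep_mask = rng.random(activations.shape) >= rate   # True where a value is kept
    return activations * keep_mask / (1.0 - rate)

rng = np.random.default_rng(0)
toy_activations = np.ones((1000, 512))      # made-up layer outputs

dropped = dropout_train(toy_activations, 0.2, rng)
zero_fraction = float(np.mean(dropped == 0))
print(zero_fraction, dropped.mean())
```

About 20% of the values end up at zero, and the rescaling keeps the average output close to its original value of 1.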
Again, we set the optimization parameters and then we can fit.
model_dropout.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model_dropout.fit(X_train, Y_train,
                  batch_size=128, epochs=5,
                  verbose=1)
score = model_dropout.evaluate(X_test, Y_test)
score
Now we see that the gap between the training and test performance is smaller, and the test performance is higher.