Materials for Applied Data Science profile course INFOMDA2 *Battling the curse of dimensionality*.

- 1 Introduction
- 2 Take-home exercises: deep feed-forward neural network
- 3 Lab exercises: convolutional neural network

In this practical, we will create a feed-forward neural network as well as a convolutional neural network to analyze the famous MNIST dataset.

```
library(tidyverse)
library(keras)
```

In this section, we will develop a deep feed-forward neural network for MNIST.

**1. Load the built-in MNIST dataset by running the following code.
Then, describe the structure and contents of the mnist object.**

```
mnist <- dataset_mnist()
```
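If you are unsure how to explore the object, `str()` gives a compact overview of its nested list structure (a sketch; the dimensions in the comments assume the standard MNIST download):

```
# compact overview of the nested train/test lists
str(mnist)
dim(mnist$train$x)    # 60000 x 28 x 28 array of pixel intensities (0-255)
length(mnist$train$y) # 60000 digit labels (0-9)
dim(mnist$test$x)     # 10000 x 28 x 28
```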

Plotting is very important when working with image data. We have defined a convenient plotting function for you.

**2. Use the plot_img() function below to plot the first training
image. The img parameter has to be a matrix with dimensions
(28, 28).** NB: indexing in 3-dimensional arrays works the same as
indexing in matrices, but you need an extra comma to select a 2-dimensional slice, e.g.

`x[i, , ]`

```
plot_img <- function(img, col = gray.colors(255, start = 1, end = 0), ...) {
image(t(img), asp = 1, ylim = c(1.1, -0.1), col = col, bty = "n", axes = FALSE, ...)
}
```
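For example, assuming `mnist` is loaded as in step 1, the first training image can be plotted by selecting a 28 × 28 slice; extra arguments such as `main` are forwarded to `image()` through `...`:

```
# first training image, with its label as the plot title
plot_img(mnist$train$x[1, , ], main = mnist$train$y[1])
```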

It is usually a good idea to normalize your features to have a manageable, standard range before entering them in neural networks.

**3. As a preprocessing step, ensure the brightness values of the images
in the training and test set are in the range [0, 1].**
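A minimal way to do this, assuming the raw pixel values range from 0 to 255 (as in the standard MNIST download), is to divide by the maximum brightness:

```
# rescale brightness values from 0-255 to 0-1
mnist$train$x <- mnist$train$x / 255
mnist$test$x  <- mnist$test$x / 255
range(mnist$train$x) # should now be 0 1
```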

The simplest model is a multinomial logistic regression model, which has no hidden layers and 10 outputs (one for each digit, 0-9). That model is shown below.

**4. Display a summary of the multinomial model using the summary()
function. Describe why this model has 7850 parameters.**

```
multinom <-
  keras_model_sequential(input_shape = c(28, 28)) %>% # initialize a sequential model
  layer_flatten() %>%                                 # flatten 28*28 matrix into a single vector
  layer_dense(10, activation = "softmax")             # softmax output == logistic regression for each of 10 outputs
multinom$compile(
  loss = "sparse_categorical_crossentropy", # loss function for multinomial outcome
  optimizer = "adam",                       # we use this optimizer because it works well
  metrics = list("accuracy")                # we want to know training accuracy in the end
)
```

**5. Train the model for 5 epochs using the code below. What accuracy do
we obtain in the validation set?** (NB: the multinom object is changed
“in-place”, which means you don’t have to assign it to another variable)

```
multinom %>% fit(x = mnist$train$x, y = mnist$train$y, epochs = 5, validation_split = 0.2, verbose = 1)
```

**6. Train the model for another 5 epochs. What accuracy do we obtain in
the validation set?**

**7. Create and compile a feed-forward neural network with the following
properties. Ensure that the model has 50890 parameters.**

- sequential model
- flatten layer
- dense layer with 64 hidden units and “relu” activation function
- dense output layer with 10 units and softmax activation function

You may reuse code from the multinomial model
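One possible implementation, reusing the multinomial code and adding a hidden layer (the object name `ffnn` is our choice); the per-layer parameter counts in the comments explain the 50890 total:

```
# feed-forward network: 784 inputs -> 64 hidden units -> 10 outputs
ffnn <-
  keras_model_sequential(input_shape = c(28, 28)) %>%
  layer_flatten() %>%                      # 28*28 = 784-element input vector
  layer_dense(64, activation = "relu") %>% # (784 + 1) * 64 = 50240 parameters
  layer_dense(10, activation = "softmax")  # (64 + 1) * 10  = 650 parameters
ffnn %>% compile(
  loss = "sparse_categorical_crossentropy",
  optimizer = "adam",
  metrics = c("accuracy")
)
# 50240 + 650 = 50890 parameters in total
```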

**8. Train the model for 10 epochs. What do you see in terms of
validation accuracy, also compared to the multinomial model?**

**9. Create predictions for the test data using the two trained models
(using the function below). Create a confusion matrix and compute test
accuracy for these two models. Write down any observations you have.**

```
class_predict <- function(model, x_train) predict(model, x = x_train) %>% apply(1, which.max) - 1
```
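A usage sketch, assuming your feed-forward model from the previous exercise is called `ffnn` (that name is an assumption):

```
# predicted digit classes for the test set
preds_multinom <- class_predict(multinom, mnist$test$x)
preds_ffnn     <- class_predict(ffnn, mnist$test$x)

# confusion matrix and test accuracy for the multinomial model
cmat <- table(true = mnist$test$y, predicted = preds_multinom)
sum(diag(cmat)) / sum(cmat)
```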

**10. OPTIONAL: if you have time, create and estimate (10 epochs) a deep
feed-forward model with the following properties. Compare this model to
the previous models on the test data.**

- sequential model
- flatten layer
- dense layer with 128 hidden units and “relu” activation function
- dense layer with 64 hidden units and “relu” activation function
- dense output layer with 10 units and softmax activation function
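A sketch of this architecture, following the same pattern as the earlier models (the object name `dnn` is our choice):

```
# deep feed-forward network with two hidden layers
dnn <-
  keras_model_sequential(input_shape = c(28, 28)) %>%
  layer_flatten() %>%
  layer_dense(128, activation = "relu") %>%
  layer_dense(64, activation = "relu") %>%
  layer_dense(10, activation = "softmax")
dnn %>% compile(
  loss = "sparse_categorical_crossentropy",
  optimizer = "adam",
  metrics = c("accuracy")
)
```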

Convolution layers in Keras need a specific form of data input. For each
example, they need a `(width, height, channels)` array (tensor). For a
colour image with 28*28 dimension, that shape is usually `(28, 28, 3)`,
where the channels indicate red, green, and blue. MNIST has no colour
info, but we still need the channel dimension to enter the data into a
convolution layer, so the shape becomes `(28, 28, 1)`. The training dataset
`x_train` should thus have shape `(60000, 28, 28, 1)`.

**11. Add a “channel” dimension to the training and test data using the
following code. Plot an image using the first channel of the 314th
training example (this is a 9).**

```
# add channel dimension to input (required for convolution layers)
dim(mnist$train$x) <- c(dim(mnist$train$x), 1)
dim(mnist$test$x) <- c(dim(mnist$test$x), 1)
```
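With the channel dimension in place, the requested plot selects example 314 and its first (only) channel; note the two extra commas, since the array is now 4-dimensional:

```
# training example 314, all rows and columns, channel 1
plot_img(mnist$train$x[314, , , 1])
```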

**12. Create and compile a convolutional neural network using the
following code. Describe the different layers in your own words.**

```
cnn <-
  keras_model_sequential(input_shape = c(28, 28, 1)) %>%
  layer_conv_2d(filters = 6, kernel_size = c(5, 5)) %>% # convolution: 6 feature maps with 5x5 kernels
  layer_max_pooling_2d(pool_size = c(4, 4)) %>%         # downsample: max over non-overlapping 4x4 blocks
  layer_flatten() %>%                                   # flatten feature maps into a single vector
  layer_dense(units = 32, activation = "relu") %>%      # fully connected hidden layer
  layer_dense(10, activation = "softmax")               # output layer: one probability per digit
cnn %>%
  compile(
    loss = "sparse_categorical_crossentropy",
    optimizer = "adam",
    metrics = c("accuracy")
  )
```

**13. Fit this model on the training data (10 epochs) and compare it to
the previous models.**

**14. Create another CNN which has better validation performance within
10 epochs. Compare your validation accuracy to that of your peers.**

Here are some things you could do:

- Reduce the convolution filter size & the pooling size and add a second convolutional & pooling layer with double the number of filters
- Add a dropout layer after the flatten layer
- Look up on the internet what works well and implement it!
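As a sketch combining the first two suggestions (the exact filter counts, kernel sizes, and dropout rate are assumptions, not the only valid choices):

```
cnn2 <-
  keras_model_sequential(input_shape = c(28, 28, 1)) %>%
  layer_conv_2d(filters = 16, kernel_size = c(3, 3), activation = "relu") %>% # smaller kernels
  layer_max_pooling_2d(pool_size = c(2, 2)) %>%                               # smaller pooling
  layer_conv_2d(filters = 32, kernel_size = c(3, 3), activation = "relu") %>% # second conv layer, double the filters
  layer_max_pooling_2d(pool_size = c(2, 2)) %>%
  layer_flatten() %>%
  layer_dropout(rate = 0.25) %>% # dropout after flattening to reduce overfitting
  layer_dense(units = 64, activation = "relu") %>%
  layer_dense(10, activation = "softmax")
cnn2 %>% compile(
  loss = "sparse_categorical_crossentropy",
  optimizer = "adam",
  metrics = c("accuracy")
)
```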