Building ResNet 50 from scratch with Keras

ResNets are among the most popular convolutional networks in the deep learning literature. All major libraries (e.g. Keras) ship fully baked ResNet implementations for engineers to use on a daily basis, and a number of online tutorials illuminate the basic principles behind ResNets, including a useful one about building the key modules of popular networks like VGG, Inception, and ResNet.

However, there is a lack of articles walking through the nitty-gritty of a complete ResNet implementation. Several details need to be properly addressed in building a complete ResNet. In this article, we will focus on building ResNet 50 from scratch. Our presentation in this tutorial is a simplified version of the code available in the Keras Applications GitHub repository.

One key goal of this tutorial is to give you hands-on experience of building large, complex CNNs with the help of the Keras functional API.

For a basic introduction to ResNets, we suggest looking at the articles mentioned above or the original paper.

The following is Table 1 from the paper, which describes the various ResNet architectures.

[Image: ResNet architectures]

Please spend some time looking at the column describing the architecture of the 50-layer ResNet.

Without further ado, let’s get into implementing a ResNet 50 network with Keras.

We start by importing relevant modules from Keras.

[ ]:
from tensorflow.keras import layers, backend, models, utils

In the sequel, we will need to create various batch normalization layers. All of them will use the same epsilon value, a small float added to the variance to avoid division by zero.

[ ]:
# epsilon for Batch Normalization layers
BN_EPS = 1.001e-5

There are five stages in the ResNet, labeled conv1, conv2, conv3, conv4, and conv5 in the paper (first column of the table above). These are followed by a global average pool and a simple fully connected 1000-way classification layer.

Input to the ResNet 50 network is typically a batch of images of size (224, 224, 3).

  • Stage conv1 output is (56, 56, 64), i.e. the spatial resolution has been reduced by a factor of 4 to (56, 56) and the number of channels has increased to 64.

  • Stage conv2 output is (56, 56, 256).

  • Stage conv3 output is (28, 28, 512).

  • Stage conv4 output is (14, 14, 1024).

  • Stage conv5 output is (7, 7, 2048).

  • conv2 has 3 residual blocks, conv3 has 4, conv4 has 6, and conv5 has 3. For reference, the expected stage shapes are collected in the snippet below.
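
The following dictionary is a convenience of this write-up (it is not part of the original Keras Applications code); you can check the model summaries later in the tutorial against it.

[ ]:
# expected output shape of each stage for a (224, 224, 3) input
# (batch dimension omitted)
STAGE_SHAPES = {
    'conv1': (56, 56, 64),
    'conv2': (56, 56, 256),
    'conv3': (28, 28, 512),
    'conv4': (14, 14, 1024),
    'conv5': (7, 7, 2048),
}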

The Residual Blocks

Let’s start by defining functions for building the residual blocks in the ResNet 50 network. We will gradually increase the complexity of the residual blocks to cover all the needs of ResNet 50.

Every residual block essentially consists of three convolutional layers along the residual path and an identity connection from input to output. There are some details which will come up later. Let’s look at the residual blocks in the conv2 stage.

Output of the conv1 stage is a tensor of size (None, 56, 56, 64); its implementation will be discussed later.

The first convolutional layer in this residual block is a 1x1 layer with 64 filters. The second is a 3x3 layer with 64 filters. The last is again a 1x1 layer, with four times the number of filters (256).

For now, let’s just build the convolutional layers for the residual path.

[ ]:
def conv_131(input, filters):
    # number of channels in output tensor
    num_output_channels = 4 * filters
    # The 1x1 first convolution layer
    net = layers.Conv2D(filters, 1)(input)
    net = layers.BatchNormalization(epsilon=BN_EPS)(net)
    net = layers.Activation('relu')(net)
    # The 3x3 second convolution layer
    net = layers.Conv2D(filters, 3, padding='same')(net)
    net = layers.BatchNormalization(epsilon=BN_EPS)(net)
    net = layers.Activation('relu')(net)
    # The 1x1 third convolution layer
    net = layers.Conv2D(num_output_channels, 1)(net)
    net = layers.BatchNormalization(epsilon=BN_EPS)(net)
    net = layers.Activation('relu')(net)
    return net

The 1x1 layers don’t require a padding parameter: a 1x1 convolution is nothing but an inner product over the channels of the input tensor, so it doesn’t change the image size (unless a stride other than 1 is specified). It can, however, change the number of channels from input to output.
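
As a quick throwaway check (not part of the network we are building), we can verify that a 1x1 convolution changes only the channel count:

[ ]:
# a 1x1 convolution leaves the spatial dimensions untouched
x = layers.Input(shape=(56, 56, 64))
y = layers.Conv2D(128, 1)(x)
print(y.shape)  # (None, 56, 56, 128)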

Each conv layer is followed by batch normalization and then a relu activation.

Building a model

We will write a simple function which builds a Keras model from a network building function. A network building function, like conv_131 above, takes a tensor as input, adds some more layers on top of it, and returns the output tensor.

The model building function takes the shape of the network’s input tensor and uses it to create an input layer. It then feeds the input layer to the network building function, which builds the whole network. Finally, the input and output are combined to form a Keras model.

This function will be quite handy for displaying the architecture of any network in the rest of the tutorial.

[ ]:
def build_model(input_shape, net_fn):
    # create an input layer from the given shape
    img_input = layers.Input(shape=input_shape)
    # let the network building function add its layers on top
    net = net_fn(img_input)
    # combine input and output into a Keras model
    model = models.Model(inputs=img_input, outputs=net)
    return model

Let’s use this function to build a small model consisting of a single block of the three convolutional layers in the conv2 stage of ResNet 50. As discussed earlier, the input shape is (56, 56, 64). We then print the model architecture summary.

[ ]:
model = build_model((56, 56, 64), lambda input: conv_131(input, 64))
model.summary()
Model: "functional_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         [(None, 56, 56, 64)]      0
_________________________________________________________________
conv2d (Conv2D)              (None, 56, 56, 64)        4160
_________________________________________________________________
batch_normalization (BatchNo (None, 56, 56, 64)        256
_________________________________________________________________
activation (Activation)      (None, 56, 56, 64)        0
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 56, 56, 64)        36928
_________________________________________________________________
batch_normalization_1 (Batch (None, 56, 56, 64)        256
_________________________________________________________________
activation_1 (Activation)    (None, 56, 56, 64)        0
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 56, 56, 256)       16640
_________________________________________________________________
batch_normalization_2 (Batch (None, 56, 56, 256)       1024
_________________________________________________________________
activation_2 (Activation)    (None, 56, 56, 256)       0
=================================================================
Total params: 59,264
Trainable params: 58,496
Non-trainable params: 768
_________________________________________________________________
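
As a sanity check, these parameter counts can be reproduced by hand. A Conv2D layer with a k x k kernel, c_in input channels, and c_out filters has k*k*c_in*c_out weights plus c_out biases; a BatchNormalization layer has 4 parameters per channel (gamma, beta, and the non-trainable moving mean and variance, which is also why 768 of the parameters above are non-trainable). A small sketch:

[ ]:
def conv2d_params(k, c_in, c_out):
    # k*k kernel over c_in channels for each of the c_out filters, plus biases
    return k * k * c_in * c_out + c_out

print(conv2d_params(1, 64, 64))   # 4160
print(conv2d_params(3, 64, 64))   # 36928
print(conv2d_params(1, 64, 256))  # 16640
# batch normalization: 4 parameters per channel
print(4 * 64, 4 * 64, 4 * 256)    # 256 256 1024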

The identity path

It’s time to add the identity path to our 3-layer convolutional block. However, there is a catch. The input to the first residual block is (56, 56, 64), but the output of the third layer in this block is (56, 56, 256): the number of channels changes from 64 to 256. Hence, the input cannot be added directly to the output of the residual path. The solution is to add a 1x1 conv layer in the identity path whenever the number of input channels is not equal to the number of output channels.

The function residual_131_v1 below incorporates this. Note the following:

  • We check if the number of input and output channels is different.

  • If yes, then a 1x1 conv layer is added with batch normalization but no relu activation.

  • The output of the last (1x1) conv layer is batch normalized.

  • Then it is added to the output of the shortcut identity path.

  • Finally, the sum undergoes a common relu activation.

[ ]:
def residual_131_v1(input, filters):
    # shape of input tensor
    input_shape = input.shape
    # number of channels in input tensor
    num_input_channels = input_shape[3]
    # number of channels in output tensor
    num_output_channels = 4 * filters
    # if input and output channels are same then we can feed
    # the input directly as identity shortcut
    # otherwise, we need to add a convolutional layer in identity path
    conv_in_identity_path = num_output_channels != num_input_channels
    if conv_in_identity_path:
        # add a conv layer to increase the number of channels
        shortcut = layers.Conv2D(num_output_channels, 1)(input)
        # batch normalize (activation will come later)
        shortcut = layers.BatchNormalization(epsilon=BN_EPS)(shortcut)
    else:
        shortcut = input
    # The 1x1 first convolution layer
    net = layers.Conv2D(filters, 1)(input)
    net = layers.BatchNormalization(epsilon=BN_EPS)(net)
    net = layers.Activation('relu')(net)
    # The 3x3 second convolution layer
    net = layers.Conv2D(filters, 3, padding='same')(net)
    net = layers.BatchNormalization(epsilon=BN_EPS)(net)
    net = layers.Activation('relu')(net)
    # The 1x1 third convolution layer
    net = layers.Conv2D(num_output_channels, 1)(net)
    net = layers.BatchNormalization(epsilon=BN_EPS)(net)
    # Add identity shortcut to residual output before activation
    net = layers.Add()([shortcut, net])
    net = layers.Activation('relu')(net)
    return net

Let’s build a model with a residual block and print its summary.

[ ]:
model = build_model((56, 56, 64), lambda input: residual_131_v1(input, 64))
model.summary()
Model: "functional_3"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input_2 (InputLayer)            [(None, 56, 56, 64)] 0
__________________________________________________________________________________________________
conv2d_4 (Conv2D)               (None, 56, 56, 64)   4160        input_2[0][0]
__________________________________________________________________________________________________
batch_normalization_4 (BatchNor (None, 56, 56, 64)   256         conv2d_4[0][0]
__________________________________________________________________________________________________
activation_3 (Activation)       (None, 56, 56, 64)   0           batch_normalization_4[0][0]
__________________________________________________________________________________________________
conv2d_5 (Conv2D)               (None, 56, 56, 64)   36928       activation_3[0][0]
__________________________________________________________________________________________________
batch_normalization_5 (BatchNor (None, 56, 56, 64)   256         conv2d_5[0][0]
__________________________________________________________________________________________________
activation_4 (Activation)       (None, 56, 56, 64)   0           batch_normalization_5[0][0]
__________________________________________________________________________________________________
conv2d_3 (Conv2D)               (None, 56, 56, 256)  16640       input_2[0][0]
__________________________________________________________________________________________________
conv2d_6 (Conv2D)               (None, 56, 56, 256)  16640       activation_4[0][0]
__________________________________________________________________________________________________
batch_normalization_3 (BatchNor (None, 56, 56, 256)  1024        conv2d_3[0][0]
__________________________________________________________________________________________________
batch_normalization_6 (BatchNor (None, 56, 56, 256)  1024        conv2d_6[0][0]
__________________________________________________________________________________________________
add (Add)                       (None, 56, 56, 256)  0           batch_normalization_3[0][0]
                                                                 batch_normalization_6[0][0]
__________________________________________________________________________________________________
activation_5 (Activation)       (None, 56, 56, 256)  0           add[0][0]
==================================================================================================
Total params: 76,928
Trainable params: 75,648
Non-trainable params: 1,280
__________________________________________________________________________________________________

The conv1 stage

There is still one problem with the residual block, but before addressing it, let’s complete the implementation of the conv1 stage. See the function below.

  • Input is a (224, 224, 3) image.

  • The first layer is a large 7x7 convolutional layer that downsamples the resolution to (112, 112) and increases the number of channels to 64.

  • We achieve this in two steps as follows.

  • Add a padding of 3 pixels on all sides, increasing the resolution to (230, 230).

  • Perform a 7x7 valid convolution with 64 filters and stride 2 to get an output of (112, 112, 64).

  • Next, the output goes through batch normalization and relu activation.

  • Finally, we add a padding of 1 pixel and then do a 3x3 max pooling with stride 2 to get an output tensor of size (56, 56, 64).

[ ]:
def conv1(img_input):
    # pad in advance for valid convolution (output is 230x230)
    net = layers.ZeroPadding2D(padding=(3, 3))(img_input)
    # perform the big 7x7 convolution with 2x2 stride to (112x112x64)
    net = layers.Conv2D(64, (7, 7),
                      strides=(2, 2),
                      padding='valid',
                      kernel_initializer='he_normal')(net)
    # batch normalization before activation (output is 112x112x64)
    net = layers.BatchNormalization(epsilon=BN_EPS)(net)
    # relu activation (output is 112x112x64)
    net = layers.Activation('relu')(net)
    # pad again for max pooling (output is 114x114x64)
    net = layers.ZeroPadding2D(padding=(1, 1))(net)
    # 3x3 max pooling with 2x2 stride (output is 56x56x64)
    net = layers.MaxPooling2D((3, 3), strides=(2, 2))(net)
    return net

Let’s combine the conv1 stage with our first residual block to see if everything is working: we build a partial residual network and print its summary.

[ ]:
def partial_resnet_v1(img_input):
    net = conv1(img_input)
    net = residual_131_v1(net, 64)
    return net

model =  build_model((224, 224, 3), partial_resnet_v1)
model.summary()
Model: "functional_5"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input_3 (InputLayer)            [(None, 224, 224, 3) 0
__________________________________________________________________________________________________
zero_padding2d (ZeroPadding2D)  (None, 230, 230, 3)  0           input_3[0][0]
__________________________________________________________________________________________________
conv2d_7 (Conv2D)               (None, 112, 112, 64) 9472        zero_padding2d[0][0]
__________________________________________________________________________________________________
batch_normalization_7 (BatchNor (None, 112, 112, 64) 256         conv2d_7[0][0]
__________________________________________________________________________________________________
activation_6 (Activation)       (None, 112, 112, 64) 0           batch_normalization_7[0][0]
__________________________________________________________________________________________________
zero_padding2d_1 (ZeroPadding2D (None, 114, 114, 64) 0           activation_6[0][0]
__________________________________________________________________________________________________
max_pooling2d (MaxPooling2D)    (None, 56, 56, 64)   0           zero_padding2d_1[0][0]
__________________________________________________________________________________________________
conv2d_9 (Conv2D)               (None, 56, 56, 64)   4160        max_pooling2d[0][0]
__________________________________________________________________________________________________
batch_normalization_9 (BatchNor (None, 56, 56, 64)   256         conv2d_9[0][0]
__________________________________________________________________________________________________
activation_7 (Activation)       (None, 56, 56, 64)   0           batch_normalization_9[0][0]
__________________________________________________________________________________________________
conv2d_10 (Conv2D)              (None, 56, 56, 64)   36928       activation_7[0][0]
__________________________________________________________________________________________________
batch_normalization_10 (BatchNo (None, 56, 56, 64)   256         conv2d_10[0][0]
__________________________________________________________________________________________________
activation_8 (Activation)       (None, 56, 56, 64)   0           batch_normalization_10[0][0]
__________________________________________________________________________________________________
conv2d_8 (Conv2D)               (None, 56, 56, 256)  16640       max_pooling2d[0][0]
__________________________________________________________________________________________________
conv2d_11 (Conv2D)              (None, 56, 56, 256)  16640       activation_8[0][0]
__________________________________________________________________________________________________
batch_normalization_8 (BatchNor (None, 56, 56, 256)  1024        conv2d_8[0][0]
__________________________________________________________________________________________________
batch_normalization_11 (BatchNo (None, 56, 56, 256)  1024        conv2d_11[0][0]
__________________________________________________________________________________________________
add_1 (Add)                     (None, 56, 56, 256)  0           batch_normalization_8[0][0]
                                                                 batch_normalization_11[0][0]
__________________________________________________________________________________________________
activation_9 (Activation)       (None, 56, 56, 256)  0           add_1[0][0]
==================================================================================================
Total params: 86,656
Trainable params: 85,248
Non-trainable params: 1,408
__________________________________________________________________________________________________

The stack of all residual blocks of the conv2 stage

Everything is working well so far. Recall from the architecture table that the conv2 stage of ResNet 50 has 3 residual blocks. Let’s write a simple function to build all of them and chain them together.

This is also known as the stack of residual blocks.

[ ]:
def residual_stack_v1(input, filters, blocks):
    net = input
    for _ in range(blocks):
        net = residual_131_v1(net, filters)
    return net

If you have been paying attention, you may notice something interesting.

  • Output of first residual block is (56, 56, 256).

  • Output of second residual block is also (56, 56, 256).

  • Thus, for the second (and also the third) residual block, the number of channels in the input and the output is the same.

  • Hence, the identity connection doesn’t require any 1x1 convolutional layer.

  • In fact, the first 1x1 conv layer in the second residual block actually reduces the number of channels from 256 back to 64, and the third layer increases it again to 256. A quick check of this follows below.
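
As a throwaway sanity check (not part of the final network), we can feed a 256-channel input, as the second block would receive, and confirm that no shortcut convolution is created:

[ ]:
model = build_model((56, 56, 256), lambda input: residual_131_v1(input, 64))
# only the 3 residual-path convolutions; no conv layer in the identity path
num_convs = sum(1 for l in model.layers if isinstance(l, layers.Conv2D))
print(num_convs)  # 3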

Let’s now build a partial resnet with both conv1 and conv2 stages complete.

[ ]:
def partial_resnet_v2(img_input):
    net = conv1(img_input)
    net = residual_stack_v1(net, 64, 3)
    return net

model =  build_model((224, 224, 3), partial_resnet_v2)
model.summary()
Model: "functional_7"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input_4 (InputLayer)            [(None, 224, 224, 3) 0
__________________________________________________________________________________________________
zero_padding2d_2 (ZeroPadding2D (None, 230, 230, 3)  0           input_4[0][0]
__________________________________________________________________________________________________
conv2d_12 (Conv2D)              (None, 112, 112, 64) 9472        zero_padding2d_2[0][0]
__________________________________________________________________________________________________
batch_normalization_12 (BatchNo (None, 112, 112, 64) 256         conv2d_12[0][0]
__________________________________________________________________________________________________
activation_10 (Activation)      (None, 112, 112, 64) 0           batch_normalization_12[0][0]
__________________________________________________________________________________________________
zero_padding2d_3 (ZeroPadding2D (None, 114, 114, 64) 0           activation_10[0][0]
__________________________________________________________________________________________________
max_pooling2d_1 (MaxPooling2D)  (None, 56, 56, 64)   0           zero_padding2d_3[0][0]
__________________________________________________________________________________________________
conv2d_14 (Conv2D)              (None, 56, 56, 64)   4160        max_pooling2d_1[0][0]
__________________________________________________________________________________________________
batch_normalization_14 (BatchNo (None, 56, 56, 64)   256         conv2d_14[0][0]
__________________________________________________________________________________________________
activation_11 (Activation)      (None, 56, 56, 64)   0           batch_normalization_14[0][0]
__________________________________________________________________________________________________
conv2d_15 (Conv2D)              (None, 56, 56, 64)   36928       activation_11[0][0]
__________________________________________________________________________________________________
batch_normalization_15 (BatchNo (None, 56, 56, 64)   256         conv2d_15[0][0]
__________________________________________________________________________________________________
activation_12 (Activation)      (None, 56, 56, 64)   0           batch_normalization_15[0][0]
__________________________________________________________________________________________________
conv2d_13 (Conv2D)              (None, 56, 56, 256)  16640       max_pooling2d_1[0][0]
__________________________________________________________________________________________________
conv2d_16 (Conv2D)              (None, 56, 56, 256)  16640       activation_12[0][0]
__________________________________________________________________________________________________
batch_normalization_13 (BatchNo (None, 56, 56, 256)  1024        conv2d_13[0][0]
__________________________________________________________________________________________________
batch_normalization_16 (BatchNo (None, 56, 56, 256)  1024        conv2d_16[0][0]
__________________________________________________________________________________________________
add_2 (Add)                     (None, 56, 56, 256)  0           batch_normalization_13[0][0]
                                                                 batch_normalization_16[0][0]
__________________________________________________________________________________________________
activation_13 (Activation)      (None, 56, 56, 256)  0           add_2[0][0]
__________________________________________________________________________________________________
conv2d_17 (Conv2D)              (None, 56, 56, 64)   16448       activation_13[0][0]
__________________________________________________________________________________________________
batch_normalization_17 (BatchNo (None, 56, 56, 64)   256         conv2d_17[0][0]
__________________________________________________________________________________________________
activation_14 (Activation)      (None, 56, 56, 64)   0           batch_normalization_17[0][0]
__________________________________________________________________________________________________
conv2d_18 (Conv2D)              (None, 56, 56, 64)   36928       activation_14[0][0]
__________________________________________________________________________________________________
batch_normalization_18 (BatchNo (None, 56, 56, 64)   256         conv2d_18[0][0]
__________________________________________________________________________________________________
activation_15 (Activation)      (None, 56, 56, 64)   0           batch_normalization_18[0][0]
__________________________________________________________________________________________________
conv2d_19 (Conv2D)              (None, 56, 56, 256)  16640       activation_15[0][0]
__________________________________________________________________________________________________
batch_normalization_19 (BatchNo (None, 56, 56, 256)  1024        conv2d_19[0][0]
__________________________________________________________________________________________________
add_3 (Add)                     (None, 56, 56, 256)  0           activation_13[0][0]
                                                                 batch_normalization_19[0][0]
__________________________________________________________________________________________________
activation_16 (Activation)      (None, 56, 56, 256)  0           add_3[0][0]
__________________________________________________________________________________________________
conv2d_20 (Conv2D)              (None, 56, 56, 64)   16448       activation_16[0][0]
__________________________________________________________________________________________________
batch_normalization_20 (BatchNo (None, 56, 56, 64)   256         conv2d_20[0][0]
__________________________________________________________________________________________________
activation_17 (Activation)      (None, 56, 56, 64)   0           batch_normalization_20[0][0]
__________________________________________________________________________________________________
conv2d_21 (Conv2D)              (None, 56, 56, 64)   36928       activation_17[0][0]
__________________________________________________________________________________________________
batch_normalization_21 (BatchNo (None, 56, 56, 64)   256         conv2d_21[0][0]
__________________________________________________________________________________________________
activation_18 (Activation)      (None, 56, 56, 64)   0           batch_normalization_21[0][0]
__________________________________________________________________________________________________
conv2d_22 (Conv2D)              (None, 56, 56, 256)  16640       activation_18[0][0]
__________________________________________________________________________________________________
batch_normalization_22 (BatchNo (None, 56, 56, 256)  1024        conv2d_22[0][0]
__________________________________________________________________________________________________
add_4 (Add)                     (None, 56, 56, 256)  0           activation_16[0][0]
                                                                 batch_normalization_22[0][0]
__________________________________________________________________________________________________
activation_19 (Activation)      (None, 56, 56, 256)  0           add_4[0][0]
==================================================================================================
Total params: 229,760
Trainable params: 226,816
Non-trainable params: 2,944
__________________________________________________________________________________________________

Size reduction from one stage to the next

Look back at the architecture table. The conv2 stage has an output size of (56, 56), but the conv3 stage works at a size of (28, 28). We need to perform a downsampling here. This is the job of the first convolution layer in the first residual block of each of the later stages of the network (conv3, conv4, conv5).

A small modification of the residual 131 block generation function achieves this. See the code below:

  • A new parameter stride1 has been introduced. This only applies to the first 1x1 convolution layer.

  • If stride1=2, then the first 1x1 conv layer reduces the input size by a factor of 4 (2 in width, 2 in height).

  • The other two conv layers remain as they are.

  • This change also applies to the identity path. As the output size is halved, the 1x1 conv layer in the identity path also needs a stride of 2.

[ ]:
def residual_131_v2(input, filters, stride1=1):
    # shape of input tensor
    input_shape = input.shape
    # number of channels in input tensor
    num_input_channels = input_shape[3]
    # number of channels in output tensor
    num_output_channels = 4 * filters
    # if input and output channels are same then we can feed
    # the input directly as identity shortcut
    # otherwise, we need to add a convolutional layer in identity path
    conv_in_identity_path = num_output_channels != num_input_channels
    if conv_in_identity_path:
        # add a conv layer to increase the number of channels
        shortcut = layers.Conv2D(num_output_channels, 1, strides=stride1)(input)
        # batch normalize (activation will come later)
        shortcut = layers.BatchNormalization(epsilon=BN_EPS)(shortcut)
    else:
        shortcut = input
    # The 1x1 first convolution layer
    net = layers.Conv2D(filters, 1, strides=stride1)(input)
    net = layers.BatchNormalization(epsilon=BN_EPS)(net)
    net = layers.Activation('relu')(net)
    # The 3x3 second convolution layer
    net = layers.Conv2D(filters, 3, padding='same')(net)
    net = layers.BatchNormalization(epsilon=BN_EPS)(net)
    net = layers.Activation('relu')(net)
    # The 1x1 third convolution layer
    net = layers.Conv2D(num_output_channels, 1)(net)
    net = layers.BatchNormalization(epsilon=BN_EPS)(net)
    # Add identity shortcut to residual output before activation
    net = layers.Add()([shortcut, net])
    net = layers.Activation('relu')(net)
    return net

We also need to modify our stack building function. The first block in the stack will have a stride of 2 for the conv3, conv4, and conv5 stages, and a stride of 1 for the conv2 stage. All other blocks in the stack will have a stride of 1.

[ ]:
def residual_stack_v2(input, filters, blocks, stride1=2):
    # the first block may downsample via stride1
    net = residual_131_v2(input, filters, stride1=stride1)
    # the remaining blocks keep the size; residual_131_v1 is
    # equivalent to residual_131_v2 with stride1=1
    for _ in range(blocks - 1):
        net = residual_131_v1(net, filters)
    return net
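
To convince ourselves that the downsampling works, here is a quick throwaway check building a conv3-style stack on a (56, 56, 256) input:

[ ]:
# 4 blocks with 128 filters; the first block downsamples with stride 2
model = build_model((56, 56, 256), lambda input: residual_stack_v2(input, 128, 4))
print(model.output_shape)  # (None, 28, 28, 512)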

The complete ResNet 50 CNN

We are now ready to build all five stages of the ResNet 50 CNN. See the code below. The only thing missing is the classification layer on top of the CNN.
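
That missing head is just the global average pool and the fully connected 1000-way softmax classifier described at the start. A minimal sketch of it (our own addition here, not attached in the code below) could look like this:

[ ]:
def classification_head(net, classes=1000):
    # collapse the (7, 7, 2048) CNN output to a 2048-vector per image
    net = layers.GlobalAveragePooling2D()(net)
    # simple fully connected 1000-way softmax classifier
    net = layers.Dense(classes, activation='softmax')(net)
    return net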

[ ]:
def cnn_resnet50(img_input):
    # conv1 stage (ends with max pooling, so conv2 needs no downsampling)
    net = conv1(img_input)
    # conv2 stage: 3 blocks, stride 1
    net = residual_stack_v2(net, 64, 3, stride1=1)
    # conv3, conv4 and conv5 stages: each downsamples by 2
    net = residual_stack_v2(net, 128, 4)
    net = residual_stack_v2(net, 256, 6)
    net = residual_stack_v2(net, 512, 3)
    return net

model =  build_model((224, 224, 3), cnn_resnet50)
model.summary()
Model: "functional_9"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input_5 (InputLayer)            [(None, 224, 224, 3) 0
__________________________________________________________________________________________________
zero_padding2d_4 (ZeroPadding2D (None, 230, 230, 3)  0           input_5[0][0]
__________________________________________________________________________________________________
conv2d_23 (Conv2D)              (None, 112, 112, 64) 9472        zero_padding2d_4[0][0]
__________________________________________________________________________________________________
batch_normalization_23 (BatchNo (None, 112, 112, 64) 256         conv2d_23[0][0]
__________________________________________________________________________________________________
activation_20 (Activation)      (None, 112, 112, 64) 0           batch_normalization_23[0][0]
__________________________________________________________________________________________________
zero_padding2d_5 (ZeroPadding2D (None, 114, 114, 64) 0           activation_20[0][0]
__________________________________________________________________________________________________
max_pooling2d_2 (MaxPooling2D)  (None, 56, 56, 64)   0           zero_padding2d_5[0][0]
__________________________________________________________________________________________________
conv2d_25 (Conv2D)              (None, 56, 56, 64)   4160        max_pooling2d_2[0][0]
__________________________________________________________________________________________________
batch_normalization_25 (BatchNo (None, 56, 56, 64)   256         conv2d_25[0][0]
__________________________________________________________________________________________________
activation_21 (Activation)      (None, 56, 56, 64)   0           batch_normalization_25[0][0]
__________________________________________________________________________________________________
conv2d_26 (Conv2D)              (None, 56, 56, 64)   36928       activation_21[0][0]
__________________________________________________________________________________________________
batch_normalization_26 (BatchNo (None, 56, 56, 64)   256         conv2d_26[0][0]
__________________________________________________________________________________________________
activation_22 (Activation)      (None, 56, 56, 64)   0           batch_normalization_26[0][0]
__________________________________________________________________________________________________
conv2d_24 (Conv2D)              (None, 56, 56, 256)  16640       max_pooling2d_2[0][0]
__________________________________________________________________________________________________
conv2d_27 (Conv2D)              (None, 56, 56, 256)  16640       activation_22[0][0]
__________________________________________________________________________________________________
batch_normalization_24 (BatchNo (None, 56, 56, 256)  1024        conv2d_24[0][0]
__________________________________________________________________________________________________
batch_normalization_27 (BatchNo (None, 56, 56, 256)  1024        conv2d_27[0][0]
__________________________________________________________________________________________________
add_5 (Add)                     (None, 56, 56, 256)  0           batch_normalization_24[0][0]
                                                                 batch_normalization_27[0][0]
__________________________________________________________________________________________________
activation_23 (Activation)      (None, 56, 56, 256)  0           add_5[0][0]
__________________________________________________________________________________________________
conv2d_28 (Conv2D)              (None, 56, 56, 64)   16448       activation_23[0][0]
__________________________________________________________________________________________________
batch_normalization_28 (BatchNo (None, 56, 56, 64)   256         conv2d_28[0][0]
__________________________________________________________________________________________________
activation_24 (Activation)      (None, 56, 56, 64)   0           batch_normalization_28[0][0]
__________________________________________________________________________________________________
conv2d_29 (Conv2D)              (None, 56, 56, 64)   36928       activation_24[0][0]
__________________________________________________________________________________________________
batch_normalization_29 (BatchNo (None, 56, 56, 64)   256         conv2d_29[0][0]
__________________________________________________________________________________________________
activation_25 (Activation)      (None, 56, 56, 64)   0           batch_normalization_29[0][0]
__________________________________________________________________________________________________
conv2d_30 (Conv2D)              (None, 56, 56, 256)  16640       activation_25[0][0]
__________________________________________________________________________________________________
batch_normalization_30 (BatchNo (None, 56, 56, 256)  1024        conv2d_30[0][0]
__________________________________________________________________________________________________
add_6 (Add)                     (None, 56, 56, 256)  0           activation_23[0][0]
                                                                 batch_normalization_30[0][0]
__________________________________________________________________________________________________
activation_26 (Activation)      (None, 56, 56, 256)  0           add_6[0][0]
__________________________________________________________________________________________________
conv2d_31 (Conv2D)              (None, 56, 56, 64)   16448       activation_26[0][0]
__________________________________________________________________________________________________
batch_normalization_31 (BatchNo (None, 56, 56, 64)   256         conv2d_31[0][0]
__________________________________________________________________________________________________
activation_27 (Activation)      (None, 56, 56, 64)   0           batch_normalization_31[0][0]
__________________________________________________________________________________________________
conv2d_32 (Conv2D)              (None, 56, 56, 64)   36928       activation_27[0][0]
__________________________________________________________________________________________________
batch_normalization_32 (BatchNo (None, 56, 56, 64)   256         conv2d_32[0][0]
__________________________________________________________________________________________________
activation_28 (Activation)      (None, 56, 56, 64)   0           batch_normalization_32[0][0]
__________________________________________________________________________________________________
conv2d_33 (Conv2D)              (None, 56, 56, 256)  16640       activation_28[0][0]
__________________________________________________________________________________________________
batch_normalization_33 (BatchNo (None, 56, 56, 256)  1024        conv2d_33[0][0]
__________________________________________________________________________________________________
add_7 (Add)                     (None, 56, 56, 256)  0           activation_26[0][0]
                                                                 batch_normalization_33[0][0]
__________________________________________________________________________________________________
activation_29 (Activation)      (None, 56, 56, 256)  0           add_7[0][0]
__________________________________________________________________________________________________
conv2d_35 (Conv2D)              (None, 28, 28, 128)  32896       activation_29[0][0]
__________________________________________________________________________________________________
batch_normalization_35 (BatchNo (None, 28, 28, 128)  512         conv2d_35[0][0]
__________________________________________________________________________________________________
activation_30 (Activation)      (None, 28, 28, 128)  0           batch_normalization_35[0][0]
__________________________________________________________________________________________________
conv2d_36 (Conv2D)              (None, 28, 28, 128)  147584      activation_30[0][0]
__________________________________________________________________________________________________
batch_normalization_36 (BatchNo (None, 28, 28, 128)  512         conv2d_36[0][0]
__________________________________________________________________________________________________
activation_31 (Activation)      (None, 28, 28, 128)  0           batch_normalization_36[0][0]
__________________________________________________________________________________________________
conv2d_34 (Conv2D)              (None, 28, 28, 512)  131584      activation_29[0][0]
__________________________________________________________________________________________________
conv2d_37 (Conv2D)              (None, 28, 28, 512)  66048       activation_31[0][0]
__________________________________________________________________________________________________
batch_normalization_34 (BatchNo (None, 28, 28, 512)  2048        conv2d_34[0][0]
__________________________________________________________________________________________________
batch_normalization_37 (BatchNo (None, 28, 28, 512)  2048        conv2d_37[0][0]
__________________________________________________________________________________________________
add_8 (Add)                     (None, 28, 28, 512)  0           batch_normalization_34[0][0]
                                                                 batch_normalization_37[0][0]
__________________________________________________________________________________________________
activation_32 (Activation)      (None, 28, 28, 512)  0           add_8[0][0]
__________________________________________________________________________________________________
conv2d_38 (Conv2D)              (None, 28, 28, 128)  65664       activation_32[0][0]
__________________________________________________________________________________________________
batch_normalization_38 (BatchNo (None, 28, 28, 128)  512         conv2d_38[0][0]
__________________________________________________________________________________________________
activation_33 (Activation)      (None, 28, 28, 128)  0           batch_normalization_38[0][0]
__________________________________________________________________________________________________
conv2d_39 (Conv2D)              (None, 28, 28, 128)  147584      activation_33[0][0]
__________________________________________________________________________________________________
batch_normalization_39 (BatchNo (None, 28, 28, 128)  512         conv2d_39[0][0]
__________________________________________________________________________________________________
activation_34 (Activation)      (None, 28, 28, 128)  0           batch_normalization_39[0][0]
__________________________________________________________________________________________________
conv2d_40 (Conv2D)              (None, 28, 28, 512)  66048       activation_34[0][0]
__________________________________________________________________________________________________
batch_normalization_40 (BatchNo (None, 28, 28, 512)  2048        conv2d_40[0][0]
__________________________________________________________________________________________________
add_9 (Add)                     (None, 28, 28, 512)  0           activation_32[0][0]
                                                                 batch_normalization_40[0][0]
__________________________________________________________________________________________________
activation_35 (Activation)      (None, 28, 28, 512)  0           add_9[0][0]
__________________________________________________________________________________________________
conv2d_41 (Conv2D)              (None, 28, 28, 128)  65664       activation_35[0][0]
__________________________________________________________________________________________________
batch_normalization_41 (BatchNo (None, 28, 28, 128)  512         conv2d_41[0][0]
__________________________________________________________________________________________________
activation_36 (Activation)      (None, 28, 28, 128)  0           batch_normalization_41[0][0]
__________________________________________________________________________________________________
conv2d_42 (Conv2D)              (None, 28, 28, 128)  147584      activation_36[0][0]
__________________________________________________________________________________________________
batch_normalization_42 (BatchNo (None, 28, 28, 128)  512         conv2d_42[0][0]
__________________________________________________________________________________________________
activation_37 (Activation)      (None, 28, 28, 128)  0           batch_normalization_42[0][0]
__________________________________________________________________________________________________
conv2d_43 (Conv2D)              (None, 28, 28, 512)  66048       activation_37[0][0]
__________________________________________________________________________________________________
batch_normalization_43 (BatchNo (None, 28, 28, 512)  2048        conv2d_43[0][0]
__________________________________________________________________________________________________
add_10 (Add)                    (None, 28, 28, 512)  0           activation_35[0][0]
                                                                 batch_normalization_43[0][0]
__________________________________________________________________________________________________
activation_38 (Activation)      (None, 28, 28, 512)  0           add_10[0][0]
__________________________________________________________________________________________________
conv2d_44 (Conv2D)              (None, 28, 28, 128)  65664       activation_38[0][0]
__________________________________________________________________________________________________
batch_normalization_44 (BatchNo (None, 28, 28, 128)  512         conv2d_44[0][0]
__________________________________________________________________________________________________
activation_39 (Activation)      (None, 28, 28, 128)  0           batch_normalization_44[0][0]
__________________________________________________________________________________________________
conv2d_45 (Conv2D)              (None, 28, 28, 128)  147584      activation_39[0][0]
__________________________________________________________________________________________________
batch_normalization_45 (BatchNo (None, 28, 28, 128)  512         conv2d_45[0][0]
__________________________________________________________________________________________________
activation_40 (Activation)      (None, 28, 28, 128)  0           batch_normalization_45[0][0]
__________________________________________________________________________________________________
conv2d_46 (Conv2D)              (None, 28, 28, 512)  66048       activation_40[0][0]
__________________________________________________________________________________________________
batch_normalization_46 (BatchNo (None, 28, 28, 512)  2048        conv2d_46[0][0]
__________________________________________________________________________________________________
add_11 (Add)                    (None, 28, 28, 512)  0           activation_38[0][0]
                                                                 batch_normalization_46[0][0]
__________________________________________________________________________________________________
activation_41 (Activation)      (None, 28, 28, 512)  0           add_11[0][0]
__________________________________________________________________________________________________
conv2d_48 (Conv2D)              (None, 14, 14, 256)  131328      activation_41[0][0]
__________________________________________________________________________________________________
batch_normalization_48 (BatchNo (None, 14, 14, 256)  1024        conv2d_48[0][0]
__________________________________________________________________________________________________
activation_42 (Activation)      (None, 14, 14, 256)  0           batch_normalization_48[0][0]
__________________________________________________________________________________________________
conv2d_49 (Conv2D)              (None, 14, 14, 256)  590080      activation_42[0][0]
__________________________________________________________________________________________________
batch_normalization_49 (BatchNo (None, 14, 14, 256)  1024        conv2d_49[0][0]
__________________________________________________________________________________________________
activation_43 (Activation)      (None, 14, 14, 256)  0           batch_normalization_49[0][0]
__________________________________________________________________________________________________
conv2d_47 (Conv2D)              (None, 14, 14, 1024) 525312      activation_41[0][0]
__________________________________________________________________________________________________
conv2d_50 (Conv2D)              (None, 14, 14, 1024) 263168      activation_43[0][0]
__________________________________________________________________________________________________
batch_normalization_47 (BatchNo (None, 14, 14, 1024) 4096        conv2d_47[0][0]
__________________________________________________________________________________________________
batch_normalization_50 (BatchNo (None, 14, 14, 1024) 4096        conv2d_50[0][0]
__________________________________________________________________________________________________
add_12 (Add)                    (None, 14, 14, 1024) 0           batch_normalization_47[0][0]
                                                                 batch_normalization_50[0][0]
__________________________________________________________________________________________________
activation_44 (Activation)      (None, 14, 14, 1024) 0           add_12[0][0]
__________________________________________________________________________________________________
conv2d_51 (Conv2D)              (None, 14, 14, 256)  262400      activation_44[0][0]
__________________________________________________________________________________________________
batch_normalization_51 (BatchNo (None, 14, 14, 256)  1024        conv2d_51[0][0]
__________________________________________________________________________________________________
activation_45 (Activation)      (None, 14, 14, 256)  0           batch_normalization_51[0][0]
__________________________________________________________________________________________________
conv2d_52 (Conv2D)              (None, 14, 14, 256)  590080      activation_45[0][0]
__________________________________________________________________________________________________
batch_normalization_52 (BatchNo (None, 14, 14, 256)  1024        conv2d_52[0][0]
__________________________________________________________________________________________________
activation_46 (Activation)      (None, 14, 14, 256)  0           batch_normalization_52[0][0]
__________________________________________________________________________________________________
conv2d_53 (Conv2D)              (None, 14, 14, 1024) 263168      activation_46[0][0]
__________________________________________________________________________________________________
batch_normalization_53 (BatchNo (None, 14, 14, 1024) 4096        conv2d_53[0][0]
__________________________________________________________________________________________________
add_13 (Add)                    (None, 14, 14, 1024) 0           activation_44[0][0]
                                                                 batch_normalization_53[0][0]
__________________________________________________________________________________________________
activation_47 (Activation)      (None, 14, 14, 1024) 0           add_13[0][0]
__________________________________________________________________________________________________
conv2d_54 (Conv2D)              (None, 14, 14, 256)  262400      activation_47[0][0]
__________________________________________________________________________________________________
batch_normalization_54 (BatchNo (None, 14, 14, 256)  1024        conv2d_54[0][0]
__________________________________________________________________________________________________
activation_48 (Activation)      (None, 14, 14, 256)  0           batch_normalization_54[0][0]
__________________________________________________________________________________________________
conv2d_55 (Conv2D)              (None, 14, 14, 256)  590080      activation_48[0][0]
__________________________________________________________________________________________________
batch_normalization_55 (BatchNo (None, 14, 14, 256)  1024        conv2d_55[0][0]
__________________________________________________________________________________________________
activation_49 (Activation)      (None, 14, 14, 256)  0           batch_normalization_55[0][0]
__________________________________________________________________________________________________
conv2d_56 (Conv2D)              (None, 14, 14, 1024) 263168      activation_49[0][0]
__________________________________________________________________________________________________
batch_normalization_56 (BatchNo (None, 14, 14, 1024) 4096        conv2d_56[0][0]
__________________________________________________________________________________________________
add_14 (Add)                    (None, 14, 14, 1024) 0           activation_47[0][0]
                                                                 batch_normalization_56[0][0]
__________________________________________________________________________________________________
activation_50 (Activation)      (None, 14, 14, 1024) 0           add_14[0][0]
__________________________________________________________________________________________________
conv2d_57 (Conv2D)              (None, 14, 14, 256)  262400      activation_50[0][0]
__________________________________________________________________________________________________
batch_normalization_57 (BatchNo (None, 14, 14, 256)  1024        conv2d_57[0][0]
__________________________________________________________________________________________________
activation_51 (Activation)      (None, 14, 14, 256)  0           batch_normalization_57[0][0]
__________________________________________________________________________________________________
conv2d_58 (Conv2D)              (None, 14, 14, 256)  590080      activation_51[0][0]
__________________________________________________________________________________________________
batch_normalization_58 (BatchNo (None, 14, 14, 256)  1024        conv2d_58[0][0]
__________________________________________________________________________________________________
activation_52 (Activation)      (None, 14, 14, 256)  0           batch_normalization_58[0][0]
__________________________________________________________________________________________________
conv2d_59 (Conv2D)              (None, 14, 14, 1024) 263168      activation_52[0][0]
__________________________________________________________________________________________________
batch_normalization_59 (BatchNo (None, 14, 14, 1024) 4096        conv2d_59[0][0]
__________________________________________________________________________________________________
add_15 (Add)                    (None, 14, 14, 1024) 0           activation_50[0][0]
                                                                 batch_normalization_59[0][0]
__________________________________________________________________________________________________
activation_53 (Activation)      (None, 14, 14, 1024) 0           add_15[0][0]
__________________________________________________________________________________________________
conv2d_60 (Conv2D)              (None, 14, 14, 256)  262400      activation_53[0][0]
__________________________________________________________________________________________________
batch_normalization_60 (BatchNo (None, 14, 14, 256)  1024        conv2d_60[0][0]
__________________________________________________________________________________________________
activation_54 (Activation)      (None, 14, 14, 256)  0           batch_normalization_60[0][0]
__________________________________________________________________________________________________
conv2d_61 (Conv2D)              (None, 14, 14, 256)  590080      activation_54[0][0]
__________________________________________________________________________________________________
batch_normalization_61 (BatchNo (None, 14, 14, 256)  1024        conv2d_61[0][0]
__________________________________________________________________________________________________
activation_55 (Activation)      (None, 14, 14, 256)  0           batch_normalization_61[0][0]
__________________________________________________________________________________________________
conv2d_62 (Conv2D)              (None, 14, 14, 1024) 263168      activation_55[0][0]
__________________________________________________________________________________________________
batch_normalization_62 (BatchNo (None, 14, 14, 1024) 4096        conv2d_62[0][0]
__________________________________________________________________________________________________
add_16 (Add)                    (None, 14, 14, 1024) 0           activation_53[0][0]
                                                                 batch_normalization_62[0][0]
__________________________________________________________________________________________________
activation_56 (Activation)      (None, 14, 14, 1024) 0           add_16[0][0]
__________________________________________________________________________________________________
conv2d_63 (Conv2D)              (None, 14, 14, 256)  262400      activation_56[0][0]
__________________________________________________________________________________________________
batch_normalization_63 (BatchNo (None, 14, 14, 256)  1024        conv2d_63[0][0]
__________________________________________________________________________________________________
activation_57 (Activation)      (None, 14, 14, 256)  0           batch_normalization_63[0][0]
__________________________________________________________________________________________________
conv2d_64 (Conv2D)              (None, 14, 14, 256)  590080      activation_57[0][0]
__________________________________________________________________________________________________
batch_normalization_64 (BatchNo (None, 14, 14, 256)  1024        conv2d_64[0][0]
__________________________________________________________________________________________________
activation_58 (Activation)      (None, 14, 14, 256)  0           batch_normalization_64[0][0]
__________________________________________________________________________________________________
conv2d_65 (Conv2D)              (None, 14, 14, 1024) 263168      activation_58[0][0]
__________________________________________________________________________________________________
batch_normalization_65 (BatchNo (None, 14, 14, 1024) 4096        conv2d_65[0][0]
__________________________________________________________________________________________________
add_17 (Add)                    (None, 14, 14, 1024) 0           activation_56[0][0]
                                                                 batch_normalization_65[0][0]
__________________________________________________________________________________________________
activation_59 (Activation)      (None, 14, 14, 1024) 0           add_17[0][0]
__________________________________________________________________________________________________
conv2d_67 (Conv2D)              (None, 7, 7, 512)    524800      activation_59[0][0]
__________________________________________________________________________________________________
batch_normalization_67 (BatchNo (None, 7, 7, 512)    2048        conv2d_67[0][0]
__________________________________________________________________________________________________
activation_60 (Activation)      (None, 7, 7, 512)    0           batch_normalization_67[0][0]
__________________________________________________________________________________________________
conv2d_68 (Conv2D)              (None, 7, 7, 512)    2359808     activation_60[0][0]
__________________________________________________________________________________________________
batch_normalization_68 (BatchNo (None, 7, 7, 512)    2048        conv2d_68[0][0]
__________________________________________________________________________________________________
activation_61 (Activation)      (None, 7, 7, 512)    0           batch_normalization_68[0][0]
__________________________________________________________________________________________________
conv2d_66 (Conv2D)              (None, 7, 7, 2048)   2099200     activation_59[0][0]
__________________________________________________________________________________________________
conv2d_69 (Conv2D)              (None, 7, 7, 2048)   1050624     activation_61[0][0]
__________________________________________________________________________________________________
batch_normalization_66 (BatchNo (None, 7, 7, 2048)   8192        conv2d_66[0][0]
__________________________________________________________________________________________________
batch_normalization_69 (BatchNo (None, 7, 7, 2048)   8192        conv2d_69[0][0]
__________________________________________________________________________________________________
add_18 (Add)                    (None, 7, 7, 2048)   0           batch_normalization_66[0][0]
                                                                 batch_normalization_69[0][0]
__________________________________________________________________________________________________
activation_62 (Activation)      (None, 7, 7, 2048)   0           add_18[0][0]
__________________________________________________________________________________________________
conv2d_70 (Conv2D)              (None, 7, 7, 512)    1049088     activation_62[0][0]
__________________________________________________________________________________________________
batch_normalization_70 (BatchNo (None, 7, 7, 512)    2048        conv2d_70[0][0]
__________________________________________________________________________________________________
activation_63 (Activation)      (None, 7, 7, 512)    0           batch_normalization_70[0][0]
__________________________________________________________________________________________________
conv2d_71 (Conv2D)              (None, 7, 7, 512)    2359808     activation_63[0][0]
__________________________________________________________________________________________________
batch_normalization_71 (BatchNo (None, 7, 7, 512)    2048        conv2d_71[0][0]
__________________________________________________________________________________________________
activation_64 (Activation)      (None, 7, 7, 512)    0           batch_normalization_71[0][0]
__________________________________________________________________________________________________
conv2d_72 (Conv2D)              (None, 7, 7, 2048)   1050624     activation_64[0][0]
__________________________________________________________________________________________________
batch_normalization_72 (BatchNo (None, 7, 7, 2048)   8192        conv2d_72[0][0]
__________________________________________________________________________________________________
add_19 (Add)                    (None, 7, 7, 2048)   0           activation_62[0][0]
                                                                 batch_normalization_72[0][0]
__________________________________________________________________________________________________
activation_65 (Activation)      (None, 7, 7, 2048)   0           add_19[0][0]
__________________________________________________________________________________________________
conv2d_73 (Conv2D)              (None, 7, 7, 512)    1049088     activation_65[0][0]
__________________________________________________________________________________________________
batch_normalization_73 (BatchNo (None, 7, 7, 512)    2048        conv2d_73[0][0]
__________________________________________________________________________________________________
activation_66 (Activation)      (None, 7, 7, 512)    0           batch_normalization_73[0][0]
__________________________________________________________________________________________________
conv2d_74 (Conv2D)              (None, 7, 7, 512)    2359808     activation_66[0][0]
__________________________________________________________________________________________________
batch_normalization_74 (BatchNo (None, 7, 7, 512)    2048        conv2d_74[0][0]
__________________________________________________________________________________________________
activation_67 (Activation)      (None, 7, 7, 512)    0           batch_normalization_74[0][0]
__________________________________________________________________________________________________
conv2d_75 (Conv2D)              (None, 7, 7, 2048)   1050624     activation_67[0][0]
__________________________________________________________________________________________________
batch_normalization_75 (BatchNo (None, 7, 7, 2048)   8192        conv2d_75[0][0]
__________________________________________________________________________________________________
add_20 (Add)                    (None, 7, 7, 2048)   0           activation_65[0][0]
                                                                 batch_normalization_75[0][0]
__________________________________________________________________________________________________
activation_68 (Activation)      (None, 7, 7, 2048)   0           add_20[0][0]
==================================================================================================
Total params: 23,587,712
Trainable params: 23,534,592
Non-trainable params: 53,120
__________________________________________________________________________________________________

The Classification Layer

It’s important to keep the classification layer (called the top in the Keras docs) separate from the CNN. The CNN part can work on images of larger sizes too; a (224, 224, 3) input is needed only when the network is used for the original ImageNet classification task. For transfer learning, people often load the CNN (without the top classification layer) with pre-trained weights and train a different classification layer on top of it.
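
To make the transfer-learning idea concrete, here is a minimal sketch, assuming the cnn_resnet50 function built earlier in this tutorial; the 10-class head and the variable names are purely illustrative.

[ ]:
# a minimal transfer-learning sketch; the 10-way head is illustrative only
img_input = layers.Input(shape=(224, 224, 3))
base = models.Model(img_input, cnn_resnet50(img_input))
base.trainable = False  # freeze the CNN (e.g. after loading pre-trained weights)
net = layers.GlobalAveragePooling2D()(base.output)
output = layers.Dense(10, activation='softmax')(net)  # new 10-way classifier
transfer_model = models.Model(base.input, output)

In this setup only the 2048 * 10 + 10 weights of the new head are trained.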

Following is the top classification layer for ResNets. If the pretrained weights for ImageNet classification are to be loaded, the number of classes must be 1000.

  • The output of the CNN is of size (7, 7, 2048).

  • A simple global average pooling reduces this to a vector of length 2048.

  • This is followed by a dense layer with 1000 outputs (2048 * 1000 + 1000 = 2,049,000 weights).

  • The output of the dense layer goes through a softmax activation to convert it to classification probabilities.

[ ]:
def top_resnet(net, classes=1000):
    # add the top classification network
    net = layers.GlobalAveragePooling2D()(net)
    net = layers.Dense(classes, activation='softmax')(net)
    return net

We now combine the CNN with the classifier to build the overall ResNet 50 classifier, and wrap a model around it.

Voila! It’s done. We have a complete ResNet 50 architecture built for us.

[ ]:
def resnet_classifier(img_input):
    # the convolutional feature extractor
    net = cnn_resnet50(img_input)
    # the top classification network
    net = top_resnet(net)
    return net

model = build_model((224, 224, 3), resnet_classifier)
model.summary()
Model: "functional_11"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input_6 (InputLayer)            [(None, 224, 224, 3) 0
__________________________________________________________________________________________________
zero_padding2d_6 (ZeroPadding2D (None, 230, 230, 3)  0           input_6[0][0]
__________________________________________________________________________________________________
conv2d_76 (Conv2D)              (None, 112, 112, 64) 9472        zero_padding2d_6[0][0]
__________________________________________________________________________________________________
batch_normalization_76 (BatchNo (None, 112, 112, 64) 256         conv2d_76[0][0]
__________________________________________________________________________________________________
activation_69 (Activation)      (None, 112, 112, 64) 0           batch_normalization_76[0][0]
__________________________________________________________________________________________________
zero_padding2d_7 (ZeroPadding2D (None, 114, 114, 64) 0           activation_69[0][0]
__________________________________________________________________________________________________
max_pooling2d_3 (MaxPooling2D)  (None, 56, 56, 64)   0           zero_padding2d_7[0][0]
__________________________________________________________________________________________________
conv2d_78 (Conv2D)              (None, 56, 56, 64)   4160        max_pooling2d_3[0][0]
__________________________________________________________________________________________________
batch_normalization_78 (BatchNo (None, 56, 56, 64)   256         conv2d_78[0][0]
__________________________________________________________________________________________________
activation_70 (Activation)      (None, 56, 56, 64)   0           batch_normalization_78[0][0]
__________________________________________________________________________________________________
conv2d_79 (Conv2D)              (None, 56, 56, 64)   36928       activation_70[0][0]
__________________________________________________________________________________________________
batch_normalization_79 (BatchNo (None, 56, 56, 64)   256         conv2d_79[0][0]
__________________________________________________________________________________________________
activation_71 (Activation)      (None, 56, 56, 64)   0           batch_normalization_79[0][0]
__________________________________________________________________________________________________
conv2d_77 (Conv2D)              (None, 56, 56, 256)  16640       max_pooling2d_3[0][0]
__________________________________________________________________________________________________
conv2d_80 (Conv2D)              (None, 56, 56, 256)  16640       activation_71[0][0]
__________________________________________________________________________________________________
batch_normalization_77 (BatchNo (None, 56, 56, 256)  1024        conv2d_77[0][0]
__________________________________________________________________________________________________
batch_normalization_80 (BatchNo (None, 56, 56, 256)  1024        conv2d_80[0][0]
__________________________________________________________________________________________________
add_21 (Add)                    (None, 56, 56, 256)  0           batch_normalization_77[0][0]
                                                                 batch_normalization_80[0][0]
__________________________________________________________________________________________________
activation_72 (Activation)      (None, 56, 56, 256)  0           add_21[0][0]
__________________________________________________________________________________________________
conv2d_81 (Conv2D)              (None, 56, 56, 64)   16448       activation_72[0][0]
__________________________________________________________________________________________________
batch_normalization_81 (BatchNo (None, 56, 56, 64)   256         conv2d_81[0][0]
__________________________________________________________________________________________________
activation_73 (Activation)      (None, 56, 56, 64)   0           batch_normalization_81[0][0]
__________________________________________________________________________________________________
conv2d_82 (Conv2D)              (None, 56, 56, 64)   36928       activation_73[0][0]
__________________________________________________________________________________________________
batch_normalization_82 (BatchNo (None, 56, 56, 64)   256         conv2d_82[0][0]
__________________________________________________________________________________________________
activation_74 (Activation)      (None, 56, 56, 64)   0           batch_normalization_82[0][0]
__________________________________________________________________________________________________
conv2d_83 (Conv2D)              (None, 56, 56, 256)  16640       activation_74[0][0]
__________________________________________________________________________________________________
batch_normalization_83 (BatchNo (None, 56, 56, 256)  1024        conv2d_83[0][0]
__________________________________________________________________________________________________
add_22 (Add)                    (None, 56, 56, 256)  0           activation_72[0][0]
                                                                 batch_normalization_83[0][0]
__________________________________________________________________________________________________
activation_75 (Activation)      (None, 56, 56, 256)  0           add_22[0][0]
__________________________________________________________________________________________________
conv2d_84 (Conv2D)              (None, 56, 56, 64)   16448       activation_75[0][0]
__________________________________________________________________________________________________
batch_normalization_84 (BatchNo (None, 56, 56, 64)   256         conv2d_84[0][0]
__________________________________________________________________________________________________
activation_76 (Activation)      (None, 56, 56, 64)   0           batch_normalization_84[0][0]
__________________________________________________________________________________________________
conv2d_85 (Conv2D)              (None, 56, 56, 64)   36928       activation_76[0][0]
__________________________________________________________________________________________________
batch_normalization_85 (BatchNo (None, 56, 56, 64)   256         conv2d_85[0][0]
__________________________________________________________________________________________________
activation_77 (Activation)      (None, 56, 56, 64)   0           batch_normalization_85[0][0]
__________________________________________________________________________________________________
conv2d_86 (Conv2D)              (None, 56, 56, 256)  16640       activation_77[0][0]
__________________________________________________________________________________________________
batch_normalization_86 (BatchNo (None, 56, 56, 256)  1024        conv2d_86[0][0]
__________________________________________________________________________________________________
add_23 (Add)                    (None, 56, 56, 256)  0           activation_75[0][0]
                                                                 batch_normalization_86[0][0]
__________________________________________________________________________________________________
activation_78 (Activation)      (None, 56, 56, 256)  0           add_23[0][0]
__________________________________________________________________________________________________
conv2d_88 (Conv2D)              (None, 28, 28, 128)  32896       activation_78[0][0]
__________________________________________________________________________________________________
batch_normalization_88 (BatchNo (None, 28, 28, 128)  512         conv2d_88[0][0]
__________________________________________________________________________________________________
activation_79 (Activation)      (None, 28, 28, 128)  0           batch_normalization_88[0][0]
__________________________________________________________________________________________________
conv2d_89 (Conv2D)              (None, 28, 28, 128)  147584      activation_79[0][0]
__________________________________________________________________________________________________
batch_normalization_89 (BatchNo (None, 28, 28, 128)  512         conv2d_89[0][0]
__________________________________________________________________________________________________
activation_80 (Activation)      (None, 28, 28, 128)  0           batch_normalization_89[0][0]
__________________________________________________________________________________________________
conv2d_87 (Conv2D)              (None, 28, 28, 512)  131584      activation_78[0][0]
__________________________________________________________________________________________________
conv2d_90 (Conv2D)              (None, 28, 28, 512)  66048       activation_80[0][0]
__________________________________________________________________________________________________
batch_normalization_87 (BatchNo (None, 28, 28, 512)  2048        conv2d_87[0][0]
__________________________________________________________________________________________________
batch_normalization_90 (BatchNo (None, 28, 28, 512)  2048        conv2d_90[0][0]
__________________________________________________________________________________________________
add_24 (Add)                    (None, 28, 28, 512)  0           batch_normalization_87[0][0]
                                                                 batch_normalization_90[0][0]
__________________________________________________________________________________________________
activation_81 (Activation)      (None, 28, 28, 512)  0           add_24[0][0]
__________________________________________________________________________________________________
conv2d_91 (Conv2D)              (None, 28, 28, 128)  65664       activation_81[0][0]
__________________________________________________________________________________________________
batch_normalization_91 (BatchNo (None, 28, 28, 128)  512         conv2d_91[0][0]
__________________________________________________________________________________________________
activation_82 (Activation)      (None, 28, 28, 128)  0           batch_normalization_91[0][0]
__________________________________________________________________________________________________
conv2d_92 (Conv2D)              (None, 28, 28, 128)  147584      activation_82[0][0]
__________________________________________________________________________________________________
batch_normalization_92 (BatchNo (None, 28, 28, 128)  512         conv2d_92[0][0]
__________________________________________________________________________________________________
activation_83 (Activation)      (None, 28, 28, 128)  0           batch_normalization_92[0][0]
__________________________________________________________________________________________________
conv2d_93 (Conv2D)              (None, 28, 28, 512)  66048       activation_83[0][0]
__________________________________________________________________________________________________
batch_normalization_93 (BatchNo (None, 28, 28, 512)  2048        conv2d_93[0][0]
__________________________________________________________________________________________________
add_25 (Add)                    (None, 28, 28, 512)  0           activation_81[0][0]
                                                                 batch_normalization_93[0][0]
__________________________________________________________________________________________________
activation_84 (Activation)      (None, 28, 28, 512)  0           add_25[0][0]
__________________________________________________________________________________________________
conv2d_94 (Conv2D)              (None, 28, 28, 128)  65664       activation_84[0][0]
__________________________________________________________________________________________________
batch_normalization_94 (BatchNo (None, 28, 28, 128)  512         conv2d_94[0][0]
__________________________________________________________________________________________________
activation_85 (Activation)      (None, 28, 28, 128)  0           batch_normalization_94[0][0]
__________________________________________________________________________________________________
conv2d_95 (Conv2D)              (None, 28, 28, 128)  147584      activation_85[0][0]
__________________________________________________________________________________________________
batch_normalization_95 (BatchNo (None, 28, 28, 128)  512         conv2d_95[0][0]
__________________________________________________________________________________________________
activation_86 (Activation)      (None, 28, 28, 128)  0           batch_normalization_95[0][0]
__________________________________________________________________________________________________
conv2d_96 (Conv2D)              (None, 28, 28, 512)  66048       activation_86[0][0]
__________________________________________________________________________________________________
batch_normalization_96 (BatchNo (None, 28, 28, 512)  2048        conv2d_96[0][0]
__________________________________________________________________________________________________
add_26 (Add)                    (None, 28, 28, 512)  0           activation_84[0][0]
                                                                 batch_normalization_96[0][0]
__________________________________________________________________________________________________
activation_87 (Activation)      (None, 28, 28, 512)  0           add_26[0][0]
__________________________________________________________________________________________________
conv2d_97 (Conv2D)              (None, 28, 28, 128)  65664       activation_87[0][0]
__________________________________________________________________________________________________
batch_normalization_97 (BatchNo (None, 28, 28, 128)  512         conv2d_97[0][0]
__________________________________________________________________________________________________
activation_88 (Activation)      (None, 28, 28, 128)  0           batch_normalization_97[0][0]
__________________________________________________________________________________________________
conv2d_98 (Conv2D)              (None, 28, 28, 128)  147584      activation_88[0][0]
__________________________________________________________________________________________________
batch_normalization_98 (BatchNo (None, 28, 28, 128)  512         conv2d_98[0][0]
__________________________________________________________________________________________________
activation_89 (Activation)      (None, 28, 28, 128)  0           batch_normalization_98[0][0]
__________________________________________________________________________________________________
conv2d_99 (Conv2D)              (None, 28, 28, 512)  66048       activation_89[0][0]
__________________________________________________________________________________________________
batch_normalization_99 (BatchNo (None, 28, 28, 512)  2048        conv2d_99[0][0]
__________________________________________________________________________________________________
add_27 (Add)                    (None, 28, 28, 512)  0           activation_87[0][0]
                                                                 batch_normalization_99[0][0]
__________________________________________________________________________________________________
activation_90 (Activation)      (None, 28, 28, 512)  0           add_27[0][0]
__________________________________________________________________________________________________
conv2d_101 (Conv2D)             (None, 14, 14, 256)  131328      activation_90[0][0]
__________________________________________________________________________________________________
batch_normalization_101 (BatchN (None, 14, 14, 256)  1024        conv2d_101[0][0]
__________________________________________________________________________________________________
activation_91 (Activation)      (None, 14, 14, 256)  0           batch_normalization_101[0][0]
__________________________________________________________________________________________________
conv2d_102 (Conv2D)             (None, 14, 14, 256)  590080      activation_91[0][0]
__________________________________________________________________________________________________
batch_normalization_102 (BatchN (None, 14, 14, 256)  1024        conv2d_102[0][0]
__________________________________________________________________________________________________
activation_92 (Activation)      (None, 14, 14, 256)  0           batch_normalization_102[0][0]
__________________________________________________________________________________________________
conv2d_100 (Conv2D)             (None, 14, 14, 1024) 525312      activation_90[0][0]
__________________________________________________________________________________________________
conv2d_103 (Conv2D)             (None, 14, 14, 1024) 263168      activation_92[0][0]
__________________________________________________________________________________________________
batch_normalization_100 (BatchN (None, 14, 14, 1024) 4096        conv2d_100[0][0]
__________________________________________________________________________________________________
batch_normalization_103 (BatchN (None, 14, 14, 1024) 4096        conv2d_103[0][0]
__________________________________________________________________________________________________
add_28 (Add)                    (None, 14, 14, 1024) 0           batch_normalization_100[0][0]
                                                                 batch_normalization_103[0][0]
__________________________________________________________________________________________________
activation_93 (Activation)      (None, 14, 14, 1024) 0           add_28[0][0]
__________________________________________________________________________________________________
conv2d_104 (Conv2D)             (None, 14, 14, 256)  262400      activation_93[0][0]
__________________________________________________________________________________________________
batch_normalization_104 (BatchN (None, 14, 14, 256)  1024        conv2d_104[0][0]
__________________________________________________________________________________________________
activation_94 (Activation)      (None, 14, 14, 256)  0           batch_normalization_104[0][0]
__________________________________________________________________________________________________
conv2d_105 (Conv2D)             (None, 14, 14, 256)  590080      activation_94[0][0]
__________________________________________________________________________________________________
batch_normalization_105 (BatchN (None, 14, 14, 256)  1024        conv2d_105[0][0]
__________________________________________________________________________________________________
activation_95 (Activation)      (None, 14, 14, 256)  0           batch_normalization_105[0][0]
__________________________________________________________________________________________________
conv2d_106 (Conv2D)             (None, 14, 14, 1024) 263168      activation_95[0][0]
__________________________________________________________________________________________________
batch_normalization_106 (BatchN (None, 14, 14, 1024) 4096        conv2d_106[0][0]
__________________________________________________________________________________________________
add_29 (Add)                    (None, 14, 14, 1024) 0           activation_93[0][0]
                                                                 batch_normalization_106[0][0]
__________________________________________________________________________________________________
activation_96 (Activation)      (None, 14, 14, 1024) 0           add_29[0][0]
__________________________________________________________________________________________________
conv2d_107 (Conv2D)             (None, 14, 14, 256)  262400      activation_96[0][0]
__________________________________________________________________________________________________
batch_normalization_107 (BatchN (None, 14, 14, 256)  1024        conv2d_107[0][0]
__________________________________________________________________________________________________
activation_97 (Activation)      (None, 14, 14, 256)  0           batch_normalization_107[0][0]
__________________________________________________________________________________________________
conv2d_108 (Conv2D)             (None, 14, 14, 256)  590080      activation_97[0][0]
__________________________________________________________________________________________________
batch_normalization_108 (BatchN (None, 14, 14, 256)  1024        conv2d_108[0][0]
__________________________________________________________________________________________________
activation_98 (Activation)      (None, 14, 14, 256)  0           batch_normalization_108[0][0]
__________________________________________________________________________________________________
conv2d_109 (Conv2D)             (None, 14, 14, 1024) 263168      activation_98[0][0]
__________________________________________________________________________________________________
batch_normalization_109 (BatchN (None, 14, 14, 1024) 4096        conv2d_109[0][0]
__________________________________________________________________________________________________
add_30 (Add)                    (None, 14, 14, 1024) 0           activation_96[0][0]
                                                                 batch_normalization_109[0][0]
__________________________________________________________________________________________________
activation_99 (Activation)      (None, 14, 14, 1024) 0           add_30[0][0]
__________________________________________________________________________________________________
conv2d_110 (Conv2D)             (None, 14, 14, 256)  262400      activation_99[0][0]
__________________________________________________________________________________________________
batch_normalization_110 (BatchN (None, 14, 14, 256)  1024        conv2d_110[0][0]
__________________________________________________________________________________________________
activation_100 (Activation)     (None, 14, 14, 256)  0           batch_normalization_110[0][0]
__________________________________________________________________________________________________
conv2d_111 (Conv2D)             (None, 14, 14, 256)  590080      activation_100[0][0]
__________________________________________________________________________________________________
batch_normalization_111 (BatchN (None, 14, 14, 256)  1024        conv2d_111[0][0]
__________________________________________________________________________________________________
activation_101 (Activation)     (None, 14, 14, 256)  0           batch_normalization_111[0][0]
__________________________________________________________________________________________________
conv2d_112 (Conv2D)             (None, 14, 14, 1024) 263168      activation_101[0][0]
__________________________________________________________________________________________________
batch_normalization_112 (BatchN (None, 14, 14, 1024) 4096        conv2d_112[0][0]
__________________________________________________________________________________________________
add_31 (Add)                    (None, 14, 14, 1024) 0           activation_99[0][0]
                                                                 batch_normalization_112[0][0]
__________________________________________________________________________________________________
activation_102 (Activation)     (None, 14, 14, 1024) 0           add_31[0][0]
__________________________________________________________________________________________________
conv2d_113 (Conv2D)             (None, 14, 14, 256)  262400      activation_102[0][0]
__________________________________________________________________________________________________
batch_normalization_113 (BatchN (None, 14, 14, 256)  1024        conv2d_113[0][0]
__________________________________________________________________________________________________
activation_103 (Activation)     (None, 14, 14, 256)  0           batch_normalization_113[0][0]
__________________________________________________________________________________________________
conv2d_114 (Conv2D)             (None, 14, 14, 256)  590080      activation_103[0][0]
__________________________________________________________________________________________________
batch_normalization_114 (BatchN (None, 14, 14, 256)  1024        conv2d_114[0][0]
__________________________________________________________________________________________________
activation_104 (Activation)     (None, 14, 14, 256)  0           batch_normalization_114[0][0]
__________________________________________________________________________________________________
conv2d_115 (Conv2D)             (None, 14, 14, 1024) 263168      activation_104[0][0]
__________________________________________________________________________________________________
batch_normalization_115 (BatchN (None, 14, 14, 1024) 4096        conv2d_115[0][0]
__________________________________________________________________________________________________
add_32 (Add)                    (None, 14, 14, 1024) 0           activation_102[0][0]
                                                                 batch_normalization_115[0][0]
__________________________________________________________________________________________________
activation_105 (Activation)     (None, 14, 14, 1024) 0           add_32[0][0]
__________________________________________________________________________________________________
conv2d_116 (Conv2D)             (None, 14, 14, 256)  262400      activation_105[0][0]
__________________________________________________________________________________________________
batch_normalization_116 (BatchN (None, 14, 14, 256)  1024        conv2d_116[0][0]
__________________________________________________________________________________________________
activation_106 (Activation)     (None, 14, 14, 256)  0           batch_normalization_116[0][0]
__________________________________________________________________________________________________
conv2d_117 (Conv2D)             (None, 14, 14, 256)  590080      activation_106[0][0]
__________________________________________________________________________________________________
batch_normalization_117 (BatchN (None, 14, 14, 256)  1024        conv2d_117[0][0]
__________________________________________________________________________________________________
activation_107 (Activation)     (None, 14, 14, 256)  0           batch_normalization_117[0][0]
__________________________________________________________________________________________________
conv2d_118 (Conv2D)             (None, 14, 14, 1024) 263168      activation_107[0][0]
__________________________________________________________________________________________________
batch_normalization_118 (BatchN (None, 14, 14, 1024) 4096        conv2d_118[0][0]
__________________________________________________________________________________________________
add_33 (Add)                    (None, 14, 14, 1024) 0           activation_105[0][0]
                                                                 batch_normalization_118[0][0]
__________________________________________________________________________________________________
activation_108 (Activation)     (None, 14, 14, 1024) 0           add_33[0][0]
__________________________________________________________________________________________________
conv2d_120 (Conv2D)             (None, 7, 7, 512)    524800      activation_108[0][0]
__________________________________________________________________________________________________
batch_normalization_120 (BatchN (None, 7, 7, 512)    2048        conv2d_120[0][0]
__________________________________________________________________________________________________
activation_109 (Activation)     (None, 7, 7, 512)    0           batch_normalization_120[0][0]
__________________________________________________________________________________________________
conv2d_121 (Conv2D)             (None, 7, 7, 512)    2359808     activation_109[0][0]
__________________________________________________________________________________________________
batch_normalization_121 (BatchN (None, 7, 7, 512)    2048        conv2d_121[0][0]
__________________________________________________________________________________________________
activation_110 (Activation)     (None, 7, 7, 512)    0           batch_normalization_121[0][0]
__________________________________________________________________________________________________
conv2d_119 (Conv2D)             (None, 7, 7, 2048)   2099200     activation_108[0][0]
__________________________________________________________________________________________________
conv2d_122 (Conv2D)             (None, 7, 7, 2048)   1050624     activation_110[0][0]
__________________________________________________________________________________________________
batch_normalization_119 (BatchN (None, 7, 7, 2048)   8192        conv2d_119[0][0]
__________________________________________________________________________________________________
batch_normalization_122 (BatchN (None, 7, 7, 2048)   8192        conv2d_122[0][0]
__________________________________________________________________________________________________
add_34 (Add)                    (None, 7, 7, 2048)   0           batch_normalization_119[0][0]
                                                                 batch_normalization_122[0][0]
__________________________________________________________________________________________________
activation_111 (Activation)     (None, 7, 7, 2048)   0           add_34[0][0]
__________________________________________________________________________________________________
conv2d_123 (Conv2D)             (None, 7, 7, 512)    1049088     activation_111[0][0]
__________________________________________________________________________________________________
batch_normalization_123 (BatchN (None, 7, 7, 512)    2048        conv2d_123[0][0]
__________________________________________________________________________________________________
activation_112 (Activation)     (None, 7, 7, 512)    0           batch_normalization_123[0][0]
__________________________________________________________________________________________________
conv2d_124 (Conv2D)             (None, 7, 7, 512)    2359808     activation_112[0][0]
__________________________________________________________________________________________________
batch_normalization_124 (BatchN (None, 7, 7, 512)    2048        conv2d_124[0][0]
__________________________________________________________________________________________________
activation_113 (Activation)     (None, 7, 7, 512)    0           batch_normalization_124[0][0]
__________________________________________________________________________________________________
conv2d_125 (Conv2D)             (None, 7, 7, 2048)   1050624     activation_113[0][0]
__________________________________________________________________________________________________
batch_normalization_125 (BatchN (None, 7, 7, 2048)   8192        conv2d_125[0][0]
__________________________________________________________________________________________________
add_35 (Add)                    (None, 7, 7, 2048)   0           activation_111[0][0]
                                                                 batch_normalization_125[0][0]
__________________________________________________________________________________________________
activation_114 (Activation)     (None, 7, 7, 2048)   0           add_35[0][0]
__________________________________________________________________________________________________
conv2d_126 (Conv2D)             (None, 7, 7, 512)    1049088     activation_114[0][0]
__________________________________________________________________________________________________
batch_normalization_126 (BatchN (None, 7, 7, 512)    2048        conv2d_126[0][0]
__________________________________________________________________________________________________
activation_115 (Activation)     (None, 7, 7, 512)    0           batch_normalization_126[0][0]
__________________________________________________________________________________________________
conv2d_127 (Conv2D)             (None, 7, 7, 512)    2359808     activation_115[0][0]
__________________________________________________________________________________________________
batch_normalization_127 (BatchN (None, 7, 7, 512)    2048        conv2d_127[0][0]
__________________________________________________________________________________________________
activation_116 (Activation)     (None, 7, 7, 512)    0           batch_normalization_127[0][0]
__________________________________________________________________________________________________
conv2d_128 (Conv2D)             (None, 7, 7, 2048)   1050624     activation_116[0][0]
__________________________________________________________________________________________________
batch_normalization_128 (BatchN (None, 7, 7, 2048)   8192        conv2d_128[0][0]
__________________________________________________________________________________________________
add_36 (Add)                    (None, 7, 7, 2048)   0           activation_114[0][0]
                                                                 batch_normalization_128[0][0]
__________________________________________________________________________________________________
activation_117 (Activation)     (None, 7, 7, 2048)   0           add_36[0][0]
__________________________________________________________________________________________________
global_average_pooling2d (Globa (None, 2048)         0           activation_117[0][0]
__________________________________________________________________________________________________
dense (Dense)                   (None, 1000)         2049000     global_average_pooling2d[0][0]
==================================================================================================
Total params: 25,636,712
Trainable params: 25,583,592
Non-trainable params: 53,120
__________________________________________________________________________________________________
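
As a quick sanity check, the total parameter count has grown by exactly 2,049,000 (2048 * 1000 + 1000) over the headless CNN summarized earlier, which is precisely the contribution of the final dense layer.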

Where to go from here

A robust implementation involves some additional details that we have skipped in this tutorial:

  • Every layer should be named appropriately; consistent names also make it possible to load pretrained weights by layer name.

  • Some image processing systems may follow a channels-first approach where the image dimensions are (3, 224, 224).

  • Every image needs to go through basic preprocessing: converting the image from RGB to BGR format if necessary, then zero-centering each color channel with respect to the ImageNet dataset (see the sketch after this list).

  • Pretrained weights can be loaded with the model.load_weights function.

  • A global pooling layer should be added if the top classification network is dropped.
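
As a rough illustration of the preprocessing step, here is a minimal sketch of the caffe-style preprocessing the ImageNet weights expect. The channel means below are the widely used ImageNet values; the weight file path is hypothetical, and in practice one would rely on keras.applications.resnet50.preprocess_input and the weight files shipped with Keras Applications.

[ ]:
import numpy as np

# ImageNet channel means in BGR order (the widely used "caffe" values)
IMAGENET_BGR_MEANS = np.array([103.939, 116.779, 123.68])

def preprocess(images):
    # images: float array of shape (batch, 224, 224, 3) in RGB, pixels in [0, 255]
    images = images[..., ::-1]          # convert RGB -> BGR
    return images - IMAGENET_BGR_MEANS  # zero-center each channel

# hypothetical path; actual weight files ship with Keras Applications
# model.load_weights('path/to/resnet50_weights.h5')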

For a more complete implementation addressing these aspects, please look at the source code in Keras Applications repository mentioned above.

We hope that we have been able to give you first-hand experience of building large and complex convolutional networks easily with the help of Keras.

Enjoy!