Convolutional Neural Networks

2018/03/15

Categories: neuralNetworks deepLearning cnn Tags: computerVision cnn

Building Blocks of CNN

Edge Detection

Vertical Edge Detection

[Figure: vertical edge detection]

Vertical and Horizontal Edge Detection

[Figure: horizontal and vertical edge detection]

Convolution Operation
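
As a minimal sketch of the convolution operation applied to vertical edge detection, the NumPy snippet below slides a 3x3 filter over a small image with no padding and stride 1. The function name `conv2d_valid`, the 6x6 test image, and the filter values are illustrative assumptions, not taken from the original figures. Like most deep learning code, it computes a cross-correlation (the filter is not flipped).

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Cross-correlate a 2D image with a 2D kernel (no padding, stride 1)."""
    n_h, n_w = image.shape
    f_h, f_w = kernel.shape
    out = np.zeros((n_h - f_h + 1, n_w - f_w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Element-wise product of the current window and the kernel, summed.
            out[i, j] = np.sum(image[i:i + f_h, j:j + f_w] * kernel)
    return out

# Hypothetical 6x6 image: bright left half, dark right half.
image = np.array([[10, 10, 10, 0, 0, 0]] * 6, dtype=float)
# A common vertical edge filter: positive left column, negative right column.
vertical_filter = np.array([[1, 0, -1]] * 3, dtype=float)

edges = conv2d_valid(image, vertical_filter)
print(edges)  # 4x4 output; the two middle columns respond to the edge
```

The output is nonzero only where the window straddles the bright-to-dark transition, which is what makes the filter an edge detector.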


Padding


Valid and Same Convolutions


Strided Convolution
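
Padding, valid/same convolutions, and stride all come down to one output-size rule: for an n x n input, f x f filter, padding p, and stride s, the output side is floor((n + 2p - f) / s) + 1. The helper names below are hypothetical and the snippet is only a sketch of that arithmetic.

```python
def conv_output_size(n, f, p=0, s=1):
    """Output side length for input size n, filter size f, padding p, stride s."""
    return (n + 2 * p - f) // s + 1

def same_padding(f):
    """Padding that keeps the output size equal to the input (stride 1, odd f)."""
    return (f - 1) // 2

valid_size = conv_output_size(6, 3)                     # "valid": no padding
same_size = conv_output_size(6, 3, p=same_padding(3))   # "same": size preserved
strided_size = conv_output_size(7, 3, p=0, s=2)         # strided convolution
print(valid_size, same_size, strided_size)
```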


Convolution Over Volume

[Figure: RGB]

[Figure: multiple filters]

One Layer of a Convolutional Network

[Figure: one layer of a CNN]

Simple Convolutional Network Example

Types of layers in a convolutional network: convolution (CONV), pooling (POOL), and fully connected (FC).

[Figure: simple CNN]

Pooling Layer
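
A pooling layer has no learned parameters; it only needs a window size and a stride. The following is a rough sketch of max pooling on a single channel, assuming a 2x2 window with stride 2 (a common default, chosen here for illustration). The name `max_pool2d` is hypothetical.

```python
import numpy as np

def max_pool2d(x, f=2, s=2):
    """Max pooling over a 2D array with window size f and stride s."""
    n_h, n_w = x.shape
    out_h = (n_h - f) // s + 1
    out_w = (n_w - f) // s + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Keep only the largest activation in each window.
            out[i, j] = x[i * s:i * s + f, j * s:j * s + f].max()
    return out

x = np.arange(16, dtype=float).reshape(4, 4)
pooled = max_pool2d(x)
print(pooled)
```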


Neural Network Example (Inspired by LeNet-5)

[Figure: LeNet-5]

Why Convolutions?


CNN Step By Step - Python Code

Building a CNN Step by Step

CNN Application-Sign Recognition


Case Studies

Classic Networks

LeNet-5

[Figure: LeNet-5]

AlexNet

$$ \begin{aligned}
&(11\times 11\times 3+1)\times 96+(5\times 5\times 96+1)\times 256 + (3\times 3\times 256+1)\times 384 \\
&+(3\times3\times 384+1)\times 384+ (3\times3\times 384 +1)\times 256 + (9216+1)\times 4096 \\
&+(4096+1)\times 4096 + (4096+1)\times 10 =58{,}322{,}314
\end{aligned} $$
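
The parameter count above can be checked with a few lines of Python. This sketch follows the formula exactly as written, including its 10-unit output layer; the helper names and layer tuples are illustrative assumptions.

```python
def conv_params(f, c_in, c_out):
    """Weights plus one bias per filter for an f x f convolution."""
    return (f * f * c_in + 1) * c_out

def dense_params(n_in, n_out):
    """Weights plus one bias per unit for a fully connected layer."""
    return (n_in + 1) * n_out

# (filter size, input channels, output channels), read off the formula.
conv_layers = [(11, 3, 96), (5, 96, 256), (3, 256, 384), (3, 384, 384), (3, 384, 256)]
# (input units, output units), read off the formula.
dense_layers = [(9216, 4096), (4096, 4096), (4096, 10)]

total = sum(conv_params(*layer) for layer in conv_layers)
total += sum(dense_params(*layer) for layer in dense_layers)
print(f"{total:,}")
```

The fully connected layers dominate: the first one alone contributes (9216+1) x 4096 parameters, well over half of the total.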

[Figure: AlexNet]

VGG-16

[Figure: VGG-16]

Residual Networks (ResNets)

Residual Block

$$ a^{[\ell+2]}=g(z^{[\ell +2]}+a^{[\ell]}) $$
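
To make the skip connection concrete, here is a minimal sketch of a residual block built from two fully connected layers, assuming ReLU for the activation g and equal layer widths so that the addition is well defined. All names and values are hypothetical.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0)

def residual_block(a_l, W1, b1, W2, b2):
    """Two-layer residual block: a[l+2] = g(z[l+2] + a[l])."""
    z1 = W1 @ a_l + b1
    a1 = relu(z1)
    z2 = W2 @ a1 + b2
    # The shortcut adds a[l] before the second nonlinearity.
    return relu(z2 + a_l)

n = 4
a_l = np.array([1.0, 2.0, 0.0, 3.0])
zero_W, zero_b = np.zeros((n, n)), np.zeros(n)

# With all weights and biases at zero, the block reduces to the identity
# for non-negative inputs, since relu(0 + a_l) = a_l.
out = residual_block(a_l, zero_W, zero_b, zero_W, zero_b)
print(out)
```

This is the usual argument for why residual blocks are easy to train: the identity function is trivially representable, so extra blocks should not hurt performance.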

[Figure: residual block]

[Figures: ResNet]


Signs Recognition with ResNet - Python Code

jupyter notebook


One-By-One Convolution
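
A 1x1 convolution can be viewed as applying the same small fully connected layer to the channel vector at every spatial position. The sketch below assumes a ReLU after the linear step and uses hypothetical shapes (6x6x32 input, 8 filters).

```python
import numpy as np

def conv1x1(x, W, b):
    """1x1 convolution with ReLU.

    x: (n_H, n_W, n_C_in), W: (n_C_in, n_C_out), b: (n_C_out,)
    """
    # Matrix multiplication acts on the last axis, i.e. per pixel across channels.
    return np.maximum(x @ W + b, 0)

rng = np.random.default_rng(0)
x = rng.normal(size=(6, 6, 32))
W = rng.normal(size=(32, 8))
b = np.zeros(8)

out = conv1x1(x, W, b)
print(out.shape)  # height and width unchanged, channels reduced from 32 to 8
```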

[Figures: one-by-one convolution]


Inception Network

[Figures: inception]

Inception Module

[Figure: inception module]

[Figure: inception network]


Practical Advice for Using ConvNets

  1. Use network architectures published in the literature.
  2. Use open-source implementations where possible.
  3. Use pretrained models and fine-tune them on your dataset.

Using Open-Source Implementation

Transfer Learning

[Figure: transfer learning]


Data Augmentation


Implementing Distortion During Training

[Figure: CPU]


Tips for Doing Well on Benchmarks/Winning Competitions

[Figure: crop]


Detection Algorithms

[Figure: localization]

Classification With Localization

$$ y = \begin{pmatrix} p_c\\ b_x\\ b_y\\ b_h\\ b_w\\ c_1\\ c_2\\ c_3 \end{pmatrix},\ p_c=\begin{cases} 1,& \exists\text{ object}\\ 0,&\text{otherwise} \end{cases},\ c_i=\begin{cases} 1,& \exists\text{ object }i\\ 0,&\text{otherwise} \end{cases} $$

[Figure: localization]

$$ \mathcal{L}(\hat{y},y)=\begin{cases} \Vert \hat{y}-y\Vert_2^2, &\text{ if }y_1=1 \\ (\hat{y}_1-y_1)^2, &\text{ if }y_1=0 \end{cases} $$
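
The piecewise loss can be sketched directly in NumPy. The target layout follows the vector y defined above (p_c first, then the box, then three class indicators); the function name and the sample values are hypothetical.

```python
import numpy as np

def localization_loss(y_hat, y):
    """Squared-error loss that ignores box and class terms when no object is present."""
    y_hat = np.asarray(y_hat, dtype=float)
    y = np.asarray(y, dtype=float)
    if y[0] == 1:
        # Object present: penalize every component.
        return float(np.sum((y_hat - y) ** 2))
    # No object: only the objectness score p_c matters.
    return float((y_hat[0] - y[0]) ** 2)

# Layout: [p_c, b_x, b_y, b_h, b_w, c_1, c_2, c_3]
y_obj = [1, 0.5, 0.5, 0.2, 0.3, 0, 1, 0]
y_hat_obj = [0.5, 0.5, 0.5, 0.2, 0.3, 0, 1, 0]
y_bg = [0, 0, 0, 0, 0, 0, 0, 0]
y_hat_bg = [0.5, 0.9, 0.9, 0.9, 0.9, 1, 1, 1]

loss_obj = localization_loss(y_hat_obj, y_obj)
loss_bg = localization_loss(y_hat_bg, y_bg)
print(loss_obj, loss_bg)
```

In the background case the box and class predictions are arbitrary, yet the loss depends only on the first component.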


Landmark Detection

[Figure: landmark detection]

Object Detection

[Figure: sliding window]

Convolutional Implementation of Sliding Windows

[Figure: FC]

[Figures: sliding window]


Bounding Box Prediction

[Figure: YOLO]

[Figure: bounding boxes]


Intersection Over Union (IOU)

$$ \text{“correct” if }\mathrm{IOU} = \frac{\text{size of intersection area}}{\text{size of union area}} \geq 0.5 $$
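
A minimal sketch of the IOU computation, assuming axis-aligned boxes given as corner coordinates `(x1, y1, x2, y2)`. That box format and the function name are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    # Clamp at zero so disjoint boxes give an empty intersection.
    inter = max(ix2 - ix1, 0) * max(iy2 - iy1, 0)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union

overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))  # intersection 1, union 7
print(overlap, overlap >= 0.5)
```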

[Figure: IOU]


Non-max Suppression
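
Non-max suppression is usually described as: discard low-confidence boxes, then repeatedly keep the highest-scoring remaining box and drop every box that overlaps it too much. The sketch below handles a single class; the thresholds 0.6 and 0.5 and all names are illustrative assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(min(a[2], b[2]) - max(a[0], b[0]), 0)
    ih = max(min(a[3], b[3]) - max(a[1], b[1]), 0)
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union

def non_max_suppression(boxes, scores, score_threshold=0.6, iou_threshold=0.5):
    """Return indices of the boxes kept, highest score first."""
    # Step 1: drop boxes with low confidence.
    candidates = [i for i, s in enumerate(scores) if s >= score_threshold]
    candidates.sort(key=lambda i: scores[i], reverse=True)
    keep = []
    while candidates:
        # Step 2: keep the most confident remaining box.
        best = candidates.pop(0)
        keep.append(best)
        # Step 3: suppress boxes that overlap it heavily.
        candidates = [i for i in candidates
                      if iou(boxes[best], boxes[i]) < iou_threshold]
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30), (0, 0, 5, 5)]
scores = [0.9, 0.8, 0.7, 0.3]
kept = non_max_suppression(boxes, scores)
print(kept)
```

With several classes, a common choice is to run this procedure once per class.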

[Figures: non-max suppression]


Anchor Boxes

[Figures: anchor boxes]


YOLO Algorithm

[Figure: YOLO training]

[Figure: YOLO prediction]

[Figure: YOLO output]


Region Proposals: R-CNN

[Figure: R-CNN]


Face Verification vs Face Recognition


One Shot Learning


Siamese Network

$$ d(x^{(1)}, x^{(2)})=\Vert f(x^{(1)})-f(x^{(2)})\Vert_2^2 $$

[Figure: Siamese network]


Triplet Loss

$$ \mathcal{L}(A,P,N)=\max(\Vert f(A)-f(P)\Vert^2- \Vert f(A)-f(N)\Vert^2+\alpha, 0) $$

$$ J = \sum_{i=1}^m\mathcal{L}(A^{(i)}, P^{(i)}, N^{(i)}) $$
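
The triplet loss and the cost J translate almost line by line into NumPy once the embeddings f(A), f(P), f(N) are available. The sketch assumes precomputed embedding vectors and a margin of 0.2; both the margin value and the function names are hypothetical.

```python
import numpy as np

def triplet_loss(f_a, f_p, f_n, alpha=0.2):
    """max(||f(A)-f(P)||^2 - ||f(A)-f(N)||^2 + alpha, 0) for one triplet."""
    d_pos = np.sum((f_a - f_p) ** 2)  # squared distance anchor-positive
    d_neg = np.sum((f_a - f_n) ** 2)  # squared distance anchor-negative
    return float(max(d_pos - d_neg + alpha, 0.0))

def triplet_cost(anchors, positives, negatives, alpha=0.2):
    """Sum of the triplet losses over all m training triplets."""
    return sum(triplet_loss(a, p, n, alpha)
               for a, p, n in zip(anchors, positives, negatives))

f_a = np.array([0.0, 0.0])
f_p = np.array([0.0, 0.0])
easy_negative = np.array([1.0, 0.0])  # far from the anchor
hard_negative = np.array([0.0, 0.0])  # indistinguishable from the anchor

easy = triplet_loss(f_a, f_p, easy_negative)
hard = triplet_loss(f_a, f_p, hard_negative)
cost = triplet_cost([f_a, f_a], [f_p, f_p], [easy_negative, hard_negative])
print(easy, hard, cost)
```

The easy triplet contributes nothing, while the hard one is penalized by exactly the margin, which is why hard triplets are the informative ones during training.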


Face Verification and Binary Classification

[Figures: face verification]


Neural Style Transfer

[Figure: style transfer]

What are Deep ConvNets Learning?

[Figures: visualization]

Visualizing and understanding convolutional networks


Neural Style Transfer Cost Function

$$ J(G)=\alpha J_{content}(C,G) +\beta J_{style}(S,G) $$


Content Cost Function

$$ J_{content}(C,G):=\frac{1}{2}\Vert a^{[\ell](C)}-a^{[\ell] (G)}\Vert^2 $$
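
As a small sketch, the content cost is half the squared difference between the layer activations of the content image and the generated image. The arrays below stand in for activations of a chosen layer; their shapes and values are hypothetical.

```python
import numpy as np

def content_cost(a_C, a_G):
    """0.5 * ||a(C) - a(G)||^2 over all entries of the activation volume."""
    return 0.5 * float(np.sum((a_C - a_G) ** 2))

a_C = np.ones((2, 2, 3))   # stand-in activations for the content image
a_G = np.zeros((2, 2, 3))  # stand-in activations for the generated image

j_content = content_cost(a_C, a_G)
print(j_content)
```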


Style Cost Function

$$ a^{[\ell]}_{i,j,k}=\text{activation at }(i,j,k).\\
G_{kk'}^{[\ell]}=\sum_{i=1}^{n_H^{[\ell]}}\sum_{j=1}^{n_W^{[\ell]}} a_{i,j,k}^{[\ell]}a_{i,j,k'}^{[\ell]},\ \ k,k'=1,2,\ldots,n_C^{[\ell]}.\\
G^{[\ell]}=(G_{kk'}^{[\ell]})\in\mathbb{R}^{n_C^{[\ell]}\times n_C^{[\ell]}}. $$

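$$ J_{style}^{[\ell]}(S,G)=\frac{1}{(2n_{H}^{[\ell]}n_W^{[\ell]}n_C^{[\ell]})^2}\Vert G^{[\ell](S)}-G^{[\ell](G)}\Vert^2_F=\frac{1}{(2n_{H}^{[\ell]}n_W^{[\ell]}n_C^{[\ell]})^2}\sum_{k,k'}\left(G^{[\ell](S)}_{kk'}-G^{[\ell](G)}_{kk'}\right)^2 $$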

$$ J_{style}(S,G)=\sum_\ell \lambda^{[\ell]} J_{style}^{[\ell]}(S,G) $$
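
The Gram matrix and style cost can be sketched in NumPy as follows, reading the per-layer cost as the squared Frobenius norm of the difference between the two Gram matrices. Activations are assumed to have shape (n_H, n_W, n_C); function names and test values are hypothetical.

```python
import numpy as np

def gram_matrix(a):
    """Gram matrix G[k, k'] = sum over (i, j) of a[i, j, k] * a[i, j, k']."""
    n_h, n_w, n_c = a.shape
    flat = a.reshape(n_h * n_w, n_c)  # one row per spatial position
    return flat.T @ flat              # shape (n_C, n_C)

def layer_style_cost(a_S, a_G):
    """Style cost for one layer, normalized by (2 * n_H * n_W * n_C)^2."""
    n_h, n_w, n_c = a_S.shape
    diff = gram_matrix(a_S) - gram_matrix(a_G)
    return float(np.sum(diff ** 2)) / (2 * n_h * n_w * n_c) ** 2

def style_cost(acts_S, acts_G, lambdas):
    """Weighted sum of the per-layer style costs."""
    return sum(lam * layer_style_cost(s, g)
               for s, g, lam in zip(acts_S, acts_G, lambdas))

a_S = np.ones((1, 1, 2))   # stand-in activations for the style image
a_G = np.zeros((1, 1, 2))  # stand-in activations for the generated image

j_layer = layer_style_cost(a_S, a_G)
j_style = style_cost([a_S, a_S], [a_G, a_G], lambdas=[0.5, 0.5])
print(j_layer, j_style)
```

Because the Gram matrix sums over all spatial positions, it records which channels fire together while discarding where they fire, which is the sense in which it captures style rather than content.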


Convolutions in 1D and 3D
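
The same sliding-window idea carries over to other dimensions: each axis of size n shrinks to n - f + 1 for a filter of size f (no padding, stride 1). The sketch below uses `np.correlate` for the 1D case and only computes the output shape for the 3D case; the signal, the averaging filter, and the sizes 14 and 5 are illustrative assumptions.

```python
import numpy as np

# 1D: a length-14 signal with a length-5 filter gives 14 - 5 + 1 = 10 outputs.
signal = np.arange(14, dtype=float)
kernel = np.ones(5) / 5  # simple moving-average filter
out_1d = np.correlate(signal, kernel, mode="valid")

# 3D: the same rule applies independently along each axis.
volume_shape = (14, 14, 14)
filter_size = 5
out_3d_shape = tuple(n - filter_size + 1 for n in volume_shape)

print(out_1d.shape, out_3d_shape)
```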

[Figure: 1D convolution]

[Figure: 3D convolution]