CS231n-CNN for Visual Recognition-Assignment1


Categories: neuralNetworks deepLearning Tags: knn svm softmax cnn imageClassification

Image Classification



Data-Driven Approach

Image Classification Pipeline

Example Image Classification Dataset: CIFAR-10



Load Data

Xtr_orig, ytr, Xte_orig, yte = load_CIFAR10(cifar10_dir) # a magic function we provide
# flatten out all images to be one-dimensional
Xtr = Xtr_orig.reshape(Xtr_orig.shape[0], -1) # Xtr becomes 50000 x 3072
Xte = Xte_orig.reshape(Xte_orig.shape[0], -1) # Xte becomes 10000 x 3072

Train and Evaluate a Classifier

knn = KNearestNeighbor() # create a kNN classifier instance
knn.train(Xtr, ytr) # train the classifier on the training images and labels
yte_predict = knn.predict(Xte) # predict labels on the test images
# accuracy = fraction of examples that were correctly predicted
accuracy = np.mean(yte_predict == yte)
print('accuracy: %f' % accuracy)

Implementation of kNN Classifier with L2 Distance

Vectorization of Distance Computation

$$ A = \begin{pmatrix} a_1\newline \vdots\newline a_N \end{pmatrix} \in \mathbb{R}^{N \times D},\ \ B = \begin{pmatrix} b_1\newline \vdots\newline b_M \end{pmatrix} \in \mathbb{R}^{M \times D} $$

$$ E_{ij} = \Vert b_i- a_j\Vert_2, \ \ E=(E_{ij})\in\mathbb{R}^{M\times N} $$

$$ E_{ij}^2=-2b_i a_j^T +\Vert b_i\Vert_2^2 +\Vert a_j\Vert_2^2 =-2 (BA^T)_{ij} +\Vert b_i\Vert_2^2 +\Vert a_j\Vert_2^2 $$

b = np.sum(B * B, axis = 1, keepdims = True)
a = np.sum(A * A, axis = 1, keepdims = True)
# numpy broadcasting: (M, N) + (M, 1) + (1, N); clip at 0 to guard against round-off
E = np.sqrt(np.maximum(-2 * np.dot(B, A.T) + b + a.T, 0))
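As a sanity check on the expansion above, the vectorized result can be compared against a plain double loop on small random inputs. This is a minimal sketch: the function names `pairwise_l2_loops` and `pairwise_l2_vectorized` are illustrative, not part of the assignment code, and the clipping at zero is an added guard against floating-point round-off.

```python
import numpy as np

def pairwise_l2_loops(B, A):
    """Reference version: compute E[i, j] = ||b_i - a_j||_2 one pair at a time."""
    M, N = B.shape[0], A.shape[0]
    E = np.zeros((M, N))
    for i in range(M):
        for j in range(N):
            E[i, j] = np.sqrt(np.sum((B[i] - A[j]) ** 2))
    return E

def pairwise_l2_vectorized(B, A):
    """Same distances via ||b||^2 - 2 b.a + ||a||^2 and broadcasting."""
    b = np.sum(B * B, axis=1, keepdims=True)   # shape (M, 1)
    a = np.sum(A * A, axis=1, keepdims=True)   # shape (N, 1)
    sq = -2 * np.dot(B, A.T) + b + a.T         # shape (M, N)
    # round-off can leave tiny negative values; clip before the square root
    return np.sqrt(np.maximum(sq, 0))

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 4))   # N = 5 "training" rows, D = 4
B = rng.standard_normal((3, 4))   # M = 3 "test" rows, D = 4

E_loop = pairwise_l2_loops(B, A)
E_vec = pairwise_l2_vectorized(B, A)
print(E_vec.shape, np.allclose(E_loop, E_vec))
```

The loop version performs M×N small reductions in Python, while the vectorized version delegates the bulk of the work to a single matrix product, which is why it is the one worth using at CIFAR-10 scale.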

import numpy as np
from scipy.stats import mode

class KNearestNeighbor(object):
    """ a kNN classifier with L2 distance """
    def __init__(self):
        pass

    def train(self, X, y):
        """
        X: X.shape == (N, D), N examples, each of dim D
        y: y.shape == (N,)
           y[i] is the label of X[i]
        """
        # the nearest neighbor classifier simply remembers all the training data
        self.Xtrain = X
        self.ytrain = y

    def computeDistances(self, X):
        """
        Compute the distances between each test point in X
        and each training point in self.Xtrain
        X: each row is an example we wish to predict label for
           X.shape == (ntest, D)
        dists: dists.shape == (ntest, ntrain)
               dists[i, j] == L2 distance between X[i] and self.Xtrain[j]
        """
        ntest, ntrain = X.shape[0], self.Xtrain.shape[0]

        te = np.sum(X * X, axis = 1, keepdims = True)
        tr = np.sum(self.Xtrain * self.Xtrain, axis = 1, keepdims = True)
        # clip at 0: round-off can make squared distances slightly negative
        sq = np.maximum(-2 * np.dot(X, self.Xtrain.T) + te + tr.T, 0)
        dists = np.sqrt(sq)
        return dists

    def predict(self, X, k = 1):
        """
        Predict labels for test data using this classifier.

        X: each row is an example we wish to predict label for
           X.shape == (ntest, D)
        ypred: ypred.shape == (ntest,)
               ypred[i] is the predicted label for X[i]
        """
        dists = self.computeDistances(X)
        ntest = X.shape[0]
        # ith row: indices of k nearest neighbors of X[i]
        k_idx = dists.argsort(axis = 1)[:, :k]
        # ith row: labels of k nearest neighbors of X[i]
        closest_y = self.ytrain[k_idx]
        # ith row: most common label in closest_y[i]
        y_pred = mode(closest_y, axis = 1).mode

        # .mode is (ntest, 1) or (ntest,) depending on the SciPy version
        return np.reshape(y_pred, ntest)
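The voting step in `predict` relies on `scipy.stats.mode`. As an alternative sketch, assuming the labels are non-negative integers (as they are for CIFAR-10 class indices), the same majority vote can be written with NumPy alone. The helper name `majority_vote` is hypothetical and only used for illustration here.

```python
import numpy as np

def majority_vote(closest_y):
    """
    closest_y: (ntest, k) array of non-negative integer labels,
               one row of neighbor labels per test example.
    Returns a (ntest,) array with the most common label in each row.
    Ties go to the smallest label, because argmax returns the first maximum.
    """
    return np.array([np.bincount(row).argmax() for row in closest_y])

closest_y = np.array([[2, 2, 5],
                      [1, 0, 0],
                      [3, 4, 4]])
votes = majority_vote(closest_y)
print(votes)
```

The per-row Python loop is slower than a fully vectorized mode, but the cost is usually small next to the distance computation.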

Hyperparameter Tuning

Evaluate on the test set only a single time, at the very end.

Split your training set into a training set and a validation set. Use the validation set to tune all hyperparameters. At the end, run a single time on the test set and report performance.
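The procedure above can be sketched on synthetic data as follows. Everything in this block is an assumption made for illustration: the two Gaussian blobs stand in for real images, `knn_predict` is a compact stand-alone helper rather than the class defined earlier, and the candidate values of k and the size of the validation split are arbitrary.

```python
import numpy as np

def knn_predict(Xtrain, ytrain, X, k):
    """Predict integer labels for rows of X by majority vote among k nearest rows of Xtrain."""
    te = np.sum(X * X, axis=1, keepdims=True)
    tr = np.sum(Xtrain * Xtrain, axis=1, keepdims=True)
    # squared L2 distances; the square root is skipped because it does not change the ranking
    sq = np.maximum(-2 * np.dot(X, Xtrain.T) + te + tr.T, 0)
    k_idx = np.argsort(sq, axis=1)[:, :k]
    closest_y = ytrain[k_idx]
    return np.array([np.bincount(row).argmax() for row in closest_y])

# synthetic stand-in for a labeled training set: two well-separated 2-D blobs
rng = np.random.default_rng(0)
X0 = rng.standard_normal((100, 2)) + np.array([-3.0, 0.0])
X1 = rng.standard_normal((100, 2)) + np.array([3.0, 0.0])
X = np.vstack([X0, X1])
y = np.array([0] * 100 + [1] * 100)

# shuffle, then hold out part of the training data as a validation set
perm = rng.permutation(len(X))
X, y = X[perm], y[perm]
n_val = 40
Xval, yval = X[:n_val], y[:n_val]
Xtr, ytr = X[n_val:], y[n_val:]

# tune k on the validation set only; the test set is never touched here
val_acc = {}
for k in [1, 3, 5, 7]:
    val_acc[k] = np.mean(knn_predict(Xtr, ytr, Xval, k) == yval)

best_k = max(val_acc, key=val_acc.get)
print(val_acc, best_k)
```

Only after `best_k` is fixed would the classifier be run once on the held-out test set to report the final accuracy.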

Pros and Cons of kNN

With $N$ examples, how fast are training and prediction?

Approximate Nearest Neighbor (ANN) algorithms (e.g. FLANN)

Python Code

kNN in Practice (Never Used on Images)

Linear Classification

Multiclass SVM


Two-layer Neural Network