CS231n - CNN for Visual Recognition - Assignment 1

2018/03/07

Categories: neuralNetworks deepLearning Tags: knn svm softmax cnn imageClassification

Image Classification

Challenges

Viewpoint variation, scale variation, deformation, occlusion, illumination conditions, background clutter, intra-class variation.


Data-Driven Approach


Image Classification Pipeline


Example Image Classification Dataset: CIFAR-10

60,000 32x32 color images in 10 classes: 50,000 training images and 10,000 test images.


kNN


Load Data

import numpy as np

Xtr_orig, Ytr, Xte_orig, Yte = load_CIFAR10(cifar10_dir) # a magic function we provide
# flatten out all images to be one-dimensional (32 x 32 x 3 = 3072 values per image)
Xtr = Xtr_orig.reshape(Xtr_orig.shape[0], -1) # Xtr becomes 50000 x 3072
Xte = Xte_orig.reshape(Xte_orig.shape[0], -1) # Xte becomes 10000 x 3072

Train and Evaluate a Classifier

knn = KNearestNeighbor() # create a kNN classifier instance
knn.train(Xtr, Ytr) # train the classifier on the training images and labels
Yte_predict = knn.predict(Xte) # predict labels on the test images
# accuracy = fraction of examples that were correctly predicted
accuracy = np.mean(Yte_predict == Yte)
print('accuracy: %f' % accuracy)

Implementation of kNN Classifier with L2 Distance

Vectorization of Distance Computation

$$ A = \begin{pmatrix} a_1\newline \vdots\newline a_N \end{pmatrix} \in \mathbb{R}^{N \times D},\ \ B = \begin{pmatrix} b_1\newline \vdots\newline b_M \end{pmatrix} \in \mathbb{R}^{M \times D} $$

$$ E_{ij} = \Vert b_i- a_j\Vert_2, \ \ E=(E_{ij})\in\mathbb{R}^{M\times N} $$


$$ E_{ij}^2 = \Vert b_i - a_j \Vert_2^2 = (b_i - a_j)(b_i - a_j)^T = -2 b_i a_j^T + \Vert b_i \Vert_2^2 + \Vert a_j \Vert_2^2 = -2 (BA^T)_{ij} + \Vert b_i \Vert_2^2 + \Vert a_j \Vert_2^2 $$

b = np.sum(B * B, axis = 1, keepdims = True)
a = np.sum(A * A, axis = 1, keepdims = True)
E = np.sqrt(-2 * np.dot(B, A.T) + b + a.T) # numpy broadcasting
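
As a sanity check, the vectorized distances can be compared against a naive two-loop version (a minimal sketch on small random matrices):

import numpy as np

A = np.random.randn(4, 3) # N = 4 rows of dimension D = 3
B = np.random.randn(5, 3) # M = 5 rows of dimension D = 3

# naive two-loop version
E_loop = np.zeros((5, 4))
for i in range(5):
    for j in range(4):
        E_loop[i, j] = np.sqrt(np.sum((B[i] - A[j]) ** 2))

# vectorized version from above
b = np.sum(B * B, axis = 1, keepdims = True)
a = np.sum(A * A, axis = 1, keepdims = True)
E = np.sqrt(-2 * np.dot(B, A.T) + b + a.T)

print(np.allclose(E, E_loop)) # should print True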

import numpy as np
from scipy.stats import mode

class KNearestNeighbor(object):
    """ a kNN classifier with L2 distance """
    def __init__(self):
        pass

    def train(self, X, y):
        """
        X: X.shape == (N, D), N examples, each of dim D
        y: y.shape == (N,)
           y[i] is the label of X[i]
        """
        # the nearest neighbor classifier simply remembers all the training data
        self.Xtrain = X
        self.ytrain = y

    def computeDistances(self, X):
        """
        Compute the distances between each test point in X
        and each training point in self.Xtrain
        Input:
        X: each row is an example we wish to predict label for
           X.shape == (ntest, D)
        Output:
        dists: dists.shape == (ntest, ntrain)
               dists[i, j] == L2 distance between X[i] and self.Xtrain[j]
        """
        ntest, ntrain = X.shape[0], self.Xtrain.shape[0]

        te = np.sum(X * X, axis = 1, keepdims = True)
        tr = np.sum(self.Xtrain * self.Xtrain, axis = 1, keepdims = True)
        # clamp tiny negatives from floating-point error before the sqrt
        dists = np.sqrt(np.maximum(-2 * np.dot(X, self.Xtrain.T) + te + tr.T, 0))
        return dists

    def predict(self, X, k = 1):
        """
        Predict labels for test data using this classifier.

        Input:
        X: each row is an example we wish to predict label for
           X.shape == (ntest, D)
        k: number of nearest neighbors that vote on each predicted label
        Output:
        ypred: ypred.shape == (ntest,)
               ypred[i] is the predicted label for X[i]
        """
        dists = self.computeDistances(X)
        ntest = X.shape[0]
        # ith row: indices of k nearest neighbors of X[i]
        k_idx = dists.argsort(axis = 1)[:, :k]
        # ith row: labels of k nearest neighbors of X[i]
        closest_y = self.ytrain[k_idx]
        # ith row: most common label in closest_y[i]
        y_pred = mode(closest_y, axis = 1).mode

        return np.squeeze(y_pred)
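
A quick usage sketch with k > 1, assuming the CIFAR-10 arrays loaded earlier:

knn = KNearestNeighbor()
knn.train(Xtr, Ytr)
Yte_pred = knn.predict(Xte, k = 5) # vote among the 5 nearest neighbors
print('accuracy: %f' % np.mean(Yte_pred == Yte))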

Hyperparameter Tuning

Evaluate on the test set only a single time, at the very end.

Split your training set into a smaller training set and a validation set. Use the validation set to tune all hyperparameters (here, the number of neighbors $k$); a cross-validation sketch follows below. At the very end, run on the test set a single time and report that performance.
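
A minimal 5-fold cross-validation sketch for choosing $k$, assuming Xtr, Ytr and the KNearestNeighbor class from above (the fold count and k_choices are illustrative; in practice you would subsample CIFAR-10 first, since kNN prediction is expensive):

num_folds = 5
k_choices = [1, 3, 5, 8, 10, 12, 15, 20, 50, 100]

X_folds = np.array_split(Xtr, num_folds)
y_folds = np.array_split(Ytr, num_folds)

k_accuracies = {}
for k in k_choices:
    accs = []
    for i in range(num_folds):
        # fold i is the validation fold; the remaining folds form the training set
        X_val, y_val = X_folds[i], y_folds[i]
        X_tr = np.concatenate(X_folds[:i] + X_folds[i + 1:])
        y_tr = np.concatenate(y_folds[:i] + y_folds[i + 1:])
        knn = KNearestNeighbor()
        knn.train(X_tr, y_tr)
        accs.append(np.mean(knn.predict(X_val, k = k) == y_val))
    k_accuracies[k] = np.mean(accs)

best_k = max(k_accuracies, key = k_accuracies.get)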


Pros and Cons of kNN

With $N$ examples, training takes $O(1)$ (the classifier simply memorizes the data), but predicting a single test example takes $O(N)$. This is backwards: we want classifiers that are fast at prediction, even at the cost of slow training.

Approximate Nearest Neighbor (ANN) algorithms (e.g. FLANN) trade retrieval exactness for speed by precomputing an index, e.g. a kd-tree or a k-means quantization of the training data.


Python Code

Jupyter Notebook

.py


kNN in Practice (Never Used on Images)

Pixel-space L2 distances are dominated by backgrounds, lighting, and color rather than semantic content, and prediction time grows with the training set, so kNN on raw pixels is never used in practice.


Linear Classification

Multiclass SVM
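
As in the course notes, the multiclass SVM (hinge) loss for an example $x_i$ with label $y_i$ and class scores $s = f(x_i; W)$ is

$$ L_i = \sum_{j \neq y_i} \max(0,\ s_j - s_{y_i} + \Delta) $$

where the margin $\Delta$ is commonly set to 1.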

Softmax
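
The softmax classifier replaces the hinge loss with the cross-entropy loss

$$ L_i = -\log\left(\frac{e^{s_{y_i}}}{\sum_j e^{s_j}}\right) $$

In code the scores are first shifted by their maximum, which leaves the loss unchanged but avoids overflow in the exponentials. A minimal sketch for a single example (function name assumed):

def softmax_loss_single(s, y):
    s = s - np.max(s) # shift so the largest score is 0
    p = np.exp(s) / np.sum(np.exp(s))
    return -np.log(p[y])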

Two-layer Neural Network
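
A two-layer fully-connected network computes class scores as $s = W_2 \max(0, W_1 x + b_1) + b_2$. A minimal forward-pass sketch (parameter names assumed):

def two_layer_forward(X, W1, b1, W2, b2):
    h = np.maximum(0, X.dot(W1) + b1) # hidden layer: affine transform + ReLU
    return h.dot(W2) + b2 # output layer: affine transform producing scores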