### Introduction

• K Nearest Neighbor (KNN) is a nonparametric method for classifying data points. It is simple and convenient to implement.

### Details

• Prerequisites: a training set of $N$ labeled samples/instances:

$$\mathcal{D}^{Train} = \{(\textbf{x}_1, y_1), (\textbf{x}_2, y_2), (\textbf{x}_3, y_3), (\textbf{x}_4, y_4), \cdots, (\textbf{x}_N, y_N)\}$$

We define
$$nn_k(\textbf{x}) = k\text{th nearest neighbor of } \textbf{x}$$

$$knn(\textbf{x}) = \text{the set of the top } k \text{ nearest neighbors of } \textbf{x}$$

Then the rule for assigning $\textbf{x}$ a label is the majority vote over its neighbors:
$$y = f(\textbf{x}) = \operatorname{argmax}_{c \in [C]} \sum\limits_{n=1}^{N} \mathbb{I}(\textbf{x}_n \in knn(\textbf{x}), y_n = c)$$
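The majority-vote rule above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `knn_predict`, the Euclidean distance choice, and the toy data are all assumptions made here for demonstration.

```python
from collections import Counter
import math

def knn_predict(train, x, k=3):
    """Classify x by majority vote among its k nearest training points.

    train: list of (features, label) pairs; features are tuples of floats.
    Hypothetical helper for illustration only.
    """
    # Sort training points by Euclidean distance to the query x.
    by_dist = sorted(train, key=lambda p: math.dist(p[0], x))
    # Take the top-k nearest neighbors, i.e. knn(x).
    neighbors = by_dist[:k]
    # Majority vote over the neighbors' labels (the argmax rule).
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

train = [((0.0, 0.0), "a"), ((0.1, 0.1), "a"),
         ((1.0, 1.0), "b"), ((1.1, 0.9), "b")]
print(knn_predict(train, (0.2, 0.0), k=3))  # nearest 3 labels: a, a, b -> "a"
```

Note that KNN has no training phase: the whole training set is kept and all distance computations happen at prediction time.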

### Explanation

• Let the K nearest neighbors vote to determine the label of the current query point; this is the intuition behind K nearest neighbors.
• If the top two categories receive the same number of votes from the K nearest neighbors, break the tie by choosing the label of the closer neighbor.
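The tie-breaking rule above can be sketched as follows. This is one possible reading of "choose the label from the closer neighbor": when two labels tie in votes, return the label of the single nearest neighbor among the tied labels. The function name and the toy data are assumptions for illustration.

```python
from collections import Counter
import math

def knn_predict_tiebreak(train, x, k=4):
    """KNN majority vote; ties broken in favor of the closer neighbor."""
    # Neighbors sorted from nearest to farthest.
    by_dist = sorted(train, key=lambda p: math.dist(p[0], x))
    neighbors = by_dist[:k]
    votes = Counter(label for _, label in neighbors)
    top = votes.most_common()
    # Tie between the top labels: pick the label of the nearest
    # neighbor that carries one of the tied labels.
    if len(top) > 1 and top[0][1] == top[1][1]:
        tied = {lab for lab, cnt in top if cnt == top[0][1]}
        for _, label in neighbors:  # already distance-sorted
            if label in tied:
                return label
    return top[0][0]

# With k=4 the vote is 2 "a" vs 2 "b"; the nearest neighbor is an "a".
train = [((0.0, 0.0), "a"), ((0.3, 0.0), "a"),
         ((0.5, 0.0), "b"), ((0.6, 0.0), "b")]
print(knn_predict_tiebreak(train, (0.2, 0.0), k=4))  # -> "a"
```

Using an odd k avoids ties entirely in binary classification, which is why k = 3, 5, 7 are common choices.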