- Robust to noisy training data (especially if the neighbors' votes are weighted by the inverse square of their distance to the query)
- Effective if the training data is large
- Need to determine the value of the parameter K (the number of nearest neighbors)
- With distance-based learning, it is not clear which type of distance and which attributes to use to produce the best results. Shall we use all attributes or only certain attributes?
- Computation cost is quite high because we need to compute the distance from each query instance to every training sample. Some indexing (e.g. a K-D tree) may reduce this computational cost
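
The points above can be illustrated with a minimal sketch. It is not the tutorial's own implementation. It assumes Euclidean distance, a brute-force scan over all training samples, and inverse-square distance weighting of the votes. The names `euclidean`, `knn_predict`, `leave_one_out_accuracy` and the data in `TRAIN` are hypothetical and made up for illustration.

```python
import math
from collections import defaultdict

# Toy training set (made up): (features, label). The last point is a
# deliberately "noisy" sample labelled B inside the A cluster.
TRAIN = [
    ((1.0, 1.0), "A"),
    ((1.2, 0.8), "A"),
    ((5.0, 5.0), "B"),
    ((5.2, 4.9), "B"),
    ((1.1, 1.0), "B"),
]

def euclidean(a, b):
    """Euclidean distance; any other distance function could be swapped in."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train, query, k, distance=euclidean):
    """Classify `query` by an inverse-square-distance weighted vote of its k nearest neighbors."""
    # Brute force: one distance computation per training sample,
    # which is the cost the last bullet refers to.
    scored = [(distance(x, query), label) for x, label in train]
    neighbors = sorted(scored, key=lambda pair: pair[0])[:k]

    votes = defaultdict(float)
    for d, label in neighbors:
        if d == 0.0:
            return label  # exact match: avoid dividing by zero
        votes[label] += 1.0 / (d * d)  # closer neighbors count more
    return max(votes, key=votes.get)

def leave_one_out_accuracy(train, k):
    """One simple way to compare candidate values of K: predict each sample from all the others."""
    hits = 0
    for i, (x, label) in enumerate(train):
        rest = train[:i] + train[i + 1:]
        if knn_predict(rest, x, k) == label:
            hits += 1
    return hits / len(train)

if __name__ == "__main__":
    print(knn_predict(TRAIN, (1.0, 0.9), k=3))
    for k in (1, 3):
        print(k, leave_one_out_accuracy(TRAIN, k))
```

For the query `(1.0, 0.9)` with `k=3`, the three nearest neighbors include the noisy B sample, but the two closer A samples carry more weight, so the weighted vote still favors A. Trying several values of K with `leave_one_out_accuracy` is only one possible selection heuristic. For large training sets, the linear scan in `knn_predict` is the part an index such as a K-D tree would replace.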
The preferred reference for this tutorial is
Teknomo, Kardi. K-Nearest Neighbors Tutorial. http://people.revoledu.com/kardi/tutorial/KNN/