KNN for Smoothing and Prediction

By Kardi Teknomo, PhD.



Using the same principle, we can extend the K-Nearest Neighbor (KNN) algorithm to smoothing (interpolation) and prediction (forecasting, extrapolation) of quantitative data, such as time series. In classification, the dependent variable Y is categorical; in this section, the dependent variable takes quantitative values.

Here is how to compute the K-nearest neighbors (KNN) algorithm for quantitative data, step by step (a code sketch follows the steps):

  1. Determine parameter K = number of nearest neighbors
  2. Calculate the distance between the query-instance and all the training samples
  3. Sort the distance and determine nearest neighbors based on the K-th minimum distance
  4. Gather the Y values of the nearest neighbors
  5. Use average of nearest neighbors as the prediction value of the query instance
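The steps above translate directly into a short Python sketch. This is not part of the original tutorial; the function name knn_predict and the sample call below are illustrative assumptions:

# 1-D KNN prediction: average the Y values of the k training
# points whose X is closest to the query instance.
def knn_predict(X, Y, query, k):
    # Step 2: distance between the query instance and every training sample
    distances = [abs(query - x) for x in X]
    # Step 3: sort indices by distance and keep the K nearest neighbors
    nearest = sorted(range(len(X)), key=lambda i: distances[i])[:k]
    # Steps 4-5: gather the neighbors' Y values and return their average
    return sum(Y[i] for i in nearest) / k

# Hypothetical call (the tutorial's actual data table lives in its spreadsheet):
# knn_predict([1.2, 2.0, 3.1, 4.3, 5.1], [10, 13, 16, 18, 19], 6.5, 2)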


KNN for Extrapolation, Prediction, Forecasting

Example (KNN for Extrapolation, Prediction, Forecasting)

We have 5 data pairs (X, Y), as shown below. The data are quantitative, and suppose they are sorted as a time series. The problem is then to estimate the value of Y at X = 6.5 using the K-Nearest Neighbor (KNN) algorithm.

1. Determine parameter K = number of nearest neighbors

Suppose we use K = 2.


2. Calculate the distance between the query-instance and all the training samples

The coordinate of the query instance is 6.5. Since we are dealing with a one-dimensional distance, we simply take the absolute difference between the query instance and each value of X.

For instance, for X = 5.1 the distance is |6.5 - 5.1| = 1.4; for X = 1.2 the distance is |6.5 - 1.2| = 5.3, and so on.
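In code this step is a short list comprehension (a sketch; only the two X values quoted above are shown, since the full list is in the tutorial's spreadsheet):

X = [1.2, 5.1]                         # the two X values quoted in the text
distances = [abs(6.5 - x) for x in X]  # approximately [5.3, 1.4]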


3. Sort the distance and determine nearest neighbors based on the K-th minimum distance

Because the query X = 6.5 is larger than all the training X values and the data are already sorted, the nearest neighbors are simply the last K data points.


4. Gather the Y values of the nearest neighbors

We simply copy the Y values of the last K = 2 data points. The result is tabulated below.

[Figure: data table with nearest-neighbor Y values]


5. Use average of nearest neighbors as the prediction value of the query instance

In this case, the prediction value is the average of the Y values of the two nearest neighbors, which is 17.5.

[Figure: KNN prediction for the time series]

 

You can play around with different data and different values of K if you download the spreadsheet of this example here.

KNN for Interpolation, Smoothing

Example (KNN for Interpolation)

Using the same training data and the same technique, we can also use KNN for smoothing (interpolation between values). Our data are shown below.

[Figure: training data]

Suppose we know that the X data lie between 0 and 6, and we would like to estimate the values of Y in between.

  1. Define dx = 0.1 and set the values of x from 0 to 6 in increments of dx
  2. Compute the distance between x (treated as the query instance) and each of the data X

For instance, the distance between the query instance x = 0.1 and X2 = 1.2 is d(x, X2) = |0.1 - 1.2| = 1.1. Similarly, the distance between the query instance x = 0.5 and X5 = 5.1 is d(x, X5) = |0.5 - 5.1| = 4.6. The table below shows the distances for x = 0 to 0.5 against all X data.

[Figure: distance computation table]

  3. Obtain the nearest neighbors based on the K-th minimum distance and copy the Y values of those neighbors
  4. The smoothing estimate is the arithmetic average of the Y values of the nearest neighbors (see the sketch below)
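A minimal Python sketch of this smoothing loop, reusing the illustrative knn_predict defined earlier (the grid bounds and dx follow the steps above):

# Evaluate KNN smoothing on a grid of query points x = 0, 0.1, ..., 6.0
def knn_smooth(X, Y, k, x_min=0.0, x_max=6.0, dx=0.1):
    n_steps = int(round((x_max - x_min) / dx))
    grid = [x_min + i * dx for i in range(n_steps + 1)]
    # One KNN prediction per grid point gives the smoothed curve
    return [(x, knn_predict(X, Y, x, k)) for x in grid]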

The table below shows an example of the KNN smoothing computation for x = 2.5 to 3.5.

[Figure: K-nearest-neighbor smoothing table]

Playing around with the value of K gives the graphs shown below. In general, the plot of KNN smoothing has many discontinuities. For K = 1, the smoothing line passes through all the data points, so the sum of squared errors (SSE) is zero, but the plot is the roughest. For K = 5 (all the data points), we get a single horizontal line at the average of all the data. Between these two extremes, we can adjust the value of K as a parameter that controls the degree of smoothing. Among K = 2, K = 3, and K = 4, we find that K = 4 has the smallest SSE.
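One way to compare values of K, as done above, is to evaluate the SSE of the smoothed estimates at the training points themselves. The tutorial does this in its spreadsheet; a sketch, again assuming the illustrative knn_predict:

# Sum of squared errors of KNN smoothing at the training points;
# for K = 1 each point is its own nearest neighbor, so the SSE is zero.
def knn_sse(X, Y, k):
    return sum((y - knn_predict(X, Y, x, k)) ** 2 for x, y in zip(X, Y))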

[Figures: KNN smoothing plots for K = 1, K = 2, K = 3, and K = 4]

[Figure: KNN smoothing plot for K = 5]

Download the spreadsheet of this example here.

I have demonstrated through several examples how the simple KNN algorithm can be used for classification, interpolation, and extrapolation.


 

This tutorial is copyrighted.

The preferred reference for this tutorial is:

Teknomo, Kardi. K-Nearest Neighbors Tutorial. http://people.revoledu.com/kardi/tutorial/KNN/

 
 
© 2015 Kardi Teknomo. All Rights Reserved.