In this simple tutorial, you will learn the basics of extending your data to **multivariate** data (different types of measurement scale, such as nominal, ordinal, and quantitative) and of going **beyond two-dimensional data, up to N dimensions**. A comprehensive example is given in the last part of this tutorial. You may also download the MS Excel companion file of this tutorial.

Knowledge of similarity and dissimilarity is necessary in data mining, pattern recognition, machine intelligence, artificial intelligence, and multi-agent systems. Its application is not limited to computer science, however. Other fields of natural and social science, as well as engineering and statistics, have applied this kind of simple knowledge. Tools such as K-means clustering, discriminant analysis, K-nearest neighbors, decision trees, and hierarchical clustering rely heavily on the distance matrix explained in this tutorial.
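To make the term "distance matrix" concrete, here is a minimal sketch in Python. It assumes purely quantitative data and uses Euclidean distance, which is only one of the measures listed in the contents. The function names `euclidean` and `distance_matrix` and the sample points are hypothetical, chosen for illustration only.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length numeric sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def distance_matrix(points, dist=euclidean):
    """Return the symmetric matrix of pairwise distances between points."""
    n = len(points)
    return [[dist(points[i], points[j]) for j in range(n)] for i in range(n)]

# Hypothetical sample data: three objects measured on two quantitative features.
points = [(0, 0), (3, 4), (6, 8)]
matrix = distance_matrix(points)
for row in matrix:
    print(row)
# [0.0, 5.0, 10.0]
# [5.0, 0.0, 5.0]
# [10.0, 5.0, 0.0]
```

The `dist` argument is left pluggable so that any other dissimilarity function can be substituted for mixed or non-quantitative data.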

- What is distance?
- Why do we need to measure similarity? (Applications)
- What is the relationship between similarity and dissimilarity?
- How do we measure similarity or dissimilarity?
- How do we compute dissimilarity or similarity for **binary variables**?
  - Simple Matching Coefficient
  - Jaccard's Coefficient
  - Hamming Distance
- How do we compute dissimilarity or similarity for **nominal / categorical** variables?
  - Assign each value of category as a binary dummy variable
  - Assign each value of category into several binary dummy variables
- How do we compute dissimilarity or similarity for **ordinal** variables?
  - Normalized Rank Transformation
  - Spearman Distance
  - Footrule Distance
  - Kendall Distance
  - Cayley Distance
  - Hamming Distance for Ordinal Variable
  - Ulam Distance
- How do we compute dissimilarity or similarity for **text and string variables**?
- How do we compute dissimilarity or similarity for **quantitative variables**?
  - Euclidean Distance
  - City block (Manhattan) distance
  - Chebyshev Distance
  - Minkowski Distance
  - Canberra distance
  - Bray Curtis (Sorensen) distance
  - Angular separation
  - Correlation coefficient
- How do we compute dissimilarity between **two groups** (**Mahalanobis distance**)?
- How do we **normalize** the similarity or dissimilarity?
- How do we **aggregate mixed type** of variables?
- Comprehensive example: Distance matrix of Multivariate data
- Resources
- Rate and give comment for this tutorial

**The preferred reference for this tutorial is**

Teknomo, Kardi (2015) Similarity Measurement. http://people.revoledu.com/kardi/tutorial/Similarity/