
Similarity Measurement

By Kardi Teknomo, PhD.


In this simple tutorial, you will learn the basics needed to extend your data to multivariate data (variables on different measurement scales, such as nominal, ordinal, and quantitative) and to go beyond two-dimensional data up to N dimensions. A comprehensive example is given in the last part of this tutorial. An MS Excel companion file for this tutorial is also available for download.

Knowledge of similarity and dissimilarity is necessary in data mining, pattern recognition, machine intelligence, artificial intelligence, and multi-agent systems. Its application, however, is not limited to computer science: other fields of natural and social science, as well as engineering and statistics, have applied this kind of simple knowledge. Tools such as K-means clustering, discriminant analysis, K-nearest neighbors, decision trees, and hierarchical clustering rely heavily on the distance matrix explained in this tutorial.
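To make the idea of a distance matrix concrete, here is a minimal sketch in Python. It assumes purely quantitative variables and uses Euclidean distance, which is only one of the measures listed in this tutorial. The function name `euclidean_distance_matrix` and the three sample objects are hypothetical, chosen for illustration only.

```python
import math

def euclidean_distance_matrix(points):
    """Return the symmetric matrix of pairwise Euclidean distances."""
    n = len(points)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(points[i], points[j])  # requires Python 3.8+
            matrix[i][j] = d
            matrix[j][i] = d  # the distance from i to j equals the distance from j to i
    return matrix

# Hypothetical sample: three objects, each measured on three quantitative variables.
objects = [(0.0, 0.0, 0.0), (0.0, 3.0, 4.0), (1.0, 2.0, 2.0)]
D = euclidean_distance_matrix(objects)

for row in D:
    print(["%.3f" % value for value in row])
```

Entry `D[i][j]` holds the distance between object `i` and object `j`. The diagonal is zero because each object has zero distance to itself, and the matrix is symmetric. Swapping `math.dist` for another measure would give the corresponding distance matrix for that measure.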

What is similarity?
What is distance?
What is the relationship between similarity and dissimilarity?
Why do we need to measure similarity? (Applications)
How do we measure similarity or dissimilarity?
How do we compute dissimilarity or similarity for binary variables?
  Simple Matching Coefficient
  Jaccard's Coefficient
  Hamming Distance
How do we compute dissimilarity or similarity for nominal / categorical variables?
  Assign each value of category as a binary dummy variable
  Assign each value of category into several binary dummy variables
How do we compute dissimilarity or similarity for ordinal variables?
  Normalized Rank Transformation
  Spearman Distance
  Footrule Distance
  Kendall Distance
  Cayley Distance
  Hamming Distance for Ordinal Variable
  Ulam Distance
How do we compute dissimilarity or similarity for text and string variables?
How do we compute dissimilarity or similarity for quantitative variables?
  Euclidean Distance
  City block (Manhattan) distance
  Chebyshev Distance
  Minkowski Distance
  Canberra distance
  Bray Curtis (Sorensen) distance
  Angular separation
  Correlation coefficient
How do we compute dissimilarity between two groups (Mahalanobis distance)?
How do we normalize the similarity or dissimilarity?
How do we aggregate mixed type of variables?
Comprehensive example: Distance matrix of Multivariate data
Resources

This tutorial is copyrighted.

The preferred reference for this tutorial is:

Teknomo, Kardi (2015) Similarity Measurement. http://people.revoledu.com/kardi/tutorial/Similarity/

 



 
© 2015 Kardi Teknomo. All Rights Reserved.