Linear Discriminant Analysis (LDA)
Purpose
The purpose of discriminant analysis is to classify objects (people, customers, things, etc.) into one of two or more groups based on a set of features that describe the objects (e.g. gender, age, income, weight, preference score, etc.). In general, we assign an object to one of a number of predetermined groups based on observations made on the object. Note that the groups are known or predetermined and have no order (i.e. they are on a nominal scale). The classification problem provides several objects, each with a set of features measured from it. We are looking for two things:
1. Which set of features best determines the group membership of an object?
2. Which classification rule or model best separates the groups?
(See also the difference between discriminant analysis and cluster analysis.) The first purpose is feature selection and the second purpose is classification. This tutorial does not cover the first purpose (readers interested in this stepwise approach can use statistical software such as SPSS, SAS or the statistics package of Matlab). It does cover the second purpose: obtaining the classification rule and predicting the group of a new object based on that rule.
Linear Discriminant Analysis
For example, we want to know whether a soap product is good or bad based on several measurements of the product such as weight, volume, people's preference score, smell, color contrast, etc. The object here is the soap. The class category or group ("good" or "bad") is what we are looking for (it is also called the dependent variable). Each measurement of the product is called a feature that describes the object (it is also called an independent variable). Thus, in discriminant analysis, the dependent variable (Y) is the group and the independent variables (X) are the object features that might describe the group. The dependent variable is always a categorical (nominal scale) variable, while the independent variables can be on any measurement scale (i.e. nominal, ordinal, interval or ratio). If we can assume that the groups are linearly separable, we can use the linear discriminant model (LDA). Linearly separable means that the groups can be separated by a linear combination of the features that describe the objects. If there are only two features, the separators between the groups are lines. If there are three features, the separator is a plane, and if the number of features (i.e. independent variables) is more than three, the separators become hyperplanes.
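To make the idea of a linear separator concrete, here is a minimal sketch in Python. The weights, the bias and the two feature vectors are made-up numbers chosen only for illustration; they are not taken from the tutorial, and the function names are hypothetical.

```python
def linear_score(features, weights, bias):
    """Value of the linear combination w . x + b for one object."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def side_of_separator(features, weights, bias):
    """Label an object by the side of the line/plane/hyperplane it falls on."""
    return "good" if linear_score(features, weights, bias) > 0 else "bad"


# Hypothetical separator for two features: x1 + x2 - 10 = 0 (a line).
weights, bias = [1.0, 1.0], -10.0

label_a = side_of_separator([7.0, 6.0], weights, bias)  # 7 + 6 - 10 = 3 > 0
label_b = side_of_separator([2.0, 3.0], weights, bias)  # 2 + 3 - 10 = -5 < 0
```

With two features the separator is a line; the same function works unchanged for three or more features, where the separator is a plane or a hyperplane.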
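LDA Formula
Using the classification criterion of minimizing the total error of classification (TEC), we make the proportion of objects that the rule misclassifies as small as possible. TEC is the performance of the rule in the 'long run' on a random sample of objects. Thus, TEC should be thought of as the probability that the rule under consideration will misclassify an object. The classification rule is to assign an object to the group with the highest conditional probability. This is called the Bayes Rule. This rule also minimizes the TEC. If there are $g$ groups, the Bayes rule is to assign the object to group $i$ where $P(i \mid x) > P(j \mid x)$ for all $j \neq i$. We want to know the probability $P(i \mid x)$ that an object belongs to group $i$, given a set of measurements $x$. In practice, however, $P(i \mid x)$ is difficult to obtain directly. What we can get is $P(x \mid i)$, the probability of obtaining a particular set of measurements $x$ given that the object comes from group $i$. Fortunately, there is a relationship between the two conditional probabilities, well known as Bayes Theorem:
$$P(i \mid x) = \frac{P(x \mid i)\,P(i)}{\sum_{j} P(x \mid j)\,P(j)}$$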
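As a small illustration of Bayes Theorem and the Bayes rule, the sketch below turns the likelihoods $P(x \mid i)$ and the priors $P(i)$ into the posteriors $P(i \mid x)$ and picks the group with the highest posterior. The two groups and all probabilities are made-up values, and the function names are hypothetical.

```python
def posteriors(likelihoods, priors):
    """Bayes Theorem: P(i|x) = P(x|i) P(i) / sum_j P(x|j) P(j)."""
    joint = [lik * pri for lik, pri in zip(likelihoods, priors)]
    evidence = sum(joint)  # the denominator, summed over all groups
    return [j / evidence for j in joint]


def bayes_rule(likelihoods, priors):
    """Index of the group with the highest posterior probability."""
    post = posteriors(likelihoods, priors)
    return max(range(len(post)), key=post.__getitem__)


# Made-up numbers for two groups (index 0 = "good", index 1 = "bad").
likelihoods = [0.2, 0.6]  # P(x | good), P(x | bad)
priors = [0.7, 0.3]       # P(good), P(bad)

post = posteriors(likelihoods, priors)     # joint = [0.14, 0.18], evidence = 0.32
chosen = bayes_rule(likelihoods, priors)   # group 1 has the larger posterior
```

Although "good" has the larger prior here, the measurement is three times as likely under "bad", so the posterior favours "bad".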
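The prior probability $P(i)$ is the probability of group $i$ known before making any measurement. In practice, however, using the Bayes rule directly is impractical, because obtaining $P(x \mid i)$ requires a very large amount of data to estimate the relative frequency of each measurement within each group. It is more practical to assume a distribution and obtain the probability theoretically. If we assume that each group has a multivariate normal distribution and that all groups share the same covariance matrix $C$, we obtain the Linear Discriminant Analysis formula:
$$f_i = \mu_i C^{-1} x_k^T - \frac{1}{2}\,\mu_i C^{-1} \mu_i^T + \ln(p_i)$$
Here $x_k$ is the row vector of features of object $k$, $\mu_i$ is the mean row vector of group $i$, and $p_i = P(i)$ is the prior probability of group $i$. Assign object $k$ to the group $i$ that has the maximum $f_i$. If you look carefully, the second term ($\frac{1}{2}\mu_i C^{-1}\mu_i^T$) is, up to the factor $\frac{1}{2}$, the squared Mahalanobis distance of the group mean from the origin; the Mahalanobis distance is a distance that measures dissimilarity between groups.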
Any standard textbook on data mining, pattern recognition or classification gives a more detailed derivation of this formula. The meaning of each variable is explained in the next section, the numerical example.
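The sketch below evaluates a linear discriminant function for two features, assuming the commonly used form $f_i = \mu_i C^{-1} x^T - \frac{1}{2}\mu_i C^{-1}\mu_i^T + \ln(p_i)$ with a covariance matrix $C$ shared by all groups. The group means, the covariance matrix, the priors and the helper names are hypothetical values chosen so the arithmetic is easy to follow; they are not the data of the numerical example in the next section.

```python
import math


def inverse_2x2(m):
    """Inverse of a 2x2 matrix given as [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def mat_vec(m, v):
    """Matrix-vector product."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def discriminant(x, mean, cov_inv, prior):
    """f_i = mu_i C^-1 x^T - 1/2 mu_i C^-1 mu_i^T + ln(p_i)."""
    cinv_mean = mat_vec(cov_inv, mean)  # C^-1 mu_i (C is symmetric)
    return dot(cinv_mean, x) - 0.5 * dot(cinv_mean, mean) + math.log(prior)


def classify(x, means, cov, priors):
    """Return (index of the group with maximum f_i, list of all f_i)."""
    cov_inv = inverse_2x2(cov)
    scores = [discriminant(x, m, cov_inv, p) for m, p in zip(means, priors)]
    return scores.index(max(scores)), scores


# Hypothetical inputs: two groups, two features, equal priors.
means = [[2.0, 2.0], [6.0, 6.0]]
cov = [[2.0, 0.0], [0.0, 2.0]]  # covariance matrix assumed shared by both groups
priors = [0.5, 0.5]

group, scores = classify([3.0, 3.0], means, cov, priors)
# f_0 = 6 - 2 + ln(0.5) and f_1 = 18 - 18 + ln(0.5), so group 0 is chosen.
```

Because the priors are equal here, the $\ln(p_i)$ term shifts both scores by the same amount and the decision depends only on the first two terms.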
The preferred reference for this tutorial is: Teknomo, Kardi. Discriminant Analysis Tutorial. http://people.revoledu.com/kardi/tutorial/LDA/
© 2006 Kardi Teknomo. All Rights Reserved.