Reproduce Fisher linear discriminant figure

Introduction

Fisher's Linear Discriminant (FLD) is a method from statistics and machine learning that finds a linear combination of features that best separates two or more classes. The objective is to maximize the ratio of the squared distance between the projected class means to the spread of the projected points within each class. The resulting direction can serve as a linear classifier or, more commonly, as a dimensionality reduction step.

Conceptual Background

Linear Discriminant Analysis

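Fisher's Linear Discriminant is closely related to Linear Discriminant Analysis (LDA). LDA models each class with a Gaussian distribution and assumes that the classes share a covariance matrix; Fisher's criterion itself needs no distributional assumption, and in the two-class case both lead to the same projection direction. LDA is widely used in pattern recognition and machine learning for dimensionality reduction and classification.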

Mathematical Formulation

Given a set of data points belonging to two classes, the goal is to project the data onto a line such that the separation between the two classes is maximized. The separation is quantified in terms of the difference between projected means relative to the variances in each class.

If $x_1, x_2, \ldots, x_n$ are the feature vectors of the dataset and $w$ is the weight vector, the projection of each vector is given by $y_i = w^T x_i$. The weight vector $w$ should be chosen so that it maximizes Fisher's criterion:

$$J(w) = \frac{(m_1 - m_2)^2}{s_1^2 + s_2^2}$$

where:

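• $m_1$ and $m_2$ are the means of the projected data for class 1 and class 2, respectively.
• $s_1^2$ and $s_2^2$ measure the spread of the projected data within class 1 and class 2, respectively. They are usually taken as scatters (sums of squared deviations from the projected class mean), which differ from the variances only by a factor of the class size.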
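To make the criterion concrete, the sketch below evaluates $J(w)$ for a few candidate directions. It is only an illustration: the helper names `project` and `fisher_criterion` are made up here, the points are the toy classes from the Example section, and $s_k^2$ is assumed to be the unnormalized sum of squared deviations.

```python
def project(points, w):
    """Project each 2-D point onto w: y_i = w^T x_i."""
    return [w[0] * x[0] + w[1] * x[1] for x in points]


def fisher_criterion(class1, class2, w):
    """Evaluate J(w) = (m1 - m2)^2 / (s1^2 + s2^2) on the projected data."""
    y1, y2 = project(class1, w), project(class2, w)
    m1, m2 = sum(y1) / len(y1), sum(y2) / len(y2)
    # Assumption: s_k^2 is the scatter (sum of squared deviations, not divided by N).
    s1_sq = sum((y - m1) ** 2 for y in y1)
    s2_sq = sum((y - m2) ** 2 for y in y2)
    return (m1 - m2) ** 2 / (s1_sq + s2_sq)


# Toy data taken from the Example section of this article.
class1 = [(2, 3), (3, 4), (4, 5)]
class2 = [(1, 0), (0, -1), (-1, -2)]

for w in [(1, 0), (0, 1), (1, 1)]:
    print(w, fisher_criterion(class1, class2, w))
# (1, 0) 2.25
# (0, 1) 6.25
# (1, 1) 4.0
```

Among these three candidates the vertical axis separates the classes best. Directions with zero projected within-class scatter would cause a division by zero in this simple version.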

Computing the Optimal Weight Vector

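To find the weight vector $w$ that maximizes $J(w)$, we solve (provided $S_W$ is invertible):

$$w = S_W^{-1} (m_1 - m_2)$$

Here $m_1$ and $m_2$ denote the class mean vectors in the original feature space; the projected means that appear in $J(w)$ are $w^T m_1$ and $w^T m_2$. Only the direction of $w$ matters, since rescaling $w$ leaves $J(w)$ unchanged. The within-class scatter matrix $S_W$ is defined as:

$$S_W = \sum_{n=1}^{N_1} (x_n^{(1)} - m_1)(x_n^{(1)} - m_1)^T + \sum_{n=1}^{N_2} (x_n^{(2)} - m_2)(x_n^{(2)} - m_2)^T$$

where $x_n^{(k)}$ is the $n$-th point of class $k$ and $N_k$ is the number of points in class $k$.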
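The closed-form solution can be sketched in a few lines of plain Python for the two-dimensional case. This is a minimal illustration, not a reference implementation: the function names and the two small point sets are hypothetical, and the 2x2 inverse is written out by hand so that no external library is needed.

```python
def mean_vector(points):
    """Class mean in the original 2-D feature space."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def scatter_matrix(points, m):
    """Sum of outer products (x - m)(x - m)^T, returned as (a, b, d) for [[a, b], [b, d]]."""
    a = sum((p[0] - m[0]) ** 2 for p in points)
    b = sum((p[0] - m[0]) * (p[1] - m[1]) for p in points)
    d = sum((p[1] - m[1]) ** 2 for p in points)
    return a, b, d


def fisher_direction(class1, class2):
    """Return w = S_W^{-1} (m1 - m2) for 2-D data."""
    m1, m2 = mean_vector(class1), mean_vector(class2)
    a1, b1, d1 = scatter_matrix(class1, m1)
    a2, b2, d2 = scatter_matrix(class2, m2)
    a, b, d = a1 + a2, b1 + b2, d1 + d2      # S_W = S_1 + S_2
    det = a * d - b * b
    if abs(det) < 1e-12:
        raise ValueError("S_W is singular; the inverse does not exist")
    dx, dy = m1[0] - m2[0], m1[1] - m2[1]
    # Inverse of [[a, b], [b, d]] is (1 / det) * [[d, -b], [-b, a]].
    return ((d * dx - b * dy) / det, (-b * dx + a * dy) / det)


# Hypothetical toy data, chosen so that S_W is invertible.
class_a = [(1, 1), (3, 2), (2, 3), (2, 2)]
class_b = [(5, 4), (7, 5), (6, 6), (6, 5)]

w = fisher_direction(class_a, class_b)
print(w)  # approximately (-0.8333, -0.3333)
```

For this data $S_W = \begin{pmatrix} 4 & 2 \\ 2 & 4 \end{pmatrix}$ and $m_1 - m_2 = (-4, -3)$, which gives $w = (-5/6, -1/3)$.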

Example

Consider a simple dataset with two-dimensional features. We have two classes:

• Class 1: Points: (2, 3), (3, 4), (4, 5)
• Class 2: Points: (1, 0), (0, -1), (-1, -2)

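  1. Mean Calculation: Compute the class means $m_1$ and $m_2$.
     • $m_1 = (3, 4)$
     • $m_2 = (0, -1)$
  2. Scatter Matrices: Compute $S_W$ using the data points and means.
     • Each class contributes $\begin{pmatrix} 2 & 2 \\ 2 & 2 \end{pmatrix}$, so $S_W = \begin{pmatrix} 4 & 4 \\ 4 & 4 \end{pmatrix}$.
  3. Weight Vector: Compute $w = S_W^{-1} (m_1 - m_2)$, with $m_1 - m_2 = (3, 5)$.
     • For this particular dataset $S_W$ is singular (its determinant is 0), because every point deviates from its class mean along the same direction $(1, 1)$, so the inverse does not exist. A common workaround is to add a small ridge term, $w = (S_W + \lambda I)^{-1} (m_1 - m_2)$, which here gives a direction close to $(-1, 1)$.

When $S_W$ is invertible, the formula gives the optimal projection direction directly. In this example, projecting onto $(-1, 1)$ maps every class 1 point to $1$ and every class 2 point to $-1$, so the classes are perfectly separated with zero within-class scatter.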
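Running the three steps on this dataset exposes a practical detail: the points of each class are collinear, so $S_W$ has determinant 0 and cannot be inverted as written. The sketch below checks this and then falls back on a small ridge term $\lambda I$. The ridge fallback and the value of `lam` are assumptions made for illustration, not part of the original recipe.

```python
import math

class1 = [(2, 3), (3, 4), (4, 5)]
class2 = [(1, 0), (0, -1), (-1, -2)]


def mean_vector(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def within_class_scatter(*classes):
    """Return S_W as (a, b, d) for the symmetric matrix [[a, b], [b, d]]."""
    a = b = d = 0.0
    for points in classes:
        mx, my = mean_vector(points)
        for x, y in points:
            a += (x - mx) ** 2
            b += (x - mx) * (y - my)
            d += (y - my) ** 2
    return a, b, d


# Step 1: class means.
m1, m2 = mean_vector(class1), mean_vector(class2)   # (3.0, 4.0) and (0.0, -1.0)

# Step 2: within-class scatter matrix.
a, b, d = within_class_scatter(class1, class2)      # S_W = [[4, 4], [4, 4]]
det = a * d - b * b                                 # 0.0, so S_W is singular

# Step 3: S_W cannot be inverted, so solve with (S_W + lam * I) instead.
lam = 1e-3                                          # hypothetical ridge strength
a_r, d_r = a + lam, d + lam
det_r = a_r * d_r - b * b
dx, dy = m1[0] - m2[0], m1[1] - m2[1]
w = ((d_r * dx - b * dy) / det_r, (-b * dx + a_r * dy) / det_r)

# Only the direction matters, so normalize to unit length.
norm = math.hypot(w[0], w[1])
w_unit = (w[0] / norm, w[1] / norm)
print("det(S_W) =", det)
print("unit w   =", w_unit)   # close to (-0.707, 0.707)
```

The resulting direction is nearly $(-1, 1)/\sqrt{2}$, along which each class collapses to almost a single value.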

Summary Table

| Aspect | Explanation |
| --- | --- |
| Method | Fisher's Linear Discriminant |
| Goal | Maximize class separation by linear projection |
| Criterion | $J(w) = \frac{(m_1 - m_2)^2}{s_1^2 + s_2^2}$ |
| Optimal Weight Vector | $w = S_W^{-1} (m_1 - m_2)$ |
| Within-Class Scatter | $S_W = \sum_{n=1}^{N_1} (x_n^{(1)} - m_1)(x_n^{(1)} - m_1)^T + \sum_{n=1}^{N_2} (x_n^{(2)} - m_2)(x_n^{(2)} - m_2)^T$ |
| Example Classes | Class 1: (2, 3), (3, 4), (4, 5); Class 2: (1, 0), (0, -1), (-1, -2) |

Conclusion

Fisher's Linear Discriminant provides a simple yet powerful method for linear separation and dimensionality reduction. By finding the optimal projection direction, FLD makes classes easier to distinguish and highlights the most discriminative direction in the feature space. Its strength lies in balancing the separation between class means against the within-class variance, a principle that extends to many modern machine learning approaches.

