Reproduce Fisher linear discriminant figure

Introduction

Fisher's Linear Discriminant (FLD) is a method used in statistics and machine learning to find a linear combination of features that best separates two or more classes of objects or events. The objective is to maximize the ratio of the squared difference between the projected class means to the variation within each class after projection. The resulting projection can serve as a linear classifier or, more commonly, as a dimensionality reduction step.

Conceptual Background

Linear Discriminant Analysis

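Fisher's Linear Discriminant is closely related to Linear Discriminant Analysis (LDA). LDA assumes that each class generates data from its own Gaussian distribution, with all classes sharing the same covariance matrix; Fisher's criterion itself needs no distributional assumption, but for two classes it leads to the same projection direction. LDA is widely used in pattern recognition and machine learning for dimensionality reduction and classification.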

Mathematical Formulation

Given a set of data points belonging to two classes, the goal is to project the data onto a line such that the separation between the two classes is maximized. The separation is quantified in terms of the difference between projected means relative to the variances in each class.

If $x_1, x_2, \dots, x_n$ are the feature vectors of the dataset and $w$ is the weight vector, the projection of each vector is given by $y_i = w^T x_i$. The weight vector $w$ should be chosen so that it maximizes Fisher’s criterion:

$J(w) = \frac{(m_1 - m_2)^2}{s_1^2 + s_2^2}$

where:

• $m_1$ and $m_2$ are the means of the projected data for class 1 and class 2, respectively.
• $s_1^2$ and $s_2^2$ are the within-class variances of the projected data for class 1 and class 2, respectively, taken here as unnormalized sums of squared deviations from the projected class mean.
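The criterion can be evaluated directly for any candidate direction. The following is a minimal sketch in plain Python; the function names and the toy data are hypothetical (not from the article), and $s_k^2$ is assumed to be the unnormalized sum of squared deviations of the projected points.

```python
def project(points, w):
    """Project each feature vector x onto w, i.e. y = w^T x."""
    return [sum(wi * xi for wi, xi in zip(w, x)) for x in points]


def scatter_1d(values):
    """Sum of squared deviations from the mean (assumed definition of s^2)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def fisher_criterion(class1, class2, w):
    """J(w) = (m1 - m2)^2 / (s1^2 + s2^2), computed on the projected data."""
    y1, y2 = project(class1, w), project(class2, w)
    m1 = sum(y1) / len(y1)
    m2 = sum(y2) / len(y2)
    return (m1 - m2) ** 2 / (scatter_1d(y1) + scatter_1d(y2))


# Hypothetical toy data, chosen only for illustration
a = [(1.0, 2.0), (2.0, 1.0), (3.0, 3.0)]
b = [(5.0, 5.0), (6.0, 7.0), (7.0, 6.0)]

j_x = fisher_criterion(a, b, (1.0, 0.0))     # project onto the x-axis: 4.0
j_diag = fisher_criterion(a, b, (1.0, 1.0))  # project onto the diagonal: about 5.33
print(j_x, j_diag)
```

For this toy data the diagonal direction scores higher than the x-axis, and rescaling $w$ leaves $J(w)$ unchanged, so only the direction of $w$ matters.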

Computing the Optimal Weight Vector

Because $J(w)$ is unchanged when $w$ is rescaled, only the direction of $w$ matters. A weight vector that maximizes $J(w)$ is:

$w = S_W^{-1} (m_1 - m_2)$

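Here $m_1$ and $m_2$ denote the class mean vectors in the original feature space; their projections $w^T m_1$ and $w^T m_2$ are the projected means that appear in $J(w)$. With $x_n^{(k)}$ the $n$-th point of class $k$ and $N_k$ the number of points in class $k$, the within-class scatter matrix $S_W$ is defined as: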

$S_W = \sum_{n=1}^{N_1} (x_n^{(1)} - m_1)(x_n^{(1)} - m_1)^T + \sum_{n=1}^{N_2} (x_n^{(2)} - m_2)(x_n^{(2)} - m_2)^T$
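The two formulas above translate into a few lines of NumPy. This is a sketch under stated assumptions: the function name `fisher_direction` and the toy arrays are hypothetical, and `np.linalg.solve` is used in place of an explicit inverse, which requires $S_W$ to be invertible.

```python
import numpy as np


def fisher_direction(X1, X2):
    """Return a unit vector proportional to S_W^{-1} (m1 - m2).

    X1 and X2 hold one sample per row. Assumes S_W is invertible.
    """
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)  # class mean vectors
    D1, D2 = X1 - m1, X2 - m2                  # deviations from the class means
    S_W = D1.T @ D1 + D2.T @ D2                # within-class scatter matrix
    w = np.linalg.solve(S_W, m1 - m2)          # solves S_W w = m1 - m2
    return w / np.linalg.norm(w)


# Hypothetical toy data, chosen only for illustration
X1 = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 3.0]])
X2 = np.array([[5.0, 5.0], [6.0, 7.0], [7.0, 6.0]])

w = fisher_direction(X1, X2)
print(w)  # direction along the diagonal, pointing from class 2 towards class 1
```

Normalizing $w$ is a presentation choice only, since any positive multiple gives the same value of $J(w)$.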

Example

Consider a simple dataset with two-dimensional features. We have two classes:

• Class 1: (2, 3), (3, 4), (4, 5)
• Class 2: (1, 0), (0, -1), (-1, -2)

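Mean Calculation: Compute the class means $m_1$ and $m_2$.
• $m_1 = (3, 4)$
• $m_2 = (0, -1)$
Scatter Matrix: Compute $S_W$ from the data points and means.
• In both classes the deviations from the class mean are $(-1, -1)$, $(0, 0)$ and $(1, 1)$, so each class contributes $\begin{pmatrix} 2 & 2 \\ 2 & 2 \end{pmatrix}$ and $S_W = \begin{pmatrix} 4 & 4 \\ 4 & 4 \end{pmatrix}$.
Weight Vector: Compute $w = S_W^{-1} (m_1 - m_2)$, with $m_1 - m_2 = (3, 5)$.
• Here $S_W$ is singular (its determinant is $0$) because within each class the points lie on a line of direction $(1, 1)$, so $S_W^{-1}$ does not exist and the formula cannot be applied directly.
• Projecting onto $w = (1, -1)$, which is orthogonal to that direction, maps every class 1 point to $-1$ and every class 2 point to $1$, so the within-class scatter is zero and the classes are perfectly separated.

For this degenerate dataset the direction sought is therefore $w \propto (1, -1)$. For data with an invertible $S_W$, the formula yields the optimal projection direction directly.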
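A quick numeric check of this example, written as a sketch in plain Python with hypothetical helper names, confirms the class means and shows that $S_W$ has zero determinant for these six points, so it cannot be inverted. The direction $(1, -1)$ is tried instead because it is orthogonal to the within-class spread.

```python
class1 = [(2, 3), (3, 4), (4, 5)]
class2 = [(1, 0), (0, -1), (-1, -2)]


def mean(points):
    """Component-wise mean of a list of 2-D points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(2))


def scatter(points):
    """2x2 scatter matrix: sum of outer products of deviations from the mean."""
    m = mean(points)
    S = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = (p[0] - m[0], p[1] - m[1])
        for i in range(2):
            for j in range(2):
                S[i][j] += d[i] * d[j]
    return S


m1, m2 = mean(class1), mean(class2)
S1, S2 = scatter(class1), scatter(class2)
S_W = [[S1[i][j] + S2[i][j] for j in range(2)] for i in range(2)]
det = S_W[0][0] * S_W[1][1] - S_W[0][1] * S_W[1][0]

# Project onto w = (1, -1), orthogonal to the within-class direction (1, 1)
w = (1, -1)
proj1 = [w[0] * x + w[1] * y for x, y in class1]
proj2 = [w[0] * x + w[1] * y for x, y in class2]

print(m1, m2)        # (3.0, 4.0) (0.0, -1.0)
print(S_W, det)      # [[4.0, 4.0], [4.0, 4.0]] 0.0
print(proj1, proj2)  # [-1, -1, -1] [1, 1, 1]
```

When $S_W$ is singular in practice, a common workaround is to add a small multiple of the identity matrix to $S_W$ before solving; that is a general technique, not something the example above prescribes.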

Summary Table

Aspect	Explanation
Method	Fisher's Linear Discriminant
Goal	Maximize class separation by linear projection
Criterion	$J(w) = \frac{(m_1 - m_2)^2}{s_1^2 + s_2^2}$
Optimal Weight Vector	$w = S_W^{-1} (m_1 - m_2)$
Within-Class Scatter	$S_W = \sum_{n=1}^{N_1} (x_n^{(1)} - m_1)(x_n^{(1)} - m_1)^T + \sum_{n=1}^{N_2} (x_n^{(2)} - m_2)(x_n^{(2)} - m_2)^T$
Example Classes	Class 1: (2, 3), (3, 4), (4, 5); Class 2: (1, 0), (0, -1), (-1, -2)

Conclusion

Fisher's Linear Discriminant provides a simple yet powerful method for linear separation and dimensionality reduction. By finding the optimal projection direction, FLD makes classes easier to distinguish and highlights the directions in feature space that matter most for separating them. Its strength lies in balancing the separation between class means against the within-class variance, a principle that extends to many modern machine learning approaches.