What is cardinality in MySQL?

Introduction

In MySQL, cardinality usually refers to how many distinct values exist in a column or index prefix. That matters because the optimizer uses cardinality as a rough measure of selectivity when deciding whether an index is useful. High-cardinality data often benefits more from indexing than low-cardinality data.

Cardinality as Uniqueness

If a column contains many different values, it has high cardinality. If it contains only a few repeated values, it has low cardinality.

Examples:

  • user_id is usually high cardinality because most rows have a unique value
  • country_code is low cardinality because many rows share the same few values
  • status with values such as active and inactive is very low cardinality

This distinction matters because an index is helpful only when it narrows the search space enough to justify using it.
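One way to make this concrete is to count distinct values directly. The query below is a minimal sketch that assumes a users table with email and status columns, like the one defined later in this article; the column aliases are illustrative only.

```sql
-- Compare distinct-value counts with the total row count.
-- A ratio near 1 suggests high cardinality; a ratio near 0 suggests low.
-- Note: COUNT(DISTINCT col) ignores NULL values.
SELECT
    COUNT(*)                          AS total_rows,
    COUNT(DISTINCT email)             AS distinct_emails,
    COUNT(DISTINCT status)            AS distinct_statuses,
    COUNT(DISTINCT email)  / COUNT(*) AS email_ratio,
    COUNT(DISTINCT status) / COUNT(*) AS status_ratio
FROM users;
```

This reads the whole table, so on large tables it is better suited to occasional investigation than to routine use.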

Why MySQL Cares About Cardinality

The MySQL optimizer estimates how selective an index is. If the engine believes an indexed lookup still matches a huge portion of the table, it may choose a table scan instead.

That is why indexing a low-cardinality column is often disappointing. An index on a gender or status column may exist, but the optimizer may ignore it because it does not filter enough rows.

By contrast, an index on a highly unique column such as email or order_id is usually very valuable.
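To see how much of the table a single value would match, you can group by the column. This is a rough sketch that assumes a users table with a status column; the aliases matching_rows and fraction_of_table are made up for illustration.

```sql
-- For each status value, show how many rows share it and what
-- fraction of the table that represents.
SELECT
    status,
    COUNT(*)                                 AS matching_rows,
    COUNT(*) / (SELECT COUNT(*) FROM users)  AS fraction_of_table
FROM users
GROUP BY status
ORDER BY matching_rows DESC;
```

If one value accounts for a large share of the rows, an index lookup on that value is unlikely to save much over a scan, which is consistent with the optimizer behavior described above.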

Seeing Index Cardinality

You can inspect index statistics with SHOW INDEX:

sql
SHOW INDEX FROM users;

A typical result includes a Cardinality column. That value is an estimate of the number of unique values in the index, not an exact real-time count.

For example, if you have:

sql
CREATE TABLE users (
    id INT PRIMARY KEY,
    email VARCHAR(255),
    status VARCHAR(20),
    INDEX idx_email (email),
    INDEX idx_status (status)
);

then idx_email will usually have much higher cardinality than idx_status.
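The same figures that SHOW INDEX reports can also be read from information_schema, which is convenient when you want to filter or sort them. The sketch below assumes the users table above lives in the currently selected database.

```sql
-- List each index column of the users table with its estimated cardinality.
SELECT
    INDEX_NAME,
    SEQ_IN_INDEX,
    COLUMN_NAME,
    CARDINALITY
FROM information_schema.STATISTICS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME   = 'users'
ORDER BY INDEX_NAME, SEQ_IN_INDEX;
```

As with SHOW INDEX, the CARDINALITY values are estimates. In MySQL 8.0 these information_schema values may additionally be cached for a period controlled by the information_schema_stats_expiry setting, so they can lag behind recent changes.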

Cardinality and Query Performance

Suppose you run these two queries:

sql
SELECT * FROM users WHERE email = 'user@example.com';
SELECT * FROM users WHERE status = 'active';

The email query is likely to use idx_email effectively because it narrows the result to one or very few rows. The status query may return half the table, so the optimizer may prefer a scan even if idx_status exists.
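You can check what the optimizer actually decides by prefixing each query with EXPLAIN. This is a sketch against the users table above; the email address is a placeholder, and the plan you see will depend on your data and statistics.

```sql
-- Ask the optimizer how it would execute each query without running it.
EXPLAIN SELECT * FROM users WHERE email = 'user@example.com';
EXPLAIN SELECT * FROM users WHERE status = 'active';
```

In the output, the key column shows which index was chosen (NULL if none), type shows the access method (ALL indicates a full table scan), and rows is the optimizer's estimate of how many rows it expects to examine.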

That does not mean low-cardinality indexes are always useless. They can still help:

  • in composite indexes
  • when combined with more selective conditions
  • when the table is small enough that cost tradeoffs differ

But on their own, low-cardinality columns are weaker indexing candidates.
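As an illustration of the composite-index case, the sketch below pairs status with a more selective column. It assumes a created_at column that is not part of the users table defined earlier; the index name and the date are likewise hypothetical.

```sql
-- Hypothetical: assumes users also has a created_at DATETIME column.
-- The low-cardinality column comes first, followed by a column that
-- narrows the range within each status value.
CREATE INDEX idx_status_created ON users (status, created_at);

-- An equality condition on status plus a range on created_at
-- can use both parts of the composite index.
SELECT id, email
FROM users
WHERE status = 'active'
  AND created_at >= '2024-01-01';
```

Whether the optimizer actually picks this index still depends on how many rows the combined conditions are estimated to match.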

Cardinality in Relationships Versus Index Statistics

Database discussions also use the word cardinality to describe relationships such as one-to-one and one-to-many. That meaning is valid, but when people ask about cardinality in the context of MySQL performance, they usually mean distinct-value counts and index selectivity.

So always ask what the conversation is about:

  • relational design cardinality
  • or optimizer/index cardinality

Mixing up those two meanings trips up many beginners.
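The two meanings can appear side by side in one schema. The sketch below assumes the users table from earlier and adds a hypothetical orders table; the table, column, and index names are illustrative.

```sql
-- Hypothetical orders table. Relationship cardinality: one user can
-- have many orders (one-to-many), expressed by the foreign key.
CREATE TABLE orders (
    order_id INT PRIMARY KEY,
    user_id  INT NOT NULL,
    INDEX idx_orders_user (user_id),
    FOREIGN KEY (user_id) REFERENCES users (id)
);

-- Index cardinality: the estimated number of distinct user_id values
-- stored in idx_orders_user, as reported by SHOW INDEX.
SHOW INDEX FROM orders;
```

The foreign key describes how the tables relate; the Cardinality column in the SHOW INDEX output describes how varied the indexed values are.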

Cardinality Is an Estimate

MySQL does not recompute exact distinct counts for every index during normal operation. The optimizer works from stored statistics, which for InnoDB tables are estimated by sampling index pages and which may become stale if the data distribution changes significantly.

That is why execution plans sometimes look surprising. If the statistics no longer reflect reality, the optimizer's index choice may be poor until statistics are refreshed.
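A common way to refresh statistics is ANALYZE TABLE. The sketch below assumes the users table from earlier.

```sql
-- Recalculate index statistics for the table, then inspect the result.
ANALYZE TABLE users;
SHOW INDEX FROM users;
```

For InnoDB, the refreshed Cardinality values are still estimates based on sampled pages, so small differences between runs are expected.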

Common Pitfalls

  • Assuming cardinality always means table relationship type rather than value uniqueness.
  • Creating standalone indexes on very low-cardinality columns and expecting large performance gains.
  • Reading the Cardinality value from SHOW INDEX as an exact count instead of as an estimate.
  • Ignoring composite indexes where a low-cardinality column can still be useful in combination.
  • Blaming MySQL for a poor plan without checking whether stale statistics are misleading the optimizer.

Summary

  • In MySQL performance discussions, cardinality usually means how many distinct values a column or index has.
  • High cardinality generally means higher selectivity and a more useful index.
  • Low-cardinality columns are often weak standalone indexing candidates.
  • SHOW INDEX exposes an estimated Cardinality value for indexes.
  • Do not confuse relationship cardinality with optimizer statistics cardinality.
