SQL
Redshift
Postgres
Row Count
Conditional Querying

In Redshift/Postgres, how to count rows that meet a condition?

Master System Design with Codemia

Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.

Introduction

In PostgreSQL and Amazon Redshift, conditional counting is done with COUNT combined with CASE WHEN, FILTER (WHERE) (Postgres 9.4+), or SUM with a boolean expression cast to integer. These approaches let you count rows meeting different criteria in a single query without multiple WHERE-filtered subqueries. Redshift supports most PostgreSQL conditional counting syntax but does not support FILTER (WHERE).

Basic COUNT with WHERE

sql
1-- Count all orders from 2024
2SELECT COUNT(*)
3FROM orders
4WHERE created_at >= '2024-01-01'
5  AND created_at < '2025-01-01';

This counts all rows matching the WHERE clause. But what if you need multiple conditional counts in one query?

COUNT with CASE WHEN

The most portable approach — works in PostgreSQL, Redshift, MySQL, SQL Server, and all SQL databases:

sql
1SELECT
2    COUNT(*) AS total_orders,
3    COUNT(CASE WHEN status = 'completed' THEN 1 END) AS completed,
4    COUNT(CASE WHEN status = 'pending' THEN 1 END) AS pending,
5    COUNT(CASE WHEN status = 'cancelled' THEN 1 END) AS cancelled,
6    COUNT(CASE WHEN amount > 100 THEN 1 END) AS high_value
7FROM orders;

COUNT ignores NULL values, so when the CASE has no ELSE clause, non-matching rows produce NULL and are not counted.

Multiple Conditions

sql
1SELECT
2    department,
3    COUNT(*) AS total_employees,
4    COUNT(CASE WHEN salary > 100000 THEN 1 END) AS high_earners,
5    COUNT(CASE WHEN hire_date >= '2024-01-01' THEN 1 END) AS new_hires,
6    COUNT(CASE WHEN salary > 100000 AND hire_date >= '2024-01-01' THEN 1 END) AS high_earning_new_hires
7FROM employees
8GROUP BY department
9ORDER BY total_employees DESC;

SUM with Boolean Expression

An alternative that leverages the fact that true casts to 1 and false to 0:

sql
1-- PostgreSQL — boolean::int works
2SELECT
3    SUM((status = 'completed')::int) AS completed,
4    SUM((status = 'pending')::int) AS pending,
5    SUM((amount > 100)::int) AS high_value
6FROM orders;
7
8-- Redshift — use CASE instead (no boolean::int cast)
9SELECT
10    SUM(CASE WHEN status = 'completed' THEN 1 ELSE 0 END) AS completed,
11    SUM(CASE WHEN status = 'pending' THEN 1 ELSE 0 END) AS pending,
12    SUM(CASE WHEN amount > 100 THEN 1 ELSE 0 END) AS high_value
13FROM orders;

The SUM(CASE WHEN ... THEN 1 ELSE 0 END) pattern is equivalent to COUNT(CASE WHEN ... THEN 1 END) and works everywhere.

FILTER (WHERE) — PostgreSQL Only

PostgreSQL 9.4+ supports the FILTER clause for aggregate functions, which is cleaner:

sql
1-- PostgreSQL 9.4+ only — NOT supported in Redshift
2SELECT
3    COUNT(*) AS total,
4    COUNT(*) FILTER (WHERE status = 'completed') AS completed,
5    COUNT(*) FILTER (WHERE status = 'pending') AS pending,
6    COUNT(*) FILTER (WHERE amount > 100) AS high_value,
7    AVG(amount) FILTER (WHERE status = 'completed') AS avg_completed_amount
8FROM orders;

FILTER (WHERE) works with any aggregate function (COUNT, SUM, AVG, MIN, MAX, ARRAY_AGG, etc.).

Conditional Count with GROUP BY

sql
1-- Monthly breakdown with conditional counts
2SELECT
3    DATE_TRUNC('month', created_at) AS month,
4    COUNT(*) AS total,
5    COUNT(CASE WHEN status = 'completed' THEN 1 END) AS completed,
6    COUNT(CASE WHEN status = 'refunded' THEN 1 END) AS refunded,
7    ROUND(
8        100.0 * COUNT(CASE WHEN status = 'completed' THEN 1 END) / COUNT(*),
9        2
10    ) AS completion_rate_pct
11FROM orders
12WHERE created_at >= '2024-01-01'
13GROUP BY DATE_TRUNC('month', created_at)
14ORDER BY month;

Counting Distinct Values Conditionally

sql
1SELECT
2    COUNT(DISTINCT user_id) AS total_users,
3    COUNT(DISTINCT CASE WHEN status = 'completed' THEN user_id END) AS users_with_completed,
4    COUNT(DISTINCT CASE WHEN amount > 100 THEN user_id END) AS users_with_high_value
5FROM orders;

Window Function with Conditional Count

sql
1-- Running conditional count per user
2SELECT
3    user_id,
4    order_id,
5    status,
6    COUNT(CASE WHEN status = 'completed' THEN 1 END)
7        OVER (PARTITION BY user_id ORDER BY created_at) AS running_completed
8FROM orders;

Performance: Redshift-Specific Tips

sql
1-- Redshift: use sort keys and distribution keys for efficient counting
2-- Table definition
3CREATE TABLE orders (
4    order_id INT,
5    user_id INT,
6    status VARCHAR(20),
7    amount DECIMAL(10,2),
8    created_at TIMESTAMP
9)
10DISTSTYLE KEY
11DISTKEY (user_id)
12SORTKEY (created_at, status);
13
14-- This query benefits from the sort key on (created_at, status)
15SELECT
16    DATE_TRUNC('month', created_at) AS month,
17    COUNT(CASE WHEN status = 'completed' THEN 1 END) AS completed
18FROM orders
19WHERE created_at >= '2024-01-01'
20GROUP BY 1;

Common Pitfalls

  • Using FILTER (WHERE) in Redshift: The FILTER clause is a PostgreSQL extension not supported by Redshift. Queries using FILTER fail with a syntax error in Redshift. Use COUNT(CASE WHEN ... THEN 1 END) instead, which works in both systems.
  • Using COUNT(*) with CASE instead of COUNT(expression): COUNT(CASE WHEN cond THEN 1 END) works because COUNT skips NULLs. If you accidentally add ELSE 0, the count includes non-matching rows (since 0 is not NULL). Either omit the ELSE with COUNT, or use SUM with ELSE 0.
  • Dividing counts without casting to float: COUNT(completed) / COUNT(*) performs integer division and always returns 0 or 1. Cast to DECIMAL or multiply by 100.0 first to get a percentage.
  • Not using GROUP BY with conditional counts: Conditional counts without GROUP BY return a single row for the entire table. If you need per-category breakdowns, add GROUP BY category and include the column in the SELECT.
  • Counting NULLs: COUNT(column) excludes NULL values; COUNT(*) includes them. When counting conditionally, CASE WHEN col IS NULL THEN 1 END is needed to count NULL rows, since COUNT(col) silently skips them.

Summary

  • Use COUNT(CASE WHEN condition THEN 1 END) for portable conditional counting (works in Postgres, Redshift, MySQL, SQL Server)
  • Use SUM(CASE WHEN condition THEN 1 ELSE 0 END) as an equivalent alternative
  • Use FILTER (WHERE) in PostgreSQL 9.4+ for cleaner syntax (not supported in Redshift)
  • Combine with GROUP BY for breakdowns by category, time period, or other dimensions
  • Cast to DECIMAL before dividing counts to avoid integer division truncation
  • In Redshift, use sort keys on columns in WHERE and CASE WHEN clauses for performance

Course illustration
Course illustration

All Rights Reserved.