Redshift. Convert comma delimited values into rows
Master System Design with Codemia
Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.
Introduction
Despite the mistaken tags, this article’s title is clearly about Amazon Redshift string processing: turning a comma-delimited field into one row per value. Redshift does not give you a magical one-call “split this string into rows” operator in the same way some other SQL systems do, so the usual solution is to combine SPLIT_PART with a generated sequence of positions.
Example Problem
Suppose you have a table like this:
You want the result:
Core Idea: Split by Position
Redshift’s SPLIT_PART(string, delimiter, position) returns one field at a given position. So if you can generate positions 1, 2, 3, ..., you can expand a delimited string into rows.
Count How Many Parts Exist
A simple way to estimate the number of comma-separated pieces is to count commas and add one.
This works for simple well-formed comma-delimited data.
Use a Numbers CTE
Then join each row against a small numbers table or CTE.
This is the standard pattern: generate candidate positions and use SPLIT_PART for each valid one.
Make the Output Cleaner
If your source data may contain spaces such as A1, B2, C3, trim the extracted value.
That keeps downstream joins and filters cleaner.
Use a Permanent Numbers Table for Real Workloads
For production workloads, a permanent numbers table is usually better than hardcoding a tiny CTE. It is more reusable and scales to longer lists.
Populate it once with enough integers for your expected maximum list length, then reuse it anywhere you need position-based expansion logic.
Why Not Keep Comma-Delimited Data?
A comma-separated column is often a sign that the data should really be normalized into a child table. Splitting into rows during query time works, but it is usually a workaround, not the ideal long-term schema.
If you control the data model, storing one value per row is almost always simpler for filtering, joining, and analytics.
Common Pitfalls
A common mistake is assuming Redshift supports the exact same row-splitting functions as PostgreSQL or another SQL engine. Another is forgetting to trim whitespace after splitting. Developers also often underestimate how messy comma-delimited data becomes when values can be empty, malformed, or contain commas themselves. Finally, if the list can be longer than the hardcoded numbers CTE, rows silently disappear from the result.
Summary
- In Redshift, a common way to turn comma-delimited text into rows is
SPLIT_PARTplus generated positions. - '
REGEXP_COUNThelps determine how many parts a row contains.' - A reusable numbers table is better than a tiny hardcoded sequence for real workloads.
- '
TRIMis useful when delimiters are followed by spaces.' - If possible, normalize the data model so you do not need to split delimited strings at query time.

