Get list from pandas dataframe column or row?
Introduction
Converting pandas DataFrame values to Python lists is common when passing data into APIs, plotting libraries, or legacy code. The correct method depends on whether you need a single column, one row, or a two-dimensional list. Getting this right means preserving order, keeping the intended data types, deciding how missing values should behave, and not converting more data than the consumer actually needs.
Extract a Single Column
For one column, Series.tolist is the most direct option.
This keeps row order and is typically enough for API payload assembly.
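As a minimal sketch of single-column extraction, the snippet below builds a small DataFrame and calls tolist on one column. The sample data and column names are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({"name": ["Ada", "Linus", "Grace"], "age": [36, 29, 45]})

# Selecting one column gives a Series; tolist() returns native Python values
# in the same order as the rows.
ages = df["age"].tolist()
names = df["name"].tolist()

print(ages)   # [36, 29, 45]
print(names)  # ['Ada', 'Linus', 'Grace']
```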
Extract a Row as a List
Use positional indexing with iloc when you need row values.
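A small sketch of positional row extraction, assuming an illustrative DataFrame with two text columns:

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({"name": ["Ada", "Linus"], "city": ["London", "Helsinki"]})

# iloc selects by position: 0 is the first row, -1 is the last row.
first_row = df.iloc[0].tolist()
last_row = df.iloc[-1].tolist()

print(first_row)  # ['Ada', 'London']
print(last_row)   # ['Linus', 'Helsinki']
```

The values come back in column order, so the list only makes sense to a consumer that knows that order.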
If you need labeled values instead of an ordered list, call to_dict on the selected row Series.
Choose list or dictionary based on downstream consumer expectations.
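The two forms can be compared side by side. This is a sketch on made-up data; the column names are assumptions:

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({"name": ["Ada", "Linus"], "city": ["London", "Helsinki"]})

row = df.iloc[0]          # a Series whose index holds the column labels

as_list = row.tolist()    # values only, in column order
as_dict = row.to_dict()   # column label -> value

print(as_list)  # ['Ada', 'London']
print(as_dict)  # {'name': 'Ada', 'city': 'London'}
```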
Extract Multiple Rows or Columns
For matrix-style output, df.values.tolist() and df.to_numpy().tolist() both work. The pandas documentation recommends to_numpy over the values attribute.
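A minimal sketch of nested-list conversion on a small all-integer DataFrame (the data is illustrative). The last line shows one way to get a column-major layout instead:

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({"x": [1, 2, 3], "y": [4, 5, 6]})

# Row-major nested list: one inner list per row.
rows = df.to_numpy().tolist()
same_rows = df.values.tolist()

# Column-major alternative: one inner list per column.
columns = [df[c].tolist() for c in df.columns]

print(rows)     # [[1, 4], [2, 5], [3, 6]]
print(columns)  # [[1, 2, 3], [4, 5, 6]]
```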
If type consistency matters, pass an explicit dtype to to_numpy when possible.
This avoids the mixed-type surprises that can happen with loosely typed object arrays.
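The difference can be sketched as follows. The DataFrames and column names are invented for the example; the dtype argument is the one accepted by DataFrame.to_numpy:

```python
import pandas as pd

# Hypothetical numeric data with one integer and one float column.
df = pd.DataFrame({"count": [1, 2], "ratio": [0.5, 1.5]})

# Requesting float64 makes every element a float in the resulting lists.
floats = df.to_numpy(dtype="float64").tolist()
print(floats)  # [[1.0, 0.5], [2.0, 1.5]]

# By contrast, mixing text and numbers falls back to a generic object array.
mixed = pd.DataFrame({"label": ["a", "b"], "count": [1, 2]}).to_numpy()
print(mixed.dtype)  # object
```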
If you are extracting from a row first and then converting, remember that pandas may upcast mixed row values to a broader common dtype. That is another reason to select only the columns you truly need before list conversion.
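A short sketch of that upcasting effect, using made-up columns where two hold integers and one holds floats:

```python
import pandas as pd

# Hypothetical sample data: two integer columns and one float column.
df = pd.DataFrame({"id": [1, 2], "score": [2.5, 3.5], "count": [10, 20]})

# A row that spans int and float columns becomes a float64 Series,
# so the integers come back as floats.
full_row = df.iloc[0].tolist()
print(full_row)  # [1.0, 2.5, 10.0]

# Selecting only the integer columns first keeps them as integers.
int_only = df[["id", "count"]].iloc[0].tolist()
print(int_only)  # [1, 10]
```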
Handle Missing Values Before Conversion
Missing values usually surface as float nan in the resulting list. Python's json module serializes nan as the bare token NaN, which is not valid JSON, and strictly typed consumers may reject it.
Define a missing-value policy early:
- Fill with defaults.
- Drop missing rows.
- Preserve as None for nullable APIs.
Different pipelines need different policies, so make the choice explicit instead of silently hardcoding one.
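The three policies above might look like this in practice. This is a sketch on an invented Series; the fill value 0.0 is an arbitrary assumption, and the None replacement uses a plain list comprehension rather than any special pandas feature:

```python
import json

import pandas as pd

# Hypothetical sample data with one missing value.
s = pd.Series([1.5, float("nan"), 3.0], name="score")

# Policy 1: fill with a default (0.0 is an arbitrary choice here).
filled = s.fillna(0.0).tolist()

# Policy 2: drop the missing entries.
dropped = s.dropna().tolist()

# Policy 3: keep the position but use None, which JSON encodes as null.
nullable = [None if pd.isna(v) else v for v in s.tolist()]

print(filled)                # [1.5, 0.0, 3.0]
print(dropped)               # [1.5, 3.0]
print(json.dumps(nullable))  # [1.5, null, 3.0]
```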
Performance Considerations
For very large dataframes, converting everything to native Python lists can be slow and memory heavy. Prefer vectorized operations and keep data in pandas or NumPy until list conversion is truly necessary.
Good practice:
- Filter rows first.
- Select only required columns.
- Convert to a list at the boundary where the consumer requires it.
This minimizes Python object creation overhead.
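The filter, select, convert sequence can be sketched in one expression. The DataFrame and its column names are hypothetical:

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame(
    {
        "user_id": [1, 2, 3, 4],
        "active": [True, False, True, True],
        "score": [0.9, 0.4, 0.7, 0.2],
        "notes": ["a", "b", "c", "d"],
    }
)

# Filter rows with a boolean mask, select the one column that is needed,
# and only then create Python objects.
active_scores = df.loc[df["active"], "score"].tolist()

print(active_scores)  # [0.9, 0.7, 0.2]
```

Only three floats are materialized here, instead of every cell in the frame.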
Preserve Ordering and Index Intent
By default, list conversion preserves current dataframe order. If order matters to business logic, sort explicitly before conversion.
Do not rely on incidental order from previous operations unless guaranteed.
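A minimal sketch of sorting explicitly before conversion, assuming an illustrative rank column that defines the intended order:

```python
import pandas as pd

# Hypothetical sample data whose stored order does not match the rank.
df = pd.DataFrame({"name": ["b", "c", "a"], "rank": [2, 3, 1]})

# Without sorting, the list follows the current row order.
unsorted_names = df["name"].tolist()

# Sorting first makes the order part of the code rather than an accident.
ordered_names = df.sort_values("rank")["name"].tolist()

print(unsorted_names)  # ['b', 'c', 'a']
print(ordered_names)   # ['a', 'b', 'c']
```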
If the index itself matters, convert it separately, for example with df.index.tolist(), rather than assuming it is part of the row list.
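For instance, with a labeled index the labels and the values can be converted independently and paired afterwards. The index labels and column name below are invented for the sketch:

```python
import pandas as pd

# Hypothetical sample data with a meaningful string index.
df = pd.DataFrame({"price": [9.5, 12.0]}, index=["sku-1", "sku-2"])

# The index is not included when a column or row is converted to a list.
labels = df.index.tolist()
prices = df["price"].tolist()

# Pair them explicitly if the consumer needs both.
pairs = list(zip(labels, prices))

print(labels)  # ['sku-1', 'sku-2']
print(prices)  # [9.5, 12.0]
print(pairs)   # [('sku-1', 9.5), ('sku-2', 12.0)]
```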
Practical Patterns for APIs
If an API expects a list of records, use to_dict(orient="records") instead of nested list conversion.
This is often clearer and less error prone when keys matter in payload schemas.
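A short sketch of the records form on made-up data:

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({"id": [1, 2], "name": ["Ada", "Linus"]})

# One dictionary per row, keyed by column name.
records = df.to_dict(orient="records")

print(records)
# [{'id': 1, 'name': 'Ada'}, {'id': 2, 'name': 'Linus'}]
```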
Prepare Lists for Model or Numerical Inputs
Machine learning and numerical routines often require strict shape and dtype guarantees. When converting from DataFrame, select columns in explicit order and convert to NumPy before list conversion.
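One way to sketch this, assuming a hypothetical feature_cols list that fixes the column order and float64 as the required dtype:

```python
import pandas as pd

# Hypothetical sample data; "label" is not a feature and is left out below.
df = pd.DataFrame({"label": [0, 1], "width": [2, 4], "height": [1.5, 3.0]})

# Name the feature columns explicitly so the order never depends on
# how the DataFrame happened to be built.
feature_cols = ["height", "width"]

# Convert to a NumPy array with a fixed dtype, check the shape, then listify.
X = df[feature_cols].to_numpy(dtype="float64")
assert X.shape == (len(df), len(feature_cols))

features = X.tolist()
print(features)  # [[1.5, 2.0], [3.0, 4.0]]
```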
For row-oriented model APIs, convert the selected rows in one step so that schema alignment is obvious.
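A sketch of that one-step conversion follows. The id values, the feature_cols list, and the idea that the consumer wants one inner list per instance are all assumptions made for illustration:

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame(
    {
        "id": [101, 102, 103],
        "height": [1.5, 3.0, 2.0],
        "width": [2.0, 4.0, 1.0],
    }
)

feature_cols = ["height", "width"]   # explicit feature order
wanted = df["id"].isin([101, 103])   # rows to send

# Row filter, column selection, dtype, and list conversion in one expression.
instances = df.loc[wanted, feature_cols].to_numpy(dtype="float64").tolist()

print(instances)  # [[1.5, 2.0], [2.0, 1.0]]
```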
Common Pitfalls
- Converting entire dataframes to lists when only one column is needed.
- Ignoring missing value behavior and leaking nan into outputs.
- Losing semantic meaning by using row lists where keyed dictionaries are required.
- Assuming column order implicitly without selecting columns explicitly.
- Converting too early and giving up pandas performance advantages.
Summary
- Use Series.tolist for single column extraction.
- Use iloc for row extraction and choose list or dict intentionally.
- Use subset selection before matrix conversions for control and efficiency.
- Handle missing values before conversion to match consumer expectations.
- Convert to Python lists at the latest possible stage in the pipeline.

