How to install packages in Airflow docker-compose?
Introduction
Apache Airflow deployed via Docker Compose uses the official apache/airflow image, which includes core dependencies but not every Python package your DAGs may need. There are several ways to install additional packages: listing them in the _PIP_ADDITIONAL_REQUIREMENTS environment variable, building a custom Docker image, or mounting a requirements.txt file into the container. The best approach depends on whether you need quick iteration during development or reproducible builds for production.
Method 1: _PIP_ADDITIONAL_REQUIREMENTS (Quick Development)
_PIP_ADDITIONAL_REQUIREMENTS is an environment variable recognized by the official Airflow Docker image's entrypoint script. When it is set, the entrypoint runs pip install for the listed packages on every container startup. This is convenient for development but a poor fit for production, because packages are reinstalled on every restart and unpinned versions can change between installs.
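As a minimal sketch, the variable can be set in the .env file next to docker-compose.yaml. This assumes the stock Airflow compose file, which passes _PIP_ADDITIONAL_REQUIREMENTS through to every service; the package pins are illustrative placeholders, not recommendations.

```shell
# Append the variable to the .env file that Docker Compose reads automatically.
# Assumption: the compose file forwards _PIP_ADDITIONAL_REQUIREMENTS to the containers.
# The packages and versions below are placeholders; list what your DAGs need.
cat >> .env <<'EOF'
_PIP_ADDITIONAL_REQUIREMENTS=pandas==2.1.4 requests==2.31.0
EOF

# Recreate the containers so the entrypoint sees the new value:
#   docker compose up -d
```

Every restart repeats the install, so this is best kept to short-lived development stacks.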
Method 2: Custom Dockerfile (Production)
Write a Dockerfile that extends the official image and build it with docker compose build. Packages are installed at build time, so container startups are fast and builds are reproducible. This is the recommended approach for production.
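One way this could look, assuming Airflow 2.8.1 and a requirements.txt in the project root. The file contents are illustrative, and AIRFLOW_VERSION is the environment variable the official image sets for its own version.

```shell
# requirements.txt: placeholder pins; replace with versions you have tested.
cat > requirements.txt <<'EOF'
pandas==2.1.4
requests==2.31.0
EOF

# Dockerfile extending the official image (the 2.8.1 tag is an assumption).
cat > Dockerfile <<'EOF'
FROM apache/airflow:2.8.1
COPY requirements.txt /requirements.txt
# Re-stating the Airflow version keeps pip from silently up- or downgrading Airflow itself.
RUN pip install --no-cache-dir "apache-airflow==${AIRFLOW_VERSION}" -r /requirements.txt
EOF

# Then build and restart:
#   docker compose build
#   docker compose up -d
```

For docker compose build to pick this up, the compose file has to reference the Dockerfile, for example with build: . in the shared service definition instead of a fixed image: entry.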
Method 3: Mounting a requirements.txt
This approach mounts the requirements file and installs packages via a custom entrypoint. It is more maintainable than inline _PIP_ADDITIONAL_REQUIREMENTS but still reinstalls on every startup. Use it when you want a file-based approach without building a custom image.
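A hedged sketch of the file-based variant follows. Instead of replacing the image entrypoint, it overrides the service command so that pip install runs before Airflow starts. The service name airflow-scheduler is assumed from the stock compose file, and each Airflow service would need an equivalent override with its own final command.

```shell
# Compose override that mounts requirements.txt read-only and installs it at startup.
# Assumptions: a service called airflow-scheduler exists, and the image entrypoint
# passes a leading "bash" command through unchanged.
cat > docker-compose.override.yml <<'EOF'
services:
  airflow-scheduler:
    volumes:
      - ./requirements.txt:/requirements.txt:ro
    command: ["bash", "-c", "pip install --no-cache-dir -r /requirements.txt && exec airflow scheduler"]
EOF

# Apply it:
#   docker compose up -d
```

Editing requirements.txt and restarting the service is then enough to change the package set, at the cost of a slower startup.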
Installing Airflow Provider Packages
Airflow provider packages (for AWS, GCP, Slack, etc.) add operators, hooks, and sensors. They are versioned separately from core Airflow and should be pinned to specific versions.
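As an illustration, providers can be installed against the Airflow constraints file so that pip resolves them to versions tested with that Airflow release. The sketch assumes Airflow 2.8.1 on Python 3.11; the two providers are examples only.

```shell
# Dockerfile that installs provider packages under Airflow's published constraints.
# Assumptions: Airflow 2.8.1 with Python 3.11; swap in the providers you actually use.
cat > Dockerfile <<'EOF'
FROM apache/airflow:2.8.1-python3.11
RUN pip install --no-cache-dir \
    apache-airflow-providers-amazon \
    apache-airflow-providers-slack \
    --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2.8.1/constraints-3.11.txt"
EOF
```

The Airflow and Python versions in the constraints URL need to match the base image, otherwise the resolved versions may not fit the installed Airflow.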
Installing System-Level Dependencies
Always switch to USER root for apt-get commands and back to USER airflow for pip install. The Airflow image runs as the airflow user by default for security.
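The user switch described above might look like the following sketch. The apt packages are placeholders, and a requirements.txt is assumed to exist next to the Dockerfile.

```shell
# Dockerfile with system packages installed as root and Python packages as airflow.
# build-essential and libpq-dev are example packages, not requirements.
cat > Dockerfile <<'EOF'
FROM apache/airflow:2.8.1

USER root
RUN apt-get update \
    && apt-get install -y --no-install-recommends build-essential libpq-dev \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

USER airflow
COPY requirements.txt /requirements.txt
RUN pip install --no-cache-dir -r /requirements.txt
EOF
```

Keeping the cache cleanup in the same RUN instruction as the install means the package lists never end up in an image layer.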
Verifying Installed Packages
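A quick check is to confirm that Python inside a running container can locate the modules your DAGs import. The sketch below is meant to run in a shell inside the container; has_module is a hypothetical helper, and airflow-scheduler is an assumed service name from the stock compose file.

```shell
# Intended to run inside an Airflow container, for example after:
#   docker compose exec airflow-scheduler bash

# has_module NAME: succeed if Python can locate the module, without importing it.
has_module() {
    python3 -c 'import importlib.util, sys; sys.exit(0 if importlib.util.find_spec(sys.argv[1]) else 1)' "$1"
}

# json and csv are standard-library stand-ins; list the modules your DAGs import.
for module in json csv; do
    if has_module "$module"; then
        echo "ok: $module"
    else
        echo "MISSING: $module"
    fi
done
```

In the same shell, pip list or pip show with a package name reports the installed versions. Repeating the check in the webserver and worker containers confirms that all services match.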
Common Pitfalls
- Using _PIP_ADDITIONAL_REQUIREMENTS in production: This environment variable runs pip install on every container startup, adding minutes to startup time and making builds non-reproducible (package versions may change between installs). Build a custom Docker image with a pinned requirements.txt for production.
- Forgetting to install packages in all Airflow services: The webserver, scheduler, and worker all need the same packages. Using the x-airflow-common YAML anchor ensures all services share the same image and environment. If one service is missing the package, tasks fail on that specific component.
- Installing packages as root instead of the airflow user: Running pip install as root installs packages to a different Python path than the airflow user sees. Always use USER airflow before pip install in the Dockerfile, and only use USER root for system-level apt-get commands.
- Version conflicts with Airflow's pinned dependencies: Airflow pins specific versions of packages like SQLAlchemy, Flask, and Jinja2. Installing incompatible versions breaks Airflow. Use pip install --constraint or check Airflow's constraints file at https://raw.githubusercontent.com/apache/airflow/constraints-2.8.1/constraints-3.11.txt.
- Not cleaning apt cache in Dockerfile: Leaving apt-get cache in the Docker image unnecessarily inflates image size. Always add && apt-get clean && rm -rf /var/lib/apt/lists/* to the same RUN layer as apt-get install.
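To illustrate how a shared YAML anchor keeps the services aligned, here is a stripped-down sketch. It is not a complete compose file: the database, broker, volumes, and environment are omitted, and the file name docker-compose.sketch.yaml is hypothetical.

```shell
# Minimal illustration of the x-airflow-common anchor; every service merges the
# same build definition, so all of them run the same image with the same packages.
cat > docker-compose.sketch.yaml <<'EOF'
x-airflow-common:
  &airflow-common
  build: .

services:
  airflow-webserver:
    <<: *airflow-common
    command: webserver
  airflow-scheduler:
    <<: *airflow-common
    command: scheduler
  airflow-worker:
    <<: *airflow-common
    command: celery worker
EOF
```

A package added to the Dockerfile then reaches the webserver, scheduler, and worker through a single rebuild.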
Summary
- For development: use the _PIP_ADDITIONAL_REQUIREMENTS environment variable for quick package installation
- For production: build a custom Docker image with a pinned requirements.txt
- Install system dependencies as root, Python packages as the airflow user
- Use x-airflow-common YAML anchors to ensure all services get the same packages
- Pin package versions and respect Airflow's dependency constraints
- Verify installation with docker compose exec ... pip list or Python import checks

