Class inheritance in Python 3.7 dataclasses

Introduction

Python 3.7 introduced the dataclasses module, which automatically generates boilerplate methods like __init__, __repr__, and __eq__ for classes that primarily hold data. Dataclasses work with inheritance, but there is one critical ordering rule that trips up nearly every developer the first time they encounter it. Understanding this rule and its solutions will save you from a confusing TypeError at class definition time.

Dataclass Basics

Before looking at inheritance, let us review what a basic dataclass looks like and why it exists. The @dataclass decorator inspects the class body for field annotations and generates an __init__ method whose parameters match those fields in order:

python
from dataclasses import dataclass

@dataclass
class Animal:
    name: str
    species: str
    age: int

This single decorator replaces about a dozen lines of handwritten __init__, __repr__, and __eq__ code. You create instances just like any other class:

python
dog = Animal(name="Rex", species="dog", age=5)
print(dog)  # Animal(name='Rex', species='dog', age=5)
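
To check what the decorator actually generated, the standard library's inspect.signature can list the parameters of __init__, and two instances with equal field values can be compared directly. A small sketch that repeats the Animal class so it runs on its own:

```python
from dataclasses import dataclass
import inspect


@dataclass
class Animal:
    name: str
    species: str
    age: int


# The generated __init__ takes the fields in declaration order.
init_params = list(inspect.signature(Animal.__init__).parameters)
print(init_params)  # ['self', 'name', 'species', 'age']

# The generated __eq__ compares instances field by field.
rex = Animal(name="Rex", species="dog", age=5)
same_rex = Animal("Rex", "dog", 5)
print(rex == same_rex)  # True
print(rex is same_rex)  # False
```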

Fields can have default values:

python
from dataclasses import dataclass

@dataclass
class Animal:
    name: str
    species: str
    age: int = 0  # default value

Just like function parameters, fields with defaults must come after fields without defaults. This rule matters a great deal when inheritance enters the picture.
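
The rule can be observed inside a single class, before any inheritance is involved. In the sketch below, BrokenAnimal is a hypothetical class written only to trigger the error; the exact message wording may vary slightly between Python versions, so the code stores it rather than assuming it:

```python
from dataclasses import dataclass

error_message = None

try:
    @dataclass
    class BrokenAnimal:
        name: str
        age: int = 0
        species: str  # required field declared after a defaulted one
except TypeError as exc:
    # The decorator fails while building __init__, so the class is never created.
    error_message = str(exc)

print(error_message)
```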

Basic Dataclass Inheritance

A child dataclass can inherit from a parent dataclass. The generated __init__ combines the parent's fields followed by the child's fields:

python
from dataclasses import dataclass

@dataclass
class Vehicle:
    make: str
    model: str

@dataclass
class Car(Vehicle):
    num_doors: int
    is_electric: bool

python
car = Car(make="Tesla", model="Model 3", num_doors=4, is_electric=True)
print(car)  # Car(make='Tesla', model='Model 3', num_doors=4, is_electric=True)

The __init__ signature becomes (self, make, model, num_doors, is_electric). Parent fields appear first, child fields appear second. This works perfectly as long as you follow the default-value ordering rule.
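
The combined field order can be confirmed with dataclasses.fields, which returns the fields of a dataclass in the order used for __init__. A minimal sketch using the same Vehicle and Car classes:

```python
from dataclasses import dataclass, fields


@dataclass
class Vehicle:
    make: str
    model: str


@dataclass
class Car(Vehicle):
    num_doors: int
    is_electric: bool


# Parent fields come first, then the fields declared on the child.
field_names = [f.name for f in fields(Car)]
print(field_names)  # ['make', 'model', 'num_doors', 'is_electric']

# Positional arguments follow the same order.
car = Car("Tesla", "Model 3", 4, True)
print(car.num_doors)             # 4
print(isinstance(car, Vehicle))  # True
```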

The Field Ordering Problem

Here is where most developers hit a wall. Suppose the parent class has a field with a default value, and the child class adds a field without one:

python
from dataclasses import dataclass

@dataclass
class Vehicle:
    make: str
    model: str
    year: int = 2024  # has a default

@dataclass
class Car(Vehicle):
    num_doors: int  # no default

This raises a TypeError at class definition time:

 
TypeError: non-default argument 'num_doors' follows default argument

The reason is that the combined __init__ would have the signature (self, make, model, year=2024, num_doors). In Python, you cannot have a parameter without a default after a parameter with a default. The dataclass decorator enforces this rule when it generates __init__.
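
When the parent class comes from code you did not write, it can help to check which of its fields carry defaults before adding required fields in a child. The helper below, split_fields, is a hypothetical name introduced for this sketch; it relies only on dataclasses.fields and the MISSING sentinel from the standard library:

```python
from dataclasses import MISSING, dataclass, fields


@dataclass
class Vehicle:
    make: str
    model: str
    year: int = 2024


def split_fields(cls):
    """Return (required, optional) field names of a dataclass, in order."""
    required, optional = [], []
    for f in fields(cls):
        # A field is optional if it has a plain default or a default_factory.
        has_default = f.default is not MISSING or f.default_factory is not MISSING
        (optional if has_default else required).append(f.name)
    return required, optional


required, optional = split_fields(Vehicle)
print(required)  # ['make', 'model']
print(optional)  # ['year']
```

If the optional list is not empty, a child that adds a field without a default will hit the TypeError described above.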

Solutions to the Ordering Problem

Solution 1 -- Give All Child Fields Defaults

The most straightforward fix is to ensure every child field has a default value:

python
from dataclasses import dataclass

@dataclass
class Vehicle:
    make: str
    model: str
    year: int = 2024

@dataclass
class Car(Vehicle):
    num_doors: int = 4
    is_electric: bool = False

This works because the combined signature (self, make, model, year=2024, num_doors=4, is_electric=False) is valid. The downside is that you might want num_doors to be required, which this approach does not enforce.
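
One possible workaround for that downside on Python 3.7-3.9, not part of the solutions listed here, is to use None as a placeholder default and reject it in __post_init__. This is a sketch under that assumption; the error message text is made up for illustration:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Vehicle:
    make: str
    model: str
    year: int = 2024


@dataclass
class Car(Vehicle):
    # None is only a placeholder that satisfies the ordering rule.
    num_doors: Optional[int] = None

    def __post_init__(self):
        # Treat a missing value as an error, so the field is required in practice.
        if self.num_doors is None:
            raise TypeError("Car requires num_doors")


car = Car("Toyota", "Camry", num_doors=4)

missing_error = None
try:
    Car("Toyota", "Camry")  # num_doors omitted
except TypeError as exc:
    missing_error = str(exc)

print(missing_error)  # Car requires num_doors
```

The trade-off is that the check runs at instantiation rather than in the signature, and type checkers will see num_doors as optional.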

Solution 2 -- Remove Defaults from Parent Fields

If you control the parent class, you can remove the defaults so all fields are required:

python
from dataclasses import dataclass

@dataclass
class Vehicle:
    make: str
    model: str
    year: int  # no default

@dataclass
class Car(Vehicle):
    num_doors: int
    is_electric: bool

This is clean but forces every caller to provide year explicitly.

Solution 3 -- Use field() with default_factory

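A default_factory counts as a default for the ordering rule, so this is Solution 1 applied to mutable values such as lists or dicts, which must always be declared with field(default_factory=...):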

python
from dataclasses import dataclass, field

@dataclass
class Vehicle:
    make: str
    model: str
    features: list = field(default_factory=list)

@dataclass
class Car(Vehicle):
    trim_levels: list = field(default_factory=list)

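Writing features: list = [] would raise a ValueError at class definition time. The decorator rejects list, dict, and set defaults because a single default object would otherwise be shared by every instance. field(default_factory=list) calls list() on each instantiation, so every instance gets its own list.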
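
Both behaviors can be checked directly. In this sketch, BrokenVehicle is a hypothetical class that exists only to trigger the ValueError, and the Vehicle class is trimmed to two fields for brevity:

```python
from dataclasses import dataclass, field

bare_list_error = None

try:
    @dataclass
    class BrokenVehicle:
        make: str
        features: list = []  # rejected by @dataclass
except ValueError as exc:
    bare_list_error = str(exc)

print(bare_list_error)


@dataclass
class Vehicle:
    make: str
    features: list = field(default_factory=list)


a = Vehicle("Tesla")
b = Vehicle("Toyota")
a.features.append("autopilot")

# Each instance received its own list from the factory.
print(a.features)  # ['autopilot']
print(b.features)  # []
```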

Solution 4 -- Use kw_only=True (Python 3.10+)

In Python 3.10 and later, you can use kw_only=True to sidestep the ordering issue entirely:

python
from dataclasses import dataclass

@dataclass(kw_only=True)
class Vehicle:
    make: str
    model: str
    year: int = 2024

@dataclass(kw_only=True)
class Car(Vehicle):
    num_doors: int  # no default, but kw_only makes it valid
    is_electric: bool = False

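With kw_only=True, every field declared in the class becomes a keyword-only argument. Keyword-only parameters may mix required and defaulted entries in any order, so the ordering rule no longer applies. The trade-off is that callers must pass these arguments by name. This is the cleanest solution if you do not need to support Python 3.7-3.9.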

Using __post_init__

The generated __init__ calls __post_init__ as its final step, once every field has been assigned. It is the right place for validation or computed fields:

python
from dataclasses import dataclass, field

@dataclass
class Vehicle:
    make: str
    model: str
    year: int = 2024

    def __post_init__(self):
        if self.year < 1886:
            raise ValueError("Automobiles were not invented before 1886")

@dataclass
class Car(Vehicle):
    num_doors: int = 4
    description: str = field(init=False)

    def __post_init__(self):
        super().__post_init__()  # call parent validation
        self.description = f"{self.year} {self.make} {self.model}"

python
car = Car(make="Toyota", model="Camry", year=2023, num_doors=4)
print(car.description)  # 2023 Toyota Camry

Notice the explicit super().__post_init__() call. The dataclass decorator does not chain __post_init__ calls automatically, so you must call the parent's version yourself if the parent defines one.
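
The effect of leaving out that call is easy to miss because nothing fails. In the sketch below, UncheckedCar and CheckedCar are hypothetical names used to contrast the two cases, and Vehicle is trimmed to two fields:

```python
from dataclasses import dataclass


@dataclass
class Vehicle:
    make: str
    year: int = 2024

    def __post_init__(self):
        if self.year < 1886:
            raise ValueError("Automobiles were not invented before 1886")


@dataclass
class UncheckedCar(Vehicle):
    num_doors: int = 4

    def __post_init__(self):
        # No super().__post_init__() call, so the parent's year check never runs.
        pass


@dataclass
class CheckedCar(Vehicle):
    num_doors: int = 4

    def __post_init__(self):
        super().__post_init__()  # parent validation runs first


# The invalid year is silently accepted.
unchecked = UncheckedCar(make="Ford", year=1800)
print(unchecked.year)  # 1800

try:
    CheckedCar(make="Ford", year=1800)
    rejected = False
except ValueError:
    rejected = True

print(rejected)  # True
```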

Calling super().__init__()

Because the @dataclass decorator generates __init__, you generally should not write your own __init__. If you do need custom initialization logic, put it in __post_init__ instead. If for some reason you override __init__ in a child, you need to call super().__init__() manually and pass all parent fields, which defeats much of the purpose of using dataclasses.
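
For completeness, here is a sketch of what a handwritten __init__ in a child might look like. Passing init=False to the decorator is one way to state explicitly that no __init__ should be generated for Car; the parameter list and defaults are repeated by hand, which is exactly the duplication dataclasses are meant to remove:

```python
from dataclasses import dataclass


@dataclass
class Vehicle:
    make: str
    model: str
    year: int = 2024


@dataclass(init=False)  # keep the handwritten __init__ below
class Car(Vehicle):
    num_doors: int = 4

    def __init__(self, make, model, year=2024, num_doors=4):
        # Every parent field has to be forwarded by hand.
        super().__init__(make, model, year)
        self.num_doors = num_doors


car = Car("Toyota", "Camry", num_doors=2)
print(car)  # Car(make='Toyota', model='Camry', year=2024, num_doors=2)
```

The generated __repr__ and __eq__ still work, but any field added to Vehicle later must also be added to this __init__ manually.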

Common Pitfalls

  • Adding a child field without a default when the parent has fields with defaults. This is the single most common dataclass inheritance mistake.
  • Using mutable objects (lists, dicts, sets) as default values instead of using field(default_factory=...).
  • Forgetting to call super().__post_init__() in the child class when the parent defines its own __post_init__.
  • Assuming field(init=False) fields are automatically computed. You still need to set them in __post_init__.
  • Overriding __init__ directly instead of using __post_init__, which bypasses the benefits of the dataclass decorator.
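
The field(init=False) pitfall in particular can be surprising, because construction succeeds and the problem only appears on first access. A minimal sketch, using hypothetical Label and FixedLabel classes:

```python
from dataclasses import dataclass, field


@dataclass
class Label:
    text: str
    slug: str = field(init=False)  # not an __init__ parameter, never assigned


label = Label("Hello World")

try:
    label.slug
    slug_missing = False
except AttributeError:
    # Nothing ever set the attribute, so it does not exist on the instance.
    slug_missing = True

print(slug_missing)  # True


@dataclass
class FixedLabel:
    text: str
    slug: str = field(init=False)

    def __post_init__(self):
        # Compute the excluded field explicitly.
        self.slug = self.text.lower().replace(" ", "-")


fixed = FixedLabel("Hello World")
print(fixed.slug)  # hello-world
```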

Summary

  • Dataclass inheritance combines parent fields (first) with child fields (second) in the generated __init__.
  • The TypeError about non-default arguments following default arguments is caused by a child adding required fields after a parent defines optional ones.
  • Fix the ordering issue by giving child fields defaults, removing parent defaults, using field(default_factory=...), or using kw_only=True in Python 3.10+.
  • Use __post_init__ for validation and computed fields, and remember to call super().__post_init__() explicitly.
  • Prefer field(default_factory=list) over bare [] for mutable default values to avoid shared-state bugs.
