file manipulation
tail command
programming
file reading
command line tools

Get last n lines of a file, similar to tail

Master System Design with Codemia

Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.

Introduction

Getting the last n lines of a file is the programmatic version of the Unix tail command. The best implementation depends on file size. For small and medium files, reading line by line into a bounded buffer is simple and reliable. For very large files, reading backward from the end can be more efficient because it avoids scanning the whole file.

Simple and Practical: Use a Bounded Buffer

In Python, the easiest good solution is collections.deque with maxlen.

python
1from collections import deque
2
3def tail(path, n):
4    with open(path, "r", encoding="utf-8") as f:
5        return list(deque(f, maxlen=n))
6
7for line in tail("app.log", 3):
8    print(line, end="")

Why this works well:

  • memory stays bounded at n lines
  • the code is short and readable
  • it handles ordinary text files cleanly

For many real applications, this is already good enough.

Why Not Read the Entire File

A naive solution like:

python
with open("app.log") as f:
    lines = f.readlines()
print(lines[-3:])

works, but it loads the whole file into memory. That is fine for small files and a poor choice for large logs.

The deque solution still reads the whole file sequentially, but it does not retain all lines in memory.

Backward Reading for Huge Files

If the file is very large and you want something closer to how tail behaves internally, read from the end in chunks.

python
1def tail_from_end(path, n, chunk_size=1024):
2    with open(path, "rb") as f:
3        f.seek(0, 2)
4        file_end = f.tell()
5        data = b""
6        lines = []
7
8        while file_end > 0 and len(lines) <= n:
9            read_size = min(chunk_size, file_end)
10            file_end -= read_size
11            f.seek(file_end)
12            data = f.read(read_size) + data
13            lines = data.splitlines()
14
15        return [line.decode("utf-8") for line in lines[-n:]]

This avoids scanning from the start when only the tail is needed, though it is more complex and needs care with encodings and incomplete chunk boundaries.

Use the Real tail Command When Appropriate

If you are writing a script for a Unix-like environment and portability is not a concern, calling the real tail command may be simpler than reimplementing it.

python
1import subprocess
2
3result = subprocess.run(
4    ["tail", "-n", "5", "app.log"],
5    capture_output=True,
6    text=True,
7    check=True
8)
9
10print(result.stdout)

This is often perfectly reasonable in operations tooling, but it is less portable than a pure language-level implementation.

Follow Mode Is a Different Problem

Many people associate tail with tail -f, which keeps watching the file for new lines. That is not the same as "get the last n lines once."

For one-time tail behavior:

  • read the last n lines and stop

For follow mode:

  • keep the file open
  • seek to the end
  • poll or watch for new content

Those are related tasks, but they need different control flow.

Common Pitfalls

The most common mistake is using readlines() on very large files and unexpectedly consuming a lot of memory.

Another issue is forgetting about encodings when reading backward in binary mode. Chunk-based reverse reads are efficient, but text decoding becomes more delicate.

Some developers also assume all files end with a trailing newline. That is common, but not guaranteed, so your implementation should tolerate the final line being unterminated.

Finally, if you are on Windows or in a restricted environment, relying on the external tail command may not be portable.

Summary

  • For many use cases, deque(f, maxlen=n) is the simplest good tail implementation.
  • Avoid loading the entire file into memory unless the file is known to be small.
  • For huge files, reading backward from the end can be more efficient.
  • Calling the real tail command is fine when platform dependence is acceptable.
  • 'tail -f style follow behavior is a different feature from just reading the last n lines once.'

Course illustration
Course illustration

All Rights Reserved.