Getting file names without extensions
Introduction
Getting a file name without its extension sounds trivial until edge cases show up. Hidden files, multiple extensions such as archive.tar.gz, and full paths all change what "without extension" is supposed to mean, so the best solution is usually a path-aware library rather than raw string slicing.
What Needs to Be Removed
A typical filename has two logical parts:
- the base name
- the extension after the last dot
For report.pdf, the base name is report. For archive.tar.gz, many libraries treat only .gz as the extension, leaving archive.tar as the stem. Whether that is correct depends on your use case.
Python Examples
os.path.splitext removes only the last extension:
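A minimal sketch of that call, wrapped in a helper so the behavior is easy to check. The helper name strip_last_extension and the sample names are illustrative assumptions; os.path.splitext itself is the standard-library function.

```python
import os.path


def strip_last_extension(name: str) -> str:
    """Return name with only its final extension removed."""
    # splitext returns a (root, ext) pair; ext is "" when there is no extension.
    root, _ext = os.path.splitext(name)
    return root


# Hypothetical sample names chosen to cover the edge cases discussed above.
for sample in ["report.pdf", "archive.tar.gz", ".env", "README"]:
    print(sample, "->", strip_last_extension(sample))
```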
For archive.tar.gz, that yields archive.tar, and a leading-dot name such as .env comes back unchanged.
pathlib gives the same basic behavior with a nicer API:
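One way this might look, using PurePosixPath so the example behaves the same on any operating system. The helper name stem_of is an assumption made for illustration.

```python
from pathlib import PurePosixPath


def stem_of(path_text: str) -> str:
    """Return the final path component without its last suffix."""
    # .stem drops the directory part and only the last suffix.
    return PurePosixPath(path_text).stem


print(stem_of("/tmp/report.pdf"))   # directory and .pdf are dropped
print(stem_of("archive.tar.gz"))    # only .gz is dropped
print(stem_of(".env"))              # leading dot is not treated as a suffix
```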
If you need to strip every suffix:
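A possible sketch: keep removing the last suffix until none remains. The helper name strip_all_suffixes is hypothetical. This treats every dotted piece after the first as a suffix, which may be too aggressive for names that contain version numbers.

```python
from pathlib import PurePosixPath


def strip_all_suffixes(path_text: str) -> str:
    """Return the file name with every dotted suffix removed."""
    path = PurePosixPath(path_text)
    # with_suffix("") removes one suffix per pass; loop until none is left.
    while path.suffix:
        path = path.with_suffix("")
    return path.name


print(strip_all_suffixes("archive.tar.gz"))
print(strip_all_suffixes("v1.2.notes.txt"))  # also strips ".2", which may be unwanted
```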
JavaScript Example
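If you only need the last extension removed, a common approach is to find the last dot with lastIndexOf(".") and keep everything before it, but only when that index is greater than 0.
Using index > 0 avoids turning .env into an empty string. That is an important edge case many simple snippets miss.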
Bash Example
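Shell parameter expansion is concise: with the name held in a variable (called filename here for illustration), ${filename%.*} yields the value without its last extension.
This removes the shortest suffix matching .*. It is useful in scripts, but it is purely string-based: it turns .env into an empty string, and on a path whose last component has no dot it can cut into a dotted directory name instead.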
Paths Versus Names
Another common source of bugs is mixing the full path with the filename. If the input is /tmp/report.pdf, you may want:
- just the stem: report
- the path without extension: /tmp/report
Libraries often let you choose clearly, which is safer than manually splitting on / and . yourself.
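As a small illustration in Python's pathlib, the two results come from two different attributes, so the choice is visible in the code. The variable names are illustrative.

```python
from pathlib import PurePosixPath

source = PurePosixPath("/tmp/report.pdf")

# Name only: the directory and the last suffix are both dropped.
stem_only = source.stem

# Full path with the last suffix removed: the directory is kept.
path_without_extension = source.with_suffix("")

print(stem_only)
print(path_without_extension)
```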
That difference matters in batch-processing scripts. A filename-only operation is fine for display logic, but if you intend to write output next to the source file, dropping the directory portion accidentally can put the new file in the wrong place.
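A brief sketch of that failure mode, with hypothetical helper names and a made-up input path. The first helper keeps the directory; the second returns a bare file name, which a later write would resolve against the current working directory.

```python
from pathlib import PurePosixPath


def sibling_output(source_text: str, new_suffix: str) -> str:
    """Swap the last suffix while keeping the source directory."""
    return str(PurePosixPath(source_text).with_suffix(new_suffix))


def name_only_output(source_text: str, new_suffix: str) -> str:
    """Build the output from the stem alone; the directory is lost."""
    return PurePosixPath(source_text).stem + new_suffix


print(sibling_output("/data/in/report.pdf", ".txt"))    # stays in /data/in
print(name_only_output("/data/in/report.pdf", ".txt"))  # no directory part
```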
That is one reason path objects tend to age better than ad hoc string code.
They also make later refactoring easier, because once you need parent directories, suffix lists, or normalized joins, the path object is already carrying the right semantics instead of forcing you to rebuild them from string fragments.
Hidden Files and Multi-Part Extensions
Files such as .gitignore and .env are not ordinary "name plus extension" cases. Many tools treat the leading dot as part of the name, not as an extension separator. Multi-part extensions create a second ambiguity: does .tar.gz count as one logical extension or two? The correct answer depends on the application, so make the rule explicit in code.
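One way to make the rule explicit is to list the multi-part extensions the application cares about and fall back to last-suffix behavior otherwise. The list COMPOUND_EXTENSIONS and the helper logical_stem below are assumptions for illustration, not a standard API.

```python
from pathlib import PurePosixPath

# Hypothetical policy: these count as one logical extension each.
COMPOUND_EXTENSIONS = (".tar.gz", ".tar.bz2", ".tar.xz")


def logical_stem(path_text: str) -> str:
    """Return the file name without its logical extension."""
    name = PurePosixPath(path_text).name
    lowered = name.lower()
    for ext in COMPOUND_EXTENSIONS:
        # Require at least one character before the compound extension.
        if lowered.endswith(ext) and len(name) > len(ext):
            return name[: -len(ext)]
    # Otherwise strip only the last suffix; leading-dot names are kept whole.
    return PurePosixPath(name).stem


print(logical_stem("/backups/archive.tar.gz"))
print(logical_stem("notes.v2.txt"))
print(logical_stem(".gitignore"))
```

Keeping the policy in one named constant means a reviewer can see which extensions are treated specially without reading the slicing logic.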
Common Pitfalls
- Removing text after the first dot instead of the last dot.
- Breaking hidden files such as .env.
- Ignoring full paths and operating on directory names by accident.
- Assuming archive.tar.gz should always become archive.
- Using raw string operations when a path library already solves the problem.
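Several of these pitfalls can be reproduced with a short comparison. The helper names are hypothetical: naive_stem splits on the first dot of the whole string, while last_dot_stem defers to os.path.splitext.

```python
import os.path


def naive_stem(name: str) -> str:
    """Raw string approach: keep everything before the first dot."""
    return name.split(".")[0]


def last_dot_stem(name: str) -> str:
    """Path-aware approach: remove only the last extension."""
    return os.path.splitext(name)[0]


# Illustrative inputs: a multi-part extension, a hidden file,
# and a path with a dotted directory but no file extension.
for sample in ["archive.tar.gz", ".env", "/tmp/my.dir/notes"]:
    print(repr(sample), repr(naive_stem(sample)), repr(last_dot_stem(sample)))
```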
Summary
- Use a path-aware library when possible.
- Most standard helpers remove only the last extension.
- Hidden files and multi-part extensions require explicit policy decisions.
- Keep path handling separate from filename handling.
- Small edge cases make this problem more subtle than it first appears.

