Identifying verb tenses in python
Master System Design with Codemia
Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.
Introduction
Identifying verb tenses in text is a common NLP task used in grammar checkers, language learning apps, and text analysis pipelines. Python's NLP libraries — NLTK, spaCy, and pattern — can detect verb forms through part-of-speech (POS) tagging. POS tags encode tense information, and mapping those tags to human-readable tense labels gives you a practical verb tense detector.
Penn Treebank Verb Tags
Most English POS taggers use the Penn Treebank tagset, which assigns specific tags to verb forms:
| Tag | Meaning | Example |
| VB | Base form | "walk", "be" |
| VBD | Past tense | "walked", "was" |
| VBG | Gerund / present participle | "walking", "being" |
| VBN | Past participle | "walked", "been" |
| VBP | Present tense, non-3rd person | "walk", "are" |
| VBZ | Present tense, 3rd person singular | "walks", "is" |
Using NLTK
NLTK provides POS tagging through nltk.pos_tag() after tokenization.
Mapping Tags to Tense Labels
Using spaCy
spaCy provides richer morphological analysis, including tense as a separate feature.
spaCy's morphological features distinguish tense from aspect, allowing you to detect progressive ("is walking"), perfect ("has walked"), and simple forms.
Detecting Compound Tenses
Simple POS tagging identifies individual verb forms but not compound tenses like "has been walking" (present perfect progressive). To detect these, analyze auxiliary chains.
Using the pattern Library
The pattern library can conjugate and lemmatize verbs, making it useful for tense detection by comparing forms.
Common Pitfalls
- Relying on POS tags alone for compound tenses — "has been walking" tags each word separately, so you need to analyze auxiliary verb chains to determine the full tense.
- Not downloading required NLTK data —
pos_tagfails silently or crashes withoutaveraged_perceptron_taggerandpunkt. - Confusing past tense (VBD) with past participle (VBN) — "walked" can be either, depending on context. POS taggers use surrounding context to disambiguate, but errors occur.
- Treating gerunds (VBG) as always present tense — "Walking is good exercise" uses "walking" as a noun (gerund), not a present-tense verb.
- Assuming spaCy and NLTK always agree — different models produce different tags for ambiguous cases. Pick one and be consistent.
Summary
- POS tags (VBD, VBZ, VBG, VBN, VBP, VB) encode verb tense information in the Penn Treebank tagset.
- Use NLTK's
pos_tagfor quick tense detection or spaCy for richer morphological analysis. - Map POS tags to human-readable tense labels with a lookup dictionary.
- Detect compound tenses by analyzing auxiliary verb chains in the dependency tree.
- Always validate tense detection against your specific domain text — POS taggers have error rates on informal or domain-specific language.

