How can I validate an email address using a regular expression?

Introduction

You can validate the basic shape of an email address with a regular expression, but regex alone is not full email validation. It can catch obvious formatting mistakes such as a missing @ or a missing domain part. It cannot prove that the address exists or accepts mail, and a simple pattern will not cover every obscure edge case allowed by the relevant specifications.

A Practical Regex for Common Cases

For most application forms, a pragmatic pattern is enough:

regex
^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$

This allows:

  • a typical local part before @
  • a domain containing dots and hyphens
  • a top-level domain with at least two letters

It intentionally favors practical readability over perfect RFC coverage.
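
To make that trade-off concrete, the sketch below runs the pattern against a handful of inputs. The sample addresses are illustrative assumptions, not an exhaustive test suite. Note that the pattern is permissive in places: it accepts consecutive dots in the local part, which a stricter rule would reject.

```python
import re

# Same pattern as above, compiled once for reuse.
EMAIL_RE = re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$")

# Illustrative inputs (assumptions for this sketch).
samples = [
    "alice@example.com",               # typical address
    "bob.smith+tag@mail.example.org",  # dots, plus-addressing, subdomain
    "no-at-sign.example.com",          # missing @
    "carol@example",                   # no top-level domain
    "dave@example.c",                  # one-letter top-level domain
    "a..b@example.com",                # consecutive dots: accepted by this pattern
]

# Map each sample to whether the whole string matches the pattern.
results = {s: bool(EMAIL_RE.fullmatch(s)) for s in samples}

for address, ok in results.items():
    print(f"{address!r}: {ok}")
```

The first two and the last sample match; the other three do not.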

Example in Python

python
import re

EMAIL_RE = re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$")


def is_valid_email(value: str) -> bool:
    # fullmatch succeeds only if the entire string matches the pattern.
    return bool(EMAIL_RE.fullmatch(value))


print(is_valid_email("user@example.com"))    # True
print(is_valid_email("bad-address"))         # False
print(is_valid_email("missing@domain"))      # False

Use fullmatch rather than match so the whole string must satisfy the pattern.
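
The difference shows up with a trailing newline. In Python, the $ anchor also matches just before a newline at the very end of the string, so match can accept input that fullmatch rejects. A minimal sketch, using an illustrative address:

```python
import re

EMAIL_RE = re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$")

# Input with a stray newline at the end, as might arrive from a file or a form.
with_newline = "alice@example.com\n"

# match() anchors at the start only; "$" is satisfied just before the final newline.
match_result = bool(EMAIL_RE.match(with_newline))

# fullmatch() requires the match to consume the entire string, newline included.
fullmatch_result = bool(EMAIL_RE.fullmatch(with_newline))

print(match_result, fullmatch_result)  # True False
```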

Example in JavaScript

javascript
const emailRe = /^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$/;

function isValidEmail(value) {
  // test() returns true when the anchored pattern matches the string.
  return emailRe.test(value);
}

console.log(isValidEmail("user@example.com")); // true
console.log(isValidEmail("bad-address"));      // false

This is enough for many frontend validation flows where the goal is fast user feedback, not protocol-perfect parsing.

Why Full RFC Validation Is a Trap

The full email syntax rules are much broader and stranger than most applications actually want to support. There are valid addresses that surprise people, including quoted local parts and unusual domain arrangements. A fully RFC-compliant regex is hard to read, hard to maintain, and often not aligned with what your mail provider or business rules actually accept.

In practice, most applications should prefer:

  1. a simple syntax check
  2. server-side normalization and validation
  3. confirmation through a verification email

That is more robust than trying to encode the entire email specification in one regex.
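
The three steps can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a production design: start_signup, confirm, and the in-memory pending_confirmations store are hypothetical names, and actually sending the email is left out.

```python
import re
import secrets

EMAIL_RE = re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$")

# Hypothetical in-memory store: confirmation token -> pending address.
# A real system would use a database and expire old tokens.
pending_confirmations = {}


def start_signup(raw_value):
    """Return a confirmation token, or None if the syntax check fails."""
    value = raw_value.strip()              # normalization: drop stray whitespace
    if not EMAIL_RE.fullmatch(value):      # simple syntax check
        return None
    local, _, domain = value.rpartition("@")
    address = f"{local}@{domain.lower()}"  # normalization: lowercase the domain
    token = secrets.token_urlsafe(32)      # unguessable token for the email link
    pending_confirmations[token] = address
    # A real system would now email a link containing the token (omitted here).
    return token


def confirm(token):
    """Return the confirmed address, or None if the token is unknown or used."""
    return pending_confirmations.pop(token, None)
```

Only an address whose owner follows the link ever reaches the confirmed state, which is the part no regex can provide.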

Regex Can Only Validate Shape

Even a good regex cannot answer important operational questions:

  • does the domain exist
  • can the domain receive mail
  • is the mailbox real
  • is the address disposable or blocked by policy

If your business process depends on contactability, the real validation step is sending a confirmation email or using a trusted validation service.

Tighten or Loosen the Pattern Based on Product Rules

The right regex depends on your product.

If you want to ban plus-addressing or certain domains, that is not “email validity” in the general sense. That is an application-specific policy and should be treated explicitly.

For example, a stricter corporate-only rule might be:

regex
^[A-Za-z0-9._%+-]+@example\.com$

That is not a better general email validator. It is simply a different business rule.
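
One way to keep policy explicit is to run it as a separate step after the general syntax check, so a rejection can say which rule failed. The sketch below assumes hypothetical settings (ALLOWED_DOMAINS, ALLOW_PLUS_ADDRESSING) and a hypothetical check_email helper; real values depend on the product.

```python
import re

EMAIL_RE = re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$")

# Hypothetical policy settings for illustration only.
ALLOWED_DOMAINS = {"example.com"}
ALLOW_PLUS_ADDRESSING = False


def check_email(value):
    """Return (ok, reason) so callers can tell syntax failures from policy failures."""
    if not EMAIL_RE.fullmatch(value):
        return False, "invalid syntax"
    local, _, domain = value.rpartition("@")
    if domain.lower() not in ALLOWED_DOMAINS:
        return False, "domain not allowed by policy"
    if not ALLOW_PLUS_ADDRESSING and "+" in local:
        return False, "plus-addressing not allowed by policy"
    return True, "ok"


print(check_email("alice@example.com"))
print(check_email("alice+news@example.com"))
```

Changing the policy then means editing the settings, not rewriting the regex.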

Add Length Checks Separately

Length constraints are better handled outside the regex.

python
def is_reasonable_email(value: str) -> bool:
    if len(value) > 254:
        return False
    return bool(EMAIL_RE.fullmatch(value))

Keeping length logic separate makes the regex easier to read and easier to modify later.
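
A slightly fuller, self-contained sketch adds a limit on the local part as well. It assumes the commonly cited limits of 254 characters for the whole address and 64 for the part before @; treat the exact numbers as assumptions to confirm against what your mail system enforces. The constant names are illustrative.

```python
import re

EMAIL_RE = re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$")

# Assumed limits; kept as named constants so they are easy to adjust.
MAX_ADDRESS_LENGTH = 254
MAX_LOCAL_LENGTH = 64


def is_reasonable_email(value: str) -> bool:
    # Cheap length check first, before running the regex at all.
    if len(value) > MAX_ADDRESS_LENGTH:
        return False
    if not EMAIL_RE.fullmatch(value):
        return False
    # The pattern guarantees exactly one "@", so rpartition splits cleanly.
    local, _, _domain = value.rpartition("@")
    return len(local) <= MAX_LOCAL_LENGTH
```

Because the pattern only admits ASCII characters, counting characters here is equivalent to counting bytes.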

Normalize Input Before Validation

Users often paste addresses with leading or trailing spaces. Trim the value before applying the regex.

python
def is_valid_email(value: str) -> bool:
    value = value.strip()
    return bool(EMAIL_RE.fullmatch(value))

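Domain names are case-insensitive, so normalization can also include lowercasing the domain portion before storage or comparison. The local part may be case-sensitive on some mail systems, so leave it unchanged unless your product rules say otherwise.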
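
A minimal sketch of that normalization, assuming a hypothetical normalize_email helper that trims whitespace and lowercases only the domain:

```python
def normalize_email(value: str) -> str:
    """Trim surrounding whitespace and lowercase the domain portion only."""
    value = value.strip()
    # Split on the last "@" so the domain is everything after it.
    local, sep, domain = value.rpartition("@")
    if not sep:
        # No "@" at all: return the trimmed value and let the syntax check reject it.
        return value
    return f"{local}@{domain.lower()}"


print(normalize_email("  Alice@Example.COM \n"))  # Alice@example.com
```

Running this before the regex check and before storage keeps comparisons consistent without altering the local part.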

Server-Side Validation Still Matters

Frontend regex checks are useful for immediate feedback, but they are not a security or integrity boundary. Always validate again on the server because client-side code can be bypassed or altered.

This is especially important when the address becomes a login identifier or a destination for account recovery messages.

Common Pitfalls

The biggest mistake is claiming that one regex can fully validate every legitimate email address and every delivery scenario. Another is using an unreadable “perfect” pattern that nobody on the team can maintain. Developers also often forget to trim whitespace before testing, which causes needless rejection of otherwise valid user input. Finally, client-side regex alone is not enough if the application actually depends on a reachable mailbox.

Summary

  • A regex can validate the common shape of an email address, not the full reality of deliverability.
  • A practical pattern is usually better than a massive RFC-style regex.
  • Use fullmatch or equivalent whole-string validation.
  • Normalize and length-check input in addition to regex testing.
  • If the email must be real, the final validation step is verification, not regex.
