What characters are allowed in an email address?

Introduction

Email syntax is defined by internet standards, but real-world provider support is often narrower than the full standard. That is why an address may be technically valid yet rejected by a form or service. The practical answer requires separating specification rules from compatibility rules.

Basic Structure

An email address has two main parts:

  • local part before @
  • domain part after @

General form: local-part@domain

Both parts have their own character and placement rules.
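As a minimal sketch of this structure, the split can be done on the last @ sign, so that a quoted local part containing @ is not cut in the wrong place. The helper name `split_address` is hypothetical and only used for illustration.

```python
def split_address(address: str) -> tuple[str, str]:
    """Split an address into (local part, domain) at the last @ sign."""
    local, sep, domain = address.rpartition("@")
    # rpartition returns an empty separator when no @ is present.
    if not sep or not local or not domain:
        raise ValueError("expected the form local-part@domain")
    return local, domain


print(split_address("alice@example.com"))  # ('alice', 'example.com')
```

This only separates the two parts. Each part still needs its own checks, as described in the following sections.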

Local Part Rules

In the common unquoted form, the local part can include letters, digits, and selected symbols.

Commonly allowed symbols include:

  • !
  • #
  • $
  • %
  • &
  • '
  • *
  • +
  • -
  • /
  • =
  • ?
  • ^
  • _
  • `
  • |
  • ~
  • .

Dot rules in unquoted local part:

  • dot cannot be first
  • dot cannot be last
  • dots cannot be consecutive
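The character and dot rules above can be sketched as a small check. This is an illustration, not a full RFC 5322 parser: it uses only the symbols listed in this section (RFC 5322 also permits `{` and `}`), ignores quoted local parts and length limits, and the names `ALLOWED_CHARS` and `is_valid_unquoted_local` are hypothetical.

```python
import string

# Symbols listed above for the unquoted local part (the dot is handled separately).
ALLOWED_SYMBOLS = "!#$%&'*+-/=?^_`|~"
ALLOWED_CHARS = set(string.ascii_letters + string.digits + ALLOWED_SYMBOLS)


def is_valid_unquoted_local(local: str) -> bool:
    """Check an unquoted local part against the character and dot rules."""
    if not local:
        return False
    # Splitting on dots yields an empty piece for a leading, trailing,
    # or doubled dot, so one test covers all three dot rules.
    pieces = local.split(".")
    return all(piece and set(piece) <= ALLOWED_CHARS for piece in pieces)


for candidate in ["first.last", "user+tag", ".user", "user.", "us..er"]:
    print(candidate, "=>", is_valid_unquoted_local(candidate))
```

The first two candidates pass; the last three each break one of the dot rules.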

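Example valid addresses:

  • first.last@example.com
  • user+tag@example.com
  • user_name@example.org

Example invalid unquoted local parts:

  • .user (leading dot)
  • user. (trailing dot)
  • first..last (consecutive dots)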

Quoted local parts can include additional characters, but many systems reject them in practice.

Domain Part Rules

The domain part follows DNS-oriented constraints.

Standard practical rules:

  • labels use letters, digits, and hyphen
  • label cannot start or end with hyphen
  • labels are separated by dots
  • top-level label is usually alphabetic in user-facing validation

Examples:

  • example.com
  • mail.example.org
  • sub-domain.example.net

Internationalized domain names are possible and are represented in DNS using Punycode.
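The label rules and the Punycode conversion can be sketched with the standard library alone. Two assumptions are worth flagging: the 63-character label limit comes from DNS itself rather than from the rules listed above, and Python's built-in `idna` codec implements the older IDNA 2003 rules, so a dedicated IDNA library may behave differently for some names. The function names are hypothetical.

```python
import re

# One DNS label: letters, digits, hyphen; no leading or trailing hyphen; 1-63 chars.
LABEL_RE = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def is_valid_ascii_domain(domain: str) -> bool:
    """Check an ASCII domain against the practical label rules."""
    labels = domain.split(".")
    # Require at least two labels, as most user-facing forms do.
    return len(labels) >= 2 and all(LABEL_RE.fullmatch(label) for label in labels)


def to_ascii_domain(domain: str) -> str:
    """Convert an internationalized domain to its Punycode (xn--) form."""
    return domain.encode("idna").decode("ascii")


print(is_valid_ascii_domain("sub-domain.example.net"))  # True
print(is_valid_ascii_domain("-bad.example.com"))        # False
print(to_ascii_domain("例子.广告"))
```

Converting to the ASCII form first lets the same label check cover internationalized domains.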

Internationalized Email Addresses

Modern standards allow Unicode in email addresses when server infrastructure supports SMTPUTF8.

Examples can include non-Latin characters in local and domain parts. Compatibility, however, varies across providers, gateways, and legacy systems.

If your product sends critical transactional email, test internationalized addresses end to end with your mail provider before claiming full support.

Practical Validation Strategy

Do not rely on one massive regex as your only validator. A robust strategy has multiple stages:

  1. lightweight syntax screening
  2. normalization
  3. domain validation
  4. confirmation email for ownership

For Python applications, a standards-aware library is often safer than custom regex.

bash
python -m pip install email-validator
python
from email_validator import EmailNotValidError, validate_email


def validate_user_email(value: str) -> str:
    """Return the normalized address, or raise ValueError if it is invalid."""
    try:
        # check_deliverability=False skips the DNS lookup for the domain.
        result = validate_email(value, check_deliverability=False)
        return result.normalized
    except EmailNotValidError as exc:
        raise ValueError(f"invalid email: {exc}") from exc


samples = [
    "[email protected]",
    "[email protected]",
    "用户@例子.广告",
]

# Print the normalized form for accepted samples and the reason for rejected ones.
for item in samples:
    try:
        print(item, "=>", validate_user_email(item))
    except ValueError as err:
        print(item, "=>", err)

This code is runnable. The library handles normalization and internationalized addresses, which are cases where hand-maintained regexes tend to go wrong.

Why Services Still Reject Some Valid Addresses

Provider-specific policies may reject technically valid addresses to reduce abuse, simplify account handling, or avoid legacy integration issues. Common restrictions include:

  • limiting local part to alphanumeric plus dot and underscore
  • blocking quoted local parts
  • disallowing Unicode addresses
  • requiring top-level domain allowlist checks

When building signup forms, decide whether your product prioritizes strict standard compliance or broad provider compatibility.

Security and UX Considerations

Email validation should balance security and user experience.

Recommended practices:

  • trim surrounding whitespace before validation
  • normalize case for comparison where appropriate (domains are case-insensitive; local-part handling depends on the provider)
  • avoid over-restrictive client-side rules that block valid users
  • require verification email before account activation

Validation alone does not prove mailbox ownership. Verification does.
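A few of these practices can be sketched with the standard library. This is a minimal illustration that assumes the address has already passed syntax validation; the helper names are hypothetical, and leaving the local part's case untouched is a conservative assumption, since providers differ on whether case matters there.

```python
import secrets


def comparison_key(address: str) -> str:
    """Build a key for duplicate checks; keep the original address for sending."""
    local, _, domain = address.strip().rpartition("@")
    # Domains are case-insensitive, so only the domain is lowercased.
    return f"{local}@{domain.lower()}"


def new_verification_token() -> str:
    """Create a random URL-safe token for a confirmation link."""
    # Storing the token and expiring it are left out of this sketch.
    return secrets.token_urlsafe(32)


print(comparison_key("  Alice@Example.COM \n"))  # Alice@example.com
print(new_verification_token())
```

The account would stay inactive until the user follows a link containing the token, which is what establishes ownership.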

Common Pitfalls

A common pitfall is using an overly strict regex that rejects valid addresses with plus signs or internationalized domains. Another is using an overly permissive regex that accepts malformed dot placement or missing domain labels. Teams also confuse syntax validity with deliverability and ownership. Ignoring provider compatibility constraints can break onboarding in production. Finally, rejecting Unicode addresses without policy rationale can create unnecessary global user friction.
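The first two pitfalls are easy to reproduce. The patterns below are deliberately flawed and written only for this illustration; neither is recommended for real use.

```python
import re

# Too strict: no plus sign in the local part, ASCII-only domain.
TOO_STRICT = re.compile(r"[A-Za-z0-9._]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Too loose: anything without spaces on both sides of a single @.
TOO_LOOSE = re.compile(r"[^@\s]+@[^@\s]+")

# Valid addresses that the strict pattern rejects.
print(bool(TOO_STRICT.fullmatch("user+tag@example.com")))  # False
print(bool(TOO_STRICT.fullmatch("用户@例子.广告")))          # False

# Malformed addresses that the loose pattern accepts.
print(bool(TOO_LOOSE.fullmatch(".user@example..com")))     # True
print(bool(TOO_LOOSE.fullmatch("user@-")))                 # True
```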

Summary

  • Email format rules are broader in standards than in many real-world providers.
  • Local part and domain part have different character and placement constraints.
  • Internationalized email is possible but depends on SMTPUTF8 support.
  • Prefer library-based validation plus verification email over complex custom regex.
  • Design validation rules around your product compatibility and security requirements.
