Get the Protocol and Host Name from a URL

Introduction

The safest way to get the protocol and host name from a URL is to parse it with a real URL parser instead of splitting the string manually. In modern JavaScript, the built-in URL class gives you the protocol, hostname, host, port, and other pieces with the correct edge-case handling.

Use The URL Class In JavaScript

Given a full URL string, create a URL object and read its properties:

javascript
const value = "https://www.example.com:8080/path?q=1#top";
const parsed = new URL(value);

console.log(parsed.protocol); // "https:"
console.log(parsed.hostname); // "www.example.com"
console.log(parsed.host);     // "www.example.com:8080"
console.log(parsed.port);     // "8080"

There are two easy details to remember:

  • protocol includes the trailing colon in JavaScript.
  • hostname excludes the port, while host includes it.

If you only want the scheme and the host name without the port, protocol plus hostname is the pair you want.
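As a minimal sketch of that pairing, the helper below joins the two properties into a single string. The name schemeAndHostname is hypothetical and not part of any API, and the sketch assumes the input is an absolute http or https URL.

```javascript
// Hypothetical helper: returns "scheme://hostname" with the port dropped.
// Assumes `value` is an absolute http(s) URL; new URL() throws otherwise.
function schemeAndHostname(value) {
  const parsed = new URL(value);
  // protocol already ends with ":", so only "//" needs to be added.
  return `${parsed.protocol}//${parsed.hostname}`;
}

console.log(schemeAndHostname("https://www.example.com:8080/path?q=1#top"));
// "https://www.example.com"
```

If the port matters for your use case, parsed.origin is usually a better fit than building the string yourself.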

Avoid Manual String Splitting

A lot of buggy code starts with something like url.split("/"). That can seem fine for a happy-path input, but it breaks quickly for:

  • URLs with authentication info
  • IPv6 host addresses
  • custom ports
  • query strings and fragments
  • relative URLs

Using the platform parser is both shorter and more correct than reimplementing URL syntax rules by hand.
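To make the difference concrete, here is a small comparison on an illustrative URL that combines credentials, an IPv6 address, and a custom port. The URL itself is made up for this example.

```javascript
// Made-up URL with userinfo, an IPv6 host, a custom port, a query, and a fragment.
const tricky = "https://user:secret@[2001:db8::1]:8443/path?next=/home#top";

// Naive approach: treat the third "/"-separated segment as the host.
const naive = tricky.split("/")[2];
console.log(naive); // "user:secret@[2001:db8::1]:8443"

// The parser separates credentials, host, and port for you.
const parsed = new URL(tricky);
console.log(parsed.hostname); // "[2001:db8::1]"
console.log(parsed.port);     // "8443"
console.log(parsed.username); // "user"
```

The split result still contains the credentials and the port, so any comparison against an expected host name would silently fail.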

Relative URLs Need A Base

If the input is relative, the URL constructor needs a base URL.

javascript
const relative = new URL("/docs/setup", "https://docs.example.com");

console.log(relative.protocol); // "https:"
console.log(relative.hostname); // "docs.example.com"
console.log(relative.pathname); // "/docs/setup"

Without a base, new URL("/docs/setup") throws a TypeError because the string is not an absolute URL.

That is an important distinction in browser and Node.js code that accepts both absolute and relative paths.
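One way to handle mixed input is a small wrapper that always passes a base and turns the thrown error into a null result. This is only a sketch: parseUrl is a hypothetical name, and returning null on failure is an assumption about what the calling code wants.

```javascript
// Hypothetical helper: parses absolute or relative input.
// Returns a URL object, or null when the input cannot be parsed.
function parseUrl(input, base) {
  try {
    // When input is already absolute, the base is ignored.
    return new URL(input, base);
  } catch {
    return null;
  }
}

const base = "https://docs.example.com";
console.log(parseUrl("/docs/setup", base).hostname);                 // "docs.example.com"
console.log(parseUrl("https://other.example.org/x", base).hostname); // "other.example.org"
console.log(parseUrl("/docs/setup"));                                // null (no base given)
```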

Browser Example

If you want the current page's values in the browser, window.location already exposes the same pieces:

javascript
console.log(window.location.protocol);
console.log(window.location.hostname);
console.log(window.location.host);

That is convenient when you are writing client-side code that needs to compare the current origin with another URL.
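A same-origin check along those lines might look like the sketch below. The function name isSameOrigin is hypothetical, and the current location is passed in as a parameter so the snippet also runs outside a browser; in client-side code you would pass window.location.href.

```javascript
// Hypothetical helper: true when `candidate` resolves to the same origin
// as `currentHref`. Relative candidates are resolved against currentHref.
// Throws if either argument cannot be parsed.
function isSameOrigin(candidate, currentHref) {
  const current = new URL(currentHref);
  const target = new URL(candidate, current);
  return target.origin === current.origin;
}

// In a browser, pass window.location.href as the second argument.
const here = "https://www.example.com/home";
console.log(isSameOrigin("/account", here));                     // true
console.log(isSameOrigin("https://cdn.example.com/a.js", here)); // false
console.log(isSameOrigin("http://www.example.com/home", here));  // false (scheme differs)
```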

Equivalent Example In Python

If the task is language-agnostic, Python offers the same idea through urllib.parse.

python
from urllib.parse import urlparse

parsed = urlparse("https://www.example.com:8080/path?q=1")

print(parsed.scheme)    # https
print(parsed.hostname)  # www.example.com
print(parsed.netloc)    # www.example.com:8080

The property names are different from JavaScript, but the principle is the same: parse first, then read the structured fields.

Protocol, Host, Hostname, And Origin

These terms are often confused, so it helps to separate them clearly:

  • Protocol: the scheme, such as https:
  • Hostname: the domain or IP address without the port
  • Host: the hostname plus port when present
  • Origin: protocol, hostname, and port together (a default port, such as 443 for https, is omitted)

In JavaScript:

javascript
const parsed = new URL("https://api.example.com:8443/users");

console.log(parsed.protocol); // "https:"
console.log(parsed.hostname); // "api.example.com"
console.log(parsed.host);     // "api.example.com:8443"
console.log(parsed.origin);   // "https://api.example.com:8443"

Choosing the wrong property is a common source of subtle bugs in redirects, allowlists, and routing logic.
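To illustrate the allowlist case, the sketch below matches on hostname rather than host, so a URL with an explicit port still passes. The allowlist contents and the name isAllowedRedirect are hypothetical, and requiring https is an assumption made for this example.

```javascript
// Hypothetical allowlist of hostnames (no ports).
const ALLOWED_HOSTNAMES = new Set(["api.example.com", "www.example.com"]);

// Hypothetical check: accept only absolute https URLs on an allowed hostname.
function isAllowedRedirect(target) {
  let parsed;
  try {
    parsed = new URL(target);
  } catch {
    return false; // not an absolute URL
  }
  // Compare hostname, not host: host would include ":8443" and never match.
  return parsed.protocol === "https:" && ALLOWED_HOSTNAMES.has(parsed.hostname);
}

console.log(isAllowedRedirect("https://api.example.com:8443/users")); // true
console.log(isAllowedRedirect("https://api.example.com.evil.test/")); // false
console.log(isAllowedRedirect("http://api.example.com/"));            // false
```

Exact matching on the parsed hostname also avoids the prefix trap that substring checks such as startsWith fall into.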

Common Pitfalls

The most common mistake is expecting protocol to be "https" in JavaScript. The actual value is "https:" with the colon.

Another mistake is using host when you meant hostname. If the URL has a port, host includes it.

Developers also run into failures when they try to parse a relative URL without providing a base. The constructor needs enough context to resolve the location.

Finally, manual string operations often fail on edge cases such as IPv6 addresses, usernames, or encoded characters. Use the built-in parser whenever possible.

Summary

  • In JavaScript, use new URL(value) and read protocol and hostname.
  • protocol includes the trailing colon, and host includes the port.
  • Relative URLs require a base URL when parsing.
  • window.location already provides the current page values in the browser.
  • Avoid manual string splitting because URL syntax has many edge cases.
