Distributed Systems
Lamport Clock
Vector Clock
System Locking
Computing Networks

distributed systems, lamport and vector clock and locking

Master System Design with Codemia

Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.

Introduction

Distributed systems cannot rely on one global physical clock, so event ordering and mutual exclusion require special mechanisms. Lamport clocks, vector clocks, and distributed locks solve related but different coordination problems. Understanding their boundaries is essential for building correct replicated workflows.

Why Wall-Clock Time Is Not Enough

Machine clocks drift and network delays reorder messages. Two events with close timestamps may still have opposite causal relationships. To reason about causality, systems use logical clocks based on message flow rather than physical time.

Coordination questions usually split into:

  • did event A happen-before event B
  • are events concurrent
  • who may enter critical section safely

Lamport and vector clocks answer ordering questions. Locks answer critical-section ownership.

Lamport Clock Basics

Each process keeps an integer counter. Local event increments counter. Sent messages carry current counter. Receiver sets local counter to max local and received, then increments.

python
1from dataclasses import dataclass
2
3@dataclass
4class LamportClock:
5    value: int = 0
6
7    def tick(self):
8        self.value += 1
9        return self.value
10
11    def send(self):
12        return self.tick()
13
14    def receive(self, remote_ts: int):
15        self.value = max(self.value, remote_ts) + 1
16        return self.value
17
18p1 = LamportClock()
19p2 = LamportClock()
20msg = p1.send()      # 1
21print(p2.receive(msg))

Lamport guarantees causal order implication: if A happened-before B, then timestamp A is smaller. Reverse implication is not guaranteed.

Vector Clock Basics

Vector clocks keep one counter per process. They can distinguish causal relation from true concurrency.

python
1from dataclasses import dataclass
2
3@dataclass
4class VectorClock:
5    pid: int
6    clock: list
7
8    def tick(self):
9        self.clock[self.pid] += 1
10        return self.clock.copy()
11
12    def send(self):
13        return self.tick()
14
15    def receive(self, remote):
16        self.clock = [max(a, b) for a, b in zip(self.clock, remote)]
17        self.clock[self.pid] += 1
18        return self.clock.copy()

Comparison rule:

  • A happened-before B if every component of A is less than or equal to B and at least one is strictly less
  • otherwise, if neither dominates, events are concurrent

This is useful for conflict detection in eventually consistent systems.

Distributed Locking Is a Different Layer

Logical clocks do not enforce mutual exclusion. If two nodes must not update shared state simultaneously, use distributed lock service with leases and ownership tokens.

Simplified lock pattern:

python
1import time
2import uuid
3
4class LeaseLock:
5    def __init__(self):
6        self.owner = None
7        self.expire = 0
8
9    def acquire(self, token, ttl):
10        now = time.time()
11        if self.owner is None or now >= self.expire:
12            self.owner = token
13            self.expire = now + ttl
14            return True
15        return False
16
17    def release(self, token):
18        if self.owner == token:
19            self.owner = None
20            self.expire = 0
21            return True
22        return False

Real deployments use systems like etcd, ZooKeeper, or Redis-backed primitives with stronger guarantees.

Combining Clocks and Locks Correctly

A practical architecture often uses both:

  • clocks for ordering metadata and conflict analysis
  • locks for short critical mutation windows

Example pattern:

  • attach vector timestamp to writes
  • acquire lock before committing state transition
  • release lock quickly
  • persist timestamp in logs for debugging race conditions

Do not treat lock timestamp as replacement for causal metadata.

Failure and Partition Considerations

Network partitions can break assumptions. Lease-based locks reduce permanent deadlocks but can still permit split-brain behavior if lock backend guarantees are weak. Clock metadata remains useful for post-failure reconciliation even when lock ownership was ambiguous.

Design reviews should define expected behavior under partition and delayed message replay, not only normal-path ordering.

Common Pitfalls

  • Using physical timestamps alone for cross-node causality decisions.
  • Assuming Lamport timestamps can detect concurrent events.
  • Holding distributed locks for long operations and increasing contention.
  • Releasing lock without ownership token validation.
  • Treating locks as replacement for conflict metadata in replicated systems.

Summary

  • Lamport clocks provide lightweight happened-before ordering hints.
  • Vector clocks detect causality and concurrency with richer metadata.
  • Distributed locks control critical-section access, not causal analysis.
  • Combining clocks with short lease-based locks improves correctness.
  • Partition-aware design and recovery strategy are essential in production.

Course illustration
Course illustration

All Rights Reserved.