CodingLad
distributed systems

Chord: A Scalable Peer-to-Peer Lookup Protocol

Chord: A Scalable Peer-to-Peer Lookup Protocol
0 views
12 min read
#distributed systems

Chord: A Scalable Peer-to-Peer Lookup Protocol

Chord is a Distributed Hash Table (DHT) protocol that organizes nodes into a circular identifier ring of size 2n2^n, where each node and each key is assigned an n-bit identifier. This article provides a comprehensive guide to understanding how Chord works, from basic concepts to advanced operations like node joins and lookups.

Overview

Distributed Hash Tables (DHTs) are peer-to-peer (P2P) algorithms that store key-value pairs across a network of nodes without any central coordinator. Each node in the network can efficiently store and retrieve data by distributing keys across the participating nodes.

Distributed Hash Table Architecture

Peer-to-peer (P2P) systems often involve thousands or millions of nodes, and we want to distribute key–value pairs without any central coordinator. But this leads to a core question:

How do we efficiently find the node responsible for storing a particular key?

A Distributed Hash Table (DHT) like Chord solves this by hashing keys into an identifier space and distributing those keys across nodes. Chord guarantees that any node can locate the owner of any key in O(log N) messagesβ€”even if nodes frequently join and leave.

Chord Network

A Chord network is a distributed hash table (DHT) that organizes nodes in a circular ID ring so they can store and find key–value data efficiently and without any central server.

Chord Network

Key Concepts:

πŸ”’ 1. Identifier Space

All node IDs and key IDs lie in the range:

0≀ID<2n0 \leq ID < 2^n

The numbers wrap around to form a ring.

Example from the figure above:

  • Identifier Space = 4 bits
  • Maximum possible nodes = 24=162^4 = 16 (IDs from 0 to 15)
  • Active nodes in the diagram: 7 (nodes 0, 1, 3, 6, 7, 11, 14)

πŸ” 2. Key Placement Rule (Successor Rule)

To store a key-value pair:

  • A key is stored at the first node whose ID is β‰₯ keyID when walking clockwise around the ring.
  • This node is called the successor of the key.
  • If no node has a greater ID (i.e., wrap-around case), the successor is the first node in the ring.

Examples from the figure above:

  • Key 1 stored at node 1 (as 1 is the first active node β‰₯ 1)
  • Keys 2, 3 stored at node 3 (as 3 is the first active node β‰₯ 2 and β‰₯ 3)
  • Keys 4, 5, 6 stored at node 6 (as 6 is the first active node β‰₯ 4, 5, 6)
  • Key 7 stored at node 7 (as 7 is the first active node β‰₯ 7)
  • Keys 8, 9, 10, 11 stored at node 11 (as 11 is the first active node β‰₯ 8, 9, 10, 11)
  • Keys 12, 13, 14 stored at node 14 (as 14 is the first active node β‰₯ 12, 13, 14)
  • Key 15 (wrap-around case): No node β‰₯ 15, so wraps to node 0

🧭 3. Successor & Predecessor

Each node maintains:

  • a successor (next clockwise node)
  • a predecessor (previous counter-clockwise node)

These alone ensure correctness but make lookups slow (O(N)).

Example from the diagram above:

Successor:

  • Successor of 0 β†’ 1
  • Successor of 1 β†’ 3
  • Successor of 3 β†’ 6
  • Successor of 6 β†’ 7
  • Successor of 7 β†’ 11
  • Successor of 11 β†’ 14
  • Successor of 14 β†’ wraps β†’ 0

Predecessor:

The predecessor of a node is the previous active node counter-clockwise.

  • Predecessor of 1 β†’ 0
  • Predecessor of 3 β†’ 1
  • Predecessor of 6 β†’ 3
  • Predecessor of 7 β†’ 6
  • Predecessor of 11 β†’ 7
  • Predecessor of 14 β†’ 11
  • Predecessor of 0 β†’ 14

⚑ 4. Finger Table (Speeding Up Lookups)

To accelerate lookups, each node maintains a table of shortcuts:

For a node N, finger i points to:

N+2i(mod2n)N + 2^i \pmod{2^n}, for 0≀i<n0 \leq i < n

This reduces lookup time to:

O(log N)

We will learn how finger tables work clearly with a detailed example below.

πŸ”„ 5. Node Join

When a node joins:

  1. It contacts an existing node.
  2. Finds its correct position (successor and predecessor).
  3. Takes responsibility for a part of the ID range previously held by its successor.
  4. Keys within that range are migrated to it.
  5. The finger tables eventually stabilize.

We will learn how node joins work clearly with a detailed example below.

πŸ‘‹ 6. Node Departure

When a node leaves:

  1. It transfers its keys to its successor.
  2. Its predecessor points to its successor.
  3. Stabilization fixes the ring structure over time.

πŸ”§ 7. Stabilization

Since finger tables can become outdated:

  • Nodes periodically run stabilization protocols.
  • Fix successor/predecessor pointers.
  • Update finger table entries.

The slides mention that lookup() always undershoots, meaning lookups may initially find an earlier node, then require successor corrections.


Example

Chord Network Example

PART 1 β€” Identifier Space and Where Keys Are Stored

Identifier Space:

Let:

  • n = 6 bits
  • 26=642^6 = 64 possible identifiers
  • Node IDs range from 0 to 63

Example active node IDs:

N1, N8, N14, N21, N32, N38, N42, N48, N51, N56

These 10 nodes form the Chord ring.

Example Keys:

KeyHashed ID
key110
key254
key324
key438
key530
key658
key715
key81

Multiple keys can map to the same ID:

Since:

ID=H(key)β€Šmodβ€Š2nID = H(key) \bmod 2^n,

different strings may hash to the same identifier.

Example:

H("user123") mod 64 = 30
H("session-token-abc") mod 64 = 30

Both keys β†’ ID 30 β†’ stored at N32 in our ring.

Rule for Storing Keys:

A key is stored at the first node whose ID is β‰₯ keyID when moving clockwise on the ring.

If nothing is β‰₯ keyID, we wrap around to N1.

Let's assign each key:

  • key1 β†’ 10 β†’ stored at N14
  • key2 β†’ 54 β†’ stored at N56
  • key3 β†’ 24 β†’ stored at N32
  • key4 β†’ 38 β†’ stored at N38
  • key5 β†’ 30 β†’ stored at N32
  • key6 β†’ 58 β†’ wrap β†’ N1
  • key7 β†’ 15 β†’ stored at N21
  • key8 β†’ 1 β†’ stored at N1

Final Table:

KeyHashed IDStored at
key110N14
key254N56
key324N32
key438N38
key530N32
key658N1
key715N21
key81N1

PART 2 β€” Constructing the Finger Table (Example: Node N8)

A node with n = 6 bits has 6 finger entries, indexed 0–5.

Finger i points to:

the successor of (N+2i)β€Šmodβ€Š2n(N + 2^i) \bmod 2^n

Let's compute for N8.

Entry i = 0:

Start = 8+20=98 + 2^0 = 9

First node β‰₯ 9 = N14

β†’ FT[0] = N14

Entry i = 1:

Start = 8+21=108 + 2^1 = 10

First node β‰₯ 10 = N14

β†’ FT[1] = N14

Entry i = 2:

Start = 8+22=128 + 2^2 = 12

First node β‰₯ 12 = N14

β†’ FT[2] = N14

Entry i = 3:

Start = 8+23=168 + 2^3 = 16

First node β‰₯ 16 = N21

β†’ FT[3] = N21

Entry i = 4:

Start = 8+24=248 + 2^4 = 24

First node β‰₯ 24 = N32

β†’ FT[4] = N32

Entry i = 5:

Start = 8+25=408 + 2^5 = 40

First node β‰₯ 40 = N42

β†’ FT[5] = N42

Final Finger Table of N8:

EntryStartSuccessorFT[i]
0914N14
11014N14
21214N14
31621N21
42432N32
54042N42

Why Finger Tables Matter:

Finger entries make exponentially increasing jumps:

1 step β†’ 2 β†’ 4 β†’ 8 β†’ 16 β†’ 32 …

🟦 finger[0] = N14

Meaning:

β€œIf you need to move 1 step forward, the next responsible node is N14.”

This is essentially your successor.

🟩 finger[1] = N14

Meaning:

β€œIf you need to move 2 steps forward, the closest node at or after 10 is N14.”

It’s still the same because there are no nodes between 8 and 14.

🟧 finger[2] = N14

Meaning:

β€œIf you need to move 4 steps forward, go to N14.”

Again no nodes between 8 and 14.

🟨 finger[3] = N21

Meaning:

β€œFor jumps of 8 positions, the responsible node is N21.”

This is the first significantly bigger shortcut.

πŸŸ₯ finger[4] = N32

Meaning:

β€œFor jumps of 16 positions, the correct next hop is N32.”

This is a mid-range shortcut.

πŸŸͺ finger[5] = N42

Meaning:

β€œFor jumps of 32 positions, go to N42.”

This is a large jump.

Thus a lookup takes O(log N) messages instead of walking around the whole ring.


PART 3 β€” Lookup Example

Find successor(24) starting from N8:

We know beforehand that successor(24) = N32, but let's see how Chord finds it.

Step 1 β€” Check the simple case:

Is 24 in the interval (8, 14] ?

Values in that interval: 9–14.

24 is not in that range.

So we must use the finger table.

Step 2 β€” closest_preceding_finger(24):

From N8's finger entries:

  • N14 (14)
  • N14 (14)
  • N14 (14)
  • N21 (21) ← best candidate (<24<24, largest)
  • N32 (32)
  • N42 (42)

Largest ID <24< 24 is 21, so:

β†’ closest_preceding_finger = N21

N8 forwards the lookup to N21.

Step 3 β€” N21 Processes the Lookup:

Check if 24 ∈ (21, successor(N21)]

Is 24 in the interval (21, 32] ?

successor(N21) = N32

Interval = 22–32

24 ∈ this range β†’ match!

Thus:

successor(24) = N32

Lookup is finished.

Final Lookup Path

N8  β†’  N21  β†’  N32

Instead of the longer hop-by-hop path:

N8 β†’ N14 β†’ N21 β†’ N32

This is why Chord achieves O(log N) performance.


PART 4 β€” Node Join Example (N26 Joins the Chord Ring)

When a new node enters a Chord DHT, the system must update pointers and migrate keys so that the ring remains consistent. Chord's mechanisms ensure that everything happens efficiently and automatically, even if the joining node only knows one other random node.

PART 4.1 β€” Goal:

A new node N26 joins the system.

Correct position in the ring:

… N21 β†’ N26 β†’ N32 …

But we will analyze two cases:

  1. Simple version: N26 already knows N21
  2. Realistic version: N26 knows only N14 (not its actual predecessor)

The second version demonstrates the full Chord lookup mechanism.


PART 4 β€” SIMPLE VERSION (N26 knows N21)

Step 1 β€” N26 contacts N21:

Chord requires a joining node to know at least one node.

Assume N26 knows N21.

N26 β†’ JOIN β†’ N21

Step 2 β€” N21 determines N26's successor:

N21 runs:

successor(26)

Check if 26 is in the interval:

(21, 32]

Since 26 falls in that range:

βœ” successor(26) = N32

Step 3 β€” Initial pointer setup:

N26 now sets:

  • N26.successor = N32
  • N26.predecessor = N21

Temporary chain:

N21 β†’ N26 β†’ N32

But N21 still thinks its successor is N32.

This will be fixed by stabilization.

Step 4 β€” N32 transfers keys to N26:

N26 is responsible for keys in:

(predecessor(N26), N26] = (21, 26]

Keys in this range:

  • K24 β†’ 24

So N32 transfers K24 to N26.

After migration:

  • N26 stores: K24
  • N32 stores: K30, K38, … (anything β‰₯27)

Step 5 β€” Stabilization begins:

Nodes periodically run stabilize().

N21 runs stabilization and identifies N26 as its true successor.

Step 6 β€” Stabilization fixes pointers:

N21 performs:

x = successor.predecessor
x = N32.predecessor = N26

Since:

26 ∈ (21, 32]

N21 updates:

N21.successor = N26

The ring is updated.


Final State (after stabilization):

Successor pointers:

  • N21 β†’ N26
  • N26 β†’ N32
  • N32 β†’ N38

Predecessor pointers:

  • N26 ← N21
  • N32 ← N26

Keys:

  • N26 stores: K24
  • N32 stores: K30, …

N26 is now fully integrated.


PART 4 β€” REALISTIC VERSION: N26 Knows Only N14

This version demonstrates the full power of Chord's logarithmic lookup.

Setup:

Identifier space: 0–63 (6-bit)

Active nodes:

N1, N8, N14, N21, N32, N38, N42, N48, N51, N56

Correct position of N26:

… N21 β†’ N26 β†’ N32 …

But N26 only knows N14.


PART A β€” Step 1: N26 asks N14 to locate successor(26):

N26 sends:

find_successor(26)

to N14.

N14 now performs a Chord lookup for ID = 26.


PART B β€” Step 2: N14 starts the lookup:

First, N14 checks the simple case:

Is 26 in:

(14, successor(N14)=21] ?

14 < 26 ≀ 21 ?

β†’ NO

So N14 cannot determine the successor directly.

It must use its finger table.

A typical finger table for N14 contains entries like:

FT[0] = N21
FT[1] = N21
FT[*] eventually points to N32, N38, …

Now N14 finds the:

closest_preceding_finger(26)

Candidates < 26:

  • N21 (21)

This is the closest ID to 26 without passing it.

Thus N14 forwards the lookup to:

β†’ N21

PART C β€” Step 3: N21 completes the lookup:

N21 receives the lookup request for ID 26.

N21 checks:

Is 26 in (21, successor(N21)=32] ?

21 < 26 ≀ 32 ?

β†’ YES βœ”

Therefore:

successor(26) = N32

N21 responds to N14, which forwards the answer to N26.


PART D β€” Step 4: N26 joins using N32:

N26 now knows:

  • Its successor: N32
  • Its predecessor: Initially unknown (will be set via stabilization)

N26 sets:

N26.successor = N32

N26 notifies N32 of its presence.


PART E β€” Step 5: Key migration:

N26 is responsible for keys in the range:

(predecessor(N26), N26] = (21, 26]

Since N32 was previously responsible for keys β‰₯ 22 (up to 32), N32 transfers:

  • K24 (ID = 24) β†’ to N26

After migration:

  • N26 stores: K24
  • N32 stores: K30, K38, … (keys with ID β‰₯ 27)

PART F β€” Step 6: Stabilization:

N32's stabilization:

N32 periodically runs stabilize():

  1. N32 asks its successor (N38) for its predecessor.
  2. If that predecessor is between N32 and N38, N32 updates its successor.
  3. N32 notifies its successor (N38) about N32's existence.

But more importantly, N26 also runs stabilization:

N26's stabilization:

  1. N26 asks N32: "Who is your predecessor?"
  2. N32 responds: Initially, N32's predecessor might be N21 (before N26 joined).
  3. N26 checks: Is N21 between N26 and N32? No, because 21 < 26.
  4. However, N26 also notifies N32: "I am N26, and I think you are my successor."
  5. N32 receives this notification and checks: Is N26 between N32's predecessor and N32?
    • If N32's predecessor was N21, then: 21 < 26 < 32 β†’ YES
    • N32 updates: N32.predecessor = N26

N21's stabilization:

  1. N21 asks its successor (initially N32) for its predecessor.
  2. N32 responds: "My predecessor is N26."
  3. N21 checks: Is N26 between N21 and N32?
    • 21 < 26 < 32 β†’ YES
  4. N21 updates: N21.successor = N26
  5. N21 notifies N26: "I am N21, and I think you are my successor."
  6. N26 checks: Is N21 between N26's predecessor and N26?
    • If N26's predecessor is not set yet, or if N21 is closer, N26 updates: N26.predecessor = N21

Final State (Realistic Version):

After stabilization completes:

Successor pointers:

  • N21 β†’ N26
  • N26 β†’ N32
  • N32 β†’ N38

Predecessor pointers:

  • N21 ← N8 (or previous node)
  • N26 ← N21
  • N32 ← N26

Keys:

  • N26 stores: K24
  • N32 stores: K30, K38, …

Finger tables:

All nodes eventually update their finger tables to include shortcuts to N26 where appropriate.


Summary

  • βœ… Chord distributes keys using consistent hashing.
  • βœ… Each key is stored at the first node >= keyID.
  • βœ… Finger tables let nodes jump exponentially around the ring.
  • βœ… Lookups become fast: O(log N).
  • βœ… Even with node churn, Chord remains scalable and fault-tolerant.

Chord is a powerful protocol that enables efficient distributed storage and retrieval in peer-to-peer systems, making it ideal for applications like distributed file systems, content delivery networks, and decentralized applications.