Binary Search Tree Deep Dive
Understanding BST operations through interactive visualizations
BST Foundation
What is a Binary Search Tree?
A Binary Search Tree (BST) is a hierarchical data structure where:
- Each node has at most two children
- Left child < Parent < Right child
- All operations average O(log n) time
- In-order traversal gives sorted sequence
BST Search Operation
How Search Works
Starting from root, compare target with current node:
- If equal: Found!
- If target < current: Go left
- If target > current: Go right
- If null reached: Not found
BST Insert Operation
Building a Tree Step by Step
Watch how values [8,3,10,1,6,14,4,7,13] build our BST:
BST Insert Sequence
8 → (3,10) → (1,6) → (14) → (4,7) → (13)
Each value finds its correct position maintaining BST property
Answer: Between 4 and 6 as right child of 4
BST Delete Operation
Three Deletion Cases
- Leaf node: Simply remove
- One child: Replace with child
- Two children: Replace with inorder successor
BST Traversals
Three Traversal Methods
| Traversal | Order | Use Case |
|---|---|---|
| Inorder | Left → Root → Right | Sorted sequence |
| Preorder | Root → Left → Right | Tree copy |
| Postorder | Left → Right → Root | Tree deletion |
DFS on Graphs
DFS on Complex Graph Structure
DFS works on graphs similar to tree traversal:
- Stack-based execution shows deep exploration
- Creates a spanning tree structure
- Back edges connect to already visited vertices
- Similar to pre-order traversal pattern
Answer: Pre-order traversal (visit node before children)
Interactive BST & DFS Practice
Starting from the BST constructed by inserting the keys [8, 3, 10, 1, 6, 14, 4, 7, 13] in that order:
- Delete the keys in the order 3, 14, 8.
- For each deletion, identify whether it is a leaf, one-child, or two-child case.
- Draw the resulting tree after each deletion and give the final in-order traversal.
Idea: Apply the three BST deletion cases and carefully update the tree after each operation.
-
Step 1 – Original tree.
This satisfies the BST property: all left keys < parent < all right keys.
-
8
-
3
- 1
-
6
- 4
- 7
-
10
-
14
- 13
-
14
-
3
-
8
-
Step 2 – Delete 3 (two-child case).
- Node 3 has two children (1 and 6) ⟶ use its in-order successor.
- In-order successor = minimum of right subtree of 3 ⟶ node 4.
- Copy key 4 into node 3, then delete the leaf node 4 under node 6.
-
8
-
4
- 1
-
6
- 7
-
10
- 13
-
4
-
Step 3 – Delete 14 (one-child case).
- Node 14 has a single left child 13.
- Replace 14 by its child 13.
-
8
-
4
- 1
-
6
- 7
-
10
- 13
-
4
-
Step 4 – Delete 8 (two-child case at the root).
- Root 8 has two children (4 and 10).
- In-order successor is the minimum in the right subtree ⟶ node 10.
- Copy 10 into the root, then delete the original node 10 (one-child node with child 13).
-
10
-
4
- 1
-
6
- 7
- 13
-
4
-
Step 5 – Final in-order traversal.
In-order (Left → Root → Right) on the final tree gives:
[1, 4, 6, 7, 10, 13]
Using the original BST from Question 1 (before any deletions), trace the search paths for keys 7 and 2.
- List the nodes visited for each search in order.
- Count the number of key comparisons.
- Explain why unsuccessful searches in a BST still take O(h) time (h = tree height).
Idea: Follow the rule “go left if the key is smaller, right if it is larger”.
-
Step 1 – Search for 7 (successful).
Path for key 7:
- Compare with 8 ⟶ 7 < 8 ⟶ go left to 3.
- Compare with 3 ⟶ 7 > 3 ⟶ go right to 6.
- Compare with 6 ⟶ 7 > 6 ⟶ go right to 7.
- Compare with 7 ⟶ match found.
Visited nodes: 8 → 3 → 6 → 7 ⇒ 4 key comparisons.
-
Step 2 – Search for 2 (unsuccessful).
Path for key 2:
- Compare with 8 ⟶ 2 < 8 ⟶ go left to 3.
- Compare with 3 ⟶ 2 < 3 ⟶ go left to 1.
- Compare with 1 ⟶ 2 > 1 ⟶ go right to
null. - Reaching
nullmeans 2 is not in the tree.
We did 3 key comparisons (with 8, 3, 1) plus the final null check.
-
Step 3 – Relating cost to height.
- Each comparison takes us one level down the tree.
- The longest possible path from the root has length equal to the tree height h.
- Therefore, both successful and unsuccessful BST searches run in O(h) time.
- If the tree is balanced, h = O(log n); if it is skewed, h = O(n).
Consider inserting the keys [1, 2, 3, 4, 5, 6, 7] into:
- (a) An initially empty BST by inserting them in that order.
- (b) A perfectly balanced BST built from the same keys (for example, by inserting the median first, then recursively).
-
Step 1 – Case (a): Sorted insertion → skewed tree.
Insert 1, 2, 3, 4, 5, 6, 7 into an empty BST in that order:
-
1
-
2
-
3
-
4
-
5
-
6
- 7
-
6
-
5
-
4
-
3
-
2
- Each new key is larger than all previous ones ⟶ always inserted as the right child of the bottom node.
- The tree degenerates into a linked list of length 7.
- Height hskewed = 6 (root at level 0, leaf 7 at level 6).
-
1
-
Step 2 – Case (b): Perfectly balanced BST.
Take 4 as root (the median), then 2 and 6 as its children, etc. One possible perfectly balanced BST is:
-
4
-
2
- 1
- 3
-
6
- 5
- 7
-
2
- All levels except possibly the last are completely full.
- Height hbalanced = 2 (levels 0, 1, 2).
-
4
-
Step 3 – Impact on search complexity.
- Worst-case search in a BST is proportional to the height h.
- Skewed tree: h = 6 ⇒ O(7) = O(n) steps in the worst case.
- Balanced tree: h = 2 ⇒ O(log₂ 7) steps ≈ O(log n).
- This is why self-balancing trees (AVL, Red-Black) are so valuable: they prevent the O(n) worst-case behavior.
Fast Data Retrieval with O(1) Average Time
An introduction to hash tables and their importance in computer science
Why Hash Tables Matter
Hash tables allow very fast retrieval of data regardless of dataset size.
- Problem solved: Linear search takes O(n) time
- Solution: Array access is O(1)
- Access time is independent of array size
Hash Function Fundamentals
Calculating the Index
For numeric keys:
For alphanumeric keys:
Hash Function Example: Cartoon Characters
Array size = 11
| Character | Calculation | Index |
|---|---|---|
| FRED | F(70)+r(114)+e(101)+d(100) = 385 mod 11 | 0 |
| TWEETY | T(84)+w(119)+e(101)+e(101)+t(116)+y(121) = 642 mod 11 | 4 |
| BUGS | B(66)+u(117)+g(103)+s(115) = 401 mod 11 | 5 |
| SCOOBY | S(83)+c(99)+o(111)+o(111)+b(98)+y(121) = 623 mod 11 | 7 |
Collisions and Resolution
Collision: When two different keys hash to the same index
Linear Probing
If calculated position is occupied, find the next available slot:
- Check index 6: occupied (Daffy)
- Check index 7: occupied (Scooby)
- Check index 8: empty! ✓ Place Porky here
Alternative Collision Strategies
- Quadratic Probing: Use quadratic distances (1², 2², 3²) to reduce clustering
- Plus-3 Rehash: Check every 3rd position
- Double Hashing: Apply second hash function to determine step size
- Chaining: Each position has linked list for colliding items
Load Factor and Performance
Load Factor = Number of Items / Array Size
Best practice: Keep load factor ≤ 0.7 (70% occupancy)
Dynamic Resizing
When load factor exceeds threshold:
- Create new array (typically 2× size)
- Rehash all items into new array
- Replace old array with new array
Performance Analysis
| Scenario | Time Complexity | Explanation |
|---|---|---|
| Best Case | O(1) | No collisions, direct access |
| Average Case | O(1) | Good hash function, load factor < 0.7 |
| Worst Case | O(n) | Many collisions or poor hash function |
Hash Function Design
Good Hash Function Characteristics:
- Deterministic (same input → same output)
- Fast to compute
- Distributes keys uniformly
- Uses all available address space
- Minimizes collisions
Perfect Hashing
Perfect hash function: Produces unique index for each key
- Requirements: All keys must be known in advance
- Use cases: Compiler keywords, fixed configuration mappings
Real-World Applications
Examples in programming languages:
- Java: HashMap, HashSet
- Python: dict, set
- JavaScript: Object, Map, Set
- C++: unordered_map, unordered_set
Interactive Hash Table Practice
Using a hash table of size 11 and linear probing, insert the keys in this order: FRED, TWEETY, BUGS, SCOOBY, DAFFY, PORKY.
- Show how the table evolves as you insert the keys.
- Count the collisions per key.
- Compute the final load factor.
-
Step 1 – Start with an empty table (size = 11).
All slots are empty; load factor = 0/11 = 0.0—1—2—3—4—5—6—7—8—9—10—
-
Step 2 – Insert FRED, TWEETY, BUGS, SCOOBY, DAFFY.
- FRED → 0: slot 0 empty ⟶ place FRED at 0 (0 collisions).
- TWEETY → 4: slot 4 empty ⟶ place TWEETY at 4 (0 collisions).
- BUGS → 5: slot 5 empty ⟶ place BUGS at 5 (0 collisions).
- SCOOBY → 7: slot 7 empty ⟶ place SCOOBY at 7 (0 collisions).
- DAFFY → 6: slot 6 empty ⟶ place DAFFY at 6 (0 collisions).
So far, 5 items, load factor = 5/11 ≈ 0.45.0FRED1—2—3—4TWEETY5BUGS6DAFFY7SCOOBY8—9—10— -
Step 3 – Insert PORKY with linear probing.
- PORKY → 5: slot 5 contains BUGS ⟶ collision #1.
- Probe index 6: contains DAFFY ⟶ collision #2.
- Probe index 7: contains SCOOBY ⟶ collision #3.
- Probe index 8: empty ⟶ place PORKY at 8.
0FRED1—2—3—4TWEETY5BUGS6DAFFY7SCOOBY8PORKY9—10—- PORKY incurred 3 collisions before finding slot 8.
- Final number of items = 6 ⇒ load factor = 6/11 ≈ 0.545.
Using the final hash table from Question 1, trace the probe sequence for:
- A successful search for SCOOBY.
- An unsuccessful search for some key X that hashes initially to index 5 (same cluster).
-
Step 1 – Successful search for SCOOBY.
From Question 1, the relevant slots are:
5BUGS6DAFFY7SCOOBY8PORKYhash(SCOOBY) = 7⟶ probe slot 7 first.- Slot 7 contains SCOOBY ⟶ found after 1 probe.
-
Step 2 – Unsuccessful search for X with hash(X) = 5.
Linear probing sequence starting at index 5:
- Probe 5: BUGS (≠ X) ⟶ continue.
- Probe 6: DAFFY (≠ X) ⟶ continue.
- Probe 7: SCOOBY (≠ X) ⟶ continue.
- Probe 8: PORKY (≠ X) ⟶ continue.
- Probe 9: empty ⟶ stop; X not in table.
The unsuccessful search touches 5 slots (indices 5, 6, 7, 8, 9).
-
Step 3 – Takeaway about cost.
- Successful search time is proportional to the length of the probe sequence up to the matching key.
- Unsuccessful search must scan through an entire cluster until it hits an empty slot.
- As load factor grows and clusters form, the average probe length (and thus cost) increases.
Suppose you implement a hash table with:
- Initial capacity = 8.
- Load factor threshold = 0.7 (resize when items / capacity > 0.7).
- Resizing strategy: double capacity (8 → 16 → 32 → ...), rehash all items.
-
Step 1 – Track load factor as we insert.
- Capacity = 8, threshold = 0.7 ⟶ max allowed items before resize ≈ 0.7 × 8 = 5.6.
- So after inserting the 6th item, load factor becomes 6/8 = 0.75 > 0.7 ⟶ trigger resize.
Events so far:
- Insert items 1–6: 6 hash computations (one per insertion).
- After item 6, resize to capacity 16 and rehash the 6 existing items (6 more computations).
-
Step 2 – Continue inserting up to 12 items.
- Now capacity = 16.
- Threshold = 0.7 ⇒ max items before next resize ≈ 0.7 × 16 = 11.2.
- We currently have 6 items; we will insert items 7–12 ⟶ 6 more insertions.
During these insertions:
- For each of the 6 new items, we compute its hash once ⟶ 6 more hash computations.
- After inserting the 12th item, load factor = 12/16 = 0.75 > 0.7, so another resize would be triggered next, but we stop at 12 items per the question.
-
Step 3 – Total hash computations and amortized cost.
- 6 hashes for initial inserts (items 1–6).
- 6 hashes to rehash those 6 items when growing from 8 → 16.
- 6 hashes for inserts 7–12.
Total = 6 + 6 + 6 = 18 hash executions for 12 insert operations.
- Average (amortized) hash computations per insert = 18 / 12 = 1.5.
- This shows that even with occasional expensive resizes, the amortized cost per insertion remains O(1).
Red-Black Trees: Balanced BST with O(log n) Guarantee
CSE310 Data Structures and Algorithms: Advanced Tree Balancing
The Problem: Unbalanced BSTs
| Aspect | Unbalanced BST (Worst Case) | Red-Black Tree (Guaranteed) |
|---|---|---|
| Height | O(n) | O(log n) |
| Insert | O(n) | O(log n) with ≤2 rotations |
| Delete | O(n) | O(log n) with ≤3 rotations |
Red-Black Tree Definition
A Red-Black Tree is a Binary Search Tree with an additional color attribute (RED or BLACK) on each node that enforces specific structural properties.
RB Tree Properties
Property 1: Color Attribute
Each node is either RED or BLACK
Property 2: Root is Black
The root node is always BLACK
Property 3: Null Leaves are Black
All NIL (null) leaves are BLACK
Property 4: Red Node Constraint
If a node is RED, then both its children must be BLACK (no two consecutive red nodes)
Property 5: Black Height Property
All paths from any node to its descendant leaves contain the same number of BLACK nodes
Why These Properties Ensure Balance
- Property 4 prevents long chains - no chain of 3 nodes possible without violating constraints
- Property 5 ensures the tree doesn't become skewed
- Result: Height ≤ 2·lg(n+1)
Since all paths have same # of black nodes, and red nodes can't be consecutive, the tree remains approximately balanced
Rotations: The Balancing Mechanism
Rotation is a local restructuring operation that maintains BST property while changing tree structure.
Left Rotation on x:
- Right child y moves up
- y's left subtree becomes x's right subtree
- Maintains BST property: x < y
Right Rotation on x:
- Left child y moves up
- y's right subtree becomes x's left subtree
- Mirror image of left rotation
Insertion in Red-Black Trees
Strategy: Always insert as RED because this is less likely to violate Property 5 (black height)
The Problem:
New red node might have red parent (violates Property 4)
The Solution:
Walk up tree fixing violations - three main cases based on uncle's color and node position
Case 1: Uncle is RED (Recoloring Case)
Parent P and uncle U are both RED, grandparent G is BLACK, and new node N is RED under P.
Before fix-up
-
G
-
P
- N
- U
-
P
After recolor
-
G
-
P
- N
- U
-
P
- Recolor parent P and uncle U from RED → BLACK.
- Recolor grandparent G from BLACK → RED.
- Now treat G as the newly inserted node and move the “fix-up” pointer up to G.
- If G is the root, recolor it BLACK at the end to satisfy Property 2.
Case 2: Uncle BLACK, Triangle Configuration
Uncle U is BLACK, and node N forms a “triangle” with parent P and grandparent G. Example: N is the right child of a left child (left-right) or left child of a right child (right-left).
Before fix-up (left-right example)
-
G
-
P
- N
- U
-
P
N is right child of P, P is left child of G (triangle)
After first rotation (convert to Case 3)
-
G
-
N
- P
- U
-
N
Rotate left around P ⟶ N becomes child of G
- Because N and P form a triangle with G, perform a rotation at P to “straighten” the path.
- For the left-right case:
- Rotate left around P so that N moves up and P becomes N’s left child.
- After this rotation, the configuration becomes a “line” (N is now directly under G).
- Relabel N as the new “inserted” node and fall through to Case 3 on the new configuration.
Case 3: Uncle BLACK, Line Configuration
Uncle U is BLACK, and node N, parent P, and grandparent G form a straight line (e.g., left-left or right-right).
Before fix-up (left-left example)
-
G
-
P
- N
- U
-
P
After rotation & recolor
-
P
- N
-
G
- U
- Perform a rotation at G to bring P up (right rotation in left-left case, left rotation in right-right case).
- After the rotation:
- Color P BLACK.
- Color G RED.
- This fixes the “red-red” violation and preserves the black-height property.
- The path from the root down remains short, maintaining O(log n) height.
Insertion Complexity:
- Each iteration either stops (Cases 2–3) or moves z up 2 levels (Case 1)
- Maximum iterations: O(log n) because tree height is O(log n)
- Total time: O(log n) iterations × O(1) per iteration = O(log n)
Deletion in Red-Black Trees
Why it's complex: Deleting a BLACK node violates black-height property
The Core Problem:
When a BLACK node is deleted, one path loses a black node. This violates Property 5 (black-height constraint)
The Solution:
Mark replacement node as "double-black" - conceptually carrying an extra black. Propagate this mark up until it can be eliminated through recoloring and rotations.
Case 1: Sibling is RED
Node x is double-black, its sibling S is RED, and parent P is BLACK. We rotate to transform into a case where the sibling is BLACK.
Before fix-up (x is left child)
-
P
- x
- S
After rotation & recolor
-
S
-
P
- x
-
P
- Recolor S to BLACK and P to RED.
- If x is a left child, perform a left rotation on P (mirror: right rotation if x is a right child).
- After rotation, x still has “double-black”, but its new sibling is BLACK.
- Now you are in one of Cases 2–4 with a BLACK sibling and can continue the fix-up.
Case 2: Sibling BLACK, Both Sibling's Children BLACK
Sibling S is BLACK and both of S's children are BLACK (or NIL). We push the extra black up.
Before
-
P
- x
-
S
- A
- B
After recolor (x still conceptually double-black)
-
P
- x
-
S
- A
- B
- Recolor sibling S from BLACK → RED.
- “Remove” one black from both x and S by moving the double-black problem up to the parent P.
- If P was RED, recoloring it to BLACK ends the fix-up.
- If P was BLACK, it now becomes double-black and you continue the process at P.
Case 3: Sibling BLACK, Near Child RED, Far Child BLACK
Sibling S is BLACK, the child of S closer to x (the “near” child) is RED, and the “far” child is BLACK. We rotate and recolor to transform into Case 4.
Before (x is left child, near child = SL)
-
P
- x
-
S
- SL
- SR
After rotation around S
-
P
- x
-
SL
- S
- SR
- Recolor SL to BLACK and S to RED.
- Rotate right around S (if x is left child; mirror with left rotation if x is right child).
- Now the new sibling of x has a RED far child and BLACK near child ⟶ configuration of Case 4.
- Proceed to Case 4 using this new configuration.
Case 4: Sibling BLACK, Far Child RED
Sibling S is BLACK and its far child (farther from x) is RED. This is the final case that lets us remove the double-black and restore all properties with a single rotation.
Before (x is left child, far child = SR)
-
P
- x
-
S
- SL
- SR
After rotation & recolor
-
S
-
P
- x
- SL
- SR
-
P
- Set S’s color to P’s color (copy parent’s color down into S).
- Recolor P and SR to BLACK.
- Perform a left rotation at P (if x is left child; mirror with right rotation if x is right child).
- This removes the double-black on x and restores all RB properties; the fix-up terminates.
Performance Summary
| Operation | Time Complexity | Details |
|---|---|---|
| Search | O(log n) | Height is O(log n), standard BST search |
| Insertion | O(log n) | BST insert O(log n) + fixup O(log n), ≤2 rotations |
| Deletion | O(log n) | BST delete O(log n) + fixup O(log n), ≤3 rotations |
| Space | O(n) | One color bit per node + parent pointer |
Real-World Applications
Java Collections:
- java.util.TreeMap - sorted key-value storage
- java.util.TreeSet - sorted set with O(log n) operations
C++ STL:
- std::map, std::set, std::multimap, std::multiset
System Software:
- CFS Scheduler: Linux uses RB trees for process scheduling
- Database Indexing: B+ trees and RB tree variants for in-memory indexing
RB Trees vs AVL Trees
| Aspect | Red-Black Tree | AVL Tree |
|---|---|---|
| Balance | h ≤ 2·lg(n+1) | h ≤ 1.44·lg(n) |
| Insert Rotations | ≤2 | O(log n) |
| Delete Rotations | ≤3 | O(log n) |
| Best For | Frequent updates | Search-heavy workloads |
References
- CLRS: "Introduction to Algorithms" Chapter 13
- Visualizations: VisuAlgo and Programiz have excellent step-by-step animations
- Practice: Implement insertion and deletion fixup to master these concepts
Interactive Red-Black Tree Practice
Starting from an empty Red-Black tree, insert the keys [41, 38, 31, 12, 19, 8] using the standard RB insertion algorithm (new node is RED, then fix-up).
- For each insertion, state which of the three insertion cases (or their mirror) applies.
- Identify when rotations occur.
- Show the final tree with correct colors.
-
Step 1 – Insert 41.
- Tree was empty; 41 becomes the root.
- Root must be BLACK (Property 2).
- 41
-
Step 2 – Insert 38.
- 38 < 41 ⟶ insert as left child of 41, color RED.
- Parent (41) is BLACK, so no properties are violated.
-
41
- 38
-
Step 3 – Insert 31 (first rotation).
- 31 < 41, 31 < 38 ⟶ left child of 38, colored RED.
- Violation: 31 (RED) with parent 38 (RED) ⟶ Property 4 break.
- Uncle of 31 is NIL (BLACK) and the configuration is a straight line (31 is left child of left child).
- This is Case 3 (line configuration): right-rotate around 41 and recolor.
After fix-up:
- 38 becomes the new root (BLACK).
- 31 and 41 become its RED children.
-
38
- 31
- 41
-
Step 4 – Insert 12 (recoloring case).
- 12 < 38, 12 < 31 ⟶ left child of 31, color RED.
- Parent 31 is RED and uncle 41 is RED ⟶ Case 1 (uncle RED).
- Recolor parent and uncle to BLACK, grandparent (38) to RED.
- Because 38 is the root, we recolor it back to BLACK at the end.
Resulting tree:
-
38
-
31
- 12
- 41
-
31
-
Step 5 – Insert 19 (rotation + recolor).
- 19 < 38, 19 < 31, 19 > 12 ⟶ right child of 12 (RED).
- Parent 12 is RED, uncle (right child of 31) is NIL (BLACK).
- Configuration is a triangle (left-right) w.r.t. grandparent 31 ⟶ Case 2 then Case 3.
- First, rotate left around 12, then rotate right around 31 with recoloring.
After fix-up, the left subtree of 38 becomes:
- 19 (BLACK) as left child of 38.
- 12 (RED) as left child of 19.
- 31 (RED) as right child of 19.
-
Step 6 – Insert 8 (final recoloring).
- 8 < 38, 8 < 19, 8 < 12 ⟶ left child of 12, color RED.
- Parent 12 is RED and uncle 31 is RED ⟶ Case 1 again.
- Recolor 12 and 31 to BLACK, recolor 19 to RED.
- Parent of 19 is 38 (BLACK), so the loop terminates and 38 stays BLACK.
Final Red-Black tree:
-
38
-
19
-
12
- 8
- 31
-
12
- 41
-
19
All RB properties hold: root is BLACK, no two REDs in a row, and every path from a node to NIL has the same number of BLACK nodes.
Using the final Red-Black tree from Question 1, compute the black height
bh(x) for each internal node x, where bh(x) is the number of BLACK
nodes on any simple path from x down to a NIL leaf (excluding x itself).
- Verify that all root-to-leaf paths have the same black height.
- Explain how this implies the tree is approximately balanced.
-
Step 1 – Redraw the final tree with colors.
We treat every missing child pointer as a NIL leaf (BLACK).
-
38
-
19
-
12
- 8
- 31
-
12
- 41
-
19
-
38
-
Step 2 – Compute black height bottom-up.
- All NIL children are BLACK with bh(NIL) = 0 by definition.
- Node 8 is RED, its children are NIL ⟶ the number of BLACK nodes below it is 0, so bh(8) = 0.
- Node 12 is BLACK with left child 8 and right child NIL: both subpaths from 12 to leaves contain exactly one BLACK node (NIL) after 12, so bh(12) = 1.
- Node 31 is BLACK with two NIL children ⟶ bh(31) = 1.
- Node 41 is BLACK with NIL children ⟶ bh(41) = 1.
- Node 19 is RED with children 12 and 31 ⟶ both subtrees below 19 have black height 1, so bh(19) = 1.
-
Step 3 – Black height of the root and balance implication.
- From the root 38 (BLACK), any path to a NIL leaf passes through some number of BLACK nodes below it (here 1) plus the NIL leaf.
- Thus bh(38) = 2 in the conventional RB definition.
- Every root-to-leaf path has the same number of BLACK nodes (2), so Property 5 holds.
- Because reds cannot be consecutive, the longest path from the root can have at most twice as many nodes as the shortest path.
- Therefore the tree height h is at most 2 × bh(root), which is O(log n), giving logarithmic search, insert, and delete time.
Let a Red-Black tree have n internal nodes and height h.
Show that h ≤ 2·log₂(n + 1).
- Use the fact that the black height of the root is at least h/2.
- Relate black height to a lower bound on the number of nodes.
-
Step 1 – At least half the nodes on any root-to-leaf path are BLACK.
- No two RED nodes can be adjacent (Property 4).
- Along any path, RED nodes must be separated by at least one BLACK node.
- Therefore, the number of RED nodes on a path is at most the number of BLACK nodes.
- So the length of the path h is at most 2 × (number of BLACK nodes on that path).
Let b be the black height of the root (number of BLACK nodes on any root→leaf path, excluding the root in the formal definition); then h ≤ 2b.
-
Step 2 – Minimum number of nodes for a given black height.
- Consider a RB tree where every node on every path is BLACK (no RED nodes at all).
- Such a tree with black height b is a perfectly balanced BLACK-only tree.
- This tree has at least 2b − 1 internal nodes (full binary tree of height b − 1 with only BLACK nodes).
- Any RB tree with black height b has at least that many nodes:
n ≥ 2b − 1.
-
Step 3 – Combine inequalities to bound h in terms of n.
- From Step 2:
n ≥ 2^b − 1 ⇒ n + 1 ≥ 2^b. - Taking log₂ on both sides:
log₂(n + 1) ≥ b. - From Step 1:
h ≤ 2b. - Therefore:
h ≤ 2 · b ≤ 2 · log₂(n + 1).
This proves that the height of a Red-Black tree is always O(log n), which is the key to its worst-case guarantees.
- From Step 2: