Cracking the Coding Interview (4th Ed.) — Under the Hood¶
Source: Cracking the Coding Interview, 4th Edition — Gayle Laakmann McDowell, CareerCup LLC (310 pp.)
Focus: Internal memory mechanics, algorithmic invariants, and computational complexity models behind the 150 interview problems — not the interview prep advice, but the deep data structure and algorithm internals the problems are testing.
1. Bit Manipulation: Integer Memory Layout and Arithmetic¶
1.1 Integer Representation in Memory¶
Every integer in a 32-bit system occupies exactly 4 bytes = 32 bits in two's complement representation. Bit position k has value 2^k (LSB = bit 0).
block-beta
columns 1
block:twos["32-bit Two's Complement Layout"]:1
B31["bit 31 (sign)\n-2^31 = -2,147,483,648"]
B30["bit 30\n2^30"]
DOTS["..."]
B1["bit 1\n2^1 = 2"]
B0["bit 0\n2^0 = 1"]
end
Key operations that CTCI problems exploit:
| Operation | C/Java | Effect |
|---|---|---|
| Test bit k | (n >> k) & 1 | Extract single bit |
| Set bit k | n \| (1 << k) | Force bit to 1 |
| Clear bit k | n & ~(1 << k) | Force bit to 0 |
| Toggle bit k | n ^ (1 << k) | Flip bit |
| Count set bits | n & (n-1) removes lowest set bit | O(popcount) iterations |
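A minimal runnable sketch of the table's operations (class and method names are illustrative, not from the book):

```java
public class BitOps {
    static int testBit(int n, int k)   { return (n >> k) & 1; }
    static int setBit(int n, int k)    { return n | (1 << k); }
    static int clearBit(int n, int k)  { return n & ~(1 << k); }
    static int toggleBit(int n, int k) { return n ^ (1 << k); }

    // Kernighan's trick: each n & (n-1) clears the lowest set bit,
    // so the loop iterates once per set bit (O(popcount)).
    static int countSetBits(int n) {
        int count = 0;
        while (n != 0) { n &= (n - 1); count++; }
        return count;
    }
}
```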
1.2 Bit Vector: a Lowercase-Letter Set in One Integer¶
The classic "unique characters" problem (CTCI 1.1) exploits that 26 lowercase letters fit in 26 bits of one int:
flowchart LR
A["char c = 'e'\n(ASCII 101)"] --> B["val = c - 'a' = 4"]
B --> C["mask = 1 << 4 = 0b...010000"]
C --> D["checker & mask > 0 ?\n→ duplicate found"]
D -->|NO| E["checker |= mask\n(mark as seen)"]
Memory saving: boolean[256] = 256 bytes; int checker = 4 bytes. 64× compression.
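The flow above as code, a sketch of the lowercase-only variant of CTCI 1.1 (class name is illustrative):

```java
public class UniqueChars {
    // Assumes input is lowercase a-z only; one int serves as a 26-bit set.
    static boolean allUnique(String s) {
        int checker = 0;
        for (int i = 0; i < s.length(); i++) {
            int mask = 1 << (s.charAt(i) - 'a');
            if ((checker & mask) != 0) return false;  // bit already set: duplicate
            checker |= mask;                          // mark letter as seen
        }
        return true;
    }
}
```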
1.3 XOR Properties for Interview Problems¶
XOR (^) is commutative, associative, and x ^ x = 0, x ^ 0 = x. These allow:
- Finding the unique element in an array where all others appear twice: result = XOR of all elements
- Swap without temp: a ^= b; b ^= a; a ^= b
sequenceDiagram
participant A as a = 5 (0101)
participant B as b = 3 (0011)
A->>A: a ^= b → a = 0110 (6)
B->>B: b ^= a → b = 0101 (5) [original a]
A->>A: a ^= b → a = 0011 (3) [original b]
Note over A,B: No temp variable, O(1) space
Note over A,B: DANGER: fails when a and b point to same location
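Both XOR tricks above, sketched together (names are illustrative); the swap guards against the aliasing danger noted in the diagram:

```java
public class XorTricks {
    // Every duplicated value cancels (x ^ x = 0), leaving the unique element.
    static int findUnique(int[] a) {
        int result = 0;
        for (int x : a) result ^= x;
        return result;
    }

    // In-place swap; returns early when both indices refer to the same slot,
    // since a[i] ^= a[i] would zero the value.
    static void swap(int[] a, int i, int j) {
        if (i == j) return;
        a[i] ^= a[j]; a[j] ^= a[i]; a[i] ^= a[j];
    }
}
```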
2. Hash Tables: Collision Chain Memory Model¶
2.1 Java HashMap Internal Buckets¶
block-beta
columns 1
block:HM["HashMap<K,V> internals"]:1
A["Entry[] table (array of bucket heads)"]
B["Entry{hash, key, value, next} — linked list nodes"]
C["loadFactor = 0.75 (default)"]
D["threshold = table.length × loadFactor"]
end
Put operation mechanics:
int h = key.hashCode();
int hash = h ^ (h >>> 16); // spread high bits into the low bits
int index = hash & (table.length - 1); // modulo via bitmask (power-of-2 tables)
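A runnable sketch of that bucket-index computation (class and method names are illustrative, not HashMap's internals verbatim):

```java
public class BucketIndex {
    // Mix the high bits into the low bits, so small power-of-2 tables
    // still see entropy from the upper half of hashCode().
    static int spread(int h) { return h ^ (h >>> 16); }

    // Valid only for power-of-2 table lengths:
    // hash & (n - 1) == hash mod n when n = 2^k.
    static int indexFor(Object key, int tableLength) {
        return spread(key.hashCode()) & (tableLength - 1);
    }
}
```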
sequenceDiagram
participant K as key "hello"
participant H as hash = 99162322
participant T as table[index]
participant E as Entry chain
K->>H: key.hashCode() = 99162322
H->>H: hash ^= hash >>> 16 → spread bits
H->>T: index = hash & (n-1)
T->>E: scan chain: key.equals(existing)?
alt match found
E-->>T: update value
else no match
E-->>T: prepend new Entry at head
end
2.2 Amortized Resize Cost¶
When size > threshold, HashMap doubles table.length and rehashes all entries:
flowchart TD
A["put(k,v): size > threshold?"] -->|YES| B["newTable = new Entry[2×n]"]
B --> C["for each old entry:\nhash & (2n-1) → new bucket"]
C --> D["Old table eligible for GC"]
D --> E["Put proceeds in new table"]
A -->|NO| F["Direct bucket insert"]
Amortized O(1) per insert because: total work across N inserts = O(N + N/2 + N/4 + ...) = O(2N) = O(N).
3. Linked List: Pointer Mechanics and Two-Pointer Patterns¶
3.1 Node Memory Layout¶
class Node {
int data; // 4 bytes
Node next; // 8 bytes (64-bit reference)
// JVM object header: 16 bytes
// Total: 28 bytes → padded to 32 bytes
}
A linked list of N nodes uses 32N bytes on 64-bit JVM — vs int[N] = 24+4N bytes. Arrays are 7-8× more memory-efficient for integers.
3.2 Floyd's Cycle Detection — Two Pointers¶
flowchart TD
A["slow = head\nfast = head"] --> B["Loop:"]
B --> C["slow = slow.next\nfast = fast.next.next"]
C --> D["slow == fast?\n(inside cycle)"]
D -->|YES| E["Reset slow = head\nkeep fast at meeting point"]
D -->|NO| B
E --> F["Loop:\nslow = slow.next\nfast = fast.next"]
F --> G["slow == fast?\n→ cycle start found"]
Mathematical invariant: if the cycle has length c and the head-to-cycle-entry distance is d, then when slow enters the cycle (after d steps) fast has taken 2d steps and sits d mod c nodes ahead of the entry. Fast gains one node per step, so the pointers meet after another (c - d mod c) mod c steps, at a point that is d mod c nodes before the entry. From that meeting point, d more steps land on the entry, and d steps from the head also land on the entry, so restarting one pointer from the head and stepping both in lockstep converges at the cycle start.
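The two phases from the flowchart, sketched on a minimal singly linked list (class names are illustrative):

```java
public class CycleDetect {
    static class Node { int data; Node next; Node(int d) { data = d; } }

    static Node findCycleStart(Node head) {
        Node slow = head, fast = head;
        while (fast != null && fast.next != null) {   // phase 1: meet inside cycle
            slow = slow.next;
            fast = fast.next.next;
            if (slow == fast) {                       // phase 2: reset slow to head
                slow = head;
                while (slow != fast) { slow = slow.next; fast = fast.next; }
                return slow;                          // both converge at cycle entry
            }
        }
        return null;                                  // fast hit null: no cycle
    }
}
```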
3.3 Find K-th from Last — Space-Optimized Two Pointers¶
sequenceDiagram
participant P1 as pointer1
participant P2 as pointer2
Note over P1,P2: Both start at head
loop k times
P1->>P1: advance p1 one step
end
Note over P1,P2: Gap = k nodes
loop until p1 == null
P1->>P1: advance p1
P2->>P2: advance p2
end
Note over P2: p2 is at k-th from last
O(N) time, O(1) space — versus storing all nodes in an array which is O(N) space.
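The gap technique above as code, a sketch with illustrative names (here k = 1 means the last node):

```java
public class KthFromLast {
    static class Node { int data; Node next; Node(int d) { data = d; } }

    static Node kthToLast(Node head, int k) {
        Node p1 = head, p2 = head;
        for (int i = 0; i < k; i++) {   // open a gap of k nodes
            if (p1 == null) return null; // list shorter than k
            p1 = p1.next;
        }
        while (p1 != null) {             // slide the gap to the end
            p1 = p1.next;
            p2 = p2.next;
        }
        return p2;                       // p2 is k-th from last
    }
}
```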
4. Stacks and Queues: Memory and Cache Behavior¶
4.1 Stack via Linked List vs Array¶
block-beta
columns 2
block:LL["Linked-List Stack"]:1
A["Each push: heap-allocate Node (32B)"]
B["Unlimited size, no resize"]
C["Cache-UNFRIENDLY: nodes scattered"]
D["GC pressure: O(N) Node objects"]
end
block:Arr["Array Stack (ArrayDeque)"]:1
E["Pre-allocated circular array (head/tail indices)"]
F["Resize doubles array: O(N) amortized"]
G["Cache-FRIENDLY: contiguous memory"]
H["One GC root for entire stack"]
end
4.2 Queue via Two Stacks — Amortized Analysis¶
sequenceDiagram
participant In as Stack IN
participant Out as Stack OUT
Note over In,Out: enqueue(x): always push to IN
loop many enqueues
In->>In: push(x) — O(1) each
end
Note over In,Out: dequeue(): if OUT empty, transfer all from IN
In->>Out: pop each from IN, push to OUT (O(N) one-time)
Out->>Out: pop() — O(1) thereafter
Note over Out: Each element transfers at most ONCE → O(1) amortized dequeue
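The two-stack queue (CTCI 3.5) sketched with ArrayDeque as the stack; the class name is illustrative:

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class MyQueue<T> {
    private final Deque<T> in = new ArrayDeque<>();
    private final Deque<T> out = new ArrayDeque<>();

    public void enqueue(T item) { in.push(item); }        // always O(1)

    public T dequeue() {
        if (out.isEmpty())                                // one-time transfer:
            while (!in.isEmpty()) out.push(in.pop());     // reverses IN into OUT
        return out.pop();                                 // O(1) amortized
    }
}
```

Each element is pushed and popped at most twice in total, which is where the O(1) amortized bound comes from.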
5. Trees: DFS Recursion Stack, BFS Queue, and Memory¶
5.1 BST Memory per Node in Interview Code¶
class TreeNode {
int val;
TreeNode left;
TreeNode right;
// JVM: 16B header + 4B int + 2×8B refs + 4B pad = 40 bytes per node
}
// N-node BST: 40N bytes (vs 24 + 4N bytes for a sorted int[] of the same keys)
5.2 DFS — Call Stack Depth Analysis¶
stateDiagram-v2
[*] --> Root : inorder(root)
Root --> LeftSubtree : recurse left
LeftSubtree --> LeafNode : recurse until null
LeafNode --> BacktrackLeft : return
BacktrackLeft --> ProcessNode : visit current
ProcessNode --> RightSubtree : recurse right
RightSubtree --> [*] : all done
note right of LeafNode
JVM stack depth = tree height h
Balanced BST: h = O(log N)
Unbalanced (sorted input): h = O(N)
Risk: StackOverflowError for h > ~8000
end note
5.3 BFS — Level-Order Queue Memory¶
sequenceDiagram
participant Q as Queue<TreeNode>
participant Level as Current Level
Q->>Q: enqueue(root)
loop while Q not empty
Note over Q: Level boundary: save current queue size
loop size times
Q->>Q: node = dequeue()
Level->>Level: process node
Q->>Q: enqueue non-null children
end
Note over Level: All nodes at this depth processed
end
Note over Q: Max queue size = widest level (up to N/2 nodes)
Memory peak: at the widest level of a complete binary tree, queue holds N/2 nodes → O(N) space. DFS uses O(h) stack space = O(log N) for balanced tree.
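The queue-size level-boundary trick from the diagram, sketched as code (class and method names are illustrative):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

public class LevelOrder {
    static class TreeNode {
        int val; TreeNode left, right;
        TreeNode(int v) { val = v; }
    }

    // Returns one list of values per depth level.
    static List<List<Integer>> levels(TreeNode root) {
        List<List<Integer>> result = new ArrayList<>();
        if (root == null) return result;
        Queue<TreeNode> q = new ArrayDeque<>();
        q.add(root);
        while (!q.isEmpty()) {
            int size = q.size();                 // level boundary: snapshot size
            List<Integer> level = new ArrayList<>();
            for (int i = 0; i < size; i++) {
                TreeNode node = q.poll();
                level.add(node.val);
                if (node.left != null) q.add(node.left);   // enqueue non-null
                if (node.right != null) q.add(node.right); // children only
            }
            result.add(level);
        }
        return result;
    }
}
```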
5.4 LCA (Lowest Common Ancestor) — Path Comparison¶
flowchart TD
A["Find path from root to node p:\n[root, a, b, p]"] --> C["Find path from root to node q:\n[root, a, c, q]"]
C --> D["Compare paths element by element"]
D --> E["Last common element = LCA"]
E --> F["Example: paths diverge at 'a'\n→ LCA = a"]
Space: O(log N) per path for balanced tree. Alternative: track parent pointers and use a HashSet of ancestors — O(h) space, O(h) time.
6. Dynamic Programming: Memoization vs Tabulation Memory Models¶
6.1 Fibonacci — Exponential vs Linear Memory¶
Naive recursion (call tree):
flowchart TD
F5["fib(5)"] --> F4["fib(4)"]
F5 --> F3a["fib(3)"]
F4 --> F3b["fib(3)"]
F4 --> F2a["fib(2)"]
F3a --> F2b["fib(2)"]
F3a --> F1a["fib(1)"]
F3b --> F2c["fib(2)"]
F3b --> F1b["fib(1)"]
Time: O(2^N), Space: O(N) call stack — but 2^N redundant computations.
Top-down memoization (HashMap cache):
sequenceDiagram
participant Call as fib(5)
participant Cache as HashMap<int,int>
Call->>Cache: cache.get(5)? MISS
Call->>Call: compute fib(4) + fib(3)
Note over Cache: fib(3) result cached after first computation
Call->>Cache: cache.get(3)? HIT → return immediately
Call->>Cache: cache.put(5, result)
Bottom-up tabulation (array):
dp[0] = 0, dp[1] = 1
for i = 2..N: dp[i] = dp[i-1] + dp[i-2]
// Further optimized: only keep last 2 values → O(1) space
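The memoized and the O(1)-space tabulated versions side by side, as a runnable sketch (method names are illustrative):

```java
public class Fib {
    // Top-down memoization: O(N) time, O(N) cache plus O(N) call stack.
    static long fibMemo(int n, long[] memo) {
        if (n <= 1) return n;
        if (memo[n] != 0) return memo[n];
        return memo[n] = fibMemo(n - 1, memo) + fibMemo(n - 2, memo);
    }

    // Bottom-up keeping only the last two values: O(N) time, O(1) space.
    static long fib(int n) {
        long prev = 0, curr = 1;
        for (int i = 0; i < n; i++) {
            long next = prev + curr;
            prev = curr;
            curr = next;
        }
        return prev;
    }
}
```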
6.2 Coin Change DP — Table Filling Pattern¶
For CTCI-style "number of ways to make change" problems, the DP table ways[i][j] = number of ways to make j cents using first i coin types:
flowchart TD
A["Initialize: ways[0][0] = 1\nways[0][j≠0] = 0"] --> B["For each coin type c_i:"]
B --> C["For each amount j:"]
C --> D["ways[i][j] += ways[i][j - c_i]\n(if j >= c_i)"]
D --> E["Only keeps row i-1 → O(amount) space\nvs O(coins × amount) naive"]
Space optimization: ways[i][j] depends only on ways[i-1][j] (carried over from the previous row) and ways[i][j - c_i] (earlier in the same row), so a single 1D array updated in increasing j order suffices.
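The 1D version of the table above, sketched as code (class and method names are illustrative); iterating coins in the outer loop counts each combination once rather than once per ordering:

```java
public class CoinChange {
    static int ways(int amount, int[] coins) {
        int[] ways = new int[amount + 1];
        ways[0] = 1;                           // one way to make 0 cents: no coins
        for (int c : coins)                    // coin-outer order counts
            for (int j = c; j <= amount; j++)  // combinations, not permutations
                ways[j] += ways[j - c];        // reuse of ways[j - c] = "same row"
        return ways[amount];
    }
}
```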
6.3 String DP: Edit Distance — 2D Table Layout¶
block-beta
columns 1
block:ED["Edit Distance dp[i][j] = edit dist(s1[0..i], s2[0..j])"]:1
R0["dp[0][j] = j (insert j chars)"]
R1["dp[i][0] = i (delete i chars)"]
R2["dp[i][j] = dp[i-1][j-1] if s1[i]==s2[j]"]
R3["dp[i][j] = 1 + min(dp[i-1][j], dp[i][j-1], dp[i-1][j-1]) else"]
end
Cache behavior: Row-major iteration through dp[i][j] → dp[i+1][j] accesses stride 1 in C/Java arrays (cache-friendly). Only two rows needed at a time → O(min(m,n)) space.
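A two-row sketch of the recurrences above (class name is illustrative; for the O(min(m,n)) bound, pass the shorter string as s2):

```java
public class EditDistance {
    static int distance(String s1, String s2) {
        int m = s1.length(), n = s2.length();
        int[] prev = new int[n + 1], curr = new int[n + 1];
        for (int j = 0; j <= n; j++) prev[j] = j;         // dp[0][j] = j
        for (int i = 1; i <= m; i++) {
            curr[0] = i;                                   // dp[i][0] = i
            for (int j = 1; j <= n; j++) {
                if (s1.charAt(i - 1) == s2.charAt(j - 1))
                    curr[j] = prev[j - 1];                 // chars match: diagonal
                else
                    curr[j] = 1 + Math.min(prev[j - 1],    // replace
                                  Math.min(prev[j],        // delete
                                           curr[j - 1]));  // insert
            }
            int[] tmp = prev; prev = curr; curr = tmp;     // swap rows: O(n) space
        }
        return prev[n];
    }
}
```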
7. Recursion: Call Stack Frame Layout and Base Case Analysis¶
7.1 Permutation Generation — Stack Frame Count¶
For permutations of N characters, the recursion tree has:
- Depth N (one character fixed per level)
- Each level k has N × (N-1) × ... × (N-k+1) active calls
- Total frames created over the whole run: sum over k of N!/(N-k)! ≈ e × N! = O(N!), but only N of them are live on the stack at any moment
stateDiagram-v2
[*] --> Level0 : permute("abc")
Level0 --> Level1a : fix 'a', permute("bc")
Level0 --> Level1b : fix 'b', permute("ac")
Level0 --> Level1c : fix 'c', permute("ab")
Level1a --> Leaf1 : fix 'b', ["a","bc"] → add "abc"
Level1a --> Leaf2 : fix 'c', ["a","cb"] → add "acb"
note right of Level0
Max stack depth = N
Space = O(N) for call stack
+ O(N × N!) for storing results
end note
7.2 Tower of Hanoi — Exponential Work, Linear Space¶
sequenceDiagram
participant A as Peg A (source)
participant B as Peg B (buffer)
participant C as Peg C (target)
Note over A,C: hanoi(n, A, C, B):
A->>B: Move top n-1 disks from A to B (using C)
A->>C: Move disk n from A to C
B->>C: Move top n-1 disks from B to C (using A)
Note over A,C: Total moves = 2^n - 1
Note over A,C: Call stack depth = n frames
Note over A,C: Space = O(n), Time = O(2^n)
Why 2^n - 1: Let T(n) = moves for n disks. T(1)=1, T(n) = 2T(n-1) + 1. Solving: T(n) = 2^n - 1.
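The recurrence T(n) = 2T(n-1) + 1 can be confirmed by counting moves in a minimal sketch (names are illustrative; peg labels stand in for real stacks):

```java
public class Hanoi {
    // Returns the number of single-disk moves; mirrors the sequence diagram:
    // n-1 disks to buffer, disk n to target, n-1 disks onto target.
    static long hanoi(int n, char src, char dst, char buf) {
        if (n == 0) return 0;
        long moves = hanoi(n - 1, src, buf, dst);
        moves += 1;
        moves += hanoi(n - 1, buf, dst, src);
        return moves;
    }
}
```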
8. Sorting and Searching: Internal Mechanics for Interview Contexts¶
8.1 Quicksort Partition Pointer Dance¶
sequenceDiagram
participant L as left pointer
participant R as right pointer
participant Pivot as pivot (last element)
Note over L,R: L starts at leftmost, R at rightmost-1
loop until L >= R
L->>L: advance right while a[L] < pivot
R->>R: advance left while a[R] > pivot
Note over L,R: a[L] >= pivot AND a[R] <= pivot?
L->>R: swap(a[L], a[R])
end
Note over L,Pivot: swap(a[L], pivot)
Note over L: Pivot now in final sorted position
Note over L: Elements left of L < pivot, right > pivot
Worst case (sorted input, last-element pivot): pivot is always min/max → O(N²). Fix: median-of-three pivot selection.
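A sketch of the pointer dance; note it uses a middle-element pivot rather than the diagram's last-element pivot, precisely to dodge the sorted-input worst case (class name is illustrative):

```java
public class Quicksort {
    static void sort(int[] a) { quicksort(a, 0, a.length - 1); }

    static void quicksort(int[] a, int left, int right) {
        if (left >= right) return;
        int index = partition(a, left, right);
        quicksort(a, left, index - 1);
        quicksort(a, index, right);
    }

    // Walk L rightward and R leftward, swapping out-of-place pairs.
    static int partition(int[] a, int left, int right) {
        int pivot = a[left + (right - left) / 2];  // overflow-safe midpoint
        while (left <= right) {
            while (a[left] < pivot) left++;        // advance past smaller elements
            while (a[right] > pivot) right--;      // retreat past larger elements
            if (left <= right) {
                int tmp = a[left]; a[left] = a[right]; a[right] = tmp;
                left++; right--;
            }
        }
        return left;
    }
}
```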
8.2 Binary Search — Integer Overflow Bug¶
Classic binary search has a famous integer overflow bug:
// BUG: mid = (lo + hi) / 2 — overflows if lo+hi > Integer.MAX_VALUE
// FIX: mid = lo + (hi - lo) / 2
flowchart LR
A["lo = 2^30\nhi = 2^30 + 1"] --> B["lo + hi = 2^31 + 1\n→ INTEGER OVERFLOW → negative!"]
A --> C["hi - lo = 1\nlo + (hi-lo)/2 = 2^30\n→ CORRECT"]
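The overflow-safe search in full, as a sketch (class name is illustrative):

```java
public class BinarySearch {
    static int search(int[] a, int target) {
        int lo = 0, hi = a.length - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;  // hi - lo always fits in an int
            if (a[mid] == target) return mid;
            if (a[mid] < target) lo = mid + 1;
            else hi = mid - 1;
        }
        return -1;                          // not found
    }
}
```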
8.3 Merge Sorted Arrays In-Place — From Back to Front¶
Merging array A (with m elements + buffer) and array B (with n elements) — starting from the back avoids extra space:
sequenceDiagram
participant A as Array A[0..m-1 + n buffer]
participant B as Array B[0..n-1]
participant PA as pointer a = m-1
participant PB as pointer b = n-1
participant PW as write = m+n-1
loop until one array exhausted
Note over PA,PB: Compare A[a] and B[b]
alt A[a] >= B[b]
PA->>A: A[write] = A[a], a--, write--
else
PB->>A: A[write] = B[b], b--, write--
end
end
Note over A: Copy remaining B elements to front of A
Writing from back to front means we never overwrite unread elements — O(1) extra space.
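The back-to-front merge as code, a sketch in the CTCI 9.1 style (class name is illustrative); looping until B is exhausted also covers the "copy remaining B" step, since leftover A elements are already in place:

```java
public class MergeSorted {
    // a holds m valid elements followed by at least n slots of buffer.
    static void merge(int[] a, int m, int[] b, int n) {
        int i = m - 1, j = n - 1, write = m + n - 1;
        while (j >= 0) {
            if (i >= 0 && a[i] >= b[j])
                a[write--] = a[i--];   // A's element is larger: place it last
            else
                a[write--] = b[j--];   // B's element is larger (or A exhausted)
        }
    }
}
```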
9. System Design: Memory Constraints and Data Partitioning¶
9.1 Bit Array for Duplicate Detection (CTCI Ch.12)¶
Problem: Given 4GB file with 32-bit integers, find any duplicate using only 1GB RAM.
flowchart TD
A["4GB file = 10^9 32-bit ints"] --> B["2^32 possible values = 4 billion"]
B --> C["Bit array: 2^32 bits = 512 MB\n(one bit per possible value)"]
C --> D["Pass 1: set bit for each int seen"]
D --> E["Pass 2: if bit already set → DUPLICATE"]
E --> F["Total memory: 512 MB < 1 GB ✓"]
9.2 External Sort — Disk I/O Pattern¶
For sorting a file too large to fit in RAM:
sequenceDiagram
participant Disk as Disk (input file)
participant RAM as RAM (1 GB chunks)
participant Tmp as Temp files (sorted runs)
participant Out as Output file
loop until input exhausted
Disk->>RAM: Load 1GB chunk
RAM->>RAM: Sort in-memory (quicksort/mergesort)
RAM->>Tmp: Write sorted run to disk
end
Note over Tmp: K sorted runs of 1GB each
loop K-way merge
Tmp->>RAM: Read one buffer per run (K buffers)
RAM->>RAM: K-way merge via min-heap
RAM->>Out: Write merged output buffer
end
K-way merge uses a min-heap of size K → O(log K) per element → total O(N log K) where K = number of runs.
10. Threads and Locks: Monitor Model and Deadlock Mechanics¶
10.1 Java Monitor — Object Header Lock Word¶
Every Java object carries an object header: an 8-byte mark word plus a 4-byte class pointer (with compressed class pointers) = 12 bytes. The mark word encodes lock state:
stateDiagram-v2
[*] --> Unlocked : fresh object
Unlocked --> Biased : first thread claims biased lock\n(thread ID stored in mark word)
Biased --> Lightweight : second thread contends\n(CAS spin lock, stack-based)
Lightweight --> Heavyweight : long contention\n(OS mutex, kernel mode)
Heavyweight --> Unlocked : last thread releases
Deadlock conditions (all 4 must hold simultaneously):
1. Mutual exclusion: resources held exclusively
2. Hold and wait: process holds lock while waiting for another
3. No preemption: locks cannot be forcibly taken
4. Circular wait: process A waits for B, B waits for A
flowchart LR
T1["Thread 1:\nhas Lock A\nwaiting for Lock B"] --> T2["Thread 2:\nhas Lock B\nwaiting for Lock A"]
T2 --> T1
Note["DEADLOCK: circular wait\nneither can proceed"]
Prevention: always acquire locks in the same global order across all threads.
10.2 Producer-Consumer — wait/notify Memory Visibility¶
sequenceDiagram
participant Producer
participant Queue as Shared Queue (synchronized)
participant Consumer
Producer->>Queue: synchronized(queue) { queue.add(item) }
Queue->>Consumer: notifyAll() — wake waiting consumers
Consumer->>Queue: synchronized(queue) { while(queue.isEmpty()) wait() }
Note over Queue: Java Memory Model guarantees:\n- All writes before notify() visible after wait() returns
Note over Queue: volatile field or synchronized block\n= happens-before relationship
11. Algorithm Patterns — Complexity Under the Hood¶
11.1 Five Approaches Mapped to Complexity Classes¶
CTCI describes 5 approaches. Here's their algorithmic genealogy:
flowchart TD
P1["Exemplify → Pattern Recognition\n→ O(1) or O(N) with insight"] --> P2["Simplify → Reduce to known problem\n→ amortized same as base algorithm"]
P2 --> P3["Base Case & Build\n→ Mathematical induction\n→ often exponential naively,\nreduced by memoization"]
P3 --> P4["Data Structure Brainstorm\n→ Mapping to ADT operations\n→ O(log N) if heap/BST, O(1) amort if hash"]
P4 --> P5["BUD Optimization\nBottleneck + Unnecessary + Duplicate\n→ Profile inner loop → find dominant term"]
11.2 Space-Time Tradeoff Map¶
block-beta
columns 3
block:h1["Problem"]:1
block:h2["Time-optimal"]:1
block:h3["Space-optimal"]:1
block:r1["Unique characters"]:1
block:r2["O(N) hash set"]:1
block:r3["O(1) bit vector\n(if ASCII)"]:1
block:r4["Two-sum"]:1
block:r5["O(N) hash map"]:1
block:r6["O(N log N) sort\n+ O(1) two-pointer"]:1
block:r7["Palindrome check"]:1
block:r8["O(N) two-pointer compare"]:1
block:r9["O(1) space, O(N) time\n(no extra array)"]:1
block:r10["Anagram detection"]:1
block:r11["O(N) frequency array"]:1
block:r12["O(N log N) sort both\nO(1) extra space"]:1
block:r13["LRU cache"]:1
block:r14["O(1) HashMap + DoublyLinkedList"]:1
block:r15["O(1) space not possible\n(must store N items)"]:1
12. Data Flow: From Problem Statement to Memory Operations¶
flowchart TD
Problem["Interview Problem:\n'Find all subsets of a set'"] --> Recognize["Pattern: Binary number 0..2^N-1\neach bit = include/exclude element"]
Recognize --> Memory["Memory: List<List<Integer>>\n2^N sublists, average N/2 elements each\n= O(N × 2^N) total"]
Memory --> Loop["For each i in 0..2^N-1:\n For each bit j in 0..N-1:\n if (i >> j) & 1: include element j"]
Loop --> Cache["Inner loop: test bit j of integer i\n= 2 integer operations\n→ CPU register ops, cache-line irrelevant"]
Cache --> Result["Total time: O(N × 2^N)\nTotal space: O(N × 2^N)"]
This is the canonical subset enumeration approach — it avoids recursion entirely, replacing the call stack with the binary bit pattern of the outer loop variable.
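The enumeration loop from the flow above, sketched as code (class and method names are illustrative):

```java
import java.util.ArrayList;
import java.util.List;

public class Subsets {
    static List<List<Integer>> allSubsets(int[] set) {
        List<List<Integer>> result = new ArrayList<>();
        for (int i = 0; i < (1 << set.length); i++) {      // each i = one subset
            List<Integer> subset = new ArrayList<>();
            for (int j = 0; j < set.length; j++)
                if (((i >> j) & 1) == 1)                    // bit j set →
                    subset.add(set[j]);                     // include element j
            result.add(subset);
        }
        return result;
    }
}
```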
Document synthesized from Cracking the Coding Interview, 4th Edition (Gayle Laakmann McDowell) — covering bit manipulation (Ch.5), hash tables (Ch.1), linked lists (Ch.2), stacks/queues (Ch.3), trees/graphs (Ch.4), recursion (Ch.8), dynamic programming, sorting (Ch.9), system design (Ch.12), and threading (Ch.18), with focus on internal memory models and algorithmic invariants.