PDF version of these notes: Notes
Divide and Conquer
- DIVIDE into smaller sub-problems
- CONQUER via recursive calls
- COMBINE solutions of sub-problems into one for the original problem
Master Method
Cool feature: a “black-box” method for solving recurrences
Determines an upper bound on the running time of most D&C algorithms.
Assumption: all sub-problems have equal size. Not covered by the basic method:
- unbalanced sub-problems
- more than one recurrence
Recurrence format:
- base case: $T(n)\leq C$ (a constant) for all sufficiently small $n$
- for all larger $n$: $T(n)\leq aT(\frac{n}{b})+O(n^d)$, where
  - $a$: # of recursive calls (i.e., # of sub-problems), $a\geq 1$
  - $b$: input-size shrinkage factor, $b>1$; $T(\frac{n}{b})$ is the time required to solve each sub-problem
  - $d$: exponent in the running time of the combine step, $d\geq 0$
- the constants $a,b,d$ are independent of $n$
Three Cases:
$$ T(n)= \begin{cases} O(n^d\log n), &a=b^d \text{ (case 1)}\newline O(n^d), &a<b^d\text{ (case 2)}\newline O(n^{\log_b a}), &\text{otherwise (case 3)} \end{cases} $$
- Case 1: base of logarithm does not matter
- Case 3: base of logarithm matters
If $T(n)= aT(\frac{n}{b})+\Theta(n^d)$, then (with a similar proof)
$$ T(n)= \begin{cases} \Theta(n^d\log n), &a=b^d \text{ (case 1)}\newline \Theta(n^d), &a<b^d\text{ (case 2)}\newline \Theta(n^{\log_b a}), &\text{otherwise (case 3)} \end{cases} $$
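The case analysis is mechanical, so it is easy to script. A minimal sketch (the helper name `master_bound` is my own) that compares $a$ against $b^d$ and reports the bound:

```python
from math import log

def master_bound(a, b, d):
    """Master-method bound for T(n) <= a*T(n/b) + O(n^d), as a string."""
    if a == b**d:
        return f"O(n^{d} log n)"        # case 1: balanced
    elif a < b**d:
        return f"O(n^{d})"              # case 2: root dominates
    else:
        return f"O(n^{log(a, b):.2f})"  # case 3: exponent log_b(a), leaves dominate

print(master_bound(2, 2, 1))  # merge sort: O(n^1 log n)
print(master_bound(4, 2, 1))  # naive recursive multiplication: O(n^2.00)
print(master_bound(3, 2, 1))  # Karatsuba: O(n^1.58)
```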
Proof (Recursion Tree Approach)
3 cases $\Leftrightarrow$ 3 types of recursion trees
For simplicity, we assume:
- $T(1)\leq C$ (for some constant $C$)
- $T(n)\leq aT(\frac{n}{b})+Cn^d$
- $n$ is a power of $b$
At each level $j=0,1,\ldots, \log_b n$, there are $a^j$ sub-problems, each of size $n/b^j$.
There are $(1+\log_b n)$ levels: level 0 is the root, level $\log_b n$ the leaves.
Then
$$
\text{Total work at level-}j \leq
a^j \times C\left(\frac{n}{b^j}\right)^d
$$
Thus,
$$
\text{Total work} \leq Cn^d\sum_{j=0}^{\log_bn}\left(\frac{a}{b^d}\right)^j \ \ (\Delta)
$$
$\Rightarrow$ everything depends on the relationship between $a$ and $b^d$
For constant $r>0$,
$$ 1+r+r^2+\cdots+r^k = \frac{1-r^{k+1}}{1-r} \leq \begin{cases} \frac{1}{1-r}, &r<1 \newline \frac{r}{r-1} r^k, &r>1 \end{cases} $$
- $0<r<1$: LHS $\leq$ a constant (independent of $k$)
- $r>1$: LHS is dominated by the largest power of $r$
Case 1. $a=b^d$
$$
(\Delta) \leq Cn^d\times \log_b n=O(n^d\log n)
$$
Case 2. $a<b^d$
$$
(\Delta) \leq O(n^d)
$$
Case 3. $a>b^d$
$$
(\Delta)\leq Cn^d \left(\frac{a}{b^d}\right)^{\log_b n}= Ca^{\log_b n}=Cn^{\log_b a}=O(n^{\log_b a})
$$
$n^{\log_ba}=a^{\log_bn} \Leftrightarrow\log_ba\log_bn=\log_bn\log_ba$
$a^{\log_b n}=$ # of leaves
Interpretations
Upper bound on the work at level $j$:
$$
Cn^d\times\left(\frac{a}{b^d}\right)^j
$$
$$
a:\text{ rate of sub-problem proliferation (RSP) - force of evil} \\
b^d: \text{ rate of work shrinkage per sub-problem (RWS) - force of good}
$$
- $RSP<RWS$: amount of work decreases with the recursion level $j$; most work at the root $\Rightarrow$ root dominates $\Rightarrow$ might expect $O(n^d)$
- $RSP>RWS$: amount of work increases with the recursion level $j$; leaves dominate $\Rightarrow$ might expect $O(\#\text{leaves})=O(a^{\log_b n})=O(n^{\log_b a})$
- $RSP=RWS$: amount of work is the same at each recursion level $j$ (like merge sort) $\Rightarrow$ might expect $O(\log n)\times O(n^d)=O(n^d\log n)$ (recursion depth $=O(\log n)$)
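A small numerical sketch of the three regimes (constants and $n$ chosen arbitrarily, with $C=1$), evaluating the per-level bound $Cn^d(a/b^d)^j$:

```python
from math import log

def work_per_level(a, b, d, n, C=1):
    """Per-level work bound C*n^d*(a/b^d)^j for j = 0..log_b(n)."""
    depth = round(log(n, b))
    return [C * n**d * (a / b**d)**j for j in range(depth + 1)]

print(work_per_level(2, 2, 1, 16))  # RSP = RWS: constant per level
print(work_per_level(2, 2, 2, 16))  # RSP < RWS: decreasing, root dominates
print(work_per_level(4, 2, 1, 16))  # RSP > RWS: increasing, leaves dominate
```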
Counting Inversions
- Problem: count the # of inversions in an array = # of pairs $(i,j)$ of array indices with $i<j$ and $A[i]>A[j]$
- Number of inversions = number of intersections of line segments (connect each element to its position in the sorted order)
- Application: measuring similarity between 2 ranked lists => making good recommendations => collaborative filtering (CF)
- Largest possible number of inversions an $n$-element array can have: $\binom{n}{2}$ (worst case: array in backward order)
- Brute-force algorithm: $O(n^2)$
- D&C algorithm: pseudocode below
```
A = input array [length = n]
D = output [length = n]
B = 1st sorted array [length n/2], C = 2nd sorted array [length n/2]

SortAndCount(A)
    if n = 1, return (A, 0)
    else
        (B, X) = SortAndCount(1st half of A)
        (C, Y) = SortAndCount(2nd half of A)
        (D, Z) = MergeAndCountSplitInv(B, C)
    return (D, X + Y + Z)
```
Goal: implement MergeAndCountSplitInv in linear time $O(n)$.
Key observation: the split inversions involving an element y of the 2nd array C are precisely the elements left in the 1st array B when y is copied to the output D.
$$ T(n)\leq 2T\left(\frac{n}{2}\right)+O(n) \Rightarrow T(n)=O(n\log n) $$
Python Code
```python
def sortAndCount(arr):
    n = len(arr)
    # base case
    if n < 2:
        return arr, 0
    m = n//2
    left, x = sortAndCount(arr[:m])
    right, y = sortAndCount(arr[m:])
    # merge and count split inversions
    i = j = z = 0
    for k in range(n):
        if j == n - m or (i < m and left[i] < right[j]):
            arr[k] = left[i]
            i += 1
        else:
            arr[k] = right[j]
            j += 1
            z += m - i  # every element left in `left` is inverted with right[j]
    return arr, x + y + z
```
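For example, `sortAndCount([3, 1, 2])` returns `([1, 2, 3], 2)`, since $(3,1)$ and $(3,2)$ are the only inversions; the list is sorted in place and also returned.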
Karatsuba Multiplication
- Problem: multiplication of two $n$-digit numbers $x, y$ (base 10)
- Application: cryptography
- Define primitive operations: add or multiply 2 single-digit numbers
- Simple (grade-school) method: $\leq 4n^2 = O(n^2)$ primitive operations
- Recursive method:
$$ x = 10^{n/2}a+b, y = 10^{n/2}c+d, \text{ where }a,b,c,d\text{ are }\frac{n}{2}\text{-digit numbers} $$
$$ \Rightarrow xy = 10^nac+ 10^{n/2}(ad+bc)+bd\ \ \ \ \ \ (*) $$
Algorithm #1 (naive method)
- recursively compute $ac$, $ad$, $bc$, and $bd$
- then compute (*)
Running time:
$$ T(n) \begin{cases} =O(1), &n=1\ (\text{base case})\newline \leq 4T(\frac{n}{2}) + O(n), &n\geq 2\ (\text{4 sub-problems plus linear-time additions}) \end{cases} $$
By the master method,
$$ a = 4 > 2^1 = b^d\ (\text{case 3}) \Rightarrow T(n) = O(n^{\log_ba})=O(n^2) $$
Algorithm #2 (Gauss’s method)
- recursively compute $ac$, $bd$, and $(a+b)(c+d)$
- $ad+bc = (a+b)(c+d)-(ac+bd)$
- then compute (*)
Running time:
$$ T(n) \begin{cases} =O(1), &n=1\ (\text{base case})\newline \leq 3T(\lceil \frac{n}{2} \rceil) + O(n), &n\geq 2\ (\text{3 sub-problems plus linear-time additions}) \end{cases} $$
By the master method,
$$ a = 3 > 2^1 = b^d\ (\text{case 3}) \Rightarrow T(n) = O(n^{\log_ba})=O(n^{\log_23})=O(n^{1.59}) $$
Better than the simple method!
Python Code
```python
def karatsuba(x, y):
    nx, ny = len(str(x)), len(str(y))
    n = max(nx, ny)
    # base case: single-digit multiplication
    if n == 1:
        return x*y
    m = n//2
    bm = 10**m
    a, b = x//bm, x%bm  # x = a*bm + b
    c, d = y//bm, y%bm  # y = c*bm + d
    term1 = karatsuba(a, c)      # ac
    term2 = karatsuba(b, d)      # bd
    term3 = karatsuba(a+b, c+d)  # (a+b)(c+d)
    # xy = ac*bm^2 + (ad+bc)*bm + bd, with ad+bc = term3 - term1 - term2
    result = (bm**2)*term1 + bm*(term3 - term1 - term2) + term2
    return result
```
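A quick sanity check: `karatsuba(1234, 5678)` returns `7006652`, matching `1234 * 5678`. Note the recursion bottoms out on single-digit operands, the primitive operation defined above.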
Binary Search
- Problem: look for an element in a given sorted array
- Running time: $T(n)=T(n/2)+O(1)$. By the master method,
$$ a = 1=2^0=b^d \text{ (case 1) }\Rightarrow T(n)\leq O(n^d\log n)=O(\log n) $$
Python Code
See link
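The linked code is external; as a stand-in, a minimal iterative sketch:

```python
def binarySearch(arr, target):
    """Return an index of target in the sorted list arr, or -1 if absent."""
    lo, hi = 0, len(arr) - 1
    while lo <= hi:
        mid = (lo + hi)//2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            lo = mid + 1   # discard the left half
        else:
            hi = mid - 1   # discard the right half
    return -1
```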
Strassen’s Matrix Multiplication
- Problem: compute $Z_{n\times n}=X_{n\times n}\cdot Y_{n\times n}$ (note: input size $=O(n^2)$)
- Naive iterative algorithm: $O(n^3)$ (3 nested `for` loops)
$$ z_{ij} = \sum_{k=1}^n x_{ik}\cdot y_{kj} $$
- Write
$$ X = \begin{pmatrix} A &B\newline C &D \end{pmatrix}, Y = \begin{pmatrix} E &F\newline G &H \end{pmatrix} $$
where $A$ through $H$ are all $\frac{n}{2}\times \frac{n}{2}$ matrices. Then
$$
X\cdot Y = \begin{pmatrix}
AE+BG &AF+BH\newline
CE+DG &CF+DH
\end{pmatrix}
$$
- Recursive method #1:
  - step 1. recursively compute the 8 products
  - step 2. do additions ($O(n^2)$ time)
  - Running time: by the master method,
$$ a = 8>2^2=b^d \text{ (case 3) }\Rightarrow T(n)\leq O(n^{\log_b a})=O(n^{\log_2 8})=O(n^{3}) $$
- Strassen’s method:
  - step 1. recursively compute only 7 (cleverly chosen) products
  - step 2. do (clever) additions ($O(n^2)$ time)
  - Running time: by the master method,
$$ a = 7>2^2=b^d \text{ (case 3) }\Rightarrow T(n)\leq O(n^{\log_b a})=O(n^{\log_2 7})=O(n^{2.81}) $$
The 7 products: $$ P_1 =A(F-H)\\
P_2 =(A+B)H\\
P_3=(C+D)E\\
P_4=D(G-E)\\
P_5=(A+D)(E+H)\\
P_6=(B-D)(G+H)\\
P_7=(A-C)(E+F) $$Then $$ \begin{align} X\cdot Y &= \begin{pmatrix} AE+BG &AF+BH\newline CE+DG &CF+DH \end{pmatrix}\newline &= \begin{pmatrix} P_5+P_4-P_2+P_6 &P_1+P_2\newline P_3+P_4 &P_1+P_5-P_3-P_7 \end{pmatrix} \end{align} $$
Python Code
```python
import numpy as np

def strassen(X, Y):
    n = X.shape[0]
    # base case: 1x1 matrices
    if n == 1:
        return np.array([[X[0, 0]*Y[0, 0]]])
    if n%2 == 1:  # odd size: pad with zeros to make n even
        Xpad = np.zeros((n + 1, n + 1))
        Ypad = np.zeros((n + 1, n + 1))
        Xpad[:n, :n], Ypad[:n, :n] = X, Y
        return strassen(Xpad, Ypad)[:n, :n]
    m = n//2
    A, B, C, D = X[:m, :m], X[:m, m:], X[m:, :m], X[m:, m:]
    E, F, G, H = Y[:m, :m], Y[:m, m:], Y[m:, :m], Y[m:, m:]
    # the 7 cleverly chosen products
    P1 = strassen(A, F - H)
    P2 = strassen(A + B, H)
    P3 = strassen(C + D, E)
    P4 = strassen(D, G - E)
    P5 = strassen(A + D, E + H)
    P6 = strassen(B - D, G + H)
    P7 = strassen(A - C, E + F)
    Z = np.zeros((n, n))
    Z11 = P5 + P4 - P2 + P6
    Z12 = P1 + P2
    Z21 = P3 + P4
    Z22 = P1 + P5 - P3 - P7
    Z[:m, :m], Z[:m, m:], Z[m:, :m], Z[m:, m:] = Z11, Z12, Z21, Z22
    return Z
```
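A quick correctness check against NumPy’s built-in matrix product (continuing the session above; sizes chosen so the odd-size padding branch is exercised):

```python
X = np.random.randint(10, size=(6, 6))
Y = np.random.randint(10, size=(6, 6))
assert np.allclose(strassen(X, Y), X @ Y)
```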
Closest Pair
- Input: a set $P=\{p_1,\ldots,p_n\}$ of $n$ points in the plane ($\mathbb{R}^2$)
- Notation: $d(p,q) =$ Euclidean distance
- Output: $p^*, q^*=\arg\min\limits_{p\neq q\in P}d(p,q)$
- Assumption (for convenience): all points have distinct $x$-coordinates and distinct $y$-coordinates
- Brute-force search: $\Theta(n^2)$
- 1-d version of CP:
  - sort the points ($O(n\log n)$ time)
  - return the closest pair of adjacent points ($O(n)$ time)
- Goal: an $O(n\log n)$-time algorithm for the 2-d version
- D&C: make copies of the points sorted by $x$-coordinate ($P_x$) and by $y$-coordinate ($P_y$) [$O(n\log n)$ time]
ClosestPair$(P_x, P_y)$
- $Q$ = left half of $P$, $R$ = right half of $P$; form $Q_x, Q_y, R_x, R_y$ [$O(n)$ time]
- $(p_1,q_1)=$ ClosestPair$(Q_x, Q_y)$
- $(p_2,q_2)=$ ClosestPair$(R_x, R_y)$
- let $\delta = \min\{d(p_1,q_1), d(p_2,q_2)\}$
- $(p_3,q_3)=$ ClosestSplitPair$(P_x, P_y, \delta)$. Requirements:
  - $O(n)$ time
  - correct whenever the CP of $P$ is a split pair
- return the best of $(p_1,q_1),(p_2,q_2),(p_3,q_3)$
ClosestSplitPair$(P_x, P_y, \delta)$: filtering + linear scan
- Let $\bar{x}$ = biggest $x$-coordinate in the left half of $P$ [$O(1)$ time]
- Let $S_y$ = points of $P$ with $x$-coordinate in $[\bar{x}-\delta,\bar{x}+\delta]$, sorted by $y$-coordinate
- Initialize best = $\delta$, best pair = NULL, $m=\vert S_y\vert$.
```
For i = 1 to m-1
    For j = 1 to min(7, m-i)
        Let p, q = ith, (i+j)th points of S_y
        If d(p, q) < best
            best pair = (p, q)
            best = d(p, q)
```
[$O(n)$ time]
Claim. Let $p\in Q$, $q\in R$ be a split pair with $d(p,q)<\delta$. Then
(A) $p,q\in S_y$
(B) $p,q$ are at most 7 positions apart in $S_y$
Proof.
(A) Let $p=(x_1,y_1)\in Q$, $q = (x_2, y_2)\in R$ with $d(p,q)<\delta$. Then $|x_1-x_2|<\delta$, and
$$ x_1 \leq \bar{x} \leq x_2 \Rightarrow \begin{cases} \bar{x}-x_1\leq x_2-x_1 <\delta &\Rightarrow x_1 >\bar{x}-\delta \newline x_2-\bar{x}\leq x_2-x_1 <\delta &\Rightarrow x_2 <\bar{x} +\delta \end{cases} $$
(B) Evenly divide $[\bar{x}-\delta, \bar{x}+\delta]\times [\tilde{y}, \tilde{y}+\delta]$ into 8 $\frac{\delta}{2}\times \frac{\delta}{2}$ boxes, where $\tilde{y}=\min\{y_1,y_2\}$.
Then there is at most 1 point in each of the 8 boxes. Since every point of $S_y$ with $y$-coordinate between those of $p$ and $q$ lies in one of these boxes, $p$ and $q$ are at most 7 positions apart in $S_y$.
Proof by contradiction.
Let $a,b$ be two points in the same box. Then $a,b\in Q$ or $a,b\in R$, and
$$ d(a,b)\leq \frac{\delta}{\sqrt{2}}<\delta, \ \ \text{contradicting the definition of }\delta. $$
Corollary 1. If the CP of $P$ is a split pair, then ClosestSplitPair finds it.
Corollary 2. ClosestPair is correct, and runs in $O(n\log n)$ time.
Python Code
```python
import numpy as np
import numpy.linalg as LA

# find the closest pair among n 2-dim points
# input: n x 2 np array P, each row one point
def CP(P):
    Px = P[P[:, 0].argsort(kind = 'mergesort')]  # sorted by x-coordinate
    Py = P[P[:, 1].argsort(kind = 'mergesort')]  # sorted by y-coordinate
    return closestPair(Px, Py)

def closestPair(Px, Py):
    n = Px.shape[0]  # number of points
    if n < 4:
        return bruteForceCP(Px)
    mid = n//2
    Qx, Rx = Px[:mid], Px[mid:]
    midx = Px[mid, 0]
    idx = (Py[:, 0] < midx)
    Qy, Ry = Py[idx], Py[~idx]
    (p1, q1), r1 = closestPair(Qx, Qy)
    (p2, q2), r2 = closestPair(Rx, Ry)
    if r1 < r2:
        best = r1
        bestpoint = (p1, q1)
    else:
        best = r2
        bestpoint = (p2, q2)
    (p3, q3), r3 = closestSplitPair(Px, Py, best, bestpoint)
    if r3 < best:
        best = r3
        bestpoint = (p3, q3)
    return bestpoint, best

def bruteForceCP(P):
    best = float('Inf')
    bestpoint = None
    n = P.shape[0]
    for i in range(n):
        for j in range(i + 1, n):
            d = LA.norm(P[i] - P[j])
            if d < best:
                best = d
                bestpoint = (P[i], P[j])
    return bestpoint, best

def closestSplitPair(Px, Py, best, bestpoint):
    n = Px.shape[0]
    mid = n//2
    midx = Px[mid, 0]
    # filter: keep points with x-coordinate within `best` of the split line
    idx = np.logical_and(midx - best <= Py[:, 0], Py[:, 0] <= midx + best)
    Sy = Py[idx]
    m = Sy.shape[0]  # number of points in Sy
    for i in range(m - 1):
        for j in range(i + 1, min(i + 8, m)):  # at most 7 positions apart
            p, q = Sy[i], Sy[j]
            d = LA.norm(p - q)
            if d < best:
                best = d
                bestpoint = (p, q)
    return bestpoint, best

# Test
import time

a = np.random.randint(1e5, size = (1000, 2))

tic = time.time()
b = bruteForceCP(a)
toc = time.time()
print("Brute Force takes time:", toc - tic)
print("ans:", b)
# Brute Force takes time: 4.079403400421143
# ans: ((array([29326, 36462]), array([29246, 36561])), 127.283148923964)

tic = time.time()
c = CP(a)
toc = time.time()
print("D&C takes time:", toc - tic)
print("ans:", c)
# D&C takes time: 0.14911723136901855
# ans: ((array([29326, 36462]), array([29246, 36561])), 127.283148923964)
```
Fictitious Example
$$
T(n)=2T\left(\frac{n}{2}\right) + O(n^2) \\
\Rightarrow a=2<2^2=b^d (\text{case 2}) \Rightarrow T(n)=O(n^2)
$$
Naively, recursion depth $=O(\log n)$ with $O(n^2)$ work per level suggests $O(n^2\log n)$, but in fact $T(n)=O(n^2)=o(n^2\log n)$: the master method gives a tighter upper bound than level-by-level counting!
Sorting & Selection
|               | SORTING    | SELECTION |
|---------------|------------|-----------|
| Randomized    | Quick Sort | RSelect   |
| Deterministic | Merge Sort | DSelect   |
Merge Sort versus Quick Sort
- Quick Sort has a smaller constant than Merge Sort
- Merge Sort needs extra space; Quick Sort works in-place
| Running Time | Average Case | Worst Case   |
|--------------|--------------|--------------|
| Merge Sort   | $O(n\log n)$ | $O(n\log n)$ |
| Quick Sort   | $O(n\log n)$ | $O(n^2)$     |
Merge Sort (D&C)
- Input: array of numbers (assume distinct, exercise: duplicate case), unsorted
- Output: same numbers, sorted in increasing order
- Pseudocode for Merge:
```
C = output [length = n]
A = 1st sorted array [length n/2], B = 2nd sorted array [length n/2]

i = 1, j = 1
for k = 1 to n
    if A(i) < B(j)
        C(k) = A(i)
        i++
    else [B(j) < A(i)]
        C(k) = B(j)
        j++
end
```
(ignores end cases)
Running Time
- Key question: running time of Merge Sort on an array of $n$ numbers?
- Upshot: running time of Merge on an array of $m$ numbers is $\leq 5m+2\leq 6m$ (for $m>1$)
  - `i = 1`, `j = 1` [2 operations]
  - each iteration of the for loop: comparison, assignment, increment of `i` or `j`, increment of `k`, comparison of `k` to the upper bound `n` [5 operations per iteration]
- Claim: Merge Sort requires $\leq 6n\log_2n+6n$ operations to sort $n$ numbers.
Proof by Recursion Tree Method
Assume $n$ is a power of 2.
At each level $j=0,1,2,\cdots,\log_2 n$, there are $2^j$ sub-problems, each of size $n/2^j$.
Total # of operations at level $j$ (Merge):
$$ \leq 2^j \times 6\left(\frac{n}{2^j}\right)=6n \text{ (independent of }j) $$
Total # of operations $\leq 6n(\log_2 n +1)$
Proof by Master Method
$$ a = 2=2^1=b^d \text{ (case 1) }\Rightarrow T(n)\leq O(n^d\log n)=O(n\log n) $$
Python Code
```python
def mergeSort(arr):
    n = len(arr)
    # base case
    if n < 2:
        return arr[:]
    m = n//2
    left, right = mergeSort(arr[:m]), mergeSort(arr[m:])
    # sentinels: avoid explicit end-of-array checks during the merge
    left.append(float('Inf'))
    right.append(float('Inf'))
    i = j = 0
    result = []
    for k in range(n):
        if left[i] < right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    return result
```
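For example, `mergeSort([5, 4, 1, 8, 7, 2, 6, 3])` returns `[1, 2, 3, 4, 5, 6, 7, 8]`; the input list itself is left unchanged, which accounts for the extra space noted below.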
- Space complexity: $O(n)$
Quick Sort
- a randomized algorithm, $O(n\log n)$ time on average, works in place ($O(1)$ extra space)
- Input: array of numbers (assume distinct, exercise: duplicate case), unsorted
- Output: same numbers, sorted in increasing order
- key idea: partition the array around a pivot element
  - pick a pivot
  - rearrange the array so that the pivot lands in its ‘rightful’ position
    - left of pivot: elements less than the pivot
    - right of pivot: elements greater than the pivot
- cool facts about partition:
  - linear time $O(n)$, no extra space
  - reduces the problem size
High-Level Description
Quick Sort
- Input: array A, length n
- if n = 1, return
- p = choosePivot(A, n)
- partition A around p
- recursively sort 1st part
- recursively sort 2nd part
Partition (In-Place Implementation)
- Assume: the pivot is the 1st element of the array [if not, swap pivot $\leftrightarrow$ 1st element as a preprocessing step]
- High-Level Idea
  - single scan through the array
  - invariant: everything looked at so far is partitioned
- Pseudocode for partition:
```
Partition(A, l, r)
    p := A[l]
    i := l + 1
    for j = l + 1 to r:
        if A[j] < p:
            swap A[j], A[i]
            i += 1
    swap A[l] and A[i-1]
```
- running time: $O(n)$, where $n=r-l+1$
- works in $O(1)$ space
Correctness of Quicksort
$P(n)$ = “Quick Sort correctly sorts every input array of length $n$”
Claim: $P(n)$ holds for every $n\geq 1$ [no matter how the pivot is chosen]
Proof by induction.
- base case: every input array of length 1 is already sorted, so $P(1)$ holds.
- inductive step: Fix $n\geq 2$ and some input array A of length $n$.
  - Need to show: if $P(k)$ holds $\forall k<n$, then $P(n)$ holds as well.
  - Quick Sort first partitions A around some pivot p, and the pivot winds up in its correct position.
  - Let $k_1, k_2$ be the lengths of the 1st and 2nd parts of the partitioned array.
  - Since $k_1<n$ and $k_2<n$, by the inductive hypothesis the 1st and 2nd parts get sorted correctly by the recursive calls.
  - So after the recursive calls, the entire array is correctly sorted.
Choice of Pivot
- Running time of Quick Sort? Depends on the quality of the pivot.
- Running time of QS on an already-sorted array if choosePivot always selects the first element [worst case]: $O(n^2)$
- Best case: every recursive call chooses the median element of its subarray as its pivot. Running time: $O(n\log n)$, since
  $T(n)\leq 2T(\frac{n}{2})+O(n)\ \ \Rightarrow\ \ T(n)=O(n\log n)$ (master method)
- key question: how to choose pivots?
- big idea: random pivots
- Intuition:
  - if we always get a 25-75 split or better, that is good enough for $O(n\log n)$ running time (prove via recursion tree)
  - half of the elements give a 25-75 split or better
Quick Sort Theorem (Use Decomposition Principle)
Theorem: for every input array of length $n$, the average running time of Quick Sort (with random pivots) is $O(n\log n)$. [“average” is over the random choices made by the algorithm, i.e., pivot choices]
Proof.
Fix an input array A of length $n$.
Sample space $\Omega$ = all possible outcomes of random choices in Quick Sort (i.e., pivot sequences).
Key random variable: for $\sigma\in\Omega$,
$$
C(\sigma)=\text{# of comparisons between 2 input elements made by Quick Sort given random choices }\sigma
$$
Lemma. Running time of Quick Sort is dominated by comparisons.
$$
\Rightarrow \exists \text{ constant } d, \ \forall \sigma\in\Omega, \ RT(\sigma) \leq d\cdot C(\sigma)
$$
Remaining goal: prove $E[C]=O(n\log n)$.
Notation: $z_i$ = $i$th smallest element of A.
For $\sigma\in \Omega$ and indices $i<j$, let
$$
X_{ij}(\sigma)=\text{# of times } z_i, z_j \text{ get compared in Quick Sort with pivot sequence }\sigma
$$
$$ \Rightarrow X_{ij} = 0 \text{ or } 1 \text{ (indicator variable)} = \mathbb{1}(z_i, z_j \text{ get compared with }\sigma) $$
Then, $\forall \sigma$,
$$
C(\sigma) = \sum_{i=1}^{n-1}\sum_{j=i+1}^n X_{ij}(\sigma)
$$
By linearity of expectation,
$$
E[C] = \sum_{i=1}^{n-1}\sum_{j=i+1}^n E[X_{ij}]= \sum_{i=1}^{n-1} \sum_{j=i+1}^n\Pr(X_{ij}=1)\ \ \ (*)
$$
Key claim: $\forall i<j$, $\Pr(X_{ij}=1)=\Pr(z_i, z_j \text{ get compared})=\frac{2}{j-i+1}$
As long as none of $z_i,z_{i+1},\ldots,z_j$ has been chosen as a pivot, all of them get passed to the same recursive call.
Consider the first among $z_i,z_{i+1},\ldots,z_j$ that gets chosen as a pivot:
- if $z_i$ or $z_j$ gets chosen first, then $z_i$ and $z_j$ get compared
- otherwise, $z_i$ and $z_j$ are never compared
Note: since pivots are always chosen uniformly at random, each of $z_i,z_{i+1},\ldots,z_j$ is equally likely to be the first.
$$
\Rightarrow \Pr(z_i, z_j \text{ get compared}) = \frac{2}{j-i+1}
$$
$$ \Rightarrow E[C]=2\sum_{i=1}^{n-1}\sum_{j=i+1}^n \frac{1}{j-i+1}\leq 2n\sum_{k=2}^n\frac{1}{k}\leq 2n\log n $$
$$ \sum_{k=2}^n\frac{1}{k}=\sum_{k=2}^n\int_{k-1}^k\frac{1}{k}dx< \sum_{k=2}^n\int_{k-1}^k\frac{1}{x}dx=\int_{1}^n\frac{1}{x}dx=\log n. $$
Python Code
```python
from random import randint

def partition(arr, L, R):
    # partition arr[L..R] around the pivot arr[L]; return the pivot's final index
    pivot = arr[L]
    i = L + 1
    for j in range(L + 1, R + 1):
        if arr[j] < pivot:
            arr[j], arr[i] = arr[i], arr[j]
            i += 1
    arr[L], arr[i - 1] = arr[i - 1], arr[L]
    return i - 1

def quickSortHelper(arr, L, R):
    if R - L < 1:
        return
    pos = randint(L, R)                  # choose a random pivot
    arr[L], arr[pos] = arr[pos], arr[L]  # move it to the front
    pos = partition(arr, L, R)
    quickSortHelper(arr, L, pos - 1)
    quickSortHelper(arr, pos + 1, R)

def quickSort(arr):
    quickSortHelper(arr, 0, len(arr) - 1)
    return arr
```
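For example, `quickSort([3, 8, 2, 5, 1, 4, 7, 6])` returns `[1, 2, 3, 4, 5, 6, 7, 8]`; unlike `mergeSort` above, the array is rearranged in place.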
RSelect
DSelect
Graph
- 2 ingredients
  - vertices (aka nodes) $V$
  - edges $E$ = pairs of vertices
    - undirected = unordered pairs
    - directed = ordered pairs (aka arcs)
- examples: road networks, the web, social networks, precedence constraints, etc.
Graph Representation
- $n$: # of vertices, $m$: # of edges
- Max/min # of edges of a connected graph with $n$ vertices and no parallel edges:
  - min: $n-1$ (tree)
  - max: $\binom{n}{2}$ (complete graph)
- sparse/dense graph
  - sparse: $m\approx O(n)$
  - dense: $m\approx \Theta(n^2)$
Adjacency Matrix
- represent $G$ by an $n\times n$ matrix $A$
$$ A_{ij}=1 \Leftrightarrow G\text{ has an }i-j \text{ edge} $$
- variants:
  - $A_{ij}=$ # of $i-j$ edges (if parallel edges exist)
  - $A_{ij}=$ weight of the $i-j$ edge (if any)
  - $A_{ij}=\begin{cases}+1,&\text{if }i\rightarrow j\newline -1,&\text{if }i\leftarrow j \end{cases}$
- space requirement: $O(n^2)$
Adjacency List
- ingredients:
  1. array or list of vertices: $\Theta(n)$
  2. array or list of edges: $\Theta(m)$
  3. each edge points to its endpoints: $\Theta(m)$
  4. each vertex points to the edges incident on it: $\Theta(m)$ (3 $\leftrightarrow$ 4, one-to-one correspondence)
- space requirement: $\Theta(n+m) = \Theta(\max\{m,n\})$
- Which representation is better? Depends on graph density and the operations needed.
  - Adjacency list: perfect for graph search
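In Python, a dictionary mapping each vertex to a list of its neighbors is a simple adjacency-list encoding; it is also the format expected by the contraction-algorithm code at the end of these notes. A minimal sketch (the example graph is arbitrary):

```python
# a square with one diagonal: 4 vertices, 5 edges
g = {
    1: [2, 3],
    2: [1, 3, 4],
    3: [1, 2, 4],
    4: [2, 3],
}
n = len(g)                                  # 4 vertices
m = sum(len(adj) for adj in g.values())//2  # 5 edges (each counted twice)
```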
Minimum Cuts Problem
- A cut of a graph $(V,E)$ is a partition of $V$ into 2 nonempty sets $A$ and $B$.
- The crossing edges of a cut $(A,B)$ are those with:
  - one endpoint in each of $A,B$ (undirected)
  - tail in $A$, head in $B$ (directed)
- How many cuts does a graph with $n$ vertices have? $2^n-2$
- Minimum Cut:
  - Input: undirected graph $G=(V,E)$ [parallel edges allowed]
  - Output: a cut with the fewest crossing edges
- Applications of min cuts:
  - identify network bottlenecks/weaknesses
  - community detection in social networks (communities are highly interconnected amongst themselves, but weakly connected to the rest of the graph)
  - image segmentation:
    - input: a graph of pixels
    - edge weights: $(u,v)$ has large weight $\Leftrightarrow$ we “expect” $u,v$ to come from the same object
    - hope: repeated min cuts identify the primary objects in the picture
Random Contraction Algorithm
While more than 2 vertices remain:
- pick a remaining edge $(u,v)$ uniformly at random
- merge (or “contract”) $u,v$ into a single vertex
- remove self-loops
Return the cut represented by the final 2 vertices.
- Key question: probability of success?
Fix a graph $G=(V,E)$ with $n$ vertices and $m$ edges.
Fix a minimum cut $(A,B)$.
Let $k$ = # of edges crossing $(A,B)$. Call these edges $F$.
- Suppose an edge of $F$ is contracted at some point $\Rightarrow$ the algorithm will not output $(A,B)$.
- Suppose only edges inside $A$ or inside $B$ get contracted $\Rightarrow$ the algorithm will output $(A,B)$.
Thus,
$$
\Pr(\text{output is } (A,B)) = \Pr(\text{never contract an edge of }F)
$$
Let $S_i$ = the event that an edge of $F$ is contracted in iteration $i$.
Goal: compute $\Pr\left(\overline{S_1}\cap \overline{S_2}\cap\cdots\cap \overline{S_{n-2}}\right)$
In the first iteration,
$$
\Pr(S_1)=\frac{\text{# of crossing edges}}{\text{# of edges}}=\frac{k}{m}
$$
Key observation: the degree of each vertex is $\geq k$.
Reason: each vertex $v$ defines a cut $(\{v\}, V-\{v\})$, which has at least $k$ crossing edges since $(A,B)$ is a min cut.
Since $\sum_{v\in V}\deg(v)=2m$,
$$
2m\geq kn\Rightarrow \Pr(S_1)\leq \frac{2}{n}.
$$
$$ \begin{aligned} \Pr(\overline{S_1}\cap\overline{S_2})&=\Pr(\overline{S_2}\vert \overline{S_1})\Pr(\overline{S_1})\newline &\geq \left[ 1-\Pr(S_2\vert \overline{S_1}) \right]\left(1-\frac{2}{n}\right)\newline &= \left[ 1-\frac{k}{\text{# of remaining edges}} \right]\left(1-\frac{2}{n}\right) \end{aligned} $$
Note: every node in the contracted graph defines a cut of $G$ (with at least $k$ crossing edges)
$\Rightarrow$ all degrees in the contracted graph are $\geq k$
$\Rightarrow$ # of remaining edges $\geq \frac{1}{2}k(n-1)$
$\Rightarrow \Pr(\overline{S_2}\vert \overline{S_1})\geq 1-\frac{2}{n-1}$
In general, $$ \begin{aligned} \Pr\left(\overline{S_1}\cap \overline{S_2}\cap\cdots\cap \overline{S_{n-2}}\right)&= \Pr(\overline{S_1})\Pr(\overline{S_2}|\overline{S_1})\Pr(\overline{S_3}|\overline{S_1}\cap\overline{S_2}) \cdots\Pr(\overline{S_{n-2}}|\overline{S_1}\cap\overline{S_2}\cap\cdots\cap\overline{S_{n-3}})\newline &\geq \left(1-\frac{2}{n}\right) \left(1-\frac{2}{n-1}\right) \cdots \left(1-\frac{2}{4}\right) \left(1-\frac{2}{3}\right)\newline &=\frac{n-2}{n} \frac{n-3}{n-1} \frac{n-4}{n-2} \cdots \frac{2}{4} \frac{1}{3}\newline &=\frac{2}{n(n-1)}=\frac{1}{\binom{n}{2}}\geq \frac{1}{n^2} \end{aligned} $$
Run the basic algorithm $N$ times (for large $N$) and remember the smallest cut found.
Let $T_i$ = the event that the min cut $(A,B)$ is found on the $i$th trial.
By definition, the $T_i$ are i.i.d.
$$
\begin{aligned}
\Pr(\text{all trials fail}) &=\Pr(\overline{T_1}\cap\overline{T_2}\cap\cdots\overline{T_N})\newline
&=\prod_{i=1}^N\Pr(\overline{T_i})=\left(1-\frac{1}{n^2}\right)^N
\end{aligned}
$$
Using $\forall x\in\mathbb{R},\ 1+x\leq e^x$:
- with $N=n^2$ trials,
$$
\Pr(\text{all trials fail})\leq e^{-\frac{1}{n^2}\cdot n^2}=\frac{1}{e}
$$
- with $N=n^2\log n$ trials,
$$
\Pr(\text{all trials fail})\leq e^{-\frac{1}{n^2}\cdot n^2\log n}=\frac{1}{n}
$$
Running time: $\Omega(n^2m)$ [$n^2$ trials, each taking $O(m)$ time]
Still better than the brute-force method's $O(2^n)$.
Counting Minimum Cuts
Note: a graph can have multiple min cuts.
- e.g., a tree with $n$ vertices has $n-1$ min cuts
Claim. The largest # of min cuts that a graph with $n$ vertices can have is $\binom{n}{2}$.
- lower bound: the $n$-cycle
  - each pair of its $n$ edges defines a distinct min cut $\Rightarrow \geq\binom{n}{2}$ min cuts
- upper bound
  - Let $\{(A_j,B_j)\}_{j=1}^J$ be the min cuts of a graph with $n$ vertices.
  - Let $E_j$ = the event that the output is $(A_j,B_j)$.
  - By the contraction algorithm's analysis,
$$ \Pr(\text{output}=(A_j,B_j))\geq\frac{1}{\binom{n}{2}} $$
  - Since the events $E_j$ are disjoint, their probabilities sum to at most 1 $\Rightarrow J\leq\binom{n}{2}$.
Python Code
```python
import random

def findMinCut(g):
    """
    input: graph g,
    adjacency list representation using a dictionary
        keys: nodes
        values: list of adjacent nodes (parallel edges appear multiple times)
    """
    if len(g) < 3:
        # 2 vertices left: the cut size is the # of remaining (parallel) edges
        return len(g[list(g.keys())[0]])
    v1, v2 = chooseEdge(g)
    contract(g, v1, v2)
    return findMinCut(g)

# edge might not be uniformly chosen
def chooseEdge(g):
    """choose a random edge (v1, v2)"""
    v1 = random.choice(list(g.keys()))
    v2 = random.choice(g[v1])
    return v1, v2

def contract(g, v1, v2):
    """merge node v2 into v1"""
    g[v1].extend(g[v2])
    for k in g[v2]:
        g[k] = [v1 if x == v2 else x for x in g[k]]
    del g[v2]
    g[v1] = [x for x in g[v1] if x != v1]  # remove self-loops
```
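The test below needs a concrete graph `g`; reusing the square-with-one-diagonal example from the adjacency-list section (min cut = 2):

```python
g = {
    1: [2, 3],
    2: [1, 3, 4],
    3: [1, 2, 4],
    4: [2, 3],
}
```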
```python
from copy import deepcopy

mincuts = []
for i in range(10):
    mincuts.append(findMinCut(deepcopy(g)))
print("min cuts = ", min(mincuts))
```