In this lecture, we introduce the conjugate gradient method (CGM) for solving $\mathbf{A}\mathbf{x} = \mathbf{b}$ where $\mathbf{A}$ is a symmetric positive definite (s.p.d.) matrix. Our motivation starts from the Steepest Descent (SD) method introduced in our previous lecture, which admits ${\mathcal{O}\left(\kappa(\mathbf{A}) \log \tfrac{1}{\epsilon}\right)}$ iteration complexity for achieving $\left\| \mathbf{x}_{t} - \mathbf{x}^{*} \right\| \le \epsilon$. We will see how conjugate search directions constructed from the residuals improve upon SD and lead to CGM, which admits ${\mathcal{O}}\left(\sqrt{\kappa(\mathbf{A})}\log \tfrac{1}{\epsilon}\right)$ iteration complexity. Much of this note is adapted from the excellent tutorials (Shewchuk & others, 1994) and (Kelley, 1995).
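To make the gap between the $\kappa(\mathbf{A})$ and $\sqrt{\kappa(\mathbf{A})}$ dependencies concrete, here is a minimal numerical sketch (not part of the original note) that runs both methods on an ill-conditioned s.p.d. system and compares their iteration counts. The matrix construction, dimensions, tolerance, and NumPy usage are illustrative assumptions, not choices made in the lecture.

```python
import numpy as np

def steepest_descent(A, b, tol=1e-8, max_iter=100_000):
    """Steepest Descent: step along the residual r = b - A x with exact line search."""
    x = np.zeros_like(b)
    r = b - A @ x
    for k in range(1, max_iter + 1):
        alpha = (r @ r) / (r @ (A @ r))      # exact line search along r
        x = x + alpha * r
        r = b - A @ x
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            return x, k
    return x, max_iter

def conjugate_gradient(A, b, tol=1e-8, max_iter=100_000):
    """Conjugate Gradient: search directions are A-conjugate, built from residuals."""
    x = np.zeros_like(b)
    r = b - A @ x
    p = r.copy()
    rs_old = r @ r
    for k in range(1, max_iter + 1):
        Ap = A @ p
        alpha = rs_old / (p @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) <= tol * np.linalg.norm(b):
            return x, k
        p = r + (rs_new / rs_old) * p        # conjugation constructed from residuals
        rs_old = rs_new
    return x, max_iter

# Illustrative s.p.d. test matrix with prescribed condition number kappa.
rng = np.random.default_rng(0)
n, kappa = 200, 1e3
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
eigs = np.geomspace(1.0, kappa, n)
A = Q @ np.diag(eigs) @ Q.T
b = rng.standard_normal(n)

_, it_sd = steepest_descent(A, b)
_, it_cg = conjugate_gradient(A, b)
print(f"kappa = {kappa:.0e}: SD took {it_sd} iterations, CG took {it_cg}")
```

On such a system one typically observes SD requiring on the order of $\kappa(\mathbf{A})$ iterations while CG needs far fewer, consistent with the $\sqrt{\kappa(\mathbf{A})}$ bound (and, in exact arithmetic, CG terminates in at most $n$ steps).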
First, we recall some notation and review the Steepest Descent (SD) method from our previous lecture. We then comment on why a better iterative method might be possible.