Transformer architecture

In this lecture, we introduce the transformer architecture, the most widely used model architecture in modern NLP.

Lecture 05 - Conjugate Gradient Method (CGM)

In this lecture, we introduce the conjugate gradient method (CGM) for solving systems of linear equations.

My First Post

This post is built on Gregory Gundersen's blog theme.