Valentin Velev

MSc Data Science Student | Researcher | Aspiring Data Scientist

Building GPT-2 from scratch using NumPy

The Generative Pre-trained Transformer 2 (GPT-2), introduced in Radford et al. (2019), is a popular open-source pre-trained language model (PLM). In this blog post, I will explain GPT-2's architecture and show how it can be implemented from scratch in Python using only NumPy.
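To give a flavor of what "from scratch with NumPy" means, here is a minimal sketch of the masked (causal) scaled dot-product attention at the heart of GPT-2. The function names, shapes, and toy data below are illustrative assumptions for this preview, not the implementation developed later in the post.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)

def causal_attention(q, k, v):
    # q, k, v: arrays of shape (seq_len, d_k)
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)                 # (seq_len, seq_len)
    # Causal mask: each token may only attend to itself and earlier tokens
    mask = np.triu(np.ones_like(scores, dtype=bool), k=1)
    scores = np.where(mask, -1e9, scores)
    return softmax(scores) @ v                      # (seq_len, d_k)

# Toy usage with random data (4 tokens, 8-dimensional embeddings)
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
out = causal_attention(x, x, x)
print(out.shape)  # (4, 8)
```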