Get the latest tech news
Alice's adventures in a differentiable wonderland
My personal website, where I collect slides, publications, and presentations.
Neural networks surround us, in the form of large language models, speech transcription systems, molecular discovery algorithms, robotics, and much more. I overview the basics of optimizing a function via automatic differentiation, and a selection of the most common designs for handling sequences, graphs, texts, and audios. The focus is on a intuitive, self-contained introduction to the most important design techniques, including convolutional, attentional, and recurrent blocks, hoping to bridge the gap between theory and code (PyTorch and JAX) and leaving the reader capable of understanding some of the most advanced models out there, such as large language models (LLMs) and multimodal architectures.
Or read this on Hacker News