Math Behind Transformers and LLMs


About

My goal is to give a brief introduction to the current state of large language models, the OpenGPT-X project, and the transformer neural network architecture for readers unfamiliar with the subject. I recently moved from numerical linear algebra, where I developed algorithms for solving structured eigenvalue problems, to natural language processing with a focus on high performance computing. For these workloads the computational device of choice is typically the GPU, thanks to the massive parallelism it provides and hardware features that make it extremely efficient at matrix multiplication.
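
To make the matrix-multiplication point concrete, here is a minimal NumPy sketch of scaled dot-product attention, the core operation inside a transformer layer. The function name, shapes, and pure-NumPy implementation are my own illustration, not code from the article; they simply show that the computation reduces to a few dense matrix products, which is exactly the workload GPUs accelerate.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q, K, V: arrays of shape (sequence_length, d_model)."""
    d_k = Q.shape[-1]
    # Similarity of every query with every key: one (seq x seq) matmul.
    scores = Q @ K.T / np.sqrt(d_k)
    # Row-wise softmax turns similarities into attention weights.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Weighted combination of the value vectors: a second matmul.
    return weights @ V

# Tiny illustrative example (sizes are arbitrary).
rng = np.random.default_rng(0)
seq_len, d_model = 4, 8
Q = rng.standard_normal((seq_len, d_model))
K = rng.standard_normal((seq_len, d_model))
V = rng.standard_normal((seq_len, d_model))
print(scaled_dot_product_attention(Q, K, V).shape)  # (4, 8)
```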
