How does ChatGPT work: An in-depth look for Programmers
A bit about what's going on behind the scenes, in case you don't know yet
We are going to look at its software architecture, its use of advanced transformer networks, and how Reinforcement Learning from Human Feedback (RLHF) fine-tunes the model for conversational tasks. After the attention layer identifies the relevant words, a feedforward network refines the output by applying learned weights (i.e., trained parameters: numerical values adjusted during the model’s optimization process). During training, the system checks how far off its “guess” was (this is called the “loss” value in machine learning), works backward through its calculations, and adjusts itself to do better next time.
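To make the excerpt concrete, here is a minimal sketch in plain Python of the three ideas it mentions: an attention step that weights the relevant inputs, a feedforward layer that applies learned weights, and a training step that measures the loss and works backward to adjust a weight. This is a toy illustration, not ChatGPT's actual code. The function names (`attention`, `feedforward`, `train_step`), the tiny vectors, and the one-weight model `y = weight * x` are all assumptions made up for this example.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    d = len(query)
    # Score each key by how well it matches the query.
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)
    # Blend the value vectors according to those weights.
    mixed = [sum(w * v[i] for w, v in zip(weights, values))
             for i in range(len(values[0]))]
    return weights, mixed


def feedforward(x, w, b):
    """One linear layer followed by ReLU; w is a list of weight rows."""
    return [max(0.0, sum(wi * xi for wi, xi in zip(row, x)) + bi)
            for row, bi in zip(w, b)]


def train_step(weight, x, target, lr=0.1):
    """One gradient-descent step for the toy model y = weight * x."""
    guess = weight * x
    loss = (guess - target) ** 2       # how far off the guess was
    grad = 2 * (guess - target) * x    # d(loss)/d(weight) via the chain rule
    return weight - lr * grad, loss    # adjust the weight to do better


# Hypothetical inputs: three "words", each with a 2-dimensional key and value.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
query = [1.0, 1.0]

weights, mixed = attention(query, keys, values)
hidden = feedforward(mixed, w=[[1.0, -1.0], [0.5, 0.5]], b=[0.0, 0.1])

# Repeatedly guess, measure the loss, and adjust the single weight.
weight = 0.0
losses = []
for _ in range(20):
    weight, loss = train_step(weight, x=2.0, target=6.0)
    losses.append(loss)

print("attention weights:", weights)
print("feedforward output:", hidden)
print("learned weight:", weight, "final loss:", losses[-1])
```

In a real transformer the same pattern runs over large matrices, with many attention heads and layers, and the gradients for all parameters are computed automatically by the training framework instead of by hand.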