Get the latest tech news
Generating Pixels One by One
Your First Autoregressive Image Generation Model
We’ll build a basic autoregressive model using a simple MLP to generate images ofhandwritten digits, focusing on understanding the core concept of predicting the next pixel based on its predecessors. This naturally leads us to our next and final refinement for this MLP-based autoregressive model: What if we replace the one-hot encoding of pixel tokens with learned, dense vector representations (embeddings)? Model V1, our simplest MLP using only one hot encoded pizel valus without any spatial awarenes, prodiced results,that, as seen in the generated images, were largely abstract and dominated by local, repetitive patterns like horizontal streaks.
Or read this on Hacker News