The super effectiveness of Pokémon embeddings using only raw JSON and images
Embeddings encourage engineers to go full YOLO because it’s actually rewarding to do so!
Due to the quadratic scaling at high input token counts, this is still very computationally intensive despite the optimization tricks: for the 1,302 embeddings, it took about a half-hour on a Google Colab T4 GPU.
A couple years ago I hacked together a Python package named imgbeddings which uses OpenAI’s CLIP to generate the embeddings, albeit with mixed results.
As a general rule, each Pokémon and its evolutions are extremely close: the UMAP process is able to find that lineage easily due to highly similar descriptions, move pools, and visual motifs.
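To make "extremely close" concrete, here is a minimal sketch of a nearest-neighbor lookup by cosine similarity. It is an illustration under loud assumptions: the four-dimensional vectors are made up for this example and are not real embeddings from the post, and the helper names `cosine_similarity` and `nearest_neighbor` are hypothetical. It also skips UMAP entirely and compares the vectors directly, which is a simpler stand-in for the idea that an evolution line ends up clustered together.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up toy vectors (NOT real embeddings): members of the same
# evolution line are deliberately given similar values.
toy_embeddings = {
    "Charmander": [0.90, 0.10, 0.20, 0.00],
    "Charmeleon": [0.85, 0.15, 0.25, 0.05],
    "Squirtle": [0.10, 0.90, 0.20, 0.00],
    "Wartortle": [0.15, 0.85, 0.25, 0.05],
}


def nearest_neighbor(name, embeddings):
    """Return (other_name, similarity) for the most similar other entry."""
    query = embeddings[name]
    candidates = (
        (other, cosine_similarity(query, vector))
        for other, vector in embeddings.items()
        if other != name
    )
    return max(candidates, key=lambda pair: pair[1])


for name in toy_embeddings:
    neighbor, score = nearest_neighbor(name, toy_embeddings)
    print(f"{name} -> {neighbor} (cosine similarity {score:.3f})")
```

With real embeddings the same lookup would run over vectors with hundreds of dimensions, and a projection such as UMAP would only be needed to draw them in two dimensions.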