DoppelBot: Replace Your CEO with an LLM

(quick links: add to your own Slack; source code)

Initial versions of the model were prone to generating short responses, which is unsurprising because most Slack communication is terse.

At inference time, loading the model with a user’s LoRA adapter takes 15-20s, so it’s important to avoid doing this for every incoming request.

app_mention: When the bot is mentioned in a channel, we retrieve the recent messages from that thread, do some basic cleaning, and call the user’s model to generate a response.
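
One way to avoid paying the 15-20s load on every request is to keep recently used per-user models in memory, and to have the app_mention handler go through that cache. The sketch below is an illustration under stated assumptions, not the article's actual implementation: `AdapterCache`, `clean_messages`, `handle_app_mention`, and the `loader` callable are hypothetical names, the model is treated as a plain function from prompt to reply, and the "basic cleaning" shown here (stripping Slack mention tokens and dropping empty messages) is a guess at what such a step might include.

```python
import re
from collections import OrderedDict
from typing import Callable, Dict, List

# Hypothetical model type: a function from prompt text to a reply.
Model = Callable[[str], str]


class AdapterCache:
    """Keep the most recently used per-user models in memory (LRU eviction)."""

    def __init__(self, loader: Callable[[str], Model], max_size: int = 4):
        self._loader = loader      # assumed slow: base model + user's LoRA adapter
        self._max_size = max_size
        self._models: "OrderedDict[str, Model]" = OrderedDict()

    def get(self, user_id: str) -> Model:
        if user_id in self._models:
            # Cache hit: mark as most recently used and skip the slow load.
            self._models.move_to_end(user_id)
            return self._models[user_id]
        model = self._loader(user_id)  # slow path, taken only on a miss
        self._models[user_id] = model
        if len(self._models) > self._max_size:
            self._models.popitem(last=False)  # evict the least recently used
        return model


# Slack encodes user mentions in message text as tokens like <@U123ABC>.
MENTION = re.compile(r"<@[A-Z0-9]+>")


def clean_messages(messages: List[Dict[str, str]]) -> List[str]:
    """Strip mention tokens, drop empty messages, and prefix each with its author."""
    cleaned = []
    for msg in messages:
        text = MENTION.sub("", msg.get("text", "")).strip()
        if text:
            cleaned.append(f"{msg.get('user', 'unknown')}: {text}")
    return cleaned


def handle_app_mention(
    cache: AdapterCache, target_user: str, thread_messages: List[Dict[str, str]]
) -> str:
    """Build a prompt from the thread and generate a reply with the user's model."""
    prompt = "\n".join(clean_messages(thread_messages))
    model = cache.get(target_user)
    return model(prompt)
```

In a real deployment, `loader` would wrap whatever code loads the base model and applies the user's adapter, and `thread_messages` would come from the Slack API. `max_size` trades memory for fewer reloads and would need tuning to the hardware.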