High-throughput LLM serving engine. The default choice for self-hosting open-source models at scale.