My team is open-sourcing the inference stack and fine-tuned models we use to create LLM-powered NPCs: https://github.com/GigaxGames/gigax
The generative agents paper [1] pioneered the idea of prompting LLMs to create autonomous NPCs. But existing implementations require multiple calls to an LLM to make the agent plan its day, chat with people, and interact with its environment [2].
Our approach allows NPCs to be stepped at runtime with a single inference pass on consumer-grade hardware, with reasonable latency.
To achieve this, we've fine-tuned open-source LLMs [3] to parse a text description of a 3D scene and respond with custom actions like `greet <someone>`, `grab <item>`, or `say <utterance>`. This simple whitespace-separated "function calling" format is less verbose than JSON, which helps with inference speed. We're also using the Outlines library [4] to force the model to adhere to this format.
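To give a feel for the action format, here's a minimal sketch of how a game client might validate and parse one line of model output. The three command names come from the examples above; the argument patterns, the `Action` class, and the `parse_action` helper are hypothetical and not taken from the Gigax repo. A regex like `ACTION_REGEX` is the kind of constraint a structured-generation library can enforce at decoding time, but the library call itself is left out here.

```python
import json
import re
from dataclasses import dataclass

# Assumed argument shapes for the three example commands from the post.
# The real models may support a different or larger action set.
ARG_PATTERNS = {
    "greet": r"\S+",  # a single name, no spaces
    "grab": r"\S+",   # a single item identifier
    "say": r".+",     # free text up to the end of the line
}

# One alternative per command: "<command> <argument>".
ACTION_REGEX = "|".join(
    f"{name} (?:{arg})" for name, arg in ARG_PATTERNS.items()
)


@dataclass
class Action:
    command: str
    argument: str


def parse_action(line: str) -> Action:
    """Validate one line of model output and split it into command and argument."""
    line = line.strip()
    if re.fullmatch(ACTION_REGEX, line) is None:
        raise ValueError(f"not a valid action: {line!r}")
    command, argument = line.split(" ", 1)
    return Action(command, argument)


# Rough size comparison with an equivalent JSON payload.
# Character count is only a proxy for token count.
text_form = "grab sword"
json_form = json.dumps({"command": "grab", "argument": "sword"})
print(len(text_form), len(json_form))
```

Calling `parse_action("say Hello there, traveler!")` returns an `Action` with command `say` and the rest of the line as its argument, while something like `parse_action("dance")` raises `ValueError` because it isn't in the assumed vocabulary.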
We're launching an API and we're looking for partnerships with studios to integrate our tech into upcoming games. Would love to connect if this is of any interest to you!
Thanks in advance for your feedback :)
[1] https://arxiv.org/abs/2304.03442
[2] https://github.com/joonspk-research/generative_agents/tree/m...
[3] https://huggingface.co/Gigax (phi-3 fine-tune)
[4] https://github.com/outlines-dev/outlines/tree/main