An experiment using tiny LLMs as NPCs that could be embedded into a game.

Embed models into the game, build a simple vector database from text, embed
prompts, retrieve the top-k matches by cosine similarity, and feed that context
into tiny CPU LLMs for NPC interactions.

**No external API calls.** Everything is local, directly using GGUF models
and [llama.cpp](https://github.com/ggml-org/llama.cpp) for inference.

https://github.com/user-attachments/assets/863b75eb-0da7-4235-8112-f00bc82d81f6

> [!NOTE]
> This project is just for fun, to see how LLMs would fare as NPCs. Because of
> the non-deterministic nature of LLMs, the results vary and are often quite
> funny. A lot of tweaking would be needed to make this truly useful in real
> games, but it is not impossible.

Goals of the experiment:

- Run the LLMs on CPU only; small LLMs were chosen for this experiment so that
  they can be embedded in other games.
- Produce a simple C library that can be reused elsewhere.
- Test existing small and tiny LLMs and provide some useful results on how they
  behave.

## Getting started

1. Build dependencies and binaries:
   ```bash
   make build/llama.cpp
   make run/fetch-models
   make build/context
   make build/game
   ```

2. Build a vector context database:
   ```bash
   ./context -m qwen3 -i corpus/map1_keldor.txt -o corpus/map1_keldor.vdb
   ```

3. Run the game:
   ```bash
   ./game -m phi-4-mini-instruct -e qwen3
   ```

## Building

### Prerequisites

- C compiler (gcc/clang)
- CMake
- Docker (optional, for containerized use of binaries)

### Build Steps

1. Build llama.cpp libraries:
   ```bash
   make build/llama.cpp
   ```

2. Download models:
   ```bash
   make run/fetch-models
   ```

3. Build binaries:
   ```bash
   make build/context
   make build/prompts
   make build/npc
   make build/game
   ```

## Usage

### Build a vector context database

`context` reads a text file (one document per line), embeds each line, and
produces a binary vector database file. For best results, use a dedicated
embedding model (for example, `qwen3`) even if you generate answers with a
different model.

```bash
./context -m qwen3 -i corpus/map1_keldor.txt -o corpus/map1_keldor.vdb
```
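
The input file is plain text with one self-contained fact per line. The lines
below are invented purely for illustration; they are not the contents of the
real `corpus/map1_keldor.txt`:

```text
Keldor is an old fisherman who has lived beside the marsh his whole life.
Keldor believes the marsh lights are the lanterns of drowned travelers.
```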

### Run an NPC query with retrieved context

`npc` loads a vector database, embeds the prompt, selects the top 5 matching
lines by cosine similarity, and runs the NPC system prompt against that context.
You can pass a separate embedding model with `-e`/`--embed-model`.

```bash
./npc -m phi-4-mini-instruct -e qwen3 -p "Who is Keldor?" -c corpus/map1_keldor.vdb
./npc -m qwen3 -e qwen3 -p "What does Keldor believe about the marsh lights?" -c corpus/map1_keldor.vdb
```
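
Retrieval scores each stored document against the prompt embedding with cosine
similarity, `dot(a, b) / (|a| * |b|)`, and keeps the five best. A minimal C
sketch of that scoring step (the function name and signature are illustrative,
not the project's actual API):

```c
#include <math.h>
#include <stddef.h>

/* Cosine similarity of two embedding vectors of length n.
 * Returns 0 if either vector is all zeros. */
static float cosine_similarity(const float *a, const float *b, size_t n) {
    float dot = 0.0f, na = 0.0f, nb = 0.0f;
    for (size_t i = 0; i < n; i++) {
        dot += a[i] * b[i];
        na  += a[i] * a[i];
        nb  += b[i] * b[i];
    }
    if (na == 0.0f || nb == 0.0f)
        return 0.0f;
    return dot / (sqrtf(na) * sqrtf(nb));
}
```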

### Run the game

The game uses the same models and retrieval pipeline, with short NPC replies.

```bash
./game -m phi-4-mini-instruct -e qwen3
```

### context options

| Flag | Description |
|------|-------------|
| `-m, --model` | Embedding model to use (default: first model in config) |
| `-i, --in` | Input context text file (required) |
| `-o, --out` | Output vector database file (required) |
| `-l, --list` | List available models |
| `-v, --verbose` | Enable llama.cpp logging |
| `-h, --help` | Show help message |

### npc options

| Flag | Description |
|------|-------------|
| `-m, --model` | Model to use (required) |
| `-e, --embed-model` | Embedding model to use (optional) |
| `-p, --prompt` | Prompt text (required) |
| `-c, --context` | Context vector database file (.vdb) (required) |
| `-l, --list` | List available models |
| `-v, --verbose` | Enable llama.cpp logging |
| `-h, --help` | Show help message |

### game options

| Flag | Description |
|------|-------------|
| `-m, --model` | Model to use (default: first model in config) |
| `-e, --embed-model` | Embedding model to use (optional) |
| `-v, --verbose` | Enable llama.cpp logging |
| `-h, --help` | Show help message |

## Models

Configure models in `models.h`. The default model is the first entry in the
`models` array; each entry points at a local GGUF file under `models/`.
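
As a rough sketch of what such an entry could look like (the struct fields,
entry names, and file paths here are assumptions for illustration; the real
definitions live in `models.h`):

```c
/* Illustrative only -- the actual struct and entries are defined in
 * models.h and may use different field names and paths. */
typedef struct {
    const char *name; /* name passed to -m / -e, e.g. "qwen3" */
    const char *path; /* local GGUF file under models/ */
} model_entry;

static const model_entry models[] = {
    /* The first entry acts as the default model. */
    { "qwen3",               "models/qwen3.gguf" },
    { "phi-4-mini-instruct", "models/phi-4-mini-instruct.gguf" },
};
```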

## Vector database format

`context` produces a binary file with a fixed header and a contiguous list of
documents. The header includes a magic value, version, embedding size, maximum
text length, and document count. Each document stores the original text (fixed
size `VDB_MAX_TEXT`) and its embedding (`VDB_EMBED_SIZE`).
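
In C terms, the layout might look roughly like the sketch below; the integer
widths, field order, and placeholder constant values are assumptions, and the
project's own headers are authoritative:

```c
#include <stdint.h>

/* Placeholder values -- the real constants are defined by the project. */
#define VDB_MAX_TEXT   256
#define VDB_EMBED_SIZE 1024

/* Fixed file header: magic value, version, embedding size, maximum text
 * length, and document count, as described above. */
typedef struct {
    uint32_t magic;
    uint32_t version;
    uint32_t embed_size;
    uint32_t max_text;
    uint32_t doc_count;
} vdb_header;

/* One document: the original text (NUL-padded to VDB_MAX_TEXT) followed
 * by its embedding. Documents are stored contiguously after the header. */
typedef struct {
    char  text[VDB_MAX_TEXT];
    float embedding[VDB_EMBED_SIZE];
} vdb_document;
```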

## Docker

```bash
make run/docker
```

Builds a Docker image and runs an interactive shell with the binaries and
models under `/app/`.

## Cleaning

```bash
make run/clean
```

## Reading material

- https://www.tinyllm.org/
- https://en.wikipedia.org/wiki/Cosine_similarity