An experiment using tiny LLMs as NPCs that could be embedded into the game.

> [!NOTE]
> This project is just for fun, to see how LLMs would fare as NPCs. Because of
> the non-deterministic nature of LLMs, the results vary and are often quite
> funny. A lot of tweaking would be needed to make this really useful in real
> games, but not impossible.

Goals of the experiment:

- Have LLM be run only on CPU, this is why small LLMs have been chosen in this
  experiment, so they can be used in other games.
- To produce a simple C library that can be reused elsewhere.
- Test existing small and tiny LLMs and provide some useful results on how they
  behave.

## Getting started

1. Build dependencies and binaries:
   ```bash
   make build/llama.cpp
   make run/fetch-models
   make build/context
   make build/game
   ```

2. Build a vector context database:
   ```bash
   build/corpus
   ```

3. Run the game:
   ```bash
   ./game -m phi-4-mini-instruct -e qwen3
   ```

## Building

### Prerequisites

- C compiler (gcc/clang)
- CMake
- Docker (optional, for containerized use of binaries)

### Build Steps

1. Build llama.cpp libraries:
   ```bash
   make build/llama.cpp
   ```

2. Download models:
   ```bash
   make run/fetch-models
   ```

3. Build binaries:
   ```bash
   make build/context
   make build/prompts
   make build/npc
   make build/game
   ```

## Usage

### Build a vector context database

`context` reads a text file (one document per line), embeds each line, and
produces a binary vector database file. For best results, use a dedicated
embedding model (for example, `qwen3`) even if you generate answers with a
different model.

```bash
./context -m qwen3 -i corpus/map1_keldor.txt -o corpus/map1_keldor.vdb
```

### Run an NPC query with retrieved context

`npc` loads a vector database, embeds the prompt, selects the top 5 matching
lines by cosine similarity, and runs the NPC system prompt against that context.
You can pass a separate embedding model with `-e`/`--embed-model`.

```bash
./npc -m phi-4-mini-instruct -e qwen3 -p "Who is Keldor?" -c corpus/map1_keldor.vdb
./npc -m qwen3 -e qwen3 -p "What does Keldor believe about the marsh lights?" -c corpus/map1_keldor.vdb
```

### Run the game

The game uses the same models and retrieval pipeline, with short NPC replies.

```bash
./game -m phi-4-mini-instruct -e qwen3
```

### context options

| Flag | Description |
|------|-------------|
| `-m, --model` | Embedding model to use (default: first model in config) |
| `-i, --in` | Input context text file (required) |
| `-o, --out` | Output vector database file (required) |
| `-l, --list` | List available models |
| `-v, --verbose` | Enable llama.cpp logging |
| `-h, --help` | Show help message |

### npc options

| Flag | Description |
|------|-------------|
| `-m, --model` | Model to use (required) |
| `-e, --embed-model` | Embedding model to use (optional) |
| `-p, --prompt` | Prompt text (required) |
| `-c, --context` | Context vector database file (.vdb) (required) |
| `-l, --list` | List available models |
| `-v, --verbose` | Enable llama.cpp logging |
| `-h, --help` | Show help message |

### game options

| Flag | Description |
|------|-------------|
| `-m, --model` | Model to use (default: first model in config) |
| `-e, --embed-model` | Embedding model to use (optional) |
| `-v, --verbose` | Enable llama.cpp logging |
| `-h, --help` | Show help message |

## Models

Configure models in `models.h`. The default model is the first entry in the
`models` array; each entry points at a local GGUF file under `models/`.

## Vector database format

`context` produces a binary file with a fixed header and a contiguous list of
documents. The header includes a magic value, version, embedding size, maximum
text length, and document count. Each document stores the original text (fixed
size `VDB_MAX_TEXT`) and its embedding (`VDB_EMBED_SIZE`).

## Docker

```bash
make run/docker
```

Builds a Docker image and runs an interactive shell with the binaries and
models under `/app/`.

## Cleaning

```bash
make run/clean
```

## Reading material

- https://www.tinyllm.org/
- https://en.wikipedia.org/wiki/Cosine_similarity