author    Mitja Felicijan <mitja.felicijan@gmail.com>  2026-02-20 13:54:21 +0100
committer Mitja Felicijan <mitja.felicijan@gmail.com>  2026-02-20 13:54:21 +0100
commit    306c3cb6924c6231c102ff7d75aa3f68e3618ca2 (patch)
tree      1a41c8c4b70b43796cc3fc14f0c9e52b39651e2f /README.md
parent    201bbf3e917066fb05ff1f10f7166d262b8ed2cf (diff)
download  llmnpc-306c3cb6924c6231c102ff7d75aa3f68e3618ca2.tar.gz
Update to multi model for embeddings and prompting
Diffstat (limited to 'README.md')
-rw-r--r--  README.md  34
1 file changed, 28 insertions(+), 6 deletions(-)
diff --git a/README.md b/README.md
index 7c5e7f7..6619175 100644
--- a/README.md
+++ b/README.md
@@ -39,6 +39,7 @@ Goals of the experiment:
make build/context
make build/prompts
make build/npc
+ make build/game
```
## Usage
@@ -46,21 +47,31 @@ Goals of the experiment:
### Build a vector context database
`context` reads a text file (one document per line), embeds each line, and
-produces a binary vector database file.
+produces a binary vector database file. For best results, use a dedicated
+embedding model (for example, `qwen3`) even if you generate answers with a
+different model.
```bash
-./context -i corpus/lotr.txt -o corpus/lotr.vdb
-./context -m flan-t5-small -i corpus/lotr.txt -o corpus/lotr.vdb
+./context -m qwen3 -i corpus/lotr.txt -o corpus/lotr.vdb
```
### Run an NPC query with retrieved context
-`npc` loads a vector database, embeds the prompt, selects the top 3 matching
+`npc` loads a vector database, embeds the prompt, selects the top 5 matching
lines by cosine similarity, and runs the NPC system prompt against that context.
+You can pass a separate embedding model with `-e`/`--embed-model`.
```bash
-./npc -m flan-t5-small -p "Who is Gandalf?" -c corpus/lotr.vdb
-./npc -m flan-t5-small -p "Who is Frodo?" -c corpus/lotr.vdb
+./npc -m phi-4-mini-instruct -e qwen3 -p "Who is Gandalf?" -c corpus/lotr.vdb
+./npc -m qwen3 -e qwen3 -p "Who is Frodo?" -c corpus/lotr.vdb
+```
+
+### Run the game
+
The game uses the same models and retrieval pipeline as `npc`, but keeps NPC replies short.
+
+```bash
+./game -m phi-4-mini-instruct -e qwen3
```
### context options
@@ -79,12 +90,22 @@ lines by cosine similarity, and runs the NPC system prompt against that context.
| Flag | Description |
|------|-------------|
| `-m, --model` | Model to use (required) |
+| `-e, --embed-model` | Embedding model to use (optional) |
| `-p, --prompt` | Prompt text (required) |
| `-c, --context` | Context vector database file (.vdb) (required) |
| `-l, --list` | List available models |
| `-v, --verbose` | Enable llama.cpp logging |
| `-h, --help` | Show help message |
+### game options
+
+| Flag | Description |
+|------|-------------|
+| `-m, --model` | Model to use (default: first model in config) |
+| `-e, --embed-model` | Embedding model to use (optional) |
+| `-v, --verbose` | Enable llama.cpp logging |
+| `-h, --help` | Show help message |
+
## Models
Configure models in `models.h`. The default model is the first entry in the
@@ -115,3 +136,4 @@ make run/clean
## Reading material
- https://www.tinyllm.org/
+- https://en.wikipedia.org/wiki/Cosine_similarity