author    Mitja Felicijan <mitja.felicijan@gmail.com>  2026-02-20 13:54:21 +0100
committer Mitja Felicijan <mitja.felicijan@gmail.com>  2026-02-20 13:54:21 +0100
commit    306c3cb6924c6231c102ff7d75aa3f68e3618ca2 (patch)
tree      1a41c8c4b70b43796cc3fc14f0c9e52b39651e2f /README.md
parent    201bbf3e917066fb05ff1f10f7166d262b8ed2cf (diff)
download  llmnpc-306c3cb6924c6231c102ff7d75aa3f68e3618ca2.tar.gz
Update to multi model for embeddings and prompting
Diffstat (limited to 'README.md')
-rw-r--r--  README.md  34
1 file changed, 28 insertions(+), 6 deletions(-)
diff --git a/README.md b/README.md
index 7c5e7f7..6619175 100644
--- a/README.md
+++ b/README.md
@@ -39,6 +39,7 @@ Goals of the experiment:
make build/context
make build/prompts
make build/npc
+ make build/game
```
## Usage
@@ -46,21 +47,31 @@ Goals of the experiment:
### Build a vector context database
`context` reads a text file (one document per line), embeds each line, and
-produces a binary vector database file.
+produces a binary vector database file. For best results, use a dedicated
+embedding model (for example, `qwen3`) even if you generate answers with a
+different model.
```bash
-./context -i corpus/lotr.txt -o corpus/lotr.vdb
-./context -m flan-t5-small -i corpus/lotr.txt -o corpus/lotr.vdb
+./context -m qwen3 -i corpus/lotr.txt -o corpus/lotr.vdb
```
### Run an NPC query with retrieved context
-`npc` loads a vector database, embeds the prompt, selects the top 3 matching
+`npc` loads a vector database, embeds the prompt, selects the top 5 matching
lines by cosine similarity, and runs the NPC system prompt against that context.
+You can pass a separate embedding model with `-e`/`--embed-model`.
```bash
-./npc -m flan-t5-small -p "Who is Gandalf?" -c corpus/lotr.vdb
-./npc -m flan-t5-small -p "Who is Frodo?" -c corpus/lotr.vdb
+./npc -m phi-4-mini-instruct -e qwen3 -p "Who is Gandalf?" -c corpus/lotr.vdb
+./npc -m qwen3 -e qwen3 -p "Who is Frodo?" -c corpus/lotr.vdb
+```
+
+### Run the game
+
The game uses the same models and retrieval pipeline as `npc`, but keeps NPC replies short.
+
+```bash
+./game -m phi-4-mini-instruct -e qwen3
```
### context options
@@ -79,12 +90,22 @@ lines by cosine similarity, and runs the NPC system prompt against that context.
| Flag | Description |
|------|-------------|
| `-m, --model` | Model to use (required) |
+| `-e, --embed-model` | Embedding model to use (optional) |
| `-p, --prompt` | Prompt text (required) |
| `-c, --context` | Context vector database file (.vdb) (required) |
| `-l, --list` | List available models |
| `-v, --verbose` | Enable llama.cpp logging |
| `-h, --help` | Show help message |
+### game options
+
+| Flag | Description |
+|------|-------------|
+| `-m, --model` | Model to use (default: first model in config) |
+| `-e, --embed-model` | Embedding model to use (optional) |
+| `-v, --verbose` | Enable llama.cpp logging |
+| `-h, --help` | Show help message |
+
## Models
Configure models in `models.h`. The default model is the first entry in the
@@ -115,3 +136,4 @@ make run/clean
## Reading material
- https://www.tinyllm.org/
+- https://en.wikipedia.org/wiki/Cosine_similarity