author    Mitja Felicijan <mitja.felicijan@gmail.com>  2026-02-13 18:11:30 +0100
committer Mitja Felicijan <mitja.felicijan@gmail.com>  2026-02-13 18:11:30 +0100
commit    25cf1b05d9ad73ff0d5024277beda7101b8e9aea (patch)
tree      0011a45f88b2998e117776f0904f4acc87713deb
parent    2a6fd554998c64733e2d97aecf653f6e48e0f8b4 (diff)
Readme update
-rw-r--r-- README.md | 68
1 file changed, 53 insertions(+), 15 deletions(-)
diff --git a/README.md b/README.md
index 9d8fe1b..240fcfb 100644
--- a/README.md
+++ b/README.md
@@ -1,8 +1,8 @@
# llmnpc
-A command-line LLM inference tool powered by
-[llama.cpp](https://github.com/ggerganov/llama.cpp) for testing how/if NPC's
-could use LLM's.
+Command-line LLM inference and simple context retrieval powered by
+[llama.cpp](https://github.com/ggerganov/llama.cpp), built to test the
+viability of using LLMs to drive NPC behaviour.
## Building
@@ -16,50 +16,88 @@ could use LLM's.
1. Build llama.cpp libraries:
```bash
- make llamacpp
+ make build/llama.cpp
```
-2. Download models
+2. Download models:
```bash
- make fetchmodels
+ make run/fetch-models
```
-3. Build the prompt binary:
+3. Build binaries:
```bash
- make prompt
+ make build/prompt
+ make build/context
```
## Usage
+### Build a vector context database
+
+`context` reads a text file (one document per line), embeds each line, and
+produces a binary vector database file.
+
+```bash
+./context -i context.txt -o context.vdb
+./context -m flan-t5-small -i context.txt -o context.vdb
+```
+
+### Run a prompt with retrieved context
+
+`prompt` reads the context text file, embeds the query, selects the top 3
+matching lines by cosine similarity, and builds a prompt from those lines.
+
```bash
-./prompt -p "Your prompt here"
-./prompt -m flan-t5-small -p "What is machine learning?"
+./prompt -p "What is machine learning?" -c context.txt
+./prompt -m flan-t5-small -p "What is machine learning?" -c context.txt
```
-### Options
+### context options
+
+| Flag | Description |
+|------|-------------|
+| `-m, --model` | Embedding model to use (default: first model in config) |
+| `-i, --in` | Input context text file (required) |
+| `-o, --out` | Output vector database file (required) |
+| `-l, --list` | List available models |
+| `-v, --verbose` | Enable llama.cpp logging |
+| `-h, --help` | Show help message |
+
+### prompt options
| Flag | Description |
|------|-------------|
| `-m, --model` | Model to use (default: first model in config) |
| `-p, --prompt` | Prompt text (required) |
+| `-c, --context` | Context text file (required) |
+| `-v, --verbose` | Enable llama.cpp logging |
| `-h, --help` | Show help message |
## Models
-Configure models in `models.h`. The default model is `flan-t5-small`, expecting a GGUF file at `models/flan-t5-small.F16.gguf`.
+Configure models in `models.h`. The default model is the first entry in the
+`models` array; each entry points at a local GGUF file under `models/`.
+
+## Vector database format
+
+`context` produces a binary file with a fixed header and a contiguous list of
+documents. The header includes a magic value, version, embedding size, maximum
+text length, and document count. Each document stores the original text (fixed
+size `VDB_MAX_TEXT`) and its embedding (`VDB_EMBED_SIZE`).
## Docker
```bash
-make docker
+make run/docker
```
-This builds a Docker image and drops you into a shell with the prompt binary and models available at `/app/`.
+Builds a Docker image and runs an interactive shell with the binaries and
+models under `/app/`.
## Cleaning
```bash
-make clean
+make run/clean
```
## Reading material