| author | Mitja Felicijan <mitja.felicijan@gmail.com> | 2026-02-13 18:11:30 +0100 |
|---|---|---|
| committer | Mitja Felicijan <mitja.felicijan@gmail.com> | 2026-02-13 18:11:30 +0100 |
| commit | 25cf1b05d9ad73ff0d5024277beda7101b8e9aea (patch) | |
| tree | 0011a45f88b2998e117776f0904f4acc87713deb /README.md | |
| parent | 2a6fd554998c64733e2d97aecf653f6e48e0f8b4 (diff) | |
| download | llmnpc-25cf1b05d9ad73ff0d5024277beda7101b8e9aea.tar.gz | |
Readme update
Diffstat (limited to 'README.md')
| -rw-r--r-- | README.md | 68 |
1 file changed, 53 insertions, 15 deletions
````diff
@@ -1,8 +1,8 @@
 # llmnpc
 
-A command-line LLM inference tool powered by
-[llama.cpp](https://github.com/ggerganov/llama.cpp) for testing how/if NPC's
-could use LLM's.
+Command-line LLM inference and simple context retrieval powered by
+[llama.cpp](https://github.com/ggerganov/llama.cpp) to test the viability of
+using LLMs to drive NPC behaviour.
 
 ## Building
 
@@ -16,50 +16,88 @@ could use LLM's.
 1. Build llama.cpp libraries:
 
    ```bash
-   make llamacpp
+   make build/llama.cpp
    ```
 
-2. Download models
+2. Download models:
 
    ```bash
-   make fetchmodels
+   make run/fetch-models
    ```
 
-3. Build the prompt binary:
+3. Build binaries:
 
    ```bash
-   make prompt
+   make build/prompt
+   make build/context
    ```
 
 ## Usage
 
+### Build a vector context database
+
+`context` reads a text file (one document per line), embeds each line, and
+produces a binary vector database file.
+
+```bash
+./context -i context.txt -o context.vdb
+./context -m flan-t5-small -i context.txt -o context.vdb
+```
+
+### Run a prompt with retrieved context
+
+`prompt` reads the context text file, embeds the query, selects the top 3
+matching lines by cosine similarity, and builds a prompt from those lines.
+
 ```bash
-./prompt -p "Your prompt here"
-./prompt -m flan-t5-small -p "What is machine learning?"
+./prompt -p "What is machine learning?" -c context.txt
+./prompt -m flan-t5-small -p "What is machine learning?" -c context.txt
 ```
 
-### Options
+### context options
+
+| Flag | Description |
+|------|-------------|
+| `-m, --model` | Embedding model to use (default: first model in config) |
+| `-i, --in` | Input context text file (required) |
+| `-o, --out` | Output vector database file (required) |
+| `-l, --list` | List available models |
+| `-v, --verbose` | Enable llama.cpp logging |
+| `-h, --help` | Show help message |
+
+### prompt options
 
 | Flag | Description |
 |------|-------------|
 | `-m, --model` | Model to use (default: first model in config) |
 | `-p, --prompt` | Prompt text (required) |
+| `-c, --context` | Context text file (required) |
+| `-v, --verbose` | Enable llama.cpp logging |
 | `-h, --help` | Show help message |
 
 ## Models
 
-Configure models in `models.h`. The default model is `flan-t5-small`, expecting a GGUF file at `models/flan-t5-small.F16.gguf`.
+Configure models in `models.h`. The default model is the first entry in the
+`models` array; each entry points at a local GGUF file under `models/`.
+
+## Vector database format
+
+`context` produces a binary file with a fixed header and a contiguous list of
+documents. The header includes a magic value, version, embedding size, maximum
+text length, and document count. Each document stores the original text (fixed
+size `VDB_MAX_TEXT`) and its embedding (`VDB_EMBED_SIZE`).
 
 ## Docker
 
 ```bash
-make docker
+make run/docker
 ```
 
-This builds a Docker image and drops you into a shell with the prompt binary and models available at `/app/`.
+Builds a Docker image and runs an interactive shell with the binaries and
+models under `/app/`.
 
 ## Cleaning
 
 ```bash
-make clean
+make run/clean
 ```
 
 ## Reading material
````
