# llmnpc
Command-line LLM inference and simple context retrieval powered by
[llama.cpp](https://github.com/ggerganov/llama.cpp), built to test the
viability of using LLMs to drive NPC behaviour.
## Building
### Prerequisites
- C compiler (gcc/clang)
- CMake
- Docker (optional, for running the binaries in a container)
### Build Steps
1. Build llama.cpp libraries:
```bash
make build/llama.cpp
```
2. Download models:
```bash
make run/fetch-models
```
3. Build binaries:
```bash
make build/prompt
make build/context
```
## Usage
### Build a vector context database
`context` reads a text file (one document per line), embeds each line, and
produces a binary vector database file.
```bash
./context -i context.txt -o context.vdb
./context -m flan-t5-small -i context.txt -o context.vdb
```
### Run a prompt with retrieved context
`prompt` reads the context text file, embeds the query, selects the top 3
matching lines by cosine similarity, and builds a prompt from those lines.
```bash
./prompt -p "What is machine learning?" -c context.txt
./prompt -m flan-t5-small -p "What is machine learning?" -c context.txt
```
### context options
| Flag | Description |
|------|-------------|
| `-m, --model` | Embedding model to use (default: first model in config) |
| `-i, --in` | Input context text file (required) |
| `-o, --out` | Output vector database file (required) |
| `-l, --list` | List available models |
| `-v, --verbose` | Enable llama.cpp logging |
| `-h, --help` | Show help message |
### prompt options
| Flag | Description |
|------|-------------|
| `-m, --model` | Model to use (default: first model in config) |
| `-p, --prompt` | Prompt text (required) |
| `-c, --context` | Context text file (required) |
| `-v, --verbose` | Enable llama.cpp logging |
| `-h, --help` | Show help message |
## Models
Configure models in `models.h`. The default model is the first entry in the
`models` array; each entry points at a local GGUF file under `models/`.
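An entry in that array might look like the sketch below. The struct fields, type name, and GGUF path are illustrative assumptions based on the description above, not the actual contents of `models.h`.

```c
/* Hypothetical shape of a models.h entry; field names are illustrative. */
typedef struct {
    const char *name;  /* name passed via -m/--model      */
    const char *path;  /* local GGUF file under models/   */
} model_entry;

static const model_entry models[] = {
    /* first entry doubles as the default model */
    { "flan-t5-small", "models/flan-t5-small.gguf" },
};
```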
## Vector database format
`context` produces a binary file with a fixed header and a contiguous list of
documents. The header includes a magic value, version, embedding size, maximum
text length, and document count. Each document stores the original text (fixed
size `VDB_MAX_TEXT`) and its embedding (`VDB_EMBED_SIZE`).
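In C terms, the layout described above corresponds to something like the structs below. The field names and order, the magic/version values, and the concrete `VDB_MAX_TEXT`/`VDB_EMBED_SIZE` numbers are assumptions for illustration; only the overall header-plus-documents shape comes from the description.

```c
#include <stdint.h>

#define VDB_MAX_TEXT   512   /* assumed bytes reserved per document's text */
#define VDB_EMBED_SIZE 384   /* assumed floats per embedding               */

/* Fixed file header, followed immediately by `count` vdb_document records. */
typedef struct {
    uint32_t magic;       /* file identification value   */
    uint32_t version;     /* format version              */
    uint32_t embed_size;  /* floats per embedding        */
    uint32_t max_text;    /* bytes reserved per document */
    uint32_t count;       /* number of documents         */
} vdb_header;

typedef struct {
    char  text[VDB_MAX_TEXT];        /* original line, NUL-padded */
    float embedding[VDB_EMBED_SIZE]; /* embedding of that line    */
} vdb_document;
```

Because both records are fixed-size, document `i` can be read by seeking to `sizeof(vdb_header) + i * sizeof(vdb_document)`.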
## Docker
```bash
make run/docker
```
Builds a Docker image and runs an interactive shell with the binaries and
models under `/app/`.
## Cleaning
```bash
make run/clean
```
## Reading material
- https://www.tinyllm.org/