# Diffusion Text Generation

This directory contains implementations for Diffusion LLMs (DLLMs).

More info:
- https://github.com/ggml-org/llama.cpp/pull/14644
- https://github.com/ggml-org/llama.cpp/pull/14771
## Parameters
The diffusion CLI supports various parameters to control the generation process:

### Core Diffusion Parameters
- `--diffusion-steps`: Number of diffusion steps (default: 256)
- `--diffusion-algorithm`: Algorithm for token selection
  - `0`: ORIGIN - Tokens are unmasked in a purely random order, following https://arxiv.org/abs/2107.03006
  - `1`: ENTROPY_BASED - Entropy-based selection
  - `2`: MARGIN_BASED - Margin-based selection
  - `3`: RANDOM - Random selection
  - `4`: CONFIDENCE_BASED - Confidence-based selection (default)
  - More documentation: https://github.com/DreamLM/Dream
- `--diffusion-visual`: Enable live visualization during generation
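As an illustration, the core flags above can be combined in a single invocation; the model file name below is a placeholder:

```shell
# Run 128 diffusion steps with entropy-based token selection (algorithm 1)
# and watch the denoising process live. "model.gguf" is a placeholder path.
llama-diffusion-cli -m model.gguf \
    -p "Hello" \
    --diffusion-steps 128 \
    --diffusion-algorithm 1 \
    --diffusion-visual
```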

### Scheduling Parameters
Choose one of the following scheduling methods:

**Timestep-based scheduling:**
- `--diffusion-eps`: Epsilon value for timestep scheduling (e.g., 0.001)

**Block-based scheduling:**
- `--diffusion-block-length`: Block size for block-based scheduling (e.g., 32)

### Sampling Parameters
- `--temp`: Temperature for sampling (0.0 = greedy/deterministic; higher values increase randomness)
- `--top-k`: Top-k filtering for sampling
- `--top-p`: Top-p (nucleus) filtering for sampling
- `--seed`: Random seed for reproducibility
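For example, greedy decoding and reproducible stochastic sampling differ only in these flags; the model path below is a placeholder:

```shell
# Deterministic (greedy) generation: temperature 0 always picks the top token.
llama-diffusion-cli -m model.gguf -p "Hello" --temp 0.0

# Stochastic generation with top-k and nucleus filtering,
# made reproducible by fixing the random seed.
llama-diffusion-cli -m model.gguf -p "Hello" --temp 0.7 --top-k 40 --top-p 0.9 --seed 42
```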

### Model Parameters
- `-m`: Path to the GGUF model file
- `-p`: Input prompt text
- `-ub`: Maximum sequence length (ubatch size)
- `-c`: Context size
- `-b`: Batch size

### Examples
#### Dream architecture:
```
llama-diffusion-cli -m dream7b.gguf -p "write code to train MNIST in pytorch" -ub 512 --diffusion-eps 0.001 --diffusion-algorithm 3 --diffusion-steps 256 --diffusion-visual
```

#### LLaDA architecture:
```
llama-diffusion-cli -m llada-8b.gguf -p "write code to train MNIST in pytorch" -ub 512 --diffusion-block-length 32 --diffusion-steps 256 --diffusion-visual
```

#### RND1 architecture:
```
llama-diffusion-cli -m RND1-Base-0910.gguf -p "write code to train MNIST in pytorch" -ub 512 --diffusion-algorithm 1 --diffusion-steps 256 --diffusion-visual --temp 0.5 --diffusion-eps 0.001
```