# llama.cpp/examples/training

This directory contains examples related to language model training using llama.cpp/GGML.
So far, finetuning is technically functional (for FP32 models and limited hardware setups), but the code is still very much a work in progress.
Finetuning of Stories 260K and LLaMA 3.2 1B seems to work with 24 GB of memory.
**For CPU training, compile llama.cpp without any additional backends such as CUDA.**
**For CUDA training, use the maximum number of GPU layers.**

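The two build configurations mentioned above can be set up with llama.cpp's usual CMake flow; a sketch (the `GGML_CUDA` flag reflects the current upstream build options, so double-check against the repository's build docs):

``` sh
# CPU-only build: no extra backends compiled in
cmake -B build
cmake --build build --config Release

# CUDA build: compile the CUDA backend so -ngl 999 can offload all layers
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release
```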
Proof of concept:

``` sh
export model_name=llama_3.2-1b && export quantization=f32
./build/bin/llama-finetune --file wikitext-2-raw/wiki.test.raw -ngl 999 --model models/${model_name}-${quantization}.gguf -c 512 -b 512 -ub 512
./build/bin/llama-perplexity --file wikitext-2-raw/wiki.test.raw -ngl 999 --model finetuned-model.gguf
```
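The commands above read WikiText-2 from `wikitext-2-raw/`; one way to fetch it is the helper script shipped with llama.cpp (the script path is an assumption based on the upstream repository layout):

``` sh
# Assumed helper script from the llama.cpp repository:
# downloads and unpacks the raw WikiText-2 dataset into ./wikitext-2-raw/
./scripts/get-wikitext-2.sh
```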

The perplexity value of the finetuned model should be lower after training on the test set for 2 epochs.
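
`llama-perplexity` reports perplexity, which is the exponential of the mean negative log-likelihood (NLL) per token, so a lower mean NLL after finetuning translates directly into lower perplexity. A minimal sketch of that relationship (the per-token NLL values are made up for illustration):

```python
import math

def perplexity(nlls):
    # Perplexity = exp(mean negative log-likelihood over all evaluated tokens).
    return math.exp(sum(nlls) / len(nlls))

# Hypothetical per-token NLLs before and after finetuning on the eval text:
before = perplexity([2.9, 3.1, 3.0])  # higher mean NLL -> higher perplexity
after = perplexity([2.5, 2.7, 2.6])   # lower mean NLL -> lower perplexity
assert after < before
```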