| author | Mitja Felicijan <mitja.felicijan@gmail.com> | 2026-02-12 20:57:17 +0100 |
|---|---|---|
| committer | Mitja Felicijan <mitja.felicijan@gmail.com> | 2026-02-12 20:57:17 +0100 |
| commit | b333b06772c89d96aacb5490d6a219fba7c09cc6 (patch) | |
| tree | 211df60083a5946baa2ed61d33d8121b7e251b06 /llama.cpp/docs/backend/zDNN.md | |
| download | llmnpc-b333b06772c89d96aacb5490d6a219fba7c09cc6.tar.gz | |
Engage!
Diffstat (limited to 'llama.cpp/docs/backend/zDNN.md')
| -rw-r--r-- | llama.cpp/docs/backend/zDNN.md | 66 |
1 files changed, 66 insertions, 0 deletions
diff --git a/llama.cpp/docs/backend/zDNN.md b/llama.cpp/docs/backend/zDNN.md
new file mode 100644
index 0000000..94108f5
--- /dev/null
+++ b/llama.cpp/docs/backend/zDNN.md
@@ -0,0 +1,66 @@
+# llama.cpp for IBM zDNN Accelerator
+
+> [!WARNING]
+> **Note:** zDNN is **not** the same as ZenDNN.
+> - **zDNN** (this page): IBM's Deep Neural Network acceleration library for IBM Z & LinuxONE Mainframes
+> - **ZenDNN**: AMD's deep learning library for AMD EPYC CPUs ([see ZenDNN documentation](ZenDNN.md))
+
+## Background
+
+IBM zDNN (Z Deep Neural Network) is a hardware acceleration library designed specifically to leverage the IBM NNPA (Neural Network Processor Assist) accelerator located within IBM Telum I and II processors. It provides significant performance improvements for neural network inference operations.
+
+### Llama.cpp + IBM zDNN
+
+The llama.cpp zDNN backend is designed to enable llama.cpp on IBM z17 and later systems via the IBM zDNN hardware acceleration library.
+
+## Software & Hardware Support
+
+| Hardware Level       | Status        | Verified                   |
+| -------------------- | ------------- | -------------------------- |
+| IBM z17 / LinuxONE 5 | Supported     | RHEL 9.6, IBM z17, 40 IFLs |
+| IBM z16 / LinuxONE 4 | Not Supported |                            |
+
+## Data Types Supported
+
+| Data Type | Status    |
+| --------- | --------- |
+| F32       | Supported |
+| F16       | Supported |
+| BF16      | Supported |
+
+## CMake Options
+
+The IBM zDNN backend has the following CMake options that control the behaviour of the backend.
+
+| CMake Option | Default Value | Description                         |
+| ------------ | ------------- | ----------------------------------- |
+| `GGML_ZDNN`  | `OFF`         | Compile llama.cpp with zDNN support |
+| `ZDNN_ROOT`  | `""`          | Override zDNN library lookup        |
+
+## 1. Install zDNN Library
+
+Note: Using the zDNN library provided via `apt` or `yum` may not work correctly as reported in [#15772](https://github.com/ggml-org/llama.cpp/issues/15772). It is preferred that you compile from source.
+
+```sh
+git clone --recurse-submodules https://github.com/IBM/zDNN
+cd zDNN
+
+autoreconf .
+./configure --prefix=/opt/zdnn-libs
+
+make build
+sudo make install
+```
+
+## 2. Build llama.cpp
+
+```sh
+git clone https://github.com/ggml-org/llama.cpp
+cd llama.cpp
+
+cmake -S . -G Ninja -B build \
+    -DCMAKE_BUILD_TYPE=Release \
+    -DGGML_ZDNN=ON \
+    -DZDNN_ROOT=/opt/zdnn-libs
+cmake --build build --config Release -j$(nproc)
+```
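The added document installs zDNN under a custom prefix and then passes that prefix to CMake through `ZDNN_ROOT`. A quick check between the two steps can catch a wrong or empty prefix before the configure run. The following is only a sketch and not part of this commit: the helper name `check_zdnn_root` is hypothetical, and it assumes the install places a header named `zdnn.h` under `include/` and a shared library matching `libzdnn.so*` under `lib/` or `lib64/`. Adjust the paths if your install lays files out differently.

```sh
# Hypothetical helper (not part of llama.cpp or zDNN): check that a zDNN
# install prefix looks usable before passing it to CMake as ZDNN_ROOT.
# Assumption: header at include/zdnn.h, library at lib/ or lib64/libzdnn.so*.
check_zdnn_root() {
    root="$1"

    if [ ! -f "$root/include/zdnn.h" ]; then
        echo "missing header: $root/include/zdnn.h" >&2
        return 1
    fi

    for lib in "$root"/lib/libzdnn.so* "$root"/lib64/libzdnn.so*; do
        # An unmatched glob stays literal, so confirm the path really exists.
        if [ -e "$lib" ]; then
            echo "found library: $lib"
            return 0
        fi
    done

    echo "missing library: libzdnn.so under $root/lib or $root/lib64" >&2
    return 1
}
```

With the prefix used in the document, this would be called as `check_zdnn_root /opt/zdnn-libs` after `sudo make install`. A non-zero exit status suggests re-checking the `--prefix` value before running the `cmake` configure step.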
