# llama.cpp for IBM zDNN Accelerator

> [!WARNING]
> **Note:** zDNN is **not** the same as ZenDNN.
> - **zDNN** (this page): IBM's Deep Neural Network acceleration library for IBM Z & LinuxONE Mainframes
> - **ZenDNN**: AMD's deep learning library for AMD EPYC CPUs ([see ZenDNN documentation](ZenDNN.md))

## Background

IBM zDNN (Z Deep Neural Network) is a hardware acceleration library designed specifically to leverage the IBM NNPA (Neural Network Processor Assist) accelerator located within IBM Telum I and II processors. It provides significant performance improvements for neural network inference operations.

### Llama.cpp + IBM zDNN

The llama.cpp zDNN backend enables hardware-accelerated inference on IBM z17 and later systems via the IBM zDNN library.

## Software & Hardware Support

| Hardware Level       | Status        | Verified                   |
| -------------------- | ------------- | -------------------------- |
| IBM z17 / LinuxONE 5 | Supported     | RHEL 9.6, IBM z17, 40 IFLs |
| IBM z16 / LinuxONE 4 | Not Supported |                            |

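Before building, you can check whether the processor exposes the NNPA facility at all. This is a minimal sketch, assuming a Linux on Z kernel recent enough to report the `nnpa` flag in the `features` line of `/proc/cpuinfo`:

```sh
# Check for the NNPA (Neural Network Processing Assist) facility.
# Machines that support zDNN acceleration list "nnpa" among the CPU features.
if grep -qw nnpa /proc/cpuinfo; then
    echo "NNPA facility available"
else
    echo "NNPA facility not found - zDNN acceleration will not work"
fi
```
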
## Data Types Supported

| Data Type | Status    |
| --------- | --------- |
| F32       | Supported |
| F16       | Supported |
| BF16      | Supported |

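If your model weights are not already in one of these types, the `convert_hf_to_gguf.py` script that ships with llama.cpp can produce a GGUF file in a supported type. A minimal sketch (the model directory and output file name are placeholders):

```sh
# Convert a Hugging Face model to a BF16 GGUF file (paths are placeholders).
# --outtype also accepts f32 and f16, matching the types listed above.
python3 convert_hf_to_gguf.py /path/to/hf-model \
    --outtype bf16 \
    --outfile model-bf16.gguf
```
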
## CMake Options

The following CMake options control the behaviour of the IBM zDNN backend.

| CMake Option | Default Value | Description                         |
| ------------ | ------------- | ----------------------------------- |
| `GGML_ZDNN`  | `OFF`         | Compile llama.cpp with zDNN support |
| `ZDNN_ROOT`  | `""`          | Override zDNN library lookup        |

## 1. Install zDNN Library

Note: The zDNN library provided via `apt` or `yum` may not work correctly, as reported in [#15772](https://github.com/ggml-org/llama.cpp/issues/15772). Building from source is recommended.

```sh
git clone --recurse-submodules https://github.com/IBM/zDNN
cd zDNN

autoreconf .
./configure --prefix=/opt/zdnn-libs

make build
sudo make install
```
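
After installation, a quick sanity check confirms the headers and shared library landed under the prefix. This sketch assumes the `/opt/zdnn-libs` prefix used above; the exact layout may differ slightly depending on your install:

```sh
# Confirm the zDNN header and shared library were installed.
ls /opt/zdnn-libs/include/zdnn.h /opt/zdnn-libs/lib/libzdnn.so

# Make the library visible to dynamically linked binaries at runtime.
export LD_LIBRARY_PATH=/opt/zdnn-libs/lib:$LD_LIBRARY_PATH
```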

## 2. Build llama.cpp

```sh
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp

cmake -S . -G Ninja -B build \
    -DCMAKE_BUILD_TYPE=Release \
    -DGGML_ZDNN=ON \
    -DZDNN_ROOT=/opt/zdnn-libs
cmake --build build --config Release -j$(nproc)
```
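
Once the build finishes, a short smoke test verifies that the backend loads. A minimal sketch, assuming a GGUF model in one of the supported data types at a placeholder path:

```sh
# Run a short generation; check the startup log for the zDNN backend
# among the registered devices if it was compiled in successfully.
LD_LIBRARY_PATH=/opt/zdnn-libs/lib:$LD_LIBRARY_PATH \
    ./build/bin/llama-cli -m /path/to/model-bf16.gguf -p "Hello" -n 32
```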