name: Bug (model use)
description: Something goes wrong when using a model (in general, not specific to a single llama.cpp module).
title: "Eval bug: "
labels: ["bug-unconfirmed", "model evaluation"]
body:
  - type: markdown
    attributes:
      value: >
        Thanks for taking the time to fill out this bug report!
        This issue template is intended for bug reports where the model evaluation results
        (i.e. the generated text) are incorrect or llama.cpp crashes during model evaluation.
        If you encountered the issue while using an external UI (e.g. ollama),
        please reproduce your issue using one of the examples/binaries in this repository.
        The `llama-completion` binary can be used for simple and reproducible model inference.
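        For example, an invocation like `llama-completion -m model.gguf -p "I believe the meaning of life is" -n 64`
        (with the model path, prompt, and flags adjusted for your setup) is usually enough for a minimal reproduction.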
  - type: textarea
    id: version
    attributes:
      label: Name and Version
      description: Which version of our software are you running? (use `--version` to get a version string)
      placeholder: |
        $./llama-cli --version
        version: 2999 (42b4109e)
        built with cc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 for x86_64-linux-gnu
    validations:
      required: true
  - type: dropdown
    id: operating-system
    attributes:
      label: Operating systems
      description: Which operating systems do you know to be affected?
      multiple: true
      options:
        - Linux
        - Mac
        - Windows
        - BSD
        - Other? (Please let us know in description)
    validations:
      required: true
  - type: dropdown
    id: backends
    attributes:
      label: GGML backends
      description: Which GGML backends do you know to be affected?
      options: [AMX, BLAS, CPU, CUDA, HIP, Metal, Musa, RPC, SYCL, Vulkan, OpenCL, zDNN]
      multiple: true
    validations:
      required: true
  - type: textarea
    id: hardware
    attributes:
      label: Hardware
      description: Which CPUs/GPUs are you using?
      placeholder: >
        e.g. Ryzen 5950X + 2x RTX 4090
    validations:
      required: true
  - type: textarea
    id: model
    attributes:
      label: Models
      description: >
        Which model(s) at which quantization were you using when encountering the bug?
        If you downloaded a GGUF file from Hugging Face, please provide a link.
      placeholder: >
        e.g. Meta LLaMA 3.1 Instruct 8b q4_K_M
    validations:
      required: false
  - type: textarea
    id: info
    attributes:
      label: Problem description & steps to reproduce
      description: >
        Please give us a summary of the problem and tell us how to reproduce it.
        If you can narrow down the bug to specific hardware, compile flags, or command line arguments,
        we would greatly appreciate that information.

        If possible, please try to reproduce the issue using `llama-completion` with `-fit off`.
        If you can only reproduce the issue with `-fit on`, please provide logs both with and without `--verbose`.
      placeholder: >
        e.g. when I run llama-completion with `-fa on` I get garbled outputs for very long prompts.
        With short prompts or `-fa off` it works correctly.
        Here are the exact commands that I used: ...
    validations:
      required: true
  - type: textarea
    id: first_bad_commit
    attributes:
      label: First Bad Commit
      description: >
        If the bug was not present in an earlier version: when did it start appearing?
        If possible, please do a git bisect and identify the exact commit that introduced the bug.
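        A typical bisect session looks like `git bisect start`, then `git bisect bad` on the broken commit and
        `git bisect good <last-known-good-commit>`, rebuilding and re-testing at each step git checks out.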
    validations:
      required: false
  - type: textarea
    id: logs
    attributes:
      label: Relevant log output
      description: >
        Please copy and paste any relevant log output, including the command that you entered and any generated text.
        For very long logs (thousands of lines), preferably upload them as files instead.
        On Linux you can redirect console output into a file by appending ` > llama.log 2>&1` to your command.
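        For example: `llama-completion -m model.gguf -p "test" > llama.log 2>&1` (model path and prompt are illustrative).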
      value: |
        <details>
        <summary>Logs</summary>
        <!-- Copy-pasted short logs go into the "console" area here -->

        ```console

        ```
        </details>

        <!-- Long logs that you upload as files go here, outside the "console" area -->
    validations:
      required: true