author Mitja Felicijan <mitja.felicijan@gmail.com> 2026-02-12 20:57:17 +0100
committer Mitja Felicijan <mitja.felicijan@gmail.com> 2026-02-12 20:57:17 +0100
commit b333b06772c89d96aacb5490d6a219fba7c09cc6
tree 211df60083a5946baa2ed61d33d8121b7e251b06 /llama.cpp/examples/model-conversion/scripts/embedding/modelcard.template
Engage!
Diffstat (limited to 'llama.cpp/examples/model-conversion/scripts/embedding/modelcard.template')
-rw-r--r-- llama.cpp/examples/model-conversion/scripts/embedding/modelcard.template 48
1 files changed, 48 insertions, 0 deletions
diff --git a/llama.cpp/examples/model-conversion/scripts/embedding/modelcard.template b/llama.cpp/examples/model-conversion/scripts/embedding/modelcard.template
new file mode 100644
index 0000000..9e63042
--- /dev/null
+++ b/llama.cpp/examples/model-conversion/scripts/embedding/modelcard.template
@@ -0,0 +1,48 @@
+---
+base_model:
+- {base_model}
+---
+# {model_name} GGUF
+
+Recommended way to run this model:
+
+```sh
+llama-server -hf {namespace}/{model_name}-GGUF --embeddings
+```
+
+Then the endpoint can be accessed at http://localhost:8080/embedding, for
+example using `curl`:
+```console
+curl --request POST \
+ --url http://localhost:8080/embedding \
+ --header "Content-Type: application/json" \
+ --data '{{"input": "Hello embeddings"}}' \
+ --silent
+```
+
+Alternatively, the `llama-embedding` command line tool can be used:
+```sh
+llama-embedding -hf {namespace}/{model_name}-GGUF --verbose-prompt -p "Hello embeddings"
+```
+
+#### embd_normalize
+When a model uses pooling, or the pooling method is specified using `--pooling`,
+the normalization can be controlled by the `embd_normalize` parameter.
+
+The default value is `2`, which means that the embeddings are normalized using
+the Euclidean norm (L2). Other options are:
+* -1 No normalization
+* 0 Max absolute
+* 1 Taxicab
+* 2 Euclidean/L2
+* \>2 P-Norm
+
+This can be passed in the request body to `llama-server`, for example:
+```sh
+ --data '{{"input": "Hello embeddings", "embd_normalize": -1}}' \
+```
+
+And for `llama-embedding`, by passing `--embd-normalize <value>`, for example:
+```sh
+llama-embedding -hf {namespace}/{model_name}-GGUF --embd-normalize -1 -p "Hello embeddings"
+```
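
The `embd_normalize` modes listed in the template can be illustrated with a short Python sketch. It is not taken from llama.cpp: the function name `normalize` is hypothetical, and the real implementation may apply additional scaling (for example in the max-absolute mode), so treat it only as the mathematical idea behind each option.

```python
def normalize(vec, embd_normalize=2):
    """Illustrative sketch of the embd_normalize modes (hypothetical helper).

    -1 -> no normalization
     0 -> divide by the largest absolute component (max absolute)
     1 -> divide by the taxicab (L1) norm
     2 -> divide by the Euclidean (L2) norm
    >2 -> divide by the p-norm with p = embd_normalize
    """
    if embd_normalize < 0:
        # No normalization: return the vector unchanged.
        return list(vec)
    if embd_normalize == 0:
        # Max absolute: the largest component magnitude is the scale.
        norm = max((abs(x) for x in vec), default=0.0)
    else:
        # Taxicab (p=1), Euclidean (p=2) and general p-norm share one formula.
        p = embd_normalize
        norm = sum(abs(x) ** p for x in vec) ** (1.0 / p)
    if norm == 0:
        # Avoid division by zero for an all-zero (or empty) vector.
        return [0.0 for _ in vec]
    return [x / norm for x in vec]


if __name__ == "__main__":
    print(normalize([3.0, 4.0]))       # default: Euclidean/L2
    print(normalize([3.0, 4.0], 1))    # taxicab
    print(normalize([3.0, 4.0], -1))   # unchanged
```

With the default value `2`, the resulting vector has unit Euclidean length, which is why dot products between such embeddings behave like cosine similarity.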