Diffstat (limited to 'llama.cpp/tools/server/webui/docs/flows/data-flow-simplified-router-mode.md')
-rw-r--r-- llama.cpp/tools/server/webui/docs/flows/data-flow-simplified-router-mode.md | 77
1 files changed, 77 insertions, 0 deletions
diff --git a/llama.cpp/tools/server/webui/docs/flows/data-flow-simplified-router-mode.md b/llama.cpp/tools/server/webui/docs/flows/data-flow-simplified-router-mode.md
new file mode 100644
index 0000000..bccacf5
--- /dev/null
+++ b/llama.cpp/tools/server/webui/docs/flows/data-flow-simplified-router-mode.md
@@ -0,0 +1,77 @@
+```mermaid
+%% ROUTER Mode Data Flow (multi-model)
+%% Detailed flows: ./flows/server-flow.mmd, ./flows/models-flow.mmd, ./flows/chat-flow.mmd
+
+sequenceDiagram
+ participant User as 👤 User
+ participant UI as 🧩 UI
+ participant Stores as 🗄️ Stores
+ participant DB as 💾 IndexedDB
+ participant API as 🌐 llama-server
+
+ Note over User,API: 🚀 Initialization (see: server-flow.mmd, models-flow.mmd)
+
+ UI->>Stores: initialize()
+ Stores->>DB: load conversations
+ Stores->>API: GET /props
+ API-->>Stores: {role: "router"}
+ Stores->>API: GET /v1/models
+ API-->>Stores: models[] with status (loaded/available)
+ loop each loaded model
+ Stores->>API: GET /props?model=X
+ API-->>Stores: modalities (vision/audio)
+ end
+
+ Note over User,API: 🔄 Model Selection (see: models-flow.mmd)
+
+ User->>UI: select model
+ alt model not loaded
+ Stores->>API: POST /models/load
+ loop poll status
+ Stores->>API: GET /v1/models
+ API-->>Stores: check if loaded
+ end
+ Stores->>API: GET /props?model=X
+ API-->>Stores: cache modalities
+ end
+ Stores->>Stores: validate modalities vs conversation
+ alt valid
+ Stores->>Stores: select model
+ else invalid
+ Stores->>API: POST /models/unload
+ UI->>User: show error toast
+ end
+
+ Note over User,API: 💬 Chat Flow (see: chat-flow.mmd)
+
+ User->>UI: send message
+ UI->>Stores: sendMessage()
+ Stores->>DB: save user message
+ Stores->>API: POST /v1/chat/completions {model: X}
+ Note right of API: router forwards to model
+ loop streaming
+ API-->>Stores: SSE chunks + model info
+ Stores-->>UI: reactive update
+ end
+ API-->>Stores: done + timings
+ Stores->>DB: save assistant message + model used
+
+ Note over User,API: 🔁 Regenerate (optional: different model)
+
+ User->>UI: regenerate
+ Stores->>Stores: validate modalities up to this message
+ Stores->>DB: create message branch
+ Note right of Stores: same streaming flow
+
+ Note over User,API: ⏹️ Stop
+
+ User->>UI: stop
+ Stores->>Stores: abort stream
+ Stores->>DB: save partial response
+
+ Note over User,API: 🗑️ LRU Unloading
+
+ Note right of API: Server auto-unloads LRU models<br/>when cache full
+ User->>UI: select unloaded model
+ Note right of Stores: triggers load flow again
+```
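
The "Model Selection" branch of the diagram can be read as a small decision procedure: load if needed, poll until loaded, fetch modalities, then either select or reject. The TypeScript below is a minimal sketch of that reading, not the webui's actual store code. Only the endpoint paths (`/models/load`, `/v1/models`, `/props?model=X`) come from the diagram. Every type and function name (`SelectionState`, `NextStep`, `missingModalities`, `nextStep`) is hypothetical, and the intermediate `"loading"` status is an assumption, since the diagram only names `loaded` and `available`.

```typescript
// Sketch of the model-selection branch as a pure decision function.
// All names are hypothetical; only the endpoint paths come from the diagram.

type Modalities = { vision: boolean; audio: boolean };

// "loading" is an assumed intermediate state; the diagram names only loaded/available.
type ModelStatus = "available" | "loading" | "loaded";

interface SelectionState {
  modelId: string;
  status: ModelStatus;
  supported: Modalities | null; // null until GET /props?model=X has answered
  required: Modalities; // modalities the conversation already uses
}

type NextStep =
  | { kind: "request"; method: "GET" | "POST"; path: string }
  | { kind: "select" }
  | { kind: "reject"; missing: string[] };

// Lists the modalities the conversation needs but the model lacks.
function missingModalities(required: Modalities, supported: Modalities): string[] {
  const missing: string[] = [];
  if (required.vision && !supported.vision) missing.push("vision");
  if (required.audio && !supported.audio) missing.push("audio");
  return missing;
}

// Returns the next step for the current state, following the diagram's order.
function nextStep(state: SelectionState): NextStep {
  if (state.status === "available") {
    // Model is not loaded yet: ask the router to load it.
    return { kind: "request", method: "POST", path: "/models/load" };
  }
  if (state.status === "loading") {
    // Poll the model list until the status flips to loaded.
    return { kind: "request", method: "GET", path: "/v1/models" };
  }
  if (state.supported === null) {
    // Loaded but modalities not cached yet: fetch per-model props.
    const query = encodeURIComponent(state.modelId);
    return { kind: "request", method: "GET", path: `/props?model=${query}` };
  }
  const missing = missingModalities(state.required, state.supported);
  return missing.length === 0 ? { kind: "select" } : { kind: "reject", missing };
}
```

In this sketch the caller performs each returned request, updates the state from the response, and calls `nextStep` again. On `reject`, the caller would issue `POST /models/unload` and show the error toast, as in the `else invalid` branch of the diagram. Keeping the decision pure leaves the network calls and the polling interval outside the function, which makes the ordering easy to test.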