misc : prefer ggml-org models in docs and examples (#20827)

author ddh0 <redacted>

Sat, 21 Mar 2026 21:00:26 +0000 (16:00 -0500)

committer GitHub <redacted>

Sat, 21 Mar 2026 21:00:26 +0000 (22:00 +0100)
diff --git a/common/arg.cpp b/common/arg.cpp
index aad70ec5464825bc6615ac672c92ab0bc18d76b3..c6a2dcbf2d86e78076d6bbe7d897a94c62ab5452 100644
--- a/common/arg.cpp
+++ b/common/arg.cpp
@@ -2583,7 +2583,7 @@ common_params_context common_params_parser_init(common_params & params, llama_ex
          {"-hf", "-hfr", "--hf-repo"}, "<user>/<model>[:quant]",
          "Hugging Face model repository; quant is optional, case-insensitive, default to Q4_K_M, or falls back to the first file in the repo if Q4_K_M doesn't exist.\n"
          "mmproj is also downloaded automatically if available. to disable, add --no-mmproj\n"
-        "example: unsloth/phi-4-GGUF:q4_k_m\n"
+        "example: ggml-org/GLM-4.7-Flash-GGUF:Q4_K_M\n"
          "(default: unused)",
          [](common_params & params, const std::string & value) {
              params.model.hf_repo = value;
diff --git a/tools/cli/README.md b/tools/cli/README.md
index 22d3fc87e96421a2750e36d80038aa996586e374..c344cab2a8dcab97587bdd4c939c9fb540851a6b 100644
--- a/tools/cli/README.md
+++ b/tools/cli/README.md
@@ -83,7 +83,7 @@
  | `-m, --model FNAME` | model path to load<br/>(env: LLAMA_ARG_MODEL) |
  | `-mu, --model-url MODEL_URL` | model download url (default: unused)<br/>(env: LLAMA_ARG_MODEL_URL) |
  | `-dr, --docker-repo [<repo>/]<model>[:quant]` | Docker Hub model repository. repo is optional, default to ai/. quant is optional, default to :latest.<br/>example: gemma3<br/>(default: unused)<br/>(env: LLAMA_ARG_DOCKER_REPO) |
-| `-hf, -hfr, --hf-repo <user>/<model>[:quant]` | Hugging Face model repository; quant is optional, case-insensitive, default to Q4_K_M, or falls back to the first file in the repo if Q4_K_M doesn't exist.<br/>mmproj is also downloaded automatically if available. to disable, add --no-mmproj<br/>example: unsloth/phi-4-GGUF:q4_k_m<br/>(default: unused)<br/>(env: LLAMA_ARG_HF_REPO) |
+| `-hf, -hfr, --hf-repo <user>/<model>[:quant]` | Hugging Face model repository; quant is optional, case-insensitive, default to Q4_K_M, or falls back to the first file in the repo if Q4_K_M doesn't exist.<br/>mmproj is also downloaded automatically if available. to disable, add --no-mmproj<br/>example: ggml-org/GLM-4.7-Flash-GGUF:Q4_K_M<br/>(default: unused)<br/>(env: LLAMA_ARG_HF_REPO) |
  | `-hfd, -hfrd, --hf-repo-draft <user>/<model>[:quant]` | Same as --hf-repo, but for the draft model (default: unused)<br/>(env: LLAMA_ARG_HFD_REPO) |
  | `-hff, --hf-file FILE` | Hugging Face model file. If specified, it will override the quant in --hf-repo (default: unused)<br/>(env: LLAMA_ARG_HF_FILE) |
  | `-hfv, -hfrv, --hf-repo-v <user>/<model>[:quant]` | Hugging Face model repository for the vocoder model (default: unused)<br/>(env: LLAMA_ARG_HF_REPO_V) |
diff --git a/tools/completion/README.md b/tools/completion/README.md
index f868c2c7d7d8e07169b7a7fe335f72232908a231..b5eeba7334960b668c8375ef09ef89bcef00cc58 100644
--- a/tools/completion/README.md
+++ b/tools/completion/README.md
@@ -166,7 +166,7 @@ llama-completion.exe -m models\gemma-1.1-7b-it.Q4_K_M.gguf --ignore-eos -n -1
  | `-m, --model FNAME` | model path to load<br/>(env: LLAMA_ARG_MODEL) |
  | `-mu, --model-url MODEL_URL` | model download url (default: unused)<br/>(env: LLAMA_ARG_MODEL_URL) |
  | `-dr, --docker-repo [<repo>/]<model>[:quant]` | Docker Hub model repository. repo is optional, default to ai/. quant is optional, default to :latest.<br/>example: gemma3<br/>(default: unused)<br/>(env: LLAMA_ARG_DOCKER_REPO) |
-| `-hf, -hfr, --hf-repo <user>/<model>[:quant]` | Hugging Face model repository; quant is optional, case-insensitive, default to Q4_K_M, or falls back to the first file in the repo if Q4_K_M doesn't exist.<br/>mmproj is also downloaded automatically if available. to disable, add --no-mmproj<br/>example: unsloth/phi-4-GGUF:q4_k_m<br/>(default: unused)<br/>(env: LLAMA_ARG_HF_REPO) |
+| `-hf, -hfr, --hf-repo <user>/<model>[:quant]` | Hugging Face model repository; quant is optional, case-insensitive, default to Q4_K_M, or falls back to the first file in the repo if Q4_K_M doesn't exist.<br/>mmproj is also downloaded automatically if available. to disable, add --no-mmproj<br/>example: ggml-org/GLM-4.7-Flash-GGUF:Q4_K_M<br/>(default: unused)<br/>(env: LLAMA_ARG_HF_REPO) |
  | `-hfd, -hfrd, --hf-repo-draft <user>/<model>[:quant]` | Same as --hf-repo, but for the draft model (default: unused)<br/>(env: LLAMA_ARG_HFD_REPO) |
  | `-hff, --hf-file FILE` | Hugging Face model file. If specified, it will override the quant in --hf-repo (default: unused)<br/>(env: LLAMA_ARG_HF_FILE) |
  | `-hfv, -hfrv, --hf-repo-v <user>/<model>[:quant]` | Hugging Face model repository for the vocoder model (default: unused)<br/>(env: LLAMA_ARG_HF_REPO_V) |
diff --git a/tools/llama-bench/llama-bench.cpp b/tools/llama-bench/llama-bench.cpp
index b0f1d6b936c35c1217ecd92b678b33d1f8b3e5a3..21173576cc7b01cb618b24e9ab4dd2f87f346a73 100644
--- a/tools/llama-bench/llama-bench.cpp
+++ b/tools/llama-bench/llama-bench.cpp
@@ -418,7 +418,7 @@ static void print_usage(int /* argc */, char ** argv) {
      printf("  -m, --model <filename>                      (default: %s)\n", join(cmd_params_defaults.model, ",").c_str());
      printf("  -hf, -hfr, --hf-repo <user>/<model>[:quant] Hugging Face model repository; quant is optional, case-insensitive\n");
      printf("                                              default to Q4_K_M, or falls back to the first file in the repo if Q4_K_M doesn't exist.\n");
-    printf("                                              example: unsloth/phi-4-GGUF:Q4_K_M\n");
+    printf("                                              example: ggml-org/GLM-4.7-Flash-GGUF:Q4_K_M\n");
      printf("                                              (default: unused)\n");
      printf("  -hff, --hf-file <file>                      Hugging Face model file. If specified, it will override the quant in --hf-repo\n");
      printf("                                              (default: unused)\n");
diff --git a/tools/server/README.md b/tools/server/README.md
index df59e2d9b7e45f0973778fdc15c8048e1343b05d..554444d74bdc5ddb03e986ccb0f9e1ce43ba0fc2 100644
--- a/tools/server/README.md
+++ b/tools/server/README.md
@@ -100,7 +100,7 @@ For the full list of features, please refer to [server's changelog](https://gith
  | `-m, --model FNAME` | model path to load<br/>(env: LLAMA_ARG_MODEL) |
  | `-mu, --model-url MODEL_URL` | model download url (default: unused)<br/>(env: LLAMA_ARG_MODEL_URL) |
  | `-dr, --docker-repo [<repo>/]<model>[:quant]` | Docker Hub model repository. repo is optional, default to ai/. quant is optional, default to :latest.<br/>example: gemma3<br/>(default: unused)<br/>(env: LLAMA_ARG_DOCKER_REPO) |
-| `-hf, -hfr, --hf-repo <user>/<model>[:quant]` | Hugging Face model repository; quant is optional, case-insensitive, default to Q4_K_M, or falls back to the first file in the repo if Q4_K_M doesn't exist.<br/>mmproj is also downloaded automatically if available. to disable, add --no-mmproj<br/>example: unsloth/phi-4-GGUF:q4_k_m<br/>(default: unused)<br/>(env: LLAMA_ARG_HF_REPO) |
+| `-hf, -hfr, --hf-repo <user>/<model>[:quant]` | Hugging Face model repository; quant is optional, case-insensitive, default to Q4_K_M, or falls back to the first file in the repo if Q4_K_M doesn't exist.<br/>mmproj is also downloaded automatically if available. to disable, add --no-mmproj<br/>example: ggml-org/GLM-4.7-Flash-GGUF:Q4_K_M<br/>(default: unused)<br/>(env: LLAMA_ARG_HF_REPO) |
  | `-hfd, -hfrd, --hf-repo-draft <user>/<model>[:quant]` | Same as --hf-repo, but for the draft model (default: unused)<br/>(env: LLAMA_ARG_HFD_REPO) |
  | `-hff, --hf-file FILE` | Hugging Face model file. If specified, it will override the quant in --hf-repo (default: unused)<br/>(env: LLAMA_ARG_HF_FILE) |
  | `-hfv, -hfrv, --hf-repo-v <user>/<model>[:quant]` | Hugging Face model repository for the vocoder model (default: unused)<br/>(env: LLAMA_ARG_HF_REPO_V) |
diff --git a/tools/server/webui/src/lib/constants/settings-config.ts b/tools/server/webui/src/lib/constants/settings-config.ts
index 39aaf561bb79df2b633c901d3247e34a30f00f0c..ae9dd3ce8fd6fedd3e616a631b717b75900a7f5d 100644
--- a/tools/server/webui/src/lib/constants/settings-config.ts
+++ b/tools/server/webui/src/lib/constants/settings-config.ts
@@ -127,7 +127,7 @@ export const SETTING_CONFIG_INFO: Record<string, string> = {
         fullHeightCodeBlocks:
                 'Always display code blocks at their full natural height, overriding any height limits.',
         showRawModelNames:
-               'Display full raw model identifiers (e.g. "unsloth/Qwen3.5-27B-GGUF:BF16") instead of parsed names with badges.',
+               'Display full raw model identifiers (e.g. "ggml-org/GLM-4.7-Flash-GGUF:Q8_0") instead of parsed names with badges.',
         mcpServers:
                 'Configure MCP servers as a JSON list. Use the form in the MCP Client settings section to edit.',
         mcpServerUsageStats:
diff --git a/tools/server/webui/src/lib/stores/models.svelte.ts b/tools/server/webui/src/lib/stores/models.svelte.ts
index a6d7d6572ff2bc061fcd17d26b186232813ce9af..50c32034a62d5fb6f79c32f59c158947d2f58364 100644
--- a/tools/server/webui/src/lib/stores/models.svelte.ts
+++ b/tools/server/webui/src/lib/stores/models.svelte.ts
@@ -457,7 +457,7 @@ class ModelsStore {
  
         /**
          * Select a model by its model name (used for syncing with conversation model)
-        * @param modelName - Model name to select (e.g., "unsloth/gemma-3-12b-it-GGUF:latest")
+        * @param modelName - Model name to select (e.g., "ggml-org/GLM-4.7-Flash-GGUF")
          */
         selectModelByName(modelName: string): void {
                 const option = this.models.find((model) => model.model === modelName);
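
The `--hf-repo` help text touched by this commit describes a value of the form `<user>/<model>[:quant]`, where the quant tag is optional, case-insensitive, and defaults to Q4_K_M. The following is a minimal sketch of how such a value could be split, written only to illustrate that format. It is not the parser used by llama.cpp: `parseHfRepo`, `HfRepoSpec`, and `DEFAULT_QUANT` are hypothetical names, upper-casing the tag is an assumed way of modelling case-insensitivity, and the documented fallback to the first file in the repo is not modelled because it needs a repository listing.

```typescript
// Hypothetical illustration only; not part of llama.cpp or its web UI.
interface HfRepoSpec {
  user: string;
  model: string;
  quant: string; // upper-cased so that "q4_k_m" and "Q4_K_M" compare equal
}

// Default quant named in the --hf-repo help text.
const DEFAULT_QUANT = "Q4_K_M";

function parseHfRepo(value: string): HfRepoSpec {
  // Everything after the last ':' is treated as the optional quant tag.
  const colon = value.lastIndexOf(":");
  const repo = colon === -1 ? value : value.slice(0, colon);
  const tag = colon === -1 ? "" : value.slice(colon + 1);

  // The repo part must contain exactly one '/', with text on both sides.
  const slash = repo.indexOf("/");
  const hasSecondSlash = repo.indexOf("/", slash + 1) !== -1;
  if (slash <= 0 || slash === repo.length - 1 || hasSecondSlash) {
    throw new Error(`expected <user>/<model>[:quant], got "${value}"`);
  }

  return {
    user: repo.slice(0, slash),
    model: repo.slice(slash + 1),
    quant: (tag || DEFAULT_QUANT).toUpperCase(),
  };
}
```

Under these assumptions, `parseHfRepo("ggml-org/GLM-4.7-Flash-GGUF")` and `parseHfRepo("ggml-org/GLM-4.7-Flash-GGUF:q4_k_m")` yield the same quant, which matches the help text's description of an optional, case-insensitive tag.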