For consistency with other llama.cpp tools: accept `-ngl` in addition to `-n`/`--ngl`, and switch the example model to granite3-moe.
Signed-off-by: Eric Curtin <redacted>
The purpose of this example is to demonstrate a minimal usage of llama.cpp for running models.
```bash
-llama-run granite-code
+llama-run granite3-moe
```
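As a usage note, the model argument does not have to be a registry name; a local GGUF file should also work. A minimal sketch, assuming llama-run's `file://` scheme; the filename is a placeholder:

```bash
# Sketch: run against a local GGUF file instead of a registry model name,
# assuming llama-run's file:// scheme; the path below is a placeholder.
llama-run file://some-model.gguf
```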
```bash
-llama-run -h
Description:
  Runs an LLM

Options:
  -c, --context-size <value>
      Context size (default: 2048)
- -n, --ngl <value>
+ -n, -ngl, --ngl <value>
      Number of GPU layers (default: 0)
  --temp <value>
      Temperature (default: 0.8)
```
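For illustration, the options listed above can be combined in a single invocation; the values here are placeholders, and `-ngl` is the alias added by this change:

```bash
# Illustrative values only: a larger context window, 99 GPU layers via
# the new -ngl alias, and a lower sampling temperature.
llama-run -c 4096 -ngl 99 --temp 0.7 granite3-moe
```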