----
+## Quick start
+
+Getting started with `llama.cpp` is straightforward. Here are several ways to install it on your machine:
+
+- Install `llama.cpp` using [brew, nix or winget](docs/install.md)
+- Run with Docker - see our [Docker documentation](docs/docker.md)
+- Download pre-built binaries from the [releases page](https://github.com/ggml-org/llama.cpp/releases)
+- Build from source by cloning this repository - check out [our build guide](docs/build.md)
+
+Once installed, you'll need a model to work with. Head to the [Obtaining and quantizing models](#obtaining-and-quantizing-models) section to learn more.
+
+Example commands:
+
+```sh
+# Use a local model file
+llama-cli -m my_model.gguf
+
+# Or download and run a model directly from Hugging Face
+llama-cli -hf ggml-org/gemma-3-1b-it-GGUF
+
+# Launch OpenAI-compatible API server
+llama-server -hf ggml-org/gemma-3-1b-it-GGUF
+```
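+
+Once `llama-server` is running, you can talk to it through its OpenAI-compatible API. A minimal sketch, assuming the server is listening on its default address (`http://localhost:8080`):
+
+```sh
+# Send a chat completion request to the locally running server
+curl http://localhost:8080/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -d '{"messages": [{"role": "user", "content": "Hello!"}]}'
+```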
+
## Description
The main goal of `llama.cpp` is to enable LLM inference with minimal setup and state-of-the-art performance on a wide range of hardware - locally and in the cloud.
+
## Supported backends
| Backend | Target devices |
| --- | --- |
| [OpenCL](docs/backend/OPENCL.md) | Adreno GPU |
| [RPC](https://github.com/ggml-org/llama.cpp/tree/master/tools/rpc) | All |
-## Building the project
-
-The main product of this project is the `llama` library. Its C-style interface can be found in [include/llama.h](include/llama.h).
-The project also includes many example programs and tools using the `llama` library. The examples range from simple, minimal code snippets to sophisticated sub-projects such as an OpenAI-compatible HTTP server. Possible methods for obtaining the binaries:
-
-- Clone this repository and build locally, see [how to build](docs/build.md)
-- On MacOS or Linux, install `llama.cpp` via [brew, flox or nix](docs/install.md)
-- Use a Docker image, see [documentation for Docker](docs/docker.md)
-- Download pre-built binaries from [releases](https://github.com/ggml-org/llama.cpp/releases)
-
## Obtaining and quantizing models
The [Hugging Face](https://huggingface.co) platform hosts a [number of LLMs](https://huggingface.co/models?library=gguf&sort=trending) compatible with `llama.cpp`:
- [Trending](https://huggingface.co/models?library=gguf&sort=trending)
- [LLaMA](https://huggingface.co/models?sort=trending&search=llama+gguf)
-You can either manually download the GGUF file or directly use any `llama.cpp`-compatible models from [Hugging Face](https://huggingface.co/) or other model hosting sites, such as [ModelScope](https://modelscope.cn/), by using this CLI argument: `-hf <user>/<model>[:quant]`.
+You can either manually download the GGUF file or directly use any `llama.cpp`-compatible models from [Hugging Face](https://huggingface.co/) or other model hosting sites, such as [ModelScope](https://modelscope.cn/), by using this CLI argument: `-hf <user>/<model>[:quant]`. For example:
+
+```sh
+llama-cli -hf ggml-org/gemma-3-1b-it-GGUF
+```
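+
+The optional `:quant` suffix selects a specific quantization from the repository, assuming the repository publishes a file with that quantization tag:
+
+```sh
+# Request the Q8_0 quantization of the model (if the repo provides one)
+llama-cli -hf ggml-org/gemma-3-1b-it-GGUF:Q8_0
+```
+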
By default, the CLI downloads from Hugging Face; you can switch to another host with the `MODEL_ENDPOINT` environment variable. For example, to download model checkpoints from ModelScope or another model-sharing community instead, set `MODEL_ENDPOINT=https://www.modelscope.cn/`.
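+
+A sketch of combining the endpoint override with a model download in one invocation (substitute a real `<user>/<model>` path hosted on ModelScope):
+
+```sh
+# Download and run a model from ModelScope instead of Hugging Face
+MODEL_ENDPOINT=https://www.modelscope.cn/ llama-cli -hf <user>/<model>
+```
+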
# Install pre-built version of llama.cpp
-## Homebrew
+| Install via | Windows | Mac | Linux |
+|-------------|---------|-----|-------|
+| Winget | ✅ | | |
+| Homebrew | | ✅ | ✅ |
+| MacPorts | | ✅ | |
+| Nix | | ✅ | ✅ |
-On Mac and Linux, the homebrew package manager can be used via
+## Winget (Windows)
+
+```sh
+winget install llama.cpp
+```
+
+The package is automatically updated with new `llama.cpp` releases. More info: https://github.com/ggml-org/llama.cpp/issues/8188
+
+## Homebrew (Mac and Linux)
```sh
brew install llama.cpp
```
+
The formula is automatically updated with new `llama.cpp` releases. More info: https://github.com/ggml-org/llama.cpp/discussions/7668
-## MacPorts
+## MacPorts (Mac)
```sh
sudo port install llama.cpp
```
-see also: https://ports.macports.org/port/llama.cpp/details/
-## Nix
+See also: https://ports.macports.org/port/llama.cpp/details/
-On Mac and Linux, the Nix package manager can be used via
+## Nix (Mac and Linux)
```sh
nix profile install nixpkgs#llama-cpp
```
+
The command above is for flake-enabled installs. For non-flake installs, use `nix-env --file '<nixpkgs>' --install --attr llama-cpp` instead.
This expression is automatically updated within the [nixpkgs repo](https://github.com/NixOS/nixpkgs/blob/nixos-24.05/pkgs/by-name/ll/llama-cpp/package.nix#L164).
-
-## Flox
-
-On Mac and Linux, Flox can be used to install llama.cpp within a Flox environment via
-
-```sh
-flox install llama-cpp
-```
-
-Flox follows the nixpkgs build of llama.cpp.