* chore: add references to the quantisation space.
* fix grammar lol.
* Update README.md
Co-authored-by: Julien Chaumond <redacted>
* Update README.md
Co-authored-by: Georgi Gerganov <redacted>
---------
Co-authored-by: Julien Chaumond <redacted>
Co-authored-by: Georgi Gerganov <redacted>
### Prepare and Quantize
+> [!NOTE]
+> You can also use the [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space on Hugging Face to quantise your model weights without any setup. It is synced from `llama.cpp` main every 6 hours.
+
To obtain the official LLaMA 2 weights please see the <a href="#obtaining-and-using-the-facebook-llama-2-model">Obtaining and using the Facebook LLaMA 2 model</a> section. There is also a large selection of pre-quantized `gguf` models available on Hugging Face.
Note: `convert.py` does not support LLaMA 3; use `convert-hf-to-gguf.py` with LLaMA 3 weights downloaded from Hugging Face.
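For illustration only (the model directory and output path below are placeholders, not part of this change), an HF-to-GGUF conversion typically looks like:

```bash
# convert Hugging Face format weights to a GGUF file in FP16
# (directory and file names are placeholders; substitute your own download)
python3 convert-hf-to-gguf.py ./models/Meta-Llama-3-8B/ \
    --outtype f16 \
    --outfile ./models/Meta-Llama-3-8B/ggml-model-f16.gguf
```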
# quantize
-TODO
+You can also use the [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space on Hugging Face to build your own quants without any setup.
+
+Note: It is synced from `llama.cpp` main every 6 hours.
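For local quantization, a minimal sketch of the usual invocation (paths and the quantization type are placeholders) is:

```bash
# quantize an FP16 GGUF model down to 4-bit using the Q4_K_M method
# (paths are placeholders; run from your llama.cpp build directory)
./quantize ./models/mymodel/ggml-model-f16.gguf ./models/mymodel/ggml-model-Q4_K_M.gguf Q4_K_M
```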
## Llama 2 7B