readme : mention Metal could be used for gpt-2 (#553)

author Pierre Alexandre SCHEMBRI <redacted>

Sat, 7 Oct 2023 10:29:33 +0000 (12:29 +0200)

committer GitHub <redacted>

Sat, 7 Oct 2023 10:29:33 +0000 (13:29 +0300)
diff --git a/README.md b/README.md

index ebb262967fb28ee9d73e41edb5e635b04908bc23..5cfc8ba084f738d44628c727266b3406376dd580 100644
--- a/README.md
+++ b/README.md
@@ -105,6 +105,19 @@ The inference speeds that I get for the different models on my 32GB MacBook M1 P
  
  For more information, checkout the corresponding programs in the [examples](examples) folder.
  
+## Using Metal (only with GPT-2)
+
+For GPT-2 models, offloading to the GPU is possible. Note that this will not improve inference performance, but it will reduce power consumption and free up the CPU for other tasks.
+
+To enable GPU offloading on macOS:
+
+```bash
+cmake -DGGML_METAL=ON -DBUILD_SHARED_LIBS=Off ..
+
+# run with -ngl N (N >= 1) to offload layers to the GPU
+./bin/gpt-2 -t 4 -ngl 100 -m models/gpt-2-117M/ggml-model.bin -p "This is an example"
+```
+
  ## Using cuBLAS
  
  ```bash