stablelm : put warning about bug in the implementation

author Georgi Gerganov <redacted>

Thu, 27 Apr 2023 16:07:40 +0000 (19:07 +0300)

committer GitHub <redacted>

Thu, 27 Apr 2023 16:07:40 +0000 (19:07 +0300)
diff --git a/examples/stablelm/README.md b/examples/stablelm/README.md

index 7a70ca720830556c105345e8c11430b65aace9af..fa708ae044bbb50c038e1070254364fca7727936 100644
--- a/examples/stablelm/README.md
+++ b/examples/stablelm/README.md
@@ -4,6 +4,43 @@ Transformer architecture: GPT-NeoX
  
  Ref: https://github.com/stability-AI/stableLM/#stablelm-alpha
  
+## Warning
+
+**There seems to be a bug in the implementation.
+The magnitude of the embeddings increases after each layer, which is unexpected.
+To observe this, uncomment the following line:**
+
+https://github.com/ggerganov/ggml/blob/abea4b7609c14b837015ab625e3ac36c4708dd03/src/ggml.c#L9208
+
+```
+...
+p[  0] =  65.5842
+p[  1] =  61.6951
+p[  2] =  59.3500
+p[  3] =  61.2421
+p[  4] =  65.9653
+p[  5] =  59.4936
+p[  6] =  58.4164
+p[  0] = -209.6351
+p[  1] = -214.0987
+p[  2] = -217.0928
+p[  3] = -215.0267
+p[  4] = -208.2430
+p[  5] = -215.3692
+p[  6] = -214.1981
+p[  0] = -301.0286
+p[  1] = -308.6521
+p[  2] = -310.7513
+p[  3] = -307.0832
+p[  4] = -299.9238
+p[  5] = -306.0667
+p[  6] = -302.1777
+...
+```
+
+**Instead, the magnitude should remain around `1`.
+It is not yet clear where the bug is; the results need to be compared with the reference Python implementation.**
+
  ## Usage
  
  ```bash