- [X] Example of RWKV inference [saharNooby/rwkv.cpp](https://github.com/saharNooby/rwkv.cpp)
- [ ] Example of [SAM](https://github.com/facebookresearch/segment-anything) inference
- [ ] Idea for GPU support: https://github.com/ggerganov/llama.cpp/discussions/915
-- [X] Example of StableLM (GPTNeoX) inference [examples/stablelm](https://github.com/ggerganov/ggml/tree/master/examples/stablelm)
+- [X] Example of StableLM (GPT-NeoX) inference [examples/stablelm](https://github.com/ggerganov/ggml/tree/master/examples/stablelm)
## Whisper inference (example)
# StableLM
-Transformer architecture: GPTNeoX
+Transformer architecture: GPT-NeoX
Ref: https://github.com/stability-AI/stableLM/#stablelm-alpha
struct ggml_tensor * Kcur = ggml_cont(ctx0, ggml_view_3d(ctx0, cur, n_embd/n_head, n_head, N, cur->nb[1]/n_head, cur->nb[1], 1*sizeof(float)*n_embd/n_head));
struct ggml_tensor * Vcur = ggml_cont(ctx0, ggml_view_3d(ctx0, cur, n_embd/n_head, n_head, N, cur->nb[1]/n_head, cur->nb[1], 2*sizeof(float)*n_embd/n_head));
- // using mode = 2 for GPTNeoX mode
+ // using mode = 2 for GPT-NeoX mode
Qcur = ggml_rope(ctx0, Qcur, n_past, n_rot, 2);
Kcur = ggml_rope(ctx0, Kcur, n_past, n_rot, 2);
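For context (not part of the diff): the strided 3-D views above slice a fused QKV projection in which each head's q, k, and v blocks are stored back to back ([q_h | k_h | v_h]). Below is a minimal sketch of the equivalent flat indexing, assuming cur has shape [3*n_embd, N] of float; the helper name qkv_index is illustrative only.

#include <stddef.h>

// Sketch only: flat index into cur's data for element i of head h at token t,
// with s = 0 selecting Q, s = 1 selecting K, s = 2 selecting V.
// This mirrors the view offsets 0/1/2 * sizeof(float)*n_embd/n_head and the
// per-head stride cur->nb[1]/n_head used in the views above.
static size_t qkv_index(int n_embd, int n_head, int s, int i, int h, int t) {
    const int head_dim = n_embd/n_head;
    return (size_t) t*3*n_embd + (size_t) h*3*head_dim + (size_t) s*head_dim + i;
}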
// rotary position embedding
// in-place, returns view(a)
// if mode & 1 == 1, skip n_past elements
-// if mode & 2 == 1, GPTNeoX style
+// if mode & 2 == 1, GPT-NeoX style
// TODO: avoid creating a new tensor every time
struct ggml_tensor * ggml_rope(
        struct ggml_context * ctx,
        struct ggml_tensor  * a,
        int                   n_past,
        int                   n_dims,
        int                   mode);
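A hedged usage sketch of the mode flag documented above (Qcur, n_past and n_rot are taken from the StableLM example; the variable names q_default and q_neox are illustrative, not part of the diff):

// default rotary style (GPT-NeoX bit not set):
struct ggml_tensor * q_default = ggml_rope(ctx0, Qcur, n_past, n_rot, 0);
// GPT-NeoX style, as in the StableLM example above; or-in bit 0 (mode = 2 | 1)
// to also skip n_past elements, per the comments above:
struct ggml_tensor * q_neox    = ggml_rope(ctx0, Qcur, n_past, n_rot, 2);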