llama: add support for QRWKV6 model architecture (llama/11001)
* WIP: Add support for RWKV6Qwen2
Signed-off-by: Molly Sophia <redacted>
* RWKV: Some graph simplification
Signed-off-by: Molly Sophia <redacted>
* Add support for RWKV6Qwen2 with CPU and CUDA GLA
Signed-off-by: Molly Sophia <redacted>
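As a rough illustration of what the CPU/CUDA GLA kernels compute, here is a hypothetical toy sketch of a single gated linear attention step. The function name, shapes, and variable names are illustrative only, not the actual ggml/llama.cpp API.

```python
import numpy as np

def gla_step(state, k, v, q, g):
    # state: (d_k, d_v) running key-value memory carried across tokens
    # g: per-key decay gate in (0, 1); k, q: (d_k,); v: (d_v,)
    state = g[:, None] * state + np.outer(k, v)  # decay old memory, add new kv
    out = q @ state                              # read memory out with the query
    return out, state
```

The real kernels fuse this recurrence over the whole sequence and batch of heads; the sketch only shows the per-token arithmetic.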
* RWKV6[QWEN2]: Concat lerp weights together to reduce CPU overhead
Signed-off-by: Molly Sophia <redacted>
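The lerp-fusion idea can be sketched as follows: RWKV6 token-shift mixes the current and previous hidden state several times with different coefficient vectors, and stacking those coefficients turns many small ops into one broadcasted op. This is a minimal NumPy sketch under assumed shapes, not the ggml implementation.

```python
import numpy as np

def lerp_separate(x, x_prev, mus):
    # mus: list of (d,) mixing coefficient vectors, one per target (w, k, v, r, g, ...)
    # each lerp is launched as its own op -> per-op overhead scales with len(mus)
    return [x + mu * (x_prev - x) for mu in mus]

def lerp_fused(x, x_prev, mu_cat):
    # mu_cat: (n, d) stacked coefficients -> a single broadcasted op
    return x + mu_cat * (x_prev - x)
```

Both paths compute the same values; the fused form simply amortizes per-op dispatch cost, which is what the commit targets on CPU.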
* Fix some typos
Signed-off-by: Molly Sophia <redacted>
* Code format changes
Signed-off-by: Molly Sophia <redacted>
* Fix wkv test & add gla test
Signed-off-by: Molly Sophia <redacted>
* Fix CUDA warning
Signed-off-by: Molly Sophia <redacted>
* Update README.md
Signed-off-by: Molly Sophia <redacted>
* Update ggml/src/ggml-cuda/gla.cu
Co-authored-by: Georgi Gerganov <redacted>
* Fix fused lerp weights loading with RWKV6
Signed-off-by: Molly Sophia <redacted>
* Better sanity-check skipping for QRWKV6 in llama-quant
thanks @compilade
Signed-off-by: Molly Sophia <redacted>
Co-authored-by: compilade <redacted>
---------
Signed-off-by: Molly Sophia <redacted>
Co-authored-by: Georgi Gerganov <redacted>
Co-authored-by: compilade <redacted>