llama : support for `falcon-mamba` architecture (#9074)
author Younes Belkada <redacted>
Wed, 21 Aug 2024 08:06:36 +0000 (12:06 +0400)
committer GitHub <redacted>
Wed, 21 Aug 2024 08:06:36 +0000 (11:06 +0300)
commit b40eb84895bf723c7b327a1e3bf6e0e2c41877f8
tree 5c803992fcb116cff6bcb8cc1e6e398524028dbf
parent f63f603c879c2232eaeded8c0aeba4244471d720
llama : support for `falcon-mamba` architecture (#9074)

* feat: initial `falcon-mamba` support in llama.cpp

* fix: lint

* refactor: clean up the initial implementation

* Update src/llama.cpp

Co-authored-by: compilade <redacted>
* Update src/llama.cpp

Co-authored-by: compilade <redacted>
* fix: address comments

* Update convert_hf_to_gguf.py

Co-authored-by: compilade <redacted>
* fix: add more cleanup and harmonization

* fix: lint

* Update gguf-py/gguf/gguf_writer.py

Co-authored-by: compilade <redacted>
* fix: change name

* Apply suggestions from code review

Co-authored-by: compilade <redacted>
* add `in` operator

* fix: add `dt_b_c_rms` in `llm_load_print_meta`

* fix: correct printf format for bool

* fix: correct print format

* Update src/llama.cpp

Co-authored-by: compilade <redacted>
* llama : quantize more Mamba tensors

* llama : use f16 as the fallback of fallback quant types
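The idea behind "fallback of fallback" can be sketched as follows. Block-quantized types require tensor row sizes divisible by their block size; when the requested type does not fit, a simpler quantized type is tried, and F16 serves as the final fallback that always fits. This is a simplified illustration with made-up names (`qtype`, `compatible`, `pick_type`), not the actual llama.cpp quantization logic:

```cpp
#include <cstdint>

// Simplified stand-in for ggml tensor types (illustrative only).
enum class qtype { Q4_0, Q8_0, F16, F32 };

// Hypothetical helper: block-quantized types need the row size (ne0)
// to be a multiple of their block size (32 here); F16/F32 always fit.
static bool compatible(qtype t, int64_t ne0) {
    switch (t) {
        case qtype::Q4_0:
        case qtype::Q8_0: return ne0 % 32 == 0;
        default:          return true;
    }
}

// Pick the requested type if it fits, otherwise fall back to Q8_0,
// and finally to F16 -- the "fallback of fallback" quant type.
static qtype pick_type(qtype requested, int64_t ne0) {
    if (compatible(requested, ne0))   return requested;
    if (compatible(qtype::Q8_0, ne0)) return qtype::Q8_0;
    return qtype::F16;
}
```

Using F16 rather than F32 as the last resort keeps the fallback tensors at half the size while losing essentially no model quality.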

---------

Co-authored-by: compilade <redacted>
README.md
convert_hf_to_gguf.py
gguf-py/gguf/constants.py
gguf-py/gguf/gguf_writer.py
src/llama.cpp