Add Nemotron/Minitron GGUF Conversion & Inference Support (#8922)
author Yoshi Suhara <redacted>
Fri, 16 Aug 2024 02:23:33 +0000 (19:23 -0700)
committer GitHub <redacted>
Fri, 16 Aug 2024 02:23:33 +0000 (04:23 +0200)
commit 2a24c8caa6d10a7263ca317fa7cb64f0edc72aae
tree a6078942e3bb05dc2dce4fa12ec315b3ef066b87
parent e3f6fd56b1ab3c426596217d786910d641ae6ce0
Add Nemotron/Minitron GGUF Conversion & Inference Support (#8922)

* Add nemotron GGUF conversion & inference support

* Fix formatting issues

* Remove unnecessary write_tensors()

* Update convert_hf_to_gguf.py

Co-authored-by: compilade <redacted>
* Update src/llama.cpp

Co-authored-by: compilade <redacted>
* Address comments by @compilade

* Replace ggml_mul_mat()->llm_build_lora_mm()

* Remove mutable variable

* Use  for bias tensors

* Cover corner case for rope_scaling not in config.json
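The rope_scaling corner case above can be sketched as follows. This is a minimal illustration, not the actual convert_hf_to_gguf.py code; the `get_rope_factor` helper, the `hparams` dict shape, and the 1.0 default are assumptions made here for demonstration.

```python
# Minimal sketch of guarding against a missing "rope_scaling" key in a
# Hugging Face config.json. Illustrative only; not the exact
# convert_hf_to_gguf.py implementation.

def get_rope_factor(hparams: dict) -> float:
    """Return the RoPE scaling factor, defaulting to 1.0 (no scaling)
    when config.json omits "rope_scaling" or sets it to null."""
    rope_scaling = hparams.get("rope_scaling")
    if rope_scaling is None:
        # Corner case: key absent entirely, or present but null.
        return 1.0
    return float(rope_scaling.get("factor", 1.0))

# Key absent entirely
print(get_rope_factor({"hidden_size": 4096}))              # 1.0
# Explicit null, as some exported configs contain
print(get_rope_factor({"rope_scaling": None}))             # 1.0
# Present with a scaling factor
print(get_rope_factor({"rope_scaling": {"factor": 2.0}}))  # 2.0
```

Using `.get()` with a `None` check (rather than indexing `hparams["rope_scaling"]`) keeps the converter from raising `KeyError` on models, like some Nemotron/Minitron exports, whose config.json lacks the field.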

---------

Co-authored-by: compilade <redacted>
convert_hf_to_gguf.py
gguf-py/gguf/constants.py
gguf-py/gguf/tensor_mapping.py
src/llama.cpp