git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit

author	Joan Fontanals <redacted>
	Sat, 11 May 2024 07:46:09 +0000 (09:46 +0200)
committer	GitHub <redacted>
	Sat, 11 May 2024 07:46:09 +0000 (10:46 +0300)
commit	b83cc3f5b303ff30c52874b2d5864dc6385ebf9f
tree	912fe3b8df9eb7b39ae7f4321f080b924628336b
parent	9cb317f77e53067f7a138cc89ef7657148eae8e6

llama : add Jina Embeddings architecture (#6826)

* feat: first things to do

* feat: create tensors for Jina architecture

* fix: use other tensors

* feat: embedding gets results

* fix: fix usage of ALIBI

* fix: clean prints

* fix: clean up unused vars

* fix: revert changes to Makefile and CMakeLists

* fix: revert some changes

* fix: fix small detail

* fix: fix convert formatting

* fix: fix linting and editor

* feat: set proper vocab settings

* fix: JinaBertForMaskedLM registration

* feat: support q_normalization and k_normalization in Jina arch

* feat: handle gpt2 tokenizer with Jina architecture

* feat: example comments in embedding

* feat: rename Jina Bert to Jina Bert V2

* fix: add some changes as per review

* feat: proper KQ_pos for Jina embeddings

* feat: add capacity to load ES and DE models (Spanish and German)

* llama : fix pre-tokenizers

* ggml : full ALiBi support

* ggml : update ggml_soft_max_ext() CUDA, SYCL

* ggml : ggml_flash_attn_ext() support ALiBi (CPU)

* ggml : ggml_flash_attn_ext() support ALiBi (Metal)

* ggml : fix warning

* ggml : ggml_flash_attn_ext() support ALiBi (CUDA)

ggml-ci

* minor : clean-up

* embedding : add warning about missing SEP

---------

Co-authored-by: Georgi Gerganov <redacted>
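
Several entries above concern ALiBi (Attention with Linear Biases), which replaces positional embeddings with a per-head linear penalty added to the attention scores. The following is a minimal Python sketch of that idea, not the code from this commit. It assumes the slope scheme from the ALiBi paper with a maximum bias of 8.0 and a symmetric |i - j| distance, as would suit a bidirectional encoder. The function names `alibi_slopes` and `alibi_bias` are hypothetical.

```python
import math


def alibi_slopes(n_head, max_bias=8.0):
    """Per-head ALiBi slopes (illustrative; max_bias=8.0 is an assumption)."""
    # Largest power of two that does not exceed n_head.
    n_head_log2 = 2 ** math.floor(math.log2(n_head))
    m0 = 2.0 ** (-max_bias / n_head_log2)
    m1 = 2.0 ** (-(max_bias / 2.0) / n_head_log2)
    slopes = []
    for h in range(n_head):
        if h < n_head_log2:
            # Geometric sequence m0, m0^2, ..., for the first n_head_log2 heads.
            slopes.append(m0 ** (h + 1))
        else:
            # Remaining heads take odd powers of m1, interleaving the sequence.
            slopes.append(m1 ** (2 * (h - n_head_log2) + 1))
    return slopes


def alibi_bias(n_tokens, slope):
    """Bias matrix added to the attention scores of one head before softmax.

    Assumes a symmetric penalty -slope * |i - j| (bidirectional attention).
    """
    return [[-slope * abs(i - j) for j in range(n_tokens)]
            for i in range(n_tokens)]


if __name__ == "__main__":
    for h, s in enumerate(alibi_slopes(8)):
        print(f"head {h}: slope {s}")
    for row in alibi_bias(4, alibi_slopes(8)[0]):
        print(row)
```

Each head gets a different slope, so some heads attend mostly to nearby tokens while others keep a wider view. A causal decoder would apply the same penalty only to positions j <= i.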

convert-hf-to-gguf-update.py
convert-hf-to-gguf.py
examples/embedding/embedding.cpp
gguf-py/gguf/constants.py
gguf-py/gguf/tensor_mapping.py
llama.cpp