git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit

author	Andrew Godfrey <redacted>
	Wed, 1 Nov 2023 11:49:04 +0000 (04:49 -0700)
committer	GitHub <redacted>
	Wed, 1 Nov 2023 11:49:04 +0000 (13:49 +0200)
commit	73bdcb395ef9a997d9c02950c7cd4249546162cd
tree	9cace5e626d13541dda1798fbee2d74b57874952
parent	f0e209324a7f663225791897877bf610f1af152d

finetune : add -ngl parameter (#3762)

* Add '-ngl' support to finetune.cpp

* Add fprintf in ggml_cuda_op_add

When I tried CUDA offloading during finetuning, following the README, I hit an assert here.
This is probably not an important case, because inference later warns that you should use f16 or f32 instead when using LoRA.

* Add 'finetune.sh', which currently fails when using GPU

"error: operator (): Finetuning on tensors with type 'f16' is not yet supported"

* tweak finetune.sh

* Suppress some warnings in ggml.c

* Add f16 implementation to ggml_compute_forward_add_f16_f32

* Add an f16 case to ggml_add_cast_impl and llama_build_lora_finetune_graphs

* finetune.sh: Edit comments

* Add "add_f16_f32_f32_cuda"

* Tweak an error message

* finetune.sh: Add an optional LLAMA_MODEL_DIR variable

* finetune.sh: Add an optional LLAMA_TRAINING_DIR variable

* train : minor

* tabs to spaces
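
For readers unfamiliar with the mixed-type add that the f16 items above refer to ("Add f16 implementation to ggml_compute_forward_add_f16_f32", "Add "add_f16_f32_f32_cuda""), here is a minimal CPU-side sketch in C. It assumes the name reads as "f16 first operand, f32 second operand, f32 result"; that reading is an assumption, not something the commit message states. The names `half_bits_to_float` and `add_f16_f32_to_f32` are hypothetical and are not taken from the commit, which would rely on ggml's own conversion code rather than a hand-written decoder.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the commit): decode an IEEE 754 binary16
 * bit pattern into a 32-bit float. */
static float half_bits_to_float(uint16_t h) {
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp  = (uint32_t)(h >> 10) & 0x1Fu;
    uint32_t man  = (uint32_t)h & 0x3FFu;
    uint32_t bits;

    if (exp == 0) {
        if (man == 0) {
            bits = sign;                            /* signed zero */
        } else {
            /* Subnormal half: shift until the implicit bit appears. */
            int shift = 0;
            while ((man & 0x400u) == 0) {
                man <<= 1;
                shift++;
            }
            man &= 0x3FFu;
            bits = sign | ((uint32_t)(127 - 15 - shift + 1) << 23) | (man << 13);
        }
    } else if (exp == 0x1Fu) {
        bits = sign | 0x7F800000u | (man << 13);    /* infinity or NaN */
    } else {
        bits = sign | ((exp + (127 - 15)) << 23) | (man << 13);
    }

    float f;
    memcpy(&f, &bits, sizeof f);                    /* bit cast without aliasing issues */
    return f;
}

/* Hypothetical element-wise add: f16 input x, f32 input y, f32 output dst.
 * Each half value is widened to float before the addition. */
static void add_f16_f32_to_f32(const uint16_t *x, const float *y,
                               float *dst, size_t n) {
    for (size_t i = 0; i < n; i++) {
        dst[i] = half_bits_to_float(x[i]) + y[i];
    }
}
```

A CUDA version of the same idea would assign one element per thread instead of looping, but the per-element arithmetic is the same: widen the f16 value, add the f32 value, store an f32 result.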

---------

Co-authored-by: Georgi Gerganov <redacted>
Co-authored-by: cebtenzzre <redacted>

common/train.cpp
common/train.h
examples/finetune/finetune.cpp
examples/finetune/finetune.sh	[new file with mode: 0644]
ggml-cuda.cu
ggml-quants.c
ggml.c
llama.cpp