git.djapps.eu Git - pkg/ggml/sources/ggml/log

Georgi Gerganov [Sun, 23 Apr 2023 13:38:00 +0000 (16:38 +0300)]

ggml : sync llama.cpp (AVX improvements)

Georgi Gerganov [Sat, 22 Apr 2023 13:34:39 +0000 (16:34 +0300)]

ggml : fix Q4_3 cuBLAS + fix quantize_row_q4_2()

Georgi Gerganov [Sat, 22 Apr 2023 12:49:15 +0000 (15:49 +0300)]

examples : refactor quantization tools

Georgi Gerganov [Sat, 22 Apr 2023 11:59:42 +0000 (14:59 +0300)]

examples : utils -> common

Georgi Gerganov [Sat, 22 Apr 2023 10:59:49 +0000 (13:59 +0300)]

ggml : fix ARM build

Georgi Gerganov [Sat, 22 Apr 2023 10:23:20 +0000 (13:23 +0300)]

cmake : add CMake support for cuBLAS (#101)

* cmake : add cuBLAS support

* cmake : fix cuBLAS build

Georgi Gerganov [Sat, 22 Apr 2023 09:52:25 +0000 (12:52 +0300)]

examples : add Q4_2 and Q4_3 quantization support

Georgi Gerganov [Sat, 22 Apr 2023 09:36:42 +0000 (12:36 +0300)]

ggml : sync llama.cpp (Q4_3 + CUDA)

Bart Pelle [Thu, 20 Apr 2023 21:15:45 +0000 (23:15 +0200)]

mnist : add missing header (#95)

Georgi Gerganov [Thu, 20 Apr 2023 20:35:52 +0000 (23:35 +0300)]

stablelm : update README.md

Georgi Gerganov [Thu, 20 Apr 2023 20:23:07 +0000 (23:23 +0300)]

minor : fix GPT-NeoX name

Georgi Gerganov [Thu, 20 Apr 2023 20:21:38 +0000 (23:21 +0300)]

readme : add StableLM reference

Georgi Gerganov [Thu, 20 Apr 2023 20:20:38 +0000 (23:20 +0300)]

examples : add StableLM example (#96)

* ggml : there is a bug in ggml_cpy() F32 -> F32

Cannot see why, but multi-thread does not work

* stablelm : initial implementation, but QKV seems broken

* stablelm : make it work

* stablelm : use original merged QKV matrix

* stablelm : minor

* stablelm : instructions

* stablelm : update README.md

Georgi Gerganov [Thu, 20 Apr 2023 19:00:49 +0000 (22:00 +0300)]

ggml : sync llama.cpp (cuBLAS, Q4_3, bug fix, etc)

Georgi Gerganov [Wed, 19 Apr 2023 17:20:23 +0000 (20:20 +0300)]

ggml : sync llama.cpp

Georgi Gerganov [Sat, 15 Apr 2023 19:23:10 +0000 (22:23 +0300)]

examples : update huggingface links

Georgi Gerganov [Sat, 15 Apr 2023 16:50:54 +0000 (19:50 +0300)]

ggml : sync llama.cpp

Georgi Gerganov [Sat, 15 Apr 2023 11:25:34 +0000 (14:25 +0300)]

ggml : add ggml_type_name()

Georgi Gerganov [Sat, 15 Apr 2023 11:23:26 +0000 (14:23 +0300)]

ggml : use posix_memalign on non-Windows env

Georgi Gerganov [Fri, 14 Apr 2023 14:45:54 +0000 (17:45 +0300)]

ggml : add unary and binary map operations

Georgi Gerganov [Fri, 14 Apr 2023 10:32:27 +0000 (13:32 +0300)]

ggml : avoid powf() calls in ggml_rope()

Georgi Gerganov [Fri, 14 Apr 2023 10:32:12 +0000 (13:32 +0300)]

ggml : fix ARM NEON dot product types

Georgi Gerganov [Thu, 13 Apr 2023 21:02:31 +0000 (00:02 +0300)]

mnist : update README

Georgi Gerganov [Thu, 13 Apr 2023 21:00:42 +0000 (00:00 +0300)]

mnist : minor fixes and adjustments

Ray Cromwell [Thu, 13 Apr 2023 20:49:45 +0000 (13:49 -0700)]

examples : MNIST example for ggml (#84)

Georgi Gerganov [Thu, 13 Apr 2023 15:37:19 +0000 (18:37 +0300)]

ggml : sync latest changes from llama.cpp

Jakob Frick [Thu, 13 Apr 2023 12:41:53 +0000 (14:41 +0200)]

gpt-2 : typo fix for the Cerebras instructions (#57)

Georgi Gerganov [Thu, 13 Apr 2023 12:40:33 +0000 (15:40 +0300)]

ggml : add GGML_DEFAULT_N_THREADS

LostRuins [Thu, 13 Apr 2023 12:27:56 +0000 (20:27 +0800)]

gpt : fix pytorch converter text encodings (#78)

* Fixed quantization of f16 models: the f16 tables were not initialized, so the f16 to f32 conversion was failing.

* In some situations, the script fails with the error: UnicodeDecodeError: 'charmap' codec can't decode byte (byte) in position (number) : character maps to <undefined>
This is probably because the encodings are incorrect.
Explicitly specifying them as UTF-8 seems to resolve the issue and allow for correct conversion.

---------

Co-authored-by: Georgi Gerganov <redacted>

Georgi Gerganov [Wed, 12 Apr 2023 15:59:41 +0000 (18:59 +0300)]

readme : update roadmap

Georgi Gerganov [Tue, 11 Apr 2023 18:33:17 +0000 (21:33 +0300)]

gpt-j : update inference to match latest llama.cpp insights

- Use F16 KV cache
- Store transposed V in the cache
- Avoid unnecessary Q copy

Georgi Gerganov [Mon, 10 Apr 2023 20:21:11 +0000 (23:21 +0300)]

ggml : fix <windows.h> include

Georgi Gerganov [Mon, 10 Apr 2023 20:19:15 +0000 (23:19 +0300)]

ggml : fix WASM build

Georgi Gerganov [Mon, 10 Apr 2023 19:39:24 +0000 (22:39 +0300)]

whisper : sync with whisper.cpp

Georgi Gerganov [Mon, 10 Apr 2023 19:39:07 +0000 (22:39 +0300)]

ggml : optimize ggml_cpy() for contiguous dst

Georgi Gerganov [Mon, 10 Apr 2023 16:36:06 +0000 (19:36 +0300)]

ggml : sync with llama.cpp

- int64_t number of elements
- remove mlock
- expose quantization functions
- expose ggml_object
- add ggml_view_3d()
- multi-thread ggml_rope()
- fix ggml_cpy()
- add ggml_init_params.no_alloc
- fix ggml_mul_mat() backward

LostRuins [Mon, 10 Apr 2023 07:47:47 +0000 (15:47 +0800)]

gpt : initialize f16 tables during quantization (#77)

Georgi Gerganov [Fri, 7 Apr 2023 18:21:33 +0000 (21:21 +0300)]

readme : update Roadmap (add rwkv.cpp)

Georgi Gerganov [Thu, 30 Mar 2023 21:37:37 +0000 (00:37 +0300)]

gpt-2 : minor update readme

Georgi Gerganov [Thu, 30 Mar 2023 21:34:14 +0000 (00:34 +0300)]

gpt-2 : fix quantize tool to quantize the "lm_head" tensor

Georgi Gerganov [Thu, 30 Mar 2023 20:39:15 +0000 (23:39 +0300)]

gpt-2 : add Cerebras-GPT example

Supreet Sethi [Thu, 30 Mar 2023 17:25:29 +0000 (01:25 +0800)]

ggml : fix NEON sign types (#51)

Cordeiro [Wed, 29 Mar 2023 20:39:27 +0000 (15:39 -0500)]

gpt-2 : convert h5 to ggml (#35)

* Script to convert h5 to ggml adapted from gpt-j example

* Fix map tensors

* optimize

* rename headers to keep compatibility

* revert gpt-2/main.cpp

---------

Co-authored-by: Alan <redacted>
Co-authored-by: Alan <redacted>
Co-authored-by: ocordeiro <redacted>

Georgi Gerganov [Wed, 29 Mar 2023 19:23:14 +0000 (22:23 +0300)]

readme : update Roadmap

Georgi Gerganov [Wed, 29 Mar 2023 19:21:36 +0000 (22:21 +0300)]

ggml : 4-bit Integer quantisation + many llama.cpp improvements (#27)

* gq : attempt at n-bit quantization

* gq : add amax based method 3

* gq : progress on method 2

* gq : method 4 (AVX2)

* gq : method 4 (ARM)

* gq : method 4 (AVX2 attempt) + method 5 (no min)

* gq : method 5 (ARM)

* gpt-2 : model conversion for Q4_0 quantization

* ggml : Q4_0 quantization support (ggml_get_rows())

* gpt-2 : loading Q4_0 quantized model

* ggml : q4_0 quantization support

* ggml : q4_1 quantization support (seems to work for bigger models)

* gpt-2 : add gpt-2-quantize tool for quantizing f32 GPT-2 models

* ggml : 4-bit quantization works (only scalar for now)

* gq : add method 6 (ARM)

* ggml : vectorized mad q4_0 (ARM)

* ggml : vectorized quantize_row_q4_0 (ARM)

* ggml : simplify mad q4_0 (ARM)

* ggml : minor indentations

* gpt-j : support for 4-bit quantized model inference

* ggml : GGML_ASSERT() instead of assert() where appropriate

* gpt : avoid ggml_transpose on model tensors (new models!)

* gpt-2 : minor

* gpt-j : fix conversion for FP16 models (such as GPT-JT-6B)

* ggml : add ggml_compute_forward_rope_f16()

* gpt : fix memory usage computation

* ggml : fix ggml_is_contiguous() to take into account blck size

* whisper : add whisper-quantize tool

* whisper : add support for quantized models

* whisper : mem usage based on model format type

* gpt : using FP16 for the KV cache does not seem worthwhile

* gpt : support quantisation of f16 model files

* ggml : fixes for rpi4

* whisper : add Q4_1 model sizes

* ggml : add WASM SIMD for Q4_0

* utils : print quantization histograms

* ggml : sync all changes from llama.cpp and whisper.cpp

* ggml : finalize the Q4_1 quantization for ARM_NEON

MaiHD [Sat, 25 Mar 2023 20:43:24 +0000 (03:43 +0700)]

ggml : make it work on Windows (#46)

Georgi Gerganov [Sat, 25 Mar 2023 14:32:48 +0000 (16:32 +0200)]

tests : add test-blas0

Georgi Gerganov [Wed, 22 Mar 2023 19:52:32 +0000 (21:52 +0200)]

Fix CMake indentation

katsu560 [Wed, 22 Mar 2023 19:51:47 +0000 (04:51 +0900)]

add OpenBLAS detection and modify test code (#40)

* fix indents and commands for Haiku, and add OpenBLAS detection in src/CMakeLists.txt

* add system detection and add OpenBLAS detection

* set the loop count via the GGML_NLOOP environment variable or a command line option

* change fmadd code on systems without FMA support

* set n_threads via the GGML_NTHREADS environment variable or a command line option

---------

Co-authored-by: Georgi Gerganov <redacted>

Alex von Gluck IV [Wed, 22 Mar 2023 19:43:58 +0000 (14:43 -0500)]

CMakeLists: Fix Haiku CPU detection (#39)

hidenorly [Wed, 22 Mar 2023 19:43:22 +0000 (04:43 +0900)]

Add pipe input for prompt on gpt examples (#38)

Enable prompt input through a pipe, instead of using the -p option.
This makes it easier to supply longer, multi-line prompts.

Test:
$ echo "This is an example" > prompt.txt
$ cat prompt.txt | ./bin/gpt-j -m models/gpt-j-6B/ggml-model.bin
$ cat prompt.txt | ./bin/gpt-2 -m models/gpt-2-117M/ggml-model.bin

Note that both the -p option and the case where no -p is specified still work.
$ ./bin/gpt-j -m models/gpt-j-6B/ggml-model.bin -p "This is an example"
$ ./bin/gpt-j -m models/gpt-j-6B/ggml-model.bin
$ ./bin/gpt-2 -m models/gpt-2-117M/ggml-model.bin -p "This is an example"
$ ./bin/gpt-2 -m models/gpt-2-117M/ggml-model.bin
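
The prompt selection described above could be sketched roughly as follows. This is only an illustration and not the code from the commit: the helper names `read_piped_prompt` and `choose_prompt` are hypothetical, and the rule "an explicit -p wins, then piped input, then the built-in default" is an assumption inferred from the test commands.

```cpp
#include <istream>
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical helper: read the whole stream and drop one trailing newline,
// so `echo "text" | ./bin/gpt-2 ...` yields "text" rather than "text\n".
std::string read_piped_prompt(std::istream & in) {
    std::string prompt((std::istreambuf_iterator<char>(in)),
                       std::istreambuf_iterator<char>());
    if (!prompt.empty() && prompt.back() == '\n') {
        prompt.pop_back();
    }
    return prompt;
}

// Hypothetical selection rule (an assumption, not the commit's logic):
//   1. a non-empty -p value is used as-is;
//   2. otherwise, if stdin is not a terminal, the piped text is used;
//   3. otherwise an empty string is returned and the caller keeps its default.
std::string choose_prompt(const std::string & cli_prompt,
                          bool stdin_is_terminal,
                          std::istream & in) {
    if (!cli_prompt.empty()) {
        return cli_prompt;
    }
    if (!stdin_is_terminal) {
        return read_piped_prompt(in);
    }
    return "";
}
```

On a POSIX system the caller might pass `isatty(fileno(stdin)) != 0` as `stdin_is_terminal` and `std::cin` as the stream. Taking the stream as a parameter keeps the selection logic testable without a real pipe.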

katsu560 [Mon, 6 Mar 2023 17:52:16 +0000 (02:52 +0900)]

cmake : update CMakeLists.txt to add correct flags (#26)

* modify src/CMakeLists.txt from whisper.cpp

* cmake : remove OpenBLAS stuff

---------

Co-authored-by: Georgi Gerganov <redacted>

Georgi Gerganov [Mon, 6 Mar 2023 05:40:55 +0000 (07:40 +0200)]

readme : update Roadmap

Georgi Gerganov [Sun, 5 Mar 2023 16:02:27 +0000 (18:02 +0200)]

readme : add Roadmap section

Georgi Gerganov [Sun, 26 Feb 2023 19:10:50 +0000 (21:10 +0200)]

sync : latest whisper.cpp

Georgi Gerganov [Tue, 21 Feb 2023 20:16:56 +0000 (22:16 +0200)]

tests : fix cblas_sgemm call

Georgi Gerganov [Sat, 18 Feb 2023 14:05:31 +0000 (16:05 +0200)]

tests : add SVD experiments

Georgi Gerganov [Wed, 15 Feb 2023 18:59:36 +0000 (20:59 +0200)]

sync : latest whisper.cpp (scratch buffers in ggml)

Georgi Gerganov [Fri, 20 Jan 2023 06:45:45 +0000 (08:45 +0200)]

Update README.md

Takuya Takeuchi [Sun, 15 Jan 2023 14:30:13 +0000 (23:30 +0900)]

cmake : configure CMAKE_C_FLAGS and target_link_libraries for MSVC (#15)

Georgi Gerganov [Sun, 15 Jan 2023 13:53:08 +0000 (15:53 +0200)]

gpt : fix sampling to use the temperature (close #16)

Georgi Gerganov [Sun, 15 Jan 2023 13:09:36 +0000 (15:09 +0200)]

ggml : sync latest whisper.cpp

Georgi Gerganov [Sun, 8 Jan 2023 18:28:38 +0000 (20:28 +0200)]

gpt-2 : fix broken prompt due to recent experiments

No idea why I committed that!?

Georgi Gerganov [Sun, 8 Jan 2023 18:23:01 +0000 (20:23 +0200)]

ggml : sync latest whisper.cpp

Georgi Gerganov [Sat, 7 Jan 2023 19:05:33 +0000 (21:05 +0200)]

cmake : disable warnings about unused functions

Georgi Gerganov [Sat, 7 Jan 2023 19:04:24 +0000 (21:04 +0200)]

ggml : bugfix in new soft max computation

Georgi Gerganov [Sat, 7 Jan 2023 18:00:25 +0000 (20:00 +0200)]

tests : change test2 eps

Georgi Gerganov [Sat, 7 Jan 2023 17:53:05 +0000 (19:53 +0200)]

ggml : sync with latest whisper.cpp

Georgi Gerganov [Sat, 7 Jan 2023 10:17:34 +0000 (12:17 +0200)]

tests : some more quantization experiments

Georgi Gerganov [Sat, 7 Jan 2023 07:43:02 +0000 (09:43 +0200)]

sync : forgot to sync ggml.h

Georgi Gerganov [Sat, 7 Jan 2023 07:39:12 +0000 (09:39 +0200)]

sync : latest changes from whisper.cpp

Georgi Gerganov [Sat, 7 Jan 2023 07:36:32 +0000 (09:36 +0200)]

tests : wip quantized matrix multiplication method 2

Georgi Gerganov [Sat, 7 Jan 2023 07:31:42 +0000 (09:31 +0200)]

tests : minor fixes for x86

Georgi Gerganov [Thu, 5 Jan 2023 19:05:41 +0000 (21:05 +0200)]

tests : experiments with n-bit quantized matrix multiplication

Georgi Gerganov [Sat, 31 Dec 2022 10:32:04 +0000 (12:32 +0200)]

sync : latest changes from whisper.cpp

Georgi Gerganov [Sat, 31 Dec 2022 10:29:52 +0000 (12:29 +0200)]

gpt-2 : experimenting with attention mask

Georgi Gerganov [Sat, 31 Dec 2022 10:29:30 +0000 (12:29 +0200)]

gpt-2 : fix off-by-one error in batching logic

Georgi Gerganov [Mon, 12 Dec 2022 21:49:12 +0000 (23:49 +0200)]

examples : redirect download scripts to HF

Georgi Gerganov [Sun, 4 Dec 2022 16:33:14 +0000 (18:33 +0200)]

gpt : add support for gpt-jt + fix unicode support

Georgi Gerganov [Sun, 4 Dec 2022 09:06:13 +0000 (11:06 +0200)]

ggml : sync with latest code from whisper.cpp

Georgi Gerganov [Wed, 9 Nov 2022 19:43:03 +0000 (21:43 +0200)]

sync : latest changes from whisper.cpp

- Documentation
- whisper : token-level timestamps
- ggml : Windows build fixes
- etc.

Georgi Gerganov [Tue, 1 Nov 2022 20:15:22 +0000 (22:15 +0200)]

Update README.md

Georgi Gerganov [Tue, 1 Nov 2022 20:13:15 +0000 (22:13 +0200)]

sync : latest changes from whisper.cpp

Georgi Gerganov [Tue, 18 Oct 2022 18:14:27 +0000 (21:14 +0300)]

whisper : fix timestamp sampling

Georgi Gerganov [Tue, 18 Oct 2022 16:12:07 +0000 (19:12 +0300)]

sync : whisper.cpp

- Add MSVC header
- FP16 GELU
- C interface fixes (no unions)
- Minor CMake updates

Georgi Gerganov [Mon, 17 Oct 2022 20:54:35 +0000 (23:54 +0300)]

sync : whisper.cpp

Georgi Gerganov [Mon, 17 Oct 2022 18:31:23 +0000 (21:31 +0300)]

Minor fixes

Georgi Gerganov [Mon, 17 Oct 2022 18:20:33 +0000 (21:20 +0300)]

Improve mul_mat performance for big matrices using Accelerate framework

Also:

- Speedup GELU operator via F16 cast
- Multi-thread NORM operator
- Disable FLASH_FF in whisper example

Georgi Gerganov [Mon, 17 Oct 2022 18:17:13 +0000 (21:17 +0300)]

Performance tests - trying to optimize mul_mat

Georgi Gerganov [Thu, 13 Oct 2022 19:18:46 +0000 (22:18 +0300)]

sync : whisper.cpp

Georgi Gerganov [Sat, 8 Oct 2022 15:15:22 +0000 (18:15 +0300)]

whisper : sync with whisper.cpp

Georgi Gerganov [Wed, 5 Oct 2022 20:15:10 +0000 (23:15 +0300)]

whisper : various improvements