git.djapps.eu Git - pkg/ggml/sources/ggml/log
Georgi Gerganov [Tue, 2 May 2023 17:23:16 +0000 (20:23 +0300)]
ggml : sync llama.cpp (clBLAST support + tensor names)
Georgi Gerganov [Mon, 1 May 2023 07:13:59 +0000 (10:13 +0300)]
ggml : temp comment
Georgi Gerganov [Sun, 30 Apr 2023 19:28:14 +0000 (22:28 +0300)]
ggml : fix UB (int << 31)
Georgi Gerganov [Sun, 30 Apr 2023 16:03:35 +0000 (19:03 +0300)]
ggml, whisper : sync whisper.cpp (GGML_FTYPE + Q5 WASM SIMD)
Georgi Gerganov [Sun, 30 Apr 2023 07:25:13 +0000 (10:25 +0300)]
ggml : fix labels for GGML_OP_ALIBI
Georgi Gerganov [Sat, 29 Apr 2023 18:33:59 +0000 (21:33 +0300)]
ggml : fix 32-bit ARM NEON
Georgi Gerganov [Sat, 29 Apr 2023 18:13:40 +0000 (21:13 +0300)]
ggml : use vzip instead of vuzp for consistency
Georgi Gerganov [Sat, 29 Apr 2023 16:13:53 +0000 (19:13 +0300)]
ggml : fix SHARED build
Georgi Gerganov [Sat, 29 Apr 2023 16:07:19 +0000 (19:07 +0300)]
ggml : sync llama.cpp (less memory for mul_mat f16 + asserts)
Georgi Gerganov [Sat, 29 Apr 2023 09:33:57 +0000 (12:33 +0300)]
scripts : add sync-whisper.sh
Georgi Gerganov [Sat, 29 Apr 2023 07:30:56 +0000 (10:30 +0300)]
common : forgot to remove Q4_3 references
Georgi Gerganov [Sat, 29 Apr 2023 07:03:59 +0000 (10:03 +0300)]
ggml : remove Q4_3
Georgi Gerganov [Fri, 28 Apr 2023 17:47:27 +0000 (20:47 +0300)]
ggml : ggml_alibi() fixes (#113)
Dan Forbes [Fri, 28 Apr 2023 17:37:07 +0000 (10:37 -0700)]
ggml : add ggml_alibi (positional embedding) (#113)
Co-authored-by: @hhamud <redacted>
Georgi Gerganov [Fri, 28 Apr 2023 17:34:38 +0000 (20:34 +0300)]
ggml : sync llama.cpp (CLBlast)
Georgi Gerganov [Fri, 28 Apr 2023 17:33:44 +0000 (20:33 +0300)]
gitignore : add python env folders
Santtu Keskinen [Fri, 28 Apr 2023 04:25:11 +0000 (07:25 +0300)]
readme : add bert.cpp link (#114)
Georgi Gerganov [Thu, 27 Apr 2023 16:07:40 +0000 (19:07 +0300)]
stablelm : put warning about bug in the implementation
Georgi Gerganov [Thu, 27 Apr 2023 15:31:53 +0000 (18:31 +0300)]
ggml : sync llama.cpp (Q5_0 + Q5_1) + refactor examples quantization
Georgi Gerganov [Mon, 24 Apr 2023 15:52:25 +0000 (18:52 +0300)]
ggml : sync llama.cpp (fix GCC 8 build, close #99)
Georgi Gerganov [Sun, 23 Apr 2023 17:04:03 +0000 (20:04 +0300)]
ggml : indentation
Georgi Gerganov [Sun, 23 Apr 2023 16:57:37 +0000 (19:57 +0300)]
ggml : add GGML_API for exporting shared symbols
Georgi Gerganov [Sun, 23 Apr 2023 16:45:39 +0000 (19:45 +0300)]
ggml : better PERF prints
le.chang [Sun, 23 Apr 2023 16:12:49 +0000 (00:12 +0800)]
tests : fix compile error (#98)
appvoid [Sun, 23 Apr 2023 16:11:33 +0000 (12:11 -0400)]
gpt-2 : remove GPT-J unnecessary import (#91)
AsukaMinato [Sun, 23 Apr 2023 15:03:52 +0000 (00:03 +0900)]
tests : remove type cast (#100)
Georgi Gerganov [Sun, 23 Apr 2023 13:38:00 +0000 (16:38 +0300)]
ggml : sync llama.cpp (AVX improvements)
Georgi Gerganov [Sat, 22 Apr 2023 13:34:39 +0000 (16:34 +0300)]
ggml : fix Q4_3 cuBLAS + fix quantize_row_q4_2()
Georgi Gerganov [Sat, 22 Apr 2023 12:49:15 +0000 (15:49 +0300)]
examples : refactor quantization tools
Georgi Gerganov [Sat, 22 Apr 2023 11:59:42 +0000 (14:59 +0300)]
examples : utils -> common
Georgi Gerganov [Sat, 22 Apr 2023 10:59:49 +0000 (13:59 +0300)]
ggml : fix ARM build
Georgi Gerganov [Sat, 22 Apr 2023 10:23:20 +0000 (13:23 +0300)]
cmake : add CMake support for cuBLAS (#101)
* cmake : add cuBLAS support
* cmake : fix cuBLAS build
Georgi Gerganov [Sat, 22 Apr 2023 09:52:25 +0000 (12:52 +0300)]
examples : add Q4_2 and Q4_3 quantization support
Georgi Gerganov [Sat, 22 Apr 2023 09:36:42 +0000 (12:36 +0300)]
ggml : sync llama.cpp (Q4_3 + CUDA)
Bart Pelle [Thu, 20 Apr 2023 21:15:45 +0000 (23:15 +0200)]
mnist : add missing header (#95)
Georgi Gerganov [Thu, 20 Apr 2023 20:35:52 +0000 (23:35 +0300)]
stablelm : update README.md
Georgi Gerganov [Thu, 20 Apr 2023 20:23:07 +0000 (23:23 +0300)]
minor : fix GPT-NeoX name
Georgi Gerganov [Thu, 20 Apr 2023 20:21:38 +0000 (23:21 +0300)]
readme : add StableLM reference
Georgi Gerganov [Thu, 20 Apr 2023 20:20:38 +0000 (23:20 +0300)]
examples : add StableLM example (#96)
* ggml : there is a bug in ggml_cpy() F32 -> F32
Cannot see why, but multi-threading does not work
* stablelm : initial implementation, but QKV seems broken
* stablelm : make it work
* stablelm : use original merged QKV matrix
* stablelm : minor
* stablelm : instructions
* stablelm : update README.md
Georgi Gerganov [Thu, 20 Apr 2023 19:00:49 +0000 (22:00 +0300)]
ggml : sync llama.cpp (cuBLAS, Q4_3, bug fix, etc)
Georgi Gerganov [Wed, 19 Apr 2023 17:20:23 +0000 (20:20 +0300)]
ggml : sync llama.cpp
Georgi Gerganov [Sat, 15 Apr 2023 19:23:10 +0000 (22:23 +0300)]
examples : update huggingface links
Georgi Gerganov [Sat, 15 Apr 2023 16:50:54 +0000 (19:50 +0300)]
ggml : sync llama.cpp
Georgi Gerganov [Sat, 15 Apr 2023 11:25:34 +0000 (14:25 +0300)]
ggml : add ggml_type_name()
Georgi Gerganov [Sat, 15 Apr 2023 11:23:26 +0000 (14:23 +0300)]
ggml : use posix_memalign on non-Windows env
Georgi Gerganov [Fri, 14 Apr 2023 14:45:54 +0000 (17:45 +0300)]
ggml : add unary and binary map operations
Georgi Gerganov [Fri, 14 Apr 2023 10:32:27 +0000 (13:32 +0300)]
ggml : avoid powf() calls in ggml_rope()
Georgi Gerganov [Fri, 14 Apr 2023 10:32:12 +0000 (13:32 +0300)]
ggml : fix ARM NEON dot product types
Georgi Gerganov [Thu, 13 Apr 2023 21:02:31 +0000 (00:02 +0300)]
mnist : update README
Georgi Gerganov [Thu, 13 Apr 2023 21:00:42 +0000 (00:00 +0300)]
mnist : minor fixes and adjustments
Ray Cromwell [Thu, 13 Apr 2023 20:49:45 +0000 (13:49 -0700)]
examples : MNIST example for ggml (#84)
Georgi Gerganov [Thu, 13 Apr 2023 15:37:19 +0000 (18:37 +0300)]
ggml : sync latest changes from llama.cpp
Jakob Frick [Thu, 13 Apr 2023 12:41:53 +0000 (14:41 +0200)]
gpt-2 : typo fix for the Cerebras instructions (#57)
Georgi Gerganov [Thu, 13 Apr 2023 12:40:33 +0000 (15:40 +0300)]
ggml : add GGML_DEFAULT_N_THREADS
LostRuins [Thu, 13 Apr 2023 12:27:56 +0000 (20:27 +0800)]
gpt : fix pytorch converter text encodings (#78)
* Fixed quantization for f16 models not working - this was because the f16 tables were not initialized, so the f16 to f32 conversion was failing.
* In some situations, the script fails with the error: UnicodeDecodeError: 'charmap' codec can't decode byte (byte) in position (number) : character maps to <undefined>
This is probably because the default encodings are incorrect.
Explicitly specifying them as UTF-8 seems to resolve the issue and allow for correct conversion.
---------
Co-authored-by: Georgi Gerganov <redacted>
Georgi Gerganov [Wed, 12 Apr 2023 15:59:41 +0000 (18:59 +0300)]
readme : update roadmap
Georgi Gerganov [Tue, 11 Apr 2023 18:33:17 +0000 (21:33 +0300)]
gpt-j : update inference to match latest llama.cpp insights
- Use F16 KV cache
- Store transposed V in the cache
- Avoid unnecessary Q copy
Georgi Gerganov [Mon, 10 Apr 2023 20:21:11 +0000 (23:21 +0300)]
ggml : fix <windows.h> include
Georgi Gerganov [Mon, 10 Apr 2023 20:19:15 +0000 (23:19 +0300)]
ggml : fix WASM build
Georgi Gerganov [Mon, 10 Apr 2023 19:39:24 +0000 (22:39 +0300)]
whisper : sync with whisper.cpp
Georgi Gerganov [Mon, 10 Apr 2023 19:39:07 +0000 (22:39 +0300)]
ggml : optimize ggml_cpy() for contiguous dst
Georgi Gerganov [Mon, 10 Apr 2023 16:36:06 +0000 (19:36 +0300)]
ggml : sync with llama.cpp
- int64_t number of elements
- remove mlock
- expose quantization functions
- expose ggml_object
- add ggml_view_3d()
- multi-thread ggml_rope()
- fix ggml_cpy()
- add ggml_init_params.no_alloc
- fix ggml_mul_mat() backward
LostRuins [Mon, 10 Apr 2023 07:47:47 +0000 (15:47 +0800)]
gpt : initialize f16 tables during quantization (#77)
Georgi Gerganov [Fri, 7 Apr 2023 18:21:33 +0000 (21:21 +0300)]
readme : update Roadmap (add rwkv.cpp)
Georgi Gerganov [Thu, 30 Mar 2023 21:37:37 +0000 (00:37 +0300)]
gpt-2 : minor update readme
Georgi Gerganov [Thu, 30 Mar 2023 21:34:14 +0000 (00:34 +0300)]
gpt-2 : fix quantize tool to quantize the "lm_head" tensor
Georgi Gerganov [Thu, 30 Mar 2023 20:39:15 +0000 (23:39 +0300)]
gpt-2 : add Cerebras-GPT example
Supreet Sethi [Thu, 30 Mar 2023 17:25:29 +0000 (01:25 +0800)]
ggml : fix NEON sign types (#51)
Cordeiro [Wed, 29 Mar 2023 20:39:27 +0000 (15:39 -0500)]
gpt-2 : convert h5 to ggml (#35)
* Script to convert h5 to ggml adapted from gpt-j example
* Fix map tensors
* optimize
* rename headers to keep compatibility
* revert gpt-2/main.cpp
---------
Co-authored-by: Alan <redacted>
Co-authored-by: Alan <redacted>
Co-authored-by: ocordeiro <redacted>
Georgi Gerganov [Wed, 29 Mar 2023 19:23:14 +0000 (22:23 +0300)]
readme : update Roadmap
Georgi Gerganov [Wed, 29 Mar 2023 19:21:36 +0000 (22:21 +0300)]
ggml : 4-bit Integer quantisation + many llama.cpp improvements (#27)
* gq : attempt at n-bit quantization
* gq : add amax based method 3
* gq : progress on method 2
* gq : method 4 (AVX2)
* gq : method 4 (ARM)
* gq : method 4 (AVX2 attempt) + method 5 (no min)
* gq : method 5 (ARM)
* gpt-2 : model conversion for Q4_0 quantization
* ggml : Q4_0 quantization support (ggml_get_rows())
* gpt-2 : loading Q4_0 quantized model
* ggml : q4_0 quantization support
* ggml : q4_1 quantization support (seems to work for bigger models)
* gpt-2 : add gpt-2-quantize tool for quantizing f32 GPT-2 models
* ggml : 4-bit quantization works (only scalar for now)
* gq : add method 6 (ARM)
* ggml : vectorized mad q4_0 (ARM)
* ggml : vectorized quantize_row_q4_0 (ARM)
* ggml : simplify mad q4_0 (ARM)
* ggml : minor indentations
* gpt-j : support for 4-bit quantized model inference
* ggml : GGML_ASSERT() instead of assert() where appropriate
* gpt : avoid ggml_transpose on model tensors (new models!)
* gpt-2 : minor
* gpt-j : fix conversion for FP16 models (such as GPT-JT-6B)
* ggml : add ggml_compute_forward_rope_f16()
* gpt : fix memory usage computation
* ggml : fix ggml_is_contiguous() to take into account blck size
* whisper : add whisper-quantize tool
* whisper : add support for quantized models
* whisper : mem usage based on model format type
* gpt : seems not worth to use FP16 for KV cache
* gpt : support quantisation of f16 model files
* ggml : fixes for rpi4
* whisper : add Q4_1 model sizes
* ggml : add WASM SIMD for Q4_0
* utils : print quantization histograms
* ggml : sync all changes from llama.cpp and whisper.cpp
* ggml : finalize the Q4_1 quantization for ARM_NEON
MaiHD [Sat, 25 Mar 2023 20:43:24 +0000 (03:43 +0700)]
ggml : make it work on Windows (#46)
Georgi Gerganov [Sat, 25 Mar 2023 14:32:48 +0000 (16:32 +0200)]
tests : add test-blas0
Georgi Gerganov [Wed, 22 Mar 2023 19:52:32 +0000 (21:52 +0200)]
Fix CMake indentation
katsu560 [Wed, 22 Mar 2023 19:51:47 +0000 (04:51 +0900)]
add OpenBLAS detection and modify tests codes (#40)
* fix indents and commands for Haiku, and add OpenBLAS detection in src/CMakeLists.txt
* add system detection and add OpenBLAS detection
* change loop number by environment variable GGML_NLOOP or command line option
* change fmadd codes on no FMA support system
* change n_threads by environment variable GGML_NTHREADS or command line option
---------
Co-authored-by: Georgi Gerganov <redacted>
Alex von Gluck IV [Wed, 22 Mar 2023 19:43:58 +0000 (14:43 -0500)]
CMakeLists: Fix Haiku CPU detection (#39)
hidenorly [Wed, 22 Mar 2023 19:43:22 +0000 (04:43 +0900)]
Add pipe input for prompt on gpt examples (#38)
Enable prompt input through a pipe, instead of using the -p option.
This makes it easier to provide longer, multi-line prompts.
Test:
$ echo "This is an example" > prompt.txt
$ cat prompt.txt | ./bin/gpt-j -m models/gpt-j-6B/ggml-model.bin
$ cat prompt.txt | ./bin/gpt-2 -m models/gpt-2-117M/ggml-model.bin
Note that the -p option and the case with no -p specified are kept:
$ ./bin/gpt-j -m models/gpt-j-6B/ggml-model.bin -p "This is an example"
$ ./bin/gpt-j -m models/gpt-j-6B/ggml-model.bin
$ ./bin/gpt-2 -m models/gpt-2-117M/ggml-model.bin -p "This is an example"
$ ./bin/gpt-2 -m models/gpt-2-117M/ggml-model.bin
katsu560 [Mon, 6 Mar 2023 17:52:16 +0000 (02:52 +0900)]
cmake : update CMakeLists.txt to add correct flags (#26)
* modify src/CMakeLists.txt from whisper.cpp
* cmake : remove OpenBLAS stuff
---------
Co-authored-by: Georgi Gerganov <redacted>
Georgi Gerganov [Mon, 6 Mar 2023 05:40:55 +0000 (07:40 +0200)]
readme : update Roadmap
Georgi Gerganov [Sun, 5 Mar 2023 16:02:27 +0000 (18:02 +0200)]
readme : add Roadmap section
Georgi Gerganov [Sun, 26 Feb 2023 19:10:50 +0000 (21:10 +0200)]
sync : latest whisper.cpp
Georgi Gerganov [Tue, 21 Feb 2023 20:16:56 +0000 (22:16 +0200)]
tests : fix cblas_sgemm call
Georgi Gerganov [Sat, 18 Feb 2023 14:05:31 +0000 (16:05 +0200)]
tests : add SVD experiments
Georgi Gerganov [Wed, 15 Feb 2023 18:59:36 +0000 (20:59 +0200)]
sync : latest whisper.cpp (scratch buffers in ggml)
Georgi Gerganov [Fri, 20 Jan 2023 06:45:45 +0000 (08:45 +0200)]
Update README.md
Takuya Takeuchi [Sun, 15 Jan 2023 14:30:13 +0000 (23:30 +0900)]
cmake : configure CMAKE_C_FLAGS and target_link_libraries for MSVC (#15)
Georgi Gerganov [Sun, 15 Jan 2023 13:53:08 +0000 (15:53 +0200)]
gpt : fix sampling to use the temperature (close #16)
Georgi Gerganov [Sun, 15 Jan 2023 13:09:36 +0000 (15:09 +0200)]
ggml : sync latest whisper.cpp
Georgi Gerganov [Sun, 8 Jan 2023 18:28:38 +0000 (20:28 +0200)]
gpt-2 : fix broken prompt due to recent experiments
No idea why I committed that!?
Georgi Gerganov [Sun, 8 Jan 2023 18:23:01 +0000 (20:23 +0200)]
ggml : sync latest whisper.cpp
Georgi Gerganov [Sat, 7 Jan 2023 19:05:33 +0000 (21:05 +0200)]
cmake : disable warnings about unused functions
Georgi Gerganov [Sat, 7 Jan 2023 19:04:24 +0000 (21:04 +0200)]
ggml : bugfix in new soft max computation
Georgi Gerganov [Sat, 7 Jan 2023 18:00:25 +0000 (20:00 +0200)]
tests : change test2 eps
Georgi Gerganov [Sat, 7 Jan 2023 17:53:05 +0000 (19:53 +0200)]
ggml : sync with latest whisper.cpp
Georgi Gerganov [Sat, 7 Jan 2023 10:17:34 +0000 (12:17 +0200)]
tests : some more quantization experiments
Georgi Gerganov [Sat, 7 Jan 2023 07:43:02 +0000 (09:43 +0200)]
sync : forgot to sync ggml.h
Georgi Gerganov [Sat, 7 Jan 2023 07:39:12 +0000 (09:39 +0200)]
sync : latest changes from whisper.cpp
Georgi Gerganov [Sat, 7 Jan 2023 07:36:32 +0000 (09:36 +0200)]
tests : wip quantized matrix multiplication method 2
Georgi Gerganov [Sat, 7 Jan 2023 07:31:42 +0000 (09:31 +0200)]
tests : minor fixes for x86
Georgi Gerganov [Thu, 5 Jan 2023 19:05:41 +0000 (21:05 +0200)]
tests : experiments with n-bit quantized matrix multiplication