git.djapps.eu Git - pkg/ggml/sources/whisper.cpp/log
Georgi Gerganov [Wed, 6 Nov 2024 17:53:51 +0000 (19:53 +0200)]
metal : add BF16 support (llama/8439)
* ggml : add initial BF16 support
ggml-ci
* metal : add mul_mat_id BF16 support
ggml-ci
* metal : check for bfloat support on the Metal device
ggml-ci
* metal : better var names [no ci]
* metal : do not build bfloat kernels when not supported
ggml-ci
* metal : try to fix BF16 support check
ggml-ci
* metal : this should correctly check bfloat support
Diego Devesa [Wed, 6 Nov 2024 11:10:07 +0000 (12:10 +0100)]
metal : fix from ptr buffer name (llama/10189)
Georgi Gerganov [Wed, 6 Nov 2024 09:20:10 +0000 (11:20 +0200)]
ggml : adjust is_first_call init value (llama/10193)
ggml-ci
Georgi Gerganov [Wed, 6 Nov 2024 08:24:23 +0000 (10:24 +0200)]
metal : add quantized FA support (llama/10149)
* metal : add quantized FA (vec) support
ggml-ci
* metal : add quantized FA (non-vec) support
* metal : fix support check
ggml-ci
* metal : clean-up
* metal : clean-up (cont)
* metal : fix shared memory calc + reduce smem + comments
* metal : float-correctness
* metal : minor [no ci]
Diego Devesa [Mon, 4 Nov 2024 22:17:01 +0000 (23:17 +0100)]
ggml : fix arch check in bf16_to_fp32 (llama/10164)
Eve [Mon, 4 Nov 2024 22:06:31 +0000 (22:06 +0000)]
Q6_K AVX improvements (llama/10118)
* q6_k instruction reordering attempt
* better subtract method
* should be theoretically faster
small improvement with shuffle lut, likely because all loads are already done at that stage
* optimize bit fiddling
* handle -32 offset separately. bsums exists for a reason!
* use shift
* Update ggml-quants.c
* have to update ci macos version to 13 as 12 doesn't work now. 13 is still x86
Diego Devesa [Mon, 4 Nov 2024 19:06:58 +0000 (20:06 +0100)]
ggml : fix gelu tables initialization (llama/10172)
Diego Devesa [Mon, 4 Nov 2024 16:34:08 +0000 (17:34 +0100)]
ggml : fix q4xx mat mul, increase ggml_aligned_malloc alignment (llama/10167)
snadampal [Mon, 4 Nov 2024 15:08:33 +0000 (09:08 -0600)]
fix build break on arm64 linux (llama/10166)
This fixes the build break from the recent changes
to move the CPU backend to separate files
https://github.com/ggerganov/llama.cpp/pull/10144
Diego Devesa [Mon, 4 Nov 2024 12:10:23 +0000 (13:10 +0100)]
cuda : clear error after changing peer access (llama/10153)
Georgi Gerganov [Mon, 4 Nov 2024 11:49:34 +0000 (13:49 +0200)]
metal : simplify f16 and f32 dequant kernels (llama/0)
Georgi Gerganov [Mon, 4 Nov 2024 11:43:32 +0000 (13:43 +0200)]
metal : move dequantize templates to beginning of MSL source (llama/0)
leo-pony [Mon, 4 Nov 2024 11:08:22 +0000 (19:08 +0800)]
CANN: adjust backend registry refactor. (llama/10158)
remove buffer->iface.get_name used in CANN, as it was removed in the backend registry refactor PR.
Diego Devesa [Sun, 3 Nov 2024 18:34:08 +0000 (19:34 +0100)]
ggml : move CPU backend to a separate file (llama/10144)
Georgi Gerganov [Sun, 3 Nov 2024 13:18:40 +0000 (15:18 +0200)]
metal : minor fixup in FA kernel (llama/10143)
* metal : minor fixup in FA kernel
ggml-ci
* metal : use the unrolled loop variable
* metal : remove unused var
Diego Devesa [Fri, 1 Nov 2024 22:50:59 +0000 (23:50 +0100)]
llama : add simple-chat example (llama/10124)
* llama : add simple-chat example
---------
Co-authored-by: Xuan Son Nguyen <redacted>
Diego Devesa [Fri, 1 Nov 2024 22:48:26 +0000 (23:48 +0100)]
llama : use smart pointers for ggml resources (llama/10117)
Shupei Fan [Fri, 1 Nov 2024 18:33:14 +0000 (02:33 +0800)]
vulkan : improve ggml_vk_create_buffer error handling (llama/9898)
Georgi Gerganov [Fri, 1 Nov 2024 10:58:45 +0000 (12:58 +0200)]
ggml : remove ggml_scratch (llama/10121)
ggml-ci
Zhenwei Jin [Fri, 1 Nov 2024 03:09:59 +0000 (11:09 +0800)]
build: fix build error in Windows env with OneAPI setup (llama/10107)
Diego Devesa [Thu, 31 Oct 2024 21:54:23 +0000 (22:54 +0100)]
llama : fix buffer checks for mamba and rwkv (llama/10111)
* llama : fix buffer checks for mamba and rwkv
* llama : fix missing worst case flag during reserve
* cuda : fix supports_op for norm
* disable sched SET_CAUSE
Diego Devesa [Thu, 31 Oct 2024 10:40:59 +0000 (11:40 +0100)]
ggml : check tensor name lengths in gguf files (llama/10100)
Sergio López [Thu, 31 Oct 2024 09:09:52 +0000 (10:09 +0100)]
kompute: add mul_mat_q4_k shader (llama/10097)
This is a more or less direct translation from the Metal implementation
to GLSL.
Signed-off-by: Sergio Lopez <redacted>
Sergio López [Wed, 30 Oct 2024 16:01:52 +0000 (17:01 +0100)]
kompute: add backend registry / device interfaces (llama/10045)
Get in line with the other backends by supporting the newer
backend/device registry interfaces.
Signed-off-by: Sergio Lopez <redacted>
Diego Devesa [Wed, 30 Oct 2024 13:51:21 +0000 (14:51 +0100)]
ggml : fix memory leaks when loading invalid gguf files (llama/10094)
* ggml : fix gguf string leak when reading kv pairs fails
* ggml : avoid crashing with GGML_ABORT when the KV has an invalid type
* ggml : avoid crashing on failed memory allocations when loading a gguf file
xctan [Wed, 30 Oct 2024 07:00:40 +0000 (15:00 +0800)]
ggml : add Q4_0_8_8 RISC-V GEMV and GEMM kernels (llama/10029)
* ggml : RISC-V vector gemv for q4_0_8x8
* ggml : Added WIP rvv q4_0_8x8 gemm
* ggml : Added initial implementation of rvv gemm
* ggml : optimize gemm to avoid register spillover
* ggml : Fix GCC rvv load alignment issue
* ggml : Format gemm rvv code
* ggml : Fix a typo in RVV q4_0_8_8 GEMM
Diego Devesa [Wed, 30 Oct 2024 01:01:23 +0000 (02:01 +0100)]
llama : refactor model loader with backend registry (llama/10026)
Changyeon Kim [Tue, 29 Oct 2024 08:52:56 +0000 (17:52 +0900)]
ggml: Add POOL2D OP for GPU acceleration to the Vulkan backend in the MobileVLM model. (llama/9763)
* ggml: Add POOL2D op for GPU acceleration to the Vulkan backend.
- The MobileVLM model now supports inference acceleration through GPU by utilizing the Vulkan backend.
- A GGML_OP_POOL_2D shader has been added. (Pooling)
- The encoding performance of the CLIP model improved from 2.8s on the CPU to 0.7s on the GPU.
Signed-off-by: Changyeon Kim <redacted>
* [fix] Correct the incorrect order of the parameters.
fix casting to int.
Signed-off-by: Changyeon Kim <redacted>
---------
Signed-off-by: Changyeon Kim <redacted>
R0CKSTAR [Mon, 28 Oct 2024 09:02:48 +0000 (17:02 +0800)]
musa: workaround for Guilty Lockup in cleaning src0 (llama/10042)
Signed-off-by: Xiaodong Ye <redacted>
Yuri Khrustalev [Sat, 2 Nov 2024 09:09:12 +0000 (05:09 -0400)]
cmake : make it possible to link ggml as an external lib (ggml/1003)
Plamen Minev [Fri, 1 Nov 2024 14:55:10 +0000 (16:55 +0200)]
metal : fix minor string leaks (ggml/1004)
Georgi Gerganov [Fri, 15 Nov 2024 06:34:49 +0000 (08:34 +0200)]
scripts : update sync
Raiya Araki [Fri, 15 Nov 2024 09:07:17 +0000 (18:07 +0900)]
ci : fix building workflow for linux/arm64 container (#2555)
KITAITI Makoto [Wed, 13 Nov 2024 19:52:56 +0000 (04:52 +0900)]
ruby : extend API (#2551)
* Handle objs in Ruby code
* Add task to make Makefile
* Share common constants in test suites
* Add model-related APIs
* Add Whisper::Model class
* Add tests for Whisper::Model
* Add missing LDFLAG -lstdc++
* Add tests for Whisper.log_set
* Add Whisper.set_log
* Define log level
* Add document on logging
* Add license section to README
* Add document on Whisper::Model
* Fix examples in README
* Add test for Model with GC
* Make dependency on Makefile more accurate
* Fix bug about Whisper::Model and GC
Jhen-Jie Hong [Wed, 13 Nov 2024 19:51:34 +0000 (03:51 +0800)]
whisper.swiftui : add model download list & bench methods (#2546)
* swift : fix resources & exclude build
* whisper : impl whisper_timings struct & api
* whisper.swiftui : model list & bench methods
* whisper : return ptr for whisper_get_timings
* revert unnecessary change
* whisper : avoid designated initializer
* whisper.swiftui: code style changes
* whisper.swiftui : get device name / os from UIDevice
* whisper.swiftui : fix UIDevice usage
* whisper.swiftui : add memcpy and ggml_mul_mat (commented)
Wilson Silva [Wed, 13 Nov 2024 19:47:42 +0000 (19:47 +0000)]
ruby : fix the instructions (#2548)
#prompt doesn't exist but #initial_prompt does
thewh1teagle [Wed, 13 Nov 2024 19:47:15 +0000 (21:47 +0200)]
ggml : vulkan logs (#2547)
Stefan Sydow [Wed, 13 Nov 2024 19:41:52 +0000 (20:41 +0100)]
examples : fix ffmpeg v5 build (#2543)
remove call to 'av_register_all()', which no longer exists in ffmpeg v5.
Vin Misra [Wed, 6 Nov 2024 21:02:11 +0000 (13:02 -0800)]
whisper : fix extra memory usage (#2534)
* passing samples_padded by ref to the threads.
* passing samples_padded by ref to the threads.
---------
Co-authored-by: Vinith Misra <redacted>
Georgi Gerganov [Thu, 31 Oct 2024 20:53:46 +0000 (22:53 +0200)]
whisper : backend registry init before model load
Georgi Gerganov [Thu, 31 Oct 2024 20:29:22 +0000 (22:29 +0200)]
talk-llama : sync llama.cpp
Georgi Gerganov [Thu, 31 Oct 2024 20:26:28 +0000 (22:26 +0200)]
sync : ggml
Ma Mingfei [Sat, 26 Oct 2024 06:43:40 +0000 (09:43 +0300)]
ggml : add AMX backend (llama/8998)
Georgi Gerganov [Fri, 25 Oct 2024 19:26:15 +0000 (22:26 +0300)]
metal : support permuted matrix multiplications (llama/10033)
* metal : support permuted matrix multiplications
ggml-ci
* cont : use nb01 directly for row steps
ggml-ci
* cont : add comments [no ci]
* metal : minor refactor
* metal : minor
Johannes Gäßler [Thu, 24 Oct 2024 12:40:23 +0000 (14:40 +0200)]
CUDA: fix insufficient buffer clearing for MMQ (llama/10032)
Johannes Gäßler [Thu, 24 Oct 2024 09:09:36 +0000 (11:09 +0200)]
CUDA: fix MMQ for non-contiguous src0, add tests (llama/10021)
* CUDA: fix MMQ for non-contiguous src0, add tests
* revise test code
bssrdf [Wed, 23 Oct 2024 18:34:00 +0000 (14:34 -0400)]
increase cuda_cpy block size (ggml/996)
Co-authored-by: bssrdf <redacted>
Jun Hee Yoo [Wed, 23 Oct 2024 10:33:45 +0000 (19:33 +0900)]
metal : add POOL2D and fix IM2COL (llama/9943)
* add pool_2d
Signed-off-by: Junhee Yoo <redacted>
* fix im2col and add unittest for N>=1024
Signed-off-by: Junhee Yoo <redacted>
* add tests for N % 1024 != 0
Signed-off-by: Junhee Yoo <redacted>
* remove trailing whitespaces
Signed-off-by: Junhee Yoo <redacted>
* apply suggestions
Signed-off-by: Junhee Yoo <redacted>
* apply more optimization
- original IM2COL kernel + _ext with MIN()
Signed-off-by: Junhee Yoo <redacted>
* apply review: change kernel name of pool_2d
Signed-off-by: Junhee Yoo <redacted>
* apply review
Signed-off-by: Junhee Yoo <redacted>
* fix more formatting and enhance readability
Signed-off-by: Junhee Yoo <redacted>
---------
Signed-off-by: Junhee Yoo <redacted>
leo-pony [Tue, 22 Oct 2024 08:16:01 +0000 (16:16 +0800)]
Adapt to dynamically loadable backends mechanism (llama/9970)
* [CANN] Adapt to dynamically loadable backends mechanism
* Fix bug: inference results were garbled in debug builds for LLM models whose type is in the Q4_0 class
* Handle the review comments of this pull request
Georgi Gerganov [Mon, 21 Oct 2024 13:20:46 +0000 (16:20 +0300)]
ggml : add asserts for type conversion in fattn kernels (llama/9971)
ggml-ci
Radoslav Gerganov [Mon, 21 Oct 2024 10:35:40 +0000 (13:35 +0300)]
rpc : pack only RPC structs (llama/9959)
Neo Zhang Jianyu [Mon, 21 Oct 2024 06:26:09 +0000 (14:26 +0800)]
fix mul_mat_vec_q and *_vec_q error (llama/9939)
Co-authored-by: arthw <redacted>
Radoslav Gerganov [Fri, 18 Oct 2024 11:33:58 +0000 (14:33 +0300)]
rpc : backend refactoring (llama/9912)
* rpc : refactor backend
Use structs for RPC request/response messages
* rpc : refactor server
Ouadie EL FAROUKI [Fri, 18 Oct 2024 05:46:16 +0000 (06:46 +0100)]
Add SYCL Backend registry, device and Event Interfaces (llama/9705)
* implemented missing SYCL event APIs
* sycl : Added device and backend reg interfaces
* Restructured ggml-sycl.cpp
Ma Mingfei [Fri, 18 Oct 2024 05:34:36 +0000 (13:34 +0800)]
add amx kernel for gemm (llama/8998)
add intel amx isa detection
add vnni kernel for gemv cases
add vnni and amx kernel support for block_q8_0
code cleanup
fix packing B issue
enable openmp
fine tune amx kernel
switch to aten parallel pattern
add error message for nested parallelism
code cleanup
add f16 support in ggml-amx
add amx kernels for QK_K quant formats: Q4_K, Q5_K, Q6_K and IQ4_XS
update CMakeList
update README
fix some compilation warning
fix compiler warning when amx is not enabled
minor change
ggml-ci
move ggml_amx_init from ggml.c to ggml-amx/mmq.cpp
ggml-ci
update CMakeLists with -mamx-tile, -mamx-int8 and -mamx-bf16
ggml-ci
add amx as a ggml backend
update header file, the old path for immintrin.h has changed to ggml-cpu-impl.h
minor change
update CMakeLists.txt
minor change
apply weight prepacking in set_tensor method in ggml-backend
fix compile error
ggml-ci
minor change
ggml-ci
update CMakeLists.txt
ggml-ci
add march dependency
minor change
ggml-ci
change ggml_backend_buffer_is_host to return false for amx backend
ggml-ci
fix supports_op
use device reg for AMX backend
ggml-ci
minor change
ggml-ci
minor change
fix rebase
set .buffer_from_host_ptr to be false for AMX backend
Diego Devesa [Thu, 17 Oct 2024 00:46:58 +0000 (02:46 +0200)]
vulkan : add backend registry / device interfaces (llama/9721)
* vulkan : add backend registry / device interfaces
* llama : print devices used on model load
Gilad S [Wed, 16 Oct 2024 23:34:22 +0000 (02:34 +0300)]
fix: allocating CPU buffer with size `0` (llama/9917)
Gilad S [Wed, 16 Oct 2024 22:36:51 +0000 (01:36 +0300)]
fix: use `vm_allocate` to allocate CPU backend buffer on macOS (llama/9875)
* fix: use `vm_allocate` to allocate CPU backend buffer on macOS
* fix: switch to `posix_memalign` to keep existing `free()` usages work
* feat: move `GGML_ALIGNED_MALLOC` to `ggml-backend-impl.h`, add support for `vm_allocate` on macOS
* style: formatting
* fix: move const outside of `#ifndef`
* style: formatting
* fix: unused var
* fix: transform `GGML_ALIGNED_MALLOC` and `GGML_ALIGNED_FREE` into functions and add them to `ggml-impl.h`
* fix: unused var
* fix: page align to `GGUF_DEFAULT_ALIGNMENT`
* fix: page align to `TENSOR_ALIGNMENT`
* fix: convert `TENSOR_ALIGNMENT` to a macro
* fix: increase page size to `32` on iOS
* fix: iOS page size
* fix: `hbw_posix_memalign` alignment
Johannes Gäßler [Fri, 18 Oct 2024 07:24:44 +0000 (09:24 +0200)]
CUDA: fix 1D im2col, add tests (ggml/993)
leo-pony [Wed, 16 Oct 2024 00:51:46 +0000 (08:51 +0800)]
Fix cann compilation error (llama/9891)
Fix CANN compilation error after merging llama.cpp support for dynamically loadable backends.
agray3 [Mon, 14 Oct 2024 00:49:08 +0000 (01:49 +0100)]
Vectorize load instructions in dmmv f16 CUDA kernel (llama/9816)
* Vectorize load instructions in dmmv f16 CUDA kernel
Replaces scalar with vector load instructions, which substantially
improves performance on NVIDIA HBM GPUs, e.g. gives a 1.27X overall
speedup for Meta-Llama-3-8B-Instruct-F16 BS1 inference evaluation on
H100 SXM 80GB HBM3. On GDDR GPUs, there is a slight (1.01X) speedup.
* addressed comment
* Update ggml/src/ggml-cuda/dmmv.cu
Co-authored-by: Johannes Gäßler <redacted>
---------
Co-authored-by: Johannes Gäßler <redacted>
Diego Devesa [Fri, 11 Oct 2024 13:34:45 +0000 (15:34 +0200)]
ggml : move more prints to the ggml log system (llama/9839)
* ggml : move more prints to the ggml log system
* show BLAS OpenMP warnings in all builds using debug print
Diego Devesa [Thu, 10 Oct 2024 18:14:55 +0000 (20:14 +0200)]
rpc : add backend registry / device interfaces (llama/9812)
* rpc : add backend registry / device interfaces
* llama : add llama_supports_rpc API
* ggml_backend_rpc_start_rpc_server -> ggml_backend_rpc_start_server
R0CKSTAR [Thu, 10 Oct 2024 18:10:37 +0000 (02:10 +0800)]
musa: add docker image support (llama/9685)
* mtgpu: add docker image support
Signed-off-by: Xiaodong Ye <redacted>
* mtgpu: enable docker workflow
Signed-off-by: Xiaodong Ye <redacted>
---------
Signed-off-by: Xiaodong Ye <redacted>
Diego Devesa [Tue, 8 Oct 2024 12:21:43 +0000 (14:21 +0200)]
ggml : fix BLAS with unsupported types (llama/9775)
* ggml : do not use BLAS with types without to_float
* ggml : return pointer from ggml_internal_get_type_traits to avoid unnecessary copies
* ggml : rename ggml_internal_get_type_traits -> ggml_get_type_traits
it's not really internal if everybody uses it
Diego Devesa [Mon, 7 Oct 2024 19:55:08 +0000 (21:55 +0200)]
ggml : add backend registry / device interfaces to BLAS backend (llama/9752)
* ggml : add backend registry / device interfaces to BLAS backend
* fix mmap usage when using host buffers
Andrew Minh Nguyen [Mon, 7 Oct 2024 16:37:31 +0000 (09:37 -0700)]
Update building for Android (llama/9672)
* docs : clarify building Android on Termux
* docs : update building Android on Termux
* docs : add cross-compiling for Android
* cmake : link dl explicitly for Android
Georgi Gerganov [Mon, 7 Oct 2024 15:27:51 +0000 (18:27 +0300)]
ggml : add metal backend registry / device (llama/9713)
* ggml : add metal backend registry / device
ggml-ci
* metal : fix names [no ci]
* metal : global registry and device instances
ggml-ci
* cont : alternative initialization of global objects
ggml-ci
* llama : adapt to backend changes
ggml-ci
* fixes
* metal : fix indent
* metal : fix build when MTLGPUFamilyApple3 is not available
ggml-ci
* fix merge
* metal : avoid unnecessary singleton accesses
ggml-ci
* metal : minor fix [no ci]
* metal : g_state -> g_ggml_ctx_dev_main [no ci]
* metal : avoid reference of device context in the backend context
ggml-ci
* metal : minor [no ci]
* metal : fix maxTransferRate check
* metal : remove transfer rate stuff
---------
Co-authored-by: slaren <redacted>
Paul Tsochantaris [Mon, 7 Oct 2024 12:26:31 +0000 (13:26 +0100)]
metal : single allocation of encode_async block (llama/9747)
* Single allocation of encode_async block with non-ARC capture in ggml-metal.m
* Moving Block_release to the deallocation code
* Release encode block when re-setting encoding buffer count if needed
* Update ggml/src/ggml-metal.m
---------
Co-authored-by: Georgi Gerganov <redacted>
Daniel Bevenius [Wed, 9 Oct 2024 14:40:35 +0000 (16:40 +0200)]
ggml-alloc : remove buffer_id from leaf_alloc (ggml/987)
This commit removes the buffer_id field from the leaf_alloc struct.
The motivation for is that this field is only written to and never
read/used as far as I can tell. Each tensor_alloc has a buffer_id field
and this is what caused me to look into this more closely, to
understand what the buffer_id in leaf_alloc was used for.
Georgi Gerganov [Thu, 31 Oct 2024 20:13:24 +0000 (22:13 +0200)]
scripts : sync amx
Georgi Gerganov [Thu, 31 Oct 2024 20:00:09 +0000 (22:00 +0200)]
ggml : alloc ggml_contexts on the heap (#2525)
* whisper : reduce ggml_context usage
* ggml : allocate contexts on the heap (v2)
* ggml : aligned malloc -> malloc
Georgi Gerganov [Wed, 30 Oct 2024 10:58:26 +0000 (12:58 +0200)]
ci : fix openblas build (#2511)
* ci : fix openblas build
* cont : would this work?
* ci : I'm sorry, windows
* cont : disabled wrong build
* ci : fix openblas build with pkgconfiglite (#2517)
- choco install pkgconfiglite (vcpkg-pkgconf doesn't contain pkg-config executable?)
- vcpkg install openblas (otherwise it is not detected now)
---------
Co-authored-by: Tamotsu Takahashi <redacted>
Georgi Gerganov [Tue, 29 Oct 2024 17:37:24 +0000 (19:37 +0200)]
scripts : add turbo-q8_0 to the benchmark
Georgi Gerganov [Tue, 29 Oct 2024 17:27:52 +0000 (19:27 +0200)]
whisper : minor compile warning
jettoblack [Tue, 29 Oct 2024 06:47:21 +0000 (02:47 -0400)]
whisper : move new-segment callback after DTW step (#2515)
KITAITI Makoto [Tue, 29 Oct 2024 06:45:37 +0000 (15:45 +0900)]
ruby : fix installation test (#2519)
KITAITI Makoto [Mon, 28 Oct 2024 17:23:23 +0000 (02:23 +0900)]
ruby : add more APIs (#2518)
* Add test for built package existence
* Add more tests for Whisper::Params
* Add more Whisper::Params attributes
* Add tests for callbacks
* Add progress and abort callback features
* [skip ci] Add prompt usage in README
* Change prompt text in example
KITAITI Makoto [Mon, 28 Oct 2024 13:43:27 +0000 (22:43 +0900)]
ruby : support new-segment callback (#2506)
* Add Params#new_segment_callback= method
* Add tests for Params#new_segment_callback=
* Group tests for #transcribe
* Don't use static for thread-safety
* Set new_segment_callback only when necessary
* Remove redundant check
* [skip ci] Add Ruby version README
* Revert "Group tests for #transcribe"
This reverts commit 71b65b00ccf1816c9ea8a247fb30f71bc09707d3.
* Revert "Add tests for Params#new_segment_callback="
This reverts commit 81e6df3bab7662da5379db51f28a989db7408c02.
* Add test for Context#full_n_segments
* Add Context#full_n_segments
* Add tests for lang API
* Add lang API
* Add tests for Context#full_lang_id API
* Add Context#full_lang_id
* Add abnormal test cases for lang
* Raise appropriate errors from lang APIs
* Add tests for Context#full_get_segment_t{0,1} API
* Add Context#full_get_segment_t{0,1}
* Add tests for Context#full_get_segment_speaker_turn_next API
* Add Context#full_get_segment_speaker_turn_next
* Add tests for Context#full_get_segment_text
* Add Context#full_get_segment_text
* Add tests for Params#new_segment_callback=
* Run new segment callback
* Split tests to multiple files
* Use container struct for new segment callback
* Add tests for Params#new_segment_callback_user_data=
* Add Whisper::Params#new_segment_callback_user_data=
* Add GC-related test for new segment callback
* Protect new segment callback related structs from GC
* Add meaningful test for build
* Rename: new_segment_callback_user_data -> new_segment_callback_container
* Add tests for Whisper::Segment
* Add Whisper::Segment and Whisper::Context#each_segment
* Extract c_ruby_whisper_callback_container_allocate()
* Add test for Whisper::Params#on_new_segment
* Add Whisper::Params#on_new_segment
* Assign symbol IDs to variables
* Make extsources.yaml simpler
* Update README
* Add document comments
* Add test for calling Whisper::Params#on_new_segment multiple times
* Add file dependencies to GitHub actions config and .gitignore
* Add more files to ext/.gitignore
KITAITI Makoto [Mon, 28 Oct 2024 11:08:09 +0000 (20:08 +0900)]
ruby : add Metal support (#2516)
Josscii [Wed, 23 Oct 2024 12:14:03 +0000 (20:14 +0800)]
whisper : fix index overflow in token-level timestamp logic (#2505)
toboil-features [Thu, 17 Oct 2024 10:25:18 +0000 (13:25 +0300)]
readme : update links and make commands (#2489)
* Update links to headers in README.md
* Add link to Vulkan section in README.md
* Add "-j" for parallelism for "make" in README.md
* Update README.md
KITAITI Makoto [Wed, 16 Oct 2024 15:44:04 +0000 (00:44 +0900)]
ruby : fix bindings (#2484)
* Improve Rakefile
* Remove intermediate files
* Remove unnecessary manipulations from extconf.rb
* Add README and LICENSE to source files
* Manage ext source files using YAML file
* Use extsources.yaml to include files into gem package file
* Add git-managed source files to build dependency
* Add test task
* Download model for test if not exists
* Add test for build
* Ignore gem package directory
* Enable GitHub action for Ruby binding
* Fix model name
* Build lib file for test
* Use extension for each platform
* Use extension for each platform on testing
* Move built lib file rather than copy
* Add intermediate files to clean targets
toboil-features [Wed, 16 Oct 2024 15:43:26 +0000 (18:43 +0300)]
readme : add Vulkan notice (#2488)
* Add Vulkan notice in README.md
* Fix formatting for Vulkan section in README.md
* Fix formatting in README.md
Georgi Gerganov [Wed, 16 Oct 2024 15:42:47 +0000 (18:42 +0300)]
make : fix GGML_VULKAN=1 build (#2485)
Rotem Dan [Tue, 15 Oct 2024 18:00:21 +0000 (21:00 +0300)]
whisper : add dtw preset for large-v3-turbo (#2481)
CrispStrobe [Mon, 14 Oct 2024 07:46:33 +0000 (09:46 +0200)]
convert : handle max_target_positions (#2477)
as needed, e.g. for
https://huggingface.co/primeline/whisper-large-v3-turbo-german/blob/main/config.json
Salman Faroz [Mon, 14 Oct 2024 07:44:57 +0000 (13:14 +0530)]
readme : update the Quick Start section (#2475)
navigating into the directory
Sandro Hanea [Tue, 8 Oct 2024 17:08:00 +0000 (19:08 +0200)]
whisper : add OpenVINO init with state (#2464)
* Fixed OpenVino init on state
* Removed an empty line
* Fixed typo
* Replaced tabs with spaces
---------
Co-authored-by: Sandro Hanea <redacted>
Georgi Gerganov [Mon, 7 Oct 2024 10:06:48 +0000 (13:06 +0300)]
release : v1.7.1
SRHMorris [Sun, 6 Oct 2024 07:34:20 +0000 (08:34 +0100)]
vulkan : retry allocation with fallback flags (#2451)
Co-authored-by: Samuel Morris <redacted>
Georgi Gerganov [Sat, 5 Oct 2024 13:43:26 +0000 (16:43 +0300)]
release : v1.7.0
Georgi Gerganov [Sat, 5 Oct 2024 13:22:53 +0000 (16:22 +0300)]
scripts : bench v3-turbo
Georgi Gerganov [Sat, 5 Oct 2024 13:13:03 +0000 (16:13 +0300)]
whisper : remove mel leftover constants (396089f)
Georgi Gerganov [Sat, 5 Oct 2024 12:22:17 +0000 (15:22 +0300)]
whisper : zero-out the KV cache upon clear (#2445)
Georgi Gerganov [Sat, 5 Oct 2024 12:18:50 +0000 (15:18 +0300)]
objc : fix build
Georgi Gerganov [Sat, 5 Oct 2024 11:33:54 +0000 (14:33 +0300)]
metal : zero-init buffer contexts (#0)
Georgi Gerganov [Sat, 5 Oct 2024 11:29:45 +0000 (14:29 +0300)]
whisper : revert mel-related changes (#0)
too much extra logic and complexity for small benefit
Georgi Gerganov [Sat, 5 Oct 2024 10:14:03 +0000 (13:14 +0300)]
whisper : adapt to latest ggml (skip) (#0)
Daniel Bevenius [Fri, 4 Oct 2024 13:46:18 +0000 (15:46 +0200)]
ggml : fix typo in example usage ggml_gallocr_new (ggml/984)