git.djapps.eu Git - pkg/ggml/sources/whisper.cpp/log


alonfaraj [Sun, 16 Jul 2023 20:00:34 +0000 (23:00 +0300)]

ci : more platforms coverage (#1101)

* add multi platform

* add image name

* fix

* fix /bin/sh path

* add missing \

* add all platforms for check

* remove platforms

* remove s390x

* - add arm v6
- format run cmd

* remove arm v6

* - bump checkout to v3
- use setup emsdk action
- add arch to all ubuntu jobs

* mymindstorm/setup-emsdk to v12

* add missing QEMU step

* add fail-fast: false for debug

* add freebsd

* remark all jobs except freebsd for test

* add sudo

* enable all tests again

* format

* check __AVX__ support before including immintrin.h

* try auto detect flag by cmake

* fix check for immintrin.h

* fix include check for immintrin.h

* Remove all platforms for sanitizer build except amd64

We have no clue why they failed.

---------

Co-authored-by: Alon Faraj <redacted>


Georgi Gerganov [Tue, 4 Jul 2023 17:28:27 +0000 (20:28 +0300)]

whisper : minor OpenVINO refactoring (#1037)

Hopefully I didn't break something - haven't tested


Travis Cline [Tue, 4 Jul 2023 13:13:25 +0000 (06:13 -0700)]

go : call SetDuration appropriately (#1077)


Murilo Santana [Tue, 4 Jul 2023 13:05:35 +0000 (10:05 -0300)]

go : fix context.Process call in examples (#1067)


Ryan Metcalfe [Tue, 4 Jul 2023 12:56:11 +0000 (08:56 -0400)]

whisper : add OpenVINO support (#1037)

* openvino: use OpenVINO encoder inference

* openvino: add python script for OpenVINO model generation

* whisper: Fix 'unused' warnings when OpenVINO isn't enabled in build

* Apply suggestions from code review

Co-authored-by: Georgi Gerganov <redacted>

* whisper: Fix compilation error

* whisper: revert whisper_get_openvino_path_encoder & whisper_get_openvino_path_cache to non-const func signatures

* cmake: Add openvino-encoder as separate object target

* whisper : minor style fixes

* minor : indentation fixes

---------

Co-authored-by: Georgi Gerganov <redacted>


Martin Warnaar [Tue, 4 Jul 2023 12:30:31 +0000 (14:30 +0200)]

readme : better wording (#1064)


Georgi Gerganov [Tue, 4 Jul 2023 06:51:22 +0000 (09:51 +0300)]

readme : add tinydiarize instructions (#1058)


Akash Mahajan [Tue, 4 Jul 2023 06:45:00 +0000 (23:45 -0700)]

whisper : support speaker segmentation (local diarization) of mono audio via tinydiarize (#1058)

* add HuggingFace mirror to download ggml model

* support tdrz via simple hack overriding solm tokens

* fix incorrect translate/transcribe token_ids that are not static const

* add apollo 13 sample for tdrz demo

* render [SPEAKER TURN] consistently in all terminal output using vocab.id_to_token

* extend whisper_segment with speaker_turn_next field and save in json output

* fix failing go build

* slipped in some python syntax whoops

* whisper : finalize tinydiarize support (add flag + fixes)

* whisper : tdrz support for word-level timestamps (respect max_len)

* java : try to fix tests after adding tdrz_enable flag

* main : remove TODO leftover

* java : fix params order list after adding "tdrz_enable"

* whisper : fix solm and add nosp token

* main : print tinydiarize help

---------

Co-authored-by: Georgi Gerganov <redacted>


Georgi Gerganov [Mon, 3 Jul 2023 16:24:01 +0000 (19:24 +0300)]

talk-llama : fix new rope interface


Georgi Gerganov [Sun, 2 Jul 2023 18:53:52 +0000 (21:53 +0300)]

Revert "ggml : do not use _GNU_SOURCE gratuitously (#1027)"

This reverts commit 3f7a03ebe3b65be0792849e300a122f6a050e3f8.


Georgi Gerganov [Sun, 2 Jul 2023 18:45:27 +0000 (21:45 +0300)]

ggml : sync latest repo (mostly refactoring changes)


Przemysław Pawełczyk [Wed, 28 Jun 2023 19:34:50 +0000 (21:34 +0200)]

talk-llama : fix build on macOS (#1062)

* talk-llama : use posix_madvise() instead of madvise() derived from BSD

sed -i 's,\<madvise\>,posix_&,g;s,\<MADV_,POSIX_&,g' examples/talk-llama/llama-util.h

* make : enable Darwin extensions for macOS builds

This is an attempt at fixing macOS build error coming from the fact that
RLIMIT_MEMLOCK define is not available there without Darwin extensions.
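
To make the effect of the `sed` command above concrete, here is a minimal sketch of the same two substitutions using `std::regex`. The function name `apply_posix_madvise_rewrite` is hypothetical and is not part of whisper.cpp; it only illustrates what the rewrite does to a line of source text.

```cpp
#include <regex>
#include <string>

// Hypothetical helper mirroring the two sed substitutions with std::regex
// (ECMAScript syntax): \b stands in for sed's \< and \> word boundaries,
// and $& plays the role of sed's & (the whole match).
std::string apply_posix_madvise_rewrite(const std::string & src) {
    static const std::regex call_re(R"(\bmadvise\b)"); // whole word "madvise"
    static const std::regex flag_re(R"(\bMADV_)");     // names starting with "MADV_"

    const std::string step1 = std::regex_replace(src, call_re, "posix_$&");
    return std::regex_replace(step1, flag_re, "POSIX_$&");
}
```

Because `_` counts as a word character, text that already reads `posix_madvise` or `POSIX_MADV_...` no longer matches at a word boundary, so applying the rewrite twice should leave the text unchanged.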


thefinaldegree [Wed, 28 Jun 2023 19:07:02 +0000 (07:07 +1200)]

extra : update 'quantize-all.sh' to quantize all downloaded models (#1054)

Script will now do what it says: quantize everything except testing models in the 'models' directory.


Georgi Gerganov [Sun, 25 Jun 2023 20:51:01 +0000 (23:51 +0300)]

whisper : `split_on_word` no longer trims (#1046)


Przemysław Pawełczyk [Sun, 25 Jun 2023 13:34:30 +0000 (15:34 +0200)]

ggml : do not use _GNU_SOURCE gratuitously (#1027)

* Do not use _GNU_SOURCE gratuitously.

What is needed to build whisper.cpp and examples is availability of
stuff defined in The Open Group Base Specifications Issue 6
(https://pubs.opengroup.org/onlinepubs/009695399/) known also as
Single Unix Specification v3 (SUSv3) or POSIX.1-2001 + XSI extensions.

There is no need to penalize musl libc which simply follows standards.

Not having feature test macros in source code gives greater flexibility
to those wanting to reuse it in 3rd party app, as they can build it with
minimal FTM (_XOPEN_SOURCE=600) or other FTM depending on their needs.

It builds without issues in Alpine (musl libc), Ubuntu (glibc), MSYS2.

* examples : include SDL headers before other headers

This is an attempt at fixing macOS build error coming from SDL2 relying
on Darwin extension memset_pattern4/8/16 coming from Apple's string.h.


Przemysław Pawełczyk [Sun, 25 Jun 2023 13:13:50 +0000 (15:13 +0200)]

talk-llama : fix build after ggml sync (#1049)

sed -i 's,GGML_BACKEND_CUDA,GGML_BACKEND_GPU,g' examples/talk-llama/llama.cpp


Georgi Gerganov [Sun, 25 Jun 2023 12:40:30 +0000 (15:40 +0300)]

metal : sync ggml-metal (ref #1047)


Georgi Gerganov [Sun, 25 Jun 2023 12:38:12 +0000 (15:38 +0300)]

opencl : sync latest ggml-opencl


Philippe Normand [Sun, 25 Jun 2023 12:30:39 +0000 (13:30 +0100)]

whisper : fix build with -Werror=undef (#1045)


Simon Moisselin [Sun, 25 Jun 2023 12:29:54 +0000 (02:29 -1000)]

models : add ggml_to_pt script (#1042)

* adding ggml_to_pt

* typo sys too many args

* fixing swap errors dimensions

---------

Co-authored-by: simonMoisselin <redacted>


Roddur Dasgupta [Sun, 25 Jun 2023 12:27:28 +0000 (05:27 -0700)]

models : cd statements are quoted to allow spaces in path (#1041)


Georgi Gerganov [Sun, 25 Jun 2023 12:22:49 +0000 (15:22 +0300)]

models : handle paths with spaces in download script (close #1038)


Colin [Sun, 25 Jun 2023 12:07:57 +0000 (07:07 -0500)]

main : add diarization support for all current output types (#1031)

Co-authored-by: Georgi Gerganov <redacted>


GiviMAD [Sun, 25 Jun 2023 11:46:07 +0000 (04:46 -0700)]

readme : add java alternative binding (#1029)

Signed-off-by: Miguel Álvarez <redacted>


Jay Binks [Sun, 25 Jun 2023 11:45:33 +0000 (21:45 +1000)]

go : add support for whisper_full_lang_id() (#1010)

* Add support for whisper_full_lang_id() to go bindings

* Expose token.id so we can test beg, eot etc

---------

Co-authored-by: Jay Binks <redacted>


Georgi Gerganov [Sun, 25 Jun 2023 11:34:10 +0000 (14:34 +0300)]

go : fix "cb" -> "callNewSegment"


Georgi Gerganov [Sun, 25 Jun 2023 11:22:21 +0000 (14:22 +0300)]

ggml : sync latest ggml lib


Bo-Yi Wu [Sun, 25 Jun 2023 11:07:55 +0000 (19:07 +0800)]

go : improve progress reporting and callback handling (#1024)

- Rename `cb` to `callNewSegment` in the `Process` function
- Add `callProgress` as a new parameter to the `Process` function
- Introduce `ProgressCallback` type for reporting progress during processing
- Update `Whisper_full` function to include `progressCallback` parameter
- Add `registerProgressCallback` function and `cbProgress` map for handling progress callbacks

Signed-off-by: appleboy <redacted>


byte-6174 [Sun, 25 Jun 2023 10:59:48 +0000 (06:59 -0400)]

make : update cuBLAS build both x86 and aarch64 (#1015)

make cuBLAS compilation compatible with x86 as well as aarch64.


KP Kaiser [Sun, 25 Jun 2023 10:57:18 +0000 (06:57 -0400)]

make : fix for CUDA native not working as an option on Ubuntu (#1012)


faker [Sun, 25 Jun 2023 10:52:29 +0000 (18:52 +0800)]

main : exit gracefully when invalid params are passed

* Refactor whisper_params_parse to return false on failure

* Updated help flag behavior


faker [Sun, 25 Jun 2023 10:51:59 +0000 (18:51 +0800)]

main : gracefully exit when invalid params are passed (#1002)

* Refactor whisper_params_parse to return false on failure

* Updated help flag behavior


Akash Mahajan [Sun, 25 Jun 2023 10:50:14 +0000 (03:50 -0700)]

py : make convert-pt-to-ggml.py backwards compatible with older vocab.json tokenizer files (#1001)

* patch checkpoint convert script to keep compatibility with older hf_transformers whisper tokenizer

* typo fix


Larry Battle [Sun, 25 Jun 2023 10:46:44 +0000 (05:46 -0500)]

readme : corrected syntax for markdown link (#995)


Nicholas Albion [Tue, 6 Jun 2023 00:27:26 +0000 (10:27 +1000)]

updated java README


Nicholas Albion [Thu, 1 Jun 2023 12:45:00 +0000 (22:45 +1000)]

`speak` scripts for Windows


Nicholas Albion [Thu, 1 Jun 2023 06:53:56 +0000 (16:53 +1000)]

updated README for java


geniusnut [Wed, 31 May 2023 07:13:14 +0000 (15:13 +0800)]

whisper.android : support decoding wav files with 2 channels (#972)


Nicholas Albion [Sun, 28 May 2023 23:38:58 +0000 (09:38 +1000)]

Feature/java bindings2 (#944)

* Java needs to call `whisper_full_default_params_by_ref()`, returning struct by val does not seem to work.
* added convenience methods to WhisperFullParams
* Remove unused WhisperJavaParams


genevera (she/her) [Sat, 27 May 2023 07:40:28 +0000 (03:40 -0400)]

models : fix README.md (#964)

Fixes typo on line 76 of models/README.md


DGdev91 [Wed, 24 May 2023 18:11:01 +0000 (20:11 +0200)]

examples : update elevenlabs scripts to use official python API (#837)

* Update elevenlabs example to use official python API


0xsourcecode [Wed, 24 May 2023 08:23:51 +0000 (04:23 -0400)]

readme : highlight OpenBLAS support (#956)

* highlight openblas support

* Update README.md


Georgi Gerganov [Tue, 23 May 2023 11:04:39 +0000 (14:04 +0300)]

talk-llama : sync latest llama.cpp (close #922, close #954)


Alexey Kharlamov [Sat, 20 May 2023 18:23:45 +0000 (19:23 +0100)]

cmake : build with any BLAS compatible library (#927)

* Build with any BLAS library

* ci: Removed explicit CUDA nvcc path


Georgi Gerganov [Sat, 20 May 2023 17:00:06 +0000 (20:00 +0300)]

ggml : update WASM SIMD


Georgi Gerganov [Sat, 20 May 2023 15:56:30 +0000 (18:56 +0300)]

ggml : sync latest ggml repo

- new Q4 and Q8 quantization
- updated CUDA


Nicholas Albion [Sat, 20 May 2023 15:25:02 +0000 (01:25 +1000)]

bindings : add java bindings (#931)

* WIP - java bindings

* updated README

* failed attempt at JNI

* fullTranscribe() test passes

* tested on Ubuntu 20

* link to Java bindings


Elkana Bardugo [Sat, 20 May 2023 15:17:54 +0000 (18:17 +0300)]

whisper : fix hebrew language code (#935)


Ahmad Bilal [Mon, 15 May 2023 15:36:06 +0000 (20:36 +0500)]

coreml : add support of large-v1 model (#926)


Georgi Gerganov [Sun, 14 May 2023 16:06:45 +0000 (19:06 +0300)]

release : v1.4.2


Georgi Gerganov [Sun, 14 May 2023 15:56:46 +0000 (18:56 +0300)]

ggml : add AVX dot products


Georgi Gerganov [Sun, 14 May 2023 15:46:19 +0000 (18:46 +0300)]

talk-llama : fix build + sync latest llama.cpp


Jhen-Jie Hong [Sun, 14 May 2023 15:11:08 +0000 (23:11 +0800)]

readme : improve Core ML model conversion guidance (#915)


Georgi Gerganov [Sun, 14 May 2023 15:09:44 +0000 (18:09 +0300)]

coreml : support quantized model files


Georgi Gerganov [Sun, 14 May 2023 15:04:23 +0000 (18:04 +0300)]

ggml : sync latest ggml

- New Q4 and Q5 formats
- Various improvements


Rich Jones [Sun, 14 May 2023 14:54:57 +0000 (16:54 +0200)]

main : fix help for --no-timestamps arg (#908)


Georgi Gerganov [Sun, 14 May 2023 07:01:52 +0000 (10:01 +0300)]

extra : update ggml sync script


Jhen-Jie Hong [Sun, 14 May 2023 06:47:02 +0000 (14:47 +0800)]

whisper.objc : enable Core ML in example & fix segmentation fault (#910)

* coreml : update encoder header import path

* coreml : force objc_arc in whisper-encoder.mm

* whisper.objc : create coreml/ group link

* whisper.objc : add coreml model link

* whisper.objc : update readme

* coreml : use -fobjc-arc for coreml/whisper-encoder.mm

* ci: create dummy .mlmodelc to pass ios build

* whisper.objc : update readme

---------

Co-authored-by: Georgi Gerganov <redacted>


Georgi Gerganov [Sun, 14 May 2023 06:42:19 +0000 (09:42 +0300)]

coreml : fix seg fault, double free (#919, #917, #899)


Georgi Gerganov [Tue, 9 May 2023 15:38:12 +0000 (18:38 +0300)]

coreml : fix memory leak (#899)


Jonathan Soo [Mon, 8 May 2023 18:08:09 +0000 (14:08 -0400)]

cmake : fix define used for COREML_ALLOW_FALLBACK (#893)


Luis Herrera [Mon, 8 May 2023 17:59:21 +0000 (12:59 -0500)]

talk-llama : only copy used KV cache in get / set state (#890)

---------

Co-authored-by: ejones <redacted>


Clifford Heath [Mon, 8 May 2023 17:58:36 +0000 (03:58 +1000)]

readme : add instructions on converting to GGML + "--no-config" to wget (#874)


ZaBlazzingZephyrus [Mon, 8 May 2023 17:45:53 +0000 (00:45 +0700)]

cmake : fix options disabling AVX and AVX2 flags (#885)


Georgi Gerganov [Thu, 4 May 2023 16:31:04 +0000 (19:31 +0300)]

cmake : add options to disable CPU flags (#860)


RelatedTitle [Wed, 3 May 2023 20:47:37 +0000 (14:47 -0600)]

ci : add cuBLAS build workflow and fix error causing lines in CMakeLists (#867)

* Add windows build with cuBLAS

* Remove error causing lines for cuBLAS on Windows


Vulcan [Wed, 3 May 2023 16:24:43 +0000 (21:54 +0530)]

readme : partial OpenCL GPU support via CLBlast (#863)

* ggml : CLBlast support as in llama.cpp

Building with CLBlast speeds up whisper.cpp ~2x on low end / older AMD APUs (CPU with integrated GPU) such as the A9.

Usage:
WHISPER_CLBLAST=1 make

* CMake/Makefile : CLBlast support as in llama.cpp

Building with CLBlast speeds up whisper.cpp ~2x on low end / older AMD APUs (CPU with integrated GPU) such as the A9.

Usage:
```
# Makefile build:
cd whisper.cpp
WHISPER_CLBLAST=1 make

# CMake build (alternative):
cd whisper.cpp ; mkdir build ; cd build
cmake -DWHISPER_CLBLAST=ON ..
make
```

* Update README.md

Added OpenCL Build Instructions

* Instruction: Partial OpenCL GPU support via CLBlast

Added build instructions and examples for Make and CMake to support OpenCL enabled GPUs.


Vulcan [Tue, 2 May 2023 19:50:32 +0000 (01:20 +0530)]

build : CLBlast support as in llama.cpp (#862)

* ggml : CLBlast support as in llama.cpp

Building with CLBlast speeds up whisper.cpp ~2x on low end / older AMD APUs (CPU with integrated GPU) such as the A9.

Usage:
WHISPER_CLBLAST=1 make

* CMake/Makefile : CLBlast support as in llama.cpp

Building with CLBlast speeds up whisper.cpp ~2x on low end / older AMD APUs (CPU with integrated GPU) such as the A9.

Usage:
```
# Makefile build:
cd whisper.cpp
WHISPER_CLBLAST=1 make

# CMake build (alternative):
cd whisper.cpp ; mkdir build ; cd build
cmake -DWHISPER_CLBLAST=ON ..
make
```


Georgi Gerganov [Tue, 2 May 2023 18:47:12 +0000 (21:47 +0300)]

ggml : fix 32-bit ARM build + quantization


Georgi Gerganov [Tue, 2 May 2023 18:23:54 +0000 (21:23 +0300)]

ggml : sync ggml (clBLAST + tensor names)


Luis Herrera [Tue, 2 May 2023 17:05:27 +0000 (12:05 -0500)]

talk-llama : fix session prompt load (#854)


CRD716 [Tue, 2 May 2023 16:51:52 +0000 (11:51 -0500)]

whisper : add detect-language mode (#853)

* add detectlanguage flag

* renaming and help

* no idea why that last one didn't commit

* run language detection if dl is set

* help message fix

* various fixes

* fix quitting

* fix language being english on print


Luis Herrera [Mon, 1 May 2023 17:18:10 +0000 (12:18 -0500)]

talk-llama : add --session support (#845)

* feat: adding session support

* readme: adding --session info in examples/talk-llama

* llama: adding session fixes

* readme: updating session doc

* talk-llama: update the value of need_to_save_session to true in order to save the session in the subsequent interaction

* talk-llama: adding missing function which updates session_tokens


Georgi Gerganov [Mon, 1 May 2023 11:44:39 +0000 (14:44 +0300)]

bench : improve benchmarks


Georgi Gerganov [Mon, 1 May 2023 07:03:56 +0000 (10:03 +0300)]

whisper : add memory sizes for Q8_0 (close #846)


Baffin Lee [Mon, 1 May 2023 06:28:05 +0000 (14:28 +0800)]

whisper.wasm : fix typo in readme (#832)


Georgi Gerganov [Sun, 30 Apr 2023 19:57:42 +0000 (22:57 +0300)]

release : v1.4.1


Georgi Gerganov [Sun, 30 Apr 2023 19:50:04 +0000 (22:50 +0300)]

whisper : fix quantize bug (#842)

* whisper : debug

* whisper : fix bug during quantization


Georgi Gerganov [Sun, 30 Apr 2023 19:27:30 +0000 (22:27 +0300)]

ggml : fix UB (int << 31)
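
As general background (the commit's own diff is not shown in this log): in C, evaluating `1 << 31` with a 32-bit `int` shifts a bit into the sign position, which the C standard leaves undefined. Performing the shift on an unsigned operand is well defined. A minimal sketch of that idea follows; the helper name `bit_mask` is hypothetical and not taken from ggml.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: returns a mask with only bit `i` set (0 <= i < 32).
// The shift is performed on an unsigned 32-bit value, so i == 31 is
// well defined, unlike shifting the signed literal 1.
inline uint32_t bit_mask(int i) {
    assert(i >= 0 && i < 32);
    return UINT32_C(1) << i;
}
```

Calling `bit_mask(31)` yields the top bit of a 32-bit word without relying on signed overflow behaviour.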


Georgi Gerganov [Sun, 30 Apr 2023 16:23:37 +0000 (19:23 +0300)]

release : v1.4.0


Georgi Gerganov [Sun, 30 Apr 2023 16:12:49 +0000 (19:12 +0300)]

examples : fix + refactor Levenshtein distance
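
For readers unfamiliar with the algorithm named in this commit, the following is a minimal sketch of the textbook two-row dynamic-programming Levenshtein distance. It is not the code from the commit; the function name `levenshtein` and its signature are assumptions made for this illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Textbook edit distance: minimum number of single-character insertions,
// deletions and substitutions needed to turn `a` into `b`.
// Hypothetical signature, for illustration only.
int levenshtein(const std::string & a, const std::string & b) {
    std::vector<int> prev(b.size() + 1), curr(b.size() + 1);

    // Distance from the empty prefix of `a` to each prefix of `b`.
    for (std::size_t j = 0; j <= b.size(); ++j) prev[j] = (int) j;

    for (std::size_t i = 1; i <= a.size(); ++i) {
        curr[0] = (int) i; // delete all i characters of a's prefix
        for (std::size_t j = 1; j <= b.size(); ++j) {
            const int cost = a[i - 1] == b[j - 1] ? 0 : 1;
            curr[j] = std::min({prev[j] + 1,          // deletion
                                curr[j - 1] + 1,      // insertion
                                prev[j - 1] + cost}); // substitution or match
        }
        std::swap(prev, curr);
    }
    return prev[b.size()];
}
```

Keeping only two rows keeps memory proportional to the length of `b` rather than to the product of both lengths.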


Georgi Gerganov [Sun, 30 Apr 2023 15:51:57 +0000 (18:51 +0300)]

whisper : add integer quantization support (#540)

* whisper : add integer quantization support

* examples : add common-ggml + prepare to add "quantize" tool

* whisper : quantization tool ready

* whisper : fix F32 support

* whisper : try to fix shared lib linkage

* wasm : update quantized models to Q5

* bench.wasm : remove "medium" button

* bench.wasm : fix custom model button

* ggml : add Q5_0 and Q5_1 WASM SIMD

* wasm : add quantized models to all WASM examples

* wasm : bump DB version number to 2

* talk-llama : update example to latest llama.cpp

* node : increase test timeout to 10s

* readme : add information for model quantization

* wasm : add links to other examples


Georgi Gerganov [Sun, 30 Apr 2023 09:14:33 +0000 (12:14 +0300)]

whisper : add GPU support via cuBLAS (#834)

* make : add WHISPER_CUBLAS

* make : fix CUBLAS build

* whisper : disable Flash Attention + adjust memory buffers

* whisper : remove old commented code

* readme : add cuBLAS instructions

* cmake : add WHISPER_CUBLAS option

* gitignore : ignore build-cublas


Georgi Gerganov [Sat, 29 Apr 2023 18:37:23 +0000 (21:37 +0300)]

ggml : fix WASM build


Georgi Gerganov [Sat, 29 Apr 2023 18:33:33 +0000 (21:33 +0300)]

ggml : fix 32-bit ARM NEON (#836)

* ggml : add support for 32-bit ARM

* ggml : fix

* ggml : fix


Georgi Gerganov [Sat, 29 Apr 2023 18:14:09 +0000 (21:14 +0300)]

ggml : use vzip instead of vuzp for consistency


Georgi Gerganov [Sat, 29 Apr 2023 17:21:25 +0000 (20:21 +0300)]

ggml : fix WASM build


Georgi Gerganov [Sat, 29 Apr 2023 16:30:22 +0000 (19:30 +0300)]

ggml : sync with ggml repo (warning fixes + asserts)


Thijs Raymakers [Sat, 29 Apr 2023 15:55:37 +0000 (17:55 +0200)]

whisper : use correct seek_end when offset is used (#833)

Whenever an `offset_ms` is provided, the value of `seek_end` is
calculated incorrectly. This causes Whisper to keep transcribing
after the end of the file.

The current behavior looks like
```
[00:34:40.000 --> 00:34:47.000]   This is an example audio file.
[00:34:47.000 --> 00:34:49.000]   The text has been redacted
[00:34:49.000 --> 00:34:51.000]   This is the end of the audio.
[00:34:51.000 --> 00:34:52.000]   ***
[00:34:52.000 --> 00:34:53.000]   ***
[00:34:53.000 --> 00:34:54.000]   ***
[00:34:55.000 --> 00:34:56.000]   ***
...
```

The expected behavior should be
```
[00:34:40.000 --> 00:34:47.000]   This is an example audio file.
[00:34:47.000 --> 00:34:49.000]   The text has been redacted
[00:34:49.000 --> 00:34:51.000]   This is the end of the audio.
- end of program -
```

This commit changes the calculation of the `seek_end` variable to
only add `seek_start` if a custom `duration_ms` is provided.
Otherwise, it defaults to the end of the file.

Signed-off-by: Thijs Raymakers <redacted>
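
The `seek_end` change described in this commit message can be sketched as follows. This is an illustration under stated assumptions, not the actual whisper.cpp code: the function names `seek_end_old` and `seek_end_new`, the parameter `n_len` (total audio length), and the convention that seek positions are counted in units of 10 ms are all assumptions made for the example.

```cpp
#include <cassert>

// Hypothetical helpers; positions are assumed to be in units of 10 ms.

// Behaviour described as incorrect: seek_start is always added, so when no
// custom duration is given the end position lands past the end of the audio.
int seek_end_old(int offset_ms, int duration_ms, int n_len) {
    assert(offset_ms >= 0 && duration_ms >= 0 && n_len >= 0);
    const int seek_start = offset_ms / 10;
    return seek_start + (duration_ms == 0 ? n_len : duration_ms / 10);
}

// Behaviour described by the commit: add seek_start only when a custom
// duration is provided; otherwise stop at the end of the file.
int seek_end_new(int offset_ms, int duration_ms, int n_len) {
    assert(offset_ms >= 0 && duration_ms >= 0 && n_len >= 0);
    const int seek_start = offset_ms / 10;
    return duration_ms == 0 ? n_len : seek_start + duration_ms / 10;
}
```

With a 60 s file (`n_len = 6000`) and a 30 s offset, the old formula gives an end position of 9000, which is 30 s past the end of the audio, while the new one gives 6000. When an explicit duration is supplied, both formulas agree.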


Georgi Gerganov [Sat, 29 Apr 2023 09:32:18 +0000 (12:32 +0300)]

tests : add "threads" to run-tests.sh


Georgi Gerganov [Sat, 29 Apr 2023 09:32:05 +0000 (12:32 +0300)]

extra : add sync-ggml.sh script


Georgi Gerganov [Sat, 29 Apr 2023 09:31:52 +0000 (12:31 +0300)]

ggml : sync latest ggml + llama.cpp updates (quantization)


Zollner [Sat, 29 Apr 2023 08:00:20 +0000 (16:00 +0800)]

whisper.android : add some tips (#816)


Georgi Gerganov [Sat, 29 Apr 2023 07:55:24 +0000 (10:55 +0300)]

build : add WHISPER_COREML_ALLOW_FALLBACK to make / CMake (#812)


Canis Lupus [Sat, 29 Apr 2023 07:49:02 +0000 (08:49 +0100)]

whisper : allow non-CoreML fallback when Core ML cannot be loaded (#812)

if the Core ML model cannot be loaded, continue without Core ML instead of
returning. This allows a single build to transcribe using Core ML models
where available, and regular models when not.


Georgi Gerganov [Sat, 29 Apr 2023 07:42:14 +0000 (10:42 +0300)]

whisper : fix bug from previous commit


Georgi Gerganov [Sat, 29 Apr 2023 07:36:50 +0000 (10:36 +0300)]

whisper : avoid designated initializers


AsukaMinato [Sat, 29 Apr 2023 07:06:25 +0000 (16:06 +0900)]

minor : improve C++ and Python style (#768)

* use some STL functions

* use self.field rather than setattr, use pathlib.Path

* recover some format

* const some iter

* Keep the original

* 2 space


Georgi Gerganov [Fri, 28 Apr 2023 19:41:29 +0000 (22:41 +0300)]

readme : add logo


Laytan Laats [Sun, 23 Apr 2023 16:01:59 +0000 (18:01 +0200)]

main : escape quotes in csv output (#815)
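
How this commit escapes quotes is not shown in the log. As a sketch of one common convention (RFC 4180), a field is wrapped in double quotes and every embedded double quote is doubled. The function name `csv_quote_field` is hypothetical and the approach may differ from what whisper.cpp actually does.

```cpp
#include <string>

// Hypothetical helper following the RFC 4180 convention: wrap the field in
// double quotes and double every embedded double quote.
std::string csv_quote_field(const std::string & field) {
    std::string out = "\"";
    for (const char c : field) {
        if (c == '"') {
            out += '"'; // emit the extra quote that escapes the next one
        }
        out += c;
    }
    out += '"';
    return out;
}
```

Under this convention a transcript segment such as `He said "hi"` is written to the CSV file as `"He said ""hi"""`, which keeps commas and quotes inside the text from breaking the column layout.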

Packaging of ggerganov/whisper.cpp