git.djapps.eu Git - pkg/ggml/sources/whisper.cpp/log
2 years agowhisper : use emplace_back in place of push_back (#319)
Andy Maloney [Fri, 23 Dec 2022 09:07:19 +0000 (04:07 -0500)]
whisper : use emplace_back in place of push_back (#319)

This avoids potential construction of temporaries.
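
As a rough illustration of the difference (not code from this commit), the sketch below uses a hypothetical `Segment` type and a hypothetical `make_segments()` helper. With `push_back`, a `Segment` is built first and then moved into the vector; with `emplace_back`, the constructor arguments are forwarded and the element is constructed directly in the vector's storage.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical segment type, used only for illustration.
struct Segment {
    int         t0;
    int         t1;
    std::string text;

    Segment(int t0_, int t1_, std::string text_)
        : t0(t0_), t1(t1_), text(std::move(text_)) {}
};

// Hypothetical helper: both calls below leave an equivalent element in the vector.
std::vector<Segment> make_segments() {
    std::vector<Segment> segments;

    // push_back: a temporary Segment is constructed, then moved into the vector.
    segments.push_back(Segment(0, 100, "hello"));

    // emplace_back: the arguments are forwarded and the element is built in place.
    segments.emplace_back(100, 200, "world");

    return segments;
}
```

The saving is usually small, but it matters most for element types that are expensive to move or copy.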

2 years agowhisper : fix mem leak on failure to load model (#318)
Andy Maloney [Fri, 23 Dec 2022 09:06:17 +0000 (04:06 -0500)]
whisper : fix mem leak on failure to load model (#318)

2 years agoggml : make consts static (#317)
Andy Maloney [Fri, 23 Dec 2022 09:05:27 +0000 (04:05 -0500)]
ggml : make consts static (#317)

These shouldn't be able to be referenced outside the compilation unit.

2 years agoUpdate README.md
Georgi Gerganov [Fri, 23 Dec 2022 09:02:46 +0000 (11:02 +0200)]
Update README.md

Add SwiftUI example links

2 years agoexamples : add whisper.swiftui demo app (#308)
Digipom [Fri, 23 Dec 2022 08:56:18 +0000 (03:56 -0500)]
examples : add whisper.swiftui demo app (#308)

* Add SwiftUI demo project.

* Add -DGGML_USE_ACCELERATE

2 years agoUpdate README.md
Georgi Gerganov [Thu, 22 Dec 2022 16:22:58 +0000 (18:22 +0200)]
Update README.md

Add bindings links / discussions

2 years agominor : small code cleanups (#302)
Andy Maloney [Thu, 22 Dec 2022 15:06:19 +0000 (10:06 -0500)]
minor : small code cleanups (#302)

* Small code cleanups

- fix indentation
- remove extra semicolons
- remove extra break after returns in case statements
- remove unnecessary call to .data() on string
- use empty() instead of checking size()
- no need to check for nullptr before free
- remove unnecessary initialization of string to ""

* minor : switch case always break

Co-authored-by: Georgi Gerganov <redacted>
2 years agominor : flag "ARM FMA" -> "ARM_FMA"
Georgi Gerganov [Thu, 22 Dec 2022 14:43:57 +0000 (16:43 +0200)]
minor : flag "ARM FMA" -> "ARM_FMA"

2 years agoBuild a vfpv4 library for armeabi-v7a and do runtime detection to select the right...
Kevin Brothaler [Tue, 20 Dec 2022 20:15:59 +0000 (15:15 -0500)]
Build a vfpv4 library for armeabi-v7a and do runtime detection to select the right library

2 years agoCheck for both __ARM_NEON and __ARM_FEATURE_FMA so that the project can be compiled...
Kevin Brothaler [Tue, 20 Dec 2022 18:33:33 +0000 (13:33 -0500)]
Check for both __ARM_NEON and __ARM_FEATURE_FMA so that the project can be compiled for armv7a.

NEON on Android armeabi-v7a does not provide FMA unless the build is configured with `-mfpu=neon-fp-armv8`, which would need runtime checks.
* Also removed ABI filter from Android project.
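
A minimal sketch of what such a guard can look like, assuming a hypothetical `vec_mad()` routine and a hypothetical `SKETCH_USE_NEON_FMA` macro (neither is taken from the project). The fused multiply-add path is compiled only when both `__ARM_NEON` and `__ARM_FEATURE_FMA` are defined; any other target uses the scalar loop.

```cpp
#include <cstddef>

#if defined(__ARM_NEON) && defined(__ARM_FEATURE_FMA)
#include <arm_neon.h>
#define SKETCH_USE_NEON_FMA 1   // hypothetical macro name
#else
#define SKETCH_USE_NEON_FMA 0
#endif

// Hypothetical routine: y[i] += x[i] * v for i in [0, n).
void vec_mad(std::size_t n, float * y, const float * x, float v) {
    std::size_t i = 0;

#if SKETCH_USE_NEON_FMA
    // Vector path: four floats at a time using fused multiply-add.
    const float32x4_t vv = vdupq_n_f32(v);
    for (; i + 4 <= n; i += 4) {
        float32x4_t       ay = vld1q_f32(y + i);
        const float32x4_t ax = vld1q_f32(x + i);
        ay = vfmaq_f32(ay, ax, vv);   // ay + ax * vv
        vst1q_f32(y + i, ay);
    }
#endif

    // Scalar path: handles the tail, or everything when FMA is unavailable.
    for (; i < n; ++i) {
        y[i] += x[i] * v;
    }
}
```

Checking only `__ARM_NEON` would select the vector path on armv7a builds where the FMA intrinsic is not available, which is the compile failure this commit describes.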

2 years agoBump NDK version
Kevin Brothaler [Tue, 20 Dec 2022 18:33:27 +0000 (13:33 -0500)]
Bump NDK version

2 years agowhisper : use nullptr (C++11) instead of NULL macro (#299)
Andy Maloney [Thu, 22 Dec 2022 14:35:18 +0000 (09:35 -0500)]
whisper : use nullptr (C++11) instead of NULL macro (#299)

2 years agocmake : add headers to target (#298)
Andy Maloney [Thu, 22 Dec 2022 14:34:47 +0000 (09:34 -0500)]
cmake : add headers to target (#298)

This will show the header files in IDEs.

2 years agogo : run `go mod tidy` before building examples + fix permissions (#296)
Mohit Agarwal [Thu, 22 Dec 2022 14:34:20 +0000 (20:04 +0530)]
go : run `go mod tidy` before building examples + fix permissions (#296)

* run `go mod tidy` before building examples

Running `make examples` after cloning the repository gives the following
error:

```
...
[100%] Built target whisper
gmake[3]: Leaving directory '/tmp/exp/whisper.cpp/bindings/go/build'
gmake[2]: Leaving directory '/tmp/exp/whisper.cpp/bindings/go/build'
gmake[1]: Leaving directory '/tmp/exp/whisper.cpp/bindings/go/build'
Build example go-model-download
Build example go-whisper
examples/go-whisper/process.go:11:2: missing go.sum entry for module providing package github.com/go-audio/wav (imported by github.com/ggerganov/whisper.cpp/bindings/go/examples/go-whisper); to add:
        go get github.com/ggerganov/whisper.cpp/bindings/go/examples/go-whisper
make: *** [Makefile:26: examples/go-whisper] Error 1
```

* remove executable bit from various files

2 years agobindings : initial import of golang bindings (#287)
David Thorpe [Tue, 20 Dec 2022 06:54:33 +0000 (07:54 +0100)]
bindings : initial import of golang bindings (#287)

* Initial import of golang bindings

* Updated makefile rules

* Updated bindings

* Makefile update to add in more tests

2 years agoUpdate README.md
Georgi Gerganov [Mon, 19 Dec 2022 20:09:21 +0000 (22:09 +0200)]
Update README.md

2 years agocmake : enable and fix -Wall -Wextra -Wpedantic C++ warnings
Georgi Gerganov [Mon, 19 Dec 2022 18:45:08 +0000 (20:45 +0200)]
cmake : enable and fix -Wall -Wextra -Wpedantic C++ warnings

2 years agominor : resolves some of warnings when compiling with clang/clang++ (#294)
Matheus de Sousa [Mon, 19 Dec 2022 18:19:01 +0000 (15:19 -0300)]
minor : resolves some of warnings when compiling with clang/clang++ (#294)

* Resolves some of the warnings when compiling with clang/clang++

Mostly nit stuff that clang catches when compiling with -Wall -Wextra
-pedantic.

- Fix comparisons between signed and unsigned integers.
- Pass by constant reference (const&) instead of copying each time.

* minor : normalize coding style

* minor : fix warning

Co-authored-by: Georgi Gerganov <redacted>
2 years agorelease : v1.0.4
Georgi Gerganov [Sat, 17 Dec 2022 17:52:42 +0000 (19:52 +0200)]
release : v1.0.4

2 years agoAdd AVX,AVX2 support for ggml_vec_scale_f32
katsu560 [Fri, 16 Dec 2022 23:42:30 +0000 (08:42 +0900)]
Add AVX,AVX2 support for ggml_vec_scale_f32

2 years agomake : revert accidental change of optimization flags
Georgi Gerganov [Sat, 17 Dec 2022 16:57:42 +0000 (18:57 +0200)]
make : revert accidental change of optimization flags

2 years agowhisper : language auto-detect (#59)
Georgi Gerganov [Sat, 17 Dec 2022 15:58:08 +0000 (17:58 +0200)]
whisper : language auto-detect (#59)

2 years agoAdd Roadmap
Georgi Gerganov [Fri, 16 Dec 2022 21:41:57 +0000 (23:41 +0200)]
Add Roadmap

2 years agoggml : implement ggml_compute_forward_dup_f16() special cases
Georgi Gerganov [Fri, 16 Dec 2022 19:50:41 +0000 (21:50 +0200)]
ggml : implement ggml_compute_forward_dup_f16() special cases

2 years agomain : add option to print the progress (#276)
Georgi Gerganov [Fri, 16 Dec 2022 18:20:43 +0000 (20:20 +0200)]
main : add option to print the progress (#276)

2 years agomain : add "--prompt" command line argument (#90)
Georgi Gerganov [Fri, 16 Dec 2022 17:43:16 +0000 (19:43 +0200)]
main : add "--prompt" command line argument (#90)

This allows an initial prompt to be provided for use at the start of
processing.

2 years agocommand : better indentation
Georgi Gerganov [Tue, 13 Dec 2022 19:46:42 +0000 (21:46 +0200)]
command : better indentation

2 years agocommand : update README, show how to use guided mode
Georgi Gerganov [Tue, 13 Dec 2022 19:36:29 +0000 (21:36 +0200)]
command : update README, show how to use guided mode

2 years agocommand : adding guided mode
Georgi Gerganov [Tue, 13 Dec 2022 17:21:32 +0000 (19:21 +0200)]
command : adding guided mode

2 years agowhisper : add whisper_tokenize()
Georgi Gerganov [Tue, 13 Dec 2022 17:21:07 +0000 (19:21 +0200)]
whisper : add whisper_tokenize()

Tokenizes a string into a list of vocabulary tokens

2 years agoUpdate README.md (#46)
Georgi Gerganov [Fri, 16 Dec 2022 17:28:51 +0000 (19:28 +0200)]
Update README.md (#46)

Add references to the new Android app

2 years agoAdd Android sample (#277)
Digipom [Fri, 16 Dec 2022 17:20:13 +0000 (12:20 -0500)]
Add Android sample (#277)

* Add Android sample

* Use main project C files

* Stop existing playback before starting new playback

* Make text scrollable

* Stop playback when starting to record

* Remove extra var

2 years agoci : add Windows build without OpenBLAS + change to Release (#85) (#282)
Georgi Gerganov [Fri, 16 Dec 2022 16:51:46 +0000 (18:51 +0200)]
ci : add Windows build without OpenBLAS + change to Release (#85) (#282)

2 years agowhisper : improve decoding strategy (#244)
Georgi Gerganov [Fri, 16 Dec 2022 16:31:17 +0000 (18:31 +0200)]
whisper : improve decoding strategy (#244)

- Clear past prompt when there is very short audio left for processing.
  My observation is that in these cases the decoding tends to repeat and
  hallucinate stuff and I think this is induced by the existing prompt
- When we fail to sample timestamp token, retry by clearing the past
  prompt. If it fails again, then we advance the window by 1 second
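
The fallback order described above can be sketched roughly as follows. Everything here is hypothetical and only mirrors the commit message: `DecodeState`, `decode_with_fallback()`, the `try_decode` callback and the `short_audio_ms` threshold are illustrative names, and the actual threshold for "very short audio" is not stated in the message.

```cpp
#include <functional>
#include <vector>

// Hypothetical decoder state: current position and the tokens kept as prompt.
struct DecodeState {
    int              seek_ms;
    std::vector<int> prompt_past;
};

// try_decode is a hypothetical callback that returns true when a timestamp
// token was sampled successfully for the current window.
bool decode_with_fallback(DecodeState & st,
                          int audio_len_ms,
                          int short_audio_ms,   // assumed threshold, not from the source
                          const std::function<bool(const DecodeState &)> & try_decode) {
    // Very little audio left: drop the past prompt before decoding.
    if (audio_len_ms - st.seek_ms < short_audio_ms) {
        st.prompt_past.clear();
    }

    if (try_decode(st)) {
        return true;
    }

    // First failure: retry once with the past prompt cleared.
    st.prompt_past.clear();
    if (try_decode(st)) {
        return true;
    }

    // Second failure: give up on this window and advance by 1 second.
    st.seek_ms += 1000;
    return false;
}
```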

2 years agostream : update README.md + comments
Georgi Gerganov [Fri, 16 Dec 2022 16:04:19 +0000 (18:04 +0200)]
stream : update README.md + comments

2 years agoUpdate README.md (#56)
Georgi Gerganov [Fri, 16 Dec 2022 16:01:05 +0000 (18:01 +0200)]
Update README.md (#56)

2 years agoggml : make more compatible with c99 (#262)
Georgi Gerganov [Fri, 16 Dec 2022 16:00:12 +0000 (18:00 +0200)]
ggml : make more compatible with c99 (#262)

2 years agoUpdate README.md
Georgi Gerganov [Thu, 15 Dec 2022 18:38:08 +0000 (20:38 +0200)]
Update README.md

2 years agostream : fix build
Georgi Gerganov [Thu, 15 Dec 2022 18:15:36 +0000 (20:15 +0200)]
stream : fix build

2 years agostream : add sliding window mode
Georgi Gerganov [Thu, 15 Dec 2022 16:28:22 +0000 (18:28 +0200)]
stream : add sliding window mode

2 years agowhisper : fix UB when reading buffer of length 0 bytes (#265)
Georgi Gerganov [Tue, 13 Dec 2022 21:13:55 +0000 (23:13 +0200)]
whisper : fix UB when reading buffer of length 0 bytes (#265)

2 years agoggml : fix indentation
Georgi Gerganov [Tue, 13 Dec 2022 21:09:01 +0000 (23:09 +0200)]
ggml : fix indentation

2 years agoggml : make compatible with c99 (#262)
Georgi Gerganov [Tue, 13 Dec 2022 21:07:49 +0000 (23:07 +0200)]
ggml : make compatible with c99 (#262)

2 years agotalk : improve prompting
Georgi Gerganov [Mon, 12 Dec 2022 21:44:36 +0000 (23:44 +0200)]
talk : improve prompting

2 years agorelease : v1.0.3
Georgi Gerganov [Mon, 12 Dec 2022 18:36:52 +0000 (20:36 +0200)]
release : v1.0.3

Fixed whisper.spm tests

2 years agoUpdate README.md
Georgi Gerganov [Mon, 12 Dec 2022 18:33:09 +0000 (20:33 +0200)]
Update README.md

2 years agorelease : v1.0.2
Georgi Gerganov [Mon, 12 Dec 2022 18:25:56 +0000 (20:25 +0200)]
release : v1.0.2

2 years agoUpdate README.md
Georgi Gerganov [Mon, 12 Dec 2022 18:23:10 +0000 (20:23 +0200)]
Update README.md

2 years agoUpdate README.md
Georgi Gerganov [Mon, 12 Dec 2022 18:20:51 +0000 (20:20 +0200)]
Update README.md

2 years agoNode.js package (#260)
Georgi Gerganov [Mon, 12 Dec 2022 18:17:27 +0000 (20:17 +0200)]
Node.js package (#260)

* npm : preparing infra for node package

* npm : package infra ready

* npm : initial version ready

* npm : change name to whisper.cpp

whisper.js is taken

2 years agotalk : make compatible with c++11 (part 2)
Georgi Gerganov [Sun, 11 Dec 2022 18:34:04 +0000 (20:34 +0200)]
talk : make compatible with c++11 (part 2)

2 years agotalk : make compatible with c++11
Georgi Gerganov [Sun, 11 Dec 2022 18:19:17 +0000 (20:19 +0200)]
talk : make compatible with c++11

2 years agocmake : require c++11 instead of c++20
Georgi Gerganov [Sun, 11 Dec 2022 18:04:05 +0000 (20:04 +0200)]
cmake : require c++11 instead of c++20

2 years agoRemove C++20 requirement (#257)
Roland Rabien [Sun, 11 Dec 2022 18:03:07 +0000 (10:03 -0800)]
Remove C++20 requirement (#257)

* Remove C++20 requirement

* Roll back C features not supported in VS2017

2 years agoAdd newline per segment for text output (#254)
Lexevolution [Sun, 11 Dec 2022 18:00:29 +0000 (04:00 +1000)]
Add newline per segment for text output (#254)

2 years agobench : more concise representation of the results (#89)
Georgi Gerganov [Sun, 11 Dec 2022 09:56:13 +0000 (11:56 +0200)]
bench : more concise representation of the results (#89)

2 years agominor : fix .gitignore to not ignore examples
Georgi Gerganov [Sun, 11 Dec 2022 09:39:46 +0000 (11:39 +0200)]
minor : fix .gitignore to not ignore examples

2 years agobench.wasm : same as "bench" but runs in the browser (#89)
Georgi Gerganov [Sun, 11 Dec 2022 09:09:01 +0000 (11:09 +0200)]
bench.wasm : same as "bench" but runs in the browser (#89)

2 years agoUpdate README.md
Georgi Gerganov [Sat, 10 Dec 2022 14:54:57 +0000 (16:54 +0200)]
Update README.md

2 years agotalk : fix build for MSVC
Georgi Gerganov [Sat, 10 Dec 2022 14:48:15 +0000 (16:48 +0200)]
talk : fix build for MSVC

2 years agotalk : talk with AI in the terminal
Georgi Gerganov [Fri, 9 Dec 2022 18:38:10 +0000 (20:38 +0200)]
talk : talk with AI in the terminal

2 years agofix potential bug reading model data into a small size optimized string which could...
bert hubert [Sat, 10 Dec 2022 12:09:31 +0000 (13:09 +0100)]
fix potential bug reading model data into a small size optimized string which could lead to memory corruption. In an SSO string, you can't write data to &str[0] and expect it to work well.

Also added a small wrapper function to more safely read model data without having to get the sizeof right. I tested this on tiny, base and large models; there was no change in behaviour.
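
One way such a wrapper might look is sketched below. The names `read_safe()` and `read_string()` and the length-prefixed layout are assumptions made for illustration, not necessarily what the commit contains. The wrapper takes the size from the destination's type, and the string helper reads into a buffer of the right length before building the `std::string`, rather than writing through `&str[0]` of a string that has not been sized.

```cpp
#include <cstdint>
#include <sstream>   // also provides std::istream
#include <string>
#include <vector>

// Reads exactly sizeof(T) bytes into dest, so callers cannot get the size wrong.
template <typename T>
void read_safe(std::istream & in, T & dest) {
    in.read(reinterpret_cast<char *>(&dest), sizeof(T));
}

// Reads a 32-bit length followed by that many bytes (assumed layout).
std::string read_string(std::istream & in) {
    std::uint32_t len = 0;
    read_safe(in, len);

    // Read into a buffer that really owns len bytes, then copy into the string.
    std::vector<char> buf(len);
    if (len > 0) {
        in.read(buf.data(), len);
    }
    return std::string(buf.begin(), buf.end());
}
```

Error handling (short reads, implausible lengths) is left out to keep the sketch short; real loading code would check the stream state after each read.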

2 years agowhisper : minor improvement in decoding strategy (#244)
Georgi Gerganov [Sat, 10 Dec 2022 11:38:26 +0000 (13:38 +0200)]
whisper : minor improvement in decoding strategy (#244)

Do not allow for text segments to go beyond end of audio.
This partially mitigates some issues when the last audio window is 1-2
seconds just before the end of the audio file and the decoding spirals
into a repetition of the last transcribed phrase.

2 years agoggml : add alternative cblas_sgemm call
Georgi Gerganov [Thu, 8 Dec 2022 21:48:04 +0000 (23:48 +0200)]
ggml : add alternative cblas_sgemm call

2 years agomake : indentation + .gitignore
Georgi Gerganov [Thu, 8 Dec 2022 17:42:06 +0000 (19:42 +0200)]
make : indentation + .gitignore

2 years agoFlag for Position Independent Code
Reinis Muiznieks [Wed, 7 Dec 2022 12:44:58 +0000 (14:44 +0200)]
Flag for Position Independent Code

2 years agotwitch.sh : various fixes and polishing
Georgi Gerganov [Thu, 8 Dec 2022 17:17:24 +0000 (19:17 +0200)]
twitch.sh : various fixes and polishing

- check if streamlink is installed
- fix audio chunking
- change default threads to 4

2 years agoAllow for Twitch.tv live transcription
keyehzy [Thu, 8 Dec 2022 02:45:54 +0000 (23:45 -0300)]
Allow for Twitch.tv live transcription

We rely on the streamlink library to give us a stream, then we proceed similarly to
the radio livestream example.

2 years agoFix paths echoed after the download
Kartik Saranathan [Thu, 8 Dec 2022 04:18:30 +0000 (23:18 -0500)]
Fix paths echoed after the download

Was using models path instead of root path

2 years agofix compilation on haiku
Al Hoang [Thu, 8 Dec 2022 05:34:19 +0000 (05:34 +0000)]
fix compilation on haiku

2 years agoyt-wsp.sh : improve usage instructions
Georgi Gerganov [Wed, 7 Dec 2022 20:12:08 +0000 (22:12 +0200)]
yt-wsp.sh : improve usage instructions

2 years agoyt-wsp.sh : fix usage instruction + comment
Georgi Gerganov [Wed, 7 Dec 2022 19:12:55 +0000 (21:12 +0200)]
yt-wsp.sh : fix usage instruction + comment

2 years agoUpdate README.md
Georgi Gerganov [Wed, 7 Dec 2022 03:15:46 +0000 (05:15 +0200)]
Update README.md

2 years agolivestream.sh : remove obsolete comment
Georgi Gerganov [Wed, 7 Dec 2022 02:41:43 +0000 (04:41 +0200)]
livestream.sh : remove obsolete comment

2 years agoggml : fix typo in previous commit
Georgi Gerganov [Tue, 6 Dec 2022 20:12:57 +0000 (22:12 +0200)]
ggml : fix typo in previous commit

2 years agoggml : use macros to inline FP16 <-> FP32 conversions
Georgi Gerganov [Tue, 6 Dec 2022 20:05:33 +0000 (22:05 +0200)]
ggml : use macros to inline FP16 <-> FP32 conversions

2 years agoggml : add F16C CPU flag check
Georgi Gerganov [Tue, 6 Dec 2022 19:56:56 +0000 (21:56 +0200)]
ggml : add F16C CPU flag check

2 years agoadd fp16/fp32 convert intrinsics
katsu560 [Tue, 6 Dec 2022 18:32:48 +0000 (03:32 +0900)]
add fp16/fp32 convert intrinsics

2 years agomodels : add the new "large" model release by OpenAI
Georgi Gerganov [Tue, 6 Dec 2022 16:48:57 +0000 (18:48 +0200)]
models : add the new "large" model release by OpenAI

The old "large" model is now renamed "large-v1".
If you have been using it, make sure to rename it and download the new
"large" model for best results.

2 years agobench : add commit hash to bench-all.sh results
Georgi Gerganov [Tue, 6 Dec 2022 16:47:48 +0000 (18:47 +0200)]
bench : add commit hash to bench-all.sh results

2 years agoTry to improve the token sampling strategy (#193)
Georgi Gerganov [Fri, 2 Dec 2022 19:51:50 +0000 (21:51 +0200)]
Try to improve the token sampling strategy (#193)

* whisper : try to improve the token sampling strategy

- Add the "max_initial_timestamp" token logic from OpenAI
- Disallow sampling timestamps that are in the past

* whisper : fix the max initial timestamp logic + fallback decoding
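
The two sampling rules listed above can be pictured as masking logits before sampling. This is a sketch under assumptions, not the project's implementation: `mask_timestamps()` is a hypothetical function, and it assumes that timestamp tokens occupy the tail of the logits array starting at index `token_beg`, with `logits[token_beg + k]` standing for timestamp step `k`.

```cpp
#include <limits>
#include <vector>

// Hypothetical helper that suppresses timestamp tokens that may not be sampled.
//   token_beg       index of the first timestamp token (assumed layout)
//   last_ts         last timestamp step already emitted
//   is_initial      true when sampling the first timestamp of a window
//   max_initial_ts  largest timestamp step allowed at that first position
void mask_timestamps(std::vector<float> & logits,
                     int token_beg,
                     int last_ts,
                     bool is_initial,
                     int max_initial_ts) {
    const float neg_inf = -std::numeric_limits<float>::infinity();
    const int   n       = static_cast<int>(logits.size());

    // Disallow timestamps that lie in the past.
    for (int i = token_beg; i < n && i < token_beg + last_ts; ++i) {
        logits[i] = neg_inf;
    }

    // At the first position, disallow timestamps beyond the initial limit.
    if (is_initial) {
        for (int i = token_beg + max_initial_ts + 1; i < n; ++i) {
            logits[i] = neg_inf;
        }
    }
}
```

Setting a logit to negative infinity gives the token zero probability after softmax, so it can never be picked by either greedy or random sampling.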

2 years agotests : adding transcription tests
Georgi Gerganov [Mon, 28 Nov 2022 20:44:01 +0000 (22:44 +0200)]
tests : adding transcription tests

2 years agoggml : remove inline specifier from fp16 <-> fp32 converters
Georgi Gerganov [Thu, 1 Dec 2022 20:15:12 +0000 (22:15 +0200)]
ggml : remove inline specifier from fp16 <-> fp32 converters

2 years agolivestream : handle ffmpeg errors gracefully and stabilize transcript
Georgi Gerganov [Thu, 1 Dec 2022 18:49:09 +0000 (20:49 +0200)]
livestream : handle ffmpeg errors gracefully and stabilize transcript

2 years agolivestream : minor changes
Georgi Gerganov [Thu, 1 Dec 2022 17:47:58 +0000 (19:47 +0200)]
livestream : minor changes

2 years agolivestream : fix losing words across audio chunk (#195)
semiformal-net [Thu, 1 Dec 2022 17:18:22 +0000 (12:18 -0500)]
livestream : fix losing words across audio chunk (#195)

* improve livestream script

* Update examples/livestream.sh

Co-authored-by: Georgi Gerganov <redacted>
Co-authored-by: Paul Edwards <redacted>
2 years agoFix Darwin flags - was incorrectly always using the Linux else clause
Tienshiao Ma [Tue, 29 Nov 2022 07:29:34 +0000 (23:29 -0800)]
Fix Darwin flags - was incorrectly always using the Linux else clause

2 years agowhisper : add mechanism for aborting the whisper_full() computation
Georgi Gerganov [Sun, 27 Nov 2022 18:28:36 +0000 (20:28 +0200)]
whisper : add mechanism for aborting the whisper_full() computation

2 years agoUpdate README.md
Georgi Gerganov [Sun, 27 Nov 2022 09:30:32 +0000 (11:30 +0200)]
Update README.md

2 years agowhisper.objc : fix context + broken readme links
Georgi Gerganov [Sun, 27 Nov 2022 08:48:59 +0000 (10:48 +0200)]
whisper.objc : fix context + broken readme links

2 years agowhisper.objc : add real-time processing (#97)
Georgi Gerganov [Sat, 26 Nov 2022 15:28:28 +0000 (17:28 +0200)]
whisper.objc : add real-time processing (#97)

Similar to the "stream" app

2 years agowhisper.objc : fix build warnings
Georgi Gerganov [Sat, 26 Nov 2022 14:27:04 +0000 (16:27 +0200)]
whisper.objc : fix build warnings

2 years agominor : remove "examples/" prefix from the README
Georgi Gerganov [Sat, 26 Nov 2022 11:07:54 +0000 (13:07 +0200)]
minor : remove "examples/" prefix from the README

2 years agoyt-wsp.sh : script to easily transcribe VODs
Georgi Gerganov [Sat, 26 Nov 2022 10:53:23 +0000 (12:53 +0200)]
yt-wsp.sh : script to easily transcribe VODs

Thanks to @DaniruKun
ref: https://gist.github.com/DaniruKun/96f763ec1a037cc92fe1a059b643b818

Usage:

  cd whisper.cpp
  make

  ./examples/yt-wsp.sh <video-url>

2 years agoUpdate README.md
Georgi Gerganov [Sat, 26 Nov 2022 09:56:55 +0000 (11:56 +0200)]
Update README.md

2 years agocommand.wasm : add voice assistant example for the Web (#171)
Georgi Gerganov [Sat, 26 Nov 2022 09:40:06 +0000 (11:40 +0200)]
command.wasm : add voice assistant example for the Web (#171)

Same as the command-line tool "command", but runs in the browser

Also, added helper script "extra/deploy-wasm.sh" and fixed some timing
constants for the WASM examples.

2 years agominor : add comment for using "generate_karaoke.sh"
Georgi Gerganov [Sat, 26 Nov 2022 08:22:42 +0000 (10:22 +0200)]
minor : add comment for using "generate_karaoke.sh"

2 years agolivestream.sh : simple tool to transcribe audio livestreams (#185)
Georgi Gerganov [Sat, 26 Nov 2022 08:05:37 +0000 (10:05 +0200)]
livestream.sh : simple tool to transcribe audio livestreams (#185)

2 years agostream.wasm : add web-based real-time transcription (#112)
Georgi Gerganov [Fri, 25 Nov 2022 21:57:46 +0000 (23:57 +0200)]
stream.wasm : add web-based real-time transcription (#112)

2 years agowhisper.wasm : do not block page while processing (close #86)
Georgi Gerganov [Fri, 25 Nov 2022 21:07:42 +0000 (23:07 +0200)]
whisper.wasm : do not block page while processing (close #86)