git.djapps.eu Git - pkg/ggml/sources/whisper.cpp/log
2 years ago stream : add "max_tokens" parameter
Georgi Gerganov [Sun, 20 Nov 2022 18:52:24 +0000 (20:52 +0200)]
stream : add "max_tokens" parameter

Used to limit the number of tokens in a segment.
Useful for combating word repetition when using partial encoder context.

2 years ago stream : add "single_segment" option
Georgi Gerganov [Sun, 20 Nov 2022 18:45:10 +0000 (20:45 +0200)]
stream : add "single_segment" option

Force the entire audio chunk to be transcribed into a single segment

2 years ago stream : partial encoder experiments
Georgi Gerganov [Fri, 11 Nov 2022 20:33:10 +0000 (22:33 +0200)]
stream : partial encoder experiments

2 years ago fix: free ggml_context (close #149) (#150)
greeshmay [Thu, 17 Nov 2022 20:12:51 +0000 (12:12 -0800)]
fix: free ggml_context (close #149) (#150)

* fix: free ggml_context

* ggml : free the model's contexts in whisper_free()

Co-authored-by: Georgi Gerganov <redacted>
2 years ago models : simplify the conversion script
Georgi Gerganov [Wed, 16 Nov 2022 17:21:43 +0000 (19:21 +0200)]
models : simplify the conversion script

"transformers" dependency is not actually needed

2 years ago Update download-ggml-model.sh
Dody Suria Wijaya [Wed, 16 Nov 2022 16:53:01 +0000 (23:53 +0700)]
Update download-ggml-model.sh

follow curl redirect to new hosting site

2 years ago models : change default hosting to Hugging Face
Georgi Gerganov [Tue, 15 Nov 2022 17:47:06 +0000 (19:47 +0200)]
models : change default hosting to Hugging Face

My Linode is running out of monthly bandwidth due to the big interest in
the project

2 years ago whisper : add option to speed up the audio tempo by x2
Georgi Gerganov [Sat, 12 Nov 2022 16:03:49 +0000 (18:03 +0200)]
whisper : add option to speed up the audio tempo by x2

Using a Phase Vocoder for speeding up the audio tempo by scaling down
the frequencies in the frequency domain.

This reduces the computation in the Encoder by a factor of 2.
The transcription accuracy is degraded, but for slow to normal speech
it still seems to be very good.

I think this can find application for real-time transcription - i.e. the
"stream" example.

2 years ago make : add libwhisper.so target (#144)
Georgi Gerganov [Sun, 13 Nov 2022 07:08:33 +0000 (09:08 +0200)]
make : add libwhisper.so target (#144)

2 years ago Add WHISPER_NO_AVX and WHISPER_NO_AVX2 to CMakeLists (#136)
Chidi Williams [Fri, 11 Nov 2022 16:10:01 +0000 (16:10 +0000)]
Add WHISPER_NO_AVX and WHISPER_NO_AVX2 to CMakeLists (#136)

* Check for AVX and AVX2 on Darwin

* Add AVX options to CMakeLists

2 years ago minor : remove one more redundant line
Georgi Gerganov [Fri, 11 Nov 2022 16:02:58 +0000 (18:02 +0200)]
minor : remove one more redundant line

2 years ago minor : fix double float32 conversion in python script
Georgi Gerganov [Fri, 11 Nov 2022 15:58:51 +0000 (17:58 +0200)]
minor : fix double float32 conversion in python script

2 years ago ref #40 : start working on the documentation
Georgi Gerganov [Wed, 9 Nov 2022 19:41:21 +0000 (21:41 +0200)]
ref #40 : start working on the documentation

2 years ago Adds support for stdin wav input
Alan [Wed, 9 Nov 2022 18:24:06 +0000 (15:24 -0300)]
Adds support for stdin wav input

2 years ago js : update whisper.js to latest
Georgi Gerganov [Wed, 9 Nov 2022 17:32:58 +0000 (19:32 +0200)]
js : update whisper.js to latest

2 years ago Check for AVX and AVX2 on Darwin
Chidi Williams [Wed, 9 Nov 2022 00:28:36 +0000 (00:28 +0000)]
Check for AVX and AVX2 on Darwin

2 years ago Fix the Windows pthread_create shim
boolemancer [Tue, 8 Nov 2022 11:04:23 +0000 (03:04 -0800)]
Fix the Windows pthread_create shim

The current implementation doesn't actually set the out parameter,
and it returns 0 on failure instead of on success.

2 years ago sync : submodule whisper.spm
Georgi Gerganov [Mon, 7 Nov 2022 19:48:13 +0000 (21:48 +0200)]
sync : submodule whisper.spm

2 years ago cmake : add submodule whisper.spm
Georgi Gerganov [Mon, 7 Nov 2022 18:50:24 +0000 (20:50 +0200)]
cmake : add submodule whisper.spm

2 years ago ref #22 : add "duration" option
Georgi Gerganov [Mon, 7 Nov 2022 18:14:52 +0000 (20:14 +0200)]
ref #22 : add "duration" option

Can be used to partially process a recording
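
One way to picture a "duration" limit is as a slice over the decoded samples. The sketch below is a Python illustration only, not the code in whisper.cpp: the helper name `take_duration`, the `offset_ms` parameter, and the convention that a non-positive duration means "process everything" are all assumptions. The 16 kHz sample rate is the rate Whisper models operate on.

```python
SAMPLE_RATE = 16_000  # Whisper models operate on 16 kHz mono audio


def take_duration(samples, duration_ms, offset_ms=0):
    """Return the part of `samples` that starts at offset_ms and lasts duration_ms.

    Hypothetical helper: a non-positive duration_ms is treated as
    "no limit" (an assumption made for this sketch).
    """
    # Convert milliseconds to sample indices, clamped to the recording length.
    start = min(len(samples), offset_ms * SAMPLE_RATE // 1000)
    if duration_ms <= 0:
        return samples[start:]
    end = min(len(samples), start + duration_ms * SAMPLE_RATE // 1000)
    return samples[start:end]


# Three seconds of placeholder audio.
audio = list(range(3 * SAMPLE_RATE))
first_second = take_duration(audio, 1000)
tail = take_duration(audio, 1000, offset_ms=2500)
```

Clamping both ends keeps the slice valid when the requested window runs past the end of the recording.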

2 years ago Update README.md
Georgi Gerganov [Sun, 6 Nov 2022 19:04:21 +0000 (21:04 +0200)]
Update README.md

2 years ago examples : add simple script for generating Karaoke video
Georgi Gerganov [Sun, 6 Nov 2022 07:22:50 +0000 (09:22 +0200)]
examples : add simple script for generating Karaoke video

2 years ago Update README.md
Georgi Gerganov [Sat, 5 Nov 2022 06:44:41 +0000 (08:44 +0200)]
Update README.md

2 years ago Update README.md
Georgi Gerganov [Fri, 4 Nov 2022 20:26:08 +0000 (22:26 +0200)]
Update README.md

2 years ago main : fix generated bash script
Georgi Gerganov [Fri, 4 Nov 2022 16:30:38 +0000 (18:30 +0200)]
main : fix generated bash script

2 years ago ggml : multi-thread the ggml_add operator
Georgi Gerganov [Thu, 3 Nov 2022 18:53:44 +0000 (20:53 +0200)]
ggml : multi-thread the ggml_add operator

2 years ago cmake : fix passing GGML_PERF compile option
Georgi Gerganov [Thu, 3 Nov 2022 18:18:57 +0000 (20:18 +0200)]
cmake : fix passing GGML_PERF compile option

2 years ago Update README.md
Georgi Gerganov [Wed, 2 Nov 2022 20:03:27 +0000 (22:03 +0200)]
Update README.md

2 years ago whisper : token-level timestamp refactoring (#49, #120)
Georgi Gerganov [Wed, 2 Nov 2022 19:18:20 +0000 (21:18 +0200)]
whisper : token-level timestamp refactoring (#49, #120)

This turned out pretty well overall. The algorithm has been moved from
main.cpp to whisper.cpp and can be reused for all subtitle types. This
means that you can now specify the maximum length of the generated
lines. Simply provide the "-ml" argument specifying the max length in
number of characters.
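
A maximum line length of this kind can be pictured as greedy line filling over the decoded tokens. The Python sketch below is an illustration under stated assumptions, not the algorithm in whisper.cpp: the function name `split_by_max_len` is hypothetical, tokens are assumed to be plain strings with their leading spaces, and a non-positive limit is assumed to mean "no limit".

```python
def split_by_max_len(tokens, max_len):
    """Greedily group token strings into lines of at most max_len characters.

    A single token longer than max_len still gets a line of its own.
    A non-positive max_len means "no limit" (an assumption of this sketch).
    """
    if max_len <= 0:
        return ["".join(tokens)] if tokens else []

    lines = []
    current = ""
    for tok in tokens:
        # Start a new line when appending this token would exceed the limit.
        if current and len(current) + len(tok) > max_len:
            lines.append(current)
            current = tok
        else:
            current += tok
    if current:
        lines.append(current)
    return lines


tokens = [" And", " so", " my", " fellow", " Americans"]
lines = split_by_max_len(tokens, 10)
```

Splitting only at token boundaries keeps each generated line aligned with token-level timestamps, at the cost of lines that may be shorter than the limit.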

2 years ago Update README.md
Georgi Gerganov [Wed, 2 Nov 2022 16:33:29 +0000 (18:33 +0200)]
Update README.md

2 years ago extra : compute SHA of all model files
Georgi Gerganov [Wed, 2 Nov 2022 16:31:55 +0000 (18:31 +0200)]
extra : compute SHA of all model files

2 years ago whisper : fix extra memory usage after recent processor changes
Georgi Gerganov [Wed, 2 Nov 2022 16:31:18 +0000 (18:31 +0200)]
whisper : fix extra memory usage after recent processor changes

Had increased the memory buffer to the size of the model and forgot to
bring it down.

2 years ago Allow building with Accelerate for x86_64 Macs (#123)
Syed Jafri [Wed, 2 Nov 2022 16:00:19 +0000 (10:00 -0600)]
Allow building with Accelerate for x86_64 Macs (#123)

* Cross compile windows

* set env properly

* rm log

* fix review

* Add back space

* Don't force architecture

* Allow building x86_64 with accelerate

2 years ago ggml : fix the check for NEON support (#7)
Georgi Gerganov [Wed, 2 Nov 2022 15:52:24 +0000 (17:52 +0200)]
ggml : fix the check for NEON support (#7)

Was using the wrong preprocessor macro

2 years ago Cross compilation (#121)
Syed Jafri [Wed, 2 Nov 2022 06:46:49 +0000 (00:46 -0600)]
Cross compilation (#121)

* Cross compile windows

* set env properly

* rm log

* fix review

* Add back space

2 years ago Update README.md
Georgi Gerganov [Tue, 1 Nov 2022 20:47:58 +0000 (22:47 +0200)]
Update README.md

2 years ago main : add some comments for the word-level timestamp algorithm
Georgi Gerganov [Tue, 1 Nov 2022 20:35:21 +0000 (22:35 +0200)]
main : add some comments for the word-level timestamp algorithm

2 years ago main : fix some edge cases for word-level timestamps
Georgi Gerganov [Tue, 1 Nov 2022 20:09:25 +0000 (22:09 +0200)]
main : fix some edge cases for word-level timestamps

2 years ago Update README.md
Georgi Gerganov [Mon, 31 Oct 2022 20:06:05 +0000 (22:06 +0200)]
Update README.md

2 years ago Update README.md
Georgi Gerganov [Mon, 31 Oct 2022 18:19:41 +0000 (20:19 +0200)]
Update README.md

2 years ago Added download-ggml-model.cmd script implemented for Windows
Mikhail Grigorev [Sun, 30 Oct 2022 19:51:29 +0000 (00:51 +0500)]
Added download-ggml-model.cmd script implemented for Windows

2 years ago Fixed sched_yield
Mikhail Grigorev [Sun, 30 Oct 2022 17:19:24 +0000 (22:19 +0500)]
Fixed sched_yield

2 years ago Implemented sched_yield function for Windows
Mikhail Grigorev [Sun, 30 Oct 2022 10:29:27 +0000 (15:29 +0500)]
Implemented sched_yield function for Windows

2 years ago Update README.md
Georgi Gerganov [Sun, 30 Oct 2022 15:11:37 +0000 (17:11 +0200)]
Update README.md

2 years ago Update README.md
Georgi Gerganov [Sun, 30 Oct 2022 15:10:46 +0000 (17:10 +0200)]
Update README.md

2 years ago main : add option for word-level timestamps (very experimental)
Georgi Gerganov [Sun, 30 Oct 2022 08:05:58 +0000 (10:05 +0200)]
main : add option for word-level timestamps (very experimental)

2 years ago stream : add "--capture" option to select capture device (ref #10)
Georgi Gerganov [Sun, 30 Oct 2022 06:27:04 +0000 (08:27 +0200)]
stream : add "--capture" option to select capture device (ref #10)

2 years ago close #113 : fix struct whisper_token_data
Georgi Gerganov [Sun, 30 Oct 2022 06:23:52 +0000 (08:23 +0200)]
close #113 : fix struct whisper_token_data

2 years ago minor : update whisper.js
Georgi Gerganov [Sat, 29 Oct 2022 18:28:06 +0000 (21:28 +0300)]
minor : update whisper.js

2 years ago whisper.wasm : update system info print
Georgi Gerganov [Sat, 29 Oct 2022 17:30:05 +0000 (20:30 +0300)]
whisper.wasm : update system info print

2 years ago ref #5 : update CMake for Windows build
Georgi Gerganov [Sat, 29 Oct 2022 16:41:50 +0000 (19:41 +0300)]
ref #5 : update CMake for Windows build

- __AVX2__ should already be defined due to /arch:AVX2
- _CRT_SECURE_NO_WARNINGS should be defined both for shared and static lib

2 years ago minor : fix multiple definitions of to_timestamp()
Georgi Gerganov [Sat, 29 Oct 2022 11:14:23 +0000 (14:14 +0300)]
minor : fix multiple definitions of to_timestamp()

2 years ago parallel : print time of audio boundaries + fix timings
Georgi Gerganov [Sat, 29 Oct 2022 11:08:23 +0000 (14:08 +0300)]
parallel : print time of audio boundaries + fix timings

2 years ago ggml : fix barrier
Georgi Gerganov [Sat, 29 Oct 2022 09:38:29 +0000 (12:38 +0300)]
ggml : fix barrier

2 years ago main : merge parallel example in main
Georgi Gerganov [Sat, 29 Oct 2022 09:26:03 +0000 (12:26 +0300)]
main : merge parallel example in main

2 years ago parallel : working
Georgi Gerganov [Sat, 29 Oct 2022 09:24:02 +0000 (12:24 +0300)]
parallel : working

2 years ago ggml : fix thread-safety of ggml_init and ggml_free
Georgi Gerganov [Sat, 29 Oct 2022 08:23:44 +0000 (11:23 +0300)]
ggml : fix thread-safety of ggml_init and ggml_free

2 years ago main : fix sampling time + add max_context parameter
Georgi Gerganov [Sat, 29 Oct 2022 06:42:14 +0000 (09:42 +0300)]
main : fix sampling time + add max_context parameter

2 years ago parallel : adding tool for parallel transformer inference
Georgi Gerganov [Sat, 29 Oct 2022 06:27:08 +0000 (09:27 +0300)]
parallel : adding tool for parallel transformer inference

2 years ago Define WHISPER_BUILD so as to export symbols on Windows
Borislav Stanimirov [Sat, 29 Oct 2022 10:15:56 +0000 (13:15 +0300)]
Define WHISPER_BUILD so as to export symbols on Windows

2 years ago Update README.md
Georgi Gerganov [Fri, 28 Oct 2022 19:09:40 +0000 (22:09 +0300)]
Update README.md

2 years ago Update README.md
Georgi Gerganov [Fri, 28 Oct 2022 18:40:52 +0000 (21:40 +0300)]
Update README.md

2 years ago Create README.md
Georgi Gerganov [Fri, 28 Oct 2022 17:22:49 +0000 (20:22 +0300)]
Create README.md

2 years ago whisper.nvim : add helper script for the Neovim integration
Georgi Gerganov [Fri, 28 Oct 2022 16:57:09 +0000 (19:57 +0300)]
whisper.nvim : add helper script for the Neovim integration

2 years ago stream : few updates to make it compatible for Vim usage (#99)
Georgi Gerganov [Thu, 27 Oct 2022 19:10:50 +0000 (22:10 +0300)]
stream : few updates to make it compatible for Vim usage (#99)

2 years ago Add OpenBLAS support
Georgi Gerganov [Thu, 27 Oct 2022 15:31:49 +0000 (18:31 +0300)]
Add OpenBLAS support

Supported via CMake - just add:

cmake .. -DWHISPER_SUPPORT_OPENBLAS=ON

On Ubuntu, you have to install the library like this:

apt install libopenblas-dev

Unfortunately, I don't observe any benefit compared to the
original AVX2 + FP16 implementation. Maybe I'm missing something

2 years ago Print system info at start of program
Georgi Gerganov [Thu, 27 Oct 2022 14:22:10 +0000 (17:22 +0300)]
Print system info at start of program

2 years ago Fixed compile definitions and link libraries for MSVC
Mikhail Grigorev [Thu, 27 Oct 2022 09:59:02 +0000 (14:59 +0500)]
Fixed compile definitions and link libraries for MSVC

2 years ago Add helper script to benchmark all models
Georgi Gerganov [Wed, 26 Oct 2022 20:19:58 +0000 (23:19 +0300)]
Add helper script to benchmark all models

Simply run:

$ ./extra/bench-all.sh

2 years ago Print system info in main
Georgi Gerganov [Wed, 26 Oct 2022 19:54:09 +0000 (22:54 +0300)]
Print system info in main

2 years ago Create README.md
Georgi Gerganov [Wed, 26 Oct 2022 15:14:10 +0000 (18:14 +0300)]
Create README.md

2 years ago Changes to work by default on macOS - use curl when wget is not available, and use...
andypayne [Wed, 26 Oct 2022 00:35:11 +0000 (17:35 -0700)]
Changes to work by default on macOS - use curl when wget is not available, and use an alternative method to get the script path when realpath is not available.

2 years ago Update README.md
Georgi Gerganov [Tue, 25 Oct 2022 17:51:56 +0000 (20:51 +0300)]
Update README.md

2 years ago Update README.md
Georgi Gerganov [Tue, 25 Oct 2022 17:47:31 +0000 (20:47 +0300)]
Update README.md

2 years ago Update README.md
Georgi Gerganov [Tue, 25 Oct 2022 17:43:10 +0000 (20:43 +0300)]
Update README.md

2 years ago Update README.md
Georgi Gerganov [Tue, 25 Oct 2022 17:28:47 +0000 (20:28 +0300)]
Update README.md

2 years ago Update README.md
Georgi Gerganov [Tue, 25 Oct 2022 17:25:23 +0000 (20:25 +0300)]
Update README.md

2 years ago Update README.md
Georgi Gerganov [Tue, 25 Oct 2022 17:23:39 +0000 (20:23 +0300)]
Update README.md

2 years ago ggml : add system info functions
Georgi Gerganov [Tue, 25 Oct 2022 17:18:26 +0000 (20:18 +0300)]
ggml : add system info functions

2 years ago refactoring : move main + stream in examples + other stuff
Georgi Gerganov [Tue, 25 Oct 2022 16:13:08 +0000 (19:13 +0300)]
refactoring : move main + stream in examples + other stuff

2 years ago main : fix SRT timestamp to use comma "," instead of dot "."
Georgi Gerganov [Mon, 24 Oct 2022 15:28:23 +0000 (18:28 +0300)]
main : fix SRT timestamp to use comma "," instead of dot "."
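
For context, SRT timestamps take the form HH:MM:SS,mmm, with a comma before the milliseconds, whereas WebVTT uses a dot in that position. The Python sketch below only illustrates the format; the function name `to_srt_timestamp` and the millisecond input are assumptions, and it is not the project's own timestamp code.

```python
def to_srt_timestamp(ms: int) -> str:
    """Format a non-negative millisecond count as an SRT timestamp (HH:MM:SS,mmm)."""
    if ms < 0:
        raise ValueError("timestamp must be non-negative")
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1_000)
    # SRT separates seconds and milliseconds with a comma, not a dot.
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


# One cue line: start and end joined by the SRT arrow separator.
cue = f"{to_srt_timestamp(0)} --> {to_srt_timestamp(3_723_004)}"
```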

2 years ago Update README.md
Georgi Gerganov [Mon, 24 Oct 2022 15:26:21 +0000 (18:26 +0300)]
Update README.md

2 years ago Update README.md
Georgi Gerganov [Sun, 23 Oct 2022 09:51:09 +0000 (12:51 +0300)]
Update README.md

2 years ago objc : polishing the sample application
Georgi Gerganov [Sun, 23 Oct 2022 09:24:03 +0000 (12:24 +0300)]
objc : polishing the sample application

2 years ago Create README.md
Georgi Gerganov [Sun, 23 Oct 2022 08:36:36 +0000 (11:36 +0300)]
Create README.md

2 years ago ios : whisper.objc example
Georgi Gerganov [Sun, 23 Oct 2022 08:10:15 +0000 (11:10 +0300)]
ios : whisper.objc example

2 years ago ref #68, #79 : fix segment time output
Georgi Gerganov [Sun, 23 Oct 2022 10:29:36 +0000 (13:29 +0300)]
ref #68, #79 : fix segment time output

2 years ago Update README.md
Georgi Gerganov [Sun, 23 Oct 2022 09:47:51 +0000 (12:47 +0300)]
Update README.md

2 years ago Merge pull request #78 from jokkebk/Specify-utf8-for-vocab.json
Georgi Gerganov [Sun, 23 Oct 2022 09:23:04 +0000 (12:23 +0300)]
Merge pull request #78 from jokkebk/Specify-utf8-for-vocab.json

Add encoding parameter to vocab.json opening to fix errors

2 years ago Add encoding parameter to vocab.json opening to fix errors
Joonas Pihlajamaa [Sun, 23 Oct 2022 08:55:01 +0000 (11:55 +0300)]
Add encoding parameter to vocab.json opening to fix errors
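
The kind of error this addresses typically appears when `open()` is called without an explicit encoding: Python then falls back to the platform's locale encoding, which on Windows is often not UTF-8, and non-ASCII vocabulary entries fail to decode. The sketch below shows the general pattern only. The helper name `load_vocab` and the sample entries are assumptions, and this is not the project's conversion script.

```python
import json
import os
import tempfile


def load_vocab(path):
    """Load a JSON vocabulary, always decoding the file as UTF-8."""
    # Passing encoding explicitly avoids depending on the locale default.
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


# Round-trip a small vocabulary containing non-ASCII entries.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "vocab.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump({"Ġcafé": 0, "日本": 1}, f, ensure_ascii=False)
    vocab = load_vocab(path)
```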

2 years ago Update README.md
Georgi Gerganov [Sun, 23 Oct 2022 07:24:36 +0000 (10:24 +0300)]
Update README.md

2 years ago Update README.md
Georgi Gerganov [Sun, 23 Oct 2022 07:12:10 +0000 (10:12 +0300)]
Update README.md

2 years ago Update README.md
Georgi Gerganov [Sun, 23 Oct 2022 05:04:33 +0000 (08:04 +0300)]
Update README.md

2 years ago Update README.md
Georgi Gerganov [Sat, 22 Oct 2022 18:16:08 +0000 (21:16 +0300)]
Update README.md

2 years ago main : print colors + no timestamps
Georgi Gerganov [Sat, 22 Oct 2022 18:09:30 +0000 (21:09 +0300)]
main : print colors + no timestamps

2 years ago whisper : add new-segment callback
Georgi Gerganov [Sat, 22 Oct 2022 18:06:50 +0000 (21:06 +0300)]
whisper : add new-segment callback

Can be used to process new segments as they are being generated.
Sample usage in main, for printing the resulting segments during the
inference.
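
The callback pattern described above can be sketched as follows. This is a Python illustration only and not the whisper.cpp API: `transcribe`, `on_new_segment`, and the `(start_ms, end_ms, text)` segment tuple are hypothetical, and the "decoder" is a stand-in that simply yields prepared segments.

```python
from typing import Callable, Iterable, List, Optional, Tuple

# Hypothetical segment shape for this sketch: (start_ms, end_ms, text).
Segment = Tuple[int, int, str]


def transcribe(
    decoded: Iterable[Segment],
    on_new_segment: Optional[Callable[[Segment], None]] = None,
) -> List[Segment]:
    """Collect segments, notifying the caller as each one becomes available."""
    segments: List[Segment] = []
    for seg in decoded:  # stand-in for the real decoding loop
        segments.append(seg)
        if on_new_segment is not None:
            # Invoked during inference, before the full result exists.
            on_new_segment(seg)
    return segments


printed: List[str] = []
result = transcribe(
    [(0, 1200, " Hello"), (1200, 2500, " world")],
    on_new_segment=lambda seg: printed.append(seg[2]),
)
```

Calling the hook inside the loop is what lets a caller print or stream partial output without waiting for the whole recording to finish.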

2 years ago main : refactor subtitle output
Georgi Gerganov [Sat, 22 Oct 2022 17:42:11 +0000 (20:42 +0300)]
main : refactor subtitle output

2 years ago wip : experimental color coding of tokens based on probabilities
Georgi Gerganov [Fri, 21 Oct 2022 14:33:59 +0000 (17:33 +0300)]
wip : experimental color coding of tokens based on probabilities

2 years ago Update README.md
Georgi Gerganov [Sat, 22 Oct 2022 16:30:35 +0000 (19:30 +0300)]
Update README.md

2 years ago Update README.md
Georgi Gerganov [Sat, 22 Oct 2022 16:00:25 +0000 (19:00 +0300)]
Update README.md