git.djapps.eu Git - pkg/ggml/sources/whisper.cpp/log
pkg/ggml/sources/whisper.cpp
Topping1 [Wed, 12 Oct 2022 04:32:14 +0000 (23:32 -0500)]
Update README.md (#43)

* Update README.md

Updated README.md to list new features, such as subtitle file support (VTT and SRT)

* Update README.md

Co-authored-by: Georgi Gerganov <redacted>
Georgi Gerganov [Wed, 12 Oct 2022 04:31:41 +0000 (07:31 +0300)]
Merge pull request #42 from iboB/msvc-build

ref #5 : MSVC build

Borislav Stanimirov [Tue, 11 Oct 2022 17:57:52 +0000 (20:57 +0300)]
Building with MSVC

Borislav Stanimirov [Tue, 11 Oct 2022 17:57:33 +0000 (20:57 +0300)]
Visual Studio ignored dirs

Georgi Gerganov [Mon, 10 Oct 2022 21:36:32 +0000 (00:36 +0300)]
Update README.md

Georgi Gerganov [Mon, 10 Oct 2022 19:16:25 +0000 (22:16 +0300)]
Update README.md

Georgi Gerganov [Mon, 10 Oct 2022 19:06:18 +0000 (22:06 +0300)]
stream : improve real-time transcription

Georgi Gerganov [Mon, 10 Oct 2022 19:06:03 +0000 (22:06 +0300)]
Minor

Georgi Gerganov [Mon, 10 Oct 2022 19:05:37 +0000 (22:05 +0300)]
Update README.md

Georgi Gerganov [Mon, 10 Oct 2022 06:13:31 +0000 (09:13 +0300)]
Merge pull request #36 from Topping1/master

Fix SRT timestamp format from mm:ss.sss to hh:mm:ss.sss
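
A minimal sketch of the hh:mm:ss.sss formatting this fix describes, assuming the time is held as a millisecond count. The helper name `format_timestamp` is hypothetical and is not taken from the project. SRT files conventionally put a comma before the milliseconds; the sketch follows the wording of the commit message instead.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper (not the project's function): formats a duration given
// in milliseconds as hh:mm:ss.sss, so times past one hour keep an hour field.
std::string format_timestamp(int64_t ms) {
    assert(ms >= 0);
    const long long hh  = ms / 3600000;       // whole hours
    const long long mm  = (ms / 60000) % 60;  // minutes within the hour
    const long long ss  = (ms / 1000) % 60;   // seconds within the minute
    const long long mmm = ms % 1000;          // milliseconds within the second
    char buf[32];
    std::snprintf(buf, sizeof(buf), "%02lld:%02lld:%02lld.%03lld", hh, mm, ss, mmm);
    return std::string(buf);
}
```

For example, `format_timestamp(3723004)` yields `01:02:03.004`, whereas a mm:ss.sss layout has no hour field to carry the first 60 minutes.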

Georgi Gerganov [Mon, 10 Oct 2022 05:11:18 +0000 (08:11 +0300)]
ref #35 : add <stdbool.h> to whisper.h

"bool" type is not implicitly defined for some compilers.

Georgi Gerganov [Mon, 10 Oct 2022 05:05:57 +0000 (08:05 +0300)]
Merge pull request #34 from tazz4843/master

Add static library make target

Topping1 [Mon, 10 Oct 2022 04:35:10 +0000 (23:35 -0500)]
Update main.cpp

0/0 [Mon, 10 Oct 2022 01:16:42 +0000 (19:16 -0600)]
add static library make target

Georgi Gerganov [Sun, 9 Oct 2022 14:52:46 +0000 (17:52 +0300)]
Merge pull request #31 from lkwq007/master

Add MinGW support

lnyan [Sun, 9 Oct 2022 14:26:37 +0000 (22:26 +0800)]
Add MinGW support

Georgi Gerganov [Sat, 8 Oct 2022 15:13:26 +0000 (18:13 +0300)]
Minor

Georgi Gerganov [Sat, 8 Oct 2022 15:09:56 +0000 (18:09 +0300)]
ref #9 : add API documentation in whisper.h

Georgi Gerganov [Sat, 8 Oct 2022 14:35:55 +0000 (17:35 +0300)]
Fix Makefile for MacBook Intel

Georgi Gerganov [Sat, 8 Oct 2022 14:28:06 +0000 (17:28 +0300)]
ref #17 : print whisper logs to stderr

Only the transcribed/translated text is printed to stdout.
This way, one can redirect the result to a file.
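
The split between the two streams can be sketched as follows. This is an illustration only: `print_result` is a hypothetical name, and the streams are passed in as parameters so the routing is explicit.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helper: diagnostics go to `log` (stderr in practice), the
// result text goes to `out` (stdout in practice).
// Returns the number of characters written to `out`.
int print_result(std::FILE * out, std::FILE * log, const char * text) {
    assert(out != nullptr && log != nullptr && text != nullptr);
    std::fprintf(log, "log: emitting one segment\n");  // never reaches `out`
    return std::fprintf(out, "%s\n", text);            // the only data on `out`
}
```

Called as `print_result(stdout, stderr, "hello")`, the shell redirection `> result.txt` captures only the text line, while the log line still appears on the terminal.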

Georgi Gerganov [Sat, 8 Oct 2022 14:22:22 +0000 (17:22 +0300)]
ref #17 : add options to output result to file

Support for:

- plain text
- VTT
- SRT

Georgi Gerganov [Sat, 8 Oct 2022 08:46:34 +0000 (11:46 +0300)]
Update README.md

Georgi Gerganov [Sat, 8 Oct 2022 08:34:20 +0000 (11:34 +0300)]
Update tests

Georgi Gerganov [Sat, 8 Oct 2022 08:17:41 +0000 (11:17 +0300)]
ci : add base model tests to GH Actions

Georgi Gerganov [Sat, 8 Oct 2022 08:17:29 +0000 (11:17 +0300)]
Update README.md

Georgi Gerganov [Sat, 8 Oct 2022 08:16:37 +0000 (11:16 +0300)]
Create README.md

Georgi Gerganov [Sat, 8 Oct 2022 07:57:42 +0000 (10:57 +0300)]
Adding dummy models for testing purposes

Georgi Gerganov [Sat, 8 Oct 2022 07:56:59 +0000 (10:56 +0300)]
Adding sanitizer tests

Georgi Gerganov [Sat, 8 Oct 2022 06:00:59 +0000 (09:00 +0300)]
Cleanup CMakeLists.txt

Georgi Gerganov [Fri, 7 Oct 2022 21:21:16 +0000 (00:21 +0300)]
cmake : fixes

Georgi Gerganov [Fri, 7 Oct 2022 21:14:34 +0000 (00:14 +0300)]
ci : add cmake builds

Georgi Gerganov [Fri, 7 Oct 2022 21:14:05 +0000 (00:14 +0300)]
whisper : fix bug in token sampling logic

Could overflow buffer

Georgi Gerganov [Fri, 7 Oct 2022 20:53:12 +0000 (23:53 +0300)]
Add CMake support

Georgi Gerganov [Fri, 7 Oct 2022 19:30:44 +0000 (22:30 +0300)]
ref #10 : option to keep context in "stream" example

The results seem to get worse when the context is kept, so this option is
disabled by default.

Georgi Gerganov [Fri, 7 Oct 2022 19:07:24 +0000 (22:07 +0300)]
ref #10 : add "step" argument for "stream" example

Controls how often we run the inference.
By default, we run it every 3 seconds.

Georgi Gerganov [Fri, 7 Oct 2022 19:00:40 +0000 (22:00 +0300)]
ref #16, #22 : add "offset" argument

Allows processing of the input audio to start at some offset from the
beginning. Useful for splitting a long job into multiple tasks.
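
One way to plan such a split is sketched below, assuming the total audio length is known in milliseconds. The `Task` struct, the `split_job` name, and the per-task duration field are hypothetical; the commit itself only mentions an offset.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical description of one task: where it starts and how much audio
// it covers. Only the offset comes from the commit message.
struct Task {
    int64_t offset_ms;
    int64_t duration_ms;
};

// Splits total_ms of audio into at most n_tasks contiguous pieces.
std::vector<Task> split_job(int64_t total_ms, int n_tasks) {
    assert(total_ms >= 0 && n_tasks > 0);
    std::vector<Task> tasks;
    const int64_t chunk = (total_ms + n_tasks - 1) / n_tasks;  // ceiling division
    for (int64_t off = 0; off < total_ms; off += chunk) {
        tasks.push_back({off, std::min(chunk, total_ms - off)});
    }
    return tasks;
}
```

Each resulting `offset_ms` would then be handed to a separate run as its starting offset.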

Georgi Gerganov [Fri, 7 Oct 2022 18:56:44 +0000 (21:56 +0300)]
ref #11, #18, #26 : fix CACHE_LINE_SIZE constant

Georgi Gerganov [Fri, 7 Oct 2022 15:32:18 +0000 (18:32 +0300)]
Add CI using Github Actions

Georgi Gerganov [Wed, 5 Oct 2022 20:44:10 +0000 (23:44 +0300)]
ref #22 : add option to provide multiple input .wav files

Georgi Gerganov [Wed, 5 Oct 2022 20:13:15 +0000 (23:13 +0300)]
Update README.md

Georgi Gerganov [Wed, 5 Oct 2022 20:11:02 +0000 (23:11 +0300)]
Minor updates

Georgi Gerganov [Wed, 5 Oct 2022 18:34:41 +0000 (21:34 +0300)]
wip : rpi4 support

Georgi Gerganov [Wed, 5 Oct 2022 17:41:35 +0000 (20:41 +0300)]
wip : improve makefile

Georgi Gerganov [Wed, 5 Oct 2022 04:27:29 +0000 (07:27 +0300)]
Merge pull request #20 from ArtyomZemlyak/master

Fix: main get language from cli args

Артём Земляк [Wed, 5 Oct 2022 02:47:48 +0000 (09:47 +0700)]
Fix: main get n_threads from cli

Артём Земляк [Wed, 5 Oct 2022 02:24:53 +0000 (09:24 +0700)]
Fix: main get language from cli args

Georgi Gerganov [Tue, 4 Oct 2022 20:27:25 +0000 (23:27 +0300)]
Update README.md

Georgi Gerganov [Tue, 4 Oct 2022 20:16:33 +0000 (23:16 +0300)]
Improve result printing

Georgi Gerganov [Tue, 4 Oct 2022 19:43:37 +0000 (22:43 +0300)]
Extend C-style API with full inference methods

Georgi Gerganov [Tue, 4 Oct 2022 17:35:01 +0000 (20:35 +0300)]
Initial C-style interface for whisper.cpp

Georgi Gerganov [Sun, 2 Oct 2022 17:11:17 +0000 (20:11 +0300)]
ref #10 : handle Ctrl+C in "stream" app

Georgi Gerganov [Sun, 2 Oct 2022 15:19:22 +0000 (18:19 +0300)]
Update README.md

Georgi Gerganov [Sun, 2 Oct 2022 14:55:45 +0000 (17:55 +0300)]
ref #10 : quick-and-dirty attempt for real-time audio transcription

- Processes input in chunks of 3 seconds.
- Padding audio with silence
- Uses 1 second audio from previous pass
- No text context
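
The windowing described in the list above can be sketched roughly as follows. All names are hypothetical and the sizes are parameters: at an assumed 16 kHz sample rate, a 3-second chunk would be 48000 samples and the 1 second carried over would be 16000. The padded length is also an assumption, since the commit does not state it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: builds the input for one inference pass from the tail
// of the previous pass plus the new chunk, padded with silence (zeros) up to
// n_total samples.
std::vector<float> build_window(const std::vector<float> & prev,
                                const std::vector<float> & chunk,
                                std::size_t n_keep,
                                std::size_t n_total) {
    const std::size_t n_take = std::min(n_keep, prev.size());
    assert(n_take + chunk.size() <= n_total);
    std::vector<float> window;
    window.reserve(n_total);
    // last n_take samples of the previous pass
    window.insert(window.end(), prev.end() - static_cast<std::ptrdiff_t>(n_take), prev.end());
    // the newly captured chunk
    window.insert(window.end(), chunk.begin(), chunk.end());
    // pad the remainder with silence
    window.resize(n_total, 0.0f);
    return window;
}
```

No text context is carried between passes in this sketch, matching the last item of the list.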

Georgi Gerganov [Sun, 2 Oct 2022 14:46:21 +0000 (17:46 +0300)]
Fix bug in FFT

The FFT routine does not work for odd N.
The solution is to add a DFT and use it when N is odd.
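
A minimal sketch of the fallback idea, not the project's actual code: a recursive radix-2 FFT halves the input at each level, which only works while the length is even, so odd lengths are handed to a naive O(N^2) DFT. The names `cvec`, `dft`, `fft`, and `max_abs_diff` are chosen here for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cvec = std::vector<std::complex<double>>;

// Naive O(N^2) discrete Fourier transform; valid for any N.
cvec dft(const cvec & in) {
    const std::size_t N = in.size();
    const double pi = std::acos(-1.0);
    cvec out(N);
    for (std::size_t k = 0; k < N; ++k) {
        std::complex<double> sum(0.0, 0.0);
        for (std::size_t n = 0; n < N; ++n) {
            const double angle = -2.0 * pi * double(k) * double(n) / double(N);
            sum += in[n] * std::polar(1.0, angle);
        }
        out[k] = sum;
    }
    return out;
}

// Recursive radix-2 FFT that falls back to dft() as soon as N is odd.
cvec fft(const cvec & in) {
    const std::size_t N = in.size();
    if (N <= 1) return in;
    if (N % 2 == 1) return dft(in);  // cannot split evenly: use the DFT
    cvec even(N / 2), odd(N / 2);
    for (std::size_t i = 0; i < N / 2; ++i) {
        even[i] = in[2 * i];
        odd[i]  = in[2 * i + 1];
    }
    const cvec E = fft(even);
    const cvec O = fft(odd);
    const double pi = std::acos(-1.0);
    cvec out(N);
    for (std::size_t k = 0; k < N / 2; ++k) {
        const std::complex<double> t = std::polar(1.0, -2.0 * pi * double(k) / double(N)) * O[k];
        out[k]         = E[k] + t;
        out[k + N / 2] = E[k] - t;
    }
    return out;
}

// Largest element-wise difference between two spectra, for sanity checks.
double max_abs_diff(const cvec & a, const cvec & b) {
    assert(a.size() == b.size());
    double m = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) m = std::max(m, std::abs(a[i] - b[i]));
    return m;
}
```

For a length such as 6, `fft` splits once into two halves of length 3 and then uses `dft` on each; comparing `fft(x)` against `dft(x)` with `max_abs_diff` is a quick way to check the two agree.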

Georgi Gerganov [Sat, 1 Oct 2022 05:41:57 +0000 (08:41 +0300)]
Fix reading of stereo WAV files

Georgi Gerganov [Fri, 30 Sep 2022 21:01:04 +0000 (00:01 +0300)]
Update README.md

Georgi Gerganov [Fri, 30 Sep 2022 17:37:29 +0000 (20:37 +0300)]
Bug fix

Longer prompts could cause out-of-bounds access

Georgi Gerganov [Fri, 30 Sep 2022 16:33:09 +0000 (19:33 +0300)]
Reduce memory usage even more + better sampling

- The encode/decode memory buffers are now reused
- If the 30-sec segment goes on for too long without a timestamp token, we
  force one. This improves transcription for the large model
- Stereo support
- Add "micro-machines.wav" sample

Georgi Gerganov [Thu, 29 Sep 2022 20:48:01 +0000 (23:48 +0300)]
Update README.md

Georgi Gerganov [Thu, 29 Sep 2022 20:37:59 +0000 (23:37 +0300)]
Update README.md

Georgi Gerganov [Thu, 29 Sep 2022 20:09:04 +0000 (23:09 +0300)]
ref #4 : added transcription timestamps

Can be turned off with "-nt" argument.
Performance has also improved.

Georgi Gerganov [Wed, 28 Sep 2022 19:06:09 +0000 (22:06 +0300)]
Merge pull request #3 from cdosoftei/master

Pass -pthread to linker

cdosoftei [Wed, 28 Sep 2022 19:01:54 +0000 (15:01 -0400)]
Pass -pthread to linker

Georgi Gerganov [Wed, 28 Sep 2022 18:13:32 +0000 (21:13 +0300)]
Update README.md

Georgi Gerganov [Wed, 28 Sep 2022 17:46:05 +0000 (20:46 +0300)]
Flash + language support (ref #2)

- Achieved big performance improvement + memory usage reduction
- Can now translate / transcribe different languages

Georgi Gerganov [Mon, 26 Sep 2022 08:58:44 +0000 (11:58 +0300)]
ref #1 : add -pthread to compilation flags

Georgi Gerganov [Mon, 26 Sep 2022 06:36:51 +0000 (09:36 +0300)]
Update README.md and simplify usage

Georgi Gerganov [Sun, 25 Sep 2022 19:35:26 +0000 (22:35 +0300)]
Create README.md

Georgi Gerganov [Sun, 25 Sep 2022 19:15:44 +0000 (22:15 +0300)]
Create LICENSE

Georgi Gerganov [Sun, 25 Sep 2022 18:23:15 +0000 (21:23 +0300)]
Initial release