Minor updates

author Georgi Gerganov <redacted>

Wed, 5 Oct 2022 20:11:02 +0000 (23:11 +0300)

committer Georgi Gerganov <redacted>

Wed, 5 Oct 2022 20:11:02 +0000 (23:11 +0300)
diff --git a/README.md b/README.md
index e570bcd7df0a3104728543ecb35fddf59632d339..f25d9b72448c3c89ce4cc575fd53b8ca576a1f0e 100644
--- a/README.md
+++ b/README.md
@@ -7,13 +7,12 @@ High-performance inference of [OpenAI's Whisper](https://github.com/openai/whisp
  - Mixed F16 / F32 precision
  - Low memory usage (Flash Attention + Flash Forward)
  - Zero memory allocations at runtime
-- Runs on the CPU (Mac and Linux)
+- Runs on the CPU
  - [C-style API](https://github.com/ggerganov/whisper.cpp/blob/master/whisper.h)
+- Supported platforms: Linux, Mac OS (Intel and Arm), Raspberry Pi, Android
  
  Incoming features:
  - [Realtime audio input transcription](https://github.com/ggerganov/whisper.cpp/issues/10#issuecomment-1264665959)
-- [Raspberry Pi support](https://github.com/ggerganov/whisper.cpp/issues/7)
-- [Android support](https://github.com/ggerganov/whisper.cpp/issues/8)
  
  ## Usage
  
@@ -220,10 +219,16 @@ $ ./stream -m models/ggml-small.en.bin -t 8
  
  https://user-images.githubusercontent.com/1991296/193465125-c163d304-64f6-4f5d-83e5-72239c9a203e.mp4
  
+## Implementation details
+
+- The core tensor operations are implemented in C (`ggml.h` / `ggml.c`)
+- The high-level C-style API is implemented in C++ (`whisper.h` / `whisper.cpp`)
+- Simple usage is demonstrated in `main.cpp`
+- Sample real-time audio transcription from the microphone is demonstrated in `stream.cpp`
+
  ## Limitations
  
-- Very basic greedy sampling scheme - always pick up the top token
-- Only 16-bit WAV at 16 kHz is supported
+- Very basic greedy sampling scheme - always pick up the top token. You can implement your own strategy
  - Inference only
  - No GPU support
  
diff --git a/stream.cpp b/stream.cpp
index e9d0364bf3b256aaa47bf389b126a26ed69ad7a2..1f84d667c060f710f71f44ebd6deeca973d12215 100644
--- a/stream.cpp
+++ b/stream.cpp
@@ -265,6 +265,11 @@ int main(int argc, char ** argv) {
  
              wparams.print_progress       = false;
              wparams.print_special_tokens = params.print_special_tokens;
+            wparams.print_realtime       = false;
+            wparams.print_timestamps     = !params.no_timestamps;
+            wparams.translate            = params.translate;
+            wparams.language             = params.language.c_str();
+            wparams.n_threads            = params.n_threads;
  
              if (whisper_full(ctx, wparams, pcmf32.data(), pcmf32.size()) != 0) {
                  fprintf(stderr, "%s: failed to process audio\n", argv[0]);
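
The README hunk above describes the sampling scheme as greedy: always pick the top token. As a rough illustration of that idea only, here is a minimal sketch that assumes the model's scores for one decoding step are available as a `std::vector<float>` of logits. The name `greedy_pick` is hypothetical and is not part of `whisper.h`; how whisper.cpp stores and selects tokens internally is not shown in this commit.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of whisper.cpp: returns the index of the
// largest logit, i.e. the "top token" under greedy sampling.
// Returns -1 for an empty input. Ties resolve to the lowest index.
int greedy_pick(const std::vector<float> & logits) {
    if (logits.empty()) {
        return -1;
    }

    std::size_t best = 0;
    for (std::size_t i = 1; i < logits.size(); ++i) {
        // Strict comparison keeps the first maximum on ties.
        if (logits[i] > logits[best]) {
            best = i;
        }
    }

    return static_cast<int>(best);
}
```

A custom strategy, as the updated README line suggests, would replace this selection step with something else (for example, sampling from the distribution instead of taking the maximum).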