intrinsics or CBLAS Accelerate framework routines are used. The latter are especially effective for larger sizes since
the Accelerate framework utilizes the special-purpose AMX coprocessor available in modern Apple products.
-## Limitations
-
-- Inference only
-- No GPU support
-- Very basic greedy sampling scheme - always pick up the token with highest probability.
- This should be similar to the [GreedyDecoder](https://github.com/openai/whisper/blob/main/whisper/decoding.py#L249-L274)
- from the original python implementation, so in order to make a fair comparison between the 2 implementations, make sure
- to run the python code with the following parameters:
-
- ```
- whisper --best_of None --beam_size None ...
- ```
-
- In the future, `whisper.cpp` will support more sampling strategies.
-
## Quick start
First, download one of the Whisper models converted to [ggml format](models). For example:

| Model  | Disk   | Mem     | SHA                                        |
| ---    | ---    | ---     | ---                                        |
| medium | 1.5 GB | ~2.6 GB | `fd9727b6e1217c2f614f9b698455c4ffd82463b4` |
| large  | 2.9 GB | ~4.7 GB | `0f4c8e34f21cf1a914c59d8b3ce882345ad349d6` |
+## Limitations
+
+- Inference only
+- No GPU support
+- Very basic greedy sampling scheme - always picks the token with the highest probability.
+ This should be similar to the [GreedyDecoder](https://github.com/openai/whisper/blob/main/whisper/decoding.py#L249-L274)
+ from the original Python implementation, so in order to make a fair comparison between the two implementations, make sure
+ to run the Python code with the following parameters:
+
+ ```
+ whisper --best_of None --beam_size None ...
+ ```
+
+ In the future, `whisper.cpp` will support more sampling strategies.
+
## Another example
Here is another example of transcribing a [3:24 min speech](https://upload.wikimedia.org/wikipedia/commons/1/1f/George_W_Bush_Columbia_FINAL.ogg)