How to set up Whisper from OpenAI

I was researching a subject and came across a podcast that gave a great overview and then dove deep into specific topics in later episodes. I wanted to read the transcripts and take notes, but the podcast didn’t provide any. I researched different ways to generate transcripts myself, and the tool that rose to the top was Whisper. Whisper is an automatic speech recognition (ASR) system that transcribes audio in multiple languages and can also translate from those languages into English. It is an open source project provided by OpenAI. This is how I got Whisper working on my Windows 11 laptop.

Rather than install Python and build the application myself, I used the standalone executables that Purfview provides on GitHub in the whisper-standalone-win project. whisper-standalone-win uses the faster-whisper implementation of OpenAI’s Whisper model. This implementation uses the CTranslate2 library and is up to 4 times faster than OpenAI’s original implementation while using less memory.

Here are the installation steps:

  1. Install Microsoft Visual C++ Redistributable package
  2. Install Visual Studio 2022. This step may not be required, but the NVIDIA CUDA Toolkit installer warned me that Visual Studio was not detected and recommended installing Visual Studio before the CUDA Toolkit.
  3. Install NVIDIA CUDA Toolkit
  4. Download and extract NVIDIA CUDA Deep Neural Network library (cuDNN)
  5. Copy the files from the cuDNN bin folder to C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\bin. If you installed a different version of the CUDA Toolkit, the version folder in that path will differ, so you may need to navigate there manually instead of pasting the exact path.
  6. This step may no longer be required. When I attempted to run the whisper executable, I got the following error: “RuntimeError: Library cublas64_11.dll is not found or cannot be loaded”. This happened because the executable was compiled against version 11 of the CUDA Toolkit and I had installed version 12. To fix this, I made a copy of C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\bin\cublas64_12.dll and renamed the copy to cublas64_11.dll. The bin folder now contains both cublas64_12.dll and cublas64_11.dll.
  7. Download and extract whisper-standalone-win
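
The copy-and-rename workaround from step 6 can also be scripted. The following is only a minimal sketch in Python, assuming Python is available on the machine (the standalone executable itself does not need it). The function name ensure_cublas_alias is my own invention for illustration, and the path in the usage comment is the one from step 5, so adjust the version folder to match your install.

```python
import shutil
from pathlib import Path


def ensure_cublas_alias(bin_dir,
                        source_name="cublas64_12.dll",
                        alias_name="cublas64_11.dll"):
    """Copy source_name to alias_name inside bin_dir if the alias is missing.

    Hypothetical helper for illustration only.
    Returns True if a copy was made, False if the alias already existed.
    Raises FileNotFoundError if the source DLL is not in bin_dir.
    """
    bin_dir = Path(bin_dir)
    source = bin_dir / source_name
    alias = bin_dir / alias_name

    if alias.exists():
        # Nothing to do: a previous run (or a CUDA 11 install) provided it.
        return False
    if not source.is_file():
        raise FileNotFoundError(
            f"{source} not found; check the CUDA version folder in the path"
        )

    # copy2 keeps the original file and creates the renamed duplicate.
    shutil.copy2(source, alias)
    return True


# Example usage (assumed path from step 5):
# ensure_cublas_alias(r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\bin")
```

Because the CUDA bin folder lives under Program Files, the script most likely needs to run from an elevated (administrator) prompt to be allowed to write the copy.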

You should now be able to open PowerShell in Terminal, navigate to the Whisper-Faster folder, and run the program.

Here are a few usage examples:

  • Display help information
    whisper-faster.exe --help
  • Generate subtitles (srt file) for a video file
    whisper-faster.exe "D:\videofile.mkv" --language=English --model=medium
  • Generate a transcript (txt file) for an audio file
    whisper-faster.exe "D:\audiofile.mp3" --language=English --model=medium --output_format=txt
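
Since my goal was transcripts for a whole series of podcast episodes, running the command once per file gets tedious. Here is a rough sketch of how the transcript example above could be repeated for every audio file in a folder. It assumes Python is installed and that whisper-faster.exe can be found from the current folder. The function names and the dry_run switch are my own illustration and not part of whisper-standalone-win. The only command-line flags used are the ones already shown above.

```python
import subprocess
from pathlib import Path


def build_command(audio_path, exe="whisper-faster.exe", language="English",
                  model="medium", output_format="txt"):
    """Build the transcript command from the example above for one file."""
    return [
        exe,
        str(audio_path),
        f"--language={language}",
        f"--model={model}",
        f"--output_format={output_format}",
    ]


def transcribe_folder(folder, pattern="*.mp3", dry_run=True):
    """Build one command per matching file, in name order.

    With dry_run=True the commands are only printed, so you can check
    them before anything runs. With dry_run=False each command is
    executed in turn and the first failure stops the loop.
    """
    commands = [build_command(p) for p in sorted(Path(folder).glob(pattern))]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
    return commands


# Example usage (hypothetical folder):
# transcribe_folder(r"D:\podcast", dry_run=False)
```

Passing the arguments as a list rather than one long string means file names with spaces do not need extra quoting.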
