A high-performance speech recognition MCP server based on Faster Whisper, providing efficient audio transcription capabilities.
Install the Python dependencies:

```bash
pip install -r requirements.txt
```
Install the appropriate version of PyTorch for your CUDA version:

CUDA 12.6:

```bash
pip install torch==2.6.0 torchvision==0.21.0 torchaudio==2.6.0 --index-url https://download.pytorch.org/whl/cu126
```

CUDA 12.1:

```bash
pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu121
```

CPU-only:

```bash
pip install torch==2.6.0 torchvision==0.21.0 torchaudio==2.6.0 --index-url https://download.pytorch.org/whl/cpu
```
You can check your CUDA version with `nvcc --version` or `nvidia-smi`.
On Windows, simply run `start_server.bat`.

On other platforms, run:

```bash
python whisper_server.py
```
Open the Claude Desktop configuration file:

- Windows: `%APPDATA%\Claude\claude_desktop_config.json`
- macOS: `~/Library/Application Support/Claude/claude_desktop_config.json`
Add the Whisper server configuration:

```json
{
  "mcpServers": {
    "whisper": {
      "command": "python",
      "args": ["D:/path/to/whisper_server.py"],
      "env": {}
    }
  }
}
```
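If you prefer to patch the file programmatically rather than by hand, a minimal sketch is shown below. The `add_whisper_server` helper is illustrative (not part of this project), and the script path is the same placeholder used above; it preserves any servers already present in the config:

```python
import json

def add_whisper_server(config: dict, script_path: str) -> dict:
    """Insert the whisper entry under mcpServers without
    disturbing any servers already configured."""
    servers = config.setdefault("mcpServers", {})
    servers["whisper"] = {
        "command": "python",
        "args": [script_path],
        "env": {},
    }
    return config

cfg = add_whisper_server({}, "D:/path/to/whisper_server.py")
print(json.dumps(cfg, indent=2))
```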
The server provides the following tools:
Test the server with the MCP Inspector:

```bash
mcp dev whisper_server.py
```

Use Claude Desktop for integration testing, or invoke the server directly from the command line (requires `mcp[cli]`):

```bash
mcp run whisper_server.py
```
The server implements the following error handling mechanisms:
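Whatever the server's exact mechanisms, a common pattern is to validate inputs up front so the MCP client receives a clear failure message instead of a deep stack trace. The sketch below is hypothetical (the `validate_audio_file` name and the format set are assumptions, not this server's actual code):

```python
import os

# Assumed format whitelist for illustration only.
SUPPORTED_FORMATS = {".mp3", ".wav", ".m4a", ".flac", ".ogg"}

def validate_audio_file(path: str) -> None:
    """Raise a descriptive error before transcription starts."""
    if not os.path.isfile(path):
        raise FileNotFoundError(f"Audio file not found: {path}")
    ext = os.path.splitext(path)[1].lower()
    if ext not in SUPPORTED_FORMATS:
        raise ValueError(f"Unsupported audio format: {ext or '(none)'}")
```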
- `whisper_server.py`: Main server code
- `model_manager.py`: Whisper model loading and caching
- `audio_processor.py`: Audio file validation and preprocessing
- `formatters.py`: Output formatting (VTT, SRT, JSON)
- `transcriber.py`: Core transcription logic
- `start_server.bat`: Windows startup script

This project is licensed under the MIT License.
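As an illustration of the kind of work `formatters.py`-style output formatting involves, SRT timestamps use the `HH:MM:SS,mmm` shape. The `to_srt_timestamp` helper below is a sketch of that conversion, not the module's actual API:

```python
def to_srt_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as an SRT timestamp."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

print(to_srt_timestamp(3661.5))  # 01:01:01,500
```

VTT timestamps differ only in using a dot (`HH:MM:SS.mmm`) as the millisecond separator.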
This project was developed with the assistance of these amazing AI tools and models:
Special thanks to these incredible tools and the teams behind them.