v4.52.4-Kyutai-STT-preview
📦 transformers
✨ 3 features
🔧 2 symbols
Summary
This release introduces a preview of the Kyutai-STT model architecture, featuring 1B and 2.6B parameter checkpoints for high-accuracy speech-to-text transcription.
Migration Steps
- Install the preview version using: pip install git+https://github.com/huggingface/transformers@v4.52.4-Kyutai-STT-preview
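After installing, it can be useful to confirm that the environment actually picked up the preview build. The following is a minimal sketch, not an official check: it looks up the two class names listed under Affected Symbols on the installed transformers package. The helper name missing_kyutai_symbols is hypothetical and introduced here only for illustration.

```python
import importlib

# Class names taken from the "Affected Symbols" section of these notes.
REQUIRED_SYMBOLS = (
    "KyutaiSpeechToTextProcessor",
    "KyutaiSpeechToTextForConditionalGeneration",
)


def missing_kyutai_symbols():
    """Return the required class names that the installed transformers lacks.

    Hypothetical helper for illustration; not part of transformers.
    """
    try:
        transformers = importlib.import_module("transformers")
    except Exception:
        # transformers is not installed (or failed to import): everything is missing.
        return list(REQUIRED_SYMBOLS)

    missing = []
    for name in REQUIRED_SYMBOLS:
        try:
            # Attribute lookup on the package fails on releases without Kyutai-STT.
            getattr(transformers, name)
        except Exception:
            missing.append(name)
    return missing
```

An empty list suggests the preview build is active; a non-empty list suggests a different transformers version is being imported.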
✨ New Features
- Added Kyutai-STT model architecture, a speech-to-text model based on the Mimi codec and a Moshi-like autoregressive decoder.
- Support for kyutai/stt-1b-en_fr (1B parameters, English and French transcription).
- Support for kyutai/stt-2.6b-en (2.6B parameters, English transcription).
🔧 Affected Symbols
- KyutaiSpeechToTextProcessor
- KyutaiSpeechToTextForConditionalGeneration
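To show how the two classes fit together, here is a hedged transcription sketch. It assumes the preview build is installed along with PyTorch, that the checkpoint id kyutai/stt-2.6b-en from the feature list resolves on the Hub, that the processor accepts a raw mono waveform and returns PyTorch tensors by default, and that the input is sampled at 24 kHz (the rate the Mimi codec operates at). The transcribe function and the constant names are hypothetical, introduced only for this example.

```python
MODEL_ID = "kyutai/stt-2.6b-en"   # checkpoint id as listed under New Features
SAMPLING_RATE = 24_000            # assumption: Mimi-based models expect 24 kHz audio


def transcribe(waveform, model_id=MODEL_ID):
    """Transcribe a 1-D float array of mono audio sampled at SAMPLING_RATE.

    Hypothetical helper; heavy imports are deferred so that defining the
    function does not require torch or the preview build to be present.
    """
    import torch
    from transformers import (
        KyutaiSpeechToTextForConditionalGeneration,
        KyutaiSpeechToTextProcessor,
    )

    device = "cuda" if torch.cuda.is_available() else "cpu"

    # 1. Load the processor and the model weights.
    processor = KyutaiSpeechToTextProcessor.from_pretrained(model_id)
    model = KyutaiSpeechToTextForConditionalGeneration.from_pretrained(model_id)
    model = model.to(device)

    # 2. Turn the raw waveform into model inputs (assumed to be PyTorch tensors).
    inputs = processor(waveform).to(device)

    # 3. Generate text tokens autoregressively, without tracking gradients.
    with torch.no_grad():
        output_tokens = model.generate(**inputs)

    # 4. Decode token ids into strings, one per input in the batch.
    return processor.batch_decode(output_tokens, skip_special_tokens=True)
```

A call such as transcribe(audio_array) would download the checkpoint on first use, so the 1B kyutai/stt-1b-en_fr checkpoint may be the more practical choice on limited hardware. Audio recorded at another rate would need resampling to 24 kHz first.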