Project update 3 of 9
Speech-to-text technology transforms spoken language into written text and sits at the core of voice assistants and interactive voice response (IVR) systems. Modern systems use AI and machine learning (ML) to improve this process. AI models trained on extensive voice data can now recognize and transcribe speech patterns reliably. With continued training, these systems improve in accuracy and adaptability, accommodating diverse accents, languages, and contexts.
In this article, we’ll create and explore a Python 3 script for offline speech-to-text on a Raspberry Pi 5 running Raspberry Pi OS, using the ANAVI Dev Mic for sound capture. The ANAVI Dev Mic is an open-source microphone built with the XIAO module and a digital MEMS microphone.
Our script uses the SpeechRecognition library, which works with multiple engines. Google’s speech recognition requires an internet connection and is therefore unsuitable for offline use, so our script instead employs OpenAI Whisper, an advanced automatic speech recognition (ASR) system that runs offline and handles varied audio conditions effectively.
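To make the idea concrete, here is a minimal sketch of how SpeechRecognition can hand captured audio to Whisper. It is not the stt.py from the repository: the helper name transcribe_once, the main function, and the choice of the "base" model are assumptions made for illustration, and the sketch uses the default input device rather than the ANAVI Dev Mic specifically.

```python
def transcribe_once(recognizer, source, model="base"):
    """Capture one phrase from `source` and transcribe it with Whisper.

    `recognizer` is expected to behave like speech_recognition.Recognizer and
    `source` like an opened speech_recognition.Microphone. The "base" model
    size is an assumption; smaller models are faster, larger ones more accurate.
    """
    # Calibrate the energy threshold against one second of background noise.
    recognizer.adjust_for_ambient_noise(source, duration=1)
    # Block until a phrase has been spoken and followed by silence.
    audio = recognizer.listen(source)
    # Inference runs on this machine through the openai-whisper package.
    text = recognizer.recognize_whisper(audio, model=model)
    return text.strip()


def main():
    # Imported here so the helper above can be exercised without audio hardware.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:  # default input device
        print("Say something...")
        print("You said:", transcribe_once(recognizer, source))
```

Calling main() on a machine with a microphone attached should print one transcribed phrase. Note that Whisper downloads the model weights the first time a given model is used, so that first run needs a network connection; later runs work offline.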
Follow these steps to set up and run the script:
python3 -m venv test
cd test
source bin/activate
sudo apt update
sudo apt install portaudio19-dev
pip install SpeechRecognition pyaudio openai-whisper
git clone https://github.com/AnaviTechnology/anavi-examples.git
cd anavi-examples/speech-to-text
python3 stt.py
The script is designed specifically for the ANAVI Dev Mic: it selects it from the available microphones and reports an error if it is not found. As shown in the video, offline speech-to-text works well on the Raspberry Pi 5 with Python 3, OpenAI Whisper, and the ANAVI Dev Mic. In my tests, the script also runs well on more powerful computers, such as those with 10th- and 12th-generation Intel Core i5 processors, but it is impressive that it also delivers good results on a low-cost single-board computer like the Raspberry Pi 5.
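Selecting a specific microphone by name could look roughly like the sketch below. It is an illustration rather than the code from stt.py: the function names are hypothetical, and the assumption that the device name reported by the system contains "ANAVI Dev Mic" should be checked against the actual output of list_microphone_names() on your machine.

```python
def find_microphone_index(names, keyword="ANAVI Dev Mic"):
    """Return the index of the first device whose name contains `keyword`.

    The match is case-insensitive. Returns None when nothing matches.
    The default keyword is an assumption about how the device reports itself.
    """
    needle = keyword.lower()
    for index, name in enumerate(names):
        # Some backends report None for devices without a readable name.
        if name and needle in name.lower():
            return index
    return None


def open_dev_mic(keyword="ANAVI Dev Mic"):
    """Return a speech_recognition.Microphone for the matching device."""
    # Imported here so find_microphone_index can be used without PyAudio.
    import speech_recognition as sr

    names = sr.Microphone.list_microphone_names()
    index = find_microphone_index(names, keyword)
    if index is None:
        raise RuntimeError(f"No microphone matching {keyword!r}; found: {names}")
    return sr.Microphone(device_index=index)
```

Keeping the name matching in its own function means it can be checked against a plain list of strings, while open_dev_mic does the part that needs real audio hardware.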
If you’re interested in exploring speech-to-text further, consider supporting the ANAVI Dev Mic, an open-source microphone that you can trust.