Video Tutorial: Offline Speech-to-Text on Raspberry Pi 5 with ANAVI Dev Mic

by ANAVI Technology

Speech-to-text technology converts spoken language into written text and sits at the core of voice assistants and interactive voice response (IVR) systems. Modern systems rely on AI and machine learning (ML): models trained on large amounts of voice data recognize and transcribe speech patterns, and they cope with a wide range of accents, languages, and contexts.

In this article, we’ll create and explore a Python 3 script for offline speech-to-text on a Raspberry Pi 5 running Raspberry Pi OS, using the ANAVI Dev Mic for sound capture. The ANAVI Dev Mic is an open-source microphone built with the XIAO module and a digital MEMS microphone.

Our script uses the SpeechRecognition library, which works with multiple engines. Unlike Google’s speech recognition, which requires an internet connection and is thus unsuitable for offline use, our script employs OpenAI Whisper, an advanced automatic speech recognition (ASR) system that runs offline and handles various audio conditions effectively.
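
To make the difference between engines concrete, here is a minimal sketch of switching between them with SpeechRecognition. The `transcribe` helper and its `offline` flag are hypothetical names used for illustration, and the `base` model size is an assumption, not necessarily what the ANAVI script uses. `recognizer` is expected to be a `speech_recognition.Recognizer` and `audio` an `AudioData` object captured from a microphone or an audio file.

```python
def transcribe(recognizer, audio, offline=True):
    """Return the text for `audio` using an offline or an online engine."""
    if offline:
        # Whisper runs locally. Its model weights are fetched on first use
        # and cached, so only the very first run needs a connection.
        return recognizer.recognize_whisper(audio, model="base")
    # Google's Web Speech API sends the audio to a remote service.
    return recognizer.recognize_google(audio)
```

Keeping the engine choice in one place makes it easy to compare results from both engines on the same recording.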

Follow these steps to set up and run the script:

Launch Raspberry Pi OS and open a terminal.
Create a Python 3 virtual environment:

python3 -m venv test
cd test
source bin/activate

Install dependencies:

sudo apt update
sudo apt install portaudio19-dev

Install Python libraries:

pip install SpeechRecognition pyaudio openai-whisper

Download the source code from GitHub:

git clone https://github.com/AnaviTechnology/anavi-examples.git

Run the script:

cd anavi-examples/speech-to-text
python3 stt.py

Follow the on-screen instructions and start speaking when the script is listening.
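
If the script stops with an import error, a quick check like the one below can show which package is missing from the virtual environment. It is a minimal sketch and not part of the ANAVI examples repository; the helper name `missing_packages` is made up for illustration. Note that the pip package names differ from the module names Python imports.

```python
import importlib.util

# pip package name -> module name used in `import` statements
REQUIRED = {
    "SpeechRecognition": "speech_recognition",
    "pyaudio": "pyaudio",
    "openai-whisper": "whisper",
}


def missing_packages(required=REQUIRED):
    """Return the pip names whose modules cannot be found in this environment."""
    return [
        pip_name
        for pip_name, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing: " + ", ".join(missing))
    else:
        print("All packages found.")
```

Run it with the virtual environment activated, so that it inspects the same interpreter the script uses.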

The script is designed specifically for the ANAVI Dev Mic: it selects the device from the available microphones and reports an error if it is not found. As shown in the video, offline speech-to-text works well on the Raspberry Pi 5 with Python 3, OpenAI Whisper, and the ANAVI Dev Mic. In my tests, the script also ran well on more powerful computers, such as those with 10th- and 12th-generation Intel Core i5 processors, but it’s impressive that it also delivers good results on a low-cost single-board computer like the Raspberry Pi 5.
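
The microphone selection described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual `stt.py`: the function names are hypothetical, the `ANAVI` keyword is a guess at how the device name appears in the system's device list, and `base` is an assumed Whisper model size.

```python
def find_mic_index(names, keyword="ANAVI"):
    """Return the index of the first device whose name contains `keyword`, else None."""
    for index, name in enumerate(names):
        # Some entries can be None when a device reports no name.
        if name and keyword.lower() in name.lower():
            return index
    return None


def listen_and_transcribe(keyword="ANAVI", model="base"):
    """Record one phrase from the matching microphone and transcribe it offline."""
    # Imported here so find_mic_index can be used without the audio stack.
    import speech_recognition as sr

    index = find_mic_index(sr.Microphone.list_microphone_names(), keyword)
    if index is None:
        raise RuntimeError(f"No microphone matching {keyword!r} was found")

    recognizer = sr.Recognizer()
    with sr.Microphone(device_index=index) as source:
        # Sample the background noise briefly before listening.
        recognizer.adjust_for_ambient_noise(source)
        print("Listening...")
        audio = recognizer.listen(source)
    return recognizer.recognize_whisper(audio, model=model)
```

Calling `print(listen_and_transcribe())` with the microphone plugged in should record a single phrase and print the transcription. If the device is listed under a different name, pass a different `keyword`.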

If you’re interested in exploring speech-to-text further, consider supporting the ANAVI Dev Mic, an open-source microphone that you can trust.


ANAVI Dev Mic

Open-hardware, USB Type-C omnidirectional mic with programmable RP2040 MCU
