Voice Activity Detection (VAD) analyzes an audio signal and classifies short segments of it as voiced (containing speech) or unvoiced.
Web Real-Time Communication (WebRTC) includes a voice activity detector that you can call from Python. In this article, you will learn how to perform WebRTC Voice Activity Detection using Python.
Ready to get started? Let us start by installing the necessary Python library for this project.
Getting Started with WebRTC Voice Activity Detection in Python
To perform WebRTC Voice Activity Detection using Python, you can install the py-webrtcvad library using Python’s Package Manager (pip).
The py-webrtcvad package/library is a Python interface to the WebRTC Voice Activity Detector from Google and is compatible with Python 2 and Python 3. This library can be used for telephony and speech recognition free of charge.
To install the py-webrtcvad library, open up your command line/terminal and run the following command:
pip install webrtcvad
This will install the latest version of the py-webrtcvad library on your machine. You can then import it in Python using a Python IDE or Python Shell by writing the following line of code:
import webrtcvad
If this line runs without an error, you have successfully installed and imported py-webrtcvad. Note that although the project is called py-webrtcvad, both the pip package and the Python module are named webrtcvad.
WebRTC Voice Activity Detection in Python Example
Here’s an example of how Voice Activity Detection can be performed in Python:
# Import the py-webrtcvad library
import webrtcvad

# Initialize a vad object
vad = webrtcvad.Vad()

# Run the VAD on 10 ms of silence at a 16000 Hz sampling rate
sample_rate = 16000
frame_duration = 10  # in ms

# Create an audio frame of silence (16-bit samples, 2 bytes each)
frame = b'\x00\x00' * int(sample_rate * frame_duration / 1000)

# Detect speech
print(f'Contains speech: {vad.is_speech(frame, sample_rate)}')
Contains speech: False
As you can see, the is_speech() method from the webrtcvad library can be used to detect voice in Python. You can use this method in any of your projects to start detecting voice in recorded audio frames.
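Real recordings are longer than a single frame, so they have to be cut into frames before calling is_speech(). According to the py-webrtcvad README, the detector accepts 16-bit mono PCM audio sampled at 8000, 16000, 32000, or 48000 Hz, in frames of 10, 20, or 30 ms. Below is a minimal sketch of that slicing step. The names frame_bytes and split_frames are hypothetical helpers introduced here for illustration and are not part of the library. The webrtcvad call is guarded so the slicing logic still runs if the library is not installed.

```python
SUPPORTED_RATES = (8000, 16000, 32000, 48000)  # Hz, per the py-webrtcvad README
SUPPORTED_DURATIONS = (10, 20, 30)             # ms, per the py-webrtcvad README


def frame_bytes(sample_rate, frame_duration_ms):
    """Size in bytes of one 16-bit mono PCM frame (hypothetical helper)."""
    if sample_rate not in SUPPORTED_RATES:
        raise ValueError(f'unsupported sample rate: {sample_rate}')
    if frame_duration_ms not in SUPPORTED_DURATIONS:
        raise ValueError(f'unsupported frame duration: {frame_duration_ms}')
    samples = sample_rate * frame_duration_ms // 1000
    return samples * 2  # 2 bytes per 16-bit sample


def split_frames(pcm, sample_rate, frame_duration_ms=30):
    """Yield consecutive fixed-size frames; a trailing partial frame is dropped."""
    size = frame_bytes(sample_rate, frame_duration_ms)
    for start in range(0, len(pcm) - size + 1, size):
        yield pcm[start:start + size]


# One second of 16-bit silence at 16 kHz, cut into 30 ms frames
sample_rate = 16000
pcm = b'\x00\x00' * sample_rate
frames = list(split_frames(pcm, sample_rate, 30))
print(f'{len(frames)} frames of {len(frames[0])} bytes each')

# Classify every frame if webrtcvad is available
try:
    import webrtcvad
except ImportError:
    webrtcvad = None

if webrtcvad is not None:
    vad = webrtcvad.Vad()
    voiced = sum(vad.is_speech(frame, sample_rate) for frame in frames)
    print(f'Voiced frames: {voiced} of {len(frames)}')
```

Dropping the trailing partial frame is a design choice of this sketch: the detector only accepts the frame lengths listed above, so a shorter remainder cannot be passed to it as-is.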
See example.py from the library’s GitHub repository for a more detailed example that will process a .wav file, find the voiced segments, and write each one as a separate .wav.
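To give a rough idea of what such a pipeline involves, here is a simplified sketch that reads a WAV file with Python's standard wave module and groups consecutive voiced frames into time ranges. This is not the repository's example.py, and it only reports segment boundaries rather than writing new files. The names read_wave, voiced_segments, and non_silent are assumptions made for this illustration. The classifier is passed in as a function, and the demo uses a trivial stand-in so that it runs without webrtcvad installed.

```python
import os
import tempfile
import wave


def read_wave(path):
    """Return (pcm_bytes, sample_rate) for a 16-bit mono WAV file."""
    with wave.open(path, 'rb') as wf:
        if wf.getnchannels() != 1 or wf.getsampwidth() != 2:
            raise ValueError('expected 16-bit mono PCM audio')
        return wf.readframes(wf.getnframes()), wf.getframerate()


def voiced_segments(pcm, sample_rate, is_speech, frame_duration_ms=30):
    """Group consecutive voiced frames into (start_ms, end_ms) tuples.

    is_speech is any callable taking (frame_bytes, sample_rate) and
    returning a bool.
    """
    size = int(sample_rate * frame_duration_ms / 1000) * 2  # bytes per frame
    n_frames = len(pcm) // size
    segments, start = [], None
    for i in range(n_frames):
        frame = pcm[i * size:(i + 1) * size]
        t = i * frame_duration_ms
        if is_speech(frame, sample_rate):
            if start is None:
                start = t  # a voiced segment begins here
        elif start is not None:
            segments.append((start, t))  # the segment ended at this frame
            start = None
    if start is not None:
        segments.append((start, n_frames * frame_duration_ms))
    return segments


def non_silent(frame, sample_rate):
    """Stand-in classifier for the demo: any non-zero byte counts as voiced."""
    return any(frame)


# Demo: write a synthetic WAV (2 silent frames, 3 non-silent, 1 silent)
rate = 16000
silent = b'\x00\x00' * 480  # 30 ms at 16 kHz
loud = b'\x10\x10' * 480
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.wav')
with wave.open(demo_path, 'wb') as wf:
    wf.setnchannels(1)
    wf.setsampwidth(2)
    wf.setframerate(rate)
    wf.writeframes(silent * 2 + loud * 3 + silent)

pcm, rate = read_wave(demo_path)
segments = voiced_segments(pcm, rate, non_silent)
print(segments)  # [(60, 150)]
```

To use the real detector instead of the stand-in, pass webrtcvad.Vad().is_speech as the is_speech argument, provided the file's sample rate is one the detector supports.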
In Conclusion
You now know how to perform WebRTC Voice Activity Detection using Python. If you have any questions, please feel free to comment down below and we will get back to you.