Retrieve Audio Segment in Large Text Corpus
git clone https://github.com/davoodwadi/FuzzyAudioSearch.git
cd FuzzyAudioSearch
python -m venv venv
# if on Windows (from a bash-style shell such as Git Bash)
source venv/Scripts/activate
# if on macOS or Linux
source venv/bin/activate
pip install -r requirements.txt
python FuzzyAudioSearch.py -a audio_file -t text_file
Here, audio_file is the path to the audio file you want to search for, and text_file is the path to the large text corpus.
You can optionally pass -c to set the size of the chunks taken from the start and end of the audio that are used to find matches:
python FuzzyAudioSearch.py -a audio_file -t text_file -c 100
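To illustrate how matching on the start and end chunks could work, here is a minimal sketch. It is not the code in FuzzyAudioSearch.py: the function names best_match and locate_segment are hypothetical, the chunk size is assumed to be a character count, and Python's standard difflib stands in for whatever fuzzy matcher the tool actually uses. The brute-force window scan is for illustration only and would be slow on a very large corpus.

```python
from difflib import SequenceMatcher


def best_match(needle: str, haystack: str) -> tuple[int, float]:
    """Return (offset, score) of the haystack window most similar to needle."""
    width = len(needle)
    target = needle.lower()
    best_offset, best_score = 0, 0.0
    # Slide a window of the same width as the needle over the haystack.
    for offset in range(max(1, len(haystack) - width + 1)):
        window = haystack[offset:offset + width].lower()
        score = SequenceMatcher(None, target, window, autojunk=False).ratio()
        if score > best_score:
            best_offset, best_score = offset, score
    return best_offset, best_score


def locate_segment(transcript: str, corpus: str, chunk: int = 100) -> str:
    """Return the corpus span whose edges best match the transcript's edges.

    Assumption: `chunk` is a number of characters taken from the start
    and from the end of the (possibly noisy) transcript.
    """
    head, tail = transcript[:chunk], transcript[-chunk:]
    start, _ = best_match(head, corpus)
    # Look for the tail only in the text that follows the matched head.
    end_offset, _ = best_match(tail, corpus[start:])
    end = start + end_offset + len(tail)
    return corpus[start:end]


if __name__ == "__main__":
    corpus = (
        "The quick brown fox jumps over the lazy dog. "
        "Pack my box with five dozen liquor jugs. "
        "How vexingly quick daft zebras jump."
    )
    noisy_transcript = "pack my bax with five dozen liqour jugs"
    print(locate_segment(noisy_transcript, corpus, chunk=12))
```

Matching only the two edges keeps the comparison cheap relative to aligning the whole transcript, and it tolerates transcription errors in the middle of the segment.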
The default model size, tiny, is sufficient for many texts. For multilingual texts (e.g. books by Nietzsche, which contain both English and German), a larger Whisper model helps. Select the model with -m:
python FuzzyAudioSearch.py -a audio_file -t text_file -c 100 -m tiny
Model options:
- tiny
- small
- medium
- large-v3
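For readers curious how a model name like those above might be turned into a transcript, here is a hedged sketch. It assumes the openai-whisper Python package (whisper.load_model and model.transcribe); whether FuzzyAudioSearch.py uses that package or another Whisper implementation is not stated here. The helper names check_model and transcribe are hypothetical.

```python
# The model sizes listed in this README.
MODEL_OPTIONS = ("tiny", "small", "medium", "large-v3")


def check_model(name: str) -> str:
    """Return name if it is one of the listed sizes, otherwise raise."""
    if name not in MODEL_OPTIONS:
        choices = ", ".join(MODEL_OPTIONS)
        raise ValueError(f"unknown model {name!r}; choose from: {choices}")
    return name


def transcribe(audio_path: str, model_name: str = "tiny") -> str:
    """Transcribe an audio file with the chosen Whisper model.

    Assumption: the openai-whisper package is installed. It is imported
    lazily so that check_model can be used without it.
    """
    import whisper

    model = whisper.load_model(check_model(model_name))
    result = model.transcribe(audio_path)
    return result["text"]
```

Larger models are slower and need more memory, so it is reasonable to start with tiny and move up only if the transcript is too noisy to match.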
@software{Wadi_Retrieve_Audio_Segment_2023,
author = {Wadi, Davood},
month = jan,
title = {{Retrieve Audio Segment in Large Text Corpus}},
version = {0.0.1},
year = {2023}
}