Crash on empty whisper segment #101

@mikelococo

Overview

Subaligner as of fc18afa (current head of master) crashes when using Whisper to transcribe audio if Whisper outputs an empty segment. The crash output looks like:

subaligner.transcriber - INFO - MainThread - Finished transcribing the audio
ERROR: list index out of range
  File "subaligner/.venv/bin/subaligner", line 10, in <module>
    sys.exit(main())
  File "subaligner/.venv/lib/python3.11/site-packages/subaligner/__main__.py", line 438, in main
    "ERROR: {}\n{}".format(str(e), "".join(traceback.format_stack()) if FLAGS.debug else "")
  File "subaligner/.venv/lib/python3.11/site-packages/subaligner/__main__.py", line 377, in main
    subtitle, frame_rate = transcriber.transcribe(video_file_path=local_video_path,
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "subaligner/.venv/lib/python3.11/site-packages/subaligner/transcriber.py", line 106, in transcribe
    f"{Utils.format_timestamp(segment['words'][0]['start'])} --> {Utils.format_timestamp(segment['words'][-1]['end'])}\n" \
                              ~~~~~~~~~~~~~~~~^^^
subaligner.transcriber - INFO - MainThread - process shutting down
subaligner.transcriber - DEBUG - MainThread - running all "atexit" finalizers with priority >= 0
subaligner.transcriber - DEBUG - MainThread - running the remaining "atexit" finalizers

Reproduction

I think it's probably input-file dependent and my file is large and private, but here's the command-line invocation that crashed:

CUDA_VISIBLE_DEVICES= subaligner -m transcribe -v '/tmp/myfile.webm' -ml deu -mr whisper -mf large-v3 -o '/tmp/myfile.de.srt' --debug

Analysis

I did some debugging to print out the result from

result = self.__model.transcribe(audio,

and found a segment that looks like this (in amongst other, more conventional segments that have transcribed content in the text field and a list of words):

{
    'id': 70,
    'seek': 14668,
    'start': 163.36,
    'end': 163.36,
    'text': '',
    'tokens': [],
    'temperature': 0.0,
    'avg_logprob': -0.785851426011934,
    'compression_ratio': 1.8442211055276383,
    'no_speech_prob': 0.034043002873659134,
    'words': []
}

The crash is happening in

f"{Utils.format_timestamp(segment['words'][0]['start'])} --> {Utils.format_timestamp(segment['words'][-1]['end'])}\n" \

because words is an empty array.
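
For anyone who wants to see the failure without the original audio, here is a minimal sketch that indexes into an empty words list the same way. The segment dict is trimmed from the one above; `first_word_start` is a hypothetical helper for illustration, not a Subaligner function.

```python
# Trimmed copy of the empty segment shown in the Analysis section.
segment = {"id": 70, "start": 163.36, "end": 163.36, "text": "", "words": []}


def first_word_start(seg):
    # Hypothetical helper: mirrors the segment['words'][0]['start'] lookup
    # from the traceback, without any Subaligner code involved.
    return seg["words"][0]["start"]


try:
    first_word_start(segment)
    error_message = None
except IndexError as exc:
    # Indexing an empty list raises IndexError.
    error_message = str(exc)

print(error_message)  # list index out of range
```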

Possible Fix

In my local install, I patched the conditions in the for-loop that processes segments so that it skips empty segments, like so:

                for segment in result["segments"]:
                    if max_char_length is not None and len(segment["text"]) > max_char_length:
                        srt_str, srt_idx = self._chunk_segment(segment, srt_str, srt_idx, max_char_length)
                    elif with_word_time_codes:
                        for word in segment["words"]:
                            srt_str += f"{srt_idx}\n" \
                                        f"{Utils.format_timestamp(word['start'])} --> {Utils.format_timestamp(word['end'])}\n" \
                                        f"{word['word'].strip().replace('-->', '->')}\n" \
                                        "\n"
                            srt_idx += 1
                    elif segment["words"]:
                        srt_str += f"{srt_idx}\n" \
                                    f"{Utils.format_timestamp(segment['words'][0]['start'])} --> {Utils.format_timestamp(segment['words'][-1]['end'])}\n" \
                                    f"{segment['text'].strip().replace('-->', '->')}\n" \
                                    "\n"
                        srt_idx += 1
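
To try the guard outside of Subaligner, here is a rough standalone sketch of just the segment-level branch. It is not the project's code: `format_timestamp` is a stand-in I wrote for `Utils.format_timestamp` (assuming the usual SRT `HH:MM:SS,mmm` layout), and `segments_to_srt` is a hypothetical name.

```python
def format_timestamp(seconds: float) -> str:
    # Stand-in for Utils.format_timestamp; assumes SRT-style HH:MM:SS,mmm.
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments):
    # Hypothetical helper: builds SRT text from Whisper-style segment dicts,
    # skipping any segment whose word list is empty.
    srt_str, srt_idx = "", 1
    for segment in segments:
        words = segment["words"]
        if not words:
            continue  # empty segment: nothing to index into, so skip it
        srt_str += (
            f"{srt_idx}\n"
            f"{format_timestamp(words[0]['start'])} --> {format_timestamp(words[-1]['end'])}\n"
            f"{segment['text'].strip().replace('-->', '->')}\n"
            "\n"
        )
        srt_idx += 1
    return srt_str


# Made-up sample data: two normal segments around one empty segment.
sample_segments = [
    {"text": " Hallo Welt", "words": [
        {"word": " Hallo", "start": 1.0, "end": 1.5},
        {"word": " Welt", "start": 1.5, "end": 2.0},
    ]},
    {"text": "", "words": []},
    {"text": " Danke", "words": [{"word": " Danke", "start": 3.0, "end": 3.5}]},
]
srt_output = segments_to_srt(sample_segments)
print(srt_output)
```

Because skipped segments do not advance the counter, the SRT indices stay contiguous.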

I don't know much about Whisper's output format and I can imagine lots of ways this could still bail ungracefully, but handling empty segments in some fashion seems necessary for at least some input files.
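
One slightly more defensive variant, purely as a sketch: fall back to the segment's own start and end when there are no words, and only skip the segment when its text is blank. `segment_time_range` is a hypothetical helper, and whether Subaligner would want this fallback at all is an assumption on my part.

```python
from typing import Optional, Tuple


def segment_time_range(segment: dict) -> Optional[Tuple[float, float]]:
    # Hypothetical helper, not part of Subaligner.
    # Returns (start, end) for a segment, or None if it should be skipped.
    if not segment.get("text", "").strip():
        return None  # nothing to display, e.g. the empty segment above
    words = segment.get("words") or []
    if words:
        # Prefer word-level timing when it is available.
        return words[0]["start"], words[-1]["end"]
    # Assumption: segment-level start/end are acceptable as a fallback.
    return segment["start"], segment["end"]


empty = {"start": 163.36, "end": 163.36, "text": "", "words": []}
no_words = {"start": 5.0, "end": 6.0, "text": " Hallo"}
with_words = {
    "start": 5.0, "end": 6.0, "text": " Hallo",
    "words": [{"word": " Hallo", "start": 5.2, "end": 5.8}],
}

print(segment_time_range(empty))       # None
print(segment_time_range(no_words))    # (5.0, 6.0)
print(segment_time_range(with_words))  # (5.2, 5.8)
```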
