Whisper Transcription

This utility transcribes voice or video recordings using the Whisper automatic speech recognition model from OpenAI.

Input format

The app can process .mp3, .mp4, .m4a, .wav and .mpg files.
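
To pre-check a folder of recordings before submitting a job, the list above can be turned into a small filter. This is a minimal sketch, not part of the app: the names `SUPPORTED_EXTENSIONS`, `is_supported` and `find_inputs` are hypothetical, and treating extensions as case-insensitive is an assumption.

```python
from pathlib import Path

# Extensions listed in this documentation as accepted input.
SUPPORTED_EXTENSIONS = {".mp3", ".mp4", ".m4a", ".wav", ".mpg"}


def is_supported(path: str) -> bool:
    """Return True if the file extension is one of the listed input formats.

    Assumption: the comparison is case-insensitive (".WAV" counts as ".wav").
    """
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def find_inputs(folder: str) -> list[Path]:
    """Return the supported files directly inside `folder`, sorted by name."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and is_supported(p.name)
    )
```

For example, `is_supported("interview.mp3")` is true, while `is_supported("notes.txt")` is false.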

Output format

  • CSV:
    Contains every parameter output by the Whisper model.

  • DOTE:
    Format for the DOTE transcription software developed by the BigSoftVideo team at AAU

  • DOCX:
    Office Open XML Document (Microsoft Word)

  • JSON:
    JavaScript Object Notation

  • SRT:
    SubRip file format, a widely adopted subtitle format

  • TSV:
    Tab-separated values file containing start, end and text

  • TXT:
    Pure text file with the transcription

  • VTT:
    Web Video Text Tracks format

  • ZIP:
    Archive with all of the output files. If Archive Password is set, the archive is encrypted with AES.
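
To illustrate how the subtitle-style formats relate to the underlying transcript, the sketch below turns a list of segments into SRT text. It is not the app's own writer. It assumes each segment is a dictionary with `start` and `end` in seconds and a `text` string (the same three fields the TSV output carries); the function names `srt_timestamp` and `segments_to_srt` are hypothetical.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: list[dict]) -> str:
    """Build SRT text from segments.

    Assumption: every segment has 'start' and 'end' (seconds) and 'text'.
    Each SRT block is a 1-based index, a time range, and the text,
    with a blank line between blocks.
    """
    blocks = []
    for index, seg in enumerate(segments, start=1):
        time_range = f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}"
        blocks.append(f"{index}\n{time_range}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)
```

The same loop could write `start`, `end` and `text` separated by tabs to produce a TSV-style file instead.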

Output Folder

By default, the transcript files are saved in /Jobs/Whisper Transcription/<job-id>/out. The user can select another directory using the corresponding optional parameter.
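
Note that the default path contains a space, so it needs quoting when used in a shell. The sketch below builds the path for a given job and quotes it. The helper name `default_output_dir` and the job ID in the usage note are hypothetical placeholders.

```python
import shlex
from pathlib import PurePosixPath

# Base folder taken from the default path documented above.
JOBS_ROOT = PurePosixPath("/Jobs/Whisper Transcription")


def default_output_dir(job_id: str) -> PurePosixPath:
    """Return the default output folder for a job: <root>/<job-id>/out."""
    return JOBS_ROOT / job_id / "out"


def shell_quoted(path: PurePosixPath) -> str:
    """Quote the path so the space in the folder name survives in a shell."""
    return shlex.quote(str(path))
```

With a placeholder job ID such as `"1234567"`, `shell_quoted(default_output_dir("1234567"))` yields a single-quoted path that can be passed safely to commands like `ls` or `cp`.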

Interactive mode

The Interactive mode parameter is used to start an interactive job session where the user can either select "Open terminal" or "Open interface". The latter gives access to a JupyterLab workspace to run notebooks.