Transcribe videos with Google Colab and OpenAI Whisper

Google Colab provides powerful GPUs even on the free tier; the T4 used here has 16 GB of VRAM. I found this very useful for first generating SRT files from Chinese YouTube videos and then translating the Chinese SRT files into English SRT files.

Sample yt-dlp command to download video

# proxy, cookies, format and output name are not compulsory.
yt-dlp --proxy socks5://localhost:1080/ --cookies /tmp/cookies.txt -f "bestvideo[height=720]+bestaudio" 'https://www.youtube.com/watch?v=XlmWtg4ksaw' -o '藏南的诅咒：困死在喜马拉雅南坡的印度'

Sample ffmpeg command to extract audio from video

ffmpeg -i input.webm -vn -acodec libmp3lame -ab 192k output.mp3
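
If you would rather run this step from Python (for example in a notebook cell), the same call can be wrapped with subprocess. This is a minimal sketch, not part of the original workflow: the helper names build_ffmpeg_cmd and extract_audio are hypothetical, it assumes ffmpeg is available on the PATH, and it names the output after the input file rather than output.mp3.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_cmd(video: str, bitrate: str = "192k") -> list[str]:
    """Build the ffmpeg call shown above, writing <video name>.mp3."""
    # Assumption: the audio file sits next to the video, with an .mp3 suffix.
    audio = str(Path(video).with_suffix(".mp3"))
    return ["ffmpeg", "-i", video, "-vn", "-acodec", "libmp3lame", "-ab", bitrate, audio]


def extract_audio(video: str) -> str:
    """Run ffmpeg and return the path of the extracted audio file."""
    cmd = build_ffmpeg_cmd(video)
    subprocess.run(cmd, check=True)  # raises CalledProcessError if ffmpeg fails
    return cmd[-1]


# Only prints the command; call extract_audio("input.webm") to actually run it.
print(build_ffmpeg_cmd("input.webm"))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces or non-ASCII characters, such as the video title used earlier.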

First visit https://colab.research.google.com/

File
    > New notebook in Drive

Set the runtime type

Runtime
    > Change runtime type

Select Python 3 as the interpreter and T4 GPU as the hardware accelerator.

Each code block below is inserted by first pressing the +Code button in the UI and then entering the text. Lines that begin with ! are run in the OS shell environment, while the rest are run in the selected interpreter (here Python 3). After typing in each code block, press the run button.

The code blocks (run in sequence)

!pip install openai-whisper

from google.colab import files

uploaded = files.upload()

Click on Choose Files, then navigate to and select the file.

filename = list(uploaded.keys())[0]
print(f"Uploaded file: {filename}")
import os
os.environ['FNAME'] = filename

!whisper "$FNAME" --output_format srt --language zh --model turbo

Note

Replace zh with the language spoken in the uploaded audio file.
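
For reference, each entry in an SRT file consists of a counter, a timing line of the form HH:MM:SS,mmm --> HH:MM:SS,mmm, and the subtitle text. Whisper's Python API (model.transcribe) returns a "segments" list whose items carry start, end and text, and the sketch below shows how such segments map onto that layout. It is an illustration only, not Whisper's own writer; the function names and the sample segments are made up.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1_000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments) -> str:
    """Render segments (dicts with 'start', 'end', 'text') as SRT text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        timing = f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}"
        blocks.append(f"{index}\n{timing}\n{seg['text'].strip()}\n")
    # Blocks are separated by a blank line.
    return "\n".join(blocks)


# Hypothetical segments, shaped like the "segments" list from transcribe().
example = [
    {"start": 0.0, "end": 2.5, "text": " 大家好"},
    {"start": 2.5, "end": 61.04, "text": " 欢迎收看"},
]
print(segments_to_srt(example))
```

Knowing this layout helps later on, because the counter and timing lines have to survive the translation step unchanged.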

filename_without_ext = os.path.splitext(os.environ['FNAME'])[0]
files.download(f"{filename_without_ext}.srt")

Note

The generated file has the same base name as the uploaded file, with .srt as its extension.

Translate with Google Translate

At https://translate.google.com, the Documents tab shows that Google only supports .docx, .pdf, .pptx and .xlsx files.

To work around this limitation, use LibreOffice.

libreoffice --convert-to "docx:MS Word 2007 XML" output.srt

This creates output.docx. Have it translated at Google Translate and download the resulting docx. Convert it back with

libreoffice --convert-to "txt:Text" output-eng.docx

This creates output-eng.txt. Rename this file to your "video name.srt" and you're good to go!
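
Before using the translated file, it can be worth checking that the subtitle structure survived the round trip. The sketch below assumes (this is not something verified here) that translation or conversion might occasionally alter a counter or timing line; it reports every block that no longer looks like valid SRT. The function name find_bad_blocks and the sample text are hypothetical.

```python
import re

# Timing line as used in SRT files: 00:00:01,000 --> 00:00:04,000
TIMING = re.compile(r"^\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}$")


def find_bad_blocks(srt_text: str) -> list[str]:
    """Return a description of every subtitle block that looks malformed."""
    problems = []
    # Drop a possible byte order mark, then split on blank lines.
    cleaned = srt_text.lstrip("\ufeff").strip()
    blocks = [b for b in re.split(r"\n\s*\n", cleaned) if b.strip()]
    for position, block in enumerate(blocks, start=1):
        lines = [line.strip() for line in block.splitlines()]
        if len(lines) < 3:
            problems.append(f"block {position}: expected counter, timing and text lines")
        elif not lines[0].isdigit():
            problems.append(f"block {position}: counter line is {lines[0]!r}")
        elif not TIMING.match(lines[1]):
            problems.append(f"block {position}: timing line is {lines[1]!r}")
    return problems


good = "1\n00:00:00,000 --> 00:00:02,500\nHello everyone\n\n2\n00:00:02,500 --> 00:00:05,000\nWelcome\n"
damaged = good.replace("-->", "->", 1)  # simulate a mangled arrow in block 1

print(find_bad_blocks(good))
print(find_bad_blocks(damaged))
```

To check a real file, read it with open("video name.srt", encoding="utf-8").read() and pass the text to find_bad_blocks; an empty list means no structural problems were detected.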
