Interactive videos are multimedia content that includes user interaction, voice commands, and directions. Transcribing the audio from these videos and saving the transcript as a PDF is valuable for education, meeting summaries, interview archives, and many other uses. This article explains, step by step, how to extract the audio from a video, convert it to text, and produce a well-organized PDF.

1. Extract Audio

The first step is to extract the audio from the video file.

✅ Recommended tool: FFmpeg

ffmpeg -i video.mp4 -vn -acodec copy ses.aac

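This copies the audio stream without re-encoding, so it assumes the audio track in the video is already AAC. If you want an uncompressed WAV file instead (16 kHz mono is sufficient for speech recognition):

ffmpeg -i video.mp4 -vn -acodec pcm_s16le -ac 1 -ar 16000 ses.wav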
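
If the extraction needs to run from a script, the FFmpeg call can be wrapped in Python. The following is a minimal sketch, assuming FFmpeg is installed and on the PATH; the function names build_ffmpeg_command and extract_audio, and the 16 kHz mono defaults, are illustrative choices rather than anything prescribed by FFmpeg.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(video_path, audio_path, sample_rate=16000, channels=1):
    """Return the FFmpeg argument list for extracting a PCM WAV track."""
    return [
        "ffmpeg",
        "-y",                      # overwrite the output file if it exists
        "-i", str(video_path),     # input video
        "-vn",                     # drop the video stream
        "-acodec", "pcm_s16le",    # 16-bit PCM, the usual WAV encoding
        "-ac", str(channels),      # number of audio channels
        "-ar", str(sample_rate),   # sample rate in Hz
        str(audio_path),
    ]


def extract_audio(video_path, audio_path="ses.wav"):
    """Run FFmpeg and return the path of the extracted audio file."""
    command = build_ffmpeg_command(video_path, audio_path)
    # check=True raises CalledProcessError if FFmpeg exits with an error
    subprocess.run(command, check=True)
    return Path(audio_path)
```

Calling extract_audio("video.mp4") would then produce ses.wav in the working directory. Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.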

2. Speech to Text

Various AI-based solutions can be used to transcribe the conversations in the video.

Recommended tools:

OpenAI Whisper (high accuracy rate)
Google Speech-to-Text API
Vosk (offline option)

Whisper command example:

whisper ses.wav --language Turkish --model medium

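Output: ses.txt file (by default Whisper also writes .srt, .vtt, .tsv, and .json versions; add --output_format txt to get only the text file)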
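
Whisper can also be called from Python, which gives access to the individual segments and their start times. The sketch below assumes the openai-whisper package is installed; the helper names transcribe_to_segments, format_timestamp, and segments_to_lines are hypothetical and only illustrate one way to turn segments into lines of text.

```python
def transcribe_to_segments(audio_path, model_name="medium", language="tr"):
    """Transcribe an audio file with Whisper and return its segments."""
    import whisper  # imported lazily; requires `pip install openai-whisper`

    model = whisper.load_model(model_name)
    result = model.transcribe(str(audio_path), language=language)
    # Each segment is a dict that includes "start", "end", and "text"
    return result["segments"]


def format_timestamp(seconds):
    """Format a number of seconds as HH:MM:SS (fractions are dropped)."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def segments_to_lines(segments, with_timestamps=True):
    """Turn Whisper-style segments into one line of text per segment."""
    lines = []
    for segment in segments:
        text = segment["text"].strip()
        if with_timestamps:
            lines.append(f"[{format_timestamp(segment['start'])}] {text}")
        else:
            lines.append(text)
    return lines
```

Keeping the timestamps optional at this stage makes it easy to decide later whether they belong in the final PDF.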

3. Text Editing and Formatting

Depending on the output format, the raw transcript may contain timestamps (for example in the .srt and .vtt files), and it usually has no paragraph structure. In the text editing step:

Remove timestamps (or keep them if they are useful)
Create paragraph structure
Add speaker names (e.g., in interviews)
Remove filler sounds ("um", "uh")
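
Part of this cleanup can be scripted. The following is a rough sketch that uses only the Python standard library; the timestamp pattern assumes bracketed timestamps such as [00:01:23] or [00:00.000 --> 00:05.000], the filler list is illustrative and English-only, and clean_line and build_paragraphs are hypothetical helper names. Adding speaker names is not covered and would still be done by hand.

```python
import re

# Bracketed timestamps such as [00:01:23] or [00:00.000 --> 00:05.000]
TIMESTAMP_RE = re.compile(r"\[\d+:[\d:.,]+(?:\s*-->\s*\d+:[\d:.,]+)?\]\s*")
# Filler words with an optional comma on either side; adapt the list per language
FILLER_RE = re.compile(r",?\s*\b(?:um+|uh+|erm+)\b,?", re.IGNORECASE)


def clean_line(line):
    """Remove timestamps and filler words from one transcript line."""
    line = TIMESTAMP_RE.sub("", line)
    line = FILLER_RE.sub("", line)
    # Collapse the whitespace left behind by the removals
    return re.sub(r"\s+", " ", line).strip()


def build_paragraphs(lines, lines_per_paragraph=4):
    """Group cleaned, non-empty lines into paragraphs of a fixed size."""
    cleaned = [clean_line(line) for line in lines]
    cleaned = [line for line in cleaned if line]
    paragraphs = []
    for start in range(0, len(cleaned), lines_per_paragraph):
        paragraphs.append(" ".join(cleaned[start:start + lines_per_paragraph]))
    return paragraphs
```

Grouping by a fixed number of lines is a deliberately simple choice; splitting on long pauses between segments would likely give more natural paragraphs but needs the timestamp data.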

4. Create PDF File

Method 1: Via Word or LibreOffice

Paste the contents of ses.txt into Word
Format as desired
Save via "File > Save As > PDF" (in LibreOffice: "File > Export as PDF")

Method 2: Create PDF automatically with Python

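# Assumes the fpdf2 package (pip install fpdf2) and a Unicode font file, here DejaVuSans.ttf, in the working directory
from fpdf import FPDF

pdf = FPDF()
pdf.add_page()
# Core fonts such as Arial cannot encode Turkish letters like ş or ğ, so register a Unicode TTF font
pdf.add_font("DejaVu", "", "DejaVuSans.ttf")
pdf.set_font("DejaVu", size=12)
with open("ses.txt", "r", encoding="utf-8") as f:
    for line in f:
        # Return to the left margin after each block so the next line has the full page width
        pdf.multi_cell(0, 10, line.strip(), new_x="LMARGIN", new_y="NEXT")
pdf.output("video_ozeti.pdf")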

Extra: Slide-based or Interactive PDFs

You can enrich the transcript with visual elements using tools like Canva or Adobe InDesign and export the result as a PDF.
Interactive PDFs also support features such as links, buttons, and embedded audio files.

✅ Conclusion

Extracting audio from interactive videos, transcribing it, and converting it to PDF is a process that can be automated and is useful in many areas. With open-source tools such as FFmpeg and Whisper, it can be done free of charge; the accuracy of the result depends on the audio quality and on the Whisper model size you choose.