
Subtitles


Overview

The Subtitles task creates caption tracks (.vtt or .srt) for video or audio files.

It converts detected speech into synchronized subtitle text with timestamps that align with playback.

When a Subtitles task runs, it creates a Track file with kind: "subtitles" and outputs a .vtt or .srt text file containing caption data.


Example Output

{
  "id": "file_abc123xyz",
  "object": "track",
  "kind": "subtitles",
  "language": "en",
  "format": "vtt",
  "filename": "video-subtitles.vtt",
  "duration": 63.2,
  "filesize": 2048,
  "url": "https://live.ittybit.net/video-subtitles.vtt",
  "metadata": {},
  "created": "2025-01-01T01:23:45Z",
  "updated": "2025-01-01T01:23:45Z"
}
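
As a rough sketch of how a Track like the one above might be used in a web page, the helper below builds an HTML `<track>` tag from its `language`, `format`, and `url` fields. The `SubtitleTrack` interface, `escapeAttr`, and `toTrackElement` are hypothetical names introduced for illustration and are not part of the ittybit SDK. Browsers accept only WebVTT in `<track>`, so this sketch rejects "srt" tracks.

```typescript
// Subset of the Track fields used here, taken from the example output above.
interface SubtitleTrack {
  kind: "subtitles";
  language: string;
  format: "vtt" | "srt";
  url: string;
}

// Escape characters that are unsafe inside a double-quoted HTML attribute.
function escapeAttr(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Build a <track> tag for an HTML5 <video> or <audio> element.
// The label is the human-readable name shown in the player's caption menu.
function toTrackElement(track: SubtitleTrack, label: string): string {
  if (track.format !== "vtt") {
    throw new Error(`<track> requires WebVTT, got "${track.format}"`);
  }
  return (
    `<track kind="subtitles" srclang="${escapeAttr(track.language)}" ` +
    `label="${escapeAttr(label)}" src="${escapeAttr(track.url)}">`
  );
}

const trackHtml = toTrackElement(
  {
    kind: "subtitles",
    language: "en",
    format: "vtt",
    url: "https://live.ittybit.net/video-subtitles.vtt",
  },
  "English"
);
console.log(trackHtml);
```

The resulting tag would sit inside the `<video>` element that plays the source media.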

Creating a Subtitles Task

Subtitles tasks can be created directly for a file or URL.

They typically follow a Speech task, which transcribes the spoken audio.

import { IttybitClient } from "@ittybit/sdk";

const ittybit = new IttybitClient({
  apiKey: process.env.ITTYBIT_API_KEY!
});

const task = await ittybit.tasks.create({
  kind: "subtitles",
  url: "https://example.com/video.mp4",
  description: "Generate English subtitles from speech",
  webhook_url: "https://your-app.com/subtitles-webhook"
});

console.log("Task created:", task.id);
console.log("Status:", task.status);

When the task completes, ittybit will create a Track file in your project and send a webhook to your endpoint if webhook_url is defined.


Webhook Example

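import express from "express";

const app = express();
app.use(express.json()); // parse JSON webhook bodies into req.body

app.post("/subtitles-webhook", async (req, res) => {
  const { kind, status, results } = req.body || {};

  // Acknowledge and ignore anything that is not a completed Subtitles task.
  if (kind !== "subtitles" || status !== "completed" || !results) {
    return res.status(200).send("Not a completed Subtitles task");
  }

  console.log("Subtitle file URL:", results.url);
  console.log("Format:", results.format);

  res.status(200).send("Subtitles received");
});

app.listen(3000); // port is an example value; use whatever your app listens on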
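
The payload check in the handler above can also be pulled into a small pure function, which makes it testable without a running server. This is a minimal sketch that assumes the payload carries only the fields the handler reads (`kind`, `status`, `results.url`, `results.format`). The names `CompletedSubtitlesPayload` and `parseCompletedSubtitles` are hypothetical and not part of the ittybit SDK.

```typescript
// Shape assumed from the fields the webhook handler reads.
interface CompletedSubtitlesPayload {
  kind: "subtitles";
  status: "completed";
  results: { url: string; format: string };
}

// Return a typed payload for a completed Subtitles task, or null otherwise.
function parseCompletedSubtitles(body: unknown): CompletedSubtitlesPayload | null {
  if (typeof body !== "object" || body === null) return null;
  const { kind, status, results } = body as Record<string, unknown>;
  if (kind !== "subtitles" || status !== "completed") return null;
  if (typeof results !== "object" || results === null) return null;
  const { url, format } = results as Record<string, unknown>;
  if (typeof url !== "string" || typeof format !== "string") return null;
  return { kind: "subtitles", status: "completed", results: { url, format } };
}

const parsed = parseCompletedSubtitles({
  kind: "subtitles",
  status: "completed",
  results: { url: "https://live.ittybit.net/video-subtitles.vtt", format: "vtt" },
});
console.log(parsed ? parsed.results.url : "ignored");
```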

File Structure

| Property | Type | Description |
| --- | --- | --- |
| id | string | Unique file ID for the subtitle track. |
| object | string | Always "track". |
| kind | string | Always "subtitles". |
| language | string | Language code (ISO 639-1). |
| format | string | Output format: "vtt" or "srt". |
| filename | string | Name of the subtitle file. |
| duration | number | Duration of the associated media file in seconds. |
| filesize | number | Size of the subtitle file in bytes. |
| url | string | Publicly accessible subtitle file URL. |
| metadata | object | Reserved for additional details. |
| created / updated | string (ISO 8601) | Timestamps for creation and last update. |
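
The properties above could be modelled in TypeScript roughly as follows. This is a sketch derived from the table only; `SubtitleTrackFile` and `isSubtitleTrackFile` are hypothetical names, and the SDK may ship its own types that differ.

```typescript
// Type derived from the property table; not an official SDK type.
interface SubtitleTrackFile {
  id: string;
  object: "track";
  kind: "subtitles";
  language: string; // ISO 639-1 code, e.g. "en"
  format: "vtt" | "srt";
  filename: string;
  duration: number; // seconds
  filesize: number; // bytes
  url: string;
  metadata: Record<string, unknown>;
  created: string; // ISO 8601
  updated: string; // ISO 8601
}

// Runtime check for the fields that identify a subtitle Track file.
function isSubtitleTrackFile(value: unknown): value is SubtitleTrackFile {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    v.object === "track" &&
    v.kind === "subtitles" &&
    typeof v.id === "string" &&
    typeof v.language === "string" &&
    (v.format === "vtt" || v.format === "srt") &&
    typeof v.filename === "string" &&
    typeof v.duration === "number" &&
    typeof v.filesize === "number" &&
    typeof v.url === "string"
  );
}

const candidate: unknown = JSON.parse(`{
  "id": "file_abc123xyz",
  "object": "track",
  "kind": "subtitles",
  "language": "en",
  "format": "vtt",
  "filename": "video-subtitles.vtt",
  "duration": 63.2,
  "filesize": 2048,
  "url": "https://live.ittybit.net/video-subtitles.vtt",
  "metadata": {},
  "created": "2025-01-01T01:23:45Z",
  "updated": "2025-01-01T01:23:45Z"
}`);

if (isSubtitleTrackFile(candidate)) {
  console.log(candidate.filename, candidate.duration);
}
```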

Supported Inputs

Subtitles can be generated from:

  • Video files: .mp4, .mov, .webm

  • Audio files: .mp3, .wav, .m4a
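
Before creating a task, a client could check a URL or filename against the extensions listed above. The sketch below assumes that list is complete, which the service may not guarantee, so treat it as a convenience check rather than a substitute for handling task errors. `isSupportedInput` is a hypothetical helper, not an SDK function.

```typescript
// Extensions copied from the list above; the service may accept others.
const SUPPORTED_EXTENSIONS = new Set([".mp4", ".mov", ".webm", ".mp3", ".wav", ".m4a"]);

// True if the path or URL ends in one of the listed extensions.
function isSupportedInput(urlOrPath: string): boolean {
  // Ignore any query string or fragment before looking for the extension.
  const path = urlOrPath.split(/[?#]/)[0] ?? "";
  const dot = path.lastIndexOf(".");
  if (dot === -1) return false;
  return SUPPORTED_EXTENSIONS.has(path.slice(dot).toLowerCase());
}

console.log(isSupportedInput("https://example.com/video.mp4"));
console.log(isSupportedInput("notes.txt"));
```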


Example Workflow Integration

Subtitles tasks can be part of a broader Automation that processes uploaded media automatically.

{
  "name": "Generate subtitles for uploaded videos",
  "trigger": {
    "kind": "event",
    "event": "media.created"
  },
  "workflow": [
    { "kind": "video", "format": "mp4" },
    { "kind": "subtitles", "ref": "video-subtitles" }
  ],
  "status": "active"
}

When a new media file is created, this automation will encode the video and generate subtitles automatically.
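
If automation definitions are built in code, a typed helper can make sure a workflow includes a subtitles step. This is a sketch only: the `Automation` and `WorkflowStep` interfaces mirror the JSON above and are not official SDK types, and `withSubtitlesStep` is a hypothetical helper.

```typescript
// Shapes mirrored from the example automation JSON.
interface WorkflowStep {
  kind: string;
  format?: string;
  ref?: string;
}

interface Automation {
  name: string;
  trigger: { kind: string; event: string };
  workflow: WorkflowStep[];
  status: string;
}

// Return a copy of the automation that ends with a subtitles step,
// or the automation itself if one is already present.
function withSubtitlesStep(automation: Automation, ref: string): Automation {
  if (automation.workflow.some((step) => step.kind === "subtitles")) {
    return automation;
  }
  return {
    ...automation,
    workflow: [...automation.workflow, { kind: "subtitles", ref }],
  };
}

const encodeOnly: Automation = {
  name: "Generate subtitles for uploaded videos",
  trigger: { kind: "event", event: "media.created" },
  workflow: [{ kind: "video", format: "mp4" }],
  status: "active",
};

const withSubtitles = withSubtitlesStep(encodeOnly, "video-subtitles");
console.log(JSON.stringify(withSubtitles, null, 2));
```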


Example Output Format

Typical .vtt output:

WEBVTT

00:00:12.000 --> 00:00:14.500
Hello, and welcome to UkeTube. I'm Jesse Doe.

00:00:14.800 --> 00:00:18.280
And I'm John Doe. Today we're going to be learning Sandstorm by Darude.
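
To work with cues programmatically, for example to build a searchable transcript, the .vtt text can be split into timed entries. The parser below is a minimal sketch that handles simple files like the one above: it skips the header and any block without a timing line, and it ignores cue settings. It is not a full WebVTT implementation. `Cue`, `parseTimestamp`, and `parseVtt` are names introduced here for illustration.

```typescript
interface Cue {
  start: number; // seconds
  end: number; // seconds
  text: string;
}

// Convert "HH:MM:SS.mmm" or "MM:SS.mmm" into seconds.
function parseTimestamp(timestamp: string): number {
  return timestamp
    .trim()
    .split(":")
    .map(Number)
    .reduce((total, part) => total * 60 + part, 0);
}

// Split a WebVTT document into cues, one per blank-line-separated block.
function parseVtt(vtt: string): Cue[] {
  const cues: Cue[] = [];
  const blocks = vtt.replace(/\r\n/g, "\n").split(/\n{2,}/);
  for (const block of blocks) {
    // Timing line: "<start> --> <end>", optionally followed by cue settings.
    const match = /(\S+)[ \t]+-->[ \t]+(\S+)[^\n]*(?:\n([\s\S]*))?/.exec(block);
    if (!match) continue; // header, NOTE, or other non-cue block
    const [, startRaw = "", endRaw = "", text = ""] = match;
    cues.push({
      start: parseTimestamp(startRaw),
      end: parseTimestamp(endRaw),
      text: text.trim(),
    });
  }
  return cues;
}

const sampleVtt = [
  "WEBVTT",
  "",
  "00:00:12.000 --> 00:00:14.500",
  "Hello, and welcome to UkeTube. I'm Jesse Doe.",
  "",
  "00:00:14.800 --> 00:00:18.280",
  "And I'm John Doe. Today we're going to be learning Sandstorm by Darude.",
  "",
].join("\n");

const cues = parseVtt(sampleVtt);
console.log(cues);
```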

Common Use Cases

  • Generating closed captions for accessibility

  • Localizing content into multiple languages

  • Improving SEO and video searchability

  • Creating transcribed learning materials


Summary

The Subtitles task converts detected speech into time-coded captions for video or audio.

It produces .vtt or .srt files compatible with standard web and media players and can be chained with other tasks in workflows or automations.

