Building a Podcast Automation Platform with Ittybit

Part 4 - Chapters, Thumbnails, and Declarative Workflows

In the last article, we conquered video transcoding and AI transcription. Our pipeline can take a raw podcast recording and spit out pristine audio, optimized video, and accurate transcripts. That's powerful. But here's something I've learned from running my own podcast: having great content isn't enough. You need to make it discoverable.

Think about it. You've got a 90-minute conversation with incredible insights scattered throughout. Your viewer lands on YouTube, sees the runtime, and thinks "90 minutes? I don't have time for this." They close the tab. Meanwhile, buried at minute 47 is the exact answer to their question - but they'll never find it.

This is the discoverability problem, and it's killing long-form content. The solution? Make your content navigable. Chapters. Thumbnails. Key moments. Visual cues that say "here's where we talk about X, here's where Y happens."

And here's where things get really interesting: we've been building our processing pipeline imperatively - create task A, wait for result, create task B. But what if I told you there's a better way? A declarative approach where you describe what you want, and the system figures out how to do it?

Let me show you what I mean.

The Discoverability Problem (And Why It's Not Your Fault)

I used to publish 60-90 minute podcast episodes with zero chapters. Just a title, a description, and a play button. My analytics were depressing. Average view duration: 8 minutes. Completion rate: 4%. People would start watching, realize they couldn't scan the content, and bail.

Then I started adding chapters manually. I'd watch the whole episode, note timestamps where topics changed, and add them to YouTube. My stats immediately improved. Average view duration jumped to 18 minutes. Completion rate hit 12%. But the process was brutal - 45 minutes of work per episode, just to add timestamps.

Here's what manual chapter creation looks like:

[Opens video editor]
00:00 - "Welcome everyone..." → 00:00 Introduction
02:34 - "So let's talk about..." → 02:34 Getting Started with APIs
15:47 - "The next thing to consider..." → 15:47 Authentication Strategies
[... 30 more timestamps ...]
43:12 - "Here's where it gets interesting..." → Wait, what was this about again?
[Rewinds video]
[Takes notes]
[Formats for YouTube]
[Uploads]

It's mind-numbing. And if you forget to add a chapter? Too bad. You'd have to re-upload or edit the video description.

We can do better.

Making Videos Navigable: The Three Pillars

There are three things that make long-form video content navigable:

  1. Chapters - Timestamped sections with descriptive titles
  2. Thumbnails - Visual keyframes for each chapter
  3. Subtitles - Searchable, accessible text overlays

We've already built subtitle generation (transcripts → VTT files). Now let's tackle chapters and thumbnails.

Intelligent Chapter Detection

In the last article, I showed you Ittybit's chapter detection task. But I didn't really explain how it works or why it's better than doing it yourself.

Here's what Ittybit does under the hood:

  1. Scene detection - Analyzes visual changes (new speaker, screen share starts, etc.)
  2. Audio analysis - Detects topic boundaries from speech patterns and pauses
  3. Transcript analysis - Uses NLP to identify topic transitions
  4. Heuristics - Applies rules about minimum chapter length, natural breaks

The result? Chapters that actually make sense. Not just "every 5 minutes," but "when the topic genuinely changes."

Let me show you how to configure this properly:

<?php

namespace App\Services;

class IttybitService
{
    // ... previous methods ...

    /**
     * Create a chapter detection task with smart defaults
     */
    public function createChaptersTask(
        string $fileId,
        array $options = []
    ): array {
        $taskData = [
            'file_id' => $fileId,
            'kind' => 'chapters',

            // Minimum chapter length (default: 2 minutes)
            // Too short = too many chapters (overwhelming)
            // Too long = not granular enough
            'min_duration' => $options['min_duration'] ?? 120,

            // Maximum chapter length (optional)
            // Force breaks in very long segments
            'max_duration' => $options['max_duration'] ?? 600, // 10 minutes

            // Use transcript for better chapter titles
            'use_transcript' => $options['use_transcript'] ?? true,

            // Chapter title style
            'title_style' => $options['title_style'] ?? 'descriptive', // or 'concise'

            // Scene detection sensitivity
            'scene_threshold' => $options['scene_threshold'] ?? 0.4, // 0.0 (strict) to 1.0 (loose)

            'ref' => $options['ref'] ?? 'chapters',
        ];

        if ($webhookUrl = config('services.ittybit.webhook_url')) {
            $taskData['webhook_url'] = $webhookUrl;
        }

        return $this->createTask($taskData);
    }
}

The min_duration and max_duration settings are crucial. I learned this the hard way. My first attempt at automatic chapters gave me 47 chapters in a 60-minute video. That's not helpful; that's just a transcript with timestamps. After some experimentation, I settled on:

  • Min duration: 2 minutes (120 seconds) - Short enough to be granular, long enough to be meaningful
  • Max duration: 10 minutes (600 seconds) - Forces breaks even in marathon discussions

The scene_threshold controls sensitivity. Lower values mean "only create chapters at obvious transitions." Higher values mean "be liberal about what counts as a topic change." I keep mine at 0.4, which seems to hit the sweet spot.
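
To make the min/max idea concrete, here's a minimal sketch of what those two constraints do to a list of candidate topic boundaries. This is purely illustrative and not how Ittybit implements it; the function name constrainChapterBoundaries and its parameters are my own invention for this example.

```php
<?php

declare(strict_types=1);

/**
 * Illustrative only (NOT Ittybit's algorithm): apply minimum and maximum
 * chapter lengths to a list of candidate boundaries.
 *
 * @param array<int|float> $candidates Candidate boundary times in seconds
 * @return float[] Chapter start times, always beginning at 0.0
 */
function constrainChapterBoundaries(
    array $candidates,
    float $totalDuration,
    float $minDuration = 120,
    float $maxDuration = 600
): array {
    sort($candidates);

    // Pass 1: drop any boundary that would leave a chapter shorter than
    // $minDuration on either side of it.
    $kept = [0.0];
    foreach ($candidates as $time) {
        $longEnoughBefore = $time - end($kept) >= $minDuration;
        $longEnoughAfter = $totalDuration - $time >= $minDuration;
        if ($longEnoughBefore && $longEnoughAfter) {
            $kept[] = (float) $time;
        }
    }

    // Pass 2: force a break every $maxDuration seconds inside long chapters,
    // unless the leftover tail would be shorter than $minDuration.
    $result = [];
    $count = count($kept);
    foreach ($kept as $i => $start) {
        $end = $i + 1 < $count ? $kept[$i + 1] : $totalDuration;
        $result[] = $start;
        for ($t = $start + $maxDuration; $end - $t >= $minDuration; $t += $maxDuration) {
            $result[] = $t;
        }
    }

    return $result;
}
```

With candidates at 30, 150, 200, and 1500 seconds in a 30-minute episode, the boundaries at 30 and 200 are dropped for being too close to their neighbors, and the long stretch from 150 to 1500 gets forced breaks at 750 and 1350.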

Generating Chapter Thumbnails

Chapters are great, but you know what's even better? Visual previews. A thumbnail for each chapter lets viewers scan your video visually and jump to the part that interests them. As far as I can tell, YouTube picks its own preview frame for each chapter rather than accepting custom uploads, so these images pay off most in players and pages you control, like your podcast website or an embedded player.

Manually creating chapter thumbnails means:

  1. Scrubbing through the video
  2. Finding a representative frame for each chapter
  3. Taking screenshots
  4. Cropping and formatting
  5. Uploading them wherever your player needs them

For a 10-chapter video, that's 30-45 minutes of work. Let's automate it:

<?php

namespace App\Services;

class IttybitService
{
    // ... previous methods ...

    /**
     * Create thumbnails at specific timestamps
     */
    public function createThumbnailsTask(
        string $fileId,
        array $options = []
    ): array {
        $taskData = [
            'file_id' => $fileId,
            'kind' => 'thumbnails',

            // Generate thumbnails at these timestamps (seconds)
            'timestamps' => $options['timestamps'] ?? [0], // Default: just first frame

            // Or generate at chapter markers
            'from_chapters' => $options['from_chapters'] ?? true,

            // Thumbnail dimensions
            'width' => $options['width'] ?? 1280,
            'height' => $options['height'] ?? 720,
            'format' => 'jpg',
            'quality' => 85, // JPEG quality

            'ref' => $options['ref'] ?? 'thumbnails',
        ];

        if ($webhookUrl = config('services.ittybit.webhook_url')) {
            $taskData['webhook_url'] = $webhookUrl;
        }

        return $this->createTask($taskData);
    }

    /**
     * Create a single hero thumbnail (for video preview)
     */
    public function createHeroThumbnail(
        string $fileId,
        array $options = []
    ): array {
        // Optional explicit timestamp in seconds (for example, 30% of the
        // runtime is usually past the intro). If none is provided, we omit
        // the field and Ittybit will auto-select the most visually
        // interesting frame.
        $timestamp = $options['timestamp'] ?? null;

        $taskData = [
            'file_id' => $fileId,
            'kind' => 'image',
            'width' => 1280,
            'height' => 720,
            'format' => 'jpg',
            'quality' => 90, // Higher quality for hero image
            'ref' => 'hero_thumbnail',
        ];

        // Only send a timestamp when the caller asked for one
        if ($timestamp !== null) {
            $taskData['timestamp'] = $timestamp;
        }

        if ($webhookUrl = config('services.ittybit.webhook_url')) {
            $taskData['webhook_url'] = $webhookUrl;
        }

        return $this->createTask($taskData);
    }
}

Here's the beautiful part: set from_chapters: true and Ittybit automatically extracts thumbnails at each chapter boundary. You get a thumbnail for every chapter without specifying timestamps. It just works.
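
If you'd rather control the frames yourself, the timestamps option is the manual alternative. Here's a small sketch of how you might derive those timestamps from chapter data. The helper name thumbnailTimestamps and the two-second offset are my own assumptions, and the chapter shape (timestamp and duration in seconds) is the one this article's API response uses, which may differ from what you actually store.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: pick one thumbnail timestamp per chapter, nudged a
 * little past the chapter start so we avoid transition frames.
 *
 * @param array<int, array{timestamp: int|float, duration?: int|float}> $chapters
 * @return float[] Timestamps in seconds, one per chapter
 */
function thumbnailTimestamps(array $chapters, float $offset = 2.0): array
{
    $timestamps = [];

    foreach ($chapters as $chapter) {
        $start = (float) $chapter['timestamp'];
        $duration = (float) ($chapter['duration'] ?? 0);

        // Never step past the midpoint of a very short chapter.
        $timestamps[] = $start + min($offset, $duration / 2);
    }

    return $timestamps;
}
```

You would then pass the result as the timestamps option to createThumbnailsTask and set from_chapters to false.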

The Problem with Imperative Workflows

Alright, we've been building this system imperatively. Here's what our current ProcessEpisodeJob does:

1. Create media object
2. Create audio task
3. Create video task
4. Create transcript task
5. Wait for webhook (audio done)
6. Wait for webhook (video done)
7. Wait for webhook (transcript done)
8. Then create chapters task
9. Wait for webhook (chapters done)
10. Then create thumbnails task
11. Wait for webhook (thumbnails done)
12. Finally, dispatch upload jobs

See the problem? It's a waterfall. Each step waits for the previous. If we want to add a new task - say, generating social media clips - we have to modify the job, add more webhook handling, update the status tracking. It's fragile. It's hard to test. And it doesn't scale.

This is where I discovered Ittybit's Automations feature, and it changed everything.

Declarative Workflows: Automations

Here's the big idea: instead of writing code that says "do A, then B, then C," you write a configuration that says "when media is created, here's everything I want done." The system figures out dependencies, parallelization, and ordering.

It's like the difference between:

Imperative (what we've been doing):

$audio = createAudioTask($file);
wait($audio);
$video = createVideoTask($file);
wait($video);
$transcript = createTranscriptTask($file);
wait($transcript);
$chapters = createChaptersTask($file);
wait($chapters);

Declarative (automations):

{
  "trigger": "media.created",
  "workflow": [
    { "kind": "audio" },
    { "kind": "video" },
    { "kind": "speech" },
    { "kind": "chapters", "depends": ["speech"] },
    { "kind": "thumbnails", "depends": ["chapters"] }
  ]
}

The system sees: "Audio, video, and transcript can run in parallel. Chapters needs the transcript. Thumbnails need chapters." It orchestrates everything automatically.
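
If you're curious what "figuring out the ordering" involves, here's a rough sketch of dependency resolution over the simplified config above. It groups steps into waves, where everything in a wave can run in parallel. This is my own illustration, not Ittybit's scheduler: the function name executionWaves is hypothetical, and it assumes each kind appears only once in the workflow.

```php
<?php

declare(strict_types=1);

/**
 * Illustrative sketch: group workflow steps into waves that can run in
 * parallel, based on each step's optional 'depends' list.
 *
 * @param array<int, array{kind: string, depends?: string[]}> $workflow
 * @return array<int, string[]> Waves of step kinds, in execution order
 */
function executionWaves(array $workflow): array
{
    // Map each step to the steps it is still waiting on.
    $pending = [];
    foreach ($workflow as $step) {
        $pending[$step['kind']] = $step['depends'] ?? [];
    }

    $done = [];
    $waves = [];

    while ($pending !== []) {
        // A step is ready once all of its dependencies have finished.
        $ready = [];
        foreach ($pending as $kind => $depends) {
            if (array_diff($depends, $done) === []) {
                $ready[] = $kind;
            }
        }

        // Nothing ready but work remains: a cycle or a missing step.
        if ($ready === []) {
            throw new RuntimeException('Circular or unknown dependency in workflow');
        }

        foreach ($ready as $kind) {
            unset($pending[$kind]);
        }

        $done = array_merge($done, $ready);
        $waves[] = $ready;
    }

    return $waves;
}
```

Feeding it the five-step config above yields three waves: audio, video, and speech together, then chapters, then thumbnails.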

Let me show you how to build this.

Creating Your First Automation

Automations are created once and run automatically every time you upload new media. Let's build a complete podcast processing automation:

<?php

namespace App\Services;

use Illuminate\Support\Facades\Http;
use Illuminate\Support\Facades\Log;

class IttybitService
{
    // ... previous methods ...

    /**
     * Create a podcast processing automation
     */
    public function createPodcastAutomation(array $options = []): array
    {
        $automationData = [
            'name' => 'Podcast Episode Processing',
            'description' => 'Complete workflow for podcast episodes: audio, video, transcript, chapters, thumbnails',

            // Trigger: run this automation when new media is created
            'trigger' => [
                'kind' => 'event',
                'event' => 'media.created',
            ],

            // The workflow: declarative task definitions
            'workflow' => [
                // Step 1: Extract audio (runs immediately)
                [
                    'kind' => 'audio',
                    'format' => 'mp3',
                    'bitrate' => 192000,
                    'channels' => 2,
                    'sample_rate' => 44100,
                    'ref' => 'podcast_audio',
                ],

                // Step 2: Transcode video for YouTube (runs in parallel with audio)
                [
                    'kind' => 'video',
                    'format' => 'mp4',
                    'codec' => 'h264',
                    'profile' => 'high',
                    'width' => 1920,
                    'height' => 1080,
                    'fps' => 30,
                    'bitrate' => 8000000,
                    'audio_codec' => 'aac',
                    'audio_bitrate' => 192000,
                    'movflags' => '+faststart',
                    'ref' => 'youtube_video',
                ],

                // Step 3: Generate transcript (runs in parallel)
                [
                    'kind' => 'speech',
                    'language' => 'en',
                    'format' => 'vtt',
                    'word_timestamps' => true,
                    'punctuation' => true,
                    'ref' => 'transcript',
                ],

                // Step 4: Detect chapters (waits for transcript)
                [
                    'kind' => 'chapters',
                    'min_duration' => 120,
                    'max_duration' => 600,
                    'use_transcript' => true,
                    'title_style' => 'descriptive',
                    'ref' => 'chapters',
                    // This runs after transcript completes
                ],

                // Step 5: Generate thumbnails (waits for chapters)
                [
                    'kind' => 'thumbnails',
                    'from_chapters' => true,
                    'width' => 1280,
                    'height' => 720,
                    'format' => 'jpg',
                    'quality' => 85,
                    'ref' => 'chapter_thumbnails',
                ],

                // Step 6: Hero thumbnail (runs immediately, picks best frame)
                [
                    'kind' => 'image',
                    'width' => 1280,
                    'height' => 720,
                    'format' => 'jpg',
                    'quality' => 90,
                    'ref' => 'hero_thumbnail',
                ],
            ],

            // Active by default
            'status' => 'active',
        ];

        $response = Http::withHeaders([
            'Authorization' => "Bearer {$this->apiKey}",
            'Accept-Version' => '2025-08-20',
        ])->post("{$this->baseUrl}/automations", $automationData);

        if ($response->failed()) {
            Log::error('Failed to create automation', [
                'status' => $response->status(),
                'body' => $response->body(),
            ]);

            throw new \Exception('Failed to create automation');
        }

        return $response->json();
    }

    /**
     * Get automation details
     */
    public function getAutomation(string $automationId): array
    {
        $response = Http::withHeaders([
            'Authorization' => "Bearer {$this->apiKey}",
            'Accept-Version' => '2025-08-20',
        ])->get("{$this->baseUrl}/automations/{$automationId}");

        if ($response->failed()) {
            throw new \Exception("Failed to fetch automation: {$automationId}");
        }

        return $response->json();
    }

    /**
     * List all automations
     */
    public function listAutomations(): array
    {
        $response = Http::withHeaders([
            'Authorization' => "Bearer {$this->apiKey}",
            'Accept-Version' => '2025-08-20',
        ])->get("{$this->baseUrl}/automations");

        if ($response->failed()) {
            throw new \Exception('Failed to list automations');
        }

        return $response->json();
    }
}

Look at that workflow. Six tasks. Some parallel, some sequential. Zero imperative code. Just a description of what we want.

Here's how Ittybit executes it:

Trigger: Media created

  ├─→ Audio task (parallel)
  ├─→ Video task (parallel)
  ├─→ Transcript task (parallel)
  └─→ Hero thumbnail (parallel)

  [Wait for transcript to complete]

  Chapters task

  [Wait for chapters to complete]

  Chapter thumbnails task

  [All tasks complete]

  Webhook: workflow.completed

Ittybit handles all the orchestration. It knows chapters need the transcript, so it waits. It knows thumbnails need chapters, so it waits. But audio, video, and transcript can all run simultaneously, so they do.

Conditional Workflows: Smart Routing

Here's where automations get really powerful. You can add conditional logic:

public function createSmartPodcastAutomation(): array
{
    $automationData = [
        'name' => 'Smart Podcast Processing',
        'trigger' => [
            'kind' => 'event',
            'event' => 'media.created',
        ],
        'workflow' => [
            // Always: extract audio
            [
                'kind' => 'audio',
                'format' => 'mp3',
                'bitrate' => 192000,
                'ref' => 'podcast_audio',
            ],

            // Always: generate transcript
            [
                'kind' => 'speech',
                'language' => 'en',
                'format' => 'vtt',
                'ref' => 'transcript',
            ],

            // Conditional: only transcode video if source is video
            [
                'kind' => 'conditions',
                'conditions' => [
                    ['prop' => 'media.kind', 'value' => 'video']
                ],
                'next' => [
                    [
                        'kind' => 'video',
                        'format' => 'mp4',
                        'codec' => 'h264',
                        'width' => 1920,
                        'height' => 1080,
                        'ref' => 'youtube_video',
                    ],
                    [
                        'kind' => 'chapters',
                        'use_transcript' => true,
                        'ref' => 'chapters',
                    ],
                    [
                        'kind' => 'thumbnails',
                        'from_chapters' => true,
                        'ref' => 'chapter_thumbnails',
                    ],
                ],
            ],
        ],
        'status' => 'active',
    ];

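    // createAutomation() is assumed to be a small helper that wraps the same
    // POST /automations request createPodcastAutomation() makes inline
    return $this->createAutomation($automationData);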
}

See the conditions block? It says "only do these tasks if the media is video." This is perfect for podcasters who sometimes publish audio-only episodes and sometimes publish video. The automation adapts.

You can check:

  • media.kind - video, audio, or image
  • media.duration - length in seconds
  • media.width / media.height - dimensions
  • file.filesize - file size in bytes

Want to skip transcoding for short clips? Add a condition:

[
    'kind' => 'conditions',
    'conditions' => [
        ['prop' => 'media.duration', 'operator' => 'gt', 'value' => 60]
    ],
    'next' => [ /* tasks for videos longer than 1 minute */ ]
]
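
To reason about these conditions locally (in a unit test, say), it can help to have a tiny evaluator. The sketch below is my own reading of the semantics, not Ittybit's implementation: I'm assuming a missing operator means equality, that all conditions must pass, and that 'lt' exists as the mirror of 'gt'. The function name conditionsPass and the flat, dot-keyed context array are hypothetical.

```php
<?php

declare(strict_types=1);

/**
 * Illustrative evaluator for a 'conditions' block.
 * Assumptions: operator defaults to 'eq', conditions are ANDed together,
 * and $context is a flat array keyed by prop name (e.g. 'media.kind').
 *
 * @param array<int, array{prop: string, value: mixed, operator?: string}> $conditions
 * @param array<string, mixed> $context
 */
function conditionsPass(array $conditions, array $context): bool
{
    foreach ($conditions as $condition) {
        $actual = $context[$condition['prop']] ?? null;
        $expected = $condition['value'];
        $operator = $condition['operator'] ?? 'eq';

        $ok = match ($operator) {
            'eq' => $actual == $expected,
            'gt' => $actual !== null && $actual > $expected,
            'lt' => $actual !== null && $actual < $expected,
            default => throw new InvalidArgumentException("Unsupported operator: {$operator}"),
        };

        if (!$ok) {
            return false;
        }
    }

    return true;
}
```

A 95-second video passes both the media.kind and the media.duration checks shown above, while a 45-second clip fails the duration check and skips the nested tasks.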

Simplifying Our Application Code

Here's the beautiful part: with automations set up, our application code becomes trivial. Watch this:

Before (imperative approach):

class ProcessEpisodeJob implements ShouldQueue
{
    public function handle(IttybitService $ittybit): void
    {
        // Create media
        $media = $ittybit->createMediaFromFile(...);

        // Create audio task
        $audioTask = $ittybit->createAudioTask(...);
        ProcessingTask::create([...]);

        // Create video task
        $videoTask = $ittybit->createVideoTask(...);
        ProcessingTask::create([...]);

        // Create transcript task
        $transcriptTask = $ittybit->createTranscriptTask(...);
        ProcessingTask::create([...]);

        // Create chapters task
        $chaptersTask = $ittybit->createChaptersTask(...);
        ProcessingTask::create([...]);

        // Create thumbnails task
        $thumbnailsTask = $ittybit->createThumbnailsTask(...);
        ProcessingTask::create([...]);

        // 50+ lines of setup code
    }
}

After (with automations):

class ProcessEpisodeJob implements ShouldQueue
{
    public function handle(IttybitService $ittybit): void
    {
        // Create media - automation handles the rest
        $media = $ittybit->createMediaFromFile(
            fileId: $this->episode->ittybit_file_id,
            metadata: [
                'title' => $this->episode->title,
                'description' => $this->episode->description,
            ]
        );

        $this->episode->update([
            'ittybit_media_id' => $media['id'],
            'status' => 'processing',
            'processing_started_at' => now(),
        ]);

        // That's it. The automation runs automatically.
    }
}

From 50+ lines to about 15. And the automation runs for every media object you create, not just this one episode. Create a media object, and the entire pipeline executes. That's the power of declarative configuration.

Setting Up Automations: One-Time Setup

Automations are persistent. You create them once, and they run forever (or until you delete/disable them). Here's how I recommend setting them up:

1. Create a setup command:

<?php

namespace App\Console\Commands;

use App\Services\IttybitService;
use Illuminate\Console\Command;

class SetupIttybitAutomations extends Command
{
    protected $signature = 'ittybit:setup-automations';
    protected $description = 'Create Ittybit automations for podcast processing';

    public function __construct(
        private IttybitService $ittybit
    ) {
        parent::__construct();
    }

    public function handle(): int
    {
        $this->info('Setting up Ittybit automations...');

        try {
            // Check if automation already exists
            $automations = $this->ittybit->listAutomations();
            $existing = collect($automations)->firstWhere('name', 'Podcast Episode Processing');

            if ($existing) {
                $this->warn('Podcast automation already exists (ID: ' . $existing['id'] . ')');

                if (!$this->confirm('Do you want to recreate it?')) {
                    return 0;
                }

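                // Delete existing (assumes you've added a deleteAutomation()
                // helper to IttybitService; it isn't shown in this article)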
                $this->ittybit->deleteAutomation($existing['id']);
                $this->info('Deleted existing automation');
            }

            // Create new automation
            $automation = $this->ittybit->createPodcastAutomation();

            $this->info('✓ Created podcast automation: ' . $automation['id']);
            $this->info('✓ Status: ' . $automation['status']);
            $this->info('✓ Workflow steps: ' . count($automation['workflow']));

            // Store automation ID in config/cache
            cache()->forever('ittybit.automation_id', $automation['id']);

            return 0;

        } catch (\Exception $e) {
            $this->error('Failed to create automation: ' . $e->getMessage());
            return 1;
        }
    }
}

2. Run it once during setup:

php artisan ittybit:setup-automations

3. Store the automation ID:

// config/services.php
return [
    'ittybit' => [
        'api_key' => env('ITTYBIT_API_KEY'),
        'webhook_url' => env('APP_URL') . '/webhooks/ittybit',
        'automation_id' => env('ITTYBIT_AUTOMATION_ID'),
    ],
];

Now your automation runs for every media object you create. Forever. Until you explicitly disable or delete it.

Tracking Automation Results

When an automation runs, Ittybit creates a "workflow task" that tracks the entire execution. Each step in the workflow gets its own task ID, but they're all linked to the parent workflow.

Let's update our webhook handler to track automation workflows:

public function ittybit(Request $request): JsonResponse
{
    $payload = $request->all();

    Log::info('Received Ittybit webhook', [
        'task_id' => $payload['id'] ?? null,
        'kind' => $payload['kind'] ?? null,
        'status' => $payload['status'] ?? null,
    ]);

    // Check if this is a workflow completion
    if (($payload['kind'] ?? null) === 'workflow') {
        return $this->handleWorkflowWebhook($payload);
    }

    // Handle individual task completion (existing code)
    // ...
}

private function handleWorkflowWebhook(array $payload): JsonResponse
{
    $mediaId = $payload['input']['media_id'] ?? null;

    if (!$mediaId) {
        return response()->json(['error' => 'No media_id in workflow payload'], 400);
    }

    // Find the episode by media ID
    $episode = Episode::where('ittybit_media_id', $mediaId)->first();

    if (!$episode) {
        Log::warning('Workflow completed for unknown episode', [
            'media_id' => $mediaId,
        ]);
        return response()->json(['status' => 'ignored']);
    }

    $status = $payload['status'] ?? null;

    if ($status === 'completed') {
        // Entire workflow completed successfully
        Log::info('Workflow completed', [
            'episode_id' => $episode->id,
            'media_id' => $mediaId,
        ]);

        // Fetch the media object to get all processed files
        $media = app(IttybitService::class)->getMedia($mediaId);

        // Extract URLs by ref
        $files = collect($media['files'] ?? []);
        $audioFile = $files->firstWhere('ref', 'podcast_audio');
        $videoFile = $files->firstWhere('ref', 'youtube_video');
        $transcriptFile = $files->firstWhere('ref', 'transcript');
        $chaptersFile = $files->firstWhere('ref', 'chapters');
        $heroThumb = $files->firstWhere('ref', 'hero_thumbnail');

        // Assumption: each chapter thumbnail comes back as its own file
        // carrying the 'chapter_thumbnails' ref
        $chapterThumbs = $files->where('ref', 'chapter_thumbnails')
            ->pluck('url')
            ->values()
            ->all();

        // Update episode with all outputs
        $episode->update([
            'status' => 'processed',
            'processing_completed_at' => now(),
            'audio_file_id' => $audioFile['id'] ?? null,
            'audio_url' => $audioFile['url'] ?? null,
            'video_file_id' => $videoFile['id'] ?? null,
            'video_url' => $videoFile['url'] ?? null,
            'hero_thumbnail_url' => $heroThumb['url'] ?? null,
            'chapter_thumbnails' => $chapterThumbs ?: null,
        ]);

        // Process transcript
        if ($transcriptFile) {
            $this->processTranscript($episode, $transcriptFile);
        }

        // Process chapters
        if ($chaptersFile) {
            $this->processChapters($episode, $chaptersFile);
        }

        // Dispatch upload jobs
        UploadToTransistorJob::dispatch($episode);
        UploadToYouTubeJob::dispatch($episode);

    } elseif ($status === 'failed') {
        Log::error('Workflow failed', [
            'episode_id' => $episode->id,
            'error' => $payload['error'] ?? 'Unknown error',
        ]);

        $episode->update(['status' => 'failed']);
    }

    return response()->json(['status' => 'processed']);
}

private function processTranscript(Episode $episode, array $transcriptFile): void
{
    try {
        $content = file_get_contents($transcriptFile['url']);
        $segments = app(TranscriptService::class)->parseVtt($content);
        $plainText = app(TranscriptService::class)->extractPlainText($segments);

        $episode->update(['transcript' => $plainText]);

        Storage::put(
            "transcripts/{$episode->id}.json",
            json_encode(['segments' => $segments], JSON_PRETTY_PRINT)
        );

    } catch (\Exception $e) {
        Log::error('Failed to process transcript', [
            'episode_id' => $episode->id,
            'error' => $e->getMessage(),
        ]);
    }
}

private function processChapters(Episode $episode, array $chaptersFile): void
{
    try {
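        $content = file_get_contents($chaptersFile['url']);
        $chapters = json_decode($content, true);

        // Persist on the episode so the API endpoint can return them
        $episode->update(['chapters' => $chapters['chapters'] ?? []]);

        Storage::put(
            "chapters/{$episode->id}.json",
            json_encode($chapters, JSON_PRETTY_PRINT)
        );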

        Log::info('Processed chapters', [
            'episode_id' => $episode->id,
            'chapter_count' => count($chapters['chapters'] ?? []),
        ]);

    } catch (\Exception $e) {
        Log::error('Failed to process chapters', [
            'episode_id' => $episode->id,
            'error' => $e->getMessage(),
        ]);
    }
}

Beautiful. One webhook tells us the entire workflow completed. We fetch the media object, grab all the processed files by their ref values, and we're done.
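
One thing I'd consider adding: the handler above quietly stores null when an expected file is absent. A small guard makes that visible. This is a sketch under my own assumptions; the helper name missingRefs is hypothetical, and the only thing it relies on is that each file entry carries a ref key, as in the handler.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical guard: report which expected refs are absent from a media
 * object's files list, so a "completed" workflow with a missing output
 * doesn't slip through unnoticed.
 *
 * @param array<int, array<string, mixed>> $files
 * @param string[] $expectedRefs
 * @return string[] Refs that were expected but not found
 */
function missingRefs(array $files, array $expectedRefs): array
{
    // array_column skips entries that have no 'ref' key at all.
    $present = array_column($files, 'ref');

    return array_values(array_diff($expectedRefs, $present));
}
```

In handleWorkflowWebhook you could call it on $media['files'] right after fetching the media object and log a warning, or hold off on dispatching the upload jobs, whenever the returned list is not empty.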

The Database Update: Storing Chapter Data

Let's add columns to store chapter and thumbnail data:

<?php

use Illuminate\Database\Migrations\Migration;
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\Schema;

return new class extends Migration
{
    public function up(): void
    {
        Schema::table('episodes', function (Blueprint $table) {
            // Thumbnails
            $table->string('hero_thumbnail_url')->nullable()->after('video_url');
            $table->json('chapter_thumbnails')->nullable()->after('hero_thumbnail_url');

            // Chapters (stored as JSON)
            $table->json('chapters')->nullable()->after('transcript');
        });
    }

    public function down(): void
    {
        Schema::table('episodes', function (Blueprint $table) {
            $table->dropColumn([
                'hero_thumbnail_url',
                'chapter_thumbnails',
                'chapters',
            ]);
        });
    }
};

Update the Episode model:

<?php

namespace App\Models;

use Illuminate\Database\Eloquent\Model;

class Episode extends Model
{
    protected $fillable = [
        // ... existing fields ...
        'hero_thumbnail_url',
        'chapter_thumbnails',
        'chapters',
    ];

    protected $casts = [
        // ... existing casts ...
        'chapter_thumbnails' => 'array',
        'chapters' => 'array',
    ];
}

Enhanced Episode API Response

Now our episode endpoint can return rich navigational data:

public function show(Request $request, string $id): JsonResponse
{
    $episode = Episode::where('id', $id)
        ->where('user_id', $request->user()->id)
        ->firstOrFail();

    return response()->json([
        'id' => $episode->id,
        'title' => $episode->title,
        'description' => $episode->description,
        'status' => $episode->status,

        'media' => [
            'audio_url' => $episode->audio_url,
            'video_url' => $episode->video_url,
            'hero_thumbnail_url' => $episode->hero_thumbnail_url,
        ],

        'transcript' => $episode->transcript ? [
            'text' => $episode->transcript,
            'segments' => Storage::exists("transcripts/{$episode->id}.json")
                ? json_decode(Storage::get("transcripts/{$episode->id}.json"), true)['segments']
                : null,
        ] : null,

        'chapters' => $episode->chapters ? [
            'items' => $episode->chapters,
            'thumbnails' => $episode->chapter_thumbnails,
        ] : null,

        'timestamps' => [
            'created_at' => $episode->created_at,
            'processing_started_at' => $episode->processing_started_at,
            'processing_completed_at' => $episode->processing_completed_at,
        ],
    ]);
}

Example response:

{
  "id": "9b1f4c4e-...",
  "title": "How to Build APIs That Don't Suck",
  "status": "processed",
  "media": {
    "audio_url": "https://you.ittybit.net/.../audio.mp3",
    "video_url": "https://you.ittybit.net/.../video.mp4",
    "hero_thumbnail_url": "https://you.ittybit.net/.../thumbnail.jpg"
  },
  "chapters": {
    "items": [
      {
        "timestamp": 0,
        "title": "Introduction: Why API Design Matters",
        "duration": 180
      },
      {
        "timestamp": 180,
        "title": "RESTful Principles Explained",
        "duration": 420
      },
      {
        "timestamp": 600,
        "title": "Authentication Strategies",
        "duration": 360
      }
    ],
    "thumbnails": [
      "https://you.ittybit.net/.../thumb_0.jpg",
      "https://you.ittybit.net/.../thumb_180.jpg",
      "https://you.ittybit.net/.../thumb_600.jpg"
    ]
  }
}
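
Since the whole point is to stop typing timestamps by hand, here's a sketch that turns the chapter items above into the timestamp lines YouTube reads from a video description. The function names formatTimestamp and youtubeChapterLines are my own. As far as I know, YouTube expects the first timestamp to be 00:00, at least three timestamps in ascending order, and chapters of at least 10 seconds, so check those before publishing.

```php
<?php

declare(strict_types=1);

/**
 * Format seconds as MM:SS, or H:MM:SS once we pass the one-hour mark.
 */
function formatTimestamp(int $seconds): string
{
    $hours = intdiv($seconds, 3600);
    $minutes = intdiv($seconds % 3600, 60);
    $secs = $seconds % 60;

    return $hours > 0
        ? sprintf('%d:%02d:%02d', $hours, $minutes, $secs)
        : sprintf('%02d:%02d', $minutes, $secs);
}

/**
 * Hypothetical helper: build description lines from chapter items shaped
 * like the example response (timestamp in seconds, title).
 *
 * @param array<int, array{timestamp: int|float, title: string}> $chapters
 */
function youtubeChapterLines(array $chapters): string
{
    $lines = [];

    foreach ($chapters as $chapter) {
        $lines[] = formatTimestamp((int) $chapter['timestamp']) . ' ' . $chapter['title'];
    }

    return implode("\n", $lines);
}
```

For the three chapters in the example response, this produces lines starting with 00:00, 03:00, and 10:00, each followed by the chapter title.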

The Complete Picture

Let me show you what the entire flow looks like with automations:

1. User uploads video
   └─> POST /episodes/prepare-upload
   └─> Upload to Ittybit (client-side)
   └─> POST /episodes/{id}/confirm-upload

2. ProcessEpisodeJob runs
   └─> Create media object in Ittybit
   └─> Automation triggers automatically

3. Ittybit automation executes (5-15 minutes)
   ├─> Audio extraction (parallel)
   ├─> Video transcoding (parallel)
   ├─> Transcript generation (parallel)
   ├─> Hero thumbnail (parallel)
   └─> [Wait for transcript]
       └─> Chapters detection
           └─> [Wait for chapters]
               └─> Chapter thumbnails

4. Webhook: workflow.completed
   └─> Fetch media object
   └─> Extract all processed files
   └─> Update episode record
   └─> Dispatch upload jobs

5. Episode ready to publish!

The imperative complexity is gone. We describe what we want, and Ittybit handles the orchestration.

Why This Matters: The Developer Experience

Here's what I love about declarative workflows: they're testable, versionable, and portable.

Testable: You can test an automation config without running it:

curl -X POST https://api.ittybit.com/automations/validate \
  -d @automation.json

Versionable: Store your automation configs in Git:

/automations
  ├─ podcast-standard.json
  ├─ podcast-premium.json
  └─ social-clips.json

Portable: Need a staging environment? Copy the automation config. Need to process old episodes differently? Create a second automation.

And if something breaks? Look at the automation config. The workflow is right there, not scattered across job classes and webhook handlers.
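
If you do keep the configs in Git, a loader with a few sanity checks catches typos before anything reaches the API. Here's a minimal sketch; the function name loadAutomationConfig is hypothetical, and the required keys are simply the ones used by the automations in this article, not an official schema.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical loader: read an automation config from a JSON file and make
 * sure the keys this article relies on are present.
 *
 * @return array<string, mixed>
 */
function loadAutomationConfig(string $path): array
{
    if (!is_readable($path)) {
        throw new RuntimeException("Cannot read automation config: {$path}");
    }

    // JSON_THROW_ON_ERROR turns malformed JSON into a JsonException.
    $config = json_decode((string) file_get_contents($path), true, 512, JSON_THROW_ON_ERROR);

    if (!is_array($config)) {
        throw new InvalidArgumentException('Automation config must be a JSON object');
    }

    foreach (['name', 'trigger', 'workflow'] as $key) {
        if (!isset($config[$key])) {
            throw new InvalidArgumentException("Automation config is missing '{$key}'");
        }
    }

    if (!is_array($config['workflow']) || $config['workflow'] === []) {
        throw new InvalidArgumentException('Automation workflow must contain at least one step');
    }

    return $config;
}
```

The setup command could then call loadAutomationConfig on a file such as automations/podcast-standard.json and send the result to the API instead of building the array in PHP.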

The DIY Comparison: Orchestration is Hard

Let me tell you what building this yourself looks like:

Task orchestration:

  • Build a DAG (directed acyclic graph) system
  • Implement dependency resolution
  • Handle parallel execution
  • Track task states across multiple workers
  • Retry failed tasks without re-running successful ones
  • Detect circular dependencies

Time investment: 3-4 weeks for basic orchestration. More for robustness.

Chapter detection:

  • Scene detection algorithms
  • Audio analysis
  • NLP for topic segmentation
  • Heuristics for good breaks
  • Testing on diverse content

Time investment: 2-3 weeks.

Thumbnail extraction:

  • Frame selection algorithms
  • Quality scoring
  • Batch processing
  • Format conversion

Time investment: 1 week.

Total DIY investment: 6-8 weeks, plus ongoing maintenance.

With Ittybit automations: The code we've written today. Maybe 1-2 days, including testing. And it just works.

What We've Built

  • Intelligent chapter detection - AI-powered topic segmentation
  • Automatic thumbnails - Visual previews for every chapter
  • Hero thumbnails - Engaging preview images
  • Declarative workflows - Configuration over code
  • Conditional logic - Smart routing based on content
  • Parallel execution - Maximum speed
  • Simple integration - One webhook, all results

Our videos are now navigable. Viewers can scan chapters, see thumbnails, jump to topics. And we did it without writing orchestration code.

What's Next

In Part 5, we'll tackle the final piece: automated distribution. We have all this processed media - audio, video, transcripts, chapters. Now we need to publish it:

  • Uploading audio to Transistor (podcast RSS feed)
  • Uploading video to YouTube with chapters and subtitles
  • OAuth authentication flows
  • Handling rate limits
  • Retry strategies
  • Success/failure notifications

We're in the home stretch. Our content is processed. Our videos are navigable. Now we just need to get them out into the world.

Until then, think about the workflows in your own applications. How many imperative job chains could be replaced with declarative configs? How much complexity could you eliminate?