Building a Podcast Automation Platform with Ittybit
Part 2 - Your First Media Processing Tasks
In the last article, we laid the groundwork: an API that accepts episode uploads and stores them in Ittybit. We ended with a file sitting in Ittybit's storage and a ProcessEpisodeJob being dispatched to our queue. That's where things get interesting.
See, here's the thing about media processing: it's slow. Not "your API took 500ms instead of 200ms" slow. I mean "go make a coffee, check Twitter, maybe walk your dog" slow. A 60-minute 1080p video can take 5-15 minutes to process, depending on what you're doing with it. Extract audio, transcode video, generate transcripts, detect chapters - each of these is compute-intensive work.
This is why we need queues. This is why we need webhooks. And this is why understanding async workflows isn't just a nice-to-have - it's fundamental to building media processing applications that don't collapse under their own weight.
Let's dig in.
The Reality of Media Processing
Before we write any code, I want you to understand what we're actually dealing with here. When you tell Ittybit "extract audio from this video," here's what happens behind the scenes:
- Ittybit queues your task
- A worker picks it up and downloads your video
- FFmpeg (or similar) decodes the video, extracts audio streams
- The audio gets transcoded to your desired format (MP3, AAC, etc.)
- The output file gets uploaded to Ittybit's CDN
- Ittybit sends you a webhook: "Hey, it's done"
For a 2.5GB, 60-minute video, this might take 8-12 minutes. If you're doing this synchronously - blocking your API response while waiting - you've got a timeout waiting to happen. Your HTTP client gives up. Your reverse proxy gives up. Everything gives up.
The solution? Embrace the async. Fire off the work, return immediately, and get notified when it's done.
The Processing Pipeline Architecture
Here's how our system will work:
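- The client uploads the raw file directly to Ittybit and then confirms the upload with our API (that's where Part 1 left off)
- ProcessEpisodeJob creates three Ittybit tasks - audio, video, transcript - and records each one in our database
- Ittybit processes those tasks in parallel and sends us a webhook as each one finishes
- Our webhook handler updates the matching task record; when every task for an episode is done, it dispatches CompleteEpisodeProcessingJob
- That job copies the output URLs onto the episode and marks it completed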
Let's build this step by step.
Expanding the Database Schema
First, we need a way to track individual processing tasks. Remember: we're firing off multiple tasks (audio, video, transcript) and they complete at different times. We need to know what's finished and what's still cooking.
I also want to add a few columns to our episodes table:
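Here's a sketch of both migrations. The table and column names (processing_tasks, output_file_id, and so on) are my choices for this series, not anything Ittybit mandates - rename them to taste. These snippets go inside the up() method of each migration.

```php
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\Schema;

// Migration 1: one row per Ittybit task we create.
Schema::create('processing_tasks', function (Blueprint $table) {
    $table->id();
    $table->foreignId('episode_id')->constrained()->cascadeOnDelete();
    $table->string('ittybit_task_id')->unique();   // the ID Ittybit returns when we create the task
    $table->string('type');                        // audio | video | transcript
    $table->string('status')->default('pending');  // pending | processing | completed | failed
    $table->string('output_file_id')->nullable();  // filled in when the task completes
    $table->string('output_url')->nullable();
    $table->text('error_message')->nullable();
    $table->timestamps();
});

// Migration 2: extra columns on episodes for the finished outputs.
Schema::table('episodes', function (Blueprint $table) {
    $table->string('ittybit_media_id')->nullable();
    $table->string('processed_audio_url')->nullable();
    $table->string('processed_video_url')->nullable();
    $table->string('transcript_url')->nullable();
    $table->timestamp('processing_completed_at')->nullable();
});
```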
Now we have a place to track everything.
Expanding the Ittybit Service
Before we build the job, let's add methods to our IttybitService for creating tasks:
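Here's roughly what those methods look like. I'm guessing at Ittybit's exact endpoint paths and payload fields ('/media', '/tasks', 'kind', 'format'), so treat them as placeholders and check the Ittybit API docs for the real shapes.

```php
// app/Services/IttybitService.php (additions)
namespace App\Services;

use Illuminate\Support\Facades\Http;

class IttybitService
{
    private function request()
    {
        // Assumed base URL and bearer-token auth; adjust to your config from Part 1.
        return Http::withToken(config('services.ittybit.key'))
            ->baseUrl('https://api.ittybit.com');
    }

    public function createMedia(string $fileId): array
    {
        // Wrap the uploaded file in a media container (endpoint path is an assumption).
        return $this->request()->post('/media', ['file_id' => $fileId])->throw()->json();
    }

    public function createAudioExtractionTask(string $fileId): array
    {
        return $this->createTask($fileId, [
            'kind'   => 'audio',
            'format' => 'mp3',           // sensible default for podcast audio
            'ref'    => 'podcast_audio', // lets us find the output later by its nickname
        ]);
    }

    public function createVideoTranscodeTask(string $fileId): array
    {
        return $this->createTask($fileId, [
            'kind'       => 'video',
            'format'     => 'mp4',
            'resolution' => '1080p',
            'ref'        => 'podcast_video',
        ]);
    }

    public function createTranscriptTask(string $fileId): array
    {
        return $this->createTask($fileId, [
            'kind' => 'speech',
            'ref'  => 'podcast_transcript',
        ]);
    }

    private function createTask(string $fileId, array $options): array
    {
        // If Ittybit supports a per-task webhook URL, you'd pass
        // config('services.ittybit.webhook_url') here as well.
        return $this->request()
            ->post('/tasks', array_merge(['file_id' => $fileId], $options))
            ->throw()
            ->json();
    }
}
```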
Notice how each task type has its own method with sensible defaults? This is intentional. I could have made one giant createTask() method that takes a million parameters, but that's a recipe for confusion. Better to have focused, purpose-built methods.
Also notice the ref parameter. This is Ittybit's way of letting you tag outputs. When you create an audio task with ref: 'podcast_audio', you can later find that file by its ref. It's like giving your files nicknames.
The ProcessEpisodeJob: Where the Magic Happens
Alright, time for the main event. This job is the conductor of our orchestra. It creates all the tasks, saves their IDs, and sets everything in motion.
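Here's a sketch of the job. The class and column names match the migrations above; anything Ittybit-specific (like reading $task['id'] out of the response) is an assumption about the API's response shape, and I'm assuming the episode already stores its Ittybit file ID from Part 1.

```php
// app/Jobs/ProcessEpisodeJob.php
namespace App\Jobs;

use App\Models\Episode;
use App\Services\IttybitService;
use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;
use Illuminate\Support\Facades\DB;

class ProcessEpisodeJob implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    public function __construct(public Episode $episode) {}

    public function handle(IttybitService $ittybit): void
    {
        try {
            DB::transaction(function () use ($ittybit) {
                // 1. Wrap the uploaded file in a media container.
                $media = $ittybit->createMedia($this->episode->ittybit_file_id);
                $this->episode->update(['ittybit_media_id' => $media['id']]);

                // 2. Fire off the three processing tasks; Ittybit runs them in parallel.
                $tasks = [
                    'audio'      => $ittybit->createAudioExtractionTask($this->episode->ittybit_file_id),
                    'video'      => $ittybit->createVideoTranscodeTask($this->episode->ittybit_file_id),
                    'transcript' => $ittybit->createTranscriptTask($this->episode->ittybit_file_id),
                ];

                // 3. Record each task so the webhook handler can find it later.
                foreach ($tasks as $type => $task) {
                    $this->episode->tasks()->create([
                        'ittybit_task_id' => $task['id'],
                        'type'            => $type,
                        'status'          => 'pending',
                    ]);
                }

                // 4. uploaded -> processing
                $this->episode->update(['status' => 'processing']);
            });
        } catch (\Throwable $e) {
            // The transaction has already rolled back; record the failure and
            // rethrow so the queue worker knows to retry.
            $this->episode->update(['status' => 'failed']);
            throw $e;
        }
    }
}
```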
Let me walk you through what's happening here:
The transaction wrapper - We wrap everything in a database transaction because we're creating multiple related records. If creating the third task fails, we don't want the first two hanging around orphaned.
Creating the media object - This associates our uploaded file with a media container in Ittybit. Think of it like creating a project that all related files belong to.
Three parallel tasks - We're creating three tasks that Ittybit will process in parallel:
- Audio extraction (8-12 minutes for a 60-min video)
- Video transcoding (10-15 minutes)
- Transcript generation (5-8 minutes)
The status flow - We're moving from uploaded → processing. Later, when all tasks complete, we'll move to completed.
Error handling - If anything fails, we roll back, mark the episode as failed, and rethrow (so Laravel's queue system knows to retry).
The Webhook Handler: Receiving Completion Notifications
Here's where it gets really interesting. Ittybit doesn't make you poll for task status like it's 2005. Instead, it sends you a webhook when tasks complete. You give it a URL, and it POSTs to it.
First, let's add the webhook URL to our config:
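I'm assuming the Ittybit credentials from Part 1 live under a services.ittybit key - adjust to wherever you put yours:

```php
// config/services.php
'ittybit' => [
    'key'         => env('ITTYBIT_API_KEY'),
    'webhook_url' => env('ITTYBIT_WEBHOOK_URL', env('APP_URL') . '/webhooks/ittybit'),
],
```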
Now let's build the webhook controller:
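A sketch of the controller. The webhook payload fields (id, status, output.id, output.url, error.message) are my assumptions about what Ittybit sends - log the raw payload on your first run and adjust the keys to match.

```php
// app/Http/Controllers/IttybitWebhookController.php
namespace App\Http\Controllers;

use App\Jobs\CompleteEpisodeProcessingJob;
use App\Models\ProcessingTask;
use Illuminate\Http\Request;
use Illuminate\Support\Facades\Log;

class IttybitWebhookController extends Controller
{
    public function __invoke(Request $request)
    {
        // 1. Find the task this webhook is about.
        $task = ProcessingTask::where('ittybit_task_id', $request->input('id'))->first();

        if (! $task) {
            Log::warning('Ittybit webhook for unknown task', $request->all());
            return response()->json(['received' => true]); // still acknowledge it
        }

        // 2. Record the new status.
        $task->status = $request->input('status');

        // 3. If it completed, save the output file ID and URL; if it failed, keep the error.
        if ($task->status === 'completed') {
            $task->output_file_id = $request->input('output.id');
            $task->output_url     = $request->input('output.url');
        } elseif ($task->status === 'failed') {
            $task->error_message = $request->input('error.message');
        }

        $task->save();

        // 4. Count how many of this episode's tasks are finished.
        $episode = $task->episode;
        $total   = $episode->tasks()->count();
        $done    = $episode->tasks()->whereIn('status', ['completed', 'failed'])->count();

        if ($done === $total) {
            CompleteEpisodeProcessingJob::dispatch($episode);
        }

        // 5. Return quickly - the heavy lifting happens in the queued job.
        return response()->json(['received' => true]);
    }
}

// routes/api.php
// Route::post('/webhooks/ittybit', IttybitWebhookController::class);
```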
This webhook handler does several critical things:
1. Finds the task - We look up which of our tasks this webhook is about.
2. Updates status - We record the new status (completed, failed, etc.).
3. Extracts output - If the task completed, we save the output file ID and URL.
4. Checks completion - We count how many tasks are done. If all tasks for an episode are complete, we move to the next phase.
5. Returns quickly - We respond with 200 OK within a second or two. We don't do heavy processing in the webhook handler itself - we dispatch another job for that.
Why dispatch another job instead of doing the work right here? Because webhook handlers should be fast. Ittybit expects a response quickly. If you take too long, they might retry the webhook, and then you're processing the same event twice. Not fun.
The Completion Job: Gathering Results
When all tasks finish, we dispatch CompleteEpisodeProcessingJob. This job gathers all the results and updates the episode:
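Here's a sketch of that job, built on the same processing_tasks columns as above:

```php
// app/Jobs/CompleteEpisodeProcessingJob.php
namespace App\Jobs;

use App\Models\Episode;
use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;

class CompleteEpisodeProcessingJob implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    public function __construct(public Episode $episode) {}

    public function handle(): void
    {
        $tasks = $this->episode->tasks()->get()->keyBy('type');

        // If any task failed, mark the episode failed and stop here.
        if ($tasks->contains(fn ($task) => $task->status === 'failed')) {
            $this->episode->update(['status' => 'failed']);
            return;
        }

        // Copy each output onto the episode so downstream jobs (Transistor,
        // YouTube) never need to know about Ittybit tasks at all.
        $this->episode->update([
            'processed_audio_url'     => $tasks['audio']->output_url ?? null,
            'processed_video_url'     => $tasks['video']->output_url ?? null,
            'transcript_url'          => $tasks['transcript']->output_url ?? null,
            'status'                  => 'completed',
            'processing_completed_at' => now(),
        ]);
    }
}
```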
Now the episode has everything it needs: processed audio, processed video, and a transcript. In the next article, we'll upload these to Transistor and YouTube. But for now, we have a complete processing pipeline.
Adding a Status Endpoint
Users need a way to check on their episodes. Let's enhance our show() method to return rich status information:
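Something along these lines - I'm assuming the episode has a title column from Part 1; everything else comes from the tables we just built:

```php
// app/Http/Controllers/EpisodeController.php
public function show(Episode $episode)
{
    $episode->load('tasks');

    return response()->json([
        'id'       => $episode->id,
        'title'    => $episode->title,
        'status'   => $episode->status,
        'progress' => [
            'total'     => $episode->tasks->count(),
            'completed' => $episode->tasks->where('status', 'completed')->count(),
        ],
        'tasks' => $episode->tasks->map(fn ($task) => [
            'type'   => $task->type,
            'status' => $task->status,
            'url'    => $task->output_url,
        ]),
        'outputs' => [
            'audio'      => $episode->processed_audio_url,
            'video'      => $episode->processed_video_url,
            'transcript' => $episode->transcript_url,
        ],
    ]);
}
```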
Now when a client hits GET /episodes/{id}, they get a complete picture:
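Something like this, mid-processing (the values are made up for illustration):

```json
{
  "id": 42,
  "title": "Episode 12: Async All The Things",
  "status": "processing",
  "progress": { "total": 3, "completed": 2 },
  "tasks": [
    { "type": "audio", "status": "completed", "url": "https://cdn.ittybit.example/podcast_audio.mp3" },
    { "type": "video", "status": "processing", "url": null },
    { "type": "transcript", "status": "completed", "url": "https://cdn.ittybit.example/podcast_transcript.json" }
  ],
  "outputs": { "audio": null, "video": null, "transcript": null }
}
```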
Beautiful. The client can poll this endpoint to show progress to the user.
The Episode Model: Relationships
Let's add the relationship to our Episode model so the code above works:
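Using the ProcessingTask model name I've been assuming throughout:

```php
// app/Models/Episode.php
public function tasks()
{
    return $this->hasMany(ProcessingTask::class);
}

// app/Models/ProcessingTask.php - the inverse, which the webhook handler uses.
// Remember to allow mass assignment for the columns we create() in the job
// (e.g. protected $guarded = [];) on both models.
public function episode()
{
    return $this->belongsTo(Episode::class);
}
```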
Testing the Flow
Let's walk through what actually happens when you use this system:
1. Upload initiation
2. Client uploads to Ittybit (direct upload, bypassing our server)
3. Confirm upload
At this point, ProcessEpisodeJob runs. It creates three tasks in Ittybit and saves them to our database. Episode status: processing.
4. Ittybit processes (5-15 minutes)
During this time, you can poll GET /episodes/{id} - the status endpoint we just built - and watch the progress counts update as tasks complete.
5. Webhooks arrive (as each task finishes)
Our webhook handler updates the task status. When all three tasks are done, it dispatches CompleteEpisodeProcessingJob.
6. Episode is ready
What We've Built
Let's take stock. We now have:
✅ A job that creates multiple processing tasks - Audio, video, transcript
✅ Task tracking in the database - We know what's running and what's done
✅ Webhook handling - Ittybit tells us when tasks complete
✅ Progress tracking - Clients can see real-time progress
✅ Error handling - Failed tasks are logged and reported
✅ Async processing - Everything runs in the background
This is the foundation of any serious media processing application. You fire off work, return immediately, and handle results asynchronously. No timeouts. No blocking. Just clean, scalable workflows.
The DIY Comparison: Why This Matters
Let me tell you what this would look like if you tried to build it yourself without Ittybit:
Infrastructure you'd need:
- FFmpeg servers (at least 2-3 for redundancy)
- Queue system (we already have this)
- Storage system (S3 or similar)
- CDN for delivery
- Speech-to-text service (AWS Transcribe, Google Speech-to-Text, etc.)
Code you'd write:
- FFmpeg wrapper scripts
- Upload/download management
- Format detection and validation
- Error handling for dozens of edge cases
- Progress tracking
- Resource management (preventing server overload)
Time investment: 4-6 weeks minimum. And that's just to get it working. Then you have to maintain it. FFmpeg updates. Server management. Scaling issues. It never ends.
With Ittybit: The code we've written today. A service class and two jobs. Maybe 300 lines total. And it's production-ready.
The cost savings aren't just in development time - they're in the mental overhead you don't have to carry. I can focus on building features users care about instead of debugging why FFmpeg is crashing on certain MP4 variants.
What's Next
In the next article, we'll tackle distribution. We have processed audio and video sitting in Ittybit. Now we need to get them to their final destinations:
- Upload audio to Transistor (podcast hosting)
- Upload video to YouTube
- Handle OAuth flows
- Manage API rate limits
- Retry strategies
We'll build UploadToTransistorJob and UploadToYouTubeJob, and our episodes will finally be published automatically. The finish line is in sight.
Until then, I'd love to hear: have you built async processing systems before? What challenges did you run into? Hit me up on Twitter or drop a comment below.