
20 Minutes, Two Prompts, a Complete Video Pipeline

Tags: AI · GitHub Copilot · Automation · Developer Experience

The Pipeline That Processed Itself Into Existence

If you’re watching the video I posted on LinkedIn, you’re looking at proof. That video was transcribed, captioned, clipped into shorts, and turned into social posts by the very pipeline I built to do those things — a 14-stage video processing system scaffolded in 20 minutes with two prompts. The tool didn’t just build the system. It processed its own creation story.

I’ve written before about the shift from writing code to directing it. This project made that shift visceral. I didn’t write a single line of TypeScript. I described what I wanted, answered a few clarifying questions, and watched parallel agents assemble a production-quality pipeline while I sipped coffee.

The tool that made this possible is GitHub Copilot CLI and its experimental /fleet command.

What Fleet Mode Actually Does

Fleet mode is an experimental feature introduced by Evan Boyle that enables parallel sub-agent orchestration inside Copilot CLI. Instead of one agent grinding through tasks sequentially, /fleet decomposes your request into parallelizable work units and dispatches multiple sub-agents simultaneously.

Here’s the workflow:

  1. Prompt ingestion — You describe a complex system in natural language
  2. Clarifying questions — The orchestrator asks targeted questions to fill context gaps
  3. Planning — Fleet mode decomposes the project into dependency-aware tasks, tracked in a SQLite database per session (see the sketch after this list)
  4. Parallel dispatch — Multiple general-purpose sub-agents spawn concurrently, each assigned to a specific module
  5. Integration — After parallel work completes, a final pass resolves conflicts and ensures coherence
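
Fleet's internal task store isn't documented, but the concept behind step 3 is easy to model: each work unit carries a module assignment and dependency edges, and a sub-agent can claim any task whose dependencies have completed. A purely hypothetical TypeScript sketch (the real SQLite schema may look nothing like this):

```ts
// Hypothetical sketch only — fleet mode's actual schema is not public.
// It models the idea: dependency-aware tasks that sub-agents can claim
// as soon as everything they depend on is done.

type TaskStatus = "pending" | "running" | "done" | "failed";

interface FleetTask {
  id: string;          // e.g. "stage-04-transcription" (illustrative)
  module: string;      // the module a sub-agent is assigned to build
  dependsOn: string[]; // task IDs that must finish first
  status: TaskStatus;
}

// A task is dispatchable when all of its dependencies are done.
function dispatchable(tasks: FleetTask[]): FleetTask[] {
  const done = new Set(
    tasks.filter((t) => t.status === "done").map((t) => t.id)
  );
  return tasks.filter(
    (t) => t.status === "pending" && t.dependsOn.every((d) => done.has(d))
  );
}
```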

The January 2026 changelog confirmed four built-in agent types — Explore, Task, Plan, and Code-review — that Copilot delegates to automatically. WinBuzzer reported that version 0.0.382 transforms sequential agent handoffs into concurrent execution, cutting complex tasks from 90 seconds to 30.

Paired with autopilot mode (cycle with Shift+Tab), the agent keeps working until the job is done — no confirmation pauses, no hand-holding.

The 14-Stage Pipeline

My two prompts described a system that watches for new video files and automatically generates everything a content creator needs. Fleet mode decomposed it into 14 stages:

| # | Stage | What It Does |
|---|-------|--------------|
| 1 | File Watcher | Monitors a directory for new video files using chokidar |
| 2 | Video Ingestion | Validates codecs and extracts metadata via ffprobe |
| 3 | Audio Extraction | Strips audio track with FFmpeg for transcription |
| 4 | Transcription | Generates full transcript via Whisper API |
| 5 | Caption Generation | Formats timed SRT/VTT subtitles with word-level timestamps |
| 6 | Caption Burning | Hard-codes captions into the video using FFmpeg filters |
| 7 | Chapter Detection | Analyzes transcript for topic shifts, generates chapter markers |
| 8 | Summary Generation | Produces concise summaries from the full transcript |
| 9 | Shorts Generation | Clips vertical short-form videos (9:16) from highlight moments |
| 10 | Thumbnail Generation | Extracts key frames for video and shorts thumbnails |
| 11 | Social Post Generation | Writes platform-specific posts for LinkedIn and X |
| 12 | Blog Content Generation | Transforms transcript into long-form blog content |
| 13 | Documentation | Auto-generates README and pipeline docs |
| 14 | Output Organization | Structures artifacts into organized directories with manifest |
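
Stages 1 and 2 are the easiest to picture. Here's a minimal sketch of what that entry point can look like, assuming chokidar for the watcher and an ffprobe binary on the PATH; the function names and paths are mine, not the generated code's:

```ts
import chokidar from "chokidar";
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const execFileAsync = promisify(execFile);

// Stage 2 (sketch): validate the file and pull metadata with ffprobe.
async function probe(file: string) {
  const { stdout } = await execFileAsync("ffprobe", [
    "-v", "error",
    "-print_format", "json",
    "-show_format",
    "-show_streams",
    file,
  ]);
  return JSON.parse(stdout); // codecs, duration, resolution, ...
}

// Stage 1 (sketch): watch an inbox directory and kick off the pipeline
// when a new video file finishes writing.
chokidar
  .watch("./inbox", {
    ignoreInitial: true,    // don't re-process files already present
    awaitWriteFinish: true, // wait for the copy/upload to settle
  })
  .on("add", async (file) => {
    if (!/\.(mp4|mov|mkv)$/i.test(file)) return; // video files only
    const metadata = await probe(file);
    console.log(`ingested ${file}`, metadata.format?.duration);
    // ...hand the file and metadata to the downstream stages here
  });
```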

The generated code was clean, modular TypeScript with proper try/catch error handling and Winston structured logging throughout. Each stage follows a pipeline pattern with defined inputs and outputs feeding the next — the kind of architecture you'd expect from a carefully planned Node.js service, not a speedrun.
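
The stage contract itself is simple to sketch. Assuming invented names (Stage, PipelineContext, runPipeline — the generated code may call these something else entirely), the pattern looks roughly like this:

```ts
// Illustrative only — names are invented, not the generated pipeline's API.
interface PipelineContext {
  videoPath: string;
  artifacts: Map<string, string>; // stage name -> output artifact path
}

interface Stage {
  name: string;
  run(ctx: PipelineContext): Promise<PipelineContext>;
}

// Each stage's output feeds the next; one try/catch gives every stage
// uniform error propagation to pair with structured logging.
async function runPipeline(
  stages: Stage[],
  ctx: PipelineContext
): Promise<PipelineContext> {
  for (const stage of stages) {
    try {
      ctx = await stage.run(ctx);
    } catch (err) {
      console.error(`stage ${stage.name} failed`, err);
      throw err; // fail fast; downstream stages depend on this output
    }
  }
  return ctx;
}
```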

Reddit users on r/GithubCopilot report similar experiences — one commenter described watching “3 agents arguing about architecture in your terminal” before converging on a solution. Another thread showed 5 sub-agents completing a complex refactoring in about 7 minutes of wall time with only 52 seconds of actual API time.

The Three Skills That Matter Now

Building this pipeline didn’t require me to know FFmpeg filter syntax or Winston transport configuration. It required three things:

Context Engineering

Context engineering is replacing prompt engineering as the critical AI skill. It’s not about finding magic words — it’s about structuring what information the model can access when generating a response. I provided examples of video pipeline architectures, named the specific tools I wanted (FFmpeg, Whisper, chokidar), and described the output directory structure. The AI didn’t have to guess — I gave it the context to succeed.

Architectural Thinking

I didn’t describe individual functions. I described a system: data flow, stage boundaries, error propagation, and output contracts. The AI translated system-level thinking into implementation. If I’d prompted at the function level, I’d still be typing.

Articulation Clarity

The difference between a mediocre AI output and a great one is how clearly you describe what’s in your head. My two prompts weren’t clever tricks — they were precise descriptions of a video processing system with clearly defined stage responsibilities and output expectations.

How Fleet Mode Compares

The agentic coding landscape in 2026 is crowded. Here’s where the major tools stand:

| Tool | Type | Best For |
|------|------|----------|
| Copilot CLI (Fleet) | Terminal agent | GitHub integration, zero-cost entry for subscribers |
| Claude Code | Terminal agent | Deep reasoning with Opus-class models |
| Cursor | AI IDE | Familiar IDE UX, inline editing |
| Windsurf | Agentic IDE | Beginner-friendly autonomous execution |
| Devin | Autonomous agent | End-to-end delivery, enterprise adoption |

Each tool has real strengths. Claude Code’s reasoning with Opus 4.6 is genuinely superior for complex logic. Cursor offers the most polished inline diff experience. Devin handles fully autonomous end-to-end delivery, backed by a $10.2 billion valuation.

But Copilot CLI’s fleet mode hits a sweet spot for my workflow: it’s terminal-native, included with my existing Copilot subscription, deeply integrated with the GitHub ecosystem, and extensible via MCP and ACP. For greenfield scaffolding projects like this video pipeline, the combination of /plan, /fleet, and autopilot is unmatched.

What This Means for Developers

The productivity numbers here aren’t incremental improvements. A 14-stage pipeline that would take 2–4 weeks to hand-build emerged in 20 minutes. That’s not a 5x speedup — it’s closer to 100x for this class of problem.

But the takeaway isn’t “AI writes code faster.” It’s that implementation is being commoditized. The competitive advantage is shifting from “can you code this?” to “can you envision this?” The developer who can articulate a clear system design, bring the right context, and think in terms of architecture will extract dramatically more value from these tools than someone who treats them as fancy autocomplete.

“If you’re watching this video, you’re looking at proof. This pipeline processed itself into existence in front of your eyes.”

The self-referential nature of this project — the pipeline processing its own creation video — isn’t just a fun demo. It’s a signal. We’re entering an era where the gap between imagining a system and having a working system is collapsing to minutes. The developers who thrive won’t be the fastest typists. They’ll be the clearest thinkers.

