---
title: "AI Harness v0.6.0 — Harness as Code Gets Its Reference Implementation"
description: "v0.6.0 is the first AI Harness release where the artifact model, the agent loop, and the docs all line up with the Harness-as-Code thesis: typed capability bundles, a hardened finish_reason guard, full reference docs, and context observability you can actually inspect."
date: 2026-06-16
tags: ["AI Agents", "Agentic Development", "Open Source", "Go", "Platform Engineering", "Announcement"]
canonical: https://htek.dev/articles/ai-harness-v060-launch
---
Most AI harnesses start as a prompt and a wrapper. They get to v1.0 by
accumulating branches in the wrapper. **AI Harness took the opposite path:**
codify governance as typed artifacts, then make the wrapper as small as
possible.

v0.6.0 is the first release where that bet looks proven.

If you've been following the [Harness as Code thesis](/articles/what-is-harness-as-code),
this is the release where the runtime catches up to the philosophy.

## What v0.6.0 actually changes

Four things matter in this release. Everything else is supporting work.

### 1. Typed artifact bundles are real

Shape A bundles — `.harness/{plugins,builtins,overrides}/*.md` — are now
first-class. The bundle loader (PR #123) closed the gap where the artifact
registry already understood `harness_artifact/v1alpha1` declarations but
`serve` and `validate` quietly ignored them.

One file = one capability bundle. Tools, hooks, and prompts that belong to
the same governance unit live in the same artifact.

### 2. The agent loop is hardened

A strict `finish_reason` guard now sits at the top of the loop (PRs #121, #123):

| `finish_reason`              | Behavior                           |
| ---------------------------- | ---------------------------------- |
| `stop`, `end_turn`, `""`     | Fall through to a final answer     |
| `length`                     | Retriable error: context truncated |
| `content_filter`             | Hard error: no silent recovery     |
| anything else, no tool calls | Retriable error: no silent stop    |

No more "agent quietly stopped on turn 14 and we don't know why."
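
As a rough illustration of the table above, here is a minimal Go sketch of such a guard. The names `Outcome`, `guard`, `ErrTruncated`, and `ErrContentFilter` are hypothetical and not taken from the AI Harness source. The table does not say what happens when an unlisted `finish_reason` arrives together with tool calls, so the sketch assumes the loop goes on to run them.

```go
package main

import (
	"errors"
	"fmt"
)

// Outcome is what the loop should do next. Illustrative names only,
// not AI Harness identifiers.
type Outcome int

const (
	FinalAnswer Outcome = iota // return the assistant message as the answer
	RunTools                   // execute the requested tool calls, then loop
	Retry                      // retriable error: surface it to the caller
	Fail                       // hard error: stop, no silent recovery
)

var (
	ErrTruncated     = errors.New("finish_reason=length: context truncated")
	ErrContentFilter = errors.New("finish_reason=content_filter")
)

// guard classifies one model response, following the table row by row.
// Assumption: an unlisted finish_reason that arrives WITH tool calls
// proceeds to tool execution; the table does not cover that case.
func guard(finishReason string, toolCalls int) (Outcome, error) {
	switch finishReason {
	case "stop", "end_turn", "":
		return FinalAnswer, nil
	case "length":
		return Retry, ErrTruncated
	case "content_filter":
		return Fail, ErrContentFilter
	}
	if toolCalls == 0 {
		return Retry, fmt.Errorf("unexpected finish_reason %q with no tool calls", finishReason)
	}
	return RunTools, nil
}

func main() {
	for _, reason := range []string{"stop", "length", "content_filter", "mystery"} {
		outcome, err := guard(reason, 0)
		fmt.Println(reason, outcome, err)
	}
}
```

The point of returning an error for every unrecognized case is that the caller always has something to log or retry on; there is no branch where the loop ends without a reason attached.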

### 3. Reference docs are complete

Every public surface has an exhaustive reference page now:

- `harness.md` frontmatter — every field, every default, every `validate()` check
- Tool artifact schema — file shape, parameters, Starlark dialect, `async` reserved
- Hook artifact schema — full event catalog, payload shapes, decision contract, `when:` semantics
- Starlark built-ins — every builtin from `scripting.Engine.makeBuiltins`, per-module
- CLI — every subcommand, flag, env var, exit code

No more "read the source." The docs are now the contract.

### 4. The live bot is governed

[`@htekdevaiharness`](https://t.me/htekdevaiharness) on Telegram runs the
same Shape A bundles you'd ship to your own team:

```
$ harness validate -v
21 tools registered (across harness.md + 2 plugin bundles)
5 hooks registered
```

The two plugin bundles behind that count are a `notes-bundle` (note
save/list + audit hook) and a `safety-bundle` (command guard + output
redactor + status tool), both loaded as typed artifacts. The same loader.
The same precedence rules. The same docs you'd read.

## Why typed artifact bundles matter

This is the conceptual centerpiece, and it's where AI Harness takes the
strongest position against everything else in the category.

Most "extension" systems give you one file per capability and pretend that's
the answer. The reality: a real capability is rarely *one* tool. It's a tool
plus a hook plus a guard plus a default prompt fragment. Splitting those
across four files breaks composability — you can no longer move "the safety
capability" between repos as one diff.

Shape A bundles fix that. Each `.md` file declares a single **capability bundle**:

```yaml
---
artifact: harness_artifact/v1alpha1
kind: plugin            # plugin | builtin | override
name: safety-bundle
priority: 40
---

# Safety bundle

Tools, hooks, and prompts that govern destructive operations.

## Tool: command_guard
...

## Hook: tool.pre / output_redactor
...
```

Composition is **deterministic**. Precedence is declared at the kind level:

```
override > harness > builtin > plugin > model
```

The runtime re-evaluates each artifact's `when:` predicate on every turn,
not just at startup, so an artifact that is inactive on turn 3 can become
active on turn 4 without restarting the agent.
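
To make the ordering and the per-turn check concrete, here is a small Go sketch under stated assumptions. The `Artifact` struct, `levelRank`, and `activeFor` are hypothetical names, not the AI Harness API. The source only states the precedence chain, so the tie-breaking inside one level (higher `Priority` first, then name) is an assumption made to keep the result fully determined.

```go
package main

import (
	"fmt"
	"sort"
)

// Artifact is a hypothetical in-memory view of one loaded artifact.
type Artifact struct {
	Name     string
	Level    string              // "override", "harness", "builtin", "plugin", or "model"
	Priority int                 // from the bundle frontmatter
	When     func(turn int) bool // nil means always active
}

// precedence mirrors: override > harness > builtin > plugin > model.
var precedence = []string{"override", "harness", "builtin", "plugin", "model"}

// levelRank returns the position of a level in the chain; unknown levels sort last.
func levelRank(level string) int {
	for i, l := range precedence {
		if l == level {
			return i
		}
	}
	return len(precedence)
}

// activeFor re-evaluates every When predicate for the given turn and
// returns the active artifacts in a fully determined order.
// Assumption: inside one level, higher Priority wins and Name breaks ties.
func activeFor(turn int, all []Artifact) []Artifact {
	var active []Artifact
	for _, a := range all {
		if a.When == nil || a.When(turn) {
			active = append(active, a)
		}
	}
	sort.Slice(active, func(i, j int) bool {
		a, b := active[i], active[j]
		if ra, rb := levelRank(a.Level), levelRank(b.Level); ra != rb {
			return ra < rb
		}
		if a.Priority != b.Priority {
			return a.Priority > b.Priority
		}
		return a.Name < b.Name
	})
	return active
}

func main() {
	all := []Artifact{
		{Name: "safety-bundle", Level: "plugin", Priority: 40},
		{Name: "notes-bundle", Level: "plugin", Priority: 50,
			When: func(turn int) bool { return turn >= 4 }},
		{Name: "local-policy", Level: "override", Priority: 10},
	}
	for _, turn := range []int{3, 4} {
		fmt.Printf("turn %d:", turn)
		for _, a := range activeFor(turn, all) {
			fmt.Printf(" %s", a.Name)
		}
		fmt.Println()
	}
}
```

In this example `notes-bundle` is filtered out on turn 3 and appears on turn 4, while `local-policy` always sorts first because it sits at the override level.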

This is the line that separates **"extensions"** from **Harness as Code**:
the unit of governance is the *bundle*, not the individual file. You can
review one diff. You can move one folder. You can audit one artifact. The
runtime composes them deterministically.

## Things you can actually inspect now

Three commands that didn't quite work two releases ago are now daily
drivers:

### `harness validate -v`

Registers every artifact, runs every parser, prints a per-bundle tool/hook
count. On the live bot today: **21 tools / 5 hooks** across `harness.md` +
two plugin bundles. If the number doesn't match what you expect, your
bundle isn't loading. That's the loop.
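
If you want a rough expected number to compare against, a sketch like the following counts section headings in a bundle file. It assumes the `## Tool:` and `## Hook:` headings shown in the bundle example each declare exactly one tool or hook; the real loader may parse bundles differently, and `countSections` is a hypothetical helper, not part of the CLI.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// countSections counts "## Tool:" and "## Hook:" headings in one bundle.
// Assumption: each such heading declares exactly one tool or hook.
// Headings inside fenced code in the bundle body would also be counted.
func countSections(bundle string) (tools, hooks int) {
	sc := bufio.NewScanner(strings.NewReader(bundle))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		switch {
		case strings.HasPrefix(line, "## Tool:"):
			tools++
		case strings.HasPrefix(line, "## Hook:"):
			hooks++
		}
	}
	return tools, hooks
}

func main() {
	bundle := strings.Join([]string{
		"---",
		"artifact: harness_artifact/v1alpha1",
		"kind: plugin",
		"name: my-first-bundle",
		"priority: 50",
		"---",
		"",
		"## Tool: hello",
		"Say hello and exit.",
		"",
		"## Hook: tool.post / log-everything",
		"Print every tool call to stderr.",
	}, "\n")

	tools, hooks := countSections(bundle)
	fmt.Printf("tools=%d hooks=%d\n", tools, hooks) // prints "tools=1 hooks=1"
}
```

A mismatch between a count like this and what `harness validate -v` reports is a hint that a bundle is in the wrong directory or failing to parse.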

### `harness context --verbose`

Shows what the agent saw on a given turn:

- which chunks were assembled into the system prompt
- where each chunk came from (which artifact, which file)
- which artifacts were active vs inactive
- which `when:` predicates passed
- total token spend, broken down by source

Context observability is not an afterthought. It is shipped.
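
The per-source breakdown is essentially a group-by over the chunks that made it into the prompt. The Go sketch below shows the idea with a hypothetical `Chunk` record and `spendBySource` helper; the file paths and token counts are made-up illustration values, not output from the real command.

```go
package main

import (
	"fmt"
	"sort"
)

// Chunk is a hypothetical record of one piece of the assembled system prompt.
type Chunk struct {
	Artifact string // which artifact contributed the chunk
	File     string // which file it came from
	Tokens   int    // token count of the chunk
}

// spendBySource totals tokens per artifact and returns the artifact names
// ordered by descending spend, with the name as tie-breaker so the report
// is stable from run to run.
func spendBySource(chunks []Chunk) (map[string]int, []string) {
	totals := make(map[string]int)
	for _, c := range chunks {
		totals[c.Artifact] += c.Tokens
	}
	names := make([]string, 0, len(totals))
	for name := range totals {
		names = append(names, name)
	}
	sort.Slice(names, func(i, j int) bool {
		if totals[names[i]] != totals[names[j]] {
			return totals[names[i]] > totals[names[j]]
		}
		return names[i] < names[j]
	})
	return totals, names
}

func main() {
	// Paths and token counts are illustrative assumptions.
	chunks := []Chunk{
		{Artifact: "harness.md", File: "harness.md", Tokens: 900},
		{Artifact: "safety-bundle", File: ".harness/plugins/safety-bundle.md", Tokens: 300},
		{Artifact: "notes-bundle", File: ".harness/plugins/notes-bundle.md", Tokens: 150},
		{Artifact: "safety-bundle", File: ".harness/plugins/safety-bundle.md", Tokens: 50},
	}
	totals, order := spendBySource(chunks)
	for _, name := range order {
		fmt.Printf("%-14s %d tokens\n", name, totals[name])
	}
}
```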

### `harness artifacts`

Flat list of every loaded artifact with its priority, kind, source file,
and active/inactive state. Useful when you need to answer "is this hook
actually firing?" without grepping through bundles.

## What's still off the menu

Honesty matters. v0.6.0 is **not** a "we figured it all out" release.

- **Compaction engine vs hooks:** open question (#69 / roadmap). The leading
  candidate is hooks-driven compaction in v0.7.
- **Memory persistence:** flat files today; SQLite is on the table for v0.7.
- **Sub-agent supervision:** primitive level, not orchestration level. Phase
  7 territory.
- **Async tool calls:** `async:` is **reserved** in the tool schema (parsed
  but not propagated through `ToolConfig`). Wired in Phase 3.
- **`agent.stop` hook event:** the strict `finish_reason` guard ships in
  v0.6.0, but the proper hook primitive (issue #104) is held for v0.7.0 so
  it can get its own design pass.

If you need any of those today, you're early. That's fine. The core's
*shape* is what we're committing to in v0.6.0; the edges are still moving.

The pre-1.0 schema-evolution clause stays in effect: artifact frontmatter
fields can still change between minor releases. The CHANGELOG calls every
break out explicitly.

## How to try it

```bash
go install github.com/htekdev/ai-harness/cmd/harness@latest
harness init my-agent
cd my-agent
harness validate -v
harness serve --source stdin
```

Then drop a Shape A bundle into `.harness/plugins/`:

```yaml
---
artifact: harness_artifact/v1alpha1
kind: plugin
name: my-first-bundle
priority: 50
---

## Tool: hello
Say hello and exit.

## Hook: tool.post / log-everything
Print every tool call to stderr.
```

Re-run `harness validate -v`. The tool/hook count should go up. That's the
loop. That's the whole product surface.

## The bigger arc

- **v0.4.0** was the first usable harness.
- **v0.5.0** was the first one with proper claims verification (Ralph loop
  at the delegation boundary).
- **v0.6.0 is the first one where the artifact model, the loop, and the
  docs all line up with the Harness-as-Code thesis.**

That's the milestone worth marking. v0.7 is async, memory persistence, and
the compaction engine. After that, v1.0 is a positioning question, not an
engineering one.

## Where to go next

- **Repo:** [github.com/htekdev/ai-harness](https://github.com/htekdev/ai-harness)
- **Docs:** [htekdev.github.io/ai-harness](https://htekdev.github.io/ai-harness/)
- **Live bot:** [@htekdevaiharness](https://t.me/htekdevaiharness) on Telegram
- **Companion piece:** [What Is Harness as Code?](/articles/what-is-harness-as-code)
- **Category survey:** [Live comparison of agent harnesses](/articles/all-agent-harnesses-live-comparison)

If you've been waiting for "the small one with real governance," this is it.
