Porting Microsoft's HVE Workflow to Claude Code

AI · Anthropic · Claude Code · GitHub · Technical

How I got hooked

A friend who works at Microsoft told me her teams use something called HVE (Hypervelocity Engineering). The way she described it, her teams were shipping code fast and not running into issues with quality. I looked into HVE and the RPI (Research, Plan, Implement, and Review) workflow, tried it out, and I was hooked immediately.

It wasn’t just “AI writes the code for you. Good luck.” Working with HVE/RPI felt like having guardrails. I could see the plan before anything got built. I could watch it implement to the spec instead of improvising. And the review step caught things I’d otherwise have missed and shipped without noticing. For the first time in a while, I wasn’t squinting at generated code wondering whether I could trust it. I had a map.

Then I opened Claude Code and went looking for the same thing. I couldn’t find it. Microsoft’s HVE Core is built entirely for Copilot: its agents are .agent.md files, the frontmatter is Copilot-specific, and it ships as a VS Code extension. None of that runs in Claude Code. So I used Claude Code to port it, and the result is hve-claude. A few days in, the workflow caught a silent bug in its own installer, something I’d never have found otherwise. More on that below.

What HVE is

The core idea is a methodology called RPI: Research, Plan, Implement, Review. You don’t let the model do everything in one pass. You split the work into phases with different jobs.

The insight that makes it click: when the AI can’t implement during research, it stops optimizing for writing code and starts optimizing for verified truth. Separating the phases by role produces better outcomes than asking one session to do it all at once. That’s Microsoft’s idea, and it’s a good one.

Why a new repo instead of a fork

I thought about forking. I decided against it.

Copilot’s agent format and Claude Code’s command and subagent model are different enough that a fork would mean constantly translating upstream changes by hand to stay in sync. There was no clean merge path. A fresh port that I could evolve on its own, with clear attribution back to microsoft/hve-core and the MIT license, made far more sense.

What the port involved

The most interesting part was a structural mismatch I ended up treating as a feature.

Copilot’s HVE has two kinds of things you interact with: “agents” you select from a menu, and “commands” you type. Claude Code only has one of those concepts, the slash command. So Microsoft’s user-facing agents and their commands both had to collapse into a single set of /hve-* slash commands. The internal subagents, the ones spawned behind the scenes to do research or validation, map straight onto Claude Code’s subagents.

That collapse sounds like a loss, but it actually maps better to the RPI phases. A researcher, a plan validator, and an implementation validator really are different roles, and Claude Code lets me scope each one’s tools precisely. The research and validation agents get read and search tools only, with no Edit and no Bash, so they can’t modify your code in place or run shell commands. They keep just enough write access to record their own findings in the tracking folder.
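
One way to keep that scoping from drifting is a small lint over the agent definitions. The sketch below is in Python and is not part of hve-claude. It assumes each subagent file declares its tools on a single comma-separated `tools:` line in its frontmatter, and the agent name and description in the sample are hypothetical.

```python
# Tools a read-only research or validation agent should never carry.
# Write stays allowed so the agent can record findings in the tracking folder.
FORBIDDEN_FOR_READ_ONLY = {"Edit", "Bash"}


def parse_tools(frontmatter: str) -> set[str]:
    """Return the tool names from a single-line `tools:` entry (assumed format)."""
    for line in frontmatter.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "tools":
            return {name.strip() for name in value.split(",") if name.strip()}
    return set()


def scope_violations(frontmatter: str) -> set[str]:
    """Tools present in the frontmatter that a read-only agent must not have."""
    return parse_tools(frontmatter) & FORBIDDEN_FOR_READ_ONLY


# Hypothetical researcher frontmatter, for illustration only.
researcher = """\
name: hve-researcher
description: Answers questions about the codebase without changing it
tools: Read, Grep, Glob, Write
"""

print(sorted(scope_violations(researcher)))                 # []
print(sorted(scope_violations("tools: Read, Edit, Bash")))  # ['Bash', 'Edit']
```

Run against every agent file in CI, a check like this would flag a read-only agent that quietly picks up Edit or Bash.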

The names came along too. Microsoft uses task-* prefixes and bare verbs; I namespaced everything under /hve- so the commands show up as a discoverable group and don’t collide with anything else. So task-research became /hve-research, pr-review went from a Copilot agent to a command, and two git-commit helpers merged into one. The full mapping is in the repo’s docs.

And here’s the coverage at a glance. The entire core workflow ported over. The only real gaps are one helper skill and a pile of platform integrations that were never the goal.

| Category | Microsoft HVE Core | hve-claude | Coverage |
|---|---|---|---|
| Slash commands (RPI phases plus challenge, doc-ops, git, prompt tooling) | 16 | 15 | Fully ported |
| Subagents behind those phases | 8 | 8 | One to one |
| Instruction files (language and writing conventions) | 7 in core | 12 | Pulled extra language guides from Microsoft’s coding-standards plugin |
| Reusable prompts | 6 | 6 | Ported |
| pr-reference skill | 1 | 0 | Not yet |
| Platform and domain bundles (Jira, ADO, GitHub, security, data science) | 12+ | 0 | Out of scope |

Microsoft advertises something like “16 commands and 10 agents,” but those are two views of the same core surface, not 26 separate things. The 15 /hve-* commands cover it. The count comes out one short of Microsoft’s 16 only because a couple of commands were merged or moved to prompts.

The first-day bugs, found by using it on itself

The first tasks I ran through the workflow were on the project itself: writing the README, then reviewing and cleaning up its own CLAUDE.md. That means the very first serious use of hve-claude was hve-claude researching and documenting hve-claude. If you want to see what a research phase produces, the artifact from that CLAUDE.md cleanup is still in the repo: severity-graded findings, a file:line citation on every one, confidence markers (scroll to the “Non-Issues” section and you’ll see [HIGH], [MEDIUM], and [LOW] tags on each line), and a list of things it checked and decided were fine. It even caught that the hve-claude docs had been calling HVE “Human-Value Engineering” when the actual name is “Hypervelocity Engineering.”

Putting it to work like that also shook out two bugs in short order. Every one of the 15 commands had the wrong frontmatter format for allowed-tools: it was written as a YAML list when Claude Code wants a comma-separated string. And several read-only agents had write tools they had no business having. Both caught and fixed immediately, in the first commits after the initial drop. There’s something satisfying about a tool whose first goal is to find the holes in itself.
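
For the first of those bugs, the repair is mechanical. Here is a minimal Python sketch, assuming the broken frontmatter used a plain block-style YAML list under `allowed-tools:`. The function name and the sample description are mine, not the repo’s, and the real fix in hve-claude may well have been made by hand.

```python
def flatten_allowed_tools(frontmatter: str) -> str:
    """Rewrite a block-style `allowed-tools` list as one comma-separated line."""
    out, tools, in_list = [], [], False
    for line in frontmatter.splitlines():
        stripped = line.strip()
        if in_list and stripped.startswith("- "):
            # Collect each list item while we are inside the allowed-tools list
            tools.append(stripped[2:].strip())
            continue
        if in_list:
            # First non-item line ends the list: emit the flattened form
            out.append("allowed-tools: " + ", ".join(tools))
            in_list = False
        if stripped == "allowed-tools:":
            in_list, tools = True, []
            continue
        out.append(line)
    if in_list:
        # The list ran to the end of the frontmatter
        out.append("allowed-tools: " + ", ".join(tools))
    return "\n".join(out)


# Hypothetical command frontmatter in the broken list form.
broken = """\
description: Research a task before planning
allowed-tools:
  - Read
  - Grep
  - Glob"""

print(flatten_allowed_tools(broken))
# description: Research a task before planning
# allowed-tools: Read, Grep, Glob
```

Frontmatter that already uses the comma-separated form passes through unchanged, so the function is safe to run over every command file.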

The bug the workflow caught in its own installer

A few days in, I decided the installer needed tests. Installing is the one thing every user does before they touch anything else, and it has fiddly edge cases: a fresh project with no CLAUDE.md, an existing one with my markers already in it, an older one whose markers are in a slightly different format, an upgrade from a previous layout where files have moved. Get any of those wrong and someone’s first experience with the tool is a broken CLAUDE.md.

So I ran the whole loop (research, plan, implement, review) on the task of building a test suite for install.sh. The plan came back with six scenarios. One of them seeded a project whose CLAUDE.md used the old marker format (an em-dash where the current one uses a hyphen) and asserted that re-running the installer replaced the block in place instead of duplicating it.

That test failed. Not because the test was wrong, but because the installer was.

The installer has to find the old HVE block in your CLAUDE.md and swap the new one in. To do that find-and-replace it leans on awk, handing it the replacement text through a -v variable. On Linux that’s fine. The version of awk that ships with macOS rejects a -v value that has embedded newlines, and the replacement block runs many lines long. So on a Mac, upgrading a project that still had the old marker format silently left CLAUDE.md untouched. No error, no warning: an upgrade that quietly did nothing to the one file that matters most.

I’d run that installer on my own Mac, and it had “worked.” But I’d only ever pointed it at fresh projects; the block-replacement path only fires when a CLAUDE.md already has the markers in it, on a re-run or an upgrade, and I’d never tested that by hand. The bug surfaced because the plan phase enumerated an edge case I wouldn’t have bothered to check, and the implement phase turned it into an assertion that actually ran it. The fix was small (write the block to a temp file and read it back into awk with getline), but without the test I’d have shipped it, and anyone on macOS upgrading from an early version would have hit it. The workflow built a test suite, and the test suite caught a bug in the thing that installs the workflow.
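
The behavior that failing test asserted can be sketched in a few lines of Python. This is not the installer’s code (the real fix lives in install.sh and uses awk with getline); it only illustrates the contract: one block, replaced in place, whichever marker variant is present. The old marker is assumed to differ from the current one only by the em-dash, and the function name is hypothetical.

```python
import re

START = "<!-- HVE:START - managed by hve-claude, do not edit between markers -->"
END = "<!-- HVE:END -->"

# Accept a hyphen (current format) or an em-dash, U+2014 (assumed old format),
# then everything up to and including the first end marker.
BLOCK_RE = re.compile(
    r"<!-- HVE:START [-\u2014] .*? -->.*?<!-- HVE:END -->", re.DOTALL
)


def merge_hve_block(claude_md: str, hve_content: str) -> str:
    """Return claude_md containing exactly one up-to-date HVE block."""
    block = f"{START}\n{hve_content.strip()}\n{END}"
    if BLOCK_RE.search(claude_md):
        # A callable replacement keeps backslashes in the block from being
        # interpreted as regex escapes
        return BLOCK_RE.sub(lambda _match: block, claude_md, count=1)
    # No markers yet: prepend the wrapped block before the existing content
    return f"{block}\n\n{claude_md}" if claude_md else f"{block}\n"


# A project still carrying the old em-dash marker.
old_doc = (
    "<!-- HVE:START \u2014 managed by hve-claude, do not edit between markers -->\n"
    "old rules\n"
    "<!-- HVE:END -->\n\n"
    "## Your Project\nnotes\n"
)
updated = merge_hve_block(old_doc, "new rules\nsecond line")
print(updated.count("HVE:START"))  # 1
```

Note that the replacement here is a multi-line string, which is exactly the case that tripped up the awk -v approach on macOS.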

Keeping the docs honest

The project’s README eventually got too long to scan, so I split it into a lean front page plus a docs/ folder, then ran /hve-doc-ops to check I hadn’t broken anything in the move. Doc-ops does three things: checks structural conventions, hunts for gaps, and (the part I lean on most) verifies the docs against the actual code. It confirmed that every command, subagent, and instruction file named in the docs still matched what was in the repo, that every cross-link and anchor resolved, and that the example artifacts the README points to still existed.

Documentation drifts away from code the moment you stop looking. Having a command whose entire job is to cross-reference the two on demand means the prose can be checked as routinely as the tests.
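
To make the cross-referencing idea concrete, here is a rough Python sketch of one such check: every /hve-* command mentioned in a doc should exist in the repo. It is my own illustration, not how /hve-doc-ops works internally, and the command names in the sample are partly made up.

```python
import re

# A slash command mention: "/hve" or "/hve-something", not preceded by a word
# character or another slash (which would make it part of a URL or repo path).
COMMAND_RE = re.compile(r"(?<![\w/])/(hve(?:-[a-z]+)*)\b")


def missing_commands(doc_text: str, existing: set[str]) -> list[str]:
    """Return commands the docs mention that are not in the existing set."""
    mentioned = set(COMMAND_RE.findall(doc_text))
    return sorted(mentioned - existing)


# Sample doc text; /hve-deploy is a made-up command that does not exist.
doc = (
    "Run /hve-research, then /hve-plan. "
    "See github.com/kevrcress/hve-claude for details, or try /hve-deploy."
)
existing = {"hve", "hve-research", "hve-plan"}

print(missing_commands(doc, existing))  # ['hve-deploy']
```

In a real check, the `existing` set would be built from the command files on disk rather than written out by hand.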

A task, end to end

Using hve-claude is mostly just typing one command. For a full task, run /hve <your task> and it walks the whole loop, pausing for your approval between phases.

For anything bigger, though, I’d recommend running the phases as separate commands, each in a fresh session: /hve-research, then /hve-plan, then /hve-implement, then /hve-review. Starting each phase with an empty context window keeps it focused on just that step. The single /hve flow is convenient for smaller tasks, but on a large one a clean context per phase tends to give better results. Each phase command finds the previous phase’s artifact on disk automatically, so there’s no file juggling and nothing to copy around.

Say your task is “add rate limiting to the public API.”

1. Research. You hand off the request. You run /hve-research "add rate limiting to the public API" and pass along whatever specs you have: the limits, which endpoints, any storage constraints. The researcher goes and answers questions instead of writing code. Where do requests enter? Is there middleware already? What does the storage layer look like? It writes findings to an artifact, and every claim points at a real file:line so you can verify it instead of trusting it. Nothing gets built yet.

2. Plan. You turn findings into a path. You run /hve-plan. It reads that research artifact and produces a phased plan: ordered steps, files to touch, dependencies between them. A plan validator checks the plan back against the research for gaps before you commit. You read the summary, adjust if something looks off, and move on.

This gap between plan and implement is also where /hve-challenge earns its keep. It interrogates your plan as a skeptic who wasn’t in the room, one pointed question at a time. The first time I ran it, it pushed on my test design and a couple of edge cases I’d waved off, and it shook loose a small feature idea I hadn’t thought of. It’s exactly the step you’re tempted to skip when you think you already know the answer.

3. Implement. You let it build to spec. You run /hve-implement. It works through the plan one phase at a time, writing code and keeping a running changes log. Because the “how” was already settled during planning, this part stays focused instead of wandering.

4. Review. You get a second set of eyes. You run /hve-review. Validators compare the actual changes against the plan and grade what they find as Critical, Major, or Minor. This is the step that catches what you’d otherwise miss: spec drift, a skipped edge case, a secret that slipped into a committed file.

One more thing worth knowing. Before any of this, the orchestrator sizes up the task and tells you what it found. If it thinks the task is simple, it asks whether you want to skip the full loop and just do it directly, or run the whole thing anyway. You stay in control of how much ceremony a given task gets.

The throughline across all four phases: each one hands off through a file, not chat history. You can close your laptop after research, come back the next day, run /hve-plan, and it picks up right where you left off.
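
The file-based handoff is simple enough to sketch. In the Python below, the next phase picks up the newest artifact the previous phase left on disk. The .claude-hve-tracking folder name comes from the install prompt further down, but the per-phase subfolder, the .md extension, and the function itself are assumptions for illustration, not the repo’s actual layout.

```python
from pathlib import Path
from typing import Optional


def latest_artifact(tracking_root: Path, phase: str) -> Optional[Path]:
    """Return the most recently modified Markdown file for a phase, if any."""
    phase_dir = tracking_root / phase
    if not phase_dir.is_dir():
        return None
    candidates = list(phase_dir.glob("*.md"))
    # Newest modification time wins; None when the phase has produced nothing
    return max(candidates, key=lambda p: p.stat().st_mtime, default=None)


# Assumed layout: .claude-hve-tracking/research/*.md (hypothetical subfolder)
research = latest_artifact(Path(".claude-hve-tracking"), "research")
print(research or "no research artifact yet; run the research phase first")
```

Because the state lives in a file rather than in a conversation, nothing is lost when the session that produced it ends.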

What’s different on Claude Code

A few things are genuinely specific to this version:

  • Confidence markers. Every key assumption in a handoff artifact carries a [HIGH], [MEDIUM], or [LOW] tag. A [LOW] marker in research is a flag to go verify that thing in the planning phase before you build on it. Small, but useful.
  • Platform-native delivery. Slash commands and subagents instead of a VS Code extension. The recommended path is a single natural-language prompt you copy out of the README and paste into Claude Code, which clones the repo, copies the files into your project, merges the HVE block into your CLAUDE.md, and cleans up after itself. There’s still an install.sh for terminal folks who’d rather run a script.
  • Memory wired to Claude Code’s built-in system. Microsoft has a memory agent too, but mine hooks into the platform’s own persistent memory rather than a custom store, so /hve-memory at the end of a session carries context into the next one.
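
The confidence markers in the first bullet are plain text, which makes them easy to act on. As a small Python sketch (the artifact excerpt is invented for illustration, and the helper is not part of hve-claude), this pulls out the [LOW] lines that deserve a second look during planning:

```python
import re

# Matches the three confidence tags used in handoff artifacts.
MARKER_RE = re.compile(r"\[(HIGH|MEDIUM|LOW)\]")


def lines_tagged(artifact_text: str, level: str) -> list[str]:
    """Return the stripped lines of an artifact that carry the given tag."""
    return [
        line.strip()
        for line in artifact_text.splitlines()
        if level in MARKER_RE.findall(line)
    ]


# Hypothetical research artifact excerpt, for illustration only.
artifact = """\
- [HIGH] Requests enter through a single middleware chain
- [LOW] The storage layer supports atomic increments
- [MEDIUM] No rate limiting exists today
"""

for line in lines_tagged(artifact, "LOW"):
    print("verify during planning:", line)
```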

What’s not done yet

This is a working tool, not a finished one.

Microsoft’s repo has a pr-reference skill I haven’t ported. It generates an XML reference of the commit history and diffs between two branches to feed PR descriptions and reviews. There’s also a stack of platform and domain bundles (Jira, Azure DevOps, GitHub, security, data science) that were never in scope for a lean core port. A couple of pieces (the pull-request and checkpoint helpers) still live as prompt templates rather than first-class slash commands.

And there’s cleanup still to do. The language instruction files used to sit in the project root, an arrangement I wanted to fix; they now live under .claude/instructions/ with the rest of the Claude Code config, and the installer migrates anyone on the old layout automatically, leaving any file you’ve customized in place rather than overwriting it. Mostly, though, the work right now is just using it, hitting edges, and finding the limitations. That’s the fun part.

Try it, and help me make it better

If any of this sounds like the kind of guardrails you’ve been wanting in Claude Code, the repo is right here. The fastest way in: paste this into Claude Code from inside your project directory and let it do the copying. No scripts to run, so it works the same on Mac, Linux, or Windows.

Please install the HVE Claude Code workflow into this project. Clone
https://github.com/kevrcress/hve-claude into a temporary directory, then
copy its hve-* commands and agents into my .claude/ folder, copy its
.claude/instructions/ and .claude/prompts/ files in, and merge everything above the
'## Your Project' heading in its CLAUDE.md into mine wrapped in these markers:
<!-- HVE:START - managed by hve-claude, do not edit between markers -->
...HVE content...
<!-- HVE:END -->
If my CLAUDE.md already has those markers, replace the content between them.
If it has no markers, prepend the wrapped block before my existing content.
Never touch anything outside the markers or below '## Your Project'. Add the
.claude-hve-tracking subagents and sandbox paths to my .gitignore, then
delete the temp clone and show me what changed.

Nothing here is destructive: it adds files and prepends a block to your CLAUDE.md, and it never touches your source code. The whole repo is open, so skim it first if you’d rather see exactly what you’re getting. Prefer to do it by hand? The repo’s docs lay out the same steps manually.

Once it’s installed, run /hve on a real task and see how the phases feel. If you hit a rough edge, find a bug, or want to port one of the pieces I haven’t gotten to yet, open an issue or a PR. I’d love the help, and the workflow gets better the more people put it through its paces.

The original is microsoft/hve-core. I’ve been leaning on hve-claude for all of my recent AI-assisted programming, including a hackathon, and there’s a follow-up coming about getting through that one with my sanity intact.