
If you manage projects, you probably already use AI somewhere in the workflow. Most teams’ AI sessions look like this:
“What should we work on next?” “Here’s a plan: epic, three features, six stories under each…” “OK, I’ll start on the first story.”
…and then the plan stays in the chat transcript. The board is still empty. Half a day later, when someone asks “what is everyone working on?” or “what did the AI ship this week?”, that planning conversation is buried four scrolls up in someone else’s chat history. The AI did the work; the project record is missing.
For project managers this is the worst of both worlds: AI is in your team but not in your tracking. You can’t audit it. You can’t run a sprint review on it. You can’t explain to a client “here’s what shipped and who shipped it” when half of “who” was a Claude session that left no trail.
We hit that gap one too many times in our own work and decided to close it. The result is project-planner, a Claude Code skill (and a sibling MCP server) that lets you — or your developers, or AI agents themselves — say “what’s next?”, “I’ll start on FLNKA-22”, or “bug: the resize handle stopped working” and have those words land as real work items on a real project board. With status transitions, narrative summaries, screenshot attachments, and a full audit row per AI action — all visible to the project manager in real time.
This post has three parts:
- What this looks like for project managers — the visibility, audit, and governance story. AI moves from a sidebar to a tracked participant.
- How Claude Code skills work in general — the file layout, the frontmatter, trigger phrases, helper scripts, and the ~/.claude/skills/ convention. Useful even if you never write a skill yourself; it’s the universal pattern.
- How we built the FastLinkIt planner specifically — the design decisions, the trade-offs, the things we’d do differently next time.
By the end you’ll know what AI-in-the-loop project tracking actually looks like in practice, enough about Claude skills to evaluate or extend the approach, and the planner’s full architecture in case you want to use it as a reference for your own integration.
Part 1 — For project managers: AI as a tracked participant
Before the technical detail, the value prop in PM terms.
The default failure mode of AI in software teams is shadow work. A developer pairs with Claude on a story, ships it, the work is fine, but the story on the board never moves. The status didn’t change. The assignee never reflected who (or what) actually picked it up. The narrative of what was done lives in a Slack DM with the dev, not on the work item. From a PM’s perspective the work was invisible — indistinguishable from no work happening at all until someone notices the commit.
This skill closes that. Every action through the planner produces a tracked, auditable record on the project board:
What you can see, in real time
- Plans become work items. When Claude proposes a plan and the human approves it, items land on the board with the right hierarchy (Epic → Feature → Story → Task), priority, estimate, tags, and parentage. Sprint planning conversations stop being lossy — you have the items.
- Pickups appear immediately. When Claude (or a teammate) starts on an item, it moves to In Progress and the Assignee column shows “Claude Code” (or whatever client name you configure). PMs watching the board see who’s on what without asking.
- Wrap-ups carry narrative evidence. When work is closed, the planner attaches a markdown narrative to the work item — “replaced the hero copy, ran LCP profile in DevTools (2.1s now), pushed to main” — visible in the work item’s preview pane. No more “what did you actually change?” questions in standup.
- Bugs ship with screenshots and structured repro. When someone pastes a screenshot and says “bug: …”, the bug lands with numbered repro steps, expected vs actual, affected file, and the screenshot attached as evidence. Triage moves faster because the acceptance criteria for “this is reproducible” are built in.
- Blockers carry a reason. AI can’t silently fail — closing a task as Blocked requires an explicit reason that’s logged on the work item. The board shows you what’s stuck and why.
What you can audit, after the fact
Every AI action writes a WorkItemAgentRun row on the work item:
| Field | What it captures |
|---|---|
| AgentId | Which client did the work — claude-code, cursor, claude-desktop, etc. |
| RequestedByUserId | Which human triggered the run (the PM doesn’t lose accountability) |
| StartedAt / CompletedAt | Timestamps for the work — used by sprint reports + time-tracking rollups |
| DurationMs | How long the AI session took |
| Status | Pending / Running / Completed / Blocked / Failed |
| Narrative | The wrap-up summary, rendered as Markdown |
| BlockReason | Why a Blocked run got stuck |
Open any work item’s preview pane and you see the full AI activity history — every run, sortable, with the narrative readable inline. “What did the AI ship on FLNKA-23?” is a one-click answer, not a chat-log archaeology project.
For governance and compliance teams: this is the data shape you’d build a “show me all AI actions in the last 30 days” report against. It’s already on the board, already structured, already owned by the right user. Sprint retros and quarterly reviews can pull AI contributions into the same charts as human contributions because the shape is the same.
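To make that concrete, here is a minimal TypeScript sketch of the kind of report that shape supports. The fields mirror the table above (camel-cased on the client side); where the runs array comes from, an export or a reporting endpoint, is not part of the planner’s current API and is deliberately left open here.

```typescript
// Client-side mirror of the WorkItemAgentRun fields listed above.
interface WorkItemAgentRun {
  agentId: string;            // "claude-code", "cursor", "claude-desktop", ...
  requestedByUserId: string;  // the human who triggered the run
  startedAt: string;          // ISO timestamp
  completedAt?: string;
  durationMs?: number;
  status: "Pending" | "Running" | "Completed" | "Blocked" | "Failed";
  narrative?: string;         // wrap-up summary (Markdown)
  blockReason?: string;
}

// "Show me all AI actions in the last 30 days", grouped by agent.
function lastThirtyDays(runs: WorkItemAgentRun[]): Map<string, WorkItemAgentRun[]> {
  const cutoff = Date.now() - 30 * 24 * 60 * 60 * 1000;
  const report = new Map<string, WorkItemAgentRun[]>();
  for (const run of runs) {
    if (Date.parse(run.startedAt) < cutoff) continue;
    const bucket = report.get(run.agentId) ?? [];
    bucket.push(run);
    report.set(run.agentId, bucket);
  }
  return report;
}
```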
What it lets you control
Every AI write to the board is gated by an explicit human “yes”. The skill never auto-pushes. “Looks good” is not consent — only an explicit “y” / “yes” / “ship it”. So the PM (or the developer running the session) is always the last gate before a plan lands or a status flips. AI proposes; humans confirm; the board records. The loop is yours, not the model’s.
For a typical sprint workflow this means:
| Sprint moment | What AI in the loop changes |
|---|---|
| Sprint planning | Plan candidates are drafted in seconds; PM trims + confirms what lands. |
| Daily standup | Board already shows what each agent is on (assignee column) and what shipped overnight (Review column with narratives). The “what did you do?” round becomes “did anything block you?”. |
| Sprint review | AI contributions are first-class line items with narratives, durations, and attachments. Stakeholder demos pull from the same board the team uses. |
| Retro | “Where did we lose time?” includes AI runs. Calibration data (estimate vs actual hours) covers AI sessions the same way it covers human work. |
| Audit / compliance | “Show me what AI did this week” is a query over WorkItemAgentRun, not a Slack-search expedition. |
What it deliberately doesn’t do
- No auto-push. Already mentioned twice — but it bears repeating. The PM (or dev) is always the last gate.
- No bypass of access controls. AI sees only what the API key’s user can see. Plan-based limits, organisation membership, and per-project ownership all apply identically to AI sessions.
- No code merge, today. Phase 1c (code execution + worktree-based diffs) is the next step on our roadmap; for now AI proposes narratives, humans implement. Worth knowing because it bounds the current scope: AI is a planning + tracking participant, not yet a shipping participant.
If you’re a PM evaluating this for your team — the headline is AI moves from a sidebar to a tracked participant, and you keep the same governance you already have for human work.
OK, on to the technical primer.
Part 2 — What is a Claude Code skill?
A Claude Code skill is a folder under ~/.claude/skills/ (personal) or
.claude/skills/ (project-local, ships with the repo). Each skill folder
has a SKILL.md with YAML frontmatter and a markdown body that Claude
reads on every session. Optional helper scripts, templates, and
documentation live alongside.
The minimal skill is a single file:
~/.claude/skills/my-skill/SKILL.md
…with frontmatter that names + describes the skill:
---
name: my-skill
description: One-line description of what this skill does. Used by Claude
to decide whether the skill is relevant on a given turn.
---
# my-skill
When the user asks "<some trigger phrase>", do <something>.
That’s it. Claude Code auto-loads the skill on session start, reads the description, and applies the instructions when a relevant trigger comes up.
What goes inside the SKILL.md body?
The body is just markdown. There’s no rigid schema — it’s instructions for Claude. The patterns that work well:
- A modes table at the top mapping user phrasings to behaviours. Claude scans for trigger phrases on every turn and picks the matching mode (see the sketch after this list).
- Per-mode sections with concrete steps Claude should follow: “call X, then surface Y, then wait for the user’s yes, then call Z”.
- Conventions — output format, when to ask clarifying questions, when to fail loudly, when to fail silently.
- What the skill does NOT do — explicit boundaries so Claude doesn’t drift into adjacent behaviour.
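To make the modes-table pattern concrete, here is the kind of table a SKILL.md body might open with. The wording is hypothetical, loosely based on the planner modes described in Part 3, not the shipped skill’s exact text:

```markdown
## Modes

| Mode    | Trigger phrases (examples)             | Behaviour                                        |
|---------|----------------------------------------|--------------------------------------------------|
| Plan    | "what's next for X", "plan some work"  | Propose items, wait for an explicit "yes", submit |
| Pickup  | "I'll start on <item>"                 | Move the item to In Progress, open an audit run  |
| Wrap-up | "done", "finished <item>"              | Move to Review with a narrative summary          |
```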
Helper scripts
Skills can ship shell scripts (or any executable) alongside SKILL.md.
The convention is ~/.claude/skills/<name>/lib/. Claude can invoke them
via the Bash tool — typical pattern is “run lib/my-helper.sh do-thing arg1 arg2”.
Helper scripts are useful when:
- The skill calls a remote API and you don’t want Claude hand-rolling curl with auth headers each call.
- The output needs structured parsing before Claude continues.
- You want the same operations available outside Claude (CI, manual shell, scripts).
For our planner, we ship lib/flnk-plan.ps1 (PowerShell, Windows-first)
and lib/flnk-plan.sh (Bash, POSIX) with subcommands like list,
context, plan, start, complete, block, attach, assigned.
Claude calls them from the SKILL.md instructions; humans can call them
directly from any shell.
Project-local skills
Skills don’t have to live in the user’s home directory. Drop a folder
under .claude/skills/<name>/ at the root of a git repo, commit it, and
every teammate who clones the repo gets the skill auto-loaded when they
open Claude Code there. This is huge for team distribution: the playbook
ships with the codebase.
Auto-load resolution order:
- <repo>/.claude/skills/<name>/SKILL.md (project-local, wins)
- ~/.claude/skills/<name>/SKILL.md (personal fallback)
The MCP alternative
Claude Code is one client. Cursor, Claude Desktop, Continue, and others speak the Model Context Protocol instead. MCP servers are stand-alone processes that expose typed tools to any MCP-aware client.
If you want your skill’s behaviour available across editors, ship an MCP server alongside the skill. They’re complementary:
- The Claude Code skill is the natural-language wrapper; SKILL.md documents trigger phrases and per-mode flows.
- The MCP server exposes the underlying operations as typed tools any client can call.
Both end up calling the same backend API. We do this for the planner —
Plugins/claude-skill/ for Claude Code, Plugins/mcp-server/ (npm
package @flnkit/mcp-server) for everything else.
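To show what “typed tools” means in practice, here is a minimal sketch using the official MCP TypeScript SDK. This is not the @flnkit/mcp-server source; the tool name is borrowed from the planner and the handler body is a stub.

```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

const server = new McpServer({ name: "planner-example", version: "0.1.0" });

// One typed tool: any MCP-aware client (Cursor, Claude Desktop, Continue, ...) can call it.
server.tool(
  "start_task",
  { itemId: z.string().describe("Work item id, e.g. FLNKA-22") },
  async ({ itemId }) => {
    // In the real server this would call the planner's backend API.
    return { content: [{ type: "text", text: `Picked up ${itemId}` }] };
  }
);

// Run over stdio so the editor can spawn the server as a subprocess.
await server.connect(new StdioServerTransport());
```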
Part 3 — Designing the FastLinkIt planner skill
We had three constraints going in:
- Don’t let Claude push to the board without explicit confirmation. The board is shared team state. AI proposing items is fine; AI silently writing items is a recipe for noise.
- Match the project’s methodology. If you’re on an Agile project, plans should come back as Epic → Feature → Story → Task hierarchies, not flat task lists. If you’re on Kanban, the opposite.
- Round-trip the lifecycle. Picking up an item should move it to In Progress. Wrapping up should move it to Review with a narrative. Same lifecycle a teammate dragging the card would produce — agents are first-class teammates, not a side audit log.
Those three constraints shaped the five modes the skill exposes.
The five modes
| Mode | Trigger | What happens |
|---|---|---|
| Plan | “What’s next for X?” | Claude reads the project’s methodology, fetches recent items to avoid duplicates, proposes a structured plan, waits for explicit “yes”, then POSTs the plan to /api/projects/{id}/plan. |
| Pickup | “I’ll start on FLNKA-22” | Calls start_task — server moves the item to In Progress, sets the assignee to the agent identifier (only when previously empty so we don’t trample human assignments), opens an audit row. |
| Wrap-up | “Done — fixed X, tests pass” | Calls complete_task with the user’s narrative as evidence. Item moves to Review with the narrative attached and visible in the work-item preview. |
| Bug | “Bug: the dropdown doesn’t close on outside click” + screenshot | Drafts a structured bug item (numbered repro / Expected / Actual / Affected file / Environment), uploads pasted screenshots as attachments, waits for explicit “yes”. |
| Scan | “What’s on my plate?” | Filters the board to items assigned to the agent identifier, groups by status (In Progress / Backlog / Review), suggests resume or pickup. |
The trigger phrases aren’t slash commands — Claude infers them from natural language. “plan some work for the homepage redesign” and “give me a plan for X” both hit Plan mode.
Why methodology-aware structure matters
A single fact governs the rest of the design: work-item types are
methodology-scoped. A Kanban project only allows task and bug. An
Agile project allows the whole epic / feature / story / task /
bug / subtask hierarchy. Submit the wrong type and the API rejects
the plan.
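A toy version of that rule, spelled out for the two methodologies the post has described so far (the real check lives server-side in the plan validation):

```typescript
// Which work-item types each methodology accepts, per the constraint above.
const allowedTypes: Record<string, string[]> = {
  kanban: ["task", "bug"],
  agile: ["epic", "feature", "story", "task", "bug", "subtask"],
};

// The server rejects a plan containing a type its methodology doesn't allow.
// Returns the offending types; an empty array means the plan is acceptable.
function invalidPlanTypes(methodology: string, itemTypes: string[]): string[] {
  const allowed = allowedTypes[methodology] ?? [];
  return itemTypes.filter((t) => !allowed.includes(t));
}

// invalidPlanTypes("kanban", ["epic", "task"])  ->  ["epic"]
```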
So before Claude can propose a single item, it has to know the project’s methodology. The flow:
- list_projects → identify the project from the user's wording
- get_project_context → fetch methodology + columns + recent items + types (cached for the session)
- [propose plan] → structured to match the methodology, references column keys for initial_status, parents new items under existing epics where appropriate
- [user confirms "y"]
- submit_plan → server materialises items + parent relationships + dependency edges in topological order
For Agile, the default proposal shape is hierarchical — one Epic at the root, 2-5 Features under it, Stories under each Feature, Tasks/Subtasks inside each Story. Tiny goals (one cohesive change) skip the epic and produce 1-3 stories directly. The user can always override mid-flight (“don’t bother with an epic, just stories”) and Claude reshapes before push.
For Waterfall, hierarchy gives way to dependency chains. Tasks parent
into phases via parent_local_id, with FinishToStart edges via
depends_on_local_ids so the Gantt and critical-path scheduler render
correctly.
For Kanban, plans are flat. No artificial hierarchy.
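Putting the Agile case into a concrete shape, here is an illustrative payload: one epic, two stories, one dependency edge. The parent_local_id, depends_on_local_ids, and initial_status names come from the flow above; the rest of the envelope is a sketch, not the API contract.

```typescript
// Sketch of a plan body POSTed to /api/projects/{id}/plan after the user's explicit "yes".
const plan = {
  items: [
    { local_id: "e1", type: "epic", title: "Homepage redesign" },
    {
      local_id: "s1", type: "story", parent_local_id: "e1", initial_status: "backlog",
      title: "As a visitor I want a clearer value prop above the fold",
    },
    {
      local_id: "s2", type: "story", parent_local_id: "e1", initial_status: "backlog",
      title: "As a visitor I want the hero to load in under 2.5s",
      depends_on_local_ids: ["s1"], // edge resolved server-side in topological order
    },
  ],
};
```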
The PROJECTS.md companion file
Skills can reference any file in the user’s repo via Claude’s normal
file-reading. We use this for project-specific context that varies per
team but stays stable across sessions: PROJECTS.md at the repo root.
What goes in there:
- Project ids and aliases. The user types “acme” — Claude needs to know that resolves to the Acme website rebuild project, id 78995612-…, methodology Agile, prefix WEB.
- Default project — when the user says “what’s next?” with no scope, which project to assume.
- Writing conventions per work-item type — how stories should be worded (user-story form), what acceptance criteria look like, how bugs should structure their repro steps.
- Sizing convention — PERT three-point estimates in hours, our team’s day-length baseline, the threshold above which an item should be split.
Having this in version-controlled markdown means the team agrees on the conventions, the planner respects them, and disagreements get resolved by editing one file rather than coaching Claude session-by-session.
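For orientation, here is a trimmed, hypothetical PROJECTS.md entry in that spirit. The file you actually download also carries the setup checklist and lifecycle rules described below.

```markdown
## Acme website rebuild
- id: 78995612-…
- prefix: WEB
- methodology: Agile
- aliases: acme, web, website rebuild
- default project: yes

### Conventions
- Stories in user-story form ("As a visitor I want ...").
- Bugs: numbered repro steps, Expected vs Actual, affected file.
- Estimates: PERT three-point, in hours; split anything above one working day.
```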
“Download PROJECTS.md” — closing the setup loop
The biggest UX friction in early testing was step 0: filling in
PROJECTS.md for the first time. Users had to:
- Open the project in the FastLinkIt UI.
- Find the project id in the URL bar.
- Find the prefix in the edit form.
- Find the methodology on the detail page.
- Type all of those into a placeholder template.
- Make up some aliases.
Six manual steps before they could ask Claude anything. So we shipped a
Download PROJECTS.md button on every project’s detail page. One
click, browser downloads a complete PROJECTS.md with:
- The project’s actual values pre-filled.
- Aliases auto-derived from the prefix + methodology + simplified name.
- A Setup checklist section at the top with the resolved API base URL, a direct link to /Account/Manage/ApiKeys on this host, the env vars to set (also pre-filled), and a smoke-test prompt the user can copy-paste.
The button is a <a href download> pointing at
/api/projects/{id}/export/projects-md. Server side, the endpoint
generates the markdown by templating the project’s data into a static
body of conventions and lifecycle rules. Auth schemes are extended to
include Identity.Application (cookies) on this single action so the
download works from the Blazor UI without an API-key dance.
The lifecycle round-trip
When Claude picks up an item, the server doesn’t just write a
WorkItemAgentRun row — it transitions the work item’s status through
the same IWorkItemRepository.UpdateAsync path a human dragging the
card would use. Audit history fires. SignalR broadcasts. The board
re-renders for any teammate watching. Assignee gets set to the agent
identifier (e.g. claude-code) when previously empty, so the row
clearly shows “Claude Code” in the Assignee column.
The critical bit: lifecycle transitions reuse the same column-resolution
helpers as the in-process AI agent layer (ResolveInProgress /
ResolveReview / ResolveDone / ResolveBlocked), so a Claude Code
session moving an item to In Progress lands in the same column as
the server-side orchestrator. One source of truth for “what does
in-progress / done / blocked mean for this project’s column
vocabulary”.
Bug mode + image attachments
The most common bug-report trigger is “I just hit X” + a screenshot. We made that flow first-class:
- User pastes a screenshot in the same message as the description.
- Claude reads the image’s disk path (Claude Code surfaces these in conversation).
- Drafts the bug with structured repro steps, lists the screenshot in the proposal preview (📎 Attachments: filename.png (442 KB)).
- On user’s “yes”: creates the bug via submit_plan, then uploads each attachment via attach_to_work_item to the new work-item id.
- Reports back the work-item number + a one-line summary of what landed.
Server side, the attachment endpoint uses the existing
/api/projects/work-items/{wid}/attachments (multipart, allowlisted by
extension, 25 MB cap). Same path the human-uploaded attachments use —
ships in the work-item preview pane’s Attachments section, viewable +
downloadable like any other attachment.
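From the client side the attach step is one multipart POST against that endpoint. Only the route and the size/extension limits come from the post; the multipart field name and the Bearer auth header below are assumptions for the sketch.

```typescript
import { readFile } from "node:fs/promises";

// Upload a screenshot to an existing work item (25 MB cap, allowlisted extensions).
async function attachScreenshot(baseUrl: string, apiKey: string, workItemId: string, filePath: string) {
  const bytes = await readFile(filePath);
  const form = new FormData();
  form.append("file", new Blob([bytes], { type: "image/png" }), filePath.split(/[\\/]/).pop()!);

  const res = await fetch(`${baseUrl}/api/projects/work-items/${workItemId}/attachments`, {
    method: "POST",
    headers: { Authorization: `Bearer ${apiKey}` }, // auth scheme assumed; the skill uses an API key
    body: form,                                     // fetch sets the multipart boundary itself
  });
  if (!res.ok) throw new Error(`Attachment upload failed: ${res.status}`);
}
```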
Scan mode — the “where was I?” answer
Sessions don’t always start fresh. Often the user opens Claude Code on Monday and wants to know “what was I doing?” — items still assigned to the agent that didn’t get closed out before the weekend.
We added an assignee query parameter to the existing
get_project_context endpoint that filters recentItems to items
where AssigneeUserId matches. The skill calls it on triggers like
“what’s on my plate?” or “pick up the next one”. Items come back
grouped by status:
You have 4 items assigned to claude-code on Acme website rebuild:
In progress (1)
- WEB-3 As a visitor I want a clearer value prop above the fold
⚠ run from previous session may still be open
Backlog (2)
- WEB-4 As a visitor I want a faster LCP score under 2.5s priority: high
- WEB-7 Wire the quote carousel to the CMS priority: normal
Review (1)
- WEB-3 … (waiting for human review)
Want me to resume WEB-3, or pick up WEB-4 next?
Never auto-resumes. Always asks before calling start_task. The
optional session-start passive mention (“I notice 1 item still in
progress assigned to me”) fires once on the first turn and stays out of
the way the rest of the session.
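Under the hood that is a single GET with the new query parameter. The route below is a sketch inferred from the /context endpoint the post refers to; the grouping mirrors what the skill prints.

```typescript
// Fetch planning context filtered to items assigned to this agent, then group by status.
async function whatsOnMyPlate(baseUrl: string, apiKey: string, projectId: string) {
  const url = `${baseUrl}/api/projects/${projectId}/context?assignee=claude-code`;
  const res = await fetch(url, { headers: { Authorization: `Bearer ${apiKey}` } }); // auth header assumed
  if (!res.ok) throw new Error(`Context fetch failed: ${res.status}`);

  const ctx = (await res.json()) as { recentItems: { status: string; title: string }[] };
  const byStatus: Record<string, { status: string; title: string }[]> = {};
  for (const item of ctx.recentItems) {
    (byStatus[item.status] ??= []).push(item);
  }
  return byStatus; // e.g. { "In Progress": [...], "Backlog": [...], "Review": [...] }
}
```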
Try it yourself
If you want to play with it against your own work:
- Sign up at flnk.it — Free plan works, no card required. Project management is on every plan.
- Create a project at /projects/create. Pick Agile, give it a prefix.
- Click Download PROJECTS.md on the project’s detail page. Drop it into your repo’s root.
- Install the skill — see github.com/EnricoMR/FastLinkIt/tree/main/Plugins/claude-skill for the install steps. (For Cursor / Claude Desktop / Continue: use the MCP server on npm instead — same operations.)
- Set the env vars the downloaded PROJECTS.md tells you to, with your API key.
- Open Claude Code in the repo and ask: “What’s next for <your-project-alias>?”
That’s the loop.
Lessons we’d repeat (and one we wouldn’t)
A few things we’d do the same way next time:
- Confirmation gate everywhere. Every state-changing call requires an explicit user “yes”. “Looks good” is not consent. The few times Claude got it wrong in early testing, the gate caught it.
- Trigger phrases over slash commands. Natural-language triggers match how people actually speak about work. “Pick up the next one”, “bug: …”, “what’s on my plate?” — these are the words users used unprompted.
- Server-side validation, client-side optimism. The submit_plan endpoint validates types against methodology, returns per-item errors in an array. Claude can propose anything; the server is the source of truth for what’s valid.
- One file, one source of truth. PROJECTS.md is read on every session, downloaded with one click, version-controlled with the repo. No “settings page” to fall out of sync with reality.
The one we’d reconsider: filtering by assignee via a query param on
/context. It works, but it overloads an endpoint that was originally
just “give me the planning context for this project”. A dedicated
/api/projects/{id}/work-items?assignee=… endpoint would have been
cleaner. We’ll likely refactor when Phase 1c (code-execution) lands and
needs richer filtering anyway.
What’s next
The next major step is Phase 1c — code execution: letting AI agents go beyond proposing narratives and actually write code into a worktree. Worktree-per-run, tool catalogue (read_file / write_file / run_command), cost guardrails, file-change review workflow. Open questions are still open; if any of this is interesting feel free to weigh in via the contact form.
Until then, the loop we have closes the planning gap. Plans go from chat to board in one round-trip. Lifecycle goes round-trip back to chat. The board stays current.
That’s the whole point.
If you found this useful, follow the FastLinkIt blog or subscribe to our newsletter for more like this. Questions, feedback, or ideas? Reply via the contact form.
