The Commit Is No Longer the Unit of Authorship

We built Grain CLI to answer a question nobody could: how do you trace AI-generated code back to the conversation that created it? Here's what we found.

"How can we trust AI-generated code?"

I kept hearing this question. Not from developers, who know roughly how their code was produced because they are the ones prompting the models. It comes from investors during due diligence, from board members reviewing technology risk, from CTOs trying to modernise their SDLC without losing control of what ships to production.

I did not have a clean answer. So I decided to build one.

The Question Nobody Can Answer

In many engineering teams using AI coding assistants, AI contribution rates have climbed past 80%, sometimes past 90%. Entire features are authored in a single conversation between a developer and an autonomous agent. Thirty seconds of prompting can produce what once took a day of careful, hand-crafted code.

And yet, when leadership asks "which parts?" or "by which models?" or "with what level of human oversight?" the answer is almost always silence.

Git tracks who committed and what changed. It tells you nothing about how the code was produced. A commit that was carefully hand-crafted line by line and a commit generated wholesale by an autonomous AI agent look identical in your commit history. They should not be treated identically in your SDLC.

The EU AI Act is making code provenance non-optional. GDPR already requires organisations to explain automated decision-making. If your production software is substantially AI-generated, and increasingly it is, the question is not whether you will need to account for that. It is when.

I wanted to see how I would answer that question in my own work. So I built Grain CLI.

The Surprising Discovery

My starting assumption was that solving this problem would require new instrumentation: a plugin, a sidecar process, something that would need to be installed and configured and maintained. I was wrong.

The breakthrough came from a simple question: what does Cursor actually store on the developer's machine?

We expected to find basic telemetry. Maybe some usage counters, model preferences, settings. What we found was far more substantial. Between Cursor's own files and Git, the developer's machine already holds three rich local data sources that, when connected, form a complete provenance chain.

There is a SQLite database at ~/.cursor/ai-tracking/ai-code-tracking.db containing AI contribution scores per commit, code fingerprints tied to conversations, and breakdowns by generation type: tab completions, composer/agent mode, and human-written code. There are complete agent transcripts stored on disk: every AI conversation, with the developer's prompt, the model's reasoning, the tool calls it executed, and the code it generated. And there is Git, the ground truth for what actually shipped.
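The schema of that database is not described here, so a reasonable first step is simply to look at what it contains. The sketch below opens the file read-only and lists each table with its columns. It assumes only the path given above and uses Python's standard sqlite3 module; the helper name list_tables is ours, for illustration, and no table or column names are assumed.

```python
import sqlite3
from pathlib import Path

# Path as given in the article. The schema is not documented here, so this
# sketch discovers tables instead of assuming their names.
DB_PATH = Path.home() / ".cursor" / "ai-tracking" / "ai-code-tracking.db"


def list_tables(db_path):
    """Return {table_name: [column names]} for a SQLite file, opened read-only."""
    # mode=ro prevents the inspection from modifying the tracking database.
    uri = f"{Path(db_path).resolve().as_uri()}?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        return {
            name: [col[1] for col in conn.execute(f'PRAGMA table_info("{name}")')]
            for name in names
        }
    finally:
        conn.close()


if __name__ == "__main__":
    if DB_PATH.exists():
        for table, columns in list_tables(DB_PATH).items():
            print(table, columns)
    else:
        print(f"No database found at {DB_PATH}")
```

Opening the file read-only is a deliberate choice: the database belongs to the editor, and an inspection tool should not be able to alter it.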

Each source alone is interesting. Together, they reconstruct the chain from prompt to conversation to code generation to file changes to commit. The provenance data exists. Nobody was connecting the dots.

What We Had to Solve

The core technical challenge was that Cursor has no direct foreign key between commits and conversations. The database that tracks AI contribution per commit knows nothing directly about which conversation produced that contribution.

We solved this with a heuristic join: match the files changed in a commit against AI code hash records within a time window, rank conversations by the number of matching file entries, and surface the top candidates. The time window is asymmetric: a four-hour lookback before the commit to capture long sessions, and a fifteen-minute lookahead after it to allow for asynchronous processing. In practice, this produces high-confidence results. When a developer uses Cursor's composer to generate code in payments/stripe_webhook.ts and commits that file twenty minutes later, the matching records reliably point back to the originating conversation.
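The join described above can be sketched in a few lines. This is an illustration under stated assumptions, not Grain's actual code: the HashRecord shape, its field names, and the rank_conversations function are hypothetical, and only the window sizes and the ranking rule come from the description above.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta

# Window sizes taken from the article: asymmetric around the commit time.
LOOKBACK = timedelta(hours=4)
LOOKAHEAD = timedelta(minutes=15)


@dataclass(frozen=True)
class HashRecord:
    """Hypothetical shape of one AI code hash record."""
    conversation_id: str
    file_path: str
    created_at: datetime


def rank_conversations(commit_time, changed_files, records, top_n=3):
    """Rank conversations by matching file entries inside the time window."""
    start = commit_time - LOOKBACK
    end = commit_time + LOOKAHEAD
    files = set(changed_files)
    counts = Counter(
        r.conversation_id
        for r in records
        if r.file_path in files and start <= r.created_at <= end
    )
    return counts.most_common(top_n)


commit_time = datetime(2025, 1, 10, 12, 0)
records = [
    HashRecord("conv-a", "payments/stripe_webhook.ts", datetime(2025, 1, 10, 11, 40)),
    HashRecord("conv-a", "payments/stripe_webhook.ts", datetime(2025, 1, 10, 11, 45)),
    # Five hours before the commit: outside the four-hour lookback.
    HashRecord("conv-b", "payments/stripe_webhook.ts", datetime(2025, 1, 10, 7, 0)),
    # Inside the window, but the file is not part of the commit.
    HashRecord("conv-c", "README.md", datetime(2025, 1, 10, 11, 50)),
]
print(rank_conversations(commit_time, ["payments/stripe_webhook.ts"], records))
# [('conv-a', 2)]
```

Because the match is a heuristic, returning the top few candidates with their counts, instead of a single answer, lets a reviewer judge how strong each link is.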

The result is the complete provenance chain. You can run grain explain HEAD against any commit and see: the AI contribution percentage, broken down by tab completions and agent activity; the linked conversations with the developer's prompt and the model used; the files each conversation touched; and, in environments configured for it, the full reasoning chain the AI followed.

The Hardest Problem Was Not Technical

The hardest design problem was philosophical.

Developers rightly resist surveillance of their work. If a tool logs every prompt and exposes every experimental conversation to management, developers stop using AI assistants, or find ways around the tooling. That is a worse outcome than no governance at all.

But organisations have legitimate needs. Production code that handles payments, authentication, or personal data needs traceability. Regulated industries need audit trails. Boards need to understand the risk profile of AI-generated code.

The resolution is what we call progressive governance: provenance disclosure that adapts to context. Grain defines four levels, each tied to a configured environment. A playground repo shows AI percentage and model attribution. A development environment adds conversation summaries. A production environment adds transcript excerpts and tool execution metadata. A regulated environment provides the complete audit trail.
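One way to picture the four levels is as an ordered scale where each level discloses what the level below it does, plus something more. The sketch below is illustrative only: the Level enum, the field names, and the assumption that the regulated level is strictly cumulative are ours, not Grain's configuration format.

```python
from enum import IntEnum


class Level(IntEnum):
    """The four environments named in the article, in increasing disclosure."""
    PLAYGROUND = 1
    DEVELOPMENT = 2
    PRODUCTION = 3
    REGULATED = 4


# What each level adds on top of the one below it (field names are hypothetical).
ADDED_AT = {
    Level.PLAYGROUND: ("ai_percentage", "model_attribution"),
    Level.DEVELOPMENT: ("conversation_summaries",),
    Level.PRODUCTION: ("transcript_excerpts", "tool_execution_metadata"),
    Level.REGULATED: ("complete_audit_trail",),
}


def disclosed_fields(level):
    """Return every field visible at `level`, assuming levels are cumulative."""
    fields = []
    for lvl in Level:  # iterates in definition order, lowest level first
        if lvl <= level:
            fields.extend(ADDED_AT[lvl])
    return fields


print(disclosed_fields(Level.DEVELOPMENT))
# ['ai_percentage', 'model_attribution', 'conversation_summaries']
```

Modelling the levels as an ordered enum keeps the rule simple: moving a repository to a stricter environment can only add disclosure, never remove it.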

A developer experimenting in a hackathon repo does not need the same oversight as a team shipping code to a financial system handling personal data. The tool adapts rather than imposing a single posture everywhere.

This felt important to get right. Grain is not bossware. It is a utility that discloses only as much provenance as the context calls for. The distinction matters for adoption, and adoption is the only path to governance that actually works.

What Running It Against Real Code Showed

The numbers are striking when you run Grain against an active AI-assisted codebase.

Commits at 94% AI contribution are not unusual. A commit that generates a complete test suite shows 100%. The breakdown between tab completions and agent/composer activity matters: a commit with 90% AI contribution from inline tab suggestions, where a developer reviewed every line, carries different risk than 90% from an autonomous agent session that generated an entire module from a single vague prompt.

The linked conversations tell a story. You can see a developer's prompt, the exact phrasing they used, alongside the model's plan, the files it decided to touch, and the code it wrote. For a code reviewer or security auditor, this is actionable information. For the developer, it is a transparent record of their own workflow. No surprises. No surveillance. Just context.

The policy engine adds teeth. Flag commits where AI modified files in your auth/ or billing/ directory. Require review for commits produced by agent mode sessions. Restrict which models are allowed to generate production code. These are the governance rules organisations actually need, and because the tool exits with code 1 on violations, they can plug into existing CI pipelines.
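The three example rules above, and the exit-code convention that lets CI consume them, can be sketched as follows. Everything named here is hypothetical: the CommitProvenance shape, the "agent" and "tab" labels, and the placeholder model names are assumptions for illustration and do not reflect Grain's real policy syntax.

```python
import sys
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class CommitProvenance:
    """Hypothetical summary of what is known about one commit."""
    ai_modified_files: tuple
    generation_types: frozenset  # e.g. {"tab", "agent"}
    models: frozenset


SENSITIVE_DIRS = ("auth", "billing")
ALLOWED_MODELS = frozenset({"model-a"})  # placeholder model name


def find_violations(p):
    """Apply the three example rules and return human-readable violations."""
    violations = []
    for path in p.ai_modified_files:
        # Match any directory component, not the file name itself.
        directories = PurePosixPath(path).parts[:-1]
        if any(part in SENSITIVE_DIRS for part in directories):
            violations.append(f"AI modified sensitive file: {path}")
    if "agent" in p.generation_types:
        violations.append("agent-mode session: review required")
    for model in sorted(p.models - ALLOWED_MODELS):
        violations.append(f"model not allowed for production code: {model}")
    return violations


def exit_code(p):
    """Print violations to stderr; return 1 if there are any, else 0."""
    violations = find_violations(p)
    for v in violations:
        print(v, file=sys.stderr)
    return 1 if violations else 0


example = CommitProvenance(
    ai_modified_files=("auth/login.ts", "README.md"),
    generation_types=frozenset({"agent"}),
    models=frozenset({"model-a", "model-b"}),
)
status = exit_code(example)  # a CI wrapper would call sys.exit(status)
```

Returning a non-zero status is all a CI system needs: the pipeline step fails, and the messages on stderr explain why.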

What We Learned About the SDLC

Building Grain clarified something about where software development is heading.

The commit has been the atomic unit of authorship for decades. Code review, blame, bisect, audit: everything flows through the commit graph. The commit tells you who changed what, when.

But when an AI agent generates 340 lines of code from a single prompt, the commit is a lossy record. It captures the output but not the process. The conversation is the new unit of authorship. The prompt, the model's reasoning, the tool calls, the iterations: this is where the engineering decisions actually happen. Our review processes, governance frameworks, and compliance structures need to catch up with that reality.

Grain is a step toward making that shift concrete. But the more important observation is structural: the data to do this already exists on every developer's machine. Every Cursor user already has a provenance trail they are not reading. The infrastructure for AI code governance does not need to be built from scratch. It needs to be connected.

What Is Next

Grain CLI today is a local macOS tool that proves the concept. The roadmap from here involves cross-platform support, Git hook and CI integration to make policy checks automatic, and multi-editor support beyond Cursor. The longer-term vision is Grain Central: an opt-in centralised service for organisation-wide visibility, aggregating provenance events from developer machines to power team-level dashboards and compliance reporting.

The architecture for that is clear. What it needs is adoption at the local layer first, and trust that the privacy guarantees hold as more data flows through the system.

If you are asking "how can we trust AI-generated code?" in your own organisation, Grain CLI is available now. Open source, MIT licence, runs locally, nothing leaves the developer's machine.

The provenance chain from prompt to commit is reconstructable today, using data that already exists. The question is whether engineering organisations will start connecting those dots before regulators require it.

If you are thinking about how AI-native development changes your platform's tooling, DX, and governance posture, our Developer Tools playbooks cover the assessment frameworks and patterns for building platforms that work for humans and agents alike.


Grain CLI will soon be open source on GitHub. Read the full technical deep-dive on the Beach Labs page.
