Overview
"How can we trust AI-generated code?" This question comes up in every board meeting, every investor call, every CTO conversation where AI-assisted development is on the table. Git doesn't have the answer. Your SDLC doesn't account for it. Most compliance frameworks can't explain it.
Grain CLI is our answer, built for our own workflow first. It traces AI-generated code back to the prompt and conversation that created it, reads Cursor's local telemetry to reconstruct provenance, evaluates commits against configurable governance policies, and does all of this locally, with no data leaving the developer's machine.
Open source under the MIT licence. Available on npm. Built with Node.js and TypeScript.
The Trust Gap
In many engineering teams using AI coding assistants, AI contribution rates have climbed past 80%, sometimes 90%. Entire features are authored in a single conversation between a developer and an autonomous agent. A commit that was carefully hand-crafted line by line and a commit generated wholesale by an autonomous AI agent look identical in your commit history. They should not be treated identically.
The regulatory environment is making this concrete. The EU AI Act introduces transparency obligations for AI systems, with enforcement phasing in from 2025 through 2027. GDPR already requires organisations to explain automated decision-making. Code provenance is becoming non-optional.
The Insight: The Data Already Exists
Cursor maintains three rich local data sources on every developer's machine:
- A SQLite database at ~/.cursor/ai-tracking/ai-code-tracking.db containing AI contribution scores per commit, code fingerprints tied to conversations, and model attribution
- Agent transcripts at ~/.cursor/projects/ capturing complete conversation records: user prompts, assistant reasoning, tool calls, and generated code
- Git, the ground truth for what actually shipped
Together, these three sources reconstruct the complete chain from prompt to commit. Nobody was connecting them. That realisation became Grain CLI.
How It Works
Grain's core technical challenge was that Cursor has no direct foreign key between commits and conversations. We solved this with a heuristic join: match changed files from a commit against AI code hash records within a time window, rank conversations by number of matching entries, and surface the top candidates.
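The heuristic join described above can be sketched roughly as follows. This is a minimal illustration, not Grain's actual implementation: the record shapes (AiCodeHashRecord, CommitInfo), the function name rankConversations, and the choice of a look-back window ending at the commit time are all assumptions made for the example, since Cursor's real schema is not described here.

```typescript
// Hypothetical shapes; the real Cursor tables and columns may differ.
interface AiCodeHashRecord {
  conversationId: string;
  filePath: string; // assumed to be repository-relative
  timestampMs: number; // when the AI-generated code was recorded
}

interface CommitInfo {
  sha: string;
  timestampMs: number;
  changedFiles: string[];
}

interface ConversationCandidate {
  conversationId: string;
  matchCount: number;
}

// Rank conversations by how many of their AI code hash records touch a file
// changed in the commit and fall inside the look-back window before it.
function rankConversations(
  commit: CommitInfo,
  records: AiCodeHashRecord[],
  windowMs: number,
  topN: number = 3,
): ConversationCandidate[] {
  const changed = new Set(commit.changedFiles);
  const counts = new Map<string, number>();

  for (const record of records) {
    const age = commit.timestampMs - record.timestampMs;
    const inWindow = age >= 0 && age <= windowMs;
    if (!inWindow || !changed.has(record.filePath)) continue;
    counts.set(record.conversationId, (counts.get(record.conversationId) ?? 0) + 1);
  }

  return Array.from(counts.entries())
    .map(([conversationId, matchCount]) => ({ conversationId, matchCount }))
    // Most matches first; ties broken by ID so the ordering is deterministic.
    .sort((a, b) => b.matchCount - a.matchCount || a.conversationId.localeCompare(b.conversationId))
    .slice(0, topN);
}
```

Because the join is a heuristic, the output is a ranked list of candidates rather than a single answer; a caller would typically surface the top entry and keep the rest as alternatives.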
The result is the complete provenance chain:
- grain commit-trend shows AI contribution percentage across recent commits, broken down by tab completions, composer/agent, and human-written code
- grain explain traces a specific commit to its originating conversation, showing the developer's prompt, the model used, the conversation mode (agent vs interactive), and the files it touched
- grain policy-check evaluates a commit against configurable governance rules and exits with code 1 on violations, enabling CI integration
- grain convo reads and displays AI conversation context for any identified conversation ID
The Policy Engine
Five policy types evaluate commits against configurable rules defined in a per-repository .grain/config.yaml:
- AI contribution thresholds: Warn above a configurable percentage (e.g. 60%), flag violations above a harder limit (e.g. 90%)
- Restricted paths: Flag when AI has modified files in sensitive directories such as auth/, billing/, or security/
- Agent mode review: Flag commits produced by autonomous agent conversations, where the model was planning and executing tool calls with less human oversight per line
- Allowed models: Restrict which AI models may generate production code, supporting organisations with model governance requirements
- AI volume: Flag unusually large AI generation bursts as a proxy for unsupervised agent activity
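To make the five policy types concrete, here is a minimal sketch of how such an evaluation might look. The field names (aiPercentWarn, restrictedPaths, maxAiLines, and so on), the Finding shape, and the decision to treat every flag other than the lower threshold as a violation are assumptions for illustration; they are not the actual .grain/config.yaml schema or Grain's real severity mapping.

```typescript
type Severity = "warn" | "violation";

// Hypothetical, already-parsed policy configuration.
interface PolicyConfig {
  aiPercentWarn: number; // e.g. 60
  aiPercentViolation: number; // e.g. 90
  restrictedPaths: string[]; // e.g. ["auth/", "billing/", "security/"]
  flagAgentMode: boolean;
  allowedModels: string[];
  maxAiLines: number;
}

// Hypothetical facts gathered about one commit.
interface CommitFacts {
  aiPercent: number;
  aiTouchedFiles: string[]; // repository-relative paths
  agentMode: boolean;
  models: string[];
  aiLines: number;
}

interface Finding {
  policy: string;
  severity: Severity;
  message: string;
}

// True when the file sits inside one of the restricted directories at any depth.
function touchesRestrictedPath(file: string, dirs: string[]): boolean {
  return dirs.some((dir) => {
    const normalised = dir.endsWith("/") ? dir : dir + "/";
    return ("/" + file).includes("/" + normalised);
  });
}

function evaluatePolicies(facts: CommitFacts, config: PolicyConfig): Finding[] {
  const findings: Finding[] = [];

  if (facts.aiPercent > config.aiPercentViolation) {
    findings.push({ policy: "ai-threshold", severity: "violation", message: `AI contribution ${facts.aiPercent}% exceeds hard limit` });
  } else if (facts.aiPercent > config.aiPercentWarn) {
    findings.push({ policy: "ai-threshold", severity: "warn", message: `AI contribution ${facts.aiPercent}% exceeds warning level` });
  }

  const restricted = facts.aiTouchedFiles.filter((f) => touchesRestrictedPath(f, config.restrictedPaths));
  if (restricted.length > 0) {
    findings.push({ policy: "restricted-paths", severity: "violation", message: `AI modified: ${restricted.join(", ")}` });
  }

  if (config.flagAgentMode && facts.agentMode) {
    findings.push({ policy: "agent-mode", severity: "violation", message: "Commit produced by an agent conversation" });
  }

  const disallowed = facts.models.filter((m) => config.allowedModels.indexOf(m) === -1);
  if (disallowed.length > 0) {
    findings.push({ policy: "allowed-models", severity: "violation", message: `Models not allowed: ${disallowed.join(", ")}` });
  }

  if (facts.aiLines > config.maxAiLines) {
    findings.push({ policy: "ai-volume", severity: "violation", message: `${facts.aiLines} AI-generated lines exceeds ${config.maxAiLines}` });
  }

  return findings;
}

// Mirrors the documented behaviour: exit code 1 when any violation is present.
function exitCodeFor(findings: Finding[]): number {
  return findings.some((f) => f.severity === "violation") ? 1 : 0;
}
```

Keeping the evaluation a pure function of commit facts and configuration makes each rule easy to unit test, and leaves the exit-code decision as a thin final step for CI.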
Progressive Governance, Not Surveillance
The hardest design problem was not technical. Developers rightly resist surveillance of their work. If a tool exposes every experimental conversation to management, developers stop using AI assistants, which is a worse outcome than no governance at all.
Grain resolves this with progressive governance: four provenance levels that adapt disclosure to context.
- playground (level 0): AI%, model attribution, commit metadata only
- development (level 1): Adds conversation summaries, file list, agent mode flag
- production (level 2): Adds transcript excerpts, tool execution metadata
- regulated (level 3): Full transcripts, reasoning chains, complete tool call history
A developer experimenting in a hackathon repo does not need the same oversight as a team shipping code to a regulated financial system. The tool adapts rather than imposing a single posture everywhere.
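One way to picture the four levels is as a cumulative allow-list of output fields. The sketch below assumes each level discloses everything the lower levels do plus its own additions; the field names (aiPercent, conversationSummary, and so on) and the helper functions are hypothetical and only mirror the list above.

```typescript
const LEVELS = ["playground", "development", "production", "regulated"] as const;
type Environment = (typeof LEVELS)[number];

// Fields each level adds on top of the levels below it (illustrative names).
const FIELDS_ADDED: Record<Environment, string[]> = {
  playground: ["aiPercent", "modelAttribution", "commitMetadata"],
  development: ["conversationSummary", "fileList", "agentModeFlag"],
  production: ["transcriptExcerpts", "toolExecutionMetadata"],
  regulated: ["fullTranscript", "reasoningChain", "toolCallHistory"],
};

// All fields visible at a given environment, assuming disclosure is cumulative.
function visibleFields(env: Environment): string[] {
  const level = LEVELS.indexOf(env);
  return LEVELS.slice(0, level + 1).reduce<string[]>(
    (acc, name) => acc.concat(FIELDS_ADDED[name]),
    [],
  );
}

// Drop every field of a provenance record that the environment does not disclose.
function discloseFor(env: Environment, record: Record<string, unknown>): Record<string, unknown> {
  const allowed = new Set(visibleFields(env));
  const out: Record<string, unknown> = {};
  for (const key of Object.keys(record)) {
    if (allowed.has(key)) out[key] = record[key];
  }
  return out;
}
```

Expressing disclosure as an allow-list means a field added later stays hidden until it is explicitly assigned to a level, which is the safer default for a privacy-sensitive tool.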
Privacy Engineering
Privacy is not a feature flag. A sanitisation pipeline runs on every output path, always active:
- Absolute file paths are stripped to repository-relative paths
- Home directory references are replaced with ~ across all platforms
- API key patterns, bearer tokens, and high-entropy secrets are detected and redacted
- Internal URLs (localhost, private IP ranges, .local hostnames) are replaced with [INTERNAL_URL]
- Transcript excerpts are capped by provenance level, with no conversation content shown at level 0
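A stripped-down version of such a pipeline could look like the sketch below. It is an illustration under stated assumptions rather than Grain's real sanitiser: the function name sanitise, the [REDACTED] placeholder, and the specific key prefixes (sk, ghp) are chosen for the example, only POSIX-style paths are handled, and high-entropy secret detection and per-level excerpt capping are omitted for brevity.

```typescript
// Hosts treated as internal: localhost, loopback and private IPv4 ranges, .local names.
const INTERNAL_HOST =
  "(?:localhost|127(?:\\.\\d{1,3}){3}|10(?:\\.\\d{1,3}){3}|192\\.168(?:\\.\\d{1,3}){2}" +
  "|172\\.(?:1[6-9]|2\\d|3[01])(?:\\.\\d{1,3}){2}" +
  "|[A-Za-z0-9-]+(?:\\.[A-Za-z0-9-]+)*\\.local)";

// The lookahead stops public hosts that merely start with an internal name
// (for example "localhost.example.com") from being partially replaced.
const INTERNAL_URL = new RegExp(
  "https?:\\/\\/" + INTERNAL_HOST + "(?![A-Za-z0-9-]|\\.[A-Za-z0-9])(?::\\d+)?(?:\\/\\S*)?",
  "g",
);

const BEARER_TOKEN = /\bBearer\s+[A-Za-z0-9._~+\/-]+=*/gi;

// Illustrative key prefixes only; a real list would be much longer.
const API_KEY = /\b(?:sk|ghp)[-_][A-Za-z0-9]{20,}\b/g;

function sanitise(text: string, repoRoot: string, homeDir: string): string {
  let out = text;
  // 1. Absolute paths inside the repository become repository-relative.
  //    This runs first because the repository usually lives under the home directory.
  out = out.split(repoRoot + "/").join("");
  // 2. Any remaining home directory reference collapses to "~".
  out = out.split(homeDir).join("~");
  // 3. Redact credentials.
  out = out.replace(BEARER_TOKEN, "Bearer [REDACTED]");
  out = out.replace(API_KEY, "[REDACTED]");
  // 4. Hide internal URLs.
  out = out.replace(INTERNAL_URL, "[INTERNAL_URL]");
  return out;
}
```

Using split and join for the path steps avoids having to escape regular-expression metacharacters in user-specific directory names.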
Architecture
Grain is a layered Node.js CLI application built with TypeScript, targeting Node.js 18+. The architecture separates data access, analysis, rendering, and privacy into independent layers. Key design decisions:
- Read-only database access: the Cursor SQLite database is opened with readonly: true. Grain observes but never modifies
- Local-first, zero network: no HTTP requests, nothing leaves the developer's machine
- Privacy by default: sanitisation runs on every output with no opt-out flag
- Environment-aware output: the same commands produce different levels of detail based on the configured environment
What We Still Need to Build
Grain CLI proves the concept as a local macOS tool. Reaching enterprise-grade AI governance requires several pieces of work that we have already scoped:
- Cross-platform support: Linux and Windows path conventions, database locations, and home directory handling
- Git hook and CI integration: Pre-commit hooks and PR gates that run policy evaluation automatically
- Multi-editor support: Coverage beyond Cursor to VS Code with Copilot, JetBrains, Windsurf, and others
- Grain Central: An opt-in centralised aggregation service for organisation-wide AI adoption dashboards, model usage trends, and policy compliance reporting across repositories
- Risk scoring: Composite risk scores combining AI%, agent autonomy level, path sensitivity, and model governance status to feed existing code review workflows
The Deeper Shift
The commit is no longer the atomic unit of authorship. For decades, the commit has been the fundamental record of software change. But when an AI agent generates 340 lines of code from a single prompt, the commit captures the output, not the process. The conversation is the new unit of authorship.
Grain is a step toward making that shift concrete. The provenance chain from prompt to commit is reconstructable today, on every developer's machine, using data that already exists.
Status
Active development. Open source under MIT licence.