Beach — Innovation Lab

Overview

"How can we trust AI-generated code?" This question comes up in every board meeting, every investor call, every CTO conversation where AI-assisted development is on the table. Git doesn't have the answer. Your SDLC doesn't account for it. Most compliance frameworks can't explain it.

Grain CLI is our answer, built for our own workflow first. It traces AI-generated code back to the prompt and conversation that created it, reads Cursor's local telemetry to reconstruct provenance, evaluates commits against configurable governance policies, and does all of this locally, with no data leaving the developer's machine.

Open source under the MIT licence. Available on npm. Built with Node.js and TypeScript.

The Trust Gap

In many engineering teams using AI coding assistants, AI contribution rates have climbed past 80%, sometimes 90%. Entire features are authored in a single conversation between a developer and an autonomous agent. A commit that was carefully hand-crafted line by line and a commit generated wholesale by an autonomous AI agent look identical in your commit history. They should not be treated identically.

The regulatory environment is making this concrete. The EU AI Act introduces transparency obligations for AI systems, with enforcement phasing in between 2025 and 2027. GDPR already requires organisations to explain automated decision-making. Code provenance is becoming non-optional.

The Insight: The Data Already Exists

Three rich local data sources already exist on the machine of every developer using Cursor:

A SQLite database at ~/.cursor/ai-tracking/ai-code-tracking.db containing AI contribution scores per commit, code fingerprints tied to conversations, and model attribution
Agent transcripts at ~/.cursor/projects/ capturing complete conversation records: user prompts, assistant reasoning, tool calls, and generated code
Git, the ground truth for what actually shipped

Together, these three sources reconstruct the complete chain from prompt to commit. Nobody was connecting them. That realisation became Grain CLI.

How It Works

Grain's core technical challenge was that Cursor's local data has no direct foreign key linking commits to conversations. We solved this with a heuristic join: match the files changed in a commit against AI code hash records within a time window, rank conversations by their number of matching entries, and surface the top candidates.
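The heuristic join described above might look roughly like the sketch below. It is an illustration only, not Grain's actual implementation: the record shapes, the field names, and the choice to look only at the window before the commit time are all assumptions made for the example.

```typescript
// Hypothetical record shapes; Cursor's real schema is not described in this document.
interface CodeHashRecord {
  conversationId: string;
  filePath: string;    // repository-relative path
  timestampMs: number; // when the AI-generated code was recorded
}

interface CommitInfo {
  changedFiles: string[];
  committedAtMs: number;
}

interface Candidate {
  conversationId: string;
  matches: number;
}

// Rank conversations by how many of their hash records touch files changed in
// the commit, counting only records from the `windowMs` before the commit
// (assumption: the time window looks backwards from the commit time).
function rankConversations(
  commit: CommitInfo,
  records: CodeHashRecord[],
  windowMs: number,
  topN = 3,
): Candidate[] {
  const changed = new Set(commit.changedFiles);
  const counts = new Map<string, number>();

  for (const record of records) {
    const age = commit.committedAtMs - record.timestampMs;
    if (age < 0 || age > windowMs) continue; // outside the time window
    if (!changed.has(record.filePath)) continue; // file not in this commit
    counts.set(record.conversationId, (counts.get(record.conversationId) ?? 0) + 1);
  }

  const candidates: Candidate[] = [];
  counts.forEach((matches, conversationId) => {
    candidates.push({ conversationId, matches });
  });

  // Most matches first; ties broken by ID so the ordering is deterministic.
  candidates.sort(
    (a, b) => b.matches - a.matches || a.conversationId.localeCompare(b.conversationId),
  );
  return candidates.slice(0, topN);
}
```

Because the join is heuristic, the sketch returns several ranked candidates rather than a single answer, which matches the "surface the top candidates" behaviour described above.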

The result is a complete provenance chain, exposed through four commands:

grain commit-trend shows AI contribution percentage across recent commits, broken down by tab completions, composer/agent, and human-written code
grain explain traces a specific commit to its originating conversation, showing the developer's prompt, the model used, the conversation mode (agent vs interactive), and the files it touched
grain policy-check evaluates a commit against configurable governance rules and exits with code 1 on violations, enabling CI integration
grain convo reads and displays AI conversation context for any identified conversation ID
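To make the commit-trend breakdown concrete, here is a minimal sketch of how per-commit percentages could be derived from line counts. The input shape and field names are hypothetical; the document does not specify how Grain computes or stores these figures.

```typescript
// Hypothetical per-commit line counts by origin.
interface CommitLineCounts {
  sha: string;
  tabLines: number;      // lines from tab completions
  composerLines: number; // lines from composer/agent conversations
  humanLines: number;    // lines written by hand
}

interface CommitBreakdown {
  sha: string;
  tabPct: number;
  composerPct: number;
  humanPct: number;
  aiPct: number; // tab + composer combined
}

// Convert raw line counts into percentages rounded to one decimal place.
function breakdown(c: CommitLineCounts): CommitBreakdown {
  const total = c.tabLines + c.composerLines + c.humanLines;
  const pct = (n: number): number =>
    total === 0 ? 0 : Math.round((n / total) * 1000) / 10;

  return {
    sha: c.sha,
    tabPct: pct(c.tabLines),
    composerPct: pct(c.composerLines),
    humanPct: pct(c.humanLines),
    aiPct: pct(c.tabLines + c.composerLines),
  };
}
```

An empty commit yields zeros rather than dividing by zero, so a trend over many commits never contains NaN values.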

The Policy Engine

Five policy types evaluate commits against configurable rules defined in a per-repository .grain/config.yaml:

AI contribution thresholds: Warn above a configurable percentage (e.g. 60%), flag violations above a harder limit (e.g. 90%)
Restricted paths: Flag when AI has modified files in sensitive directories such as auth/, billing/, or security/
Agent mode review: Flag commits produced by autonomous agent conversations, where the model was planning and executing tool calls with less human oversight per line
Allowed models: Restrict which AI models may generate production code, supporting organisations with model governance requirements
AI volume: Flag unusually large AI generation bursts as a proxy for unsupervised agent activity
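A policy evaluator along these lines could be sketched as follows. Everything here is an assumption for illustration: the config field names do not reflect the real .grain/config.yaml schema, the severity assigned to each policy is a guess, and the AI volume policy is left out for brevity. The only behaviour taken from the description above is the two-tier threshold and the exit code of 1 on violations.

```typescript
type Severity = "warn" | "violation";

// Hypothetical config shape; the real schema may differ.
interface PolicyConfig {
  aiWarnPercent: number;      // e.g. 60
  aiViolationPercent: number; // e.g. 90
  restrictedPaths: string[];  // e.g. ["auth/", "billing/", "security/"]
  allowedModels: string[];
  flagAgentMode: boolean;
}

// Hypothetical facts gathered about one commit.
interface CommitFacts {
  aiPercent: number;
  aiTouchedFiles: string[]; // repository-relative paths modified by AI
  models: string[];
  agentMode: boolean;
}

interface Finding {
  policy: string;
  severity: Severity;
  message: string;
}

function evaluatePolicies(commit: CommitFacts, cfg: PolicyConfig): Finding[] {
  const findings: Finding[] = [];

  // AI contribution thresholds: a warning tier and a harder violation tier.
  if (commit.aiPercent > cfg.aiViolationPercent) {
    findings.push({ policy: "ai-threshold", severity: "violation",
      message: `AI contribution ${commit.aiPercent}% exceeds ${cfg.aiViolationPercent}%` });
  } else if (commit.aiPercent > cfg.aiWarnPercent) {
    findings.push({ policy: "ai-threshold", severity: "warn",
      message: `AI contribution ${commit.aiPercent}% exceeds ${cfg.aiWarnPercent}%` });
  }

  // Restricted paths: AI-modified files under a sensitive directory prefix.
  for (const file of commit.aiTouchedFiles) {
    const hit = cfg.restrictedPaths.find((prefix) => file.startsWith(prefix));
    if (hit !== undefined) {
      findings.push({ policy: "restricted-path", severity: "violation",
        message: `AI modified ${file} under ${hit}` });
    }
  }

  // Agent mode review (severity "warn" is an assumption).
  if (cfg.flagAgentMode && commit.agentMode) {
    findings.push({ policy: "agent-mode", severity: "warn",
      message: "Commit produced by an autonomous agent conversation" });
  }

  // Allowed models.
  for (const model of commit.models) {
    if (cfg.allowedModels.indexOf(model) === -1) {
      findings.push({ policy: "allowed-models", severity: "violation",
        message: `Model ${model} is not on the allowed list` });
    }
  }

  return findings;
}

// Exit code 1 when any violation is present; assumes warnings alone do not fail the check.
function exitCodeFor(findings: Finding[]): number {
  return findings.some((f) => f.severity === "violation") ? 1 : 0;
}
```

Returning findings as data and deriving the exit code separately keeps the evaluation testable without spawning a process, which is convenient when the same check runs both locally and in CI.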

Progressive Governance, Not Surveillance

The hardest design problem was not technical. Developers rightly resist surveillance of their work. If a tool exposes every experimental conversation to management, developers stop using AI assistants, which is a worse outcome than no governance at all.

Grain resolves this with progressive governance: four provenance levels that adapt disclosure to context.

playground (level 0): AI%, model attribution, commit metadata only
development (level 1): Adds conversation summaries, file list, agent mode flag
production (level 2): Adds transcript excerpts, tool execution metadata
regulated (level 3): Full transcripts, reasoning chains, complete tool call history

A developer experimenting in a hackathon repo does not need the same oversight as a team shipping code to a regulated financial system. The tool adapts rather than imposing a single posture everywhere.
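One way to picture the four levels is as a minimum disclosure level attached to each output field. The sketch below assumes hypothetical field names derived from the list above; it is not Grain's actual data model.

```typescript
type Environment = "playground" | "development" | "production" | "regulated";

const LEVEL: Record<Environment, number> = {
  playground: 0,
  development: 1,
  production: 2,
  regulated: 3,
};

// Minimum provenance level at which each field is disclosed.
// Field names are hypothetical.
const MIN_LEVEL = {
  aiPercent: 0,
  model: 0,
  commitMetadata: 0,
  conversationSummary: 1,
  fileList: 1,
  agentModeFlag: 1,
  transcriptExcerpts: 2,
  toolExecutionMetadata: 2,
  fullTranscript: 3,
  reasoningChain: 3,
  toolCallHistory: 3,
} as const;

type Field = keyof typeof MIN_LEVEL;
type ProvenanceRecord = Partial<Record<Field, unknown>>;

// List the fields an environment is allowed to see.
function visibleFields(env: Environment): Field[] {
  const level = LEVEL[env];
  return (Object.keys(MIN_LEVEL) as Field[]).filter((f) => MIN_LEVEL[f] <= level);
}

// Copy only the fields permitted at the given environment's level.
function disclose(record: ProvenanceRecord, env: Environment): ProvenanceRecord {
  const out: ProvenanceRecord = {};
  for (const field of visibleFields(env)) {
    if (field in record) out[field] = record[field];
  }
  return out;
}
```

Building the output from an allow-list, rather than deleting disallowed fields, means a field added later stays hidden until someone assigns it a level.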

Privacy Engineering

Privacy is not a feature flag. A sanitisation pipeline runs on every output path, always active:

Absolute file paths are stripped to repository-relative paths
Home directory references are replaced with ~ across all platforms
API key patterns, bearer tokens, and high-entropy secrets are detected and redacted
Internal URLs (localhost, private IP ranges, .local hostnames) are replaced with [INTERNAL_URL]
Transcript excerpts are capped by provenance level, with no conversation content shown at level 0
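A simplified sanitiser covering the first four rules could be sketched like this. The regular expressions are illustrative assumptions rather than Grain's real patterns: the sketch handles POSIX-style paths only, treats one common "sk-" key prefix as an example of an API key pattern, and omits high-entropy secret detection entirely.

```typescript
// Matches URLs pointing at localhost, loopback, private IPv4 ranges, or .local hosts.
const INTERNAL_URL =
  /https?:\/\/(?:localhost|127(?:\.\d{1,3}){3}|10(?:\.\d{1,3}){3}|192\.168(?:\.\d{1,3}){2}|172\.(?:1[6-9]|2\d|3[01])(?:\.\d{1,3}){2}|[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)*\.local)(?![A-Za-z0-9.-])(?::\d+)?(?:\/[^\s"'<>]*)?/g;

// Bearer tokens as they appear in Authorization headers.
const BEARER_TOKEN = /Bearer\s+[A-Za-z0-9._~+\/-]+=*/g;

// Example API key shape: "sk-" followed by 20 or more alphanumerics (assumption).
const API_KEY = /\bsk-[A-Za-z0-9]{20,}\b/g;

function sanitise(text: string, repoRoot: string, homeDir: string): string {
  let out = text;

  // Absolute paths inside the repository become repository-relative.
  // This runs first because the repository usually lives under the home directory.
  out = out.split(repoRoot + "/").join("");

  // Any remaining home directory reference becomes ~.
  out = out.split(homeDir).join("~");

  // Secrets are redacted before the text reaches any output path.
  out = out.replace(BEARER_TOKEN, "Bearer [REDACTED]");
  out = out.replace(API_KEY, "[REDACTED]");

  // Internal URLs are replaced wholesale, including port and path.
  out = out.replace(INTERNAL_URL, "[INTERNAL_URL]");

  return out;
}
```

For example, with a repository root of /Users/dev/work/app and a home directory of /Users/dev, the string "Edited /Users/dev/work/app/src/auth.ts and /Users/dev/notes.txt" comes out as "Edited src/auth.ts and ~/notes.txt".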

Architecture

Grain is a layered Node.js CLI application built with TypeScript, targeting Node.js 18+. The architecture separates data access, analysis, rendering, and privacy into independent layers. Key design decisions:

Read-only database access: the Cursor SQLite database is opened with readonly: true. Grain observes but never modifies
Local-first, zero network: no HTTP requests, nothing leaves the developer's machine
Privacy by default: sanitisation runs on every output with no opt-out flag
Environment-aware output: the same commands produce different levels of detail based on the configured environment

What We Still Need to Build

Grain CLI proves the concept as a local macOS tool. The path to enterprise-grade AI governance requires several pieces of work that we have already scoped:

Cross-platform support: Linux and Windows path conventions, database locations, and home directory handling
Git hook and CI integration: Pre-commit hooks and PR gates that run policy evaluation automatically
Multi-editor support: Coverage beyond Cursor to VS Code with Copilot, JetBrains, Windsurf, and others
Grain Central: An opt-in centralised aggregation service for organisation-wide AI adoption dashboards, model usage trends, and policy compliance reporting across repositories
Risk scoring: Composite risk scores combining AI%, agent autonomy level, path sensitivity, and model governance status to feed existing code review workflows

The Deeper Shift

The commit is no longer the atomic unit of authorship. For decades, the commit has been the fundamental record of software change. But when an AI agent generates 340 lines of code from a single prompt, the commit captures the output, not the process. The conversation is the new unit of authorship.

Grain is a step toward making that shift concrete. The provenance chain from prompt to commit is reconstructable today, on every developer's machine, using data that already exists.

Status

Active development. Open source under the MIT licence.

grain-cli.getforge.io
github.com/beachio/grain-cli
