Grant-scale execution plan

CML $75k–$100k Grant Plan

A credible funding ask for turning Causal Memory Layer from an early open-source causal audit artifact into a benchmark-backed, externally reproducible research prototype for causal-validity checking in agentic AI workflows.

One-sentence ask

We request $75k–$100k to expand CML into a reproducible benchmark and external-validation package for detecting causally invalid actions in agentic AI workflows.

Short thesis

AI agents increasingly perform actions, not only generate text. In high-stakes workflows, an action can succeed operationally while lacking valid authorization, approval, intent, or responsibility lineage.

What happened is not enough.
We need to know why it was allowed.
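To make the distinction concrete, here is a toy sketch of an action that succeeds operationally but has no authorization lineage. The `Event` record, its fields, and the `causally_valid` helper are all hypothetical names invented for this illustration. They are not CML's actual schema or API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:
    """One record in a hypothetical action trace (not CML's real schema)."""
    event_id: str
    kind: str                         # e.g. "approval" or "action"
    caused_by: Optional[str] = None   # id of the event that authorized this one
    succeeded: bool = True            # operational outcome only


def causally_valid(action: Event, trace: dict) -> bool:
    """Return True only if the action's lineage reaches an approval event."""
    seen = set()
    current = action
    while current.caused_by is not None and current.caused_by not in seen:
        seen.add(current.caused_by)
        parent = trace.get(current.caused_by)
        if parent is None:
            return False  # dangling reference: lineage cannot be reconstructed
        if parent.kind == "approval":
            return True
        current = parent
    return False  # no parent, or a cycle, means no approval was found


trace = {
    "e1": Event("e1", "approval"),
    "e2": Event("e2", "action", caused_by="e1"),
    "e3": Event("e3", "action"),  # ran successfully, but nothing authorized it
}

for event_id in ("e2", "e3"):
    event = trace[event_id]
    print(event_id, "succeeded:", event.succeeded,
          "causally valid:", causally_valid(event, trace))
```

Both actions report success, yet only `e2` can be traced back to an approval. That gap between "it ran" and "it was allowed to run" is what the benchmark is meant to exercise.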

Current baseline

CML already has a Python causal validation and audit engine, causal chain reconstruction utilities, CLI and API surfaces, deterministic benchmark fixtures, tracked benchmark results, a Docker demo walkthrough, a grant evidence package, explicit non-claims, an LTP/CML architecture bridge, and a reviewer checklist. The tracked benchmark results currently stand at:

Total cases: 6
Matched cases: 6
Mismatches: 0
Expected passed / failed: 3 / 3
Predicted passed / failed: 3 / 3
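A summary of this shape can be derived from per-case verdicts, as in the minimal sketch below. The `cases` list is placeholder data arranged only to match the counts reported above, and `summarize` is a hypothetical helper, not the project's runner.

```python
from collections import Counter

# Placeholder (expected, predicted) verdict pairs, one per fixture.
# Assumption: these are illustrative, not the project's real fixture data.
cases = [
    ("pass", "pass"), ("pass", "pass"), ("pass", "pass"),
    ("fail", "fail"), ("fail", "fail"), ("fail", "fail"),
]


def summarize(cases):
    """Aggregate per-case verdicts into the summary fields listed above."""
    expected = Counter(e for e, _ in cases)
    predicted = Counter(p for _, p in cases)
    matched = sum(1 for e, p in cases if e == p)  # compared case by case
    return {
        "total": len(cases),
        "matched": matched,
        "mismatches": len(cases) - matched,
        "expected_passed_failed": (expected["pass"], expected["fail"]),
        "predicted_passed_failed": (predicted["pass"], predicted["fail"]),
    }


print(summarize(cases))
```

Matches are counted per case because equal aggregate counts alone (3 / 3 against 3 / 3) would not rule out two errors that cancel each other.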

What the grant will produce

30–50 deterministic benchmark fixtures
10–12 causal invalidity failure classes
machine-readable expected findings
Markdown + JSON benchmark reports
2–5 external validation notes
Docker-based reproducibility path
short technical report
clear benchmark limitations and non-claims

Why $75k–$100k is justified

A smaller grant can fund maintenance and documentation. A $75k–$100k grant funds a complete evidence package: benchmark taxonomy, fixture expansion, expected finding metadata, report generation, external validation, technical report, API/demo hardening, and integration boundary docs.

The value is not only the code. It is a reusable evaluation artifact that other researchers and engineers can run, inspect, critique, and extend.

Workstreams

1. Benchmark taxonomy and fixture expansion

Move from 6 curated fixtures to 30–50 benchmark fixtures across 10–12 causal failure classes.

documented causal invalidity taxonomy
fixture naming convention
positive and negative controls
expected finding metadata
expanded benchmark fixtures
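As a rough illustration of machine-readable fixture metadata, the sketch below pairs each fixture with its control type and expected findings. Everything here is an assumption made for the example: the field names, the `<class>__<variant>` naming pattern, the `missing_approval` failure class, and the reading of "positive control" as a trace that contains a known invalidity.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ExpectedFinding:
    failure_class: str  # hypothetical taxonomy label
    detail: str


@dataclass
class Fixture:
    fixture_id: str        # hypothetical naming: <class>__<variant>
    control: str           # "positive" (invalidity present) or "negative" (clean)
    expected_verdict: str  # "pass" or "fail"
    expected_findings: list = field(default_factory=list)


def metadata_problems(fixture: Fixture) -> list:
    """Return a list of consistency problems in a fixture's metadata."""
    problems = []
    if fixture.control not in ("positive", "negative"):
        problems.append("unknown control type")
    if fixture.expected_verdict == "fail" and not fixture.expected_findings:
        problems.append("failing fixture lists no expected finding")
    if fixture.expected_verdict == "pass" and fixture.expected_findings:
        problems.append("passing fixture lists expected findings")
    return problems


fixtures = [
    Fixture("missing_approval__basic", "positive", "fail",
            [ExpectedFinding("missing_approval", "action has no approval ancestor")]),
    Fixture("missing_approval__clean", "negative", "pass"),
]

for fixture in fixtures:
    print(fixture.fixture_id, metadata_problems(fixture))

# Serialize to JSON so expected findings can be diffed and reviewed.
print(json.dumps([asdict(f) for f in fixtures], indent=2))
```

Checking the metadata before running the benchmark keeps a mislabeled fixture from showing up later as a spurious mismatch.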

2. Benchmark runner and report generation

Make benchmark outputs easy to reproduce, compare, and cite.

Markdown report generation
JSON result export
benchmark version metadata
commit SHA and environment metadata
mismatch reporting
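The deliverables above could fit together roughly as follows. This is a sketch under stated assumptions, not the project's runner: the result record layout, the `build_report` and `to_markdown` helpers, and the version string are hypothetical. The commit SHA is read with `git rev-parse HEAD` and falls back to "unknown" outside a git checkout.

```python
import json
import platform
import subprocess


def git_commit_sha() -> str:
    """Return the current commit SHA, or "unknown" if git is unavailable."""
    try:
        out = subprocess.run(["git", "rev-parse", "HEAD"],
                             capture_output=True, text=True, check=True)
        return out.stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        return "unknown"


def build_report(results: list, benchmark_version: str) -> dict:
    """Collect results plus version, commit, and environment metadata."""
    mismatches = [r for r in results if r["expected"] != r["predicted"]]
    return {
        "benchmark_version": benchmark_version,
        "commit_sha": git_commit_sha(),
        "environment": {
            "python": platform.python_version(),
            "platform": platform.platform(),
        },
        "total": len(results),
        "matched": len(results) - len(mismatches),
        "mismatches": mismatches,
    }


def to_markdown(report: dict) -> str:
    """Render the same report as a short Markdown document."""
    lines = [
        "# Benchmark report",
        "",
        f"- Benchmark version: {report['benchmark_version']}",
        f"- Commit: {report['commit_sha']}",
        f"- Python: {report['environment']['python']}",
        f"- Matched: {report['matched']} / {report['total']}",
        "",
    ]
    if report["mismatches"]:
        lines += ["| Fixture | Expected | Predicted |", "| --- | --- | --- |"]
        lines += [f"| {m['fixture_id']} | {m['expected']} | {m['predicted']} |"
                  for m in report["mismatches"]]
    else:
        lines.append("No mismatches.")
    return "\n".join(lines)


# Hypothetical results, including one deliberate mismatch for illustration.
results = [
    {"fixture_id": "fixture_a", "expected": "pass", "predicted": "pass"},
    {"fixture_id": "fixture_b", "expected": "fail", "predicted": "pass"},
]
report = build_report(results, benchmark_version="0.1.0-example")
print(json.dumps(report, indent=2))
print(to_markdown(report))
```

Generating both formats from one report dictionary keeps the Markdown and JSON outputs from drifting apart.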

3. External validation package

Show that reviewers outside the project can reproduce CML results from public instructions.

validation protocol
clean environment instructions
external validation note template
2–5 validation notes
reproduction problems tracked as issues

4. Docker/API demo hardening

Make the demo path reviewer-friendly and reliable on common local environments.

5. Technical report

Publish a short technical report explaining the model, benchmark method, results, and limitations.

6. Integration boundaries

Clarify how CML relates to LTP, T-Trace, CaPU, TTM DB, logs, observability, and runtime policy systems.

Budget model

$75k ask

Category	Amount	Purpose
Benchmark expansion	$25k	taxonomy, fixtures, expected findings, controls
Runner/reporting	$12k	JSON/Markdown reports, metadata, mismatch summaries
External validation	$12k	protocol, validator support, reproduction issue handling
Docker/API hardening	$8k	walkthrough reliability, payloads, demo scripts
Technical report	$10k	methodology, results, limitations, publication-quality draft
Maintenance/community	$8k	contributor review, docs polish, CI support

$100k ask

Category	Amount	Purpose
Benchmark expansion	$32k	50 fixtures, 12 failure classes, stronger controls
Runner/reporting	$15k	richer reports, version comparison, machine-readable outputs
External validation	$18k	3–5 validators, validation notes, reproduction issue loop
Docker/API hardening	$10k	stronger local demo, API examples, troubleshooting
Technical report	$15k	full technical report + publishable artifact
Integration boundary docs	$5k	LTP/T-Trace/CaPU boundary clarity
Maintenance/community	$5k	contributor coordination and review

Milestones

Month 1: taxonomy, fixture metadata, initial 12–15 fixtures
Month 2: 25–35 fixtures, positive/negative controls, report outputs
Month 3: external validation, Docker walkthrough improvements, reproduction issue loop
Month 4: 30–50 fixtures, final benchmark report, technical report draft

Success criteria

A reviewer outside the project should be able to say: "I can run CML locally, reproduce benchmark findings, inspect expected results, read external validation notes, and understand exactly what the benchmark proves and does not prove."

What this grant does not claim

This grant does not claim that CML will solve AI alignment, provide certified compliance, replace IAM/SIEM/EDR/observability stacks, prevent all unsafe actions, or prove a deployed AI system is safe.

CML will provide a reproducible, benchmark-backed research prototype for causal-validity checking in structured agentic action traces.

Best grant ask wording

For $75k

We request $75,000 to expand CML into a benchmark-backed, externally reproducible research artifact for causal-validity checking in agentic AI workflows, including 30–50 fixtures, report generation, external validation notes, and a technical report.

For $100k

We request $100,000 to produce a more complete open-source causal-validity evaluation package for agentic AI workflows, including expanded benchmarks, machine-readable expected findings, external validation across multiple reviewers, Docker/API reproducibility, and a publication-ready technical report.

Bottom line

The $75k–$100k case is credible if the application frames CML as open-source AI safety research infrastructure for reproducible causal-validity evaluation, not as a finished production compliance platform.