Grant-scale execution plan

CML $75k–$100k Grant Plan

A credible funding ask for turning Causal Memory Layer from an early open-source causal audit artifact into a benchmark-backed, externally reproducible research prototype for causal-validity checking in agentic AI workflows.

One-sentence ask

We request $75k–$100k to expand CML into a reproducible benchmark and external-validation package for detecting causally invalid actions in agentic AI workflows.

Short thesis

AI agents increasingly perform actions, not only generate text. In high-stakes workflows, an action can succeed operationally while lacking valid authorization, approval, intent, or responsibility lineage.

What happened is not enough.
We need to know why it was allowed.
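The distinction between "what happened" and "why it was allowed" can be made concrete with a minimal sketch. The record shape, field names, and ids below are hypothetical illustrations, not CML's actual schema: an action counts as unjustified when it succeeded operationally but does not point to a known approval record.

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical record shape for illustration only; not CML's real data model.
@dataclass(frozen=True)
class ActionRecord:
    action_id: str
    succeeded: bool             # operational outcome: what happened
    approved_by: Optional[str]  # id of an approval record: why it was allowed


def find_unjustified(actions, approvals):
    """Return ids of actions that succeeded without a known approval record."""
    return [
        a.action_id
        for a in actions
        if a.succeeded and a.approved_by not in approvals
    ]


approvals = {"appr-1"}
actions = [
    ActionRecord("transfer-1", True, "appr-1"),  # valid lineage
    ActionRecord("transfer-2", True, None),      # no approval at all
    ActionRecord("transfer-3", True, "appr-9"),  # reference to an unknown approval
]

print(find_unjustified(actions, approvals))
```

All three actions succeeded, so an operational log would look clean; only the lineage check separates the first from the other two.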

Current baseline

CML already has a Python causal validation and audit engine, causal chain reconstruction utilities, CLI and API surfaces, deterministic benchmark fixtures, tracked benchmark results, a Docker demo walkthrough, a grant evidence package, explicit non-claims, an LTP/CML architecture bridge, and a reviewer checklist. The tracked benchmark results for the current fixture set are:

Total cases: 6
Matched cases: 6
Mismatches: 0
Expected passed / failed: 3 / 3
Predicted passed / failed: 3 / 3
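As a rough sketch of how summary counts like these could be derived, the snippet below compares expected and predicted outcomes per case. The function name, dictionary keys, and case ids are assumptions made for illustration; only the 6-case, 3 passed / 3 failed, zero-mismatch figures come from the baseline above.

```python
from collections import Counter


def summarize(expected, predicted):
    """Compare expected and predicted outcomes per case and return summary counts."""
    matched = sum(
        1 for case_id, outcome in expected.items()
        if predicted.get(case_id) == outcome
    )
    exp = Counter(expected.values())
    pred = Counter(predicted.values())
    return {
        "total_cases": len(expected),
        "matched_cases": matched,
        "mismatches": len(expected) - matched,
        "expected_passed_failed": (exp["passed"], exp["failed"]),
        "predicted_passed_failed": (pred["passed"], pred["failed"]),
    }


# Placeholder case ids; the real fixture names are not given in this plan.
expected = {f"case-{i}": "passed" for i in range(1, 4)}
expected.update({f"case-{i}": "failed" for i in range(4, 7)})
predicted = dict(expected)  # a run in which every prediction matches

summary = summarize(expected, predicted)
print(summary)
```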

What the grant will produce

30–50 deterministic benchmark fixtures
10–12 causal invalidity failure classes
machine-readable expected findings
Markdown + JSON benchmark reports
2–5 external validation notes
Docker-based reproducibility path
short technical report
clear benchmark limitations and non-claims
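To illustrate what "machine-readable expected findings" might look like in practice, here is a minimal sketch of a fixture metadata record and a loader that rejects incomplete records. The schema, key names, failure class label, and finding code are all assumptions for illustration; the actual fixture format would be defined during the grant.

```python
import json

# Assumed schema: these key names are illustrative, not CML's fixture format.
REQUIRED_KEYS = {"case_id", "failure_class", "expected_outcome", "expected_findings"}


def load_fixture(text):
    """Parse a fixture metadata record and reject it if required fields are missing."""
    fixture = json.loads(text)
    missing = REQUIRED_KEYS - set(fixture)
    if missing:
        raise ValueError(f"fixture missing keys: {sorted(missing)}")
    if fixture["expected_outcome"] not in ("passed", "failed"):
        raise ValueError("expected_outcome must be 'passed' or 'failed'")
    return fixture


# Hypothetical example record; the class and code names are placeholders.
EXAMPLE = """
{
  "case_id": "example-001",
  "failure_class": "missing_approval",
  "expected_outcome": "failed",
  "expected_findings": [
    {"code": "MISSING_APPROVAL", "action_id": "transfer-2"}
  ]
}
"""

fixture = load_fixture(EXAMPLE)
print(fixture["failure_class"], len(fixture["expected_findings"]))
```

Keeping expected findings next to each fixture lets a validator compare the engine's output against a declared answer rather than against a prose description.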

Why $75k–$100k is justified

A smaller grant can fund maintenance and documentation. A $75k–$100k grant funds a complete evidence package: benchmark taxonomy, fixture expansion, expected finding metadata, report generation, external validation, technical report, API/demo hardening, and integration boundary docs.

The value is not only code. The value is producing a reusable evaluation artifact that other researchers and engineers can run, inspect, critique, and extend.

Workstreams

1. Benchmark taxonomy and fixture expansion

Move from 6 curated fixtures to 30–50 benchmark fixtures across 10–12 causal failure classes.

2. Benchmark runner and report generation

Make benchmark outputs easy to reproduce, compare, and cite.
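A minimal sketch of the Markdown + JSON reporting idea, assuming a flat summary dictionary as input. The function names and the report layout are hypothetical; sorting keys is one simple way to keep output byte-identical across runs so that reports can be diffed and cited.

```python
import json


def render_json(summary):
    """Serialize the summary with sorted keys so repeated runs produce identical text."""
    return json.dumps(summary, indent=2, sort_keys=True)


def render_markdown(summary):
    """Render the same summary as a small Markdown table."""
    lines = ["# Benchmark report", "", "| Metric | Value |", "| --- | --- |"]
    for key in sorted(summary):
        lines.append(f"| {key} | {summary[key]} |")
    return "\n".join(lines) + "\n"


# Values taken from the current baseline; the key names are illustrative.
summary = {"total_cases": 6, "matched_cases": 6, "mismatches": 0}

print(render_json(summary))
print(render_markdown(summary))
```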

3. External validation package

Show that reviewers outside the project can reproduce CML results from public instructions.

4. Docker/API demo hardening

Make the demo path reviewer-friendly and reliable on common local environments.

5. Technical report

Publish a short technical report explaining the model, benchmark method, results, and limitations.

6. Integration boundaries

Clarify how CML relates to LTP, T-Trace, CaPU, TTM DB, logs, observability, and runtime policy systems.

Budget model

$75k ask

| Category | Amount | Purpose |
| --- | --- | --- |
| Benchmark expansion | $25k | taxonomy, fixtures, expected findings, controls |
| Runner/reporting | $12k | JSON/Markdown reports, metadata, mismatch summaries |
| External validation | $12k | protocol, validator support, reproduction issue handling |
| Docker/API hardening | $8k | walkthrough reliability, payloads, demo scripts |
| Technical report | $10k | methodology, results, limitations, publication-quality draft |
| Maintenance/community | $8k | contributor review, docs polish, CI support |

$100k ask

| Category | Amount | Purpose |
| --- | --- | --- |
| Benchmark expansion | $32k | 50 fixtures, 12 failure classes, stronger controls |
| Runner/reporting | $15k | richer reports, version comparison, machine-readable outputs |
| External validation | $18k | 3–5 validators, validation notes, reproduction issue loop |
| Docker/API hardening | $10k | stronger local demo, API examples, troubleshooting |
| Technical report | $15k | full technical report + publishable artifact |
| Integration boundary docs | $5k | LTP/T-Trace/CaPU boundary clarity |
| Maintenance/community | $5k | contributor coordination and review |
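As a quick arithmetic check on the two budget tables above, the sketch below sums each set of line items (in thousands of dollars, copied from the tables) and prints each category's share of the ask. The variable and function names are illustrative only.

```python
# Line items in thousands of USD, copied from the two budget tables.
BUDGET_75K = {
    "Benchmark expansion": 25,
    "Runner/reporting": 12,
    "External validation": 12,
    "Docker/API hardening": 8,
    "Technical report": 10,
    "Maintenance/community": 8,
}

BUDGET_100K = {
    "Benchmark expansion": 32,
    "Runner/reporting": 15,
    "External validation": 18,
    "Docker/API hardening": 10,
    "Technical report": 15,
    "Integration boundary docs": 5,
    "Maintenance/community": 5,
}


def total(budget):
    """Sum all line items of a budget, in thousands of USD."""
    return sum(budget.values())


for label, budget in (("$75k ask", BUDGET_75K), ("$100k ask", BUDGET_100K)):
    print(f"{label}: total ${total(budget)}k")
    for category, amount in budget.items():
        share = 100 * amount / total(budget)
        print(f"  {category}: ${amount}k ({share:.0f}%)")
```

Both sets of line items add up to their stated asks. Benchmark expansion is the largest category in each, at roughly a third of the total.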

Milestones

Success criteria

The grant succeeds if an external reviewer can say: "I can run CML locally, reproduce benchmark findings, inspect expected results, read external validation notes, and understand exactly what the benchmark proves and does not prove."

What this grant does not claim

This grant does not claim that CML will solve AI alignment, provide certified compliance, replace IAM/SIEM/EDR/observability stacks, prevent all unsafe actions, or prove a deployed AI system is safe.

CML will provide a reproducible benchmark-backed research prototype for causal-validity checking in structured agentic action traces.

Best grant ask wording

For $75k

We request $75,000 to expand CML into a benchmark-backed, externally reproducible research artifact for causal-validity checking in agentic AI workflows, including 30–50 fixtures, report generation, external validation notes, and a technical report.

For $100k

We request $100,000 to produce a more complete open-source causal-validity evaluation package for agentic AI workflows, including expanded benchmarks, machine-readable expected findings, external validation across multiple reviewers, Docker/API reproducibility, and a publication-ready technical report.

Bottom line

The $75k–$100k case is credible if the application frames CML as open-source AI safety research infrastructure for reproducible causal-validity evaluation, not as a finished production compliance platform.