Evals And Claims

The public CLI repo includes deterministic product tests and small synthetic fixtures. It is the surface that public product claims rest on, not a dump of exploratory research material or raw, unreduced evaluation runs.

Public claims should stay tied to:

  • behavior visible in the CLI;
  • reproducible public fixtures;
  • demo transcripts generated from current commands;
  • documented command output;
  • privacy and storage guarantees that are true in the public repo.
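
As one illustration of keeping a demo transcript tied to current command output, the check below re-runs a command and compares its stdout to a recorded transcript. This is a minimal sketch, not the repository's actual test harness: the function name `transcript_is_current`, the stand-in command, and the transcript text are all hypothetical, and a Python one-liner takes the place of a real CLI invocation so the example runs anywhere.

```python
import subprocess
import sys


def transcript_is_current(command: list[str], recorded: str) -> bool:
    """Re-run `command` and compare its stdout to a recorded transcript.

    Hypothetical helper for illustration; not part of any real CLI.
    """
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    # Normalize line endings so the comparison does not depend on the platform.
    return result.stdout.replace("\r\n", "\n") == recorded.replace("\r\n", "\n")


# Assumption: a stand-in for a real CLI command, chosen because its output is fixed.
STAND_IN_COMMAND = [sys.executable, "-c", "print('fixture-count: 3')"]
# Assumption: the transcript that would have been captured for a demo.
RECORDED_TRANSCRIPT = "fixture-count: 3\n"

if __name__ == "__main__":
    ok = transcript_is_current(STAND_IN_COMMAND, RECORDED_TRANSCRIPT)
    print("transcript current" if ok else "transcript stale")
```

A check like this only works when the command's output is deterministic, which is why it pairs naturally with small synthetic fixtures rather than live or private data.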

For the current CLI boundary, see the public repository's EVALS.md.