Getting Started With Contexta

This guide takes you from installation to a captured metric produced by code that actually ran.

What You'll Learn

how to install Contexta for a first capture
how to create a local .contexta/ workspace
how an executed workflow produces an observable metric
what to inspect after the run

Prerequisites

Python >=3.14
a writable directory for .contexta/

No cloud account, external database, or collector process is required.
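
If you want to confirm both prerequisites before installing, a small pre-flight script can help. The sketch below is not part of Contexta; the function name check_prerequisites and the MIN_PYTHON constant are illustrative names, and the check relies only on the Python standard library.

```python
import os
import sys
from pathlib import Path

# Minimum interpreter version, taken from the Prerequisites list.
MIN_PYTHON = (3, 14)


def check_prerequisites(parent_dir="."):
    """Report whether the interpreter and the target directory meet the prerequisites.

    parent_dir is the directory in which .contexta/ would be created.
    """
    parent = Path(parent_dir)
    return {
        "python_ok": sys.version_info >= MIN_PYTHON,
        "directory_writable": parent.is_dir() and os.access(parent, os.W_OK),
    }


# Print one line per check for the current directory.
for name, ok in check_prerequisites().items():
    print(f"{name}: {'ok' if ok else 'not satisfied'}")
```

Run it from the directory where you plan to create the workspace. Note that os.access reports permissions as the operating system sees them at that moment, so treat the result as a hint rather than a guarantee.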

Install

uv add "contexta[sklearn]"

Run Your First Capture

This example trains a linear regression model on scikit-learn's diabetes dataset, evaluates it on a held-out test split, and records the resulting r2 score as a metric.

from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

from contexta import Contexta


features, targets = load_diabetes(return_X_y=True)
train_x, test_x, train_y, test_y = train_test_split(
    features, targets, test_size=0.2, random_state=42
)

model = LinearRegression()
ctx = Contexta(workspace=".contexta", config={"project_name": "getting-started"})

with ctx.run("training") as run:
    with run.stage("train"):
        model.fit(train_x, train_y)

    with run.stage("evaluate") as stage:
        score = r2_score(test_y, model.predict(test_x))
        stage.metric("r2", score, unit="ratio")

print(f"Captured run: {run.ref}")
print(f"Measured r2: {score:.3f}")

Save the code as contexta_example.py, then run it in the environment where you installed Contexta:

uv run contexta_example.py

What Happened

The terminal output tells you that one run was observed and that its evaluation produced an r2 value:

Captured run: run:getting-started.training
Measured r2: 0.453

Contexta also writes the evidence to:

.contexta/
  cache/capture/record.jsonl

record.jsonl is a JSON Lines file: each line is one captured observation, stored as a single JSON object. The line created by this example contains information like this (pretty-printed here across several lines for readability):

{
  "captured_at": "2026-05-25T10:26:03.502991Z",
  "payload": {
    "envelope": {
      "record_type": "metric",
      "run_ref": "run:getting-started.training",
      "stage_execution_ref": "stage:getting-started.training.evaluate",
      "completeness_marker": "complete",
      "degradation_marker": "none"
    },
    "payload": {
      "metric_key": "r2",
      "value": 0.4526027629719198,
      "unit": "ratio"
    }
  },
  "payload_type": "MetricRecord",
  "sink_name": "local-jsonl"
}

You can read that record as:

Field	Plain-language meaning
`payload_type: "MetricRecord"`	This observation is a measured number, not a log message or file.
`metric_key: "r2"`	The number measures the evaluation score calculated by the program.
`value: 0.4526...`	This is the actual result measured in this run.
`run_ref`	The result belongs to the `training` run in the `getting-started` project.
`stage_execution_ref`	The result was produced during the `evaluate` step, not during training.
`captured_at`	When Contexta stored this observation.
`completeness_marker` / `degradation_marker`	Contexta captured this record completely and did not mark it as degraded.
`sink_name: "local-jsonl"`	The record was stored locally in this JSON Lines file.

In other words, Contexta saved more than the number 0.453. It also saved what that number means, which run and stage produced it, when it was captured, and whether the capture was complete.
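
To inspect the captured record from Python rather than by opening the file, you can parse record.jsonl with the standard library. This is a minimal sketch that assumes the file layout and field names shown in the example record above; read_metrics is an illustrative helper, not a Contexta API, and the real file may contain other record types or fields.

```python
import json
from pathlib import Path


def read_metrics(path):
    """Return one flat dict per metric record found in a JSON Lines file."""
    metrics = []
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            record = json.loads(line)
            if record.get("payload_type") != "MetricRecord":
                continue  # skip anything that is not a metric
            # Field names below are assumed from the example record.
            envelope = record["payload"]["envelope"]
            body = record["payload"]["payload"]
            metrics.append(
                {
                    "key": body["metric_key"],
                    "value": body["value"],
                    "unit": body.get("unit"),
                    "run": envelope["run_ref"],
                    "stage": envelope["stage_execution_ref"],
                    "captured_at": record["captured_at"],
                }
            )
    return metrics


# Path taken from the workspace layout shown earlier in this section.
record_file = Path(".contexta/cache/capture/record.jsonl")
if record_file.exists():
    for metric in read_metrics(record_file):
        print(f'{metric["stage"]}: {metric["key"]} = {metric["value"]} ({metric["unit"]})')
else:
    print(f"No capture file found at {record_file}")
```

Reading the file line by line keeps memory use flat as the file grows, which suits the one-observation-per-line format.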

Core Concepts

Contexta groups what it stores into three planes:

Plane	What it stores
Metadata	Projects, runs, stages, environments, deployments, batches, samples
Records	Metrics, structured events, spans, degraded markers
Artifacts	Models, checkpoints, prompt/evaluation files, reports

Where To Go Next

Capture Evidence explains what the real workflows preserve.
Compare Runs trains and compares actual candidates.
Common Workflows covers reports, diagnostics, and lineage.
API Reference documents the Python boundaries.
