Getting Started With Contexta

This guide gets you from installation to a captured metric from work that actually ran.

What You'll Learn

  • how to install Contexta for a first capture
  • how to create a local .contexta/ workspace
  • how an executed workflow produces an observable metric
  • what to inspect after the run

Prerequisites

  • Python >=3.14
  • a writable directory for .contexta/

No cloud account, external database, or collector process is required.
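If you want to confirm both prerequisites before installing, a short standard-library check along these lines may help. It is only a sketch: `check_prerequisites` is a hypothetical helper written for this guide, not part of Contexta, and it assumes the workspace will be created in the directory you pass in.

```python
import os
import sys


def check_prerequisites(directory=".", min_version=(3, 14)):
    """Return a list of human-readable problems; an empty list means all good.

    Hypothetical helper for this guide, not a Contexta API.
    """
    problems = []

    # The guide lists Python >=3.14 as a prerequisite.
    if sys.version_info < min_version:
        found = ".".join(str(part) for part in sys.version_info[:3])
        wanted = ".".join(str(part) for part in min_version)
        problems.append(f"Python {wanted}+ is required, found {found}")

    # The .contexta/ workspace needs a directory the current user can write to.
    if not os.access(directory, os.W_OK):
        problems.append(f"Directory is not writable: {os.path.abspath(directory)}")

    return problems


if __name__ == "__main__":
    issues = check_prerequisites()
    if issues:
        for issue in issues:
            print(f"Problem: {issue}")
    else:
        print("Prerequisites look fine.")
```

Run it from the directory where you plan to create .contexta/.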

Install

uv add "contexta[sklearn]"

Run Your First Capture

This example trains a linear regression model on scikit-learn's diabetes dataset, evaluates it on a held-out test split, and records the resulting r2 score as a metric.

from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

from contexta import Contexta


features, targets = load_diabetes(return_X_y=True)
train_x, test_x, train_y, test_y = train_test_split(
    features, targets, test_size=0.2, random_state=42
)

model = LinearRegression()
ctx = Contexta(workspace=".contexta", config={"project_name": "getting-started"})

with ctx.run("training") as run:
    with run.stage("train"):
        model.fit(train_x, train_y)

    with run.stage("evaluate") as stage:
        score = r2_score(test_y, model.predict(test_x))
        stage.metric("r2", score, unit="ratio")

print(f"Captured run: {run.ref}")
print(f"Measured r2: {score:.3f}")

Save the code as contexta_example.py, then run it in the environment where you installed Contexta:

uv run contexta_example.py

What Happened

The terminal output tells you that one run was observed and that its evaluation produced an r2 value:

Captured run: run:getting-started.training
Measured r2: 0.453

Contexta also writes the evidence to:

.contexta/
  cache/capture/record.jsonl

record.jsonl is a JSON Lines file: each line is one captured observation. The line created by this example contains information like this (shown here across several lines for readability; in the file it occupies a single line):

{
  "captured_at": "2026-05-25T10:26:03.502991Z",
  "payload": {
    "envelope": {
      "record_type": "metric",
      "run_ref": "run:getting-started.training",
      "stage_execution_ref": "stage:getting-started.training.evaluate",
      "completeness_marker": "complete",
      "degradation_marker": "none"
    },
    "payload": {
      "metric_key": "r2",
      "value": 0.4526027629719198,
      "unit": "ratio"
    }
  },
  "payload_type": "MetricRecord",
  "sink_name": "local-jsonl"
}

You can read that record as:

| Field | Plain-language meaning |
|---|---|
| payload_type: "MetricRecord" | This observation is a measured number, not a log message or file. |
| metric_key: "r2" | The number measures the evaluation score calculated by the program. |
| value: 0.4526... | This is the actual result measured in this run. |
| run_ref | The result belongs to the training run in the getting-started project. |
| stage_execution_ref | The result was produced during the evaluate step, not during training. |
| captured_at | When Contexta stored this observation. |
| completeness_marker / degradation_marker | Contexta captured this record completely and did not mark it as degraded. |
| sink_name: "local-jsonl" | The record was stored locally in this JSON Lines file. |

In other words, Contexta saved more than the number 0.453. It saved what that number means, which run and step produced it, when it was captured, and whether the capture was complete.
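Because record.jsonl is plain JSON Lines, you can also inspect it with nothing but the Python standard library. The sketch below is not a Contexta API: `summarize_metric` and `read_metrics` are hypothetical helpers, and they assume every metric line follows the field layout of the example record above. Real files may hold other record types or extra fields, so non-metric lines are skipped.

```python
import json
from pathlib import Path


def summarize_metric(line):
    """Parse one JSON Lines entry; return a flat summary, or None if not a metric.

    Assumes the layout of the example record in this guide.
    """
    record = json.loads(line)
    if record.get("payload_type") != "MetricRecord":
        return None

    envelope = record["payload"]["envelope"]
    metric = record["payload"]["payload"]
    return {
        "run": envelope["run_ref"],
        "stage": envelope["stage_execution_ref"],
        "metric": metric["metric_key"],
        "value": metric["value"],
        "unit": metric.get("unit"),
        "captured_at": record.get("captured_at"),
        "complete": envelope.get("completeness_marker") == "complete",
    }


def read_metrics(path):
    """Collect a summary for every metric line in a JSON Lines file."""
    summaries = []
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            if not line.strip():
                continue  # tolerate blank lines
            summary = summarize_metric(line)
            if summary is not None:
                summaries.append(summary)
    return summaries


if __name__ == "__main__":
    # Path taken from the workspace layout shown in this guide.
    record_file = Path(".contexta/cache/capture/record.jsonl")
    if record_file.exists():
        for item in read_metrics(record_file):
            print(f'{item["stage"]}: {item["metric"]} = {item["value"]} ({item["unit"]})')
    else:
        print(f"No records found at {record_file}")
```

Run it from the directory that contains .contexta/ after the example has run once.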

Core Concepts

| Plane | What it stores |
|---|---|
| Metadata | Projects, runs, stages, environments, deployments, batches, samples |
| Records | Metrics, structured events, spans, degraded markers |
| Artifacts | Models, checkpoints, prompt/evaluation files, reports |

Where To Go Next