Contexta Style Guide

This page captures maintainers' recommended practices for instrumenting ML systems.

Priority A: Essential

Observe Real Work

Use metrics obtained from the operation being observed. Workflow examples should perform their operation before recording its results.

"""Train a real regression model and capture its measured evidence."""

import pickle
from pathlib import Path

from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

from contexta import Contexta
from contexta.capture import LocalJsonlSink


features, targets = load_diabetes(return_X_y=True)
train_x, test_x, train_y, test_y = train_test_split(
    features, targets, test_size=0.2, random_state=42
)

workspace = Path(".contexta")
ctx = Contexta(workspace=str(workspace), config={"project_name": "diabetes-regression"})
local_sink = next(sink for sink in ctx.sinks if isinstance(sink, LocalJsonlSink))
model = LinearRegression()

with ctx.run("linear-regression", dataset_ref="dataset:sklearn.diabetes") as run:
    run.event(
        "dataset.loaded",
        message="Loaded the scikit-learn diabetes dataset",
        attributes={"rows": len(features), "features": features.shape[1]},
    )
    with run.stage("train"):
        model.fit(train_x, train_y)

    with run.stage("evaluate") as stage:
        predictions = model.predict(test_x)
        r2 = r2_score(test_y, predictions)
        mae = mean_absolute_error(test_y, predictions)
        stage.metric("r2", r2, unit="ratio")
        stage.metric("mae", mae)

    model_path = workspace / "models" / "linear-regression.pkl"
    model_path.parent.mkdir(parents=True, exist_ok=True)
    model_path.write_bytes(pickle.dumps(model))
    run.register_artifact("model", str(model_path), attributes={"format": "pickle"})

records_path = local_sink.file_path_for("RECORD").relative_to(Path.cwd())

print(f"Captured run: {run.ref}")
print(f"Measured r2: {r2:.3f}; mae: {mae:.3f}")
print(f"Records: {records_path.as_posix()}")
print(f"Model artifact: {model_path.as_posix()}")

Avoid hardcoded "good" metrics and placeholder model artifacts in workflow examples. They demonstrate the syntax but record evidence that was never measured, which misrepresents what observability is for.

Capture Enough Context For Review

Every important run should include the input or dataset reference, meaningful stage names, measured metrics, and any artifact a reviewer would inspect or promote.
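The checklist above can be turned into a quick pre-review check. The sketch below is not part of Contexta: `missing_review_context`, `REQUIRED_FIELDS`, and the dictionary keys are hypothetical names chosen for illustration, assuming a run has already been summarized as a plain dictionary.

```python
"""Check a run summary for the context a reviewer needs (illustrative only)."""

# Hypothetical field names for this sketch; they are not Contexta's schema.
REQUIRED_FIELDS = ("dataset_ref", "stages", "metrics", "artifacts")


def missing_review_context(run_summary: dict) -> list[str]:
    """Return the required fields that are absent or empty in a run summary."""
    return [name for name in REQUIRED_FIELDS if not run_summary.get(name)]


summary = {
    "dataset_ref": "dataset:sklearn.diabetes",
    "stages": ["train", "evaluate"],
    "metrics": {},    # nothing measured yet
    "artifacts": [],  # nothing registered yet
}

print(missing_review_context(summary))  # ['metrics', 'artifacts']
```

An empty result means the run carries every item on the checklist; it says nothing about whether the stage names are meaningful, which still needs a human reader.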

Keep Workspaces Disposable In Examples

Run copied examples in a practice directory so their local .contexta/ workspace does not mix with a reader's real project history. Tests and maintainer-only runners may use temporary directories.
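One way to follow this from a script, sketched with the standard library only. `practice_directory` is a hypothetical helper, not a Contexta API, and the `mkdir` call merely stands in for an example that writes to a relative `.contexta/` workspace.

```python
"""Run an example inside a throwaway directory (standard library only)."""

import os
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def practice_directory():
    """Create a temporary directory, make it the cwd, then restore the old cwd."""
    previous = Path.cwd()
    with tempfile.TemporaryDirectory(prefix="contexta-practice-") as tmp:
        os.chdir(tmp)
        try:
            yield Path(tmp)
        finally:
            os.chdir(previous)  # leave the directory before it is deleted


with practice_directory() as practice:
    # Stand-in for a copied example that writes to a relative workspace.
    workspace = Path(".contexta")
    workspace.mkdir()
    created_inside = workspace.resolve().parent == practice.resolve()

print(f"Workspace created inside practice directory: {created_inside}")
print(f"Practice directory still exists: {practice.exists()}")
```

Because the workspace path is relative, changing the working directory is enough to keep it away from a real project; the temporary directory and everything in it are removed when the block exits.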

Prefer The Facade First

Start with Contexta for capture, query, comparison, diagnostics, lineage, and reports. Move to direct stores only for storage internals or advanced recovery.

Keep External Cost Optional

Keep external credentials, network behavior, and billing out of introductory examples unless the page explicitly teaches those concerns.
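When a page does need an external step, one possible pattern is to make it strictly opt-in. The sketch below assumes a hypothetical environment variable, `EXAMPLE_ALLOW_EXTERNAL`, and a hypothetical helper, `external_steps_enabled`; neither comes from Contexta.

```python
"""Gate external steps behind an explicit opt-in (illustrative only)."""

import os

# Hypothetical variable name for this sketch.
OPT_IN_VARIABLE = "EXAMPLE_ALLOW_EXTERNAL"


def external_steps_enabled(env=None) -> bool:
    """Return True only when the reader has explicitly opted in."""
    values = os.environ if env is None else env
    return values.get(OPT_IN_VARIABLE, "").strip().lower() in {"1", "true", "yes"}


if external_steps_enabled():
    print("External step enabled by the reader.")
else:
    print("Skipping external step; the local example is complete without it.")
```

The default path touches no network, credentials, or billing, so a reader who copies the example unchanged incurs no cost.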

Examples should print something the reader can verify: a run ref, measured score, artifact path, report path, diagnostic summary, or workspace location.

Share Executable Source

Display checked example files in docs rather than maintaining slightly different inline copies across pages and locales.
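A minimal sketch of the idea, assuming pages mark where an example belongs with a placeholder and a build step fills it in from the example file. The `{{example:...}}` syntax and `embed_examples` are hypothetical; a real documentation toolchain will have its own include mechanism.

```python
"""Fill doc pages from shared example files instead of inline copies."""

import re
import tempfile
from pathlib import Path

# Hypothetical placeholder syntax: {{example:relative/path.py}}
PLACEHOLDER = re.compile(r"\{\{example:([^}]+)\}\}")


def embed_examples(page_text: str, examples_dir: Path) -> str:
    """Replace each placeholder with the text of the example file it names."""

    def load(match: re.Match) -> str:
        source = examples_dir / match.group(1).strip()
        return source.read_text(encoding="utf-8").rstrip("\n")

    return PLACEHOLDER.sub(load, page_text)


with tempfile.TemporaryDirectory() as tmp:
    examples = Path(tmp)
    (examples / "train.py").write_text("print('trained')\n", encoding="utf-8")
    first_page = embed_examples("Run it:\n{{example:train.py}}\n", examples)
    second_page = embed_examples("Another page:\n{{example:train.py}}\n", examples)

print(first_page)
print(second_page)
```

Both pages receive the same text from one file, so a fix to the example reaches every page that displays it.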

Localize Prose, Not Code

Korean documentation should display the same runnable source as English unless a localized output is part of what the page teaches.
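A check along these lines could catch code that drifts between locales. It is a sketch under assumptions: pages are taken to be text with backtick-fenced code blocks, and `code_blocks` is a hypothetical helper, not part of Contexta.

```python
"""Compare the code blocks of two localized pages (illustrative only)."""

import re

# Three backticks, built indirectly so this example can itself sit in a fence.
FENCE = "`" * 3
CODE_BLOCK = re.compile(
    rf"^{FENCE}[^\n]*\n(.*?)^{FENCE}[ \t]*$",
    re.DOTALL | re.MULTILINE,
)


def code_blocks(page_text: str) -> list[str]:
    """Return the body of every fenced code block, in document order."""
    return CODE_BLOCK.findall(page_text)


# The prose differs between locales; the code should not.
english = f"Train the model.\n\n{FENCE}python\nmodel.fit(train_x, train_y)\n{FENCE}\n"
korean = f"모델을 학습합니다.\n\n{FENCE}python\nmodel.fit(train_x, train_y)\n{FENCE}\n"

print(code_blocks(english) == code_blocks(korean))  # True
```

Pages that intentionally teach localized output would need to be exempted from such a check.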