Add OpenTelemetry
Scope: turn on, extend, and read distributed traces in an existing NetScript
workspace. OpenTelemetry is not a bolt-on here — @netscript/telemetry wraps
@opentelemetry/api and is wired into the services, the worker dispatcher, the scheduler,
and the subprocess task runtime the scaffold generates. Aspire stands up the OTLP collector
and the trace UI for you. This recipe shows where the instrumentation already lives, how to
add your own spans and structured logs, how traceparent propagates across the oRPC/HTTP
boundary and into job subprocesses, and how to watch it all land in the Aspire dashboard at
http://localhost:18888.
This is a task recipe, not a deep-dive. For the mental model behind spans, structured logs,
and the per-capability health endpoints, read Observability.
For the generated API surface, follow the telemetry and
logger reference pages, and the
Telemetry capability hub for the Learn / Do / Reference triplet.
Prerequisites
| Name | Type | Description |
|---|---|---|
| NetScript workspace | netscript init | An existing workspace. If you have none, scaffold one first (see the tutorials). |
| Aspire running | cd aspire && aspire start | The AppHost provisions Postgres, Redis, the OTLP collector, and the dashboard. Start it BEFORE you expect traces. Dashboard at http://localhost:18888. |
| @netscript/telemetry | OTel facade | Wraps @opentelemetry/api and ships the worker/scheduler/queue/SSE instrumentation. Already wired into the generated handlers, so there is no install step. |
| A service or plugin to trace | services/users or plugins/workers | The users service (:3001) and the workers/sagas/triggers/auth plugins all emit health + trace data once running. |
How the telemetry is already wired
Three layers ship out of the box, and the table's last row covers the export path and the UI. Knowing which layer is which tells you where you get spans for free and where you add your own.
| Name | Type | Description |
|---|---|---|
| Service layer (real spans) | @netscript/service | RPC trace context (header extraction into ctx.traceHeaders) is ON by default: traceContext defaults to true when withRPC() is called without arguments. The OTel TracingPlugin that creates real spans is also always active; it is independent of the traceContext option. |
| Worker runtime (real spans) | job dispatcher + scheduler | Job dispatch and execution, scheduler runs, and the task subprocess emit real OTel spans automatically via @netscript/telemetry (traceJobExecution, scheduler spans, task.execute). Traces show up in Aspire with no handler code. |
| Scaffold job tools (stub spans today) | createJobTools(ctx) | log / progress / trace handed to defineJobHandler bodies. log.* is REAL; trace.addEvent / withChildSpan / recordProgress are no-op stubs in the scaffold (tracked debt, fix planned). For custom handler spans, call @netscript/telemetry helpers directly. |
| OTLP export + UI | http://localhost:4318 → :18888 | The Aspire profile points OTLP at http://localhost:4318; the dashboard renders the collected traces and correlated structured logs at :18888. |
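To make the export row concrete, here is a minimal sketch of how an OTLP/HTTP traces URL is conventionally derived from the standard OpenTelemetry environment variables. Whether NetScript or the Aspire profile reads these exact variables is an assumption; `resolveTracesEndpoint` is a hypothetical helper, not part of @netscript/telemetry. The `/v1/traces` suffix and the two variable names come from the OpenTelemetry exporter specification.

```typescript
// Hypothetical helper: resolves the OTLP/HTTP traces URL the way the
// OpenTelemetry exporter spec describes. Not a NetScript API.
type Env = Record<string, string | undefined>;

function resolveTracesEndpoint(
  env: Env,
  fallbackBase: string = "http://localhost:4318", // the collector address this recipe uses
): string {
  // A signal-specific endpoint, when set, is used exactly as given.
  const specific = env["OTEL_EXPORTER_OTLP_TRACES_ENDPOINT"];
  if (specific !== undefined && specific !== "") return specific;

  // Otherwise the base endpoint gets the per-signal path appended.
  const base = env["OTEL_EXPORTER_OTLP_ENDPOINT"];
  const root = base !== undefined && base !== "" ? base : fallbackBase;
  return root.replace(/\/+$/, "") + "/v1/traces";
}
```

With no variables set, the sketch returns `http://localhost:4318/v1/traces`, which is where an OTLP/HTTP exporter would post spans for the collector described above.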
Step 1 — Bring up Aspire and confirm the collector
From the workspace root, start orchestration. The AppHost registers the OTLP collector and the dashboard, then boots Postgres, Redis, and every service/plugin resource.
cd aspire
aspire start
# dashboard: http://localhost:18888 (login token printed in the console)
Open http://localhost:18888, authenticate with the token Aspire printed, and select the Traces tab. With nothing exercised yet it is empty — that is expected. Leave it open; it updates live.
Step 2 — Generate a trace without writing any code
Both service-layer and worker-runtime tracing are real and need no code from you. Exercise a running surface and a trace appears.
# Workers API on :8091 — enqueue the sample health-check job by id.
# Dispatch + execution + scheduler spans are emitted automatically.
curl -s -X POST http://localhost:8091/api/v1/workers/jobs/workers-plugin-health-check/trigger
# then list recent executions
curl -s 'http://localhost:8091/api/v1/workers/executions?limit=10'
# Users oRPC service on :3001 — the request is traced at the service layer.
# oRPC services are served under /api/rpc/* (not /rpc).
curl -s -X POST http://localhost:3001/api/v1/users/list \
-H 'content-type: application/json' \
-d '{}'
# Triggers API on :8093 (raw Hono routes, not oRPC) — resolves the inbound
# trigger, whose enqueueJob action enqueues the workers health-check job,
# producing a connected dispatch + execution trace.
curl -s -X POST http://localhost:8093/api/v1/webhooks/inbound/generic \
-H 'content-type: application/json' \
-d '{}'
Refresh the Traces tab in the dashboard. You will see a trace for the request, with the
service or worker resource as the root span. Click it to expand the span tree, attributes, and
the structured logs correlated to that trace. A webhook that enqueues a job shows the inbound
request span and the resulting job-dispatch and job-execution spans — the dispatcher
propagates traceparent into the worker subprocess, so they share one trace.
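The reason the request span and the job spans share one trace is visible in the header itself. The sketch below parses a W3C traceparent value and derives a child value that keeps the trace id and swaps in a new span id. It illustrates the header format only; `parseTraceparent`, `childTraceparent`, and `randomHex` are hypothetical names, and the real dispatcher may propagate context differently.

```typescript
// W3C Trace Context: traceparent = version-traceid-parentid-flags
// (2, 32, 16, and 2 lowercase hex characters). Illustrative helpers only.
interface TraceParent {
  version: string;  // "00" today
  traceId: string;  // shared by every span in the trace
  parentId: string; // span id of the caller
  flags: string;    // bit 0 set means "sampled"
}

const TRACEPARENT_RE = /^[0-9a-f]{2}-[0-9a-f]{32}-[0-9a-f]{16}-[0-9a-f]{2}$/;

// This sketch accepts only the fixed-width four-field form.
function parseTraceparent(header: string): TraceParent | null {
  const h = header.trim();
  if (!TRACEPARENT_RE.test(h)) return null;
  // The format is fixed-width, so the fields can be sliced by offset.
  const version = h.slice(0, 2);
  const traceId = h.slice(3, 35);
  const parentId = h.slice(36, 52);
  const flags = h.slice(53, 55);
  // The spec treats version "ff" and all-zero ids as invalid.
  if (version === "ff") return null;
  if (/^0+$/.test(traceId) || /^0+$/.test(parentId)) return null;
  return { version, traceId, parentId, flags };
}

// Pseudo-random hex is enough for an illustration of span ids.
function randomHex(length: number): string {
  let out = "";
  for (let i = 0; i < length; i++) {
    out += Math.floor(Math.random() * 16).toString(16);
  }
  return out;
}

// Same trace id, new span id: this is what keeps spans in one trace.
function childTraceparent(
  parent: TraceParent,
  spanId: string = randomHex(16),
): string {
  return `${parent.version}-${parent.traceId}-${spanId}-${parent.flags}`;
}
```

Every hop keeps `traceId` and replaces only the span id, so the dashboard can stitch the inbound request, the dispatch, and the subprocess execution into one waterfall.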
Step 3 — Add your own spans in a job handler (the real way today)
The scaffold hands you createJobTools(ctx) so you can author against log, progress, and
trace. log.* is real today. The trace.* helpers are no-op stubs in the scaffold — if
you want a real custom span around a unit of work right now, call the @netscript/telemetry
instrumentation helpers directly. Keep the trace.* calls if you like authoring against the
forward-compatible shape, but do not rely on them for live spans yet.
import {
createFailureResult,
createSuccessResult,
defineJobHandler,
} from '@netscript/plugin-workers-core';
// Call the instrumentation helpers directly for a REAL child span today.
import {
recordJobProgress,
withChildSpan,
} from '@netscript/telemetry/instrumentation';
import { createJobTools } from './job-tools.ts';
const handler = defineJobHandler(async (ctx) => {
const { log } = createJobTools(ctx); // log.* is real today
log.info('Starting workers plugin health check');
// withChildSpan opens a real span as a child of the active job span
// and gives you a handle to attach queryable attributes.
const envOk = await withChildSpan('check.environment', async (span) => {
span.setAttribute('check.name', 'environment');
return Boolean(Deno.env.get('PORT'));
});
if (!envOk) return createFailureResult('environment check failed');
// Emit a real job.progress event (current / total / percentage).
recordJobProgress(1, 1);
return createSuccessResult({ status: 'healthy' });
});
export default Object.assign(handler, {
id: 'workers-plugin-health-check' as const,
});
For comparison, a handler written against the scaffold job tools only:
import { defineJobHandler } from '@netscript/plugin-workers-core';
import { createJobTools } from './job-tools.ts';
// createJobTools(ctx) returns:
// log -> console.* wrappers: info/warn/error/debug (plain stdout today, NOT trace-correlated)
// progress -> progress(percent, message) (forwards to ctx.reportProgress)
// trace -> { addEvent, recordProgress, withChildSpan } (STUB spans today)
// traceContext -> { traceparent, tracestate } for manual propagation
//
// Authoring against trace.* is fine — your code is ready for when the
// scaffold helpers are upgraded — but these calls emit NO real spans today.
// Prefer @netscript/telemetry helpers (first example above) for spans you need now.
const handler = defineJobHandler(async (ctx) => {
const { log, trace, traceContext } = createJobTools(ctx);
log.info('health check', { traceparent: traceContext.traceparent });
trace.addEvent('health_check.started'); // no-op in the scaffold today
return { ok: true } as const;
});
export default handler;
// @netscript/telemetry/instrumentation — the real worker helpers:
// traceJobExecution(...) wrap a whole job run in a span (dispatcher uses this)
// withChildSpan(name, fn) open a child span under the active context
// addJobStepEvent(...) emit a job.step.* event
// recordJobProgress(c, t) emit a job.progress event (current / total / %)
// runTracedJob(...) run a job body inside subprocess trace context
// startWorkerSpan(...) worker lifecycle span
//
// Companion subpaths:
// @netscript/telemetry/context active trace context + traceparent helpers
// @netscript/telemetry/attributes canonical OTel attribute keys
// @netscript/telemetry/orpc oRPC client/server trace interceptors
// Authoritative export map: /reference/telemetry/
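To show what a child-span wrapper typically does around a unit of work, here is a self-contained sketch that records a name, attributes, status, and duration in memory. It is not the @netscript/telemetry implementation; `withSpanSketch`, `RecordedSpan`, and the `finished` array are hypothetical and exist only for illustration. The real `withChildSpan` exports to the collector rather than to an array.

```typescript
// In-memory illustration of the "wrap work in a span" pattern.
// Hypothetical names; the real helper exports spans over OTLP.
type AttrValue = string | number | boolean;

interface RecordedSpan {
  name: string;
  attributes: Record<string, AttrValue>;
  status: "ok" | "error";
  error?: string;
  durationMs: number;
}

interface SpanHandle {
  setAttribute(key: string, value: AttrValue): void;
}

const finished: RecordedSpan[] = [];

async function withSpanSketch<T>(
  name: string,
  fn: (span: SpanHandle) => Promise<T> | T,
): Promise<T> {
  const started = Date.now();
  const record: RecordedSpan = {
    name,
    attributes: {},
    status: "ok",
    durationMs: 0,
  };
  const handle: SpanHandle = {
    setAttribute(key: string, value: AttrValue): void {
      record.attributes[key] = value;
    },
  };
  try {
    return await fn(handle);
  } catch (err) {
    // Mark the span as failed, then rethrow so the caller still sees the error.
    record.status = "error";
    record.error = err instanceof Error ? err.message : String(err);
    throw err;
  } finally {
    // The span always ends, whether the work succeeded or threw.
    record.durationMs = Date.now() - started;
    finished.push(record);
  }
}
```

The design point is the `finally` branch: a span that never ends will not show up in a trace UI, so the wrapper closes it on both the success and the failure path.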
Step 4 — Extend service tracing and propagate traceparent
The services keep trace context across the oRPC/HTTP boundary. The workers service builds its app with the fluent builder and opts the RPC layer into trace context explicitly:
import { createService } from '@netscript/service';
import { router } from './router.ts';
// The plugin API services use the fluent builder. withRPC({ traceContext: true })
// threads the incoming traceparent through to handlers. oRPC is served under
// /api/rpc/* by default.
await createService(router, { name: 'workers', version: '1.0.0', port: 8091 })
.withCors()
.withLogger()
.withOpenAPI({ title: 'Workers API' })
.withDatabase(dbClient) // dbClient: a database client created elsewhere in this service (not shown)
.withRPC({ traceContext: true })
.withHealth()
.serve();
import { defineService } from '@netscript/service';
import { router } from './router.ts';
// Local services use the one-call form. Tracing is enabled by the framework;
// debug: true surfaces verbose request/trace logs while you wire things up.
await defineService(router, {
name: 'users',
version: '1.0.0',
port: Number.parseInt(Deno.env.get('PORT') || '3001', 10),
openapi: { title: 'Users API', description: 'users service' },
debug: true,
});
# When one service calls another over HTTP, forward the W3C traceparent
# header so both spans land in ONE trace in the dashboard.
curl -s http://localhost:3001/api/v1/users/list \
-X POST -H 'content-type: application/json' \
-H 'traceparent: 00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01' \
-d '{}'
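The same forwarding can be sketched in code for a service-to-service call. The helpers below copy only the W3C trace headers from an inbound header map onto an outbound one. `pickTraceHeaders` and `buildOutboundHeaders` are hypothetical names, and plain objects stand in for whatever header type your handler receives (for example ctx.traceHeaders, whose exact shape is not assumed here).

```typescript
// Hypothetical helpers for manual trace-header forwarding.
type HeaderMap = Record<string, string>;

// The two headers defined by W3C Trace Context.
const TRACE_HEADERS = new Set<string>(["traceparent", "tracestate"]);

// Header names are case-insensitive, so normalise before matching.
function pickTraceHeaders(inbound: HeaderMap): HeaderMap {
  const out: HeaderMap = {};
  for (const [name, value] of Object.entries(inbound)) {
    const lower = name.toLowerCase();
    if (TRACE_HEADERS.has(lower) && value !== "") {
      out[lower] = value;
    }
  }
  return out;
}

// Merge the caller's own headers with the forwarded trace headers.
// Trace headers win if both maps define them.
function buildOutboundHeaders(
  inbound: HeaderMap,
  extra: HeaderMap = {},
): HeaderMap {
  return { ...extra, ...pickTraceHeaders(inbound) };
}
```

A call such as `buildOutboundHeaders(inboundHeaders, { "content-type": "application/json" })` yields the header set for the downstream request. Forwarding the value verbatim keeps both services in one trace, but the downstream span then attaches to the original caller's span; a full propagator would substitute the current span id as the parent.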
Step 5 — Read the traces
Back in the dashboard at http://localhost:18888:
- Traces: every request and job as a waterfall of spans. Click a root span to drill into children, durations, and attributes (the ones you set via span.setAttribute(...)). Job runs show job.started / job.completed events and, where you added them, job.progress and job.step.* events.
- Structured logs: filter by resource (e.g. workers-api) or by trace id to see the log.info / log.error lines correlated to a span.
- Resources: health and console output per resource; cross-reference a failing span with the resource that produced it.
Confirm liveness independently of the UI by hitting the per-capability health endpoints:
curl -s http://localhost:8091/health # workers
curl -s http://localhost:8092/health/live # sagas
curl -s http://localhost:8093/health # triggers
curl -s http://localhost:8094/health # auth
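If you would rather check all four endpoints in one pass, a small script can do it. The sketch below takes the HTTP call as a parameter so it runs without a network; `checkHealth`, `Probe`, and the status strings are hypothetical, and the URLs are the ones listed above.

```typescript
// Hypothetical health sweep over the endpoints listed in this recipe.
interface ProbeResponse {
  ok: boolean;
  status: number;
}
type Probe = (url: string) => Promise<ProbeResponse>;

const HEALTH_ENDPOINTS: Record<string, string> = {
  workers: "http://localhost:8091/health",
  sagas: "http://localhost:8092/health/live",
  triggers: "http://localhost:8093/health",
  auth: "http://localhost:8094/health",
};

async function checkHealth(
  probe: Probe,
  endpoints: Record<string, string> = HEALTH_ENDPOINTS,
): Promise<Record<string, string>> {
  const results: Record<string, string> = {};
  await Promise.all(
    Object.entries(endpoints).map(async ([name, url]) => {
      try {
        const res = await probe(url);
        results[name] = res.ok ? "up" : `unhealthy (${res.status})`;
      } catch {
        // A thrown error usually means the resource is not listening yet.
        results[name] = "unreachable";
      }
    }),
  );
  return results;
}
```

In Deno or a recent Node release, passing `(url) => fetch(url)` as the probe should work, since a fetch response exposes `ok` and `status`. An "unreachable" result for every resource usually means Aspire is not running.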