# OpenTelemetry Instrumentation Guide: Auto vs Manual, SDK Setup & Vendor-Agnostic Observability
OpenTelemetry (OTel) has become the industry standard for collecting telemetry data — traces, metrics, and logs — across distributed systems. This guide covers everything from auto-instrumentation to manual spans, SDK setup across languages, exporters, collectors, context propagation, sampling strategies, and migrating away from proprietary agents.
## Why OpenTelemetry?
Vendor lock-in has long plagued observability. Teams adopt Datadog, New Relic, or Dynatrace, then find migration painful because instrumentation is tightly coupled to the vendor SDK.
OpenTelemetry solves this by providing:
- A single, vendor-agnostic API for traces, metrics, and logs
- Auto-instrumentation libraries that require zero code changes
- A collector that decouples data production from data consumption
- Wide ecosystem support — every major observability vendor accepts OTel data
## Auto vs Manual Instrumentation

### Auto-Instrumentation
Auto-instrumentation intercepts well-known libraries (HTTP clients, database drivers, messaging frameworks) and generates spans automatically.
Node.js — use `@opentelemetry/auto-instrumentations-node`:

```javascript
const { NodeSDK } = require("@opentelemetry/sdk-node");
const {
  getNodeAutoInstrumentations,
} = require("@opentelemetry/auto-instrumentations-node");

const sdk = new NodeSDK({
  instrumentations: [getNodeAutoInstrumentations()],
});
sdk.start();
```
Python — use `opentelemetry-distro` and `opentelemetry-bootstrap`:

```shell
pip install opentelemetry-distro opentelemetry-exporter-otlp
opentelemetry-bootstrap -a install
opentelemetry-instrument python app.py
```
Go — auto-instrumentation is more limited; use instrumented library wrappers like `otelhttp` and `otelgrpc`.
### Manual Instrumentation
Manual instrumentation gives you full control. You create spans around business-critical operations that auto-instrumentation cannot detect.
```javascript
const { trace, SpanStatusCode } = require("@opentelemetry/api");

const tracer = trace.getTracer("checkout-service");

async function processOrder(order) {
  return tracer.startActiveSpan("processOrder", async (span) => {
    span.setAttribute("order.id", order.id);
    span.setAttribute("order.total", order.total);
    try {
      await chargePayment(order);
      await reserveInventory(order);
      span.setStatus({ code: SpanStatusCode.OK });
    } catch (err) {
      span.setStatus({ code: SpanStatusCode.ERROR, message: err.message });
      span.recordException(err);
      throw err;
    } finally {
      span.end();
    }
  });
}
```
Best practice: use auto-instrumentation as a baseline, then add manual spans for domain-specific operations.
## SDK Setup

### Node.js
```javascript
const { NodeSDK } = require("@opentelemetry/sdk-node");
const { OTLPTraceExporter } = require("@opentelemetry/exporter-trace-otlp-grpc");
const { OTLPMetricExporter } = require("@opentelemetry/exporter-metrics-otlp-grpc");
const { PeriodicExportingMetricReader } = require("@opentelemetry/sdk-metrics");

const sdk = new NodeSDK({
  traceExporter: new OTLPTraceExporter({ url: "http://collector:4317" }),
  metricReader: new PeriodicExportingMetricReader({
    exporter: new OTLPMetricExporter({ url: "http://collector:4317" }),
    exportIntervalMillis: 15000,
  }),
});
sdk.start();
```
### Python
```python
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

provider = TracerProvider()
processor = BatchSpanProcessor(OTLPSpanExporter(endpoint="http://collector:4317"))
provider.add_span_processor(processor)
trace.set_tracer_provider(provider)
```
### Go
```go
import (
	"context"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc"
	sdktrace "go.opentelemetry.io/otel/sdk/trace"
)

func initTracer(ctx context.Context) (*sdktrace.TracerProvider, error) {
	exporter, err := otlptracegrpc.New(ctx,
		otlptracegrpc.WithEndpoint("collector:4317"),
		otlptracegrpc.WithInsecure(),
	)
	if err != nil {
		return nil, err
	}
	tp := sdktrace.NewTracerProvider(
		sdktrace.WithBatcher(exporter),
	)
	otel.SetTracerProvider(tp)
	return tp, nil
}
```
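Regardless of language, the SDKs also honor the standard OpenTelemetry environment variables, so much of this setup can live outside the code. A typical configuration might look like the following (the service name and endpoint are illustrative):

```shell
# Identify the service and point all OTLP exporters at the collector.
export OTEL_SERVICE_NAME=checkout-service
export OTEL_EXPORTER_OTLP_ENDPOINT=http://collector:4317
# Sample 10% of root traces, inheriting the parent's decision downstream.
export OTEL_TRACES_SAMPLER=parentbased_traceidratio
export OTEL_TRACES_SAMPLER_ARG=0.1
```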
## Exporters and the Collector

### Exporters
Exporters send telemetry data from your application to a backend. Common choices:
- OTLP (gRPC or HTTP) — the native OTel protocol; preferred for collector communication
- Jaeger — popular for tracing
- Prometheus — standard for metrics
- Zipkin — lightweight tracing alternative
### The OpenTelemetry Collector
The collector sits between your applications and backends. It receives, processes, and exports telemetry data.
```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  batch:
    timeout: 5s
    send_batch_size: 1024
  memory_limiter:
    check_interval: 1s
    limit_mib: 512

exporters:
  otlp:
    endpoint: "tempo:4317"
    tls:
      insecure: true
  prometheus:
    endpoint: "0.0.0.0:8889"

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [otlp]
    metrics:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [prometheus]
```
## Context Propagation
Context propagation ensures trace context flows across service boundaries. OTel supports two primary propagators:
- W3C TraceContext — the standard (`traceparent`/`tracestate` headers)
- B3 — used by Zipkin-based systems
The SDK automatically injects and extracts context for HTTP requests when auto-instrumentation is enabled. For messaging systems (Kafka, RabbitMQ), you must manually inject context into message headers.
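It helps to see what actually travels over the wire: a W3C `traceparent` header is a single string of the form `version-traceid-spanid-flags`. Here is a minimal sketch of building and parsing one by hand (the IDs are made up); in real code you would call `propagation.inject()`/`extract()` from `@opentelemetry/api` against your message headers rather than formatting the string yourself:

```javascript
// Build a W3C traceparent header: version-traceId-spanId-flags.
function buildTraceparent(traceId, spanId, sampled) {
  const flags = sampled ? "01" : "00";
  return `00-${traceId}-${spanId}-${flags}`;
}

// Parse it back into its parts, e.g. on the consumer side of a queue.
function parseTraceparent(header) {
  const [version, traceId, spanId, flags] = header.split("-");
  return { version, traceId, spanId, sampled: flags === "01" };
}

const header = buildTraceparent(
  "4bf92f3577b34da6a3ce929d0e0e4736",
  "00f067aa0ba902b7",
  true
);
// → "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
```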
## Sampling Strategies
At scale, tracing every request is expensive. Sampling controls which traces are recorded:
| Strategy | Description | Use Case |
|---|---|---|
| AlwaysOn | Record everything | Development, low-traffic services |
| AlwaysOff | Record nothing | Disabled services |
| TraceIdRatio | Probabilistic sampling | General production use |
| ParentBased | Inherit parent decision | Consistent cross-service sampling |
| Tail-based | Decide after span completes | Capture errors and slow requests |
Head-based sampling (decided at trace start) is simple but misses interesting traces. Tail-based sampling (decided at the collector) captures anomalies but requires buffering all spans temporarily.
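To make the TraceIdRatio idea concrete, here is an illustrative sketch (not the SDK's exact algorithm): the decision is a pure function of the trace ID, so every service that sees the same trace makes the same choice without any coordination:

```javascript
// Illustrative TraceIdRatio-style sampler: map the trace ID onto [0, 1)
// and sample when it falls below the configured ratio. Deterministic per
// trace ID, so all services agree on the decision independently.
function shouldSample(traceId, ratio) {
  // Treat the last 8 hex chars of the 128-bit trace ID as a uniform value.
  const value = parseInt(traceId.slice(-8), 16);
  return value / 0x100000000 < ratio;
}

// Roughly 10% of trace IDs pass at ratio 0.1.
shouldSample("4bf92f3577b34da6a3ce929d0e0e4736", 0.1);
```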
## Migrating from Proprietary Agents
Migrating to OTel follows a phased approach:
1. Deploy the OTel Collector alongside your existing agent
2. Dual-ship telemetry — send data to both old and new backends
3. Replace SDK instrumentation service by service, starting with the least critical
4. Validate parity — ensure dashboards and alerts produce equivalent results
5. Remove the proprietary agent once confidence is established
The collector's fan-out capability makes dual-shipping trivial — just add multiple exporters to the same pipeline.
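For example, a traces pipeline can fan out to an existing vendor and a new OTel-native backend at the same time. A sketch of the relevant collector config (the exporter names and endpoints here are illustrative; `otlp/<name>` is the collector's convention for multiple instances of the same exporter type):

```yaml
exporters:
  otlp/tempo:
    endpoint: "tempo:4317"
  otlp/vendor:
    endpoint: "vendor-agent:4317"

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [otlp/tempo, otlp/vendor]
```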
## Key Takeaways
- Start with auto-instrumentation, add manual spans for business logic
- Use the OTel Collector as a central telemetry gateway
- Adopt W3C TraceContext for cross-service propagation
- Use tail-based sampling at the collector for cost-effective anomaly capture
- Migrate incrementally — the collector enables dual-shipping with zero application changes
OpenTelemetry eliminates vendor lock-in while giving you best-in-class instrumentation. Combined with a well-configured collector, it forms the backbone of modern observability.