# Observability Pipeline Architecture: Collecting, Processing & Routing Telemetry at Scale
Modern systems generate enormous volumes of telemetry — logs, metrics, and traces. Without a well-designed pipeline, you either drown in data costs or fly blind when incidents hit. An observability pipeline sits between your applications and your backends, giving you control over what data goes where.
## Why You Need a Pipeline
Sending telemetry directly from applications to backends creates problems:

```text
App → Datadog        (vendor lock-in)
App → Elasticsearch  (tight coupling)
App → Prometheus     (no transformation)
```

With a pipeline:

```text
App → Pipeline → Datadog        (sampled traces)
               → S3             (full archive)
               → Prometheus     (aggregated metrics)
               → Elasticsearch  (filtered logs)
```
The pipeline gives you sampling, filtering, transformation, and routing — all without changing application code.
## Core Pipeline Components

Every observability pipeline has three stages:
### 1. Collection (Receivers)

Agents or SDKs emit telemetry into the pipeline through receivers.

```yaml
# OpenTelemetry Collector - receivers
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: "0.0.0.0:4317"
      http:
        endpoint: "0.0.0.0:4318"
  prometheus:
    config:
      scrape_configs:
        - job_name: "my-service"
          scrape_interval: 15s
          static_configs:
            - targets: ["localhost:8080"]
  filelog:
    include: ["/var/log/app/*.log"]
    operators:
      - type: json_parser
        timestamp:
          parse_from: attributes.time
          layout: "%Y-%m-%dT%H:%M:%S"
```
### 2. Processing (Processors)

Processors transform, enrich, filter, and sample data before it leaves the pipeline.

```yaml
# OpenTelemetry Collector - processors
processors:
  batch:
    send_batch_size: 8192
    timeout: 5s
  memory_limiter:
    check_interval: 1s
    limit_mib: 512
  attributes:
    actions:
      - key: environment
        value: "production"
        action: upsert
  filter:
    logs:
      exclude:
        match_type: strict
        bodies:
          - "health check"
  tail_sampling:
    decision_wait: 10s
    policies:
      - name: error-traces
        type: status_code
        status_code:
          status_codes: [ERROR]
      - name: slow-traces
        type: latency
        latency:
          threshold_ms: 2000
      - name: probabilistic
        type: probabilistic
        probabilistic:
          sampling_percentage: 10
```
### 3. Export (Exporters)

Exporters route processed data to one or more destinations.

```yaml
# OpenTelemetry Collector - exporters
exporters:
  otlp/jaeger:
    endpoint: "jaeger:4317"
    tls:
      insecure: true
  prometheusremotewrite:
    endpoint: "http://prometheus:9090/api/v1/write"
  awss3:
    s3uploader:
      region: "us-east-1"
      s3_bucket: "telemetry-archive"
      s3_prefix: "logs"
```
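None of these components do anything until they are wired into named pipelines in the collector's `service` section. A minimal sketch combining the snippets above (the pipeline names and component ordering here are one reasonable arrangement, not the only one):

```yaml
# Wire receivers -> processors -> exporters into per-signal pipelines.
# memory_limiter should run first; batch should run late.
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, tail_sampling, batch]
      exporters: [otlp/jaeger]
    metrics:
      receivers: [otlp, prometheus]
      processors: [memory_limiter, batch]
      exporters: [prometheusremotewrite]
    logs:
      receivers: [otlp, filelog]
      processors: [memory_limiter, attributes, filter, batch]
      exporters: [awss3]
```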
## Pipeline Tools Compared

| Tool | Strengths | Best For |
|---|---|---|
| OTel Collector | Vendor-neutral, handles logs, metrics, and traces | Full OTLP pipeline |
| Vector | Performance, flexible transforms | High-volume log routing |
| Fluentd | Plugin ecosystem, mature | Kubernetes log collection |
| Fluent Bit | Lightweight, low memory | Edge and IoT |
| Cribl Stream | GUI, enterprise features | Complex enterprise routing |
## Vector Pipeline Example

Vector excels at high-throughput log processing with a declarative config:

```toml
# vector.toml
[sources.app_logs]
type = "file"
include = ["/var/log/app/*.log"]

[transforms.parse_json]
type = "remap"
inputs = ["app_logs"]
source = '''
. = parse_json!(.message)
.timestamp = parse_timestamp!(.timestamp, format: "%+")
.environment = get_env_var("ENV") ?? "unknown"
'''

[transforms.filter_noise]
type = "filter"
inputs = ["parse_json"]
condition = '.level != "debug" || .environment == "staging"'

[transforms.sample_info]
type = "sample"
inputs = ["filter_noise"]
rate = 10
exclude.type = "vrl"
exclude.source = '.level == "error" || .level == "warn"'

[sinks.elasticsearch]
type = "elasticsearch"
inputs = ["sample_info"]
endpoints = ["http://elasticsearch:9200"]
bulk.index = "app-logs-%Y-%m-%d"

[sinks.s3_archive]
type = "aws_s3"
inputs = ["filter_noise"]
bucket = "log-archive"
key_prefix = "app-logs/year=%Y/month=%m/day=%d/"
compression = "gzip"
```
## Sampling Strategies

Sampling is the single most impactful cost-reduction technique.

**Head sampling** — decide at trace creation:
- Simple, low overhead
- May miss interesting traces
- Good for high-volume, low-criticality services
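The head-sampling decision can be sketched in a few lines of Python. Hashing the trace ID instead of rolling a random number makes the decision deterministic, so every service that sees the same trace ID reaches the same keep/drop verdict and traces stay complete. (The function name and hash choice here are illustrative, not from any particular SDK.)

```python
import hashlib

def head_sample(trace_id: str, rate_percent: float) -> bool:
    """Decide at trace creation whether to keep a trace.

    Deterministic: the same trace_id always yields the same verdict,
    regardless of which service evaluates it.
    """
    digest = hashlib.sha256(trace_id.encode()).digest()
    # Map the first 8 bytes of the hash onto a bucket in [0, 100).
    bucket = int.from_bytes(digest[:8], "big") % 10000 / 100
    return bucket < rate_percent

# A 100% rate keeps everything; a 0% rate keeps nothing.
assert head_sample("trace-abc123", 100.0) is True
assert head_sample("trace-abc123", 0.0) is False
```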
**Tail sampling** — decide after the full trace completes:
- Can keep all error traces and slow traces
- Requires buffering complete traces in memory
- Higher resource usage on the collector

**Priority sampling** — combine both:
- High priority (always keep): errors, SLO violations, manual debug flags
- Medium priority (sample at 25%): authenticated user requests
- Low priority (sample at 5%): health checks, internal RPCs
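A priority sampler of that shape boils down to two steps: classify the completed trace into a tier, then apply that tier's rate. A minimal Python sketch, where the tier names, thresholds, and trace-dict fields are assumptions for illustration rather than a real SDK API:

```python
import random

# Tier rates match the priority scheme above (hypothetical values).
SAMPLE_RATES = {"high": 1.0, "medium": 0.25, "low": 0.05}

def classify(trace: dict) -> str:
    """Assign a priority tier from attributes of a completed trace."""
    if trace.get("status") == "ERROR" or trace.get("debug_flag"):
        return "high"
    if trace.get("authenticated"):
        return "medium"
    return "low"

def keep(trace: dict) -> bool:
    """Always keep high-priority traces; sample the rest by tier rate."""
    rate = SAMPLE_RATES[classify(trace)]
    return rate >= 1.0 or random.random() < rate

# Errors are always kept, regardless of the random draw.
assert keep({"status": "ERROR"}) is True
```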
## Multi-Destination Routing

Route different signals to different backends based on content. A connector acts as an exporter in one pipeline and a receiver in others, so an input pipeline must feed it:

```yaml
# OTel Collector - routing by log severity
connectors:
  routing:
    default_pipelines: [logs/standard]
    table:
      # severity_number 13 = WARN; 17 = ERROR
      - statement: route() where severity_number >= 13
        pipelines: [logs/errors]
      - statement: route() where attributes["service.name"] == "payments"
        pipelines: [logs/critical]

service:
  pipelines:
    logs/in:
      receivers: [otlp]
      exporters: [routing]
    logs/errors:
      receivers: [routing]
      exporters: [otlp/pagerduty, elasticsearch]
    logs/critical:
      receivers: [routing]
      exporters: [otlp/dedicated, awss3]
    logs/standard:
      receivers: [routing]
      exporters: [awss3]
```
## Deployment Patterns

**Sidecar** — one collector per pod:
- Strong isolation, simple config
- Higher resource overhead

**DaemonSet** — one collector per node:
- Efficient resource usage
- Shared by all pods on the node

**Gateway** — centralized collector pool:
- Advanced processing (tail sampling needs full traces)
- Single point for routing decisions
- Scales independently from application pods

Most production setups use a two-tier approach: lightweight DaemonSet agents forward to a gateway tier that handles sampling, enrichment, and routing.
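The agent side of that two-tier wiring can be sketched as follows; the gateway hostname is an assumption for illustration, and the agent deliberately does only cheap work (memory protection, batching) before forwarding everything upstream:

```yaml
# DaemonSet agent: collect locally, do cheap work, forward to the gateway
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: "0.0.0.0:4317"
processors:
  memory_limiter:
    check_interval: 1s
    limit_mib: 256
  batch: {}
exporters:
  otlp:
    # hypothetical in-cluster gateway Service
    endpoint: "otel-gateway.observability.svc:4317"
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [otlp]
```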
## Key Design Principles

- **Buffer aggressively** — disk-backed queues prevent data loss during backend outages
- **Filter early** — drop noise at the agent level, not the backend
- **Sample traces, not logs** — keep complete traces or drop them entirely
- **Archive everything** — send raw data to cheap object storage before sampling
- **Decouple format from backend** — use OTLP as your internal format and convert at export
- **Monitor the pipeline itself** — export internal metrics for queue depth, drop rate, and latency
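For the last principle, the OTel Collector can expose its own internal metrics for scraping. A minimal sketch (exact `service.telemetry` keys vary between collector versions, so treat this as indicative):

```yaml
# Collector self-monitoring: expose internal metrics on :8888
service:
  telemetry:
    metrics:
      level: detailed
      address: "0.0.0.0:8888"
    logs:
      level: info
```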
## Conclusion
An observability pipeline transforms telemetry from a cost center into a strategic asset. By adding collection, processing, and routing layers between your applications and backends, you gain the flexibility to control costs, avoid vendor lock-in, and ensure the right data reaches the right place.
Article #401 — part of the Codelit engineering blog. Explore all articles at codelit.io.