Introduction
Your observability pipeline probably looks something like this: dozens of microservices emitting logs, metrics, and traces, all funneling into a backend that charges by the gigabyte. The data keeps growing, the bill keeps climbing, and half of what you're storing is noise you'll never query.
Cribl Stream sits in the middle of that pipeline. It lets you route, filter, sample, and transform telemetry before it hits your storage layer. The result: you keep the signal, drop the noise, and cut costs without losing visibility.
In this tutorial, we'll connect Cribl Stream to Parseable using QuickConnect, with the OpenTelemetry Demo as our data source. For detailed integration steps, see the Cribl integration docs. By the end, you'll have:
- A working pipeline from OTel Demo → Cribl → Parseable
- Intelligent sampling rules that reduce volume without losing critical data
- Alerts in Parseable for real-time incident detection
- Cost-effective storage on object storage (S3, MinIO, etc.)
Why Cribl + Parseable?
Cribl: Control Your Data Before It Lands
Cribl Stream acts as an observability router. Before telemetry reaches your backend, you can:
- Sample high-volume, low-value data (health checks, debug logs)
- Route different data types to different destinations
- Enrich events with context (add environment tags, normalize fields)
- Filter out noise (drop internal traffic, mask sensitive fields)
This means you're not paying to store data you'll never use.
Parseable: Store and Query What Matters
Once your curated telemetry reaches Parseable, you get:
- Object storage as the source of truth — S3-compatible storage at a fraction of the cost of traditional databases
- SQL-first querying — No proprietary query language. Just SQL.
- Unified telemetry — Logs, metrics, and traces in one place
- Built-in alerting — Anomaly and threshold-based alerts that trigger on any condition you can query
Together, Cribl handles the "what to keep" problem, and Parseable handles the "how to use it" problem.
Architecture Overview
┌─────────────────┐      ┌─────────────────┐      ┌─────────────────┐
│  OpenTelemetry  │      │                 │      │                 │
│      Demo       │─────▶│  Cribl Stream   │─────▶│    Parseable    │
│ (Microservices) │      │ (QuickConnect)  │      │ (OTLP Endpoint) │
└─────────────────┘      └─────────────────┘      └─────────────────┘
         │                        │                        │
   Logs, Metrics,             Sampling,              SQL Queries,
   Traces (OTLP)              Filtering,                Alerts,
                               Routing                Dashboards
Data Flow:
- OpenTelemetry Demo generates realistic e-commerce telemetry (traces from checkout flows, metrics from services, logs from all components)
- Cribl Stream receives OTLP data, applies sampling and filtering rules, then forwards to Parseable
- Parseable ingests via OTLP endpoint, stores on object storage, and exposes data via SQL
Prerequisites
- Cribl Stream instance (Cloud or self-hosted)
- Parseable instance (Parseable Cloud or self-hosted)
- OpenTelemetry Demo running and emitting telemetry
- Network connectivity between all components
Setting Up the OpenTelemetry Demo
If you don't have the OTel Demo running yet:
# Clone the demo
git clone https://github.com/open-telemetry/opentelemetry-demo.git
cd opentelemetry-demo
# Start the demo
docker compose up -d
The demo exposes OTLP endpoints on:
- gRPC: localhost:4317
- HTTP: localhost:4318
You'll configure Cribl to receive from these endpoints (or point the demo's collector at Cribl).
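Before pointing anything at Cribl, it's worth confirming the demo's OTLP/HTTP endpoint is actually accepting data. A quick smoke test, assuming the default port above (the exact response body varies by collector version):

```bash
# POST an empty-but-valid OTLP logs payload; a 200 means the endpoint is listening.
curl -s -o /dev/null -w "HTTP %{http_code}\n" \
  -X POST http://localhost:4318/v1/logs \
  -H "Content-Type: application/json" \
  -d '{"resourceLogs":[]}'
```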
Step 1: Add Parseable as a Destination in Cribl
Using QuickConnect
QuickConnect is the fastest way to wire up sources and destinations in Cribl Stream.
- In Cribl Stream, navigate to Routing → QuickConnect
- Click Add Destination
- Select OpenTelemetry from the destination list
Configure General Settings
| Setting | Value |
|---|---|
| Output ID | parseable-otel |
| Description | OpenTelemetry telemetry to Parseable |
| Protocol Version | 0.10.0 |
| Protocol | HTTP |
| URL | https://<your-parseable-instance>/v1/metrics |
For Parseable Cloud, your URL will look like:
https://<instance>-ingestor.parseable.com
For self-hosted Parseable:
http://localhost:8000 or <parseable-host>:<parseable-port>
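Before filling in the destination, it's worth confirming the Parseable endpoint and credentials are reachable from the Cribl host. A quick check against a self-hosted instance (assuming the default localhost:8000 address and admin/admin credentials; adjust for your deployment):

```bash
# A 200 response with version details confirms the endpoint and credentials work.
curl -s -u admin:admin http://localhost:8000/api/v1/about
```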
Configure Compression and Batching
Navigate to Advanced Settings:
| Setting | Value | Why |
|---|---|---|
| Compression | Gzip | Reduces network transfer by 70-90% |
| Request Timeout | 30 seconds | Allows for larger batches |
| Max Retries | 5 | Handles transient failures |
| Max Body Size | 4096 KB | Optimal batch size for OTLP |
| Flush Period | 1 second | Balance between latency and efficiency |
Configure HTTP Headers
Add these headers to route data to the correct Parseable streams:
| Header | Value | Purpose |
|---|---|---|
| X-P-Stream | cribl-otel | Target stream name in Parseable |
| X-P-Log-Source | otel-logs | For logs data |
| X-P-Log-Source | otel-metrics | For metrics data |
| X-P-Log-Source | otel-traces | For traces data |
Tip: You need to create separate datasets for logs, metrics, and traces in Parseable. This makes querying and retention policies more flexible.
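Parseable will usually create a stream the first time data arrives for it, but if you prefer to have the datasets in place before Cribl sends its first batch, you can create them up front with the log stream API. A minimal sketch, assuming a self-hosted instance at localhost:8000 with admin/admin credentials and example stream names:

```bash
# Create one dataset per signal type so retention and queries stay independent.
# Stream names here are examples; the rest of this tutorial uses cribl-otel.
for stream in cribl-otel otel-logs otel-metrics otel-traces; do
  curl -s -X PUT "http://localhost:8000/api/v1/logstream/${stream}" -u admin:admin
done
```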
Configure Authentication
In the Authentication tab:
- Username: Your Parseable username
- Password: Your Parseable password
For Parseable Cloud, use your login credentials. For self-hosted, use the credentials you configured during setup.
Save and Test
- Click Save
- Go to the Test tab
- Send a test event to verify connectivity
- Check the Live Data tab to confirm data is flowing
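If the test event never shows up in Parseable, it helps to rule out the destination side by posting an event directly to Parseable's ingest API with the same stream header and credentials Cribl is using. A minimal sketch, assuming a self-hosted instance at localhost:8000:

```bash
# Send one JSON event straight to Parseable, bypassing Cribl.
# If this lands in the cribl-otel stream, the problem is on the Cribl side.
curl -s -X POST http://localhost:8000/api/v1/ingest \
  -u admin:admin \
  -H "Content-Type: application/json" \
  -H "X-P-Stream: cribl-otel" \
  -d '[{"service": "smoke-test", "status_code": 200, "message": "hello from curl"}]'
```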
Step 2: Configure Intelligent Sampling in Cribl
Raw telemetry from a microservices application is noisy. Health checks fire every second. Debug logs repeat endlessly. Successful requests look identical. Sampling lets you keep representative data without storing everything.
Create a Sampling Pipeline
In Cribl Stream, go to Processing → Pipelines and create a new pipeline called otel-sampling.
Rule 1: Drop Health Check Noise
Health checks and readiness probes generate massive volume with zero debugging value.
// Filter function: return true to keep the event, false to drop it.
// __inputId and _raw are built-in Cribl event fields.
function keepEvent(event) {
  const inputId = event.__inputId || '';
  const raw = event._raw || '';

  // Drop Kubernetes health, readiness, and liveness probe traffic
  if (inputId.includes('health') ||
      inputId.includes('ready') ||
      inputId.includes('live')) {
    return false;
  }

  // Drop internal service mesh and scrape-endpoint traffic
  if (raw.includes('/healthz') ||
      raw.includes('/readyz') ||
      raw.includes('/metrics')) {
    return false;
  }

  return true;
}
Impact: Typically removes 30-50% of log volume with zero loss of debugging capability.
Rule 2: Sample Successful Requests
You don't need every successful request. A 10% sample gives you statistical validity without the storage cost.
// Sampling function: return true to keep the event, false to drop it.
function sampleEvent(event) {
  // Keep all errors (HTTP status >= 400)
  if (event.status_code >= 400) {
    return true;
  }

  // Keep all slow requests (over 1 second)
  if (event.duration_ms > 1000) {
    return true;
  }

  // Keep a 10% random sample of everything else
  return Math.random() < 0.1;
}
Impact: Reduces successful request volume by 90% while keeping 100% of errors and slow requests.
Rule 3: Aggregate Repetitive Metrics
Instead of storing every metric scrape, aggregate over time windows.
// Aggregation settings (maps onto Cribl's Aggregations function)
aggregate {
  // Group by metric name and identifying labels
  groupBy: ['metric_name', 'service', 'instance'],

  // Aggregate over 60-second windows
  timeWindow: '60s',

  // Emit min, max, avg, and count for each group per window
  aggregations: ['min(value)', 'max(value)', 'avg(value)', 'count()']
}
Impact: Collapses per-second scrapes into a single aggregated event per minute, roughly a 60x reduction in event volume, while preserving the statistics (min, max, avg, count) you actually chart.
Apply the Pipeline
- Go to Routing → QuickConnect
- Select your OTel source
- Attach the otel-sampling pipeline
- Connect to your parseable-otel destination
Step 3: Verify Data in Parseable
Once Cribl is forwarding data, verify it's landing in Parseable.
Check Stream Creation
In Parseable UI, navigate to Streams. You should see cribl-otel (or whatever you named it in the X-P-Stream header).
Query Sample Data
Run a simple query to verify data structure:
SELECT *
FROM "cribl-otel"
ORDER BY p_timestamp DESC
LIMIT 10;
Verify Sampling is Working
Compare raw volume to sampled volume:
-- Count events per hour
SELECT
DATE_TRUNC('hour', p_timestamp) AS hour,
COUNT(*) AS event_count
FROM "cribl-otel"
GROUP BY hour
ORDER BY hour DESC;
If your sampling rules are working, you should see significantly lower volume than raw OTel Demo output (which generates thousands of events per minute).
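You can run the same kind of check from the command line through Parseable's query API, which takes SQL plus a time range in a JSON body. The sketch below splits counts into errors and non-errors: errors should be retained in full while successful requests shrink to roughly the sample rate. It assumes a self-hosted instance, admin/admin credentials, a status_code field (as used in the sampling rule), and an illustrative time range you should adjust:

```bash
# Query Parseable over HTTP: errors should be kept 1:1, successes sampled to ~10%.
curl -s -X POST http://localhost:8000/api/v1/query \
  -u admin:admin \
  -H "Content-Type: application/json" \
  -d '{
    "query": "SELECT status_code >= 400 AS is_error, COUNT(*) AS events FROM \"cribl-otel\" GROUP BY status_code >= 400",
    "startTime": "2024-01-01T00:00:00+00:00",
    "endTime": "2024-01-01T01:00:00+00:00"
  }'
```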
Step 4: Set Up Alerts in Parseable
Parseable provides a form-based alert configuration system with three detection types: threshold alerts, anomaly detection, and forecasting.
Creating an Alert
Step 1: Set Rule
- Navigate to Alerts in Parseable UI
- Click Create Alert
- Configure the rule:
| Setting | Example Value | Description |
|---|---|---|
| Dataset | cribl-otel | Select your stream |
| Monitor Field | status_code or All rows (*) | Field to monitor |
| Aggregation | COUNT, AVG, SUM, MIN, MAX | How to aggregate |
| Filter | status_code >= 400 | Optional condition |
| Group By | service_name | Optional grouping |
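Before saving the rule, it can be useful to dry-run the same condition as plain SQL (through the UI or the query API used in Step 3) so you can see exactly what the alert will count. A sketch of the equivalent query for the example rule above, with an illustrative time range:

```bash
# Preview the threshold rule: error events per service over the evaluation window.
curl -s -X POST http://localhost:8000/api/v1/query \
  -u admin:admin \
  -H "Content-Type: application/json" \
  -d '{
    "query": "SELECT service_name, COUNT(*) AS errors FROM \"cribl-otel\" WHERE status_code >= 400 GROUP BY service_name",
    "startTime": "2024-01-01T00:00:00+00:00",
    "endTime": "2024-01-01T00:05:00+00:00"
  }'
```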
Step 2: Set Evaluation
Choose one of three alert types:
Alert Type 1: Threshold
Static threshold alerts for known conditions.
| Setting | Example |
|---|---|
| Condition | Greater than |
| Value | 100 |
| Evaluation Window | 5 minutes |
Use case: Alert when error count exceeds 100 in 5 minutes.
Alert Type 2: Anomaly Detection
ML-based detection for unusual patterns using historical data.
| Setting | Example |
|---|---|
| Sensitivity | Medium |
| Historical Window | 7 days |
| Evaluation Window | 15 minutes |
Use case: Detect unusual spikes in request volume without setting manual thresholds. Parseable learns normal patterns and alerts on deviations.
Alert Type 3: Forecast
Predictive alerts that trigger before problems occur.
| Setting | Example |
|---|---|
| Forecast Window | 1 hour |
| Trigger Condition | Will exceed 1000 |
| Confidence | High |
Use case: Alert when forecasted error count will exceed threshold in the next hour, giving you time to act before impact.
Example Alert Configurations
High Error Rate (Threshold)
- Dataset: cribl-otel
- Monitor: All rows (*)
- Aggregation: COUNT
- Filter: status_code >= 400
- Group By: service_name
- Condition: Greater than 50 in 5 minutes

Traffic Anomaly (Anomaly Detection)
- Dataset: cribl-otel
- Monitor: All rows (*)
- Aggregation: COUNT
- Sensitivity: Medium
- Historical Window: 7 days

Capacity Warning (Forecast)
- Dataset: cribl-otel
- Monitor: request_count
- Aggregation: SUM
- Forecast Window: 2 hours
- Trigger: Will exceed 10000
Configure Alert Destinations
In Parseable, alerts can send notifications to:
- Webhook: POST to Slack, PagerDuty, or any custom endpoint
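Before pointing an alert at Slack or PagerDuty, it can help to confirm delivery against a throwaway local listener and inspect the payload. A rough sketch with netcat (flag syntax differs between netcat variants; the port is arbitrary):

```bash
# Print each webhook POST from Parseable, then answer with an empty 200.
# While testing, set the alert's webhook URL to http://<this-host>:9000
while true; do
  printf 'HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n' | nc -l -p 9000
done
```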
Step 5: Optimize Storage Costs
Parseable's Storage Efficiency
Parseable stores data on object storage (S3, MinIO, GCS) using columnar Parquet format. This delivers:
| Benefit | Impact |
|---|---|
| Compression | 10-20x reduction vs JSON |
| Columnar storage | Only read columns you query |
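You can check this on your own data: Parseable's per-stream stats report ingested versus stored bytes, which makes the effective compression ratio easy to read off once Cribl has been forwarding for a while. A sketch assuming a self-hosted instance and the cribl-otel stream:

```bash
# Compare bytes received at ingest with bytes written to object storage.
curl -s -u admin:admin http://localhost:8000/api/v1/logstream/cribl-otel/stats
```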
Conclusion
You now have a production-ready observability pipeline:
- OpenTelemetry Demo generates realistic microservices telemetry
- Cribl Stream samples and routes data intelligently
- Parseable stores everything on cost-effective object storage with SQL access
The combination delivers:
- 70-90% cost reduction through intelligent sampling
- SQL-native alerting for real-time incident detection
- Unified telemetry: logs, metrics, and traces in one place
- Object storage economics: pennies per GB instead of dollars

