Introduction
Your observability pipeline probably looks something like this: dozens of microservices emitting logs, metrics, and traces, all funneling into a backend that charges by the gigabyte. The data keeps growing, the bill keeps climbing, and half of what you're storing is noise you'll never query.
Cribl Stream sits in the middle of that pipeline. It lets you route, filter, sample, and transform telemetry before it hits your storage layer. The result: you keep the signal, drop the noise, and cut costs without losing visibility.
In this tutorial, we'll connect Cribl Stream to Parseable using QuickConnect, with the OpenTelemetry Demo as our data source. For detailed integration steps, see the Cribl integration docs. By the end, you'll have:
- A working pipeline from OTel Demo → Cribl → Parseable
- Intelligent sampling rules that reduce volume without losing critical data
- Alerts in Parseable for real-time incident detection
- Cost-effective storage on object storage (S3, MinIO, etc.)
Why Cribl + Parseable?
Cribl: Control Your Data Before It Lands
Cribl Stream acts as an observability router. Before telemetry reaches your backend, you can:
- Sample high-volume, low-value data (health checks, debug logs)
- Route different data types to different destinations
- Enrich events with context (add environment tags, normalize fields)
- Filter out noise (drop internal traffic, mask sensitive fields)
This means you're not paying to store data you'll never use.
Parseable: Store and Query What Matters
Once your curated telemetry reaches Parseable, you get:
- Object storage as the source of truth — S3-compatible storage at a fraction of the cost of traditional databases
- SQL-first querying — No proprietary query language. Just SQL.
- Unified telemetry — Logs, metrics, and traces in one place
- Built-in alerting — Anomaly and threshold-based alerts that trigger on any condition you can query
Together, Cribl handles the "what to keep" problem, and Parseable handles the "how to use it" problem.
Architecture Overview
┌─────────────────┐      ┌─────────────────┐      ┌─────────────────┐
│  OpenTelemetry  │      │                 │      │                 │
│      Demo       │─────▶│  Cribl Stream   │─────▶│    Parseable    │
│ (Microservices) │      │ (QuickConnect)  │      │ (OTLP Endpoint) │
└─────────────────┘      └─────────────────┘      └─────────────────┘
         │                        │                        │
   Logs, Metrics,             Sampling,              SQL Queries,
   Traces (OTLP)              Filtering,                Alerts,
                               Routing                Dashboards
Data Flow:
- OpenTelemetry Demo generates realistic e-commerce telemetry (traces from checkout flows, metrics from services, logs from all components)
- Cribl Stream receives OTLP data, applies sampling and filtering rules, then forwards to Parseable
- Parseable ingests via OTLP endpoint, stores on object storage, and exposes data via SQL
Prerequisites
- Cribl Stream instance (Cloud or self-hosted)
- Parseable instance (Parseable Cloud or self-hosted)
- OpenTelemetry Demo running and emitting telemetry
- Network connectivity between all components
Setting Up the OpenTelemetry Demo
If you don't have the OTel Demo running yet:
# Clone the demo
git clone https://github.com/open-telemetry/opentelemetry-demo.git
cd opentelemetry-demo
# Start the demo
docker compose up -d
The demo exposes OTLP endpoints on:
- gRPC: localhost:4317
- HTTP: localhost:4318
You'll configure Cribl to receive from these endpoints (or point the demo's collector at Cribl).
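Before pointing anything at Cribl, it's worth confirming the demo's OTLP/HTTP endpoint is actually accepting data. A quick smoke test, assuming the default port above (the exact response body varies by collector version):

```bash
# POST an empty-but-valid OTLP logs payload; a 200 means the endpoint is listening.
curl -s -o /dev/null -w "HTTP %{http_code}\n" \
  -X POST http://localhost:4318/v1/logs \
  -H "Content-Type: application/json" \
  -d '{"resourceLogs":[]}'
```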
Step 1: Add Parseable as a Destination in Cribl
Using QuickConnect
QuickConnect is the fastest way to wire up sources and destinations in Cribl Stream.
- In Cribl Stream, navigate to Routing → QuickConnect
- Click Add Destination
- Select OpenTelemetry from the destination list
Configure General Settings
| Setting | Value |
|---|---|
| Output ID | parseable-otel |
| Description | OpenTelemetry telemetry to Parseable |
| Protocol Version | 0.10.0 |
| Protocol | HTTP |
| URL | https://<your-parseable-instance>/v1/metrics |
For Parseable Cloud, your URL will look like:
https://<instance>-ingestor.parseable.com
For self-hosted Parseable:
http://localhost:8000 or <parseable-host>:<parseable-port>
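Before filling in the destination, it's worth confirming the Parseable endpoint and credentials are reachable from the Cribl host. A quick check against a self-hosted instance (assuming the default localhost:8000 address and admin/admin credentials; adjust for your deployment):

```bash
# A 200 response with version details confirms the endpoint and credentials work.
curl -s -u admin:admin http://localhost:8000/api/v1/about
```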
Configure Compression and Batching
Navigate to Advanced Settings:
| Setting | Value | Why |
|---|---|---|
| Compression | Gzip | Reduces network transfer by 70-90% |
| Request Timeout | 30 seconds | Allows for larger batches |
| Max Retries | 5 | Handles transient failures |
| Max Body Size | 4096 KB | Optimal batch size for OTLP |
| Flush Period | 1 second | Balance between latency and efficiency |
Configure HTTP Headers
Add these headers to route data to the correct Parseable streams:
| Header | Value | Purpose |
|---|---|---|
| X-P-Stream | cribl-otel | Target stream name in Parseable |
| X-P-Log-Source | otel-logs | For logs data |
| X-P-Log-Source | otel-metrics | For metrics data |
| X-P-Log-Source | otel-traces | For traces data |
Tip: You need to create separate datasets for logs, metrics, and traces in Parseable. This makes querying and retention policies more flexible.
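Parseable will usually create a stream the first time data arrives for it, but if you prefer to have the datasets in place before Cribl sends its first batch, you can create them up front with the log stream API. A minimal sketch, assuming a self-hosted instance at localhost:8000 with admin/admin credentials and example stream names:

```bash
# Create one dataset per signal type so retention and queries stay independent.
# Stream names here are examples; the rest of this tutorial uses cribl-otel.
for stream in cribl-otel otel-logs otel-metrics otel-traces; do
  curl -s -X PUT "http://localhost:8000/api/v1/logstream/${stream}" -u admin:admin
done
```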
Configure Authentication
In the Authentication tab:
- Username: Your Parseable username
- Password: Your Parseable password
For Parseable Cloud, use your login credentials. For self-hosted, use the credentials you configured during setup.
Save and Test
- Click Save
- Go to the Test tab
- Send a test event to verify connectivity
- Check the Live Data tab to confirm data is flowing
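If the test event never shows up in Parseable, it helps to rule out the destination side by posting an event directly to Parseable's ingest API with the same stream header and credentials Cribl is using. A minimal sketch, assuming a self-hosted instance at localhost:8000:

```bash
# Send one JSON event straight to Parseable, bypassing Cribl.
# If this lands in the cribl-otel stream, the problem is on the Cribl side.
curl -s -X POST http://localhost:8000/api/v1/ingest \
  -u admin:admin \
  -H "Content-Type: application/json" \
  -H "X-P-Stream: cribl-otel" \
  -d '[{"service": "smoke-test", "status_code": 200, "message": "hello from curl"}]'
```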
Step 2: Configure Intelligent Sampling in Cribl
Raw telemetry from a microservices application is noisy. Health checks fire every second. Debug logs repeat endlessly. Successful requests look identical. Sampling lets you keep representative data without storing everything.
Create a Sampling Pipeline
In Cribl Stream, go to Processing → Pipelines and create a new pipeline called otel-sampling.
Rule 1: Drop Health Check Noise
Health checks and readiness probes generate massive volume with zero debugging value.
// Filter function: return true to keep the event, false to drop it.
// __inputId and _raw are built-in Cribl event fields.
function keepEvent(event) {
  const inputId = event.__inputId || '';
  const raw = event._raw || '';

  // Drop Kubernetes health, readiness, and liveness probe traffic
  if (inputId.includes('health') ||
      inputId.includes('ready') ||
      inputId.includes('live')) {
    return false;
  }

  // Drop internal service mesh and scrape-endpoint traffic
  if (raw.includes('/healthz') ||
      raw.includes('/readyz') ||
      raw.includes('/metrics')) {
    return false;
  }

  return true;
}
Impact: Typically removes 30-50% of log volume with zero loss of debugging capability.
Rule 2: Sample Successful Requests
You don't need every successful request. A 10% sample gives you statistical validity without the storage cost.
// Sampling function: return true to keep the event, false to drop it.
function sampleEvent(event) {
  // Keep all errors (HTTP status >= 400)
  if (event.status_code >= 400) {
    return true;
  }

  // Keep all slow requests (over 1 second)
  if (event.duration_ms > 1000) {
    return true;
  }

  // Keep a 10% random sample of everything else
  return Math.random() < 0.1;
}
Impact: Reduces successful request volume by 90% while keeping 100% of errors and slow requests.
Rule 3: Aggregate Repetitive Metrics
Instead of storing every metric scrape, aggregate over time windows.
// Aggregation settings (maps onto Cribl's Aggregations function)
aggregate {
  // Group by metric name and identifying labels
  groupBy: ['metric_name', 'service', 'instance'],

  // Aggregate over 60-second windows
  timeWindow: '60s',

  // Emit min, max, avg, and count for each group per window
  aggregations: ['min(value)', 'max(value)', 'avg(value)', 'count()']
}
Impact: Collapses per-second scrapes into a single aggregated event per minute, roughly a 60x reduction in event volume, while preserving the statistics (min, max, avg, count) you actually chart.
Apply the Pipeline
- Go to Routing → QuickConnect
- Select your OTel source
- Attach the otel-sampling pipeline
- Connect to your parseable-otel destination
Step 3: Verify Data in Parseable
Once Cribl is forwarding data, verify it's landing in Parseable.
Check Stream Creation
In Parseable UI, navigate to Streams. You should see cribl-otel (or whatever you named it in the X-P-Stream header).
Query Sample Data
Run a simple query to verify data structure:
SELECT *
FROM "cribl-otel"
ORDER BY p_timestamp DESC
LIMIT 10;
Verify Sampling is Working
Compare raw volume to sampled volume:
-- Count events per hour
SELECT
DATE_TRUNC('hour', p_timestamp) AS hour,
COUNT(*) AS event_count
FROM "cribl-otel"
GROUP BY hour
ORDER BY hour DESC;
If your sampling rules are working, you should see significantly lower volume than raw OTel Demo output (which generates thousands of events per minute).
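You can run the same kind of check from the command line through Parseable's query API, which takes SQL plus a time range in a JSON body. The sketch below splits counts into errors and non-errors: errors should be retained in full while successful requests shrink to roughly the sample rate. It assumes a self-hosted instance, admin/admin credentials, a status_code field (as used in the sampling rule), and an illustrative time range you should adjust:

```bash
# Query Parseable over HTTP: errors should be kept 1:1, successes sampled to ~10%.
curl -s -X POST http://localhost:8000/api/v1/query \
  -u admin:admin \
  -H "Content-Type: application/json" \
  -d '{
    "query": "SELECT status_code >= 400 AS is_error, COUNT(*) AS events FROM \"cribl-otel\" GROUP BY status_code >= 400",
    "startTime": "2024-01-01T00:00:00+00:00",
    "endTime": "2024-01-01T01:00:00+00:00"
  }'
```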
Step 4: Set Up Alerts in Parseable
Parseable provides a form-based alert configuration system with three detection types: threshold alerts, anomaly detection, and forecasting.
Creating an Alert
Step 1: Set Rule
- Navigate to Alerts in Parseable UI
- Click Create Alert
- Configure the rule:
| Setting | Example Value | Description |
|---|---|---|
| Dataset | cribl-otel | Select your stream |
| Monitor Field | status_code or All rows (*) | Field to monitor |
| Aggregation | COUNT, AVG, SUM, MIN, MAX | How to aggregate |
| Filter | status_code >= 400 | Optional condition |
| Group By | service_name | Optional grouping |
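Before saving the rule, it can be useful to dry-run the same condition as plain SQL (through the UI or the query API used in Step 3) so you can see exactly what the alert will count. A sketch of the equivalent query for the example rule above, with an illustrative time range:

```bash
# Preview the threshold rule: error events per service over the evaluation window.
curl -s -X POST http://localhost:8000/api/v1/query \
  -u admin:admin \
  -H "Content-Type: application/json" \
  -d '{
    "query": "SELECT service_name, COUNT(*) AS errors FROM \"cribl-otel\" WHERE status_code >= 400 GROUP BY service_name",
    "startTime": "2024-01-01T00:00:00+00:00",
    "endTime": "2024-01-01T00:05:00+00:00"
  }'
```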
Step 2: Set Evaluation
Choose one of three alert types:
Alert Type 1: Threshold
Static threshold alerts for known conditions.
| Setting | Example |
|---|---|
| Condition | Greater than |
| Value | 100 |
| Evaluation Window | 5 minutes |
Use case: Alert when error count exceeds 100 in 5 minutes.
Alert Type 2: Anomaly Detection
ML-based detection for unusual patterns using historical data.
| Setting | Example |
|---|---|
| Sensitivity | Medium |
| Historical Window | 7 days |
| Evaluation Window | 15 minutes |
Use case: Detect unusual spikes in request volume without setting manual thresholds. Parseable learns normal patterns and alerts on deviations.
Alert Type 3: Forecast
Predictive alerts that trigger before problems occur.
| Setting | Example |
|---|---|
| Forecast Window | 1 hour |
| Trigger Condition | Will exceed 1000 |
| Confidence | High |
Use case: Alert when forecasted error count will exceed threshold in the next hour, giving you time to act before impact.
Example Alert Configurations
High Error Rate (Threshold)
- Dataset: cribl-otel
- Monitor: All rows (*)
- Aggregation: COUNT
- Filter: status_code >= 400
- Group By: service_name
- Condition: Greater than 50 in 5 minutes

Traffic Anomaly (Anomaly Detection)
- Dataset: cribl-otel
- Monitor: All rows (*)
- Aggregation: COUNT
- Sensitivity: Medium
- Historical Window: 7 days

Capacity Warning (Forecast)
- Dataset: cribl-otel
- Monitor: request_count
- Aggregation: SUM
- Forecast Window: 2 hours
- Trigger: Will exceed 10000
Configure Alert Destinations
In Parseable, alerts can send notifications to:
- Webhook: POST to Slack, PagerDuty, or any custom endpoint
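Before pointing an alert at Slack or PagerDuty, it can help to confirm delivery against a throwaway local listener and inspect the payload. A rough sketch with netcat (flag syntax differs between netcat variants; the port is arbitrary):

```bash
# Print each webhook POST from Parseable, then answer with an empty 200.
# While testing, set the alert's webhook URL to http://<this-host>:9000
while true; do
  printf 'HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n' | nc -l -p 9000
done
```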
Step 5: Optimize Storage Costs
Parseable's Storage Efficiency
Parseable stores data on object storage (S3, MinIO, GCS) using columnar Parquet format. This delivers:
| Benefit | Impact |
|---|---|
| Compression | 10-20x reduction vs JSON |
| Columnar storage | Only read columns you query |
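You can check this on your own data: Parseable's per-stream stats report ingested versus stored bytes, which makes the effective compression ratio easy to read off once Cribl has been forwarding for a while. A sketch assuming a self-hosted instance and the cribl-otel stream:

```bash
# Compare bytes received at ingest with bytes written to object storage.
curl -s -u admin:admin http://localhost:8000/api/v1/logstream/cribl-otel/stats
```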
Conclusion
You now have a production-ready observability pipeline:
- OpenTelemetry Demo generates realistic microservices telemetry
- Cribl Stream samples and routes data intelligently
- Parseable stores everything on cost-effective object storage with SQL access
The combination delivers:
- 70-90% cost reduction through intelligent sampling
- SQL-native alerting for real-time incident detection
- Unified telemetry: logs, metrics, and traces in one place
- Object storage economics: pennies per GB instead of dollars

