From Datadog to a Data Lake: A Practical Migration Guide

Debabrata Panigrahi
February 27, 2026
Step-by-step guide to migrate from Datadog to an observability data lake with Parseable. Covers auditing, dual-shipping via OTel, and cutover.

Migrating off Datadog is not a weekend project. It is a deliberate, multi-phase effort that touches instrumentation, dashboards, alerts, runbooks, and team habits. But teams do it successfully every quarter, and the ones who plan well report the same thing: the hardest part was deciding to start.

This guide walks through the full journey of moving from Datadog to an observability data lake architecture using Parseable. It covers the audit, the parallel run, the dashboard rebuild, the cutover, and the optimization that follows. It is honest about what is hard and what is easier than you expect.

If you are unfamiliar with the observability data lake concept, start with What Is an Observability Data Lake? for the architectural foundation. If you are still evaluating whether to migrate at all, The True Cost of Observability and The Traditional SaaS Pricing Model for Observability Is Broken lay out the financial case.

Why Teams Migrate from Datadog

The decision to migrate from Datadog is almost always financial. The technical capability is rarely the complaint. The bill is.

A mid-sized team running 100 hosts with 100 GB/day of log ingestion, APM, and infrastructure monitoring pays roughly $195,000 per year on Datadog. That number tends to grow 20-40% annually as infrastructure scales, new services ship, and custom metrics proliferate. At some point, engineering leadership asks the question every Datadog customer eventually asks: what are we getting for $200K that we could not get for $30K?

The answer, increasingly, is: not enough to justify the gap.

The observability data lake approach stores telemetry on object storage in open formats (Apache Parquet on S3), uses standard query languages (SQL), and ingests via open protocols (OpenTelemetry). The result is 80-90% lower cost with comparable query performance, longer retention, and zero vendor lock-in. The trade-off is that you manage the platform yourself, or use a managed service like Parseable Cloud that runs on the same architecture at a fraction of Datadog's price.

Before You Start: What to Expect

Let's set expectations clearly. Here is what gets easier and what gets harder when you migrate from Datadog to a data lake.

What gets easier:

  • Cost predictability. Parseable Pro charges $0.39/GB ingested with no per-host fees, no per-metric fees, and no surprise overages from custom metrics. Your bill scales linearly with data volume.
  • Retention. Parseable Pro includes 365 days of retention. On Datadog, 90-day log retention is an expensive add-on.
  • Querying. SQL instead of Datadog's proprietary query language. Your team already knows SQL.
  • Data portability. Your data is stored in Apache Parquet. You can query it with Parseable, DuckDB, Spark, Athena, or any Parquet-compatible tool. On the Enterprise plan with BYOB, the Parquet files live in your own S3 bucket.

What gets harder:

  • Datadog-specific features like RUM (Real User Monitoring), Synthetic Monitoring, and SIEM have no direct equivalent in Parseable. You will need separate solutions for these if you use them.
  • Datadog's 750+ integrations provide out-of-the-box dashboards and auto-discovery. With Parseable, you instrument via OpenTelemetry, which is powerful and vendor-neutral, but requires more upfront configuration.
  • Team habits. Engineers who have spent years in Datadog's UI have muscle memory. Retraining takes time, though the SQL query interface shortens the curve significantly.
  • Dashboard rebuild. Every Datadog dashboard, monitor, and alert must be recreated in Prism (Parseable's web UI). There is no automated migration tool for this.

Knowing these trade-offs upfront prevents surprises. Now let's walk through the migration phase by phase.

Phase 0: Audit Your Datadog Usage

Before writing a single line of configuration, you need a clear picture of what you are actually using in Datadog. Most teams discover they are paying for significantly more than they actively use.

Step 1: Inventory Your Datadog Products

Log into Datadog and document which product lines are active:

| Product | Active? | Monthly Volume/Count | Monthly Cost |
|---|---|---|---|
| Infrastructure Monitoring | | __ hosts | $ |
| Log Management | | __ GB/day ingested | $ |
| APM / Distributed Tracing | | __ hosts | $ |
| Custom Metrics | | __ metric series | $ |
| Synthetic Monitoring | | __ test runs | $ |
| RUM | | __ sessions | $ |
| Security / SIEM | | __ GB/day | $ |
| CI Visibility | | __ test runs | $ |
| Total | | | $ |

Step 2: Identify What Parseable Replaces

Parseable directly replaces:

  • Log Management -- full replacement. Parseable ingests logs via OTLP, stores in Parquet, queries with SQL. 365-day retention included on Pro.
  • Infrastructure Monitoring (metrics) -- Parseable ingests metrics via OTLP. Dashboards and alerts in Prism.
  • APM / Distributed Tracing -- Parseable ingests traces via OTLP. Trace visualization and correlation available in Prism.
  • Custom Metrics -- no per-metric pricing on Parseable. Send everything.

Parseable does not replace:

  • Synthetic Monitoring -- use a dedicated tool like Checkly, Grafana Synthetic Monitoring, or Uptime Kuma.
  • RUM (Real User Monitoring) -- use a tool like Sentry, PostHog, or OpenTelemetry browser instrumentation feeding into Parseable for basic session data.
  • SIEM / Security Monitoring -- if you rely on Datadog Cloud SIEM, you will need a dedicated SIEM solution alongside Parseable.

Step 3: Export Your Dashboard and Alert Inventory

Use the Datadog API to list all dashboards and monitors:

# List all dashboards
curl -s -X GET "https://api.datadoghq.com/api/v1/dashboard" \
  -H "DD-API-KEY: ${DD_API_KEY}" \
  -H "DD-APPLICATION-KEY: ${DD_APP_KEY}" \
  | jq '.dashboards[] | {id, title, author_handle}'
 
# List all monitors
curl -s -X GET "https://api.datadoghq.com/api/v1/monitor" \
  -H "DD-API-KEY: ${DD_API_KEY}" \
  -H "DD-APPLICATION-KEY: ${DD_APP_KEY}" \
  | jq '.[] | {id, name, type, query}'

Export the results to a spreadsheet. For each dashboard and monitor, note:

  • Who owns it? Some dashboards were created years ago by engineers who have left.
  • Is it actively used? Datadog's audit trail shows when a dashboard was last viewed.
  • Is it critical? Does an on-call runbook reference it?

Most teams find that 60-70% of their Datadog dashboards are unused or stale. You only need to recreate the ones that matter.
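To turn the monitor export above into a triage spreadsheet, a small script is enough. This is a sketch: it assumes the id/name/type/query fields shown in the jq output, and adds blank owner/in_use/critical columns for the three questions above.

```python
import csv

def monitors_to_inventory(monitors, path):
    """Write Datadog monitors (id/name/type/query) to a triage CSV.

    Adds blank owner/in_use/critical columns for manual review.
    """
    fields = ["id", "name", "type", "query", "owner", "in_use", "critical"]
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fields)
        writer.writeheader()
        for m in monitors:
            # Missing columns are left blank for manual triage
            writer.writerow({k: m.get(k, "") for k in ("id", "name", "type", "query")})
    return len(monitors)

# Hypothetical example with the shape returned by GET /api/v1/monitor:
sample = [
    {"id": 123, "name": "payment-api error rate", "type": "log alert",
     "query": 'logs("service:payment-api status:error").rollup("count").last("5m") > 100'},
]
print(f"{monitors_to_inventory(sample, 'monitor-inventory.csv')} monitors written")
```

Feed it the full export and sort the resulting sheet by owner and last-viewed date to decide what survives the migration.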

Step 4: Calculate Your Expected Savings

Here is a concrete example. Suppose your current Datadog spend breaks down as follows:

| Line Item | Datadog Annual Cost |
|---|---|
| Infrastructure (100 hosts, Enterprise) | $27,600 |
| Log Management (100 GB/day, 30-day retention) | $164,000 |
| APM (50 hosts) | $18,600 |
| Custom Metrics (5,000 series) | $3,000 |
| Total | $213,200 |

Now the Parseable equivalent on the Pro plan:

| Line Item | Parseable Annual Cost |
|---|---|
| Ingestion: 100 GB/day x 30 days x $0.39/GB | $1,170/month = $14,040/year |
| Retention: 365 days included | $0 |
| Users, dashboards, alerts: unlimited | $0 |
| Query scanning (up to 10x ingestion included) | $0 |
| Total | ~$14,040/year |

That is a 93% cost reduction -- from $213,200 to roughly $14,040 per year. Even accounting for additional tooling to replace Synthetics or RUM, the savings are substantial. The Pro plan includes a 14-day free trial, so you can validate the approach before committing.
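The arithmetic above is easy to sanity-check in a few lines. The figures are the hypothetical example's, not a quote; substitute your own numbers from the audit.

```python
# Back-of-envelope comparison using the tables above.
# Assumes a flat 30-day month for the per-GB framing.
GB_PER_DAY = 100
RATE_PER_GB = 0.39          # Parseable Pro, per GB ingested

monthly = GB_PER_DAY * 30 * RATE_PER_GB        # $1,170/month
annual_parseable = monthly * 12                # $14,040/year
annual_datadog = 27_600 + 164_000 + 18_600 + 3_000   # $213,200/year

savings_pct = (annual_datadog - annual_parseable) / annual_datadog * 100
print(f"Parseable: ${annual_parseable:,.0f}/yr vs Datadog: ${annual_datadog:,.0f}/yr "
      f"({savings_pct:.0f}% reduction)")
```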

For larger deployments, the Enterprise plan (starting at $15,000/year) adds BYOB (Bring Your Own Bucket), Apache Iceberg support, premium support, and flexible deployment options including BYOC and self-hosted.

Phase 1: Parallel Run with Dual-Shipping

This is the most important phase. You will run Parseable alongside Datadog, sending the same telemetry to both systems simultaneously. This lets you validate Parseable's ingestion, query performance, and alerting without any risk to your existing monitoring.

Step 1: Deploy the OpenTelemetry Collector

If you are not already using OpenTelemetry, deploy the OTel Collector in your infrastructure. If you are using Datadog's proprietary agents, you will eventually replace them with OTel Collectors, but during the parallel run you can run both.

For Kubernetes deployments, use the OpenTelemetry Operator or the Helm chart:

helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
helm install otel-collector open-telemetry/opentelemetry-collector \
  --set mode=daemonset

Step 2: Configure Dual-Shipping

The OTel Collector's pipeline model makes dual-shipping straightforward. Configure exporters for both Datadog and Parseable, then reference both in your pipelines:

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
  # Collect container logs from Kubernetes
  filelog:
    include:
      - /var/log/pods/*/*/*.log
    operators:
      - type: router
        routes:
          - output: parser
            expr: 'true'
      - id: parser
        type: json_parser
        timestamp:
          parse_from: attributes.time
          layout: '%Y-%m-%dT%H:%M:%S.%fZ'
 
exporters:
  # Existing Datadog exporter -- keep running during parallel phase
  datadog:
    api:
      key: ${DD_API_KEY}
      site: datadoghq.com
    metrics:
      resource_attributes_as_tags: true
    logs:
      dump_payloads: false
    traces:
      span_name_as_resource_name: true
 
  # New Parseable exporter via OTLP HTTP
  otlphttp/parseable:
    endpoint: "https://your-parseable-instance.parseable.cloud/v1"
    headers:
      Authorization: "Basic <base64-encoded-credentials>"
      X-P-Stream: "app-telemetry"
    compression: gzip
 
processors:
  batch:
    timeout: 5s
    send_batch_size: 1024
  resourcedetection:
    detectors: [env, system, docker, eks]
    timeout: 5s
 
service:
  pipelines:
    logs:
      receivers: [filelog, otlp]
      processors: [resourcedetection, batch]
      exporters: [datadog, otlphttp/parseable]
    traces:
      receivers: [otlp]
      processors: [resourcedetection, batch]
      exporters: [datadog, otlphttp/parseable]
    metrics:
      receivers: [otlp]
      processors: [resourcedetection, batch]
      exporters: [datadog, otlphttp/parseable]

This configuration sends identical telemetry to both Datadog and Parseable. During the parallel run, Datadog remains your system of record. Parseable is the system you are validating.
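The Authorization value in the Parseable exporter is standard HTTP Basic auth: the base64 encoding of username:password. One way to generate it (the credentials here are placeholders, not defaults to ship with):

```python
import base64

# Placeholder credentials -- substitute your actual Parseable username/password
username, password = "admin", "changeme"
token = base64.b64encode(f"{username}:{password}".encode()).decode()
print(f"Authorization: Basic {token}")
```

Store the resulting header value in your secrets manager rather than committing it to the Collector config in plain text.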

Step 3: Validate Data Integrity

Once dual-shipping is running, verify that Parseable is receiving the same data as Datadog. Run comparison queries:

In Datadog:

logs("service:payment-api status:error").rollup(count).last("1h")

In Parseable (SQL via Prism):

SELECT COUNT(*) as error_count
FROM "app-telemetry"
WHERE service_name = 'payment-api'
  AND severity_text = 'ERROR'
  AND timestamp > NOW() - INTERVAL '1 hour'

The counts should be very close. Small discrepancies (1-3%) are normal due to timing differences in batch exports. Large discrepancies indicate a configuration issue in the OTel Collector pipeline.
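This comparison is worth automating across your streams. A minimal sketch of the tolerance check, leaving the two count-fetching calls up to you since they depend on how you query the Datadog and Parseable APIs:

```python
def discrepancy(datadog_count: int, parseable_count: int) -> float:
    """Relative difference between the two systems, as a fraction of the Datadog count."""
    if datadog_count == 0:
        return 0.0 if parseable_count == 0 else float("inf")
    return abs(datadog_count - parseable_count) / datadog_count

def within_tolerance(datadog_count: int, parseable_count: int,
                     tolerance: float = 0.03) -> bool:
    """True if the counts agree within tolerance (3% by default, per the text above)."""
    return discrepancy(datadog_count, parseable_count) <= tolerance

# Hypothetical example: 10,000 errors in Datadog vs 9,850 in Parseable is 1.5% drift
print(within_tolerance(10_000, 9_850))   # normal batch-timing variance
print(within_tolerance(10_000, 8_000))   # 20% gap: investigate the Collector pipeline
```

Run this per service and per signal type; a discrepancy isolated to one pipeline usually points at a misconfigured receiver or processor rather than the exporter.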

How Long to Run in Parallel

Run the parallel phase for at least two to four weeks. This gives you enough time to:

  • Validate data completeness across all services
  • Experience at least one on-call incident where you use Parseable for investigation
  • Build confidence that Parseable's query performance meets your needs
  • Identify any gaps in telemetry coverage

The cost of running both systems in parallel is real, but it is insurance. Do not skip this phase.

Phase 2: Rebuild Critical Dashboards and Alerts

With validated data flowing into Parseable, start recreating your critical dashboards and alerts in Prism.

Prioritize Ruthlessly

Do not attempt to recreate every Datadog dashboard. Start with:

  1. On-call dashboards -- the ones referenced in your incident runbooks
  2. SLO dashboards -- the ones your team reviews weekly
  3. Service health dashboards -- the top 5-10 services by criticality
  4. Cost/usage dashboards -- if you track infrastructure spend

Everything else can be rebuilt on demand after cutover.

Translating Datadog Queries to SQL

The biggest practical difference is the query language. Datadog uses a proprietary query syntax. Parseable uses SQL. Here are common translations:

Error rate by service (Datadog):

logs("status:error").by("service").rollup(count).last("24h")

Error rate by service (Parseable SQL):

SELECT service_name,
       COUNT(*) as error_count
FROM "app-telemetry"
WHERE severity_text = 'ERROR'
  AND timestamp > NOW() - INTERVAL '24 hours'
GROUP BY service_name
ORDER BY error_count DESC

P99 latency by endpoint (Datadog):

avg:trace.http.request.duration{service:api-gateway} by {resource_name}.rollup(p99, 300)

P99 latency by endpoint (Parseable SQL):

SELECT span_name as endpoint,
       PERCENTILE_CONT(0.99) WITHIN GROUP (ORDER BY duration_ms) as p99_ms
FROM "app-telemetry"
WHERE service_name = 'api-gateway'
  AND timestamp > NOW() - INTERVAL '1 hour'
GROUP BY span_name
ORDER BY p99_ms DESC

Log volume over time (Datadog):

logs("*").rollup(count, 3600).last("7d")

Log volume over time (Parseable SQL):

SELECT DATE_TRUNC('hour', timestamp) as hour,
       COUNT(*) as log_count
FROM "app-telemetry"
WHERE timestamp > NOW() - INTERVAL '7 days'
GROUP BY hour
ORDER BY hour ASC

SQL is more verbose than Datadog's shorthand, but it is also more powerful and more familiar to most engineers. There is no proprietary syntax to memorize, and any SQL reference applies.

Rebuilding Alerts

Parseable supports alerts in Prism with configurable thresholds and notification channels. For each critical Datadog monitor, create an equivalent alert:

| Datadog Monitor | Parseable Alert (SQL Condition) |
|---|---|
| Error rate > 5% on payment-api | SELECT ... WHERE service = 'payment-api' HAVING error_rate > 0.05 |
| Log volume drops below 1000/min | SELECT COUNT(*) ... HAVING count < 1000 |
| P95 latency > 500ms on checkout | SELECT PERCENTILE_CONT(0.95) ... HAVING p95 > 500 |
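Written out in full, the first row's condition might look like the following. This is a sketch: the column names reuse the service_name and severity_text fields from the earlier query examples, and the evaluation window is an assumption you should match to your monitor's settings.

```sql
-- Error rate > 5% on payment-api over the last 5 minutes (hypothetical schema)
SELECT SUM(CASE WHEN severity_text = 'ERROR' THEN 1 ELSE 0 END) * 1.0
         / COUNT(*) AS error_rate
FROM "app-telemetry"
WHERE service_name = 'payment-api'
  AND timestamp > NOW() - INTERVAL '5 minutes'
HAVING SUM(CASE WHEN severity_text = 'ERROR' THEN 1 ELSE 0 END) * 1.0
         / COUNT(*) > 0.05
```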

Test every alert during the parallel run. Trigger known failure modes in staging and verify that Parseable fires the alert with the correct severity and routing.

Phase 3: Cutover

Once you have validated data integrity, rebuilt critical dashboards, and tested alerts, you are ready to cut over.

Step 1: Update OTel Collector Configuration

Remove the Datadog exporter from your OTel Collector pipelines:

service:
  pipelines:
    logs:
      receivers: [filelog, otlp]
      processors: [resourcedetection, batch]
      exporters: [otlphttp/parseable]  # Datadog exporter removed
    traces:
      receivers: [otlp]
      processors: [resourcedetection, batch]
      exporters: [otlphttp/parseable]
    metrics:
      receivers: [otlp]
      processors: [resourcedetection, batch]
      exporters: [otlphttp/parseable]

Step 2: Remove Datadog Agents

If you were running Datadog's proprietary agents alongside the OTel Collector during the parallel phase, decommission them:

# Kubernetes: remove the Datadog agent DaemonSet
helm uninstall datadog-agent
 
# Linux VMs: stop and remove the agent
sudo systemctl stop datadog-agent
sudo apt-get remove datadog-agent  # or yum remove

Step 3: Update Runbooks

Every on-call runbook that references "go to Datadog and search for..." needs to be updated with the equivalent Parseable query and Prism dashboard link. This is tedious but critical. An engineer at 3 AM should not have to figure out the new query syntax from scratch.

Step 4: Communicate the Change

Notify your engineering organization:

  • Send a migration guide with the top 10 most common Datadog queries translated to SQL
  • Run a 30-minute hands-on session where engineers practice using Prism
  • Designate a migration champion on each team who can answer questions during the first two weeks

Step 5: Cancel Datadog

Contact Datadog to cancel your subscription. Be aware that Datadog contracts often have annual commitment terms with penalties for early termination. Plan your cutover timing around contract renewal dates to avoid paying for both systems longer than necessary.

Phase 4: Optimize

With Datadog fully decommissioned, you can now take advantage of data lake capabilities that were not possible before.

Extend Retention

On Datadog, keeping 90 days of log retention was a budget negotiation. On Parseable Pro, 365 days of retention is included at no additional cost. This means you can investigate incidents that happened months ago, run trend analysis across quarters, and meet compliance requirements without paying retention surcharges.

Leverage Open Formats

Your telemetry is now stored in Apache Parquet. On the Enterprise plan with BYOB, the Parquet files live in your own S3 bucket, which means you can query them with external tools:

-- Query your observability data with DuckDB (Enterprise BYOB only)
SELECT DATE_TRUNC('day', timestamp) as day,
       service_name,
       COUNT(*) as total_events,
       SUM(CASE WHEN severity_text = 'ERROR' THEN 1 ELSE 0 END) as errors
FROM read_parquet('s3://your-bucket/app-telemetry/2026/02/**/*.parquet')
GROUP BY day, service_name
ORDER BY day DESC, errors DESC

This is impossible with Datadog. Your data was locked in their proprietary format on their infrastructure. With a data lake, your observability data becomes a first-class analytical asset. For more on why this matters, see Why Your Observability Data Should Live in Apache Parquet and Bring Your Own Bucket: Data Ownership in Observability.

Optimize Ingestion Costs

With Parseable, you are no longer penalized for sending more data. But that does not mean you should ignore telemetry hygiene. Use the OTel Collector's processors to:

  • Filter debug logs in production (they add volume without operational value)
  • Sample verbose traces using tail-based sampling for non-critical services
  • Drop duplicate events using the dedup processor

These optimizations reduce ingestion volume, which reduces cost even further. The difference from Datadog is that you are optimizing by choice for efficiency, not out of desperation to stay within budget.
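As an illustration of the first bullet, dropping debug logs can be done with the filter processor from opentelemetry-collector-contrib, placed ahead of batch in the logs pipeline. This is a sketch; the receiver, exporter, and processor names follow the earlier dual-shipping config.

```yaml
processors:
  filter/drop-debug:
    error_mode: ignore
    logs:
      log_record:
        # Drop any record below INFO severity (i.e. TRACE and DEBUG)
        - "severity_number < SEVERITY_NUMBER_INFO"

service:
  pipelines:
    logs:
      receivers: [filelog, otlp]
      processors: [resourcedetection, filter/drop-debug, batch]
      exporters: [otlphttp/parseable]
```

Apply filters like this only in production pipelines; keeping debug logs flowing in staging preserves their value during development.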

Build Cross-Signal Correlations

One of the key advantages of a unified data lake is the ability to correlate across logs, metrics, and traces using SQL joins. In Datadog, cross-signal correlation relies on Datadog's UI and proprietary linking. In Parseable, you write SQL:

-- Correlate trace errors with pod restarts
SELECT t.trace_id,
       t.service_name,
       t.span_name,
       t.status_code,
       l.body as pod_event
FROM "app-traces" t
JOIN "k8s-events" l
  ON t.resource['k8s.pod.name'] = l.resource['k8s.pod.name']
  AND l.timestamp BETWEEN t.timestamp - INTERVAL '5 minutes'
                       AND t.timestamp + INTERVAL '5 minutes'
WHERE t.status_code = 'ERROR'
  AND l.body LIKE '%OOMKilled%'
  AND t.timestamp > NOW() - INTERVAL '24 hours'
ORDER BY t.timestamp DESC

This kind of query is natural in SQL and nearly impossible in Datadog's UI without manually clicking through multiple views. For a deeper exploration of this architecture, see Building an Observability Lakehouse with OpenTelemetry.

What Is Honestly Hard

Migration guides tend to oversell the easy parts. Here is what is genuinely difficult about moving from Datadog to a data lake.

Datadog's Integrations Are Hard to Match

Datadog has 750+ integrations that provide out-of-the-box dashboards, pre-built monitors, and auto-discovery. When you install the Datadog agent on a host running PostgreSQL, you immediately get a PostgreSQL dashboard with query performance metrics, replication lag, and connection pool utilization.

With OpenTelemetry, you get the same data, but you need to configure the OTel Collector's postgresql receiver, define which metrics to collect, and build the dashboard yourself. The telemetry is equivalent; the setup cost is higher.

Mitigation: Start with your most critical services. You do not need 750 integrations on day one. Most teams actively use 10-20 integrations. Build those first.

Datadog's APM Is Very Good

Datadog's APM, particularly its auto-instrumentation, service catalog, and continuous profiling, is a polished product. Parseable ingests traces via OTLP and provides trace visualization in Prism, but features like continuous profiling and the service catalog are not direct equivalents.

Mitigation: For most teams, trace ingestion, visualization, and SQL-based trace analysis cover 80-90% of APM use cases. If you rely heavily on Datadog's continuous profiler, evaluate tools like Pyroscope as a complement.

Team Inertia

Engineers resist change, especially to tools they use during stressful incidents. The engineer who has been using Datadog for three years knows exactly which dashboard to open and which query to run when a service goes down at 2 AM. Asking them to learn a new tool is asking them to be slower during the highest-pressure moments of their job.

Mitigation: The parallel run phase is critical here. Engineers need weeks of low-pressure exposure to Prism before they rely on it during incidents. Pair experienced Parseable users with skeptics during the parallel phase. And lean into the SQL advantage -- most engineers are more comfortable with SQL than they are with Datadog's proprietary query language.

No Automated Dashboard Migration

There is no tool that converts Datadog dashboards into Parseable dashboards automatically. Each dashboard must be manually recreated. For teams with dozens of dashboards, this is a real time investment.

Mitigation: Audit first, rebuild selectively. Most teams discover that fewer than 30% of their Datadog dashboards are actively used. Only recreate the ones that are referenced in runbooks or reviewed regularly.

Migration Timeline

Here is a realistic timeline for a mid-sized team (50-200 hosts, 50-200 GB/day):

| Phase | Duration | Key Activities |
|---|---|---|
| Phase 0: Audit | 1-2 weeks | Inventory Datadog usage, calculate expected savings, get stakeholder buy-in |
| Phase 1: Parallel Run | 2-4 weeks | Deploy OTel Collector, configure dual-shipping, validate data integrity |
| Phase 2: Parity | 2-3 weeks | Rebuild critical dashboards and alerts in Prism, test alert routing |
| Phase 3: Cutover | 1 week | Remove Datadog exporters, decommission agents, update runbooks |
| Phase 4: Optimize | Ongoing | Extend retention, build cross-signal queries, refine OTel pipelines |
| Total | 6-10 weeks | |

Larger organizations with more complex Datadog usage (SIEM, RUM, Synthetics) should expect 3-4 months for a full migration, with separate workstreams for replacing each Datadog product line.

This guide is part of a series on the observability data lake approach; the related articles are linked throughout the post.

Getting Started

The best way to evaluate whether Parseable can replace Datadog for your team is to run the parallel phase. You do not need to commit to a migration to start testing.

  1. Sign up for a 14-day free trial on Parseable Cloud. Pro plan is $0.39/GB ingested with 365-day retention and unlimited users.
  2. Deploy an OTel Collector with dual-shipping (Datadog + Parseable) on one non-critical service.
  3. Run comparison queries for a week.
  4. If the data matches and the query experience works for your team, expand to more services and proceed through the phases above.

The migration from Datadog to a data lake is not painless. But teams that complete it consistently report the same outcome: comparable observability at a fraction of the cost, with the added benefit of owning their data in open formats. The hard part is the transition. The result is worth it.
