Should Datadog pivot from agent-based to agentless?
Direct Answer
No standalone pivot: ship hybrid. The Datadog Agent stays the deep-visibility play (custom metrics, APM tracing, profiling, real user monitoring correlation) because no agentless approach captures application-internal signal at that fidelity. But Datadog must add first-class agentless intake for three lanes: cloud-provider-side telemetry (CloudWatch, Azure Monitor, GCP Operations), eBPF-based network and security observability, and OpenTelemetry-native push from apps whose owners refuse to install vendor agents. Four reasons: (1) multi-cloud buyers want zero-host-install onboarding, (2) eBPF gives kernel-level visibility without per-app instrumentation, (3) OpenTelemetry support is now table stakes for new logos, (4) competitive pressure from Honeycomb, Grafana, and hyperscaler-native tooling. The risk: hybrid dilutes the agent's per-host pricing model and trains buyers to think of Datadog as an aggregator instead of a platform. Mitigate with bundled SKUs and agent-only premium features (Continuous Profiler, Live Processes).
What "Agent-Based" Means For Datadog Today
- Datadog Agent is a Go binary installed on every monitored host, container, or Kubernetes node — collects metrics, logs, traces, processes, and network flows
- Agent ships 600+ integrations (Postgres, Redis, Kafka, etc.) that auto-discover and pull metrics via local sockets
- APM tracing libraries (dd-trace) instrument application code in 10+ languages — agent forwards spans to backend
- Custom metrics via DogStatsD socket — the moat: no agentless approach captures app-internal counters at this granularity
- Per-host pricing ($15-23/host/month Infra, $31/host/month APM) is the revenue spine: each agent install is a billable unit
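To make the custom-metrics point concrete, here is a minimal sketch of what an application sends to the local Agent over DogStatsD. It uses only the standard library and the plain-text datagram layout (`name:value|type|@rate|#tags`). The metric name and tags are hypothetical, and the host and port assume an Agent listening on the default local UDP port 8125.

```python
import socket


def format_dogstatsd(name, value, metric_type="c", tags=None, sample_rate=None):
    """Build one DogStatsD datagram: name:value|type[|@rate][|#tag1,tag2]."""
    parts = [f"{name}:{value}", metric_type]
    if sample_rate is not None:
        parts.append(f"@{sample_rate}")
    if tags:
        parts.append("#" + ",".join(tags))
    return "|".join(parts)


def send_metric(datagram, host="127.0.0.1", port=8125):
    """Fire-and-forget UDP send to a local Agent (assumed to listen on 8125)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(datagram.encode("utf-8"), (host, port))


# Hypothetical application-internal counter, tagged by environment and payment method.
example = format_dogstatsd(
    "checkout.orders.completed", 1, "c", tags=["env:prod", "payment:card"]
)
```

Calling `send_metric(example)` from application code is all the instrumentation needed. The Agent adds host-level tags and forwards the point, which is why this path depends on a host-resident collector.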
What "Agentless" Means In 2026
Cloud-provider-side telemetry
- AWS CloudWatch + X-Ray, Azure Monitor, GCP Operations Suite stream metrics/traces from managed services without any host install
- Datadog already pulls these via API integrations — but treats them as second-class versus agent data
eBPF-based agentless
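- Pixie (acquired by New Relic in 2020), Cilium Hubble, and Groundcover use eBPF kernel probes to capture HTTP/gRPC/DB calls with zero app instrumentation
- One DaemonSet pod per node, with no per-pod sidecar and no code changes; "agentless" here means no per-application agent or library, since a node-level collector still runs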
OpenTelemetry-native push
- Apps emit OTLP directly to a vendor endpoint via SDK — no Datadog Agent required
- Datadog accepts OTLP today but the path is lossy (limited tag cardinality, no Live Processes, weaker correlation with infra metrics)
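As a rough illustration of the OTLP push lane, the sketch below builds a single-span OTLP/HTTP JSON body with the standard library and prepares the POST without sending it. The endpoint `http://localhost:4318` assumes a local OpenTelemetry Collector on the default OTLP/HTTP port. A vendor intake URL and its authentication headers are vendor-specific and deliberately left out. The service name, span name, and attributes are hypothetical, and a real application would normally use an OpenTelemetry SDK instead of hand-building payloads.

```python
import json
import os
import time
import urllib.request


def build_otlp_trace_payload(service_name, span_name, start_ns, end_ns, attributes=None):
    """Build a minimal OTLP/HTTP JSON body containing a single span."""
    def kv(key, value):
        return {"key": key, "value": {"stringValue": str(value)}}

    span = {
        "traceId": os.urandom(16).hex(),     # 16 random bytes, hex-encoded
        "spanId": os.urandom(8).hex(),       # 8 random bytes, hex-encoded
        "name": span_name,
        "kind": 2,                           # SPAN_KIND_SERVER
        "startTimeUnixNano": str(start_ns),  # 64-bit integers travel as strings
        "endTimeUnixNano": str(end_ns),
        "attributes": [kv(k, v) for k, v in (attributes or {}).items()],
    }
    return {
        "resourceSpans": [{
            "resource": {"attributes": [kv("service.name", service_name)]},
            "scopeSpans": [{"scope": {"name": "manual-sketch"}, "spans": [span]}],
        }]
    }


def build_request(endpoint, payload, headers=None):
    """Prepare (but do not send) the POST to <endpoint>/v1/traces."""
    return urllib.request.Request(
        url=endpoint.rstrip("/") + "/v1/traces",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", **(headers or {})},
        method="POST",
    )


now = time.time_ns()
payload = build_otlp_trace_payload(
    "checkout", "POST /orders", now, now + 12_000_000, {"env": "prod"}
)
request = build_request("http://localhost:4318", payload)
```

Passing `request` to `urllib.request.urlopen` would perform the push. Nothing in this path touches the host, which is both its appeal and the reason process-level and infrastructure correlation are harder to attach afterwards.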
Why Hybrid Beats Pure-Agentless
- Custom metric depth: DogStatsD captures 50k+ unique metric keys per host — OTLP push caps far lower in practice due to cardinality limits
- Continuous Profiler requires the agent — CPU/memory flame graphs at production scale are a $200M+ ARR product line
- Live Processes / Live Containers needs host-resident collection — agentless can't see process trees
- Network Performance Monitoring historically required agent eBPF probes — pure cloud-side telemetry misses pod-to-pod flows
- Cloud Security Management (CSPM, CWPP, Cloud SIEM) anchors on agent-collected runtime signals — agentless posture scanning is a feature, not a replacement
- Pricing integrity: per-host SKU breaks if half the fleet is agentless free-tier OTLP
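The pricing-integrity concern is simple arithmetic. The sketch below assumes a hypothetical 1,000-host fleet billed at the $15/host/month Infra list price quoted earlier, with agentless hosts generating no per-host charge. Real contracts involve discounts and other SKUs, so treat the numbers as illustrative only.

```python
def monthly_host_revenue(total_hosts, agentless_share, price_per_host=15.0):
    """Per-host revenue when a share of the fleet runs no billable agent."""
    if not 0.0 <= agentless_share <= 1.0:
        raise ValueError("agentless_share must be between 0 and 1")
    billable_hosts = round(total_hosts * (1.0 - agentless_share))
    return billable_hosts * price_per_host


fleet = 1_000  # hypothetical fleet size
all_agent = monthly_host_revenue(fleet, 0.0)       # every host billed
half_agentless = monthly_host_revenue(fleet, 0.5)  # half the fleet unbilled
```

Under these assumptions, moving half the fleet to unbilled OTLP halves per-host revenue for the same customer, which is the gap a bundled or unit-based SKU would need to close.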
Why Hybrid Beats Pure-Agent
- Multi-cloud onboarding friction: Fortune 500 buyers running 3+ clouds will not deploy agents to every Lambda, Cloud Run, and Azure Function — agentless is the only path
- Serverless gap: AWS Lambda extensions help but cold-start tax pushes teams toward native CloudWatch + agentless trace forwarding
- OpenTelemetry mandate: enterprise architecture review boards now require OTLP support — refusing agentless means losing RFPs to Honeycomb and Grafana
- Security-sensitive workloads: regulated industries (banking, defense) often forbid third-party agents on production hosts — agentless eBPF or cloud-side is the only entry
- Edge and IoT: thousands of low-resource devices can't run a Go agent — push-based telemetry wins
- New logo velocity: agentless trial = 5-minute time-to-first-dashboard versus 30-minute agent rollout
The Competitive Landscape
- Splunk Observability (Cisco): SignalFx agent plus OpenTelemetry-first push; already hybrid, weaker on integration breadth
- Honeycomb: OTLP-native pure agentless — wins on trace exploration UX but lacks infra/log breadth
- New Relic + Pixie: bought eBPF agentless capability in 2020 — bundled with agent platform, the closest hybrid template
- Grafana Cloud: OpenTelemetry-native, agent optional (Grafana Alloy) — undercuts on price for OTLP workloads
- AWS X-Ray + CloudWatch: free-with-AWS gravitational pull on AWS-only shops — Datadog wins multi-cloud but loses single-cloud price wars
What Datadog Should Build Through 2027
- First-class OpenTelemetry-native intake: full tag/cardinality parity with agent-collected data, OTLP gateway as a managed service
- eBPF-based Cloud Network Monitoring v2: agentless DaemonSet for service mesh visibility — close the Pixie gap
- Acquisition target: Groundcover or a commercial Cilium vendor (Isovalent has already gone to Cisco). Buy eBPF agentless rather than rebuild; failing that, deepen involvement with the open-source Pixie project
- Serverless-native SKU: bundled CloudWatch ingest + Lambda extension + agentless trace correlation, priced per-invocation not per-host
- Agent-only premium tier: Continuous Profiler, Live Processes, custom metrics over 500/host stay agent-exclusive — protect the per-host moat
- Hybrid pricing model: "Observability Units" that abstract host vs OTLP volume — survives the pricing-mix shift Wall Street will ask about on every earnings call
Approach Comparison
| Approach | Strength | Weakness | Datadog fit | FY27 priority |
|---|---|---|---|---|
| Datadog Agent (status quo) | Deepest visibility, custom metrics, profiler | Per-host install friction, multi-cloud burden | Core moat | Protect |
| Cloud-provider-side (CloudWatch, Azure Monitor) | Zero install, native managed-service coverage | Vendor-locked, lossy, expensive at scale | Already integrated, upgrade UX | Medium |
| eBPF agentless (Pixie, Cilium, Groundcover) | Kernel visibility, no app instrumentation | New tech, Linux-only, learning curve | Build or acquire | High |
| OpenTelemetry push (OTLP) | Vendor-neutral, RFP requirement | Cardinality limits, weaker correlation | Must reach parity | Critical |
| Pure agentless pivot | Simplicity narrative | Destroys per-host pricing, abandons profiler moat | Wrong move | Avoid |
Decision Flow
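One possible reading of the lane choice, expressed as code rather than a diagram. This is a sketch under stated assumptions, not a reproduction of any original flow. The `Workload` fields and the order of the checks are hypothetical, and the lane names follow the sections above.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    serverless: bool = False                    # Lambda, Cloud Run, Azure Functions
    low_resource_device: bool = False           # edge or IoT hardware
    third_party_agents_forbidden: bool = False  # regulated production hosts
    needs_profiler_or_live_processes: bool = False
    otlp_required: bool = False                 # architecture board mandates OTLP


def choose_lane(w: Workload) -> str:
    """Pick a collection lane; the check order is an assumption of this sketch."""
    if w.serverless:
        return "cloud-side telemetry + serverless SKU"
    if w.low_resource_device:
        return "OTLP push"
    if w.third_party_agents_forbidden:
        return "eBPF or cloud-side"
    if w.needs_profiler_or_live_processes:
        return "Datadog Agent"
    if w.otlp_required:
        return "OTLP push"
    return "Datadog Agent"  # default: deepest visibility
```

For example, `choose_lane(Workload(otlp_required=True))` returns `"OTLP push"`, while a workload that mandates OTLP but also needs the profiler still lands on the agent, matching the premium-tier argument above.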
Bottom Line
Agent-based is a moat, not a millstone — pivoting away from it would burn the per-host pricing model and the profiler/custom-metrics franchise that competitors can't match. But refusing to ship best-in-class agentless lanes is the slow-bleed scenario: Honeycomb takes the OTLP-native logos, AWS-native tooling takes the single-cloud shops, and Cisco/Splunk takes the regulated enterprise. Hybrid wins. Build OpenTelemetry parity and eBPF agentless inside the same backend, price them as Observability Units, and keep the agent's premium features as the upsell ladder.