Data Prepper 2.14: OpenTelemetry-powered APM service maps and production-ready Prometheus support

The OpenSearch Data Prepper maintainers are happy to announce the release of Data Prepper 2.14. This version expands support for observability use cases with a new application performance monitoring (APM) service map and improved Prometheus support.

APM service map

The otel_apm_service_map processor analyzes OpenTelemetry trace spans to automatically generate APM service map relationships and metrics. It creates structured events that can be visualized as service topology graphs, showing how services communicate with each other and their performance characteristics.

Key features include:

Automatic service relationship discovery: Identifies service-to-service interactions from OpenTelemetry spans.
APM metrics generation: Creates latency, throughput, and error rate metrics for service interactions using three-window processing with sliding time windows to ensure complete trace context.
Environment awareness: Derives new attributes from existing span attributes to support service environment grouping and custom attributes. It includes environment detection capabilities for Amazon Elastic Compute Cloud (Amazon EC2), Amazon Elastic Container Service (Amazon ECS), Amazon Elastic Kubernetes Service (Amazon EKS), AWS Lambda, and Amazon API Gateway and can be extended to support other cloud providers.
Service map snapshots: Enables users to view service connections for specific time periods with customizable resource attribute filtering.
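
As a rough illustration of the relationship discovery described above, the following Python sketch derives service-to-service edges from a few spans by joining each span to its parent. The span fields (span_id, parent_id, service, duration_ms, error) and the build_edges helper are hypothetical simplifications chosen for this example. They are not the processor's data model or algorithm, and the time-window handling is omitted entirely.

```python
from collections import defaultdict

def build_edges(spans):
    """Aggregate calls, error rate, and average latency per (caller, callee) pair.

    Hypothetical span shape for illustration only; not the processor's schema.
    """
    by_id = {s["span_id"]: s for s in spans}
    edges = defaultdict(lambda: {"calls": 0, "errors": 0, "total_ms": 0.0})
    for span in spans:
        parent = by_id.get(span.get("parent_id"))
        # Only a parent/child pair that crosses services is a service interaction.
        if parent is None or parent["service"] == span["service"]:
            continue
        edge = edges[(parent["service"], span["service"])]
        edge["calls"] += 1
        edge["errors"] += 1 if span.get("error") else 0
        edge["total_ms"] += span["duration_ms"]
    return {
        pair: {
            "calls": e["calls"],
            "error_rate": e["errors"] / e["calls"],
            "avg_latency_ms": e["total_ms"] / e["calls"],
        }
        for pair, e in edges.items()
    }

spans = [
    {"span_id": "a", "parent_id": None, "service": "frontend", "duration_ms": 120.0, "error": False},
    {"span_id": "b", "parent_id": "a", "service": "checkout", "duration_ms": 80.0, "error": False},
    {"span_id": "c", "parent_id": "a", "service": "checkout", "duration_ms": 40.0, "error": True},
    {"span_id": "d", "parent_id": "b", "service": "payments", "duration_ms": 30.0, "error": False},
]
service_map = build_edges(spans)
```

Here service_map contains two edges, frontend to checkout and checkout to payments. A real implementation also has to handle spans whose parents arrive late or in a different batch, which is the problem the processor's windowing addresses.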

Improved Prometheus sink support

The Prometheus sink now ensures compliance with remote write requirements through integrated sorting and deduplication logic. It orders incoming events chronologically and removes duplicate samples that share the same series and timestamp before transmission, preventing rejections by the remote write endpoint.
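
The sorting and deduplication step can be pictured with a minimal Python sketch. It assumes each sample is a (series, timestamp, value) tuple with the series written as a plain string, and it keeps the first-arriving sample when two share a series and timestamp. Both the representation and the keep-first policy are assumptions made for this example, not a description of the sink's internals.

```python
def sort_and_dedupe(samples):
    """Order samples by (series, timestamp) and keep one sample per pair."""
    seen = set()
    result = []
    # sorted() is stable, so among duplicates the first-arriving sample is kept
    # (an assumed policy for this sketch).
    for series, ts, value in sorted(samples, key=lambda s: (s[0], s[1])):
        if (series, ts) in seen:
            continue
        seen.add((series, ts))
        result.append((series, ts, value))
    return result

samples = [
    ('http_requests_total{job="api"}', 2000, 7.0),
    ('http_requests_total{job="api"}', 1000, 5.0),
    ('http_requests_total{job="api"}', 2000, 9.0),  # duplicate series and timestamp
    ('up{job="api"}', 1000, 1.0),
]
batch = sort_and_dedupe(samples)
```

After this call, batch holds three samples: each series is in ascending timestamp order, and the second sample at timestamp 2000 has been dropped.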

To further handle data ingestion challenges, the new out_of_order_time_window option allows a configurable grace period for late-arriving data. This window enables the sink to accept and re-sort samples that arrive out of sequence, significantly improving pipeline resilience in distributed environments where perfectly ordered delivery is difficult to maintain.
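
One way to think about such a grace period is sketched below in Python. The exact semantics of out_of_order_time_window are not spelled out here, so this example assumes a simple rule: a sample is accepted if its timestamp is no older than the newest timestamp already seen for its series minus the window. The accept_within_window function and the millisecond units are hypothetical.

```python
def accept_within_window(samples, window_ms):
    """Split samples into accepted and dropped using a per-series lateness window.

    Assumed rule for illustration: drop a sample only if it is older than
    (newest timestamp seen for its series - window_ms).
    """
    newest = {}
    accepted, dropped = [], []
    for series, ts, value in samples:  # iterate in arrival order
        latest = newest.get(series)
        if latest is not None and ts < latest - window_ms:
            dropped.append((series, ts, value))
            continue
        accepted.append((series, ts, value))
        newest[series] = ts if latest is None else max(latest, ts)
    # Re-sort the accepted samples so each series is written chronologically.
    accepted.sort(key=lambda s: (s[0], s[1]))
    return accepted, dropped

arrivals = [
    ("cpu", 10_000, 0.5),
    ("cpu", 9_000, 0.4),   # late by 1 s, inside the 5 s window
    ("cpu", 12_000, 0.7),
    ("cpu", 6_000, 0.1),   # late by 6 s, outside the window
]
accepted, dropped = accept_within_window(arrivals, window_ms=5_000)
```

With a 5-second window, the sample at 9,000 ms is accepted and re-sorted ahead of the one at 10,000 ms, while the sample at 6,000 ms is dropped because it trails the newest timestamp by more than the window.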

AWS Lambda streaming

AWS Lambda supports response streaming, which lets a function send data back to the client as it becomes available. This reduces the time until the first part of the response arrives and supports larger payloads, up to 200 MB.

In Data Prepper 2.14, you can configure the aws_lambda processor to use streaming invocations. This allows you to receive responses larger than 6 MB, which is especially useful when the function's output is larger than its input.

Cross-Region s3 sink

Data Prepper’s s3 sink now supports writing to Amazon Simple Storage Service (Amazon S3) buckets across multiple AWS Regions.

Previously, a single s3 sink could write only to buckets in one Region, which limited the use of one of its key features: dynamic bucket names.

With this enhancement, you can specify dynamic bucket names that adapt to different Regions. For example, you can define a bucket like myorganization-${/aws/region}. Data Prepper will then write to buckets such as myorganization-us-east-2 and myorganization-eu-central-1.
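
To make the name substitution concrete, here is a small Python sketch that expands a ${/path} expression against an event. It assumes the event is a nested dictionary with the Region stored under aws.region, as the example bucket name implies. The resolve helper is hypothetical and only mimics the string substitution; it does not show how Data Prepper selects a Region for the write.

```python
import re

# Matches ${/path/to/key} expressions such as ${/aws/region}.
_POINTER = re.compile(r"\$\{(/[^}]*)\}")

def resolve(template, event):
    """Replace each ${/path/to/key} with the value found at that path in the event."""
    def lookup(match):
        value = event
        for key in match.group(1).strip("/").split("/"):
            value = value[key]
        return str(value)
    return _POINTER.sub(lookup, template)

# Assumed event shape for illustration.
events = [
    {"aws": {"region": "us-east-2"}},
    {"aws": {"region": "eu-central-1"}},
]
buckets = [resolve("myorganization-${/aws/region}", e) for e in events]
```

For the two events above, buckets evaluates to myorganization-us-east-2 and myorganization-eu-central-1, matching the bucket names in the example.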

forward_to pipelines

In certain workflows, you may need to send data to sinks in a specific order or use the output from one sink as input for another.

The opensearch sink now supports the forward_to configuration. This allows you to define a target pipeline that receives events after they are written to OpenSearch. The forwarded events include the document ID field.

ARM architecture support

Data Prepper now provides a multi-architecture Docker image with support for both ARM and x86.

As many organizations adopt ARM to reduce compute costs, this change allows you to pull Data Prepper images directly on ARM systems without relying on emulation.

Additionally, Data Prepper offers ARM archive files, making it easier to run on ARM systems that do not use Docker.

Other notable changes

The Data Prepper Docker image is now 46% smaller and has fewer layers, improving Docker pull times.
The AWS Lambda processor now supports improved timeout configuration.
The aggregate processor now has enhanced support for end-to-end acknowledgments and configurations for disabling acknowledgments.
Data Prepper provides several new metrics for observing pipeline health.

Getting started

To download Data Prepper, visit the Download & Get Started page.
For information about getting started with Data Prepper, see Getting started with OpenSearch Data Prepper.
To learn more about the work in progress for Data Prepper 2.15 and other releases, see the Data Prepper Project Roadmap.

Thanks to our contributors!

Thanks to the following community members who contributed to this release!

ashrao94
chenqi0805 — Qi Chen
chrisale000
cwperks — Craig Perkins
divbok — Divyansh Bokadia
dlvenable — David Venable
eatulban
graytaylor0 — Taylor Gray
joelmarty — Joël Marty
kennedy-onyia — Kennedy Onyia
kkondaka — Krishna Kondaka
LeilaMoussa — Leila Moussa
mananrajotia — Manan Rajotia
peterzhuamazon — Peter Zhu
sabarinathan590 — Sabarinathan Subramanian
san81 — Santhosh Gandhe
sb2k16 — Souvik Bose
stelucz — Stehlík Lukáš
Subrahmanyam-Gollapalli — Subrahmanyam-Gollapalli
TomasLongo — Tomas
Utkarsh-Aga — Utkarsh Agarwal
vamsimanohar — Vamsi Manohar
vecheka — Vecheka
wandna-amazon — Nathan Wand
wjyao0316
Zhangxunmt — Xun Zhang

Authors

Krishna Kondaka

Krishna is a senior software engineer working on observability in OpenSearch at Amazon Web Services. He is also a contributor to the Data Prepper project. Prior to joining AWS, Krishna worked on development of AI infrastructure and caching services at Facebook. In addition, he has significant experience in developing networking products from his time at Cisco Systems and VMware.

David Venable

David is a senior software engineer working on observability in OpenSearch at Amazon Web Services. He is a maintainer of the Data Prepper project. Prior to working at Amazon, he was the CTO at Allogy Interactive, a start-up creating mobile-learning solutions for healthcare.
