
OpenSearch Data Prepper 2.12 is now available for download! This release includes a new way to ingest OpenTelemetry (OTel) data as well as two new sinks.

Unified OTLP source

Data Prepper now includes a unified OpenTelemetry protocol (OTLP) source that streamlines telemetry data ingestion through a single, consolidated configuration. This source supports multiple protocols, seamlessly handling both gRPC and HTTP (with proto encoding) endpoints. It enables ingestion of OTel logs, traces, and metrics through exposed OTLP endpoints, simplifying configuration management and improving the efficiency of the data processing pipeline.

Additionally, to help you process the different signal types that OTel provides, Data Prepper now includes the getEventType() function. This function enables dynamic classification and conditional routing of events within pipelines. With the otlp source, you can use it to route each signal type to its own pipeline.

This sample pipeline shows a basic otlp source that routes logs, metrics, and traces to three different pipelines for processing:

source:
  otlp:
route:
  - logs: 'getEventType() == "LOG"'
  - traces: 'getEventType() == "TRACE"'
  - metrics: 'getEventType() == "METRIC"'
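
Routes only take effect once a sink refers to them. The following is a minimal sketch of how the sample might look as a complete pipeline definition, with route defined at the pipeline level alongside source and sink. The pipeline names (otlp-pipeline, logs-pipeline, and so on) and the stdout sink are placeholders chosen for illustration, not part of the release:

```yaml
# Entry pipeline: receives all OTel signals and fans them out by type.
otlp-pipeline:
  source:
    otlp:
  route:
    - logs: 'getEventType() == "LOG"'
    - traces: 'getEventType() == "TRACE"'
    - metrics: 'getEventType() == "METRIC"'
  sink:
    # Each sink receives only the events that match its listed routes.
    - pipeline:
        name: logs-pipeline      # hypothetical pipeline name
        routes:
          - logs
    - pipeline:
        name: traces-pipeline    # hypothetical pipeline name
        routes:
          - traces
    - pipeline:
        name: metrics-pipeline   # hypothetical pipeline name
        routes:
          - metrics

# Downstream pipeline for logs. Its source names the entry pipeline above.
logs-pipeline:
  source:
    pipeline:
      name: otlp-pipeline
  sink:
    - stdout:   # placeholder sink; replace with your real destination
```

The traces-pipeline and metrics-pipeline definitions would follow the same shape as logs-pipeline, each with its own processors and sinks.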

Amazon SQS sink

Data Prepper now supports Amazon Simple Queue Service (Amazon SQS) as an output sink. Amazon SQS is a widely adopted message queuing service designed for decoupling producers and consumers in distributed systems. It is especially suited for lightweight, structured messages that require timely delivery and reliable processing.

Sending Data Prepper output to Amazon SQS lets downstream consumers receive processed events without an intermediate store. A traditional approach is to send output to an Amazon Simple Storage Service (Amazon S3) bucket and configure an SQS notification on that bucket. With the new SQS sink, Data Prepper skips the S3 write and the indirect SQS trigger, which reduces latency and improves responsiveness. You no longer need to configure S3 event notifications, write intermediate files, or manage bucket lifecycle rules for this purpose. You can go straight from processing to queuing with a minimal configuration.

Here’s how to get started with the SQS sink:

sink:
  - sqs:
      queue_url: <queue-url>
      codec:
        json:
      aws:
        region: <region>
        sts_role_arn: <role>

OTLP sink for AWS X-Ray

You can now export processed trace data to AWS X-Ray through Data Prepper’s new OTLP sink plugin. This lets you use Data Prepper’s transformation and enrichment capabilities while staying compliant with OTel standards and sending data directly to AWS X-Ray in the OTLP format. The OTLP sink currently supports exporting spans to AWS X-Ray endpoints. Future versions are planned to support sending spans, metrics, and logs to any OTLP protobuf-compatible endpoint.

The plugin is designed for high performance, sustaining up to 3,500 transactions per second with sub-150 ms p99 latency while using minimal system resources. For production reliability, it provides configurable retry logic with exponential backoff, gzip compression for efficient data transfer, and metrics for monitoring pipeline health.

Here’s how to get started with the OTLP sink for AWS X-Ray:

source:
  otel_trace_source:
sink:
  - otlp:
      endpoint: "https://xray.{region}.amazonaws.com/v1/traces"
      aws: { }

Maven releases

Many community members have expressed interest in using various Data Prepper features as libraries. To help support the broader community, the Data Prepper team is now publishing all Data Prepper libraries to Maven Central.

The following Maven groups are available to the community:

  • org.opensearch.dataprepper — Includes the data-prepper-api library that plugin authors use to write plugins.
  • org.opensearch.dataprepper.test — Test libraries to support common test scenarios when developing against Data Prepper.
  • org.opensearch.dataprepper.plugins — The plugins that deploy with Data Prepper. Each plugin either has its own jar or is combined with highly related plugins.
  • org.opensearch.dataprepper.core — Data Prepper core functionality, such as the plugin framework, events, expressions, and running as a pipeline.

Other features and improvements

  • Data Prepper expressions now support the modulus operator (%).
  • Data Prepper can now authenticate with OpenSearch using an API token. The new api_token parameter sets a bearer token and can be used with a JWT to access OpenSearch.
  • You can now enable specific experimental plugins rather than enabling all or none.
  • You can now disable reporting of specific Data Prepper metrics. This can help reduce the overall quantity of metrics when there are some you don’t need to monitor.
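
As an illustration of the modulus operator, the following sketch uses the existing add_entries processor to derive a bucket number from a numeric field. The field names request_id and bucket are hypothetical, and the example assumes that request_id holds an integer:

```yaml
processor:
  - add_entries:
      entries:
        # Writes request_id modulo 10 into a new "bucket" key.
        # "request_id" and "bucket" are made-up names for illustration.
        - key: "bucket"
          value_expression: "/request_id % 10"
```

A derived value like this could then be used downstream, for example to group or sample events.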

Getting started

Thanks to our contributors!

Thanks to the following community members who contributed to this release!

Authors

  • David is a senior software engineer working on observability in OpenSearch at Amazon Web Services. He is a maintainer on the Data Prepper project. Prior to working at Amazon, he was the CTO at Allogy Interactive, a start-up creating mobile-learning solutions for healthcare.

  • Krishna is a senior software engineer working on observability in OpenSearch at Amazon Web Services. He is also a contributor to the Data Prepper project. Prior to joining AWS, Krishna worked on the development of AI infrastructure and caching services at Facebook. In addition, he has significant experience in developing networking products from his time at Cisco Systems and VMware.

  • Huy is a software development engineer working at Amazon. He is a contributor to the OpenSearch Data Prepper project.

  • Shenoy Pratik is a software engineer at AWS working on observability for the OpenSearch Project. He maintains multiple plugins, including OpenSearch Observability, Reporting, and Query Workbench. Prior to joining AWS, he worked at SAP, focusing on computer vision and machine learning.
