The OpenSearch Data Prepper maintainers are happy to announce the release of Data Prepper 2.13. This release includes a number of improvements and new capabilities that make Data Prepper easier to use.
Prometheus sink
Data Prepper now supports Prometheus as a sink. Initially, only Amazon Managed Service for Prometheus is supported as the external Prometheus sink. This lets you export metric data processed within Data Prepper pipelines to the Prometheus ecosystem, so Data Prepper can serve as a bridge between various metric sources (such as OpenTelemetry, Logstash, or Amazon Simple Storage Service [Amazon S3]) and Prometheus-compatible monitoring systems.
A core aspect of the Prometheus sink is its handling of different metric types. The implementation ensures that Data Prepper’s internal metric representations are correctly mapped to Prometheus time series families:
- Counters: For Sum metrics with cumulative aggregation temporality and monotonically increasing values, the sink generates a single time series using the metric name. The value represents the cumulative count.
- Gauges: Similar to counters, Gauge metrics are mapped to a single time series with the current value. Sum metrics that are not mapped to counters are handled the same way.
- Summaries: Summary metrics are converted into a time series with quantile labels, along with corresponding _sum and _count series.
- Histograms: Support for histograms is more complex. The sink generates several distinct types of time series for each histogram metric to fully represent the distribution, including buckets, sum, count, min, and max.
- Exponential histograms: As with histograms, the sink generates several distinct types of time series for each exponential histogram metric to fully represent the distribution, including scale, zero threshold, zero count, sum, count, min, and max.
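The counter, gauge, and summary rules above can be sketched in a few lines of Python. This is only an illustration of the mapping as described in this post, not the sink's actual code: the `SumMetric` class and both function names are hypothetical, and the `_sum` and `_count` suffixes follow the usual Prometheus summary convention.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# A (series name, labels, value) triple; a hypothetical shape for illustration.
Series = Tuple[str, Dict[str, str], float]


@dataclass
class SumMetric:
    """Hypothetical stand-in for a Sum metric event."""
    name: str
    value: float
    is_monotonic: bool
    temporality: str  # "cumulative" or "delta"


def prometheus_type_for_sum(metric: SumMetric) -> str:
    """Cumulative, monotonic sums map to counters; every other sum maps to a gauge."""
    if metric.temporality == "cumulative" and metric.is_monotonic:
        return "counter"
    return "gauge"


def summary_series(name: str, quantiles: Dict[float, float],
                   total: float, count: int) -> List[Series]:
    """Expand a summary into quantile-labeled series plus _sum and _count."""
    series: List[Series] = [
        (name, {"quantile": str(q)}, v) for q, v in sorted(quantiles.items())
    ]
    series.append((f"{name}_sum", {}, total))
    series.append((f"{name}_count", {}, float(count)))
    return series
```

For example, a summary with two quantiles expands into four series: one per quantile, plus the `_sum` and `_count` series.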
In addition to mapping metrics, the sink handles attribute labeling and name sanitization, creating labels for all metric, resource, and scope attributes.
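To make name sanitization concrete, here is a minimal Python sketch. It assumes the classic Prometheus naming rules (label names match `[a-zA-Z_][a-zA-Z0-9_]*`, and metric names additionally allow colons) and replaces every other character with an underscore. The function names and the merge order of the attribute groups are assumptions for illustration; this post does not describe the sink's exact rules.

```python
import re
from typing import Any, Dict

_INVALID_LABEL_CHARS = re.compile(r"[^a-zA-Z0-9_]")
_INVALID_METRIC_CHARS = re.compile(r"[^a-zA-Z0-9_:]")


def _fix_leading_char(cleaned: str) -> str:
    """Names may not be empty or start with a digit, so prefix an underscore."""
    if not cleaned or cleaned[0].isdigit():
        return "_" + cleaned
    return cleaned


def sanitize_label_name(name: str) -> str:
    """Replace characters that are not valid in a Prometheus label name."""
    return _fix_leading_char(_INVALID_LABEL_CHARS.sub("_", name))


def sanitize_metric_name(name: str) -> str:
    """Replace characters that are not valid in a Prometheus metric name."""
    return _fix_leading_char(_INVALID_METRIC_CHARS.sub("_", name))


def build_labels(metric_attrs: Dict[str, Any],
                 resource_attrs: Dict[str, Any],
                 scope_attrs: Dict[str, Any]) -> Dict[str, str]:
    """Merge all three attribute groups into one label set.

    Assumption: on a name collision, metric attributes win because they
    are applied last. The real sink may resolve collisions differently.
    """
    labels: Dict[str, str] = {}
    for attrs in (resource_attrs, scope_attrs, metric_attrs):
        for key, value in attrs.items():
            labels[sanitize_label_name(key)] = str(value)
    return labels
```

With this sketch, an OpenTelemetry-style attribute such as `service.name` becomes the label `service_name`.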
The sink can be configured for Amazon Managed Service for Prometheus as follows:
sink:
  - prometheus:
      url: <amp workspace remote-write api url>
      aws:
        region: <region>
        sts_role_arn: <role-arn>
OpenSearch data stream support
Data Prepper now supports OpenSearch data streams natively in the opensearch sink. With this change, Data Prepper will look up the index to determine whether it is a data stream. If so, it will configure the bulk writes to the sink so that they work directly with data streams.
Prior to this feature, Data Prepper pipeline authors had to manually adjust the sink configuration to write to data stream indexes. Now a minimal sink configuration is enough, and Data Prepper sets up the sink correctly. Additionally, Data Prepper automatically sets the @timestamp field to the time the event was received by Data Prepper if the pipeline does not already set this value.
For example, the configuration could be as simple as the following:
sink:
  - opensearch:
      hosts: [ "https://localhost:9200" ]
      index: my-log-index
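To illustrate what writing "directly with data streams" involves, the following Python sketch builds a single bulk request entry. It relies on two points: data streams are append-only, so bulk writes to them use the `create` action, and every document in a data stream needs an `@timestamp` field. The `bulk_lines` function and its parameters are hypothetical, and this is not how Data Prepper itself is implemented.

```python
import json
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def bulk_lines(index: str, document: Dict[str, Any], is_data_stream: bool,
               received_at: Optional[datetime] = None) -> str:
    """Return the action line and document line for one bulk entry."""
    doc = dict(document)  # copy so the caller's event is not mutated
    if is_data_stream:
        # Fill in @timestamp with the receive time only when it is missing.
        received_at = received_at or datetime.now(timezone.utc)
        doc.setdefault("@timestamp", received_at.isoformat())
        action = {"create": {"_index": index}}
    else:
        action = {"index": {"_index": index}}
    return json.dumps(action) + "\n" + json.dumps(doc) + "\n"
```

A document that already carries `@timestamp` keeps its own value; only documents without one receive the receive time.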
Cross-Region s3 source
The s3 source is a popular Data Prepper feature for ingesting data from S3 buckets. This source can read from S3 buckets using Amazon Simple Queue Service (Amazon SQS) notifications or scan multiple S3 buckets. It is common for users to have S3 buckets in multiple AWS Regions that they want to read in a single pipeline. For example, some teams may want to get VPC flow logs from multiple Regions and consolidate them into a single OpenSearch cluster. Now Data Prepper users can read from multiple buckets in different Regions. No custom configuration is needed for this feature because Data Prepper handles the cross-Region access automatically.
Other great changes
- The maintainers have invested in performance improvements for expressions and core processors. Our benchmarking indicates that this has improved throughput by over 20% when using expressions.
- The dynamodb source now fully checkpoints within shards. This change reduces duplicate processing from Amazon DynamoDB tables when failures occur. Before this change, when restarting reading from a DynamoDB shard, Data Prepper would start from the beginning of the shard. With this change, a Data Prepper node will start from the last successfully processed event in the shard.
- The delete_entries and select_entries processors now support regex patterns to determine whether to delete or select fields to help pipeline authors clean up their events.
- The rename_keys processor can now normalize keys, allowing pipeline authors to write simple pipelines to get data into OpenSearch.
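As a rough illustration of regex-based field selection, the Python sketch below deletes or keeps the top-level keys of an event that match a pattern. It only demonstrates the idea: the function names mirror the processor names for readability, the processors' actual configuration options are not shown in this post, and matching the whole key (rather than a substring) is an assumption.

```python
import re
from typing import Any, Dict


def delete_entries(event: Dict[str, Any], pattern: str) -> Dict[str, Any]:
    """Return a copy of the event without the keys that fully match the pattern."""
    regex = re.compile(pattern)
    return {k: v for k, v in event.items() if not regex.fullmatch(k)}


def select_entries(event: Dict[str, Any], pattern: str) -> Dict[str, Any]:
    """Return a copy of the event with only the keys that fully match the pattern."""
    regex = re.compile(pattern)
    return {k: v for k, v in event.items() if regex.fullmatch(k)}
```

For instance, the pattern `debug_.*` would remove every field whose name starts with `debug_` in the delete case, or keep only those fields in the select case.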
Getting started
- To download Data Prepper, visit the Download & Get Started page.
- For instructions on how to get started with Data Prepper, see Getting started with OpenSearch Data Prepper.
- To learn more about the work in progress for Data Prepper 2.14 and other releases, see the Data Prepper Project Roadmap.
Thanks to our contributors!
Thanks to the following community members who contributed to this release!
- akshay0709 — Akshay Pawar
- alparish
- chenqi0805 — Qi Chen
- danhli — Daniel Li
- Davidding4718 — Siqi Ding
- derek-ho — Derek Ho
- dinujoh — Dinu John
- divbok — Divyansh Bokadia
- dlvenable — David Venable
- FedericoBrignola
- franky-m
- gaiksaya — Sayali Gaikawad
- Galactus22625 — Maxwell Brown
- graytaylor0 — Taylor Gray
- huypham612 — huyPham
- ivan-tse — Ivan Tse
- janhoy — Jan Høydahl
- jayeshjeh — Jayesh Parmar
- jeffreyAaron — Jeffrey Aaron Jeyasingh
- jmsusanto — Jeremy Michael
- joelmarty — Joël Marty
- juergen-walter — Jürgen Walter
- KarstenSchnitter — Karsten Schnitter
- kkondaka — Krishna Kondaka
- LeeroyHannigan — Lee
- linghengqian — Ling Hengqian
- mishavay-aws
- MohammedAghil — Mohammed Aghil Puthiyottil
- niketan16 — Niketan Chandarana
- nsgupta1 — Neha Gupta
- oeyh — Hai Yan
- ps48 — Shenoy Pratik
- quanghungb — qhung
- RashmiRam — Rashmi
- Rishikesh1159 — Rishikesh
- saketh-pallempati — Saketh Pallempati
- san81 — Santhosh Gandhe
- savit-aluri — Savit Aluri
- sb2k16 — Souvik Bose
- seschis — Shane Schisler
- shenkw1 — Katherine Shen
- srikanthjg — Srikanth Govindarajan
- timo-mue
- TomasLongo — Tomas
- Zhangxunmt — Xun Zhang