
This version of the OpenSearch documentation is no longer maintained. For the latest version, see the current documentation. For information about OpenSearch version maintenance, see Release Schedule and Maintenance Policy.

Get started with Data Prepper

Data Prepper is an independent component, not an OpenSearch plugin, that converts data for use with OpenSearch. It’s not bundled with the all-in-one OpenSearch installation packages.

1. Install Data Prepper

To use the Docker image, pull it like any other image:

docker pull opensearchproject/data-prepper:latest

2. Define a pipeline

Create a Data Prepper pipeline file, pipelines.yaml, with the following configuration:

simple-test-pipeline:
  workers: 2
  delay: "5000"
  source:
    random:
  sink:
    - stdout:
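
As a minimal sketch, the file for the random-to-stdout sample could be created from a shell as shown here. The pipeline name `simple-test-pipeline` is an assumption inferred from the thread names in the log output on this page, and `PIPELINE_FILE` is a hypothetical variable name used only for illustration.

```shell
# Write the sample pipeline to pipelines.yaml in the current directory.
# Assumption: the pipeline name is taken from the log output on this page.
cat > pipelines.yaml <<'EOF'
simple-test-pipeline:
  workers: 2
  delay: "5000"
  source:
    random:
  sink:
    - stdout:
EOF

# A bind mount with "docker run -v" needs an absolute host path,
# so resolve it once here (PIPELINE_FILE is a hypothetical name).
PIPELINE_FILE="$(pwd)/pipelines.yaml"
echo "Pipeline file: ${PIPELINE_FILE}"
```

The printed path is the value to substitute for `/full/path/to/pipelines.yaml` when starting the container.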

3. Start Data Prepper

Run the following command, replacing /full/path/to/pipelines.yaml with the absolute path to your pipeline configuration file:

docker run --name data-prepper \
    -v /full/path/to/pipelines.yaml:/usr/share/data-prepper/pipelines.yaml \
    opensearchproject/data-prepper:latest

The sample pipeline configuration above defines a simple pipeline in which a source (random) sends data to a sink (stdout). For more examples and details on advanced pipeline configurations, see Pipelines.

After starting Data Prepper, you should see log output and some UUIDs after a few seconds:

2021-09-30T20:19:44,147 [main] INFO - Data Prepper server running at :4900
2021-09-30T20:19:44,681 [random-source-pool-0] INFO - Writing to buffer
2021-09-30T20:19:45,183 [random-source-pool-0] INFO - Writing to buffer
2021-09-30T20:19:45,687 [random-source-pool-0] INFO - Writing to buffer
2021-09-30T20:19:46,191 [random-source-pool-0] INFO - Writing to buffer
2021-09-30T20:19:46,694 [random-source-pool-0] INFO - Writing to buffer
2021-09-30T20:19:47,200 [random-source-pool-0] INFO - Writing to buffer
2021-09-30T20:19:49,181 [simple-test-pipeline-processor-worker-1-thread-1] INFO -  simple-test-pipeline Worker: Processing 6 records from buffer
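
To confirm that records written by the source actually reach the processor worker, the log lines can be filtered and counted. The sketch below runs against the sample lines above so that it works without a running container. The helper name `sample_logs` is hypothetical, and in practice the here-document would likely be replaced with `docker logs data-prepper 2>&1`.

```shell
# Sample log lines copied from the output above. With a running container,
# replace this function body with: docker logs data-prepper 2>&1
sample_logs() {
cat <<'EOF'
2021-09-30T20:19:44,147 [main] INFO - Data Prepper server running at :4900
2021-09-30T20:19:44,681 [random-source-pool-0] INFO - Writing to buffer
2021-09-30T20:19:45,183 [random-source-pool-0] INFO - Writing to buffer
2021-09-30T20:19:45,687 [random-source-pool-0] INFO - Writing to buffer
2021-09-30T20:19:46,191 [random-source-pool-0] INFO - Writing to buffer
2021-09-30T20:19:46,694 [random-source-pool-0] INFO - Writing to buffer
2021-09-30T20:19:47,200 [random-source-pool-0] INFO - Writing to buffer
2021-09-30T20:19:49,181 [simple-test-pipeline-processor-worker-1-thread-1] INFO -  simple-test-pipeline Worker: Processing 6 records from buffer
EOF
}

# Count how many times the random source wrote a record to the buffer.
writes=$(sample_logs | grep -c 'Writing to buffer')

# Extract the record count from each "Processing N records" line.
processed=$(sample_logs | sed -n 's/.*Processing \([0-9][0-9]*\) records from buffer.*/\1/p')

echo "buffer writes: ${writes}"
echo "records processed: ${processed}"
```

For the sample lines, both counts come out to 6, which matches the six buffer writes that precede the worker's batch.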