Verifying migration tools

Before using the Migration Assistant, take the following steps to verify that your cluster is ready for migration.

Verifying snapshot creation

Verify that you can create a snapshot of your source cluster and use it for metadata and backfill scenarios.

Installing the Elasticsearch S3 Repository plugin

The snapshot needs to be stored in a location that Migration Assistant can access. This guide uses Amazon Simple Storage Service (Amazon S3). By default, Migration Assistant creates an S3 bucket for storage. Therefore, it is necessary to install the Elasticsearch S3 repository plugin on your source nodes (https://www.elastic.co/guide/en/elasticsearch/plugins/7.10/repository-s3.html).

Additionally, make sure that the plugin has been configured with AWS credentials that allow it to read from and write to Amazon S3. If your Elasticsearch cluster is running on Amazon Elastic Compute Cloud (Amazon EC2) or Amazon Elastic Container Service (Amazon ECS) instances with an AWS Identity and Access Management (IAM) execution role, include the necessary S3 permissions in that role. Alternatively, you can store the credentials in the Elasticsearch keystore.
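As a rough sketch of the install and keystore steps, the following shell functions wrap the standard Elasticsearch command line tools. The es_home argument is a placeholder for your Elasticsearch installation directory, and the sketch assumes that the plugin's default S3 client is used; neither detail comes from this guide. The plugin must be present on every source node, and a node needs a restart after the plugin is installed.

```shell
# Sketch only: "es_home" is a placeholder for your Elasticsearch install directory.

# Install the S3 repository plugin without interactive prompts.
install_s3_plugin() {
  local es_home="${1:?usage: install_s3_plugin <es_home>}"
  "$es_home/bin/elasticsearch-plugin" install --batch repository-s3
}

# Store an access key pair for the plugin's "default" S3 client.
# Each value is piped on stdin because the keystore tool otherwise prompts for it.
store_s3_credentials() {
  local es_home="${1:?usage: store_s3_credentials <es_home> <access_key> <secret_key>}"
  local access_key="${2:?access key required}"
  local secret_key="${3:?secret key required}"
  printf '%s\n' "$access_key" |
    "$es_home/bin/elasticsearch-keystore" add --stdin --force s3.client.default.access_key
  printf '%s\n' "$secret_key" |
    "$es_home/bin/elasticsearch-keystore" add --stdin --force s3.client.default.secret_key
}
```

For example, install_s3_plugin /usr/share/elasticsearch would install the plugin if Elasticsearch lives in that (assumed) directory.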

Verifying the S3 repository plugin configuration

You can verify that the S3 repository plugin is configured correctly by creating a test snapshot.

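Create an S3 bucket for the snapshot using the following AWS Command Line Interface (AWS CLI) command. For any AWS Region other than us-east-1, also add --create-bucket-configuration LocationConstraint=<your-aws-region>:

aws s3api create-bucket --bucket <your-bucket-name> --region <your-aws-region>

Then register the bucket as a snapshot repository named test_s3_repository on your source cluster:

curl -X PUT "http://<your-source-cluster>:9200/_snapshot/test_s3_repository" -H "Content-Type: application/json" -d '{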
  "type": "s3",
  "settings": {
    "bucket": "<your-bucket-name>",
    "region": "<your-aws-region>"
  }
}'
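Before taking a snapshot, the registered repository can also be checked with the Elasticsearch verify snapshot repository API. The following is a minimal sketch; the host and repository arguments are placeholders, and it assumes the same unauthenticated HTTP endpoint on port 9200 that the commands in this section use.

```shell
# Sketch only: host and repository are placeholders for your own values.

# Ask the nodes in the cluster to confirm that they can access the repository.
verify_snapshot_repository() {
  local host="${1:?usage: verify_snapshot_repository <host> <repository>}"
  local repo="${2:?repository name required}"
  curl -s -X POST "http://$host:9200/_snapshot/$repo/_verify?pretty"
}
```

A successful response lists the nodes that verified the repository, while a failure returns a repository_verification_exception similar to the one described under Troubleshooting.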

Next, create a test snapshot that captures only the cluster’s metadata:

curl -X PUT "http://<your-source-cluster>:9200/_snapshot/test_s3_repository/test_snapshot_1" -H "Content-Type: application/json" -d '{
  "indices": "",
  "ignore_unavailable": true,
  "include_global_state": true
}'

Check the AWS Management Console to confirm that your bucket contains the snapshot.
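If you prefer the command line to the console, the same check can be sketched as follows. The function names and arguments are hypothetical, jq must be installed, and the sketch assumes that the get snapshot response contains a top-level snapshots array, as it does in Elasticsearch 7.10.

```shell
# Sketch only: host, repository, snapshot, and bucket values are placeholders.

# Print the state reported for a snapshot, for example SUCCESS or FAILED.
snapshot_state() {
  local host="${1:?usage: snapshot_state <host> <repository> <snapshot>}"
  local repo="${2:?repository name required}"
  local snapshot="${3:?snapshot name required}"
  curl -s "http://$host:9200/_snapshot/$repo/$snapshot" | jq -r '.snapshots[0].state'
}

# List the objects that the snapshot wrote to the bucket.
list_snapshot_objects() {
  local bucket="${1:?usage: list_snapshot_objects <bucket>}"
  aws s3 ls "s3://$bucket/" --recursive
}
```

A state of SUCCESS together with a non-empty object listing suggests that the plugin can both reach the bucket and write to it.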

Removing test snapshots after verification

To remove the resources created during verification, you can use the following deletion commands:

Test snapshot

curl -X DELETE "http://<your-source-cluster>:9200/_snapshot/test_s3_repository/test_snapshot_1?pretty"

Test snapshot repository

curl -X DELETE "http://<your-source-cluster>:9200/_snapshot/test_s3_repository?pretty"

S3 bucket

aws s3 rm s3://<your-bucket-name> --recursive
aws s3api delete-bucket --bucket <your-bucket-name> --region <your-aws-region>

Troubleshooting

Use this guidance to troubleshoot any of the following snapshot verification issues.

Access denied error (403)

If you encounter an error like AccessDenied (Service: Amazon S3; Status Code: 403), verify the following:

Make sure you’re using the S3 bucket created by Migration Assistant.
If you’re using a custom S3 bucket, verify that:
- The IAM role assigned to your Elasticsearch cluster has the necessary S3 permissions.
- The bucket name and AWS Region provided in the snapshot configuration match the actual S3 bucket you created.

Older versions of Elasticsearch

Older versions of the Elasticsearch S3 repository plugin may have trouble reading IAM role credentials embedded in Amazon EC2 and Amazon ECS instances. These versions bundle a copy of the AWS SDK that predates Instance Metadata Service Version 2 (IMDSv2), which is now the standard way of retrieving those credentials. This can result in snapshot creation failures, with an error message similar to the following:

{"error":{"root_cause":[{"type":"repository_verification_exception","reason":"[migration_assistant_repo] path [rfs-snapshot-repo] is not accessible on master node"}],"type":"repository_verification_exception","reason":"[migration_assistant_repo] path [rfs-snapshot-repo] is not accessible on master node","caused_by":{"type":"i_o_exception","reason":"Unable to upload object [rfs-snapshot-repo/tests-s8TvZ3CcRoO8bvyXcyV2Yg/master.dat] using a single upload","caused_by":{"type":"amazon_service_exception","reason":"Unauthorized (Service: null; Status Code: 401; Error Code: null; Request ID: null)"}}},"status":500}

If you encounter this issue, you can resolve it by temporarily enabling IMDSv1 on the instances in your source cluster for the duration of the snapshot. You can change this setting in the AWS Management Console or by using the AWS CLI. Enabling IMDSv1 turns on the older access model so that the Elasticsearch S3 repository plugin can retrieve credentials as normal. For more information about IMDSv1, see Modify instance metadata options for existing instances.
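With the AWS CLI, the setting can be changed per instance by using aws ec2 modify-instance-metadata-options. The following sketch wraps that command in two hypothetical helper functions; the instance ID is a placeholder, and you would run the functions once for each EC2 instance in the source cluster.

```shell
# Sketch only: the instance ID is a placeholder for each EC2 instance in your source cluster.

# Accept both IMDSv1 and IMDSv2 requests on an instance.
enable_imdsv1() {
  local instance_id="${1:?usage: enable_imdsv1 <instance_id>}"
  aws ec2 modify-instance-metadata-options \
    --instance-id "$instance_id" \
    --http-tokens optional \
    --http-endpoint enabled
}

# Require IMDSv2 session tokens again once the snapshot has completed.
require_imdsv2() {
  local instance_id="${1:?usage: require_imdsv2 <instance_id>}"
  aws ec2 modify-instance-metadata-options \
    --instance-id "$instance_id" \
    --http-tokens required \
    --http-endpoint enabled
}
```

Setting --http-tokens to optional is what permits IMDSv1 requests; setting it back to required restores the IMDSv2-only behavior.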

Switching over client traffic

The Migration Assistant Application Load Balancer is deployed with a listener that shifts traffic between the source and target clusters through proxy services. The Application Load Balancer should start in Source Passthrough mode.

Verifying that the traffic switchover is complete

Use the following steps to verify that the traffic switchover is complete:

In the AWS Management Console, navigate to EC2 > Load Balancers.
Select the MigrationAssistant ALB.
Examine the listener on port 9200 and verify that 100% of the traffic is directed to the Source Proxy.
Navigate to the Migration ECS Cluster in the AWS Management Console.
Select the Target Proxy Service.
Verify that the desired count for the service is running:
- If the desired count is not met, update the service to increase it to at least 1 and wait for the service to start.
On the Health and Metrics tab under Load balancer health, verify that all targets are reporting as healthy:
- This confirms that the Application Load Balancer can connect to the target cluster through the target proxy.
(Reset) Update the desired count for the Target Proxy Service back to its original value in Amazon ECS.
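The Amazon ECS and load balancer checks in the preceding steps can also be approximated from the AWS CLI. This is a sketch under assumptions: the function names are hypothetical, and the cluster name, service name, and target group ARN are placeholders that you would look up for your own deployment.

```shell
# Sketch only: cluster, service, and target group values are placeholders for your deployment.

# Print the desired and running task counts for a service.
service_task_counts() {
  local cluster="${1:?usage: service_task_counts <cluster> <service>}"
  local service="${2:?service name required}"
  aws ecs describe-services --cluster "$cluster" --services "$service" \
    --query 'services[0].[desiredCount,runningCount]' --output text
}

# Change the desired task count, for example to 1 for the check and back afterward.
set_service_desired_count() {
  local cluster="${1:?usage: set_service_desired_count <cluster> <service> <count>}"
  local service="${2:?service name required}"
  local count="${3:?desired count required}"
  aws ecs update-service --cluster "$cluster" --service "$service" --desired-count "$count"
}

# Print the health state of every target in a target group.
target_health_states() {
  local target_group_arn="${1:?usage: target_health_states <target_group_arn>}"
  aws elbv2 describe-target-health --target-group-arn "$target_group_arn" \
    --query 'TargetHealthDescriptions[].TargetHealth.State' --output text
}
```

Note the original desired count reported by service_task_counts before changing it so that you can restore it in the reset step.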

Fixing unidentified traffic patterns

When switching over traffic to the target cluster, you might encounter unidentified traffic patterns. To help identify the cause of these patterns, use the following steps:

Verify that the target cluster allows traffic ingress from the Target Proxy Security Group.
Navigate to Target Proxy ECS Tasks to investigate any failing tasks. Set the Filter desired status to Any desired status to view all tasks, then navigate to the logs for any stopped tasks.

Verifying replication

Use the following steps to verify that replication is working once the traffic capture proxy is deployed:

Navigate to the Migration ECS Cluster in the AWS Management Console.
Navigate to Capture Proxy Service.
Verify that the capture proxy is running with the desired proxy count. If it is not, update the service to increase it to at least 1 and wait for startup.
Under Health and Metrics > Load balancer health, verify that all targets are healthy. This means that the Application Load Balancer is able to connect to the source cluster through the capture proxy.
Navigate to the Migration Console Terminal.
Run console kafka describe-topic-records. Wait 30 seconds for another Application Load Balancer health check.
Run console kafka describe-topic-records again and verify that the number of RECORDS increased between runs.
Run console replay start to start Traffic Replayer.
Run tail -f /shared-logs-output/traffic-replayer-default/*/tuples/tuples.log | jq '.targetResponses[]."Status-Code"' to confirm that the Kafka requests were sent to the target and that it responded as expected. If the responses don’t appear:
- Check that the migration console can access the target cluster by running ./catIndices.sh, which should show the indexes in the source and target.
- Confirm that messages are still being recorded to Kafka.
- Check for errors in the Traffic Replayer logs (/migration/STAGE/default/traffic-replayer-default) using CloudWatch.
(Reset) Update the desired count for the Capture Proxy Service back to its original value in Amazon ECS.
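The two Kafka record checks and the status code review in the preceding steps can be sketched as shell helpers that you run inside the migration console. The function names are hypothetical. The first helper only detects that the command output changed between runs, so you still need to read the RECORDS values yourself; the second assumes that each line of tuples.log is a JSON object, as the jq command in the preceding steps implies.

```shell
# Sketch only: run inside the migration console, where the "console" command is available.

# Run the record check twice and succeed only if the output changed between runs.
# A changed output is a hint that new records arrived; compare the RECORDS values to confirm.
topic_records_changed() {
  local wait_seconds="${1:-30}"
  local before after
  before="$(console kafka describe-topic-records)"
  sleep "$wait_seconds"
  after="$(console kafka describe-topic-records)"
  printf 'First run:\n%s\n\nSecond run:\n%s\n' "$before" "$after"
  [ "$before" != "$after" ]
}

# Read tuple log lines on stdin and count how often each target status code occurs.
count_target_status_codes() {
  jq -r '.targetResponses[]."Status-Code"' | sort | uniq -c
}
```

For example, cat /shared-logs-output/traffic-replayer-default/*/tuples/tuples.log | count_target_status_codes prints one line per status code with its count, which is a one-time summary rather than the live view that tail -f provides.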

Troubleshooting

Use this guidance to troubleshoot any of the following replication verification issues.

Health check responses with 401/403 status code

If the source cluster is configured to require authentication, Application Load Balancer health checks sent through the capture proxy receive a 401 or 403 status code, and replication cannot be verified beyond confirming that these responses are returned. For more information, see Failure Modes.

Traffic does not reach the source cluster

Verify that the source cluster allows traffic ingress from the Capture Proxy Security Group.

Look for failing tasks by navigating to Traffic Capture Proxy ECS. Change Filter desired status to Any desired status in order to see all tasks and navigate to the logs for stopped tasks.

Snapshot and S3 bucket issues

When using the CDK deployment for Migration Assistant, you might encounter the following errors during snapshot creation and deletion.

Bucket permissions

To make sure that you can delete snapshots as well as create them during the CDK deployment process, confirm that the OSMigrations-dev-<region>-CustomS3AutoDeleteObjects stack has S3 object deletion rights. Then, verify that OSMigrations-dev-<region>-default-SnapshotRole has the following S3 permissions:

List bucket contents
Read/Write/Delete objects
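To illustrate what those permissions look like as an IAM policy, the following sketch prints a policy document for a given bucket. It is an assumption-laden example rather than the policy that the CDK deployment generates: the function name is hypothetical, the bucket name is a placeholder, and the ARNs assume the standard aws partition.

```shell
# Sketch only: the bucket name is a placeholder, and this is not the CDK-generated policy.

# Print an IAM policy that allows listing a bucket and reading, writing, and deleting its objects.
snapshot_bucket_policy() {
  local bucket="${1:?usage: snapshot_bucket_policy <bucket>}"
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": "arn:aws:s3:::$bucket"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
      "Resource": "arn:aws:s3:::$bucket/*"
    }
  ]
}
EOF
}
```

Comparing this output with the policy attached to the snapshot role can help you spot a missing action, such as s3:DeleteObject, when snapshot deletion fails.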

Snapshot conflicts

To prevent snapshot conflicts, use the console snapshot delete command from the migration console. If you delete snapshots or snapshot repositories in a location other than the migration console, you might encounter “already exists” errors.

Resetting before migration

After all verifications are complete, reset all resources before using Migration Assistant for an actual migration.

The following steps reset each resource in turn. You can perform them after completing the steps in Accessing the Migration Console.

Traffic Replayer

To stop running Traffic Replayer, use the following command:

console replay stop

Kafka

To clear all captured traffic from the Kafka topic, run the following command. Use this command with caution because it deletes all traffic data captured by the capture proxy up to this point:

console kafka delete-topic

Target cluster

To clear non-system indexes that may have been created on the target cluster as a result of testing, run the following command. Use this command with caution because it results in the loss of all data in the target cluster:

console clusters clear-indices --cluster target
