Data Source Connection

Data source connection issues can occur when associating OpenSearch domains or serverless collections with your application, or when an existing connection becomes unhealthy. This guide covers the most common failure scenarios and their resolutions.

Data source health statuses

When you view your application's data sources, each one shows a health status:

| Status | Meaning | Action Required |
| --- | --- | --- |
| Green / Healthy | Connected and operational | None |
| Yellow | Connected but degraded (replica shards unassigned) | Monitor; may self-resolve |
| Red / Unhealthy | Connected, but with critical issues (primary shards unavailable) | Investigate domain health |
| Unavailable | Cannot establish connection | Check permissions, VPC, domain status |

Association failures

"Domain not found" error

Symptoms: When calling update-application, you get an error indicating the domain doesn't exist.

Common causes:

Wrong region — The domain is in a different region than the application

# Check which region the domain is in
aws opensearch list-domain-names --region us-east-1
aws opensearch list-domain-names --region us-west-2

Wrong account — The domain is in a different AWS account
- For cross-account access, see Cross-Account Access

Domain name typo — The ARN contains an incorrect domain name

# List all domains to verify the name
aws opensearch list-domain-names --region us-east-1

Domain was deleted — The domain no longer exists

aws opensearch describe-domain --domain-name my-domain --region us-east-1
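
The four causes above can be told apart mechanically once you have the ARN and the `list-domain-names` results. The following is a minimal sketch, not part of any AWS tooling: `parse_domain_arn`, `diagnose_not_found`, and the returned labels are hypothetical names, and `domains_by_region` is assumed to be a dict you build yourself from the `list-domain-names` output of each region you checked.

```python
from typing import Dict, List


def parse_domain_arn(arn: str) -> Dict[str, str]:
    """Split arn:aws:es:REGION:ACCOUNT:domain/NAME into region, account, name."""
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[2] != "es" or not parts[5].startswith("domain/"):
        raise ValueError(f"not a managed-domain ARN: {arn!r}")
    return {
        "region": parts[3],
        "account": parts[4],
        "name": parts[5][len("domain/"):],
    }


def diagnose_not_found(
    arn: str,
    app_region: str,
    caller_account: str,
    domains_by_region: Dict[str, List[str]],
) -> str:
    """Return a (hypothetical) label for the most likely 'Domain not found' cause."""
    domain = parse_domain_arn(arn)
    if domain["account"] != caller_account:
        return "wrong-account"
    if domain["region"] != app_region:
        return "wrong-region"
    if domain["name"] in domains_by_region.get(domain["region"], []):
        return "domain-exists"
    # The name is missing from the expected region: look for it elsewhere.
    for region, names in domains_by_region.items():
        if region != domain["region"] and domain["name"] in names:
            return f"found-in-{region}"
    return "typo-or-deleted"
```

A result of `typo-or-deleted` cannot distinguish the last two causes; the `describe-domain` call above is still needed to confirm that the domain is gone.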

"Access denied" when associating a data source

Symptoms: The update-application call fails with an access denied error.

Diagnosis:

Confirm which IAM identity is making the call:
```
aws sts get-caller-identity
```

Verify that this identity's IAM policy grants the required permissions:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "es:UpdateApplication",
        "es:DescribeDomain",
        "es:DescribeDomains"
      ],
      "Resource": "*"
    }
  ]
}

Check the domain's access policy:

aws opensearch describe-domain-config \
    --domain-name my-domain \
    --query 'DomainConfig.AccessPolicies' \
    --region us-east-1

Solution:

The domain's access policy must allow the OpenSearch UI service:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "application.opensearchservice.amazonaws.com"
      },
      "Action": "es:ESHttp*",
      "Resource": "arn:aws:es:us-east-1:123456789012:domain/my-domain/*"
    }
  ]
}
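
To check an existing policy against this requirement without reading the JSON by eye, something like the following sketch may help. It assumes you pass in the policy document as a JSON string (in the `describe-domain-config` output this is expected to be the `Options` field under `AccessPolicies`); `names_ui_service` is a hypothetical helper name. It only looks for an `Allow` statement that names the service principal explicitly. It does not evaluate actions, resources, conditions, wildcard principals, or `Deny` statements.

```python
import json
from typing import Any, List

UI_SERVICE = "application.opensearchservice.amazonaws.com"


def _as_list(value: Any) -> List[Any]:
    """IAM policy fields may hold a single value or a list; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def names_ui_service(policy_json: str, service: str = UI_SERVICE) -> bool:
    """True if some Allow statement lists `service` as a Service principal."""
    if not policy_json:
        # A domain without an access policy has nothing to match.
        return False
    policy = json.loads(policy_json)
    for statement in _as_list(policy.get("Statement")):
        principal = statement.get("Principal")
        if statement.get("Effect") != "Allow" or not isinstance(principal, dict):
            continue
        if service in _as_list(principal.get("Service")):
            return True
    return False
```

A `False` result is a prompt to inspect the policy, not proof that access is denied, because of the limitations listed above.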

"Validation error" on data source ARN

Symptoms: The API rejects the data source ARN format.

Common ARN format issues:

| Data Source Type | Correct ARN Format |
| --- | --- |
| Managed domain | `arn:aws:es:REGION:ACCOUNT:domain/DOMAIN_NAME` |
| Serverless collection | `arn:aws:aoss:REGION:ACCOUNT:collection/COLLECTION_ID` |

Note that serverless collections use aoss (not es) and use the collection ID (not name):

# Get the collection ID
aws opensearchserverless list-collections \
    --query 'collectionSummaries[?name==`my-collection`].id' \
    --region us-east-1
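
A quick local check can catch most format mistakes before the API call is made. The sketch below is illustrative only: `classify_data_source_arn` is a hypothetical helper, the patterns assume the standard `aws` partition, and the collection-ID pattern (lowercase letters and digits only) is an assumption based on typical IDs, which is what lets it reject a hyphenated collection name used in place of the ID.

```python
import re
from typing import Optional

# Domain names: 3-28 characters, starting with a lowercase letter.
_DOMAIN_ARN = re.compile(
    r"arn:aws:es:[a-z0-9-]+:\d{12}:domain/[a-z][a-z0-9-]{2,27}"
)
# Assumption: collection IDs are lowercase alphanumeric with no hyphens.
_COLLECTION_ARN = re.compile(
    r"arn:aws:aoss:[a-z0-9-]+:\d{12}:collection/[a-z0-9]+"
)


def classify_data_source_arn(arn: str) -> Optional[str]:
    """Return 'domain', 'collection', or None if the ARN matches neither format."""
    if _DOMAIN_ARN.fullmatch(arn):
        return "domain"
    if _COLLECTION_ARN.fullmatch(arn):
        return "collection"
    return None
```

A `None` result usually points to one of the issues in the table: the wrong service prefix (`es` versus `aoss`), or a collection name where the ID belongs.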

Cross-account data source issues

Connection shows as "Pending"

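The inbound cross-cluster connection hasn't been accepted yet. The connection stays pending until the owner of the destination domain accepts it.

# List pending connections in Account A (the account that received the request)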
aws opensearch describe-inbound-connections \
    --filters '[{"Name": "connection-status", "Values": ["PENDING_ACCEPTANCE"]}]' \
    --region us-east-1

Accept the connection:

aws opensearch accept-inbound-connection \
    --connection-id conn-abc123 \
    --region us-east-1
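
If several requests are pending, the connection IDs can be pulled out of the JSON output rather than copied by hand. This is a sketch under an assumption about the response shape: it expects the `describe-inbound-connections` output to contain a `Connections` list whose entries carry `ConnectionId` and `ConnectionStatus.StatusCode`. Check this against your own CLI output before relying on it; `pending_connection_ids` is a hypothetical helper name.

```python
from typing import Any, Dict, List


def pending_connection_ids(response: Dict[str, Any]) -> List[str]:
    """Collect the IDs of connections still waiting for acceptance.

    Assumes each entry looks like:
    {"ConnectionId": "...", "ConnectionStatus": {"StatusCode": "..."}}
    """
    ids = []
    for connection in response.get("Connections", []):
        status = connection.get("ConnectionStatus", {}).get("StatusCode")
        if status == "PENDING_ACCEPTANCE":
            ids.append(connection["ConnectionId"])
    return ids
```

Each returned ID can then be passed to `accept-inbound-connection` as shown above.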

Cross-account domain shows "Unavailable"

Verify the cross-cluster connection is Active:

aws opensearch describe-outbound-connections \
    --filters '[{"Name": "connection-status", "Values": ["ACTIVE"]}]' \
    --region us-west-2

Check that the remote domain's access policy allows the source account.

For VPC domains, verify that VPC endpoint access is authorized (list-vpc-endpoint-access shows the current authorizations). If it is missing, grant it:

aws opensearch authorize-vpc-endpoint-access \
    --domain-name remote-domain \
    --service application.opensearchservice.amazonaws.com \
    --account 123456789012 \
    --region us-west-2

Data source showing as "Unhealthy"

An unhealthy data source means the application can connect but the domain itself has issues.

Diagnosis

Check the domain's status and configuration (a Processing value of true means a configuration change is in progress):

aws opensearch describe-domain \
    --domain-name my-domain \
    --region us-east-1 \
    --query 'DomainStatus.{
      Processing: Processing,
      EngineVersion: EngineVersion,
      ClusterConfig: ClusterConfig,
      Endpoint: Endpoint
    }'

Check CloudWatch metrics for the domain:

# Check cluster status over the last hour
# (GNU date syntax; on macOS/BSD use `date -u -v-1H +%Y-%m-%dT%H:%M:%SZ` for the start time)
aws cloudwatch get-metric-statistics \
    --namespace AWS/ES \
    --metric-name "ClusterStatus.red" \
    --dimensions Name=DomainName,Value=my-domain Name=ClientId,Value=123456789012 \
    --start-time "$(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ)" \
    --end-time "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
    --period 300 \
    --statistics Maximum
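
The `ClusterStatus.red` metric reports 1 while the cluster is red and 0 otherwise, and `get-metric-statistics` does not return datapoints in time order. The sketch below, with the hypothetical helper name `red_timestamps`, extracts the periods in which the cluster was red from the command's JSON output. It assumes the timestamps are ISO 8601 strings in a single format, so they sort correctly as text.

```python
from typing import Any, Dict, List


def red_timestamps(response: Dict[str, Any]) -> List[str]:
    """Return the sorted timestamps of datapoints where the cluster was red."""
    points = response.get("Datapoints", [])
    # Maximum >= 1 means the cluster was red at some point in that period.
    red = [p for p in points if p.get("Maximum", 0) >= 1]
    return sorted(p["Timestamp"] for p in red)
```

An empty list means no red period was recorded in the queried window; it does not rule out a yellow status, which is reported by a separate metric.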

Common causes of unhealthy status

| Cause | Symptoms | Resolution |
| --- | --- | --- |
| Red cluster status | Primary shards unavailable | Check disk space, node health |
| Domain processing | Configuration change in progress | Wait for processing to complete |
| High JVM memory pressure | Slow queries, timeouts | Scale up instance type or add nodes |
| Disk space full | Write rejections | Delete old indices or increase storage |
| Too many shards | Cluster instability | Reduce shard count, use ISM policies |

Quick fixes

Disk space full:

# Check per-node disk usage from the Dev Tools console
GET _cat/allocation?v

High memory pressure:

- Scale up the instance type in the domain configuration
- Reduce concurrent query load

Too many shards:

- Use Index State Management (ISM) policies to automatically delete or roll over old indices
- Consolidate small indices, for example by reindexing several into one

Serverless collection connection issues

Collection not appearing in data source list

Verify the collection exists and is Active:

aws opensearchserverless list-collections --region us-east-1

- Check that the collection is in the same region as the application.
- Verify that your IAM permissions include aoss:BatchGetCollection.

"Access denied" on serverless collection

Serverless collections require both a network policy and a data access policy:

Network policy (allows the service to connect):

[{
  "Rules": [{
    "ResourceType": "collection",
    "Resource": ["collection/my-collection"]
  }],
  "SourceServices": ["application.opensearchservice.amazonaws.com"]
}]

Data access policy (allows the service to read data):

[{
  "Rules": [{
    "ResourceType": "index",
    "Resource": ["index/my-collection/*"],
    "Permission": ["aoss:ReadDocument", "aoss:DescribeIndex"]
  }],
  "Principal": ["arn:aws:iam::123456789012:role/OpenSearchUIServiceRole"]
}]
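
When the same pair of policies is needed for several collections, generating them avoids copy-and-paste mistakes in the resource strings. The following sketch simply reproduces the two documents shown above from their variable parts; `build_network_policy` and `build_data_access_policy` are hypothetical helper names, and any additional fields your environment needs are not covered.

```python
import json
from typing import Any, Dict, List, Sequence


def build_network_policy(
    collection_name: str, source_services: Sequence[str]
) -> List[Dict[str, Any]]:
    """Network policy that lets the listed services reach the collection."""
    return [{
        "Rules": [{
            "ResourceType": "collection",
            "Resource": [f"collection/{collection_name}"],
        }],
        "SourceServices": list(source_services),
    }]


def build_data_access_policy(
    collection_name: str,
    principals: Sequence[str],
    permissions: Sequence[str] = ("aoss:ReadDocument", "aoss:DescribeIndex"),
) -> List[Dict[str, Any]]:
    """Data access policy granting read permissions on every index in the collection."""
    return [{
        "Rules": [{
            "ResourceType": "index",
            "Resource": [f"index/{collection_name}/*"],
            "Permission": list(permissions),
        }],
        "Principal": list(principals),
    }]


if __name__ == "__main__":
    policy = build_network_policy(
        "my-collection", ["application.opensearchservice.amazonaws.com"]
    )
    print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the policy document when creating or updating the policy.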

Check existing policies:

# List network policies
aws opensearchserverless list-security-policies --type network --region us-east-1
 
# List data access policies
aws opensearchserverless list-access-policies --type data --region us-east-1

Direct query source failures

S3 direct query not connecting

Verify the Glue database and table exist:

aws glue get-table \
    --database-name my_database \
    --name my_table \
    --region us-east-1

Check that the IAM role used for direct query has the required permissions:

{
  "Effect": "Allow",
  "Action": [
    "glue:GetTable",
    "glue:GetDatabase",
    "s3:GetObject",
    "s3:ListBucket"
  ],
  "Resource": "*"
}

Verify that the domain runs OpenSearch 2.13 or later, which direct query requires.
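
Version strings are easy to compare incorrectly: as text, "2.9" sorts after "2.13". The sketch below compares the numeric parts instead. It assumes the `EngineVersion` value reported by `describe-domain` has the form `OpenSearch_2.13` (or `Elasticsearch_7.10` for legacy domains); `supports_direct_query` is a hypothetical helper name.

```python
from typing import Tuple


def supports_direct_query(
    engine_version: str, minimum: Tuple[int, int] = (2, 13)
) -> bool:
    """True if engine_version is OpenSearch at or above `minimum` (major, minor)."""
    engine, _, version = engine_version.partition("_")
    if engine != "OpenSearch":
        return False
    try:
        major, minor = (int(part) for part in version.split(".")[:2])
    except ValueError:
        # Missing or non-numeric version components.
        return False
    return (major, minor) >= minimum
```

For example, `supports_direct_query("OpenSearch_2.9")` is false even though "2.9" looks larger than "2.13" when read as a decimal.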

CloudWatch direct query not working

- Ensure the CloudWatch log group exists and has data.
- Verify that the IAM role has logs:GetLogEvents and logs:DescribeLogGroups permissions.
- Check that the direct query data source is configured correctly in the OpenSearch UI.

Connection recovery

Data source was healthy but is now unavailable

If a previously working data source becomes unavailable:

1. Check domain status: the domain may be undergoing a configuration change.
2. Check the access policy: someone may have modified the domain's access policy.
3. Check VPC authorization: the authorization may have been revoked.
4. Check security groups: rules may have been modified.
5. Check for service events: AWS may be experiencing issues in the region.

Forcing a reconnection

There's no explicit "reconnect" action. To force the application to re-establish the connection:

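Remove the data source (an empty list removes every data source; to keep the others, pass the list without the one being reset):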

aws opensearch update-application \
    --id app-abc123def456 \
    --data-sources '[]' \
    --region us-east-1

Wait 2-3 minutes

Re-add the data source:

aws opensearch update-application \
    --id app-abc123def456 \
    --data-sources '[{
      "dataSourceArn": "arn:aws:es:us-east-1:123456789012:domain/my-domain",
      "dataSourceDescription": "My domain"
    }]' \
    --region us-east-1

Remember: update-application replaces the entire data source list. Include all data sources you want to keep.
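
Because the list is replaced wholesale, it is safer to derive both `--data-sources` values from the application's current list than to type them by hand. The sketch below assumes `current` is that list (for example, taken from the `get-application` output), with each entry carrying a `dataSourceArn` key as in the command above. `reconnect_payloads` is a hypothetical helper name.

```python
import json
from typing import Dict, List, Tuple

DataSource = Dict[str, str]


def reconnect_payloads(current: List[DataSource], arn: str) -> Tuple[str, str]:
    """Build the two --data-sources JSON values for a remove/re-add cycle.

    Returns (list without `arn`, full original list), so every other
    data source survives both update-application calls.
    """
    if not any(ds.get("dataSourceArn") == arn for ds in current):
        raise ValueError(f"{arn} is not attached to the application")
    remaining = [ds for ds in current if ds.get("dataSourceArn") != arn]
    return json.dumps(remaining), json.dumps(current)
```

The first string goes into the removal call and the second into the re-add call. When the application has only one data source, the first string is `[]`, which matches the example above.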

Diagnostic commands reference

Quick reference for common diagnostic commands:

# Check application status and data sources
aws opensearch get-application --id app-abc123def456
 
# Check domain health
aws opensearch describe-domain --domain-name my-domain
 
# Check VPC endpoint authorization
aws opensearch list-vpc-endpoint-access --domain-name my-domain
 
# Check domain access policy
aws opensearch describe-domain-config \
    --domain-name my-domain \
    --query 'DomainConfig.AccessPolicies'
 
# List serverless network policies
aws opensearchserverless list-security-policies --type network
 
# List serverless data access policies
aws opensearchserverless list-access-policies --type data
 
# Check cross-cluster connections
aws opensearch describe-inbound-connections
aws opensearch describe-outbound-connections
 
# Check CloudTrail for recent errors
# (GNU date syntax; on macOS/BSD use `date -u -v-1H +%Y-%m-%dT%H:%M:%SZ` instead)
aws cloudtrail lookup-events \
    --lookup-attributes AttributeKey=EventName,AttributeValue=UpdateApplication \
    --start-time "$(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ)"
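
The `date -d '1 hour ago'` form used in these commands is specific to GNU date. Where that is not available, the same timestamps can be produced portably. The following is a small sketch with a hypothetical helper name, `lookback_window`; it returns start and end times in the `YYYY-MM-DDTHH:MM:SSZ` format used above. A `now` value passed without a time zone is interpreted as local time.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple


def lookback_window(
    hours: int = 1, now: Optional[datetime] = None
) -> Tuple[str, str]:
    """Return (start, end) UTC timestamps covering the last `hours` hours."""
    end = now if now is not None else datetime.now(timezone.utc)
    end = end.astimezone(timezone.utc)  # normalize so the 'Z' suffix is accurate
    start = end - timedelta(hours=hours)
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    return start.strftime(fmt), end.strftime(fmt)


if __name__ == "__main__":
    start, end = lookback_window(1)
    print(f"--start-time {start} --end-time {end}")
```

The printed values can be substituted for the `$(date ...)` expressions in the CloudWatch and CloudTrail commands.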

Getting help

If you've exhausted this guide:

1. Collect the application ID, domain name, and region.
2. Run the diagnostic commands above and save the output.
3. Check CloudTrail for any error events in the last hour.
4. Open a support case with AWS Support, including all collected information.