Data source connection issues can occur when associating OpenSearch domains or serverless collections with your application, or when an existing connection becomes unhealthy. This guide covers the most common failure scenarios and their resolutions.
Data Source Connection
Data source health statuses
When you view your application's data sources, each one shows a health status:
| Status | Meaning | Action Required |
|---|---|---|
| Green / Healthy | Connected and operational | None |
| Yellow | Connected but degraded (replica shards unassigned) | Monitor — may self-resolve |
| Red / Unhealthy | Connected but critical issues (primary shards unavailable) | Investigate domain health |
| Unavailable | Cannot establish connection | Check permissions, VPC, domain status |
Association failures
"Domain not found" error
Symptoms: When calling update-application, you get an error indicating the domain doesn't exist.
Common causes:
- Wrong region: The domain is in a different region than the application
  # Check which region the domain is in
  aws opensearch list-domain-names --region us-east-1
  aws opensearch list-domain-names --region us-west-2
- Wrong account: The domain is in a different AWS account
  - For cross-account access, see Cross-Account Access
- Domain name typo: The ARN contains an incorrect domain name
  # List all domains to verify the name
  aws opensearch list-domain-names --region us-east-1
- Domain was deleted: The domain no longer exists
  aws opensearch describe-domain --domain-name my-domain --region us-east-1
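The region and typo checks above can be combined into a single sweep. The following is a minimal sketch, not an official tool: `find_domain_region` is a hypothetical helper name, the region list is whatever you pass in, and it assumes your credentials are allowed to call `list-domain-names` in each region.

```shell
# Hypothetical helper: print every region (from the list given) that
# contains a domain whose name is exactly $1.
find_domain_region() {
  domain="$1"
  shift
  for region in "$@"; do
    # --output text prints the matching names separated by tabs.
    # A failed call (for example, a disabled region) is skipped.
    names=$(aws opensearch list-domain-names \
      --region "$region" \
      --query 'DomainNames[].DomainName' \
      --output text 2>/dev/null) || continue
    for name in $names; do
      if [ "$name" = "$domain" ]; then
        echo "$region"
      fi
    done
  done
}

# Usage (the region list is an example):
# find_domain_region my-domain us-east-1 us-west-2 eu-west-1
```

No output means the name was not found in any of the listed regions for the current account, which points to a typo, a deleted domain, or a different account.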
"Access denied" when associating a data source
Symptoms: The update-application call fails with an access denied error.
Diagnosis:
- Check which IAM identity is making the call:
  aws sts get-caller-identity
- Verify that identity has the required permissions:
  {
    "Version": "2012-10-17",
    "Statement": [
      {
        "Effect": "Allow",
        "Action": [
          "es:UpdateApplication",
          "es:DescribeDomain",
          "es:DescribeDomains"
        ],
        "Resource": "*"
      }
    ]
  }
- Check the domain's access policy:
  aws opensearch describe-domain-config \
    --domain-name my-domain \
    --query 'DomainConfig.AccessPolicies' \
    --region us-east-1
Solution:
The domain's access policy must allow the OpenSearch UI service:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Service": "application.opensearchservice.amazonaws.com"
},
"Action": "es:ESHttp*",
"Resource": "arn:aws:es:us-east-1:123456789012:domain/my-domain/*"
}
]
}
"Validation error" on data source ARN
Symptoms: The API rejects the data source ARN format.
Common ARN format issues:
| Data Source Type | Correct ARN Format |
|---|---|
| Managed domain | arn:aws:es:REGION:ACCOUNT:domain/DOMAIN_NAME |
| Serverless collection | arn:aws:aoss:REGION:ACCOUNT:collection/COLLECTION_ID |
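The two formats in the table can be checked locally before calling the API. This is a rough sketch under stated assumptions: `classify_data_source_arn` is a hypothetical helper, the patterns cover only the standard `aws` partition, and they approximate the naming rules (domain names of 3 to 28 lowercase letters, digits, and hyphens; collection IDs of lowercase letters and digits). The service performs the authoritative validation.

```shell
# Hypothetical helper: classify a data source ARN by its shape.
# Prints "domain", "collection", or "invalid" (and returns non-zero).
classify_data_source_arn() {
  if printf '%s\n' "$1" | grep -Eq \
    '^arn:aws:es:[a-z0-9-]+:[0-9]{12}:domain/[a-z][a-z0-9-]{2,27}$'; then
    echo "domain"
  elif printf '%s\n' "$1" | grep -Eq \
    '^arn:aws:aoss:[a-z0-9-]+:[0-9]{12}:collection/[a-z0-9]+$'; then
    echo "collection"
  else
    echo "invalid"
    return 1
  fi
}

# Usage:
# classify_data_source_arn "arn:aws:es:us-east-1:123456789012:domain/my-domain"
```

A collection name that contains a hyphen is reported as invalid, which catches the common name-instead-of-ID mistake. A name made only of lowercase letters and digits cannot be told apart from an ID this way.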
Note that serverless collections use aoss (not es) and use the collection ID (not name):
# Get the collection ID
aws opensearchserverless list-collections \
--query 'collectionSummaries[?name==`my-collection`].id' \
--region us-east-1
Cross-account data source issues
Connection shows as "Pending"
The inbound cross-cluster connection hasn't been accepted.
# List pending connections in Account A
aws opensearch describe-inbound-connections \
--filters '[{"Name": "connection-status", "Values": ["PENDING_ACCEPTANCE"]}]' \
--region us-east-1
Accept the connection:
aws opensearch accept-inbound-connection \
--connection-id conn-abc123 \
--region us-east-1
Cross-account domain shows "Unavailable"
- Verify the cross-cluster connection is Active:
  aws opensearch describe-outbound-connections \
    --filters '[{"Name": "connection-status", "Values": ["ACTIVE"]}]' \
    --region us-west-2
- Check that the remote domain's access policy includes the source account
- For VPC domains, verify that VPC endpoint authorization includes the source account. If it is missing, grant it:
  aws opensearch authorize-vpc-endpoint-access \
    --domain-name remote-domain \
    --service application.opensearchservice.amazonaws.com \
    --account 123456789012 \
    --region us-west-2
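Before granting anything, it can help to confirm what is already authorized. The sketch below wraps `list-vpc-endpoint-access`; `vpc_access_includes` is a hypothetical helper name, and it assumes the response lists principals under `AuthorizedPrincipalList[].Principal`.

```shell
# Hypothetical helper: succeed if the domain's VPC endpoint authorization
# list contains a principal matching $2 (an account ID or service name).
# $1 = domain name, $2 = principal to look for, $3 = region.
vpc_access_includes() {
  aws opensearch list-vpc-endpoint-access \
    --domain-name "$1" \
    --region "$3" \
    --query 'AuthorizedPrincipalList[].Principal' \
    --output text | tr '\t' '\n' | grep -Fq -- "$2"
}

# Usage:
# if vpc_access_includes remote-domain 123456789012 us-west-2; then
#   echo "already authorized"
# else
#   echo "not authorized; run authorize-vpc-endpoint-access"
# fi
```

The match is a substring match, so it works whether the principal is reported as a bare account ID or as part of a longer identifier.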
Data source showing as "Unhealthy"
An unhealthy data source means the application can connect but the domain itself has issues.
Diagnosis
Check the domain's cluster health:
aws opensearch describe-domain \
--domain-name my-domain \
--region us-east-1 \
--query 'DomainStatus.{
Processing: Processing,
EngineVersion: EngineVersion,
ClusterConfig: ClusterConfig,
Endpoint: Endpoint
}'
Check CloudWatch metrics for the domain:
# Check cluster status
aws cloudwatch get-metric-statistics \
--namespace AWS/ES \
--metric-name "ClusterStatus.red" \
--dimensions Name=DomainName,Value=my-domain Name=ClientId,Value=123456789012 \
--start-time $(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ) \
--end-time $(date -u +%Y-%m-%dT%H:%M:%SZ) \
--period 300 \
--statistics Maximum
Common causes of unhealthy status
| Cause | Symptoms | Resolution |
|---|---|---|
| Red cluster status | Primary shards unavailable | Check disk space, node health |
| Domain processing | Configuration change in progress | Wait for processing to complete |
| High JVM memory pressure | Slow queries, timeouts | Scale up instance type or add nodes |
| Disk space full | Write rejections | Delete old indices or increase storage |
| Too many shards | Cluster instability | Reduce shard count, use ISM policies |
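Several of the causes in the table can be narrowed down from CloudWatch in one pass. The sketch below is an assumption-laden starting point rather than a complete health check: the helper names are hypothetical, the metric selection (`ClusterStatus.red`, `JVMMemoryPressure`, `ClusterIndexWritesBlocked`, `Shards.active`) is one reasonable choice from the `AWS/ES` namespace, and no thresholds are applied.

```shell
# Hypothetical helper: print the maximum of one AWS/ES metric over a window.
# $1 = metric name, $2 = domain, $3 = account ID, $4 = region,
# $5 = start time, $6 = end time (ISO 8601, UTC).
domain_metric_max() {
  aws cloudwatch get-metric-statistics \
    --namespace AWS/ES \
    --metric-name "$1" \
    --dimensions "Name=DomainName,Value=$2" "Name=ClientId,Value=$3" \
    --region "$4" \
    --start-time "$5" \
    --end-time "$6" \
    --period 300 \
    --statistics Maximum \
    --query 'max(Datapoints[].Maximum)' \
    --output text
}

# Hypothetical helper: one "metric<TAB>value" line per metric.
# Arguments: domain, account ID, region, start time, end time.
domain_health_summary() {
  for metric in ClusterStatus.red JVMMemoryPressure \
    ClusterIndexWritesBlocked Shards.active; do
    printf '%s\t%s\n' "$metric" "$(domain_metric_max "$metric" "$@")"
  done
}

# Usage (GNU date syntax, as elsewhere in this guide):
# domain_health_summary my-domain 123456789012 us-east-1 \
#   "$(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ)" \
#   "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
```

A value of `None` means CloudWatch returned no datapoints for that metric in the window.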
Quick fixes
Disk space full:
# Check per-node disk usage from the Dev Tools console
GET _cat/allocation?v
High memory pressure:
- Scale up the instance type in the domain configuration
- Reduce concurrent query load
Too many shards:
- Use Index State Management (ISM) policies to automatically delete or rollover old indices
- Merge small indices
Serverless collection connection issues
Collection not appearing in data source list
- Verify the collection exists and is Active:
  aws opensearchserverless list-collections --region us-east-1
- Check that the collection is in the same region as the application
- Verify your IAM permissions include aoss:BatchGetCollection
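The first and third checks can be exercised together, because `batch-get-collection` needs the `aoss:BatchGetCollection` permission. A minimal sketch, assuming the collection is looked up by name; `collection_is_active` and its exit codes are hypothetical conventions.

```shell
# Hypothetical helper.
# $1 = collection name, $2 = region.
# Returns 0 if the collection is ACTIVE, 1 if it is missing or in another
# state, 2 if the API call itself failed (for example, access denied).
collection_is_active() {
  status=$(aws opensearchserverless batch-get-collection \
    --names "$1" \
    --region "$2" \
    --query 'collectionDetails[0].status' \
    --output text) || return 2
  [ "$status" = "ACTIVE" ]
}

# Usage:
# collection_is_active my-collection us-east-1; echo "exit code: $?"
```

An exit code of 2 is the signal to revisit IAM permissions, while 1 points at the collection itself or the region.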
"Access denied" on serverless collection
Serverless collections require both a network policy and a data access policy:
Network policy (allows the service to connect):
[{
"Rules": [{
"ResourceType": "collection",
"Resource": ["collection/my-collection"]
}],
"SourceServices": ["application.opensearchservice.amazonaws.com"]
}]
Data access policy (allows the service to read data):
[{
"Rules": [{
"ResourceType": "index",
"Resource": ["index/my-collection/*"],
"Permission": ["aoss:ReadDocument", "aoss:DescribeIndex"]
}],
"Principal": ["arn:aws:iam::123456789012:role/OpenSearchUIServiceRole"]
}]
Check existing policies:
# List network policies
aws opensearchserverless list-security-policies --type network --region us-east-1
# List data access policies
aws opensearchserverless list-access-policies --type data --region us-east-1
Direct query source failures
S3 direct query not connecting
- Verify the Glue database and table exist:
  aws glue get-table \
    --database-name my_database \
    --name my_table \
    --region us-east-1
- Check that the IAM role used for direct query has these permissions:
  {
    "Effect": "Allow",
    "Action": [
      "glue:GetTable",
      "glue:GetDatabase",
      "s3:GetObject",
      "s3:ListBucket"
    ],
    "Resource": "*"
  }
- Verify the OpenSearch domain version is 2.13 or later (direct query requires it)
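The version check is easy to get wrong by eye, since 2.9 is older than 2.13 even though it sorts later as text. The sketch below compares the numeric parts; `supports_direct_query` is a hypothetical helper, and it assumes the `EngineVersion` string has the form `OpenSearch_MAJOR.MINOR`.

```shell
# Hypothetical helper: succeed if an EngineVersion string such as
# "OpenSearch_2.13" is OpenSearch 2.13 or later.
supports_direct_query() {
  case "$1" in
    OpenSearch_*.*) ;;   # expected form, e.g. OpenSearch_2.13
    *) return 1 ;;       # Elasticsearch_* or an unexpected format
  esac
  version="${1#OpenSearch_}"
  major="${version%%.*}"
  minor="${version#*.}"
  minor="${minor%%.*}"
  # Compare numerically: major above 2, or major 2 with minor at least 13.
  [ "$major" -gt 2 ] || { [ "$major" -eq 2 ] && [ "$minor" -ge 13 ]; }
}

# Usage:
# supports_direct_query "$(aws opensearch describe-domain \
#   --domain-name my-domain --region us-east-1 \
#   --query 'DomainStatus.EngineVersion' --output text)" \
#   && echo "version OK" || echo "upgrade required"
```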
CloudWatch direct query not working
- Ensure the CloudWatch log group exists and has data
- Verify the IAM role has logs:GetLogEvents and logs:DescribeLogGroups permissions
- Check that the direct query data source is configured correctly in the OpenSearch UI
Connection recovery
Data source was healthy but is now unavailable
If a previously working data source becomes unavailable:
- Check domain status — The domain may be undergoing a configuration change
- Check access policy — Someone may have modified the domain's access policy
- Check VPC authorization — The authorization may have been revoked
- Check security groups — Rules may have been modified
- Check for service events — AWS may be experiencing issues in the region
Forcing a reconnection
There's no explicit "reconnect" action. To force the application to re-establish the connection:
- Remove the data source:
  aws opensearch update-application \
    --id app-abc123def456 \
    --data-sources '[]' \
    --region us-east-1
- Wait 2-3 minutes
- Re-add the data source:
  aws opensearch update-application \
    --id app-abc123def456 \
    --data-sources '[{
      "dataSourceArn": "arn:aws:es:us-east-1:123456789012:domain/my-domain",
      "dataSourceDescription": "My domain"
    }]' \
    --region us-east-1
Remember: update-application replaces the entire data source list. Include all data sources you want to keep.
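When the application has more than one data source, the remove and re-add steps need to carry the other entries along. The following sketch shows one way to do that; `reconnect_data_source` is a hypothetical helper, and it assumes `get-application` returns the current list under `dataSources` in the same shape that `--data-sources` accepts.

```shell
# Hypothetical helper: detach one data source and re-attach it while
# preserving every other data source on the application.
# $1 = application ID, $2 = data source ARN, $3 = region,
# $4 = seconds to wait between the two updates (default 180).
reconnect_data_source() {
  # Full current list, and the same list without the target ARN.
  all=$(aws opensearch get-application --id "$1" --region "$3" \
    --query 'dataSources' --output json) || return 1
  rest=$(aws opensearch get-application --id "$1" --region "$3" \
    --query "dataSources[?dataSourceArn!='$2']" --output json) || return 1

  # Step 1: update with everything except the target.
  aws opensearch update-application --id "$1" --region "$3" \
    --data-sources "$rest" || return 1
  sleep "${4:-180}"
  # Step 2: restore the original list, which re-adds the target.
  aws opensearch update-application --id "$1" --region "$3" \
    --data-sources "$all"
}

# Usage:
# reconnect_data_source app-abc123def456 \
#   arn:aws:es:us-east-1:123456789012:domain/my-domain us-east-1
```

Capturing the full list before the first update means the second update restores the original descriptions as well as the ARNs.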
Diagnostic commands reference
Quick reference for common diagnostic commands:
# Check application status and data sources
aws opensearch get-application --id app-abc123def456
# Check domain health
aws opensearch describe-domain --domain-name my-domain
# Check VPC endpoint authorization
aws opensearch list-vpc-endpoint-access --domain-name my-domain
# Check domain access policy
aws opensearch describe-domain-config \
--domain-name my-domain \
--query 'DomainConfig.AccessPolicies'
# List serverless network policies
aws opensearchserverless list-security-policies --type network
# List serverless data access policies
aws opensearchserverless list-access-policies --type data
# Check cross-cluster connections
aws opensearch describe-inbound-connections
aws opensearch describe-outbound-connections
# Check CloudTrail for recent errors
aws cloudtrail lookup-events \
--lookup-attributes AttributeKey=EventName,AttributeValue=UpdateApplication \
--start-time $(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ)
Getting help
If you've exhausted this guide:
- Collect the application ID, domain name, and region
- Run the diagnostic commands above and save the output
- Check CloudTrail for any error events in the last hour
- Open a support case with AWS Support including all collected information
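The collection steps above can be scripted so that nothing is missed. This is a sketch only: `collect_diagnostics` and the output file names are hypothetical, and it runs a subset of the diagnostic commands from this guide.

```shell
# Hypothetical helper: save diagnostic output to a directory that can be
# attached to a support case.
# $1 = application ID, $2 = domain name, $3 = region, $4 = output directory.
collect_diagnostics() {
  mkdir -p "$4" || return 1
  # stdout and stderr both go to the file, so a failing call
  # (for example, access denied) is captured rather than lost.
  aws opensearch get-application --id "$1" --region "$3" \
    > "$4/application.json" 2>&1
  aws opensearch describe-domain --domain-name "$2" --region "$3" \
    > "$4/domain.json" 2>&1
  aws opensearch describe-domain-config --domain-name "$2" --region "$3" \
    --query 'DomainConfig.AccessPolicies' > "$4/access-policy.json" 2>&1
  aws opensearch list-vpc-endpoint-access --domain-name "$2" --region "$3" \
    > "$4/vpc-endpoint-access.json" 2>&1
  aws sts get-caller-identity > "$4/caller-identity.json" 2>&1
  echo "Diagnostics written to $4"
}

# Usage:
# collect_diagnostics app-abc123def456 my-domain us-east-1 ./opensearch-diagnostics
```

Review the files before sharing them, since access policies and caller identity include account IDs and role names.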