Debugging DNS Issues in Docker: A Practical Breakdown
A real-world troubleshooting breakdown of DNS resolution issues between Docker containers and the host system, and how the issue was resolved.
This post covers a real issue encountered while running containerised services: DNS resolution behaving differently between the host system and Docker containers.
Problems like this are valuable because they move beyond setup guides and require understanding how multiple layers interact.
The Problem
Name resolution worked correctly from the host, but some containers could not resolve internal DNS records that the host resolved without trouble.
This created inconsistent behaviour:
- Host system could reach services
- Some containers could not
- External access still worked
At first glance the failures looked random, but they traced back to the host and the affected containers following different DNS resolution paths.
Initial Checks
The first step was to confirm what was working and what was failing.
This included checking:
- DNS resolution from the host
- DNS resolution from affected containers
- Reverse proxy behaviour
- Network connectivity between containers and the DNS server
This helped narrow the issue to container DNS rather than general network failure.
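The host and container checks above are easier to compare when the same script runs unchanged in both places. The following is a minimal sketch, not the exact commands used during this investigation. The helper name check_names is hypothetical, and the hostnames are whatever is passed on the command line.

```python
import socket
import sys


def check_names(names, resolver=socket.getaddrinfo):
    """Resolve each name with the system resolver and record the outcome.

    Returns a dict mapping each name to a sorted list of addresses,
    or to an error string if resolution failed.
    """
    results = {}
    for name in names:
        try:
            infos = resolver(name, None)
            # Each entry is (family, type, proto, canonname, sockaddr);
            # sockaddr[0] is the IP address for both IPv4 and IPv6.
            results[name] = sorted({info[4][0] for info in infos})
        except socket.gaierror as exc:
            results[name] = f"FAILED: {exc}"
    return results


if __name__ == "__main__":
    # Names come from the command line, so the same script can be run
    # on the host and inside an affected container.
    for name, outcome in check_names(sys.argv[1:]).items():
        print(f"{name}: {outcome}")
```

Running it with one internal record and one public name on the host, then again inside an affected container, shows whether the two environments disagree on the internal name only or on everything.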
Architecture Context
The environment used:
- Pi-hole for DNS filtering and local records
- Unbound for recursive resolution
- Docker for workloads
- Reverse proxy for service access
This meant multiple components could potentially influence name resolution.
Root Cause
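The key issue was that containers were not always using the same DNS path as the host.
Depending on configuration, containers may:
- Use Docker's embedded DNS server (127.0.0.11), which is the default on user-defined networks and forwards names it cannot answer itself to upstream servers
- Use DNS settings inherited from the host, which is the default on the default bridge network, where the container receives a copy of the host's /etc/resolv.conf
- Use explicitly defined DNS servers, set per container (for example with --dns) or for the whole Docker daemon
Because each of these can end at a different upstream server, the same name can resolve on the host and fail inside a container.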
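One way to see which of these cases applies is to read /etc/resolv.conf inside the container. The sketch below assumes the standard resolv.conf format and relies on the fact that Docker's embedded DNS server listens on 127.0.0.11. The function names are hypothetical, and the output is a hint about the first hop only, not a full trace of the resolution path.

```python
from pathlib import Path

# Address of Docker's embedded DNS server inside a container.
DOCKER_EMBEDDED_DNS = "127.0.0.11"


def parse_nameservers(resolv_conf_text):
    """Return the nameserver addresses listed in resolv.conf content."""
    servers = []
    for line in resolv_conf_text.splitlines():
        parts = line.split()
        # Comment lines start with '#' or ';' and so never match here.
        if len(parts) >= 2 and parts[0] == "nameserver":
            servers.append(parts[1])
    return servers


def describe_resolver(servers):
    """Summarise which kind of resolver the nameserver list points at."""
    if not servers:
        return "no nameserver entries found"
    if DOCKER_EMBEDDED_DNS in servers:
        return "Docker embedded DNS (forwards to upstream servers)"
    return "direct nameservers: " + ", ".join(servers)


if __name__ == "__main__":
    path = Path("/etc/resolv.conf")
    if path.exists():
        print(describe_resolver(parse_nameservers(path.read_text())))
```

If the container reports the embedded DNS server, the upstream servers it forwards to still need checking, since those decide whether local records are visible.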
Simplified DNS Path Comparison
Host path:
Host System → Pi-hole → Unbound
Container path:
Container → Docker DNS / Alternate Resolver → Different Result
Resolution
The fix involved making DNS behaviour explicit rather than relying on defaults.
This included:
- Verifying which resolver containers were using
- Updating container DNS configuration where required
- Retesting resolution from inside containers
- Confirming consistent behaviour across services
Once DNS paths were aligned, service resolution became consistent.
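As an illustration of making the resolver explicit and retesting from inside a container, the sketch below builds a docker run command that uses the --dns flag. It is not the exact configuration used here. The DNS address and hostname are placeholders from documentation ranges, and it assumes a busybox image whose nslookup applet is available. The command is only printed, not executed.

```python
import shlex


def docker_dns_test_command(name, dns_server, image="busybox"):
    """Build a `docker run` command that resolves `name` through an
    explicitly chosen DNS server from inside a throwaway container."""
    return [
        "docker", "run", "--rm",
        "--dns", dns_server,  # override the container's resolver
        image,
        "nslookup", name,
    ]


if __name__ == "__main__":
    # Placeholder values: substitute the address of the intended DNS
    # server (for example the Pi-hole host) and a real local record.
    cmd = docker_dns_test_command("service.home.example", "192.0.2.53")
    print(shlex.join(cmd))
```

Passing the returned list to subprocess.run would execute the check. For services managed with Docker Compose, the per-service dns key plays the same role as the --dns flag.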
Why This Matters
Issues like this can be misleading because applications appear broken when the real problem is infrastructure.
A web service that fails to connect may actually be suffering from:
- A DNS failure
- A network path issue
- A resolver mismatch
Understanding dependencies is critical for effective troubleshooting.
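The first two causes can be told apart by testing name resolution and connectivity as separate steps. The following is a minimal sketch under that assumption. The function name diagnose and its return strings are hypothetical, and the resolve and connect parameters exist only so the steps can be swapped out when testing.

```python
import socket


def diagnose(host, port, timeout=3.0,
             resolve=socket.getaddrinfo,
             connect=socket.create_connection):
    """Classify a connection attempt as a DNS failure, a network failure, or ok."""
    # Step 1: name resolution only.
    try:
        infos = resolve(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return f"dns_failure: {exc}"

    # Step 2: TCP connection to the first resolved address.
    address = infos[0][4][0]
    try:
        conn = connect((address, port), timeout=timeout)
    except OSError as exc:
        return f"network_failure: resolved to {address}, but connect failed: {exc}"
    conn.close()
    return f"ok: {host} resolved to {address} and accepted a connection"
```

A resolver mismatch does not show up in a single run. It appears when the same call succeeds on the host and returns dns_failure, or a different address, inside a container.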
Key Learnings
- Defaults should not always be trusted in multi-layer environments
- Host behaviour and container behaviour can differ significantly
- DNS issues often appear as application issues
- Troubleshooting improves when each layer is tested separately
Takeaway
The most useful lesson from this issue was not the specific fix, but the process:
- Observe behaviour
- Isolate variables
- Test assumptions
- Validate the final state
That approach applies to almost every infrastructure problem.