Disaster Recovery Testing
DRaaS test procedures and validation
DR/BC, Infrastructure
Why Test DR?
Regular DR testing confirms that VMs and services fail over as expected, so you know your DR plan will work when you actually need it.
Types of DR Tests
Low Impact
Bubble Test
Replicated VMs are powered on at the recovery site in an isolated network "bubble". Production remains online during testing.
Validates:
- Server OS integrity
- Data integrity
- Boot sequence
Limitations:
- No inter-VLAN connectivity
- No internet access
- Console access only (no RDP)
Production Impact
Live Failover
Production servers are powered off while replicated copies are built and powered on at the recovery site. This is a full validation of DR capabilities.
Process:
- Production powered off
- Failover to recovery site (~1 hour typical)
- Validation at recovery site
- Commit failover (deletes production copies)
- Reverse replication begins (up to 48 hours)
- Fail back to production
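The sequence above can be written down as an ordered runbook so each step is timed and none is skipped. The following is a minimal sketch, not a DR provider API: the `Step` class and `estimate_window` function are hypothetical names. The two durations come from the figures quoted above, and every other duration is left as `None` rather than guessed.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Step:
    name: str
    max_hours: Optional[float]  # None = no figure is given in this document


# Steps in the order listed above. Only two durations are stated in the text.
LIVE_FAILOVER_STEPS = [
    Step("Production powered off", None),
    Step("Failover to recovery site", 1.0),   # ~1 hour typical
    Step("Validation at recovery site", None),
    Step("Commit failover", None),
    Step("Reverse replication", 48.0),        # up to 48 hours
    Step("Fail back to production", None),
]


def estimate_window(steps: List[Step]) -> Tuple[float, List[str]]:
    """Return (sum of known hours, names of steps with no estimate yet)."""
    known = sum(s.max_hours for s in steps if s.max_hours is not None)
    unknown = [s.name for s in steps if s.max_hours is None]
    return known, unknown


if __name__ == "__main__":
    known, unknown = estimate_window(LIVE_FAILOVER_STEPS)
    print(f"Known duration: up to {known:.0f} hours")
    print("Steps still needing an estimate:", ", ".join(unknown))
```

Listing the unestimated steps separately makes it clear which timings still need to be agreed with the DR provider before the maintenance window is set.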
Test Only
Live Failover (No Commit)
Similar to a live failover, but the failover is cancelled instead of committed, so production returns to its pre-test state.
Warning: Any changes made to servers at the recovery site will be lost when the test is completed.
Planning a DR Test
Test Plan Components
Before testing, define what "success" looks like across these areas:
Infrastructure
- VMs boot successfully
- Network connectivity
- Storage accessible
- DNS resolution
Application
- Services start
- Database connectivity
- Application responds
- Data integrity
User Acceptance
- Users can log in
- Core functions work
- Performance acceptable
- Data is current
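One way to make "success" concrete is to record a pass or fail result against every criterion and treat anything unrecorded as outstanding. The sketch below assumes the results are entered by hand after the test. `TEST_PLAN`, `evaluate`, and `plan_passed` are hypothetical names used only for illustration.

```python
from typing import Dict, List, Tuple

# Criteria copied from the three areas listed above.
TEST_PLAN: Dict[str, List[str]] = {
    "Infrastructure": [
        "VMs boot successfully",
        "Network connectivity",
        "Storage accessible",
        "DNS resolution",
    ],
    "Application": [
        "Services start",
        "Database connectivity",
        "Application responds",
        "Data integrity",
    ],
    "User Acceptance": [
        "Users can log in",
        "Core functions work",
        "Performance acceptable",
        "Data is current",
    ],
}

Results = Dict[Tuple[str, str], bool]  # (area, criterion) -> passed?


def evaluate(results: Results) -> Dict[str, List[str]]:
    """Return, per area, the criteria that failed or were never recorded."""
    outstanding: Dict[str, List[str]] = {}
    for area, criteria in TEST_PLAN.items():
        missing = [c for c in criteria if not results.get((area, c), False)]
        if missing:
            outstanding[area] = missing
    return outstanding


def plan_passed(results: Results) -> bool:
    """The test succeeds only when every criterion has a recorded pass."""
    return not evaluate(results)


if __name__ == "__main__":
    # Made-up sample results: everything passes except one application check.
    results = {(a, c): True for a, cs in TEST_PLAN.items() for c in cs}
    results[("Application", "Database connectivity")] = False
    print(evaluate(results))
    print("Passed:", plan_passed(results))
```

Treating a missing result the same as a failure keeps a criterion from being counted as passed simply because nobody checked it.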
Scheduling Considerations
When to Schedule
- After initial DRaaS implementation
- Quarterly or semi-annually
- After major infrastructure changes
- After adding new critical systems
Lead Time Required
- Schedule several weeks in advance
- Coordinate with DR provider
- Notify affected stakeholders
- Prepare test plan documentation
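The cadence and lead time above translate into two dates: when the next test is due and when planning for it should start. The sketch below is illustrative only. It assumes 91 days as an approximation of "quarterly" and 4 weeks as a stand-in for "several weeks"; neither number comes from this document, so adjust both to your own policy and your provider's requirements. `next_test_dates` is a hypothetical helper.

```python
from datetime import date, timedelta
from typing import Tuple


def next_test_dates(
    last_test: date,
    interval_days: int = 91,  # assumption: roughly quarterly
    lead_weeks: int = 4,      # assumption: stand-in for "several weeks"
) -> Tuple[date, date]:
    """Return (next test due date, date to start planning and coordination)."""
    next_test = last_test + timedelta(days=interval_days)
    start_planning = next_test - timedelta(weeks=lead_weeks)
    return next_test, start_planning


if __name__ == "__main__":
    due, plan_from = next_test_dates(date(2024, 1, 1))
    print("Next test due:", due.isoformat())
    print("Start planning by:", plan_from.isoformat())
```

For a semi-annual cadence, pass a larger `interval_days`. A test triggered by a major infrastructure change or a new critical system would be scheduled from the change date instead of the last test date.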
Best Practices
- Complete initial test early: Don't postpone your first DR test after implementation. It validates that your DR solution actually works.
- Document everything: Keep detailed records of each test including timing, issues encountered, and resolution steps.
- Involve stakeholders: Include application owners and key users in validation to ensure business-critical functions are tested.
- Test recovery procedures: Don't just validate failover. Test the full recovery process, including fail-back.
- Update runbooks: Use test results to improve your DR runbooks and documentation.
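To make "document everything" repeatable, each test can be captured in the same structured record covering timing, issues encountered, and resolution steps. This is a minimal sketch under stated assumptions: the `DRTestRecord` and `Issue` classes and their field names are hypothetical, and the values in the usage example are made up for illustration.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Issue:
    description: str
    resolution: str = ""  # left empty until the issue is resolved


@dataclass
class DRTestRecord:
    test_type: str            # e.g. "Bubble Test" or "Live Failover"
    test_date: str            # ISO 8601 date string
    failover_minutes: float   # measured failover time for this test
    issues: List[Issue] = field(default_factory=list)

    def unresolved(self) -> List[Issue]:
        """Issues that still need a resolution step and a runbook update."""
        return [i for i in self.issues if not i.resolution]

    def to_json(self) -> str:
        """Serialize the record so it can be stored alongside the runbook."""
        return json.dumps(asdict(self), indent=2)


if __name__ == "__main__":
    # Made-up example values, not results from a real test.
    record = DRTestRecord(
        test_type="Bubble Test",
        test_date="2024-04-01",
        failover_minutes=55.0,
        issues=[
            Issue("Service did not start on boot", "Set startup type to automatic"),
            Issue("Application could not reach database"),
        ],
    )
    print(record.to_json())
    print("Unresolved:", [i.description for i in record.unresolved()])
```

Keeping unresolved issues queryable gives a direct list of items to carry into the next runbook revision.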