Redundancy Protocols
Failure-tolerant systems are designed to fail gracefully, not to avoid failure. OnionHat builds redundancy into the architecture rather than relying on fallback mechanisms that may not be tested in production conditions.
Redundancy Principles
Active-Active Operation
Redundant components operate continuously, not as cold standbys. Failover is not a special event; it is normal operation.
Independent Failure Domains
Redundant components should not share failure modes. Common dependencies (power, network, provider) are eliminated where possible.
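One way to make shared failure modes visible is to list each replica's dependencies and flag any that appear more than once. The following is a minimal sketch, not OnionHat's actual tooling: the replica names, the dependency labels, and the function `shared_dependencies` are all hypothetical.

```python
from collections import defaultdict

def shared_dependencies(replicas):
    """Return {dependency: [replica names]} for dependencies used by more than one replica."""
    users = defaultdict(list)
    for name, deps in replicas.items():
        for dep in sorted(set(deps)):
            users[dep].append(name)
    return {dep: names for dep, names in users.items() if len(names) > 1}

# Hypothetical inventory: replica name -> set of (kind, identifier) dependencies.
replicas = {
    "replica-a": {("power", "grid-1"), ("network", "carrier-x"), ("provider", "host-co-1")},
    "replica-b": {("power", "grid-2"), ("network", "carrier-x"), ("provider", "host-co-2")},
}

# Both replicas depend on the same (hypothetical) carrier, so it is reported.
print(shared_dependencies(replicas))
```

An empty result does not prove independence; it only shows that no shared dependency appears in the inventory as recorded.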
Tested Regularly
Redundancy that is never exercised is not redundancy. Failure scenarios are tested as part of normal operation.
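A failure drill can be sketched as a loop that takes each replica down in turn and checks that the service still answers. This is an illustrative in-memory model under simple assumptions; `Replica`, `serve`, and `failure_drill` are hypothetical names, and a real drill would act on live components rather than flags.

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.up = True

    def handle(self, request):
        if not self.up:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name}:{request}"

def serve(replicas, request):
    """Return the first successful response, trying replicas in order."""
    for replica in replicas:
        try:
            return replica.handle(request)
        except ConnectionError:
            continue
    raise RuntimeError("all replicas failed")

def failure_drill(replicas, request="ping"):
    """Take each replica down in turn and record whether the service still answered."""
    results = {}
    for victim in replicas:
        victim.up = False
        try:
            serve(replicas, request)
            results[victim.name] = True
        except RuntimeError:
            results[victim.name] = False
        finally:
            victim.up = True  # restore the replica before the next round
    return results

pool = [Replica("a"), Replica("b"), Replica("c")]
drill_results = failure_drill(pool)
```

A pool with a single replica fails the drill by construction, which is the point: the drill exposes redundancy that exists only on paper.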
Levels of Redundancy
Data Redundancy
Data is replicated across independent storage systems. Replication is synchronous where consistency is required and asynchronous where low latency is prioritized.
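The trade-off between the two modes can be illustrated with a toy in-memory model. This is a sketch under stated assumptions, not a description of the real storage layer: `Store`, `ReplicatedStore`, and `flush` are hypothetical, and network and failure handling are omitted.

```python
from collections import deque

class Store:
    def __init__(self):
        self.data = {}

    def put(self, key, value):
        self.data[key] = value

class ReplicatedStore:
    def __init__(self, primary, replicas, synchronous):
        self.primary = primary
        self.replicas = replicas
        self.synchronous = synchronous
        self.pending = deque()  # writes not yet applied to replicas (async mode)

    def write(self, key, value):
        self.primary.put(key, value)
        if self.synchronous:
            # Return only after every replica has the write.
            for replica in self.replicas:
                replica.put(key, value)
        else:
            # Return immediately; replicas catch up when flush() runs.
            self.pending.append((key, value))

    def flush(self):
        """Apply queued writes to the replicas, in order."""
        while self.pending:
            key, value = self.pending.popleft()
            for replica in self.replicas:
                replica.put(key, value)

sync_replica = Store()
sync_store = ReplicatedStore(Store(), [sync_replica], synchronous=True)
sync_store.write("k", 1)  # replica is up to date when write() returns

async_replica = Store()
async_store = ReplicatedStore(Store(), [async_replica], synchronous=False)
async_store.write("k", 1)
stale = dict(async_replica.data)  # snapshot taken before the replica catches up
async_store.flush()
```

In synchronous mode the writer waits for every replica, so a read from any replica sees the write. In asynchronous mode the write returns sooner, but a replica can be briefly stale, as the `stale` snapshot shows.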
Service Redundancy
Services run on multiple independent hosts. Load distribution and failover are automatic.
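Automatic load distribution and failover can be sketched as a round-robin dispatcher that skips any host that fails to answer. The classes `Host` and `Balancer` below are hypothetical and simplified; a production balancer would also track health over time instead of probing on every request.

```python
import itertools

class Host:
    def __init__(self, name):
        self.name = name
        self.healthy = True

    def handle(self, request):
        if not self.healthy:
            raise ConnectionError(f"{self.name} unavailable")
        return f"{self.name} handled {request}"

class Balancer:
    """Round-robin over hosts; a failing host is skipped for that request."""

    def __init__(self, hosts):
        self.hosts = list(hosts)
        self._next = itertools.cycle(range(len(self.hosts)))

    def dispatch(self, request):
        # Try each host at most once per request.
        for _ in range(len(self.hosts)):
            host = self.hosts[next(self._next)]
            try:
                return host.handle(request)
            except ConnectionError:
                continue  # automatic failover: move on to the next host
        raise RuntimeError("no host could serve the request")

hosts = [Host("h1"), Host("h2"), Host("h3")]
lb = Balancer(hosts)
before = [lb.dispatch(f"req{i}") for i in range(3)]
hosts[1].healthy = False  # simulate losing one host
after = [lb.dispatch(f"req{i}") for i in range(3, 6)]
```

Callers see no difference between the two batches: requests that would have gone to the failed host are served by the next one in the rotation.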
Network Redundancy
Multiple network paths exist between components. Routing automatically adapts to path failures.
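Path adaptation can be illustrated with a breadth-first search that ignores failed links. The topology and the function `find_path` are hypothetical; real routing protocols are considerably more involved, so treat this only as a sketch of the idea.

```python
from collections import deque

def find_path(links, source, target, failed=frozenset()):
    """Breadth-first search for a shortest path that avoids failed links."""
    neighbors = {}
    for a, b in links:
        if frozenset((a, b)) in failed:
            continue  # skip links that are currently down
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)

    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk back through predecessors to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for nxt in neighbors.get(node, []):
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    return None  # no surviving path

# Hypothetical topology with two disjoint paths from A to D.
links = [("A", "B"), ("B", "D"), ("A", "C"), ("C", "D")]
primary = find_path(links, "A", "D")
rerouted = find_path(links, "A", "D", failed={frozenset(("B", "D"))})
isolated = find_path(links, "A", "D",
                     failed={frozenset(("B", "D")), frozenset(("C", "D"))})
```

With one link down, traffic moves to the other path. With both links into D down, no path remains and the function returns `None`, which is the case path redundancy cannot cover.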
Geographic Redundancy
Critical systems are replicated across jurisdictions. Regional failures do not cause global outages.
What Redundancy Does Not Solve
- Correlated failures across all replicas
- Bugs that affect all instances simultaneously
- Operator errors that propagate to all systems
- Adversaries with access to all redundant components
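The effect of correlated failures can be made concrete with a simple probability model. It assumes each replica fails independently with probability `p_independent`, and that a separate common-cause event (a shared bug, an operator error, a compromised credential) takes down every replica at once with probability `p_common`. The function name and the numbers are illustrative assumptions, not measurements.

```python
def p_total_outage(p_independent, n_replicas, p_common=0.0):
    """Probability that all replicas are down at once under the model above."""
    return p_common + (1 - p_common) * p_independent ** n_replicas

# Illustrative numbers only.
independent_only = p_total_outage(0.01, 3)               # no common cause
with_common_cause = p_total_outage(0.01, 3, p_common=0.001)
more_replicas = p_total_outage(0.01, 10, p_common=0.001)
```

Under this model, adding replicas shrinks only the independent term. The outage probability never drops below `p_common`, however many replicas are added.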
Redundancy reduces risk. It does not eliminate it.