Ensuring node fault tolerance through uncorrelated replication and load balancing

The best way to achieve high availability is to run multiple instances in uncorrelated hosting environments behind a load balancer that routes each request to the least-loaded instance. While the blockchain itself is fault tolerant, round-robin AuRA consensus does mean that block processing slows down if one of the nodes falls out of line. I know there has been some experimentation with introducing fault tolerance at the node level. Do we have any wikis I could follow to set this up? I’d like to run the two instances in different clouds and regions, e.g. WestUS on AWS and EastUS on Azure. This would make the setup resilient to both cloud-specific and regional downtime.
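
To make the routing idea concrete, here is a minimal sketch of least-loaded selection in Python. It only illustrates the selection rule, not a full load balancer or any existing node tooling. The `Instance` type, its fields, and the instance names are hypothetical, and it assumes each instance reports a health flag and a count of in-flight requests.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Instance:
    name: str       # hypothetical label, e.g. "aws-westus"
    healthy: bool   # assumed result of the most recent health check
    in_flight: int  # assumed count of requests currently being processed


def pick_least_loaded(instances: Sequence[Instance]) -> Optional[Instance]:
    """Return the healthy instance with the fewest in-flight requests, or None."""
    candidates = [inst for inst in instances if inst.healthy]
    if not candidates:
        return None  # every instance is down; the caller decides how to fail
    return min(candidates, key=lambda inst: inst.in_flight)


if __name__ == "__main__":
    pool = [
        Instance("aws-westus", healthy=True, in_flight=7),
        Instance("azure-eastus", healthy=True, in_flight=3),
    ]
    chosen = pick_least_loaded(pool)
    print(chosen.name if chosen else "no healthy instance")
```

With one instance per cloud, an outage in either cloud or region just removes that entry from the candidate list, and requests go to the other one.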

Best, Michael
