Plan for Unsuccessful Changes: A Use Case of the AWS Health API
Boardgame
Radlands is a fast-paced card game where you lead a tribe of survivors in a harsh post-apocalyptic world. Your tribe has settled near a rare water source and uses strange old technology from abandoned military experiments to stay alive. Other tribes want your water and are preparing to attack. To win, you’ll need to use your cards wisely, protect your three camps, and perform regular health checks to ensure their survival. If all your camps are destroyed, you lose. Radlands is all about strategy, survival, and fierce battles. For more information, visit BoardGameGeek.
Resiliency: A Shared Responsibility in the Cloud
This blog post is inspired by the chalk talk ARC317 - Operational Excellence: Best Practices for Resilient Systems from AWS re:Invent 2024. The presentation deck is available on the AWS Events Content site.
The AWS Well-Architected Framework defines resiliency as the ability of a system to recover from failures caused by load, attacks, or internal faults. These failures can stem from various sources, such as hardware malfunctions, software bugs, operational errors, or environmental disruptions.
In the cloud era, ensuring resiliency is not solely the responsibility of cloud consumers. Instead, it operates on a shared responsibility model, where both cloud providers and customers play critical roles.