Architecting Your Cloud Infrastructure for Failure (and Resilience)
Companies aggressively moved to the public cloud in 2020, driven in part by the pandemic and the shift to a remote workforce. That pace is expected to accelerate in 2021, as more and more companies move applications and data to the cloud to achieve the benefits that have been touted for years, including cost savings, flexibility and on-demand scalability. An individual organization would struggle to match a cloud service provider in these areas; delivering them is the provider’s core business.
There is, however, one wrinkle to building on cloud services, and that is the shared responsibility model.
Cloud solutions cannot be a simple “lift and shift.” They must be intelligently designed to prevent, adapt to and recover from an operational disruption in a manner that is consistent with the importance of the services and data placed in the cloud.
Regardless of which cloud provider an organization uses, the organization is responsible for managing its own space within the cloud. The cloud provider will maintain the contracted digital space, but the organization must deploy into that space and exercise diligence in how it is used. This starts with proper architecture within the cloud. In addressing architecture, we do not begin by contemplating which specific disruption event could occur. Instead, we assume that a disruption event is inevitable and architect for its results. In other words, we are architecting for failure.
It is hard for most people to wrap their heads around cloud resiliency. “My data and my applications are somewhere ‘in the cloud’ and they are safe” is a hard concept to understand, mostly because a virtual, amorphous service doesn’t have the same simplicity as a server in a firm’s data center. However, in terms of resiliency and the processes for handling disruption events, such as a security incident, cloud service providers (CSPs) are simply better than most. CSPs have intentionally designed their platforms to stay resilient regardless of component failure, and by using those platforms, organizations can embrace the same architectural patterns. By building on top of the provided cloud services, organizations can deploy in a manner architected for failure at a fraction of the cost and administrative burden that the same outcome would require with multiple physical data center environments.
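To make the idea of staying resilient regardless of component failure a little more concrete, here is a minimal sketch of one common pattern: trying a redundant copy of a service when the first copy fails. It is an illustration under stated assumptions, not a description of any provider’s platform. The names `call_with_failover`, `primary_zone` and `secondary_zone` are hypothetical, and the two functions simply stand in for the same service deployed in two separate locations.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


class AllReplicasFailed(Exception):
    """Raised when every replica in the list has failed."""


def call_with_failover(replicas: Sequence[Callable[[], T]]) -> T:
    """Try each replica in order and return the first successful result."""
    errors: List[Exception] = []
    for replica in replicas:
        try:
            return replica()
        except Exception as exc:  # a real system would catch narrower errors
            errors.append(exc)
    # Every replica failed: surface all collected errors to the caller.
    raise AllReplicasFailed(errors)


# Hypothetical stand-ins for the same service deployed in two locations.
def primary_zone() -> str:
    raise ConnectionError("primary zone unavailable")


def secondary_zone() -> str:
    return "served from secondary zone"


# The primary fails, so the request is answered by the secondary.
result = call_with_failover([primary_zone, secondary_zone])
print(result)
```

The point of the sketch is the design assumption rather than the code itself: the caller expects the first component to fail at some point and already knows what to do next.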
Our supply chains will continue to grow more intertwined. Customers’ expectations will continually advance. And regulatory bodies will continue to press firms on operational resilience, a concept specifically designed to examine the ability of organizations to deliver goods and services within a stressed environment. This means ideas like “lift and shift” are outdated and may be detrimental to firms that adopt them without appropriate risk analysis and without considering re-architecture to meet the evolving threat and regulatory landscape.
We have noted below our five keys to architecting for failure:
Whether you have moved to the cloud, are in the process of moving, or are in the process of architecting a cloud solution, we can ensure the business and technical robustness of your solution. Protiviti’s integrated solution combines our cloud, security and operational resilience practices to help you seamlessly move to the cloud while satisfying the needs of the business, IT and global regulatory obligations. Our services to help firms architect for resilience include: