Delivering better software faster is imperative in today’s digital world. When Ben Treynor, a Google engineer, was tasked with improving the company's site reliability, it became apparent that its existing approach to development and operations could not meet its needs. This led to a new methodology, SRE (site reliability engineering).
Where the traditional corporate structure of DevOps bridges the gap between departments, SRE allows developers to create the framework themselves. While both methodologies share the same core principles, SRE can better meet the needs of larger-scale frameworks. As a result, the traditional corporate mindset of spending more time on operations and less time developing software has shifted.
This shift has developers releasing deliverables continuously and at a higher frequency, takes a more autonomous approach to development, and gives developers more freedom.
Although DevOps and SRE are similar in nature, they differ in organizational structure.
What is SRE?
SRE focuses on removing communication problems between departments by installing a lead engineer who has an operational background and mindset. This allows progress to be driven from the top down, because developers have the freedom to monitor and maintain software releases while still working within their IT department. The result is a more scalable approach that supports continuous development and improvement of complex frameworks.
What is DevOps?
DevOps focuses on bridging the gap between development and operational departments to align key goals and initiatives that are set forth by the company. The intention is to bring development and operations together to mitigate issues and allow for more frequent releases of digital products.
The two methods do not compete. Instead, they are often described as cousins, both designed to break down organizational barriers and deliver better software faster. SRE is often referred to as an extension of DevOps, but there are important differences.
Each framework fosters a collaborative environment to reduce organizational friction and achieve success. Both accept failure and understand that when human elements are involved, errors can and will arise. To prepare for missteps, rollouts are done incrementally so that rollbacks and bug fixes can happen easily and changes can be made before a feature is rolled out completely. Automation is also leveraged to reduce manual labor, which speeds up the development process and cuts down on the unnecessary errors that occur when humans perform simple tasks.
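The incremental rollout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `staged_rollout`, the stage percentages, and the `is_healthy` check are all hypothetical and do not come from any particular tool. The idea is that a release is exposed to a growing share of users, and the first failed health check triggers a rollback while exposure is still small.

```python
from typing import Callable, List, Sequence, Tuple


def staged_rollout(
    stages: Sequence[int],
    is_healthy: Callable[[int], bool],
) -> Tuple[int, List[str]]:
    """Raise the share of users on a new release one stage at a time.

    Returns the final percentage of users on the new release and a log
    of the decisions taken. All names here are hypothetical.
    """
    log: List[str] = []
    current = 0  # percentage of users currently on the new release
    for percent in stages:
        log.append(f"expand to {percent}%")
        if not is_healthy(percent):
            # A problem surfaced at this stage: return everyone to the old release.
            log.append(f"rollback from {percent}% to 0%")
            return 0, log
        current = percent
    log.append("rollout complete")
    return current, log


# Hypothetical health check: the release misbehaves once it reaches 50% of users.
final, log = staged_rollout([1, 10, 50, 100], lambda p: p < 50)
print(final)
print(log)
```

In practice the health check would be an automated comparison of error rates or latency between the old and new release, which is where the automation both methods rely on comes in.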
SRE is operationally driven from the top down, with developers sharing ownership of production, whereas DevOps brings the operations and development teams together. Communication in the latter method can often break down, causing organizational hiccups. This can result in the same issue arising multiple times without a different solution being tried.
SRE relies heavily on automation and artificial intelligence (AI) to help ensure that the same mistake does not occur twice. A solution like Blameless helps automate incident resolution and supports learning without pointing fingers, in keeping with best practices for the postmortem process.
DevOps fits many organizational structures, big and small, thanks to its focus on frequent releases and enhancements. SRE, however, is much more scalable for continuous development and improvement of complex frameworks.
Which should your company adopt?
Because every business operates differently, there is no cookie-cutter solution. Enterprise organizations looking to adopt one of these methods should prepare for big organizational, cultural, and production changes.
The starting point is to assess your baseline. Understand where your current process stands, and develop a roadmap that lets you prioritize enterprise requirements. Review where you currently stand when it comes to staff, vendors, and budget. Will additional staff need to be brought on? Will you need to partner with additional vendors? Can your budget handle what you are about to implement?
Most importantly, what is the desired outcome of switching methods? If the goal is to increase product output at a rapid rate for revenue growth, DevOps is most likely the correct fit. If you have a larger framework with a focus on continued advancements and improvements, SRE is most likely the better fit for you.
How should a company think about SRE adoption if it already has DevOps engineers?
Switching your framework to SRE can help increase ROI and improve relationships between team members. However, it is imperative that you have team members who are capable of leading development teams. If not, breakdowns will occur and production time will increase. This method is best suited to organizations that have larger systems and are looking to streamline their growth tactics.
Watts, Stephen. “SRE vs. DevOps: What’s the difference?” CIO Magazine, 15 June, 2018. https://www.cio.com/article/3282424/sre-vs-devops-whats-the-difference.html
Markel, Liz. “What is an SRE and how does it relate to DevOps?” USENIX, 19 October, 2018. https://www.usenix.org/blog/what-is-sre-how-does-it-relate-to-devops-lisa18