How a DevOps Team Became a Platform Engineering Team







The last couple of years have seen a lot of job titles updated from DevOps engineer to platform engineer.

For most, it’s been a change in name only. It wasn’t a natural substitution anyway, since DevOps focuses on operations, while platform engineering centers on developer experience. A successful DevOps or platform engineering transformation will optimize technology, people and processes on both sides of that persistent silo. It will also consider all the other stakeholders in the increasingly complex software development life cycle.

That’s exactly what Allianz Direct, an international insurance company, sought to do when it set out to move its DevOps team toward platform engineering. Sergiu Petean, the company’s director of cloud engineering and operations, shared this story of scale on a recent episode of the Platformers Community livestream.

Getting Started with Platform Engineering



When Petean joined Allianz in 2021, he found his team in what he dubbed “a state of fake SRE topology,” referencing an anti-pattern from “Team Topologies” in which operations engineers rename themselves site reliability engineers without changing how they work.

“Devs still throw software that is only ‘feature-complete’ over the wall to SREs,” he said. “Software observability suffers because devs are no closer to actually running the software that they build, and the SREs [site reliability engineers] still don’t have time to engage with devs to fix problems when they arise.”

This isn’t an uncommon place to start, but companies must be careful not to get stuck there.

Petean also wanted to avoid another platform engineering anti-pattern: addressing only a single stakeholder, the software engineer.

“In a huge enterprise, you have to consider as many stakeholders as possible, ideally all stakeholders that have anything to say about your production status,” he said, particularly in such a regulated industry. You also need thorough documentation that represents the decisions of each of those stakeholders.

Once everyone was considered, it was time to define DevOps for Petean’s new team and the organization as a whole. Building on Google’s DevOps definition, Allianz Direct’s DevOps principles are grounded in:

  • Automated deployments and rollbacks to reduce human error.
  • Lean management and continuous improvement.
  • Measurable outcomes.
  • Sharing knowledge, best practices and resources among teams.
  • Security integrated within development and operations.
  • Governance best practices that reduce risk.
  • A culture of closer collaboration and shared responsibility.


It took the DevOps team a couple of months to clarify and collect all the tasks, stories and epics among these stakeholders and to reflect on its current challenges.

Why DevOps Must Own the Backlog

“It was extremely confusing for anyone in our organization to know what we were doing,” Petean said.

“Sometimes, even for us, it was quite challenging to understand: What are we working on? Why are we working? What’s the value that we are bringing to the overall organization?” he added. “How is that going to evolve together with the whole stack in the next six months?”

This is why DevOps burnout is exceedingly common.

“The life of a DevOps [team] is not easy, especially when you have a lot of incidents, technical debt, technical questions, so you don’t know much of your backlog,” Petean said.

That is when his team decided “we actually want to build a platform, and we want that platform to be self-service,” turning the DevOps team into an internal product and platform engineering team.

“In order to do that,” he said, “we needed to own at least half of the backlog.”

The DevOps team originally wanted to do Scrum, but since it didn’t own the engineering organization’s backlog, it couldn’t set its own sprint goals. Instead, Petean and his DevOps engineers went the Kanban route. Making the work visible gave them a clearer picture of their workload, which in turn showed how big the team needed to be to cope with that pressure.
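As a rough illustration of how Kanban flow data can feed a team-size estimate, here is a minimal sketch in Python. It is not Allianz Direct’s method: the function names, the 80% utilization target and the sample numbers are all assumptions made up for this example. It uses two simple relationships: Little’s Law (average work in progress equals throughput times average cycle time) and a capacity check that divides incoming work by what one engineer can finish.

```python
import math


def average_wip(throughput_per_week: float, avg_cycle_time_weeks: float) -> float:
    """Little's Law: average work in progress = throughput x average cycle time."""
    return throughput_per_week * avg_cycle_time_weeks


def required_engineers(
    arrivals_per_week: float,
    completed_per_engineer_per_week: float,
    target_utilization: float = 0.8,  # assumed headroom for incidents and interruptions
) -> int:
    """Estimate how many engineers are needed to keep up with incoming work.

    All inputs are hypothetical measurements read off a Kanban board.
    """
    if completed_per_engineer_per_week <= 0:
        raise ValueError("completed_per_engineer_per_week must be positive")
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    effective_capacity = completed_per_engineer_per_week * target_utilization
    return math.ceil(arrivals_per_week / effective_capacity)


if __name__ == "__main__":
    # Hypothetical numbers: 30 tickets arrive per week, each engineer closes 5.
    print(required_engineers(30, 5))  # engineers needed at 80% utilization
    print(average_wip(30, 1.5))       # items in flight if cycle time is 1.5 weeks
```

The point of the sketch is only that a board with measured arrivals and cycle times turns “how big should the team be?” into a calculation rather than a guess; real sizing would also have to account for ticket size, on-call load and seasonality.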

Controlling the backlog was also the only way he felt the team could drive automation, security and DevOps concepts into their processes.

In the modern multinational corporation, there are countless stakeholders outside your team influencing your work. Inevitably, the team’s demand to own its backlog was met with resistance.





