Tear Down Your Castle! – A Story Of Culture & Cloud Natives.

We each run our own path, but I count myself lucky to have seen a lot of architectures over the last 15+ years, ranging from rather average to pretty amazing. The amazing ones you probably interface with in your daily life as apps on your phone.

As I pen this post it’s 2022. We know change is constant, but are you keeping up to speed? I wrote VBScript for almost 10 years, and I was not keeping up!

Cloud natives, digital natives, whatever you call them: this new generation of builders, who know not what a data center is or how to install an operating system, has a different approach to making technology decisions.

You can teach tech to almost any organisation. I am new to Microsoft, and I have just earned my Azure expert certification to go with the others I hold. Does that mean I really am an expert?

Culture, The Secret Sauce?

The thing is, what’s unique about cloud natives is their culture. It really is the business differentiator.

Many of you who know me may know my history (Startup / SEEK / Amazon / Microsoft), but what you may not know is that I worked for the third largest food manufacturer in the world (JR Simplot) for a short 13-month stint.

Why only 13 months? It was a great role. Great title, an ability to influence global architecture, factories around the world.

Why did I leave? Culture. I was heading global architecture but unable to install the latest version of an SDK (Software Development Kit), because the architecture function was bound to an SOE (Standard Operating Environment) and I had no local admin rights. I can virtually see your faces and tears.

That speaks volumes about why LPARs, iSeries and DB2 still exist there.

Conway’s law, like Moore’s law and Brooks’s law, holds true to this day.

Digital natives don’t hold all the cards, but what they do know is how to experiment. They experiment and learn in a way that is unlike traditional organisations. They do it fast, but that doesn’t mean they experiment dangerously.

They have systems and mechanisms in place and are structurally organised so that the blast radius of any experiment is contained. They can be very targeted in their experiments and, of course, they measure. Remember, you can’t improve what you can’t measure.

So, we are talking 2-way doors.

  • A 1-way door is almost impossible to reverse. Once you make one of these decisions it’s really hard to go back.
  • A 2-way door can be quickly changed, like choosing instance types or choosing between PyTorch and TensorFlow. While these decisions might feel momentous, with a little time and effort, often a lot less than you think, they can be reversed.

These organisations all use this principle. They make changes and aren’t afraid to roll things back. They’ve built cultures and processes around rapid innovation. They use models like challenger vs champion, where they send a fraction of traffic to a challenger to see how it performs, and in some cases it becomes the champion.
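To make challenger vs champion concrete, here is a minimal sketch of how a fraction of traffic could be sent to a challenger. The function name `pick_variant`, the `challenger_share` parameter and the hashing scheme are my own assumptions for illustration, not how any particular company does it.

```python
import hashlib


def pick_variant(user_id: str, challenger_share: float = 0.05) -> str:
    """Return 'challenger' for roughly `challenger_share` of users, else 'champion'.

    Hypothetical helper: hashing the user id keeps each user on the same
    variant across requests, so the measurements stay comparable.
    """
    if not 0.0 <= challenger_share <= 1.0:
        raise ValueError("challenger_share must be between 0 and 1")
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    # Map the hash to one of 10,000 buckets (basis points of traffic).
    bucket = int.from_bytes(digest[:8], "big") % 10_000
    threshold = round(challenger_share * 10_000)
    return "challenger" if bucket < threshold else "champion"
```

Rolling back is just setting `challenger_share` to 0, which is what makes this a 2-way door.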

Most importantly, they ship faster to learn faster.

Lastly, they know their stuff. I will talk more about this later, but there is a good reason why Netflix pays its senior engineers more than 500k USD.

There are no free passes here. Most conversations are 400-level API / SDK, so what value are you bringing to the conversation?

So what is so special about cloud native culture? Let’s summarise it in 3 points:

  1. Empowered decentralized teams
  2. Fail fast
  3. They know their stuff

But it’s the first point I want to talk to you about. It is that empowerment and decentralized function.

Classic Account Structures: Not Conducive to Speed

DevTest and DevOps for IaaS solutions – Azure Solution Ideas | Microsoft Docs

I copied and cropped this image from the Azure Architecture Center. Why? This guidance is centered around centralized governance of IT: that classic pattern of Development, Test and Production subscriptions.

This assumes your governance is very centralized. This is an assumption that may work in the Enterprise, but you know what they say about assumptions?

It doesn’t bode well for this style of organisation.

We spoke about moving fast. This is a good start, but we need to expand on this architecture.
Let me frame it like this.

This picture should make it clearer, especially the snapping alligators / crocodiles.

Centralised operations are not conducive to speed, full stop. They bring other benefits, but speed from an ivory tower is not one of them.

But Shane, by having centralized operations you can build economies of scale! That one large K8s (AKS/EKS) cluster where everyone throws us their containers and we run them, that single namespace for your Service Bus/SQS?

True, I would say. However, scaling to meet the demands of the business means this style of architecture guarantees the operations function will in effect become a choke point. More so, you are increasing business risk: all your eggs are now in one basket, that production subscription, that single Event Hubs/Kinesis namespace, those same resource limits.

It drives a hero mentality which we want to avoid.

These soldiers, the operations people in this picture, let’s be frank, often have little idea why a specific microservice API error rate is increasing. Conversely, the developers are kept at arm’s length from reality and from seeing how their software fails.

This style of command and control doesn’t work with cloud natives and is not a core attribute of this style of customer.

My advice to you all here: if your customer is trying to run faster and is experiencing outages, technology may not be the solution. Have you looked at their culture? It can be difficult to approach, so use tact.

I am going to say it: everything fails all the time, no matter whose cloud.

Broadly speaking, complex systems such as microservice architectures contain changing mixtures of failures latent within them.

How can we address the speed-to-market constraints whilst reducing business risk, so that change is embraced and becomes a non-event?

Empowering Development Teams

I need you to use your imagination here, because my ambition to draw you an amazing picture is let down by my Visio skills.

I want you to take this picture, which I have scaled out to 8 subscriptions. Now imagine what this would look like with 1000+ subscriptions.

What do you notice about this picture? Gone is the castle, the soldiers.
Each team is accountable for their environment.

You build it, you run it (that includes supporting it) and in most places, guess what, you pay for it. Do you know what the cost to serve is? In this topology you do. There is no hiding costs.

What I am trying to convey here is that in these organisations there may be 50 to 1000 different service teams.
Each team will have many subscriptions. I have one per team in this diagram, but there could be 5 to maybe 20+.

For example, you might have 5 dev subscriptions, staging, QA, stress and volume, and production. When they finish building their code, they don’t throw it over the fence, they run it themselves.

What was the production subscription today may be nuked tomorrow. Layer 7 routing is key here.

With Layer 7 routing, imagine the ingress resolving to a CDN with an origin pointing to an L7 load balancer.
There will then be path-based routing rules: /search goes to this subscription, /tax goes to that subscription, /dogs to another.
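As a rough sketch of what those path-based rules amount to, here is a longest-prefix match in Python. The subscription names, `DEFAULT_BACKEND` and `resolve_backend` are made up for illustration; a real L7 load balancer does this in its own rule configuration rather than in application code.

```python
# Hypothetical routing rules: path prefix -> the subscription that serves it.
ROUTES = {
    "/search": "search-team-subscription",
    "/tax": "tax-team-subscription",
    "/dogs": "dogs-team-subscription",
}
DEFAULT_BACKEND = "default-subscription"  # assumed catch-all, not in the post


def resolve_backend(path: str, routes: dict[str, str] = ROUTES) -> str:
    """Return the backend for a request path; the longest matching prefix wins."""
    best = ""
    for prefix in routes:
        # Match the prefix exactly or as a whole path segment, so that
        # "/searching" does not accidentally match "/search".
        matches = path == prefix or path.startswith(prefix.rstrip("/") + "/")
        if matches and len(prefix) > len(best):
            best = prefix
    return routes.get(best, DEFAULT_BACKEND)
```

Calling `resolve_backend("/dogs/latest")` picks the dogs team's subscription, while an unknown path falls through to the default.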

Notice the red in this picture? Deano the dinosaur is sick of cat pics taking over the internet and wants some dog pictures.

Each self-empowered team will then, as part of its CI/CD process, update the routing configuration to point its path to the right subscription. What I want to emphasise here is that each team or squad has complete control of how and when it deploys and releases.
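Here is a minimal sketch of what that CI/CD step could look like, assuming a hypothetical `RouteTable` that records which team owns each path. The class, its fields and the team names are invented for illustration; the point is that a team can only repoint its own path, and the change is easy to roll back.

```python
from dataclasses import dataclass, field


@dataclass
class RouteTable:
    """Hypothetical routing config a pipeline might update on release."""
    owners: dict[str, str]    # path prefix -> owning team
    targets: dict[str, str]   # path prefix -> subscription currently serving it
    history: list[dict[str, str]] = field(default_factory=list)

    def repoint(self, team: str, prefix: str, new_target: str) -> None:
        """Point `prefix` at a new subscription, but only for the owning team."""
        if self.owners.get(prefix) != team:
            raise PermissionError(f"{team} does not own {prefix}")
        # Keep a snapshot so the release stays a 2-way door.
        self.history.append(dict(self.targets))
        self.targets[prefix] = new_target

    def rollback(self) -> None:
        """Restore the routing that was live before the last repoint."""
        if not self.history:
            raise RuntimeError("nothing to roll back")
        self.targets = self.history.pop()
```

A release then becomes `table.repoint("dogs-team", "/dogs", "dogs-prod-b")`, and if the error rate climbs, `table.rollback()` puts the old subscription back in play.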

Yes, there are probably some guardrails, but typically the language and the release process are controlled by the team. One team uses Python, another .NET Core, another Go. Does it really matter if all you are exposing is, in most cases, an API endpoint?

That API is your contract. Does it all need to be one language? Of course not, hence the polyglot ecosystem.

The true production environment may include hundreds of linked / peered VNets that together create a pseudo production environment.

This is empowerment, this is decentralisation. All of these self-empowered teams are in effect running their own small business.

There are inefficiencies, sure. For example, each squad will be running its own infra in most cases. But it also reduces risk and, guess what, you have bought yourself a ton of speed. Tight coupling, be gone. Hello independence.

This is culture here ladies and gentlemen, this is what separates the traditional enterprise from this style of customer.

For the most part there is no central governance here. I say for the most part because yes, there are going to be guardrails in the form of an ingress / egress hub and some common infrastructure, and by and large the architecture function will define these.

If I had to codify this, this is what I would say:

When IT is no longer a blocker, that is where the magic happens.

Many of these companies will have a ‘company in a box’. The premise: as a developer I want a full end-to-end site, with sanitized production data, to test a hypothesis, and I want it in 15 minutes. Guess what, most of these companies can do this. If the idea is a flop, they fail, and fail fast, and it may have cost them a few cups of coffee in cloud usage.

Culture Eats All

Learn More About the Spotify Squad Framework — Part I | by Thaisa Fernandes | PM101 | Medium

You can read a lot about culture, far more than what I have spoken to you about today. Today is my lived experience of dealing with this special but ever-growing style of organisation.

I want to close this section with a picture. It’s from Spotify.

Think about your workplace. Which quadrant are you playing in? How are you leading?
Culture eats strategy for breakfast, technology for lunch and everything else for dinner.
You may not be working for a cloud native, but it doesn’t mean you can’t adopt some of their culture to advance your business and, at the same time, yourself as a thought leader.

How can you help your organisation shift towards the top right, and do you need to alter your behaviors?

Is this part of your career development plan and the way you add value to your organisation? If it is not, I would encourage you to think about what you can do in this space.

I know this was a long post, so thanks for sticking it out.

– Shane

2 thoughts on “Tear Down Your Castle! – A Story Of Culture & Cloud Natives.”

  1. Nicely done.

    I think where I have my fledgling team now is a hybrid of your proposal. We still work within that central dev/test/prod model (sometimes just dev/prod) but we own every solution from design to build to CICD and monitoring and finally support. We build it and own it.

