Intro: Cloud Cost Landscape and Organizing for Success

Cloud…bombe? No, not bomb! For those not familiar with the life and work of Alan Turing: among other accomplishments, he advanced a code-breaking device known as the bombe, built to crack the Enigma machine’s cypher and help turn the tide of WWII.

Anyone who’s sat down in front of the AWS Cost and Usage Report (CUR) to try to answer a question for their organization can relate to how Allied naval officers must have felt when they picked up an Axis U-boat transmission: “The information I need to plan my next move is in there, but how do I translate it into something I can use?”

The theme of this blog is to equip every AWS customer with their own cloud “bombe” – the knowledge, skills, and tools needed to fully understand what they have going on. Thus armed, the cloud cost practitioner is empowered to make the right moves at the right time and optimize their organization’s key outcomes.

For large enterprises, the difference between cracking this code or not can mean tens or even hundreds of millions of dollars in annual cloud cost. For tech startups, where cloud might be an even larger share of overall costs, it can influence whether or not there’s funding enough to hire that new hotshot developer and deliver on the killer new feature sought by a collection of would-be customers.

Nobody is perfect or has this all figured out – and even if they did, Amazon’s staggering pace of product and feature releases would see this perfection crumbling to pieces by the next press cycle. Staying on top of cloud costs means weathering this storm – establishing the core principles as your organization’s guiding compass, then taking each course-altering blast of wind and tidal wave from AWS as they come, keeping it all on track as best you can.

Who should be looking at Cloud Cost Management at my Company?

The problem space for cloud cost management is large and multi-dimensional. To get started in thinking about how to best line up your organization’s resources to meet the challenge, we’ll start with this diagram:

Cloud cost management landscape - people and problems

In this graph, the vertical or Y-axis is the “Altitude” Axis. A position higher on the graph indicates a problem that is more strategic in nature. In this context, a strategic decision is one that is more abstract, has farther-reaching consequences, and is probably more heavily deliberated. The highest-level decisions of this type are generally made by a function’s executive leadership. A lower placement on this axis indicates a more tactical problem: the sort of thing dealt with more often, and by persons closer to the front lines.

The horizontal or X-axis is the “Discipline” Axis. While in reality many disciplines will be brought to bear, the spectrum of this axis has traditional Finance/Accounting at the far left, and traditional Technology at the far right.

To show how cloud cost management hits us with problems all over the map, here are some challenges likely to be faced by any organization on its cloud journey. This isn’t every challenge to be faced, and those listed might not all be faced on the same day, but they are all problems with non-obvious solutions that your organization will one day have to solve for itself:

  1. How does the arrival of a new prepay product (ex: AWS Savings Plans) affect our prepay rate-lowering strategy?
  2. On average, what was the cost to my organization to run an m5.2xlarge Linux EC2 instance in the us-east-2 region in the 4pm hour yesterday afternoon?
  3. How do our company’s financial objectives influence our prepay strategy for the coming 1-3 years?
  4. Who inside our organization should be managing the cloud vendor relationship and our portfolio of prepaid investments? Should it be centralized or distributed?
  5. How will our transition away from traditional EC2 and into containers affect our usage patterns – and thus our prepay strategy?
  6. How will we hold teams accountable for inefficient operations, even if they still manage to meet their budget?
  7. What prepay investments should we be making RIGHT NOW?
  8. What’s the right way for us to handle the amortization of our prepay investments?
  9. Should we be trying to lock-in with a single cloud vendor, or should we be considering a multi-cloud approach?
  10. How should we handle the $25,000 credit we received on last month’s cloud invoice?
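To give a feel for what answering question 2 involves, here is a minimal Python sketch. It assumes line items shaped like CUR rows, using the CUR-style column names `lineItem/UsageStartDate`, `lineItem/LineItemType`, `lineItem/UsageAmount`, `lineItem/UnblendedCost`, `product/instanceType`, `product/operatingSystem`, and `product/region`. The sample rows and the helper name `average_hourly_cost` are made up for illustration, and the sketch deliberately simplifies: it counts only on-demand `Usage` line items at unblended cost and treats the "4pm hour" as the timestamp recorded in the report, ignoring prepay coverage, amortization, and time zones.

```python
from datetime import datetime

# Hypothetical rows; every value here is invented for illustration.
SAMPLE_ROWS = [
    {"lineItem/UsageStartDate": "2020-04-14T16:00:00Z", "lineItem/LineItemType": "Usage",
     "lineItem/UsageAmount": "1.0", "lineItem/UnblendedCost": "0.5",
     "product/instanceType": "m5.2xlarge", "product/operatingSystem": "Linux",
     "product/region": "us-east-2"},
    {"lineItem/UsageStartDate": "2020-04-14T16:00:00Z", "lineItem/LineItemType": "Usage",
     "lineItem/UsageAmount": "0.5", "lineItem/UnblendedCost": "0.25",
     "product/instanceType": "m5.2xlarge", "product/operatingSystem": "Linux",
     "product/region": "us-east-2"},
    # Different instance type: should be ignored.
    {"lineItem/UsageStartDate": "2020-04-14T16:00:00Z", "lineItem/LineItemType": "Usage",
     "lineItem/UsageAmount": "1.0", "lineItem/UnblendedCost": "2.0",
     "product/instanceType": "m5.4xlarge", "product/operatingSystem": "Linux",
     "product/region": "us-east-2"},
    # Different hour: should be ignored.
    {"lineItem/UsageStartDate": "2020-04-14T17:00:00Z", "lineItem/LineItemType": "Usage",
     "lineItem/UsageAmount": "1.0", "lineItem/UnblendedCost": "4.0",
     "product/instanceType": "m5.2xlarge", "product/operatingSystem": "Linux",
     "product/region": "us-east-2"},
]


def average_hourly_cost(rows, instance_type, operating_system, region, hour_start):
    """Return total cost divided by total instance-hours for one hour, or None."""
    total_cost = 0.0
    total_hours = 0.0
    for row in rows:
        if row["lineItem/LineItemType"] != "Usage":
            continue
        if row["product/instanceType"] != instance_type:
            continue
        if row["product/operatingSystem"] != operating_system:
            continue
        if row["product/region"] != region:
            continue
        started = datetime.strptime(row["lineItem/UsageStartDate"], "%Y-%m-%dT%H:%M:%SZ")
        if started != hour_start:
            continue
        total_cost += float(row["lineItem/UnblendedCost"])
        total_hours += float(row["lineItem/UsageAmount"])
    return total_cost / total_hours if total_hours else None


four_pm = datetime(2020, 4, 14, 16, 0, 0)
rate = average_hourly_cost(SAMPLE_ROWS, "m5.2xlarge", "Linux", "us-east-2", four_pm)
print(rate)  # average cost per instance-hour across the two matching rows
```

Even this toy version forces decisions a real answer has to make explicit: which line item types count, which cost column to use, and what "yesterday at 4pm" means in the report's timestamps.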

Two key takeaways from this exercise:

  1. The challenges faced by an organization in optimizing its cloud use are “all over the map” and involve a mixture of technical and financial disciplines.
  2. A lot of the key pieces are in their own disciplinary lane in the center of the map. This area represents a distinct body of knowledge and experiences which can be considered a discipline unto itself.

Unless you have persons experienced in this space already on staff, you’ll need to get your Finance folks together with your Cloud/Technology folks to form a cross-functional team to address the challenge. Together, their objective is to build a bridge of knowledge from their disciplines into what is, for your org, the undiscovered world of successful cloud cost management.

One other skillset not called out above is paramount to success: data engineering and analytics. While some large enterprises may have dedicated data teams, most organizations do not, yet the cloud is inherently a thoroughly data-driven place. Even for modest-size customers, it is actually a “big data” problem, as one month of detailed cloud costs can easily amount to 100x more than what Excel can handle. To really get to the bottom of your organization’s cloud operation will require the skills to manage and query this data, harvesting it for the valuable insights it can provide your org in maximizing its outcomes.
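To put a rough number on that claim: a modern Excel worksheet tops out at 1,048,576 rows, so 100x that is on the order of a hundred million line items. A common way to cope is to stream the data and aggregate as you go rather than loading it all at once. The sketch below is a minimal illustration in Python, assuming a CSV export with the CUR-style columns `lineItem/ProductCode` and `lineItem/UnblendedCost`. The sample data and the function name `total_cost_by_product` are made up for illustration.

```python
import csv
import io
from collections import defaultdict

EXCEL_MAX_ROWS = 1_048_576  # row limit of a single modern Excel worksheet


def total_cost_by_product(csv_file):
    """Sum cost per product code, reading one row at a time."""
    totals = defaultdict(float)
    for row in csv.DictReader(csv_file):
        totals[row["lineItem/ProductCode"]] += float(row["lineItem/UnblendedCost"])
    return dict(totals)


# Tiny invented sample standing in for a file far too large to open in Excel.
SAMPLE_CSV = """lineItem/ProductCode,lineItem/UnblendedCost
AmazonEC2,0.5
AmazonS3,0.25
AmazonEC2,1.5
"""

totals = total_cost_by_product(io.StringIO(SAMPLE_CSV))
print(totals)                # {'AmazonEC2': 2.0, 'AmazonS3': 0.25}
print(100 * EXCEL_MAX_ROWS)  # 104857600
```

Because only the running totals are held in memory, the same loop works whether the file has three rows or a hundred million. In practice a query engine or data warehouse will be the better tool, but the principle is the same.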

In some places this collective discipline might be called “Cloud Cost Optimization”, in others, “FinOps”. Whatever the name, it is important to establish and nurture this capability within your organization to ensure it is empowered to make the most of its cloud.

What guides this team in its efforts?

A few years ago Netflix gave a presentation at AWS re:Invent on their internal approach to, and tooling for, running their cloud efficiently. In the industry, Netflix is one of the most mature and sophisticated organizations in cloud optimization.

Here’s the video replay:

One of the earlier slides is key in illustrating something every cost opt team will have to figure out early on – what’s important to our organization?

Credit for this image to Andrew Park and Sebastien de Larquier of Netflix from the above re:Invent talk, slides available here: https://www.slideshare.net/AmazonWebServices/tooling-up-for-efficiency-diy-solutions-netflix-abd319-reinvent-2017

Indicative of its maturity, Netflix exhibited a high degree of self-awareness in this image. As a consumer-focused video streaming service, their organization’s focus is on creating and capturing market growth. They recognized how delivering product and service innovations was key to their overall success, and were consciously willing to live with some inefficiencies to make it happen. Recognition of this balance became a compass for the myriad fork-in-the-road decisions their team made daily.

Of course not everyone is Netflix, and a bank, a nonprofit strapped for cash, or a company that makes most of its revenue on just 2-3 days each year might have different priorities. There is no right or wrong shape to an organization’s priority spread – it’s only wrong to not be conscious of the balance for your org.

Another question to have in mind when looking at the radar graph is, “How good are we at providing for each of these things?” To those watching costs, it might be painful to see new services launched in a wastefully overprovisioned state because time-to-market trumped efficiency-minded engineering experiments. To product managers fighting to deliver for their customers, it might be frustrating to see development teams suffer through weeks of laborious security review processes to release a single change, due to the absence of trusted security automation.

Every decision an organization makes in conducting its business impacts the continuous set of tradeoffs being made between these priorities. The initial response to every new business challenge – whether it’s a new regulation, a new piece of technology, or a change in the market – should favor the characteristics most important to your business. Each initial response might come with unsatisfactory impact to other characteristics. It is through focused investment we can minimize the negative effects of these changes over time.

As tempting as it might be to strive for it, there’s no such thing as “being perfect at everything”, and that includes efficiency, the focus of this blog. What’s better, is to be aware of the pros and cons of each decision, and in your role as a champion for efficiency in the org, ensure the best mitigations are put in place to preserve and optimize efficiency in the appropriate balance with the other characteristics.

As shifts occur in the organization’s priorities, it is also important to be responsive and offer options and solutions to increase efficiency when appetite for a different set of tradeoffs becomes apparent. At the time of this writing (April 2020, in the midst of a global-economy-ravaging pandemic) many companies are re-addressing their prioritizations, and are more willing than ever to sacrifice a little bit of innovation, some reliability, and while it’s not always safe to admit it – yes, even some security – in the interest of efficiency gains.

How do we get executive buy-in to invest in the space?

A problem I’ve seen many groups encounter is difficulty in convincing leadership – particularly those holding the corporate checkbook – to commit to the resource investments required to really optimize their cloud. While every org and leader is unique, remember the world of cloud cost management and optimization is very deeply data driven. In contrast with other spaces which might be driven more by feel, and where the journey from investment to return can be fuzzy or unconvincing, the data set describing a non-optimized cloud tells a clear and rich story. Real dollar opportunities lie right on the surface, no hijinks required to show what is possible.

Think of how hard it is to generate true profit. How many people in sales, marketing, product development, and so on, have to work to secure each new customer? How much is left after the cost of goods sold? Cloud cost optimization offers a path from investment to return that’s faster, simpler, and more certain than just about any other material financial vehicle.

I know of a small, struggling enterprise software company where every million-dollar deal closed shakes the halls from the ensuing celebration. Yet this same company is hesitant to make any financial commitments to cloud use, despite the immediate effect on their corporate profitability being equivalent to closing several of those deals.
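The arithmetic behind that comparison is simple enough to sketch. A dollar of cloud savings falls straight to profit, while a dollar of new revenue only contributes its margin. The Python snippet below is a back-of-the-envelope illustration only: the $300,000 savings figure and the 10% profit margin are hypothetical numbers chosen for the example, not figures from the company described above, and `revenue_equivalent` is a made-up helper name.

```python
def revenue_equivalent(annual_savings, profit_margin_pct):
    """Revenue needed to produce the same profit as a given cost saving."""
    return annual_savings * 100 / profit_margin_pct


# Hypothetical inputs for illustration.
annual_savings = 300_000   # dollars saved per year through prepay commitments
profit_margin_pct = 10     # percent of each revenue dollar that ends up as profit
deal_size = 1_000_000      # dollars of revenue per deal

equivalent = revenue_equivalent(annual_savings, profit_margin_pct)
print(f"${equivalent:,.0f} in new revenue")              # $3,000,000 in new revenue
print(f"{equivalent / deal_size:.0f} deals of that size")  # 3 deals of that size
```

Under these assumed numbers, a $300,000 saving does the same work for the bottom line as three million-dollar deals. Plugging in your own margin and savings estimate is a quick way to frame the conversation with whoever holds the checkbook.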

Take it from Alex...
Coffee is for … savers?

It’s rare for the path from an investment to its return to be so simple and certain. Yet, because it is so new and unfamiliar to those in traditional Finance or Cloud Ops roles, many cloud customers leave these opportunities on the table.

If the above sounds anything like your situation, the next two installments of this blog will endeavor to arm you with the insights needed to sell your internal org on making the right moves to save big on its cloud spend. Beyond that, I intend to cover several other hot topics I often see as challenges in the industry:

  1. AWS billing and the CUR (Cost and Usage Report) Decoded
  2. Saving Transparently: Prepay Management Strategy (RIs, SPs)
  3. Finding, Tracking, and Holding Teams Accountable for Savings Opportunities (Waste)
  4. Handling Prepay Amortization through Rate Blending
  5. Automating Allocations to End Month-End Surprises
  6. EDP and Contract Negotiations
  7. Asset Attribution and CMDB: Are Tags the Only Way?
  8. Cloud Cost Forecasting

Something else you’d like to see covered? Email me (jason@cloudbombe.com) and I’ll fit it in!

Thanks for reading,