Yesterday, I had an enjoyable late-evening conversation with a colleague of mine, a first-class information security and compliance consultant. We have collaborated on several projects in the past, and it is always a pleasure working with him (contact me if you need one).
One of the issues we discussed was why so many companies feel their infrastructure costs, both data centre and cloud, are too high. Of course, “feel” and “too high” are very subjective and relative terms. Admittedly, cloud does provide much more flexible cost management, with the ability to scale up and down to meet variable needs and so reduce wastage, but even there the costs are at times simply “too high.”
At heart, data centre, cloud, or any infrastructure costs are due to three key factors. I have seen and mitigated all three of them at companies:
- Vendor selection and agreements
- Capacity planning
- Architecture
We will explore each in turn.
Sometimes, you have a vendor who simply is overcharging you. In the worst cases, they also are providing you with terrible service. This is more likely with colocation providers, where pricing never is transparent, than with cloud providers, where the costs are visible and open to everyone. However, even cloud providers with transparent pricing will negotiate aggressive pricing when a deal is sufficiently large.
In several instances I have either renegotiated a colocation agreement or planned and executed a move to an alternate provider, saving up to 50%, and sometimes more, while getting better service.
Why do colocation providers behave this way? Lock-in. Once you have 2, 3, 10 or more racks along with multiple cross-connects in their data centre, your cost to move can be extremely high. They know it, and safely charge a premium. Unfortunately, some become abusive, knowing the “battered customer” will keep coming back for more.
How do you avoid becoming entrapped with a vendor?
- Invest in comprehensive due diligence up front. No matter how small your deployment, an infrastructure agreement is a long-term partnership.
- Structure a long-term contract. Even if you will deploy only 1 rack, if there is the slightest chance of growing to three times that size or more, negotiate the pricing for that growth before you are locked in.
- Evaluate at least annually. Most importantly, make sure your software and infrastructure teams are part of the evaluation. The more they understand that you are not committed to this provider ad infinitum, the more they will build you a service that is less painful to move.
- Architect to move. From day one, design everything under the assumption that you will need to move tomorrow. Of course, that is hard to do. Of course, a startup has neither the time nor the resources to build it perfectly on day one. But always keep it as a goal. The thorniest move problems are best solved at the beginning.
As discussed earlier, cloud providers are far less likely to entrap you. Not only does their business model require them to have transparent pricing, but the very nature of the technology makes it much easier to move. Dedicated hardware and 25 cross-connects at Equinix are far harder to move than 3x 1Gbps Internet pipes, a VPN and 200 virtual servers at Amazon or Rackspace.
Let’s be very clear: an infrastructure vendor who is overcharging is a financial cost; bear it or boot them. An infrastructure vendor who provides subpar service is a strategic risk to your company; leave quickly.
How much capacity (storage, CPU, memory, network) do you really need? How much do you pay for? Companies always need to balance the cost of business lost to insufficient capacity against the cost of maintaining spare capacity.
If 90% of the time you process 100 transactions per second (tps), and 10% of the time you process 1,000 transactions per second, do you build for 100 tps or 1,000 tps? Will you pay for 10 times the servers, storage and network just to handle the 10% case? What if it is 1% of the time? 0.1%? What if those transactions are of a higher value?
If you are in the cloud, this is less of an issue… if you have sufficient warning time. If it takes you 2 minutes to launch extra instances, but you spike from 100 tps to 1,000 tps in 10 seconds, you have the same spare capacity problem that your peers buying dedicated hardware have.
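To put rough numbers on that scenario, here is a minimal sketch. It assumes demand ramps linearly from the base rate to the peak, that scale-out is triggered the instant the ramp starts, and that no new capacity is usable until the launch completes. The function name and the linear-ramp model are illustrative assumptions, not measurements from any provider.

```python
def transactions_lost_during_scale_up(base_tps, peak_tps, ramp_s, launch_s):
    """Estimate transactions arriving above capacity before new instances are up.

    Assumptions (all hypothetical): demand rises linearly from base_tps to
    peak_tps over ramp_s seconds, capacity stays at base_tps until launch_s
    seconds have passed, and anything above capacity is lost.
    """
    if ramp_s <= 0:
        raise ValueError("ramp_s must be positive")
    extra = peak_tps - base_tps
    if launch_s <= ramp_s:
        # New capacity arrives while demand is still climbing:
        # area of the triangle under the partial ramp.
        return 0.5 * extra * launch_s ** 2 / ramp_s
    # Triangle for the ramp, plus a rectangle for the time spent at peak.
    return 0.5 * extra * ramp_s + extra * (launch_s - ramp_s)


# The example from the text: 100 -> 1,000 tps in 10 seconds, 2-minute launch.
lost = transactions_lost_during_scale_up(100, 1000, ramp_s=10, launch_s=120)
print(f"Transactions above capacity before scale-out completes: {lost:,.0f}")
```

Under these simplified assumptions, the example works out to about 103,500 transactions arriving while you wait for instances to launch, which is why warning time matters as much as elasticity.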
Every single business struggles to find the balance between wasted capacity and lost transactions.
Here is an important secret: there is no one answer for everyone. Only you, in your business, can evaluate the costs of one versus the other and make a business decision about the technology.
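One way to frame that evaluation is a break-even calculation: how much must a transaction be worth before the spare capacity pays for itself? The sketch below is only an illustration. The $18,000 monthly cost for the extra capacity is a made-up figure, the 30-day month is a simplification, and the function name is hypothetical.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # simplification: assumes a 30-day month


def break_even_value_per_transaction(extra_monthly_cost, shortfall_tps, peak_fraction):
    """Return the per-transaction value at which spare capacity pays for itself.

    extra_monthly_cost: what the additional capacity costs per month (assumed)
    shortfall_tps:      transactions per second you would lose at peak without it
    peak_fraction:      fraction of the time you are at peak (0.10 for 10%)
    """
    if shortfall_tps <= 0 or peak_fraction <= 0:
        raise ValueError("shortfall_tps and peak_fraction must be positive")
    lost_per_month = shortfall_tps * peak_fraction * SECONDS_PER_MONTH
    return extra_monthly_cost / lost_per_month


# Building for 100 tps instead of 1,000 tps loses 900 tps during the 10% peak.
value = break_even_value_per_transaction(
    extra_monthly_cost=18_000,  # hypothetical cost of the extra 900 tps
    shortfall_tps=900,
    peak_fraction=0.10,
)
print(f"Spare capacity pays off above ${value:.6f} per transaction")
```

If your transactions are worth more than the break-even value, the spare capacity is cheaper than the lost business; if they are worth less, it is not. The inputs are yours to supply.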
However, a good analysis can determine whether you have the right capacity for your own business needs.
Continuing our previous example, what if you decide that you are willing to build for 600 tps? You accept overprovisioning of 500 tps most of the time (the 600 you build minus the 100 you always have) and the loss of the extra 400 tps during peak times.
That is a perfectly fine decision to make. But have you actually provisioned correctly for 600 tps? Once you have decided how much to build for and how much to walk away from, you still need to figure out how to meet that capacity. It is very common for organizations to overbuild even for that capacity.
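The 600 tps trade-off can be tallied for any build size. The sketch below is a simple illustration: it takes the 90%/10% load profile from the example and reports the time-weighted average idle capacity and dropped demand. The function and variable names are hypothetical.

```python
def capacity_tradeoff(build_tps, load_profile):
    """Return (average idle tps, average dropped tps) for a given build size.

    load_profile is a list of (fraction_of_time, demand_tps) pairs whose
    fractions are expected to sum to 1.
    """
    idle = 0.0
    dropped = 0.0
    for fraction_of_time, demand_tps in load_profile:
        # Capacity you pay for but do not use in this slice of time.
        idle += fraction_of_time * max(build_tps - demand_tps, 0)
        # Demand you cannot serve in this slice of time.
        dropped += fraction_of_time * max(demand_tps - build_tps, 0)
    return idle, dropped


# Load profile from the example: 100 tps 90% of the time, 1,000 tps 10%.
LOAD_PROFILE = [(0.90, 100), (0.10, 1000)]

for build_tps in (100, 600, 1000):
    idle, dropped = capacity_tradeoff(build_tps, LOAD_PROFILE)
    print(f"build {build_tps:>5} tps: avg idle {idle:6.1f} tps, avg dropped {dropped:5.1f} tps")
```

These are averages over time, so the 600 tps build shows roughly 450 tps idle (500 tps for 90% of the time) and 40 tps dropped (400 tps for 10% of the time). Attaching a cost to each column is the business decision.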
Your architecture determines your infrastructure requirements. Yes, nowadays hardware is “cheap”. The cost of optimizing software, except in latency-sensitive high performance computing (HPC) cases, rapidly hits the point of diminishing returns. Often, one more CPU core and an extra 1 GB of memory are much cheaper than the engineering time.
It is important to distinguish, though, between optimizing and architecting.
- Optimizing your software saves some CPU, memory, network or storage.
- Architecting eliminates entire transactions and components.
Additionally, as you scale, the “minor” returns from the optimizations or larger returns from architecting add up much more quickly. Saving 5% or even 25% on your monthly $2,000 colocation or Amazon bill is nothing compared to your labour costs of optimizing or architecting. If your monthly bill is $50,000 or more, 25% adds up very quickly.
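A quick payback calculation shows the effect of scale. This is a sketch only: the $30,000 one-off labour cost is an invented figure for illustration, and the function name is hypothetical.

```python
def payback_months(monthly_bill, saving_fraction, labour_cost):
    """Months of savings needed to recover a one-off engineering cost."""
    monthly_saving = monthly_bill * saving_fraction
    if monthly_saving <= 0:
        raise ValueError("the change must save something each month")
    return labour_cost / monthly_saving


LABOUR_COST = 30_000  # hypothetical one-off cost of the optimization work

# The two monthly bills mentioned in the text, each with a 25% saving.
for bill in (2_000, 50_000):
    months = payback_months(bill, saving_fraction=0.25, labour_cost=LABOUR_COST)
    print(f"${bill:,}/month bill: payback in {months:.1f} months")
```

With that assumed labour cost, the same 25% saving takes five years to pay back on the $2,000 bill and under three months on the $50,000 bill.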
Your architectural choices determine your infrastructure choices and hence your operational model:
- How do you deploy your application components?
- Do you have a monolithic design? Separate services? Microservices?
- Do customers access your app over the Web? Desktop apps? Are you hosting legacy desktop apps over Citrix for them?
- Do you have a single data store? Dedicated ones for each service? Different types for different use cases? Caches?
- Which languages are your services written in? Do you have the best ones for each use case? Low-latency workloads differ from I/O-bound ones, which in turn differ from memory- and CPU-intensive report generation.
Every one of these questions, and many more, determines your infrastructure needs, your usage efficiency, and ultimately your costs, both infrastructure and labour.
Colocation or cloud costs are among the highest expenses in a SaaS company. Getting them right means having the right combination of:
- Vendor selection and agreements
- Capacity planning
- Architecture
Once done right, your infrastructure costs, data centre or cloud, are more aligned with your business. Getting your architecture right not only reaps huge benefits for your infrastructure costs, but also reduces your deployment costs, shrinks your maintenance windows, improves service reliability, speeds your time to market, and makes your team much, much happier.
Do you have the right vendors? How well suited are the agreements to your current needs? How well do you manage capacity? Are you losing customers unnecessarily? Or are you paying for extra capacity you really do not need?
Finally, when was the last time you took a good, hard look at your architecture, including software engineering, infrastructure, customer support, and finance? Are you 100% confident that your design gives you the best availability, resiliency, supportability, nimbleness and bang for the buck?
We love helping companies get better and faster at lower cost. Ask us to help you.