It's All About the White Rats


No, this is not about "White Hats" - security hackers who try to break into systems in order to strengthen them, as opposed to "Black Hats" - but really about what we can learn from white rats.

In the last few weeks, I have helped solve a number of vexing problems on behalf of customers, both in technology and process. Each time I am asked how I do it, and each time the answer is the same. I learn all of the parts of the system, understand how they interact and what drives them, and then inspect each part to look for expected and unexpected behaviours.

Of course, it is never that simple. Doing so requires knowing a lot of different technologies, processes and human behaviours, having seen them in lots of different scenarios, and having enough experience to know which one to use. It also requires enough patience on the part of the customer to give me a chance to truly understand the problem.

This week, a close old friend of mine, who unfortunately is fighting cancer, used a great analogy in his blog to describe the problem-solving methodology I use: white rats. He quoted an exchange from the "West Wing" TV series:

“Let me ask you this: red meat has been found to cause cancer in white rats.

“Maraschino cherries have been found to cause cancer in white rats.

“Cellular phones have been found to cause cancer in white rats.

“Has anyone examined the possibility that cancer might be hereditary in white rats?”

This, in a nutshell, is my method of problem-solving for my clients - combined with enough experience to ask the right question and know how to get the answer.

It is all about gathering sufficient data of the right type to ask the right questions. The right question here isn't, "does red meat cause cancer in white rats?" nor is it "do cellular phones cause cancer in white rats?" Rather, it is "is cancer hereditary in white rats?"

If your cloud service is behaving poorly, whether in scale, functionality or any other respect, the right question isn't, "how do we fix it?" or even "how can we prevent it?" The right question is, "what are the inputs when it behaves as expected, and what are the inputs when it behaves differently?" To answer that question, you need to know all of the inputs, what the expected behaviour is, and how to gather and examine all of the relevant inputs. Once you know why it misbehaves, fixing it and preventing future issues becomes much simpler: a matter of project management.
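To make that question concrete, here is a minimal sketch in Python. It assumes you have already recorded, for each request, the inputs that were present and whether the service behaved as expected. The input names (`region`, `client`, `cache`), the sample data and the function name are all hypothetical, made up purely for illustration. The sketch only ranks input values by how often they coincide with unexpected behaviour; it shows where to look first and does not prove a cause.

```python
from collections import defaultdict

# Hypothetical observations: (inputs present, did it behave as expected?)
observations = [
    ({"region": "eu", "client": "2.0", "cache": "warm"}, True),
    ({"region": "us", "client": "2.0", "cache": "cold"}, True),
    ({"region": "eu", "client": "2.1", "cache": "warm"}, False),
    ({"region": "us", "client": "2.1", "cache": "cold"}, False),
    ({"region": "us", "client": "2.0", "cache": "warm"}, True),
    ({"region": "eu", "client": "2.1", "cache": "cold"}, False),
]


def rank_suspect_inputs(observations, min_samples=2):
    """Rank (input name, value) pairs by their rate of unexpected behaviour."""
    # (name, value) -> [unexpected count, total count]
    counts = defaultdict(lambda: [0, 0])
    for inputs, behaved_as_expected in observations:
        for name, value in inputs.items():
            entry = counts[(name, value)]
            entry[1] += 1
            if not behaved_as_expected:
                entry[0] += 1

    # Ignore values seen too rarely to say anything about them.
    rates = [
        (key, unexpected / total)
        for key, (unexpected, total) in counts.items()
        if total >= min_samples
    ]
    # Highest rate of unexpected behaviour first.
    return sorted(rates, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for (name, value), rate in rank_suspect_inputs(observations):
        print(f"{name}={value}: unexpected in {rate:.0%} of observations")
```

In this made-up data the region and the cache state each show up in some of the bad requests, much like red meat and cherries, but only one client version is present in every one of them. That is the input worth understanding first.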

Similarly, if your people are not responding to requests quickly enough, you need to understand what they are doing, what they are expecting to do, what gets in their way, and what activities you are actually rewarding them for. Most important of all, though, you need to understand all of the parts of the system and how others work, why some create better outcomes, and then begin to map them to your needs.

In my years hiring and managing technologists, I always said I wanted "engineers", not "administrators" or "developers" or "programmers." The reasoning was simple. Engineering is a mindset, a way of thinking that wants to understand how things work under the covers, and only then how to use them.


Don't try to solve your problems by fixing things or making them better; first try to understand why they work the way they do, and what they do under the covers. Apply methodical critical thinking to your technology systems, your people, your processes and your product. Make sure the people doing the analysis have deep and broad experience well beyond your own systems.

And when you are ready to make your technology and the people who support it scale better and faster... call us.