26 February 2008

Heuristically Thinking

You don't have to understand calculus to appreciate this entertaining story:

A teacher, trying to explain what a theory is, asked this question: “If you take a letter half the distance to a mailbox and stop, then start over going half the remaining distance and stop, then repeat the process over and over, theoretically will you ever really get to the mailbox?” One bright student said, “No, but you’ll get close enough to mail the letter.”

There are many definitions of what a heuristic is, but the one I like best is illustrated in this story: a heuristic is "close enough." I am beginning to believe more and more that what matters in software isn't building perfect algorithms that match and mimic real life in every way and hold up in every edge case. What matters most is getting a model that comes close enough. What matters are heuristics.

My favorite chemistry textbook has this to say on the topic of "The Kinetic Molecular Theory of Gases" (KMT):

However, although laws summarize observed behavior, they do not tell us why nature behaves in the observed fashion. This is the central question for scientists. To try to answer this question, we construct theories (build models). The models in chemistry consist of speculation about what the individual atoms or molecules (microscopic particles) might be doing to cause the observed behavior of the macroscopic systems (collections of very large numbers of atoms and molecules).

A model is considered successful if it explains the observed behavior in question and predicts correctly the results of future experiments. It is important to understand that a model can never be proved absolutely true. In fact, any model is an approximation by its very nature and is bound to fail at some point. Models range from the simple to the extraordinarily complex. We use simple models to predict approximate behavior and more complicated models to account very precisely for observed quantitative behavior. In this text we will stress simple models that provide an approximate picture of what might be happening and that fit the most important experimental results.

The textbook then goes on to lay out the postulates of this model, many of which, taken by themselves, are absolutely false, but which, taken together and used properly, create a system of thinking that works for a wide range of situations. This collection of half-truths produces a solution baked from half-truths, absolutely, but one that is indeed close enough.

For example, some of these "half-truths," or "simplifications" if you will, involve assuming things like: each molecule of gas is a perfect sphere, and every collision is perfectly elastic. Or worse, the volume of the individual molecules themselves is assumed to be zero! Individually, each of these three statements is categorically false, but they were the right bits to "design away": they framed the problem space so that the model could be simplified and made useful, and so that the problem of dealing with billions of particles became tractable.
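To make that concrete in code, here is a small sketch of my own (not from the textbook) that compares the pressure predicted by the zero-volume, no-attraction ideal gas model against the van der Waals equation, which puts those details back in. The constants for nitrogen are standard published values; the point is simply that the simple model lands close enough under everyday conditions.

```python
# Illustration only: how far off is the "zero volume, perfectly elastic,
# no attraction" ideal gas model compared to the van der Waals equation,
# which adds molecular volume and attraction back in?

R = 0.08206  # gas constant, L·atm/(mol·K)

def ideal_gas_pressure(n, T, V):
    """KMT-style model: molecules take up no volume and do not attract each other."""
    return n * R * T / V

def van_der_waals_pressure(n, T, V, a, b):
    """Corrected model: 'a' accounts for attraction, 'b' for molecular volume."""
    return n * R * T / (V - n * b) - a * n**2 / V**2

# van der Waals constants for nitrogen (standard published values)
a_n2, b_n2 = 1.39, 0.0391  # L²·atm/mol², L/mol

n, T, V = 1.0, 298.0, 24.4  # 1 mol of N₂ at room temperature in a 24.4 L container
print(ideal_gas_pressure(n, T, V))                  # ~1.002 atm
print(van_der_waals_pressure(n, T, V, a_n2, b_n2))  # ~1.001 atm — close enough
```

Only when the gas is squeezed into a small volume or chilled toward condensation do the two models diverge enough to matter, which is exactly the "bound to fail at some point" the textbook warns about.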

To me, this is more than simple object-oriented encapsulation and abstraction. This is thinking about the whole problem differently. It's about looking at individual behaviors from a very small sample and knowing, by some spark of genius, which attributes matter to the ultimate outcome of the system as a whole and which do not. This is writing software that is able to predict things: test software, for example, that can predict when "something good" has happened and when "something bad" has occurred. This is about designing software by building models that categorically do not reflect the real complexities of the system, but that, taken together, get "close enough" to do real work in the real world. Ultimately your models will fail when pushed to their limits, but even just understanding those limits will help you better understand the problem you are tasked with solving! It even raises the question: what kind of programming language best allows for the definition and use of heuristic models?
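Here is one toy sketch of what I mean by a heuristic test oracle (my own illustration, nothing more): instead of demanding an exact answer, the test decides that "something good" or "something bad" happened based on a close-enough threshold.

```python
# A toy heuristic test oracle (illustration only): rather than asserting an exact
# result, decide "good" or "bad" by whether the observed value lands close enough
# to the model's prediction.

def close_enough(observed, predicted, tolerance=0.05):
    """Return True if observed is within `tolerance` (relative) of the prediction."""
    if predicted == 0:
        return abs(observed) <= tolerance
    return abs(observed - predicted) / abs(predicted) <= tolerance

def check(observed, predicted):
    verdict = "something good" if close_enough(observed, predicted) else "something bad"
    print(f"observed={observed}, predicted={predicted}: {verdict}")

check(1.001, 1.002)  # something good — the model was close enough
check(3.7, 1.002)    # something bad — time to look closer
```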

I don't know where I first heard this, but someone once said something like, "Where Microsoft codes if statements, Google codes in Bayesian probabilities." There is more data, and more variation in that data, than ever before. If you are going to write or use programs (very likely) that deal with large amounts of data (also very likely), it might be a good idea to get used to thinking about things in heuristic terms. It may not be perfect in every case, but it will be close enough.
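To put a little code behind that quote (a toy example of my own, not anything Google or Microsoft actually does), here is the difference in miniature: a hand-coded if statement versus a tiny naive Bayes classifier that learns word counts from a handful of labeled messages and returns a probability instead of a verdict.

```python
# Toy contrast (illustration only): a rule as an if statement vs. a tiny naive
# Bayes classifier that turns word counts from labeled examples into a probability.
from collections import Counter
import math

def rule_based_is_spam(message):
    # The "if statement" approach: brittle, but easy to read.
    return "free money" in message.lower()

class NaiveBayes:
    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, message, label):
        self.message_counts[label] += 1
        self.word_counts[label].update(message.lower().split())

    def p_spam(self, message):
        # Log-space naive Bayes with add-one smoothing.
        scores = {}
        total = sum(self.message_counts.values())
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        for label in ("spam", "ham"):
            score = math.log(self.message_counts[label] / total)
            denom = sum(self.word_counts[label].values()) + len(vocab)
            for word in message.lower().split():
                score += math.log((self.word_counts[label][word] + 1) / denom)
            scores[label] = score
        # Convert log scores back to a probability of spam.
        m = max(scores.values())
        exp = {k: math.exp(v - m) for k, v in scores.items()}
        return exp["spam"] / (exp["spam"] + exp["ham"])

nb = NaiveBayes()
nb.train("free money click now", "spam")
nb.train("win free prize money", "spam")
nb.train("lunch meeting at noon", "ham")
nb.train("project status and meeting notes", "ham")

print(rule_based_is_spam("totally free prize inside"))   # False — the rule misses it
print(round(nb.p_spam("totally free prize inside"), 2))  # ~0.88 — probably spam
```

The rule is exact about the wrong thing; the probabilistic model is approximate about the right thing, and on messy data, approximate and right usually beats exact and wrong.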

1 comment:

David Weiss said...

I found where I think I heard that comment first. It was Joel Spolsky who said:

A very senior Microsoft developer who moved to Google told me that Google works and thinks at a higher level of abstraction than Microsoft. "Google uses Bayesian filtering the way Microsoft uses the if statement," he said.