Tuesday, April 7, 2009

Bayesian AI Advisor - Drill Here? Drill Now?

Today, April 7, is the deathday of Rev. Thomas Bayes, who was born c. 1702 and passed away April 7, 1761.

His great contribution to mathematics was Bayes' Theorem, seen on the t-shirt to the left. Bayes came up with what is called inverse probability at a time when only forward probability was generally known.

In short, if you know the probability of some event B given some other event A, and if you also know the probabilities of A and B, you can figure out the inverse probability of A given B.

Many people think -incorrectly- that forward and inverse probabilities are the same. That is, if a given test detects, say, 99.9% of people who have used illegal drugs recently, they think anyone who fails that test is 99.9% certain to have used illegal drugs. NOT SO! In some cases, "false positives" can outnumber "true positives" by a factor of two or more.
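To see how this plays out, here is a quick back-of-the-envelope calculation in Python. All of the numbers (sensitivity, false-positive rate, prevalence) are made up for illustration:

```python
# Hypothetical numbers: the test catches 99.9% of users (sensitivity),
# wrongly flags 1% of non-users (false-positive rate), and 0.5% of the
# tested population are actually users (prevalence).
sensitivity = 0.999
false_positive_rate = 0.01
prevalence = 0.005

population = 100_000
users = population * prevalence          # 500 actual users
non_users = population - users           # 99,500 non-users

true_positives = users * sensitivity                  # 499.5
false_positives = non_users * false_positive_rate     # 995.0

# False positives outnumber true positives by about two to one,
# so the inverse probability P(user | positive test) is far from 99.9%:
p_user_given_positive = true_positives / (true_positives + false_positives)
print(f"{p_user_given_positive:.1%}")  # prints 33.4%
```

With these assumed numbers, a person who fails the test has only about a one-in-three chance of actually being a drug user.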

Here is a great example of how much forward and inverse probabilities may differ. The probability a person is female given that they are pregnant is P(F|P) = 100%. But the probability a person is pregnant given that they are female is P(P|F) ~ 3%.
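The same arithmetic, stated as Bayes' Theorem in a few lines of Python (the 1.5% pregnancy figure is an assumed round number for illustration, not a statistic):

```python
# Bayes' Theorem: P(P|F) = P(F|P) * P(P) / P(F)
p_f_given_p = 1.0   # every pregnant person is female
p_p = 0.015         # P(pregnant) -- assumed figure for illustration
p_f = 0.5           # P(female)

p_p_given_f = p_f_given_p * p_p / p_f
print(f"{p_p_given_f:.0%}")  # prints 3%
```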


Bayesian probability means, using oil exploration as a practical example, you can figure out the probability you will strike commercially-viable oil if you drill a well at a particular location given that a seismic test was positive. With the value of petroleum going up and down so rapidly in the past few years, this illustrates the fine line between making a big profit and going broke in the oil patch.
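As a sketch of that calculation, here is the posterior probability of striking oil given a positive seismic test. Every number here is an assumption chosen for illustration, not real geology:

```python
# Assumed inputs (for illustration only):
p_oil = 0.10               # prior: commercially-viable oil at this site
p_pos_given_oil = 0.80     # chance the seismic test is positive over oil
p_pos_given_dry = 0.20     # chance it is positive over a dry hole

# Total probability of a positive test, then Bayes' Theorem:
p_pos = p_pos_given_oil * p_oil + p_pos_given_dry * (1 - p_oil)
p_oil_given_pos = p_pos_given_oil * p_oil / p_pos
print(f"{p_oil_given_pos:.1%}")  # prints 30.8%
```

Even with a positive seismic result, the chance of striking oil under these assumptions is only about 31% -- much better than the 10% prior, but nowhere near the 80% "forward" figure.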


Some years ago I created a Bayesian AI ("Artificial Intelligence") Advisor spreadsheet that runs on Excel. I've recently improved the Bayesian AI Advisor and today I published a new Google Knol that explains its use. You are invited to download the Bayesian AI Advisor to your PC.

Of course, Bayesian probability applies to many areas in addition to oil exploration. My Knol looks at targeted marketing (what is the chance a given person will buy my product given that he or she has bought some other product in the past?) and medical testing (what is the chance a person has a particular disease given that he or she tests positive?)


The figure shows three cases for oil exploration with three different recommendations. I think this illustrates the risk of oil exploration, and that the same approach could be applied to the financial implications of investment in alternative forms of energy.

The user has to input the data in the clear cells, based on known probabilities and financial factors. The leftmost panel shows a case where the Bayesian AI Advisor recommends Test first, If Test is Positive, do the Procedure - in other words, do a seismic test and, if the results are positive, drill.

The middle panel shows the case where all factors are the same except that the value of petroleum has gone down by 15%, and now the recommendation is Hopeless venture. (Can you reduce expected ROI?) - in other words, do not waste money doing the seismic testing because, even if you get a positive result, it will not pay to do the drilling given the financial assumptions you have entered, unless you are willing to increase the risk to your investors by reducing the expected Return on Investment.

The rightmost panel shows the case where, in a different location in the oil patch, the probabilities of success are much better. Now the recommendation is No Need to Test. Go ahead with the Procedure. - in other words, this area is so good you don't have to waste money doing seismic testing, just go right ahead and drill and you are likely to strike commercially-viable oil and get rich.
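For readers who want the flavor of the logic without downloading the spreadsheet, here is a hypothetical sketch of the kind of expected-value comparison behind the three recommendations. The formulas and all of the numbers are my simplified assumptions for illustration, not the Advisor's actual internals:

```python
def recommend(p_success, p_success_given_pos, p_pos,
              payoff, drill_cost, test_cost, required_roi):
    """Pick a strategy by comparing expected return on money at risk.
    A simplified sketch -- the spreadsheet's actual formulas may differ."""
    # Strategy 1: drill without testing.
    ev_drill = p_success * payoff - drill_cost
    if ev_drill / drill_cost >= required_roi:
        return "No Need to Test. Go ahead with the Procedure."

    # Strategy 2: test first, drill only on a positive result.
    # Money at risk: the test cost, plus the drilling cost weighted by
    # the chance we actually end up drilling.
    ev_test = p_pos * (p_success_given_pos * payoff - drill_cost) - test_cost
    at_risk = test_cost + p_pos * drill_cost
    if ev_test / at_risk >= required_roi:
        return "Test first, If Test is Positive, do the Procedure."

    return "Hopeless venture. (Can you reduce expected ROI?)"

# Leftmost panel: marginal site, so testing first pays (numbers assumed).
print(recommend(0.10, 0.40, 0.25, payoff=20, drill_cost=3,
                test_cost=0.2, required_roi=1.0))
# Middle panel: same site with the payoff down 15% -- hopeless venture.
print(recommend(0.10, 0.40, 0.25, payoff=17, drill_cost=3,
                test_cost=0.2, required_roi=1.0))
# Rightmost panel: rich site -- no need to test, just drill.
print(recommend(0.60, 0.90, 0.70, payoff=20, drill_cost=3,
                test_cost=0.2, required_roi=1.0))
```

Note how a single change to one input (the payoff) flips the recommendation from "test first" to "hopeless," which is exactly the fine line between profit and going broke described above.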

Ira Glickstein


joel said...

Thanks Ira. That was an interesting explanation of the Bayesian approach. The fact that you call it the "AI" tutor reminds me of the fact that one of my AI friends used to say that a problem is artificial intelligence until you solve it. Then it's just an algorithm.

I used to teach some of this decision making stuff in a robotics course. An interesting application is in mixed products on an assembly line. Suppose a single assembly line factory makes plywood from two different species of wood. Only when the sheets have passed through the entire process are they segregated into packages labelled according to species. The automated equipment measures the brightness or reflectance of the wood in order to "guess" the species. If we make an additional automatic measurement of the coarseness of the grain, the machine can make a much more accurate composite guess using Bayes' approach to conditional probability. What is important in this method is that we have data on the probability of any given brightness for each of the wood species and the probability of any given coarseness for each of the species. What is absolutely crucial is that we have the "a priori" probability of each of the species occurring. (In many real-world cases this is not available.) In the tee-shirt case one has this data from the bookstore sales. In the plywood factory, one has the number of trees of each type that enter the factory. I like to say that we know the "parent distribution."
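A toy version of this composite guess in Python (the species names, priors, and sensor probabilities are all made-up numbers; the priors play the role of the "parent distribution"):

```python
# Assumed "parent distribution": counts of trees entering the factory.
priors = {"birch": 0.7, "oak": 0.3}

# Assumed sensor likelihoods: P(bright | species) and P(coarse | species),
# treated as independent given the species.
p_bright = {"birch": 0.8, "oak": 0.3}
p_coarse = {"birch": 0.2, "oak": 0.9}

def posterior(bright, coarse):
    """P(species | brightness reading, coarseness reading) via Bayes."""
    scores = {}
    for s in priors:
        pb = p_bright[s] if bright else 1 - p_bright[s]
        pc = p_coarse[s] if coarse else 1 - p_coarse[s]
        scores[s] = priors[s] * pb * pc
    total = sum(scores.values())           # normalizing constant P(readings)
    return {s: v / total for s, v in scores.items()}

# A bright, fine-grained sheet: the composite guess is almost surely birch.
print(posterior(bright=True, coarse=False))
```

Combining the two measurements sharpens the guess far beyond what either sensor gives alone, because the evidence multiplies.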

None of this actually addresses the intelligent part of the decision making problem. Someone has to set goals. One might minimize error, because customers get upset when we deliver the wrong species. One can maximize short run profit. One can maximize long term profit. In short, one needs a business model and/or a corporate ethos. The same is true in your oil well model, only it's hidden behind the "return on investment" factor. That has to come from humans based upon such psychological factors as level of greed, tolerance for risk, whether or not they have a happy marriage, and their feelings about the stability of the Middle East.

My point is that algorithms are essential, but results most often depend upon blind faith and intuition about things beyond our ken.

With respect -Joel

Ira Glickstein said...

Thanks for your comments and I agree that once an "AI" problem is solved it is an algorithm. However, one of my Profs at Binghamton U. was a "computationalist" and claimed that our brains, when functioning as thinking machines with input-process-output, were also just (very complex) algorithms (based on the Turing idea of the Universal Computer).

Your example of using a Bayesian algorithm for sorting products according to the species of wood is a job that, until a couple decades ago, absolutely had to be done by a human. Now it can be done by a computer using an algorithm - faster, more accurately, and more cheaply.

Of course, in the case of the computer decision making, someone had to program the computer with the Bayes algorithm and much more. On the other hand, in the olden days when a human did the sorting, someone had to explain to him how to recognize different species of tree (and before he reported to work he had to learn language and many other things as he was growing up).

Of course, the computer can only do the specific job programmed (sorting wood), and could not, for example write poetry or make love to a woman, etc. But neither can some humans!

Ira Glickstein