Archive for May, 2010

On the use and abuse of AI

May 29th, 2010 | Category: Uncategorized

One of the most exciting realms of computer science, at least in my view, is the bipolar field of artificial intelligence. Bipolar, I say, because of its tremendous ups and downs that seem to cycle like the seasons (not surprisingly, the periods when the hype surrounding the field fails to live up to its promise, and funding dries up as a result, are called AI winters). During each new cycle, some models that were discarded as useless and naive in the previous iteration of innovation re-emerge in more mathematically sound incarnations that add the necessary rigor to produce meaningful results. One of the greatest examples of this type of behavior is the story of the perceptron, a predictor function based on a simple threshold value that roughly models how a neuron reacts to a given stimulus. The perceptron was heavily criticized in the late '60s by the highly cited work of Marvin Minsky, and it was more or less abandoned until the return of AI during the '90s, when the concept of neural networks became formalized and empowered through backpropagation learning techniques.
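To make the idea concrete, here is a minimal sketch of that threshold predictor and its classic learning rule. The AND-function example, the learning rate, and the epoch count are my own illustrative choices, not anything from the original story:

```python
# A perceptron "fires" (outputs 1) when the weighted sum of its inputs
# crosses a threshold, loosely mimicking how a neuron reacts to a stimulus.

def predict(weights, bias, x):
    """Threshold unit: output 1 iff the weighted stimulus exceeds zero."""
    s = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if s > 0 else 0

def train(samples, labels, epochs=20, lr=0.1):
    """Classic perceptron learning rule: nudge the weights on each mistake."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            error = y - predict(weights, bias, x)
            if error:  # misclassified: move the separating hyperplane
                weights = [w + lr * error * xi for w, xi in zip(weights, x)]
                bias += lr * error
    return weights, bias

# The linearly separable AND function is learnable; XOR, famously, is not
# learnable by a single perceptron -- which was Minsky's central criticism.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [0, 0, 0, 1]
w, b = train(X, y)
print([predict(w, b, x) for x in X])  # reproduces the AND labels
```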

We are now living inside one of those AI summers (the opposite of the AI winters), precisely the point when the field starts gathering hype about what it is capable of doing. Like the resurrection of the perceptron, we are now experiencing the resurrection of deep network architectures. You see, quite a few years back, when neural networks were proven effective for some pretty neat tasks (such as automatically recognizing the handwriting in your documents), all kinds of research spawned around them. One of these threads of investigation was the construction of multi-layered networks, which gained capabilities as more layers were added (well, sort of; there are some technical issues related to "over-training" the network, that is, fitting the training data so closely that it generalizes poorly). However, as more layers were added, the complexity of the learning algorithm increased, to the point that it became infeasible. Thus, deep architectures were forgotten for some time, until recent advancements in deep belief networks made them trendy once more.

Another characteristic of the cyclic nature of AI is the number of its applications in fields other than pure computer science. When there is a burst of innovation in AI, and AI is trendy enough, people all around science and engineering want to apply the new methods to their problems. Sadly, most of them fail (yep, most of them, because they haven't got the slightest clue as to how the method truly works... this is in part because the innovation is so, well, new, that even the inventors of the method do not have a complete view of its limitations), but those that do succeed more often than not rebrand the method and make it a subfield of their field (OMG! This support vector machine method works really great for predicting climate changes; let's tweak it and say that we are doing, err, non-linear adaptive meteorology), and the credit to the original method gets a bit lost in the trail of application. Both effects combined take important credibility away from the AI folks, who are most of the time too occupied trying to make the toaster talk to them, or teaching their car to park itself in Dukes-of-Hazzard fashion, to even notice (until funding suddenly starts faltering, in which case the guy with the poetic toaster swears revenge, which leads to flying, MIG-equipped toasters and other crazy stuff).

Being in a field that makes heavy use of AI, I cannot help but notice the difference between doing science with these powerful statistical and heuristic tools (which I like to call meta-modeling) and doing it in the old-school Newton-Hilbert-inspired framework. Perhaps because of my pure math bias in higher education, I consider standard modeling the premier, elegant way of describing the world. I definitely enjoy more a solution that counts the number of alveoli we have through some powerful group-theoretic theorem than one that classifies and analyzes all available images of human lungs with some machine learning method. Elegance is a compact approach that masterfully brings down a problem with seemingly little effort, and most AI applications lack this (note that the model per se may be quite elegant, but the application may not be). Thus, to my mind, AI should be used as the tool that helps you leverage your problem when standard, classical models fail or are too limiting (one great example is the prediction of the variation of RNA splicing across tissues, a problem into which we have so little insight and which was recently addressed using a complex code selected through AI techniques and models). It does not entitle you, however, to abuse it by applying it to everything you come across. This is especially true in the biological domains. Even if you are versed enough in the art, even if you know how to pick the right set of features to make things work, such a model should be used either as a first rough draft to better understand the problem, or as a last resort when things aren't really working out (and will most probably fail with AI as well, but hey, it is worth a shot, no?). It is more intellectually satisfying (and challenging) to come up with a simple equation, relation or whatever that faithfully describes your system than to use a computational powerhouse to predict your system's outcomes.