Starting from Scratch: The Basic Building Blocks of AI

Born to be Problem Solvers

As far back as Alan Turing and the World War II era, people realized that computers could be more than just machines that crunched numbers. Scientists were interested in having these machines solve real problems. “As early as the 1970s, people were talking about grand visions, having computers thinking like humans, moving and solving problems in a human-like way. That turned out to be vastly overambitious,” Dr. MacCormack remarks. These pioneers discovered that computers cannot so easily be programmed for ‘original thought’.

The Invisibility of AI Today

For those born into a world of ubiquitous computers, screen time is part of the landscape, just as cars and skyscrapers were a given for earlier generations. Despite scientists’ premature visions, the progress made in AI over the last few decades has completely changed the face of society, and computers do a lot of amazing things that we take for granted every day.

Take Skype, for example, the technology used to record our interview. “Skype is something that you’re using without necessarily understanding all the components; most people wouldn’t even regard Skype as having anything to do with artificial intelligence. That’s part of the point,” notes MacCormack. As soon as something becomes easy for computers to do, we no longer consciously think of it as an intelligent function.

“A more common example would be chess playing, one of the classic AI challenges...we don’t even acknowledge that as genuine artificial intelligence anymore.” Most people would probably assume that the computer is just doing a lot of number crunching and running through possible computations, which is really just another term for ‘cheating’, right? This is a powerful idea when you consider that, as John puts it, “the bar moves every time computers and algorithms get bigger; we move the bar as to what we consider to be genuine artificial intelligence.”

Deconstructing the Magician’s Hat

Most of us love a good magic show. We “ooh” and “aah” at the magician’s ability to read our minds, or to pull out a rabbit that was not visible moments ago. Most people fail to see what is really going on inside the magician’s hat, so to speak. To truly appreciate the performance as a whole, we must disassemble the hat. If we slow down the moving frames and analyze what is going on at every key juncture in the performance, we will understand the logic behind the “magic” and be able to appreciate the result in a new light.

In the same spirit, MacCormack wrote his book with the intention of taking complex algorithms and explaining them in simple language, so that the public could understand some of the basic underpinnings of today’s AI systems. “One of the tricks” he describes is used when computers try to classify information.

Consider face recognition, for example. The computer is given an image, and it can classify that image with an amazingly simple trick known in the AI literature as the ‘Nearest Neighbor algorithm’. Basically, you start off with a big database of examples of the things that you want to classify, such as handwritten digits for recognizing postal codes on US mail. You can have thousands of handwritten digits, each one manually classified by humans, saying this picture is a ‘seven’, this picture is a ‘nine’, and so on.

When the system is given a new picture of a digit that hasn’t been classified, it looks at that picture and says, ‘Okay, what’s the single most similar image in the database?’ It then searches the database, picks the most similar image, and assigns that image’s classification to the new one. “We call it nearest neighbor…because we take the new example that we’re trying to automatically classify and find the neighbor that is nearest to it in our database of pre-classified information. It’s an amazingly simple trick that works surprisingly well on a wide variety of hand recognition tasks.” John adds that people continue to come up with clever ways of enhancing this recognition all the time.
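
The nearest neighbor idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used by any real postal system: the tiny 3x3 "images", the LABELED_EXAMPLES database, and the nearest_neighbor function name are all made up for this example, and straight-line (Euclidean) distance between pixel values is just one possible way of measuring similarity.

```python
from math import dist

# Hypothetical toy "database": each entry is (pixel values, label given by a human).
# A real system would hold thousands of full-size images; these 3x3 grids are invented.
LABELED_EXAMPLES = [
    ((0, 1, 0,
      0, 1, 0,
      0, 1, 0), "one"),
    ((1, 1, 1,
      0, 0, 1,
      0, 0, 1), "seven"),
    ((1, 1, 1,
      1, 0, 1,
      1, 1, 1), "zero"),
]


def nearest_neighbor(new_image, examples=LABELED_EXAMPLES):
    """Return the label of the stored example that is closest to new_image."""
    # Assumption: "most similar" means smallest Euclidean distance between pixels.
    _closest_pixels, closest_label = min(
        examples, key=lambda example: dist(example[0], new_image)
    )
    return closest_label


# A slightly smudged "seven": one pixel differs from the stored example.
smudged_seven = (1, 1, 1,
                 0, 0, 1,
                 0, 1, 1)
print(nearest_neighbor(smudged_seven))  # seven
```

The new image is never compared against a rule for what a seven looks like; it simply borrows the label of whichever stored example it most resembles.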

Another ‘simple’ algorithm is the ‘Decision Tree’, which MacCormack compares to a game of 20 questions. “If I were to think of an object and give you a chance to ask 20 questions (a common kids’ game), very often you would be able to home in on the object in less than 20 questions.” Computers given data can use a similar strategy and ‘come up’ with a pre-programmed list of questions to help them recognize the object in question. “We can get a very high degree of accuracy on recognition problems in everyday systems.”
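
To make the 20 questions comparison concrete, here is a minimal sketch of how a decision tree can be walked in Python. Everything in it is hypothetical: the TREE of questions was written by hand for a toy guessing game, and the classify function name is invented for this example. Practical systems typically derive their questions from labeled data rather than having a person write them out.

```python
# A hypothetical, hand-written tree for a tiny guessing game. Each inner node is
# (question, subtree_if_yes, subtree_if_no); each leaf is simply the answer string.
TREE = (
    "Is it alive?",
    ("Does it have fur?", "cat", "goldfish"),
    ("Does it have wheels?", "bicycle", "teapot"),
)


def classify(tree, answers):
    """Follow yes/no answers down the tree; return (result, questions asked)."""
    questions_asked = 0
    while not isinstance(tree, str):  # keep going until we reach a leaf
        question, if_yes, if_no = tree
        tree = if_yes if answers[question] else if_no
        questions_asked += 1
    return tree, questions_asked


# The yes/no answers someone thinking of a bicycle would give.
answers_for_bicycle = {"Is it alive?": False, "Does it have wheels?": True}
print(classify(TREE, answers_for_bicycle))  # ('bicycle', 2)
```

Each answer discards half of this small tree, which is why a handful of well-chosen questions can separate a large number of possibilities.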

A classic example of the decision tree at work is medical diagnosis. With a list of seven or eight questions, the computer can decide “this is likely the cause of the problem”, which physicians can then integrate into their assessment of various medical conditions. Another, less obvious example is Microsoft’s Kinect Vision pipeline, which extends the game-playing experience by recognizing voices and gestures. The system asks all sorts of questions about the pixels that are coming in, such as ‘is there a pixel that looks like a piece of human skin over here, yes or no?’

“Even though this sounds like it could never work, it actually does,” says MacCormack. “The people at Microsoft Research built a system based on decision trees that is able to classify and recognize human poses in real-time and it works well enough to play computer games.”

MacCormack acknowledges that there are other AI approaches that “don’t boil down to some simple nugget” of an idea. There are, of course, fancier-sounding algorithms, with names like ‘Support Vector Machine’ and ‘Deep Belief Network’, that rely on more advanced mathematical models. But for the general public, a solid grasp of simple algorithms like ‘Nearest Neighbor’ and ‘Decision Tree’ is enough to build a basic understanding and illuminate the ‘ghost in the machine’ that occupies so much space in our daily routines.