This is the definitive guide to the history of Artificial Intelligence.
From the first use of the term “robot” to the Singularity.
So if you want a deep dive into the history of artificial intelligence, then this is the article for you.
Let’s jump right in!
The Definitive Guide to the History of Artificial Intelligence
Since ideas are often the precursors of invention, Alan Turing’s 1936 concept of a Universal Machine, a “thinking machine,” continued to occupy the minds of philosophers, scientists, engineers, and mathematicians throughout the twentieth century.
How could one create computers that not only manage information and foster communication more effectively and efficiently, but also take it to a higher level; that is, create a machine that can think and interact with the world as humans do?
While they didn’t initially call what they were working on “artificial intelligence,” that concept was very much there waiting to leap out into the world.
As with so much of technological development, it took many minds from many different fields of study to establish the foundation for the area of research that we now call artificial intelligence.
This is the story of those pioneers.
Are Robots Artificial Intelligence?
For many of us, when we think of artificial intelligence we immediately think of robots, as did many thinkers before the 1940s.
Philosophers, scientists, and writers seriously considered the idea of a “mechanical man.” That type of creature had permeated literature worldwide for centuries.
Philosophers, Scientists, Writers and the Mechanical Man
Here are just a few examples:
- The fifth-century BCE Daoist text Lieh-tzu mentions an account of King Mu of the Zhou Dynasty. An engineer named Yan Shi presented the king with a life-sized, human-shaped machine that could interact with people naturally. When it started getting a little too friendly with the ladies, well, the King insisted on knowing whether it was a man or a machine. As the legend goes, it was a machine; no little person hiding inside.
- Scientists in ancient Greece described several types of autonomous machines created by mechanics and inventors. One such inventor named Archytas, who lived in the fourth century BCE, created a mechanical bird. Homer wrote of “tripods”—mechanical assistants—which served the Gods.
- We know about the robotic inventions of Al-Jazari through his Book of Knowledge of Ingenious Mechanical Devices, published in 1206, which described all kinds of robotic tools such as a robotic fountain, alarm clocks, and musical instruments. An even earlier book described the first known programmable flute: in the ninth century, the Banu Musa brothers, who lived in Baghdad, published a book titled Book of Ingenious Devices in which they described the programmable flute and how it worked. The only reference we now have is stored in the Vatican Archives.
- The creature known as the Golem in Jewish folklore was made from mud or clay and was controlled by man as a mindless servant. It was neither good nor bad. Golems could only do what their master instructed them to do, which often led to unintended consequences because of their unpredictability.
- Gottfried Leibniz (1646-1716) and Blaise Pascal (1623-1662) thought that machines, because they required logic to create and function, might be developed to serve as reasoning devices to help settle disputes.
- Philosophers as varied as Rene Descartes (1596-1650) and Etienne Bonnot de Condillac (1714-1780) discussed the idea of mechanical men or machines containing all of the world’s knowledge. A pipe dream for them, a regular occurrence for us with the advent of the Internet.
- Jacques de Vaucanson (1709-1782) created various automated machines, such as a tambourine player. Probably his most famous automatons were his anatomically correct animals used to demonstrate digestion and other natural actions to the delight, or disgust, of his wealthy patrons.
- Eighteenth-, nineteenth-, and early twentieth-century writers such as Jules Verne, Mark Twain, L. Frank Baum, Jonathan Swift, Samuel Butler, and Mary Shelley all provided examples of “mechanical men” or machines with the potential for consciousness.
- In 1921, Karel Capek, a Czech playwright, wrote the play R.U.R. (Rossum’s Universal Robots, in its English translation). It was one of the first times the term “robot” was ever used.
- Movies such as Fritz Lang’s 1927 Metropolis tapped into the fears regarding the growing power of machines.
- Isaac Asimov, in his science fiction short story “Runaround,” published in Astounding Science Fiction magazine in March 1942, went so far as to lay out the “Three Laws of Robotics.”
First Law: A robot may not injure a human being or, through inaction, allow a human being to come to harm.
Second Law: A robot must obey orders given to it by human beings except where such orders would conflict with the First Law.
Third Law: A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.
It is also considered the first time anyone used the term “robotics.”
The Term “Robot”
Yet the term “robot” was really a catch-all for any machine with the ability to think, reason, and act upon its understanding independently of human intervention.
As we will see, discussions did not require anthropomorphizing the machine, only its capabilities.
By the 1950s, early pioneers thought that creating a “thinking” machine would be a very simple thing to do.
Some even gave wildly optimistic predictions along with tantalizing intelligence tests that proved to be, in reality, a low bar for what would actually be required to create a fully automated “thinking” machine.
By the 1970s, as this reality began to sink in, the catchphrase or mantra for those in the artificial intelligence field became, to quote Steve Jobs, “Simple can be harder than complex.”
Ideas converged. The time was right.
Two scientists at Bell Labs, Harry Nyquist and his colleague Ralph V. L. Hartley, worked in the 1920s and 1930s on questions related to noise on telegraph and telephone lines.
This led them to ask how to transport more than just the human voice or the impulses created by a telegraph key.
They wondered how to transmit information electronically.
They wanted to know how to measure information so it could be transmitted, because in order to make it work they needed to calculate the time and amount of channel or “bandwidth” necessary to do this.
They were not just discussing code or sound. They were contemplating musical pieces and motion pictures and how to analyze them to be able to count the symbols required to transport them.
Basically, what they wanted was to measure “the capacity to transmit a particular sequence of symbols.”
In 1943, two neuroscientists, Walter Pitts and Warren McCulloch, published a paper in the Bulletin of Mathematical Biophysics, “A Logical Calculus of the Ideas Immanent in Nervous Activity,” that very rapidly established possible methods for replicating how the brain worked.
They described the architecture of the neural networks of biological beings.
However, their explanation was far simpler than what one would actually find in an animal or human.
Even so, this study provided a way to approach replicating brain function when creating artificial intelligence.
While Vannevar Bush is best known for taking Babbage’s ideas and creating his own Differential Analyzer, he should also be recognized for his groundbreaking article in Atlantic Monthly in July 1945 titled “As We May Think.”
In this article, he discussed the issue of how information processing units could be used in peacetime.
He anticipated a glut of information that would overwhelm current systems unless we found a way to manage it.
He introduced the idea of a system called Memex that could not only manage information, but also allow humans to exploit it, including the ability to jump from idea to idea without much effort.
In a way, he anticipated the World Wide Web.
Memex was never built. Yet, as with all ideas ahead of their time, it hung around until its time came.
Norbert Wiener, who worked on electronic control of armaments during the war and was considered an expert on how to mitigate noise in communication equipment, published a book in 1948 that would establish the tone for all of the artificial intelligence work to come.
In his book he discussed issues related to communication and control.
He would coin the word, “Cybernetics”.
That was also the title of his book.
Psychiatrist W. Ross Ashby was fascinated by the workings of the human brain and nervous system.
Throughout his medical experiences during the 1920s through the 1940s he kept notes of his observations.
In March 1948, he created what he called the “Homeostat,” an electro-mechanical machine that he felt reflected the adaptive ability of the brain—that is, what we now call “machine learning.”
It was based upon his belief that the brain was a “self-organizing dynamic system.”
He would go on to write two books that would be influential in artificial intelligence studies, Design for a Brain (1952) and An Introduction to Cybernetics (1956).
Both would help future researchers think about approaches to “train” a machine to “think.”
However, it would be Claude Shannon—a student of Vannevar Bush—who in 1948 would light the fuse with his article, “A Mathematical Theory of Communication” in the Bell System Technical Journal.
In his paper, he introduced four major concepts that set the groundwork for what would become modern information technology and artificial intelligence.
- In the first point, he offered a solution for the problem of noisy communication channels, what would become known as the noisy channel theorem. He found a way to not only increase the speed of transmission, but also decrease the noise and error during transmission. The key was encoding information and including redundancy in the information to decrease the probability of error. This theorem would also become known as the Shannon Limit. Basically, it is channel capacity: the maximum rate at which data can be sent through a channel given its bandwidth and noise level.
- He introduced the architecture and design for communication systems by demonstrating that the system could be broken down into components.
- Shannon realized that for transmission purposes it didn’t matter what the content was. It could be in any format. It just had to be encoded into “bits”—the first time this word was ever used. He defined a “bit” as a unit for measuring information.
- He then introduced the idea of source coding, what we call data compression today. Channel coding, meanwhile, would become the key method for protecting transmitted data against errors.
Shannon explained that the key to transmission was the concept of entropy, a term borrowed from the study of thermodynamics.
In thermodynamics, entropy describes energy no longer available to do work; in Shannon’s hands it became a measure of the uncertainty, and therefore the information content, of a message. The idea was to exploit this measure during transmission through encoding.
By suggesting this, he managed to solve Nyquist’s and Hartley’s dilemma.
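To make the two quantities concrete, here is a minimal sketch of Shannon entropy (the average information per symbol) and the Shannon-Hartley channel capacity that sets the limit Nyquist and Hartley were groping toward. The symbol probabilities and the telephone-line figures below are invented purely for illustration.

```python
import math

def entropy(probabilities):
    """Shannon entropy in bits: the average information carried per symbol."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

def channel_capacity(bandwidth_hz, signal_to_noise):
    """Shannon-Hartley limit: the most bits per second a noisy channel can carry."""
    return bandwidth_hz * math.log2(1 + signal_to_noise)

# Illustrative values only: a four-symbol source and a telephone-grade line.
print(entropy([0.5, 0.25, 0.125, 0.125]))   # 1.75 bits per symbol
print(channel_capacity(3000, 1000))         # roughly 29,900 bits per second
```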
And Shannon managed to solve it so well that, along with Robert Fano, he would create what would become known as Shannon-Fano coding.
Next came Robert Fano’s protégé, David Huffman, who three years later, as a graduate student, wrote a paper titled “A Method for the Construction of Minimum-Redundancy Codes.”
In this paper, he explained an extremely effective way to encode information for transmission.
We still use his method in data compression today—think JPEG, .ZIP files, and many others.
Huffman’s approach reversed the standard format for coding: he assigned codes starting from the least used symbols and progressed to the most used, so that rare symbols received long codes and common symbols short ones.
The goal was to create an “optimal algorithm.”
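Here is a minimal sketch of that idea in Python, assuming a small, invented set of symbol frequencies: frequent symbols end up with short codes and rare ones with long codes, which is the heart of Huffman coding. (The algorithm is usually described as building a binary tree; this version simply tracks the growing codes directly.)

```python
import heapq

def huffman_codes(frequencies):
    """Build a prefix code in which frequent symbols get the shortest codes."""
    # Each heap entry: (total weight, tie-breaker, list of (symbol, code) pairs).
    heap = [(weight, i, [(sym, "")]) for i, (sym, weight) in enumerate(frequencies.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)    # the two least-frequent groups merge first,
        w2, _, right = heapq.heappop(heap)   # so rare symbols sink deepest in the code
        merged = [(s, "0" + c) for s, c in left] + [(s, "1" + c) for s, c in right]
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return dict(heap[0][2])

# Invented letter frequencies, purely for illustration.
print(huffman_codes({"e": 45, "t": 20, "a": 15, "o": 12, "z": 8}))
```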
Shannon’s 1948 article would set the foundation of what would become the science of Information Theory.
More importantly, he established the idea that something as fundamental as “information” could now be reduced to a mathematical formula, including being quantified and analyzed.
It was a mind-blowing revelation.
Others began to ask the big question about the future capabilities of machines.
Many felt it could be possible to create a digital brain and with it a “thinking” machine.
They saw similarities between how electronics worked when compared to their observations of how the brain behaved.
McCulloch said, “Think of the Brain as a telegraphic relay, which, tripped by a signal, emits another signal.”
And just as electrons began to bounce around inside the vacuum tubes when heated, McCulloch explained, “Of the molecular events of brains these signals are the atoms.”
And these atoms, or electrical impulses, respond in a binary fashion.
“Each goes or does not go,” he said.
Yes, but were computers really giant brains? Would a computer actually be able to think?
Can Computers Think?
One of the first to jump in and offer his opinion was computer scientist Edmund Berkeley, who in 1949 wrote Giant Brains, or Machines That Think.
He argued that because computers had advanced so much that they could process amounts of information beyond any human capacity, the next step had to be the creation of a thinking machine.
He saw computer innards as no different than the flesh and nerves of our human brains.
Thus, a machine could be made to “think.”
His ideas helped to shape the public’s perception of computers and how computers worked.
Alan Turing in October 1950 presented the seminal argument regarding computers and their ability to think in the journal Mind.
In “Computing Machinery and Intelligence,” he began with the provocative statement, “I propose to consider the question, ‘Can machines think?’”
He then proceeded to present his argument without defining “machine” or “think”.
Instead he provided a challenge, what would become known as the Turing Test.
The test was called the “Imitation Game.”
If the computer could fool judges in a blind test into thinking it was human by communicating with them as a human would during an examination, then yes, one could say the computer can think.
However, he ends his article with these comments:
The original question, “Can machines think?” I believe to be too meaningless to deserve discussion. Nevertheless, I believe that at the end of the century the use of words and general educated opinion will have altered so much that one will be able to speak of machines thinking without expecting to be contradicted.
The Turing Test would become the future benchmark to determine if we have actually reached a “thinking” machine.
As of 2021, the jury is still out on that, although some would disagree.
Another classic test would become board games of strategy.
Could a computer beat a human in checkers, chess, the Asian game Go or even top contestants in the TV gameshow Jeopardy!?
One of the first to attempt this test was Arthur Samuel, a computer scientist.
In 1949, he began developing an algorithm, a computer program, that allowed him to play a game against a computer.
Although completed in 1952, he publicly demonstrated the computer’s ability to play checkers in 1959.
During that demonstration, he coined the phrase “machine learning” to help explain the method he used to teach his IBM 701 computer to play the game.
In retrospect, this was not necessarily a huge feat, but at the time it proved that computers were teachable through rote learning.
In 1955, three men from varied backgrounds worked together to create a program to test their ideas about creating an “intelligent” computer.
These were: Allen Newell, a political scientist who realized that computers might be able to accomplish complex tasks if simpler units worked together; Herbert Simon, an economist who wondered whether machines that made decisions via symbols could also simulate thought; and Cliff Shaw, a computer programmer who helped them write their proposed computer program.
They would call it the Logic Theorist, and it would become one of the first artificial intelligence computer programs.
In the end, the program would solve thirty-eight of the fifty-two theorems in Principia Mathematica by Alfred North Whitehead and Bertrand Russell.
Newell and Simon would go on to establish the AI laboratories at Carnegie Tech, which would later develop systems such as the General Problem Solver (GPS).
John McCarthy, a mathematics professor at Dartmouth College, wanted to formulate a foundation on which to build the study of computer intelligence.
He persuaded his Princeton University classmate Marvin Minsky as well as Claude Shannon and Nathaniel Rochester (an innovative electrical engineer) to lead an effort to create “a 2 month, 10 man study of artificial intelligence to be carried out during the summer of 1956.”
It was the first time the term “artificial intelligence” was used.
As McCarthy would defensively argue years later, they were looking for “genuine” intelligence, but in a machine, so he had to call it something.
He also wanted to differentiate what they were doing from cybernetics (the study of communication and automatic control systems in both machines and living things—dictionary.com).
In August 1955, they applied to the Rockefeller Foundation for funding.
In their grant application, they called it the Dartmouth Summer Research Project for Artificial Intelligence.
Their goals were to address seven key problems:
- Automatic Computers
- How Can a Computer be Programmed to Use Language
- Neuron Nets (how to arrange “hypothetical” neurons to form concepts)
- Theory of the Size of a Calculation (set criteria to improve efficiency of calculation)
- Self-Improvement (machine learning)
- Abstractions (machine action outside of the distinct)
- Randomness and Creativity (machine acting from intuition and educated guess)
While the meeting did not develop a clear set of rules for the area of study, it did set general guidelines.
Many would consider this 1956 meeting the beginning of artificial intelligence study.
The Game is On
Achievements in artificial intelligence paralleled almost exactly the achievements in computer and information technology.
And all of those achievements would not have happened if it had not been for Sputnik and the Cold War, when defense spending exploded from the late 1950s to the early 1970s.
Computers improved in speed and effectiveness with the creation of the integrated circuit by Jack Kilby of Texas Instruments in 1958 and the first microprocessor by Intel Corp in 1971.
Information technology was able to experiment with various types of coding that permitted clear communication with minimal errors.
This development would be critical for NASA so it could communicate with its astronauts in space and, later, with the scientific missions of Pioneer, Mariner, and Voyager—in essence, information/data-collecting robots.
Important technological innovations came out of this effort that we now use on a daily basis such as fax machines and modems.
One of the first modems was based upon the theory of Quadrature Amplitude Modulation (QAM), leading to the creation of the 9600-baud QAM modem in 1971.
Today, anything from your CD player, cell phone, and the Internet to improved telecommunications between computers and satellites has its roots in this work.
The Shannon Limit that he discussed in his 1948 article was all but reached, using channel coding, by about 2001.
And artificial intelligence studies also flourished during this time.
Government money not only helped to accelerate developments in computer and information technologies, it also created an environment where experimenting with untried theories was not only permissible but encouraged.
Many of the foundational studies in artificial intelligence began with the advent of the space program.
But, first some housekeeping was needed.
As Voltaire said, “Define your terms…or we shall never understand one another” (Mitchell, p. 19.) —a seemingly simple task, but as with so much in artificial intelligence, it proved harder than it first appeared.
However, some basic ideas were settled upon—there were two main lines of research: the scientific and the practical.
The scientific line focused on trying to recreate biological intelligence within the mechanisms of the computer; the goal was to get the computer to think.
The practical line emphasized getting machines to process information better than humans; its adherents didn’t care whether the computer had the ability to think or not.
Because artificial intelligence pulls from so many different fields to create methods to accomplish its goals, another question became which method to choose.
Mathematicians used the language of rational thought; that is, logic and deductive reasoning.
Others used inductive reasoning: because so much of the information processed in the brain is random and uncertain, studying probability in digital decision making made sense.
And biologists and psychologists were fascinated by the workings of the brain and wanted to recreate brain function.
According to Melanie Mitchell in Artificial Intelligence: A Guide for Thinking Humans, by 2010 one area of focus rose above all others and that was the idea of deep learning or a method using deep neural network systems to foster machine learning.
However, it took some time and some unfortunate digressions to get there.
No matter what the method has been, almost every artificial intelligence project has focused on efforts to help a machine “learn.”
Those efforts fall under two main categories: Symbolic AI/Logic-Based AI or Sub-symbolic AI/Non-Logicist AI.
Symbolic AI has its roots firmly in mathematics.
As its name implies, it uses symbols to process information and then solves problems based upon the information presented.
With the rise of the Internet from the 1980s to today, the era of Big Data has helped this method flourish.
Basically, it was assumed that the more data one could dump into the computer, the better the outcome would be, because it would have more information to analyze and process.
During the early years of artificial intelligence, it would be Symbolic AI that would dominate the field.
Other work has focused on Sub-symbolic AI which takes its method from neuroscience. Its objective is to look at how neurons in the brain process information.
One of the early innovators in this method was a young Cornell psychologist, Frank Rosenblatt, who developed the concept of the Perceptron based upon the study of how neurons work in the human brain.
One has to remember that this was before any kind of digital imaging like an MRI.
What he understood was that the neurons in the brain responded to both electrical and chemical input. These neurons were interconnected via synapses.
The brain, through these neurons, weighed the input, and the output would be based upon these electrical interactions. Learning occurred when the synapse connections were strengthened.
Some saw similarities with computers because of the brain’s seemingly binary response: a neuron either fires or it doesn’t.
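A minimal sketch in the spirit of Rosenblatt’s perceptron (not his original hardware) shows the idea: weighted inputs, a firing threshold, and a simple error-correction rule that strengthens or weakens the connections. The learning rate and the toy training set, the logical OR function, are arbitrary choices for illustration.

```python
def perceptron_train(samples, epochs=10, learning_rate=0.1):
    """Learn weights for a single artificial neuron with a step activation."""
    n_inputs = len(samples[0][0])
    weights = [0.0] * n_inputs
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            # Fire (output 1) only if the weighted sum crosses the threshold.
            activation = sum(w * x for w, x in zip(weights, inputs)) + bias
            output = 1 if activation > 0 else 0
            error = target - output
            # Strengthen or weaken each connection in proportion to its input.
            weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias

# Illustrative training set: the logical OR function.
data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
print(perceptron_train(data))
```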
In 1962, Rosenblatt would publish Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms.
In 1969, Marvin Minsky and Seymour Papert would slam these ideas as too brittle in their book Perceptrons.
However, as more research was conducted and neuroscience learned more about how the brain works, Principles of Neurodynamics would, by the 1980s, become required reading for many interested in artificial intelligence.
Sub-symbolic AI was in the ascendancy.
The next question for those using this method was how to determine the proper threshold and weights required for the computer to learn.
So algorithms were created to assist with the process of “conditioning” the computer or what would become known as “supervised learning.”
Basically, the computer would be fed a “training set,” and via its algorithm it would learn how to process the data to come up with a response.
It soon became apparent that it would require more layering of information units, just like a human brain works, albeit substantially simplified.
They would call this multilayer neural networking.
The way to think about neural networking is that rather than having three stages—that is, the input, the processing unit, and then the output—researchers would be required to create “hidden” or “inner” units, and lots of them.
Data for each of these units would need to be inputted and then the data between the units networked, so that when the computer analyzed all the data under several layers of classification it could create an output based upon this networked data.
The idea behind this was rooted in the work by two psychologists David Rumelhart and James McClelland, who, in the 1980s, pointed out that humans are able to do so much more than computers because their brains are not linear, but have instead “a basic computational architecture.”
In other words, the brain consisted of interconnected layers.
Their book, published in 1986, Parallel Distributed Processing, explained that “knowledge in these networks resides in weighted connections between units.”
This was the beginning of the idea of connectionism, what some would later call “Deep Network,” which formed the basis of “deep learning” that is the bedrock of the field today.
And to address the charge of “sterility” leveled at neural networks by Minsky and Papert, other researchers developed methods that addressed mistakes found within the networks that led to wrong outcomes.
So they instituted what would become known as the “back-propagation” algorithm to find the mistakes hidden in all of these layers of processing.
In essence, this entire process involved trial and error and back again.
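The sketch below illustrates that trial-and-error loop on a tiny scale: a network with one hidden layer trained by back-propagation on the XOR problem, which a single perceptron cannot solve. The network size, learning rate, and iteration count are arbitrary illustrative choices, and an unlucky random seed can occasionally keep it from converging.

```python
import numpy as np

rng = np.random.default_rng(0)
# XOR inputs, each with a constant 1 appended to serve as a bias input.
X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

sigmoid = lambda z: 1 / (1 + np.exp(-z))

W1 = rng.normal(size=(3, 4))   # input -> hidden ("inner") units
W2 = rng.normal(size=(4, 1))   # hidden units -> output

for _ in range(20000):
    # Forward pass: input through the hidden layer to the output.
    hidden = sigmoid(X @ W1)
    output = sigmoid(hidden @ W2)
    # Backward pass ("back-propagation"): push the output error back
    # through the weighted connections, layer by layer.
    output_delta = (y - output) * output * (1 - output)
    hidden_delta = (output_delta @ W2.T) * hidden * (1 - hidden)
    W2 += hidden.T @ output_delta
    W1 += X.T @ hidden_delta

print(output.round(2))   # should drift toward [[0], [1], [1], [0]]
```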
The next evolution of these methodologies, as the layers of networking grew larger, became the idea of “tuning the hyperparameters,” meaning setting up all aspects of the network needed to assist the machine in its learning and modifying them as necessary while the conditioning proceeded.
Another method became reinforcement learning, which was based upon the ideas of operant conditioning—that is, one gets rewards from one’s environment when one does something right.
Yes, this sounds odd considering we are talking about a machine.
But, in an eerie way, it was similar to how one might train a dog with doggie treats.
However, in this case the learner, usually a robot of some kind, receives a reward by getting a higher number in a chart within its operational algorithm—its “Q-Chart.”
Basically, the robot performs an action within its environment, such as moving a step forward or kicking a ball.
The algorithm will place a number in its “Q-Chart” or “Q-Table” if the robot acted appropriately, or nothing if it did not. The higher the number, the greater the reward for the robot.
Through trial and error after many “episodes,”—learning modules—the robot begins to acquire a “memory” and “learns” the correct response.
This process has worked well in controlled trials, but place the robot outside of its nice, safe, controlled nest, and things begin to go haywire.
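A minimal sketch of that bookkeeping (the “Q-Chart” is usually written as a Q-table) might look like the following, using an invented toy world of five cells in a row where the only reward sits at the right-hand end; the learning rate, discount, and exploration settings are placeholders.

```python
import random

# Toy world: five cells in a row; reaching cell 4 earns the only reward.
N_STATES, ACTIONS = 5, ["left", "right"]
q_table = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, epsilon = 0.1, 0.9, 0.2   # learning rate, discount, exploration rate

def step(state, action):
    """Move one cell; reward 1 only when the goal cell (4) is reached."""
    new_state = min(state + 1, 4) if action == "right" else max(state - 1, 0)
    return new_state, (1.0 if new_state == 4 else 0.0)

for episode in range(200):              # many "episodes" of trial and error
    state = 0
    while state != 4:
        # Mostly exploit the best-known action, sometimes explore a random one.
        if random.random() < epsilon:
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: q_table[(state, a)])
        new_state, reward = step(state, action)
        # Raise the number in the Q-table toward reward + discounted future value.
        best_next = max(q_table[(new_state, a)] for a in ACTIONS)
        q_table[(state, action)] += alpha * (reward + gamma * best_next - q_table[(state, action)])
        state = new_state

print(q_table)   # "right" ends up with the higher value in cells 0 through 3
```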
Scientists continued to build upon these earlier ideas and perfected them.
Some combined methods in unique ways.
One of those methods was something created by Demis Hassabis and his colleagues at DeepMind Technologies (founded in 2010).
They called it Deep Q-Learning, a combination of reinforcement learning with deep neural network systems.
Basically, through a long process of trial and error and reinforcement learning, the computer begins to anticipate the appropriate action depending upon the circumstance.
And unlike systems such as supervised learning, it does not simply act upon the highest estimated value based upon its inputs.
It actually does some exploring and some exploiting, meaning it “chooses” how to act.
It is important to note that the system doesn’t match its outputs to human requirements, but instead it matches its outputs based upon previous iterations or episodes and then estimates the value and acts on it.
They call this temporal difference learning.
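The sketch below shows that temporal-difference bookkeeping in miniature, with a simple linear function standing in for the deep network; the state features, rewards, and step size are placeholder numbers. The point is that the training target is built from the network’s own previous estimate of the future, not from a human-supplied label.

```python
import numpy as np

def q_values(state, weights):
    """Stand-in for the deep network: estimate a value for each possible action."""
    return state @ weights                      # one Q-value estimate per action

def td_target(reward, next_state, weights, gamma=0.99):
    """Temporal-difference target: reward now plus the network's own
    discounted estimate of the best value available afterwards."""
    return reward + gamma * np.max(q_values(next_state, weights))

# Placeholder numbers: four state features, two possible actions.
weights = np.zeros((4, 2))
state = np.array([1.0, 0.0, 0.5, 0.2])
next_state = np.array([0.0, 1.0, 0.5, 0.2])
action, reward = 0, 1.0

target = td_target(reward, next_state, weights)
prediction = q_values(state, weights)[action]
# Nudge the chosen action's estimate toward the target (a squared-error step).
weights[:, action] += 0.01 * (target - prediction) * state
print(q_values(state, weights))
```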
Based upon what was learned through experiments at DeepMind and other companies, there came a realization that they could use these various methods, with some gaming twists, to get a computer to play games well.
One method is called a “game tree” because one starts at the root of the question and then branches out.
For games like checkers and chess where some of the moves can be anticipated, this type of system allows the computer to walk the present position out about three moves ahead to determine its best option and then act on it.
Basically, one would use a “minimax” algorithm to determine the best move.
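A minimal sketch of such a depth-limited minimax search is below; `legal_moves`, `apply_move`, and `evaluate` are hypothetical game-specific helpers (listing the moves, playing one, and scoring a position), not part of any particular chess or checkers program.

```python
def minimax(position, depth, maximizing, legal_moves, apply_move, evaluate):
    """Walk the game tree a few moves ahead and return (score, best move).

    legal_moves, apply_move, and evaluate are game-specific callables
    supplied by the caller (hypothetical here).
    """
    moves = legal_moves(position)
    if depth == 0 or not moves:
        return evaluate(position), None
    best_move = None
    if maximizing:
        best_score = float("-inf")
        for move in moves:
            score, _ = minimax(apply_move(position, move), depth - 1, False,
                               legal_moves, apply_move, evaluate)
            if score > best_score:
                best_score, best_move = score, move
    else:
        best_score = float("inf")
        for move in moves:
            score, _ = minimax(apply_move(position, move), depth - 1, True,
                               legal_moves, apply_move, evaluate)
            if score < best_score:
                best_score, best_move = score, move
    return best_score, best_move
```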
A variant of this is called Monte Carlo Tree Search (yes, the name comes from the randomness of casino games), which takes uncertainty into account as part of the data processing.
These methods would become important as people tried to reach those game-based intelligence benchmarks established years before.
Other research embraced our new understanding of how the brain works and methods to accommodate uncertainty to look into replicating human functions such as sight, speech recognition, and navigation.
In order to do this, researchers took methods used in deep learning along with the most recent research in neural networks and combined them to create something called Convolutional Neural Networks or CONVNets.
Basically, the idea was how to train a computer to recognize something it “sees.”
The process was in essence layered pixel recognition where the computer engages in pattern recognition.
As with a robot learning how to walk or to kick a ball, the computer was slowly trained to recognize a dog or a number by “noticing” features such as the nose and ears of the dog or the roundness of the shapes that comprise the number eight.
These are called convolutional filters. Researchers then built upon these types of filters creating layers of recognition called compositionality.
As the computer weighed these factors, a “map” of what encompasses a dog or the number eight would take shape.
Eventually, the computer would recognize a dog or a number eight consistently based upon this map.
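A minimal sketch of what one convolutional filter does is shown below: slide a small grid of weights across the image and record how strongly each patch matches the pattern. The tiny fake “image” and the vertical-edge filter are invented for illustration.

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide a small filter over the image; each output pixel measures
    how strongly the patch under the filter matches the kernel's pattern."""
    kh, kw = kernel.shape
    out_h, out_w = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    output = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            output[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return output

# A tiny fake grayscale "image": dark left half, bright right half.
image = np.array([[0, 0, 0, 9, 9, 9]] * 6, dtype=float)
# A vertical-edge filter: responds where brightness changes from left to right.
edge_filter = np.array([[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]], dtype=float)
print(convolve2d(image, edge_filter))   # large values mark the vertical edge
```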
The real problem came when there were more elements in the photo than just a dog or a single number.
So, the computer would then take each one of the elements in the photo and map them individually in order to recognize them to create an output of a dog with a child or a number eight found in a house address such as 823 Pine Street.
If you ever wondered why you get those security requests that ask you to find the bicycle in a series of nine photos, this is why.
Computers, unless they have been specifically trained to recognize a specific item, are unable to do it automatically. However, as the science evolves, maybe someday this won’t be the case.
Even with some remarkable breakthroughs, what had become clear to anyone working under the artificial intelligence umbrella was that both symbolic and sub-symbolic apparatus had their strong points and their weaknesses.
Symbolic systems worked well, but only if a system was engineered and continually managed by humans, with its prescribed data input for a particular area of expertise (supervised learning), and could work using “human understandable reasoning to solve problems.”
Sub-symbolic worked best in areas of perception and motor tasks that lack easily definable rules such as face recognition.
As John McCarthy said nearly fifty years after their groundbreaking Dartmouth meeting, “AI was harder than we thought.” (Mitchell, 33.)
Key Successes, Big Ideas, and Reality Checks
1960s:
UNIMATE—Considered to be the first industrial robot.
It was first conceived when George Devol, an inventor, and Joe Engelberger, a business executive interested in robotics, met at a cocktail party in 1956.
In 1961, George Devol obtained the patent for his invention which could follow step-by-step instructions stored in a metal drum for doing repetitive actions such as working on an assembly line—similar to the Jacquard loom.
UNIMATE took on its first job at General Motors stacking hot pieces of die-cast metal.
Quicksort Algorithm—The Quicksort algorithm, also known as partition-exchange sort, was developed by Charles Antony Richard “Tony” Hoare in 1959 and published in 1961.
It was an algorithm that allows computers to sort data faster than ever before.
It was so effective that it is still used today for sorting data.
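A minimal sketch of the partition-exchange idea in Python (Hoare’s published version partitioned the data in place; this simpler out-of-place version just shows the recursion): pick a pivot, split the remaining values into smaller and larger groups, and sort each group the same way.

```python
def quicksort(items):
    """Partition-exchange sort: split around a pivot, then sort each side."""
    if len(items) <= 1:
        return items
    pivot, rest = items[0], items[1:]
    smaller = [x for x in rest if x <= pivot]
    larger = [x for x in rest if x > pivot]
    return quicksort(smaller) + [pivot] + quicksort(larger)

print(quicksort([33, 7, 99, 12, 7, 58]))   # [7, 7, 12, 33, 58, 99]
```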
David Hubel and Torsten Wiesel—Neurophysiologists who conducted experiments throughout the 1960s to determine how animals see.
They monitored the electrical impulses traveling from a cat’s eyes into the visual cortex of its brain.
This information would later help the development of CONVNets (Convolutional Networks) which would lead to visual recognition programs.
The Rancho Arm—Researchers at Rancho Los Amigos Hospital in Downey, California wanted to create an artificial prosthetic arm using computer technology.
They created a six-jointed mechanical arm that was as flexible as a human arm. Stanford University acquired it in 1963.
It was the first robotic arm controlled by a computer.
ELIZA—Computer Scientist Joseph Weizenbaum finished ELIZA in 1965.
ELIZA was a computer program that could functionally converse with a human by answering questions on a computer terminal.
While very brittle and, at times, nonsensical, it did mark an important step in natural language research.
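A minimal sketch in the spirit of ELIZA’s keyword-and-template approach, with invented patterns rather than Weizenbaum’s actual script, shows both why it felt conversational and why it was brittle: there is no understanding, only pattern matching and canned replies.

```python
import re

# Invented keyword patterns, loosely in the style of ELIZA's "DOCTOR" script.
RULES = [
    (re.compile(r"\bI am (.+)", re.IGNORECASE), "Why do you say you are {0}?"),
    (re.compile(r"\bI feel (.+)", re.IGNORECASE), "How long have you felt {0}?"),
    (re.compile(r"\bmy (\w+)", re.IGNORECASE), "Tell me more about your {0}."),
]
DEFAULT_REPLY = "Please go on."

def respond(sentence):
    """Match the input against keyword patterns and fill in a canned template."""
    for pattern, template in RULES:
        match = pattern.search(sentence)
        if match:
            return template.format(*match.groups())
    return DEFAULT_REPLY

print(respond("I am worried about the exam"))   # Why do you say you are worried about the exam?
print(respond("nothing matches this"))          # Please go on.
```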
DENDRAL—Many began to see the medical applications of using both computers and artificial intelligence to begin to understand the human body and its functions.
One of the earliest applications was to create something called an “Expert System”: a program that basically analyzes data like a human expert, only faster.
Using the programming language LISP (list processing) developed by John McCarthy at MIT in 1959, a team at Stanford—Ed Feigenbaum, Joshua Lederberg, and Carl Djerassi—in 1965 created DENDRAL, short for dendritic algorithm.
This system was used to study the molecular structure of organic compounds.
It had some flaws, but it would become a foundation for many medical “expert systems” to come over the next three decades.
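The general shape of such a system can be sketched minimally as if-then rules chained over a set of known facts; the rules and facts below are invented placeholders, not DENDRAL’s actual chemistry knowledge.

```python
# Invented if-then rules: (facts that must already be known, fact to conclude).
RULES = [
    ({"molecular_formula_known", "mass_spectrum_available"}, "candidate_structures_generated"),
    ({"candidate_structures_generated", "spectrum_matches_prediction"}, "structure_identified"),
]

def forward_chain(facts, rules):
    """Keep firing any rule whose conditions are all satisfied,
    adding its conclusion to the pool of known facts."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

known = {"molecular_formula_known", "mass_spectrum_available", "spectrum_matches_prediction"}
print(forward_chain(known, RULES))   # now includes "structure_identified"
```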
Robotics—Other important developments occurred to robotic arms and robotics.
- The Tentacle Arm—Developed by Marvin Minsky in 1968. It moved like an octopus with twelve joints and was powered by hydraulic fluids.
- Victor Scheinman’s Stanford Arm—One of the first robotic arms to be both electronically-powered and computer-controlled, developed in 1969. This system soon came to permeate industrial applications.
- Shakey the Robot—Created by the Stanford Research Institute through studies done from 1966 to 1972. Shakey became one of the first robots controlled by artificial intelligence systems, using a three-tiered software architecture. While it did not use neural networks, it did use algorithms that mimicked how the human brain worked. It had television cameras to see and “whiskers” to sense the world around it. It would use this data to determine actions, that is, to respond to its environment in a simple way using its planning system called STRIPS. While it had no direct use outside of its controlled environment, it did serve as a foundation for future artificial intelligence experiments and developments.
1970s:
MYCIN—Developed over a six-year period in the early 1970s using the LISP programming language and modeled on the DENDRAL system.
It became an important diagnostic tool for medical doctors in determining the right dose of antibiotics for patients. It was based upon a dissertation written by Edward Shortliffe.
It used symbolic methodology.
The problem became its inability to work outside its particular context.
SHRDLU—Named after the second column of keys on a Linotype typesetting machine, SHRDLU was created by Terry Winograd while working on his PhD thesis at MIT.
It used natural language rather than computer code to communicate with the computer to give it instructions.
Using particular syntax and phrases, a programmer could give the computer instructions to carry out particular actions, such as moving blocks of a particular shape, size, and color.
Although uses for this process turned out to be limited in actual practice, it did hint at the possibility for the future use of natural language and human reason in the development of future artificial intelligence applications.
Nanotechnology—In 1974, Norio Taniguchi, a professor at Tokyo Science University, first used the term to describe methods of machining materials with atomic-scale precision.
Speak and Spell—First introduced to the public in 1978, it is rooted in the study of creating synthetic speech.
Created by scientists at Texas Instruments, it became one of the first publicly used applications of a human vocal tract duplicated on a silicon chip with solid state circuitry.
It permitted a child to not only learn how to spell a word, but also to pronounce it.
It was a tremendous breakthrough and led to the TMS5100 chip which became one of the first speech synthesizers ever made.
The GEB—In 1979, Douglas Hofstadter, a physicist, cognitive scientist, and philosopher, published his groundbreaking book Godel, Escher, Bach: An Eternal Golden Braid, which won him not only a Pulitzer Prize but also the National Book Award.
One of this book’s unanticipated but significant impacts was that many of the researchers and scientists working in computer science, artificial intelligence, and allied fields today were strongly influenced by it; for many, it inspired them to enter those fields.
In this book, Hofstadter presented what has become known as Hofstadter’s Law: “It will always take longer than you expect, even when you take into account Hofstadter’s Law.”
Robotics—Developments in robotic arms and robots continued creating important refinements. Improving the ability to pick up and manipulate objects became a particular focus of this area of study.
- The Silver Arm—Created by David Silver at MIT in 1974. Using pressure sensors, the arm could mimic the dexterity of human fingers.
- The Soft Gripper—Shigeo Hirose’s Soft Gripper took this refinement one step further with grippers that could conform to the shape of the object being picked up. It was designed in 1977 at the Tokyo Institute of Technology. This system served as the foundation for many modern versions of robotic “hands.”
For a deeper understanding of the different types of AI-controlled entities like robots, androids, and cyborgs, you might be interested in our article on Robot vs. Android vs. Cyborg vs. Humanoid vs. Bionic Drone.
1980s:
John Searle—A philosopher who in 1980 published the article “Minds, Brains and Programs” in Behavioral and Brain Sciences.
He believed strongly that machines would never reach the capacity to actually think like humans do.
Kunihiko Fukushima (1982)—An engineer who developed the Cognitron and Neocognitron systems to better translate how an actual eye sees into a program for computers, so they can “see” through the use of layered algorithms.
These efforts created the concept of Convolutional Neural Networks or CONVNets which proved key in creating visual recognition programs.
DaVinci System (1987)—Philip S. Green of SRI International (the former Stanford Research Institute), Joseph M. Rosen, MD, and Jon Bowersox, an army surgeon, created one of the first robot-assisted surgery systems, sometimes called a “telepresence surgery system.”
To explore more about how AI is revolutionizing healthcare, check out our article on A Breakthrough in AI Gives Voice to Woman Silenced by Stroke Nearly Two Decades Ago.
Fifth Generation Computer Project—Throughout the 1980s, the Japanese government poured over $400 million into the goal of creating a platform to foster the growth of artificial intelligence systems, with the intent of creating “thinking” machines.
While the project never reached its lofty goals, it did spur a new generation of AI scientists leading to important innovations in the 1990s.
Robotics—Robots during this era became even more sophisticated through innovations in programming and in the applications’ ability to function more independently of human input and other external programming.
- The Stanford Cart in 1980 had applications that led to improvements in independent mobility and navigation.
- Starting in 1981, innovations in robotic arm construction included motors housed inside the arm joints (direct drive, or DD), eliminating the need for wires and chains to control joint movement.
- Some robots like the Denning Sentry robot (1983) improved the effectiveness of security systems.
1990s:
Carbon Nanotube (CNT)—In 1991, Sumio Iijima of NEC discovered tubular carbon structures.
Soon applications for these tubes began to be explored in areas as diverse as electronics, multifunctional fabrics, photonics, biology, and communication.
CyberKnife (1992)—Medical technology had moved into the realm of increased precision using a surgical robot developed by neurosurgeon John R. Adler that used x-rays to locate tumors or deliver a set dose of radiation.
It was an important development in medical science’s fight against cancer.
Nanocrystal Synthesis—This method was invented by Moungi Bawendi of MIT. It is best known to many of us as quantum dots.
Soon applications were found for this in computing, biology, high-efficiency photovoltaics, and lighting.
Checkers Benchmark—By 1994, Marion Tinsley had been the world’s best checkers player for forty years, at least until he squared off against Chinook, a program that Jonathan Schaeffer, a computer scientist, had developed over a series of years.
Chess Benchmark—Back in 1958, Allen Newell and Herbert Simon wrote, “If one could devise a successful chess machine, one would seem to have penetrated to the core of human intellectual endeavor.” (Mitchell, 156.)
IBM’s Deep Blue beat Garry Kasparov, a world champion chess player in 1997. This was an incredibly important milestone.
However, when Deep Blue was used for other applications, such as medical prognostication, it proved not as effective without substantial reprogramming.
In other words, did it actually penetrate “the core of human intellectual endeavor?” Could it actually “think” as humans do?
Many scientists in artificial intelligence felt that it did not, because it did not have the full rational and informational flexibility that humans do.
Dragon Systems (1990)—Starting as early as the 1950s, various researchers studied ways to help machines recognize and comprehend human speech.
One of the first was the AUDREY system, which used vacuum tube circuitry and could comprehend numerical digits about 97% of the time.
The IBM shoebox machine presented at the 1962 World’s Fair in Seattle did only slightly better.
Using a database to retrieve information to improve comprehension, Carnegie Mellon developed HARPY in the 1970s.
By the 1980s, systems were using predictive methods (Hidden Markov Model-HMM) rather than trying to match the sound exactly.
The first commercial use of this technology was “Julie” in 1987 which could understand basic phrases and answer back.
In 1990, a company by the name of Dragon Systems released a dictation system, DragonDictate, followed in 1997 by Dragon NaturallySpeaking, which could hear, understand, and transcribe the human voice.
One of its key applications was in medical dictation.
In 1997, BellSouth released their VAL system. Most of us are familiar with what this sounds like since it is the computer that speaks to you in voice activated menus today.
Yann LeCun (1998)—Geoffrey Hinton in the 1980s made back-propagation work for training neural networks. Yann LeCun, who had worked as a postdoctoral researcher in Hinton’s lab, created a convolutional network to read handwritten numbers accurately.
The network was called LeNet.
Kismet (1998)—Kismet, a robotic head, created by Dr. Cynthia Breazeal, could read and express emotion. Kismet was the first robot that could do this.
Dip-pen Nanolithography—Invented by Chad Mirkin at Northwestern University. This allowed for the “writing” of electronic circuits as well as creation of microscopic biomaterials and even applications for encryption.
2000s:
National Nanotechnology Initiative (NNI)—In 2000, President Bill Clinton created the NNI to coordinate efforts to promote the development of nanotechnologies.
These technologies had already begun to hit the consumer markets in clear sunscreens, stain- and bacterial-resistant clothing, better screen interface with electronics, and scratch-resistant coatings.
Three years later Congress would pass the 21st Century Nanotechnology Research and Development Act to promote development.
The European Commission in 2003 also adopted the communication, “Towards a European Strategy for Nanotechnology” to encourage members of the European Union to promote research and development in nanotechnology.
Drones—Although drones have existed as early as 1849, when Austria used incendiary bombs attached to balloons to attack Venice, the first use of a robotic drone in a military conflict to specifically target a combatant was in 2002, when the CIA used a Predator drone to try to kill Osama bin Laden.
By 2006, drones used GPS technology and became readily available to anyone who wanted one.
In that year, the public use of drones had grown to such a point that the Federal Aviation Administration had to begin issuing permits and setting regulations for use.
Ray Kurzweil—Former student of Marvin Minsky who became a foremost innovator with inventions ranging from music synthesizers to one of the first text-to-speech machines to optical character recognition.
However, he is best known for his unfailing certainty in the ability of computers to exceed human intelligence.
In his most famous work, The Singularity is Near: When Humans Transcend Biology (2005), he makes several startling predictions:
- Artificial intelligence will reach human levels by 2029—that is, pass the Turing Test.
- Non-biological intellectual capacity will exceed human intelligence by the early 2030s.
- By 2045 we will reach the Singularity, the point at which there will be a profound and disruptive change in our intellectual capabilities, implying we may no longer be fully biological, but part machine.
DNA-Based Computation—In 2005, Erik Winfree and Paul Rothemund from the California Institute of Technology developed theories to embed computations within nanocrystal growth.
They also called it “algorithmic self-assembly”.
Google Voice Search App—In 2008, Google introduced its Voice Search App for iPhones.
It was an important demonstration of just how far things had come in the field of voice recognition.
ImageNet—Fei-Fei Li created this image database in 2009 using convolutional networking. It has millions of images curated (i.e. labelled) through crowdsourcing.
Robotics—
- Robots had become more lifelike, such as the SONY AIBO (1999), a robot that acted like a dog and could respond to voice commands, or Honda’s Advanced Step in Innovative Mobility (ASIMO, 2002), a humanoid robot that could walk, climb stairs, recognize faces and objects, and respond to voice commands.
- One of the first mobile robotic cleaning machines, the iRobot Roomba, was introduced in 2002. The vacuum robot was able to detect and avoid obstacles using an insect-like reflex behavior, giving it more flexibility than a more centralized “brain.”
- Other robots, like the Centibots (2003) created by the Defense Advanced Research Projects Agency (DARPA), could enter dangerous areas in groups, communicating with each other to coordinate efforts without human intervention. If one of the units failed, another could move into its place. They could be used for mapping or to find items as well.
- And probably one of the most famous robots, the Mars rover Spirit, landed on Mars in 2004 and continued to return messages and data to Earth until 2010. The information it beamed back opened up greater understanding of the Martian surface than ever before.
- In 2004, DARPA introduced the first “Grand Challenge” for autonomous vehicles with the goal of encouraging research in self-driving cars. In 2005, Stanford’s entry, named “Stanley,” managed to drive the off-road, 132-mile race course well within the ten-hour time limit set by the contest organizers. It did it in about seven hours.
2010s:
Watson—By 2011, IBM had created a computer programmed for natural language and communication.
That year it beat Ken Jennings and Brad Rutter (two of the show’s most successful former champions) in a televised Jeopardy! match.
Again, Watson fell into the same trap as Deep Blue.
Was it actually able to “think” and process information as humans do? Many still believe we have a long way to go before we reach that point.
Siri—In 2011, Apple took personal assistant and voice recognition technology to a new level with the introduction of Siri.
AlexNet—In 2012, a deep convolutional network called AlexNet took ImageNet to the next stage, winning the image recognition challenge built on its database of over fourteen million images. Visual recognition had improved immensely.
Nanotube Computer—In 2013, the first carbon nanotube computer was developed by Stanford Researchers. They named it Cedric.
The goal was to determine if it was possible to build a computer out of carbon rather than silicon to try to help improve energy efficiency.
Passed the Turing Test?—In 2014, the Royal Society in London hosted an event following the format of the “Imitation Game” described by Turing back in 1950. There were several groups of contestants.
A Russian/Ukrainian group presented Eugene Goostman (in reality, a chatbot), which won with much press fanfare and excitement that the Turing Test had finally been passed.
But was it? Did this computer actually interact with the judges, or was it just programmed well to perform this one task? Most experts felt it was just programmed well.
It could not “think” outside of this activity.
Joint Letter—In 2015, Stephen Hawking, Elon Musk, Steve Wozniak, Bill Gates and about one hundred others sent a joint letter to the International Joint Conference on Artificial Intelligence in Buenos Aires, Argentina issuing a warning about the risk of artificial intelligence to humanity, especially if used for military purposes.
AlphaGo—In March 2016, a computer program called AlphaGo beat Lee Sedol, a world-renowned Go player, in a five-game match.
It was the first time a machine had ever beaten a world-class human champion in the game of Go.
Where Things Are Now and Closing Thoughts
Our lives have been changed by computer, information, and artificial intelligence technologies in ways almost unimagined just a few decades ago.
The list of impressive achievements since that fateful meeting at Dartmouth in the summer of 1956 is long.
Computers today can process more information faster and can provide us answers to questions, with just a few keystrokes, that used to take years of research to find.
Both voice and visual recognition systems have improved the lives of those who are hearing- or sight-impaired.
Computers can now talk and respond to us as we regularly interact with voice recognition systems.
Computer visual recognition has helped to improve security by providing positive identification to use ATMs, unlock phones and control facility access.
It also allows us to find missing persons and to identify criminals in a crowd. In medicine, visual recognition allows doctors to recognize and diagnose diseases.
Computer-managed sensors in cars have helped to make driving safer and improved fuel efficiency.
We have computers that can serve as companions and personal assistants to bring us our favorite music upon voice request or to make sure we make that important board meeting.
We have computers creating music better than musicians can.
There is 3D printing, which has revolutionized how we duplicate objects, for example allowing museums to create replicas that visitors can manipulate and handle without damaging the original object.
We have sophisticated robots exploring the far reaches of space and beaming back images and data that even forty years ago scientists would have thought unbelievable.
Other robots can clean our homes or mow our lawns and we may within a few decades ride in driverless cars.
We even have robots that we can play games against or serve as a friend, toy or teacher for children.
Scientists are using these technologies to gain a better understanding about how our natural world works, such as how birds migrate and how many have disappeared because of man’s carelessness with chemicals and agriculture.
With the advent of nanotechnology, medicine and similar sciences have already begun to change how we fight killer diseases such as cancer or heart disease.
It has been an extraordinary seventy years.
Because of these developments, we have a greater understanding of the world around us than ever before in human history.
We also more clearly understand now just how complex, miraculous, and yet fragile living creatures truly are.
Even so, can we honestly say we have created a “thinking” machine?
Have we managed to achieve any of the seven goals established for the Dartmouth meeting in 1956?
Many in Silicon Valley today invoke Hofstadter’s Law and say that artificial intelligence was more complicated than they had originally thought.
The new technologies we have developed also come with huge moral implications.
At a meeting at Google Headquarters in 2014, Hofstadter spoke of Ray Kurzweil and his idea of the Singularity, where machines might someday think, learn, and achieve human-like intelligence and maybe even surpass us.
It was an exciting idea when Hofstadter first started exploring the issue in the 1970s, when the possibility seemed remote. Now, if it were actually to happen, “we will be superseded. We will be relics. We will be left in the dust.”
He went on to say that, “I find it very scary, very troubling, very sad, and I find it terrible, bizarre, baffling, bewildering, that people are rushing ahead blindly and deliriously in creating these things.”
He knew of some researchers going too far, envisioning robots with human brains created by grafting human neurons into the robotics, and even insisting that these “robots” or “cyborgs” would need to have rights just like humans do.
Some propose we should graft microchips into human brains and muscles so we could create super humans who could compute faster and be stronger than other humans.
He saw that some artificial intelligence researchers were moving beyond its original intent.
What he was trying to say, to warn us against, was we should not take artificial intelligence too far. He wanted humanity to remain special.
He did not want those inspirational attributes that have measured humanity’s value to be boiled down to nothing more than a bunch of algorithms.
It would mean the concept of the human spirit was nothing.
That we strove to recreate the human mind only to find it superfluous.
He found this possibility demoralizing and he prayed his children would not live long enough to see it happen.
What Hofstadter and others have tried to address was the reality of unintended consequences.
For example, think about the continued growth of the socio-economic divide rooted in digital technology.
Those with technology will live ever more comfortable and healthy lives, but what about those who have little if any access to these technologies?
How do we ensure they will not be left behind?
How do we distribute the benefits equitably while keeping at bay the harm technology can do?
What about the issue of determining the point when scientific development crosses a moral line from creating hugely beneficial applications such as curing cancer or helping a paralyzed person to walk again to creating a race of super humans to control society?
The fundamental technology might be the same, but how it is applied matters.
They are two sides of the same coin.
While these scenarios may seem to exist in the realm of science fiction or fantasy, we are closer to achieving them than ever before.
How do we walk into the future and know we can ensure science will only be used for good?
If the past is any guide, Hofstadter’s concern is more than apt.
So, how do we navigate these ever increasing moral dilemmas?
To quote Christian Lous Lange, “Technology is a useful servant but a dangerous master.”
Which will it become?