The Building Blocks of Artificial Intelligence

December 31, 2020

Do the benefits of AI outweigh the dangers? Here we track how AI is already shaping humanity.

In 1950, mathematician Alan Turing proposed that the question of machine intelligence would effectively be settled when a computer could convince human judges, over five minutes of conversation, that they were talking to another person rather than a machine. The idea became a cornerstone of computer science and the basis of a 1990s competition funded by an eccentric millionaire. When a computer fulfilled Turing’s prediction and convinced 30% of judges it was human at a 2014 Turing Test competition, humans jokingly wrote about “welcoming their new robot overlords.”

However, humans unsettled by this descent into the uncanny valley should buckle their seat belts for the long haul. Over the past few years, processes long associated with artificial intelligence (AI), such as machine learning and neural networks, have left their computer science context and become popularly linked to mainstream technological advances such as so-called deepfakes and voice recognition technology. From predicting and identifying disease to revolutionizing the way people work, the next few decades will tell the story of the rise of machine intelligence.

Explaining Artificial Intelligence

As AI has claimed an ever-larger share of the spotlight in the popular press, news articles have thrown around terms like “neural network” and “machine learning” almost interchangeably. As machine learning and neural networks become the backbone of increasingly impressive technologies (last year’s video clips of an animated Mona Lisa, produced at Samsung’s AI lab in Moscow, come to mind), it is important to understand what these subfields of AI can produce now and what they might create in the future.

A blog post from IBM offers an excellent analogy for understanding the subfields of AI technology. Program manager Eva Kavlakoglu describes the subfields of AI as nesting dolls: machine learning, deep learning, and neural networks. Machine learning is the broadest subfield, referring to technologies that allow computer systems “to learn and improve without being programmed or supervised.” “Non-deep” or classical machine learning refers to processes that require human intervention, in the form of labeled datasets, in order to learn. In an application like speech recognition, classical machine learning might require a human speaker to read vocabulary words into the system so it can learn to recognize speech. In contrast, deep learning applications don’t require labeled datasets, instead using unlabeled data to train themselves.

Deep learning differs from classical or supervised machine learning in that it is unsupervised, using unlabeled data to teach itself how to complete a task. Deep learning is also defined by the depth of the network involved in the learning process; a neural network with more than three layers can be considered a deep learning algorithm.
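
To make the labeled-versus-unlabeled distinction concrete, here is a minimal sketch in Python, assuming scikit-learn is installed. A simple classifier stands in for classical, supervised learning, and a clustering algorithm stands in for learning without labels; the tiny dataset and its values are invented purely for illustration and do not come from the IBM post.

    # Supervised learning needs human-supplied labels; unsupervised learning does not.
    # Assumes scikit-learn; the four data points below are made up for illustration.
    from sklearn.linear_model import LogisticRegression
    from sklearn.cluster import KMeans

    points = [[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]]
    labels = [0, 0, 1, 1]  # labels a human would have to provide (the "non-deep" setting)

    supervised = LogisticRegression().fit(points, labels)       # learns from the labels
    unsupervised = KMeans(n_clusters=2, n_init=10).fit(points)  # finds groups on its own

    print(supervised.predict([[0.15, 0.15]]))  # predicted label for a new point
    print(unsupervised.labels_)                # cluster assignments found without labels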

Neural networks, the smallest subfield of AI, are algorithmic networks loosely modeled on the human brain. A neural network is made up of thousands of processing nodes, each of which receives data items through its incoming connections. The node assigns each connection a weight, multiplies the data item arriving on that connection by the weight, and adds the resulting products together. If the sum exceeds a specific threshold value, the node passes its result along to the next layer. After traveling and transforming through multiple layers (more than three if it’s a deep learning network), very different looking data arrives at the output layer.
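
As a rough illustration of the node computation just described, the Python sketch below implements a single node: it weights each incoming data item, sums the products, and only passes a signal on if the sum clears a threshold. The input values, weights, and threshold are arbitrary example numbers, not taken from any real network.

    # A single processing node, as described above (illustrative values only).
    def node_output(inputs, weights, threshold):
        # Multiply each incoming data item by its connection's weight, then sum the products.
        weighted_sum = sum(x * w for x, w in zip(inputs, weights))
        # Pass information to the next layer only if the sum exceeds the threshold.
        return weighted_sum if weighted_sum > threshold else 0.0

    # Example: one node with three incoming connections.
    print(node_output(inputs=[0.5, 0.9, 0.2], weights=[0.8, -0.4, 1.1], threshold=0.2))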

Current and Future Applications of AI

Neural networks can already be used in a variety of machine learning applications. For example, neural networks can be trained to sort through images and identify certain objects within each image; this process is known as image classification. More complex neural networks can identify objects even within images not intended for image classification. Neural networks can also be trained to generate text of a particular type; for example, you could feed an algorithm all of Shakespeare’s works to try to get a neural network to write like the Bard. Neural networks can even be used to generate images of a particular class, such as photorealistic images of faces that are not photographs of any real person.
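
As a hedged sketch of the image classification workflow described above, the example below trains a small neural network on the public MNIST handwritten digit dataset using the Keras API (assuming TensorFlow is installed); the layer sizes and training settings are arbitrary choices for illustration, not a reference implementation.

    # Minimal image classification sketch; assumes TensorFlow/Keras is available.
    import tensorflow as tf

    # Load a small labeled image dataset (28x28 pictures of handwritten digits).
    (x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
    x_train, x_test = x_train / 255.0, x_test / 255.0  # scale pixel values to [0, 1]

    # A small feed-forward network; the layer sizes here are arbitrary example choices.
    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),  # one output per digit class
    ])

    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])

    # Train briefly, then check accuracy on images the network has never seen.
    model.fit(x_train, y_train, epochs=3)
    model.evaluate(x_test, y_test)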

Perhaps the most notorious application of neural networks has been the deepfake videos that exploded across the internet over the last few years. The term “deepfake” refers to fake videos made by feeding hours of real footage into a neural network so that a celebrity or public figure appears to say whatever the creator wants. In 2017, researchers at the University of Washington and the VISITEC institute in Thailand collaborated on a project called “Synthesizing Obama: Learning Lip Sync From Audio.” To make the deepfake video, the researchers trained a computer on hours of footage from former U.S. president Barack Obama’s speeches and grafted the president’s mouth shapes onto the head of a person in another video. That same year, Reddit became populated with deepfake porn videos of celebrities such as Gal Gadot and Taylor Swift.

In what some have already labeled a “post-truth world,” deepfakes could serve as a stealthy tool for manipulating and falsifying political discourse. Going forward, however, deepfakes are just one of the things neural networks will continue to master. In the future, neural networks might deliver improved stock prediction. They could lead to robots that can see and feel the world around them. Neural networks might build on handwriting analysis to automatically transform handwritten documents into word-processed versions. They might even help researchers understand trends within the human genome and diagnose medical problems.

Neural Networks and Speech Recognition

Speech recognition is just one area where neural networks will help researchers develop better technology in the future. Researchers have been building speech recognition devices since the 1950s, but the technology wasn’t able to recognize natural speech until the 1990s. In 1952, Bell Labs’ Audrey became the first computer to recognize the human voice; she could recognize the numbers zero through nine, spoken aloud by her programmer, about 90% of the time. Today, AI-powered voice assistants (think Apple’s Siri and Amazon’s Alexa) are household names.

Originally, speech recognition technology was powered by classification algorithms. Classification algorithms are simpler machine learning programs that apply labels to various inputs (think of the process that helps sort emails into “spam” and “not spam”). In speech recognition technology, classification algorithms worked to identify a distribution of possible phonemes (i.e., distinctive language sounds) across a given stretch of audio.
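
To make the idea of a classification algorithm concrete, here is a minimal spam-filter sketch, assuming scikit-learn is installed; the handful of example emails and their labels are invented for illustration.

    # Tiny classification sketch: label emails as spam (1) or not spam (0).
    # Assumes scikit-learn; the training emails below are made up for illustration.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB

    emails = [
        "Win a free prize now",
        "Meeting agenda for Monday",
        "Cheap loans, act fast",
        "Lunch tomorrow?",
    ]
    labels = [1, 0, 1, 0]  # human-supplied labels

    # Turn each email into word counts, then fit a simple naive Bayes classifier.
    vectorizer = CountVectorizer()
    features = vectorizer.fit_transform(emails)
    classifier = MultinomialNB().fit(features, labels)

    # Apply a label to a new, unseen message.
    print(classifier.predict(vectorizer.transform(["Claim your free prize"])))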

Now, neural networks can be used in many applications of speech recognition technology, including “phoneme classification, isolated word recognition, audio-visual speech recognition, and speaker adaptation.” Neural networks, unlike simpler machine learning techniques, are better able to mimic the way humans learn language. When children learn languages, they absorb a wide variety of sounds, intonations, and verbal cues. In traditional machine learning algorithms, humans must feed manually labeled datasets into the programs. Neural networks, in contrast, can absorb large amounts of unlabeled data and form connections on their own, mimicking the way the human brain initially learns speech and language.
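
As a small illustration of using modern, neural-network-backed speech recognition in practice, the sketch below assumes the open-source SpeechRecognition package and sends a placeholder audio file to Google’s free web speech API; the file name is hypothetical.

    # Speech-to-text sketch; assumes the SpeechRecognition package is installed.
    # "sample.wav" is a placeholder for any short recording of spoken English.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile("sample.wav") as source:
        audio = recognizer.record(source)  # read the whole file into memory

    # The heavy lifting (acoustic and language modeling) happens in the cloud service.
    print(recognizer.recognize_google(audio))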

Currently, speech recognition technology exists mainly in the form of voice assistants, both in the home and on the road via new in-car voice assistance technology. However, neural networks can spur the development of speech recognition technology in even more fields in the future. For example, many video game companies are working on speech recognition in gameplay. Adding voice controls would make gaming more accessible to people with visual impairments and other physical disabilities and would add another layer of interaction to the gaming experience. Additionally, speech recognition technology might expand beyond what today’s voice assistants can do in the office. Microsoft’s Cortana, for example, can currently handle basic office tasks like scheduling meetings or making travel plans. Future speech recognition technology might allow virtual assistants to generate financial reports or search through computer files for specific pieces of information, making the kind of assistant currently reserved for senior executives available to everyone.

Neural Networks and Computer Vision

Computer vision is another area of AI that has taken off with recent improvements in neural networks. Computer vision technology seeks to replicate in machines the human process of interpreting pictures and videos. Much as with voice recognition technology, researchers began experimenting with computer vision in the 1950s. In the 1970s, computer vision was first used commercially to distinguish between handwritten and typed text. Today, computer vision is behind technologies ranging from facial recognition software to self-driving cars.

Before neural networks were used in computer vision, a programmer who wanted to perform facial recognition had to manually create a database of images and annotate the distances between various facial features in each image. With neural network technology, a developer can instead train an existing algorithm on a large enough set of images without having to label or annotate them ahead of time.
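
As a rough sketch of that shift, the example below uses the open-source face_recognition library (an assumption; any embedding-based library would work similarly) to compare faces by the embeddings a neural network produces rather than by hand-annotated feature distances; the image file names are placeholders.

    # Compare faces via learned embeddings instead of hand-measured feature distances.
    # Assumes the open-source face_recognition package; image paths are placeholders.
    import face_recognition

    known_image = face_recognition.load_image_file("known_person.jpg")
    candidate_image = face_recognition.load_image_file("new_photo.jpg")

    # A pretrained neural network maps each detected face to a 128-number embedding.
    known_encoding = face_recognition.face_encodings(known_image)[0]
    candidate_encoding = face_recognition.face_encodings(candidate_image)[0]

    # Two faces are considered a match if their embeddings are close enough.
    match = face_recognition.compare_faces([known_encoding], candidate_encoding)[0]
    print("Same person" if match else "Different person")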

In the future, computer vision algorithms are expected to extract even more information from individual images than they do now. Self-driving cars currently use cameras and computer vision to identify objects so that they can move safely through the streets; as computer vision technology improves, so will self-driving cars’ ability to operate safely. Image captioning technology is also expected to be combined with natural language generation (NLG) to read captions aloud for people with visual impairments. Computer vision might even ultimately be used to recognize a phone or computer user via camera and display targeted ads accordingly.

Singularity: AI’s Final Frontier

Recent developments in AI have led to cars that drive themselves, facial recognition software, and virtual assistants that fit in a pocket. The future might produce robots that can feel their surroundings and software that can sort trends within the human genome. But what if, at some point, AI technology becomes so powerful that it eliminates the need for human brainpower? Fifteen years ago, futurist Ray Kurzweil predicted when computers might do just that. According to Kurzweil, the singularity, the point in time when machine intelligence exceeds the capabilities of the human brain, could be reached by 2045. Recent research, alongside the advent of neural networks and quantum computing, makes it look as though the singularity might arrive even sooner, perhaps within the early 2020s. While researchers disagree about exactly when that milestone will arrive, they are certain the singularity lies somewhere in the near future.

And as for what the singularity means for the future? AI experts are far from optimistic. The late Stephen Hawking predicted that the singularity would bring the end of the human race. Ray Kurzweil believes that humans will ultimately be replaced by AI, or by a hybrid of humans and machines. Gideon Shmuel, the CEO of eyeSight Technologies, argues that once machines can truly learn by themselves, they will be able to surpass human intelligence in a matter of hours (if not faster). Shmuel argues that it serves human interests to train AI to recognize specific circumstances and ascribe the correct meaning to them. The risk, however, comes with “the AI brain that is responsible for taking the sensory inputs and translating them into action.” That risk lies in how AI technology will treat human life once it surpasses the human ability to translate sensory inputs into action. Until then, computer scientists are left debating what these risks might entail.

The world has come a long way since Alan Turing predicted what it would take for machine intelligence to stand unquestioned. Recent advances in machine learning, driven by neural networks, have changed the way humans work, play, and solve problems, and this technology will only improve in the next several years. In 2014, when a machine passed the Turing Test for the first time, humans joked about welcoming robot reign. With AI likely to surpass human intelligence before long, humans may never see a day when they can bid artificial intelligence farewell.

About Allyson Berri:
Allyson Berri is a Diplomatic Courier Correspondent whose writing focuses on global affairs and economics.
The views presented in this article are the author’s own and do not necessarily represent the views of any other organization.