Wednesday, 25 December 2019 14:53

What is Natural Language Processing (NLP)?

Author: Ben Dickson [Source: This article was published in pcmag.com]

How does AI extract meaning from the text? It's not as simple—and definitely not as easy—as you might think.

In September, the Allen Institute for Artificial Intelligence (AI2) unveiled a computer program called Aristo that could correctly answer more than 90 percent of the questions on an eighth-grade science test. Passing a middle-school exam might sound mundane, but it's complicated for computers.

Aristo found its answers from among billions of documents using natural language processing (NLP), a branch of computer science and artificial intelligence that enables computers to extract meaning from unstructured text. Though we're still a long way from machines that can understand and speak human language, NLP has become pivotal in many applications that we use every day, including digital assistants, web search, email, and machine translation.

Words Are Hard

Replicating the language-processing capabilities of the human mind is a historic pain point for artificial intelligence. Imagine an AI agent that must respond to weather-condition queries; it has to understand all the different ways someone can ask about the weather:

  • How is the weather today?
  • Will it rain tomorrow?
  • When will it stop raining?
  • Is it sunny in Chicago?
  • Will it be warmer tomorrow?
  • Which days are sunny next week?

And in many cases, language carries hidden meanings that imply general knowledge about the world and how objects relate. Consider the following queries:

  • Will the weather be good for soccer tomorrow?
  • Is it snowing in the kitchen?

Any human hearing the first sentence will know that you're implicitly asking whether it will be sunny tomorrow—or perhaps just whether it won't rain. As for the second sentence, people know it doesn't snow in the kitchen. But encoding this kind of background knowledge and reasoning in artificial intelligence systems has always been a challenge for researchers.

Classical approaches to natural language processing used symbolic AI systems, in which software engineers explicitly specified the rules for parsing the meaning of language. The process was labor-intensive and had limited application. For instance, developers would have to manually write down all the ways a user might ask about the weather and then provide the appropriate answer.
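A rule-based weather bot of this kind can be sketched in a few lines. The patterns and canned responses below are invented for illustration; the point is that every phrasing a user might use has to be anticipated by hand.

```python
import re

# A minimal sketch of a symbolic, rule-based weather bot: every way of
# asking must be anticipated with a hand-written pattern and answer.
RULES = [
    (re.compile(r"\bhow is the weather\b", re.I), "It is sunny, 22 degrees."),
    (re.compile(r"\bwill it rain\b", re.I), "No rain is expected tomorrow."),
    (re.compile(r"\bis it sunny in (\w+)\b", re.I), "Yes, {0} is sunny."),
]

def answer(query: str) -> str:
    for pattern, response in RULES:
        match = pattern.search(query)
        if match:
            return response.format(*match.groups())
    # Anything outside the hand-coded rules breaks the system.
    return "Sorry, I don't understand."

print(answer("Is it sunny in Chicago?"))   # matches a rule
print(answer("Any chance of drizzle?"))    # falls outside the rules
```

The second query means the same thing as the rain rule, but because it uses different words, the system fails, which is exactly the brittleness described above.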

These systems only worked as long as users stayed within the limits of their specified behavior. A query that deviated even slightly from the encoded rules would break them. Users had to adjust their behavior to the limits of the AI system, which made for an error-prone and frustrating experience.

The limits of rule-based systems became even more evident when they processed long excerpts of text composed of several sentences that required a lot of contextual knowledge. This was especially true in domains such as translation, where converting a long text from one language to another requires information about the source and destination languages as well as history and culture. In these instances, the behavior of the AI became so erratic that it was nearly impossible to use, except for very simple tasks.

Deep Learning and NLP

The past few years have seen a revolution in deep learning, an AI technique that is especially good at handling unstructured information such as images, sound, and text. Instead of manually defining the behavior of deep learning algorithms, software engineers "train" them by providing them with many examples.

To train a weather-reporting algorithm, the engineers provide it with many different examples of how users ask about the weather and the proper way to answer them. The algorithm analyzes these examples and creates a statistical model that represents the common traits in the sequences of words used to ask about the weather. It can then map new sentences it hasn't seen before to the correct answers.
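The idea of learning from examples rather than rules can be illustrated with a toy word-statistics classifier. The intent names and sample queries below are made up; real systems use neural networks over far larger datasets, but the principle of generalizing from labeled examples is the same.

```python
from collections import Counter

# Toy sketch of "training by example": labeled sample queries per intent.
TRAINING = {
    "weather_today": ["how is the weather today", "what is it like outside"],
    "rain_forecast": ["will it rain tomorrow", "when will it stop raining"],
    "temperature":   ["will it be warmer tomorrow", "how hot is it today"],
}

# "Training": build a word-frequency profile for each intent.
profiles = {intent: Counter(" ".join(examples).split())
            for intent, examples in TRAINING.items()}

def classify(query: str) -> str:
    words = query.lower().split()
    # Score each intent by how often it has seen the query's words.
    scores = {intent: sum(profile[w] for w in words)
              for intent, profile in profiles.items()}
    return max(scores, key=scores.get)

print(classify("will it rain this weekend"))  # → rain_forecast
```

The query "will it rain this weekend" never appears in the training data, yet the statistics learned from the examples still map it to the right intent.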

Not only does deep learning obviate the need for manually engineering behavior, but it also helps perform much more complicated NLP tasks, such as translation.

A deep-learning algorithm trained on a large corpus of English documents and their corresponding French versions can find clever ways to do more than word-for-word translation, such as finding equivalent idioms and proverbs in the two languages. In 2016, Google saw a sudden improvement in its Translate service after switching it to deep learning.

Today, most NLP applications use some form of deep learning.

Applications

NLP is leaving its mark in many domains; in several areas, advances in the field have even paved the way for entirely new applications.

Digital assistants: Alexa, Siri, and Cortana use natural language processing to map your sentences to specific skills and applications. Thanks to advances in NLP, you can speak to your assistant in an almost-casual way. Digital assistants can respond to variations of simple commands such as setting alarms and reminders, playing music, and turning the lights on and off.

Google's Duplex service is an example of how far advances in NLP have come: With some caveats, Duplex can make reservations on behalf of the user and engage in conversations with receptionists. It can also monitor conversations and extract actionable items from chats and emails.

Chatbots: Advances in natural language processing in the past few years have renewed interest in chatbots, applications that replace user interface elements (buttons, menus, and the like) with conversational interfaces delivered through messaging and social media apps.

You'll find chatbot apps in many different domains, including health care, banking, customer service, and news. Users can interact with a chatbot (almost) as though they were interacting with a person, such as a physician or a banking advisor.

Web search: Previously, searching the web was limited to looking for keywords on webpages. Currently, search engines use technologies such as word embedding, a type of AI model that looks for keywords and terms that are related to the original search query.
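The intuition behind word embeddings is that each word becomes a vector, and related terms sit close together, so a search for "rain" can also surface pages about "drizzle." The three-dimensional vectors below are invented for illustration; real models learn hundreds of dimensions from large text corpora.

```python
import math

# Made-up toy embeddings; real ones are learned from text.
EMBEDDINGS = {
    "rain":    [0.9, 0.1, 0.0],
    "drizzle": [0.8, 0.2, 0.1],
    "banana":  [0.0, 0.1, 0.9],
}

def cosine(u, v):
    # Cosine similarity: 1.0 means same direction, 0.0 means unrelated.
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

# "rain" is close to "drizzle" but far from "banana".
print(cosine(EMBEDDINGS["rain"], EMBEDDINGS["drizzle"]))
print(cosine(EMBEDDINGS["rain"], EMBEDDINGS["banana"]))
```

A search engine using such vectors can rank a page mentioning "drizzle" as relevant to a "rain" query even though the keyword never appears.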

More recently, Google has incorporated BERT, a state-of-the-art language model, into its search engine to further improve its search results. Aristo, the AI previously mentioned, also uses a variation of BERT to find answers to science questions in its corpus of science material.

Email: Many email services use NLP to detect and filter spam. Also, features such as autocomplete and smart compose use NLP to take on some of a user's typing, especially on mobile devices.
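The autocomplete idea can be sketched with a bigram model: count which word most often follows each word in past messages, then suggest the most frequent follower. The message history below is invented, and production systems such as Smart Compose use neural language models, but the prediction principle is similar.

```python
from collections import Counter, defaultdict

# Invented message history standing in for a user's past emails.
HISTORY = [
    "see you tomorrow morning",
    "see you at the meeting",
    "see you tomorrow at noon",
]

# "Training": count bigram frequencies (which word follows which).
bigrams = defaultdict(Counter)
for message in HISTORY:
    words = message.split()
    for prev, nxt in zip(words, words[1:]):
        bigrams[prev][nxt] += 1

def suggest(prev_word: str) -> str:
    # Suggest the most frequent follower of the previous word.
    followers = bigrams[prev_word]
    return followers.most_common(1)[0][0] if followers else ""

print(suggest("you"))  # → tomorrow
```

After typing "see you", the model suggests "tomorrow" because that continuation was most common in the history.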

Social media: Social media platforms use NLP for a variety of tasks, including detecting hate speech (or trying), evaluating the sentiment of post content, and flagging suicidal posts.
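Sentiment evaluation can be illustrated with a simple lexicon-based scorer. The word lists here are a small hand-picked sample; platforms actually use trained models, but the task of scoring a post's tone is the same.

```python
# Tiny hand-picked sentiment lexicons for illustration only.
POSITIVE = {"great", "love", "happy", "excellent"}
NEGATIVE = {"awful", "hate", "sad", "terrible"}

def sentiment(post: str) -> str:
    # Normalize words and count hits against each lexicon.
    words = {w.strip(".,!?").lower() for w in post.split()}
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(sentiment("I love this, it is great!"))  # → positive
print(sentiment("What an awful day."))         # → negative
```

This crude approach also hints at why the article hedges with "(or trying)": a lexicon cannot handle negation or sarcasm, which is where deep-learning models earn their keep.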

Limits of Current NLP

Despite the flexibility deep learning brings to NLP, current AI is still far from understanding language in the way that humans do.

Deep-learning models owe their accuracy to the huge amounts of data they are trained on. The more examples an AI system sees, the more likely it is to answer new questions correctly. At its heart, deep learning is doing pattern matching, using complex math to map inputs to outputs based on statistics and similarities. And pattern matching is different from understanding the meaning of words and sentences.

In fact, language models based on deep learning still suffer from some of the same fundamental problems that their rule-based predecessors did. When they become involved in tasks that require general knowledge about people and things, deep-learning language models often make silly errors. This is why many companies are still hiring thousands of human operators to steer AI algorithms in the right direction.

To be fair, real natural language processing probably won't be possible until we crack the code of human-level AI, the kind of synthetic intelligence that really works like the human brain. But as we move toward that elusive goal, our discoveries are helping bridge the communication gap between humans and computers.

