What is Natural Language Processing?
Natural language processing (NLP) is a subfield of artificial intelligence in which computer algorithms are used to process natural language data. It covers both natural language understanding and natural language generation.
What methods are applied in Natural Language Processing?
Symbolic (or rule-based) NLP was the first methodology applied to natural language, starting in the 1950s, and remained the predominant approach until the 1990s, when statistical methods gained dominance. Since the turn of the century, machine learning algorithms and, more recently, artificial neural networks have been the predominant approaches to natural language processing problems.
What are the common high level Natural Language Processing tasks?
Automatic Summarization
Producing an automatic summary of a large body of text.
Machine Translation
Translating text from one human language to another.
Dialogue Systems (Chatbots)
Holding a chat conversation with humans, for example to gather information or help with queries.
Question Answering
Providing automatic answers to questions posed in human language, most often where a specific answer exists.
Sentiment Analysis
Computing a polarity score for often subjective information such as reviews or tweets.
Topic Modeling
Automatically discovering the abstract topics that occur in a body of text.
Language Modeling
Assigning an occurrence probability to any sequence of words.
Information Retrieval
Searching for requested information and ranking the results.
Word Sense Disambiguation
Identifying which sense of a word is used in a sentence.
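As an illustration of one of the tasks above, assigning an occurrence probability to a sequence of words can be sketched with a bigram model. This is a minimal sketch, not a reference implementation: the function names, the toy corpus, whitespace tokenization, and add-one smoothing are all assumptions made for the example.

```python
import math
from collections import Counter


def train_bigram_model(sentences):
    """Count unigrams and bigrams over whitespace-tokenized sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        # <s> and </s> mark the sentence start and end (an assumed convention).
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def sequence_log_prob(sentence, unigrams, bigrams):
    """Log-probability of a sentence under an add-one smoothed bigram model."""
    # "<s>" is never predicted, so it is excluded from the vocabulary size.
    vocab_size = len(unigrams) - 1
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    log_prob = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        # Add-one smoothing keeps unseen bigrams from getting zero probability.
        numerator = bigrams[(prev, word)] + 1
        denominator = unigrams[prev] + vocab_size
        log_prob += math.log(numerator / denominator)
    return log_prob


# Hypothetical toy corpus, for illustration only.
corpus = ["the cat sat on the mat", "the dog sat on the rug"]
unigrams, bigrams = train_bigram_model(corpus)

seen = sequence_log_prob("the cat sat on the rug", unigrams, bigrams)
shuffled = sequence_log_prob("rug the on sat cat the", unigrams, bigrams)
print(seen > shuffled)
```

With this toy corpus, the word order that resembles the training sentences receives a higher log-probability than the shuffled one, since every bigram in the shuffled sentence is unseen.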
What are the common low level Natural Language Processing tasks?
Word Segmentation (Tokenization)
Segmentation of a body of text into smaller tokens (often words).
Sentence Boundary Disambiguation
Determining where sentences start and end within a body of text.
Part of Speech Tagging
Associating every word with a part of speech based on its definition and context.
Named Entity Recognition
Determining which words in a body of text map to proper names such as names of people or places.
Stemming
Identifying the root form or word stem of inflected or derived words.
Lemmatization
Grouping all inflected or derived forms of a word together so they can be treated as a single item.
Constituency Parsing
Constructing a tree that represents the syntactic structure of a sentence according to a phrase structure grammar.
Dependency Parsing
Constructing a tree that represents the syntactic structure of a sentence according to a dependency grammar.
Coreference Resolution
Identifying all expressions that refer to the same entity in the text.
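Two of the low level tasks above, sentence boundary detection and word segmentation, can be sketched with regular expressions. This is a naive illustration under stated assumptions: the function names and the patterns are made up for the example, and the sentence splitter will wrongly split after abbreviations such as "Dr.", which is exactly why the task is called disambiguation.

```python
import re


def split_sentences(text):
    """Naive sentence splitter (assumed rule, not a robust method).

    Splits on whitespace that follows '.', '!' or '?' and precedes
    an uppercase letter.
    """
    parts = re.split(r"(?<=[.!?])\s+(?=[A-Z])", text)
    return [part.strip() for part in parts if part.strip()]


def tokenize(sentence):
    """Split a sentence into word tokens and single punctuation marks.

    Words may contain one internal apostrophe, so "isn't" stays whole.
    """
    return re.findall(r"\w+(?:'\w+)?|[^\w\s]", sentence)


text = "NLP is fun. It isn't always easy! Is it useful?"
sentences = split_sentences(text)
tokens = [tokenize(sentence) for sentence in sentences]

print(sentences)
print(tokens[1])  # ['It', "isn't", 'always', 'easy', '!']
```

Keeping punctuation as separate tokens is a design choice here; other tokenizers drop it or split contractions differently.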
Timeline of Natural Language Processing
Here we include a list of important events in the history of natural language processing:
1990s Statistical methods such as A Tree-Based Statistical Language Model (Bahl et al., 1989), Deducing Linguistic Structure from the Statistics of Large Corpora (Brill et al., 1990), Statistical Parsing of Messages (Chitrao and Grishman, 1990), A Statistical Approach to Machine Translation (Brown et al., 1991)
1980s Symbolic methods such as Passing Markers (Charniak, 1983), In-Depth Understanding (Dyer, 1983), Direct Memory Access Parsing (Riesbeck and Martin, 1986), TEAM (Grosz et al., 1987), Semantic Interpretation and the Resolution of Ambiguity (Hirst, 1987)
1970s Conceptual ontologies such as MARGIE (Schank, 1975), SAM (Cullingford, 1978), PAM (Wilensky, 1978), TaleSpin (Meehan, 1976), QUALM (Lehnert, 1977), Politics (Carbonell, 1979)