The third stage of NLP is syntactic analysis, also known as syntax analysis or parsing. The goal of this phase is to determine the grammatical structure of the text by checking it against the rules of a formal grammar. Syntax analysis is concerned with well-formedness rather than meaning: a phrase such as "hot ice cream" is grammatical, so it would be rejected by a semantic analyzer, not by the parser.
In this sense, syntactic analysis, or parsing, can be defined as the process of analyzing strings of symbols in a natural language for conformance to the rules of a formal grammar.
- Concept of Parser: A parser is the software component that carries out parsing. It accepts input data (text), checks it for correct syntax against a formal grammar, and returns a structural representation of the input, typically a parse tree, an abstract syntax tree, or another hierarchical structure. The two main strategies are top-down parsing, which starts from the start symbol and expands it toward the input, and bottom-up parsing, which starts from the input and reduces it to the start symbol.
- Concept of Derivation: A derivation is a sequence of production-rule applications that rewrites the start symbol into the input string. At each step of parsing, we must decide which non-terminal to replace and which production rule to replace it with. The choice of non-terminal gives two kinds of derivation: left-most, which always replaces the left-most non-terminal, and right-most, which always replaces the right-most one.
- Concept of Parse Tree: A parse tree is a graphical representation of a derivation. Its root is the start symbol of the derivation, its interior nodes are non-terminals, and its leaf nodes are terminals. Reading the leaves from left to right yields the original input string.
- Concept of Grammar: A grammar is essential for describing the syntactic structure of well-formed sentences. In the literary sense, grammars denote the syntactic rules for conversation in natural languages. Linguists have long attempted to define grammars for natural languages such as English and Hindi.
- Phrase Structure or Constituency Grammar: Phrase structure grammar is based on the constituency relation, which is why it is also called constituency grammar. It is usually contrasted with dependency grammar.
- Dependency Grammar: Dependency grammar (DG) is based on the dependency relation between words and is usually contrasted with constituency grammar. Unlike constituency grammar, it has no phrasal nodes; words are linked directly to one another.
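- Context-Free Grammar: A context-free grammar (CFG) is a notation for describing languages. It is strictly more expressive than a regular grammar: every regular language can be described by a CFG, but not every context-free language is regular.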
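
The concepts above can be tied together in a small sketch. The Python code below is an illustration under stated assumptions, not a reference implementation: the toy grammar, the example sentences, and the helper names (`GRAMMAR`, `parse_sentence`, `leaves`, `leftmost_derivation`) are all hypothetical and were chosen only for this example. It shows a context-free grammar written as production rules, a simple top-down (recursive-descent) parser with backtracking, the parse tree it builds, and the left-most derivation that the tree encodes.

```python
from typing import Dict, Iterator, List, Optional, Tuple

# Toy context-free grammar (hypothetical, for illustration only).
# Keys are non-terminals; each value lists the alternative right-hand sides.
# Any symbol that is not a key is treated as a terminal (a word).
# The grammar has no left recursion, so recursive descent terminates.
GRAMMAR: Dict[str, List[List[str]]] = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"]],
    "VP": [["V", "NP"], ["V"]],
    "Det": [["the"], ["a"]],
    "N": [["dog"], ["cat"]],
    "V": [["chased"], ["slept"]],
}

# A tree node is a terminal (str) or a pair (non_terminal, children).


def parse(symbol: str, tokens: List[str], pos: int) -> Iterator[Tuple[object, int]]:
    """Top-down parse: yield (tree, next_position) for `symbol` at `pos`."""
    if symbol not in GRAMMAR:
        # Terminal: it must match the next input token exactly.
        if pos < len(tokens) and tokens[pos] == symbol:
            yield symbol, pos + 1
        return
    # Non-terminal: try every production rule in turn (backtracking).
    for rhs in GRAMMAR[symbol]:
        for children, end in match_sequence(rhs, tokens, pos):
            yield (symbol, children), end


def match_sequence(symbols: List[str], tokens: List[str], pos: int) -> Iterator[Tuple[list, int]]:
    """Match a right-hand side symbol by symbol, left to right."""
    if not symbols:
        yield [], pos
        return
    for first, mid in parse(symbols[0], tokens, pos):
        for rest, end in match_sequence(symbols[1:], tokens, mid):
            yield [first] + rest, end


def parse_sentence(sentence: str) -> Optional[object]:
    """Return the first parse tree covering the whole sentence, or None."""
    tokens = sentence.split()
    for tree, end in parse("S", tokens, 0):
        if end == len(tokens):
            return tree
    return None


def leaves(tree: object) -> List[str]:
    """Read the terminals of a parse tree from left to right."""
    if isinstance(tree, str):
        return [tree]
    _, children = tree
    return [leaf for child in children for leaf in leaves(child)]


def label(node: object) -> str:
    """Symbol at a node: the word itself or the non-terminal name."""
    return node if isinstance(node, str) else node[0]


def leftmost_derivation(tree: object) -> List[List[str]]:
    """List the sentential forms of the left-most derivation for `tree`."""
    form = [tree]
    steps = [[label(n) for n in form]]
    while True:
        # Find the left-most node that is still a non-terminal.
        idx = next((i for i, n in enumerate(form) if not isinstance(n, str)), None)
        if idx is None:
            return steps
        # Replace it with the children chosen by its production rule.
        form = form[:idx] + form[idx][1] + form[idx + 1:]
        steps.append([label(n) for n in form])


if __name__ == "__main__":
    tree = parse_sentence("the dog chased a cat")
    print(tree)
    for step in leftmost_derivation(tree):
        print(" ".join(step))
    # A string that violates the grammar has no parse tree.
    print(parse_sentence("dog the chased"))
```

Calling `leaves(tree)` on the result returns the words of the input in their original order, which is the left-to-right property of parse trees described in the list. A right-most derivation could be obtained from the same tree by replacing the right-most non-terminal at each step instead. A dependency analysis of the same sentence would have no `NP` or `VP` nodes at all, only links between the words themselves.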