Tagging is a type of classification that can be defined as the automatic assignment of descriptors to tokens. The descriptor is called a tag, and it can represent a part of speech, semantic information, and so on. The task of labeling each word in a sentence with its proper part of speech is known as "POS tagging." Parts of speech include nouns, verbs, adverbs, adjectives, pronouns, conjunctions, and their subcategories. Several tagging methods are available, including Rule-based POS Tagging, Stochastic POS Tagging, Transformation-based Tagging, and Hidden Markov Model (HMM) POS Tagging.
- Rule-based POS Tagging: One of the oldest tagging techniques. Rule-based taggers consult a dictionary or lexicon to find the possible tags for each word. When a word has more than one possible tag, handwritten rules are used to select the correct one.
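- Stochastic POS Tagging: Any model that incorporates frequency or probability (statistics) can be called stochastic. The term "stochastic tagger" therefore covers a variety of approaches to part-of-speech tagging, such as choosing the tag a word carries most often in a training corpus, or choosing the most probable sequence of tags for a sentence.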
- Transformation-based Tagging: Also known as Brill tagging. It is an instance of transformation-based learning (TBL), a rule-based algorithm that automatically assigns POS tags to the given text. TBL starts from an initial tagging and repeatedly applies transformation rules that change one tag into another in a given context. Because these rules are stored explicitly, the linguistic knowledge stays in a readable form.
- Hidden Markov Model (HMM) POS Tagging: An HMM is a doubly embedded stochastic model in which the underlying stochastic process is hidden. This hidden process can only be observed through another set of stochastic processes that produces the sequence of observations. In POS tagging, the hidden states are the tags and the observations are the words.
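
To make two of the approaches above more concrete, here is a minimal Python sketch of a rule-based tagger and an HMM tagger decoded with the Viterbi algorithm. Everything in it is an assumption made for illustration: the tiny lexicon, the two handwritten rules, the tag names, and the start, transition, and emission probabilities are hand-picked toy values, not estimates from a real corpus. The function names `rule_based_tag` and `viterbi_tag` are likewise hypothetical.

```python
import math

# Toy lexicon (hypothetical): each word maps to the tags it may carry.
LEXICON = {
    "the": ["DET"],
    "dog": ["NOUN"],
    "can": ["AUX", "NOUN", "VERB"],
    "run": ["VERB", "NOUN"],
}


def rule_based_tag(words):
    """Look up candidate tags, then disambiguate with handwritten rules."""
    tags = []
    for word in words:
        # Unknown words default to NOUN (an assumption of this sketch).
        candidates = LEXICON.get(word.lower(), ["NOUN"])
        prev = tags[-1] if tags else None
        if len(candidates) == 1:
            tag = candidates[0]
        elif prev == "DET" and "NOUN" in candidates:
            tag = "NOUN"  # rule: after a determiner, prefer a noun
        elif prev == "AUX" and "VERB" in candidates:
            tag = "VERB"  # rule: after an auxiliary, prefer a verb
        else:
            tag = candidates[0]  # fall back to the first listed tag
        tags.append(tag)
    return tags


# Toy HMM parameters (hypothetical, hand-picked rather than learned).
STATES = ["DET", "NOUN", "AUX", "VERB"]
START = {"DET": 0.6, "NOUN": 0.2, "AUX": 0.1, "VERB": 0.1}
TRANS = {
    "DET": {"DET": 0.01, "NOUN": 0.90, "AUX": 0.04, "VERB": 0.05},
    "NOUN": {"DET": 0.05, "NOUN": 0.15, "AUX": 0.40, "VERB": 0.40},
    "AUX": {"DET": 0.10, "NOUN": 0.10, "AUX": 0.05, "VERB": 0.75},
    "VERB": {"DET": 0.40, "NOUN": 0.30, "AUX": 0.10, "VERB": 0.20},
}
EMIT = {
    "DET": {"the": 0.9},
    "NOUN": {"dog": 0.5, "can": 0.2, "run": 0.1},
    "AUX": {"can": 0.8},
    "VERB": {"run": 0.6, "can": 0.1},
}
SMOOTH = 1e-6  # small probability for word/tag pairs missing from EMIT


def viterbi_tag(words):
    """Return the most probable hidden tag sequence for the observed words."""
    if not words:
        return []
    words = [w.lower() for w in words]

    def emit(tag, word):
        return math.log(EMIT[tag].get(word, SMOOTH))

    # best[i][tag] = (log-probability of the best path ending in tag, previous tag)
    best = [{t: (math.log(START[t]) + emit(t, words[0]), None) for t in STATES}]
    for word in words[1:]:
        column = {}
        for t in STATES:
            score, prev = max(
                (best[-1][p][0] + math.log(TRANS[p][t]), p) for p in STATES
            )
            column[t] = (score + emit(t, word), prev)
        best.append(column)

    # Follow the back-pointers from the best final tag to the first word.
    tag = max(STATES, key=lambda t: best[-1][t][0])
    path = [tag]
    for column in reversed(best[1:]):
        tag = column[tag][1]
        path.append(tag)
    return path[::-1]


sentence = "the dog can run".split()
print(rule_based_tag(sentence))
print(viterbi_tag(sentence))
```

In the rule-based function, the knowledge lives in the lexicon and the `if` branches, so changing the tagger's behavior means editing rules by hand. In the HMM function, the same decisions fall out of the probability tables, which in practice would be estimated from a tagged corpus rather than written by hand.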