Part-of-Speech Tagging

Our part-of-speech tagger uses the generalized model from dynamic model selection together with ambiguity classes trained on a large corpus. It processes over 82K tokens per second on an Intel Xeon 2.30GHz machine and achieves state-of-the-art accuracy (97.64% on the WSJ corpus).
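To make the notion of an ambiguity class concrete, here is a minimal Python sketch. It is not the tagger's actual implementation: the function name `build_ambiguity_classes`, the `threshold` parameter, the lowercasing of word forms, and the underscore-joined class label are all assumptions made for illustration. The idea shown is only that each word form is mapped to the set of tags it is seen with in a tagged corpus.

```python
from collections import Counter, defaultdict


def build_ambiguity_classes(tagged_tokens, threshold=0.1):
    """Map each lowercased word form to a hypothetical ambiguity-class label.

    The label is the sorted list of tags whose relative frequency for that
    word form is at least `threshold`, joined with "_". Both the threshold
    and the label format are illustrative assumptions.
    """
    counts = defaultdict(Counter)
    for word, tag in tagged_tokens:
        counts[word.lower()][tag] += 1

    classes = {}
    for word, tag_counts in counts.items():
        total = sum(tag_counts.values())
        kept = sorted(t for t, c in tag_counts.items() if c / total >= threshold)
        classes[word] = "_".join(kept)
    return classes


# Toy corpus of (word, tag) pairs; a real corpus would be far larger.
corpus = [
    ("The", "DT"), ("run", "NN"), ("was", "VBD"), ("long", "JJ"),
    ("They", "PRP"), ("run", "VBP"), ("daily", "RB"),
    ("to", "TO"), ("run", "VB"),
]
classes = build_ambiguity_classes(corpus)
print(classes["run"])  # NN_VB_VBP
print(classes["the"])  # DT
```

A tagger can then use such a label as a feature, so that a word seen as both a noun and a verb is treated differently from one seen with a single tag.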

English Tags

Tag     Description                               Version
$       Dollar                                    1.0.0
:       Colon                                     1.0.0
,       Comma                                     1.0.0
.       Period                                    1.0.0
``      Left quote                                1.0.0
''      Right quote                               1.0.0
-LRB-   Left bracket                              1.0.0
-RRB-   Right bracket                             1.0.0
ADD     Email                                     1.0.0
AFX     Affix                                     1.0.0
CC      Coordinating conjunction                  1.0.0
CD      Cardinal number                           1.0.0
DT      Determiner                                1.0.0
EX      Existential there                         1.0.0
FW      Foreign word                              1.0.0
GW      Go with                                   1.0.0
HYPH    Hyphen                                    1.0.0
IN      Preposition or subordinating conjunction  1.0.0
JJ      Adjective                                 1.0.0
JJR     Adjective, comparative                    1.0.0
JJS     Adjective, superlative                    1.0.0
LS      List item marker                          1.0.0
MD      Modal                                     1.0.0
NFP     Superfluous punctuation                   1.0.0
NN      Noun, singular or mass                    1.0.0
NNS     Noun, plural                              1.0.0
NNP     Proper noun, singular                     1.0.0
NNPS    Proper noun, plural                       1.0.0
PDT     Predeterminer                             1.0.0
POS     Possessive ending                         1.0.0
PRP     Personal pronoun                          1.0.0
PRP$    Possessive pronoun                        1.0.0
RB      Adverb                                    1.0.0
RBR     Adverb, comparative                       1.0.0
RBS     Adverb, superlative                       1.0.0
RP      Particle                                  1.0.0
SYM     Symbol                                    1.0.0
TO      To                                        1.0.0
UH      Interjection                              1.0.0
VB      Verb, base form                           1.0.0
VBD     Verb, past tense                          1.0.0
VBG     Verb, gerund or present participle        1.0.0
VBN     Verb, past participle                     1.0.0
VBP     Verb, non-3rd person singular present     1.0.0
VBZ     Verb, 3rd person singular present         1.0.0
WDT     Wh-determiner                             1.0.0
WP      Wh-pronoun                                1.0.0
WP$     Wh-pronoun, possessive                    1.0.0
WRB     Wh-adverb                                 1.0.0
XX      Unknown                                   1.0.0
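
When consuming tagger output, it is often convenient to collapse these fine-grained tags into broader groups. The sketch below is one hypothetical way to do that in Python, relying only on the prefixes visible in the tag table above (NN*, VB*, JJ*, RB*). The helper name `coarse_category` and the category labels are assumptions for illustration and are not part of the tagger.

```python
# Prefix-to-category mapping derived from the tag table above.
# The category names ("noun", "verb", ...) are illustrative assumptions.
COARSE_PREFIXES = (
    ("NN", "noun"),       # NN, NNS, NNP, NNPS
    ("VB", "verb"),       # VB, VBD, VBG, VBN, VBP, VBZ
    ("JJ", "adjective"),  # JJ, JJR, JJS
    ("RB", "adverb"),     # RB, RBR, RBS
)


def coarse_category(tag):
    """Return a coarse category for a fine-grained tag, or "other"."""
    for prefix, label in COARSE_PREFIXES:
        if tag.startswith(prefix):
            return label
    return "other"


# Example: keep only the verbs from a tagged sentence.
tagged = [("She", "PRP"), ("has", "VBZ"), ("been", "VBN"), ("running", "VBG")]
verbs = [word for word, tag in tagged if coarse_category(tag) == "verb"]
print(verbs)  # ['has', 'been', 'running']
```

Note that prefix matching deliberately leaves tags such as MD (modal) and RP (particle) in the "other" group; whether they belong with verbs or adverbs depends on the application.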