Context-Free Grammar (CFG) in NLP
August 5, 2025•260 words
A context-free grammar (CFG) is a formal system that defines the syntax of a language, whether a programming language or a natural language, through a set of production rules. It is a core concept in syntactic analysis within NLP.
Standard Terminology
A Context-Free Grammar is a 4-tuple:
G = (V, Σ, R, S) where:
- V = set of non-terminal symbols (e.g., S, NP, VP)
- Σ = set of terminal symbols (actual words, e.g., dog, runs)
- R = set of production rules (e.g., S → NP VP)
- S = start symbol (typically S for sentence)
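To make the 4-tuple concrete, here is a minimal Python sketch of it as a data structure. The class name `CFG`, its field names, and the `is_well_formed` check are illustrative assumptions, not part of any standard library; the check simply encodes the constraints the definition implies (the start symbol is a non-terminal, V and Σ do not overlap, and each rule rewrites a single non-terminal).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CFG:
    """A context-free grammar G = (V, Σ, R, S). Names here are illustrative."""
    nonterminals: frozenset   # V
    terminals: frozenset      # Σ
    rules: tuple              # R, as (left-hand side, right-hand side tuple) pairs
    start: str                # S

    def is_well_formed(self) -> bool:
        """Check the constraints implied by the 4-tuple definition."""
        symbols = self.nonterminals | self.terminals
        return (
            self.start in self.nonterminals
            and self.nonterminals.isdisjoint(self.terminals)
            # Context-free: every left-hand side is a single non-terminal.
            and all(lhs in self.nonterminals for lhs, _ in self.rules)
            # Every right-hand side uses only known symbols.
            and all(sym in symbols for _, rhs in self.rules for sym in rhs)
        )


# A tiny hypothetical grammar, just to exercise the check.
toy = CFG(
    nonterminals=frozenset({"S", "NP", "VP"}),
    terminals=frozenset({"dog", "runs"}),
    rules=(("S", ("NP", "VP")), ("NP", ("dog",)), ("VP", ("runs",))),
    start="S",
)
print(toy.is_well_formed())
```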
Example CFG
Let’s define a simple CFG for English-like sentences:
- Non-terminals (V): S, NP, VP, Det, N, V
- Terminals (Σ): the, dog, cat, runs, sleeps
- Start symbol (S): S
- Rules (R):
  S → NP VP
  NP → Det N
  VP → V
  Det → the
  N → dog | cat
  V → runs | sleeps
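Because this grammar has no recursive rules, its language is finite and small enough to list in full. The sketch below is one possible encoding of the rules as a Python dictionary, with a helper that expands a symbol into every word sequence it can derive. The names `RULES` and `expand` are assumptions made for illustration, and the approach would not terminate on a grammar with recursion.

```python
from itertools import product

# Rules (R) from the example: each non-terminal maps to its alternatives.
RULES = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"]],
    "VP": [["V"]],
    "Det": [["the"]],
    "N": [["dog"], ["cat"]],
    "V": [["runs"], ["sleeps"]],
}


def expand(symbol):
    """Return every terminal string (as a word list) derivable from symbol.

    Terminates only because this grammar has no recursive rules.
    """
    if symbol not in RULES:          # terminal: it stands for itself
        return [[symbol]]
    results = []
    for alternative in RULES[symbol]:
        # Expand each right-hand-side symbol, then combine the pieces in order.
        parts = [expand(s) for s in alternative]
        for combo in product(*parts):
            results.append([word for piece in combo for word in piece])
    return results


sentences = [" ".join(words) for words in expand("S")]
for sentence in sentences:
    print(sentence)
```

With one determiner, two nouns, and two verbs, the grammar generates exactly four sentences.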
Sample Derivation
Sentence: the dog runs
Derivation Steps:
S ⇒ NP VP
⇒ Det N VP
⇒ the N VP
⇒ the dog VP
⇒ the dog V
⇒ the dog runs
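A derivation like this can be reproduced mechanically by always rewriting the leftmost non-terminal and backtracking when the words produced so far disagree with the target sentence. The following is a rough sketch under that assumption; `RULES` and `leftmost_derivation` are hypothetical names, and the search is only guaranteed to terminate for non-recursive grammars such as this one.

```python
# Rules (R) from the example grammar.
RULES = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"]],
    "VP": [["V"]],
    "Det": [["the"]],
    "N": [["dog"], ["cat"]],
    "V": [["runs"], ["sleeps"]],
}


def leftmost_derivation(target, form=("S",)):
    """Return the sentential forms leading from form to target, or None."""
    # Scan for the leftmost non-terminal, checking terminals along the way.
    for i, symbol in enumerate(form):
        if symbol in RULES:
            break
        if i >= len(target) or symbol != target[i]:
            return None  # the terminal prefix already disagrees with the target
    else:
        # No non-terminal left: the derivation succeeded only on an exact match.
        return [form] if form == target else None
    # Try each alternative for the leftmost non-terminal, backtracking on failure.
    for alternative in RULES[symbol]:
        new_form = form[:i] + tuple(alternative) + form[i + 1:]
        rest = leftmost_derivation(target, new_form)
        if rest is not None:
            return [form] + rest
    return None


steps = leftmost_derivation(("the", "dog", "runs"))
for form in steps:
    print(" ".join(form))
```

Each printed line is one sentential form, and each step replaces exactly one non-terminal, so every intermediate form appears explicitly.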
Parse Tree
       S
     /   \
   NP     VP
  /  \     |
Det   N    V
 |    |    |
the  dog  runs
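The same tree can be built programmatically. Below is a small top-down (recursive descent) parser with backtracking, written as a sketch rather than a production parser: it represents a tree as nested tuples of the form `(label, child, ...)`, and the names `RULES`, `parse`, `parse_sentence`, and `show` are all illustrative. It would loop forever on a left-recursive grammar, which this example grammar is not.

```python
# Rules (R) from the example grammar.
RULES = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"]],
    "VP": [["V"]],
    "Det": [["the"]],
    "N": [["dog"], ["cat"]],
    "V": [["runs"], ["sleeps"]],
}


def parse(symbol, words, pos=0):
    """Yield (tree, next_pos) for each way symbol can match words[pos:]."""
    if symbol not in RULES:                      # terminal: must match the word
        if pos < len(words) and words[pos] == symbol:
            yield symbol, pos + 1
        return
    for alternative in RULES[symbol]:
        # Match the right-hand-side symbols one after another,
        # keeping every partial match that is still alive.
        partial = [([], pos)]
        for child in alternative:
            partial = [
                (kids + [tree], nxt)
                for kids, p in partial
                for tree, nxt in parse(child, words, p)
            ]
        for kids, end in partial:
            yield (symbol, *kids), end


def parse_sentence(sentence):
    """Return the first parse tree covering the whole sentence, or None."""
    words = sentence.split()
    for tree, end in parse("S", words):
        if end == len(words):
            return tree
    return None


def show(tree, depth=0):
    """Render a tree as indented text, one node per line."""
    if isinstance(tree, str):
        return "  " * depth + tree
    label, *children = tree
    lines = ["  " * depth + label]
    lines += [show(child, depth + 1) for child in children]
    return "\n".join(lines)


tree = parse_sentence("the dog runs")
print(show(tree))
```

For a string outside the language, such as "dog the runs", `parse_sentence` returns `None`, so the same function doubles as a simple syntax check.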
Applications of CFG in NLP
- Parsing sentences
- Syntax checking
- Machine translation (syntax-based models)
- Speech recognition (grammar-driven constraints)