Correct implementation of GMM-HMM for Speech Recognition
For N words, we have N HMMs. In each of the HMMs, the states are the GMMs. Each GMM models unique phoneme (with or without belonging to a particular context). It is these HMMs (words) that constitute nodes of an FSN (sentence). For each sentence, there is one FSN.