Proc. ECSA Workshop on Modeling Pronunciation Variation for Automatic Speech Recognition
Abstract: We argue for a surficial pronunciation model: a model without underlying forms. The surficial model outperforms a traditional generative model by a significant margin on conversational speech (Switchboard) as well as on read speech (TIMIT). Our results suggest that the true mapping from underlying forms to surface forms is too complex to be modeled accurately using current techniques, and that we would be best served by modeling the surface forms directly.