Research in automatic Speaker Identification (SID) for non-commercial applications has concentrated on telephone channel issues for several years, addressing problems such as recognizing a speaker on a cell phone using a model built from landline telephone data. While this focus has improved performance, it has neglected key problems such as far-field speaker recognition and SID in the presence of noise. When speech is collected by microphones more than a few inches away from the speaker, room acoustics and noise sources become important. In a room with lively acoustics, echo and multi-path propagation become key concerns.
Conventional SID techniques often fail because of mismatches between the data used for training and the data used for identification. In the far-field scenario, the identification data always come from a live microphone, but the training data might come from a landline telephone, a cell phone, or another live microphone recording. A more serious mismatch may occur if the system is trained on telephone data while the identification data are recorded live at a higher sampling rate. This project aims to study far-field effects on current state-of-the-art SID systems and to investigate strategies for improving SID performance in far-field scenarios.
For more information, please refer to http://penance.is.cs.cmu.edu/FarSID/