- Computer Science Laboratory

LIP6 1999/010

  • Habilitation «Indexation et interface Homme-Machine. Reconnaissance d'un signal vocal» (Indexing and human-machine interface. Recognition of a speech signal)
  • C. Montacié
  • 164 pages - 04/06/1999 - document at http://www.lip6.fr/lip6/reports/1999/lip6.1999.010.ps.gz (396 KB)
  • Contact: Claude.Montacie (at) lip6.fr
  • Former theme: APA
  • This work lies in the field of speech processing and of the methods and software architectures needed to process a speech signal. The research is based on the cooperation of complex statistical models to extract two a priori complementary kinds of information: the characteristics of the speaker and the content of the speech. The recent availability of test databases has greatly advanced research, but as a consequence the methods employed have become more sophisticated. These methods rely essentially on statistical modeling, because the large size of the training databases makes knowledge-based methods difficult to use. Knowledge then comes into play in the choice of a suitable statistical model. That choice depends on the speech or speaker recognition task: noise conditions, vocabulary size, number of speakers. The know-how of a speech processing team therefore often consists of a set of libraries, software modules, models, and processing techniques that make it possible to solve mostly new problems and that guide the necessary theoretical directions. This research paradigm has allowed us to develop speech technologies applicable to fields as varied as access control (Orphée), voice dictation (D-DAL), and multimedia data indexing for information retrieval in videos.