The Basque Association of Language Industries

Odyssey 2016: The Speaker and Language Recognition Workshop

Jun 13, 2016

Experts from around the world in the characterization and automatic identification of speakers and languages are meeting in Bilbao for four days to present the latest advances in the technology and discuss future developments.

Odyssey 2016: The Speaker and Language Recognition Workshop will be held in Bilbao from June 21 to June 24, 2016. Technical sessions will take place in the auditorium of the University of the Basque Country in Abandoibarra (Bizkaia Aretoa), gathering more than 100 researchers (mathematicians, physicists, engineers, computer scientists, linguists, etc.) from around the world, as well as industry representatives, forensic specialists and members of law enforcement and intelligence agencies from various countries. More than 60 papers and demonstrations (with 242 authors from 31 countries) will be presented at Odyssey 2016, covering the latest advances in the characterization, modeling, identification and segmentation of speakers and languages (including dialects and accents), in some cases working with speech signals recorded in very challenging conditions.

The production and storage of audiovisual content have grown dramatically in recent years as a result of the development of wideband networks and Internet services and the availability of affordable mobile recording devices. Processing such content may require modules that perform speaker and language identification. These modules operate as auxiliary tools within more general transcription, segmentation, labeling and indexing systems. Speech analysis (including speaker and language identification) is also performed in the forensic and surveillance fields. Finally, voice identification is used in biometrics, typically in combination with other techniques (such as fingerprint or iris identification), for authenticating access to buildings or electronic devices and for secure banking.

The development of fast, accurate, efficient and robust speaker and language recognition technology is of particular interest to government agencies responsible for security and surveillance (civil and military intelligence agencies), as well as to justice institutions, which have to process a growing volume of acoustic evidence in which the identity of the speaker is a key element. In recent years, government agencies from several countries have funded the creation of resources (databases, metrics, evaluation protocols, etc.) and have launched international evaluation campaigns in order to automate processes that until now were expensive, depended solely on the judgment of human experts and often did not reach the desired levels of precision. These investments have resulted in a remarkable improvement in the performance of the available technology.

Beyond the fields of security and speaker forensics, the development of automatic or semiautomatic methods for the segmentation and identification of speakers and languages may be of interest to public institutions such as parliaments and courts, and to broadcast (TV, radio) companies that need to transcribe and segment their content. It may also contribute to the more efficient production of language resources for minority languages, which large companies do not typically cater for. Finally, the extraordinary degree of maturity achieved in recent years by speech technologies, thanks to the introduction of deep neural networks (DNN), is making spoken interaction devices of all kinds possible. In this context, speaker and language verification are essential tasks both for authentication (access) and for optimizing the performance of many applications, such as speaker adaptation in an Automatic Speech Recognition (ASR) system.

Odyssey 2016 is the tenth edition of the event, following three pioneering editions in 1994, 1998 and 2001 and its redefinition as a biennial conference in 2004. The Speaker Odyssey (as it is known in the research community) is driven by the Special Interest Group on Speaker and Language Characterization (SIG-SpLC) of the International Speech Communication Association (ISCA). ISCA brings together most of the researchers working in the field and hosts the proceedings of past editions of Odyssey in a permanent repository known as the ISCA Archive. In addition, ISCA awards attendance scholarships to a number of students at each edition of the event.

Odyssey 2016 is organized by two Spanish groups that have actively participated in this event since 2006: the Software Technologies Working Group (GTTS) of the University of the Basque Country and the ViVoLab group of the University of Zaragoza, with Luis Javier Rodriguez Fuentes (GTTS) as Chair and Eduardo Lleida (ViVoLab) as co-Chair. The large representation of Spanish groups in this field of research is remarkable. In fact, Odyssey 2004 was organized in Toledo by the ATVS group of the Autonomous University of Madrid, from which Agnitio emerged as a spin-off. Agnitio is one of the leading companies in the Speaker ID industry, with major contracts worldwide and products of the highest quality. Besides those already mentioned, there are groups working on related subjects at the Polytechnic University of Catalonia, the Polytechnic University of Madrid and the University of Vigo.

Agnitio, Cirrus Logic (which recently acquired the Agnitio division oriented to consumer products) and Dialoga Systems are all sponsors of Odyssey 2016, which also relies on the substantial and crucial support of the University of the Basque Country and the Department of Education, Language Policy and Culture of the Basque Government. To complete the picture, Odyssey 2016 is also supported by the University of Zaragoza, the Thematic Network on Speech Technologies (RTTH), funded by the Ministry of Economy and Competitiveness (MINECO), and the city of Bilbao, through the Bilbao Tourism & Convention Bureau.

In the last 10 years, GTTS has made contributions in various fields related to Odyssey, with the support of several research projects funded by the University of the Basque Country, the Basque Government and MINECO, two doctoral theses submitted in 2010 and 2015 (focused on speaker and language recognition, respectively), the creation of language resources and the organization of several international evaluation campaigns. The group currently consists of four associate professors (with teaching duties at the Department of Electricity and Electronics, Faculty of Science and Technology, on the Leioa Campus of the University of the Basque Country) and a postdoctoral researcher. The four permanent members of GTTS hold a Physics degree and a PhD in Science, although their research activity (including their theses) has been carried out in the area of speech technologies, with contributions related to signal processing, computer science and machine learning. Group publications during the past 10 years include around 60 contributions to leading international journals and conferences in those areas. Group members also serve on the scientific committees of various conferences and journals. The choice of GTTS and ViVoLab to organize Odyssey 2016 recognizes the contributions made by these groups over the years.

Bizkaia Aretoa, Bilbao, June 21-24, 2016