The team led by João Magalhães, a NOVA LINCS researcher in the Multimodal Systems Group and Professor at the Department of Computer Science of NOVA School of Science and Technology (DI - FCT NOVA), took second place among the 10 international research groups selected by Amazon to participate in the Alexa Taskbot research challenge and push Multimodal Conversational AI beyond the current frontiers of knowledge.
The members of the team are Rafael Ferreira, Diogo Tavares, Diogo Silva, Frederico Vicente, Mariana Bonito, Gustavo Gonçalves, Rui Margarido, Paula Figueiredo, Helder Rodrigues, David Semedo (co-PI) and João Magalhães (PI).
Artificial personal assistants are quickly entering different sectors of today's society. Assistants such as Amazon Alexa, Apple Siri, Google Assistant and Microsoft Cortana help users with a wide range of tasks, including travel-related bookings, online shopping, information search, playing music and home control. The next generation of conversational assistants will be multimodal (that is, integrating video, images and sound, not just text), and their connection to the physical world will be a groundbreaking advance. Multimodal conversational assistants will be able to guide consumers through manual tasks, making use of recent advances in visual and language communication algorithms between humans and machines. This new level of sophistication demands new AI algorithms, task planning and a level of computational commonsense knowledge that is largely unexplored.
This is the challenge that Amazon proposed to universities all over the world in 2021. After almost one year of research work, the challenge has concluded and the three top-performing conversational AI systems have been selected. The NOVA LINCS / DI - FCT NOVA team created the second-best system among the ten universities selected worldwide to participate in this research challenge: six from the USA, three from Europe (two from the UK and one from Portugal) and one from Asia.
Task Wizard (TWIZ), the taskbot developed by the NOVA LINCS / DI - FCT NOVA team, aims to provide an engaging experience in which users are guided through multimodal conversations towards the successful completion of their selected task. TWIZ introduced an element of curiosity to the challenge by mentioning curious facts related to the tasks being performed, which increased customer engagement and the empathy of the interaction.