Motivation

Our social robotics research track focuses on improving human well-being through natural interfaces and artificial intelligence. Rather than adapting human behavior and society to technology, we adapt technology to natural human behavior, creating socially and cross-culturally intuitive interfaces. We aim to achieve this by researching and developing embodied humanoid robots and virtual avatars, which we collectively call humanoids.

Hanson Robotics, one of Osiris's cofounding firms, focuses on humanoid robots capable of interacting naturally with people. Osiris and Hanson are designing systems to nurture multiple species of robots as next-generation interfaces for delivering AI services and applications and for fostering the emergence of global artificial general intelligence.

AI drives these technologies in several key ways:

Deep learning, machine learning, and computer vision models enable auditory and visual understanding of human interactions. Accurate perception is at the root of all social interactions and shapes the quality and flow of interaction with artificial humanoids.

In this track, we emphasize perceiving social cues. We aim to advance the state of the art in multimodal emotion recognition, a field that has gained visibility and relevance in recent years, and we advocate for inclusive, cross-cultural research in both data collection and modeling. We are also interested in training machines to understand relationships between people: where and at whom they are looking, to whom they are talking, the eye contact they make, and their body posture cues. While many recent advances in audiovisual perception have come from deep learning, we believe we can contribute through our holistic approach to data collection across all aspects of human–humanoid interaction.

This track involves active research into not just perception but also the actions of our humanoids, such as speech synthesis, body gestures, and facial movements. We are developing methods of speech synthesis that are more emotionally expressive than the current state of the art and capable of a wide range of intonation styles. Because we work with humanoid agents, we also pursue data-driven modeling of facial expressions and facial expression mirroring.

We have built a dialogue and behavior engine within the GHOST framework (the General Holistic Organism Scripting Tool). It uses the OpenCog cognitive architecture to integrate our data-driven perception and expression models with the behavior of our humanoids. Our initial implementation resembles traditional rule-based approaches to dialogue; however, because all components share a knowledge representation and are tightly integrated within the AtomSpace even at this first stage of design and development, they can be coupled more closely than in traditional turn-based systems. Over the course of research and development, we aim to replace more and more of the rule-based aspects with higher-level algorithms developed within other Osiris research tracks, such as language learning, PLN, and ECAN.
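To make the rule-based starting point concrete, here is a minimal, hypothetical Python sketch of the kind of pattern–action matching a GHOST-style engine performs on a dialogue turn. The rule format, function names, and actions below are illustrative assumptions, not the actual OpenCog GHOST API; real GHOST rules are written in a ChatScript-like syntax and live in the AtomSpace alongside perception and expression data.

```python
import re

# Hypothetical pattern-action rules: each pairs a pattern over the
# user's utterance with a multimodal action (speech plus gesture),
# echoing how GHOST rules can drive both dialogue and behavior.
RULES = [
    (re.compile(r"\b(hello|hi)\b", re.I),
     {"say": "Hello! Nice to meet you.", "gesture": "smile"}),
    (re.compile(r"\bhow are you\b", re.I),
     {"say": "I am doing well, thank you.", "gesture": "nod"}),
]

def respond(utterance: str) -> dict:
    """Return the action of the first rule whose pattern matches."""
    for pattern, action in RULES:
        if pattern.search(utterance):
            return action
    # Fallback when no rule fires.
    return {"say": "Could you rephrase that?", "gesture": "tilt_head"}

if __name__ == "__main__":
    print(respond("Hi robot!"))  # {'say': 'Hello! Nice to meet you.', 'gesture': 'smile'}
```

In the actual system, the matching and the resulting actions are mediated by the shared AtomSpace representation rather than an isolated rule table, which is what allows perception, dialogue, and expression to be integrated more tightly than in a turn-based pipeline like this sketch.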
