FABG is an end-to-end imitation learning system that generates natural and fluid facial affective behaviors for human-robot interaction. The following video demonstrates the overall capabilities and key features of our system.
Our system demonstrates a range of interaction capabilities across multiple scenarios, including affective interaction, multi-person engagement, dynamic tracking, foveated attention, gesture recognition, and quick response.
The following video provides a detailed close-up view of our system's facial affective behavior generation capabilities, showcasing the fine details and natural expressions produced by our model.
This paper proposes FABG (Facial Affective Behavior Generation), an end-to-end imitation learning system for human-robot interaction that generates natural and fluid facial affective behaviors. Obtaining high-quality interaction demonstrations remains a challenge. To address it, we develop an immersive virtual reality (VR) demonstration system that lets operators perceive the environment stereoscopically: the operator's visual perception matches the robot's sensory input, and the operator's actions directly determine the robot's behaviors, as if the operator stood in for the robot during the interaction. We further propose a prediction-driven latency compensation strategy that reduces the robot's reaction delay and improves interaction fluency. FABG naturally acquires human interactive behaviors and intuition-driven subconscious motions, eliminating the need for manual behavior scripting. We deploy FABG on a real-world 25-degree-of-freedom (DoF) humanoid robot, collect demonstrations, train the policy, and validate its effectiveness on four fundamental interaction tasks: expression response, dynamic gaze, foveated attention, and gesture recognition.
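To make the latency compensation idea concrete, below is a minimal sketch of one common way such a strategy can work: the policy predicts a short chunk of future joint targets, and the controller executes the entry that corresponds to the moment the command will actually reach the motors rather than the moment of observation. The control rate, latency value, and function names here are illustrative assumptions, not the paper's exact implementation.

import numpy as np

CONTROL_PERIOD_S = 0.02      # assumed 50 Hz joint command rate (illustrative)
ESTIMATED_LATENCY_S = 0.10   # assumed capture + inference + actuation delay (illustrative)

def select_compensated_action(action_chunk: np.ndarray) -> np.ndarray:
    """Pick the predicted action that will be 'current' when it reaches the motors.

    action_chunk: (T, 25) array of future joint targets from the policy,
    where row t is the target t control periods ahead of the observation.
    """
    # Skip ahead by the number of control steps the latency consumes, so the
    # executed command matches the time of execution, not the time of observation.
    offset = int(round(ESTIMATED_LATENCY_S / CONTROL_PERIOD_S))
    offset = min(offset, len(action_chunk) - 1)  # stay inside the predicted horizon
    return action_chunk[offset]

# Usage (hypothetical): chunk = policy(observation); joint_target = select_compensated_action(chunk)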
@misc{zhang2025fabgendtoendimitation,
      title={FABG : End-to-end Imitation Learning for Embodied Affective Human-Robot Interaction},
      author={Yanghai Zhang and Changyi Liu and Keting Fu and Wenbin Zhou and Qingdu Li and Jianwei Zhang},
      year={2025},
      eprint={2503.01363},
      archivePrefix={arXiv},
      primaryClass={cs.RO},
      url={https://arxiv.org/abs/2503.01363},
}