China's first national standard for virtual digital humans provides unified technical requirements and evaluation criteria for the research, production, and application of such entities in customer service.
Customer service is one of the most significant application areas of digital human technology, and customer service virtual digital humans are now widely used across multiple sectors, including finance, government affairs, and education.
Issued on Oct 5, the standard, titled "Information technology—General technical requirements for customer service virtual digital human," establishes a comprehensive technical specification system.
It defines a reference framework for customer service virtual digital human systems, covering modules such as avatar generation and visual, speech, and emotional interaction, and sets clear requirements for digital humans of different types and application scenarios.
Regarding avatar generation, the standard stipulates that 2D digital human avatars must provide complete and clear facial feature details, while 3D hyper-realistic digital human models must have at least 200,000 polygons to ensure fine geometric detail.
For interactive functions, it requires a digital human to support multi-modal interaction, including voice, gesture, and body movement, and to possess operation and maintenance capabilities, such as keyword maintenance and corpus updates, to ensure continuous service optimization.
The standard specifies a lip-sync accuracy rate of no less than 90 percent, ensuring precise synchronization between the digital human's speech and lip movements.
The average success rate for gesture interaction and for body movement interaction is set at no less than 90 percent each, making body language communication more natural.
It also sets an emotional interaction success rate of no less than 80 percent, requiring the digital human to accurately recognize user emotions such as joy, sadness, and anxiety, and provide appropriate feedback through methods like expression generation and emotional speech synthesis.
By also requiring a speech interaction response time within two seconds and a semantic understanding accuracy rate of no less than 85 percent, the standard supports a shift towards more empathetic and context-aware interactions.
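The thresholds reported above can be gathered into a simple pass/fail check. The sketch below is only an illustration built from the figures in this article: the metric names, the `failed_checks` function, and the way rates are measured are all hypothetical, and the standard's own testing methods are still under development.

```python
from typing import Dict, List, Optional

# Minimum rates as reported in the article (fractions, not percent).
# The key names are hypothetical labels, not terms from the standard.
MIN_RATES = {
    "lip_sync_accuracy": 0.90,
    "gesture_success": 0.90,
    "body_movement_success": 0.90,
    "emotional_success": 0.80,
    "semantic_accuracy": 0.85,
}
MAX_SPEECH_RESPONSE_SECONDS = 2.0  # "within two seconds"
MIN_3D_POLYGONS = 200_000          # 3D hyper-realistic models only


def failed_checks(metrics: Dict[str, float],
                  polygon_count: Optional[int] = None) -> List[str]:
    """Return the names of the reported thresholds that are not met."""
    failures = [name for name, minimum in MIN_RATES.items()
                if metrics[name] < minimum]
    if metrics["speech_response_seconds"] > MAX_SPEECH_RESPONSE_SECONDS:
        failures.append("speech_response_seconds")
    # The polygon floor applies only when a 3D model is being assessed.
    if polygon_count is not None and polygon_count < MIN_3D_POLYGONS:
        failures.append("polygon_count")
    return failures


# Made-up measurements for a hypothetical product.
sample = {
    "lip_sync_accuracy": 0.93,
    "gesture_success": 0.91,
    "body_movement_success": 0.90,
    "emotional_success": 0.75,
    "semantic_accuracy": 0.88,
    "speech_response_seconds": 1.4,
}
print(failed_checks(sample, polygon_count=250_000))
```

In this made-up case only the emotional interaction rate falls short of the reported 80 percent floor, so it is the single item returned.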
The standard is applicable to the upgrade and transformation of existing 2D and 3D digital human products.
It also leaves room for the integrated application of new technologies, such as AI-generated content, allowing enterprises to make flexible choices based on their application scenarios.
Meanwhile, supporting testing methods are under development, which will provide enterprises with a unified testing benchmark to help them quickly identify issues and optimize their products.
"The standard serves as a threshold. Although it is recommended, we hope all digital human enterprises will meet its most basic requirements. When the quality of our digital human products improves further, we will also revise the standard in a timely manner to raise the requirements for corresponding technical indicators," said Sun Qifeng, director of the digital technology research center under the China Electronics Standardization Institute.
China issues first standard with unified metrics for customer service digital humans
