What Will the Visual Assistant of AVs Look Like?
January 2024 print edition / written by Han



Mercedes-Benz, which has been testing a ChatGPT-based voice assistant, announced the unveiling of its 'MBUX VIRTUAL ASSISTANT' at CES 2024, using the Unity engine to advance its 'Hey Mercedes' voice assistant. Meanwhile, the German project 'KARLI' has released research on how various AI avatars affect users in the era of autonomous driving. The central question of this research was which type of avatar, realistic, semi-realistic, or completely abstract, users would prefer.







In movies featuring artificial intelligence, protagonists frequently fall in love with entities described as 'more human than human.' Building intimacy between machines and humans requires a deep understanding of, and interest in, human nature, and such efforts are evident in the automotive industry. Companies like Stellantis and Mercedes-Benz are integrating AI technologies such as ChatGPT into their driver assistants while also giving them faces, much like service robots or NIO's NOMI.

As the year drew to a close, Mercedes-Benz, which had tested a ChatGPT voice assistant in North America over the summer, teased the 'MBUX VIRTUAL ASSISTANT' for CES 2024, promising a significant advancement built on the Unity engine. Concurrently, a German project drew attention by unveiling research on how various AI avatars might affect users in the era of autonomous driving.

Dr. Peter Rossger (CEO of beyond HMI), a member of the KARLI project, summarized: "We conducted user experience research to test the visual appearance of AI avatars inside vehicles. The key question was, 'Among realistic, semi-realistic, and completely abstract avatars, which one is preferred by users?'"




KARLI's goal is to develop adaptive and responsive AI features for future cars, capable of recording and interacting with the driver's state based on autonomous driving stages.



KARLI 

During everyday driving, how should a future AI driving assistant interact with us? What personality should the voice assistant adopt while the car navigates roads autonomously? And, importantly, how should the AI respond in unforeseen emergencies? Answers to these questions matter because AI designs must not only operate as programmed but also adapt to user demands and preferences.

The team, comprising studiokurbos, Continental, Ford, Audi, Allround Team, Stuttgart University, Stuttgart Media University, TWT GmbH Science & Innovation, Paragon Semvox, Fraunhofer IAO, Fraunhofer IOSB, and INVENSITY, carried out the KARLI project with support from the German Federal Ministry for Economic Affairs and Climate Action. Launched as a three-year project in the summer of 2021, KARLI supports the federal government's advanced-technology strategy in intelligent mobility. Its goal is to develop adaptive and responsive AI features that can record and interact with the driver's state across the stages of autonomous driving.

Driving situations, including autonomous driving stages, provide specific requirements for the driver's state, behavior, and corresponding abilities. Recording this information allows for realistic goal adjustments and appropriate approaches in conversations between humans and machines. Therefore, considerations extend beyond the use of advanced technology to include the desires and preferences of individuals inside the vehicle. Ultimately, the experiential and synthetic data collected and researched by KARLI may expand into big data usable in future mass-produced vehicles.

Dr. Rossger explained, "The KARLI project is a collaborative project supported by the German government, focusing on the basic research of AI utilization within vehicles. It includes three task packages: motion sickness, driver behavior level compliance, and three AI-based HMI tasks. Studiokurbos has generated HMIs for all three task packages."





The research team conducted user studies by immersing participants in simulated driving experiences using virtual reality (VR) settings in a design studio.



AI Avatar and Experimental Environment 

Part of the KARLI project focuses on understanding how various forms of AI avatars impact users. The research team conducted user studies by immersing participants in simulated driving experiences using virtual reality (VR) settings in a design studio. In this VR environment, passengers encountered various driving scenarios, from friendly greetings to handling potentially dangerous situations. The questions posed to participants in these scenarios became crucial factors in enhancing the driving experience.

"We created three versions of visual animations and added some voice. The speech output was the same and static across all three versions. The reason for this was to obtain maximum comparability between the three versions. So, there was no AI technology used in this research," said Dr. Peter Rossger.

The team built an immersive VR environment using custom-made chairs with a unique steering wheel and pedals. In this setup, they brought in 20 participants of diverse genders and age groups to a VR world created using Unity. Participants received visual data through VR headsets, providing them with an immersive and lifelike experience. The research team developed three types of AI avatar designs with significantly different appearances and movements, testing them in this setup.

"We showcased various versions of videos in a virtual environment created with Goodpatch Athena. Data was collected through rankings, Net Promoter Score, meCUE, and public interviews. There was no interaction between the test participants and avatars; only videos were present."

The 'Abstract' design, labeled as A, is a completely abstract avatar that dynamically changes itself based on the situation. For example, it shapes its mouth when speaking and transforms into an exclamation mark during risky moments. It exhibits movement during interactions and pauses to indicate its availability. Design B, named 'Robotic,' features a faceless, robot-like avatar with moving action lines to convey activity and availability. Design C, 'Humanoid,' is a humanoid avatar with a face image generated by a point cloud. This avatar can display different facial expressions and regular blinking during inactive periods.

The team designed three scenarios for each AI: ▶ Driver Greeting ▶ Encouraging the Driver to Retake Control ▶ Response to the Driver's Inability to Handle an Emergency.

During scenario evaluations, participants assessed various aspects, including clarity of understanding, fear, usefulness, visual appeal, social impression, and emotional connection. They responded to questions like 'How well did you understand the avatar?' or 'How creepy did it appear to you?' on a scale from 0 to 10.
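As a rough illustration of how such 0-to-10 ratings could be aggregated per avatar and per dimension, the sketch below uses entirely hypothetical numbers; only the avatar names come from the article, and the dimension labels are simplified:

```python
from statistics import mean

# Hypothetical 0-10 ratings per avatar and question dimension.
# Avatar names follow the article; all numbers are illustrative.
ratings = {
    "Abstract": {"understanding": [7, 8, 6], "creepiness": [2, 1, 3]},
    "Robotic":  {"understanding": [9, 8, 9], "creepiness": [3, 4, 2]},
    "Humanoid": {"understanding": [4, 3, 5], "creepiness": [8, 9, 7]},
}

# Average each dimension so the avatars can be compared side by side.
summary = {
    avatar: {dim: mean(scores) for dim, scores in dims.items()}
    for avatar, dims in ratings.items()
}

for avatar, dims in summary.items():
    print(avatar, dims)
```

Averaging per dimension rather than per avatar preserves the pattern the study reports: a design can score well on comprehension while scoring poorly on creepiness.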





The team created three versions of visual animations and added some voice.



Robotic Avatar 

Dr. Rossger stated, "Ultimately, none of the three versions were perfect, but we gained insights into improving the robotic version to garner more acceptance through user groups."

According to the research results, Design B, labeled 'Robotic,' emerged as the top choice in rankings, Net Promoter Score (measuring the likelihood of consumers recommending a product or service), and overall survey evaluations. Following closely were Design A, 'Abstract,' and Design C, 'Humanoid.' Younger participants, especially those with AI experience, showed a preference for the Robotic design. In contrast, Humanoid received predominantly positive responses from participants aged 40 and above.
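The Net Promoter Score mentioned above is conventionally derived from 0-to-10 responses. A minimal sketch of the standard calculation, using hypothetical response data rather than the study's:

```python
def nps(scores):
    """Net Promoter Score: share of promoters (9-10) minus share of
    detractors (0-6), in percentage points; 7-8 count as passives."""
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100.0 * (promoters - detractors) / len(scores)

# Hypothetical responses for one avatar design (not the study's data).
responses = [10, 9, 9, 8, 7, 6, 3, 10]
print(nps(responses))  # 4 promoters, 2 detractors out of 8 -> 25.0
```

Because passives cancel out, the score ranges from -100 (all detractors) to +100 (all promoters), which makes it a compact way to rank the three avatar designs.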

Interestingly, as participants' age increased, preference for Humanoid rose while preference for Abstract fell. Participants found the Robotic avatar the easiest to understand and the most visually appealing, whereas Humanoid was seen as hard to understand and visually unappealing. The Abstract avatar, though ranked lower overall, struck a satisfactory balance between being easy to understand and aesthetically pleasing.

'Abstract' was perceived as useful and not creepy; 'Robotic' was also considered useful, if slightly unsettling; 'Humanoid' was regarded as both extremely creepy and difficult to understand.

In summary, in this study, the 'Robotic' design was relatively easy to understand and visually appealing. In contrast, the 'Abstract' design was deemed most useful and less creepy but had lower preferences. The 'Humanoid' design was perceived as creepy and challenging to understand.

The team sees these user research results as a crucial step in creating more intelligent, adaptable, and responsive AI interactions for future vehicles. The future of driving depends on the fusion of technology and user-centric design to provide all passengers with a comfortable and safe driving experience. 

Dr. Rossger added, "We plan to publish these results in English at the HCII in Washington, DC, next June."









Meanwhile, in the teaser for the CES 2024 MBUX VIRTUAL ASSISTANT, a point cloud in the shape of the Mercedes star floats on the screen. These points could take abstract, robotic, or humanoid form, depending on the choice of Mercedes-Benz or, more importantly, the customer.

Nevertheless, what is certain is that in the late 2010s, when Mercedes-Benz was developing an intelligent virtual assistant, the company stated it would connect to everything, including Google Home and Alexa, anytime, anywhere. The assistant, it emphasized, would not be only about 'cars' but could respond in a lively way to all kinds of information and even make small talk. The goal was natural, human-like interaction for a highly personalized and intuitive in-car experience.





Dr. Peter Rossger commented on the MBUX VIRTUAL ASSISTANT: "I don't have any experience with it up to now, so I cannot give you a final impression. In general, driving is highly loaded with visual activities. Even on L3, the driver will need to take over within seconds. Perhaps Daimler has found a good solution delivering real value, but I need to use it to make a final judgment!"



AEM_Automotive Electronics Magazine


<Copyright (c) Smart & Company. Unauthorized reproduction and redistribution prohibited>

