Empathic AI Making Headway; Skeptics Abound
Emotion AI startups winning funding, despite concerns about bias and uncertainty in the results; auto manufacturers investing heavily in the tech to monitor driver behavior.
By John P. Desmond, Editor, AI in Business
A team of researchers in 2019 concluded that there is no evidence that emotional state can be predicted from the expression on a person’s face.
The Association for Psychological Science assembled five scientists from various sides of the debate to conduct “a systematic review of the evidence testing the common view” that emotion can be reliably determined by external facial movements, according to an account from the ACLU published in July 2019.
After reviewing over 1,000 scientific papers in the psychological literature, the experts came to a unanimous conclusion: there is no scientific support for the common assumption “that a person’s emotional state can be readily inferred from his or her facial movements,” according to Lisa Feldman Barrett, Professor of Psychology at Northeastern University and an expert on emotion.
The skepticism is not holding back the emotion AI market. Since then, emotion AI leader Affectiva was bought by Smart Eye for $73 million in June 2021, and venture capitalists are backing a number of emotion AI startups.
The emotion AI industry is projected to almost double in size from $19 billion in 2020 to $37.1 billion by 2026, according to Markets and Markets. Venture capitalists have invested tens of millions of dollars in companies including Affectiva, Realeyes and Hume, according to a recent account in VentureBeat.
Startup Hume AI was founded by Alan Cowen, a former Google scientist, who describes it as an “empathetic AI” company. The company maintains that it has developed datasets and models that enable customers to measure emotions from a person’s facial, vocal and verbal expressions.
“With new computational methods, I discovered a new world of subtle and complex emotional behaviors that nobody had documented before, and pretty soon I was publishing in the top journals. That’s when companies began reaching out,” Cowen stated in the VentureBeat account.
Hume, with 10 employees and $5 million in funding, uses “large, experimentally-controlled, culturally diverse” datasets from people spanning North America, Africa, Asia, and South America to train its emotion-recognizing models.
Cowen envisions that social media companies could use his technology to assess the mood of users. “Companies have been lacking objective measures of emotions, multifaceted and nuanced measures of people’s negative and positive experiences. And now they can’t say that anymore,” he stated in an account in The Washington Post.
Hume has collected more than 1.1 million images and videos of facial expressions from over 30,000 different people in the U.S., China, Venezuela, India, South Africa, and Ethiopia, according to the VentureBeat account. The company also has more than 900,000 audio recordings from over 25,000 people voicing their emotions, labeled with the speakers’ self-reported emotional experiences.
Hume has established The Hume Initiative, described as a nonprofit aimed at “charting an ethical path for empathetic AI.” Its ethics committee, which includes Taniya Mishra, the former director of AI at Affectiva, has released regulatory guidelines that Hume promises to abide by with its commercial products. The guidelines, according to VentureBeat, ban applications that employ manipulation and deception.
Affectiva Has 11.9 Million Face Videos, Says Rana el Kaliouby
The Affectiva dataset has 11.9 million face videos from people in 90 countries, adding up to some five billion facial frames, according to Rana el Kaliouby, Deputy CEO of Smart Eye, in an interview with AI in Business this week. “It’s a ton of data. We believe it’s still the largest and most diverse facial repository,” she said.
She acknowledged that some of the research challenging the validity of emotion AI is accurate, but suggested it targets an overly simple approach. “When I was working on my PhD 20 years ago, I steered away from this naive approach that if you are smiling you are happy or if you frown, you are sad. Unfortunately, some companies are taking this overly-simplistic approach,” she said.
Context is important. “In the context of driving a car, when we see people closing their eyes, it is safe to infer they are falling asleep at the wheel,” she said. “But you can’t make a blanket declaration about it.”
“Fixating on six basic emotions [happiness, sadness, fear, anger, surprise and disgust] is naive. Humans can express a much wider range of emotional states. My conclusion is that we should be more thoughtful about how we do the research,” she said.
For example, the “temporal signature,” or how much time goes by, needs to be considered within the context, as well as what a driver may be doing with his or her hands.
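To illustrate why the temporal signature matters, the minimal sketch below flags drowsiness only when eye closure is sustained or dominates a rolling window, and it weighs whether hands are on the wheel. The thresholds, inputs, and decision rule are invented for illustration; they are not Affectiva's or Smart Eye's actual logic.

```python
# Hypothetical sketch: inferring drowsiness from the temporal signature of eye
# closure rather than from any single frame. All thresholds are illustrative.

from collections import deque

class DrowsinessMonitor:
    def __init__(self, fps: int = 30, window_sec: float = 10.0,
                 perclos_threshold: float = 0.4, min_closure_sec: float = 1.5):
        self.fps = fps
        self.window = deque(maxlen=int(fps * window_sec))   # rolling per-frame eye state
        self.perclos_threshold = perclos_threshold           # fraction of window with eyes closed
        self.min_closure_frames = int(fps * min_closure_sec)
        self.current_closure = 0                              # consecutive closed-eye frames

    def update(self, eyes_closed: bool, hands_on_wheel: bool) -> bool:
        """Return True if the temporal pattern suggests the driver is falling asleep."""
        self.window.append(eyes_closed)
        self.current_closure = self.current_closure + 1 if eyes_closed else 0

        perclos = sum(self.window) / max(len(self.window), 1)
        sustained_closure = self.current_closure >= self.min_closure_frames

        # A single closed-eye frame (a blink) never triggers an alert; context such
        # as hands leaving the wheel raises confidence in the drowsiness inference.
        return sustained_closure or (perclos > self.perclos_threshold and not hands_on_wheel)

# Example: two seconds of closed eyes with hands off the wheel triggers an alert.
monitor = DrowsinessMonitor()
alert = False
for frame in range(60):
    alert = monitor.update(eyes_closed=True, hands_on_wheel=False)
print(alert)  # True
```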
Smart Eye had built its business with technology that tracks eye movements, with a focus on the automotive industry. The auto industry was also a target of Affectiva, which was making headway. The two companies started talking at the Consumer Electronics Show in 2020. “We were on a path to compete,” el Kaliouby said. “So we decided to join forces.”
The automotive focus for Affectiva was based on lessons learned. “My biggest takeaway over the last 12 years is that it’s really hard to build a general purpose AI emotion recognition solution,” she said. “The generic solution gets you 70 percent of the way there. But if you want to apply it to automotive, you need to do the extra 30 percent of work to customize it to the specific application.”
Affectiva has been working to combine its technology stack with that of Smart Eye. A demonstration was planned for CES 2022, but Affectiva pulled out of the event after a number of appointments were canceled. Private demos are being conducted. Smart Eye’s technology is in use in 500,000 vehicles worldwide. “Hopefully our technology will be added to that in the next few years,” she said.
Outside of automotive, el Kaliouby is interested in applying the company’s technology to support mental health evaluations. She pointed to Videra Health of Utah as fitting the profile; it offers a remote monitoring app that analyzes voice and facial expressions to assess whether the user is deviating from a baseline.
Asked what areas of research Affectiva/Smart Eye is pursuing, she mentioned “multimodality,” such as, in the automotive context, combining voice intonation and facial expression. “This has changed since we spun out of MIT,” el Kaliouby said. “When we spun out, it was almost impossible to combine signals. So we focused on the face, and other companies focused on the voice. Now, the idea of multimodality is very possible.”
To pursue this area, Smart Eye in October 2021 entered an agreement to acquire iMotions, a company offering multi-sensor data collection and analysis. The offer for the Copenhagen- and Boston-based company was for $46.6 million. “Their secret sauce is the ability to combine and triangulate all these signals,” el Kaliouby said.
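As a rough illustration of what combining signals can look like, the sketch below performs a simple weighted late fusion of per-modality emotion scores from a face model and a voice model. The emotion labels, weights, and score values are hypothetical and do not represent the companies' actual pipelines.

```python
# Hypothetical sketch of multimodality: late fusion of per-modality emotion scores
# (face and voice) into a single estimate. Labels and weights are illustrative only.

from typing import Dict

EMOTIONS = ["frustration", "drowsiness", "neutral"]

def fuse_modalities(face_scores: Dict[str, float],
                    voice_scores: Dict[str, float],
                    face_weight: float = 0.6) -> Dict[str, float]:
    """Weighted late fusion: combine independent per-modality scores, then renormalize."""
    voice_weight = 1.0 - face_weight
    fused = {e: face_weight * face_scores.get(e, 0.0) +
                voice_weight * voice_scores.get(e, 0.0)
             for e in EMOTIONS}
    total = sum(fused.values()) or 1.0
    return {e: score / total for e, score in fused.items()}

# Example: the face signal alone is ambiguous, but the voice signal tips the estimate.
face = {"frustration": 0.4, "drowsiness": 0.1, "neutral": 0.5}
voice = {"frustration": 0.7, "drowsiness": 0.1, "neutral": 0.2}
print(fuse_modalities(face, voice))
```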
Certain areas of emotion AI research, namely security and surveillance, Affectiva will not pursue. “The technology is not there yet to make sure you do it right,” she said, noting that the company has turned down business in the area.
Read the source articles and information from the ACLU, in VentureBeat and in The Washington Post.