Hosted by IDTechEx
Artificial Intelligence Research
Posted on August 13, 2018

Speech emotion recognition technology

audEERING develops next-generation, intelligent audio analysis algorithms and speech emotion recognition technology. With the award-winning open-source speech and emotion analysis framework openSMILE at its core, the company builds world-leading proprietary solutions for intelligent speech, music, and sound analysis. Listed as one of eight Competitive Profiles in the Gartner, Inc. report "Competitive Landscape Emotion AI Technologies, Worldwide," audEERING is a relevant provider of next-generation artificial intelligence based on affective computing. The company's product portfolio comprises software systems for automatic emotion and speaker state recognition from speech signals, as well as methods for music signal analysis.
 
Audiary
When undergoing medical treatment, especially in the case of psychological disorders, patients are often asked by their doctors to keep a diary. In most cases, those diaries are still paper notebooks in which the patients manually note down their state of health. One of the goals at audEERING is to revolutionise this medical procedure by bringing it into the digital era. This is how Audiary was born - the audio diary. The main goal is to make the daily routine easier for patients by sparing them from taking notes by hand. Physicians also benefit from this new technology, which makes analysing the data much easier. Audiary enables patients to concentrate completely on what they feel. All they need to do is tell the app about their state of health, just as they would in a personal talk with their doctor. Audiary does the rest. What makes this application special in comparison to other speech recording tools is the use of audEERING's sensAI technology, which offers a complete analysis of the user's emotional state.
 
 
Callyser
The market for call centres is growing steadily, since almost every company nowadays offers its customers some kind of service desk. Online shops, mobile phone providers, banks or insurance companies - they all offer and rely on instant telephone support for their customers. But it is not only the number of call centres that is growing - so is the frustration on both ends of the line. Angry customers complain about swamped call centre agents and vice versa.
 
The main reasons for customer complaints are bad service and a lack of competence on the part of the call centre agents. Several factors lie behind these problems: a high labour turnover rate and the agents' low level of decision-making power both play an important role. As a result, the agents are often unable to offer appropriate solutions to their callers.
 
This is exactly where Callyser, a call centre speech analysis software developed by audEERING, comes into play. Callyser is a tool capable of automatically analysing voice recordings on multiple levels. It reports more than the formal parameters of the analysed audio material, such as the duration of a conversation or each speaker's share of a dialogue. What makes this tool special is its ability to detect the speakers' mood as well as the dominant atmosphere during the phone call, which is provided by audEERING's sensAI technology. Thus, Callyser can detect and prevent escalations before they happen. When a call centre agent loses control of a situation, a more experienced call centre agent or their boss is alerted and can take over and calm the situation.
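To make the "formal parameters" mentioned above more concrete, here is a minimal Python sketch of how call duration, speaker shares and a simple escalation alert could be computed. It is not audEERING's implementation or API: the function names `talk_shares` and `needs_escalation`, the segment format, the arousal scores and the threshold values are all assumptions made for illustration, and the sketch assumes that speaker segments and per-segment arousal scores have already been produced by some upstream analysis.

```python
from collections import defaultdict


def talk_shares(segments):
    """Return (call_duration_sec, {speaker: share_of_talk_time}).

    `segments` is a hypothetical list of (speaker, start_sec, end_sec)
    tuples, e.g. the output of a speaker diarisation step.
    """
    if not segments:
        return 0.0, {}

    # Sum the speaking time of each speaker.
    talk_time = defaultdict(float)
    for speaker, start, end in segments:
        talk_time[speaker] += end - start

    # Call duration: from the first segment start to the last segment end.
    first_start = min(start for _, start, _ in segments)
    last_end = max(end for _, _, end in segments)
    call_duration = last_end - first_start

    total_talk = sum(talk_time.values())
    if total_talk == 0:
        return call_duration, {}

    shares = {speaker: t / total_talk for speaker, t in talk_time.items()}
    return call_duration, shares


def needs_escalation(arousal_scores, threshold=0.8, min_consecutive=3):
    """Flag a call when arousal stays high for several segments in a row.

    `arousal_scores` are assumed to lie in [0, 1]; the threshold and the
    run length are illustrative values, not tuned parameters.
    """
    run = 0
    for score in arousal_scores:
        run = run + 1 if score >= threshold else 0
        if run >= min_consecutive:
            return True
    return False


# Example call: the agent speaks for 20 s in total, the customer for 30 s.
segments = [
    ("agent", 0.0, 10.0),
    ("customer", 10.0, 40.0),
    ("agent", 40.0, 50.0),
]
duration, shares = talk_shares(segments)
print(duration, shares)
print(needs_escalation([0.2, 0.9, 0.85, 0.9]))
```

Requiring several consecutive high-arousal segments, rather than a single one, is one simple way to avoid alerting a supervisor because of a brief outburst.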
 
 
sensAI-Music
Music plays an indelible role in the history of mankind. It is our daily companion, with whom we share the happiest as well as the saddest moments. It is almost impossible to overestimate the role that music plays in our lives. At home, in our cars, at work or in a club at night - we listen to music everywhere. And some people not only listen to it, but also try their hand at the turntables themselves.
 
Music is constantly changing. New genres evolve, new albums are published nearly every day, new stars are born. How can anyone keep track of what's happening? Music aficionados at audEERING have set themselves the ambitious goal of combining music editing software and applications for deejays with intelligent technology that, on a technical level, outclasses existing solutions.
 
The result is sensAI-Music - an innovative software solution that automatically detects various elements of a music track, such as tempo, meter, tune or vocals, and calculates - based on a large number of musical parameters - the genre as well as the emotional setting of the track. sensAI-Music makes planning set lists and dealing with large music databases easier for deejays and music editors. Furthermore, lighting effects can easily be synchronised with music tracks, videos and effects can be trimmed and synchronised automatically to a song, and avatars and robots can be animated in sync with the music.
 
 
"Different nuances can have many different meanings. This is also the crux in intelligent man-machine communication that this 'how' was neglected for a very long time in favor of the 'what'," says Dagmar Schuller, CEO and co-founder of audEERING. "Of course it was important to recognize the what first, but the way one speaks says much more, whether I mean something sarcastic or ironic, whether I sound happy, whether I sound sad or negative or positive."
 
"Many neurocognitive diseases manifest themselves first in language at an early stage; how to articulate, how to speak and also how to behave," explains Dagmar Schuller. "In the long-term perspective, there are classic indicators such as uncontrolled emotional outbursts in Alzheimer's patients. If these people are suddenly particularly aroused or particularly unhappy and you don't really know why, because it has been taken out of context, this is a classic Alzheimer's indicator that you can recognize."
 
In addition to emotional changes, language also plays a decisive role in early detection - and that's where audEERING comes into play. "What you can tell from the language or the voice is that the rhythm of speech changes. That the tonality changes and sometimes also the semantics if you go in the direction of understanding. Words are inserted that are completely taken out of context because you can no longer remember the correct word." And here, too, the how is more important than the what. "Even if you pronounce certain vowels differently, this might be an indication. In Parkinson's disease, for example, a classic test is to let patients say 'A', then one examines this different vowel tonality in order to get an early indication."
 
 
Source and top image: audEERING