Researchers at Kaunas University of Technology (KTU) have pioneered an artificial intelligence model that diagnoses depression with 97.53% accuracy by analyzing speech and brain neural activity, marking a significant breakthrough in mental health diagnostics.
Depression affects approximately 280 million people globally, and diagnosing it accurately has always been a significant challenge. Researchers at Kaunas University of Technology (KTU) have made a breakthrough in this field by developing an artificial intelligence (AI) model that can identify depression with exceptional precision by analyzing both speech patterns and brain neural activity.
“Depression is one of the most common mental disorders, with devastating consequences for both the individual and society, so we are developing a new, more objective diagnostic method that could become accessible to everyone in the future,” co-author Rytis Maskeliūnas, a professor in the Department of Multimedia Engineering at KTU, said in a news release.
Multimodal Approach Enhances Diagnostic Accuracy
Published in the journal Brain Sciences, the study takes a multimodal approach that integrates two types of data: speech and electrical brain activity, measured by electroencephalography (EEG).
The researchers argue that while most traditional diagnostic methods rely on one type of data, this dual approach offers a more comprehensive understanding of a person’s emotional state.
This combined analysis achieved an impressive 97.53% accuracy in diagnosing depression, a significant improvement over existing methods.
“This is because the voice adds data to the study that we cannot yet extract from the brain,” added Maskeliūnas.
Voice and Brain Data: A Potent Diagnostic Duo
The choice of data sources was carefully considered, according to Musyyab Yousufi, a doctoral student who contributed to the project. He noted that while facial expressions can offer some insight into a person’s psychological state, they are easy to manipulate.
“We chose voice because it can subtly reveal an emotional state, with the diagnosis affecting the pace of speech, intonation and overall energy,” Yousufi said in the news release.
Patients’ privacy was another critical consideration. Traditional methods like facial recognition can intrude on privacy, whereas speech and EEG offer less invasive but equally informative data.
“[C]ollecting and combining data from several sources is more promising for further use,” added Maskeliūnas.
The Path Forward: Enhancing AI Transparency and Understanding
The KTU research team drew its EEG data from the Multimodal Open Dataset for Mental Disorder Analysis (MODMA). The EEG recordings were collected in a controlled setting, with participants at rest and their eyes closed for five minutes. Separately, the participants’ natural speech was recorded during a question-and-answer session and while they described pictures.
To process the data, the researchers transformed the signals into spectrograms, visual representations of how a signal’s frequency content changes over time. Advanced noise filters and a modified DenseNet-121 deep-learning model were then used to identify indicators of depression in these images.
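For readers curious what turning a signal into a spectrogram involves, the following is a minimal Python sketch using NumPy. It is not the KTU team's code. The function name `log_spectrogram`, the frame length, the hop size, the Hann window and the 250 Hz sample rate are all illustrative assumptions, and the input is a synthetic 10 Hz tone rather than MODMA data. The noise filtering and the DenseNet-121 classifier mentioned in the study are not shown.

```python
import numpy as np


def log_spectrogram(signal, sample_rate, frame_len=256, hop=128):
    """Return (freqs_hz, times_s, log-power spectrogram in dB).

    Hypothetical helper for illustration; all parameters are assumptions.
    """
    signal = np.asarray(signal, dtype=float)
    if signal.size < frame_len:
        raise ValueError("signal is shorter than one frame")
    window = np.hanning(frame_len)
    n_frames = 1 + (signal.size - frame_len) // hop
    # Slice the signal into overlapping frames and apply the window to each.
    frames = np.stack([
        signal[i * hop: i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # Power spectrum of each frame; rfft keeps only non-negative frequencies.
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # Convert to decibels; the small constant avoids log(0).
    log_power = 10.0 * np.log10(power + 1e-12)
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    times = (np.arange(n_frames) * hop + frame_len / 2) / sample_rate
    # Transpose so rows are frequency bins and columns are time frames.
    return freqs, times, log_power.T


# Synthetic stand-in for a recording: a 10 Hz tone, 5 seconds at 250 Hz.
sample_rate = 250
t = np.arange(5 * sample_rate) / sample_rate
demo_signal = np.sin(2 * np.pi * 10 * t)

freqs, times, spec = log_spectrogram(demo_signal, sample_rate)
# Frequency bin with the highest average energy across all time frames.
peak_hz = freqs[spec.mean(axis=1).argmax()]
```

The resulting two-dimensional array can be rendered as an image, with time on one axis and frequency on the other, which is the kind of input an image classifier can work with. In this toy example the brightest row sits near 10 Hz, the frequency of the test tone.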
Moving forward, this AI model could make depression diagnosis quicker and more accessible, potentially facilitating remote evaluations and reducing subjective biases. However, challenges remain.
“The main problem with these studies is the lack of data because people tend to remain private about their mental health matters,” Maskeliūnas explained.
A significant future task for the researchers is enhancing the algorithm’s ability to explain its diagnostic process clearly.
“The algorithm still has to learn how to explain the diagnosis in a comprehensible way,” Maskeliūnas added.
The Broader Implications: Explainable AI in Health Care
As AI solutions gain traction in sensitive fields like health care, finance and law, the demand for explainable artificial intelligence (XAI) is increasing. XAI aims to make AI’s decision-making process transparent, thereby building trust and ensuring that these systems can be reliably integrated into critical areas.
With this development, KTU opens a promising avenue towards more accurate, objective and understandable diagnoses of depression, potentially revolutionizing the way mental health issues are identified and treated.