Interview: How voice AI will change neuro diagnosis

By Andrew Mernin

Published On: 3 May 2023

The voice AI platform Canary Speech has achieved an accuracy of 96 per cent in recognising early signs of Huntington’s disease. NR Times spoke to CEO and co-founder Henry O’Connell about its potential role in neuro-rehab.

Speech is the most complex motor function in the human body and a key indicator of disease progression in patients with neurological conditions.

As a neurological disease impacts the muscular system, it leads to changes in a patient’s voice and language.

These changes occur gradually, beginning as subtle clues before developing into more noticeable symptoms as the disease progresses.

Parkinson’s patients, for example, might slur words, mumble or struggle with articulation, while people with Alzheimer’s may begin to forget words as their condition deteriorates.

The minute changes to speech that begin in the early stages of neurological diseases are difficult to spot. But, as a new era of AI and natural language processing (NLP) technology emerges, new platforms are showing promise for identifying dysarthria at a much earlier stage.

One of the leaders in this field is Canary Speech, a voice biomarker platform that has already had success in screening for mood, stress, and energy levels using a single 20-second audio clip.

Now, the company is turning its attention to neurological diseases as it seeks to use the same technology to detect the first signs of Parkinson’s, Huntington’s and Alzheimer’s disease.

“[With] a neurological disease, the complexity of speech is such that if all of the pistons aren’t firing, then speech is going to be impaired in some way,” Henry O’Connell, CEO and co-founder of Canary Speech, told NR Times.

“That impairment may be difficult for us to hear, but a trained model, which may eliminate other chatter and noise, may pick that up very early, so you may see the manifestation of a disease occurring with accuracy and earlier.”

Canary Speech was founded by O’Connell and Jeff Adams in 2016.

The pair met 38 years ago while O’Connell was researching neurological diseases at the National Institutes of Health and Adams was working at another institution, building mathematical models to decrypt spy messages.

Adams went on to develop the very first natural language processing products, such as Dragon NaturallySpeaking and other specialised language models for the medical and legal sectors.

He later developed the speech recognition technology behind Amazon Alexa.

The Canary Speech technology was first developed for mental health conditions after Adams and O’Connell had an “aha moment” seven years ago.

The realisation was simple. When people speak, their emotions come across not in what they say, but in how they say it.

This ‘sense’ we get for how people are feeling has little to do with the words coming out of their mouths, O’Connell explained, but instead comes from our brain’s complex analysis of subtle cues.

O’Connell said: “Clinical people do this all the time. How could we create a tool, we thought, that could augment that and could assist them?”

Advancements in deep learning, machine learning and super-fast processing speeds allow Canary Speech to measure 12 to 15 million data points every minute. In other words, it analyses speech in almost real time.

Rather than identifying words, the platform seeks to identify human conditions, including anxiety, depression and stress.

This week, the Beth Israel Deaconess Medical Centre at Harvard Medical School presented a study that used Canary Speech technology to identify speech biomarkers in Huntington’s Disease.

The aim is to understand and identify the inflection point at which a person transitions into early-stage manifest disease, when symptoms are, in most cases, too subtle to spot.

The authors of the paper concluded that the platform had “significant potential as an efficient and sensitive tool”.

Researchers determined that the platform’s models were 96 per cent accurate in identifying the transition from pre-manifest to manifest HD. Comparing this AI model with neurologist evaluations, researchers found that the platform was 40 per cent more accurate than a neurologist’s diagnosis.

In clinical practice, a neurologist records the voice of their patient via a tablet fitted with the Canary Speech app.

This can be done in person or over Microsoft Teams, which the app is integrated with.

In real time, the clinician is given a toolbar indicating levels of stress, depression, anxiety or, in the case of neurological conditions, dysarthria.

“All of [this] makes the person a better clinician because they have better data,” O’Connell said.

“Their experience, the raw intelligence, everything else becomes augmented by having more accurate data. AI is part of this, of course, but AI is an extension of natural capabilities, experience and talent of individuals.”

Speech as a biomarker is gaining increasing traction in the research space, but in O’Connell’s opinion, analysis based on language alone should not technically be classed as a biomarker and the word is used too liberally when speaking about NLP.

While a biomarker like eye colour is the same no matter where you live in the world, analysis based on language is not absolute. This is because an algorithm based on English words will not necessarily translate to other languages.

On the other hand, sub-language characteristics, like those analysed by Canary Speech, are universal.

The motor functions that create language, such as the power of respiration, contraction of vocal cords and the speed of speech, are governed by elements that are common across populations.

“Our algorithms for anxiety and depression in English are nearly identical for anxiety and depression in Japanese,” O’Connell said. “The transition from an English-based programme to a Japanese-based programme was simple.

“We collected 500 individual samples, validated it against what we were doing, correlated it with specific noise characteristics in the Japanese telephone system versus English, and we had virtually identical accuracies.

“That’s a biomarker, but the word I speak is not. A biomarker should transcend population.”

Generally, language data has been collected from patients by having them read a script, but O’Connell and his team realised that they could gather more useful information by asking people questions, such as ‘How do you make a sandwich?’ or ‘Tell me what you see in this picture’.

“We use conversational engagement and that’s really critical,” O’Connell explained.

“We also collected read speech, but read speech is not as rich in information because it uses different elements of the brain. We’re reading, we’re not actually thinking about what we’re saying.

“The industry for many years gave people a script and said ‘read this’ and they built a history of gathering that and also a history of not getting very good accuracy on determining or predicting anything.”

The company’s platform for mental health is already in use as a triaging tool in hospitals and medical centres worldwide, including national institutes in Japan, Tallaght Hospital in Dublin, Ulster University and Hackensack Meridian hospitals in New Jersey.

The company is currently working towards FDA approval for the platform’s use in assessing anxiety and depression. Huntington’s Disease will be the next area of focus for federal approval.

In Japan, the healthcare service carries out 30 million annual health checks to assess people for anxiety, depression, mild cognitive impairment, fatigue and Alzheimer’s disease. Each call takes around 30 minutes to complete.

A test programme carried out by Canary Speech involving 5000 individuals showed that the platform could take each participant through all question sets in just 40 seconds, reducing the call to just 12 minutes.

Voice AI technology has the potential to revolutionise the way neurological conditions are diagnosed and measured.

Its impact may also extend to the research space thanks to its high level of accuracy.

O’Connell predicts that Canary Speech could have a “significant impact” in the development of new treatments in the coming years.

“There have been periods of time where we have had surges of technology,” O’Connell said.

“No one knows without the perspective of looking backwards, but I believe we’ve entered one.

“I believe, as I always do, that AI and tools like this are going to make us better.

“It’s going to elevate us to that next level of accuracy and treatment. I believe we’re going to look back and say to ourselves, maybe this was the most exciting 10 years we’ve ever seen.”
