How can we make speech recognition technology work for everyone, including those with the most significant speech impairments?

In our new blog series looking at researching assistive technology, Geena Vabulas (a popular contributor to Assistive Technology Network events) explains why she’s focused on working in a user-centred way to ensure everyone can harness the power of speech recognition. Interested in being a participant in this important project? Find details at the end of this post on how to get involved.

About me

Hi, I’m Geena Vabulas, a disability and technology researcher with the Karten Network. I’m interested in lots of things – policy, neurodiversity, and most especially, getting more disabled people directly involved in technology research and development. In my current role on the Nuvoic Project, I work with disabled people in person and remotely.

About the project

Speech recognition technology is all around us – Microsoft, Google, Amazon, and Apple all offer versions of automatic speech recognition that allow people to control devices with their voice. This tech represents a very exciting opportunity for people who struggle to do certain tasks physically, like controlling their TV or closing their blinds. Voice technology can also be used to support verbal and written communication. Examples include speech-to-text dictation for those who struggle with spelling or typing, and live captioning that helps others understand a speaker.

Advances in mainstream speech recognition technologies are being driven by the power of machine learning algorithms and massive data sets. As large tech companies collect voice data from billions of users around the world, this information is fed into algorithms which are then optimised for those voices. This has led to huge leaps in the quality of speech recognition technology, a reduction in cost, and an increase in its availability in mainstream tech like smartphones.

Unfortunately, speech recognition technology usually doesn’t work for people with significant speech impairments. Algorithms are not optimised for nonstandard or dysarthric voices because they aren’t built using data sets of these minority voices.

The Nuvoic Project, led by developers Voiceitt, is all about making the benefits of voice technology accessible to people with atypical speech.

Methodology

The role of the Karten Network in the project is to lead on user research – evaluating the performance and usability of the technology through ‘real world’ testing with disabled people across the UK and Ireland.

We ask participants to install and use the Voiceitt app over a period of three to four months and provide feedback on various aspects of their experience, through structured questions and through regular informal discussions.

We have engaged in an iterative design process with participants and the tech developers. This means we use ongoing participant feedback to prioritise changes to the technology, which are then tested with the users again and continually refined.

To supplement our qualitative findings, we also have quantitative success measures. Examples include the average number of repetitions required to train a new phrase, and the number of weekly active users.
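As a rough illustration of how these two measures could be calculated, here is a minimal Python sketch. The data, the names `repetitions_per_phrase` and `usage_log`, and the choice of ISO calendar weeks are all assumptions made for the example; they are not taken from the Voiceitt app or from the project's actual analysis.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical training records: repetitions needed before each phrase was usable.
repetitions_per_phrase = {"lights on": 12, "open blinds": 9, "call mum": 10}

# Hypothetical usage log: (participant id, day the app was used).
usage_log = [
    ("u1", date(2022, 3, 7)),
    ("u2", date(2022, 3, 8)),
    ("u1", date(2022, 3, 9)),
    ("u1", date(2022, 3, 15)),
]


def average_repetitions(reps):
    """Average number of repetitions required to train a phrase."""
    return mean(reps.values())


def weekly_active_users(log):
    """Count distinct users per ISO week, keyed by (ISO year, ISO week number)."""
    users_by_week = defaultdict(set)
    for user_id, day in log:
        iso = day.isocalendar()
        # A set ensures each user is counted once per week, however often they used the app.
        users_by_week[(iso[0], iso[1])].add(user_id)
    return {week: len(users) for week, users in sorted(users_by_week.items())}


print(average_repetitions(repetitions_per_phrase))
print(weekly_active_users(usage_log))
```

Counting distinct users per week, rather than total sessions, keeps one very active participant from inflating the figure.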

Project progress

Lots of people are already using the Voiceitt app to help them use their own voice to communicate or control Smart Home technology more independently. We wanted to find out about the different ways people chose to use Voiceitt, how they found the user interface and functionality, and any problems or ideas for what could be improved.

Some improvements resulting from participant feedback so far include:

Reducing training repetitions from ~40 to ~10 per phrase
Adding gamification and rewards to make the app more fun to use, which is especially helpful for users who struggle to stay focused or motivated
Enabling hands-free and switch access for users who cannot control a device with their hands

Next steps

Currently, users must teach the technology in advance to recognise every phrase they want to use. One of the most common requests we have heard from participants is for a version that works without having to train each phrase separately. We call this continuous speech recognition.

People with non-standard speech have been contributing recordings via Voiceitt’s online platform called Ensemble. Using these recordings, Voiceitt have built and started testing a version of continuous speech recognition. While this is very exciting progress, we need more recordings to make the tech even better.

Get involved

To be successful, we need more voice recordings, and we need to work with lots of different people, such as:

Individuals with nonstandard/impaired speech
Education staff who support students with nonstandard speech
Professionals who work with dysarthric people, such as speech and language therapists, occupational therapists, assistive technologists, and care workers

The team at Karten Network can provide ongoing support to participants, and we offer gift vouchers of up to £100 as a thank you.

Contact the team and one of us will get back to you.
