Voice assistants are the latest advance in consumer electronics to make their way into people’s lives. These devices demonstrate the impressive development and capability of artificial intelligence and present a tangible contrast to the depictions of this technology in iconic films. With tech behemoths such as Amazon and Apple now having their own voice assistants, the odds are very strong that these devices are here to stay and will become more prominent in daily life.

This “explainer” article looks at the future of voice assistants. It explains what voice assistants are, how they work, how they have evolved and how they might progress in future.

What are voice assistants?

A voice assistant is a piece of voice recognition software that is used through electronic devices such as smartphones and speakers and that can communicate audibly and naturally with an end user. Users can give the voice assistant commands and ask it questions, and it can perform the tasks or services requested.

The market leading and most well-known voice assistants include Amazon’s Alexa, Apple’s Siri, the Google Assistant and Microsoft’s Cortana.

Following improvements in artificial intelligence (“AI”) technologies, voice assistants can now display relatively high accuracy in their functionality and are consequently becoming more and more popular among consumers.

How does a voice assistant work?

A voice assistant begins to work once activated by a user. It can be activated with a ‘wake’ word: for example, users of Amazon’s Alexa can call out “Alexa” and Apple users can call out “Hey Siri” to activate the respective voice assistants.

When a user then says something, the voice assistant works to convert this into ‘actionable data’. At this initial stage, any speech from the user is converted into text.

Then comes the complex task of syntax and semantic processing, which means that the voice assistant works to understand the meaning of the converted text, looking at sentence structure, grammar, contextual information and the meaning of words.

Once the meaning of the received data is understood, the voice assistant retrieves information from the Internet, a cloud platform or an application to answer the question as appropriately as possible, generating a sentence in text to convey this answer.

The final step is to convert this text to speech through audio for the user.
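The steps above can be sketched as a small pipeline. This is only an illustrative toy, not how any real assistant is built: every function name, the keyword matching and the FAKE_WEATHER table are assumptions made for the example, the "audio" is simply text encoded as bytes, and the wake word is checked on the transcript for simplicity.

```python
from dataclasses import dataclass, field
from typing import Optional

WAKE_WORD = "alexa"  # assumption: any wake word could be used here

# Hypothetical data source standing in for the Internet, a cloud platform or an app.
FAKE_WEATHER = {"london": "rainy", "madrid": "sunny"}


@dataclass
class Intent:
    """What the assistant believes the user wants."""
    name: str
    slots: dict = field(default_factory=dict)


def speech_to_text(audio: bytes) -> str:
    """Stand-in for a speech recogniser: the 'audio' is just UTF-8 text."""
    return audio.decode("utf-8")


def heard_wake_word(transcript: str) -> bool:
    """Activate only if the utterance starts with the wake word."""
    return transcript.lower().startswith(WAKE_WORD)


def understand(text: str) -> Intent:
    """Toy keyword matching in place of real syntactic and semantic processing."""
    words = text.lower().replace("?", "").split()
    if "weather" in words:
        city = words[-1] if "in" in words else "here"
        return Intent("get_weather", {"city": city})
    return Intent("unknown")


def fulfil(intent: Intent) -> str:
    """Retrieve the information and phrase the answer as a sentence of text."""
    if intent.name == "get_weather":
        city = intent.slots["city"]
        forecast = FAKE_WEATHER.get(city)
        if forecast is None:
            return f"Sorry, I have no forecast for {city}."
        return f"It is {forecast} in {city.title()}."
    return "Sorry, I did not understand that."


def text_to_speech(text: str) -> bytes:
    """Stand-in for a speech synthesiser."""
    return text.encode("utf-8")


def handle(audio: bytes) -> Optional[bytes]:
    """Run the whole pipeline; return None if the wake word was not heard."""
    transcript = speech_to_text(audio)
    if not heard_wake_word(transcript):
        return None
    command = transcript[len(WAKE_WORD):].strip(" ,")
    return text_to_speech(fulfil(understand(command)))
```

Calling handle(b"Alexa, what is the weather in London?") walks through each stage in turn, while an utterance without the wake word is ignored.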

Do voice assistants use AI?

Improvements in voice assistants aim to make an assistant that solves problems rather than introducing frustrations. These improvements come from the continued introduction and refinement of machine learning techniques. Machine learning is a subset of artificial intelligence; it gives systems such as those behind voice assistants the ability to learn and improve automatically from experience without being explicitly programmed to do so.

As explained above, voice assistants do use voice recognition to convert users’ speech to text and then convert the answer from text back to speech. However, it is the use of AI, and more specifically machine learning, that enables voice assistants to become smarter, function more accurately and potentially drive a lot of consumer adoption.
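To make the idea of learning from experience more concrete, here is a deliberately tiny sketch. It is an assumption-laden toy rather than the method any real assistant uses: the class name TinyIntentLearner, the word-counting approach and the example phrases are all invented for illustration. The point is that no rule such as "lamp means lights" is ever written by hand; the behaviour comes only from labelled examples.

```python
from collections import Counter, defaultdict


class TinyIntentLearner:
    """Toy learner: counts which words were seen with which intent."""

    def __init__(self):
        # Maps each intent name to a count of the words seen in its examples.
        self.word_counts = defaultdict(Counter)

    def learn(self, utterance: str, intent: str) -> None:
        """Record one labelled example (the 'experience')."""
        self.word_counts[intent].update(utterance.lower().split())

    def predict(self, utterance: str) -> str:
        """Pick the intent whose past examples share the most words."""
        words = utterance.lower().split()
        scores = {
            intent: sum(counts[word] for word in words)
            for intent, counts in self.word_counts.items()
        }
        if not scores or max(scores.values()) == 0:
            return "unknown"
        return max(scores, key=scores.get)


learner = TinyIntentLearner()
learner.learn("play some jazz", "music")
learner.learn("play my favourite song", "music")
learner.learn("turn on the lights", "lights")
learner.learn("switch the lights off", "lights")
```

With these four examples the learner can label "play a song" as music, but it has never seen the word "lamp". After one further call, learner.learn("dim the lamp", "lights"), it can handle that word too, without any change to the code.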

History and evolution of voice assistants

Voice control first appeared in the public imagination in the 1960s, via HAL 9000, the sentient computer in 2001: A Space Odyssey, and the Starship Enterprise’s helpful computer, and most recently via Jarvis in Iron Man. The evolution of voice assistants, in fiction at least, has been rapid.

However, voice assistants only began to become commonplace in real life in 2011, when Apple launched its phone-based assistant, Siri. In this section we run through the evolution of voice assistants.

Siri

Siri was originally acquired by Apple in 2010, for an estimated 100-250 million USD. It is arguably the most well-known voice assistant thanks to an earlier launch than its competitors, availability in more languages and native support in many Apple devices, iPhones in particular.

Alexa

Amazon launched the Amazon Echo in 2014. This ‘smart speaker’ introduced consumers to the voice assistant Alexa that was designed in-house. The Echo was a way for Amazon to establish a presence in the home.

Securing and building upon this presence in the home could be a key strategy for Amazon to develop its membership base and further increase revenue.

Google Assistant

Google Assistant is Google’s voice assistant, which is available on smartphones and smart home devices. Launched in 2016, the Google Assistant has spread quite far, as it is available not only on Google’s own products but also on a huge range of devices through partnerships with other companies.

Other voice assistants

Siri, Alexa and Google Assistant are the most used voice assistants in the market. However, as voice assistants present a lucrative opportunity for big tech companies to increase revenue and make their way into consumers’ homes, there are many other competitors. For example, Facebook has launched M and Microsoft has launched Cortana. As people are continuing to embrace this technology, the evolution of voice assistants is sure to continue.

Who uses voice assistants?

Consumers have mostly adopted voice assistants in the home, and workplace usage has not been reported. Younger consumers are driving adoption but not necessarily heavy usage: 25-49 year olds use them most often and are statistically considered “heavy” users.

People with disabilities are clear beneficiaries of voice assistants. The devices are an affordable way to enable more independence, break down accessibility barriers and help disabled people be part of an inclusive society.

Are voice assistants useful?

Voice assistants are undoubtedly useful as they allow consumers to do daily tasks hands free and are seen as the smarter, faster and easier way to perform everyday activities.

However, despite growing capabilities, basic tasks remain the norm for most users and the majority of consumers have yet to use them for advanced activities like shopping or controlling other smart devices in the home.

As this is still a new technology, consumers are clearly taking time to adjust. Voice assistants are useful now, but once there is a foundation of trust and reliability, their usefulness could transform lives everywhere.

What can voice assistants do?

There is a range of things that voice assistants can do, including:

  • Control thermostats, lights and locks

  • Send email/text messages or initiate calls

  • Buy items online

  • Locate lost smartphones or other devices

  • Check traffic conditions and map travel routes

  • Read books or newspapers aloud

  • Schedule meetings

Are voice assistants intelligent?

As noted above, after voice assistants have processed speech, they effectively draw responses and actions requested by an end user from the internet. Through machine learning they can learn from previous actions to give a better and contextualised experience to the user. So arguably, they are intelligent.

However, because a voice assistant is not a sentient being, you could equally argue that it is not intelligent. The real intelligence could be said to lie with the engineers who have developed the software and hardware for the voice assistant. They are the parents of this technology and have programmed what the voice assistant learns, when it learns and how it learns.

Multi-turn conversations

With a voice assistant, you can have different ‘types’ of conversation. For example, you can have a ‘single-turn’ conversation. This is simply where a user asks a question and the voice assistant responds verbally.

A multi-turn conversation is where there is a dialogue between the voice assistant and the user. The voice assistant responds appropriately to a second question by remembering the intent (or context) established by the first interaction. Voice assistants cannot yet do this perfectly, so developers are continually improving their machine learning models to increase multi-turn conversation accuracy.
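The difference between the two types of conversation can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a real dialogue system: the Conversation class, its keyword matching and the FORECASTS table are all hypothetical. The key idea is that the intent from the first turn is stored so that a short follow-up can reuse it.

```python
# Hypothetical lookup table standing in for a real weather service.
FORECASTS = {"london": "rainy", "madrid": "sunny"}


class Conversation:
    """Remembers the intent and city from earlier turns."""

    def __init__(self):
        self.context = {}  # carried over from one turn to the next

    def ask(self, utterance: str) -> str:
        words = utterance.lower().rstrip("?").split()
        # Update the remembered intent if this turn states one.
        if "weather" in words:
            self.context["intent"] = "weather"
        # Update the remembered city if this turn names one after "in".
        if "in" in words and words.index("in") + 1 < len(words):
            self.context["city"] = words[words.index("in") + 1]
        if self.context.get("intent") != "weather":
            return "Sorry, I did not understand that."
        if "city" not in self.context:
            return "Which city?"
        city = self.context["city"]
        forecast = FORECASTS.get(city, "unknown")
        return f"The weather in {city.title()} is {forecast}."


chat = Conversation()
first = chat.ask("What is the weather in London?")
second = chat.ask("And in Madrid?")  # relies on the remembered intent
```

Here the follow-up "And in Madrid?" only makes sense because the first turn established that the user is asking about the weather. Asked on its own in a fresh Conversation, the same follow-up cannot be answered.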

Natural speech

Voice assistants still sound distinctly robotic, making it easy to distinguish them from a person. Many would not be able to pass the Turing test, which was proposed by the English computer scientist Alan Turing. The test evaluates a machine’s intelligence: to pass, a machine must behave in a way that is indistinguishable from a human.

There have been some claims that voice assistants do pass the Turing test in some ways. The board chairman of Google’s parent company, Alphabet, has said that Google Duplex passes the Turing test in the task of booking appointments. But he stressed that it only passes in this specific way.

Voice assistants and gender stereotypes

According to a UN study entitled ‘I’d blush if I could’, AI-powered voice assistants with female voices are perpetuating harmful gender biases.

Most smart-speaker assistants have a female voice and, according to the study, companies such as Apple and Amazon are staffed overwhelmingly by male engineering teams, a combination that many people find worrying.

The report goes on to say, "Because the speech of most voice assistants is female, it sends a signal that women are... docile helpers, available at the touch of a button or with a blunt voice command like 'hey' or 'OK'. The assistant holds no power of agency beyond what the commander asks of it. It honours commands and responds to queries regardless of their tone or hostility."

Spokespeople from various tech companies say that the reasoning behind the prominent use of a female voice is that many voices were tested with customers before launching and the female voice tested best. Where objectives for building a helpful, supportive, trustworthy assistant were sought, a female voice was the stronger choice. Others might say that voice assistants use female voices because people just prefer talking to women.

Apple’s Siri and the Google Assistant currently offer the option to switch to a male voice.

Voice assistants and human interaction

With voice assistants, many tasks that currently provide us with some human interaction can be carried out by a computer. This raises the question of what social implications will follow from the adoption of voice assistants and the consequent reduction in human interaction.

Implications from a reduction in human interaction

It is possible to draw parallels from the effect of national lockdowns which have taken place during the coronavirus pandemic. During this time, people have been confined to their homes and human interaction has been dramatically reduced and massively changed. The result of this is that people’s mental health has been affected with more anxiety, obsessive compulsive disorders, and loneliness being experienced. It is easy to imagine that with voice assistants reducing human interaction and people finding themselves with few close connections and unsupportive networks, mental health problems could increase.

Interactions with a voice assistant and a person differ immensely. It is therefore conceivable that personal and professional development could be hampered by many tasks being outsourced to a voice assistant. A voice assistant cannot provide the debate, the challenge and the spontaneity that is unique to human interactions. It will become increasingly important to weigh up the pros and cons of the usage of voice assistants and whether they replace communication or redefine it.

Helping combat isolation

In older people who are more likely to experience loneliness and social isolation, voice technology can have an incredibly positive impact. Greenwood Campbell, Abbeyfield and the University of Reading conducted a study which found that the use of voice technology alleviated loneliness in all participants involved.

Voice assistants and the workplace

Voice assistants have entered the market primarily through people’s homes but it is easy to imagine that due to their usefulness, businesses are also considering how they could benefit operations.

One hurdle to getting voice assistants into the workplace is a general concern that they pose a privacy risk. As businesses can be dealing with extremely confidential and sensitive information, they will need to be convinced that there is no privacy risk before introducing any voice assistants. However, businesses are always looking to reduce costs, and voice assistants could add efficiency that dramatically cuts expenditure.

Voice assistants in 2030

Voice assistants are one of many ‘smart’ devices making their way into our lives that make up the Internet of Things. These are devices that are connected to wireless networks which can communicate with other devices within our homes.

In 2017, there were an estimated 27 billion Internet of Things devices, and this number is expected to grow by 12% every year to reach more than 125 billion devices by 2030. It is clear that big tech companies are investing heavily in these devices, so it will be interesting to consider how they will have developed by 2030.
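As a quick check on how these figures fit together, the growth can be treated as simple compound growth. The sketch below assumes a constant annual rate over the 13 years from 2017 to 2030, which the estimate itself may not assume; the function names are only illustrative.

```python
def project(start: float, annual_rate: float, years: int) -> float:
    """Value after compounding `annual_rate` once a year for `years` years."""
    return start * (1 + annual_rate) ** years


def implied_rate(start: float, end: float, years: int) -> float:
    """Constant annual growth rate that turns `start` into `end`."""
    return (end / start) ** (1 / years) - 1


devices_2017 = 27e9        # figure quoted in the article
target_2030 = 125e9        # figure quoted in the article
years = 2030 - 2017

projected_2030 = project(devices_2017, 0.12, years)
rate_needed = implied_rate(devices_2017, target_2030, years)

print(f"At 12% a year: {projected_2030 / 1e9:.1f} billion devices in 2030")
print(f"Rate needed to reach 125 billion: {rate_needed:.1%}")
```

Under this assumption, exactly 12% a year gives roughly 118 billion devices by 2030, and reaching 125 billion needs a rate of about 12.5%, which suggests the quoted 12% is a rounded figure.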

What tasks will voice assistants be able to do in the future?

The ambitions of some companies show what voice assistants could successfully do in the future. As an example, Google has developed Google Duplex, which is a technology that can conduct natural conversations and carry out “real world” tasks over the phone.

Users can ask questions or give commands through Google Assistant and this can then use Google Duplex to carry out specific tasks such as making a restaurant booking or scheduling a hair appointment.

What roles will be affected by voice assistants?

Using Google Duplex as an example, roles that heavily require phone calling may be open to disruption. Google Duplex is seen as beneficial to businesses by being able to arrange appointments, gather information from different sources and address accessibility and language barriers.

Therefore, roles such as that of a personal assistant (PA) could be changed by improvements in voice assistant technology. However, they will not necessarily be made redundant, as day-to-day interactions can be complex for a computer and there would need to be a human operator to defer to from time to time in order for a difficult task to be completed. It could be that a PA turns into a voice assistant (VA) manager.

Conclusion

There can be no doubt that voice assistants are, and will continue to be, a great feat of human ingenuity, and they are already creeping into our lives in some shape or form. With the eventual roll-out of 5G and improvements in machine learning, voice assistants may be setting themselves up to be a tool we cannot live without.

However, before we get to that stage, there are hurdles to clear, which include heavy investment, improvement in the technology and confidence among consumers that a device that is part of their lives does not pose a risk to their privacy.

