Artificial Intelligence (AI) is a term we are hearing more and more these days. Whether you believe it will ‘take our jobs’ or not (I personally believe it will change our jobs!), it will surely play an ever-growing role in the future of IT and computing.
Although people have been talking about AI since the 1950s, it is still in its relative infancy. Looking as far back as the computer ‘HAL 9000’ from 2001: A Space Odyssey (1968), it is proving to take many decades to make such intelligent technology a reality. Amazing technical advancements have been made since then, and demand has grown as more use cases have emerged.
This post is designed to give a general overview of some of the lesser-known AI services offered by Amazon Web Services (AWS). Their best-known service is Lex, which is built on the same technology that powers Alexa and has many potential applications for web and voice communications. However, we will focus on:
- Amazon Transcribe
- Amazon Comprehend
- Amazon Translate
Naturally, there are some inaccuracies in certain parts of the transcription, particularly with names (e.g. ‘for us’ is meant to be my colleague ‘Faraz’), although Transcribe did get the essence of the conversation right. To improve accuracy, Transcribe gives us the option to upload CSV or TXT custom vocabulary files that define terms (‘entities’) such as the names of places, people etc.
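As an alternative to uploading a file, a custom vocabulary can also be created programmatically. The sketch below assumes the boto3 SDK and uses its `create_vocabulary` call; the helper names, the vocabulary name and the `en-GB` default are my own placeholders rather than anything from the job shown above.

```python
import re

# Transcribe vocabulary names may contain only letters, digits, '.', '_' and '-'.
_NAME_PATTERN = re.compile(r"^[0-9a-zA-Z._-]+$")


def build_vocabulary_request(name, phrases, language_code="en-GB"):
    """Build the keyword arguments for Transcribe's create_vocabulary call."""
    if not _NAME_PATTERN.match(name):
        raise ValueError(f"invalid vocabulary name: {name!r}")
    # Drop blanks and duplicates while keeping the original order.
    cleaned = list(dict.fromkeys(p.strip() for p in phrases if p.strip()))
    if not cleaned:
        raise ValueError("at least one phrase is required")
    return {
        "VocabularyName": name,
        "LanguageCode": language_code,
        "Phrases": cleaned,
    }


def create_vocabulary(name, phrases, language_code="en-GB"):
    """Send the request to AWS (needs boto3 and valid credentials)."""
    import boto3  # imported lazily so the helper above works without it

    client = boto3.client("transcribe")
    return client.create_vocabulary(
        **build_vocabulary_request(name, phrases, language_code)
    )
```

Calling `create_vocabulary("contact-centre-names", ["Faraz"])` would register the name so that later jobs can reference the vocabulary.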
In the context of a contact centre, if you enable ‘Channel identification’ when creating the Transcribe job, you can separate what the ‘caller’ (Channel 0) and ‘agent’ (Channel 1) said, as per below:
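For anyone scripting this rather than using the console, here is a minimal sketch of the same idea. It assumes the request shape of boto3's `start_transcription_job` and the `channel_labels` section that Transcribe writes into its output JSON when channel identification is on; the function names and the caller/agent mapping table are illustrative.

```python
# Channel 0 carries the caller and Channel 1 the agent, as described above.
ROLES = {"ch_0": "caller", "ch_1": "agent"}


def build_channel_job_request(job_name, media_uri, language_code="en-GB"):
    """Keyword arguments for start_transcription_job with channel identification."""
    return {
        "TranscriptionJobName": job_name,
        "LanguageCode": language_code,
        "Media": {"MediaFileUri": media_uri},
        "Settings": {"ChannelIdentification": True},
    }


def text_by_channel(transcript):
    """Join the words recognised on each channel into one string per party."""
    result = {}
    for channel in transcript["results"]["channel_labels"]["channels"]:
        text = ""
        for item in channel["items"]:
            word = item["alternatives"][0]["content"]
            # Punctuation attaches to the previous word without a space.
            if item["type"] == "punctuation" or not text:
                text += word
            else:
                text += " " + word
        label = channel["channel_label"]
        result[ROLES.get(label, label)] = text
    return result
```

The request dictionary can be passed to `boto3.client("transcribe").start_transcription_job(**request)`, and `text_by_channel` can be applied to the JSON file the finished job produces.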
Transcribe can also detect up to 10 different voices using the ‘speaker identification’ feature, although its accuracy is significantly increased if you specify the number of different speakers yourself. Once you have your transcript, Amazon Comprehend can give some meaningful insights.
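Speaker identification is switched on through the settings of a Transcribe job. The sketch below assumes the `ShowSpeakerLabels` and `MaxSpeakerLabels` settings of boto3's `start_transcription_job`; the helper name is hypothetical. As far as I know, the API rejects requests that ask for both channel and speaker identification, so the helper refuses that combination up front.

```python
def with_speaker_labels(job_request, max_speakers):
    """Return a copy of a start_transcription_job request with speaker labels on."""
    if max_speakers < 2:
        raise ValueError("speaker identification needs at least 2 speakers")
    settings = dict(job_request.get("Settings", {}))
    if settings.get("ChannelIdentification"):
        raise ValueError("use either channel or speaker identification, not both")
    # Telling Transcribe how many voices to expect improves its accuracy.
    settings.update(ShowSpeakerLabels=True, MaxSpeakerLabels=max_speakers)
    return {**job_request, "Settings": settings}
```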
In contact centres, the interaction between the caller and the agent is split between two audio channels, namely left and right. Transcribe separates these channels using the ‘channel identification’ feature. Comprehend has a feature called ‘sentiment analysis’ that can be carried out for both parties, which consists of a score for each of four sentiments: POSITIVE, NEGATIVE, NEUTRAL and MIXED.
The total of these scores adds up to 1, like a percentage analysis. The example above is of a dissatisfied customer calling to complain about receiving an order late, a lack of communication and the goods being damaged. The scores for the customer are as follows:
Here it gives a dominant sentiment, which is clearly NEGATIVE in this case. Now, the agent is trying to find out what happened as well as reassure the customer:
The overwhelming NEUTRAL sentiment makes sense as the agent is trying to be impartial, with very little POSITIVE or NEGATIVE sentiment in there. A good listener should not offer any opinion or sentiment either way.
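To make the scoring concrete, here is a small sketch assuming boto3's `detect_sentiment` call, which returns a `Sentiment` label together with a `SentimentScore` dictionary. The helper names are mine, and the numbers in the usage note are made up for illustration; they are not the scores from the call above.

```python
import math


def dominant_sentiment(scores):
    """Return the highest-scoring label, upper-cased like Comprehend's 'Sentiment'."""
    # The four scores should add up to (roughly) 1.
    if not math.isclose(sum(scores.values()), 1.0, abs_tol=0.01):
        raise ValueError("sentiment scores should add up to 1")
    return max(scores, key=scores.get).upper()


def detect_sentiment(text, language_code="en"):
    """Ask Comprehend for the sentiment of one party's side of the transcript."""
    import boto3  # imported lazily; needs valid AWS credentials

    client = boto3.client("comprehend")
    response = client.detect_sentiment(Text=text, LanguageCode=language_code)
    return response["Sentiment"], response["SentimentScore"]
```

With illustrative scores such as `{"Positive": 0.02, "Negative": 0.85, "Neutral": 0.10, "Mixed": 0.03}`, `dominant_sentiment` picks out NEGATIVE as the label with the largest share.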
These scores can be saved as JSON files and stored using Amazon’s S3 service (think ‘object storage in the cloud’). We will look at how to further manipulate and analyse the data in the Use Cases section below.
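Saving the scores could look something like the sketch below, which assumes boto3's `put_object` call. The bucket name, the key layout (`sentiment/<call id>/<party>.json`) and the field names in the JSON document are placeholders of my own choosing.

```python
import json


def build_put_request(bucket, call_id, party, scores):
    """Build the put_object arguments for one party's sentiment scores."""
    body = json.dumps(
        {"call_id": call_id, "party": party, "scores": scores}, indent=2
    )
    return {
        "Bucket": bucket,
        "Key": f"sentiment/{call_id}/{party}.json",  # hypothetical key layout
        "Body": body.encode("utf-8"),
        "ContentType": "application/json",
    }


def save_scores(bucket, call_id, party, scores):
    """Upload the JSON document to S3 (needs boto3 and valid credentials)."""
    import boto3

    return boto3.client("s3").put_object(
        **build_put_request(bucket, call_id, party, scores)
    )
```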
Amazon expects another six languages to be added by the end of 2018. Currently, Amazon Translate can translate between English and the following 12 languages:
- Chinese (Simplified)
- Chinese (Traditional)
Translate can dynamically detect the source language, so there is no need to specify the source language – only the target language. It can also work on text files and real-time streams. However, audio or video streams first need to be converted to text, for example WebVTT files (WebVTT is a W3C standard for displaying timed text in connection with HTML5). Each line is translated with a time stamp in the stream, which enables tracking of who is saying what and when.
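As a rough sketch of the WebVTT approach, the code below translates only the cue text and leaves the timing lines untouched. It assumes boto3's `translate_text` call with `SourceLanguageCode="auto"` for automatic detection; the function names are hypothetical, and the translator is passed in as a parameter so the parsing can be tried without an AWS account.

```python
def translate_vtt(vtt_text, translate, target_language):
    """Translate cue text in a WebVTT document, keeping the time stamps as they are."""
    out = []
    in_cue = False
    for line in vtt_text.splitlines():
        if "-->" in line:
            in_cue = True   # timing line: the cue text follows
            out.append(line)
        elif not line.strip():
            in_cue = False  # a blank line ends the cue
            out.append(line)
        elif in_cue:
            out.append(translate(line, target_language))
        else:
            out.append(line)  # header, cue identifiers, notes
    return "\n".join(out)


def aws_translate(text, target_language):
    """Translate one line with Amazon Translate, detecting the source language."""
    import boto3  # imported lazily; needs valid AWS credentials

    # A new client per call keeps the sketch short; reuse one in real code.
    client = boto3.client("translate")
    response = client.translate_text(
        Text=text,
        SourceLanguageCode="auto",
        TargetLanguageCode=target_language,
    )
    return response["TranslatedText"]
```

Calling `translate_vtt(captions, aws_translate, "es")` would return the same captions in Spanish with every time stamp preserved.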
The web console is OK for a one-off single job. For a one-off batch of jobs, you could use a Unix/Linux shell script that incorporates AWS CLI commands. However, for a truly automated system, it is best to use a Lambda function, as we will discuss next.
The same code can then invoke Comprehend to get an idea of customer satisfaction as well as agent performance based on sentiment analysis. The data can be written to a variety of database solutions and then further manipulated using an analytics tool of your choice.
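A Lambda function along these lines might start as in the sketch below. It assumes the function is triggered by an S3 event whenever a call recording is uploaded, which is my assumption about the wiring rather than a description of a specific deployment. The optional `transcribe` parameter is there only so the handler can be exercised with a stand-in client.

```python
import re
from urllib.parse import unquote_plus


def handler(event, context, transcribe=None):
    """Start one Transcribe job for every recording named in an S3 event."""
    if transcribe is None:
        import boto3  # bundled with the AWS Lambda Python runtime

        transcribe = boto3.client("transcribe")

    started = []
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        # Job names may contain only letters, digits, '.', '_' and '-'.
        job_name = re.sub(r"[^0-9a-zA-Z._-]", "-", key)
        transcribe.start_transcription_job(
            TranscriptionJobName=job_name,
            LanguageCode="en-GB",  # assumed language of the recordings
            Media={"MediaFileUri": f"s3://{bucket}/{key}"},
            Settings={"ChannelIdentification": True},
        )
        started.append(job_name)
    return {"started": started}
```

A second function, triggered when the transcript lands in S3, could then pass each party's text to Comprehend and store the scores.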
Another example is web chat for multinational companies. Both customer and agent can be using their respective native languages and Translate can make the conversation bilingual. Once again, Comprehend can be employed to gauge customer satisfaction and agent performance by looking at sentiment analysis scores.
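The bilingual chat idea can be sketched as a small relay step that sits between the two parties. The function name and the way messages reach it are hypothetical; the only AWS call assumed is boto3's `translate_text`, made on a client that is passed in.

```python
def relay(message, sender_language, recipient_language, translate_client):
    """Translate one chat message into the recipient's language."""
    if sender_language == recipient_language:
        return message  # nothing to translate
    response = translate_client.translate_text(
        Text=message,
        SourceLanguageCode=sender_language,
        TargetLanguageCode=recipient_language,
    )
    return response["TranslatedText"]


# Usage with a real client (needs boto3 and valid AWS credentials):
#   import boto3
#   client = boto3.client("translate")
#   relay("Where is my order?", "en", "es", client)
```

Each customer message would be relayed into the agent's language and each reply relayed back, with the original text kept for sentiment analysis.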