In addition to its machine learning service, AWS announced three exciting new tools late last year: Lex, Polly and Rekognition. The tools are very different, but they share a theme: using machines to identify or manipulate data intelligently and automatically. From speech to sight and beyond, these services open up a number of exciting new possibilities!
Amazon Polly
Polly is designed to turn text into speech. Before you have traumatic flashbacks to the speech synthesisers of days gone by, please note that Polly is significantly more advanced. Polly includes 47 voices speaking 24 different languages. Speech from Polly is lifelike, and you can adjust many aspects of it, including pitch, tone, speed, volume and, most importantly, pronunciation.
The potential uses of Polly are very exciting. This is an area that has been lacking for many years. For example, here in Sydney our train announcement system is a combination of pre-recorded chunks of voice acting. There is very little room to change messages or to put a unique alert out over the PA system. With a service like Polly, any text statement can easily be converted and streamed as a voice announcement.
It also comes into play in telephony systems whose menus are tied to recordings. No longer will you be at the mercy of a single voice actor to keep a uniform voice across your system, or be forced to shell out for costly changes or additions to your menus. With Polly you can build a new voice menu item and put it into service straight away.
In summary, Polly makes it easy for any application or device to have a voice, without needing the computing capacity to convert the text or the storage to hold the resulting audio files. This is all handled by AWS, and all that is needed is the Polly API.
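As a rough illustration of the announcement idea, here is a minimal Python sketch. It assumes a client created with `boto3.client("polly")` and uses Polly's `synthesize_speech` call with SSML input. The helper names `build_ssml` and `announce` are made up for this example, and `Nicole` is simply one of the Australian English voices.

```python
from xml.sax.saxutils import escape


def build_ssml(text, rate="medium", volume="medium"):
    """Wrap plain text in SSML so the speed and volume can be adjusted."""
    # escape() protects characters such as & and < that would break the markup.
    return '<speak><prosody rate="{}" volume="{}">{}</prosody></speak>'.format(
        rate, volume, escape(text)
    )


def announce(polly_client, text, voice_id="Nicole", rate="medium", volume="medium"):
    """Return MP3 bytes for `text`, spoken by the chosen Polly voice.

    `polly_client` is assumed to be boto3.client("polly") or any object
    exposing the same synthesize_speech method.
    """
    response = polly_client.synthesize_speech(
        Text=build_ssml(text, rate, volume),
        TextType="ssml",
        OutputFormat="mp3",
        VoiceId=voice_id,
    )
    # AudioStream is a file-like object holding the synthesised audio.
    return response["AudioStream"].read()
```

Something like `announce(boto3.client("polly"), "The 8:15 service to Central is delayed.", rate="slow")` would then return audio ready to be streamed to a PA system or saved to a file.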
Amazon Machine Learning
Amazon Machine Learning gives you powerful tools to find patterns in data and make predictions from them. It is coupled with visual tools that display the models you have designed. The service is based on software used by Amazon’s own data analyst teams. Its ability to sort data quickly and reliably in real time is the driving force behind the recommended items you see while shopping on Amazon.
Data is a very broad subject, and so are the potential uses of the machine learning service. Some companies successfully use it to flag fraudulent or inconsistent transactions. Others use it to predict foot traffic to restaurants and venues so they can organize staffing. While it may initially seem like a fit only for larger companies with data warehouses, its scalability and pay-for-what-you-use pricing mean that anyone can load data in and start making predictions.
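To make the fraud-flagging use case a little more concrete, here is a hedged Python sketch of a real-time prediction. It assumes a client created with `boto3.client("machinelearning")`, a binary classification model that has already been trained, and a real-time endpoint for it. The function name `fraud_score`, the field names in the transaction, and the convention that label `"1"` means "fraudulent" are all assumptions made for illustration.

```python
def fraud_score(ml_client, model_id, endpoint_url, transaction):
    """Ask a trained binary model whether a transaction looks fraudulent.

    Returns a (flagged, score) tuple. `ml_client` is assumed to be
    boto3.client("machinelearning") or an object with the same predict method.
    """
    # The service expects every value in the record as a string.
    record = {key: str(value) for key, value in transaction.items()}
    response = ml_client.predict(
        MLModelId=model_id,
        Record=record,
        PredictEndpoint=endpoint_url,
    )
    prediction = response["Prediction"]
    label = prediction["predictedLabel"]
    # For binary models the score is keyed by the predicted label.
    score = prediction.get("predictedScores", {}).get(label)
    # Assumption: the model was trained with "1" meaning fraudulent.
    return label == "1", score
```

A caller might pass a record such as `{"amount": 950, "country": "AU"}` and hold any transaction that comes back flagged for manual review.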
If your data is valuable in some way, a service such as Amazon Machine Learning is the key to extracting that value.
Amazon Lex
Amazon Lex is a service used to build chat bots. The bots are intelligent and integrate with messenger clients. This is not out of left field: Facebook has recently tried to launch chat bots for Messenger, and Slack is treating chat bots very seriously.
A chat bot is a new and engaging way for a business to communicate with people. Bots can offer up information or even provide functionality, because Lex is integrated with AWS Lambda. Common use cases include ordering pizza and booking hotels. While it may seem trivial, hitting up Netflix on Messenger and asking it to recommend something to watch is much more alluring than logging into a website.
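The Lambda integration is where a bot actually does something. Below is a minimal sketch of a fulfilment function in Python, assuming the event and response shapes Lex uses when it calls Lambda (`currentIntent`, `slots`, `dialogAction`). The intent name `OrderPizza` and the slot name `Size` are hypothetical and would need to match whatever is defined in the bot.

```python
def lambda_handler(event, context):
    """Fulfil a Lex intent and tell Lex to close the conversation."""
    intent = event["currentIntent"]
    # Slots may be missing or None if the user has not supplied them.
    slots = intent.get("slots") or {}

    if intent["name"] == "OrderPizza":  # hypothetical intent name
        size = slots.get("Size") or "medium"  # hypothetical slot name
        state = "Fulfilled"
        text = "Thanks, your {} pizza is on its way.".format(size)
    else:
        state = "Failed"
        text = "Sorry, I can't help with that yet."

    # "Close" tells Lex the conversation for this intent is finished.
    return {
        "dialogAction": {
            "type": "Close",
            "fulfillmentState": state,
            "message": {"contentType": "PlainText", "content": text},
        }
    }
```

A real handler would place the order or look up the booking before replying. The point here is only that the bot's behaviour lives in ordinary Lambda code.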
Lex is a standout in this department because you can leverage a number of AWS services, such as Lambda, to take things further. NASA has used Lex, Polly and Alexa to build a system that takes spoken questions from students and then answers them with a voice of its own. The system draws on the NASA data about Mars stored in AWS to find the answers to the questions. http://mars.nasa.gov/ask-nasa-mars/#/
Amazon Rekognition
Rekognition is a service that can identify faces or components of images. This astonished me when I saw it in action. The cool futuristic robots we see in fiction all have this critical ability, and it is one we were missing until now. Rekognition is able to take an image and identify what it is looking at. You can send it a photo and it can identify components such as a tree, a bicycle or different types of furniture. However, it goes deeper than this. In one image scanned by Rekognition it didn’t stop at tree, it said maple tree. This is the part that blew my mind, as it was able to analyze the image better than I could. I wouldn’t have any idea what kind of tree I was looking at.
Rekognition can also identify faces from different angles, so you can sift through thousands of images and find the ones with you in them with ease. This kind of technology opens the door to a great number of possibilities. If a device or application can tell what it is looking at without being told, it can be set to trigger actions based on what it finds. Think of smart homes that can SMS you if someone unfamiliar is on or in your property, or recycling centers that can pick and sort through materials. Maybe this will usher out the period of hearing the dreaded “unexpected item in the bagging area” while shopping at a supermarket.
A real-world example of this technology in play is a real estate site that has Rekognition handle all of its image tagging and makes the results searchable with Elasticsearch. Want to know which properties have pools? The service can scan the images in each property portfolio and then show you which properties have images tagged pool.
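The pool search could be sketched roughly as follows in Python. This assumes a client created with `boto3.client("rekognition")` and uses its `detect_labels` call. The helper names, the 80 percent confidence threshold and the shape of the `listings` dictionary are assumptions for illustration, and the search-index step is left out entirely.

```python
def image_has_label(rekognition_client, image_bytes, wanted, min_confidence=80.0):
    """Return True if Rekognition tags the image with the wanted label."""
    response = rekognition_client.detect_labels(
        Image={"Bytes": image_bytes},
        MinConfidence=min_confidence,
    )
    # Compare case-insensitively so "pool" matches the label "Pool".
    names = {label["Name"].lower() for label in response["Labels"]}
    return wanted.lower() in names


def properties_with(rekognition_client, listings, wanted):
    """Return the ids of properties with at least one image showing `wanted`.

    `listings` is assumed to map a property id to a list of image bytes.
    """
    return [
        property_id
        for property_id, images in listings.items()
        if any(image_has_label(rekognition_client, image, wanted) for image in images)
    ]
```

In practice a site would run this once per uploaded image and store the labels, rather than calling Rekognition at search time.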
These four services are an exciting preview of things to come. Best of all, they are easily accessed and leveraged alongside your current AWS assets. By design, they can easily be used with each other and with all the other AWS services. Even better, Lex, Polly and Rekognition are all available in some capacity as part of the free tier!