December 9, 2021
Top AI, ML, and Data Announcements at AWS re:Invent 2021
Author: Provectus, AI-first consultancy and solutions provider.
The 10th anniversary AWS re:Invent event has concluded.
The 2021 hybrid conference featured both in-person and virtual activities, and welcomed over 30K attendees at the Expo in Las Vegas. On site, 300 sponsors exhibited in 254 booths across two partner pavilions, and in a variety of Industry Lounges across the campus. At the same time, thousands of potential builders, thinkers, and pathfinders were able to stream live and on-demand keynotes, leadership sessions, breakout tracks, and Learn & Play activities from their homes and offices.
As a whole, AWS re:Invent 2021 was dedicated to celebrating the success of Amazon Web Services in the cloud space while providing a strong vision for the future. Adam Selipsky, the new CEO of AWS, recapped AWS progress through the years, from the release of Amazon EC2 in 2006 to the massive delivery of cloud services in 2021. Swami Sivasubramanian, VP of Amazon Machine Learning, reinforced AWS’s image as a leader in artificial intelligence, machine learning, and data analytics.
As an AWS Premier Consulting Partner, Provectus shares Amazon Web Services’ vision of the future. Our common goal is to leverage cloud, data, and AI to reimagine the way businesses operate, compete, and deliver customer value. Not surprisingly, Provectus was named an AWS Launch Partner for Artificial Intelligence for Data Analytics (AIDA). This recognition underscores our expertise in delivering AI solutions for data analytics use cases in a broad range of industries.
There were other notable AI, ML, and data announcements as well. Let’s explore them in more detail!
The Journey from Data to AI/ML Services
AWS re:Invent 2021 was all about data and the journey that organizations take, from collecting and storing data, to processing and analytics, to predictions.
In his keynote, Swami Sivasubramanian described the end-to-end data, analytics, and ML journey as a series of steps, with AWS providing the most comprehensive set of services for data storage and access, data analytics and visualization, and AI/ML. Some of these keystone services, listed below, have been released this year.
#1 Amazon S3 Glacier
The Amazon S3 Glacier Instant Retrieval storage class delivers the lowest-cost storage for long-lived data that is rarely accessed but requires millisecond retrieval. Alongside it, S3 Intelligent-Tiering gained a new Archive Instant Access tier, which automates storage optimization for rarely accessed data that needs immediate retrieval.
Amazon S3 Glacier storage classes are designed to be extremely low-cost. According to AWS, S3 Intelligent-Tiering customers can automatically save up to 68% on data that has not been accessed for more than 90 days, with no action needed.
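As a rough illustration of how an object could be written directly to the Glacier Instant Retrieval storage class, the sketch below builds the keyword arguments for a boto3 put_object call. The bucket name, object key, and helper function are hypothetical; GLACIER_IR is the storage class value that S3 uses for Glacier Instant Retrieval. The actual call is left as a comment so the snippet runs without AWS credentials.

```python
def glacier_ir_put_params(bucket: str, key: str, body: bytes) -> dict:
    """Build kwargs for s3.put_object that target Glacier Instant Retrieval."""
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        # Storage class value for S3 Glacier Instant Retrieval.
        "StorageClass": "GLACIER_IR",
    }


# Hypothetical bucket and key, used only for illustration.
params = glacier_ir_put_params(
    "example-archive-bucket", "scans/2021/image-0001.bin", b"example payload"
)

# With credentials configured, the upload could look like this:
#   import boto3
#   boto3.client("s3").put_object(**params)
print(params["StorageClass"])
```

Objects that already exist in another storage class are usually moved with a lifecycle rule rather than re-uploaded, which is a separate configuration not shown here.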
#2 Cell-Level Security and Governed Tables with Automatic Compaction for AWS Lake Formation
AWS Lake Formation makes it easy to set up a secure data lake in days, instead of weeks or months. Newly released features further simplify data loading, storage optimization, and the management of access to a data lake.
- Governed Tables. This new type of Amazon S3 table makes the ingestion and management of data simple and reliable at any scale. Governed tables support ACID transactions that let multiple users reliably insert and delete data concurrently, across multiple governed tables.
- Storage Optimization with Automatic Compaction for governed tables. Lake Formation automatically compacts small S3 objects in governed tables into larger objects, to optimize access via analytics engines, such as Amazon Athena and Amazon Redshift Spectrum.
- Granular Access Control with Row and Cell-Level Security. This feature enables developers to control access to specific rows and columns, in query results and within AWS Glue ETL jobs, based on the user’s identity.
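To make the idea of row and cell-level security more concrete, here is a toy sketch in plain Python. It does not call Lake Formation and is not how the service is implemented; the function name, sample rows, and column names are all hypothetical. It only shows what combining a row filter with a column allow-list means for a query result.

```python
from typing import Callable

Row = dict[str, object]


def apply_cell_filter(
    rows: list[Row],
    row_predicate: Callable[[Row], bool],
    allowed_columns: set[str],
) -> list[Row]:
    """Keep only rows that pass the predicate, and only the allowed columns."""
    return [
        {col: val for col, val in row.items() if col in allowed_columns}
        for row in rows
        if row_predicate(row)
    ]


# Hypothetical table contents.
orders = [
    {"order_id": 1, "region": "EU", "customer_email": "a@example.com", "total": 40},
    {"order_id": 2, "region": "US", "customer_email": "b@example.com", "total": 75},
]

# An analyst limited to EU rows, with the email column hidden.
eu_view = apply_cell_filter(
    orders,
    row_predicate=lambda row: row["region"] == "EU",
    allowed_columns={"order_id", "region", "total"},
)
print(eu_view)  # [{'order_id': 1, 'region': 'EU', 'total': 40}]
```

In Lake Formation itself, filters of this kind are defined once in the service and then enforced by the integrated engines, so individual queries do not need to repeat the logic.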
#3 Serverless and Amazon Kinesis Data Streams On-Demand
AWS has also released a set of serverless services, as part of the trend to support technologies for running code, managing data, and integrating applications, all without managing servers.
- Amazon Redshift Serverless. This new capability makes it super easy to run analytics in the cloud with high performance at any scale. Just load your data and start querying, with no need to set up and manage clusters.
- Amazon EMR Serverless. A new serverless option in Amazon EMR that makes it easy and cost-effective for data engineers and analysts to run petabyte-scale data analytics in the cloud.
- Amazon MSK Serverless. A new type of Amazon MSK cluster that makes it easier for developers to run Apache Kafka without having to manage its capacity.
At the same time, Amazon Kinesis Data Streams on-demand provides a new capacity mode that eliminates capacity provisioning and management for streaming workloads. It is a fully-managed, serverless service for real-time processing of streamed data at a massive scale. Kinesis Data Streams can take any amount of data, from any number of sources, and scale up or down as needed.
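The difference between the on-demand and provisioned capacity modes shows up in how a stream is created. The sketch below builds the keyword arguments for a boto3 create_stream call in either mode; the helper function and stream names are hypothetical, and the real call is left as a comment so the snippet runs without AWS credentials.

```python
from typing import Optional


def create_stream_params(name: str, shard_count: Optional[int] = None) -> dict:
    """Build kwargs for kinesis.create_stream.

    With no shard count the stream is requested in on-demand mode, so no
    capacity is specified. With a shard count it is requested in provisioned
    mode.
    """
    if shard_count is None:
        return {
            "StreamName": name,
            "StreamModeDetails": {"StreamMode": "ON_DEMAND"},
        }
    return {
        "StreamName": name,
        "ShardCount": shard_count,
        "StreamModeDetails": {"StreamMode": "PROVISIONED"},
    }


# Hypothetical stream names, used only for illustration.
on_demand = create_stream_params("clickstream-events")
provisioned = create_stream_params("clickstream-events-fixed", shard_count=4)

# With credentials configured, creation could look like this:
#   import boto3
#   boto3.client("kinesis").create_stream(**on_demand)
print(on_demand)
print(provisioned)
```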
#4 Amazon DevOps Guru for RDS
Amazon DevOps Guru for RDS is a new capability for Amazon DevOps Guru. It allows developers to easily detect, diagnose, and resolve performance and operational issues in Amazon Aurora.
Because it works quickly and at scale, DevOps Guru for RDS can save developers and engineers many hours of work trying to uncover and remediate performance-related database issues.
DevOps Guru for RDS uses ML to automatically identify and analyze a wide range of performance-related database issues, such as over-utilization of host resources, database bottlenecks, or misbehavior of SQL queries. It also recommends solutions to remediate the issues it finds.
#5 Amazon DynamoDB Standard-Infrequent Access Table Class
Amazon DynamoDB Standard-Infrequent Access (DynamoDB Standard-IA) is a new table class for DynamoDB that reduces storage costs by 60% compared to existing DynamoDB Standard tables, while delivering the same performance, durability, and scaling.
The DynamoDB Standard-IA table class is designed for customers who want a cost-optimized solution for storing infrequently accessed data in DynamoDB without changing any application code. The new table class delivers the same performance as DynamoDB Standard and supports all of the same APIs.
DynamoDB Standard-IA is a great solution for businesses that have to store terabytes of data for several years while keeping it highly accessible. These include social media applications, e-commerce sites, healthcare, etc.
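Since the table class is a property of the table rather than of the application, switching an existing table is a configuration change. The sketch below builds the keyword arguments for a boto3 update_table call; the helper function and table name are hypothetical, and the call itself is left as a comment so the snippet runs without AWS credentials.

```python
VALID_TABLE_CLASSES = {"STANDARD", "STANDARD_INFREQUENT_ACCESS"}


def table_class_update_params(table_name: str, table_class: str) -> dict:
    """Build kwargs for dynamodb.update_table that change only the table class."""
    if table_class not in VALID_TABLE_CLASSES:
        raise ValueError(f"unknown table class: {table_class}")
    return {"TableName": table_name, "TableClass": table_class}


# Hypothetical table name, used only for illustration.
params = table_class_update_params("order-history", "STANDARD_INFREQUENT_ACCESS")

# With credentials configured, the change could look like this:
#   import boto3
#   boto3.client("dynamodb").update_table(**params)
print(params)
```

Reads and writes against the table keep using the same API calls after the change, which is why no application code needs to be touched.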
#6 AWS Database Migration Service Fleet Advisor
AWS Database Migration Service (AWS DMS) is a service that makes it easy to migrate databases to AWS quickly and securely. AWS DMS Fleet Advisor is a new feature of AWS DMS that enables developers to quickly build a database and analytics migration plan by automating the discovery and analysis of their fleet. AWS DMS Fleet Advisor is intended for users looking to migrate a large number of database and analytic servers to AWS.
AWS DMS Fleet Advisor can collect and analyze database schemas and objects, giving developers the ability to build a customized migration plan by determining the complexity of migrating source databases to target services in AWS. AWS DMS Fleet Advisor makes it easy to plan database and analytics migration to AWS without requiring expensive migration experts or third-party tools.
#7 Amazon SageMaker Ground Truth Plus
Amazon SageMaker Ground Truth Plus, the latest service in the Amazon SageMaker suite, is designed for professionals who do not have deep ML expertise or extensive knowledge of workflow design and quality management. Ground Truth Plus makes it easier for all to create high-quality training datasets, without having to build labeling applications and manage the labeling workforce. Just provide data along with labeling requirements, and Ground Truth Plus sets up the data labeling workflows and manages them automatically, in accordance with defined requirements.
Amazon SageMaker Ground Truth Plus uses a multi-step labeling workflow, including ML techniques for active learning, pre-labeling, and machine validation. This reduces the time required to label datasets for a variety of use cases, including computer vision and natural language processing. It provides transparency in data labeling operations and quality management through interactive dashboards and user interfaces. This lets you monitor the progress of training datasets across multiple projects, track project metrics such as daily throughput, inspect labels for quality, and provide feedback on the labeled data.
#8 Amazon SageMaker Studio Notebook
Amazon SageMaker Studio Notebook is a new feature of Amazon SageMaker Studio that provides a single integrated environment to perform data engineering, analytics, and machine learning. It allows teams across different data domains to collaborate more easily by using a range of data engineering, analytics, and machine learning workflows.
Amazon SageMaker Studio now allows users to interactively access, transform, and analyze a wide range of data for multiple purposes, all from within a universal notebook. With built-in integration with Spark, Hive, and Presto running on Amazon EMR clusters, and with data lakes running on Amazon S3, customers can access and manipulate data without having to switch services. In addition to building, training, and deploying machine learning models with their preferred framework (e.g. TensorFlow, PyTorch, or MXNet), customers can browse and query data sources, explore metadata and schemas, and start processing jobs for analytics or machine learning workflows, all without leaving the notebook.
#9 Amazon SageMaker Infrastructure Innovation
Along their data journey, organizations have various infrastructure needs. Some may need to move data faster between different departments, to get faster access to insights, while others need the ability to build and productionize their ML models more efficiently.
- Amazon SageMaker Training Compiler. This new capability automatically compiles your Python training code and generates GPU kernels specifically for your model. It can accelerate the training of deep learning (DL) models by up to 50%.
- Amazon SageMaker Inference Recommender. A brand-new Amazon SageMaker Studio capability that automates load testing and optimizes model performance across machine learning (ML) instances. Ultimately, it reduces the time to move ML models from development to production, and optimizes the cost of their operation.
- Amazon SageMaker Serverless Inference. This new inference option enables you to easily deploy machine learning models for inference without having to configure or manage the underlying infrastructure.
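For SageMaker Serverless Inference, the practical difference is in the endpoint configuration: instead of an instance type and count, a production variant carries a serverless configuration with a memory size and a concurrency limit. The sketch below builds the keyword arguments for a boto3 create_endpoint_config call. The helper function, configuration name, and model name are hypothetical, and the call itself is left as a comment so the snippet runs without AWS credentials.

```python
# Memory sizes accepted for serverless endpoints, in MB.
ALLOWED_MEMORY_MB = {1024, 2048, 3072, 4096, 5120, 6144}


def serverless_endpoint_config(
    config_name: str, model_name: str, memory_mb: int = 2048, max_concurrency: int = 5
) -> dict:
    """Build kwargs for sagemaker.create_endpoint_config with a serverless variant."""
    if memory_mb not in ALLOWED_MEMORY_MB:
        raise ValueError(f"memory_mb must be one of {sorted(ALLOWED_MEMORY_MB)}")
    if max_concurrency < 1:
        raise ValueError("max_concurrency must be at least 1")
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",
                "ModelName": model_name,
                # No instance type or count: capacity is managed by the service.
                "ServerlessConfig": {
                    "MemorySizeInMB": memory_mb,
                    "MaxConcurrency": max_concurrency,
                },
            }
        ],
    }


# Hypothetical names, used only for illustration.
config = serverless_endpoint_config("churn-serverless-config", "churn-model-v1")

# With credentials configured, creation could look like this:
#   import boto3
#   boto3.client("sagemaker").create_endpoint_config(**config)
print(config["ProductionVariants"][0]["ServerlessConfig"])
```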
#10 Amazon SageMaker Canvas
Amazon SageMaker Canvas is a new visual, no-code capability that allows business analysts to build ML models and generate accurate predictions without writing code or requiring ML expertise. Its intuitive UI enables business analysts to browse and access disparate data sources in the cloud or on-premises, combine datasets with the click of a button, train accurate models, and generate new predictions once new data is available.
SageMaker Canvas leverages the same technology as Amazon SageMaker to automatically clean and combine data, create hundreds of models under the hood, select the best performing one, and generate new individual or batch predictions. It supports such problem types as binary classification, multi-class classification, numerical regression, and time series forecasting. It enables organizations to tackle various business-critical use cases, from fraud detection and churn reduction to inventory optimization, without writing a single line of code.
#11 Amazon Lex Automated Chatbot Designer
The automated chatbot designer in Amazon Lex enables developers to design chatbots from conversation transcripts in hours rather than weeks. With Amazon Lex, they can build, test, and deploy chatbots and virtual assistants on contact center services (such as Amazon Connect), websites, and messaging channels (such as Facebook Messenger). The automated chatbot designer enhances the usability of Amazon Lex by automating conversational design, minimizing developer effort, and reducing the time it takes to design a chatbot.
The designer uses machine learning to analyze conversation transcripts and semantically cluster them around the most common intents and related information. Developers can then iterate on the design, add chatbot prompts and responses, integrate business logic to fulfill user requests, and build, test, and deploy the chatbot in Amazon Lex.
#12 Amazon SageMaker Studio Lab
Amazon SageMaker Studio Lab is a free service that enables anyone to learn and experiment with ML without needing an AWS account, credit card, or cloud configuration knowledge.
With the release of Amazon SageMaker Studio Lab, AWS stays true to its belief that technology has the power to solve the world’s most pressing issues. The company proudly supports the new and innovative ways that its customers are using these technologies to deliver social impact.
To get started with Studio Lab, just request a free Studio Lab account.
The Future Is Data-Driven
AWS believes that data-driven organizations are imperative for the future. AWS re:Invent 2021 was designed to support that idea with noteworthy service releases in databases, data lakes, machine learning, and analytics, all with the support of new methods for data governance.
Provectus has been attending re:Invent conferences since 2018. re:Invents of the past have been grand and, by comparison, the 2021 event may seem a bit subdued. But while there were not many explosive announcements this year, the services released by AWS in 2021 have closed multiple gaps in its product map. This year’s event strengthened AWS’s role as a top cloud provider and AI, ML, and data leader of the future.
The Provectus team will be at AWS re:Invent 2022 for sure. Hope to meet you there!