Machine Learning Infrastructure for Personalized Video Recommendation
FireworkTV reduces infrastructure costs and improves performance of its video recommendation system
FireworkTV is the world’s first decentralized short video network. It connects creators, fans and engaged audiences by curating interactive 30-second videos that are tailored to each person’s unique lifestyle and tastes.
The ML team at FireworkTV recognized the limitations of their existing ML infrastructure (lagging productivity, growing overhead costs, lack of automation) and sought to build a new, more efficient infrastructure on AWS to improve the performance, quality and reliability of its recommendation model.
In close collaboration with FireworkTV’s ML team, Provectus reviewed the infrastructure and inference process to build a new ML infrastructure using Amazon SageMaker. The inference and training pipelines were migrated to Amazon SageMaker, to enable at-scale deployment.
FireworkTV’s ML team received a machine learning infrastructure on AWS, built in just four weeks. By using the new inference and training pipelines, FireworkTV reduced its infrastructure costs by 2x and sped up inference by 10x, increasing the ML team’s productivity and driving improvements in real-time video recommendations.
Reduced ML infrastructure costs: 2x
Faster and more efficient inference: 10x
Time to roll out the initial release: four weeks
Under the Hood of Real-Time Video Recommendation - Efficient ML Infrastructure
Video recommendation has become an essential way of helping people explore and discover video content that may be of interest to them. In existing video recommender systems, models make recommendations based on user-video interactions and individual content features that are captured, processed and analyzed in real time.
FireworkTV, a decentralized short video network, is developing a Netflix-like personalized video recommender system that delivers engaging and interactive videos, tailored to users’ unique lifestyles.
To engage users (i.e., boost app usage), video recommendations need to be highly accurate and tailored to match their activity and preferences in real time. The faster and more accurate the predictions are, the more engaged users become and the more ads are displayed in their feeds, which contributes to FireworkTV’s bottom line.
Real-time video recommendation, a challenge on its own, requires a smooth and robust machine learning infrastructure.
The machine learning team at FireworkTV recognized the limitations of their existing ML infrastructure and joined forces with Provectus to design and build a new ML infrastructure on AWS from scratch.
With the joint initiative, FireworkTV’s ML team was looking to:
- Accelerate inference and training pipelines
- Reduce ML infrastructure and overhead costs
- Increase automation (CI/CD, framework integrations)
- Improve recommendation model performance, quality and reliability
By swapping out their cumbersome homegrown tools for AWS AI products like Amazon SageMaker, the machine learning team also wanted to improve productivity, thus cutting overhead costs, and to increase user satisfaction.
Giving the Project a Further Boost Using AWS AI Stack
In a typical video recommender system, machine learning algorithms analyze the viewing history of users with similar tastes, to recommend what other users may be most interested in watching next. The idea is to keep them engaged so that they continue using the platform, ensuring less churn and more lifetime value.
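The idea of recommending from the viewing history of users with similar tastes can be illustrated with a minimal user-based collaborative filtering sketch. This is not FireworkTV's model: the user names, video IDs, watch scores and the `cosine` and `recommend` helpers are all hypothetical, chosen only to show the general technique.

```python
from math import sqrt

# Hypothetical viewing history: user -> {video_id: watch score in [0, 1]}.
HISTORY = {
    "ana": {"v1": 1.0, "v2": 0.8, "v3": 0.1},
    "ben": {"v1": 0.9, "v2": 0.7, "v4": 0.9},
    "cho": {"v3": 1.0, "v5": 0.8},
}


def cosine(a, b):
    """Cosine similarity between two sparse watch-score vectors."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[v] * b[v] for v in shared)
    norm_a = sqrt(sum(x * x for x in a.values()))
    norm_b = sqrt(sum(x * x for x in b.values()))
    return dot / (norm_a * norm_b)


def recommend(user, history, top_n=3):
    """Rank unseen videos by similarity-weighted scores from other users."""
    seen = history[user]
    scores = {}
    for other, views in history.items():
        if other == user:
            continue
        sim = cosine(seen, views)
        if sim <= 0:
            continue
        for video, score in views.items():
            if video not in seen:
                scores[video] = scores.get(video, 0.0) + sim * score
    # Highest score first; ties broken by video ID for a stable order.
    return sorted(scores, key=lambda v: (-scores[v], v))[:top_n]


print(recommend("ana", HISTORY))
```

In this toy data, "ana" overlaps most with "ben", so the video only "ben" has watched ranks first. A production system would typically use learned embeddings and real-time signals rather than raw similarity over a small dictionary.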
FireworkTV’s ML infrastructure and video recommendation model were doing the job, but the infrastructure (based on Lambda and PyTorch) was too expensive and cumbersome. This limited the project’s potential to scale and grow. They needed a more efficient system to serve and train ML models.
Provectus reviewed and measured the existing inference and training pipelines of the video recommendation model, and proposed building a new infrastructure on Amazon SageMaker and Amazon Elastic Inference. The inference, training and hosting of the model had to be migrated from Lambda to Amazon SageMaker Distributed Training. A move from EC2 instances to CPUs for ML model serving was proposed as well.
By moving away from Lambda, FireworkTV aimed to:
- Decrease admin overhead
- Unify ML tool stack for ease of use by engineers
- Move towards an automated pipeline
Following the migration, the performance and costs of the inferencing and training pipelines were compared, and both demonstrated a significant improvement.
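A before-and-after comparison of this kind comes down to simple ratios. The sketch below assumes two metrics per pipeline, monthly cost and median latency. The `PipelineMetrics` type, the `improvement` helper and every number are hypothetical placeholders picked to reproduce the reported 2x and 10x factors, not measurements from the project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PipelineMetrics:
    """Hypothetical summary of one pipeline configuration."""
    monthly_cost_usd: float
    p50_latency_ms: float


def improvement(before, after):
    """Return cost-reduction and speedup factors (values above 1 are better)."""
    return {
        "cost_reduction": before.monthly_cost_usd / after.monthly_cost_usd,
        "speedup": before.p50_latency_ms / after.p50_latency_ms,
    }


# Placeholder values only; the case study does not publish raw figures.
before = PipelineMetrics(monthly_cost_usd=10_000.0, p50_latency_ms=500.0)
after = PipelineMetrics(monthly_cost_usd=5_000.0, p50_latency_ms=50.0)

result = improvement(before, after)
print(f"cost reduction: {result['cost_reduction']:.1f}x, "
      f"speedup: {result['speedup']:.1f}x")
```

Expressing both results as "before divided by after" keeps the two factors directly comparable, which is how figures such as "2x cheaper" and "10x faster" are usually derived.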
Paving the Way for Growth with Cost-Efficient ML Infrastructure and Faster Video Recommendations
As a result of close collaboration with Provectus, FireworkTV received a new infrastructure for machine learning — a powerful and robust foundation for its video recommender system — in just four weeks.
The migration of inference and training pipelines to Amazon SageMaker made it possible to circumvent the limitations of Lambda, such as admin overhead and growing infrastructure costs. It also gave the ML team a more efficient ML process and better collaboration.
FireworkTV achieved a 2x reduction in ML infrastructure costs while accelerating its inference and training pipelines by 10x.
FireworkTV is now set to:
- Scale and grow its video recommender system, since cost and performance limitations are alleviated
- Make faster improvements to its video recommender system, since ML models can be built, trained and managed more efficiently
- Deliver personalized video recommendations in real time, due to reduced latency of the system
Provided that the month-over-month (MoM) improvements in the performance, quality and reliability of the video recommendation model are sustained, the machine learning infrastructure delivered by Provectus paves the way for FireworkTV’s future growth through faster and more accurate video recommendations that engage users, increase app usage and drive ad revenue.