Elevating Customer Experiences with Personalized Video Recommendations Powered by AI

Firework (formerly FireworkTV) rebuilt its ML infrastructure on Amazon SageMaker in four weeks. Model training and serving now run 10X faster at half the infrastructure cost, with measurable engagement lift.


Client profile

A video commerce and short video platform

Industry

Media & Technology

Region

North America

10X

Faster model training and serving

2X

Reduction in ML infrastructure costs


Firework (formerly FireworkTV) started as a short video network that matched creators and audiences through personalized 30-second videos. The company has since evolved into a video commerce platform. It powers shoppable video, live commerce, and AI-driven content for retailers and brands. The ML infrastructure Provectus built during the FireworkTV era became the foundation for Firework’s recommendation engine.

01 The Challenge

A growing platform ready for ML infrastructure that could match its ambitions

Video recommendation is a speed game. The faster a platform matches a user to relevant content, the longer they stay. For short video, each clip is 30 seconds or less. A slow or irrelevant recommendation means the user swipes away. Response time and pick quality translate directly into session length and revenue.

FireworkTV’s ML team had built a personalized recommender system. It analyzed user activity and preferences in real time to surface engaging content. The initial infrastructure, running on general-purpose compute with homegrown tooling, had carried the platform through its early growth. As video volume and user traffic scaled, the team saw an opportunity to upgrade.

The ML team had four goals: accelerate model training and serving, reduce the cost of running the infrastructure, introduce CI/CD automation and better framework integrations, and create a unified environment for experimentation. They wanted engineers spending time on recommendation quality instead of platform maintenance. All of this had to happen without disrupting the live platform.

02 The Approach

Benchmark the current setup, migrate to SageMaker, then prove the gains

Provectus, an AI-first systems integrator, started by benchmarking FireworkTV’s existing training and serving pipelines. The goal was to measure baseline performance across training speed, response time, and infrastructure cost. Every improvement would be quantified against a known starting point.
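A baseline measurement of this kind can be sketched in a few lines of Python. This is a minimal illustration, not the harness Provectus used: `benchmark_latency` and `fake_recommender` are hypothetical names, and the stand-in recommender simply sleeps for a millisecond in place of a real endpoint call.

```python
import math
import statistics
import time
from typing import Callable, Dict, List


def benchmark_latency(
    serve: Callable[[str], List[str]], user_ids: List[str]
) -> Dict[str, float]:
    """Time each call to `serve` and summarize latency in milliseconds."""
    samples_ms = []
    for user_id in user_ids:
        start = time.perf_counter()
        serve(user_id)
        samples_ms.append((time.perf_counter() - start) * 1000.0)
    samples_ms.sort()
    # Nearest-rank 95th percentile over the sorted samples.
    p95_index = max(0, math.ceil(0.95 * len(samples_ms)) - 1)
    return {
        "mean_ms": statistics.fmean(samples_ms),
        "p95_ms": samples_ms[p95_index],
        "max_ms": samples_ms[-1],
    }


def fake_recommender(user_id: str) -> List[str]:
    """Hypothetical stand-in for a real recommendation endpoint."""
    time.sleep(0.001)  # simulate roughly 1 ms of serving work
    return [f"video-{user_id}-{i}" for i in range(3)]


baseline = benchmark_latency(fake_recommender, [f"user-{n}" for n in range(20)])
print(baseline)
```

Running the same function against the old and new serving paths, with the same set of requests, gives two summaries that can be compared directly.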

The team proposed migrating core ML workloads to Amazon SageMaker. SageMaker would replace general-purpose compute and custom tooling with a managed environment designed for ML. Training, hosting, and serving would run on a single platform with built-in distributed training and cost-optimized compute.

The migration was structured to minimize disruption. Provectus rebuilt the pipelines in parallel with the existing system. They validated performance against the benchmarks and switched over once the new infrastructure proved faster and cheaper. The entire engagement took four weeks from kickoff to initial release.
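The switch-over rule described above, cut over only once the rebuilt pipeline is both faster and cheaper than the baseline, can be expressed as a simple gate. The sketch below is an assumption about how such a check might look: `PipelineMetrics` and `ready_to_switch` are hypothetical names, and the numbers are illustrative placeholders, not Firework's measurements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PipelineMetrics:
    """Benchmark summary for one pipeline (fields are illustrative)."""
    training_minutes: float
    p95_latency_ms: float
    monthly_cost_usd: float


def ready_to_switch(baseline: PipelineMetrics, candidate: PipelineMetrics) -> bool:
    """Return True only if the candidate beats the baseline on every metric."""
    return (
        candidate.training_minutes < baseline.training_minutes
        and candidate.p95_latency_ms < baseline.p95_latency_ms
        and candidate.monthly_cost_usd < baseline.monthly_cost_usd
    )


# Placeholder values for illustration only.
old_pipeline = PipelineMetrics(training_minutes=600.0, p95_latency_ms=200.0, monthly_cost_usd=10_000.0)
new_pipeline = PipelineMetrics(training_minutes=60.0, p95_latency_ms=20.0, monthly_cost_usd=5_000.0)
slower_pipeline = PipelineMetrics(training_minutes=60.0, p95_latency_ms=250.0, monthly_cost_usd=5_000.0)

print(ready_to_switch(old_pipeline, new_pipeline))     # True
print(ready_to_switch(old_pipeline, slower_pipeline))  # False: latency regressed
```

Requiring every metric to improve keeps the existing system in place until the replacement has no regressions to explain.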

03 The Build

Unified ML platform with distributed training, automated pipelines, and cost-optimized serving

Provectus consolidated FireworkTV’s ML tool stack into a single platform on Amazon SageMaker. The fragmented setup of general-purpose instances, manual scripts, and disconnected monitoring was replaced with a managed environment.

The build included:

  • Distributed training pipelines that parallelize model training across multiple instances
  • Cost-optimized serving using compute resources sized for ML inference
  • CI/CD automation for model deployment, removing manual steps
  • A unified experimentation environment for iterating on recommendation models

New model versions move from training to production through an automated pipeline, with no manual deployment steps.

04 The Results

From infrastructure bottleneck to 10X faster recommendations in four weeks

The new ML infrastructure went live four weeks after kickoff. Provectus benchmarked the migrated system against the original across both performance and cost.

10X

Faster model training and serving

2X

Reduction in ML infrastructure costs

Faster training means the team iterates on recommendation models more frequently. Faster serving means recommendations arrive sooner, so users see relevant content before they swipe away. For a platform where engagement is measured in seconds of watch time, that speed matters.

Infrastructure costs dropped by half. The move from general-purpose compute to ML-optimized resources eliminated the overhead of custom tooling and over-provisioned instances. The savings gave the ML team more budget for recommendation quality.
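For readers who think in percentages, an "N times" improvement converts to a percent reduction as 100 × (N − 1) / N. The short sketch below works through the two headline figures; `percent_reduction` is a hypothetical helper name introduced only for this illustration.

```python
def percent_reduction(factor: float) -> float:
    """Percent reduction implied by an N-times improvement, for N > 0."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    return 100.0 * (factor - 1.0) / factor


print(percent_reduction(2))   # 50.0: a 2X cost reduction halves the bill
print(percent_reduction(10))  # 90.0: 10X faster means 90% less time per run
```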

The engineering team’s day-to-day shifted from managing infrastructure to improving models. Manual deployment was replaced by automated pipelines. Monitoring became a managed service. The team could focus entirely on work that drives engagement.

05 What’s Next

ML infrastructure built for the platform Firework became

The recommendation infrastructure Provectus built for FireworkTV carried forward as the company evolved into Firework. The same ML foundation that powered personalized short video now supports shoppable video, live commerce, and AI-driven content. Recommendation quality directly drives conversion rates and revenue for Firework’s enterprise clients.
