Second Genome enhances its proprietary research platform to run petabyte-scale drug trials and biomarker research in a secure, compliant environment.
Client profile: A biotechnology company using microbiome data to advance precision therapies and drug discovery
Industry: Genetics & Biotech
Region: North America
Petabyte-scale drug research enabled
Compliant environment for pharma partners
Second Genome is a biotechnology company that uses microbiome data and machine learning to develop precision therapies. The company’s proprietary platform translates clinical and molecular data into actionable insights for drug discovery programs. It draws on more than 400 structured databases of public and proprietary microbiome datasets. Second Genome collaborates with pharmaceutical companies, government agencies, and academic institutions.
01 The Challenge
The microbiome therapeutics market is projected to grow from $250 million to over $1 billion by 2030, a compound annual growth rate (CAGR) of 33.6%. Pharma companies are investing heavily in microbiome-based drug discovery. The race to translate molecular data into precision therapies depends on platform speed, data scale, and partner trust.
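As a rough check on these market figures, the compound annual growth rate can be related to the start value, end value, and number of years. The source does not state the base year, so the five-year horizon below is an assumption, and the function names are illustrative only.

```python
def cagr(start_value: float, end_value: float, years: float) -> float:
    """Compound annual growth rate as a fraction (0.336 means 33.6%)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


def project(start_value: float, rate: float, years: float) -> float:
    """Value reached after compounding `rate` annually for `years` years."""
    return start_value * (1.0 + rate) ** years


# Assumption: a five-year horizon; the case study does not give the base year.
ASSUMED_YEARS = 5

# Figures in millions of USD, taken from the text above.
projected_musd = project(250.0, 0.336, ASSUMED_YEARS)

# Round trip: recover the growth rate from the projected end value.
implied_rate = cagr(250.0, projected_musd, ASSUMED_YEARS)
```

Under that assumed horizon, $250 million compounding at 33.6% per year reaches roughly $1.06 billion, which is consistent with "over $1 billion". A different base year would give a different end value.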
Second Genome’s leadership saw an opportunity to strengthen its position in this market. The company’s growth depended on attracting and retaining premium pharmaceutical partners. Those partners expected the highest standards of data security and compliance before entrusting research to an external platform.
The existing platform had been effective for internal research. As Second Genome moved toward larger, industrial-scale collaborations, data ingestion and processing needed to keep pace. Improving how data flowed through the platform would directly accelerate the speed and scale of biomarker research.
Operating in the highly regulated pharmaceutical industry meant that raising the bar on security and compliance was not optional. Second Genome needed an environment where partners could conduct drug research with full confidence. Data handling had to meet the strictest standards across industry, government, and academia.
Second Genome partnered with Provectus, an AI-first systems integrator and solutions provider. The goal was to redesign the data infrastructure and build ML capabilities for biomarker research.
02 The Approach
Provectus began by reviewing the data ingestion and staging components of Second Genome’s platform. The team examined data quality, error monitoring, logging, configuration handling, and API testing on sample data. This gave a clear picture of where improvements would have the most impact.
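To make the kind of review described here concrete, the sketch below shows one way a data-quality check with error logging might look when run over sample records. It is not Second Genome's or Provectus's code: the field names (`sample_id`, `source`, `read_count`), the rules, and the function names are all hypothetical, chosen only for illustration.

```python
import logging
from dataclasses import dataclass, field
from typing import Optional

logger = logging.getLogger("ingestion_review")

# Hypothetical required fields; the real schema is not described in the source.
REQUIRED_FIELDS = ("sample_id", "source", "read_count")


@dataclass
class ValidationReport:
    accepted: list = field(default_factory=list)
    rejected: list = field(default_factory=list)  # (record, reason) pairs


def validate_record(record: dict) -> Optional[str]:
    """Return a rejection reason, or None if the record passes every check."""
    for name in REQUIRED_FIELDS:
        if record.get(name) in (None, ""):
            return f"missing field: {name}"
    count = record["read_count"]
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(count, bool) or not isinstance(count, int) or count < 0:
        return "read_count must be a non-negative integer"
    return None


def review_batch(records) -> ValidationReport:
    """Validate each record, logging a warning for every rejection."""
    report = ValidationReport()
    for record in records:
        reason = validate_record(record)
        if reason is None:
            report.accepted.append(record)
        else:
            logger.warning("rejected %r: %s", record.get("sample_id"), reason)
            report.rejected.append((record, reason))
    return report
```

Keeping the rejection reason alongside each rejected record means the same report can feed both error monitoring and a later fix-up step, rather than dropping bad rows silently.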
From there, Provectus designed a new secure, cloud-native data infrastructure on AWS. The team applied DevOps best practices to automate patch management, centralized logging, and disaster recovery. CI/CD pipelines streamlined how features and production deployments were managed.
On the security side, data in transit and at rest was encrypted. Access logs were securely stored. All open access points were eliminated. Multi-factor authentication was enforced for most users. Penetration tests confirmed the infrastructure met AWS security requirements for sensitive R&D data.
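The controls listed above lend themselves to an automated check. The following is a minimal sketch, assuming each resource's settings have already been collected into a simple record; the `ResourcePosture` type, its fields, and the `audit` function are hypothetical and do not correspond to any AWS API. Multi-factor authentication is a per-user control and is left out of this resource-level sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourcePosture:
    """Hypothetical summary of one resource's security settings."""
    name: str
    encrypted_at_rest: bool
    tls_required: bool        # encryption in transit
    access_logging: bool
    publicly_accessible: bool


def audit(resources) -> dict:
    """Map each non-compliant resource name to its list of failed controls."""
    findings = {}
    for res in resources:
        failed = []
        if not res.encrypted_at_rest:
            failed.append("encryption at rest")
        if not res.tls_required:
            failed.append("encryption in transit")
        if not res.access_logging:
            failed.append("access logging")
        if res.publicly_accessible:
            failed.append("open access point")
        if failed:
            findings[res.name] = failed
    return findings


# Illustrative inventory; the resource names are made up.
inventory = [
    ResourcePosture("raw-datasets", True, True, True, False),
    ResourcePosture("staging-area", True, False, True, True),
]
findings = audit(inventory)
```

In this example only `staging-area` is reported, with "encryption in transit" and "open access point" as its failed controls. A check of this shape could run in a CI/CD pipeline so that a regression in any control blocks deployment, though the source does not say whether that was done.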
03 The Build
The build delivered two capabilities in parallel.
The data infrastructure layer runs on AWS with fully automated pipelines for data ingestion, processing, and storage. CI/CD ensures that microbiome datasets are collected, stored, and processed in a compliant manner. Patch management, logging, and disaster recovery all run automatically.
The ML infrastructure enables Second Genome to contextualize its biomarker programs to specific therapeutic applications using AI. Data pipelines were tested and prepared for new capabilities as the research evolves. The platform supports petabyte-scale projects for biomarker research, drug trials, and drug development.
04 The Results
The new infrastructure transformed Second Genome’s ability to operate at scale. The platform now supports petabyte-scale projects with fully automated pipelines and CI/CD in place.
Scale: Petabyte-scale drug research enabled in a secure, compliant environment
The infrastructure meets the strict security and operational standards that premium pharmaceutical companies require. Second Genome can now onboard partners who need assurance that their research data meets the highest protection standards. Partners run drug discovery at industrial scale without concerns about data security.
With a faster, more secure, and fully automated platform in place, Second Genome can take on larger programs. The company accelerates its biomarker and drug discovery work across collaborations with industry, government, and academia.
05 What’s Next
The data and ML infrastructure Provectus built gives Second Genome the foundation to scale its partnerships and programs. Provectus works with Second Genome on extending platform capabilities as the company pursues new collaborations.