Automating Document Processing in HCLS with AI
PSC Biotech automates processing of FDA forms with AI & IDP, to accelerate document processing operations, decrease costs, and increase throughput
PSC Biotech is a global life sciences consultancy that provides companies in the HCLS sector with essential services, to ensure that their healthcare products are developed, manufactured, and distributed to meet the highest standards, in compliance with all applicable regulatory requirements.
Founded in 1996, PSC Biotech has served as a strategic partner to emerging and established life science companies in 52 countries. PSC has helped over a thousand clients world-wide to stay up to date on current regulations and technologies in the HCLS sector.
PSC Biotech was looking to enhance its document processing operations by automating its existing pipeline with AI. Specifically, the company wanted to process FDA Form 483 observations faster, more accurately, and on a larger scale. It also wanted to reduce the impact of human factors in manual document processing, in order to optimize costs (including error-induced costs) and mitigate risks of infractions made by mappers and reviewers. PSC chose Provectus as a reliable provider of AI/ML services, with expertise in the development of Intelligent Document Processing (IDP) solutions.
The Provectus solution for automated document processing with AI/ML was built using industry best practices. The work included setting up development and management environments; conducting EDA and text classification experiments; using labeled datasets for a baseline model; and implementing secure and reproducible ML infrastructure and CI/CD pipelines. The IDP solution for observation classification was delivered using AWS data lakes and AI/ML services, pre-trained deep learning and NLP algorithms, and frameworks such as PyTorch, TensorFlow, and NLTK.
Provectus developed an automated solution for AI/ML-powered processing and classification of documents. Not only did we design and build a highly accurate ML model for observation classification, but we also set up an entire ecosystem, to ensure its performance and cost-efficiency. We delivered a secure and reproducible, end-to-end ML infrastructure with CI/CD pipelines and all integrations for user-friendly management of documents in PSC Biotech’s existing pipeline. Now, PSC can handle FDA Form 483 observations much faster, more accurately and cost-effectively, and at scale.
Automated AI/ML solution for observation classification delivered
ML model with precision and recall of at least 70% on new FDA Form 483 observations
Secure, reproducible, end-to-end ML Infrastructure for intelligent document processing
Manual Review of FDA Form 483 Observations Is Too Slow, Inefficient, and Error-Prone
For over two decades, PSC Biotech has been helping companies in the Health Care & Life Sciences sector to provide the highest level of service while complying with all applicable regulatory requirements. PSC offers an extremely diverse line of services, and most of them involve a certain amount of document work: collecting, processing, analyzing, and managing the various documents that stream through its client network.
PSC is an established business whose document processing operations have been set up for years.
However, most of the processes in the pipeline were manual, meaning that:
- More time and resources were needed to process documents
- Costs of processing were high and kept rising over time
- Risks of errors due to human factors were ever-present
- The throughput rate depended entirely on the number of employees
- The accuracy of document processing remained low and fluctuated significantly over time
Given the sensitive nature of the HCLS business operations, any process that is slow, inefficient, and, most importantly, prone to errors poses a huge risk — the health and well-being of millions of people, as well as a company’s bottom line, could be impacted by a single mistake made by a document reviewer.
Consider FDA Form 483 observations. An FDA Form 483 observation is a notice issued by the FDA to highlight potential regulatory violations (e.g., in processes, controls, products, or employee practices) found during a routine inspection. The cost of such an observation can be extremely high, but failure to introduce the required changes on time is even more costly. The resulting fines can cost an HCLS business millions or even billions of dollars.
PSC Biotech handles thousands of FDA Form 483 observations per year. Its document mappers and reviewers had been doing their best, but automation of the document processing pipeline was long overdue. This could be achieved by adopting AI and implementing an ML-powered Intelligent Document Processing (IDP) solution to support document processing operations.
By augmenting the processing of documents with AI/ML/IDP, the leaders of PSC expected to decrease time spent on manual review of observations, decrease costs of form processing, mitigate risks of infractions made by mappers and reviewers, increase throughput rate, and increase the accuracy of document processing.
Provectus, an Artificial Intelligence (AI) consultancy, has developed and delivered a comprehensive suite of AI solutions, including the IDP Solution. Given our expertise in the provision of AI/ML services, Provectus was chosen to design and build an automated observation classification solution for PSC Biotech.
Building an Observation Classification Solution: From a Baseline Model to ML Infrastructure
Provectus approached the project via a series of engagements that involved data, model work (training, deployment to production, retraining), infrastructure and pipelines, logging and monitoring components, and a UI for document management.
During phase one, the Provectus team properly explored and prepared the data, set up all required environments (development, management, experimentation), established a baseline for text classification, and developed the observation classification model. A secure and reproducible, end-to-end machine learning infrastructure for experimentation and model training was also delivered as a part of the phase one engagement.
Provectus took advantage of various tools and services, including:
- Amazon S3 and AWS Glue, which were used as a data lake to store texts, document observations, and model predictions
- Amazon Comprehend that helped to develop a baseline for text classification
- The suite of Amazon SageMaker services that were used as a foundation for the model training pipeline
- Deep Learning and Natural Language Processing (NLP) algorithms that enabled us to extract and classify observations more accurately
- Frameworks: PyTorch, TensorFlow, and NLTK
Observation classification was delivered as a multi-label classification model that could categorize FDA Form 483 observations into 216 classes, with precision and recall of 70% or greater. The observations are automatically labeled and can be grouped into different categories. Users can search for observations by selecting any of the model-generated labels.
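To make the precision and recall figures concrete, here is a minimal sketch of how these two metrics can be computed for a multi-label classifier. It assumes micro-averaging (counting label hits and misses across all observations), which is only one of several possible averaging schemes and is not necessarily the one used in the project. The function name and the example labels are hypothetical.

```python
from typing import Iterable, Set, Tuple


def micro_precision_recall(
    true_labels: Iterable[Set[str]],
    predicted_labels: Iterable[Set[str]],
) -> Tuple[float, float]:
    """Micro-averaged precision and recall for multi-label predictions.

    Each observation carries a set of labels. Hits and misses are counted
    over all (observation, label) pairs. Micro-averaging is an assumption
    made for this sketch.
    """
    tp = fp = fn = 0
    for truth, predicted in zip(true_labels, predicted_labels):
        tp += len(truth & predicted)   # labels correctly assigned
        fp += len(predicted - truth)   # labels assigned but wrong
        fn += len(truth - predicted)   # labels that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical label names, for illustration only.
truth = [{"records", "training"}, {"cleaning"}, {"records"}]
predicted = [{"records"}, {"cleaning", "training"}, {"records"}]

precision, recall = micro_precision_recall(truth, predicted)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy example, three of the four predicted labels are correct and three of the four true labels are found, so both metrics come out at 0.75.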
The phase one engagement of Provectus and PSC Biotech was successful, and both companies agreed to continue with phases two and three.
During phase two, the Provectus team developed the ML API service (based on OpenAPI specification) and enhanced the ML release cycle. The delivery of pipelines, including those for CI/CD, logging and monitoring, and model retraining, was prioritized. The ML infrastructure for the production environment was set up, building on the already existing foundation from phase one.
During phase three, our goals were to:
- Develop a user-friendly UI for document processing, to enable the employees of PSC Biotech to map and review forms more quickly
- Deliver a service for smooth integration of the machine learning component into the existing document processing pipeline
- Develop a CI/CD pipeline that could retrain the existing model and push the retrained model into production when it achieved higher precision and recall than the model already in production
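The retraining goal above implies a promotion check that compares a retrained model with the one currently in production. The following is a minimal sketch of such a check, not the actual pipeline logic. The `Metrics` class, the `should_promote` function, and the use of 0.70 as a hard floor are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metrics:
    """Evaluation results for one model version (hypothetical structure)."""
    precision: float
    recall: float


def should_promote(candidate: Metrics, production: Metrics, floor: float = 0.70) -> bool:
    """Return True if the retrained model should replace the production model.

    Assumed rule: the candidate must beat production on both metrics and
    stay at or above the floor on both.
    """
    meets_floor = candidate.precision >= floor and candidate.recall >= floor
    beats_production = (
        candidate.precision > production.precision
        and candidate.recall > production.recall
    )
    return meets_floor and beats_production


# Illustrative numbers only, not project results.
production = Metrics(precision=0.72, recall=0.71)
better = Metrics(precision=0.75, recall=0.74)
mixed = Metrics(precision=0.80, recall=0.70)

print(should_promote(better, production))  # improves on both metrics
print(should_promote(mixed, production))   # recall regressed, so not promoted
```

Requiring an improvement on both metrics keeps a retrained model from trading recall for precision (or the reverse) without a human decision.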
All phases were carried out step by step, in close collaboration between the Provectus and PSC Biotech teams.
Automating Document Processing with AI Leads to Fewer Errors, Faster Processing, and Cost Savings
PSC Biotech is a trusted provider of services in Health Care & Life Sciences. It is critical for PSC to be able to process FDA Form 483 observations as quickly and as accurately as possible, so that its clients can introduce the necessary changes to their products, services, and operations on time.
Augmenting the document processing pipeline with AI and automation is considered by PSC Biotech to be a key step towards meeting their goals of providing the highest standard of HCLS services. Provectus helped PSC on their path to excellence by contributing our AI/ML/IDP expertise.
The Provectus team designed and built an automated, AI/ML-powered observation classification solution from scratch. We ensured that the model returns precision and recall of no less than 70%, helping PSC to automate a significant portion of the labeling, mapping, and review of observation forms. We also delivered an entire ecosystem around the model, with ML infrastructure, CI/CD pipelines, monitoring and UI components, and more.
The new observation classification solution has enabled PSC Biotech to dramatically decrease the time spent on manual review of observations, optimize processing costs, and increase accuracy and throughput of document processing while mitigating risks of infractions made by mappers and reviewers.
PSC Biotech is now ready to pursue other opportunities in the AI/ML space, and Provectus will remain its trusted advisor. We look forward to future collaboration with PSC, to further empower their delivery of high-quality consulting services in the HCLS sector.