Forge builds a GenAI document processing solution that reads Certificates of Incorporation, extracts the data, and hands managers a reviewable spreadsheet, so corporate docs stop blocking document-heavy workflows.
Client profile
A fintech company providing liquidity in private markets
Industry
Financial Services
Region
North America, EMEA
Of documents now reviewed by humans on first run
Forge Global has provided liquidity in private markets since 2014 through a platform for trading shares in pre-IPO companies globally. Private-market transactions generate a lot of unstructured corporate paperwork. Forge’s managers process every page of it.
01 The Challenge
Certificates of Incorporation and related corporate documents arrive in large volumes, in variable formats, and in jurisdiction-specific jargon. Legal and financial terminology is specialized. Structure shifts by company, jurisdiction, and time period. None of it fits a clean schema.
The work could not be offshored, because security and compliance required Forge’s own managers to handle the data in-house. That meant a dedicated team spending its hours on manual data processing rather than on the higher-value judgement and decision-making parts of the workflow.
02 The Approach
Provectus started the engagement with a discovery session, then broke the work into four focused phases: a data lake, ML pipelines, deep-learning data extraction, and LLMs for text understanding.
The goal was clear: 95% automation, 5% human review at production quality. The accuracy target for the ML model was 70% or higher before operationalization. Further improvements would be driven by evidence gathered during model development, not by scope creep mid-build.
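The two targets above reduce to simple arithmetic. The following is a minimal sketch of how such gates could be checked, not the Forge or Provectus implementation: the class, the function names, and the choice of field-level accuracy as the metric are all assumptions made for illustration. Only the 70% and 5% figures come from the engagement goals.

```python
from dataclasses import dataclass

# Targets stated in the engagement goals.
MIN_MODEL_ACCURACY = 0.70  # accuracy required before operationalization
MAX_REVIEW_SHARE = 0.05    # production goal: at most 5% human review


@dataclass
class EvalResult:
    """Hypothetical evaluation summary: fields compared against validated markup."""
    correct: int  # extracted fields that match the validated markup
    total: int    # extracted fields that were evaluated


def accuracy(result: EvalResult) -> float:
    """Share of evaluated fields that were extracted correctly."""
    return result.correct / result.total if result.total else 0.0


def ready_to_operationalize(result: EvalResult) -> bool:
    """True when the model meets the pre-production accuracy target."""
    return accuracy(result) >= MIN_MODEL_ACCURACY


def review_share(documents_reviewed: int, documents_processed: int) -> float:
    """Share of processed documents that needed human review."""
    if documents_processed == 0:
        return 0.0
    return documents_reviewed / documents_processed


def meets_automation_goal(documents_reviewed: int, documents_processed: int) -> bool:
    """True when human review stays within the production target."""
    return review_share(documents_reviewed, documents_processed) <= MAX_REVIEW_SHARE
```

Keeping the two checks separate mirrors the way the goals are stated: the accuracy gate applies to the model before it ships, while the review share is measured on production traffic.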
03 The Build
Infrastructure runs on AWS. The pipeline ingests PDF and Excel files, runs passage classification and field extraction in a single integrated flow, and uses a selected LLM to extract specific values from unstructured text.
The outputs land as user-friendly spreadsheets that managers review. Every manager correction feeds back into filtering and model calibration. Infrastructure improvements, observability, and security were treated as in-scope from the start.
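The review loop described above can be pictured as two steps: write the extracted fields to a spreadsheet-style file, then compare the manager-reviewed version against the model output to collect corrections. The sketch below is illustrative only. The field names, the `document_id` key, and the CSV format are assumptions, and how corrections are actually fed into filtering and calibration is not specified in this case study.

```python
import csv
import io

# Hypothetical field names; the real extraction schema is not published.
FIELDS = ["company_name", "jurisdiction", "incorporation_date"]


def to_review_csv(rows: list[dict]) -> str:
    """Render extracted rows as CSV text that a reviewer can open as a spreadsheet."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["document_id", *FIELDS], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def collect_corrections(extracted: list[dict], reviewed: list[dict]) -> list[dict]:
    """List every field a reviewer changed, matching rows by document_id."""
    reviewed_by_id = {row["document_id"]: row for row in reviewed}
    corrections = []
    for row in extracted:
        fixed = reviewed_by_id.get(row["document_id"])
        if fixed is None:
            continue  # document not reviewed yet
        for field in FIELDS:
            if row[field] != fixed[field]:
                corrections.append(
                    {
                        "document_id": row["document_id"],
                        "field": field,
                        "predicted": row[field],
                        "corrected": fixed[field],
                    }
                )
    return corrections
```

Each correction record pairs the model's prediction with the reviewer's value, which is the kind of labeled signal that downstream filtering or calibration could consume.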
The initial ML model was trained on a Forge-supplied dataset of PDFs and spreadsheets, with a Provectus subject-matter expert validating markup accuracy before production runs.
04 The Results
Forge’s managers now upload corporate documents, review an automatically generated spreadsheet, and approve. The volume the team handles scales with infrastructure, not with headcount. Manager time that used to go into manual document work goes into higher-value review and decision work.
4 months
From idea to an in-production document processing solution
The engagement also built a robust foundation (AWS infrastructure, observability, CI/CD, and security components) that Forge can reuse when the next document type is added to the workflow.
05 What’s Next
The solution’s pipeline is architected for reuse: add a new document type, retrain against it, adjust the reviewer UI. Provectus continues the engagement on the extension path as Forge’s private-market operations grow.