CTCMPAO Learning Hub MVP
Knowledge pipeline and Wagtail learning hub prototype for a regulated health college — crawl, extract, triage, publish.

The College of Traditional Chinese Medicine Practitioners and Acupuncturists of Ontario needed a learning hub to help its members navigate regulatory standards and professional development requirements. I built a working prototype to demonstrate how the college's existing web content could be automatically ingested, structured, and served through a modern learning platform.
The core of the project is a three-layer knowledge pipeline. The inventory layer crawls ctcmpao.on.ca and extracts content from web pages and PDFs using Beautiful Soup, lxml, and pdfplumber. The knowledge layer normalises, tags, and triages extracted material into canonical database records — Standards of Practice, discipline decisions, practice guidelines — with heading-based chunking for precise retrieval. The hub layer syncs canonical knowledge into Wagtail pages with full-text search and a practitioner-facing browsing experience.
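The heading-based chunking in the knowledge layer can be illustrated with a minimal sketch. This is not the project's actual code: the names Chunk, HeadingChunker, and chunk_by_heading are hypothetical, and Python's standard-library html.parser stands in for the Beautiful Soup and lxml stack so the example runs without dependencies. The idea is that every heading starts a new chunk, and the text that follows belongs to that chunk until the next heading.

```python
from dataclasses import dataclass
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


@dataclass
class Chunk:
    """One retrievable unit: a heading plus the text beneath it (hypothetical shape)."""
    heading: str
    text: str = ""


class HeadingChunker(HTMLParser):
    """Collects text into chunks, opening a new chunk at every heading tag."""

    def __init__(self):
        super().__init__()
        # The first chunk holds any text that appears before the first heading.
        self.chunks = [Chunk(heading="")]
        self._in_heading = False

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self.chunks.append(Chunk(heading=""))
            self._in_heading = True

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS:
            self._in_heading = False

    def handle_data(self, data):
        current = self.chunks[-1]
        if self._in_heading:
            current.heading += data
        else:
            current.text += data


def chunk_by_heading(html: str) -> list[Chunk]:
    """Split an HTML fragment into heading-delimited chunks, dropping empty ones."""
    parser = HeadingChunker()
    parser.feed(html)
    parser.close()
    result = []
    for chunk in parser.chunks:
        # Collapse runs of whitespace left over from the markup.
        heading = " ".join(chunk.heading.split())
        text = " ".join(chunk.text.split())
        if heading or text:
            result.append(Chunk(heading, text))
    return result


# Made-up sample markup, not taken from the college's site.
sample = (
    "<p>Intro text.</p>"
    "<h2>Record Keeping</h2><p>First section body.</p>"
    "<h3>Retention</h3><p>Second section body.</p>"
)
chunks = chunk_by_heading(sample)
```

Keeping the heading alongside its text means a search hit can point at a specific section of a standard rather than at the whole document. A fuller version would probably also track heading depth so that nested sections keep their parent context.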
The project is archived now that the RFP process has concluded, but it demonstrates how a content-heavy regulatory site can be transformed into a structured, searchable knowledge base with a single orchestration command.
Outcome
Competitive RFP prototype with a three-layer knowledge pipeline: automated crawling and PDF extraction of regulatory content, triage workflow for quality control, and a Wagtail learning hub surfacing Standards of Practice, discipline decisions, and professional development material.