Shevs Connect
Mon – Fri: 08:00am – 05:00pm
Magen Plaza, Juja – Along Thika Road, next to Juja Flyover
info@shevsconnectinstitute.com
+254 780 253160

Clinical NLP & Medical Coding

Transforming unstructured clinical text into structured intelligence

Industry Challenge

An estimated 80% of healthcare data exists as unstructured text: clinical notes, discharge summaries, referral letters, pathology reports, and radiology findings. Clinical NLP models extract structured information from this text to power clinical decision support, population analytics, and automated medical coding. These models must be trained on expert-annotated clinical text, a task that requires genuine clinical understanding rather than generic text labeling.

How SCILabel Serves This Industry

Data Collection

We source de-identified clinical text datasets from hospital electronic health record systems, clinical research databases, and academic medical centres under data sharing agreements. Data types include discharge summaries, progress notes, operative reports, radiology reports, pathology narratives, and referral letters. All text undergoes HIPAA Safe Harbor de-identification before annotation.

Data Annotation & Labeling

Our Track 1 (Medical NLP) workforce — doctors, nurses, pharmacists, and health informatics specialists — annotates clinical text with named entities (diseases, symptoms, medications, procedures, anatomy), relations (medication–indication, disease–treatment), assertions (present/absent/uncertain/historical), and normalised codes (ICD-10-CM, SNOMED CT, LOINC, RxNorm). We support annotation in BRAT, Label Studio, Prodigy, and custom platforms.

Data & Model Evaluation

NLP evaluators benchmark model precision, recall, and F1 on entity recognition, relation extraction, and ICD-10 coding accuracy against expert-coded gold standards. We test for performance variation across clinical specialty and documentation style.
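
As an illustration of the entity-level metrics mentioned above, the following is a minimal Python sketch of exact-match, micro-averaged precision, recall, and F1 against a gold standard. The `Entity` type, the `prf1` function, the document IDs, and the sample spans are all hypothetical and are not SCILabel's actual evaluation tooling. Exact span-and-label matching is assumed; relaxed or overlap-based matching would give different scores.

```python
from typing import NamedTuple


class Entity(NamedTuple):
    """One annotated entity (hypothetical structure for this sketch)."""
    doc_id: str
    start: int   # character offset, inclusive
    end: int     # character offset, exclusive
    label: str   # entity type, e.g. "MEDICATION"


def prf1(gold: set, predicted: set) -> dict:
    """Exact-match, micro-averaged precision, recall, and F1."""
    # A prediction counts as a true positive only if document, span,
    # and label all match a gold entity exactly.
    tp = len(gold & predicted)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Made-up gold and model output for two notes.
gold = {
    Entity("note-1", 0, 9, "MEDICATION"),
    Entity("note-1", 14, 26, "DISEASE"),
    Entity("note-2", 5, 13, "PROCEDURE"),
    Entity("note-2", 20, 28, "SYMPTOM"),
}
predicted = {
    Entity("note-1", 0, 9, "MEDICATION"),   # correct
    Entity("note-1", 14, 26, "SYMPTOM"),    # right span, wrong label
    Entity("note-2", 5, 13, "PROCEDURE"),   # correct
}

scores = prf1(gold, predicted)
print(scores)
```

To check for variation across clinical specialty or documentation style, the same function could be run separately on the gold and predicted entities of each subgroup and the resulting scores compared.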

Annotation Types & Formats

  • Named entity recognition: diseases, symptoms, medications, procedures, anatomy
  • Relation extraction: medication–dose–route–frequency, symptom–diagnosis
  • Assertion classification: present, absent, possible, historical, family history
  • ICD-10-CM, SNOMED CT, LOINC, and RxNorm code assignment
  • Sentence-level and document-level clinical classification
  • De-identification review and PHI validation
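
To make the annotation types above more concrete, here is a small Python sketch of how token-level entity spans with assertion classes might be represented and exported as BIO tags in a CoNLL-style two-column layout. The `Span` class, its field names, the label names, and the sample sentence are assumptions made for illustration, not SCILabel's actual schema. The normalised code field is left unset rather than guessed.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Span:
    """One entity annotation over tokens (hypothetical schema)."""
    start: int                  # token index, inclusive
    end: int                    # token index, exclusive
    label: str                  # entity type
    assertion: str              # e.g. "present", "absent", "possible"
    code: Optional[str] = None  # normalised code; left unset in this sketch


def to_bio(tokens: List[str], spans: List[Span]) -> List[str]:
    """Convert non-overlapping spans to one BIO tag per token."""
    tags = ["O"] * len(tokens)
    for span in spans:
        tags[span.start] = f"B-{span.label}"
        for i in range(span.start + 1, span.end):
            tags[i] = f"I-{span.label}"
    return tags


tokens = ["Patient", "denies", "chest", "pain", ";",
          "started", "metformin", "."]
spans = [
    Span(2, 4, "SYMPTOM", "absent"),      # "chest pain", negated
    Span(6, 7, "MEDICATION", "present"),  # "metformin"
]

tags = to_bio(tokens, spans)
# One token and its tag per line, tab-separated.
conll = "\n".join(f"{tok}\t{tag}" for tok, tag in zip(tokens, tags))
print(conll)
```

BIO tags carry only the entity type, so assertion classes and codes would need an extra column or a parallel JSON record. This is one reason a corpus may be delivered in both formats.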

Specialist Workforce Tracks

Track 1 (Medical NLP): Medical Doctors, Nurses, Pharmacists, Health Informatics Specialists, Clinical Coders.

Example Deliverable

Client Output Example
A 50,000-sentence annotated clinical notes corpus with NER labels across 12 entity types, ICD-10-CM and SNOMED CT codes, and assertion classes, delivered in CoNLL and JSON formats with an inter-annotator agreement (IAA) kappa of 0.89 and QA certification.
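
For readers unfamiliar with agreement statistics, the sketch below shows one common way a kappa value can be computed. It assumes Cohen's kappa between two annotators labelling the same items; the page does not state which kappa variant or how many annotators were used. The function name and the sample assertion labels are invented for illustration and have no connection to the 0.89 figure above.

```python
from collections import Counter
from typing import List


def cohen_kappa(labels_a: List[str], labels_b: List[str]) -> float:
    """Cohen's kappa for two annotators over the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's own label distribution.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label
    return (observed - expected) / (1 - expected)


# Made-up assertion labels from two annotators on ten mentions.
annotator_a = ["present", "present", "absent", "absent", "possible",
               "present", "absent", "present", "historical", "present"]
annotator_b = ["present", "present", "absent", "present", "possible",
               "present", "absent", "present", "historical", "absent"]

kappa = cohen_kappa(annotator_a, annotator_b)
print(round(kappa, 4))  # 0.6875
```

Here the annotators agree on 8 of 10 items (0.80 observed) while chance agreement is 0.36, so kappa is (0.80 - 0.36) / (1 - 0.36) = 0.6875.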