Documentation

Drug Data Model

TwinFyRx is built on top of RxNorm and focuses on establishing a consistent, high-fidelity representation of drug concepts across heterogeneous healthcare datasets.

The system normalizes multiple identifiers and representations of medications into a small number of stable RxNorm concept layers while preserving the ability to attach enrichment and analytics at the appropriate level of the hierarchy.

Foundation

RxNorm

TwinFyRx uses RxNorm as the foundational ontology for drug concepts. RxNorm provides a standardized hierarchy linking:

  • Ingredients
  • Dose forms
  • Clinical drug concepts
  • Branded drug concepts
  • NDC package identifiers

Rather than introducing a proprietary drug ontology, TwinFyRx adopts RxNorm directly and structures downstream datasets around its concept hierarchy. This preserves interoperability with the broader healthcare ecosystem while enabling additional enrichment layers.

Core Concept

Canonical Drug Concept

The atomic unit of the TwinFyRx model is the clinical drug concept defined in RxNorm. It corresponds to two RxNorm term types:

  • SCD: Semantic Clinical Drug
  • SBD: Semantic Branded Drug

Within TwinFyRx, the identifier of this concept is represented as drug_id.

Clinical drug concepts represent a specific combination of ingredient, dose form, and strength.

Example                           Concept Type
atorvastatin 20 MG Oral Tablet    SCD
Lipitor 20 MG Oral Tablet         SBD

When sufficient information is available, TwinFyRx resolves drug records to this concept level. If a branded product is explicitly identified, records map to the corresponding SBD concept. Otherwise, records map to the SCD concept.
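
The brand-versus-generic rule above can be sketched in a few lines. This is only an illustration: the `DrugRecord` type, its field names, and the placeholder identifiers are assumptions, not part of the TwinFyRx schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DrugRecord:
    """Hypothetical parsed record; field names are illustrative only."""
    scd_id: Optional[str]  # generic clinical drug concept, if resolvable
    sbd_id: Optional[str]  # branded concept, set only when a brand is explicit


def resolve_drug_id(record: DrugRecord) -> Optional[str]:
    """Prefer the SBD when a brand is explicitly identified, else the SCD."""
    if record.sbd_id is not None:
        return record.sbd_id
    # May be None when the record lacks enough information to resolve.
    return record.scd_id


# Placeholder identifiers, not real RxNorm concept IDs.
generic = DrugRecord(scd_id="SCD-EXAMPLE", sbd_id=None)
branded = DrugRecord(scd_id="SCD-EXAMPLE", sbd_id="SBD-EXAMPLE")
unresolved = DrugRecord(scd_id=None, sbd_id=None)
```

Returning `None` for an unresolved record leaves room for a fallback to a coarser concept layer instead of forcing a guess at the clinical drug level.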

Concept Layer

Ingredient Concepts

TwinFyRx maintains ingredient-level concepts derived from RxNorm, corresponding to concept types:

  • IN: Single ingredient
  • MIN: Multi-ingredient combination

Ingredient identifiers are represented as ingredient_rxcui. Ingredient concepts anchor several enrichment layers including therapeutic classification and ingredient-level analytics.

Intermediate Layer

Clinical Drug Form

Certain analytic attributes depend on dose form but not strength. RxNorm defines this intermediate layer using:

  • SCDF: Semantic Clinical Drug Form
  • SBDF: Semantic Branded Drug Form

These term types represent an ingredient and dose form but exclude strength. TwinFyRx maintains these concepts using the identifier clinical_drug_form_rxcui.

Concept Hierarchy

ingredient_rxcui            IN / MIN       Active ingredient or combination
clinical_drug_form_rxcui    SCDF / SBDF    Ingredient + dose form (strength-agnostic)
drug_id                     SCD / SBD      Ingredient + dose form + strength

The clinical drug form layer provides a useful boundary for attaching attributes that should apply across all strength variants of the same medication.
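
As a rough sketch of how the three layers fit together, the example below stores one attribute at the drug form level and reads it back for two strength variants. The `ConceptPath` type, the lookup tables, and all identifiers are hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ConceptPath:
    """One clinical drug and its parents in the concept hierarchy."""
    ingredient_rxcui: str
    clinical_drug_form_rxcui: str
    drug_id: str


# Placeholder identifiers: two strengths of the same ingredient and dose form.
PATHS: Dict[str, ConceptPath] = {
    "DRUG-20MG": ConceptPath("ING-A", "FORM-A-ORAL-TABLET", "DRUG-20MG"),
    "DRUG-40MG": ConceptPath("ING-A", "FORM-A-ORAL-TABLET", "DRUG-40MG"),
}

# An attribute that depends on dose form but not strength is stored once.
FORM_ATTRIBUTES: Dict[str, Dict[str, str]] = {
    "FORM-A-ORAL-TABLET": {"route": "oral"},
}


def form_attributes_for(drug_id: str) -> Dict[str, str]:
    """Look up form-level attributes for a clinical drug via its parent form."""
    path = PATHS.get(drug_id)
    if path is None:
        return {}
    return FORM_ATTRIBUTES.get(path.clinical_drug_form_rxcui, {})
```

Both `form_attributes_for("DRUG-20MG")` and `form_attributes_for("DRUG-40MG")` return the same dictionary, so the attribute never has to be copied per strength.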

Crosswalks

Identifier Normalization

Healthcare datasets reference drugs using multiple identifier systems. TwinFyRx maintains crosswalk layers that resolve these identifiers into the RxNorm concept hierarchy.

Source Identifier      Resolves To
NDC                    drug_id
HCPCS                  drug_id
Ingredient text        ingredient_rxcui
Free-text drug name    drug_id / clinical_drug_form_rxcui / ingredient_rxcui

These mappings allow datasets originating from different domains to converge on a consistent concept model.

Identifier

NDC

TwinFyRx maintains normalized representations of NDC identifiers including:

  • NDC11 (11-digit)
  • NDC9 (9-digit)
  • Labeler code

NDC identifiers are associated with clinical drug concepts where appropriate, allowing package-level data to be aggregated at the clinical drug level.
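
One common way to derive these representations is sketched below. It assumes the usual convention that a hyphenated 10-digit NDC (4-4-2, 5-3-2, or 5-4-1) is zero-padded per segment into the 11-digit 5-4-2 layout, that NDC9 is the labeler and product segments without the package code, and that the labeler is reported in its zero-padded 5-digit form. The function name and return type are hypothetical, and the source does not state that TwinFyRx normalizes NDCs exactly this way.

```python
import re
from typing import NamedTuple


class NormalizedNdc(NamedTuple):
    ndc11: str    # 11 digits, 5-4-2 layout without hyphens
    ndc9: str     # labeler + product segments (first 9 digits)
    labeler: str  # zero-padded 5-digit labeler segment


# Segment widths of the 11-digit layout: labeler, product, package.
_WIDTHS = (5, 4, 2)


def normalize_ndc(raw: str) -> NormalizedNdc:
    """Normalize a hyphenated NDC or a bare 11-digit NDC."""
    text = raw.strip()
    if re.fullmatch(r"[0-9]{11}", text):
        ndc11 = text
    else:
        # A bare 10-digit NDC is ambiguous, so hyphens are required here.
        if not re.fullmatch(r"[0-9]+-[0-9]+-[0-9]+", text):
            raise ValueError(f"unrecognized NDC format: {raw!r}")
        parts = text.split("-")
        if any(len(p) > w for p, w in zip(parts, _WIDTHS)):
            raise ValueError(f"NDC segment too long: {raw!r}")
        # Left-pad each segment with zeros to its 5-4-2 width.
        ndc11 = "".join(p.zfill(w) for p, w in zip(parts, _WIDTHS))
    return NormalizedNdc(ndc11=ndc11, ndc9=ndc11[:9], labeler=ndc11[:5])


# Placeholder digit pattern, not a real product code.
example = normalize_ndc("1234-5678-90")
```

Grouping package-level rows by `ndc9` collapses different package sizes of the same product before they are associated with a clinical drug concept.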

Identifier

HCPCS

Medical-benefit drugs frequently appear in datasets as HCPCS codes. These codes represent drug products that correspond to clinical drug concepts.

Normalization leverages the CMS HCPCS-to-NDC reference mappings before resolving to clinical drug concepts. This intermediate step improves mapping precision when package-level identifiers are available.
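
The two-step resolution can be sketched as a pair of lookups. The dictionaries below stand in for the CMS reference mapping and the NDC crosswalk; the code, the table names, and every value in them are placeholders invented for illustration, not real HCPCS codes, NDCs, or concept identifiers.

```python
from typing import Dict, List, Optional, Set

# Hypothetical stand-in for the HCPCS-to-NDC reference mapping.
HCPCS_TO_NDC11: Dict[str, List[str]] = {
    "JXXXX": ["00000000001", "00000000002"],
}

# Hypothetical stand-in for the NDC-to-clinical-drug crosswalk.
NDC11_TO_DRUG_ID: Dict[str, str] = {
    "00000000001": "DRUG-A-10MG",
    "00000000002": "DRUG-A-20MG",
}


def resolve_hcpcs(hcpcs: str, observed_ndc11: Optional[str] = None) -> Set[str]:
    """Resolve an HCPCS code to candidate drug_id values via its NDCs.

    When the source record also carries an NDC that the reference mapping
    lists for this code, only that package is used, which narrows the result.
    """
    candidates = HCPCS_TO_NDC11.get(hcpcs, [])
    if observed_ndc11 is not None and observed_ndc11 in candidates:
        candidates = [observed_ndc11]
    return {NDC11_TO_DRUG_ID[n] for n in candidates if n in NDC11_TO_DRUG_ID}
```

Without a package-level identifier the placeholder code resolves to two candidate concepts; supplying `observed_ndc11="00000000002"` narrows the result to one.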

Classification

Therapeutic Classification

TwinFyRx maintains mappings between ingredient concepts and therapeutic classification systems such as ATC. This allows therapeutic grouping without duplicating classifications across multiple strength variants of the same medication.
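
A minimal sketch of ingredient-level grouping follows. It relies on the structure of ATC codes, whose five levels correspond to prefixes of length 1, 3, 4, 5, and 7. The mapping table and the ingredient key are hypothetical placeholders; C10AA05 is the ATC code for atorvastatin.

```python
from typing import Dict, List

# Prefix lengths of ATC levels 1 through 5.
ATC_LEVEL_LENGTHS = (1, 3, 4, 5, 7)

# Hypothetical ingredient-to-ATC table; the key is a placeholder identifier.
INGREDIENT_TO_ATC: Dict[str, List[str]] = {
    "ING-ATORVASTATIN": ["C10AA05"],
}


def atc_ancestors(code: str) -> List[str]:
    """Return the code's prefix at each ATC level, from broadest to narrowest."""
    return [code[:n] for n in ATC_LEVEL_LENGTHS if len(code) >= n]


def therapeutic_groups(ingredient_rxcui: str, level: int) -> List[str]:
    """Return the ATC groups of an ingredient at the given level (1 to 5)."""
    length = ATC_LEVEL_LENGTHS[level - 1]
    codes = INGREDIENT_TO_ATC.get(ingredient_rxcui, [])
    return sorted({c[:length] for c in codes if len(c) >= length})
```

Because the table is keyed by ingredient, every strength and dose form that rolls up to the same `ingredient_rxcui` shares one classification entry.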

Resolution

Textual Drug References

Some datasets contain drug references only as free text. TwinFyRx resolves these references using a layered normalization process that combines deterministic parsing, curated synonym mappings, and reconciliation against the RxNorm concept hierarchy.

Depending on the available information, normalization may resolve to:

  • drug_id: Clinical drug concept (full specificity)
  • clinical_drug_form_rxcui: Drug form (strength-agnostic)
  • ingredient_rxcui: Ingredient only
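
The fallback from most to least specific concept can be sketched as follows. This is a toy version under stated assumptions: the regular expression, the dose form list, the synonym table, and the placeholder identifiers are all invented for illustration and are far simpler than a production normalizer would need.

```python
import re
from typing import Dict, Optional, Tuple

# Hypothetical lookup tables keyed by normalized text; IDs are placeholders.
SYNONYMS: Dict[str, str] = {"atorvastatin calcium": "atorvastatin"}
INGREDIENTS = {"atorvastatin": "ING-ATORVASTATIN"}
FORMS = {("atorvastatin", "oral tablet"): "FORM-ATORVASTATIN-ORAL-TABLET"}
DRUGS = {("atorvastatin", "oral tablet", "20 mg"): "DRUG-ATORVASTATIN-20MG"}

_STRENGTH = re.compile(r"(\d+(?:\.\d+)?)\s*(mg|mcg|ml)\b")
_DOSE_FORMS = ("oral tablet", "oral capsule")


def parse(text: str) -> Tuple[str, Optional[str], Optional[str]]:
    """Deterministically split free text into (name, dose form, strength)."""
    s = " ".join(text.lower().split())
    strength = None
    match = _STRENGTH.search(s)
    if match:
        strength = f"{match.group(1)} {match.group(2)}"
        s = s[:match.start()] + s[match.end():]
    form = next((f for f in _DOSE_FORMS if f in s), None)
    if form:
        s = s.replace(form, "")
    name = " ".join(s.split())
    return SYNONYMS.get(name, name), form, strength


def resolve(text: str) -> Optional[Tuple[str, str]]:
    """Return (identifier field, value) at the most specific level available."""
    name, form, strength = parse(text)
    if (name, form, strength) in DRUGS:
        return ("drug_id", DRUGS[(name, form, strength)])
    if (name, form) in FORMS:
        return ("clinical_drug_form_rxcui", FORMS[(name, form)])
    if name in INGREDIENTS:
        return ("ingredient_rxcui", INGREDIENTS[name])
    return None
```

Returning the field name alongside the value keeps the resolution level explicit, so downstream code can tell a full clinical drug match from an ingredient-only match.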

Principles

Model Philosophy

RxNorm first

RxNorm provides the canonical drug ontology. TwinFyRx builds on top of it rather than redefining drug concepts.

Concept-appropriate enrichment

Attributes are attached at the appropriate layer of the hierarchy (ingredient, drug form, or clinical drug) depending on the scope of the attribute.

Identifier normalization

Drug identifiers from heterogeneous datasets are resolved into the RxNorm concept hierarchy wherever possible.

Stable atomic units

Clinical drug concepts serve as the primary atomic unit for analytics while allowing aggregation at higher levels of the hierarchy.

TwinFyRx focuses on maintaining a consistent drug concept model across heterogeneous healthcare datasets. By anchoring identifiers to the RxNorm hierarchy and attaching enrichment at the appropriate level of abstraction, the system provides a stable foundation for downstream analytics, pricing intelligence, and utilization analysis.