Drug Data Model
TwinFyRx is built on top of RxNorm and focuses on establishing a consistent, high-fidelity representation of drug concepts across heterogeneous healthcare datasets.
The system normalizes multiple identifiers and representations of medications into a small number of stable RxNorm concept layers while preserving the ability to attach enrichment and analytics at the appropriate level of the hierarchy.
RxNorm
TwinFyRx uses RxNorm as the foundational ontology for drug concepts. RxNorm provides a standardized hierarchy linking:
- Ingredients
- Dose forms
- Clinical drug concepts
- Branded drug concepts
- NDC package identifiers
Rather than introducing a proprietary drug ontology, TwinFyRx adopts RxNorm directly and structures downstream datasets around its concept hierarchy. This preserves interoperability with the broader healthcare ecosystem while enabling additional enrichment layers.
Canonical Drug Concept
The atomic unit of the TwinFyRx model is the clinical drug concept defined in RxNorm, which corresponds to two RxNorm concept types:
- Semantic Clinical Drug (SCD)
- Semantic Branded Drug (SBD)
Within TwinFyRx, this identifier is represented as drug_id.
Clinical drug concepts represent a specific combination of ingredient, dose form, and strength.
| Example | Concept Type |
|---|---|
| atorvastatin 20 MG Oral Tablet | SCD |
| Lipitor 20 MG Oral Tablet | SBD |
When sufficient information is available, TwinFyRx resolves drug records to this concept level. If a branded product is explicitly identified, records map to the corresponding SBD concept. Otherwise, records map to the SCD concept.
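The resolution rule above can be sketched in a few lines. This is a minimal illustration, not the TwinFyRx implementation: the `DrugRecord` class, its field names, and the placeholder identifiers are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DrugRecord:
    # Hypothetical input record; field names are illustrative only.
    scd_rxcui: Optional[str]  # generic clinical drug concept, if resolved
    sbd_rxcui: Optional[str]  # branded concept, set only when a brand is explicit


def resolve_drug_id(record: DrugRecord) -> Optional[str]:
    """Prefer the SBD when a brand is explicitly identified, else use the SCD."""
    if record.sbd_rxcui is not None:
        return record.sbd_rxcui
    return record.scd_rxcui


# Placeholder identifiers, not real RXCUIs.
branded = DrugRecord(scd_rxcui="scd-example", sbd_rxcui="sbd-example")
generic = DrugRecord(scd_rxcui="scd-example", sbd_rxcui=None)
print(resolve_drug_id(branded), resolve_drug_id(generic))
```

Returning `None` when neither concept is available leaves room for a fallback to a coarser level of the hierarchy.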
Ingredient Concepts
TwinFyRx maintains ingredient-level concepts derived from RxNorm, corresponding to two concept types:
- Single ingredient (IN)
- Multi-ingredient combination (MIN)
Ingredient identifiers are represented as ingredient_rxcui. Ingredient concepts anchor several enrichment layers including therapeutic classification and ingredient-level analytics.
Clinical Drug Form
Certain analytic attributes depend on dose form but not strength. RxNorm defines this intermediate layer using two concept types:
- Semantic Clinical Drug Form (SCDF)
- Semantic Branded Drug Form (SBDF)
These concepts represent ingredient and dose form but exclude strength. TwinFyRx stores them under the identifier clinical_drug_form_rxcui.
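Concept Hierarchy
| Level | Identifier | Definition |
|---|---|---|
| Ingredient | ingredient_rxcui | Active ingredient or combination |
| Clinical drug form | clinical_drug_form_rxcui | Ingredient + dose form (strength-agnostic) |
| Clinical drug | drug_id | Ingredient + dose form + strength |
The clinical drug form layer provides a useful boundary for attaching attributes that should apply across strength variants of the same medication.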
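One way to picture the three levels is as a record that carries all three identifiers, so that drug-level data can be rolled up to a coarser level. The sketch below is an assumption about how such a structure might look: the `DrugConcept` class, the `rollup` helper, and the identifier values are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class DrugConcept:
    drug_id: str                   # ingredient + dose form + strength
    clinical_drug_form_rxcui: str  # ingredient + dose form
    ingredient_rxcui: str          # active ingredient or combination


def rollup(drug_ids: Iterable[str], concepts: List[DrugConcept], level: str) -> Dict[str, int]:
    """Count records per concept at the requested level of the hierarchy."""
    by_id = {c.drug_id: c for c in concepts}
    totals: Counter = Counter()
    for drug_id in drug_ids:
        totals[getattr(by_id[drug_id], level)] += 1
    return dict(totals)


# Two strengths of the same medication share a form and an ingredient.
concepts = [
    DrugConcept("drug-a10", "form-a", "ing-a"),
    DrugConcept("drug-a20", "form-a", "ing-a"),
]
records = ["drug-a10", "drug-a20", "drug-a20"]
print(rollup(records, concepts, "clinical_drug_form_rxcui"))
```

Because both strengths point at the same form concept, an attribute attached to `form-a` would apply to every strength variant without duplication.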
Identifier Normalization
Healthcare datasets reference drugs using multiple identifier systems. TwinFyRx maintains crosswalk layers that resolve these identifiers into the RxNorm concept hierarchy.
| Source Identifier | Resolves To |
|---|---|
| NDC | drug_id |
| HCPCS | drug_id |
| Ingredient text | ingredient_rxcui |
| Free-text drug name | drug_id / clinical_drug_form_rxcui / ingredient_rxcui |
These mappings allow datasets originating from different domains to converge on a consistent concept model.
NDC
TwinFyRx maintains normalized representations of NDC identifiers including:
- NDC11 (11-digit)
- NDC9 (9-digit)
- Labeler code
NDC identifiers are associated with clinical drug concepts where appropriate, allowing package-level data to be aggregated at the clinical drug level.
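As an illustration of the normalized forms listed above, the sketch below derives NDC11, NDC9, and the labeler code from a hyphenated 10-digit NDC. It assumes the input arrives in one of the standard labeler-product-package layouts (4-4-2, 5-3-2, or 5-4-1) and pads each segment to the 5-4-2 layout; the function names are hypothetical, and the sample values are made up.

```python
def to_ndc11(hyphenated_ndc: str) -> str:
    """Zero-pad a hyphenated NDC (4-4-2, 5-3-2, or 5-4-1) to 11 digits (5-4-2)."""
    parts = hyphenated_ndc.strip().split("-")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"expected labeler-product-package, got {hyphenated_ndc!r}")
    labeler, product, package = parts
    if len(labeler) > 5 or len(product) > 4 or len(package) > 2:
        raise ValueError(f"segment too long in {hyphenated_ndc!r}")
    return labeler.zfill(5) + product.zfill(4) + package.zfill(2)


def ndc9(ndc11: str) -> str:
    """Labeler + product segments, dropping the 2-digit package code."""
    return ndc11[:9]


def labeler_code(ndc11: str) -> str:
    """The 5-digit labeler segment of an 11-digit NDC."""
    return ndc11[:5]


ndc11 = to_ndc11("1234-5678-90")  # made-up 4-4-2 value
print(ndc11, ndc9(ndc11), labeler_code(ndc11))
```

Unhyphenated 10-digit NDCs are ambiguous about which segment needs padding, so this sketch rejects them rather than guessing.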
HCPCS
Medical-benefit drugs frequently appear in datasets as HCPCS codes. These codes represent drug products that correspond to clinical drug concepts.
Normalization leverages the CMS HCPCS-to-NDC reference mappings before resolving to clinical drug concepts. This intermediate step improves mapping precision when package-level identifiers are available.
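The two-step resolution described above might look like the following sketch. The dictionaries stand in for the CMS crosswalk and the NDC crosswalk; their shapes, the function name, and all codes shown are hypothetical placeholders.

```python
from typing import Dict, List, Set


def resolve_hcpcs(
    hcpcs_code: str,
    hcpcs_to_ndc: Dict[str, List[str]],
    ndc_to_drug_id: Dict[str, str],
) -> Set[str]:
    """Resolve a HCPCS code to candidate drug_ids via its associated NDCs."""
    ndcs = hcpcs_to_ndc.get(hcpcs_code, [])
    # An NDC with no known clinical drug concept is skipped, not guessed.
    return {ndc_to_drug_id[ndc] for ndc in ndcs if ndc in ndc_to_drug_id}


# Placeholder crosswalk data for illustration only.
hcpcs_to_ndc = {"hcpcs-example": ["01234567890", "01234567891", "09999999999"]}
ndc_to_drug_id = {"01234567890": "drug-a20", "01234567891": "drug-a20"}
print(resolve_hcpcs("hcpcs-example", hcpcs_to_ndc, ndc_to_drug_id))
```

Returning a set makes it explicit that one HCPCS code can fan out to several clinical drug concepts, and that an empty result means no mapping was found.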
Therapeutic Classification
TwinFyRx maintains mappings between ingredient concepts and therapeutic classification systems such as ATC. This allows therapeutic grouping without duplicating classifications across multiple strength variants of the same medication.
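To show why anchoring classification at the ingredient level avoids duplication, the sketch below stores a single ATC entry per ingredient and reaches it from any strength variant. The mapping shapes, the function name, and the concept identifiers are hypothetical; C10AA05 is the ATC code for atorvastatin.

```python
from typing import Dict, List

# One classification entry per ingredient, regardless of how many strengths exist.
atc_by_ingredient: Dict[str, List[str]] = {"ing-atorvastatin": ["C10AA05"]}

# Every strength variant points at the same ingredient concept.
ingredient_by_drug: Dict[str, str] = {
    "drug-atorvastatin-10": "ing-atorvastatin",
    "drug-atorvastatin-20": "ing-atorvastatin",
}


def atc_codes_for_drug(drug_id: str) -> List[str]:
    """Look up ATC codes for a clinical drug through its ingredient concept."""
    ingredient = ingredient_by_drug.get(drug_id)
    if ingredient is None:
        return []
    return atc_by_ingredient.get(ingredient, [])


print(atc_codes_for_drug("drug-atorvastatin-10"))
print(atc_codes_for_drug("drug-atorvastatin-20"))
```

A list is used for the codes because an ingredient can carry more than one ATC classification.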
Textual Drug References
Some datasets contain drug references only as free text. TwinFyRx resolves these references using a layered normalization process that combines deterministic parsing, curated synonym mappings, and reconciliation against the RxNorm concept hierarchy.
Depending on the available information, normalization may resolve to:
| Resolved Identifier | Level |
|---|---|
| drug_id | Clinical drug concept (full specificity) |
| clinical_drug_form_rxcui | Drug form (strength-agnostic) |
| ingredient_rxcui | Ingredient only |
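The fallback from full specificity to ingredient only can be sketched as a chain of lookups. This is a simplified illustration under stated assumptions, not the actual TwinFyRx pipeline: the index dictionaries, the synonym table, the strength pattern, and the function name are all hypothetical.

```python
import re
from typing import Dict, Optional, Tuple

# Simplified strength pattern; real strength parsing is considerably broader.
STRENGTH = re.compile(r"\b\d+(?:\.\d+)?\s*(?:mg|mcg|g|ml)\b")


def resolve_text(
    text: str,
    synonyms: Dict[str, str],
    drug_index: Dict[str, str],
    form_index: Dict[str, str],
    ingredient_index: Dict[str, str],
) -> Optional[Tuple[str, str]]:
    """Return (identifier name, concept id) at the most specific level found."""
    name = " ".join(text.lower().split())  # deterministic cleanup
    name = synonyms.get(name, name)        # curated synonym mapping
    if name in drug_index:
        return ("drug_id", drug_index[name])
    without_strength = " ".join(STRENGTH.sub(" ", name).split())
    if without_strength in form_index:
        return ("clinical_drug_form_rxcui", form_index[without_strength])
    for token in name.split():
        if token in ingredient_index:
            return ("ingredient_rxcui", ingredient_index[token])
    return None


drug_index = {"atorvastatin 20 mg oral tablet": "drug-a20"}
form_index = {"atorvastatin oral tablet": "form-a"}
ingredient_index = {"atorvastatin": "ing-a"}
print(resolve_text("Atorvastatin 80 MG Oral Tablet", {}, drug_index, form_index, ingredient_index))
```

In this example the 80 MG strength is absent from the drug index, so the reference resolves one level up, to the strength-agnostic drug form.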
Model Philosophy
RxNorm first
RxNorm provides the canonical drug ontology. TwinFyRx builds on top of it rather than redefining drug concepts.
Concept-appropriate enrichment
Attributes are attached at the appropriate layer of the hierarchy (ingredient, drug form, or clinical drug) depending on the scope of the attribute.
Identifier normalization
Drug identifiers from heterogeneous datasets are resolved into the RxNorm concept hierarchy wherever possible.
Stable atomic units
Clinical drug concepts serve as the primary atomic unit for analytics while allowing aggregation at higher levels of the hierarchy.
TwinFyRx focuses on maintaining a consistent drug concept model across heterogeneous healthcare datasets. By anchoring identifiers to the RxNorm hierarchy and attaching enrichment at the appropriate level of abstraction, the system provides a stable foundation for downstream analytics, pricing intelligence, and utilization analysis.