Technical Reference
Data Model
A normalized, RxNorm-centered schema with 9 modular layers — from drug hierarchy and clinical intelligence through economics, formulary access, PBM attribution, and pharmacy network. Every prod table is MotherDuck-hosted and API-accessible.
Schema Overview
Core entity relationships across the drug hierarchy, enrichment layers, pricing, and network data. Click any table in the detailed view below to inspect columns.
Tables by Layer
RxNorm Foundation
FoundationRxNorm drug hierarchy enriched with engineered therapy behavior classification. Therapy attributes are derived at the clinical drug form level and propagated to all strength and branded variants. Primary drug table for most analytics — it combines drug hierarchy with therapy context in a single row.
RxNorm ingredient concepts (TTY = IN or MIN). Anchor for ATC classification and ingredient-level economics.
ICD-10-CM diagnosis code reference with full classification hierarchy from chapter down to the most specific subcategory. Covers all codes including non-billable parent categories. Filter is_billable = true to restrict to codes that can appear on medical or pharmacy claims.
WHO Anatomical Therapeutic Chemical (ATC) classification system. All five levels of the hierarchy are denormalized onto each row. Level 5 is the individual drug substance level; levels 1–4 represent progressively broader therapeutic groupings.
Crosswalk between ATC level-5 drug classification codes and RxNorm ingredient identifiers. One ingredient may map to multiple ATC codes (e.g., a drug with multiple approved therapeutic uses) and one ATC code may map to multiple ingredient RxCUIs (e.g., salt variants of the same molecule).
Multi-source NDC-to-drug_id resolution. Combines RxNav ndc2rxcui, FDA NDC directory, and supplemental manual overrides. source_rank encodes provenance confidence.
Drug Intelligence
IntelligenceEngineered ICD-10 diagnosis code volume estimates for drug products. Each row represents the estimated share of prescriptions for a given drug written for a specific diagnosis. Covers drugs classified as maintenance_continuous, maintenance_intermittent, or prn_chronic by the therapy behavior pipeline.
Diagnosis Suspecting
SuspectingExecutable clinical logic for diagnosis inference from pharmacy claims. Supports prescriber specialty, patient age/gender, concurrent Rx (ATC class-driven), medical claims history, and compound multi-signal rules.
NADAC Pricing
NADACWeekly CMS NADAC rates per NDC. National average drug acquisition cost for retail pharmacy — what pharmacies actually pay. Sample contains the most recent price per NDC.
ASP Pricing
ASPQuarterly Medicare Part B ASP+6% payment limits per HCPCS code. Used for medical-channel drug cost benchmarking (physician-administered biologics, injectables).
CMS crosswalk mapping NDC-level package data to HCPCS codes used in Medicare Part B ASP pricing. Supports unit conversion between pharmacy packages and billing units.
HCPCS J-code to RxNorm product crosswalk for Medicare Part B drug billing codes. Resolves each HCPCS code to its corresponding RxNorm SCD/SBD concept, enabling joins between ASP payment limits and the drug intelligence pipeline.
Package Normalization
PackagingDeterministic package size and billing unit normalization for every NDC11 with FDA package data. Parses free-text package descriptions into structured total_quantity and billing_unit fields.
Drug Economics
EconomicsNDC-level cost economics. Combines NADAC acquisition cost and ASP Part B payment limits with package normalization to yield price per billing unit and price per package.
NDC-level therapy cost mart adding WHO DDD-derived daily, 30-day, and 90-day cost estimates to mart_ndc_economics. Only applicable where a meaningful daily dose can be derived.
Drug-level cost aggregates. Min/median/max cost statistics across all NDCs; daily, 30-day, and 90-day therapy cost where dose is derivable.
Ingredient-level cost rollup across all clinical drug forms for the ingredient. Useful for class benchmarking where brand/generic distinctions are unimportant.
ATC 4th-level therapeutic class benchmarks. 30-day cost min/median/max across drugs in the class with applicable therapy economics.
Part D Formulary Access
FormularyDrug-level Medicare Part D formulary restriction prevalence. Captures prior auth, step therapy, quantity limit, and tier placement across all covering plans for a given year.
Composite payer access barrier score (0–1) per drug and year. Weighted combination of PA prevalence (35%), step therapy (25%), quantity limits (15%), tier position (10%), and non-coverage rate (15%).
Medicare Plan Intelligence
MedicareMedicare Part D and MA-PD plan dimension. One row per plan year × CMS contract × plan benefit package, with sponsor, marketing name, plan type, and market segment.
PBM attribution derived from observed Medicare plan artifacts (BIN/PCN routing, plan documents). Captures the primary PBM managing pharmacy benefits for a plan/year with supporting evidence.
Historical PBM attribution per Medicare contract/PBP across all CMS plan years. Built from Medicare plan artifact inference (primary) and BIN/PCN backfill (secondary). 96.9% attribution coverage.
Pharmacy claim routing identifiers (RxBIN, RxPCN, RxGroup, M3P overrides) per Medicare plan/year. Use to route claims data to the correct plan and PBM.
Pharmacy Directory
PharmacyUS pharmacy roster filtered from NPPES organizations with pharmacy taxonomy and CMS Part D network presence. Includes chain attribution, PSAO affiliation, type classification, and geocoded location.
Common Join Paths
Pre-validated join paths for common analytical workflows. Each pattern is a tested integration available through the API or directly via MotherDuck.
Navigate the RxNorm concept tree from ingredient through drug form to specific product and NDC
Resolve drug-to-condition relationships with utilization-weighted prescription volume estimates
From drug concept to executable inference rules with structured JSON rule logic
Weekly acquisition cost per NDC (NADAC) or quarterly Medicare Part B limits (ASP) via crosswalk
Therapy cost benchmarks from NDC level through drug, ingredient, and ATC class rollups
Part D formulary restriction prevalence and composite access friction score by drug and year
Identify the PBM managing a pharmacy benefit from BIN/PCN routing or Medicare plan inference
Enumerated Values
Standardized enum values used across enrichment layers. All values are validated at ingestion time.
Column Naming
| Concept | Column Name | RxNorm TTY | Example |
|---|---|---|---|
| Drug (ingredient + form + strength) | drug_id | SCD / SBD | 617314 |
| Ingredient | ingredient_rxcui | IN / MIN | 321988 |
| Clinical Drug Form (no strength) | clinical_drug_form_rxcui | SCDF / SBDF | 370621 |
| NDC 11-digit | ndc11 | — | 00071101301 |
| NDC 9-digit (product) | ndc9 | — | 000711013 |
| NDC labeler | labeler_code | — | 00071 |
| HCPCS Code | hcpcs_code | — | J7502 |
| ATC Level 5 (substance) | atc_code_5 | — | C10AA05 |
| ATC Level 4 (pharmacology class) | atc4_code | — | C10AA |
| National Provider Identifier | npi | — | 1234567890 |