Technical Reference

Data Model

A normalized, RxNorm-centered schema with 9 modular layers — from drug hierarchy and clinical intelligence through economics, formulary access, PBM attribution, and pharmacy network. Every prod table is MotherDuck-hosted and API-accessible.

9 layers20+ tables50K+ drugs68K pharmacies96.9% PBM coverage
Entity Relationships

Schema Overview

Core entity relationships across the drug hierarchy, enrichment layers, pricing, and network data. Click any table in the detailed view below to inspect columns.

ingredientdrug_profilendc_crosswalkref.icd10drug_icd10_estimatenadac_pricingmart_ndc_economicssuspecting_ruledrug_formulary_signalpbm_market_snapshotpharmacy_directory
Foundation / Intelligence
Suspecting Rules
NDC Crosswalk
Pricing / Economics
Formulary & Access
PBM & Plan
Pharmacy Network
Detailed Schema

Tables by Layer

RxNorm Foundation

Foundation
drug_profile
25 cols

RxNorm drug hierarchy enriched with engineered therapy behavior classification. Therapy attributes are derived at the clinical drug form level and propagated to all strength and branded variants. Primary drug table for most analytics — it combines drug hierarchy with therapy context in a single row.

ingredient
2 cols

RxNorm ingredient concepts (TTY = IN or MIN). Anchor for ATC classification and ingredient-level economics.

icd10
19 cols

ICD-10-CM diagnosis code reference with full classification hierarchy from chapter down to the most specific subcategory. Covers all codes including non-billable parent categories. Filter is_billable = true to restrict to codes that can appear on medical or pharmacy claims.

atc
10 cols

WHO Anatomical Therapeutic Chemical (ATC) classification system. All five levels of the hierarchy are denormalized onto each row. Level 5 is the individual drug substance level; levels 1–4 represent progressively broader therapeutic groupings.

atc_rxnorm
2 cols

Crosswalk between ATC level-5 drug classification codes and RxNorm ingredient identifiers. One ingredient may map to multiple ATC codes (e.g., a drug with multiple approved therapeutic uses) and one ATC code may map to multiple ingredient RxCUIs (e.g., salt variants of the same molecule).

ndc_crosswalk
5 cols

Multi-source NDC-to-drug_id resolution. Combines RxNav ndc2rxcui, FDA NDC directory, and supplemental manual overrides. source_rank encodes provenance confidence.

Drug Intelligence

Intelligence
drug_icd10_estimate
9 cols

Engineered ICD-10 diagnosis code volume estimates for drug products. Each row represents the estimated share of prescriptions for a given drug written for a specific diagnosis. Covers drugs classified as maintenance_continuous, maintenance_intermittent, or prn_chronic by the therapy behavior pipeline.

Diagnosis Suspecting

Suspecting
suspecting_rule
12 cols

Executable clinical logic for diagnosis inference from pharmacy claims. Supports prescriber specialty, patient age/gender, concurrent Rx (ATC class-driven), medical claims history, and compound multi-signal rules.

NADAC Pricing

NADAC
nadac_pricing
12 cols

Weekly CMS NADAC rates per NDC. National average drug acquisition cost for retail pharmacy — what pharmacies actually pay. Sample contains the most recent price per NDC.

ASP Pricing

ASP
asp_payment_limits
14 cols

Quarterly Medicare Part B ASP+6% payment limits per HCPCS code. Used for medical-channel drug cost benchmarking (physician-administered biologics, injectables).

asp_ndc_hcpcs_crosswalk
13 cols

CMS crosswalk mapping NDC-level package data to HCPCS codes used in Medicare Part B ASP pricing. Supports unit conversion between pharmacy packages and billing units.

asp_hcpcs_rxcui_map
6 cols

HCPCS J-code to RxNorm product crosswalk for Medicare Part B drug billing codes. Resolves each HCPCS code to its corresponding RxNorm SCD/SBD concept, enabling joins between ASP payment limits and the drug intelligence pipeline.

Package Normalization

Packaging
normalized_package_size
10 cols

Deterministic package size and billing unit normalization for every NDC11 with FDA package data. Parses free-text package descriptions into structured total_quantity and billing_unit fields.

Drug Economics

Economics
mart_ndc_economics
13 cols

NDC-level cost economics. Combines NADAC acquisition cost and ASP Part B payment limits with package normalization to yield price per billing unit and price per package.

mart_ndc_therapy_economics
19 cols

NDC-level therapy cost mart adding WHO DDD-derived daily, 30-day, and 90-day cost estimates to mart_ndc_economics. Only applicable where a meaningful daily dose can be derived.

mart_clinical_drug_economics
24 cols

Drug-level cost aggregates. Min/median/max cost statistics across all NDCs; daily, 30-day, and 90-day therapy cost where dose is derivable.

mart_ingredient_economics
14 cols

Ingredient-level cost rollup across all clinical drug forms for the ingredient. Useful for class benchmarking where brand/generic distinctions are unimportant.

mart_atc_class_economics
10 cols

ATC 4th-level therapeutic class benchmarks. 30-day cost min/median/max across drugs in the class with applicable therapy economics.

Part D Formulary Access

Formulary
drug_formulary_signal
12 cols

Drug-level Medicare Part D formulary restriction prevalence. Captures prior auth, step therapy, quantity limit, and tier placement across all covering plans for a given year.

drug_access_friction
5 cols

Composite payer access barrier score (0–1) per drug and year. Weighted combination of PA prevalence (35%), step therapy (25%), quantity limits (15%), tier position (10%), and non-coverage rate (15%).

Medicare Plan Intelligence

Medicare
partd_plan_dim
8 cols

Medicare Part D and MA-PD plan dimension. One row per plan year × CMS contract × plan benefit package, with sponsor, marketing name, plan type, and market segment.

pbm_inference
8 cols

PBM attribution derived from observed Medicare plan artifacts (BIN/PCN routing, plan documents). Captures the primary PBM managing pharmacy benefits for a plan/year with supporting evidence.

pbm_market_snapshot
9 cols

Historical PBM attribution per Medicare contract/PBP across all CMS plan years. Built from Medicare plan artifact inference (primary) and BIN/PCN backfill (secondary). 96.9% attribution coverage.

plan_routing
9 cols

Pharmacy claim routing identifiers (RxBIN, RxPCN, RxGroup, M3P overrides) per Medicare plan/year. Use to route claims data to the correct plan and PBM.

Pharmacy Directory

Pharmacy
pharmacy_directory
24 cols

US pharmacy roster filtered from NPPES organizations with pharmacy taxonomy and CMS Part D network presence. Includes chain attribution, PSAO affiliation, type classification, and geocoded location.

Integration Patterns

Common Join Paths

Pre-validated join paths for common analytical workflows. Each pattern is a tested integration available through the API or directly via MotherDuck.

Drug Hierarchy

Navigate the RxNorm concept tree from ingredient through drug form to specific product and NDC

ingredientdrug_profilendc_crosswalk
Diagnosis Mapping

Resolve drug-to-condition relationships with utilization-weighted prescription volume estimates

drug_profiledrug_icd10_estimateref.icd10
Suspecting Logic

From drug concept to executable inference rules with structured JSON rule logic

drug_profilesuspecting_rulerule_logic (JSONB)
Pharmacy Pricing

Weekly acquisition cost per NDC (NADAC) or quarterly Medicare Part B limits (ASP) via crosswalk

ndc_crosswalknadac_pricing / asp_payment_limits
Drug Economics

Therapy cost benchmarks from NDC level through drug, ingredient, and ATC class rollups

ndc_crosswalkmart_ndc_economicsmart_clinical_drug_economics
Formulary Access

Part D formulary restriction prevalence and composite access friction score by drug and year

drug_profiledrug_formulary_signaldrug_access_friction
PBM Attribution

Identify the PBM managing a pharmacy benefit from BIN/PCN routing or Medicare plan inference

bin_pcn_pbm_mappbm_market_snapshot
Controlled Vocabularies

Enumerated Values

Standardized enum values used across enrichment layers. All values are validated at ingestion time.

therapy_pattern
maintenance_continuousmaintenance_intermittentacute_short_courseprn_chronicprn_acutemixed_contextual
benefit_channel
pharmacy_onlymedical_onlydual_channel
rule_category
prescriber_specialtypatient_agepatient_genderconcurrent_rxmedical_claims_historycompoundsingle_indication
rule_strength
exclusionstrong_priorweak_prior
indication_direction
towardawayswitch
signal_source
rx_onlydemographicsconcurrent_rxmedical_claims_historycompound
price_source
nadacasp
billing_unit
EAMLGM
package_econ_confidence
highmediumlow
friction_category
very_lowlowmoderatehighvery_high
attribution_method
inferencebin_pcn_backfillbin_backfill
pharmacy_type
independent_unaffiliatednational_chainmass_merchltcgroceryspecialtyregionalmail_order
Conventions

Column Naming

ConceptColumn NameRxNorm TTYExample
Drug (ingredient + form + strength)drug_idSCD / SBD617314
Ingredientingredient_rxcuiIN / MIN321988
Clinical Drug Form (no strength)clinical_drug_form_rxcuiSCDF / SBDF370621
NDC 11-digitndc1100071101301
NDC 9-digit (product)ndc9000711013
NDC labelerlabeler_code00071
HCPCS Codehcpcs_codeJ7502
ATC Level 5 (substance)atc_code_5C10AA05
ATC Level 4 (pharmacology class)atc4_codeC10AA
National Provider Identifiernpi1234567890