
Data Dictionary Compliance Testing

Validates OData servers against the RESO Data Dictionary 2.0 specification. DD 2.1 coming soon.

Replaces the Commander-based DD workflow entirely – metadata serialization, Lookup Resource fetching, and replication all happen natively in TypeScript, calling cert-utils inner functions directly for the replication strategies and variations checking.

Note: RESO no longer certifies providers on DD 1.7. The CLI enforces DD 2.0. DD 1.7 is available via the SDK for historical compatibility only.

Usage

reso-cert dd --url https://api.example.com --auth-token TOKEN

# Strict mode (fail on variations and schema validation errors)
reso-cert dd --url https://api.example.com --auth-token TOKEN --strict

# Batch expand optimization (see below)
reso-cert dd --url https://api.example.com --auth-token TOKEN --batch-expand

# Limit records per resource
reso-cert dd --url https://api.example.com --auth-token TOKEN --limit 10000

Pipeline

The DD pipeline executes these steps:

| Step | What It Does |
| --- | --- |
| Health check | Wait for the server to respond |
| Resolve auth | Bearer token or OAuth2 Client Credentials |
| Generate metadata report | Fetch /$metadata, serialize to metadata-report.json, fetch and merge Lookup Resource data if available |
| Initialize replication state | Shared state service that tracks stats across all replication strategies |
| Check variations (DD 2.0) | Compare metadata against the reference DD, flag non-standard fields/lookups |
| Replicate: TIMESTAMP_DESC | Fetch all records ordered by ModificationTimestamp descending |
| Replicate: NEXT_LINK (DD 2.0) | Fetch all records using server-driven paging |
| Replicate: NEXT_LINK + filter (DD 2.0) | Fetch recent records with a ModificationTimestamp filter |
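The steps above run strictly in sequence. A minimal sketch of that ordering (the `Step` type and `runDdPipeline` helper are illustrative, not the SDK's actual API):

```typescript
// Sequential pipeline runner: each step must complete before the next begins.
type Step = { name: string; run: () => Promise<void> };

async function runDdPipeline(steps: Step[]): Promise<void> {
  for (const step of steps) {
    console.log(`→ ${step.name}`);
    // Replication steps share stats through the replication state service
    // initialized earlier in the sequence.
    await step.run();
  }
}

// Example: the first two steps of the pipeline.
const steps: Step[] = [
  { name: "Health check", run: async () => { /* ping the server */ } },
  { name: "Resolve auth", run: async () => { /* bearer token or OAuth2 */ } },
];
runDdPipeline(steps);
```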

DD 2.0 Testing

| Capability | Details |
| --- | --- |
| Replication strategies | TIMESTAMP_DESC + NEXT_LINK + NEXT_LINK with filter |
| Variations check | Yes |
| JSON schema validation | Yes (in strict mode) |
| Page size | 1000 |
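The three replication strategies translate into OData queries roughly along these lines. The URLs are illustrative query shapes, not the CLI's exact requests; the page size of 1000 matches the table above:

```typescript
// Illustrative query shapes for the DD 2.0 replication strategies.
const PAGE_SIZE = 1000;

// TIMESTAMP_DESC: client-driven paging, newest records first.
function timestampDescQuery(base: string, resource: string, skip: number): string {
  return `${base}/${resource}?$orderby=ModificationTimestamp desc&$top=${PAGE_SIZE}&$skip=${skip}`;
}

// NEXT_LINK: issue the initial request, then follow @odata.nextLink
// from each response (server-driven paging).
function nextLinkInitialQuery(base: string, resource: string): string {
  return `${base}/${resource}?$top=${PAGE_SIZE}`;
}

// NEXT_LINK + filter: same, but restricted to recently modified records.
function nextLinkFilterQuery(base: string, resource: string, sinceIso: string): string {
  return `${base}/${resource}?$filter=ModificationTimestamp gt ${sinceIso}&$top=${PAGE_SIZE}`;
}
```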

Expansion Strategies

During replication, the DD pipeline fetches both top-level resources and their expanded navigation properties (e.g., Property?$expand=Media). Two strategies are available:

Default: One Expansion per Request

Each navigation property gets its own request:

GET /Property
GET /Property?$expand=Media
GET /Property?$expand=OpenHouse
GET /Property?$expand=PropertyRooms
...

This is the safer approach – each response is manageable in size, and a failure on one expansion does not affect others.

--batch-expand: All Expansions in a Single Request

All navigation properties for a resource are batched into one request:

GET /Property
GET /Property?$expand=Media,OpenHouse,PropertyRooms,UnitTypes,...

This reduces the number of HTTP round-trips significantly (e.g., 18 separate requests become 1). However, each response is much larger since it includes all related records inline.
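The difference between the two strategies can be sketched as a request-list builder (the function name is hypothetical):

```typescript
// Build the request list for one resource under either expansion strategy.
function buildExpandRequests(resource: string, navProps: string[], batchExpand: boolean): string[] {
  const base = `/${resource}`;
  if (batchExpand) {
    // --batch-expand: one request carrying every navigation property.
    return navProps.length ? [base, `${base}?$expand=${navProps.join(",")}`] : [base];
  }
  // Default: one $expand per request, so a failure on one expansion
  // does not affect the others.
  return [base, ...navProps.map((p) => `${base}?$expand=${p}`)];
}
```

With 18 navigation properties this yields 19 requests in the default mode and 2 with --batch-expand, matching the round-trip reduction described above.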

Benchmark Results

Tested against the RESO reference server (Docker, PostgreSQL, 153 Property records with 18 navigation properties):

| Strategy | Default (one-at-a-time) | Batch expand |
| --- | --- | --- |
| TIMESTAMP_DESC | 215s | 417s |
| NEXT_LINK | 154s | 114s |
| NEXT_LINK + filter | 155s | 420s |
| Total | 8m 44s | 15m 51s |

Key findings:

  • Batch expand was slower for this server because each expanded response was massive (153 records × 18 nav properties = huge JSON payloads)
  • The NEXT_LINK strategy was faster with batch expand because server-driven paging handles large payloads more efficiently
  • Batch expand is most beneficial for servers with many resources but lightweight expansions, or when combined with parallel execution
  • The default one-at-a-time approach is safer and faster for servers with large expansion payloads

Recommendation: Use the default unless you know your expansions are lightweight. Batch expand shines when round-trip latency is the bottleneck rather than payload size.

Options

| Option | Default | Description |
| --- | --- | --- |
| --dd-version | 2.0 | DD version (2.0) |
| --limit | 100000 | Max records to replicate per resource |
| --strict | false | Fail on variations, enforce JSON schema validation |
| --batch-expand | false | Batch all expansions per resource into a single $expand request |

Metadata Report

The pipeline generates a metadata report (metadata-report.json or metadata-report.processed.json) in the RESO standard format:

{
  "description": "RESO Data Dictionary Metadata Report",
  "version": "2.0",
  "generatedOn": "2026-04-06T00:00:00.000Z",
  "resources": [{ "resourceName": "Property" }, ...],
  "fields": [
    {
      "resourceName": "Property",
      "fieldName": "ListPrice",
      "type": "Edm.Decimal",
      "nullable": true,
      "scale": 2,
      "precision": 14,
      "isEnumeration": false,
      "annotations": [...]
    }
  ],
  "lookups": [
    {
      "lookupName": "StandardStatus",
      "lookupValue": "Active",
      "type": "Edm.String",
      "annotations": [...]
    }
  ]
}

When the Lookup Resource is available, it is fetched (using @odata.nextLink pagination with $top/$skip fallback) and merged with the EDMX-based report. Fields with LookupName annotations get their type replaced with the lookup name.
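The pagination described above can be sketched as follows. This is a simplified illustration, assuming a bare `fetchPage` stand-in for the SDK's HTTP layer; the fallback kicks in when the server returns a full page without an @odata.nextLink:

```typescript
// One page of an OData response.
interface ODataPage { value: unknown[]; "@odata.nextLink"?: string }

// Fetch all Lookup Resource records: follow @odata.nextLink when present,
// otherwise fall back to client-driven $top/$skip paging.
async function fetchAllLookups(
  firstUrl: string,
  fetchPage: (url: string) => Promise<ODataPage>,
  pageSize = 1000,
): Promise<unknown[]> {
  const records: unknown[] = [];
  let url: string | undefined = firstUrl;
  let skip = 0;
  while (url) {
    const page = await fetchPage(url);
    records.push(...page.value);
    if (page["@odata.nextLink"]) {
      url = page["@odata.nextLink"]; // server-driven paging
    } else if (page.value.length === pageSize) {
      skip += pageSize; // full page, no nextLink: assume more data, use $top/$skip
      url = `${firstUrl}?$top=${pageSize}&$skip=${skip}`;
    } else {
      url = undefined; // short page: done
    }
  }
  return records;
}
```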

Output Directory Structure

Results are written to a directory structure compatible with reso-certification-utils:

.reso-cert/
  data-dictionary-<version>/
    <providerUoi>-<providerUsi>/
      <recipientUoi>/
        current/
          metadata.xml                           # Raw EDMX XML from /$metadata
          metadata-report.json                   # Serialized metadata report (from EDMX)
          metadata-report.processed.json         # Merged report (if Lookup Resource available)
          lookup-resource-lookup-metadata.json    # Raw Lookup Resource data dump
          data-availability-report.json          # Replication results and field coverage
          data-availability-responses.json       # Raw OData response data
          data-dictionary-variations.json        # Variations report (DD 2.0)
        archived/
          20260406T033000000Z/                   # Previous run (auto-archived)
            ...

Path Construction

When using a config file (--config), the path components come from the config:

  • providerUoi and providerUsi from the top-level config
  • recipientUoi from each config entry

When running from CLI flags (no config), local placeholders are generated:

  • LOCAL-<timestamp> for providerUoi
  • LOCAL-SYSTEM for providerUsi
  • LOCAL-RECIPIENT for recipientUoi
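Putting the two cases together, the output path might be assembled like this (the helper is illustrative; the field names mirror the config keys above):

```typescript
// Config-supplied path components; omitted when running from CLI flags.
interface PathConfig { providerUoi: string; providerUsi: string; recipientUoi: string }

function outputDir(version: string, cfg?: PathConfig): string {
  // Without a config file, local placeholders are generated.
  const providerUoi = cfg?.providerUoi ?? `LOCAL-${Date.now()}`;
  const providerUsi = cfg?.providerUsi ?? "LOCAL-SYSTEM";
  const recipientUoi = cfg?.recipientUoi ?? "LOCAL-RECIPIENT";
  return `.reso-cert/data-dictionary-${version}/${providerUoi}-${providerUsi}/${recipientUoi}/current`;
}
```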

Archiving

Each run automatically archives the previous current/ directory to archived/<timestamp>/ before writing new results. This preserves a full history of all runs for a given provider/recipient combination.
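A minimal sketch of the archive-before-write step, assuming Node's fs API (the function name and timestamp derivation are illustrative):

```typescript
import { existsSync, mkdirSync, renameSync } from "node:fs";
import { join } from "node:path";

// Move <runDir>/current to <runDir>/archived/<timestamp> before a new run writes results.
function archivePreviousRun(runDir: string): void {
  const current = join(runDir, "current");
  if (!existsSync(current)) return; // first run: nothing to archive
  // e.g. "2026-04-06T03:30:00.000Z" → "20260406T033000000Z"
  const stamp = new Date().toISOString().replace(/[-:.]/g, "");
  mkdirSync(join(runDir, "archived"), { recursive: true });
  renameSync(current, join(runDir, "archived", stamp));
}
```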

Migration from reso-certification-utils

For providers currently using reso-certification-utils, the output structure is identical. The results/ directory becomes .reso-cert/ – you can symlink for backward compatibility:

ln -s .reso-cert results

Report Files

| File | Description |
| --- | --- |
| metadata.xml | Raw EDMX XML downloaded from /$metadata |
| metadata-report.json | Metadata serialized from EDMX (resources, fields, enum lookups) |
| metadata-report.processed.json | Metadata merged with Lookup Resource data (if available) |
| lookup-resource-lookup-metadata.json | Raw Lookup Resource records (LookupName, LookupValue, StandardLookupValue, LegacyODataValue) |
| data-availability-report.json | Field coverage, record counts, and data availability statistics per resource |
| data-availability-responses.json | Raw OData response payloads from replication |
| data-dictionary-variations.json | Non-standard fields and lookups found during variations checking (DD 2.0) |

Future Optimizations

  • Parallel execution: Split the precomputed query list across N workers, each replicating a subset of resources concurrently. The replication state service is shared across workers for consistent tallying. Combined with batch expand, this could reduce DD testing time by an order of magnitude for servers with many resources.
  • Selective batching: Batch small expansions (Rooms, UnitTypes) but keep large ones (Media) as separate requests, based on estimated payload size from metadata field counts.
  • Schema validation error accumulation: Allow non-strict mode to continue past schema validation errors, collecting all errors and reporting them at the end rather than stopping on the first one (see #95).
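The round-robin split behind the parallel-execution idea might look like the following. This is a sketch of the concept only, not a shipped feature:

```typescript
// Partition the precomputed query list across N workers, round-robin,
// so each worker replicates a roughly equal subset of resources.
function partition<T>(items: T[], workers: number): T[][] {
  const buckets: T[][] = Array.from({ length: workers }, () => []);
  items.forEach((item, i) => buckets[i % workers].push(item));
  return buckets;
}
```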

Files

src/sdk/dd.ts                    # DD pipeline (calls cert-utils inner functions)
src/metadata/serializer.ts       # EDMX → metadata-report.json
src/metadata/lookup-resource.ts  # Lookup Resource fetch + merge
src/data-dictionary/             # DD-specific types and exports
legacy-cert-utils/               # Local copy of cert-utils v3.0.0 (for direct modification)