RESO Data Generator – User Guide
A task-oriented walkthrough of the RESO Data Generator. This is the package that fills a real RESO-compliant server with realistic test data on demand, with the foreign keys wired correctly, the lookups respected, and the relationships between resources resolved automatically.
If you have ever stood up a RESO server, looked at an empty Property resource, and wondered what to put in it without writing a thousand lines of seed data by hand, this is the tool that ends that pattern.
Audience
Developers and test engineers who need real data inside a RESO-shaped server but do not have a production feed to point at. Real examples:
- Building against a RESO server locally and needing data to query before any real records exist
- Reproducing a customer issue and needing a clean dataset shaped exactly like the production schema
- CI pipelines that spin up a RESO server, seed it, run tests against it, and tear it down
- Demos and walkthroughs where the audience needs to see real-looking listings, agents, offices and media records on screen
- Cert tooling that needs synthetic data to exercise the full surface of a server’s read and write paths
- AI agents and integration tests that need realistic but disposable records to operate against
The generator runs as a CLI, as a programmatic SDK, and through the RESO Desktop Client’s admin panel. Same engine, three surfaces.
Install
npm install @reso-standards/reso-data-generator
Node.js 22 or later. The CLI is the most common entry point; the SDK is for embedding the generator inside another tool.
What the Generator Does
The generator’s job is the opposite of validation. Validation says “is this record well-formed for the schema?” Generation says “give me records that are already well-formed for the schema.” The interesting part is that “well-formed” means more than “fields match types” – it means the foreign keys point at real records that exist, the lookups carry valid values, the related child records are linked correctly, and the whole graph reads like a plausible MLS feed.
Three things make that work:
- Per-resource generators that know the shape of each RESO resource and produce values that look like the real thing. Property records get plausible addresses, prices in plausible ranges, bedroom and bathroom counts that match each other, and geocoordinates that fall on land. Member records get names that pair with email addresses that match. Office records get brokerage names and contact info that read like real businesses. The values are synthetic but they look like data, not like foo / bar / baz.
- Automatic foreign-key resolution, so when you ask for 50 Properties, the generator notices that Property references Member (as ListAgent) and Office (as ListOffice), creates the right number of upstream records first, and then creates the Properties with valid ListAgentKey and ListOfficeKey values pointing at them. You never have to think about ordering.
- A topologically sorted dependency graph that handles arbitrary chains of references – including the well-known Office ↔ Member circular dependency (an Office has a BrokerKey pointing at a Member, but a Member has an OfficeKey pointing at an Office). The generator creates Office first without the broker links, then Member with valid Office references, then PATCHes the Office records with broker keys pointing at real Members. The circular dependency is resolved invisibly; you just ask for Properties.
The result: one command, a fully wired-up server.
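The phased ordering described above can be sketched in a few lines. This is an illustration of the approach, not the library's internals: resources with no unmet dependencies are created first, and the deliberately deferred circular edge (Office's BrokerKey) becomes a PATCH phase at the end.

```javascript
// Sketch of phase planning over a dependency graph.
// deps: { resource: [resources it must be created after] }
// deferred: edges broken out of the cycle, back-filled via PATCH at the end.
function planPhases(deps, deferred) {
  const order = [];
  const remaining = new Map(Object.entries(deps).map(([r, d]) => [r, new Set(d)]));
  while (remaining.size > 0) {
    // Everything whose dependencies have all been created is ready.
    const ready = [...remaining.keys()].filter(r => remaining.get(r).size === 0);
    if (ready.length === 0) throw new Error('unresolvable cycle');
    for (const r of ready) {
      order.push({ resource: r, action: 'POST' });
      remaining.delete(r);
      for (const d of remaining.values()) d.delete(r);
    }
  }
  // Deferred circular edges become PATCH phases after all pools exist.
  for (const [resource, field] of deferred) order.push({ resource, action: 'PATCH', field });
  return order;
}

const phases = planPhases(
  { Office: [], Member: ['Office'], Property: ['Member', 'Office'] },
  [['Office', 'BrokerKey']]
);
// → POST Office, POST Member, POST Property, PATCH Office.BrokerKey
```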
Generating Data Through the CLI
The CLI is the most common entry point. Two modes: interactive (it asks you questions) and non-interactive (you pass everything as flags). Non-interactive is the right choice for scripts and CI; interactive is the right choice when you are exploring.
The Simplest Possible Run
npx reso-data-generator -u http://localhost:8080 -r Property -n 50 -t admin-token
This asks for 50 Property records against a server at localhost:8080, with the bearer token admin-token. With nothing else specified, the generator:
- Fetches the server’s metadata to learn the schema
- Discovers that Property has foreign-key references to Member (as ListAgent), Office (as ListOffice), OUID, and Teams
- Computes how many of each upstream record it needs (defaults: 1 Office per 5 Properties, 1 Member per 2 Properties, etc.)
- Builds a dependency graph and sorts it topologically
- Creates the upstream records first via POST
- Creates the 50 Properties with valid foreign keys pointing at the records it just made
- Resolves the Office ↔ Member circular dependency by PATCHing Office records after Members exist
You go from “empty server” to “50 listings, agents, offices, all wired correctly” in one command.
Generating Related Records Too
Property is the parent for several child collection resources – Media (listing photos), OpenHouse (open house events), Showing (showing appointments), PropertyRooms (room-by-room details). These are not foreign-key dependencies; they are children that hang off the parent and reference it via the RESO ResourceName + ResourceRecordKey convention.
To generate them alongside the Properties, use --related:
npx reso-data-generator -u http://localhost:8080 -r Property -n 50 \
--related Media:5,OpenHouse:2,PropertyRooms:8 -t admin-token
This produces:
- 50 Properties (each with valid Member and Office references)
- 250 Media records (5 per Property, linked back via ResourceName=Property and ResourceRecordKey=<listingKey>)
- 100 OpenHouse records (2 per Property, linked via ListingKey)
- 400 PropertyRooms records (8 per Property)
- Whatever Members, Offices, and other upstream records the dependency graph needs
The --related flag takes a comma-separated list of Resource:count pairs. The count is per parent record, so Media:5 means “5 Media records for every Property generated.”
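If you want to sanity-check the arithmetic before a large run, the expansion is just the parent count multiplied by each per-parent count. A tiny illustrative helper (not part of the SDK):

```javascript
// Expand per-parent --related counts into absolute totals.
function relatedTotals(parentCount, related) {
  return Object.fromEntries(
    Object.entries(related).map(([resource, perParent]) => [resource, parentCount * perParent])
  );
}

const totals = relatedTotals(50, { Media: 5, OpenHouse: 2, PropertyRooms: 8 });
// → { Media: 250, OpenHouse: 100, PropertyRooms: 400 }
```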
Choosing the Output Mode
The generator supports three output modes for the same generation logic. Pick the one that fits how the data is going to be used.
HTTP (default) – POST records directly to the server’s Add/Edit API. This is what you want when you have a running server and you want the data to land in it immediately.
npx reso-data-generator -u http://localhost:8080 -r Property -n 50 -t admin-token
JSON – write each record to its own file under a directory. This is what you want when you need disposable test data for a unit test, or when you want to commit a deterministic seed corpus to a repo.
npx reso-data-generator -r Property -n 50 -f json -o ./seed-data
The output directory gets one subdirectory per resource (./seed-data/Property/, ./seed-data/Member/, ./seed-data/Office/, etc.) with one numbered JSON file per record (0001.json, 0002.json, …). The dependency graph is resolved before the files are written, so foreign keys in the Property files point at the keys in the Member and Office files.
curl – generate a seed.sh bash script with curl commands. This is what you want when you need to seed a server from a Dockerfile or a CI step that does not have Node.js installed.
npx reso-data-generator -r Property -n 50 -f curl -o ./seed.sh -t admin-token
The generated script includes a health-check loop at the top so it waits for the server to be ready, then runs the POST commands in dependency order, then runs the PATCH commands for the Office ↔ Member back-fill.
Skipping Dependency Resolution
If you only want to generate one resource and you do not care that its foreign keys point at nothing, pass --no-deps:
npx reso-data-generator -r Property -n 10 --no-deps -f json -o ./seed-data
Useful when you are testing how a server handles records with broken foreign keys, or when you only need the shape of the data and not a coherent graph.
Overriding Dependency Counts
The defaults work for most cases, but if you need more or fewer upstream records than the heuristics produce, override them with --dep-counts:
npx reso-data-generator -u http://localhost:8080 -r Property -n 50 \
--dep-counts Office:20,Member:100 -t admin-token
This forces 20 Office records and 100 Member records, regardless of what the heuristic would have computed. The 50 Properties still get valid foreign keys; they just have a wider pool of agents and brokerages to draw from.
Interactive Mode
If you would rather walk through the options as questions instead of remembering flags, run the generator with no arguments:
npx reso-data-generator
It prompts for the output format, server URL, auth token, resource name, record count, dependency resolution toggle and related-record configuration. Useful for one-off exploration.
Using the Generator from Code
The CLI is a thin wrapper around an SDK that you can call directly from your own application. The SDK is the right entry point when you are embedding the generator inside another tool – a test harness, a CI orchestrator, an admin UI, an MCP tool that wants to seed data on behalf of an agent.
Generating With Dependencies
import { generateWithDependencies } from '@reso-standards/reso-data-generator';
const result = await generateWithDependencies(
{
resource: 'Property',
count: 50,
related: { Media: 5, OpenHouse: 2, PropertyRooms: 8 },
serverUrl: 'http://localhost:8080',
authToken: 'admin-token'
},
{ format: 'http' },
metadata,
(progress) => {
console.log(`${progress.resource}: ${progress.created}/${progress.total}`);
}
);
console.log(`Created ${result.totalRecords} records across ${result.resources.length} resources`);
The third argument is the metadata the generator uses to discover foreign-key relationships. You can fetch it via the RESO Client SDK’s metadata loader, or pass a MetadataReport you have already loaded from somewhere else.
The fourth argument is an optional progress callback that fires once per record. Useful for showing a progress bar in a UI or for streaming logs to a CI runner.
Generating a Single Resource
If you want to generate one resource and skip the dependency graph entirely (because you have already seeded the upstream records yourself, or because you are testing how the server handles missing references), use generateSeedData directly:
import { generateSeedData } from '@reso-standards/reso-data-generator';
const result = await generateSeedData(
{
resource: 'Property',
count: 10,
serverUrl: 'http://localhost:8080',
authToken: 'admin-token'
},
{ format: 'http' },
(progress) => {
console.log(`Created ${progress.created} of ${progress.total}`);
}
);
Same shape, fewer arguments, no dependency resolution.
Inspecting What Will Be Generated
If you want to see the dependency graph before the generator runs, use buildSeedPlan:
import { buildSeedPlan } from '@reso-standards/reso-data-generator';
const plan = buildSeedPlan({
resource: 'Property',
count: 50,
related: { Media: 5, OpenHouse: 2 }
}, metadata);
for (const phase of plan.phases) {
console.log(`Phase ${phase.order}: ${phase.resource} (${phase.count} records)`);
}
The plan is a dry run – it tells you which resources will be created, in which order, with how many of each. Useful for sanity-checking a generation run before it touches a server.
Using a Specific Generator
Each resource has a domain-specific generator that produces the realistic per-resource values. If you want to call one directly – for example, to produce a single Property record without writing it to a server – use getGenerator:
import { getGenerator } from '@reso-standards/reso-data-generator';
const propertyGenerator = getGenerator('Property');
const property = propertyGenerator.generate({ fields, lookups, foreignKeys: {} });
console.log(property.City, property.ListPrice);
The result is a fully populated record object, ready to be inspected or POSTed by hand.
Per-Resource Realism
The generator does not just produce values that pass type checks. It produces values that read like real data. A short tour:
- Property – realistic addresses from 75 real U.S. cities with city-specific street names, listing prices bounded by field-name-aware rules (~40 rules that prevent billion-dollar expenses and nonsensical values), structure values (bedrooms, bathrooms, living area, lot size) that are internally consistent, geocoordinates that match the declared city rather than landing in the ocean, listing dates and status combinations that make sense, public remarks that read like an agent wrote them, tax data based on state-specific assessment patterns, co-agent records drawn from the same office as the primary agent, and 20 real MLS system names for OriginatingSystem and SourceSystem fields
- Member – first and last names from a realistic distribution, email addresses constructed from the names (firstname.lastname@brokerage.com), phone numbers in valid U.S. formats, designations drawn from real industry credentials (CRS, ABR, GRI, e-PRO), NAR member IDs in the right shape, and MLS-style MemberMlsId values
- Office – brokerage office records with names that read like real brokerages, addresses geo-consistent with their declared cities, contact information consistent with the address, and MLS-style OfficeMlsId values
- Media – image records with placeholder URLs, descriptions, and ordering, linked back to the parent resource via the RESO ResourceName + ResourceRecordKey convention so they show up in the right Property when expanded
- OpenHouse – open house events with future-dated start times, durations that look like real open houses, linked to a parent property via ListingKey
- Showing – showing appointments with realistic time slots, agent and contact references, linked to the parent property
- PropertyRooms, PropertyGreenVerification, PropertyPowerProduction, PropertyUnitTypes – child collection records linked to the parent property via ListingKey, generated with a generic child generator that respects the field metadata
For resources without a domain-specific generator, the library falls back to a generic field generator that handles every Edm type (Edm.String, Edm.Boolean, Edm.Int16, Edm.Int32, Edm.Int64, Edm.Decimal, Edm.Date, Edm.DateTimeOffset, Edm.TimeOfDay, Edm.Guid) plus enum and collection lookups drawn from the server’s metadata.
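As an illustration of what that fallback has to cover – this hand-written sketch is an approximation, not the library's implementation – a generic value generator is essentially a switch over the Edm type, with lookup fields drawing from the values the metadata declares:

```javascript
import { randomUUID } from 'node:crypto';

// Illustrative generic field generator keyed on Edm type.
// The real library derives all of this from the server's metadata.
function generateValue(edmType, lookupValues = []) {
  switch (edmType) {
    case 'Edm.String': return 'sample-' + Math.random().toString(36).slice(2, 8);
    case 'Edm.Boolean': return Math.random() < 0.5;
    case 'Edm.Int16':
    case 'Edm.Int32':
    case 'Edm.Int64': return Math.floor(Math.random() * 1000);
    case 'Edm.Decimal': return Math.round(Math.random() * 100000) / 100;
    case 'Edm.Date': return new Date().toISOString().slice(0, 10);      // YYYY-MM-DD
    case 'Edm.DateTimeOffset': return new Date().toISOString();
    case 'Edm.TimeOfDay': return new Date().toISOString().slice(11, 19); // HH:MM:SS
    case 'Edm.Guid': return randomUUID();
    default:
      // Enum/lookup fields: pick one of the values the metadata declares.
      if (lookupValues.length > 0) {
        return lookupValues[Math.floor(Math.random() * lookupValues.length)];
      }
      return null;
  }
}
```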
Relational Integrity
Generated records maintain referential consistency across resources. Member and Office records form pools that Property records draw from, so a listing’s ListAgentKey always points to a real Member record and ListOfficeKey always points to a real Office record. Co-agents (BuyerAgent, CoBuyerAgent, CoListAgent) are selected from the same office as the primary agent. Expansion records (ListAgent, BuyerAgent, etc.) are flattened into the parent Property record consistently, so querying a Property and its expanded Member returns matching data.
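That invariant is easy to assert in your own tests. A hypothetical spot check (written for this guide, not part of the SDK) might look like:

```javascript
// Return the Property records whose agent or office keys do not resolve
// to a record in the generated Member/Office pools. Empty when the
// dependency graph has been wired correctly.
function findDanglingReferences(properties, members, offices) {
  const memberKeys = new Set(members.map(m => m.MemberKey));
  const officeKeys = new Set(offices.map(o => o.OfficeKey));
  return properties.filter(
    p => !memberKeys.has(p.ListAgentKey) || !officeKeys.has(p.ListOfficeKey)
  );
}
```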
Reset
In the desktop client and web client, a Reset button with a two-step confirmation truncates all generated data while preserving the schema. This lets you regenerate fresh data without restarting the server or rebuilding containers.
A Note on the Data
The data the generator produces is synthetic and driven by the RESO Data Dictionary. It is not sampled from any real listing feed, it is not derived from any production database, and it is not generated by a model trained on real listings. The values come from per-resource generators that understand the shape of each RESO resource and produce values that fit – nothing more.
This matters in a few practical ways:
- License-clean by construction – there is no upstream feed to license, no MLS contract to negotiate, no usage restriction inherited from a data vendor
- Safe to commit – the generated records can live in a repo, in a Docker image, in a CI fixture set, in any place a license-encumbered feed could not
- Safe to share – generated records can be sent to vendors, partners, and AI agents for testing without any chain-of-custody concerns
- Safe for AI training and inference – none of the generated data has any provenance link to a real listing, so any tool that processes it cannot leak real-world information through the synthetic set
The current generator produces realistic data shaped by the RESO Data Dictionary itself plus the per-resource heuristics described above. A more advanced generator that produces test data shaped to any specific server’s metadata report (including local fields) is in flight – see the RESO Reference Server guide for the broader story on how the generator pairs with the reference server, and reso-tools#106 for the work to make the generator strictly respect any target server’s declared field set.
Where to Next
- Standing up a server to generate against – the RESO Reference Server is the natural target. Spin it up locally with Docker or SQLite, point the generator at it, and you have a fully populated RESO server in seconds.
- Validating the generated records – the reso-validation library checks records against field metadata and resource business rules. The reference server runs it on every generated record automatically; you can also call it directly from your tests.
- Querying what you generated – the RESO Client SDK is the right entry point for fetching, filtering, and paging the data after the generator has populated it.
- Browsing it in a UI – the RESO Desktop Client can connect to any RESO server, including one you just seeded, and exposes the data through a metadata-aware browser.
- Running compliance against it – the RESO Certification test runners exercise the cert flows against any RESO server, so the loop “seed → query → cert” is one continuous workflow against the same data.
Reference
- Package README – full CLI flag table, output mode details, and SDK type reference
- Source on GitHub
- npm Package
- reso-tools#106 – the work to make the generator strictly respect any target server’s declared field set