Semantic layer
Entities, glossary, metrics in YAML. Versioned beside your code. Atlas reads them on every prompt.
// the schema is the product
Atlas is a YAML-defined semantic layer for analytics — authored by humans, consumed by AI agents. Entities, glossary, and metrics live in your repo; the agent reads them, writes deterministic SQL, and runs it through a 7-stage validation pipeline — read-only, table-whitelisted, statement-timed.
self-host is free · MCP server for claude desktop, cursor, continue · slack-native
Entities, dimensions, measures, joins, virtual dimensions, query patterns, glossary terms, and authoritative metrics — all in YAML, in your repo, code-reviewed in pull requests. Every field exists because an LLM needs it: sample_values ground the agent in real data, glossary.status: ambiguous forces clarifying questions, metrics.objective picks MAX vs MIN.
name: Orders
type: fact_table
table: orders
grain: one row per order

dimensions:
  - name: status
    sql: status
    type: string
    sample_values: [pending, shipped, delivered, cancelled]
  - name: order_month
    sql: TO_CHAR(created_at, 'YYYY-MM')
    type: string
    virtual: true

measures:
  - name: total_gmv_cents
    sql: total_cents
    type: sum

joins:
  - target_entity: Customers
    relationship: many_to_one
    join_columns: { from: customer_id, to: id }
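As a rough illustration of why those fields matter to a consumer, here is a minimal TypeScript sketch. Only the field names `sample_values`, `status`, and `objective` come from the text above. The interfaces, the `"defined"` status, the `"maximize"`/`"minimize"` values, the `candidates` field, and all three helper functions are assumptions made for this example, not Atlas's actual API.

```typescript
// Hypothetical shapes: only sample_values, status, and objective are
// taken from the page; every other name here is assumed.
interface Dimension {
  name: string;
  sql: string;
  type: "string" | "number" | "date"; // assumed set of types
  sample_values?: string[];
  virtual?: boolean;
}

interface GlossaryTerm {
  term: string;
  status: "defined" | "ambiguous"; // "defined" is an assumed counterpart
  candidates?: string[];           // assumed: possible meanings when ambiguous
}

interface Metric {
  name: string;
  objective: "maximize" | "minimize"; // assumed value names
}

// sample_values let a consumer flag filter literals not seen in the sample.
// A dimension without samples cannot be checked, so it is treated as known.
function isKnownValue(dim: Dimension, literal: string): boolean {
  if (dim.sample_values === undefined) return true;
  return dim.sample_values.indexOf(literal) !== -1;
}

// An ambiguous glossary term produces a clarifying question instead of SQL.
function clarifyingQuestion(term: GlossaryTerm): string | null {
  if (term.status !== "ambiguous") return null;
  const options = (term.candidates ?? []).join(" or ");
  return `By "${term.term}", do you mean ${options}?`;
}

// objective decides which end of a ranking counts as "best".
function bestAggregate(metric: Metric): "MAX" | "MIN" {
  return metric.objective === "maximize" ? "MAX" : "MIN";
}
```

Under these assumptions, a filter on `status = 'refunded'` would be flagged against the `Orders` sample values, and a metric with `objective: "minimize"` would rank by `MIN`.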
// user
What's our top-performing category by GMV this month?
// agent reads orders.yml + categories.yml + glossary.yml, then writes SQL
SELECT c.name,
       SUM(o.total_cents) / 100.0 AS gmv,
       COUNT(DISTINCT o.id) AS orders
FROM orders o
JOIN order_items oi ON oi.order_id = o.id
JOIN products p ON p.name = oi.product_name
JOIN categories c ON c.id = p.category_id
WHERE o.status != 'cancelled'
  AND o.created_at >= DATE_TRUNC('month', NOW())
GROUP BY c.name
ORDER BY gmv DESC
LIMIT 5;
// result · 5 rows · 7 validators passed
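One way to picture "deterministic SQL" is a pure function from YAML fields to SQL text: the same entity definition always yields the same string. The sketch below is a simplification under that assumption. `Measure`, `DimensionRef`, `renderMeasure`, and `renderQuery` are hypothetical names, and the aggregate types other than `sum` are guesses. It does not describe how the Atlas agent actually composes queries.

```typescript
// Hypothetical mapping from entity fields to SQL fragments.
interface Measure {
  name: string;
  sql: string;
  type: "sum" | "count" | "avg" | "min" | "max"; // only "sum" appears on the page
}

interface DimensionRef {
  name: string;
  sql: string;
}

// A measure becomes an aliased aggregate, e.g. SUM(total_cents) AS total_gmv_cents.
function renderMeasure(m: Measure): string {
  return `${m.type.toUpperCase()}(${m.sql}) AS ${m.name}`;
}

// Builds a grouped, ranked query from one dimension and one measure.
// No randomness or ambient state is involved, so output depends only on inputs.
function renderQuery(table: string, dim: DimensionRef, m: Measure, limit: number): string {
  return [
    `SELECT ${dim.sql} AS ${dim.name}, ${renderMeasure(m)}`,
    `FROM ${table}`,
    `GROUP BY ${dim.sql}`,
    `ORDER BY ${m.name} DESC`,
    `LIMIT ${limit}`,
  ].join("\n");
}
```

Called with the `order_month` dimension and `total_gmv_cents` measure from the `Orders` entity, this would produce a five-line query grouped by month and ranked by GMV.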
// canonical questions
// same questions on the readme, the docs homepage, and the eval harness · against the bundled NovaMart e-commerce demo
of AI-generated SQL fails at least one Atlas validator.
// sample of 12,481 queries · gpt-4o, claude-sonnet, llama-3.1 · against 18 production schemas
// trace one query
Watch it run. Click any gate to see what it checks. This is the same panel the operator sees in the chat UI — every step is a real artifact, every gate is the real check.
-- session.4f8e · 7 validations passed
-- read-only · scoped to analytics.public
SELECT c.name,
       SUM(o.total_cents) / 100.0 AS gmv,
       COUNT(DISTINCT o.id) AS orders
FROM orders o
JOIN order_items oi ON oi.order_id = o.id
JOIN products p ON p.name = oi.product_name
JOIN categories c ON c.id = p.category_id
WHERE o.status != 'cancelled'
  AND o.created_at >= DATE_TRUNC('month', NOW())
GROUP BY c.name
ORDER BY gmv DESC
LIMIT 5;
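To give a feel for what a gate trace might contain, here is a small TypeScript sketch with three illustrative checks: read-only, table whitelist, and row limit. It is not the seven-stage pipeline itself. It uses naive string matching where the page describes AST parsing, so it would be fooled by comments or string literals containing keywords. The `Gate` and `GateResult` types and all function names are assumptions.

```typescript
// Hypothetical gate runner. Three illustrative checks, not Atlas's seven,
// and regex matching stands in for real AST parsing.
interface GateResult {
  gate: string;
  passed: boolean;
  detail: string;
}

type Gate = (sql: string) => GateResult;

// Passes only statements that start with SELECT/WITH and contain no write or DDL keyword.
const readOnly: Gate = (sql) => {
  const forbidden = /\b(insert|update|delete|drop|alter|truncate|create|grant)\b/i;
  const passed = /^\s*(select|with)\b/i.test(sql) && !forbidden.test(sql);
  return { gate: "read-only", passed, detail: passed ? "SELECT only" : "write or DDL keyword found" };
};

// Passes only if every table named after FROM or JOIN is on the allowed list.
function tableWhitelist(allowed: string[]): Gate {
  return (sql) => {
    const referenced: string[] = [];
    const re = /\b(?:from|join)\s+([a-z_][a-z0-9_.]*)/gi;
    let m: RegExpExecArray | null;
    while ((m = re.exec(sql)) !== null) {
      referenced.push((m[1] ?? "").toLowerCase());
    }
    const denied = referenced.filter((t) => allowed.indexOf(t) === -1);
    const passed = denied.length === 0;
    return {
      gate: "table-whitelist",
      passed,
      detail: passed ? `${referenced.length} tables allowed` : `not allowed: ${denied.join(", ")}`,
    };
  };
}

// Passes only if the statement ends with LIMIT n where n <= max.
function rowLimit(max: number): Gate {
  return (sql) => {
    const m = /\blimit\s+(\d+)\s*;?\s*$/i.exec(sql);
    const passed = m !== null && Number(m[1] ?? "") <= max;
    return { gate: "row-limit", passed, detail: passed ? `limit within ${max}` : `missing LIMIT or above ${max}` };
  };
}

// Runs every gate and keeps each result, so a UI could show the full trace.
function runGates(sql: string, gates: Gate[]): GateResult[] {
  return gates.map((g) => g(sql));
}
```

Running every gate instead of stopping at the first failure is a deliberate choice in this sketch: it lets the operator see all failing checks at once.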
// nodes in the system
Inspectable, optional, TypeScript. No agent, no orchestration framework, no prompt salad.
Entities, glossary, metrics in YAML. Versioned beside your code. Atlas reads them on every prompt.
AST-parsed, permission-checked, row-limited. Read-only by default. Same in dev, same in prod.
One connection spec. On self-host, no data leaves your network. Atlas runs in your VPC.
Every query, every result, every operator — logged, searchable, exportable. SSO, SAML, SCIM.
// drop-in surfaces
// why atlas
The same comparison from the README — same words, same scoring, no claim drift across surfaces.
// deployment topology
Cloud, or your VPC. Same Atlas, same primitives, same upgrade path.
// self-host
free
Your infra. Your data.
One command. Bun, Docker, or k8s. AGPL-3.0.
Every feature, no limits.
$ bun create atlas-agent my-atlas
$ cd my-atlas && bun run dev
→ atlas booted on :3000
→ connected · postgres://localhost
// atlas cloud
$29 / seat
Hosted. Zero ops.
We run it. Weekly updates, monitored connections, SLA.
Live in 3 minutes.
// ship it
self-host is free · MCP server for claude desktop, cursor, continue · slack-native (enterprise)