
Commit c8dcd23

Stamp dbt_spellbook provenance and explicit public/visible on spell properties
Set dune.created_by=dbt_spellbook, dune.public, and dune.visible in the four shared property macros (mark_as_spell, expose_spells, hide_spells, expose_dataset) so that every spell lands with an explicit provenance tag and visibility state on the Dune catalog service.

The baseline mark_as_spell applies to every model via the project-level post-hook; expose_spells/hide_spells/expose_dataset override as before.

Also document the dbt Cloud and CI runner profile changes required for the catalog service migration, since they cannot be configured via this repo.

Towards DWH-317
1 parent 7031d7b commit c8dcd23

4 files changed

Lines changed: 83 additions & 2 deletions


dbt_macros/dune/config_trino_properties.sql

Lines changed: 4 additions & 0 deletions
@@ -14,7 +14,9 @@
 {%- endif -%}
 {%- if target.name == 'prod' -%}
 {%- set properties = {
+'dune.created_by': 'dbt_spellbook',
 'dune.public': 'true',
+'dune.visible': 'true',
 'dune.data_explorer.blockchains': blockchains | as_text,
 'dune.data_explorer.category': 'abstraction',
 'dune.data_explorer.abstraction.type': spell_type,
@@ -37,7 +39,9 @@
 {% macro hide_spells() %}
 {%- if target.name == 'prod' -%}
 {%- set properties = {
+'dune.created_by': 'dbt_spellbook',
 'dune.public': 'false',
+'dune.visible': 'false',
 'dune.data_explorer.category': 'abstraction',
 'dune.vacuum': '{"enabled":true}'
 } -%}

dbt_macros/dune/expose_dataset.sql

Lines changed: 2 additions & 0 deletions
@@ -1,7 +1,9 @@
 {% macro expose_dataset(blockchains, contributors) %}
 {%- if target.name == 'prod' -%}
 {%- set properties = {
+'dune.created_by': 'dbt_spellbook',
 'dune.public': 'true',
+'dune.visible': 'true',
 'dune.data_explorer.blockchains': blockchains | as_text,
 'dune.data_explorer.category': 'third_party_data',
 'dune.data_explorer.contributors': contributors | as_text,

dbt_macros/dune/mark_as_spell.sql

Lines changed: 13 additions & 2 deletions
@@ -1,9 +1,20 @@
 {% macro mark_as_spell(this, materialization) %}
 {%- if target.name == 'prod' -%}
 {%- if model.config.materialized == "view" -%}
-{%- set properties = { 'dune.data_explorer.category': 'abstraction' } -%}
+{%- set properties = {
+'dune.created_by': 'dbt_spellbook',
+'dune.public': 'true',
+'dune.visible': 'false',
+'dune.data_explorer.category': 'abstraction'
+} -%}
 {%- else -%}
-{%- set properties = { 'dune.data_explorer.category': 'abstraction', 'dune.vacuum': '{"enabled":true}' } -%}
+{%- set properties = {
+'dune.created_by': 'dbt_spellbook',
+'dune.public': 'true',
+'dune.visible': 'false',
+'dune.data_explorer.category': 'abstraction',
+'dune.vacuum': '{"enabled":true}'
+} -%}
 {%- endif -%}
 {%- set deprecated_at = model.config.get('deprecated_at', none) -%}
 {%- if deprecated_at -%}
Lines changed: 64 additions & 0 deletions
@@ -0,0 +1,64 @@
# Catalog service migration (DWH-317)

Spellbook's underlying Trino catalogs have moved from Dune's self-hosted Hive Metastore (HMS) to the Dune catalog service. The migration is transparent for most spells, but a few things changed that you should know about when reading or contributing to the repo.

## What changed in this repo

Every spell now emits an additional set of Dune table properties on each run. The shared post-hook macros (`mark_as_spell`, `expose_spells`, `hide_spells`, `expose_dataset`) have been updated to always set:

- `dune.created_by='dbt_spellbook'`: provenance tag so the catalog service can distinguish dbt-authored spells from rows written by other producers (sqlmesh, `dunectl migrate`, etc.).
- `dune.public`: explicit `'true'` for spells created via `mark_as_spell`, `expose_spells`, and `expose_dataset`; `'false'` for spells created via `hide_spells`. This used to be implicit and relied on HMS defaults.
- `dune.visible`: explicit `'false'` by default, `'true'` for spells exposed via `expose_spells` / `expose_dataset`, `'false'` in sandpit regardless. This drives Data Explorer discoverability.

You do not need to change existing models. The baseline post-hook chain in every `dbt_project.yml` (`set_trino_session_property` → `optimize_spell` → `mark_as_spell`) now lands these properties for every spell automatically. Per-model `post_hook='{{ expose_spells(...) }}'` / `'{{ hide_spells() }}'` calls override the baseline as before.

## Per-model opt-in patterns (reminder)

| Macro | `dune.public` | `dune.visible` | Use when |
| --- | --- | --- | --- |
| `mark_as_spell` (default, automatic) | `true` | `false` | Intermediate / chain-specific tables, default |
| `expose_spells(...)` (per-model `post_hook`) | `true` | `true` | Sector spells surfaced in Data Explorer |
| `hide_spells()` (per-model `post_hook`) | `false` | `false` | Internal spells that must stay private |
| `expose_dataset(...)` (per-model `post_hook`) | `true` | `true` | Third-party datasets |

`dune.public='true'` means anyone can query the table. `dune.visible='true'` means it shows up in Data Explorer. They are independent flags.
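
As a minimal sketch of the per-model override, a model that must stay private could set `hide_spells()` as its `post_hook`. The schema, alias, materialization, and query below are hypothetical placeholders, not models from this repo; only the `post_hook` value follows the pattern described above.

```sql
-- Hypothetical model file; names are illustrative only.
{{ config(
    schema = 'example_schema',
    alias = 'internal_spell',
    materialized = 'table',
    post_hook = '{{ hide_spells() }}'
) }}

-- Placeholder query standing in for the real spell logic.
SELECT 1 AS placeholder_column
```

A model with no per-model `post_hook` would simply keep the baseline `mark_as_spell` values from the table (`dune.public='true'`, `dune.visible='false'`).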

## What changed outside this repo

Some configuration cannot be updated via this repo because it lives in dbt Cloud job definitions and CI runner secrets. The following items were changed out-of-band as part of the migration and are documented here for reproducibility.

### dbt Cloud profile (`dunesql`)

The `dunesql` profile is defined in the dbt Cloud account connection (not in any `profiles.yml` in this repo). Its Trino connection was updated to target the migrated clusters:

- `database`: `hive_catalog_svc` (was `hive` on the pre-migration cluster).
- `host`: one of the migrated spellbook clusters depending on the job (`trino-spellbook-cd.prod.internal.dunetech.io`, `trino-spellbook-daily.prod.internal.dunetech.io`, etc.).
- `http_headers.X-Trino-Client-Tags`: `routingGroup=spellbook-<cluster>` matching the job's cluster.

On the migrated clusters, the Trino catalogs `hive` and `delta_prod` are aliases for the same catalog-service-backed catalog. Both names resolve to the same rows. Models can continue to reference sources and refs without schema qualification changes.
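
One way to sanity-check the aliasing on a migrated cluster is to count rows through both catalog names and compare. This is only a sketch: `<your_schema>` and `<your_table>` are placeholders to fill in, and it assumes the table is readable through both `hive` and `delta_prod` as described above.

```sql
-- Both counts should match if the two catalog names resolve to the same table.
SELECT
    (SELECT count(*) FROM hive.<your_schema>.<your_table>) AS hive_rows,
    (SELECT count(*) FROM delta_prod.<your_schema>.<your_table>) AS delta_prod_rows;
```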

### CI runner `~/.dbt/profiles.yml`

The `dunesql` profile used by GitHub Actions (`.github/workflows/dbt_run.yml` invokes `dbt ... --profile dunesql`) is provisioned into `$HOME/.dbt/profiles.yml` by the runner image / secret at job start. Its `database` and `host` values follow the same pattern as the dbt Cloud profile above. Update both together when cluster names change.

### Sandpit profile (in-repo)

The sandpit profile (`dbt_subprojects/*/profiles.yml`, key `spellbook-sandpit`) was already updated in-repo: `database: hive_catalog_svc` and `X-Trino-Client-Tags: routingGroup=spellbook-sandpit`. No further action is needed.

## Verifying a new spell lands correctly

After a dbt run against prod or sandpit, the spell should appear in the catalog service with the expected properties. Query the catalog service directly from any migrated cluster:

```sql
SELECT extra_properties
FROM system.metadata.table_properties
WHERE catalog_name = 'hive_catalog_svc'
  AND schema_name = '<your_schema>'
  AND table_name = '<your_table>';
```

Expected keys on a `mark_as_spell`-only spell: `dune.created_by`, `dune.public`, `dune.visible`, `dune.data_explorer.category`, and (for non-view materializations) `dune.vacuum`.

## Context

The migration tracking issue is DWH-317 in Linear. See the arrakis-jobs and core repos for the cluster and plugin-side changes that back this work.
