Commit e3aedce

Add append delta filename to metadata.append_deltas view
Why these changes are being introduced:
* The method for merging append deltas into the static metadata database file needs the filenames of append deltas to easily identify which files to delete (once merged). Prior to this change, the append deltas view only had a 'filename' column, which referred to the path or S3 URI for the TIMDEXDataset parquet file.

How this addresses that need:
* Set filename='append_delta_filename' when creating metadata.append_deltas view
* Explicitly select metadata column names when creating metadata.records view

Side effects of this change:
* None

Relevant ticket(s):
* https://mitlibraries.atlassian.net/browse/TIMX-528
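
As a rough illustration of the cleanup step the message describes, the sketch below collects the distinct `append_delta_filename` values from already-merged rows and deletes those files. It is not the project's merge method: the function name `delete_merged_append_deltas`, the `timdex_record_id` key, and the row shape are hypothetical, and it handles only local paths, not S3 URIs.

```python
import tempfile
from pathlib import Path


def delete_merged_append_deltas(rows: list[dict]) -> list[str]:
    """Delete each distinct local append delta file named in the merged rows.

    Hypothetical helper: assumes every row carries an 'append_delta_filename'
    key holding a local file path.
    """
    # Many rows can come from the same delta file, so de-duplicate first.
    filenames = sorted({row["append_delta_filename"] for row in rows})
    for filename in filenames:
        # missing_ok avoids an error if a file was already removed.
        Path(filename).unlink(missing_ok=True)
    return filenames


# Usage with two throwaway files standing in for append delta parquet files.
with tempfile.TemporaryDirectory() as tmp:
    delta_a = Path(tmp) / "delta_a.parquet"
    delta_b = Path(tmp) / "delta_b.parquet"
    delta_a.touch()
    delta_b.touch()
    merged_rows = [
        {"timdex_record_id": "rec-1", "append_delta_filename": str(delta_a)},
        {"timdex_record_id": "rec-2", "append_delta_filename": str(delta_a)},
        {"timdex_record_id": "rec-3", "append_delta_filename": str(delta_b)},
    ]
    deleted = delete_merged_append_deltas(merged_rows)
    remaining = sorted(p.name for p in Path(tmp).iterdir())
```

Without a per-row delta filename, the caller would have to list the append deltas directory separately and risk deleting a file written after the merge query ran.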
1 parent dba26cc commit e3aedce

1 file changed

Lines changed: 8 additions & 4 deletions

timdex_dataset_api/metadata.py

@@ -396,7 +396,8 @@ def _create_append_deltas_view(self, conn: DuckDBPyConnection) -> None:
             create or replace view metadata.append_deltas as (
                 select *
                 from read_parquet(
-                    '{self.append_deltas_path}/*.parquet'
+                    '{self.append_deltas_path}/*.parquet',
+                    filename = 'append_delta_filename'
                 )
             );
         """
@@ -414,14 +415,17 @@ def _create_append_deltas_view(self, conn: DuckDBPyConnection) -> None:
 
     def _create_records_union_view(self, conn: DuckDBPyConnection) -> None:
         logger.debug("creating view of unioned records")
+
         conn.execute(
-            """
+            f"""
             create or replace view metadata.records as
             (
-                select *
+                select
+                    {','.join(ORDERED_METADATA_COLUMN_NAMES)}
                 from static_db.records
                 union all
-                select *
+                select
+                    {','.join(ORDERED_METADATA_COLUMN_NAMES)}
                 from metadata.append_deltas
             );
             """
