Skip to content

Feat!: Add support for merge_filter and dbt incremental_predicates for Incremental By Unique Key#3540

Merged
themisvaltinos merged 4 commits into
mainfrom
themis/incremental_predicates
Dec 20, 2024
Merged

Feat!: Add support for merge_filter and dbt incremental_predicates for Incremental By Unique Key#3540
themisvaltinos merged 4 commits into
mainfrom
themis/incremental_predicates

Conversation

@themisvaltinos
Copy link
Copy Markdown
Collaborator

@themisvaltinos themisvaltinos commented Dec 19, 2024

This update introduces merge_filters in INCREMENTAL_BY_UNIQUE_KEY kind models and also adds support for dbt projects with incremental_predicates. It updates the incremental logic for engines that support the MERGE strategy, by incorporating ability to filter data during the merge operation.

When using this feature, both source and target aliases can be used to distinguish between the source and target columns similar to WHEN MATCHED.

Example Model

MODEL (
    name db.test,
    kind INCREMENTAL_BY_UNIQUE_KEY (
        unique_key [purchase_order_id],
        merge_filters (
            source._operation <> 1 AND 
            target.start_date > DATEADD(day, -7, CURRENT_DATE)
        )
    )
);

The above predicates are then used into SQLMesh’s MERGE statements:

MERGE INTO <existing_table> AS "__MERGE_TARGET__"
    USING <temp_table> AS "__MERGE_SOURCE__"
    ON
        -- Unique key condition
        "__MERGE_TARGET__"."id" = "__MERGE_SOURCE__"."id"
        -- Custom incremental predicates to limit the data scan
        AND "__MERGE_SOURCE__"._operation <> 1
        AND "__MERGE_TARGET__".start_date > DATEADD(day, -7, CURRENT_DATE)
    WHEN MATCHED THEN 
        UPDATE ...
    WHEN NOT MATCHED THEN 
        INSERT ...

This support extends to an existing dbt project to be converted to the equivalent sqlmesh INCREMENTAL_BY_UNIQUE_KEY model, as it aligns with dbt’s incremental predicates.

Comment thread sqlmesh/core/dialect.py Outdated
Comment thread tests/core/test_model.py Outdated
name db.employees,
kind INCREMENTAL_BY_UNIQUE_KEY (
unique_key name,
incremental_predicates source.salary > 0
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this support @start_* and @end_* macros? Can we add a test for this?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to make sure they are resolved at evaluation time and not load time

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes they are supported, but I'm not sure if they are resolved at evaluation time. I'll verify and adjust if they're not

@themisvaltinos themisvaltinos force-pushed the themis/incremental_predicates branch from 66ec744 to 79bd81c Compare December 20, 2024 11:47
@themisvaltinos themisvaltinos changed the title Feat!: Add support for incremental predicates for IncrementalByUniqueKey Feat!: Add support for merge_filters and dbt incremental_predicates for Incremental By Unique Key Dec 20, 2024
Comment thread sqlmesh/core/dialect.py Outdated
Comment on lines +433 to +435
value = self._parse_wrapped(
lambda: _parse_macro_or_clause(self, self._parse_conjunction), optional=True
)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need the parentheses here?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no we don't I just left them as an optional in case a user had it inside parenthesis but can remove them

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parentheses can be parsed without this, because _parse_conjunction can reach _parse_primary which knows how to parse parenthesized expressions. So I don't see a reason to treat this as special syntax.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh right then removing this

Comment thread sqlmesh/core/model/kind.py Outdated
Comment on lines +478 to +482
if isinstance(v, str):
v = v.strip()
if v.startswith("("):
v = v[1:-1]
return d.parse_one(v, into=exp.Expression, dialect=get_dialect(values))
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So in the case of when_matched we needed the parentheses to fix some ambiguity issues in the parser. In this case since I believe we can omit them, so doing this here won't be necessary.

Also, I'm not sure if into=exp.Expression is needed here, looks redundant?

Comment thread sqlmesh/core/dialect.py Outdated
lambda: _parse_macro_or_clause(self, self._parse_when_matched),
optional=True,
)
elif name == "merge_filters":
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Nit] I wonder if this should be singular, i.e. merge_filter. I'm personally fine either way, just wanted to see what others think. Perhaps the plural form creates an expectation that a list or tuple is valid (even though it's not)?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes I'm fine with either, the plural wa if there were many conditions in the conjunction, but merge_filter works too

@themisvaltinos themisvaltinos force-pushed the themis/incremental_predicates branch from 79bd81c to b0d233f Compare December 20, 2024 16:57
@themisvaltinos themisvaltinos force-pushed the themis/incremental_predicates branch from b0d233f to 3c640e9 Compare December 20, 2024 17:59
@themisvaltinos themisvaltinos changed the title Feat!: Add support for merge_filters and dbt incremental_predicates for Incremental By Unique Key Feat!: Add support for merge_filter and dbt incremental_predicates for Incremental By Unique Key Dec 20, 2024
@themisvaltinos themisvaltinos merged commit 0e51dd8 into main Dec 20, 2024
@themisvaltinos themisvaltinos deleted the themis/incremental_predicates branch December 20, 2024 18:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants