Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/connector name update #132

Merged
merged 6 commits into from
Jul 25, 2024

Conversation

fivetran-joemarkiewicz
Copy link
Contributor

@fivetran-joemarkiewicz fivetran-joemarkiewicz commented Jul 23, 2024

PR Overview

This PR will address the following Issue/Feature: Internally Created Issue

This PR will result in the following new package version: v1.9.0

This is technically not a breaking change; however, this does include a new model stg_fivetran_platform__incremental_mar_tmp being created in the destination. Therefore, I would prefer we make this a breaking change to highlight the new model and the resulting updates to the underlying connector schema.

Please provide the finalized CHANGELOG entry which details the relevant changes included in this PR:

🚨 Schema Changes 🚨

  • Following the July 2024 Fivetran Platform connector update, the connector_name field has been added to the incremental_mar source table. As a result, the following changes have been applied:
    • A new tmp model stg_fivetran_platform__incremental_mar_tmp has been created. This is necessary to ensure column consistency in downstream incremental_mar models.
    • The get_incremental_mar_columns() macro has been added to ensure all required columns are present in the stg_fivetran_platform__incremental_mar model.
    • The stg_fivetran_platform__incremental_mar has been updated to reference both the aforementioned tmp model and macro to fill empty fields if any required field is not present in the source.
    • The connector_name field in the stg_fivetran_platform__incremental_mar model is now defined by: coalesce(connector_name, connector_id). This ensures the data model will use the appropriate field to define the connector_name.

Under the Hood

  • Updated integration test seed data within integration_tests/seeds/incremental_mar.csv to ensure new code updates are working as expected.

PR Checklist

Basic Validation

Please acknowledge that you have successfully performed the following commands locally:

  • dbt run –full-refresh && dbt test
  • dbt run (if incremental models are present) && dbt test

Before marking this PR as "ready for review" the following have been applied:

  • The appropriate issue has been linked, tagged, and properly assigned
  • All necessary documentation and version upgrades have been applied
  • docs were regenerated (unless this PR does not include any code or yml updates)
  • BuildKite integration tests are passing
  • Detailed validation steps have been provided below

Detailed Validation

Please share any and all of your validation steps:

Please see the validations below for confirmation that these changes will not result in unexpected changes and the core functionality of the models remain the same.

image

Additionally, I have been able to verify that this model works on a schema that has both the connector_id and connector_name (emulates connectors setup before July 2024) and also works on connectors setup after July 2024 which only have connector_name.

  • Pre July 2024

    • image
    • image
  • Post July 2024

  • image

  • image

If you had to summarize this PR in an emoji, which would it be?

✍️

Sorry, something went wrong.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
@fivetran-joemarkiewicz fivetran-joemarkiewicz marked this pull request as ready for review July 24, 2024 14:41
@@ -17,7 +28,7 @@ fields as (
updated_at,
_fivetran_synced,
incremental_rows
from base
from fields
)

select *
Copy link
Contributor

@fivetran-avinash fivetran-avinash Jul 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This query is still pulling from the result of the fill_staging_columns macro, so the coalesce operation is not being applied. You will want to update this part to:

select * 
from final

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great catch, it seems this wasn't an issue because the fill_staging_columns macro has the connector_name regardless and therefore wasn't erroring out. Thanks for catching this and just updated.

- The `get_incremental_mar_columns()` macro has been added to ensure all required columns are present in the `stg_fivetran_platform__incremental_mar` model.
- The `stg_fivetran_platform__incremental_mar` has been updated to reference both the aforementioned tmp model and macro to fill empty fields if any required field is not present in the source.
- The `connector_name` field in the `stg_fivetran_platform__incremental_mar` model is now defined by: `coalesce(connector_name, connector_id)`. This ensures the data model will use the appropriate field to define the `connector_name`.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth calling out the seed changes to incremental_mar for validating that the data comes through properly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure thing, added.

Copy link
Contributor

@fivetran-avinash fivetran-avinash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fivetran-joemarkiewicz Looks good (with one small comment)! There was one validation test failure on integration tests for consistency__mar_table_history, but that was to be expected based on some of the seed file modifications. Validation tests also worked on the pre July 2024 connector schema.

I was wondering if we wanted an additional validation test for this, but couldn't think of one off the top and it seems the existing ones do cover the tables impacted. Will leave it to your discretion.

@fivetran-joemarkiewicz
Copy link
Contributor Author

@fivetran-joemarkiewicz Looks good (with one small comment)! There was one validation test failure on integration tests for consistency__mar_table_history, but that was to be expected based on some of the seed file modifications. Validation tests also worked on the pre July 2024 connector schema.

I was wondering if we wanted an additional validation test for this, but couldn't think of one off the top and it seems the existing ones do cover the tables impacted. Will leave it to your discretion.

Thanks for reviewing @fivetran-avinash! Yeah that is expected to fail this time around due to the seed changes, but the test does succeed when running with live data (seen in the validation confirmation screenshots in the PR details). I also would not add a validation for this update since as you mentioned the other tests cover any validations from this code update. The benefit of these validations is we can rely on them moving forward! If there were any issues they would pop up in the consistency and integrity tests which currently exist.

Thanks again for reviewing and applied the updates based off your comments!

@@ -20,7 +20,9 @@ sources:
Each measurement is calculated cumulatively for the month and includes all types of mar (paid, free, etc.)
columns:
- name: connector_id
description: The *name* of the connector being measured. Note - this is erroneously named and will be fixed soon by Fivetran.
description: The *name* of the connector being measured. This may still be present in older Fivetran Platform connectors. However, all connectors setup after July 2024 will not have this field as it has been replaced with connector_name.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we still call out that it's misnamed so that people don't assume connector_id in the other tables mean the same thing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think that would be a good callout to still mention. Just updated the documentation and regen'd the docs.

@fivetran-joemarkiewicz fivetran-joemarkiewicz merged commit b82e78e into main Jul 25, 2024
10 checks passed
@fivetran-joemarkiewicz fivetran-joemarkiewicz deleted the feature/connector-name-update branch July 25, 2024 17:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants