Merge pull request #12 from fivetran/MagicBot/add-union-schema
Feature: Union schema compatibility
fivetran-catfritz authored Oct 12, 2023
2 parents 5ccb503 + 08e2506 commit b12cf3f
Showing 27 changed files with 233 additions and 54 deletions.
3 changes: 2 additions & 1 deletion .buildkite/hooks/pre-command
@@ -21,4 +21,5 @@ export CI_SNOWFLAKE_DBT_USER=$(gcloud secrets versions access latest --secret="C
export CI_SNOWFLAKE_DBT_WAREHOUSE=$(gcloud secrets versions access latest --secret="CI_SNOWFLAKE_DBT_WAREHOUSE" --project="dbt-package-testing-363917")
export CI_DATABRICKS_DBT_HOST=$(gcloud secrets versions access latest --secret="CI_DATABRICKS_DBT_HOST" --project="dbt-package-testing-363917")
export CI_DATABRICKS_DBT_HTTP_PATH=$(gcloud secrets versions access latest --secret="CI_DATABRICKS_DBT_HTTP_PATH" --project="dbt-package-testing-363917")
export CI_DATABRICKS_DBT_TOKEN=$(gcloud secrets versions access latest --secret="CI_DATABRICKS_DBT_TOKEN" --project="dbt-package-testing-363917")
export CI_DATABRICKS_DBT_TOKEN=$(gcloud secrets versions access latest --secret="CI_DATABRICKS_DBT_TOKEN" --project="dbt-package-testing-363917")
export CI_DATABRICKS_DBT_CATALOG=$(gcloud secrets versions access latest --secret="CI_DATABRICKS_DBT_CATALOG" --project="dbt-package-testing-363917")
3 changes: 2 additions & 1 deletion .buildkite/pipeline.yml
@@ -58,7 +58,7 @@ steps:
commands: |
bash .buildkite/scripts/run_models.sh redshift
- label: ":bricks: Run Tests - Databricks"
- label: ":databricks: Run Tests - Databricks"
key: "run_dbt_databricks"
plugins:
- docker#v3.13.0:
@@ -69,5 +69,6 @@ steps:
- "CI_DATABRICKS_DBT_HOST"
- "CI_DATABRICKS_DBT_HTTP_PATH"
- "CI_DATABRICKS_DBT_TOKEN"
- "CI_DATABRICKS_DBT_CATALOG"
commands: |
bash .buildkite/scripts/run_models.sh databricks
25 changes: 25 additions & 0 deletions CHANGELOG.md
@@ -1,3 +1,28 @@
# dbt_tiktok_ads_source v0.5.0
[PR #12](https://github.com/fivetran/dbt_tiktok_ads_source/pull/12) includes the following updates:
## Breaking changes
- Updated the source identifier format for consistency with other packages and for compatibility with the `fivetran_utils.union_data` macro. The identifiers are now:

| current | previous |
|----------|----------|
|tiktok_ads_adgroup_history_identifier | tiktok_ads__ad_group_history_identifier |
|tiktok_ads_ad_history_identifier | tiktok_ads__ad_history_identifier|
|tiktok_ads_advertiser_identifier | tiktok_ads__advertiser_identifier|
|tiktok_ads_campaign_history_identifier | tiktok_ads__campaign_history_identifier|
|tiktok_ads_ad_report_hourly_identifier | tiktok_ads__ad_report_hourly_identifier|
|tiktok_ads_adgroup_report_hourly_identifier | tiktok_ads__ad_group_report_hourly_identifier|
|tiktok_ads_campaign_report_hourly_identifier | tiktok_ads__campaign_report_hourly_identifier|

- If you are using any of the previous identifiers, be sure to update them to the current names!
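
As a minimal sketch of what this rename can look like in a root `dbt_project.yml`, assuming the project overrides a few identifiers: the table names on the right are hypothetical placeholders, and only three of the seven variables are shown.

```yml
# Root dbt_project.yml: a sketch of the rename, not an exhaustive migration.
# Table names on the right are hypothetical placeholders.
vars:
  # previous name: tiktok_ads__ad_group_history_identifier
  tiktok_ads_adgroup_history_identifier: "my_adgroup_history"
  # previous name: tiktok_ads__ad_group_report_hourly_identifier
  tiktok_ads_adgroup_report_hourly_identifier: "my_adgroup_report_hourly"
  # previous name: tiktok_ads__advertiser_identifier
  tiktok_ads_advertiser_identifier: "my_advertiser"
```

Note that the ad group variables change in two places: the double underscore becomes a single underscore, and `ad_group` becomes `adgroup`.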

## Feature update 🎉
- Unioning capability! This adds the ability to union source data from multiple tiktok_ads connectors. Refer to the [Union Multiple Connectors README section](https://github.com/fivetran/dbt_tiktok_ads_source/blob/main/README.md#union-multiple-connectors) for more details.

## Under the hood 🚘
- Updated tmp models to union source data using the `fivetran_utils.union_data` macro.
- To distinguish which source each field comes from, added `source_relation` column in each staging model and applied the `fivetran_utils.source_relation` macro.
- Updated tests to account for the new `source_relation` column.

# dbt_tiktok_ads_source v0.4.0
[PR #10](https://github.com/fivetran/dbt_tiktok_ads_source/pull/10) applies the following updates:
## 🚨 Breaking Changes 🚨
16 changes: 12 additions & 4 deletions README.md
@@ -40,7 +40,7 @@ If you are **not** using the [Tiktok transformation package](https://github.com
```yaml
packages:
- package: fivetran/tiktok_ads_source
version: [">=0.4.0", "<0.5.0"]
version: [">=0.5.0", "<0.6.0"]
```

## Step 3: Define database and schema variables
@@ -53,7 +53,17 @@ vars:
```

## (Optional) Step 4: Additional configurations
<details><summary>Expand for configurations</summary>
### Union multiple connectors
If you have multiple tiktok_ads connectors in Fivetran and would like to use this package on all of them simultaneously, you can do so. The package unions all of the data together and passes the unioned table into the transformations. The `source_relation` column of each model shows which source each record came from. To use this functionality, set either the `tiktok_ads_union_schemas` or the `tiktok_ads_union_databases` variable (not both) in your root `dbt_project.yml` file:

```yml
vars:
tiktok_ads_union_schemas: ['tiktok_ads_usa','tiktok_ads_canada'] # use this if the data is in different schemas/datasets of the same database/project
tiktok_ads_union_databases: ['tiktok_ads_usa','tiktok_ads_canada'] # use this if the data is in different databases/projects but uses the same schema name
```
Please be aware that the native `source.yml` connection set up in the package will not function when the union schema/database feature is utilized. Although the data will be correctly combined, you will not observe the sources linked to the package models in the Directed Acyclic Graph (DAG). This happens because the package includes only one defined `source.yml`.

To connect your multiple schema/database sources to the package models, follow the steps outlined in the [Union Data Defined Sources Configuration](https://github.com/fivetran/dbt_fivetran_utils/tree/releases/v0.4.latest#union_data-source) section of the Fivetran Utils documentation for the union_data macro. This will ensure a proper configuration and correct visualization of connections in the DAG.
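
As a minimal sketch of the either/or rule above, a project whose connectors write to separate schemas of the same database would set only the schema variable. The schema names are the placeholder values from the README example, not real schemas.

```yml
# Root dbt_project.yml: set exactly one of the two union variables.
vars:
  tiktok_ads_union_schemas: ['tiktok_ads_usa', 'tiktok_ads_canada']
  # tiktok_ads_union_databases is intentionally left unset in this sketch
```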

### Passing Through Additional Metrics
By default, this package will select `clicks`, `impressions`, and `cost` from the source reporting tables to store into the staging models. If you would like to pass through additional metrics to the staging models, add the below configurations to your `dbt_project.yml` file. These variables allow for the pass-through fields to be aliased (`alias`) if desired, but not required. Use the below format for declaring the respective pass-through variables:
@@ -93,8 +103,6 @@ vars:
tiktok_ads_<default_source_table_name>_identifier: your_table_name
```

</details>

## (Optional) Step 5: Orchestrate your models with Fivetran Transformations for dbt Core™
<details><summary>Expand for more details</summary>

2 changes: 1 addition & 1 deletion dbt_project.yml
@@ -1,5 +1,5 @@
name: 'tiktok_ads_source'
version: '0.4.0'
version: '0.5.0'
config-version: 2
require-dbt-version: [">=1.3.0", "<2.0.0"]
models:
2 changes: 1 addition & 1 deletion docs/catalog.json

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion docs/manifest.json

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion docs/run_results.json

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions integration_tests/ci/sample.profiles.yml
@@ -45,10 +45,10 @@ integration_tests:
schema: tiktok_ads_source_integration_tests_2
threads: 8
databricks:
catalog: null
catalog: "{{ env_var('CI_DATABRICKS_DBT_CATALOG') }}"
host: "{{ env_var('CI_DATABRICKS_DBT_HOST') }}"
http_path: "{{ env_var('CI_DATABRICKS_DBT_HTTP_PATH') }}"
schema: tiktok_ads_source_integration_tests_2
threads: 2
threads: 8
token: "{{ env_var('CI_DATABRICKS_DBT_TOKEN') }}"
type: databricks
16 changes: 8 additions & 8 deletions integration_tests/dbt_project.yml
@@ -1,18 +1,18 @@
name: 'tiktok_ads_source_integration_tests'
version: '0.4.0'
version: '0.5.0'
profile: 'integration_tests'
config-version: 2

vars:
tiktok_ads_source:
tiktok_ads_schema: tiktok_ads_source_integration_tests_2
tiktok_ads__ad_group_history_identifier: "tiktok_adgroup_history_data"
tiktok_ads__ad_history_identifier: "tiktok_ad_history_data"
tiktok_ads__advertiser_identifier: "tiktok_advertiser_data"
tiktok_ads__campaign_history_identifier: "tiktok_campaign_history_data"
tiktok_ads__ad_report_hourly_identifier: "tiktok_ad_report_hourly_data"
tiktok_ads__ad_group_report_hourly_identifier: "tiktok_adgroup_report_hourly_data"
tiktok_ads__campaign_report_hourly_identifier: "tiktok_campaign_report_hourly_data"
tiktok_ads_adgroup_history_identifier: "tiktok_adgroup_history_data"
tiktok_ads_ad_history_identifier: "tiktok_ad_history_data"
tiktok_ads_advertiser_identifier: "tiktok_advertiser_data"
tiktok_ads_campaign_history_identifier: "tiktok_campaign_history_data"
tiktok_ads_ad_report_hourly_identifier: "tiktok_ad_report_hourly_data"
tiktok_ads_adgroup_report_hourly_identifier: "tiktok_adgroup_report_hourly_data"
tiktok_ads_campaign_report_hourly_identifier: "tiktok_campaign_report_hourly_data"


seeds:
3 changes: 3 additions & 0 deletions models/docs.md
@@ -0,0 +1,3 @@
{% docs source_relation %}
The source of the record if the unioning functionality is being used. If not, this field will be empty.
{% enddocs %}
16 changes: 8 additions & 8 deletions models/src_tiktok_ads.yml
@@ -1,7 +1,7 @@
version: 2

sources:
- name: tiktok_ads
- name: tiktok_ads # This source will only be used if you are using a single tiktok_ads source connector. If multiple sources are being unioned, their tables will be directly referenced via adapter.get_relation.
database: "{% if target.type != 'spark' %}{{var('tiktok_ads_database', target.database)}}{% endif %}"
schema: "{{var('tiktok_ads_schema', 'tiktok_ads')}}"

@@ -17,7 +17,7 @@ sources:

tables:
- name: advertiser
identifier: "{{ var('tiktok_ads__advertiser_identifier', 'advertiser') }}"
identifier: "{{ var('tiktok_ads_advertiser_identifier', 'advertiser') }}"
description: Each record represents data for one advertiser.
columns:
- name: id
@@ -82,7 +82,7 @@ sources:
description: Timestamp of when Fivetran synced a record.

- name: campaign_history
identifier: "{{ var('tiktok_ads__campaign_history_identifier', 'campaign_history') }}"
identifier: "{{ var('tiktok_ads_campaign_history_identifier', 'campaign_history') }}"
description: Each record represents a version of a TikTok campaign.
columns:
- name: campaign_id
@@ -113,7 +113,7 @@ sources:
description: Split Test variables. Optional values; TARGETING, BIDDING_OPTIMIZATION, CREATIVE.

- name: adgroup_history
identifier: "{{ var('tiktok_ads__ad_group_history_identifier', 'adgroup_history') }}"
identifier: "{{ var('tiktok_ads_adgroup_history_identifier', 'adgroup_history') }}"
description: Each record represents a version of a TikTok ad group.
columns:
- name: adgroup_id
@@ -272,7 +272,7 @@ sources:
description: Whether users can download your video ads on TikTok (cannot be updated once created).

- name: ad_history
identifier: "{{ var('tiktok_ads__ad_history_identifier', 'ad_history') }}"
identifier: "{{ var('tiktok_ads_ad_history_identifier', 'ad_history') }}"
description: Each record represents a version of a TikTok ad.
columns:
- name: ad_id
@@ -327,7 +327,7 @@ sources:
description: The video ID.

- name: ad_report_hourly
identifier: "{{ var('tiktok_ads__ad_report_hourly_identifier', 'ad_report_hourly') }}"
identifier: "{{ var('tiktok_ads_ad_report_hourly_identifier', 'ad_report_hourly') }}"
description: Each record represents data for each ad for each hour.
columns:
- name: ad_id
@@ -445,7 +445,7 @@ sources:
description: Timestamp of when Fivetran synced a record.

- name: campaign_report_hourly
identifier: "{{ var('tiktok_ads__campaign_report_hourly_identifier', 'campaign_report_hourly') }}"
identifier: "{{ var('tiktok_ads_campaign_report_hourly_identifier', 'campaign_report_hourly') }}"
description: Each record represents data for each campaign for each hour.
columns:
- name: campaign_id
@@ -565,7 +565,7 @@ sources:
description: Timestamp of when Fivetran synced a record.

- name: adgroup_report_hourly
identifier: "{{ var('tiktok_ads__ad_group_report_hourly_identifier', 'adgroup_report_hourly') }}"
identifier: "{{ var('tiktok_ads_adgroup_report_hourly_identifier', 'adgroup_report_hourly') }}"
description: Each record represents data for each ad group for each hour.
columns:
- name: adgroup_id
28 changes: 25 additions & 3 deletions models/stg_tiktok_ads.yml
@@ -1,15 +1,19 @@


version: 2

models:
- name: stg_tiktok_ads__advertiser
description: Each record represents data for each advertiser.
tests:
- dbt_utils.unique_combination_of_columns:
combination_of_columns:
- source_relation
- advertiser_id
columns:
- name: source_relation
description: "{{ doc('source_relation') }}"
- name: advertiser_id
description: Advertiser ID
tests:
- unique
- not_null
- name: address
description: Advertiser address information
@@ -52,9 +56,12 @@ models:
tests:
- dbt_utils.unique_combination_of_columns:
combination_of_columns:
- source_relation
- ad_group_id
- updated_at
columns:
- name: source_relation
description: "{{ doc('source_relation') }}"
- name: ad_group_id
description: Ad group ID
tests:
@@ -103,9 +110,12 @@ models:
tests:
- dbt_utils.unique_combination_of_columns:
combination_of_columns:
- source_relation
- ad_id
- updated_at
columns:
- name: source_relation
description: "{{ doc('source_relation') }}"
- name: ad_id
description: Ad ID
tests:
@@ -153,9 +163,12 @@ models:
tests:
- dbt_utils.unique_combination_of_columns:
combination_of_columns:
- source_relation
- campaign_id
- updated_at
columns:
- name: source_relation
description: "{{ doc('source_relation') }}"
- name: campaign_id
description: Campaign ID
tests:
@@ -179,9 +192,12 @@ models:
tests:
- dbt_utils.unique_combination_of_columns:
combination_of_columns:
- source_relation
- ad_id
- stat_time_hour
columns:
- name: source_relation
description: "{{ doc('source_relation') }}"
- name: ad_id
description: Ad id
tests:
@@ -254,9 +270,12 @@ models:
tests:
- dbt_utils.unique_combination_of_columns:
combination_of_columns:
- source_relation
- ad_group_id
- stat_time_hour
columns:
- name: source_relation
description: "{{ doc('source_relation') }}"
- name: ad_group_id
description: Ad group id
tests:
@@ -329,9 +348,12 @@ models:
tests:
- dbt_utils.unique_combination_of_columns:
combination_of_columns:
- source_relation
- campaign_id
- stat_time_hour
columns:
- name: source_relation
description: "{{ doc('source_relation') }}"
- name: campaign_id
description: Campaign id
tests:
9 changes: 8 additions & 1 deletion models/stg_tiktok_ads__ad_group_history.sql
@@ -16,12 +16,19 @@ fields as (
)
}}


{{ fivetran_utils.source_relation(
union_schema_variable='tiktok_ads_union_schemas',
union_database_variable='tiktok_ads_union_databases')
}}

from base
),

final as (

select
source_relation,
adgroup_id as ad_group_id,
cast(updated_at as {{ dbt.type_timestamp() }}) as updated_at,
advertiser_id,
@@ -40,7 +47,7 @@ final as (
gender,
languages,
landing_page_url,
row_number() over (partition by adgroup_id order by updated_at desc) = 1 as is_most_recent_record
row_number() over (partition by source_relation, adgroup_id order by updated_at desc) = 1 as is_most_recent_record
from fields
)

9 changes: 8 additions & 1 deletion models/stg_tiktok_ads__ad_group_report_hourly.sql
@@ -15,12 +15,19 @@ fields as (
staging_columns=get_ad_group_report_hourly_columns()
)
}}

{{ fivetran_utils.source_relation(
union_schema_variable='tiktok_ads_union_schemas',
union_database_variable='tiktok_ads_union_databases')
}}

from base
),

final as (

select
select
source_relation,
adgroup_id as ad_group_id,
cast(stat_time_hour as {{ dbt.type_timestamp() }}) as stat_time_hour,
cpc,
11 changes: 9 additions & 2 deletions models/stg_tiktok_ads__ad_history.sql
@@ -16,12 +16,19 @@ fields as (
)
}}


{{ fivetran_utils.source_relation(
union_schema_variable='tiktok_ads_union_schemas',
union_database_variable='tiktok_ads_union_databases')
}}

from base
),

final as (

select
select
source_relation,
ad_id,
cast(updated_at as {{ dbt.type_timestamp() }}) as updated_at,
adgroup_id as ad_group_id,
@@ -40,7 +47,7 @@ final as (
{{ dbt_utils.get_url_parameter('landing_page_url', 'utm_content') }} as utm_content,
{{ dbt_utils.get_url_parameter('landing_page_url', 'utm_term') }} as utm_term,
landing_page_url,
row_number() over (partition by ad_id order by updated_at desc) = 1 as is_most_recent_record
row_number() over (partition by source_relation, ad_id order by updated_at desc) = 1 as is_most_recent_record
from fields
)
