docs, model updates, add tests, add documentation (#16)
* docs, model updates, add tests, add documentation

* add explicit tests for validation test fix, add changelog

* docs and update

* add decision log entry for different grains

* bk

* vertical integration test

* packages.yml

---------

Co-authored-by: Jamie Rodriguez <[email protected]>
fivetran-reneeli and fivetran-jamie authored Oct 29, 2024
1 parent 4d64aab commit fc762c0
Showing 32 changed files with 841 additions and 61 deletions.
4 changes: 3 additions & 1 deletion .gitignore
Original file line number Diff line number Diff line change
@@ -3,4 +3,6 @@ target/
dbt_modules/
logs/
.DS_Store
dbt_packages/
dbt_packages/
integration_tests/package-lock.yml
integration_tests/.DS_Store
17 changes: 17 additions & 0 deletions CHANGELOG.md
@@ -1,3 +1,20 @@
# dbt_amazon_ads v0.4.0
[PR #16](https://github.com/fivetran/dbt_amazon_ads/pull/16) includes the following updates:

## Feature Update: Conversions Support
- We have added conversion metrics to each of the end models by default.
- The conversion metrics are the following:
  - `purchases_30_d`: Number of attributed conversion events occurring within 30 days of an ad click.
  - `sales_30_d`: Total value of sales occurring within 30 days of an ad click.
- To bring in other conversion fields (`purchases_same_sku_30_d`, `sales_14_d`, etc.), please refer to our [passthrough column variables](https://github.com/fivetran/dbt_amazon_ads_source?tab=readme-ov-file#passing-through-additional-metrics).

> The above new field additions are **breaking changes** for users who were not already bringing in conversion fields via passthrough columns.
## Under the Hood
- Created the `amazon_ads_persist_pass_through_columns` macro to ensure that the new conversion fields are backwards compatible for users who have already included them via [passthrough columns](https://github.com/fivetran/dbt_amazon_ads?tab=readme-ov-file#passing-through-additional-metrics). The package will dynamically avoid "duplicate column" errors.
- Added integrity and consistency validation tests within `integration_tests` folder for the transformation models (to be used by maintainers only).
- Added documentation explaining potential discrepancies across reporting grains. See the [DECISIONLOG.md](https://github.com/fivetran/dbt_amazon_ads/blob/main/DECISIONLOG.md).
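
To illustrate the passthrough route described in this changelog entry, a minimal sketch of a root `dbt_project.yml` might look like the following. The variable name and the `name` key follow the README and the integration tests configuration in this commit; the particular fields chosen are only an example, and the comments restate the changelog's claims rather than verified behavior.

```yml
vars:
  amazon_ads__campaign_passthrough_metrics:
    - name: "sales_14_d"      # extra conversion field that is not included by default
    - name: "purchases_30_d"  # now a default field; per the changelog, the package avoids a duplicate column
```

Users who already listed `purchases_30_d` or `sales_30_d` before v0.4.0 should, according to the entry above, be able to leave their existing configuration in place.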

# dbt_amazon_ads v0.3.0
[PR #11](https://github.com/fivetran/dbt_amazon_ads/pull/11) includes the following updates:
## Feature update 🎉
6 changes: 6 additions & 0 deletions DECISIONLOG.md
@@ -8,3 +8,9 @@
- Portfolios:
  - Grain is narrower than Accounts but broader than Campaigns
  - This is a newer, optional feature, so not all advertisers utilize portfolios. Even when portfolios are used, not all campaigns may be assigned to one. Arguably this report may not be entirely necessary; however, since portfolios are a budgeting aid, we wanted to include a report at this grain.


## Why don't metrics add up across different grains (e.g., ad level vs. campaign level)?
When aggregating metrics like clicks and spend across different grains, discrepancies can arise from differences in how data is captured, grouped, or attributed at each grain. For example, certain actions or costs might be attributed differently at the ad, ad group, or campaign level, leading to inconsistencies when rolled up. At the keyword grain, a keyword can belong to multiple ad groups, so aggregations can lead to overcounting. Conversely, some ads may only be represented at the ad group level rather than at the individual ad level, leading to undercounting at the ad grain.

This is one reason we have broken out the ad reporting packages into separate hierarchical end models (Ad, Ad Group, Campaign, and more): if we only used ad-level reports, we could be missing data.
31 changes: 21 additions & 10 deletions README.md
@@ -49,13 +49,13 @@ dispatch:
search_order: ['spark_utils', 'dbt_utils']
```
### Step 2: Install the package
Include the following amazon_ads package version in your `packages.yml` file:
### Step 2: Install the package (skip if also using the `ad_reporting` combo package)
Include the following amazon_ads package version in your `packages.yml` file _if_ you are not also using the upstream [Ad Reporting combination package](https://github.com/fivetran/dbt_ad_reporting):
> TIP: Check [dbt Hub](https://hub.getdbt.com/) for the latest installation instructions or [read dbt's Package Management documentation](https://docs.getdbt.com/docs/package-management) for more information on installing packages.
```yaml
packages:
- package: fivetran/amazon_ads
version: [">=0.3.0", "<0.4.0"] # we recommend using ranges to capture non-breaking changes automatically
version: [">=0.4.0", "<0.5.0"] # we recommend using ranges to capture non-breaking changes automatically
```

Do NOT include the `amazon_ads_source` package in this file. The transformation package itself has a dependency on it and will install the source package as well.
@@ -79,6 +79,8 @@ vars:
```

### (Optional) Step 5: Additional configurations
<details open><summary>Expand/Collapse details</summary>

#### Union multiple connectors
If you have multiple amazon_ads connectors in Fivetran and would like to use this package on all of them simultaneously, we have provided functionality to do so. The package will union all of the data together and pass the unioned table into the transformations. You will be able to see which source it came from in the `source_relation` column of each model. To use this functionality, you will need to set either the `amazon_ads_union_schemas` OR `amazon_ads_union_databases` variables (cannot do both) in your root `dbt_project.yml` file:

@@ -92,9 +94,9 @@ vars:
To connect your multiple schema/database sources to the package models, follow the steps outlined in the [Union Data Defined Sources Configuration](https://github.com/fivetran/dbt_fivetran_utils/tree/releases/v0.4.latest#union_data-source) section of the Fivetran Utils documentation for the union_data macro. This will ensure a proper configuration and correct visualization of connections in the DAG.

#### Passing Through Additional Metrics
By default, this package will select `clicks`, `impressions`, and `cost` from the source reporting tables to store into the staging models. If you would like to pass through additional metrics to the staging models, add the following configurations to your `dbt_project.yml` file. These variables allow the pass-through fields to be aliased (`alias`) if desired, but not required. Use the following format for declaring the respective pass-through variables:
By default, this package will select `clicks`, `impressions`, `cost`, `purchases_30_d`, and `sales_30_d` from the source reporting tables to store into the staging and end models. If you would like to pass through additional metrics to the package models, add the following configurations to your `dbt_project.yml` file. These variables allow the pass-through fields to be aliased (`alias`) if desired, but not required. Use the following format for declaring the respective pass-through variables:

> **Note** Ensure that you exercised due diligence when adding metrics to these models. The metrics added by default (clicks, impressions, and cost) have been vetted by the Fivetran team maintaining this package for accuracy. There are metrics included within the source reports, for example, metric averages, which may be inaccurately represented at the grain for reports created in this package. You want to ensure whichever metrics you pass through are indeed appropriate to aggregate at the respective reporting levels provided in this package.
> **Note** Make sure to exercise due diligence when adding metrics to these models. The metrics added by default have been vetted by the Fivetran team maintaining this package for accuracy. There are metrics included within the source reports, for example, metric averages, which may be inaccurately represented at the grain for reports created in this package. You want to ensure whichever metrics you pass through are indeed appropriate to aggregate at the respective reporting levels provided in this package.

```yml
vars:
@@ -103,10 +105,11 @@ vars:
alias: "custom_field"
amazon_ads__ad_group_passthrough_metrics:
- name: "unique_string_field"
alias: "field_id"
transform_sql: "coalesce(unique_string_field, 'NA')"
amazon_ads__advertised_product_passthrough_metrics:
- name: "new_custom_field"
alias: "custom_field"
transform_sql: "coalesce(custom_field, 'NA')" # reference alias in transform_sql if aliasing
- name: "a_second_field"
amazon_ads__targeting_keyword_passthrough_metrics:
- name: "this_field"
@@ -116,7 +119,7 @@ vars:
```

#### Changing the Build Schema
By default, this package will build the Amazon_ads staging models within a schema titled (<target_schema> + `amazon_ads_source`) in your destination. If this is not where you would like your Amazon Ads staging data to be written, add the following configuration to your root `dbt_project.yml` file:
By default, this package will build the Amazon Ads staging models (11 views, 11 tables) within a schema titled (<target_schema> + `amazon_ads_source`) and the Amazon Ads intermediate (1 view) and end models (7 tables) within a schema titled (<target_schema> + `amazon_ads`) in your destination. If this is not where you would like your Amazon Ads staging and modeling data to be written, add the following configuration to your root `dbt_project.yml` file:

```yml
models:
@@ -127,14 +130,17 @@ models:
```

#### Change the source table references
If an individual source table has a different name than the package expects, add the table name as it appears in your destination to the respective variable:
If an individual source table has a different name than the package expects, add the table name as it appears in your destination to the respective variable. This is not available when running the package on multiple unioned connectors.

> IMPORTANT: See this project's [`dbt_project.yml`](https://github.com/fivetran/dbt_amazon_ads_source/blob/main/dbt_project.yml) variable declarations to see the expected names.

```yml
vars:
amazon_ads_<default_source_table_name>_identifier: your_table_name
```
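
As a concrete instance of the pattern above, the integration tests project touched by this same commit points the targeting keyword report at a seed table. A sketch of the equivalent setting in a root `dbt_project.yml` follows; the table name is the one used by the integration tests and would be replaced by whatever name appears in your destination.

```yml
vars:
  # <default_source_table_name> is targeting_keyword_report here
  amazon_ads_targeting_keyword_report_identifier: "targeting_keyword_report_data"
```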

</details>

### (Optional) Step 6: Orchestrate your models with Fivetran Transformations for dbt Core™
<details><summary>Expand for more details</summary>

@@ -149,7 +155,7 @@ This dbt package is dependent on the following dbt packages. Be aware that these
```yml
packages:
- package: fivetran/amazon_ads_source
version: [">=0.3.0", "<0.4.0"]
version: [">=0.4.0", "<0.5.0"]
- package: fivetran/fivetran_utils
version: [">=0.4.0", "<0.5.0"]
@@ -165,10 +171,15 @@ The Fivetran team maintaining this package _only_ maintains the latest version o
In creating this package, which is meant for a wide range of use cases, we had to take opinionated stances on certain decisions, such as logic choices or column selection. Therefore, we have documented significant choices in the [DECISIONLOG.md](https://github.com/fivetran/dbt_amazon_ads/blob/main/DECISIONLOG.md) and will continue to update this as the package evolves. We are always open to and encourage feedback on these choices and the package in general.

### Contributions
A small team of analytics engineers at Fivetran develops these dbt packages. However, these packages are made better by community contributions.
A small team of analytics engineers at Fivetran develops these dbt packages. However, the packages are made better by community contributions.

We highly encourage and welcome contributions to this package. Check out [this dbt Discourse article](https://discourse.getdbt.com/t/contributing-to-a-dbt-package/657) on the best workflow for contributing to a package.

#### Contributors
We thank [everyone](https://github.com/fivetran/amazon_ads/graphs/contributors) who has taken the time to contribute. Each PR, bug report, and feature request has made this package better and is truly appreciated.

A special thank you to [Seer Interactive](https://www.seerinteractive.com/?utm_campaign=Fivetran%20%7C%20Models&utm_source=Fivetran&utm_medium=Fivetran%20Documentation), with whom we closely collaborated to introduce native conversion support to our Ad packages.

## Are there any resources available?
- If you have questions or want to reach out for help, refer to the [GitHub Issue](https://github.com/fivetran/dbt_amazon_ads/issues/new/choose) section to find the right avenue of support for you.
- If you would like to provide feedback to the dbt package team at Fivetran or would like to request a new dbt package, fill out our [Feedback Form](https://www.surveymonkey.com/r/DQ7K7WW).
2 changes: 1 addition & 1 deletion dbt_project.yml
@@ -1,5 +1,5 @@
name: 'amazon_ads'
version: '0.3.0'
version: '0.4.0'
config-version: 2
require-dbt-version: [">=1.3.0", "<2.0.0"]
models:
2 changes: 1 addition & 1 deletion docs/catalog.json

Large diffs are not rendered by default.

24 changes: 12 additions & 12 deletions docs/index.html

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion docs/manifest.json

Large diffs are not rendered by default.

1 change: 0 additions & 1 deletion docs/run_results.json

This file was deleted.

18 changes: 13 additions & 5 deletions integration_tests/dbt_project.yml
@@ -1,7 +1,7 @@
config-version: 2

name: 'amazon_ads_integration_tests'
version: '0.3.0'
version: '0.4.0'

profile: 'integration_tests'

@@ -20,6 +20,15 @@ vars:
amazon_ads_targeting_keyword_report_identifier: "targeting_keyword_report_data"
amazon_ads_search_term_ad_keyword_report_identifier: "search_term_ad_keyword_report_data"

amazon_ads__campaign_passthrough_metrics:
- name: sales_7_d
- name: purchases_30_d
alias: purchases_alias
- name: purchases_14_d

models:
+schema: "amazon_ads_{{ var('directed_schema','dev') }}"

dispatch:
- macro_namespace: dbt_utils
search_order: ['spark_utils', 'dbt_utils']
@@ -32,7 +41,6 @@ seeds:
campaign_budget_amount: "float"
click_through_rate: "float"
keyword_bid: "float"

dispatch:
- macro_namespace: dbt_utils
search_order: ['spark_utils', 'dbt_utils']
clicks: "float"
impressions: "float"
cost: "float"
2 changes: 1 addition & 1 deletion integration_tests/packages.yml
@@ -1,2 +1,2 @@
packages:
- local: ../
- local: ../
22 changes: 11 additions & 11 deletions integration_tests/seeds/campaign_level_report_data.csv
@@ -1,11 +1,11 @@
campaign_id,date,_fivetran_synced,campaign_applicable_budget_rule_id,campaign_applicable_budget_rule_name,campaign_bidding_strategy,campaign_budget_amount,campaign_budget_currency_code,campaign_budget_type,clicks,cost,impressions,campaign_rule_based_budget_amount
2187,2022-07-26,2022-10-14 11:54:40.980000,,,optimizeForSales,2000.0,USD,DAILY_BUDGET,0,0.0,,
2187,2022-07-21,2022-10-14 11:52:57.796000,,,optimizeForSales,2000.0,USD,DAILY_BUDGET,0,0.0,,
2187,2022-08-29,2022-10-14 11:40:45.808000,,,optimizeForSales,2000.0,USD,DAILY_BUDGET,0,0.0,,
2187,2022-08-31,2022-10-14 11:42:29.065000,,,optimizeForSales,2000.0,USD,DAILY_BUDGET,0,0.0,,
2187,2022-07-27,2022-10-14 11:54:40.980000,,,optimizeForSales,2000.0,USD,DAILY_BUDGET,0,0.0,,
2187,2022-08-28,2022-10-14 11:40:45.807000,,,optimizeForSales,2000.0,USD,DAILY_BUDGET,0,0.0,,
2187,2022-09-06,2022-10-14 11:42:29.062000,,,optimizeForSales,2000.0,USD,DAILY_BUDGET,0,0.0,,
2187,2022-07-11,2022-10-14 12:07:12.424000,,,optimizeForSales,2000.0,USD,DAILY_BUDGET0.0,2,1.67,1095,
2187,2022-07-15,2022-10-14 12:07:12.428000,,,optimizeForSales,2000.0,USD,DAILY_BUDGET,0,0.0,,
2187,2022-07-18,2022-10-14 11:52:57.802000,,,optimizeForSales,2000.0,USD,DAILY_BUDGET,0,0.0,,
campaign_id,date,_fivetran_synced,campaign_applicable_budget_rule_id,campaign_applicable_budget_rule_name,campaign_bidding_strategy,campaign_budget_amount,campaign_budget_currency_code,campaign_budget_type,clicks,cost,impressions,campaign_rule_based_budget_amount,sales_7_d,purchases_30_d,sales_30_d
2187,2022-07-26,2022-10-14 11:54:40.980000,,,optimizeForSales,2000.0,USD,DAILY_BUDGET,0,0.0,,,12,44,54
2187,2022-07-21,2022-10-14 11:52:57.796000,,,optimizeForSales,2000.0,USD,DAILY_BUDGET,0,0.0,,,2,34,345
2187,2022-08-29,2022-10-14 11:40:45.808000,,,optimizeForSales,2000.0,USD,DAILY_BUDGET,0,0.0,,,,,
2187,2022-08-31,2022-10-14 11:42:29.065000,,,optimizeForSales,2000.0,USD,DAILY_BUDGET,0,0.0,,,,,
2187,2022-07-27,2022-10-14 11:54:40.980000,,,optimizeForSales,2000.0,USD,DAILY_BUDGET,0,0.0,,,3,35,350
2187,2022-08-28,2022-10-14 11:40:45.807000,,,optimizeForSales,2000.0,USD,DAILY_BUDGET,0,0.0,,,,,
2187,2022-09-06,2022-10-14 11:42:29.062000,,,optimizeForSales,2000.0,USD,DAILY_BUDGET,0,0.0,,,4,24,335
2187,2022-07-11,2022-10-14 12:07:12.424000,,,optimizeForSales,2000.0,USD,DAILY_BUDGET0.0,2,1.67,1095,,0,64,234
2187,2022-07-15,2022-10-14 12:07:12.428000,,,optimizeForSales,2000.0,USD,DAILY_BUDGET,0,0.0,,,,,
2187,2022-07-18,2022-10-14 11:52:57.802000,,,optimizeForSales,2000.0,USD,DAILY_BUDGET,0,0.0,,,,,
55 changes: 55 additions & 0 deletions integration_tests/tests/consistency/consistency_account_report.sql
@@ -0,0 +1,55 @@
{{ config(
tags="fivetran_validations",
enabled=var('fivetran_validation_tests_enabled', false)
) }}

with prod as (
select
account_id,
sum(clicks) as clicks,
sum(impressions) as impressions,
sum(cost) as cost
{# sum(purchases_30_d) as total_purchases_30_d,
sum(sales_30_d) as total_sales_30_d #}
from {{ target.schema }}_amazon_ads_prod.amazon_ads__account_report
group by 1
),

dev as (
select
account_id,
sum(clicks) as clicks,
sum(impressions) as impressions,
sum(cost) as cost
{# sum(purchases_30_d) as total_purchases_30_d,
sum(sales_30_d) as total_sales_30_d #}
from {{ target.schema }}_amazon_ads_dev.amazon_ads__account_report
group by 1
),

final as (
select
prod.account_id,
prod.clicks as prod_clicks,
dev.clicks as dev_clicks,
prod.impressions as prod_impressions,
dev.impressions as dev_impressions,
prod.cost as prod_cost,
dev.cost as dev_cost
{# prod.purchases_30_d as prod_purchases_30_d,
dev.purchases_30_d as dev_purchases_30_d,
prod.sales_30_d as prod_sales_30_d,
dev.sales_30_d as dev_sales_30_d #}
from prod
full outer join dev
on dev.account_id = prod.account_id
)

select *
from final
where
abs(prod_clicks - dev_clicks) >= .01
or abs(prod_impressions - dev_impressions) >= .01
or abs(prod_cost - dev_cost) >= .01
{# or abs(prod_purchases_30_d - dev_purchases_30_d) >= .01
or abs(prod_sales_30_d - dev_sales_30_d) >= .01 #}
Original file line number Diff line number Diff line change
@@ -0,0 +1,55 @@
{{ config(
tags="fivetran_validations",
enabled=var('fivetran_validation_tests_enabled', false)
) }}

with prod as (
select
ad_group_id,
sum(clicks) as clicks,
sum(impressions) as impressions,
sum(cost) as cost
{# sum(purchases_30_d) as total_purchases_30_d,
sum(sales_30_d) as total_sales_30_d #}
from {{ target.schema }}_amazon_ads_prod.amazon_ads__ad_group_report
group by 1
),

dev as (
select
ad_group_id,
sum(clicks) as clicks,
sum(impressions) as impressions,
sum(cost) as cost
{# sum(purchases_30_d) as total_purchases_30_d,
sum(sales_30_d) as total_sales_30_d #}
from {{ target.schema }}_amazon_ads_dev.amazon_ads__ad_group_report
group by 1
),

final as (
select
prod.ad_group_id,
prod.clicks as prod_clicks,
dev.clicks as dev_clicks,
prod.impressions as prod_impressions,
dev.impressions as dev_impressions,
prod.cost as prod_cost,
dev.cost as dev_cost
{# prod.purchases_30_d as prod_purchases_30_d,
dev.purchases_30_d as dev_purchases_30_d,
prod.sales_30_d as prod_sales_30_d,
dev.sales_30_d as dev_sales_30_d #}
from prod
full outer join dev
on dev.ad_group_id = prod.ad_group_id
)

select *
from final
where
abs(prod_clicks - dev_clicks) >= .01
or abs(prod_impressions - dev_impressions) >= .01
or abs(prod_cost - dev_cost) >= .01
{# or abs(prod_purchases_30_d - dev_purchases_30_d) >= .01
or abs(prod_sales_30_d - dev_sales_30_d) >= .01 #}
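
The consistency tests above are disabled unless `fivetran_validation_tests_enabled` is set, and they compare two schemas whose suffixes (`amazon_ads_prod` and `amazon_ads_dev`) line up with the `directed_schema` variable added to `integration_tests/dbt_project.yml` in this commit. A hedged sketch of the variables a maintainer might set in that project is shown below. Both variable names come from this diff; the workflow of building once per `directed_schema` value is an assumption, not something the commit states.

```yml
vars:
  fivetran_validation_tests_enabled: true  # the test configs above default this to false
  directed_schema: "prod"                  # "dev" is the default; feeds +schema: "amazon_ads_{{ var('directed_schema','dev') }}"
```

Under dbt's default schema naming, a custom `+schema` is appended to the target schema, which would give `<target_schema>_amazon_ads_prod` and `<target_schema>_amazon_ads_dev`, matching the relations the two tests select from.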