Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Workday employee daily history model + monthly summary model + updates to employee surrogate key #5

Merged
merged 21 commits into from
Apr 3, 2024

Conversation

fivetran-avinash
Copy link
Contributor

@fivetran-avinash fivetran-avinash commented Feb 29, 2024

PR Overview

This PR will address the following Issue/Feature: N/A

This PR will result in the following new package version: 2.0

Adding surrogate keys to specific models, as well as history functionality which will rapidly change the nature of the package.

Please provide the finalized CHANGELOG entry which details the relevant changes included in this PR:

dbt_workday v0.2.0

🚨 Breaking Changes 🚨

  • Created a surrogate key employee_id in workday__employee_overview that combines worker_id, position_id, and position_start_date. This accounts for the edge cases where:
    • A worker can hold multiple positions concurrently.
    • A position being held by multiple workers concurrently.
    • A worker being rehired for the same position.
  • Using this surrogate key as our grain will hopefully provide uniqueness for the majority of Workday HCM customer cases.

🚀 Feature Updates 🚀

This will allow customers to utilize the Fivetran history mode feature, which records every version of each record in the source table from the moment this mode is activated in the equivalent tables.

These staging models include:

  • stg_workday__personal_information_history: Containing historical records of a worker's personal information.

  • stg_workday__worker_history: Containing historical records of a worker's history.

  • stg_workday__worker_position_history: Containing historical records of a worker's position history.

  • stg_workday__worker_position_organization_history: Containing historical records of a worker's position history.

  • We have then utilized the workday__employee_daily_history model in the models/workday_history folder based off of Fivetran's history mode feature, pulling from Workday HCM source models you can view in the models/staging/workday_history folder.

  • These models are disabled by default due to their size, so you will need to set the below variable configurations for each of the individual models you want to utilize in your dbt_project.yml.

vars:
   employee_history_enabled: true  ##Ex: employee_history_enabled: true    
  • We have also added the workday__monthly_summary model in the models/workday_history folder. This table aggregates high-level monthly metrics to track changes over time to overall employee data for a customer.

  • We have chosen not to implement incremental logic in the history models due to the future-facing updating of Workday HCM transactions beyond current daily updates. See the DECISIONLOG for more details.

  • We support the option to pull from both your Workday HCM and History Mode connectors simultaneously from their specific database/schemas. We also support pulling from just your History Mode connector on its own and bypassing the standard connector on its own. See more detailed instructions in the README.

  • Workday HCM History Mode models can contain a multitude of rows if you bring in all historical data, so we've introduced the flexibility to set first date filters to bring in only the historical data you need. More details can be found in the README.

PR Checklist

Basic Validation

Please acknowledge that you have successfully performed the following commands locally:

  • dbt run –full-refresh && dbt test
  • [NA] dbt run (if incremental models are present) && dbt test

Before marking this PR as "ready for review" the following have been applied:

  • [NA] The appropriate issue has been linked, tagged, and properly assigned.
  • All necessary documentation and version upgrades have been applied.
  • docs were regenerated (unless this PR does not include any code or yml updates).
  • BuildKite integration tests are passing.
  • Detailed validation steps have been provided below.

Detailed Validation

Please share any and all of your validation steps: Models currently testing and passing all warehouses with accurate daily data.

If you had to summarize this PR in an emoji, which would it be?

⛑️

@fivetran-avinash fivetran-avinash self-assigned this Mar 6, 2024
@fivetran-avinash fivetran-avinash marked this pull request as ready for review March 20, 2024 20:03
Copy link
Collaborator

@fivetran-joemarkiewicz fivetran-joemarkiewicz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fivetran-avinash thanks for all your hard work on this feature update! This is looking good. I do have some comments, suggestions, and questions to be addressed below. Two major items to callout include:

  • There seems to be a lot of documentation around using two different schemas when using history mode. Since the history tables are in the same schema (and are the same sources we are using for the non history models), we don't need to add any documentation around using two different schemas for history and non history mode.
  • We should try and adjust the datespines to be a bit more dynamic to reflect the actual data. The variable are going to come in handy, but I think we could make it more dynamic.

Thanks again, and let me know if you have any questions when going through the review.

.buildkite/scripts/run_models.sh Outdated Show resolved Hide resolved
DECISIONLOG.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
models/workday_history/workday__monthly_summary.sql Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
CHANGELOG.md Outdated Show resolved Hide resolved
CHANGELOG.md Outdated Show resolved Hide resolved
models/docs.md Show resolved Hide resolved
Copy link
Contributor Author

@fivetran-avinash fivetran-avinash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fivetran-joemarkiewicz This is ready for re-review.

Copy link
Collaborator

@fivetran-joemarkiewicz fivetran-joemarkiewicz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fivetran-avinash please see the notes above that have not been resolved. Those are my final remaining comments before this will be good to go! Let me know if you have any questions or if you would like to discuss anything in more detail.

We are almost there!

Copy link
Contributor Author

@fivetran-avinash fivetran-avinash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fivetran-joemarkiewicz Ready for final review hopefully!

Copy link
Collaborator

@fivetran-joemarkiewicz fivetran-joemarkiewicz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fivetran-avinash thanks so much for diligently working to get this update through the finish line and addressing all my review notes! 🎉

I do have a few final suggestions and one comment to callout a breaking change that just occurred to me in the CHANGELOG. However, once those small updates are applied, this will be good to go! Approving now as I feel comfortable with you closing out these final changes and moving forward with release once those are applied and docs are regenerated!

@@ -19,7 +19,6 @@ dbt deps
dbt seed --target "$db" --full-refresh
dbt run --target "$db" --full-refresh
dbt test --target "$db"
dbt run --target "$db"
dbt run --vars '{employee_history_enabled: true}' --target "$db"
dbt test --target "$db"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When running the test we need to turn on the variable. Otherwise the history tests will not run.

Suggested change
dbt test --target "$db"
dbt test --vars '{employee_history_enabled: true}' --target "$db"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

README.md Outdated
@@ -91,8 +97,34 @@ Please be aware that the native `source.yml` connection set up in the package wi

To connect your multiple schema/database sources to the package models, follow the steps outlined in the [Union Data Defined Sources Configuration](https://github.com/fivetran/dbt_fivetran_utils/tree/releases/v0.4.latest#union_data-source) section of the Fivetran Utils documentation for the union_data macro. This will ensure a proper configuration and correct visualization of connections in the DAG.

## (Optional) Step 4: Utilizing Workday HCM History Mode

If you have History Mode enabled for your Workday HCM connector, we now include support for the worker, worker position, worker position organization, and personal information tables directly. You can view these files in the [`staging/stg_workday_history`](https://github.com/fivetran/dbt_workday/blob/main/models/workday_history/staging) folder. This staging data then flows into the employee daily history model, which in turn populates the monthly summary model. This will allow you access to your historical data for these tables for the most accurate record of your data over time.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The display path needed to be updated to reflect the new path in the url.

Suggested change
If you have History Mode enabled for your Workday HCM connector, we now include support for the worker, worker position, worker position organization, and personal information tables directly. You can view these files in the [`staging/stg_workday_history`](https://github.com/fivetran/dbt_workday/blob/main/models/workday_history/staging) folder. This staging data then flows into the employee daily history model, which in turn populates the monthly summary model. This will allow you access to your historical data for these tables for the most accurate record of your data over time.
If you have History Mode enabled for your Workday HCM connector, we now include support for the worker, worker position, worker position organization, and personal information tables directly. You can view these files in the [`workday_history/staging/`](https://github.com/fivetran/dbt_workday/blob/main/models/workday_history/staging) folder. This staging data then flows into the employee daily history model, which in turn populates the monthly summary model. This will allow you access to your historical data for these tables for the most accurate record of your data over time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

Comment on lines 29 to 30
{% set start_date = dbt.dateadd("year", "-2", "current_date") %} -- Arbitrarily picked. Choose a more appropriate default if necessary.
{% set last_date = dbt.dateadd("year", "-1", "current_date") %}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To match the comment above, let's just make this range truly be for one year. This is not necessary as it will not be used by a customer and only for compilation, but I want to ensure we don't get in trouble with causing unnescary confusion down the line.

Suggested change
{% set start_date = dbt.dateadd("year", "-2", "current_date") %} -- Arbitrarily picked. Choose a more appropriate default if necessary.
{% set last_date = dbt.dateadd("year", "-1", "current_date") %}
{% set start_date = dbt.dateadd("year", "-1", "current_date") %} -- One year in the past for first date
{% set last_date = dbt.dateadd("day", "-1", "current_date") %} -- Yesterday as last date

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Committed.

Comment on lines 27 to 29
{% else %}
{% set start_date = dbt.dateadd("year", "-2", "current_date") %} -- Arbitrarily picked. Choose a more appropriate default if necessary.
{% set last_date = dbt.dateadd("year", "-1", "current_date") %}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same suggestion from previous model for consistency and clarity for ourselves when we come back to this in the future.

Suggested change
{% else %}
{% set start_date = dbt.dateadd("year", "-2", "current_date") %} -- Arbitrarily picked. Choose a more appropriate default if necessary.
{% set last_date = dbt.dateadd("year", "-1", "current_date") %}
{% else %}
{% set start_date = dbt.dateadd("year", "-1", "current_date") %} -- One year in the past for first date
{% set last_date = dbt.dateadd("day", "-1", "current_date") %} -- Yesterday as last date

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Committed.

@@ -10,8 +10,16 @@ models:
intermediate:
+materialized: ephemeral
staging:
+materialized: view
+materialized: ephemeral
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is technically a breaking change as the old views are now going to be stale. I would call this out in the CHANGELOG as a breaking change and inform customers that the previous staging views will no longer be used.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, don't forget to add the link to the PR in the CHANGELOG for all these changes.

Copy link
Contributor Author

@fivetran-avinash fivetran-avinash Apr 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added to the CHANGELOG.

Copy link
Contributor Author

@fivetran-avinash fivetran-avinash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fivetran-joemarkiewicz Final changes addressed!

@fivetran-avinash fivetran-avinash merged commit 3b5cd5f into main Apr 3, 2024
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants