[Feature] Unit Tests Should Support ref & source statements when specifying rows with sql about dbt-core HOT 3 OPEN

ernestoongaro commented on June 29, 2024 1

[Feature] Unit Tests Should Support ref & source statements when specifying rows with sql

from dbt-core.

Comments (3)

RobMcZagBDS commented on June 29, 2024

Thank you @ernestoongaro

The current unit-test functionality is perfectly suited to very specific and narrow testing, where you pick just the few columns you need, mock the macro calls to return the values you expect, and check that you get the desired output.
It is a lot like unit tests that mock everything except the one tiny bit of complex logic you want to test.
Great power, but you can't paint a building with the same brush you use for the small details of a fine art painting.

The typical case I had in mind is a wider validation of a model. You have some sample data, you have isolated a few rows that cover the different use cases, and you have received or manually verified the expected output for them. You would keep these input and expected rows in a table and use them from there, adding more rows as new use cases are found. One great case is adding the relevant data when you find a bug and fix it.

In this situation it is also useful to note that the output / expectation of one model easily becomes the input for the next model to test, so you could almost visualize the sequence as a chain of known inputs and their expected outputs running down the lineage.

I would suggest a format of ref or source, where the rows would be a query that references that ref or source.
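
As a rough sketch of the idea, the proposal might look something like the following. This syntax is NOT supported by dbt-core today; it only illustrates the request. The test, model, source and table names are all hypothetical, and the sketch keeps `format: sql` and simply allows `source()` inside `rows`, which is one possible reading of the suggestion above.

```yaml
# Hypothetical sketch of the requested feature; dbt-core does not accept this today.
unit_tests:
  - name: test_customer_orders          # hypothetical test name
    model: customer_orders              # hypothetical model under test
    given:
      - input: ref('stg_orders')        # hypothetical upstream model
        format: sql
        rows: |
          -- proposed: select curated sample rows from a fixture table
          select order_id, customer_id, amount
          from {{ source('fixtures', 'stg_orders') }}
    expect:
      format: sql
      rows: |
        -- proposed: expected rows kept in a companion fixture table
        select customer_id, total_amount
        from {{ source('fixtures', 'customer_orders__expectation') }}
```

With a layout like this, the expectation table of one model could double as the input fixture of the next model in the lineage.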


dbeatty10 commented on June 29, 2024

@RobMcZagBDS and @ernestoongaro thank you both for raising this issue 🤩

After discussing with @graciegoheen, this isn't something we’d prioritize anytime soon, but we will continue to listen for how many folks are asking for this.

A large reason for our prioritization is the complexity that would be involved in implementing this. Here is a summary of some of the obstacles identified by @gshank:

SQL fixtures: Each one would need to be compiled, requiring additional code and refactoring.
Extra fields: New fields like compiled_sql would need to be added.
Dependency handling: Handling dependencies (depends_on) would be difficult because fixture nodes are created dynamically and don’t exist during the initial parsing stage. The unit testing manifest might need separate depends_on structures for each fixture.
Additional unknowns: there may be remaining things that would be difficult to handle or that we'd have a hard time detecting.


RobMcZag commented on June 29, 2024

Thank you Doug, Grace, Gerda and Ernesto for looking into it.
I understand the technical difficulties, and we can cope with the current limitations, even if the result is not as elegant and maintainable as it would be if we could use a source() reference.

Maybe it is just my feeling, but I would prefer to have a "FIXTURES" schema with "TABLE_XXX" and "TABLE_XXX__EXPECTATION" (when the expectation is not the same as the next input in the pipeline, "TABLE_YYY") and to select the rows and columns with SQL, rather than keep a similar collection of CSV or SQL files in a folder inside the repository.
We can do that with the current SQL feature by hardcoding the DB & SCHEMA. Do we have variables in the context?

BTW, it would be nice if the docs had a better description of what is expected in a SQL file.
My gut feeling is that it is a piece of SQL that, when run, returns the desired rows and columns, but I am not 100% sure and have not yet experimented with it.
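
For reference, a minimal sketch of a unit test using the existing `format: sql` option with inline rows could look like this. The test, model and column names are hypothetical. The commented-out variant shows the hardcoded database and schema workaround mentioned above; it is an assumption taken from this thread, not something verified here.

```yaml
unit_tests:
  - name: test_customer_orders          # hypothetical test name
    model: customer_orders              # hypothetical model under test
    given:
      - input: ref('stg_orders')        # hypothetical upstream model
        format: sql
        rows: |
          -- a plain SELECT that returns the mock rows and columns
          select 1 as order_id, 10 as customer_id, 25.0 as amount
          union all
          select 2 as order_id, 10 as customer_id, 5.0 as amount
          -- assumed workaround: read from a hardcoded fixture table instead
          -- select order_id, customer_id, amount from MY_DB.FIXTURES.TABLE_XXX
    expect:
      format: sql
      rows: |
        select 10 as customer_id, 30.0 as total_amount
```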

