
dbt_metrics's Introduction

Deprecation Notice

The dbt_metrics package will no longer be maintained with the release of dbt-core 1.6 in July 2023. All semantic layer capability will instead be found in MetricFlow.

dbt_metrics

About

This dbt package generates queries based on metrics, introduced to dbt Core in v1.0. For more information on metrics, such as available calculation methods, properties, and other definition parameters, please reference the dbt documentation.

Tenets

The tenets of dbt_metrics, which should be considered during development, issues, and contributions, are:

  • A metric value should be consistent everywhere that it is referenced
  • We prefer generalized metrics with many dimensions over specific metrics with few dimensions
  • It should be easier to use dbt’s metrics than it is to avoid them
  • Organization and discoverability are as important as precision
  • One-off models built to power metrics are an anti-pattern

Installation Instructions

Check dbt Hub for the latest installation instructions, or read the docs for more information on installing packages.

Include in your packages.yml

packages:
  - package: dbt-labs/metrics
    version: [">=1.5.0", "<1.6.0"]

Supported Adapters

The adapters that are currently supported in the dbt_metrics package are:

  • Snowflake
  • BigQuery
  • Redshift
  • Postgres
  • Databricks

Macros

Calculate

The calculate macro performs the metric aggregation and returns the dataset based on the specifications of the metric definition and the options selected in the macro. It can be accessed like any other macro:

select * 
from {{ metrics.calculate(
    metric('new_customers'),
    grain='week',
    dimensions=['plan', 'country'],
    secondary_calculations=[
        metrics.period_over_period(comparison_strategy="ratio", interval=1, alias="pop_1wk"),
        metrics.period_over_period(comparison_strategy="difference", interval=1),

        metrics.period_to_date(aggregate="average", period="month", alias="this_month_average"),
        metrics.period_to_date(aggregate="sum", period="year"),

        metrics.rolling(aggregate="average", interval=4, alias="avg_past_4wks"),
        metrics.rolling(aggregate="min", interval=4)
    ],
    start_date='2022-01-01',
    end_date='2022-12-31',
    where="plan='filter_value'"
) }}

If no grain is provided to the macro in the query then the dataset returned will not be time-bound.

start_date and end_date are optional. When not provided, the spine will span all dates from oldest to newest in the metric's dataset. This default is likely to be correct in most cases, but you can use the arguments to either narrow the resulting table or expand it (e.g. if there were no new customers until 3 January but you want to include the first two days as well). Both values are inclusive.

Supported Inputs

| Input | Example | Description | Required |
| --- | --- | --- | --- |
| metric_list | metric('some_metric'), [metric('some_metric'), metric('some_other_metric')] | The metric(s) to be queried by the macro. If multiple metrics are required, provide them in list format. | Required |
| grain | day, week, month | The time grain that the metric will be aggregated to in the returned dataset | Optional |
| dimensions | [plan, country, some_predefined_dimension_name] | The dimensions you want the metric to be aggregated by in the returned dataset | Optional |
| start_date | 2022-01-01 | Limits the date range of data used in the metric calculation by not querying data before this date | Optional |
| end_date | 2022-12-31 | Limits the date range of data used in the metric calculation by not querying data after this date | Optional |
| where | plan='paying_customer' | A SQL statement, or series of SQL statements, that alter the final CTE in the generated SQL. Most often used to limit the data to specific values of the dimensions provided | Optional |
| date_alias | 'date_field' | A string value that aliases the date field in the final dataset | Optional |

Develop

There are times when you want to test what a metric might look like before defining it in your project. In these cases you should use the develop macro, which allows you to provide a single metric in a contained yml in order to simulate what the metric might look like if defined in your project.

{% set my_metric_yml -%}
{% raw %}

metrics:
  - name: develop_metric
    model: ref('fact_orders')
    label: Total Discount ($)
    timestamp: order_date
    time_grains: [day, week, month]
    calculation_method: average
    expression: discount_total
    dimensions:
      - had_discount
      - order_country

{% endraw %}
{%- endset %}

select * 
from {{ metrics.develop(
        develop_yml=my_metric_yml,
        metric_list=['develop_metric'],
        grain='month'
        )
    }}

Supported Inputs

| Input | Example | Description | Required |
| --- | --- | --- | --- |
| metric_list | 'some_metric', ['some_metric', 'some_other_metric'] | The metric(s) to be queried by the macro. If multiple metrics are required, provide them in list format. Do not provide in metric('name') format, as that triggers dbt parsing for a metric that doesn't exist; just provide the name of the metric. | Required |
| grain | day, week, month | The time grain that the metric will be aggregated to in the returned dataset | Optional |
| dimensions | [plan, country, some_predefined_dimension_name] | The dimensions you want the metric to be aggregated by in the returned dataset | Optional |
| start_date | 2022-01-01 | Limits the date range of data used in the metric calculation by not querying data before this date | Optional |
| end_date | 2022-12-31 | Limits the date range of data used in the metric calculation by not querying data after this date | Optional |
| where | plan='paying_customer' | A SQL statement, or series of SQL statements, that alter the final CTE in the generated SQL. Most often used to limit the data to specific values of the dimensions provided | Optional |
| date_alias | 'date_field' | A string value that aliases the date field in the final dataset | Optional |

Multiple Metrics Or Derived Metrics

If you have a more complicated use case that you are interested in testing, the develop macro also supports this behavior. The only caveat is that you must include the raw tags for any provided metric yml that contains a derived metric. Example below:

{% set my_metric_yml -%}
{% raw %}

metrics:
  - name: develop_metric
    model: ref('fact_orders')
    label: Total Discount ($)
    timestamp: order_date
    time_grains: [day, week, month]
    calculation_method: average
    expression: discount_total
    dimensions:
      - had_discount
      - order_country

  - name: derived_metric
    label: Total Discount ($)
    timestamp: order_date
    time_grains: [day, week, month]
    calculation_method: derived
    expression: "{{ metric('develop_metric') }} - 1 "
    dimensions:
      - had_discount
      - order_country

  - name: some_other_metric_not_using
    label: Total Discount ($)
    timestamp: order_date
    time_grains: [day, week, month]
    calculation_method: derived
    expression: "{{ metric('derived_metric') }} - 1 "
    dimensions:
      - had_discount
      - order_country

{% endraw %}
{%- endset %}

select * 
from {{ metrics.develop(
        develop_yml=my_metric_yml,
        metric_list=['derived_metric'],
        grain='month'
        )
    }}

The above example will return a dataset that contains the metric provided in the metric list (derived_metric) and its parent metric (develop_metric). It will not contain some_other_metric_not_using, as it is neither designated in the metric list nor a parent of the metrics included.

Available calculation methods

The method of calculation (aggregation or derived) that is applied to the expression.

| Metric Calculation Method | Description |
| --- | --- |
| count | This metric type will apply the count aggregation to the specified field |
| count_distinct | This metric type will apply the count aggregation to the specified field, with an additional distinct statement inside the aggregation |
| sum | This metric type will apply the sum aggregation to the specified field |
| average | This metric type will apply the average aggregation to the specified field |
| min | This metric type will apply the min aggregation to the specified field |
| max | This metric type will apply the max aggregation to the specified field |
| median | This metric type will apply the median aggregation to the specified field, or an alternative percentile_cont aggregation if median is not available |
| derived | This metric type is defined as any non-aggregating calculation of one or more metrics |

Use cases and examples

Jaffle Shop Metrics

For those curious about how to implement metrics in a dbt project, please reference the jaffle_shop_metrics.

Secondary calculations

Secondary calculations are window functions which act on the primary metric or metrics. You can use them to compare values to an earlier period and calculate year-to-date sums or rolling averages. The use of secondary calculations requires a grain input in the macro.

Create secondary calculations using the convenience constructor macros. Alternatively, you can manually create a list of dictionary entries (not recommended).

Example of manual dictionary creation (not recommended)

Creating a calculation this way has no input validation.

[
    {"calculation": "period_over_period", "interval": 1, "comparison_strategy": "difference", "alias": "pop_1mth"},
    {"calculation": "rolling", "interval": 3, "aggregate": "sum"}
]

Column aliases are automatically generated, but you can override them by setting alias.

Period over Period (source)

The period over period secondary calculation performs a calculation against the metric(s) in question by determining either the difference or the ratio between two points in time. The other point in time is determined by the interval input, which counts back that many periods at the grain selected in the macro.

Constructor: metrics.period_over_period(comparison_strategy, interval [, alias, metric_list])

| Input | Example | Description | Required |
| --- | --- | --- | --- |
| comparison_strategy | ratio or difference | How to calculate the delta between the two periods | Yes |
| interval | 1 | Integer - the number of time grains to look back | Yes |
| alias | week_over_week | The column alias for the resulting calculation | No |
| metric_list | base_sum_metric | List of metrics that the secondary calculation should be applied to. Default is all metrics selected | No |
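
The constructor can be combined with the metric_list input to apply the comparison to only one of several selected metrics. A minimal sketch (the metric names are illustrative):

select *
from {{ metrics.calculate(
    [metric('base_sum_metric'), metric('base_average_metric')],
    grain='month',
    secondary_calculations=[
        metrics.period_over_period(
            comparison_strategy="difference",
            interval=1,
            alias="mom_change",
            metric_list=['base_sum_metric']
        )
    ]
) }}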

Period to Date (source)

The period to date secondary calculation performs an aggregation on a defined period of time that is equal to or coarser (higher, more aggregated) than the grain selected. A great example of this is when you want to display a month_to_date value alongside your weekly grained metric.

Constructor: metrics.period_to_date(aggregate, period [, alias, metric_list])

| Input | Example | Description | Required |
| --- | --- | --- | --- |
| aggregate | max, average | The aggregation to use in the window function. Options vary based on the primary aggregation and are enforced in validate_aggregate_coherence(). | Yes |
| period | "day", "week" | The time grain to aggregate to. One of ["day", "week", "month", "quarter", "year"]. Must be at equal or coarser (higher, more aggregated) granularity than the metric's grain (see Time Grains below). For example, with a grain of month, the acceptable periods are month, quarter, or year. | Yes |
| alias | month_to_date | The column alias for the resulting calculation | No |
| metric_list | base_sum_metric | List of metrics that the secondary calculation should be applied to. Default is all metrics selected | No |
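
For example, to display a month-to-date sum alongside a weekly grained metric (a sketch; the metric name is illustrative):

select *
from {{ metrics.calculate(
    metric('base_sum_metric'),
    grain='week',
    secondary_calculations=[
        metrics.period_to_date(aggregate="sum", period="month", alias="month_to_date")
    ]
) }}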

Rolling (source)

The rolling secondary calculation performs an aggregation on a number of rows in the metric dataset. For example, if the user selects the week grain and sets a rolling secondary calculation to 4, then the value returned will be a rolling 4-week calculation of whatever aggregation type was selected. If the interval input is not provided, then the rolling calculation will be unbounded on all preceding rows.

Constructor: metrics.rolling(aggregate [, interval, alias, metric_list])

| Input | Example | Description | Required |
| --- | --- | --- | --- |
| aggregate | max, average | The aggregation to use in the window function. Options vary based on the primary aggregation and are enforced in validate_aggregate_coherence(). | Yes |
| interval | 1 | Integer - the number of time grains to look back | No |
| alias | month_to_date | The column alias for the resulting calculation | No |
| metric_list | base_sum_metric | List of metrics that the secondary calculation should be applied to. Default is all metrics selected | No |
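
For example, a 4-week rolling average alongside an unbounded running sum (a sketch; the metric name is illustrative):

select *
from {{ metrics.calculate(
    metric('base_sum_metric'),
    grain='week',
    secondary_calculations=[
        metrics.rolling(aggregate="average", interval=4, alias="rolling_4wk_avg"),
        metrics.rolling(aggregate="sum", alias="running_total")
    ]
) }}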

Prior (source)

The prior secondary calculation returns the value from a specified number of intervals prior to the row.

Constructor: metrics.prior(interval [, alias, metric_list])

| Input | Example | Description | Required |
| --- | --- | --- | --- |
| interval | 1 | Integer - the number of time grains to look back | Yes |
| alias | 2_weeks_prior | The column alias for the resulting calculation | No |
| metric_list | base_sum_metric | List of metrics that the secondary calculation should be applied to. Default is all metrics selected | No |
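
For example, to pull the value from two weeks earlier alongside a weekly grained metric (a sketch; the metric name is illustrative):

select *
from {{ metrics.calculate(
    metric('base_sum_metric'),
    grain='week',
    secondary_calculations=[
        metrics.prior(interval=2, alias="2_weeks_prior")
    ]
) }}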

Customisation

Most behaviour in the package can be overridden or customised.

Metric Configs

Metric nodes accept config dictionaries like other dbt resources (beginning in dbt-core v1.3). Metric configs can be specified in the metric yml itself or for groups of metrics in the dbt_project.yml file.

# in metrics.yml
version: 2

metrics:
  - name: config_metric
    label: Example Metric with Config
    model: ref('my_model')
    calculation_method: count
    timestamp: date_field
    time_grains: [day, week, month]

    config:
      enabled: True

Or:

# in dbt_project.yml

metrics: 
  your_project_name: 
    +enabled: true

The metrics package validates the configurations you're able to provide.

Accepted Metric Configurations

Below is the list of metric configs currently accepted by this package.

| Config | Type | Accepted Values | Default Value | Description |
| --- | --- | --- | --- | --- |
| enabled | boolean | True/False | True | Enables or disables a metric node. When disabled, dbt will not consider it as part of your project. |
| treat_null_values_as_zero | boolean | True/False | True | Controls the coalesce behavior for metrics. By default, when there are no observations for a metric, the output of the metric and of period-over-period secondary calculations includes a coalesce({{ field }}, 0) to return 0s rather than nulls. Setting this config to False returns NULL values instead. |
| restrict_no_time_grain_false | boolean | True/False | False | Controls whether this metric can be queried without a provided time grain. By default, all metrics can be queried without a grain and aggregated in a non-time-bound way. This config restricts that behavior and requires a grain input in order to query the metric. |
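
As a sketch, the two non-default settings above can be combined in a single metric definition (the metric and model names are placeholders):

version: 2

metrics:
  - name: strict_metric
    label: Example Metric with Stricter Configs
    model: ref('my_model')
    calculation_method: sum
    expression: order_total
    timestamp: date_field
    time_grains: [day, week, month]

    config:
      treat_null_values_as_zero: False
      restrict_no_time_grain_false: True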

Window Periods

Versions 0.4.0 and later of this package support the window attribute of the metric definition. This alters the underlying query so that the metric definition can contain a window of time, such as the past 14 days or the past 3 months. Using the window functionality requires that a grain be provided in the query.
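
A sketch of a metric definition using the window attribute, assuming the dbt-core syntax of a count/period pair (the metric and model names are placeholders):

metrics:
  - name: rolling_new_customers
    label: New Customers (Trailing 14 Days)
    model: ref('dim_customers')
    calculation_method: count
    expression: customer_id
    timestamp: created_at
    time_grains: [day, week, month]

    window:
      count: 14
      period: day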

More information can be found on the metrics page of the dbt docs.

Derived Metrics

Note: In version 0.4.0, expression metrics were renamed to derived. Versions 0.3.0 and later of this package support derived metrics. More information about this calculation_method can be found on the metrics page of the dbt docs.
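
A minimal sketch of a derived metric definition (the referenced metric names are placeholders); note that a derived metric builds its expression from other metrics rather than from a model:

metrics:
  - name: discount_per_order
    label: Average Discount per Order
    calculation_method: derived
    expression: "{{ metric('total_discount') }} / {{ metric('order_count') }}"
    timestamp: order_date
    time_grains: [day, week, month]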

Multiple Metrics

There may be instances where you want to return multiple metrics within a single macro. This is possible by providing a list of metrics instead of a single metric. See example below:

  select *
  from 
  {{ metrics.calculate(
      [metric('base_sum_metric'), metric('base_average_metric')], 
      grain='day', 
      dimensions=['had_discount']
      )
  }}

Note: The metrics must share the time_grain selected in the macro AND the dimensions selected in the macro. If these are not shared between the 2+ metrics, this behaviour will fail. Additionally, secondary calculations can be used for multiple metrics but each secondary calculation will be applied against each metric and returned in a field that matches the following pattern: metric_name_secondary_calculation_alias.

Where Clauses

Sometimes you'll want to see the metric in the context of a particular filter, but this filter isn't necessarily part of the metric definition. In this case, you can use the where input for the metrics package. It takes a list of sql statements and adds them as filters to the final CTE in the produced SQL.

Additionally, this input can be used by BI tools as a way to pass filters from their UI through into the metric logic.
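
A sketch of passing multiple filters as a list (the filter values are illustrative):

select *
from {{ metrics.calculate(
    metric('new_customers'),
    grain='month',
    dimensions=['plan', 'country'],
    where=[
        "plan = 'paying_customer'",
        "country = 'United States'"
    ]
) }}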

Calendar

The package comes with a basic calendar table, running between 2010-01-01 and 2029-12-31 inclusive. You can replace it with any custom calendar table which meets the following requirements:

  • Contains a date_day column.
  • Contains the following columns: date_week, date_month, date_quarter, date_year, or equivalents.
  • Additional date columns need to be prefixed with date_, e.g. date_4_5_4_month for a 4-5-4 retail calendar date set. Dimensions can have any name (see dimensions on calendar tables).

To do this, set the value of the dbt_metrics_calendar_model variable in your dbt_project.yml file:

#dbt_project.yml
config-version: 2
[...]
vars:
    dbt_metrics_calendar_model: my_custom_calendar
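
A minimal sketch of a conforming custom calendar model, assuming dbt_utils is installed and that your warehouse supports date_trunc and extract (adjust the functions for your adapter):

-- models/my_custom_calendar.sql
with days as (

    {{ dbt_utils.date_spine(
        datepart="day",
        start_date="cast('2010-01-01' as date)",
        end_date="cast('2030-01-01' as date)"
    ) }}

)

select
    cast(date_day as date) as date_day,
    cast(date_trunc('week', date_day) as date) as date_week,
    cast(date_trunc('month', date_day) as date) as date_month,
    cast(date_trunc('quarter', date_day) as date) as date_quarter,
    cast(date_trunc('year', date_day) as date) as date_year,
    -- an example custom dimension; weekend day numbering varies by warehouse
    extract(dayofweek from date_day) in (1, 7) as is_weekend
from days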

Dimensions from calendar tables

You may want to aggregate metrics by a dimension in your custom calendar table, for example is_weekend. You can include this within the list of dimensions in the macro call without it needing to be defined in the metric definition.

To do so, set a list variable at the project level called custom_calendar_dimension_list, as shown in the example below.

vars:
  custom_calendar_dimension_list: ["is_weekend"]

The is_weekend field can now be used by your metrics.
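
For example (reusing the new_customers metric from earlier):

select *
from {{ metrics.calculate(
    metric('new_customers'),
    grain='week',
    dimensions=['plan', 'is_weekend']
) }}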

Time Grains

The package protects against nonsensical secondary calculations, such as a month-to-date aggregate of data which has been rolled up to the quarter. If you customise your calendar (for example by adding a 4-5-4 retail calendar month), you will need to override the get_grain_order() macro. In that case, you might remove month and replace it with month_4_5_4. All date columns must be prefixed with date_ in the table. Do not include the prefix when defining your metric, it will be added automatically.
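
A sketch of such an override, assuming the package resolves get_grain_order() via adapter.dispatch so that a project-level definition takes precedence once a dispatch config points at your project:

# dbt_project.yml
dispatch:
  - macro_namespace: metrics
    search_order: ['your_project_name', 'metrics']

-- macros/get_grain_order.sql
{% macro default__get_grain_order() %}
    {{ return(['day', 'week', 'month_4_5_4', 'quarter', 'year']) }}
{% endmacro %}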

Custom aggregations

To create a custom primary aggregation (as exposed through the calculation_method config of a metric), create a macro of the form metric_my_aggregate(expression), then override the gen_primary_metric_aggregate() macro to add it to the dispatch list. The package also protects against nonsensical secondary calculations such as an average of an average; you will need to override the get_metric_allowlist() macro to both add your new aggregate to the existing aggregations' allowlists and to make an allowlist for your new aggregation:

    {% do return ({
        "average": ['max', 'min'],
        "count": ['max', 'min', 'average', 'my_new_aggregate'],
        [...]
        "my_new_aggregate": ['max', 'min', 'sum', 'average', 'my_new_aggregate']
    }) %}
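
A sketch of a custom primary aggregation following the metric_my_aggregate(expression) naming convention (the aggregation itself is a made-up example); after defining it, add "sum_squares" to gen_primary_metric_aggregate() and to the allowlists above:

-- macros/metric_sum_squares.sql
{% macro metric_sum_squares(expression) %}
    sum(power({{ expression }}, 2))
{% endmacro %}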

To create a custom secondary aggregation (as exposed through the secondary_calculations input in the metric macro), create a macro of the form secondary_calculation_my_calculation(metric_name, dimensions, calc_config), then override the perform_secondary_calculations() macro.

Secondary calculation column aliases

Aliases can be set for a secondary calculation. If no alias is provided, one will be automatically generated. To modify the existing alias logic, or add support for a custom secondary calculation, override generate_secondary_calculation_alias().

dbt_metrics's People

Contributors

atomatize, callum-mcdata, clausherther, dave-connors-3, davidbloss, deanna-minnick, drewbanin, fivetran-joemarkiewicz, joellabes, joeryv, jtcohen6, jthigpen, leahwicz, longtomjr, mirnawong1, rijnhardtkotze, sisu-callum


dbt_metrics's Issues

Allow for dynamic filtering at runtime

Right now the metrics package does a great job of, when given a metric, time_grain, dimensions and calculations, returning an aggregation based off of that metric. In order to allow for maximum flexibility in how people are able to use and access these metrics, we should introduce an optional filters argument that would allow the user to apply sql-style filtering.

The benefit of this is that it gives a huge range of flexibility in the information you can get out of a single metric, while maintaining consistency and avoiding having to make hundreds of metrics that are just different versions of filters people might need to apply. I think we'd be able to ensure accuracy here by making sure that any filtered columns were dimensions in the metric - if you can use a dimension in a metrics query you can filter on it and vice versa.

This syntax would allow you to filter a metric on a single dimension, on multiple dimensions, or using LIKE operators. Imagine a hypothetical scenario where a user wanted to explore the number of account sign-ups by country and marketing campaign.

Examples:

No filters:

This would simply return the number of accounts by day.

{{ metrics.get_metric_sql(number_of_accounts, day, []) }}

One filter:

This would return the number of accounts by day only for accounts that signed up in the US.

{{ metrics.get_metric_sql(number_of_accounts, day, [], where=[country='United States']) }}

Two filters:

This would return the number of accounts by day from the United States where the marketing campaign had been attributed to Google.

{{ metrics.get_metric_sql(number_of_accounts, day, [], where=[country='United States', marketing_campaign='Google']) }}

Two filters using substring matching:

This would return the number of accounts by day from the United States where the campaign name matched the pattern "google_brand%".

{{ metrics.get_metric_sql(number_of_accounts, day, [], where=[country='United States', marketing_campaign_name like 'google_brand%']) }}

As we begin creating the ability to serve more complex / dynamic use cases, it will be important for the metrics package to be able to create these dynamic, on-the-fly numbers with as little extraneous SQL code written by the end user as possible.

Google sheet with examples:

[Feature] Should we support aggregated ratio metrics as a base type?

Is this your first time submitting a feature request?

  • I have read the expectations for open source contributors
  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward extension of existing dbt functionality, rather than a Big Idea better suited to a discussion

Describe the feature

When working with the internal dbt team, we started brainstorming around the NPS Metric. Historically it is described as promoter score - detractor score, with each of those being ratios.

Our eventual solution was to offload a lot of this logic into the column being used in the metric definition but it got us thinking - what if we did want to support this within the current metrics framework? How could we do so in the most flexible way while remaining consistent with our tenets?

This definitely is just the start of the debate of what should live in metrics versus models, but for right now let's assume this information should live in metric definitions.

What Would This Look Like?

We'd most likely need to be able to define "ratio" metrics where both the numerator and denominator are aggregations unto themselves. Here is a really rough mockup:

metrics:
  - name: promoter_score
    label: Promoter Score
    model: ref('reviewers')
    description: "The count of promoters over total people"

    calculation_method: ratio
    ratio_config:
      numerator_calculation_method: count
      numerator_expression: case when person_type = 'promoter' then user_id else null end
      denominator_calculation_method: count
      denominator_expression: user_id

    timestamp: signup_date
    time_grains: [day, week, month]

    dimensions:
      - plan
      - country

I really don't like the semantic method of defining this metric but... if needs must.

Describe alternatives you've considered

Enforce that any aggregation needs to be a metric unto itself, even if it's not really useful. In the example of NPS, we could construct:

  • Count of customers
  • Count of promoters
  • Count of detractors
  • Promoter score (count of promoters/count of customers)
  • Detractor score (count of detractors/count of customers)
  • NPS ((Promoter Score - Detractor Score )* 100)

Who will this benefit?

Anyone who wants to build more complicated ratio metrics without needing X components

Are you interested in contributing this feature?

Yep

Anything else?

No response

Cleaning Up Metric Tree Logic

Although we plan on removing the functionality for metric tree parsing and replacing with dbt helper functions, it still needs to exist in the v0.3 lifecycle of the metrics package. This code currently pulls from the get_metric_tree macro and then does additional metric parsing in the get_metric_sql macro.

Everything from line 66 to 115 should be self-contained within a single macro that returns the metric tree.

{%- set metric_tree = {'full_set':[],'leaf_set':[],'expression_set':[],'base_set':[],'ordered_expression_set':{}} -%}

{% if metric_list is iterable and (metric_list is not string and metric_list is not mapping) %} 
    {% set base_set_list = []%}
    {% for metric in metric_list %}
        {%- do base_set_list.append(metric.name) -%}
        {%- set metric_tree = metrics.get_metric_tree(metric ,metric_tree) -%}
    {%endfor%}
    {%- do metric_tree.update({'base_set':base_set_list}) -%}

{% else %}
    {%- do metric_tree.update({'base_set':single_metric.name}) -%}
    {%- set metric_tree = metrics.get_metric_tree(single_metric ,metric_tree) -%}
{% endif %}

{# Now we will iterate over the metric tree and make it a unique list to account for duplicates #}
{% set full_set = [] %}
{% set leaf_set = [] %}
{% set expression_set = [] %}
{% set base_set = [] %}

{% for metric in metric_tree['full_set']|unique%}
    {% do full_set.append(metric)%}
{% endfor %}
{%- do metric_tree.update({'full_set':full_set}) -%}

{% for metric in metric_tree['leaf_set']|unique%}
    {% do leaf_set.append(metric)%}
{% endfor %}
{%- do metric_tree.update({'leaf_set':leaf_set}) -%}


{% for metric in metric_tree['expression_set']|unique%}
    {% do expression_set.append(metric)%}
{% endfor %}
{%- do metric_tree.update({'expression_set':expression_set}) -%}

{% for metric in metric_tree['leaf_set']|unique%}
    {%- do metric_tree['ordered_expression_set'].pop(metric) -%}
{% endfor %}

{# This section overrides the expression set by ordering the metrics on their depth so they 
can be correctly referenced in the downstream sql query #}
{% set ordered_expression_list = []%}
{% for item in metric_tree['ordered_expression_set']|dictsort(false, 'value') %}
    {% if item[0] in metric_tree["expression_set"]%}
        {% do ordered_expression_list.append(item[0])%}
    {% endif %}
{% endfor %}
{%- do metric_tree.update({'expression_set':ordered_expression_list}) -%}

[Feature] Remove Model Introspection

Is this your first time submitting a feature request?

  • I have read the expectations for open source contributors
  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward extension of existing dbt functionality, rather than a Big Idea better suited to a discussion

Describe the feature

Right now, the metrics package runs get_filtered_columns_in_relation in two different locations within the package. While this doesn't have a big impact in a traditional dbt deployment, it can have a larger impact in a dbt Server-esque world. The time cost is on the order of milliseconds per query, which adds up and contributes towards our internal self-constrained limits.

The solution here is to remove those introspective queries.

State of Today

The two queries are used in the metric validation process which exists in multiple steps.

  1. get_common_valid_dimension_list
  2. get_complete_dimension_list
  3. get_calendar_dimension_list

Options

  • Remove the queries and only validate dimensions provided against the metric metadata.

Describe alternatives you've considered

None - this is long overdue to be removed. Especially with me realizing how many damn introspective queries we are running!

Who will this benefit?

Everyone using the semantic layer

Are you interested in contributing this feature?

I made the mistake so you betcha

Anything else?

Mea culpa on the loop

[Bug] ambiguous column error when building off a model with `date_<grain>` field

Is this a new bug in dbt_metrics?

  • I believe this is a new bug in dbt_metrics
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

We are currently investigating adding metrics to the dbt Ad Reporting package, in which each end model has a column named date_day. After defining a metric like this

metrics:
  - name: spend
    label: Ad spend (Fivetran)
    model: ref('ad_reporting__ad_report')

    type: sum
    sql: spend
    description: Total spend (in currency of individual platforms)

    timestamp: date_day
    time_grains: [day, week, month]

    dimensions:
      - platform
      - campaign_id
      - campaign_name
      - ad_group_id
      - ad_group_name
      - ad_id
      - ad_name
      - account_id
      - account_name

When I attempt to run a model calling metrics.calculate on this, I receive an error about date_day being an ambiguous column name, as it's also a column name in the helper calendar model. In some under-the-hood macros in this package, date_<grain> uses dot notation, while in some places it's not prepended with a CTE or model alias, which I believe is where the error is coming from.

Expected Behavior

I'm able to build metrics off of a model with a date_<grain> column name (pretty standard)

The alternative option is to not support metrics on models with these sorts of column names

Steps To Reproduce

  1. Build a model with a date_day field (or install Ad Reporting/an individual Ad Package)
  2. Define a metric using that model
  3. Call metrics.calculate on that metric
  4. dbt run

Relevant log output

Database Error in model test_metrics (models/test_metrics.sql)
21:50:28    Column name date_day is ambiguous at [139:17]
21:50:28    compiled SQL at target/run/jamie_dbt_dev/models/test_metrics

Environment

- dbt-adapter & version: dbt-bigquery 1.2.0
- dbt_metrics version: >0.3.0, <= 0.4.0

Which database adapter are you using with dbt?

bigquery

Additional Context

No response

[Feature] Allow for custom-defined metrics types

Is this your first time submitting a feature request?

  • I have read the expectations for open source contributors
  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward extension of existing dbt functionality, rather than a Big Idea better suited to a discussion

Describe the feature

Note: I am creating this issue following a Slack thread during 2022-08-04 dbt Staging.

Currently there is a defined list of type values that may be used for metrics. I assume that over time more will be added, like median. I support adding various common, standard types.

BUT, it would be nice if users could define their own custom types. That would make it super extensible, and then dbt devs wouldn't be on the hook to anticipate every scenario. A custom type would likely be very specific to an org's business context, so it makes sense for development to be in the hands of the user rather than relying on a finite list.

I hypothesize that while any individual custom definition of a type could be considered fringe or rare, in aggregate, the various fringe types could constitute a significant proportion of metrics that users need to implement.

Describe alternatives you've considered

dbt devs greatly expand the type options, while still inevitably failing to anticipate all user needs.

Who will this benefit?

  • Users who want to develop a metric type that has not yet been conceived of and/or is not one that should be made available to the broader community, most likely because it is a fringe use case.
  • dbt devs who don't have to do as much work building out type options.

Are you interested in contributing this feature?

Possibly! I don't know enough about the mechanics of type or metrics overall to say if I'd be the best person to help with this, but it seems possible that I could be helpful.

Anything else?

No response

Development Workflow - Develop/Test Metric Function?

Description

After discussions with a few different partners, we've realized that the metrics implementation right now is very focused on production use cases. Which is great! This is where business consumers are going to see them, and it's of the highest importance to ensure that their experience is delightful.

However, it's also important for us to have a delightful experience for the developer adding metrics to a project. Right now, developers can define a metric in a dbt project and then run a model with the calculate macro, but this creates temporary objects for the sake of testing/validation. Bit of an anti-pattern.

Additionally, it doesn't fit in nicely with dbt Server, which runs based on the production code manifest.

Proposal

Add additional functionality that would take in a yml definition (defined in some jinja block) and calculate parameters, and then returns the dataset. This could be used in the future in query editors in order to test out metrics or even test alternate definitions to see how they change.

Potential Implementation:

{% set metric_definition %}
metrics:
  - name: abc
    ...
{% endset %}
{{ metrics.develop(definition=metric_definition, <calculate params>) }}

[Feature] Remove Whitespace In Metrics Query

Is this your first time submitting a feature request?

  • I have read the expectations for open source contributors
  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward extension of existing dbt functionality, rather than a Big Idea better suited to a discussion

Describe the feature

Right now, there is a lot of whitespace produced in the metrics query. This is not helpful to those who want to see the compiled sql. Figure out a way to reduce the whitespace introduced by comments!

Describe alternatives you've considered

Not doing this?

Who will this benefit?

Anyone who wants to read the query and doesn't want to scroll down 60 empty lines.

Are you interested in contributing this feature?

You break it, you buy it. So.... yea.

Anything else?

No response

[Feature] dbt_metrics should be "somewhat" backwards compatible

Is this your first time submitting a feature request?

  • I have read the expectations for open source contributors
  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward extension of existing dbt functionality, rather than a Big Idea better suited to a discussion

Describe the feature

With version 0.3.0, we slightly modified the syntax used to calculate metrics - namely by changing the name from metric to calculate. This was intended to ensure individuals knew what the macro was doing - representative macro names are a big part of this repository.

However, that now introduces issues of backwards compatibility.

The idea that @drewbanin and I had was that the old macro call should still work but throw up a lot of warnings. The questions we'd need to answer would be:

  • does this still work if they use metric_name as opposed to the metric function?
    • I suspect it will work for base metrics but not for expression metrics. Additionally it wouldn't work once we move to helper functions that rely on the metric object.
  • is this worth the time it would take?

Describe alternatives you've considered

Not making things backwards compatible.

Who will this benefit?

Anyone who has used metrics previously that doesn't want to migrate their syntax to the new version. Also potentially partners who won't have upgraded their integrations.

Are you interested in contributing this feature?

Yes

Anything else?

No response

Unified Error Message For Expression Metrics

Right now the expression metric compile error logic is handled with the following codeblock.

{% if not is_multi_metric %}
    {% for metric in metric_list %}
        {% if metric.type != "expression" and metric.metrics | length > 0 %}
            {%- do exceptions.raise_compiler_error("The metric " ~ metric.name ~ " was not an expression and dependent on another metric. This is not currently supported - if this metric depends on another metric, please change the type to expression.") %}
        {%- endif %}
    {% endfor %}
{% else %}
    {% if single_metric.type != "expression" and single_metric.metrics | length > 0 %}
        {%- do exceptions.raise_compiler_error("The metric " ~ single_metric.name ~ " was not an expression and dependent on another metric. This is not currently supported - if this metric depends on another metric, please change the type to expression.") %}
    {%- endif %}
{%- endif %}

This is duplicative and could create issues if someone remembers to update one error string but not the other. We should re-write the logic here to call a single macro that takes an input and returns the error message from a single location.

[Bug] All-time grain fails joining metrics when no calendar dimension provided

Is this a new bug in dbt_metrics?

  • I believe this is a new bug in dbt_metrics
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

 {%- for dim in dimensions %}
      {%- if loop.first and calendar_dimensions | length == 0 -%}{{ dim }}{%- endif -%}
      , {{ dim }}
 {%- endfor -%}

This code section creates duplicate dimensions

Expected Behavior

It shouldn't. Replace it with this code

    {%- for dim in dimensions %}
        {%- if loop.first and calendar_dimensions | length == 0 -%}
    {{ dim }}
        {%- elif not loop.first and calendar_dimensions | length == 0 -%}
    , {{ dim }}
        {%- else -%}
    , {{ dim }}
        {%- endif -%}
    {%- endfor -%}

Steps To Reproduce

Fix

Relevant log output

No response

Environment

- dbt-adapter & version:
- dbt_metrics version:

Which database adapter are you using with dbt?

No response

Additional Context

No response

Allow specifying a timezone

Since calculating a metric typically involves truncating a timestamp to day/week/month, it would be great to be able to configure what timezone to use, maybe as an additional optional argument to metrics.calculate(...). I'm not sure whether this makes more sense as something to support in the dbt metric definition instead, since people likely want to enforce use of a certain timezone for a metric calculation, so I'm open to your thoughts there.

Moving the WHERE clause to the base query to reduce table scan load

To Where Or Not To Where

@joellabes had a great point in our discussion today around the WHERE clause added in #34. He stated that it could potentially break our idea of the metric being the single point of access for centrally approved metric calculations - ie, where clauses could cause different datasets to be returned.

I'm torn on the subject. In the end I think it comes down to how much flexibility we want to give people.

Paths Forward

  1. We remove the WHERE clause
  2. We add the WHERE clause to the end of the generated SQL instead of the base query. This would limit functionality to only filtering on either the metric value (ie arbitrarily excluding or including values based on some condition) or the dimensions provided in the metric definitions.
  3. Leave as-is (as raised by @joellabes despite his opposition)

@jasnonaz I know you were a proponent of implementing this functionality and curious to hear your thoughts.

Column Name Lengths - Column Aliasing With Secondary Calcs Could Break Limit

@joellabes had a great observation during our call today that the additional functionality of prepending metric names onto secondary calculation aliases could cause the generated dataset to hit the character limits of different data warehouses.

The least permissive of the Big 4 connectors is Postgres with its 59 character limit. The second of these is Redshift with 127 bytes (which I'm assuming is synonymous with characters).

We should add additional testing to confirm whether this fails or not. If it does, we should potentially trim the metric name to the maximum number of characters that can be combined with the secondary calculation alias name.

start_date and end_date arguments need to be cast as date

Description of the Issue

When leveraging the start_date and end_date arguments for a metric I receive an error indicating the datatypes for period and upper/lower_bound fields within the final cte "where" condition do not match. I believe this issue is originating from within the get_metric_sql.sql macro.

How to reproduce

Create the following metric_test.sql file:

##metric_test.sql
select 
    *
from {{ metrics.metric(
    metric_name='new_opportunities',
    grain='month',
    dimensions=['lead_source'],
    start_date="2022-01-01",
    end_date="2022-03-03"
) }}

Execute dbt run -s metric_test and see the below error:

22:52:46  Running with dbt=1.0.3
22:52:47  Found 15 models, 14 tests, 0 snapshots, 0 analyses, 524 macros, 0 operations, 0 seed files, 4 sources, 0 exposures, 1 metric
22:52:47  
22:52:47  Concurrency: 4 threads (target='bigquery')
22:52:47  
22:52:47  1 of 1 START table model dbt_joe_salesforce.metric_test......................... [RUN]
22:52:52  1 of 1 ERROR creating table model dbt_joe_salesforce.metric_test................ [ERROR in 4.89s]
22:52:52  
22:52:52  Finished running 1 table model in 5.52s.
22:52:52  
22:52:52  Completed with 1 error and 0 warnings:
22:52:52  
22:52:52  Database Error in model metric_test (models/metric_test.sql)
22:52:52    No matching signature for operator >= for argument types: DATE, STRING. Supported signature: ANY >= ANY at [143:11]
22:52:52    compiled SQL at target/run/my_new_project/models/metric_test.sql
22:52:52  
22:52:52  Done. PASS=0 WARN=0 ERROR=1 SKIP=0 TOTAL=1

Proposed Solution

Digging further into this, I think the solution is pretty straightforward. I believe the issue can be addressed by casting the start_date and end_date references in the line below as dates:

{% if start_date %} '{{ start_date }}' {% else %} min(case when has_data then period end) over () {% endif %} as lower_bound,
{% if end_date %} '{{ end_date }}' {% else %} max(case when has_data then period end) over () {% endif %} as upper_bound
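
A sketch of the proposed change, wrapping the rendered values in an explicit cast:

{% if start_date %} cast('{{ start_date }}' as date) {% else %} min(case when has_data then period end) over () {% endif %} as lower_bound,
{% if end_date %} cast('{{ end_date }}' as date) {% else %} max(case when has_data then period end) over () {% endif %} as upper_bound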

I actually have a working branch on my forked repo that I am using to get around this error. See the quick changes here.

I am happy to open a PR if this is an issue others are encountering. Additionally, please let me know if I am missing something. Thanks!

[Bug] develop should validate that the model exists

Is this a new bug in dbt_metrics?

  • I believe this is a new bug in dbt_metrics
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

develop checks that model is provided but fails in a downstream macro if the model doesn't exist.

Expected Behavior

develop checks that the model field is provided and raises a compilation error if the model doesn't exist

Steps To Reproduce

Run develop with a model that doesn't exist.

Relevant log output

It fails on `get_model_relation`.

Environment

- dbt-adapter & version: Snowflake v1.2.0 
- dbt_metrics version: 0.3.2

Which database adapter are you using with dbt?

snowflake

Additional Context

No response

Unable to use function calls in start_date and end_date

Description of the Issue

When defining end_date='dateadd(day, -1, date_trunc("day", getdate()))' to dynamically get "yesterday" on a metric, the compiled SQL (Snowflake) throws an error:

[22007][100040] Date 'dateadd(day, -1, date_trunc("day", getdate()))' is not recognized

This is due to the fact that #20 and release tag 0.1.5 is now wrapping all start_date and end_date input in single-quoted strings.

How to reproduce:

##metric_test.sql
select 
    *
from {{ metrics.metric(
    metric_name='new_opportunities',
    grain='month',
    dimensions=['lead_source'],
    start_date="2022-01-01",
    end_date='dateadd(day, -1, date_trunc("day", getdate()))'
) }}

Execute dbt run -s metric_test, which will complete successfully. The error occurs when querying the table/view in the database:

Error

ANALYTICS_DEV> SELECT t.*
               FROM ANALYTICS_DEV.DEV_WAREHOUSE.NEW_OPPORTUNITIES t
               LIMIT 501
[2022-03-31 00:25:22] [22007][100040] Date 'dateadd(day, -1, date_trunc("day", getdate()))' is not recognized

Compiled SQL snippet

bounded as (
    select 
        *,
        cast('2022-01-01' as date) as lower_bound,
        cast('dateadd(day, -1, date_trunc("day", getdate()))' as date) as upper_bound
    from joined 
),

Proposed Solution

I want to define a metric that uses a mix of functions to calculate a date 'date_trunc("day", getdate()))', and plain date strings '2021-01-01'. The macro should detect:

  • If start_date contains a function call, then leave it un-quoted so the call will be executed
  • Else if start_date contains only a basic date string, wrap it in quotes
  • Else act like start_date isn't defined

Desired compiled SQL snippet

bounded as (
    select 
        *,
         cast('2021-01-01' as date) as lower_bound,
         cast(dateadd(day, -1, date_trunc("day", getdate())) as date) as upper_bound
    from joined 
),

PR to resolve

[Feature] Alter get_metric_sql inputs to be agnostic to method of initialization

Is this your first time submitting a feature request?

  • I have read the expectations for open source contributors
  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward extension of existing dbt functionality, rather than a Big Idea better suited to a discussion

Describe the feature

Hi My Name Is Callum And I Wrote "Bad" Code

One thing I love about dbt is that it brings software best practices to the data community. And sometimes that means a member of the data community (me) creating something that doesn't conform to software best practices - so we gotta fix that! The result is that I learn more about SWE best practices and the package gets better 🕺

Right now, the get_metric_sql macro has conditional logic to handle when it is invoked by the develop macro or the calculate macro. Different behavior is used for each of them to handle the nuances of each. Additionally, there is a variable that the develop macro provides that is not provided by calculate. This is a big no-no in software engineering land, as evidenced by this helpful article describing the anti-pattern.

The Fixes

Change the metric_list parameter passed to get_metric_sql from a list of metric objects to a list of dictionaries. These dictionaries will have a key of the metric name and a value of all the needed metric attributes.

This will allow us to remove the metric_definition variable passed to get_metric_sql from the develop macro which is almost the exact thing that I am describing. We can standardize how metric attributes are passed to the macro with this list of dictionaries.

Change the necessary validation macros to use the above object instead of the metric object. Then apply those same validation macros on the develop macro now that they are no longer dependent on the metric object

Validation of grain is dependent on the metric object in order to retrieve the list of acceptable time grains. Instead we will pull this from the parameters in the dictionary so that it can be used by both develop and calculate. This same behavior is followed for dimensions and expression metrics so it could be duplicated for them as well.

Change the looping logic for sql generation so that it produces the same output for expression metrics, the develop macro, and single metric macro calls.

The actual sql generation logic is held within the gen series of macros but the logic on how it is constructed (order, looping, etc) is held in get_metric_sql. Unfortunately it has 3 paths depending on the specifics of the input - we should standardize this to one that is flexible enough for all 3 instances. This will require editing get_metric_sql as well as gen_joined_metrics_cte which currently is only used in the multi-metric path.

For more clarity, the 3 paths are currently:

  • single non-expression metric - skips the gen_joined_metrics_cte that is used to join all base metrics on dimensions + grain
  • expression metric OR multiple metrics - loops through all base metrics then uses gen_joined_metrics_cte to create all the expression metrics
  • develop macro - similar to single non-expression metric but using develop macro inputs instead

Describe alternatives you've considered

Not doing this and leaving as is. It works but will be harder to keep things consistent if we continue to add new methods for generating or running the metric sql.

Who will this benefit?

The development team as well as anyone who wants to contribute. We can lay out the documentation of each input so that it is easier to create new methods on top of the base framework.

Are you interested in contributing this feature?

You betcha.

Anything else?

No response

--defer and metrics

Hello everyone, we are trying out dbt_metrics and saw a strange problem within our CI environment.
We use dbt build --select state:modified+ --defer to only execute models that have changed, and since metrics themselves do not count as models but rather are CTEs, they of course are not executed.
What happens instead is that when the CTE for the metric is compiled, it does not respect the --defer flag and uses the current schema rather than the deferred one. Does this make sense?

probably related to dbt-labs/dbt-core#5525

[Feature] Consider renaming the package

Is this your first time submitting a feature request?

  • I have read the expectations for open source contributors
  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward extension of existing dbt functionality, rather than a Big Idea better suited to a discussion

Describe the feature

A contributor wrote this code:

select * from {{ metric.calculate(
    metric=metric('items_sold'),
    grain='week',
    dimensions=[]
) }}

and when compiling got this error:

dbt.exceptions.RuntimeException: Runtime Error
  Compilation Error in analysis test_metrics (analyses/test_metrics.sql)
    'dbt.context.providers.RuntimeMetricResolver object' has no attribute 'calculate'. This can happen when calling a macro that does not exist. Check for typos and/or install package dependencies with "dbt deps".
01:09:55  Encountered an error:
Runtime Error
  Compilation Error in analysis test_metrics (analyses/test_metrics.sql)

The reason is that they invoked metric.calculate() not metricS.calculate(). metric is a dbt Core construct, like ref, and metrics is the name of the package.

It would be harder to make this mistake if we renamed the package, e.g. to dbt_metrics. This would also probably help with googling in the future because dbt_metrics isn't as polluted a namespace. (see also the poor sods who are trying to search how to do something using R)

Describe alternatives you've considered

  • Doing nothing.
  • Creating a package called ref to maximise chaos in the world.

Who will this benefit?

People who are trying to write code at 6pm and can't remember where the S's go

This will explicitly cause pain for people integrating against the package, so if we're going to do it we should do it ASAP.

Are you interested in contributing this feature?

No response

Anything else?

No response

[Feature] Upgrade Develop To Support Expression Metrics & Multiple Metrics

Is this your first time submitting a feature request?

  • I have read the expectations for open source contributors
  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward extension of existing dbt functionality, rather than a Big Idea better suited to a discussion

Rise of the Shadow DAG

develop was a hit with some of the beta customers but it is currently limited to only a single base metric. We should support multiple metrics and expression metrics.

Note: This work will depend on #87 so that get_metric_sql doesn't have even more diverging paths.

What does this look like

Once the work in #87 is done, this will take place entirely within the develop macro and the get_faux_metric_tree macro. We'll pass through the same list of metric dictionaries that would be passed in via calculate, but we'll need to figure out the relational path with some funky regex.

For example:

  • the yml provided defines 2 base metrics, 1 first level expression metric, 1 second level expression metric.
  • get_faux_metric_tree will use regex to create the appropriate hierarchy of metrics
  • develop will provide all the attributes associated with each metric
  • get_metric_sql will use those just like it would calculate to build the necessary sql

Concerns

The only concern I have here is the "shadow DAG" persisting in BI tools instead of the central canonical repo. This definitely feels like giving people more rope than the current implementation, which is very very restrictive. It does make me think that we maybe put some form of artificial boundary in place to limit the complexity, so that it is only jokingly a shadow DAG instead of actually one.

Describe alternatives you've considered

Not doing this? Or going down the bad path of adding even more weird behavior to get metric sql.

Who will this benefit?

Anyone who wants to test out metrics. Potentially also a method in the future of validating metrics before pushing back to some form of API creation.

Are you interested in contributing this feature?

YEAAAAA

Anything else?

No response

New "prior" secondary calculation

To ease period over period reporting in some BI tools, an additional secondary calculation could be added.

Example usage:

{{ metrics.metric(
    metric_name='sessions',
    grain='day',
    dimensions=[],
    secondary_calculations=[
        metrics.period_over_period(comparison_strategy="prior", interval=7, alias="sessions_7_days_ago"),
    ],
) }}

This would perform no actual calculation but pull the metric value from the correct interval.

[Bug] `start_date` and `end_date` don't limit partition scanning in BigQuery

Is this a new bug in dbt_metrics?

  • I believe this is a new bug in dbt_metrics
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

Using start_date and end_date in metrics calls where the underlying model is a BigQuery date-partitioned table does not limit data scanning.

I believe this is because of the way the start_date and end_date fields are rendered in the WHERE clause of the metric SQL (as the generic cast(<timestamp_col> as date)). I think BQ isn't quite sure what to do with "generic" type casting when it comes to time types, versus their built-in type casting methods. Using DATE(<timestamp_col>) instead does limit partition scanning.

Expected Behavior

A limit in data scanning when using start_date and end_date clauses.

Steps To Reproduce

  • Create a metric based on a BigQuery date-partitioned model
  • Call the metric specifying start_date and/or end_date
  • Data scanning should show a full column scan for column(s) specified in the metric

Relevant log output

No response

Environment

- dbt-adapter & version: dbt-bigquery==1.2.0
- dbt_metrics version: `master` (pulled September 12)

Which database adapter are you using with dbt?

bigquery

Additional Context

Relevant Slack thread: https://getdbt.slack.com/archives/C02CCBBBR1D/p1663019049038639

[Feature] Datatypes of metric in resulting dataframe should always return numeric

Is this your first time submitting a feature request?

  • I have read the expectations for open source contributors
  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward extension of existing dbt functionality, rather than a Big Idea better suited to a discussion

Describe the feature

Some partners have raised concerns about needing to know the datatype of the metric before visualizing it. For their purposes, we need to ensure that the metric is always returned as a numeric.

We already coalesce to 0 but there might be some odd behavior around min and max with date fields. If we see that the metric type isn't a numeric, we should fail loudly.
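A minimal sketch of what failing loudly could look like, assuming a hypothetical metric_data_type value resolved from the warehouse (this helper does not exist in the package today):

{% macro validate_metric_is_numeric(metric_name, metric_data_type) %}
    {# metric_data_type is a hypothetical input, not an existing package variable #}
    {% if metric_data_type not in ['integer', 'bigint', 'numeric', 'decimal', 'float'] %}
        {% do exceptions.raise_compiler_error(
            "Metric " ~ metric_name ~ " returns type " ~ metric_data_type ~ " - metrics must return a numeric."
        ) %}
    {% endif %}
{% endmacro %}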

Describe alternatives you've considered

No response

Who will this benefit?

Everyone who likes consistency

Are you interested in contributing this feature?

You betcha

Anything else?

No response

[Feature] Allow variables to be passed as dimensions

Is this your first time submitting a feature request?

  • I have read the expectations for open source contributors
  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward extension of existing dbt functionality, rather than a Big Idea better suited to a discussion

Describe the feature

When talking with the Fivetran team about their future development and potentially adding metrics, they mentioned that they might want custom pass-through dimensions included with the metric definition. Right now, metrics don't support this.

Describe alternatives you've considered

Not supporting this, as it is lower priority for initial launch.

Who will this benefit?

Anyone who wants to use the semantic layer with packages designed to be dynamic.

Are you interested in contributing this feature?

Yeas

Anything else?

No response

Expression Calculations On 0's Return Nulls Incorrectly

Description

Welp, the fix we instituted to handle the division-by-zero issue is now causing issues in expression metrics that are simple calculations.

Example

select *
from {{ metrics.calculate(
    metric('multi_dimension_expression_metric'),
    grain='month',
    dimensions=['had_discount', 'order_country']
) }}

Here the expression metric is defined as base_sum_metric + 1. But because we override the 0 value with null (to correctly disallow division by zero), we now have issues with simple calculations.
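A worked illustration, assuming the override behaves like nullif:

-- division by zero is correctly avoided:
select 10 / nullif(0, 0);   -- null, instead of a divide-by-zero error

-- but the same override breaks simple arithmetic:
select nullif(0, 0) + 1;    -- null, where 0 + 1 = 1 was expected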


[Feature] Should we default order the grain and dimensions?

Is this your first time submitting a feature request?

  • I have read the expectations for open source contributors
  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward extension of existing dbt functionality, rather than a Big Idea better suited to a discussion

Describe the feature

By default, the SQL generated by the metrics package doesn't order the final CTE. This decision was made because we noticed a small performance impact from ordering in our testing, and we were optimizing for the most efficient query we could produce while retaining functionality.

Should we include this by default? Should we have this as an option that can be added on?

Describe alternatives you've considered

Keeping things the same and pushing the ordering down to the consuming analytics engineer OR the integration partner.

Who will this benefit?

I'm unsure here - apart from the cleanliness of the returned dataset, I am not 100% convinced that this is a necessary feature. Most integration partners make the ordering very easy.

Are you interested in contributing this feature?

Yes

Anything else?

No response

[Feature] Metrics Should Support Rolling/Window Metrics

Is this your first time submitting a feature request?

  • I have read the expectations for open source contributors
  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward extension of existing dbt functionality, rather than a Big Idea better suited to a discussion

Describe the feature

Right now, metrics are aggregations applied to the underlying dataset grouped upon dimensions and the provided time grain. But this is a simplistic view of metrics - it excludes metrics that exist across a rolling range of time.

Example:

  • Weekly Active Users: Right now, metrics handles this perfectly well.
  • Rolling 14D Active Users: This is not supported

Why Would A Company Use Rolling Window Metrics?
There are certain usage patterns for products that have natural ebbs and flows across inconsistent time ranges. Weekly metrics might be unreliable, but a rolling 14d average might be a better representation of the health of the metric. Given this, we should expand our definition of metrics to support these as well.

Implementation

From early mock-ups, I believe this could require one new component of the metric definition. Note that I am not married to this name and think it could be called something else:

  • Interval: the number of days that the metric should look back. Ex: 14

NOTE: I am proposing that we tie this lookback functionality directly to days. The aggregation could still possibly be different.

These would live in the metric definition.
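A hypothetical sketch of how that could look in a metric definition (interval is the name proposed above, not a shipped parameter; all other fields follow the existing metric spec):

metrics:
  - name: rolling_14d_active_users
    label: Rolling 14D Active Users
    model: ref('user_activity')
    type: count_distinct
    sql: user_id
    timestamp: activity_date
    time_grains: [day, week, month]
    interval: 14   # hypothetical: number of days to look back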

Open Questions

  • We strongly believe that the lookback window is part of the metric definition itself. This helps us keep our guiding tenet that a metric value should be consistent everywhere that it is referenced
    • Example: Right now our strong thinking is that any lookback is itself part of the metric definition - if you want a 30D average instead of 14D, then that would be a different metric.
    • Should this be implicit in the metric definition? I'd love to hear counterpoints if people have them!
  • We believe that these lookbacks should be expressed in days, weeks, months, years.
    • How do people think about lookback metrics? I've always defined them in terms of days (14d, 30d, 90d, etc) but I know there are use cases for other time grains like month. Quarter is too nebulous and not a supported datetime format so I believe that should be excluded. Thoughts?
  • We believe that time grains at higher levels than day represent the desire to see the metric on that grain (week, month), not double aggregations to that level
    • Example: Your metric is 30D lookback and you select the week grain when aggregating your metric. What you're really asking is to see the 30D lookback for each week, not the sum of all 30D lookbacks for each day in that week.
    • Is there ever a use case for lookback metrics aggregating their aggregates? To me this seems like a capital B Bad Idea but I'd love to hear counterpoints

SQL Mockup

Mockups in further comments

Describe alternatives you've considered

Not supporting this functionality.

Who will this benefit?

Anyone who has a metric that is not possible with our current implementation.

Are you interested in contributing this feature?

yes

Anything else?

No response

Period Over Period Difference - First Period Should Be Null?

Description

When using the secondary calculation period_over_period and the difference comparison strategy, the first period should most likely be NULL, because there is no prior value to compare against. In the current implementation, we coalesce that value to 0 and potentially incorrectly display the value as equal to itself.


Fix

This logic comes from line 27 in secondary_calculation_period_over_period.sql.

{% macro default__metric_comparison_strategy_difference(metric_name, calc_sql) %}
    coalesce({{ metric_name }}, 0) - coalesce({{ calc_sql }}, 0)
{% endmacro %}

We should potentially find some way to determine if it is the first value and, if so, exclude the coalesce from the subtraction.
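One hedged sketch of that shape - return null whenever the lagged value is null rather than coalescing it to 0 (note this conflates "first period" with "prior value missing", so the real fix may need a more precise check):

{% macro default__metric_comparison_strategy_difference(metric_name, calc_sql) %}
    case
        when {{ calc_sql }} is null then null
        else coalesce({{ metric_name }}, 0) - {{ calc_sql }}
    end
{% endmacro %}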

[Feature] Clean up pytest testing

Is this your first time submitting a feature request?

  • I have read the expectations for open source contributors
  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward extension of existing dbt functionality, rather than a Big Idea better suited to a discussion

Describe the feature

Right now, we have a unique test file for each aspect of the package whose behavior we want to confirm. This is fine for ease of understanding but is a pretty bad pattern for quick CI. I'm not entirely sure what the performance boost would be from consolidating all of the unique folder paths into single files, but it's worth considering so that we're not waiting an hour or two for our tests to run.

Describe alternatives you've considered

Not doing this and accepting the time constraints.

Who will this benefit?

The developers of dbt_metrics

Are you interested in contributing this feature?

Yea - it was my initial implementation so the fix should fall on me :nervous:

Anything else?

No response

Adopting Pytest

With @dbeatty10 working on a framework for package maintainers to begin introducing pytest into their development cycle, we're presented with an opportunity to beta-test it for something we'd been planning on implementing in the first place!

Why Pytest?

Right now, dbt tests confirm that things are still working, but they can't test for expected failures. Pytest allows us to introduce errors that we WANT to fail and then verify that we're getting the result we want to see.

Use Real Refs and Real Metrics - Fixing CI Defer

Description

In many parts of the dbt_metrics codebase we reference the graph object because it is immensely helpful in understanding relationships. HOWEVER, it is not as tightly integrated into the rest of dbt as ref is, which became clear when testing out metrics with CI!

Fix

@jtcohen6 discovered the issue and laid it out well in the below loom. The solution is removing the get_metric_relation and get_model_relation macros and instead using ref and metric which are correctly handled in our parsing logic.

Obi-Wan Shows Us The Way: https://www.loom.com/share/6f5fb63404ad45e4aa15d5a4bac0ee8b

Validate that grain provided to macro is defined on the metric

Description

Right now, our validation just confirms that a grain field exists in the macro parameters provided by the end user. We should also verify that the provided grain is defined on the metric, or list of metrics, as well as in the calendar table.

Potential Pitfalls

  • How this interacts with the calendar table: We need some way of verifying that the date_{grain} field exists in the calendar table provided by the dbt project.

Implementation

The validation should:

  • Build a list of acceptable grains from the metric
  • Build a list of acceptable grains from the calendar table
  • Confirm that the provided grain is contained within both lists above. If not, return a compilation error
    • This is enough of a diverging condition that we'd want to provide two different error messages for the two ways this can error out.
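A rough sketch of that validation, assuming hypothetical inputs for the metric's grains and the calendar table's columns (neither helper exists in the package today):

{% macro validate_grain(grain, metric_list, calendar_columns) %}
    {# calendar_columns is a hypothetical list of column names from the calendar table #}
    {% for metric in metric_list %}
        {% if grain not in metric.time_grains %}
            {% do exceptions.raise_compiler_error(
                "The grain " ~ grain ~ " is not defined in the metric " ~ metric.name
            ) %}
        {% endif %}
    {% endfor %}
    {% if 'date_' ~ grain not in calendar_columns %}
        {% do exceptions.raise_compiler_error(
            "The column date_" ~ grain ~ " does not exist in the calendar table"
        ) %}
    {% endif %}
{% endmacro %}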

[SEMANTIC-53] [Feature] Make dbt_metrics smarter

Is this your first time submitting a feature request?

  • I have read the expectations for open source contributors
  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward extension of existing dbt functionality, rather than a Big Idea better suited to a discussion

Describe the feature

A table scan for you, a table scan for me

In our current implementation, each parent metric included in the full set goes through its own full build_metric_sql - this means each parent metric hits the underlying table even when they're pulling the exact same dataset. Obviously that is less than optimal for people who are querying multiple metrics from the same table.

How Do We Handle It

Initially my thought was that we could introduce some additional logic into the parsing of metric_dictionary and metric_tree in order to group parent metrics by their model. But then I realized filters would be an issue - Metric A and Metric B might have mutually exclusive filters whose application would affect each other's values and thus break one of our key tenets.

So, the idea becomes:

  • Group all metrics without any filters into the same base query. Any metric with a filter then gets its own gen_base_query
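A sketch of the intended query shape for two unfiltered metrics defined on the same model (table and metric names are illustrative):

with base_query as (
    -- one shared table scan for every unfiltered metric on this model
    select date_day, order_id, revenue
    from orders
),

order_count as (
    select date_day, count(order_id) as order_count
    from base_query
    group by 1
),

total_revenue as (
    select date_day, sum(revenue) as total_revenue
    from base_query
    group by 1
)

select *
from order_count
left join total_revenue using (date_day)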

Describe alternatives you've considered

For once I'm taking this part of the issue seriously - the alternative is that we keep the logic the same. Rather than being my tongue-in-cheek way to fill this cell, I think that's a legitimate suggestion given the complications raised above. It does leave some inefficiency around table scans when using multiple metrics, but we don't know yet how common that behavior is.

Who will this benefit?

Anyone using multiple metrics from the same base model would see increased performance were we to implement this. Outside of that group, I am unsure.

Are you interested in contributing this feature?

Yes

Anything else?

No response

Quotes around Identifiers/Timestamps in compiled SQL for Snowflake

We’ve started using metrics_store for experiments and are using Snowflake as the connected data warehouse.

Our metric definition looks like the following; here we are counting the number of events ingested from Segment or Amplitude per day/week/month.

metrics:
  - name: event_count
    label: Event Counts
    model: ref('activity_stream')
    description: "Count of events logged"

    type: count
    sql: activity

    timestamp: timestamp
    time_grains: [day, week, month]

    dimensions:
      - source

    filters:
      - field: timestamp
        operator: '>='
        value: '2021-01-01'

The model used for calculating the metrics is as follows (borrowed from the example shown in the Readme):

{{ config(materialized = 'incremental') }}

select *
from {{ metrics.metric(
    metric_name='event_count',
    grain='week',
    dimensions=['source'],
    secondary_calculations=[
        metrics.period_over_period(comparison_strategy="ratio", interval=1, alias="pop_1wk")
    ]
) }}

However, the compiled SQL has the following issues:

  • There are a few column names that need to be wrapped in double quotes for Snowflake: source, activity, and timestamp. If not, they show up as invalid identifiers with the following error: invalid identifier 'TIMESTAMP'
  • The timestamp filter mentioned in the metric shows up as where 1=1 and timestamp >= 2021-01-01 in the compiled SQL. This is missing quotes around the timestamp identifier and around the date 2021-01-01

Once we copied over the SQL and updated the quotes around identifiers/dates, the metric creation completed successfully!
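For reference, a sketch of the filter as it compiles today versus the hand-corrected form that ran (identifier casing depends on how the column was created in Snowflake):

-- compiled today: unquoted identifier and unquoted date literal
where 1=1 and timestamp >= 2021-01-01

-- hand-corrected form
where 1=1 and "timestamp" >= '2021-01-01'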

[Bug] metrics.calculate crashes when model `ref` in metric definition uses double quotes

Is this a new bug in dbt_metrics?

  • I believe this is a new bug in dbt_metrics
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

When a dbt metric definition's model field specifies a ref using double quotes (for example, model: ref("bookings")), metrics.calculate crashes while attempting to fetch the model relation during the dbt compile stage (log output below).

Expected Behavior

Double-quotes in a ref should be supported.
dbt-core compiles successfully when a ref is specified with double quotes, whether in a metric definition or a Jinja block. It seems reasonable that double quotes should be accepted, as they are valid Python syntax.

Steps To Reproduce

Can't share my company's code, but I'd recommend trying to reproduce by modifying a fixture to use double quotes in a metric definition's model field. I tried to reproduce it this way, but it was taking a bit too long to get my local environment into a state that made pytest happy.

Relevant log output

dbt.exceptions.RuntimeException: Runtime Error
  Compilation Error in model average_booking_value (models/metrics/average_booking_value.sql)
    No first item, sequence was empty.
    
    > in macro get_metric_sql (macros/get_metric_sql.sql)
    > called by macro default__calculate (macros/calculate.sql)
    > called by macro calculate (macros/calculate.sql)
    > called by macro get_model_relation (macros/graph_parsing/get_model_relation.sql)
    > called by model average_booking_value (models/metrics/average_booking_value.sql)
15:04:59  Encountered an error:
Runtime Error
  Compilation Error in model average_booking_value (models/metrics/average_booking_value.sql)
    No first item, sequence was empty.
    
    > in macro get_metric_sql (macros/get_metric_sql.sql)
    > called by macro default__calculate (macros/calculate.sql)
    > called by macro calculate (macros/calculate.sql)
    > called by macro get_model_relation (macros/graph_parsing/get_model_relation.sql)
    > called by model average_booking_value (models/metrics/average_booking_value.sql)


Environment

- dbt-adapter & version: v1.2.0
- dbt_metrics version: v0.3.1

Which database adapter are you using with dbt?

snowflake

Additional Context

No response

Build Common List of Dimensions Across All Metrics (& Perform Metric count match)

Description

This issue stems from a great realization during a code review with @joellabes. The current method of creating a valid list of dimensions is brittle and can be broken by having more than one metric. A dimension that is excluded in metric 2 might then be re-included by metric 3 if they share it.

    {%- set dimension_list = [] -%}
    {% for metric in metric_list %}
        {# get all dimensions in metric #}
        {%- set metric_dimensions = metrics.get_valid_dimension_list(metric) -%}
        {# This line removes any of the loop metric's dimensions that don't already exist in the dimension list.
        This creates a smaller list that only contains the subset #}

        {# TODO #50 build common list across all metrics and then find ones that appear X (metric count) times #}
        {%- set new_dimensions = (metric_dimensions | reject('in', dimension_list) | list) -%}
        {% for dim in new_dimensions %}
            {%- do dimension_list.append(dim) -%}
        {% endfor %}
    {% endfor %}

Fix

The proposed solution to this is:

  • Create a dictionary of dimensions from all metrics and store the count of times each appears, then filter out any that don't appear as many times as there are metrics provided to the macro (see the sketch below).
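A sketch of that proposal, reusing the metric_list and metrics.get_valid_dimension_list from the snippet above:

    {%- set dimension_counts = {} -%}
    {% for metric in metric_list %}
        {% for dim in metrics.get_valid_dimension_list(metric) %}
            {%- do dimension_counts.update({dim: dimension_counts.get(dim, 0) + 1}) -%}
        {% endfor %}
    {% endfor %}

    {# keep only the dimensions that appear in every metric provided #}
    {%- set dimension_list = [] -%}
    {% for dim, count in dimension_counts.items() %}
        {% if count == metric_list | length %}
            {%- do dimension_list.append(dim) -%}
        {% endif %}
    {% endfor %}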

Cartesian Joins Create ALL Combinations, Not All Possible Combinations

While testing this package I noticed that the resulting table contains all combinations of the values in the dimension list, as opposed to only the combinations that actually occur. This creates rows where the combination of values could never occur in the base dataset.

For example, the testing case I was going through contained two dimensions called user_name and organization_name. There's a relationship between those 2 fields in that an organization can have multiple users but a user is only going to be part of 1 organization.

The dataset produced by the macro has rows for a particular user in every potential organization.

In the actual metric calculation, that's not a big deal because the values provided are equal to 0, but it could be an issue when working with BI tools. Drop-downs/filters would contain combinations that aren't possible and potentially give data consumers the wrong impression of the relationship between those two fields.

Not sure if this is intended behavior but I figured I'd flag just in case!

Rename dimension list to better reflect variable name

Description

The section of code that handles the creation of a list of valid dimensions needs to better reflect the actions being performed. In other words, this issue involves changing structure and naming to:

  • From the metrics provided to the macro, extrapolate a list of valid dimensions that can be used in the macro call

Codeblock

    {# TODO get a valid common dimension name to clarify what this is #}
    {%- set dimension_list = [] -%}
    {% for metric in metric_list %}
        {# get all dimensions in metric #}
        {%- set metric_dimensions = metrics.get_valid_dimension_list(metric) -%}
        {# This line removes any of the loop metric's dimensions that don't already exist in the dimension list.
        This creates a smaller list that only contains the subset #}

        {# TODO build common list across all metrics and then find ones that appear X (metric count) times #}
        {%- set new_dimensions = (metric_dimensions | reject('in', dimension_list) | list) -%}
        {% for dim in new_dimensions %}
            {%- do dimension_list.append(dim) -%}
        {% endfor %}
    {% endfor %}

Handling Non-Time Metric Aggregations

Problem Statement

We've now heard from multiple partners that sometimes they want to return a single number representing a metric. They don't need it broken out by day, week, month, etc - they want the single numerical representation of that business concept.

And we can't do that right now. With time_grain and timestamp being required fields for a metric definition, we currently only allow returned datasets that are broken down by some sort of time aggregation. Partners could hypothetically pull the most recent value for the metric, but that only works in some cases - a multi-month, or even multi-year, aggregation of a metric cannot be handled with current functionality.

Feature Description

As I see it, there are two options as we move forward:

  1. Re-architect the metric node to no longer require a timestamp and time_grains field. We would then change the logic in the metrics package to handle when these are missing and still return the correct result.
    • There still exists the possibility that a metric with a timestamp would sometimes need to be aggregated over all time.
  2. Add support for an all type in time_grains. This would aggregate over all time and return the metric value. However, this still has implementation questions:
    • Do we return a date field with the metric? My gut instinct is no, but there could be some shenanigans around what time range really constitutes all

In my opinion, the perfect solution is a combination of the two. We should support all, or some alternative name, to allow for use cases that aren't focused on time-series data. Additionally, we should consider re-architecting the metric node to allow for non-timestamped metrics, which is an area we might run into.

How Does This Relate To The Vision

This is a great issue for us to have a broader discussion on what the future of the semantic layer looks like. Is it a permissive framework that gives users as much flexibility as they need, or is it a tightly woven experience built around an opinionated stance on what metrics are and the best practices associated with them?

Period Over Period Cast To Float

Description

Once again, the most confusing and asinine aspect of PostgreSQL strikes - if you do simple division on integers, it returns an integer instead of a float or floating numeric.

Example:

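A quick illustration of the behavior shown in the original screenshot (standard Postgres semantics):

select 1 / 2;                  -- 0: integer / integer performs integer division
select cast(1 as float) / 2;   -- 0.5: casting either operand to float fixes it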

Solution

In the period_over_period secondary calculation, we need to cast to float for Postgres and Redshift, as they both exhibit this known behavior.

[Bug] all_time missing `date_all_time` in calendar_table

Is this a new bug in dbt_metrics?

  • I believe this is a new bug in dbt_metrics
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

Using the all_time grain and receiving the following error

  • I have a custom calendar (not sure if anything needs to change there, no indicator in docs)
  • I have added the all_time grain to the metric definition
  • I have included the all_time grain in the grain definition on the metric model

Expected Behavior

A user should be able to define an all_time grain according to the documentation and have the relevant date_grain column generated by the macro.

Steps To Reproduce

  1. Define a metric and include the all_time grain in the metric definition
  2. Create a select statement using metrics.calculate and specify the grain parameter as all_time
  3. Compile | Run the SQL
  4. Error is displayed

Relevant log output

(.venv) ➜  **redacted** git:(develop) ✗ dbt run -m **redacted**_all_time
Running with dbt=1.2.0
Found 192 models, 1923 tests, 2 snapshots, 2 analyses, 647 macros, 0 operations, 0 seed files, 131 sources, 0 exposures, 2 metrics

Concurrency: 4 threads (target='local')

1 of 1 START view model **redacted**_all_time  [RUN]
1 of 1 ERROR creating view model **redacted**_all_time  [ERROR in 1.93s]

Finished running 1 view model in 0 hours 0 minutes and 6.34 seconds (6.34s).

Completed with 1 error and 0 warnings:

Database Error in model **redacted**_all_time (**redacted**/metrics/m_conversations_per_user_all_time.sql)
  Name date_all_time not found inside calendar_table at [125:24]
  compiled SQL at **redacted**/metrics/**redacted**_all_time.sql

Environment

- dbt-adapter & version:
- dbt_metrics version:

Which database adapter are you using with dbt?

bigquery

Additional Context

https://getdbt.slack.com/archives/C02CCBBBR1D/p1662754222962929

Multiple metrics macro does not include results when dimension has NULL values

Is this a new bug in dbt_metrics?

  • I believe this is a new bug in dbt_metrics
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

It looks like there is an issue when grouping metrics by dimensions with NULL values. I have a dimension that in certain cases has a NULL value, and when I use the macro for multiple metrics including this dimension, the output does not include the rows where this dimension is NULL even though the metric value is greater than zero.

Expected Behavior

In the output of the model, I would expect rows that include metric values even when the dimension is NULL.

Steps To Reproduce

No response

Relevant log output

No response

Environment

- dbt-adapter & version:
- dbt_metrics version:

Which database adapter are you using with dbt?

snowflake

Additional Context

No response

[Feature] Metrics Should Support All-Time Grain Metrics

Is this your first time submitting a feature request?

  • I have read the expectations for open source contributors
  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward extension of existing dbt functionality, rather than a Big Idea better suited to a discussion

Describe the feature

From the beginning of time (well, of the metrics issue), people have been asking about non-time based metrics. Some people believed that non-time based metrics were a requirement, while others thought we'd run into practical constraints around usage. George Fraser of Fivetran had this comment:

This isn’t just a technical issue: there are lots of business metrics which are time-series in principle but in practice we choose to calculate at just the present time, because it’s simpler.

He was right! We've heard from a lot of our customers that they're interested in getting single numbers returned for their metrics. The metrics themselves might be time-series in principle but the end consumer is only interested in the present value.

So let's add that functionality.

Implementation

There are two different options we have for adding this type of functionality.

  1. Alter the metric node to make timestamp & time_grains optional - if those attributes are not present then we return the single aggregate.
  2. Add all_time as an option for the grain input that aggregates across the entire date range.

I am almost certain that option 1 is NOT the route we want to go. Making these parameters optional has the potential to introduce too many issues downstream. For example:

  • Our partners rely on there being a grain input. They'd need to change their integrations to support that parameter becoming conditional
  • There are a host of validation issues that could arise if we no longer enforce timestamp and time_grains. Users whose metrics actually do require them wouldn't know a metric is misconfigured until it is actually run.

So with that being said, I believe that we should add support for all_time (or some similarly named input) in time grains.
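Under that option, usage might look something like this sketch (all_time is the proposed name, not yet implemented, and per the question below it would likely also need to be listed in the metric's time_grains):

select *
from {{ metrics.calculate(
    metric('total_revenue'),
    grain='all_time'
) }}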

Open Questions

  • Should this be defined in the metric like all other time grains, or should it be supported automatically for all metrics?
    • I feel strongly that it should have to be defined.
  • Should it be called all_time? Is there some other name that better captures it?
  • What error message should we raise to show that secondary calculations wouldn't be supported with this functionality?
    • Should we try and figure out some way of supporting them?

Describe alternatives you've considered

Option 1, raised in the message above. I think that is a capital B Bad Idea but I am willing to be convinced otherwise!

Who will this benefit?

Anyone who wants to return metric results across all time.

Are you interested in contributing this feature?

Yep!

Anything else?

No response

Unable to create Period to date secondary calc with period as day when metric grain is day

The following metric fails when using the dbt-bigquery adapter.
Metric model file named order_count_daily.sql:

-- depends_on: {{ ref('enriched_orders') }}
{{ config(materialized = 'table') }}

select *
from {{ metrics.metric(
    metric_name='order_count_daily',
    grain='day',
    dimensions=[],
    secondary_calculations=[
        metrics.period_to_date(aggregate="sum", period="day", alias="pod_sum_month")
    ]
) }}

Yaml file containing metric definition

metrics:
- name: order_count_daily
  label: Order count daily
  model: ref('enriched_orders')
  description: Count of orders for each day
  type: count
  sql: order_id
  timestamp: order_date
  time_grains: [day]
  filters: []
  dimensions: []

Error -

10:56:28  Database Error in model order_count_daily (models/mascot/order_count_daily.sql)
10:56:28    Name date_day is ambiguous inside spine at [69:67]
10:56:28    compiled SQL at target/run/jaffle_shop/models/mascot/order_count_daily.sql

The same secondary calculation works if the metric grain is month & the period is month.

The compiled SQL looks like so:



  create or replace table `badger-345908`.`transformed`.`order_count_daily`
  
  
  OPTIONS()
  as (
    -- depends_on: `badger-345908`.`transformed`.`enriched_orders`






select *
from -- Need this here, since the actual ref is nested within loops/conditions:
    -- depends on: `badger-345908`.`transformed`.`dbt_metrics_default_calendar`
    (with source_query as (

    select
        /* Always trunc to the day, then use dimensions on calendar table to achieve the _actual_ desired aggregates. */
        /* Need to cast as a date otherwise we get values like 2021-01-01 and 2021-01-01T00:00:00+00:00 that don't join :( */
        cast(timestamp_trunc(
        cast(cast(order_date as date) as timestamp),
        day
    ) as date) as date_day,
        
        order_id as property_to_aggregate

    from `badger-345908`.`transformed`.`enriched_orders`
    where 1=1
),

 spine__time as (
     select 
        /* this could be the same as date_day if grain is day. That's OK! 
        They're used for different things: date_day for joining to the spine, period for aggregating.*/
        date_day as period, 
        
            date_day,
        
        
        date_day
     from `badger-345908`.`transformed`.`dbt_metrics_default_calendar`
 ),

spine as (

    select *
    from spine__time

),

joined as (
    select 
        spine.period,
        
        spine.date_day,
        
        

        -- has to be aggregated in this CTE to allow dimensions coming from the calendar table
    count(source_query.property_to_aggregate)
 as order_count_daily,
        logical_or(source_query.date_day is not null) as has_data

    from spine
    left outer join source_query on source_query.date_day = spine.date_day
    
    group by 1, 2

),

bounded as (
    select 
        *,
         min(case when has_data then period end) over ()  as lower_bound,
         max(case when has_data then period end) over ()  as upper_bound
    from joined 
),

secondary_calculations as (

    select *
        
        , 
    sum(order_count_daily)
over (
            partition by date_day
            
            order by period
            rows between unbounded preceding and current row
        )

as pod_sum_month

        

    from bounded
    
),

final as (
    select
        period
        
        , coalesce(order_count_daily, 0) as order_count_daily
        
        , pod_sum_month
        

    from secondary_calculations
    where period >= lower_bound
    and period <= upper_bound
    order by 1
)

select * from final

) metric_subq
  );

As can be seen, the issue is with the following CTE, where the date_day column is selected twice, leading to ambiguity when selecting this column from the CTE:

spine__time as (
     select 
        /* this could be the same as date_day if grain is day. That's OK! 
        They're used for different things: date_day for joining to the spine, period for aggregating.*/
        date_day as period, 
        
            date_day,
        
        
        date_day
     from `badger-345908`.`transformed`.`dbt_metrics_default_calendar`
 ),

Although it doesn't make much sense to create a period-to-date secondary calc with the period as day when the metric grain is day (since the metric and the secondary calc will be the same), I feel the model should not fail, and the behaviour should be consistent across granularities.
