Code Monkey home page Code Monkey logo

dbt-risingwave's People

Contributors

algosday avatar chenzl25 avatar mattiasmts avatar neverchanje avatar stdrc avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Forkers

siddyp northvolt

dbt-risingwave's Issues

failed to run the scaffold project generated by `dbt init`

Reproduction step:

  1. dbt init
  2. dbt run
dev=> CREATE VIEW "dev"."public"."my_first_dbt_model" AS (
dev(>     WITH source_data AS (
dev(>         SELECT
dev(>             1 AS id
dev(>         UNION ALL
dev(>         SELECT
dev(>             NULL AS id
dev(>     )
dev(>     SELECT
dev(>         *
dev(>     FROM
dev(>         source_data
dev(> );

ERROR:  QueryError: Invalid input syntax: UNION types INT32 of column id is different from types VARCHAR of column id

Essentially, it's because RW is unable to run the following query

select 1 as id union all select null as id;

The query compiler incorrectly recognizes null as varchar.

feat: Rerun only files which changed

From user:

About the dbt plugin, looks like it’s working pretty well but there is a major problem with it.
It always recreate all the models. So even if I changed just one file the run itself is very slow.
Do you think it’s possible to change this? So files which hasn’t changed since the last run will be skipped

Support dbt Core 1.7

I would like to use a new feature in dbt Core 1.7. Unfortunately the connector is only compatible with dbt 1.6:

install_requires=["dbt-postgres~=1.6.0"],

Would be nice if the support for dbt 1.7 can be added.

disallow creating table model

We currently don't support CTAS (CREATE TABLE AS). We should provide an error to the user and ensure that it's not compilable.

Discussion: Support incremental materialization

Currently, dbt-risingwave doesn't support incremental materialization and recommends users use materialized_view instead, because RisingWave could maintain materialized views incrementally in real-time. However, dbt itself is well known for the incremental materialization, many users know how incremental works and how to use it. When they fully understand the trade-off between real-time stream processing and incremental batch processing, you want to use incremental materialization to optimize their workload. E.g. they can run dbt for incremental models every midnight.

Materialized views are fully refreshed

From the dbt docs a materialized view and a incremental should only fully refresh if it does not already exist or if you supply the ´--full-refresh´ parameter.

The risingwave docs says that it implements dbt's incremental mode i.e. https://docs.getdbt.com/docs/build/incremental-models but I'm not certain that it is implemented correctly.

In dbt-risingwave every time I run dbt run, all materialized views {{ config(materialized='materializedview') }} are dropped and re-created which is not expected when reading the official dbt documentation.

risingwave's driver always maps varchar to varchar(0)


{% macro risingwave__get_columns_in_relation(relation) -%}
  {% call statement('get_columns_in_relation', fetch_result=True) %}
      select
          column_name,
          data_type,
          0 as character_maximum_length, -- todo
          0 as numeric_precision,
          0 as numeric_scale

      from {{ relation.information_schema('columns') }}
      where table_name = '{{ relation.identifier }}'
        {% if relation.schema %}
        and table_schema = '{{ relation.schema }}'
        {% endif %}
      order by ordinal_position

  {% endcall %}
  {% set table = load_result('get_columns_in_relation').table %}
  {{ return(sql_convert_columns_in_relation(table)) }}
{% endmacro %}

It's because that in the postgres driver where risingwave's driver derives, character_maximum_length is a required param.

Support persisting documentation

RisingWave supports COMMENT ON syntax, as documented here: https://docs.risingwave.com/docs/current/sql-comment-on/

However, if I set the persist_docs config like this, as documented at https://docs.getdbt.com/reference/resource-configs/persist_docs:

    +persist_docs:
      relation: true
      columns: true

Then dbt will fail with an error like this:

10:11:19  SQL status: CREATE_MATERIALIZED_VIEW in 8.0 seconds
10:11:19  Using risingwave connection "model.tt_rw_staging.stg_my_view.v1"
10:11:19  On model.tt_rw_staging.stg_my_view.v1: /* {"app": "dbt", "dbt_version": "1.6.9", "profile_name": "thumbtack_rw", "target_name": "local", "node_id": "model.tt_rw_staging.stg_my_view.v1"} */



  comment on materialized_view "thumbtack"."dbt_jamesjohnston_stg_business"."stg_my_view_v1" is $dbt_comment_literal_block$This is the description of my table
$dbt_comment_literal_block$;


10:11:19  Postgres adapter: Postgres error: Failed to run the query

Caused by:
  sql parser error: Expected comment object_type, found: materialized_view at line:5, column:31
Near "
  comment on materialized_view"

Obviously this is invalid SQL.... I'm guessing the feature is not supported yet.

Definitely a low-priority nice-to-have for me, since I'll also be looking at using the dbt documentation generator. But in the future it would be nice if anyone directly browsing the database can have these descriptions close at hand.

unable to run `dbt_utils.get_relations_by_pattern`

dbt_utils.get_relations_by_pattern is a commonly used function that can list the relations that match the pattern.

https://github.com/dbt-labs/dbt-utils/blob/6ba7b660b1d1b4e3b41cb0cf6c3c0e4a70ae54e4/macros/sql/get_tables_by_pattern_sql.sql#L6C10-L6C44

SELECT
    DISTINCT table_schema AS "table_schema",
    table_name AS "table_name",
    CASE
        table_type
        WHEN 'BASE TABLE' THEN 'table'
        WHEN 'FOREIGN' THEN 'external'
        WHEN 'MATERIALIZED VIEW' THEN 'materializedview'
        ELSE LOWER(table_type)
    END AS "table_type"
FROM
    dev.information_schema.tables
WHERE
    table_schema ILIKE 'public'
    AND table_name ILIKE 'test_table%'
    AND table_name NOT ILIKE ''

Now the issue is that ILIKE is unsupported by RisingWave.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.