Comments (6)

fabianmurariu commented on September 13, 2024

I'll try next week to open source the code where this is happening

from arrow-datafusion.

fabianmurariu commented on September 13, 2024

Strange. I'm encountering this with custom TableProviders; I'll be able to share more next week, though.

alamb commented on September 13, 2024

Thanks for the report @fabianmurariu

Is there any way we can get a self-contained reproducer? I ran the query in the description, and the session doesn't seem to have all the tables it references:

> WITH e1 AS (SELECT * FROM _default), e2 AS (SELECT * FROM _default), a AS (SELECT * FROM nodes), b AS (SELECT * FROM nodes), c AS (SELECT * FROM nodes) SELECT a.name, b.name, c.name FROM e1 JOIN a ON e1.src = a.id JOIN b ON e1.dst = b.id JOIN e2 ON b.id = e2.src JOIN c ON e2.dst = c.id WHERE e1.id <> e2.id;
Error during planning: table 'datafusion.public._default' not found

alamb commented on September 13, 2024

Thanks @fabianmurariu

cc @mustafasrepo in case you have any thoughts

mustafasrepo commented on September 13, 2024

The physical plans before and after the enforce distribution rule might help in locating the problem. You can use the get_plan_string helper to print this information. Putting prints at the start and at the end of

will produce the necessary logs, as far as I can tell.

mustafasrepo commented on September 13, 2024

> Thanks @fabianmurariu
>
> cc @mustafasrepo in case you have any thoughts

I tried to reproduce the problem by defining only the fields the query strictly needs, using the statements below:

statement ok
CREATE TABLE IF NOT EXISTS _default (name VARCHAR, src BIGINT, dst BIGINT, id BIGINT) AS VALUES('mustafa', 1, 2, 0),('test', 2, 3, 1);

statement ok
CREATE TABLE IF NOT EXISTS nodes (name VARCHAR,id BIGINT) AS VALUES('TR', 1),('GR', 2);

statement ok
set datafusion.execution.target_partitions = 8;

query TTT
WITH e1 AS (SELECT * FROM _default),
  e2 AS (SELECT * FROM _default),
  a AS (SELECT * FROM nodes),
  b AS (SELECT * FROM nodes),
  c AS (SELECT * FROM nodes)
  SELECT a.name, b.name, c.name
FROM e1 JOIN a ON e1.src = a.id JOIN b ON e1.dst = b.id JOIN e2 ON b.id = e2.src JOIN c ON e2.dst = c.id WHERE e1.id <> e2.id;
----

However, this test seems to pass, so unfortunately I cannot debug further. Once I see the plans, or a full reproducer, I will take another look.
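A lighter-weight way to collect the plans requested earlier in the thread, without adding prints to DataFusion itself, may be `EXPLAIN VERBOSE`, which lists the plan after each optimizer pass. The sketch below assumes the `_default` and `nodes` tables created above. The exact labels of the output rows vary between DataFusion versions, so the rule name to look for is an assumption, and a reproduction against custom TableProviders would need those providers registered instead of these tables.

```sql
-- Sketch only: assumes the _default and nodes tables from the statements above.
-- Use the same partition count as the attempted reproduction.
set datafusion.execution.target_partitions = 8;

-- EXPLAIN VERBOSE prints the plan after each optimizer rule.
-- Compare the physical plan rows just before and just after the
-- enforce distribution rule (row labels differ across versions).
EXPLAIN VERBOSE
WITH e1 AS (SELECT * FROM _default),
  e2 AS (SELECT * FROM _default),
  a AS (SELECT * FROM nodes),
  b AS (SELECT * FROM nodes),
  c AS (SELECT * FROM nodes)
SELECT a.name, b.name, c.name
FROM e1
JOIN a ON e1.src = a.id
JOIN b ON e1.dst = b.id
JOIN e2 ON b.id = e2.src
JOIN c ON e2.dst = c.id
WHERE e1.id <> e2.id;
```

Pasting the resulting rows into the issue would give the before and after plans in one place.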
