Comments (6)
I'll try next week to open source the code where this is happening
from arrow-datafusion.
Strange, I'm encountering this with custom TableProviders, I'll be able to share more next week tho
Thanks for the report @fabianmurariu
Is there any way we can get a self-contained reproducer? I ran the query in the description, but not all of the tables it references are defined:
> WITH e1 AS (SELECT * FROM _default), e2 AS (SELECT * FROM _default), a AS (SELECT * FROM nodes), b AS (SELECT * FROM nodes), c AS (SELECT * FROM nodes) SELECT a.name, b.name, c.name FROM e1 JOIN a ON e1.src = a.id JOIN b ON e1.dst = b.id JOIN e2 ON b.id = e2.src JOIN c ON e2.dst = c.id WHERE e1.id <> e2.id;
Error during planning: table 'datafusion.public._default' not found
Thanks @fabianmurariu
cc @mustafasrepo in case you have any thoughts
The physical plans before and after the enforce distribution rule might help in locating the problem. You can use the get_plan_string helper to print this information. Putting prints at the start and at the end of
will produce the necessary logs, as far as I can tell.
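As a possible alternative that does not require editing DataFusion's source, the plans around each optimizer rule can usually be inspected from SQL. The sketch below assumes a DataFusion version in which EXPLAIN VERBOSE emits one row per optimizer rule (for example a row labelled along the lines of "physical_plan after EnforceDistribution"). The exact labels vary between versions, and the tables are the ones named in the original query.

```sql
-- Use the same partition count as the failing run, since the enforce
-- distribution rule only adds repartitioning when target_partitions > 1.
SET datafusion.execution.target_partitions = 8;

-- EXPLAIN VERBOSE lists the plan after each logical and physical
-- optimizer rule; compare the rows just before and just after the
-- enforce distribution step.
EXPLAIN VERBOSE
WITH e1 AS (SELECT * FROM _default),
     e2 AS (SELECT * FROM _default),
     a AS (SELECT * FROM nodes),
     b AS (SELECT * FROM nodes),
     c AS (SELECT * FROM nodes)
SELECT a.name, b.name, c.name
FROM e1
JOIN a ON e1.src = a.id
JOIN b ON e1.dst = b.id
JOIN e2 ON b.id = e2.src
JOIN c ON e2.dst = c.id
WHERE e1.id <> e2.id;
```

With custom TableProviders, the tables have to be registered in the same session before running this, so that the scan nodes in the output match the failing setup.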
I tried to reproduce the problem using only the fields the query strictly needs, with the queries below:
statement ok
CREATE TABLE IF NOT EXISTS _default (name VARCHAR, src BIGINT, dst BIGINT, id BIGINT) AS VALUES('mustafa', 1, 2, 0),('test', 2, 3, 1);
statement ok
CREATE TABLE IF NOT EXISTS nodes (name VARCHAR,id BIGINT) AS VALUES('TR', 1),('GR', 2);
statement ok
set datafusion.execution.target_partitions = 8;
query TTT
WITH e1 AS (SELECT * FROM _default),
e2 AS (SELECT * FROM _default),
a AS (SELECT * FROM nodes),
b AS (SELECT * FROM nodes),
c AS (SELECT * FROM nodes)
SELECT a.name, b.name, c.name
FROM e1 JOIN a ON e1.src = a.id JOIN b ON e1.dst = b.id JOIN e2 ON b.id = e2.src JOIN c ON e2.dst = c.id WHERE e1.id <> e2.id;
----
However, this test seems to pass, so unfortunately I cannot debug further for now. Once I see the plans or have a full reproducer, I will take another look.