Code Monkey home page Code Monkey logo

Comments (6)

jayzhan211 avatar jayzhan211 commented on September 13, 2024 1

@jayzhan211

I understand that the issue runs a bit deeper as we employ sqlparser for converting SQL queries into statements.

"SELECT c FROM t WHERE c >= '2019-03-27T22:00:00.000Z'::timestamp at time zone 'Europe/Brussels'"

Processing stmts from tokenizer:

[Statement(Query(Query { with: None, body: Select(Select { distinct: None, top: None, projection: [UnnamedExpr(Identifier(Ident { value: "c", quote_style: None }))], into: None, from: [TableWithJoins { relation: Table { name: ObjectName([Ident { value: "t", quote_style: None }]), alias: None, args: None, with_hints: [], version: None, partitions: [] }, joins: [] }], lateral_views: [], selection: Some(AtTimeZone { timestamp: BinaryOp { left: Identifier(Ident { value: "c", quote_style: None }), op: GtEq, right: Cast { expr: Value(SingleQuotedString("2019-03-27T22:00:00.000Z")), data_type: Timestamp(None, None), format: None } }, time_zone: "Europe/Brussels" }), group_by: Expressions([]), cluster_by: [], distribute_by: [], sort_by: [], having: None, named_window: [], qualify: None, value_table_mode: None }), order_by: [], limit: None, limit_by: [], offset: None, fetch: None, locks: [], for_clause: None }))]

pub fn parse_sql_with_dialect(

yes, I think we should fix sqlparser to get ('2019-03-27T22:00:00.000Z'::timestamp at time zone 'Europe/Brussels') as the right hand side of binary operation

from arrow-datafusion.

alamb avatar alamb commented on September 13, 2024

Thank you for the report -- this definitely looks like a bug to me

from arrow-datafusion.

jayzhan211 avatar jayzhan211 commented on September 13, 2024

An example to reproduce.

statement ok
create table t1 as values (
    date_bin(interval '1 hour', '2022-08-03 14:38:50Z' at time zone 'Europe/Brussels'), 
    date_bin(interval '1 hour', '2022-08-03 14:38:50Z' at time zone 'Europe/Brussels'));

query error
select * from t1 where column1 >= '2019-03-27T22:00:00.000Z'::timestamp at time zone 'Europe/Brussels';
----
DataFusion error: Internal error: binary_op should be handled by sql_expr_to_logical_expr..
This was likely caused by a bug in DataFusion's code and we would welcome that you file an bug report in our issue tracker

sql we got in sql_expr_to_logical_expr

sql: AtTimeZone { timestamp: BinaryOp { left: Identifier(Ident { value: "column1", quote_style: None }), op: GtEq, right: Cast { expr: Value(SingleQuotedString("2019-03-27T22:00:00.000Z")), data_type: Timestamp(None, None), format: None } }, time_zone: "Europe/Brussels" }

We need to support AtTimeZone, and maybe timezone casting 🤔 ?

Related code is in

StackEntry::SQLExpr(sql_expr) => {
match *sql_expr {
SQLExpr::BinaryOp { left, op, right } => {
// Note the order that we push the entries to the stack
// is important. We want to visit the left node first.
let op = self.parse_sql_binary_op(op)?;
stack.push(StackEntry::Operator(op));
stack.push(StackEntry::SQLExpr(right));
stack.push(StackEntry::SQLExpr(left));
}
SQLExpr::JsonAccess {
left,
operator,
right,
} => {
let op = self.parse_sql_json_access(operator)?;
stack.push(StackEntry::Operator(op));
stack.push(StackEntry::SQLExpr(right));
stack.push(StackEntry::SQLExpr(left));
}
_ => {
let expr = self.sql_expr_to_logical_expr_internal(
*sql_expr,
schema,
planner_context,
)?;
eval_stack.push(expr);
}
}
}

from arrow-datafusion.

Abdullahsab3 avatar Abdullahsab3 commented on September 13, 2024

Thanks for the deep dive @jayzhan211! Could it be that the expression is not getting correctly parsed in this instance? Executing this query for example:

select * from t1 where column1 >= ('2019-03-27T22:00:00.000Z'::timestamp at time zone 'Europe/Brussels');

seems to be working fine with no issues

from arrow-datafusion.

jayzhan211 avatar jayzhan211 commented on September 13, 2024

Yes, you are right!

This is one with ()

sql: BinaryOp { left: Identifier(Ident { value: "column1", quote_style: None }), op: GtEq, right: Nested(AtTimeZone { timestamp: Cast { expr: Value(SingleQuotedString("2019-03-27T22:00:00.000Z")), data_type: Timestamp(None, None), format: None }, time_zone: "Europe/Brussels" }) }
sql: AtTimeZone { timestamp: Cast { expr: Value(SingleQuotedString("2019-03-27T22:00:00.000Z")), data_type: Timestamp(None, None), format: None }, time_zone: "Europe/Brussels" }

This one is without ()

sql: AtTimeZone { timestamp: BinaryOp { left: Identifier(Ident { value: "column1", quote_style: None }), op: GtEq, right: Cast { expr: Value(SingleQuotedString("2019-03-27T22:00:00.000Z")), data_type: Timestamp(None, None), format: None } }, time_zone: "Europe/Brussels" }

I think it is parsed as
(column1 >= '2019-03-27T22:00:00.000Z'::timestamp) at time zone 'Europe/Brussels'
which ideally should be parsed as
column1 >= ('2019-03-27T22:00:00.000Z'::timestamp at time zone 'Europe/Brussels')

from arrow-datafusion.

dmitrybugakov avatar dmitrybugakov commented on September 13, 2024

@jayzhan211

I understand that the issue runs a bit deeper as we employ sqlparser for converting SQL queries into statements.

"SELECT c FROM t WHERE c >= '2019-03-27T22:00:00.000Z'::timestamp at time zone 'Europe/Brussels'"

Processing stmts from tokenizer:

[Statement(Query(Query { with: None, body: Select(Select { distinct: None, top: None, projection: [UnnamedExpr(Identifier(Ident { value: "c", quote_style: None }))], into: None, from: [TableWithJoins { relation: Table { name: ObjectName([Ident { value: "t", quote_style: None }]), alias: None, args: None, with_hints: [], version: None, partitions: [] }, joins: [] }], lateral_views: [], selection: Some(AtTimeZone { timestamp: BinaryOp { left: Identifier(Ident { value: "c", quote_style: None }), op: GtEq, right: Cast { expr: Value(SingleQuotedString("2019-03-27T22:00:00.000Z")), data_type: Timestamp(None, None), format: None } }, time_zone: "Europe/Brussels" }), group_by: Expressions([]), cluster_by: [], distribute_by: [], sort_by: [], having: None, named_window: [], qualify: None, value_table_mode: None }), order_by: [], limit: None, limit_by: [], offset: None, fetch: None, locks: [], for_clause: None }))]

pub fn parse_sql_with_dialect(

from arrow-datafusion.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.