Describe the bug When casting a string to a timestamp with time zo

binary_op should be handled by sql_expr_to_logical_expr.. This was likely caused by a bug in DataFusion's code (apache/arrow-datafusion)

jayzhan211 commented on September 13, 2024

Yes, I think we should fix sqlparser so that ('2019-03-27T22:00:00.000Z'::timestamp at time zone 'Europe/Brussels') is parsed as the right-hand side of the binary operation.

from arrow-datafusion.

alamb commented on September 13, 2024

Thank you for the report -- this definitely looks like a bug to me

jayzhan211 commented on September 13, 2024

An example to reproduce.

statement ok
create table t1 as values (
    date_bin(interval '1 hour', '2022-08-03 14:38:50Z' at time zone 'Europe/Brussels'), 
    date_bin(interval '1 hour', '2022-08-03 14:38:50Z' at time zone 'Europe/Brussels'));

query error
select * from t1 where column1 >= '2019-03-27T22:00:00.000Z'::timestamp at time zone 'Europe/Brussels';
----
DataFusion error: Internal error: binary_op should be handled by sql_expr_to_logical_expr..
This was likely caused by a bug in DataFusion's code and we would welcome that you file an bug report in our issue tracker

This is the SQL expression that reaches sql_expr_to_logical_expr:

sql: AtTimeZone { timestamp: BinaryOp { left: Identifier(Ident { value: "column1", quote_style: None }), op: GtEq, right: Cast { expr: Value(SingleQuotedString("2019-03-27T22:00:00.000Z")), data_type: Timestamp(None, None), format: None } }, time_zone: "Europe/Brussels" }

We need to support AtTimeZone, and maybe timezone casting 🤔 ?

Related code is in

datafusion/datafusion/sql/src/expr/mod.rs

Lines 69 to 98 in 286eb34

    
    StackEntry::SQLExpr(sql_expr) => {
        match *sql_expr {
            SQLExpr::BinaryOp { left, op, right } => {
                // Note the order that we push the entries to the stack
                // is important. We want to visit the left node first.
                let op = self.parse_sql_binary_op(op)?;
                stack.push(StackEntry::Operator(op));
                stack.push(StackEntry::SQLExpr(right));
                stack.push(StackEntry::SQLExpr(left));
            }
            SQLExpr::JsonAccess {
                left,
                operator,
                right,
            } => {
                let op = self.parse_sql_json_access(operator)?;
                stack.push(StackEntry::Operator(op));
                stack.push(StackEntry::SQLExpr(right));
                stack.push(StackEntry::SQLExpr(left));
            }
            _ => {
                let expr = self.sql_expr_to_logical_expr_internal(
                    *sql_expr,
                    schema,
                    planner_context,
                )?;
                eval_stack.push(expr);
            }
        }
    }
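
To make the failure mode concrete, here is a minimal, self-contained model of the stack-based loop above. It is only a sketch, not DataFusion's code: the SqlExpr enum, to_logical, to_logical_internal and the string output are all hypothetical stand-ins, and the cast is omitted for brevity. It also assumes that the internal function rejects BinaryOp (as the error message suggests) and that the AtTimeZone arm recurses through the internal function rather than through the loop.

```rust
// Hypothetical stand-in for sqlparser's expression AST (not the real types).
#[derive(Debug)]
enum SqlExpr {
    Ident(String),
    Literal(String),
    BinaryOp { left: Box<SqlExpr>, op: String, right: Box<SqlExpr> },
    AtTimeZone { timestamp: Box<SqlExpr>, time_zone: String },
}

enum StackEntry {
    SqlExpr(Box<SqlExpr>),
    Operator(String),
}

// Handles every node except BinaryOp, which the outer loop is expected to unroll.
fn to_logical_internal(expr: SqlExpr) -> Result<String, String> {
    match expr {
        SqlExpr::Ident(name) => Ok(name),
        SqlExpr::Literal(value) => Ok(format!("'{value}'")),
        SqlExpr::AtTimeZone { timestamp, time_zone } => {
            // Assumption: the child is planned by the internal function,
            // so a BinaryOp child never goes through the unrolling loop.
            let inner = to_logical_internal(*timestamp)?;
            Ok(format!("({inner} AT TIME ZONE '{time_zone}')"))
        }
        SqlExpr::BinaryOp { .. } => {
            Err("binary_op should be handled by sql_expr_to_logical_expr.".to_string())
        }
    }
}

// Iterative entry point: unrolls BinaryOp chains that start at the root.
fn to_logical(expr: SqlExpr) -> Result<String, String> {
    let mut stack = vec![StackEntry::SqlExpr(Box::new(expr))];
    let mut eval_stack: Vec<String> = Vec::new();
    while let Some(entry) = stack.pop() {
        match entry {
            StackEntry::SqlExpr(sql_expr) => match *sql_expr {
                SqlExpr::BinaryOp { left, op, right } => {
                    // Pushed last, popped first: the left operand is visited first.
                    stack.push(StackEntry::Operator(op));
                    stack.push(StackEntry::SqlExpr(right));
                    stack.push(StackEntry::SqlExpr(left));
                }
                other => eval_stack.push(to_logical_internal(other)?),
            },
            StackEntry::Operator(op) => {
                let right = eval_stack.pop().ok_or("missing right operand")?;
                let left = eval_stack.pop().ok_or("missing left operand")?;
                eval_stack.push(format!("({left} {op} {right})"));
            }
        }
    }
    eval_stack.pop().ok_or_else(|| "empty expression".to_string())
}

fn ident(name: &str) -> Box<SqlExpr> {
    Box::new(SqlExpr::Ident(name.to_string()))
}

fn literal(value: &str) -> Box<SqlExpr> {
    Box::new(SqlExpr::Literal(value.to_string()))
}

// Tree shape reported for the query without parentheses.
fn without_parens() -> SqlExpr {
    SqlExpr::AtTimeZone {
        timestamp: Box::new(SqlExpr::BinaryOp {
            left: ident("column1"),
            op: ">=".to_string(),
            right: literal("2019-03-27T22:00:00.000Z"),
        }),
        time_zone: "Europe/Brussels".to_string(),
    }
}

// Tree shape reported for the query with parentheses.
fn with_parens() -> SqlExpr {
    SqlExpr::BinaryOp {
        left: ident("column1"),
        op: ">=".to_string(),
        right: Box::new(SqlExpr::AtTimeZone {
            timestamp: literal("2019-03-27T22:00:00.000Z"),
            time_zone: "Europe/Brussels".to_string(),
        }),
    }
}

fn main() {
    println!("{:?}", to_logical(without_parens()));
    println!("{:?}", to_logical(with_parens()));
}
```

In this model the first call returns an Err carrying the "binary_op should be handled" message, because the BinaryOp is hidden under an AtTimeZone root and so never reaches the unrolling arm. The second call succeeds because the BinaryOp sits at the root.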

Abdullahsab3 commented on September 13, 2024

Thanks for the deep dive @jayzhan211! Could it be that the expression is not getting correctly parsed in this instance? Executing this query for example:

select * from t1 where column1 >= ('2019-03-27T22:00:00.000Z'::timestamp at time zone 'Europe/Brussels');

seems to work without any issues.

jayzhan211 commented on September 13, 2024

Yes, you are right!

This is the one with ():

sql: BinaryOp { left: Identifier(Ident { value: "column1", quote_style: None }), op: GtEq, right: Nested(AtTimeZone { timestamp: Cast { expr: Value(SingleQuotedString("2019-03-27T22:00:00.000Z")), data_type: Timestamp(None, None), format: None }, time_zone: "Europe/Brussels" }) }
sql: AtTimeZone { timestamp: Cast { expr: Value(SingleQuotedString("2019-03-27T22:00:00.000Z")), data_type: Timestamp(None, None), format: None }, time_zone: "Europe/Brussels" }

This is the one without ():

sql: AtTimeZone { timestamp: BinaryOp { left: Identifier(Ident { value: "column1", quote_style: None }), op: GtEq, right: Cast { expr: Value(SingleQuotedString("2019-03-27T22:00:00.000Z")), data_type: Timestamp(None, None), format: None } }, time_zone: "Europe/Brussels" }

I think it is parsed as
(column1 >= '2019-03-27T22:00:00.000Z'::timestamp) at time zone 'Europe/Brussels'
whereas ideally it should be parsed as
column1 >= ('2019-03-27T22:00:00.000Z'::timestamp at time zone 'Europe/Brussels')
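
The two groupings can be reproduced with a toy precedence-climbing parser. This is a sketch under stated assumptions, not sqlparser's implementation: the Token and Parser types, the pre-combined AtTimeZone token, and the numeric precedence values are all made up for illustration. The only thing it is meant to show is that an AT TIME ZONE operator which does not bind tighter than >= ends up wrapping the whole comparison.

```rust
// Hypothetical, pre-tokenized input; a real tokenizer is out of scope here.
#[derive(Clone, Copy)]
enum Token<'a> {
    Ident(&'a str),
    Str(&'a str),
    GtEq,
    Cast(&'a str),       // `::type`
    AtTimeZone(&'a str), // `AT TIME ZONE 'zone'`, combined into one token for brevity
}

// Illustrative precedence values (assumptions, not sqlparser's constants).
const GT_EQ_PREC: u8 = 20;
const CAST_PREC: u8 = 50;

struct Parser<'a> {
    tokens: Vec<Token<'a>>,
    pos: usize,
    at_tz_prec: u8, // precedence given to AT TIME ZONE in this run
}

impl<'a> Parser<'a> {
    // Precedence of the next operator token, or 0 if there is none.
    fn next_prec(&self) -> u8 {
        match self.tokens.get(self.pos) {
            Some(Token::GtEq) => GT_EQ_PREC,
            Some(Token::Cast(_)) => CAST_PREC,
            Some(Token::AtTimeZone(_)) => self.at_tz_prec,
            _ => 0,
        }
    }

    // Precedence climbing: keep absorbing operators that bind tighter than `min_prec`.
    // The result is a fully parenthesized string that makes the grouping visible.
    fn parse_expr(&mut self, min_prec: u8) -> String {
        let mut left = match self.tokens.get(self.pos).copied() {
            Some(Token::Ident(name)) => name.to_string(),
            Some(Token::Str(s)) => format!("'{s}'"),
            _ => panic!("expected an operand"),
        };
        self.pos += 1;
        loop {
            let prec = self.next_prec();
            if prec <= min_prec {
                break;
            }
            let tok = self.tokens[self.pos];
            self.pos += 1;
            left = match tok {
                Token::GtEq => {
                    let right = self.parse_expr(prec);
                    format!("({left} >= {right})")
                }
                Token::Cast(ty) => format!("{left}::{ty}"),
                Token::AtTimeZone(tz) => format!("({left} AT TIME ZONE '{tz}')"),
                _ => unreachable!("operands have precedence 0"),
            };
        }
        left
    }
}

// Parses `column1 >= '...'::timestamp at time zone 'Europe/Brussels'`
// with the given precedence for AT TIME ZONE.
fn parse_with(at_tz_prec: u8) -> String {
    let tokens = vec![
        Token::Ident("column1"),
        Token::GtEq,
        Token::Str("2019-03-27T22:00:00.000Z"),
        Token::Cast("timestamp"),
        Token::AtTimeZone("Europe/Brussels"),
    ];
    Parser { tokens, pos: 0, at_tz_prec }.parse_expr(0)
}

fn main() {
    // Same precedence as `>=`: AT TIME ZONE wraps the comparison.
    println!("{}", parse_with(20));
    // Higher precedence than `>=`: AT TIME ZONE attaches to the cast literal.
    println!("{}", parse_with(45));
}
```

With equal precedence the right-hand side of >= stops before AT TIME ZONE, so the outer loop applies it to the finished comparison. Once AT TIME ZONE binds tighter than >=, it is absorbed while parsing the right operand, which gives the grouping that the explicit parentheses produce.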

dmitrybugakov commented on September 13, 2024

@jayzhan211

I understand that the issue runs a bit deeper as we employ sqlparser for converting SQL queries into statements.

"SELECT c FROM t WHERE c >= '2019-03-27T22:00:00.000Z'::timestamp at time zone 'Europe/Brussels'"

Processing stmts from tokenizer:

[Statement(Query(Query { with: None, body: Select(Select { distinct: None, top: None, projection: [UnnamedExpr(Identifier(Ident { value: "c", quote_style: None }))], into: None, from: [TableWithJoins { relation: Table { name: ObjectName([Ident { value: "t", quote_style: None }]), alias: None, args: None, with_hints: [], version: None, partitions: [] }, joins: [] }], lateral_views: [], selection: Some(AtTimeZone { timestamp: BinaryOp { left: Identifier(Ident { value: "c", quote_style: None }), op: GtEq, right: Cast { expr: Value(SingleQuotedString("2019-03-27T22:00:00.000Z")), data_type: Timestamp(None, None), format: None } }, time_zone: "Europe/Brussels" }), group_by: Expressions([]), cluster_by: [], distribute_by: [], sort_by: [], having: None, named_window: [], qualify: None, value_table_mode: None }), order_by: [], limit: None, limit_by: [], offset: None, fetch: None, locks: [], for_clause: None }))]

datafusion/datafusion/sql/src/parser.rs

Line 315 in dac2a7e

pub fn parse_sql_with_dialect(
