Describe This Problem "Interceptor" is a hook point in sqlness, wh

Some initial interceptor implementations about sqlness HOT 11 CLOSED

waynexia commented on September 2, 2024

Some initial interceptor implementations

from sqlness.

Comments (11)

jiacai2050 commented on September 2, 2024

Oh, like "should_panic" or "statement err", right?

Yes.

Maybe leave the reason part as only a comment?

Comment seems OK to me.

jiacai2050 commented on September 2, 2024

before query execution

In DataFusion, the SET statement is supported for this kind of job.

https://github.com/apache/arrow-datafusion/blob/master/datafusion/core/tests/sqllogictests/test_files/information_schema.slt#L30

As for the other two cases, they don't seem very useful in practice.

statement error Error during planning: SHOW TABLES is not supported unless information_schema is enabled
SHOW TABLES

I also find this error declaration useful when our SQL files have a lot of cases and we need to check which SQL statements will throw an error.

waynexia commented on September 2, 2024

In DataFusion, the SET statement is supported for this kind of job.

Yes, the parameter is meant to provide something like this. DataFusion does support this, as do MySQL and PostgreSQL. But I doubt this is standard SQL grammar and, in my opinion, there is no conflict in us (a test framework) supporting it as well. Consider these scenarios:

My database is distributed on A, B and C, I set data distribution rules for my table and I want to verify it. I execute a distributed insert, and want to connect to A, B, and C to do a non-distributed query. I want to know what query will be sent to which instance.
My database implementation simply does not support SET, but I want to change my configuration via APIs other than SQL, such as HTTP.

Those are two practical examples I came up with, and both rely on test context. Parameters are the way to provide context to each SQL statement.
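
To make the idea of per-query context concrete, here is a minimal sketch in Python. It assumes a hypothetical `-- PARAM: key=value` comment marker and a hypothetical `instance` key; neither is sqlness syntax, and the thread does not settle on one. Each parameter line attaches context to the next query only.

```python
from typing import Dict, List, Tuple

# Hypothetical marker for illustration only; not actual sqlness syntax.
PARAM_PREFIX = "-- PARAM:"


def parse_cases(sql_text: str) -> List[Tuple[Dict[str, str], str]]:
    """Split a test file into (context, query) pairs.

    A line starting with PARAM_PREFIX sets key=value context for the next
    query only. A query ends at the first line that ends with ';'.
    Ordinary comments are not handled in this sketch.
    """
    cases: List[Tuple[Dict[str, str], str]] = []
    context: Dict[str, str] = {}
    query_lines: List[str] = []
    for raw in sql_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(PARAM_PREFIX):
            key, _, value = line[len(PARAM_PREFIX):].partition("=")
            context[key.strip()] = value.strip()
            continue
        query_lines.append(line)
        if line.endswith(";"):
            cases.append((context, " ".join(query_lines)))
            context, query_lines = {}, []
    return cases


example = """
INSERT INTO t VALUES (1), (2), (3);
-- PARAM: instance=A
SELECT count(*) FROM t;
-- PARAM: instance=B
SELECT count(*) FROM t;
"""

for ctx, query in parse_cases(example):
    # A real runner would pick a connection based on ctx here.
    print(ctx.get("instance", "default"), "<-", query)
```

With this shape, the distributed insert goes through the default connection, while the two follow-up queries carry the instance they should be sent to.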

As for the other two cases, they don't seem very useful in practice.

Post-processing like replace-result is widely used in MySQL's integration tests. It's very normal for a query to return random values. What if you query random(), current_time(), or a system table for current memory consumption? I don't care about the actual value; I just want to make sure they work, or that they match some fixed pattern.
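
A replace-result step of this kind could look roughly like the following Python sketch. The patterns and the `<TIMESTAMP>` / `<FLOAT>` placeholders are assumptions chosen for illustration, not anything defined by sqlness or MySQL's test tooling. The raw output is normalized before it is compared with the stored result, so two runs with different values still match.

```python
import re
from typing import List, Tuple

# (pattern, placeholder) pairs; both are illustrative assumptions.
RULES: List[Tuple[str, str]] = [
    (r"\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}(\.\d+)?", "<TIMESTAMP>"),
    (r"\b0\.\d+\b", "<FLOAT>"),
]


def replace_result(output: str, rules: List[Tuple[str, str]] = RULES) -> str:
    """Replace every match of each pattern with its fixed placeholder."""
    for pattern, placeholder in rules:
        output = re.sub(pattern, placeholder, output)
    return output


run_1 = "now: 2024-09-02 10:15:30.123 | random: 0.48211"
run_2 = "now: 2024-09-03 08:00:01 | random: 0.90317"

# Different raw values, identical text after post-processing.
assert replace_result(run_1) == replace_result(run_2)
print(replace_result(run_1))  # now: <TIMESTAMP> | random: <FLOAT>
```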

I also find this error declaration useful when our SQL files have a lot of cases and we need to check which SQL statements will throw an error.

If I read it correctly, the error message will be present in the .result file next to the query that produced it?

jiacai2050 commented on September 2, 2024

My database is distributed on A, B and C, I set data distribution rules for my table and I want to verify it. I execute a distributed insert, and want to connect to A, B, and C to do a non-distributed query. I want to know what query will be sent to which instance.

This example makes sense to me. As for the other aspects, I think we can wait until some real-world issue arises.

The first principle I follow is to stick with SQL: if SQL can fix it, then we don't have to; it brings little value to this project IMO.

If I read it correctly, the error message will be present in the .result file next to the query that produced it?

I usually check the SQL file to see how many cases have been added, but I can't tell the bad cases from the good ones. I don't care what this SQL will output; I only want to know which SQL will throw an error.

Maybe we can add a special ERR: <reason> syntax to declare it (this may belong to your first proposal, parameter).

-- ERR: don't support XX type now
create table t(a XX);

waynexia commented on September 2, 2024

My database is distributed on A, B and C, I set data distribution rules for my table and I want to verify it. I execute a distributed insert, and want to connect to A, B, and C to do a non-distributed query. I want to know what query will be sent to which instance.

This example makes sense to me. As for the other aspects, I think we can wait until some real-world issue arises.

Nice, I'll start working on this one.

The first principle I follow is to stick with SQL: if SQL can fix it, then we don't have to; it brings little value to this project IMO.

I agree. But how do I test a random() function, or results that contain a timestamp? I think this is also a necessary part of covering SQL's functionality.

If I read it correctly, the error message will be present in the .result file next to the query that produced it?

I usually check the SQL file to see how many cases have been added, but I can't tell the bad cases from the good ones. I don't care what this SQL will output; I only want to know which SQL will throw an error.

Maybe we can add a special ERR: <reason> syntax to declare it (this may belong to your first proposal, parameter).
-- ERR: don't support XX type now
create table t(a XX);

That makes sense. We should preserve the comment in the result file. (Maybe a comment is enough? Is it necessary to add new syntax?)

jiacai2050 commented on September 2, 2024

I agree. But how do I test a random() function, or results that contain a timestamp? I think this is also a necessary part of covering SQL's functionality.

It's possible to remove the random value this way:

select count(random())

That makes sense. We should preserve the comment in the result file. (Maybe a comment is enough? Is it necessary to add new syntax?)

I prefer a hard requirement; the comment is optional.

waynexia commented on September 2, 2024

I agree. But how do I test a random() function, or results that contain a timestamp? I think this is also a necessary part of covering SQL's functionality.

It's possible to remove the random value this way:
select count(random())

Then I'd ask, as a user: why doesn't this framework support SQL like random(), and why does it force me to write redundant SQL?

And the second use case: how do I test a system table query? E.g., I expect to see the node info, but want to ignore each node's memory usage because it changes from time to time. Do I have to project out that column? Then what if my query contains a timestamp column? I don't think this is good practice. "Fix it by SQL" should mean "I want to project some columns, and SQL does support projection", not "I don't want to project columns, but the framework forces me to". SQL query results are naturally unstable in most cases. That is the point we should stick to and add support for.

Requiring every SQL statement to be stable and reproducible because we are a text-comparison-based framework doesn't look good 😥
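
For the system table case, one alternative to projecting columns out in SQL is to mask unstable columns after the query runs. The Python sketch below assumes results arrive as a list of row dictionaries and uses made-up column names (`node`, `memory_bytes`); it is only an illustration of the idea, not how sqlness represents results.

```python
from typing import Any, Dict, List, Set

Row = Dict[str, Any]


def mask_columns(rows: List[Row], unstable: Set[str],
                 placeholder: str = "<IGNORED>") -> List[Row]:
    """Return copies of the rows with unstable columns replaced."""
    return [
        {col: (placeholder if col in unstable else val)
         for col, val in row.items()}
        for row in rows
    ]


# Hypothetical system-table output from two separate runs.
run_1 = [{"node": "A", "memory_bytes": 104857}, {"node": "B", "memory_bytes": 99120}]
run_2 = [{"node": "A", "memory_bytes": 230011}, {"node": "B", "memory_bytes": 87004}]

# The query itself stays unchanged; only the comparison ignores memory usage.
assert mask_columns(run_1, {"memory_bytes"}) == mask_columns(run_2, {"memory_bytes"})
```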

That makes sense. We should preserve the comment in the result file. (Maybe a comment is enough? Is it necessary to add new syntax?)

I prefer a hard requirement; the comment is optional.

What do you mean by hard-requirement? Isn't the original goal to make the case readable?

jiacai2050 commented on September 2, 2024

Then I'd ask, as a user: why doesn't this framework support SQL like random(), and why does it force me to write redundant SQL?

I don't feel strongly about this. I just think we should keep the result as 'static' as possible. The replace-result interceptor does have its value, and I'll wait to see your implementation.

What do you mean by hard-requirement? Isn't the original goal to make the case readable?

Hard requirement means we must add a -- ERR: xx prefix for bad SQL; otherwise the next SQL should not throw an error.

I think it doesn't conflict with readability; instead it makes it clearer what the SQL will do.
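
The hard requirement described above can be sketched as a small check in Python. It assumes one statement per line and takes a stand-in `fails` callback in place of a real database; both are simplifications for illustration. A statement is a violation when it fails without a preceding `-- ERR:` line, or succeeds despite one.

```python
from typing import Callable, List

ERR_PREFIX = "-- ERR:"


def find_violations(sql_text: str, fails: Callable[[str], bool]) -> List[str]:
    """Return statements whose outcome disagrees with their declaration."""
    violations: List[str] = []
    expect_error = False
    for raw in sql_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(ERR_PREFIX):
            expect_error = True  # applies to the next statement only
            continue
        if line.startswith("--"):
            continue  # ordinary comment, no effect
        if fails(line) != expect_error:
            violations.append(line)
        expect_error = False
    return violations


sql_file = """
-- ERR: don't support XX type now
create table t(a XX);
create table u(a INT);
create table v(b XX);
"""


def fake_fails(statement: str) -> bool:
    # Stand-in executor: pretend any use of the XX type is rejected.
    return "XX" in statement


print(find_violations(sql_file, fake_fails))  # ['create table v(b XX);']
```

Only the last statement is reported: it fails but carries no declaration, which is exactly the case the prefix is meant to surface when reading the SQL file alone.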

waynexia commented on September 2, 2024

Hard requirement means we must add a -- ERR: xx prefix for bad SQL; otherwise the next SQL should not throw an error.

I think it doesn't conflict with readability; instead it makes it clearer what the SQL will do.

Got it. Isn't this already covered now? If a query returns an error, its result would be something like ERR: xxx (depending on how the user formats their result). And re-running it should also return that error.

jiacai2050 commented on September 2, 2024

Got it. Isn't this already covered now?

Not exactly. We have to check the result file to ensure this; what I want is to declare it in the original SQL.

statement error DataFusion error: Error during planning: table 'datafusion.information_schema.tables' not found
SELECT * from information_schema.tables

Like this in sqllogictest.

waynexia commented on September 2, 2024

Oh, like "should_panic" or "statement err", right? We can give it a try. But I'm not sure if we should also take xxx into consideration, as it duplicates the follow-up full-text match. Maybe leave the reason part as only a comment?
