
Comments (11)

jiacai2050 commented on September 2, 2024

Oh, like "should_panic" or "statement err", right?

Yes.

Maybe leave the reason part as just a comment?

A comment seems OK to me.

from sqlness.

jiacai2050 commented on September 2, 2024

before query execution

DataFusion supports the SET statement for this kind of job.

https://github.com/apache/arrow-datafusion/blob/master/datafusion/core/tests/sqllogictests/test_files/information_schema.slt#L30

As for the other two cases, they don't seem very useful in practice.

statement error Error during planning: SHOW TABLES is not supported unless information_schema is enabled
SHOW TABLES

Also, I find this kind of error declaration useful when our SQL files have a lot of cases and we need to check which SQL will throw an error.


waynexia commented on September 2, 2024

DataFusion supports the SET statement for this kind of job.

Yes, the parameter is meant to provide something like this. DataFusion does support this, as do MySQL and PostgreSQL. But I doubt this is standard SQL grammar and, in my opinion, there is no conflict in us (a test framework) supporting this as well. Consider these scenarios:

  • My database is distributed across A, B and C. I set data distribution rules for my table and want to verify them. I execute a distributed insert, then want to connect to A, B and C to run a non-distributed query. I want to know which query will be sent to which instance.
  • My database implementation simply does not support SET, but I want to change my configuration via APIs other than SQL, such as HTTP.

Those are two practical examples I came up with, and both rely on test context. The parameter is the way to provide context to each SQL.
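
To make the idea of per-SQL context concrete, here is a minimal sketch. It assumes context is written as comments directly above a statement; the `-- @key=value` form and the function name `parse_with_context` are invented for illustration and are not sqlness syntax.

```python
from typing import Dict, List, Tuple


def parse_with_context(sql_text: str) -> List[Tuple[Dict[str, str], str]]:
    """Pair each statement with the key=value context from preceding comments.

    Assumes one statement per line. The `-- @key=value` form is hypothetical.
    """
    cases = []
    context: Dict[str, str] = {}
    for line in sql_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("-- @") and "=" in stripped:
            key, value = stripped[len("-- @"):].split("=", 1)
            context[key.strip()] = value.strip()
        elif stripped and not stripped.startswith("--"):
            cases.append((context, stripped))
            context = {}  # context applies to the next statement only
    return cases


sample = """
insert into t values (1), (2), (3);
-- @instance=A
select count(*) from t;
-- @instance=B
select count(*) from t;
"""

# A runner could use the context to pick which connection receives the query.
for ctx, stmt in parse_with_context(sample):
    print(ctx.get("instance", "default"), "<-", stmt)
```

With something like this, the distributed-insert scenario becomes one insert with no context followed by the same query addressed to each instance in turn.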

As for the other two cases, they don't seem very useful in practice.

Post-processing like replace-result is widely used in MySQL's integration tests. It's very normal for a query to result in random values. What if you query random(), current_time(), or a system table for current memory consumption? I don't care about the actual value; I just want to make sure the queries work, or that the output matches some fixed pattern.
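
As a rough illustration of this kind of post-processing, the sketch below replaces unstable values with fixed placeholders before the text comparison. It assumes a regex-based approach; the rule list, the placeholder tokens and the name `normalize_result` are hypothetical and not taken from sqlness or MySQL.

```python
import re

# Hypothetical (pattern, placeholder) rules, applied in order.
RULES = [
    # ISO-like timestamps, e.g. 2023-01-02 03:04:05 or 2023-01-02T03:04:05.678
    (re.compile(r"\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}(\.\d+)?"), "<TIMESTAMP>"),
    # Floating-point values such as the output of random()
    (re.compile(r"\b\d+\.\d+\b"), "<FLOAT>"),
]


def normalize_result(text: str) -> str:
    """Replace unstable values so two runs produce identical text."""
    for pattern, placeholder in RULES:
        text = pattern.sub(placeholder, text)
    return text


raw = "random() = 0.7312, now = 2023-01-02 03:04:05"
print(normalize_result(raw))  # random() = <FLOAT>, now = <TIMESTAMP>
```

The query itself stays untouched; only the recorded output is rewritten, so the case still checks that the function runs and that its output has the expected shape.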


Also, I find this kind of error declaration useful when our SQL files have a lot of cases and we need to check which SQL will throw an error.

If I read it correctly, the error message will be present in the .result file next to the query that produced it?


jiacai2050 commented on September 2, 2024

My database is distributed across A, B and C. I set data distribution rules for my table and want to verify them. I execute a distributed insert, then want to connect to A, B and C to run a non-distributed query. I want to know which query will be sent to which instance.

This example makes sense to me. As for the other aspects, I think we can wait until some real-world issue arises.

The first principle I follow is to stick with SQL. If SQL can solve it, then we don't have to; it brings little value to this project IMO.

If I read it correctly, the error message will be present in the .result file next to the query that produced it?

I usually check the SQL file to see how many cases have been added, but I can't tell the bad cases from the good ones. I don't care what this SQL will output; I only want to know which SQL will throw an error.

Maybe we can add a special ERR: <reason> syntax to declare it (this may belong to your first proposal, parameter).

-- ERR: don't support XX type now
create table t(a XX);


waynexia commented on September 2, 2024

My database is distributed across A, B and C. I set data distribution rules for my table and want to verify them. I execute a distributed insert, then want to connect to A, B and C to run a non-distributed query. I want to know which query will be sent to which instance.

This example makes sense to me. As for the other aspects, I think we can wait until some real-world issue arises.

Nice, I'll start working on this one.

The first principle I follow is to stick with SQL. If SQL can solve it, then we don't have to; it brings little value to this project IMO.

I agree. But how do I test a random() function, or results that contain a timestamp? I think this is also a necessary part of covering SQL's functionality.

If I read it correctly, the error message will be present in the .result file next to the query that produced it?

I usually check the SQL file to see how many cases have been added, but I can't tell the bad cases from the good ones. I don't care what this SQL will output; I only want to know which SQL will throw an error.

Maybe we can add a special ERR: <reason> syntax to declare it (this may belong to your first proposal, parameter).

-- ERR: don't support XX type now
create table t(a XX);

That makes sense. We should preserve the comment in the result file. (Maybe a comment is enough? Is it necessary to add new syntax?)


jiacai2050 commented on September 2, 2024

I agree. But how do I test a random() function, or results that contain a timestamp? I think this is also a necessary part of covering SQL's functionality.

It's possible to remove the random value this way:

select count(random())

That makes sense. We should preserve the comment in the result file. (Maybe a comment is enough? Is it necessary to add new syntax?)

I prefer a hard requirement; the comment is optional.


waynexia commented on September 2, 2024

I agree. But how do I test a random() function, or results that contain a timestamp? I think this is also a necessary part of covering SQL's functionality.

It's possible to remove the random value this way:

select count(random())

Then I'd ask, as a user: why doesn't this framework support SQL like random(), and why does it force me to write redundant SQL?

And the second use case: how do I test a system table query? E.g., I expect to see the node info but want to ignore each node's memory usage because it changes over time. Do I have to project out that column? Then what if my query contains a timestamp column? I don't think this is good practice. "Fix it by SQL" should mean "I want to project some columns, and SQL does support projection", not "I don't want to project columns, but the framework forces me to". SQL query results are naturally unstable in most cases. That is the point we should stick to and add support for.

Requiring every SQL to be stable and reproducible just because we are a text-comparison-based framework doesn't look good 😥

That makes sense. We should preserve the comment in the result file. (Maybe a comment is enough? Is it necessary to add new syntax?)

I prefer a hard requirement; the comment is optional.

What do you mean by hard requirement? Isn't the original goal to make the cases readable?


jiacai2050 commented on September 2, 2024

Then I'd ask, as a user: why doesn't this framework support SQL like random(), and why does it force me to write redundant SQL?

I don't feel very strongly about this. I just think we should keep results as 'static' as possible. The replace-result interceptor does have its value, so let's wait and see your implementation.

What do you mean by hard requirement? Isn't the original goal to make the cases readable?

A hard requirement means we must add a -- ERR: xx prefix for bad SQL; otherwise the next SQL should not throw an error.

I don't think it conflicts with readability; instead, it makes it clearer what the SQL will do.
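
The hard requirement described above could be checked roughly as follows. This is only a sketch: it assumes one statement per line, that a `-- ERR:` comment applies to the next statement only, and that a hypothetical `run` callback reports whether a statement failed. None of these names come from sqlness.

```python
from typing import Callable, List, Tuple

ERR_PREFIX = "-- ERR:"  # marker proposed in the thread; parsing rules are assumed


def parse_cases(sql_text: str) -> List[Tuple[str, bool]]:
    """Split SQL text into (statement, expect_error) pairs."""
    cases = []
    expect_error = False
    for line in sql_text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith(ERR_PREFIX):
            expect_error = True
        elif stripped.startswith("--"):
            continue  # ordinary comment, no effect on expectations
        else:
            cases.append((stripped, expect_error))
            expect_error = False
    return cases


def check(sql_text: str, run: Callable[[str], bool]) -> List[str]:
    """Return statements whose outcome disagrees with their declaration.

    `run` is a hypothetical callback returning True when the statement fails.
    """
    return [stmt for stmt, expect in parse_cases(sql_text) if run(stmt) != expect]


sample = """
-- ERR: don't support XX type now
create table t(a XX);
create table u(a int);
"""

def fake_run(stmt: str) -> bool:
    # Stand-in for a real database call: only the XX type fails.
    return "XX" in stmt

print(check(sample, fake_run))  # []
```

An empty list means every declared failure failed and every undeclared statement succeeded; an undeclared failure or a declared failure that succeeds would show up in the list.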


waynexia commented on September 2, 2024

A hard requirement means we must add a -- ERR: xx prefix for bad SQL; otherwise the next SQL should not throw an error.

I don't think it conflicts with readability; instead, it makes it clearer what the SQL will do.

Got it. Isn't this already covered? If a query returns an error, its result would be something like ERR: xxx (depending on how the user formats their result). And re-running it should also return that error.


jiacai2050 commented on September 2, 2024

Got it. Isn't this already covered?

Not exactly. We have to check the result file to ensure this; what I want is to declare it in the original SQL.

statement error DataFusion error: Error during planning: table 'datafusion.information_schema.tables' not found
SELECT * from information_schema.tables

Like this in sqllogictest.


waynexia commented on September 2, 2024

Oh, like "should_panic" or "statement err", right? We can give it a try. But I'm not sure we should also take xxx into consideration, as it duplicates the follow-up full-text match. Maybe leave the reason part as just a comment?

