
dts-sql's People

Contributors

mohammadrezapourreza


dts-sql's Issues

Code of evaluation needed

Dear author,
I evaluated your results (results/deepseek_spider_validation_set/Predicted.txt) with my own execution accuracy evaluation code, but the result I get (82.7) differs from the figure reported in the paper (85.5). I wonder whether there is an error in my evaluation. Could you please publish your evaluation code?

My predicted exec accuracy:
                     easy     medium   hard     extra    all
count                248      446      174      166      1034
=====================   EXECUTION ACCURACY   =====================
execution            0.927    0.901    0.741    0.566    0.827
Thank you!
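
For anyone comparing numbers, here is a minimal sketch of one common way to compute execution accuracy: run the predicted and gold queries against the same SQLite database and compare the returned rows as multisets. It is not the authors' evaluation code. The function names, the toy `posts` table, and the single shared connection are assumptions for illustration; Spider uses a separate database per example.

```python
import sqlite3
from collections import Counter


def run_query(conn, sql):
    """Return the query's rows as a multiset, or None if execution fails."""
    try:
        return Counter(conn.execute(sql).fetchall())
    except sqlite3.Error:
        return None


def execution_match(conn, predicted_sql, gold_sql):
    """True when both queries run and return the same rows, ignoring row order."""
    gold = run_query(conn, gold_sql)
    pred = run_query(conn, predicted_sql)
    return gold is not None and pred is not None and gold == pred


def execution_accuracy(conn, pairs):
    """Fraction of (predicted, gold) pairs whose results match."""
    if not pairs:
        return 0.0
    return sum(execution_match(conn, p, g) for p, g in pairs) / len(pairs)


# Toy database, purely for illustration (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (Id INTEGER, Score INTEGER)")
conn.executemany("INSERT INTO posts VALUES (?, ?)", [(1, 70), (2, 50), (3, 90)])

pairs = [
    # Different SQL text, same result set: counts as a match.
    ("SELECT Id FROM posts WHERE Score > 60", "SELECT Id FROM posts WHERE Score >= 61"),
    # Prediction returns an extra row: counts as a miss.
    ("SELECT Id FROM posts", "SELECT Id FROM posts WHERE Score > 60"),
]
print(execution_accuracy(conn, pairs))
```

Evaluation scripts can differ in how they treat row order, duplicate rows, column order, and queries that fail to execute, so two scripts may report different scores on the same predictions.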

Fine-tuned model responses do not follow the fine-tuning template: they look like SQL generation, not schema linking

I used "deepseek-ai/deepseek-coder-7b-instruct-v1.5" as the base model and fine-tuned it (SFT) on the BIRD dataset with your prompt template.

When I run inference with the fine-tuned model, the responses are as shown below.

The problem is that the responses do not follow the fine-tuning template: they look like SQL generation output, not schema linking output.

When I use your published schema link model, the inference results do follow the prompt template.

I cannot work out why your model and my own fine-tuned model behave so differently.

The SQL query to list out all post ID with score more than 60 and list out all the user ID that created these posts would be: ```sql SELECT posts.Id, users.Id FROM posts JOIN users ON posts.OwnerUserId = users.Id WHERE posts.Score > 60
postHistory, posts
============================
To answer this question, we need to look at the `posts` and `postLinks` tables respectively. ```sql SELECT Score FROM posts WHERE Id = 395
posts, postLinks
============================
To answer this question, we need to look at the `posts` table to find the post related to ID 61217 and the `votes` table to find the popularity of this post. ```SQL SELECT posts.Id, posts.Title, COUNT(votes.Id) as Popularity FROM posts LEFT JOIN votes ON posts.Id = votes.PostId WHERE posts.Id = 61217 GROUP BY posts.Id, posts.Title
posts, postLinks
============================
To answer this question, we need to use the `postHistory` and `comments` tables. First, we can find the post history type IDs for post ID 3720 by querying the `postHistory` table: ```sql SELECT PostHistoryTypeId FROM postHistory WHERE PostId = 3720
comments, postHistory
============================
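
One way to measure how often a fine-tuned model drifts from the schema-link format is to check each response against the expected shape. The sketch below assumes that the short lines in the samples above (for example `posts, postLinks`) are the expected targets, i.e. a bare comma-separated list of table names. The function name and the regular expression are hypothetical and not taken from the repository.

```python
import re

# A bare list of identifiers separated by commas, e.g. "posts, postLinks".
TABLE_LIST = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(\s*,\s*[A-Za-z_][A-Za-z0-9_]*)*")


def parse_table_list(response):
    """Return the table names if the response is a bare list, else None."""
    text = response.strip()
    if not TABLE_LIST.fullmatch(text):
        return None
    return [name.strip() for name in text.split(",")]


responses = [
    "posts, postLinks",
    "To answer this question, we need to look at the `posts` table.",
]
off_template = [r for r in responses if parse_table_list(r) is None]
print(f"{len(off_template)} of {len(responses)} responses are off-template")
```

Running a check like this on both the published model and a newly fine-tuned model gives a concrete rate to compare while looking for differences in the training prompt or label format.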

Did you also use column descriptions instead of samples when processing the BIRD dataset for training?

In the script DTS_SQL_bird_submission.py, the table schema information obtained for both schema linking and SQL generation uses column descriptions instead of samples.
This differs from the fine-tuning data processing in fintuning_dataset_createtor.py, where the table schema information in the training data has samples but no column descriptions.
Did you also use column descriptions instead of samples when processing the BIRD dataset for training?
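
To make the mismatch concrete, here is a minimal sketch of a single schema formatter that can render either variant, so that training and inference prompts can be built by the same code path. The function, its parameters, and the output layout are hypothetical; they do not reproduce the format used in either script.

```python
def describe_table(name, columns, samples=None, descriptions=None):
    """Render one table for a prompt, with optional sample rows or column descriptions."""
    lines = [f"Table {name}, columns = [{', '.join(columns)}]"]
    if descriptions:
        for col in columns:
            if col in descriptions:
                lines.append(f"  {col}: {descriptions[col]}")
    if samples:
        lines.append("Sample rows:")
        for row in samples:
            lines.append("  " + ", ".join(str(value) for value in row))
    return "\n".join(lines)


columns = ["Id", "Score"]

# Variant with sample rows (the style described for the training data).
print(describe_table("posts", columns, samples=[(1, 70), (2, 50)]))

# Variant with column descriptions (the style described for the submission script).
print(describe_table("posts", columns, descriptions={"Score": "score of the post"}))
```

If the two stages really do use different variants, the model sees schema text at inference time that it never saw during fine-tuning, which is why the question matters.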

Are the fine-tuned models available?

Dear author, I noticed that DTS-SQL + DeepSeek 7B achieves 60.31% execution accuracy on the BIRD leaderboard.
When will the fine-tuned model be open-sourced? Thank you very much.
