Code Monkey home page Code Monkey logo

ask-your-database's Introduction

Query your database with GPT

A CLI tool that will let you ask GPT questions about any Postgres database. Just provide your connection details, and ask-your-database automatically loads up the schema, gets example data, and runs queries for you. It works best with GPT-4.

Note: This will directly run queries provided by GPT on the database that you provide. Unless it's a read-only connection, that means it might delete or update data by accident. Please use with caution and always keep backups of your data.

Usage

First, create a JSON configuration file:

{
  "openAIAPIKey": "xxx",
  "openAIModel": "gpt-4",
  "dbTimeoutMs": 20000,
  "apiTimeoutMs": 30000,
  "postgresConnection": {
    "host": "localhost",
    "port": 5432,
    "database": "imdb",
    "user": "imdb",
    "password": "1234"
  }
}

Then, call this with the config file like so:

npx ask-your-database myConfig.json

What it does

This example is using an example IMDB postgres database from RyanMarcus/imdb_pg_dataset. You can load it up by downloading this archive and using pg_restore --database=db_name --user=user imdb_pg11.

First, this will tell you a bit about your database, and prompt you for an initial question:

> npx [email protected] imdbConfig.json

You are connected to the database imdb. It has the following tables:

aka_name, aka_title, cast_info, char_name, comp_cast_type, company_name, company_type,
complete_cast, info_type, keyword, kind_type, link_type, movie_companies, movie_info,
movie_info_idx, movie_keyword, movie_link, name, person_info, role_type, title

? Ask me a question about this database, and I'll try to answer! (q to quit)

Then you can put in a question, for example:

> What movies did Keanu Reeves act in?

If the response includes a SQL query, then this tool will automatically run it for you and show the results:

To answer your question, we'll need to join the 'name' table to find Keanu Reeves' person_id,
and then join the 'cast_info' table to get the movie_ids in which Keanu Reeves appeared as an
actor, and finally join the 'title' table to get the titles of those movies.

Please run the following query:

SELECT title.title
FROM name
JOIN cast_info ON name.id = cast_info.person_id
JOIN title ON cast_info.movie_id = title.id
JOIN role_type ON cast_info.role_id = role_type.id
WHERE name.name ILIKE 'Reeves, Keanu' AND role_type.role = 'actor';

Returned 253 rows. Here are the first 101 rows:
┌─────────┬────────────────────────────────────────────────────────────────────────────┐
│ (index) │                                   title                                    │
├─────────┼────────────────────────────────────────────────────────────────────────────┤
│    0    │                        '101 Biggest Celebrity Oops'                        │
│    1    │                          '1992 MTV Movie Awards'                           │
│    2    │                          '1993 MTV Movie Awards'                           │
│    3    │                          '1995 MTV Movie Awards'                           │
│    4    │                          '1999 MTV Movie Awards'                           │
│    5    │                  '2000 Blockbuster Entertainment Awards'                   │
│    6    │                          '2003 MTV Movie Awards'                           │
│    7    │                      '2004 Taurus World Stunt Awards'                      │
│    8    │                          '2006 MTV Movie Awards'                           │
│    9    │                                 '47 Ronin'                                 │
│   10    │                              "A Man's Story"                               │
│   11    │                             'A Scanner Darkly'                             │

... some results truncated

When you send the response to GPT, the result will be included:

> This is great, but it includes things that aren't movies. Can you filter to only
movies, and also sort to show the most recent first?

Using the new results and the past context, GPT will then run more queries and improve its results:

To filter only movies and sort them by the most recent first, we need to join the 'kind_type'
table to get the movie kind and filter by 'movie', and order the result by 'production_year'
in descending order. Please run the following query:

SELECT title.title, title.production_year
FROM name
JOIN cast_info ON name.id = cast_info.person_id
JOIN title ON cast_info.movie_id = title.id
JOIN role_type ON cast_info.role_id = role_type.id
JOIN kind_type ON title.kind_id = kind_type.id
WHERE name.name ILIKE 'Reeves, Keanu' AND role_type.role = 'actor' AND kind_type.kind = 'movie'
ORDER BY title.production_year DESC;

Returned 64 rows. Here are the first 51 rows:
┌─────────┬─────────────────────────────────────┬─────────────────┐
│ (index) │                title                │ production_year │
├─────────┼─────────────────────────────────────┼─────────────────┤
│    0    │             'John Wick'             │      null       │
│    1    │            'Passengers'             │      null       │
│    2    │              'Jekyll'               │      null       │
│    3    │              'Jekyll'               │      null       │
│    4    │          'Man of Tai Chi'           │      2013       │
│    5    │             '47 Ronin'              │      2013       │
│    6    │           'Side by Side'            │      2012       │
│    7    │         'Generation Um...'          │      2012       │
│    8    │           'Sunset Strip'            │      2012       │

... some results truncated

(Our example data set only includes data up to 2013)

Then, you can continue to refine the results:

> This is great, but can you omit the ones with null production_year?

The response:

To omit the ones with null production_year, add a condition in the WHERE clause. Please run the
following query:

SELECT title.title, title.production_year
FROM name
JOIN cast_info ON name.id = cast_info.person_id
JOIN title ON cast_info.movie_id = title.id
JOIN role_type ON cast_info.role_id = role_type.id
JOIN kind_type ON title.kind_id = kind_type.id
WHERE name.name ILIKE 'Reeves, Keanu' AND role_type.role = 'actor' AND kind_type.kind = 'movie' AND title.production_year IS NOT NULL
ORDER BY title.production_year DESC;

Returned 60 rows. Here are the first 51 rows:
┌─────────┬─────────────────────────────────────┬─────────────────┐
│ (index) │ title │ production_year │
├─────────┼─────────────────────────────────────┼─────────────────┤
│ 0 │ '47 Ronin' │ 2013 │
│ 1 │ 'Man of Tai Chi' │ 2013 │
│ 2 │ 'Sunset Strip' │ 2012 │
│ 3 │ 'Side by Side' │ 2012 │
│ 4 │ 'Generation Um...' │ 2012 │
│ 5 │ "A Man's Story" │ 2010 │
│ 6 │ "Henry's Crime" │ 2010 │
│ 7 │ 'The Private Lives of Pippa Lee' │ 2009 │
│ 8 │ 'Street Kings' │ 2008 │
│ 9 │ 'The Day the Earth Stood Still' │ 2008 │

... some results truncated

And so on!

Approach

This combines a few simple things together:

  1. First, it queries the database for all of the tables. For each table it gets all the columns, their types, and one example result. All of these are provided as the beginning of the prompt.
  2. The schema and your question are sent to GPT.
  3. Any queries returned as part of the response are automatically executed, and the result is printed to you.
  4. When you send any followup questions, the result is automatically sent so GPT can use it as context for the following responses.

Local development

Using pnpm or your Node package manager of choice:

pnpm install
pnpm start yourConfig.json

Additional disclaimers

Note that your database schema and any data queried by GPT will be sent to OpenAI for processing in the cloud. Exercise the appropriate amount of caution depending on your data.

ask-your-database's People

Contributors

stubailo avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar

ask-your-database's Issues

OpenAI 404 error since Organization not defined

I ran into a 404 error that I was able to maybe hack/solve with the following code. Should be an easy fix to add this to the config.json and then pull into the configuration in the correct places.

In index.js at line 100

const configuration = new Configuration({
organization: "org-xxx", << New line added
apiKey: config.openAIAPIKey,
});

I extracted this modification from his web link and then copied my Organization ID from my own account.

https://platform.openai.com/docs/api-reference/authentication?lang=node.js

OpenAI helper page: https://help.openai.com/en/articles/6891827-error-code-404-you-must-be-a-member-of-an-organization-to-use-the-api#

Cannot read 'id' from property

Hi, first ever GitHub issue being created, so please be kind: :)

I am a python programmer, so don't know Node or JS as much. I've run down the rabbit hole of trying to find solutions, but I don't know how to implement most of the suggestions to try them out.

https://stackoverflow.com/questions/67193689/node-js-typeerror-cannot-read-property-id-of-undefined

My Error:
(node:116947) UnhandledPromiseRejectionWarning: TypeError: Cannot read property 'id' of undefined
at /home/usaspending-gpt4/.npm/_npx/cda643d17fed3bae/node_modules/ask-your-database/dist/index.js:70:188
at Array.map ()
at /home/usaspending-gpt4/.npm/_npx/cda643d17fed3bae/node_modules/ask-your-database/dist/index.js:70:39
at Array.map ()
at main (/home/usaspending-gpt4/.npm/_npx/cda643d17fed3bae/node_modules/ask-your-database/dist/index.js:69:49)
at processTicksAndRejections (internal/process/task_queues.js:97:5)
(node:116947) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). To terminate the node process on unhandled promise rejection, use the CLI flag --unhandled-rejections=strict (see https://nodejs.org/api/cli.html#cli_unhandled_rejections_mode). (rejection id: 1)
(node:116947) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.