
genai's Issues

[Updates] Switch to progress bar for `%%assist`

The messages that get sent with %%assist don't make much sense when we tend to get Markdown back. We should switch this to a progress bar. In fact... maybe we should just emit Markdown and let the user copy the code instead of creating a new cell.

Ignore Keyboard Interrupt

We should not provide suggestions for KeyboardInterrupt or other user-initiated terminations. We should probably also stop when the process runs out of memory.
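A minimal sketch of such a guard, assuming the error handler can check the exception type before asking for a suggestion (SUPPRESSED and should_suggest are names made up for illustration):

```python
# Exception types we never want to send off for suggestions:
# user-initiated terminations plus out-of-memory conditions.
SUPPRESSED = (KeyboardInterrupt, SystemExit, MemoryError)

def should_suggest(etype):
    """Return True if an exception type warrants an AI suggestion."""
    return not issubclass(etype, SUPPRESSED)
```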

Create a way to override the default prompts

I'd like to be able to experiment with the prompts to do tuning outside of making another release.

import genai

genai.set_default_error_prompt(
    """
You are a data scientist diagnosing errors from your colleagues.
They ran into an error in their notebook. Be concise. Format your response in markdown.
Be extremely minimal with prose, aiming primarily for concise fixes.
""".strip()
)

As the Python package solidifies, the most likely piece to change over time is going to be the prompt.

Set up CI

Totally happy with the same base we use for other projects.

[Prompt] Suggest how to install when ModuleNotFound

When a user hits a ModuleNotFoundError, we should send a different kind of prompt and context. Sometimes the user made a typo, and sometimes they don't have the package installed. genai should be able to distinguish between the two.
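One piece that makes this tractable: ModuleNotFoundError carries the missing module's name on its standard .name attribute, so we can compare it against installed packages. A small sketch (missing_module_name is a name made up here):

```python
def missing_module_name(exc):
    """Pull the missing module's name off a ModuleNotFoundError."""
    return exc.name

try:
    import pndas  # deliberately misspelled
except ModuleNotFoundError as e:
    print(missing_module_name(e))  # prints "pndas"
```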

Missing Package

(screenshot in the original issue)

Typo

(screenshot in the original issue)

Context

  • Send the user's code
  • Send the output of pip freeze (assuming this isn't too big) as a role: system message
Bonus Context / Optimization

As a bonus, if the pip freeze output is too big maybe we can use something like levenshtein distance to pull the closest string matches.

import Levenshtein  # pip install python-Levenshtein

# Grab installed package names from the current environment
packages = !pip list --format=freeze
packages = [pkg.split('==')[0] for pkg in packages]

def find_closest_strings(target, string_list, n=20):
    """
    Finds the n closest strings in string_list to the target string.
    """
    # Compute the Levenshtein distance between the target string and each candidate
    similarity_scores = [(string, Levenshtein.distance(target, string)) for string in string_list]

    # Sort by distance, ascending
    similarity_scores.sort(key=lambda x: x[1])

    # Return the n closest strings
    return [x[0] for x in similarity_scores[:n]]

find_closest_strings("pndas", packages)

Outputs

['pandas',
 'conda',
 'dask',
 'anyio',
 'appdirs',
 'attrs',
 'dx',
 'fqdn',
 'fs',
 'genai',
 'geopandas',
 'idna',
 'jedi',
 'Jinja2',
 'openai',
 'parso',
 'partd',
 'patsy',
 'pip',
 'py']

Prompt suggestion

  • Use !pip or %pip if an install is needed
  • "Here are the packages installed on the system"

[?][Design] Should %%assist return Markdown output?

I'm starting to think that we should emit markdown output instead of creating a new cell with %%assist.

Reasons:

  • Responses tend to come back in either Markdown or a plain text format
  • set_next_input (the IPython new cell creator) can't stream updates
  • The response usually needs some editing
  • The document gets littered with new cells on repeated runs

Send unprocessed code over to ChatCompletion

Sometimes ChatCompletion echoes the get_ipython().run_cell_magic line, which I assume comes from it seeing that transformed call in the code we send (screenshot in the original issue).

We should make sure it's seeing the raw cell with the %%assist magic, not the processed input.

Trim down reprs

We should make sure to set pandas display options to something reasonable when using get_historical_context, because otherwise the reprs blow up in size pretty quickly.

def craft_output_message(output):
    return {
        "content": repr(output),
        "role": "system",
    }

We should always try to keep this under a certain length as well as do any processing in advance for known useful objects.
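A sketch of capping the repr length (the 1,000-character limit and the truncation marker are arbitrary choices for illustration, not anything the package does today):

```python
MAX_CONTENT_LENGTH = 1000  # arbitrary cap, in characters

def craft_output_message(output):
    """Build a system message from an output, truncating huge reprs."""
    content = repr(output)
    if len(content) > MAX_CONTENT_LENGTH:
        # Keep the head of the repr and mark the cut
        content = content[:MAX_CONTENT_LENGTH] + " ...[truncated]"
    return {
        "content": content,
        "role": "system",
    }
```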

[MAINT] Combine the two dev deps sections for poetry

Due to poetry changes we now have two development dependencies sections:

[tool.poetry.dev-dependencies]
flake8-docstrings = "^1.6.0"
pytest = "^7.2.2"
black = "^23.1.0"
isort = "^5.10.1"
pytest-cov = "^4.0.0"
pytest-asyncio = "^0.19.0"
nox = "^2022.1.7"
nox-poetry = "^1.0.1"
pytest-mock = "^3.8.2"
bump2version = "^1.0.1"

[tool.poetry.group.dev.dependencies]
pandas = "^1.5.3"

All of those need to move under [tool.poetry.group.dev.dependencies], per the warning Poetry now gives when you run poetry add --dev:

The --dev option is deprecated, use the `--group dev` notation instead.
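Merged, the section would look like this (versions copied from the two sections above):

```toml
[tool.poetry.group.dev.dependencies]
flake8-docstrings = "^1.6.0"
pytest = "^7.2.2"
black = "^23.1.0"
isort = "^5.10.1"
pytest-cov = "^4.0.0"
pytest-asyncio = "^0.19.0"
nox = "^2022.1.7"
nox-poetry = "^1.0.1"
pytest-mock = "^3.8.2"
bump2version = "^1.0.1"
pandas = "^1.5.3"
```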

Setting intentions

The goal of this package is to expose AI tools through IPython primitives like:

  • Cell creation (set_next_input)
  • Custom exception handling
  • Completion

As well as using contextual information like the current in-memory variables and previous inputs (In) to provide context to ChatGPT.

Remove the pip prompting

Maybe the prompt about %pip vs !pip is too much. Sometimes GPT includes it in code segments that don't even involve an install (screenshot in the original issue).

Increase output context sharing

We are currently limited to execute_result for available outputs for sharing with LLMs (by nature of how Out works). How can we increase output context sharing? I'd like to be able to have !pip show pkg be part of what the LLM reads from for context when assisting and working through errors.

Consider a feedback loop

For suggestions, we could use input to do back & forth between the user and ChatGPT.

from IPython.display import display

# Assume messages holds everything we sent prior, including assistant responses
messages.append(completion["choices"][0]["message"])

user_input = input("chat> ")

while user_input != "":  # note: `is not ""` would compare identity, not equality
    user_message = {"role": "user", "content": user_input}
    messages.append(user_message)

    completion = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=messages,
    )

    assistant_message = completion["choices"][0]["message"]
    messages.append(assistant_message)

    # Publish both plain-text and Markdown representations;
    # frontends pick the richest one they support.
    display(
        {
            "text/plain": assistant_message["content"],
            "text/markdown": assistant_message["content"],
        },
        raw=True,
    )

    user_input = input("chat> ")

It's not the best user experience. However, it'll work across all the Jupyter platforms, even IPython at the command line.
