
genai's Issues

[Updates] Switch to progress bar for `%%assist`

The messages that get sent with %%assist don't make much sense when we tend to get Markdown back. We should switch this to a progress bar. In fact... maybe we should just emit Markdown and let the user copy the code instead of creating a new cell.

Ignore Keyboard Interrupt

We should not provide suggestions for KeyboardInterrupt or other user-initiated terminations. We should probably also stop when the process runs out of memory.
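A minimal sketch of such a guard, assuming the error handler can check the exception type before asking for a suggestion (SUPPRESSED and should_suggest are names made up for illustration):

```python
# Exception types we never want to send off for suggestions:
# user-initiated terminations plus out-of-memory conditions.
SUPPRESSED = (KeyboardInterrupt, SystemExit, MemoryError)

def should_suggest(etype):
    """Return True if an exception type warrants an AI suggestion."""
    return not issubclass(etype, SUPPRESSED)
```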

Create a way to override the default prompts

I'd like to be able to experiment with the prompts to do tuning outside of making another release.

import genai

genai.set_default_error_prompt(
    """
You are a data scientist diagnosing errors from your colleagues.
They ran into an error in their notebook. Be concise. Format your response in markdown.
Be extremely minimal with prose, aiming primarily for concise fixes.
""".strip()
)

As the Python package solidifies, the most likely piece to change over time is going to be the prompt.

Set up CI

Totally happy with the same base we use for other projects.

[Prompt] Suggest how to install when ModuleNotFound

When a user hits a ModuleNotFoundError, we should send a different kind of prompt and context. Sometimes the user made a typo, and sometimes they don't have the package installed. genai should be able to distinguish between the two.
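One piece that makes this tractable: ModuleNotFoundError carries the missing module's name on its standard .name attribute, so we can compare it against installed packages. A small sketch (missing_module_name is a name made up here):

```python
def missing_module_name(exc):
    """Pull the missing module's name off a ModuleNotFoundError."""
    return exc.name

try:
    import pndas  # deliberately misspelled
except ModuleNotFoundError as e:
    print(missing_module_name(e))  # prints "pndas"
```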

Missing Package

(screenshot in the original issue)

Typo

(screenshot in the original issue)

Context

  • Send the user's code
  • Send the output of pip freeze (assuming this isn't too big) as a role: system message
Bonus Context / Optimization

As a bonus, if the pip freeze output is too big maybe we can use something like levenshtein distance to pull the closest string matches.

import Levenshtein  # pip install python-Levenshtein

# Grab installed package names from the current environment
packages = !pip list --format=freeze
packages = [pkg.split('==')[0] for pkg in packages]

def find_closest_strings(target, string_list, n=20):
    """
    Finds the n closest strings in string_list to the target string.
    """
    # Compute the Levenshtein distance between the target string and each candidate
    similarity_scores = [(string, Levenshtein.distance(target, string)) for string in string_list]

    # Sort by distance, ascending
    similarity_scores.sort(key=lambda x: x[1])

    # Return the n closest strings
    return [x[0] for x in similarity_scores[:n]]

find_closest_strings("pndas", packages)

Outputs

['pandas',
 'conda',
 'dask',
 'anyio',
 'appdirs',
 'attrs',
 'dx',
 'fqdn',
 'fs',
 'genai',
 'geopandas',
 'idna',
 'jedi',
 'Jinja2',
 'openai',
 'parso',
 'partd',
 'patsy',
 'pip',
 'py']

Prompt suggestion

  • Use !pip or %pip if an install is needed
  • "Here are the packages installed on the system"

[?][Design] Should %%assist return Markdown output?

I'm starting to think that we should emit markdown output instead of creating a new cell with %%assist.

Reasons:

  • Responses tend to come back in either Markdown or a plain text format
  • set_next_input (the IPython new cell creator) can't stream updates
  • The response usually needs some editing
  • The document gets littered with new cells on repeated runs

Send unprocessed code over to ChatCompletion

Sometimes ChatCompletion echoes the get_ipython().run_cell_magic line, which I assume comes from it seeing that transformed call in the code we send (screenshot in the original issue).

We should make sure it's seeing the raw cell with the %%assist magic, not the processed input.

Trim down reprs

We should make sure to set pandas display options to something reasonable when using get_historical_context, because otherwise the reprs blow up in size pretty quickly.

def craft_output_message(output):
    return {
        "content": repr(output),
        "role": "system",
    }

We should always try to keep this under a certain length as well as do any processing in advance for known useful objects.
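A sketch of capping the repr length (the 1,000-character limit and the truncation marker are arbitrary choices for illustration, not anything the package does today):

```python
MAX_CONTENT_LENGTH = 1000  # arbitrary cap, in characters

def craft_output_message(output):
    """Build a system message from an output, truncating huge reprs."""
    content = repr(output)
    if len(content) > MAX_CONTENT_LENGTH:
        # Keep the head of the repr and mark the cut
        content = content[:MAX_CONTENT_LENGTH] + " ...[truncated]"
    return {
        "content": content,
        "role": "system",
    }
```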

[MAINT] Combine the two dev deps sections for poetry

Due to poetry changes we now have two development dependencies sections:

[tool.poetry.dev-dependencies]
flake8-docstrings = "^1.6.0"
pytest = "^7.2.2"
black = "^23.1.0"
isort = "^5.10.1"
pytest-cov = "^4.0.0"
pytest-asyncio = "^0.19.0"
nox = "^2022.1.7"
nox-poetry = "^1.0.1"
pytest-mock = "^3.8.2"
bump2version = "^1.0.1"

[tool.poetry.group.dev.dependencies]
pandas = "^1.5.3"

All of those need to move under [tool.poetry.group.dev.dependencies], per the warning Poetry now gives when you run poetry add --dev:

The --dev option is deprecated, use the `--group dev` notation instead.
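Merged, the section would look like this (versions copied from the two sections above):

```toml
[tool.poetry.group.dev.dependencies]
flake8-docstrings = "^1.6.0"
pytest = "^7.2.2"
black = "^23.1.0"
isort = "^5.10.1"
pytest-cov = "^4.0.0"
pytest-asyncio = "^0.19.0"
nox = "^2022.1.7"
nox-poetry = "^1.0.1"
pytest-mock = "^3.8.2"
bump2version = "^1.0.1"
pandas = "^1.5.3"
```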

Setting intentions

The goal of this package is to expose AI tools through IPython primitives like:

  • Cell creation (set_next_input)
  • Custom exception handling
  • Completion

As well as using contextual information like the current in-memory variables and previous inputs (In) to provide context to ChatGPT.

Remove the pip prompting

Maybe the prompt about %pip vs !pip is too much. Sometimes GPT includes it in code segments that don't even involve an install (screenshot in the original issue).

Increase output context sharing

We are currently limited to execute_result for available outputs for sharing with LLMs (by nature of how Out works). How can we increase output context sharing? I'd like to be able to have !pip show pkg be part of what the LLM reads from for context when assisting and working through errors.

Consider a feedback loop

For suggestions, we could use input to do back & forth between the user and ChatGPT.

from IPython.display import display

# Assume messages holds everything we sent prior, including assistant responses
messages.append(completion["choices"][0]["message"])

user_input = input("chat> ")

while user_input != "":  # note: `is not ""` would compare identity, not equality
    user_message = {"role": "user", "content": user_input}
    messages.append(user_message)

    completion = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=messages,
    )

    assistant_message = completion["choices"][0]["message"]
    messages.append(assistant_message)

    # Publish both plain-text and Markdown representations;
    # frontends pick the richest one they support.
    display(
        {
            "text/plain": assistant_message["content"],
            "text/markdown": assistant_message["content"],
        },
        raw=True,
    )

    user_input = input("chat> ")

It's not the best user experience. However, it'll work across all the Jupyter platforms, even IPython at the command line.
