
icortex's Introduction




A no-code development framework — Let AI do the coding for you 🦾


tl;dr in goes English, out comes Python:

[Demo video: demo.mp4]

ICortex is a no-code development framework that lets you develop Python programs using plain English. Simply write a recipe that breaks down what you want to do, step by step, in plain English. Our code-generating AI will follow your instructions and develop a Python program that suits your needs.

Create a TextCortex account to receive free starter credits and start using ICortex.

Try it out


You can try out ICortex directly in your browser. Launch a Binder instance by clicking here, and follow the instructions in our docs to get started.

Alternatively, you can use ICortex in Google Colab if you have an account. See below.

Check out the documentation to learn more. Join our Discord to get help.

Installation

Locally

Install directly from PyPI:

pip install icortex
# This line is needed to install the kernel spec to Jupyter:
python -m icortex.kernel.install

On Google Colab

Google Colab is a restricted computing environment that does not allow installing new Jupyter kernels. However, you can still use ICortex by running the following code in a Colab notebook:

!pip install icortex
import icortex.init

Note that the package needs to be installed in every new Google Colab runtime; you may need to reinstall it if the runtime ever gets disconnected.

Quickstart

Click here to get started using ICortex.

Getting help

Feel free to ask questions in our Discord.

Uninstalling

To uninstall, run

pip uninstall icortex

This removes the package. However, it may still leave the kernel spec in Jupyter's kernel directories, causing it to keep showing up in JupyterLab. If that is the case, run

jupyter kernelspec uninstall icortex -y


icortex's Issues

Improve cosmetics

  • Add spinners
  • More colors in output
  • Add ICortex lexer to Pygments (either to the official repo or as a plugin) for syntax highlighting in the terminal
  • Add ICortex to CodeMirror for syntax highlighting in Jupyter notebooks

Error with %icortex magic on Google Colab

Description of bug / unexpected behavior

Running %icortex init (and also other args) gives the following error.

usage: ipykernel_launcher.py [-h] [--user] [--sys-prefix] [--prefix PREFIX]
                             [--uninstall]
ipykernel_launcher.py: error: unrecognized arguments: -f /root/.local/share/jupyter/runtime/kernel-7ccb0a19-8059-43ff-a85c-39f8db991f4a.json

%prompt works without an issue.
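
The -f <connection file> argument is what Jupyter passes to every kernel, so the installer's argparse parser is likely choking on it. A minimal sketch of a possible fix, assuming the magic reuses that parser (not the actual ICortex code): switch to parse_known_args() so unknown arguments are tolerated.

import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--user", action="store_true")
parser.add_argument("--sys-prefix", action="store_true")
parser.add_argument("--prefix")
parser.add_argument("--uninstall", action="store_true")

# parse_known_args() tolerates extra arguments such as Jupyter's -f flag
# instead of erroring out on them
args, unknown = parser.parse_known_args()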

Improve prompt dialog and result caching

The current workflow of prompting, package installation and execution can be improved. This also ties into the caching of generated code.

  • Don't reuse cached code automatically if the user didn't choose to execute it (see the sketch after this list)
  • Create a flowchart of the workflow and include it in the reference
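
A minimal sketch of how a cache entry could track user approval so that unapproved code is never reused automatically; the field and function names are assumptions, not the actual ICortex data model:

from dataclasses import dataclass

@dataclass
class CacheEntry:
    prompt: str
    generated_code: str
    approved: bool = False  # set to True only when the user chooses to execute

def should_reuse(entry: CacheEntry) -> bool:
    # Reuse cached code automatically only if the user approved it earlier
    return entry.approved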

Open questions

How opinionated should the prompt workflow be?
Should the user be able to change the workflow in the configuration?

Submit execution context to TextCortex API

Currently, requests to the TextCortex API generate code independently for each cell. Without the context of the entire notebook (global variables and so on), the API returns disparate code, forcing the user to be overly specific about, e.g., variable names in their prompts.

Ideally, the entire execution context, i.e.

  1. inputs of previously executed cells,
  2. code generated from prompts,
  3. outputs of previously executed cells,
  4. names of variables in the global namespace,
  5. values of variables in the global namespace

should all be submitted to the API in each request for the best possible generation.

Bandwidth is a bottleneck for code generated remotely, so the request payload needs to be pruned without losing too much of the context; as a ballpark, it should not exceed roughly 500 kB.

Implementation

Fortunately, IPython caches the inputs and outputs of each cell and stores them in hidden variables in the global namespace, which we can access easily:

https://ipython.readthedocs.io/en/stable/interactive/reference.html#input-caching-system

For submission to a remote API, the history variables need to be pruned down to the aforementioned limit. Code generation performance is inversely proportional to the amount of discarded information, but we expect it to perform reasonably well with only (1), (2) and (4) from above.

  • Implement logic to pack

    • (1)
    • (2)
    • (3)
    • (4)
    • (5)
  • Create a schema to convert the dict into JSON

That JSON would then be included in the payload and processed by the API for each request.
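
A minimal sketch of that packing logic, assuming IPython's In/Out caches and a naive oldest-first pruning strategy; the key names and helper are illustrative, not the final schema:

import json
from IPython import get_ipython

MAX_PAYLOAD_BYTES = 500_000  # ballpark limit from above

def pack_context():
    ns = get_ipython().user_ns
    context = {
        "inputs": list(ns.get("In", []))[1:],  # (1) inputs of executed cells
        "outputs": {str(k): repr(v) for k, v in ns.get("Out", {}).items()},  # (3)
        "variables": [k for k in ns if not k.startswith("_")],  # (4)
    }
    # Naive pruning: drop the oldest inputs until the payload fits the budget
    while len(json.dumps(context)) > MAX_PAYLOAD_BYTES and context["inputs"]:
        context["inputs"].pop(0)
    return context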

Notes

The JSON schema should be the same as the Jupyter notebook format, with code-generation-specific data stored in cell metadata.

Future work

  • A more sophisticated pruning algorithm that processes and includes (3) and (5) in the payload

Add Support for Different Model Types

ONNX models are serialized versions of the current AI models. They are somewhat faster than the regular PyTorch or Hugging Face models, so some users might want to use this model type.

We already have a model converted and uploaded to Hugging Face:
https://huggingface.co/TextCortex/codegen-350M-optimized

In total, we need to support the following model types:

  • PyTorch: filename extension .pt
  • Hugging Face: filename extension .bin
  • ONNX: filename extension .onnx

Here is a script that supports text generation for ONNX models with the optimum library from Hugging Face:

from transformers import AutoTokenizer
from optimum.onnxruntime import ORTModelForCausalLM

# Load the ONNX-optimized model and its tokenizer
model = ORTModelForCausalLM.from_pretrained("TextCortex/codegen-350M-optimized")
tokenizer = AutoTokenizer.from_pretrained("TextCortex/codegen-350M-optimized")

def generate_onnx(prompt, min_length=16, temperature=0.1, num_return_sequences=1):
    # Tokenize the prompt before generation
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    generated_ids = model.generate(
        input_ids,
        min_length=min_length,
        temperature=temperature,
        num_return_sequences=num_return_sequences,
        early_stopping=True,
    )
    return tokenizer.decode(generated_ids[0], skip_special_tokens=True)

For vanilla PyTorch models (.pt), you can directly use the AutoModel class from transformers (which also works for the Hugging Face .bin model type).
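
For illustration, loading such a checkpoint could look like the following; the model name is only an example of a causal LM on the Hub:

from transformers import AutoModelForCausalLM, AutoTokenizer

# Works the same way for regular PyTorch (.pt) and Hugging Face (.bin) checkpoints
model = AutoModelForCausalLM.from_pretrained("Salesforce/codegen-350M-mono")
tokenizer = AutoTokenizer.from_pretrained("Salesforce/codegen-350M-mono")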

Create icortex CLI

Create a standalone script icortex with the following spec (a sketch follows the list):

  • icortex without any argument launches the REPL.
  • icortex init starts an interactive project initialization script that will let the user configure API keys and other options.
  • ...
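
A minimal sketch of the entry point, assuming argparse; the helper bodies are placeholders:

import argparse

def start_repl():
    print("Launching the ICortex REPL...")  # placeholder

def run_init():
    print("Starting interactive configuration...")  # placeholder

def main():
    parser = argparse.ArgumentParser(prog="icortex")
    subparsers = parser.add_subparsers(dest="command")
    subparsers.add_parser("init", help="Configure API keys and other options")
    args = parser.parse_args()
    if args.command == "init":
        run_init()
    else:
        start_repl()

if __name__ == "__main__":
    main()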

Adding icortex as a dependency

In a project with the following Python version constraint:

python = "^3.10"

Adding icortex to the project raised the following error:

The current project's Python requirement (>=3.10,<4.0) is not compatible with some of the required packages Python requirement:
  - icortex requires Python >=3.8,<3.11, so it will not be satisfied for Python >=3.11,<4.0

Because no versions of icortex match >0.1.1,<0.2.0
 and icortex (0.1.1) requires Python >=3.8,<3.11, icortex is forbidden.
So, because ... depends on icortex (^0.1.1), version solving failed.

Make sure icortex can be added in such cases.

Migrate prompt parsing logic to IPython.core.InteractiveShell

Currently, prompt parsing and interactive features such as package auto-installing, execution, etc. happen outside the IPython parser. This has the following disadvantages:

  • The prompts themselves are not stored in the history; only the Python code they resolve to is
  • The global and local namespaces have to be passed to ICortex.eval_prompt() in order to execute the Python code returned from services—if we implement the following steps, they can be accessed directly in the overloaded InteractiveShell

Steps:

  • Create an overloaded version of IPython.core.inputtransformer2 and migrate prompt parsing logic in there
  • Create an overloaded version of IPython.core.InteractiveShell which uses the overloaded inputtransformer2 and migrate interactive ICortex logic in there
  • Modify ICortexShell to use the overloaded InteractiveShell

This will most likely simplify ICortexShell.do_execute() and resolve #10 automatically.
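
A rough sketch of the second step, using IPython's string-transformer hook on InteractiveShell; the "/" prompt prefix and eval_prompt are illustrative, not the actual ICortex syntax:

from IPython.core.interactiveshell import InteractiveShell

class ICortexInteractiveShell(InteractiveShell):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # String transformers run on the raw cell source before compilation
        self.input_transformers_cleanup.append(self._transform_prompts)

    @staticmethod
    def _transform_prompts(lines):
        # Illustrative rewrite: resolve a "/..." prompt line into a Python call
        return [
            f"get_ipython().eval_prompt({line[1:].strip()!r})\n"
            if line.startswith("/") else line
            for line in lines
        ]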

Possible issues:

  • Will the interactive features (input() from the user) continue to work smoothly in both the shell and notebooks?

Ability to use without installing the kernel spec

A lot of people will use ICortex from within Google Colab, which doesn't support installing new kernel specs.

If there is a relatively non-hacky way of overloading IPythonKernel with ICortex magics in a single command, it would make it possible to use ICortex in Colab.
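
A minimal sketch of what such a command could do under the hood: register the magics on the already-running kernel's shell instead of installing a kernel spec (the magic body is a placeholder):

from IPython.core.magic import register_line_cell_magic

@register_line_cell_magic
def prompt(line, cell=None):
    # Works as both a line and a cell magic inside the running IPython kernel
    text = cell if cell is not None else line
    print(f"would generate and run code for: {text}")  # placeholder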

Better memoization/caching

Feedback from Eden:

The caching method is interesting, although I would've expected memoized responses first, then falling back to your file cache on miss.

I think you're also mixing caches between models, as they rely on the same file location.
You likely want to either just have a session cache via memoised patterns - functools has a very simple one to add in - or you want to break up caches between models and services.
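
A minimal sketch of the session-level memoization suggested above, with the service and model included in the cache key so that per-model caches stay separate; the names are illustrative:

import functools

def _generate_uncached(service: str, model: str, prompt: str) -> str:
    # Placeholder for the file-cache lookup and, on a miss, the remote API call
    return f"# code from {service}/{model} for: {prompt}"

@functools.lru_cache(maxsize=256)
def generate_code(service: str, model: str, prompt: str) -> str:
    # functools.lru_cache provides the simple in-session memoization layer
    return _generate_uncached(service, model, prompt)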

Ability to run ICortex notebooks as scripts, argument & context magics

The whole point of ICortex is to create reusable tools (i.e. scripts) using plain English. The %prompt magic already supports adding new arguments through the service API; similar functionality can be envisioned to inject arguments into the context through a simplified, argparse-like API.

Ultimately, the user should be able to call notebooks like this:

icortex my_notebook.ipynb file_to_be_processed -o output_file -p 123

Similar ideas have floated around online for IPython, e.g. here.

getopt/argparse-like interfaces are known for their versatility—this project uses that interface to auto-generate GUIs from parser objects. We can leverage this versatility to let users create plain English scripts that can take any number of arguments, and eventually, be able to call each other.
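
A sketch of what the argument-injection API could look like from inside a notebook, assuming an argparse-like interface; the flags mirror the invocation above but are otherwise illustrative:

import argparse

parser = argparse.ArgumentParser()
parser.add_argument("input_file")
parser.add_argument("-o", "--output-file")
parser.add_argument("-p", "--param")

# When run via `icortex my_notebook.ipynb ...`, the CLI arguments would be
# parsed like this and injected into the prompt context as strings
args = parser.parse_args(["file_to_be_processed", "-o", "output_file", "-p", "123"])
print(vars(args))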

Goals

Make it

  • Opinionated
  • Simple
  • Flexible

Caveats

Any argument that is injected into the context will eventually be fed to the language model as a string, so the API needs to let users specify how the arguments should be formatted during string conversion.

TBD

Better async implementation

Feedback from Eden:

Oddly you're not using async where it makes sense - in the network requests - but I'm guessing you're forced into it for Jupyter. It's not essential unless you ever find yourself in a place with multiple user requests for code gen.

What would be interesting, and would use that async appropriately, is sending the request to multiple services and letting the user pick the most appropriate one.
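
A minimal sketch of that fan-out, assuming asyncio and placeholder service calls:

import asyncio

async def fetch_candidate(service: str, prompt: str) -> str:
    await asyncio.sleep(0.1)  # stand-in for the actual HTTP request
    return f"# candidate generated by {service}"

async def generate_candidates(prompt: str) -> list:
    services = ["textcortex", "another-service"]  # illustrative names
    return await asyncio.gather(*(fetch_candidate(s, prompt) for s in services))

# The user would then pick the most appropriate candidate, e.g.:
# candidates = asyncio.run(generate_candidates("plot a sine wave"))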

IPythonKernel.do_execute() is currently overloaded in a hacky way, which causes problems, e.g. when a KeyboardInterrupt is raised. Async handling needs to be improved overall.

Construct module <> package map for auto-installing packages

Module names do not map directly to PyPI package names, so there needs to be a way to bridge that gap to auto-install missing modules.

To construct the mapping we could:

  1. Download the list of most popular Python packages, e.g. https://hugovk.github.io/top-pypi-packages/
  2. Scrape each package and find out which modules they install

The mapping should not be owned or provided by a third party, to prevent arbitrary code execution vulnerabilities.
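
A minimal sketch of how the resulting mapping could be consumed, assuming it is generated offline and vendored with ICortex; the entries shown are real examples of module/package name mismatches:

# Vendored with ICortex rather than fetched from a third party at runtime
MODULE_TO_PACKAGE = {
    "cv2": "opencv-python",
    "sklearn": "scikit-learn",
    "PIL": "Pillow",
    "yaml": "PyYAML",
}

def package_for_module(module_name: str) -> str:
    # For most packages, the module name and the PyPI name coincide
    return MODULE_TO_PACKAGE.get(module_name, module_name)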
