coauthor-interface

An Open-Source Interface for Human-AI Collaborative Writing

Overview

This repository contains the code for the interface of CoAuthor. The interface comes in two parts: (1) the frontend, which users see when writing with AI and when replaying previous writing sessions, and (2) the backend, which serves requests from the frontend and queries models to generate suggestions.

For downloading the CoAuthor dataset and replaying its writing sessions, please visit the website instead.

If you have any questions, please feel free to reach out to Mina Lee at [email protected].


Backend

The backend is a Flask app that serves requests from users, manages sessions, and stores logs for future replays.

By default, the backend is set up to support OpenAI models via the OpenAI API. To use other models, you will need to modify the backend to support them.

1. Clone this GitHub repository

Type the following command to clone this repository into a directory of your choice:

git clone https://github.com/minalee-research/coauthor-interface

Inside the coauthor-interface directory, run the following to install the required packages:

pip install -r requirements.txt

2. Add your API key(s) to use OpenAI models

Create a file ./config/api_keys.csv and add your API key(s) as follows:

host,domain,key
openai,default,sk-***************************************

Replace sk-*************************************** with your OpenAI API key. If you don't have one, you can create one from your OpenAI account.

For host and domain, you can simply use openai and default. If you want to define a new domain for your experiments and use a specific key for a subset of access codes that are under the domain, see Advanced Usage for more details on setting up new domains.
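The mapping from a host and domain to a key can be sketched as a small lookup. This is only an illustration, not the backend's actual code: it assumes the file is comma-separated with the header row host, domain, key, and the helper names load_api_keys and key_for, as well as the fallback to the default domain, are hypothetical.

```python
import csv
import io


def load_api_keys(csv_file):
    """Map (host, domain) pairs to API keys read from an open CSV file."""
    keys = {}
    for row in csv.DictReader(csv_file):
        keys[(row["host"].strip(), row["domain"].strip())] = row["key"].strip()
    return keys


def key_for(keys, host, domain):
    """Return the key for a domain, falling back to the 'default' domain.

    The fallback is an assumption made for this sketch.
    """
    return keys.get((host, domain), keys.get((host, "default")))


# In-memory sample data; the keys are placeholders, not real keys.
sample = io.StringIO(
    "host,domain,key\n"
    "openai,default,sk-placeholder\n"
    "openai,story,sk-story-placeholder\n"
)
api_keys = load_api_keys(sample)
print(key_for(api_keys, "openai", "story"))
```

To read the real file, pass an open file handle instead of the in-memory sample, for example open("./config/api_keys.csv", newline="").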

3. Run the server on your local machine or on a server

Run the server in ./backend with basic parameters as follows:

python3 api_server.py \
    --config_dir '../config' \
    --log_dir ../logs \
    --port 5555 \
    --proj_name 'pilot' \
    --debug

The backend initializes sessions using the access codes read from ./config/access_codes.csv. The access code you supply when opening the frontend must match one of the codes defined in that file.

The choice of models, examples (prompts that are hidden from users), and prompts (prompts that are shown to users in the text editor) can be specified when you create ./config/access_codes.csv.


Frontend

1. Run the frontend

You can run the frontend using a simple Python server or host it on a third-party server.

To run the frontend on a local machine, run the following command in the ./frontend directory:

python -m http.server 8000

To run the frontend on a server, you can use a third-party platform such as Glitch.

2. Set the server URL

Update ./frontend/js/config.js with the correct URLs of the frontend and backend servers. For instance, if your backend is running on http://127.0.0.1:5555 and your frontend is running on http://127.0.0.1:8000, the following two lines in the config file should look like this:

const serverURL = 'http://127.0.0.1:5555'
const frontendURL = 'http://127.0.0.1:8000' 

3. Access the frontend

Now, you can access the frontend server on your browser as follows:

FRONTEND_URL/index.html?access_code=ACCESS_CODE

where FRONTEND_URL is the URL of the frontend server (e.g. http://127.0.0.1:8000) and ACCESS_CODE is one of the access codes you defined in ./config/access_codes.csv. If you have followed the instructions above, you should be able to access the frontend at:

http://127.0.0.1:8000/index.html?access_code=demo

4. Use the frontend

  • Get suggestions from AI: While writing in the text editor, press the tab key whenever you want to get suggestions. You can get suggestions multiple times in a row if you want more; you can navigate the suggestions using arrow keys and press the enter key to select a suggestion; to reopen the previous suggestions, press the shift key and tab key at the same time.
  • Save your writing session: If you want to save the writing session (to share it with others or to replay it later), press the "Save your work" button at the bottom of the page and keep the SESSION_ID you receive; otherwise, your session will not be saved.
  • Replay your writing session: To view the replay of your writing session, you can access it at FRONTEND_URL/replay.html?session_id=SESSION_ID where FRONTEND_URL is the URL of the frontend server and SESSION_ID is the session ID you received when you saved your writing session.
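When handing links to many participants, the two URL patterns above (index.html with access_code, and replay.html with session_id) can be assembled programmatically. The following is a minimal sketch using only the Python standard library; the helper name page_url is hypothetical and not part of this repository.

```python
from urllib.parse import urlencode


def page_url(frontend_url, page, **params):
    """Join a frontend base URL, a page name, and URL-encoded query parameters."""
    return f"{frontend_url.rstrip('/')}/{page}?{urlencode(params)}"


# Link for writing with the 'demo' access code.
writing_url = page_url("http://127.0.0.1:8000", "index.html", access_code="demo")

# Link for replaying a saved session; SESSION_ID stands in for a real ID.
replay_url = page_url("http://127.0.0.1:8000", "replay.html", session_id="SESSION_ID")

print(writing_url)
print(replay_url)
```

Using urlencode keeps the link valid even if an access code contains characters that need escaping.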

Advanced Usage

Access codes

Each access code is mapped to a set of configurations (e.g. decoding parameters). You can create a new access code by adding a new row to ./config/access_codes.csv. The following is an example of a row in ./config/access_codes.csv:

domain,example,prompt,access_code,session_length,n,max_tokens,temperature,top_p,presence_penalty,frequency_penalty,stop,engine,additional_data
demo,na,na,demo,0,5,50,0.95,1,0.5,0.5,.,text-davinci-003,na

Parameters for experiments

  • domain: The domain of the access code. This is used to group access codes together. For instance, you can create a new domain called story and add all access codes that are used for story writing to this domain.
  • example: The part of a prompt that is hidden from users, designed to contain a set of examples for in-context learning. If you don't want to provide example(s), you can set this to na. Otherwise, you can provide multiple examples in ./config/examples as a text file and refer to it here.
  • prompt: The prompt that is shown to users in the text editor. If you don't want to provide a prompt, you can set this to na. Otherwise, you can add a prompt in ./config/prompts.tsv and refer to its prompt_code here.
  • access_code: The access code that users need to enter to access the frontend. Choose a unique access code for each row.
  • session_length: The minimum length of a writing session in seconds. After a user has written for this amount of time, the "Save your work" button will be enabled. If you don't want to set a time limit, you can set this to 0.
  • additional_data: Additional data that you want to connect with the session. Unless you have a specific use case, you can set this to na.
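Before starting the backend, it can help to check that every row parses and that access codes are unique. The sketch below is an illustration under assumptions, not the backend's own loader: it assumes a comma-separated file with the header shown in the example row, and the function name load_access_codes and the choice to convert na to None are hypothetical.

```python
import csv
import io

INT_FIELDS = ("session_length", "n", "max_tokens")
FLOAT_FIELDS = ("temperature", "top_p", "presence_penalty", "frequency_penalty")
OPTIONAL_FIELDS = ("example", "prompt", "additional_data")


def load_access_codes(csv_file):
    """Return a dict mapping each access code to its typed configuration."""
    configs = {}
    for row in csv.DictReader(csv_file):
        code = row["access_code"]
        if code in configs:
            raise ValueError(f"duplicate access_code: {code}")
        config = dict(row)
        for name in INT_FIELDS:
            config[name] = int(row[name])
        for name in FLOAT_FIELDS:
            config[name] = float(row[name])
        # 'na' marks an optional field that is not used.
        for name in OPTIONAL_FIELDS:
            if config[name] == "na":
                config[name] = None
        configs[code] = config
    return configs


# The example row from this section, as in-memory CSV data.
sample = io.StringIO(
    "domain,example,prompt,access_code,session_length,n,max_tokens,temperature,"
    "top_p,presence_penalty,frequency_penalty,stop,engine,additional_data\n"
    "demo,na,na,demo,0,5,50,0.95,1,0.5,0.5,.,text-davinci-003,na\n"
)
configs = load_access_codes(sample)
print(configs["demo"]["n"], configs["demo"]["temperature"])
```

A malformed number raises ValueError from int or float, which points to the offending row early rather than at request time.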

Parameters for OpenAI models (see the OpenAI API reference for more details)

  • engine: The engine used to generate suggestions (see the OpenAI documentation for the list of supported models).
  • n: The number of completions to generate for each prompt.
  • max_tokens: The maximum number of tokens to generate in the completion.
  • temperature: The temperature of the model. The higher the temperature, the more random the text.
  • top_p: An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.
  • presence_penalty: Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
  • frequency_penalty: Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
  • stop: Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
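To make the top_p description above concrete, the following toy example filters a hand-written token distribution. It is a conceptual sketch of nucleus sampling only, not how the OpenAI API is implemented; the function name nucleus_filter and the probabilities are made up for illustration.

```python
def nucleus_filter(token_probs, top_p):
    """Keep the most likely tokens until their cumulative probability
    reaches top_p, then renormalize the kept probabilities to sum to 1."""
    ranked = sorted(token_probs.items(), key=lambda item: item[1], reverse=True)
    kept, mass = [], 0.0
    for token, prob in ranked:
        kept.append((token, prob))
        mass += prob
        if mass >= top_p:
            break
    return {token: prob / mass for token, prob in kept}


# Hypothetical next-token distribution.
probs = {"the": 0.5, "a": 0.3, "an": 0.15, "this": 0.05}

narrow = nucleus_filter(probs, 0.1)  # only the single most likely token survives
wide = nucleus_filter(probs, 0.9)    # the three most likely tokens survive

print(sorted(narrow))
print(sorted(wide))
```

A lower top_p therefore restricts sampling to fewer, more probable tokens, while top_p set to 1 leaves the distribution unchanged.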

For stop, you can additionally use the following options to post-process model outputs:

  • Leave it empty if you want a raw model output (e.g. it may include multiple empty lines).
  • Put . if you want to show at most one sentence for each suggestion.
  • Put \n if you want to show at most one paragraph for each suggestion.
  • Use | to add multiple stop sequences (e.g. .|\n|***). You can have up to four stop sequences.
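The stop options above can be read as a small parsing rule. The sketch below shows one way to turn the field into a list of stop sequences; it is not the backend's implementation. It assumes that \n is stored in the CSV as the two characters backslash and n and must be converted to a real newline, and the function name parse_stop is hypothetical.

```python
def parse_stop(field, max_sequences=4):
    """Split a '|'-separated stop field into a list of stop sequences."""
    if not field:
        return None  # empty field: no stop sequences, raw model output
    # Convert the literal two-character text '\n' into a newline character.
    sequences = [part.replace("\\n", "\n") for part in field.split("|") if part]
    if len(sequences) > max_sequences:
        raise ValueError(f"at most {max_sequences} stop sequences are allowed")
    return sequences


print(parse_stop("."))
print(parse_stop(".|\\n|***"))
```

Because | is the separator, this sketch cannot express a stop sequence that itself contains a | character.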

Blocklist

You can block certain words or phrases from being generated by the model by adding them to ./config/blocklist.txt and setting --use_blocklist to be true when running the backend.
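Conceptually, a blocklist drops any suggestion that contains a blocked term. The following sketch illustrates that idea only; how the backend actually matches terms is not described here. It assumes one word or phrase per line in the file and uses case-insensitive substring matching, and both function names are hypothetical.

```python
def load_blocklist(lines):
    """Build a set of lowercase blocked terms, skipping blank lines."""
    return {line.strip().lower() for line in lines if line.strip()}


def filter_suggestions(suggestions, blocklist):
    """Drop suggestions containing any blocked word or phrase
    (case-insensitive substring match)."""
    return [
        text
        for text in suggestions
        if not any(term in text.lower() for term in blocklist)
    ]


# In-memory stand-in for the lines of a blocklist file.
blocklist = load_blocklist(["darn\n", "\n", "bad phrase\n"])
suggestions = ["What a lovely day.", "Darn it all.", "That is a BAD phrase."]
allowed = filter_suggestions(suggestions, blocklist)
print(allowed)
```

Substring matching is simple but can over-block (a short term may appear inside an unrelated longer word); matching on word boundaries is a stricter alternative.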
