Constitutional-AI-awesome-papers

A list of papers on 'Constitutional AI Systems' and 'AI under Ethical Guidelines'. This GitHub repository is intended for personal study and is updated regularly. Recommendations of related work are welcome.

Paper

  1. Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned

    Anthropic [Link] arxiv Nov.2022

  2. Constitutional AI: Harmlessness from AI Feedback

    Anthropic [Link] arxiv Dec.2022

  3. Moral Stories: Situated Reasoning about Norms, Intents, Actions, and their Consequences

    Denis Emelin, Ronan Le Bras, Jena D. Hwang, Maxwell Forbes, Yejin Choi [Link] EMNLP 2021

  4. Second Thoughts are Best: Learning to Re-Align With Human Values from Text Edits

    Ruibo Liu, Chenyan Jia, Ge Zhang, Ziyu Zhuang, Tony X. Liu, Soroush Vosoughi [Link] NeurIPS 2022

  5. The Capacity for Moral Self-Correction in Large Language Models

    Anthropic [Link] arxiv Feb.2023

  6. Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision

    Zhiqing Sun, Yikang Shen, Qinhong Zhou, Hongxin Zhang, Zhenfang Chen, David Cox, Yiming Yang, Chuang Gan [Link] arxiv May.2023

  7. Prometheus: Inducing Fine-grained Evaluation Capability in Language Models

    Seungone Kim, Jamin Shin, Yejin Cho, Joel Jang, Shayne Longpre, Hwaran Lee, Sangdoo Yun, Seongjin Shin, Sungdong Kim, James Thorne, Minjoon Seo [Link] arxiv Oct.2023

  8. Generating Summaries with Controllable Readability Levels

    Leonardo F. R. Ribeiro, Mohit Bansal, Markus Dreyer [Link] arxiv Oct.2023

  9. Personalized Soups: Personalized Large Language Model Alignment via Post-hoc Parameter Merging

    Joel Jang, Seungone Kim, Bill Yuchen Lin, Yizhong Wang, Jack Hessel, Luke Zettlemoyer, Hannaneh Hajishirzi, Yejin Choi, Prithviraj Ammanabrolu [Link] arxiv Oct.2023

  10. Collective Constitutional AI: Aligning a Language Model with Public Input

    Anthropic [Link] arxiv Oct.2023

  11. Specific versus General Principles for Constitutional AI

    Anthropic [Link] arxiv Oct.2023

Contributors

minbeomkim
