ai-creative's Introduction

Missing Building Blocks in the Agent World

"AI Agents will likely be the most impactful technology of our generation". In this document I present the following modular building blocks, all fully open source. Beware that most of them are still a work in progress! See this high-level overview for the current status.

Agent OpenAPI

Turn any API into an Agent, Turn any Agent into an API.

The Agent OpenAPI serves an OpenAPI for talking to an agent, so that the agent can be discovered publicly and used as a tool by other agents. More information is in the Agent OpenAPI GitHub repository.
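To make the "API as agent tool" direction concrete, here is a minimal sketch of how the operations in an OpenAPI document could be turned into tool descriptions that an agent can choose from. This is an illustration only, not the Agent OpenAPI implementation: the function name `openapi_to_tools`, the shape of the returned tool dictionaries, and the example spec are all assumptions made for this example.

```python
from typing import Any

HTTP_METHODS = {"get", "post", "put", "patch", "delete"}


def openapi_to_tools(spec: dict[str, Any]) -> list[dict[str, Any]]:
    """Describe every operation in an OpenAPI document as a generic tool.

    Hypothetical helper: the output shape is an assumption, loosely modelled
    on the JSON-Schema-style parameter objects that tool-calling LLMs accept.
    """
    tools = []
    for path, path_item in spec.get("paths", {}).items():
        for method, operation in path_item.items():
            if method not in HTTP_METHODS:
                continue  # skip non-operation keys such as "summary"
            properties = {}
            required = []
            for param in operation.get("parameters", []):
                properties[param["name"]] = param.get("schema", {})
                if param.get("required", False):
                    required.append(param["name"])
            tools.append({
                "name": operation.get("operationId", f"{method}_{path}"),
                "description": operation.get("summary", ""),
                "method": method.upper(),
                "path": path,
                "parameters": {
                    "type": "object",
                    "properties": properties,
                    "required": required,
                },
            })
    return tools


# Made-up example spec, reduced to the fields the sketch reads.
EXAMPLE_SPEC = {
    "openapi": "3.1.0",
    "paths": {
        "/pets/{petId}": {
            "summary": "Pets",
            "get": {
                "operationId": "getPet",
                "summary": "Fetch one pet by id",
                "parameters": [
                    {"name": "petId", "in": "path", "required": True,
                     "schema": {"type": "string"}},
                ],
            },
        },
    },
}

for tool in openapi_to_tools(EXAMPLE_SPEC):
    print(tool["name"], tool["method"], tool["path"])
```

The sketch ignores request bodies, `$ref` resolution, and authentication, all of which a real converter would have to handle.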

Agent Relay

Agents need to be accessible from anywhere. The Agent Relay makes agents accessible from messaging apps, VoIP and phone calls, and over email! Check out the Agent Relay on GitHub.

CRUD OpenAPI

Data needs to be discoverable as tools. A reliable CRUD Agent is extremely useful. More info here

Why

  • We're living through a technological paradigm shift that will change how we interact with computers, and how humans can find purpose. A new foundation is being created now. In this important time, I want to do my part in setting good standards for HMC that benefit humanity.

  • Big tech capitalism is trying to create a controlled, closed ecosystem for AI. As AGI approaches, misaligned commercial incentives become ever more extreme, and I don't want to live in this walled-garden dystopia. The solution is an open, accessible, modular ecosystem for AI Agents. An ecosystem without any vendor lock-in or privacy problems. An ecosystem where we, the people, stay in control.

Key Focus: Reliable Agents by intelligent search

  • Analyse thousands of services on capability, quality, speed, cost, and availability.
  • Have a scalable way to sign up and get access to all service providers with multiple accounts.
  • Proxy them into my own gateway, which can be made available as a "Universal API" that exposes all services through a single endpoint.
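The selection step behind such a gateway can be sketched as a scoring function over the five dimensions listed above. This is a minimal sketch under stated assumptions: the `ServiceProfile` fields, the weights, the normalisation ceilings, and the example providers are all invented for illustration and are not taken from any of the projects described here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ServiceProfile:
    """Hypothetical record of what was measured about one service."""
    name: str
    capability: str       # what the service can do, e.g. "transcription"
    quality: float        # 0..1, higher is better
    latency_ms: float     # lower is better
    cost_per_call: float  # in dollars, lower is better
    availability: float   # 0..1, fraction of successful calls


def score(service: ServiceProfile,
          max_latency_ms: float = 2000.0,
          max_cost: float = 0.05) -> float:
    """Combine the dimensions into one number (the weights are assumptions)."""
    speed = max(0.0, 1.0 - service.latency_ms / max_latency_ms)
    cheapness = max(0.0, 1.0 - service.cost_per_call / max_cost)
    blended = 0.5 * service.quality + 0.25 * speed + 0.25 * cheapness
    # Availability scales the whole score: an unreachable service is useless.
    return service.availability * blended


def pick_service(services: list[ServiceProfile],
                 capability: str) -> Optional[ServiceProfile]:
    """Return the best-scoring service offering the capability, if any."""
    candidates = [s for s in services if s.capability == capability]
    return max(candidates, key=score, default=None)


# Made-up providers for illustration.
SERVICES = [
    ServiceProfile("provider-a", "transcription", 0.90, 1000.0, 0.01, 0.99),
    ServiceProfile("provider-b", "transcription", 0.95, 1800.0, 0.04, 0.90),
    ServiceProfile("provider-c", "translation", 0.80, 500.0, 0.02, 0.95),
]

best = pick_service(SERVICES, "transcription")
print(best.name if best else "no service found")
```

A single endpoint could call `pick_service` per request and forward the call to the winner, so callers never need to know which provider served them.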

Strategy: ActionSchema for Devs: OEF

  • Devs want Open source. Give it.
  • Devs want Easy: Serve it BYOK, accessible, and useful.
  • Devs want Freedom. Provide them agents so they can go Screenless.

ActionSchema in different keywords: AI Software Engineer, Universal API, Reliable Agents, OLAM

TODO: Keep jumping between these projects and aim to finish them ASAP. Laser focus.

LONGTERM: Keep these stable services for decades. Keep LOC/Complexity LOW.

High-level ActionSchema

This is the current ecosystem of projects developed by Code From Anywhere (❗️ dependency, ⏸️ paused, 🚫 blocked, 🔴 not started, 🟠 work in progress, 🟢 done)

| Name | Purpose | Status | MVP | LOC |
| --- | --- | --- | --- | --- |
| Auth | Central SDK and Agent Auth Gateway | 🟩🟥 | 🟢 GitHub OAuth2 GPT POC; 🔴 Refactor | |
| CRUD OpenAPI | Turn database into agent-tools | 🟩🟩🟩🟩🟥 | 🟢 CRUD only first; 🟢 Semantic search; 🟢 CLI; 🟢 CRUD-Agent; 🔴 Config: user separation | ±3k |
| Agent OpenAPI | Turn any API into an Agent | 🟩🟩🟧🟥🟥🟥 | 🟢 Simple POC; 🟢 OpenAPI-centric Refactor; 🟠 Use tools from OpenAPIs with OAuth2; 🔴 Agent Creator Agent; 🔴 Files; 🔴 Threads | ±2k |
| OpenAPI Explorer | Explore OpenAPI Possibilities | 🟩🟩🟩🟥 | 🟢 Simple OpenAPI overview with experimental forms; 🟢 Aggregate OpenAPIs from multiple endpoints; 🟢 Expose LLM search endpoint; 🟠 Manual entry | ±2k |
| Agent Relay | Make agent available anywhere | 🟩🟩🟩🟥🟥🟥🟥 | 🟢 Browser & Phonecall STS; 🟢 Custom agent compatibility; 🟢 Whatsapp, SMS, Messenger; 🔴 Agent-first refactor; 🔴 Email; 🔴 Deepgram STS Tool use; 🔴 Outbound | 1175 |

A dependency of the above is what I call "OpenAPI-first development". It is an opinionated form of design-first development in which your OpenAPI serves as the single source of truth (SSOT) for a lot of things. Rather than generating the OpenAPI from your code, you generate pieces of your code FROM it.
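One way to picture "generating pieces FROM the OpenAPI" is a route table built directly from the spec: handlers are looked up by `operationId`, and the required-parameter check is derived from the spec instead of being written by hand. This is a minimal sketch, not the author's tooling; `build_routes`, `_guard`, the handler signature, and the example spec are hypothetical names chosen for this illustration.

```python
from typing import Any, Callable

Handler = Callable[[dict[str, Any]], Any]
HTTP_METHODS = {"get", "post", "put", "patch", "delete"}


def _guard(handler: Handler, required: list[str]) -> Handler:
    """Wrap a handler with a required-parameter check taken from the spec."""
    def guarded(params: dict[str, Any]) -> Any:
        missing = [name for name in required if name not in params]
        if missing:
            raise ValueError(f"missing required parameters: {missing}")
        return handler(params)
    return guarded


def build_routes(spec: dict[str, Any],
                 handlers: dict[str, Handler]) -> dict[tuple[str, str], Handler]:
    """Derive a (METHOD, path) -> handler table from the OpenAPI document.

    The spec is the source of truth: an operation without a matching
    handler is an error, so code and spec cannot silently drift apart.
    """
    routes = {}
    for path, path_item in spec.get("paths", {}).items():
        for method, operation in path_item.items():
            if method not in HTTP_METHODS:
                continue  # ignore non-operation keys on the path item
            operation_id = operation["operationId"]
            if operation_id not in handlers:
                raise KeyError(f"no handler implements operation {operation_id!r}")
            required = [p["name"] for p in operation.get("parameters", [])
                        if p.get("required", False)]
            routes[(method.upper(), path)] = _guard(handlers[operation_id], required)
    return routes


# Made-up spec and handler for illustration.
SPEC = {
    "openapi": "3.1.0",
    "paths": {
        "/pets/{petId}": {
            "get": {
                "operationId": "getPet",
                "parameters": [{"name": "petId", "in": "path", "required": True}],
            },
        },
    },
}


def get_pet(params: dict[str, Any]) -> dict[str, Any]:
    return {"id": params["petId"]}


routes = build_routes(SPEC, {"getPet": get_pet})
print(routes[("GET", "/pets/{petId}")]({"petId": "7"}))
```

Adding an operation to the spec without a handler fails at startup with a `KeyError`, which is the point of treating the OpenAPI as the source of truth.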

If I feel fancy, work on this. More experimental:

Website Purpose Status POC or next steps Depends on
OpenAPI Tester Big Wish E2E testing/validating an OpenAPI's functionality ActionSchema
Brainstorm Natural Language to Operations mapping Good OpenAPI search
Brainstorm LLM Hierarchy Creation, Maintenance, and Search
ActionSchema Demo Show how ActionSchema works Paused VSCode plugin for OpenAPI selection and form-filling Functional OpenAPI
Big Wish Slow-agents that can continue very long or self-activate ActionSchema
Universal API Universal-API or Open-LAM Brainstorm Exposes all services through a single cacheable NLP endpoint OpenAPI Explorer, Search, Proxy
Normalise GPT Schema Normalisation Brainstorm
Human OpenAPI Turn people into agent-tools User can signup after which the API can communicate with the user 🟢 Agent Relay, 🔴 User OpenAPI
Enhancement Proxy Allow agents to iteratively improve their tools 🚫 Paused. Will be solved by CRUDE 🚫 Finish ActionSchema Rewrite
🟠 Serve on subdomain with frontpage
🔴 Create OpenAPI to self-modify
Serverless Browser Serverless Playwright Browsing OpenAPI Idea
oAuth2 Authenticator Automatic signup, login, and payments to gather API access Idea
Combination Proxy Combine multiple OpenAPIs into one 🔴 🟠 Serve with form to make your own easily.
🔴 Examples of agents.
actionschema Extension of JSON Schema allowing data-centric development 🟠 Rewrite to v2 in progress
🔴 x-proxy
🔴 x-schema
🔴 x-code

Key insights

  • Most AI is focused around realtime co-pilots because we're all still used to the direct HMC. Try making ambient pilots that don't need to be fast.
  • Pick my focus. Big topics like browser automation APIs and video editing are done by hundreds of companies and are extremely hard to stay competitive in; it's a never-ending cat-and-mouse game.
  • Products and APIs change all the time. Instead of choosing to spend knowledge-work time in specific niches, index all available capabilities.
  • Most users care about their privacy and would want to have things run locally. However, running locally is hard to set up and scale. Another way to have practical privacy is to keep the core local, but run smaller fleeting tasks in the cloud.
  • There is no need to abstract away how exactly any API works. The only thing we need to do is determine each API's capability, quality, speed, cost, and availability.

Questions

  • Can ActionSchema become agentic: allowing an agent to decompose tasks in parallel and sequential ways?
  • How can I build a meta programming language that dynamically finds new actions, tests them, and improves them, that can create purpose-oriented change in a system?
    • How can I measure purpose-oriented change and figure out whether it's worth the cost?

Let's Code From Anywhere!

Welcome to Code From Anywhere - a group of distributed developers and entrepreneurs building planet-first & humane-centered software. We work remotely but often come together in places like Nepal and Brazil, going on adventures.

We 🤍 Developers, AI Startups & Adventurers. Do you have a question, comment, or want to connect? Head over to our Discord

Contributors and Sponsors

Top contributors

License

This project is licensed under the MIT License - see the LICENSE file for details.

Commercial License

If your company generates more than $1,000,000 in Annual Recurring Revenue (ARR), you are required to obtain a commercial license. Please see the COMMERCIAL_LICENSE file for more information.

Contact

For commercial licensing inquiries, please contact Wijnand at [email protected]
