lgrammel / js-agent
Build AI Agents with JS & TS
License: MIT License
I feel like creating a testing setup, especially with unit tests for actions, would speed up action development a lot.
I feel like the use case of building presentations could be a great showcase, where people not only see the system in action but also see that it's productively usable.
I could imagine that breaking a topic down into research and parts, and then generating a Markdown-based (or similar) presentation, would be simple enough to serve as an example while still being a viable production-level use case that is doable without a lot of effort. (At the same time the result is very shareable, which is a very nice side effect.)
Might be worth building.
ref https://twitter.com/lgrammel/status/1645840202055680013
If I use this framework to query an OpenAI GPT model, how is the output routed to a plugin rather than just returned as the OpenAI response?
Since OpenAI doesn't know about these client-side plugins, how do the two interact?
referencing:
I see
You can perform the following actions using ${
this.format.description
}:
${this.describeActions()}
## RESPONSE FORMAT (ALWAYS USE THIS FORMAT)
Explain and describe your reasoning step by step.
Then use the following format to specify the action you want to perform next:
${this.format.format({
action: "an action",
param1: "a parameter value",
param2: "another parameter value",
})}
You must always use exactly one action with the correct syntax per response.
Each response must precisely follow the action syntax.
So does the OpenAI API respond with a block of text containing action: "an action",
where 'an action' is something ChatGPT would like executed?
Then, on the client side, the Action plugin parses the OpenAI response and makes any API calls the action requests?
So looking at an example action
https://github.com/lgrammel/gptagent.js/blob/main/packages/agent/src/action/tool/summarize-webpage/SummarizeWebpageAction.ts
has
description = "Summarize a webpage considering a topic.",
so openAI would be initialized with
You can perform the following actions using ${
this.format.description
}:
${this.describeActions()}
You can perform the following actions using [Summarize a webpage considering a topic]
It still seems a bit messy, but maybe I can run through it and debug the intermediate steps.
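A minimal sketch of the flow asked about above: the model only ever returns text, and the client looks for an action in that text and dispatches it to a matching handler. This assumes the response format is JSON. The names `parseAction`, `executeCompletion`, `ActionHandler` and the `summarize-webpage` action name are hypothetical illustrations, not the js-agent API.

```typescript
// Hypothetical types for illustration; not taken from js-agent.
type ActionParameters = Record<string, unknown>;
type ActionHandler = (parameters: ActionParameters) => Promise<string>;

interface ParsedAction {
  action: string;
  parameters: ActionParameters;
}

// Extract the JSON object between the first "{" and the last "}" of the
// completion. Assumes the free-text reasoning before it contains no braces.
function parseAction(completion: string): ParsedAction | undefined {
  const start = completion.indexOf("{");
  const end = completion.lastIndexOf("}");
  if (start === -1 || end <= start) return undefined;
  try {
    const parsed: unknown = JSON.parse(completion.slice(start, end + 1));
    if (typeof parsed !== "object" || parsed === null) return undefined;
    const { action, ...parameters } = parsed as Record<string, unknown>;
    if (typeof action !== "string") return undefined;
    return { action, parameters };
  } catch {
    return undefined; // The text between the braces was not valid JSON.
  }
}

// Route the parsed action to the client-side handler registered under its name.
async function executeCompletion(
  completion: string,
  handlers: Map<string, ActionHandler>
): Promise<string> {
  const parsed = parseAction(completion);
  if (parsed === undefined) return "No valid action found in the response.";
  const handler = handlers.get(parsed.action);
  if (handler === undefined) return `Unknown action: ${parsed.action}`;
  return handler(parsed.parameters);
}
```

The string returned by the handler would then typically be fed back to the model as the next message, so that it can choose its next action.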
Hello, I have run pnpm run-agent `cat example/helloworld/task.txt`,
but I can't find any helloworld.js. Where should it be? There is nothing in the drive folder. Any idea why?
dev/js-agent/examples/javascript-developer$ pnpm run-agent `cat example/helloworld/task.txt`
> @js-agent/[email protected] run-agent /home/smag/dev/js-agent/examples/javascript-developer
> ts-node ./src/main.ts "The" "classical" "introductory" "exercise." "Just" "say" "\"Hello," "World!\"." "\"Hello," "World!\"" "is" "the" "traditional" "first" "program" "for" "beginning" "programming" "in" "a" "new" "language" "or" "environment." "The" "objectives" "are" "simple:" "Write" "a" "program" "that" "prints" "the" "string" "\"Hello," "World!\"." "Write" "the" "program" "in" "JavaScript." "Locate" "the" "program" "and" "run" "it."
### JavaScript Developer Agent ###
{
task: 'The classical introductory exercise. Just say "Hello, World!". "Hello, World!" is the traditional first program for beginning programming in a new language or environment. The objectives are simple: Write a program that prints the string "Hello, World!". Write the program in JavaScript. Locate the program and run it.',
projectInstructions: 'You are working on a JavaScript/TypeScript project.'
}
Thinking…
Done
{ type: 'succeeded', summary: 'Completed all tasks.' }
Run cost: $0.00
LLM calls: 0
Inspired by https://jina.ai/news/auto-gpt-unmasked-hype-hard-truths-production-pitfalls/
Looking at the code, I believe it should be possible to implement serialisation. The idea would be simple: dump an algorithm into a JSON file.
I do wonder if there's a bigger question about agents being able to abstract their own tasks into concepts and create a list of possible tasks to use.
For example, when I was writing a trading bot I kept thinking: why can't the agent create a tool once, like "get_technical_indicators", and then reuse it? It keeps writing the tool from scratch every time.
It's like software development before the invention of libraries.
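One way to picture the reuse idea is a small tool library that can be dumped to JSON and loaded again in a later run. This is only a sketch under assumptions: `StoredTool` and `ToolLibrary` are hypothetical names and nothing like them is confirmed to exist in js-agent.

```typescript
// Hypothetical shape of a tool that an agent wrote once and wants to reuse.
interface StoredTool {
  name: string;
  description: string;
  source: string; // The code the agent generated for this tool.
}

class ToolLibrary {
  private readonly tools = new Map<string, StoredTool>();

  add(tool: StoredTool): void {
    this.tools.set(tool.name, tool);
  }

  get(name: string): StoredTool | undefined {
    return this.tools.get(name);
  }

  // Short listing that could be included in a prompt so the agent knows
  // which tools already exist before it writes a new one.
  describe(): string {
    return Array.from(this.tools.values())
      .map((tool) => `${tool.name}: ${tool.description}`)
      .join("\n");
  }

  // Dump all tools as a JSON string, suitable for writing to a file.
  serialize(): string {
    return JSON.stringify(Array.from(this.tools.values()), null, 2);
  }

  // Rebuild a library from a previously serialized JSON string.
  static deserialize(json: string): ToolLibrary {
    const library = new ToolLibrary();
    for (const tool of JSON.parse(json) as StoredTool[]) {
      library.add(tool);
    }
    return library;
  }
}
```

The serialized string could be written to disk at the end of a run and read back at the start of the next one, so a tool such as "get_technical_indicators" is written only once.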
I've been playing with this right now and we're definitely getting somewhere.
This blog post is a good summary, I'm guessing you know it: https://jina.ai/news/auto-gpt-unmasked-hype-hard-truths-production-pitfalls/
Now, Auto-GPT allows subtasks to be executed with GPT-3.5 Turbo, which is a lot faster and cheaper. I believe that with a bit of prompt engineering this is possible here too, at least for some subtasks. (Prompt engineering in this case is model specific, as 3.5 sometimes doesn't want to format responses in a clean, machine-readable way. I also believe Auto-GPT has a "JSON autofix" step that adds missing parentheses before parsing.)
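A rough sketch of what such a "JSON autofix" could look like: try a normal parse first, and if that fails, append whatever closing quotes, braces and brackets are missing. This is an assumption about the general idea, not Auto-GPT's actual implementation; `closeUnbalancedJson` and `parseLenient` are hypothetical names.

```typescript
// Append the closing quote, braces and brackets that a truncated JSON
// string is missing. Characters inside string literals are ignored.
function closeUnbalancedJson(text: string): string {
  const closers: string[] = [];
  let inString = false;
  let escaped = false;
  for (let i = 0; i < text.length; i++) {
    const char = text.charAt(i);
    if (inString) {
      if (escaped) escaped = false;
      else if (char === "\\") escaped = true;
      else if (char === '"') inString = false;
      continue;
    }
    if (char === '"') inString = true;
    else if (char === "{") closers.push("}");
    else if (char === "[") closers.push("]");
    else if (char === "}" || char === "]") closers.pop();
  }
  // Close an unterminated string first, then the open containers,
  // innermost first.
  return text + (inString ? '"' : "") + closers.reverse().join("");
}

// Parse strictly first; only fall back to the repaired text on failure.
// Still throws if the repaired text is not valid JSON.
function parseLenient(text: string): unknown {
  try {
    return JSON.parse(text);
  } catch {
    return JSON.parse(closeUnbalancedJson(text));
  }
}
```

This only repairs output that was cut off at the end; other malformations (trailing commas, single quotes, prose mixed into the JSON) would need separate handling.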
Apart from that, it would be interesting to explore how we could reduce the number of steps needed, so that small autonomous tasks are completed more quickly and cost-effectively.
Errors should have 3 levels of recipients:
Also errors should show tracebacks and be handled consistently.
Hi @lgrammel,
First of all, great work on creating well-factored and structured code.
I'd love to contribute to it, as I feel that with the increasing complexity of autonomous agents it will make a big difference. If you look at the most popular one, "Auto-GPT", it's drowning in PRs. It seems it's getting harder and harder for the author to manage those, as the whole system was made as an experiment that grew very fast in popularity.
I tried testing gptagent.js using the code from the README and noticed that I don't get a done step at the end.
The agent just gets stuck in a loop of thinking:
https://share.cleanshot.com/9V1sYxZ1
On a Mac M1, doing this:
// # in root folder:
> pnpm install
// 🐳 start docker daemon ...
> pnpm nx run-many --target=build
> cd examples/javascript-developer
> mkdir drive
> pnpm build
...
.build/gptagent-executor.js 2.1mb ⚠️
⚡ Done in 124ms
[+] Building 1.1s (13/13) FINISHED
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 37B 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load metadata for docker.io/library/node:lts-alpine 1.0s
=> [1/8] FROM docker.io/library/node:lts-alpine@sha256:ca5d399560a9d239cbfa28eec00417f1505e5e108f3ec6938d230767eaa81f61 0.0s
=> [internal] load build context 0.1s
=> => transferring context: 2.24MB 0.0s
=> CACHED [2/8] RUN apk update && apk add dumb-init && apk add git && apk add ca-certificates 0.0s
=> CACHED [3/8] RUN npm install -g [email protected] 0.0s
=> CACHED [4/8] RUN npm install -g pnpm 0.0s
=> CACHED [5/8] RUN apk add python3 && apk add build-base 0.0s
=> CACHED [6/8] WORKDIR /home/service 0.0s
=> CACHED [7/8] COPY --chown=node:node .build/gptagent-executor.js /home/service 0.0s
=> CACHED [8/8] WORKDIR /home/service/repository 0.0s
=> exporting to image 0.0s
=> => exporting layers 0.0s
=> => writing image sha256:63e9eac913f6bcedbcdeb23b046108c355461aae652a64d1d05c8232d916bd86 0.0s
=> => naming to docker.io/library/gptagent-javascript-developer 0.0s
javascript-developer > pnpm run-executor
Run js-agent executor container
Drive: /Users/admin/js-agent/examples/javascript-developer/drive
...
WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
{"level":30,"time":1682034345171,"pid":11,"hostname":"b4fe30d7745f","message":"Server listening at http://0.0.0.0:3001"}
...
// on new terminal :
javascript-developer > pnpm run-agent `cat example/helloworld/task.txt`
...
> @js-agent/[email protected] run-agent /Users/admin/js-agent/examples/javascript-developer
> ts-node ./src/main.ts "The" "classical" "introductory" "exercise." "Just" "say" "\"Hello," "World!\"." "\"Hello," "World!\"" "is" "the" "traditional" "first" "program" "for" "beginning" "programming" "in" "a" "new" "language" "or" "environment." "The" "objectives" "are" "simple:" "Write" "a" "program" "that" "prints" "the" "string" "\"Hello," "World!\"." "Write" "the" "program" "in" "JavaScript."
### JavaScript Developer Agent ###
The classical introductory exercise. Just say "Hello, World!". "Hello, World!" is the traditional first program for beginning programming in a new language or environment. The objectives are simple: Write a program that prints the string "Hello, World!". Write the program in JavaScript.
Executing run-command…
## Command pnpm install executed successfully
### stdout
ERR_PNPM_NO_PKG_MANIFEST No package.json found in /home/service/repository
Executing run-command…
## Command pnpm nx run agent:build executed successfully
### stdout
ERR_PNPM_NO_IMPORTER_MANIFEST_FOUND No package.json (or package.yaml, or package.json5) was found in "/home/service/repository".
Thinking…
Done
Then I go inside the Docker container