
LangChain-Alpaca

Run alpaca LLM fully locally in langchain.

pnpm i langchain-alpaca

You can save the following example as loadLLM.mjs and run it with google/zx: DEBUG='langchain-alpaca:*' zx ./loadLLM.mjs

/* eslint-disable no-undef */
import { AlpacaCppChat, getPhysicalCore } from 'langchain-alpaca'

console.time('LoadAlpaca')
const alpaca = new AlpacaCppChat({
  // replace this with your local model path.
  modelParameters: { model: '/Users/linonetwo/Desktop/model/LanguageModel/ggml-alpaca-7b-q4.bin', threads: getPhysicalCore() - 1 },
})
const response = await alpaca.generate(['Say "hello world"']).catch((error) => console.error(error))
console.timeEnd('LoadAlpaca')

console.log(`response`, response, JSON.stringify(response))
// Close the node-pty session to free the memory used by alpaca.cpp. You can query alpaca as many times as you want before closing it.
alpaca.closeSession()

See example/*.mjs for more examples. Running with the env DEBUG=langchain-alpaca:* will show internal debug details, which is useful when the LLM does not respond to input.

Read the LangChainJS docs to learn how to build a fully local, free AI workflow.

Prebuilt Binary

By default, langchain-alpaca ships with a prebuilt binary. It will still try to build one during postinstall, which should be fast and may produce a somewhat faster binary. This step is optional and fails silently, because the package still works with the prebuilt binary.

If you are on Windows and want the postinstall build to work, download and install CMake (https://cmake.org/download/), as described in https://github.com/antimatter15/alpaca.cpp#windows-setup .

Parameters of AlpacaCppChat

export interface AlpacaCppModelParameters {
  /** run in interactive mode
   * (This also means to stream the results in langchain)
   */
  interactive?: boolean
  /** run in interactive mode and poll user input at startup */
  interactiveStart?: boolean
  /** in interactive mode, poll user input upon seeing PROMPT */
  reversePrompt?: string | null
  /** colorise output to distinguish prompt and user input from generations */
  color?: boolean
  /** RNG seed (default: -1) */
  seed?: number
  /** number of threads to use during computation (default: %d) */
  threads?: number
  /** prompt to start generation with (default: random) */
  prompt?: string | null
  /** prompt file to start generation */
  file?: string | null
  /** number of tokens to predict (default: %d) */
  n_predict?: number
  /** top-k sampling (default: %d) */
  top_k?: number
  /** top-p sampling (default: %.1f) */
  top_p?: number
  /** last n tokens to consider for penalize (default: %d) */
  repeat_last_n?: number
  /** penalize repeat sequence of tokens (default: %.1f) */
  repeat_penalty?: number
  /** size of the prompt context (default: %d) */
  ctx_size?: number
  /** temperature (default: %.1f) */
  temp?: number
  /** batch size for prompt processing (default: %d) */
  batch_size?: number
  /** model path, absolute or relative location of `ggml-alpaca-7b-q4.bin` model file (default: %s) */
  model?: string
}

export interface AlpacaCppChatParameters {
  /**
   * Working directory of the dist file. Defaults to `path.join(path.dirname(require.resolve('langchain-alpaca')), 'binary')`.
   * If you are using ESM, try setting this to `node_modules/langchain-alpaca/dist/binary`.
   */
  cwd: string
  /**
   * Name of the alpaca.cpp binary
   */
  cmd: string
  shell: string
}

Use params like this:

new AlpacaCppChat(fields?: BaseLLMParams & Partial<AlpacaCppChatParameters> & { modelParameters?: Partial<AlpacaCppModelParameters> })

Where BaseLLMParams is from the langchain core package.
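The way a partial override merges onto defaults can be sketched as follows. This is a minimal illustration, not the library's actual implementation: the default values and the resolveParameters helper are hypothetical, and the interface is trimmed to a few fields.

```typescript
// Trimmed-down copy of a few fields from AlpacaCppModelParameters.
interface AlpacaCppModelParameters {
  threads?: number
  top_k?: number
  temp?: number
  model?: string
}

// Hypothetical defaults, for illustration only.
const defaults: Required<AlpacaCppModelParameters> = {
  threads: 4,
  top_k: 40,
  temp: 0.8,
  model: 'ggml-alpaca-7b-q4.bin',
}

// The caller supplies only the fields they care about; the rest fall back to defaults.
function resolveParameters(
  override: Partial<AlpacaCppModelParameters>,
): Required<AlpacaCppModelParameters> {
  return { ...defaults, ...override }
}

const resolved = resolveParameters({ model: '/models/ggml-alpaca-7b-q4.bin', threads: 7 })
console.log(resolved.threads, resolved.top_k) // 7 40
```

This is why passing modelParameters: { model: '...' } alone is enough: every omitted field keeps its default.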

Development

During development, you can put your model (or ln -s it) at model/ggml-alpaca-7b-q4.bin.

Then run zx example/loadLLM.mjs to test it.
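That setup can be sketched as below; all paths are placeholders for wherever your weights actually live.

```shell
# Symlink local weights into model/ where the example script expects them.
mkdir -p model
touch /tmp/ggml-alpaca-7b-q4.bin                      # stand-in for your real weights
ln -sf /tmp/ggml-alpaca-7b-q4.bin model/ggml-alpaca-7b-q4.bin
# then: DEBUG='langchain-alpaca:*' zx example/loadLLM.mjs
ls -l model/
```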


langchain-alpaca's Issues

If the template contains a newline in the streaming example, it does not work

If we use a bigger template that involves newlines, the handleLLMNewToken callback does not get called.
Code:

/* eslint-disable no-undef */
import { CallbackManager } from 'langchain/callbacks'
import { LLMChain } from 'langchain/chains'
import { PromptTemplate } from 'langchain/prompts'
import path from 'node:path'
import { AlpacaCppChat } from 'langchain-alpaca'
import { dirname } from 'path';
import { fileURLToPath } from 'url';
const __dirname = dirname(fileURLToPath(import.meta.url));

const template = 'Below is an instruction that describes a task.\n Write a response that appropriately completes the request: {prot}';
const prompt = new PromptTemplate({
  template: template,
  inputVariables: ['prot'],
})

const alpaca = new AlpacaCppChat({
  modelParameters: { model: path.join(__dirname, './models/ggml-alpaca-7b-q4.bin') },
  // stream output to the console to view it in real time
  streaming: true,
  callbackManager: CallbackManager.fromHandlers({
    handleLLMNewToken: (token) => {
      console.log("NEW TOKEN");
      process.stdout.write(token);
      return token;
    },
  }),
})

const chain = new LLMChain({ llm: alpaca, prompt: prompt })
const response = await chain.call({ prot: '2 + 2 = ' })
console.log(`response`, response, JSON.stringify(response))
alpaca.closeSession()

Debug console:

 langchain-alpaca:session "Below is an instruction that describes a task.\\ Write a response that appropriately completes the request: 2 + 2 = " +0ms
  langchain-alpaca:state onData {"doneInit":true,">":false,"prompt":false,"queue[0]":{"prompt":"","doneInput":true,"doneEcho":false,"outputStarted":false}} +0ms
  langchain-alpaca:data "Below is an instruction that describes a task.\\ Write a response that appropriately completes the request: 2 + 2 =  \r\n" +0ms
  langchain-alpaca:state onData {"doneInit":true,">":false,"prompt":false,"queue[0]":{"prompt":"","doneInput":true,"doneEcho":false,"outputStarted":false}} +9s
  langchain-alpaca:data "\u001b[0m4" +9s
  langchain-alpaca:state onData {"doneInit":true,">":true,"prompt":false,"queue[0]":{"prompt":"","doneInput":true,"doneEcho":false,"outputStarted":false}} +802ms
  langchain-alpaca:data "\u001b[0m\r\n> \u001b[1m\u001b[32m" +802ms
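One hedged workaround, untested against the library itself: the debug log suggests alpaca.cpp's interactive mode treats a bare newline as end of input, so the template could be collapsed to a single line before building the PromptTemplate.

```javascript
// Flatten a multi-line template to one line so the newline never reaches
// alpaca.cpp's interactive input loop.
const template =
  'Below is an instruction that describes a task.\nWrite a response that appropriately completes the request: {prot}'
const singleLine = template.replace(/\s*\n\s*/g, ' ')
console.log(singleLine)
```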

Dependency Dashboard

This issue lists Renovate updates and detected dependencies. Read the Dependency Dashboard docs to learn more.

Detected dependencies

github-actions
.github/workflows/test.yaml
  • actions/checkout v3
  • pnpm/action-setup v2.2.4
  • actions/setup-node v3
npm
package.json
  • cross-env ^7.0.3
  • debug ^4.3.4
  • node-pty ^0.10.1
  • langchain ^0.0.38
  • pnpm 7.29.3


Doesn't run on Windows

Trying to use it on Windows, but it doesn't build or execute. I'll have to look into it more; maybe I'll try it on Linux later.

ESM Compatibility

I've been trying to use this module in my ESM / nodeJS project, and unfortunately require.resolve (as used in constants.ts defaultBinaryPath) is not available outside of a CJS context.

I've created a fork to try and experiment around this, and I'll update this issue if I can get it working in both CJS/ESM!
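The usual ESM-safe replacement for require.resolve is node:module's createRequire. Here is a sketch that resolves a builtin only to demonstrate the call; the real fix would resolve 'langchain-alpaca' to locate its bundled binary directory.

```javascript
// ESM-safe stand-in for the CJS-only require.resolve used in constants.ts.
import { createRequire } from 'node:module'

// createRequire builds a require() bound to this module's URL.
const require = createRequire(import.meta.url)

// Resolving a builtin just demonstrates the call; in the actual fix this
// would be require.resolve('langchain-alpaca').
console.log(require.resolve('path'))
```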
