llama.rn's People

Contributors

jhen0409, smashinfries

llama.rn's Issues

Parallel decoding

llama.cpp now supports parallel decoding in one context, so we can support it as well.

Breaking change: deprecate the stopCompletion method and move to the return values of completion.

Feature Request: TextStreaming

Is it possible to add a text streaming feature? It looks like you're loading a local cpp server. Does Swift support sockets for React Native? Inference is very slow on mobile devices right now, so streaming would help the user know that something is happening. I'm interested in contributing if you need contributors. I believe streaming is supported by llama.cpp in LangChain's implementation, but I'm not sure if that's custom.
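To make the request concrete, here is a minimal sketch of the callback shape such a streaming feature could take. It does not use the llama.rn API: `mockTokens`, `completeWithStreaming`, and `PartialHandler` are hypothetical names, and the token source is a plain generator standing in for native inference.

```typescript
// Hypothetical handler type: receives each new token and the text so far.
type PartialHandler = (token: string, textSoFar: string) => void;

// Stand-in for a native inference loop (assumption: tokens arrive one by one).
function* mockTokens(tokens: string[]): Generator<string> {
  for (const token of tokens) {
    yield token;
  }
}

// Consumes tokens as they are produced, reports each one to the handler,
// and returns the full text once the source is exhausted.
function completeWithStreaming(
  source: Iterable<string>,
  onPartial: PartialHandler,
): string {
  let text = "";
  for (const token of source) {
    text += token;
    onPartial(token, text);
  }
  return text;
}
```

With this shape, a UI could append each token to the visible message as it arrives instead of waiting for the whole completion.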

Android: Cannot load models, stopCompletions not working.

As it says on the tin. Loading small 3B-class models such as Tiny Llama or StableLM does not work. Tested models:

Attempting to call initLlama results in

  • Error: Failed to initialize context

Which I can only assume is here:

I do not know enough about native functions to investigate further.

In addition, stopCompletions() does not stop a completion on Android.
Thanks for your work, the project is fantastic otherwise.

Update llamacpp module to latest

I am having trouble updating the llama.cpp submodule myself. Could the project be updated? llama.cpp has added support for a few new base models that currently do not work in llama.rn.

stablelm-2-zephyr-1_6b-Q8_0.gguf does not work

Hello,

I've been working on getting the stablelm-2-zephyr-1_6b-Q8_0.gguf operational (link: https://huggingface.co/spaces/stabilityai/stablelm-2-1_6b-zephyr), especially since the 3B version seems to function quite well. However, I'm encountering an issue with the 1.6B version where it fails to initialize the context. Currently, I'm using the latest version of your master branch to compile the library. Is there a straightforward modification I can make on my end to resolve this?

from logs:

01-29 22:50:05.365 3017 20732 E RNLLAMA_LOG_ANDROID: llama_model_load: error loading model: done_getting_tensors: wrong number of tensors; expected 340, got 268

Thank you.

Early stopping inference

Shouldn't there be a function that allows the user to stop inference? Could be implemented as a callback function just like in whisper.rn's realtimeInference()
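A rough sketch of what early stopping could look like from the JavaScript side, assuming generation proceeds token by token. Everything here is hypothetical (`StopSignal`, `generateUntilStopped`, and the token source); it is not the llama.rn or whisper.rn API, only an illustration of checking a stop flag between tokens.

```typescript
// Hypothetical shared flag that the caller can flip at any time.
interface StopSignal {
  stopped: boolean;
}

interface GenerationResult {
  text: string;
  interrupted: boolean;
}

// Consumes tokens until the source ends or the stop flag is set.
// The flag is checked before each token, so text already produced is kept.
function generateUntilStopped(
  source: Iterable<string>,
  signal: StopSignal,
  onToken: (token: string) => void,
): GenerationResult {
  let text = "";
  for (const token of source) {
    if (signal.stopped) {
      return { text, interrupted: true };
    }
    text += token;
    onToken(token);
  }
  return { text, interrupted: false };
}
```

Returning an `interrupted` field alongside the partial text lets the caller tell a user-requested stop apart from a normal end of generation.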

Failure to initLlama on Xiaomi phones.

Hello again, I've received reports from users of ChatterUI that model loading fails on Xiaomi branded phones:

Confirmed not working:

  • Xiaomi Poco F5 - Android 14
  • Redmi 10C - Android 13

I've also queried about other phones, and got a few responses for working devices.

Confirmed working:

  • Samsung A71 - Android 13
  • Samsung M52 - Android 13

Version used:

  • llama.rn 0.3.0-rc.14

Logcat response on the tested Poco F5:

RNLLAMA_ANDROID_JNI: [RNLlama] is_model_loaded false

There aren't enough users to confirm this is a trend across all Xiaomi phones, but it is peculiar.

LLaVa support

llama.cpp includes a LLaVA example (plus clip.cpp) that we could use to provide vision support. We may implement it after #30 is done.

It would also be great to make a separate package named clip.rn or react-native-clip, but I'm afraid we currently don't have the resources to maintain it, so this is just something to keep in mind.

OpenCL Implementation for Android

First of all, thanks for the hard work on bringing this project to the react-native ecosystem.

I have been using llama.rn for a few weeks now in my personal project:
https://github.com/Vali-98/ChatterUI

I was wondering if there is any interest in implementing OpenCL for Android. I have attempted to work on it myself with little success, given my inexperience with native modules.

[Android] Seed value does not create deterministic outputs.

As mentioned in the title, setting a seed value does not make an output deterministic on Android.

  • llama.rn version: 0.3.0-rc.13

  • Model used: phi-2.Q3_K_M.gguf

  • Android Devices Tested on: Emulated Pixel 3a - Android 14

Params used:

{
  "frequency_penalty": 0, 
  "grammar": "", 
  "min_p": 0.07, 
  "mirostat": 0, 
  "mirostat_eta": 0.1, 
  "mirostat_tau": 5, 
  "n_predict": 288, 
  "n_threads": 5, 
  "presence_penalty": 0, 
  "prompt": "", 
  "repeat_penalty": 1, 
  "seed": 2, 
  "stop": ["User:", "### Response: "], 
  "temperature": 1, 
  "tfs_z": 1, 
  "top_k": 0, 
  "top_p": 1, 
  "typical_p": 1
}
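For reference, this is the behavior expected from a fixed seed, shown with a small self-contained sampler. It is only an illustration and does not reproduce the Android issue: the PRNG is a mulberry32-style generator chosen for brevity, and `seededRandom` and `sampleIndices` are hypothetical names unrelated to llama.rn or llama.cpp internals.

```typescript
// Small seeded PRNG (mulberry32-style); returns values in [0, 1).
function seededRandom(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Draws `count` indices from a fixed probability distribution.
// All randomness comes from the seeded generator, so the same seed
// and the same inputs give the same sequence on every run.
function sampleIndices(seed: number, probs: number[], count: number): number[] {
  const rand = seededRandom(seed);
  const picks: number[] = [];
  for (let i = 0; i < count; i++) {
    const r = rand();
    let cumulative = 0;
    let picked = probs.length - 1; // fallback for rounding at the tail
    for (let j = 0; j < probs.length; j++) {
      cumulative += probs[j] ?? 0;
      if (r < cumulative) {
        picked = j;
        break;
      }
    }
    picks.push(picked);
  }
  return picks;
}
```

Comparing two runs of the same prompt, seed, and parameters token by token in this way is one possible check for whether the seed is actually reaching the sampler.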

cannot load model

There is an issue in the README regarding model loading. It mentions the 'gguf' model but lacks clear instructions. Is file loading implemented yet? The result is always "No model found".
