
llamachat's Introduction

hey, I'm alex 👋

I'm a full-stack engineer and creative 🔮

I've been building for the Apple ecosystem for a long time, in Objective-C/C++ and Swift, and for AppKit, UIKit and SwiftUI. I've always been driven to ship polished projects with clean APIs, and back in the day I built PXSourceList and PXListView which shipped in some pretty big apps, including early versions of Sketch.

These days I'm excited about Swift, SwiftUI and TypeScript, and I'm focusing on building fun, interesting projects that make the most of all the incredible tools and technologies out there.

I've worked for various companies, both big and small, and I'm currently available for freelance projects 🙌

If you've ever used any of my open-source projects, tips are always appreciated 🙏🏼, and you can also sponsor my work on GitHub.

Buy Me A Coffee

Find me around the web:

llamachat's People

Contributors

alexrozanski, eltociear


llamachat's Issues

Control answer length?

Hi, thanks for this! I'm using it with GPT4All. How do I control how long the answers are? Currently they're quite short...

Feature Request: Add API endpoint

I love this! Nice work!

This lets me quickly test my local models and see how I want to configure them. Once I've confirmed (through the GUI) that everything is how I want it, it would be great to expose this as an API too, so it's not only a client but can also operate as a LlamaServer.

Is this something that you'd be willing to consider?
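As a rough illustration of the shape this could take (every name below is hypothetical, not part of LlamaChat), the app could wrap its existing inference layer in a small JSON request handler that an HTTP listener would call:

import Foundation

// Hypothetical sketch only — LlamaChat exposes no such API today.
// `ModelSession` stands in for whatever LlamaChat uses to run inference.
struct CompletionRequest: Codable {
  let prompt: String
  let maxTokens: Int
}

struct CompletionResponse: Codable {
  let text: String
}

protocol ModelSession {
  func predict(prompt: String, maxTokens: Int) -> String
}

struct LlamaServerHandler {
  let session: ModelSession

  // Decode a JSON request body, run the model, and encode a JSON response.
  // An HTTP server front-end would call this once per incoming request.
  func handle(body: Data) throws -> Data {
    let request = try JSONDecoder().decode(CompletionRequest.self, from: body)
    let text = session.predict(prompt: request.prompt, maxTokens: request.maxTokens)
    return try JSONEncoder().encode(CompletionResponse(text: text))
  }
}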

Syntax Error in ChatModel.swift

Hey @alexrozanski, at line 73 of ChatModel.swift, you have this:

guard let lastChatContext else {
  canClearContext = false
  return
}

It seems like a condition is missing:

guard let _ = lastChatContext else {
  canClearContext = false
  return
}

Shouldn't it look something like the above?
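For reference, the first form is valid Swift: Swift 5.7 introduced shorthand optional binding (SE-0345), so guard let lastChatContext else { ... } is equivalent to guard let lastChatContext = lastChatContext else { ... }. A minimal, self-contained sketch (ChatContext and the property names are stand-ins, not LlamaChat's real types):

// Sketch of Swift 5.7+ shorthand optional binding (SE-0345).
struct ChatContext { let tokenCount: Int }

func canClear(lastChatContext: ChatContext?) -> Bool {
  // Shorthand: binds a non-optional `lastChatContext` from the optional
  // of the same name; no explicit `= lastChatContext` is needed.
  guard let lastChatContext else {
    return false
  }
  return lastChatContext.tokenCount > 0
}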

LlamaChat is spouting gibberish

System: MacBook Pro (2019)
Installation method: Homebrew
Version: ==> llamachat: 1.2.0 (auto_updates)

I downloaded the 7B model via BitTorrent and imported it into LlamaChat. I tried saying hello to my new friend, but all I get in return is gibberish. How do I debug this issue?

Screenshot 2023-11-16 at 12.30.49 PM

It appears to convert the model without any complaints:

python3 -u /var/folders/tj/hdvn6t_x1lb_27qt5h9xx5300000gn/T/6A1F66D4-970D-4452-9D5A-F8D5231D098F/convert-pth-to-ggml.py /Users/jblack/Library/Application Support/com.alexrozanski.LlamaChat/models/47EECB0E-23AA-4001-A0F3-9548E6C73A71/7B 1
Loading model file /Users/jblack/Library/Application Support/com.alexrozanski.LlamaChat/models/47EECB0E-23AA-4001-A0F3-9548E6C73A71/7B/consolidated.00.pth
Loading vocab file /Users/jblack/Library/Application Support/com.alexrozanski.LlamaChat/models/47EECB0E-23AA-4001-A0F3-9548E6C73A71/tokenizer.model
Writing vocab...
[ 1/291] Writing tensor tok_embeddings.weight | size 32000 x 4096 | type UnquantizedDataType(name='F16')
[ 2/291] Writing tensor norm.weight | size 4096 | type UnquantizedDataType(name='F32')
[ 3/291] Writing tensor output.weight | size 32000 x 4096 | type UnquantizedDataType(name='F16')
[ 4/291] Writing tensor layers.0.attention.wq.weight | size 4096 x 4096 | type UnquantizedDataType(name='F16')
[ 5/291] Writing tensor layers.0.attention.wk.weight | size 4096 x 4096 | type UnquantizedDataType(name='F16')
[ 6/291] Writing tensor layers.0.attention.wv.weight | size 4096 x 4096 | type UnquantizedDataType(name='F16')
[ 7/291] Writing tensor layers.0.attention.wo.weight | size 4096 x 4096 | type UnquantizedDataType(name='F16')
[ 8/291] Writing tensor layers.0.attention_norm.weight | size 4096 | type UnquantizedDataType(name='F32')
[ 9/291] Writing tensor layers.0.feed_forward.w1.weight | size 11008 x 4096 | type UnquantizedDataType(name='F16')
[ 10/291] Writing tensor layers.0.feed_forward.w2.weight | size 4096 x 11008 | type UnquantizedDataType(name='F16')
[ 11/291] Writing tensor layers.0.feed_forward.w3.weight | size 11008 x 4096 | type UnquantizedDataType(name='F16')
[ 12/291] Writing tensor layers.0.ffn_norm.weight | size 4096 | type UnquantizedDataType(name='F32')
[ 13/291] … [282/291] (identical per-layer tensor entries, repeated for layers.1 through layers.30)
[283/291] Writing tensor layers.31.attention.wq.weight | size 4096 x 4096 | type UnquantizedDataType(name='F16')
[284/291] Writing tensor layers.31.attention.wk.weight | size 4096 x 4096 | type UnquantizedDataType(name='F16')
[285/291] Writing tensor layers.31.attention.wv.weight | size 4096 x 4096 | type UnquantizedDataType(name='F16')
[286/291] Writing tensor layers.31.attention.wo.weight | size 4096 x 4096 | type UnquantizedDataType(name='F16')
[287/291] Writing tensor layers.31.attention_norm.weight | size 4096 | type UnquantizedDataType(name='F32')
[288/291] Writing tensor layers.31.feed_forward.w1.weight | size 11008 x 4096 | type UnquantizedDataType(name='F16')
[289/291] Writing tensor layers.31.feed_forward.w2.weight | size 4096 x 11008 | type UnquantizedDataType(name='F16')
[290/291] Writing tensor layers.31.feed_forward.w3.weight | size 11008 x 4096 | type UnquantizedDataType(name='F16')
[291/291] Writing tensor layers.31.ffn_norm.weight | size 4096 | type UnquantizedDataType(name='F32')
Wrote /Users/jblack/Library/Application Support/com.alexrozanski.LlamaChat/models/47EECB0E-23AA-4001-A0F3-9548E6C73A71/7B/ggml-model-f16.bin

test -f /Users/jblack/Library/Application Support/com.alexrozanski.LlamaChat/models/47EECB0E-23AA-4001-A0F3-9548E6C73A71/7B/ggml-model-f16.bin

Can the questions not affect each other?

I've observed that earlier questions affect the answers to later questions; the questions are not independent of one another. Is there a way to make them independent of each other?
Thank you.

Suggestion: add a feature to LlamaChat to upload any type of document

Is it possible to add a feature to LlamaChat that allows users to upload documents of any type, including multiple documents (PDF, long PDF, Word docs, Excel, etc.), and then chat about them with a locally running LLM? This would enhance privacy compared to online services like the OpenAI API, which may not be safe for confidential company or project documents such as PDFs or budget reports in Excel.

Error when converting model

I have downloaded the Llama-2 (13B) models and I have them in .pth format.

The environment check passes, as do the setup and dependency checks. When it reaches the conversion step, I see this:

Writing vocab...

Traceback (most recent call last):
  File "/var/folders/m0/872knsrj62s8zllz2f18nhpr0000gn/T/E7FD53F2-B53D-4439-9AE7-898FE0607B18/convert-pth-to-ggml.py", line 11, in <module>
    convert.main(['--outtype', 'f16' if args.ftype == 1 else 'f32', '--', args.dir_model])
  File "/private/var/folders/m0/872knsrj62s8zllz2f18nhpr0000gn/T/E7FD53F2-B53D-4439-9AE7-898FE0607B18/convert.py", line 1144, in main
    OutputFile.write_all(outfile, params, model, vocab)
  File "/private/var/folders/m0/872knsrj62s8zllz2f18nhpr0000gn/T/E7FD53F2-B53D-4439-9AE7-898FE0607B18/convert.py", line 953, in write_all
    for i, ((name, lazy_tensor), ndarray) in enumerate(zip(model.items(), ndarrays)):
  File "/private/var/folders/m0/872knsrj62s8zllz2f18nhpr0000gn/T/E7FD53F2-B53D-4439-9AE7-898FE0607B18/convert.py", line 875, in bounded_parallel_map
    result = futures.pop(0).result()
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/concurrent/futures/_base.py", line 438, in result
    return self.__get_result()
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/concurrent/futures/_base.py", line 390, in __get_result
    raise self._exception
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/concurrent/futures/thread.py", line 52, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/private/var/folders/m0/872knsrj62s8zllz2f18nhpr0000gn/T/E7FD53F2-B53D-4439-9AE7-898FE0607B18/convert.py", line 950, in do_item
    return lazy_tensor.load().to_ggml().ndarray
  File "/private/var/folders/m0/872knsrj62s8zllz2f18nhpr0000gn/T/E7FD53F2-B53D-4439-9AE7-898FE0607B18/convert.py", line 489, in load
    ret = self._load()
  File "/private/var/folders/m0/872knsrj62s8zllz2f18nhpr0000gn/T/E7FD53F2-B53D-4439-9AE7-898FE0607B18/convert.py", line 497, in load
    return self.load().astype(data_type)
  File "/private/var/folders/m0/872knsrj62s8zllz2f18nhpr0000gn/T/E7FD53F2-B53D-4439-9AE7-898FE0607B18/convert.py", line 489, in load
    ret = self._load()
  File "/private/var/folders/m0/872knsrj62s8zllz2f18nhpr0000gn/T/E7FD53F2-B53D-4439-9AE7-898FE0607B18/convert.py", line 549, in load
    ndarrays = [load_unquantized(tensor) for tensor in lazy_tensors]
  File "/private/var/folders/m0/872knsrj62s8zllz2f18nhpr0000gn/T/E7FD53F2-B53D-4439-9AE7-898FE0607B18/convert.py", line 549, in <listcomp>
    ndarrays = [load_unquantized(tensor) for tensor in lazy_tensors]
  File "/private/var/folders/m0/872knsrj62s8zllz2f18nhpr0000gn/T/E7FD53F2-B53D-4439-9AE7-898FE0607B18/convert.py", line 297, in load_unquantized
    tensor = lazy_tensor.load()
  File "/private/var/folders/m0/872knsrj62s8zllz2f18nhpr0000gn/T/E7FD53F2-B53D-4439-9AE7-898FE0607B18/convert.py", line 489, in load
    ret = self._load()
  File "/private/var/folders/m0/872knsrj62s8zllz2f18nhpr0000gn/T/E7FD53F2-B53D-4439-9AE7-898FE0607B18/convert.py", line 695, in load
    return UnquantizedTensor(storage.load(storage_offset, elm_count).reshape(size))
  File "/private/var/folders/m0/872knsrj62s8zllz2f18nhpr0000gn/T/E7FD53F2-B53D-4439-9AE7-898FE0607B18/convert.py", line 679, in load
    raise Exception("tensor stored in unsupported format")
Exception: tensor stored in unsupported format

Am I doing something wrong?

Add ability to configure runtime and model hyperparameters

Some essential ones that come to mind (a sketch of how these could be carried in the app follows the list):

  • -t {n} to specify the number of threads. From what I can tell, LlamaChat uses 3 threads by default, but on my machine 8 threads gives the best performance.
  • --temp and the rest of the useful sampling parameters
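To illustrate the shape of such a setting (names here are hypothetical; LlamaChat's internals may differ), the app could carry these values in a small settings struct that the inference layer reads:

// Hypothetical sketch of user-configurable runtime/model hyperparameters.
import Foundation

struct InferenceSettings {
  // Number of CPU threads used for generation.
  var threadCount: Int = max(1, ProcessInfo.processInfo.activeProcessorCount - 1)
  // Sampling parameters commonly exposed by llama.cpp-style runtimes
  // (defaults shown are illustrative, not LlamaChat's).
  var temperature: Double = 0.8
  var topK: Int = 40
  var topP: Double = 0.95
}

// Example: what a user on an 8-core machine might pick.
var settings = InferenceSettings()
settings.threadCount = 8
settings.temperature = 0.7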

Support downloadable models

Any chance we could just select the model and have LlamaChat download the necessary files automatically?

Support configuring whether to load the entire model into memory or use mmap

Greetings,
Love the application and UX!

I noticed llama.cpp running on my M1 was flushing the model from memory during and after each generation, causing slower-than-expected output. This can be fixed by passing the --mlock argument, which massively boosts Mac M1 performance by locking the model into memory.

LlamaChat currently has a similar issue, and I believe it can be fixed by passing --mlock as well. In fact, I suggest leaving it on by default for a seamless beginner experience on M1s.

Please also consider an advanced feature that allows users to change these parameters.
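For reference, 2023-era llama.cpp exposes this as a flag on its context parameters; assuming llama.h is bridged into Swift, a wrapper might set it roughly like this (function and field names follow the 2023 C API and may differ in newer versions):

// Sketch, assuming llama.cpp's C header is bridged into Swift (2023-era API).
func makeContext(modelPath: String, lockInMemory: Bool) -> OpaquePointer? {
  var params = llama_context_default_params()
  params.use_mlock = lockInMemory  // keep model pages resident in RAM
  return llama_init_from_file(modelPath, params)
}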

Failed to load model for eachadea/ggml-vicuna-7b-1.1

After I downloaded eachadea/ggml-vicuna-7b-1.1's ggml-vicuna-7b-1.1-q4_0.bin model from https://huggingface.co/eachadea/ggml-vicuna-7b-1.1/tree/main, I was able to add it as a Chat Source successfully. However, during conversation, a "Failed to load model" error occurred.
I also tried llama.cpp, and I could only load the model after updating to the latest version; the llama.cpp from 5 days ago would also fail to load it. I'm not sure whether the ggml format in llama.cpp has been modified in some way.

[Feature Request] Support InternLM

Dear LlamaChat developer,

Greetings! I am vansinhu, a community developer and volunteer at InternLM. Your work has been immensely beneficial to me, and I believe it could be used effectively with InternLM as well. You are welcome to join our Discord at https://discord.gg/gF9ezcmtM3; I hope to get in touch with you.

Best regards,
vansinhu

Error using pth format model

Please forgive my ignorance...
Problem:

Exception: tensor stored in unsupported format

Full error that popped up:

Traceback (most recent call last):
  File "/var/folders/dl/8yzjfxfn26sf347jggtzb30c0000gn/T/0D6EEEEF-7FD3-417C-AFC4-71D9A9E81116/convert-pth-to-ggml.py", line 11, in <module>
    convert.main(['--outtype', 'f16' if args.ftype == 1 else 'f32', '--', args.dir_model])
  File "/private/var/folders/dl/8yzjfxfn26sf347jggtzb30c0000gn/T/0D6EEEEF-7FD3-417C-AFC4-71D9A9E81116/convert.py", line 1144, in main
    OutputFile.write_all(outfile, params, model, vocab)
  File "/private/var/folders/dl/8yzjfxfn26sf347jggtzb30c0000gn/T/0D6EEEEF-7FD3-417C-AFC4-71D9A9E81116/convert.py", line 953, in write_all
    for i, ((name, lazy_tensor), ndarray) in enumerate(zip(model.items(), ndarrays)):
  File "/private/var/folders/dl/8yzjfxfn26sf347jggtzb30c0000gn/T/0D6EEEEF-7FD3-417C-AFC4-71D9A9E81116/convert.py", line 875, in bounded_parallel_map
    result = futures.pop(0).result()
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/concurrent/futures/_base.py", line 438, in result
    return self.__get_result()
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/concurrent/futures/_base.py", line 390, in __get_result
    raise self._exception
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/concurrent/futures/thread.py", line 52, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/private/var/folders/dl/8yzjfxfn26sf347jggtzb30c0000gn/T/0D6EEEEF-7FD3-417C-AFC4-71D9A9E81116/convert.py", line 950, in do_item
    return lazy_tensor.load().to_ggml().ndarray
  File "/private/var/folders/dl/8yzjfxfn26sf347jggtzb30c0000gn/T/0D6EEEEF-7FD3-417C-AFC4-71D9A9E81116/convert.py", line 489, in load
    ret = self._load()
  File "/private/var/folders/dl/8yzjfxfn26sf347jggtzb30c0000gn/T/0D6EEEEF-7FD3-417C-AFC4-71D9A9E81116/convert.py", line 497, in load
    return self.load().astype(data_type)
  File "/private/var/folders/dl/8yzjfxfn26sf347jggtzb30c0000gn/T/0D6EEEEF-7FD3-417C-AFC4-71D9A9E81116/convert.py", line 489, in load
    ret = self._load()
  File "/private/var/folders/dl/8yzjfxfn26sf347jggtzb30c0000gn/T/0D6EEEEF-7FD3-417C-AFC4-71D9A9E81116/convert.py", line 695, in load
    return UnquantizedTensor(storage.load(storage_offset, elm_count).reshape(size))
  File "/private/var/folders/dl/8yzjfxfn26sf347jggtzb30c0000gn/T/0D6EEEEF-7FD3-417C-AFC4-71D9A9E81116/convert.py", line 679, in load
    raise Exception("tensor stored in unsupported format")
Exception: tensor stored in unsupported format

Request for support of the official OpenAI API for ChatGPT/GPT-4

I am reaching out to request support for the official OpenAI API for ChatGPT/GPT-4 in LlamaChat. I understand that the current LLaMA models are designed to run offline, but I believe that adding support for faster, hosted APIs would make the software feel more native on the Mac and, ultimately, more useful.

I kindly ask you to consider adding support for the official OpenAI API for ChatGPT/GPT-4 to LlamaChat. Thank you for your time and consideration.

Build error

'/System/Library/Frameworks/Sparkle.framework/Versions/B/Sparkle' (no such file, not in dyld cache),
'/Library/Frameworks/Sparkle.framework/Versions/B/Sparkle' (no such file)

Support for AMD GPUs (macOS)

Maybe there could be support for AMD GPUs on macOS, since they can run MPS. Or maybe add a fallback mode like PyTorch's, rather than running fully on the CPU.

Sources don't persist

After closing the app and reopening it, the chat sources are cleared out and need to be re-added.

My steps:

  1. Opened app, and added Alpaca in GGML format.
  2. Used the app for a bit.
  3. Closed the app.
  4. Opened app, and got prompted to add a source again. Previous addition disappeared.
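A minimal sketch of how sources could be persisted across launches (ChatSource and its fields are hypothetical stand-ins for LlamaChat's real model types):

import Foundation

// Hypothetical persisted representation of a chat source.
struct ChatSource: Codable {
  let name: String
  let modelPath: URL
}

// Save/load sources as JSON in Application Support so they survive relaunches.
enum ChatSourceStore {
  static var fileURL: URL {
    let dir = FileManager.default.urls(for: .applicationSupportDirectory,
                                       in: .userDomainMask)[0]
    return dir.appendingPathComponent("sources.json")
  }

  static func save(_ sources: [ChatSource]) throws {
    try JSONEncoder().encode(sources).write(to: fileURL, options: .atomic)
  }

  static func load() -> [ChatSource] {
    guard let data = try? Data(contentsOf: fileURL) else { return [] }
    return (try? JSONDecoder().decode([ChatSource].self, from: data)) ?? []
  }
}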

Chinese LLaMA/Alpaca support

Dear LlamaChat Maintainer,

Greetings from Yiming Cui, the Chinese-LLaMA/Alpaca maintainer. I would like to thank you for your efforts in making LLaMA-like models more accessible to the community. I just conducted a quick test, loading our Chinese-Alpaca-7B/13B model (ggml format), and it functioned without any errors. The system outputs closely resemble those generated by llama.cpp, and I believe there are no major issues concerning the support of Chinese models. I plan to include a description of LlamaChat on our project page and I am looking forward to future updates of LlamaChat.
Thank you.

Best regards,
Yiming

P.S. As a quick suggestion, some users might be interested in using advanced hyper-parameter settings (such as temperature, top-k, top-p, threads, etc.) to generate more diverse outputs. (forgive me if this feature is already implemented in LlamaChat)

Screenshot:
WX20230417-083210@2x

Ensure messages view is scrolled to bottom when new messages are added

When you're at the bottom of the chat, the view should keep the generated text in sight as it's being generated; that is, it should automatically scroll down as each new line is produced.

Currently, newly generated messages go off-screen at the bottom, and you have to keep scrolling manually to read each line as it's generated.

The same goes for showing your own message at the bottom when pressing Enter.
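In SwiftUI this is typically done with ScrollViewReader, scrolling to the last message whenever the list changes; a minimal sketch (not LlamaChat's actual view code):

import SwiftUI

// Minimal sketch: keep a messages list pinned to the bottom as items arrive.
struct MessagesView: View {
  let messages: [String]

  var body: some View {
    ScrollViewReader { proxy in
      ScrollView {
        LazyVStack(alignment: .leading) {
          ForEach(messages.indices, id: \.self) { index in
            Text(messages[index]).id(index)
          }
        }
      }
      // Scroll when a message is appended; streamed token updates would
      // need a similar trigger keyed on the last message's content.
      .onChange(of: messages.count) { _ in
        withAnimation { proxy.scrollTo(messages.count - 1, anchor: .bottom) }
      }
    }
  }
}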

Feature request: custom initial prompt

Really cool project, thank you so much!

Could the chat context be customized by the user?
Each conversation could have its own context configuration, tunable from the settings.

Selected Model is Unsupported

Using Alpaca and gpt4all

I have the following:

  • ggml-alpaca-7b-q4.bin
  • gpt4all-lora-quantized.bin

After selecting the path, it shows "Selected model is of an unsupported version".


Installation environment problem

First of all, thank you so much for sharing. This project's interface and installation method are impressively simple and powerful. I really want to use it!

During my installation, I encountered the following problems:

When performing the second step ("setting up environment"), the following error occurred:

Could not install packages due to an EnvironmentError: [Errno 13] Permission denied: '/Library/Python/3.7'
Consider using the `--user` option or check the permissions.

You are using pip version 19.0.3, however version 23.2.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.

After some searching, I tried the following:

  1. I ran the python -u -m pip install numpy sentencepiece torch --user command, and it reported a successful installation (except for a warning: "WARNING: Error parsing requirements for yapf: [Errno 2] No such file or directory: '/Users/user/anaconda3/lib/python3.7/site-packages/yapf-0.29.0.dist-info/METADATA'").
  2. I checked via pip --version that my pip version is 23.2.1, but the pip version shown in the error message above is 19.0.3, so I suspect that the environment LlamaChat uses is not the one I expected.

What could be causing this, and how can I fix it? Thank you so much!

Feature request: Automatically stop generation

It would be nice to have an option to automatically stop generation after a random number of 1-n sentences, delimited by punctuation followed by whitespace. Additionally, it may be helpful to have a "Stop generation at end of sentence" button as an alternative to the current control in the chat window.
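A sketch of the proposed heuristic (standalone, not LlamaChat code): pick a random target in 1...n, then count sentence boundaries — punctuation followed by whitespace — in the streamed text and stop once the target is reached:

import Foundation

// Stop generation after a random number of sentences, where a sentence
// ends at '.', '!' or '?' followed by whitespace.
struct SentenceLimiter {
  let targetSentences: Int

  init(maxSentences: Int) {
    targetSentences = Int.random(in: 1...max(1, maxSentences))
  }

  func shouldStop(generatedText: String) -> Bool {
    let regex = try! NSRegularExpression(pattern: "[.!?]\\s")
    let range = NSRange(generatedText.startIndex..., in: generatedText)
    return regex.numberOfMatches(in: generatedText, options: [], range: range)
      >= targetSentences
  }
}

// Usage: check after each streamed token and stop generating when true.
let limiter = SentenceLimiter(maxSentences: 3)
var output = ""
for token in ["Hello", " world. ", "How", " are", " you? ", "Fine."] {
  output += token
  if limiter.shouldStop(generatedText: output) { break }
}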
