
Why two different options generate different size of models? about mergekit HOT 1 CLOSED

arcee-ai commented on May 28, 2024

Why do the two different options generate models of different sizes?


Comments (1)

cg123 commented on May 28, 2024

The difference is that in your first config, you're defining two output slices:

slices:
  - sources: # output slice #1
    - model: AIDC-ai-business/Marcoroni-7B-v3
      layer_range: [0, 24]
  - sources: # output slice #2
    - model: Toten5/Marcoroni-neural-chat-7B-v2
      layer_range: [8, 32]

These simply get stacked on top of each other. Each slice covers 24 layers, so the final model has 48 layers instead of the 32 that a 7B model has. In your second config, you're defining a single output slice that combines two input slices:

slices:
  - sources: # output slice #1
      - model: AIDC-ai-business/Marcoroni-7B-v3 # input slice #1
        layer_range: [0, 24]
      - model: Toten5/Marcoroni-neural-chat-7B-v2 # input slice #2
        layer_range: [8, 32]

The two input slices will be combined using the merge method you specified (SLERP here). That means that layer 0 of AIDC-ai-business/Marcoroni-7B-v3 will be SLERP-merged with layer 8 of Toten5/Marcoroni-neural-chat-7B-v2, layer 1 with layer 9, and so on. The end result has the same size as each of your input slices, so just 24 layers.
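To make the layer arithmetic concrete, here is a minimal Python sketch. It is not mergekit code: `slice_length`, `output_layer_count`, and `layer_pairs` are hypothetical helpers, and it assumes `layer_range` is half-open (start included, end excluded), which matches the 24-layer result described above.

```python
# Hypothetical helpers for illustration; these are not part of mergekit.
# Assumption: layer_range is [start, end) with the end excluded.

def slice_length(layer_range):
    """Number of layers covered by a [start, end) range."""
    start, end = layer_range
    return end - start


def output_layer_count(slices):
    """Total layers in the result: each output slice contributes once."""
    total = 0
    for output_slice in slices:
        lengths = {slice_length(src["layer_range"]) for src in output_slice["sources"]}
        if len(lengths) != 1:
            raise ValueError("input slices in one output slice must have equal length")
        total += lengths.pop()
    return total


def layer_pairs(output_slice):
    """Source layers that get merged together, position by position."""
    ranges = [range(*src["layer_range"]) for src in output_slice["sources"]]
    return list(zip(*ranges))


A = "AIDC-ai-business/Marcoroni-7B-v3"
B = "Toten5/Marcoroni-neural-chat-7B-v2"

# First config: two output slices, one source each (stacked).
stacked = [
    {"sources": [{"model": A, "layer_range": [0, 24]}]},
    {"sources": [{"model": B, "layer_range": [8, 32]}]},
]

# Second config: one output slice with two sources (merged layer by layer).
combined = [
    {"sources": [
        {"model": A, "layer_range": [0, 24]},
        {"model": B, "layer_range": [8, 32]},
    ]},
]

print(output_layer_count(stacked))    # 24 + 24 = 48
print(output_layer_count(combined))   # 24
print(layer_pairs(combined[0])[:2])   # [(0, 8), (1, 9)]
```

The only structural difference between the two configs is whether the second model starts a new list item under `slices` or sits inside the same `sources` list, and that alone decides whether the layers are stacked or merged.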

As for the filter options: filter works by searching for the substring you specify in each tensor name, so which values make sense depends on the architecture you're merging. If you want to know all of the tensor names in a Mistral model, you can see a list on Hugging Face here.
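As a rough illustration of substring-based filtering, the sketch below picks a per-tensor value by checking whether a filter string occurs in the tensor name. It is not mergekit's implementation: `pick_value`, the rule list, and the numeric values are hypothetical. The tensor names follow the naming used by Mistral-style checkpoints in Hugging Face Transformers.

```python
# Hypothetical illustration of substring filtering; not mergekit's actual code.

def pick_value(tensor_name, rules, default):
    """Return the value of the first rule whose substring occurs in the name."""
    for substring, value in rules:
        if substring in tensor_name:
            return value
    return default


# Example rules: (filter substring, value). The numbers are made up.
rules = [("self_attn", 0.3), ("mlp", 0.7)]

tensor_names = [
    "model.layers.0.self_attn.q_proj.weight",
    "model.layers.0.mlp.gate_proj.weight",
    "model.layers.0.input_layernorm.weight",
]

for name in tensor_names:
    # Names matching no filter fall back to the default value.
    print(name, pick_value(name, rules, default=0.5))
```

Because matching is by substring, a filter such as `self_attn` hits every attention projection in every layer, while a narrower string such as `q_proj` would hit only the query projections.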

Hope this helps!
