[BUG] Azure Databricks disk_offload error about mlflow HOT 10 OPEN

vitaliy-sharandin commented on June 10, 2024

[BUG] Azure Databricks disk_offload error

Comments (10)

harupy commented on June 10, 2024

@vitaliy-sharandin Thanks for reporting this. Could you share your model logging code?

harupy commented on June 10, 2024

I ran the following code but could not reproduce the error:

%pip install -U git+https://github.com/huggingface/transformers torch accelerate==0.29.3 mlflow

dbutils.library.restartPython()

########

import transformers
import torch


model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    token="...",
)

import mlflow
import uuid

mlflow.set_registry_uri("databricks-uc")

with mlflow.start_run() as run:
    mlflow.transformers.log_model(pipeline, "model")


mlflow.register_model(
    model_uri=f"runs:/{run.info.run_id}/model",
    name=f"..."
)

vitaliy-sharandin commented on June 10, 2024

The main difference between your code and mine is that I first fine-tune adapters with peft and then try to register a run that contains only the saved adapters and a reference to the base model, without the base model weights. I have also read the MLflow Transformers guide, which states that you don't need to call mlflow.transformers.persist_pretrained_model() when registering a model to Unity Catalog, so my code should work, since that is exactly what I am trying to do.

Here is my notebook:
https://github.com/vitaliy-sharandin/data_science_projects/blob/master/portfolio/nlp/fine-tuned-llm/psy_ai_mlflow_tracking_deployment.ipynb

harupy commented on June 10, 2024

Thanks for the notebook! Let me run it and see if I can reproduce the issue.

harupy commented on June 10, 2024

@vitaliy-sharandin Can you try inserting this code before loading the model to see if it can fix the error?

def get_model_with_peft_adapter(base_model, peft_adapter_path):
    from peft import PeftModel

    return PeftModel.from_pretrained(base_model, peft_adapter_path, offload_folder="offload")

mlflow.transformers.get_model_with_peft_adapter = get_model_with_peft_adapter

I'm not sure if offload_folder is the only way to fix this issue, but I want to give it a try.
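For readers following along, here is a minimal sketch of the technique the snippet above relies on: rebinding a function attribute on a module so that the library's own loading code picks up the replacement. Everything in it is a stand-in (`fake_lib`, `attach_adapter`, and `load_model` are hypothetical names, not mlflow APIs), and it assumes the library looks the function up on the module at call time. The point it illustrates is that the caller of the patched function is the library itself, which supplies both arguments, so the person applying the patch never passes the adapter path by hand.

```python
from types import SimpleNamespace


def _original_attach(base_model, adapter_path):
    """Hypothetical stand-in for a library helper that attaches an adapter."""
    return {"base": base_model, "adapter": adapter_path}


# Hypothetical stand-in for a library module; this is NOT mlflow.
fake_lib = SimpleNamespace(attach_adapter=_original_attach)


def load_model(lib, base_model, adapter_path):
    """Stand-in for the library's internal loader.

    It looks up attach_adapter on the module at call time and supplies
    both arguments itself, so a rebound attribute takes effect here.
    """
    return lib.attach_adapter(base_model, adapter_path)


def patched_attach(base_model, adapter_path):
    """Replacement with the same signature, adding one fixed option."""
    result = _original_attach(base_model, adapter_path)
    result["offload_folder"] = "offload"
    return result


# Before the patch, the loader uses the original helper.
before = load_model(fake_lib, "base", "adapters/")

# Rebind the attribute; later calls through the module use the replacement.
fake_lib.attach_adapter = patched_attach
after = load_model(fake_lib, "base", "adapters/")
```

Whether a patch like this takes effect in a real library depends on how that library references the function internally, so treat this as an illustration of the mechanism only.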

vitaliy-sharandin commented on June 10, 2024

That doesn't quite make sense to me: I have no adapters to load before fine-tuning, so I have no value for the required peft_adapter_path argument.

github-actions commented on June 10, 2024

@mlflow/mlflow-team Please assign a maintainer and start triaging this issue.

harupy commented on June 10, 2024

@vitaliy-sharandin The traceback shows that get_model_with_peft_adapter is called.

vitaliy-sharandin commented on June 10, 2024

@harupy Sorry, I misunderstood your code at first. I did what you proposed and it led to a new error. Please check out the notebook.

vitaliy-sharandin commented on June 10, 2024

@harupy Any updates?
