Comments (17)

kryptec commented on June 11, 2024 4

I recently ran into this problem, and after reading the above comments resolved it by giving the absolute path to the folder instead of the relative one.

I'm not sure whether this is a Hugging Face issue or something specific to the cluster environment I'm working on, but I thought I would mention it here in case it helps anyone.
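
A minimal sketch of the absolute-path workaround, assuming the model was saved to a local directory. The helper name `absolute_model_dir` and the example path are hypothetical and not part of setfit:

```python
from pathlib import Path


def absolute_model_dir(path: str) -> str:
    """Return the absolute form of a local model directory, or raise if it is missing."""
    resolved = Path(path).expanduser().resolve()
    if not resolved.is_dir():
        raise FileNotFoundError(f"Model directory not found: {resolved}")
    return str(resolved)


# Usage (assumes setfit is installed and "./my-model" is a saved model directory):
# from setfit import SetFitModel
# model = SetFitModel.from_pretrained(absolute_model_dir("./my-model"))
```

Failing early with FileNotFoundError makes a mistyped path easier to tell apart from the repo id validation error discussed in this thread.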

from setfit.

LazerJesus commented on June 11, 2024 3

I still get this error:

peftmodelpath = "/notebooks/eva/model.bin"

model = PeftModelForCausalLM.from_pretrained(
    model, 
    peftmodelpath, 
    cache_dir=peftmodelpath, 
    local_files_only=True, 
    model_head_file=None
)

HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': '/notebooks/eva/model.bin'. Use repo_type argument if needed.


pdhall99 commented on June 11, 2024 2

I get the same error. When I try to load a locally saved model:

from setfit import SetFitModel

model = SetFitModel.from_pretrained("/path/to/model-directory", local_files_only=True)

I get

HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': '/path/to/model-directory'. Use `repo_type` argument if needed.

I think this could be solved by changing these lines from

        if os.path.isdir(model_id) and MODEL_HEAD_NAME in os.listdir(model_id):
            model_head_file = os.path.join(model_id, MODEL_HEAD_NAME)

to something like

        if os.path.isdir(model_id):
            if MODEL_HEAD_NAME in os.listdir(model_id):
                model_head_file = os.path.join(model_id, MODEL_HEAD_NAME)
            else:
                model_head_file = None


Mouhanedg56 commented on June 11, 2024 2

I saved a trained model to a local path. I can't see anything wrong when loading the model using from_pretrained with the correct path.

This error appears when you try to load a model from a nonexistent local path that contains more than one slash (/) with local_files_only=True.

  • if we pass a nonexistent path:
    • if path is in the form 'repo_name' or 'namespace/repo_name': ModelHubMixin.from_pretrained will throw FileNotFoundError.
    • else: ModelHubMixin.from_pretrained will throw a HFValidationError because the path does not exist locally and is not in the expected hub form.
  • if we pass an existing local folder with no body: SentenceTransformer will throw an OSError: /path/to/your/model does not appear to have a file named config.json
  • if we pass an existing folder with a body and head: no issue in this case
  • if we pass an existing folder with a body but no head: SetFitModel._from_pretrained throws an HFValidationError that is not caught by the try/except. Since the expected behaviour of the SetFit _from_pretrained classmethod is to initialise the classification head with random weights when MODEL_HEAD_NAME is not found, this can be considered a bug.
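
The cases above can be approximated with a small classifier. This is a simplified sketch and not the actual huggingface_hub validation: the pattern below only mirrors the 'repo_name' or 'namespace/repo_name' shape, and the function name `classify_model_id` is hypothetical.

```python
import os
import re

# Approximation of the Hub repo id shape: an optional "namespace/" followed by a
# name, each starting and ending on a word character. The real check in
# huggingface_hub is stricter (length limit, forbidden "--" and "..").
REPO_ID_PATTERN = re.compile(r"^(\b[\w\-.]+\b/)?\b[\w\-.]+\b$")


def classify_model_id(model_id: str) -> str:
    """Return 'local_dir', 'hub_id', or 'invalid' for a from_pretrained argument."""
    if os.path.isdir(model_id):
        return "local_dir"
    # More than one slash can never be 'namespace/repo_name'.
    if model_id.count("/") > 1:
        return "invalid"
    if REPO_ID_PATTERN.match(model_id):
        return "hub_id"
    return "invalid"
```

Under this sketch, an existing folder is always treated as local, while a nonexistent path such as '/path/to/model-directory' or './all-MiniLM-L6-v2' is neither a folder nor a valid repo id, which matches the situations where the thread reports HFValidationError.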

@pdhall99's suggestion looks like it fixes the issue, and it follows the same logic as ModelHubMixin.from_pretrained:

  • if folder exists:
    • if model head file exists => model_head_file = os.path.join(model_id, MODEL_HEAD_NAME) else None => initialise classification head with random weights
  • else: look for the model head in the hub using hf_hub_download.
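
The branching described above can be sketched as a standalone function. This is an illustration, not the setfit source: `find_model_head` and the `download_from_hub` callable (a stand-in for hf_hub_download) are hypothetical, and the value of MODEL_HEAD_NAME is an assumption.

```python
import os
from typing import Callable, Optional

# Assumption: the head file name used by setfit; adjust if your version differs.
MODEL_HEAD_NAME = "model_head.pkl"


def find_model_head(
    model_id: str,
    download_from_hub: Callable[[str, str], Optional[str]],
) -> Optional[str]:
    """Locate the classification head for a local folder or a Hub repo id.

    Returns None when a local folder has no head file, in which case the caller
    is expected to initialise the head with random weights.
    """
    if os.path.isdir(model_id):
        head_path = os.path.join(model_id, MODEL_HEAD_NAME)
        # Local folder: never fall through to the Hub lookup.
        return head_path if os.path.isfile(head_path) else None
    # Not a local folder: treat model_id as a Hub repo id.
    return download_from_hub(model_id, MODEL_HEAD_NAME)
```

The key design point is that an existing folder without a head returns None instead of reaching the Hub lookup, so a local path is never validated as a repo id.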


lamtrinh259 commented on June 11, 2024 2

@kryptec thank you so much, I tried your method and gave it the absolute path, and apparently, it works now!


bojanbabic commented on June 11, 2024 2

In Colab this happens if you mount a drive. For some reason the mounted path is not recognized. Instead, try placing the model in /content. This should solve the missing-path issue.
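
A minimal sketch of that suggestion, assuming the model directory is readable from Python even though loading from it fails. The helper name `copy_model_to_local` and the example paths are hypothetical:

```python
import shutil
from pathlib import Path


def copy_model_to_local(src_dir: str, dst_dir: str) -> str:
    """Copy a saved model directory to dst_dir and return the absolute destination path."""
    src = Path(src_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"Source model directory not found: {src}")
    dst = Path(dst_dir)
    # dirs_exist_ok (Python 3.8+) lets the copy be re-run without failing.
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return str(dst.resolve())


# Usage in Colab (paths are examples only):
# local_path = copy_model_to_local("/content/drive/MyDrive/my-model", "/content/my-model")
# model = SetFitModel.from_pretrained(local_path)
```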


ayseozgun commented on June 11, 2024 1

Hello,

I am trying to read the HF model directly from S3 in SageMaker Studio. I am also getting the same HFValidationError. My code is below:

from transformers import T5Tokenizer, T5ForConditionalGeneration

# Specify the S3 URL to your model and tokenizer
model_url = "s3://bucketname/model/"

# Load the model and tokenizer from S3
tokenizer = T5Tokenizer.from_pretrained(model_url)
model = T5ForConditionalGeneration.from_pretrained(model_url)

# Now you can use the model and tokenizer for inference
input_text = "translate English to German: How old are you?"
input_ids = tokenizer(input_text, return_tensors="pt", truncation=True, padding=True, max_length=512)
input_ids.to("cuda")

outputs = model.generate(input_ids)
print(tokenizer.decode(outputs[0]))

I am able to see the model by running the code below on SageMaker, so I am sure the path is correct.

s3 = boto3.client('s3')

# List all objects in the model folder from S3
s3_resource = boto3.resource('s3')
my_bucket = s3_resource.Bucket(bucket_name)
for object_summary in my_bucket.objects.filter(Prefix=''):
    file_path = object_summary.key
    file_name = os.path.basename(file_path)
    if file_name:
        print(file_name)

Can you please help me? Thanks :)
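
One possible workaround, not confirmed anywhere in this thread, is to copy the model files from S3 to a local directory first and pass that directory to from_pretrained, since the error suggests the s3:// string is being validated as a Hub repo id. The sketch below assumes `bucket` is a boto3 Bucket resource (as created in the listing code above); the function name `download_s3_prefix` and the example paths are hypothetical.

```python
from pathlib import Path


def download_s3_prefix(bucket, prefix: str, local_dir: str) -> str:
    """Download every object under `prefix` from a boto3 Bucket resource into local_dir."""
    for obj in bucket.objects.filter(Prefix=prefix):
        # Keys returned by the filter start with the prefix; strip it off.
        relative_key = obj.key[len(prefix):].lstrip("/")
        # Skip the prefix itself and "folder" placeholder keys.
        if not relative_key or relative_key.endswith("/"):
            continue
        target = Path(local_dir) / relative_key
        target.parent.mkdir(parents=True, exist_ok=True)
        bucket.download_file(obj.key, str(target))
    return local_dir


# Usage (assumes boto3 and transformers are installed; names are examples only):
# import boto3
# bucket = boto3.resource("s3").Bucket("bucketname")
# local_dir = download_s3_prefix(bucket, "model/", "/tmp/model")
# tokenizer = T5Tokenizer.from_pretrained(local_dir)
# model = T5ForConditionalGeneration.from_pretrained(local_dir)
```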


lewtun commented on June 11, 2024

Hey @jrivd can you provide a code snippet / Colab that reproduces the error? This will help debug what exactly is going on :)


jrivd commented on June 11, 2024

Hi @lewtun, thanks for your response. You can see the error in action here: https://colab.research.google.com/drive/10t9QmQEe7BHIQQ8XUmw1B37vPFRfORDK?usp=sharing
Many thanks!


jrivd commented on June 11, 2024

Many thanks for your comments, @pdhall99 and @Mouhanedg56! I'll go over them carefully


sfernandes-gim commented on June 11, 2024

Hey @pdhall99, was this issue finally resolved? I am trying to load the ST models offline but still get the 'repo_name' or 'namespace/repo_name' error when using the full path. I can only load models offline in my environment.

Even when the model directory is located in the current directory, it still fails to load:
model = SetFitModel.from_pretrained("./all-MiniLM-L6-v2", model_head_file=None, local_files_only=True)
I tried multiple argument combinations but keep getting the error: Repo id must use alphanumeric chars or '-', '_', '.', '--' and '..' are forbidden, '-' and '.' cannot start
Thanks.


pdhall99 commented on June 11, 2024

Hi @sfernandes-gim, this change is not yet merged.


lewtun commented on June 11, 2024

Hi folks, we've just released a new version that includes fixes for some of the above issues. For those still having trouble, could you please comment below with a code snippet for debugging? Thanks!


sfernandes-gim commented on June 11, 2024

Hi @lewtun, I am trying to load from my local directory but it always gives the error below. I also tried the local_files_only=True option. Please advise.

model = SetFitModel.from_pretrained("./all-MiniLM-L6-v2")

Error
HFValidationError: Repo id must use alphanumeric chars or '-', '_', '.', '--' and '..' are forbidden, '-' and '.' cannot start or end the name, max length is 96: './all-MiniLM-L6-v2'.

I also tried the following:
model = SetFitModel.from_pretrained("./Output/all-MiniLM-L6-v2", local_files_only=True, use_differentiable_head=True, head_params={"out_features": num_classes})

Error
HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': './Output/all-MiniLM-L6-v2'. Use repo_type argument if needed.


sfernandes-gim commented on June 11, 2024

Thanks @lewtun this works fine with the latest release


u1vi commented on June 11, 2024

I had the same issue on my local machine (not Colab). I was using a mounted drive (pure storage).

Creating a symlink in the working directory (where the training runs) that points to the saving directory solved the issue for me.

ln -s existing_source_file optional_symbolic_link
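
The same link can be created from Python with os.symlink. This is a minimal sketch; the helper name `link_save_dir` is hypothetical, and both paths are supplied by the caller:

```python
import os


def link_save_dir(save_dir: str, link_path: str) -> str:
    """Create link_path as a symlink to save_dir (like `ln -s save_dir link_path`)."""
    if not os.path.isdir(save_dir):
        raise FileNotFoundError(f"Save directory not found: {save_dir}")
    # Leave an existing link in place so the call can be repeated safely.
    if not os.path.islink(link_path):
        os.symlink(os.path.abspath(save_dir), link_path, target_is_directory=True)
    return link_path
```

The model can then be loaded through the link path inside the working directory instead of the mounted path.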


Rakin061 commented on June 11, 2024

In the Colab this happens if you mount drive. For some reason mounted path is not recognized. Instead, try having model in /content. This should solve issue of missing path.

This is not working even when loading the model from /content. Have you solved this issue on your side, @bojanbabic?

