
Comments (8)

frankfliu commented on July 22, 2024

I knew it would break when I rushed the fix without a unit test :(

This is a regression caused by last week's PR. Fixed in #3095.

from djl.

frankfliu commented on July 22, 2024

@david-sitsky

You are right, SentenceTransformer has an extra Linear layer. We use the Hugging Face AutoModel to trace the model, which doesn't include the pooling and dense layers. Our TranslatorFactory applies the pooling, but it doesn't apply the Linear layer.

There are two solutions:

  1. Trace the model with SentenceTransformer itself, so the scripted model will contain the missing layer.
  2. Write a custom Translator that manually loads the Linear weights and applies the linear layer.
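For intuition about solution 2, here is a plain-Python sketch (no DJL or PyTorch; `apply_linear` is a hypothetical helper, not a DJL API) of what the missing dense layer computes: it projects the pooled 768-dim embedding down to 512 dims via out = weight · x.

```python
def apply_linear(x, weight, bias=None):
    """Apply a linear layer: out[i] = sum_j(weight[i][j] * x[j]) (+ bias[i])."""
    out = [sum(w * v for w, v in zip(row, x)) for row in weight]
    if bias is not None:
        out = [o + b for o, b in zip(out, bias)]
    return out

# Toy shapes: a 4-dim "pooled embedding" projected down to 2 dims.
# (The real model projects 768 dims down to 512.)
pooled = [1.0, 1.0, 1.0, 1.0]
weight = [[0.5, 0.5, 0.5, 0.5],
          [0.25, 0.25, 0.25, 0.25]]
print(apply_linear(pooled, weight))  # -> [2.0, 1.0]
```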


frankfliu commented on July 22, 2024

Here is demo code that produces the same result as SentenceTransformer:

        Criteria<String, float[]> criteria =
                Criteria.builder()
                        .setTypes(String.class, float[].class)
                        .optModelUrls("djl://ai.djl.huggingface.pytorch/sentence-transformers/clip-ViT-B-32-multilingual-v1")
                        .optTranslatorFactory(new TextEmbeddingTranslatorFactory())
                        .optArgument("normalize", "false")
                        .optEngine("PyTorch")
                        .build();
        try (ZooModel<String, float[]> model = criteria.loadModel();
             Predictor<String, float[]> predictor = model.newPredictor()) {
            // Embeddings from the traced model (transformer + pooling, no dense layer)
            float[] embeddings = predictor.predict("fast blue car");

            NDManager manager = model.getNDManager();
            // Load the dense layer's weights from the safetensors file and apply them
            try (InputStream is = Files.newInputStream(Paths.get("model.safetensors"))) {
                NDList list = NDList.decode(manager, is);
                NDArray weight = list.get(0);
                NDArray array = manager.create(embeddings);
                NDArray output = array.getNDArrayInternal().linear(array, weight, null).get(0);

                float[] finalEmbeddings = output.toFloatArray();
                System.out.println(finalEmbeddings.length);
            }
        }


david-sitsky commented on July 22, 2024

@frankfliu - many thanks for the sample code. After downloading the model.safetensors file so it is accessible, I ran your code, but it returned an array of length 1 rather than 512. To be precise, it returned a float array with this value: [0.0038979053]. Any idea what is wrong here?

I also tried your first suggestion of tracing the model, using this code:

from opensearch_py_ml.ml_models import SentenceTransformerModel

model_id = "sentence-transformers/clip-ViT-B-32-multilingual-v1"
folder_path = "/data/output/djl/clip-ViT-B-32-multilingual-v1"
pre_trained_model = SentenceTransformerModel(model_id=model_id, folder_path=folder_path, overwrite=True)
model_path = pre_trained_model.save_as_pt(model_id=model_id, sentences=["fast blue car"])

However, when I try to use this model from within djl-serving, I get this error:

TRACE Predictor Predictor input data: [ND: (5) cpu() int64
[  101, 15040, 23254, 13000,   102]
, ND: (5) cpu() int64
[ 1,  1,  1,  1,  1]
]
DEBUG JniUtils Expected at most 2 argument(s) for operator 'forward', but received 3 argument(s). Declaration: forward(__torch__.sentence_transformers.SentenceTransformer.SentenceTransformer self, Dict(str, Tensor) input) -> Dict(str, Tensor)
Exception raised from checkAndNormalizeInputs at ../aten/src/ATen/core/function_schema_inl.h:392 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x6b (0x77360a6b405b in /data/nuix/djl-serving/cache/pytorch/2.0.1-cpu-linux-x86_64/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0xbf (0x77360a6aef6f in /data/nuix/djl-serving/cache/pytorch/2.0.1-cpu-linux-x86_64/libc10.so)
frame #2: void c10::FunctionSchema::checkAndNormalizeInputs<c10::Type>(std::vector<c10::IValue, std::allocator<c10::IValue> >&, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, c10::IValue, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, c10::IValue> > > const&) const + 0x6ae (0x7734c1dbd03e in /data/nuix/djl-serving/cache/pytorch/2.0.1-cpu-linux-x86_64/libtorch_cpu.so)
frame #3: torch::jit::Method::operator()(std::vector<c10::IValue, std::allocator<c10::IValue> >, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, c10::IValue, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, c10::IValue> > > const&) const + 0x173 (0x7734c4dc0b53 in /data/nuix/djl-serving/cache/pytorch/2.0.1-cpu-linux-x86_64/libtorch_cpu.so)
frame #4: <unknown function> + 0x12c950 (0x77353172c950 in /data/nuix/djl-serving/cache/pytorch/2.0.1-cpu-linux-x86_64/0.26.0-libdjl_torch.so)
frame #5: Java_ai_djl_pytorch_jni_PyTorchLibrary_moduleRunMethod + 0x183 (0x77353172cbbc in /data/nuix/djl-serving/cache/pytorch/2.0.1-cpu-linux-x86_64/0.26.0-libdjl_torch.so)
frame #6: [0x7735ec80fa70]

WARN  WorkerThread Failed to predict
ai.djl.translate.TranslateException: ai.djl.engine.EngineException: Expected at most 2 argument(s) for operator 'forward', but received 3 argument(s). Declaration: forward(__torch__.sentence_transformers.SentenceTransformer.SentenceTransformer self, Dict(str, Tensor) input) -> Dict(str, Tensor)
	at ai.djl.inference.Predictor.batchPredict(Predictor.java:192) ~[api-0.26.0.jar:?]
	at ai.djl.serving.wlm.ModelInfo$ModelThread.lambda$run$1(ModelInfo.java:1088) ~[wlm-0.26.0.jar:?]
	at ai.djl.serving.wlm.Job.runAll(Job.java:68) ~[wlm-0.26.0.jar:?]
	at ai.djl.serving.wlm.ModelInfo$ModelThread.run(ModelInfo.java:1088) ~[wlm-0.26.0.jar:?]
	at ai.djl.serving.wlm.WorkerThread.runJobs(WorkerThread.java:137) ~[wlm-0.26.0.jar:?]
	at ai.djl.serving.wlm.WorkerThread.run(WorkerThread.java:87) [wlm-0.26.0.jar:?]
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
	at java.base/java.lang.Thread.run(Thread.java:829) [?:?]
Caused by: ai.djl.engine.EngineException: Expected at most 2 argument(s) for operator 'forward', but received 3 argument(s). Declaration: forward(__torch__.sentence_transformers.SentenceTransformer.SentenceTransformer self, Dict(str, Tensor) input) -> Dict(str, Tensor)
	at ai.djl.pytorch.jni.PyTorchLibrary.moduleRunMethod(Native Method) ~[pytorch-engine-0.26.0.jar:?]
	at ai.djl.pytorch.jni.IValueUtils.forward(IValueUtils.java:57) ~[pytorch-engine-0.26.0.jar:?]
	at ai.djl.pytorch.engine.PtSymbolBlock.forwardInternal(PtSymbolBlock.java:146) ~[pytorch-engine-0.26.0.jar:?]
	at ai.djl.nn.AbstractBaseBlock.forward(AbstractBaseBlock.java:79) ~[api-0.26.0.jar:?]
	at ai.djl.nn.Block.forward(Block.java:127) ~[api-0.26.0.jar:?]
	at ai.djl.inference.Predictor.predictInternal(Predictor.java:143) ~[api-0.26.0.jar:?]
	at ai.djl.inference.Predictor.batchPredict(Predictor.java:170) ~[api-0.26.0.jar:?]
	... 10 more

I've probably done something dumb here... any idea what? I believe the predictor input data is as expected, and it is produced by basically this sort of code:

    private final HuggingFaceTokenizer tokenizer =
            HuggingFaceTokenizer.newInstance("sentence-transformers/clip-ViT-B-32-multilingual-v1");

    @Override
    public NDList processInput(TranslatorContext ctx, Input input)
    {
        Encoding encoding = tokenizer.encode(input.getAsString(0));
        return encoding.toNDList(ctx.getNDManager(), false);
    }
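For reference, the mean pooling that the translator applies over the token embeddings can be sketched in plain Python (a hypothetical, stdlib-only illustration, not the DJL implementation): each output dimension is the average over the non-masked tokens.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average token embeddings, counting only non-masked (mask == 1) tokens."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for emb, mask in zip(token_embeddings, attention_mask):
        if mask:
            summed = [s + e for s, e in zip(summed, emb)]
            count += 1
    return [s / count for s in summed]

# Toy example: two tokens kept, one masked out.
print(mean_pool([[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]], [1, 1, 0]))  # -> [2.0, 3.0]
```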


david-sitsky commented on July 22, 2024

@frankfliu - many thanks for your work on this. I've also updated my djl checkout to get your changes and ran this command:

djl/extensions/tokenizers$ python3 src/main/python/model_zoo_importer.py -m sentence-transformers/clip-ViT-B-32-multilingual-v1

I then copied the ZIP file to the appropriate location so I could run this unit test; I'm using the 0.27.0-SNAPSHOT versions of the DJL libraries.

    @Test
    public void testMe() throws Exception
    {
        Criteria<String, float[]> criteria = Criteria.builder()
                                                     .setTypes(String.class, float[].class)
                                                     .optModelPath(Paths.get("/data/djl-serving/models/clip-ViT-B-32-multilingual-v1.zip"))
                                                     .optTranslatorFactory(new DeferredTranslatorFactory())
                                                     .optProgress(new ProgressBar()).build();
        try (ZooModel<String, float[]> model = criteria.loadModel();
             Predictor<String, float[]> predictor = model.newPredictor())
        {
            System.out.println("Embeddings length is: " + predictor.predict("fast blue car").length);
        }
    }

However, the embeddings length still comes back as 768 rather than the expected 512:

[Test worker] 4993 INFO  ai.djl.pytorch.engine.PtEngine - Number of intra-op threads is 14
[Test worker] 5372 INFO  ai.djl.translate.DeferredTranslatorFactory - Using TranslatorFactory: ai.djl.huggingface.translator.TextEmbeddingTranslatorFactory
[Test worker] 5375 INFO  ai.djl.util.Platform - Found matching platform from: jar:file:/home/sits/.gradle/caches/modules-2/files-2.1/ai.djl.huggingface/tokenizers/0.27.0-SNAPSHOT/10c26f2548c7993477443654c28030d140a443ec/tokenizers-0.27.0-SNAPSHOT.jar!/native/lib/tokenizers.properties
[Test worker] 5379 INFO  ai.djl.huggingface.tokenizers.jni.LibUtils - Extracting native/lib/linux-x86_64/libtokenizers.so to cache ...
Embeddings length is: 768

Any ideas what is wrong here?


frankfliu commented on July 22, 2024

@david-sitsky

The changes didn't make it into the 0.27.0 release; you need to use the 0.28.0-SNAPSHOT release.
The model in the model zoo has also been updated, so you can now use "djl://ai.djl.huggingface.pytorch/sentence-transformers/clip-ViT-B-32-multilingual-v1". But please make sure to clear the cache so you get the latest model from the model zoo.
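To locate the cache before clearing it: DJL caches downloaded models under ~/.djl.ai by default, overridable with the DJL_CACHE_DIR environment variable. A hedged, stdlib-only sketch (the exact subdirectory layout may vary by DJL version):

```python
import os
from pathlib import Path

def djl_cache_dir() -> Path:
    """Resolve DJL's cache directory: DJL_CACHE_DIR if set, else ~/.djl.ai."""
    return Path(os.environ.get("DJL_CACHE_DIR", str(Path.home() / ".djl.ai")))

# Inspect first, then delete the cached model zoo entries, e.g.:
#   shutil.rmtree(djl_cache_dir() / "cache", ignore_errors=True)
print(djl_cache_dir())
```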


david-sitsky commented on July 22, 2024

Thanks @frankfliu - everything is working now as expected.


david-sitsky commented on July 22, 2024

@frankfliu - something has changed in the latest 0.28.0 snapshots: the test code I posted on this issue is now failing again. Any ideas?

