Comments (15)
Any update on this issue?
From my perspective, requiring a single file called "model.pth" seems overly restrictive. Here are some use cases this causes trouble for:
- Using a different serialization method (joblib, pickle, etc)
- Downloading the model file from elsewhere on startup
- Having multiple serialized weights files for a single model
Can we get some guidance on whether this issue will be fixed? We'd really like to use versions greater than 1.5. Thanks in advance!
from sagemaker-pytorch-inference-toolkit.
Here's a sample notebook that successfully deploys a PyTorch 1.6 model using the SageMaker PyTorch 1.6 framework container with TorchServe:
We use a pre-trained Hugging Face Roberta model with a custom inference script:
https://github.com/data-science-on-aws/workshop/blob/374329adf15bf1810bfc4a9e73501ee5d3b4e0f5/09_deploy/wip/pytorch/code/inference.py
It seems that the model has to be called exactly model.pth for the SageMaker PyTorch 1.6 serving container to pick it up correctly.
Hope that helps.
Antje
from sagemaker-pytorch-inference-toolkit.
@abmitra84 I simply added a custom save_pytorch_model function which saves my Huggingface model using torch.save() as follows:
import logging
import os
import torch

logger = logging.getLogger(__name__)
MODEL_NAME = 'model.pth'

def save_pytorch_model(model, model_dir):
    os.makedirs(model_dir, exist_ok=True)
    logger.info('Saving PyTorch model to {}'.format(model_dir))
    save_path = os.path.join(model_dir, MODEL_NAME)
    torch.save(model.state_dict(), save_path)
I call this save function at the end of my model training code. This gives me the model.pth file.
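The matching model_fn in the inference script then just loads that file back. A minimal sketch (the RoBERTa config values here are illustrative, not the exact setup from the linked script):
import os
import torch
from transformers import RobertaConfig, RobertaForSequenceClassification

def model_fn(model_dir):
    # Rebuild the architecture, then load the weights saved as model.pth.
    # (Sketch only: config/num_labels are illustrative.)
    config = RobertaConfig.from_pretrained('roberta-base', num_labels=2)
    model = RobertaForSequenceClassification(config)
    state_dict = torch.load(os.path.join(model_dir, 'model.pth'), map_location='cpu')
    model.load_state_dict(state_dict)
    return model.eval()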
from sagemaker-pytorch-inference-toolkit.
@antje Thanks for your response. Yes, this seems a reasonable path. I didn't train within SageMaker and didn't plan to use SageMaker for inferencing (FastAPI is easy and fast), but now I have to due to certain technical considerations, so I wasn't sure whether it is a good idea to convert pytorch_model.bin to model.pth. Even after that, SentenceBERT's model directory structure didn't give me confidence that it would work.
Just to give you an update: I downgraded PyTorch from v1.6 to v1.5 and it seems to be able to identify model_fn() now. To me it looks like from v1.6 onwards TorchServe is the default toolkit, and it has that default model name = model.pth piece. Not a good idea. Hope the folks make it more flexible.
from sagemaker-pytorch-inference-toolkit.
@antje how do you save the model as .pth, since Hugging Face saves it as pytorch_model.bin by default?
Does this problem go away with a lower PyTorch version? I am using PyTorch v1.6.0.
If anyone has any idea: I am having trouble with SBERT deployment (https://www.sbert.net/). SBERT was trained using Hugging Face, but the directory structure it creates underneath is a little different. It keeps the .bin file within a subfolder (0_Transformers), and SageMaker just can't figure out where model.pth is. It keeps giving that error without ever reaching model_fn, where the SBERT-based model loading is carried out by the SentenceBERT wrapper.
Any idea will be helpful
from sagemaker-pytorch-inference-toolkit.
I'm having the same issue trying to extend a PyTorch 1.8 preconfigured container. I am attempting to use the M2M_100 transformer from Hugging Face. I downloaded the model using Hugging Face's from_pretrained and saved it using save_pretrained. This saves a few files for the model, including a couple of .json files, a .bin model file, and a SentencePiece file (.model).
I have created a model.tar.gz file that contains my model saved in a "model" subdir and my entry point code (containing model_fn, input_fn, etc.) with a requirements.txt file in a "code" subdir. I'm attempting to deploy a PyTorchModel (since I neither need nor want to train/tune). However, the container either (a) errors out and tells me that I should provide a model_fn function (referring me to a doc link for this) or (b) ignores my entry_point code and tries to use the default_model_fn.
My PyTorchModel configuration looks like:
model = PyTorchModel(
    source_dir='s3://<bucket>/m2m_translation/model.tar.gz',
    entry_point='inference_code.py',
    model_data='s3://<bucket>/m2m_translation/model.tar.gz',
    role=role,
    framework_version='1.8',
    py_version='py3'
)
I'm following AWS tutorials that suggest the patterns I have put into use. Like: https://aws.amazon.com/blogs/startups/how-startups-deploy-pretrained-models-on-amazon-sagemaker/
The weird thing is that I can deploy a test setup following almost the same pattern using sagemaker[local] on my laptop with no problem. The only real difference is that I'm pointing the source_dir to a local directory instead of at the model.tar.gz file. I tried pointing source_dir to an S3 bucket that contains the inference_code.py file rather than the tar.gz file, and this did not work either, giving me error (b) above.
Is there a fix? Do I have to somehow force my model to save as a .pth file (not sure how that would work given the multiple dependencies of M2M_100)? Do I have to create a custom container rather than extend a preconfigured one? Do I just downgrade to PyTorch 1.5?
from sagemaker-pytorch-inference-toolkit.
Thanks team! This issue is now resolved for me as of ~April (approximately, depending on which framework version we're talking about), so I'll close it.
The new releases of the PyTorch containers consume the fixed version of TorchServe, which means they're able to see other files apart from model.pth (like the inference script, and any other artifacts floating around). I've also confirmed this means we no longer need to specifically name our models model.pth - so long as the custom model_fn knows what to look for.
For example, I can now have a model.tar.gz like below and it works fine:
+ code
| + inference.py (Defining model_fn)
| + requirements.txt
+ config.json
+ pytorch_model.bin
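With that layout, the model_fn in code/inference.py just loads whatever it finds in model_dir. A minimal sketch, assuming a Hugging Face sequence-classification checkpoint (adapt to your own model class):
from transformers import AutoModelForSequenceClassification

def model_fn(model_dir):
    # model_dir is the extracted model.tar.gz, so config.json and
    # pytorch_model.bin sit at its root and from_pretrained can read them.
    model = AutoModelForSequenceClassification.from_pretrained(model_dir)
    return model.eval()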
For anybody that's still seeing the issue, you may need to upgrade and/or stop specifying the patch version of the framework: e.g. use framework_version="1.7" and not framework_version="1.7.0".
Note that if we want to be able to directly estimator.deploy(), I'm seeing we still need to:
- Copy the current working dir (the script files - or whatever subset of them you need) into {model_dir}/code in the training job - so they're present in the model tarball at the right location (see the sketch after this list)
- from inference import * in train.py, because the entry-point will just be carried over from training to inference unless you specifically re-configure it, so it'll still be pointing at train.py.
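A rough sketch of the first point - something like this at the end of train.py (SM_MODEL_DIR is the standard SageMaker env var; the helper name is made up):
import os
import shutil

def copy_code_into_model_dir():
    # Copy the working dir (or just the script files you need) into
    # {model_dir}/code so they land inside model.tar.gz where the
    # inference container expects them.
    model_dir = os.environ.get('SM_MODEL_DIR', '/opt/ml/model')
    code_dir = os.path.join(model_dir, 'code')
    shutil.copytree(os.getcwd(), code_dir, dirs_exist_ok=True)  # Python 3.8+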
...Of course if you use a PyTorchModel you don't need to do either of these things - because it re-packages the model tarball with the source_dir you specify, and sets the entry_point.
The issue @clashofphish is seeing is something separate I believe - possibly related to using an S3 tarball for source_dir rather than a local folder... Does it work for you if source_dir is instead a local folder containing inference_code.py? Does the root of your model.tar.gz on S3 definitely contain inference_code.py? Does renaming to inference.py (which used to be a fixed/required name for some other frameworks, I think) help? I'd suggest raising a separate issue for this if you're still having trouble with it!
from sagemaker-pytorch-inference-toolkit.
Hello @athewsey, do you have a requirements.txt in your model data archive, or does your inference.py import anything that's not pre-installed in the PyTorch container?
from sagemaker-pytorch-inference-toolkit.
@ChuyangDeng and @athewsey I believe the new PyTorch 1.6 image requires that the model filename be model.pth. I had to make changes to our code, as well, to conform. This is not ideal, of course.
Here is the relevant code: https://github.com/aws/sagemaker-pytorch-inference-toolkit/blob/9a6869e/src/sagemaker_pytorch_serving_container/torchserve.py#L121
and here is the sample that we're working from: https://github.com/tobrien0/TorchServeOnAWS/blob/9fa0f87/2_serving_natively_with_amazon_sagemaker/deploy.ipynb
You might be able to override the default. @ChuyangDeng can you advise?
cc @antje
from sagemaker-pytorch-inference-toolkit.
@athewsey Thanks for bringing this to our attention. Are you using the official containers or did you build your own? Could you show me the code you use to start the batch job?
from sagemaker-pytorch-inference-toolkit.
Hi @ChuyangDeng and @icywang86rui, sorry to be slow. I'm using the official PyTorch v1.6 container, providing both a requirements.txt and an inference.py.
My current flow is to first train an estimator, and then create a model and deploy it, like:
training_job_desc = estimator.latest_training_job.describe()

model = PyTorchModel(
    model_data=training_job_desc["ModelArtifacts"]["S3ModelArtifacts"],
    name=training_job_desc["TrainingJobName"],
    role=role,
    source_dir="src/",  # also contains requirements.txt
    entry_point="src/inference.py",
    framework_version="1.6",
    py_version="py3",
)
Some libraries (e.g. pytorch-tabnet) do not produce .pth files because they wrap some additional configuration or functionality in their saved files/bundles. I'd prefer not to have to reverse engineer every high-level library I use to recover a raw .pth file, and in previous versions the model_fn pattern provided a great compatibility layer for achieving this.
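For instance, a model_fn that defers to the library's own loader might look roughly like this (sketch only - assuming pytorch-tabnet's load_model API; the artifact filename is illustrative):
import os
from pytorch_tabnet.tab_model import TabNetClassifier

def model_fn(model_dir):
    # Let the library rebuild its model from its own bundle format,
    # rather than forcing everything into a raw model.pth state dict.
    model = TabNetClassifier()
    model.load_model(os.path.join(model_dir, 'tabnet_model.zip'))  # name is illustrative
    return model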
I'm not really clear from the responses here whether this is planned to be "fixed" in the inference container, or just removing the functionality from the docs? TorchServe offers APIs to customize the initialize() function for a model, right?
from sagemaker-pytorch-inference-toolkit.
@antje One quick question: in your code (https://github.com/data-science-on-aws/workshop/blob/master/10_deploy/wip/pytorch/01_Deploy_RoBERTa.ipynb), does model_s3_uri point to the s3://.../roberta model.tar.gz file, or just the directory s3://.../roberta model/ without being tarred?
from sagemaker-pytorch-inference-toolkit.
Hi AWS team,
Any update on this? We would like to use our own 'model_fn()' method.
Regards,
Karthik
from sagemaker-pytorch-inference-toolkit.
@ajaykarpur et al. I'd also like to point out that a valid use case is to provide no model.pth at all and have model_fn download it -- e.g. serving any model off of PyTorch Hub. I am doing that in several models and can no longer deploy my models using these containers as a result.
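For illustration, that pattern is roughly the following (the hub repo and model name here are just examples):
import torch

def model_fn(model_dir):
    # No weights shipped in model.tar.gz at all - model_fn pulls them
    # from PyTorch Hub at load time (substitute your own repo/model).
    model = torch.hub.load('pytorch/vision:v0.10.0', 'resnet18', pretrained=True)
    return model.eval()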
from sagemaker-pytorch-inference-toolkit.
Hi,
I am also facing this issue, but it seems that using framework_version="1.7" solves it.
However, I am now facing an out-of-space error:
This is in local mode, and when I try with an instance it gives the same issue. Is there any way of increasing the disk size of the container used for deployment?
Thanks for any help.
from sagemaker-pytorch-inference-toolkit.