
Comments (10)

ivan-khvostishkov commented on May 23, 2024

Hi, @KraftZzz. Thank you for raising this issue.

I have a few questions to clarify:

1/ You cannot use SSH/SSM, but do you get any prediction results?

2/ If you don't see any logs in CloudWatch, you probably have misconfigured permissions (no access to CloudWatch). Can you try one of the SageMaker examples, e.g. Deploy a pretrained PyTorch BERT model from Hugging Face, and confirm that it works in your environment and that you can see its logs? (A quick way to check CloudWatch access is sketched below.)

If you have the same issue even without SageMaker SSH Helper, you might need to reach out to AWS Support.
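For example, to quickly verify CloudWatch access from your notebook, you can list the endpoint's log streams directly. A minimal sketch, assuming the default /aws/sagemaker/Endpoints/<endpoint-name> log group that SageMaker uses for endpoints (endpoint_name is a placeholder for your endpoint's name):

import boto3

logs = boto3.client("logs")
endpoint_name = "your-endpoint-name"  # placeholder: use the name from the SageMaker console

# SageMaker endpoints write their container logs to this log group by default
streams = logs.describe_log_streams(
    logGroupName=f"/aws/sagemaker/Endpoints/{endpoint_name}"
)
for stream in streams["logStreams"]:
    print(stream["logStreamName"])

If this call fails with an access-denied error, the role you are using is missing CloudWatch Logs permissions.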


KraftZzz commented on May 23, 2024
  1. I can get the prediction results, and the endpoint is in InService status.
  2. I am sure the permissions are fine: I can see the MMS (multi-model-server) launch log, but there is no SSM info or anything from ssh-helper. I also checked the example you provided (Deploy a pretrained PyTorch BERT model from Hugging Face); it uses a PyTorch model and is single-model, not multi-model.
  3. I am following this example: https://github.com/huggingface/notebooks/tree/main/sagemaker/17_custom_inference_script. Could you please help me confirm whether ssh-helper works with this example?


ivan-khvostishkov commented on May 23, 2024

OK, got it. Instead of the dependencies parameter, could you try adding a requirements.txt to code/?

In the requirements, add SageMaker SSH Helper:

sagemaker-ssh-helper

I've tried your example, and it works for me with this approach.
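For reference, the resulting model.tar.gz would look roughly like this (a sketch assuming the standard code/ layout that the Hugging Face inference toolkit expects):

model.tar.gz
├── pytorch_model.bin (and the other model files)
└── code/
    ├── inference.py
    └── requirements.txt   (a single line: sagemaker-ssh-helper)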

As a side comment, all the examples, including your code, are single-model endpoints; the "multi-model-server" name is somewhat confusing. If you really want to deploy a multi-model endpoint, you will need to use MultiDataModel and SSHMultiModelWrapper; see the FAQ for more details and the sketch below.
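A rough sketch of what that would look like, mirroring the single-model pattern from my steps below (the name, S3 prefix, and exact parameters here are illustrative assumptions; the FAQ has the authoritative usage):

from sagemaker.multidatamodel import MultiDataModel
from sagemaker_ssh_helper.wrapper import SSHMultiModelWrapper

# Group the model archives under a common S3 prefix (hypothetical bucket and prefix)
mdm = MultiDataModel(
    name="my-multi-model-endpoint",              # hypothetical model name
    model_data_prefix="s3://my-bucket/models/",  # hypothetical prefix holding model.tar.gz files
    model=huggingface_model,
)

# Wrap the multi-model container instead of the single model
ssh_wrapper = SSHMultiModelWrapper.create(mdm, connection_wait_time_seconds=0)

predictor = mdm.deploy(
    initial_instance_count=1,
    instance_type="ml.g4dn.xlarge",
)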


KraftZzz commented on May 23, 2024

You mean adding this code to inference.py, right?

import os
import sys
sys.path.append(os.path.join(os.path.dirname(__file__), "lib"))

import sagemaker_ssh_helper
sagemaker_ssh_helper.setup_and_start_ssh()
I mention MMS because I see the following information in the endpoint log:

Warning: MMS is using non-default JVM parameters: -XX:-UseContainerSupport

2023-04-26T04:35:25,060 [INFO ] main com.amazonaws.ml.mms.ModelServer -
MMS Home: /opt/conda/lib/python3.8/site-packages
Current directory: /
Temp directory: /home/model-server/tmp
Number of GPUs: 1
Number of CPUs: 4
Max heap size: 3500 M
Python executable: /opt/conda/bin/python3.8
Config file: /etc/sagemaker-mms.properties
Inference address: http://0.0.0.0:8080
Management address: http://0.0.0.0:8080
Model Store: /.sagemaker/mms/models
Initial Models: ALL
Log dir: null
Metrics dir: null
Netty threads: 0
Netty client threads: 0
Default workers per model: 1
Blacklist Regex: N/A
Maximum Response Size: 6553500
Maximum Request Size: 6553500
Preload model: false
Prefer direct buffer: false
2023-04-26T04:35:25,118 [WARN ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerLifeCycle - attachIOStreams() threadName=W-9000-model
2023-04-26T04:35:25,179 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - model_service_worker started with args: --sock-type unix --sock-name /home/model-server/tmp/.mms.sock.9000 --handler sagemaker_huggingface_inference_toolkit.handler_service --model-path /.sagemaker/mms/models/model --model-name model --preload-model false --tmp-dir /home/model-server/tmp
2023-04-26T04:35:25,180 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Listening on port: /home/model-server/tmp/.mms.sock.9000
2023-04-26T04:35:25,180 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - [PID] 72
2023-04-26T04:35:25,180 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - MMS worker started.
2023-04-26T04:35:25,180 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Python runtime: 3.8.10
2023-04-26T04:35:25,181 [INFO ] main com.amazonaws.ml.mms.wlm.ModelManager - Model model loaded.
2023-04-26T04:35:25,187 [INFO ] main com.amazonaws.ml.mms.ModelServer - Initialize Inference server with: EpollServerSocketChannel.
2023-04-26T04:35:25,199 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.mms.sock.9000
2023-04-26T04:35:25,256 [INFO ] main com.amazonaws.ml.mms.ModelServer - Inference API bind to: http://0.0.0.0:8080
Model server started.


KraftZzz commented on May 23, 2024

There are no sagemaker-ssh-helper logs in the endpoint's CloudWatch log, so I ran:

instance_ids = ssh_wrapper.get_instance_ids()
print(f'To connect over SSM run: aws ssm start-session --target {instance_ids[0]} --region {sess.boto_region_name}')

and there was no output.


KraftZzz commented on May 23, 2024

Could you please share your steps?


ivan-khvostishkov commented on May 23, 2024

My steps are the following:

1/ Added to inference.py the following lines:

+import os
+import sys
+sys.path.append(os.path.join(os.path.dirname(__file__), "lib"))
+
+import sagemaker_ssh_helper
+sagemaker_ssh_helper.setup_and_start_ssh()
+
+
 from transformers import AutoTokenizer, AutoModel
 import torch
 import torch.nn.functional as F

2/ Modified sagemaker/17_custom_inference_script/sagemaker-notebook.ipynb and executed the following cell:

from sagemaker.huggingface.model import HuggingFaceModel
from sagemaker_ssh_helper.wrapper import SSHModelWrapper  # <--NEW--


# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
   model_data=s3_location,       # path to your model and script
   role=role,                    # IAM role with permissions to create an endpoint
   transformers_version="4.26",  # transformers version used
   pytorch_version="1.13",       # pytorch version used
   py_version="py39",            # python version used
)

ssh_wrapper = SSHModelWrapper.create(huggingface_model, connection_wait_time_seconds=0)  # <--NEW--


# deploy the endpoint
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g4dn.xlarge"
    )

After the endpoint had been deployed, I was able to fetch the instance IDs:

ssh_wrapper.get_instance_ids()
INFO:sagemaker-ssh-helper:Querying SSM instance IDs for endpoint huggingface-pytorch-inference-2023-04-24-17-00-23-155
INFO:sagemaker-ssh-helper:Got preliminary SSM instance IDs: ['mi-01234567890abcd00']
INFO:sagemaker-ssh-helper:Got final SSM instance IDs: ['mi-01234567890abcd00']

['mi-01234567890abcd00']
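
With that ID you can then open the session with the command printed earlier, e.g.:

aws ssm start-session --target mi-01234567890abcd00 --region <your-region>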


KraftZzz commented on May 23, 2024

Oh, to my confusion, I managed to get the mi-xxxx ID in one of my experiments yesterday, but I didn't modify any code...


KraftZzz commented on May 23, 2024

Thanks for sharing!


ivan-khvostishkov commented on May 23, 2024

You're welcome! Let me know if you managed to make your code work, so we can close this issue.

