Comments (2)
Did you figure out how to run it locally?
from sagemaker-pytorch-inference-toolkit.
I was able to run endpoints locally; it is..... not simple.
Well, let's assume you know a little about how SageMaker works; the main thing is that what goes into /opt/ml/model is not the .tar.gz file itself but what is inside that file. I will also be using boto3.
First, the easy way: just run the container directly,
docker run --rm -p 8080:8080 --gpus all -v $PATH_TO_FOLDER:/opt/ml/model $CONTAINER_NAME serve
and use requests to send whatever you want to http://localhost:8080/invocations.
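For example, a minimal sketch with the requests library (the payload shape is a made-up example; your inference handler defines the real schema):

import requests

# SageMaker serving containers must answer GET /ping with 200 once they are up
print(requests.get("http://localhost:8080/ping").status_code)

# json= sends the dict as a JSON body with Content-Type: application/json
response = requests.post(
    "http://localhost:8080/invocations",
    json={"inputs": "hello world"},
)
print(response.status_code, response.text)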
This works, with the bonus that you don't need to use Python, but it runs outside the SageMaker environment. So, to get a little closer to SageMaker, you have to use local mode: install sagemaker[local] and boto3 (see the pip line below). Now, the long part.
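For reference, the install is plain pip, and local mode also needs a working Docker setup on your machine, since it runs the same container locally:

pip install "sagemaker[local]" boto3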
First, import and initialize the local SageMaker session:
import boto3
import sagemaker
from sagemaker.local import LocalSession

boto_session = boto3.Session(region_name='us-west-2')  # any region works, but one must be set
session = LocalSession(boto_session=boto_session)
session.config = {'local': {'local_code': True}}  # run everything in local mode
Explanation: the boto session must be set with a region or you will get an error, and the last line ensures that everything will be done in local mode, or at least that is what the documentation says. Now you have a SageMaker session; you can do everything you normally would, and it will be done in the local version, though there are small differences. I use custom containers, so I will be doing single-container custom inference in SageMaker. First, we create the model:
model = session.create_model(
    name='local',
    role='arn:aws:iam::123456789012:role/service-role/AmazonSageMaker',  # dummy execution role
    primary_container={
        "Image": "$YOUR_IMAGE",
        "ModelDataUrl": "file://$FOLDER_WITH_TAR_GZ_CONTENTS",
        "Environment": {},  # environment variables; must be set even when empty
    },
)
The file:// path is relative to your current working directory, and you must set Environment even when it is empty.
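If what you have on disk is the model.tar.gz itself, unpack it first so the folder holds the archive's contents rather than the archive, as mentioned at the top. A minimal sketch (file and folder names are placeholders):

import tarfile

# Unpack model.tar.gz so the folder contains what SageMaker would place in /opt/ml/model
with tarfile.open("model.tar.gz") as tar:
    tar.extractall("model_dir")  # then point ModelDataUrl at file://model_dir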
Then you create the endpoint config:
config = session.sagemaker_client.create_endpoint_config(
    EndpointConfigName='local-endpoint-config',
    ProductionVariants=[
        {
            'VariantName': 'local-variant',
            'ModelName': 'local',
            'InstanceType': 'local',
            'InitialInstanceCount': 1,
        }
    ],
)
As you can see, InstanceType here is 'local', not one of the AWS instance types, because this is local mode.
To finish, you create the local endpoint:
ep = session.sagemaker_client.create_endpoint(
    EndpointName='local-endpoint',
    EndpointConfigName='local-endpoint-config',
)
All of this lives in memory, meaning it is best done in a .ipynb file for testing; once that file is closed, the endpoint will no longer be available.
Now you can invoke the endpoint as you would in your final script, but using the local endpoint name:
predictor = sagemaker.predictor.Predictor(
    endpoint_name='local-endpoint',
    sagemaker_session=session,
)
response = predictor.predict(payload)
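By default the Predictor sends and returns raw bytes. If your container speaks JSON, you can attach serializers; a sketch using the sagemaker SDK's stock JSON classes (the payload shape is a made-up example):

from sagemaker.serializers import JSONSerializer
from sagemaker.deserializers import JSONDeserializer

predictor = sagemaker.predictor.Predictor(
    endpoint_name='local-endpoint',
    sagemaker_session=session,
    serializer=JSONSerializer(),      # dict -> JSON body with Content-Type: application/json
    deserializer=JSONDeserializer(),  # JSON response -> Python object
)
result = predictor.predict({"inputs": "hello world"})  # hypothetical payload; your handler defines the schema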
Alternatively, using the runtime client directly:
response = session.sagemaker_runtime_client.invoke_endpoint(
    EndpointName='local-endpoint',
    ContentType='application/json',
    Body=payload,
)
result = response['Body'].read()  # Body is a stream; read it to get the raw bytes
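When you are done, you can tear everything down so the local container stops. A sketch, assuming the local client mirrors the boto3 delete calls the same way it mirrors the create calls used above (treat the method names as an assumption):

# Assumed to mirror the boto3 SageMaker client's delete methods
session.sagemaker_client.delete_endpoint(EndpointName='local-endpoint')
session.sagemaker_client.delete_endpoint_config(EndpointConfigName='local-endpoint-config')
session.sagemaker_client.delete_model(ModelName='local')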
I couldn't find this info in any current guide, and some of the details here were not explained in the little info I could gather, but this works as of today. I also used it two years ago, though with AWS-provided containers. I know it will probably not help you much, but I hope it does, for you or for others who, like me, end up here while trying to make this work. AWS is seriously lacking in material for learning how to use their services.
from sagemaker-pytorch-inference-toolkit.