Comments (12)
Hi @mariokostelac, great to see your interest in the solution.
Which scripts and which commands do you run?
For notebook instances, you should use sm-local-ssh-notebook. Let me know if that's not the case.
from sagemaker-ssh-helper.
Not really. I've used sm-ssh-ide start to start it from the console, but it really seems to have been developed for SageMaker Studio.
Then I found a notebook and tried to turn it into a script that I can run on demand from the terminal, but I get the following output:
(base) [ec2-user@ip-172-16-24-195 ~]$ python start_ssh.py
sagemaker.config INFO - Not applying SDK defaults from location: /etc/xdg/sagemaker/config.yaml
sagemaker.config INFO - Not applying SDK defaults from location: /home/ec2-user/.config/sagemaker/config.yaml
[sagemaker-ssh-helper] WARNING: SageMaker SSH Helper is not correctly initialized. Did you forget to call wrapper.create()_before_ fit() / run() / transform() / deploy()?
[sagemaker-ssh-helper] SageMaker SSH Helper startup params: start_ssh=false, ssh_instance_count=1, node_rank=0
[sagemaker-ssh-helper] Skipping SageMaker SSH Helper setup from start_ssh.py
Script
from sagemaker_ssh_helper.wrapper import SSHEstimatorWrapper
from sagemaker.pytorch import PyTorch
import sagemaker
import logging
import os
logging.basicConfig(level=logging.INFO)
import time
import os
import sagemaker_ssh_helper
sagemaker_ssh_helper.setup_and_start_ssh()
while os.environ.get("START_SSH", "false") == "true":
    time.sleep(10)  # will sleep forever
Starting it from the terminal would be very useful, because I could then start it from a lifecycle script, etc.
@mariokostelac I see what you're trying to do. You need to copy all the commands from the notebook into your script. Right now the SSHEstimatorWrapper import in your script is unused, but you should use it to trigger a local SageMaker training job in a container:
estimator = PyTorch(entry_point='ssh_notebook.py',
                    source_dir='ssh_notebook/',
                    dependencies=[SSHEstimatorWrapper.dependency_dir()],
                    base_job_name='ssh-notebook',
                    role=sagemaker.get_execution_role(),
                    framework_version='1.9.1',
                    py_version='py38',
                    instance_count=1,
                    instance_type='local',
                    container_log_level=logging.INFO)
# NOTE: local_user_id is not defined in this snippet; set it before calling create().
ssh_wrapper = SSHEstimatorWrapper.create(estimator, connection_wait_time_seconds=0,
                                         local_user_id=local_user_id, log_to_stdout=True)
estimator.fit({'notebook': 'file://'})
Note that you should use instance_type='local', which means SSH Helper will start on the notebook instance itself without spinning up new instances.
Then you need a second script, ssh_notebook.py, which runs inside the container and triggers the SSH setup logic there.
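A minimal sketch of what such an ssh_notebook.py entry point might look like, pieced together from the calls in the script posted earlier in this thread. The helper name ssh_requested and the poll_seconds parameter are illustrative, and the assumption that START_SSH is "true" inside the container once the wrapper has been created is taken from that script rather than confirmed here.

```python
import os
import time


def ssh_requested(environ=os.environ):
    """Return True when START_SSH is set to "true" (the check used in the script above)."""
    return environ.get("START_SSH", "false") == "true"


def main(poll_seconds=10):
    # Imported lazily so this module can be loaded where the helper is not installed.
    import sagemaker_ssh_helper

    # Triggers the SSH setup logic inside the container.
    sagemaker_ssh_helper.setup_and_start_ssh()

    # Keep the local job, and therefore the container, alive while SSH is requested.
    while ssh_requested():
        time.sleep(poll_seconds)


# Assumption: START_SSH is "true" only when the job was launched through the wrapper,
# so running this file directly from a terminal does nothing.
if __name__ == "__main__" and ssh_requested():
    main()
```

The loop never exits on its own while START_SSH stays "true", so the local job has to be stopped from outside when the SSH session is no longer needed.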
If you are trying to produce the same kind of lifecycle configuration script as for SageMaker Studio, you won't get the desired result, because a SageMaker notebook instance already has a Systems Manager Agent running, and it will conflict with the one that SSH Helper tries to start. That's why you need to start it in a container, and the simplest way to do that is through SSHEstimatorWrapper and a local job. Of course, you can also build your own container for this purpose, but it will take more effort.
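To see whether an agent is already running on the host before starting anything, a rough check like the following could be used. It is only a sketch: it assumes a Linux host with /proc, and it assumes the agent's executable is named amazon-ssm-agent; the function name find_ssm_agent_pids is made up for illustration.

```python
import os


def find_ssm_agent_pids(proc_root="/proc", agent_name="amazon-ssm-agent"):
    """Return sorted PIDs whose executable name equals agent_name (Linux /proc only)."""
    if not os.path.isdir(proc_root):
        return []
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "cmdline"), "rb") as f:
                # cmdline is NUL-separated; the first field is the executable path.
                argv0 = f.read().split(b"\0", 1)[0].decode(errors="replace")
        except OSError:
            continue  # process exited or is not readable
        if os.path.basename(argv0) == agent_name:
            pids.append(int(entry))
    return sorted(pids)
```

A non-empty result on the notebook instance host would be consistent with the conflict described above; an empty result inside a fresh container would be expected.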
@ivan-khvostishkov just to make sure that I understand the setup:
- The container exists only to run the SSM agent and the SSH setup.
- Notebooks still run on the SageMaker host, outside of the Docker container.
Is that a good description?
When I establish an SSH-over-SSM session, am I SSH-ing into the SageMaker instance or into the Docker container?
I'm not sure I understood your concerns, but in the case of SageMaker notebook instances, yes, the container is a workaround for running an SSM agent, because another SSM agent is already running on the host and you don't want to break it.
You can run your notebooks either from the host (if you use the Notebook Jupyter UI) or from the container itself (if you run another notebook server and access it through SSH). In both cases the notebook file itself can have the same physical location on the host.
Note the file mapping part when you launch the estimator:
estimator.fit({'notebook': 'file://'})
It ensures that the notebook host filesystem is mapped into the container, so your notebook files are also available inside the container in /opt/ml/input/data/notebook.
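As a quick way to confirm the mapping from inside the container, one could list the notebook files under that channel directory. This is a sketch only; the function name list_channel_notebooks is hypothetical, and the default path is the one mentioned above.

```python
from pathlib import Path


def list_channel_notebooks(channel_dir="/opt/ml/input/data/notebook"):
    """Return sorted relative paths of .ipynb files under the channel directory."""
    root = Path(channel_dir)
    if not root.is_dir():
        return []  # channel not mounted (e.g. when run outside the container)
    found = []
    for path in root.rglob("*.ipynb"):
        rel = path.relative_to(root)
        # Skip Jupyter's autosave copies.
        if ".ipynb_checkpoints" in rel.parts:
            continue
        found.append(rel.as_posix())
    return sorted(found)
```

An empty list inside the container would suggest that the 'notebook' channel was not passed to fit() or points at a directory without notebooks.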
My concern is that SSH-ing into the container takes away a good chunk of the capabilities I have in the web terminal. E.g. I can copy files, but I can't install packages, change conda environments properly, etc.
Got it now!
But what does your development flow look like in this case? Do you want to install the packages over SSH, but use them through the SageMaker Notebook web UI?
If you are connected through SSH, why don't you just start a notebook inside your container environment and forget completely about the packages installed on the host machine? That is, work only inside the container, just as you would normally do in SageMaker Studio?
> But what does your development flow look like in this case? Do you want to install the packages over SSH, but use them through the SageMaker Notebook web UI?
Sometimes, yes. I manage conda envs, install new packages, and run some offline jobs. I often use it as an EC2 machine.
> If you are connected through SSH, why don't you just start a notebook inside your container environment and forget completely about the packages installed on the host machine? That is, work only inside the container, just as you would normally do in SageMaker Studio?
I could do that, but then my VS Code/SSH session can disconnect overnight (because of Mac power management, for example) and I lose the work. I'd really like to SSH onto the machine, not into a machine inside the machine.
I'll see whether SageMaker Studio fits this better.
Btw, I've seen people disable Amazon's SSM first in order to do SSH over SSM. Are there any unintended consequences of that?
> I could do that, but then my VS Code/SSH session can disconnect overnight (because of Mac power management, for example) and I lose the work.
Why would you lose the work? The work stays inside the container. Once you power on the Mac again, you just reconnect with SSH Helper to the container, which will (hopefully) still be running.
I have run a container inside a notebook instance for many days, and the SSM connection from inside the container stayed online.
Of course, because it's a single instance (same as your local laptop), you had better save your work to a durable location, such as a version control system for code and a database or S3 for data.
> I'll see whether SageMaker Studio fits this better.
SageMaker Studio will be the same.
> Btw, I've seen people disable Amazon's SSM first in order to do SSH over SSM. Are there any unintended consequences of that?
I'm not the right person to answer this question. I think it's worth trying, but with regard to the consequences I recommend double-checking with AWS Support by raising a support case: https://docs.aws.amazon.com/awssupport/latest/user/case-management.html
> Why would you lose the work? The work stays inside the container. Once you power on the Mac again, you just reconnect with SSH Helper to the container, which will (hopefully) still be running.
If you attach VS Code to the remote container and run the notebook from VS Code, then when VS Code loses the connection you also lose the connection to the kernel, and it's pretty much impossible to continue working on that kernel.
> I have run a container inside a notebook instance for many days, and the SSM connection from inside the container stayed online.
I'm not afraid of the container dying. It's more the mental overhead of remembering that what you SSH into and what you attach VS Code to is not really what you see in the web interface when you open it.
Thanks for explaining, @ivan-khvostishkov, and thanks for creating this repo. I'm trying to find the service with the lowest overhead, and it seems that SageMaker notebooks might not be that, which is fine. The new SageMaker Studio maps closely to what I had in mind (I think). I believe I'll be able to set up SSH there too and get exactly what I wanted.
I saw there were some plans to rewrite the bash helpers in Python. Do you know if that might happen soon?
OK, I'm getting your point. A few more ideas from me:
> If you attach VS Code to the remote container and run the notebook from VS Code, then when VS Code loses the connection you also lose the connection to the kernel, and it's pretty much impossible to continue working on that kernel.
I'm not sure about VS Code, but PyCharm has a nice capability to connect to a remotely running notebook server.
This is probably the feature you are looking for. Take a look at SageMaker Studio, as you are planning to do anyway, and let me know how the testing goes. SageMaker SSH Helper will start a separate instance with the remote notebook for you if you use SageMaker Studio (not with SageMaker notebook instances, though; there you will need to modify the script slightly yourself). That way, even if you lose the connection and close your laptop lid, the notebook continues to run remotely, and you can reconnect and continue your work from where you left off (provided that the instance is still alive).
> I'm not afraid of the container dying. It's more the mental overhead of remembering that what you SSH into and what you attach VS Code to is not really what you see in the web interface when you open it.
If you mean the filesystem tree structure, then I completely agree with you. This is a slight inconvenience that comes from the need to run a Docker container with different mount points and directory mappings.
> I saw there were some plans to rewrite the bash helpers in Python. Do you know if that might happen soon?
In the new version (spoiler) there will be the sm-ssh tool written in Python, but it will still call bash scripts under the hood. However, it will be a good foundation for further Python refactoring, and everyone has the chance to contribute.