aws / studio-lab-examples
Example notebooks for working with SageMaker Studio Lab. Sign up for an account at the link below!
Home Page: https://studiolab.sagemaker.aws
License: Apache License 2.0
Describe the bug
Python package installation error: when I install a package from the terminal with the pip
command, I cannot import that package in a notebook.
To Reproduce
Steps to reproduce the behavior:
Install a package from the terminal with pip or pip3.
Expected behavior
Packages installed from the terminal can be imported in the notebook.
Desktop (please complete the following information):
Others
Temporary workaround: install all packages with notebook commands like !pip3 install spacy.
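One common cause of this symptom is that the terminal's pip belongs to a different environment than the notebook kernel. The sketch below is a minimal illustration under that assumption, not a confirmed diagnosis for Studio Lab. The helper names `pip_install_command` and `install_into_kernel` are hypothetical; they build and run a pip command bound to the interpreter that the kernel itself is running.

```python
import subprocess
import sys


def pip_install_command(package):
    """Build a pip command bound to the interpreter running this kernel."""
    return [sys.executable, "-m", "pip", "install", package]


def install_into_kernel(package):
    """Install `package` into the same environment the notebook kernel uses."""
    subprocess.check_call(pip_install_command(package))


# The interpreter behind the current kernel; compare it with the output of
# `which python` in the terminal to see whether the two environments differ.
print(sys.executable)
```

If the two paths differ, activating the kernel's environment in the terminal before running pip should make the installed package visible to the notebook.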
Describe the bug
Cloning a single notebook using the "Open in SageMaker Studio Lab" button fails. Cloning the whole repo works.
Using sagemaker's sample, https://github.com/aws/studio-lab-examples/tree/main/open-in-studio-lab, I get this error:
Unable to copy notebook to project.
The link to this notebook is broken or blocked. If this is a private GitHub notebook, sign in to GitHub before copying the notebook.
aws/studio-lab-examples/blob/main/natural-language-processing/NLP_Disaster_Recovery_Translation.ipynb
To Reproduce
Steps to reproduce the behavior:
[![Open in SageMaker Studio Lab](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/aws/studio-lab-examples/blob/main/natural-language-processing/NLP_Disaster_Recovery_Translation.ipynb)
and run it
Expected behavior
My notebook will open and appear, just as it would when cloning a directory.
Desktop (please complete the following information):
Describe the bug
Studio Lab Notebook Preview parser is not MathJax-aware
To Reproduce
You can confirm it from the following link.
(The notebook in T81 558:Applications of Deep Neural Networks)
Expected behavior
The MathJax representations should be rendered as the formulas.
Screenshots
The MathJax representations in preview.
Desktop (please complete the following information):
Google Chrome
The custom environments page (https://github.com/aws/studio-lab-examples/tree/main/custom-environments) lists fastai as an example custom environment, but the fastai yml file is missing from the repo.
I received an email titled "Account request approved" with a link to complete the registration last Friday (02/11/2022), and I have not been able to create an account since.
I'm using the same email but continuously getting the error, "You haven’t been approved to create an account or your approval has expired. Please request an account."
I was told by an AWS service representative to report the issue here.
Hello
I registered my account request 7 days ago but have not yet received an email
Please check it
My email: [email protected]
Thanks
I signed up on 2/2/2022. So far my account has not been activated. Why?
Describe the bug
After running %conda install opencv-python or %pip install opencv-python in a SageMaker Studio Lab notebook with the default Python kernel selected, import cv2 does not work.
To Reproduce
import cv2
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
/tmp/ipykernel_50/571303353.py in <module>
----> 1 import cv2
~/.conda/envs/default/lib/python3.9/site-packages/cv2/__init__.py in <module>
3 import sys
4
----> 5 from .cv2 import *
6 from .data import *
7
ImportError: libgthread-2.0.so.0: cannot open shared object file: No such file or directory
Expected behavior
With root permission, sudo apt-get install libglib2.0-dev might solve the problem.
Loading libglib-2.0.so.0 and libgthread-2.0.so.0 manually with ctypes and importlib might also solve it (Azure/azure-functions-python-worker#497 (comment)), but that approach is not elegant.
Screenshots
Desktop (please complete the following information):
OS: Linux default 4.14.252-195.483.amzn2.x86_64 #1 SMP Mon Nov 1 20:58:46 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Python: 3.9.7
conda: 4.10.3
Additional context
A similar question: aws/sagemaker-python-sdk#1546 (comment)
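Before trying workarounds, it can help to confirm which shared libraries the dynamic loader can actually open from inside the kernel. The sketch below uses `ctypes.CDLL` for that check; the helper name `can_load` is hypothetical, and the check only diagnoses the problem, it does not install anything.

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic loader can open `library_name`."""
    try:
        ctypes.CDLL(library_name)
        return True
    except OSError:
        return False


# The two libraries named in the traceback and in the proposed workaround.
for name in ("libglib-2.0.so.0", "libgthread-2.0.so.0"):
    print(name, "loadable" if can_load(name) else "missing")
```

If a library is reported as missing here, `import cv2` will keep failing until that library is made available to the loader by some means that does not need root.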
Describe the bug
Amazon SageMaker Studio Lab is not opening the Jupyter notebook. It loads indefinitely at "Preparing project runtime", and after that I get: "There was a problem when starting the project runtime. This should be resolved shortly. Please try again later."
It has been almost a week and it still has not been resolved. I tried switching the runtime from CPU to GPU, but the issue persists. Any help would be appreciated.
as the title suggests
Describe the bug
Studio Lab allows conda install librosa; however, librosa still needs sndfile to function. Online tutorials suggest using apt-get install to get it, but I am unable to use apt-get to install this library (I am not root).
Are there any suggestions for making it work?
To Reproduce
Steps to reproduce the behavior:
pip install librosa
import librosa
First of all, thank you for studio-lab, it is an amazing resource and generally works very smoothly!
Describe the bug
JupyterLab extension for Dask is a fantastic resource that helps with diagnostics for parallel computations. Unfortunately it seems the dashboard is nonfunctional due to SageMaker network settings as detailed in this existing issue dask/dask#5432.
To Reproduce
conda activate studiolab
conda install -c conda-forge dask distributed dask-labextension
from dask.distributed import Client
client = Client()
client
https://SESSSIONIDHERE.studio.us-east-2.sagemaker.aws/studiolab/default/jupyter/proxy/8787/status
Expected behavior
Dashboard diagnostic plots should appear below the active toolbar on the proxied webpage or via the labextension plots.
See additional screenshots in linked issue (dask/dask#5432)
Desktop (please complete the following information):
Additional context
There is a recent blog post detailing how to do a custom setup, but it would be fantastic if this just worked out of the box with studiolab. https://aws.amazon.com/blogs/machine-learning/machine-learning-on-distributed-dask-using-amazon-sagemaker-and-aws-fargate/
Describe the bug
Cannot download big files from Studio Lab.
To Reproduce
Download a file bigger than 250 MB from the Studio Lab JupyterLab folder.
Expected behavior
Big files should download just as they do in similar environments like Colab and Kaggle.
Even when trying curl I get
curl https://foo.studio.us-east-2.sagemaker.aws/studiolab/default/jupyter/files/jigsaw/jg-model.bin?_xsrf=2%7C5be63d77%11%7C1639414836
Invalid or Expired Auth Token. Request a new presigned URL to continue using SageMaker.
UPDATE
It sometimes does download but mostly when I try saving the big files it gives a download error
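One possible workaround, offered only as a sketch, is to split a large file into smaller parts, download the parts one at a time, and rejoin them locally. The helper names `split_file` and `join_files` and the 100 MB default are assumptions for illustration; nothing in the report confirms a specific size limit.

```python
from pathlib import Path


def split_file(path, chunk_bytes=100 * 1024 * 1024):
    """Write `path` out as numbered parts of at most `chunk_bytes` each."""
    source = Path(path)
    parts = []
    with source.open("rb") as handle:
        index = 0
        while True:
            chunk = handle.read(chunk_bytes)
            if not chunk:
                break
            part = source.with_name(f"{source.name}.part{index:03d}")
            part.write_bytes(chunk)
            parts.append(part)
            index += 1
    return parts


def join_files(parts, destination):
    """Concatenate parts (in order) back into one file."""
    with open(destination, "wb") as out:
        for part in parts:
            out.write(Path(part).read_bytes())
```

Note that the parts take as much disk space as the original file, so this only helps when there is enough free storage to hold both.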
I received an email from the SageMaker Studio Lab team a couple of days ago saying that I was approved for an account. I tried multiple times to sign up, but it says the information is incorrect or the approval has expired. I am using the same email I used to request a free account. I tried to sign up the day I received the approval and again today, and I still have no luck. I don't know where else to turn because there is no help form.
Describe the bug
I ran out of storage when I downloaded files, and since then I cannot start a kernel. When I try to delete the files I downloaded, I get "Delete Failed unhandled error". Is there a way to delete files?
Desktop (please complete the following information):
Describe the bug
After I created a new conda env on studiolab and installed the necessary modules for my project I am now stuck at the following error message
import cv2
File "/opt/conda/envs/pytorch-py3.6/lib/python3.6/site-packages/cv2/__init__.py", line 3, in <module>
from .cv2 import *
ImportError: libgthread-2.0.so.0: cannot open shared object file: No such file or directory
After looking this error up, the only solution seems to be
apt-get update
apt-get install libglib2.0-0
However I do not have the permissions on studiolab for running these commands.
How can I solve this issue, is there another way?
It would be great to have git-lfs integration, to be able to version large models for example.
Describe the bug
The lab reports an error and is unable to connect to my project runtime.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
I expect I can load and start the project runtime on the web.
Is your feature request related to a problem? Please describe.
I am trying to experiment with vowpal wabbit, more specifically a derived version of your SageMaker RL examples. I tried to launch a docker container, but docker and docker-compose are not available. So I tried to execute the training files in the notebook itself, but vowpal wabbit is not available.
Describe the solution you'd like
Either support docker & docker-compose to play with provided examples in studio labs, or at least allow setting up vowpal wabbit locally.
Describe alternatives you've considered
Enable docker and docker-compose as a workaround, so that the provided RL examples that use docker-compose for local training can be executed.
@EmilyWebber I am not able to install the following libraries (for opencv) due to permission issues. How can I install them?
!apt update
!apt-get install -y libglib2.0-0 libsm6 libxrender1 libxext6
There is no space left and I cannot start the project. How can I solve this problem?
My account is [email protected]
Describe the bug
Stuck in 'Preparing project runtime..' for hours, no errors or extra information were reported in the browser.
To Reproduce
Steps to reproduce the behavior:
Not sure the following steps will reproduce, but I think they are of high probability.
Expected behavior
The environment should be ready within minutes.
Desktop (please complete the following information):
Additional context
During upgrading, I noticed the package 'notebook' was updated and prompted a new version of conda.
I am trying to connect to MongoDB Atlas. Are certain ports blocked?
Steps to reproduce the behavior:
from pymongo import MongoClient
MONGO_URL = "xxx"
client = MongoClient(MONGO_URL)
client.db.find({})
Results in ServerSelectionTimeoutError on port 27017
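To separate a blocked port from a driver or URI problem, a plain TCP connection test can be run from the notebook. This is a minimal sketch; the helper name `can_connect` is hypothetical, and the host to test would be one of the cluster host names from your own Atlas connection string, which is not shown in the report.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `can_connect(your_cluster_host, 27017)` and getting False while the same call succeeds from another network would suggest that outbound traffic on that port is being filtered, or that the Atlas IP access list does not include the Studio Lab address.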
Is your feature request related to a problem? Please describe.
The default CUDA version is 11.2, but there is no matching PyTorch version for CUDA 11.2; 11.1 or 11.3 might work.
Describe the solution you'd like
When installing apex, I run into a version mismatch between PyTorch and cudatoolkit.
Describe alternatives you've considered
Could you provide different versions of CUDA, or preinstall the popular versions, such as 10.2, 11.1, and 11.3?
Describe the bug
When plot is set to True in CatBoost, the widget does not show and reports: Error displaying widget: model not found
To Reproduce
Steps to reproduce the behavior:
Expected behavior
The plot should be displayed.
Desktop (please complete the following information):
Additional context
ipywidgets is installed, CatBoost is up to date, and node.js is installed in both the standard and default environments, as is catboost-widget (npm i catboost-widget). Importing ipywidgets does not affect the problem.
Describe the bug
I can't save, create, open, or delete any file on Studio Lab. While I was migrating around 10 GB of data from AWS, an error occurred midway (it said no storage was left), which stopped the data download. Since then I keep getting errors whenever I try to save, create, open, or delete files.
To Reproduce
I am not sure how to reproduce the bug. I was able to download a dataset of around the same size, which I later deleted; then, when I tried to download this 10 GB dataset, I got errors.
Expected behavior
If there's no storage left, the data download stops and I should be able to delete files to free up storage instead of getting errors for any kind of file operations.
Desktop (please complete the following information):
Is your feature request related to a problem? Please describe.
Actually, no, there is no problem. Everything works, but it would be nice to have it.
Describe the solution you'd like
I would like to be able to install htop on my machine from SageMaker Studio Lab. When I try, I get permission denied.
Describe alternatives you've considered
I worked around this issue by running conda install htop, and it worked. But it would be good to be able to install via sudo apt install htop, as it is very likely I will need to install some other packages in the future. Is it possible?
Additional context
No additional context. Just thank you very much for this initiative, it is very cool!
The FAQ says there is 15 GB of space for us, but I can't even upload or download a 900 MB dataset.
When I upload or download it, it returns:
Unexpected error while saving file: sagemaker-studiolab-notebooks/preprocessed20152019.zip [Errno 2]
No such file or directory:
'/home/studio-lab-user/sagemaker-studiolab-notebooks/.~preprocessed20152019.zip'
-> '/home/studio-lab-user/sagemaker-studiolab-notebooks/preprocessed20152019.zip'
Sometimes it returns "no space".
Is your feature request related to a problem? Please describe.
It's both a feature request and a problem:
I would like to install this jupyterlab extension https://github.com/jupyter-server/jupyter-resource-usage but it needs node.js and npm and I could not find a way to install them in SageMaker Studio Lab since I cannot get root privileges
Describe the solution you'd like
I think that pre-installing node.js and npm would be a good idea (they are required by several JupyterLab plugins).
Describe alternatives you've considered
Allow sudo
Additional context
This is not related to the above, but it's annoying: I also had issues when I tried to download some files from the notebook. When the file is large (600 MB in my case), the system gets stuck and the only way out is to close everything, browser included, and restart from scratch. Small files can be downloaded without any issue. Apart from that, the first impression is absolutely amazing!
As title, with GPU dependencies installed. Thanks
Describe the bug
Starting the Kernel, the browser shows:
Error Starting Kernel
[Errno 28] No space left on device: '/home/studio-lab-user/.local/share/jupyter/runtime/kernel-b45de32e-b211-4e8e-b7b0-ccc1914268af.json'
Expected behavior
Work without problems
Additional context
Is it possible to release memory?
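The "[Errno 28] No space left on device" message refers to disk storage rather than RAM, so the first step is finding out what is filling the home directory. The sketch below is a minimal example using only the standard library; `directory_size` and `largest_entries` are hypothetical helper names, and it assumes a terminal or kernel can still be started.

```python
import os
import shutil


def directory_size(path):
    """Sum the sizes of regular files under `path`, skipping unreadable entries."""
    total = 0
    for root, _dirs, files in os.walk(path, onerror=lambda err: None):
        for name in files:
            full = os.path.join(root, name)
            try:
                if not os.path.islink(full):
                    total += os.path.getsize(full)
            except OSError:
                pass
    return total


def largest_entries(base, top=10):
    """Return (size_in_bytes, path) for the biggest immediate children of `base`."""
    sizes = []
    for entry in os.scandir(base):
        if entry.is_dir(follow_symlinks=False):
            sizes.append((directory_size(entry.path), entry.path))
        elif entry.is_file(follow_symlinks=False):
            sizes.append((entry.stat(follow_symlinks=False).st_size, entry.path))
    return sorted(sizes, reverse=True)[:top]


def free_bytes(path):
    """Free space on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free
```

Calling `largest_entries(os.path.expanduser("~"))` also lists hidden directories, which is often where package caches and old conda environments accumulate.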
Is your feature request related to a problem? Please describe.
I cannot install packages that are 3.8 compatible. Currently Python kernel is 3.9 version.
Describe the solution you'd like
Will kernels with different Python versions be supported? Python 3.8 would be great.
Additional context
What about Scala and R support?
Apologies in advance if this repository is not the right place for a request like this, and many thanks for Sagemaker Studio Lab!
I usually retrieve data via git-annex, which is a very convenient way to install datasets and retrieve portions of it on demand. It allows me to install huge datasets, often many TB large, often directly by cloning a GitHub repository, but only retrieve individual files or drop data that I have processed already. I use it as part of the datalad package, which allows me to do the data retrieval in a python session as part of my scripts.
Basic file retrieval with git-annex fails so far:
(studiolab) studio-lab-user@default:~/datatest/machinelearning-books$ git annex get B.Efron_T.Hastie-Computer_Age_Statistical_Inference.pdf
get B.Efron_T.Hastie-Computer_Age_Statistical_Inference.pdf (from web...)
https://web.stanford.edu/~hastie/CASI_files/PDF/casi.pdf download failed: ConnectionFailure Network.BSD.getProtocolByName: does not exist (no such protocol name: tcp)
downloading from all 1 known url(s) failed
Unable to access these remotes: web
Maybe add some of these git remotes (git remote add ...):
d5f231e1-6901-456a-9398-39299242baf6 -- mih@meiner:/tmp/machinelearning-books
(Note that these git remotes have annex-ignore set: origin)
failed
get: 1 failed
The cause of the failure lies in ConnectionFailure Network.BSD.getProtocolByName: does not exist (no such protocol name: tcp); I believe this is because netbase isn't installed and /etc/protocols thus doesn't exist.
(studiolab) studio-lab-user@default:~/datatest/machinelearning-books$ apt-cache policy netbase
netbase:
Installed: (none)
Candidate: (none)
Version table:
Is there a way to have it installed, or a solution I have missed in the documentation so far? Thanks in advance!
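The hypothesis about /etc/protocols can be checked from Python without git-annex. The sketch below is a small diagnostic under that assumption; `protocol_lookup_works` is a hypothetical helper that calls the same kind of protocol-name lookup through the C library.

```python
import os
import socket


def protocol_lookup_works(name="tcp"):
    """Return True if the C library can resolve a protocol name to its number."""
    try:
        socket.getprotobyname(name)
        return True
    except OSError:
        return False


print("/etc/protocols present:", os.path.exists("/etc/protocols"))
print("tcp resolvable:", protocol_lookup_works("tcp"))
```

If both lines report False, the missing protocol database would be consistent with the getProtocolByName failure shown above.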
Is your feature request related to a problem? Please describe.
When I study machine learning from a GitHub repository that contains multiple .ipynb files, I have to open the files one by one because the Studio Lab link only allows opening individual files.
Describe the solution you'd like
Describe alternatives you've considered
Adding the Studio Lab button to each Jupyter notebook. But that is redundant work for the repository maintainer.
Additional context
This feature will be necessary to add Studio Lab to Jupyter Book, which is used to write various machine learning books. (I tried to add a Studio Lab link to Jupyter Book and found a discussion that suggests the necessity of the above feature.)
Request Tensorflow custom env example yml
Is your feature request related to a problem? Please describe.
Vim is a go-to IDE option for quickly parsing through files & making edits. However, the default sagemaker environment does not include it.
Describe the solution you'd like
An option to use sudo (right now it gives a permission denied error), or vim / neovim pre-installed in the default environment.
Describe alternatives you've considered
Working with .py and other text files using the Jupyter notebook interface
Hello the wonderful team at Sagemaker Studio Lab! Thank you for making this free version available for the Data Science community!
There are about 20 of us in a General Assembly data science class and we would love to use Sagemaker Studio Lab for the class if we can get our free account requests approved before next Wednesday 2/23.
Here are the names of some of my classmates, we realize you might have a huge backlog. Let us know if this is possible. Thanks!
Hi everybody,
I am trying to use AWS built-in algorithms in Sagemaker Studio Lab. For that I need an execution role and region etc.
When I try to run my code it outputs
ValueError: Must setup local AWS configuration with a region supported by SageMaker.
Is it even possible to access AWS resources from Studio Lab?
Many thanks in advance!
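The ValueError above is raised when the SDK cannot determine a region. One way to supply it, sketched here, is the `AWS_DEFAULT_REGION` environment variable, which boto3 reads when creating a session. The helper name `ensure_region` and the example region are assumptions; this does not provide credentials or an execution role, which would still have to come from a separate AWS account.

```python
import os


def ensure_region(default_region, env=None):
    """Set AWS_DEFAULT_REGION if it is not already set, and return its value."""
    env = os.environ if env is None else env
    env.setdefault("AWS_DEFAULT_REGION", default_region)
    return env["AWS_DEFAULT_REGION"]
```

Calling `ensure_region("us-east-2")` before creating a SageMaker session sets the region for the current kernel only; an existing value is left untouched.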
Describe the bug
Calling tensorboard from notebook results in timeout.
To Reproduce
Steps to reproduce the behavior:
%conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch
! mkdir -p output
%load_ext tensorboard
%tensorboard --logdir output
Expected behavior
The very same notebook on local installation and on Colab shows empty Tensorboard.
Screenshots
Desktop (please complete the following information):
Additional context
Problem:
Hi, I'm trying to install opencv-python, but it fails due to the lack of libgthread-2.0.so.0.
I got the following error:
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
/tmp/ipykernel_324/571303353.py in <module>
----> 1 import cv2
~/.conda/envs/default/lib/python3.9/site-packages/cv2/__init__.py in <module>
6 import sys
7
----> 8 from .cv2 import *
9 from .cv2 import _registerMatType
10 from . import mat_wrapper
ImportError: libgthread-2.0.so.0: cannot open shared object file: No such file or directory
Solution:
Is it possible to install libgthread-2.0.so.0 in the following way?
apt-get update -y
apt-get install libglib2.0-0
BTW, would it be possible to install some other common applications, such as vim?
Describe the bug
The environment does not boot up for me (it used to)
To Reproduce
Steps to reproduce the behavior:
Expected behavior
A web version of Jupyter notebook
Desktop (please complete the following information):
Additional context
1. The account exists and I was using it with CPU and GPU previously. It was great (for a few days, but great nonetheless).
2. I was installing and removing many Python packages with both pip and conda commands, with some, but not much, disk space left.
3. One time while I was using the GPU environment, I pressed stop (this used to work like a charm before).
4. When I wanted to return to the project, my account page (https://studiolab.sagemaker.aws/users/gkalman) looked fine.
5. Now, after pressing the "Start runtime" button, it says "Preparing project runtime..." for about ten minutes and then stops.
6. It shows the following error: "There was a problem when starting the project runtime. This should be resolved shortly. Please try again later."
7. I have now tried (5) a dozen or more times over the last three days since it happened, with both CPU and GPU. The result is always the same (6).
I get an out-of-space error even though I have deleted the files multiple times. Can you tell me how to permanently delete the files?
Describe the bug
I'm constantly getting an error saying the kernel is restarting automatically. And it is happening at the same code block in the entire notebook.
To Reproduce
Steps to reproduce the behaviour:
Code block where the error is occurring:
X_train = sequence.pad_sequences(features_train.todense(), maxlen=100)
X_test = sequence.pad_sequences(features_test.todense(), maxlen=100)
Prior Code:
#%pip install keras
#%pip install tensorflow
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from keras.layers.embeddings import Embedding
from keras.preprocessing import sequence
Prior to that, I did some data cleaning, stop-word removal, lemmatizing, etc.
Then I created a TF-IDF matrix using ngram_range=(1,2).
Then I try to do the classification with different algorithms.
Naive Bayes executes well.
MLP also executes well.
Then I try an LSTM (RNN).
That is where I get the error.
Expected behaviour
It should execute well, without any error
Desktop (please complete the following information):
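A plausible cause, though not confirmed by the report, is that `.todense()` on a TF-IDF matrix with unigram and bigram features needs more memory than the runtime has, so the kernel is killed and restarts. A quick estimate makes this concrete. The helper `dense_size_bytes` is hypothetical, the shape is an invented example, and 8 bytes per value assumes 64-bit floats.

```python
def dense_size_bytes(n_rows, n_cols, bytes_per_value=8):
    """Memory needed to hold an n_rows x n_cols matrix densely."""
    return n_rows * n_cols * bytes_per_value


# Hypothetical shape: 50,000 documents and 200,000 unigram+bigram features.
print(dense_size_bytes(50_000, 200_000) / 1e9, "GB")  # 80.0 GB
```

Plugging in the real `features_train.shape` shows whether the dense copy can fit; if it cannot, limiting the vocabulary size or keeping the matrix sparse avoids the allocation.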
Describe the bug
After installing ipykernel, I still can't see the conda environment appear in the Jupyter notebook or in the launcher.
To Reproduce
Steps to reproduce the behavior:
I'm trying to set up torchaudio/fastaudio.
When I do:
import torchaudio
I get:
OSError: sndfile library not found
However, when I ran conda install -c conda-forge librosa,
libsndfile conda-forge/linux-64::libsndfile-1.0.31-h9c3ff4c_1
was one of the packages.
When I try apt update and apt-get install libsndfile1-dev, I get
Reading package lists... Done
E: List directory /var/lib/apt/lists/partial is missing. - Acquire (13: Permission denied)
and
E: Could not open lock file /var/lib/dpkg/lock-frontend - open (13: Permission denied)
E: Unable to acquire the dpkg frontend lock (/var/lib/dpkg/lock-frontend), are you root?
respectively.
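Since conda reported installing libsndfile, one thing worth checking is whether the library landed in the environment that the notebook kernel is actually running. The sketch below lists matching shared objects in that environment's lib directory; `find_shared_libraries` is a hypothetical helper, and the `lib/` layout is the usual conda convention on Linux rather than something stated in the report.

```python
import glob
import os
import sys


def find_shared_libraries(stem, prefix=sys.prefix):
    """List files named lib<stem>.so* in the environment's lib directory."""
    pattern = os.path.join(prefix, "lib", f"lib{stem}.so*")
    return sorted(glob.glob(pattern))


print(sys.prefix)
print(find_shared_libraries("sndfile"))
```

An empty list would suggest that the conda install went into a different environment than the one the kernel uses.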
Is your feature request related to a problem? Please describe.
After attempting to free up space using rm -rf .* I also deleted files required by conda. I've partially recovered the situation, but the machine is clearly a bit of a mess now, and I have this issue: jupyter/notebook#5321. I would like to do a hard reset to a clean machine, but this doesn't appear to be possible. I'm facing the possibility of having to create a new account just to get a clean machine.
Describe the solution you'd like
A menu option to reset/clean the machine. While at it, some UI to see how the 15 GB are used would also be useful.
Describe alternatives you've considered
Na
Additional context
Na
Is your feature request related to a problem? Please describe.
Hello!
I tried running the code from this repository, but the script failed because it depends on tmux. As I understand it, there is currently no way to install tmux myself since I don't have root access. Since they expose a docker container, I thought about running through that, but had no success doing so. Is docker available?
Describe the solution you'd like
The ability to run docker commands through the terminal. Perhaps even integration with the docker desktop ui?
Describe alternatives you've considered
I tried installing docker myself, or the dependencies the repository needed. Both failed
Are there any plans on adding Julia support? That will be great!