milatools's Introduction

milatools

The milatools package provides the mila command, which is meant to help with connecting to and interacting with the Mila cluster.


Warning

The mila command is meant to be used on your local machine. Trying to run it on the cluster will fail with an error.


Install

Requires Python >= 3.8

pip install milatools

Or, for the bleeding-edge version:

pip install git+https://github.com/mila-iqia/milatools.git

After installing milatools, start with mila init:

mila init

Commands

mila init

Set up your access to the Mila cluster interactively. Have your username and password ready!

  • Set up your SSH config for easy connection with ssh mila
  • Set up your public key if you don't already have one
  • Copy your public key over to the cluster for passwordless auth
  • Set up a public key on the login node to enable ssh into compute nodes
  • new: Add a special SSH config for direct connection to a compute node with ssh mila-cpu

mila docs/intranet

  • Use mila docs <search terms> to search the Mila technical documentation
  • Use mila intranet <search terms> to search the Mila intranet

Both commands open a browser window. If no search terms are given you are taken to the home page.

mila code

Connect a VSCode instance to a compute node. mila code first allocates a compute node using Slurm (you can pass Slurm options with --alloc), then calls the code command with the appropriate options to start a remote coding session on the allocated node.

Press Ctrl+C in the terminal to end the session.

usage: mila code [-h] [--alloc ...] [--job VALUE] [--node VALUE] PATH

positional arguments:
  PATH          Path to open on the remote machine

optional arguments:
  -h, --help    show this help message and exit
  --alloc ...   Extra options to pass to slurm
  --job VALUE   Job ID to connect to
  --node VALUE  Node to connect to

For example:

mila code path/to/my/experiment

The --alloc option may be used to pass extra arguments to salloc when allocating a node (for example, --alloc --cpus-per-task=8 to allocate 8 CPUs). --alloc should come last, because it takes all of the arguments that follow it.
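
To illustrate why --alloc has to come last, here is a minimal sketch of how a command line could be split at the first --alloc. It is an illustration only, not milatools' actual argument parser, and the name split_alloc is hypothetical.

```python
from typing import List, Tuple


def split_alloc(argv: List[str]) -> Tuple[List[str], List[str]]:
    """Split argv at the first --alloc (hypothetical helper, not milatools code)."""
    if "--alloc" not in argv:
        # No --alloc given: nothing is forwarded to salloc.
        return list(argv), []
    index = argv.index("--alloc")
    # Everything after --alloc is forwarded, including anything that
    # looks like an option of the command itself.
    return argv[:index], argv[index + 1:]


own, slurm = split_alloc(
    ["path/to/my/experiment", "--alloc", "--gres=gpu:1", "--mem=16G"]
)
print(own)    # ['path/to/my/experiment']
print(slurm)  # ['--gres=gpu:1', '--mem=16G']
```

With this kind of split, a PATH written after --alloc would end up in the list forwarded to salloc rather than being treated as the path to open.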

If you already have an allocation on a compute node, you may use the --node NODENAME or --job JOBID options to connect to that node.

mila serve

The purpose of mila serve is to make it easier to start notebooks, logging servers, etc. on the compute nodes and connect to them.

usage: mila serve [-h] {connect,kill,list,lab,notebook,tensorboard,mlflow,aim} ...

positional arguments:
  {connect,kill,list,lab,notebook,tensorboard,mlflow,aim}
    connect             Reconnect to a persistent server.
    kill                Kill a persistent server.
    list                List active servers.
    lab                 Start a Jupyterlab server.
    notebook            Start a Jupyter Notebook server.
    tensorboard         Start a Tensorboard server.
    mlflow              Start an MLFlow server.
    aim                 Start an AIM server.

optional arguments:
  -h, --help            show this help message and exit

For example, to start JupyterLab with one GPU, you may write:

mila serve lab --alloc --gres gpu:1

You can of course write any SLURM arguments after --alloc.

Ending the connection will end the server, but the --persist flag can be used to prevent that. In that case you would be able to write mila serve connect jupyter-lab in order to reconnect to your running instance. Use mila serve list and mila serve kill to view and manage any running instances.

milatools's People

Contributors

breuleux, delaunay, gandatchabana, lebrice, manuel-delverme, patrickmineault, satyaog

milatools's Issues

Feature request: Set "remote.SSH.connectTimeout" in local VsCode settings to fix timeout issues with `mila code`.

When using mila code, it often happens that the window will "time out" several times before the connection actually works.
This problem is exacerbated when there is a lot of load on the login nodes, or the $HOME filesystem.

Adding this entry in the local ~/.config/Code/User/settings.json makes these timeout issues a lot less frequent:

{
    "remote.SSH.connectTimeout": 60
}

Kudos to @busycalibrating for finding this solution.

Perhaps we could add this through mila init?
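
As a rough idea of what such a step could look like, here is a minimal sketch that merges the entry into an existing settings file. The function name set_connect_timeout is hypothetical, and the sketch assumes the file is plain JSON; VSCode also accepts comments in settings.json, which json.loads would reject.

```python
import json
import tempfile
from pathlib import Path


def set_connect_timeout(settings_path: Path, timeout: int = 60) -> dict:
    """Set remote.SSH.connectTimeout in a VSCode settings file (hypothetical helper)."""
    settings = {}
    if settings_path.exists():
        text = settings_path.read_text()
        if text.strip():
            # Assumption: plain JSON without comments or trailing commas.
            settings = json.loads(text)
    settings["remote.SSH.connectTimeout"] = timeout
    settings_path.parent.mkdir(parents=True, exist_ok=True)
    settings_path.write_text(json.dumps(settings, indent=4) + "\n")
    return settings


# Demonstration on a temporary file, so that no real settings are touched.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "settings.json"
    path.write_text('{"editor.fontSize": 14}')
    print(set_connect_timeout(path))
```

On Linux the real file would be ~/.config/Code/User/settings.json, as mentioned above; the location differs on other platforms.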

[v0.0.17] Issue running the command `mila init`

I tried to update milatools to be able to use ssh mila-cpu, following the instructions posted in the Slack channel. First I ran pip install -U milatools and got the following output:

Requirement already satisfied: milatools in ./miniconda3/lib/python3.9/site-packages (0.0.17)
Requirement already satisfied: blessed<2.0.0,>=1.18.1 in ./miniconda3/lib/python3.9/site-packages (from milatools) (1.19.1)
Requirement already satisfied: questionary<2.0.0,>=1.10.0 in ./miniconda3/lib/python3.9/site-packages (from milatools) (1.10.0)
Requirement already satisfied: sshconf<0.3.0,>=0.2.2 in ./miniconda3/lib/python3.9/site-packages (from milatools) (0.2.3)
Requirement already satisfied: Fabric<3.0.0,>=2.7.0 in ./miniconda3/lib/python3.9/site-packages (from milatools) (2.7.1)
Requirement already satisfied: coleo<0.4.0,>=0.3.0 in ./miniconda3/lib/python3.9/site-packages (from milatools) (0.3.2)
Requirement already satisfied: wcwidth>=0.1.4 in ./miniconda3/lib/python3.9/site-packages (from blessed<2.0.0,>=1.18.1->milatools) (0.2.5)
Requirement already satisfied: six>=1.9.0 in ./miniconda3/lib/python3.9/site-packages (from blessed<2.0.0,>=1.18.1->milatools) (1.16.0)
Requirement already satisfied: ptera<2.0.0,>=1.4.1 in ./miniconda3/lib/python3.9/site-packages (from coleo<0.4.0,>=0.3.0->milatools) (1.4.1)
Requirement already satisfied: paramiko>=2.4 in ./miniconda3/lib/python3.9/site-packages (from Fabric<3.0.0,>=2.7.0->milatools) (2.11.0)
Requirement already satisfied: invoke<2.0,>=1.3 in ./miniconda3/lib/python3.9/site-packages (from Fabric<3.0.0,>=2.7.0->milatools) (1.7.1)
Requirement already satisfied: pathlib2 in ./miniconda3/lib/python3.9/site-packages (from Fabric<3.0.0,>=2.7.0->milatools) (2.3.7.post1)
Requirement already satisfied: bcrypt>=3.1.3 in ./miniconda3/lib/python3.9/site-packages (from paramiko>=2.4->Fabric<3.0.0,>=2.7.0->milatools) (3.2.2)
Requirement already satisfied: pynacl>=1.0.1 in ./miniconda3/lib/python3.9/site-packages (from paramiko>=2.4->Fabric<3.0.0,>=2.7.0->milatools) (1.5.0)
Requirement already satisfied: cryptography>=2.5 in ./miniconda3/lib/python3.9/site-packages (from paramiko>=2.4->Fabric<3.0.0,>=2.7.0->milatools) (3.4.7)
Requirement already satisfied: cffi>=1.1 in ./miniconda3/lib/python3.9/site-packages (from bcrypt>=3.1.3->paramiko>=2.4->Fabric<3.0.0,>=2.7.0->milatools) (1.14.6)
Requirement already satisfied: pycparser in ./miniconda3/lib/python3.9/site-packages (from cffi>=1.1->bcrypt>=3.1.3->paramiko>=2.4->Fabric<3.0.0,>=2.7.0->milatools) (2.20)
Requirement already satisfied: giving<0.5.0,>=0.4.1 in ./miniconda3/lib/python3.9/site-packages (from ptera<2.0.0,>=1.4.1->coleo<0.4.0,>=0.3.0->milatools) (0.4.2)
Requirement already satisfied: codefind<0.2.0,>=0.1.2 in ./miniconda3/lib/python3.9/site-packages (from ptera<2.0.0,>=1.4.1->coleo<0.4.0,>=0.3.0->milatools) (0.1.3)
Requirement already satisfied: varname<0.11.0,>=0.10.0 in ./miniconda3/lib/python3.9/site-packages (from giving<0.5.0,>=0.4.1->ptera<2.0.0,>=1.4.1->coleo<0.4.0,>=0.3.0->milatools) (0.10.0)
Requirement already satisfied: asttokens<3.0.0,>=2.2.1 in ./miniconda3/lib/python3.9/site-packages (from giving<0.5.0,>=0.4.1->ptera<2.0.0,>=1.4.1->coleo<0.4.0,>=0.3.0->milatools) (2.2.1)
Requirement already satisfied: reactivex<5.0.0,>=4.0.0 in ./miniconda3/lib/python3.9/site-packages (from giving<0.5.0,>=0.4.1->ptera<2.0.0,>=1.4.1->coleo<0.4.0,>=0.3.0->milatools) (4.0.4)
Requirement already satisfied: prompt_toolkit<4.0,>=2.0 in ./miniconda3/lib/python3.9/site-packages (from questionary<2.0.0,>=1.10.0->milatools) (3.0.30)
Requirement already satisfied: typing-extensions<5.0.0,>=4.1.1 in ./miniconda3/lib/python3.9/site-packages (from reactivex<5.0.0,>=4.0.0->giving<0.5.0,>=0.4.1->ptera<2.0.0,>=1.4.1->coleo<0.4.0,>=0.3.0->milatools) (4.3.0)
Requirement already satisfied: executing<2.0,>=1.1 in ./miniconda3/lib/python3.9/site-packages (from varname<0.11.0,>=0.10.0->giving<0.5.0,>=0.4.1->ptera<2.0.0,>=1.4.1->coleo<0.4.0,>=0.3.0->milatools) (1.2.0)

Then I tried running mila init and got the following error:

Checking ssh config
? The '*.server.mila.quebec' entry in ~/.ssh/config is too general and should exclude login.server.mila.quebec. Fix this? Yes
Traceback (most recent call last):
  File "/Users/amin/miniconda3/lib/python3.9/site-packages/milatools/cli/commands.py", line 43, in main
    auto_cli(milatools)
  File "/Users/amin/miniconda3/lib/python3.9/site-packages/coleo/cli.py", line 656, in auto_cli
    result = run_cli(entry, args, **kwargs)
  File "/Users/amin/miniconda3/lib/python3.9/site-packages/coleo/cli.py", line 628, in run_cli
    return call(opts=opts, args=args)
  File "/Users/amin/miniconda3/lib/python3.9/site-packages/coleo/cli.py", line 587, in thunk
    result = fn(*args)
  File "/Users/amin/miniconda3/lib/python3.9/site-packages/milatools/cli/commands.py", line 129, in init
    setup_ssh_config()
  File "/Users/amin/miniconda3/lib/python3.9/site-packages/milatools/cli/init_command.py", line 76, in setup_ssh_config
    ssh_config.rename("*.server.mila.quebec", cnode_pattern)
  File "/Users/amin/miniconda3/lib/python3.9/site-packages/sshconf.py", line 459, in rename
    raise ValueError("Host %s: already exists." % new_host)
ValueError: Host *.server.mila.quebec !*login.server.mila.quebec: already exists.

An error occured during the execution of the command `init`. Please try updating milatools by running
  pip install milatools --upgrade
in the terminal. If the issue persists, consider filling a bug report at
  https://github.com/mila-iqia/milatools/issues/new?labels=init%2C0.0.17&template=bug_report.md&title=%5Bv0.0.17%5D+Issue+running+the+command+%60mila+init%60
Please provide the error traceback with the report (the red text above).

I tried pip install milatools --upgrade but got the following:

Requirement already satisfied: milatools in ./miniconda3/lib/python3.9/site-packages (0.0.17)
Requirement already satisfied: coleo<0.4.0,>=0.3.0 in ./miniconda3/lib/python3.9/site-packages (from milatools) (0.3.2)
Requirement already satisfied: blessed<2.0.0,>=1.18.1 in ./miniconda3/lib/python3.9/site-packages (from milatools) (1.19.1)
Requirement already satisfied: Fabric<3.0.0,>=2.7.0 in ./miniconda3/lib/python3.9/site-packages (from milatools) (2.7.1)
Requirement already satisfied: sshconf<0.3.0,>=0.2.2 in ./miniconda3/lib/python3.9/site-packages (from milatools) (0.2.3)
Requirement already satisfied: questionary<2.0.0,>=1.10.0 in ./miniconda3/lib/python3.9/site-packages (from milatools) (1.10.0)
Requirement already satisfied: six>=1.9.0 in ./miniconda3/lib/python3.9/site-packages (from blessed<2.0.0,>=1.18.1->milatools) (1.16.0)
Requirement already satisfied: wcwidth>=0.1.4 in ./miniconda3/lib/python3.9/site-packages (from blessed<2.0.0,>=1.18.1->milatools) (0.2.5)
Requirement already satisfied: ptera<2.0.0,>=1.4.1 in ./miniconda3/lib/python3.9/site-packages (from coleo<0.4.0,>=0.3.0->milatools) (1.4.1)
Requirement already satisfied: pathlib2 in ./miniconda3/lib/python3.9/site-packages (from Fabric<3.0.0,>=2.7.0->milatools) (2.3.7.post1)
Requirement already satisfied: paramiko>=2.4 in ./miniconda3/lib/python3.9/site-packages (from Fabric<3.0.0,>=2.7.0->milatools) (2.11.0)
Requirement already satisfied: invoke<2.0,>=1.3 in ./miniconda3/lib/python3.9/site-packages (from Fabric<3.0.0,>=2.7.0->milatools) (1.7.1)
Requirement already satisfied: bcrypt>=3.1.3 in ./miniconda3/lib/python3.9/site-packages (from paramiko>=2.4->Fabric<3.0.0,>=2.7.0->milatools) (3.2.2)
Requirement already satisfied: pynacl>=1.0.1 in ./miniconda3/lib/python3.9/site-packages (from paramiko>=2.4->Fabric<3.0.0,>=2.7.0->milatools) (1.5.0)
Requirement already satisfied: cryptography>=2.5 in ./miniconda3/lib/python3.9/site-packages (from paramiko>=2.4->Fabric<3.0.0,>=2.7.0->milatools) (3.4.7)
Requirement already satisfied: cffi>=1.1 in ./miniconda3/lib/python3.9/site-packages (from bcrypt>=3.1.3->paramiko>=2.4->Fabric<3.0.0,>=2.7.0->milatools) (1.14.6)
Requirement already satisfied: pycparser in ./miniconda3/lib/python3.9/site-packages (from cffi>=1.1->bcrypt>=3.1.3->paramiko>=2.4->Fabric<3.0.0,>=2.7.0->milatools) (2.20)
Requirement already satisfied: codefind<0.2.0,>=0.1.2 in ./miniconda3/lib/python3.9/site-packages (from ptera<2.0.0,>=1.4.1->coleo<0.4.0,>=0.3.0->milatools) (0.1.3)
Requirement already satisfied: giving<0.5.0,>=0.4.1 in ./miniconda3/lib/python3.9/site-packages (from ptera<2.0.0,>=1.4.1->coleo<0.4.0,>=0.3.0->milatools) (0.4.2)
Requirement already satisfied: reactivex<5.0.0,>=4.0.0 in ./miniconda3/lib/python3.9/site-packages (from giving<0.5.0,>=0.4.1->ptera<2.0.0,>=1.4.1->coleo<0.4.0,>=0.3.0->milatools) (4.0.4)
Requirement already satisfied: asttokens<3.0.0,>=2.2.1 in ./miniconda3/lib/python3.9/site-packages (from giving<0.5.0,>=0.4.1->ptera<2.0.0,>=1.4.1->coleo<0.4.0,>=0.3.0->milatools) (2.2.1)
Requirement already satisfied: varname<0.11.0,>=0.10.0 in ./miniconda3/lib/python3.9/site-packages (from giving<0.5.0,>=0.4.1->ptera<2.0.0,>=1.4.1->coleo<0.4.0,>=0.3.0->milatools) (0.10.0)
Requirement already satisfied: prompt_toolkit<4.0,>=2.0 in ./miniconda3/lib/python3.9/site-packages (from questionary<2.0.0,>=1.10.0->milatools) (3.0.30)
Requirement already satisfied: typing-extensions<5.0.0,>=4.1.1 in ./miniconda3/lib/python3.9/site-packages (from reactivex<5.0.0,>=4.0.0->giving<0.5.0,>=0.4.1->ptera<2.0.0,>=1.4.1->coleo<0.4.0,>=0.3.0->milatools) (4.3.0)
Requirement already satisfied: executing<2.0,>=1.1 in ./miniconda3/lib/python3.9/site-packages (from varname<0.11.0,>=0.10.0->giving<0.5.0,>=0.4.1->ptera<2.0.0,>=1.4.1->coleo<0.4.0,>=0.3.0->milatools) (1.2.0)

Running mila init again gave the same error, so the issue persists and I opened this report.
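
The traceback above shows rename raising ValueError because the target host entry already exists, which suggests the fix had already been applied on an earlier run. Below is a minimal sketch of a check that would avoid the collision. It works on a plain list of host names rather than on sshconf objects, and the function name and return values are hypothetical.

```python
from typing import List

OLD = "*.server.mila.quebec"
NEW = "*.server.mila.quebec !*login.server.mila.quebec"


def plan_rename(hosts: List[str], old: str, new: str) -> str:
    """Decide whether renaming old to new is safe (hypothetical helper)."""
    if new in hosts:
        # The target entry is already there: renaming onto it would collide.
        return "skip"
    if old in hosts:
        return "rename"
    # Neither entry exists, so there is nothing to rename.
    return "missing"


print(plan_rename(["mila", OLD, NEW], OLD, NEW))  # skip
print(plan_rename(["mila", OLD], OLD, NEW))       # rename
```

What to do with a leftover old entry in the "skip" case is a separate design choice that this sketch leaves open.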

Desktop:

  • OS: Mac OS 13.4, build version: 22F66

Thanks a lot for this new feature. I can't wait to use it once this issue is resolved!

SSH connection issue on mila init

The mila init command fails when attempting the SSH connection:

mila init                                 
Checking ssh config
There is no 'mila' entry in ~/.ssh/config. Create one? [Y/n] Y
What is your username?
> semih.canturk
The following code will be appended to your ~/.ssh/config:

Host mila
    HostName login.server.mila.quebec
    User semih.canturk
    PreferredAuthentications publickey,keyboard-interactive
    Port 2222
    ServerAliveInterval 120
    ServerAliveCountMax 5

Is this OK? [Y/n] Y
There is no '*.server.mila.quebec' entry in ~/.ssh/config. Create one? [Y/n] Y
The following code will be appended to your ~/.ssh/config:

Host *.server.mila.quebec
    HostName %h
    User semih.canturk
    ProxyJump mila

Is this OK? [Y/n] Y
Wrote ~/.ssh/config
# OK
Checking passwordless authentication
(local) $ ssh -oPreferredAuthentications=publickey mila 'echo OK'

ssh: connect to host login.server.mila.quebec port 2222: Operation timed out

Failed to connect to mila, could not understand error

OS: macOS Monterey (12.0.1)
Python 3.8.11
milatools: 0.0.5
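
The "Operation timed out" message means the TCP connection to port 2222 never completed, before any authentication took place. One way to check this independently of ssh is a plain socket test. This is a minimal sketch; can_reach is a hypothetical helper and not part of milatools.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and name resolution failures.
        return False


# Example call (requires network access):
#     can_reach("login.server.mila.quebec", 2222)
```

If this returns False, the problem most likely lies in the network path (firewall, VPN, DNS) rather than in the SSH keys or the generated config.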

mila code /full/path/to/remote/code does not work properly when launched from WSL Ubuntu on Windows 11

Make sure you can reproduce the issue with the latest version available

Steps to reproduce:

# Install wsl on win11
# run all commands from linux terminal:
pip install milatools --upgrade
mila init  # puts publickey on cluster
mila code /full/path/to/remote/codebase

> (default) jdv@lyndon:/home
> $ mila code /home/mila/v/vivianoj/code/gfneco
> (->) $ salloc
> salloc: --------------------------------------------------------------------------------------------------
> salloc: # Using default long-cpu partition (CPU-only)
> salloc: --------------------------------------------------------------------------------------------------
> salloc: Granted job allocation 2991990
> salloc: Waiting for resource configuration
> salloc: Nodes cn-f003 are ready for job
> (local) $ '/mnt/c/Users/Joseph Viviano/AppData/Local/Programs/Microsoft VS Code/bin/code' -nw --remote ssh-remote+cn-f003.server.mila.quebec /home/mila/v/vivianoj/code/gfneco

Describe the bug

At this point, VSCode launches successfully, but the instance is in a strange state:

Screenshot 2023-03-22 134246

A few things are notable here. The terminal instance is local. The folder is also local, and points to a folder that does not exist. VSCode is successfully connected to WSL: Ubuntu (bottom left corner). I am not sure whether launching mila code from the Windows side would fix this; I will try that next.

Desktop (please complete the following information):

  • OS: Windows 11 w/ WSL Ubuntu
(default) jdv@lyndon:/home
$ uname -a
Linux lyndon 5.15.90.1-microsoft-standard-WSL2 #1 SMP Fri Jan 27 02:56:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

Thank you :)
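
The uname output above contains microsoft-standard-WSL2, which is one way a tool could tell that it is running under WSL before deciding how to launch VSCode. A minimal sketch, assuming the kernel release string is a reliable indicator; is_wsl is a hypothetical helper and not part of milatools.

```python
import platform


def is_wsl(release: str = "") -> bool:
    """Guess from the kernel release string whether we are running under WSL."""
    # Fall back to the release of the current machine when none is given.
    release = release or platform.uname().release
    return "microsoft" in release.lower()


print(is_wsl("5.15.90.1-microsoft-standard-WSL2"))  # True
print(is_wsl("5.15.0-76-generic"))                  # False
```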

Error running VSCode from Windows terminal

VSCode fails with the following error when I run mila code . from a Windows terminal. There is no issue if I run it from a WSL Ubuntu terminal instead.

[15:32:33.704] Log Level: 2
[15:32:33.714] SSH Resolver called for "ssh-remote+cn-f002.server.mila.quebec", attempt 1
[15:32:33.714] "remote.SSH.useLocalServer": false
[15:32:33.714] "remote.SSH.useExecServer": false
[15:32:33.714] "remote.SSH.showLoginTerminal": false
[15:32:33.714] "remote.SSH.remotePlatform": {"cn-f001.server.mila.quebec":"linux","mila":"linux"}
[15:32:33.714] "remote.SSH.path": undefined
[15:32:33.714] "remote.SSH.configFile": undefined
[15:32:33.715] "remote.SSH.useFlock": true
[15:32:33.715] "remote.SSH.lockfilesInTmp": false
[15:32:33.715] "remote.SSH.localServerDownload": auto
[15:32:33.715] "remote.SSH.remoteServerListenOnSocket": false
[15:32:33.715] "remote.SSH.showLoginTerminal": false
[15:32:33.715] "remote.SSH.defaultExtensions": []
[15:32:33.715] "remote.SSH.loglevel": 2
[15:32:33.715] "remote.SSH.enableDynamicForwarding": true
[15:32:33.715] "remote.SSH.enableRemoteCommand": false
[15:32:33.715] "remote.SSH.serverPickPortsFromRange": {}
[15:32:33.715] "remote.SSH.serverInstallPath": {}
[15:32:33.720] VS Code version: 1.83.1
[15:32:33.720] Remote-SSH version: remote-ssh@0.106.5
[15:32:33.720] win32 x64
[15:32:33.721] SSH Resolver called for host: cn-f002.server.mila.quebec
[15:32:33.721] Setting up SSH remote "cn-f002.server.mila.quebec"
[15:32:33.724] Using commit id "f1b07bd25dfad64b0167beb15359ae573aecd2cc" and quality "stable" for server
[15:32:33.726] Install and start server if needed
[15:33:11.188] Checking ssh with "C:\Users\masht\anaconda3\ssh.exe -V"
[15:33:11.191] Got error from ssh: spawn C:\Users\masht\anaconda3\ssh.exe ENOENT
[15:33:11.191] Checking ssh with "C:\Users\masht\anaconda3\Library\mingw-w64\bin\ssh.exe -V"
[15:33:11.192] Got error from ssh: spawn C:\Users\masht\anaconda3\Library\mingw-w64\bin\ssh.exe ENOENT
[15:33:11.192] Checking ssh with "C:\Users\masht\anaconda3\Library\usr\bin\ssh.exe -V"
[15:33:11.221] > OpenSSH_8.0p1, OpenSSL 1.1.1c  28 May 2019

[15:33:11.223] Running script with connection command: "C:\Users\masht\anaconda3\Library\usr\bin\ssh.exe" -T -D 64557 "cn-f002.server.mila.quebec" bash
[15:33:11.224] Terminal shell path: C:\Windows\System32\cmd.exe
[15:33:11.625] > kex_exchange_identification: write: Broken pipe
[15:33:11.625] Got some output, clearing connection timeout
[15:33:11.641] > 
> The process tried to write to a nonexistent pipe.
[15:33:12.886] "install" terminal command done
[15:33:12.886] Install terminal quit with output: kex_exchange_identification: write: Broken pipe
[15:33:12.886] Received install output: kex_exchange_identification: write: Broken pipe
[15:33:12.887] Failed to parse remote port from server output
[15:33:12.888] Resolver error: Error: 
	at g.Create (c:\Users\masht\.vscode\extensions\ms-vscode-remote.remote-ssh-0.106.5\out\extension.js:2:640937)
	at t.handleInstallOutput (c:\Users\masht\.vscode\extensions\ms-vscode-remote.remote-ssh-0.106.5\out\extension.js:2:638303)
	at t.tryInstall (c:\Users\masht\.vscode\extensions\ms-vscode-remote.remote-ssh-0.106.5\out\extension.js:2:760218)
	at async c:\Users\masht\.vscode\extensions\ms-vscode-remote.remote-ssh-0.106.5\out\extension.js:2:720757
	at async t.withShowDetailsEvent (c:\Users\masht\.vscode\extensions\ms-vscode-remote.remote-ssh-0.106.5\out\extension.js:2:724063)
	at async I (c:\Users\masht\.vscode\extensions\ms-vscode-remote.remote-ssh-0.106.5\out\extension.js:2:717728)
	at async t.resolve (c:\Users\masht\.vscode\extensions\ms-vscode-remote.remote-ssh-0.106.5\out\extension.js:2:721434)
	at async c:\Users\masht\.vscode\extensions\ms-vscode-remote.remote-ssh-0.106.5\out\extension.js:2:905238

[v0.0.18] Issue running the command `mila code`

Make sure you can reproduce the issue with the latest version available

pip install milatools --upgrade
[milatools command e.g. mila code ...]

What command did you run?

mila code /home/mila/a/annabel.adeyeri/ --alloc --cpus-per-task=2 --mem=4Gb --gres=gpu:1

Describe the bug

A clear and concise description of what the bug is. If there is an error
traceback, please paste it here.
Traceback (most recent call last):
  File "/Users/annabel/miniconda3/lib/python3.10/site-packages/milatools/cli/commands.py", line 43, in main
    auto_cli(milatools)
  File "/Users/annabel/miniconda3/lib/python3.10/site-packages/coleo/cli.py", line 656, in auto_cli
    result = run_cli(entry, args, **kwargs)
  File "/Users/annabel/miniconda3/lib/python3.10/site-packages/coleo/cli.py", line 628, in run_cli
    return call(opts=opts, args=args)
  File "/Users/annabel/miniconda3/lib/python3.10/site-packages/coleo/cli.py", line 587, in thunk
    result = fn(*args)
  File "/Users/annabel/miniconda3/lib/python3.10/site-packages/milatools/cli/commands.py", line 285, in code
    remote = Remote("mila")
  File "/Users/annabel/miniconda3/lib/python3.10/site-packages/milatools/cli/remote.py", line 84, in __init__
    connection.open()
  File "/Users/annabel/miniconda3/lib/python3.10/site-packages/fabric/connection.py", line 636, in open
    self.client.connect(**kwargs)
  File "/Users/annabel/miniconda3/lib/python3.10/site-packages/paramiko/client.py", line 485, in connect
    self._auth(
  File "/Users/annabel/miniconda3/lib/python3.10/site-packages/paramiko/client.py", line 818, in _auth
    raise saved_exception
  File "/Users/annabel/miniconda3/lib/python3.10/site-packages/paramiko/client.py", line 794, in _auth
    self._transport.auth_publickey(username, key)
  File "/Users/annabel/miniconda3/lib/python3.10/site-packages/paramiko/transport.py", line 1658, in auth_publickey
    return self.auth_handler.wait_for_response(my_event)
  File "/Users/annabel/miniconda3/lib/python3.10/site-packages/paramiko/auth_handler.py", line 248, in wait_for_response
    raise e
paramiko.ssh_exception.AuthenticationException: Authentication failed: transport shut down or saw EOF

Screenshot 2023-06-29 at 10 47 02 AM

Screenshots

If applicable, add screenshots to help explain your problem.

Desktop (please complete the following information):

  • Mac OS 13.4

Additional context

It was working yesterday. I don't know what the problem is now.

Could not establish connection to compute node (Could not resolve hostname error)

Hello, I am using the mila code command to establish a remote SSH connection with VSCode. It has worked without any problems for the past 3 weeks, but yesterday I started getting this error when running it.

The command I run from my terminal is:
mila code <path_to_my_cluster_code> --alloc --cpus-per-task 4 --gres gpu:1 --time=0-3:00:00 --partition unkillable

The issue likely comes from mila code rather than from the cluster or SSH, because connecting with ssh directly, without mila code, works fine. I also tried setting up the SSH host manually in VSCode: I first obtained an allocation with salloc on the cluster, then opened VSCode and added the compute node I was given as a new host (e.g. ssh -J mila <username>@<nodename>). This works perfectly.

The connection fails with the following log on VSCode:

[17:05:45.046] Log Level: 2
[17:05:45.046] remote-ssh@0.84.0
[17:05:45.047] darwin arm64
[17:05:45.055] SSH Resolver called for "ssh-remote+cn-c017.server.mila.quebec", attempt 1
[17:05:45.055] "remote.SSH.useLocalServer": true
[17:05:45.055] "remote.SSH.path": undefined
[17:05:45.055] "remote.SSH.configFile": undefined
[17:05:45.056] "remote.SSH.useFlock": true
[17:05:45.056] "remote.SSH.lockfilesInTmp": false
[17:05:45.056] "remote.SSH.localServerDownload": auto
[17:05:45.056] "remote.SSH.remoteServerListenOnSocket": false
[17:05:45.056] "remote.SSH.showLoginTerminal": false
[17:05:45.056] "remote.SSH.defaultExtensions": []
[17:05:45.056] "remote.SSH.loglevel": 2
[17:05:45.056] "remote.SSH.enableDynamicForwarding": true
[17:05:45.056] "remote.SSH.enableRemoteCommand": false
[17:05:45.056] "remote.SSH.serverPickPortsFromRange": {}
[17:05:45.056] "remote.SSH.serverInstallPath": {}
[17:05:45.058] SSH Resolver called for host: cn-c017.server.mila.quebec
[17:05:45.058] Setting up SSH remote "cn-c017.server.mila.quebec"
[17:05:45.060] Acquiring local install lock: /var/folders/ml/kpt1qf1s2ngfsbzl5ycgclz00000gp/T/vscode-remote-ssh-feed4297-install.lock
[17:05:45.060] Looking for existing server data file at /Users/aldozaimi/Library/Application Support/Code/User/globalStorage/ms-vscode-remote.remote-ssh/vscode-ssh-host-feed4297-3b889b090b5ad5793f524b5d1d39fda662b96a2a-0.84.0/data.json
[17:05:45.060] Using commit id "3b889b090b5ad5793f524b5d1d39fda662b96a2a" and quality "stable" for server
[17:05:45.062] Install and start server if needed
[17:05:45.063] PATH: /opt/homebrew/Caskroom/miniforge/base/envs/simulation_venv/bin:/opt/homebrew/Caskroom/miniforge/base/condabin:/opt/homebrew/bin:/opt/homebrew/sbin:/Library/Frameworks/Python.framework/Versions/3.10/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/Users/aldozaimi/.cargo/bin
[17:05:45.063] Checking ssh with "ssh -V"
[17:05:45.066] > OpenSSH_8.6p1, LibreSSL 3.3.5

[17:05:45.067] askpass server listening on /var/folders/ml/kpt1qf1s2ngfsbzl5ycgclz00000gp/T/vscode-ssh-askpass-135acdbd1f53ce3ecf04f0db01904fbfecec5929.sock
[17:05:45.068] Spawning local server with {"serverId":1,"ipcHandlePath":"/var/folders/ml/kpt1qf1s2ngfsbzl5ycgclz00000gp/T/vscode-ssh-askpass-d3660326b270bf9a6b665c435ec56bc6679ee114.sock","sshCommand":"ssh","sshArgs":["-v","-T","-D","51100","-o","ConnectTimeout=15","cn-c017.server.mila.quebec"],"serverDataFolderName":".vscode-server","dataFilePath":"/Users/aldozaimi/Library/Application Support/Code/User/globalStorage/ms-vscode-remote.remote-ssh/vscode-ssh-host-feed4297-3b889b090b5ad5793f524b5d1d39fda662b96a2a-0.84.0/data.json"}
[17:05:45.068] Local server env: {"SSH_AUTH_SOCK":"/private/tmp/com.apple.launchd.4M5703FS2F/Listeners","SHELL":"/bin/zsh","DISPLAY":"1","ELECTRON_RUN_AS_NODE":"1","SSH_ASKPASS":"/Users/aldozaimi/.vscode/extensions/ms-vscode-remote.remote-ssh-0.84.0/out/local-server/askpass.sh","VSCODE_SSH_ASKPASS_NODE":"/Applications/Visual Studio Code.app/Contents/Frameworks/Code Helper.app/Contents/MacOS/Code Helper","VSCODE_SSH_ASKPASS_EXTRA_ARGS":"--ms-enable-electron-run-as-node","VSCODE_SSH_ASKPASS_MAIN":"/Users/aldozaimi/.vscode/extensions/ms-vscode-remote.remote-ssh-0.84.0/out/askpass-main.js","VSCODE_SSH_ASKPASS_HANDLE":"/var/folders/ml/kpt1qf1s2ngfsbzl5ycgclz00000gp/T/vscode-ssh-askpass-135acdbd1f53ce3ecf04f0db01904fbfecec5929.sock"}
[17:05:45.068] Spawned 65717
[17:05:45.134] > local-server-1> Spawned ssh, pid=65724
[17:05:45.136] stderr> OpenSSH_8.6p1, LibreSSL 3.3.5
[17:05:45.141] stderr> OpenSSH_8.6p1, LibreSSL 3.3.5
[17:05:45.154] stderr> kex_exchange_identification: Connection closed by remote host
[17:05:45.154] stderr> Connection closed by 172.16.2.25 port 2222
[17:05:45.154] stderr> kex_exchange_identification: Connection closed by remote host
[17:05:45.154] stderr> Connection closed by UNKNOWN port 65535
[17:05:45.155] > local-server-1> ssh child died, shutting down
[17:05:45.156] Local server exit: 0
[17:05:45.156] Received install output: local-server-1> Spawned ssh, pid=65724
OpenSSH_8.6p1, LibreSSL 3.3.5
OpenSSH_8.6p1, LibreSSL 3.3.5
kex_exchange_identification: Connection closed by remote host
Connection closed by 172.16.2.25 port 2222
kex_exchange_identification: Connection closed by remote host
Connection closed by UNKNOWN port 65535
local-server-1> ssh child died, shutting down

[17:05:45.157] Failed to parse remote port from server output
[17:05:45.157] Resolver error: Error: 
	at Function.Create (/Users/aldozaimi/.vscode/extensions/ms-vscode-remote.remote-ssh-0.84.0/out/extension.js:1:585222)
	at Object.t.handleInstallOutput (/Users/aldozaimi/.vscode/extensions/ms-vscode-remote.remote-ssh-0.84.0/out/extension.js:1:583874)
	at Object.e [as tryInstallWithLocalServer] (/Users/aldozaimi/.vscode/extensions/ms-vscode-remote.remote-ssh-0.84.0/out/extension.js:1:624373)
	at processTicksAndRejections (node:internal/process/task_queues:96:5)
	at async /Users/aldozaimi/.vscode/extensions/ms-vscode-remote.remote-ssh-0.84.0/out/extension.js:1:643506
	at async Object.t.withShowDetailsEvent (/Users/aldozaimi/.vscode/extensions/ms-vscode-remote.remote-ssh-0.84.0/out/extension.js:1:647224)
	at async /Users/aldozaimi/.vscode/extensions/ms-vscode-remote.remote-ssh-0.84.0/out/extension.js:1:622845
	at async T (/Users/aldozaimi/.vscode/extensions/ms-vscode-remote.remote-ssh-0.84.0/out/extension.js:1:619351)
	at async Object.t.resolveWithLocalServer (/Users/aldozaimi/.vscode/extensions/ms-vscode-remote.remote-ssh-0.84.0/out/extension.js:1:622460)
	at async Object.t.resolve (/Users/aldozaimi/.vscode/extensions/ms-vscode-remote.remote-ssh-0.84.0/out/extension.js:1:644834)
	at async /Users/aldozaimi/.vscode/extensions/ms-vscode-remote.remote-ssh-0.84.0/out/extension.js:1:727082
[17:05:45.160] ------

Any ideas on how to resolve this?

Add `--alloc` example

It's unclear how the --alloc option should be used and which Slurm options are allowed (all of them, or only some).

Mila code doesn't work with --alloc --nodes>1

$ mila code /network/scratch/n/normandf/imagenet_template --alloc --cpus-per-task=4 --gres=gpu:1 --mem=16G --nodes 2
(local) $ ssh mila -fNMS /home/fabrice/.ssh/sockets/milatools.mila
(mila) $ salloc --cpus-per-task=4 --gres=gpu:1 --mem=16G --nodes 2
# Control socket connect(/home/fabrice/.ssh/sockets/milatools.mila): Connection refused
# salloc: --------------------------------------------------------------------------------------------------
# salloc: # Using default long partition
# salloc: --------------------------------------------------------------------------------------------------
# salloc: Pending job allocation 2149062
# salloc: job 2149062 queued and waiting for resources
# salloc: job 2149062 has been allocated resources
# salloc: Granted job allocation 2149062
# salloc: Waiting for resource configuration
# salloc: Nodes cn-c[007,035] are ready for job
(local) $ code --remote 'ssh-remote+cn-c[007,035].server.mila.quebec' /network/scratch/n/normandf/imagenet_template

VSCode isn't able to connect to the host. This is probably due to how the node name is retrieved inside the mila code command.
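
The salloc output reports the two nodes in Slurm's compact form, cn-c[007,035], and that string is passed to code as if it were a single hostname. Below is a minimal sketch of expanding such an expression so that one node can be picked. It handles only a single bracket group with comma-separated values and ranges, expand_nodelist is a hypothetical helper, and on the cluster itself scontrol show hostnames performs the full expansion.

```python
import re
from typing import List


def expand_nodelist(nodelist: str) -> List[str]:
    """Expand e.g. 'cn-c[007,035]' into individual node names (simplified sketch)."""
    match = re.fullmatch(r"([^\[\]]*)\[([^\[\]]+)\]([^\[\]]*)", nodelist)
    if match is None:
        # No bracket group: treat the value as a single node name.
        return [nodelist]
    prefix, body, suffix = match.groups()
    nodes = []
    for part in body.split(","):
        if "-" in part:
            start, end = part.split("-", 1)
            # Keep the zero padding of the range start, e.g. 007 has width 3.
            for number in range(int(start), int(end) + 1):
                nodes.append(f"{prefix}{number:0{len(start)}d}{suffix}")
        else:
            nodes.append(f"{prefix}{part}{suffix}")
    return nodes


nodes = expand_nodelist("cn-c[007,035]")
print(nodes)     # ['cn-c007', 'cn-c035']
print(nodes[0])  # cn-c007
```

Connecting VSCode to the first expanded node is one possible behaviour; whether that is the right choice for a multi-node allocation is left open here.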

Here is some of the output inside the VsCode: Remote - SSH log window:

[10:46:47.518] ------
[10:46:47.518] SSH Resolver called for "ssh-remote+cn-c[007,035].server.mila.quebec", attempt 5, (Reconnection)
[10:46:47.519] SSH Resolver called for host: cn-c[007,035].server.mila.quebec
[10:46:47.519] Setting up SSH remote "cn-c[007,035].server.mila.quebec"
[10:46:47.520] Acquiring local install lock: /tmp/vscode-remote-ssh-855227e0-install.lock
[10:46:47.520] Looking for existing server data file at /home/fabrice/.config/Code/User/globalStorage/ms-vscode-remote.remote-ssh/vscode-ssh-host-855227e0-6d9b74a70ca9c7733b29f0456fd8195364076dda-0.84.0/data.json
[10:46:47.520] Using commit id "6d9b74a70ca9c7733b29f0456fd8195364076dda" and quality "stable" for server
[10:46:47.522] Install and start server if needed
[10:46:47.527] askpass server listening on /run/user/1001/vscode-ssh-askpass-e7e86db49903dbff72edcece1901eb8c786021c3.sock
[10:46:47.528] Spawning local server with {"serverId":5,"ipcHandlePath":"/run/user/1001/vscode-ssh-askpass-bace1ed1c6ac07b4dc3b702a1e5f6f1d66ccc9e5.sock","sshCommand":"ssh","sshArgs":["-v","-T","-D","36461","-o","ConnectTimeout=15","cn-c[007,035].server.mila.quebec"],"serverDataFolderName":".vscode-server","dataFilePath":"/home/fabrice/.config/Code/User/globalStorage/ms-vscode-remote.remote-ssh/vscode-ssh-host-855227e0-6d9b74a70ca9c7733b29f0456fd8195364076dda-0.84.0/data.json"}
[10:46:47.528] Local server env: {"SSH_AUTH_SOCK":"/run/user/1001/keyring/ssh","SHELL":"/bin/bash","DISPLAY":":1","ELECTRON_RUN_AS_NODE":"1","SSH_ASKPASS":"/home/fabrice/.vscode/extensions/ms-vscode-remote.remote-ssh-0.84.0/out/local-server/askpass.sh","VSCODE_SSH_ASKPASS_NODE":"/usr/share/code/code","VSCODE_SSH_ASKPASS_EXTRA_ARGS":"--ms-enable-electron-run-as-node","VSCODE_SSH_ASKPASS_MAIN":"/home/fabrice/.vscode/extensions/ms-vscode-remote.remote-ssh-0.84.0/out/askpass-main.js","VSCODE_SSH_ASKPASS_HANDLE":"/run/user/1001/vscode-ssh-askpass-e7e86db49903dbff72edcece1901eb8c786021c3.sock"}
[10:46:47.533] Spawned 6736
[10:46:47.630] > local-server-5> Spawned ssh, pid=6744
[10:46:47.634] stderr> OpenSSH_8.2p1 Ubuntu-4ubuntu0.5, OpenSSL 1.1.1f  31 Mar 2020
[10:46:47.635] stderr> Bad stdio forwarding specification '[cn-c[007,035].server.mila.quebec]:22'
[10:46:47.635] stderr> kex_exchange_identification: Connection closed by remote host
[10:46:47.635] > local-server-5> ssh child died, shutting down
[10:46:47.641] Local server exit: 0
[10:46:47.642] Received install output: local-server-5> Spawned ssh, pid=6744
OpenSSH_8.2p1 Ubuntu-4ubuntu0.5, OpenSSL 1.1.1f  31 Mar 2020
Bad stdio forwarding specification '[cn-c[007,035].server.mila.quebec]:22'
kex_exchange_identification: Connection closed by remote host
local-server-5> ssh child died, shutting down

[10:46:47.642] Failed to parse remote port from server output
[10:46:47.643] Resolver error: Error: 
	at Function.Create (/home/fabrice/.vscode/extensions/ms-vscode-remote.remote-ssh-0.84.0/out/extension.js:1:585222)
	at Object.t.handleInstallOutput (/home/fabrice/.vscode/extensions/ms-vscode-remote.remote-ssh-0.84.0/out/extension.js:1:583874)
	at Object.e [as tryInstallWithLocalServer] (/home/fabrice/.vscode/extensions/ms-vscode-remote.remote-ssh-0.84.0/out/extension.js:1:624373)
	at processTicksAndRejections (node:internal/process/task_queues:96:5)
	at async /home/fabrice/.vscode/extensions/ms-vscode-remote.remote-ssh-0.84.0/out/extension.js:1:643506
	at async Object.t.withShowDetailsEvent (/home/fabrice/.vscode/extensions/ms-vscode-remote.remote-ssh-0.84.0/out/extension.js:1:647224)
	at async /home/fabrice/.vscode/extensions/ms-vscode-remote.remote-ssh-0.84.0/out/extension.js:1:622845
	at async T (/home/fabrice/.vscode/extensions/ms-vscode-remote.remote-ssh-0.84.0/out/extension.js:1:619351)
	at async Object.t.resolveWithLocalServer (/home/fabrice/.vscode/extensions/ms-vscode-remote.remote-ssh-0.84.0/out/extension.js:1:622460)
	at async Object.t.resolve (/home/fabrice/.vscode/extensions/ms-vscode-remote.remote-ssh-0.84.0/out/extension.js:1:644834)
	at async /home/fabrice/.vscode/extensions/ms-vscode-remote.remote-ssh-0.84.0/out/extension.js:1:727082
[10:46:47.644] ------

[v0.0.18] Issue running the command `mila init`

What command did you run?

mila init

Describe the bug

FileNotFoundError when trying to register public key.

Error Traceback

(mila) C:\Users\marle>mila init
Checking ssh config
Fixed the permissions on ssh directory at C:\Users\marle\.ssh to 700
Fixing permissions on C:\Users\marle\.ssh\config to 600
Did not change ssh config

OK

Checking passwordless authentication
(local) $ ssh -oPreferredAuthentications=publickey mila 'echo OK'
? Your public key does not appear be registered on the cluster. Register it? Yes
(local) $ 'powershell.exe type $env:USERPROFILE\.ssh\id_rsa.pub | ssh mila "cat >> ~/.ssh/authorized_keys"'

Traceback (most recent call last):
  File "C:\Users\marle\miniconda3\envs\mila\lib\site-packages\milatools\cli\commands.py", line 68, in main
    mila()
  File "C:\Users\marle\miniconda3\envs\mila\lib\site-packages\milatools\cli\commands.py", line 362, in mila
    return function(**args_dict)
  File "C:\Users\marle\miniconda3\envs\mila\lib\site-packages\milatools\cli\commands.py", line 408, in init
    setup_passwordless_ssh_access()
  File "C:\Users\marle\miniconda3\envs\mila\lib\site-packages\milatools\cli\commands.py", line 445, in setup_passwordless_ssh_access
    here.run(command)
  File "C:\Users\marle\miniconda3\envs\mila\lib\site-packages\milatools\cli\local.py", line 34, in run
    return subprocess.run(
  File "C:\Users\marle\miniconda3\envs\mila\lib\subprocess.py", line 503, in run
    with Popen(*popenargs, **kwargs) as process:
  File "C:\Users\marle\miniconda3\envs\mila\lib\subprocess.py", line 971, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "C:\Users\marle\miniconda3\envs\mila\lib\subprocess.py", line 1456, in _execute_child
    hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
FileNotFoundError: [WinError 2] The system cannot find the file specified
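
One possible reading of the traceback, not confirmed in the report: the whole PowerShell pipeline is printed inside a single pair of quotes, which suggests it may have been handed to `subprocess.run` as one argument and therefore treated as the name of an executable. The sketch below reproduces that failure mode with a harmless command. `run_tokens` is a hypothetical helper, not milatools code.

```python
import subprocess
import sys
from typing import List


def run_tokens(tokens: List[str]) -> str:
    """Run a command without a shell and report what happened."""
    try:
        subprocess.run(tokens, check=True, capture_output=True)
    except FileNotFoundError:
        return "FileNotFoundError"
    return "ok"


# One string holding the program *and* its arguments: the OS looks for
# an executable with that entire name and does not find it.
as_one_token = [f"{sys.executable} -c pass"]
# The same command split into separate arguments.
as_separate_tokens = [sys.executable, "-c", "pass"]

print(run_tokens(as_one_token))
print(run_tokens(as_separate_tokens))
```

If this is indeed the cause, a pipeline like the one above would need either `shell=True` or an explicit argument list, with the key contents passed through standard input rather than a `|`.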

Desktop (please complete the following information):

  • OS: Windows 10

Additional context

milatools version=0.1.0
created new environment from within miniconda3

Thanks!

ValueError when running 'mila init'

When running 'mila init' I get the following error:

$ mila init
Traceback (most recent call last):
  File "/opt/miniconda3/envs/py3.8/bin/mila", line 8, in <module>
    sys.exit(main())
  File "/opt/miniconda3/envs/py3.8/lib/python3.9/site-packages/milatools/commands.py", line 14, in main
    auto_cli(milatools)
  File "/opt/miniconda3/envs/py3.8/lib/python3.9/site-packages/coleo/cli.py", line 613, in auto_cli
    result = run_cli(entry, args, **kwargs)
  File "/opt/miniconda3/envs/py3.8/lib/python3.9/site-packages/coleo/cli.py", line 584, in run_cli
    opts, call = make_cli(entry, **kwargs)
  File "/opt/miniconda3/envs/py3.8/lib/python3.9/site-packages/coleo/cli.py", line 534, in make_cli
    _make_cli_helper(parser, entry, tag=tag, eval_env=eval_env, extras=extras)
  File "/opt/miniconda3/envs/py3.8/lib/python3.9/site-packages/coleo/cli.py", line 460, in _make_cli_helper
    entry2 = tooled(entry2)
  File "/opt/miniconda3/envs/py3.8/lib/python3.9/site-packages/ptera/deco.py", line 55, in __call__
    new_fn, state = transform(fn, interact=interact)
  File "/opt/miniconda3/envs/py3.8/lib/python3.9/site-packages/ptera/selfless.py", line 510, in transform
    new_fn = compile(
ValueError: field 'id' is required for Name

Details:

  • macOS Big Sur 11.6
  • python 3.9.7
  • milatools 0.0.5

milatools setup for Windows

Hello,
I've been having some issues getting VS code's debugger to run on a compute node from my Windows laptop.

How to reproduce the issue:

  1. Install Python 3 on Windows
  2. Execute the following commands in a Windows PowerShell terminal:
pip install milatools
mila init

At least in my case, the mila init command hangs after displaying the Your public key does not appear be registered on the cluster. Register it? [Y/n] message.

The workaround:
First you need to successfully run mila init to complete your cluster SSH setup.

  1. Install Linux on Windows with WSL
  2. Install Ubuntu on Windows
  3. Launch an Ubuntu terminal from the Windows Start menu (on first launch this will trigger an OS user account setup procedure; proceed with the setup)
  4. Install Anaconda in the Ubuntu app:
    4.1 Download the Anaconda linux setup script from here. For example: wget https://repo.anaconda.com/archive/Anaconda3-2022.05-Linux-x86_64.sh
    4.2 Install Anaconda. For example: bash Anaconda3-2022.05-Linux-x86_64.sh
    Answer yes to the following question: Do you wish the installer to initialize Anaconda3 by running conda init? [yes|no]
    4.3 Launch a new Ubuntu terminal from the Windows Start menu and confirm that the default Python version is now >= 3.8 (this is a milatools requirement) by executing the following command: python -V
  5. Install and initialize milatools (still in Ubuntu):
pip install milatools
mila init

Then you need to generate a public RSA key on your Windows host and add it to the list of authorized keys on the cluster:

  1. If not already done, install Python 3 on Windows
  2. Open a Windows PowerShell terminal from the Windows Start menu and install milatools by executing the following command: pip install milatools
  3. On Windows, the installation of milatools will end with a warning such as: WARNING: The script mila.exe is installed in 'C:\Users\[USERNAME]\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\Scripts' which is not on PATH. Take note of the absolute path of the milatools installation folder
  4. Add the absolute path of the milatools installation folder to the Windows PATH environment variable by following this procedure
  5. Open a new Windows PowerShell terminal from the Windows Start menu and execute the following command: mila init
    5.1 Due to a bug, the mila init command will hang after displaying the following message: Your public key does not appear be registered on the cluster. Register it? [Y/n]
  6. Open a new PowerShell terminal and execute the following command to print your public RSA key: type $env:USERPROFILE\.ssh\id_rsa.pub
  7. Use the following command to open an SSH connection to the mila cluster: ssh mila
  8. Append your public RSA key (see step 6 above) to the end of the ~/.ssh/authorized_keys file on the Mila cluster
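
Steps 6 to 8 amount to appending one line to ~/.ssh/authorized_keys. Below is a minimal sketch of doing that without creating duplicate entries, meant to be run on the machine that holds the file. The function name `append_public_key` and the path in the usage note are illustrative assumptions.

```python
from pathlib import Path


def append_public_key(authorized_keys: Path, public_key: str) -> bool:
    """Append public_key to authorized_keys unless it is already there.

    Returns True when the file was modified.
    """
    key = public_key.strip()
    text = authorized_keys.read_text() if authorized_keys.exists() else ""
    if key in (line.strip() for line in text.splitlines()):
        return False
    authorized_keys.parent.mkdir(parents=True, exist_ok=True)
    with authorized_keys.open("a") as handle:
        if text and not text.endswith("\n"):
            handle.write("\n")  # avoid gluing the key to the last line
        handle.write(key + "\n")
    return True
```

For example, `append_public_key(Path.home() / ".ssh" / "authorized_keys", key_text)` could be called with the text printed in step 6. The sketch does not touch file permissions.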

Setup is done!
Now you can use the following procedure to launch VS Code using a compute node from the Mila cluster as a development environment:

  1. Install Microsoft's Remote SSH VS Code plugin
  2. On the Mila login node, ask for an interactive session and keep a note of the compute node hostname. For example: salloc --cpus-per-task 1 --gres gpu:0 --time=0-1:00:00
  3. Execute the following command in a Windows PowerShell to launch VS Code using the compute node from the previous step as a development environment: code --remote ssh-remote+[COMPUTE_NODE_HOSTNAME].server.mila.quebec [SRC_CODE_DIRECTORY_ON_CLUSTER]

Lead to a bugfix:
The problem seems to come from trying to execute the ssh-copy-id command on a Windows host (this command does not exist on Windows).

I found a one-liner that is supposed to be the Windows equivalent of ssh-copy-id, but it also hung on me and I haven't had time to investigate further yet.

I will try to find some bandwidth to come up with a bugfix PR in the next few days.
Hopefully the workaround above will save some people some time in the meanwhile.

Long term storage is being provisioned by IDT, we should support it

mila.Archive:
Moves (or copies) folders to the user's Mila Google Drive; useful for long-term storage of old projects

mila archive path/to/my/folder/ # returns permalink to resource.
mila restore path/to/google/drive/folder/ .

We had this in the design doc.

Initially the storage will be Google Drive, so we should implement that first. Later on there should be Mila local storage, and we should swap the backend to that transparently to the user, in the worst case with some annoying warning like:
"Please run mila defrag to move all your drive (deprecated) files to local disks."

(sorry wrote it during IT committee meeting)

[v0.1.3-post.1+db9b7bc] Issue running the command `mila code`

Make sure you can reproduce the issue with the latest version available

I am on version milatools v0.1.3-post.1+db9b7bc (the latest version)

What command did you run?

mila code $HOME_DIR --alloc --gres=gpu:1 -c 4 --time=2:00:00 --partition=short-unkillable ($HOME_DIR is my home directory on mila cluster)

Describe the bug

When I try to run the mila code command, it exits with code 1 with the error:
salloc: error: QOSMinCpuNotSatisfied
salloc: error: Job submit/allocate failed: Job violates accounting/QOS policy (job submit limit, user's size and/or time limits)

Screenshots

image

Desktop (please complete the following information):

  • OS: MacOS 14.0

Every command slowed 5s by `socket.getfqdn` on Mac

On my Mac, on my home wifi, with the latest version of the tools, it takes about 5 seconds for the call to mila to respond, regardless of which command I call, e.g. code, help, serve, etc. I investigated this a little and found the culprit to be a call to socket.getfqdn.

socket.getfqdn is commonly slow on Mac and solving that requires some elbow grease. It can be patched, e.g. via a judicious use of the UNIX hostname command.

Here's an MVP patch: patrickmineault@aebdc7d
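
The linked patch is not reproduced here. As a sketch of the general idea mentioned above (asking the hostname command instead of socket.getfqdn), and assuming a short, possibly non-qualified name is acceptable to the caller, something along these lines could work. `quick_hostname` is a hypothetical name.

```python
import socket
import subprocess


def quick_hostname(timeout: float = 1.0) -> str:
    """Return the local hostname without going through socket.getfqdn."""
    try:
        result = subprocess.run(
            ["hostname"],
            capture_output=True,
            text=True,
            timeout=timeout,
            check=True,
        )
        name = result.stdout.strip()
        if name:
            return name
    except (OSError, subprocess.SubprocessError):
        pass  # no `hostname` binary, it failed, or it timed out
    # Fall back to the value the OS already knows.
    return socket.gethostname()


print(quick_hostname())
```

The timeout bounds the cost of the call, so even a misbehaving `hostname` binary cannot add more than about a second per invocation.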

FileNotFoundError when trying to set profile as default

Trying to get milatools working for me, I run into the following error when trying to store a profile as default. The line with REMOTEPATH is from a print statement I put in at paramiko/sftp_client.py:715 to see where it's trying to write. Evidently something goes wrong in constructing the path, which ends up being $HOME/~/.milatools-profile. I've worked around it by creating the file manually.

(py310) tim@mits ~ $ mila serve lab --alloc --gres gpu:1 --partition=unkillable
Checking for preferred profile in ~/.milatools-profile
None found.
Fetching profiles in ~/.milatools/profiles
? Select the profile to use: py310jax
? Do you want to use this profile by default in ~? Yes
REMOTEPATH /home/mila/c/cooijmat/~/.milatools-profile
Traceback (most recent call last):
  File "/home/tim/miniconda3/envs/py310/bin/mila", line 8, in <module>
    sys.exit(main())
  File "/home/tim/miniconda3/envs/py310/lib/python3.10/site-packages/milatools/cli/__main__.py", line 35, in main
    auto_cli(milatools)
  File "/home/tim/miniconda3/envs/py310/lib/python3.10/site-packages/coleo/cli.py", line 656, in auto_cli
    result = run_cli(entry, args, **kwargs)
  File "/home/tim/miniconda3/envs/py310/lib/python3.10/site-packages/coleo/cli.py", line 628, in run_cli
    return call(opts=opts, args=args)
  File "/home/tim/miniconda3/envs/py310/lib/python3.10/site-packages/coleo/cli.py", line 587, in thunk
    result = fn(*args)
  File "/home/tim/miniconda3/envs/py310/lib/python3.10/site-packages/milatools/cli/__main__.py", line 393, in lab
    _standard_server(
  File "/home/tim/miniconda3/envs/py310/lib/python3.10/site-packages/milatools/cli/__main__.py", line 538, in _standard_server
    prof = setup_profile(remote, path)
  File "/home/tim/miniconda3/envs/py310/lib/python3.10/site-packages/milatools/cli/profile.py", line 44, in setup_profile
    remote.puttext(profile, str(profile_file))
  File "/home/tim/miniconda3/envs/py310/lib/python3.10/site-packages/milatools/cli/utils.py", line 309, in puttext
    self.put(f.name, dest)
  File "/home/tim/miniconda3/envs/py310/lib/python3.10/site-packages/milatools/cli/utils.py", line 301, in put
    return self.connection.put(src, dest)
  File "/home/tim/miniconda3/envs/py310/lib/python3.10/site-packages/fabric/connection.py", line 870, in put
    return Transfer(self).put(*args, **kwargs)
  File "/home/tim/miniconda3/envs/py310/lib/python3.10/site-packages/fabric/transfer.py", line 311, in put
    self.sftp.put(localpath=local, remotepath=remote)
  File "/home/tim/miniconda3/envs/py310/lib/python3.10/site-packages/paramiko/sftp_client.py", line 760, in put
    return self.putfo(fl, remotepath, file_size, callback, confirm)
  File "/home/tim/miniconda3/envs/py310/lib/python3.10/site-packages/paramiko/sftp_client.py", line 715, in putfo
    with self.file(remotepath, "wb") as fr:
  File "/home/tim/miniconda3/envs/py310/lib/python3.10/site-packages/paramiko/sftp_client.py", line 372, in open
    t, msg = self._request(CMD_OPEN, filename, imode, attrblock)
  File "/home/tim/miniconda3/envs/py310/lib/python3.10/site-packages/paramiko/sftp_client.py", line 823, in _request
    return self._read_response(num)
  File "/home/tim/miniconda3/envs/py310/lib/python3.10/site-packages/paramiko/sftp_client.py", line 875, in _read_response
    self._convert_status(msg)
  File "/home/tim/miniconda3/envs/py310/lib/python3.10/site-packages/paramiko/sftp_client.py", line 904, in _convert_status
    raise IOError(errno.ENOENT, text)
FileNotFoundError: [Errno 2] No such file
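
The REMOTEPATH line suggests that the `~` reached the SFTP layer unexpanded. SFTP treats paths literally, and relative paths are typically resolved against the login directory, so `~` ends up as a directory name. Below is a minimal sketch of one way to normalise such a path before an upload, assuming the remote home directory is either known or is the SFTP starting directory. `to_sftp_path` is a hypothetical helper.

```python
from pathlib import PurePosixPath
from typing import Optional


def to_sftp_path(path: str, remote_home: Optional[str] = None) -> str:
    """Rewrite a leading '~' so that no literal '~' directory is used."""
    if path == "~":
        relative = PurePosixPath(".")
    elif path.startswith("~/"):
        relative = PurePosixPath(path[2:])
    else:
        return path  # nothing to expand
    if remote_home is None:
        # Relative paths are resolved against the SFTP start directory.
        return str(relative)
    return str(PurePosixPath(remote_home) / relative)


print(to_sftp_path("~/.milatools-profile"))
print(to_sftp_path("~/.milatools-profile", "/home/mila/c/cooijmat"))
```

The first call yields a path relative to the home directory, the second an absolute path under the home directory given in the report.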

[v0.1.2] `mila code` with `--persist` fails!

What command did you run?

mila code repos/office_hours --persist

Describe the bug

$ mila code repos/office_hours --persist
(mila) $ lfs quota -u $USER $HOME
Disk quotas for usr normandf (uid 1471600598):
     Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
/home/mila/n/normandf
                95748040       0 104857600       -  908726       0 1048576       -
uid 1471600598 is using default block quota setting
uid 1471600598 is using default file quota setting
[02/13/24 15:34:18] WARNING  2024-02-13 15:34:18,136 - WARNING - Unable to check the disk-quota on the cluster: not enough   commands.py:534
                             values to unpack (expected 9, got 1)                                                                           
sbatch: error: Unable to open file ~/.milatools/batch/batch-1707856458144355897.sh
touch: cannot touch '.milatools/batch/out-1707856458144355897.txt': No such file or directory
tail: cannot open '.milatools/batch/out-1707856458144355897.txt' for reading: No such file or directory
tail: no files remaining
Traceback (most recent call last):
  File "/home/fabrice/miniconda3/lib/python3.11/site-packages/milatools/cli/commands.py", line 80, in main
    mila()
  File "/home/fabrice/miniconda3/lib/python3.11/site-packages/milatools/cli/commands.py", line 383, in mila
    return function(**args_dict)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/fabrice/miniconda3/lib/python3.11/site-packages/milatools/cli/commands.py", line 572, in code
    data, proc = cnode.ensure_allocation()
                 ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/fabrice/miniconda3/lib/python3.11/site-packages/milatools/cli/remote.py", line 495, in ensure_allocation
    login_node_runner, results = self.extract(
                                 ^^^^^^^^^^^^^
  File "/home/fabrice/miniconda3/lib/python3.11/site-packages/milatools/cli/remote.py", line 364, in extract
    promise.join()
  File "/home/fabrice/miniconda3/lib/python3.11/site-packages/invoke/runners.py", line 1622, in join
    return self.runner._finish()
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/fabrice/miniconda3/lib/python3.11/site-packages/invoke/runners.py", line 518, in _finish
    raise UnexpectedExit(result)
invoke.exceptions.UnexpectedExit: Encountered a bad command exit code!

Command: "cd $SCRATCH && sbatch -J mila-code '~/.milatools/batch/batch-1707856458144355897.sh'; touch .milatools/batch/out-1707856458144355897.txt; tail -n +1 -f .milatools/batch/out-1707856458144355897.txt"

Exit code: 1

Stdout: already printed

Stderr: n/a (PTYs have no stderr)
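
In the failing command the script path is wrapped in single quotes, and a shell does not expand `~` inside quotes, which would explain why sbatch could not open the file. The snippet below shows that Python's `shlex.quote` produces this kind of quoting, and sketches one alternative that leaves `$HOME` expandable. `home_relative` is a hypothetical helper, not the fix that was applied.

```python
import shlex


def home_relative(path: str) -> str:
    """Quote a remote path while keeping its home prefix expandable."""
    if path.startswith("~/"):
        # "$HOME" stays in double quotes so the remote shell expands it.
        return '"$HOME"/' + shlex.quote(path[2:])
    return shlex.quote(path)


script = "~/.milatools/batch/batch-1707856458144355897.sh"
print(shlex.quote(script))    # '~/.milatools/batch/batch-1707856458144355897.sh'
print(home_relative(script))  # "$HOME"/.milatools/batch/batch-1707856458144355897.sh
```

The touch and tail commands in the same line use a relative `.milatools/batch/...` path after `cd $SCRATCH`, so they may need the same treatment.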



An error occurred during the execution of the command `code`. Please try updating milatools by running
  pip install milatools --upgrade
in the terminal. If the issue persists, consider filling a bug report at
  https://github.com/mila-iqia/milatools/issues/new?labels=code%2C0.0.18&template=bug_report.md&title=%5Bv0.0.18%5D+Issue+running+the+command+%60mila+code%60
Please provide the error traceback with the report (the red text above).

Desktop (please complete the following information):

  • OS: Ubuntu 22.04

[v0.0.18] Issue running the command `mila code`: socket.gaierror: [Errno -3] Temporary failure in name resolution

Make sure you can reproduce the issue with the latest version available

 pip install milatools --upgrade
Defaulting to user installation because normal site-packages is not writeable
Requirement already satisfied: milatools in ./.local/lib/python3.10/site-packages (0.0.18)
Requirement already satisfied: Fabric<3.0.0,>=2.7.0 in ./.local/lib/python3.10/site-packages (from milatools) (2.7.1)
Requirement already satisfied: blessed<2.0.0,>=1.18.1 in ./.local/lib/python3.10/site-packages (from milatools) (1.20.0)
Requirement already satisfied: coleo<0.4.0,>=0.3.0 in ./.local/lib/python3.10/site-packages (from milatools) (0.3.2)
Requirement already satisfied: questionary<2.0.0,>=1.10.0 in ./.local/lib/python3.10/site-packages (from milatools) (1.10.0)
Requirement already satisfied: sshconf<0.3.0,>=0.2.2 in ./.local/lib/python3.10/site-packages (from milatools) (0.2.5)
Requirement already satisfied: wcwidth>=0.1.4 in ./.local/lib/python3.10/site-packages (from blessed<2.0.0,>=1.18.1->milatools) (0.2.6)
Requirement already satisfied: six>=1.9.0 in /usr/lib/python3/dist-packages (from blessed<2.0.0,>=1.18.1->milatools) (1.16.0)
Requirement already satisfied: ptera<2.0.0,>=1.4.1 in ./.local/lib/python3.10/site-packages (from coleo<0.4.0,>=0.3.0->milatools) (1.4.1)
Requirement already satisfied: invoke<2.0,>=1.3 in ./.local/lib/python3.10/site-packages (from Fabric<3.0.0,>=2.7.0->milatools) (1.7.3)
Requirement already satisfied: paramiko>=2.4 in /usr/lib/python3/dist-packages (from Fabric<3.0.0,>=2.7.0->milatools) (2.9.3)
Requirement already satisfied: pathlib2 in ./.local/lib/python3.10/site-packages (from Fabric<3.0.0,>=2.7.0->milatools) (2.3.7.post1)
Requirement already satisfied: prompt_toolkit<4.0,>=2.0 in ./.local/lib/python3.10/site-packages (from questionary<2.0.0,>=1.10.0->milatools) (3.0.39)
Requirement already satisfied: codefind<0.2.0,>=0.1.2 in ./.local/lib/python3.10/site-packages (from ptera<2.0.0,>=1.4.1->coleo<0.4.0,>=0.3.0->milatools) (0.1.3)
Requirement already satisfied: giving<0.5.0,>=0.4.1 in ./.local/lib/python3.10/site-packages (from ptera<2.0.0,>=1.4.1->coleo<0.4.0,>=0.3.0->milatools) (0.4.2)
Requirement already satisfied: asttokens<3.0.0,>=2.2.1 in ./.local/lib/python3.10/site-packages (from giving<0.5.0,>=0.4.1->ptera<2.0.0,>=1.4.1->coleo<0.4.0,>=0.3.0->milatools) (2.4.0)
Requirement already satisfied: reactivex<5.0.0,>=4.0.0 in ./.local/lib/python3.10/site-packages (from giving<0.5.0,>=0.4.1->ptera<2.0.0,>=1.4.1->coleo<0.4.0,>=0.3.0->milatools) (4.0.4)
Requirement already satisfied: varname<0.11.0,>=0.10.0 in ./.local/lib/python3.10/site-packages (from giving<0.5.0,>=0.4.1->ptera<2.0.0,>=1.4.1->coleo<0.4.0,>=0.3.0->milatools) (0.10.0)
Requirement already satisfied: typing-extensions<5.0.0,>=4.1.1 in ./.local/lib/python3.10/site-packages (from reactivex<5.0.0,>=4.0.0->giving<0.5.0,>=0.4.1->ptera<2.0.0,>=1.4.1->coleo<0.4.0,>=0.3.0->milatools) (4.7.1)
Requirement already satisfied: executing<2.0,>=1.1 in ./.local/lib/python3.10/site-packages (from varname<0.11.0,>=0.10.0->giving<0.5.0,>=0.4.1->ptera<2.0.0,>=1.4.1->coleo<0.4.0,>=0.3.0->milatools) (1.2.0)

What command did you run?

 mila code /home/mila/c/charlotte.lange/scratch/neurips23/causalpaca --job 3703232

Describe the bug

Cannot access interactive job with mila code. Traceback:

(mila) $ squeue --jobs 3703232 -ho %N
cn-a010
Traceback (most recent call last):
  File "/home/fortheswarm/.local/lib/python3.10/site-packages/milatools/cli/commands.py", line 43, in main
    auto_cli(milatools)
  File "/home/fortheswarm/.local/lib/python3.10/site-packages/coleo/cli.py", line 656, in auto_cli
    result = run_cli(entry, args, **kwargs)
  File "/home/fortheswarm/.local/lib/python3.10/site-packages/coleo/cli.py", line 628, in run_cli
    return call(opts=opts, args=args)
  File "/home/fortheswarm/.local/lib/python3.10/site-packages/coleo/cli.py", line 587, in thunk
    result = fn(*args)
  File "/home/fortheswarm/.local/lib/python3.10/site-packages/milatools/cli/commands.py", line 288, in code
    cnode = _find_allocation(remote, job_name="mila-code")
  File "/home/fortheswarm/.local/lib/python3.10/site-packages/milatools/cli/commands.py", line 703, in _find_allocation
    return Remote(node_name)
  File "/home/fortheswarm/.local/lib/python3.10/site-packages/milatools/cli/remote.py", line 84, in __init__
    connection.open()
  File "/home/fortheswarm/.local/lib/python3.10/site-packages/fabric/connection.py", line 636, in open
    self.client.connect(**kwargs)
  File "/usr/lib/python3/dist-packages/paramiko/client.py", line 340, in connect
    to_try = list(self._families_and_addresses(hostname, port))
  File "/usr/lib/python3/dist-packages/paramiko/client.py", line 203, in _families_and_addresses
    addrinfos = socket.getaddrinfo(
  File "/usr/lib/python3.10/socket.py", line 955, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -3] Temporary failure in name resolution

An error occured during the execution of the command `code`. Please try updating milatools by running
  pip install milatools --upgrade
in the terminal. If the issue persists, consider filling a bug report at
  https://github.com/mila-iqia/milatools/issues/new?labels=code%2C0.0.18&template=bug_report.md&title=%5Bv0.0.18%5D+Issue+running+the+command+%60mila+code%60

Screenshots

image

Desktop (please complete the following information):

Ubuntu 22.04.3 LTS 64-bit, GNOME 42.9

Additional Context

interactive job started with:


salloc --time=4:0:0  --gres=gpu:1 --mem=24G -c 1
salloc: --------------------------------------------------------------------------------------------------
salloc: # Using default long partition
salloc: --------------------------------------------------------------------------------------------------
salloc: Granted job allocation 3703232
salloc: Waiting for resource configuration
salloc: Nodes cn-a010 are ready for job
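
The socket.gaierror in the traceback means the local machine could not resolve the name it was asked to connect to. A small check along these lines can turn that into a clearer message before a connection is attempted. The helper name `can_resolve` is hypothetical, and whether the short node name or a `<node>.server.mila.quebec` form (the pattern seen in other reports) should be tried is an assumption.

```python
import socket


def can_resolve(hostname: str, port: int = 22) -> bool:
    """Return True if hostname resolves to at least one address."""
    try:
        return len(socket.getaddrinfo(hostname, port)) > 0
    except (socket.gaierror, UnicodeError):
        # gaierror: the resolver could not find the name.
        # UnicodeError: the name is not a valid hostname at all.
        return False
```

A caller could test `can_resolve("cn-a010")` first and fall back to `can_resolve("cn-a010.server.mila.quebec")`, reporting which name failed instead of surfacing the raw traceback.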

mila code <script dir> results in FileNotFoundError for 'code' folder

The mila code path/to/my/experiment command does not seem to work correctly, or I'm doing something wrong. What would be the appropriate way of running a Python script that's in $HOME/GTaxoGym/main.py?

(local) $ code --remote ssh-remote+kepler4.server.mila.quebec /home/mila/s/semih.canturk/GTaxoGym/main.py
Traceback (most recent call last):
  File "/usr/local/bin/mila", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.8/site-packages/milatools/commands.py", line 14, in main
    auto_cli(milatools)
  File "/usr/local/lib/python3.8/site-packages/coleo/cli.py", line 613, in auto_cli
    result = run_cli(entry, args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/coleo/cli.py", line 585, in run_cli
    return call(opts=opts, args=args)
  File "/usr/local/lib/python3.8/site-packages/coleo/cli.py", line 544, in thunk
    result = fn(*args)
  File "/usr/local/lib/python3.8/site-packages/ptera/core.py", line 853, in __call__
    rval = super().__call__(*self.partial_args, *args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/ptera/selfless.py", line 656, in __call__
    return self.fn(self, *args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/milatools/commands.py", line 192, in code
    here.run("code", "--remote", f"ssh-remote+{node_name}.server.mila.quebec", path)
  File "/usr/local/lib/python3.8/site-packages/milatools/utils.py", line 29, in run
    return subprocess.run(
  File "/usr/local/Cellar/python@3.8/3.8.11/Frameworks/Python.framework/Versions/3.8/lib/python3.8/subprocess.py", line 493, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/usr/local/Cellar/python@3.8/3.8.11/Frameworks/Python.framework/Versions/3.8/lib/python3.8/subprocess.py", line 858, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/usr/local/Cellar/python@3.8/3.8.11/Frameworks/Python.framework/Versions/3.8/lib/python3.8/subprocess.py", line 1704, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'code'
semo@Semihs-MacBook-Pro ~ % 
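
The final FileNotFoundError for 'code' refers to the `code` executable on the local machine, not to a folder on the cluster: the traceback shows milatools running `code --remote ssh-remote+<node>.server.mila.quebec <path>` locally. Below is a minimal sketch of checking for the executable first, using the standard library's `shutil.which`. The function names are hypothetical.

```python
import shutil
from typing import List, Optional


def find_code_executable(name: str = "code") -> Optional[str]:
    """Return the full path of the VS Code launcher, or None if absent."""
    return shutil.which(name)


def build_code_command(node_name: str, path: str) -> List[str]:
    """Build the argument list shown in the traceback above."""
    return [
        "code",
        "--remote",
        f"ssh-remote+{node_name}.server.mila.quebec",
        path,
    ]


if find_code_executable() is None:
    print("`code` is not on PATH; it has to be installed before launching.")
print(build_code_command("kepler4", "/home/mila/s/semih.canturk/GTaxoGym"))
```

On macOS, VS Code offers a "Shell Command: Install 'code' command in PATH" entry in its command palette, which is the usual way to make the launcher available.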

[v0.0.16] Issue when using the --persist flag for Mila code

Make sure you can reproduce the issue with the latest version available

pip install milatools --upgrade

What command did you run?

mila code /network/scratch/a/anna.richter/ --persist --alloc --gres=gpu:1 --partition=long --mem=32G --time=0-10:00:00

Describe the bug

The error solely occurs when using the --persist flag

Traceback (most recent call last):
  File "C:\Users\Anna Richter\Documents\GitHub\Biasly_Mila\venv\lib\site-packages\milatools\cli\commands.py", line 42, in main
    auto_cli(milatools)
  File "C:\Users\Anna Richter\Documents\GitHub\Biasly_Mila\venv\lib\site-packages\coleo\cli.py", line 656, in auto_cli
    result = run_cli(entry, args, **kwargs)
  File "C:\Users\Anna Richter\Documents\GitHub\Biasly_Mila\venv\lib\site-packages\coleo\cli.py", line 628, in run_cli
    return call(opts=opts, args=args)
  File "C:\Users\Anna Richter\Documents\GitHub\Biasly_Mila\venv\lib\site-packages\coleo\cli.py", line 587, in thunk
    result = fn(*args)
  File "C:\Users\Anna Richter\Documents\GitHub\Biasly_Mila\venv\lib\site-packages\milatools\cli\commands.py", line 314, in code
    data, proc = cnode.ensure_allocation()
  File "C:\Users\Anna Richter\Documents\GitHub\Biasly_Mila\venv\lib\site-packages\milatools\cli\remote.py", line 251, in ensure_allocation     
    proc, results = self.extract(
  File "C:\Users\Anna Richter\Documents\GitHub\Biasly_Mila\venv\lib\site-packages\milatools\cli\remote.py", line 139, in extract
    proc = self.run(cmd, asynchronous=True, out_stream=qio, **kwargs)
  File "C:\Users\Anna Richter\Documents\GitHub\Biasly_Mila\venv\lib\site-packages\milatools\cli\remote.py", line 127, in run
    cmd = transform(cmd)
  File "C:\Users\Anna Richter\Documents\GitHub\Biasly_Mila\venv\lib\site-packages\milatools\cli\remote.py", line 234, in srun_transform_persist
    self.puttext(batch, batch_file)
  File "C:\Users\Anna Richter\Documents\GitHub\Biasly_Mila\venv\lib\site-packages\milatools\cli\remote.py", line 178, in puttext
    self.put(f.name, dest)
  File "C:\Users\Anna Richter\Documents\GitHub\Biasly_Mila\venv\lib\site-packages\milatools\cli\remote.py", line 170, in put
    return self.connection.put(src, dest)
  File "C:\Users\Anna Richter\Documents\GitHub\Biasly_Mila\venv\lib\site-packages\fabric\connection.py", line 870, in put
    return Transfer(self).put(*args, **kwargs)
  File "C:\Users\Anna Richter\Documents\GitHub\Biasly_Mila\venv\lib\site-packages\fabric\transfer.py", line 311, in put
    self.sftp.put(localpath=local, remotepath=remote)
  File "C:\Users\Anna Richter\Documents\GitHub\Biasly_Mila\venv\lib\site-packages\paramiko\sftp_client.py", line 758, in put
    with open(localpath, "rb") as fl:
PermissionError: [Errno 13] Permission denied: 'C:\\Users\\ANNARI~1\\AppData\\Local\\Temp\\tmp79b62pzx'
An error occured during the execution of the command `code`. Please try updating milatools by running
  pip install milatools --upgrade
in the terminal. If the issue persists, consider filling a bug report at https://github.com/mila-iqia/milatools/issues/new?labels=code%2C0.0.16&template=bug_report.md&title=%5Bv0.0.16%5D+Issue+running+the+command+%60mila+code%60
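
A plausible cause, not confirmed in the report: on Windows, a file created with `tempfile.NamedTemporaryFile` generally cannot be opened a second time while the first handle is still open, and the traceback shows puttext passing `f.name` to a function that reopens it. Below is a minimal sketch of the usual workaround, closing the file before handing its path on. `with_temp_text_file` and `consumer` are hypothetical names.

```python
import os
import tempfile
from typing import Callable, TypeVar

T = TypeVar("T")


def with_temp_text_file(text: str, consumer: Callable[[str], T]) -> T:
    """Write text to a temporary file, close it, then call consumer(path)."""
    # delete=False so the file survives close() and can be reopened by path.
    handle = tempfile.NamedTemporaryFile("w", suffix=".sh", delete=False)
    try:
        handle.write(text)
        handle.close()
        return consumer(handle.name)
    finally:
        handle.close()  # no-op if already closed
        os.remove(handle.name)


def read_back(path: str) -> str:
    """Stand-in for an uploader: reopen the file by path and read it."""
    with open(path, "r") as reopened:
        return reopened.read()


print(with_temp_text_file("#!/bin/bash\necho hello\n", read_back))
```

In the real code path, `consumer` would be whatever uploads the file (the traceback shows `self.put(f.name, dest)`), called only after the temporary file has been closed.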

Desktop (please complete the following information):

  • OS: Windows 10 Pro, version 22H2

Unable to install milatools with Poetry

It seems that the sshconf package isn't set up to support PEP 517 builds. Perhaps we need to replace that package with a fork?
Alternatively, we could stop using Poetry. Any other solutions?

Make sure you can reproduce the issue with the latest version available

pip install milatools --upgrade
[milatools command e.g. mila code ...]

What command did you run?

conda create -n milatools python=3.11
conda activate milatools
pip install poetry
poetry install

Describe the bug

$ poetry install
Installing dependencies from lock file

Package operations: 1 install, 0 updates, 0 removals

  • Installing sshconf (0.2.2): Failed

  ChefBuildError

  Backend subprocess exited when trying to invoke get_requires_for_build_wheel
  
  <string>:3: _DeprecatedInstaller: setuptools.installer and fetch_build_eggs are deprecated.
  !!
  
          ********************************************************************************
          Requirements should be satisfied by a PEP 517 installer.
          If you are using pip, you can try `pip install --use-pep517`.
          ********************************************************************************
  
  !!
  /tmp/tmpshx_m0_f/.venv/bin/python: No module named pip
  Traceback (most recent call last):
    File "/tmp/tmpshx_m0_f/.venv/lib/python3.10/site-packages/setuptools/installer.py", line 101, in _fetch_build_egg_no_warn
      subprocess.check_call(cmd)
    File "/home/fabricenormandin/miniconda3/envs/milatools/lib/python3.10/subprocess.py", line 369, in check_call
      raise CalledProcessError(retcode, cmd)
  subprocess.CalledProcessError: Command '['/tmp/tmpshx_m0_f/.venv/bin/python', '-m', 'pip', '--disable-pip-version-check', 'wheel', '--no-deps', '-w', '/tmp/tmp7j21z310', '--quiet', 'versiontag']' returned non-zero exit status 1.
  
  The above exception was the direct cause of the following exception:
  
  Traceback (most recent call last):
    File "/home/fabricenormandin/miniconda3/envs/milatools/lib/python3.10/site-packages/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
      main()
    File "/home/fabricenormandin/miniconda3/envs/milatools/lib/python3.10/site-packages/pyproject_hooks/_in_process/_in_process.py", line 335, in main
      json_out['return_val'] = hook(**hook_input['kwargs'])
    File "/home/fabricenormandin/miniconda3/envs/milatools/lib/python3.10/site-packages/pyproject_hooks/_in_process/_in_process.py", line 118, in get_requires_for_build_wheel
      return hook(config_settings)
    File "/tmp/tmpshx_m0_f/.venv/lib/python3.10/site-packages/setuptools/build_meta.py", line 325, in get_requires_for_build_wheel
      return self._get_build_requires(config_settings, requirements=['wheel'])
    File "/tmp/tmpshx_m0_f/.venv/lib/python3.10/site-packages/setuptools/build_meta.py", line 295, in _get_build_requires
      self.run_setup()
    File "/tmp/tmpshx_m0_f/.venv/lib/python3.10/site-packages/setuptools/build_meta.py", line 480, in run_setup
      super(_BuildMetaLegacyBackend, self).run_setup(setup_script=setup_script)
    File "/tmp/tmpshx_m0_f/.venv/lib/python3.10/site-packages/setuptools/build_meta.py", line 311, in run_setup
      exec(code, locals())
    File "<string>", line 3, in <module>
    File "/tmp/tmpshx_m0_f/.venv/lib/python3.10/site-packages/setuptools/dist.py", line 636, in fetch_build_eggs
      return _fetch_build_eggs(self, requires)
    File "/tmp/tmpshx_m0_f/.venv/lib/python3.10/site-packages/setuptools/installer.py", line 38, in _fetch_build_eggs
      resolved_dists = pkg_resources.working_set.resolve(
    File "/tmp/tmpshx_m0_f/.venv/lib/python3.10/site-packages/pkg_resources/__init__.py", line 829, in resolve
      dist = self._resolve_dist(
    File "/tmp/tmpshx_m0_f/.venv/lib/python3.10/site-packages/pkg_resources/__init__.py", line 865, in _resolve_dist
      dist = best[req.key] = env.best_match(
    File "/tmp/tmpshx_m0_f/.venv/lib/python3.10/site-packages/pkg_resources/__init__.py", line 1135, in best_match
      return self.obtain(req, installer)
    File "/tmp/tmpshx_m0_f/.venv/lib/python3.10/site-packages/pkg_resources/__init__.py", line 1147, in obtain
      return installer(requirement)
    File "/tmp/tmpshx_m0_f/.venv/lib/python3.10/site-packages/setuptools/installer.py", line 103, in _fetch_build_egg_no_warn
      raise DistutilsError(str(e)) from e
  distutils.errors.DistutilsError: Command '['/tmp/tmpshx_m0_f/.venv/bin/python', '-m', 'pip', '--disable-pip-version-check', 'wheel', '--no-deps', '-w', '/tmp/tmp7j21z310', '--quiet', 'versiontag']' returned non-zero exit status 1.
  

  at ~/miniconda3/envs/milatools/lib/python3.10/site-packages/poetry/installation/chef.py:164 in _prepare
      160│ 
      161│                 error = ChefBuildError("\n\n".join(message_parts))
      162│ 
      163│             if error is not None:
    → 164│                 raise error from None
      165│ 
      166│             return path
      167│ 
      168│     def _prepare_sdist(self, archive: Path, destination: Path | None = None) -> Path:

Note: This error originates from the build backend, and is likely not a problem with poetry but with sshconf (0.2.2) not supporting PEP 517 builds. You can verify this by running 'pip wheel --no-cache-dir --use-pep517 "sshconf (==0.2.2)"'.

Desktop (please complete the following information):

Ubuntu 20.04 LTS

Feature suggestion: config file

It would be great to be able to specify an identifier (such as a project name) that maps a project to a path, possibly per cluster. Something like:

# ~/.milatools.yaml
myproject:
  mila: /home/mila/s/schmidtv/project/code/repos/repoName
  beluga: /home/vsch/scratch/project/repos/repoName

myproject2:
  mila: /home/mila/s/schmidtv/thatthing/foo/proj2
  beluga: /home/vsch/scratch/thatotherthing/bar/proj2-github

A project could then be selected with -p myproject2 (or -i myproject2).


This feature could even be pushed further, with per-project and per-cluster alloc defaults:

# ~/.milatools.yaml
myproject:
  mila: 
    path: /home/mila/s/schmidtv/project/code/repos/repoName
    alloc:
      cpus: 2
      memory: 12G
  beluga: 
    path: /home/vsch/scratch/project/repos/repoName

myproject2:
  mila: 
    path: /home/mila/s/schmidtv/thatthing/foo/proj2
    alloc:
      memory: 16G
  beluga: 
    path: /home/vsch/scratch/thatotherthing/bar/proj2-github

The defaults could then potentially be overwritten with something like --remember-alloc.
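
To make the proposal concrete, here is a minimal sketch of how such a config could be resolved for a given project and cluster. It is only an illustration: the resolve_project function is hypothetical, and the config is written as a Python dict mirroring the YAML above so the example does not depend on a YAML parser. Both the short form (a plain path) and the long form (path plus alloc) are accepted.

```python
from typing import Any, Dict, Tuple

# In-memory equivalent of the proposed ~/.milatools.yaml (hypothetical format).
CONFIG: Dict[str, Dict[str, Any]] = {
    "myproject": {
        "mila": {
            "path": "/home/mila/s/schmidtv/project/code/repos/repoName",
            "alloc": {"cpus": 2, "memory": "12G"},
        },
        "beluga": {"path": "/home/vsch/scratch/project/repos/repoName"},
    },
    "myproject2": {
        # Short form: the cluster maps directly to a path.
        "mila": "/home/mila/s/schmidtv/thatthing/foo/proj2",
        "beluga": "/home/vsch/scratch/thatotherthing/bar/proj2-github",
    },
}


def resolve_project(
    config: Dict[str, Dict[str, Any]], project: str, cluster: str
) -> Tuple[str, Dict[str, Any]]:
    """Return (path, alloc_defaults) for a project on a cluster."""
    entry = config[project][cluster]
    if isinstance(entry, str):
        # Short form has no alloc defaults.
        return entry, {}
    return entry["path"], dict(entry.get("alloc", {}))
```

With this shape, resolve_project(CONFIG, "myproject", "mila") yields the repo path together with the cpus and memory defaults, while the short-form entries yield an empty set of defaults.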

[v0.0.18] Issue running the command `mila init`

What command did you run?

mila init

Describe the bug

I am following the setup procedure in the user guide. After successfully installing milatools, the mila init command fails with this traceback:
  File "/home/mptouzel/miniconda3/lib/python3.10/site-packages/milatools/cli/commands.py", line 82, in main
    mila()
  File "/home/mptouzel/miniconda3/lib/python3.10/site-packages/milatools/cli/commands.py", line 385, in mila
    return function(**args_dict)
  File "/home/mptouzel/miniconda3/lib/python3.10/site-packages/milatools/cli/commands.py", line 455, in init
    success = setup_passwordless_ssh_access(ssh_config=ssh_config)
  File "/home/mptouzel/miniconda3/lib/python3.10/site-packages/milatools/cli/init_command.py", line 252, in setup_passwordless_ssh_access
    success = setup_passwordless_ssh_access_to_cluster("mila")
  File "/home/mptouzel/miniconda3/lib/python3.10/site-packages/milatools/cli/init_command.py", line 298, in setup_passwordless_ssh_access_to_cluster
    assert ssh_public_key_path.exists()
AssertionError

Desktop (please complete the following information):

WSL2 on Windows 11

Implement cluster-aware dataloaders

Purpose

Provide an interface, akin to from milatools.datasets.torch import ImageNet, to acquire and use certain datasets in a location-aware way. For example, when running on the Mila cluster, it should copy ImageNet from its standard location in /network/datasets to the local scratch. Elsewhere, it should use the user-provided path (downloading the dataset if necessary).

Given the existing attempts in #8 and #10 by @manuel-delverme and @Delaunay, I'm taking a step back and creating this issue so that we have a central place to discuss this. I want to make sure we are all on the same page.

Requisites

  1. Ease of use. Whenever possible, when using torch, we should mirror the torchvision or torchtext interface, so all the user needs to do is swap out the import:
from milatools.datasets.torch import CIFAR10
...
# On local machine: uses ../data
# On the Mila cluster: uses /network/datasets/whatever_the_cifar_path_is
data = CIFAR10("../data")
  2. Supports important locations: the user's local machine, the Mila cluster, the CC cluster, possibly others in the future. For this reason I believe the code should be modular with respect to the concept of "location" or "environment".

  3. Supports important frameworks: for now this is arguably just PyTorch, but we need to be careful not to induce unnecessary dependencies on a single framework. If e.g. a large number of researchers move to JAX we should be able to gracefully support it without clashing with PyTorch support.
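
As a starting point for discussion, the "location" concept could be kept independent of any framework with something like the sketch below. Everything here is hypothetical: the function names, the registry, and in particular the assumption that the Slurm variable SLURM_CLUSTER_NAME is set to "mila" inside jobs on the Mila cluster. It only picks a dataset root and does not copy anything to local scratch.

```python
import os
from pathlib import Path
from typing import Dict, Mapping

# Hypothetical registry mapping a location name to its shared dataset root.
CLUSTER_ROOTS: Dict[str, Path] = {
    "mila": Path("/network/datasets"),
}


def detect_location(environ: Mapping[str, str]) -> str:
    """Guess where the code is running; falls back to "local"."""
    # Assumption: the cluster name exposed by Slurm identifies the location.
    return environ.get("SLURM_CLUSTER_NAME", "local")


def dataset_root(user_root: str, environ: Mapping[str, str] = os.environ) -> Path:
    """Return the cluster's dataset root if known, else the user's path."""
    location = detect_location(environ)
    return CLUSTER_ROOTS.get(location, Path(user_root))
```

A framework-specific wrapper (for example a CIFAR10 class mirroring torchvision) would then only need to call dataset_root("../data") before delegating to the underlying dataset class, and supporting a new location means adding one registry entry.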

Anyone is welcome to contribute to this discussion.

[v0.0.17] Issue running the command `mila code`: Paramiko error

Make sure you can reproduce the issue with the latest version available

pip install milatools --upgrade
[milatools command e.g. mila code ...]

Yes I did

What command did you run?

mila code Research/DomainGeneralization --alloc --gres=gpu:1 -c 4 --mem=24000

Describe the bug

Traceback (most recent call last):
  File "C:\ProgramData\miniconda3\lib\site-packages\milatools\cli\commands.py", line 43, in main
    auto_cli(milatools)
  File "C:\ProgramData\miniconda3\lib\site-packages\coleo\cli.py", line 656, in auto_cli
    result = run_cli(entry, args, **kwargs)
  File "C:\ProgramData\miniconda3\lib\site-packages\coleo\cli.py", line 628, in run_cli
    return call(opts=opts, args=args)
  File "C:\ProgramData\miniconda3\lib\site-packages\coleo\cli.py", line 587, in thunk
    result = fn(*args)
  File "C:\ProgramData\miniconda3\lib\site-packages\milatools\cli\commands.py", line 285, in code
    remote = Remote("mila")
  File "C:\ProgramData\miniconda3\lib\site-packages\milatools\cli\remote.py", line 84, in __init__
    connection.open()
  File "C:\ProgramData\miniconda3\lib\site-packages\fabric\connection.py", line 636, in open
    self.client.connect(**kwargs)
  File "C:\ProgramData\miniconda3\lib\site-packages\paramiko\client.py", line 459, in connect
    self._auth(
  File "C:\ProgramData\miniconda3\lib\site-packages\paramiko\client.py", line 717, in _auth
    self._agent = Agent()
  File "C:\ProgramData\miniconda3\lib\site-packages\paramiko\agent.py", line 406, in __init__
    self._connect(conn)
  File "C:\ProgramData\miniconda3\lib\site-packages\paramiko\agent.py", line 79, in _connect
    ptype, result = self._send_message(cSSH2_AGENTC_REQUEST_IDENTITIES)
  File "C:\ProgramData\miniconda3\lib\site-packages\paramiko\agent.py", line 96, in _send_message
    self._conn.send(struct.pack(">I", len(msg)) + msg)
  File "C:\ProgramData\miniconda3\lib\site-packages\paramiko\win_pageant.py", line 126, in send
    self._response = _query_pageant(data)
  File "C:\ProgramData\miniconda3\lib\site-packages\paramiko\win_pageant.py", line 93, in _query_pageant
    pymap.write(msg)
  File "C:\ProgramData\miniconda3\lib\site-packages\paramiko\_winapi.py", line 176, in write
    dest = self.view + self.pos
TypeError: unsupported operand type(s) for +: 'NoneType' and 'int'

Desktop (please complete the following information):

  • OS: Windows 10

[v0.0.18] Issue running the command `mila code`: KeyError: 'node_name'

Make sure you can reproduce the issue with the latest version available

pip install milatools --upgrade
[milatools command e.g. mila code ...]

What command did you run?

[e.g. mila code ...]

Describe the bug

Can't open mila code.
Traceback (most recent call last):
  File "/Users/annabel/miniconda3/lib/python3.10/site-packages/milatools/cli/commands.py", line 43, in main
    auto_cli(milatools)
  File "/Users/annabel/miniconda3/lib/python3.10/site-packages/coleo/cli.py", line 656, in auto_cli
    result = run_cli(entry, args, **kwargs)
  File "/Users/annabel/miniconda3/lib/python3.10/site-packages/coleo/cli.py", line 628, in run_cli
    return call(opts=opts, args=args)
  File "/Users/annabel/miniconda3/lib/python3.10/site-packages/coleo/cli.py", line 587, in thunk
    result = fn(*args)
  File "/Users/annabel/miniconda3/lib/python3.10/site-packages/milatools/cli/commands.py", line 291, in code
    data, proc = cnode.ensure_allocation()
  File "/Users/annabel/miniconda3/lib/python3.10/site-packages/milatools/cli/remote.py", line 271, in ensure_allocation
    node_name = get_first_node_name(results["node_name"])
KeyError: 'node_name'

Screenshots

If applicable, add screenshots to help explain your problem.
Screenshot 2023-06-29 at 12 55 52 PM

Desktop (please complete the following information):

Mac OS 13.4

[v0.1.2] Intermittent connection errors ("ssh-copy-id appears to have failed", "An error occured while trying to establish a connection with mila")

Intermittent errors during mila init and other commands:

'ssh-copy-id mila' appears to have failed!
ERROR: An error happened while trying to establish a connection with mila
       -The cluster might be under maintenance
          Check #mila-cluster for updates on the state of the cluster
       -Check the status of your connection to the cluster by ssh'ing onto it.
       -Retry connecting with mila
       -Try to exclude the node with -x mila parameter

For example:

$ mila -vvv init
(...)
Checking connection to compute nodes
(...)
[02/14/24 14:19:21] ERROR    2024-02-14 14:19:21,979 - ERROR - Exception (client): Error reading SSH protocol banner                                      transport.py:1893
                   ERROR    2024-02-14 14:19:21,985 - ERROR - Traceback (most recent call last):                                                         transport.py:1891
                   ERROR    2024-02-14 14:19:21,988 - ERROR -   File                                                                                     transport.py:1891
                            "/home/fabrice/miniconda3/envs/milatools/lib/python3.11/site-packages/paramiko/transport.py", line 2292, in _check_banner                     
                   ERROR    2024-02-14 14:19:21,991 - ERROR -     buf = self.packetizer.readline(timeout)                                                transport.py:1891
                   ERROR    2024-02-14 14:19:21,993 - ERROR -           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^                                                transport.py:1891
                   ERROR    2024-02-14 14:19:21,994 - ERROR -   File                                                                                     transport.py:1891
                            "/home/fabrice/miniconda3/envs/milatools/lib/python3.11/site-packages/paramiko/packet.py", line 374, in readline                              
                   ERROR    2024-02-14 14:19:21,996 - ERROR -     buf += self._read_timeout(timeout)                                                     transport.py:1891
                   ERROR    2024-02-14 14:19:21,998 - ERROR -            ^^^^^^^^^^^^^^^^^^^^^^^^^^^                                                     transport.py:1891
                   ERROR    2024-02-14 14:19:21,999 - ERROR -   File                                                                                     transport.py:1891
                            "/home/fabrice/miniconda3/envs/milatools/lib/python3.11/site-packages/paramiko/packet.py", line 611, in _read_timeout                         
[02/14/24 14:19:22] ERROR    2024-02-14 14:19:22,001 - ERROR -     raise socket.timeout()                                                                 transport.py:1891
                   ERROR    2024-02-14 14:19:22,002 - ERROR - TimeoutError                                                                               transport.py:1891
                   ERROR    2024-02-14 14:19:22,004 - ERROR -                                                                                            transport.py:1891
                   ERROR    2024-02-14 14:19:22,005 - ERROR - During handling of the above exception, another exception occurred:                        transport.py:1891
                   ERROR    2024-02-14 14:19:22,007 - ERROR -                                                                                            transport.py:1891
                   ERROR    2024-02-14 14:19:22,008 - ERROR - Traceback (most recent call last):                                                         transport.py:1891
                   ERROR    2024-02-14 14:19:22,009 - ERROR -   File                                                                                     transport.py:1891
                            "/home/fabrice/miniconda3/envs/milatools/lib/python3.11/site-packages/paramiko/transport.py", line 2113, in run                               
                   ERROR    2024-02-14 14:19:22,010 - ERROR -     self._check_banner()                                                                   transport.py:1891
                   ERROR    2024-02-14 14:19:22,011 - ERROR -   File                                                                                     transport.py:1891
                            "/home/fabrice/miniconda3/envs/milatools/lib/python3.11/site-packages/paramiko/transport.py", line 2296, in _check_banner                     
                   ERROR    2024-02-14 14:19:22,012 - ERROR -     raise SSHException(                                                                    transport.py:1891
                   ERROR    2024-02-14 14:19:22,013 - ERROR - paramiko.ssh_exception.SSHException: Error reading SSH protocol banner                     transport.py:1891
                   ERROR    2024-02-14 14:19:22,014 - ERROR -                                                                                            transport.py:1891
ERROR: An error happened while trying to establish a connection with mila
       -The cluster might be under maintenance
          Check #mila-cluster for updates on the state of the cluster
       -Check the status of your connection to the cluster by ssh'ing onto it.
       -Retry connecting with mila
       -Try to exclude the node with -x mila parameter

These all seem to be caused by this "Error reading SSH protocol banner" error.
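
If the banner errors are transient, one possible mitigation is to retry the connection a few times with an increasing delay. The helper below is a generic sketch, not what milatools does: the retry function, its parameters, and the backoff schedule are all assumptions made for illustration.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry(
    fn: Callable[[], T],
    attempts: int = 3,
    delay: float = 1.0,
    exceptions: Tuple[Type[BaseException], ...] = (Exception,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `fn` until it succeeds, doubling the delay after each failure."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except exceptions:
            if attempt == attempts:
                # Out of attempts: let the last error propagate unchanged.
                raise
            sleep(delay * 2 ** (attempt - 1))
    raise ValueError("attempts must be at least 1")
```

It could be used to wrap the call that opens the connection, passing the exception type seen in the log above (paramiko's SSHException) as the only retriable error so that genuine configuration problems still fail immediately.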

Mila code: Ctrl+C while in the "Job X awaiting resources" doesn't cancel request

When running the following, and cancelling:

(base) fabrice@fabrice-XPS-15-9570:~/Source$ mila code ~/scratch/milabench --alloc --gres=cpu:4 --gres=gpu:2 --mem=8G
(local) $ ssh mila -fNMS /home/fabrice/.ssh/sockets/milatools.mila
(mila) $ salloc --gres=cpu:4 --gres=gpu:2 --mem=8G
# Control socket connect(/home/fabrice/.ssh/sockets/milatools.mila): Connection refused
# salloc: --------------------------------------------------------------------------------------------------
# salloc: # Using default long partition
# salloc: --------------------------------------------------------------------------------------------------
# salloc: Pending job allocation 1706105
# salloc: job 1706105 queued and waiting for resources
^CCanceled
(base) fabrice@fabrice-XPS-15-9570:~/Source$ 

If I then ssh mila and run squeue -u normandf, I see that the requested job was not cancelled:

(base) normandf@login-3:~$ squeue -u normandf
   JOBID     USER    PARTITION           NAME  ST START_TIME             TIME NODES CPUS TRES_PER_N MIN_MEM NODELIST (REASON) COMMENT
 1706105 normandf         long    interactive  PD 2022-04-05T13:38       0:00     1    1      gpu:2      8G  (Resources) (null)

The job is cancelled correctly if it is interrupted after it starts though.
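
One way to handle the pending case would be to remember the job ID that salloc prints and explicitly cancel it when the user interrupts. The sketch below only covers the parsing and the command to run. The function names are hypothetical, and it assumes the salloc output lines look like the ones shown above.

```python
import re
from typing import Iterable, Optional

# Matches the job ID announced by salloc, before or after resources are granted.
_JOB_ID = re.compile(r"salloc: (?:Pending|Granted) job allocation (\d+)")


def find_job_id(lines: Iterable[str]) -> Optional[int]:
    """Return the first job ID announced in salloc's output, if any."""
    for line in lines:
        match = _JOB_ID.search(line)
        if match:
            return int(match.group(1))
    return None


def cancel_command(job_id: int) -> str:
    """Build the command to run on the login node to cancel the job."""
    return f"scancel {job_id}"
```

On KeyboardInterrupt, the caller would run cancel_command(job_id) on the login node whenever find_job_id returned an ID, regardless of whether the job had started.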

command not found: `salloc`

$ mila code ~/ccai/github --alloc --gres=gpu:1 --mem=16G --partition=main
(local) $ ssh mila -fNMS /Users/victor/.ssh/sockets/milatools.mila
(mila) $ salloc --gres=gpu:1 --mem=16G --partition=main
# zsh:1: command not found: salloc
ERROR: Could not find the node name for the allocation

I think one way to solve this is to run . /etc/profile at login before calling salloc, but I don't really know why it works or what it does; I just remember it as a quick fix.
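
A related option is to run the remote command through a login shell, since bash -l reads /etc/profile at startup. The sketch below builds such a command string. It assumes that /etc/profile is what puts salloc on the PATH, as suggested above, and the in_login_shell helper is hypothetical.

```python
import shlex
from typing import Sequence


def in_login_shell(argv: Sequence[str]) -> str:
    """Wrap a command so that it runs inside a bash login shell."""
    # Quote each argument, then quote the whole command once more so it
    # survives being passed as the single argument of `bash -l -c`.
    inner = " ".join(shlex.quote(arg) for arg in argv)
    return f"bash -l -c {shlex.quote(inner)}"
```

For the command in this report, in_login_shell(["salloc", "--gres=gpu:1", "--mem=16G", "--partition=main"]) produces a single string that can be sent over SSH in place of the bare salloc call.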

[v0.0.16] Issue running the command `mila code`: socket.gaierror: [Errno 8] nodename nor servname provided, or not known

Make sure you can reproduce the issue with the latest version available

pip install milatools --upgrade
[milatools command e.g. mila code ...]

What command did you run?

 mila code --job 3034514 /home/mila/l/luyuchen/alpaca-lora

Describe the bug

I had an interactive job running with

salloc --gres=gpu:a100l --cpus-per-task=8 --time=12:00:00 --mem=64G

But when I tried to run mila code from my local laptop, I got

(base) ➜  ~ mila code --job 3034514 /home/mila/l/luyuchen/alpaca-lora
(mila) $ squeue --jobs 3034514 -ho %N
cn-g016
Traceback (most recent call last):
  File "/Users/yuchen/miniconda3/lib/python3.10/site-packages/milatools/cli/commands.py", line 42, in main
    auto_cli(milatools)
  File "/Users/yuchen/miniconda3/lib/python3.10/site-packages/coleo/cli.py", line 656, in auto_cli
    result = run_cli(entry, args, **kwargs)
  File "/Users/yuchen/miniconda3/lib/python3.10/site-packages/coleo/cli.py", line 628, in run_cli
    return call(opts=opts, args=args)
  File "/Users/yuchen/miniconda3/lib/python3.10/site-packages/coleo/cli.py", line 587, in thunk
    result = fn(*args)
  File "/Users/yuchen/miniconda3/lib/python3.10/site-packages/milatools/cli/commands.py", line 311, in code
    cnode = _find_allocation(remote)
  File "/Users/yuchen/miniconda3/lib/python3.10/site-packages/milatools/cli/commands.py", line 726, in _find_allocation
    return Remote(node_name)
  File "/Users/yuchen/miniconda3/lib/python3.10/site-packages/milatools/cli/remote.py", line 83, in __init__
    connection.open()
  File "/Users/yuchen/miniconda3/lib/python3.10/site-packages/fabric/connection.py", line 636, in open
    self.client.connect(**kwargs)
  File "/Users/yuchen/miniconda3/lib/python3.10/site-packages/paramiko/client.py", line 356, in connect
    to_try = list(self._families_and_addresses(hostname, port))
  File "/Users/yuchen/miniconda3/lib/python3.10/site-packages/paramiko/client.py", line 202, in _families_and_addresses
    addrinfos = socket.getaddrinfo(
  File "/Users/yuchen/miniconda3/lib/python3.10/socket.py", line 955, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno 8] nodename nor servname provided, or not known

An error occured during the execution of the command `code`. Please try updating milatools by running
  pip install milatools --upgrade
in the terminal. If the issue persists, consider filling a bug report at https://github.com/mila-iqia/milatools/issues/new?labels=code%2C0.0.16&template=bug_report.md&title=%5Bv0.0.16%5D+Issue+running+the+command+%60mila+code%60
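
The gaierror suggests that the short node name returned by squeue (cn-g016) cannot be resolved from the local machine. One possible fix, sketched below, is to qualify the name so that it matches the *.server.mila.quebec entry that mila init adds to the SSH config. The helper name is hypothetical, and whether this is sufficient depends on how the connection reads the SSH config.

```python
def qualify_node_name(node_name: str, domain: str = "server.mila.quebec") -> str:
    """Append the cluster domain to a short node name such as "cn-g016"."""
    if "." in node_name:
        # Already qualified: leave it untouched.
        return node_name
    return f"{node_name}.{domain}"
```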

Desktop (please complete the following information):

  • OS: Mac 13.0.1 (22A400)

[v0.0.16] Issue running the command `mila init`: KeyError: 'user'

Make sure you can reproduce the issue with the latest version available

pip install milatools --upgrade
[milatools command e.g. mila code ...]

What command did you run?

mila init

Describe the bug

I get prompted:

Checking ssh config
? There is no '*.server.mila.quebec' entry in ~/.ssh/config. Create one?

I answer yes, and it runs into a KeyError:

Traceback (most recent call last):
  File "/home/vbalvaro/anaconda3/lib/python3.10/site-packages/milatools/cli/commands.py", line 42, in main
    auto_cli(milatools)
  File "/home/vbalvaro/anaconda3/lib/python3.10/site-packages/coleo/cli.py", line 656, in auto_cli
    result = run_cli(entry, args, **kwargs)
  File "/home/vbalvaro/anaconda3/lib/python3.10/site-packages/coleo/cli.py", line 628, in run_cli
    return call(opts=opts, args=args)
  File "/home/vbalvaro/anaconda3/lib/python3.10/site-packages/coleo/cli.py", line 587, in thunk
    result = fn(*args)
  File "/home/vbalvaro/anaconda3/lib/python3.10/site-packages/milatools/cli/commands.py", line 173, in init
    username = c.host("mila")["user"]
KeyError: 'user'
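
The traceback suggests that a mila host entry exists in the SSH config but has no User option. A defensive lookup could fall back to asking the user instead of failing. This is only a sketch: get_username and its ask parameter are hypothetical, and host_options stands for the mapping returned by c.host("mila").

```python
from typing import Callable, Mapping


def get_username(
    host_options: Mapping[str, str],
    ask: Callable[[str], str] = input,
) -> str:
    """Return the configured user, prompting for it when it is missing."""
    user = host_options.get("user")
    if user:
        return user
    # No User option in the host entry: ask instead of raising KeyError.
    return ask("What's your username on the mila cluster? ").strip()
```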

Desktop (please complete the following information):

  • OS: Ubuntu 22.04

Issue with the "--persist" option

What command did you run?

mila code --persist .

Describe the bug


Traceback (most recent call last):
  File "/Users/leogagnon/.pyenv/versions/3.9.16/lib/python3.9/site-packages/milatools/cli/commands.py", line 42, in main
    auto_cli(milatools)
  File "/Users/leogagnon/.pyenv/versions/3.9.16/lib/python3.9/site-packages/coleo/cli.py", line 656, in auto_cli
    result = run_cli(entry, args, **kwargs)
  File "/Users/leogagnon/.pyenv/versions/3.9.16/lib/python3.9/site-packages/coleo/cli.py", line 628, in run_cli
    return call(opts=opts, args=args)
  File "/Users/leogagnon/.pyenv/versions/3.9.16/lib/python3.9/site-packages/coleo/cli.py", line 587, in thunk
    result = fn(*args)
  File "/Users/leogagnon/.pyenv/versions/3.9.16/lib/python3.9/site-packages/milatools/cli/commands.py", line 314, in code
    data, proc = cnode.ensure_allocation()
  File "/Users/leogagnon/.pyenv/versions/3.9.16/lib/python3.9/site-packages/milatools/cli/remote.py", line 251, in ensure_allocation
    proc, results = self.extract(
  File "/Users/leogagnon/.pyenv/versions/3.9.16/lib/python3.9/site-packages/milatools/cli/remote.py", line 139, in extract
    proc = self.run(cmd, asynchronous=True, out_stream=qio, **kwargs)
  File "/Users/leogagnon/.pyenv/versions/3.9.16/lib/python3.9/site-packages/milatools/cli/remote.py", line 127, in run
    cmd = transform(cmd)
  File "/Users/leogagnon/.pyenv/versions/3.9.16/lib/python3.9/site-packages/milatools/cli/remote.py", line 234, in srun_transform_persist
    self.puttext(batch, batch_file)
  File "/Users/leogagnon/.pyenv/versions/3.9.16/lib/python3.9/site-packages/milatools/cli/remote.py", line 178, in puttext
    self.put(f.name, dest)
  File "/Users/leogagnon/.pyenv/versions/3.9.16/lib/python3.9/site-packages/milatools/cli/remote.py", line 170, in put
    return self.connection.put(src, dest)
  File "/Users/leogagnon/.pyenv/versions/3.9.16/lib/python3.9/site-packages/fabric/connection.py", line 870, in put
    return Transfer(self).put(*args, **kwargs)
  File "/Users/leogagnon/.pyenv/versions/3.9.16/lib/python3.9/site-packages/fabric/transfer.py", line 311, in put
    self.sftp.put(localpath=local, remotepath=remote)
  File "/Users/leogagnon/.pyenv/versions/3.9.16/lib/python3.9/site-packages/paramiko/sftp_client.py", line 759, in put
    return self.putfo(fl, remotepath, file_size, callback, confirm)
  File "/Users/leogagnon/.pyenv/versions/3.9.16/lib/python3.9/site-packages/paramiko/sftp_client.py", line 714, in putfo
    with self.file(remotepath, "wb") as fr:
  File "/Users/leogagnon/.pyenv/versions/3.9.16/lib/python3.9/site-packages/paramiko/sftp_client.py", line 372, in open
    t, msg = self._request(CMD_OPEN, filename, imode, attrblock)
  File "/Users/leogagnon/.pyenv/versions/3.9.16/lib/python3.9/site-packages/paramiko/sftp_client.py", line 822, in _request
    return self._read_response(num)
  File "/Users/leogagnon/.pyenv/versions/3.9.16/lib/python3.9/site-packages/paramiko/sftp_client.py", line 874, in _read_response
    self._convert_status(msg)
  File "/Users/leogagnon/.pyenv/versions/3.9.16/lib/python3.9/site-packages/paramiko/sftp_client.py", line 907, in _convert_status
    raise IOError(text)
OSError: Failure

System information

  • macOS 13.2.1
  • M1 pro
  • python 3.9.16

Additional information

Works fine without the "--persist" option

problem regarding 'mila init'

$ mila init
Checking ssh config
Traceback (most recent call last):
  File "/home/pharry/ENV/bin/mila", line 8, in <module>
    sys.exit(main())
  File "/home/pharry/ENV/lib/python3.8/site-packages/milatools/commands.py", line 13, in main
    auto_cli(milatools)
  File "/home/pharry/ENV/lib/python3.8/site-packages/coleo/cli.py", line 611, in auto_cli
    result = run_cli(entry, args, **kwargs)
  File "/home/pharry/ENV/lib/python3.8/site-packages/coleo/cli.py", line 583, in run_cli
    return call(opts=opts, args=args)
  File "/home/pharry/ENV/lib/python3.8/site-packages/coleo/cli.py", line 542, in thunk
    result = fn(*args)
  File "/home/pharry/ENV/lib/python3.8/site-packages/ptera/core.py", line 853, in __call__
    rval = super().__call__(*self.partial_args, *args, **kwargs)
  File "/home/pharry/ENV/lib/python3.8/site-packages/ptera/selfless.py", line 656, in __call__
    return self.fn(self, *args, **kwargs)
  File "/home/pharry/ENV/lib/python3.8/site-packages/milatools/commands.py", line 41, in init
    c = SSHConfig()
  File "/home/pharry/ENV/lib/python3.8/site-packages/milatools/utils.py", line 137, in __init__
    self.cfg = read_ssh_config(os.path.expanduser("~/.ssh/config"))
  File "/home/pharry/ENV/lib/python3.8/site-packages/sshconf.py", line 367, in read_ssh_config
    master_config = read_ssh_config_file(master_path)
  File "/home/pharry/ENV/lib/python3.8/site-packages/sshconf.py", line 121, in read_ssh_config_file
    with open(path, "r") as fh_:
FileNotFoundError: [Errno 2] No such file or directory: '/home/pharry/.ssh/config'
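
The failure occurs because ~/.ssh/config does not exist yet. A possible guard, sketched below under the assumption that an empty config is an acceptable starting point, is to create the directory and file before reading them. The ensure_ssh_config name is hypothetical.

```python
from pathlib import Path


def ensure_ssh_config(path: Path) -> Path:
    """Create the SSH config file (and its directory) if they are missing."""
    # Restrictive permissions, as is customary for ~/.ssh and its contents.
    path.parent.mkdir(mode=0o700, parents=True, exist_ok=True)
    # touch() leaves an existing file and its contents untouched.
    path.touch(mode=0o600, exist_ok=True)
    return path
```

It would be called with Path("~/.ssh/config").expanduser() before the config is parsed.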

Bug with mila init ssh path in WSL

Make sure you can reproduce the issue with the latest version available

pip install milatools --upgrade
[milatools command e.g. mila code ...]

What command did you run?

mila init

Describe the bug

On Windows, for some accounts, the $env:UserName value used to build the path to the home directory does not match the actual name of the home directory. See this. The current code here creates the home path as "/mnt/c/Users/Myriam" when it should be "/mnt/c/Users/Myria". As a workaround, I used this hack:

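import subprocess
from pathlib import Path

def get_windows_home_path_in_wsl() -> Path:
    assert running_inside_WSL()  # helper defined elsewhere
    # $env:HomePath looks like "\Users\Myria" and names the real home folder.
    windows_home = subprocess.getoutput("powershell.exe '$env:HomePath'").strip()
    windows_home = windows_home.replace("\\", "/").lstrip("/")
    return Path(f"/mnt/c/{windows_home}")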

Separately, on a fresh WSL install, I had to follow the steps here before the command that changes the permissions of the ssh folder would work.

Desktop (please complete the following information):

  • OS: Windows 11 with WSL

[v0.0.18] Issue running the command `mila code`

Make sure you can reproduce the issue with the latest version available

pip install milatools --upgrade
[milatools command e.g. mila code ...]

What command did you run?

mila code ./home/mila/t/tooba.rahimnia/ --alloc --gres=gpu:1 --mem=48GB -c 8 --partition=long --time=24:00:00

Describe the bug

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/milatools/cli/commands.py", line 43, in main
    auto_cli(milatools)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/coleo/cli.py", line 656, in auto_cli
    result = run_cli(entry, args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/coleo/cli.py", line 628, in run_cli
    return call(opts=opts, args=args)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/coleo/cli.py", line 587, in thunk
    result = fn(*args)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/milatools/cli/commands.py", line 291, in code
    data, proc = cnode.ensure_allocation()
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/milatools/cli/remote.py", line 271, in ensure_allocation
    node_name = get_first_node_name(results["node_name"])
KeyError: 'node_name'

Desktop (please complete the following information):

  • Mac OS Ventura 13.4.1

[v0.0.18] Issue running the command `mila code`

What command did you run?

mila code .

Describe the bug

(base) C:\Users\rajes>mila code .
Traceback (most recent call last):
  File "C:\Users\rajes\anaconda3\lib\site-packages\milatools\cli\commands.py", line 43, in main
    auto_cli(milatools)
  File "C:\Users\rajes\anaconda3\lib\site-packages\coleo\cli.py", line 656, in auto_cli
    result = run_cli(entry, args, **kwargs)
  File "C:\Users\rajes\anaconda3\lib\site-packages\coleo\cli.py", line 628, in run_cli
    return call(opts=opts, args=args)
  File "C:\Users\rajes\anaconda3\lib\site-packages\coleo\cli.py", line 587, in thunk
    result = fn(*args)
  File "C:\Users\rajes\anaconda3\lib\site-packages\milatools\cli\commands.py", line 285, in code
    remote = Remote("mila")
  File "C:\Users\rajes\anaconda3\lib\site-packages\milatools\cli\remote.py", line 84, in __init__
    connection.open()
  File "C:\Users\rajes\anaconda3\lib\site-packages\fabric\connection.py", line 636, in open
    self.client.connect(**kwargs)
  File "C:\Users\rajes\anaconda3\lib\site-packages\paramiko\client.py", line 435, in connect
    self._auth(
  File "C:\Users\rajes\anaconda3\lib\site-packages\paramiko\client.py", line 766, in _auth
    raise saved_exception
  File "C:\Users\rajes\anaconda3\lib\site-packages\paramiko\client.py", line 736, in _auth
    key = self._key_from_filepath(
  File "C:\Users\rajes\anaconda3\lib\site-packages\paramiko\client.py", line 588, in _key_from_filepath
    key = klass.from_private_key_file(key_path, password)
  File "C:\Users\rajes\anaconda3\lib\site-packages\paramiko\pkey.py", line 242, in from_private_key_file
    key = cls(filename=filename, password=password)
  File "C:\Users\rajes\anaconda3\lib\site-packages\paramiko\ed25519key.py", line 63, in __init__
    signing_key = self._parse_signing_key_data(data, password)
  File "C:\Users\rajes\anaconda3\lib\site-packages\paramiko\ed25519key.py", line 112, in _parse_signing_key_data
    raise SSHException("Invalid key")
paramiko.ssh_exception.SSHException: Invalid key

An error occured during the execution of the command `code`. Please try updating milatools by running
  pip install milatools --upgrade
in the terminal. If the issue persists, consider filling a bug report at
  https://github.com/mila-iqia/milatools/issues/new?labels=code%2C0.0.18&template=bug_report.md&title=%5Bv0.0.18%5D+Issue+running+the+command+%60mila+code%60
Please provide the error traceback with the report (the red text above).

Desktop (please complete the following information):

  • OS: Windows (version not specified)

[v0.0.18] Issue running the command `mila code`

What command did you run?

mila code /home/mila/l/le.zhang/scratch/ --node cn-g027

Describe the bug

I can't connect VS Code to a compute node.

Traceback (most recent call last):
  File "/opt/anaconda3/lib/python3.9/site-packages/milatools/cli/commands.py", line 43, in main
    auto_cli(milatools)
  File "/opt/anaconda3/lib/python3.9/site-packages/coleo/cli.py", line 656, in auto_cli
    result = run_cli(entry, args, **kwargs)
  File "/opt/anaconda3/lib/python3.9/site-packages/coleo/cli.py", line 628, in run_cli
    return call(opts=opts, args=args)
  File "/opt/anaconda3/lib/python3.9/site-packages/coleo/cli.py", line 587, in thunk
    result = fn(*args)
  File "/opt/anaconda3/lib/python3.9/site-packages/milatools/cli/commands.py", line 288, in code
    cnode = _find_allocation(remote, job_name="mila-code")
  File "/opt/anaconda3/lib/python3.9/site-packages/milatools/cli/commands.py", line 699, in _find_allocation
    return Remote(node_name)
  File "/opt/anaconda3/lib/python3.9/site-packages/milatools/cli/remote.py", line 84, in __init__
    connection.open()
  File "/opt/anaconda3/lib/python3.9/site-packages/fabric/connection.py", line 636, in open
    self.client.connect(**kwargs)
  File "/opt/anaconda3/lib/python3.9/site-packages/paramiko/client.py", line 435, in connect
    self._auth(
  File "/opt/anaconda3/lib/python3.9/site-packages/paramiko/client.py", line 766, in _auth
    raise saved_exception
  File "/opt/anaconda3/lib/python3.9/site-packages/paramiko/client.py", line 742, in _auth
    self._transport.auth_publickey(username, key)
  File "/opt/anaconda3/lib/python3.9/site-packages/paramiko/transport.py", line 1635, in auth_publickey
    return self.auth_handler.wait_for_response(my_event)
  File "/opt/anaconda3/lib/python3.9/site-packages/paramiko/auth_handler.py", line 259, in wait_for_response
    raise e
paramiko.ssh_exception.AuthenticationException: Authentication failed.

Desktop (please complete the following information):

  • OS: macOS 14.0 (23A344) M1 Pro

Impossible to "ssh-copy-id mila" in the "mila init" process on Windows 11

What command did you run?

mila init
Then I entered my Mila cluster username when prompted.
Then I answered the question

+Host mila
+  HostName login.server.mila.quebec
+  User basile.terver
+  PreferredAuthentications publickey,keyboard-interactive
+  Port 2222
+  ServerAliveInterval 120
+  ServerAliveCountMax 5
+
+
+Host mila-cpu
+  User basile.terver
+  Port 2222
+  ForwardAgent yes
+  StrictHostKeyChecking no
+  LogLevel ERROR
+  UserKnownHostsFile /dev/null
+  RequestTTY force
+  ConnectTimeout 600
+  ServerAliveInterval 120
+  ProxyCommand ssh mila "/cvmfs/config.mila.quebec/scripts/milatools/slurm-proxy.sh mila-cpu --mem=8G"
+  RemoteCommand /cvmfs/config.mila.quebec/scripts/milatools/entrypoint.sh mila-cpu
+
+
+Host *.server.mila.quebec !*login.server.mila.quebec
+  HostName %h
+  User basile.terver
+  ProxyJump mila
+
? Is this OK?

Then I answered yes to the question
? You have no public keys. Generate one?
Then I answered yes to the question
? Your public key does not appear be registered on the cluster. Register it? Yes
Then I got the error below.

Describe the bug

During the "mila init" process on my Windows PC, I get stuck at this point, even though I upgraded the milatools package.

Traceback (most recent call last):
  File "C:\Users\terve\anaconda3\envs\mila\Lib\site-packages\milatools\cli\commands.py", line 43, in main
    auto_cli(milatools)
  File "C:\Users\terve\anaconda3\envs\mila\Lib\site-packages\coleo\cli.py", line 656, in auto_cli
    result = run_cli(entry, args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\terve\anaconda3\envs\mila\Lib\site-packages\coleo\cli.py", line 628, in run_cli
    return call(opts=opts, args=args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\terve\anaconda3\envs\mila\Lib\site-packages\coleo\cli.py", line 587, in thunk
    result = fn(*args)
             ^^^^^^^^^
  File "C:\Users\terve\anaconda3\envs\mila\Lib\site-packages\milatools\cli\commands.py", line 161, in init
    here.run("ssh-copy-id", "mila")
  File "C:\Users\terve\anaconda3\envs\mila\Lib\site-packages\milatools\cli\local.py", line 28, in run
    return subprocess.run(
           ^^^^^^^^^^^^^^^
  File "C:\Users\terve\anaconda3\envs\mila\Lib\subprocess.py", line 548, in run
    with Popen(*popenargs, **kwargs) as process:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\terve\anaconda3\envs\mila\Lib\subprocess.py", line 1026, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "C:\Users\terve\anaconda3\envs\mila\Lib\subprocess.py", line 1538, in _execute_child
    hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [WinError 2] Le fichier spécifié est introuvable (The system cannot find the file specified)
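
The `[WinError 2]` at the end of this traceback most likely means that Windows could not find an `ssh-copy-id` executable to launch. As a rough sketch of how a caller could detect this before spawning the process (this is not the milatools implementation, and `missing_programs` is a hypothetical helper name), the standard library's `shutil.which` can be used to look each program up on the `PATH`:

```python
import shutil
from typing import Iterable, List


def missing_programs(programs: Iterable[str]) -> List[str]:
    """Return the names that cannot be resolved to an executable on PATH."""
    return [name for name in programs if shutil.which(name) is None]


if __name__ == "__main__":
    # The program names checked here are taken from the report above.
    missing = missing_programs(["ssh", "ssh-copy-id"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All required programs are available.")
```

Checking up front lets a tool report which program is absent instead of surfacing a bare `FileNotFoundError` from `subprocess`.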

Screenshots

![image (2)](https://github.com/mila-iqia/milatools/assets/146956184/70576fd4-74a5-4107-be86-3f9b37394750)
![image (1)](https://github.com/mila-iqia/milatools/assets/146956184/cf74d81f-f8c9-42d2-80b6-7d391b383d78)
![image](https://github.com/mila-iqia/milatools/assets/146956184/cd9b4ec8-0ba8-4ceb-bf04-35a7887311fc)

Desktop (please complete the following information):

  • OS: Windows 11

Additional context

I started following Victor Schmidt's guide (https://vsch.notion.site/YAMSS-5471da23464e41d4bad5e3517d273dea#0742904b6e384aba94b29a24b69e7b0e) on my WSL machine. But I decided to try milatools because I am on Windows, which is a problem if I want to open VS Code on a compute node following Victor's guide.

Trouble installing Jupyter notebook

When I run the command `mila serve notebook` on my local computer, it asks me whether I want to pip install jupyter or install it myself. When I choose "pip install jupyter", I get this long output with error messages:

(experiments) compstaffs-MacBook-Pro:~ mliu$ mila serve notebook
Checking for preferred profile in /home/mila/l/liumiche/.milatools-profile
Using profile: .milatools/profiles/cpdagGFN.bash
==================================================
module load python/3.8 cuda/11.2/cudnn/8.1
source ~/virtualenvs/cpdagGFN/bin/activate
==================================================
? jupyter-notebook is not installed in that environment. Do you want to install it? pip install jupyter
(mila) $ srun pip install jupyter
[=== Module python/3.8 loaded ===]
[=== Module cudatoolkit/11.2 loaded ===]
[=== Module cuda/11.2/cudnn/8.1 loaded ===]
srun: --------------------------------------------------------------------------------------------------
srun: # Using default long-cpu partition (CPU-only)
srun: --------------------------------------------------------------------------------------------------
Collecting jupyter
  Using cached https://files.pythonhosted.org/packages/83/df/0f5dd132200728a86190397e1ea87cd76244e42d39ec5e88efd25b2abd7e/jupyter-1.0.0-py2.py3-none-any.whl
Collecting nbconvert (from jupyter)
  Using cached https://files.pythonhosted.org/packages/37/33/0d339e81b7c6b77020059dd2268c16c4ba63473bdd984991f91ac7ad7ef7/nbconvert-7.2.5-py3-none-any.whl
Collecting jupyter-console (from jupyter)
  Using cached https://files.pythonhosted.org/packages/8b/0c/f9382ca7b7499c8594a5158817a72c95b4c09a6c6f2de10553bfe8905924/jupyter_console-6.4.4-py3-none-any.whl
Collecting notebook (from jupyter)
  Using cached https://files.pythonhosted.org/packages/db/40/2d321ba572dc9a94a090d92c9826291a1dcee1e05bc6c1d641ce419b701d/notebook-6.5.2-py3-none-any.whl
Collecting qtconsole (from jupyter)
  Using cached https://files.pythonhosted.org/packages/cc/00/4133199dc738e7f497385af86e619f5c29592aaa4c1731fbbc3ec7bb7080/qtconsole-5.4.0-py3-none-any.whl
Collecting ipykernel (from jupyter)
  Using cached https://files.pythonhosted.org/packages/c5/91/3740ea00e8bb766742937a781bce18b6d344bfd99c48e5ea89b4681ef089/ipykernel-6.18.0-py3-none-any.whl
Collecting ipywidgets (from jupyter)
  Using cached https://files.pythonhosted.org/packages/e4/56/990c10ca8751182ace2464cb0e4baafb7087a40c185c9142b9cd18683fac/ipywidgets-8.0.2-py3-none-any.whl
Collecting traitlets>=5.0 (from nbconvert->jupyter)
  Using cached https://files.pythonhosted.org/packages/ed/f9/caefd8c90955184e7426ef930e38c185e047169b520b35bdd57d341d03f4/traitlets-5.5.0-py3-none-any.whl
Collecting packaging (from nbconvert->jupyter)
  Using cached https://files.pythonhosted.org/packages/05/8e/8de486cbd03baba4deef4142bd643a3e7bbe954a784dc1bb17142572d127/packaging-21.3-py3-none-any.whl
Collecting jinja2>=3.0 (from nbconvert->jupyter)
  Using cached https://files.pythonhosted.org/packages/bc/c3/f068337a370801f372f2f8f6bad74a5c140f6fda3d9de154052708dd3c65/Jinja2-3.1.2-py3-none-any.whl
Collecting defusedxml (from nbconvert->jupyter)
  Using cached https://files.pythonhosted.org/packages/07/6c/aa3f2f849e01cb6a001cd8554a88d4c77c5c1a31c95bdf1cf9301e6d9ef4/defusedxml-0.7.1-py2.py3-none-any.whl
Collecting pandocfilters>=1.4.1 (from nbconvert->jupyter)
  Using cached https://files.pythonhosted.org/packages/5e/a8/878258cffd53202a6cc1903c226cf09e58ae3df6b09f8ddfa98033286637/pandocfilters-1.5.0-py2.py3-none-any.whl
Collecting beautifulsoup4 (from nbconvert->jupyter)
  Using cached https://files.pythonhosted.org/packages/9c/d8/909c4089dbe4ade9f9705f143c9f13f065049a9d5e7d34c828aefdd0a97c/beautifulsoup4-4.11.1-py3-none-any.whl
Collecting mistune<3,>=2.0.3 (from nbconvert->jupyter)
  Using cached https://files.pythonhosted.org/packages/3a/c7/b0a4413a4d9b7a4fda0d710fd90dba62375f0d0c4544e848dc7656757c0c/mistune-2.0.4-py2.py3-none-any.whl
Collecting markupsafe>=2.0 (from nbconvert->jupyter)
  Using cached https://files.pythonhosted.org/packages/1d/97/2288fe498044284f39ab8950703e88abbac2abbdf65524d576157af70556/MarkupSafe-2.1.1.tar.gz
Collecting jupyterlab-pygments (from nbconvert->jupyter)
  Using cached https://files.pythonhosted.org/packages/c0/7e/c3d1df3ae9b41686e664051daedbd70eea2e1d2bd9d9c33e7e1455bc9f96/jupyterlab_pygments-0.2.2-py2.py3-none-any.whl
Collecting tinycss2 (from nbconvert->jupyter)
  Using cached https://files.pythonhosted.org/packages/da/99/fd23634d6962c2791fb8cb6ccae1f05dcbfc39bce36bba8b1c9a8d92eae8/tinycss2-1.2.1-py3-none-any.whl
Collecting nbclient>=0.5.0 (from nbconvert->jupyter)
  Using cached https://files.pythonhosted.org/packages/ed/aa/d00b9bdd4623a5e4500baee7f4a37b851dcbb2fa6d90f621d367d1d93420/nbclient-0.7.0-py3-none-any.whl
Collecting nbformat>=5.1 (from nbconvert->jupyter)
  Using cached https://files.pythonhosted.org/packages/5c/9f/957655d02f43b8bff77e6da08c94472b1229c13e7455bbd662163c9b78c0/nbformat-5.7.0-py3-none-any.whl
Collecting pygments>=2.4.1 (from nbconvert->jupyter)
  Using cached https://files.pythonhosted.org/packages/4f/82/672cd382e5b39ab1cd422a672382f08a1fb3d08d9e0c0f3707f33a52063b/Pygments-2.13.0-py3-none-any.whl
Collecting jupyter-core>=4.7 (from nbconvert->jupyter)
  Using cached https://files.pythonhosted.org/packages/a3/60/63e9e1e18ae427cd925eb970dd452a5b322060207e7f5243ca4620ee0507/jupyter_core-5.0.0-py3-none-any.whl
Collecting importlib-metadata>=3.6; python_version < "3.10" (from nbconvert->jupyter)
  Using cached https://files.pythonhosted.org/packages/b5/64/ef29a63cf08f047bb7fb22ab0f1f774b87eed0bb46d067a5a524798a4af8/importlib_metadata-5.0.0-py3-none-any.whl
Collecting bleach (from nbconvert->jupyter)
  Using cached https://files.pythonhosted.org/packages/d4/87/508104336a2bc0c4cfdbdceedc0f44dc72da3abc0460c57e323ddd1b3257/bleach-5.0.1-py3-none-any.whl
Collecting ipython (from jupyter-console->jupyter)
  Using cached https://files.pythonhosted.org/packages/c7/53/072d677a16fd61f5806d80218c65202cc0ee77b831088af8f79ef59efcf2/ipython-8.6.0-py3-none-any.whl
Collecting jupyter-client>=7.0.0 (from jupyter-console->jupyter)
  Using cached https://files.pythonhosted.org/packages/4b/b4/df6186d9d1c7e8d943febb8e1a17aedc031ab374924fd19193f9efb0fbb2/jupyter_client-7.4.7-py3-none-any.whl
Collecting prompt-toolkit!=3.0.0,!=3.0.1,<3.1.0,>=2.0.0 (from jupyter-console->jupyter)
  Using cached https://files.pythonhosted.org/packages/58/87/cac418cef18781a9081cb2075cc2cf08c77e0679c1f9b474587d71bbf777/prompt_toolkit-3.0.33-py3-none-any.whl
Collecting nest-asyncio>=1.5 (from notebook->jupyter)
  Using cached https://files.pythonhosted.org/packages/e9/1a/6dd9ec31cfdb34cef8fea0055b593ee779a6f63c8e8038ad90d71b7f53c0/nest_asyncio-1.5.6-py3-none-any.whl
Collecting argon2-cffi (from notebook->jupyter)
  Using cached https://files.pythonhosted.org/packages/a8/07/946d5a9431bae05a776a59746ec385fbb79b526738d25e4202d3e0bbf7f4/argon2_cffi-21.3.0-py3-none-any.whl
Collecting nbclassic>=0.4.7 (from notebook->jupyter)
  Using cached https://files.pythonhosted.org/packages/a6/85/2a240df7326b7311ebd926c12d7df5394aef2f9f76ffbb294079cc43960e/nbclassic-0.4.8-py3-none-any.whl
Collecting pyzmq>=17 (from notebook->jupyter)
  Using cached https://files.pythonhosted.org/packages/8e/dd/17a933548e39ac753836d60ebb3de5c5264cb4e2d2c8b88436fcd0262515/pyzmq-24.0.1-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Collecting Send2Trash>=1.8.0 (from notebook->jupyter)
  Using cached https://files.pythonhosted.org/packages/47/26/3435896d757335ea53dce5abf8d658ca80757a7a06258451b358f10232be/Send2Trash-1.8.0-py3-none-any.whl
Collecting tornado>=6.1 (from notebook->jupyter)
  Using cached https://files.pythonhosted.org/packages/19/bb/b6c3d1668d2b10ad38a584f3a1ec9737984e274f8b708e09fcbb96427f5c/tornado-6.2-cp37-abi3-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Collecting terminado>=0.8.3 (from notebook->jupyter)
  Using cached https://files.pythonhosted.org/packages/4c/27/3ddec4ed8f8312d9c4774ae0a62469d29637176df8a5a6321070aa0edc97/terminado-0.17.0-py3-none-any.whl
Collecting prometheus-client (from notebook->jupyter)
  Using cached https://files.pythonhosted.org/packages/2e/5e/4225463cdac1098aac718b1d8adf8f9dc3d6aaea55f4f85a2f7d572b4f7c/prometheus_client-0.15.0-py3-none-any.whl
Collecting ipython-genutils (from notebook->jupyter)
  Using cached https://files.pythonhosted.org/packages/fa/bc/9bd3b5c2b4774d5f33b2d544f1460be9df7df2fe42f352135381c347c69a/ipython_genutils-0.2.0-py2.py3-none-any.whl
Collecting qtpy>=2.0.1 (from qtconsole->jupyter)
  Using cached https://files.pythonhosted.org/packages/ca/56/3dfbcf8a6808d2b3566b75759c48a281bcdc2b9547760e5d044e6ec7e33b/QtPy-2.3.0-py3-none-any.whl
Collecting psutil (from ipykernel->jupyter)
  Using cached https://files.pythonhosted.org/packages/6e/c8/784968329c1c67c28cce91991ef9af8a8913aa5a3399a6a8954b1380572f/psutil-5.9.4-cp36-abi3-manylinux_2_12_x86_64.manylinux2010_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Collecting matplotlib-inline>=0.1 (from ipykernel->jupyter)
  Using cached https://files.pythonhosted.org/packages/f2/51/c34d7a1d528efaae3d8ddb18ef45a41f284eacf9e514523b191b7d0872cc/matplotlib_inline-0.1.6-py3-none-any.whl
Collecting debugpy>=1.0 (from ipykernel->jupyter)
  Using cached https://files.pythonhosted.org/packages/d2/e3/d0531ee73216d553d717bf4ac51dff297f89054619fa69db61eef028a07f/debugpy-1.6.3-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Collecting comm>=0.1 (from ipykernel->jupyter)
  Using cached https://files.pythonhosted.org/packages/d5/f5/3a1e7504dfa46c7faf765d882d8997d771a551ef25c7b4a9fe8c61f9e3ad/comm-0.1.0-py2.py3-none-any.whl
Collecting widgetsnbextension~=4.0 (from ipywidgets->jupyter)
  Using cached https://files.pythonhosted.org/packages/d7/ae/ee70b20dc836d935a9a6483339854c09d8752e55a8104668e2426cf3baf3/widgetsnbextension-4.0.3-py3-none-any.whl
Collecting jupyterlab-widgets~=3.0 (from ipywidgets->jupyter)
  Using cached https://files.pythonhosted.org/packages/d8/52/2f4b8f5975312fb58f4eacab2e6f6cfd2efd05704514a60a151a4e69d608/jupyterlab_widgets-3.0.3-py3-none-any.whl
Collecting pyparsing!=3.0.5,>=2.0.2 (from packaging->nbconvert->jupyter)
  Using cached https://files.pythonhosted.org/packages/6c/10/a7d0fa5baea8fe7b50f448ab742f26f52b80bfca85ac2be9d35cdd9a3246/pyparsing-3.0.9-py3-none-any.whl
Collecting soupsieve>1.2 (from beautifulsoup4->nbconvert->jupyter)
  Downloading https://files.pythonhosted.org/packages/16/e3/4ad79882b92617e3a4a0df1960d6bce08edfb637737ac5c3f3ba29022e25/soupsieve-2.3.2.post1-py3-none-any.whl
Collecting webencodings>=0.4 (from tinycss2->nbconvert->jupyter)
  Downloading https://files.pythonhosted.org/packages/f4/24/2a3e3df732393fed8b3ebf2ec078f05546de641fe1b667ee316ec1dcf3b7/webencodings-0.5.1-py2.py3-none-any.whl
Collecting fastjsonschema (from nbformat>=5.1->nbconvert->jupyter)
  Using cached https://files.pythonhosted.org/packages/e4/be/cf1b876348070a23cb0c3ebfee7a452ad3a91b07b456dade3bd514656009/fastjsonschema-2.16.2-py3-none-any.whl
Collecting jsonschema>=2.6 (from nbformat>=5.1->nbconvert->jupyter)
  Using cached https://files.pythonhosted.org/packages/f2/c5/8e4cdbcbf81c5003a88722c34009bbda692d495dbccc2bf23edf9402d83d/jsonschema-4.17.0-py3-none-any.whl
Collecting platformdirs (from jupyter-core>=4.7->nbconvert->jupyter)
  Using cached https://files.pythonhosted.org/packages/61/e0/15ba41c6716acb033c3793be3a02f26c53914ecd9bdd6b315001f8f5f581/platformdirs-2.5.4-py3-none-any.whl
Collecting zipp>=0.5 (from importlib-metadata>=3.6; python_version < "3.10"->nbconvert->jupyter)
  Downloading https://files.pythonhosted.org/packages/40/8a/d63273ed0fa4a3d06f77e7b043f6577d8894e95515b0c187c52e2c0efabb/zipp-3.10.0-py3-none-any.whl
Requirement already satisfied: six>=1.9.0 in /cvmfs/ai.mila.quebec/apps/x86_64/debian/python/3.8/lib/python3.8/site-packages (from bleach->nbconvert->jupyter) (1.14.0)
Collecting decorator (from ipython->jupyter-console->jupyter)
  Using cached https://files.pythonhosted.org/packages/d5/50/83c593b07763e1161326b3b8c6686f0f4b0f24d5526546bee538c89837d6/decorator-5.1.1-py3-none-any.whl
Collecting stack-data (from ipython->jupyter-console->jupyter)
  Using cached https://files.pythonhosted.org/packages/e7/f1/a1f2fd4a75d371412650b3ddc16741e0de383fe701953566c9288f678a5b/stack_data-0.6.1-py3-none-any.whl
Collecting jedi>=0.16 (from ipython->jupyter-console->jupyter)
  Using cached https://files.pythonhosted.org/packages/b3/0e/836f12ec50075161e365131f13f5758451645af75c2becf61c6351ecec39/jedi-0.18.1-py2.py3-none-any.whl
Collecting pickleshare (from ipython->jupyter-console->jupyter)
  Using cached https://files.pythonhosted.org/packages/9a/41/220f49aaea88bc6fa6cba8d05ecf24676326156c23b991e80b3f2fc24c77/pickleshare-0.7.5-py2.py3-none-any.whl
Collecting pexpect>4.3; sys_platform != "win32" (from ipython->jupyter-console->jupyter)
  Using cached https://files.pythonhosted.org/packages/39/7b/88dbb785881c28a102619d46423cb853b46dbccc70d3ac362d99773a78ce/pexpect-4.8.0-py2.py3-none-any.whl
Collecting backcall (from ipython->jupyter-console->jupyter)
  Using cached https://files.pythonhosted.org/packages/4c/1c/ff6546b6c12603d8dd1070aa3c3d273ad4c07f5771689a7b69a550e8c951/backcall-0.2.0-py2.py3-none-any.whl
Collecting entrypoints (from jupyter-client>=7.0.0->jupyter-console->jupyter)
  Using cached https://files.pythonhosted.org/packages/35/a8/365059bbcd4572cbc41de17fd5b682be5868b218c3c5479071865cab9078/entrypoints-0.4-py3-none-any.whl
Collecting python-dateutil>=2.8.2 (from jupyter-client>=7.0.0->jupyter-console->jupyter)
  Using cached https://files.pythonhosted.org/packages/36/7a/87837f39d0296e723bb9b62bbb257d0355c7f6128853c78955f57342a56d/python_dateutil-2.8.2-py2.py3-none-any.whl
Collecting wcwidth (from prompt-toolkit!=3.0.0,!=3.0.1,<3.1.0,>=2.0.0->jupyter-console->jupyter)
  Downloading https://files.pythonhosted.org/packages/59/7c/e39aca596badaf1b78e8f547c807b04dae603a433d3e7a7e04d67f2ef3e5/wcwidth-0.2.5-py2.py3-none-any.whl
Collecting argon2-cffi-bindings (from argon2-cffi->notebook->jupyter)
  Using cached https://files.pythonhosted.org/packages/b9/e9/184b8ccce6683b0aa2fbb7ba5683ea4b9c5763f1356347f1312c32e3c66e/argon2-cffi-bindings-21.2.0.tar.gz
  Installing build dependencies: started
  Installing build dependencies: finished with status 'error'
  ERROR: Command errored out with exit status 1:
   command: /home/mila/l/liumiche/virtualenvs/cpdagGFN/bin/python /home/mila/l/liumiche/virtualenvs/cpdagGFN/lib/python3.8/site-packages/pip install --ignore-installed --no-user --prefix /tmp/pip-build-env-4qpqdizx/overlay --no-warn-script-location --no-binary :none: --only-binary :none: -i https://pypi.org/simple -- 'setuptools>=45' 'setuptools_scm>=6.2' wheel 'cffi>=1.0.1'
       cwd: None
  Complete output (85 lines):
  Collecting setuptools>=45
    Using cached https://files.pythonhosted.org/packages/1f/97/c03668380f278f1f8b0486d820c142cf224bba1bd78416e1797b52e0e81c/setuptools-65.6.0-py3-none-any.whl
  Collecting setuptools_scm>=6.2
    Using cached https://files.pythonhosted.org/packages/01/ed/75a20e7b075e8ecb1f84e8debf833917905d8790b78008915bd68dddd5c4/setuptools_scm-7.0.5-py3-none-any.whl
  Collecting wheel
    Using cached https://files.pythonhosted.org/packages/bd/7c/d38a0b30ce22fc26ed7dbc087c6d00851fb3395e9d0dac40bec1f905030c/wheel-0.38.4-py3-none-any.whl
  Collecting cffi>=1.0.1
    Using cached https://files.pythonhosted.org/packages/2b/a8/050ab4f0c3d4c1b8aaa805f70e26e84d0e27004907c5b8ecc1d31815f92a/cffi-1.15.1.tar.gz
  Collecting packaging>=20.0 (from setuptools_scm>=6.2)
    Using cached https://files.pythonhosted.org/packages/05/8e/8de486cbd03baba4deef4142bd643a3e7bbe954a784dc1bb17142572d127/packaging-21.3-py3-none-any.whl
  Collecting tomli>=1.0.0 (from setuptools_scm>=6.2)
    Using cached https://files.pythonhosted.org/packages/97/75/10a9ebee3fd790d20926a90a2547f0bf78f371b2f13aa822c759680ca7b9/tomli-2.0.1-py3-none-any.whl
  Collecting typing-extensions (from setuptools_scm>=6.2)
    Using cached https://files.pythonhosted.org/packages/0b/8e/f1a0a5a76cfef77e1eb6004cb49e5f8d72634da638420b9ea492ce8305e8/typing_extensions-4.4.0-py3-none-any.whl
  Collecting pycparser (from cffi>=1.0.1)
    Using cached https://files.pythonhosted.org/packages/62/d5/5f610ebe421e85889f2e55e33b7f9a6795bd982198517d912eb1c76e1a53/pycparser-2.21-py2.py3-none-any.whl
  Collecting pyparsing!=3.0.5,>=2.0.2 (from packaging>=20.0->setuptools_scm>=6.2)
    Using cached https://files.pythonhosted.org/packages/6c/10/a7d0fa5baea8fe7b50f448ab742f26f52b80bfca85ac2be9d35cdd9a3246/pyparsing-3.0.9-py3-none-any.whl
  Installing collected packages: setuptools, pyparsing, packaging, tomli, typing-extensions, setuptools-scm, wheel, pycparser, cffi
    Running setup.py install for cffi: started
      Running setup.py install for cffi: finished with status 'error'
      ERROR: Command errored out with exit status 1:
       command: /home/mila/l/liumiche/virtualenvs/cpdagGFN/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-a5obestd/cffi/setup.py'"'"'; __file__='"'"'/tmp/pip-install-a5obestd/cffi/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-gs9rsn3v/install-record.txt --single-version-externally-managed --prefix /tmp/pip-build-env-4qpqdizx/overlay --compile --install-headers /home/mila/l/liumiche/virtualenvs/cpdagGFN/include/site/python3.8/cffi
           cwd: /tmp/pip-install-a5obestd/cffi/
      Complete output (56 lines):
      Package libffi was not found in the pkg-config search path.
      Perhaps you should add the directory containing `libffi.pc'
      to the PKG_CONFIG_PATH environment variable
      No package 'libffi' found
      Package libffi was not found in the pkg-config search path.
      Perhaps you should add the directory containing `libffi.pc'
      to the PKG_CONFIG_PATH environment variable
      No package 'libffi' found
      Package libffi was not found in the pkg-config search path.
      Perhaps you should add the directory containing `libffi.pc'
      to the PKG_CONFIG_PATH environment variable
      No package 'libffi' found
      Package libffi was not found in the pkg-config search path.
      Perhaps you should add the directory containing `libffi.pc'
      to the PKG_CONFIG_PATH environment variable
      No package 'libffi' found
      Package libffi was not found in the pkg-config search path.
      Perhaps you should add the directory containing `libffi.pc'
      to the PKG_CONFIG_PATH environment variable
      No package 'libffi' found
      running install
      running build
      running build_py
      creating build
      creating build/lib.linux-x86_64-3.8
      creating build/lib.linux-x86_64-3.8/cffi
      copying cffi/lock.py -> build/lib.linux-x86_64-3.8/cffi
      copying cffi/error.py -> build/lib.linux-x86_64-3.8/cffi
      copying cffi/vengine_cpy.py -> build/lib.linux-x86_64-3.8/cffi
      copying cffi/recompiler.py -> build/lib.linux-x86_64-3.8/cffi
      copying cffi/model.py -> build/lib.linux-x86_64-3.8/cffi
      copying cffi/vengine_gen.py -> build/lib.linux-x86_64-3.8/cffi
      copying cffi/cparser.py -> build/lib.linux-x86_64-3.8/cffi
      copying cffi/cffi_opcode.py -> build/lib.linux-x86_64-3.8/cffi
      copying cffi/ffiplatform.py -> build/lib.linux-x86_64-3.8/cffi
      copying cffi/commontypes.py -> build/lib.linux-x86_64-3.8/cffi
      copying cffi/api.py -> build/lib.linux-x86_64-3.8/cffi
      copying cffi/setuptools_ext.py -> build/lib.linux-x86_64-3.8/cffi
      copying cffi/pkgconfig.py -> build/lib.linux-x86_64-3.8/cffi
      copying cffi/backend_ctypes.py -> build/lib.linux-x86_64-3.8/cffi
      copying cffi/verifier.py -> build/lib.linux-x86_64-3.8/cffi
      copying cffi/__init__.py -> build/lib.linux-x86_64-3.8/cffi
      copying cffi/_cffi_include.h -> build/lib.linux-x86_64-3.8/cffi
      copying cffi/parse_c_type.h -> build/lib.linux-x86_64-3.8/cffi
      copying cffi/_embedding.h -> build/lib.linux-x86_64-3.8/cffi
      copying cffi/_cffi_errors.h -> build/lib.linux-x86_64-3.8/cffi
      running build_ext
      building '_cffi_backend' extension
      creating build/temp.linux-x86_64-3.8
      creating build/temp.linux-x86_64-3.8/c
      gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -DFFI_BUILDING=1 -DUSE__THREAD -DHAVE_SYNC_SYNCHRONIZE -I/usr/include/ffi -I/usr/include/libffi -I/home/mila/l/liumiche/virtualenvs/cpdagGFN/include -I/cvmfs/ai.mila.quebec/apps/arch/distro/python/3.8/include/python3.8 -c c/_cffi_backend.c -o build/temp.linux-x86_64-3.8/c/_cffi_backend.o
      c/_cffi_backend.c:15:10: fatal error: ffi.h: No such file or directory
       #include <ffi.h>
                ^~~~~~~
      compilation terminated.
      error: command 'gcc' failed with exit status 1
      ----------------------------------------
  ERROR: Command errored out with exit status 1: /home/mila/l/liumiche/virtualenvs/cpdagGFN/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-a5obestd/cffi/setup.py'"'"'; __file__='"'"'/tmp/pip-install-a5obestd/cffi/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-gs9rsn3v/install-record.txt --single-version-externally-managed --prefix /tmp/pip-build-env-4qpqdizx/overlay --compile --install-headers /home/mila/l/liumiche/virtualenvs/cpdagGFN/include/site/python3.8/cffi Check the logs for full command output.
  WARNING: You are using pip version 19.2.3, however version 22.3.1 is available.
  You should consider upgrading via the 'pip install --upgrade pip' command.
  ----------------------------------------
ERROR: Command errored out with exit status 1: /home/mila/l/liumiche/virtualenvs/cpdagGFN/bin/python /home/mila/l/liumiche/virtualenvs/cpdagGFN/lib/python3.8/site-packages/pip install --ignore-installed --no-user --prefix /tmp/pip-build-env-4qpqdizx/overlay --no-warn-script-location --no-binary :none: --only-binary :none: -i https://pypi.org/simple -- 'setuptools>=45' 'setuptools_scm>=6.2' wheel 'cffi>=1.0.1' Check the logs for full command output.
WARNING: You are using pip version 19.2.3, however version 22.3.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
srun: error: cn-f001: task 0: Exited with exit code 1
srun: launch/slurm: _step_signal: Terminating StepId=2492788.0
Traceback (most recent call last):
  File "/anaconda3/envs/experiments/bin/mila", line 8, in <module>
    sys.exit(main())
  File "/anaconda3/envs/experiments/lib/python3.10/site-packages/milatools/cli/__main__.py", line 39, in main
    auto_cli(milatools)
  File "/anaconda3/envs/experiments/lib/python3.10/site-packages/coleo/cli.py", line 656, in auto_cli
    result = run_cli(entry, args, **kwargs)
  File "/anaconda3/envs/experiments/lib/python3.10/site-packages/coleo/cli.py", line 628, in run_cli
    return call(opts=opts, args=args)
  File "/anaconda3/envs/experiments/lib/python3.10/site-packages/coleo/cli.py", line 587, in thunk
    result = fn(*args)
  File "/anaconda3/envs/experiments/lib/python3.10/site-packages/milatools/cli/__main__.py", line 447, in notebook
    _standard_server(
  File "/anaconda3/envs/experiments/lib/python3.10/site-packages/milatools/cli/__main__.py", line 584, in _standard_server
    if not ensure_program(
  File "/anaconda3/envs/experiments/lib/python3.10/site-packages/milatools/cli/profile.py", line 307, in ensure_program
    remote.run(f"srun {install}")
  File "/anaconda3/envs/experiments/lib/python3.10/site-packages/milatools/cli/utils.py", line 259, in run
    return self._run(cmd, hide=hide, **kwargs)
  File "/anaconda3/envs/experiments/lib/python3.10/site-packages/milatools/cli/utils.py", line 243, in _run
    return self.connection.run(cmd, **kwargs)
  File "<decorator-gen-3>", line 2, in run
  File "/anaconda3/envs/experiments/lib/python3.10/site-packages/fabric/connection.py", line 30, in opens
    return method(self, *args, **kwargs)
  File "/anaconda3/envs/experiments/lib/python3.10/site-packages/fabric/connection.py", line 725, in run
    return self._run(self._remote_runner(), command, **kwargs)
  File "/anaconda3/envs/experiments/lib/python3.10/site-packages/invoke/context.py", line 102, in _run
    return runner.run(command, **kwargs)
  File "/anaconda3/envs/experiments/lib/python3.10/site-packages/fabric/runners.py", line 72, in run
    return super(Remote, self).run(command, **kwargs)
  File "/anaconda3/envs/experiments/lib/python3.10/site-packages/invoke/runners.py", line 380, in run
    return self._run_body(command, **kwargs)
  File "/anaconda3/envs/experiments/lib/python3.10/site-packages/invoke/runners.py", line 442, in _run_body
    return self.make_promise() if self._asynchronous else self._finish()
  File "/anaconda3/envs/experiments/lib/python3.10/site-packages/invoke/runners.py", line 509, in _finish
    raise UnexpectedExit(result)
invoke.exceptions.UnexpectedExit: Encountered a bad command exit code!

Command: 'source .milatools/profiles/cpdagGFN.bash && srun pip install jupyter'

Exit code: 1

Stdout: already printed

Stderr: already printed


(experiments) compstaffs-MacBook-Pro:~ mliu$ mila serve notebook

Issue running mila code

Make sure you can reproduce the issue with the latest version available

pip install milatools --upgrade
[milatools command e.g. mila code ...]

What command did you run?

[e.g. mila code ...]

Describe the bug

When I run mila code /path/to/remote_dir, everything seems to work, but VSCode never launches and I get the traceback shown in the screenshot.

See screenshot.

Screenshots

If applicable, add screenshots to help explain your problem.

Desktop (please complete the following information):

  • OS: macOS Ventura 13.4

Additional context

Add any other context about the problem here.
image

Feature request: support VSCode Insiders edition

I'd like to open an SSH connection to a compute node in VSCode Insiders rather than VSCode. I think this is as simple as using the CLI command code-insiders instead of code. From what I can tell, there isn't currently a way to do this, as running

mila code $PATH_ON_CLUSTER

will salloc a node $NODE_NAME, and try to connect with it using the command code -nw --remote ssh-remote+$NODE_NAME $PATH_ON_CLUSTER. I'd like to have the option to use code-insiders ... instead.

I think enabling this is as simple as changing "code" to "code-insiders" in milatools/cli/__main__.py (line 277):

here.run(
    "code",
    "-nw",
    "--remote",
    f"ssh-remote+{node_name}",

But ideally, I'd be able to do this with a different command, like

mila code-insiders $PATH_ON_CLUSTER

Is there a reason why this isn't a good idea?
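
As a rough illustration of the request above, the editor executable could be made a parameter instead of a hard-coded string. This is only a sketch: build_editor_command and its command parameter are hypothetical names, not part of milatools, and the argument list simply mirrors the code -nw --remote ssh-remote+$NODE_NAME $PATH_ON_CLUSTER invocation quoted in this issue.

```python
import shlex


def build_editor_command(node_name, path, command="code"):
    """Build the argument list that opens `path` on `node_name`.

    `command` is a hypothetical parameter: "code" for regular VSCode,
    "code-insiders" for the Insiders edition.
    """
    return [command, "-nw", "--remote", f"ssh-remote+{node_name}", path]


# Default behaviour: same arguments as the command quoted in the issue.
stable = build_editor_command("cn-f001", "path/to/my/experiment")

# Requested behaviour: only the executable name changes.
insiders = build_editor_command(
    "cn-f001", "path/to/my/experiment", command="code-insiders"
)

# Print the command line that would be run locally.
print(shlex.join(insiders))
```

Whether this is exposed as a separate mila code-insiders command or as an option on mila code is a design choice left open by the request; the sketch only shows that the rest of the invocation would not need to change.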

Forwarding fails silently with ssh ControlMaster

In my ssh config I have ControlMaster auto. This causes the ssh -L command in _forward to return immediately, which _forward interprets as the remote server having terminated. As a result, any mila serve command sets up the connections and then immediately breaks them down.

Setting -o ControlMaster=no in the ssh -L command fixes the issue for me.
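
A minimal sketch of that workaround, assuming the forwarding command is assembled as an argument list. build_forward_command, the port numbers, and the node name below are illustrative and are not taken from the milatools source; only the -o ControlMaster=no option and the use of ssh -L come from this report.

```python
def build_forward_command(local_port, remote_host, remote_port, login_host):
    """Build an ssh port-forwarding command that ignores ControlMaster.

    With `ControlMaster auto` in the user's ssh config, the ssh process
    can hand the forwarding over to an existing master connection and
    exit right away. Overriding the option keeps this process alive for
    as long as the forwarding is active.
    """
    return [
        "ssh",
        "-o", "ControlMaster=no",  # override the user's ssh config
        "-N",                      # no remote command, forwarding only
        "-L", f"{local_port}:{remote_host}:{remote_port}",
        login_host,
    ]


# Hypothetical values, for illustration only.
cmd = build_forward_command(8888, "cn-f001", 8888, "mila")
print(" ".join(cmd))
```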

(master) Bug in `mila init`: `'tuple' has no attribute 'pop'`

NOTE: This issue is present on the master branch at commit ed140e5

What command did you run?

$ mila init
Checking ssh config
Traceback (most recent call last):
  File "/home/fabricenormandin/repos/milatools/milatools/cli/commands.py", line 68, in main
    mila()
  File "/home/fabricenormandin/repos/milatools/milatools/cli/commands.py", line 361, in mila
    return function(**args_dict)
  File "/home/fabricenormandin/repos/milatools/milatools/cli/commands.py", line 397, in init
    ssh_config = setup_ssh_config()
  File "/home/fabricenormandin/repos/milatools/milatools/cli/init_command.py", line 89, in setup_ssh_config
    _add_ssh_entry(
  File "/home/fabricenormandin/repos/milatools/milatools/cli/init_command.py", line 395, in _add_ssh_entry
    ssh_config.cfg.set(host, **existing_entry)
  File "/home/fabricenormandin/miniconda3/envs/milatools/lib/python3.10/site-packages/sshconf.py", line 435, in set
    c.set(host, **kwargs)
  File "/home/fabricenormandin/miniconda3/envs/milatools/lib/python3.10/site-packages/sshconf.py", line 237, in set
    value = values.pop()
AttributeError: 'tuple' object has no attribute 'pop'

An error occurred during the execution of the command `init`. Please try updating milatools by running
  pip install milatools --upgrade
in the terminal. If the issue persists, consider filling a bug report at
  https://github.com/mila-iqia/milatools/issues/new?labels=init%2C0.0.18&template=bug_report.md&title=%5Bv0.0.18%5D+Issue+running+the+command+%60mila+init%60
Please provide the error traceback with the report (the red text above).
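
The last frame of the traceback calls .pop() on a value that turns out to be a tuple, and tuples are immutable, so they have no pop method. The sketch below reproduces that error in isolation and shows one possible workaround: converting tuple values to lists before passing an entry along. The helper as_mutable_values and the sample entry are hypothetical, and this is not necessarily how the issue was fixed in milatools or sshconf.

```python
def as_mutable_values(entry):
    """Return a copy of `entry` with every tuple value converted to a list."""
    return {
        key: list(value) if isinstance(value, tuple) else value
        for key, value in entry.items()
    }


# Reproduce the error from the traceback in isolation.
try:
    ("~/.ssh/id_rsa",).pop()
except AttributeError as err:
    message = str(err)

# Illustrative entry: one plain string value and one tuple value.
existing_entry = {"user": "someone", "identityfile": ("~/.ssh/id_rsa",)}
fixed_entry = as_mutable_values(existing_entry)

# A list supports pop(), so code like `values.pop()` no longer fails.
last_value = fixed_entry["identityfile"].pop()
```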
