aiidateam / aiida-code-registry


Registry of simulation codes and computers for easy setup in AiiDA.

Python 92.46% HTML 7.54%

aiida-code-registry's People

Contributors


kjappelbaum ltalirz unkcpz cpignedoli csadorf mbercx mpougin superstar54 geigerj2 xiaoqzhang

aiida-code-registry's Issues

A global list to control which machines are shown in the database

Currently, every machine in the repository is dumped to https://aiidateam.github.io/aiida-code-registry/database.json. We could add a file that controls which machines end up in the database. We could then drop a machine from the database without immediately deleting its directory.
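
A minimal sketch of how such a control file could work, assuming a plain-text allowlist with one machine name per line. The file format, the function name `build_database`, and the payload stored per machine are all hypothetical; the real build script would parse the YAML files instead.

```python
import json
from pathlib import Path


def build_database(computers_dir, allowlist_file):
    """Collect only the machines named in the allowlist file (hypothetical format)."""
    # One machine name per line; blank lines and '#' comments are ignored.
    allowed = set()
    for line in Path(allowlist_file).read_text().splitlines():
        name = line.split("#", 1)[0].strip()
        if name:
            allowed.add(name)

    database = {}
    for entry in sorted(Path(computers_dir).iterdir()):
        if entry.is_dir() and entry.name in allowed:
            # Placeholder payload: a real script would parse the YAML files here.
            database[entry.name] = sorted(p.name for p in entry.iterdir() if p.is_file())
    return database


def dump_database(database, target):
    """Write the filtered database as JSON."""
    Path(target).write_text(json.dumps(database, indent=2))
```

A machine that is left out of the allowlist keeps its directory in the repository but no longer appears in the generated JSON.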

Naming of files inside computer directories

The current file naming scheme

computers/daint-mc/daint-mc.yml
computers/daint-mc/[email protected]

duplicates, in the file names, the computer label that is already encoded in the directory structure.

Alternative without duplication:

computers/daint-mc/computer-setup.yml
computers/daint-mc/cp2k-6.1.yml

Advantages of the current scheme:

When downloading the yaml file (e.g. for local editing), the filename explains what it contains.

Disadvantages of the current scheme:

More tedious to rename a computer
Need basic logic for identifying the computer file inside the directory (alternative: always call it computer.yml or computer-setup.yml)

To me, both of these alternatives seem workable. I like the advantage of having informative filenames, but of course the usefulness of this will depend on how people use the registry (whether they manually download files that then reside on the local fs for some time).
Input welcome!
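
For reference, the "basic logic" needed under the current scheme could look roughly like the sketch below. It assumes the computer file is named after its directory (as in computers/daint-mc/daint-mc.yml) and that every other .yml file in the directory describes a code; the function name is hypothetical.

```python
from pathlib import Path


def split_computer_and_codes(computer_dir):
    """Return (computer_file, code_files) for one computer directory."""
    computer_dir = Path(computer_dir)
    # Current scheme: the computer file repeats the directory name.
    computer_file = computer_dir / f"{computer_dir.name}.yml"
    if not computer_file.is_file():
        raise FileNotFoundError(f"no computer file found in {computer_dir}")
    # Assumption: every other YAML file in the directory is a code.
    code_files = sorted(p for p in computer_dir.glob("*.yml") if p != computer_file)
    return computer_file, code_files
```

With a fixed name such as computer-setup.yml, the lookup would reduce to a constant file name and renaming a computer would only touch the directory.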

👌 IMPROVE: Reorganise the YAML files for different project accounts

I'm not sure if the current directory structure is optimal. For example the one for daint.cscs.ch:

(aiida-dev) mbercx@theospc46:~/envs/aiida-dev/code/aiida-code-registry/daint.cscs.ch$ tree
.
├── default -> ./hybrid
├── hybrid
│   ├── computer-setup.yaml
│   └── cp2k-6.1.yaml
├── hybrid-s1005
│   ├── computer-configure.yaml
│   ├── computer-setup.yaml
│   ├── cp2k-7.1.yaml
│   ├── cp2k-8.1.yaml
│   ├── pp-6.5.yaml
│   ├── projwfc-6.5.yaml
│   └── pw-6.5.yaml
...TRIMMED
└── multicore
    ├── computer-setup.yaml
    ├── cp2k-6.1.yaml
    ├── cp2k-7.1.yaml
    └── cp2k-8.1.yaml

There seems to be a lot of duplication here. At first I wasn't sure what s1005 etc stood for, only to realise later this stood for the project. I now understand the motivation is to add the project to the --account of the prepend_text, but this means that for each project you have to copy all the codes to each of these project directories.

Wouldn't it be better to have one directory for each machine, with multiple computer-setup.yaml files? Of course, you'd have to set the same computer name for each of these to make sure the codes work, and that could lead to conflicts in case someone is running with multiple accounts on the same machine. But I'm not sure how often this is the case.

Final note: You can also specify the account when setting up the calculation job. But, I understand that this can be easily forgotten. ^^

Web site using github pages

Analogously to the AiiDA plugin registry, it would be useful to have an AiiDA Code Registry web site that

Lists the available computer configurations
When clicking on a computer, shows the verdi commands to set up & configure the computer, as well as all the available codes (so that they can simply be copy-pasted)

How to create the web page is open for discussion - a python script like in the case of the aiida registry would certainly work.

For the Github Actions workflow that commits the changes to the gh-pages branch, see https://github.com/aiidateam/aiida-registry/blob/master/.github/workflows/webpage.yml
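
As a rough illustration of what the page generator would need to produce per computer, here is a sketch that builds the copy-pasteable commands. The base URL and the file layout (computer-setup.yaml plus one YAML file per code) are assumptions, and the exact `verdi` subcommands for codes may differ between AiiDA versions.

```python
def verdi_commands(base_url, computer, codes):
    """Build the verdi commands shown on a computer's page (illustrative only)."""
    computer_url = f"{base_url}/{computer}/computer-setup.yaml"
    commands = [f"verdi computer setup --config {computer_url}"]
    for code in codes:
        # Assumption: one YAML file per code, stored next to the computer file.
        commands.append(f"verdi code setup --config {base_url}/{computer}/{code}.yaml")
    return commands


if __name__ == "__main__":
    # Hypothetical base URL, used only to show the output format.
    for command in verdi_commands("https://example.org/registry", "daint.cscs.ch/hybrid", ["cp2k-6.1"]):
        print(command)
```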

Automatic tests

As suggested by @yakutovicha, it would be useful if a GitHub Actions job tried setting up computers and codes in AiiDA, making sure that

all required fields are specified correctly
all values are valid

Of course, this does not test whether the configuration actually works. For this, see #8
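
A lightweight pre-check could run before AiiDA is involved at all. The sketch below works on an already-parsed YAML mapping; the list of required fields is an assumption for illustration and is not taken from the `verdi computer setup` implementation.

```python
# Assumed set of mandatory fields; the authoritative list lives in aiida-core.
REQUIRED_COMPUTER_FIELDS = ("label", "hostname", "transport", "scheduler")


def validate_computer(config):
    """Return a list of human-readable problems found in a computer config."""
    errors = []
    for field in REQUIRED_COMPUTER_FIELDS:
        if field not in config:
            errors.append(f"missing required field: {field}")
        elif not isinstance(config[field], str) or not config[field].strip():
            errors.append(f"invalid value for field: {field}")
    return errors
```

An empty list means the file passed this basic check; actually running the setup in a CI job would still be needed to catch values that AiiDA itself rejects.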

templating mechanism

This issue is just to keep track of the desire to potentially introduce a templating mechanism in the AiiDA code registry in order to allow users to specify things like their own slurm account, etc.

See the description of what would need to be done on the AiiDA side in aiidateam/aiida-core#4680

See also an early attempt at building a separate CLI in #26
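
To make the idea concrete, here is a minimal sketch using Python's standard `string.Template`. It is not the mechanism proposed in either linked issue; the placeholder names `account` and `constraint` and the template text are assumptions.

```python
from string import Template

# Hypothetical prepend_text template with user-specific placeholders.
PREPEND_TEXT = Template("#SBATCH --account=$account\n#SBATCH --constraint=$constraint")


def render_prepend_text(template, **values):
    """Fill in the placeholders; a missing value raises KeyError."""
    return template.substitute(**values)


if __name__ == "__main__":
    print(render_prepend_text(PREPEND_TEXT, account="s1005", constraint="mc"))
```

Using `substitute` rather than `safe_substitute` means a forgotten account fails loudly instead of leaving a literal placeholder in the submission script.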

Add logos

The original CSCS AiiDA lab app had a logo of the machine it was setting up.
I think this is a nice touch and would also be useful on the AiiDA code registry web site.

For the moment, I simply propose to add a logo.png to the computer folder (I'd be ok with duplication for different variants of the same computer).
Eventually, this would be used by the web site & dedicated AiiDA lab applications

Template computer/code setup and new code registry configs

(Added to aiidateam/team-compass#4)

Background

For beginners, and even for experienced AiiDA users, setting up computers and codes is still a tedious task. The interactive mode prompts for every option so that the user can set each one carefully, but it requires going through all of them, even those that are not needed, which is time-consuming when a setup shares most of its options with another code/computer.

AiiDA provides a non-interactive mode that sets up the computer/code from a YAML config file, which lowers the burden for users who need to repeat the setup later. However, it is not clear which options in the YAML file are mandatory, nor which default values will be used, without checking the command's help message or even the source code. Moreover, computer setup is a two-stage process: the user first runs verdi computer setup to set the attributes that are common to the computer and stored in the database, and then runs verdi computer configure to set information that is specific to the user or that may need to be modified after the node is stored in the database.

The computer/code can be set up from a YAML file, and we provide the aiida-code-registry repository to store and share YAML files for public computers and codes. Note that the interactive setup command also accepts the URL of a remote YAML file, so the YAML files can be used without downloading or cloning the aiida-code-registry repo.

Motivation and impact

As a user, I want an easy way to set up a computer/code and start using AiiDA quickly.

As a user, I want to go to a centralized place to share or find the configurations of my settings.

As a user, when I set up the same computer and codes for another AiiDA profile, I want to do so easily, without repeating the whole process.

As an aiida-code-registry developer, I need the config-sharing process to be automated, so that less effort goes into reviewing new computer/code configs added to the repo.

As an aiidalab-widgets-base developer, I want to use an officially supported template system to set up the computer/code for ComputationalResourceWidget, which sets up computers/codes from the "database".

Desired outcome

Set up a computer/code from template-format config files
Registry page to show/share new configuration files

Complexity

The template system needs to be supported by verdi; the DynamicEntryPointCommandGroup could potentially be used to prompt dynamically for the required parameters. However, it is not yet clear how to make this work.

The computer/code registry page needs to be redesigned and well documented in the AiiDA-core docs to encourage users to contribute to it and maintain it.

Further comments (by GP)

@csadorf @unkcpz you are using this for the AiiDAlab right?
Can you please comment here on what is the status and if/what are the missing actions?
E.g.:

is the repo easily findable for a generic AiiDA user, e.g. from the AiiDA-core docs?
are there clear instructions on how to contribute a new code/computer?
are there outstanding technical issues, e.g. how to always ask to customize some aspects of a computer such as account or allocation or project?
are at least the (public, i.e. not private inside a university) computers/codes we use there? E.g. Daint, Eiger, Lumi-C, ...
can we identify a person in charge of maintaining the repo?

I think some of these are in the issues of the repo https://github.com/aiidateam/aiida-code-registry, but probably not all of them. If we find a person in charge, this person can triage a bit the tasks and prioritise them (and possibly distribute them, e.g. organising a "coding day"?)

Name of the repository

While the repository contains both codes and computers, I believe the name "AiiDA Code Registry" reflects the fact that the intention of users of the registry will be to set up codes for their calculations.

That being said, I'm certainly open to alternative suggestions.

account set in daint-mc

I do not know if you want to keep the account set here https://github.com/ltalirz/aiida-code-registry/blob/9a01321033b9d6356fd682a0351b34347fcc72e7/computers/daint-mc/daint-mc.yml#L14

It can be useful as an example but also lead to wrong configurations if people just blindly follow the instructions in the readme.

Large memory nodes at CSCS

I got an OUT_OF_MEMORY error when running a job at CSCS. There are two solutions:

increase the number of nodes
use the large-memory nodes option:
```
#SBATCH --mem=120GB
```

Currently, the second option is not supported.
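
Until the registry supports this, one possible per-job workaround is to pass the directive through the `custom_scheduler_commands` option of the calculation job. The sketch below only builds the options dictionary; the helper name and the default memory value are assumptions, and whether the directive selects large-memory nodes depends on the cluster's scheduler configuration.

```python
def large_memory_options(base_options, memory="120GB"):
    """Return a copy of the job options with a --mem directive appended."""
    options = dict(base_options)
    directive = f"#SBATCH --mem={memory}"
    existing = options.get("custom_scheduler_commands", "")
    # Keep any scheduler commands that were already set.
    options["custom_scheduler_commands"] = f"{existing}\n{directive}".strip()
    return options
```

The resulting dictionary would then be assigned to the `metadata.options` of the process builder when setting up the calculation.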

Naming of top-level computer directories

During a discussion on 2020-04-06 it was suggested to consider using fully qualified domain names as the directory names in computers/.

This has the advantage of being automatically unique, but the disadvantage of introducing another label.
It also means that storing separate variants of computers (e.g. daint-mc, daint-gpu) will require the introduction of an additional nesting level

computers/daint.cscs.ch/mc/...
computers/daint.cscs.ch/gpu/...

Since the domain name is not decided by the registry, but by the supercomputing centre, it is also possible (perhaps unlikely?) that it changes over time... perhaps not a big issue.

While I'm not strongly against using domain names, I feel the simpler solution would be to use the computer label as we do now.
This has the advantage of avoiding clashes of identical labels by design, which I would consider a feature.

Policy for accepting pull requests

Besides passing automatic checks that we set up, for new pull requests, it was suggested that we require

computers: the output of verdi computer test <computername>
codes: output of ls -l /absolute/path/to/executable

We should create a PR template for this

pre-commit fails in the deploy stage with a "Cache already exists" error

This has happened many times, seemingly at random. Each time it could be worked around by triggering the pre-commit action manually.

/opt/hostedtoolcache/Python/3.8.14/x64/bin/pre-commit run --show-diff-on-failure --color=always --all-files
[INFO] Initializing environment for https://github.com/pre-commit/pre-commit-hooks.
[INFO] Initializing environment for https://github.com/adrienverge/yamllint.git.
[INFO] Installing environment for https://github.com/pre-commit/pre-commit-hooks.
[INFO] Once installed this environment will be reused.
[INFO] This may take a few minutes...
[INFO] Installing environment for https://github.com/adrienverge/yamllint.git.
[INFO] Once installed this environment will be reused.
[INFO] This may take a few minutes...
Trim Trailing Whitespace.................................................Passed
Check Yaml...............................................................Passed
yamllint.................................................................Passed
Error: reserveCache failed: Cache already exists. Scope: refs/heads/master, Key: pre-commit-2-62624b5f817e5ad50836c9485ef39ad2400892a1792ea364154de932ecbfdacd-06aa6aa4168aa02577b824514528c2d6f2735558481578aa90a8261809cea0a9, Version: 28cdb9f5496f334116f23e86f0063f5d3a9348c2e22425a33171e071aadada7e

authinfo parameters in the transport configuration also need to be specified

In my case, paratera uses 2200 as the SSH port, and use_login_shell needs to be set to False in the transport configuration. So we should keep not only the computer setup file in the folder, but also the transport configuration file, which is loaded by verdi computer configure ssh --config <url>.
In my opinion, this file should contain only the minimal required parameters and leave the other parameters at their defaults.

I am curious why AiiDA does not set up a computer in a single step, rather than with verdi computer setup first and then verdi computer configure ssh?

Ideas for making it easier to distribute computer/code environments

Distributing a set of pre-configured computers and codes is a problem that needs to be solved in basically all groups who want to start using AiiDA.
Currently, the person in charge needs to write some hand-crafted python scripts (e.g. like this one) in order to automate the computer/code setup, which is tedious and extra code that needs to be maintained.

One way to improve the user experience could be the following:

A subfolder of the AIIDA_PATH is used to store a set of computer/code configurations (following a well-defined schema). All the user needs to do is to drop a few specific configuration files into this folder.
When load_computer or load_code detect that a desired code/computer is not present in the database, they first look inside this folder. If a corresponding configuration is found, the computer/code is set up on the fly (if not, NotExistent error is raised as usual)
A verdi config option could be used to enable/disable this feature
It may be necessary to allow passing of template variables to load_computer / load_code
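
The lookup order described above can be sketched independently of AiiDA. Everything here is hypothetical: `database` stands in for the AiiDA database, `setup` for whatever creates the computer/code from a file, and `NotExistent` is a local stand-in for AiiDA's exception of the same name.

```python
from pathlib import Path


class NotExistent(Exception):
    """Local stand-in for AiiDA's NotExistent exception."""


def load_or_setup(label, database, config_dir, setup, enabled=True):
    """Load an entry by label, setting it up from a config file if needed."""
    if label in database:
        return database[label]
    # Assumed naming convention: one '<label>.yml' file per computer/code.
    config_file = Path(config_dir) / f"{label}.yml"
    if enabled and config_file.is_file():
        database[label] = setup(config_file)
        return database[label]
    raise NotExistent(f"no entry or configuration found for '{label}'")
```

The `enabled` flag plays the role of the proposed verdi config option: when it is off, the function behaves like a plain lookup.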

I'll open a draft pull request against the aiida-code-registry that outlines how this could work.
It's not yet touching AiiDA core (this could be done later, if others agree that this would be a welcome feature).

Pinging @unkcpz for info

Add computer-configure.yml to every computer

I am not sure why this hasn't happened before (was it discussed?), but I think this information would be needed to achieve a fully automated setup. What do you think @ltalirz and @unkcpz?