aiidateam / aiida-code-registry
Registry of simulation codes and computers for easy setup in AiiDA.
Currently, all machines in the repository are dumped to https://aiidateam.github.io/aiida-code-registry/database.json. We could add a file that controls which machines end up in the database. Then we could drop machines from the database without immediately deleting their corresponding directories.
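One way such a control file could work is sketched below. This is only an illustration: the `exclude` key, the helper `select_computers` and the example directory names are assumptions made for this sketch, not part of the registry.

```python
import json


def select_computers(available, control):
    """Return the computer directories that should be dumped to the database.

    `available` is the list of directory names found in the registry;
    `control` is the parsed content of a hypothetical control file that
    carries an `exclude` list. Both names are assumptions of this sketch.
    """
    excluded = set(control.get("exclude", []))
    return sorted(name for name in available if name not in excluded)


# Hypothetical control file content: machines listed here keep their
# directory in the repository but no longer appear in database.json.
control = json.loads('{"exclude": ["retired.example.org"]}')
available = ["daint.cscs.ch", "retired.example.org"]
selected = select_computers(available, control)
print(selected)
```

An exclude list keeps the default behaviour (everything is dumped) unchanged; an allow list would work just as well if new machines should be opt-in.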
The current file naming scheme
computers/daint-mc/daint-mc.yml
computers/daint-mc/[email protected]
obviously duplicates the computer label that is already encoded in the directory structure in the file names.
Alternative without duplication:
computers/daint-mc/computer-setup.yml
computers/daint-mc/cp2k-6.1.yml
Advantages current scheme:
Disadvantages current scheme:
computer.yml or computer-setup.yml)
To me, both of these alternatives seem workable. I like the advantage of having informative filenames, but of course the usefulness of this will depend on how people use the registry (whether they manually download files that then reside on the local fs for some time).
Input welcome!
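With the alternative scheme, the computer label can still be recovered from the directory name whenever it is needed. A minimal sketch, assuming the `<code>@<computer>` form that AiiDA uses for full code labels; the helper `code_label` is hypothetical:

```python
from pathlib import PurePosixPath


def code_label(path):
    """Build a `<code>@<computer>` label from a registry file path.

    The computer label is taken from the parent directory and the code
    label from the file name without its extension.
    """
    file_path = PurePosixPath(path)
    return f"{file_path.stem}@{file_path.parent.name}"


label = code_label("computers/daint-mc/cp2k-6.1.yml")
print(label)
```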
I'm not sure if the current directory structure is optimal. For example the one for daint.cscs.ch:
(aiida-dev) mbercx@theospc46:~/envs/aiida-dev/code/aiida-code-registry/daint.cscs.ch$ tree
.
├── default -> ./hybrid
├── hybrid
│   ├── computer-setup.yaml
│   └── cp2k-6.1.yaml
├── hybrid-s1005
│   ├── computer-configure.yaml
│   ├── computer-setup.yaml
│   ├── cp2k-7.1.yaml
│   ├── cp2k-8.1.yaml
│   ├── pp-6.5.yaml
│   ├── projwfc-6.5.yaml
│   ├── pw-6.5.yaml
...TRIMMED
└── multicore
    ├── computer-setup.yaml
    ├── cp2k-6.1.yaml
    ├── cp2k-7.1.yaml
    └── cp2k-8.1.yaml
There seems to be a lot of duplication here. At first I wasn't sure what s1005 etc. stood for, only to realise later that it stood for the project. I now understand the motivation is to add the project to the --account of the prepend_text, but this means that for each project you have to copy all the codes to each of these project directories.
Wouldn't it be better to have one directory for each machine, with multiple computer-setup.yaml files? Of course, you'd have to set the same computer name for each of these to make sure the codes work, and that could lead to conflicts in case someone is running with multiple accounts on the same machine. But I'm not sure how often this is the case.
Final note: You can also specify the account when setting up the calculation job. But, I understand that this can be easily forgotten. ^^
Analogously to the AiiDA plugin registry, it would be useful to have an AiiDA Code Registry web site that shows the verdi commands to set up & configure the computer, as well as all the available codes (so that they can simply be copy-pasted).
How to create the web page is open for discussion - a Python script like in the case of the AiiDA registry would certainly work.
For the GitHub Actions workflow that commits the changes to the gh-pages branch, see https://github.com/aiidateam/aiida-registry/blob/master/.github/workflows/webpage.yml
As suggested by @yakutovicha , it would be useful if a Github Actions job would try setting computers and codes in AiiDA, making sure that
Of course, this does not test whether the configuration actually works. For this, see #8
This issue is just to keep track of the desire to potentially introduce a templating mechanism in the AiiDA code registry in order to allow users to specify things like their own slurm account, etc.
See description of what would need to be done on the AiiDA side in aiidateam/aiida-core#4680
Also an early attempt in building a separate cli in #26
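To illustrate the idea, here is a minimal sketch of how a user-specific slurm account could be filled into a shared prepend text. It uses Python's `string.Template` purely for illustration; the placeholder name `$account` and the helper `render_prepend_text` are assumptions, and this is not the mechanism discussed in aiidateam/aiida-core#4680 or #26.

```python
from string import Template

# Shared part of the configuration, stored once per machine.
# The `$account` placeholder is an assumption made for this sketch.
PREPEND_TEMPLATE = Template("#SBATCH --account=$account")


def render_prepend_text(account):
    """Fill the user's own slurm account into the shared prepend text."""
    # `substitute` raises KeyError if a placeholder is left unfilled,
    # so an incomplete configuration cannot be set up silently.
    return PREPEND_TEMPLATE.substitute(account=account)


prepend_text = render_prepend_text("s1005")
print(prepend_text)
```

With a template like this, one set of code files per machine would be enough, since only the rendered prepend text differs between projects.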
The original CSCS AiiDA lab app had a logo of the machine it was setting up.
I think this is a nice touch and would also be useful on the AiiDA code registry web site.
For the moment, I simply propose to add a logo.png to the computer folder (I'd be ok with duplication for different variants of the same computer).
Eventually, this would be used by the web site & dedicated AiiDA lab applications.
(Added to aiidateam/team-compass#4)
For beginners and even for experienced AiiDA users, setting up computers and codes is still tedious. The interactive mode prompts for every option, so the user can set each one carefully, but it requires going through all options even when some are unnecessary. This is time-consuming for a setup that shares most of its options with an existing code/computer setup.
AiiDA provides a non-interactive mode to set up the computer/code from a YAML config file, which lowers the burden for users who need to set up the same computer/code again. However, the non-interactive mode requires a YAML file as input, and it is not clear which options are mandatory, let alone which default values will be used, without checking the command help message or even the source code. Moreover, computer setup is a two-stage process: the user first runs verdi computer setup to set the attributes that are common to the computer and stored in the database, and then runs verdi computer configure to set the information that is specific to the user or that needs to be modified after the node is stored in the database.
The computer/code can be set up from a YAML file, and we provide the aiida-code-registry repository to store the YAML files of public computers and codes and share them with others. It is worth mentioning that the interactive setup command can accept the URL of a remote YAML file. This makes it possible to use the YAML files for setting up a computer/code without downloading or cloning the aiida-code-registry repo.
As a user, I want to have a way to easily setup computer/code and start using AiiDA quickly.
As a user, I want to go to a centralized place to share or find the configurations of my settings.
As a user, when I set up the same computer and codes for another AiiDA profile, I want to do it easily, without repeating the whole process.
As an aiida-code-registry developer, I need the config sharing process to be automated, so that less effort goes into reviewing new computer/code configs added to the repo.
As an aiidalab-widgets-base developer, I want to use an officially supported template system to set up computers/codes in the ComputationalResourceWidget, which sets up a computer/code from the "database".
The template system needs to be supported by verdi, for which the DynamicEntryPointCommandGroup could potentially be used to prompt the user dynamically for the required parameters. But it is not clear how to make this work.
The computer/code registry page needs to be redesigned and well documented in the aiida-core docs to encourage users to contribute to it and maintain it.
@csadorf @unkcpz you are using this for the AiiDAlab right?
Can you please comment here on what is the status and if/what are the missing actions?
E.g.:
I think some of these are in the issues of the repo https://github.com/aiidateam/aiida-code-registry, but probably not all of them. If we find a person in charge, this person can triage a bit the tasks and prioritise them (and possibly distribute them, e.g. organising a "coding day"?)
While the repository contains both codes and computers, I believe the name "AiiDA Code Registry" reflects the fact that the intention of users of the registry will be to set up codes for their calculations.
That being said, I'm certainly open to alternative suggestions.
I do not know if you want to keep the account set here https://github.com/ltalirz/aiida-code-registry/blob/9a01321033b9d6356fd682a0351b34347fcc72e7/computers/daint-mc/daint-mc.yml#L14
It can be useful as an example but also lead to wrong configurations if people just blindly follow the instructions in the readme.
I got an OUT_OF_MEMORY error when running a job at CSCS. There are two solutions:
Large memory nodes option.
#SBATCH --mem=120GB
Currently, the second option is not supported.
During a discussion on 2020-04-06 it was suggested to consider using fully qualified domain names as the directory names in computers/.
This has the advantage of being automatically unique, but the disadvantage of introducing another label.
It also means that storing separate variants of computers (e.g. daint-mc, daint-gpu) will require the introduction of an additional nesting level:
computers/daint.cscs.ch/mc/...
computers/daint.cscs.ch/gpu/...
Since the domain name is not decided by the registry, but by the supercomputing centre, it is also possible (perhaps unlikely?) that it changes over time... perhaps not a big issue.
While I'm not strongly against using domain names, I feel the simpler solution would be to use the computer label as we do now.
This has the advantage of avoiding clashes of identical labels by design, which I would consider a feature.
Besides passing automatic checks that we set up, for new pull requests, it was suggested that we require
verdi computer test <computername>
ls -l /absolute/path/to/executable
We should create a PR template for this
This has happened randomly many times. Each time it can be worked around by triggering the pre-commit action manually.
/opt/hostedtoolcache/Python/3.8.14/x64/bin/pre-commit run --show-diff-on-failure --color=always --all-files
[INFO] Initializing environment for https://github.com/pre-commit/pre-commit-hooks.
[INFO] Initializing environment for https://github.com/adrienverge/yamllint.git.
[INFO] Installing environment for https://github.com/pre-commit/pre-commit-hooks.
[INFO] Once installed this environment will be reused.
[INFO] This may take a few minutes...
[INFO] Installing environment for https://github.com/adrienverge/yamllint.git.
[INFO] Once installed this environment will be reused.
[INFO] This may take a few minutes...
Trim Trailing Whitespace.................................................Passed
Check Yaml...............................................................Passed
yamllint.................................................................Passed
Error: reserveCache failed: Cache already exists. Scope: refs/heads/master, Key: pre-commit-2-62624b5f817e5ad50836c9485ef39ad2400892a1792ea364154de932ecbfdacd-06aa6aa4168aa02577b824514528c2d6f2735558481578aa90a8261809cea0a9, Version: 28cdb9f5496f334116f23e86f0063f5d3a9348c2e22425a33171e071aadada7e
In my case, paratera uses 2200 as the ssh port, and use_login_shell needs to be set to False in the transport configuration. So we should keep not only the computer configuration in the folder, but also the transport configuration file that is loaded by verdi computer configure ssh --config <url>.
My opinion is to keep this file to the minimal required parameters and leave the other parameters at their defaults.
I am curious why AiiDA does not set up a computer in just one step, rather than with verdi computer setup first and then verdi computer configure ssh?
Distributing a set of pre-configured computers and codes is a problem that needs to be solved in basically all groups who want to start using AiiDA.
Currently, the person in charge needs to write some hand-crafted python scripts (e.g. like this one) in order to automate the computer/code setup, which is tedious and extra code that needs to be maintained.
One way to improve the user experience could be the following:
A folder inside AIIDA_PATH is used to store a set of computer/code configurations (following a well-defined schema). All the user needs to do is to drop a few specific configuration files into this folder.
When load_computer or load_code detect that a desired code/computer is not present in the database, they first look inside this folder. If a corresponding configuration is found, the computer/code is set up on the fly (if not, the NotExistent error is raised as usual).
A verdi config option could be used to enable/disable this feature.
load_computer / load_code
I'll open a draft pull request against the aiida-code-registry that outlines how this could work.
It's not yet touching AiiDA core (this could be done later, if others agree that this would be a welcome feature).
Pinging @unkcpz for info
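The lookup order proposed above can be sketched roughly as follows. This is not AiiDA code: the database is replaced by a plain dict, the configuration files are JSON rather than YAML so that the sketch needs only the standard library, and `NotExistent` is a local stand-in for the AiiDA exception of the same name.

```python
import json
import tempfile
from pathlib import Path


class NotExistent(Exception):
    """Local stand-in for the AiiDA exception of the same name."""


def load_computer(label, database, config_dir):
    """Return the computer `label`, setting it up on the fly if needed.

    `database` is a dict standing in for the AiiDA database and
    `config_dir` is the hypothetical folder holding one configuration
    file per computer.
    """
    if label in database:
        return database[label]
    config_file = Path(config_dir) / f"{label}.json"
    if not config_file.is_file():
        raise NotExistent(f"computer '{label}' not found")
    # Set up the computer from the stored configuration and remember it.
    database[label] = json.loads(config_file.read_text())
    return database[label]


with tempfile.TemporaryDirectory() as tmp:
    # Drop one configuration file into the folder, as a user would.
    config = {"hostname": "daint.cscs.ch"}
    (Path(tmp) / "daint-mc.json").write_text(json.dumps(config))
    database = {}
    computer = load_computer("daint-mc", database, tmp)
    print(computer)
```

After the first call the computer lives in the database, so later calls no longer touch the folder. A missing configuration raises `NotExistent`, as in the proposal.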