aiidateam / aiida-atomistic

AiiDA plugin which contains data and methods for atomistic simulations.

License: MIT License

Python 100.00%

aiida-atomistic's Issues

1. Setting a property: built-in methods vs constructor

Using built-in methods

The current implementation supports two ways to set a property: through the set_property method of the StructureData itself, or through a method of the property class:

### Store the Pbc property in a StructureData node.
##########################################################
#
#
structure = StructureData()
#
# (1) using the set_property method:
#
structure.set_property(name='pbc', value={'value':[True,False,True]})
#
# (2) using built-in method of Pbc property. 
#     This calls internally the set_property method.
#
structure.pbc.set_from_string(dimensionality="3D") 
# ==> structure.pbc.value is then [True,True,True]
#
# The built-in methods should then work in this way:
structure.pbc.set_pbc(input_list=[True,True,True],...)
structure.pbc.from_structure(structure=other_structure,...)
#
# We cannot do the following:
#
# A property should be read-only if accessed directly,
# so both assignments below should raise an error:
structure.pbc.value = 54
structure.pbc = Pbc([True,False,True])
#
# It is not possible to set unsupported properties or to provide unrecognized parameters:
structure.set_property(name='unsupported', value={'value':5})
structure.set_property(name='pbc', value={'unsupported_params':5})

The built-in methods of the properties should call the set_property method to set or update the value of a property. This guarantees that the same data validation checks are always applied, and that the StructureData behaves consistently across different plugins.
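The delegation described above can be sketched in a few lines. This is only an illustration with stand-in classes, not the actual aiida-atomistic implementation: the VALID_PROPERTIES table, the internal _data dictionary, and the mapping from "2D"/"3D" to a pbc list are all assumptions made for the example.

```python
# Hypothetical stand-ins; these are NOT the real aiida-atomistic classes.
VALID_PROPERTIES = {"pbc": {"value"}}  # property name -> accepted parameter names


class Pbc:
    """Read-only view on the 'pbc' entry of its parent structure."""

    def __init__(self, parent):
        self._parent = parent

    @property
    def value(self):
        # No setter is defined, so `structure.pbc.value = ...` raises AttributeError.
        return list(self._parent._data["pbc"]["value"])

    def set_from_string(self, dimensionality):
        # Built-in helper: translate the input, then delegate to set_property.
        # Assumed convention: "2D" means periodic along the first two axes.
        n = int(dimensionality[0])
        self._parent.set_property(name="pbc", value={"value": [i < n for i in range(3)]})


class StructureData:
    def __init__(self):
        self._data = {"pbc": {"value": [True, True, True]}}

    @property
    def pbc(self):
        # No setter here either: `structure.pbc = Pbc(...)` raises AttributeError.
        return Pbc(self)

    def set_property(self, name, value):
        # Single entry point: every validation check lives here.
        if name not in VALID_PROPERTIES:
            raise NotImplementedError(f"Property '{name}' is not supported.")
        unknown = set(value) - VALID_PROPERTIES[name]
        if unknown:
            raise ValueError(f"Unknown parameters for '{name}': {sorted(unknown)}")
        self._data[name] = dict(value)


structure = StructureData()
structure.set_property(name="pbc", value={"value": [True, False, True]})
structure.pbc.set_from_string(dimensionality="3D")
```

Because every helper funnels through set_property, adding a new check there automatically covers all the built-in methods.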

Using a constructor

However, with this approach the immutability of a property is less robust: we can always use one of the methods above to override a property at any time (of course, only if the node is not stored). This may also lead to inconsistencies with other properties which are already defined. A way to avoid these two issues is to delegate everything (setting both the StructureData and its properties) to the constructor of the StructureData class:

### Store the Pbc property in a StructureData node.
##########################################################
#
#
pbc_3d = Pbc(value=[True,True,True])
spin_up = Magnetization(moments=[1,1,1,...])
...
#
structure = StructureData(pbc=pbc_3d, magnetization=spin_up, ...)

Some advantages are:

concrete immutability: you have to create a new StructureData instance to change a property, so everything is really read-only;
consistency checks are performed at creation time;
a property can be instantiated directly, which is more intuitive and flexible, at least before the StructureData node is created.
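As a rough illustration of the constructor-based approach, the sketch below uses frozen dataclasses as stand-ins for the real classes. The field names (symbols, moments) and the specific consistency check are assumptions for the example, and tuples replace lists so that the stored values cannot be mutated in place.

```python
from dataclasses import FrozenInstanceError, dataclass, field, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class Pbc:
    value: Tuple[bool, bool, bool] = (True, True, True)


@dataclass(frozen=True)
class Magnetization:
    moments: Tuple[float, ...] = ()


@dataclass(frozen=True)
class StructureData:
    # Hypothetical minimal structure: only chemical symbols and two properties.
    symbols: Tuple[str, ...]
    pbc: Pbc = field(default_factory=Pbc)
    magnetization: Optional[Magnetization] = None

    def __post_init__(self):
        # Consistency check at creation time: one magnetic moment per site.
        if self.magnetization is not None and len(self.magnetization.moments) != len(self.symbols):
            raise ValueError("Magnetization must define one moment per site.")


pbc_3d = Pbc(value=(True, True, True))
spin_up = Magnetization(moments=(1.0, 1.0))
structure = StructureData(symbols=("Fe", "Fe"), pbc=pbc_3d, magnetization=spin_up)

# Changing a property means building a new object; the original is untouched.
slab = replace(structure, pbc=Pbc(value=(True, True, False)))
```

Attempting `structure.pbc = ...` on such an object raises FrozenInstanceError, which is the "really read-only" behavior mentioned in the list.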

A disadvantage that I see is that structure-dependent properties are not straightforward to initialize. For example, if we want to provide the Hubbard U for a given set of atoms, we can provide the name of the atom, and then a set of operations is performed to determine which atomic sites should have the same U. This cannot be done a priori, and it is a bit counterintuitive how to do it from the constructor: do we pass an instance of the Hubbard class, or a dictionary that is then processed internally to build the Hubbard instance?

In my opinion, we can provide methods in the property classes that allow one to instantiate them. However, this may wrongly suggest that properties can be mutated once the StructureData is created. I am not sure about this, but if I think of the Hubbard property, one very useful method is from_list.
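One way to picture such an instantiation helper is an alternative constructor that needs the list of sites before it can build the property. The sketch below is purely hypothetical: the Hubbard class, its u_per_site field, and the from_symbols name are invented for the example and do not reproduce any existing from_list method.

```python
from dataclasses import dataclass
from typing import Dict, Sequence, Tuple


@dataclass(frozen=True)
class Hubbard:
    # Hypothetical site-based storage: one U value per atomic site.
    u_per_site: Tuple[float, ...]

    @classmethod
    def from_symbols(cls, symbols: Sequence[str], u_by_symbol: Dict[str, float]) -> "Hubbard":
        """Expand {'Fe': 4.0} into one U per site, given the site symbols."""
        unknown = set(u_by_symbol) - set(symbols)
        if unknown:
            raise ValueError(f"No site with symbol(s): {sorted(unknown)}")
        # Sites whose symbol is not listed get U = 0.0 (an assumption of this sketch).
        return cls(tuple(u_by_symbol.get(symbol, 0.0) for symbol in symbols))


hubbard = Hubbard.from_symbols(["Fe", "O", "Fe"], {"Fe": 4.0})
```

The helper returns a new frozen instance and never mutates an existing one, so it does not weaken immutability; it does, however, require the site list up front, which is exactly the structure dependence discussed above.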

4. A site-based approach

In this issue we comment on the description of properties in a site-based approach.

pros:

intuitive and straightforward definition of the properties (just a list where the property is defined for each site)
no "interference" between properties (with kinds, setting one property may require redefining the kinds, which can create ambiguities)
Hubbard parameters are naturally site-based: some properties are complicated to define via kinds
intersite properties (e.g. Hubbard) are easy to define in a site-based approach

cons:

a mapping into kinds is needed, based on thresholds (as for magmoms); this can be done on the fly in the plugin or as a method of our StructureData. It may result in unexpected behavior (too many starting magnetizations, i.e. too many kinds); however, we can do it in a self-consistent way (if kinds > 10, increase the thresholds...)
duplication of information with respect to kinds, which are more compact

The `to_kinds` method

If we adopt the site-based approach, the sites-to-kinds mapping can be provided via a to_kinds method of the StructureData. This method can support tolerances for the definition of new kinds.
However, the mapping can also be done by hand for specific cases (for example, when we want a specific order of the kinds).
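A minimal sketch of such a sites-to-kinds mapping is given below, written as a free function for simplicity. It assumes magnetic moments are the only site property, uses a simple greedy grouping (so the result depends on site order), and the parameter names threshold, max_kinds and growth are invented for the example. When too many kinds are produced, the threshold is increased and the grouping repeated, in the self-consistent spirit mentioned in the cons list.

```python
def to_kinds(symbols, magmoms, threshold=0.1, max_kinds=10, growth=2.0):
    """Group sites into kinds; return (kinds, kind index per site, threshold used)."""
    if threshold <= 0 or growth <= 1:
        raise ValueError("threshold must be > 0 and growth must be > 1")
    if len(set(symbols)) > max_kinds:
        # Even merging all sites of each element cannot satisfy max_kinds.
        raise ValueError("More chemical elements than max_kinds.")

    while True:
        kinds = []       # list of (symbol, reference magmom)
        site_kinds = []  # index into `kinds` for each site
        for symbol, magmom in zip(symbols, magmoms):
            for index, (kind_symbol, kind_magmom) in enumerate(kinds):
                if kind_symbol == symbol and abs(kind_magmom - magmom) <= threshold:
                    site_kinds.append(index)
                    break
            else:
                # No existing kind is close enough: define a new one.
                kinds.append((symbol, magmom))
                site_kinds.append(len(kinds) - 1)
        if len(kinds) <= max_kinds:
            return kinds, site_kinds, threshold
        threshold *= growth  # too many kinds: loosen the tolerance and retry


kinds, site_kinds, used = to_kinds(["Fe", "Fe", "Fe", "O"], [2.0, 2.05, -2.0, 0.0])
```

Returning the threshold that was finally used makes the automatic loosening visible to the caller instead of hiding it.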

3. How properties should be stored inside the StructureData Node

The `properties` namespace

Properties may be stored inside an attribute called properties. For example in case of the pbc property, it should be stored as StructureData.properties.pbc: this is the class Pbc and its actual value (which is a list of booleans) can be accessed via StructureData.properties.pbc.value.

Everything in the properties attribute can be subject to a consistency check that detects unknown entries and raises an error (the same check can be done via the structure.get_valid_properties() and structure.get_defined_properties() methods).
The question also becomes: what is a property? Are the pbc a property? What about the mass of each site?

Physical storage of a property: database or repository?

The advantage of the database (storing properties as attributes of the StructureData node) is that properties can be queried and searched. The repository, instead, cannot be queried.
However, Hubbard parameters, for example (see HubbardStructureData), are stored in the repository as JSON files, using pydantic functionalities.

Sometimes, we want only some parts of a property to be queryable. This means that we can design a property class to determine which of its parameters/values should be stored and where, which adds a level of flexibility for the storage location. This flexibility, however, should exist only in the implementation of the property, i.e. we cannot decide the storage location on the fly when creating a StructureData instance.

The getter and setter methods (of the current implementation) should be generalized to support both db-based and repository-based properties. Alternatively, we should define two different sets of methods, one for the database and one for the repository.
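To make the first option concrete, here is a small sketch in which each property class declares where its fields go, and a single generic setter dispatches accordingly. Everything here is hypothetical: plain dictionaries stand in for the node attributes and the repository, and the storage table, the field names and the store_property helper are invented for the illustration.

```python
import json
from typing import ClassVar, Dict


class Property:
    # Maps each field name to "db" (queryable attributes) or "repository" (file).
    # Fixed in the class definition, so it cannot be changed per instance.
    storage: ClassVar[Dict[str, str]] = {}

    def __init__(self, **fields):
        unknown = set(fields) - set(self.storage)
        if unknown:
            raise ValueError(f"Unknown fields: {sorted(unknown)}")
        self._fields = fields

    def split(self):
        """Return (db_part, repository_part) according to the class-level table."""
        db = {k: v for k, v in self._fields.items() if self.storage[k] == "db"}
        repo = {k: v for k, v in self._fields.items() if self.storage[k] == "repository"}
        return db, repo


class Hubbard(Property):
    storage = {"formulation": "db", "parameters": "repository"}


def store_property(name, prop, attributes, repository):
    """Generic setter: one code path for db-based and repository-based fields."""
    db, repo = prop.split()
    if db:
        attributes[name] = db
    if repo:
        repository[f"{name}.json"] = json.dumps(repo)


attributes, repository = {}, {}
hubbard = Hubbard(formulation="dudarev", parameters=[["Fe", 4.0]])
store_property("hubbard", hubbard, attributes, repository)
```

With this layout only the "formulation" field would be queryable, while the bulkier "parameters" field lives in a JSON file.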

DB-like properties

Stored in the self.base.attributes._property_attributes dictionary. This should change to self.base.attributes.property.

Repository-like properties

Stored in the repository and accessed via pydantic BaseModel functionalities.

2. Unsupported and custom properties

Currently, we cannot define unsupported properties (set_property checks the name against the output of get_valid_properties), nor define a property with parameters that are not recognized (extra parameters).

Unsupported properties

A given plugin may not support all the implemented properties (e.g. hubbard).
Two codes that are using the same StructureData may not support the same properties that are stored in the node.

When a structure is passed to a plugin calcjob/workchain, there should be a check on all the properties associated with the structure. Then two behaviors are possible:

1 - the code uses a calcfunction to create a new instance of the StructureData without the unsupported property
2 - the code warns about the unsupported property, but skips it and continues; this can also be done using some keyword such as ignore_property="hubbard"

The second option is preferred if we have multiple codes which support different properties, and it also avoids some StructureData duplication.
-> It should be obvious how and why the property is ignored in the plugin (for example, if you look at an old calculation, you should understand the decision clearly). A part of the documentation should explain how to do this in a plugin, which keywords to use, and which warnings/exceptions are raised.
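The second behavior could look roughly like the sketch below. It is one possible reading of the proposal, not an agreed design: here an unsupported property is skipped with a warning only when it is explicitly listed in ignore_property, and anything else raises. The function name check_properties and its return value are assumptions.

```python
import warnings


def check_properties(defined, supported, ignore_property=()):
    """Return (used, ignored) property names for a plugin.

    defined:         properties stored in the structure
    supported:       properties the plugin can handle
    ignore_property: unsupported properties the user explicitly agrees to skip
    """
    unsupported = set(defined) - set(supported)
    not_ignored = unsupported - set(ignore_property)
    if not_ignored:
        raise ValueError(
            f"Unsupported properties {sorted(not_ignored)}; "
            "pass them in ignore_property to skip them explicitly."
        )
    for name in sorted(unsupported):
        warnings.warn(f"Property '{name}' is not supported by this plugin and is ignored.")
    used = sorted(set(defined) & set(supported))
    return used, sorted(unsupported)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    used, ignored = check_properties(["pbc", "hubbard"], ["pbc"], ignore_property=["hubbard"])
```

Returning the ignored names lets a plugin record them alongside the calculation, so that the decision remains visible when an old calculation is inspected.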

Custom properties

We may also let the user define custom properties, as done in OPTIMADE by means of a prefix.
This means that the StructureData should support the definition of such custom properties, and maybe store them under a different key in the attributes, so that they are easy to find and recognize.
We should provide a template on how to define such properties.
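A possible shape for this is sketched below, loosely following the prefix idea. The prefix string, the "custom_properties" key and the function name are placeholders chosen for the example, and a plain dictionary stands in for the node attributes.

```python
CUSTOM_PREFIX = "_custom_"      # hypothetical prefix marking user-defined properties
CUSTOM_KEY = "custom_properties"  # hypothetical separate key in the attributes


def set_custom_property(attributes, name, value):
    """Store a user-defined property under a dedicated key, enforcing the prefix."""
    if not name.startswith(CUSTOM_PREFIX) or name == CUSTOM_PREFIX:
        raise ValueError(f"Custom property names must start with '{CUSTOM_PREFIX}'.")
    attributes.setdefault(CUSTOM_KEY, {})[name] = value


attributes = {"pbc": {"value": [True, True, True]}}
set_custom_property(attributes, "_custom_surface_charge", {"value": 0.2})
```

Keeping custom entries under their own key means the consistency check on supported properties can skip them, while they remain easy to find.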