computationalradiationphysics / contributing Goto Github PK

View Code? Open in Web Editor NEW

6.0 6.0 10.0 106 KB

:books: guidelines and styles to contribute to our projects

License: Creative Commons Attribution 4.0 International

contributing's People

Contributors

Stargazers

Watchers

Forkers

psychocoderhpc slizzered n01r ax3l sbastrakov theziz bertwesarg j-stephan bernhardmgruber simeonehrig

contributing's Issues

Initial Discussion

This is the discussion to the first CRP C++ coding style

the style based on
- PIConGPU coding style
- code style discussion in ALPAKA alpaka-group/alpaka#102

The style is a mix of both sources. I hope that I not lost rules during the reordering.

Goals:

create a coding style which can used for all C++ projects in ComputationalRadiationPhysics
the code should looks equal independent of the used editor, IDE ( incl. web based editors)
it must be write/ read able without IDE features such as
- block collapse function
- code highlighting
- ...
break the 80 column limit should avoid as best as possible by design (new line rules)
rules should be consistent ( to add linter or code beautifier support )
with heavy template usage the code should stay readable

Bash

I propose to follow for shell/bash scripts the guidelines & styles outlined here: https://github.com/azet/community_bash_style_guide

They contain best-practices, are very clean and easy to understand (and I contributed a very little to them ;) ).

Formating tools

Here you can add tools those help to format the source code.
The goal is that we find a formatting tool to automate the formation.

Example of misformatting with the current clang-format

This piece in PIConGPU seems to have been misunderstood and therefore misformatted by clang-format. I am not sure exactly why.

Admittedly, it is rather (and likely too) complicated anyhow. What is going on there:

ForEachIdx<IdxConfig<numCellsPerSuperCell, numWorkers>> is a functor type
We instantiate an object of this type by passing workerIdx to its constructor, so the curly brace at the end of line 398 is call for constructor, not start of a block
Then we immediately call operator() of the freshly constructed object in the same expression
And pass a lambda defined just there as its argument

Edit: so what the code with the current formatting boils down to (line 398 and start of line 399) is

Type{
     constructor parameter}( lots of other stuff ...

and I suspect the other stuff was complicated and it did not recognize the constructor parameter.

cc @j-stephan

It would probably make sense to define if variables should in principle be in singular or if plural is encouraged for some, e.g. how to handle names for lists, class names, folders, namespaces, cmd. line options, etc.

Usually, singular is easier to read & write.

new line after semicolon but not if it's in a for loop?

It might be obvious that the statement semicolon ; is always followed by a new line doesn't target the semicolons in the head of a for loop, but as there is a slight logical conflict between the global rule there _must_ be a new line after semicolon and for loop's complex parameter _can_ be placed on seperate lines it would not hurt to mention that the semicolons in a for loop are the exception.

P.S.: I'm not sure if thats a typo or not but in the naming oc-token it says toke instead of tokens.

Comments in macros

The proposed way of commenting macros in chapter 8 will not work:

#define MY_MACRO( )                                                           \
    template<                                                                 \
        // typename T0, ..., typename TN                                      \
        BOOST_PP_ENUM_PARAMS( N, typename T )                                 \
    >                                                                         \
    HDINLINE                                                                  \
    void                                                                      \
    //         ( T0 const, ... , TN const           )                         \
    operator( )( BOOST_PP_ENUM_PARAMS( N, T const ) ) const

// will start a comment for the rest of the line. However as macros use \ the whole code above is just one line. So everything after // typename T0, ..., typename TN will be ignored. 😉 Even the github highlighting shows the behaviour! 😆

Which way of commenting should be favoured instead? Just plain old /* … */ ?

CamelCaseIsOutdated

It's a hard truth to face, but the modern C++ community is moving away from camelCase, e.g. in member/function names since several years. And let's be honest, abbreviations were never great in it: doVSWorkaround, LWFAExample, ...

If we want to maintain our street credibility, we will have to switch to more_readable_names as well.

Btw, the STL always did _.

Hopefully there is a Clang-tool assisting us in the transition ;-)

Improve formatting of concepts

The current style config makes concept definitions quite verbose:

Here is a LLAMA concept using my private .clang-format:

    template <typename M>
    concept Mapping = requires(M m) {
        typename M::ArrayExtents;
        typename M::ArrayIndex;
        typename M::RecordDim;
        { m.extents() } -> std::same_as<typename M::ArrayExtents>;
        { +M::blobCount } -> std::same_as<std::size_t>;
        std::integral_constant<std::size_t, M::blobCount>{}; // validates constexpr-ness
        { m.blobSize(typename M::ArrayExtents::value_type{}) } -> std::same_as<typename M::ArrayExtents::value_type>;
    };

And now formatted with the CRP .clang-format:

    template<typename M>
    concept Mapping = requires(M m) {
                          typename M::ArrayExtents;
                          typename M::ArrayIndex;
                          typename M::RecordDim;
                          {
                              m.extents()
                              } -> std::same_as<typename M::ArrayExtents>;
                          {
                              +M::blobCount
                              } -> std::same_as<std::size_t>;
                          std::integral_constant<std::size_t, M::blobCount>{}; // validates constexpr-ness
                          {
                              m.blobSize(typename M::ArrayExtents::value_type{})
                              } -> std::same_as<typename M::ArrayExtents::value_type>;
                      };

Please consider improving the status-quo.

Character limit for lines

I'd vote against the 80 char limit as this is pretty narrow especially on current wide screen monitors. For 4:3 it was reasonable but now? I'd choose 100-120 as a limit. Almost 240 chars fit on a screens width, so 110 would be a good choice (a bit below half of the screen)
Choosing a low limit means lots of lines and much vertical scrolling to avoid the (much less often) required horizontal scrolling.
I can't follow the reason that (some) terminals are by default 80 chars wide. You can configure that...
So only point where horizontal scrolling MIGHT be required then is a 3 way merge and only when the conflicting line is really that long.

Some questions about the Coding Guidlines

I have some questions and unclearities about the coding guidlines.

no code alignment (only indention)
What does this mean? No code like this

int result = my_fancy_function( parameter1, parameter2, parameter3, parameter4,
                                parameter5, parameter6, parameter7 );

but this

int result = my_fancy_function( parameter1, parameter2, parameter3, parameter4,
    parameter5, parameter6, parameter7 );

@ax3l:

int result = my_fancy_function(
    parameter1,
    parameter2,
    // ...
    parameter7
);

code inside the deepest namespace is indented
This is missleading. What if I have such a code:

namespace foo
{
namespace traits
{
    template<
        typename T_Type
    >
    struct Bar;
} // namesspace traits

template<
    typename T_Type
>
void bar( T_Type t )
{
    traits::Bar::bar( t );
}

} // namespace foo

It fulfills the rule, but is this really the intended behaviour?

Template Class / Struct and Type Definitions
There is no explicit rule, that the opening tocken of a template parameter needs to be bing to the template keyword, such that this is forbidden:

template <
    T_Keks
>

although it makes sense in the general context. However especially with these defintions over more than one line, it it harder to spot template< than template <. The same is in fact also true for function and method definitions, but ( is slimer and therefore better to spot, so the effect is not so strong.
Is it really that important to forbid these spaces?

CUDA Kernel <<< >>>

CUDA Kernel <<< and >>> should stay without spaces in between.

Reason one: llvm-cuda does not understand spaces in it.
Reason two: it's one fixed syntax element for CUDA and not three template brackets (even if nvcc understands spaces in it)

Python PEP8 max line length

The PEP8 formatting standard that we are enforcing for our python code limits the code lines to max 79 characters, doc-strings and comments are restricted to 72. Together with the indentation and alignment rules in PEP8 this makes the code hard to format and read. For our C++ code, we decided on the 120 characters limit. I suggest we relax the rules for python too. Even the PEP8 says:

For code maintained exclusively or primarily by a team that can reach agreement on this issue, it is okay to increase the line length limit up to 99 characters, provided that comments and docstrings are still wrapped at 72 characters.

Please, comment your opinion on this. I will suggest we vote first if we increase the numbers at all. If you vote to increase one of the limits, please, include a suggestion for a new limit in your comment.
Vote:

Question	Yes	No
Increase limit for code lines	1	0
Increase limit for comments and doc-strings	1	0

Reevaluate newline style/number of newlines

Looking at the last example in https://github.com/ComputationalRadiationPhysics/contributing/blob/master/codingGuideLines/cpp.md#17-template-class--struct-and-type-definitions (under the comment // template function specialization) I think the number of newlines should be substantially reduced as it destroys the context. If you scroll over something like that you cannot tell if this is a class, method or where anything really starts or ends. Especially with the C++11 trailing return type that adds an extra > in the -> making it much harder to distinguish method name, signature, return type and body.

It would help to put short template argument lists on the same line:

// old
foo<
    int,
    float
>

// new
foo< int, float >

VOTE: const west or east const

Currently there is no way to enforce a ruling on this since both clang-format and clang-tidy don't offer any related options. AFAIK there is a PR to clang-format in the pipeline but who knows when that will be incorporated. In any case we should decide on this as well so that we don't need to vote again once the option arrives.

The arguments for const west are found in the C++ Core Guidelines: https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#Rl-const

tl;dr: It is more familiar to programmers and already present in a lot of code bases.

const west examples:

const int some_var = 1;
const int* some_ptr;
const int* const constant_ptr;
const char* a_function();
const std::string& someClass::a_method();

The arguments for east const are presented here: https://mariusbancila.ro/blog/2018/11/23/join-the-east-const-revolution/

tl;dr: It is more logical and consistent.

east const examples:

int const some_var = 1;
int const* some_ptr;
int const* const constant_ptr;
char const* a_function();
std::string const& someClass::a_method();

VOTE

const west	east const
0	1

Update to clang-format 12

clang-format 12 was released so we can update the .clang-format file here.

Python

For Python scripts, PIConGPU currently enforces:

Versions

Both support for Python 2.7+ and Python 3

Style (PEP8 conformance)

pip install -U flake8
flake8 .

in both Python 2 & Python 3 variants of the flake8 script.
(Note: Recognizes only python scripts ending on *.py, unfortunately.)

Style (Comments)

propose: numpy style
tool: ...?

Static Code Analysis (Warnings, unused code, etc.)

pip install -U pyflakes
pyflakes .

in both Python 2 & Python 3 variants of the flake8 script.
(Note: Recognizes all python scripts.)

East & West Coast Style

Just an little interesting observation: We are using east coast style for all {} bodies but west-coast style for all other <> and () brackets:

template<
     typename T_Bar,
     typename T_Foo
 >
 // ...
 ;
 
 // template function specialization
 template< >
 auto
 foo<
     int,
     float
 >(
     int const & extent,
     float & buf
 )
 -> mem::view::foo::detail::Bar<
     int,
     float
 >
 {
     // ...
 }

The current reasoning is, that {} are used large code groups and <>/() are rather short code lines that would blow up with the opening newline, according to @psychocoderHPC. I am very ok with that.

PR merging options: merge commit, rebase and squash

Previous discussion in: alpaka-group/alpaka#1103

Should we establish a common guideline how PRs are merged? And what should it be?
Currently, alpaka does a rebase and most other projects create a merge commit.

Pre-Compiler vs. Macros

Regarding ComputationalRadiationPhysics/picongpu#1693 it looks odd to force #pragma calls to also start at the 0 column in a file.

#pragma, besides #pragma once, are not necessary pre-compiler instructions that make sense to be aligned that way but are non-C++-standard, extended type descriptions (attributes, no-inline, etc.) and non-C++-standard, compiler-specific compile-time hints.

Overtacking of source code from internal projects

During my work on creating a CMake installer for cupla, I looked in the Alpaka CMake and found some useful stuff, which I would overtake. Now, there the is the question, if we should cite the original author if yes, how?

Indentation of innermost namespace

https://github.com/ComputationalRadiationPhysics/contributing/blob/master/codingGuideLines/cpp.md section 6 points out, that the innermost code in namespaces should be indented (by 4 spaces). Furthermore namespaces are always forced (section 2).

I think this global indent of 4 spaces is problematic.
I vote that the innermost namespace doesn't have to be indented. Or at least not for definitions. For the declarations in .hpp/.h it may be ok, because ideally it would only encompass short function headers. Although in PiConGPU many inline functions are used.

Most of the files I looked at in PiConGPU also ignore the indentation of the innermost namespace.

https://github.com/ComputationalRadiationPhysics/picongpu/blob/master/src/picongpu/include/particles/ionization/ionization.hpp
https://github.com/ComputationalRadiationPhysics/picongpu/blob/master/src/libPMacc/include/math/Tuple.hpp
https://github.com/ComputationalRadiationPhysics/picongpu/blob/master/src/libPMacc/include/mappings/simulation/Selection.hpp
https://github.com/ComputationalRadiationPhysics/picongpu/blob/master/src/picongpu/include/particles/shapes/PCS.hpp
https://github.com/ComputationalRadiationPhysics/picongpu/blob/master/src/picongpu/include/algorithms/DifferenceToLower.hpp
https://github.com/ComputationalRadiationPhysics/picongpu/blob/master/src/picongpu/include/algorithms/Gamma.hpp
https://github.com/ComputationalRadiationPhysics/picongpu/blob/master/src/picongpu/include/fields/FieldManipulator.hpp

class or typename to type template parameters

Looking at the examples here:

template<
    typename T_Foo,
    class T_FooBar
>

I find the mixed use of typename and class not really intutive

const inconsistencies

As per 10. the const specifier should be right of the type. But the examples in 7. const T, 8. and 12. (return type: int const) have it to the left of the type.