The edge of C++

Everything we interact with in our daily life (well, except perhaps the universe) has a boundary that confines it within well-defined limits. There are borders delimiting a country, there are walls keeping out bluer-than-white walkers, there is a finite number of bits that bounds the values a variable can hold, and of course there is a maximum amount of source a C++ compiler can swallow without choking.

In our daily usage of C++ we rarely step onto these outer edges, but the almighty standard covers these fringe situations too. This article, in which we will explore the outer edges of some of the best-known compilers, is based on “Annex B - (informative) Implementation quantities [implimits]” of the (current) C++ standard.

Throughout the article I will walk you through these limits and what they mean for your daily life, and I will present a tool that generates small test source files, one for each specific limit from Annex B, that push the compilers to their edges.

Annex B

No, I am not going to include Annex B in the article: it would be a waste of paper, and we try to be as environmentally conscious as possible. Anyone interested can fetch it from [ANNEXB]; here I will just give a short overview of what it is.

Annex B lists, for various constructs of a C++ program, the minimum quantities that the standard’s authors recommend a compiler should support. “However, these quantities are only guidelines and do not determine compliance.” (quote from the standard).

For example, it is recommended that a compiler support at least 256 arguments in one function call. This certainly sounds like a big number, and no-one should be required to type in 256 arguments by hand. But a lot of the code compiled today is first produced by code generators (think of Google’s protobuf compiler, the unreadable output of a software modeling/CASE application, Qt’s resource compiler, or any other application that generates code for you), so you might well end up with generated code that heads towards this limit.

The actual limits imposed by the compiler

All the current compilers I have tested have a page ([GCCLIMITS], [CLANGLIMITS], [MSVCLIMITS]) presenting the actual limits imposed by their implementation. However, not every limit from Annex B appears in every compiler’s documentation, and the compilers do not all document identical values.

The test suite

As mentioned before, the main purpose of this article is to provide a set of tests that exercise the edge situations compilers support. The test code is generated by an application which, for fun’s sake, is written in the Go programming language; it is available at [GITHUBCST]. Everyone is free to download it, modify it and extend it to fit their needs.

The test suite is described in one big JSON file in which each entry has the format:

{
    "run": true,
    "testName": "parameterCountInFunctionDefinition",
    "count": ["256"],
    "minimum": "256",
    "description": "Parameters in one function definition ([dcl.fct.def.general])"
}

Most of the fields are self-explanatory. The testName, however, must map to one of the functions in the Go program, which parses this JSON and calls the corresponding function once for each value in the count field.
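To make the mapping from testName to a generating function concrete, here is a minimal C++ sketch of the idea. It is not the Go code from [GITHUBCST]: the names TestEntry, Generator, placeholderGenerator, registry and runEntry are all hypothetical, and the JSON parsing step is assumed to have already filled in the struct.

```cpp
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical in-memory mirror of one JSON entry (only the fields used here).
struct TestEntry {
    bool run;
    std::string testName;
    std::vector<std::string> count;
};

// A generator receives the requested quantity and returns C++ source text.
using Generator = std::function<std::string(int)>;

// Hypothetical stand-in for a real generator function.
std::string placeholderGenerator(int count) {
    return "// test source for count " + std::to_string(count) + "\n";
}

// Lookup table from testName to the function that generates that test.
const std::map<std::string, Generator>& registry() {
    static const std::map<std::string, Generator> table = {
        {"parameterCountInFunctionDefinition", placeholderGenerator},
    };
    return table;
}

// Produces one generated source per value in the entry's count list.
std::vector<std::string> runEntry(const TestEntry& entry) {
    std::vector<std::string> sources;
    if (!entry.run) {
        return sources;  // disabled entries generate nothing
    }
    const auto it = registry().find(entry.testName);
    if (it == registry().end()) {
        throw std::invalid_argument("unknown testName: " + entry.testName);
    }
    for (const auto& value : entry.count) {
        sources.push_back(it->second(std::stoi(value)));
    }
    return sources;
}
```

With this layout, adding a new test means writing one generator function and registering it under the name used in the JSON file; an entry whose testName is not registered is reported instead of being silently skipped.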

As a side note, some of these test cases were meant to test a specific feature of the compilers but unintentionally highlighted an error somewhere else in the product. I decided to leave them as they are, because highlighting these errors might be useful for compiler writers on their quest to continuously improve their products.

The compilers

I have performed all the tests on one computer that dual-boots two operating systems. Firstly, a brand new shiny Ubuntu 20.04, just downloaded from Canonical, with the following compilers:

g++ 9.3.0 (installed via apt)
clang 10.0.0 (installed again via apt)
icc (ICC) 19.1.2.254 20200623 (installed as a byproduct of a trial version of Intel Parallel Studio)

And secondly, under Windows 10:

msvc from Visual Studio 2019

I intentionally did not use a locally compiled version of any of those compilers. I tend to stick to the mainstream Linux distributions and use what is available right out of the box to the largest communities of programmers, so a highly personalized compiler would not have been an ideal ground for comparison for everyone who uses the default compilers on their OS.

Some of these test cases required the activation of C++17 features; however, I consider that in 2020 this should not be such a big issue.

Timing issues

I intentionally did not include in the test results a precise measurement of the time each test took to compile. Such numbers would have made sense only for my computer; anyone repeating the tests on a much slower or faster computer would obtain significantly different results.

Where I have observed a noticeable difference between the various compilers I have added my comments regarding that specific situation.

The compilers’ own test suite

Before digging deeper in the subject, I have to mention that both gcc and clang come with exhaustive test suites meant to verify the correct functionality and standard compliance of the compilers, but I did not find a dedicated test suite for the edge situations I am researching through this article, so I thought that providing a unified set of tests for all the C++ compilers would be beneficial.

Unfortunately I did not find any test suite for the Microsoft and Intel compilers, which is understandable considering the closed source nature of these products, but I would love to hear from developers who work (or worked) on MS’s C++ compiler to learn whether they have considered these test cases too.

Numbers

For the test cases I intentionally used numbers which are powers of two. Only in very special cases did I dig deeper and identify a number outside this family. For most of the test cases I have tested specifically against the standard recommended value; for some I have pushed the compilers a bit further, which is noted in the test case.

The tests

Most of the tests are represented as a single generated C++ file; however, in some cases it made sense to join tests together. For example, testing the maximum number of arguments in a function call really makes sense only together with the maximum number of parameters a function can have.

These small applications were carefully engineered to cover all the required edge cases, and each can be compiled independently of the others. Besides the CPP files, the test generator can also create a Makefile (and a CMakeLists.txt) to facilitate easy compilation, if requested in the JSON file ("generateMakefile": false/true).

In the JSON file describing the tests you can also instruct the generated Makefile to measure the execution time of the compiler (and other important data) by invoking the time command (on Linux: /usr/bin/time) before the actual compile commands: set "timedCompilation" to true and give "timeFlags" some values if required. Right now I use "-f '%E,%M'", which reports the elapsed wall-clock time (%E, formatted as [hours:]minutes:seconds) and the maximum resident set size of the process in kilobytes (%M).

In order not to depend on data from a single invocation of the compiler, you can set the "compilationTimes" property to the number of compiler invocations you want; this lets you gather an average execution time for the compiler compiling the same source.
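If you want to post-process the measurements yourself, the "-f '%E,%M'" output is easy to parse. The following is a small C++ sketch under the assumption that each measurement arrives as one line such as 0:01.50,20480 (GNU time prints %E as [hours:]minutes:seconds and %M in kilobytes); the names Measurement, parseTimeLine and averageSeconds are hypothetical and not part of the test generator.

```cpp
#include <sstream>
#include <string>
#include <vector>

// One parsed "-f '%E,%M'" line (hypothetical helper type).
struct Measurement {
    double seconds;  // elapsed wall-clock time converted to seconds
    long maxRssKb;   // maximum resident set size in kilobytes
};

// Parses a line such as "0:01.50,20480" or "1:02:03,20480".
Measurement parseTimeLine(const std::string& line) {
    Measurement m{0.0, 0};
    const auto comma = line.find(',');
    // Each ':' separated field is worth 60 times the field after it.
    std::istringstream fields(line.substr(0, comma));
    std::string field;
    while (std::getline(fields, field, ':')) {
        m.seconds = m.seconds * 60.0 + std::stod(field);
    }
    if (comma != std::string::npos) {
        m.maxRssKb = std::stol(line.substr(comma + 1));
    }
    return m;
}

// Averages the elapsed time over several invocations of the same compile.
double averageSeconds(const std::vector<std::string>& lines) {
    if (lines.empty()) {
        return 0.0;
    }
    double total = 0.0;
    for (const auto& line : lines) {
        total += parseTimeLine(line).seconds;
    }
    return total / static_cast<double>(lines.size());
}
```

Converting everything to seconds first keeps the averaging trivial, whichever of the two %E layouts a particular run produced.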

In the tests there are places where local (or global) variables are initialized. For simplicity’s sake, and in order to get consistent and reproducible behaviour between test runs, all of them are initialized to one. I have found no difference in the compilers’ performance between using a set of random numbers and using just plain ones.

All of the tests require the output of some values on the screen so I am using the standard iostream header with std::cout to print out all necessary values.

Nesting level of iteration, selection, compound statements - nestingOfStatements

For this test I have generated a simple source file containing alternating for and if statements, like the sequence below:

 int main() {
  for (int f0 = 1; f0<256; f0++ )
   if (f0 % 2 == 0)
    for (int f1 = f0; f1<256; f1++ )
     if (f1 % 3 == 0)
      for (int f2 = f1; f2<256; f2++ )
       if (f2 % 4 == 0)
       ...

This seemed to be complex enough for the optimizer not to optimize everything away, while still generating assembly code which is not utterly complex.

Regardless, no-one in this life should be required to handle applications in which the nesting level reaches even half of the standard recommended value, which is 256. Maybe this is why clang, being a pragmatic compiler, topped out at 128: it was killed after five hours of struggling with the generated source consisting of 256 nested statements. gcc, on the other hand, had no problems compiling applications with nesting levels up to 1024. More I did not dare.

icc had no problems generating code for up to 256 nesting levels, but msvc does not support a depth of 256. It actually gives an error even at 166, nestingOfStatements-166.cpp(169): fatal error C1061: compiler limit: blocks nested too deeply, but compiles fine for 164.

gcc	clang	msvc	intel
1024	128	164	256
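For readers who prefer to see the generation step spelled out, here is a minimal C++ sketch that produces the alternating for/if pattern for a given depth. It is an illustration only (the real generator is the Go program), the function name generateNestedStatements is hypothetical, and the innermost statement, which prints the last loop counter, is my assumption since the listing elides it.

```cpp
#include <string>

// Builds a source file with `depth` nested statements (depth >= 0),
// alternating between a for loop and an if statement.
std::string generateNestedStatements(int depth) {
    std::string src = "#include <iostream>\nint main() {\n";
    int loops = 0;  // number of for loops emitted so far
    for (int level = 0; level < depth; ++level) {
        const std::string indent(level + 1, ' ');
        if (level % 2 == 0) {
            // Each loop starts from the counter of the enclosing loop.
            const std::string var = "f" + std::to_string(loops);
            const std::string init =
                loops == 0 ? std::string("1") : "f" + std::to_string(loops - 1);
            src += indent + "for (int " + var + " = " + init + "; " + var +
                   "<256; " + var + "++ )\n";
            ++loops;
        } else {
            // The divisor grows with every if: % 2, % 3, % 4, ...
            src += indent + "if (f" + std::to_string(loops - 1) + " % " +
                   std::to_string(loops + 1) + " == 0)\n";
        }
    }
    // Assumed innermost statement: print the last loop counter.
    const std::string innermost =
        loops == 0 ? std::string("0") : "f" + std::to_string(loops - 1);
    src += std::string(depth + 1, ' ') + "std::cout << " + innermost +
           " << std::endl;\n}\n";
    return src;
}
```

Writing the result of generateNestedStatements(256) to a file gives a test at the standard recommended depth; changing the argument is all that is needed to probe other depths.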

Nesting levels of conditional inclusion - nestingLevelOfConditionalInclusion

This test required the definition of a specific number of identifiers, each of which is tested in a nested conditional check; only if all of the checks succeed is the header file needed for writing out the number for this test included. The generated code looked like this:

#define COND_0 1
#define COND_1 1
...
#if defined COND_0
 #if defined COND_1
  ...
   #include <iostream>
  ...  
 #endif 
#endif

No compiler had any issue compiling the code up to 512, which is double the standard recommended value; however, clang was much slower than gcc.

gcc	clang	msvc	intel
512	512	512	512
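The layout of this test is regular enough to be generated in a few lines. The following C++ sketch shows one way to do it; it is not the Go generator itself, the function name generateNestedConditionals is hypothetical, and the main function that prints the depth is my assumption about what "writing out the actual number" looks like.

```cpp
#include <string>

// Builds a source file with `depth` nested #if blocks (depth >= 0) that
// only include <iostream> when every COND_n macro is defined.
std::string generateNestedConditionals(int depth) {
    std::string src;
    for (int i = 0; i < depth; ++i) {
        src += "#define COND_" + std::to_string(i) + " 1\n";
    }
    // Open the conditionals, indenting one space per nesting level.
    for (int i = 0; i < depth; ++i) {
        src += std::string(i, ' ') + "#if defined COND_" + std::to_string(i) + "\n";
    }
    src += std::string(depth, ' ') + "#include <iostream>\n";
    // Close them again from the innermost level outwards.
    for (int i = depth - 1; i >= 0; --i) {
        src += std::string(i, ' ') + "#endif\n";
    }
    // Assumed body: print the nesting depth that was tested.
    src += "int main() {\n    std::cout << " + std::to_string(depth) +
           " << std::endl;\n}\n";
    return src;
}
```

If any level of the conditional inclusion were skipped, the header would be missing and the generated file would fail to compile, which is what makes the test self-checking.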

Pointer, array, and function declarators modifying something - pointerAndArrayDeclaratorsModifyingSomething

I have to admit, this was one of the trickiest cases I had to generate code for. The optimizers in today’s compilers are simply too clever: they instantly see through your intentions, throw out all the code you generated for calculating values, and instead calculate the values themselves and place the results in the generated code. So I really had to use a lot of trickery.

For example, this is the code generated for 4:

#include <iostream>

constexpr int z() {
     return 0;
}
int main() {
    volatile int i = 0;
    volatile int *volatile p1=&i;
    volatile int *volatile *p2 = &p1;

    * & z()[* & z()[* & z()[&p2] ] ] = 4;
    std::cout << i << std::endl;
}

As expected, it prints out 4. Some explanations: firstly, without volatile, the compiler simply ignores all the code and just generates the required assignment. The weird looking expression * & z()[* & z()[* & z()[&p2] ] ] = 4; is actually equivalent to **&p2[0] = 4; but I wanted to use pointer arithmetic, array indexing and a function call in the same expression, and thus ended up with this monstrosity.

gcc had no problems compiling up to 1024. clang complained at a certain point with fatal error: bracket nesting level exceeded maximum of 256; however, once I specified -fbracket-depth=1024 it compiled without any issues.

msvc and icc again had no problems compiling up to 1024.
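To see why the expression collapses to a plain double dereference, it helps to unwind it from the inside out. The sketch below wraps the same expression in a function and annotates each step; the function names assignThroughChain and assignDirect are hypothetical, everything else follows the listing above.

```cpp
constexpr int z() {
    return 0;
}

// Same chain as in the listing, with each step spelled out.
int assignThroughChain(int value) {
    volatile int i = 0;
    volatile int* volatile p1 = &i;
    volatile int* volatile* p2 = &p1;

    // z()[&p2]        is 0[&p2], i.e. *(0 + &p2), which is p2 itself
    // * & (that)      takes the address and dereferences again: still p2
    // z()[p2]         is 0[p2], i.e. *p2, which is p1
    // * & (that)      is still p1
    // z()[p1]         is 0[p1], i.e. *p1, which is i
    // * & (that) = v  finally assigns to i
    * & z()[* & z()[* & z()[&p2] ] ] = value;
    return i;
}

// The straightforward spelling of the same assignment.
int assignDirect(int value) {
    volatile int i = 0;
    volatile int* volatile p1 = &i;
    volatile int* volatile* p2 = &p1;
    **p2 = value;
    return i;
}
```

The trick relies on the built-in subscript operator being symmetric: for a pointer p and an integer n, n[p] means the same as p[n], so 0[p] is just another way of writing *p.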

gcc	clang	msvc	intel
1024	1024	1024	1024

Nesting levels of parenthesized expressions within a full-expression - nestingLevelsOfParenthesizedExpressionsInAFullExpression

Because the compiler can be very effective at optimizing code by precalculating values in the compilation phase, this test generates a complex parenthesized expression that calculates sums and products of various numbers. gcc had no issues with an expression depth of 1024 (quadruple the standard recommended value); clang, however, gave a very clear error message in the form of fatal error: bracket nesting level exceeded maximum of 256, and I also appreciated the suggestion on the next line on how to fix it: use -fbracket-depth=N to increase maximum nesting level. After using this parameter, clang compiled without problems.

msvc and icc had no problems compiling nested parentheses up to 1024, which is a pretty large value for this purpose. I concluded this to be the accepted value for this test case, because some compilers (well, all except gcc) started showing error messages for 2048.

gcc	clang	msvc	intel
2048	1024	1024	1024

Number of characters in an internal identifier or macro name - identifierOrMacroNameLength

This was an easy run: define a macro with a long random name, then a function with a different long random name, containing a variable with a third long random name which is assigned the macro. Then call this function from main. Most of the tested compilers had no errors compiling code with names as long as 8192 characters, except msvc, which conjured up the message:

identifierOrMacroNameLength-8192.cpp(3): fatal error C1064: compiler limit: token overflowed internal buffer

msvc proved to be successful for 2048.

gcc	clang	msvc	intel
8192	8192	2048	8192

Number of characters in an external identifier - externIdentifierNameLength

Almost as easy as the previous test. I just had to use a small trick to avoid needing multiple compilation units for the externness of the variable: I defined it after the main function. So the code might look like this, for example:

#include <iostream>

int main() {
	extern int vxvl;
	std::cout << vxvl << std::endl;
}
int vxvl = 4;

Most of the tested compilers had no issues compiling code with variable names of up to 8192 characters, which I consider to be more than enough. The exception was msvc, which gave up with an error message similar to the previous case, but succeeded for 2048.

gcc	clang	msvc	intel
8192	8192	2048	8192

External identifiers in one translation unit - externIdentifiersInOneTranslationUnit

The code generated for this case pretty much follows the recipe of the previous case; it just varies the number of identifiers. Here, to my surprise, clang crashed when it tried to compile the standard suggested value, 65536, in something which, as per the printed stack trace, looks like a recursive call. gcc also had its fair share of struggle with this value: it took several minutes, but it completed its task successfully. icc gave up with the following error:

externIdentifiersInOneTranslationUnit-65536.cpp(65540): internal error: bad pointer

so I actually had to lower my expectations. clang successfully managed to compile 8192 external identifiers, and icc managed 4096.

msvc really had no problems compiling the test case with 65536 values.

gcc	clang	msvc	intel
65536	8192	65536	4096

Identifiers with block scope declared in one block - identifiersWithBlockScopeDeclaredInOneBlock

This test just consisted of generating a long list of variables in a block to see when the compiler complains, but all the tested compilers successfully compiled even 8192 local variables.

gcc	clang	msvc	intel
8192	8192	8192	8192

Parameters in one function definition - parameterCountInFunctionDefinition

This is one of the test cases which was joined together with another, namely “Arguments in one function call”, because it just made sense. The application generates a function with the required number of parameters, generates a list of variables of different types, and calls the function with the required number of arguments. None of the tested compilers had issues compiling functions with up to 4096 parameters, which is sixteen times the recommended amount, so I consider that to be a fair number for this test.

gcc	clang	msvc	intel
4096	4096	4096	4096

Structured bindings introduced in one declaration - structuredBindingsInOneDeclaration

The code generated for this is a larger-scale version of the following:

#include <iostream>

int main() {
	int arr[] = {1, 1, 1, 1};
	auto volatile [v0, v1, v2, v3] = arr;
	int i = v0 + v1 + v2 + v3;
	std::cout << i << std::endl;
}

Most of the tested compilers (well, all except icc) had no issues compiling code with up to 8192 bindings. To my great surprise, this is one of the tests where gcc proved to be slower than clang, but both compiled the test files nicely.

My other surprise came from icc which gave a core dump upon compiling 8192:

structuredBindingsInOneDeclaration-8192
": internal error: ** The compiler has encountered an unexpected problem.
** Segmentation violation signal raised. **
Access violation or stack overflow. Please contact Intel Support for assistance.

icc: error #10105: /home/fld/intel/compilers_and_libraries_2020.2.254/linux/bin/intel64/mcpcom: core dumped
icc: warning #10102: unknown signal(0)
icc: error #10106: Fatal error in /home/fld/intel/compilers_and_libraries_2020.2.254/linux/bin/intel64/mcpcom, terminated by unknown
compilation aborted for structuredBindingsInOneDeclaration-8192.cpp (code 1)

but in the end, 4096 seemed like a good number for icc.

gcc	clang	msvc	intel
8192	8192	8192	4096
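Scaling the four-binding listing up to an arbitrary count is mechanical. The C++ sketch below reproduces the listing for a given number of bindings; the function name generateStructuredBindings is hypothetical, and the real Go generator may lay out its output differently.

```cpp
#include <string>

// Builds the structured-bindings test for `count` bindings (count >= 1),
// following the layout of the four-binding listing above.
std::string generateStructuredBindings(int count) {
    std::string init;   // "1, 1, 1, ..."
    std::string names;  // "v0, v1, v2, ..."
    std::string sum;    // "v0 + v1 + v2 + ..."
    for (int i = 0; i < count; ++i) {
        const std::string name = "v" + std::to_string(i);
        const std::string sep = i == 0 ? "" : ", ";
        init += sep + "1";
        names += sep + name;
        sum += (i == 0 ? "" : " + ") + name;
    }
    return "#include <iostream>\n\nint main() {\n"
           "\tint arr[] = {" + init + "};\n"
           "\tauto volatile [" + names + "] = arr;\n"
           "\tint i = " + sum + ";\n"
           "\tstd::cout << i << std::endl;\n}\n";
}
```

Because every element is initialized to one, the generated program prints the number of bindings, which makes a wrong result easy to spot.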

Macro identifiers simultaneously defined in one translation unit - macroCountInOneTranslationUnit

This was one of the easiest tests to come up with: just generate a file with enough macros and let the compilers go wild on them. gcc and icc had no problems sorting out, at blazing speed, files containing up to 65536 macros; clang, however, started choking after 8192 with a coredump. A similar fate awaited msvc:

macroCountInOneTranslationUnit-8192.cpp(8197): fatal error C1009: compiler limit: macros nested too deeply

so I had to lower my expectations and the number of generated macros to 256.

When even this number gave a compiler error (not a compile error), I started thinking that maybe my test case is plainly wrong, that maybe I expect too much from the macro engine of msvc, or that a test with the following logic is plainly not good:

#include <iostream>

#define V0 1
#define V1 V0 + 1
#define V2 V1 + 1
#define V3 V2 + 1
#define V4 V3 + 1

int main() {
    std::cout << V4 << std::endl;
}

From the error message I felt that this specific test case must have stepped on the toes of the msvc compiler. I had the feeling that no (decent) compiler would have problems with 256 macros defined in a source file, so the problem must be the chained substitution, in which every macro expands through all the previous ones, and I concluded that this test takes the wrong approach to this limit. However, since it managed to annoy two of the compilers to the point of breaking, I decided to leave it in; maybe someone in one of the development teams will have a look at these cases.

gcc	clang	msvc	intel
65536	8192	128	65536
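A variant of the test that separates the number of macros from the depth of their expansion would define every macro independently. The following C++ sketch generates such a file; it is a suggestion rather than something the test suite contains, I have not run its output against the four compilers, and the function name generateIndependentMacros is hypothetical.

```cpp
#include <string>

// Builds a source file that defines `count` macros, none of which refers
// to another, so only the number of simultaneous definitions is stressed.
std::string generateIndependentMacros(int count) {
    std::string src = "#include <iostream>\n\n";
    for (int i = 0; i < count; ++i) {
        src += "#define V" + std::to_string(i) + " 1\n";
    }
    // Use the last macro so that the definitions are not entirely dead.
    const std::string last =
        count > 0 ? "V" + std::to_string(count - 1) : std::string("0");
    src += "\nint main() {\n    std::cout << " + last + " << std::endl;\n}\n";
    return src;
}
```

Comparing the results of this flat variant with the chained one would show whether a failure comes from the macro count or from the expansion depth.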

Parameters in one macro definition - parametersInMacroDefinition

A very simple test to make: it is similar to parameterCountInFunctionDefinition, it just involves a macro. This test case was implemented together with the “Arguments in one macro invocation” test, since it made sense to have both run together.

All the compilers took the code very well up to 4096 (except MS, see below), which is several times the value recommended by the standard. [GCCLIMITS] mentions that gcc allows up to USHRT_MAX arguments, which should be at least 65535. 65535 worked nicely, but the small devil woke up somewhere inside and I had to try running with 65536.

gcc provided a cute (but weird) error:

parametersInMacroDefinition-65536.cpp:5: error: macro "M" passed 65536 arguments, but takes just 0
    5 |   int v = M(1, 1, 1,

Seemingly there was an overflow somewhere deep inside gcc. clang threw a tantrum in the form of a coredump for the same number. icc compiled without any complaints.

Microsoft’s own compiler was very consistent with the limits mentioned in [MSVCLIMITS]: it accurately gave a warning that 127 is the maximum number of parameters supported in these situations.

clang successfully compiled for 9216 and failed for 10240, so I decided that the maximum supported value must be somewhere in between.

gcc	clang	msvc	intel
65535	9216	127	65536

Characters in one logical source line - charactersInOneLogicalSourceLine

This test case was just about generating a long list of summations that in the end prints out the number of characters in the tested line. The following listing gives the source for 15, for example.

#include <iostream>
int main() {
int a=9+2+2+2 ;
	std::cout << a << std::endl;
}

gcc struggled with the standard recommended value (65536), but after a while it completed the operation successfully. clang, to my great surprise, produced another crash. I am not sure whether this was due to the line length or to clang handling the very long sequence of operations in some peculiar way, but since I personally don’t consider this to be the most important test case, I just let it lie. This test case will not work correctly for values under 10, but lines shorter than ten characters should not be a struggle for any compiler.

msvc and icc had no problems compiling lines with the required length.

gcc	clang	msvc	intel
65536	16384	65536	65536
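The line in the listing has a neat property: it is 15 characters long and its sum is 15 as well. One way to obtain this for any length of at least 10 is sketched below in C++. This is my reading of the pattern in the example, not necessarily how the Go generator does it, and the function name generateLogicalLine is hypothetical.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Builds a line that is exactly `length` characters long and whose sum
// evaluates to `length`. Every "+2" adds two characters and two to the sum.
std::string generateLogicalLine(std::size_t length) {
    if (length < 10) {
        throw std::invalid_argument("length must be at least 10");
    }
    // "int a=9 ;" has 9 characters and "int a=10 ;" has 10, so the starting
    // value is chosen to match the parity of the requested length.
    std::string line = (length % 2 == 1) ? "int a=9" : "int a=10";
    const std::size_t base = line.size() + 2;  // the trailing " ;" counts too
    for (std::size_t n = base; n < length; n += 2) {
        line += "+2";
    }
    line += " ;";
    return line;
}
```

For 15 this yields exactly the line from the listing, int a=9+2+2+2 ;, and for 65536 it yields a line of 65536 characters whose value is 65536.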

Characters in a string literal after concatenation - charactersInAStringLiteral

This was again one of the easiest test cases: just generate a string that is long enough and run strlen on it. This case might be useful for tools which generate source code for embedding resources into C++ applications (such as the aforementioned Qt resource compiler). None of the compilers I have tested (except msvc, which has a well defined limit documented in [MSVCLIMITS]) had any problems with strings as long as 131072 characters, double the standard recommended value.

gcc	clang	msvc	intel
131072	131072	65535	131072

Size of an object - sizeOfAnObject

This required some tricks in order to beat the optimizer, thus the following source for 262144 is generated (which is by the way the value recommended by the standard).

#include <iostream>
#include <numeric>

class A {
public:
	A() {
	    std::iota(std::begin(c), std::end(c), 0);
	}
	void printer() {
    	for(auto i=0ULL; i<sizeof(c); i++) {
	    	if(c[i] * 256 == i && i > 0) {
		    	std::cout << i ;
		    }
	    }
		volatile auto x = sizeof(*this);
		std::cout << x << std::endl;
	}
private:
	unsigned char c[262144];

};
int main() {
	A a;
	a.printer();
}

It actually surprised me how far the optimizer can go in order to save memory, time and space for you. Unless you place some complex calculations and constraints on the values it has to manage, it will simply precalculate all the values for you without leaving a trace of their origins in the generated binary. Of course I am talking about release builds with optimization turned on.

Some compilers (icc, gcc, clang) could generate code for class sizes up to 2097152, which is 8 times the standard recommended size.

gcc	clang	msvc	intel
2097152	2097152	524288	2097152

msvc produced an executable for the required size; however, that executable silently crashed. The "Reliability Monitor" of Windows 10 just mentioned vaguely: Problem Event Name: APPCRASH, Exception Code: c00000fd, which pointed me towards a stack overflow. msvc managed to compile an executable which did not crash for 524288.
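Exception code c00000fd is the Windows status code for a stack overflow, and in the test the object lives on the stack of main. Assuming the crash comes from the default stack size of the executable (1 MB unless changed at link time) rather than from the compiler itself, placing the object on the heap should take the stack out of the equation. The C++ sketch below shows this variation; I have not verified it with msvc, and the names Big and sizeOnHeap are hypothetical.

```cpp
#include <cstddef>
#include <iterator>
#include <memory>
#include <numeric>

// Same idea as class A in the listing, with the size as a template parameter.
template <std::size_t N>
class Big {
public:
    Big() {
        // Fill the array so that the object cannot be optimized away.
        std::iota(std::begin(c), std::end(c), 0);
    }

private:
    unsigned char c[N];
};

// Creates the object on the heap instead of the stack and reports its size.
template <std::size_t N>
std::size_t sizeOnHeap() {
    auto object = std::make_unique<Big<N>>();
    volatile std::size_t size = sizeof(*object);
    return size;
}
```

A call such as sizeOnHeap<2097152>() still forces the compiler to handle a 2 MB class type, but the running program no longer depends on how much stack the operating system grants it.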

Nesting levels for #include files - nestingLevelsForIncludes

For this test I have actually created the required number of header files and placed them in the inc directory; each header includes the next one. The current iteration of the standard suggests a nesting level of 256 here, and [GCCLIMITS] mentions this situation too, however with a smaller value, specifically 200. Both gcc and clang subscribe to this 200, and we get a very specific error in the form of error: #include nested too deeply.

icc and msvc on the other hand managed up to 256, which is considered a success.

gcc	clang	msvc	intel
200	200	256	256

Case labels for a switch statement - caseLabelsForSwitch

For this test I implemented a simple random generator, which could pick values between 1 and the required value, and in a long switch printed out the square of that number. No compiler had problems compiling the code up to 16384, the value recommended by the standard.

gcc	clang	msvc	intel
16384	16384	16384	16384

Non-static data members in a single class - nonStaticDataMembersOfClass

This was also one of the more wood-cutting types of work: just generate a class with the required number of data members (for simplicity’s sake all in one class) and sum them up. msvc, gcc and clang had no problems generating code for classes which contained 65536 data members, which is more than double the recommended amount.

icc choked on that value (and on 32768, 16384, 8192 and 4096 too), but nicely compiled for 2048, which I found a bit strange:

nonStaticDataMembersOfClass-4096
": internal error: ** The compiler has encountered an unexpected problem.
** Segmentation violation signal raised. **
Access violation or stack overflow. Please contact Intel Support for assistance.

icc: error #10105: /home/fld/intel/compilers_and_libraries_2020.2.254/linux/bin/intel64/mcpcom: core dumped
icc: warning #10102: unknown signal(0)
icc: error #10106: Fatal error in /home/fld/intel/compilers_and_libraries_2020.2.254/linux/bin/intel64/mcpcom, terminated by unknown
compilation aborted for nonStaticDataMembersOfClass-4096.cpp (code 1)
Command exited with non-zero status 1

because an application of the form (generated for 4):

#include <iostream>

class TestClass {
public:
    short int m_member0 = 1;
    unsigned short int m_member1 = 1;
    unsigned int m_member2 = 1;
    int m_member3 = 1;
};

int main() {
    TestClass tc;
    int v = 0;
    v += tc.m_member0;
    v += tc.m_member1;
    v += tc.m_member2;
    v += tc.m_member3;
    std::cout << v << std::endl;
}

does not possess a huge level of complexity, so theoretically should not be a huge problem for a compiler.

gcc	clang	msvc	intel
65536	65536	65536	2048

Lambda-captures in one lambda-expression - lambdaCapturesInOneLambdaExpression

Again, one of the easiest test cases, just generate the required number of variables, and a lambda trying to capture them. msvc, gcc and clang had no issues compiling lambdas capturing 8192 values, which I considered enough for even the most evil code generated by any code generator.

icc coredumped for that value, but successfully compiled code generated for 4096.

gcc	clang	msvc	intel
8192	8192	8192	4096

Enumeration constants in a single enumeration - enumerationConstantsInEnum

This was again one of the easiest cases: just generate an enum with enough members and let the Go generator pick a random value from them. No compiler had issues compiling code with up to 8192 generated values, which is double the value indicated in the standard.

gcc	clang	msvc	intel
8192	8192	8192	8192

Levels of nested class definitions - nestingOfClasses

Nested classes are used in projects where encapsulating information should provide a better overview of what the class is about and what information to keep apart. However, nesting inner classes too deeply will after a while produce unreadable code (personal opinion) and will possibly lead to a maintenance nightmare. Possibly this is why Microsoft reduced the nesting level to a humanly manageable number (16), while the other compilers keep their value at 256, the value recommended by the standard.

gcc	clang	msvc	intel
256	256	16	256

Functions registered by atexit() - functionsRegisteredByatexit

The standard recommends 32 here, and no compiler had problems generating code (which worked as expected) for sane values; however, the actual limit depends on the OS. On systems conforming to POSIX, the correct way to find out the number of functions that can be registered with atexit is to call the sysconf function with _SC_ATEXIT_MAX as parameter.

The Windows SDK documentation remarks that the number of functions that can be registered is limited by the available heap space.

gcc	clang	msvc	intel
64	64	64	64
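On a POSIX system the query mentioned above takes only a few lines. The sketch below is for POSIX platforms only (it will not build with msvc, since unistd.h is not available there); sysconf, _SC_ATEXIT_MAX and std::atexit are the real interfaces, while the helper names atexitLimit, noopHandler and registerHandlers are hypothetical.

```cpp
#include <cstdlib>   // std::atexit
#include <unistd.h>  // sysconf, _SC_ATEXIT_MAX (POSIX only)

// Returns the number of functions atexit accepts on this system,
// or -1 when the system does not report a definite limit.
long atexitLimit() {
    return sysconf(_SC_ATEXIT_MAX);
}

// A handler that does nothing, used only to occupy a registration slot.
void noopHandler() {}

// Tries to register `wanted` handlers and returns how many were accepted.
int registerHandlers(int wanted) {
    int registered = 0;
    for (int i = 0; i < wanted; ++i) {
        if (std::atexit(noopHandler) != 0) {
            break;  // a non-zero result means the registration failed
        }
        ++registered;
    }
    return registered;
}
```

Checking the value returned by atexitLimit() before generating a test for a given count tells you whether a failure should be blamed on the compiler, the library or the operating system.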

Functions registered by at_quick_exit() - functionsRegisteredByat_quick_exit

According to the documentation, the difference between std::exit and std::quick_exit is the amount of cleanup done when the application exits, for example whether static objects’ destructors are called, and other fine nuances. The standard recommends at least 32 functions; I have found that registering 8192 is also all right with gcc, icc and clang. Sadly, unlike for atexit, POSIX offers no way to retrieve the limit for this feature, so I just concluded that 64, the same value as for atexit, should be a good value for this situation.

gcc	clang	msvc	intel
64	64	64	64

Direct and indirect base classes for a class - directAndIndirectBaseClassesOfClass

Making this test would have been much easier if I had opted just to generate a bunch of classes, as for directBaseClassesOfClass. What I did instead was to create a full binary tree with a number of nodes as close as possible to the required value, and generate a class hierarchy from this tree. A binary tree with 13 levels already contains a huge number of nodes, and this pretty much covers the classes for the main part of our test. In case the requested number is not exactly a power of two minus 1, I generate a set of additional classes that are added to the inheritance list of the test’s target Derived class, bumping the number of classes up to the required one.

icc, gcc and clang had no problems compiling code with values up to 65535.

gcc	clang	msvc	intel
65535	65535		65535
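The tree-shaped hierarchy can be illustrated with a compact C++ sketch. It numbers the classes the way a binary heap numbers its nodes, so class k derives from classes 2k+1 and 2k+2. This numbering, the class and member names, and the function name generateTreeHierarchy are my assumptions for illustration; the Go generator may name and order things differently, and the extra classes for counts that are not a power of two minus 1 are left out.

```cpp
#include <string>

// Builds a class hierarchy shaped like a full binary tree with `levels`
// levels (levels >= 1), i.e. 2^levels - 1 classes. C0 is the most derived
// class; every other class is a direct or indirect base of it.
std::string generateTreeHierarchy(int levels) {
    const long total = (1L << levels) - 1;
    std::string src;
    // Emit the highest numbers first so that every base class is already
    // complete when a class deriving from it is defined.
    for (long k = total - 1; k >= 0; --k) {
        const long left = 2 * k + 1;
        const long right = 2 * k + 2;
        src += "struct C" + std::to_string(k);
        if (right < total) {
            // Inner node: derive from both children.
            src += " : C" + std::to_string(left) + ", C" + std::to_string(right);
        }
        // A uniquely named member keeps every class non-empty.
        src += " { int v" + std::to_string(k) + " = 1; };\n";
    }
    return src;
}
```

With 16 levels this produces 65535 classes, of which 65534 are direct or indirect bases of C0.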

Direct base classes for a single class - directBaseClassesOfClass

This test was also a straightforward one: I just had to generate a long list of base classes and one class derived from them. In order for the compiler not to optimize away the classes, in each class I stored in a class member (and also printed out in the constructor) a global static value which is incremented by every constructor call, and I also summed up the values at the end.

No compiler had problems compiling code with up to 4096 generated direct base classes, which is 4 times the value recommended by the standard.

gcc	clang	msvc	intel
4096	4096		4096

Class members declared in a single member-specification - classMembersDeclaredInASingleMemberSpecification

The code generated for the value of 5 is something like:

#include <iostream>
class A {
public:
        int v1 = 1, v2 = v1 + 1, v3 = v2 + 1, v4 = v3 + 1, v5 = v4 + 1;
};

int main() {
        A a;
        std::cout << a.v5 << std::endl;
}

so for the standard recommended 4096, the same logic is used. No compiler had problems compiling code for class members up to 16384, which I have considered enough for this purpose, since I strongly advocate the principles of clean code and would recommend that everyone declare at most one member, or in the worst case a small group of members that logically belong to the same notion, in a single declaration (and of course, don’t forget to add comments to explain their purpose).

gcc	clang	msvc	intel
4096	4096		4096

Final overriding virtual functions in a class - finalOverridingVirtualFunctions

This test case also required the usage of the class hierarchy generation just like for directAndIndirectBaseClassesOfClass but also introduced virtual functions, to make the generation more fun. Since even for small numbers the code tends to be long and repetitive, I will not put here any example code, but feel free to check out the code generated for this situation by the test application. The standard recommend 16384 as the magic limit for this situation, and I think that 16384 is indeed a very good number for this situation.

gcc and clang had no problems compiling code for values up to 32768; msvc, however, did not manage to compile the code generated for this value. Both the 32-bit and the 64-bit compiler failed with the error:

finalOverridingVirtualFunctions-32768.cpp(262149): fatal error C1060: compiler is out of heap space

I found this a bit strange, since neither of those compilers consumed an unusually large amount of memory while running. The real surprise came when I tried to compile for 16384 and was greeted by an internal compiler error:

finalOverridingVirtualFunctions-16384.cpp(196618): fatal error C1001: An internal error has occurred in the compiler. (compiler file 'msc1.cpp', line 1528)

Finally, 8192 gave a result for msvc in the form of a compiled executable.

Intel’s icc could not manage even 8192; its final value is 4096.

gcc	clang	msvc	intel
32768	32768	8192	4096

Direct and indirect virtual bases of a class - directAndIndirectVirtualBaseClassesOfClass

This test case is very similar to directAndIndirectBaseClassesOfClass, except that the inheritance must be virtual. The same language mechanisms were used to generate the required number of base classes, and the result is also very similar. Compiling with the limit set to 65535 took forever, but completed successfully for both gcc and clang. I did not dare to try a larger value. Interestingly, the code generated by clang is only 62 megabytes, while the one generated by gcc is 72.

Sadly, msvc gave up after half an hour of struggle with the following error message:

directAndIndirectVirtualBaseClassesOfClass-65535.cpp(65527): fatal error C1060: compiler is out of heap space

The same result was produced for 32768, 16384 and 8192 too, so I came to the conclusion that the C++ compiler of Visual Studio 2019 can't handle these large applications, and I reduced the maximum supported number to 4096.

icc really struggled with 4096: it took more than 30 minutes to compile the source file, so again I decided that this should be enough for it. The same value applies to msvc too.

gcc	clang	msvc	intel
65535	65535	4096	4096

Static data members of a class - staticDataMemberOfClass

This test case consisted of generating a class with the specified number of public static members of various numeric types. Additional code initializes these members to 1, and the generated main function sums up all the members.
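As an illustration, a sketch of this shape for three static members might look as follows. The names S, m0 to m2 and sum_of_statics are hypothetical, and the choice of types is an assumption rather than the tool's actual output.

```cpp
// Hypothetical shape of the generated test for 3 static data members.
struct S {
    static int m0;
    static long m1;
    static short m2;
};

// Out-of-class definitions initialise every member to 1.
int S::m0 = 1;
long S::m1 = 1;
short S::m2 = 1;

// Sums all static members, as the generated main function does.
long sum_of_statics() {
    return S::m0 + S::m1 + S::m2;
}
```

Since every member is 1, the sum equals the number of static members, which gives a quick check that none of them was lost.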

None of the tested compilers had any issues with code generated to contain up to 16384 static members, except icc, which coredumped again:

staticDataMemberOfClass-16384
": internal error: ** The compiler has encountered an unexpected problem.
** Segmentation violation signal raised. **
Access violation or stack overflow. Please contact Intel Support for assistance.

icc: error #10105: /home/fld/intel/compilers_and_libraries_2020.2.254/linux/bin/intel64/mcpcom: core dumped
icc: warning #10102: unknown signal(0)
icc: error #10106: Fatal error in /home/fld/intel/compilers_and_libraries_2020.2.254/linux/bin/intel64/mcpcom, terminated by unknown
compilation aborted for staticDataMemberOfClass-16384.cpp (code 1)
Command exited with non-zero status 1

gcc	clang	msvc	intel
16384	16384	16384	2048

Friend declarations in a class - friendsOfAClass

Friends of a class provide a useful backdoor into the internals of a class, but too many backdoors are not a sign of good application design, so you should not overuse them. The standard indicates a value of 4096, and the compilers had no problems compiling with values up to 8192.

For this test I generated a class, a combination of friend classes and friend functions, and a summation of the private member of the class via these friends.
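A small sketch of that idea with two friend classes and one friend function. All names (Target, secret, F0, F1, f2, sum_via_friends) are hypothetical and only illustrate the mix of friends described above.

```cpp
// Hypothetical shape of the generated test for 3 friend declarations.
class Target {
    int secret = 1;                // private member, read only through friends
    friend class F0;               // friend classes
    friend class F1;
    friend int f2(const Target&);  // friend function
};

class F0 { public: static int get(const Target& t) { return t.secret; } };
class F1 { public: static int get(const Target& t) { return t.secret; } };
int f2(const Target& t) { return t.secret; }

// Sums the private member once per friend.
int sum_via_friends() {
    Target t;
    return F0::get(t) + F1::get(t) + f2(t);
}
```

Each friend contributes exactly 1 to the sum, so the result equals the number of friend declarations.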

gcc	clang	msvc	intel
8192	8192	8192	8192

Access control declarations in a class - accessControlDeclarationsInClass

I interpreted this test case as alternating protected, public and private sections for various data members. The generated test is therefore nothing but a long list of data members with alternating visibility, a set of public getter functions for the private and protected members (the test application will print out the required number - 1, because this last set is public), and a main function which simply sums up all the members (which were set to one).
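A sketch of this interpretation for three data members. The names A, m0 to m2, get_m0, get_m2 and sum_of_members are hypothetical, and the exact layout of the generated file is an assumption.

```cpp
// Hypothetical shape of the generated test: alternating access specifiers.
class A {
protected:
    int m0 = 1;
public:
    int m1 = 1;
private:
    int m2 = 1;
public:  // final public section holding the getters
    int get_m0() const { return m0; }
    int get_m2() const { return m2; }
};

// Sums all members, using the getters for the non-public ones.
int sum_of_members() {
    A a;
    return a.get_m0() + a.m1 + a.get_m2();
}
```

The last access specifier is needed only for the getters, which is one way to read the "required number - 1" remark above.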

Some of the compilers I tested had no problems compiling code that alternates the visibility of data members up to 16384 times. To my surprise, this was one of the test cases where clang outperformed gcc in terms of speed.

icc sadly gave up for 16384 with the following error message:

accessControlDeclarationsInClass-16384.cpp(43698): internal error: bad pointer

It then crashed for 8192:

: internal error: ** The compiler has encountered an unexpected problem.
** Segmentation violation signal raised. **
Access violation or stack overflow. Please contact Intel Support for assistance.

icc: error #10105: /home/fld/intel/compilers_and_libraries_2020.2.254/linux/bin/intel64/mcpcom: core dumped
icc: warning #10102: unknown signal(0)
icc: error #10106: Fatal error in /home/fld/intel/compilers_and_libraries_2020.2.254/linux/bin/intel64/mcpcom, terminated by unknown
compilation aborted for accessControlDeclarationsInClass-8192.cpp (code 1)

but compiled nicely for 4096.

gcc	clang	msvc	intel
16384	16384	16384	4096

Member initializers in a constructor definition - memberInitializersInAConstructorDefinition

This was not an extremely complicated test case: it just generated the required number of members in a class, a constructor with a member initializer for each of them, and code to print their sum in order to have some confirmation. No compiler except icc had problems compiling the code generated for values up to 16384, and again, this is one of the test cases where clang was faster than gcc. For this high value icc came up with the following error:

memberInitializersInAConstructorDefinition-16384.cpp(16720): internal error: bad pointer

Finally, icc settled at the value of 4096 (not being able to compile the standard-recommended 6144 either), and I had the feeling there is a connection to the previous test case.
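For reference, the member initializer test described above can be sketched for three members as follows. The names M, a, b, c and sum_of_initialized_members are hypothetical.

```cpp
// Hypothetical shape of the generated test for 3 member initializers.
struct M {
    int a, b, c;
    // Every member gets its own mem-initializer in the constructor.
    M() : a(1), b(1), c(1) {}
    int sum() const { return a + b + c; }
};

// Constructs an M and returns the sum of its members.
int sum_of_initialized_members() {
    return M().sum();
}
```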

gcc	clang	msvc	intel
16384	16384	16384	4096

Initializer-clauses in one braced-init-list - initializerClauseInBracedInitList

Another simple test case: it generates an array with the required number of elements and then iterates over it, summing up the values to arrive at the required number of the test case.
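A minimal sketch of such a test for five initializer-clauses. The names values and sum_of_values are hypothetical, and filling the array with ones is an assumption made so that the sum equals the element count.

```cpp
#include <cstddef>

// Hypothetical shape of the generated test for 5 initializer-clauses.
static const int values[] = {1, 1, 1, 1, 1};

// Iterates over the array and sums it; the result equals the element count.
int sum_of_values() {
    int sum = 0;
    for (std::size_t i = 0; i < sizeof(values) / sizeof(values[0]); ++i) {
        sum += values[i];
    }
    return sum;
}
```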

Although the standard recommends 16384, I found that no compiler had problems compiling code with initializer lists of length up to 262144, which is 16 times the standard-recommended value.

gcc	clang	msvc	intel
262144	262144	262144	262144

Scope qualifications of one identifier - scopeQualificationOfOneIdentifier

Although this seemed to be one of the more banal test cases, the higher values turned out in the end to be fatal to clang and msvc when I increased the bracket depth (via -fbracket-depth=4096 for clang), but gcc was happy even with 4096 (this being 16 times the standard-recommended value).

clang gives up somewhere at a value between 1024 and 2048 in a seemingly infinite recursive call between EmitTopLevelDecl(clang::Decl*) and EmitDeclContext(clang::DeclContext const*), but I’d say that 1024 scopes for a variable is more than enough.

msvc does not even support the standard recommended 256, it gives up at 128 with the error message:

scopeQualificationOfOneIdentifier-128.cpp(130): fatal error C1061: compiler limit: blocks nested too deeply

With the depth set to 127, however, there were no problems.

icc had no problems with compiling to depths of 2048, but failed with 4096.
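To make the notion concrete, an identifier with three scope qualifications can be sketched as below. This assumes the scopes are nested namespaces; the generated test may nest other kinds of scopes instead, and the names n0, n1, n2, v and qualified_value are hypothetical.

```cpp
// Hypothetical example: one identifier reached through 3 scope qualifications.
namespace n0 { namespace n1 { namespace n2 {
    int v = 3;  // set to the nesting depth, purely for verification
} } }

// Accesses the variable through its fully qualified name.
int qualified_value() {
    return n0::n1::n2::v;
}
```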

gcc	clang	msvc	intel
4096	1024	127	2048

Nested linkage-specifications - nestedLinkageSpecifiers

Personally I think that nesting linkage specifications can, in the long term, lead to highly unmaintainable code. But if the standard allows it, and there is even a recommended depth, who am I to protest? So code of the form shown below was generated (the example is for a depth of 4), and I have just acknowledged that 1024 sounds like a good number for this purpose, unless you really want to be the source of future headaches.

#include <iostream>

extern "C" { int fC() { return 1; }
 extern "C++" { int fCx() { return 1; }
  extern "C" { int fCxC() { return 1; }
   extern "C++" { int fCxCx() { return 1; }
    int fun() { return 0+fC()+fCx()+fCxC()+fCxCx();}
   }
  }
 }
}

int main() {
	std::cout << fun() << std::endl;
}

gcc and clang had no issues compiling code with the aforementioned depth, however the msvc compiler gave up at a value somewhere between 736 and 752 with the following error:

nestedLinkageSpecifiers-752.cpp(748): fatal error C1026: parser stack overflow, program too complex

It works for 736, though.

icc accepted for this test case the standard recommended 1024.

gcc	clang	msvc	intel
1024	1024	736	1024

Recursive `constexpr` function invocations - recursiveConstexpr

Recursive constexpr functions are not the most frequent ones, however they can come in very handy from time to time. This test case required the following application:

#include <iostream>
constexpr unsigned long long sum(unsigned long long n, unsigned long long s=0) {
	return n ? sum(n-1,s+n) : s;
}
constexpr unsigned long long k = sum(512);

int main() {
	std::cout << k<<std::endl;
}

where the 512 is the actual required depth of the recursion. Running this test case gave the following results:

clang gave a very correct assessment of the situation:

recursiveConstexpr-512.cpp:5:30: error: constexpr variable 'k' must be initialized by a constant expression
constexpr unsigned long long k = sum(512);
                             ^   ~~~~~~~~
recursiveConstexpr-512.cpp:3:13: note: constexpr evaluation exceeded maximum depth of 512 calls

gcc also recognized the situation, in the form of an error message like:

recursiveConstexpr-512.cpp:5:41: error: ‘constexpr’ evaluation depth exceeds maximum of 512 (use ‘-fconstexpr-depth=’ to increase the maximum)
    5 | constexpr unsigned long long k = sum(512);

However, after applying the suggested -fconstexpr-depth=513 it actually managed to compile the code. Using that switch we can bring the recursion depth up to 16384, but with 32768 gcc also decided it’s time to give up:

g++: internal compiler error: Segmentation fault signal terminated program cc1plus

icc did not like 512 but worked nicely with 256.

msvc correctly recognized the scenario:

recursiveConstexpr-512.cpp(5): error C2131: expression did not evaluate to a constant
recursiveConstexpr-512.cpp(3): note: failure was caused by evaluation exceeding call depth limit of 512 (/constexpr:depth<NUMBER>)

After specifying the required depth, the msvc compiler choked at 16384, 8192 and 4096 with an Internal Compiler Error, but successfully compiled for 2048.

gcc	clang	msvc	intel
16384	16384	2048	256

Full-expressions evaluated within a core constant expression - fullExpressionInAConst

The value recommended for this situation is simply so huge (1048576) that I did not consider increasing it. The application generated for this case is just a simple addition of ones, assigned to a constant value. Compiling a test case takes a long time, but no compiler tested had any problems with it, except msvc, which gave up at a value somewhere between 65536 and 131072.

gcc	clang	msvc	intel
1048576	1048576	65536	1048576

Template parameters in a template declaration - templateParametersInTemplateDeclaration

This test case consisted of creating a source file along the lines of:

#include <iostream>

template<int N0,int N1,int N2,int N3>
struct C {
	static const int v = N0 + N1 + N2 + N3;
};
int main() {
	C<1,1,1,1> c;
	std::cout << c.v << std::endl;
}

No compiler except msvc had problems compiling code with values up to 16384, which is 16 times the value recommended by the standard. A small interesting observation: while gcc generally outperformed all the other compilers run on the same platform in terms of speed, this test case was aced by clang, which delivered blazing fast speed and easily outperformed all the other compilers.

msvc failed 16384 with fatal error C1111: too many template parameters, but in the end it managed to compile the standard-recommended 1024.

gcc	clang	msvc	intel
16384	16384	1024	16384

Recursively nested template instantiations - recursivelyNestedTemplateInstantiations

The following application

#include <iostream>

template<typename T>
struct B {
        typedef T BT;
};
template<int N>
struct C {
        typedef typename B<typename C<N-1>::T>::BT T;
};
template<>
struct C<0> {
        typedef int T;
};

int main()
{
        C<1024>::T c = 1024;
        std::cout << c << std::endl;
}

actually gave a headache to a few compilers. It seems that icc handles recursively nested templates in a very predictable way:

recursivelyNestedTemplateInstantiations-1024.cpp(9): error: excessive recursion at instantiation of class "C<524>".

The troubles for icc were not over: after a while of experimentation I discovered that the maximum value it supports is 500. For values above 500 I get the previous strange error, with the reported value always being the test value - 500. So for 501 the error is:

error: excessive recursion at instantiation of class "C<1>".

Strange, but interesting.

gcc also had its troubles:

recursivelyNestedTemplateInstantiations-1024.cpp:9:45: fatal error: template instantiation depth exceeds maximum of 900 (use ‘-ftemplate-depth=’ to increase the maximum)

but after specifying -ftemplate-depth=1025 as an extra parameter gcc succeeded. Interestingly, gcc expects the actual number + 1.

clang aced this test: it compiled the entire 1024 iterations of template madness without complaining. An interesting side note for clang: for 16384 it gave me the hint to use -ftemplate-depth=16384 and then it gave me the following warning:

warning: stack nearly exhausted; compilation time may suffer, and crashes due to stack overflow are likely [-Wstack-exhausted]
        typedef typename B<typename C<N-1>::T>::BT T;
                                    ^
recursivelyNestedTemplateInstantiations-1024.cpp:9:30: note: in instantiation of template class 'C<15286>'

I had never seen this before, and my admiration towards compiler writers just increased by 1. But gcc compiled 16384 too, without this warning (I just had to specify -ftemplate-depth=16385 as an extra parameter).

1024 proved to be fatal for msvc:

recursivelyNestedTemplateInstantiations-1024.cpp(9): fatal error C1202: recursive type or function dependency context too complex

It finally managed to compile 128.

gcc	clang	msvc	intel
16384	16384	128	500

Handlers per try block - handlersPerTryBlock

This test case consisted of generating a number of classes derived from std::exception that act as objects to be thrown, throwing an object of one of those classes in a try block, and writing a long list of catch handlers, one for each class. No compiler had problems with code that contained 256 different handlers for a try block, as per the standard-recommended value.
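A sketch of this structure with three handlers. The class names E0 to E2 and the function caught_handler_index are hypothetical; which exception the generated code actually throws is an assumption.

```cpp
#include <exception>

// Hypothetical exception classes, one per generated handler.
struct E0 : std::exception {};
struct E1 : std::exception {};
struct E2 : std::exception {};

// Throws E1 and returns the index of the handler that caught it.
int caught_handler_index() {
    try {
        throw E1();
    } catch (const E0&) {
        return 0;
    } catch (const E1&) {
        return 1;
    } catch (const E2&) {
        return 2;
    }
    return -1;  // not reached: the E1 handler always matches
}
```

The classes are siblings rather than a chain of derived types, so the order of the handlers does not change which one is selected.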

gcc	clang	msvc	intel
256	256	256	256

Number of placeholders - numberOfPlaceholders

This is not specifically a compiler limit but rather a library feature. All the compilers tested had an upper limit of 29, except msvc, which draws the upper limit at 20.
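The placeholders in question are the std::placeholders::_1, _2, ... objects used with std::bind. The sketch below only illustrates what a placeholder is and how its index can be queried; the helper names add3 and call_with_placeholders are hypothetical and this is not the code the test application generates.

```cpp
#include <functional>

// Adds three integers; used as the bound callable.
int add3(int a, int b, int c) { return a + b + c; }

// Binds add3 with the first three placeholders (in reverse order)
// and calls the result with the arguments 1, 2, 3.
int call_with_placeholders() {
    using namespace std::placeholders;
    auto bound = std::bind(add3, _3, _2, _1);
    return bound(1, 2, 3);
}

// std::is_placeholder reports the 1-based index of a placeholder object.
static_assert(std::is_placeholder<decltype(std::placeholders::_3)>::value == 3,
              "_3 is the third placeholder");
```

A test for the limit then only needs to refer to the placeholder with the highest index the library is expected to provide and see whether the name exists.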

gcc	clang	msvc	intel
29	29	20	29

Conclusion

Before you jump ship and decide, based on these results, that it's time to ditch your current compiler and switch to a different one, a big warning for you: don't. These test cases were specifically engineered for a unique purpose, and they are not real-life situations; if they are for you, then maybe it's time to rethink your source strategy.

Each of these compilers is highly capable of performing adequately for any project you can find on the market today, and the purpose of this test was not to declare a winner, but to see which compiler does what well and what improvements could be made in future releases.

Each of the tested compilers shines in some areas and performs poorly in others, and what follows are just a few (personal) observations from my side. If you run the test cases yourself, you will possibly reach a different conclusion.

gcc and msvc are the oldest ones of the tested bunch. Their age has positively affected their performance: both of them are blazingly fast in all areas of compilation. msvc has a set of limitations that you will not observe in your average daily programming routine unless you specifically look for them, while gcc can compile basically everything that you throw at it, provided you have the patience to wait out the compilation time of very large sources and your computer can cope with the expectations of the compiler.

icc, which came more than 20 years after msvc, promises faster than average code targeting its own processors, good C++17 support and also a decent speed. Sadly, it is packaged into a suite downloadable on a trial basis from Intel's homepage, and this possibly makes hobbyist programmers or advocates of open source stay away unless forced by some specific requirements.

clang is the newcomer and the youngest of the tested compilers. It outperforms all the other compilers when it comes to more recent C++ features, but noticeably struggles with notions and constructs that other compilers had a few extra decades to polish to perfection. But the speed at which the community picked it up and made it into one of the most used compilers today hints at a bright future for this product.

References

[GCCLIMITS] - https://gcc.gnu.org/onlinedocs/gcc-9.2.0/cpp/Implementation-limits.html

[CLANGLIMITS] - https://clang.llvm.org/docs/UsersManual.html#controlling-implementation-limits

[MSVCLIMITS] - https://docs.microsoft.com/en-us/cpp/cpp/compiler-limits?view=vs-2019

[ANNEXB] - https://eel.is/c++draft/implimits

[GITHUBCST] - https://github.com/fritzone/cpp-stresstest
