modocache / clang

This project forked from llvm-mirror/clang

Mirror of official clang git repository located at http://llvm.org/git/clang. Updated hourly.

License: Other

CMake 0.10% Objective-C 8.30% C++ 70.82% C 15.71% Makefile 0.11% Python 0.65% Objective-C++ 2.46% MATLAB 0.08% Mercury 0.01% LLVM 0.01% Cuda 0.14% Mathematica 0.01% Shell 0.01% Assembly 0.04% M 0.01% Fortran 0.01% Limbo 0.01% Perl 0.03% Emacs Lisp 0.02% HTML 1.50%

clang's Issues

Rename await_transform

Rename await_transform to await_value, and make it required for any coroutine that has a co_await in its body. This creates a nice symmetry: three co_ keywords (co_await, co_yield, and co_return), and three customization points (await_value, yield_value, and return_value); and each customization point is needed if the keyword is used.

Optimizing virtual coroutines

https://godbolt.org/z/_KJI4q demonstrates a coroutine to which heap allocation elision (HALO) should be applied, but isn't. Specifically:

struct Base {
    virtual generator<int> vg() = 0;
};

struct A: Base {
    generator<int> vg() override { co_yield 1; }
};

struct B : Base {
    generator<int> vg() override { co_yield 2; }
};

int main() {
    Base *x = new A;
    auto g = x->vg();
    return std::accumulate(g.begin(), g.end(), 0);
}

If the compiler is able to devirtualize the call to x->vg(), then it should also be able to apply heap allocation elision. A bit of pass ordering adjustment should make it work.

Make unhandled_exception optional

The unhandled_exception customization point is optional. If a promise type does not define unhandled_exception, then the body of the coroutine is not wrapped in a try/catch, and the exception propagates out of coroutine_handle.resume(). This is now defined behavior with the adoption of the resolution of Coroutine TS issue 25: Allow unhandled exception escape the user-defined body of the coroutine and give it well defined semantics. In addition to making it simpler to define a synchronous coroutine type, it also greatly helps code generation in that case.

Simplify final_suspend

A coroutine now always suspends at the final suspend point. The final_suspend customization is retained, but now accepts the coroutine_handle for the current coroutine, and returns a coroutine_handle to symmetrically resume, or a noop coroutine handle if execution should be transferred to the caller of resume(). To implement fire-and-forget coroutines, the library writer can explicitly destroy the coroutine from final_suspend.

Remove initial_suspend

Remove the initial_suspend point and always create the coroutine suspended, passing a handle to the suspended coroutine as an argument to get_return_object. For use cases where the coroutine needs to start executing immediately, get_return_object can call resume() on the passed-in coroutine handle.

Add std::coroutine_max_size_v<Callable, T1, T2, ...> magic template variable

One piece of strong feedback on the Coroutines TS was the desire to be able to synthesize coroutines that are guaranteed not to allocate any memory, even in debug mode. This template variable, implemented by the compiler, extends the existing Coroutines TS by making it possible to create on-stack coroutines.

[Related issue to be handled by Gor: a middle-end assert guaranteeing a compilation error if the actual coroutine size ends up bigger than the front-end estimate]

Investigate implications of making coroutine frame a type with deferred layout

Some have expressed a desire to expose the coroutine frame object as a type in the type system, rather than have the coroutine frame be type-erased and completely hidden from the programmer by the compiler.

However, the compiler needs to be able to adjust the layout of the coroutine frame as it applies optimisations during compilation. A compiler may be able to shrink the size of the coroutine frame after inlining by eliminating suspend-points and thus avoiding writing certain local variables to the coroutine frame which no longer span a suspend-point. Also, a compiler may want to be able to increase the size of the coroutine frame in order to be able to inline allocations of nested coroutine frames into the frame of the caller and elide a call to allocate memory on the heap.

In essence, we want to defer calculation of the layout of the coroutine frame type/object until later in the compilation phase.

To give the compiler the flexibility it needs to adjust the size of the coroutine frame during optimisation, we need to prevent the program from asking, either directly or indirectly, for the size of the coroutine frame object.

This means that taking sizeof(CoroutineFrameObject) would make the program ill-formed.

Let's call any such type for which it is ill-formed to ask for the size at compile-time a 'deferred layout type'.

A 'deferred layout type' is any type that:

Is a coroutine frame type generated by the compiler
Contains as a member a 'deferred layout type'
Has as a base-class a 'deferred layout type'
Is an array of 'deferred layout type'
etc... (anything else??)

Then sizeof(T) is considered ill-formed if T identifies a deferred layout type.
Similarly, sizeof(obj) is considered ill-formed if obj is an instance of an object that has deferred layout type.
Similarly for alignof(T).

Note that it is still important to be able to get hold of the size, for example for using in a memory-allocator that needs to be able to allocate, but we want to provide the size as a runtime value rather than a compile-time value.

One suggested solution for this is to add a std::get_sizeof<T>() function to the standard library that allows the size of the type to be queried at runtime. In the case that T is a normal type then this is a constexpr function that returns sizeof(T). However, if T is a deferred layout type then this function is not constexpr, but still returns a constant value equal to the final size of the type.

A similar function should be added to allow querying alignof(T) for deferred layout types.
eg. std::get_alignof<T>().

Bikeshedding: alternative spellings of these could be std::sizeof_v<T> and std::alignof_v<T>, where the variable is constexpr if T is a normal type and non-constexpr if T is a deferred layout type.

eg. something like this:

namespace std
{
  template<typename T>
  constexpr bool is_deferred_layout_type_v = __is_deferred_layout_compiler_intrinsic(T);

  template<typename T>
    requires (!is_deferred_layout_type_v<T>)
  constexpr size_t sizeof_v = sizeof(T);

  template<typename T>
    requires is_deferred_layout_type_v<T>
  size_t sizeof_v = __deferred_layout_size_compiler_intrinsic(T);
}

Compilers should leave this constant as undefined in the early phases until the size of the type is known. Once the size of the coroutine frames is known then the compiler can substitute the constant in here and run additional constant folding optimisation passes.
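The dispatch between the compile-time and runtime size queries can be approximated in today's C++ without any compiler support. The following is only a sketch under stated assumptions: is_deferred_layout and deferred_layout_size are hypothetical stand-ins for the two compiler intrinsics, FrameStandIn is a hypothetical type left incomplete to mimic a type whose layout is not yet known, and get_sizeof lives in the global namespace rather than std. It needs C++17 for if constexpr.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical trait standing in for the "is deferred layout" intrinsic.
template <typename T>
struct is_deferred_layout : std::false_type {};

// Hypothetical hook standing in for the "deferred size" intrinsic.
// There is deliberately no primary definition.
template <typename T>
struct deferred_layout_size;

// Usable in constant expressions for normal types, runtime-only otherwise.
template <typename T>
constexpr std::size_t get_sizeof() {
  if constexpr (is_deferred_layout<T>::value) {
    return deferred_layout_size<T>::get();  // not a constant expression
  } else {
    return sizeof(T);
  }
}

// Incomplete at this point, so sizeof(FrameStandIn) is ill-formed here,
// much like sizeof on a deferred layout type would be.
struct FrameStandIn;

template <>
struct is_deferred_layout<FrameStandIn> : std::true_type {};

template <>
struct deferred_layout_size<FrameStandIn> {
  static std::size_t get();  // defined once the layout is known
};

// Normal types can still be queried at compile time.
static_assert(get_sizeof<int>() == sizeof(int), "normal types stay constexpr");

// "Late" in compilation the layout becomes known.
struct FrameStandIn {
  int resume_index;
  double spilled_local;
};

std::size_t deferred_layout_size<FrameStandIn>::get() {
  return sizeof(FrameStandIn);
}
```

Calling get_sizeof<FrameStandIn>() at runtime returns the final size, while using it in a static_assert would be rejected, which mirrors the constexpr versus non-constexpr split described above.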

Outstanding questions:

What sort of impact would this design have on the type-system of C++?
Are there any ways of observing layout of a type other than by sizeof(T) and alignof(T)?
eg. use of offsetof(T, member) and similar.

Remove support for co_await in range-based for

N4775's [stmt.ranged] describes a for co_await statement. Support for the use of co_await within a range-based for should be removed in order to simplify the Coroutines TS.

Optimizing exception propagation

This is also tracked internally at Facebook by task T35565697.

Clang should optimize out the rethrow/catch at each coroutine level to avoid the overhead of the exception throw/unwind/catch/type-erase, since with coroutines we have the rethrow-point in the same function body as the catch, and so the compiler can see the coroutine handle. If these optimisations are implementable then this would mean we don't need any changes to the Coroutines TS design to make exception handling efficient.

The following use cases ought to be optimized:

Case 1A: For `task<T>` types

The await_resume() method typically looks something like:

T await_resume() {
  if (coro.promise().exception) {
    std::rethrow_exception(coro.promise().exception);
  }
  return coro.promise().value;
}

And the awaiting coroutine's unhandled_exception() method typically looks something like:

void unhandled_exception()
{
  this->exception = std::current_exception();
}

And this occurs within the coroutine body that looks something like this:

try { ... }
catch (...) { promise.unhandled_exception(); }
final_suspend:
co_await promise.final_suspend();

So after inlining we would end up with the body of the coroutine looking something like this:

try { ... std::rethrow_exception(e); ... }
catch (...) { this->exception = std::current_exception(); }
final_suspend:
co_await promise.final_suspend();

Is it possible to have the compiler optimise this code to something like the following, so that we avoid the overhead of throwing and catching an exception?

The compiler can see lexically the nearest enclosing try/catch so it seems like something that should be possible, but it may need some integration with the exception-handling runtime.

std::exception_ptr __e;
try {
  ...
  __e = e;
  ++std::__unhandled_exceptions_counter;
  goto handle_inline_exception;
  ...
} catch (...) {
  this->exception = std::current_exception();
}
goto final_suspend;

handle_inline_exception:
  --std::__unhandled_exceptions_counter;
  this->exception = __e;

final_suspend:
co_await promise.final_suspend();
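The intended equivalence can be checked outside of a coroutine. This is a minimal sketch, assuming a hypothetical promise_stub type in place of a real promise and ignoring the uncaught-exception counter bookkeeping shown above: both functions leave an exception_ptr in the promise that refers to the same kind of exception, but only the first pays for a throw and a catch.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the promise's exception slot.
struct promise_stub {
  std::exception_ptr exception;
};

// Unoptimised shape: rethrow inside the try, capture again in the catch.
void store_via_rethrow(promise_stub& p, std::exception_ptr e) {
  try {
    std::rethrow_exception(e);
  } catch (...) {
    p.exception = std::current_exception();
  }
}

// Shape the optimiser would ideally produce: hand the pointer over directly.
void store_directly(promise_stub& p, std::exception_ptr e) {
  p.exception = e;
}

// Helper: extract the message so both results can be compared.
std::string message_of(const std::exception_ptr& e) {
  try {
    std::rethrow_exception(e);
  } catch (const std::exception& ex) {
    return ex.what();
  } catch (...) {
    return "unknown";
  }
}
```

The comparison goes through the message rather than exception_ptr equality, because an implementation is allowed to copy the exception object when capturing it.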

Case 1B: The `task<T>` case where an awaitable directly throws an exception from `await_resume()` rather than rethrowing an exception_ptr

eg. An await_resume() method that looks like this:

void awaitable::await_resume() {
  if (someCondition) throw some_exception{};
}

So that after inlining the coroutine body looks like this:

try {
  ...
  throw some_exception{};
  ...
} catch (...) {
  promise.exception = std::current_exception();
}
final_suspend:
co_await promise.final_suspend();

Ideally we would be able to short-circuit the exception handling mechanisms in these cases so that it optimised to something like this:

std::exception_ptr __e;
try {
  ...
  __e = __runtime_make_exception_ptr(some_exception{});
  ++std::__unhandled_exceptions_counter;
  goto handle_inline_exception;
  ...
} catch (...) {
  promise.exception = std::current_exception();
}
goto final_suspend;

handle_inline_exception:
--std::__unhandled_exceptions_counter;
promise.exception = __e;

final_suspend:
co_await promise.final_suspend();
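The closest standard-library analogue of the hypothetical __runtime_make_exception_ptr is std::make_exception_ptr. The sketch below only shows that the two capture paths yield an equivalent exception_ptr. Whether std::make_exception_ptr avoids an actual throw internally depends on the standard library implementation, so this does not demonstrate the optimisation itself. some_exception is given an int member here purely for illustration.

```cpp
#include <cassert>
#include <exception>

// Illustrative exception type; the `code` member is an assumption.
struct some_exception {
  int code;
};

// Unoptimised shape: throw, then capture in the enclosing catch.
std::exception_ptr capture_via_throw(int code) {
  try {
    throw some_exception{code};
  } catch (...) {
    return std::current_exception();
  }
}

// Desired shape: build the exception_ptr without unwinding through a handler.
std::exception_ptr capture_without_throw(int code) {
  return std::make_exception_ptr(some_exception{code});
}

// Helper: recover the code so both results can be compared.
int code_of(const std::exception_ptr& e) {
  try {
    std::rethrow_exception(e);
  } catch (const some_exception& ex) {
    return ex.code;
  } catch (...) {
    return -1;
  }
}
```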

Case 2: For `generator<T>`

With a generator class we typically want the exception to propagate out of the coroutine to the caller of coroutine_handle<>::resume(), ie. the unhandled_exception() method of the promise should look like this:

void unhandled_exception() { throw; }

So that the body of the generator coroutine would look like this (after some inlining):

try { ... }
catch (...) { throw; }

I'm assuming that the compiler should be able to optimize this to remove the try/catch altogether.
Does it already do this?

Simplify await_suspend

Have only coroutine_handle-returning await_suspend. bool- and void-returning variants of await_suspend can be expressed in terms of the former with use of noop_coroutine().

In addition to simply making the change, this also requires a performance investigation, both in terms of the time it takes Clang to compile this new await_suspend, and in terms of the memory footprint of programs that use many awaits.

Add contract check in clang and emit llvm::coro_size_chk in CGCoroutine.cpp

To provide a compile-time guarantee that a coroutine fits in a fixed-size buffer, we explore the following approach.

   template <size_t MaxSize>
   void *operator new(size_t sz, Storage<MaxSize>& put_it_here)[[expects:sz <= MaxSize]] {
       return &put_it_here;
   }

To guarantee that a violation is a compile-time error, wording along the following lines can be added to [dcl.fct.def.coroutine], in the paragraph describing how storage for the coroutine state is obtained:

“If operator new has a precondition contract AND that contract condition would be a constant expression if the “sz” argument were a constant expression, then the contract will be checked even in the Contract checking “Off” mode, and a violation will make the program ill-formed.”
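The placement form of operator new can be tried with ordinary objects today. This is a rough sketch only: the contract is replaced by a runtime assert, the layout of Storage (an aligned byte array) is an assumption, and Widget is a hypothetical type standing in for the coroutine frame. It shows the allocation mechanics, not the compile-time guarantee that the proposed wording is after.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Assumed layout: a suitably aligned buffer of MaxSize bytes.
template <std::size_t MaxSize>
struct Storage {
  alignas(std::max_align_t) unsigned char bytes[MaxSize];
};

// Placement form from the text, with the contract replaced by an assert.
template <std::size_t MaxSize>
void* operator new(std::size_t sz, Storage<MaxSize>& put_it_here) {
  assert(sz <= MaxSize && "object does not fit in the provided storage");
  return &put_it_here;
}

// Matching placement delete, used only if a constructor throws.
template <std::size_t MaxSize>
void operator delete(void*, Storage<MaxSize>&) noexcept {}

// Hypothetical payload standing in for a coroutine frame.
struct Widget {
  int value;
};
```

An object is then created with new (buffer) Widget{7} and torn down with an explicit destructor call, since the storage is not owned by the heap.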

Currently there is no contract support in clang trunk. I think for now we can hook up __attribute__((diagnose_if(sz > MaxSize, "", "error"))), which emits the diagnostic when its condition is true.
If this attribute is present, CGCoroutine.cpp will lower it to a call to llvm::coro_size_chk(MaxSize).

The following patches provide a rough implementation of the coro_size_chk intrinsic.

https://reviews.llvm.org/D54500

https://reviews.llvm.org/D54501
