Comments (16)

aswaterman commented on July 20, 2024

I agree that _Complex foo should be passed the same way as struct { foo re; foo im; }. The current behavior is a holdover from the SGI MIPS64 ABI, on which the RISC-V calling convention is based.

To avoid reducing performance, we should at the same time allow structs with 2 float32s to be passed in F registers for RV64G. At the moment, only structs with 2 float64s will get passed in F registers; the former case is passed by reference. I believe that is consistent with rule 7.

from riscv-elf-psabi-doc.

sorear commented on July 20, 2024

@aswaterman Do you have an example handy of a psABI that does that or is this something for me to research?

aswaterman commented on July 20, 2024

http://math-atlas.sourceforge.net/devel/assembly/mipsabi64.pdf is the best I can find.

Confusingly, it claims "A struct with only one or two floating point fields is returned in $f0 (and $f2 if necessary). This is a generalization of the Fortran COMPLEX case." That is not what MIPS GCC does. _Complex float and _Complex double are both passed in f-registers. A struct with 2 floats is passed by reference, and a struct with 2 doubles is passed like _Complex.

Not sure if that is just a bug in MIPS GCC that we inherited.

aswaterman commented on July 20, 2024

FWIW, AArch64 does pass struct{float;float} the same way it passes float _Complex.

aswaterman commented on July 20, 2024

I've been thinking about how to handle floating-point aggregates when FLEN != XLEN, and I'm starting to think we should just decouple integer and floating-point argument passing altogether.

Basic idea is that if an argument is a candidate to be passed in FPRs, it goes in the first available FPRs, regardless of how many integer registers have been allocated, and vice versa. (This increases the max number of arguments in registers to 16, which is only marginally useful.)

Structs are either passed entirely in registers or entirely in memory.

Structs are passed in floating-point registers only if they consist of one or two fields that are both floating-point numbers that are both <= the ABI's FLEN. (Things passed in FP registers are not subject to the usual 2*XLEN constraint.)

Thoughts?
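A minimal sketch of the decoupling idea above, written in Python purely for illustration. It assumes eight argument registers in each file, named a0-a7 and fa0-fa7, and models each argument as a hypothetical `(kind, nregs)` pair. What happens to an FP candidate once the FPRs run out is not spelled out in the proposal, so the sketch simply sends it to memory.

```python
def assign_registers(args, num_gprs=8, num_fprs=8):
    """Assign each argument to FPRs, GPRs, or memory.

    args is a list of (kind, nregs) pairs, where kind is "fp" or "int"
    and nregs is the number of registers the argument needs (1 or 2).
    The two register files are allocated independently of each other.
    """
    next_gpr = 0
    next_fpr = 0
    locations = []
    for kind, nregs in args:
        if kind == "fp" and next_fpr + nregs <= num_fprs:
            # First available FPRs, regardless of how many GPRs are in use.
            locations.append([f"fa{next_fpr + i}" for i in range(nregs)])
            next_fpr += nregs
        elif kind == "int" and next_gpr + nregs <= num_gprs:
            locations.append([f"a{next_gpr + i}" for i in range(nregs)])
            next_gpr += nregs
        else:
            # All-or-nothing: never split an argument between registers
            # and memory. (Assumption: no fallback to the other file.)
            locations.append(["memory"])
    return locations


example = assign_registers([("int", 1), ("fp", 1), ("fp", 2), ("int", 2)])
# example == [["a0"], ["fa0"], ["fa1", "fa2"], ["a1", "a2"]]
```

With eight one-register integer arguments and eight one-register floating-point arguments, all sixteen land in registers, which is the 16-argument maximum mentioned above.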

sorear commented on July 20, 2024

What are the advantages and tradeoffs of this scheme?

Would a struct with 4 floats be passed in 2 GPRs on RV64G? Is that desired?

This seems to lose a lot of the simplicity of the "conceptual struct".

I'd like to have a rule that "structs and arrays are conceptually flattened", so that _Complex double, struct { double; double }, struct { struct { double }; double }, struct { double[2] }, struct { double; char[0]; double } are all handled the same, as are struct { double } and double.
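One way to picture the flattening rule is as a recursive walk over the type that keeps only the leaf scalars. The sketch below is illustrative Python, not ABI text: `Scalar`, `Array`, `Struct`, and `flatten` are hypothetical names, and `_Complex double` is modeled as a struct of two doubles, which is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scalar:
    name: str  # e.g. "double" or "char"


@dataclass(frozen=True)
class Array:
    elem: object  # element type
    count: int    # number of elements (may be 0)


@dataclass(frozen=True)
class Struct:
    fields: tuple  # member types, in declaration order


def flatten(ctype):
    """Return the leaf scalar names of ctype, in declaration order."""
    if isinstance(ctype, Scalar):
        return [ctype.name]
    if isinstance(ctype, Array):
        return flatten(ctype.elem) * ctype.count
    if isinstance(ctype, Struct):
        leaves = []
        for field in ctype.fields:
            leaves.extend(flatten(field))
        return leaves
    raise TypeError(f"unsupported type: {ctype!r}")


double = Scalar("double")
equivalent = [
    Struct((double, double)),                             # _Complex double (modeled as re/im)
    Struct((double, double)),                             # struct { double; double }
    Struct((Struct((double,)), double)),                  # struct { struct { double }; double }
    Struct((Array(double, 2),)),                          # struct { double[2] }
    Struct((double, Array(Scalar("char"), 0), double)),   # struct { double; char[0]; double }
]
# Every entry flattens to ["double", "double"], and
# flatten(Struct((double,))) equals flatten(double).
```

Under this model the zero-length array contributes no leaves, so it does not change how the surrounding struct is classified.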

aswaterman commented on July 20, 2024

Main advantage is that it's simpler to describe and implement the passing of floating-point values/aggregates in floating-point registers when XLEN and FLEN differ. In the current RV64G scheme, you can't pass struct{float;float} or float _Complex in registers without adding a special case.

A struct with 4 floats would indeed be passed in 2 GPRs on RV64G in this proposal. That's already the case, though, and is also the case on x86-64 and AArch64.
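To make the "4 floats in 2 GPRs" case concrete, here is a small Python sketch that packs the struct's in-memory bytes into XLEN-sized integers. It assumes a little-endian layout with no padding between the floats; `pack_into_gprs` is a hypothetical helper, not part of any toolchain.

```python
import struct


def pack_into_gprs(floats, xlen_bits=64):
    """Pack 32-bit floats into XLEN-wide integer register images."""
    raw = struct.pack("<%df" % len(floats), *floats)  # little-endian float32s
    xlen_bytes = xlen_bits // 8
    raw += b"\x00" * (-len(raw) % xlen_bytes)         # zero-pad the last register
    return [
        int.from_bytes(raw[i:i + xlen_bytes], "little")
        for i in range(0, len(raw), xlen_bytes)
    ]


regs = pack_into_gprs([1.0, 2.0, 3.0, 4.0])
# Two 64-bit values: [0x400000003F800000, 0x4080000040400000]
```

Each register image holds two floats, with the lower-addressed field in the low 32 bits.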

Ancillary advantage I just discovered: my proposal appears to match AArch64 exactly.

aswaterman commented on July 20, 2024

I'm struggling to identify anything about it that's actually worse. It does do away with the conceptual struct, but that's fine, IMO, as long as the algorithm can be expressed clearly & concisely.

sorear commented on July 20, 2024

So:

  1. If an argument has 2 or fewer float fields of ≤FLEN, and there are enough free FPRs, pass in an FPR.
  2. If an argument is > 2 * XLEN, pass it in read-only caller allocated memory, replace the argument with a pointer, and continue
  3. If an argument has exactly 1 integer field and there is a GPR free, pass in that GPR after type-extending to 32 bits and sign-extending from 32 bits to XLEN
  4. Otherwise pack the argument into 0–2 GPRs and pass in GPRs if there are enough free
  5. Otherwise pass on the stack

?
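The five steps above can be sketched as a single classification function. This is illustrative Python under stated assumptions: an argument is described by its flattened `(kind, bits)` fields plus its total size, step 1 is read as "all fields are floats", and the integer extension in step 3 is not modeled. `classify` and its return shape are hypothetical.

```python
def classify(fields, size_bits, free_fprs, free_gprs, xlen=64, flen=64):
    """Return (where, nregs, by_reference) for one argument.

    fields: list of (kind, bits) with kind "float" or "int".
    size_bits: total size of the argument, including padding.
    """
    by_reference = False

    # Step 1: one or two float fields, each <= FLEN, and enough free FPRs.
    if (1 <= len(fields) <= 2
            and all(kind == "float" and bits <= flen for kind, bits in fields)
            and free_fprs >= len(fields)):
        return ("FPR", len(fields), by_reference)

    # Step 2: larger than 2 * XLEN: pass a pointer to a caller-allocated copy
    # and continue classifying the pointer.
    if size_bits > 2 * xlen:
        fields = [("int", xlen)]
        size_bits = xlen
        by_reference = True

    # Step 3: exactly one integer field and a free GPR.
    if len(fields) == 1 and fields[0][0] == "int" and free_gprs >= 1:
        return ("GPR", 1, by_reference)

    # Step 4: pack into 0-2 GPRs if enough are free.
    needed = (size_bits + xlen - 1) // xlen
    if needed <= free_gprs:
        return ("GPR", needed, by_reference)

    # Step 5: otherwise the argument goes on the stack.
    return ("stack", 0, by_reference)


two_floats = classify([("float", 32)] * 2, 64, free_fprs=8, free_gprs=8)
four_floats = classify([("float", 32)] * 4, 128, free_fprs=8, free_gprs=8)
# two_floats == ("FPR", 2, False); four_floats == ("GPR", 2, False)
```

In this sketch a two-register struct that arrives when only one GPR is free goes to the stack, and that last GPR stays unused for it.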

How is va_list handled? If passing struct { long; long; } sometimes leaves a hole in a7 that sounds like a complication.

aswaterman commented on July 20, 2024

Definitely shouldn't have a hole at a7 in an unnamed argument in the varargs case. It's of course OK if the passing convention for unnamed arguments differs from the named-parameter passing convention. (The varargs calling convention should be simple for the callee to achieve reasonable runtime performance, so inherently it should be reasonably simple to spec.)

Your description seems to match my proposal. I need to think about it more carefully and do some benchmarking, though. It's important we get this right, since this is probably our last good chance to change it.

kito-cheng commented on July 20, 2024

Having the caller push all varargs onto the stack is an option too; it's simpler, and varargs are generally not performance-critical.

aswaterman commented on July 20, 2024

Yeah, I was going to evaluate the code size of that option, too.

On Tuesday, November 22, 2016, Kito Cheng wrote:

Having the caller push all varargs onto the stack is an option too; it's simpler, and varargs are generally not performance-critical.

sorear commented on July 20, 2024

If you can figure out how, I'd also like to see a comparison between s390 / amd64 struct arguments. Making the callee copy saves effort if the caller wasn't going to modify the argument anyway (or take a & reference…), making the caller copy saves effort if the struct is a temporary which doesn't need to be preserved, and I can't determine from first principles which is better.

aswaterman commented on July 20, 2024

@kito-cheng any idea how you inform gcc who "owns" arguments passed on the stack? It seems that they belong to the caller for s390 (i.e., they are callee-saved).

aswaterman commented on July 20, 2024

Passing all varargs on the stack has the downside that you can't tail-call a varargs function (e.g., if printf is the last statement in a subroutine). This increases stack usage further and might have a more profound impact on code size than the extra arg-pushing.

Net code size increase is 0.2% for my sysroot and 0.5% for SPEC. That's a fairly large regression, so let's not go that route.

aswaterman commented on July 20, 2024

In my recent proposal, _Complex types are treated the same as structs with two floats, i.e., passed in FP registers if possible.
