Comments (16)
I agree that _Complex foo should be passed the same way as struct { foo re; foo im; }. The current behavior is a holdover from the SGI MIPS64 ABI, on which the RISC-V calling convention is based.
To avoid reducing performance, we should at the same time allow structs with 2 float32s to be passed in F registers for RV64G. At the moment, only structs with 2 float64s will get passed in F registers; the former case is passed by reference. I believe that is consistent with rule 7.
from riscv-elf-psabi-doc.
@aswaterman Do you have an example handy of a psABI that does that or is this something for me to research?
http://math-atlas.sourceforge.net/devel/assembly/mipsabi64.pdf is the best I can find.
Confusingly, it claims "A struct with only one or two floating point fields is returned in $f0 (and $f2 if necessary). This is a generalization of the Fortran COMPLEX case." That is not what MIPS GCC does. _Complex float and _Complex double are both passed in f-registers. A struct with 2 floats is passed by reference, and a struct with 2 doubles is passed like _Complex.
Not sure if that is just a bug in MIPS GCC that we inherited.
FWIW, AArch64 does pass struct{float;float} the same way it passes float _Complex.
I've been thinking about how to handle floating-point aggregates when FLEN != XLEN, and I'm starting to think we should just decouple integer and floating-point argument passing altogether.
Basic idea is that if an argument is a candidate to be passed in FPRs, it goes in the first available FPRs, regardless of how many integer registers have been allocated, and vice versa. (This increases the max number of arguments in registers to 16, which is only marginally useful.)
Structs are either passed entirely in registers or entirely in memory.
Structs are passed in floating-point registers only if they consist of one or two fields that are both floating-point numbers that are both <= the ABI's FLEN. (Things passed in FP registers are not subject to the usual 2*XLEN constraint.)
Thoughts?
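As a rough illustration of the decoupling idea, the following C sketch assigns scalar arguments to registers using two independent counters. It assumes the 8 integer argument registers (a0-a7) and 8 floating-point argument registers (fa0-fa7) implied by the "16" figure above. The names `reg_state`, `arg_loc`, and `assign_scalar` are hypothetical, and the fallback of a floating-point argument to a GPR once the FPRs run out is an assumption, not something this comment specifies.

```c
#include <stdbool.h>

/* Assumed register counts: a0-a7 and fa0-fa7. */
enum { NUM_ARG_GPRS = 8, NUM_ARG_FPRS = 8 };

enum loc_kind { LOC_GPR, LOC_FPR, LOC_STACK };

/* Where one argument ends up; index is the register number, or -1 on the stack. */
struct arg_loc {
    enum loc_kind kind;
    int index;
};

/* Hypothetical bookkeeping: the two counters advance independently. */
struct reg_state {
    int gprs_used;
    int fprs_used;
};

struct arg_loc assign_scalar(struct reg_state *st, bool is_fp)
{
    struct arg_loc loc = { LOC_STACK, -1 };

    if (is_fp && st->fprs_used < NUM_ARG_FPRS) {
        /* FP candidate: first free FPR, no matter how many GPRs are in use. */
        loc.kind = LOC_FPR;
        loc.index = st->fprs_used++;
    } else if (st->gprs_used < NUM_ARG_GPRS) {
        /* Integer argument, or (assumption) an FP argument with no FPR left. */
        loc.kind = LOC_GPR;
        loc.index = st->gprs_used++;
    }
    return loc;
}
```

Under these assumptions, the arguments of `f(int, double, int)` land in a0, fa0, and a1, and a call with 8 integer and 8 floating-point scalars uses no stack slots.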
What are the advantages and tradeoffs of this scheme?
Would a struct with 4 floats be passed in 2 GPRs on RV64G? Is that desired?
This seems to lose a lot of the simplicity of the "conceptual struct".
I'd like to have a rule that "structs and arrays are conceptually flattened", so that _Complex double, struct { double; double }, struct { struct { double }; double }, struct { double[2] }, and struct { double; char[0]; double } are all handled the same, as are struct { double } and double.
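For concreteness, here are the types from the list above written out as C declarations. This is only an illustration: the type and field names are made up, the zero-length-array variant is left out because `char[0]` is a GNU extension, and the size checks assume a mainstream ABI where none of these structs is padded.

```c
/* Two double "leaves" each, once nested structs and arrays are flattened. */
typedef double _Complex complex_pair;

struct plain_pair  { double re; double im; };
struct nested_pair { struct { double v; } inner; double im; };
struct array_pair  { double v[2]; };

/* One double leaf: under the proposed rule, handled like a bare double. */
struct single_leaf { double v; };

/* Same size on common ABIs; the flattening rule would also give them the
   same argument-passing treatment. */
_Static_assert(sizeof(struct plain_pair) == sizeof(complex_pair),
               "plain pair has the size of _Complex double");
_Static_assert(sizeof(struct nested_pair) == sizeof(complex_pair),
               "nested pair has the size of _Complex double");
_Static_assert(sizeof(struct array_pair) == sizeof(complex_pair),
               "array pair has the size of _Complex double");
_Static_assert(sizeof(struct single_leaf) == sizeof(double),
               "single-field struct has the size of double");
```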
Main advantage is that it's simpler to describe and implement the passing of floating-point values/aggregates in floating-point registers when XLEN and FLEN differ. In the current RV64G scheme, you can't pass struct{float;float} or float _Complex in registers without adding a special case.
A struct with 4 floats would indeed be passed in 2 GPRs on RV64G in this proposal. That's already the case, though, and is also the case on x86-64 and AArch64.
Ancillary advantage I just discovered: my proposal appears to match AArch64 exactly.
I'm struggling to identify anything about it that's actually worse. It does do away with the conceptual struct, but that's fine, IMO, as long as the algorithm can be expressed clearly & concisely.
So:
- If an argument has 2 or fewer float fields of ≤FLEN, and there are enough free FPRs, pass it in FPRs.
- If an argument is > 2 * XLEN, pass it in read-only, caller-allocated memory, replace the argument with a pointer, and continue.
- If an argument has exactly 1 integer field and there is a GPR free, pass it in that GPR after type-extending to 32 bits and sign-extending from 32 bits to XLEN.
- Otherwise, pack the argument into 0–2 GPRs and pass it in GPRs if enough are free.
- Otherwise, pass it on the stack.
?
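One way to read the list above is as a single classification function. The sketch below is a loose C rendering under several assumptions: arguments are described by a hypothetical `arg_desc` (field counts and sizes after flattening), the single-integer-field case is folded into the general GPR case because it differs only in how the value is extended, and an argument that does not fit in the remaining GPRs goes to the stack without consuming them, which is the "hole" the next paragraph asks about. None of the names come from the psABI.

```c
#include <stdbool.h>

enum arg_class {
    ARG_IN_FPRS,
    ARG_IN_GPRS,
    ARG_ON_STACK,
    ARG_BY_REF_IN_GPR,   /* pointer to a caller-allocated copy, in a GPR */
    ARG_BY_REF_ON_STACK  /* pointer to a caller-allocated copy, on the stack */
};

/* Hypothetical ABI state: register widths in bits and free register counts. */
struct abi_state {
    int xlen;
    int flen;
    int gprs_free;
    int fprs_free;
};

/* Hypothetical description of one argument after flattening. */
struct arg_desc {
    int fp_fields;    /* number of floating-point fields */
    int int_fields;   /* number of integer or pointer fields */
    int max_fp_bits;  /* width of the widest floating-point field */
    int size_bits;    /* total size of the argument */
};

enum arg_class classify_arg(struct abi_state *abi, struct arg_desc arg)
{
    /* Bullet 1: one or two FP fields, each no wider than FLEN. */
    if (arg.int_fields == 0 && arg.fp_fields >= 1 && arg.fp_fields <= 2 &&
        arg.max_fp_bits <= abi->flen && abi->fprs_free >= arg.fp_fields) {
        abi->fprs_free -= arg.fp_fields;
        return ARG_IN_FPRS;
    }

    /* Bullet 2: too large, so pass a pointer and keep classifying that. */
    bool by_ref = false;
    if (arg.size_bits > 2 * abi->xlen) {
        by_ref = true;
        arg.size_bits = abi->xlen;  /* a pointer is XLEN bits wide */
    }

    /* Bullets 3 and 4: pack into 0-2 GPRs if enough are free. */
    int gprs_needed = (arg.size_bits + abi->xlen - 1) / abi->xlen;
    if (gprs_needed <= abi->gprs_free) {
        abi->gprs_free -= gprs_needed;
        return by_ref ? ARG_BY_REF_IN_GPR : ARG_IN_GPRS;
    }

    /* Bullet 5: the stack; free GPRs are left untouched (assumption). */
    return by_ref ? ARG_BY_REF_ON_STACK : ARG_ON_STACK;
}
```

With XLEN = FLEN = 64 and all registers free, this classifies struct { float; float } as `ARG_IN_FPRS` and a struct of 4 floats as `ARG_IN_GPRS` (two GPRs), matching the RV64G discussion earlier in the thread.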
How is va_list handled? If passing struct { long; long; } sometimes leaves a hole in a7 that sounds like a complication.
Definitely shouldn't have a hole at a7 in an unnamed argument in the varargs case. It's of course OK if the passing convention for unnamed arguments differs from the named-parameter passing convention. (The varargs calling convention should be simple for the callee to achieve reasonable runtime performance, so inherently it should be reasonably simple to spec.)
Your description seems to match my proposal. I need to think about it more carefully and do some benchmarking, though. It's important we get this right, since this is probably our last good chance to change it.
Having the caller push all varargs onto the stack is an option too; it's simpler, and varargs are not performance-critical in the general case.
Yeah, I was going to evaluate the code size of that option, too.
On Tuesday, November 22, 2016, Kito Cheng wrote:
> Having the caller push all varargs onto the stack is an option too; it's simpler, and varargs are not performance-critical in the general case.
If you can figure out how, I'd also like to see a comparison between s390 / amd64 struct arguments. Making the callee copy saves effort if the caller wasn't going to modify the argument anyway (or take a & reference…), making the caller copy saves effort if the struct is a temporary which doesn't need to be preserved, and I can't determine from first principles which is better.
@kito-cheng any idea how you inform gcc who "owns" arguments passed on the stack? It seems that they belong to the caller for s390 (i.e., they are callee-saved).
Passing all varargs on the stack has the downside that you can't tail-call a varargs function (e.g., if printf is the last statement in a subroutine). This increases stack usage further and might have a more profound impact on code size than the extra arg-pushing.
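A minimal C example of the pattern in question, with a hypothetical function name. When the variadic arguments travel in registers, a compiler may be able to turn the final call into a plain jump to printf; if they always had to be pushed onto the stack, the caller's frame would need to stay alive until printf returns. Whether a tail call is actually emitted depends on the compiler and optimization level.

```c
#include <stdio.h>

/* Hypothetical helper: the variadic call to printf is its last action,
   so it is a candidate for a tail call. */
int report_status(const char *name, int code)
{
    return printf("%s: exit code %d\n", name, code);
}
```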
Net code size increase is 0.2% for my sysroot and 0.5% for SPEC. That's a fairly large regression, so let's not go that route.
In my recent proposal, _Complex types are treated the same as structs with two floats, i.e., passed in FP registers if possible.