Comments (19)

jrtc27 commented on June 30, 2024

Regardless of Linux kernel <-> user ABI concerns that are OS-specific, you still need an ABI within userspace, which means you need psABI additions. I'm concerned that no attempt, as far as I know, has been made to engage with the psABI TG / Toolchains SIG about this, despite being out for public review. I dispute https://jira.riscv.org/browse/RVS-1125 being marked "not required to freeze" and cannot see a justification for it.

from riscv-elf-psabi-doc.

martinmaas commented on June 30, 2024

Thanks for the quick response! The reasoning was the following: The pointer masking standard under review is only the base functionality needed for HWASan (other use cases were deferred until later; including the ability of user mode to reconfigure pointer masking itself). For HWASan, pointer masking is configured on a per-process basis. Within the same process, all tagged pointers are handled like any other pointer when, e.g., being passed between libraries (identical to how HWASan works on other architectures).

There are currently no defined use cases beyond HWASan, although it is expected that future extensions and use cases building on top of pointer masking (e.g., the memory tagging standard that is currently getting started) may have psABI implications. This was part of the reason for the last paragraph in my message.

Based on your comment, it sounds like this needs more discussions from the toolchain/psABI side, and so it seems particularly timely to start a conversation now. What would be your preferred way to proceed? (e.g., starting an email thread, discussing on this GitHub issue, some group call?)

jrtc27 commented on June 30, 2024

I would suggest putting forward a proposal on this issue with what you believe the psABI specification should be.

jrtc27 commented on June 30, 2024

(and pointing out that libraries and applications, e.g. Qt's QJSValue in Qt Declarative, make assumptions about what they can do with high bits of pointers, so repurposing them for something else has software-visible effects that are hit in the real world)

SiFiveHolland commented on June 30, 2024

The pointer masking ISA extension does not prescribe any particular usage for the high bits of pointers, so outside of any particular OS, the ISA extension itself has little effect on the userspace ABI. The three impacts I can see are:

  1. The uppermost memory addresses are inaccessible in Svbare (or if S-mode is not implemented). I don't think this is a real problem, as any hardware implementing Supm can just not put physical memory there.
  2. It is possible that two pointers may reference the same data object while comparing unequal.
  3. Data accesses apply the ignore transformation, while instruction accesses do not, so conversion from a data pointer to a function pointer may require transforming the pointer in software. But I don't think that can be done implicitly since the runtime may not know PMLEN.

I think the implication of 3. is that the psABI should mandate that the high bits of pointers retain their original values across function calls (in other words, software following the RISC-V ABI should always see sign/zero-extended pointers, but without needing to know which address translation mode is in use).

Then, any software that wishes to manipulate the high bits (HWASan, Qt, etc.) is defining its own custom ABI.
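To illustrate impact 2, here is a minimal C sketch that models addresses as plain 64-bit integers. It is not taken from the specification: the helper names `with_tag`, `ignore_upper_bits`, and `same_location` are hypothetical, and for simplicity it uses the zero-extending form of the transformation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: place `tag` in the top `pmlen` bits of `addr`. */
static uint64_t with_tag(uint64_t addr, uint64_t tag, unsigned pmlen)
{
    assert(pmlen > 0 && pmlen < 64);
    uint64_t low_mask = ~UINT64_C(0) >> pmlen;
    return (addr & low_mask) | (tag << (64 - pmlen));
}

/* Hypothetical helper: clear the top `pmlen` bits (zero-extending form). */
static uint64_t ignore_upper_bits(uint64_t addr, unsigned pmlen)
{
    assert(pmlen > 0 && pmlen < 64);
    return addr & (~UINT64_C(0) >> pmlen);
}

/* Two tagged addresses name the same location if they agree once the
   tag bits are ignored, even when the raw values differ. */
static bool same_location(uint64_t a, uint64_t b, unsigned pmlen)
{
    return ignore_upper_bits(a, pmlen) == ignore_upper_bits(b, pmlen);
}
```

With PMLEN=7, `with_tag(0x12345678, 0x2A, 7)` and `with_tag(0x12345678, 0x15, 7)` differ as raw values, yet `same_location` reports them as the same location.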

ptomsich commented on June 30, 2024

(and pointing out that libraries and applications, e.g. Qt's QJSValue in Qt Declarative, make assumptions about what they can do with high bits of pointers, so repurposing them for something else has software-visible effects that are hit in the real world)

This is an interesting point, as software using the high bits for its own purposes (unless these are explicitly masked off before dereferencing) without pointer-masking is making false assumptions about its runtime environment.

For me this brings up the opposite question: how should software signal to the runtime that it depends on pointer-masking? And could the PMLEN be a selection criterion for ifunc to use either an optimized implementation that pushes the masking to hardware or fall back to masking in software?

SiFiveHolland commented on June 30, 2024

For me this brings up the opposite question: how should software signal to the runtime that it depends on pointer-masking?

It could make sense to put the software's minimum required PMLEN in the ELF header somewhere. I see GCC has an -mlam option for x86, but this is not saved anywhere in the object file. The dynamic linker could use this information to automatically request pointer masking from the kernel.

This raises the question of how the linker should handle objects with different PMLEN requirements: raise an error? max()?
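As a sketch of one possible answer (the max() option, with an error when the result cannot be satisfied), the merge could look like the following. Everything here is an assumption for illustration: no such ELF field exists today, and `merge_required_pmlen` and `supported_pmlen` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical policy: the process needs the largest PMLEN that any
   linked object requires. Returns that value, or -1 if it exceeds what
   the platform can provide (`supported_pmlen`, an assumed input). */
static int merge_required_pmlen(const unsigned *required, size_t count,
                                unsigned supported_pmlen)
{
    assert(required != NULL || count == 0);
    unsigned needed = 0; /* 0 means "no pointer masking required" */
    for (size_t i = 0; i < count; i++) {
        if (required[i] > needed)
            needed = required[i];
    }
    return needed <= supported_pmlen ? (int)needed : -1;
}
```

Whether a mismatch should be a hard error or a silent max() is exactly the open question above; this sketch only shows that the two options can be combined.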

And could the PMLEN be a selection criterion for ifunc to use either an optimized implementation that pushes the masking to hardware or fall back to masking in software?

The software masking operation is two instructions (shift left, shift right, plus loading PMLEN if the number of tag bits used by software is not constant). So anything that adds overhead is going to be more expensive than unconditionally doing the masking in software. I assume you mean using PMLEN in the selection criteria for functions which already use ifunc?
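For reference, the shift-left/shift-right pair can be sketched in C as below. The function names are hypothetical, and the sign-extending variant assumes the compiler implements right shifts of negative signed values as arithmetic shifts (true for GCC and Clang, but implementation-defined in ISO C).

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extending form: the top `pmlen` bits are replaced with copies of
   bit (63 - pmlen). Relies on arithmetic right shift of signed values. */
static inline uint64_t strip_tag_sext(uint64_t addr, unsigned pmlen)
{
    assert(pmlen < 64);
    return (uint64_t)((int64_t)(addr << pmlen) >> pmlen);
}

/* Zero-extending form: the top `pmlen` bits are replaced with zeros. */
static inline uint64_t strip_tag_zext(uint64_t addr, unsigned pmlen)
{
    assert(pmlen < 64);
    return (addr << pmlen) >> pmlen;
}
```

When `pmlen` is a compile-time constant, each function should reduce to the two shift instructions mentioned above; a variable `pmlen` adds the load.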

ptomsich commented on June 30, 2024

And could the PMLEN be a selection criterion for ifunc to use either an optimized implementation that pushes the masking to hardware or fall back to masking in software?

The software masking operation is two instructions (shift left, shift right, plus loading PMLEN if the number of tag bits used by software is not constant). So anything that adds overhead is going to be more expensive than unconditionally doing the masking in software. I assume you mean using PMLEN in the selection criteria for functions which already use ifunc?

ifunc resolution is a one-time overhead at program startup. So replacing a hot function that masks pointers using a sll+srl with a function that relies on hardware-masking will not incur any overhead after the initial ifunc resolution.

kito-cheng commented on June 30, 2024

@SiFiveHolland

I think the implication of 3. is that the psABI should mandate that the high bits of pointers retain their original values across function calls (in other words, software following the RISC-V ABI should always see sign/zero-extended pointers, but without needing to know which address translation mode is in use).

Pointers passed across function calls do not get sign- or zero-extended, since we don't have ILP32 on RV64; that is the only combination where extension may happen (we don't formally define that ABI variant, so I say "may").

kito-cheng commented on June 30, 2024

Also, I am not a fan of asking the dynamic linker to reserve or request pointer masking from the kernel. I suspect the dynamic linker can't give a precise diagnostic message when something goes wrong; this can be done much more easily by a constructor function or the crt startup code rather than by the dynamic linker, IMO.

SiFiveHolland commented on June 30, 2024

And could the PMLEN be a selection criterion for ifunc to use either an optimized implementation that pushes the masking to hardware or fall back to masking in software?

The software masking operation is two instructions (shift left, shift right, plus loading PMLEN if the number of tag bits used by software is not constant). So anything that adds overhead is going to be more expensive than unconditionally doing the masking in software. I assume you mean using PMLEN in the selection criteria for functions which already use ifunc?

ifunc resolution is a one-time overhead at program startup. So replacing a hot function that masks pointers using a sll+srl with a function that relies on hardware-masking will not incur any overhead after the initial ifunc resolution.

Right, but functions using ifunc cannot be inlined, so there is no overhead only if the function doing the pointer masking is already out of line. My point was that ifunc will not remove all software pointer masking, since not all functions containing inlined pointer masking operations will be worth duplicating just to remove the shift instructions. I think we are in agreement on this, if you are selecting specific hot functions for ifunc optimization. My initial assumption was still wrong; thanks for clarifying.
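To make the trade-off concrete, here is a rough C sketch of the one-time selection. It uses a plain function pointer as a stand-in for a real ifunc resolver, so it is portable but not the actual ifunc mechanism. `TAG_BITS`, the function names, and the `hw_pmlen` parameter are all hypothetical, and it assumes 64-bit pointers and an arithmetic right shift for signed values.

```c
#include <assert.h>
#include <stdint.h>

#define TAG_BITS 7u /* hypothetical: number of tag bits this software uses */

typedef uint64_t (*load_fn)(uintptr_t tagged);

/* Software path: strip the tag (shift left, arithmetic shift right),
   then load. Always safe, but costs the two shifts on every call. */
static uint64_t load_sw_masked(uintptr_t tagged)
{
    uintptr_t addr = (uintptr_t)((intptr_t)(tagged << TAG_BITS) >> TAG_BITS);
    return *(const uint64_t *)addr;
}

/* Hardware path: load directly. Only valid when hardware pointer
   masking is enabled and ignores at least TAG_BITS upper bits. */
static uint64_t load_hw_masked(uintptr_t tagged)
{
    return *(const uint64_t *)tagged;
}

/* Stand-in for an ifunc resolver: called once, picks an implementation
   based on the PMLEN the platform reports (assumed input). */
static load_fn resolve_load(unsigned hw_pmlen)
{
    assert(sizeof(uintptr_t) * 8 == 64);
    return hw_pmlen >= TAG_BITS ? load_hw_masked : load_sw_masked;
}
```

Either way the call goes through a pointer and cannot be inlined, which is the cost described above: the selection only pays off for functions that were already out of line.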

I think the implication of 3. is that the psABI should mandate that the high bits of pointers retain their original values across function calls (in other words, software following the RISC-V ABI should always see sign/zero-extended pointers, but without needing to know which address translation mode is in use).

Pointers passed across function calls do not get sign- or zero-extended, since we don't have ILP32 on RV64; that is the only combination where extension may happen (we don't formally define that ABI variant, so I say "may").

I'm referring to the sign/zero extension from 48 or 57 bits to 64 bits that is the "ignore transformation" defined by the pointer masking extension. In other words, software should always see "canonical" pointers, though the RISC-V Privileged spec never uses that term.

kito-cheng commented on June 30, 2024

I'm referring to the sign/zero extension from 48 or 57 bits to 64 bits that is the "ignore transformation" defined by the pointer masking extension. In other words, software should always see "canonical" pointers, though the RISC-V Privileged spec never uses that term.

User-mode software mostly doesn't need to be aware of that; I am only aware of ASan needing to know it. Otherwise the pointer mostly comes from the kernel, e.g. via mmap, and is then kept unchanged for the whole lifetime of the user-mode software.

Anyway, we don't need more wording for this on psABI site so far IMO :)

ptomsich commented on June 30, 2024

Pointers passed across function calls do not get sign- or zero-extended, since we don't have ILP32 on RV64; that is the only combination where extension may happen (we don't formally define that ABI variant, so I say "may").

Just to get a sense of where everyone stands on the ILP32 question of needing to zero-extend the 32-bit address before loading from it (ideally without having to insert zext.w instructions): would we prefer a solution with pointer-masking (PMLEN=32) or using special ld-instructions (that zero-extend a 32-bit address after address-calculation) for ILP32?

martinmaas commented on June 30, 2024

I think it'd probably be the latter. Conceptually, pointer masking is an operation that happens at the very last step before a memory access is performed, so any ILP32-related zero extension is conceptually independent from pointer masking, in my view (and makes pointer masking a no-op with the currently defined PMLENs).

SiFiveHolland commented on June 30, 2024

I'm referring to the sign/zero extension from 48 or 57 bits to 64 bits that is the "ignore transformation" defined by the pointer masking extension. In other words, software should always see "canonical" pointers, though the RISC-V Privileged spec never uses that term.

User-mode software mostly doesn't need to be aware of that; I am only aware of ASan needing to know it. Otherwise the pointer mostly comes from the kernel, e.g. via mmap, and is then kept unchanged for the whole lifetime of the user-mode software.

The problem is that there are specific cases where user software leaving the pointer unchanged is not sufficient.

Say for example you have a user program that uses malloc() to allocate a buffer, JITs some instructions into it, casts the address of the buffer to a function pointer, and calls the function pointer. This is perfectly valid with the existing ABI.

Now say you have a libc where the malloc() implementation generates and returns tagged pointers, and the free() implementation checks the tag to catch double-free errors. The ABI problem is that you cannot link the above user program to this libc implementation, because the "call the function pointer" step will crash if the pointer is tagged.

Anyway, we don't need more wording for this on psABI site so far IMO :)

I would want to be able to point to some text in the psABI to clarify that the above libc implementation is not conforming. But I'll defer to the opinion of the psABI maintainers.

fmayer commented on June 30, 2024

Now say you have a libc where the malloc() implementation generates and returns tagged pointers, and the free() implementation checks the tag to catch double-free errors. The ABI problem is that you cannot link the above user program to this libc implementation, because the "call the function pointer" step will crash if the pointer is tagged.

I have been working on HWASan, and my team and I agree we wouldn't support that. Allocating a buffer from malloc and using it for code seems like a bad idea for various reasons. One of them is W^X: you would need to mprotect malloc memory to make it non-writable and executable, which is not something you should do (and you'd have to mess around to get something properly aligned in the first place). People should just use mmap if they want memory for code.

martinmaas commented on June 30, 2024

Hi everyone! Closing the loop on this as we are wrapping up public review for pointer masking. It sounds like we agree that, for the most part, the psABI will not need to cover pointer masking. The main open question is whether we would consider passing tagged pointers across boundaries psABI-conforming or not, which may be worth documenting.

If we define it to be non-conforming, HWASan environments would effectively have their own psABI, but be mostly compatible with standard compiled software, as is the case on other architectures today (with the exception of some software that uses tag bits for other purposes). If we define it to be conforming, we would likely need to specify that psABI-compliant function calls must not modify the top bits of pointers.

Based on Florian's comment, it sounds like both should be possible.

kito-cheng commented on June 30, 2024

It sounds like we agree that, for the most part, the psABI will not need to cover pointer masking.

Yes, from my point of view as psABI chair. We may need some wording for pointer masking in the psABI in the future, but not now, since we don't have a clear usage or request for it yet.

kito-cheng commented on June 30, 2024

I'm going to close this issue; feel free to reopen it or open a new one if needed.
