Code Monkey home page Code Monkey logo

Comments (10)

antoinemine avatar antoinemine commented on September 28, 2024

This is interesting, but I am not considering adding this at the moment.
I think that, with the optimisations that you are proposing, there will be very little code sharing with the low-level implementation in ZArith:

  • you want to use two's complement representation, but Zarith uses GMP's sign bit plus unsigned integer representation for long numbers
  • Zarith ensures that operations on short integers are mapped directly to the OCaml ones; this is not the case if you reserve 10 bits in every number

from zarith.

recoules avatar recoules commented on September 28, 2024

Zarith ensures that operations on short integers are mapped directly to the OCaml ones; this is not the case if you reserve 10 bits in every number

Well, in fact, it only needs a wrapper that shift right (asr 10) the value and then, the operation can be mapped directly to the OCaml one. Then, before returning, if ((x lsl 10) asr 10) = x then you know you can use compact form.

you want to use two's complement representation, but Zarith uses GMP's sign bit plus unsigned integer representation for long numbers

If think on the contrary that we want to use the GMP's sign bit plus unsigned value because for positive value, it changes nothing but for example, if you want to represent a big negative 128 bit value, e.g. -9 223 372 036 854 775 807, you would prefer to represent it as -0x7fffffffffffffff,128 rather than 0xffffffffffffffff8000000000000001,128. It is still very hypothetical.

I think that, with the optimisations that you are proposing, there will be very little code sharing with the low-level implementation in ZArith:

In fact, as I am very tempted to reuse some of the helper functions and constant declared in the stub c file, I thought it could be smarter to directly embed it in the same library rather than creating my own "almost copy-pasted" project.

Also, as a side note, I am wondering why there is a need for long numbers to have a word reserved for sign | size. Assuming you are willing to truncate ocaml memory layout (aka, updating the header), the size would be directly mapped to the one of the header (minus one because of the custom block). Yet, we still need a bit for the sign but, if the custom block is used (and I think it is the default implementation choice), we can use different struct custom_operations addresses to make the difference between positive and negative. Thus, for a small static duplication cost, we save one word by allocated long Z. I am planning to try this because it sounds fun but I agree it looks pretty pointless.

from zarith.

antoinemine avatar antoinemine commented on September 28, 2024

Sorry for the delay in answering.

Zarith ensures that operations on short integers are mapped directly to the OCaml ones; this is not the case if you reserve 10 bits in every number

Well, in fact, it only needs a wrapper that shift right (asr 10) the value and then, the operation can be mapped directly to the OCaml one. Then, before returning, if ((x lsl 10) asr 10) = x then you know you can use compact form.

If I understand correctly, it is possible that you have a bitvector in compact representation that does not fit into a machine integer (bitsize between 53 and 1024 I guess). In that case, you'll do asr and lsl on big numbers. I think that it could end up much more costly than keeping the size separately, as an OCaml int, and let the large integer be unshifted.

you want to use two's complement representation, but Zarith uses GMP's sign bit plus unsigned integer representation for long numbers

If think on the contrary that we want to use the GMP's sign bit plus unsigned value because for positive value, it changes nothing but for example, if you want to represent a big negative 128 bit value, e.g. -9 223 372 036 854 775 807, you would prefer to represent it as -0x7fffffffffffffff,128 rather than 0xffffffffffffffff8000000000000001,128. It is still very hypothetical.

For the modular arithmetic part, would it be possible to simply use the standard Z.add, Z.sub, etc operations, followed with an extract (or signed_extract) ?
The current downside is that there is no fast (OCaml) path for these operators, but we could work on this.

Also, as a side note, I am wondering why there is a need for long numbers to have a word reserved for sign | size. Assuming you are willing to truncate ocaml memory layout (aka, updating the header), the size would be directly mapped to the one of the header (minus one because of the custom block). Yet, we still need a bit for the sign but, if the custom block is used (and I think it is the default implementation choice), we can use different struct custom_operations addresses to make the difference between positive and negative. Thus, for a small static duplication cost, we save one word by allocated long Z. I am planning to try this because it sounds fun but I agree it looks pretty pointless.

Modifying the header of an OCaml block after its allocation seems an extremely bad idea to me.

from zarith.

recoules avatar recoules commented on September 28, 2024

If` I understand correctly, it is possible that you have a bitvector in compact representation that does not fit into a machine integer (bitsize between 53 and 1024 I guess).

It is not exactly that.
In a first time, only the positive value that actually fit in 52 bits can use the compact representation (cover usual small constants).
Then we can figure out that handling "sign extension" could be beneficial to cover small negative constant too.
For instance -1<64> is a good candidate and could be represented as -1 lsl 10 lor 64 instead of Bigint 0xffffffffffffffffL; assuming all methods deal correctly with the "sign extension" compact representation.

For the modular arithmetic part, would it be possible to simply use the standard Z.add, Z.sub, etc operations, followed with an extract (or signed_extract) ?
The current downside is that there is no fast (OCaml) path for these operators, but we could work on this.

This is actually what is done today. I opened some times ago the PR #90 (need to be reviewed) for adding fast path to extract operations.

Modifying the header of an OCaml block after its allocation seems an extremely bad idea to me.

The function Obj.truncate makes this (but it could be more efficient to redo it inline). It can not really mess with the garbage collector since the "remaining" bytes will not be reclaimed right away (it is not as if we changed the pointer in the minor heap to mark them reallocable). However, it could help if the block is promoted as only the useful part will be copied in major heap.

Still I expect all 64 bits common values to fall in the compact form so, saving 1 word time to time should not really matter a lot.

from zarith.

Gbury avatar Gbury commented on September 28, 2024

Modifying the header of an OCaml block after its allocation seems an extremely bad idea to me.

The function Obj.truncate makes this (but it could be more efficient to redo it inline). It can not really mess with the garbage collector since the "remaining" bytes will not be reclaimed right away (it is not as if we changed the pointer in the minor heap to mark them reallocable). However, it could help if the block is promoted as only the useful part will be copied in major heap.

Still I expect all 64 bits common values to fall in the compact form so, saving 1 word time to time should not really matter a lot.

Truncating ocaml values after their creation breaks assumptions made by the flambda optimization pass of the compiler. From what I remember, with multicore (i.e. ocaml 5.0), it is also unsafe to modify headers of objects. And finally, the Obj.truncate function has been deprecated since april 2019 (see ocaml/ocaml@466d3bc ), so I would really recommend not to use it.

from zarith.

recoules avatar recoules commented on September 28, 2024

@Gbury Thank you for the precision. Of course, I would not recommend to truncate arbitrary values, but here is a special case: the value is created by an external function (so flambda is, I think, out of the discussion) and, more importantly, the header will be updated before this external function return (meaning it is the only owner of the value).

Still, I do not really know how the garbage collector will work in multicore with external function calls. Can an external C primitive be interrupted by the garbage collector without calling anything related to OCaml? If yes, then I think a lot of code will become unsafe. If it is still no, then it is safe to modify the header since no one except the C function can use the value yet.

from zarith.

antoinemine avatar antoinemine commented on September 28, 2024

It is not exactly that. In a first time, only the positive value that actually fit in 52 bits can use the compact representation (cover usual small constants).

OK, this seems more reasonable.

For the modular arithmetic part, would it be possible to simply use the standard Z.add, Z.sub, etc operations, followed with an extract (or signed_extract) ?
The current downside is that there is no fast (OCaml) path for these operators, but we could work on this.

This is actually what is done today.

OK. I was not sure, given the current bitvector.ml code (linked on top of the issue).

I opened some times ago the PR #90 (need to be reviewed) for adding fast path to extract operations.

Sorry, I meant a way to code the current fast path in OCaml instead of C, as done in the current version of most operators.
PR #90 is for calls that output a small int even though the arguments are not small.

Modifying the header of an OCaml block after its allocation seems an extremely bad idea to me.

Given the discussion, it is clearly not future-proof. I am against making and exploiting such assumptions on the GC.

from zarith.

recoules avatar recoules commented on September 28, 2024

PR #90 is for calls that output a small int even though the arguments are not small.

It was originally for this but I updated it a time ago (#90 (comment)) to give fast path in OCaml (because small integer input restrict is a special case where the function return a small integer).

Given the discussion, it is clearly not future-proof. I am against making and exploiting such assumptions on the GC.

Fair enough, it is not the center of the topic anyway.
Still, for my personal curiosity (since I have not closely followed the last GC development), I would be curious to know an expert view (@damiendoligez). My hypothesis is that each domain will have its own minor heap and that, as long as there is no reference on a newly created value in another minor heap / in the major heap and that there is no OCaml allocation in this minor heap, then it is safe to do what you want with this value (eg. make a big alloc and split it in smaller values, truncate it, etc.). If no, some older code may not be compatible anymore. Also, I am wondering if the major collection can be triggered when one domain is currently running an external primitive? If so, does this mean that all OCaml parameter should be declared volatile in the C part?

from zarith.

xavierleroy avatar xavierleroy commented on September 28, 2024

Re: truncating an already-allocated block, this was supported in OCaml up to versions 4.xx (function Obj.truncate), but rather inefficiently. In particular, reducing the size of a major heap block by one creates a one-word fragment that cannot be used for allocation, causing heap fragmentation.

In Multicore OCaml, which became OCaml 5.00 earlier today, Obj.truncate is not supported at all because of strange race conditions between truncating an object and concurrently accessing it. So, don't even think of using it.

from zarith.

antoinemine avatar antoinemine commented on September 28, 2024

Now that #90 is merged, it should be possible to write a modular arithmetic library on top of Zarith in an efficient way in 100% pure OCaml. So, closing the issue.

from zarith.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.