Code Monkey home page Code Monkey logo

Comments (5)

frankmcsherry avatar frankmcsherry commented on July 30, 2024 1

It sounds interesting! I'll need a bit to get my head around what it entails, but it seems quite reasonable to 1. limit the use of assumptions about time, and 2. be as serious as possible about usize sizes.

If it's alright, I'll take a few to read up on things. Mostly, I'd love to avoid having code in TD/DD that I don't understand, because it's largely on me if something is broken elsewhere. But I have to imagine if nothing else a feature flag for time would be really easy, as should be fixing DD's 64 bit assumption (which is probably a bug anyhow).

from differential-dataflow.

antiguru avatar antiguru commented on July 30, 2024 1

I think all changes look fine from a high level. The time changes don't seem to change any behavior on non-wasm targets. The RHH code isn't used anywhere at this point. Before committing to supporting wasm, I'd like to know if there's a regression test we could have. It seems testing isn't as simple as it is for other targets, but if we can get it into CI then I don't think there's anything blocking this.

As @frankmcsherry points out, there are some parts of the code that might assume usize to have the same length as u64. Removing this assumption would be a win in itself.

from differential-dataflow.

frankmcsherry avatar frankmcsherry commented on July 30, 2024 1

I've started to take a look (sorry for the delay) and have some quick thoughts:

  1. The DD changes .. could probably just instead be deletion of the associated code. At least, the time changes are a. an unused _timer, and b. a YieldingIter that isn't used other than in some commented-out Kafka code (it could become commented out also, and place a similar obligation on uncommenting the code as the rdkafka connections). If we clean this all up, then no one needs timers at all in here.
  2. The DD changes miss an important use in the dogsdogsdogs project, which is the only load-bearing subproject (though I can understand that it isn't obvious). Specifically, half_join.rs uses Instant as an argument to the yield_function, allowing the user to call .elapsed() if they are interested and not otherwise. We could also just not have dogs3 compile on web assembly / delay that work until someone needs it.
  3. The TD changes are harder to avoid, but do seem some amount of harmless. There's some weirdly dead-ish code (viz Worker::timer()), but things like scheduling, sequencing, and activation seem to want to do things based on external notions of time, and logging just seems somewhat wedged without access to time.

I'm going to look into a DD PR that essentially tracks your changes (fixing RHH) but subtracts out the time-based code rather than modifies it. I'll report back here about that.

from differential-dataflow.

frankmcsherry avatar frankmcsherry commented on July 30, 2024

The RHH fix seems more subtle than at first glance. The code also has elsewhere this

        /// Indicates both the desired location and the hash signature of the key.
        fn desired_location<K: Hashable>(&self, key: &K) -> usize {
            let hash: usize = key.hashed().into().try_into().unwrap();
            hash / self.divisor
        }

where the try_into() is from u64 to usize.

The code is not currently active, though .. it could become so in the future when I get some time. Ideally it wouldn't silently break the WASM builds, though this is an example of where it could/would.

My understanding of WASM is that 64-bit types are fine, just that usize is 32 bits. So, probably the right thing here (and perhaps elsewhere) is to be more serious about either using u64 throughout, or pivoting to usize earlier (and using it throughout).

from differential-dataflow.

oli-w avatar oli-w commented on July 30, 2024

Apologies for my lack of reply sooner - I've been away. Thanks for making progress on these changes 😃.

The DD changes miss an important use in the dogsdogsdogs project

Ah yes, I haven't used that before.

We could also just not have dogs3 compile on web assembly / delay that work until someone needs it.

I'd be very happy with that - I have had a lot of fun composing DD operators without having to reach for what dogsdogsdogs provides.

My understanding of WASM is that 64-bit types are fine, just that usize is 32 bits

Yep that's right, similar to using u128 on a 64-bit machine. There may be other issues that aren't exposed until runtime of some code paths - full disclosure: I tried out making the surgical changes required to get DD working in WASM, found issues, fixed, rinse-and-repeat, rather than reading through all the code to find where usize being 32 might be problematic. The left shift was picked up by the compiler - you can reproduce with cargo build --target wasm32-unknown-unknown.

Let me know if there's something I can help with that doesn't involve changing production "load-bearing" code. E.g. setting up some tests as suggested - details for WASM: https://rustwasm.github.io/wasm-bindgen/wasm-bindgen-test/index.html. Note this does run the compiled WASM using Node JS. Or I can provide a basic WASM example for motivation - play around with an interactive dataflow by visiting a web page and clicking on some buttons.

from differential-dataflow.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.