
Comments (9)

jheer commented on May 14, 2024

While Arquero does not provide a built-in operator for this, you can add your own by extending Arquero via the addFunction method. The following code block can be directly pasted into an Observable cell:

{
  // generate a recode function for the given recoding parameters
  function recoder(map, other) {
    return value => map.has(value) ? map.get(value)
      : other === undefined ? value
      : other;
  }

  // register a custom recode function
  aq.addFunction(
    'my_recode',
    recoder(new Map([['foo', 'farp'], ['bar', 'borp']]), 'other'),
    { override: true } // suppress errors if we re-evaluate this code
  );

  // apply the recode function to a data table
  return aq.table({ x: ['foo', 'bar', 'baz'] })
    .derive({ z: d => op.my_recode(d.x) })
    .toJSON();

  // returns: '{"x":["foo","bar","baz"],"z":["farp","borp","other"]}'
}

from arquero.

juba commented on May 14, 2024

Yes, absolutely. I should have mentioned that I already use this approach. I also added custom (quick and dirty) "batch_derive" and "batch_recode" functions to tables, thanks to Arquero's awesome customisability.

Many thanks for your detailed answer. Your recoder code is of course better than mine, so I'll gladly take it!


jheer commented on May 14, 2024

An op.recode() method is now staged for v1.2.0. Unlike my earlier example, it uses a vanilla object to specify the value map, as this can be specified and serialized more easily in Arquero table expressions:

table.derive({ val: d => op.recode(d.val, { oldA: 'newA', oldB: 'newB' }, '?') });

This also avoids the need to register a new function, though you can still define the map outside of Arquero and then bind it as a parameter:

const map = { oldA: 'newA', oldB: 'newB' };
table.params({ map }).derive({ val: (d, $) => op.recode(d.val, $.map, '?') });


juba commented on May 14, 2024

Thanks a lot!
Please excuse me if I am wrong, but one reason I used a Map instead of an Object is that a Map accepts keys of any type, so with it I can recode numbers or even undefined values. I think this would not be possible with a vanilla object?


jheer commented on May 14, 2024

Any value that can be coerced to a string (including numbers, booleans, null, undefined, dates, etc) can be recoded with the new function, which suffices for many use cases I've encountered. However, there can of course be collisions (e.g., the string 'true' and boolean true map to the same key). If strict object equality is a requirement for you, I would stick with the earlier solution.
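
To make the collision concrete, here is a minimal sketch in plain JavaScript. It does not involve Arquero, and the `lookup` object is purely illustrative. Property keys on a plain object are stored as strings, so values of different types that stringify identically share a single entry:

```javascript
// Illustrative lookup table; a plain object stores its keys as strings.
const lookup = { true: 'yes', 2: 'two', undefined: 'missing' };

// Property access coerces the key to a string before the lookup.
const fromBoolean = lookup[true];          // key becomes 'true'
const fromString = lookup['true'];
const fromNumber = lookup[2];              // key becomes '2'
const fromNumericString = lookup['2'];
const fromUndefined = lookup[undefined];   // key becomes 'undefined'

console.log(fromBoolean === fromString);        // true
console.log(fromNumber === fromNumericString);  // true
console.log(fromUndefined);                     // 'missing'
```

Whether this merging is acceptable depends on whether the column being recoded can actually contain both forms of a value.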


juba commented on May 14, 2024

On the one hand, a vanilla object is easier to create, whereas a Map is a bit more complicated with new Map([[..]]). On the other hand, I think being able to declare precise recodings, not prone to errors or confusion due to conversion to strings, is really important.

I imagine accepting both and testing for the type before recoding would add unnecessary code complexity?

Many thanks for looking into it.


jheer commented on May 14, 2024

Can you provide a specific example (illustrative of a common real-world use case) where the string conversion is notably problematic? If you are recoding primitive types or dates, string coercion works reasonably well. Whether loose equivalence (e.g., 2 and '2' being equivalent for recoding) is a "feature" or a "bug" is use case dependent.

Meanwhile, here is an example where Map's use of strict object equality can lead to potentially unexpected results:

const d1 = new Date(2000, 0, 1);
const d2 = new Date(2000, 0, 1);

const m = new Map().set(d1, 'foo');
m.get(d1) // 'foo'
m.get(d2) // undefined

const o = {[d1]: 'foo'};
o[d1] // 'foo'
o[d2] // 'foo'

And you can always use the alternative method above (with custom function registration) if you need it!


jheer commented on May 14, 2024

I thought about this some more and realized there is no reason op.recode can't support either an Object or a Map. I've updated the v1.2.0 branch accordingly. Note that (for the time being) when using a Map you must define it externally and bind it as a parameter, as Arquero table expressions do not permit use of the new operator.
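
As a rough illustration of what supporting both inputs can look like, here is a sketch in plain JavaScript. The `recode` helper below is hypothetical and is not Arquero's actual implementation; it assumes the same pass-through behavior as the earlier `recoder` example when no fallback is given:

```javascript
// Hypothetical helper, NOT Arquero's implementation: recode a value using
// either a Map (keys compared without string coercion) or a plain object
// (keys compared after coercion to strings).
function recode(value, map, fallback) {
  if (map instanceof Map) {
    return map.has(value) ? map.get(value)
      : fallback === undefined ? value
      : fallback;
  }
  const key = String(value);
  return Object.prototype.hasOwnProperty.call(map, key) ? map[key]
    : fallback === undefined ? value
    : fallback;
}

const byObject = { 2: 'two' };
const byMap = new Map([[undefined, 'missing'], ['undefined', 'literal text']]);

const a = recode(2, byObject, '?');         // 'two'
const b = recode('2', byObject, '?');       // 'two' (same string key)
const c = recode(undefined, byMap, '?');    // 'missing'
const d = recode('undefined', byMap, '?');  // 'literal text'
const e = recode('baz', byMap);             // 'baz' (no fallback, value passes through)
```

The Map branch keeps `undefined` and `'undefined'` distinct, while the object branch treats `2` and `'2'` as the same key.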


juba commented on May 14, 2024

I think that would be perfect: a simpler solution with a vanilla object that suits most use cases, plus the ability to use a Map when the recoding needs to be very specific.

You're right, though, that Map use cases should be quite rare. I can't find any cases other than the one you mentioned, where the same value appears with different types in the same variable (such as undefined and "undefined") and you want to recode them to different values.

Many thanks again for implementing this.

