lorenzofox3 commented on August 21, 2024

Compared to v1, it will be a bit tricky to implement only. However, skip can be done quite easily.

from zora.

dy commented on August 21, 2024

Yes, but otherwise that is never close to tape, unfortunately (I know it by developing tst)

TehShrike commented on August 21, 2024

I would appreciate only.

lorenzofox3 commented on August 21, 2024

only is a bit tricky with the "run fast, in parallel" design of zora.

It is actually more like "skip all others" and would require changing the design into a "collect tests" phase and a "run spec functions" phase, so that if an "only" is found during collection, all other regular tests are marked as skipped.

This is still doable even with the current design; it actually used to be the case in v1 of zora.
However, it starts to become inconsistent with nested tests, as the parent test needs to run its specification function in order to collect its child tests and eventually find an "only".

It also depends on the intent behind the "only", especially with zora, which runs tests really fast compared to other tools and makes "only" less needed:
Is it because you want only a particular spec to be logged, to avoid noise when working on the related test? In that case it is pretty easy to implement (or even to move to the reporter process as a TAP filter, or to the runner level if you use one), and one could actually do so without touching the test files. For example, logging only the specs with the "multiply" keyword could be done like this:
node ./test.js | some-tap-filter-program --spec multiply
The downside is that other specs would still run silently, along with any side effects they have.
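To make the filter idea concrete, here is a minimal sketch of what such a program could do. The name filterTap is hypothetical, and the sketch assumes a flat TAP stream in which every test starts with a "# title" comment line; it treats any other comment as a title too and drops the plan line, since the count no longer matches after filtering.

```javascript
// Hypothetical helper (not an existing tool): keep only the TAP sections
// whose "# title" comment contains `keyword`.
function filterTap(tap, keyword) {
  const kept = [];
  let keep = false;
  for (const line of tap.split('\n')) {
    if (line.startsWith('TAP version')) {
      kept.push(line); // always keep the version header
    } else if (line.startsWith('# ')) {
      keep = line.includes(keyword); // a comment line opens a new section
      if (keep) kept.push(line);
    } else if (/^\d+\.\.\d+$/.test(line)) {
      keep = false; // plan line: dropped, the count is stale after filtering
    } else if (keep && line.trim() !== '') {
      kept.push(line);
    }
  }
  return kept.join('\n');
}

const sample = [
  'TAP version 13',
  '# add two numbers',
  'ok 1 - should be equivalent',
  '# multiply two numbers',
  'ok 2 - should be equivalent',
  'not ok 3 - should be truthy',
  '1..3',
].join('\n');

console.log(filterTap(sample, 'multiply'));
```

A real filter would read from stdin and would have to decide what to do with the plan and the summary; this only shows the selection logic.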

So to sum up:

  1. we could easily implement a .only method at root level but it would be more complicated at nested level.
  2. If the intent is to avoid noise pollution in the console, I would rather create a specific reporter

rixo commented on August 21, 2024

I'm just starting with zora so I might have missed something obvious, but with other tools I'm an avid user of .only so I miss it dearly here.

The filter / reporter approach is probably best when moving forward, happily adding new tests / code. If the suite is fast enough to run all the tests on each change, it's great to have only outputs of what you're working on... and the one you've just broken.

For me however, the most important use case of .only is to run a given code path in isolation, mostly for diagnostic purpose. So that, for example, I can put a debugger or console.log somewhere in the code under test and be sure of the conditions (i.e. the specific test) that triggered it. So a filter / reporter won't do for that.

we could easily implement a .only method at root level but it would be more complicated at nested level.

I'd be cool with having to call .only on the whole parent chain to focus a nested test if that's the price to pay for keeping the implementation light. It remains important to be able to skip siblings of the focused test though.

test(...)

only('', t => { // <- all spec functions that are executed are marked with an arrow
  t.test(...)
  t.only('', async t => { // <-
    await t.test(...)
    t.only(...) // <- only this one runs
  })
  t.ok(...) // <- obviously this runs too
  t.test(...)
})

// in this situation running "nothing" would be acceptable
only('', t => { // <-
  t.only('', t => { // <-
    t.test(...)
    t.test(...)
  })
  t.test('', t => {
    t.only(...)
  })
})

test(...)

Although there are also times where you want to focus on the group of tests you're working on for performance reasons -- sometimes it's the tested objects themselves that are slow.

So maybe another function could let the user pick the right behaviour for their current situation?

test(...)

focus('', t => { // <-
  t.test('', t => { // <-
    t.test(...) // <-
    t.test(...) // <-
  })
  t.only('', t => { // <- only affects its own children
    t.test(...)
    t.only(...) // <-
  })
  t.test(...) // <-
})

test(...)

It could also make it possible to break out of "only jail" at any level of nesting:

only('', t => { // <-
  t.test(...)
  t.focus('', t => { // <-
    t.test(...) // <-
    t.only(...) // <-
    t.test(...) // <-
  })
  t.test(...)
})

This would be possible without a collect phase, wouldn't it? At non-root level at least. Why is it easier to do at root level, by the way? Having to execute the module's root code looks like a similar problem to running a specification function 🤔

lorenzofox3 commented on August 21, 2024

diagnostic

For me however, the most important use case of .only is to run a given code path in isolation, mostly for diagnostic purpose. So that, for example, I can put a debugger or console.log somewhere in the code under test and be sure of the conditions (i.e. the specific test) that triggered it. So a filter / reporter won't do for that.

side note

I would avoid console.log for diagnostic as the tests run in parallel and console.log is used to output the TAP stream, so you might have various messages mixed up.
However we can add a comment method to the assertion object to output a message at the right place in the stream:

test('a test', t=> {
   let val = 0;
   console.log('foo')
   t.comment(`val is ${val}`);
   t.eq(val, 0);
   val += 1;
   console.log('bar');
   t.comment(`val is ${val}`);
   t.eq(val, 1);
});

the output will likely be

foo
bar
TAP version 13
# a test
# val is 0
ok 1 - should be equivalent
# val is 1
ok 2 - should be equivalent
1..2

# ok
# success: 2
# skipped: 0
# failure: 0

debug

Which leaves you with breakpoints. These should work as expected, as they track JavaScript execution.
I see how it could be a bother if you put the breakpoint in application code which may be run by other tests. However, if you set the breakpoint in the testing function you should not have any problem... though that is not always ideal if you have a big call stack to traverse before finally reaching the application code in question.

Is implementation at library level worth it ? (open question)

Yet I agree it is still nice to quickly narrow the testing scope on a single or several tests:

  1. for speed in some cases as you mentioned (although it should not often be the case with zora)
  2. side effects and especially thrown errors (if you are in the middle of a big refactoring which breaks code at different places for example).

To solve these potential problems, I usually try to group functionally similar tests together in the same file so I can run the specific file instead of the whole suite. I may use some skip in this file if it is really needed.

Semantically, only is similar to "skip all others" and is more of a convenience. skip is already supported and is part of the TAP specification. It also makes sense from the perspective of the testing program: you can withdraw a test for a good reason (e.g. the database is not yet available in the QA environment).
Whereas only remains, in my opinion, a bit ambiguous, particularly with nested tests (see below).

I think it should rather be handled at test runner level as it is more in the domain of user experience: from a testing program point of view it does not make sense to "only" run one (or even some) test(s). If you start to add only here and there in your testing program for your own convenience, you then have to change them back to make the program valid for the rest of the team.
In other words: you change the nature of the program whereas what you want is to change the way it is run.

Root level

Implemented by the test runner

Let's consider root level only for the moment.

You can have an only functionality pretty easily:

// tester.js
import {createHarness} from 'zora';

const harness = createHarness();

// We decorate the test function
export const test = (desc, fn) => {
    if(process.env.ONLY && desc.includes('#only') === false){
        return harness.skip(desc, fn);
    }
    return harness.test(desc, fn);
};

export const report = async () => {
    try {
        await harness.report();
    } catch (e) {
        console.error(e);
        process.exit(1);
    } finally {
        process.exit(harness.pass ? 0 : 1);
    }
};

// spec.js
import {test} from './tester.js';

test('should not run', t=> {
    t.fail('do not run');
});

test('should run #only', t=> {
    t.ok(true, 'I ran');
});

test('should not run either', t=> {
    t.fail('do not run');
});

// index.js
import {report} from './tester.js';

report();

and the following scripts to run the tests

// package.json
{
  "scripts": {
    "test": "node -r esm -r ./spec.js ./index.js", // will run the whole suite (and fail in this case)
    "test:only": "ONLY=true npm t" // will run tests marked as "only"
  }
}

You can see this example in the following gist.

If implemented at library level

If we still believe it is worth it at zora's API level, the implementation would be (without considerably changing the design) more or less:

  1. Collect a test when zora.test is called (at the moment it also immediately triggers the spec function for better performance and simplicity)
  2. If zora.only -> mark all other tests as skipped
  3. On "next tick", when all tests have been collected synchronously, run the spec functions and start the reporting stream.

This is doable (it was implemented in v1) and would not affect much performance, from what I recall.
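As a rough illustration of the three steps above (not zora's actual code), a root-level "collect, then run" harness could look like the sketch below. createCollector and its return shape are invented for the example, and run is called directly here where a real harness would schedule it on the next tick.

```javascript
// Hypothetical two-phase harness: collect root-level tests first, run later.
function createCollector() {
  const collected = [];
  let hasOnly = false;

  // Phase 1: registering a test only stores it, the spec is not called yet.
  const test = (description, spec) => {
    collected.push({ description, spec, only: false });
  };
  const only = (description, spec) => {
    hasOnly = true;
    collected.push({ description, spec, only: true });
  };

  // Phase 2: once everything is collected, skip regular tests if any
  // "only" was seen, and run the remaining spec functions.
  const run = () => collected.map(({ description, spec, only: isOnly }) => {
    if (hasOnly && !isOnly) {
      return { description, skipped: true };
    }
    spec();
    return { description, skipped: false };
  });

  return { test, only, run };
}

const ran = [];
const { test, only, run } = createCollector();
test('test 1', () => ran.push('test 1'));
only('test 2', () => ran.push('test 2'));
test('test 3', () => ran.push('test 3'));

const results = run();
console.log(results, ran);
```

Note that this only works because the root-level calls are all made before run; it says nothing about tests declared inside a spec function, which is the nested case discussed next.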

Nested

Now if you decide to have an only in a sub test, I find it unclear (even from your examples) what the behavior should be in some cases. It would also be inconsistent with the spec/collection semantics.
We can make some assumptions for the most general cases, but that is usually a bad decision in programming.

This would be possible without a collect phase, wouldn't it? At non-root level at least. Why is it easier to do at root level, by the way? Having to execute the module's root code looks like a similar problem to running a specification function

You are right. In essence the module's root code is nothing more than a big spec function. But in practice most of the testing programs are made of root test statements. I don't see many top level assertions and we could simply decide to remove them.

On the other hand, as soon as you run a spec function, it runs to completion or fails (unless you use some sophisticated coroutines with generators).
And that's the whole problem with sub tests: the spec function of a parent will most likely mix assertions, sub test declarations and possibly other unrelated code.

Consider the number of uncertainties in the following program:

test(`test 1`, t => {
    t.fail(`I am not in the only`);
});

only(`test 2`, t => {
    t.ok(true, 'pass');
});

const otherUnrelatedCodeWhichMayThrow = () => {
    throw new Error(`oh noooo`);
};

only(`other 3`, t => {

    t.ok(false, 'I am an assertion, should I run or not?');

    // should I run ? whatever I will do anyway...
    otherUnrelatedCodeWhichMayThrow();

    t.test('I am a sub test, should I run? how am I different from a regular assertion statement ?', t => {
        t.ok(false, 'failing');
    });

    t.only('I should definitely run as I am explicitly set as only', t =>{
        t.only(`I am even more nested - hey don't forget to remove this 'only chain' before you commit`, t=>{
           t.ok(true);
        });
    });
});

Another issue I foresee (inherent to zora's design): zora allows you to control the sequence of your test with regular async control flow.
It will eventually lead to impossible states:

only(`some tests I want to run in sequence because they share a common state`, async t => {

    let state = {}; // shared state

    // should I run ?
    await someAsynchronousSideEffectOnState(state);

    // well, if I don't run ...
    await t.test('first test', async t => {
          const result = await otherAsyncFunctionWithSideEffectOnState(state);
          t.eq(result, someExpectation);
    });

    // ... I should not exist (I depend on the previous state modifications)
    await t.only('second test', async t => {
       // whatever test which depends on the result of previously executed code
    });

});

For information, tape does not allow you to write more than one only (probably the real semantics of a possible only API) and does not allow you to nest them. I am not familiar with other testing libraries, but I am pretty sure that if they implement it, it must be buggy or inconsistent unless they impose some big restrictions like tape does. Otherwise they may run the tests (slowly) in sequence, which makes an only API badly needed.

Conclusion

So my position on this is:

  • only has no clear semantics
  • only at zora's API level does not make much sense or is very complex by design (especially for nested tests)
  • only is in the domain of user experience and may be useful
  • it is relatively easy to work around to get the same user experience
  • zora aims to be simple while being easily extensible, so that only, snapshot testing, test runners, etc. can be implemented on top of zora

and when I see the balance between the benefits and the trade-offs, I am quite reluctant to implement it.

rixo commented on August 21, 2024

Thank you for your detailed answer. Helps a lot.

So, test runners are probably the obvious thing I had overlooked...

I would love to see full working examples of zora setups. That would be of great help to understand the "zora way". In particular, I think the most useful would be an example on how to best imitate mocha (describe / it interface, spec reporter with indentation and watch), as well as examples of what you think are most idiomatic / best zora ways (probably a minimal setup example for fast integration scenarios, and a full-fledged example with all the bells and whistles). Just throwing some ideas here, you're already doing a fantastic job answering questions in the issues.

Nested only implemented by the test runner

So, building upon your example, I managed to implement only as per my above (and below) specs, including in sub tests, relatively easily. Here's a gist.

There is one thing that is badly broken with my implementation unfortunately. Since I've wrapped the spec functions, when a test fails, zora always reports the failure location as somewhere in my helper's code, instead of the actual site of the failure. For example, pointing at this line:

      ---
        wanted: "fail not called"
        found: "fail called"
        at: " /home/eric/projects/zora/zora-only/test/zora-only.js:40:25"
        operator: "fail"
      ...

I'm a bit surprised that zora apparently tries to report the test rather than the actual line / trace where the exception has been raised... Is that a TAP thing?

how [is a test] different from a regular assertion statement?

Is there something I can do to fix this? (Also this doesn't seem to play too well with esm...)

Implementation at lib level

Anyhow, you're most probably right. only is a higher level construct than skip (because skip must know some internals, how to generate TAP output), and can be implemented on top of it.

That means only should probably not be implemented in zora's "core" since it can be done at a higher level -- and, let's be honest, the semantics are all but open for debate.

So the question shifts to: How should a "zora-only" be published? Another package? Some "extras" layer in zora itself? I have no idea personally, although, as a zora newcomer, I'd appreciate having it bundled with the library -- but I see how this will probably evolve once I know the ecosystem better...

Semantics of only

I should probably have written them down to make this explicit, but the semantics I proposed for only and focus were designed to be absolutely unambiguous to both the user and the test runner.

They are designed to be applicable in a single forward pass because, like you, I think that auto discovery of only calls in non-only blocks is intractable (once you've discovered that a block should not run, you've already run it).

So the rules are:

  • in "only mode" (e.g. enabled by process.env.ONLY):

    • every top level test is skipped
    • every top level only and focus test is run
  • in non "only mode":

    • every only and focus raises an exception (this is a protection against forgotten debug code)
    • every test and everything else runs normally
  • directly nested inside an only test:

    • every test is skipped
    • every only and focus test is run
  • directly nested inside a focus test:

    • every test, only, and focus test is run
  • in every spec function that is run as per the previous rules:

    • everything (assertions, custom code, etc.) is run and processed normally

I'm absolutely not a fan of the resulting "only chain" but it is a necessary compromise to avoid the need for a collect phase.
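Here is a minimal sketch of these rules applied in a single forward pass, with no collect phase. Everything in it (createRunner, the strict flag, the log array) is made up for illustration and is not zora's API. It handles synchronous spec functions only, and it assumes that a regular test which does run lets all of its children run.

```javascript
// Hypothetical runner implementing the only / focus rules listed above.
function createRunner({ onlyMode }) {
  const log = []; // descriptions of the spec functions that actually ran

  // `strict` means: plain `test` children are skipped in this scope.
  const scope = (strict) => {
    const start = (description, spec, childStrict) => {
      log.push(description);
      spec(scope(childStrict));
    };
    const guard = (name) => {
      // protection against forgotten debug code
      if (!onlyMode) {
        throw new Error(`${name}() called outside of "only mode"`);
      }
    };
    return {
      test(description, spec) {
        if (!strict) start(description, spec, false);
      },
      only(description, spec) {
        guard('only');
        start(description, spec, true); // siblings of nested onlys are skipped
      },
      focus(description, spec) {
        guard('focus');
        start(description, spec, false); // everything nested runs
      },
    };
  };

  return { ...scope(onlyMode), log };
}

const runner = createRunner({ onlyMode: true });
runner.test('a', () => {}); // skipped: top level test in only mode
runner.only('b', (t) => {
  t.test('b1', () => {}); // skipped: directly nested inside an only
  t.only('b2', () => {}); // runs
  t.focus('b3', (t) => {
    t.test('b3.1', () => {}); // runs: directly nested inside a focus
  });
});

console.log(runner.log);
```

The decision to run or skip a test depends only on the scope it is declared in, which is why no look-ahead is needed.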

The consistency of test state resulting from zora's control flow (async / await) remains a responsibility of the user. They are able to finely control and easily predict what will run or not, so that doesn't seem like an issue to me. In most cases I guess it should just work, except if a test shares and depends on state changed by a previous (skipped) test. I think it's fair to let the user sort this kind of mess out.

lorenzofox3 commented on August 21, 2024

I would love to see full working examples of zora setups

Yes, that is an excellent idea. I started a recipes list some time ago, and I think it would be appropriate to publish it. There are already a few shared examples here and there in the issues section, but they are clearly not visible enough.

That would be of great help to understand the "zora way". In particular, I think the most useful would be an example on how to best imitate mocha (describe / it interface, spec reporter with indentation and watch), as well as examples of what you think are most idiomatic / best zora ways (probably a minimal setup example for fast integration scenarios, and a full-fledged example with all the bells and whistles). Just throwing some ideas here, you're already doing a fantastic job answering questions in the issues.

Well, that is the point: there is no "zora way". Zora is just a library to write ECMAScript testing programs which output TAP streams:

  • You can use whatever test runner you want - often Node or any Browser is enough
  • If you need code transformation, you are free to (and have to) do it yourself the way you want.
  • If you need custom reporting, you delegate it to another process

"Imitating Mocha" already brings a lot of opinions!

I wrote an article to explain how it fits in the Unix philosophy and where it stands in the testing ecosystem. In essence it is way closer to tape (although tape was designed for Node.js) than to batteries-included frameworks (node-tap, ava, jest, mocha, etc).

It is its strength and its "weakness" at the same time. It is very flexible, whereas sometimes you will need to build on top of it to have a better user experience. Ideally, I hoped for some more opinionated tools on top of zora, but that is certainly not the goal of zora itself. In many cases it is trivial, yet you have to do it.

For example, the following test runner is a CLI for Node.js testing programs. It comes with its own decisions on which problems to solve and how to solve them, which may make it user friendly in some situations:

  • It is a CLI
  • It uses indented tap stream
  • You can select a custom reporter (in the list of tap-mocha-reporter)
  • You can pick up files you want to test with glob syntax (with common default).
  • You can say you want to use EcmaScript Module syntax
  • It supports "only" feature
  • It handles process exit code
  • You can have code coverage for free with c8

example of usage:
c8 zn --esm will run every test file matching *.spec.js while supporting ESM syntax using the default mocha reporter ("classic") with code coverage and with the "proper" exit code

Yet the code of the runner remains simple; that is typically the kind of tool I would expect on top of zora. You can always build more sophisticated runners (which spawn child processes, compile TypeScript, watch src/test files, print errors in red, etc).

So, building upon your example, I managed to implement only as per my above (and below) specs, including in sub tests, relatively easily. Here's a gist.

Great, it seems promising.

I'm a bit surprised that zora apparently tries to report the test rather than the actual line / trace where the exception has been raised... Is that a TAP thing?

Nope, the idea is to point to the first line of the stack trace in user land, which is often the failing assertion in a user's spec function: basically where you would start the investigation. I think that is the expected behavior, as it seems to be shared across many testing frameworks (although they often have a fancier way to display it).
Example in AVA:
[image: AVA example of reporting a failing test]

It is consistent with what you see: the first line not in zora core would be... your wrapper. I think you can work around this issue if you explicitly name your wrapping function "zora_spec_fn" (see the example in the test runner above).
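To illustrate why a well-known function name helps, here is a small sketch of the general idea. It is not zora's source code: locateCaller, wrap and the fake assertion object are hypothetical, and the stack format assumed is the one produced by V8 (Node, Chrome). The named wrapper shows up as its own frame, so the frame just above it is the user's spec function.

```javascript
// Illustration only: find the stack frame sitting just above a wrapper
// that was deliberately named `zora_spec_fn`.
function locateCaller() {
  const frames = new Error().stack
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.startsWith('at '));
  const wrapperIndex = frames.findIndex((line) => /^at (async )?zora_spec_fn\b/.test(line));
  // Fall back to the innermost frame when no wrapper frame is found.
  return wrapperIndex >= 1 ? frames[wrapperIndex - 1] : frames[0];
}

// Hypothetical wrapper, as a custom test helper might define it.
// The function name is what matters here, not the variable name.
const wrap = (spec) => function zora_spec_fn(t) {
  return spec(t);
};

let reported;
const t = { fail() { reported = locateCaller(); } };

function userSpec(t) {
  t.fail(); // the location we would like to see in the report
}

wrap(userSpec)(t);
console.log(reported);
```

With an anonymous wrapper there is no such marker frame, and the first frame outside the library is the helper itself, which matches the behavior described above.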

The specs you provided for only and focus seem robust while remaining ... complex :)

The consistency of test state resulting from zora's control flow (async / await) remains a responsibility of the user. They are able to finely control and easily predict what will run or not, so that doesn't seem like an issue to me. In most cases I guess it should just work, except if a test shares and depends on state changed by a previous (skipped) test. I think it's fair to let the user sort this kind of mess out.

Yes, you are definitely right. Yet my experience in maintaining a few open source libraries is that people open tickets for "bugs" when the cause is often a "wrong usage" of the library.

rixo commented on August 21, 2024

Yes, recipes would probably better suit the philosophy here. The multiple solutions approach you have in the article is very interesting. I'd love to see how you tackle the watch problem ;) That's not just a question of "ways", there are also tools, and technicalities, like zora_spec_fn for example.

Said zora_spec_fn works like a charm for me, thanks! I get my trace just where I want it, just like you said. That's precisely the point where I'd need to put a debugger or a console.trace if I needed the full stack trace, by the way. And to run only one code path.

So, now that the line numbers are fixed, I am personally very satisfied with my own solution. I also understand how you may not want to have to provide support & explain its unobvious behaviour. Yet I don't think there's a better simple alternative (it goes unspecifiable, as you've noted). As a consequence, in regard to this issue, I guess only should not be part of the core library. I, at least, am happy with it.

In fact, I love to hack my test tools per project. I already do it all the time. In many cases, it would have been easier with zora because of the things it doesn't have.

Thanks again for all your help! I've got another question, but I'll open another issue to keep it clean (if I don't find an already existing one).

lorenzofox3 commented on August 21, 2024

In the end, I implemented it partially: you convinced me :)

  • only is implemented following your spec with the "only" mode.
  • I liked how you addressed the issue of making sure code with only is not committed by throwing an error: so, like you did, the mode can be switched with an env variable or a global (for browsers).

Thanks for the good ideas and the implementation details
