Code Monkey home page Code Monkey logo

proposal-string-matchall's Introduction

String.prototype.matchAll

Proposal and specs for String.prototype.matchAll.

Polyfill/Shim

See string.prototype.matchall on npm or on github.

Spec

You can view the spec in markdown format or rendered as HTML.

Rationale

If I have a string, and either a sticky or a global regular expression which has multiple capturing groups, I often want to iterate through all of the matches. Currently, my options are the following:

var regex = /t(e)(st(\d?))/g;
var string = 'test1test2';

string.match(regex); // gives ['test1', 'test2'] - how do i get the capturing groups?

var matches = [];
var lastIndexes = {};
var match;
lastIndexes[regex.lastIndex] = true;
while (match = regex.exec(string)) {
	lastIndexes[regex.lastIndex] = true;
	matches.push(match);
	// example: ['test1', 'e', 'st1', '1'] with properties `index` and `input`
}
matches; /* gives exactly what i want, but uses a loop,
		* and mutates the regex's `lastIndex` property */
lastIndexes; /* ideally should give { 0: true } but instead
		* will have a value for each mutation of lastIndex */

var matches = [];
string.replace(regex, function () {
	var match = Array.prototype.slice.call(arguments, 0, -2);
	match.input = arguments[arguments.length - 1];
	match.index = arguments[arguments.length - 2];
	matches.push(match);
	// example: ['test1', 'e', 'st1', '1'] with properties `index` and `input`
});
matches; /* gives exactly what i want, but abuses `replace`,
	  * mutates the regex's `lastIndex` property,
	  * and requires manual construction of `match` */

The first example does not provide the capturing groups, so isn’t an option. The latter two examples both visibly mutate lastIndex - this is not a huge issue (beyond ideological) with built-in RegExps, however, with subclassable RegExps in ES6/ES2015, this is a bit of a messy way to obtain the desired information on all matches.

Thus, String#matchAll would solve this use case by both providing access to all of the capturing groups, and not visibly mutating the regular expression object in question.

Iterator versus Array

Many use cases may want an array of matches - however, clearly not all will. Particularly large numbers of capturing groups, or large strings, might have performance implications to always gather all of them into an array. By returning an iterator, it can trivially be collected into an array with the spread operator or Array.from if the caller wishes to, but it need not.

Previous discussions

Naming

The name matchAll was selected to correspond with match, and to connote that all matches would be returned, not just a single match. This includes the connotation that the provided regex will be used with a global flag, to locate all matches in the string. An alternate name has been suggested, matches - this follows the precedent set by keys/values/entries, which is that a plural noun indicates that it returns an iterator. However, includes returns a boolean. When the word is not unambiguously a noun or a verb, "plural noun" doesn't seem as obvious a convention to follow.

Update from committee feedback: ruby uses the word scan for this, but the committee is not comfortable introducing a new word to JavaScript. matchEach was suggested, but some were not comfortable with the naming similarity to forEach while the API was quite different. matchAll seems to be the name everyone is most comfortable with.

In the September 2017 TC39 meeting, there was a question raised about whether "all" means "all overlapping matches" or "all non-overlapping matches" - where “overlapping” means “all matches starting from each character in the string”, and “non-overlapping” means “all matches starting from the beginning of the string”. We briefly considered either renaming the method, or adding a way to achieve both semantics, but the objection was withdrawn.

proposal-string-matchall's People

Contributors

ljharb avatar claudepache avatar lencioni avatar

Watchers

James Cloos avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.