qsx

Extended CSS selectors for querying the DOM and extracting parts of it. Used by the hred command-line tool.

Installation

The library is currently packaged in CJS (CommonJS) format, for Node.js.

npm install qsx

Usage

let qsx = require('qsx');
qsx(el, ':scope > a');

Node.js has no built-in DOM environment, so you need a library such as jsdom to provide one.

The query language

If you're familiar with CSS selectors and Element.querySelectorAll, you are mostly good to go. qsx introduces only a few differences and extensions, listed below.

Commas select parallel data

The CSS selector h2, h3 matches all elements that are either an h2 or an h3. In qsx, however, it selects all h2 elements, and all h3 elements, in separate arrays.

<h2>Installation</h2>
<h3>With npm</h3>
<h3>With yarn</h3>
<h2>Usage</h2>
<h3>From the command-line</h3>

<script>
	document.querySelectorAll('h2, h3');
	// =>
	['<h2>Installation</h2>', '<h3>With npm</h3>', ...]

	qsx(document, 'h2, h3');
	// =>
	[
		['<h2>Installation</h2>', '<h2>Usage</h2>'],
		['<h3>With npm</h3>', '<h3>With yarn</h3>', ...]
	]
</script>

Note: The :is() pseudo-class would have provided a mechanism to restore the CSS semantics to the comma. qsx(el, ':is(h2, h3)') could have been used to mean h2, h3. Unfortunately, at the moment it's unevenly implemented across browsers and in jsdom.
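To make the difference concrete without a DOM, here is a minimal sketch that mimics both behaviours over plain objects. The headings array and the matchAny / matchParallel helpers are hypothetical stand-ins for illustration, not part of qsx:

```javascript
// Plain objects standing in for the heading elements in the example above.
const headings = [
	{ tag: 'h2', html: '<h2>Installation</h2>' },
	{ tag: 'h3', html: '<h3>With npm</h3>' },
	{ tag: 'h3', html: '<h3>With yarn</h3>' },
	{ tag: 'h2', html: '<h2>Usage</h2>' },
	{ tag: 'h3', html: '<h3>From the command-line</h3>' }
];

// CSS-like comma: a single flat list, in document order.
const matchAny = (els, tags) =>
	els.filter(el => tags.includes(el.tag)).map(el => el.html);

// qsx-like comma: one array per selector, in the order the selectors were written.
const matchParallel = (els, tags) =>
	tags.map(tag => els.filter(el => el.tag === tag).map(el => el.html));

const flat = matchAny(headings, ['h2', 'h3']);
const parallel = matchParallel(headings, ['h2', 'h3']);

console.log(flat.length); // 5 items in one list
console.log(parallel.map(group => group.length)); // two lists: 2 h2s, 3 h3s
```

The flat list interleaves h2 and h3 in document order, while the parallel result keeps each selector's matches together.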

Curly brackets introduce sub-scopes

Whenever you use a pair of curly brackets {...}, you create a sub-scope.

Here's a query to pick the first and last columns off each row in the table below:

<table>
	<tbody>
		<tr>
			<td>1.1</td>
			<td>1.2</td>
			<td>1.3</td>
			<td>1.4</td>
		</tr>
		<tr>
			<td>2.1</td>
			<td>2.2</td>
			<td>2.3</td>
			<td>2.4</td>
		</tr>
	</tbody>
</table>

<script>
	qsx(document, `tr { :scope > td:first-child, :scope > td:last-child }`);
	// =>
	[
		[['<td>1.1</td>'], ['<td>1.4</td>']],
		[['<td>2.1</td>'], ['<td>2.4</td>']]
	];
</script>

Here's the equivalent query in vanilla querySelectorAll and JavaScript:

const arr = Array.from;
arr(document.querySelectorAll('tr')).map(tr => [
	arr(tr.querySelectorAll(':scope > td:first-child')).map(td => td.outerHTML),
	arr(tr.querySelectorAll(':scope > td:last-child')).map(td => td.outerHTML)
]);

Extracting HTML attributes and DOM properties

By default, for each leaf element in the query, qsx() returns its .outerHTML. Instead, we can extract specific attributes and properties:

  • @attr (the attribute accessor) extracts the attr HTML attribute via el.getAttribute('attr');
  • @.prop (the property accessor) reads the prop DOM property via el.prop;
  • @* (the attribute wildcard) extracts all the HTML attributes into an object via el.attributes.

This query extracts the href and label off each anchor element:

<ul>
	<li title="item 1"><a href="/first-link">First link</a></li>
	<li title="item 2"><a href="/second-link">Second link</a></li>
</ul>

<script>
	qsx(document, `a { @href, @.textContent }`);
	// =>
	[
		{ href: '/first-link', '.textContent': 'First link' },
		{ href: '/second-link', '.textContent': 'Second link' }
	];
</script>

Notice that, to prevent collisions between attribute and property names, the latter are always prefixed with . in the resulting JSON, similar to how they were defined in the query.
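The three accessors can be pictured as thin wrappers over the DOM calls listed above. The sketch below is only an illustration of that mapping, not qsx's actual implementation: fakeAnchor is a hand-made stand-in for a DOM element, and readAccessor / extract are hypothetical helpers.

```javascript
// A minimal stand-in for a DOM element, just enough for the three accessors.
const fakeAnchor = {
	textContent: 'First link',
	attributes: [
		{ name: 'href', value: '/first-link' },
		{ name: 'rel', value: 'nofollow' }
	],
	getAttribute(name) {
		const attr = this.attributes.find(a => a.name === name);
		return attr ? attr.value : null;
	}
};

// Hypothetical helper: resolve a single accessor string to its value.
function readAccessor(el, accessor) {
	if (accessor === '@*') {
		// Attribute wildcard: collect every attribute into an object.
		return Object.fromEntries(Array.from(el.attributes, a => [a.name, a.value]));
	}
	if (accessor.startsWith('@.')) {
		// Property accessor: plain property read, el.prop.
		return el[accessor.slice(2)];
	}
	// Attribute accessor: el.getAttribute('attr').
	return el.getAttribute(accessor.slice(1));
}

// Hypothetical helper for named accessors only (not the wildcard).
// Keys drop the @ but keep the leading dot for properties, as described above.
function extract(el, accessors) {
	const result = {};
	for (const accessor of accessors) {
		result[accessor.slice(1)] = readAccessor(el, accessor);
	}
	return result;
}

const extracted = extract(fakeAnchor, ['@href', '@.textContent']);
const allAttributes = readAccessor(fakeAnchor, '@*');
```

Because the property key keeps its dot, an element with both an href attribute and an href property could expose the two side by side as href and .href.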

Attributes, properties and scoped selectors can be combined at will. When present among other attributes / properties, scoped selectors are added under the .scoped key:

qsx(document, `li { a, @title }`);
// =>
[
	{
		title: 'item 1',
		'.scoped': ['<a href="/first-link">First link</a>']
	},
	{
		title: 'item 2',
		'.scoped': ['<a href="/second-link">Second link</a>']
	}
];

:scope and combining selectors

In stock Element.querySelectorAll, the :scope selector cannot be combined with the next-sibling selector (:scope + el), nor the subsequent-sibling selector (:scope ~ el).

qsx does not impose this limitation, so you can group attributes from things like definition lists:

<dl>
	<dt><a href="#ref1">First term</a></dt>
	<dd>First definition</dd>

	<dt><a href="#ref2">Second term</a></dt>
	<dd>Second definition</dd>
</dl>
<script>
	qsx(
		document,
		`dt {
			a { @href, @.textContent },
			:scope + dd { @.textContent }
		}`
	);
	// =>
	[
		[
			[
				{
					href: '#ref1',
					'.textContent': 'First term'
				}
			],
			['First definition']
		],
		[
			[
				{
					href: '#ref2',
					'.textContent': 'Second term'
				}
			],
			['Second definition']
		]
	];
</script>
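For comparison, a hand-written version of this pairing would typically walk nextElementSibling from each dt. The sketch below mimics that over plain objects rather than a real DOM. The nodes list and pairTerms helper are hypothetical, the anchor inside each dt is flattened into an href field for brevity, and the output shape is deliberately simpler than what qsx returns.

```javascript
// Plain objects standing in for the children of the <dl> above.
const nodes = [
	{ tag: 'dt', href: '#ref1', textContent: 'First term' },
	{ tag: 'dd', textContent: 'First definition' },
	{ tag: 'dt', href: '#ref2', textContent: 'Second term' },
	{ tag: 'dd', textContent: 'Second definition' }
];

// Link each node to the one after it, as nextElementSibling would in the DOM.
nodes.forEach((node, i) => {
	node.nextElementSibling = nodes[i + 1] || null;
});

// Pair every dt with the dd that immediately follows it, if there is one.
function pairTerms(list) {
	return list
		.filter(node => node.tag === 'dt')
		.map(dt => {
			const next = dt.nextElementSibling;
			return {
				href: dt.href,
				term: dt.textContent,
				definition: next && next.tag === 'dd' ? next.textContent : null
			};
		});
}

const pairs = pairTerms(nodes);
```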

Alias selectors with =>

Keys in the resulting JSON can be aliased to any other name, using => alias.

Alias HTML attributes and DOM properties:

qsx(el, 'a { @href => url, @.textContent => text }');

Alias individual scoped selectors:

qsx(el, 'tr { td:first-child => first, td:last-child => last }');

Alias the whole .scoped object:

qsx(el, 'tr { @title, td:first-child, td:last-child } => cells');
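Without aliases, the same renaming would have to happen after the query, in JavaScript or in a later pipeline step. Here is a minimal sketch of that post-processing, assuming the un-aliased result shape from the anchor example earlier. renameKeys is a hypothetical helper, not a qsx API, and the url / text names simply mirror the first alias query above.

```javascript
// Un-aliased result, as returned by a query like `a { @href, @.textContent }`.
const links = [
	{ href: '/first-link', '.textContent': 'First link' },
	{ href: '/second-link', '.textContent': 'Second link' }
];

// Rename keys according to a Map of old name -> new name; other keys pass through.
function renameKeys(obj, aliases) {
	return Object.fromEntries(
		Object.entries(obj).map(([key, value]) => [
			aliases.has(key) ? aliases.get(key) : key,
			value
		])
	);
}

const aliases = new Map([
	['href', 'url'],
	['.textContent', 'text']
]);

const renamed = links.map(link => renameKeys(link, aliases));
```

Keys without a leading dot are easier to address from tools further down the pipeline, which is the main motivation for aliasing.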

Spread operator

The special alias . will cause the object to be merged into the current context:

qsx(el, 'tr { td:first-child, td:last-child } => .');

Alternatively, you can use the ... (spread) operator for the same purpose:

qsx(el, 'tr ...{ td:first-child, td:last-child }');

Pick first result with ^

For more complex queries where the resulting JSON contains several nested arrays but you want a single element, prefix a selector with ^ to select just the first matching element, as querySelector() does, rather than every match, as querySelectorAll() does.

qsx(document, `li { ^ a, @title }`);

// =>
[
	{
		title: 'item 1',
		'.scoped': '<a href="/first-link">First link</a>'
	},
	{
		title: 'item 2',
		'.scoped': '<a href="/second-link">Second link</a>'
	}
];
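Compare this with the `li { a, @title }` result shown earlier, where each .scoped value is an array. Without ^, collapsing those arrays would be a separate post-processing step, roughly like the sketch below. The withArrays data is copied from that earlier output; falling back to null for an empty array is an assumption borrowed from querySelector(), not documented qsx behavior.

```javascript
// Result of `li { a, @title }`: .scoped holds an array of matches.
const withArrays = [
	{ title: 'item 1', '.scoped': ['<a href="/first-link">First link</a>'] },
	{ title: 'item 2', '.scoped': ['<a href="/second-link">Second link</a>'] }
];

// Keep only the first match under .scoped, or null when nothing matched.
const firstOnly = withArrays.map(item => ({
	...item,
	'.scoped': item['.scoped'].length > 0 ? item['.scoped'][0] : null
}));
```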

Some other situations will trigger first-result behavior even in the absence of the ^ prefix:

  • When requesting a direct attribute in a sub-scope: a { @href }
  • When using the . alias (as in a { @href, @.textContent } => .) or the spread ... operator

qsx's Issues

Transpile to UMD, CJS

...and use ES modules in the source. Not sure on the exact Node.js compat on the syntax I'm currently using.

Allow matching DOM properties

Extend HTML attribute selectors:

[href='...']

...to allow DOM properties:

[.textContent='...']

This will make qsx more useful for certain classes of XML (such as the result of a plutil -convert xml1 -o Bookmarks.xml Bookmarks.plist command).

Alias HTML attributes / DOM properties

Not essential, since the point is to use jq next in the pipeline, but the dots in the DOM properties and/or .scoped make the jq query more annoying.

Ability to specify one result (a la `querySelector`) instead of many

In some cases, the resulting JSON contains more nested arrays than what makes intuitive sense, because on each scope we perform querySelectorAll, even though the author's intent might be to select just one.

Provide a syntax to apply querySelector() instead of querySelectorAll() on a per-selector basis. Maybe an exclamation mark (!).
