
Webref

Description

This repository contains machine-readable references of CSS properties, definitions, IDL, and other useful terms that can be automatically extracted from web browser specifications. The contents of the repository are updated automatically every 6 hours (note, however, that information about published /TR/ versions of specifications is updated only once per week).

Specifications covered by this repository are technical Web specifications that are directly implemented or that will be implemented by Web browsers; in other words, those that appear in browser-specs.

The main branch of this repository contains automatically-generated raw extracts from web browser specifications. These extracts come with no guarantee on validity or consistency. For instance, if a specification defines invalid IDL snippets or uses an unknown IDL type, the corresponding IDL extract in this repository will be invalid as well.

The curated branch contains curated extracts. Curated extracts are generated from raw extracts in the ed folder by applying manually-maintained patches to fix invalid content and provide validity and consistency guarantees. The curated branch is updated automatically whenever the main branch is updated, unless patches need to be modified (which requires manual intervention). Curated extracts are published under https://w3c.github.io/webref/ed/.

Additionally, subsets of the curated content get manually reviewed and published as NPM packages (@webref/css, @webref/elements, @webref/events, @webref/idl) on a weekly basis.

Important: Unless you are ready to deal with invalid content, we strongly recommend that you process the contents of the curated branch or the NPM packages instead of the raw content in the main branch.

Available extracts

This repository contains raw and curated information about the latest Editor's Drafts of Web specifications in the ed folder, as well as raw information about the latest published versions (for /TR/ specifications) in the tr folder.

More often than not, published versions of specifications are much older than their latest Editor's Draft. As a result, data in the tr folder tends to be more invalid and inconsistent than data in the ed folder. Additionally, no attempt is made to curate data in the tr folder, so use that folder at your own risk!

The following subfolders in the curated branch contain individual machine-readable JSON or text files generated from specifications:

  • ed/css: CSS terms (properties, descriptors, value spaces). One file per specification series.
  • ed/dfns: <dfn> terms, along with metadata such as linking text, access level, namespace. One file per specification.
  • ed/elements: Markup elements defined, along with the interface that they implement. One file per specification.
  • ed/headings: Section headings. One file per specification.
  • ed/idl: Raw WebIDL index. One file per specification series.
  • ed/idlnames: WebIDL definitions per referenceable IDL name. One file per IDL name.
  • ed/idlnamesparsed: Parsed WebIDL structure of definitions in the idlnames folder. One file per IDL name.
  • ed/idlparsed: Parsed WebIDL structure of definitions in the idl folder. One file per specification.
  • ed/ids: Fragments defined in the specification. One file per specification.
  • ed/links: Links to other documents, along with targeted fragments. One file per specification.
  • ed/refs: Normative and informative references to other specifications. One file per specification.

Individual files are named after the shortname of the specification, or after the shortname of the specification series for CSS definitions and raw IDL files. Individual files are only created when needed, meaning when the specification actually includes relevant terms.

The ed/index.json file contains the index of specifications that have been crawled, and relative links to individual files that have been created.
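
The lookup that the index enables can be sketched as follows. The miniature index below mimics the shape described above (a list of crawled spec entries with relative links to their extract files), but the exact field set is illustrative, not webref's precise schema:

```javascript
// Sketch: resolving per-spec extract files from a crawl index.
// The "index" object is a hypothetical miniature of ed/index.json;
// real entries carry many more fields.
const index = {
  results: [
    { shortname: "dom", idl: "idl/dom.idl", dfns: "dfns/dom.json" },
    { shortname: "css-grid-2", css: "css/css-grid.json" },
  ],
};

// Build a shortname -> relative-links map for the crawled specs.
function extractLinks(index) {
  const map = new Map();
  for (const spec of index.results) {
    const { shortname, ...links } = spec;
    map.set(shortname, links);
  }
  return map;
}

const links = extractLinks(index);
console.log(links.get("dom").idl); // → idl/dom.idl
```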

This repository uses Reffy, a Web spec exploration tool, to crawl the specifications and generate the data. All of the data it contains is the result of running Reffy; the repository holds no additional data.

Raw WebIDL extracts are used in web-platform-tests; see their interfaces/README.md for details.

Curation guarantees

Data curation brings the following guarantees.

Web IDL extracts

  • All IDL files can be parsed by the version of webidl2.js referenced in package.json.
  • WebIDL2.validate passes with the exception of the "no-nointerfaceobject" rule about [LegacyNoInterfaceObject], which is in wide use.
  • All types are defined by some specification.
  • All extended attributes are defined by some specification.
  • No duplicate top-level definitions or members.
  • No missing or mismatched types in inheritance chains.
  • No conflicts when applying mixins and partials.
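
As an illustration, the no-duplicates guarantee amounts to a check along these lines. The node shape loosely mirrors webidl2.js output ({ type, name, partial }); this is a sketch, not webref's actual curation code:

```javascript
// Illustrative duplicate-definition check over parsed IDL nodes.
function findDuplicates(defs) {
  const seen = new Set();
  const dups = [];
  for (const def of defs) {
    if (def.partial) continue; // partials may legally repeat a name
    if (seen.has(def.name)) dups.push(def.name);
    seen.add(def.name);
  }
  return dups;
}

const defs = [
  { type: "interface", name: "Event" },
  { type: "interface", name: "Event", partial: true },
  { type: "dictionary", name: "EventInit" },
  { type: "interface", name: "Event" }, // conflicting redefinition
];
console.log(findDuplicates(defs)); // → [ 'Event' ]
```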

CSS extracts

  • All CSS files can be parsed by the version of CSSTree referenced in package.json, with the exception of a handful of CSS value definitions that, although valid, are not yet supported by CSSTree.
  • No duplicate definitions of CSS properties provided that CSS extracts of delta specs are not taken into account (such extracts end with -n.json, where n is a level number).
  • CSS extracts contain a base definition of all CSS properties that get extended by other CSS property definitions (those for which newValues is set).
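
The last two guarantees rely on the naming and extension conventions described above. The following sketch uses made-up file names and property values to illustrate both the delta-extract naming rule and the newValues extension mechanism; it is not webref's actual code:

```javascript
// Delta-spec CSS extracts are named <series>-<level>.json.
const files = ["css-grid.json", "css-grid-2.json", "css-fonts.json", "css-fonts-5.json"];

const isDelta = (name) => /-\d+\.json$/.test(name);
const baseExtracts = files.filter((name) => !isDelta(name));
console.log(baseExtracts); // → [ 'css-grid.json', 'css-fonts.json' ]

// A definition with newValues extends the base property definition
// rather than redefining it (illustrative values):
const base = { name: "display", value: "block | inline" };
const delta = { name: "display", newValues: "grid" };
const merged = { ...base, value: `${base.value} | ${delta.newValues}` };
console.log(merged.value); // → block | inline | grid
```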

Elements extracts

  • All Web IDL interfaces referenced by elements exist in Web IDL extracts.

Events extracts

  • All events have a type attribute that matches the name of the event.
  • All events have an interface attribute that describes the interface used by the event. The Web IDL interface exists in the latest version of the @webref/idl package at the time the @webref/events package is released, and represents an actual interface (i.e. not a mixin).
  • All events have a targets attribute with a non-empty list of target interfaces on which the event may fire. All Web IDL interfaces in the list exist in the latest version of the @webref/idl package at the time the @webref/events package is released, and represent actual interfaces (i.e. not mixins).
  • The bubbles attribute is always set to a boolean value for target interfaces that belong to a bubbling tree (DOM, IndexedDB, Serial API, Web Bluetooth).
  • The bubbles attribute is not set for target interfaces that do not belong to a bubbling tree.
  • The targets attribute contains the topmost interfaces in an inheritance chain, unless bubbling conditions differ. For instance, the list may contain { "target": "Element", "bubbles": true } but not also { "target": "HTMLElement", "bubbles": true }, since HTMLElement inherits from Element.
  • For target interfaces that belong to a bubbling tree, the targets attribute only contains the deepest interface in the bubbling tree on which the event may fire and bubble. For instance, the list may contain { "target": "HTMLElement", "bubbles": true }, but not also { "target": "Document" }, since the event would de facto fire at Document through bubbling.
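
The de-duplication rule for targets can be sketched as a redundancy check over a single event entry. The inheritance map and event record below are illustrative samples shaped after the attributes described above, not data from the @webref/events package:

```javascript
// Hypothetical (partial) interface inheritance map.
const parentOf = { HTMLElement: "Element", Element: "Node", Document: "Node" };

function ancestors(iface) {
  const result = [];
  for (let p = parentOf[iface]; p; p = parentOf[p]) result.push(p);
  return result;
}

// An event entry must not list both an interface and one of its
// ancestors when the bubbles flag is the same.
function redundantTargets(event) {
  const byName = new Map(event.targets.map((t) => [t.target, t]));
  return event.targets
    .filter((t) => ancestors(t.target).some((a) => byName.get(a)?.bubbles === t.bubbles))
    .map((t) => t.target);
}

const click = {
  type: "click",
  interface: "PointerEvent",
  targets: [
    { target: "Element", bubbles: true },
    { target: "HTMLElement", bubbles: true }, // redundant: Element covers it
  ],
};
console.log(redundantTargets(click)); // → [ 'HTMLElement' ]
```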

Potential spec anomalies

This repository used to contain analyses of potential spec anomalies, such as missing references and invalid Web IDL definitions. These analyses are now published in the companion w3c/webref-analysis repository.

How to suggest changes or report an error

Feel free to raise issues in this repository as needed. Note that most issues likely more directly apply to underlying tools:

  • Errors in the data are most likely caused by bugs or missing features in Reffy, the tool that crawls and parses specifications under the hood. If you spot an error, please report it in Reffy's issue tracker.
  • If you believe that a spec is missing from the list, please check browser-specs and report it there.

Development notes

GitHub Actions workflows are used to automate most of the tasks in this repo.

Data update

  • Update ED report - crawls the latest version of Editor's Drafts and updates the contents of the ed folder. Workflow runs every 6 hours. A typical crawl takes about 10 minutes to complete.
  • Update TR report - crawls the latest published /TR/ versions of specifications and updates the contents of the tr folder. Workflow runs once per week on Monday. A typical crawl takes about 10 minutes to complete.
  • Curate data & Prepare package PRs - runs whenever crawled data gets updated and updates the curated branch accordingly (provided all tests pass). The job also creates pull requests to release new versions of NPM packages when needed. Each pull request details the diff that would be released, and bumps the package version in the relevant packages/xxx/package.json file.
  • Clean up abandoned files - checks the contents of the repository to detect orphan crawl files that are no longer targeted by the latest crawl's results and creates a PR to delete these files from the repository. Runs once per week on Wednesday. The crawl workflows do not delete these files automatically because crawls sometimes fail on a spec due to transient network or spec errors.
  • Test - runs tests on pull requests.
  • Clean patches when issues/PR are closed - drops patches that no longer need to apply because underlying issues got fixed. Runs once per week.

Releases to NPM

  • Publish @webref package if needed - publishes a new version of the @webref/css, @webref/elements, @webref/events or @webref/idl package to NPM, tags the corresponding commits on the main and curated branches, and updates the relevant @webref/xxx@latest tag to point to the right commit on the curated branch. Runs whenever a pre-release PR is merged. Note that the released version is the version that appeared in packages/css/package.json, packages/elements/package.json, packages/events/package.json or packages/idl/package.json before the pre-release PR is merged.
  • @webref release: Request review of pre-release PR - assigns reviewers to NPM package pull requests. Runs once per week.
