
chromium-bidi's Introduction

WebDriver BiDi for Chromium

This is an implementation of the WebDriver BiDi protocol with some extensions (BiDi+) for Chromium, implemented as a JavaScript layer translating between BiDi and CDP, running inside a Chrome tab.

Current status can be checked at WPT WebDriver BiDi status.

BiDi+

"BiDi+" is an extension of the WebDriver BiDi protocol. In addition to WebDriver BiDi it has:

Command cdp.sendCommand

CdpSendCommandCommand = {
  method: "cdp.sendCommand",
  params: CdpSendCommandParameters,
}

CdpSendCommandParameters = {
   method: text,
   params: any,
   session?: text,
}

CdpSendCommandResult = {
   result: any,
   session: text,
}

The command runs the described CDP command and returns the result.
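
As a rough illustration, a client could assemble a cdp.sendCommand message as sketched below. The helper name buildCdpSendCommand and the sample CDP method Browser.getVersion are assumptions made for this example; only the message shape is taken from the definitions above, and the id field comes from the generic Command envelope.

```typescript
// Mirrors CdpSendCommandParameters from the definition above.
interface CdpSendCommandParameters {
  method: string;
  params: unknown;
  session?: string;
}

// Command envelope: id plus the BiDi+ method name and its parameters.
interface CdpSendCommandCommand {
  id: number;
  method: 'cdp.sendCommand';
  params: CdpSendCommandParameters;
}

// Hypothetical helper, not part of chromium-bidi.
function buildCdpSendCommand(
  id: number,
  cdpMethod: string,
  cdpParams: unknown,
  session?: string
): CdpSendCommandCommand {
  const params: CdpSendCommandParameters = {method: cdpMethod, params: cdpParams};
  // Only include the optional session when the caller provided one.
  if (session !== undefined) {
    params.session = session;
  }
  return {id, method: 'cdp.sendCommand', params};
}

// Serialize a command that would be sent over the BiDi WebSocket.
console.log(JSON.stringify(buildCdpSendCommand(1, 'Browser.getVersion', {})));
```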

Command cdp.getSession

CdpGetSessionCommand = {
   method: "cdp.getSession",
   params: CdpGetSessionParameters,
}

CdpGetSessionParameters = {
   context: BrowsingContext,
}

CdpGetSessionResult = {
   session: text,
}

The command returns the default CDP session for the selected browsing context.

Command cdp.resolveRealm

CdpResolveRealmCommand = {
   method: "cdp.resolveRealm",
   params: CdpResolveRealmParameters,
}

CdpResolveRealmParameters = {
   realm: Script.Realm,
}

CdpResolveRealmResult = {
   executionContextId: text,
}

The command resolves a BiDi realm to its CDP execution context ID.

Events cdp

CdpEventReceivedEvent = {
   method: "cdp.<CDP Event Name>",
   params: CdpEventReceivedParameters,
}

CdpEventReceivedParameters = {
   event: text,
   params: any,
   session: text,
}

The event contains a CDP event.
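
A client receiving messages might recognize these events by their method prefix. The sketch below assumes incoming messages have already been parsed from JSON; the function names isCdpEvent and cdpEventName are hypothetical and only the field names come from the definitions above.

```typescript
// Mirrors CdpEventReceivedParameters from the definition above.
interface CdpEventReceivedParameters {
  event: string;
  params: unknown;
  session: string;
}

interface CdpEventReceivedEvent {
  method: string;
  params: CdpEventReceivedParameters;
}

const CDP_PREFIX = 'cdp.';

// Hypothetical type guard: true for messages whose method starts with "cdp.".
function isCdpEvent(
  message: {method?: unknown; params?: unknown}
): message is CdpEventReceivedEvent {
  return (
    typeof message.method === 'string' &&
    message.method.startsWith(CDP_PREFIX) &&
    typeof message.params === 'object' &&
    message.params !== null
  );
}

// Strips the "cdp." prefix to recover the CDP event name.
function cdpEventName(event: CdpEventReceivedEvent): string {
  return event.method.slice(CDP_PREFIX.length);
}
```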

Field channel

Each command can be extended with a channel:

Command = {
   id: js-uint,
   channel?: text,
   CommandData,
   Extensible,
}

If the channel is provided and is a non-empty string, the same channel is added to the response:

CommandResponse = {
   id: js-uint,
   channel?: text,
   result: ResultData,
   Extensible,
}

ErrorResponse = {
  id: js-uint / null,
  channel?: text,
  error: ErrorCode,
  message: text,
  ?stacktrace: text,
  Extensible
}

When a client uses the session.subscribe and session.unsubscribe commands with a channel, the subscriptions are handled per channel, and the corresponding channel field is added to the event message:

Event = {
  channel?: text,
  EventData,
  Extensible,
}
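
One way a client could use the channel field is to route responses and events to separate handlers. This is a minimal sketch under that assumption; the ChannelRouter class is hypothetical and not part of chromium-bidi.

```typescript
// Any incoming BiDi message; only the optional channel field matters here.
type BidiMessage = {channel?: string; [key: string]: unknown};
type ChannelHandler = (message: BidiMessage) => void;

// Hypothetical router that dispatches messages by their channel.
class ChannelRouter {
  private readonly handlers = new Map<string, ChannelHandler>();

  on(channel: string, handler: ChannelHandler): void {
    this.handlers.set(channel, handler);
  }

  // Returns true when the message carried a known, non-empty channel.
  dispatch(message: BidiMessage): boolean {
    if (typeof message.channel !== 'string' || message.channel === '') {
      return false;
    }
    const handler = this.handlers.get(message.channel);
    if (handler === undefined) {
      return false;
    }
    handler(message);
    return true;
  }
}
```

Messages without a channel, or with an empty one, fall through so the caller can handle them on a default path.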

Dev Setup

npm

This is a Node.js project, so install dependencies as usual:

npm install

cargo

We use cddlconv to generate our WebDriverBidi types before building.

  1. Install Rust.
  2. Run cargo install --git https://github.com/google/cddlconv.git cddlconv

pre-commit.com integration

Refer to the documentation at .pre-commit-config.yaml.

pre-commit install --hook-type pre-push

Starting WebDriver BiDi Server

This will run the server on port 8080:

npm run server

Use the PORT= environment variable or --port= argument to run it on another port:

PORT=8081 npm run server
npm run server -- --port=8081

Use the DEBUG environment variable to see debug info:

DEBUG=* npm run server

Use the DEBUG_DEPTH environment variable (default: 10) to control how deeply nested objects are printed in the debug output:

DEBUG_DEPTH=100 DEBUG=* npm run server

Use the CHANNEL=... environment variable with one of the following values to run a specific Chrome channel: stable, beta, canary, dev, local. The default is local. With the local channel, the Chrome version pinned in .browser is downloaded if it is not yet in the cache. For any other channel, the requested Chrome version must already be installed.

CHANNEL=dev npm run server

Use the CLI argument --verbose to have CDP events printed to the console. Note: you have to enable debugging output bidi:mapper:debug:* as well.

DEBUG=bidi:mapper:debug:* npm run server -- --verbose

or

DEBUG=* npm run server -- --verbose

Starting on Linux and Mac

TODO: verify it works on Windows.

You can also run the server by using npm run server. It will write output to the file log.txt:

npm run server -- --port=8081 --headless=false

Running within another project

Sometimes it is good to verify that a change will not affect downstream packages. There is a puppeteer label you can add to any PR to run the Puppeteer tests with your changes. It bundles chromium-bidi, installs it in the Puppeteer project, and then runs that package's tests.

Running

Unit tests

Running:

npm run unit

E2E tests

The E2E tests are written using Python, in order to learn how to eventually do this in web-platform-tests.

Installation

Python 3.10+ and some dependencies are required:

python -m pip install --user pipenv
pipenv install

Running

The E2E tests require BiDi server running on the same host. By default, tests try to connect to the port 8080. The server can be run from the project root:

npm run e2e  # alias to e2e:headless
npm run e2e:headful
npm run e2e:headless

These commands run ./tools/run-e2e.mjs, which logs the PyTest output to the console. The output is also recorded under ./logs/<DATE>.e2e.log, which contains the PyTest logs and, if a test FAILED, all the Chromium-BiDi logs.

If you need to see the logs for all tests, run the command with VERBOSE=true.

To run only selected tests, pass a path: npm run e2e -- tests/<PathOrFile>. To run a specific test, use npm run e2e -- -k <TestName>.

Use the CHROMEDRIVER environment variable to run tests in chromedriver instead of the NodeJS runner:

CHROMEDRIVER=true npm run e2e

Use the PORT environment variable to connect to another port:

PORT=8081 npm run e2e

Use the HEADLESS environment variable to run the tests in headless (new or old) or headful mode. Values: new, old, false. Default: new.

HEADLESS=new npm run e2e

Updating snapshots

npm run e2e -- --snapshot-update

See https://github.com/tophat/syrupy for more information.

Local http server

E2E tests use a local HTTP server, pytest-httpserver, which is started automatically with the tests. However, it is sometimes useful to run the HTTP server outside a test case, for example for manual debugging. This can be done by running:

pipenv run local_http_server

...or directly:

python tests/tools/local_http_server.py

Examples

Refer to examples/README.md.

WPT (Web Platform Tests)

WPT is added as a git submodule. To run WPT tests:

Check out and set up WPT

1. Check out WPT

git submodule update --init

2. Go to the WPT folder

cd wpt

3. Set up virtualenv

Follow the System Setup instructions.

4. Set up hosts file

Follow the hosts File Setup instructions.

4.a On Linux, macOS or other UNIX-like system

./wpt make-hosts-file | sudo tee -a /etc/hosts

4.b On Windows

This must be run in a PowerShell session with Administrator privileges:

python wpt make-hosts-file | Out-File $env:SystemRoot\System32\drivers\etc\hosts -Encoding ascii -Append

If you are behind a proxy, you also need to make sure the domains above are excluded from your proxy lookups.

5. Set BROWSER_BIN

Set the BROWSER_BIN environment variable to a Chrome, Edge or Chromium binary to launch. For example, on macOS:

# Chrome
export BROWSER_BIN="/Applications/Google Chrome Canary.app/Contents/MacOS/Google Chrome Canary"
export BROWSER_BIN="/Applications/Google Chrome Dev.app/Contents/MacOS/Google Chrome Dev"
export BROWSER_BIN="/Applications/Google Chrome Beta.app/Contents/MacOS/Google Chrome Beta"
export BROWSER_BIN="/Applications/Google Chrome.app/Contents/MacOS/Google Chrome"
export BROWSER_BIN="/Applications/Chromium.app/Contents/MacOS/Chromium"

# Edge
export BROWSER_BIN="/Applications/Microsoft Edge Canary.app/Contents/MacOS/Microsoft Edge Canary"
export BROWSER_BIN="/Applications/Microsoft Edge.app/Contents/MacOS/Microsoft Edge"

Run WPT tests

1. Make sure you have Chrome Dev installed

https://www.google.com/chrome/dev/

2. Build Chromedriver BiDi

Oneshot:

npm run build

Continuously:

npm run build --watch

3. Run

npm run wpt -- webdriver/tests/bidi/

Update WPT expectations if needed

UPDATE_EXPECTATIONS=true npm run wpt -- webdriver/tests/bidi/

How does it work?

The architecture is described in the WebDriver BiDi in Chrome Context implementation plan.

There are 2 main modules:

  1. The backend WS server in src. It runs a WebSocket server and, for each WS connection, runs an instance of the browser with the BiDi Mapper.
  2. The front-end BiDi Mapper in src/bidiMapper. It gets BiDi commands from the backend and maps them to CDP commands.

Contributing

The BiDi commands are processed in src/bidiMapper/commandProcessor.ts. To add a new command, add it to _processCommand, then write a processor for it and call that processor from there.

Publish new npm release

Automatic release

We use release-please to automate releases. When a release should be done, check for the release PR in our pull requests and merge it.

Manual release

  1. Dry-run

    npm publish --dry-run
  2. Open a PR bumping the chromium-bidi version number in package.json for review:

    npm version patch -m 'chore: Release v%s' --no-git-tag-version

    Instead of patch, use minor or major as needed.

  3. After the PR is reviewed, create a GitHub release specifying the tag name matching the bumped version. Our CI then automatically publishes the new release to npm based on the tag name.

Roll into Chromium

This section assumes you already have Chromium set up locally and know how to submit changes to the repo. Otherwise, file an issue for a project maintainer.

  1. Create a new branch in chromium src/.
  2. Update the mapper version:

    third_party/bidimapper/pull.sh
    third_party/bidimapper/build.sh

  3. Submit a CL with bug chromedriver:4226.

  4. Regenerate WPT expectations or baselines:

    4.1. Trigger a build and test run:

    third_party/blink/tools/blink_tool.py rebaseline-cl --build="linux-blink-rel" --verbose

    4.2. Once the test completes on the builder, rerun that command to update the baselines. Update test expectations if there are any crashes or timeouts. Commit the changes (if any), and upload the new patch to the CL.

  5. Add appropriate reviewers or comment the CL link on the PR.


chromium-bidi's Issues

Support multiple CdpClients with different CDP sessionId sharing a connection

We'll need a way to connect a CdpClient to a different CDP target than the browser target. In other words, support the "flat" mode of the protocol where multiple CDP sessions may share a connection and use the "sessionId" property to route messages to a particular session.

This is already used in mapperServer.ts to create and communicate with the mapper tab, and we will likely need it later to attach to related targets such as OOPIFs or workers.

Proposed API: Add a new method to CdpClient:
public attachToSession(sessionId: string): CdpClient

Given an existing CdpClient (e.g. the initial browser client), calling this method returns a new CdpClient that shares the same transport and can be used to send messages to or receive events from the given session.
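
The proposal could be sketched roughly as follows. This is not the project's CdpClient: the Send callback standing in for the transport, the registry map, and the routeIncoming method are assumptions made so the example is self-contained. Only the attachToSession signature comes from the proposal above.

```typescript
interface CdpMessage {
  id?: number;
  method?: string;
  params?: unknown;
  sessionId?: string;
}

// Stand-in for the shared transport (an assumption for this sketch).
type Send = (message: CdpMessage) => void;
type MessageListener = (message: CdpMessage) => void;

class CdpClient {
  private readonly send: Send;
  // Shared by all clients on one connection; the browser client uses the key undefined.
  private readonly registry: Map<string | undefined, CdpClient>;
  private readonly sessionId: string | undefined;
  private readonly listeners: MessageListener[] = [];

  private constructor(
    send: Send,
    registry: Map<string | undefined, CdpClient>,
    sessionId?: string
  ) {
    this.send = send;
    this.registry = registry;
    this.sessionId = sessionId;
    registry.set(sessionId, this);
  }

  static createBrowserClient(send: Send): CdpClient {
    return new CdpClient(send, new Map());
  }

  // Returns a client bound to the given session that shares this client's transport.
  public attachToSession(sessionId: string): CdpClient {
    return this.registry.get(sessionId) ?? new CdpClient(this.send, this.registry, sessionId);
  }

  // Outgoing messages are tagged with this client's sessionId ("flat" mode).
  sendMessage(message: CdpMessage): void {
    this.send(this.sessionId === undefined ? message : {...message, sessionId: this.sessionId});
  }

  onMessage(listener: MessageListener): void {
    this.listeners.push(listener);
  }

  // Called for every incoming message; routes it to the client owning its sessionId.
  routeIncoming(message: CdpMessage): void {
    const target = this.registry.get(message.sessionId);
    if (target !== undefined) {
      target.listeners.forEach((listener) => listener(message));
    }
  }
}
```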

Prototype `binding` type of `LocalValue`

w3c/webdriver-bidi#157

Instead of adding global bindings, pass bindings as deserializable arguments to script.callFunction.
...
The argument can be something like:

{
  type: "binding", 
  value: "SOME_BINDING_NAME"
}

This argument will be deserialized to a callback. Calling that callback with any arguments will emit a BiDi event carrying those arguments, for example:

{
  method: "script.bindingCalled",
  params: {
    arguments: [... remote values ...],
    name: "SOME_BINDING_NAME"
  }
}
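
A minimal sketch of the idea, assuming a hypothetical emit function that delivers events to the client. The name deserializeBinding is invented for this example, and the arguments are passed through unchanged here, whereas a real implementation would serialize them to remote values.

```typescript
interface BindingLocalValue {
  type: 'binding';
  value: string;
}

interface BindingCalledEvent {
  method: 'script.bindingCalled';
  params: {arguments: unknown[]; name: string};
}

// Hypothetical: turns a "binding" local value into a callback that emits an event.
function deserializeBinding(
  localValue: BindingLocalValue,
  emit: (event: BindingCalledEvent) => void
): (...args: unknown[]) => void {
  return (...args: unknown[]) => {
    emit({
      method: 'script.bindingCalled',
      // Assumption: raw arguments stand in for serialized remote values.
      params: {arguments: args, name: localValue.value},
    });
  };
}
```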

Switch to targetCreated/targetDestroyed for BiDi contextCreated/contextDestroyed events

The BrowsingContextProcessor is using attachedToTarget and detachedFromTarget to track known browsing contexts and fire contextCreated and contextDestroyed events.

These events are intended to track when a CDP client attaches to or detaches from a Target. This may not always map 1:1 with target creation and target destruction. For example, there can be multiple sessions attached to a Target, some may detach and re-attach, and the target doesn't necessarily go away once all sessions have ended.

This tracks updating BrowsingContextProcessor to use targetCreated/targetDestroyed instead. Note that we're already using targetInfoChanged which is in the same "family" of events as targetCreated/targetDestroyed.

line number starts from 1 instead of 0 from CDP

In the test case test_consoleLog_logEntryAddedEventEmitted, the line number returned from the BiDi server starts from 1. Based on the CDP definition of Runtime.CallFrame, the lineNumber should start from 0.

WebDriver BiDi clients implement cross-browser e2e scenario: print to PDF

Cross-browser e2e scenario of navigating + printing the page to PDF.

Includes:

  • WebDriver BiDi specification.
  • WPT tests.
  • Implementations in Chromium passing the WPT tests.
  • Implementations in Firefox passing the WPT tests.
  • Implementations in Safari passing the WPT tests.
  • Support in Puppeteer.

once() method for CDP events

There are a few instances of this pattern appearing in our code, used to listen for a single occurrence of an event:

const eventHandler = (params) => {
  cdpClient.Page.removeEventListener('frameStoppedLoading', eventHandler);
  // ... Do stuff ...
};

cdpClient.Page.addEventListener('frameStoppedLoading', eventHandler);

We could add a .once() method to encapsulate this pattern instead:

const params = await cdpClient.Page.once('frameStoppedLoading');
// ... Do stuff ...
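
One possible shape for such a method, sketched on a small stand-in emitter rather than the real CdpClient. The MiniEmitter class and its listenerCount method are assumptions for this example.

```typescript
type Listener<T> = (params: T) => void;

// Minimal stand-in for a CDP domain object such as cdpClient.Page.
class MiniEmitter<T> {
  private readonly listeners = new Map<string, Set<Listener<T>>>();

  addEventListener(event: string, listener: Listener<T>): void {
    const set = this.listeners.get(event) ?? new Set<Listener<T>>();
    set.add(listener);
    this.listeners.set(event, set);
  }

  removeEventListener(event: string, listener: Listener<T>): void {
    this.listeners.get(event)?.delete(listener);
  }

  listenerCount(event: string): number {
    return this.listeners.get(event)?.size ?? 0;
  }

  emit(event: string, params: T): void {
    const set = this.listeners.get(event);
    if (set === undefined) {
      return;
    }
    // Copy first so listeners can remove themselves while being called.
    for (const listener of Array.from(set)) {
      listener(params);
    }
  }

  // Resolves with the params of the next occurrence, then stops listening.
  once(event: string): Promise<T> {
    return new Promise<T>((resolve) => {
      const handler: Listener<T> = (params) => {
        this.removeEventListener(event, handler);
        resolve(params);
      };
      this.addEventListener(event, handler);
    });
  }
}
```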

WebDriver BiDi clients implement cross-browser e2e scenario: Google Search

Cross-browser e2e scenario of

  1. loading Google.com
  2. entering a search term via keyboard input
  3. submitting via mouse click
  4. extracting the results from the page.

This includes

  • WebDriver BiDi specification.
  • WPT tests.
  • Implementations in Chromium passing the WPT tests.
  • Implementations in Firefox passing the WPT tests.
  • Implementations in Safari passing the WPT tests.
  • Support in Puppeteer.

Use sessionId instead of targetId to associate CDP client with a Context

The _handleDetachedFromTargetEvent function in browsingContextProcessor.ts uses the targetId parameter on the detachedFromTarget event, which is deprecated. The sessionId parameter is recommended instead. This is likely because CDP supports multiple CDP sessions attached to the same Target (although we only use 1 CDP session for our purposes).

This tracks updating our code to use the sessionId parameter here instead. We would likely need to maintain a map in the browsingContextProcessor mapping CDP session IDs to Context objects.
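
The map mentioned above might look something like this. The Context shape and the SessionContextMap class are placeholders for illustration, not the project's actual types.

```typescript
// Placeholder for the mapper's Context object.
interface Context {
  readonly contextId: string;
}

// Hypothetical registry keyed by CDP session ID instead of target ID.
class SessionContextMap {
  private readonly contextsBySession = new Map<string, Context>();

  register(sessionId: string, context: Context): void {
    this.contextsBySession.set(sessionId, context);
  }

  get(sessionId: string): Context | undefined {
    return this.contextsBySession.get(sessionId);
  }

  // Handles Target.detachedFromTarget using only the sessionId parameter.
  handleDetachedFromTarget(params: {sessionId: string}): Context | undefined {
    const context = this.contextsBySession.get(params.sessionId);
    this.contextsBySession.delete(params.sessionId);
    return context;
  }
}
```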

WebDriver BiDi clients implement cross-browser e2e scenario: blocking images

Cross-browser e2e scenario of loading a page with images blocked using request interception.

Includes:

  • WebDriver BiDi specification.
  • WPT tests.
  • Implementations in Chromium passing the WPT tests.
  • Implementations in Firefox passing the WPT tests.
  • Implementations in Safari passing the WPT tests.
  • Support in Puppeteer.

Implement CDP events filtering

Tracking

  • Consider adding filtering CDP.receivedEvent events by session/method/etc in BiDi+.
    • Filtering by Method
    • Filtering by Session
  • Re-add ability to subscribe to all CDP events

Update BigInt format

Currently, BigInt is serialized as {"type":"bigint","value":"12345678901234567168n"} with a trailing n, while according to the serialization spec the value should be the result of calling BigInt.toString(), which doesn't have a trailing n character.
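
A small sketch of the expected format, with serializeBigInt as a hypothetical helper name:

```typescript
// Hypothetical helper: serializes a BigInt without the trailing "n".
function serializeBigInt(value: bigint): {type: 'bigint'; value: string} {
  // toString() yields only the decimal digits (and a sign), never the "n" suffix.
  return {type: 'bigint', value: value.toString()};
}

console.log(JSON.stringify(serializeBigInt(BigInt('12345678901234567168'))));
```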

Get `examples/cross-browser.py` to run on top of Firefox’s BiDi implementation

With Firefox Nightly:

$ ./firefox --remote-debugging-port=9222
WebDriver BiDi listening on ws://localhost:9222
DevTools listening on ws://localhost:9222/devtools/browser/6cf65b2b-54e9-444e-a36e-0c04f12b47c9
…

$ # in another terminal / tab

$ PORT=9222 python3 examples/cross-browser.py
Traceback (most recent call last):
  File "~/projects/chromium-bidi/examples/cross-browser.py", line 150, in <module>
    result = loop.run_until_complete(main())
  File "~/homebrew/Cellar/python@3.9/3.9.9/Frameworks/Python.framework/Versions/3.9/lib/python3.9/asyncio/base_events.py", line 642, in run_until_complete
    return future.result()
  File "~/projects/chromium-bidi/examples/cross-browser.py", line 53, in main
    websocket = await get_websocket()
  File "~/projects/chromium-bidi/examples/cross-browser.py", line 28, in get_websocket
    return await websockets.connect(url)
  File "~/Library/Python/3.9/lib/python/site-packages/websockets/client.py", line 542, in __await_impl__
    await protocol.handshake(
  File "~/Library/Python/3.9/lib/python/site-packages/websockets/client.py", line 296, in handshake
    raise InvalidStatusCode(status_code)
websockets.exceptions.InvalidStatusCode: server rejected WebSocket connection: HTTP 200

We should figure out what’s going on here.

Support both a "standalone" and "embedded" build

Idea: Support building/running the bidi mapper in two configurations:

  • "embedded" - This is the current configuration. Mapper is injected into a browser tab and communicates with the rest of the browser through exposeDevToolsProtocol.
  • "standalone" - New config. Mapper runs as a node module outside the browser and communicates with the browser through remote-debugging-port/pipe.

Motivation:

Standalone mode would simplify development and testing of mapper code. The bidi server code and mapper code would all run in the same node process which simplifies debugging.

Move serialization to CDP

The specification bases serialization logic on JS internal slots, e.g. on whether a value has a [[DateValue]] internal slot.

There is no way to implement this logic in JS. The current BiDi implementation is based on checking prototypes, which can lead to incorrect or unwanted results.

The only way to implement the serialization properly according to the spec is to move it to the CDP level by adding a field bidiValue to the CDP Runtime.RemoteObject and a flag generateBiDiValue to the CDP Runtime.callFunctionOn method. At the CDP level, all V8 internals are accessible, so the implementation can follow the spec.
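
To illustrate how a prototype check can give a wrong answer, the sketch below builds an object that inherits from Date.prototype without being a real Date. The function names are invented for this example. The getTime probe is used only to make the missing slot visible for this one type and is not offered as a general alternative to the CDP-level approach.

```typescript
// Prototype-based check, similar in spirit to what a JS-level serializer can do.
function looksLikeDateByPrototype(value: unknown): boolean {
  return value instanceof Date;
}

// Date.prototype.getTime throws a TypeError when the receiver is not a real Date.
function getTimeSucceeds(value: unknown): boolean {
  try {
    Date.prototype.getTime.call(value);
    return true;
  } catch {
    return false;
  }
}

// Inherits from Date.prototype but was never constructed as a Date.
const impostor: unknown = Object.create(Date.prototype);

console.log(looksLikeDateByPrototype(impostor)); // prototype check is fooled
console.log(getTimeSucceeds(impostor)); // the object is not a real Date
```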

Optimize Mapper compiled code

Actual: the rolled-up Mapper script /src/.build/bidiMapper/mapper.js has a size of 748K, and the vast majority of the file is CDP string literals describing CDP methods, which are added for development convenience and are not required for the functionality.

Removing those definitions reduces the size of the plain /src/.build/bidiMapper/mapper.js to 68K, and minifying it afterwards brings the source size down to only 28K.

Expected: the rolled-up Mapper script /src/.build/bidiMapper/mapper.js doesn't have unnecessary string literals and has a size of ~ 30K.
